Zenodo launches integration with Software Heritage

by Lars Holm Nielsen, on October 21, 2024


Zenodo and Software Heritage, through the EU-funded FAIRCORE4EOSC project, have launched a new integration. In order to fulfill the promise of an interconnected and interoperable academic ecosystem, research software infrastructures should support the archiving of source code within the universal source code archive, contributing to the global software commons. This integration ensures that software source code deposited in Zenodo is automatically archived in Software Heritage. It implements the recommendations from the EOSC Scholarly Infrastructures for Research Software report:

“In the 21st century, many research activities use computing systems to monitor their experiments, to visualise or analyse their results, or to check hypotheses through simulation. It has therefore become essential to archive, preserve and share research software.”

“Over the past decade, awareness has been raised about the importance of software in the scholarly world. Several infrastructures have started to be built, or adapted, to address some of the following key challenges that need to be tackled to put software on equal footing with other research outputs in the scholarly world:

  • Archiving software to ensure research software artifacts are not lost.
  • Referencing software to ensure research artifacts can be precisely identified.
  • Describing software to easily discover and identify research software artifacts.
  • Crediting all authors to ensure their contributions are recognized.”

Zenodo: Research software + Versioning + GitHub

Zenodo has long had a strong focus on supporting research software. Since 2014, Zenodo has had an integration with GitHub that enables researchers to easily archive research software from GitHub in Zenodo. Upon deposit of the research software in Zenodo (either from GitHub or directly in Zenodo), the researcher obtains a DOI (Digital Object Identifier), which facilitates the persistent identification of software and supports researchers in adopting the Software Citation Principles, in particular in citing research software papers. The Zenodo versioning feature further made it possible to cite either an individual snapshot of the software or the software project as a whole. Today, Zenodo is the largest minter of software DOIs and is able to track citations to software independently of which persistent identifier was used in the citation.

Integration with Software Heritage

The new integration between Zenodo and Software Heritage enhances the capabilities to archive, reference, describe, and cite research software artifacts. Most of the process occurs behind the scenes, ensuring seamless and transparent software archiving for researchers, regardless of their workflow.


Figure 1 - Zenodo record for a software deposit showing that it has been archived in Software Heritage in the bottom right corner.

When a researcher deposits software in Zenodo, the software will be automatically sent to Software Heritage (if the files are publicly accessible). Zenodo then obtains the associated Software Hash Identifier (SWHID), links it with the DOI, and displays it on the record landing page. The DOI integrates with the scholarly publishing ecosystem, while the SWHID provides direct access to the archived source code, including the full version history. This bi-directional linking ensures interoperability between two key identifiers for research software.
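As a rough illustration of the identifier pairing described above, the sketch below checks that a string follows the core SWHID syntax (`swh:1:<object type>:<40 hexadecimal digits>`) and stores it next to a DOI. The function names, the record layout, and the example DOI and hash are hypothetical placeholders; this is not Zenodo's or Software Heritage's actual implementation.

```python
import re
from typing import NamedTuple

# Core SWHID syntax: swh:<schema version>:<object type>:<40 lowercase hex digits>.
# Object types: cnt (content), dir (directory), rev (revision),
# rel (release), snp (snapshot).
_SWHID_CORE = re.compile(r"^swh:1:(cnt|dir|rev|rel|snp):([0-9a-f]{40})$")


class Swhid(NamedTuple):
    object_type: str
    object_id: str


def parse_swhid(text: str) -> Swhid:
    """Parse the core part of a SWHID, ignoring any ';'-separated qualifiers."""
    core = text.split(";", 1)[0]
    match = _SWHID_CORE.match(core)
    if match is None:
        raise ValueError(f"not a valid core SWHID: {text!r}")
    return Swhid(match.group(1), match.group(2))


def link_identifiers(doi: str, swhid: str) -> dict:
    """Return a hypothetical record fragment pairing a DOI with a SWHID."""
    parse_swhid(swhid)  # raises ValueError if the SWHID is malformed
    return {"doi": doi, "swhid": swhid}


if __name__ == "__main__":
    # Placeholder values for illustration only, not a real deposit.
    example_swhid = "swh:1:dir:" + "0" * 40
    print(link_identifiers("10.5281/zenodo.0000000", example_swhid))
```

Validating the SWHID before storing it keeps a malformed identifier from being displayed next to the DOI.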


Figure 2 - The corresponding Zenodo software record in Software Heritage.

In addition to archiving software in Software Heritage, Zenodo has enhanced the upload form with software-specific fields, such as programming languages and repository URLs on top of our already existing fields such as the SPDX license field. We’ve also added support for CodeMeta and Citation File Format export formats.
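To give a sense of how the software-specific fields map onto an export format, here is a minimal sketch that builds a CodeMeta-style JSON-LD document from a name, programming languages, a repository URL, an SPDX license identifier, and a DOI. The function name and all example values are hypothetical, and the output is not claimed to match what Zenodo's exporter produces.

```python
import json


def codemeta_record(name, languages, repository_url, spdx_license_id, doi):
    """Build a minimal CodeMeta 2.0 style description of a software deposit."""
    return {
        "@context": "https://doi.org/10.5063/schema/codemeta-2.0",
        "@type": "SoftwareSourceCode",
        "name": name,
        "programmingLanguage": list(languages),
        "codeRepository": repository_url,
        # SPDX identifiers are commonly expressed as URLs under spdx.org/licenses.
        "license": f"https://spdx.org/licenses/{spdx_license_id}",
        "identifier": f"https://doi.org/{doi}",
    }


if __name__ == "__main__":
    # All values below are placeholders for illustration only.
    record = codemeta_record(
        name="example-analysis-tool",
        languages=["Python"],
        repository_url="https://example.org/example-analysis-tool",
        spdx_license_id="MIT",
        doi="10.5281/zenodo.0000000",
    )
    print(json.dumps(record, indent=2))
```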

What’s next?

While the core integration with Software Heritage has launched, further backend improvements are planned for the coming six months, primarily aimed at improving interoperability. Additionally, this integration will be fully incorporated into InvenioRDM, making it easier for other repositories, such as institutional ones, to integrate with Software Heritage.

This work was funded by the European Commission through grant agreement no. 101057264 (FAIRCORE4EOSC).



EU Open Research Repository Moves to Production

by Lars Holm Nielsen, on October 21, 2024


In March 2024, the EU and CERN officially launched the EU Open Research Repository on Zenodo in a pilot phase, and since then, it has rapidly gained momentum. Over the past several months, we have successfully onboarded 130 EU-funded projects as EU project communities, a feature that gives projects an easy, go-to solution for sharing and preserving their research outputs. About 23% of all EU-funded projects (FP7, Horizon 2020 and Horizon Europe) from the past 10 years have a research output on Zenodo, amounting to 11,000 different grants.

As we now move from the pilot to the production phase, the EU Open Research Repository is set to become an essential tool for EU projects, offering an easy, accessible platform to support the broader implementation of EU open science policies.

Supporting EU Open Science policy

The EU has long been a driving force behind open science, progressively supporting its adoption through successive Research and Innovation Framework Programmes. This effort began with the Open Access pilot in FP7, followed by the addition of Open Data provisions in Horizon 2020, and now, in Horizon Europe, through a strong commitment to comprehensive open science practices. These practices include open access to scientific publications, responsible management of research data, and a clear focus on FAIR (Findable, Accessible, Interoperable, Reusable) principles.

The establishment of the EU Open Research Repository represents a continuation of these efforts. Built upon the foundation of Zenodo — a general-purpose open repository operated by CERN — the new repository enables researchers to deposit a wide range of outputs, including papers, datasets, software, posters, presentations, and more. This platform provides an easy, go-to solution for EU programme beneficiaries to comply with open science requirements, helping them make their research outputs FAIR in practice.

Managed by CERN on behalf of the European Commission, the EU Open Research Repository helps EU-funded projects streamline the management and dissemination of their research outputs, supporting the continued growth of the open science ecosystem in Europe.

New Features During the Pilot

Today, we are also excited to announce the launch of our new browse feature, which offers users an overview of the content within the EU Open Research Repository. This feature allows users to browse research outputs by funding programme, subject, or project, providing an intuitive, user-friendly experience. All of this is made possible thanks to the integration of high-quality open data from CORDIS, the EU’s project database, which provides funding programme details and subject classifications using EuroSciVoc.

The pilot phase introduced several new features designed to make it easier for EU-funded projects to manage their research outputs. One of the key additions was a workflow for projects to request a project community, which allows them to either create a new community or integrate an existing Zenodo community into the EU Open Research Repository.

What's next?

As we transition into the production phase, we have several key objectives. First and foremost, we will continue to integrate feedback from the early adopters who participated in the pilot phase. Their insights have been invaluable in shaping the platform, and we are committed to ensuring that their needs and suggestions are addressed as we move forward.

In addition to this, we plan to onboard approximately 2,700 project communities that we have already identified on Zenodo. These projects will benefit from the new features and improved workflows that have been developed during the pilot.

Another important focus for the future is the automatic integration of EU-funded submissions made to Zenodo that fall outside of dedicated project communities. This will further streamline the process of depositing research outputs and ensure that all relevant submissions are included in the EU Open Research Repository.

Lastly, we are committed to enhancing the FAIRness of Zenodo by implementing improvements that will better support domain-specific features. This will also involve harmonizing curation efforts across different projects to ensure the high quality of metadata associated with deposited research outputs.

Funding

The EU Open Research Repository is funded by the European Union under grant agreement no. 101122956 (HORIZON-ZEN).

You can learn more about the HORIZON-ZEN project on https://about.zenodo.org/projects/horizon-zen/



Win the 2024 DataWorks! $100,000 Grand Prize by reusing Zenodo data

by Pearl D. Go, on October 14, 2024


Announcing the 2024 DataWorks! Prize!

The Federation of American Societies for Experimental Biology (FASEB) and the National Institutes of Health (NIH) invite you to submit your data reuse project proposal that demonstrates the power of data reuse to advance human health.

The 2024 DataWorks! Prize is a collaboration with the seven generalist repositories participating in the NIH-funded Generalist Repositories Ecosystem Initiative (GREI) and will focus on best practices in data reuse and secondary analysis that advance human health. Teams will compete in a two-phase challenge.

  • Phase 1: Research teams will submit a proposal for a secondary analysis research project that can be completed within a 6-month period and incorporates data from one or more generalist repositories participating in the GREI (more information on GREI); data from other repositories can be combined.
  • Phase 2: Selected teams will complete their reuse/secondary analysis research projects and share their findings publicly.

Awards and Prizes

The NIH Office of Data Science Strategy will award up to $500,000 total in cash prizes to the Challenge winners. NIH will award the prize purse in the following amounts:

  • Phase 1: $25,000 per winner, up to ten (10) winners
  • Phase 2: Grand Prize: $100,000 for one (1) winner; Distinguished Achievement Awards: $75,000 per winner, up to two (2) winners
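As a quick sanity check, the maximum payout implied by the amounts listed above can be added up and compared with the stated $500,000 purse. The variable names are illustrative only.

```python
# Prize amounts and maximum winner counts as listed in the announcement.
PHASE1_PRIZE, PHASE1_MAX_WINNERS = 25_000, 10
GRAND_PRIZE, GRAND_PRIZE_WINNERS = 100_000, 1
DISTINGUISHED_PRIZE, DISTINGUISHED_MAX_WINNERS = 75_000, 2

phase1_total = PHASE1_PRIZE * PHASE1_MAX_WINNERS
phase2_total = (GRAND_PRIZE * GRAND_PRIZE_WINNERS
                + DISTINGUISHED_PRIZE * DISTINGUISHED_MAX_WINNERS)
max_purse = phase1_total + phase2_total

print(f"Phase 1 maximum: ${phase1_total:,}")
print(f"Phase 2 maximum: ${phase2_total:,}")
print(f"Total maximum:   ${max_purse:,}")
```

If every prize is awarded, each phase pays out $250,000, which matches the "up to $500,000" total.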

Successful submissions must:

  • Address a pivotal health research question via data reuse and secondary data analysis
  • Include data from at least one GREI repository, including Zenodo
  • Share results with the broader community
  • Be submitted by October 24, 2024

Reusing data in Zenodo:

As a cross-domain repository, Zenodo enables researchers to share and preserve a wide range of interdisciplinary research outputs, including research papers, datasets, research software, reports, presentations, and any other research-related digital outputs.

How can I find Zenodo data for reuse?

Search for data by typing keywords of interest (for example, hypertension, pathogen, comparative genomics, precision medicine) in the search box at the top of the Zenodo home page. Narrow the search results with the facets in the menu on the left side of the page: resource type (e.g., dataset), access status (e.g., open), subject area, and file type. You can order the search results by date, best match, most viewed, etc. Use the Zenodo Search Guide to further craft your search.
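The same kind of keyword search can also be scripted. The sketch below only builds a search URL for Zenodo's records REST endpoint and does not send a request. The parameter names (`q`, `type`, `sort`, `size`) are assumptions based on the publicly documented API and should be checked against the current Zenodo REST API documentation before use; the helper name is hypothetical.

```python
from urllib.parse import urlencode

# Assumed base URL of Zenodo's records search endpoint.
ZENODO_RECORDS_API = "https://zenodo.org/api/records"


def build_search_url(keywords, resource_type="dataset", sort="bestmatch", size=10):
    """Build a records search URL for the given keywords and filters."""
    params = {
        "q": keywords,          # free-text query, as typed in the search box
        "type": resource_type,  # assumed filter name for the resource type
        "sort": sort,           # assumed values include "bestmatch", "mostrecent"
        "size": size,           # number of results per page
    }
    return f"{ZENODO_RECORDS_API}?{urlencode(params)}"


if __name__ == "__main__":
    print(build_search_url("comparative genomics"))
```

The resulting URL can be opened in a browser or fetched with any HTTP client to inspect the JSON response.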

You can also try your search in Zenodo Communities. A Zenodo community provides a space for domains, projects, and institutions to curate and manage a collection of their research outputs and share it with members of the community and beyond (for example, some Communities funded by NIH).

Full details about the challenge can be found here.

We encourage you and your team to submit your project proposals. Your ideas could shape the future of healthcare.