Software Heritage: Building the Universal Software Archive for Open Science

Roberto Di Cosmo (University Paris Diderot)





Abstract
Software is at the heart of our digital society and embodies a growing part of our scientific, technical and organisational knowledge.



Software Heritage is an open non-profit initiative whose mission is to ensure that this precious body of knowledge will be preserved over time and made available to all.



We do this for multiple reasons: to preserve the scientific and technological knowledge embedded in software source code; to allow better software development and reuse for society and industry; and to foster better science by building the infrastructure for preserving, sharing and referencing research software, which is a stepping stone for reproducibility and a necessary complement to Open Access.



We do this now because we are at a turning point: the founding fathers are still around and willing to contribute their knowledge, but only for a limited time. We also face the risk of a massive loss of source code developed by the Free and Open Source community, as code hosting sites shut down when their popularity declines.



Software Heritage already archives more than 5 billion unique source code files, spanning more than 85 million projects, with their full development history. In this talk, we will review the motivations and the challenges that arise in building such a unique shared infrastructure.