It is well known that the scientific method requires experiments to be reproducible.
Scientific papers have long been the main means of communicating and sharing the information needed to repeat and extend experiments successfully. In recent years, however, scientific research has become increasingly computational, involving huge amounts of data, demanding data analysis tasks, and specialized, often distributed, software tools.
Computational reproducibility poses new challenges for scientific replication, and the research paper alone may not effectively support the replication of computational analyses. Companion websites that make data and software packages shareable are useful, but they are not enough. What is needed are frameworks for creating descriptive and interactive publications that link papers with their associated objects, including software source code, data sets and related annotations, and data pipelines and workflows.
It is noteworthy that reproducible research in bioinformatics refers to the ability to repeat the calculations used to analyze the data and obtain the computational results, rather than to the validation of results by another algorithm or implementation.
During the analysis of large data sets, researchers may modify the adopted methods, fine-tune parameters, and update the data until they achieve the results presented in the paper. The resulting publications usually report few details of these activities. The tools and data that actually led to the final results may then be lost or unrecoverable. It is essential that software tools and data be made publicly available in exactly the same format in which they were used, so that the computational experiments can be reproduced by analyzing the same data with the same tools.
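One simple way to check that a downloaded data file is byte-for-byte identical to the one used in a published analysis is to compare cryptographic checksums. The following is a minimal sketch, assuming the authors publish a SHA-256 digest alongside each file; the function names sha256_of_file and matches_published_digest are hypothetical and chosen only for illustration.

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks to limit memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops when read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_digest(path, published_digest):
    """Compare a local file against a digest published with the paper (assumed SHA-256)."""
    return sha256_of_file(path) == published_digest.strip().lower()
```

A matching digest only shows that the file content is unchanged; it says nothing about whether the same tools and parameters were applied to it.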
Licensing is also essential. Code must be available for distribution, data must be accessible in a readable format, and a platform must be available for distributing the data and code widely. In addition, both data and code need to be licensed permissively enough to allow independent researchers to adopt these tools and reproduce the work without any legal burden.
Reproducibility in bioinformatics poses additional issues, which are related to the evolution of methods and tools, as well as to the data that are increasingly made available in databases. As a consequence, under certain circumstances data analyses may not be reproducible at all unless complete details of tool and database versions are archived.
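As an illustration of what archiving tool and database versions could look like in practice, the sketch below writes a small JSON manifest next to the analysis results. It assumes the analysis runs in Python and that the database release labels are supplied by the researcher; the function names build_manifest and write_manifest and the manifest layout are hypothetical, not an established standard.

```python
import json
import platform
from importlib import metadata


def build_manifest(packages, database_releases):
    """Collect interpreter, platform, package and database version details."""
    tool_versions = {}
    for name in packages:
        try:
            tool_versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # Record the absence explicitly instead of failing.
            tool_versions[name] = None
    return {
        "python": platform.python_version(),
        "platform": platform.platform(),
        "tools": tool_versions,
        # Release labels are entered by hand, e.g. {"ExampleDB": "2024_01"}
        # (a hypothetical database name and release).
        "databases": dict(database_releases),
    }


def write_manifest(manifest, path):
    """Store the manifest as sorted, indented JSON so that versions can be compared."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(manifest, handle, indent=2, sort_keys=True)
```

Such a manifest covers only packages visible to the Python interpreter; external command-line tools and the database snapshots themselves would need to be recorded and archived separately.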
The workshop will provide an excellent environment and a range of opportunities to present and discuss methods, theoretical approaches, algorithms, tools, platforms, practical applications and experiences related to the focus theme, as well as many other bioinformatics topics, in the tradition of previous NETTAB & EMBnet events.