Chris Evelo
Jacques van Helden
Barend Mons
Susanna Assunta Sansone
Peter McQuilton
Reza Salek

Chris Evelo,
Department of Bioinformatics – BiGCaT, Maastricht University, Maastricht, The Netherlands

Chris Evelo is head of the Department of Bioinformatics – BiGCaT at Maastricht University, which he started in 2001.
His main research interest is integrating different bioinformatics approaches to allow real understanding of the data generated in large-scale genomics experiments. To this end he develops pipelines that start with an evaluation of data quality and allow filtering and normalisation. The data are then statistically analysed and studied for patterns, gene clusters and profiles. After coupling through genome databases, the results can be understood in the context of existing biological knowledge. Since that knowledge is domain-specific and needs to be formalised, he developed a pathway content wiki (see wikipathways.org) and a pathway analysis tool (see pathvisio.org).

And then magic happens… (about linking different types of data and what is needed to make that work)

Jacques van Helden,
Université d’Aix-Marseille (AMU), Marseille, France

Jacques van Helden is Professor of bioinformatics at Aix-Marseille Université (AMU), Marseille, France.
His research activities consist of conceiving, developing, assessing and applying bioinformatics approaches to analyse regulatory sequences and biomolecular networks (metabolic pathways, protein interactions, regulatory networks).
Initially trained as an agronomic engineer, he completed a PhD thesis in developmental genetics (regulation of the achaete-scute complex in Drosophila melanogaster). In 1997, during a PhD in Julio Collado Vides’ lab (Cuernavaca, Mexico), he started to develop the Regulatory Sequence Analysis Tools (RSAT, http://rsat.eu/), which have remained his main research project and contribution. He also developed bioinformatics approaches relying on graph theory (path finding, sub-graph extraction) to infer metabolic pathways from sets of functionally related genes (operons, co-expression clusters, …).
His teaching activities include bioinformatics, statistics for bioinformatics, genome analysis, analysis of regulatory sequences, analysis of NGS data, network analysis, programming, evolutionary biology, biology and society.

Ensuring reproducibility and portability of NGS analysis workflows
Next Generation Sequencing (NGS) is rapidly spreading to an increasing number of laboratories worldwide, and covers a broadening range of applications in all domains of the life sciences. Despite the availability of all the short read sequences in official databases (SRA, ENA), it is doubtful that anyone could reproduce the bioinformatics analyses from other labs’ publications. Indeed, scientific journals require sequences to be deposited in databases prior to publication, but do not require any formal specification of the analytic workflows. Classical textual descriptions (“Methods” section of articles, supplementary material) are far from sufficient to reproduce or even trace back the precise steps from the raw data to the published results.
In his talk, Jacques van Helden will address a series of challenging issues and present some hints to ensure traceability and reproducibility of results from Next Generation Sequencing technologies: (1) providing a formal specification of the bioinformatics workflows (tools, parameters); (2) freezing an image of the complete software environment required to run the analyses (operating system, libraries, properly versioned tools); (3) reinforcing the statistical treatment and assessing the robustness of the results to small sample fluctuations; (4) ensuring the readability of the protocols with reports integrating code, results and interpretation.

Barend Mons,
Leiden University Medical Center (LUMC), Leiden, The Netherlands

Barend Mons is Professor of Biosemantics at the Leiden University Medical Center. In addition to leading the research of his group, he plays a leading role in the international development of ‘data stewardship’ for biomedical data. For instance, he is head of node of ELIXIR-NL. ELIXIR is a pan-European project to develop and foster bioinformatics infrastructure across the member states.
Barend Mons is a molecular biologist by training and received his PhD on genetic differentiation of malaria parasites from Leiden University (1986). He performed over a decade of research on malaria genetics and vaccine development, including three years serving in the research department of the European Commission in this field. He gained further experience in science management at the Research Council of the Netherlands (NWO).
Barend is the co-founder of three spin-off companies in biotechnological and semantic technologies and is also an advisor to several companies. Since 2000 he has increasingly focused on the development of semantic technologies to manage big data, and he founded the Biosemantics groups, first at Erasmus University in Rotterdam and later also in Leiden. Both groups collaborate very closely.
His research is currently focused on nanopublications as a substrate for in silico knowledge discovery. Barend is also one of the founders of the Concept Web Alliance, with “nanopublications” as its first brainchild. Nanopublications are currently implemented in the semantic project of the Innovative Medicines Initiative (IMI) called Open PHACTS.

The future: partly FAIR, partly Cloudy

Susanna Assunta Sansone,
Oxford e-Research Centre (OERC), Oxford, United Kingdom

Susanna Sansone is Associate Director at the University of Oxford e-Research Centre. She also works at Nature Publishing Group as a data consultant and Honorary Academic Editor for Scientific Data, an open access data publication platform.
She holds a PhD in Molecular Biology from Imperial College of Science, Technology and Medicine, London. After a few years working on vaccine genetics at an Imperial spin-off, she moved to the European Bioinformatics Institute (EBI, Cambridge), where she worked for nine years as a Project and Team Coordinator and Principal Investigator.
As Principal Investigator at the Oxford e-Research Centre, her activities centre on and support data curation, management and publication, and their pivotal roles in enabling reproducible research and driving science and discoveries. She focuses on the life science, environmental and biomedical domains, collaborating with data producers and service providers, pre-competitive informatics initiatives, journals and funding agencies to develop software and promote the creation and uptake of community-developed ontologies and standards.
She leads the Centre in several projects and in the ELIXIR UK Node, where she is responsible for the standards and curation areas; she is also a partner in two NIH Big Data to Knowledge (BD2K) Centers of Excellence.

How to create awareness, inform and educate
A growing worldwide movement for reproducible research encourages making data, along with the experimental details, available according to the FAIR principles of Findability, Accessibility, Interoperability and Reusability (see http://www.nature.com/articles/sdata201618). Several data management and sharing policies and plans have emerged and, in parallel, a growing number of community-based groups are developing hundreds of standards to harmonize the reporting of different experiments. Community mobilization is also evident in the number of efforts and alliances, data journals and data centres being launched. Susanna A Sansone will paint this dynamic landscape, highlighting ELIXIR-UK related activities, such as BioSharing and ISA, and their role in scholarly communication, using Springer Nature’s Scientific Data journal as an example.

Peter McQuilton,
Oxford e-Research Centre (OERC), Oxford, United Kingdom

Peter McQuilton is the Lead Content Knowledge Engineer for the BioSharing project, based at the University of Oxford e-Research Centre in the UK.
He holds a PhD in Drosophila embryonic nervous system development from the University of Cambridge, UK. After his PhD, he worked for over 10 years as a Biocurator at FlyBase (www.flybase.org), an NIH/MRC-funded Model Organism Database focused on the genetics and genomics of Drosophila melanogaster (the fruit fly that bothers your wine in the summer). Peter was involved in a number of projects relating to the extraction of genetic data from the published literature, text-mining, website design, and outreach/education. As Content Lead for the BioSharing project, Peter’s activities are in and around data curation, text-mining, ontology design, and data sharing and publication in the life, natural and biomedical sciences.

BioSharing – mapping the landscape of Standards, Databases and Data policies in the life, biomedical and environmental sciences
BioSharing (https://www.biosharing.org/) is a curated, web-based, searchable portal of three linked registries of content standards, databases, and data policies in the life sciences, broadly encompassing the biological, environmental and biomedical sciences. Launched in 2011 and built by the same core team as the successful MIBBI portal, BioSharing harnesses community curation to collate and cross-reference resources across the life sciences from around the world. Every record is designed to be interlinked, providing a detailed description not only of the resource itself, but also of its relationships with other life science infrastructures. BioSharing maps the dynamic landscape of over 600 community-developed standards, monitoring their development, evolution and implementation in over 700 databases, and detailing their adoption in funder and journal data policies.
BioSharing content can be searched using simple or advanced searches, explored via a step-by-step wizard, filtered via our faceted search, or grouped as ‘Collections’. Examples include the NPG Scientific Data, BioMed Central and PLOS Recommendations, which collate and link the standards and repositories recommended in the data policies of those journals. Similarly, other Collections are being generated by organisations or for projects, such as the BD2K bioCADDIE project. As a community effort, BioSharing offers users the ability to ‘claim’, edit and update records. BioSharing cultivates an active user community, operating via an open working group under Force11 and the Research Data Alliance.

Reza Salek,
European Bioinformatics Institute (EBI), Hinxton, Cambridge, United Kingdom

Reza Salek received his PhD in Molecular Biophysics and Biochemistry from University College London, UK. He started working in the field of metabolomics at the University of Cambridge, UK, moving over time from lab-based experiments to data handling, workflows and standards. He previously worked as a scientific investigator at the Medical Research Council, Cambridge, UK. In 2012 he joined EMBL-EBI, where he currently works as a Scientific Coordinator/Project Manager. He is actively involved in the development of data standards, chairing the Data Standards Task Group and serving as a director of the Metabolomics Society, and working with both the HUPO-PSI and MSI initiatives. EMBL-EBI hosts MetaboLights (http://www.ebi.ac.uk/metabolights), the first general-purpose repository for metabolomics data, where he leads the curation and standards compliance efforts. He has also managed and coordinated a large EU infrastructure project on metabolomics data standards, COSMOS (Coordination of Standards in Metabolomics – http://cosmos-fp7.eu/), which re-ignited standards efforts within the community. He is a member of the Cambridge Systems Biology Centre, Cambridge Neuroscience and the Cambridge Cancer Centre. He is the main organiser of the “EMBO Practical Course on Metabolomics Bioinformatics for Life Scientists”, which has run since 2012 and gives him the opportunity to work with groups of talented instructors and tutors who share the same passion for metabolomics data handling and standards. He is also one of the directors of the Metabolic Profiling Forum (MPF) and an Associate Editor for Nature’s Frontiers Metabolomics Journal.

Data sharing, standards and workflows in metabolomics; towards reproducibility in Metabolomics
Although the number of metabolomics publications and the volume of data produced keep increasing, only a small portion of datasets are publicly shared in an open data standards format. For metabolomics results to become reproducible, a mere textual description of the investigation in a manuscript is not sufficient. What can increase the chance of reproducibility, the ultimate aim within any scientific field, is a standards framework for sharing data, reporting results and sharing the complete study files, from the rawest form to the highest level of knowledge-based data.
In recent years, metabolomics data standards have developed extensively to include the primary research data, the derived results, the experimental description and, importantly, the metadata in a machine-readable form. This includes vendor-independent data standards such as mzML for mass spectrometry and nmrML for NMR raw data, both of which have enabled the scientific community to develop advanced data processing algorithms. We have also seen the emergence of platforms for sharing metabolomics experimental data, such as EMBL-EBI’s MetaboLights and the NIH Metabolomics Workbench. There have been recent efforts to create metabolomics data analysis workflows (Galaxy and KNIME), bringing us a step closer to producing an auditable trail for data analysis. Such systems can potentially generate reproducible results, ideally running on a dedicated e-infrastructure platform, such as the ones currently being developed by the PhenoMeNal H2020 consortium, coordinated by EMBL-EBI.
Altogether, this should pave the way for more reproducible data analysis results, and for the integration and reuse of data in metabolomics.



COST is supported by the
EU Framework Programme Horizon 2020

Conference Secretariat

Meeting Planner srl, Bari, Italy