
OpenMinTeD: a platform of computing tools to extract and exploit information from scientific literature

The Bibliome-MaIAGE team and INRA's Scientific and Technical Information Delegation (DIST) are contributing to the European OpenMinTeD infrastructure project, which aims to set up an online platform that encourages and facilitates the use of text and data mining technologies for research.

Published on 03 February 2018

© INRAE, OpenMinTed

Faced with a surge in the volume of published scientific knowledge, researchers increasingly need tools that help them analyze texts quickly and extract accurate data. Text mining technologies have been developed to meet this need. However, these tools have each been designed around a particular research field, type of text or kind of analysis, resulting in a fragmented landscape of incompatible text mining solutions.

Create a platform for collaboration and knowledge sharing on text mining

The objective of the European OpenMinTeD project, funded under the Horizon 2020 programme, is to create a platform for collaboration and knowledge sharing on text mining for scientists in all fields. INRA, through the Bibliome-MaIAGE team and the DIST, is involved in the project along with 16 other academic partners, whose contributions are coordinated by the Athena Research and Innovation Centre (ARC). The consortium is integrating resources (scientific literature and annotation resources) and text mining software components, and making them interoperable so that they are easier to reuse. INRA's contribution to OpenMinTeD is to supply and integrate the Alvis technologies developed by the Bibliome team over many years. Because the design of the platform is guided by use cases, this contribution is part of a broader effort to design and implement innovative applications in the fields of agriculture and food.

Together with INRA units in food microbiology and the Migale bioinformatics platform, the Bibliome-MaIAGE team and the DIST have set up the Florilege application. Its objective is to bring together, in a unified representation, public information (from databases and scientific articles) on the positive flora of foods, that is, the microorganisms useful for processing, biopreservation and probiotics.

Two other use cases have been developed by Bibliome-MaIAGE and the DIST. The first was developed in collaboration with the Info Genomic Research Unit (URGI) within the WheatIS application, an integrated information system on wheat phenotypes and genotypes. The second, the "SeeDev" application built with the Institute of Plant Sciences Paris-Saclay, combines data from the "FLAGdb++" plant genome database with information, extracted from scientific publications, on the regulations involved in Arabidopsis thaliana seed development. This allows researchers not only to obtain information on the activity of genes during seed development (their interactions or the proteins they produce, for example) but also to access the scientific texts describing this activity. Each of these innovative services integrates experimental data, expert data and data extracted en masse from text by OpenMinTeD into a unified, easy-to-access package.

The last OpenMinTeD consortium meeting took place from 12 to 14 February 2018 at the INRA research centre in Jouy-en-Josas. The partners, joined by Open Access communities that provide content and by text mining IT communities, are currently completing the integration of their applications and components into the platform, which will be officially launched on 24 May 2018 in Brussels.

