New protein created via artificial intelligence

Artificial intelligence (AI) aims to help computers solve problems normally handled by humans, and to solve them better. In Toulouse, bioinformatics researchers have used AI and automated reasoning algorithms to design a hyperstable self-assembling protein. This work is the fruit of a collaboration between INRAE scientists in the Research Unit for Applied Mathematics and Informatics (at the Occitanie-Toulouse centre) and scientists in Belgium and Japan.

Published on 02 January 2019

Automated reasoning is an area within AI research whose goal is to help computers solve extremely complex puzzles, which may involve thousands or millions of interconnected components. Indeed, automated reasoning was recently used to prove a theorem that mathematicians have wrestled with for decades. The AI integrated into the software programme ToulBar2 (see the sidebar) has made it possible to solve the hardest Sudoku puzzles within milliseconds. However, some puzzles are far more challenging than Sudoku.

Proteins are the fundamental compounds of life. They are needed for cells in humans, animals, plants, fungi, and microbes alike to function properly. Proteins can bind to other molecules, or they can come together to form complex structures. Many of them act as enzymes, catalysing chemical reactions at ambient temperatures and pressures while remaining biodegradable. At the INRAE centre of Occitanie-Toulouse, researchers in synthetic biology have been trying to speed up protein design. Although several recent advances in AI have been made thanks to artificial neural networks, a different approach was used here: automated reasoning algorithms.

As part of a collaboration with biochemists at KU Leuven in Belgium and the RIKEN institute in Japan, INRAE scientists instructed ToulBar2 to design a protein according to a few specifications. The AI tool had to select and organise the protein's atoms such that copies of the protein would self-assemble in water and form an even larger symmetrical protein. ToulBar2 had access to a preliminary protein structure and an approximate estimate of intermolecular forces (obtained using the talaris2014 energy function in Rosetta, a protein modelling software suite developed at the University of Washington).

With just these resources, ToulBar2 managed to crack a puzzle where finding the solution meant evaluating a greater number of combinations than there are atoms in the known universe. It did so by discovering the protein's optimal arrangement of atoms, and it was able to show that its solution was the best one possible given the constraints of intermolecular forces. This work did not require a supercomputer: an ordinary desktop computer sufficed.

The researchers then translated the protein's sequence of amino acids into a DNA sequence, which was inserted into a bacterial species (E. coli). The bacteria then multiplied and produced many copies of the protein. The protein's final form was exactly what was expected: it resulted from the self-assembly in water of the basic components designed by ToulBar2.
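To give a rough sense of the scale of such a search space, here is a back-of-the-envelope sketch. It assumes, purely for illustration, that every designable position in the protein may hold any of the 20 standard amino acids and it ignores side-chain conformations, which would only enlarge the space. The figure of 10^80 atoms is a commonly quoted order-of-magnitude estimate, not a number taken from this article, and the function name is hypothetical.

```python
AMINO_ACIDS = 20             # standard amino acids (assumption: all allowed everywhere)
ATOMS_IN_UNIVERSE = 10**80   # commonly quoted rough estimate (assumption)


def positions_needed(choices: int, target: int) -> int:
    """Smallest number of positions n such that choices**n exceeds target."""
    n = 0
    total = 1
    while total <= target:
        total *= choices
        n += 1
    return n


n = positions_needed(AMINO_ACIDS, ATOMS_IN_UNIVERSE)
print(n)  # 62
```

Under these assumptions, a sequence of only 62 freely designable positions already yields more combinations than the estimated number of atoms, which is why exhaustive enumeration is out of the question and a solver must rule out most combinations without visiting them.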

Such novel proteins may have promising applications in medicine, green chemistry, the biofuel industry, and recycling systems. Being able to design new, customised proteins is crucial for addressing health issues, and it may also help reduce our environmental footprint.

ToulBar2: a tool for solving the most complex of conundrums

ToulBar2 is a software programme created by scientists in the Research Unit for Applied Mathematics and Informatics at the INRAE centre of Occitanie-Toulouse. It uses automated reasoning algorithms and specialises in solving complex puzzles that involve selecting components that will compose a finished whole. These puzzles are difficult to solve, even for computers (the technical term for them is "NP-complete problems"). A well-known example of this type of problem is the Sudoku puzzle. In a Sudoku puzzle, each cell contains a number between 1 and 9. Within each row, column, and three-by-three block, the numbers must all differ. Based on these rules, ToulBar2 can almost instantaneously produce a solution for a given Sudoku puzzle.

ToulBar2's strength lies in the fact that it can "think" beyond logical constraints: it can also handle cost-based rules (e.g., a rule that specifies that adjacent cells must contain numbers that differ by more than 1, where the penalty for violating this rule is 100 cost units). When such cost-based rules are present, ToulBar2 produces the least costly solution and can prove that this solution is optimal. Thanks to this ability, ToulBar2 can solve problems much more complex than Sudoku puzzles. For example, thanks to a long-standing collaboration with INSA-LISBP, ToulBar2's creators have learned about protein design and used ToulBar2 to improve it.
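The idea of combining a logical rule with a cost-based rule, and of proving that the cheapest solution has been found, can be illustrated with a toy sketch. This is not ToulBar2's code or its API: it is a plain depth-first branch-and-bound search written for illustration, on a hypothetical row of four cells holding the numbers 1 to 4. The hard rule (all numbers differ) and the 100-unit penalty for adjacent cells differing by 1 or less mirror the example above; every name in the code is an assumption.

```python
from typing import List, Optional, Tuple

N_CELLS = 4            # toy row of four cells (illustrative, not a full Sudoku)
VALUES = range(1, 5)   # each cell holds a number from 1 to 4
PENALTY = 100          # cost units when adjacent cells differ by 1 or less


def step_cost(partial: List[int], value: int) -> Optional[int]:
    """Cost added by appending `value`; None if the hard rule is broken."""
    if value in partial:                       # hard rule: all numbers differ
        return None
    if partial and abs(partial[-1] - value) <= 1:
        return PENALTY                         # soft rule: allowed but costly
    return 0


def solve() -> Tuple[float, List[int]]:
    """Depth-first branch and bound over all assignments of the row."""
    best_cost = float("inf")
    best: List[int] = []

    def search(partial: List[int], cost: int) -> None:
        nonlocal best_cost, best
        # Costs are never negative, so `cost` is a lower bound on every
        # completion of `partial`; such branches cannot beat the incumbent.
        if cost >= best_cost:
            return
        if len(partial) == N_CELLS:
            best_cost, best = cost, partial[:]
            return
        for value in VALUES:
            extra = step_cost(partial, value)
            if extra is not None:
                search(partial + [value], cost + extra)

    search([], 0)
    return best_cost, best


cost, assignment = solve()
print(cost, assignment)  # 0 [2, 4, 1, 3]
```

Because a branch is discarded only when its lower bound shows it cannot improve on the best solution found so far, the value returned at the end is provably optimal for this toy problem. Solvers such as ToulBar2 rely on far stronger bounds and propagation to make the same kind of guarantee practical on much larger problems.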

References
Noguchi, H., Addy, C., Simoncini, D., Wouters, S., Mylemans, B., Van Meervelt, L., Schiex, T., Zhang, K.Y., Tame, J.R.H. and Voet, A.R.D., 2019. Computational design of symmetrical eight-bladed β-propeller proteins. IUCrJ, 6(1). https://journals.iucr.org/m/issues/2019/01/00/jt5028/index.html https://doi.org/10.1107/S205225251801480X

Allouche, D., André, I., Barbe, S., Davies, J., De Givry, S., Katsirelos, G., O'Sullivan, B., Prestwich, S., Schiex, T. and Traoré, S., 2014. Computational protein design as an optimization problem. Artificial Intelligence, 212, pp.59-79. https://www.sciencedirect.com/science/article/pii/S0004370214000332 https://doi.org/10.1016/j.artint.2014.03.005

Simoncini, D., Allouche, D., de Givry, S., Delmas, C., Barbe, S. and Schiex, T., 2015. Guaranteed discrete energy optimization on large protein design problems. Journal of Chemical Theory and Computation, 11(12), pp.5980-5989. https://pubs.acs.org/doi/abs/10.1021/acs.jctc.5b00594 https://doi.org/10.1021/acs.jctc.5b00594

Contacts

Thomas SCHIEX

Scientific director

Mathematics and Applied Informatics, Toulouse

Centre

Occitanie-Toulouse

Divisions

MATHNUM

TRANSFORM
