New protein created via artificial intelligence
The goal of artificial intelligence (AI) is to help computers (better) solve problems normally dealt with by humans. In Toulouse, bioinformatics researchers have used AI and automated reasoning algorithms to design a hyperstable self-assembling protein. This work is the fruit of a collaboration between INRAE scientists in the Research Unit for Applied Mathematics and Informatics (at the centre of Occitanie-Toulouse) and scientists in Belgium and Japan.
Published on 02 January 2019
Automated reasoning is an area within AI research whose goal is to help computers solve extremely complex puzzles, which may involve thousands or millions of interconnected components. Indeed, automated reasoning was recently used to prove a theorem that mathematicians have wrestled with for decades. The AI integrated into the software programme ToulBar2 (see the sidebar) has made it possible to solve the hardest Sudoku puzzles within milliseconds. However, there exist even more challenging puzzles whose solutions are harder to find.
Proteins are the fundamental compounds of life. They are needed for cells in humans, animals, plants, fungi, and microbes alike to function properly. Proteins can bind to other molecules, or they can come together to form complex structures. Many of them act as catalysts, driving chemical reactions at ambient temperatures and pressures while remaining biodegradable. At the INRAE centre of Occitanie-Toulouse, researchers in synthetic biology have been trying to speed up protein design. Although several recent advances in AI have come from artificial neural networks, a different approach was used here: automated reasoning algorithms.
As part of a collaboration with biochemists at KU Leuven in Belgium and the Riken Institute in Japan, INRAE scientists instructed ToulBar2 to design a protein according to a few specifications. The AI tool had to select and organise the protein's atoms such that copies of the protein would self-assemble in water and form an even larger symmetrical protein. ToulBar2 had access to a preliminary protein structure and an approximate model of intermolecular forces (obtained using the talaris2014 energy function in Rosetta, protein modelling software developed at the University of Washington). With just these resources, ToulBar2 managed to crack a puzzle in which finding the solution meant searching among more combinations than there are atoms in the known universe. It did so by discovering the protein's optimal arrangement of atoms, and it was able to show that its solution was the best one possible given the constraints of intermolecular forces. This work did not require a supercomputer: a simple desktop computer sufficed. The researchers then translated the protein's sequence of amino acids into a DNA sequence, which was inserted into a bacterial species (E. coli). The bacteria then multiplied and generated many copies of the protein. The protein's final form was exactly what was expected: it resulted from the self-assembly in water of the basic components designed by ToulBar2.
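To give a feel for how quickly such a search space outgrows the number of atoms in the known universe, here is a rough back-of-the-envelope sketch. It rests on two assumptions that are not taken from the study: each position in the protein chain is chosen freely among the 20 standard amino acids, and the universe contains roughly 10^80 atoms (a commonly quoted estimate). The actual design problem also involves the spatial arrangement of each amino acid, so the real search space is larger still. The function names are purely illustrative.

```python
AMINO_ACIDS = 20              # standard amino acid alphabet
ATOMS_IN_UNIVERSE = 10 ** 80  # assumption: commonly quoted rough estimate


def sequence_count(n_positions: int) -> int:
    """Number of distinct sequences if every position is chosen freely."""
    return AMINO_ACIDS ** n_positions


def smallest_length_exceeding(limit: int) -> int:
    """Shortest chain length whose sequence count is greater than `limit`."""
    n = 1
    while sequence_count(n) <= limit:
        n += 1
    return n


# Under these assumptions, a chain of only 62 positions is already enough.
print(smallest_length_exceeding(ATOMS_IN_UNIVERSE))  # prints 62
```

Since even a short chain has this many candidate sequences, listing them one by one is out of the question; a solver has to rule out huge groups of candidates at once.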
Such novel proteins may have promising applications in medicine, green chemistry, the biofuel industry, and recycling systems. The ability to design new, customised proteins is crucial for dealing with health issues, and it may also help reduce our environmental footprint.
ToulBar2: a tool for solving the most complex of conundrums
ToulBar2 is a software programme created by scientists in the Research Unit for Applied Mathematics and Informatics at the INRAE centre of Occitanie-Toulouse. It uses automated reasoning algorithms and specialises in solving complex puzzles that involve selecting components that will compose a finished whole. These puzzles are difficult to solve, even for computers (the technical term for them is "NP-complete problems"). A well-known example of this type of problem is the Sudoku puzzle. In a Sudoku puzzle, each cell contains a number between 1 and 9. Within each row, column, and three-by-three block, numbers must differ. Based on these rules, ToulBar2 can almost instantaneously produce a solution for a given Sudoku puzzle. ToulBar2's strength lies in the fact that it can "think" beyond logical constraints: it can also handle cost-based rules (e.g., a rule that specifies that adjacent cells must contain numbers that differ by more than 1, where the penalty for violating this rule is 100 cost units). When such rules are present, ToulBar2 produces the least costly solution and can prove that this solution is optimal. Thanks to this ability, ToulBar2 can solve problems much more complex than Sudoku puzzles. For example, thanks to a long-standing collaboration with INSA-LISBP, ToulBar2's creators have learned about protein design and helped improve it.
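The mix of strict rules and cost-based rules described above can be illustrated with a toy example. The sketch below is not ToulBar2's code or its interface, and it uses a deliberately simple search (a plain depth-first "branch and bound"), whereas ToulBar2 relies on far more sophisticated reasoning. It fills a single row of n cells with the numbers 1 to n. The strict rule is that all numbers must differ; the cost-based rule charges 100 cost units whenever two adjacent cells differ by 1 or less. The name solve_row and the single-row setting are assumptions made for illustration.

```python
from typing import List, Tuple

PENALTY = 100  # cost units charged when adjacent cells differ by 1 or less


def solve_row(n: int) -> Tuple[int, Tuple[int, ...]]:
    """Fill n cells with distinct values 1..n at the lowest total penalty."""
    best_cost = float("inf")
    best_row: Tuple[int, ...] = ()

    def extend(row: List[int], cost: int) -> None:
        nonlocal best_cost, best_row
        # Bound: costs never decrease, so a partial row that already costs
        # as much as the best complete row found so far cannot improve on it.
        if cost >= best_cost:
            return
        if len(row) == n:
            best_cost, best_row = cost, tuple(row)
            return
        for value in range(1, n + 1):
            if value in row:  # strict rule: a repeated value is never tried
                continue
            step = PENALTY if row and abs(row[-1] - value) <= 1 else 0
            row.append(value)
            extend(row, cost + step)
            row.pop()

    extend([], 0)
    return int(best_cost), best_row


print(solve_row(3))  # no row of 3 cells avoids every penalty: best cost is 100
print(solve_row(4))  # a row of 4 cells can satisfy every rule: best cost is 0
```

When the search ends, every possible row has either been examined or discarded because it could not cost less than the best one found. That is what makes the answer provably optimal rather than merely good, which is the guarantee the sidebar attributes to ToulBar2.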
Noguchi, H., Addy, C., Simoncini, D., Wouters, S., Mylemans, B., Van Meervelt, L., Schiex, T., Zhang, K.Y., Tame, J.R.H. and Voet, A.R.D., 2019. Computational design of symmetrical eight-bladed β-propeller proteins. IUCrJ, 6(1).
Allouche, D., André, I., Barbe, S., Davies, J., De Givry, S., Katsirelos, G., O'Sullivan, B., Prestwich, S., Schiex, T. and Traoré, S., 2014. Computational protein design as an optimization problem. Artificial Intelligence, 212, pp.59-79.
Simoncini, D., Allouche, D., de Givry, S., Delmas, C., Barbe, S. and Schiex, T., 2015. Guaranteed discrete energy optimization on large protein design problems. Journal of Chemical Theory and Computation, 11(12), pp.5980-5989.