Bayesian networks are simple probabilistic graphical models that were introduced in the 1980s and have been used increasingly widely since then. They are commonly used for classification and, more broadly, in the computer science fields of machine learning and artificial intelligence. They serve both as knowledge representation models, acting as “calculating machines” for conditional probabilities, and as decision support systems. Learning a Bayesian network automatically from observations of random variables makes it possible to extract knowledge from data.
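To make the “calculating machine” idea concrete, here is a minimal Python sketch of a three-variable network (Rain and Sprinkler both influence WetGrass). The variables and all probability values are hypothetical, chosen only for illustration; they do not come from the work described in this article. The sketch computes a conditional probability by summing the joint distribution, which is the simplest (and least scalable) way to query such a model.

```python
from itertools import product

# Hypothetical toy network: Rain -> WetGrass <- Sprinkler.
# All numbers below are illustrative assumptions.
P_RAIN = 0.2
P_SPRINKLER = 0.1
P_WET = {  # P(WetGrass = True | Rain, Sprinkler)
    (True, True): 0.99,
    (True, False): 0.90,
    (False, True): 0.80,
    (False, False): 0.05,
}

def joint(rain, sprinkler, wet):
    """Joint probability, factorised along the graph structure."""
    p = (P_RAIN if rain else 1 - P_RAIN)
    p *= (P_SPRINKLER if sprinkler else 1 - P_SPRINKLER)
    p_wet = P_WET[(rain, sprinkler)]
    return p * (p_wet if wet else 1 - p_wet)

def prob_rain_given_wet():
    """P(Rain = True | WetGrass = True), by brute-force enumeration."""
    numerator = sum(joint(True, s, True) for s in (True, False))
    denominator = sum(
        joint(r, s, True) for r, s in product((True, False), repeat=2)
    )
    return numerator / denominator

posterior = prob_rain_given_wet()
print(f"P(Rain | WetGrass) = {posterior:.3f}")
```

Observing wet grass raises the probability of rain above its prior of 0.2, which is the kind of reasoning a decision support system would rely on.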
Until the 2000s, the available learning methods were limited to around thirty variables. Major advances over the past ten years, thanks to integer linear programming and optimal cut generation, have pushed that limit to around one hundred variables.
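One way to see why the number of variables is the limiting factor is to count the candidate structures. The short Python sketch below uses Robinson's recurrence, a standard combinatorial result that is not taken from this article, to count the directed acyclic graphs on n labelled variables. The function name count_dags is our own choice for illustration.

```python
from functools import lru_cache
from math import comb

@lru_cache(maxsize=None)
def count_dags(n):
    """Number of directed acyclic graphs on n labelled nodes (Robinson's recurrence)."""
    if n == 0:
        return 1
    return sum(
        (-1) ** (k + 1) * comb(n, k) * 2 ** (k * (n - k)) * count_dags(n - k)
        for k in range(1, n + 1)
    )

for n in (3, 4, 5, 10, 30):
    # The count grows super-exponentially, so exhaustive enumeration is hopeless.
    print(n, len(str(count_dags(n))), "digits")
```

Exact methods therefore cannot enumerate structures one by one and must instead prune the search space, for example with cuts.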
Faster and more efficient
Scientists in the Mathematics and Applied Informatics Unit in Toulouse (MIAT) at the INRAE Occitanie-Toulouse Research Centre and the Mathematics and Applied Informatics Unit (MIA) at the Versailles-Grignon Research Centre have developed a new, more efficient method that requires less computing time. Called ELSA, for Exact Learning of Bayesian network Structure using Acyclicity reasoning, the method builds on the idea of cuts by integrating them into CPBayes, a constraint programming tool dedicated to this learning problem.
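The acyclicity requirement at the heart of this learning problem can be illustrated with a short Python sketch. It is not ELSA's or CPBayes's actual algorithm, only a textbook-style check under our own assumptions: each variable is given a set of parents, and the function (hypothetically named violated_cluster) returns a group of variables in which every member has a parent inside the group. Such a group proves that the graph contains a cycle, and ruling it out is the kind of condition a cut can express.

```python
def violated_cluster(parents):
    """Return a set of variables that proves a cycle, or an empty set for a DAG.

    `parents` maps each variable to the set of its chosen parents.
    """
    remaining = set(parents)
    changed = True
    while changed:
        changed = False
        for v in list(remaining):
            # v can be placed next in an ordering if none of its
            # parents is still waiting to be placed.
            if not (parents[v] & remaining):
                remaining.remove(v)
                changed = True
    # Every variable left over has at least one parent among the leftovers.
    return remaining

# Hypothetical parent-set assignments, for illustration only.
dag = {"A": set(), "B": {"A"}, "C": {"A", "B"}}
cyclic = {"A": {"C", "D"}, "B": {"A"}, "C": {"B"}, "D": set()}

print(violated_cluster(dag))
print(sorted(violated_cluster(cyclic)))
```

In the second example, D has no parents and is removed first, while A, B and C form a cycle and remain. A solver could then forbid that combination of parent sets and continue searching.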
“Using a different approach, we have developed a high-quality and notably much faster calculation method,” explains Simon de Givry, research scientist in the MIAT Unit. “Using a test set of forty problems containing more than 500 random variables, and over the same allotted time of 90 hours, ELSA managed to optimally solve 23, while GOBNILP only solved nine and CPBayes just four.”
Among the avenues for further improvement, the use of decision tree diagrams is expected to allow very large value domains to be handled more efficiently, which would further speed up the operations performed by ELSA.
One of the tools used to design future sunflower varieties
As part of the Sunrise programme, the Joint Research Unit for Plant-Microbe-Environment Interactions (LIPME) carried out studies to identify genes of interest for drought tolerance and to model the agricultural characteristics of future sunflower varieties carrying these genes. ELSA contributed to a better understanding of the genetic and molecular bases controlling plant physiology and development, in order to predict the characteristics of the hybrids.
Fulya Trösser, Simon de Givry, and George Katsirelos
Improved Acyclicity Reasoning for Bayesian Network Structure Learning with Constraint Programming
In Proc. of IJCAI-21, Montreal, Canada, 2021