Thesis of Vincent Liard


Subject:
Evolutionary origin of the complexity of biological systems -- a study based on in silico experimental evolution

Defense date: 27/10/2020

Advisor: Guillaume Beslon
Coadvisor: Jonathan Rouzaud-Cornabas

Summary:

Digital genetics is a recently developed approach to the experimental study of evolution. In digital genetics, a population of simulated organisms is subjected to a selection/variation process, leading to the emergence of an evolutionary dynamic that can be studied in its own right. Several digital genetics platforms have been developed, mainly in the Artificial Life community; the most widely used is Avida, which has been developed for more than 20 years at Michigan State University in the BEACON Center.
Aevol is a platform developed by the LIRIS/Inria Beagle team. Its distinctive feature is that it encapsulates a realistic genomic structure, which makes it possible to study the evolution of genomic features such as genome length, polycistronic structure and the gene regulation network, as well as the influence of genomic structure on the evolution of other properties, such as the maintenance of cooperation among bacteria. More recently, Aevol has been used to generate “realistic” benchmarks for testing bioinformatics tools.

One of the main limits of extant digital genetics platforms is that they allow only qualitative comparison with “real” evolution, and in particular with in vivo experimental evolution assays such as the “Long-Term Evolution Experiment” (LTEE) that has been run since 1988 by Richard Lenski at Michigan State University. This limit stems from at least two factors. First, there is a clear computational limit: simulating very large populations of organisms whose genomes contain thousands of genes and millions of base pairs is a challenge. Second, there is a difficulty of formalism: different formalisms have been proposed in the literature, but none of them is close enough to the reality of molecular structures to enable quantitative comparison with real organisms.
The objective of this thesis is to tackle both difficulties in the context of Aevol in order to propose a new platform that will enable direct comparison of the simulated genetic structures with what is observed in bacteria (i.e. hundreds to thousands of genes and hundreds of thousands to millions of base pairs). Moreover, while the genetic sequence in Aevol uses a binary code, we will extend it to a four-base code. This will enable a more realistic representation of the genetic and genomic sequences (including promoters, terminators, ribosome binding sites and gene sequences) and quantitative validation of the observed structures. Finally, while in the current version of Aevol the organisms are evaluated by comparison with a target function, the new platform will encapsulate population genetics sub-models such as the classical Fisher’s geometric model or the multilinear model of epistasis of Hansen & Wagner. This will enable direct comparison of the results with the mathematical theory of population genetics that constitutes the core of theoretical evolutionary biology.
To develop this new platform, the PhD student will have to develop two sub-models that will be integrated into the Aevol core. The first will tackle the question of the genetic sequence. The objective here will be to stay as close as possible to the real genetic code, with its 4 bases, 64 codons and 20 amino acids. The decoded sequence will then be used to compute the organism’s traits in a Fisher-based model. This will require the development of the second sub-model, which will compute the selective advantage of the organism given its distance to the optimal value of each selected trait. Both sub-models will be grounded in molecular biology and population genetics. The development of these sub-models will be done during the first year of the PhD. Once both sub-models have been specified, the platform will be implemented using state-of-the-art scientific computing tools in order to allow the simulation of large populations of complex organisms. This will constitute the second year of the PhD. The development phase will build on a preliminary study conducted by Jonathan Rouzaud-Cornabas on the Aevol platform. Finally, the last year of the PhD will be devoted to testing the platform and conducting experiments with it. Of particular interest will be the replication, in the new platform, of the LTEE protocol to test the influence of environment stabilization on the evolution of the simulated organisms.
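
To make the idea of a fitness computed from the distance to optimal trait values concrete, here is a minimal sketch in Python. It assumes a Gaussian fitness function, which is a common choice in Fisher's geometric model but not necessarily the one retained in the thesis. The names `fisher_fitness`, `selection_coefficient` and `strength` are hypothetical and do not come from Aevol.

```python
import math


def fisher_fitness(traits, optimum, strength=1.0):
    """Gaussian fitness of an organism in a Fisher-style geometric model.

    Assumption: fitness decreases as exp(-strength * d^2 / 2), where d is
    the Euclidean distance between the trait vector and the optimum.
    """
    if len(traits) != len(optimum):
        raise ValueError("traits and optimum must have the same length")
    squared_distance = sum((z - o) ** 2 for z, o in zip(traits, optimum))
    return math.exp(-strength * squared_distance / 2.0)


def selection_coefficient(mutant, resident, optimum, strength=1.0):
    """Relative fitness advantage of a mutant over a resident organism."""
    w_mutant = fisher_fitness(mutant, optimum, strength)
    w_resident = fisher_fitness(resident, optimum, strength)
    return w_mutant / w_resident - 1.0


# Hypothetical two-trait example: the mutant is closer to the optimum
# than the resident, so its selection coefficient is positive.
optimum = [0.0, 0.0]
resident = [1.0, 1.0]
mutant = [0.5, 1.0]
s = selection_coefficient(mutant, resident, optimum)
```

In this sketch an organism sitting exactly on the optimum has fitness 1, and any mutation that reduces the distance to the optimum yields a positive selection coefficient.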


Jury:
Mr Schneider Dominique, Professor, Université Grenoble Alpes, Reviewer
Mr Muller Jean-Pierre, Research Director, CIRAD, Reviewer
Ms Dillmann Christine, Professor, Université Paris Saclay
Mr Lopez Philippe, Associate Professor, UPMC
Mr Beslon Guillaume, Professor, INSA Lyon, Thesis advisor
Mr Rouzaud-Cornabas Jonathan, Associate Professor, INSA Lyon, Co-advisor