Cographs, orthologs, and the inference of species trees from paralogs

Peter F. Stadler
University Leipzig

Marc Hellmuth
Univ. Greifswald

Nick Wieseke
Uni Leipzig

Adrian Fritz
Uni Saarbruecken

Sarah Berkemer
Uni Leipzig

Martin Middendorf
Uni Leipzig

Markus Lechner
Uni Marburg

Hans-Peter Lenhof
Uni Saarland

Maribel Hernandez-Rosales
UNAM Juriquilla

Vincent Moulton
U East Anglia

Katharina Huber
U East Anglia

Marius Brunnert
Uni Leipzig

Minisymposium: GENERAL SESSION TALKS

Content: Phylogenomics relies heavily on well-curated sequence data sets that consist, for each gene, exclusively of 1:1-orthologs, i.e., of genes that have arisen through speciation events. Paralogs, which arise from duplication events, are treated as a dangerous nuisance that has to be detected and removed. Building upon recent advances in mathematical phylogenetics, we demonstrate that gene duplications convey meaningful phylogenetic information and allow the inference of plausible phylogenetic trees, provided orthologs and paralogs can be distinguished with a high degree of certainty. Starting from tree-free estimates of orthology, cograph editing can reduce the noise sufficiently for the estimates to be translated into constraints on the species tree. While the resolution is very poor for individual gene families, we show that genome-wide data sets are sufficient to generate fully resolved phylogenetic trees. The mathematical content of this work comprises (1) the characterization of graphs of orthologous genes as cographs, (2) an analysis of cograph editing that allows the reliable correction of empirical data to mathematically correct cographs, (3) the identification of a triple set in the corresponding cotrees that constrains the species tree, and (4) results on the decomposition of cographs that suggest that paralogous gene pairs can in many cases be safely included in classical phylogenetic reconstruction pipelines. The presentation will summarize results obtained by a larger group of authors in several publications as well as recent unpublished results. Authors are listed in random order.
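
To make item (1) concrete, the following is a minimal sketch of how one might test whether an estimated orthology graph is a cograph and, if so, recover its cotree. It uses the standard fact that a graph is a cograph exactly when every induced subgraph on more than one vertex is disconnected or has a disconnected complement. The function names (`adjacency`, `components`, `cotree`) and the nested-tuple representation are assumptions made for illustration and are not taken from the authors' software. The sketch does not attempt cograph editing; it only reports failure on graphs that would need editing.

```python
def adjacency(vertices, edges):
    """Build an undirected adjacency map {vertex: set of neighbours}."""
    adj = {v: set() for v in vertices}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj


def components(vertices, adj):
    """Connected components of the subgraph induced by `vertices`."""
    vertices = set(vertices)
    seen, comps = set(), []
    for start in vertices:
        if start in seen:
            continue
        comp, stack = set(), [start]
        while stack:
            v = stack.pop()
            if v in comp:
                continue
            comp.add(v)
            # Only follow edges that stay inside the induced subgraph.
            stack.extend((adj[v] & vertices) - comp)
        seen |= comp
        comps.append(comp)
    return comps


def cotree(vertices, adj):
    """Return a cotree as nested tuples, or None if the graph is not a cograph.

    Leaves are vertex labels. Inner nodes are (label, children) with label
    "parallel" (disjoint union) or "series" (join). In the orthology setting
    a series node is usually read as a speciation event and a parallel node
    as a duplication event; that reading is an interpretation added here.
    """
    vertices = set(vertices)
    if len(vertices) == 1:
        return next(iter(vertices))
    label, comps = "parallel", components(vertices, adj)
    if len(comps) == 1:
        # The graph is connected, so try to split its complement instead.
        co_adj = {v: (vertices - adj[v]) - {v} for v in vertices}
        label, comps = "series", components(vertices, co_adj)
        if len(comps) == 1:
            # Graph and complement both connected: not a cograph.
            return None
    children = []
    for comp in comps:
        sub = cotree(comp, adj)
        if sub is None:
            return None
        children.append(sub)
    return (label, children)


if __name__ == "__main__":
    # A path on four genes a-b-c-d is the forbidden induced subgraph (P4).
    p4 = adjacency("abcd", [("a", "b"), ("b", "c"), ("c", "d")])
    print(cotree("abcd", p4))
    # Gene a is orthologous to b and c, while b and c are paralogs.
    p3 = adjacency("abc", [("a", "b"), ("a", "c")])
    print(cotree("abc", p3))
```

The recursive decomposition was chosen for readability; it is not the linear-time recognition algorithm, and for genome-wide data a more efficient method would be needed.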
