CoreAligner: Multiple Genome Alignment for Core Genome Structure Identification


Horizontal gene transfer (HGT) has played a significant role in prokaryotic genome evolution, and the genes constituting a prokaryotic genome appear to fall into two classes: a "core gene pool" of intrinsic genes that encode proteins for basic cellular functions, and a "flexible gene pool" of HGT-acquired genes, such as those in genomic islands, that encode proteins functioning under particular conditions. Identifying the set of intrinsically conserved genes, or the genomic core, within a taxonomic group is therefore crucial not only for establishing the identity of that group, but also for understanding prokaryotic diversity and evolution.

We have developed a method, named CoreAligner, for identifying the core structure of related genomes. The core structure is defined as a set of sufficiently long segments in which gene order is conserved among multiple genomes, so that the segments are likely to have been inherited mainly through vertical transfer. Using a dynamic programming-based algorithm, CoreAligner finds the ordering of pre-identified orthologous groups (OGs) that retains the conserved gene orders to the greatest possible extent.
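To give a feel for how dynamic programming can recover a conserved gene order, here is a minimal Python sketch. It is not the CoreAligner algorithm: it assumes only two genomes, each given as a list of OG identifiers in chromosomal order, with each OG appearing at most once per genome, and it simply computes a longest common subsequence. The function name and the OG labels are hypothetical, and the sketch ignores multiple genomes and the minimum segment length.

```python
from typing import List


def longest_conserved_order(a: List[str], b: List[str]) -> List[str]:
    """Return one longest common subsequence of OG identifiers in a and b.

    Simplified illustration only; assumes a pairwise comparison.
    """
    n, m = len(a), len(b)
    # dp[i][j] = length of the longest common subsequence of a[i:] and b[j:]
    dp = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n - 1, -1, -1):
        for j in range(m - 1, -1, -1):
            if a[i] == b[j]:
                dp[i][j] = dp[i + 1][j + 1] + 1
            else:
                dp[i][j] = max(dp[i + 1][j], dp[i][j + 1])

    # Trace back through the table to recover the conserved order itself.
    i = j = 0
    conserved: List[str] = []
    while i < n and j < m:
        if a[i] == b[j]:
            conserved.append(a[i])
            i += 1
            j += 1
        elif dp[i + 1][j] >= dp[i][j + 1]:
            i += 1
        else:
            j += 1
    return conserved


# Hypothetical OG orders: OG7 and OG8 stand in for genome-specific insertions.
genome_a = ["OG1", "OG2", "OG3", "OG7", "OG4", "OG5"]
genome_b = ["OG1", "OG2", "OG8", "OG3", "OG4", "OG5"]
core_order = longest_conserved_order(genome_a, genome_b)
print(core_order)
```

In this toy input the two genome-specific OGs are skipped and the shared order of the remaining five OGs is kept, which mirrors the idea of separating a conserved core from inserted genes.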

Reference

Download the CoreAligner program

Requirements

CoreAligner requires as input a set of orthologous groups among related genomes, which can be created on the MBGD system.

You may need to install the following Perl libraries on your machine.

NOTE: The program is more conveniently used through the RECOG system (Research Environment for Comparative Genomics), which includes CoreAligner as a built-in function.


Please send questions and comments to: Ikuo Uchiyama (uchiyama@nibb.ac.jp)