GenomePop
GenomePop is flexible software for the forward simulation of DNA sequences in a metapopulation context under a variety of conditions. Users can input their own sequences or have the program generate them. Retro (back) and recurrent mutation can be allowed or ignored. A number of populations with any migration model can be defined. Each population consists of N sequences or genomes. The user can define bottlenecks and/or population expansions lasting a desired number of generations. In addition, selected nucleotide sites under directional selection can be specified. Markovian mutation models (such as GTR) are allowed, as are GTR x MG94 codon models. In the 2-allele model, each genome can be composed of several chromosomes, so SNPs that segregate independently or are linked on the same chromosome can be studied. Within each chromosome, constant or variable recombination (hotspots) can be considered. GenomePop manages both haploid and diploid genomes. Several runs can be executed, and a sample of user-defined size is obtained for each run and population.
Methodology of simulations
GenomePop is based on an SMS (Simulating in the Mutation Space) algorithm, which allows efficient use of computer memory. The basic idea of SMS is to represent an individual as the set of differences (mutations) between that individual and an original or consensus genotype (the master sequence). Thus, SMS provides a forward simulation framework in which individuals are represented just by the mutations they carry with respect to the wild genotype. As a result, the dimensionality of the problem of representing genomes is reduced several-fold. The SMS representation gains efficiency in both computation space and time. However, some processes, such as mutation, migration, recombination and fitness evaluation, must be reimplemented to fit this less redundant way of storing genomes (see the following PDF for a more in-depth explanation of the algorithms).
There are different forward evolution models that can be implemented in GenomePop (see below). Furthermore, GenomePop can implement stepping stone or island migration models, or any other migration model the user wishes to set (see the Migration link in the panel on the left).
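The idea of storing only the differences from the master sequence can be illustrated with a minimal Python sketch. This is not GenomePop's actual code: the names MASTER, mutate, expand and recombine, and the choice of a dictionary as storage, are assumptions made only for illustration.

```python
# Hypothetical sketch of mutation-space storage; all names are illustrative
# and are not taken from GenomePop's source.
MASTER = "ACGTACGTAC"  # the master (consensus) sequence, stored only once


def mutate(individual, position, base):
    """Return a copy of `individual` with `position` mutated to `base`.

    An individual is a dict {position: base} holding only the sites that
    differ from MASTER. A mutation back to the master base removes the entry.
    """
    child = dict(individual)
    if MASTER[position] == base:
        child.pop(position, None)
    else:
        child[position] = base
    return child


def expand(individual):
    """Rebuild the full sequence from the master plus the stored differences."""
    seq = list(MASTER)
    for position, base in individual.items():
        seq[position] = base
    return "".join(seq)


def recombine(parent_a, parent_b, breakpoint):
    """Single crossover: sites before `breakpoint` come from a, the rest from b."""
    child = {p: b for p, b in parent_a.items() if p < breakpoint}
    child.update({p: b for p, b in parent_b.items() if p >= breakpoint})
    return child
```

In this sketch an individual identical to the master sequence costs no storage beyond an empty dictionary, and recombination touches only the stored mutations rather than every site, which is the kind of saving the SMS description refers to.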
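As an illustration of how the island and stepping stone models differ, the following Python sketch builds a migration matrix for each. It is only a sketch under stated assumptions: the function names, the convention that entry [i][j] is the fraction of population i moving to population j, and the linear (non-circular) stepping stone layout are all hypothetical and do not describe GenomePop's input format.

```python
# Hypothetical helpers; the matrix convention used here is an assumption,
# not GenomePop's input format. Both functions expect n_pops >= 2.
def island_matrix(n_pops, m):
    """Island model: a fraction m leaves each population, split evenly
    among all the other populations."""
    matrix = []
    for i in range(n_pops):
        row = [m / (n_pops - 1)] * n_pops
        row[i] = 1.0 - m  # fraction that stays at home
        matrix.append(row)
    return matrix


def stepping_stone_matrix(n_pops, m):
    """Linear stepping stone: m/2 moves to each adjacent population only,
    so the two end populations lose just m/2 in total."""
    matrix = [[0.0] * n_pops for _ in range(n_pops)]
    for i in range(n_pops):
        neighbours = [j for j in (i - 1, i + 1) if 0 <= j < n_pops]
        for j in neighbours:
            matrix[i][j] = m / 2
        matrix[i][i] = 1.0 - (m / 2) * len(neighbours)
    return matrix
```

Any other migration scheme can be written in the same way, as an arbitrary matrix whose rows sum to 1.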
Current Settings
Binary model
Evolves sequences in order to study the resulting SNPs. GenomePop can manage up to 500,000 independent SNPs (on a Pentium 4, 3.2 GHz), or an unlimited number (depending on computer memory) of partially linked SNPs in one chromosome, or up to 10,000 per chromosome if distinct chromosomes are defined (for example, 100 chromosomes with 10,000 SNPs each). It allows diploid or haploid genomes, considering just two alleles per site and one or more chromosomes. Within each chromosome, different recombination rates can be defined. Sites can evolve as neutral, with specific sites defined as selected; conversely, sites can evolve under selection, with specific sites defined as neutral. The output is in GenePop 4.0 format. See some two-allele model examples: Example0-0, Example0-1, Example0-2, Example0-3, Example0-4. Any input file must be renamed to GenomePopInput.txt.
Non-binary model
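The effect of defining different recombination rates within a chromosome can be sketched in Python for the two-allele case. This is an illustrative sketch only: the function names, the per-interval crossover probabilities in `rates`, and the use of 0/1 lists for haplotypes are assumptions and are not taken from GenomePop.

```python
import random


def crossover_points(rates, rng):
    """Return the intervals where a crossover occurs.

    rates[k] is the (assumed) probability of a crossover between site k and
    site k + 1; a hotspot is simply an interval with a higher rate.
    """
    return [k for k, rate in enumerate(rates) if rng.random() < rate]


def recombinant(hap_a, hap_b, rates, rng):
    """Build one recombinant haplotype from two parental haplotypes."""
    points = set(crossover_points(rates, rng))
    current, other = hap_a, hap_b
    child = []
    for site in range(len(hap_a)):
        child.append(current[site])
        if site in points:  # switch template after this site
            current, other = other, current
    return child


# Usage: five biallelic sites with a certain crossover between sites 2 and 3.
rng = random.Random(1)
child = recombinant([0] * 5, [1] * 5, [0.0, 0.0, 1.0, 0.0], rng)
```

With a rate of 1.0 in the third interval and 0.0 elsewhere, the child carries the first parent's alleles at sites 0 to 2 and the second parent's alleles at sites 3 and 4.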
Evolves real or simulated DNA sequences under different mutation models. The same evolutionary and demographic scenarios as above can be set. However, recombination hotspots are not yet allowed in non-binary models. Sequences are output in Nexus or Phylip format. See the examples Example1-1 and Example1-2. Any input file must be renamed to GenomePopInput.txt.
Codon model
Evolves real or simulated DNA sequences under an MG94 codon model. It allows intracodon recombination. The codon model can be combined with any nucleotide mutation model. The same evolutionary and demographic scenarios as above can be set. However, recombination hotspots are not yet allowed in non-binary models. Sequences are output in Nexus or Phylip format. See the example Example2. Any input file must be renamed to GenomePopInput.txt.
Future Settings
Codon usage bias
The user will be able to set codon models accounting for codon usage bias. Also, different selection schemes at the interpopulation level will be added.
Java Interface
A Java interface to facilitate input will be provided in the future.
Future updates of GenomePop will include a parallelized MPI version able to run on computer clusters.