Evolutionary rate analyses of orthologs and paralogs from 12 Drosophila genomes.
Heger A., Ponting CP.
The newly sequenced genome sequences of 11 Drosophila species provide the first opportunity to investigate variations in evolutionary rates across a clade of closely related species. Protein-coding genes were predicted using established Drosophila melanogaster genes as templates, with recovery rates ranging from 81%-97% depending on species divergence and on genome assembly quality. Orthology and paralogy assignments were shown to be self-consistent among the different Drosophila species and to be consistent with regions of conserved gene order (synteny blocks). Next, we investigated the rates of diversification among these species' gene repertoires with respect to amino acid substitutions and to gene duplications. Constraints on amino acid sequences appear to have been most pronounced on D. ananassae and least pronounced on D. simulans and D. erecta terminal lineages. Codons predicted to have been subject to positive selection were found to be significantly over-represented among genes with roles in immune response and RNA metabolism, with the latter category including each subunit of the Dicer-2/r2d2 heterodimer. The vast majority of gene duplications (96.5%) and synteny rearrangements were found to occur, as expected, within single Müller elements. We show that the rate of ancient gene duplications was relatively uniform. However, gene duplications in terminal lineages are strongly skewed toward very recent events, consistent with either a rapid-birth and rapid-death model or the presence of large proportions of copy number variable genes in these Drosophila populations. Duplications were significantly more frequent among trypsin-like proteases and DM8 putative lipid-binding domain proteins.