Single genome retrieval of context-dependent variability in mutation rates for human germline.
Sahakyan AB., Balasubramanian S.
BACKGROUND: Accurate knowledge of the core components of substitution rates is of vital importance to understand genome evolution and dynamics. By performing a single-genome and direct analysis of 39,894 retrotransposon remnants, we reveal sequence context-dependent germline nucleotide substitution rates for the human genome. RESULTS: The rates are characterised through rate constants in a time-domain, and are made available through a dedicated program (Trek) and a stand-alone database. Due to the nature of the method design and the imposed stringency criteria, we expect our rate constants to be good estimates for the rates of spontaneous mutations. Benefiting from such data, we study the short-range nucleotide (up to 7-mer) organisation and the germline basal substitution propensity (BSP) profile of the human genome; characterise novel, CpG-independent, substitution prone and resistant motifs; confirm a decreased tendency of moieties with low BSP to undergo somatic mutations in a number of cancer types; and, produce a Trek-based estimate of the overall mutation rate in human. CONCLUSIONS: The extended set of rate constants we report may enrich our resources and help advance our understanding of genome dynamics and evolution, with possible implications for the role of spontaneous mutations in the emergence of pathological genotypes and neutral evolution of proteomes.