Concept, design and implementation of a cardiovascular gene-centric 50 K SNP array for large-scale genomic association studies.
Keating BJ., Tischfield S., Murray SS., Bhangale T., Price TS., Glessner JT., Galver L., Barrett JC., Grant SFA., Farlow DN., Chandrupatla HR., Hansen M., Ajmal S., Papanicolaou GJ., Guo Y., Li M., Derohannessian S., de Bakker PIW., Bailey SD., Montpetit A., Edmondson AC., Taylor K., Gai X., Wang SS., Fornage M., Shaikh T., Groop L., Boehnke M., Hall AS., Hattersley AT., Frackelton E., Patterson N., Chiang CWK., Kim CE., Fabsitz RR., Ouwehand W., Price AL., Munroe P., Caulfield M., Drake T., Boerwinkle E., Reich D., Whitehead AS., Cappola TP., Samani NJ., Lusis AJ., Schadt E., Wilson JG., Koenig W., McCarthy MI., Kathiresan S., Gabriel SB., Hakonarson H., Anand SS., Reilly M., Engert JC., Nickerson DA., Rader DJ., Hirschhorn JN., Fitzgerald GA.
A wealth of genetic associations for cardiovascular and metabolic phenotypes in humans has accumulated over the last decade, in particular a large number of loci derived from recent genome-wide association studies (GWAS). True complex disease-associated loci often exert modest effects, so their delineation currently requires integration of diverse phenotypic data from large studies to ensure robust meta-analyses. We have designed a gene-centric 50 K single nucleotide polymorphism (SNP) array to assess potentially relevant loci across a range of cardiovascular, metabolic and inflammatory syndromes. The array utilizes a "cosmopolitan" tagging approach to capture the genetic diversity across approximately 2,000 loci in populations represented in the HapMap and SeattleSNPs projects. The array content is informed by GWAS of vascular and inflammatory disease, expression quantitative trait loci implicated in atherosclerosis, pathway-based approaches and comprehensive literature searching. The custom flexibility of the array platform facilitated interrogation of loci at differing stringencies, according to a gene prioritization strategy that allows saturation of high-priority loci with a greater density of markers than the existing GWAS tools, particularly in African HapMap samples. We also demonstrate that the IBC array can be used to complement GWAS, increasing coverage of high-priority loci related to cardiovascular disease (CVD) across all major HapMap populations. DNA from over 200,000 extensively phenotyped individuals will be genotyped with this array, with a significant portion of the generated data being released into the academic domain, facilitating in silico replication attempts, analyses of rare variants and cross-cohort meta-analyses in diverse populations. These datasets will also facilitate more robust secondary analyses, such as explorations with alternative genetic models, epistasis and gene-environment interactions.