Publications by authors named "Hugues Aschard"

73 Publications

Multitrait GWAS to connect disease variants and biological mechanisms.

PLoS Genet 2021 Aug 30;17(8):e1009713. Epub 2021 Aug 30.

Department of Computational Biology, Institut Pasteur, Paris, France.

Genome-wide association studies (GWASs) have uncovered a wealth of associations between common variants and human phenotypes. Here, we present an integrative analysis of GWAS summary statistics from 36 phenotypes to decipher multitrait genetic architecture and its link with biological mechanisms. Our framework incorporates multitrait association mapping along with an investigation of the breakdown of genetic associations into clusters of variants harboring similar multitrait association profiles. Focusing on two subsets of immunity and metabolism phenotypes, we then demonstrate how genetic variants within clusters can be mapped to biological pathways and disease mechanisms. Finally, for the metabolism set, we investigate the link between gene cluster assignment and the success of drug targets in randomized controlled trials.

Source
DOI: http://dx.doi.org/10.1371/journal.pgen.1009713
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8437297
August 2021

Multi-ancestry genome-wide gene-sleep interactions identify novel loci for blood pressure.

Mol Psychiatry 2021 Apr 15. Epub 2021 Apr 15.

Department of Epidemiology, University of Groningen, University Medical Center Groningen, Groningen, The Netherlands.

Long and short sleep duration are associated with elevated blood pressure (BP), possibly through effects on molecular pathways that influence neuroendocrine and vascular systems. To gain new insights into the genetic basis of sleep-related BP variation, we performed genome-wide gene by short or long sleep duration interaction analyses on four BP traits (systolic BP, diastolic BP, mean arterial pressure, and pulse pressure) across five ancestry groups in two stages using a 2 degree of freedom (df) joint test followed by a 1df test of interaction effects. Primary multi-ancestry analysis in 62,969 individuals in stage 1 identified three novel gene by sleep interactions that were replicated in an additional 59,296 individuals in stage 2 (stage 1 + 2 P < 5 × 10), including rs7955964 (FIGNL2/ANKRD33) that increases BP among long sleepers, and rs73493041 (SNORA26/C9orf170) and rs10406644 (KCTD15/LSM14A) that increase BP among short sleepers (P < 5 × 10). Secondary ancestry-specific analysis identified another novel gene by long sleep interaction at rs111887471 (TRPC3/KIAA1109) in individuals of African ancestry (P = 2 × 10). Combined stage 1 and 2 analyses additionally identified significant gene by long sleep interactions at 10 loci including MKLN1 and RGL3/ELAVL3 previously associated with BP, and significant gene by short sleep interactions at 10 loci including C2orf43 previously associated with BP (P < 10). The 2df test also identified novel loci for BP after modeling sleep that have known functions in sleep-wake regulation and in the nervous and cardiometabolic systems. This study indicates that sleep and primary mechanisms regulating BP may interact to elevate BP level, suggesting novel insights into sleep-related BP regulation.
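
As a rough illustration of the two tests named in this abstract, the sketch below computes a 2df joint Wald statistic for the genetic main effect and the gene-by-sleep interaction effect, and a 1df test of the interaction alone. It assumes the two coefficient estimates, their variances, and their covariance are available from a fitted regression. The function names and inputs are hypothetical and are not taken from the study's analysis pipeline.

```python
import math


def joint_2df_test(beta_g, beta_gxe, var_g, var_gxe, cov_g_gxe):
    """Wald test of H0: beta_g = 0 and beta_gxe = 0 (2 degrees of freedom)."""
    det = var_g * var_gxe - cov_g_gxe ** 2
    if det <= 0:
        raise ValueError("covariance matrix must be positive definite")
    # Quadratic form b' V^{-1} b, with the 2x2 inverse written out by hand.
    stat = (beta_g ** 2 * var_gxe
            - 2 * beta_g * beta_gxe * cov_g_gxe
            + beta_gxe ** 2 * var_g) / det
    # For a chi-square with 2 df the survival function is exactly exp(-x / 2).
    p_value = math.exp(-stat / 2)
    return stat, p_value


def interaction_1df_test(beta_gxe, var_gxe):
    """Two-sided Wald test of H0: beta_gxe = 0 (1 degree of freedom)."""
    z = beta_gxe / math.sqrt(var_gxe)
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value
```

The joint test can detect a variant whose main effect and interaction effect are each modest but jointly informative, which is one plausible reason studies of this kind report both tests.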

Source
DOI: http://dx.doi.org/10.1038/s41380-021-01087-0
April 2021

Estimating the effective sample size in association studies of quantitative traits.

G3 (Bethesda) 2021 Mar 18. Epub 2021 Mar 18.

Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA, USA.

The effective sample size (ESS) is a metric used to summarize in a single term the amount of correlation in a sample. It is of particular interest when predicting the statistical power of genome-wide association studies (GWAS) based on linear mixed models. Here, we introduce an analytical form of the ESS for mixed-model GWAS of quantitative traits and relate it to empirical estimators recently proposed. Using our framework, we derived approximations of the ESS for analyses of related and unrelated samples and for both marginal genetic and gene-environment interaction tests. We conducted simulations to validate our approximations and to provide a quantitative perspective on the statistical power of various scenarios, including power loss due to family relatedness and power gains due to conditioning on the polygenic signal. Our analyses also demonstrate that the power of gene-environment interaction GWAS in related individuals strongly depends on the family structure and exposure distribution. Finally, we performed a series of mixed-model GWAS on data from the UK Biobank and confirmed the simulation results. We notably found that the expected power drop due to family relatedness in the UK Biobank is negligible.

Source
DOI: http://dx.doi.org/10.1093/g3journal/jkab057
March 2021

Genetic meta-analysis of cancer diagnosis following statin use identifies new associations and implicates human leukocyte antigen (HLA) in women.

Pharmacogenomics J 2021 Aug 1;21(4):446-457. Epub 2021 Mar 1.

Université de Montréal Beaulieu-Saucier Pharmacogenomics Center, Montreal, QC, Canada.

We sought to perform a genomic evaluation of the risk of incident cancer in statin users, free of cancer at study entry. We included patients with genetic data who previously participated in two phase IV trials (TNT and IDEAL) (n = 11,196). A GWAS meta-analysis using Cox modeling for the prediction of incident cancer was conducted in the pooled cohort and stratified by sex. rs13210472 (near HLA-DOA gene) was associated with higher risk of incident cancer amongst women with prevalent coronary artery disease (CAD) taking statins (hazard ratio [HR]: 2.66, 95% confidence interval [CI]: 1.88-3.76, P = 3.5 × 10). Using the UK Biobank and focusing exclusively on women statin users with CAD (n = 2952), rs13210472 remained significantly associated with incident cancer (HR: 1.71, 95% CI: 1.14-2.56, P = 9.0 × 10). The association was not observed in non-statin users. In this genetic meta-analysis, we have identified a variant in women statin users with prevalent CAD that was associated with incident cancer, possibly implicating the human leukocyte antigen pathway.

Source
DOI: http://dx.doi.org/10.1038/s41397-021-00221-z
August 2021

Intraocular Pressure, Glaucoma, and Dietary Caffeine Consumption: A Gene-Diet Interaction Study from the UK Biobank.

Ophthalmology 2021 Jun 14;128(6):866-876. Epub 2020 Dec 14.

Department of Ophthalmology, Icahn School of Medicine at Mount Sinai, New York, New York.

Purpose: We examined the association of habitual caffeine intake with intraocular pressure (IOP) and glaucoma and whether genetic predisposition to higher IOP modified these associations. We also assessed whether genetic predisposition to higher coffee consumption was related to IOP.

Design: Cross-sectional study in the UK Biobank.

Participants: We included 121 374 participants (baseline ages, 39-73 years) with data on coffee and tea intake (collected 2006-2010) and corneal-compensated IOP measurements in 2009. In a subset of 77 906 participants with up to 5 web-based 24-hour-recall food frequency questionnaires (2009-2012), we evaluated total caffeine intake. We also assessed the same relationships with glaucoma (9286 cases and 189 763 controls).

Methods: We evaluated multivariable-adjusted associations with IOP using linear regression and with glaucoma using logistic regression. For both outcomes, we examined gene-diet interactions using a polygenic risk score (PRS) that combined the effects of 111 genetic variants associated with IOP. We also performed Mendelian randomization using 8 genetic variants associated with coffee intake to assess potential causal effects of coffee consumption on IOP.

Main Outcome Measures: Intraocular pressure and glaucoma.

Results: Mendelian randomization analysis did not support a causal effect of coffee drinking on IOP (P > 0.1). Greater caffeine intake was associated weakly with lower IOP: the highest (≥232 mg/day) versus lowest (<87 mg/day) caffeine consumption was associated with a 0.10-mmHg lower IOP (P = 0.01). However, the IOP PRS modified this association: among those in the highest IOP PRS quartile, consuming > 480 mg/day versus < 80 mg/day was associated with a 0.35-mmHg higher IOP (P = 0.01). The relationship between caffeine intake and glaucoma was null (P ≥ 0.1). However, the IOP PRS also modified this relationship: compared with those in the lowest IOP PRS quartile consuming no caffeine, those in the highest IOP PRS quartile consuming ≥ 321 mg/day showed a 3.90-fold higher glaucoma prevalence (P = 0.0003).

Conclusions: Habitual caffeine consumption was associated weakly with lower IOP, and the association between caffeine consumption and glaucoma was null. However, among participants with the strongest genetic predisposition to elevated IOP, greater caffeine consumption was associated with higher IOP and higher glaucoma prevalence.
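
The abstract above describes a polygenic risk score (PRS) that combines the effects of 111 IOP-associated variants and compares participants by PRS quartile. A minimal sketch of that general idea follows, assuming the PRS is a weighted sum of risk-allele dosages and that quartiles are assigned by rank. The function names, the toy inputs, and the rank-based quartile rule are illustrative assumptions, not the study's code.

```python
def polygenic_risk_score(dosages, weights):
    """Weighted sum of risk-allele dosages (0 to 2 per variant) for one person."""
    if len(dosages) != len(weights):
        raise ValueError("dosages and weights must have the same length")
    return sum(d * w for d, w in zip(dosages, weights))


def quartile_labels(scores):
    """Assign each score a quartile label 1..4 based on its rank in the sample."""
    n = len(scores)
    order = sorted(range(n), key=lambda i: scores[i])
    labels = [0] * n
    for rank, i in enumerate(order):
        labels[i] = 1 + (4 * rank) // n
    return labels


# Toy example with three hypothetical variants and four hypothetical people.
weights = [0.1, 0.2, 0.3]
people = [[0, 1, 2], [0, 0, 0], [2, 2, 2], [1, 0, 0]]
scores = [polygenic_risk_score(p, weights) for p in people]
quartiles = quartile_labels(scores)
```

An interaction analysis of the kind described would then compare the caffeine association within each quartile, or add a PRS-by-caffeine product term to the regression.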

Source
DOI: http://dx.doi.org/10.1016/j.ophtha.2020.12.009
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8154631
June 2021

Ultrarare heterozygous pathogenic variants of genes causing dominant forms of early-onset deafness underlie severe presbycusis.

Proc Natl Acad Sci U S A 2020 Dec 23;117(49):31278-31289. Epub 2020 Nov 23.

Centre de Bioinformatique, Biostatistique et Biologie Intégrative, Institut Pasteur, 75015 Paris, France.

Presbycusis, or age-related hearing loss (ARHL), is a major public health issue. About half the phenotypic variance has been attributed to genetic factors. Here, we assessed the contribution to presbycusis of ultrarare pathogenic variants, considered indicative of Mendelian forms. We focused on severe presbycusis without environmental or comorbidity risk factors and studied multiplex family age-related hearing loss (mARHL) and simplex/sporadic age-related hearing loss (sARHL) cases and controls with normal hearing by whole-exome sequencing. Ultrarare variants (allele frequency [AF] < 0.0001) of 35 genes responsible for autosomal dominant early-onset forms of deafness, predicted to be pathogenic, were detected in 25.7% of mARHL and 22.7% of sARHL cases vs. 7.5% of controls (P = 0.001); half were previously unknown (AF < 0.000002). , , , and variants were present in 8.9% of ARHL cases but less than 1% of controls. Evidence for a causal role of variants in presbycusis was provided by pathogenicity prediction programs, documented haploinsufficiency, three-dimensional structure/function analyses, cell biology experiments, and reported early effects. We also established mice, carrying the :p.(Asn327Ile) variant detected in an mARHL case, as a mouse model for a monogenic form of presbycusis. Deafness gene variants can thus result in a continuum of auditory phenotypes. Our findings demonstrate that the genetics of presbycusis is shaped not only by well-studied polygenic risk factors of small effect size revealed by common variants but also by ultrarare variants likely resulting in monogenic forms, thereby paving the way for treatment with emerging inner ear gene therapy.

Source
DOI: http://dx.doi.org/10.1073/pnas.2010782117
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7733833
December 2020

Deriving stratified effects from joint models investigating gene-environment interactions.

BMC Bioinformatics 2020 Jun 18;21(1):251. Epub 2020 Jun 18.

Department of Computational Biology, USR 3756 CNRS, Institut Pasteur, Paris, France.

Background: Models including an interaction term and performing a joint test of SNP and/or interaction effect are often used to discover Gene-Environment (GxE) interactions. When the environmental exposure is a binary variable, analyses from exposure-stratified models, which estimate the genetic effect in unexposed and exposed individuals separately, can be of interest. In large-scale consortia focusing on GxE interactions in which only the joint test has been performed, it may be challenging to get summary statistics from both exposure-stratified and marginal (i.e., not accounting for interaction) models.

Results: In this work, we developed a simple framework to estimate summary statistics in each stratum of a binary exposure and in the marginal model using summary statistics from the "joint" model. We performed simulation studies to assess our estimators' accuracy and examined potential sources of bias, such as correlation between genotype and exposure and differing phenotypic variances within exposure strata. Results from these simulations highlight the high theoretical accuracy of our estimators and yield insights into the impact of potential sources of bias. We then applied our methods to real data and demonstrated our estimators' retained accuracy after filtering SNPs by sample size to mitigate potential bias.

Conclusions: These analyses demonstrated the accuracy of our method in estimating both stratified and marginal summary statistics from a joint model of gene-environment interaction. In addition to facilitating the interpretation of GxE screenings, this work could be used to guide further functional analyses. We provide a user-friendly Python script to apply this strategy to real datasets. The Python script and documentation are available at https://gitlab.pasteur.fr/statistical-genetics/j2s.
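
To make the idea of stratified effects concrete, here is a minimal sketch based on the textbook relationship for a binary exposure coded 0/1 in a model with a genetic term and a gene-by-exposure product term. It assumes the covariance between the two coefficient estimates is available, which joint-model summary statistics do not always include. This is not the authors' j2s script, whose estimators may differ, and all names below are hypothetical.

```python
import math


def stratified_effects(beta_g, beta_gxe, se_g, se_gxe, cov_g_gxe):
    """Genetic effect and standard error in unexposed (E=0) and exposed (E=1).

    With E coded 0/1, the genetic effect is beta_g when E=0 and
    beta_g + beta_gxe when E=1.
    """
    unexposed = (beta_g, se_g)
    # Var(a + b) = Var(a) + Var(b) + 2 Cov(a, b)
    var_exposed = se_g ** 2 + se_gxe ** 2 + 2 * cov_g_gxe
    if var_exposed <= 0:
        raise ValueError("inconsistent variance/covariance inputs")
    exposed = (beta_g + beta_gxe, math.sqrt(var_exposed))
    return unexposed, exposed


# Hypothetical joint-model estimates for one SNP.
unexposed, exposed = stratified_effects(
    beta_g=0.10, beta_gxe=0.05, se_g=0.03, se_gxe=0.04, cov_g_gxe=-0.0008
)
```

The covariance between the main-effect and interaction estimates is typically negative, so ignoring it would tend to overstate the standard error in the exposed stratum.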

Source
DOI: http://dx.doi.org/10.1186/s12859-020-03569-4
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7302007
June 2020

Gene-educational attainment interactions in a multi-ancestry genome-wide meta-analysis identify novel blood pressure loci.

Mol Psychiatry 2021 Jun 5;26(6):2111-2125. Epub 2020 May 5.

Health Disparities Research Section, Laboratory of Epidemiology and Population Sciences, National Institute on Aging, National Institutes of Health, Baltimore, MD, 21224, USA.

Educational attainment is widely used as a surrogate for socioeconomic status (SES). Low SES is a risk factor for hypertension and high blood pressure (BP). To identify novel BP loci, we performed multi-ancestry meta-analyses accounting for gene-educational attainment interactions using two variables, "Some College" (yes/no) and "Graduated College" (yes/no). Interactions were evaluated using both a 1 degree of freedom (DF) interaction term and a 2DF joint test of genetic and interaction effects. Analyses were performed for systolic BP, diastolic BP, mean arterial pressure, and pulse pressure. We pursued genome-wide interrogation in Stage 1 studies (N = 117 438) and follow-up on promising variants in Stage 2 studies (N = 293 787) in five ancestry groups. Through combined meta-analyses of Stages 1 and 2, we identified 84 known and 18 novel BP loci at genome-wide significance level (P < 5 × 10⁻⁸). Two novel loci were identified based on the 1DF test of interaction with educational attainment, while the remaining 16 loci were identified through the 2DF joint test of genetic and interaction effects. Ten novel loci were identified in individuals of African ancestry. Several novel loci show strong biological plausibility since they involve physiologic systems implicated in BP regulation. They include genes involved in the central nervous system-adrenal signaling axis (ZDHHC17, CADPS, PIK3C2G), vascular structure and function (GNB3, CDON), and renal function (HAS2 and HAS2-AS1, SLIT3). Collectively, these findings suggest a role of educational attainment or SES in further dissection of the genetic architecture of BP.

Source
DOI: http://dx.doi.org/10.1038/s41380-020-0719-3
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7641978
June 2021

JASS: command line and web interface for the joint analysis of GWAS results.

NAR Genom Bioinform 2020 Mar 24;2(1):lqaa003. Epub 2020 Jan 24.

Department of Computational Biology-USR 3756 CNRS, Institut Pasteur, 75015 Paris, France.

Genome-wide association studies (GWAS) have been the driving force for identifying associations between genetic variants and human phenotypes. Thousands of GWAS summary statistics covering a broad range of human traits and diseases are now publicly available. These GWAS have proven their utility for a range of secondary analyses, including in particular the joint analysis of multiple phenotypes to identify new associated genetic variants. However, although several methods have been proposed, there are very few large-scale applications published so far because of challenges in implementing these methods on real data. Here, we present JASS (Joint Analysis of Summary Statistics), a polyvalent Python package that addresses this need. Our package incorporates recently developed joint tests such as the omnibus approach and various weighted sum of Z-score tests while solving all practical and computational barriers for large-scale multivariate analysis of GWAS summary statistics. This includes data cleaning and harmonization tools, an efficient algorithm for fast derivation of joint statistics, an optimized data management process and a web interface for exploration purposes. Both benchmark analyses and real data applications demonstrated the robustness and strong potential of JASS for the detection of new associated genetic variants. Our package is freely available at https://gitlab.pasteur.fr/statistical-genetics/jass.
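
For readers unfamiliar with the joint tests mentioned in this abstract, the sketch below shows one generic form of a weighted sum of Z-scores test for a single variant across several traits. It assumes that, under the null, the per-trait Z-scores are jointly normal with a known correlation matrix. It does not use or reproduce the JASS API, and the function name and inputs are hypothetical.

```python
import math


def sum_z_test(z_scores, weights, corr):
    """Weighted sum of Z-scores test across traits for one variant.

    z_scores: per-trait Z-scores for the variant
    weights:  per-trait weights chosen by the analyst
    corr:     null correlation matrix of the Z-scores (list of lists)
    """
    k = len(z_scores)
    numerator = sum(w * z for w, z in zip(weights, z_scores))
    # Variance of the weighted sum under the null: w' * corr * w
    variance = sum(weights[i] * corr[i][j] * weights[j]
                   for i in range(k) for j in range(k))
    z_combined = numerator / math.sqrt(variance)
    p_value = math.erfc(abs(z_combined) / math.sqrt(2))  # two-sided
    return z_combined, p_value


# Two uncorrelated traits with equal weights.
z_combined, p_value = sum_z_test([2.0, 2.0], [1.0, 1.0],
                                 [[1.0, 0.0], [0.0, 1.0]])
```

An omnibus test instead uses the quadratic form of the Z-score vector with the inverse correlation matrix, which makes no assumption about the direction of effects across traits.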

Source
DOI: http://dx.doi.org/10.1093/nargab/lqaa003
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6978790
March 2020

Fine-mapping of 150 breast cancer risk regions identifies 191 likely target genes.

Nat Genet 2020 Jan 7;52(1):56-73. Epub 2020 Jan 7.

Unit of Medical Genetics, Department of Medical Oncology and Hematology, Fondazione IRCCS Istituto Nazionale dei Tumori di Milano, Milan, Italy.

Genome-wide association studies have identified breast cancer risk variants in over 150 genomic regions, but the mechanisms underlying risk remain largely unknown. These regions were explored by combining association analysis with in silico genomic feature annotations. We defined 205 independent risk-associated signals with the set of credible causal variants in each one. In parallel, we used a Bayesian approach (PAINTOR) that combines genetic association, linkage disequilibrium and enriched genomic features to determine variants with high posterior probabilities of being causal. Potentially causal variants were significantly over-represented in active gene regulatory regions and transcription factor binding sites. We applied our INQUISIT pipeline for prioritizing genes as targets of those potentially causal variants, using gene expression (expression quantitative trait loci), chromatin interaction and functional annotations. Known cancer drivers, transcription factors and genes in the developmental, apoptosis, immune system and DNA integrity checkpoint gene ontology pathways were over-represented among the highest-confidence target genes.

Source
DOI: http://dx.doi.org/10.1038/s41588-019-0537-1
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6974400
January 2020

Mixed-model admixture mapping identifies smoking-dependent loci of lung function in African Americans.

Eur J Hum Genet 2020 May 13;28(5):656-668. Epub 2019 Dec 13.

Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA, USA.

Admixture mapping has led to the discovery of many genes associated with differential disease risk by ancestry, highlighting the importance of ancestry-based approaches to association studies. However, the potential of admixture mapping in deciphering the interplay between genes and environmental exposures has seldom been explored. Here we performed a genome-wide screening of local ancestry-smoking interactions for five spirometric lung function phenotypes in 3300 African Americans from the COPDGene study. To account for population structure and outcome heterogeneity across exposure groups, we developed a multi-component linear mixed model for mapping gene-environment interactions and empirically showed its robustness and increased power. When applied to the COPDGene study, our approach identified two loci, 11p15.2-3 and 2q37, exhibiting local ancestry-smoking interactions at the genome-wide significance level, which would have been missed by standard single-nucleotide polymorphism analyses. These two loci harbor the PARVA and RAB17 genes previously recognized to be involved in smoking behavior. Overall, our study provides the first evidence for potential synergistic effects between African ancestry and smoking on pulmonary function, and underlines the importance of ethnic diversity in genetic studies.

Source
DOI: http://dx.doi.org/10.1038/s41431-019-0545-8
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7171162
May 2020

Making the Most of Clumping and Thresholding for Polygenic Scores.

Am J Hum Genet 2019 Dec 21;105(6):1213-1221. Epub 2019 Nov 21.

Laboratoire TIMC-IMAG, UMR 5525, Univ. Grenoble Alpes, CNRS, La Tronche, France.

Polygenic prediction has the potential to contribute to precision medicine. Clumping and thresholding (C+T) is a widely used method to derive polygenic scores. When using C+T, several p value thresholds are tested to maximize predictive ability of the derived polygenic scores. Along with this p value threshold, we propose to tune three other hyper-parameters for C+T. We implement an efficient way to derive thousands of different C+T scores corresponding to a grid over four hyper-parameters. For example, it takes a few hours to derive 123K different C+T scores for 300K individuals and 1M variants using 16 physical cores. We find that optimizing over these four hyper-parameters improves the predictive performance of C+T in both simulations and real data applications as compared to tuning only the p value threshold. A particularly large increase can be noted when predicting depression status, from an AUC of 0.557 (95% CI: [0.544-0.569]) when tuning only the p value threshold to an AUC of 0.592 (95% CI: [0.580-0.604]) when tuning all four hyper-parameters we propose for C+T. We further propose stacked clumping and thresholding (SCT), a polygenic score that results from stacking all derived C+T scores. Instead of choosing one set of hyper-parameters that maximizes prediction in some training set, SCT learns an optimal linear combination of all C+T scores by using an efficient penalized regression. We apply SCT to eight different case-control diseases in the UK Biobank data and find that SCT substantially improves prediction accuracy with an average AUC increase of 0.035 over standard C+T.
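
The basic C+T procedure that this abstract builds on can be sketched in a few lines. The version below is a simplified illustration under stated assumptions: it takes a precomputed table of pairwise squared correlations, tunes only a p value threshold and an r² threshold, and ignores the other hyper-parameters the abstract alludes to. The data layout and function names are hypothetical and do not reflect the authors' implementation.

```python
def clump_and_threshold(variants, ld_r2, p_threshold, r2_threshold):
    """Return {variant id: effect size} for the variants kept by C+T.

    variants: list of dicts with keys "id", "beta", "p"
    ld_r2:    dict mapping frozenset({id_a, id_b}) to squared correlation;
              pairs that are absent are treated as uncorrelated
    """
    # Thresholding: keep variants passing the p value cutoff, most significant first.
    candidates = sorted((v for v in variants if v["p"] <= p_threshold),
                        key=lambda v: v["p"])
    kept = []
    for v in candidates:
        # Clumping: drop a variant that is too correlated with one already kept.
        if all(ld_r2.get(frozenset((v["id"], k["id"])), 0.0) < r2_threshold
               for k in kept):
            kept.append(v)
    return {v["id"]: v["beta"] for v in kept}


def polygenic_score(dosages, weights):
    """Score one individual from a {variant id: dosage} mapping."""
    return sum(beta * dosages.get(vid, 0.0) for vid, beta in weights.items())


# Toy data: A and B are in strong LD, C is not significant.
variants = [
    {"id": "A", "beta": 0.2, "p": 1e-8},
    {"id": "B", "beta": 0.1, "p": 1e-6},
    {"id": "C", "beta": 0.05, "p": 0.5},
]
ld = {frozenset(("A", "B")): 0.9}
weights = clump_and_threshold(variants, ld, p_threshold=1e-4, r2_threshold=0.2)
score = polygenic_score({"A": 2, "B": 1}, weights)
```

Running this function over a grid of threshold values yields many candidate scores, which is the kind of collection a stacking step such as SCT could then combine.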

Source
DOI: http://dx.doi.org/10.1016/j.ajhg.2019.11.001
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6904799
December 2019

Multi-ancestry sleep-by-SNP interaction analysis in 126,926 individuals reveals lipid loci stratified by sleep duration.

Nat Commun 2019 Nov 12;10(1):5121. Epub 2019 Nov 12.

Department of Clinical Epidemiology, Leiden University Medical Center, Leiden, Netherlands.

Both short and long sleep are associated with an adverse lipid profile, likely through different biological pathways. To elucidate the biology of sleep-associated adverse lipid profile, we conduct multi-ancestry genome-wide sleep-SNP interaction analyses on three lipid traits (HDL-c, LDL-c and triglycerides). In the total study sample (discovery + replication) of 126,926 individuals from 5 different ancestry groups, when considering either long or short total sleep time interactions in joint analyses, we identify 49 previously unreported lipid loci, and 10 additional previously unreported lipid loci in a restricted sample of European-ancestry cohorts. In addition, we identify new gene-sleep interactions for known lipid loci such as LPL and PCSK9. The previously unreported lipid loci have a modest explained variance in lipid levels: most notably, gene-short-sleep interactions explain 4.25% of the variance in triglyceride level. Collectively, these findings contribute to our understanding of the biological mechanisms involved in sleep-associated adverse lipid profiles.

Source
DOI: http://dx.doi.org/10.1038/s41467-019-12958-0
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6851116
November 2019

A comprehensive study of metabolite genetics reveals strong pleiotropy and heterogeneity across time and context.

Nat Commun 2019 Oct 21;10(1):4788. Epub 2019 Oct 21.

Department of Computational Biology - USR 3756 CNRS, Institut Pasteur, Paris, France.

Genetic studies of metabolites have identified thousands of variants, many of which are associated with downstream metabolic and obesogenic disorders. However, these studies have relied on univariate analyses, reducing power and limiting context-specific understanding. Here we aim to provide an integrated perspective of the genetic basis of metabolites by leveraging the Finnish Metabolic Syndrome In Men (METSIM) cohort, a unique genetic resource which contains metabolic measurements, mostly lipids, across distinct time points as well as information on statin usage. We increase effective sample size by an average of two-fold by applying the Covariates for Multi-phenotype Studies (CMS) approach, identifying 588 significant SNP-metabolite associations, including 228 new associations. Our analysis pinpoints a small number of master metabolic regulator genes, balancing the relative proportion of dozens of metabolite levels. We further identify associations to changes in metabolic levels across time as well as genetic interactions with statin at both the master metabolic regulator and genome-wide level.

Source
DOI: http://dx.doi.org/10.1038/s41467-019-12703-7
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6803661
October 2019

RAISS: robust and accurate imputation from summary statistics.

Bioinformatics 2019 Nov;35(22):4837-4839

Groupe de Génétique Statistique, Département de Génomes and Génétique, C3BI, Institut Pasteur, Paris, France.

Motivation: Multi-trait analyses using public summary statistics from genome-wide association studies (GWASs) are becoming increasingly popular. A constraint of multi-trait methods is that they require complete summary data for all traits. Although methods for the imputation of summary statistics exist, they lack precision for genetic variants with small effect size. This is benign for univariate analyses where only variants with large effect size are selected a posteriori. However, it can lead to strong p-value inflation in multi-trait testing. Here we present a new approach that improves on existing imputation methods and reaches a precision suitable for multi-trait analyses.

Results: We fine-tuned parameters to obtain highly accurate imputation from summary statistics. We demonstrate this accuracy for variants of all effect sizes on real data from 28 GWASs. We implemented the resulting methodology in a Python package specially designed to efficiently impute multiple GWASs in parallel.

Availability And Implementation: The Python package is available at https://gitlab.pasteur.fr/statistical-genetics/raiss; its accompanying documentation is accessible at http://statistical-genetics.pages.pasteur.fr/raiss/.

Supplementary Information: Supplementary data are available at Bioinformatics online.
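
Summary-statistic imputation is commonly formulated as a conditional expectation under a multivariate normal model: the Z-score of an untyped variant is predicted from typed neighbors through the linkage disequilibrium (LD) matrix. The sketch below illustrates that general formulation with a ridge term added to the diagonal for numerical stability. It is an assumption-laden illustration, not the RAISS code or API; the function names, the default ridge value, and the plain Gaussian elimination solver are all hypothetical choices.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular or nearly singular")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def impute_z(z_typed, ld_typed, ld_untyped_typed, ridge=0.1):
    """Impute one untyped variant's Z-score from typed variants.

    z_typed:          Z-scores of the typed variants
    ld_typed:         LD correlation matrix among typed variants
    ld_untyped_typed: LD correlations between the untyped and typed variants
    ridge:            value added to the diagonal of ld_typed (assumed default)
    """
    n = len(z_typed)
    regularized = [[ld_typed[i][j] + (ridge if i == j else 0.0)
                    for j in range(n)] for i in range(n)]
    # weights = (ld_typed + ridge * I)^{-1} * ld_untyped_typed
    weights = solve_linear_system(regularized, ld_untyped_typed)
    z_hat = sum(w * z for w, z in zip(weights, z_typed))
    # Variance of the imputed score, usable as a rough quality measure.
    quality = sum(w * c for w, c in zip(weights, ld_untyped_typed))
    return z_hat, quality
```

A low quality value indicates that the typed neighbors carry little information about the untyped variant, in which case the imputed Z-score shrinks toward zero.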

Source
DOI: http://dx.doi.org/10.1093/bioinformatics/btz466
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6853677
November 2019

A multi-ancestry genome-wide study incorporating gene-smoking interactions identifies multiple new loci for pulse pressure and mean arterial pressure.

Hum Mol Genet 2019 Aug;28(15):2615-2633

Icelandic Heart Association, Kopavogur, Iceland.

Elevated blood pressure (BP), a leading cause of global morbidity and mortality, is influenced by both genetic and lifestyle factors. Cigarette smoking is one such lifestyle factor. Across five ancestries, we performed a genome-wide gene-smoking interaction study of mean arterial pressure (MAP) and pulse pressure (PP) in 129 913 individuals in stage 1 and follow-up analysis in 480 178 additional individuals in stage 2. We report here 136 loci significantly associated with MAP and/or PP. Of these, 61 were previously published through main-effect analysis of BP traits, 37 were recently reported by us for systolic BP and/or diastolic BP through gene-smoking interaction analysis and 38 were newly identified (P < 5 × 10⁻⁸, false discovery rate < 0.05). We also identified nine new signals near known loci. Of the 136 loci, 8 showed significant interaction with smoking status. They include CSMD1, previously reported for insulin resistance and BP in spontaneously hypertensive rats. Many of the 38 new loci show biologic plausibility for a role in BP regulation. SLC26A7 encodes a chloride/bicarbonate exchanger expressed in the renal outer medullary collecting duct. AVPR1A is widely expressed, including in vascular smooth muscle cells, kidney, myocardium and brain. FHAD1 is a long non-coding RNA overexpressed in heart failure. TMEM51 was associated with contractile function in cardiomyocytes. CASP9 plays a central role in cardiomyocyte apoptosis. Thirty novel loci were identified only in individuals of African ancestry. Our findings highlight the value of multi-ancestry investigations, particularly in studies of interaction with lifestyle factors, where genomic and lifestyle differences may contribute to novel findings.

Source
DOI: http://dx.doi.org/10.1093/hmg/ddz070
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6644157
August 2019

Genetic Correlations Between Diabetes and Glaucoma: An Analysis of Continuous and Dichotomous Phenotypes.

Am J Ophthalmol 2019 Oct 20;206:245-255. Epub 2019 May 20.

Channing Division of Network Medicine, Harvard Medical School, Brigham and Women's Hospital, Boston, Massachusetts, USA; Department of Ophthalmology, Icahn School of Medicine at Mount Sinai, New York, New York, USA.

Purpose: A genetic correlation is the proportion of phenotypic variance between traits that is shared on a genetic basis. Here we explore genetic correlations between diabetes- and glaucoma-related traits.

Design: Cross-sectional study.

Methods: We assembled genome-wide association study summary statistics from European-derived participants regarding diabetes-related traits like fasting blood sugar (FBS) and type 2 diabetes (T2D) and glaucoma-related traits (intraocular pressure [IOP], central corneal thickness [CCT], corneal hysteresis [CH], corneal resistance factor [CRF], cup-to-disc ratio [CDR], and primary open-angle glaucoma [POAG]). We included data from the National Eye Institute Glaucoma Human Genetics Collaboration Heritable Overall Operational Database, the UK Biobank, and the International Glaucoma Genetics Consortium. We calculated genetic correlation (r) between traits using linkage disequilibrium score regression. We also calculated genetic correlations between IOP, CCT, and select diabetes-related traits based on individual level phenotype data in 2 Northern European population-based samples using pedigree information and Sequential Oligogenic Linkage Analysis Routines.

Results: Overall, there was little r between diabetes- and glaucoma-related traits. Specifically, we found a nonsignificant negative correlation between T2D and POAG (r = -0.14; P = .16). Using Sequential Oligogenic Linkage Analysis Routines, the genetic correlations between measured IOP, CCT, FBS, fasting insulin, and hemoglobin A1c were null. In contrast, genetic correlations between IOP and POAG (r ≥ 0.45; P ≤ 3.0 × 10) and between CDR and POAG were high (r = 0.57; P = 2.8 × 10). However, genetic correlations between corneal properties (CCT, CRF, and CH) and POAG were low (r range -0.18 to 0.11) and nonsignificant (P ≥ .07).

Conclusion: These analyses suggest that there is limited genetic correlation between diabetes- and glaucoma-related traits.

Source
DOI: http://dx.doi.org/10.1016/j.ajo.2019.05.015
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6864262
October 2019

Multi-ancestry genome-wide gene-smoking interaction study of 387,272 individuals identifies new loci associated with serum lipids.

Nat Genet 2019 04 29;51(4):636-648. Epub 2019 Mar 29.

Human Genomics Laboratory, Pennington Biomedical Research Center, Baton Rouge, LA, USA.

The concentrations of high- and low-density-lipoprotein cholesterol and triglycerides are influenced by smoking, but it is unknown whether genetic associations with lipids may be modified by smoking. We conducted a multi-ancestry genome-wide gene-smoking interaction study in 133,805 individuals with follow-up in an additional 253,467 individuals. Combined meta-analyses identified 13 new loci associated with lipids, some of which were detected only because association differed by smoking status. Additionally, we demonstrate the importance of including diverse populations, particularly in studies of interactions with lifestyle factors, where genomic and lifestyle differences by ancestry may contribute to novel findings.

Source
DOI: http://dx.doi.org/10.1038/s41588-019-0378-y
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6467258
April 2019

Genetic effects on the commensal microbiota in inflammatory bowel disease patients.

PLoS Genet 2019 03 8;15(3):e1008018. Epub 2019 Mar 8.

Department of Gastroenterology, Saint Antoine Hospital, Paris, France.

Several bacteria in the gut microbiota have been shown to be associated with inflammatory bowel disease (IBD), and dozens of IBD genetic variants have been identified in genome-wide association studies. However, the role of the microbiota in the etiology of IBD in terms of host genetic susceptibility remains unclear. Here, we studied the association between four major genetic variants associated with an increased risk of IBD and bacterial taxa in up to 633 IBD cases. We performed systematic screening for associations, identifying and replicating associations between NOD2 variants and two taxa: the Roseburia genus and the Faecalibacterium prausnitzii species. By exploring the overall association patterns between genes and bacteria, we found that IBD risk alleles were significantly enriched for associations concordant with bacteria-IBD associations. To understand the significance of this pattern in terms of the study design and known effects from the literature, we used counterfactual principles to assess the fitness of a few parsimonious gene-bacteria-IBD causal models. Our analyses showed evidence that the disease risk of these genetic variants was likely to be partially mediated by the microbiome. We confirmed these results in extensive simulation studies and sensitivity analyses using the association between NOD2 and F. prausnitzii as a case study.

Source
DOI: http://dx.doi.org/10.1371/journal.pgen.1008018
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6426259
March 2019

Efficient Implementation of Penalized Regression for Genetic Risk Prediction.

Genetics 2019 05 26;212(1):65-74. Epub 2019 Feb 26.

Laboratoire TIMC-IMAG, UMR 5525, University of Grenoble Alpes, CNRS, 38700 La Tronche, France

Polygenic Risk Scores (PRS) combine genotype information across many single-nucleotide polymorphisms (SNPs) to give a score reflecting the genetic risk of developing a disease. PRS might have a major impact on public health, possibly allowing for screening campaigns to identify high-genetic risk individuals for a given disease. The "Clumping+Thresholding" (C+T) approach is the most common method to derive PRS. C+T uses only univariate genome-wide association studies (GWAS) summary statistics, which makes it fast and easy to use. However, previous work showed that jointly estimating SNP effects for computing PRS has the potential to significantly improve the predictive performance of PRS as compared to C+T. In this paper, we present an efficient method for the joint estimation of SNP effects using individual-level data, allowing for practical application of penalized logistic regression (PLR) on modern datasets including hundreds of thousands of individuals. Moreover, our implementation of PLR directly includes automatic choices for hyper-parameters. We also provide an implementation of penalized linear regression for quantitative traits. We compare the performance of PLR, C+T and a derivation of random forests using both real and simulated data. Overall, we find that PLR achieves equal or higher predictive performance than C+T in most scenarios considered, while being scalable to biobank data. In particular, we find that improvement in predictive performance is more pronounced when there are few effects located in nearby genomic regions with correlated SNPs; for instance, in simulations, AUC values increase from 83% with the best prediction of C+T to 92.5% with PLR. We confirm these results in a data analysis of a case-control study for celiac disease where PLR and the standard C+T method achieve AUC values of 89% and of 82.5%. 
Applying penalized linear regression to 350,000 individuals of the UK Biobank, we predict height with a larger correlation than with the best prediction of C+T (∼65% instead of ∼55%), further demonstrating its scalability and strong predictive power, even for highly polygenic traits. Moreover, using 150,000 individuals of the UK Biobank, we are able to predict breast cancer better than C+T, fitting PLR in a few minutes only. In conclusion, this paper demonstrates the feasibility and relevance of using penalized regression for PRS computation when large individual-level datasets are available, thanks to the efficient implementation available in our R package bigstatsr.

Source
DOI: http://dx.doi.org/10.1534/genetics.119.302019
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6499521
May 2019

Multiancestry Genome-Wide Association Study of Lipid Levels Incorporating Gene-Alcohol Interactions.

Am J Epidemiol 2019 06;188(6):1033-1054

Department of Epidemiology and Biostatistics, Imperial College London, London, United Kingdom.

A person's lipid profile is influenced by genetic variants and alcohol consumption, but the contribution of interactions between these exposures has not been studied. We therefore incorporated gene-alcohol interactions into a multiancestry genome-wide association study of levels of high-density lipoprotein cholesterol, low-density lipoprotein cholesterol, and triglycerides. We included 45 studies in stage 1 (genome-wide discovery) and 66 studies in stage 2 (focused follow-up), for a total of 394,584 individuals from 5 ancestry groups. Analyses covered the period July 2014-November 2017. Genetic main effects and interaction effects were jointly assessed by means of a 2-degrees-of-freedom (df) test, and a 1-df test was used to assess the interaction effects alone. Variants at 495 loci were at least suggestively associated (P < 1 × 10⁻⁶) with lipid levels in stage 1 and were evaluated in stage 2, followed by combined analyses of stage 1 and stage 2. In the combined analysis of stages 1 and 2, a total of 147 independent loci were associated with lipid levels at P < 5 × 10⁻⁸ using 2-df tests, of which 18 were novel. No genome-wide-significant associations were found testing the interaction effect alone. The novel loci included several genes (proprotein convertase subtilisin/kexin type 5 (PCSK5), vascular endothelial growth factor B (VEGFB), and apolipoprotein B mRNA editing enzyme, catalytic polypeptide 1 (APOBEC1) complementation factor (A1CF)) that have a putative role in lipid metabolism on the basis of existing evidence from cellular and experimental models.
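
To illustrate the two tests named in this abstract, the sketch below computes a 2-df joint statistic for the genetic main effect and the interaction effect, and a 1-df statistic for the interaction effect alone. It assumes a Wald formulation in which the two effect estimates and their 2×2 covariance matrix are already available; the function names and arguments are hypothetical and are not taken from the study's software.

```python
import math


def joint_2df_test(beta_g, beta_gxe, var_g, var_gxe, cov):
    """Wald test of H0: beta_g = 0 and beta_gxe = 0 (2 degrees of freedom).

    Assumed inputs: the two effect estimates, their variances, and their
    covariance, e.g. from a regression with a gene-by-exposure product term.
    """
    det = var_g * var_gxe - cov ** 2
    if det <= 0:
        raise ValueError("covariance matrix must be positive definite")
    # Quadratic form b' V^{-1} b, with the 2x2 inverse written out explicitly.
    stat = (beta_g ** 2 * var_gxe
            - 2.0 * beta_g * beta_gxe * cov
            + beta_gxe ** 2 * var_g) / det
    # The chi-square survival function with 2 df is exactly exp(-x / 2).
    return stat, math.exp(-stat / 2.0)


def interaction_1df_test(beta_gxe, var_gxe):
    """Wald test of H0: beta_gxe = 0 (1 degree of freedom)."""
    if var_gxe <= 0:
        raise ValueError("variance must be positive")
    stat = beta_gxe ** 2 / var_gxe
    # The chi-square survival function with 1 df is erfc(sqrt(x / 2)).
    return stat, math.erfc(math.sqrt(stat / 2.0))
```

The joint test can reach significance through the main effect, the interaction effect, or both, which is consistent with the abstract reporting loci from the 2-df test but none from the 1-df interaction test alone.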

Source
DOI: http://dx.doi.org/10.1093/aje/kwz005
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6545280
June 2019

Adjusting for Principal Components of Molecular Phenotypes Induces Replicating False Positives.

Genetics 2019 04 28;211(4):1179-1189. Epub 2019 Jan 28.

Department of Medicine, University of California San Francisco, 94158 California

High-throughput measurements of molecular phenotypes provide an unprecedented opportunity to model cellular processes and their impact on disease. These highly structured datasets are usually strongly confounded, creating false positives and reducing power. This has motivated many approaches based on principal components analysis (PCA) to estimate and correct for confounders, which have become indispensable elements of association tests between molecular phenotypes and both genetic and nongenetic factors. Here, we show that these correction approaches induce a bias, and that it persists for large sample sizes and replicates out-of-sample. We prove this theoretically for PCA by deriving an analytic, deterministic, and intuitive bias approximation. We assess other methods with realistic simulations, which show that perturbing any of several basic parameters can cause false positive rate (FPR) inflation. Our experiments show the bias depends on covariate and confounder sparsity, effect sizes, and their correlation. Surprisingly, when the covariate and confounder have [Formula: see text], standard two-step methods all have [Formula: see text]-fold FPR inflation. Our analysis informs best practices for confounder correction in genomic studies, and suggests many false discoveries have been made and replicated in some differential expression analyses.

Source
DOI: http://dx.doi.org/10.1534/genetics.118.301768
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6456307
April 2019

Multi-ancestry study of blood lipid levels identifies four loci interacting with physical activity.

Nat Commun 2019 01 22;10(1):376. Epub 2019 Jan 22.

Laboratory of Genetics and Molecular Cardiology, Heart Institute (InCor), University of São Paulo Medical School, São Paulo, 01246903, SP, Brazil.

Many genetic loci affect circulating lipid levels, but it remains unknown whether lifestyle factors, such as physical activity, modify these genetic effects. To identify lipid loci interacting with physical activity, we performed genome-wide analyses of circulating HDL cholesterol, LDL cholesterol, and triglyceride levels in up to 120,979 individuals of European, African, Asian, Hispanic, and Brazilian ancestry, with follow-up of suggestive associations in an additional 131,012 individuals. We find four loci, in/near CLASP1, LHX1, SNTA1, and CNTNAP2, that are associated with circulating lipid levels through interaction with physical activity; higher levels of physical activity enhance the HDL cholesterol-increasing effects of the CLASP1, LHX1, and SNTA1 loci and attenuate the LDL cholesterol-increasing effect of the CNTNAP2 locus. The CLASP1, LHX1, and SNTA1 regions harbor genes linked to muscle function and lipid metabolism. Our results elucidate the role of physical activity interactions in the genetic contribution to blood lipid levels.

Source
DOI: http://dx.doi.org/10.1038/s41467-018-08008-w
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6342931
January 2019

Joint Analysis of Multiple Interaction Parameters in Genetic Association Studies.

Genetics 2019 02 21;211(2):483-494. Epub 2018 Dec 21.

Program in Genetic Epidemiology and Statistical Genetics, Harvard T.H. Chan School of Public Health, Boston, Massachusetts 02115

With growing human genetic and epidemiologic data, there has been increased interest in the study of gene-by-environment (G-E) interaction effects. Still, major questions remain on how to test jointly a large number of interactions between multiple SNPs and multiple exposures. In this study, we first compared the relative performance of four fixed-effect joint analysis approaches using simulated data, considering up to 10 exposures and 300 SNPs: (1) omnibus test, (2) multi-exposure and genetic risk score (GRS) test, (3) multi-SNP and environmental risk score (ERS) test, and (4) GRS-ERS test. Our simulations explored both linear and logistic regression while considering three statistics: the Wald test, the Score test, and the likelihood ratio test (LRT). We further applied the approaches to three large sets of human cohort data (N = 37,664), focusing on type 2 diabetes (T2D), obesity, hypertension, and coronary heart disease with smoking, physical activity, diets, and total energy intake. Overall, GRS-based approaches were the most robust, and had the highest power, especially when the G-E interaction effects were correlated with the marginal genetic and environmental effects. We also observed severe miscalibration of joint statistics in logistic models when the number of events per variable was too low when using either the Wald test or the LRT. Finally, our real data application detected nominally significant interaction effects for three outcomes (T2D, obesity, and hypertension), mainly from the GRS-ERS approach. In conclusion, this study provides guidelines for testing multiple interaction parameters in modern human cohorts including extensive genetic and environmental data.

Source
DOI: http://dx.doi.org/10.1534/genetics.118.301394
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6366922
February 2019

Exploring the genetic basis of human population differences in DNA methylation and their causal impact on immune gene regulation.

Genome Biol 2018 12 18;19(1):222. Epub 2018 Dec 18.

Unit of Human Evolutionary Genetics, Institut Pasteur, 75015, Paris, France.

Background: DNA methylation is influenced by both environmental and genetic factors and is increasingly thought to affect variation in complex traits and diseases. Yet, the extent of ancestry-related differences in DNA methylation, their genetic determinants, and their respective causal impact on immune gene regulation remain elusive.

Results: We report extensive population differences in DNA methylation between 156 individuals of African and European descent, detected in primary monocytes that are used as a model of a major innate immunity cell type. Most of these differences (~70%) are driven by DNA sequence variants near CpG sites, which account for ~60% of the variance in DNA methylation. We also identify several master regulators of DNA methylation variation in trans, including a regulatory hub near the transcription factor-encoding CTCF gene, which contributes markedly to ancestry-related differences in DNA methylation. Furthermore, we establish that variation in DNA methylation is associated with varying gene expression levels following mostly, but not exclusively, a canonical model of negative associations, particularly in enhancer regions. Specifically, we find that DNA methylation highly correlates with transcriptional activity of 811 and 230 genes, at the basal state and upon immune stimulation, respectively. Finally, using a Bayesian approach, we estimate causal mediation effects of DNA methylation on gene expression in ~20% of the studied cases, indicating that DNA methylation can play an active role in immune gene regulation.

Conclusion: Using a system-level approach, our study reveals substantial ancestry-related differences in DNA methylation and provides evidence for their causal impact on immune gene regulation.

Source
DOI: http://dx.doi.org/10.1186/s13059-018-1601-3
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6299574
December 2018

Novel genetic associations for blood pressure identified via gene-alcohol interaction in up to 570K individuals across multiple ancestries.

PLoS One 2018 18;13(6):e0198166. Epub 2018 Jun 18.

Icelandic Heart Association, Kopavogur, Iceland.

Heavy alcohol consumption is an established risk factor for hypertension; the mechanism by which alcohol consumption impacts blood pressure (BP) regulation remains unknown. We hypothesized that a genome-wide association study accounting for gene-alcohol consumption interaction for BP might identify additional BP loci and contribute to the understanding of alcohol-related BP regulation. We conducted a large two-stage investigation incorporating joint testing of main genetic effects and single nucleotide variant (SNV)-alcohol consumption interactions. In Stage 1, genome-wide discovery meta-analyses in ≈131K individuals across several ancestry groups yielded 3,514 SNVs (245 loci) with suggestive evidence of association (P < 1.0 × 10⁻⁵). In Stage 2, these SNVs were tested for independent external replication in ≈440K individuals across multiple ancestries. We identified and replicated (at Bonferroni correction threshold) five novel BP loci (380 SNVs in 21 genes) and 49 previously reported BP loci (2,159 SNVs in 109 genes) in European ancestry, and in multi-ancestry meta-analyses (P < 5.0 × 10⁻⁸). For African ancestry samples, we detected 18 potentially novel BP loci (P < 5.0 × 10⁻⁸) in Stage 1 that warrant further replication. Additionally, correlated meta-analysis identified eight novel BP loci (11 genes). Several genes in these loci (e.g., PINX1, GATA4, BLK, FTO and GABBR2) have been previously reported to be associated with alcohol consumption. These findings provide insights into the role of alcohol consumption in the genetic architecture of hypertension.

Source
PLOS: http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0198166
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6005576
January 2019

VarExp: estimating variance explained by genome-wide GxE summary statistics.

Bioinformatics 2018 10;34(19):3412-3414

Groupe de Génétique Statistique, Département de Génomes and Génétique, C3BI, Institut Pasteur, Paris, France.

Summary: Many genome-wide association studies and genome-wide screens for gene-environment (GxE) interactions have been performed to elucidate the underlying mechanisms of human traits and diseases. When the analyzed outcome is quantitative, the overall contribution of identified genetic variants to the outcome is often expressed as the percentage of phenotypic variance explained. This is commonly done using individual-level genotype data but it is challenging when results are derived through meta-analyses. Here, we present an R package, 'VarExp', that allows for the estimation of the percentage of phenotypic variance explained using summary statistics only. It allows for a range of models to be evaluated, including marginal genetic effects, GxE interaction effects and both effects jointly. Its implementation integrates all recent methodological developments and does not need external data to be uploaded by users.

Availability And Implementation: The R package is available at https://gitlab.pasteur.fr/statistical-genetics/VarExp.git.

Supplementary Information: Supplementary data are available at Bioinformatics online.
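
As a rough illustration of what "variance explained from summary statistics" means, the sketch below covers the simplest possible case: marginal additive effects at mutually independent variants under Hardy-Weinberg equilibrium, where each variant contributes 2f(1 − f)β² to the outcome variance. This is an assumption-laden simplification and not the VarExp method or its API, which the abstract describes as also handling GxE interaction effects and joint models; the function name and arguments are hypothetical.

```python
def variance_explained(betas, freqs, var_y):
    """Fraction of outcome variance explained by independent additive SNPs.

    betas : per-allele effect estimates on the outcome scale
    freqs : coded-allele frequencies, each strictly between 0 and 1
    var_y : variance of the quantitative outcome

    Simplifying assumptions (not from the source): variants are uncorrelated,
    genotypes follow Hardy-Weinberg equilibrium, and there are no
    interaction terms.
    """
    if len(betas) != len(freqs):
        raise ValueError("betas and freqs must have the same length")
    if var_y <= 0:
        raise ValueError("var_y must be positive")
    total = 0.0
    for beta, f in zip(betas, freqs):
        if not 0.0 < f < 1.0:
            raise ValueError("allele frequencies must lie in (0, 1)")
        # Genotype variance under HWE is 2 f (1 - f); scale by beta squared.
        total += 2.0 * f * (1.0 - f) * beta ** 2
    return total / var_y
```

When variants are in linkage disequilibrium, summing per-variant terms in this way double-counts shared signal, so the correlation between variants has to be modeled as well.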

Source
DOI: http://dx.doi.org/10.1093/bioinformatics/bty379
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6157079
October 2018

Efficient analysis of large-scale genome-wide data with two R packages: bigstatsr and bigsnpr.

Bioinformatics 2018 08;34(16):2781-2787

Laboratoire TIMC-IMAG, UMR 5525, CNRS, Université Grenoble Alpes, Grenoble, France.

Motivation: Genome-wide datasets produced for association studies have dramatically increased in size over the past few years, with modern datasets commonly including millions of variants measured in tens of thousands of individuals. This increase in data size is a major challenge severely slowing down genomic analyses, leading to some software becoming obsolete and researchers having limited access to diverse analysis tools.

Results: Here we present two R packages, bigstatsr and bigsnpr, allowing for the analysis of large scale genomic data to be performed within R. To address large data size, the packages use memory-mapping for accessing data matrices stored on disk instead of in RAM. To perform data pre-processing and data analysis, the packages integrate most of the tools that are commonly used, either through transparent system calls to existing software, or through updated or improved implementation of existing methods. In particular, the packages implement fast and accurate computations of principal component analysis and association studies, functions to remove single nucleotide polymorphisms in linkage disequilibrium and algorithms to learn polygenic risk scores on millions of single nucleotide polymorphisms. We illustrate applications of the two R packages by analyzing a case-control genomic dataset for celiac disease, performing an association study and computing polygenic risk scores. Finally, we demonstrate the scalability of the R packages by analyzing a simulated genome-wide dataset including 500 000 individuals and 1 million markers on a single desktop computer.

Availability And Implementation: https://privefl.github.io/bigstatsr/ and https://privefl.github.io/bigsnpr/.

Supplementary Information: Supplementary data are available at Bioinformatics online.
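
The memory-mapping idea mentioned in the Results can be sketched with the Python standard library: a matrix is written to disk column by column, and a single column is later read through a memory map without loading the whole file into RAM. This is only an illustration of the general technique; it does not reproduce the file format or the API of bigstatsr or bigsnpr, and the function names and on-disk layout are assumptions made for this example.

```python
import mmap
import os
import struct
import tempfile


def write_matrix(path, columns):
    """Store a matrix column by column as 8-byte little-endian floats."""
    with open(path, "wb") as fh:
        for col in columns:
            fh.write(struct.pack(f"<{len(col)}d", *col))


def read_column(path, n_rows, j):
    """Read column j through a read-only memory map of the file."""
    with open(path, "rb") as fh:
        with mmap.mmap(fh.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            # Byte offset of column j in the assumed column-major layout.
            start = j * n_rows * 8
            return list(struct.unpack_from(f"<{n_rows}d", mm, start))


def demo():
    """Write a 3 x 2 matrix to a temporary file and read back its second column."""
    cols = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        write_matrix(path, cols)
        return read_column(path, 3, 1)
    finally:
        os.remove(path)
```

A column-major layout is chosen here because per-variant operations then touch one contiguous block of the file.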

Source
DOI: http://dx.doi.org/10.1093/bioinformatics/bty185
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6084588
August 2018

lme4qtl: linear mixed models with flexible covariance structure for genetic studies of related individuals.

BMC Bioinformatics 2018 02 27;19(1):68. Epub 2018 Feb 27.

Unitat de Genòmica de Malalties Complexes, Institut d'Investigació Biomèdica Sant Pau (IIB-Sant Pau), Barcelona, Spain.

Background: Quantitative trait locus (QTL) mapping in genetic data often involves analysis of correlated observations, which need to be accounted for to avoid false association signals. This is commonly performed by modeling such correlations as random effects in linear mixed models (LMMs). The R package lme4 is a well-established tool that implements major LMM features using sparse matrix methods; however, it is not fully adapted for QTL mapping association and linkage studies. In particular, two LMM features are lacking in the base version of lme4: the definition of random effects by custom covariance matrices; and parameter constraints, which are essential in advanced QTL models. Apart from applications in linkage studies of related individuals, such functionalities are of high interest for association studies in situations where multiple covariance matrices need to be modeled, a scenario not covered by many genome-wide association study (GWAS) software.

Results: To address the aforementioned limitations, we developed a new R package lme4qtl as an extension of lme4. First, lme4qtl contributes new models for genetic studies within a single tool integrated with lme4 and its companion packages. Second, lme4qtl offers a flexible framework for scenarios with multiple levels of relatedness and becomes efficient when covariance matrices are sparse. We showed the value of our package using real family-based data in the Genetic Analysis of Idiopathic Thrombophilia 2 (GAIT2) project.

Conclusions: Our software lme4qtl enables QTL mapping models with a versatile structure of random effects and efficient computation for sparse covariances. lme4qtl is available at https://github.com/variani/lme4qtl.

Source
DOI: http://dx.doi.org/10.1186/s12859-018-2057-x
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5830078
February 2018

A Large-Scale Multi-ancestry Genome-wide Study Accounting for Smoking Behavior Identifies Multiple Significant Loci for Blood Pressure.

Am J Hum Genet 2018 03 15;102(3):375-400. Epub 2018 Feb 15.

Health Disparities Research Section, Laboratory of Epidemiology and Population Sciences, National Institute on Aging, NIH, Baltimore, MD 21224, USA.

Genome-wide association analysis advanced understanding of blood pressure (BP), a major risk factor for vascular conditions such as coronary heart disease and stroke. Accounting for smoking behavior may help identify BP loci and extend our knowledge of its genetic architecture. We performed genome-wide association meta-analyses of systolic and diastolic BP incorporating gene-smoking interactions in 610,091 individuals. Stage 1 analysis examined ∼18.8 million SNPs and small insertion/deletion variants in 129,913 individuals from four ancestries (European, African, Asian, and Hispanic) with follow-up analysis of promising variants in 480,178 additional individuals from five ancestries. We identified 15 loci that were genome-wide significant (p < 5 × 10⁻⁸) in stage 1 and formally replicated in stage 2. A combined stage 1 and 2 meta-analysis identified 66 additional genome-wide significant loci (13, 35, and 18 loci in European, African, and trans-ancestry, respectively). A total of 56 known BP loci were also identified by our results (p < 5 × 10⁻⁸). Of the newly identified loci, ten showed significant interaction with smoking status, but none of them were replicated in stage 2. Several loci were identified in African ancestry, highlighting the importance of genetic studies in diverse populations. The identified loci show strong evidence for regulatory features and support shared pathophysiology with cardiometabolic and addiction traits. They also highlight a role in BP regulation for biological candidates such as modulators of vascular structure and function (CDKN1B, BCAR1-CFDP1, PXDN, EEA1), ciliopathies (SDCCAG8, RPGRIP1L), telomere maintenance (TNKS, PINX1, AKTIP), and central dopaminergic signaling (MSRA, EBF2).

Source
DOI: http://dx.doi.org/10.1016/j.ajhg.2018.01.015
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5985266
March 2018
-->