Publications by authors named "Dandi Qiao"

40 Publications

Multiethnic genome-wide and HLA association study of total serum IgE level.

J Allergy Clin Immunol 2021 Sep 15. Epub 2021 Sep 15.

Department of Human Genetics, University of Chicago, Chicago, Ill.

Background: Total serum IgE (tIgE) is an important intermediate phenotype of allergic disease. Whole genome genetic association studies across ancestries may identify important determinants of IgE.

Objective: We aimed to increase understanding of genetic variants affecting tIgE production across the ancestry and allergic disease spectrum by leveraging data from the National Heart, Lung and Blood Institute Trans-Omics for Precision Medicine program; the Consortium on Asthma among African-ancestry Populations in the Americas (CAAPA); and the Atopic Dermatitis Research Network (N = 21,901).

Methods: We performed genome-wide association within strata of study, disease, and ancestry groups, and we combined results via a meta-regression approach that models heterogeneity attributable to ancestry. We also tested for association between HLA alleles called from whole genome sequence data and tIgE, assessing replication of associations in HLA alleles called from genotype array data.

Results: We identified 6 loci at genome-wide significance (P < 5 × 10⁻⁸), including 4 loci previously reported as genome-wide significant for tIgE, as well as new regions in chr11q13.5 and chr15q22.2, which were also identified in prior genome-wide association studies of atopic dermatitis and asthma. In the HLA allele association study, HLA-A*02:01 was associated with decreased tIgE level (P = 2 × 10; P = 5 × 10; P = 4 × 10), and HLA-DQB1*03:02 was strongly associated with decreased tIgE level in Hispanic/Latino ancestry populations (P = 8 × 10).

Conclusion: We performed the largest genome-wide association study and HLA association study of tIgE focused on ancestrally diverse populations and found several known tIgE and allergic disease loci that are relevant in non-European ancestry populations.

Source
http://dx.doi.org/10.1016/j.jaci.2021.09.011
September 2021

Heterozygosity of the Alpha 1-Antitrypsin Pi*Z Allele and Risk of Liver Disease.

Hepatol Commun 2021 Aug 3;5(8):1348-1361. Epub 2021 Apr 3.

Department of Medicine, Beth Israel Deaconess Medical Center, Boston, MA, USA.

The serpin family A member 1 (SERPINA1) Z allele is present in approximately one in 25 individuals of European ancestry. Z allele homozygosity (Pi*ZZ) is the most common cause of alpha 1-antitrypsin deficiency and is a proven risk factor for cirrhosis. We examined whether heterozygous Z allele (Pi*Z) carriers in United Kingdom (UK) Biobank, a population-based cohort, are at increased risk of liver disease. We replicated findings in Massachusetts General Brigham Biobank, a hospital-based cohort. We also examined variants associated with liver disease and assessed for gene-gene and gene-environment interactions. In UK Biobank, we identified 1,493 cases of cirrhosis, 12,603 Z allele heterozygotes, and 129 Z allele homozygotes among 312,671 unrelated white British participants. Heterozygous carriage of the Z allele was associated with cirrhosis compared to noncarriage (odds ratio [OR], 1.53; P = 1.1 × 10); homozygosity of the Z allele also increased the risk of cirrhosis (OR, 11.8; P = 1.8 × 10). The OR for cirrhosis of the Z allele was comparable to that of well-established genetic variants, including patatin-like phospholipase domain containing 3 (PNPLA3) I148M (OR, 1.48; P = 1.1 × 10) and transmembrane 6 superfamily member 2 (TM6SF2) E167K (OR, 1.34; P = 2.6 × 10). In heterozygotes compared to noncarriers, the Z allele was associated with higher alanine aminotransferase (ALT; P = 4.6 × 10), aspartate aminotransferase (AST; P = 2.2 × 10), alkaline phosphatase (P = 3.3 × 10), gamma-glutamyltransferase (P = 1.2 × 10), and total bilirubin (P = 6.4 × 10); Z allele homozygotes had even greater elevations in liver biochemistries. Body mass index (BMI) amplified the association of the Z allele for ALT (P for interaction = 0.021) and AST (P for interaction = 0.0040), suggesting a gene-environment interaction. Finally, we demonstrated genetic interactions between variants in PNPLA3, TM6SF2, and hydroxysteroid 17-beta dehydrogenase 13 (HSD17B13); there was no evidence of epistasis between the Z allele and these variants.
Z allele heterozygosity is an important risk factor for liver disease; this risk is amplified by increasing BMI.

Source
http://dx.doi.org/10.1002/hep4.1718
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8369947
August 2021

Tempo-spatial regulation of the Wnt pathway by FAM13A modulates the stemness of alveolar epithelial progenitors.

EBioMedicine 2021 Jul 3;69:103463. Epub 2021 Jul 3.

Channing Division of Network Medicine, Department of Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, 02115, USA.

Background: The Family with Sequence Similarity 13, Member A (FAM13A) gene has been consistently associated with COPD by genome-wide association studies (GWAS). Our previous study demonstrated that FAM13A was mainly expressed in the lung epithelial progenitors including Club cells and alveolar type II epithelial (ATII) cells. Fam13a−/− mice were resistant to cigarette smoke (CS)-induced emphysema through promoting β-catenin/Wnt activation. Given the important roles of β-catenin/Wnt activation in alveolar regeneration during injury, it remains unclear when and where FAM13A regulates the Wnt pathway, the requisite pathway for alveolar epithelial repair, in vivo during CS exposure in lung epithelial progenitors.

Methods: Fam13a+/+ or Fam13a−/− mice were crossed with the TCF/Lef:H2B-GFP Wnt-signaling reporter mouse line, in which β-catenin/Wnt-activated cells are labeled with GFP, followed by acute (1 month) or chronic (7 months) CS exposure. Fluorescence-activated flow cytometry analysis, immunofluorescence and an organoid culture system were used to identify the β-catenin/Wnt-activated cells in Fam13a+/+ or Fam13a−/− mice exposed to CS. A Fam13a;SftpcCreERT2;Rosa26RmTmG mouse line, where GFP labels ATII cells, was generated for alveolar organoid culture followed by analyses of organoid number, immunofluorescence and gene expression. Single cell RNA-seq data from COPD ever smokers and nonsmoker control lungs were further analyzed.

Findings: We found that FAM13A deficiency significantly increased Wnt activation, mainly in lung epithelial cells. Consistently, after long-term CS exposure in vivo, FAM13A deficiency endowed alveolar epithelial progenitor cells with enhanced proliferation and differentiation in the ex vivo organoid model. Importantly, expression of FAM13A was significantly increased in human COPD-derived ATII cells compared to healthy ATII cells, as suggested by single cell RNA-sequencing data.

Interpretation: Our findings suggest that FAM13A deficiency promotes Wnt pathway-mediated ATII cell repair/regeneration, thereby possibly mitigating CS-induced alveolar destruction. Funding: This project is funded by the National Institutes of Health of the United States of America (NIH) grants R01HL127200, R01HL137927, R01HL148667 and R01HL147148 (XZ).

Source
http://dx.doi.org/10.1016/j.ebiom.2021.103463
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8264115
July 2021

Connecting COPD GWAS genes: FAM13A controls TGFβ2 secretion by modulating AP-3 transport.

Am J Respir Cell Mol Biol 2021 Jun 24. Epub 2021 Jun 24.

Brigham and Women's Hospital, Channing Division of Network Medicine, Boston, Massachusetts, United States.

Chronic Obstructive Pulmonary Disease (COPD) is a common, complex disease and a major cause of morbidity and mortality. Although multiple genetic determinants of COPD have been implicated by genome-wide association studies (GWAS), the pathophysiologic significance of these associations remains largely unknown. From a COPD protein-protein interaction network module, we selected a network path between two COPD GWAS genes for validation studies: FAM13A-AP3D1-CTGF-TGFB2. We find that TGFβ2, FAM13A, and AP3D1 (but not CTGF) form a cellular protein complex. Functional characterization suggests that this complex mediates the secretion of TGFβ2 through an AP-3-dependent pathway, with FAM13A acting as a negative regulator by targeting a late stage of this transport that involves the dissociation of coat-cargo interaction. Moreover, we find that TGFβ2 is a transmembrane protein that engages the AP-3 complex for delivery to the late endosomal compartments for subsequent secretion through exosomes. These results identify a pathophysiologic context that unifies the biological network role of two COPD GWAS proteins and reveal novel mechanisms of cargo transport through an intracellular pathway.

Source
http://dx.doi.org/10.1165/rcmb.2021-0016OC
June 2021

Genome-wide association analysis of COVID-19 mortality risk in SARS-CoV-2 genomes identifies mutation in the SARS-CoV-2 spike protein that colocalizes with P.1 of the Brazilian strain.

Genet Epidemiol 2021 Oct 22;45(7):685-693. Epub 2021 Jun 22.

Department of Biostatistics, T.H. Chan School of Public Health, Harvard University, Boston, Massachusetts, USA.

SARS-CoV-2 mortality has been extensively studied in relation to host susceptibility. How sequence variations in the SARS-CoV-2 genome affect pathogenicity is poorly understood. Starting in October 2020, using the methodology of genome-wide association studies (GWAS), we looked at the association between whole-genome sequencing (WGS) data of the virus and COVID-19 mortality as a potential method of early identification of highly pathogenic strains to target for containment. Although we continuously updated our analysis, in December 2020 we analyzed 7548 single-stranded SARS-CoV-2 genomes of COVID-19 patients in the GISAID database and associated variants with mortality using a logistic regression. In total, evaluating 29,891 sequenced loci of the viral genome for association with patient/host mortality, two loci, at 12,053 and 25,088 bp, achieved genome-wide significance (p values of 4.09e-09 and 4.41e-23, respectively), though only 25,088 bp remained significant in follow-up analyses. Our association findings were exclusively driven by the samples that were submitted from Brazil (p value of 4.90e-13 for 25,088 bp). The mutation frequency of 25,088 bp in the Brazilian samples on GISAID rapidly increased from about 0.4 in October/December 2020 to 0.77 in March 2021. Although GWAS methodology is suitable for samples in which mutation frequencies vary between geographical regions, it cannot account for mutation frequencies that change rapidly over time, rendering a GWAS follow-up analysis of the GISAID samples submitted after December 2020 invalid. The locus at 25,088 bp is present in the P.1 strain and later (April 2021) became one of the distinguishing loci (precisely, substitution V1176F) of the Brazilian strain as defined by the Centers for Disease Control. Specifically, the mutations at 25,088 bp occur in the S2 subunit of the SARS-CoV-2 spike protein, which plays a key role in viral entry into target host cells.
Since the mutations alter amino acid coding sequences, they potentially impose structural changes that could enhance viral infectivity and symptom severity. Our analysis suggests that GWAS methodology can provide suitable analysis tools for the real-time detection of new, more transmissible and pathogenic viral strains in databases such as GISAID, though new approaches are needed to accommodate rapidly changing mutation frequencies over time, in the presence of simultaneously changing case/control ratios. Improvements of the associated metadata/patient information in terms of quality and availability will also be important to fully utilize the potential of GWAS methodology in this field.
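As a rough illustration of the significance screen described in this abstract: the abstract does not say how genome-wide significance was defined, so the sketch below assumes a Bonferroni correction at α = 0.05 across the 29,891 tested loci. The threshold rule and all variable names are assumptions; only the locus positions and p values are taken from the abstract.

```python
# Assumed rule: Bonferroni correction over all tested viral loci.
N_LOCI = 29_891   # number of sequenced loci evaluated (from the abstract)
ALPHA = 0.05      # assumed family-wise error rate, not stated in the abstract

threshold = ALPHA / N_LOCI  # per-locus significance threshold

# p values reported in the abstract, keyed by genome position in bp
reported = {12_053: 4.09e-09, 25_088: 4.41e-23}

# Flag each locus as significant under the assumed threshold
significant = {pos: p < threshold for pos, p in reported.items()}

for pos, flag in significant.items():
    print(f"{pos} bp: p = {reported[pos]:.2e}, significant = {flag}")
```

Under this assumed threshold both reported loci pass, which is consistent with the abstract's statement that both reached genome-wide significance in the initial analysis.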

Source
http://dx.doi.org/10.1002/gepi.22421
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8426743
October 2021

Robust, flexible, and scalable tests for Hardy-Weinberg equilibrium across diverse ancestries.

Genetics 2021 May;218(1)

Department of Biochemistry, Wake Forest School of Medicine, Winston-Salem, NC 27157, USA.

Traditional Hardy-Weinberg equilibrium (HWE) tests (the χ2 test and the exact test) have long been used as a metric for evaluating genotype quality, as technical artifacts leading to incorrect genotype calls often can be identified as deviations from HWE. However, in data sets composed of individuals from diverse ancestries, HWE can be violated even without genotyping error, complicating the use of HWE testing to assess genotype data quality. In this manuscript, we present the Robust Unified Test for HWE (RUTH) to test for HWE while accounting for population structure and genotype uncertainty, and to evaluate the impact of population heterogeneity and genotype uncertainty on the standard HWE tests and alternative methods using simulated and real sequence data sets. Our results demonstrate that ignoring population structure or genotype uncertainty in HWE tests can inflate false-positive rates by many orders of magnitude. Our evaluations demonstrate different tradeoffs between false positives and statistical power across the methods, with RUTH consistently among the best across all evaluations. RUTH is implemented as a practical and scalable software tool to rapidly perform HWE tests across millions of markers and hundreds of thousands of individuals while supporting standard VCF/BCF formats. RUTH is publicly available at https://www.github.com/statgen/ruth.
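For orientation, the traditional χ² HWE test that this abstract takes as its baseline can be sketched as follows. This is the standard textbook test for a biallelic marker, not RUTH itself, and it ignores the population structure and genotype uncertainty that RUTH is designed to handle. The function name and the genotype counts in the usage lines are illustrative assumptions.

```python
import math

def hwe_chi_square(n_aa: int, n_ab: int, n_bb: int) -> tuple[float, float]:
    """Return (chi2, p_value) for the 1-df chi-square test of HWE
    given observed counts of the three genotypes at a biallelic site."""
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("at least one genotype is required")
    p = (2 * n_aa + n_ab) / (2 * n)   # frequency of allele A
    q = 1.0 - p                       # frequency of allele B
    observed = (n_aa, n_ab, n_bb)
    expected = (n * p * p, 2 * n * p * q, n * q * q)  # HWE expectations
    # Skip classes with zero expectation (monomorphic sites)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of chi-square with 1 df: P(X > x) = erfc(sqrt(x / 2))
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# Hypothetical counts: one site in perfect HWE, one with no heterozygotes
print(hwe_chi_square(25, 50, 25))
print(hwe_chi_square(50, 0, 50))
```

A sample that pools two populations with different allele frequencies tends to show a heterozygote deficit of the second kind even when genotypes are called correctly, which is the failure mode the abstract describes.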

Source
http://dx.doi.org/10.1093/genetics/iyab044
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8128395
May 2021

Whole genome sequence analysis of pulmonary function and COPD in 19,996 multi-ethnic participants.

Nat Commun 2020 Oct 14;11(1):5182. Epub 2020 Oct 14.

The Institute for Translational Genomics and Population Sciences, The Department of Pediatrics, The Lundquist Institute for Biomedical Innovation at Harbor-UCLA Medical Center, Torrance, CA, 90502, USA.

Chronic obstructive pulmonary disease (COPD), diagnosed by reduced lung function, is a leading cause of morbidity and mortality. We performed whole genome sequence (WGS) analysis of lung function and COPD in a multi-ethnic sample of 11,497 participants from population- and family-based studies, and 8499 individuals from COPD-enriched studies in the NHLBI Trans-Omics for Precision Medicine (TOPMed) Program. We identify at genome-wide significance 10 known GWAS loci and 22 distinct, previously unreported loci, including two common variant signals from stratified analysis of African Americans. Four novel common variants within the regions of PIAS1, RGN (two variants) and FTO show evidence of replication in the UK Biobank (European ancestry n ~ 320,000), while colocalization analyses leveraging multi-omic data from GTEx and TOPMed identify potential molecular mechanisms underlying four of the 22 novel loci. Our study demonstrates the value of performing WGS analyses and multi-omic follow-up in cohorts of diverse ancestry.

Source
http://dx.doi.org/10.1038/s41467-020-18334-7
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7598941
October 2020

A linear prognostic score based on the ratio of interleukin-6 to interleukin-10 predicts outcomes in COVID-19.

EBioMedicine 2020 Nov 8;61:103026. Epub 2020 Oct 8.

Department of Medicine, Royal College of Surgeons in Ireland, Dublin, Ireland; Beaumont Hospital, Dublin, Ireland.

Background: Prognostic tools are required to guide clinical decision-making in COVID-19.

Methods: We studied the relationship between the ratio of interleukin (IL)-6 to IL-10 and clinical outcome in 80 patients hospitalized for COVID-19, and created a simple 5-point linear score predictor of clinical outcome, the Dublin-Boston score. Clinical outcome was analysed as a three-level ordinal variable ("Improved", "Unchanged", or "Declined"). For both IL-6:IL-10 ratio and IL-6 alone, we associated clinical outcome with a) baseline biomarker levels, b) change in biomarker level from day 0 to day 2, c) change in biomarker from day 0 to day 4, and d) slope of biomarker change throughout the study. The associations between ordinal clinical outcome and each of the different predictors were performed with proportional odds logistic regression. Associations were run both "unadjusted" and adjusted for age and sex. Nested cross-validation was used to identify the model for incorporation into the Dublin-Boston score.

Findings: The 4-day change in IL-6:IL-10 ratio was chosen to derive the Dublin-Boston score. Each 1-point increase in the score was associated with 5.6-fold higher odds of a more severe outcome (OR 5.62, 95% CI 3.22-9.81, P = 1.2 × 10). Both the Dublin-Boston score and the 4-day change in IL-6:IL-10 significantly outperformed IL-6 alone in predicting clinical outcome at day 7.

Interpretation: The Dublin-Boston score is easily calculated and can be applied to a spectrum of hospitalized COVID-19 patients. More informed prognosis could help determine when to escalate care, institute or remove mechanical ventilation, or drive considerations for therapies.

Funding: Funding was received from the Elaine Galwey Research Fellowship, American Thoracic Society, National Institutes of Health and the Parker B Francis Research Opportunity Award.

Source
http://dx.doi.org/10.1016/j.ebiom.2020.103026
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7543971
November 2020

Statistical considerations for the analysis of massively parallel reporter assays data.

Genet Epidemiol 2020 Oct 18;44(7):785-794. Epub 2020 Jul 18.

Department of Biostatistics, Harvard School of Public Health, Boston, Massachusetts.

Noncoding DNA contains gene regulatory elements that alter gene expression, and the function of these elements can be modified by genetic variation. Massively parallel reporter assays (MPRA) enable high-throughput identification and characterization of functional genetic variants, but the statistical methods to identify allelic effects in MPRA data have not been fully developed. In this study, we demonstrate how the baseline allelic imbalance in MPRA libraries can produce biased results, and we propose a novel, nonparametric, adaptive testing method that is robust to this bias. We compare the performance of this method with other commonly used methods, and we demonstrate that our novel adaptive method controls Type I error in a wide range of scenarios while maintaining excellent power. We have implemented these tests along with routines for simulating MPRA data in the Analysis Toolset for MPRA (@MPRA), an R package for the design and analyses of MPRA experiments. It is publicly available at http://github.com/redaq/atMPRA.

Source
http://dx.doi.org/10.1002/gepi.22337
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7722129
October 2020

Machine Learning and Prediction of All-Cause Mortality in COPD.

Chest 2020 Sep 27;158(3):952-964. Epub 2020 Apr 27.

Channing Division of Network Medicine, Brigham and Women's Hospital, Boston, MA; Division of Pulmonary and Critical Care Medicine, Brigham and Women's Hospital, Boston, MA.

Background: COPD is a leading cause of mortality.

Research Question: We hypothesized that applying machine learning to clinical and quantitative CT imaging features would improve mortality prediction in COPD.

Study Design And Methods: We selected 30 clinical, spirometric, and imaging features as inputs for a random survival forest. We used top features in a Cox regression to create a machine learning mortality prediction (MLMP) in COPD model and also assessed the performance of other statistical and machine learning models. We trained the models in subjects with moderate to severe COPD from a subset of subjects in Genetic Epidemiology of COPD (COPDGene) and tested prediction performance in the remainder of individuals with moderate to severe COPD in COPDGene and Evaluation of COPD Longitudinally to Identify Predictive Surrogate Endpoints (ECLIPSE). We compared our model with the BMI, airflow obstruction, dyspnea, exercise capacity (BODE) index; BODE modifications; and the age, dyspnea, and airflow obstruction index.

Results: We included 2,632 participants from COPDGene and 1,268 participants from ECLIPSE. The top predictors of mortality were 6-min walk distance, FEV1 % predicted, and age. The top imaging predictor was pulmonary artery-to-aorta ratio. The MLMP-COPD model resulted in a C index ≥ 0.7 in both COPDGene and ECLIPSE (6.4- and 7.2-year median follow-ups, respectively), significantly better than all tested mortality indexes (P < .05). The MLMP-COPD model had fewer predictors but similar performance to that of other models. The group with the highest BODE scores (7-10) had 64% mortality, whereas the highest mortality group defined by the MLMP-COPD model had 77% mortality (P = .012).

Interpretation: An MLMP-COPD model outperformed four existing models for predicting all-cause mortality across two COPD cohorts. Performance of machine learning was similar to that of traditional statistical methods. The model is available online at: https://cdnm.shinyapps.io/cgmortalityapp/.
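The C index reported in this abstract measures how well predicted risk orders observed survival. A minimal sketch of Harrell's concordance index for right-censored data is given below. The abstract does not specify which estimator the authors used, so this is a generic textbook version; the function name and the toy data are hypothetical.

```python
def concordance_index(times, events, risks):
    """Harrell's C for right-censored survival data.

    times  : follow-up time for each subject
    events : 1 if death was observed, 0 if censored
    risks  : predicted risk score (higher means earlier expected death)
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is comparable when subject i is observed to die
            # strictly before subject j's follow-up time.
            if events[i] and times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0      # model ranks the pair correctly
                elif risks[i] == risks[j]:
                    concordant += 0.5      # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable

# Hypothetical toy cohort: four subjects, the third is censored
times = [2, 4, 6, 8]
events = [1, 1, 0, 1]
print(concordance_index(times, events, [0.9, 0.7, 0.5, 0.1]))
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering, which gives some context for the reported C index of at least 0.7.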

Source
http://dx.doi.org/10.1016/j.chest.2020.02.079
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7478228
September 2020

FAM13A Represses AMPK Activity and Regulates Hepatic Glucose and Lipid Metabolism.

iScience 2020 Mar 22;23(3):100928. Epub 2020 Feb 22.

Channing Division of Network Medicine, Department of Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, MA 02115, USA.

Obesity commonly co-exists with fatty liver disease, imposing an increasing health burden worldwide. Family with Sequence Similarity 13, Member A (FAM13A) has been associated with lipid levels and fat mass by genome-wide association studies (GWAS). However, the function of FAM13A in maintaining metabolic homeostasis in vivo remains unclear. Here, we demonstrated that rs2276936 in this locus has allelic-enhancer activity in massively parallel reporter assays (MPRA) and reporter assays. The DNA region containing rs2276936 regulates expression of endogenous FAM13A in HepG2 cells. In vivo, Fam13a−/− mice are protected from high-fat diet (HFD)-induced fatty liver, accompanied by increased insulin sensitivity and reduced glucose production in liver. Mechanistically, loss of Fam13a led to the activation of AMP-activated protein kinase (AMPK) and increased mitochondrial respiration in primary hepatocytes. These findings demonstrate that FAM13A mediates obesity-related dysregulation of lipid and glucose homeostasis. Targeting FAM13A might be a promising treatment for obesity and fatty liver disease.

Source
http://dx.doi.org/10.1016/j.isci.2020.100928
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7063182
March 2020

Whole Genome Sequencing Identifies CRISPLD2 as a Lung Function Gene in Children With Asthma.

Chest 2019 Dec 23;156(6):1068-1079. Epub 2019 Sep 23.

Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA.

Background: Asthma is a common respiratory disorder with a highly heterogeneous nature that remains poorly understood. The objective was to use whole genome sequencing (WGS) data to identify regions of common genetic variation contributing to lung function in individuals with a diagnosis of asthma.

Methods: WGS data were generated for 1,053 individuals from trios and extended pedigrees participating in the family-based Genetic Epidemiology of Asthma in Costa Rica study. Asthma affection status was defined through a physician's diagnosis of asthma, and most participants with asthma also had airway hyperresponsiveness (AHR) to methacholine. Family-based association tests for single variants were performed to assess the associations with lung function phenotypes.

Results: A genome-wide significant association was identified between baseline FEV1/FVC ratio and a single-nucleotide polymorphism in the top hit cysteine-rich secretory protein LCCL domain-containing 2 (CRISPLD2) (rs12051168; P = 3.6 × 10 in the unadjusted model) that retained suggestive significance in the covariate-adjusted model (P = 5.6 × 10). Rs12051168 was also nominally associated with other related phenotypes: baseline FEV1 (P = 3.3 × 10), postbronchodilator (PB) FEV1 (P = 7.3 × 10), and PB FEV1/FVC ratio (P = 2.7 × 10). The identified baseline FEV1/FVC ratio and rs12051168 association was meta-analyzed and replicated in three independent cohorts in which most participants with asthma also had confirmed AHR (combined weighted z-score P = .015) but not in cohorts without information about AHR.

Conclusions: These findings suggest that using specific asthma characteristics, such as AHR, can help identify more genetically homogeneous asthma subgroups with genotype-phenotype associations that may not be observed in all children with asthma. CRISPLD2 also may be important for baseline lung function in individuals with asthma who also may have AHR.
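The replication result in this abstract is reported as a combined weighted z-score. The abstract does not describe the weighting scheme, so the sketch below assumes Stouffer's weighted Z method, where weights are often taken as the square root of each cohort's sample size. The function name and the input values are hypothetical and are not the study's data.

```python
import math

def weighted_z(z_scores, weights):
    """Combine per-cohort z-scores with Stouffer's weighted method.

    Returns (combined_z, two_sided_p).
    """
    if len(z_scores) != len(weights) or not z_scores:
        raise ValueError("z_scores and weights must be non-empty and equal length")
    numerator = sum(w * z for w, z in zip(weights, z_scores))
    denominator = math.sqrt(sum(w * w for w in weights))
    combined = numerator / denominator
    # Two-sided normal p value: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2))
    p_value = math.erfc(abs(combined) / math.sqrt(2))
    return combined, p_value

# Hypothetical example: three cohorts weighted by sqrt(sample size)
sizes = [400, 900, 1600]
z, p = weighted_z([1.2, 0.8, 1.9], [math.sqrt(n) for n in sizes])
print(f"combined z = {z:.3f}, two-sided p = {p:.4f}")
```

Because signed z-scores are summed, cohorts with effects in opposite directions cancel, so a small combined p value also implies a consistent direction of effect.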

Source
http://dx.doi.org/10.1016/j.chest.2019.08.2202
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6904857
December 2019

metaFARVAT: An Efficient Tool for Meta-Analysis of Family-Based, Case-Control, and Population-Based Rare Variant Association Studies.

Front Genet 2019 Jun 19;10:572. Epub 2019 Jun 19.

Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul, South Korea.

Family-based designs have been shown to be powerful in detecting the significant rare variants associated with human diseases. However, very few significant results have been found owing to relatively small sample sizes and the fact that statistical analyses often suffer from high false-negative error rates. These limitations can be avoided by combining results from multiple studies via meta-analysis. However, statistical methods for meta-analysis with rare variants are limited for family-based samples. In this report, we propose a tool for the meta-analysis of family-based rare variant associations, metaFARVAT. metaFARVAT is based on a quasi-likelihood score for each variant. These scores are combined to generate burden test, variable-threshold test, sequence kernel association test (SKAT), and optimal SKAT statistics. The proposed method tests homogeneous and heterogeneous effects of variants among different studies and can be applied to both quantitative and dichotomous phenotypes. Simulation results demonstrated the robustness and efficiency of the proposed method in different scenarios. By applying metaFARVAT to data from a family-based study and a case-control study, we identified a few promising candidate genes, including , which is associated with chronic obstructive pulmonary disease.

Source
http://dx.doi.org/10.3389/fgene.2019.00572
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6593391
June 2019

Genetic landscape of chronic obstructive pulmonary disease identifies heterogeneous cell-type and phenotype associations.

Nat Genet 2019 Mar 25;51(3):494-505. Epub 2019 Feb 25.

Department of Internal Medicine and Environmental Health Center, School of Medicine, Kangwon National University, Chuncheon, South Korea.

Chronic obstructive pulmonary disease (COPD) is the leading cause of respiratory mortality worldwide. Genetic risk loci provide new insights into disease pathogenesis. We performed a genome-wide association study in 35,735 cases and 222,076 controls from the UK Biobank and additional studies from the International COPD Genetics Consortium. We identified 82 loci associated with P < 5 × 10⁻⁸; 47 of these were previously described in association with either COPD or population-based measures of lung function. Of the remaining 35 new loci, 13 were associated with lung function in 79,055 individuals from the SpiroMeta consortium. Using gene expression and regulation data, we identified functional enrichment of COPD risk loci in lung tissue, smooth muscle, and several lung cell types. We found 14 COPD loci shared with either asthma or pulmonary fibrosis. COPD genetic risk loci clustered into groups based on associations with quantitative imaging features and comorbidities. Our analyses provide further support for the genetic susceptibility and heterogeneity of COPD.

Source
http://dx.doi.org/10.1038/s41588-018-0342-2
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6546635
March 2019

New genetic signals for lung function highlight pathways and chronic obstructive pulmonary disease associations across multiple ancestries.

Nat Genet 2019 Mar 25;51(3):481-493. Epub 2019 Feb 25.

Medical Research Council Human Genetics Unit, Institute of Genetics and Molecular Medicine, University of Edinburgh, Edinburgh, UK.

Reduced lung function predicts mortality and is key to the diagnosis of chronic obstructive pulmonary disease (COPD). In a genome-wide association study in 400,102 individuals of European ancestry, we define 279 lung function signals, 139 of which are new. In combination, these variants strongly predict COPD in independent populations. Furthermore, the combined effect of these variants showed generalizability across smokers and never smokers, and across ancestral groups. We highlight biological pathways, known and potential drug targets for COPD and, in phenome-wide association studies, autoimmune-related and other pleiotropic effects of lung function-associated variants. This new genetic evidence has potential to improve future preventive and therapeutic strategies for COPD.

Source
http://dx.doi.org/10.1038/s41588-018-0321-7
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6397078
March 2019

Identification of Functional Variants in the FAM13A Chronic Obstructive Pulmonary Disease Genome-Wide Association Study Locus by Massively Parallel Reporter Assays.

Am J Respir Crit Care Med 2019 Jan;199(1):52-61

Channing Division of Network Medicine.

Rationale: The identification of causal variants responsible for disease associations from genome-wide association studies (GWASs) facilitates functional understanding of the biological mechanisms by which those genetic variants influence disease susceptibility.

Objective: We aim to identify causal variants in or near the FAM13A (family with sequence similarity member 13A) GWAS locus associated with chronic obstructive pulmonary disease (COPD).

Methods: We used an integrated approach featuring conditional genetic analysis, massively parallel reporter assays (MPRAs), traditional reporter assays, chromatin conformation capture assays, and clustered regularly interspaced short palindromic repeats (CRISPR)-based gene editing to characterize COPD-associated regulatory variants in the FAM13A region in human bronchial epithelial cell lines.

Measurements And Main Results: Conditional genetic association suggests the presence of two independent COPD association signals in FAM13A. MPRAs identified 45 regulatory variants within FAM13A, among which six variants were prioritized for further investigation. Three COPD-associated variants demonstrated significant allele-specific activity in reporter assays. One of three variants, rs2013701, was tested in the endogenous genomic context by CRISPR-based genome editing that confirmed its allele-specific effects on FAM13A expression and on cell proliferation, providing functional characterization for this COPD-associated variant.

Conclusions: The human GWAS association near FAM13A may contain independent association signals. MPRAs identified multiple functional variants in this region, including rs2013701, a putative COPD-causing variant with allele-specific regulatory activity.

Source
DOI: http://dx.doi.org/10.1164/rccm.201802-0337OC
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6353020
January 2019

Whole exome sequencing analysis in severe chronic obstructive pulmonary disease.

Hum Mol Genet 2018 11;27(21):3801-3812

Department of Biostatistics, Harvard School of Public Health, Boston, Massachusetts, United States of America.

Chronic obstructive pulmonary disease (COPD), one of the leading causes of death worldwide, is substantially influenced by genetic factors. Alpha-1 antitrypsin deficiency demonstrates that rare coding variants of large effect can influence COPD susceptibility. To identify additional rare coding variants in patients with severe COPD, we conducted whole exome sequencing analysis in 2543 subjects from two family-based studies (Boston Early-Onset COPD Study and International COPD Genetics Network) and one case-control study (COPDGene). Applying a gene-based segregation test in the family-based data, we identified significant segregation of rare loss of function variants in TBC1D10A and RFPL1 (P-value < 2 × 10⁻⁶), but were unable to find similar variants in the case-control study. In single-variant, gene-based and pathway association analyses, we were unable to find significant findings that replicated or were significant in meta-analysis. However, we found that the top results in the two datasets were in proximity to each other in the protein-protein interaction network (P-value = 0.014), suggesting enrichment of these results for similar biological processes. A network of these association results and their neighbors was significantly enriched in the transforming growth factor beta-receptor binding and cilia-related pathways. Finally, in a more detailed examination of candidate genes, we identified individuals with putative high-risk variants, including patients harboring homozygous mutations in genes associated with cutis laxa and Niemann-Pick Disease Type C. Our results likely reflect heterogeneity of genetic risk for COPD along with limitations of statistical power and functional annotation, and highlight the potential of network analysis to gain insight into genetic association studies.

Source
DOI: http://dx.doi.org/10.1093/hmg/ddy269
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6196654
November 2018

Rare Variants in Known Susceptibility Loci and Their Contribution to Risk of Lung Cancer.

J Thorac Oncol 2018 10 4;13(10):1483-1495. Epub 2018 Jul 4.

Maria Sklodowska-Curie Institute of Oncology Center, Warsaw, Poland.

Background: Genome-wide association studies are widely used to map genomic regions contributing to lung cancer (LC) susceptibility, but they typically do not identify the precise disease-causing genes/variants. To unveil the inherited genetic variants that cause LC, we performed focused exome-sequencing analyses on genes located in 121 genome-wide association study-identified loci previously implicated in the risk of LC, chronic obstructive pulmonary disease, pulmonary function level, and smoking behavior.

Methods: Germline DNA from 260 case patients with LC and 318 controls was sequenced using VCRome 2.1 exome capture. Filtering was based on enrichment of rare and potentially deleterious variants in cases (risk alleles) or controls (protective alleles). Allelic association analyses of single variants and gene-based burden tests of multiple variants were performed. Promising candidates were tested in two independent validation studies with a total of 1773 case patients and 1123 controls.

Results: We identified 48 rare variants with deleterious effects in the discovery analysis and validated 12 of the 43 candidates that were covered in the validation platforms. The top validated candidates included one well-established truncating variant, namely, BRCA2, DNA repair associated gene (BRCA2) K3326X (OR = 2.36, 95% confidence interval [CI]: 1.38-3.99), and three newly identified variations, namely, lymphotoxin beta gene (LTB) p.Leu87Phe (OR = 7.52, 95% CI: 1.01-16.56), prolyl 3-hydroxylase 2 gene (P3H2) p.Gln185His (OR = 5.39, 95% CI: 0.75-15.43), and dishevelled associated activator of morphogenesis 2 gene (DAAM2) p.Asp762Gly (OR = 0.25, 95% CI: 0.10-0.79). Burden tests revealed strong associations between zinc finger protein 93 gene (ZNF93), DAAM2, bromodomain containing 9 gene (BRD9), and the gene LTB and LC susceptibility.

Conclusion: Our results extend the catalogue of regions associated with LC and highlight the importance of germline rare coding variants in LC susceptibility.

Source
DOI: http://dx.doi.org/10.1016/j.jtho.2018.06.016
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6366341
October 2018

Whole-Genome Sequencing in Severe Chronic Obstructive Pulmonary Disease.

Am J Respir Cell Mol Biol 2018 11;59(5):614-622

1 Channing Division of Network Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, Massachusetts.

Genome-wide association studies have identified common variants associated with chronic obstructive pulmonary disease (COPD). Whole-genome sequencing (WGS) offers comprehensive coverage of the entire genome, as compared with genotyping arrays or exome sequencing. We hypothesized that WGS in subjects with severe COPD and smoking control subjects with normal pulmonary function would allow us to identify novel genetic determinants of COPD. We sequenced 821 patients with severe COPD and 973 control subjects from the COPDGene and Boston Early-Onset COPD studies, including both non-Hispanic white and African American individuals. We performed single-variant and grouped-variant analyses, and in addition, we assessed the overlap of variants between sequencing- and array-based imputation. Our most significantly associated variant was in a known region near HHIP (combined P = 1.6 × 10); additional variants approaching genome-wide significance included previously described regions in CHRNA5, TNS1, and SERPINA6/SERPINA1 (the latter in African American individuals). None of our associations were clearly driven by rare variants, and we found minimal evidence of replication of genes identified by previously reported smaller sequencing studies. With WGS, we identified more than 20 million new variants, not seen with imputation, including more than 10,000 of potential importance in previously identified COPD genome-wide association study regions. WGS in severe COPD identifies a large number of potentially important functional variants, with the strongest associations being in known COPD risk loci, including HHIP and SERPINA1. Larger sample sizes will be needed to identify associated variants in novel regions of the genome.

Source
DOI: http://dx.doi.org/10.1165/rcmb.2018-0088OC
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6236690
November 2018

Genome-wide assessment of gene-by-smoking interactions in COPD.

Sci Rep 2018 06 18;8(1):9319. Epub 2018 Jun 18.

Department of Public Health Sciences, Seoul National University, Seoul, Korea.

Cigarette smoke exposure is a major risk factor in chronic obstructive pulmonary disease (COPD), and its interactions with genetic variants could affect lung function. However, few gene-smoking interactions have been reported. In this report, we evaluated the effects of gene-smoking interactions on lung function using Korea Associated Resource (KARE) data with the spirometric variable forced expiratory volume in 1 s (FEV1). We found that variation in FEV1 differed by smoking status. Thus, we considered a linear mixed model for association analysis under heteroscedasticity according to smoking status. We found a previously identified locus near SOX9 on chromosome 17 to be the most significant based on a joint test of the main and interaction effects of smoking. Smoking interactions were replicated with Gene-Environment of Interaction and phenotype (GENIE), Multi-Ethnic Study of Atherosclerosis-Lung (MESA-Lung), and COPDGene studies. We found that individuals carrying minor alleles of rs17765644, rs17178251, rs11870732, and rs4793541 tended to have lower FEV1 values, and lung function decreased much faster with age for smokers. There have been very few reports replicating a common variant gene-smoking interaction, and our results revealed that statistical models for gene-smoking interaction analyses should be carefully selected.

Source
DOI: http://dx.doi.org/10.1038/s41598-018-27463-5
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6006158
June 2018

A Genome-Wide Linkage Study for Chronic Obstructive Pulmonary Disease in a Dutch Genetic Isolate Identifies Novel Rare Candidate Variants.

Front Genet 2018 19;9:133. Epub 2018 Apr 19.

Department of Epidemiology, Erasmus Medical Center, Rotterdam, Netherlands.

Chronic obstructive pulmonary disease (COPD) is a complex and heritable disease, associated with multiple genetic variants. Specific familial types of COPD may be explained by rare variants, which have not been widely studied. We aimed to discover rare genetic variants underlying COPD through a genome-wide linkage scan. Affected-only analysis was performed using the 6K Illumina Linkage IV Panel in 142 cases clustered in 27 families from a genetic isolate, the Erasmus Rucphen Family (ERF) study. Potential causal variants were identified by searching for shared rare variants in the exome-sequence data of the affected members of the families contributing most to the linkage peak. The identified rare variants were then tested for association with COPD in a large meta-analysis of several cohorts. Significant evidence for linkage was observed on chromosomes 15q14-15q25 [logarithm of the odds (LOD) score = 5.52], 11p15.4-11q14.1 (LOD = 3.71) and 5q14.3-5q33.2 (LOD = 3.49). In the chromosome 15 peak, which harbors the known COPD locus for nicotinic receptors, and in the chromosome 5 peak, we could not identify shared variants. In the chromosome 11 locus, we identified four rare (minor allele frequency (MAF) <0.02), predicted pathogenic, missense variants. These were shared among the affected family members. The identified variants localize to genes including neuroblast differentiation-associated protein (), previously associated with blood biomarkers in COPD, phospholipase C Beta 3 (), shown to increase airway hyper-responsiveness, solute carrier family 22-A11 (), involved in amino acid metabolism and ion transport, and metallothionein-like protein 5 (), involved in nicotinate and nicotinamide metabolism. Association of and variants were confirmed in the meta-analysis of 9,888 cases and 27,060 controls. In conclusion, we have identified novel rare variants in plausible genes related to COPD. Further studies utilizing large sample whole-genome sequencing should further confirm the associations at chromosome 11 and investigate the chromosome 15 and 5 linked regions.

Source
DOI: http://dx.doi.org/10.3389/fgene.2018.00133
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5916965
April 2018

WISARD: workbench for integrated superfast association studies for related datasets.

BMC Med Genomics 2018 04 20;11(Suppl 2):39. Epub 2018 Apr 20.

Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul, South Korea.

Background: Mendelian transmission produces phenotypic and genetic relatedness between family members, giving family-based analytical methods an important role in genetic epidemiological studies, from heritability estimation to genetic association analysis. With advances in genotyping technologies, whole-genome sequence data can be utilized for genetic epidemiological studies, and family-based samples may become more useful for detecting de novo mutations. However, genetic analyses employing family-based samples usually suffer from the complexity of the computational/statistical algorithms, and certain types of family designs, such as those incorporating data from extended families, have rarely been used.

Results: We present a Workbench for Integrated Superfast Association studies for Related Data (WISARD) programmed in C/C++. WISARD enables fast and comprehensive analysis of SNP-chip and next-generation sequencing data on extended families, with applications from designing genetic studies to summarizing analysis results. In addition, WISARD can automatically be run in a fully multithreaded manner, and the integration of R software for visualization makes it more accessible to non-experts.

Conclusions: Comparison with existing toolsets showed that WISARD is computationally suitable for integrated analysis of related subjects, and demonstrated that WISARD outperforms existing toolsets. WISARD has also been successfully utilized to analyze a large-scale sequencing dataset for chronic obstructive pulmonary disease (COPD), and we identified multiple genes associated with COPD, which demonstrates its practical value.

Source
DOI: http://dx.doi.org/10.1186/s12920-018-0345-y
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5918457
April 2018

Selecting cases and controls for DNA sequencing studies using family histories of disease.

Stat Med 2017 06 21;36(13):2081-2099. Epub 2017 Feb 21.

Interdisciplinary Program of Bioinformatics, Seoul National University, Seoul, Korea.

Recent improvements in sequencing technology have enabled the investigation of so-called missing heritability, and a large number of affected subjects have been sequenced in order to detect significant associations between human diseases and rare variants. However, the cost of genome sequencing is still high, and a statistically powerful strategy for selecting informative subjects would be useful. Therefore, in this report, we propose a new statistical method for selecting cases and controls for sequencing studies based on family history. We assume that disease status is determined by unobserved liability scores. Our method consists of two steps: first, the conditional means of liability are estimated with the liability threshold model given the individual's disease status and those of their relatives. Second, the informative subjects are selected with the estimated conditional means. Our simulation studies showed that statistical power is substantially affected by the subject selection strategy chosen, and power is maximized when affected (unaffected) subjects with high (low) risks are selected as cases (controls). The proposed method was successfully applied to genome-wide association studies for type 2 diabetes, and our analysis results reveal the practical value of the proposed methods. Copyright © 2017 John Wiley & Sons, Ltd.
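
The first step described above, estimating a conditional mean liability from disease status, can be illustrated with a small sketch. This is not the authors' implementation: it assumes a standard normal liability, a single relative, and a hypothetical liability correlation `corr` between the two (for first-degree relatives under a purely additive model this would be about half the heritability). The function name and parameters are illustrative only.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal liability scale (assumption)


def conditional_mean_liability(affected, relative_affected, prevalence, corr,
                               steps=4000):
    """Mean liability of a subject given own status and one relative's status.

    Liability threshold model: a person is affected when liability > threshold.
    The expectation is computed by trapezoidal integration over the subject's
    liability, weighted by the probability of the relative's observed status.
    """
    threshold = STD_NORMAL.inv_cdf(1.0 - prevalence)
    # Integration range is the part of the liability axis allowed by own status.
    lower, upper = (threshold, 8.0) if affected else (-8.0, threshold)
    resid_sd = sqrt(1.0 - corr ** 2)  # sd of relative's liability given subject's
    width = (upper - lower) / steps
    numerator = 0.0
    denominator = 0.0
    for i in range(steps + 1):
        x = lower + i * width
        # P(relative affected | subject liability = x)
        p_rel = 1.0 - STD_NORMAL.cdf((threshold - corr * x) / resid_sd)
        if not relative_affected:
            p_rel = 1.0 - p_rel
        weight = STD_NORMAL.pdf(x) * p_rel
        if i in (0, steps):
            weight *= 0.5  # trapezoid end points
        numerator += x * weight
        denominator += weight
    return numerator / denominator
```

Under these assumptions, an affected subject with an affected relative receives a higher conditional mean than an affected subject with an unaffected relative, which matches the abstract's observation that high-risk affected subjects are the more informative cases. Extending the sketch to whole pedigrees would require a multivariate normal integral, which is not shown here.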

Source
DOI: http://dx.doi.org/10.1002/sim.7248
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5810411
June 2017

Gene-based segregation method for identifying rare variants in family-based sequencing studies.

Genet Epidemiol 2017 05 13;41(4):309-319. Epub 2017 Feb 13.

Channing Division of Network Medicine, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, Massachusetts, United States of America.

Whole-exome sequencing using family data has identified rare coding variants in Mendelian diseases or complex diseases with Mendelian subtypes, using filters based on variant novelty, functionality, and segregation with the phenotype within families. However, formal statistical approaches are limited. We propose a gene-based segregation test (GESE) that quantifies the uncertainty of the filtering approach. It is constructed using the probability of segregation events under the null hypothesis of Mendelian transmission. This test takes into account different degrees of relatedness in families, the number of functional rare variants in the gene, and their minor allele frequencies in the corresponding population. In addition, a weighted version of this test allows incorporating additional subject phenotypes to improve statistical power. We show via simulations that the GESE and weighted GESE tests maintain appropriate type I error rate, and have greater power than several commonly used region-based methods. We apply our method to whole-exome sequencing data from 49 extended pedigrees with severe, early-onset chronic obstructive pulmonary disease (COPD) in the Boston Early-Onset COPD study (BEOCOPD) and identify several promising candidate genes. Our proposed methods show great potential for identifying rare coding variants of large effect and high penetrance for family-based sequencing data. The proposed tests are implemented in an R package that is available on CRAN (https://cran.r-project.org/web/packages/GESE/).

Source
DOI: http://dx.doi.org/10.1002/gepi.22037
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5397337
May 2017

Genetic loci associated with chronic obstructive pulmonary disease overlap with loci for lung function and pulmonary fibrosis.

Nat Genet 2017 Mar 6;49(3):426-432. Epub 2017 Feb 6.

Pulmonary, Critical Care, Sleep and Allergy Division, Department of Internal Medicine, University of Nebraska Medical Center, Omaha, Nebraska, USA.

Chronic obstructive pulmonary disease (COPD) is a leading cause of mortality worldwide. We performed a genetic association study in 15,256 cases and 47,936 controls, with replication of select top results (P < 5 × 10) in 9,498 cases and 9,748 controls. In the combined meta-analysis, we identified 22 loci associated at genome-wide significance, including 13 new associations with COPD. Nine of these 13 loci have been associated with lung function in general population samples, while 4 (EEFSEC, DSP, MTCL1, and SFTPD) are new. We noted two loci shared with pulmonary fibrosis (FAM13A and DSP) but that had opposite risk alleles for COPD. None of our loci overlapped with genome-wide associations for asthma, although one locus has been implicated in joint susceptibility to asthma and obesity. We also identified genetic correlation between COPD and asthma. Our findings highlight new loci associated with COPD, demonstrate the importance of specific loci associated with lung function to COPD, and identify potential regions of genetic overlap between COPD and other respiratory diseases.

Source
DOI: http://dx.doi.org/10.1038/ng.3752
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5381275
March 2017

Increasing Generality and Power of Rare-Variant Tests by Utilizing Extended Pedigrees.

Am J Hum Genet 2016 Oct 22;99(4):846-859. Epub 2016 Sep 22.

Division of Genetics, Brigham and Women's Hospital and Harvard Medical School, Boston, MA 02115, USA; Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA.

Recently, multiple studies have performed whole-exome or whole-genome sequencing to identify groups of rare variants associated with complex traits and diseases. They have primarily utilized case-control study designs that often require thousands of individuals to reach acceptable statistical power. Family-based studies can be more powerful because a rare variant can be enriched in an extended pedigree and segregate with the phenotype. Although many methods have been proposed for using family data to discover rare variants involved in a disease, a majority of them focus on a specific pedigree structure and are designed to analyze either binary or continuously measured outcomes. In this article, we propose RareIBD, a general and powerful approach to identifying rare variants involved in disease susceptibility. Our method can be applied to large extended families of arbitrary structure, including pedigrees with only affected individuals. The method accommodates both binary and quantitative traits. A series of simulation experiments suggest that RareIBD is a powerful test that outperforms existing approaches. In addition, our method accounts for individuals in top generations, which are not usually genotyped in extended families. In contrast to available statistical tests, RareIBD generates accurate p values even when genetic data from these individuals are missing. We applied RareIBD, as well as other methods, to two extended family datasets generated by different genotyping technologies and representing different ethnicities. The analysis of real data confirmed that RareIBD is the only method that properly controls type I error.

Source
DOI: http://dx.doi.org/10.1016/j.ajhg.2016.08.015
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5065687
October 2016

Boosting Gene Mapping Power and Efficiency with Efficient Exact Variance Component Tests of Single Nucleotide Polymorphism Sets.

Genetics 2016 11 19;204(3):921-931. Epub 2016 Sep 19.

Department of Biostatistics, University of California, Los Angeles, California 90095.

Single nucleotide polymorphism (SNP) set tests have been a powerful method for analyzing next-generation sequencing (NGS) data. The popular sequence kernel association test (SKAT) method tests a set of variants as random effects in the linear mixed model setting. Its P-value is calculated based on asymptotic theory that requires a large sample size. Therefore, it is known that SKAT is conservative and can lose power at small or moderate sample sizes. Given the current cost of sequencing technology, scales of NGS are still limited. In this report, we derive and implement computationally efficient, exact (nonasymptotic) score (eScore), likelihood ratio (eLRT), and restricted likelihood ratio (eRLRT) tests, ExactVCTest, that can achieve high power even when sample sizes are small. We perform simulation studies under various genetic scenarios. Our ExactVCTest (i.e., eScore, eLRT, eRLRT) exhibits well-controlled type I error. Under the alternative model, eScore P-values are universally smaller than those from SKAT. eLRT and eRLRT demonstrate significantly higher power than eScore, SKAT, and SKAT optimal (SKAT-o) across all scenarios and various sample sizes. We applied these tests to an exome sequencing study. Our findings replicate previous results and shed light on rare variant effects within genes. The software package is implemented in the open source, high-performance technical computing language Julia, and is freely available at https://github.com/Tao-Hu/VarianceComponentTest.jl. Analysis of each trait in the exome sequencing data set with 399 individuals and 16,619 genes takes around 1 min on a desktop computer.
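
For orientation, the kind of variance component score statistic that SKAT-type SNP-set tests are built on can be sketched in a few lines. This is a simplified illustration, not the ExactVCTest package (which is written in Julia) and not its exact P-value computation: it assumes a continuous trait, an intercept-only null model with no covariates, and caller-supplied variant weights. The function name `snp_set_score_statistic` is hypothetical.

```python
def snp_set_score_statistic(genotypes, phenotypes, weights):
    """Unscaled SNP-set score statistic Q = sum_j w_j * (sum_i g_ij * r_i)^2.

    genotypes:  list of per-subject rows, each a list of allele counts (0/1/2)
    phenotypes: list of continuous trait values, one per subject
    weights:    one non-negative weight per variant (assumed given)
    """
    n = len(phenotypes)
    mean_y = sum(phenotypes) / n
    # Residuals under the intercept-only null model (assumption: no covariates).
    residuals = [y - mean_y for y in phenotypes]
    q = 0.0
    for j, weight in enumerate(weights):
        # Per-variant score: covariance-like sum of genotype times residual.
        score_j = sum(genotypes[i][j] * residuals[i] for i in range(n))
        q += weight * score_j ** 2
    return q


# Toy data: 4 subjects, 2 variants (values are made up for illustration).
toy_genotypes = [[0, 1], [1, 0], [2, 1], [0, 0]]
toy_phenotypes = [1.0, 2.0, 3.0, 2.0]
print(snp_set_score_statistic(toy_genotypes, toy_phenotypes, [1.0, 1.0]))
```

Turning Q into a P-value is where the approaches differ: SKAT relies on an asymptotic null distribution, whereas the abstract describes exact finite-sample tests. Neither calibration is reproduced in this sketch.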

Source
DOI: http://dx.doi.org/10.1534/genetics.116.190454
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5105869
November 2016

FARVATX: Family-Based Rare Variant Association Test for X-Linked Genes.

Genet Epidemiol 2016 09 21;40(6):475-85. Epub 2016 Jun 21.

Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul, Korea.

Although the X chromosome has many genes that are functionally related to human diseases, the complicated biological properties of the X chromosome have prevented efficient genetic association analyses, and only a few significantly associated X-linked variants have been reported for complex traits. For instance, dosage compensation of X-linked genes is often achieved via the inactivation of one allele in each X-linked variant in females; however, some X-linked variants can escape this X chromosome inactivation. Efficient genetic analyses cannot be conducted without prior knowledge about the gene expression process of X-linked variants, and misspecified information can lead to power loss. In this report, we propose new statistical methods for rare X-linked variant genetic association analysis of dichotomous phenotypes with family-based samples. The proposed methods are computationally efficient and can complete X-linked analyses within a few hours. Simulation studies demonstrate the statistical efficiency of the proposed methods, which were then applied to rare-variant association analysis of the X chromosome in chronic obstructive pulmonary disease. Some promising significant X-linked genes were identified, illustrating the practical importance of the proposed methods.

Source
DOI: http://dx.doi.org/10.1002/gepi.21979
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4981534
September 2016