Publications by authors named "Hyun Min Kang"

112 Publications

FIVEx: an interactive eQTL browser across public datasets.

Bioinformatics 2021 Aug 30. Epub 2021 Aug 30.

Department of Biostatistics and the Center for Statistical Genetics, University of Michigan, Ann Arbor, MI.

Summary: Expression quantitative trait loci (eQTLs) characterize the associations between genetic variation and gene expression to provide insights into tissue-specific gene regulation. Interactive visualization of tissue-specific eQTLs or splice QTLs (sQTLs) can facilitate our understanding of functional variants relevant to disease-related traits. However, combining the multi-dimensional nature of eQTLs/sQTLs into a concise and informative visualization is challenging. Existing QTL visualization tools provide useful ways to summarize the unprecedented scale of transcriptomic data but are not necessarily tailored to answer questions about the functional interpretations of trait-associated variants or other variants of interest. We developed FIVEx, an interactive eQTL/sQTL browser with an intuitive interface tailored to the functional interpretation of associated variants. It features the ability to navigate seamlessly between different data views while providing relevant tissue- and locus-specific information to offer users a better understanding of population-scale multi-tissue transcriptomic profiles. Our implementation of the FIVEx browser on the EBI eQTL catalogue, encompassing 16 publicly available RNA-seq studies, provides important insights for understanding potential tissue-specific regulatory mechanisms underlying trait-associated signals.

Availability And Implementation: A FIVEx instance visualizing EBI eQTL catalogue data can be found at https://fivex.sph.umich.edu. Its source code is available under an MIT license at https://github.com/statgen/fivex.

Supplementary Information: Supplementary data are available at Bioinformatics online.

Source
http://dx.doi.org/10.1093/bioinformatics/btab614
August 2021

Identification of novel and rare variants associated with handgrip strength using whole genome sequence data from the NHLBI Trans-Omics in Precision Medicine (TOPMed) Program.

PLoS One 2021 2;16(7):e0253611. Epub 2021 Jul 2.

Department of Biostatistics, Boston University School of Public Health, Boston, MA, United States of America.

Handgrip strength is a widely used measure of muscle strength and a predictor of a range of morbidities including cardiovascular diseases and all-cause mortality. Previous genome-wide association studies of handgrip strength have focused on common variants primarily in persons of European descent. We aimed to identify rare and ancestry-specific genetic variants associated with handgrip strength by conducting whole-genome sequence association analyses using 13,552 participants from six studies representing diverse population groups from the Trans-Omics in Precision Medicine (TOPMed) Program. By leveraging multiple handgrip strength measures performed in study participants over time, we increased our effective sample size by 7-12%. Single-variant analyses identified ten handgrip strength loci among African-Americans: four rare variants, five low-frequency variants, and one common variant. One significant and four suggestive genes were identified associated with handgrip strength when aggregating rare and functional variants; all associations were ancestry-specific. We additionally leveraged the different ancestries available in the UK Biobank to further explore the ancestry-specific association signals from the single-variant association analyses. In conclusion, our study identified 11 new loci associated with handgrip strength with rare and/or ancestry-specific genetic variations, highlighting the added value of whole-genome sequencing in diverse samples. Several of the associations identified using single-variant or aggregate analyses lie in genes with a function relevant to the brain or muscle or were reported to be associated with muscle or age-related traits. Further studies in samples with sequence data and diverse ancestries are needed to confirm these findings.

Source
http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0253611
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8253404
July 2021

Microscopic examination of spatial transcriptome using Seq-Scope.

Cell 2021 Jun 10;184(13):3559-3572.e22. Epub 2021 Jun 10.

Department of Molecular and Integrative Physiology, University of Michigan Medical School, Ann Arbor, MI 48109, USA.

Spatial barcoding technologies have the potential to reveal histological details of transcriptomic profiles; however, they are currently limited by their low resolution. Here, we report Seq-Scope, a spatial barcoding technology with a resolution comparable to an optical microscope. Seq-Scope is based on a solid-phase amplification of randomly barcoded single-molecule oligonucleotides using an Illumina sequencing platform. The resulting clusters annotated with spatial coordinates are processed to expose RNA-capture moiety. These RNA-capturing barcoded clusters define the pixels of Seq-Scope that are ∼0.5-0.8 μm apart from each other. From tissue sections, Seq-Scope visualizes spatial transcriptome heterogeneity at multiple histological scales, including tissue zonation according to the portal-central (liver), crypt-surface (colon) and inflammation-fibrosis (injured liver) axes, cellular components including single-cell types and subtypes, and subcellular architectures of nucleus and cytoplasm. Seq-Scope is quick, straightforward, precise, and easy-to-implement and makes spatial single-cell analysis accessible to a wide group of biomedical researchers.

Source
http://dx.doi.org/10.1016/j.cell.2021.05.010
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8238917
June 2021

Determinants of penetrance and variable expressivity in monogenic metabolic conditions across 77,184 exomes.

Nat Commun 2021 06 9;12(1):3505. Epub 2021 Jun 9.

Department of Medicine and Therapeutics, The Chinese University of Hong Kong, Hong Kong, China.

Hundreds of thousands of genetic variants have been reported to cause severe monogenic diseases, but the probability that a variant carrier develops the disease (termed penetrance) is unknown for virtually all of them. Additionally, the clinical utility of common polygenetic variation remains uncertain. Using exome sequencing from 77,184 adult individuals (38,618 multi-ancestral individuals from a type 2 diabetes case-control study and 38,566 participants from the UK Biobank, for whom genotype array data were also available), we apply clinical standard-of-care gene variant curation for eight monogenic metabolic conditions. Rare variants causing monogenic diabetes and dyslipidemias display effect sizes significantly larger than the top 1% of the corresponding polygenic scores. Nevertheless, penetrance estimates for monogenic variant carriers average 60% or lower for most conditions. We assess epidemiologic and genetic factors contributing to risk prediction in monogenic variant carriers, demonstrating that inclusion of polygenic variation significantly improves biomarker estimation for two monogenic dyslipidemias.

Source
http://dx.doi.org/10.1038/s41467-021-23556-4
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8190084
June 2021

Sparse Allele Vectors and the Savvy Software Suite.

Bioinformatics 2021 May 14. Epub 2021 May 14.

Center for Statistical Genetics, University of Michigan, Ann Arbor, MI, 48109, USA.

Summary: The sparse allele vectors (SAV) file format is an efficient storage format for large-scale DNA variation data and is designed for high throughput association analysis by leveraging techniques for fast deserialization of data into computer memory. A command line interface has been developed to complement the storage format and supports basic features like importing, exporting and subsetting. Additionally, a C++ programming API is available allowing for easy integration into analysis software.

Availability And Implementation: https://github.com/statgen/savvy.

Supplementary Information: Supplementary data are available at Bioinformatics online.
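The idea of storing allele data sparsely can be illustrated with a minimal Python sketch. It is not the SAV file format or the savvy API: the function names `to_sparse`, `to_dense`, and `allele_count` are hypothetical, and the sketch only assumes that most genotype entries for a variant are zero (reference), so that keeping the non-zero entries alone is enough to reconstruct the vector.

```python
from typing import List, Tuple

# Hypothetical illustration only; this is not the SAV on-disk layout.
SparseVector = Tuple[int, List[Tuple[int, int]]]


def to_sparse(genotypes: List[int]) -> SparseVector:
    """Return (length, [(index, value), ...]) keeping only non-zero entries."""
    entries = [(i, g) for i, g in enumerate(genotypes) if g != 0]
    return len(genotypes), entries


def to_dense(sparse: SparseVector) -> List[int]:
    """Rebuild the full genotype vector from its sparse representation."""
    length, entries = sparse
    dense = [0] * length
    for index, value in entries:
        dense[index] = value
    return dense


def allele_count(sparse: SparseVector) -> int:
    """Sum alternate-allele counts without touching the zero entries."""
    _, entries = sparse
    return sum(value for _, value in entries)


if __name__ == "__main__":
    # Alternate-allele counts (0, 1 or 2) for eight samples at one variant.
    dense = [0, 0, 1, 0, 2, 0, 0, 0]
    sparse = to_sparse(dense)
    print(sparse)
    print(allele_count(sparse))
```

For rare variants the sparse list is far shorter than the sample count, which is why summaries such as allele counts can be computed by visiting only the stored entries.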

Source
http://dx.doi.org/10.1093/bioinformatics/btab378
May 2021

muCNV: Genotyping Structural Variants for Population-level Sequencing.

Bioinformatics 2021 Mar 24. Epub 2021 Mar 24.

Human Genetics Center, University of Texas Health Science Center at Houston, 1200 Pressler St., Houston, TX 77030, USA.

Motivation: There is high demand for joint genotyping of structural variations with short-read sequencing, but efficient and accurate genotyping at population scale is a challenging task.

Results: We developed muCNV, which aggregates per-sample summary pileups for joint genotyping of >100,000 samples. Pilot results show very low Mendelian inconsistencies. Applications to large-scale projects in the cloud show the computational efficiency of the muCNV genotyping pipeline.

Availability: muCNV is publicly available for download at: https://github.com/gjun/muCNV.

Supplementary Information: Supplementary data are available at Bioinformatics online.

Source
http://dx.doi.org/10.1093/bioinformatics/btab199
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8496513
March 2021

Robust, flexible, and scalable tests for Hardy-Weinberg equilibrium across diverse ancestries.

Genetics 2021 05;218(1)

Department of Biochemistry, Wake Forest School of Medicine, Winston-Salem, NC 27157, USA.

Traditional Hardy-Weinberg equilibrium (HWE) tests (the χ² test and the exact test) have long been used as a metric for evaluating genotype quality, as technical artifacts leading to incorrect genotype calls often can be identified as deviations from HWE. However, in data sets composed of individuals from diverse ancestries, HWE can be violated even without genotyping error, complicating the use of HWE testing to assess genotype data quality. In this manuscript, we present the Robust Unified Test for HWE (RUTH) to test for HWE while accounting for population structure and genotype uncertainty, and to evaluate the impact of population heterogeneity and genotype uncertainty on the standard HWE tests and alternative methods using simulated and real sequence data sets. Our results demonstrate that ignoring population structure or genotype uncertainty in HWE tests can inflate false-positive rates by many orders of magnitude. Our evaluations demonstrate different tradeoffs between false positives and statistical power across the methods, with RUTH consistently among the best across all evaluations. RUTH is implemented as a practical and scalable software tool to rapidly perform HWE tests across millions of markers and hundreds of thousands of individuals while supporting standard VCF/BCF formats. RUTH is publicly available at https://www.github.com/statgen/ruth.
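The point that pooled samples from diverse ancestries can violate HWE without any genotyping error can be seen in a small sketch of the traditional chi-square test. This is the textbook one-degree-of-freedom test, not the RUTH method; the function name `hwe_chi_square` and the genotype counts are made up for illustration.

```python
import math
from typing import Tuple


def hwe_chi_square(n_aa: int, n_ab: int, n_bb: int) -> Tuple[float, float]:
    """Textbook chi-square test for HWE from three genotype counts.

    Returns (chi-square statistic, p-value with 1 degree of freedom).
    """
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("at least one genotype is required")
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of the chi-square distribution with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


if __name__ == "__main__":
    # Two hypothetical populations, each in HWE on its own
    # (allele A frequency 0.9 and 0.1, 1000 individuals each).
    pop1 = (810, 180, 10)
    pop2 = (10, 180, 810)
    pooled = tuple(a + b for a, b in zip(pop1, pop2))

    for label, counts in (("pop1", pop1), ("pop2", pop2), ("pooled", pooled)):
        chi2, p_value = hwe_chi_square(*counts)
        print(f"{label}: chi2={chi2:.1f}, p={p_value:.3g}")
```

Each population passes the test on its own, while the pooled counts show a large deficit of heterozygotes and fail it, which is the kind of structure-driven deviation that a test ignoring ancestry would flag as a quality problem.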

Source
http://dx.doi.org/10.1093/genetics/iyab044
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8128395
May 2021

Genetic architectures of proximal and distal colorectal cancer are partly distinct.

Gut 2021 Jul 25;70(7):1325-1334. Epub 2021 Feb 25.

Cancer Prevention and Control Program, Catalan Institute of Oncology - IDIBELL, L'Hospitalet de Llobregat, Barcelona, Spain.

Objective: An understanding of the etiologic heterogeneity of colorectal cancer (CRC) is critical for improving precision prevention, including individualized screening recommendations and the discovery of novel drug targets and repurposable drug candidates for chemoprevention. Known differences in molecular characteristics and environmental risk factors among tumors arising in different locations of the colorectum suggest partly distinct mechanisms of carcinogenesis. The extent to which the contribution of inherited genetic risk factors for CRC differs by anatomical subsite of the primary tumor has not been examined.

Design: To identify new anatomical subsite-specific risk loci, we performed genome-wide association study (GWAS) meta-analyses including data of 48 214 CRC cases and 64 159 controls of European ancestry. We characterised effect heterogeneity at CRC risk loci using multinomial modelling.

Results: We identified 13 loci that reached genome-wide significance (p<5×10) and that were not reported by previous GWASs for overall CRC risk. Multiple lines of evidence support candidate genes at several of these loci. We detected substantial heterogeneity between anatomical subsites. Just over half (61) of 109 known and new risk variants showed no evidence for heterogeneity. In contrast, 22 variants showed association with distal CRC (including rectal cancer), but no evidence for association or an attenuated association with proximal CRC. For two loci, there was strong evidence for effects confined to proximal colon cancer.

Conclusion: Genetic architectures of proximal and distal CRC are partly distinct. Studies of risk factors and mechanisms of carcinogenesis, and precision prevention strategies should take into consideration the anatomical subsite of the tumour.

Source
http://dx.doi.org/10.1136/gutjnl-2020-321534
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8223655
July 2021

Sequencing of 53,831 diverse genomes from the NHLBI TOPMed Program.

Nature 2021 02 10;590(7845):290-299. Epub 2021 Feb 10.

The Broad Institute of MIT and Harvard, Cambridge, MA, USA.

The Trans-Omics for Precision Medicine (TOPMed) programme seeks to elucidate the genetic architecture and biology of heart, lung, blood and sleep disorders, with the ultimate goal of improving diagnosis, treatment and prevention of these diseases. The initial phases of the programme focused on whole-genome sequencing of individuals with rich phenotypic data and diverse backgrounds. Here we describe the TOPMed goals and design as well as the available resources and early insights obtained from the sequence data. The resources include a variant browser, a genotype imputation server, and genomic and phenotypic data that are available through dbGaP (Database of Genotypes and Phenotypes). In the first 53,831 TOPMed samples, we detected more than 400 million single-nucleotide and insertion or deletion variants after alignment with the reference genome. Additional previously undescribed variants were detected through assembly of unmapped reads and customized analysis in highly variable loci. Among the more than 400 million detected variants, 97% have frequencies of less than 1% and 46% are singletons that are present in only one individual (53% among unrelated individuals). These rare variants provide insights into mutational processes and recent human evolutionary history. The extensive catalogue of genetic variation in TOPMed studies provides unique opportunities for exploring the contributions of rare and noncoding sequence variants to phenotypic variation. Furthermore, combining TOPMed haplotypes with modern imputation methods improves the power and reach of genome-wide association studies to include variants down to a frequency of approximately 0.01%.

Source
http://dx.doi.org/10.1038/s41586-021-03205-y
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7875770
February 2021

FASTQuick: rapid and comprehensive quality assessment of raw sequence reads.

Gigascience 2021 Jan;10(2)

Department of Biostatistics, University of Michigan School of Public Health, 1415 Washington Heights, Ann Arbor, MI 48109, USA.

Background: Rapid and thorough quality assessment of sequenced genomes on an ultra-high-throughput scale is crucial for successful large-scale genomic studies. Comprehensive quality assessment typically requires full genome alignment, which costs a substantial amount of computational resources and turnaround time. Existing tools are either computationally expensive owing to full alignment or lacking essential quality metrics by skipping read alignment.

Findings: We developed a set of rapid and accurate methods to produce comprehensive quality metrics directly from a subset of raw sequence reads (from whole-genome or whole-exome sequencing) without full alignment. Our methods offer orders of magnitude faster turnaround time than existing full alignment-based methods while providing comprehensive and sophisticated quality metrics, including estimates of genetic ancestry and cross-sample contamination.

Conclusions: By rapidly and comprehensively performing the quality assessment, our tool will help investigators detect potential issues in ultra-high-throughput sequence reads in real time and at a low computational cost during the early stages of the analyses, ensuring high-quality downstream results and preventing unexpected loss in time, money, and invaluable specimens.

Source
http://dx.doi.org/10.1093/gigascience/giab004
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7844880
January 2021

Investigating rare pathogenic/likely pathogenic exonic variation in bipolar disorder.

Mol Psychiatry 2021 Jan 22. Epub 2021 Jan 22.

HudsonAlpha Institute for Biotechnology, Huntsville, AL, 35806, USA.

Bipolar disorder (BD) is a serious mental illness with substantial common variant heritability. However, the role of rare coding variation in BD is not well established. We examined the protein-coding (exonic) sequences of 3,987 unrelated individuals with BD and 5,322 controls of predominantly European ancestry across four cohorts from the Bipolar Sequencing Consortium (BSC). We assessed the burden of rare, protein-altering, single nucleotide variants classified as pathogenic or likely pathogenic (P-LP) both exome-wide and within several groups of genes with phenotypic or biologic plausibility in BD. While we observed an increased burden of rare coding P-LP variants within 165 genes identified as BD GWAS regions in 3,987 BD cases (meta-analysis OR = 1.9, 95% CI = 1.3-2.8, one-sided p = 6.0 × 10), this enrichment did not replicate in an additional 9,929 BD cases and 14,018 controls (OR = 0.9, one-sided p = 0.70). Although BD shares common variant heritability with schizophrenia, in the BSC sample we did not observe a significant enrichment of P-LP variants in SCZ GWAS genes, in two classes of neuronal synaptic genes (RBFOX2 and FMRP) associated with SCZ or in loss-of-function intolerant genes. In this study, the largest analysis of exonic variation in BD, individuals with BD do not carry a replicable enrichment of rare P-LP variants across the exome or in any of several groups of genes with biologic plausibility. Moreover, despite a strong shared susceptibility between BD and SCZ through common genetic variation, we do not observe an association between BD risk and rare P-LP coding variants in genes known to modulate risk for SCZ.

Source
http://dx.doi.org/10.1038/s41380-020-01006-9
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8295400
January 2021

Loss-of-function genomic variants highlight potential therapeutic targets for cardiovascular disease.

Nat Commun 2020 12 18;11(1):6417. Epub 2020 Dec 18.

The Institute for Translational Genomics and Population Sciences, Department of Pediatrics and Los Angeles Biomedical Research Institute, Harbor-UCLA, Torrance, CA, USA.

Pharmaceutical drugs targeting dyslipidemia and cardiovascular disease (CVD) may increase the risk of fatty liver disease and other metabolic disorders. To identify potential novel CVD drug targets without these adverse effects, we perform genome-wide analyses of participants in the HUNT Study in Norway (n = 69,479) to search for protein-altering variants with beneficial impact on quantitative blood traits related to cardiovascular disease, but without detrimental impact on liver function. We identify 76 (11 previously unreported) presumed causal protein-altering variants associated with one or more CVD- or liver-related blood traits. Nine of the variants are predicted to result in loss-of-function of the protein. This includes ZNF529:p.K405X, which is associated with decreased low-density-lipoprotein (LDL) cholesterol (P = 1.3 × 10) without being associated with liver enzymes or non-fasting blood glucose. Silencing of ZNF529 in human hepatoma cells results in upregulation of LDL receptor and increased LDL uptake in the cells. This suggests that inhibition of ZNF529 or its gene product should be prioritized as a novel candidate drug target for treating dyslipidemia and associated CVD.

Source
http://dx.doi.org/10.1038/s41467-020-20086-3
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7749177
December 2020

Integrating comprehensive functional annotations to boost power and accuracy in gene-based association analysis.

PLoS Genet 2020 12 15;16(12):e1009060. Epub 2020 Dec 15.

Department of Biostatistics and Center for Statistical Genetics, University of Michigan, Ann Arbor, MI, USA.

Gene-based association tests aggregate genotypes across multiple variants for each gene, providing an interpretable gene-level analysis framework for genome-wide association studies (GWAS). Early gene-based test applications often focused on rare coding variants; a more recent wave of gene-based methods, e.g. TWAS, use eQTLs to interrogate regulatory associations. Regulatory variants are expected to be particularly valuable for gene-based analysis, since most GWAS associations to date are non-coding. However, identifying causal genes from regulatory associations remains challenging and contentious. Here, we present a statistical framework and computational tool to integrate heterogeneous annotations with GWAS summary statistics for gene-based analysis, applied with comprehensive coding and tissue-specific regulatory annotations. We compare power and accuracy identifying causal genes across single-annotation, omnibus, and annotation-agnostic gene-based tests in simulation studies and an analysis of 128 traits from the UK Biobank, and find that incorporating heterogeneous annotations in gene-based association analysis increases power and performance identifying causal genes.

Source
http://dx.doi.org/10.1371/journal.pgen.1009060
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7737906
December 2020

Asthma and its relationship to mitochondrial copy number: Results from the Asthma Translational Genomics Collaborative (ATGC) of the Trans-Omics for Precision Medicine (TOPMed) program.

PLoS One 2020 25;15(11):e0242364. Epub 2020 Nov 25.

Center for Individualized and Genomic Medicine Research (CIGMA), Department of Internal Medicine, Henry Ford Health System, Detroit, Michigan, United States of America.

Background: Mitochondria support critical cellular functions, such as energy production through oxidative phosphorylation, regulation of reactive oxygen species, apoptosis, and calcium homeostasis.

Objective: Given the heightened level of cellular activity in patients with asthma, we sought to determine whether mitochondrial DNA (mtDNA) copy number measured in peripheral blood differed between individuals with and without asthma.

Methods: Whole genome sequence data were generated as part of the Trans-Omics for Precision Medicine (TOPMed) Program on participants from the Study of Asthma Phenotypes and Pharmacogenomic Interactions by Race-ethnicity (SAPPHIRE) and the Study of African Americans, Asthma, Genes, & Environment II (SAGE II). We restricted our analysis to individuals who self-identified as African American (3,651 asthma cases and 1,344 controls). Mitochondrial copy number was estimated using the sequencing read depth ratio for the mitochondrial and nuclear genomes. Respiratory complex expression was assessed using RNA-sequencing.

Results: Average mitochondrial copy number was significantly higher among individuals with asthma when compared with controls (SAPPHIRE: 218.60 vs. 200.47, P<0.001; SAGE II: 235.99 vs. 223.07, P<0.001). Asthma status was significantly associated with mitochondrial copy number after accounting for potential explanatory variables, such as participant age, sex, leukocyte counts, and mitochondrial haplogroup. Despite the consistent relationship between asthma status and mitochondrial copy number, the latter was not associated with time-to-exacerbation or patient-reported asthma control. Mitochondrial respiratory complex gene expression was disproportionately lower in individuals with asthma when compared with individuals without asthma and other protein-encoding genes.

Conclusions: We observed a robust association between asthma and higher mitochondrial copy number. An effect of asthma on mitochondrial function was also supported by lower respiratory complex gene expression in this group.
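The read depth ratio mentioned in the Methods can be sketched as a one-line calculation. The abstract does not give the exact formula, so the sketch below assumes a commonly used convention: mean mitochondrial depth divided by mean nuclear (autosomal) depth, multiplied by 2 because the nuclear genome is diploid. The function name `mtdna_copy_number` and the depth values are hypothetical.

```python
def mtdna_copy_number(mt_mean_depth: float,
                      nuclear_mean_depth: float,
                      nuclear_ploidy: int = 2) -> float:
    """Estimate mtDNA copies per cell from sequencing read depths.

    Assumed convention (not taken from the abstract): the nuclear genome
    contributes `nuclear_ploidy` copies per cell, so the depth ratio is
    scaled by that factor.
    """
    if nuclear_mean_depth <= 0:
        raise ValueError("nuclear depth must be positive")
    return nuclear_ploidy * mt_mean_depth / nuclear_mean_depth


if __name__ == "__main__":
    # Hypothetical sample: 3000x mitochondrial depth, 30x nuclear depth.
    print(mtdna_copy_number(3000.0, 30.0))
```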

Source
http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0242364
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7688161
January 2021

Population-scale study of eRNA transcription reveals bipartite functional enhancer architecture.

Nat Commun 2020 11 24;11(1):5963. Epub 2020 Nov 24.

Department of Molecular Biology and Genetics, Cornell University, Ithaca, NY, 14853, USA.

Enhancer RNAs (eRNA) are unstable non-coding RNAs, transcribed bidirectionally from active regulatory sequences, whose expression levels correlate with enhancer activity. We use capped-nascent-RNA sequencing to efficiently capture bidirectional transcription initiation across several human lymphoblastoid cell lines (Yoruba population) and detect ~75,000 eRNA transcription sites with high sensitivity and specificity. The use of nascent-RNA sequencing sidesteps the confounding effect of eRNA instability. We identify quantitative trait loci (QTLs) associated with the level and directionality of eRNA expression. High-resolution analyses of these two types of QTLs reveal distinct positions of enrichment at the central transcription factor (TF) binding regions and at the flanking eRNA initiation regions, both of which are associated with mRNA expression QTLs. These two regions (the central TF-binding footprint and the eRNA initiation cores) define a bipartite architecture of enhancers, inform enhancer function, and can be used as an indicator of the significance of non-coding regulatory variants.

Source
http://dx.doi.org/10.1038/s41467-020-19829-z
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7687912
November 2020

Holistic characterization of single-hepatocyte transcriptome responses to high-fat diet.

Am J Physiol Endocrinol Metab 2021 02 26;320(2):E244-E258. Epub 2020 Oct 26.

Department of Molecular and Integrative Physiology and Institute for Gerontology, University of Michigan Medical School, Ann Arbor, Michigan.

During nutritional overload and obesity, hepatocyte function is grossly altered, and a subset of hepatocytes begins to accumulate fat droplets, leading to nonalcoholic fatty liver disease (NAFLD). Recent single-cell studies revealed how nonparenchymal cells, such as macrophages, hepatic stellate cells, and endothelial cells, heterogeneously respond to NAFLD. However, it remains to be characterized how hepatocytes, the major constituents of the liver, respond to nutritional overload in NAFLD. Here, using droplet-based, single-cell RNA sequencing (Drop-seq), we characterized how the transcriptomic landscape of individual hepatocytes is altered in response to high-fat diet (HFD) and NAFLD. We showed that the entire hepatocyte population undergoes substantial transcriptome changes upon HFD, although the patterns of alteration were highly heterogeneous, with zonation-dependent and -independent effects. Periportal (zone 1) hepatocytes downregulated many zone 1-specific marker genes, whereas a small number of genes mediating gluconeogenesis were upregulated. Pericentral (zone 3) hepatocytes also downregulated many zone 3-specific genes; however, they upregulated several genes that promote HFD-induced fat droplet formation, consistent with findings that zone 3 hepatocytes accumulate more lipid droplets. Zone 3 hepatocytes also upregulated ketogenic pathways as an adaptive mechanism to HFD. Interestingly, many of the top HFD-induced genes, which encode proteins regulating lipid metabolism, were strongly co-expressed with each other in a subset of hepatocytes, producing a variegated pattern of spatial co-localization that is independent of metabolic zonation. In conclusion, our data set provides a useful resource for understanding hepatocellular alteration during NAFLD at single cell level.

Source
http://dx.doi.org/10.1152/ajpendo.00391.2020
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8260362
February 2021

MEPE loss-of-function variant associates with decreased bone mineral density and increased fracture risk.

Nat Commun 2020 10 23;11(1):4093. Epub 2020 Oct 23.

Division of Cardiovascular Medicine, Department of Internal Medicine, University of Michigan, 1500 E. Medical Center Dr., Ann Arbor, MI, 48109, USA.

A major challenge in genetic association studies is that most associated variants fall in the non-coding part of the human genome. We searched for variants associated with bone mineral density (BMD) after enriching the discovery cohort for loss-of-function (LoF) mutations by sequencing a subset of the Nord-Trøndelag Health Study, followed by imputation in the remaining sample (N = 19,705), and identified ten known BMD loci. However, one previously unreported variant, LoF mutation in MEPE, p.(Lys70IlefsTer26, minor allele frequency [MAF] = 0.8%), was associated with decreased ultradistal forearm BMD (P-value = 2.1 × 10), and increased osteoporosis (P-value = 4.2 × 10) and fracture risk (P-value = 1.6 × 10). The MEPE LoF association with BMD and fractures was further evaluated in 279,435 UK (MAF = 0.05%, heel bone estimated BMD P-value = 1.2 × 10, any fracture P-value = 0.05) and 375,984 Icelandic samples (MAF = 0.03%, arm BMD P-value = 0.12, forearm fracture P-value = 0.005). Screening for the MEPE LoF mutations before adulthood could potentially prevent osteoporosis and fractures due to the lifelong effect on BMD observed in the study. A key implication for precision medicine is that high-impact functional variants missing from the publicly available cosmopolitan panels could be clinically more relevant than polygenic risk scores.

Source
http://dx.doi.org/10.1038/s41467-020-17315-0
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7585430
October 2020

Identifying Novel Susceptibility Genes for Colorectal Cancer Risk From a Transcriptome-Wide Association Study of 125,478 Subjects.

Gastroenterology 2021 03 12;160(4):1164-1178.e6. Epub 2020 Oct 12.

Department of Cancer Biology and Genetics and the Comprehensive Cancer Center, The Ohio State University, Columbus, Ohio.

Background And Aims: Susceptibility genes and the underlying mechanisms for the majority of risk loci identified by genome-wide association studies (GWAS) for colorectal cancer (CRC) risk remain largely unknown. We conducted a transcriptome-wide association study (TWAS) to identify putative susceptibility genes.

Methods: Gene-expression prediction models were built using transcriptome and genetic data from the 284 normal transverse colon tissues of European descendants from the Genotype-Tissue Expression (GTEx), and model performance was evaluated using data from The Cancer Genome Atlas (n = 355). We applied the gene-expression prediction models and GWAS data to evaluate associations of genetically predicted gene-expression with CRC risk in 58,131 CRC cases and 67,347 controls of European ancestry. Dual-luciferase reporter assays and knockdown experiments in CRC cells and tumor xenografts were conducted.

Results: We identified 25 genes associated with CRC risk at a Bonferroni-corrected threshold of P < 9.1 × 10, including genes in 4 novel loci: PYGL (14q22.1), RPL28 (19q13.42), CAPN12 (19q13.2), and MYH7B and MAP1LC3A (both at 20q11.22). In 9 known GWAS-identified loci, we uncovered 9 genes that have not been reported previously, whereas 4 genes remained statistically significant after adjusting for the lead risk variant of the locus. Through colocalization analysis in GWAS loci, we additionally identified 12 putative susceptibility genes that were supported by TWAS analysis at P < .01. We showed that the risk allele of the lead risk variant rs1741640 affected the promoter activity of CABLES2. Knockdown experiments confirmed that CABLES2 plays a vital role in colorectal carcinogenesis.

Conclusions: Our study reveals new putative susceptibility genes and provides new insight into the biological mechanisms underlying CRC development.

Source
DOI: http://dx.doi.org/10.1053/j.gastro.2020.08.062
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7956223
March 2021

Author Correction: Multiplexed droplet single-cell RNA-sequencing using natural genetic variation.

Nat Biotechnol 2020 Nov;38(11):1356

Institute for Human Genetics (IHG), University of California, San Francisco, San Francisco, California, USA.

An amendment to this paper has been published and can be accessed via a link at the top of the paper.

Source
DOI: http://dx.doi.org/10.1038/s41587-020-0715-9
November 2020

Type 2 and interferon inflammation regulate SARS-CoV-2 entry factor expression in the airway epithelium.

Nat Commun 2020 10 12;11(1):5139. Epub 2020 Oct 12.

Center for Genes, Environment, and Health, National Jewish Health, Denver, CO, USA.

Coronavirus disease 2019 (COVID-19) is caused by SARS-CoV-2, an emerging virus that utilizes host proteins ACE2 and TMPRSS2 as entry factors. Understanding the factors affecting the pattern and levels of expression of these genes is important for deeper understanding of SARS-CoV-2 tropism and pathogenesis. Here we explore the role of genetics and co-expression networks in regulating these genes in the airway, through the analysis of nasal airway transcriptome data from 695 children. We identify expression quantitative trait loci for both ACE2 and TMPRSS2, that vary in frequency across world populations. We find TMPRSS2 is part of a mucus secretory network, highly upregulated by type 2 (T2) inflammation through the action of interleukin-13, and that the interferon response to respiratory viruses highly upregulates ACE2 expression. IL-13 and virus infection mediated effects on ACE2 expression were also observed at the protein level in the airway epithelium. Finally, we define airway responses to common coronavirus infections in children, finding that these infections generate host responses similar to other viral species, including upregulation of IL6 and ACE2. Our results reveal possible mechanisms influencing SARS-CoV-2 infectivity and COVID-19 clinical outcomes.

Source
DOI: http://dx.doi.org/10.1038/s41467-020-18781-2
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7550582
October 2020

Mapping the 17q12-21.1 Locus for Variants Associated with Early-Onset Asthma in African Americans.

Am J Respir Crit Care Med 2021 02;203(4):424-436

Department of Internal Medicine, Center for Individualized and Genomic Medicine Research and.

Rationale: The 17q12-21.1 locus is one of the most highly replicated genetic associations with asthma. Individuals of African descent have lower linkage disequilibrium in this region, which could facilitate identifying causal variants.

Objectives: To identify functional variants at 17q12-21.1 associated with early-onset asthma among African American individuals.

Methods: We evaluated African American participants from SAPPHIRE (Study of Asthma Phenotypes and Pharmacogenomic Interactions by Race-Ethnicity) (n = 1,940), SAGE II (Study of African Americans, Asthma, Genes and Environment) (n = 885), and GCPD-A (Study of the Genetic Causes of Complex Pediatric Disorders-Asthma) (n = 2,805). Associations with asthma onset at ages under 5 years were meta-analyzed across cohorts. The lead signal was reevaluated considering haplotypes informed by genetic ancestry (i.e., African vs. European). Both an expression-quantitative trait locus analysis and a phenome-wide association study were performed on the lead variant.

Measurements And Main Results: The meta-analyzed results from SAPPHIRE, SAGE II, and GCPD-A identified rs11078928 as the top association for early-onset asthma. A haplotype analysis suggested that the asthma association partitioned most closely with the rs11078928 genotype. Genetic ancestry did not appear to influence the effect of this variant. In the expression-quantitative trait locus analysis, rs11078928 was related to alternative splicing of GSDMB (gasdermin-B) transcripts. The phenome-wide association study of rs11078928 suggested that this variant was predominantly associated with asthma and asthma-associated symptoms.

Conclusions: A splice-acceptor polymorphism appears to be a causal variant for asthma at the 17q12-21.1 locus. This variant appears to have the same magnitude of effect in individuals of African and European descent.

Source
DOI: http://dx.doi.org/10.1164/rccm.202006-2623OC
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7885840
February 2021

Single-Cell Transcriptome Analysis of Colon Cancer Cell Response to 5-Fluorouracil-Induced DNA Damage.

Cell Rep 2020 08;32(8):108077

Department of Molecular & Integrative Physiology and Institute for Gerontology, University of Michigan Medical School, Ann Arbor, MI 48109, USA.

DNA damage often induces heterogeneous cell-fate responses, such as cell-cycle arrest and apoptosis. Through single-cell RNA sequencing (scRNA-seq), we characterize the transcriptome response of cultured colon cancer cell lines to 5-fluorouracil (5FU)-induced DNA damage. After 5FU treatment, a single population of colon cancer cells adopts three distinct transcriptome phenotypes, which correspond to diversified cell-fate responses: apoptosis, cell-cycle checkpoint, and stress resistance. Although some genes are regulated uniformly across all groups of cells, many genes showed group-specific expression patterns mediating DNA damage responses specific to the corresponding cell fate. Some of these observations are reproduced at the protein level by flow cytometry and are replicated in cells treated with other 5FU-unrelated genotoxic drugs, camptothecin and etoposide. This work provides a resource for understanding heterogeneous DNA damage responses involving fractional killing and chemoresistance, which are among the major challenges in current cancer chemotherapy.

Source
DOI: http://dx.doi.org/10.1016/j.celrep.2020.108077
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7486130
August 2020

Type 2 and interferon inflammation strongly regulate SARS-CoV-2 related gene expression in the airway epithelium.

bioRxiv 2020 Apr 10. Epub 2020 Apr 10.

Center for Genes, Environment, and Health, National Jewish Health, Denver, CO, 80206 USA.

Coronavirus disease 2019 (COVID-19) outcomes vary from asymptomatic infection to death. This disparity may reflect different airway levels of the SARS-CoV-2 receptor, ACE2, and the spike protein activator, TMPRSS2. Here we explore the role of genetics and co-expression networks in regulating these genes in the airway, through the analysis of nasal airway transcriptome data from 695 children. We identify expression quantitative trait loci (eQTL) for both ACE2 and TMPRSS2 that vary in frequency across world populations. Importantly, we find TMPRSS2 is part of a mucus secretory network, highly upregulated by T2 inflammation through the action of interleukin-13, and that interferon response to respiratory viruses highly upregulates ACE2 expression. Finally, we define airway responses to coronavirus infections in children, finding that these infections upregulate while also stimulating a more pronounced cytotoxic immune response relative to other respiratory viruses. Our results reveal mechanisms likely influencing SARS-CoV-2 infectivity and COVID-19 clinical outcomes.

Source
DOI: http://dx.doi.org/10.1101/2020.04.09.034454
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7239056
April 2020

Lung Function in African American Children with Asthma Is Associated with Novel Regulatory Variants of the KIT Ligand and Gene-By-Air-Pollution Interaction.

Genetics 2020 07 23;215(3):869-886. Epub 2020 Apr 23.

Department of Medicine, University of California, San Francisco, California 94143.

Baseline lung function, quantified as forced expiratory volume in the first second of exhalation (FEV1), is a standard diagnostic criterion used by clinicians to identify and classify lung diseases. Using whole-genome sequencing data from the National Heart, Lung, and Blood Institute Trans-Omics for Precision Medicine project, we identified a novel genetic association with FEV1 on chromosome 12 in 867 African American children with asthma (P = 1.26 × 10, β = 0.302). Conditional analysis within 1 Mb of the tag signal (rs73429450) yielded one major and two other weaker independent signals within this peak. We explored statistical and functional evidence for all variants in linkage disequilibrium with the three independent signals and identified nine variants as the most likely candidates responsible for the association with FEV1. Hi-C data and expression QTL analysis demonstrated that these variants physically interacted with KITLG (KIT ligand, also known as SCF), and their minor alleles were associated with increased expression of the KITLG gene in nasal epithelial cells. Gene-by-air-pollution interaction analysis found that the candidate variant rs58475486 interacted with past-year ambient sulfur dioxide exposure (P = 0.003, β = 0.32). This study identified a novel protective genetic association with FEV1, possibly mediated through KITLG, in African American children with asthma. This is the first study that has identified a genetic association between lung function and KITLG, which has an established role in orchestrating allergic inflammation in asthma.

Source
DOI: http://dx.doi.org/10.1534/genetics.120.303231
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7337089
July 2020

Ancestry-agnostic estimation of DNA sample contamination from sequence reads.

Genome Res 2020 02 24;30(2):185-194. Epub 2020 Jan 24.

Center for Statistical Genetics, University of Michigan, Ann Arbor, Michigan 48109-2029, USA.

Detecting and estimating DNA sample contamination are important steps to ensure high-quality genotype calls and reliable downstream analysis. Existing methods rely on population allele frequency information for accurate estimation of contamination rates. Correctly specifying population allele frequencies for each individual in early stage of sequence analysis is impractical or even impossible for large-scale sequencing centers that simultaneously process samples from multiple studies across diverse populations. On the other hand, incorrectly specified allele frequencies may result in substantial bias in estimated contamination rates. For example, we observed that existing methods often fail to identify 10% contaminated samples at a typical 3% contamination exclusion threshold when genetic ancestry is misspecified. Such an incomplete screening of contaminated samples substantially inflates the estimated rate of genotyping errors even in deeply sequenced genomes and exomes. We propose a robust statistical method that accurately estimates DNA contamination and is agnostic to genetic ancestry of the intended or contaminating sample. Our method integrates the estimation of genetic ancestry and DNA contamination in a unified likelihood framework by leveraging individual-specific allele frequencies projected from reference genotypes onto principal component coordinates. Our method can also be used for estimating genetic ancestries, similar to LASER or , but simultaneously accounting for potential contamination. We demonstrate that our method robustly estimates contamination rates and genetic ancestries across populations and contamination scenarios. We further demonstrate that, in the presence of contamination, genetic ancestry inference can be substantially biased with existing methods that ignore contamination, while our method corrects for such biases.

Source
DOI: http://dx.doi.org/10.1101/gr.246934.118
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7050530
February 2020

Identification of CFTR variants in Latino patients with cystic fibrosis from the Dominican Republic and Puerto Rico.

Pediatr Pulmonol 2020 02 30;55(2):533-540. Epub 2019 Oct 30.

Department of Pediatrics, Centro de Neumología Pediátrica, San Juan, Puerto Rico.

Background: In cystic fibrosis (CF), the spectrum and frequency of CFTR variants differ by geography and race/ethnicity. CFTR variants in White patients are well-described compared with Latino patients. No studies of CFTR variants have been done in patients with CF in the Dominican Republic or Puerto Rico.

Methods: CFTR was sequenced in 61 Dominican Republican patients and 21 Puerto Rican patients with CF and greater than 60 mmol/L sweat chloride. The spectrum of CFTR variants was identified and the proportion of patients with 0, 1, or 2 CFTR variants identified was determined. The functional effects of identified CFTR variants were investigated using clinical annotation databases and computational prediction tools.

Results: Our study found 10% of Dominican patients had two CFTR variants identified compared with 81% of Puerto Rican patients. No CFTR variants were identified in 69% of Dominican patients and 10% of Puerto Rican patients. In Dominican patients, there were 19 identified CFTR variants, accounting for 25 out of 122 disease alleles (20%). In Puerto Rican patients, there were 16 identified CFTR variants, accounting for 36 out of 42 disease alleles (86%). Thirty CFTR variants were identified overall. The most frequent variants for Dominican patients were p.Phe508del and p.Ala559Thr, and for Puerto Rican patients they were p.Phe508del, p.Arg1066Cys, p.Arg334Trp, and p.Ile507del.

Conclusions: In this first description of the CFTR variants in patients with CF from the Dominican Republic and Puerto Rico, there was a low detection rate of two CFTR variants after full sequencing with the majority of patients from the Dominican Republic without identified variants.

Source
DOI: http://dx.doi.org/10.1002/ppul.24549
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7571374
February 2020

Author Correction: Using and producing publicly available genomic data to accelerate discovery in nephrology.

Nat Rev Nephrol 2019 Sep;15(9):590

Center for Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA.

An amendment to this paper has been published and can be accessed via a link at the top of the paper.

Source
DOI: http://dx.doi.org/10.1038/s41581-019-0186-8
September 2019

Using and producing publicly available genomic data to accelerate discovery in nephrology.

Nat Rev Nephrol 2019 09;15(9):523-524

Center for Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA.

Source
DOI: http://dx.doi.org/10.1038/s41581-019-0166-z
September 2019
-->