Publications by authors named "Yulin Dai"

39 Publications

An integrative study of genetic variants with brain tissue expression identifies viral etiology and potential drug targets of multiple sclerosis.

Mol Cell Neurosci 2021 Sep 17;115:103656. Epub 2021 Jul 17.

Center for Precision Health, School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX 77030, USA; Human Genetics Center, School of Public Health, The University of Texas Health Science Center at Houston, Houston, TX 77030, USA; Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN 37203, USA.

Multiple sclerosis (MS) is a neuroinflammatory disorder leading to chronic disability. Brain lesions in MS commonly arise in normal-appearing white matter (NAWM). Genome-wide association studies (GWAS) have identified genetic variants associated with MS. Transcriptome alterations have been observed in case-control studies of NAWM. We developed a Cross-Dataset Evaluation (CDE) function for our network-based tool, Edge-Weighted Dense Module Search of GWAS (EW_dmGWAS). We applied CDE to integrate publicly available MS GWAS summary statistics of 41,505 cases and controls with collectively 38 NAWM expression samples, using the human protein interactome as the reference network, to investigate biological underpinnings of MS etiology. We validated the resulting modules with colocalization of GWAS and expression quantitative trait loci (eQTL) signals, using GTEx Consortium expression data for MS-relevant tissues: 14 brain tissues and 4 immune-related tissues. Other network assessments included a drug target query and functional gene set enrichment analysis. CDE prioritized an MS NAWM network containing 55 unique genes. The gene list was enriched (p-value = 2.34 × 10) with GWAS-eQTL colocalized genes: CDK4, IFITM3, MAPK1, MAPK3, METTL12B and PIK3R2. The resultant network also included drug signatures of FDA-approved medications. Gene set enrichment analysis revealed the top functional term "intracellular transport of virus", among other viral pathways. We prioritize critical genes from the resultant network: CDK4, IFITM3, MAPK1, MAPK3, METTL12B and PIK3R2. Enriched drug signatures suggest potential drug targets and drug repositioning strategies for MS. Finally, we propose mechanisms of potential MS viral onset, based on the prioritized gene set and functional enrichment analysis.

DOI: http://dx.doi.org/10.1016/j.mcn.2021.103656
September 2021

Genome-wide correlation of DNA methylation and gene expression in postmortem brain tissues of opioid use disorder patients.

Int J Neuropsychopharmacol 2021 Jul 2. Epub 2021 Jul 2.

Department of Epidemiology, Human Genetics and Environmental Sciences, School of Public Health, The University of Texas Health Science Center at Houston, Houston, TX, USA.

Background: Opioid use disorder (OUD) affects millions of people, causing nearly fifty thousand deaths annually in the United States. While opioid exposure and OUD are known to cause widespread transcriptomic and epigenetic changes, few studies in human samples have been conducted. Understanding how OUD affects the brain at the molecular level could help decipher disease pathogenesis and shed light on OUD treatment.

Methods: We generated genome-wide transcriptomic and DNA methylation profiles of 22 OUD subjects and 19 non-psychiatric controls. We applied weighted gene co-expression network analysis (WGCNA) to identify genetic markers consistently associated with OUD at both transcriptomic and methylomic levels. We then performed functional enrichment for biological interpretation. We employed cross-omics analysis to uncover OUD-specific regulatory networks.

Results: We found six OUD-associated co-expression gene modules and six co-methylation modules (false discovery rate < 0.1). Genes in these modules are involved in astrocyte and glial cell differentiation, gliogenesis, response to organic substance, and response to cytokine (false discovery rate < 0.05). Cross-omics analysis revealed immune-related transcription regulators, suggesting the role of transcription factor-targeted regulatory networks in OUD pathogenesis.

Conclusions: Our integrative analysis of multi-omics data in OUD postmortem brain samples suggested complex gene regulatory mechanisms involved in OUD-associated expression patterns. Candidate genes and their upstream regulators revealed in astrocytes and glial cells could provide new insights into OUD treatment development.

DOI: http://dx.doi.org/10.1093/ijnp/pyab043
July 2021

Association of CXCR6 with COVID-19 severity: delineating the host genetic factors in transcriptomic regulation.

Hum Genet 2021 Sep 21;140(9):1313-1328. Epub 2021 Jun 21.

Center for Precision Health, School of Biomedical Informatics, The University of Texas Health Science Center at Houston, 7000 Fannin St. Suite 600, Houston, TX, 77030, USA.

The coronavirus disease 2019 (COVID-19) is an infectious disease that mainly affects the host respiratory system with ~80% asymptomatic or mild cases and ~5% severe cases. Recent genome-wide association studies (GWAS) have identified several genetic loci associated with the severe COVID-19 symptoms. Delineating the genetic variants and genes is important for better understanding its biological mechanisms. We implemented integrative approaches, including transcriptome-wide association studies (TWAS), colocalization analysis, and functional element prediction analysis, to interpret the genetic risks using two independent GWAS datasets in lung and immune cells. To understand the context-specific molecular alteration, we further performed deep learning-based single-cell transcriptomic analyses on a bronchoalveolar lavage fluid (BALF) dataset from moderate and severe COVID-19 patients. We discovered and replicated the genetically regulated expression of CXCR6 and CCR9 genes. These two genes have a protective effect on lung, and a risk effect on whole blood, respectively. The colocalization analysis of GWAS and cis-expression quantitative trait loci highlighted the regulatory effect on CXCR6 expression in lung and immune cells. In the lung-resident memory CD8 T (TRM) cells, we found a 2.24-fold decrease of cell proportion among CD8 T cells and lower expression of CXCR6 in severe patients than in moderate patients. Pro-inflammatory transcriptional programs were highlighted in the TRM cellular trajectory from moderate to severe patients. CXCR6 from the 3p21.31 locus is associated with severe COVID-19. CXCR6 tends to have a lower expression in lung TRM cells of severe patients, which aligns with the protective effect of CXCR6 from TWAS analysis.

DOI: http://dx.doi.org/10.1007/s00439-021-02305-z
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8216591
September 2021

Distinct effect of prenatal and postnatal brain expression across 20 brain disorders and anthropometric social traits: a systematic study of spatiotemporal modularity.

Brief Bioinform 2021 Jun 4. Epub 2021 Jun 4.

Center for Precision Health, School of Biomedical Informatics, the University of Texas Health Science Center at Houston, 7000 Fannin St. Suite 600, Houston, TX 77030, USA.

Different spatiotemporal abnormalities have been implicated in different neuropsychiatric disorders and anthropometric social traits, yet an investigation of temporal network modularity with brain tissue transcriptomics has been lacking. We developed a supervised network approach to investigate the genome-wide association study (GWAS) results in the spatial and temporal contexts and demonstrated it in 20 brain disorders and anthropometric social traits. BrainSpan transcriptome profiles were used to discover significant modules enriched with trait susceptibility genes in a developmental stage-stratified manner. We investigated whether, and in which developmental stages, GWAS-implicated genes are coordinately expressed in the brain transcriptome. We identified significant network modules for each disorder and trait at different developmental stages, providing a systematic view of network modularity at specific developmental stages for a myriad of brain disorders and traits. Specifically, we observed a strong pattern of fetal origin for most psychiatric disorders and traits [such as schizophrenia (SCZ), bipolar disorder, obsessive-compulsive disorder and neuroticism], whereas increased co-expression activities of genes were more strongly associated with neurological diseases [such as Alzheimer's disease (AD) and amyotrophic lateral sclerosis] and anthropometric traits (such as college completion, education and subjective well-being) in postnatal brains. Further analyses revealed enriched cell types and functional features that supported and corroborated prior knowledge in specific brain disorders, such as clathrin-mediated endocytosis in AD, myelin sheath in multiple sclerosis and regulation of synaptic plasticity in both college completion and education. Our study provides a landscape view of the spatiotemporal features in a myriad of brain-related disorders and traits.

DOI: http://dx.doi.org/10.1093/bib/bbab214
June 2021

Investigating Cellular Trajectories in the Severity of COVID-19 and Their Transcriptional Programs Using Machine Learning Approaches.

Genes (Basel) 2021 04 24;12(5). Epub 2021 Apr 24.

Center for Precision Health, School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX 77030, USA.

Single-cell RNA sequencing of the bronchoalveolar lavage fluid (BALF) samples from COVID-19 patients has enabled us to examine gene expression changes of human tissue in response to the SARS-CoV-2 virus infection. However, the underlying mechanisms of COVID-19 pathogenesis at single-cell resolution, its transcriptional drivers, and dynamics require further investigation. In this study, we applied machine learning algorithms to infer the trajectories of cellular changes and identify their transcriptional programs. Our study generated cellular trajectories that show the COVID-19 pathogenesis of healthy-to-moderate and healthy-to-severe on macrophages and T cells, and we observed more diverse trajectories in macrophages compared to T cells. Furthermore, our deep-learning algorithm DrivAER identified several pathways (e.g., xenobiotic pathway and complement pathway) and transcription factors (e.g., MITF and GATA3) that could be potential drivers of the transcriptomic changes for COVID-19 pathogenesis and the markers of the COVID-19 severity. Moreover, macrophage-related functions corresponded more closely to the disease severity than T cell-related functions. Our findings dissect in greater detail the transcriptomic changes leading to the severity of a COVID-19 infection.

DOI: http://dx.doi.org/10.3390/genes12050635
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8145325
April 2021

Progression of prostate carcinoma is promoted by adipose stromal cell-secreted CXCL12 signaling in prostate epithelium.

NPJ Precis Oncol 2021 Mar 22;5(1):26. Epub 2021 Mar 22.

The Brown Foundation Institute of Molecular Medicine for the Prevention of Disease, The University of Texas Health Sciences Center at Houston, Houston, TX, USA.

Aggressiveness of carcinomas is linked with tumor recruitment of adipose stromal cells (ASC), which is increased in obesity. ASC promote cancer through molecular pathways that are not fully understood. Here, we demonstrate that epithelial-mesenchymal transition (EMT) in prostate tumors is promoted by obesity and suppressed upon pharmacological ASC depletion in HiMyc mice, a spontaneous genetic model of prostate cancer. CXCL12 expression in tumors was associated with ASC recruitment and localized to stromal cells expressing platelet-derived growth factor receptors Pdgfra and Pdgfrb. The role of this chemokine secreted by stromal cells in cancer progression was further investigated by using tissue-specific knockout models. ASC deletion of the CXCL12 gene in the Pdgfr+ lineages suppressed tumor growth and EMT, indicating stroma as the key source of CXCL12. Clinical sample analysis revealed that CXCL12 expression by peritumoral adipose stroma is increased in obesity, and that the correlating increase in Pdgfr/CXCL12 expression in the tumor is linked with decreased survival of patients with prostate carcinoma. Our study establishes ASC as the source of CXCL12 driving tumor aggressiveness and outlines an approach to treatment of carcinoma progression.

DOI: http://dx.doi.org/10.1038/s41698-021-00160-9
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7985375
March 2021

Metabolic study of ginsenoside Rg3 and glimepiride in type 2 diabetic rats by liquid chromatography coupled with quadrupole-Orbitrap mass spectrometry.

Rapid Commun Mass Spectrom 2021 Jun;35(11):e9083

Jilin Ginseng Academy, Changchun University of Chinese Medicine, Changchun, 130117, China.

Rationale: Ginsenoside Rg3 and glimepiride have been applied to treat type 2 diabetes (T2DM) because of their good hypoglycemic effects. In this study, the effects of ginsenoside Rg3 acting synergistically with glimepiride were investigated in liver microsomes from rats with type 2 diabetes.

Methods: An in vitro incubation system with normal rat liver microsomes (RLM) and type 2 diabetic rat liver microsomes (TRLM) was developed. The system also included two experimental groups consisting of RLM and TRLM pretreated with ginsenoside Rg3 and glimepiride (named the RLMR and TRLMR groups, respectively). The metabolism in the different groups was analyzed by ultra-performance liquid chromatography coupled with quadrupole-orbitrap mass spectrometry (UPLC/Q-Orbitrap MS).

Results: The results showed that the concentration of glimepiride increased in RLM and TRLM after treatment with ginsenoside Rg3. Five metabolites (M1-M5) of glimepiride were found, and they were named 3N-hydroxyglimepiride, hydroxyglimepiride, 1,2-epoxy ether-3-hydroxyglimepiride, 1N-hydroxyglimepiride and 1N,2C,S,O,O-epoxy ether-3-hydroxyglimepiride. The metabolite of ginsenoside Rg3 was ginsenoside Rh2.

Conclusions: An in vitro incubation system with RLM and TRLM was developed. The system revealed pathways that produce glimepiride metabolites. Ginsenoside Rg3 may inhibit the activity of cytochrome P450 enzymes in vitro. The present study showed that ginsenoside Rg3 and glimepiride may be combined for the treatment of T2DM.

DOI: http://dx.doi.org/10.1002/rcm.9083
June 2021

Deep generative neural network for accurate drug response imputation.

Nat Commun 2021 03 19;12(1):1740. Epub 2021 Mar 19.

Center for Precision Health, School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX, USA.

Drug response differs substantially in cancer patients due to inter- and intra-tumor heterogeneity. Particularly, transcriptome context, especially tumor microenvironment, has been shown to play a significant role in shaping the actual treatment outcome. In this study, we develop a deep variational autoencoder (VAE) model to compress thousands of genes into latent vectors in a low-dimensional space. We then demonstrate that these encoded vectors could accurately impute drug response, outperform standard signature-gene based approaches, and appropriately control the overfitting problem. We apply rigorous quality assessment and validation, including assessing the impact of cell line lineage, cross-validation, cross-panel evaluation, and application in independent clinical data sets, to warrant the accuracy of the imputed drug response in both cell lines and cancer samples. Specifically, the expression-regulated component (EReX) of the observed drug response achieves high correlation across panels. Using the well-trained models, we impute drug response of The Cancer Genome Atlas data and investigate the features and signatures associated with the imputed drug response, including cell line origins, somatic mutations and tumor mutation burdens, tumor microenvironment, and confounding factors. In summary, our deep learning method and the results are useful for the study of signatures and markers of drug response.

DOI: http://dx.doi.org/10.1038/s41467-021-21997-5
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7979803
March 2021

Association of CXCR6 with COVID-19 severity: Delineating the host genetic factors in transcriptomic regulation.

bioRxiv 2021 Feb 19. Epub 2021 Feb 19.

Center for Precision Health, School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX 77030, USA.

Background: The coronavirus disease 2019 (COVID-19) is an infectious disease that mainly affects the host respiratory system with ∼80% asymptomatic or mild cases and ∼5% severe cases. Recent genome-wide association studies (GWAS) have identified several genetic loci associated with the severe COVID-19 symptoms. Delineating the genetic variants and genes is important for better understanding its biological mechanisms.

Methods: We implemented integrative approaches, including transcriptome-wide association studies (TWAS), colocalization analysis and functional element prediction analysis, to interpret the genetic risks using two independent GWAS datasets in lung and immune cells. To understand the context-specific molecular alteration, we further performed deep learning-based single cell transcriptomic analyses on a bronchoalveolar lavage fluid (BALF) dataset from moderate and severe COVID-19 patients.

Results: We discovered and replicated the genetically regulated expression of CXCR6 and CCR9 genes. These two genes have a protective effect on the lung and a risk effect on whole blood, respectively. The colocalization analysis of GWAS and cis-expression quantitative trait loci highlighted the regulatory effect on CXCR6 expression in lung and immune cells. In the lung resident memory CD8 T (TRM) cells, we found a 3.32-fold decrease of cell proportion and lower expression of CXCR6 in severe than in moderate patients using the BALF transcriptomic dataset. Pro-inflammatory transcriptional programs were highlighted in the TRM cell trajectory from moderate to severe patients.

Conclusions: CXCR6 from the 3p21.31 locus is associated with severe COVID-19. CXCR6 tends to have a lower expression in lung TRM cells of severe patients, which aligns with the protective effect of CXCR6 from TWAS analysis. We illustrate one potential mechanism of host genetic factor impacting the severity of COVID-19 through regulating the expression of CXCR6 and TRM cell proportion and stability. Our results shed light on potential therapeutic targets for severe COVID-19.

DOI: http://dx.doi.org/10.1101/2021.02.17.431554
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7899454
February 2021

Characterization of genome-wide association study data reveals spatiotemporal heterogeneity of mental disorders.

BMC Med Genomics 2020 12 28;13(Suppl 11):192. Epub 2020 Dec 28.

Center for Precision Health, School of Biomedical Informatics, The University of Texas Health Science Center at Houston, 7000 Fannin St. Suite 820, Houston, TX, 77030, USA.

Background: Psychiatric disorders such as schizophrenia (SCZ), bipolar disorder (BIP), major depressive disorder (MDD), attention deficit-hyperactivity disorder (ADHD), and autism spectrum disorder (ASD) are often related to brain development. Both shared and unique biological and neurodevelopmental processes have been reported to be involved in these disorders.

Methods: In this work, we developed an integrative analysis framework to identify the sensitive spatiotemporal point during brain development underlying each disorder. Specifically, we first identified spatiotemporal gene co-expression modules for four brain regions across three developmental stages (prenatal, birth to 11 years old, and older than 13 years), totaling 12 spatiotemporal sites. By integrating GWAS summary statistics and the spatiotemporal co-expression modules, we characterized the risk genes and their co-expression partners for five disorders.
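
As a minimal sketch of how the 12 spatiotemporal sites arise, the grid can be enumerated as the product of regions and stages. The abstract does not name the four brain regions, so the region labels below are hypothetical placeholders; only the three stage labels come from the text.

```python
from itertools import product

# Hypothetical placeholder labels: the abstract does not name the four regions.
regions = [f"region_{i}" for i in range(1, 5)]

# Developmental stages as listed in the abstract.
stages = ["prenatal", "birth to 11 years old", "older than 13 years"]

# Each (region, stage) pair is one spatiotemporal site.
sites = list(product(regions, stages))

print(len(sites))  # number of spatiotemporal sites
for region, stage in sites[:3]:
    print(region, "|", stage)
```

Each site would then carry its own set of co-expression modules; how those modules are built is not specified here and is left out of the sketch.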

Results: We found that SCZ and BIP, ASD and ADHD tend to cluster with each other and keep a distance from other psychiatric disorders. At the gene level, we identified several genes that were shared among the most significant modules, such as CTNNB1 and LNX1, and a hub gene, ATF2, in multiple modules. Moreover, we pinpointed two spatiotemporal points in the prenatal stage with active expression activities and highlighted one postnatal point for BIP. Further functional analysis of the disorder-related module highlighted the apoptotic signaling pathway for ASD and the immune-related and cell-cell adhesion function for SCZ, respectively.

Conclusion: Our study demonstrated the dynamic changes of disorder-related genes at the network level, shedding light on the spatiotemporal regulation during brain development.

DOI: http://dx.doi.org/10.1186/s12920-020-00832-8
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7771094
December 2020

Accelerating bioinformatics research with International Conference on Intelligent Biology and Medicine 2020.

BMC Bioinformatics 2020 Dec 28;21(Suppl 21):563. Epub 2020 Dec 28.

Center for Precision Health, School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX, 77030, USA.

The International Association for Intelligent Biology and Medicine (IAIBM) is a nonprofit organization that promotes intelligent biology and medical science. It hosts an annual International Conference on Intelligent Biology and Medicine (ICIBM), which was initially established in 2012. Due to the coronavirus (COVID-19) pandemic, the ICIBM 2020 was held for the first time as a virtual online conference on August 9 to 10. The virtual conference had ~ 300 registered participants and featured 41 online real-time presentations. ICIBM 2020 received a total of 75 manuscript submissions, and 12 were selected to be published in this special issue of BMC Bioinformatics. These 12 manuscripts cover a wide range of bioinformatics topics including network analysis, imaging analysis, machine learning, gene expression analysis, and sequence analysis.

DOI: http://dx.doi.org/10.1186/s12859-020-03890-y
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7767910
December 2020

Age-associated telomere attrition in adipocyte progenitors predisposes to metabolic disease.

Nat Metab 2020 12 14;2(12):1482-1497. Epub 2020 Dec 14.

Institute of Molecular Medicine, McGovern Medical School at the University of Texas Health Science Center, Houston, TX, USA.

White and beige adipocytes in subcutaneous adipose tissue (SAT) and visceral adipose tissue (VAT) are maintained by proliferation and differentiation of adipose progenitor cells (APCs). Here we use mice with tissue-specific telomerase reverse transcriptase (TERT) gene knockout (KO), which undergo premature telomere shortening and proliferative senescence in APCs, to investigate the effect of over-nutrition on APC exhaustion and metabolic dysfunction. We find that TERT KO in the Pdgfra cell lineage results in adipocyte hypertrophy, inflammation and fibrosis in SAT, while TERT KO in the Pdgfrb lineage leads to adipocyte hypertrophy in both SAT and VAT. Systemic insulin resistance is observed in both KO models and is aggravated by a high-fat diet. Analysis of human biopsies demonstrates that telomere shortening in SAT is associated with metabolic disease progression after bariatric surgery. Our data indicate that over-nutrition can promote APC senescence and provide a mechanistic link between ageing, obesity and diabetes.

DOI: http://dx.doi.org/10.1038/s42255-020-00320-4
December 2020

Predicting regulatory variants using a dense epigenomic mapped CNN model elucidated the molecular basis of trait-tissue associations.

Nucleic Acids Res 2021 01;49(1):53-66

Center for Precision Health, School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX 77030, USA.

Assessing the causal tissues of human complex diseases is important for the prioritization of trait-associated genetic variants. Yet, the biological underpinnings of trait-associated variants are extremely difficult to infer due to statistical noise in genome-wide association studies (GWAS), and because >90% of genetic variants from GWAS are located in non-coding regions. Here, we collected the largest human epigenomic map from ENCODE and Roadmap consortia and implemented a deep-learning-based convolutional neural network (CNN) model to predict the regulatory roles of genetic variants across a comprehensive list of epigenomic modifications. Our model, called DeepFun, was built on DNA accessibility maps, histone modification marks, and transcription factors. DeepFun can systematically assess the impact of non-coding variants in the most functional elements with tissue or cell-type specificity, even for rare variants or de novo mutations. By applying this model, we prioritized trait-associated loci for 51 publicly-available GWAS studies. We demonstrated that CNN-based analyses on dense and high-resolution epigenomic annotations can refine important GWAS associations in order to identify regulatory loci from background signals, which yield novel insights for better understanding the molecular basis of human complex disease. We anticipate our approaches will become routine in GWAS downstream analysis and non-coding variant evaluation.

DOI: http://dx.doi.org/10.1093/nar/gkaa1137
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7797043
January 2021

Gene expression imputation and cell-type deconvolution in human brain with spatiotemporal precision and its implications for brain-related disorders.

Genome Res 2021 01 3;31(1):146-158. Epub 2020 Dec 3.

Center for Precision Health, School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, Texas 77030, USA.

As the most complex organ of the human body, the brain is composed of diverse regions, each consisting of distinct cell types and their respective cellular interactions. Human brain development involves a finely tuned cascade of interactive events. These include spatiotemporal gene expression changes and dynamic alterations in cell-type composition. However, our understanding of this process is still largely incomplete owing to the difficulty of brain spatiotemporal transcriptome collection. In this study, we developed a tensor-based approach to impute gene expression on a transcriptome-wide level. After rigorous computational benchmarking, we applied our approach to infer missing data points in the widely used BrainSpan resource and completed the entire grid of spatiotemporal transcriptomics. Next, we conducted deconvolutional analyses to comprehensively characterize major cell-type dynamics across the entire BrainSpan resource to estimate the cellular temporal changes and distinct neocortical areas across development. Moreover, integration of these results with GWAS summary statistics for 13 brain-associated traits revealed multiple novel trait-cell-type associations and trait-spatiotemporal relationships. In summary, our imputed BrainSpan transcriptomic data provide a valuable resource for the research community and our findings help further studies of the transcriptional and cellular dynamics of the human brain and related diseases.

DOI: http://dx.doi.org/10.1101/gr.265769.120
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7849392
January 2021

CSEA-DB: an omnibus for human complex trait and cell type associations.

Nucleic Acids Res 2021 01;49(D1):D862-D870

Center for Precision Health, School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX 77030, USA.

During the past decade, genome-wide association studies (GWAS) have identified many genetic variants with susceptibility to several thousands of complex diseases or traits. The genetic regulation of gene expression is highly tissue-specific and cell type-specific. Recently, single-cell technology has paved the way to dissect cellular heterogeneity in human tissues. Here, we present a reference database for GWAS trait-associated cell type-specificity, named Cell type-Specific Enrichment Analysis DataBase (CSEA-DB, available at https://bioinfo.uth.edu/CSEADB/). Specifically, we curated a total of 5120 GWAS summary statistics datasets for a wide range of human traits and diseases, followed by rigorous quality control. We further collected >900 000 cells from leading consortia such as Human Cell Landscape and Human Cell Atlas, and from extensive literature mining, including 752 tissue cell types from 71 adult and fetal tissues across 11 human organ systems. The tissues and cell types were annotated with Uberon and Cell Ontology. By applying our deTS algorithm, we performed 10 250 480 trait-cell type association tests, reporting a total of 598 (11.68%) GWAS traits with at least one significantly associated cell type. In summary, CSEA-DB could serve as a repository of association maps for human complex traits and their underlying cell types, manually curated GWAS, and single-cell transcriptome resources.
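
The reported share of traits with at least one significant cell type can be checked directly from the two counts in the abstract. This is only an arithmetic sketch; the variable and function names are illustrative and are not part of CSEA-DB or deTS.

```python
# Counts quoted in the abstract; names are illustrative only.
total_gwas_traits = 5120   # curated GWAS summary statistics datasets
traits_with_hit = 598      # traits with >= 1 significantly associated cell type


def percent(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to two decimals."""
    return round(100.0 * part / whole, 2)


share = percent(traits_with_hit, total_gwas_traits)
print(f"{share}% of curated GWAS traits have a significant cell type")
```

The result agrees with the 11.68% figure stated in the abstract.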

DOI: http://dx.doi.org/10.1093/nar/gkaa1064
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7778923
January 2021

A Genome-wide Association Study Discovers 46 Loci of the Human Metabolome in the Hispanic Community Health Study/Study of Latinos.

Am J Hum Genet 2020 11 7;107(5):849-863. Epub 2020 Oct 7.

Human Genetics Center, University of Texas Health Science Center, Houston, TX 77030, USA.

Variation in levels of the human metabolome reflects changes in homeostasis, providing a window into health and disease. The genetic impact on circulating metabolites in Hispanics, a population with high cardiometabolic disease burden, is largely unknown. We conducted genome-wide association analyses on 640 circulating metabolites in 3,926 Hispanic Community Health Study/Study of Latinos participants. The estimated heritability for 640 metabolites ranged from 0% to 54% with a median of 2.5%. We discovered 46 variant-metabolite pairs (p value < 1.2 × 10, minor allele frequency ≥ 1%, proportion of variance explained [PEV] mean = 3.4%, PEV = 1%-22%) with generalized effects in two population-based studies and confirmed 301 known locus-metabolite associations. Half of the identified variants with generalized effect were located in genes, including five nonsynonymous variants. We identified co-localization with the expression quantitative trait loci at 105 discovered and 151 known loci-metabolites sets. rs5855544, upstream of SLC51A, was associated with higher levels of three steroid sulfates and co-localized with expression levels of SLC51A in several tissues. Mendelian randomization (MR) analysis identified several metabolites associated with coronary heart disease (CHD) and type 2 diabetes. For example, two variants located in or near CYP4F2 (rs2108622 and rs79400241, respectively), involved in vitamin E metabolism, were associated with the levels of octadecanedioate and vitamin E metabolites (gamma-CEHC and gamma-CEHC glucuronide); MR analysis showed that genetically high levels of these metabolites were associated with lower odds of CHD. Our findings document the genetic architecture of circulating metabolites in an underrepresented Hispanic/Latino community, shedding light on disease etiology.

Source
DOI: http://dx.doi.org/10.1016/j.ajhg.2020.09.003
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7675000
November 2020

Decoding whole-genome mutational signatures in 37 human pan-cancers by denoising sparse autoencoder neural network.

Oncogene 2020 Jul 11;39(27):5031-5041. Epub 2020 Jun 11.

Center for Precision Health, School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX, 77030, USA.

Millions of somatic mutations have recently been discovered in cancer genomes. These mutations occur due to internal and external mutagenesis forces. Decoding the mutational processes by examining their unique patterns has successfully revealed many known and novel signatures from whole exome data, but many still remain undiscovered. Here, we developed a deep learning approach, DeepMS, to decompose mutational signatures using 52,671,908 somatic mutations from 2780 highly curated cancer genomes with whole genome sequencing (WGS) in 37 cancer types/subtypes. With rigorous model training and comparison, we characterized 54 signatures for single base substitutions (SBSs), 11 for doublet base substitutions (DBSs) and 16 for small insertions and deletions (Indels). Compared to previous methods, DeepMS discovered 37 new SBS, 5 new DBS, and 9 new Indel signatures, many of which represent associations with DNA mismatch or base excision repair and cisplatin therapy mechanisms. We further developed a regression-based model to estimate the correlation between signatures and clinical and demographic phenotypes. DeepMS, the first deep learning model applied to WGS somatic mutational profiles, enables us to identify more comprehensive context-based mutational signatures than traditional NMF approaches. Our work substantially expands the landscape of the naturally occurring mutational signatures in cancer genomes, and provides new insights into cancer biology.

Source
DOI: http://dx.doi.org/10.1038/s41388-020-1343-z
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7334101
July 2020

Correction to: The International Conference on Intelligent Biology and Medicine (ICIBM) 2019: bioinformatics methods and applications for human diseases.

BMC Bioinformatics 2020 Apr 16;21(1):148. Epub 2020 Apr 16.

Raymond G. Perelman Center for Cellular and Molecular Therapeutics, Children's Hospital of Philadelphia, Philadelphia, PA, 19104, USA.

After publication of this supplement article [1], it was requested that the grant ID in the Funding section be corrected from NSF grant IIS-7811367 to NSF grant IIS-1902617. Therefore, the correct 'Funding' section in this article should read: We thank the National Science Foundation (NSF grant IIS-1902617) for the financial support of ICIBM 2019. This article has not received sponsorship for publication.

Source
DOI: http://dx.doi.org/10.1186/s12859-020-3487-9
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7161002
April 2020

An integrative, genomic, transcriptomic and network-assisted study to identify genes associated with human cleft lip with or without cleft palate.

BMC Med Genomics 2020 Apr 3;13(Suppl 5):39. Epub 2020 Apr 3.

Center for Precision Health, School of Biomedical Informatics, The University of Texas Health Science Center at Houston, 7000 Fannin St. Suite 600, Houston, TX, 77030, USA.

Background: Cleft lip with or without cleft palate (CL/P) is one of the most common congenital human birth defects. A combination of genetic and epidemiology studies has contributed to a better knowledge of CL/P-associated candidate genes and environmental risk factors. However, the etiology of CL/P is still not fully understood. In this study, to identify new CL/P-associated genes, we conducted an integrative analysis using our in-house network tools, dmGWAS [dense module search for Genome-Wide Association Studies (GWAS)] and EW_dmGWAS (Edge-Weighted dmGWAS), in combination with GWAS data, the human protein-protein interaction (PPI) network, and differential gene expression profiles.

Results: A total of 87 genes were consistently detected in both European and Asian ancestries in dmGWAS. Of these, 31.0% (27/87) showed nominal significance with CL/P (gene-based p < 0.05), with three genes showing strong association signals, including KIAA1598, GPR183, and ZMYND11 (p < 1 × 10). In EW_dmGWAS, we identified 253 and 245 module genes associated with CL/P for European ancestry and Asian ancestry, respectively. Functional enrichment analysis demonstrated that these genes were involved in cell adhesion, protein localization to the plasma membrane, the regulation of the apoptotic signaling pathway, and other pathological conditions. A small proportion of genes (5.1% for European ancestry; 2.4% for Asian ancestry) had prior evidence in CL/P as annotated in the CleftGeneDB database. Our analysis highlighted nine novel CL/P candidate genes (BRD1, CREBBP, CSK, DNM1L, LOR, PTPN18, SND1, TGS1, and VIM) and 17 previously reported genes in the top modules.

Conclusions: The genes identified through superimposing GWAS signals and differential gene expression profiles onto human PPI network, as well as their functional features, helped our understanding of the etiology of CL/P. Our multi-omics integrative analyses revealed nine novel candidate genes involved in CL/P.

Source
DOI: http://dx.doi.org/10.1186/s12920-020-0675-4
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7118807
April 2020

Dense module searching for gene networks associated with multiple sclerosis.

BMC Med Genomics 2020 Apr 3;13(Suppl 5):48. Epub 2020 Apr 3.

Center for Precision Health, School of Biomedical Informatics, The University of Texas Health Science Center at Houston, 7000 Fannin St. Suite 600, Houston, TX, 77030, USA.

Background: Multiple sclerosis (MS) is a complex disease in which the immune system attacks the central nervous system. The molecular mechanisms contributing to the etiology of MS remain poorly understood. Genome-wide association studies (GWAS) of MS have identified a small number of genetic loci significant at the genome level, but they are mainly non-coding variants. Network-assisted analysis may help better interpret the functional roles of the variants with association signals and potential translational medicine application. The Dense Module Searching of GWAS tool (dmGWAS version 2.4) developed in our team is applied to 2 MS GWAS datasets (GeneMSA and IMSGC GWAS) using the human protein interactome as the reference network. A dual evaluation strategy is used to generate results with reproducibility.

Results: Approximately 7500 significant network modules were identified for each independent GWAS dataset, and 20 significant modules were identified from the dual evaluation. The top modules included GRB2, HDAC1, JAK2, MAPK1, and STAT3 as central genes. Top module genes were enriched with functional terms such as "regulation of glial cell differentiation" (adjusted p-value = 2.58 × 10), "T-cell costimulation" (adjusted p-value = 2.11 × 10) and "virus receptor activity" (adjusted p-value = 1.67 × 10). Interestingly, top gene networks included several MS FDA approved drug target genes HDAC1, IL2RA, KEAP1, and RELA.

Conclusions: Our dmGWAS network analyses highlighted several genes (GRB2, HDAC1, IL2RA, JAK2, KEAP1, MAPK1, RELA and STAT3) in top modules that are promising to interpret GWAS signals and link to MS drug targets. The genes enriched with glial cell differentiation are important for understanding neurodegenerative processes in MS and for remyelination therapy investigation. Importantly, our identified genetic signals enriched in T cell costimulation and viral receptor activity supported the viral infection onset hypothesis for MS.

Source
DOI: http://dx.doi.org/10.1186/s12920-020-0674-5
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7118851
April 2020

T cell exhaustion is associated with antigen abundance and promotes transplant acceptance.

Am J Transplant 2020 Sep 17;20(9):2540-2550. Epub 2020 Apr 17.

Immunobiology and Transplant Science Center, Department of Surgery, Houston Methodist Research Institute and Institute for Academic Medicine, Houston Methodist Hospital, Houston, Texas, USA.

Exhaustion of T cells limits their ability to clear chronic infections or eradicate tumors. Here, in the context of transplant, we investigated whether T cell exhaustion occurs and has a role in determining transplant outcome. A peptide/MHC tetramer-based approach was used to track exhausted CD8 T cells in a male-to-female skin transplant model. Transplant of large whole-tail skins, but not small tail skins (0.8 cm × 0.8 cm), led to exhaustion of anti-male tetramer CD8 T cells and subsequently the acceptance of skin grafts. To study CD4 T cell exhaustion, we used the TCR-transgenic B6 TEa cells that recognize a major transplant antigen I-Eα from Balb/c mice. TEa cells were adoptively transferred either into B6 recipients that received Balb/c donor skins or into CB6F1 mice that contained an excessive amount of I-Eα antigen. Adoptively transferred TEa cells in skin-graft recipients were not exhausted. By contrast, virtually all adoptively transferred TEa cells were exhausted in CB6F1 mice. Those exhausted TEa cells lost the ability to reject Balb/c skins upon further transfer into lymphopenic B6.Rag1 mice. Hence, T cell exhaustion develops in the presence of abundant antigen and promotes transplant acceptance. These findings are essential for better understanding the nature of transplant tolerance.

Source
DOI: http://dx.doi.org/10.1111/ajt.15870
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8000649
September 2020

Diverse types of genomic evidence converge on alcohol use disorder risk genes.

J Med Genet 2020 Nov 13;57(11):733-743. Epub 2020 Mar 13.

School of Biomedical Science, University of Texas Health Science Center at Houston, Houston, Texas, USA

Background: Alcohol use disorder (AUD) is one of the most common forms of substance use disorders with a strong contribution of genetic (50%-60%) and environmental factors. Genome-wide association studies (GWAS) have identified a number of AUD-associated variants, including those in alcohol metabolism genes. These genetic variants may modulate gene expression, making individuals more susceptible to AUD. Long-term alcohol consumption can also change the transcriptome patterns of subjects via epigenetic modulations.

Methods: To explore the interactive effect of genetic and epigenetic factors on AUD, we conducted a secondary analysis by integrating GWAS, CNV, brain transcriptome and DNA methylation data to unravel novel AUD-associated genes/variants. We applied the mega-analysis of OR (MegaOR) method to prioritise AUD candidate genes (AUDgenes).

Results: We identified a consensus set of 206 AUDgenes based on the multi-omics data. We demonstrated that these AUDgenes tend to interact with each other more frequently than expected by chance. Functional annotation analysis indicated that these AUDgenes were involved in substance dependence, synaptic transmission and glial cell proliferation, and were enriched in neuronal and liver cells. We obtained multidimensional evidence that AUD is a polygenic disorder influenced by both genetic and epigenetic factors as well as their interaction.

Conclusion: We characterised multidimensional evidence of genetic, epigenetic and transcriptomic data in AUD. We found that 206 AUD-associated genes were highly expressed in liver, brain cerebellum, frontal cortex, hippocampus and pituitary. Our study provides important insights into the molecular mechanism of AUD and potential target genes for AUD treatment.

Source
DOI: http://dx.doi.org/10.1136/jmedgenet-2019-106490
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7487038
November 2020

The International Conference on Intelligent Biology and Medicine (ICIBM) 2019: bioinformatics methods and applications for human diseases.

BMC Bioinformatics 2019 Dec 20;20(Suppl 24):676. Epub 2019 Dec 20.

Raymond G. Perelman Center for Cellular and Molecular Therapeutics, Children's Hospital of Philadelphia, Philadelphia, PA, 19104, USA.

From June 9 to 11, 2019, the International Conference on Intelligent Biology and Medicine (ICIBM 2019) was held in Columbus, Ohio, USA. The conference included 12 scientific sessions, five tutorials or workshops, one poster session, four keynote talks and four eminent scholar talks that covered a wide range of topics in bioinformatics, medical informatics, systems biology and intelligent computing. Here, we describe 13 high-quality research articles selected for publishing in BMC Bioinformatics.

Source
DOI: http://dx.doi.org/10.1186/s12859-019-3240-4
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6924135
December 2019

TSEA-DB: a trait-tissue association map for human complex traits and diseases.

Nucleic Acids Res 2020 Jan;48(D1):D1022-D1030

Center for Precision Health, School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX 77030, USA.

Assessing the causal tissues of human traits and diseases is important for better interpreting trait-associated genetic variants, understanding disease etiology, and improving treatment strategies. Here, we present a reference database for trait-associated tissue specificity based on genome-wide association study (GWAS) results, named Tissue-Specific Enrichment Analysis DataBase (TSEA-DB, available at https://bioinfo.uth.edu/TSEADB/). We collected GWAS summary statistics data for a wide range of human traits and diseases followed by rigorous quality control. The current version of TSEA-DB includes 4423 data sets from the UK Biobank (UKBB) and 596 from other resources (GWAS Catalog and literature mining), totaling 5019 unique GWAS data sets and 15 770 trait-associated gene sets. TSEA-DB aims to provide reference tissue(s) enriched with the genes from GWAS. To this end, we systematically performed a tissue-specific enrichment analysis using our recently developed tool deTS and gene expression profiles from two reference tissue panels: the GTEx panel (47 tissues) and the ENCODE panel (44 tissues). The comprehensive trait-tissue association results can be easily accessed, searched, visualized, analyzed, and compared across the studies and traits through our web site. TSEA-DB represents one of the many timely and comprehensive approaches in exploring human trait-tissue association.

Source
DOI: http://dx.doi.org/10.1093/nar/gkz957
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7145616
January 2020

Erratum: Lina Lu et al.; Low-Grade Dysplastic Nodules Revealed as the Tipping Point during Multistep Hepatocarcinogenesis by Dynamic Network Biomarkers. 2017, , 268.

Genes (Basel) 2019 May 2;10(5). Epub 2019 May 2.

Key Laboratory of Systems Biology, CAS Center for Excellence in Molecular Cell Science, Shanghai Institute of Biochemistry and Cell Biology, Chinese Academy of Sciences, University of Chinese Academy of Sciences, Shanghai 200031, China.

The authors wish to make the following correction to their paper [...].

Source
DOI: http://dx.doi.org/10.3390/genes10050335
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6562478
May 2019

A Convergent Study of Genetic Variants Associated With Crohn's Disease: Evidence From GWAS, Gene Expression, Methylation, eQTL and TWAS.

Front Genet 2019 9;10:318. Epub 2019 Apr 9.

Center for Precision Health, School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX, United States.

Crohn's Disease (CD) is one of the predominant forms of inflammatory bowel disease (IBD). A combination of genetic and non-genetic risk factors has been reported to contribute to the development of CD. Many high-throughput omics studies, such as genome-wide association studies (GWAS) and next generation sequencing studies, have been conducted to identify disease-associated risk variants that might contribute to CD. A pressing need remains to prioritize and characterize candidate genes that underlie the etiology of CD. In this study, we collected comprehensive multi-dimensional data from GWAS, gene expression, and methylation studies and generated transcriptome-wide association study (TWAS) data to further interpret the GWAS association results. We applied our previously developed method called mega-analysis of Odds Ratio (MegaOR) to prioritize CD candidate genes (CDgenes). As a result, we identified consensus sets of CDgenes (62-235 genes) based on the evidence matrix. We demonstrated that these CDgenes interacted with each other significantly more frequently than expected by chance. Functional annotation of these genes highlighted critical immune-related processes such as immune response, MHC class II receptor activity, and immunological disorders. In particular, the constitutive photomorphogenesis 9 (COP9) signalosome related genes were found to be significantly enriched in CDgenes, implying a potential role of the COP9 signalosome in the pathogenesis of CD. Finally, we found that some of the CDgenes shared biological functions with known drug targets of CD, such as the regulation of inflammatory response and leukocyte adhesion to vascular endothelial cells. In summary, we identified highly confident CDgenes from multi-dimensional evidence, providing insights for the understanding of CD etiology.

Source
DOI: http://dx.doi.org/10.3389/fgene.2019.00318
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6467075
April 2019

deTS: tissue-specific enrichment analysis to decode tissue specificity.

Bioinformatics 2019 Oct;35(19):3842-3845

School of Biomedical Informatics, Center for Precision Health, The University of Texas Health Science Center at Houston, Houston, TX, USA.

Motivation: Diseases and traits are under dynamic tissue-specific regulation. However, heterogeneous tissues are often collected in biomedical studies, which reduces the power to identify disease-associated variants and gene expression profiles.

Results: We present deTS, an R package, to conduct tissue-specific enrichment analysis with two built-in reference panels. Statistical methods are developed and implemented for detecting tissue-specific genes and for enrichment test of different forms of query data. Our applications using multi-trait genome-wide association studies data and cancer expression data showed that deTS could effectively identify the most relevant tissues for each query trait or sample, providing insights for future studies.

Availability And Implementation: https://github.com/bsml320/deTS and CRAN https://cran.r-project.org/web/packages/deTS/.

Supplementary Information: Supplementary data are available at Bioinformatics online.

Source
DOI: http://dx.doi.org/10.1093/bioinformatics/btz138
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6761978
October 2019

Gene2vec: distributed representation of genes based on co-expression.

BMC Genomics 2019 Feb 4;20(Suppl 1):82. Epub 2019 Feb 4.

The University of Texas School of Biomedical Informatics, Houston, TX, 77030, USA.

Background: Existing functional descriptions of genes are categorical, discrete, and mostly produced through manual processes. In this work, we explore the idea of gene embedding, a distributed representation of genes, in the spirit of word embedding.

Results: In a purely data-driven fashion, we trained a 200-dimension vector representation of all human genes, using gene co-expression patterns in 984 data sets from the GEO databases. These vectors capture functional relatedness of genes in terms of recovering known pathways: the average inner product (similarity) of genes within a pathway is 1.52X greater than that of random genes. Using t-SNE, we produced a gene co-expression map that shows local concentrations of tissue-specific genes. We also illustrated the usefulness of the embedded gene vectors, laden with rich information on gene co-expression patterns, in tasks such as gene-gene interaction prediction.

Conclusions: We proposed a machine learning method that utilizes transcriptome-wide gene co-expression to generate a distributed representation of genes. We further demonstrated the utility of our distribution by predicting gene-gene interaction based solely on gene names. The distributed representation of genes could be useful for more bioinformatics applications.
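
The pathway-recovery comparison described in the Results (average inner product within a pathway versus a background of gene pairs) can be illustrated with a minimal sketch. It assumes the embedding is held as a plain dictionary from gene name to vector. The toy two-dimensional vectors, the gene names G1 to G4, and the helper `mean_pairwise_inner_product` are hypothetical and are not taken from the Gene2vec code or data.

```python
from itertools import combinations


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def mean_pairwise_inner_product(embedding, genes):
    """Average inner product over all unordered pairs drawn from `genes`."""
    pairs = list(combinations(genes, 2))
    if not pairs:
        raise ValueError("need at least two genes")
    return sum(dot(embedding[a], embedding[b]) for a, b in pairs) / len(pairs)


# Hypothetical toy embedding (the paper reports 200-dimension vectors).
toy_embedding = {
    "G1": [1.0, 0.0],
    "G2": [0.9, 0.1],
    "G3": [0.0, 1.0],
    "G4": [0.1, 0.9],
}

# Hypothetical pathway membership; the background here is all gene pairs.
pathway = ["G1", "G2"]
within = mean_pairwise_inner_product(toy_embedding, pathway)
background = mean_pairwise_inner_product(toy_embedding, list(toy_embedding))

# A ratio above 1 means pathway members are more similar than the background.
ratio = within / background
print(f"within={within:.3f} background={background:.3f} ratio={ratio:.2f}")
```

In this sketch the background is every pair in the toy embedding; sampling random gene pairs, as the abstract's wording suggests, would be a drop-in replacement for the second call.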

Source
DOI: http://dx.doi.org/10.1186/s12864-018-5370-x
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6360648
February 2019

Investigation of multi-trait associations using pathway-based analysis of GWAS summary statistics.

BMC Genomics 2019 Feb 4;20(Suppl 1):79. Epub 2019 Feb 4.

Center for Precision Health, School of Biomedical Informatics, The University of Texas Health Science Center at Houston, 7000 Fannin St. Suite 820, Houston, TX, 77030, USA.

Background: Genome-wide association studies (GWAS) have been successful in identifying disease-associated genetic variants. Recently, an increasing number of GWAS summary statistics have been made available to the research community, providing extensive repositories for studies of human complex diseases. In particular, studies of cross-trait associations at the genetic level can benefit from large-scale GWAS summary statistics by using genetic variants that are associated with multiple traits. However, direct assessment of cross-trait associations using susceptibility loci has been challenging due to the complex genetic architectures in most diseases, calling for methods that can integrate functional interpretation and imply biological mechanisms.

Results: We developed an analytical framework for systematic integration of cross-trait associations. It incorporates two different approaches to detect enriched pathways and requires only summary statistics. We demonstrated the framework using 25 traits belonging to four phenotype groups. Our results revealed an average of 54 significantly associated pathways (ranged between 18 and 175) per trait. We further proved that pathway-based analysis provided increased power to estimate cross-trait associations compared to gene-level analysis. Based on Fisher's Exact Test (FET), we identified a total of 24 (53) pairs of trait-trait association at adjusted p < 1 × 10 (p < 0.01) among the 25 traits. Our trait-trait association network revealed not only many relationships among the traits within the same group but also novel relationships among traits from different groups, which warrants further investigation in future.

Conclusions: Our study revealed that risk variants for 25 different traits aggregated in particular biological pathways and that these pathways were frequently shared among traits. Our results confirmed known mechanisms and also suggested several novel insights into the etiology of multi-traits.
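
One way to read the Fisher's Exact Test step is as a test of whether two traits share more significant pathways than expected by chance. The sketch below assumes that reading; the abstract does not spell out the contingency table, so the function name `fisher_exact_greater` and all counts are hypothetical illustrations, not the authors' implementation.

```python
from math import comb


def fisher_exact_greater(k, a, b, n):
    """One-sided Fisher's exact test p-value for an overlap of at least k.

    n: total number of pathways tested (the universe)
    a: pathways significant for trait A
    b: pathways significant for trait B
    k: pathways significant for both traits
    The null model is hypergeometric: trait B's b pathways are drawn
    at random from the n pathways, a of which belong to trait A.
    """
    total = comb(n, b)
    tail = sum(comb(a, i) * comb(n - a, b - i) for i in range(k, min(a, b) + 1))
    return tail / total


# Hypothetical counts: 1000 pathways tested, 54 significant for trait A,
# 60 for trait B, and 12 shared between the two traits.
p_value = fisher_exact_greater(k=12, a=54, b=60, n=1000)
print(f"one-sided p = {p_value:.3g}")
```

With many trait pairs tested, the resulting p-values would still need multiple-testing adjustment, in line with the adjusted thresholds quoted in the Results.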

Source
DOI: http://dx.doi.org/10.1186/s12864-018-5373-7
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6360716
February 2019

Multiple transcription factors contribute to inter-chromosomal interaction in yeast.

BMC Syst Biol 2018 Dec 21;12(Suppl 8):140. Epub 2018 Dec 21.

Center for Precision Health, School of Biomedical Informatics, The University of Texas Health Science Center at Houston, 7000 Fannin St. Suite 820, Houston, TX, 77030, USA.

Background: Chromatin interactions mediated by genomic elements located throughout the genome play important roles in gene regulation and can be identified with technologies such as high-throughput chromosome conformation capture (Hi-C), followed by next-generation sequencing. These techniques have been widely used to reveal the relative spatial disposition of chromatin in human, mouse and yeast. Unlike in metazoans, where CTCF plays major roles in mediating chromatin interactions, in yeast the transcription factors (TFs) involved in this biological process are poorly known.

Results: Here, we presented two computational approaches to estimate the TFs enriched in physical inter-chromosomal chromatin interactions in yeast. Using the Chi-square method, we found TFs whose binding data are differentially distributed among interaction groups, including Cin5, Stp1 and Sut1, whose binding data are negatively correlated with chromosome spatial distance. A multivariate linear regression model was employed to estimate the potential contribution of different transcription factors to the physical distance between chromosomes. Rlr1, Set12 and Dig1 were found to be the top positive contributors to these chromosomal interactions. Ste12 was highlighted as being involved in gene repositioning. Overall, we found 10 TFs enriched in both computational approaches that are potentially involved in inter-chromosomal interactions.

Conclusions: No transcription factor (TF) in our study was found to have a dominant impact on inter-chromosomal interaction as CTCF does in humans and other metazoans, suggesting that species without CTCF might have different regulatory systems for mediating inter-chromosomal interactions. In summary, we presented a systematic examination of TFs involved in chromatin interaction in yeast, and the results provide candidate TFs for future studies.

Source
DOI: http://dx.doi.org/10.1186/s12918-018-0643-1
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6302461
December 2018