Publications by authors named "Peilin Jia"

148 Publications

MOF-Directed Construction of Cu-Carbon and Cu@N-Doped Carbon as Superior Supports of Metal Nanoparticles toward Efficient Hydrogen Generation.

ACS Appl Mater Interfaces 2021 Oct 29. Epub 2021 Oct 29.

Inner Mongolia Key Laboratory of Coal Chemistry, School of Chemistry and Chemical Engineering, Inner Mongolia University, Hohhot 010021, China.

The modulation of the electronic behavior of metal-based catalysts is vital to optimizing their catalytic performance. Herein, metal-organic frameworks (MOFs) are pyrolyzed to afford a series of different-structured Cu-carbon composites and Cu@N-doped carbon composites. Then a series of CO-resistant catalysts, namely, Co or Ni nanoparticles supported by the Cu-based composites, are synthesized for hydrogen generation from aqueous NH₃BH₃. Their catalytic activities are boosted under light irradiation and regulated by the compositions and the fine structures of doped N species with pyridine, pyrrole, and graphitic configurations in the composite supports. Particularly, the optimized Co-based catalyst with the highest graphitic N content exhibits a high activity, achieving a total turnover frequency (TOF) value of 210 min⁻¹, which is higher than that of all reported non-precious catalysts. Further investigations verify that the light-driven synergistic electron effect of plasmonic Cu-based composites and Co nanoparticles accounts for the high-performance hydrogen generation.

Source: http://dx.doi.org/10.1021/acsami.1c15117
October 2021

CeDR Atlas: a knowledgebase of cellular drug response.

Nucleic Acids Res 2021 Oct 11. Epub 2021 Oct 11.

CAS Key Laboratory of Genomic and Precision Medicine, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China.

Drug response in many diseases varies dramatically owing to complex genomic and functional features and contexts. Cellular diversity of human tissues, especially tumors, is one of the major contributing factors to the different drug responses in different samples. With the accumulation of single-cell RNA sequencing (scRNA-seq) data, it is now possible to study the drug response to different treatments at single-cell resolution. Here, we present CeDR Atlas (available at https://ngdc.cncb.ac.cn/cedr), a knowledgebase reporting computational inference of cellular drug response for hundreds of cell types from various tissues. We took advantage of the high-throughput profiling of drug-induced gene expression available through the Connectivity Map resource (CMap) as well as hundreds of scRNA-seq datasets covering cells from a wide variety of organs/tissues, diseases, and conditions. Currently, CeDR maintains the results for more than 582 single-cell data objects for human, mouse and cell lines, including about 140 phenotypes and 1250 tissue-cell combination types. All the results can be explored and searched by keywords for drugs, cell types, tissues, diseases, and signature genes. Overall, CeDR fine-maps drug response at cellular resolution and sheds light on the design of combinatorial treatments, drug resistance and even drug side effects.

Source: http://dx.doi.org/10.1093/nar/gkab897
October 2021

An integrative study of genetic variants with brain tissue expression identifies viral etiology and potential drug targets of multiple sclerosis.

Mol Cell Neurosci 2021 Sep 17;115:103656. Epub 2021 Jul 17.

Center for Precision Health, School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX 77030, USA; Human Genetics Center, School of Public Health, The University of Texas Health Science Center at Houston, Houston, TX 77030, USA; Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN 37203, USA.

Multiple sclerosis (MS) is a neuroinflammatory disorder leading to chronic disability. Brain lesions in MS commonly arise in normal-appearing white matter (NAWM). Genome-wide association studies (GWAS) have identified genetic variants associated with MS. Transcriptome alterations have been observed in case-control studies of NAWM. We developed a Cross-Dataset Evaluation (CDE) function for our network-based tool, Edge-Weighted Dense Module Search of GWAS (EW_dmGWAS). We applied CDE to integrate publicly available MS GWAS summary statistics of 41,505 cases and controls with collectively 38 NAWM expression samples, using the human protein interactome as the reference network, to investigate the biological underpinnings of MS etiology. We validated the resulting modules with colocalization of GWAS and expression quantitative trait loci (eQTL) signals, using GTEx Consortium expression data for MS-relevant tissues: 14 brain tissues and 4 immune-related tissues. Other network assessments included a drug target query and functional gene set enrichment analysis. CDE prioritized an MS NAWM network containing 55 unique genes. The gene list was enriched (p-value = 2.34 × 10) with GWAS-eQTL colocalized genes: CDK4, IFITM3, MAPK1, MAPK3, METTL12B and PIK3R2. The resultant network also included drug signatures of FDA-approved medications. Gene set enrichment analysis revealed the top functional term "intracellular transport of virus", among other viral pathways. We prioritize critical genes from the resultant network: CDK4, IFITM3, MAPK1, MAPK3, METTL12B and PIK3R2. Enriched drug signatures suggest potential drug targets and drug repositioning strategies for MS. Finally, we propose mechanisms of potential MS viral onset, based on the prioritized gene set and functional enrichment analysis.

Source: http://dx.doi.org/10.1016/j.mcn.2021.103656
September 2021

Genome-Wide Correlation of DNA Methylation and Gene Expression in Postmortem Brain Tissues of Opioid Use Disorder Patients.

Int J Neuropsychopharmacol 2021 Nov;24(11):879-891

Department of Epidemiology, Human Genetics and Environmental Sciences, School of Public Health, The University of Texas Health Science Center at Houston, Houston, TX, USA.

Background: Opioid use disorder (OUD) affects millions of people, causing nearly 50 000 deaths annually in the United States. While opioid exposure and OUD are known to cause widespread transcriptomic and epigenetic changes, few studies in human samples have been conducted. Understanding how OUD affects the brain at the molecular level could help decipher disease pathogenesis and shed light on OUD treatment.

Methods: We generated genome-wide transcriptomic and DNA methylation profiles of 22 OUD subjects and 19 non-psychiatric controls. We applied weighted gene co-expression network analysis to identify genetic markers consistently associated with OUD at both transcriptomic and methylomic levels. We then performed functional enrichment for biological interpretation. We employed cross-omics analysis to uncover OUD-specific regulatory networks.

Results: We found 6 OUD-associated co-expression gene modules and 6 co-methylation modules (false discovery rate <0.1). Genes in these modules are involved in astrocyte and glial cell differentiation, gliogenesis, response to organic substance, and response to cytokine (false discovery rate <0.05). Cross-omics analysis revealed immune-related transcription regulators, suggesting the role of transcription factor-targeted regulatory networks in OUD pathogenesis.

Conclusions: Our integrative analysis of multi-omics data in OUD postmortem brain samples suggested complex gene regulatory mechanisms involved in OUD-associated expression patterns. Candidate genes and their upstream regulators revealed in astrocytes and glial cells could provide new insights into OUD treatment development.
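
The false discovery rate cutoffs quoted in the Results (<0.1 for modules, <0.05 for enrichment terms) can be illustrated with a minimal sketch. The abstract does not say which FDR procedure was used; the Benjamini-Hochberg adjustment below is an assumption, and the module names and p-values are hypothetical, not taken from the study.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical module-level p-values (illustrative only).
module_p = {"M1": 0.001, "M2": 0.02, "M3": 0.04, "M4": 0.30, "M5": 0.75}
adjusted = benjamini_hochberg(list(module_p.values()))

# Keep modules passing the FDR < 0.1 cutoff quoted in the abstract.
significant = [name for name, q in zip(module_p, adjusted) if q < 0.1]
print(significant)
```

With these made-up inputs, three of the five modules pass the 0.1 cutoff; a stricter 0.05 cutoff would keep only the first.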

Source: http://dx.doi.org/10.1093/ijnp/pyab043
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8598308
November 2021

An Integrative Transcriptomic and Methylation Approach for Identifying Differentially Expressed Circular RNAs Associated with DNA Methylation Change.

Biomedicines 2021 Jun 8;9(6). Epub 2021 Jun 8.

Center for Precision Health, School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX 77030, USA.

Accumulating evidence supports that circular RNA (circRNA) plays important roles in tumorigenesis by regulating gene expression at transcriptional and post-transcriptional levels. Expression of circRNAs can be epigenetically silenced by DNA methylation; however, the underlying regulatory mechanisms of circRNAs by DNA methylation remain largely unknown. We explored this regulation in hepatocellular carcinoma (HCC) using genome-wide DNA methylation and RNA sequencing data of the primary tumor and matched adjacent normal tissues from 20 HCC patients. Our pipeline identified 1012 upregulated and 747 downregulated circRNAs (collectively referred to as differentially expressed circRNAs, or DE circRNAs) from HCC RNA-seq data. Among them, 329 DE circRNAs covered differentially methylated sites (adjusted P-value < 0.05, |ΔM| > 0.5) in the circRNAs' interior and/or flanking regions. Interestingly, the corresponding parental genes of 46 upregulated and 31 downregulated circRNAs did not show significant expression change in the HCC tumor versus normal samples. Importantly, 34 of the 77 DE circRNAs (44.2%) had significant correlation with DNA methylation change in HCC (Spearman's rank-order correlation, P-value < 0.05), suggesting that aberrant DNA methylation might regulate circular RNA expression in HCC. Our study revealed genome-wide differential circRNA expression in HCC. The significant correlation with DNA methylation change suggested that epigenetic regulation might act on both mRNA and circRNA expression. The specific regulation in HCC and the general view in other cancers or diseases require further investigation.
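
As a minimal sketch of the Spearman rank-order correlation named above, the snippet below ranks two paired series and correlates the ranks. It is not the authors' pipeline: the variable names and paired values are hypothetical, and no P-value is computed. The last line only re-derives the 44.2% figure from the counts in the abstract (34 of 77).

```python
def average_ranks(values):
    """Assign 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the block of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical paired values for one circRNA across five samples.
methylation_delta = [0.9, 0.7, 0.6, 0.3, 0.1]
circrna_log2fc = [-2.1, -1.0, -1.4, 0.2, 0.8]
rho = spearman(methylation_delta, circrna_log2fc)

# Share of DE circRNAs correlated with methylation, from the abstract's counts.
share_percent = round(100 * 34 / 77, 1)
print(rho, share_percent)
```

A strongly negative rho in such a comparison would be in line with methylation-associated silencing, although correlation alone does not establish regulation.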

Source: http://dx.doi.org/10.3390/biomedicines9060657
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8227141
June 2021

Association of CXCR6 with COVID-19 severity: delineating the host genetic factors in transcriptomic regulation.

Hum Genet 2021 Sep 21;140(9):1313-1328. Epub 2021 Jun 21.

Center for Precision Health, School of Biomedical Informatics, The University of Texas Health Science Center at Houston, 7000 Fannin St. Suite 600, Houston, TX, 77030, USA.

The coronavirus disease 2019 (COVID-19) is an infectious disease that mainly affects the host respiratory system with ~80% asymptomatic or mild cases and ~5% severe cases. Recent genome-wide association studies (GWAS) have identified several genetic loci associated with severe COVID-19 symptoms. Delineating the genetic variants and genes is important for better understanding its biological mechanisms. We implemented integrative approaches, including transcriptome-wide association studies (TWAS), colocalization analysis, and functional element prediction analysis, to interpret the genetic risks using two independent GWAS datasets in lung and immune cells. To understand the context-specific molecular alteration, we further performed deep learning-based single-cell transcriptomic analyses on a bronchoalveolar lavage fluid (BALF) dataset from moderate and severe COVID-19 patients. We discovered and replicated the genetically regulated expression of the CXCR6 and CCR9 genes. These two genes have a protective effect on lung, and a risk effect on whole blood, respectively. The colocalization analysis of GWAS and cis-expression quantitative trait loci highlighted the regulatory effect on CXCR6 expression in lung and immune cells. In the lung-resident memory CD8+ T (TRM) cells, we found a 2.24-fold decrease of cell proportion among CD8+ T cells and lower expression of CXCR6 in severe than in moderate patients. Pro-inflammatory transcriptional programs were highlighted in the TRM cellular trajectory from moderate to severe patients. CXCR6 from the 3p21.31 locus is associated with severe COVID-19. CXCR6 tends to have a lower expression in lung TRM cells of severe patients, which aligns with the protective effect of CXCR6 from TWAS analysis.

Source: http://dx.doi.org/10.1007/s00439-021-02305-z
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8216591
September 2021

Cell-type deconvolution analysis identifies cancer-associated myofibroblast component as a poor prognostic factor in multiple cancer types.

Oncogene 2021 Jul 17;40(28):4686-4694. Epub 2021 Jun 17.

Center for Precision Health, School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX, 77030, USA.

Cancer-associated fibroblasts (CAFs) constitute a prominent component of the tumor microenvironment and play critical roles in cancer progression and drug resistance. Although recent studies indicate CAFs may consist of several CAF subtypes, the breadth of CAF heterogeneity and functional roles of CAF subtypes in cancer progression remain unclear. In this study, we implemented a cell-type deconvolutional approach to comprehensively characterize cell-type alterations across 18 cancer types from The Cancer Genome Atlas (TCGA). Pan-cancer survival analysis using deconvoluted CAF subtypes revealed myofibroblastic CAF (myCAF) composition as a poor prognostic factor in nine cancer types. Patients with higher myCAF compositions tend to have worse response to six antineoplastic drugs predicted by a lncRNA-based Elastic Net prediction model (LENP). In addition, integrative mutational analysis identified 14 and 413 genes associated with the differentiation degree of myCAF and inflammatory CAF (iCAF), respectively, with significant enrichment of genes involved in fibroblast and extracellular matrix (ECM)-related pathways. In summary, our findings systematically illustrated the complex roles of CAF subtypes in patient prognosis and drug response, and identified putative driver genes in CAF-subtype differentiation. These results provided novel therapeutic perspectives for targeting CAF subtypes in the tumor microenvironment and arranging treatment schemes based on the CAF compositions in different cancer types.

Source: http://dx.doi.org/10.1038/s41388-021-01870-x
July 2021

Distinct effect of prenatal and postnatal brain expression across 20 brain disorders and anthropometric social traits: a systematic study of spatiotemporal modularity.

Brief Bioinform 2021 Nov;22(6)

Center for Precision Health, School of Biomedical Informatics, the University of Texas Health Science Center at Houston, 7000 Fannin St. Suite 600, Houston, TX 77030, USA.

Different spatiotemporal abnormalities have been implicated in different neuropsychiatric disorders and anthropometric social traits, yet an investigation of temporal network modularity using brain tissue transcriptomics has been lacking. We developed a supervised network approach to investigate the genome-wide association study (GWAS) results in the spatial and temporal contexts and demonstrated it in 20 brain disorders and anthropometric social traits. BrainSpan transcriptome profiles were used to discover significant modules enriched with trait susceptibility genes in a developmental stage-stratified manner. We investigated whether, and in which developmental stages, GWAS-implicated genes are coordinately expressed in the brain transcriptome. We identified significant network modules for each disorder and trait at different developmental stages, providing a systematic view of network modularity at specific developmental stages for a myriad of brain disorders and traits. Specifically, we observed a strong pattern of fetal origin for most psychiatric disorders and traits [such as schizophrenia (SCZ), bipolar disorder, obsessive-compulsive disorder and neuroticism], whereas increased co-expression activities of genes were more strongly associated with neurological diseases [such as Alzheimer's disease (AD) and amyotrophic lateral sclerosis] and anthropometric traits (such as college completion, education and subjective well-being) in postnatal brains. Further analyses revealed enriched cell types and functional features that supported and corroborated prior knowledge in specific brain disorders, such as clathrin-mediated endocytosis in AD, myelin sheath in multiple sclerosis and regulation of synaptic plasticity in both college completion and education. Our study provides a landscape view of the spatiotemporal features in a myriad of brain-related disorders and traits.

Source: http://dx.doi.org/10.1093/bib/bbab214
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8575009
November 2021

DeepFun: a deep learning sequence-based model to decipher non-coding variant effect in a tissue- and cell type-specific manner.

Nucleic Acids Res 2021 Jul;49(W1):W131-W139

Center for Precision Health, School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX 77030, USA.

More than 90% of the genetic variants identified from genome-wide association studies (GWAS) are located in non-coding regions of the human genome. Here, we present a user-friendly web server, DeepFun (https://bioinfo.uth.edu/deepfun/), to assess the functional activity of non-coding genetic variants. This new server is built on a convolutional neural network (CNN) framework that has been extensively evaluated. Specifically, we collected chromatin profiles from ENCODE and Roadmap projects to construct the feature space, including 1548 DNase I accessibility, 1536 histone mark, and 4795 transcription factor binding profiles covering 225 tissues or cell types. With such comprehensive epigenomics annotations, DeepFun expands the functionality of existing non-coding variant prioritizing tools to provide a more specific functional assessment on non-coding variants in a tissue- and cell type-specific manner. By using the datasets from various GWAS studies, we conducted independent validations and demonstrated the functions of the DeepFun web server in predicting the effect of a non-coding variant in a specific tissue or cell type, as well as visualizing the potential motifs in the region around variants. We expect our server will be widely used in genetics, functional genomics, and disease studies.

Source: http://dx.doi.org/10.1093/nar/gkab429
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8262726
July 2021

DeepVISP: Deep Learning for Virus Site Integration Prediction and Motif Discovery.

Adv Sci (Weinh) 2021 May 8;8(9):2004958. Epub 2021 Mar 8.

Center for Precision Health, School of Biomedical Informatics, The University of Texas Health Science Center at Houston (UTHealth), Houston, TX 77030, USA.

Approximately 15% of human cancers are estimated to be attributed to viruses. Virus sequences can be integrated into the host genome, leading to genomic instability and carcinogenesis. Here, a new deep convolutional neural network (CNN) model with attention architecture, namely DeepVISP, is developed for accurately predicting oncogenic virus integration sites (VISs) in the human genome. Using the curated benchmark integration data of three viruses, hepatitis B virus (HBV), human papillomavirus (HPV), and Epstein-Barr virus (EBV), DeepVISP achieves high accuracy and robust performance for all three viruses by automatically learning informative features and essential genomic positions from the DNA sequences alone. In comparison, DeepVISP outperforms conventional machine learning methods by 8.43-34.33%, measured by area under curve (AUC) value enhancement in the three viruses. Moreover, DeepVISP can decode cis-regulatory factors that are potentially involved in virus integration and tumorigenesis, such as HOXB7, IKZF1, and LHX6. These findings are supported by multiple lines of evidence in the literature. The clustering analysis of the informative motifs reveals that the representative k-mers in clusters could help guide virus recognition of the host genes. A user-friendly web server is developed for predicting putative oncogenic VISs in the human genome using DeepVISP.
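
The area under curve (AUC) metric used for the comparison above can be sketched as the probability that a randomly chosen positive site scores higher than a randomly chosen negative one. This is a generic illustration of the metric, not DeepVISP code; the labels and scores below are hypothetical.

```python
def roc_auc(labels, scores):
    """ROC AUC via pairwise comparison of positive and negative scores."""
    positives = [s for l, s in zip(labels, scores) if l == 1]
    negatives = [s for l, s in zip(labels, scores) if l == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(positives) * len(negatives))


# Hypothetical labels (1 = integration site, 0 = background) and model scores.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2]
auc = roc_auc(labels, scores)
print(auc)
```

The pairwise form is quadratic in the number of examples, which is acceptable for a sketch; rank-based formulas give the same value more efficiently on large benchmarks.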

Source: http://dx.doi.org/10.1002/advs.202004958
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8097320
May 2021

Rewired Pathways and Disrupted Pathway Crosstalk in Schizophrenia Transcriptomes by Multiple Differential Coexpression Methods.

Genes (Basel) 2021 Apr 29;12(5). Epub 2021 Apr 29.

Center for Precision Health, School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX 77030, USA.

Transcriptomic studies of mental disorders using human brain tissues have been limited, and gene expression signatures in schizophrenia (SCZ) remain elusive. In this study, we applied three differential co-expression methods to analyze five transcriptomic datasets (three RNA-Seq and two microarray datasets) derived from SCZ and matched normal postmortem brain samples. We aimed to uncover biological pathways where internal correlation structure was rewired or inter-coordination was disrupted in SCZ. In total, we identified 60 rewired pathways, many of which were related to neurotransmitter, synapse, immune, and cell adhesion. We found that the hub genes, which were at the center of rewired pathways, were highly mutually consistent among the five datasets. The combinatory list of 92 hub genes was generally multi-functional, suggesting their complex and dynamic roles in SCZ pathophysiology. In our constructed pathway crosstalk network, we found that "Clostridium neurotoxicity" and "signaling events mediated by focal adhesion kinase" had the highest interactions. We further identified disconnected gene links underlying the disrupted pathway crosstalk. Among them, four gene pairs (, , , and ) were normally correlated in universal contexts. In summary, we systematically identified rewired pathways, disrupted pathway crosstalk circuits, and critical genes and gene links in schizophrenia transcriptomes.

Source: http://dx.doi.org/10.3390/genes12050665
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8146818
April 2021

Deep generative neural network for accurate drug response imputation.

Nat Commun 2021 Mar 19;12(1):1740. Epub 2021 Mar 19.

Center for Precision Health, School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX, USA.

Drug response differs substantially in cancer patients due to inter- and intra-tumor heterogeneity. Particularly, transcriptome context, especially tumor microenvironment, has been shown to play a significant role in shaping the actual treatment outcome. In this study, we develop a deep variational autoencoder (VAE) model to compress thousands of genes into latent vectors in a low-dimensional space. We then demonstrate that these encoded vectors could accurately impute drug response, outperform standard signature-gene based approaches, and appropriately control the overfitting problem. We apply rigorous quality assessment and validation, including assessing the impact of cell line lineage, cross-validation, cross-panel evaluation, and application in independent clinical data sets, to warrant the accuracy of the imputed drug response in both cell lines and cancer samples. Specifically, the expression-regulated component (EReX) of the observed drug response achieves high correlation across panels. Using the well-trained models, we impute drug response of The Cancer Genome Atlas data and investigate the features and signatures associated with the imputed drug response, including cell line origins, somatic mutations and tumor mutation burdens, tumor microenvironment, and confounding factors. In summary, our deep learning method and the results are useful for the study of signatures and markers of drug response.

Source: http://dx.doi.org/10.1038/s41467-021-21997-5
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7979803
March 2021

Association of CXCR6 with COVID-19 severity: Delineating the host genetic factors in transcriptomic regulation.

bioRxiv 2021 Feb 19. Epub 2021 Feb 19.

Center for Precision Health, School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX 77030, USA.

Background: The coronavirus disease 2019 (COVID-19) is an infectious disease that mainly affects the host respiratory system with ∼80% asymptomatic or mild cases and ∼5% severe cases. Recent genome-wide association studies (GWAS) have identified several genetic loci associated with the severe COVID-19 symptoms. Delineating the genetic variants and genes is important for better understanding its biological mechanisms.

Methods: We implemented integrative approaches, including transcriptome-wide association studies (TWAS), colocalization analysis and functional element prediction analysis, to interpret the genetic risks using two independent GWAS datasets in lung and immune cells. To understand the context-specific molecular alteration, we further performed deep learning-based single cell transcriptomic analyses on a bronchoalveolar lavage fluid (BALF) dataset from moderate and severe COVID-19 patients.

Results: We discovered and replicated the genetically regulated expression of CXCR6 and CCR9 genes. These two genes have a protective effect on the lung and a risk effect on whole blood, respectively. The colocalization analysis of GWAS and cis-expression quantitative trait loci highlighted the regulatory effect on CXCR6 expression in lung and immune cells. In the lung resident memory CD8+ T (TRM) cells, we found a 3.32-fold decrease of cell proportion and lower expression of CXCR6 in the severe than moderate patients using the BALF transcriptomic dataset. Pro-inflammatory transcriptional programs were highlighted in the TRM cell trajectory from moderate to severe patients.

Conclusions: CXCR6 from the 3p21.31 locus is associated with severe COVID-19. CXCR6 tends to have a lower expression in lung TRM cells of severe patients, which aligns with the protective effect of CXCR6 from TWAS analysis. We illustrate one potential mechanism of host genetic factor impacting the severity of COVID-19 through regulating the expression of CXCR6 and TRM cell proportion and stability. Our results shed light on potential therapeutic targets for severe COVID-19.

Source: http://dx.doi.org/10.1101/2021.02.17.431554
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7899454
February 2021

Characterization of genome-wide association study data reveals spatiotemporal heterogeneity of mental disorders.

BMC Med Genomics 2020 Dec 28;13(Suppl 11):192. Epub 2020 Dec 28.

Center for Precision Health, School of Biomedical Informatics, The University of Texas Health Science Center at Houston, 7000 Fannin St. Suite 820, Houston, TX, 77030, USA.

Background: Psychiatric disorders such as schizophrenia (SCZ), bipolar disorder (BIP), major depressive disorder (MDD), attention deficit-hyperactivity disorder (ADHD), and autism spectrum disorder (ASD) are often related to brain development. Both shared and unique biological and neurodevelopmental processes have been reported to be involved in these disorders.

Methods: In this work, we developed an integrative analysis framework to seek the sensitive spatiotemporal point during brain development underlying each disorder. Specifically, we first identified spatiotemporal gene co-expression modules for four brain regions and three developmental stages (prenatal, birth to 11 years old, and older than 13 years), totaling 12 spatiotemporal sites. By integrating GWAS summary statistics and the spatiotemporal co-expression modules, we characterized the risk genes and their co-expression partners for five disorders.

Results: We found that SCZ and BIP, ASD and ADHD tend to cluster with each other and keep a distance from other psychiatric disorders. At the gene level, we identified several genes that were shared among the most significant modules, such as CTNNB1 and LNX1, and a hub gene, ATF2, in multiple modules. Moreover, we pinpointed two spatiotemporal points in the prenatal stage with active expression activities and highlighted one postnatal point for BIP. Further functional analysis of the disorder-related module highlighted the apoptotic signaling pathway for ASD and the immune-related and cell-cell adhesion function for SCZ, respectively.

Conclusion: Our study demonstrated the dynamic changes of disorder-related genes at the network level, shedding light on the spatiotemporal regulation during brain development.

Source: http://dx.doi.org/10.1186/s12920-020-00832-8
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7771094
December 2020

Predicting regulatory variants using a dense epigenomic mapped CNN model elucidated the molecular basis of trait-tissue associations.

Nucleic Acids Res 2021 Jan;49(1):53-66

Center for Precision Health, School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX 77030, USA.

Assessing the causal tissues of human complex diseases is important for the prioritization of trait-associated genetic variants. Yet, the biological underpinnings of trait-associated variants are extremely difficult to infer due to statistical noise in genome-wide association studies (GWAS), and because >90% of genetic variants from GWAS are located in non-coding regions. Here, we collected the largest human epigenomic map from ENCODE and Roadmap consortia and implemented a deep-learning-based convolutional neural network (CNN) model to predict the regulatory roles of genetic variants across a comprehensive list of epigenomic modifications. Our model, called DeepFun, was built on DNA accessibility maps, histone modification marks, and transcription factors. DeepFun can systematically assess the impact of non-coding variants in the most functional elements with tissue or cell-type specificity, even for rare variants or de novo mutations. By applying this model, we prioritized trait-associated loci for 51 publicly-available GWAS studies. We demonstrated that CNN-based analyses on dense and high-resolution epigenomic annotations can refine important GWAS associations in order to identify regulatory loci from background signals, which yield novel insights for better understanding the molecular basis of human complex disease. We anticipate our approaches will become routine in GWAS downstream analysis and non-coding variant evaluation.

Source: http://dx.doi.org/10.1093/nar/gkaa1137
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7797043
January 2021

Convergent genomic and pharmacological evidence of PI3K/GSK3 signaling alterations in neurons from schizophrenia patients.

Neuropsychopharmacology 2021 Feb 7;46(3):673-682. Epub 2020 Dec 7.

Louis A. Faillace, MD, Department of Psychiatry and Behavioral Sciences, McGovern Medical School, University of Texas Health Science Center at Houston, Houston, TX, USA.

Human-induced pluripotent stem cells (hiPSCs) allow for the establishment of brain cellular models of psychiatric disorders that account for a patient's genetic background. Here, we conducted an RNA-sequencing profiling study of hiPSC-derived cell lines from schizophrenia (SCZ) subjects, most of which are from a multiplex family, from the population isolate of the Central Valley of Costa Rica. hiPSCs, neural precursor cells, and cortical neurons derived from six healthy controls and seven SCZ subjects were generated using standard methodology. Transcriptome from these cells was obtained using Illumina HiSeq 2500, and differential expression analyses were performed using DESeq2 (|fold change|>1.5 and false discovery rate < 0.3), in patients compared to controls. We identified 454 differentially expressed genes in hiPSC-derived neurons, enriched in pathways including phosphoinositide 3-kinase/glycogen synthase kinase 3 (PI3K/GSK3) signaling, with serum-glucocorticoid kinase 1 (SGK1), an inhibitor of glycogen synthase kinase 3β, as part of this pathway. We further found that pharmacological inhibition of downstream effectors of the PI3K/GSK3 pathway, SGK1 and GSK3, induced alterations in levels of neurite markers βIII tubulin and fibroblast growth factor 12, with differential effects in patients compared to controls. While demonstrating the utility of hiPSCs derived from multiplex families to identify significant cell-specific gene network alterations in SCZ, these studies support a role for disruption of PI3K/GSK3 signaling as a risk factor for SCZ.
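
The differential expression thresholds quoted above (|fold change| > 1.5 and false discovery rate < 0.3) can be sketched as a simple filter over a results table. This is an illustration only: DESeq2 itself is an R package, the gene names and values are hypothetical, and reading "|fold change| > 1.5" as |log2 fold change| > log2(1.5) in either direction is an assumption.

```python
import math

FOLD_CHANGE_CUTOFF = 1.5   # from the abstract
FDR_CUTOFF = 0.3           # from the abstract


def is_differentially_expressed(log2_fold_change, adjusted_p):
    """Apply both cutoffs to one gene; a missing adjusted p-value fails."""
    if adjusted_p is None:
        return False
    passes_effect = abs(log2_fold_change) > math.log2(FOLD_CHANGE_CUTOFF)
    return passes_effect and adjusted_p < FDR_CUTOFF


# Hypothetical rows: (gene, log2 fold change, adjusted p-value).
rows = [
    ("GENE_A", 1.20, 0.01),    # large effect, significant
    ("GENE_B", 0.40, 0.001),   # significant but effect too small
    ("GENE_C", -0.90, 0.25),   # downregulated, passes both cutoffs
    ("GENE_D", 2.00, 0.45),    # large effect, fails the FDR cutoff
    ("GENE_E", -1.50, None),   # no adjusted p-value available
]

hits = [gene for gene, lfc, padj in rows if is_differentially_expressed(lfc, padj)]
print(hits)
```

An FDR cutoff of 0.3 is permissive, so a list filtered this way is better read as a candidate set for pathway enrichment than as a set of individually confirmed genes.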

Source
http://dx.doi.org/10.1038/s41386-020-00924-0
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8027596
February 2021

Gene expression imputation and cell-type deconvolution in human brain with spatiotemporal precision and its implications for brain-related disorders.

Genome Res 2021 01 3;31(1):146-158. Epub 2020 Dec 3.

Center for Precision Health, School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, Texas 77030, USA.

As the most complex organ of the human body, the brain is composed of diverse regions, each consisting of distinct cell types and their respective cellular interactions. Human brain development involves a finely tuned cascade of interactive events. These include spatiotemporal gene expression changes and dynamic alterations in cell-type composition. However, our understanding of this process is still largely incomplete owing to the difficulty of brain spatiotemporal transcriptome collection. In this study, we developed a tensor-based approach to impute gene expression on a transcriptome-wide level. After rigorous computational benchmarking, we applied our approach to infer missing data points in the widely used BrainSpan resource and completed the entire grid of spatiotemporal transcriptomics. Next, we conducted deconvolutional analyses to comprehensively characterize major cell-type dynamics across the entire BrainSpan resource to estimate the cellular temporal changes and distinct neocortical areas across development. Moreover, integration of these results with GWAS summary statistics for 13 brain-associated traits revealed multiple novel trait-cell-type associations and trait-spatiotemporal relationships. In summary, our imputed BrainSpan transcriptomic data provide a valuable resource for the research community and our findings help further studies of the transcriptional and cellular dynamics of the human brain and related diseases.

Source
http://dx.doi.org/10.1101/gr.265769.120
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7849392
January 2021

A developmental stage-specific network approach for studying dynamic co-regulation of transcription factors and microRNAs during craniofacial development.

Development 2020 12 24;147(24). Epub 2020 Dec 24.

Center for Precision Health, School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX 77030, USA.

Craniofacial development is regulated through dynamic and complex mechanisms that involve various signaling cascades and gene regulations. Disruption of such regulations can result in craniofacial birth defects. Here, we propose the first developmental stage-specific network approach by integrating two crucial regulators, transcription factors (TFs) and microRNAs (miRNAs), to study their co-regulation during craniofacial development. Specifically, we used TFs, miRNAs and non-TF genes to form feed-forward loops (FFLs) using genomic data covering mouse embryonic days E10.5 to E14.5. We identified key novel regulators (TFs Foxm1, Hif1a, Zbtb16, Myog, Myod1 and Tcf7, and miRNAs miR-340-5p and miR-129-5p) and target genes whose expression changed in a developmental stage-dependent manner. We found that the Wnt-FoxO-Hippo pathway (from E10.5 to E11.5), tissue remodeling (from E12.5 to E13.5) and miR-129-5p-mediated regulation (from E10.5 to E14.5) might play crucial roles in craniofacial development. Enrichment analyses further suggested their functions. Our experiments validated the regulatory roles of miR-340-5p and Foxm1 in the Wnt-FoxO-Hippo subnetwork, as well as the role of miR-129-5p in the miR-129-5p subnetwork. Thus, our study helps to elucidate the regulatory mechanisms of craniofacial development.

Source
http://dx.doi.org/10.1242/dev.192948
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7774895
December 2020

CSEA-DB: an omnibus for human complex trait and cell type associations.

Nucleic Acids Res 2021 01;49(D1):D862-D870

Center for Precision Health, School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX 77030, USA.

During the past decade, genome-wide association studies (GWAS) have identified many genetic variants with susceptibility to several thousands of complex diseases or traits. The genetic regulation of gene expression is highly tissue-specific and cell type-specific. Recently, single-cell technology has paved the way to dissect cellular heterogeneity in human tissues. Here, we present a reference database for GWAS trait-associated cell type-specificity, named Cell type-Specific Enrichment Analysis DataBase (CSEA-DB, available at https://bioinfo.uth.edu/CSEADB/). Specifically, we curated a total of 5120 GWAS summary statistics datasets for a wide range of human traits and diseases, followed by rigorous quality control. We further collected >900 000 cells from leading consortia such as Human Cell Landscape and Human Cell Atlas, and from extensive literature mining, including 752 tissue cell types from 71 adult and fetal tissues across 11 human organ systems. The tissues and cell types were annotated with Uberon and Cell Ontology. By applying our deTS algorithm, we conducted 10 250 480 trait-cell type association analyses, reporting a total of 598 (11.68%) GWAS traits with at least one significantly associated cell type. In summary, CSEA-DB could serve as a repository of association maps for human complex traits and their underlying cell types, manually curated GWAS, and single-cell transcriptome resources.

Source
http://dx.doi.org/10.1093/nar/gkaa1064
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7778923
January 2021

Editorial: Advanced Interpretable Machine Learning Methods for Clinical NGS Big Data of Complex Hereditary Diseases.

Front Genet 2020 23;11:600902. Epub 2020 Oct 23.

University of Texas Health Science Center at Houston, Houston, TX, United States.

Source
http://dx.doi.org/10.3389/fgene.2020.600902
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7644923
October 2020

Differential Expression of Viral Transcripts From Single-Cell RNA Sequencing of Moderate and Severe COVID-19 Patients and Its Implications for Case Severity.

Front Microbiol 2020 16;11:603509. Epub 2020 Oct 16.

Center for Precision Health, School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX, United States.

With the steady increase of new COVID-19 cases around the world, especially in the United States, health care resources in areas with the disease outbreak are quickly exhausted by overwhelming numbers of COVID-19 patients. Therefore, strategies that can effectively and quickly predict the disease progression and stratify patients for appropriate health care arrangements are urgently needed. We explored the features and evolutionary difference of viral gene expression in the SARS-CoV-2 infected cells from the bronchoalveolar lavage fluids of patients with moderate and severe COVID-19 using both single cell and bulk tissue transcriptome data. We found SARS-CoV-2 sequences were detectable in 8 types of immune related cells, including macrophages, T cells, and NK cells. We first reported that the SARS-CoV-2 ORF10 gene was differentially expressed in the severe vs. moderate samples. Specifically, ORF10 was abundantly expressed in infected cells of severe cases, while it was barely detectable in the infected cells of moderate cases. Consequently, the expression ratio of ORF10 to nucleocapsid (N) was significantly higher in severe than moderate cases (p = 0.0062). Moreover, we found that transcription regulatory sequences (TRSs) of the viral leader sequence-independent fusions with a 5' joint point at position 1073 of the SARS-CoV-2 genome were detected mainly in the patients with death outcome, suggesting their potential as an indicator of clinical outcome. Finally, we identified the motifs in TRSs of the viral leader sequence-dependent fusion events of SARS-CoV-2 and compared them with those in SARS-CoV, suggesting its evolutionary trajectory. These results implicated potential roles and predictive features of viral transcripts in the pathogenesis of COVID-19 in moderate and severe patients. Such features and evolutionary patterns require more data for validation in the future.

Source
http://dx.doi.org/10.3389/fmicb.2020.603509
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7596306
October 2020

KinaseMD: kinase mutations and drug response database.

Nucleic Acids Res 2021 01;49(D1):D552-D561

Center for Precision Health, School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston TX 77030, USA.

Mutations in kinases are abundant and critical to study signaling pathways and regulatory roles in human disease, especially in cancer. Somatic mutations in kinase genes can affect drug treatment, both sensitivity and resistance, to clinically used kinase inhibitors. Here, we present a newly constructed database, KinaseMD (kinase mutations and drug response), to structurally and functionally annotate kinase mutations. KinaseMD integrates 679 374 somatic mutations, 251 522 network-rewiring events, and 390 460 drug response records curated from various sources for 547 kinases. We uniquely annotate the mutations and kinase inhibitor response in four types of protein substructures (gatekeeper, A-loop, G-loop and αC-helix) that are linked to kinase inhibitor resistance in literature. In addition, we annotate functional mutations that may rewire kinase regulatory network and report four phosphorylation signals (gain, loss, up-regulation and down-regulation). Overall, KinaseMD provides the most updated information on mutations, unique annotations of drug response especially drug resistance and functional sites of kinases. KinaseMD is accessible at https://bioinfo.uth.edu/kmd/, having functions for searching, browsing and downloading data. To our knowledge, there has been no systematic annotation of these structural mutations linking to kinase inhibitor response. In summary, KinaseMD is a centralized database for kinase mutations and drug response.

Source
http://dx.doi.org/10.1093/nar/gkaa945
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7779064
January 2021

A Genome-wide Association Study Discovers 46 Loci of the Human Metabolome in the Hispanic Community Health Study/Study of Latinos.

Am J Hum Genet 2020 11 7;107(5):849-863. Epub 2020 Oct 7.

Human Genetics Center, University of Texas Health Science Center, Houston, TX 77030, USA.

Variation in levels of the human metabolome reflect changes in homeostasis, providing a window into health and disease. The genetic impact on circulating metabolites in Hispanics, a population with high cardiometabolic disease burden, is largely unknown. We conducted genome-wide association analyses on 640 circulating metabolites in 3,926 Hispanic Community Health Study/Study of Latinos participants. The estimated heritability for 640 metabolites ranged between 0%-54% with a median at 2.5%. We discovered 46 variant-metabolite pairs (p value < 1.2 × 10, minor allele frequency ≥ 1%, proportion of variance explained [PEV] mean = 3.4%, PEV = 1%-22%) with generalized effects in two population-based studies and confirmed 301 known locus-metabolite associations. Half of the identified variants with generalized effect were located in genes, including five nonsynonymous variants. We identified co-localization with the expression quantitative trait loci at 105 discovered and 151 known loci-metabolites sets. rs5855544, upstream of SLC51A, was associated with higher levels of three steroid sulfates and co-localized with expression levels of SLC51A in several tissues. Mendelian randomization (MR) analysis identified several metabolites associated with coronary heart disease (CHD) and type 2 diabetes. For example, two variants located in or near CYP4F2 (rs2108622 and rs79400241, respectively), involved in vitamin E metabolism, were associated with the levels of octadecanedioate and vitamin E metabolites (gamma-CEHC and gamma-CEHC glucuronide); MR analysis showed that genetically high levels of these metabolites were associated with lower odds of CHD. Our findings document the genetic architecture of circulating metabolites in an underrepresented Hispanic/Latino community, shedding light on disease etiology.

Source
http://dx.doi.org/10.1016/j.ajhg.2020.09.003
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7675000
November 2020

Identification of de novo mutations in prenatal neurodevelopment-associated genes in schizophrenia in two Han Chinese patient-sibling family-based cohorts.

Transl Psychiatry 2020 09 1;10(1):307. Epub 2020 Sep 1.

410 AI, LLC, 10 Plummer Ct, Germantown, MD, 20876, USA.

Schizophrenia (SCZ) is a severe psychiatric disorder with a strong genetic component. High heritability of SCZ suggests a major role for transmitted genetic variants. Furthermore, SCZ is also associated with a marked reduction in fecundity, leading to the hypothesis that alleles with large effects on risk might often occur de novo. In this study, we conducted whole-genome sequencing for 23 families from two cohorts with unaffected siblings and parents. Two nonsense de novo mutations (DNMs) in GJC1 and HIST1H2AD were identified in SCZ patients. Ten genes (DPYSL2, NBPF1, SDK1, ZNF595, ZNF718, GCNT2, SNX9, AACS, KCNQ1, and MSI2) were found to carry more DNMs in SCZ patients than their unaffected siblings by burden test. Expression analyses indicated that these DNM-implicated genes showed significantly higher expression in the prefrontal cortex at the prenatal stage. The DNM in the GJC1 gene is highly likely a loss-of-function mutation (pLI = 0.94), leading to the dysregulation of ion channels in the glutamatergic excitatory neurons. Analysis of rare variants in an independent exome sequencing dataset indicates that GJC1 has significantly more rare variants in SCZ patients than in unaffected controls. Data from genome-wide association studies suggested that common variants in the GJC1 gene may be associated with SCZ and SCZ-related traits. Genes co-expressed with GJC1 are involved in SCZ, SCZ-associated pathways, and drug targets. This evidence suggests that GJC1 may be a risk gene for SCZ and its function may be involved in prenatal and early neurodevelopment, a vulnerable period for developmental disorders such as SCZ.

Source
http://dx.doi.org/10.1038/s41398-020-00987-z
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7463022
September 2020

Molecular signatures identified by integrating gene expression and methylation in non-seminoma and seminoma of testicular germ cell tumours.

Epigenetics 2021 Jan-Feb;16(2):162-176. Epub 2020 Jul 13.

Center for Precision Health, School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX, USA.

Testicular germ cell tumours (TGCTs) are the most common cancer in young male adults (aged 15 to 40). Unlike most other cancer types, identification of molecular signatures in TGCT has rarely been reported. In this study, we developed a novel integrative analysis framework to identify co-methylated and co-expressed gene [mRNA and microRNA (miRNA)] modules in two TGCT subtypes: non-seminoma (NSE) and seminoma (SE). We first integrated DNA methylation and mRNA/miRNA expression data and then used a statistical method, CoMEx (Combined score of DNA Methylation and Expression), to assess differentially expressed and methylated (DEM) genes/miRNAs. Next, we identified co-methylation and co-expression modules by applying the WGCNA (Weighted Gene Correlation Network Analysis) tool to these DEM genes/miRNAs. The module with the highest average Pearson's Correlation Coefficient (PCC) after considering all pair-wise molecules (genes/miRNAs) included 91 molecules. By integrating both transcription factor and miRNA regulations, we constructed subtype-specific regulatory networks for NSE and SE. We identified four hub miRNAs (miR-182-5p, miR-520b, miR-520c-3p, and miR-7-5p), two hub TFs (MYC and SP1), and two genes in the NSE-specific regulatory network, and two hub miRNAs (miR-182-5p and miR-338-3p), five hub TFs (ETS1, HIF1A, HNF1A, MYC, and SP1), and three hub genes in the SE-specific regulatory network. One miRNA (miR-182-5p) and two TFs (MYC and SP1) were common hubs of NSE and SE. We further examined pathways enriched in these subtype-specific networks. Our study provides a comprehensive view of the molecular signatures and co-regulation in two TGCT subtypes.

Source
http://dx.doi.org/10.1080/15592294.2020.1790108
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7889165
July 2020

H19, a Long Non-coding RNA, Mediates Transcription Factors and Target Genes through Interference of MicroRNAs in Pan-Cancer.

Mol Ther Nucleic Acids 2020 Sep 27;21:180-191. Epub 2020 May 27.

Center for Precision Health, School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX 77030, USA; University of Texas MD Anderson Cancer Center UTHealth Graduate School of Biomedical Sciences, Houston, TX 77030, USA; Human Genetics Center, School of Public Health, The University of Texas Health Science Center at Houston, Houston, TX 77030, USA; Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN 37203, USA.

Long non-coding RNAs (lncRNAs) have recently been found to be important in gene regulation. lncRNA H19 has been reported to play an oncogenic role in many human cancers. Its specific regulatory role is still elusive. In this study, we developed a novel analytic approach by integrating the synergistic regulation among lncRNAs (e.g., H19), transcription factors (TFs), target genes, and microRNAs (miRNAs) and then applied it to the pan-cancer expression datasets from The Cancer Genome Atlas (TCGA). Using linear regression models, we identified 88 H19-TF-gene co-regulatory triplets, in which 93% of the TF-gene pairs were related to cancer, indicating that our approach was effective to identify disease-related lncRNA-TF-gene co-regulation mechanisms. lncRNAs can function as miRNA sponges. Our further experiments found that H19 might regulate SP1-TGFBR2 through let-7b and miR-200b, ETS1-TGFBR2 through miR-29a and miR-200b, and STAT3-KLF11 through miR-17 in breast cancer cell lines. Our work suggests that miRNA-mediated lncRNA-TF-gene co-regulation is complicated yet important in cancer.

Source
http://dx.doi.org/10.1016/j.omtn.2020.05.028
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7321791
September 2020

Deep4mC: systematic assessment and computational prediction for DNA N4-methylcytosine sites by deep learning.

Brief Bioinform 2021 05;22(3)

Center for Precision Health, School of Biomedical Informatics.

DNA N4-methylcytosine (4mC) modification represents a novel epigenetic regulation. It is involved in various cellular processes, including DNA replication, cell cycle and gene expression, among others. In addition to experimental identification of 4mC sites, in silico prediction of 4mC sites in the genome has emerged as an alternative and promising approach. In this study, we first reviewed the current progress in the computational prediction of 4mC sites and systematically evaluated the predictive capacity of eight conventional machine learning algorithms as well as 12 feature types commonly used in previous studies in six species. Using a representative benchmark dataset, we investigated the contribution of feature selection and stacking approach to the model construction, and found that feature optimization and proper reinforcement learning could improve the performance. We next recollected newly added 4mC sites in the six species' genomes and developed a novel deep learning-based 4mC site predictor, namely Deep4mC. Deep4mC applies convolutional neural networks with four representative features. For species with small numbers of samples, we extended our deep learning framework with a bootstrapping method. Our evaluation indicated that Deep4mC could obtain high accuracy and robust performance, with average area under curve (AUC) values greater than 0.9 in all species (range: 0.9005-0.9722). In comparison, Deep4mC achieved an AUC value improvement of 10.14% to 46.21% over previous tools in these six species. A user-friendly web server (https://bioinfo.uth.edu/Deep4mC) was built for predicting putative 4mC sites in a genome.

Source
http://dx.doi.org/10.1093/bib/bbaa099
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8138820
May 2021

Decoding whole-genome mutational signatures in 37 human pan-cancers by denoising sparse autoencoder neural network.

Oncogene 2020 07 11;39(27):5031-5041. Epub 2020 Jun 11.

Center for Precision Health, School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX, 77030, USA.

Millions of somatic mutations have recently been discovered in cancer genomes. These mutations in cancer genomes occur due to internal and external mutagenesis forces. Decoding the mutational processes by examining their unique patterns has successfully revealed many known and novel signatures from whole exome data, but many still remain undiscovered. Here, we developed a deep learning approach, DeepMS, to decompose mutational signatures using 52,671,908 somatic mutations from 2780 highly curated cancer genomes with whole genome sequencing (WGS) in 37 cancer types/subtypes. With rigorous model training and comparison, we characterized 54 signatures for single base substitutions (SBSs), 11 for doublet base substitutions (DBSs) and 16 for small insertions and deletions (Indels). Compared to the previous methods, DeepMS could discover 37 SBS, 5 DBS, and 9 Indel new signatures, many of which represent associations with DNA mismatch or base excision repair and cisplatin therapy mechanisms. We further developed a regression-based model to estimate the correlation between signatures and clinical and demographical phenotypes. As the first deep learning model applied to WGS somatic mutational profiles, DeepMS enables us to identify more comprehensive context-based mutational signatures than traditional NMF approaches. Our work substantially expands the landscape of the naturally occurring mutational signatures in cancer genomes, and provides new insights into cancer biology.

Source
http://dx.doi.org/10.1038/s41388-020-1343-z
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7334101
July 2020

An integrative, genomic, transcriptomic and network-assisted study to identify genes associated with human cleft lip with or without cleft palate.

BMC Med Genomics 2020 04 3;13(Suppl 5):39. Epub 2020 Apr 3.

Center for Precision Health, School of Biomedical Informatics, The University of Texas Health Science Center at Houston, 7000 Fannin St. Suite 600, Houston, TX, 77030, USA.

Background: Cleft lip with or without cleft palate (CL/P) is one of the most common congenital human birth defects. A combination of genetic and epidemiology studies has contributed to a better knowledge of CL/P-associated candidate genes and environmental risk factors. However, the etiology of CL/P remains not fully understood. In this study, to identify new CL/P-associated genes, we conducted an integrative analysis using our in-house network tools, dmGWAS [dense module search for Genome-Wide Association Studies (GWAS)] and EW_dmGWAS (Edge-Weighted dmGWAS), in a combination with GWAS data, the human protein-protein interaction (PPI) network, and differential gene expression profiles.

Results: A total of 87 genes were consistently detected in both European and Asian ancestries in dmGWAS. Of these, 31.0% (27/87) showed nominal significance with CL/P (gene-based p < 0.05), with three genes showing strong association signals, including KIAA1598, GPR183, and ZMYND11 (p < 1 × 10). In EW_dmGWAS, we identified 253 and 245 module genes associated with CL/P for the European ancestry and the Asian ancestry, respectively. Functional enrichment analysis demonstrated that these genes were involved in cell adhesion, protein localization to the plasma membrane, the regulation of the apoptotic signaling pathway, and other pathological conditions. A small proportion of genes (5.1% for European ancestry; 2.4% for Asian ancestry) had prior evidence in CL/P as annotated in the CleftGeneDB database. Our analysis highlighted nine novel CL/P candidate genes (BRD1, CREBBP, CSK, DNM1L, LOR, PTPN18, SND1, TGS1, and VIM) and 17 previously reported genes in the top modules.

Conclusions: The genes identified through superimposing GWAS signals and differential gene expression profiles onto human PPI network, as well as their functional features, helped our understanding of the etiology of CL/P. Our multi-omics integrative analyses revealed nine novel candidate genes involved in CL/P.

Source
http://dx.doi.org/10.1186/s12920-020-0675-4
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7118807
April 2020