Publications by authors named "Jake Lin"

23 Publications

Genetic Adaptation of Coxsackievirus B1 during Persistent Infection in Pancreatic Cells.

Microorganisms 2020 Nov 15;8(11). Epub 2020 Nov 15.

Faculty of Medicine and Health Technology, Tampere University, 33520 Tampere, Finland.

Coxsackie B (CVB) viruses have been associated with type 1 diabetes. We have recently observed that CVB1 was linked to the initiation of the autoimmune process leading to type 1 diabetes in Finnish children. Viral persistency in the pancreas is currently considered one possible mechanism. In the current study, persistent infection was established in pancreatic ductal and beta cell lines (PANC-1 and 1.1B4) using four different CVB1 strains, including the prototype strain and three clinical isolates. We sequenced the 5' untranslated region (UTR), the regions coding for structural and non-structural proteins, and the second single open reading frame (ORF) protein of all persisting CVB1 strains using next generation sequencing to identify mutations that are common to all of these strains. One mutation, K257R in VP1, was found in all persisting CVB1 strains. The mutations accumulated mainly in viral structural proteins, especially at the BC, DE and EF loops and the C-terminus of viral capsid protein 1 (VP1), the puff region of VP2, the knob region of VP3 and the infection-enhancing epitope of VP4. This showed that the capsid region of the viruses sustains various changes during persistency, some of which could be hallmark(s) of persistency.

Source
DOI: http://dx.doi.org/10.3390/microorganisms8111790
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7697981
November 2020

An expanded analysis framework for multivariate GWAS connects inflammatory biomarkers to functional variants and disease.

Eur J Hum Genet 2021 Feb 27;29(2):309-324. Epub 2020 Oct 27.

Institute for Molecular Medicine Finland (FIMM), HiLIFE, University of Helsinki, Helsinki, Finland.

Multivariate methods are known to increase the statistical power to detect associations in the case of shared genetic basis between phenotypes. They have, however, lacked essential analytic tools to follow-up and understand the biology underlying these associations. We developed a novel computational workflow for multivariate GWAS follow-up analyses, including fine-mapping and identification of the subset of traits driving associations (driver traits). Many follow-up tools require univariate regression coefficients which are lacking from multivariate results. Our method overcomes this problem by using Canonical Correlation Analysis to turn each multivariate association into its optimal univariate Linear Combination Phenotype (LCP). This enables an LCP-GWAS, which in turn generates the statistics required for follow-up analyses. We implemented our method on 12 highly correlated inflammatory biomarkers in a Finnish population-based study. Altogether, we identified 11 associations, four of which (F5, ABO, C1orf140 and PDGFRB) were not detected by biomarker-specific analyses. Fine-mapping identified 19 signals within the 11 loci and driver trait analysis determined the traits contributing to the associations. A phenome-wide association study on the 19 representative variants from the signals in 176,899 individuals from the FinnGen study revealed 53 disease associations (p < 1 × 10). Several reported pQTLs in the 11 loci provided orthogonal evidence for the biologically relevant functions of the representative variants. Our novel multivariate analysis workflow provides a powerful addition to standard univariate GWAS analyses by enabling multivariate GWAS follow-up and thus promoting the advancement of powerful multivariate methods in genomics.
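The idea of collapsing a multivariate association into one Linear Combination Phenotype can be sketched for a single variant as follows. This is a minimal illustration, not the authors' workflow: it assumes individual-level data, and it relies on the fact that when one side of a Canonical Correlation Analysis is a single variable (the genotype), the canonical weights for the traits are proportional to the coefficients from regressing the genotype on the traits. All function names and the simulated data are hypothetical.

```python
import random

def center(values):
    mean = sum(values) / len(values)
    return [v - mean for v in values]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def correlation(a, b):
    a, b = center(a), center(b)
    return dot(a, b) / (dot(a, a) * dot(b, b)) ** 0.5

def solve(matrix, rhs):
    """Solve a small linear system by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    solution = [0.0] * n
    for i in reversed(range(n)):
        known = sum(aug[i][j] * solution[j] for j in range(i + 1, n))
        solution[i] = (aug[i][n] - known) / aug[i][i]
    return solution

def lcp_weights(genotype, traits):
    """Trait weights that maximize the correlation between genotype and the combined trait."""
    g = center(genotype)
    cols = [center(t) for t in traits]
    cov_tt = [[dot(a, b) for b in cols] for a in cols]  # trait-by-trait cross products
    cov_tg = [dot(a, g) for a in cols]                  # trait-by-genotype cross products
    return solve(cov_tt, cov_tg)

# Simulated toy data (hypothetical): three correlated traits, two affected by the variant.
rng = random.Random(0)
n = 200
genotype = [float(rng.choice([0, 1, 2])) for _ in range(n)]
shared = [rng.gauss(0, 1) for _ in range(n)]
traits = [
    [0.3 * g + s + rng.gauss(0, 1) for g, s in zip(genotype, shared)],
    [-0.2 * g + s + rng.gauss(0, 1) for g, s in zip(genotype, shared)],
    [s + rng.gauss(0, 1) for s in shared],
]

weights = lcp_weights(genotype, traits)
lcp = [sum(w * t[i] for w, t in zip(weights, traits)) for i in range(n)]
lcp_corr = correlation(genotype, lcp)
best_single = max(abs(correlation(genotype, t)) for t in traits)
print(f"LCP correlation {lcp_corr:.3f} vs best single trait {best_single:.3f}")
```

Because the combined phenotype is a single variable, an ordinary univariate regression of it on the genotype yields the kind of effect estimate that univariate follow-up tools expect.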

Source
DOI: http://dx.doi.org/10.1038/s41431-020-00730-8
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7868371
February 2021

MetaPhat: Detecting and Decomposing Multivariate Associations From Univariate Genome-Wide Association Statistics.

Front Genet 2020 May 15;11:431. Epub 2020 May 15.

Institute for Molecular Medicine Finland FIMM, Helsinki Institute of Life Science HiLIFE, University of Helsinki, Helsinki, Finland.

Background: Multivariate testing tools that integrate multiple genome-wide association studies (GWAS) have become important as the number of phenotypes gathered from study cohorts and biobanks has increased. While these tools have been shown to boost statistical power considerably over univariate tests, an important remaining challenge is to interpret which traits are driving the multivariate association and which traits are just passengers with minor contributions to the genotype-phenotypes association statistic.

Results: We introduce MetaPhat, a novel bioinformatics tool to conduct GWAS of multiple correlated traits using univariate GWAS results and to decompose multivariate associations into sets of central traits based on intuitive trace plots that visualize Bayesian Information Criterion (BIC) and P-value statistics of multivariate association models. We validate MetaPhat with Global Lipids Genetics Consortium GWAS results, and we apply MetaPhat to univariate GWAS results for 21 heritable and correlated polyunsaturated lipid species from 2,045 Finnish samples, detecting seven independent loci associated with a cluster of lipid species. In most cases, we are able to decompose these multivariate associations to only three to five central traits out of all 21 traits included in the analyses. We release MetaPhat as an open source tool written in Python with built-in support for multi-processing, quality control, clumping and intuitive visualizations using the R software.

Conclusion: MetaPhat efficiently decomposes associations between multivariate phenotypes and genetic variants into smaller sets of central traits and improves the interpretation and specificity of genome-phenome associations. MetaPhat is freely available under the MIT license at: https://sourceforge.net/projects/meta-pheno-association-tracer.
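For readers unfamiliar with the Bayesian Information Criterion mentioned in the abstract, the snippet below shows its general definition, BIC = k ln(n) - 2 ln(L), where a lower value indicates a better trade-off between fit and model size. This is only the textbook formula, not MetaPhat's implementation; the log-likelihood values and trait counts in the example are invented for illustration, while the sample size of 2,045 is taken from the abstract.

```python
from math import log

def bic(log_likelihood, n_parameters, n_samples):
    """Bayesian Information Criterion: k * ln(n) - 2 * ln(L). Lower is better."""
    return n_parameters * log(n_samples) - 2.0 * log_likelihood

# Hypothetical comparison: a model using all 21 traits versus a 4-trait subset.
# The log-likelihoods are made-up numbers, not results from the paper.
full_model = bic(log_likelihood=-1180.0, n_parameters=21, n_samples=2045)
subset_model = bic(log_likelihood=-1195.0, n_parameters=4, n_samples=2045)

preferred = "subset" if subset_model < full_model else "full"
print(f"full: {full_model:.1f}, subset: {subset_model:.1f}, preferred: {preferred}")
```

In this invented example the smaller model is preferred because its slightly worse fit is outweighed by the penalty saved on 17 parameters.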

Source
DOI: http://dx.doi.org/10.3389/fgene.2020.00431
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7242752
May 2020

Polygenic Hyperlipidemias and Coronary Artery Disease Risk.

Circ Genom Precis Med 2020 04 10;13(2):e002725. Epub 2020 Mar 10.

Institute for Molecular Medicine Finland, Helsinki Institute of Life Science (HiLIFE) (P.R., J.T.R., N.J.M., Y.F., J.L., C.B., I.S., T.K., A.S.H., P.P., E.W., T.T., M.P., A.P., S.R.), University of Helsinki, Helsinki, Finland.

Background: Hyperlipidemia is a highly heritable risk factor for coronary artery disease (CAD). While monogenic familial hypercholesterolemia associates with severely increased CAD risk, it remains less clear to what extent a high polygenic load of a large number of LDL (low-density lipoprotein) cholesterol (LDL-C) or triglyceride (TG)-increasing variants associates with increased CAD risk.

Methods: We derived polygenic risk scores (PRSs) with ≈6M variants separately for LDL-C and TG with weights from a UK Biobank-based genome-wide association study with ≈324K samples. We evaluated the impact of polygenic hypercholesterolemia and hypertriglyceridemia to lipid levels in 27 039 individuals from the National FINRISK Study (FINRISK) cohort and to CAD risk in 135 638 individuals (13 753 CAD cases) from the FinnGen project (FinnGen).

Results: In FINRISK, median LDL-C was 3.39 (95% CI, 3.38-3.40) mmol/L, and it ranged from 2.87 (95% CI, 2.82-2.94) to 3.78 (95% CI, 3.71-3.83) mmol/L between the lowest and highest 5% of the LDL-C PRS distribution. Median TG was 1.19 (95% CI, 1.18-1.20) mmol/L, ranging from 0.97 (95% CI, 0.94-1.00) to 1.55 (95% CI, 1.48-1.61) mmol/L with the TG PRS. In FinnGen, comparing the highest 5% of the PRS to the lowest 95%, CAD odds ratio was 1.36 (95% CI, 1.24-1.49) for the LDL-C PRS and 1.31 (95% CI, 1.19-1.43) for the TG PRS. These estimates were only slightly attenuated when adjusting for a CAD PRS (odds ratio, 1.26 [95% CI, 1.16-1.38] for LDL-C and 1.24 [95% CI, 1.13-1.36] for TG PRS).

Conclusions: The CAD risk associated with a high polygenic load for lipid-increasing variants was proportional to their impact on lipid levels and partially overlapping with a CAD PRS. In contrast with a PRS for CAD, the lipid PRSs point to known and directly modifiable risk factors providing additional guidance for clinical translation.
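The two quantities at the centre of this abstract, a polygenic risk score and an odds ratio comparing the top of the score distribution with the rest, can be sketched as below. This is a simplified illustration under stated assumptions: the score is a plain weighted sum of allele dosages, and the odds ratio is the unadjusted 2x2 estimate with a Wald interval, whereas the study's estimates may well come from adjusted models. The weights and counts are hypothetical and are not data from the paper.

```python
from math import exp, log, sqrt

def polygenic_risk_score(dosages, weights):
    """Weighted sum of effect-allele dosages (0, 1 or 2) across variants."""
    return sum(d * w for d, w in zip(dosages, weights))

def odds_ratio_with_ci(exposed_cases, exposed_controls,
                       unexposed_cases, unexposed_controls, z=1.96):
    """Unadjusted odds ratio with a Wald confidence interval (95% for z = 1.96)."""
    odds_ratio = (exposed_cases * unexposed_controls) / (exposed_controls * unexposed_cases)
    standard_error = sqrt(1 / exposed_cases + 1 / exposed_controls
                          + 1 / unexposed_cases + 1 / unexposed_controls)
    lower = exp(log(odds_ratio) - z * standard_error)
    upper = exp(log(odds_ratio) + z * standard_error)
    return odds_ratio, lower, upper

# Hypothetical per-variant weights and one individual's dosages.
weights = [0.12, -0.05, 0.30]
score = polygenic_risk_score([2, 1, 0], weights)

# Hypothetical counts: individuals in the top 5% of the score versus the remaining 95%.
estimate, ci_lower, ci_upper = odds_ratio_with_ci(
    exposed_cases=30, exposed_controls=70,
    unexposed_cases=400, unexposed_controls=1500,
)
print(f"PRS = {score:.2f}; OR = {estimate:.2f} ({ci_lower:.2f}-{ci_upper:.2f})")
```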

Source
DOI: http://dx.doi.org/10.1161/CIRCGEN.119.002725
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7176338
April 2020

Metagenomics of the faecal virome indicate a cumulative effect of enterovirus and gluten amount on the risk of coeliac disease autoimmunity in genetically at risk children: the TEDDY study.

Gut 2020 08 19;69(8):1416-1422. Epub 2019 Nov 19.

The Diabetes and Celiac Disease Unit, Department of Clinical Sciences, Lund University, Lund, Sweden.

Objective: Higher gluten intake, frequent gastrointestinal infections and adenovirus, enterovirus, rotavirus and reovirus have been proposed as environmental triggers for coeliac disease. However, it is not known whether an interaction exists between the ingested gluten amount and viral exposures in the development of coeliac disease. This study investigated whether distinct viral exposures alone or together with gluten increase the risk of coeliac disease autoimmunity (CDA) in genetically predisposed children.

Design: The Environmental Determinants of Diabetes in the Young study prospectively followed children carrying the HLA risk haplotypes DQ2 and/or DQ8 and constructed a nested case-control design. From this design, 83 CDA case-control pairs were identified. Median age of CDA was 31 months. Stool samples collected monthly up to the age of 2 years were analysed for virome composition by Illumina next-generation sequencing followed by comprehensive computational virus profiling.

Results: The cumulative number of stool enteroviral exposures between 1 and 2 years of age was associated with an increased risk for CDA. In addition, there was a significant interaction between cumulative stool enteroviral exposures and gluten consumption. The risk conferred by stool enteroviruses was increased in cases reporting higher gluten intake.

Conclusions: Frequent exposure to enterovirus between 1 and 2 years of age was associated with increased risk of CDA. The increased risk conferred by the interaction between enteroviruses and higher gluten intake indicates a cumulative effect of these factors in the development of CDA.

Source
DOI: http://dx.doi.org/10.1136/gutjnl-2019-319809
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7234892
August 2020

Data-driven characterization of molecular phenotypes across heterogeneous sample collections.

Nucleic Acids Res 2019 07;47(13):e76

Institute of Biomedicine, School of Medicine, University of Eastern Finland, Kuopio, Finland.

Existing large gene expression data repositories hold enormous potential to elucidate disease mechanisms, characterize changes in cellular pathways, and to stratify patients based on molecular profiles. To achieve this goal, integrative resources and tools are needed that allow comparison of results across datasets and data types. We propose an intuitive approach for data-driven stratifications of molecular profiles and benchmark our methodology using the dimensionality reduction algorithm t-distributed stochastic neighbor embedding (t-SNE) with multi-study and multi-platform data on hematological malignancies. Our approach enables assessing the contribution of biological versus technical variation to sample clustering, direct incorporation of additional datasets to the same low dimensional representation, comparison of molecular disease subtypes identified from separate t-SNE representations, and characterization of the obtained clusters based on pathway databases and additional data. In this manner, we performed an integrative analysis across multi-omics acute myeloid leukemia studies. Our approach indicated new molecular subtypes with differential survival and drug responsiveness among samples lacking fusion genes, including a novel myelodysplastic syndrome-like cluster and a cluster characterized with CEBPA mutations and differential activity of the S-adenosylmethionine-dependent DNA methylation pathway. In summary, integration across multiple studies can help to identify novel molecular disease subtypes and generate insight into disease biology.

Source
DOI: http://dx.doi.org/10.1093/nar/gkz281
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6648337
July 2019

Hemap: An Interactive Online Resource for Characterizing Molecular Phenotypes across Hematologic Malignancies.

Cancer Res 2019 05 2;79(10):2466-2479. Epub 2019 Apr 2.

Institute of Biomedicine, School of Medicine, University of Eastern Finland, Kuopio, Finland.

Large collections of genome-wide data can facilitate the characterization of disease states and subtypes, permitting pan-cancer analysis of molecular phenotypes and evaluation of disease context for new therapeutic approaches. We analyzed 9,544 transcriptomes from more than 30 hematologic malignancies, normal blood cell types, and cell lines, and showed that disease types could be stratified in a data-driven manner. We then identified cluster-specific pathway activity, new biomarkers, and drug target prioritization through interrogation of drug target databases. Using known vulnerabilities and available drug screens, we highlighted the importance of integrating molecular phenotype with drug target expression for prediction of drug responsiveness. Our analysis implicated expression level as an important indicator of venetoclax responsiveness and provided a rationale for its targeting in specific leukemia subtypes and multiple myeloma, linked several polycomb group proteins that could be targeted by small molecules (SFMBT1, CBX7, and EZH1) with chronic lymphocytic leukemia, and supported as a disease-specific target in acute myeloid leukemia. Through integration with proteomics data, we characterized target protein expression for pre-B leukemia immunotherapy candidates, including DPEP1. These molecular data can be explored using our publicly available interactive resource, Hemap, for expediting therapeutic innovations in hematologic malignancies. SIGNIFICANCE: This study describes a data resource for researching derailed cellular pathways and candidate drug targets across hematologic malignancies.

Source
DOI: http://dx.doi.org/10.1158/0008-5472.CAN-18-2970
May 2019

Bioinformatics Assembling and Assessment of Novel Coxsackievirus B1 Genome.

Methods Mol Biol 2018 ;1838:261-272

Computational Biology, Faculty of Medicine and Life Sciences, University of Tampere, Tampere, Finland.

The human microbiome project, via application of metagenomic next-generation sequencing techniques, has found surprisingly large and diverse amounts of microbial sequences across different body sites. A wave of investigators studying autoimmune-related diseases is designing case-control studies from birth to elucidate microbial associations and potential direct triggers. Sequencing analysis, considered big data as it typically includes millions of reads, is challenging, and virome profiling is particularly demanding and complex because viruses lack a pan-viral genomic signature. Impressively, thousands of complete virus genomes have been deposited, and these high-quality references are core components of virus profiling pipelines and databases. Still, it is commonly known that most viral sequences do not map to known viruses. Moreover, human viruses, particularly RNA groups, are notoriously heterogeneous due to high mutation rates. Here, we present the related assembly challenges and a series of bioinformatics steps that were applied in the construction of the complete consensus genome of a novel clinical isolate of Coxsackievirus B1. We further demonstrate our effort in calling mutations between the prototype Coxsackievirus B1 sequence from GenBank and serial clinical isolate genomes grown in cell culture.

Source
DOI: http://dx.doi.org/10.1007/978-1-4939-8682-8_18
April 2019

Nature-derived microbiota exposure as a novel immunomodulatory approach.

Future Microbiol 2018 06 17;13:737-744. Epub 2018 May 17.

Ecosystems and Environment Research Programme, Faculty of Biological and Environmental Sciences, University of Helsinki, Niemenkatu 73, 15140 Lahti, Finland.

Aim: Current attempts to modulate the human microbiota and immune responses are based on probiotics or human-derived bacterial transplants. We investigated microbial modulation by soil and plant-based material.

Materials & Methods: We performed a pilot study in which healthy adults were exposed to the varied microbial community of a soil- and plant-based material.

Results: The method was safe and feasible; exposure was associated with an increase in gut microbial diversity.

Conclusion: If these findings are reproduced in larger studies, nature-derived microbial exposure strategies could be further developed for testing their efficacy in the treatment and prevention of immune-mediated diseases.

Source
DOI: http://dx.doi.org/10.2217/fmb-2017-0286
June 2018

Changes in the lung bacteriome in relation to antipseudomonal therapy in children with cystic fibrosis.

Folia Microbiol (Praha) 2018 Mar 10;63(2):237-248. Epub 2017 Nov 10.

Department of Paediatrics, 2nd Faculty of Medicine, Charles University in Prague and University Hospital Motol, V Úvalu 84, 15006, Prague 5, Czech Republic.

The lung in cystic fibrosis (CF) is home to numerous pathogens that shorten the lives of patients. The aim of the present study was to assess changes in the lung bacteriome following antibiotic therapy targeting Pseudomonas aeruginosa in children with CF. The study included nine children (9-18 years) with CF who were treated for their chronic or intermittent positivity for Pseudomonas aeruginosa. The bacteriomes were determined in 16 pairs of sputa collected at the beginning and at the end of a course of intravenous antibiotic therapy via deep sequencing of the variable region 4 of the 16S rRNA gene, and the total bacterial load and selected specific pathogens were assessed using quantitative real-time PCR. The effect of antipseudomonal antibiotics was observable as a profound decrease in the total 16S rDNA load (p = 0.001) as well as in a broad range of individual taxa including Staphylococcus aureus (p = 0.03) and several members of the Streptococcus mitis group (S. oralis, S. mitis, and S. infantis) (p = 0.003). Improvements in forced expiratory volume (FEV) were associated with an increase in Granulicatella sp. (p = 0.004), whereas a negative association was noted between the total bacterial load and white blood cell count (p = 0.007). In conclusion, the data show how microbial communities differ in reaction to antipseudomonal treatment, suggesting that certain rare species may be associated with clinical parameters. Our work also demonstrates the utility of absolute quantification of bacterial load in addition to the 16S rDNA profiling.

Source
DOI: http://dx.doi.org/10.1007/s12223-017-0562-3
March 2018

Vipie: web pipeline for parallel characterization of viral populations from multiple NGS samples.

BMC Genomics 2017 05 15;18(1):378. Epub 2017 May 15.

Department of Pediatrics, 2nd Faculty of Medicine, Charles University and University Hospital Motol, V Úvalu 84, 150 06, Praha 5, Czech Republic.

Background: Next generation sequencing (NGS) technology allows laboratories to investigate virome composition in clinical and environmental samples in a culture-independent way. There is a need for bioinformatic tools capable of parallel processing of virome sequencing data by exactly identical methods: this is especially important in studies of multifactorial diseases, or in parallel comparison of laboratory protocols.

Results: We have developed a web-based application allowing direct upload of sequences from multiple virome samples using custom parameters. The samples are then processed in parallel using an identical protocol, and can be easily reanalyzed. The pipeline performs de-novo assembly, taxonomic classification of viruses as well as sample analyses based on user-defined grouping categories. Tables of virus abundance are produced from cross-validation by remapping the sequencing reads to a union of all observed reference viruses. In addition, read sets and reports are created after processing unmapped reads against known human and bacterial ribosome references. Secured interactive results are dynamically plotted with population and diversity charts, clustered heatmaps and a sortable and searchable abundance table.

Conclusions: The Vipie web application is a unique tool for multi-sample metagenomic analysis of viral data, producing searchable hits tables, interactive population maps, alpha diversity measures and clustered heatmaps that are grouped in applicable custom sample categories. Known references such as human genome and bacterial ribosomal genes are optionally removed from unmapped ('dark matter') reads. Secured results are accessible and shareable on modern browsers. Vipie is a freely available web-based tool whose code is open source.
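As an illustration of what an alpha diversity measure computes from a per-sample abundance table, the sketch below derives observed richness and the Shannon index from read counts. The abstract does not state which diversity measures Vipie reports, so the choice of these two is an assumption, and the virus names and counts are hypothetical.

```python
from math import log

def observed_richness(counts):
    """Number of taxa with at least one mapped read."""
    return sum(1 for c in counts if c > 0)

def shannon_index(counts):
    """Shannon diversity H = -sum(p * ln p) over taxa with non-zero counts."""
    total = sum(counts)
    proportions = [c / total for c in counts if c > 0]
    return -sum(p * log(p) for p in proportions)

# Hypothetical read counts per reference virus for two samples.
abundance = {
    "sample_A": {"virus_1": 10, "virus_2": 10, "virus_3": 10, "virus_4": 10},
    "sample_B": {"virus_1": 37, "virus_2": 1, "virus_3": 1, "virus_4": 1},
}

diversity = {
    sample: (observed_richness(list(hits.values())), shannon_index(list(hits.values())))
    for sample, hits in abundance.items()
}
for sample, (richness, shannon) in diversity.items():
    print(f"{sample}: richness={richness}, Shannon={shannon:.3f}")
```

Both samples contain the same four viruses, but the evenly distributed sample scores higher on the Shannon index than the one dominated by a single virus.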

Source
DOI: http://dx.doi.org/10.1186/s12864-017-3721-7
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5430618
May 2017

Imbalance of bacteriome profiles within the Finnish Diabetes Prediction and Prevention study: Parallel use of 16S profiling and virome sequencing in stool samples from children with islet autoimmunity and matched controls.

Pediatr Diabetes 2017 Nov 17;18(7):588-598. Epub 2016 Nov 17.

School of Medicine, Department of Virology, University of Tampere, Tampere, Finland.

Background: We set out to explore associations between the stool bacteriome profiles and early-onset islet autoimmunity, taking into account the interactions with the virus component of the microbiome.

Methods: Serial stool samples were longitudinally collected from 18 infants and toddlers with early-onset islet autoimmunity (median age 17.4 months) followed by type 1 diabetes, and 18 tightly matched controls from the Finnish Diabetes Prediction and Prevention (DIPP) cohort. Three stool samples were analyzed, taken 3, 6, and 9 months before the first detection of serum autoantibodies in the case child. The risk of islet autoimmunity was evaluated in relation to the composition of the bacteriome 16S rDNA profiles assessed by mass sequencing, and to the composition of DNA and RNA viromes.

Results: Four operational taxonomic units were significantly less abundant in children who later on developed islet autoimmunity as compared to controls, most markedly the species of Bacteroides vulgatus and Bifidobacterium bifidum. The alpha or beta diversity, or the taxonomic levels of bacterial phyla, classes or genera, showed no differences between cases and controls. A correlation analysis suggested a possible relation between CrAssphage signals and quantities of Bacteroides dorei. No apparent associations were seen between development of islet autoimmunity and sequences of yet unknown origin.

Conclusions: The results confirm previous findings that an imbalance within the prevalent Bacteroides genus is associated with islet autoimmunity. The detected quantitative relation of the novel "orphan" bacteriophage CrAssphage with a prevalent species of the Bacteroides genus may exemplify possible modifiers of the bacteriome.

Source
DOI: http://dx.doi.org/10.1111/pedi.12468
November 2017

Transcriptomics profiling of human SGBS adipogenesis.

Genom Data 2014 Dec 7;2:246-8. Epub 2014 Aug 7.

Institute of Biomedicine, School of Medicine, University of Eastern Finland, FI-70120 Kuopio, Finland.

Obesity is an ever-growing epidemic where tissue homeostasis is influenced by the differentiation of adipocytes that function in lipid metabolism, endocrine and inflammatory processes. While this differentiation process has been well-characterized in mice, limited data is available from human cells. Applying microarray expression profiling in the human SGBS pre-adipocyte cell line, we identified genes with differential expression during differentiation in combination with constraint-based modeling of metabolic pathway activity. Here we describe the experimental design and quality controls in detail for the gene expression and related results published by Galhardo et al. in Nucleic Acids Research 2014 associated with the data uploaded to NCBI Gene Expression Omnibus (GSE41352).

Source
DOI: http://dx.doi.org/10.1016/j.gdata.2014.07.004
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4535456
December 2014

ChIP-seq profiling of the active chromatin marker H3K4me3 and PPARγ, CEBPα and LXR target genes in human SGBS adipocytes.

Genom Data 2014 Dec 6;2:230-6. Epub 2014 Aug 6.

Institute of Biomedicine, School of Medicine, University of Eastern Finland, FI-70120 Kuopio, Finland.

Transcription factors (TFs) represent key factors to establish a cellular phenotype. It is known that several TFs could play a role in disease, yet less is known so far about how their targets overlap. We focused here on identifying the most highly induced TFs and their putative targets during human adipogenesis. Applying chromatin immunoprecipitation coupled with deep sequencing (ChIP-Seq) in the human SGBS pre-adipocyte cell line, we identified genes with binding sites in their vicinity for the three TFs studied, PPARγ, CEBPα and LXR. Here we describe the experimental design and quality controls in detail for the deep sequencing data and related results published by Galhardo et al. in Nucleic Acids Research 2014 [1] associated with the data uploaded to NCBI Gene Expression Omnibus (GSE41578).

Source
DOI: http://dx.doi.org/10.1016/j.gdata.2014.07.002
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4536030
December 2014

Systems genomics evaluation of the SH-SY5Y neuroblastoma cell line as a model for Parkinson's disease.

BMC Genomics 2014 Dec 20;15:1154. Epub 2014 Dec 20.

Luxembourg Centre for Systems Biomedicine, University of Luxembourg, Campus Belval, 7, avenue des Hauts-Fourneaux, L-4362 Esch-sur-Alzette, Luxembourg.

Background: The human neuroblastoma cell line, SH-SY5Y, is a commonly used cell line in studies related to neurotoxicity, oxidative stress, and neurodegenerative diseases. Although this cell line is often used as a cellular model for Parkinson's disease, the relevance of this cellular model in the context of Parkinson's disease (PD) and other neurodegenerative diseases has not yet been systematically evaluated.

Results: We have used a systems genomics approach to characterize the SH-SY5Y cell line using whole-genome sequencing to determine the genetic content of the cell line and used transcriptomics and proteomics data to determine molecular correlations. Further, we integrated genomic variants using a network analysis approach to evaluate the suitability of the SH-SY5Y cell line for perturbation experiments in the context of neurodegenerative diseases, including PD.

Conclusions: The systems genomics approach showed consistency across different biological levels (DNA, RNA and protein concentrations). Most of the genes belonging to the major Parkinson's disease pathways and modules were intact in the SH-SY5Y genome. Specifically, each analysed gene related to PD has at least one intact copy in SH-SY5Y. The disease-specific network analysis approach ranked the genetic integrity of SH-SY5Y as higher for PD than for Alzheimer's disease but lower than for Huntington's disease and Amyotrophic Lateral Sclerosis for loss of function perturbation experiments.

Source
DOI: http://dx.doi.org/10.1186/1471-2164-15-1154
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4367834
December 2014

Community-integrated omics links dominance of a microbial generalist to fine-tuned resource usage.

Nat Commun 2014 Nov 26;5:5603. Epub 2014 Nov 26.

Luxembourg Centre for Systems Biomedicine, University of Luxembourg, 7 Avenue des Hauts-Fourneaux, L-4362 Esch-sur-Alzette, Luxembourg.

Microbial communities are complex and dynamic systems that are primarily structured according to their members' ecological niches. To investigate how niche breadth (generalist versus specialist lifestyle strategies) relates to ecological success, we develop and apply an integrative workflow for the multi-omic analysis of oleaginous mixed microbial communities from a biological wastewater treatment plant. Time- and space-resolved coupled metabolomic and taxonomic analyses demonstrate that the community-wide lipid accumulation phenotype is associated with the dominance of the generalist bacterium Candidatus Microthrix spp. By integrating population-level genomic reconstructions (reflecting fundamental niches) with transcriptomic and proteomic data (realised niches), we identify finely tuned gene expression governing resource usage by Candidatus Microthrix parvicella over time. Moreover, our results indicate that the fluctuating environmental conditions constrain the accumulation of genetic variation in Candidatus Microthrix parvicella likely due to fitness trade-offs. Based on our observations, niche breadth has to be considered as an important factor for understanding the evolutionary processes governing (microbial) population sizes and structures in situ.

Source
DOI: http://dx.doi.org/10.1038/ncomms6603
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4263124
November 2014

Quantitative analysis of colony morphology in yeast.

Biotechniques 2014 Jan;56(1):18-27

Pacific Northwest Diabetes Research Institute, Seattle, WA; Molecular and Cellular Biology Program, University of Washington, Seattle, WA.

Microorganisms often form multicellular structures such as biofilms and structured colonies that can influence the organism's virulence, drug resistance, and adherence to medical devices. Phenotypic classification of these structures has traditionally relied on qualitative scoring systems that limit detailed phenotypic comparisons between strains. Automated imaging and quantitative analysis have the potential to improve the speed and accuracy of experiments designed to study the genetic and molecular networks underlying different morphological traits. For this reason, we have developed a platform that uses automated image analysis and pattern recognition to quantify phenotypic signatures of yeast colonies. Our strategy enables quantitative analysis of individual colonies, measured at a single time point or over a series of time-lapse images, as well as the classification of distinct colony shapes based on image-derived features. Phenotypic changes in colony morphology can be expressed as changes in feature space trajectories over time, thereby enabling the visualization and quantitative analysis of morphological development. To facilitate data exploration, results are plotted dynamically through an interactive Yeast Image Analysis web application (YIMAA; http://yimaa.cs.tut.fi) that integrates the raw and processed images across all time points, allowing exploration of the image-based features and principal components associated with morphological development.

Source
DOI: http://dx.doi.org/10.2144/000114123
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3996921
January 2014

POMO--Plotting Omics analysis results for Multiple Organisms.

BMC Genomics 2013 Dec 24;14:918. Epub 2013 Dec 24.

Luxembourg Centre for Systems Biomedicine, University of Luxembourg, Luxembourg, Luxembourg.

Background: Systems biology experiments studying different topics and organisms produce thousands of data values across different types of genomic data. Further, data mining analyses are yielding ranked and heterogeneous results and association networks distributed over the entire genome. The visualization of these results is often difficult and standalone web tools allowing for custom inputs and dynamic filtering are limited.

Results: We have developed POMO (http://pomo.cs.tut.fi), an interactive web-based application to visually explore omics data analysis results and associations in circular, network and grid views. The circular graph represents chromosome lengths as perimeter segments that form a reference outer ring, such as the cytobands for human. The inner arcs between nodes represent the uploaded network. Further, multiple annotation rings, for example a depiction of gene copy number changes, can be uploaded as text files and represented as bar, histogram or heatmap rings. POMO has built-in references for human, mouse, nematode, fly, yeast, zebrafish, rice, tomato, Arabidopsis, and Escherichia coli. In addition, POMO provides custom options that allow integrated plotting of unsupported strains or of associations between closely related species, such as human and mouse orthologs or two yeast wild types, studied together within a single analysis. The web application also supports interactive label and weight filtering. Every iteratively filtered result in POMO can be exported as an image file and a text file for sharing or for direct future input.

Conclusions: The POMO web application is a unique tool for omics data analysis, which can be used to visualize and filter genome-wide networks in the context of chromosomal locations as well as in multiple network layouts. With its several illustration and filtering options, the tool supports the analysis and visualization of heterogeneous omics data analysis association results for many organisms. POMO is freely available and does not require any installation or registration.
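
The circular view described in the Results, with chromosome lengths laid out as perimeter segments and network nodes placed at their genomic positions, can be illustrated with a small sketch. This is only an assumed layout calculation, not POMO's actual code: the function names, the absence of gaps between segments, and the toy chromosome lengths are all hypothetical.

```python
import math

def chromosome_arcs(lengths):
    """Map each chromosome to a (start, end) angle in radians.

    Arc size is proportional to chromosome length; segments are laid out
    in insertion order with no gap between them (an assumption).
    """
    total = sum(lengths.values())
    arcs = {}
    offset = 0
    for name, length in lengths.items():
        start = 2 * math.pi * offset / total
        offset += length
        end = 2 * math.pi * offset / total
        arcs[name] = (start, end)
    return arcs

def position_to_angle(arcs, lengths, chrom, position):
    """Linearly interpolate a genomic position to an angle on its arc."""
    start, end = arcs[chrom]
    return start + (end - start) * position / lengths[chrom]

def node_xy(angle, radius=1.0):
    """Convert an angle to x/y coordinates on the outer ring."""
    return (radius * math.cos(angle), radius * math.sin(angle))

# Hypothetical toy genome with two chromosomes.
toy_lengths = {"chr1": 100, "chr2": 300}
toy_arcs = chromosome_arcs(toy_lengths)
```

An inner arc between two associated loci would then be drawn between the `node_xy` coordinates of their two angles; annotation rings would reuse the same angles at a different radius.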

Source
DOI: http://dx.doi.org/10.1186/1471-2164-14-918
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3880012
December 2013

Integrated analysis of transcript-level regulation of metabolism reveals disease-relevant nodes of the human metabolic network.

Nucleic Acids Res 2014 Feb 5;42(3):1474-96. Epub 2013 Nov 5.

Life Sciences Research Unit, University of Luxembourg, 162a Avenue de la Faïencerie, L-1511 Luxembourg, Luxembourg, Biozentrum, Universität Basel and Swiss Institute of Bioinformatics, Klingelbergstrasse 50-70, 4056 Basel, Switzerland, Institute for Systems Biology, 401 Terry Avenue North, 98109-5234, Seattle, Washington, USA, Luxembourg Centre for Systems Biomedicine, University of Luxembourg, House of Biomedicine, 7 Avenue des Hauts-Fourneaux, L-4362 Esch/Alzette, Luxembourg and Department of Biotechnology and Molecular Medicine, A. I. Virtanen Institute for Molecular Sciences, University of Eastern Finland, FI-70211 Kuopio, Finland.

Metabolic diseases and comorbidities represent an ever-growing epidemic where multiple cell types impact tissue homeostasis. Here, the link between the metabolic and gene regulatory networks was studied through experimental and computational analysis. Integrating gene regulation data with a human metabolic network prompted the establishment of an open-source web portal, IDARE (Integrated Data Nodes of Regulation), for visualizing various gene-related data in the context of metabolic pathways. Motivated by the increasing availability of deep sequencing studies, we obtained ChIP-seq data from widely studied human umbilical vein endothelial cells. Interestingly, we found that association of metabolic genes with multiple transcription factors (TFs) enriched disease-associated genes. To demonstrate further extensions enabled by examining these networks together, constraint-based modeling was applied to data from human preadipocyte differentiation. In parallel, data on gene expression, genome-wide ChIP-seq profiles for peroxisome proliferator-activated receptor (PPAR) γ, CCAAT/enhancer binding protein (CEBP) α, liver X receptor (LXR) and H3K4me3 and microRNA target identification for miR-27a, miR-29a and miR-222 were collected. Disease-relevant key nodes, including mitochondrial glycerol-3-phosphate acyltransferase (GPAM), were exposed from metabolic pathways predicted to change activity by focusing on association with multiple regulators. In both cell types, our analysis reveals the convergence of microRNAs and TFs within the branched chain amino acid (BCAA) metabolic pathway, possibly providing an explanation for its downregulation in obese and diabetic conditions.

Source
DOI: http://dx.doi.org/10.1093/nar/gkt989
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3919568
February 2014

High-throughput tetrad analysis.

Nat Methods 2013 Jul 12;10(7):671-5. Epub 2013 May 12.

Institute for Systems Biology, Seattle, Washington, USA.

Tetrad analysis has been a gold-standard genetic technique for several decades. Unfortunately, the need to manually isolate, disrupt and space tetrads has relegated its application to small-scale studies and limited its integration with high-throughput DNA sequencing technologies. We have developed a rapid, high-throughput method, called barcode-enabled sequencing of tetrads (BEST), that uses (i) a meiosis-specific GFP fusion protein to isolate tetrads by FACS and (ii) molecular barcodes that are read during genotyping to identify spores derived from the same tetrad. Maintaining tetrad information allows accurate inference of missing genetic markers and full genotypes of missing (and presumably nonviable) individuals. An individual researcher was able to isolate over 3,000 yeast tetrads in 3 h, an output equivalent to that of almost 1 month of manual dissection. BEST is transferable to other microorganisms for which meiotic mapping is significantly more laborious.
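
The role of tetrad information in recovering missing genotypes can be illustrated with a short sketch. It groups spores by a shared barcode and, where three of the four spores were recovered, infers the fourth under the assumption of 2:2 Mendelian segregation at each marker. The data layout, the `"A"`/`"B"` allele encoding and the function names are hypothetical and are not taken from the BEST software; gene conversion and genotyping error are only flagged, not modeled.

```python
from collections import defaultdict

def group_by_barcode(spores):
    """Group (barcode, genotype) records by barcode.

    Spores sharing a barcode are assumed to derive from the same tetrad.
    """
    tetrads = defaultdict(list)
    for barcode, genotype in spores:
        tetrads[barcode].append(genotype)
    return dict(tetrads)

def infer_missing_spore(genotypes):
    """Infer the fourth spore from three recovered ones.

    Each genotype maps marker name -> 'A' or 'B' (parental alleles).
    Under 2:2 segregation the missing allele is whichever parental
    allele has been seen only once among the three recovered spores.
    """
    if len(genotypes) != 3:
        raise ValueError("expected exactly three recovered spores")
    inferred = {}
    for marker in genotypes[0]:
        count_a = [g[marker] for g in genotypes].count("A")
        if count_a == 2:
            inferred[marker] = "B"
        elif count_a == 1:
            inferred[marker] = "A"
        else:
            # 3:0 or 0:3 is inconsistent with 2:2 segregation
            # (for example gene conversion or a genotyping error).
            inferred[marker] = None
    return inferred

# Hypothetical example: tetrad "bc1" has three recovered spores.
example_spores = [
    ("bc1", {"m1": "A", "m2": "B"}),
    ("bc1", {"m1": "A", "m2": "A"}),
    ("bc2", {"m1": "B", "m2": "B"}),
    ("bc1", {"m1": "B", "m2": "B"}),
]
example_tetrads = group_by_barcode(example_spores)
missing = infer_missing_spore(example_tetrads["bc1"])
```

Without the barcode, the three `bc1` spores could not be told apart from the unrelated `bc2` spore, and the fourth genotype could not be reconstructed.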

Source
DOI: http://dx.doi.org/10.1038/nmeth.2479
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3696418
July 2013

Fastbreak: a tool for analysis and visualization of structural variations in genomic data.

EURASIP J Bioinform Syst Biol 2012 Oct 9;2012(1):15. Epub 2012 Oct 9.

Institute for Systems Biology, 401 Terry Avenue North, Seattle, WA, 98109-5234, USA.

Genomic studies are now being undertaken on thousands of samples requiring new computational tools that can rapidly analyze data to identify clinically important features. Inferring structural variations in cancer genomes from mate-paired reads is a combinatorially difficult problem. We introduce Fastbreak, a fast and scalable toolkit that enables the analysis and visualization of large amounts of data from projects such as The Cancer Genome Atlas.
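
As background on how mate-paired reads can carry evidence of structural variation, the sketch below labels read pairs whose mapped positions disagree with the expected library insert size. This is a generic illustration of discordant-pair classification, not Fastbreak's algorithm: the `ReadPair` record, the thresholds and the labels are assumptions, and read orientation is ignored for brevity.

```python
from collections import namedtuple

# Hypothetical minimal record for a mapped mate pair.
ReadPair = namedtuple("ReadPair", "chrom1 pos1 chrom2 pos2")

def classify_pair(pair, expected_insert=3000, tolerance=1000):
    """Label a mate pair by how its mapping compares with the library.

    expected_insert and tolerance are illustrative values, not
    parameters taken from any real sequencing library.
    """
    if pair.chrom1 != pair.chrom2:
        return "interchromosomal"  # consistent with a translocation
    distance = abs(pair.pos2 - pair.pos1)
    if distance > expected_insert + tolerance:
        return "too_far"  # consistent with a deletion between the mates
    if distance < expected_insert - tolerance:
        return "too_close"  # consistent with an insertion between the mates
    return "concordant"

def discordant_pairs(pairs, expected_insert=3000, tolerance=1000):
    """Return (pair, label) for every pair that is not concordant."""
    result = []
    for pair in pairs:
        label = classify_pair(pair, expected_insert, tolerance)
        if label != "concordant":
            result.append((pair, label))
    return result
```

A single discordant pair is weak evidence; in practice, calls are made from clusters of pairs that support the same breakpoint, which is where the combinatorial difficulty mentioned above arises.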

Source
DOI: http://dx.doi.org/10.1186/1687-4153-2012-15
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3605143
October 2012

EPEPT: a web service for enhanced P-value estimation in permutation tests.

BMC Bioinformatics 2011 Oct 24;12:411. Epub 2011 Oct 24.

Institute for Systems Biology, Seattle, WA, USA.

Background: In computational biology, permutation tests have become a widely used tool to assess the statistical significance of an event under investigation. However, the common way of computing the P-value, which expresses the statistical significance, requires a very large number of permutations when small (and thus interesting) P-values are to be accurately estimated. This is computationally expensive and often infeasible. Recently, we proposed an alternative estimator, which requires far fewer permutations compared to the standard empirical approach while still reliably estimating small P-values.

Results: The proposed P-value estimator has been enriched with additional functionalities and is made available to the general community through a public website and web service, called EPEPT. This means that the EPEPT routines can be accessed not only via a website, but also programmatically using any programming language that can interact with the web. Examples of web service clients in multiple programming languages can be downloaded. Additionally, EPEPT accepts data of various common experiment types used in computational biology. For these experiment types EPEPT first computes the permutation values and then performs the P-value estimation. Finally, the source code of EPEPT can be downloaded.

Conclusions: Different types of users, such as biologists, bioinformaticians and software engineers, can use the method in an appropriate and simple way.
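
The cost problem described in the Background can be made concrete with a sketch of the standard empirical estimator. It covers only the standard approach, not EPEPT's enhanced estimator; the function names, the difference-in-means test statistic and the `(count + 1) / (N + 1)` convention are assumptions chosen for illustration.

```python
import random

def mean(values):
    return sum(values) / len(values)

def empirical_p_value(group_a, group_b, n_permutations=1000, seed=0):
    """Standard empirical permutation P-value for a difference in means.

    Labels are shuffled n_permutations times; the P-value is the share
    of permuted statistics at least as extreme as the observed one,
    using (count + 1) / (N + 1) so the estimate is never exactly zero.
    """
    rng = random.Random(seed)
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    exceed = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        stat = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if stat >= observed:
            exceed += 1
    return (exceed + 1) / (n_permutations + 1)

def smallest_resolvable_p(n_permutations):
    """Lower bound on the P-value this estimator can report."""
    return 1 / (n_permutations + 1)
```

Because the estimate cannot fall below `1 / (N + 1)`, resolving a P-value near one in a million requires on the order of a million permutations per test, which is the expense that motivates estimators needing fewer permutations.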

Source
DOI: http://dx.doi.org/10.1186/1471-2105-12-411
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3277916
October 2011

SEQADAPT: an adaptable system for the tracking, storage and analysis of high throughput sequencing experiments.

BMC Bioinformatics 2010 Jul 14;11:377. Epub 2010 Jul 14.

Institute for Systems Biology, 1441 North 34th Street, Seattle, WA 98103, USA.

Background: High throughput sequencing has become an increasingly important tool for biological research. However, the existing software systems for managing and processing these data have not provided the flexible infrastructure that research requires.

Results: Existing software solutions provide static and well-established algorithms in a restrictive package. However, because high-throughput sequencing is a rapidly evolving field, such static approaches cannot readily adopt the latest advances and techniques, which researchers often require. We have used a loosely coupled, service-oriented infrastructure to develop SeqAdapt. This system streamlines data management and allows for rapid integration of novel algorithms. Our approach also allows computational biologists to focus on developing and applying new methods instead of writing boilerplate infrastructure code.

Conclusion: The system is based on the Addama service architecture and is available at our website as a demonstration web application, as an installable single download and as a collection of individual customizable services.

Source
DOI: http://dx.doi.org/10.1186/1471-2105-11-377
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2916924
July 2010