Publications by authors named "Gos Micklem"

41 Publications

Insights into olfactory ensheathing cell development from a laser-microdissection and transcriptome-profiling approach.

Glia 2020 12 28;68(12):2550-2584. Epub 2020 Aug 28.

Department of Physiology, Development and Neuroscience, University of Cambridge, Cambridge, UK.

Olfactory ensheathing cells (OECs) are neural crest-derived glia that ensheath bundles of olfactory axons from their peripheral origins in the olfactory epithelium to their central targets in the olfactory bulb. We took an unbiased laser microdissection and differential RNA-seq approach, validated by in situ hybridization, to identify candidate molecular mechanisms underlying mouse OEC development and differences with the neural crest-derived Schwann cells developing on other peripheral nerves. We identified 25 novel markers for developing OECs in the olfactory mucosa and/or the olfactory nerve layer surrounding the olfactory bulb, of which 15 were OEC-specific (that is, not expressed by Schwann cells). One pan-OEC-specific gene, Ptprz1, encodes a receptor-like tyrosine phosphatase that blocks oligodendrocyte differentiation. Mutant analysis suggests Ptprz1 may also act as a brake on OEC differentiation, and that its loss disrupts olfactory axon targeting. Overall, our results provide new insights into OEC development and the diversification of neural crest-derived glia.

Source
DOI: http://dx.doi.org/10.1002/glia.23870
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7116175
December 2020

The InterMine Android app: Cross-organism genomic data in your pocket.

F1000Res 2018 22;7:1837. Epub 2018 Nov 22.

Department of Genetics, University of Cambridge, Downing Street, Cambridge, CB2 3EH, UK.

InterMine is a data integration and analysis software system that has been used to create both inter-connected and stand-alone biological databases for the analysis of large and complex biological data sets. Together, the InterMine databases provide access to extensive data across multiple organisms. To provide more convenient access to these data from Android mobile devices, we have developed the InterMine app, an application that can be run on any Android mobile phone or tablet. The InterMine app provides a single interface for data access, search and exploration of the InterMine databases. It can be used to retrieve information on genes and gene lists, and their relatives across species. Simple searches can be used to access a range of data about a specific gene, while links to the InterMine databases provide access to more detailed report pages and gene list analysis tools. The InterMine app thus facilitates rapid exploration of genes across multiple organisms and kinds of data.

Source
DOI: http://dx.doi.org/10.12688/f1000research.17005.2
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6572867
November 2019

InterMineR: an R package for InterMine databases.

Bioinformatics 2019 09;35(17):3206-3207

Department of Genetics, University of Cambridge, Cambridge, UK.

Summary: InterMineR is a package designed to provide a flexible interface between the R programming environment and biological databases built using the InterMine platform. The package offers access to the flexible query builder and the library of term enrichment tools of the InterMine framework, as well as interoperability with other Bioconductor packages. This facilitates automation of data retrieval tasks as well as downstream analysis with existing statistical tools in the R environment.

Availability And Implementation: InterMineR is free and open source, released under the LGPL licence and available from the Bioconductor project and Github (https://bioconductor.org/packages/release/bioc/html/InterMineR.html, https://github.com/intermine/interMineR).

Supplementary Information: Supplementary data are available at Bioinformatics online.

Source
DOI: http://dx.doi.org/10.1093/bioinformatics/btz039
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6736411
September 2019

Comparative genomics of bdelloid rotifers: Insights from desiccating and nondesiccating species.

PLoS Biol 2018 04 24;16(4):e2004830. Epub 2018 Apr 24.

Department of Life Sciences, Imperial College London, Silwood Park Campus, Ascot, Berkshire, United Kingdom.

Bdelloid rotifers are a class of microscopic invertebrates that have existed for millions of years apparently without sex or meiosis. They inhabit a variety of temporary and permanent freshwater habitats globally, and many species are remarkably tolerant of desiccation. Bdelloids offer an opportunity to better understand the evolution of sex and recombination, but previous work has emphasised desiccation as the cause of several unusual genomic features in this group. Here, we present high-quality whole-genome sequences of 3 bdelloid species: Rotaria macrura and R. magnacalcarata, which are both desiccation intolerant, and Adineta ricciae, which is desiccation tolerant. In combination with the published assembly of A. vaga, which is also desiccation tolerant, we apply a comparative genomics approach to evaluate the potential effects of desiccation tolerance and asexuality on genome evolution in bdelloids. We find that ancestral tetraploidy is conserved among all 4 bdelloid species, but homologous divergence in obligately aquatic Rotaria genomes is unexpectedly low. This finding is contrary to current models regarding the role of desiccation in shaping bdelloid genomes. In addition, we find that homologous regions in A. ricciae are largely collinear and do not form palindromic repeats as observed in the published A. vaga assembly. Consequently, several features interpreted as genomic evidence for long-term ameiotic evolution are not general to all bdelloid species, even within the same genus. Finally, we substantiate previous findings of high levels of horizontally transferred nonmetazoan genes in both desiccating and nondesiccating bdelloid species and show that this unusual feature is not shared by other animal phyla, even those with desiccation-tolerant representatives. These comparisons call into question the proposed role of desiccation in mediating horizontal genetic transfer.

Source
DOI: http://dx.doi.org/10.1371/journal.pbio.2004830
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5916493
April 2018

ComplexViewer: visualization of curated macromolecular complexes.

Bioinformatics 2017 Nov;33(22):3673-3675

Wellcome Trust Centre for Cell Biology, School of Biological Sciences, University of Edinburgh, Edinburgh, EH9 3BF, UK.

Summary: Proteins frequently function as parts of complexes, assemblages of multiple proteins and other biomolecules, yet network visualizations usually only show proteins as parts of binary interactions. ComplexViewer visualizes interactions with more than two participants and thereby avoids the need to first expand these into multiple binary interactions. Furthermore, if binding regions between molecules are known then these can be displayed in the context of the larger complex.

Availability And Implementation: freely available under Apache version 2 license; EMBL-EBI Complex Portal: http://www.ebi.ac.uk/complexportal; Source code: https://github.com/MICommunity/ComplexViewer; Package: https://www.npmjs.com/package/complexviewer; http://biojs.io/d/complexviewer. Language: JavaScript; Web technology: Scalable Vector Graphics; Libraries: D3.js.

Contact: colin.combe@ed.ac.uk or juri.rappsilber@ed.ac.uk.

Source
DOI: http://dx.doi.org/10.1093/bioinformatics/btx497
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5870653
November 2017

Empirical Bayes method for reducing false discovery rates of correlation matrices with block diagonal structure.

BMC Bioinformatics 2017 Apr 12;18(1):213. Epub 2017 Apr 12.

CCBI, Department of Applied Mathematics and Theoretical Physics, University of Cambridge, Wilberforce Road, Cambridge, CB3 0WA, UK.

Background: Correlation matrices are important in inferring relationships and networks between regulatory or signalling elements in biological systems. With currently available technology sample sizes for experiments are typically small, meaning that these correlations can be difficult to estimate. At a genome-wide scale estimation of correlation matrices can also be computationally demanding.

Results: We develop an empirical Bayes approach to improve covariance estimates for gene expression, where we assume the covariance matrix takes a block diagonal form. Our method shows lower false discovery rates than existing methods on simulated data. Applied to a real data set from Bacillus subtilis, we demonstrate its ability to detect known regulatory units and interactions between them.

Conclusions: We demonstrate that, compared to existing methods, our method is able to find significant covariances and also to control false discovery rates, even when the sample size is small (n=10). The method can be used to find potential regulatory networks, and it may also be used as a pre-processing step for methods that calculate, for example, partial correlations, so enabling the inference of the causal and hierarchical structure of the networks.

Source
DOI: http://dx.doi.org/10.1186/s12859-017-1623-y
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5389176
April 2017
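
The block-diagonal assumption in the abstract above can be illustrated with a much simpler stand-in. The sketch below is not the paper's empirical Bayes estimator: it computes sample correlations and then shrinks the entries that link different blocks toward zero by a fixed weight. The function names, the `weight` parameter, the toy expression values and the block labels are all hypothetical and chosen only for illustration.

```python
from math import sqrt


def pearson(x, y):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def correlation_matrix(rows):
    """Correlation matrix for a list of variables (one row per gene)."""
    p = len(rows)
    return [
        [1.0 if i == j else pearson(rows[i], rows[j]) for j in range(p)]
        for i in range(p)
    ]


def shrink_to_blocks(corr, blocks, weight):
    """Shrink between-block correlations toward zero.

    Entries whose row and column share a block label are kept as they are;
    all other entries are multiplied by (1 - weight). This fixed-weight rule
    is an assumption made for illustration, not the published method.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    p = len(corr)
    return [
        [
            corr[i][j] if blocks[i] == blocks[j] else (1.0 - weight) * corr[i][j]
            for j in range(p)
        ]
        for i in range(p)
    ]


# Toy data: four genes measured in five samples, assumed to form two blocks.
genes = [
    [1.0, 2.0, 3.0, 4.0, 5.0],
    [2.0, 4.0, 6.0, 8.0, 10.0],
    [5.0, 3.0, 4.0, 1.0, 2.0],
    [4.0, 3.0, 5.0, 1.0, 2.0],
]
blocks = [0, 0, 1, 1]
corr = correlation_matrix(genes)
shrunk = shrink_to_blocks(corr, blocks, weight=0.5)
```

With small sample sizes, between-block sample correlations are mostly noise under the block-diagonal assumption, so damping them is one plausible way to reduce false discoveries; the paper estimates the amount of shrinkage from the data rather than fixing it by hand.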

Insights into electrosensory organ development, physiology and evolution from a lateral line-enriched transcriptome.

Elife 2017 03 27;6. Epub 2017 Mar 27.

Department of Physiology, Development and Neuroscience, University of Cambridge, Cambridge, United Kingdom.

The anamniote lateral line system, comprising mechanosensory neuromasts and electrosensory ampullary organs, is a useful model for investigating the developmental and evolutionary diversification of different organs and cell types. Zebrafish neuromast development is increasingly well understood, but neither zebrafish nor Xenopus is electroreceptive and our molecular understanding of ampullary organ development is rudimentary. We have used RNA-seq to generate a lateral line-enriched gene-set from late-larval paddlefish (Polyodon spathula). Validation of a subset reveals expression in developing ampullary organs of transcription factor genes critical for hair cell development, and genes essential for glutamate release at hair cell ribbon synapses, suggesting close developmental, physiological and evolutionary links between non-teleost electroreceptors and hair cells. We identify an ampullary organ-specific proneural transcription factor, and candidates for the voltage-sensing L-type Ca channel and rectifying K channel predicted from skate (cartilaginous fish) ampullary organ electrophysiology. Overall, our results illuminate ampullary organ development, physiology and evolution.

Source
DOI: http://dx.doi.org/10.7554/eLife.24197
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5429088
March 2017

Urinary Exosomes Contain MicroRNAs Capable of Paracrine Modulation of Tubular Transporters in Kidney.

Sci Rep 2017 01 17;7:40601. Epub 2017 Jan 17.

Department of Medical Genetics, Cambridge Institute for Medical Research, University of Cambridge, Cambridge, UK.

Exosomes derived from all nephron segments are present in human urine, where their functionality is incompletely understood. Most studies have focused on biomarker discovery rather than exosome function. Through sequencing we identified the miRNA repertoire of urinary exosomes from healthy volunteers; 276 mature miRNAs and 345 pre-miRNAs were identified (43%/7% of reads). Among the most abundant were members of the miR-10, miR-30 and let-7 families. Targets for the identified miRNAs were predicted using five different databases; genes encoding membrane transporters and their regulators were enriched, highlighting the possibility that these miRNAs could modulate key renal tubular functions in a paracrine manner. As proof of concept, cultured renal epithelial cells were exposed to urinary exosomes and cellular exosomal uptake was confirmed; thereafter, reduced levels of the potassium channel ROMK and kinases SGK1 and WNK1 were observed in a human collecting duct cell line, while SPAK was unaltered. In proximal tubular cells, mRNA levels of the amino acid transporter gene SLC38A2 were diminished and reflected in a significant decrement of its encoded protein SNAT2. Protein levels of the kinase SGK1 did not change. Thus we demonstrated a novel potential function for miRNA in urinary exosomes.

Source
DOI: http://dx.doi.org/10.1038/srep40601
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5240140
January 2017

ThaleMine: A Warehouse for Arabidopsis Data Integration and Discovery.

Plant Cell Physiol 2017 01;58(1):e4

Plant Genomics, J. Craig Venter Institute, Medical Center Dr, Rockville, MD, USA.

ThaleMine (https://apps.araport.org/thalemine/) is a comprehensive data warehouse that integrates a wide array of genomic information of the model plant Arabidopsis thaliana. The data collection currently includes the latest structural and functional annotation from the Araport11 update, the Col-0 genome sequence, RNA-seq and array expression, co-expression, protein interactions, homologs, pathways, publications, alleles, germplasm and phenotypes. The data are collected from a wide variety of public resources. Users can browse gene-specific data through Gene Report pages, identify and create gene lists based on experiments or indexed keywords, and run GO enrichment analysis to investigate the biological significance of selected gene sets. Developed by the Arabidopsis Information Portal project (Araport, https://www.araport.org/), ThaleMine uses the InterMine software framework, which builds well-structured data, and provides powerful data query and analysis functionality. The warehoused data can be accessed by users via graphical interfaces, as well as programmatically via web-services. Here we describe recent developments in ThaleMine including new features and extensions, and discuss future improvements. InterMine has been broadly adopted by the model organism research community including nematode, rat, mouse, zebrafish, budding yeast, the modENCODE project, as well as being used for human data. ThaleMine is the first InterMine developed for a plant model. As additional new plant InterMines are developed by the legume and other plant research communities, the potential of cross-organism integrative data analysis will be further enabled.

Source
DOI: http://dx.doi.org/10.1093/pcp/pcw200
January 2017

toxoMine: an integrated omics data warehouse for Toxoplasma gondii systems biology research.

Database (Oxford) 2015 30;2015:bav066. Epub 2015 Jun 30.

Department of Genetics, Albert Einstein College of Medicine, Bronx, NY 10461, USA; Department of Mathematical Sciences, Yeshiva University, New York, NY 10033, USA.

Toxoplasma gondii (T. gondii) is an obligate intracellular parasite that must monitor for changes in the host environment and respond accordingly; however, it is still not fully known which genetic or epigenetic factors are involved in regulating virulence traits of T. gondii. There are on-going efforts to elucidate the mechanisms regulating the stage transition process via the application of high-throughput epigenomics, genomics and proteomics techniques. Given the range of experimental conditions and the typical yield from such high-throughput techniques, a new challenge arises: how to effectively collect, organize and disseminate the generated data for subsequent data analysis. Here, we describe toxoMine, which provides a powerful interface to support sophisticated integrative exploration of high-throughput experimental data and metadata, providing researchers with a more tractable means toward understanding how genetic and/or epigenetic factors play a coordinated role in determining pathogenicity of T. gondii. As a data warehouse, toxoMine allows integration of high-throughput data sets with public T. gondii data. toxoMine is also able to execute complex queries involving multiple data sets with straightforward user interaction. Furthermore, toxoMine allows users to define their own parameters during the search process that gives users near-limitless search and query capabilities. The interoperability feature also allows users to query and examine data available in other InterMine systems, which would effectively augment the search scope beyond what is available to toxoMine. toxoMine complements the major community database ToxoDB by providing a data warehouse that enables more extensive integrative studies for T. gondii. Given all these factors, we believe it will become an indispensable resource to the greater infectious disease research community.

Source
DOI: http://dx.doi.org/10.1093/database/bav066
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4485433
March 2016

Cross-organism analysis using InterMine.

Genesis 2015 Aug 8;53(8):547-60. Epub 2015 Jul 8.

Cambridge Systems Biology Centre, University of Cambridge, Cambridge, United Kingdom.

InterMine is a data integration warehouse and analysis software system developed for large and complex biological data sets. Designed for integrative analysis, it can be accessed through a user-friendly web interface. For bioinformaticians, extensive web services as well as programming interfaces for most common scripting languages support access to all features. The web interface includes a useful identifier look-up system, and both simple and sophisticated search options. Interactive results tables enable exploration, and data can be filtered, summarized, and browsed. A set of graphical analysis tools provide a rich environment for data exploration including statistical enrichment of sets of genes or other entities. InterMine databases have been developed for the major model organisms, budding yeast, nematode worm, fruit fly, zebrafish, mouse, and rat together with a newly developed human database. Here, we describe how this has facilitated interoperation and development of cross-organism analysis tools and reports. InterMine as a data exploration and analysis tool is also described. All the InterMine-based systems described in this article are resources freely available to the scientific community.

Source
DOI: http://dx.doi.org/10.1002/dvg.22869
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4545681
August 2015
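
The "statistical enrichment of sets of genes" mentioned in the abstract above can be sketched as a one-sided hypergeometric test. The abstract does not say which test InterMine applies, so the choice of test, the function name and the parameter names below are assumptions made for illustration.

```python
from math import comb


def hypergeom_enrichment_p(population, annotated, sample, hits):
    """Probability of seeing at least `hits` annotated genes in a gene list.

    population: number of genes in the background set
    annotated:  number of background genes carrying the term of interest
    sample:     number of genes in the user's list
    hits:       number of genes in the list that carry the term
    """
    upper = min(annotated, sample)
    total = comb(population, sample)
    # Sum the upper tail of the hypergeometric distribution.
    tail = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(hits, upper + 1)
    )
    return tail / total


# Toy numbers: 3 of 10 background genes carry the term,
# and all 3 genes in the list carry it.
p_value = hypergeom_enrichment_p(population=10, annotated=3, sample=3, hits=3)
```

In practice one p-value is computed per term, so a multiple-testing correction would normally follow; that step is omitted here.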

Expression of multiple horizontally acquired genes is a hallmark of both vertebrate and invertebrate genomes.

Genome Biol 2015 Mar 13;16:50. Epub 2015 Mar 13.

Background: A fundamental concept in biology is that heritable material, DNA, is passed from parent to offspring, a process called vertical gene transfer. An alternative mechanism of gene acquisition is through horizontal gene transfer (HGT), which involves movement of genetic material between different species. HGT is well-known in single-celled organisms such as bacteria, but its existence in higher organisms, including animals, is less well established, and is controversial in humans.

Results: We have taken advantage of the recent availability of a sufficient number of high-quality genomes and associated transcriptomes to carry out a detailed examination of HGT in 26 animal species (10 primates, 12 flies and four nematodes) and a simplified analysis in a further 14 vertebrates. Genome-wide comparative and phylogenetic analyses show that HGT in animals typically gives rise to tens or hundreds of active 'foreign' genes, largely concerned with metabolism. Our analyses suggest that while fruit flies and nematodes have continued to acquire foreign genes throughout their evolution, humans and other primates have gained relatively few since their common ancestor. We also resolve the controversy surrounding previous evidence of HGT in humans and provide at least 33 new examples of horizontally acquired genes.

Conclusions: We argue that HGT has occurred, and continues to occur, on a previously unsuspected scale in metazoans and is likely to have contributed to biochemical diversification during animal evolution.

Source
DOI: http://dx.doi.org/10.1186/s13059-015-0607-3
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4358723
March 2015

Araport: the Arabidopsis information portal.

Nucleic Acids Res 2015 Jan 20;43(Database issue):D1003-9. Epub 2014 Nov 20.

Plant Genomics, J. Craig Venter Institute, Rockville, MD 20850, USA.

The Arabidopsis Information Portal (https://www.araport.org) is a new online resource for plant biology research. It houses the Arabidopsis thaliana genome sequence and associated annotation. It was conceived as a framework that allows the research community to develop and release 'modules' that integrate, analyze and visualize Arabidopsis data that may reside at remote sites. The current implementation provides an indexed database of core genomic information. These data are made available through feature-rich web applications that provide search, data mining, and genome browser functionality, and also by bulk download and web services. Araport uses software from the InterMine and JBrowse projects to expose curated data from TAIR, GO, BAR, EBI, UniProt, PubMed and EPIC CoGe. The site also hosts 'science apps,' developed as prototypes for community modules that use dynamic web pages to present data obtained on-demand from third-party servers via RESTful web services. Designed for sustainability, the Arabidopsis Information Portal strategy exploits existing scientific computing infrastructure, adopts a practical mixture of data integration technologies and encourages collaborative enhancement of the resource by its user community.

Source
DOI: http://dx.doi.org/10.1093/nar/gku1200
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4383980
January 2015

esyN: network building, sharing and publishing.

PLoS One 2014 2;9(9):e106035. Epub 2014 Sep 2.

Cambridge Systems Biology Centre, University of Cambridge, Cambridge, United Kingdom; Department of Biochemistry, University of Cambridge, Cambridge, United Kingdom.

The construction and analysis of networks is increasingly widespread in biological research. We have developed esyN ("easy networks") as a free and open source tool to facilitate the exchange of biological network models between researchers. esyN acts as a searchable database of user-created networks from any field. We have developed a simple companion web tool that enables users to view and edit networks using data from publicly available databases. Both normal interaction networks (graphs) and Petri nets can be created. In addition to its basic tools, esyN contains a number of logical templates that can be used to create models more easily. The ability to use previously published models as building blocks makes esyN a powerful tool for the construction of models and network graphs. Users are able to save their own projects online and share them either publicly or with a list of collaborators. The latter can be given the ability to edit the network themselves, allowing online collaboration on network construction. esyN is designed to facilitate unrestricted exchange of this increasingly important type of biological information. Ultimately, the aim of esyN is to bring the advantages of Open Source software development to the construction of biological networks.

Source
PLOS: http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0106035
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4152123
May 2015

InterMine: extensive web services for modern biology.

Nucleic Acids Res 2014 Jul 21;42(Web Server issue):W468-72. Epub 2014 Apr 21.

Department of Genetics, University of Cambridge, Downing Street, Cambridge, CB2 3EH, UK and Cambridge Systems Biology Centre, University of Cambridge, Tennis Court Road, Cambridge, CB2 1QR, UK

InterMine (www.intermine.org) is a biological data warehousing system providing extensive automatically generated and configurable RESTful web services that underpin the web interface and can be re-used in many other applications: to find and filter data; export it in a flexible and structured way; to upload, use, manipulate and analyze lists; to provide services for flexible retrieval of sequence segments, and for other statistical and analysis tools. Here we describe these features and discuss how they can be used separately or in combinations to support integrative and comparative analysis.

Source
DOI: http://dx.doi.org/10.1093/nar/gku301
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4086141
July 2014
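
As a rough illustration of the RESTful access described in the abstract above, the sketch below builds a request URL for a query and makes no network call. It assumes the commonly documented `query/results` endpoint, which takes a PathQuery XML document in a `query` parameter and an output `format`. The service root `https://example.org/mine/service` is a placeholder, and the helper name `build_query_url` is hypothetical.

```python
from urllib.parse import parse_qs, urlencode, urlsplit
from xml.sax.saxutils import quoteattr


def build_query_url(service_root, views, path, op, value, fmt="json"):
    """Return a GET URL for a single-constraint query.

    The XML layout (a <query> element with a `view` attribute and one
    <constraint> child) follows the PathQuery format as understood here;
    check the target mine's documentation before relying on it.
    """
    xml = (
        f'<query model="genomic" view={quoteattr(" ".join(views))}>'
        f"<constraint path={quoteattr(path)} op={quoteattr(op)} "
        f"value={quoteattr(value)}/>"
        "</query>"
    )
    params = urlencode({"query": xml, "format": fmt})
    return f"{service_root.rstrip('/')}/query/results?{params}"


# Placeholder service root and an example gene symbol, both for illustration.
url = build_query_url(
    "https://example.org/mine/service/",
    views=["Gene.symbol", "Gene.primaryIdentifier"],
    path="Gene.symbol",
    op="=",
    value="zen",
)

# Decoding the query string recovers the XML that would be sent.
decoded = parse_qs(urlsplit(url).query)
```

Building the URL separately from sending it keeps the example free of network access and makes the encoded query easy to inspect.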

Identification of microRNAs in the coral Stylophora pistillata.

PLoS One 2014 21;9(3):e91101. Epub 2014 Mar 21.

Red Sea Research Center, King Abdullah University of Science and Technology (KAUST), Thuwal, Saudi Arabia.

Coral reefs are major contributors to marine biodiversity. However, they are in rapid decline due to global environmental changes such as rising sea surface temperatures, ocean acidification, and pollution. Genomic and transcriptomic analyses have broadened our understanding of coral biology, but a study of the microRNA (miRNA) repertoire of corals is missing. miRNAs constitute a class of small non-coding RNAs of ∼22 nt in size that play crucial roles in development, metabolism, and stress response in plants and animals alike. In this study, we examined the coral Stylophora pistillata for the presence of miRNAs and the corresponding core protein machinery required for their processing and function. Based on small RNA sequencing, we present evidence for 31 bona fide microRNAs, 5 of which (miR-100, miR-2022, miR-2023, miR-2030, and miR-2036) are conserved in other metazoans. Homologues of Argonaute, Piwi, Dicer, Drosha, Pasha, and HEN1 were identified in the transcriptome of S. pistillata based on strong sequence conservation with known RNAi proteins, with additional support derived from phylogenetic trees. Examination of putative miRNA gene targets indicates potential roles in development, metabolism, immunity, and biomineralisation for several of the microRNAs. Here, we present first evidence of a functional RNAi machinery and five conserved miRNAs in S. pistillata, implying that miRNAs play a role in organismal biology of scleractinian corals. Analysis of predicted miRNA target genes in S. pistillata suggests potential roles of miRNAs in symbiosis and coral calcification. Given the importance of miRNAs in regulating gene expression in other metazoans, further expression analyses of small non-coding RNAs in transcriptional studies of corals should be informative about miRNA-affected processes and pathways.

Source
PLOS: http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0091101
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3962355
January 2015

BioJS DAGViewer: A reusable JavaScript component for displaying directed graphs.

F1000Res 2014 13;3:51. Epub 2014 Feb 13.

Department of Genetics and Cambridge Systems Biology Centre, Cambridge University, Cambridge, CB2 3EH, UK.

Summary: The DAGViewer BioJS component is a reusable JavaScript component made available as part of the BioJS project and intended to be used to display graphs of structured data, with a particular emphasis on Directed Acyclic Graphs (DAGs). It enables users to embed representations of graphs of data, such as ontologies or phylogenetic trees, in hyper-text documents (HTML). This component is generic, since it is capable (given the appropriate configuration) of displaying any kind of data that is organised as a graph. The features of this component which are useful for examining and filtering large and complex graphs are described.

Availability: http://github.com/alexkalderimis/dag-viewer-biojs; http://github.com/biojs/biojs; http://dx.doi.org/10.5281/zenodo.8303.

Source
DOI: http://dx.doi.org/10.12688/f1000research.3-51.v1
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3945768
March 2014

Integrating microRNA and mRNA expression profiling in Symbiodinium microadriaticum, a dinoflagellate symbiont of reef-building corals.

BMC Genomics 2013 Oct 12;14:704. Epub 2013 Oct 12.

Red Sea Research Center, King Abdullah University of Science and Technology (KAUST), 4700 KAUST, Thuwal 23955, Saudi Arabia.

Background: Animal and plant genomes produce numerous small RNAs (smRNAs) that regulate gene expression post-transcriptionally affecting metabolism, development, and epigenetic inheritance. In order to characterize the repertoire of endogenous smRNAs and potential gene targets in dinoflagellates, we conducted smRNA and mRNA expression profiling over 9 experimental treatments of cultures from Symbiodinium microadriaticum, a photosynthetic symbiont of scleractinian corals.

Results: We identified a set of 21 novel smRNAs that share stringent key features with functional microRNAs from other model organisms. smRNAs were predicted independently over all 9 treatments and their putative gene targets were identified. We found 1,720 animal-like target sites in the 3'UTRs of 12,858 mRNAs and 19 plant-like target sites in 51,917 genes. We assembled a transcriptome of 58,649 genes and determined differentially expressed genes (DEGs) between treatments. Heat stress was found to produce a much larger number of DEGs than other treatments, which yielded only a few DEGs. Analysis of DEGs also revealed that minicircle-encoded photosynthesis proteins seem to be common targets of transcriptional regulation. Furthermore, we identified the core RNAi protein machinery in Symbiodinium.

Conclusions: Integration of smRNA and mRNA expression profiling identified a variety of processes that could be under microRNA control, e.g. protein modification, signaling, gene expression, and response to DNA damage. Given that Symbiodinium seems to have a paucity of transcription factors and differentially expressed genes, identification and characterization of its smRNA repertoire establishes the possibility of a range of gene regulatory mechanisms in dinoflagellates acting post-transcriptionally.

Source
DOI: http://dx.doi.org/10.1186/1471-2164-14-704
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3853145
October 2013

metabolicMine: an integrated genomics, genetics and proteomics data warehouse for common metabolic disease research.

Database (Oxford) 2013 9;2013:bat060. Epub 2013 Aug 9.

Cambridge Systems Biology Centre, University of Cambridge, Cambridge CB2 1QR, UK.

Common metabolic and endocrine diseases such as diabetes affect millions of people worldwide and have a major health impact, frequently leading to complications and mortality. In a search for better prevention and treatment, there is ongoing research into the underlying molecular and genetic bases of these complex human diseases, as well as into the links with risk factors such as obesity. Although an increasing number of relevant genomic and proteomic data sets have become available, the quantity and diversity of the data make their efficient exploitation challenging. Here, we present metabolicMine, a data warehouse with a specific focus on the genomics, genetics and proteomics of common metabolic diseases. Developed in collaboration with leading UK metabolic disease groups, metabolicMine integrates data sets from a range of experiments and model organisms alongside tools for exploring them. The current version brings together information covering genes, proteins, orthologues, interactions, gene expression, pathways, ontologies, diseases, genome-wide association studies and single nucleotide polymorphisms. Although the emphasis is on human data, key data sets from mouse and rat are included. These are complemented by interoperation with the RatMine rat genomics database, with a corresponding mouse version under development by the Mouse Genome Informatics (MGI) group. The web interface contains a number of features including keyword search, a library of Search Forms, the QueryBuilder and list analysis tools. This provides researchers with many different ways to analyse, view and flexibly export data. Programming interfaces and automatic code generation in several languages are supported, and many of the features of the web interface are available through web services. The combination of diverse data sets integrated with analysis tools and a powerful query system makes metabolicMine a valuable research resource. 
The web interface makes it accessible to first-time users, whereas the Application Programming Interface (API) and web services provide convenient data access and tools for bioinformaticians. metabolicMine is freely available online at http://www.metabolicmine.org. Database URL: http://www.metabolicmine.org.

Source
DOI: http://dx.doi.org/10.1093/database/bat060
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4438919
October 2013

InterMOD: integrated data and tools for the unification of model organism research.

Sci Rep 2013;3:1802

Cambridge Systems Biology Centre, University of Cambridge, Cambridge CB2 1QR, United Kingdom.

Model organisms are widely used for understanding basic biology, and have significantly contributed to the study of human disease. In recent years, genomic analysis has provided extensive evidence of widespread conservation of gene sequence and function amongst eukaryotes, allowing insights from model organisms to help decipher gene function in a wider range of species. The InterMOD consortium is developing an infrastructure based around the InterMine data warehouse system to integrate genomic and functional data from a number of key model organisms, leading the way to improved cross-species research. So far including budding yeast, nematode worm, fruit fly, zebrafish, rat and mouse, the project has set up data warehouses, synchronized data models, and created analysis tools and links between data from different species. The project unites a number of major model organism databases, improving both the consistency and accessibility of comparative research, to the benefit of the wider scientific community.

Source
DOI: http://dx.doi.org/10.1038/srep01802
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3647165
February 2014

The 3rd DBCLS BioHackathon: improving life science data integration with Semantic Web technologies.

J Biomed Semantics 2013 Feb 11;4(1). Epub 2013 Feb 11.

Database Center for Life Science, Research Organization of Information and Systems, 2-11-16, Yayoi, Bunkyo-ku, Tokyo, 113-0032, Japan.

Background: BioHackathon 2010 was the third in a series of meetings hosted by the Database Center for Life Science (DBCLS) in Tokyo, Japan. The overall goal of the BioHackathon series is to improve the quality and accessibility of life science research data on the Web by bringing together representatives from public databases, analytical tool providers, and cyber-infrastructure researchers to jointly tackle important challenges in the area of in silico biological research.

Results: The theme of BioHackathon 2010 was the 'Semantic Web', and all attendees gathered with the shared goal of producing Semantic Web data from their respective resources, and/or consuming or interacting with those data using their tools and interfaces. We discussed topics including guidelines for designing semantic data and interoperability of resources. We consequently developed tools and clients for analysis and visualization.

Conclusion: We provide a meeting report from BioHackathon 2010, in which we describe the discussions, decisions, and breakthroughs made as we moved towards compliance with Semantic Web technologies - from source provider, through middleware, to the end-consumer.

Source
DOI: http://dx.doi.org/10.1186/2041-1480-4-6
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3598643
February 2013

Activity of a heptad of transcription factors is associated with stem cell programs and clinical outcome in acute myeloid leukemia.

Blood 2013 Mar 17;121(12):2289-300. Epub 2013 Jan 17.

Lowy Cancer Research Centre and the Prince of Wales Clinical School, University of New South Wales, Sydney, Australia.

Aberrant transcriptional programs in combination with abnormal proliferative signaling drive leukemic transformation. These programs operate in normal hematopoiesis where they are involved in hematopoietic stem cell (HSC) proliferation and maintenance. Ets Related Gene (ERG) is a component of normal and leukemic stem cell signatures and high ERG expression is a risk factor for poor prognosis in acute myeloid leukemia (AML). However, mechanisms that underlie ERG expression in AML and how its expression relates to leukemic stemness are unknown. We report that ERG expression in AML is associated with activity of the ERG promoters and +85 stem cell enhancer and a heptad of transcription factors that combinatorially regulate genes in HSCs. Gene expression signatures derived from ERG promoter-stem cell enhancer and heptad activity are associated with clinical outcome when ERG expression alone fails. We also show that the heptad signature is associated with AMLs that lack somatic mutations in NPM1 and confers an adverse prognosis when associated with FLT3 mutations. Taken together, these results suggest that transcriptional regulators cooperate to establish or maintain primitive stem cell-like signatures in leukemic cells and that the underlying pattern of somatic mutations contributes to the development of these signatures and modulates their influence on clinical outcome.

Source
DOI: http://dx.doi.org/10.1182/blood-2012-07-446120
March 2013

Biochemical diversification through foreign gene expression in bdelloid rotifers.

PLoS Genet 2012;8(11):e1003035. Epub 2012 Nov 15.

Department of Chemical Engineering and Biotechnology, University of Cambridge, Cambridge, United Kingdom.

Bdelloid rotifers are microinvertebrates with unique characteristics: they have survived tens of millions of years without sexual reproduction; they withstand extreme desiccation by undergoing anhydrobiosis; and they tolerate very high levels of ionizing radiation. Recent evidence suggests that subtelomeric regions of the bdelloid genome contain sequences originating from other organisms by horizontal gene transfer (HGT), of which some are known to be transcribed. However, the extent to which foreign gene expression plays a role in bdelloid physiology is unknown. We address this in the first large scale analysis of the transcriptome of the bdelloid Adineta ricciae: cDNA libraries from hydrated and desiccated bdelloids were subjected to massively parallel sequencing and assembled transcripts compared against the UniProtKB database by blastx to identify their putative products. Of ~29,000 matched transcripts, ~10% were inferred from blastx matches to be horizontally acquired, mainly from eubacteria but also from fungi, protists, and algae. After allowing for possible sources of error, the rate of HGT is at least 8%-9%, a level significantly higher than other invertebrates. We verified their foreign nature by phylogenetic analysis and by demonstrating linkage of foreign genes with metazoan genes in the bdelloid genome. Approximately 80% of horizontally acquired genes expressed in bdelloids code for enzymes, and these represent 39% of enzymes in identified pathways. Many enzymes encoded by foreign genes enhance biochemistry in bdelloids compared to other metazoans, for example, by potentiating toxin degradation or generation of antioxidants and key metabolites. They also supplement, and occasionally potentially replace, existing metazoan functions. Bdelloid rotifers therefore express horizontally acquired genes on a scale unprecedented in animals, and foreign genes make a profound contribution to their metabolism. 
This represents a potential mechanism for ancient asexuals to adapt rapidly to changing environments and thereby persist over long evolutionary time periods in the absence of sex.

Source
DOI: http://dx.doi.org/10.1371/journal.pgen.1003035
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3499245
May 2013

InterMine: a flexible data warehouse system for the integration and analysis of heterogeneous biological data.

Bioinformatics 2012 Dec 27;28(23):3163-5. Epub 2012 Sep 27.

Department of Genetics, University of Cambridge, Cambridge CB2 3EH, UK.

Summary: InterMine is an open-source data warehouse system that facilitates the building of databases with complex data integration requirements and a need for a fast customizable query facility. Using InterMine, large biological databases can be created from a range of heterogeneous data sources, and the extensible data model allows for easy integration of new data types. The analysis tools include a flexible query builder, genomic region search and a library of 'widgets' performing various statistical analyses. The results can be exported in many commonly used formats. InterMine is a fully extensible framework where developers can add new tools and functionality. Additionally, there is a comprehensive set of web services, for which client libraries are provided in five commonly used programming languages.

Availability: Freely available from http://www.intermine.org under the LGPL license.

Contact: g.micklem@gen.cam.ac.uk

Supplementary Information: Supplementary data are available at Bioinformatics online.
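
The web services mentioned in the Summary accept queries expressed as XML. The following is a minimal sketch of how such a query might be assembled and turned into a request URL using only the Python standard library. The XML layout (a `query` element with `model` and `view` attributes and `constraint` children) and the `/query/results` path reflect one reading of the InterMine web service conventions and should be checked against the official documentation; the service root `http://example.org/mine/service` and the helper names `build_path_query` and `results_url` are hypothetical. No network request is made.

```python
from urllib.parse import urlencode
from xml.etree import ElementTree as ET


def build_path_query(root_class, views, constraints, model="genomic"):
    """Return a PathQuery-style XML string (layout is an assumption).

    views: attribute paths relative to root_class, e.g. "symbol".
    constraints: (path, op, value) tuples, paths relative to root_class.
    """
    query = ET.Element("query", {
        "model": model,
        "view": " ".join(f"{root_class}.{v}" for v in views),
    })
    for path, op, value in constraints:
        ET.SubElement(query, "constraint", {
            "path": f"{root_class}.{path}",
            "op": op,
            "value": value,
        })
    return ET.tostring(query, encoding="unicode")


def results_url(service_root, query_xml, fmt="json"):
    """Build a results URL; the '/query/results' path is an assumption."""
    params = urlencode({"query": query_xml, "format": fmt})
    return f"{service_root.rstrip('/')}/query/results?{params}"


# Hypothetical example: identifiers and symbols of genes from one organism.
xml = build_path_query(
    "Gene",
    ["primaryIdentifier", "symbol"],
    [("organism.name", "=", "Saccharomyces cerevisiae")],
)
url = results_url("http://example.org/mine/service", xml)  # placeholder root
print(xml)
print(url)
```

In practice the client libraries mentioned in the Summary would hide these details; the sketch only illustrates the kind of request they construct.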

Source
DOI: http://dx.doi.org/10.1093/bioinformatics/bts577
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3516146
December 2012

Multiple functionally divergent and conserved copies of alpha tubulin in bdelloid rotifers.

BMC Evol Biol 2012 Aug 17;12:148. Epub 2012 Aug 17.

Department of Life Sciences, Imperial College London, Silwood Park Campus, Ascot, Berkshire, SL5 7PY, UK.

Background: Bdelloid rotifers are microscopic animals that have apparently survived without sex for millions of years and are able to survive desiccation at all life stages through a process called anhydrobiosis. Both of these characteristics are believed to have played a role in shaping several unusual features of bdelloid genomes discovered in recent years. Studies into the impact of asexuality and anhydrobiosis on bdelloid genomes have focused on understanding gene copy number. Here we investigate copy number and sequence divergence in alpha tubulin. Alpha tubulin is conserved and normally present in low copy numbers in animals, but multiplication of alpha tubulin copies has occurred in animals adapted to extreme environments, such as cold-adapted Antarctic fish. Using cloning and sequencing we compared alpha tubulin copy variation in four species of bdelloid rotifers and four species of monogonont rotifers, which are facultatively sexual and cannot survive desiccation as adults. Results were verified using transcriptome data from one bdelloid species, Adineta ricciae.

Results: In common with the typical pattern for animals, monogonont rotifers contain either one or two copies of alpha tubulin, but bdelloid species contain between 11 and 13 different copies, distributed across five classes. Approximately half of the copies form a highly conserved group that varies by only 1.1% amino acid pairwise divergence with each other and with the monogonont copies. The other copies have divergent amino acid sequences that evolved significantly faster between classes than within them, relative to synonymous changes, and vary in predicted biochemical properties. Copies of each class were expressed under the laboratory conditions used to construct the transcriptome.

Conclusions: Our findings are consistent with recent evidence that bdelloids are degenerate tetraploids and that functional divergence of ancestral copies of genes has occurred, but show how further duplication events in the ancestor of bdelloids led to proliferation in both conserved and functionally divergent copies of this gene.
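
The Results report amino acid pairwise divergence among gene copies. As an illustration of what such a figure measures, the sketch below computes the uncorrected proportion of differing sites (p-distance) for each pair of aligned sequences and averages over all pairs. This is a generic calculation, not necessarily the method used in the paper; the function names and the toy sequences are hypothetical, and the sequences are assumed to be pre-aligned with `-` marking gaps.

```python
from itertools import combinations


def p_distance(a, b):
    """Proportion of differing sites between two aligned sequences.

    Columns where either sequence has a gap ('-') are skipped.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = differing = 0
    for x, y in zip(a, b):
        if x == "-" or y == "-":
            continue
        compared += 1
        differing += x != y
    if compared == 0:
        raise ValueError("no comparable sites")
    return differing / compared


def mean_pairwise_divergence(seqs):
    """Average p-distance over all unordered pairs of sequences."""
    pairs = list(combinations(seqs.values(), 2))
    if not pairs:
        raise ValueError("need at least two sequences")
    return sum(p_distance(a, b) for a, b in pairs) / len(pairs)


# Hypothetical toy alignment, not real alpha tubulin data.
copies = {
    "copy_a": "MRECISIHVG",
    "copy_b": "MRECISVHVG",
    "copy_c": "MRECISIHIG",
}
mean_div = mean_pairwise_divergence(copies)
print(f"mean pairwise divergence: {mean_div:.1%}")
```

Real analyses would typically also apply a substitution model to correct for multiple changes at the same site, which this sketch omits.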

Source
DOI: http://dx.doi.org/10.1186/1471-2148-12-148
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3464624
August 2012

YeastMine--an integrated data warehouse for Saccharomyces cerevisiae data as a multipurpose tool-kit.

Database (Oxford) 2012;2012:bar062. Epub 2012 Mar 20.

Department of Genetics, Stanford University, Stanford, CA 94305-5120, USA.

The Saccharomyces Genome Database (SGD; http://www.yeastgenome.org/) provides high-quality curated genomic, genetic, and molecular information on the genes and their products of the budding yeast Saccharomyces cerevisiae. To accommodate the increasingly complex, diverse needs of researchers for searching and comparing data, SGD has implemented InterMine (http://www.InterMine.org), an open source data warehouse system with a sophisticated querying interface, to create YeastMine (http://yeastmine.yeastgenome.org). YeastMine is a multifaceted search and retrieval environment that provides access to diverse data types. Searches can be initiated with a list of genes, a list of Gene Ontology terms, or lists of many other data types. The results from queries can be combined for further analysis and saved or downloaded in customizable file formats. Queries themselves can be customized by modifying predefined templates or by creating a new template to access a combination of specific data types. YeastMine offers multiple scenarios in which it can be used such as a powerful search interface, a discovery tool, a curation aid and also a complex database presentation format. DATABASE URL: http://yeastmine.yeastgenome.org.

Source
DOI: http://dx.doi.org/10.1093/database/bar062
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3308152
June 2012

modMine: flexible access to modENCODE data.

Nucleic Acids Res 2012 Jan 12;40(Database issue):D1082-8. Epub 2011 Nov 12.

Department of Genetics, University of Cambridge, Downing Street, Cambridge CB2 3EH, UK.

In an effort to comprehensively characterize the functional elements within the genomes of the important model organisms Drosophila melanogaster and Caenorhabditis elegans, the NHGRI model organism Encyclopaedia of DNA Elements (modENCODE) consortium has generated an enormous library of genomic data along with detailed, structured information on all aspects of the experiments. The modMine database (http://intermine.modencode.org) described here has been built by the modENCODE Data Coordination Center to allow the broader research community to (i) search for and download data sets of interest among the thousands generated by modENCODE; (ii) access the data in an integrated form together with non-modENCODE data sets; and (iii) facilitate fine-grained analysis of the above data. The sophisticated search features are possible because of the collection of extensive experimental metadata by the consortium. Interfaces are provided to allow both biologists and bioinformaticians to exploit these rich modENCODE data sets now available via modMine.

Source
DOI: http://dx.doi.org/10.1093/nar/gkr921
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3245176
January 2012

The modENCODE Data Coordination Center: lessons in harvesting comprehensive experimental details.

Database (Oxford) 2011;2011:bar023. Epub 2011 Aug 19.

Lawrence Berkeley National Laboratory, Genomics Division, 1 Cyclotron Road MS64-121, Berkeley, CA 94720, USA.

The model organism Encyclopedia of DNA Elements (modENCODE) project is a National Human Genome Research Institute (NHGRI) initiative designed to characterize the genomes of Drosophila melanogaster and Caenorhabditis elegans. A Data Coordination Center (DCC) was created to collect, store and catalog modENCODE data. An effective DCC must gather, organize and provide all primary, interpreted and analyzed data, and ensure the community is supplied with the knowledge of the experimental conditions, protocols and verification checks used to generate each primary data set. We present here the design principles of the modENCODE DCC, and describe the ramifications of collecting thorough and deep metadata for describing experiments, including the use of a wiki for capturing protocol and reagent information, and the BIR-TAB specification for linking biological samples to experimental results. modENCODE data can be found at http://www.modencode.org.

Source
DOI: http://dx.doi.org/10.1093/database/bar023
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3170170
November 2011