Publications by authors named "Graham M Hughes"

16 Publications

Genome analysis of the metabolically versatile Pseudomonas umsongensis GO16: the genetic basis for PET monomer upcycling into polyhydroxyalkanoates.

Microb Biotechnol 2021 Jan 6. Epub 2021 Jan 6.

Faculty of Health and Medical Sciences, University of Surrey, Guildford, GU2 7XH, UK.

The throwaway culture related to the single-use materials such as polyethylene terephthalate (PET) has created a major environmental concern. Recycling of PET waste into biodegradable plastic polyhydroxyalkanoate (PHA) creates an opportunity to improve resource efficiency and contribute to a circular economy. We sequenced the genome of Pseudomonas umsongensis GO16 previously shown to convert PET-derived terephthalic acid (TA) into PHA and performed an in-depth genome analysis. GO16 can degrade a range of aromatic substrates in addition to TA, due to the presence of a catabolic plasmid pENK22. The genetic complement required for the degradation of TA via protocatechuate was identified and its functionality was confirmed by transferring the tph operon into Pseudomonas putida KT2440, which is unable to utilize TA naturally. We also identified the genes involved in ethylene glycol (EG) metabolism, the second PET monomer, and validated the capacity of GO16 to use EG as a sole source of carbon and energy. Moreover, GO16 possesses genes for the synthesis of both medium and short chain length PHA and we have demonstrated the capacity of the strain to convert mixed TA and EG into PHA. The metabolic versatility of GO16 highlights the potential of this organism for biotransformations using PET waste as a feedstock.
DOI: http://dx.doi.org/10.1111/1751-7915.13712
January 2021

Broad host range of SARS-CoV-2 predicted by comparative and structural analysis of ACE2 in vertebrates.

Proc Natl Acad Sci U S A 2020 09 21;117(36):22311-22322. Epub 2020 Aug 21.

The Genome Center, University of California, Davis, CA 95616;

The novel coronavirus severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) is the cause of COVID-19. The main receptor of SARS-CoV-2, angiotensin I converting enzyme 2 (ACE2), is now undergoing extensive scrutiny to understand the routes of transmission and sensitivity in different species. Here, we utilized a unique dataset of ACE2 sequences from 410 vertebrate species, including 252 mammals, to study the conservation of ACE2 and its potential to be used as a receptor by SARS-CoV-2. We designed a five-category binding score based on the conservation properties of 25 amino acids important for the binding between ACE2 and the SARS-CoV-2 spike protein. Only mammals fell into the medium to very high categories and only catarrhine primates into the very high category, suggesting that they are at high risk for SARS-CoV-2 infection. We employed a protein structural analysis to qualitatively assess whether amino acid changes at variable residues would be likely to disrupt ACE2/SARS-CoV-2 spike protein binding and found the number of predicted unfavorable changes significantly correlated with the binding score. Extending this analysis to human population data, we found only rare (frequency <0.001) variants in 10/25 binding sites. In addition, we found significant signals of selection and accelerated evolution in the ACE2 coding sequence across all mammals, and specific to the bat lineage. Our results, if confirmed by additional experimental data, may lead to the identification of intermediate host species for SARS-CoV-2, guide the selection of animal models of COVID-19, and assist the conservation of animals both in native habitats and in human care.
DOI: http://dx.doi.org/10.1073/pnas.2010146117
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7486773
September 2020

Six reference-quality genomes reveal evolution of bat adaptations.

Nature 2020 07 22;583(7817):578-584. Epub 2020 Jul 22.

School of Biology and Environmental Science, University College Dublin, Dublin, Ireland.

Bats possess extraordinary adaptations, including flight, echolocation, extreme longevity and unique immunity. High-quality genomes are crucial for understanding the molecular basis and evolution of these traits. Here we incorporated long-read sequencing and state-of-the-art scaffolding protocols to generate, to our knowledge, the first reference-quality genomes of six bat species (Rhinolophus ferrumequinum, Rousettus aegyptiacus, Phyllostomus discolor, Myotis myotis, Pipistrellus kuhlii and Molossus molossus). We integrated gene projections from our 'Tool to infer Orthologs from Genome Alignments' (TOGA) software with de novo and homology gene predictions as well as short- and long-read transcriptomics to generate highly complete gene annotations. To resolve the phylogenetic position of bats within Laurasiatheria, we applied several phylogenetic methods to comprehensive sets of orthologous protein-coding and noncoding regions of the genome, and identified a basal origin for bats within Scrotifera. Our genome-wide screens revealed positive selection on hearing-related genes in the ancestral branch of bats, which is indicative of laryngeal echolocation being an ancestral trait in this clade. We found selection and loss of immunity-related genes (including pro-inflammatory NF-κB regulators) and expansions of anti-viral APOBEC3 genes, which highlights molecular mechanisms that may contribute to the exceptional immunity of bats. Genomic integrations of diverse viruses provide a genomic record of historical tolerance to viral infection in bats. Finally, we found and experimentally validated bat-specific variation in microRNAs, which may regulate bat-specific gene-expression programs. Our reference-quality bat genomes provide the resources required to uncover and validate the genomic basis of adaptations of bats, and stimulate new avenues of research that are directly relevant to human health and disease.
DOI: http://dx.doi.org/10.1038/s41586-020-2486-3
July 2020

Broad Host Range of SARS-CoV-2 Predicted by Comparative and Structural Analysis of ACE2 in Vertebrates.

bioRxiv 2020 Apr 18. Epub 2020 Apr 18.

The Genome Center, University of California Davis, Davis, CA 95616, USA.

The novel coronavirus SARS-CoV-2 is the cause of Coronavirus Disease-2019 (COVID-19). The main receptor of SARS-CoV-2, angiotensin I converting enzyme 2 (ACE2), is now undergoing extensive scrutiny to understand the routes of transmission and sensitivity in different species. Here, we utilized a unique dataset of 410 vertebrates, including 252 mammals, to study cross-species conservation of ACE2 and its likelihood to function as a SARS-CoV-2 receptor. We designed a five-category ranking score based on the conservation properties of 25 amino acids important for the binding between receptor and virus, classifying all species from very high to very low. Only mammals fell into the medium to very high categories, and only catarrhine primates in the very high category, suggesting that they are at high risk for SARS-CoV-2 infection. We employed a protein structural analysis to qualitatively assess whether amino acid changes at variable residues would be likely to disrupt ACE2/SARS-CoV-2 binding, and found the number of predicted unfavorable changes significantly correlated with the binding score. Extending this analysis to human population data, we found only rare (<0.1%) variants in 10/25 binding sites. In addition, we observed evidence of positive selection in ACE2 in multiple species, including bats. Utilized appropriately, our results may lead to the identification of intermediate host species for SARS-CoV-2, justify the selection of animal models of COVID-19, and assist the conservation of animals both in native habitats and in human care.
DOI: http://dx.doi.org/10.1101/2020.04.16.045302
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7263403
April 2020

Drivers of longitudinal telomere dynamics in a long-lived bat species, Myotis myotis.

Mol Ecol 2020 08 19;29(16):2963-2977. Epub 2020 Mar 19.

School of Biology and Environmental Science, Science Centre West, University College Dublin, Belfield, Dublin, Ireland.

Age-related telomere shortening is considered a hallmark of the ageing process. However, a recent cross-sectional ageing study of relative telomere length (rTL) in bats failed to detect a relationship between rTL and age in the long-lived genus Myotis (M. myotis and M. bechsteinii), suggesting some other factors are responsible for driving telomere dynamics in these species. Here, we test if longitudinal rTL data show signatures of age-associated telomere attrition in M. myotis and differentiate which intrinsic or extrinsic factors are likely to drive telomere length dynamics. Using quantitative polymerase chain reaction, rTL was measured in 504 samples from a marked population, from Brittany, France, captured between 2013 and 2016. These represent 174 individuals with an age range of 0 to 7+ years. We find no significant relationship between rTL and age (p = .762), but demonstrate that within-individual rTL is highly variable from year to year. To investigate the heritability of rTL, a population pedigree (n = 1744) was constructed from genotype data generated from a 16-microsatellite multiplex, designed from an initial, low-coverage, Illumina genome for M. myotis. Heritability was estimated in a Bayesian, mixed model framework, and showed that little of the observed variance in rTL is heritable (h² = 0.01-0.06). Rather, correlations of first differences, correlating yearly changes in telomere length and weather variables, demonstrate that, during the spring transition, average temperature, minimum temperature, rainfall and windspeed correlate with changes in longitudinal telomere dynamics. As such, rTL may represent a useful biomarker to quantify the physiological impact of various environmental stressors in bats.
DOI: http://dx.doi.org/10.1111/mec.15395
August 2020
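
The "correlations of first differences" mentioned in the abstract above can be illustrated with a minimal Python sketch. This is not the authors' analysis code: the function names and the sample values are hypothetical, and a plain Pearson correlation is assumed. The idea is to correlate year-to-year changes in two series rather than their raw values, which removes any shared long-term trend.

```python
from math import sqrt


def first_differences(values):
    """Return the change between each pair of consecutive observations."""
    return [later - earlier for earlier, later in zip(values, values[1:])]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


def first_difference_correlation(series_a, series_b):
    """Correlate the yearly changes of two series, not their levels."""
    return pearson(first_differences(series_a), first_differences(series_b))


# Hypothetical yearly values for one population (illustration only).
rtl_by_year = [1.10, 0.95, 1.05, 0.90]          # relative telomere length
spring_temp_by_year = [11.0, 9.5, 10.5, 9.0]    # average spring temperature

r = first_difference_correlation(rtl_by_year, spring_temp_by_year)
print(f"correlation of first differences: {r:.3f}")
```

In this made-up example the yearly changes move together exactly, so the coefficient is close to 1; real field data would of course be noisier.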

Olfactory receptor repertoire size in dinosaurs.

Proc Biol Sci 2019 06 12;286(1904):20190909. Epub 2019 Jun 12.

School of Biology and Environmental Science, University College Dublin, Belfield, Dublin 4, Republic of Ireland.

The olfactory bulb (OB) ratio is the size of the OB relative to the cerebral hemisphere, and is used to estimate the proportion of the forebrain devoted to smell. In birds, OB ratio correlates with the number of olfactory receptor (OR) genes and therefore has been used as a proxy for olfactory acuity. By coupling OB ratios with known OR gene repertoires in birds, we infer minimum repertoire sizes for extinct taxa, including non-avian dinosaurs, using phylogenetic modelling, ancestral state reconstruction and comparative genomics. We highlight a shift in the scaling of OB ratio to body size along the lineage leading to modern birds, demonstrating variable OR repertoires present in different dinosaur and crown-bird lineages, with varying factors potentially influencing sensory evolution in theropods. We investigate the ancestral sensory space available to extinct taxa, highlighting potential adaptations to ecological niches. Through combining morphological and genomic data, we show that, while genetic information for extinct taxa is forever lost, it is potentially feasible to investigate evolutionary trajectories in extinct genomes.
DOI: http://dx.doi.org/10.1098/rspb.2019.0909
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6571463
June 2019

As Blind as a Bat? Opsin Phylogenetics Illuminates the Evolution of Color Vision in Bats.

Mol Biol Evol 2019 01;36(1):54-68

UCD School of Biology and Environmental Science, University College Dublin, Dublin 4, Ireland.

Through their unique use of sophisticated laryngeal echolocation bats are considered sensory specialists amongst mammals and represent an excellent model in which to explore sensory perception. Although several studies have shown that the evolution of vision is linked to ecological niche adaptation in other mammalian lineages, this has not yet been fully explored in bats. Recent molecular analysis of the opsin genes, which encode the photosensitive pigments underpinning color vision, have implicated high-duty cycle (HDC) echolocation and the adoption of cave roosting habits in the degeneration of color vision in bats. However, insufficient sampling of relevant taxa has hindered definitive testing of these hypotheses. To address this, novel sequence data was generated for the SWS1 and MWS/LWS opsin genes and combined with existing data to comprehensively sample species representing diverse echolocation types and niches (SWS1 n = 115; MWS/LWS n = 45). A combination of phylogenetic analysis, ancestral state reconstruction, and selective pressure analyses were used to reconstruct the evolution of these visual pigments in bats and revealed that although both genes are evolving under purifying selection in bats, MWS/LWS is highly conserved but SWS1 is highly variable. Spectral tuning analyses revealed that MWS/LWS opsin is tuned to a long wavelength, 555-560 nm in the bat ancestor and the majority of extant taxa. The presence of UV vision in bats is supported by our spectral tuning analysis, but phylogenetic analyses demonstrated that the SWS1 opsin gene has undergone pseudogenization in several lineages. We do not find support for a link between the evolution of HDC echolocation and the pseudogenization of the SWS1 gene in bats, instead we show the SWS1 opsin is functional in the HDC echolocator, Pteronotus parnellii. Pseudogenization of the SWS1 is correlated with cave roosting habits in the majority of pteropodid species. Together these results demonstrate that the loss of UV vision in bats is more widespread than was previously considered and further elucidate the role of ecological niche specialization in the evolution of vision in bats.
DOI: http://dx.doi.org/10.1093/molbev/msy192
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6340466
January 2019

AGILE: an assembled genome mining pipeline.

Bioinformatics 2019 04;35(7):1252-1254

School of Biology and Environmental Science, University College Dublin, Dublin 4, Ireland.

Summary: A number of limiting factors mean that traditional genome annotation tools either fail or perform sub-optimally when trying to detect coding sequences in poor quality genome assemblies/genome reports. This means that potentially useful data is accessible only to those with specific skills and expertise in assembly and annotation. We present an Assembled-Genome mIning pipeLinE (AGILE) written in Perl that combines bioinformatics tools with a number of steps to overcome the limitations imposed by such assemblies when applied to highly fragmented genomes. Our methodology uses user-specified query genes from a closely related species to mine and annotate coding sequences that would traditionally be missed by standard annotation packages. Despite a focus on mammalian genomes, the generalized implementation means that it may be applied to any genome assembly, providing a means for non-specialists to gather gene sequences for downstream analyses.

Availability And Implementation: Source code and associated files are available at: https://github.com/batlabucd/GenomeMining and https://bitbucket.org/BatlabUCD/genomemining/src. Singularity and Virtual Box images available at https://figshare.com/s/a0004bf93dc43484b0c0.

Supplementary Information: Supplementary data are available at Bioinformatics online.
DOI: http://dx.doi.org/10.1093/bioinformatics/bty781
April 2019

The Birth and Death of Olfactory Receptor Gene Families in Mammalian Niche Adaptation.

Mol Biol Evol 2018 06;35(6):1390-1406

School of Biology and Environmental Science, University College Dublin, Dublin, Ireland.

The olfactory receptor (OR) gene families, which govern mammalian olfaction, have undergone extensive expansion and contraction through duplication and pseudogenization. Previous studies have shown that broadly defined environmental adaptations (e.g., terrestrial vs. aquatic) are correlated with the number of functional and non-functional OR genes retained. However, to date, no study has examined species-specific gene duplications in multiple phylogenetically divergent mammals to elucidate OR evolution and adaptation. Here, we identify the OR gene families driving adaptation to different ecological niches by mapping the fate of species-specific gene duplications in the OR repertoire of 94 diverse mammalian taxa, using molecular phylogenomic methods. We analyze >70,000 OR gene sequences mined from whole genomes, generated from novel amplicon sequencing data, and collated with data from previous studies, comprising one of the largest OR studies to date. For the first time, we demonstrate statistically significant patterns of OR species-specific gene duplications associated with the presence of a functioning vomeronasal organ. With respect to dietary niche, we uncover a novel link between a large number of duplications in OR family 5/8/9 and herbivory. Our results also highlight differences between social and solitary niches, indicating that a greater OR repertoire expansion may be associated with a solitary lifestyle. This study demonstrates the utility of species-specific duplications in elucidating gene family evolution, revealing how the OR repertoire has undergone expansion and contraction with respect to a number of ecological adaptations in mammals.
DOI: http://dx.doi.org/10.1093/molbev/msy028
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5967467
June 2018

Growing old, yet staying young: The role of telomeres in bats' exceptional longevity.

Sci Adv 2018 02 7;4(2):eaao0926. Epub 2018 Feb 7.

School of Biology and Environmental Science, Science Centre West, University College Dublin, Belfield, Dublin 4, Ireland.

Understanding aging is a grand challenge in biology. Exceptionally long-lived animals have mechanisms that underpin extreme longevity. Telomeres are protective nucleotide repeats on chromosome tips that shorten with cell division, potentially limiting life span. Bats are the longest-lived mammals for their size, but it is unknown whether their telomeres shorten. Using >60 years of cumulative mark-recapture field data, we show that telomeres shorten with age in Rhinolophus ferrumequinum and Miniopterus schreibersii, but not in the bat genus with greatest longevity, Myotis. As in humans, telomerase is not expressed in Myotis myotis blood or fibroblasts. Selection tests on telomere maintenance genes show that ATM and SETX, which repair and prevent DNA damage, potentially mediate telomere dynamics in bats. Twenty-one telomere maintenance genes are differentially expressed in Myotis, of which 14 are enriched for DNA repair, and 5 for alternative telomere-lengthening mechanisms. We demonstrate how telomeres, telomerase, and DNA repair genes have contributed to the evolution of exceptional longevity in bats, advancing our understanding of healthy aging.
DOI: http://dx.doi.org/10.1126/sciadv.aao0926
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5810611
February 2018

Is there a link between aging and microbiome diversity in exceptional mammalian longevity?

PeerJ 2018 8;6:e4174. Epub 2018 Jan 8.

School of Biology and Environmental Science, University College Dublin, Dublin, Ireland.

A changing microbiome has been linked to biological aging in mice and humans, suggesting a possible role of gut flora in pathogenic aging phenotypes. Many bat species have exceptional longevity given their body size and some can live up to ten times longer than expected with little signs of aging. This study explores the anal microbiome of the exceptionally long-lived Myotis myotis bat, investigating bacterial composition in both adult and juvenile bats to determine if the microbiome changes with age in a wild, long-lived non-model organism, using non-lethal sampling. The anal microbiome was sequenced using metabarcoding in more than 50 individuals, finding no significant difference between the composition of juvenile and adult bats, suggesting that age-related microbial shifts previously observed in other mammals may not be present in Myotis myotis. Functional gene categories, inferred from metabarcoding data, expressed in the microbiome were categorized identifying pathways involved in metabolism, DNA repair and oxidative phosphorylation. We highlight an abundance of 'Proteobacteria' relative to other mammals, with similar patterns compared to other bat microbiomes. Our results suggest that Myotis myotis may have a relatively stable, unchanging microbiome playing a role in their extended 'health spans' with the advancement of age, and suggest a potential link between microbiome and sustained, powered flight.
DOI: http://dx.doi.org/10.7717/peerj.4174
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5764031
January 2018

Genome-wide signatures of complex introgression and adaptive evolution in the big cats.

Sci Adv 2017 07 19;3(7):e1700299. Epub 2017 Jul 19.

Laboratório de Biologia Genômica e Molecular, Faculdade de Biociências, Pontifical Catholic University of Rio Grande do Sul (PUCRS), Porto Alegre, Rio Grande do Sul, Brazil.

The great cats of the genus Panthera comprise a recent radiation whose evolutionary history is poorly understood. Their rapid diversification poses challenges to resolving their phylogeny while offering opportunities to investigate the historical dynamics of adaptive divergence. We report the sequence, de novo assembly, and annotation of the jaguar (Panthera onca) genome, a novel genome sequence for the leopard (Panthera pardus), and comparative analyses encompassing all living Panthera species. Demographic reconstructions indicated that all of these species have experienced variable episodes of population decline during the Pleistocene, ultimately leading to small effective sizes in present-day genomes. We observed pervasive genealogical discordance across Panthera genomes, caused by both incomplete lineage sorting and complex patterns of historical interspecific hybridization. We identified multiple signatures of species-specific positive selection, affecting genes involved in craniofacial and limb development, protein metabolism, hypoxia, reproduction, pigmentation, and sensory perception. There was remarkable concordance in pathways enriched in genomic segments implicated in interspecies introgression and in positive selection, suggesting that these processes were connected. We tested this hypothesis by developing exome capture probes targeting ~19,000 Panthera genes and applying them to 30 wild-caught jaguars. We found at least two genes (DOCK3 and COL4A5, both related to optic nerve development) bearing significant signatures of interspecies introgression and within-species positive selection. These findings indicate that post-speciation admixture has contributed genetic material that facilitated the adaptive evolution of big cat lineages.
DOI: http://dx.doi.org/10.1126/sciadv.1700299
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5517113
July 2017

A novel method of microsatellite genotyping-by-sequencing using individual combinatorial barcoding.

R Soc Open Sci 2016 Jan 20;3(1):150565. Epub 2016 Jan 20.

Area 52 Research Group, University College Dublin, Belfield, Dublin, Republic of Ireland; Earth Institute, University College Dublin, Belfield, Dublin, Republic of Ireland.

This study examines the potential of next-generation sequencing based 'genotyping-by-sequencing' (GBS) of microsatellite loci for rapid and cost-effective genotyping in large-scale population genetic studies. The recovery of individual genotypes from large sequence pools was achieved by PCR-incorporated combinatorial barcoding using universal primers. Three experimental conditions were employed to explore the possibility of using this approach with existing and novel multiplex marker panels and weighted amplicon mixture. The GBS approach was validated against microsatellite data generated by capillary electrophoresis. GBS allows access to the underlying nucleotide sequences that can reveal homoplasy, even in large datasets and facilitates cross laboratory transfer. GBS of microsatellites, using individual combinatorial barcoding, is potentially faster and cheaper than current microsatellite approaches and offers better and more data.
DOI: http://dx.doi.org/10.1098/rsos.150565
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4736940
January 2016
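
The recovery of individual genotypes from pooled reads by combinatorial barcoding, as described in the abstract above, might look roughly like the following Python sketch. It is an illustration only, not the published protocol: the barcode sequences, individual names and function names are all hypothetical, barcodes are assumed to sit at the very start and end of each read, the reverse barcode is read as written (no reverse-complementing), and matching is exact with no error tolerance.

```python
from itertools import product


def assign_barcode_pairs(individuals, forward_barcodes, reverse_barcodes):
    """Give each individual a unique (forward, reverse) barcode pair.

    With F forward and R reverse barcodes, up to F * R individuals can be
    labelled, which is the point of the combinatorial design.
    """
    pairs = list(product(forward_barcodes, reverse_barcodes))
    if len(individuals) > len(pairs):
        raise ValueError("not enough barcode combinations for all individuals")
    return dict(zip(pairs, individuals))


def demultiplex(reads, pair_to_individual, barcode_length):
    """Group reads by individual and strip the barcodes from each read."""
    by_individual = {}
    for read in reads:
        key = (read[:barcode_length], read[-barcode_length:])
        individual = pair_to_individual.get(key)
        if individual is None:
            continue  # unknown or unused barcode combination: discard the read
        insert = read[barcode_length:-barcode_length]
        by_individual.setdefault(individual, []).append(insert)
    return by_individual


# Hypothetical barcodes and pooled reads (illustration only).
lookup = assign_barcode_pairs(
    ["ind1", "ind2", "ind3"], ["AAAA", "CCCC"], ["GGGG", "TTTT"]
)
pooled_reads = [
    "AAAA" + "CACACACA" + "GGGG",  # pair assigned to ind1
    "CCCC" + "CACACA" + "GGGG",    # pair assigned to ind3
    "CCCC" + "CACA" + "TTTT",      # combination not assigned to anyone
]
print(demultiplex(pooled_reads, lookup, barcode_length=4))
```

Two forward and two reverse barcodes label up to four individuals here; the saving grows quickly, since 12 forward and 8 reverse barcodes would cover 96 individuals with only 20 barcoded primers.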

Systematic exploration of guide-tree topology effects for small protein alignments.

BMC Bioinformatics 2014 Oct 4;15:338. Epub 2014 Oct 4.

University College Dublin, Conway Institute, Dublin, Ireland.

Background: Guide-trees are used as part of an essential heuristic to enable the calculation of multiple sequence alignments. They have been the focus of much method development but there has been little effort at determining systematically, which guide-trees, if any, give the best alignments. Some guide-tree construction schemes are based on pair-wise distances amongst unaligned sequences. Others try to emulate an underlying evolutionary tree and involve various iteration methods.

Results: We explore all possible guide-trees for a set of protein alignments of up to eight sequences. We find that pairwise distance based default guide-trees sometimes outperform evolutionary guide-trees, as measured by structure derived reference alignments. However, default guide-trees fall way short of the optimum attainable scores. On average chained guide-trees perform better than balanced ones but are not better than default guide-trees for small alignments.

Conclusions: Alignment methods that use Consistency or hidden Markov models to make alignments are less susceptible to sub-optimal guide-trees than simpler methods, that basically use conventional sequence alignment between profiles. The latter appear to be affected positively by evolutionary based guide-trees for difficult alignments and negatively for easy alignments. One phylogeny aware alignment program can strongly discriminate between good and bad guide-trees. The results for randomly chained guide-trees improve with the number of sequences.
DOI: http://dx.doi.org/10.1186/1471-2105-15-338
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4287568PMC
October 2014
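
To give a sense of why the exhaustive exploration in the abstract above stops at eight sequences, the following Python sketch counts guide-tree topologies and builds the two extreme shapes the abstract names. It assumes guide-trees are rooted, binary and leaf-labelled, for which the number of topologies is the double factorial (2n - 3)!!; the function names and the nested-tuple tree representation are illustrative choices, not taken from the paper.

```python
def count_rooted_binary_trees(n_leaves):
    """Number of rooted, binary, leaf-labelled topologies: (2n - 3)!!."""
    if n_leaves < 1:
        raise ValueError("a tree needs at least one leaf")
    count = 1
    for odd in range(3, 2 * n_leaves - 2, 2):
        count *= odd
    return count


def chained_tree(labels):
    """Fully unbalanced ("chained") tree: sequences are added one at a time."""
    tree = labels[0]
    for label in labels[1:]:
        tree = (tree, label)
    return tree


def balanced_tree(labels):
    """Balanced tree: the label list is halved recursively."""
    if len(labels) == 1:
        return labels[0]
    mid = len(labels) // 2
    return (balanced_tree(labels[:mid]), balanced_tree(labels[mid:]))


def depth(tree):
    """Longest root-to-leaf path, counted in merge steps."""
    if not isinstance(tree, tuple):
        return 0
    return 1 + max(depth(child) for child in tree)


sequences = list("ABCDEFGH")
print(count_rooted_binary_trees(len(sequences)))  # 135135
print(depth(chained_tree(sequences)), depth(balanced_tree(sequences)))  # 7 3
```

Under that assumption eight sequences already give 135,135 candidate trees, and each added sequence multiplies the count by the next odd number, so exhaustive search quickly becomes impractical.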

Loss of olfactory receptor function in hominin evolution.

PLoS One 2014 2;9(1):e84714. Epub 2014 Jan 2.

UCD Conway Institute of Biomolecular and Biomedical Research, University College Dublin, Belfield, Dublin, Ireland.

The mammalian sense of smell is governed by the largest gene family, which encodes the olfactory receptors (ORs). The gain and loss of OR genes is typically correlated with adaptations to various ecological niches. Modern humans have 853 OR genes but 55% of these have lost their function. Here we show evidence of additional OR loss of function in the Neanderthal and Denisovan hominin genomes using comparative genomic methodologies. Ten Neanderthal and 8 Denisovan ORs show evidence of loss of function that differ from the reference modern human OR genome. Some of these losses are also present in a subset of modern humans, while some are unique to each lineage. Morphological changes in the cranium of Neanderthals suggest different sensory arrangements to that of modern humans. We identify differences in functional olfactory receptor genes among modern humans, Neanderthals and Denisovans, suggesting varied loss of function across all three taxa and we highlight the utility of using genomic information to elucidate the sensory niches of extinct species.
PLOS: http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0084714
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3879314
November 2014

Using Illumina next generation sequencing technologies to sequence multigene families in de novo species.

Mol Ecol Resour 2013 May 9;13(3):510-21. Epub 2013 Mar 9.

UCD School of Biological and Environmental Science, University College Dublin, Belfield, Dublin 4, Ireland.

The advent of Next Generation Sequencing Technology (NGST) has revolutionized molecular biology research, allowing for rapid gene/genome sequencing from a multitude of diverse species. As high throughput sequencing becomes more accessible, more efficient workflows must be developed to deal with the amounts of data produced and better assemble the genomes of de novo lineages. We combine traditional laboratory methods with Illumina NGST to amplify and sequence the largest mammalian multigene family, the Olfactory Receptor gene family, for species with and without a reference genome. We develop novel assembly methods to annotate and filter these data, which can be utilized for any gene family or any species. We find no significant difference between the ratio of genes within their respective gene families of our data compared with available genomic data. Using simulated data we explore the limitations of short-read sequence data and our assembly in recovering this gene family. We highlight the benefits and shortcomings of these methods. Compared with data generated from traditional polymerase chain reaction, cloning and Sanger sequencing methodologies, sequence data generated using our pipeline increases yield and sequencing efficiency without reducing the number of unique genes amplified. A cloning step is not required, therefore shortening data generation time. The novel downstream methodologies and workflows described provide a tool to be utilized by many fields of biology, to access and analyze the vast quantities of data generated. By combining laboratory and in silico methods, we provide a means of extracting genomic information for multigene families without complete genome sequencing.
DOI: http://dx.doi.org/10.1111/1755-0998.12087
May 2013