Publications by authors named "Fernando Mora-Márquez"

4 Publications

NGScloud2: optimized bioinformatic analysis using Amazon Web Services.

PeerJ 2021 Apr 16;9:e11237. Epub 2021 Apr 16.

GI Sistemas Naturales e Historia Forestal, Dpto. Sistemas y Recursos Naturales, ETSI Montes, Forestal y del Medio Natural, Universidad Politécnica de Madrid, Madrid, Spain.

Background: NGScloud was a bioinformatic system developed to perform de novo RNAseq analysis of non-model species by exploiting the cloud computing capabilities of Amazon Web Services. Rapid changes in the way this cloud computing service operates, along with the continuous release of novel bioinformatic applications to analyze next-generation sequencing data, have made the software obsolete. NGScloud2 is an enhanced and expanded version of NGScloud that provides access to ad hoc cloud computing infrastructure, scaled according to the complexity of each experiment.

Methods: NGScloud2 presents major technical improvements, such as the possibility of running spot instances and the latest AWS instance types, which can lead to significant cost savings. Compared to its initial implementation, this improved version includes updated common applications for de novo RNAseq analysis and incorporates tools to operate workflows for reference-based RNAseq, RADseq and functional annotation. NGScloud2 optimizes access to Amazon's large computing infrastructures to easily run popular bioinformatic software applications that are otherwise inaccessible to non-specialized users lacking suitable hardware infrastructures.

Results: The correct performance of the pipelines for de novo RNAseq, reference-based RNAseq, RADseq and functional annotation was tested with real experimental data, providing workflow performance estimates and tips to make optimal use of NGScloud2. Further, we provide a qualitative comparison of NGScloud2 vs. the Galaxy framework. NGScloud2 code and instructions for software installation and use are available at https://github.com/GGFHF/NGScloud2. NGScloud2 includes a companion package, NGShelper, which contains Python utilities to post-process the output of the pipelines for downstream analysis; it is available at https://github.com/GGFHF/NGShelper.

DOI: http://dx.doi.org/10.7717/peerj.11237
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8054753
April 2021

TOA: A software package for automated functional annotation in non-model plant species.

Mol Ecol Resour 2021 Feb 18;21(2):621-636. Epub 2020 Nov 18.

GI Sistemas Naturales e Historia Forestal, Dpto. Sistemas y Recursos Naturales, ETSI Montes, Forestal y del Medio Natural, Universidad Politécnica de Madrid, Madrid, Spain.

The increase in sequencing capacity provided by high-throughput platforms has made it possible to routinely obtain large sets of genomic and transcriptomic sequences from model and non-model organisms. Subsequent genomic analysis and gene discovery in next-generation sequencing experiments are, however, bottlenecked by functional annotation. One common way to perform functional annotation of sets of sequences obtained from next-generation sequencing experiments is to search for homologous sequences and access the related functional information deposited in genomic databases. Functional annotation is especially challenging for non-model organisms, like many plant species. In such cases, existing free and commercial general-purpose applications may not offer complete and accurate results. We present TOA (Taxonomy-oriented annotation), a Python-based, user-friendly, open-source application designed to establish functional annotation pipelines geared towards non-model plant species that can run on Linux/Mac computers, HPCs and cloud servers. TOA performs homology searches against proteins stored in the PLAZA databases, NCBI RefSeq Plant, Nucleotide Database and Non-Redundant Protein Sequence Database, and outputs functional information from several ontology systems: Gene Ontology, InterPro, EC, KEGG, Mapman and MetaCyc. The software performance was validated by comparing the runtimes, total number of annotated sequences and accuracy of the functional information obtained for several plant benchmark data sets with TOA and other functional annotation solutions. TOA outperformed the other software in terms of number of annotated sequences and accuracy of the annotation, and constitutes a good alternative to improve functional annotation in plants. TOA is especially recommended for gymnosperms or for low-quality sequence data sets of non-model plants.

DOI: http://dx.doi.org/10.1111/1755-0998.13285
February 2021

ddRAD Sequencing-Based Identification of Genomic Boundaries and Permeability in Quercus ilex and Q. suber Hybrids.

Front Plant Sci 2020 Sep 4;11:564414. Epub 2020 Sep 4.

G.I. Genética, Fisiología e Historia Forestal, Dpto. Sistemas y Recursos Naturales, ETSI Montes, Forestal y del Medio Natural, Universidad Politécnica de Madrid, Madrid, Spain.

Hybridization and its relevance is a hot topic in ecology and evolutionary biology. Interspecific gene flow may play a key role in species adaptation to environmental change, as well as in the survival of endangered populations. Although hybridization is quite common in plants, many hybridizing species, such as Quercus spp., maintain their integrity, while precise determination of genomic boundaries between species remains elusive. Novel high-throughput sequencing techniques have opened up new perspectives in the comparative analysis of genomes and in the study of historical and current interspecific gene flow. In this work, we applied the ddRADseq technique and developed an ad hoc bioinformatics pipeline for the study of ongoing hybridization between two relevant Mediterranean oaks, Q. ilex and Q. suber. We adopted a local-scale approach, analyzing adult hybrids (Q. ilex × Q. suber) identified in a mixed stand and their open-pollinated progenies. We have identified up to 9,435 markers across the genome and have estimated individual introgression levels in adults and seedlings. Estimated contribution of Q. ilex to the genome is higher, on average, in hybrid progenies than in hybrid adults, suggesting preferential backcrossing with this parental species, maybe followed by selection during juvenile stages against individuals with higher Q. ilex genomic contribution. Most discriminating markers seem to be scattered throughout the genome, suggesting that a large number of small genomic regions underlie boundaries between these species. A noticeable proportion of the markers (26%) showed allelic frequencies in adult hybrids very similar to one of the parental species, and very different from the other; a finding that seems relevant for understanding the hybridization process and the occurrence of adaptive introgression. Candidate marker databases developed in this study constitute a valuable resource to design large-scale re-sequencing experiments in Mediterranean sclerophyllous oak species and could provide insight into species boundaries and adaptive introgression between Q. ilex and Q. suber.

DOI: http://dx.doi.org/10.3389/fpls.2020.564414
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7498617
September 2020

NGScloud: RNA-seq analysis of non-model species using cloud computing.

Bioinformatics 2018 Oct;34(19):3405-3407

GI Genética, Fisiología e Historia Forestal, Dpto. Sistemas y Recursos Naturales, ETSI Montes, Forestal y del Medio Natural, Universidad Politécnica de Madrid, Spain.

Summary: RNA-seq analysis usually requires large computing infrastructures. NGScloud is a bioinformatic system developed to analyze RNA-seq data using Amazon's cloud computing services, which provide access to ad hoc computing infrastructure scaled according to the complexity of the experiment, so that costs and run times can be optimized. The application provides a user-friendly front-end to operate Amazon's hardware resources and to control an RNA-seq analysis workflow oriented to non-model species. It incorporates the cluster concept, which allows common RNA-seq analysis programs to run in parallel on several virtual machines for faster analysis.

Availability and implementation: NGScloud is freely available at https://github.com/GGFHF/NGScloud/. A manual detailing installation and usage instructions is available with the distribution.

DOI: http://dx.doi.org/10.1093/bioinformatics/bty363
October 2018