Publications by authors named "Oscar L Rodriguez"

11 Publications

Pervasive cis effects of variation in copy number of large tandem repeats on local DNA methylation and gene expression.

Am J Hum Genet 2021 Mar 24. Epub 2021 Mar 24.

Department of Genetics and Genomic Sciences and Mindich Child Health and Development Institute, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA.

Variable number tandem repeats (VNTRs) are composed of large tandemly repeated motifs, many of which are highly polymorphic in copy number. However, because of their large size and repetitive nature, they remain poorly studied. To investigate the regulatory potential of VNTRs, we used read-depth data from Illumina whole-genome sequencing to perform association analysis between copy number of ∼70,000 VNTRs (motif size ≥ 10 bp) with both gene expression (404 samples in 48 tissues) and DNA methylation (235 samples in peripheral blood), identifying thousands of VNTRs that are associated with local gene expression (eVNTRs) and DNA methylation levels (mVNTRs). Using an independent cohort, we validated 73%-80% of signals observed in the two discovery cohorts, while allelic analysis of VNTR length and CpG methylation in 30 Oxford Nanopore genomes gave additional support for mVNTR loci, thus providing robust evidence to support that these represent genuine associations. Further, conditional analysis indicated that many eVNTRs and mVNTRs act as QTLs independently of other local variation. We also observed strong enrichments of eVNTRs and mVNTRs for regulatory features such as enhancers and promoters. Using the Human Genome Diversity Panel, we define sets of VNTRs that show highly divergent copy numbers among human populations and show that these are enriched for regulatory effects and preferentially associate with genes that have been linked with human phenotypes through GWASs. Our study provides strong evidence supporting functional variation at thousands of VNTRs and defines candidate sets of VNTRs, copy number variation of which potentially plays a role in numerous human phenotypes.
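
As a rough illustration of the kind of association test described above, the sketch below correlates the copy number of a single VNTR with the expression of a nearby gene. The abstract does not specify the statistical model used, so plain Pearson correlation is an assumption here, and the function name and toy values are hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant input")
    return cov / sqrt(var_x * var_y)


# Hypothetical toy data (not from the study): per-sample copy number
# estimates for one VNTR and expression values for one nearby gene.
copy_number = [2.0, 3.0, 4.0, 5.0, 6.0]
expression = [1.1, 1.9, 3.2, 3.8, 5.0]

r = pearson_r(copy_number, expression)
print(f"r = {r:.3f}")
```

A real analysis of this kind would typically also adjust for covariates and correct for multiple testing across loci, which this sketch leaves out.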

Source
DOI: http://dx.doi.org/10.1016/j.ajhg.2021.03.016
March 2021

A Novel Framework for Characterizing Genomic Haplotype Diversity in the Human Immunoglobulin Heavy Chain Locus.

Front Immunol 2020 23;11:2136. Epub 2020 Sep 23.

Department of Biochemistry and Molecular Genetics, University of Louisville School of Medicine, Louisville, KY, United States.

An incomplete ascertainment of genetic variation within the highly polymorphic immunoglobulin heavy chain locus (IGH) has hindered our ability to define genetic factors that influence antibody-mediated processes. Due to locus complexity, standard high-throughput approaches have failed to accurately and comprehensively capture IGH polymorphism. As a result, the locus has only been fully characterized two times, severely limiting our knowledge of human IGH diversity. Here, we combine targeted long-read sequencing with a novel bioinformatics tool, IGenotyper, to fully characterize IGH variation in a haplotype-specific manner. We apply this approach to eight human samples, including a haploid cell line and two mother-father-child trios, and demonstrate the ability to generate high-quality assemblies (>98% complete and >99% accurate), genotypes, and gene annotations, identifying 2 novel structural variants and 15 novel IGH alleles. We show multiplexing allows for scaling of the approach without impacting data quality, and that our genotype call sets are more accurate than short-read (>35% increase in true positives and >97% decrease in false-positives) and array/imputation-based datasets. This framework establishes a desperately needed foundation for leveraging IG genomic data to study population-level variation in antibody-mediated immunity, critical for bettering our understanding of disease risk, and responses to vaccines and therapeutics.

Source
DOI: http://dx.doi.org/10.3389/fimmu.2020.02136
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7539625
September 2020

A Survey of Rare Epigenetic Variation in 23,116 Human Genomes Identifies Disease-Relevant Epivariations and CGG Expansions.

Am J Hum Genet 2020 10 15;107(4):654-669. Epub 2020 Sep 15.

Department of Genetics and Genomic Sciences and Mindich Child Health and Development Institute, Icahn School of Medicine at Mount Sinai, Hess Center for Science and Medicine, New York, NY 10029, USA.

There is growing recognition that epivariations, most often recognized as promoter hypermethylation events that lead to gene silencing, are associated with a number of human diseases. However, little information exists on the prevalence and distribution of rare epigenetic variation in the human population. In order to address this, we performed a survey of methylation profiles from 23,116 individuals using the Illumina 450k array. Using a robust outlier approach, we identified 4,452 unique autosomal epivariations, including potentially inactivating promoter methylation events at 384 genes linked to human disease. For example, we observed promoter hypermethylation of BRCA1 and LDLR at population frequencies of ∼1 in 3,000 and ∼1 in 6,000, respectively, suggesting that epivariations may underlie a fraction of human disease which would be missed by purely sequence-based approaches. Using expression data, we confirmed that many epivariations are associated with outlier gene expression. Analysis of variation data and monozygous twin pairs suggests that approximately two-thirds of epivariations segregate in the population secondary to underlying sequence mutations, while one-third are likely sporadic events that occur post-zygotically. We identified 25 loci where rare hypermethylation coincided with the presence of an unstable CGG tandem repeat, validated the presence of CGG expansions at several loci, and identified the putative molecular defect underlying most of the known folate-sensitive fragile sites in the genome. Our study provides a catalog of rare epigenetic changes in the human genome, gives insight into the underlying origins and consequences of epivariations, and identifies many hypermethylated CGG repeat expansions.
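
The abstract mentions a "robust outlier approach" without giving details. The sketch below shows one common robust technique, a modified z-score based on the median and the median absolute deviation (MAD). Using it here is an assumption, not the authors' method, and the function name, threshold, and toy beta values are hypothetical.

```python
from statistics import median


def mad_outliers(values, threshold=3.5):
    """Return indices of values whose modified z-score exceeds `threshold`.

    The modified z-score is 0.6745 * |x - median| / MAD, where MAD is the
    median absolute deviation from the median.
    """
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        # No spread around the median, so no score can be computed.
        return []
    return [
        i for i, v in enumerate(values)
        if 0.6745 * abs(v - med) / mad > threshold
    ]


# Hypothetical methylation beta values at one promoter across seven samples;
# the last sample mimics a rare hypermethylation event.
betas = [0.05, 0.06, 0.04, 0.05, 0.07, 0.05, 0.62]
print(mad_outliers(betas))
```

Median-based statistics are a natural fit for this kind of screen because a single extreme sample barely shifts the median or the MAD, whereas it would inflate a mean and standard deviation.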

Source
DOI: http://dx.doi.org/10.1016/j.ajhg.2020.08.019
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7536611
October 2020

A robust benchmark for detection of germline large deletions and insertions.

Nat Biotechnol 2020 11 15;38(11):1347-1355. Epub 2020 Jun 15.

Joint Initiative for Metrology in Biology, SLAC National Accelerator Lab, Stanford University, Stanford, CA, USA.

New technologies and analysis methods are enabling genomic structural variants (SVs) to be detected with ever-increasing accuracy, resolution and comprehensiveness. To help translate these methods to routine research and clinical practice, we developed a sequence-resolved benchmark set for identification of both false-negative and false-positive germline large insertions and deletions. To create this benchmark for a broadly consented son in a Personal Genome Project trio with broadly available cells and DNA, the Genome in a Bottle Consortium integrated 19 sequence-resolved variant calling methods from diverse technologies. The final benchmark set contains 12,745 isolated, sequence-resolved insertion (7,281) and deletion (5,464) calls ≥50 base pairs (bp). The Tier 1 benchmark regions, for which any extra calls are putative false positives, cover 2.51 Gbp and 5,262 insertions and 4,095 deletions supported by ≥1 diploid assembly. We demonstrate that the benchmark set reliably identifies false negatives and false positives in high-quality SV callsets from short-, linked- and long-read sequencing and optical mapping.
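
To make the false-negative and false-positive terminology concrete, the sketch below scores a callset against a benchmark by exact record matching. This is a deliberate simplification: real SV benchmarking generally tolerates small differences in breakpoint position and sequence, and would restrict scoring to the benchmark regions. The function name, record layout, and toy records are hypothetical.

```python
def compare_to_benchmark(calls, benchmark):
    """Score a callset against a benchmark using exact record matches."""
    call_set, bench_set = set(calls), set(benchmark)
    true_pos = call_set & bench_set
    false_pos = call_set - bench_set   # extra calls: putative false positives
    false_neg = bench_set - call_set   # benchmark variants that were missed
    precision = len(true_pos) / len(call_set) if call_set else 0.0
    recall = len(true_pos) / len(bench_set) if bench_set else 0.0
    return {
        "tp": len(true_pos),
        "fp": len(false_pos),
        "fn": len(false_neg),
        "precision": precision,
        "recall": recall,
    }


# Hypothetical toy records: (chromosome, position, type, length in bp).
benchmark = [
    ("chr1", 10_000, "DEL", 120),
    ("chr1", 55_000, "INS", 310),
    ("chr2", 7_500, "DEL", 64),
]
calls = [
    ("chr1", 10_000, "DEL", 120),
    ("chr2", 7_500, "DEL", 64),
    ("chr3", 900, "INS", 75),
]

result = compare_to_benchmark(calls, benchmark)
print(result)
```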

Source
DOI: http://dx.doi.org/10.1038/s41587-020-0538-8
November 2020

Elucidation of de novo small insertion/deletion biology with parent-of-origin phasing.

Hum Mutat 2020 04 16;41(4):800-806. Epub 2020 Jan 16.

Mindich Child Health and Development Institute, Icahn School of Medicine at Mount Sinai, New York, New York.

The mechanisms underlying de novo insertion/deletion (indel) genesis, such as polymerase slippage, have been hypothesized but not well characterized in the human genome. We implemented two methodological improvements, which were leveraged to dissect indel mutagenesis. We assigned de novo variants to parent-of-origin (i.e., phasing) with low-coverage long-read whole-genome sequencing, achieving better phasing compared to short-read sequencing (medians of 84% and 23%, respectively). We then wrote an application programming interface to classify indels into three subtypes according to sequence context. Across three cohorts with different phasing methods (N = 540, all cohorts), we observed that one de novo indel subtype, change in copy count (CCC), was significantly correlated with father's (p = 7.1 × 10 ) but not mother's (p = .45) age at conception. We replicated this effect in three cohorts without de novo phasing (p = 1.9 × 10 , p = .61; N = 3,391, all cohorts). Although this is consistent with polymerase slippage during spermatogenesis, the percentage of variance explained by paternal age was low, and we did not observe an association with replication timing. These results suggest that spermatogenesis-specific events have a minor role in CCC indel mutagenesis, one not observed for other indel subtypes nor for maternal age in general. These results have implications for indel modeling in evolution and disease.
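
The abstract names the change in copy count (CCC) subtype but does not describe how the classification is implemented. The sketch below is one plausible reading, not the authors' API: an indel is treated as a copy count change when its smallest repeating unit also occurs in the reference sequence immediately next to it. All function names and example sequences are hypothetical.

```python
def smallest_unit(seq: str) -> str:
    """Return the shortest substring that, repeated, reproduces `seq`."""
    if not seq:
        raise ValueError("sequence must not be empty")
    for size in range(1, len(seq) + 1):
        if len(seq) % size == 0 and seq[:size] * (len(seq) // size) == seq:
            return seq[:size]
    return seq  # unreachable: size == len(seq) always matches


def adjacent_copies(unit: str, flank: str) -> int:
    """Count back-to-back copies of `unit` at the start of `flank`."""
    if not unit:
        raise ValueError("unit must not be empty")
    count = 0
    while flank.startswith(unit, count * len(unit)):
        count += 1
    return count


def changes_copy_count(indel_seq: str, flank: str) -> bool:
    """True if the indel adds or removes copies of a unit found next to it.

    `indel_seq` is the inserted or deleted sequence; `flank` is the reference
    sequence immediately downstream of the indel (an assumed convention).
    """
    unit = smallest_unit(indel_seq.upper())
    return adjacent_copies(unit, flank.upper()) >= 1


print(changes_copy_count("CACA", "CACAGT"))  # indel unit "CA" repeats in the flank
print(changes_copy_count("GTT", "CACAGT"))   # no adjacent copy of "GTT"
```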

Source
DOI: http://dx.doi.org/10.1002/humu.23971
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7069802
April 2020

A unified encyclopedia of human functional DNA elements through fully automated annotation of 164 human cell types.

Genome Biol 2019 08 28;20(1):180. Epub 2019 Aug 28.

Department of Genome Sciences, University of Washington, Seattle, USA.

Semi-automated genome annotation methods such as Segway take as input a set of genome-wide measurements such as of histone modification or DNA accessibility and output an annotation of genomic activity in the target cell type. Here we present annotations of 164 human cell types using 1615 data sets. To produce these annotations, we automated the label interpretation step to produce a fully automated annotation strategy. Using these annotations, we developed a measure of the importance of each genomic position called the "conservation-associated activity score." We further combined all annotations into a single, cell type-agnostic encyclopedia that catalogs all human regulatory elements.

Source
DOI: http://dx.doi.org/10.1186/s13059-019-1784-2
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6714098
August 2019

MsPAC: a tool for haplotype-phased structural variant detection.

Bioinformatics 2020 02;36(3):922-924

Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA.

Summary: While next-generation sequencing (NGS) has dramatically increased the availability of genomic data, phased genome assembly and structural variant (SV) analyses are limited by NGS read lengths. Long-read sequencing from Pacific Biosciences and NGS barcoding from 10x Genomics hold the potential for far more comprehensive views of individual genomes. Here, we present MsPAC, a tool that combines both technologies to partition reads, assemble haplotypes (via existing software) and convert assemblies into high-quality, phased SV predictions. MsPAC represents a framework for haplotype-resolved SV calls that moves one step closer to fully resolved, diploid genomes.

Availability And Implementation: https://github.com/oscarlr/MsPAC.

Supplementary Information: Supplementary data are available at Bioinformatics online.

Source
DOI: http://dx.doi.org/10.1093/bioinformatics/btz618
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7523683
February 2020

Slow Delivery Immunization Enhances HIV Neutralizing Antibody and Germinal Center Responses via Modulation of Immunodominance.

Cell 2019 05 9;177(5):1153-1171.e28. Epub 2019 May 9.

Division of Vaccine Discovery, La Jolla Institute for Immunology (LJI), La Jolla, CA 92037, USA; Center for HIV/AIDS Vaccine Immunology and Immunogen Discovery (Scripps CHAVI-ID), The Scripps Research Institute, La Jolla, CA 92037, USA; Department of Medicine, University of California, San Diego, La Jolla, CA 92037, USA.

Conventional immunization strategies will likely be insufficient for the development of a broadly neutralizing antibody (bnAb) vaccine for HIV or other difficult pathogens because of the immunological hurdles posed, including B cell immunodominance and germinal center (GC) quantity and quality. We found that two independent methods of slow delivery immunization of rhesus monkeys (RMs) resulted in more robust T follicular helper (Tfh) cell responses and GC B cells with improved Env-binding, tracked by longitudinal fine needle aspirates. Improved GCs correlated with the development of >20-fold higher titers of autologous nAbs. Using a new RM genomic immunoglobulin locus reference, we identified differential IgV gene use between immunization modalities. Ab mapping demonstrated targeting of immunodominant non-neutralizing epitopes by conventional bolus-immunized animals, whereas slow delivery-immunized animals targeted a more diverse set of epitopes. Thus, alternative immunization strategies can enhance nAb development by altering GCs and modulating the immunodominance of non-neutralizing epitopes.

Source
DOI: http://dx.doi.org/10.1016/j.cell.2019.04.012
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6619430
May 2019

Multi-platform discovery of haplotype-resolved structural variation in human genomes.

Nat Commun 2019 04 16;10(1):1784. Epub 2019 Apr 16.

The Jackson Laboratory for Genomic Medicine, Farmington, CT, 06032, USA.

The incomplete identification of structural variants (SVs) from whole-genome sequencing data limits studies of human genetic diversity and disease association. Here, we apply a suite of long-read, short-read, strand-specific sequencing technologies, optical mapping, and variant discovery algorithms to comprehensively analyze three trios to define the full spectrum of human genetic variation in a haplotype-resolved manner. We identify 818,054 indel variants (<50 bp) and 27,622 SVs (≥50 bp) per genome. We also discover 156 inversions per genome and 58 of the inversions intersect with the critical regions of recurrent microdeletion and microduplication syndromes. Taken together, our SV callsets represent a three to sevenfold increase in SV detection compared to most standard high-throughput sequencing studies, including those from the 1000 Genomes Project. The methods and the dataset presented serve as a gold standard for the scientific community allowing us to make recommendations for maximizing structural variation sensitivity for future genome sequencing studies.
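
The 50 bp boundary between indels and SVs used above can be expressed as a short check on allele lengths. This is a minimal sketch assuming VCF-style reference and alternate allele strings. The function name is hypothetical, and balanced events such as inversions, which do not change length, fall outside what it can classify.

```python
SV_MIN_LENGTH_BP = 50  # size threshold stated in the abstract


def classify_by_length(ref: str, alt: str) -> str:
    """Label a variant as 'indel' or 'SV' from the allele length difference."""
    length = abs(len(alt) - len(ref))
    if length == 0:
        # Substitutions and balanced rearrangements are not length variants.
        return "not a length variant"
    return "SV" if length >= SV_MIN_LENGTH_BP else "indel"


# Hypothetical alleles: a 49 bp insertion and a 50 bp insertion.
print(classify_by_length("A", "A" + "T" * 49))
print(classify_by_length("A", "A" + "T" * 50))
```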

Source
DOI: http://dx.doi.org/10.1038/s41467-018-08148-z
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6467913
April 2019