Publications by authors named "Mahdi Sarmady"

29 Publications

Utilizing nanopore sequencing technology for the rapid and comprehensive characterization of eleven HLA loci; addressing the need for deceased donor expedited HLA typing.

Hum Immunol 2020 Aug 25;81(8):413-422. Epub 2020 Jun 25.

Department of Pathology and Laboratory Medicine, Children's Hospital of Philadelphia, Philadelphia, PA, USA; Department of Pathology and Laboratory Medicine, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA.

The comprehensive characterization of human leukocyte antigen (HLA) genomic sequences remains a challenging problem. Despite the significant advantages of next-generation sequencing (NGS) in the field of Immunogenetics, there has yet to be a single solution for unambiguous, accurate, simple, cost-effective, and timely genotyping necessary for all clinical applications. This report demonstrates the benefits of nanopore sequencing introduced by Oxford Nanopore Technologies (ONT) for HLA genotyping. Samples (n = 120) previously characterized at high-resolution three-field (HR-3F) for 11 loci were assessed using ONT sequencing paired to a single-plex PCR protocol (Holotype) and to two multiplex protocols OmniType (Omixon) and NGSgo®-MX6-1 (GenDx). The results demonstrate the potential of nanopore sequencing for delivering accurate HR-3F typing with a simple, rapid, and cost-effective protocol. The protocol is applicable to time-sensitive applications, such as deceased donor typings, enabling better assessments of compatibility and epitope analysis. The technology also allows significantly shorter turnaround time for multiple samples at a lower cost. Overall, the nanopore technology appears to offer a significant advancement over current next-generation sequencing platforms as a single solution for all HLA genotyping needs.

DOI: http://dx.doi.org/10.1016/j.humimm.2020.06.004
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7870017
August 2020

Phen2Gene: rapid phenotype-driven gene prioritization for rare diseases.

NAR Genom Bioinform 2020 Jun 25;2(2):lqaa032. Epub 2020 May 25.

Center for Cellular and Molecular Therapeutics, Children's Hospital of Philadelphia, Philadelphia, PA 19104, USA.

Human Phenotype Ontology (HPO) terms are increasingly used in diagnostic settings to aid in the characterization of patient phenotypes. The HPO annotation database is updated frequently and can provide detailed phenotype knowledge on various human diseases, and many HPO terms are now mapped to candidate causal genes with binary relationships. To further improve the genetic diagnosis of rare diseases, we incorporated these HPO annotations, gene-disease databases and gene-gene databases in a probabilistic model to build a novel HPO-driven gene prioritization tool, Phen2Gene. Phen2Gene accesses a database built upon this information called the HPO2Gene Knowledgebase (H2GKB), which provides weighted and ranked gene lists for every HPO term. Phen2Gene is then able to access the H2GKB for patient-specific lists of HPO terms or PhenoPacket descriptions supported by GA4GH (http://phenopackets.org/), calculate a prioritized gene list based on a probabilistic model and output gene-disease relationships with great accuracy. Phen2Gene outperforms existing gene prioritization tools in speed and acts as a real-time phenotype-driven gene prioritization tool to aid the clinical diagnosis of rare undiagnosed diseases. In addition to a command line tool released under the MIT license (https://github.com/WGLab/Phen2Gene), we also developed a web server and web service (https://phen2gene.wglab.org/) for running the tool via web interface or RESTful API queries. Finally, we have curated a large amount of benchmarking data for phenotype-to-gene tools involving 197 patients across 76 scientific articles and 85 patients' de-identified HPO term data from the Children's Hospital of Philadelphia.
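
The abstract describes per-term weighted gene lists that are combined into one prioritized list for a patient. The following is a minimal sketch of that general idea only, assuming a simple sum of weights; it is not Phen2Gene's probabilistic model, and the term labels, gene names, weights, and the `prioritize` function are all hypothetical.

```python
from collections import defaultdict

# Hypothetical toy knowledgebase: phenotype term -> {gene: weight}.
# Term labels, gene names, and weights are invented for illustration only.
TOY_KB = {
    "TERM_A": {"GENE_1": 0.9, "GENE_2": 0.4},
    "TERM_B": {"GENE_2": 0.7, "GENE_3": 0.2},
    "TERM_C": {"GENE_1": 0.3, "GENE_3": 0.1},
}


def prioritize(patient_terms, kb=TOY_KB):
    """Sum per-term gene weights and return (gene, score) pairs, best first."""
    totals = defaultdict(float)
    for term in patient_terms:
        # Terms missing from the knowledgebase contribute nothing.
        for gene, weight in kb.get(term, {}).items():
            totals[gene] += weight
    # Sort by descending score, then by gene name for a stable order.
    return sorted(totals.items(), key=lambda item: (-item[1], item[0]))


for gene, score in prioritize(["TERM_A", "TERM_B"]):
    print(f"{gene}\t{score:.2f}")
```

A gene supported by several of the patient's terms rises in the ranking even when no single term weights it highly, which is the intuition behind aggregating across terms.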

DOI: http://dx.doi.org/10.1093/nargab/lqaa032
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7252576
June 2020

AnthOligo: automating the design of oligonucleotides for capture/enrichment technologies.

Bioinformatics 2020 08;36(15):4353-4356

Department of Pathology and Laboratory Medicine.

Summary: A number of methods have been devised to address the need for targeted genomic resequencing. One of these methods, region-specific extraction (RSE), is characterized by the capture of long DNA fragments (15-20 kb) by magnetic beads, after enzymatic extension of oligonucleotides hybridized to selected genomic regions. Facilitating the selection of the most appropriate capture oligos for targeting a region of interest, satisfying the properties of temperature (Tm) and entropy (ΔG), while minimizing the formation of primer-dimers in a pooled experiment, is therefore necessary. Manual design and selection of oligos becomes very challenging, complicated by factors such as length of the target region and number of targeted regions. Here we describe AnthOligo, a web-based application developed to optimally automate the process of generation of oligo sequences used to target and capture the continuum of large and complex genomic regions. Apart from generating oligos for RSE, this program may have wider applications in the design of customizable internal oligos to be used as baits for gene panel analysis or even probes for large-scale comparative genomic hybridization array processes. AnthOligo was tested by capturing the Major Histocompatibility Complex (MHC) of a random sample. The application provides users with a simple interface to upload an input file in BED format and customize parameters for each task. The task of probe design in AnthOligo commences when a user uploads an input file and concludes with the generation of a result-set containing an optimal set of region-specific oligos. AnthOligo is currently available as a public web application with URL: http://antholigo.chop.edu.

Supplementary Information: Supplementary data are available at Bioinformatics online.
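
The summary mentions BED-format input and oligo properties such as Tm. The sketch below shows, under stated assumptions, how a BED line can be read and how two crude oligo properties can be computed. It is not AnthOligo's algorithm: the abstract does not describe its thermodynamic model, so the rule-of-thumb formula 2(A+T) + 4(G+C) is swapped in purely for illustration, and the coordinates and sequence are made up.

```python
def parse_bed_line(line):
    """Parse the first three BED columns: chrom, 0-based start, exclusive end."""
    chrom, start, end = line.rstrip("\n").split("\t")[:3]
    return chrom, int(start), int(end)


def gc_fraction(oligo):
    """Fraction of G and C bases in the oligo."""
    oligo = oligo.upper()
    return (oligo.count("G") + oligo.count("C")) / len(oligo)


def wallace_tm(oligo):
    """Rule-of-thumb Tm in degrees C: 2*(A+T) + 4*(G+C).

    Only a rough guide for short oligos; real design tools use
    more detailed thermodynamic models.
    """
    oligo = oligo.upper()
    at = oligo.count("A") + oligo.count("T")
    gc = oligo.count("G") + oligo.count("C")
    return 2 * at + 4 * gc


# Made-up target region and candidate sequence, for illustration only.
chrom, start, end = parse_bed_line("chr6\t1000\t21000\tregion_1")
print(chrom, end - start)  # region length in bases
print(gc_fraction("ACGTACGTAC"), wallace_tm("ACGTACGTAC"))
```

Because BED intervals are 0-based and half-open, the region length is simply `end - start`.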

DOI: http://dx.doi.org/10.1093/bioinformatics/btaa552
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7520035
August 2020

Diagnosing Cornelia de Lange syndrome and related neurodevelopmental disorders using RNA sequencing.

Genet Med 2020 05 8;22(5):927-936. Epub 2020 Jan 8.

Genomics Center, Al Jalila Children's Specialty Hospital, Dubai, UAE.

Purpose: Neurodevelopmental disorders represent a frequent indication for clinical exome sequencing. Fifty percent of cases, however, remain undiagnosed even upon exome reanalysis. Here we show RNA sequencing (RNA-seq) on human B-lymphoblastoid cell lines (LCL) is highly suitable for neurodevelopmental Mendelian gene testing and demonstrate the utility of this approach in suspected cases of Cornelia de Lange syndrome (CdLS).

Methods: Genotype-Tissue Expression project transcriptome data for LCL, blood, and brain were assessed for neurodevelopmental Mendelian gene expression. Detection of abnormal splicing and pathogenic variants in these genes was performed with a novel RNA-seq diagnostic pipeline and using a validation CdLS-LCL cohort (n = 10) and test cohort of patients who carry a clinical diagnosis of CdLS but negative genetic testing (n = 5).

Results: LCLs share isoform diversity of brain tissue for a large subset of neurodevelopmental genes and express 1.8-fold more of these genes compared with blood (LCL, n = 1706; whole blood, n = 917). This enables testing of more than 1000 genetic syndromes. The RNA-seq pipeline had 90% sensitivity for detecting pathogenic events and revealed novel diagnoses such as abnormal splice products in NIPBL and pathogenic coding variants in BRD4 and ANKRD11.

Conclusion: The LCL transcriptome enables robust frontline and/or reflexive diagnostic testing for neurodevelopmental disorders.

DOI: http://dx.doi.org/10.1038/s41436-019-0741-5
May 2020

Using Machine Learning to Identify True Somatic Variants from Next-Generation Sequencing.

Clin Chem 2020 01;66(1):239-246

Division of Genomic Diagnostics, The Children's Hospital of Philadelphia, Philadelphia, PA.

Background: Molecular profiling has become essential for tumor risk stratification and treatment selection. However, cancer genome complexity and technical artifacts make identification of real variants a challenge. Currently, clinical laboratories rely on manual screening, which is costly, subjective, and not scalable. We present a machine learning-based method to distinguish artifacts from bona fide single-nucleotide variants (SNVs) detected by next-generation sequencing from nonformalin-fixed paraffin-embedded tumor specimens.

Methods: A cohort of 11278 SNVs identified through clinical sequencing of tumor specimens was collected and divided into training, validation, and test sets. Each SNV was manually inspected and labeled as either real or artifact as part of clinical laboratory workflow. A 3-class (real, artifact, and uncertain) model was developed on the training set, fine-tuned with the validation set, and then evaluated on the test set. Prediction intervals reflecting the certainty of the classifications were derived during the process to label "uncertain" variants.

Results: The optimized classifier demonstrated 100% specificity and 97% sensitivity over 5587 SNVs of the test set. Overall, 1252 of 1341 true-positive variants were identified as real, 4143 of 4246 false-positive calls were deemed artifacts, whereas only 192 (3.4%) SNVs were labeled as "uncertain," with zero misclassification between the true positives and artifacts in the test set.

Conclusions: We presented a computational classifier to identify variant artifacts detected from tumor sequencing. Overall, 96.6% of the SNVs received definitive labels and thus were exempt from manual review. This framework could improve quality and efficiency of the variant review process in clinical laboratories.
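
The counts reported in the Results can be cross-checked with a few lines of arithmetic. This sketch only recomputes figures from the numbers printed in the abstract; the variable names are illustrative and no part of the classifier itself is reproduced.

```python
# Counts as printed in the abstract's Results section.
true_variants, real_called = 1341, 1252
artifacts, artifact_called = 4246, 4143
test_set = 5587

# Variants that did not receive a definitive "real" or "artifact" label.
uncertain = (true_variants - real_called) + (artifacts - artifact_called)
definitive = real_called + artifact_called

# The two ground-truth groups should add up to the whole test set.
assert true_variants + artifacts == test_set

print(uncertain, f"{uncertain / test_set:.1%}")
print(definitive, f"{definitive / test_set:.1%}")
```

The two printed fractions correspond to the share of SNVs labeled "uncertain" and the share exempt from manual review.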

DOI: http://dx.doi.org/10.1373/clinchem.2019.308213
January 2020

A comparison of survival analysis methods for cancer gene expression RNA-Sequencing data.

Cancer Genet 2019 06 12;235-236:1-12. Epub 2019 Apr 12.

Department of Systems and Computational Biology, Albert Einstein College of Medicine, Bronx, NY, United States; Department of Epidemiology and Population Health, Albert Einstein College of Medicine, Bronx, NY, United States; Australian Institute for Bioengineering and Nanotechnology, The University of Queensland, Brisbane, Australia.

Identifying genetic biomarkers of patient survival remains a major goal of large-scale cancer profiling studies. Using gene expression data to predict the outcome of a patient's tumor makes biomarker discovery a compelling tool for improving patient care. As genomic technologies expand, multiple data types may serve as informative biomarkers, and bioinformatic strategies have evolved around these different applications. For categorical variables such as a gene's mutation status, biomarker identification to predict survival time is straightforward. However, for continuous variables like gene expression, the available methods generate highly-variable results, and studies on best practices are lacking. We investigated the performance of eight methods that deal specifically with continuous data. K-means, Cox regression, concordance index, D-index, 25th-75th percentile split, median-split, distribution-based splitting, and KaplanScan were applied to four RNA-sequencing (RNA-seq) datasets from the Cancer Genome Atlas. The reliability of the eight methods was assessed by splitting each dataset into two groups and comparing the overlap of the results. Gene sets that had been identified from the literature for a specific tumor type served as positive controls to assess the accuracy of each biomarker using receiver operating characteristic (ROC) curves. Artificial RNA-Seq data were generated to test the robustness of these methods under fixed levels of gene expression noise. Our results show that methods based on dichotomizing tend to have consistently poor performance while C-index, D-index, and k-means perform well in most settings. Overall, the Cox regression method had the strongest performance based on tests of accuracy, reliability, and robustness.
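
Two of the approaches named above, median-split dichotomization and the concordance index, can be sketched in plain Python. This is an illustrative toy under stated assumptions, not the study's code: the data are invented, higher expression is assumed to mean higher risk, and the function names are hypothetical.

```python
import statistics


def median_split(expression):
    """Label each sample True if its expression is above the cohort median."""
    cutoff = statistics.median(expression)
    return [value > cutoff for value in expression]


def concordance_index(times, events, scores):
    """Harrell-style C-index, treating a higher score as higher risk.

    A pair (i, j) is comparable when sample i has an observed event
    strictly before sample j's follow-up time.
    """
    concordant, comparable = 0.0, 0
    for i in range(len(times)):
        for j in range(len(times)):
            if events[i] and times[i] < times[j]:
                comparable += 1
                if scores[i] > scores[j]:
                    concordant += 1.0
                elif scores[i] == scores[j]:
                    concordant += 0.5  # ties in score count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Invented toy cohort: survival time, event indicator (1 = death observed,
# 0 = censored), and expression of a single gene.
times = [2, 4, 6, 8]
events = [1, 1, 0, 1]
expression = [9.0, 7.0, 5.0, 3.0]

print(median_split(expression))
print(concordance_index(times, events, expression))
```

The median split discards how far each sample sits from the cutoff, whereas the C-index uses the full ordering of the continuous values.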

DOI: http://dx.doi.org/10.1016/j.cancergen.2019.04.004
June 2019

Genetic variant pathogenicity prediction trained using disease-specific clinical sequencing data sets.

Genome Res 2019 07 24;29(7):1144-1151. Epub 2019 Jun 24.

Division of Genomic Diagnostics, Children's Hospital of Philadelphia, Philadelphia, Pennsylvania 19104, USA.

Recent advances in DNA sequencing have expanded our understanding of the molecular basis of genetic disorders and increased the utilization of clinical genomic tests. Given the paucity of evidence to accurately classify each variant and the difficulty of experimentally evaluating its clinical significance, a large number of variants generated by clinical tests are reported as variants of unknown clinical significance. Population-scale variant databases can improve clinical interpretation. Specifically, pathogenicity prediction for novel missense variants can use features describing regional variant constraint. Constrained genomic regions are those that have an unusually low variant count in the general population. Computational methods have been introduced to capture these regions and incorporate them into pathogenicity classifiers, but these methods have yet to be compared on an independent clinical variant data set. Here, we introduce one variant data set derived from clinical sequencing panels and use it to compare the ability of different genomic constraint metrics to determine missense variant pathogenicity. This data set is compiled from 17,071 patients surveyed with clinical genomic sequencing for cardiomyopathy, epilepsy, or RASopathies. We further use this data set to demonstrate the necessity of disease-specific classifiers and to train PathoPredictor, a disease-specific ensemble classifier of pathogenicity based on regional constraint and variant-level features. PathoPredictor achieves an average precision >90% for variants from all 99 tested disease genes while approaching 100% accuracy for some genes. The accumulation of larger clinical variant training data sets can significantly enhance their performance in a disease- and gene-specific manner.

DOI: http://dx.doi.org/10.1101/gr.240994.118
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6633260
July 2019

Clinical utility of custom-designed NGS panel testing in pediatric tumors.

Genome Med 2019 05 28;11(1):32. Epub 2019 May 28.

Department of Pathology and Laboratory Medicine, The Perelman School of Medicine at the University of Pennsylvania, Philadelphia, PA, 19104, USA.

Background: Somatic genetic testing is rapidly becoming the standard of care in many adult and pediatric cancers. Previously, the standard approach was single-gene or focused multigene testing, but many centers have moved towards broad-based next-generation sequencing (NGS) panels. Here, we report the laboratory validation and clinical utility of a large cohort of clinical NGS somatic sequencing results in diagnosis, prognosis, and treatment of a wide range of pediatric cancers.

Methods: Subjects were accrued retrospectively at a single pediatric quaternary-care hospital. Sequence analyses were performed on 367 pediatric cancer samples using custom-designed NGS panels over a 15-month period. Cases were profiled for mutations, copy number variations, and fusions identified through sequencing, and their clinical impact on diagnosis, prognosis, and therapy was assessed.

Results: NGS panel testing was incorporated meaningfully into clinical care in 88.7% of leukemia/lymphomas, 90.6% of central nervous system (CNS) tumors, and 62.6% of non-CNS solid tumors included in this cohort. A change in diagnosis as a result of testing occurred in 3.3% of cases. Additionally, 19.4% of all patients had variants requiring further evaluation for potential germline alteration.

Conclusions: Use of somatic NGS panel testing resulted in a significant impact on clinical care, including diagnosis, prognosis, and treatment planning in 78.7% of pediatric patients tested in our institution. Somatic NGS tumor testing should be implemented as part of the routine diagnostic workup of newly diagnosed and relapsed pediatric cancer patients.

DOI: http://dx.doi.org/10.1186/s13073-019-0644-8
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6537185
May 2019

Use of a Dynamic Genetic Testing Approach for Childhood-Onset Epilepsy.

JAMA Netw Open 2019 04 5;2(4):e192129. Epub 2019 Apr 5.

Division of Genomic Diagnostics, Department of Pathology and Laboratory Medicine, The Children's Hospital of Philadelphia, Philadelphia, Pennsylvania.

Importance: Although genetic testing is important for bringing precision medicine to children with epilepsy, it is unclear what genetic testing strategy is best in maximizing diagnostic yield.

Objectives: To evaluate the diagnostic yield of an exome-based gene panel for childhood epilepsy and discuss the value of follow-up testing.

Design, Setting, And Participants: A case series study of data from clinical genetic testing at Children's Hospital of Philadelphia was conducted from September 26, 2016, to January 8, 2018. Initial testing targeted 100 curated epilepsy genes for sequence and copy number analysis in 151 children with idiopathic epilepsy referred consecutively by neurologists. Additional genetic testing options were offered afterward.

Exposures: Clinical genetic testing.

Main Outcomes And Measures: Molecular diagnostic findings.

Results: Of 151 patients (84 boys [55.6%]; median age, 4.2 years [interquartile range, 1.4-8.7 years]), 16 children (10.6%; 95% CI, 6%-16%) received a diagnosis after initial panel analysis. Parental testing for 15 probands with inconclusive results revealed de novo variants in 7 individuals (46.7%), resulting in an overall diagnostic yield of 15.3% (23 of 151; 95% CI, 9%-21%). Twelve probands with nondiagnostic panel findings were reflexed to exome sequencing, and 4 were diagnostic (33.3%; 95% CI, 6%-61%), raising the overall diagnostic yield to 17.9% (27 of 151; 95% CI, 12%-24%). The yield was highest (17 of 44 [38.6%; 95% CI, 24%-53%]) among probands with epilepsy onset in infancy (age, 1-12 months). Panel diagnostic findings involved 16 genes: SCN1A (n = 4), PRRT2 (n = 3), STXBP1 (n = 2), IQSEC2 (n = 2), ATP1A2, ATP1A3, CACNA1A, GABRA1, KCNQ2, KCNT1, SCN2A, SCN8A, DEPDC5, TPP1, PCDH19, and UBE3A (all n = 1). Exome sequencing analysis identified 4 genes: SMC1A, SETBP1, NR2F1, and TRIT1. For the remaining 124 patients, analysis of 13 additional genes implicated in epilepsy since the panel was launched in 2016 revealed promising findings in 6 patients.

Conclusions And Relevance: Exome-based targeted panels appear to enable rapid analysis of a preselected set of genes while retaining flexibility in gene content. Successive genetic workup should include parental testing of select probands with inconclusive results and reflex to whole-exome trio analysis for the remaining nondiagnostic cases. Periodic reanalysis is needed to capture information in newly identified disease genes.
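
The stepwise yields in the Results are simple proportions, and they can be recomputed as below. The abstract does not state which confidence interval method was used, so the normal-approximation (Wald) interval here is an assumption, and the step labels and function name are illustrative.

```python
import math


def proportion_with_wald_ci(successes, total, z=1.96):
    """Return (proportion, lower, upper) using a normal-approximation interval."""
    p = successes / total
    half_width = z * math.sqrt(p * (1 - p) / total)
    return p, max(0.0, p - half_width), min(1.0, p + half_width)


# Counts taken from the abstract; labels are paraphrased.
steps = [
    ("initial panel", 16, 151),
    ("panel + parental testing", 23, 151),
    ("panel + parental testing + exome", 27, 151),
    ("onset in infancy", 17, 44),
]

for label, diagnosed, tested in steps:
    p, lo, hi = proportion_with_wald_ci(diagnosed, tested)
    print(f"{label}: {diagnosed}/{tested} = {p:.1%} (approx. 95% CI {lo:.0%}-{hi:.0%})")
```

Small differences from the printed figures, for example in the last digit, can come from rounding or from a different interval method.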

DOI: http://dx.doi.org/10.1001/jamanetworkopen.2019.2129
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6481455
April 2019

Rapid and accurate interpretation of clinical exomes using Phenoxome: a computational phenotype-driven approach.

Eur J Hum Genet 2019 04 9;27(4):612-620. Epub 2019 Jan 9.

Division of Genomic Diagnostics, Department of Pathology and Laboratory Medicine, The Children's Hospital of Philadelphia, Philadelphia, PA, USA.

Clinical exome sequencing (CES) has become the preferred diagnostic platform for complex pediatric disorders with suspected monogenic etiologies. Despite rapid advancements, the major challenge still resides in identifying the causal variants among the thousands of variants detected during CES testing, and thus establishing a molecular diagnosis. To improve the clinical exome diagnostic efficiency, we developed Phenoxome, a robust phenotype-driven model that adopts a network-based approach to facilitate automated variant prioritization. Phenoxome dissects the phenotypic manifestation of a patient in concert with their genomic profile to filter and then prioritize variants that are likely to affect the function of the gene (potentially pathogenic variants). To validate our method, we have compiled a clinical cohort of 105 positive patient samples that represent a wide range of genetic heterogeneity. Phenoxome identifies the causative variants within the top 5, 10, or 25 candidates in more than 50%, 71%, or 88% of these exomes, respectively. Furthermore, we show that our method is optimized for clinical testing by outperforming the current state-of-the-art method. We have demonstrated the performance of Phenoxome using a clinical cohort and showed that it enables rapid and accurate interpretation of clinical exomes. Phenoxome is available at https://phenoxome.chop.edu/.

DOI: http://dx.doi.org/10.1038/s41431-018-0328-7
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6460638
April 2019

A mutation update for the PCDH19 gene causing early-onset epilepsy in females with an unusual expression pattern.

Hum Mutat 2019 03 10;40(3):243-257. Epub 2019 Jan 10.

Al Jalila Children's Specialty Hospital, Dubai, United Arab Emirates.

The PCDH19 gene consists of six exons encoding a 1,148 amino acid transmembrane protein, Protocadherin 19, which is involved in brain development. Heterozygous pathogenic variants in this gene are inherited in an unusual X-linked dominant pattern in which heterozygous females are affected, while hemizygous males are typically unaffected, although they pass on the pathogenic variant to each affected daughter. PCDH19-related disorder is known to cause early-onset epilepsy in females characterized by seizure clusters exacerbated by fever and in most cases, onset is within the first year of life. This condition was initially described in 1971 and in 2008 PCDH19 was identified as the underlying genetic etiology. This condition is the result of pathogenic loss-of-function variants that may be de novo or inherited from an affected mother or unaffected father and cellular interference has been hypothesized to be the culprit. Heterozygous females are symptomatic because of the presence of both wild-type and mutant cells that interfere with one another due to the production of different surface proteins, whereas nonmosaic hemizygous males produce a homogenous population of cells. Here, we review novel pathogenic variants in the PCDH19 gene since 2012 to date, and summarize any genotype-phenotype correlations.

DOI: http://dx.doi.org/10.1002/humu.23701
March 2019

Automated Clinical Exome Reanalysis Reveals Novel Diagnoses.

J Mol Diagn 2019 01;21(1):38-48

Division of Genomic Diagnostics, Children's Hospital of Philadelphia, Philadelphia; Department of Pathology and Laboratory Medicine, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania.

Clinical exome sequencing (CES) has a reported diagnostic yield of 20% to 30% for most clinical indications. The ongoing discovery of novel gene-disease and variant-disease associations is expected to increase the diagnostic yield of CES. Performing systematic reanalysis of previously nondiagnostic CES samples represents a significant challenge for clinical laboratories. Here, we present the results of a novel automated reanalysis methodology applied to 300 CES samples initially analyzed between June 2014 and September 2016. Application of our reanalysis methodology reduced the variant analysis burden of reanalysis by >93% and correctly captured 70 of 70 previously identified diagnostic variants among 60 samples with previously identified diagnoses. Notably, reanalysis of 240 initially nondiagnostic samples using information available on July 1, 2017, revealed 38 novel diagnoses, representing a 15.8% increase in diagnostic yield. Modeling monthly iterative reanalysis of 240 nondiagnostic samples revealed a diagnostic rate of 0.57% of samples per month. Modeling the workload required for monthly iterative reanalysis of nondiagnostic samples revealed a variant analysis burden of approximately 5 variants/month for proband-only and approximately 0.5 variants/month for trio samples. Approximately 45% of samples required evaluation during each monthly interval, and 61.3% of samples were reevaluated across three consecutive reanalyses. In sum, automated reanalysis methods can facilitate efficient reevaluation of nondiagnostic samples using up-to-date literature and can provide significant value to clinical laboratories.

DOI: http://dx.doi.org/10.1016/j.jmoldx.2018.07.008
January 2019

Correction: Novel findings with reassessment of exome data: implications for validation testing and interpretation of genomic data.

Genet Med 2018 10;20(10):1298

Division of Genomic Diagnostics, Children's Hospital of Philadelphia, Philadelphia, Pennsylvania, USA.

In the published version of this article, the degree of author Bo Zhang was incorrectly listed as PhD. The correct degree is BS.

DOI: http://dx.doi.org/10.1038/gim.2017.264
October 2018

Need for Automated Interactive Genomic Interpretation and Ongoing Reanalysis.

JAMA Pediatr 2018 12;172(12):1113-1114

Department of Genetics, Al Jalila Children's Specialty Hospital, Dubai, United Arab Emirates.

DOI: http://dx.doi.org/10.1001/jamapediatrics.2018.2675
December 2018

The Development and Validation of Clinical Exome-Based Panels Using ExomeSlicer: Considerations and Proof of Concept Using an Epilepsy Panel.

J Mol Diagn 2018 09 22;20(5):643-652. Epub 2018 Jun 22.

Division of Genomic Diagnostics, The Children's Hospital of Philadelphia, Philadelphia, Pennsylvania; Department of Pathology and Laboratory Medicine, The University of Pennsylvania Perelman School of Medicine, Philadelphia, Pennsylvania; Genetics Department, Al Jalila Children's Specialty Hospital, Dubai, United Arab Emirates.

Exome-based panels are becoming the preferred diagnostic strategy in clinical laboratories. This approach enables dynamic gene content update and, if needed, cost-effective reflex to whole-exome sequencing. Currently, no guidelines or appropriate resources are available to support the clinical implementation of exome-based panels. Here, we highlight principles and important considerations for the clinical development and validation of exome-based panels. In addition, we developed ExomeSlicer, a novel, web-based resource, which uses empirical exon-level next-generation sequencing quality metrics to predict and visualize technically challenging exome-wide regions in any gene or genes of interest. Exome sequencing data from 100 clinical epilepsy cases were used to illustrate the clinical utility of ExomeSlicer in predicting poor-quality regions and its impact on streamlining the ad hoc Sanger sequencing fill in burden. With the use of ExomeSlicer, >2100 low complexity and/or high-homology regions affecting >1615 genes across the exome were also characterized. These regions can be a source of false-positive or false-negative variant calls, which can lead to misdiagnoses in tested patients and/or inaccurate functional annotations. We provide important considerations and a novel resource for the clinical development of exome-based panels.

DOI: http://dx.doi.org/10.1016/j.jmoldx.2018.05.003
September 2018

Utility and limitations of exome sequencing as a genetic diagnostic tool for children with hearing loss.

Genet Med 2018 12 15;20(12):1663-1676. Epub 2018 Jun 15.

Department of Pediatrics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA.

Purpose: Hearing loss (HL) is the most common sensory disorder in children. Prompt molecular diagnosis may guide screening and management, especially in syndromic cases when HL is the single presenting feature. Exome sequencing (ES) is an appealing diagnostic tool for HL as the genetic causes are highly heterogeneous.

Methods: ES was performed on a prospective cohort of 43 probands with HL. Sequence data were analyzed for primary and secondary findings. Capture and coverage analysis was performed for genes and variants associated with HL.

Results: The diagnostic rate using ES was 37.2%, compared with 15.8% for the clinical HL panel. Secondary findings were discovered in three patients. For 247 genes associated with HL, 94.7% of the exons were targeted for capture and 81.7% of these exons were covered at 20× or greater. Further analysis of 454 randomly selected HL-associated variants showed that 89% were targeted for capture and 75% were covered at a read depth of at least 20×.

Conclusion: ES has an improved yield compared with clinical testing and may capture diagnoses not initially considered due to subtle clinical phenotypes. Technical challenges were identified, including inadequate capture and coverage of HL genes. Additional considerations of ES include secondary findings, cost, and turnaround time.

DOI: http://dx.doi.org/10.1038/s41436-018-0004-x
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6295269
December 2018

AUDIOME: a tiered exome sequencing-based comprehensive gene panel for the diagnosis of heterogeneous nonsyndromic sensorineural hearing loss.

Genet Med 2018 12 29;20(12):1600-1608. Epub 2018 Mar 29.

Division of Genomic Diagnostics, Department of Pathology and Laboratory Medicine, The Children's Hospital of Philadelphia, Philadelphia, Pennsylvania, USA.

Purpose: Hereditary hearing loss is highly heterogeneous. To keep up with rapidly emerging disease-causing genes, we developed the AUDIOME test for nonsyndromic hearing loss (NSHL) using an exome sequencing (ES) platform and targeted analysis for the curated genes.

Methods: A tiered strategy was implemented for this test. Tier 1 includes combined Sanger and targeted deletion analyses of the two most common NSHL genes and two mitochondrial genes. Nondiagnostic tier 1 cases are subjected to ES and array followed by targeted analysis of the remaining AUDIOME genes.

Results: ES resulted in good coverage of the selected genes with 98.24% of targeted bases at >15 ×. A fill-in strategy was developed for the poorly covered regions, which generally fell within GC-rich or highly homologous regions. Prospective testing of 33 patients with NSHL revealed a diagnosis in 11 (33%) and a possible diagnosis in 8 cases (24.2%). Among those, 10 individuals had variants in tier 1 genes. The ES data in the remaining nondiagnostic cases are readily available for further analysis.

Conclusion: The tiered and ES-based test provides an efficient and cost-effective diagnostic strategy for NSHL, with the potential to reflex to full exome to identify causal changes outside of the AUDIOME test.
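
The Results describe coverage as the share of targeted bases above a depth threshold, with a fill-in strategy for poorly covered regions. A minimal sketch of both calculations follows, assuming per-base depths are already available as a list; the depth values and function names are invented, and this is not the AUDIOME pipeline.

```python
def fraction_above_depth(depths, threshold=15):
    """Fraction of positions with read depth strictly greater than threshold."""
    return sum(1 for depth in depths if depth > threshold) / len(depths)


def low_coverage_runs(depths, threshold=15):
    """Return half-open (start, end) index ranges where depth <= threshold."""
    runs, run_start = [], None
    for position, depth in enumerate(depths):
        if depth <= threshold:
            if run_start is None:
                run_start = position  # a low-coverage run begins here
        elif run_start is not None:
            runs.append((run_start, position))
            run_start = None
    if run_start is not None:
        # The final run extends to the end of the target.
        runs.append((run_start, len(depths)))
    return runs


# Invented per-base depths for a tiny target, for illustration only.
depths = [30, 28, 12, 9, 15, 40, 16, 3]
print(fraction_above_depth(depths))
print(low_coverage_runs(depths))
```

The ranges returned by `low_coverage_runs` are the kind of intervals a laboratory might flag for supplementary sequencing.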

Source
http://dx.doi.org/10.1038/gim.2018.48
December 2018

Correction: Novel findings with reassessment of exome data: implications for validation testing and interpretation of genomic data.

Genet Med 2018 11;20(11):1486

Division of Genomic Diagnostics, Children's Hospital of Philadelphia, Philadelphia, Pennsylvania, USA.

In the published version of this article, the name of the 18th author was misspelled as Minjie Lou. The correct name is Minjie Luo. The authors regret the error.

Source
http://dx.doi.org/10.1038/gim.2018.1
November 2018

Novel findings with reassessment of exome data: implications for validation testing and interpretation of genomic data.

Genet Med 2018 03 12;20(3):329-336. Epub 2017 Oct 12.

Division of Genomic Diagnostics, Children's Hospital of Philadelphia, Philadelphia, Pennsylvania, USA.

Purpose: The objective of this study was to assess the ability of our laboratory's exome-sequencing test to detect known and novel sequence variants and identify the critical factors influencing the interpretation of a clinical exome test.

Methods: We developed a two-tiered validation strategy: (i) a method-based approach that assessed the ability of our exome test to detect known variants using a reference HapMap sample, and (ii) an interpretation-based approach that assessed our relative ability to identify and interpret disease-causing variants, by analyzing and comparing the results of 19 randomly selected patients previously tested by external laboratories.

Results: We demonstrate that this approach is reproducible with >99% analytical sensitivity and specificity for single-nucleotide variants and indels <10 bp. Our findings were concordant with the reference laboratories in 84% of cases. A new molecular diagnosis was applied to three cases, including discovery of two novel candidate genes.

Conclusion: We provide an assessment of critical areas that influence interpretation of an exome test, including comprehensive phenotype capture, assessment of clinical overlap, availability of parental data, and the addressing of limitations in database updates. These results can be used to inform improvements in phenotype-driven interpretation of medical exomes in clinical and research settings.

Source
http://dx.doi.org/10.1038/gim.2017.153
March 2018

Characterizing reduced coverage regions through comparison of exome and genome sequencing data across 10 centers.

Genet Med 2018 08 16;20(8):855-866. Epub 2017 Nov 16.

Broad Institute of MIT and Harvard, Cambridge, Massachusetts, USA.

Purpose: As massively parallel sequencing is increasingly being used for clinical decision making, it has become critical to understand parameters that affect sequencing quality and to establish methods for measuring and reporting clinical sequencing standards. In this report, we propose a definition for reduced coverage regions and describe a set of standards for variant calling in clinical sequencing applications.

Methods: To enable sequencing centers to assess the regions of poor sequencing quality in their own data, we optimized and used a tool (ExCID) to identify reduced coverage loci within genes or regions of particular interest. We used this framework to examine sequencing data from 500 patients generated in 10 projects at sequencing centers in the National Human Genome Research Institute/National Cancer Institute Clinical Sequencing Exploratory Research Consortium.

Results: This approach identified reduced coverage regions in clinically relevant genes, including known clinically relevant loci that were uniquely missed at individual centers, in multiple centers, and in all centers.

Conclusion: This report provides a process road map for clinical sequencing centers looking to perform similar analyses on their data.

Source
http://dx.doi.org/10.1038/gim.2017.192
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6456263
August 2018

Transcriptome analysis of IL-10-stimulated (M2c) macrophages by next-generation sequencing.

Immunobiology 2017 07 20;222(7):847-856. Epub 2017 Feb 20.

School of Biomedical Engineering, Science and Health Systems, Drexel University, 3141 Chestnut Street, Philadelphia, PA, 19104, USA. Electronic address:

Alternatively activated "M2" macrophages are believed to function during late stages of wound healing, behaving in an anti-inflammatory manner to mediate the resolution of the pro-inflammatory response caused by "M1" macrophages. However, the differences between two main subtypes of M2 macrophages, namely interleukin-4 (IL-4)-stimulated "M2a" macrophages and IL-10-stimulated "M2c" macrophages, are not well understood. M2a macrophages are characterized by their ability to inhibit inflammation and contribute to the stabilization of angiogenesis. However, the role and temporal profile of M2c macrophages in wound healing are not known. Therefore, we performed next generation sequencing (RNA-seq) to identify biological functions and gene expression signatures of macrophages polarized in vitro with IL-10 to the M2c phenotype in comparison to M1 and M2a macrophages and an unactivated control (M0). We then explored the expression of these gene signatures in a publicly available data set of human wound healing. RNA-seq analysis showed that hundreds of genes were upregulated in M2c macrophages compared to the M0 control, with thousands of alternative splicing events. Following validation by Nanostring, 39 genes were found to be upregulated by M2c macrophages compared to the M0 control, and 17 genes were significantly upregulated relative to the M0, M1, and M2a phenotypes (using an adjusted p-value cutoff of 0.05 and fold change cutoff of 1.5). Many of the identified M2c-specific genes are associated with angiogenesis, matrix remodeling, and phagocytosis, including CD163, MMP8, TIMP1, VCAN, SERPINA1, MARCO, PLOD2, PCOLCE2 and F5. Analysis of the macrophage-conditioned media for secretion of matrix-remodeling proteins showed that M2c macrophages secreted higher levels of MMP7, MMP8, and TIMP1 compared to the other phenotypes.
Interestingly, temporal gene expression analysis of a publicly available microarray data set of human wound healing showed that M2c-related genes were upregulated at early times after injury, similar to M1-related genes, while M2a-related genes appeared at later stages or were downregulated after injury. While further studies are required to confirm the timing and role of M2c macrophages in vivo, these results suggest that M2c macrophages may function at early stages of wound healing. Identification of markers of the M2c phenotype will allow more detailed investigations into the role of M2c macrophages in vivo.
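
The selection criterion mentioned in the abstract (adjusted p-value cutoff of 0.05 and fold change cutoff of 1.5) can be sketched as a simple filter. This is an illustration only: it assumes the p-values have already been adjusted by whatever method the authors used (not stated here), it assumes a strict `<` on the p-value and an inclusive `>=` on the fold change, and the gene labels and numbers are placeholders rather than results from the study.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class GeneResult:
    gene: str           # placeholder label, not a real gene from the study
    fold_change: float  # expression in M2c divided by expression in the comparison group
    adj_p: float        # p-value after multiple-testing adjustment (method assumed done upstream)


def upregulated(results: List[GeneResult],
                max_adj_p: float = 0.05,
                min_fold_change: float = 1.5) -> List[str]:
    """Return genes passing both the adjusted p-value and fold change cutoffs."""
    return [
        r.gene
        for r in results
        if r.adj_p < max_adj_p and r.fold_change >= min_fold_change
    ]


# Made-up values purely to exercise the filter.
toy_results = [
    GeneResult("GENE_A", 2.4, 0.001),   # passes both cutoffs
    GeneResult("GENE_B", 1.2, 0.0004),  # significant but fold change too small
    GeneResult("GENE_C", 3.1, 0.20),    # large fold change but not significant
    GeneResult("GENE_D", 1.5, 0.049),   # sits exactly on the fold change cutoff
]
print(upregulated(toy_results))
```

Requiring a gene to pass this filter against each of M0, M1, and M2a separately would correspond to the stricter "upregulated relative to all three phenotypes" set described above.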

Source
http://dx.doi.org/10.1016/j.imbio.2017.02.006
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5719494
July 2017

Using large sequencing data sets to refine intragenic disease regions and prioritize clinical variant interpretation.

Genet Med 2017 05 22;19(5):496-504. Epub 2016 Sep 22.

Division of Genomic Diagnostics, Department of Pathology and Laboratory Medicine, The Children's Hospital of Philadelphia, University of Pennsylvania Perelman School of Medicine, Philadelphia, Pennsylvania, USA.

Purpose: Classification of novel variants is a major challenge facing the widespread adoption of comprehensive clinical genomic sequencing and the field of personalized medicine in general. This is largely because most novel variants do not have functional, genetic, or population data to support their clinical classification.

Methods: To improve variant interpretation, we leveraged the Exome Aggregation Consortium (ExAC) data set (N = ~60,000) as well as 7,000 clinically curated variants in 132 genes identified in more than 11,000 probands clinically tested for cardiomyopathies, rasopathies, hearing loss, or connective tissue disorders to perform a systematic evaluation of domain level disease associations.

Results: We statistically identify regions that are most sensitive to functional variation in the general population and also most commonly impacted in symptomatic individuals. Our data show that a significant number of exons and domains in genes strongly associated with disease can be defined as disease-sensitive or disease-tolerant, leading to potential reclassification of at least 26% (450 out of 1,742) of variants of uncertain clinical significance in the 132 genes.

Conclusion: This approach leverages domain functional annotation and associated disease in each gene to prioritize candidate disease variants, increasing the sensitivity and specificity of novel variant assessment within these genes. Genet Med advance online publication 22 September 2016.

Source
http://dx.doi.org/10.1038/gim.2016.134
May 2017

Exome sequencing analysis reveals variants in primary immunodeficiency genes in patients with very early onset inflammatory bowel disease.

Gastroenterology 2015 Nov 17;149(6):1415-24. Epub 2015 Jul 17.

Division of Human Genetics, The Children's Hospital of Philadelphia; Department of Pediatrics, Department of Biostatistics and Epidemiology, Perelman School of Medicine, University of Pennsylvania; Department of Molecular Medicine, University Sapienza, Rome, Italy.

Background & Aims: Very early onset inflammatory bowel disease (VEO-IBD), IBD diagnosed at 5 years of age or younger, frequently presents with a different and more severe phenotype than older-onset IBD. We investigated whether patients with VEO-IBD carry rare or novel variants in genes associated with immunodeficiencies that might contribute to disease development.

Methods: Patients with VEO-IBD and parents (when available) were recruited from the Children's Hospital of Philadelphia from March 2013 through July 2014. We analyzed DNA from 125 patients with VEO-IBD (age, 3 wk to 4 y) and 19 parents, 4 of whom also had IBD. Exome capture was performed by Agilent SureSelect V4, and sequencing was performed using the Illumina HiSeq platform. Alignment to human genome GRCh37 was achieved followed by postprocessing and variant calling. After functional annotation, candidate variants were analyzed for change in protein function, minor allele frequency less than 0.1%, and scaled combined annotation-dependent depletion scores of 10 or less. We focused on genes associated with primary immunodeficiencies and related pathways. An additional 210 exome samples from patients with pediatric IBD (n = 45) or adult-onset Crohn's disease (n = 20) and healthy individuals (controls, n = 145) were obtained from the University of Kiel, Germany, and used as control groups.

Results: Four hundred genes and regions associated with primary immunodeficiency, covering approximately 6500 coding exons totaling more than 1 Mbp of coding sequence, were selected from the whole-exome data. Our analysis showed novel and rare variants within these genes that could contribute to the development of VEO-IBD, including rare heterozygous missense variants in IL10RA and previously unidentified variants in MSH5 and CD19.

Conclusions: In an exome sequence analysis of patients with VEO-IBD and their parents, we identified variants in genes that regulate B- and T-cell functions and could contribute to pathogenesis. Our analysis could lead to the identification of previously unidentified IBD-associated variants.
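
The candidate-variant filter described in the Methods can be sketched as follows. This is a hedged illustration, not the study's pipeline: the `Variant` fields, the consequence labels, and the gene names are hypothetical, and the CADD comparison simply mirrors the abstract's wording ("10 or less") while being exposed as a parameter.

```python
from dataclasses import dataclass
from typing import Iterable, List, Set

# Hypothetical consequence labels standing in for "change in protein function".
PROTEIN_ALTERING = {"missense", "nonsense", "frameshift", "splice_site"}


@dataclass
class Variant:
    gene: str           # placeholder gene label
    consequence: str    # functional annotation label (hypothetical vocabulary)
    maf: float          # minor allele frequency as a fraction (0.001 == 0.1%)
    scaled_cadd: float  # scaled CADD score


def is_candidate(v: Variant, gene_panel: Set[str],
                 max_maf: float = 0.001, cadd_cutoff: float = 10.0) -> bool:
    """Apply the filters as worded in the abstract: panel gene, protein-altering,
    MAF below 0.1%, and scaled CADD score of cadd_cutoff or less."""
    return (
        v.gene in gene_panel
        and v.consequence in PROTEIN_ALTERING
        and v.maf < max_maf
        and v.scaled_cadd <= cadd_cutoff
    )


def filter_candidates(variants: Iterable[Variant], gene_panel: Set[str]) -> List[Variant]:
    """Keep only variants that pass every filter."""
    return [v for v in variants if is_candidate(v, gene_panel)]


# Made-up variants and panel, for illustration only.
toy_panel = {"GENE1", "GENE2"}
toy_variants = [
    Variant("GENE1", "missense", 0.0002, 8.5),    # passes
    Variant("GENE1", "synonymous", 0.0002, 3.0),  # not protein-altering
    Variant("GENE2", "frameshift", 0.01, 5.0),    # too common
    Variant("GENE3", "missense", 0.0001, 9.0),    # gene not on the panel
    Variant("GENE2", "nonsense", 0.0, 10.0),      # passes (score equals cutoff)
]
print([(v.gene, v.consequence) for v in filter_candidates(toy_variants, toy_panel)])
```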

Source
http://dx.doi.org/10.1053/j.gastro.2015.07.006
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4853027
November 2015

Utility and limitations of exome sequencing as a genetic diagnostic tool for conditions associated with pediatric sudden cardiac arrest/sudden cardiac death.

Hum Genomics 2015 Jul 19;9:15. Epub 2015 Jul 19.

Department of Pediatrics, Perelman School of Medicine at the University of Pennsylvania, Philadelphia, PA, USA.

Background: Conditions associated with sudden cardiac arrest/death (SCA/D) in youth often have a genetic etiology. While SCA/D is uncommon, a pro-active family screening approach may identify these inherited structural and electrical abnormalities prior to symptomatic events and allow appropriate surveillance and treatment. This study investigated the diagnostic utility of exome sequencing (ES) by evaluating the capture and coverage of genes related to SCA/D.

Methods: Samples from 102 individuals (13 with known molecular etiologies for SCA/D, 30 individuals without known molecular etiologies for SCA/D and 59 with other conditions) were analyzed following exome capture and sequencing at an average read depth of 100X. Reads were mapped to human genome GRCh37 using Novoalign, and post-processing and analysis were done using Picard and GATK. A total of 103 genes (2,190 exons) related to SCA/D were used as a primary filter. An additional 100 random variants within the targeted genes associated with SCA/D were also selected and evaluated for depth of sequencing and coverage. Although the primary objective was to evaluate the adequacy of depth of sequencing and coverage of targeted SCA/D genes and not for primary diagnosis, all patients who had SCA/D (known or unknown molecular etiologies) were evaluated with the project's variant analysis pipeline to determine if the molecular etiologies could be successfully identified.

Results: The majority of exons (97.6 %) were captured and fully covered on average at a minimum of 20x sequencing depth. The proportion of unique genomic positions reported within poorly covered exons remained small (4 %). Exonic regions with less coverage reflect the need to enrich these areas to improve coverage. Despite limitations in coverage, we identified 100 % of cases with a prior known molecular etiology for SCA/D, and analysis of an additional 30 individuals with SCA/D but no known molecular etiology revealed a diagnostic answer in 5/30 (17 %). We also demonstrated that 95 % of 100 randomly selected reported variants within our targeted genes would have been detected by ES based on our coverage analysis.

Conclusions: ES is a helpful clinical diagnostic tool for SCA/D given its potential to successfully identify a molecular diagnosis, but clinicians should be aware of limitations of available platforms from technical and diagnostic perspectives.

Source
http://dx.doi.org/10.1186/s40246-015-0038-y
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4506570
July 2015

Exome sequencing expands the mechanism of SOX5-associated intellectual disability: A case presentation with review of sox-related disorders.

Am J Med Genet A 2015 Nov 25;167A(11):2548-54. Epub 2015 Jun 25.

Division of Genomic Diagnostics, Children's Hospital of Philadelphia, Philadelphia, Pennsylvania.

The SOX5 haploinsufficiency syndrome is characterized by global developmental delay, intellectual disability, language and motor impairment, and distinct facial features. The smallest deletion encompassed only one gene, SOX5 (OMIM 604975), indicating that haploinsufficiency of SOX5 contributes to neurodevelopmental delay. Although multiple deletions of the SOX5 gene have been reported in patients, none are strictly intragenic point mutations. Here, we report the identification of a de novo loss of function variant in SOX5 identified through whole exome sequencing. The proband presented with moderate developmental delay, bilateral optic atrophy, mildly dysmorphic features, and scoliosis, which correlates with the previously described SOX5-associated phenotype. These results broaden the diagnostic spectrum of SOX5-related intellectual disability. Furthermore, they highlight the utility of exome sequencing in establishing an etiological basis in clinically and genetically heterogeneous conditions such as intellectual disability.

Source
http://dx.doi.org/10.1002/ajmg.a.37221
November 2015

mtDNA Variation and Analysis Using Mitomap and Mitomaster.

Curr Protoc Bioinformatics 2013 Dec;44:1.23.1-26

Center for Mitochondrial and Epigenomic Medicine, Children's Hospital of Philadelphia Research Institute, Philadelphia, PA; Department of Pathology and Laboratory Medicine; University of Pennsylvania, Philadelphia, PA; phone: 1-267-425-3078; fax: 1-267-426-0978.

The Mitomap database of human mitochondrial DNA (mtDNA) information has been an important compilation of mtDNA variation for researchers, clinicians and genetic counselors for the past twenty-five years. The Mitomap protocol shows how users may look up human mitochondrial gene loci, search for public mitochondrial sequences, and browse or search for reported general population nucleotide variants as well as those reported in clinical disease. Within Mitomap is the powerful sequence analysis tool for human mitochondrial DNA, Mitomaster. The Mitomaster protocol gives step-by-step instructions showing how to submit sequences to identify nucleotide variants relative to the rCRS, to determine the haplogroup, and to view species conservation. User-supplied sequences, GenBank identifiers and single nucleotide variants may be analyzed.

Source
http://dx.doi.org/10.1002/0471250953.bi0123s44
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4257604
December 2013

Efficient digest of high-throughput sequencing data in a reproducible report.

BMC Bioinformatics 2013 13;14 Suppl 11:S3. Epub 2013 Sep 13.

Background: High-throughput sequencing (HTS) technologies are spearheading the accelerated development of biomedical research. Processing and summarizing the large amount of data generated by HTS presents a non-trivial challenge to bioinformatics. A commonly adopted standard is to store sequencing reads aligned to a reference genome in SAM (Sequence Alignment/Map) or BAM (Binary Alignment/Map) files. Quality control of SAM/BAM files is a critical checkpoint before downstream analysis. The goal of the current project is to facilitate and standardize this process.

Results: We developed bamchop, a robust program to efficiently summarize key statistical metrics of HTS data stored in BAM files, and to visually present the results in a formatted report. The report documents information about various aspects of HTS data, such as sequencing quality, mapping to a reference genome, sequencing coverage, and base frequency. Bamchop uses the R language and Bioconductor packages to calculate statistical matrices and the Sweave utility and associated LaTeX markup for documentation. Bamchop's efficiency and robustness were tested on BAM files generated by local sequencing facilities and the 1000 Genomes Project. Source code, instruction and example reports of bamchop are freely available from https://github.com/CBMi-BiG/bamchop.

Conclusions: Bamchop enables biomedical researchers to quickly and rigorously evaluate HTS data by providing a convenient synopsis and user-friendly reports.

Source
http://dx.doi.org/10.1186/1471-2105-14-S11-S3
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3846741
May 2014

HIV protein sequence hotspots for crosstalk with host hub proteins.

PLoS One 2011 15;6(8):e23293. Epub 2011 Aug 15.

Center for Integrated Bioinformatics, Drexel University, Philadelphia, Pennsylvania, United States of America.

HIV proteins target host hub proteins for transient binding interactions. The presence of viral proteins in the infected cell results in out-competition of host proteins in their interaction with hub proteins, drastically affecting cell physiology. Functional genomics and interactome datasets can be used to quantify the sequence hotspots on the HIV proteome mediating interactions with host hub proteins. In this study, we used the HIV and human interactome databases to identify HIV targeted host hub proteins and their host binding partners (H2). We developed a high throughput computational procedure utilizing motif discovery algorithms on sets of protein sequences, including sequences of HIV and H2 proteins. We identified as HIV sequence hotspots those linear motifs that are highly conserved on HIV sequences and at the same time have a statistically enriched presence on the sequences of H2 proteins. The HIV protein motifs discovered in this study are expressed by subsets of H2 host proteins potentially outcompeted by HIV proteins. A large subset of these motifs is involved in cleavage, nuclear localization, phosphorylation, and transcription factor binding events. Many such motifs are clustered on an HIV sequence in the form of hotspots. The sequential positions of these hotspots are consistent with the curated literature on phenotype altering residue mutations, as well as with existing binding site data. The hotspot map produced in this study is the first global portrayal of HIV motifs involved in altering the host protein network at highly connected hub nodes.

Source
http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0023293
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3156123
February 2012

Sequence- and interactome-based prediction of viral protein hotspots targeting host proteins: a case study for HIV Nef.

PLoS One 2011 28;6(6):e20735. Epub 2011 Jun 28.

Center for Integrated Bioinformatics, School of Biomedical Engineering, Science, and Health Systems, Drexel University, Philadelphia, Pennsylvania, United States of America.

Virus proteins alter protein pathways of the host toward the synthesis of viral particles by breaking and making edges via binding to host proteins. In this study, we developed a computational approach to predict viral sequence hotspots for binding to host proteins based on sequences of viral and host proteins and literature-curated virus-host protein interactome data. We use a motif discovery algorithm repeatedly on collections of sequences of viral proteins and immediate binding partners of their host targets and choose only those motifs that are conserved on viral sequences and highly statistically enriched among binding partners of virus protein targeted host proteins. Our results match experimental data on binding sites of Nef to host proteins such as MAPK1, VAV1, LCK, HCK, HLA-A, CD4, FYN, and GNB2L1 with high statistical significance, but the approach is a poor predictor of Nef binding sites on highly flexible, hoop-like regions. Predicted hotspots recapture CD8 cell epitopes of HIV Nef, highlighting their importance in modulating virus-host interactions. Host proteins potentially targeted or outcompeted by Nef appear to crowd the T cell receptor, natural killer cell mediated cytotoxicity, and neurotrophin signaling pathways. Scanning of HIV Nef motifs on multiple alignments of hepatitis C protein NS5A produces results consistent with the literature, indicating the potential value of the hotspot discovery in advancing our understanding of virus-host crosstalk.

Source
http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0020735
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3125164
December 2011