Publications by authors named "Michael D Linderman"

29 Publications

NPSV: A simulation-driven approach to genotyping structural variants in whole-genome sequencing data.

Gigascience 2021 Jul;10(7)

Mindich Child Health and Development Institute and the Departments of Pediatrics and Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, One Gustave Levy Place, Box 1040, New York, NY 10029, USA.

Background: Structural variants (SVs) play a causal role in numerous diseases but are difficult to detect and accurately genotype (determine zygosity) in whole-genome next-generation sequencing data. SV genotypers that assume that the aligned sequencing data uniformly reflect the underlying SV or use existing SV call sets as training data can only partially account for variant and sample-specific biases.

Results: We introduce NPSV, a machine learning-based approach for genotyping previously discovered SVs that uses next-generation sequencing simulation to model the combined effects of the genomic region, sequencer, and alignment pipeline on the observed SV evidence. We evaluate NPSV alongside existing SV genotypers on multiple benchmark call sets. We show that NPSV consistently achieves or exceeds state-of-the-art genotyping accuracy across SV call sets, samples, and variant types. NPSV can specifically identify putative de novo SVs in a trio context and is robust to offset SV breakpoints.

Conclusions: Growing SV databases and the increasing availability of SV calls from long-read sequencing make stand-alone genotyping of previously identified SVs an increasingly important component of genome analyses. By treating potential biases as a "black box" that can be simulated, NPSV provides a framework for accurately genotyping a broad range of SVs in both targeted and genome-scale applications.

Source
DOI: http://dx.doi.org/10.1093/gigascience/giab046
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8246072
July 2021

Development and Validation of a Comprehensive Genomics Knowledge Scale.

Public Health Genomics 2021 May 31:1-13. Epub 2021 May 31.

Division of Genetics, Department of Medicine, Brigham and Women's Hospital, Boston, Massachusetts, USA.

Background: Genomic testing is increasingly employed in clinical, research, educational, and commercial contexts. Genomic literacy is a prerequisite for the effective application of genomic testing, creating a corresponding need for validated tools to assess genomics knowledge. We sought to develop a reliable measure of genomics knowledge that incorporates modern genomic technologies and is informative for individuals with diverse backgrounds, including those with clinical/life sciences training.

Methods: We developed the GKnowM Genomics Knowledge Scale to assess the knowledge needed to make an informed decision for genomic testing, appropriately apply genomic technologies and participate in civic decision-making. We administered the 30-item draft measure to a calibration cohort (n = 1,234) and subsequent participants to create a combined validation cohort (n = 2,405). We performed a multistage psychometric calibration and validation using classical test theory and item response theory (IRT) and conducted a post-hoc simulation study to evaluate the suitability of a computerized adaptive testing (CAT) implementation.

Results: Based on exploratory factor analysis, we removed 4 of the 30 draft items. The resulting 26-item GKnowM measure has a single dominant factor. The scale internal consistency is α = 0.85, and the IRT 3-PL model demonstrated good overall and item fit. Validity is demonstrated with significant correlation (r = 0.61) with an existing genomics knowledge measure and significantly higher scores for individuals with adequate health literacy and healthcare providers (HCPs), including HCPs who work with genomic testing. The item bank is well suited to CAT, achieving high accuracy (r = 0.97 with the full measure) while administering a mean of 13.5 items.

Conclusion: GKnowM is an updated, broadly relevant, rigorously validated 26-item measure for assessing genomics knowledge that we anticipate will be useful for assessing population genomic literacy and evaluating the effectiveness of genomics educational interventions.
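
For readers unfamiliar with the 3-PL model mentioned in the Results, its item response function can be sketched as follows. This is a minimal illustration of the standard three-parameter logistic form, not the authors' calibration code; the function name and the parameter values in the usage line are hypothetical.

```python
import math


def three_pl_probability(theta, a, b, c):
    """Probability of a correct response under the 3-PL IRT model.

    theta: respondent ability
    a: item discrimination
    b: item difficulty
    c: pseudo-guessing parameter (lower asymptote)
    """
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))


# Hypothetical item: average difficulty, unit discrimination, 20% guessing floor.
p = three_pl_probability(theta=0.0, a=1.0, b=0.0, c=0.2)
print(f"P(correct) = {p:.2f}")
```

When ability equals item difficulty, the probability sits halfway between the guessing floor and 1, which is why the guessing parameter matters for multiple-choice knowledge items.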

Source
DOI: http://dx.doi.org/10.1159/000515006
May 2021

MySeq: privacy-protecting browser-based personal Genome analysis for genomics education and exploration.

BMC Med Genomics 2019 11 27;12(1):172. Epub 2019 Nov 27.

Department of Computer Science, Middlebury College, Middlebury, VT, USA.

Background: The complexity of genome informatics is a recurring challenge for genome exploration and analysis by students and other non-experts. This complexity creates a barrier to wider implementation of experiential genomics education, even in settings with substantial computational resources and expertise. Reducing the need for specialized software tools will increase access to hands-on genomics pedagogy.

Results: MySeq is a React.js single-page web application for privacy-protecting interactive personal genome analysis. All analyses are performed entirely in the user's web browser, eliminating the need to install and use specialized software tools or to upload sensitive data to an external web service. MySeq leverages Tabix indexing to efficiently query whole-genome-scale variant call format (VCF) files stored locally or available remotely via HTTP(S) without loading the entire file. MySeq currently implements variant querying and annotation, physical trait prediction, and pharmacogenomic, polygenic disease risk, and ancestry analyses to provide representative pedagogical examples, and can be readily extended with new analysis or visualization components.

Conclusions: MySeq supports multiple pedagogical approaches including independent exploration and interactive online tutorials. MySeq has been successfully employed in an undergraduate human genome analysis course where it reduced the barriers-to-entry for hands-on human genome analysis.

Source
DOI: http://dx.doi.org/10.1186/s12920-019-0615-3
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6882182
November 2019

DECA: scalable XHMM exome copy-number variant calling with ADAM and Apache Spark.

BMC Bioinformatics 2019 Oct 11;20(1):493. Epub 2019 Oct 11.

AMPLab, University of California, Berkeley, Berkeley, CA, USA.

Background: XHMM is a widely used tool for copy-number variant (CNV) discovery from whole exome sequencing data but can require hours to days to run for large cohorts. A more scalable implementation would reduce the need for specialized computational resources and enable increased exploration of the configuration parameter space to obtain the best possible results.

Results: DECA is a horizontally scalable implementation of the XHMM algorithm using the ADAM framework and Apache Spark that incorporates novel algorithmic optimizations to eliminate unneeded computation. DECA parallelizes XHMM on both multi-core shared memory computers and large shared-nothing Spark clusters. We performed CNV discovery from the read-depth matrix in 2535 exomes in 9.3 min on a 16-core workstation (35.3× speedup vs. XHMM), 12.7 min using 10 executor cores on a Spark cluster (18.8× speedup vs. XHMM), and 9.8 min using 32 executor cores on Amazon AWS' Elastic MapReduce. We performed CNV discovery from the original BAM files in 292 min using 640 executor cores on a Spark cluster.

Conclusions: We describe DECA's performance, our algorithmic and implementation enhancements to XHMM to obtain that performance, and our lessons learned porting a complex genome analysis application to ADAM and Spark. ADAM and Apache Spark are a performant and productive platform for implementing large-scale genome analyses, but efficiently utilizing large clusters can require algorithmic optimizations and careful attention to Spark's configuration parameters.
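
The speedups reported in the Results imply XHMM baseline runtimes that the abstract does not state directly. The sketch below recovers them by simple arithmetic, assuming speedup is defined as baseline runtime divided by DECA runtime; the function name is hypothetical.

```python
def implied_baseline_minutes(deca_minutes, speedup):
    """Baseline runtime implied by a runtime and a speedup factor.

    Assumes speedup = baseline_runtime / deca_runtime.
    """
    return deca_minutes * speedup


# Figures taken from the abstract: (DECA minutes, speedup vs. XHMM).
reported = {
    "16-core workstation": (9.3, 35.3),
    "10 executor cores on a Spark cluster": (12.7, 18.8),
}

for setup, (minutes, speedup) in reported.items():
    baseline = implied_baseline_minutes(minutes, speedup)
    print(f"{setup}: implied XHMM runtime of about {baseline / 60:.1f} hours")
```

The two implied baselines differ, presumably because XHMM was timed on each platform separately; the abstract does not say so explicitly.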

Source
DOI: http://dx.doi.org/10.1186/s12859-019-3108-7
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6787990
October 2019

Predispositional genome sequencing in healthy adults: design, participant characteristics, and early outcomes of the PeopleSeq Consortium.

Genome Med 2019 02 27;11(1):10. Epub 2019 Feb 27.

Division of Genetics, Department of Medicine, Brigham and Women's Hospital, 41 Avenue Louis Pasteur, Suite 301, Boston, MA, 02115, USA.

Background: Increasing numbers of healthy individuals are undergoing predispositional personal genome sequencing. Here we describe the design and early outcomes of the PeopleSeq Consortium, a multi-cohort collaboration of predispositional genome sequencing projects, which is examining the medical, behavioral, and economic outcomes of returning genomic sequencing information to healthy individuals.

Methods: Apparently healthy adults who participated in four of the sequencing projects in the Consortium were included. Web-based surveys were administered before and after genomic results disclosure, or in some cases only after results disclosure. Surveys inquired about sociodemographic characteristics, motivations and concerns, behavioral and medical responses to sequencing results, and perceived utility.

Results: Among 1395 eligible individuals, 658 enrolled in the Consortium when contacted and 543 have completed a survey after receiving their genomic results thus far (mean age 53.0 years, 61.4% male, 91.7% white, 95.5% college graduates). Most participants (98.1%) were motivated to undergo sequencing because of curiosity about their genetic make-up. The most commonly reported concerns prior to pursuing sequencing included how well the results would predict future risk (59.2%) and the complexity of genetic variant interpretation (56.8%), while 47.8% of participants were concerned about the privacy of their genetic information. Half of participants reported discussing their genomic results with a healthcare provider during a median of 8.0 months after receiving the results; 13.5% reported making an additional appointment with a healthcare provider specifically because of their results. Few participants (< 10%) reported making changes to their diet, exercise habits, or insurance coverage because of their results. Many participants (39.5%) reported learning something new to improve their health that they did not know before. Reporting regret or harm from the decision to undergo sequencing was rare (< 3.0%).

Conclusions: Healthy individuals who underwent predispositional sequencing expressed some concern around privacy prior to pursuing sequencing, but were enthusiastic about their experience and not distressed by their results. While reporting value in their health-related results, few participants reported making medical or lifestyle changes.

Source
DOI: http://dx.doi.org/10.1186/s13073-019-0619-9
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6391825
February 2019

Impacts of incorporating personal genome sequencing into graduate genomics education: a longitudinal study over three course years.

BMC Med Genomics 2018 01 30;11(1). Epub 2018 Jan 30.

Icahn Institute for Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, New York, NY, USA.

Background: To address the need for more effective genomics training, beginning in 2012 the Icahn School of Medicine at Mount Sinai has offered a unique laboratory-style graduate genomics course, "Practical Analysis of Your Personal Genome" (PAPG), in which students optionally sequence and analyze their own whole genome. We hypothesized that incorporating personal genome sequencing (PGS) into the course pedagogy could improve educational outcomes by increasing student motivation and engagement. Here we extend our initial study of the pilot PAPG cohort with a report on student attitudes towards genome sequencing, decision-making, psychological wellbeing, genomics knowledge and pedagogical engagement across three course years.

Methods: Students enrolled in the 2013, 2014 and 2015 course years completed questionnaires before (T1) and after (T2) a prerequisite workshop (n = 110) and before (T3) and after (T4) PAPG (n = 66).

Results: Students' interest in PGS was high; 56 of 59 eligible students chose to sequence their own genome. Decisional conflict significantly decreased after the prerequisite workshop (T2 vs. T1 p < 0.001). Most, but not all, students reported low levels of decision regret and test-related distress post-course (T4). Each year baseline decisional conflict decreased (p < 0.001), suggesting that, as the course became more established, students increasingly made their decision prior to enrolling in the prerequisite workshop. Students perceived that analyzing their own genome enhanced the genomics pedagogy, with students self-reporting being more persistent and engaged as a result of analyzing their own genome. More than 90% of respondents reported spending additional time outside of course assignments analyzing their genome.

Conclusions: Incorporating personal genome sequencing in graduate medical education may improve student motivation and engagement. However, more data will be needed to quantitatively evaluate whether incorporating PGS is more effective than other educational approaches.

Source
DOI: http://dx.doi.org/10.1186/s12920-018-0319-0
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5791365
January 2018

Concordance between Research Sequencing and Clinical Pharmacogenetic Genotyping in the eMERGE-PGx Study.

J Mol Diagn 2017 07 11;19(4):561-566. Epub 2017 May 11.

Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, New York.

There has been extensive debate about both the necessity of orthogonal confirmation of next-generation sequencing (NGS) results in Clinical Laboratory Improvement Amendments-approved laboratories and return of research NGS results to participants enrolled in research studies. In eMERGE-PGx, subjects underwent research NGS using PGRNseq and orthogonal targeted genotyping in clinical laboratories, which prompted a comparison of genotyping results between platforms. Concordance (percentage agreement) was reported for 4077 samples tested across nine combinations of research and clinical laboratories. Retesting was possible on a subset of 1792 samples, and local laboratory directors determined sources of genotype discrepancy. Research NGS and orthogonal clinical genotyping had an overall per sample concordance rate of 0.972 and per variant concordance rate of 0.997. Genotype discrepancies attributed to research NGS were because of sample switching (preanalytical errors), whereas the majority of genotype discrepancies (92.3%) attributed to clinical genotyping were because of allele dropout as a result of rare variants interfering with primer hybridization (analytical errors). These results highlight the analytical quality of clinically significant pharmacogenetic variants derived from NGS and reveal important areas for research and clinical laboratories to address with quality management programs.
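
Concordance as percentage agreement, as used in the abstract above, can be sketched as follows. This is a minimal illustration only: the function name and the star-allele genotype strings are hypothetical, and the study's actual comparison pipeline is not described here.

```python
def concordance(research_calls, clinical_calls):
    """Fraction of paired genotype calls that agree exactly."""
    if len(research_calls) != len(clinical_calls):
        raise ValueError("call lists must be paired and of equal length")
    if not research_calls:
        raise ValueError("at least one paired call is required")
    matches = sum(r == c for r, c in zip(research_calls, clinical_calls))
    return matches / len(research_calls)


# Hypothetical paired calls for four samples at one pharmacogene.
research = ["*1/*1", "*1/*2", "*2/*2", "*1/*1"]
clinical = ["*1/*1", "*1/*2", "*1/*2", "*1/*1"]
print(f"concordance = {concordance(research, clinical):.3f}")
```

Applying the same function once over samples and once over individual variant calls would give the per-sample and per-variant rates, which differ because a single discordant variant makes the whole sample discordant.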

Source
DOI: http://dx.doi.org/10.1016/j.jmoldx.2017.04.002
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5500823
July 2017

Psychological and behavioural impact of returning personal results from whole-genome sequencing: the HealthSeq project.

Eur J Hum Genet 2017 02 4;25(3):280-292. Epub 2017 Jan 4.

Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, USA.

Providing ostensibly healthy individuals with personal results from whole-genome sequencing could lead to improved health and well-being via enhanced disease risk prediction, prevention, and diagnosis, but also poses practical and ethical challenges. Understanding how individuals react psychologically and behaviourally will be key in assessing the potential utility of personal whole-genome sequencing. We conducted an exploratory longitudinal cohort study in which quantitative surveys and in-depth qualitative interviews were conducted before and after personal results were returned to individuals who underwent whole-genome sequencing. The participants were offered a range of interpreted results, including Alzheimer's disease, type 2 diabetes, pharmacogenomics, rare disease-associated variants, and ancestry. They were also offered their raw data. Of the 35 participants at baseline, 29 (82.9%) completed the 6-month follow-up. In the quantitative surveys, test-related distress was low, although it was higher at 1-week than 6-month follow-up (Z=2.68, P=0.007). In the 6-month qualitative interviews, most participants felt happy or relieved about their results. A few were concerned, particularly about rare disease-associated variants and Alzheimer's disease results. Two of the 29 participants had sought clinical follow-up as a direct or indirect consequence of rare disease-associated variants results. Several had mentioned their results to their doctors. Some participants felt having their raw data might be medically useful to them in the future. The majority reported positive reactions to having their genomes sequenced, but there were notable exceptions to this. The impact and value of returning personal results from whole-genome sequencing when implemented on a larger scale remains to be seen.

Source
DOI: http://dx.doi.org/10.1038/ejhg.2016.178
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5315514
February 2017

Personal Genome Sequencing in Ostensibly Healthy Individuals and the PeopleSeq Consortium.

J Pers Med 2016 Mar 25;6(2). Epub 2016 Mar 25.

Division of Genetics, Department of Medicine, Brigham and Women's Hospital, Boston, MA 02115, USA.

Thousands of ostensibly healthy individuals have had their exome or genome sequenced, but a much smaller number of these individuals have received any personal genomic results from that sequencing. We term those projects in which ostensibly healthy participants can receive sequencing-derived genetic findings and may also have access to their genomic data as participatory predispositional personal genome sequencing (PPGS). Here we are focused on genome sequencing applied in a pre-symptomatic context and so define PPGS to exclude diagnostic genome sequencing intended to identify the molecular cause of suspected or diagnosed genetic disease. In this report we describe the design of completed and underway PPGS projects, briefly summarize the results reported to date and introduce the PeopleSeq Consortium, a newly formed collaboration of PPGS projects designed to collect much-needed longitudinal outcome data.

Source
DOI: http://dx.doi.org/10.3390/jpm6020014
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4932461
March 2016

Impact of Genomic Counseling on Informed Decision-Making among ostensibly Healthy Individuals Seeking Personal Genome Sequencing: the HealthSeq Project.

J Genet Couns 2016 10 22;25(5):1044-53. Epub 2016 Feb 22.

Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place, Box 1498, New York, NY, USA.

Personal genome sequencing is increasingly utilized by healthy individuals for predispositional screening and other applications. However, little is known about the impact of 'genomic counseling' on informed decision-making in this context. Our primary aim was to compare measures of participants' informed decision-making before and after genomic counseling in the HealthSeq project, a longitudinal cohort study of individuals receiving personal results from whole genome sequencing (WGS). Our secondary aims were to assess the impact of the counseling on WGS knowledge and concerns, and to explore participants' satisfaction with the counseling. Questionnaires were administered to participants (n = 35) before and after their pre-test genomic counseling appointment. Informed decision-making was measured using the Decisional Conflict Scale (DCS) and the Satisfaction with Decision Scale (SDS). DCS scores decreased after genomic counseling (mean: 11.34 before vs. 5.94 after; z = -4.34, p < 0.001, r = 0.52), and SDS scores increased (mean: 27.91 vs. 29.06 respectively; z = 2.91, p = 0.004, r = 0.35). Satisfaction with counseling was high (mean (SD) = 26.91 (2.68), on a scale where 6 = low and 30 = high satisfaction). HealthSeq participants felt that their decision regarding receiving personal results from WGS was more informed after genomic counseling. Further research comparing the impact of different genomic counseling models is needed.
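
One common convention for the effect size of a Wilcoxon signed-rank test is r = |z| / sqrt(N), where N is the total number of observations. The abstract does not say which formula was used; the sketch below assumes this convention with N = 70 (35 participants measured twice), which reproduces the reported values after rounding. The function name is hypothetical.

```python
import math


def effect_size_r(z, n_observations):
    """Effect size r = |z| / sqrt(N) for a z-based nonparametric test."""
    return abs(z) / math.sqrt(n_observations)


# Assumption: N counts observations across both time points (35 x 2).
n_obs = 35 * 2
dcs_r = effect_size_r(-4.34, n_obs)  # Decisional Conflict Scale
sds_r = effect_size_r(2.91, n_obs)   # Satisfaction with Decision Scale
print(f"DCS r = {dcs_r:.2f}, SDS r = {sds_r:.2f}")
```

Under the usual rule of thumb, these correspond to a large effect for decisional conflict and a medium effect for satisfaction with the decision.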

Source
DOI: http://dx.doi.org/10.1007/s10897-016-9935-z
October 2016

Preparing the next generation of genomicists: a laboratory-style course in medical genomics.

BMC Med Genomics 2015 Aug 12;8:47. Epub 2015 Aug 12.

Icahn Institute for Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place, Box 1498, New York, NY, 10029, USA.

The growing gap between the demand for genome sequencing and the supply of trained genomics professionals is creating an acute need to develop more effective genomics education. In response we developed "Practical Analysis of Your Personal Genome", a novel laboratory-style medical genomics course in which students have the opportunity to obtain and analyze their own whole genome. This report describes our motivations for and the content of a "practical" genomics course that incorporates personal genome sequencing and the lessons we learned during the first three iterations of this course.

Source
DOI: http://dx.doi.org/10.1186/s12920-015-0124-y
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4534145
August 2015

Motivations, concerns and preferences of personal genome sequencing research participants: Baseline findings from the HealthSeq project.

Eur J Hum Genet 2016 Jan 3;24(1):14-20. Epub 2015 Jun 3.

Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, USA.

Whole exome/genome sequencing (WES/WGS) is increasingly offered to ostensibly healthy individuals. Understanding the motivations and concerns of research participants seeking out personal WGS and their preferences regarding return-of-results and data sharing will help optimize protocols for WES/WGS. Baseline interviews including both qualitative and quantitative components were conducted with research participants (n=35) in the HealthSeq project, a longitudinal cohort study of individuals receiving personal WGS results. Data sharing preferences were recorded during informed consent. In the qualitative interview component, the dominant motivations that emerged were obtaining personal disease risk information, satisfying curiosity, contributing to research, self-exploration and interest in ancestry, and the dominant concern was the potential psychological impact of the results. In the quantitative component, 57% endorsed concerns about privacy. Most wanted to receive all personal WGS results (94%) and their raw data (89%); a third (37%) consented to having their data shared to the Database of Genotypes and Phenotypes (dbGaP). Early adopters of personal WGS in the HealthSeq project express a variety of health- and non-health-related motivations. Almost all want all available findings, while also expressing concerns about the psychological impact and privacy of their results.

Source
DOI: http://dx.doi.org/10.1038/ejhg.2015.118
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4795230
January 2016

Novel, compound heterozygous, single-nucleotide variants in MARS2 associated with developmental delay, poor growth, and sensorineural hearing loss.

Hum Mutat 2015 Jun 8;36(6):587-92. Epub 2015 Apr 8.

Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, New York.

Novel, single-nucleotide mutations were identified in the mitochondrial methionyl amino-acyl tRNA synthetase gene (MARS2) via whole exome sequencing in two affected siblings with developmental delay, poor growth, and sensorineural hearing loss. We show that compound heterozygous mutations c.550C>T:p.Gln184* and c.424C>T:p.Arg142Trp in MARS2 lead to decreased MARS2 protein levels in patient lymphoblasts. Analysis of respiratory complex enzyme activities in patient fibroblasts revealed decreased complex I and IV activities. Immunoblotting of patient fibroblast and lymphoblast samples revealed reduced protein levels of NDUFB8 and COXII, representing complex I and IV, respectively. Additionally, overexpression of wild-type MARS2 in patient fibroblasts increased NDUFB8 and COXII protein levels. These findings suggest that recessive single-nucleotide mutations in MARS2 are causative for a new mitochondrial translation deficiency disorder with a primary phenotype including developmental delay and hypotonia. Identification of additional patients with single-nucleotide mutations in MARS2 is necessary to determine if pectus carinatum is also a consistent feature of this syndrome.

Source
DOI: http://dx.doi.org/10.1002/humu.22781
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4439286
June 2015

How do students react to analyzing their own genomes in a whole-genome sequencing course?: outcomes of a longitudinal cohort study.

Genet Med 2015 Nov 29;17(11):866-74. Epub 2015 Jan 29.

Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, New York, USA.

Purpose: Health-care professionals need to be trained to work with whole-genome sequencing (WGS) in their practice. Our aim was to explore how students responded to a novel genome analysis course that included the option to analyze their own genomes.

Methods: This was an observational cohort study. Questionnaires were administered before (T3) and after the genome analysis course (T4), as well as 6 months later (T5). In-depth interviews were conducted at T5.

Results: All students (n = 19) opted to analyze their own genomes. At T5, 12 of 15 students stated that analyzing their own genomes had been useful. Ten reported they had applied their knowledge in the workplace. Technical WGS knowledge increased (mean of 63.8% at T3, mean of 72.5% at T4; P = 0.005). In-depth interviews suggested that analyzing their own genomes may increase students' motivation to learn and their understanding of the patient experience. Most (but not all) of the students reported low levels of WGS results-related distress and low levels of regret about their decision to analyze their own genomes.

Conclusion: Giving students the option of analyzing their own genomes may increase motivation to learn, but some students may experience personal WGS results-related distress and regret. Additional evidence is required before considering incorporating optional personal genome analysis into medical education on a large scale.

Source
DOI: http://dx.doi.org/10.1038/gim.2014.203
November 2015

Analytical validation of whole exome and whole genome sequencing for clinical applications.

BMC Med Genomics 2014 Apr 23;7:20. Epub 2014 Apr 23.

Icahn Institute for Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, New York, NY, USA.

Background: Whole exome and genome sequencing (WES/WGS) is now routinely offered as a clinical test by a growing number of laboratories. As part of the test design process each laboratory must determine the performance characteristics of the platform, test and informatics pipeline. This report documents one such characterization of WES/WGS.

Methods: Whole exome and whole genome sequencing was performed on multiple technical replicates of five reference samples using the Illumina HiSeq 2000/2500. The sequencing data was processed with a GATK-based genome analysis pipeline to evaluate: intra-run, inter-run, inter-mode, inter-machine and inter-library consistency, concordance with orthogonal technologies (microarray, Sanger) and sensitivity and accuracy relative to known variant sets.

Results: Concordance to high-density microarrays consistently exceeds 97% (and typically exceeds 99%) and concordance between sequencing replicates also exceeds 97%, with no observable differences between different flow cells, runs, machines or modes. Sensitivity relative to high-density microarray variants exceeds 95%. In a detailed study of a 129 kb region, sensitivity was lower, with some validated single-base insertions and deletions "not called". Different variants are "not called" in each replicate: of all variants identified in WES data from the NA12878 reference sample, 74% of indels and 89% of SNVs were called in all seven replicates; in NA12878 WGS, 52% of indels and 88% of SNVs were called in all six replicates. Key sources of non-uniformity are variance in depth of coverage, artifactual variants resulting from repetitive regions and larger structural variants.

Conclusion: We report a comprehensive performance characterization of WES/WGS that will be relevant to offering laboratories, consumers of genome sequencing and others interested in the analytical validity of this technology.

Source
DOI: http://dx.doi.org/10.1186/1755-8794-7-20
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4022392
April 2014

Informed decision-making among students analyzing their personal genomes on a whole genome sequencing course: a longitudinal cohort study.

Genome Med 2013 30;5(12):113. Epub 2013 Dec 30.

Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, 1425 Madison Avenue, New York, NY 10029, USA; Icahn Institute for Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, 1425 Madison Avenue, New York, NY 10029, USA.

Background: Multiple laboratories now offer clinical whole genome sequencing (WGS). We anticipate WGS becoming routinely used in research and clinical practice. Many institutions are exploring how best to educate geneticists and other professionals about WGS. Providing students in WGS courses with the option to analyze their own genome sequence is one strategy that might enhance students' engagement and motivation to learn about personal genomics. However, if this option is presented to students, it is vital they make informed decisions, do not feel pressured into analyzing their own genomes by their course directors or peers, and feel free to analyze a third-party genome if they prefer. We therefore developed a 26-hour introductory genomics course in part to help students make informed decisions about whether to receive personal WGS data in a subsequent advanced genomics course. In the advanced course, they had the option to receive their own personal genome data, or an anonymous genome, at no financial cost to them. Our primary aims were to examine whether students made informed decisions regarding analyzing their personal genomes, and whether there was evidence that the introductory course enabled the students to make a more informed decision.

Methods: This was a longitudinal cohort study in which students (N = 19) completed questionnaires assessing their intentions, informed decision-making, attitudes and knowledge before (T1) and after (T2) the introductory course, and before the advanced course (T3). Informed decision-making was assessed using the Decisional Conflict Scale.

Results: At the start of the introductory course (T1), most (17/19) students intended to receive their personal WGS data in the subsequent course, but many expressed conflict around this decision. Decisional conflict decreased after the introductory course (T2), indicating an increase in informed decision-making, and did not change before the advanced course (T3). This suggests that the introductory course content, rather than simply the passage of time, had the effect. In the advanced course, all (19/19) students opted to receive their personal WGS data. No changes in technical knowledge of genomics were observed. Overall attitudes towards WGS were broadly positive.

Conclusions: Providing students with intensive introductory education about WGS may help them make informed decisions about whether or not to work with their personal WGS data in an educational setting.

Source
DOI: http://dx.doi.org/10.1186/gm518
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3971344
May 2014

Genome-wide association analysis of red blood cell traits in African Americans: the COGENT Network.

Hum Mol Genet 2013 Jun 26;22(12):2529-38. Epub 2013 Feb 26.

Division of Epidemiology and Biostatistics, Mel and Enid Zuckerman College of Public Health, University of Arizona, Tucson, AZ 85724, USA.

Laboratory red blood cell (RBC) measurements are clinically important, heritable and differ among ethnic groups. To identify genetic variants that contribute to RBC phenotypes in African Americans (AAs), we conducted a genome-wide association study in up to ~16 500 AAs. The alpha-globin locus on chromosome 16pter [lead SNP rs13335629 in ITFG3 gene; P < 1E-13 for hemoglobin (Hgb), RBC count, mean corpuscular volume (MCV), MCH and MCHC] and the G6PD locus on Xq28 [lead SNP rs1050828; P < 1E-13 for Hgb, hematocrit (Hct), MCV, RBC count and red cell distribution width (RDW)] were each associated with multiple RBC traits. At the alpha-globin region, both the common African 3.7 kb deletion and common single nucleotide polymorphisms (SNPs) appear to contribute independently to RBC phenotypes among AAs. In the 2p21 region, we identified a novel variant of PRKCE distinctly associated with Hct in AAs. In a genome-wide admixture mapping scan, local European ancestry at the 6p22 region containing HFE and LRRC16A was associated with higher Hgb. LRRC16A has been previously associated with the platelet count and mean platelet volume in AAs, but not with Hgb. Finally, we extended to AAs the findings of association of erythrocyte traits with several loci previously reported in Europeans and/or Asians, including CD164 and HBS1L-MYB. In summary, this large-scale genome-wide analysis in AAs has extended the importance of several RBC-associated genetic loci to AAs and identified allelic heterogeneity and pleiotropy at several previously known genetic loci associated with blood cell traits in AAs.

Source
DOI: http://dx.doi.org/10.1093/hmg/ddt087
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3658166
June 2013

CytoSPADE: high-performance analysis and visualization of high-dimensional cytometry data.

Bioinformatics 2012 Sep 10;28(18):2400-1. Epub 2012 Jul 10.

Department of Electrical Engineering, Stanford University, Stanford, CA, USA.

Motivation: Recent advances in flow cytometry enable simultaneous single-cell measurement of 30+ surface and intracellular proteins. CytoSPADE is a high-performance implementation of an interface for the Spanning-tree Progression Analysis of Density-normalized Events algorithm for tree-based analysis and visualization of this high-dimensional cytometry data.

Availability: Source code and binaries are freely available at http://cytospade.org and via Bioconductor version 2.10 onwards for Linux, OSX and Windows. CytoSPADE is implemented in R, C++ and Java.

Contact: [email protected]

Supplementary Information: Additional documentation available at http://cytospade.org.

Source
DOI: http://dx.doi.org/10.1093/bioinformatics/bts425
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3436846
September 2012

Extracting a cellular hierarchy from high-dimensional cytometry data with SPADE.

Nat Biotechnol 2011 Oct 2;29(10):886-91. Epub 2011 Oct 2.

Department of Radiology, Stanford University, Stanford, CA, USA.

The ability to analyze multiple single-cell parameters is critical for understanding cellular heterogeneity. Despite recent advances in measurement technology, methods for analyzing high-dimensional single-cell data are often subjective, labor intensive and require prior knowledge of the biological system. To objectively uncover cellular heterogeneity from single-cell measurements, we present a versatile computational approach, spanning-tree progression analysis of density-normalized events (SPADE). We applied SPADE to flow cytometry data of mouse bone marrow and to mass cytometry data of human bone marrow. In both cases, SPADE organized cells in a hierarchy of related phenotypes that partially recapitulated well-described patterns of hematopoiesis. We demonstrate that SPADE is robust to measurement noise and to the choice of cellular markers. SPADE facilitates the analysis of cellular heterogeneity, the identification of cell types and comparison of functional markers in response to perturbations.

Source
DOI: http://dx.doi.org/10.1038/nbt.1991
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3196363
October 2011

Computational solutions to large-scale data management and analysis.

Nat Rev Genet 2010 Sep;11(9):647-57

Pacific Biosciences, Menlo Park, California 94025, USA.

Today we can generate hundreds of gigabases of DNA and RNA sequencing data in a week for less than US$5,000. The astonishing rate of data generation by these low-cost, high-throughput technologies in genomics is being matched by that of other technologies, such as real-time imaging and mass spectrometry-based flow cytometry. Success in the life sciences will depend on our ability to properly interpret the large-scale, high-dimensional data sets that are generated by these technologies, which in turn requires us to adopt advances in informatics. Here we discuss how we can master the different types of computational environments that exist - such as cloud and heterogeneous computing - to successfully tackle our big data problems.

Source
DOI: http://dx.doi.org/10.1038/nrg2857
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3124937
September 2010

High-throughput Bayesian Network Learning using Heterogeneous Multicore Computers.

ICS 2010 Jun;2010:95-104

Microbiology and Immunology, Stanford University.

Aberrant intracellular signaling plays an important role in many diseases. The causal structure of signal transduction networks can be modeled as Bayesian Networks (BNs), and computationally learned from experimental data. However, learning the structure of BNs is an NP-hard problem that, even with fast heuristics, is too time consuming for large, clinically important networks (20-50 nodes). In this paper, we present a novel graphics processing unit (GPU)-accelerated implementation of a Monte Carlo Markov Chain-based algorithm for learning BNs that is up to 7.5-fold faster than current general-purpose processor (GPP)-based implementations. The GPU-based implementation is just one of several implementations within the larger application, each optimized for a different input or machine configuration. We describe the methodology we use to build an extensible application, assembled from these variants, that can target a broad range of heterogeneous systems, e.g., GPUs and multicore GPPs. Specifically, we show how we use the Merge programming model to efficiently integrate, test and intelligently select among the different potential implementations.

Source
DOI: http://dx.doi.org/10.1145/1810085.1810101
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5557010
June 2010

Towards Program Optimization through Automated Analysis of Numerical Precision.

Proc CGO 2010 Apr;2010:230-237

Microbiology & Immunology, Stanford University, Stanford, CA, USA.

Reducing the arithmetic precision of a computation has real performance implications, including increased speed, decreased power consumption, and a smaller memory footprint. For some architectures, e.g., GPUs, there can be such a large performance difference that using reduced precision is effectively a requirement. The tradeoff is that the accuracy of the computation will be compromised. In this paper we describe a proof assistant and associated static analysis techniques for efficiently bounding numerical and precision-related errors. The programmer/compiler can use these bounds to numerically verify and optimize an application for different input and machine configurations. We present several case study applications that demonstrate the effectiveness of these techniques and the performance benefits that can be achieved with rigorous precision analysis.
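
The accuracy tradeoff described in this abstract can be illustrated with a small sketch. It is not the paper's proof assistant or static analysis; it only simulates single-precision (binary32) summation in plain Python and compares the observed error with the classical a priori bound for recursive summation, gamma_(n-1) * sum(|x_i|). The helper names (`to_f32`, `sum_f32`, `summation_error_bound`) and the input data are assumptions made for illustration.

```python
import math
import struct

# Unit roundoff of IEEE 754 binary32 under round-to-nearest.
U32 = 2.0 ** -24


def to_f32(x: float) -> float:
    """Round a Python float (binary64) to the nearest binary32 value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def sum_f32(values):
    """Left-to-right summation with every partial sum rounded to binary32."""
    total = 0.0
    for v in values:
        total = to_f32(total + v)
    return total


def summation_error_bound(values):
    """Classical a priori bound gamma_(n-1) * sum(|x_i|) for recursive summation."""
    k = (len(values) - 1) * U32
    return k / (1.0 - k) * math.fsum(abs(v) for v in values)


# Hypothetical inputs, rounded first so they are exactly representable in binary32.
values = [to_f32(0.1 * i) for i in range(1, 1001)]

reference = math.fsum(values)   # accurately rounded binary64 sum
reduced = sum_f32(values)       # simulated single-precision sum
error = abs(reduced - reference)
bound = summation_error_bound(values)

print(f"observed error: {error:.6g}")
print(f"a priori bound: {bound:.6g}")
```

A bound of this kind is typically pessimistic, which is one reason tighter, program-specific analysis is attractive when deciding whether reduced precision is acceptable.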

Source
DOI: http://dx.doi.org/10.1145/1772954.1772987
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5552069
April 2010

Increasing the performance of cortically-controlled prostheses.

Conf Proc IEEE Eng Med Biol Soc 2006;Suppl:6652-6

Department of Electrical Engineering, Stanford University, Stanford, CA 94305, USA.

Neural prostheses have received considerable attention due to their potential to dramatically improve the quality of life of severely disabled patients. Cortically-controlled prostheses are able to translate neural activity from cerebral cortex into control signals for guiding computer cursors or prosthetic limbs. Non-invasive and invasive electrode techniques can be used to measure neural activity, with the latter promising considerably higher levels of performance and therefore functionality to patients. We review here some of our recent experimental and computational work aimed at establishing a principled design methodology to increase electrode-based cortical prosthesis performance to near theoretical limits. Studies discussed include translating unprecedentedly brief periods of "plan" activity into high information rate (6.5 bits/s) control signals, improving decode algorithms and optimizing visual target locations for further performance increases, and recording from chronically implanted arrays in freely behaving monkeys to characterize neuron stability. Taken together, these results should substantially increase the clinical viability of cortical prostheses.

Source
DOI: http://dx.doi.org/10.1109/IEMBS.2006.260912
April 2008

HermesB: a continuous neural recording system for freely behaving primates.

IEEE Trans Biomed Eng 2007 Nov;54(11):2037-50

Department of Electrical Engineering, Stanford University, Stanford, CA 94305, USA.

Chronically implanted electrode arrays have enabled a broad range of advances in basic electrophysiology and neural prosthetics. Those successes motivate new experiments, particularly the development of prototype implantable prosthetic processors for continuous use in freely behaving subjects, both monkeys and humans. However, traditional experimental techniques require the subject to be restrained, limiting both the types and duration of experiments. In this paper, we present a dual-channel, battery-powered neural recording system with an integrated three-axis accelerometer for use with chronically implanted electrode arrays in freely behaving primates. The recording system, called HermesB, is self-contained, autonomous, programmable, and capable of recording broadband neural (sampled at 30 kS/s) and acceleration data to a removable compact flash card for up to 48 h. We have collected long-duration data sets with HermesB from an adult macaque monkey which provide insight into time scales and free behaviors inaccessible under traditional experiments. Variations in action potential shape and root-mean-square (RMS) noise are observed across a range of time scales. The peak-to-peak voltage of action potentials varied by up to 30% over a 24-h period, including step changes in waveform amplitude (up to 25%) coincident with high acceleration movements of the head. These initial results suggest that spike-sorting algorithms can no longer assume stable neural signals and will need to transition to adaptive signal processing methodologies to maximize performance. During physically active periods (defined by a head-mounted accelerometer), significantly reduced 5-25-Hz local field potential (LFP) power and increased firing rate variability were observed. Using a threshold fit to LFP power, 93% of 403 5-min recording blocks were correctly classified as active or inactive, potentially providing an efficient tool for identifying different behavioral contexts in prosthetic applications. These results demonstrate the utility of the HermesB system and motivate using this type of system to advance neural prosthetics and electrophysiological experiments.

Source
DOI: http://dx.doi.org/10.1109/TBME.2007.895753
November 2007

Multiday electrophysiological recordings from freely behaving primates.

Conf Proc IEEE Eng Med Biol Soc 2006;2006:5643-6

Department of Computer Science, Stanford University, California, USA.

Continuous multiday broadband neural data provide a means for observing effects at fine timescales over long periods. In this paper we present analyses on such data sets to demonstrate neural correlates for physically active and inactive time periods, as defined by the response of a head-mounted accelerometer. During active periods, we found that 5-25 Hz local field potential (LFP) power was significantly reduced, firing rate variability increased, and firing rates had greater temporal correlation. Using a single threshold fit to LFP power, 93% of the 403 5-minute blocks tested were correctly classified as active or inactive (as labeled by thresholding each block's maximal accelerometer magnitude). These initial results motivate the use of such data sets for testing neural prosthetics systems and for finding the neural correlates of natural behaviors.
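
The single-threshold classification mentioned in this abstract can be sketched as follows. This is a minimal illustration, not the authors' procedure: it assumes each block is summarized by one LFP power value, that a block is predicted active when its power falls below the threshold (consistent with the reported power reduction during active periods), and that the threshold is chosen to maximize accuracy on labeled blocks. The function name `fit_threshold` and the toy data are hypothetical.

```python
def fit_threshold(power, active):
    """Return (threshold, accuracy) for the rule: active if power < threshold.

    power  -- per-block LFP power values (hypothetical summary statistic)
    active -- per-block boolean labels (True = active)
    """
    xs = sorted(set(power))
    # Candidate thresholds: one below all values, midpoints between
    # neighbouring distinct values, and one above all values.
    candidates = [xs[0] - 1.0]
    candidates += [(a + b) / 2 for a, b in zip(xs, xs[1:])]
    candidates += [xs[-1] + 1.0]

    best_threshold, best_accuracy = None, -1.0
    for t in candidates:
        correct = sum((p < t) == label for p, label in zip(power, active))
        accuracy = correct / len(power)
        if accuracy > best_accuracy:
            best_threshold, best_accuracy = t, accuracy
    return best_threshold, best_accuracy


# Toy data (made up for illustration): lower power during active blocks,
# with one active block that overlaps the inactive range.
power = [0.8, 1.0, 1.1, 1.3, 2.7, 2.4, 2.6, 2.9, 3.1, 3.3]
active = [True, True, True, True, True, False, False, False, False, False]

threshold, accuracy = fit_threshold(power, active)
print(f"threshold = {threshold:.2f}, accuracy = {accuracy:.0%}")
```

Fitting and scoring on the same blocks, as done here, gives an optimistic accuracy; a held-out split would be needed to estimate how well the threshold generalizes.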

Source
DOI: http://dx.doi.org/10.1109/IEMBS.2006.260412
March 2008

Neural recording stability of chronic electrode arrays in freely behaving primates.

Conf Proc IEEE Eng Med Biol Soc 2006;2006:4387-91

Dept. of Electr. Eng., Stanford Univ., CA, USA.

Chronically implanted electrode arrays have enabled a broad range of advances, particularly in the field of neural prosthetics. Those successes motivate development of prototype implantable prosthetic processors for long duration, continuous use in freely behaving subjects. However, traditional experimental protocols have provided limited information regarding the stability of the electrode arrays and their neural recordings. In this paper we present preliminary results derived from long duration neural recordings in a freely behaving primate which show variations in action potential shape and RMS noise across a range of time scales. These preliminary results suggest that spike sorting algorithms can no longer assume stable neural signals and will need to transition to adaptive signal processing methodologies to maximize performance.

Source
DOI: http://dx.doi.org/10.1109/IEMBS.2006.260814
April 2008

An autonomous, broadband, multi-channel neural recording system for freely behaving primates.

Conf Proc IEEE Eng Med Biol Soc 2006;2006:1212-5

Department of Electrical Engineering, Stanford University, CA, USA.

Successful laboratory proof-of-concept experiments with neural prosthetic systems motivate continued algorithm and hardware development. For these efforts to move beyond traditional fixed laboratory setups, new tools are needed to enable broadband, multi-channel, long duration neural recording from freely behaving primates. In this paper we present a dual-channel, battery powered, neural recording system with integrated 3-axis accelerometer for use with chronically implanted electrode arrays. The recording system, called HermesB, is self-contained, autonomous, programmable and capable of recording broadband neural and head acceleration data to a removable compact flash card for up to 48 hours.

Source
DOI: http://dx.doi.org/10.1109/IEMBS.2006.260813
March 2008