Publications by authors named "Michael Schatz"

375 Publications

SNPC-1.3 is a sex-specific transcription factor that drives male piRNA expression in C. elegans.

Elife 2021 Feb 15;10. Epub 2021 Feb 15.

Department of Biology, Johns Hopkins University, Baltimore, United States.

Piwi-interacting RNAs (piRNAs) play essential roles in silencing repetitive elements to promote fertility in metazoans. Studies in worms, flies, and mammals reveal that piRNAs are expressed in a sex-specific manner. However, the mechanisms underlying this sex-specific regulation are unknown. Here we identify SNPC-1.3, a male germline-enriched variant of a conserved subunit of the small nuclear RNA-activating protein complex, as a male-specific piRNA transcription factor in C. elegans. SNPC-1.3 colocalizes with the core piRNA transcription factor, SNPC-4, in nuclear foci of the male germline. Binding of SNPC-1.3 at male piRNA loci drives spermatogenic piRNA transcription and requires SNPC-4. Loss of snpc-1.3 leads to depletion of male piRNAs and defects in male-dependent fertility. Furthermore, TRA-1, a master regulator of sex determination, binds to the snpc-1.3 promoter and represses its expression during oogenesis. Loss of TRA-1 targeting causes ectopic expression of snpc-1.3 and male piRNAs during oogenesis. Thus, sexually dimorphic regulation of snpc-1.3 expression coordinates male and female piRNA expression during germline development.
DOI: http://dx.doi.org/10.7554/eLife.60681
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7884074
February 2021

Using Galaxy to Perform Large-Scale Interactive Data Analyses-An Update.

Curr Protoc 2021 Feb;1(2):e31

Penn State University, University Park, Pennsylvania.

Modern biology continues to become increasingly computational. Datasets are becoming progressively larger, more complex, and more abundant. The computational savviness necessary to analyze these data creates an ongoing obstacle for experimental biologists. Galaxy (galaxyproject.org) provides access to computational biology tools in a web-based interface. It also provides access to major public biological data repositories, allowing private data to be combined with public datasets. Galaxy is hosted on high-capacity servers worldwide and is accessible for free, with an option to be installed locally. This article demonstrates how to employ Galaxy to perform biologically relevant analyses on publicly available datasets. These protocols use both standard and custom tools, serving as a tutorial and jumping-off point for more intensive and/or more specific analyses using Galaxy. © 2021 Wiley Periodicals LLC.

Basic Protocol 1: Finding human coding exons with highest SNP density
Basic Protocol 2: Calling peaks for ChIP-seq data
Basic Protocol 3: Compare datasets using genomic coordinates
Basic Protocol 4: Working with multiple alignments
Basic Protocol 5: Single cell RNA-seq
DOI: http://dx.doi.org/10.1002/cpz1.31
February 2021

The human origin recognition complex is essential for pre-RC assembly, mitosis, and maintenance of nuclear structure.

Elife 2021 Feb 1;10. Epub 2021 Feb 1.

Cold Spring Harbor Laboratory, Cold Spring Harbor, United States.

The origin recognition complex (ORC) cooperates with CDC6, MCM2-7, and CDT1 to form pre-RC complexes at origins of DNA replication. Here, using tiling-sgRNA CRISPR screens, we report that each subunit of ORC and CDC6 is essential in human cells. Using an auxin-inducible degradation system, we created stable cell lines capable of ablating ORC2 rapidly, revealing multiple cell division cycle phenotypes. The primary defects in the absence of ORC2 were cells encountering difficulty in initiating DNA replication or progressing through the cell division cycle due to reduced MCM2-7 loading onto chromatin in G1 phase. The nuclei of ORC2-deficient cells were also large, with decompacted heterochromatin. Some ORC2-deficient cells that completed DNA replication entered into, but never exited mitosis. ORC1 knockout cells also demonstrated extremely slow cell proliferation and abnormal cell and nuclear morphology. Thus, ORC proteins and CDC6 are indispensable for normal cellular proliferation and contribute to nuclear organization.
DOI: http://dx.doi.org/10.7554/eLife.61797
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7877914
February 2021

Prevalence and Characteristics of Chronic Cough in Adults Identified by Administrative Data.

Perm J 2020 12;24:1-3

Departments of Allergy and Research and Evaluation, Kaiser Permanente Southern California, San Diego and Pasadena, CA.

Context: International Classification of Diseases-9/10 codes for chronic cough (CC) do not exist, limiting investigation.

Objective: To develop a computerized algorithm to determine CC prevalence and its characteristics.

Design: This observational study using administrative data identified hierarchically patients aged 18 to 85 years with CC from 2013 to 2016. First, a specialist-diagnosed CC group was identified using an internal CC encounter code during an outpatient visit to a pulmonologist, allergist, otolaryngologist, or gastroenterologist. Subsequently, an event-diagnosed CC group was identified based on clinical notes through natural language processing, ICD-9/ICD-10 cough codes, and dispensed antitussives.

Main Outcome Measures: Prevalence of CC and comparison of clinical characteristics between specialist-diagnosed and event-diagnosed CC subgroups.

Results: A total of 50,163 patients with CC of more than 8 weeks were identified. Of these, 11,290 (22.5%) were specialist diagnosed, and 38,873 (77.5%) were event diagnosed. The CC cohort was 57.4 ± 16.5 years of age; 67.6% were female. The overall prevalence was 1.04% (95% confidence interval = 1.03-1.06) in 2016. Prevalence in 2016 was higher in female patients (1.21%) than in male patients (0.81%), higher in patients aged 65 to 85 years (2.2%) than in patients aged 18 to 44 years (0.43%), and higher in Blacks (1.38%) than in Whites (1.21%). Compared with patients with event-diagnosed CC, patients with specialist-diagnosed CC exhibited significantly higher frequencies of laboratory tests and respiratory and nonrespiratory comorbidities and dispensed medication and lower frequency of pneumonia, all-cause and respiratory-cause emergency department visits and hospitalizations, and dispensed antitussives.

Conclusions: We identified a CC cohort using electronic data in a managed care organization. Prevalences varied by sex, age, and ethnicity. Clinical characteristics varied between specialist-diagnosed and event-diagnosed CC.
DOI: http://dx.doi.org/10.7812/TPP/20.022
December 2020

A structured review evaluating content validity of the Asthma Control Test, and its consistency with U.S. guidelines and patient expectations for asthma control.

J Asthma 2020 Dec 30:1-15. Epub 2020 Dec 30.

Value Evidence and Outcomes, GlaxoSmithKline plc., Brentford, MDX, UK.

Objective: To assess whether the content of the Asthma Control Test (ACT) served as a valid measure of asthma control (i.e., content validity) by mapping ACT items to the National Heart, Lung and Blood Institute (NHLBI) guideline asthma control definitions, and to language used by patients to describe their asthma.

Data Sources: PubMed and EMBASE databases were used for a structured literature analysis.

Study Selections: Full-text, English-language articles that reported findings from qualitative studies conducted in adults, focusing on patient descriptors of asthma symptoms, impacts, or severity, were included. Pediatric studies, studies conducted in patients without asthma, and studies that did not contain qualitative data were excluded.

Results: ACT items reflected all domains of asthma impairment described in the NHLBI guidelines, except pulmonary function. Following the literature review, 28 full-text publications were identified that included patient descriptors that could be mapped to ACT items. For example, per ACT Item 1, patients described having trouble at work, school, and completing household chores; and, per ACT Item 2, patients used the phrase "short of breath" to describe asthma-associated symptoms.

Conclusion: ACT item content corresponded well with the NHLBI guideline definitions of the impairment domain of asthma control (focused on asthma symptoms and impact), and we identified numerous examples in the literature indicating that ACT concepts and item content mirror the language patients use when discussing asthma symptoms and impact, and their degree of asthma control. This provides further evidence to support content validity of the ACT as a measure of asthma control.
DOI: http://dx.doi.org/10.1080/02770903.2020.1861624
December 2020

Parliament2: Accurate structural variant calling at scale.

Gigascience 2020 Dec;9(12)

Human Genome Sequencing Center, One Baylor Plaza, Baylor College of Medicine, Houston, TX 77030, USA.

Background: Structural variants (SVs) are critical contributors to genetic diversity and genomic disease. To predict the phenotypic impact of SVs, there is a need for better estimates of both the occurrence and frequency of SVs, preferably from large, ethnically diverse cohorts. Thus, the current standard approach requires the use of short paired-end reads, from which SVs remain challenging to detect, especially at the scale of hundreds to thousands of samples.

Findings: We present Parliament2, a consensus SV framework that leverages multiple best-in-class methods to identify high-quality SVs from short-read DNA sequence data at scale. Parliament2 incorporates pre-installed SV callers that are optimized for efficient execution in parallel to reduce the overall runtime and costs. We demonstrate the accuracy of Parliament2 when applied to data from NovaSeq and HiSeq X platforms with the Genome in a Bottle (GIAB) SV call set across all size classes. The reported quality score per SV is calibrated across different SV types and size classes. Parliament2 has the highest F1 score (74.27%) measured across the independent gold standard from GIAB. We illustrate the compute performance by processing all 1000 Genomes samples (2,691 samples) in <1 day on GRCh38. Parliament2 improves the runtime performance of individual methods and is open source (https://github.com/slzarate/parliament2), and a Docker image, as well as a WDL implementation, is available.

Conclusion: Parliament2 provides both a highly accurate single-sample SV call set from short-read DNA sequence data and enables cost-efficient application over cloud or cluster environments, processing thousands of samples.
DOI: http://dx.doi.org/10.1093/gigascience/giaa145
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7751401
December 2020

iGenomics: Comprehensive DNA sequence analysis on your Smartphone.

Gigascience 2020 Dec;9(12)

Simons Center for Quantitative Biology, Cold Spring Harbor Laboratory, One Bungtown Road, Cold Spring Harbor, NY 11724, USA.

Background: Following the miniaturization of integrated circuitry and other computer hardware over the past several decades, DNA sequencing is on a similar path. Leading this trend is the Oxford Nanopore sequencing platform, which currently offers the hand-held MinION instrument, with even smaller instruments on the horizon. This technology has been used in several important applications, including the analysis of genomes of major pathogens in remote stations around the world. However, despite the simplicity of the sequencer, an equally simple and portable analysis platform is not yet available.

Results: iGenomics is the first comprehensive mobile genome analysis application, with capabilities to align reads, call variants, and visualize the results entirely on an iOS device. Implemented in Objective-C using the FM-index, banded dynamic programming, and other high-performance bioinformatics techniques, iGenomics is optimized to run in a mobile environment. We benchmark iGenomics using a variety of real and simulated Nanopore sequencing datasets of viral and bacterial genomes and show that iGenomics has performance comparable to the popular BWA-MEM/SAMtools/IGV suite, without necessitating a laptop or server cluster.

Conclusions: iGenomics is available open source (https://github.com/stuckinaboot/iGenomics) and for free on Apple's App Store (https://apple.co/2HCplzr).
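
As an illustration of the FM-index technique named in the Results (not iGenomics's actual Objective-C implementation), a minimal Python sketch of exact matching by backward search might look like the following. The function names `build_fm_index` and `backward_search` are hypothetical, the index is built by naive suffix sorting, and rank queries are answered by counting characters directly, which is far slower than a production index but keeps the logic visible.

```python
def build_fm_index(text):
    """Build a toy FM-index: suffix array, BWT string, and first-column offsets."""
    text += "$"  # sentinel that sorts before A, C, G, T
    suffix_array = sorted(range(len(text)), key=lambda i: text[i:])
    # BWT character = the character preceding each sorted suffix (wraps to "$").
    bwt = "".join(text[i - 1] for i in suffix_array)
    counts = {}
    for ch in text:
        counts[ch] = counts.get(ch, 0) + 1
    # first[c] = number of characters in the text strictly smaller than c.
    first, total = {}, 0
    for ch in sorted(counts):
        first[ch] = total
        total += counts[ch]
    return suffix_array, bwt, first


def backward_search(pattern, suffix_array, bwt, first):
    """Return sorted start positions of exact matches of pattern."""
    lo, hi = 0, len(bwt)  # current suffix-array interval [lo, hi)
    for ch in reversed(pattern):
        if ch not in first:
            return []
        # Naive rank: count occurrences of ch in the BWT prefix.
        lo = first[ch] + bwt[:lo].count(ch)
        hi = first[ch] + bwt[:hi].count(ch)
        if lo >= hi:
            return []
    return sorted(suffix_array[lo:hi])


index = build_fm_index("GATTACAGATTACA")
print(backward_search("ATTA", *index))  # [1, 8]
```

The pattern is consumed right to left, and each step narrows the suffix-array interval, so the number of steps depends on the pattern length rather than on the genome length. A real aligner would replace the `count` calls with precomputed rank tables.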
DOI: http://dx.doi.org/10.1093/gigascience/giaa138
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7720420
December 2020

2020 Focused Updates to the Asthma Management Guidelines: A Report from the National Asthma Education and Prevention Program Coordinating Committee Expert Panel Working Group.

J Allergy Clin Immunol 2020 Dec;146(6):1217-1270

National Heart, Lung, and Blood Institute, National Institutes of Health, Bethesda.

The 2020 Focused Updates to the Asthma Management Guidelines: A Report from the National Asthma Education and Prevention Program Coordinating Committee Expert Panel Working Group was coordinated and supported by the National Heart, Lung, and Blood Institute (NHLBI) of the National Institutes of Health. It is designed to improve patient care and support informed decision making about asthma management in the clinical setting. This update addresses six priority topic areas as determined by the state of the science at the time of a needs assessment, and input from multiple stakeholders. A rigorous process was undertaken to develop these evidence-based guidelines. The Agency for Healthcare Research and Quality's (AHRQ) Evidence-Based Practice Centers conducted systematic reviews on these topics, which were used by the Expert Panel Working Group as a basis for developing recommendations and guidance. The Expert Panel used GRADE (Grading of Recommendations, Assessment, Development and Evaluation), an internationally accepted framework, in consultation with an experienced methodology team for determining the certainty of evidence and the direction and strength of recommendations based on the evidence. Practical implementation guidance for each recommendation incorporates findings from NHLBI-led patient, caregiver, and clinician focus groups. To assist clinicians in implementing these recommendations into patient care, the new recommendations have been integrated into the existing Expert Panel Report-3 (EPR-3) asthma management step diagram format.
DOI: http://dx.doi.org/10.1016/j.jaci.2020.10.003
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7924476
December 2020

Managing Asthma in Adolescents and Adults: 2020 Asthma Guideline Update From the National Asthma Education and Prevention Program.

JAMA 2020 12;324(22):2301-2317

Department of Allergy, Kaiser Permanente Medical Center, San Diego, California.

Importance: Asthma is a major public health problem worldwide and is associated with excess morbidity, mortality, and economic costs associated with lost productivity. The National Asthma Education and Prevention Program has released the 2020 Asthma Guideline Update with updated evidence-based recommendations for treatment of patients with asthma.

Objective: To report updated recommendations for 6 topics for clinical management of adolescents and adults with asthma: (1) intermittent inhaled corticosteroids (ICSs); (2) add-on long-acting muscarinic antagonists; (3) fractional exhaled nitric oxide; (4) indoor allergen mitigation; (5) immunotherapy; and (6) bronchial thermoplasty.

Evidence Review: The National Heart, Lung, and Blood Advisory Council chose 6 topics to update the 2007 asthma guidelines based on results from a 2014 needs assessment. The Agency for Healthcare Research and Quality conducted systematic reviews of these 6 topics based on literature searches up to March-April 2017. Reviews were updated through October 2018 and used by an expert panel (n = 19) that included asthma content experts, primary care clinicians, dissemination and implementation experts, and health policy experts to develop 19 new recommendations using the GRADE method. The 17 recommendations for individuals aged 12 years or older are reported in this Special Communication.

Findings: From 20 572 identified references, 475 were included in the 6 systematic reviews to form the evidence basis for these recommendations. Compared with the 2007 guideline, there was no recommended change in step 1 (intermittent asthma) therapy (as-needed short-acting β2-agonists [SABAs] for rescue therapy). In step 2 (mild persistent asthma), either daily low-dose ICS plus as-needed SABA therapy or as-needed concomitant ICS and SABA therapy are recommended. Formoterol in combination with an ICS in a single inhaler (single maintenance and reliever therapy) is recommended as the preferred therapy for moderate persistent asthma in step 3 (low-dose ICS-formoterol therapy) and step 4 (medium-dose ICS-formoterol therapy) for both daily and as-needed therapy. A short-term increase in the ICS dose alone for worsening of asthma symptoms is not recommended. Add-on long-acting muscarinic antagonists are recommended in individuals whose asthma is not controlled by ICS-formoterol therapy for step 5 (moderate-severe persistent asthma). Fractional exhaled nitric oxide testing is recommended to assist in diagnosis and monitoring of symptoms, but not alone to diagnose or monitor asthma. Allergen mitigation is recommended only in individuals with exposure and relevant sensitivity or symptoms. When used, allergen mitigation should be allergen specific and include multiple allergen-specific mitigation strategies. Subcutaneous immunotherapy is recommended as an adjunct to standard pharmacotherapy for individuals with symptoms and sensitization to specific allergens. Sublingual immunotherapy is not recommended specifically for asthma. Bronchial thermoplasty is not recommended as part of standard care; if used, it should be part of an ongoing research effort.

Conclusions and Relevance: Asthma is a common disease with substantial human and economic costs globally. Although there is no cure or established means of prevention, effective treatment is available. Use of the recommendations in the 2020 Asthma Guideline Update should improve the health of individuals with asthma.
DOI: http://dx.doi.org/10.1001/jama.2020.21974
December 2020

Targeted nanopore sequencing by real-time mapping of raw electrical signal with UNCALLED.

Nat Biotechnol 2020 Nov 30. Epub 2020 Nov 30.

Department of Computer Science, Johns Hopkins University, Baltimore, MD, USA.

Conventional targeted sequencing methods eliminate many of the benefits of nanopore sequencing, such as the ability to accurately detect structural variants or epigenetic modifications. The ReadUntil method allows nanopore devices to selectively eject reads from pores in real time, which could enable purely computational targeted sequencing. However, this requires rapid identification of on-target reads while most mapping methods require computationally intensive basecalling. We present UNCALLED (https://github.com/skovaka/UNCALLED), an open source mapper that rapidly matches streaming of nanopore current signals to a reference sequence. UNCALLED probabilistically considers k-mers that could be represented by the signal and then prunes the candidates based on the reference encoded within a Ferragina-Manzini index. We used UNCALLED to deplete sequencing of known bacterial genomes within a metagenomics community, enriching the remaining species 4.46-fold. UNCALLED also enriched 148 human genes associated with hereditary cancers to 29.6× coverage using one MinION flowcell, enabling accurate detection of single-nucleotide polymorphisms, insertions and deletions, structural variants and methylation in these genes.
DOI: http://dx.doi.org/10.1038/s41587-020-0731-9
November 2020

Clonal Hematopoiesis Before, During, and After Human Spaceflight.

Cell Rep 2020 Dec 25;33(10):108458. Epub 2020 Nov 25.

Department of Physiology and Biophysics, Weill Cornell Medicine, New York, 10065, USA; The Bin Talal Bin Abdulaziz Alsaud Institute for Computational Biomedicine, New York, NY, USA; The WorldQuant Initiative for Quantitative Prediction, New York, NY, USA; The Feil Family Brain and Mind Research Institute, New York, NY, USA.

Clonal hematopoiesis (CH) occurs when blood cells harboring an advantageous mutation propagate faster than others. These mutations confer a risk for hematological cancers and cardiovascular disease. Here, we analyze CH in blood samples from a pair of twin astronauts over 4 years in bulk and fractionated cell populations using a targeted CH panel, linked-read whole-genome sequencing, and deep RNA sequencing. We show CH with distinct mutational profiles and increasing allelic fraction that includes a high-risk, TET2 clone in one subject and two DNMT3A mutations on distinct alleles in the other twin. These astronauts exhibit CH almost two decades prior to the mean age at which it is typically detected and show larger shifts in clone size than age-matched controls or radiotherapy patients, based on a longitudinal cohort of 157 cancer patients. As such, longitudinal monitoring of CH may serve as an important metric for overall cancer and cardiovascular risk in astronauts.
DOI: http://dx.doi.org/10.1016/j.celrep.2020.108458
December 2020

Patient-Reported Burden of Chronic Cough in a Managed Care Organization.

J Allergy Clin Immunol Pract 2020 Nov 20. Epub 2020 Nov 20.

Department of Research and Evaluation, Kaiser Permanente Southern California, Pasadena, Calif.

Background: The burden of chronic cough (CC) requires better understanding.

Objective: To determine the severity, health status, and health care resource utilization among patients with CC identified by electronic health records on 2 visits separated by ≥1 year.

Methods: Information on cough-related burden was collected through survey from patients with CC, including validated questionnaires (the cough health status Leicester Cough Questionnaire [LCQ], the cough hypersensitivity Hull Airway Reflux Questionnaire [HARQ], and the Cough Quality of Life Questionnaire [CQLQ]), CC-associated respiratory and gastrointestinal comorbidities, and treatment responses. Spearman correlation coefficients were reported to examine the associations among the LCQ, HARQ, and CQLQ. Patient demographics and patient-reported CC features were compared between males and females, and among ethnic groups using Robust Poisson regression models.

Results: The survey was completed by 565 patients who were 64.8 ± 12.6 years, 75.8% female, and 60.4% white. CC duration was 8.6 ± 10.5 years with an average weekly severity of 5.3 ± 2.3 (maximum 10). The LCQ score was 11.3 ± 3.9 (maximum 21). The HARQ score was 33.3 ± 13.6 (normal ≤13). The CQLQ score was 56.9 ± 17.5 (maximum 112, worse with higher scores). The Spearman rank correlations were high between the LCQ and HARQ (-0.65), the LCQ and CQLQ (-0.80), and the HARQ and CQLQ (0.69). Patients with CC-associated respiratory and gastrointestinal comorbidities generally showed similar results regarding the above questionnaires. Treatment responses were suboptimal. Women compared with men and non-whites compared with whites reported significantly worse cough severity and poorer LCQ, HARQ, and CQLQ scores.

Conclusions: CC is self-reported as a burdensome condition, particularly in women and non-white minorities, which markedly affects daily living with inadequate response to treatments.
DOI: http://dx.doi.org/10.1016/j.jaip.2020.11.018
November 2020

Developing a Predictive Model for Asthma-Related Hospital Encounters in Patients With Asthma in a Large, Integrated Health Care System: Secondary Analysis.

JMIR Med Inform 2020 Nov 9;8(11):e22689. Epub 2020 Nov 9.

Department of Research & Evaluation, Kaiser Permanente Southern California, Pasadena, CA, United States.

Background: Asthma causes numerous hospital encounters annually, including emergency department visits and hospitalizations. To improve patient outcomes and reduce the number of these encounters, predictive models are widely used to prospectively pinpoint high-risk patients with asthma for preventive care via care management. However, previous models do not have adequate accuracy to achieve this goal well. Adopting the modeling guideline for checking extensive candidate features, we recently constructed a machine learning model on Intermountain Healthcare data to predict asthma-related hospital encounters in patients with asthma. Although this model is more accurate than the previous models, whether our modeling guideline is generalizable to other health care systems remains unknown.

Objective: This study aims to assess the generalizability of our modeling guideline to Kaiser Permanente Southern California (KPSC).

Methods: The patient cohort included a random sample of 70.00% (397,858/568,369) of patients with asthma who were enrolled in a KPSC health plan for any duration between 2015 and 2018. We produced a machine learning model via a secondary analysis of 987,506 KPSC data instances from 2012 to 2017 and by checking 337 candidate features to project asthma-related hospital encounters in the following 12-month period in patients with asthma.

Results: Our model reached an area under the receiver operating characteristic curve of 0.820. When the cutoff point for binary classification was placed at the top 10.00% (20,474/204,744) of patients with asthma having the largest predicted risk, our model achieved an accuracy of 90.08% (184,435/204,744), a sensitivity of 51.90% (2259/4353), and a specificity of 90.91% (182,176/200,391).

Conclusions: Our modeling guideline exhibited acceptable generalizability to KPSC and resulted in a model that is more accurate than those formerly built by others. After further enhancement, our model could be used to guide asthma care management.

International Registered Report Identifier (IRRID): RR2-10.2196/resprot.5039.
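
The percentages in the Results can be re-derived from the counts quoted alongside them. The short sketch below assumes the usual confusion-matrix definitions of accuracy, sensitivity, and specificity; the variable names are illustrative and do not come from the paper.

```python
# Counts quoted in the Results paragraph (cutoff at the top 10% of predicted risk).
total_patients = 204_744
true_positives, actual_positives = 2_259, 4_353
true_negatives, actual_negatives = 182_176, 200_391

# Sensitivity: share of patients with an encounter who were flagged.
sensitivity = true_positives / actual_positives
# Specificity: share of patients without an encounter who were not flagged.
specificity = true_negatives / actual_negatives
# Accuracy: share of all patients classified correctly.
accuracy = (true_positives + true_negatives) / total_patients

for name, value in [("accuracy", accuracy),
                    ("sensitivity", sensitivity),
                    ("specificity", specificity)]:
    print(f"{name}: {value:.2%}")
# accuracy: 90.08%
# sensitivity: 51.90%
# specificity: 90.91%
```

Because only 4,353 of the 204,744 patients had an encounter, accuracy is dominated by the true negatives, which is why it sits close to the specificity rather than the sensitivity.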
DOI: http://dx.doi.org/10.2196/22689
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7683251
November 2020

Sapling: Accelerating Suffix Array Queries with Learned Data Models.

Bioinformatics 2020 Oct 27. Epub 2020 Oct 27.

Department of Computer Science, Johns Hopkins University, Baltimore, MD.

Motivation: As genomic data becomes more abundant, efficient algorithms and data structures for sequence alignment become increasingly important. The suffix array is a widely used data structure to accelerate alignment, but the binary search algorithm used to query it requires widespread memory accesses, causing a large number of cache misses on large datasets.

Results: Here we present Sapling, an algorithm for sequence alignment which uses a learned data model to augment the suffix array and enable faster queries. We investigate different types of data models, providing an analysis of different neural network models as well as providing an open-source aligner with a compact, practical piecewise linear model. We show that Sapling outperforms both an optimized binary search approach and multiple widely-used read aligners on a diverse collection of genomes, including human, bacteria, and plants, speeding up the algorithm by more than a factor of two while adding less than 1% to the suffix array's memory footprint.

Availability: The source code and tutorial are available open-source at https://github.com/mkirsche/sapling.

Supplementary Information: Supplementary data are available at Bioinformatics online.
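
To make the idea of a learned model over a suffix array concrete, here is a minimal Python sketch. It is not Sapling's implementation: it swaps the paper's piecewise linear model for a single linear fit, uses a fixed k-mer length, and builds the suffix array by naive sorting. The class name `LearnedSuffixArray` and helper `kmer_value` are hypothetical. The model predicts where a k-mer should fall in the suffix array, and a binary search then runs only inside a window bounded by the largest prediction error seen at build time.

```python
CODE = {"A": 0, "C": 1, "G": 2, "T": 3}


def kmer_value(text, start, k):
    """Encode text[start:start+k] as a base-4 integer, padding past the end with 0."""
    value = 0
    for offset in range(k):
        pos = start + offset
        value = value * 4 + (CODE[text[pos]] if pos < len(text) else 0)
    return value


class LearnedSuffixArray:
    def __init__(self, text, k=4):
        self.text, self.k = text, k
        self.sa = sorted(range(len(text)), key=lambda i: text[i:])
        # k-mer values are non-decreasing along the suffix array.
        values = [kmer_value(text, i, k) for i in self.sa]
        self.base = values[0]
        span = values[-1] - values[0]
        self.slope = (len(values) - 1) / span if span else 0.0
        # Worst-case gap between predicted and true rank bounds the search window.
        self.max_err = max(abs(self.predict(v) - r) for r, v in enumerate(values))

    def predict(self, value):
        """Linear model: map a k-mer value to an approximate suffix-array rank."""
        return round((value - self.base) * self.slope)

    def find(self, query):
        """Return sorted text positions where the length-k query occurs."""
        assert len(query) == self.k
        target = kmer_value(query, 0, self.k)
        guess = self.predict(target)
        lo = max(0, guess - self.max_err)
        hi = min(len(self.sa), guess + self.max_err + 1)
        while lo < hi:  # lower-bound binary search inside the window only
            mid = (lo + hi) // 2
            if kmer_value(self.text, self.sa[mid], self.k) < target:
                lo = mid + 1
            else:
                hi = mid
        hits = []
        while lo < len(self.sa) and kmer_value(self.text, self.sa[lo], self.k) == target:
            start = self.sa[lo]
            if self.text[start:start + self.k] == query:  # reject padded short suffixes
                hits.append(start)
            lo += 1
        return sorted(hits)


index = LearnedSuffixArray("GATTACAGATTACA")
print(index.find("ATTA"))  # [1, 8]
```

The window is at most `2 * max_err + 1` entries wide, so a well-fitting model touches far fewer, and more localized, suffix-array entries than a binary search over the whole array, which is the cache-miss saving the Motivation describes.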
DOI: http://dx.doi.org/10.1093/bioinformatics/btaa911
October 2020

Interpreting, analysing and modelling COVID-19 mortality data.

Nonlinear Dyn 2020 Oct 1:1-26. Epub 2020 Oct 1.

Tokyo Tech World Research Hub Initiative (WRHI), Institute of Innovative Research, Tokyo Institute of Technology, Yokohama, 226-8502 Japan.

We present results on the mortality statistics of the COVID-19 epidemic in a number of countries. Our data analysis suggests classifying countries in five groups, (1) Western countries, (2) East Block, (3) developed Southeast Asian countries, (4) Northern Hemisphere developing countries and (5) Southern Hemisphere countries. Comparing the number of deaths per million inhabitants, a pattern emerges in which the Western countries exhibit the largest mortality rate. Furthermore, comparing the running cumulative death tolls at the same level of outbreak progress in different countries reveals several subgroups within the Western countries and further emphasises the difference between the five groups. Analysing the relationship between deaths per million and life expectancy in different countries, taken as a proxy of the preponderance of elderly people in the population, a main reason behind the relatively more severe COVID-19 epidemic in the Western countries is found to be their larger population of elderly people, with exceptions such as Norway and Japan, for which other factors seem to dominate. Our comparison between countries at the same level of outbreak progress allows us to identify and quantify a measure of efficiency of the level of stringency of confinement measures. We find that increasing the stringency from 20 to 60 decreases the death count by about 50 lives per million in a time window of 20 days. Finally, we perform logistic equation analyses of deaths as a means of tracking the dynamics of outbreaks in the "first wave" and estimating the associated ultimate mortality, using four different models to identify model error and robustness of results. This quantitative analysis allows us to assess the outbreak progress in different countries, differentiating between those that are at a quite advanced stage and close to the end of the epidemic from those that are still in the middle of it.
This raises many questions in terms of organisation, preparedness, governance structure and so on.
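
For readers unfamiliar with the logistic equation mentioned in this abstract, its standard textbook form is given below. This is only a reference sketch: the abstract says four different models were used and does not specify them, and the symbols are assumptions chosen here, with D(t) the cumulative deaths at time t, r the growth rate, K the limiting value (the "ultimate mortality"), and t_0 the inflection time.

```latex
\frac{dD}{dt} = r\,D\left(1 - \frac{D}{K}\right),
\qquad
D(t) = \frac{K}{1 + e^{-r\,(t - t_0)}}
```

Fitting K to the early part of a death curve gives an estimate of the final toll, and the ratio D(t)/K indicates how far an outbreak has progressed.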
DOI: http://dx.doi.org/10.1007/s11071-020-05966-z
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7527427
October 2020

A diploid assembly-based benchmark for variants in the major histocompatibility complex.

Nat Commun 2020 09 22;11(1):4794. Epub 2020 Sep 22.

Material Measurement Laboratory, National Institute of Standards and Technology, 100 Bureau Dr, MS8312, Gaithersburg, MD, 20899, USA.

Most human genomes are characterized by aligning individual reads to the reference genome, but accurate long reads and linked reads now enable us to construct accurate, phased de novo assemblies. We focus on a medically important, highly variable, 5 million base-pair (bp) region where diploid assembly is particularly useful - the Major Histocompatibility Complex (MHC). Here, we develop a human genome benchmark derived from a diploid assembly for the openly-consented Genome in a Bottle sample HG002. We assemble a single contig for each haplotype, align them to the reference, call phased small and structural variants, and define a small variant benchmark for the MHC, covering 94% of the MHC and 22368 variants smaller than 50 bp, 49% more variants than a mapping-based benchmark. This benchmark reliably identifies errors in mapping-based callsets, and enables performance assessment in regions with much denser, complex variation than regions covered by previous benchmarks.

Source
DOI: http://dx.doi.org/10.1038/s41467-020-18564-9
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7508831
September 2020

Comprehensive analysis of structural variants in breast cancer genomes using single-molecule sequencing.

Genome Res 2020 Sep 4;30(9):1258-1273. Epub 2020 Sep 4.

Department of Computer Science, Johns Hopkins University, Baltimore, Maryland 21211, USA.

Improved identification of structural variants (SVs) in cancer can lead to more targeted and effective treatment options as well as advance our basic understanding of the disease and its progression. We performed whole-genome sequencing of the SKBR3 breast cancer cell line and patient-derived tumor and normal organoids from two breast cancer patients using Illumina/10x Genomics, Pacific Biosciences (PacBio), and Oxford Nanopore Technologies (ONT) sequencing. We then inferred SVs and large-scale allele-specific copy number variants (CNVs) using an ensemble of methods. Our findings show that long-read sequencing allows for substantially more accurate and sensitive SV detection, with between 90% and 95% of variants supported by each long-read technology also supported by the other. We also report high accuracy for long reads even at relatively low coverage (25×-30×). Furthermore, we integrated SV and CNV data into a unifying karyotype-graph structure to present a more accurate representation of the mutated cancer genomes. We find hundreds of variants within known cancer-related genes detectable only through long-read sequencing. These findings highlight the need for long-read sequencing of cancer genomes for the precise analysis of their genetic instability.

Source
DOI: http://dx.doi.org/10.1101/gr.260497.119
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7545150
September 2020

Capturing Turbulent Dynamics and Statistics in Experiments with Unstable Periodic Orbits.

Phys Rev Lett 2020 Aug;125(6):064501

School of Physics, Georgia Institute of Technology, Atlanta, Georgia 30332, USA.

In laboratory studies and numerical simulations, we observe clear signatures of unstable time-periodic solutions in a moderately turbulent quasi-two-dimensional flow. We validate the dynamical relevance of such solutions by demonstrating that turbulent flows in both experiment and numerics transiently display time-periodic dynamics when they shadow unstable periodic orbits (UPOs). We show that UPOs we computed are also statistically significant, with turbulent flows spending a sizable fraction of the total time near these solutions. As a result, the average rates of energy input and dissipation for the turbulent flow and frequently visited UPOs differ only by a few percent.

Source
DOI: http://dx.doi.org/10.1103/PhysRevLett.125.064501
August 2020

Effect of early and late prenatal vitamin D and maternal asthma status on offspring asthma or recurrent wheeze.

J Allergy Clin Immunol 2020 Aug 19. Epub 2020 Aug 19.

Channing Division of Network Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, Mass.

Background: Childhood asthma developmental programming is complex. Maternal asthma is a strong risk factor for childhood asthma, whereas vitamin D (VD) has emerged as a modifiable prenatal exposure.

Objective: Our aim was to examine the combined effect of early and late prenatal VD status during pregnancies in women with and without asthma on childhood asthma or recurrent wheeze development.

Methods: We conducted a cohort study using prospectively collected data from the Vitamin D Antenatal Asthma Reduction Trial, a randomized, double-blinded, placebo-controlled VD supplementation trial in pregnant women at high risk of offspring asthma (N = 806 mother-offspring pairs). 25-Hydroxyvitamin-D (25(OH)D) level was measured in early and late pregnancy. Our main exposure was an ordered variable representing early and late prenatal VD sufficiency (25(OH)D level ≥ 30 ng/mL) status during pregnancy in women with and without asthma. The primary outcome was offspring with asthma or recurrent wheeze by age 3 years. We also examined the effect of prenatal VD level on early life asthma or recurrent wheeze progression to active asthma at age 6 years.

Results: Among mothers with asthma, compared with those with early and late prenatal VD insufficiency, those with early or late VD sufficiency (adjusted odds ratio = 0.56; 95% CI = 0.31-1.00) or early and late VD sufficiency (adjusted odds ratio = 0.36; 95% CI = 0.15-0.81) had a lower risk of offspring with asthma or recurrent wheeze by age 3 years (P = .008). This protective trend was reiterated in asthma or recurrent wheeze progression to active asthma from age 3 to 6 years (P = .04).

Conclusion: This study implies a protective role for VD sufficiency throughout pregnancy, particularly in attenuating the risk conferred by maternal asthma on childhood asthma or recurrent wheeze development.

Source
DOI: http://dx.doi.org/10.1016/j.jaci.2020.06.041
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7892633
August 2020

Genomic Diversity of SARS-CoV-2 During Early Introduction into the United States National Capital Region.

medRxiv 2020 Aug 15. Epub 2020 Aug 15.

Background: The early COVID-19 pandemic has been characterized by rapid global spread. In the United States National Capital Region, over 2,000 cases were reported within three weeks of its first detection in March 2020. We aimed to use genomic sequencing to understand the initial spread of SARS-CoV-2, the virus that causes COVID-19, in the region. By correlating genetic information to disease phenotype, we also aimed to gain insight into any correlation between viral genotype and case severity or transmissibility.

Methods: We performed whole genome sequencing of clinical SARS-CoV-2 samples collected in March 2020 by the Johns Hopkins Health System, building on methods developed by the ARTIC network. We analyzed these regional SARS-CoV-2 genomes alongside detailed clinical metadata and the global phylogeny to understand early establishment of the virus within the region.

Results: We analyzed 620 samples from the Johns Hopkins Health System collected between March 11-31, 2020, comprising 37.3% of the total cases in Maryland during this period. We selected 143 of these samples for sequencing, generating 114 complete viral genomes. These genomes belonged to all five major Nextstrain-defined clades, suggesting multiple introductions into the region and underscoring the diversity of the regional epidemic. We also found that clinically severe cases had genomes belonging to all of these clades.

Conclusions: We established a pipeline for SARS-CoV-2 sequencing within the Johns Hopkins Health System, which enabled us to capture the significant viral diversity present in the region as early as March 2020. Efforts to control local spread of the virus were likely confounded by the number of introductions into the region early in the epidemic and interconnectedness of the region as a whole.

Source
DOI: http://dx.doi.org/10.1101/2020.08.13.20174136
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7430609
August 2020

Psychometric properties of the Asthma Control Test in 2 randomized clinical trials.

J Allergy Clin Immunol Pract 2021 Jan 6;9(1):561-563.e1. Epub 2020 Aug 6.

Value Evidence and Outcomes, GlaxoSmithKline plc, London, United Kingdom.

Source
DOI: http://dx.doi.org/10.1016/j.jaip.2020.07.040
January 2021

Ribbon: Intuitive visualization for complex genomic variation.

Bioinformatics 2020 Aug 7. Epub 2020 Aug 7.

Simons Center for Quantitative Biology, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, USA.

Summary: Ribbon is an alignment visualization tool that shows how alignments are positioned within both the reference and read contexts, giving an intuitive view that enables a better understanding of structural variants and the read evidence supporting them. Ribbon was born out of a need to curate complex structural variant calls and determine whether each was well supported by long-read evidence, and it uses the same intuitive visualization method to shed light on contig alignments from genome-to-genome comparisons.

Availability And Implementation: Ribbon is freely available online at http://genomeribbon.com/ and is open-source at https://github.com/marianattestad/ribbon.

Supplementary Information: Supplementary data are available at Bioinformatics online.

Source
DOI: http://dx.doi.org/10.1093/bioinformatics/btaa680
August 2020

Major Impacts of Widespread Structural Variation on Gene Expression and Crop Improvement in Tomato.

Cell 2020 07 17;182(1):145-161.e23. Epub 2020 Jun 17.

Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724, USA; Howard Hughes Medical Institute, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724, USA.

Structural variants (SVs) underlie important crop improvement and domestication traits. However, resolving the extent, diversity, and quantitative impact of SVs has been challenging. We used long-read nanopore sequencing to capture 238,490 SVs in 100 diverse tomato lines. This panSV genome, along with 14 new reference assemblies, revealed large-scale intermixing of diverse genotypes, as well as thousands of SVs intersecting genes and cis-regulatory regions. Hundreds of SV-gene pairs exhibit subtle and significant expression changes, which could broadly influence quantitative trait variation. By combining quantitative genetics with genome editing, we show how multiple SVs that changed gene dosage and expression levels modified fruit flavor, size, and production. In the last example, higher order epistasis among four SVs affecting three related transcription factors allowed introduction of an important harvesting trait in modern tomato. Our findings highlight the underexplored role of SVs in genotype-to-phenotype relationships and their widespread importance and utility in crop improvement.

Source
DOI: http://dx.doi.org/10.1016/j.cell.2020.05.021
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7354227
July 2020

A robust benchmark for detection of germline large deletions and insertions.

Nat Biotechnol 2020 11 15;38(11):1347-1355. Epub 2020 Jun 15.

Joint Initiative for Metrology in Biology, SLAC National Accelerator Lab, Stanford University, Stanford, CA, USA.

New technologies and analysis methods are enabling genomic structural variants (SVs) to be detected with ever-increasing accuracy, resolution and comprehensiveness. To help translate these methods to routine research and clinical practice, we developed a sequence-resolved benchmark set for identification of both false-negative and false-positive germline large insertions and deletions. To create this benchmark for a broadly consented son in a Personal Genome Project trio with broadly available cells and DNA, the Genome in a Bottle Consortium integrated 19 sequence-resolved variant calling methods from diverse technologies. The final benchmark set contains 12,745 isolated, sequence-resolved insertion (7,281) and deletion (5,464) calls ≥50 base pairs (bp). The Tier 1 benchmark regions, for which any extra calls are putative false positives, cover 2.51 Gbp and 5,262 insertions and 4,095 deletions supported by ≥1 diploid assembly. We demonstrate that the benchmark set reliably identifies false negatives and false positives in high-quality SV callsets from short-, linked- and long-read sequencing and optical mapping.
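As a rough illustration of how a benchmark set can be used to flag false negatives and false positives in a callset, the following sketch compares calls to benchmark variants by exact match. It assumes each variant is a hypothetical (chromosome, position, type, length) tuple already restricted to the benchmark regions. Real SV comparison tools typically allow fuzzy matching of breakpoints and sequences, which this simplified example does not attempt, and the function name and example variants are invented for illustration.

```python
def compare_to_benchmark(calls, benchmark):
    """Classify calls against a benchmark by exact tuple match.

    Both arguments are iterables of (chrom, pos, sv_type, length) tuples,
    assumed to be restricted to the benchmark regions beforehand.
    """
    calls = set(calls)
    benchmark = set(benchmark)
    true_pos = calls & benchmark    # called and present in the benchmark
    false_pos = calls - benchmark   # called but absent from the benchmark
    false_neg = benchmark - calls   # in the benchmark but not called
    precision = len(true_pos) / len(calls) if calls else 0.0
    recall = len(true_pos) / len(benchmark) if benchmark else 0.0
    return {
        "tp": len(true_pos),
        "fp": len(false_pos),
        "fn": len(false_neg),
        "precision": precision,
        "recall": recall,
    }


# Hypothetical variants, for illustration only.
benchmark_svs = [
    ("chr1", 1000, "DEL", 120),
    ("chr1", 5000, "INS", 300),
    ("chr2", 700, "DEL", 75),
]
called_svs = [
    ("chr1", 1000, "DEL", 120),
    ("chr2", 700, "DEL", 75),
    ("chr3", 42, "INS", 60),
]
summary = compare_to_benchmark(called_svs, benchmark_svs)
```

Extra calls count as false positives only inside regions where the benchmark is considered complete, which is why the abstract distinguishes the Tier 1 benchmark regions.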

Source
DOI: http://dx.doi.org/10.1038/s41587-020-0538-8
November 2020

In memory of James Taylor: the birth of Galaxy.

Genome Biol 2020 04 30;21(1):105. Epub 2020 Apr 30.

Departments of Computer Science and Biology, Johns Hopkins University, Baltimore, MD, USA.

Source
DOI: http://dx.doi.org/10.1186/s13059-020-02016-0
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7193333
April 2020

Use of National Asthma Guidelines by Allergists and Pulmonologists: A National Survey.

J Allergy Clin Immunol Pract 2020 Oct 25;8(9):3011-3020.e2. Epub 2020 Apr 25.

Division of Intramural Research, National Institute of Environmental Health Sciences, National Institutes of Health, Research Triangle Park, NC.

Background: Little is known about specialist-specific variations in guideline agreement and adoption.

Objective: To assess similarities and differences between allergists and pulmonologists in adherence to cornerstone components of the National Asthma Education and Prevention Program's Third Expert Panel Report.

Methods: Self-reported guideline agreement, self-efficacy, and adherence were assessed in allergists (n = 134) and pulmonologists (n = 99) in the 2012 National Asthma Survey of Physicians. Multivariate models were used to assess if physician and practice characteristics explained bivariate associations between specialty and "almost always" adhering to recommendations (ie, ≥75% of the time).

Results: Allergists and pulmonologists reported high guideline self-efficacy and moderate guideline agreement. Both groups "almost always" assessed asthma control (66.2%, standard error [SE] 4.3), assessed school/work asthma triggers (71.3%, SE 3.9), and endorsed inhaled corticosteroids use (95.5%, SE 2.0). Repeated assessment of the inhaler technique, use of asthma action/treatment plans, and spirometry were lower (39.7%, SE 4.0; 30.6%, SE 3.6; 44.7%, SE 4.1, respectively). Compared with pulmonologists, more allergists almost always performed spirometry (56.6% vs 38.6%, P = .06), asked about nighttime awakening (91.9% vs 76.5%, P = .03) and emergency department visits (92.2% vs 76.5%, P = .03), assessed home triggers (70.5% vs 52.6%, P = .06), and performed allergy testing (61.8% vs 21.3%, P < .001). In multivariate analyses, practice-specific characteristics explained differences except for allergy testing.

Conclusions: Overall, allergists and pulmonologists adhere to the asthma guidelines with notable exceptions, including asthma action plan use and inhaler technique assessment. Recommendations with low implementation offer opportunities for further exploration and could serve as targets for increasing guideline uptake.

Source
DOI: http://dx.doi.org/10.1016/j.jaip.2020.04.026
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7554121
October 2020

Vargas: heuristic-free alignment for assessing linear and graph read aligners.

Bioinformatics 2020 06;36(12):3712-3718

Department of Computer Science.

Motivation: Read alignment is central to many aspects of modern genomics. Most aligners use heuristics to accelerate processing, but these heuristics can fail to find the optimal alignments of reads. Alignment accuracy is typically measured through simulated reads; however, the simulated location may not be the (only) location with the optimal alignment score.

Results: Vargas implements a heuristic-free algorithm guaranteed to find the highest-scoring alignment for real sequencing reads to a linear or graph genome. With semiglobal and local alignment modes and affine gap and quality-scaled mismatch penalties, it can implement the scoring functions of commonly used aligners to calculate optimal alignments. While this is computationally intensive, Vargas uses multi-core parallelization and vectorized (SIMD) instructions to make it practical to optimally align large numbers of reads, achieving a maximum speed of 456 billion cell updates per second. We demonstrate how these 'gold standard' Vargas alignments can be used to improve heuristic alignment accuracy by optimizing command-line parameters in Bowtie 2, BWA-MEM (maximal exact match) and vg to align more reads correctly.

Availability And Implementation: Source code implemented in C++ and compiled binary releases are available at https://github.com/langmead-lab/vargas under the MIT license.

Supplementary Information: Supplementary data are available at Bioinformatics online.
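To make the idea of heuristic-free alignment concrete, here is a minimal sketch of semiglobal dynamic programming against a linear reference. Every cell of the matrix is filled, so the best score is guaranteed under the chosen scoring scheme. This is not the Vargas implementation: it assumes a linear (not affine) gap penalty, omits quality-scaled mismatches, graph references and SIMD, and the function name and scoring values are hypothetical.

```python
def semiglobal_align(read, ref, match=1, mismatch=-4, gap=-6):
    """Best end-to-end alignment of `read` anywhere within `ref`.

    The read must be aligned in full; leading and trailing reference
    bases are free. Returns (best_score, end), where `end` is the number
    of reference bases consumed up to the end of the best alignment.
    """
    # Row 0: an empty read prefix can start at any reference position for free.
    prev = [0] * (len(ref) + 1)
    for i in range(1, len(read) + 1):
        # Column 0: the first i read bases are all aligned against gaps.
        curr = [i * gap] + [0] * len(ref)
        for j in range(1, len(ref) + 1):
            sub = match if read[i - 1] == ref[j - 1] else mismatch
            curr[j] = max(
                prev[j - 1] + sub,  # match or mismatch
                prev[j] + gap,      # read base aligned to a gap
                curr[j - 1] + gap,  # reference base aligned to a gap
            )
        prev = curr
    # The trailing reference is free, so take the best cell in the last row
    # (the leftmost one on ties).
    end = max(range(len(ref) + 1), key=lambda j: prev[j])
    return prev[end], end


exact_hit = semiglobal_align("ACGT", "TTACGTTT")
one_mismatch = semiglobal_align("ACGT", "ACCT")
```

The cost is proportional to read length times reference length per read, which is why the abstract emphasizes parallelization and vectorization to make exhaustive alignment practical at scale.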

Source
DOI: http://dx.doi.org/10.1093/bioinformatics/btaa265
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7320598
June 2020