Publications by authors named "Jeroen F J Laros"

38 Publications

Adenine base editing of the polyadenylation signal for targeted genetic therapy in facioscapulohumeral muscular dystrophy.

Mol Ther Nucleic Acids 2021 Sep 1;25:342-354. Epub 2021 Jun 1.

Department of Human Genetics, Leiden University Medical Center, 2333 ZC Leiden, the Netherlands.

Facioscapulohumeral muscular dystrophy (FSHD) is caused by chromatin relaxation of the D4Z4 repeat resulting in misexpression of the D4Z4-encoded DUX4 gene in skeletal muscle. One of the key genetic requirements for the stable production of full-length DUX4 mRNA in skeletal muscle is a functional polyadenylation signal (ATTAAA) in exon three of DUX4 that is used in somatic cells. Base editors hold great promise to treat DNA lesions underlying genetic diseases through their ability to carry out specific and rapid nucleotide mutagenesis even in postmitotic cells such as skeletal muscle. In this study, we present a simple and straightforward strategy for mutagenesis of the somatic polyadenylation signal by adenine base editing in immortalized myoblasts derived from independent FSHD-affected individuals. We show that mutating this critical cis-regulatory element results in downregulation of DUX4 mRNA and its direct transcriptional target genes. Our findings identify the somatic polyadenylation signal as a therapeutic target and represent the first step toward clinical application of the CRISPR-Cas9 base editing platform for FSHD gene therapy.

Source
DOI: http://dx.doi.org/10.1016/j.omtn.2021.05.020
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8399085
September 2021

Fecal Microbiota Transplantation Influences Procarcinogenic Escherichia coli in Recipient Recurrent Clostridioides difficile Patients.

Gastroenterology 2021 Oct 11;161(4):1218-1228.e5. Epub 2021 Jun 11.

Experimental Bacteriology, Department of Medical Microbiology, Leiden University Medical Center, Leiden, the Netherlands; Netherlands Donor Feces Bank, Leiden, the Netherlands; Center for Microbiome Analyses and Therapeutics, Leiden University Medical Center, Leiden, the Netherlands; National Institute for Public Health and the Environment (RIVM), Bilthoven, the Netherlands.

Background & Aims: Patients with multiple recurrent Clostridioides difficile infection (rCDI) have a disturbed gut microbiota that can be restored by fecal microbiota transplantation (FMT). Despite extensive screening, healthy feces donors may carry bacteria in their intestinal tract that could have long-term health effects, such as potentially procarcinogenic polyketide synthase-positive (pks+) Escherichia coli. Here, we aim to determine whether the pks abundance and persistence of pks+ E coli are influenced by the pks status of the donor feces.

Methods: In a cohort of 49 patients with rCDI treated with FMT and matching donor samples (the largest cohort of its kind, to our knowledge), we retrospectively screened fecal metagenomes for pks+ E coli and compared the presence of pks in patients before and after treatment and to their respective donors.

Results: The pks island was more prevalent (P = .026) and abundant (P < .001) in patients with rCDI (pre-FMT, 27 of 49 [55%]; median, 0.46 reads per kilobase per million [RPKM] pks) than in healthy donors (3 of 8 donors [37.5%], 11 of 38 samples [29%]; median, 0.01 RPKM pks). The pks status of patients post-FMT depended on the pks status of the donor suspension with which the patient was treated (P = .046). Particularly, persistence (8 of 9 cases) or clearance (13 of 18) of pks+ E coli in pks+ patients was correlated to pks in the donor (P = .004).

Conclusions: We conclude that FMT contributes to pks+ E coli persistence or eradication in patients with rCDI but that donor-to-patient transmission of pks+ E coli is unlikely.
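The abundances in this abstract are reported in reads per kilobase per million (RPKM). As a rough illustration of that unit only, the usual formula can be sketched as below. The function name, the choice of "total reads in the sample" as the denominator, and the numbers in the example are assumptions for illustration, not values or code from the study.

```python
def rpkm(mapped_reads: int, feature_length_bp: int, total_reads: int) -> float:
    """Reads per kilobase of feature per million reads in the sample.

    Hypothetical helper: the study's exact normalisation is not described
    in the abstract, so this follows the generic RPKM definition.
    """
    if feature_length_bp <= 0 or total_reads <= 0:
        raise ValueError("feature length and total reads must be positive")
    kilobases = feature_length_bp / 1_000       # feature length in kb
    millions = total_reads / 1_000_000          # sequencing depth in millions
    return mapped_reads / kilobases / millions


# Made-up example: 500 reads on a 50,000 bp region, 20 million reads in total.
example_value = rpkm(500, 50_000, 20_000_000)
```

Normalising by both feature length and sequencing depth is what makes values comparable between samples sequenced to different depths.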

Source
DOI: http://dx.doi.org/10.1053/j.gastro.2021.06.009
October 2021

Next Generation HGVS Nomenclature Checker.

Bioinformatics 2021 Feb 4. Epub 2021 Feb 4.

Department of Human Genetics, Leiden University Medical Center (LUMC).

Motivation: Unambiguous variant descriptions are of utmost importance in clinical genetic diagnostics, scientific literature, and genetic databases. The Human Genome Variation Society (HGVS) publishes a comprehensive set of guidelines on how variants should be correctly and unambiguously described. We present the implementation of the Mutalyzer 2 tool suite, designed to automatically apply the HGVS guidelines so users do not have to deal with the HGVS intricacies explicitly to check and correct their variant descriptions.

Results: Mutalyzer is widely used by the community, having processed over 133 million descriptions since its launch. Over a five-year period, Mutalyzer reported a correct input in approximately 50% of cases. In 41% of the cases either a syntactic or semantic error was identified, and for approximately 7% of cases Mutalyzer was able to automatically correct the description.

Availability: Mutalyzer is an Open Source project under the GNU Affero General Public License. The source code is available on GitHub (https://github.com/mutalyzer/mutalyzer) and a running instance is available at: https://mutalyzer.nl.

Source
DOI: http://dx.doi.org/10.1093/bioinformatics/btab051
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8479679
February 2021

A catalogue of 863 Rett-syndrome-causing MECP2 mutations and lessons learned from data integration.

Sci Data 2021 01 15;8(1):10. Epub 2021 Jan 15.

Department of Bioinformatics - BiGCaT, NUTRIM School of Nutrition and Translational Research in Metabolism, MHeNS School of Mental Health and Neuroscience, Maastricht University, Maastricht, The Netherlands.

Rett syndrome (RTT) is a rare neurological disorder mostly caused by a genetic variation in MECP2. Making new MECP2 variants and the related phenotypes available provides data for better understanding of disease mechanisms and faster identification of variants for diagnosis. This is, however, currently hampered by the lack of interoperability between genotype-phenotype databases. Here, we demonstrate on the example of MECP2 in RTT that by making the genotype-phenotype data more Findable, Accessible, Interoperable, and Reusable (FAIR), we can facilitate prioritization and analysis of variants. In total, 10,968 MECP2 variants were successfully integrated. Among these variants, 863 unique confirmed RTT-causing and 209 unique confirmed benign variants were found. This dataset was used for comparison of pathogenicity predicting tools, protein consequences, and identification of ambiguous variants. Prediction tools generally recognised the RTT-causing and benign variants; however, there was a broad range of overlap. Nineteen variants were identified that were annotated as both disease-causing and benign, suggesting that there are additional factors in these cases contributing to disease development.

Source
DOI: http://dx.doi.org/10.1038/s41597-020-00794-7
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7810705
January 2021

Coronavirus discovery by metagenomic sequencing: a tool for pandemic preparedness.

J Clin Virol 2020 Oct 21;131:104594. Epub 2020 Aug 21.

Department of Medical Microbiology, Leiden University Medical Center (LUMC), Leiden, the Netherlands.

Introduction: The SARS-CoV-2 pandemic of 2020 is a prime example of the omnipresent threat of emerging viruses that can infect humans. A protocol for the identification of novel coronaviruses by viral metagenomic sequencing in diagnostic laboratories may contribute to pandemic preparedness.

Aim: The aim of this study is to validate a metagenomic virus discovery protocol as a tool for coronavirus pandemic preparedness.

Methods: The performance of a viral metagenomic protocol in a clinical setting for the identification of novel coronaviruses was tested using clinical samples containing SARS-CoV-2, SARS-CoV, and MERS-CoV, in combination with databases generated to contain only viruses from before the discovery dates of these coronaviruses, to mimic virus discovery.

Results: Classification of NGS reads using Centrifuge and Genome Detective resulted in assignment of the reads to the closest relatives of the emerging coronaviruses. Low nucleotide and amino acid identity (81% and 84%, respectively, for SARS-CoV-2) in combination with up to 98% genome coverage were indicative of a related, novel coronavirus. Capture probes targeting vertebrate viruses, designed in 2015, enhanced both sequencing depth and coverage of the SARS-CoV-2 genome, the latter increasing from 71% to 98%.

Conclusion: The model used for simulation of virus discovery enabled validation of the metagenomic sequencing protocol. The metagenomic protocol, with virus probes designed before the pandemic, can assist the detection and identification of novel coronaviruses directly in clinical samples.

Source
DOI: http://dx.doi.org/10.1016/j.jcv.2020.104594
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7441049
October 2020

Comprehensive diagnostics of acute myeloid leukemia by whole transcriptome RNA sequencing.

Leukemia 2021 01 3;35(1):47-61. Epub 2020 Mar 3.

Department of Hematology, Leiden University Medical Center, 2300RC, Leiden, The Netherlands.

Acute myeloid leukemia (AML) is caused by genetic aberrations that also govern the prognosis of patients and guide risk-adapted and targeted therapy. Genetic aberrations in AML are structurally diverse and currently detected by different diagnostic assays. This study sought to establish whole transcriptome RNA sequencing as a single, comprehensive, and flexible platform for AML diagnostics. We developed HAMLET (Human AML Expedited Transcriptomics) as a bioinformatics pipeline for simultaneous detection of fusion genes, small variants, tandem duplications, and gene expression, with all information assembled in an annotated, user-friendly output file. Whole transcriptome RNA sequencing was performed on 100 AML cases and HAMLET results were validated by reference assays and targeted resequencing. The data showed that HAMLET accurately detected all fusion genes and overexpression of EVI1 irrespective of 3q26 aberrations. In addition, small variants in 13 genes that are often mutated in AML were called with 99.2% sensitivity and 100% specificity, and tandem duplications in FLT3 and KMT2A were detected by a novel algorithm based on soft-clipped reads with 100% sensitivity and 97.1% specificity. In conclusion, HAMLET has the potential to provide accurate comprehensive diagnostic information relevant for AML classification, risk assessment and targeted therapy on a single technology platform.

Source
DOI: http://dx.doi.org/10.1038/s41375-020-0762-8
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7787979
January 2021

Taxonomic classification and abundance estimation using 16S and WGS-A comparison using controlled reference samples.

Forensic Sci Int Genet 2020 05 5;46:102257. Epub 2020 Feb 5.

Department of Human Genetics, Leiden University Medical Center, Leiden, the Netherlands; Department of Clinical Genetics, Leiden University Medical Center, Leiden, the Netherlands.

The assessment of microbiome biodiversity is the most common application of metagenomics. While 16S sequencing remains standard procedure for taxonomic profiling of metagenomic data, a growing number of studies have clearly demonstrated biases associated with this method. By using Whole Genome Shotgun sequencing (WGS) metagenomics, most of the known restrictions associated with 16S data are alleviated. However, due to the computationally intensive data analyses and higher sequencing costs, WGS-based metagenomics remains a less popular option. Selecting the experiment type that provides a comprehensive, yet manageable amount of information is a challenge encountered in many metagenomics studies. In this work, we created a series of artificial bacterial mixes, each with a different distribution of skin-associated microbial species. These mixes were used to estimate the resolution of two different metagenomic experiments - 16S and WGS - and to evaluate several different bioinformatics approaches for taxonomic read classification. In all test cases, WGS approaches provide much more accurate results, in terms of taxa prediction and abundance estimation, in comparison to those of 16S. Furthermore, we demonstrate that a 16S dataset, analysed using different state-of-the-art techniques and reference databases, can produce widely different results. In light of the fact that most forensic metagenomic analyses are still performed using 16S data, our results are especially important.

Source
DOI: http://dx.doi.org/10.1016/j.fsigen.2020.102257
May 2020

Dutch genome diagnostic laboratories accelerated and improved variant interpretation and increased accuracy by sharing data.

Hum Mutat 2019 12 3;40(12):2230-2238. Epub 2019 Sep 3.

Department of Genetics, University of Groningen, University Medical Center Groningen, Groningen, The Netherlands.

Each year diagnostic laboratories in the Netherlands profile thousands of individuals for heritable disease using next-generation sequencing (NGS). This requires pathogenicity classification of millions of DNA variants on the standard 5-tier scale. To reduce time spent on data interpretation and increase data quality and reliability, the nine Dutch labs decided to publicly share their classifications. Variant classifications of nearly 100,000 unique variants were catalogued and compared in a centralized MOLGENIS database. Variants classified by more than one center were labeled as "consensus" when classifications agreed, and shared internationally with LOVD and ClinVar. When classifications opposed (LB/B vs. LP/P), they were labeled "conflicting", while other nonconsensus observations were labeled "no consensus". We assessed our classifications using the InterVar software to compare to ACMG 2015 guidelines, showing 99.7% overall consistency with only 0.3% discrepancies. Differences in classifications between Dutch labs or between Dutch labs and ACMG were mainly present in genes with low penetrance or for late onset disorders and highlight limitations of the current 5-tier classification system. The data sharing boosted the quality of DNA diagnostics in Dutch labs, an initiative we hope will be followed internationally. Recently, a positive match with a case from outside our consortium resulted in a more definite disease diagnosis.
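The labeling rule described in this abstract can be sketched in a few lines. This is a minimal illustration, not the consortium's implementation: the function name is hypothetical, and treating LB/B as one group and LP/P as one group when deciding whether classifications "agree" is an assumption made here because the abstract does not define agreement precisely.

```python
from typing import Iterable, Optional

# Assumed grouping of the 5-tier scale (B, LB, VUS, LP, P). Whether LB and B
# (or LP and P) count as agreement is an assumption of this sketch.
GROUP = {
    "B": "benign",
    "LB": "benign",
    "VUS": "vus",
    "LP": "pathogenic",
    "P": "pathogenic",
}


def label_variant(classifications: Iterable[str]) -> Optional[str]:
    """Label a variant from the classifications submitted by different labs."""
    groups = [GROUP[c] for c in classifications]
    if len(groups) < 2:
        return None  # classified by a single lab: nothing to compare
    distinct = set(groups)
    if len(distinct) == 1:
        return "consensus"
    if {"benign", "pathogenic"} <= distinct:
        return "conflicting"  # LB/B on one side, LP/P on the other
    return "no consensus"  # e.g. VUS versus LP
```

For example, under these assumptions `label_variant(["LB", "P"])` is labeled "conflicting", while `label_variant(["VUS", "LP"])` falls into "no consensus".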

Source
DOI: http://dx.doi.org/10.1002/humu.23896
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6900155
December 2019

Archival, paleopathological and aDNA-based techniques in leprosy research and the case of Father Petrus Donders at the Leprosarium 'Batavia', Suriname.

Int J Paleopathol 2019 12 17;27:1-8. Epub 2019 Aug 17.

Dept Biochemistry, Faculty of Medical Sciences, Anton de Kom Universiteit van Suriname, Paramaribo, Suriname.

Objective: We assessed whether Petrus Donders (died 1887), a Dutch priest who for 27 years cared for people with leprosy in the leprosarium Batavia, Suriname, had evidence of Mycobacterium (M.) leprae infection. A positive finding of M. leprae ancient (a)DNA would contribute to knowledge of the origin of leprosy in Suriname.

Materials: Skeletal remains of Father Petrus Donders; two additional skeletons excavated from the Batavia cemetery were used as controls.

Methods: Archival research, paleopathological evaluation and aDNA-based testing of skeletal remains.

Results: Neither archives nor inspection of Donders' skeletal remains revealed evidence of leprosy, and aDNA-based testing for M. leprae was negative. We detected M. leprae aDNA by RLEP PCR in one control skeleton, which also displayed pathological lesions compatible with leprosy. The M. leprae aDNA was genotyped by Sanger sequencing as SNP type 4; the skeleton displayed mitochondrial haplogroup L3.

Conclusion: We found no evidence that Donders contracted leprosy despite years of intense leprosy contact, but we successfully isolated an archaeological M. leprae aDNA sample from a control skeleton from South America.

Significance: We successfully genotyped recovered aDNA to a M. leprae strain that likely originated in West Africa. The detected human mitochondrial haplogroup L3 is also associated with this geographical region. This suggests that slave trade contributed to leprosy in Suriname.

Limitations: A limited number of skeletons was examined.

Suggestions For Further Research: Broader review of skeletal collections is advised to expand on diversity of the M. leprae aDNA database.

Source
DOI: http://dx.doi.org/10.1016/j.ijpp.2019.08.001
December 2019

BacTag - a pipeline for fast and accurate gene and allele typing in bacterial sequencing data based on database preprocessing.

BMC Genomics 2019 May 6;20(1):338. Epub 2019 May 6.

Department of Human Genetics, Leiden University Medical Center, Leiden, The Netherlands.

Background: Bacteria carry a wide array of genes, some of which have multiple alleles. These different alleles are often responsible for distinct types of virulence and can determine the classification at the subspecies level (e.g., housekeeping genes for Multi Locus Sequence Typing, MLST). Therefore, it is important to rapidly detect not only the gene of interest, but also the relevant allele. Current sequencing-based methods are limited to mapping reads to each of the known allele references, which is a time-consuming procedure.

Results: To address this limitation, we developed BacTag - a pipeline that rapidly and accurately detects which genes are present in a sequencing dataset and reports the allele of each of the identified genes. We exploit the fact that different alleles of the same gene have a high similarity. Instead of mapping the reads to each of the allele reference sequences, we preprocess the database prior to the analysis, which makes the subsequent gene and allele identification efficient. During the preprocessing, we determine a representative reference sequence for each gene and store the differences between all alleles and this chosen reference. During the analysis, we estimate whether the gene is present in the sequencing data by mapping the reads to this reference sequence; if the gene is found, we compare the variants to those in the preprocessed database. This allows us to detect which specific allele is present in the sequencing data. Our pipeline was successfully tested on artificial WGS E. coli, S. pseudintermedius, P. gingivalis, M. bovis, Borrelia spp. and Streptomyces spp. data and real WGS E. coli and K. pneumoniae data in order to report alleles of MLST housekeeping genes.

Conclusions: We developed a new pipeline for fast and accurate gene and allele recognition based on database preprocessing and parallel computing that performed better than or comparably to the current popular tools. We believe that our approach can be useful for a wide range of projects, including bacterial subspecies classification, clinical diagnostics of bacterial infections, and epidemiological studies.
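The preprocessing idea in this abstract (store each allele as its differences from one representative reference, then look up the variants observed in a sample) can be illustrated with a toy sketch. Everything below is an assumption for illustration: the function names and sequences are made up, only substitutions between equal-length sequences are handled, and the real pipeline derives variants from read mapping and variant calling rather than from ready-made tuples.

```python
from typing import Dict, FrozenSet, Iterable, Optional, Tuple

# (0-based position, reference base, alternative base)
Variant = Tuple[int, str, str]


def diff_to_reference(reference: str, allele: str) -> FrozenSet[Variant]:
    """Substitutions that turn the reference into the allele (toy: no indels)."""
    if len(reference) != len(allele):
        raise ValueError("this sketch only handles equal-length sequences")
    return frozenset(
        (i, r, a) for i, (r, a) in enumerate(zip(reference, allele)) if r != a
    )


def preprocess(alleles: Dict[str, str], reference_name: str) -> Dict[FrozenSet[Variant], str]:
    """Build the lookup once: variant set relative to the reference -> allele name."""
    reference = alleles[reference_name]
    return {diff_to_reference(reference, seq): name for name, seq in alleles.items()}


def identify_allele(index: Dict[FrozenSet[Variant], str],
                    observed: Iterable[Variant]) -> Optional[str]:
    """Return the allele whose stored differences match the observed variants."""
    return index.get(frozenset(observed))


# Made-up alleles of a hypothetical gene; "adk_1" is the chosen reference.
alleles = {"adk_1": "ACGTACGT", "adk_2": "ACGTTCGT", "adk_3": "ACCTACGA"}
index = preprocess(alleles, "adk_1")
called = identify_allele(index, [(4, "A", "T")])
```

The point of the design is that reads are compared against one reference per gene, and the allele is then resolved by a dictionary lookup instead of a separate mapping run per allele.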

Source
DOI: http://dx.doi.org/10.1186/s12864-019-5723-0
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6501397
May 2019

Annotating Transcriptional Effects of Genetic Variants in Disease-Relevant Tissue: Transcriptome-Wide Allelic Imbalance in Osteoarthritic Cartilage.

Arthritis Rheumatol 2019 04 23;71(4):561-570. Epub 2019 Feb 23.

Leiden University Medical Center, Leiden, The Netherlands.

Objective: Multiple single-nucleotide polymorphisms (SNPs) conferring susceptibility to osteoarthritis (OA) mark imbalanced expression of positional genes in articular cartilage, reflected by unequally expressed alleles among heterozygotes (allelic imbalance [AI]). We undertook this study to explore the articular cartilage transcriptome from OA patients for AI events to identify putative disease-driving genetic variation.

Methods: AI was assessed in 42 preserved and 5 lesioned OA cartilage samples (from the Research Arthritis and Articular Cartilage study) for which RNA sequencing data were available. The count fraction of the alternative alleles among the alternative and reference alleles together (φ) was determined for heterozygous individuals. A meta-analysis was performed to generate a meta-φ and P value for each SNP with a false discovery rate (FDR) correction for multiple comparisons. To further validate AI events, we explored them as a function of multiple additional OA features.

Results: We observed a total of 2,070 SNPs that consistently marked AI of 1,031 unique genes in articular cartilage. Of these genes, 49 were found to be significantly differentially expressed (fold change <0.5 or >2, FDR <0.05) between preserved and paired lesioned cartilage, and 18 had previously been reported to confer susceptibility to OA and/or related phenotypes. Moreover, we identified notable highly significant AI SNPs in the CRLF1, WWP2, and RPS3 genes that were related to multiple OA features.

Conclusion: We present a framework and resulting data set for researchers in the OA research field to probe for disease-relevant genetic variation that affects gene expression in pivotal disease-affected tissue. This likely includes putative novel compelling OA risk genes such as CRLF1, WWP2, and RPS3.
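The quantity φ defined in the Methods, the count fraction of alternative alleles among alternative and reference alleles together, is simple enough to write out. The sketch below covers only that per-sample fraction; the function name and the read counts are hypothetical, and the meta-analysis and FDR correction described in the abstract are not reproduced here.

```python
def allelic_fraction(alt_count: int, ref_count: int) -> float:
    """phi = alt / (alt + ref) for one heterozygous individual at one SNP."""
    if alt_count < 0 or ref_count < 0:
        raise ValueError("read counts must be non-negative")
    total = alt_count + ref_count
    if total == 0:
        raise ValueError("no reads cover this SNP")
    return alt_count / total


# Made-up counts: 30 reads carry the alternative allele, 70 the reference.
phi_example = allelic_fraction(30, 70)
```

With equal counts for both alleles the fraction is 0.5, so values that deviate from 0.5 in heterozygotes are what point toward unequal expression of the two alleles.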

Source
DOI: http://dx.doi.org/10.1002/art.40748
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6593438
April 2019

Short hypervariable microhaplotypes: A novel set of very short high discriminating power loci without stutter artefacts.

Forensic Sci Int Genet 2018 07 22;35:169-175. Epub 2018 May 22.

Department of Human Genetics, Leiden University Medical Center, Einthovenweg 20, 2333 ZC Leiden, The Netherlands.

For two decades, short tandem repeats (STRs) have been the preferred markers for human identification, routinely analysed by fragment length analysis. Here we present a novel set of short hypervariable autosomal microhaplotypes (MH) that have four or more SNPs in a span of less than 70 nucleotides (nt). These MHs display a discriminating power approaching that of STRs and provide a powerful alternative for the analysis of forensic samples that are problematic when the STR fragment size range exceeds the integrity range of severely degraded DNA or when multiple donors contribute to an evidentiary stain and STR stutter artefacts complicate profile interpretation. MH typing was developed using the power of massively parallel sequencing (MPS), enabling new powerful, fast and efficient SNP-based approaches. MH candidates were obtained from queries in data of the 1000 Genomes and Genome of the Netherlands (GoNL) projects. Wet-lab analysis of 276 globally dispersed samples and 97 samples of nine large CEPH families assisted locus selection and corroboration of informative value. We infer that MHs represent an alternative marker type with good discriminating power per locus (allowing the use of a limited number of loci), small amplicon sizes and absence of stutter artefacts that can be especially helpful when unbalanced mixed samples are submitted for human identification.

Source
DOI: http://dx.doi.org/10.1016/j.fsigen.2018.05.008
July 2018

Brain Transcriptomic Analysis of Hereditary Cerebral Hemorrhage With Amyloidosis-Dutch Type.

Front Aging Neurosci 2018 13;10:102. Epub 2018 Apr 13.

Department of Human Genetics, Leiden University Medical Center, Leiden, Netherlands.

Hereditary cerebral hemorrhage with amyloidosis-Dutch type (HCHWA-D) is an early onset hereditary form of cerebral amyloid angiopathy (CAA) caused by a point mutation resulting in an amino acid change (NP_000475.1:p.Glu693Gln) in the amyloid precursor protein (APP). Post-mortem frontal and occipital cortical brain tissue from nine patients and nine age-related controls was used for RNA sequencing to identify biological pathways affected in HCHWA-D. Although previous studies indicated that pathology is more severe in the occipital lobe in HCHWA-D compared to the frontal lobe, the current study showed similar changes in gene expression in frontal and occipital cortex and the two brain regions were pooled for further analysis. Significantly altered pathways were analyzed using gene set enrichment analysis (GSEA) on 2036 significantly differentially expressed genes. Main pathways over-represented by down-regulated genes were related to cellular aerobic respiration (including ATP synthesis and carbon metabolism) indicating a mitochondrial dysfunction. Principal up-regulated pathways were extracellular matrix (ECM)-receptor interaction and ECM proteoglycans in relation with an increase in the transforming growth factor beta (TGFβ) signaling pathway. Comparison with the publicly available dataset from pre-symptomatic APP-E693Q transgenic mice identified overlap for the ECM-receptor interaction pathway, indicating that ECM modification is an early disease specific pathomechanism.

Source
DOI: http://dx.doi.org/10.3389/fnagi.2018.00102
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5908973
April 2018

Critical points for an accurate human genome analysis.

Hum Mutat 2017 08 16;38(8):912-921. Epub 2017 Jun 16.

Department of Human Genetics, Leiden University Medical Center, The Netherlands.

Next-generation sequencing is radically changing how DNA diagnostic laboratories operate. What started as a single-gene profession is now developing into gene panel sequencing and whole-exome and whole-genome sequencing (WES/WGS) analyses. With further advances in sequencing technology and concomitant price reductions, WGS will soon become the standard and be routinely offered. Here, we focus on the critical steps involved in performing WGS, with a particular emphasis on points where WGS differs from WES, the important variables that should be taken into account, and the quality control measures that can be taken to monitor the process. The points discussed here, combined with recent publications on guidelines for reporting variants, will facilitate the routine implementation of WGS into a diagnostic setting.

Source
DOI: http://dx.doi.org/10.1002/humu.23238
August 2017

FDSTools: A software package for analysis of massively parallel sequencing data with the ability to recognise and correct STR stutter and other PCR or sequencing noise.

Forensic Sci Int Genet 2017 03 27;27:27-40. Epub 2016 Nov 27.

Department of Human Genetics, Leiden University Medical Center, Leiden, 2300 RC, The Netherlands.

Massively parallel sequencing (MPS) is on the verge of broad-scale application in forensic research and casework. The improved capability to analyse evidentiary traces representing unbalanced mixtures is often mentioned as one of the major advantages of this technique. However, most of the available software packages that analyse forensic short tandem repeat (STR) sequencing data are not well suited for high throughput analysis of such mixed traces. The largest challenge is the presence of stutter artefacts in STR amplifications, which are not readily discerned from minor contributions. FDSTools is an open-source software solution developed for this purpose. The level of stutter formation is influenced by various aspects of the sequence, such as the length of the longest uninterrupted stretch occurring in an STR. When MPS is used, STRs are evaluated as sequence variants that each have particular stutter characteristics which can be precisely determined. FDSTools uses a database of reference samples to determine stutter and other systemic PCR or sequencing artefacts for each individual allele. In addition, stutter models are created for each repeating element in order to predict stutter artefacts for alleles that are not included in the reference set. This information is subsequently used to recognise and compensate for the noise in a sequence profile. The result is a better representation of the true composition of a sample. Using Promega Powerseq™ Auto System data from 450 reference samples and 31 two-person mixtures, we show that the FDSTools correction module decreases stutter ratios above 20% to below 3%. Consequently, much lower levels of contributions in the mixed traces are detected. FDSTools contains modules to visualise the data in an interactive format, allowing users to filter data with their own preferred thresholds.

Source
DOI: http://dx.doi.org/10.1016/j.fsigen.2016.11.007
March 2017

Massively parallel sequencing of short tandem repeats-Population data and mixture analysis results for the PowerSeq™ system.

Forensic Sci Int Genet 2016 09 7;24:86-96. Epub 2016 Jun 7.

Forensic Laboratory for DNA Research, Department of Human Genetics, Leiden University Medical Centre, Postzone S 05 P, P.O. Box 9600, 2300 RC Leiden, The Netherlands.

Current forensic DNA analysis predominantly involves identification of human donors by analysis of short tandem repeats (STRs) using Capillary Electrophoresis (CE). Recent developments in Massively Parallel Sequencing (MPS) technologies offer new possibilities in analysis of STRs since they might overcome some of the limitations of CE analysis. In this study 17 STRs and Amelogenin were sequenced in high coverage using a prototype version of the Promega PowerSeq™ system for 297 population samples from the Netherlands, Nepal, Bhutan and Central African Pygmies. In addition, 45 two-person mixtures with different minor contributions down to 1% were analysed to investigate the performance of this system for mixed samples. Regarding fragment length, complete concordance between the MPS and CE-based data was found, marking the reliability of the MPS PowerSeq™ system. As expected, MPS presented a broader allele range and higher power of discrimination and exclusion rate. The high coverage sequencing data were used to determine stutter characteristics for all loci and stutter ratios were compared to CE data. The separation of alleles with the same length but exhibiting different stutter ratios lowers the overall variation in stutter ratio and helps in differentiation of stutters from genuine alleles in mixed samples. All alleles of the minor contributors were detected in the sequence reads even for the 1% contributions, but analysis of mixtures below 5% without prior information on the mixture ratio is complicated by PCR and sequencing artefacts.

Source
DOI: http://dx.doi.org/10.1016/j.fsigen.2016.05.016
September 2016

Whole Gene Capture Analysis of 15 CRC Susceptibility Genes in Suspected Lynch Syndrome Patients.

PLoS One 2016 14;11(6):e0157381. Epub 2016 Jun 14.

Department of Human Genetics, Leiden University Medical Centre, Leiden, The Netherlands.

Background And Aims: Lynch Syndrome (LS) is caused by pathogenic germline variants in one of the mismatch repair (MMR) genes. However, up to 60% of MMR-deficient colorectal cancer cases are categorized as suspected Lynch Syndrome (sLS) because no pathogenic MMR germline variant can be identified, which leads to difficulties in clinical management. We therefore analyzed the genomic regions of 15 CRC susceptibility genes in leukocyte DNA of 34 unrelated sLS patients and 11 patients with MLH1 hypermethylated tumors with a clear family history.

Methods: Using targeted next-generation sequencing, we analyzed the entire non-repetitive genomic sequence, including intronic and regulatory sequences, of 15 CRC susceptibility genes. In addition, tumor DNA from 28 sLS patients was analyzed for somatic MMR variants.

Results: Of 1979 germline variants found in the leukocyte DNA of 34 sLS patients, one was a pathogenic variant (MLH1 c.1667+1delG). Leukocyte DNA of 11 patients with MLH1 hypermethylated tumors was negative for pathogenic germline variants in the tested CRC susceptibility genes and for germline MLH1 hypermethylation. Somatic DNA analysis of 28 sLS tumors identified eight (29%) cases with two pathogenic somatic variants, one with a VUS predicted to be pathogenic and LOH, and nine cases (32%) with one pathogenic somatic variant (n = 8) or one VUS predicted to be pathogenic (n = 1).

Conclusions: This is the first study in sLS patients to include the entire genomic sequence of CRC susceptibility genes. An underlying somatic or germline MMR gene defect was identified in ten of 34 sLS patients (29%). In the remaining sLS patients, the underlying genetic defect explaining the MMR deficiency in their tumors might be found outside the genomic regions harboring the MMR and other known CRC susceptibility genes.

Source
http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0157381 (PLOS)
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4907507 (PMC)
July 2017

Repeated FcεRI triggering reveals modified mast cell function related to chronic allergic responses in tissue.

J Allergy Clin Immunol 2016 09 28;138(3):869-880. Epub 2016 Mar 28.

Department of Rheumatology, Leiden University Medical Center, Leiden, The Netherlands.

Background: Activation of mast cells through FcεRI plays an important role in acute allergic reactions. However, little is known about the function of mast cells in patients with chronic allergic inflammation or the effect of repeated FcεRI triggering occurring in such responses.

Objective: We aimed to identify changes in mast cell function after repeated FcεRI triggering and to correlate these changes to chronic allergic responses in tissue.

Methods: Human cord blood-derived mast cells were treated for 2 weeks with anti-IgE. The function of naive or treated mast cells was analyzed by means of RNA sequencing, quantitative RT-PCR, flow cytometry, and functional assays. Protein secretion was measured with ELISAs and multiplex assays.

Results: We observed several changes in mast cell function after repeated anti-IgE triggering. Although the acute response was dampened, we identified 289 genes significantly upregulated after repeated anti-IgE. Most of these genes (84%) were not upregulated after a single anti-IgE stimulus, indicating a significantly different response mode characterized by increased antigen presentation, response to bacteria, and chemotaxis. Changes in mast cell function were related to changes in expression of the transcription factors RXRA and BATF and others. Importantly, we found a substantial overlap between genes upregulated after repeated anti-IgE triggering and genes upregulated in tissue from patients with chronic allergy, in particular those of patients with chronic rhinosinusitis.

Conclusion: Our study provides evidence for intrinsic modulation of mast cell function on repeated FcεRI-mediated activation. The overlap with gene expression in tissues is suggestive of a direct link between repeated IgE-mediated activation of mast cells and chronic allergy.

Source
http://dx.doi.org/10.1016/j.jaci.2016.01.017 (DOI Listing)
September 2016

The Implicitome: A Resource for Rationalizing Gene-Disease Associations.

PLoS One 2016 26;11(2):e0149621. Epub 2016 Feb 26.

Department of Human Genetics, Leiden University Medical Center, Leiden, The Netherlands.

High-throughput experimental methods such as medical sequencing and genome-wide association studies (GWAS) identify increasingly large numbers of potential relations between genetic variants and diseases. Both biological complexity (millions of potential gene-disease associations) and the accelerating rate of data production necessitate computational approaches to prioritize and rationalize potential gene-disease relations. Here, we use concept profile technology to expose from the biomedical literature both explicitly stated gene-disease relations (the explicitome) and a much larger set of implied gene-disease associations (the implicitome). Implicit relations are largely unknown to, or are even unintended by the original authors, but they vastly extend the reach of existing biomedical knowledge for identification and interpretation of gene-disease associations. The implicitome can be used in conjunction with experimental data resources to rationalize both known and novel associations. We demonstrate the usefulness of the implicitome by rationalizing known and novel gene-disease associations, including those from GWAS. To facilitate the re-use of implicit gene-disease associations, we publish our data in compliance with FAIR Data Publishing recommendations [https://www.force11.org/group/fairgroup] using nanopublications. An online tool (http://knowledge.bio) is available to explore established and potential gene-disease associations in the context of other biomedical relations.

Source
http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0149621 (PLOS)
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4769089 (PMC)
July 2016

Transmission of human mtDNA heteroplasmy in the Genome of the Netherlands families: support for a variable-size bottleneck.

Genome Res 2016 Apr 25;26(4):417-26. Epub 2016 Feb 25.

Department of Evolutionary Genetics, Max Planck Institute for Evolutionary Anthropology, 04103 Leipzig, Germany.

Although previous studies have documented a bottleneck in the transmission of mtDNA genomes from mothers to offspring, several aspects remain unclear, including the size and nature of the bottleneck. Here, we analyze the dynamics of mtDNA heteroplasmy transmission in the Genomes of the Netherlands (GoNL) data, which consists of complete mtDNA genome sequences from 228 trios, eight dizygotic (DZ) twin quartets, and 10 monozygotic (MZ) twin quartets. Using a minor allele frequency (MAF) threshold of 2%, we identified 189 heteroplasmies in the trio mothers, of which 59% were transmitted to offspring, and 159 heteroplasmies in the trio offspring, of which 70% were inherited from the mothers. MZ twin pairs exhibited greater similarity in MAF at heteroplasmic sites than DZ twin pairs, suggesting that the heteroplasmy MAF in the oocyte is the major determinant of the heteroplasmy MAF in the offspring. We used a likelihood method to estimate the effective number of mtDNA genomes transmitted to offspring under different bottleneck models; a variable bottleneck size model provided the best fit to the data, with an estimated mean of nine individual mtDNA genomes transmitted. We also found evidence for negative selection during transmission against novel heteroplasmies (in which the minor allele has never been observed in polymorphism data). These novel heteroplasmies are enhanced for tRNA and rRNA genes, and mutations associated with mtDNA diseases frequently occur in these genes. Our results thus suggest that the female germ line is able to recognize and select against deleterious heteroplasmies.

Source
http://dx.doi.org/10.1101/gr.203216.115 (DOI Listing)
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4817766 (PMC)
April 2016

Non-sequential and multi-step splicing of the dystrophin transcript.

RNA Biol 2016 15;13(3):290-305. Epub 2015 Dec 15.

Department of Human Genetics, Leiden University Medical Center, Leiden, the Netherlands.

The DMD gene, which encodes the dystrophin protein, is the longest human gene. The 2.2 Mb long human dystrophin transcript takes 16 hours to be transcribed and is co-transcriptionally spliced. It contains long introns (24 over 10 kb long, 5 over 100 kb long) and the heterogeneity in intron size makes it an ideal transcript to study different aspects of the human splicing process. Splicing is a complex process and much is unknown regarding the splicing of long introns in human genes. Here, we used ultra-deep transcript sequencing to characterize splicing of the dystrophin transcripts in 3 different human skeletal muscle cell lines, and explored the order of intron removal and multi-step splicing. Coverage and read pair analyses showed that around 40% of the introns were not always removed sequentially. Additionally, for the first time, we report that non-consecutive intron removal resulted in 3 or more joined exons which are flanked by unspliced introns, and we defined these joined exons as an exon block. Lastly, computational and experimental data revealed that, for the majority of dystrophin introns, multi-step splicing events are used to splice out a single intron. Overall, our data show, for the first time in a human transcript, that multi-step intron removal is a general feature of mRNA splicing.

Source
http://dx.doi.org/10.1080/15476286.2015.1125074 (DOI Listing)
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4829307 (PMC)
December 2016

An efficient algorithm for the extraction of HGVS variant descriptions from sequences.

Bioinformatics 2015 Dec 31;31(23):3751-7. Epub 2015 Jul 31.

Department of Human Genetics, Leiden University Medical Center, Leiden, The Netherlands; Leiden Genome Technology Center, Leiden University Medical Center, Leiden, The Netherlands.

Motivation: Unambiguous sequence variant descriptions are important in reporting the outcome of clinical diagnostic DNA tests. The standard nomenclature of the Human Genome Variation Society (HGVS) describes the observed variant sequence relative to a given reference sequence. We propose an efficient algorithm for the extraction of HGVS descriptions from two sequences with three main requirements in mind: minimizing the length of the resulting descriptions, minimizing the computation time and keeping the unambiguous descriptions biologically meaningful.

Results: Our algorithm is able to compute the HGVS descriptions of complete chromosomes or other large DNA strings in a reasonable amount of computation time and its resulting descriptions are relatively small. Additional applications include updating of gene variant database contents and reference sequence liftovers.

Availability: The algorithm is accessible as an experimental service in the Mutalyzer program suite (https://mutalyzer.nl). The C++ source code and Python interface are accessible at: https://github.com/mutalyzer/description-extractor.

Contact: [email protected]
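
To make the idea of extracting a variant description from two sequences concrete, the following is a minimal Python sketch. It is not the algorithm from the paper and not the Mutalyzer API: it only trims the shared prefix and suffix of the two sequences and reports the single remaining difference in an HGVS-like form without a coordinate-system prefix. The function name `describe` is hypothetical, and duplications, inversions, multiple separate variants and insertions at the sequence ends are deliberately left out.

```python
def describe(reference, observed):
    """Return one HGVS-like description of the difference between two
    sequences (illustrative sketch only, not the published algorithm)."""
    if reference == observed:
        return "="

    # Length of the longest common prefix.
    limit = min(len(reference), len(observed))
    prefix = 0
    while prefix < limit and reference[prefix] == observed[prefix]:
        prefix += 1

    # Length of the longest common suffix that does not overlap the prefix.
    suffix = 0
    while (suffix < limit - prefix
           and reference[-1 - suffix] == observed[-1 - suffix]):
        suffix += 1

    deleted = reference[prefix:len(reference) - suffix]
    inserted = observed[prefix:len(observed) - suffix]
    start = prefix + 1              # 1-based position of first changed base
    end = prefix + len(deleted)     # 1-based position of last changed base

    if len(deleted) == 1 and len(inserted) == 1:
        return f"{start}{deleted}>{inserted}"
    if not deleted:
        if prefix == 0 or prefix == len(reference):
            # Assumption of this sketch: boundary insertions are not handled.
            raise ValueError("insertion at a sequence end is outside this sketch")
        return f"{prefix}_{prefix + 1}ins{inserted}"
    span = str(start) if start == end else f"{start}_{end}"
    if not inserted:
        return f"{span}del"
    return f"{span}delins{inserted}"
```

Because the common prefix is trimmed first, a deletion inside a repeat is reported at its most 3' position; for example, `describe("AAAT", "AAT")` reports the deletion at position 3.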

Source
http://dx.doi.org/10.1093/bioinformatics/btv443 (DOI Listing)
December 2015

SplicePie: a novel analytical approach for the detection of alternative, non-sequential and recursive splicing.

Nucleic Acids Res 2015 Jul 23;43(12):e80. Epub 2015 Mar 23.

Department of Human Genetics, Leiden University Medical Center, Leiden, The Netherlands; Leiden Genome Technology Center, Leiden University Medical Center, Leiden, The Netherlands.

Alternative splicing is a powerful mechanism present in eukaryotic cells to obtain a wide range of transcripts and protein isoforms from a relatively small number of genes. The mechanisms regulating (alternative) splicing and the paradigm of consecutive splicing have recently been challenged, especially for genes with a large number of introns. RNA-Seq, a powerful technology using deep sequencing in order to determine transcript structure and expression levels, is usually performed on mature mRNA, therefore not allowing detailed analysis of splicing progression. Sequencing pre-mRNA at different stages of splicing potentially provides insight into mRNA maturation. Although the number of tools that analyze total and cytoplasmic RNA in order to elucidate the transcriptome composition is rapidly growing, there are no tools specifically designed for the analysis of nuclear RNA (which contains mixtures of pre- and mature mRNA). We developed dedicated algorithms to investigate the splicing process. In this paper, we present a new classification of RNA-Seq reads based on three major stages of splicing: pre-, intermediate- and post-splicing. Applying this novel classification, we demonstrate the possibility to analyze the order of splicing. Furthermore, we uncover the potential to investigate the multi-step nature of splicing, assessing various types of recursive splicing events. We provide data that give biological insight into the order of splicing and show that non-sequential splicing of certain introns is reproducible and coincides across multiple cell lines. We validated our observations with independent experimental technologies and showed the reliability of our method. The pipeline, named SplicePie, is freely available at: https://github.com/pulyakhina/splicing_analysis_pipeline. The example data can be found at: https://barmsijs.lumc.nl/HG/irina/example_data.tar.gz.

Source
http://dx.doi.org/10.1093/nar/gkv242 (DOI Listing)
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4499118 (PMC)
July 2015

Determining the quality and complexity of next-generation sequencing data without a reference genome.

Genome Biol 2014 ;15(12):555

We describe an open-source kPAL package that facilitates an alignment-free assessment of the quality and comparability of sequencing datasets by analyzing k-mer frequencies. We show that kPAL can detect technical artefacts such as high duplication rates, library chimeras, contamination and differences in library preparation protocols. kPAL also successfully captures the complexity and diversity of microbiomes and provides a powerful means to study changes in microbial communities. Together, these features make kPAL an attractive and broadly applicable tool to determine the quality and comparability of sequence libraries even in the absence of a reference sequence. kPAL is freely available at https://github.com/LUMC/kPAL.
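
As an illustration of what an alignment-free comparison based on k-mer frequencies can look like, here is a small Python sketch. It does not use or reproduce kPAL's own interface or distance measures; the function names `kmer_profile` and `profile_distance` and the choice of a Euclidean distance on relative frequencies are assumptions made for this example only.

```python
import math
from collections import Counter
from itertools import product


def kmer_profile(sequences, k):
    """Count every k-mer made up of A, C, G and T in a collection of reads."""
    counts = Counter()
    for seq in sequences:
        seq = seq.upper()
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            # Skip k-mers containing ambiguous bases such as N.
            if set(kmer) <= set("ACGT"):
                counts[kmer] += 1
    return counts


def profile_distance(a, b, k):
    """Euclidean distance between two profiles scaled to relative frequencies
    (an illustrative choice, not necessarily the metric kPAL uses)."""
    total_a = sum(a.values()) or 1
    total_b = sum(b.values()) or 1
    squared = 0.0
    for kmer in map("".join, product("ACGT", repeat=k)):
        squared += (a[kmer] / total_a - b[kmer] / total_b) ** 2
    return math.sqrt(squared)
```

Two libraries with the same k-mer composition give a distance of 0, while libraries that share no k-mers give a larger value, so no reference genome is needed for the comparison.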

Source
http://dx.doi.org/10.1186/s13059-014-0555-3 (DOI Listing)
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4298064 (PMC)
August 2015

Roux-en-Y gastric bypass surgery, but not calorie restriction, reduces plasma branched-chain amino acids in obese women independent of weight loss or the presence of type 2 diabetes.

Diabetes Care 2014 Dec 14;37(12):3150-6. Epub 2014 Oct 14.

Department of Endocrinology and Metabolism, Leiden University Medical Center, Leiden, the Netherlands; Department of Human Genetics, Leiden University Medical Center, Leiden, the Netherlands; Einthoven Laboratory for Experimental Vascular Medicine, Leiden, the Netherlands.

Objective: Obesity and type 2 diabetes mellitus (T2DM) have been associated with increased levels of circulating branched-chain amino acids (BCAAs) that may be involved in the pathogenesis of insulin resistance. However, weight loss has not been consistently associated with the reduction of BCAA levels.

Research Design And Methods: We included 30 obese normal glucose-tolerant (NGT) subjects, 32 obese subjects with T2DM, and 12 lean female subjects. Obese subjects underwent either a restrictive procedure (gastric banding [GB], a very low-calorie diet [VLCD]), or a restrictive/bypass procedure (Roux-en-Y gastric bypass [RYGB] surgery). Fasting blood samples were taken for the determination of amine group containing metabolites 4 weeks before, as well as 3 weeks and 3 months after the intervention.

Results: BCAA levels were higher in T2DM subjects, but not in NGT subjects, compared with lean subjects. Principal component (PC) analysis revealed a concise PC consisting of all BCAAs, which showed a correlation with measures of insulin sensitivity and glucose tolerance. Only after the RYGB procedure, and at both 3 weeks and 3 months, were circulating BCAA levels reduced.

Conclusions: Our data confirm an association between deregulation of BCAA metabolism in plasma and insulin resistance and glucose intolerance. Three weeks after undergoing RYGB surgery, a significant decrease in BCAAs was observed in both NGT and T2DM subjects. After 3 months, despite inducing significant weight loss, neither GB nor VLCD induced a reduction in BCAA levels. Our results indicate that the bypass procedure of RYGB surgery, independent of weight loss or the presence of T2DM, reduces BCAA levels in obese subjects.

Source
http://dx.doi.org/10.2337/dc14-0195 (DOI Listing)
December 2014

Downregulation of the acetyl-CoA metabolic network in adipose tissue of obese diabetic individuals and recovery after weight loss.

Diabetologia 2014 Nov 7;57(11):2384-92. Epub 2014 Aug 7.

Department of Human Genetics, Leiden University Medical Center, Einthovenweg 20, P.O. Box 9600, 2300 RC, Leiden, the Netherlands.

Aims/hypothesis: Not all obese individuals develop type 2 diabetes. Why some obese individuals retain normal glucose tolerance (NGT) is not well understood. We hypothesise that the biochemical mechanisms that underlie the function of adipose tissue can help explain the difference between obese individuals with NGT and those with type 2 diabetes.

Methods: RNA sequencing was used to analyse the transcriptome of samples extracted from visceral adipose tissue (VAT) and subcutaneous adipose tissue (SAT) of obese women with NGT or type 2 diabetes who were undergoing bariatric surgery. The gene expression data were analysed using bioinformatic visualisation and statistical analysis techniques.

Results: A network-based approach to distinguish obese individuals with NGT from obese individuals with type 2 diabetes identified acetyl-CoA metabolic network downregulation as an important feature in the pathophysiology of type 2 diabetes in obese individuals. In general, genes within two reaction steps of acetyl-CoA were found to be downregulated in the VAT and SAT of individuals with type 2 diabetes. Upon weight loss and amelioration of metabolic abnormalities three months following bariatric surgery, the expression level of these genes recovered to levels seen in individuals with NGT. We report four novel genes associated with type 2 diabetes and recovery upon weight loss: ACAT1 (encoding acetyl-CoA acetyltransferase 1), ACACA (encoding acetyl-CoA carboxylase α), ALDH6A1 (encoding aldehyde dehydrogenase 6 family, member A1) and MTHFD1 (encoding methylenetetrahydrofolate dehydrogenase).

Conclusions/interpretation: Downregulation of the acetyl-CoA network in VAT and SAT is an important feature in the pathophysiology of type 2 diabetes in obese individuals. ACAT1, ACACA, ALDH6A1 and MTHFD1 represent novel biomarkers in adipose tissue associated with type 2 diabetes in obese individuals.

Source
http://dx.doi.org/10.1007/s00125-014-3347-0 (DOI Listing)
November 2014

A promoter-level mammalian expression atlas.

Nature 2014 Mar;507(7493):462-70

Regulated transcription controls the diversity, developmental pathways and spatial organization of the hundreds of cell types that make up a mammal. Using single-molecule cDNA sequencing, we mapped transcription start sites (TSSs) and their usage in human and mouse primary cells, cell lines and tissues to produce a comprehensive overview of mammalian gene expression across the human body. We find that few genes are truly 'housekeeping', whereas many mammalian promoters are composite entities composed of several closely separated TSSs, with independent cell-type-specific expression profiles. TSSs specific to different cell types evolve at different rates, whereas promoters of broadly expressed genes are the most conserved. Promoter-based expression analysis reveals key transcription factors defining cell states and links them to binding-site motifs. The functions of identified novel transcripts can be predicted by coexpression and sample ontology enrichment analyses. The functional annotation of the mammalian genome 5 (FANTOM5) project provides comprehensive expression profiles and functional annotation of mammalian cell-type-specific transcriptomes with wide applications in biomedical research.

Source
http://dx.doi.org/10.1038/nature13182 (DOI Listing)
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4529748 (PMC)
March 2014

TSSV: a tool for characterization of complex allelic variants in pure and mixed genomes.

Bioinformatics 2014 Jun 13;30(12):1651-9. Epub 2014 Feb 13.

Department of Human Genetics, Leiden Genome Technology Center, Leiden University Medical Center, Leiden, 2300 RC, The Netherlands and Netherlands Bioinformatics Centre, Leiden, The Netherlands.

Motivation: Advances in sequencing technologies and computational algorithms have enabled the study of genomic variants to dissect their functional consequence. Despite this unprecedented progress, current tools fail to reliably detect and characterize more complex allelic variants, such as short tandem repeats (STRs). We developed TSSV as an efficient and sensitive tool to specifically profile all allelic variants present in targeted loci. Based on its design, requiring only two short flanking sequences, TSSV can work without the use of a complete reference sequence to reliably profile highly polymorphic, repetitive or uncharacterized regions.

Results: We show that TSSV can accurately determine allelic STR structures in mixtures with 10% representation of minor alleles or complex mixtures in which a single STR allele is shared. Furthermore, we show the universal utility of TSSV in two other independent studies: characterizing de novo mutations introduced by transcription activator-like effector nucleases (TALENs) and profiling the noise and systematic errors in an IonTorrent sequencing experiment. TSSV complements the existing tools by aiding the study of highly polymorphic and complex regions and provides a high-resolution map that can be used in a wide range of applications, from personal genomics to forensic analysis and clinical diagnostics.

Availability And Implementation: We have implemented TSSV as a Python package that can be installed from the command line using the command pip install TSSV. Its source code and documentation are available at https://pypi.python.org/pypi/tssv and http://www.lgtc.nl/tssv.
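
The flank-based design described above, in which a locus is defined only by two short flanking sequences rather than a full reference, can be sketched in a few lines of Python. This is not TSSV's code or command-line interface: the function names `extract_allele` and `tally_alleles` are hypothetical, and exact string matching of the flanks is a simplifying assumption made for illustration.

```python
from collections import Counter


def extract_allele(read, left_flank, right_flank):
    """Return the sequence between the two flanks, or None if either flank
    is missing (exact matching only; a simplification for this sketch)."""
    start = read.find(left_flank)
    if start == -1:
        return None
    start += len(left_flank)
    end = read.find(right_flank, start)
    if end == -1:
        return None
    return read[start:end]


def tally_alleles(reads, left_flank, right_flank):
    """Count how often each distinct allele sequence occurs between the flanks."""
    alleles = Counter()
    for read in reads:
        allele = extract_allele(read, left_flank, right_flank)
        if allele is not None:
            alleles[allele] += 1
    return alleles
```

Counting full allele sequences rather than fragment lengths means that two STR alleles of the same length but different internal structure end up in separate bins, which is what makes this kind of profile useful for mixtures.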

Source
http://dx.doi.org/10.1093/bioinformatics/btu068 (DOI Listing)
June 2014

Aging as accelerated accumulation of somatic variants: whole-genome sequencing of centenarian and middle-aged monozygotic twin pairs.

Twin Res Hum Genet 2013 Dec 4;16(6):1026-32. Epub 2013 Nov 4.

Molecular Epidemiology, Medical Statistics and Bioinformatics, Leiden University Medical Center, Leiden, the Netherlands.

It has been postulated that aging is the consequence of an accelerated accumulation of somatic DNA mutations and that subsequent errors in the primary structure of proteins ultimately reach levels sufficient to affect organismal functions. The technical limitations of detecting somatic changes and the lack of insight about the minimum level of erroneous proteins to cause an error catastrophe hampered any firm conclusions on these theories. In this study, we sequenced the whole genome of DNA in whole blood of two pairs of monozygotic (MZ) twins, 40 and 100 years old, by two independent next-generation sequencing (NGS) platforms (Illumina and Complete Genomics). Potentially discordant single-base substitutions supported by both platforms were validated extensively by Sanger, Roche 454, and Ion Torrent sequencing. We demonstrate that the genomes of the two twin pairs are germ-line identical between co-twins, and that the genomes of the 100-year-old MZ twins are discerned by eight confirmed somatic single-base substitutions, five of which are within introns. Putative somatic variation between the 40-year-old twins was not confirmed in the validation phase. We conclude from this systematic effort that by using two independent NGS platforms, somatic single nucleotide substitutions can be detected, and that a century of life did not result in a large number of detectable somatic mutations in blood. The low number of somatic variants observed by using two NGS platforms might provide a framework for detecting disease-related somatic variants in phenotypically discordant MZ twins.

Source
http://dx.doi.org/10.1017/thg.2013.73 (DOI Listing)
December 2013
-->