Publications by authors named "Bahrad A Sokhansanj"

17 Publications

Mapping Data to Deep Understanding: Making the Most of the Deluge of SARS-CoV-2 Genome Sequences.

mSystems 2022 Apr 21;7(2):e0003522. Epub 2022 Mar 21.

Drexel University, Ecological and Evolutionary Signal-Processing and Informatics Laboratory, Department of Electrical & Computer Engineering, College of Engineering, Philadelphia, Pennsylvania, USA.

Next-generation sequencing has been essential to the global response to the COVID-19 pandemic. As of January 2022, nearly 7 million severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) sequences are available to researchers in public databases. Sequence databases are an abundant resource from which to extract biologically relevant and clinically actionable information. As the pandemic has gone on, SARS-CoV-2 has rapidly evolved, involving complex genomic changes that challenge current approaches to classifying SARS-CoV-2 variants. Deep sequence learning could be a powerful way to build complex sequence-to-phenotype models. Unfortunately, while they can be predictive, deep learning methods typically produce "black box" models that cannot directly provide biological and clinical insight. Researchers should therefore consider implementing emerging methods for visualizing and interpreting deep sequence models. Finally, researchers should address important data limitations, including (i) global sequencing disparities, (ii) insufficient sequence metadata, and (iii) screening artifacts due to poor sequence quality control.

Source
DOI: http://dx.doi.org/10.1128/msystems.00035-22
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC9040592
April 2022

Learning, visualizing and exploring 16S rRNA structure using an attention-based deep neural network.

PLoS Comput Biol 2021 Sep 22;17(9):e1009345. Epub 2021 Sep 22.

Ecological and Evolutionary Signal-Processing and Informatics Laboratory, Department of Electrical and Computer Engineering, College of Engineering, Drexel University, Philadelphia, Pennsylvania, United States of America.

Recurrent neural networks with memory and attention mechanisms are widely used in natural language processing because they can capture short- and long-term sequential information for diverse tasks. We propose an integrated deep learning model for microbial DNA sequence data, which exploits convolutional neural networks, recurrent neural networks, and attention mechanisms to predict taxonomic classifications and sample-associated attributes, such as the relationship between the microbiome and host phenotype, on the read/sequence level. In this paper, we develop this novel deep learning approach and evaluate its application to amplicon sequences. We apply our approach to short DNA reads and full sequences of 16S ribosomal RNA (rRNA) marker genes, which identify the heterogeneity of a microbial community sample. We demonstrate that our implementation of a novel attention-based deep network architecture, Read2Pheno, achieves read-level phenotypic prediction. Training Read2Pheno models will encode sequences (reads) into dense, meaningful representations: learned embedded vectors output from the intermediate layer of the network model, which can provide biological insight when visualized. The attention layer of Read2Pheno models can also automatically identify nucleotide regions in reads/sequences which are particularly informative for classification. As such, this novel approach can avoid the pre/post-processing and manual interpretation required with conventional approaches to microbiome sequence classification. We further show, as proof-of-concept, that aggregating read-level information can robustly predict microbial community properties, host phenotype, and taxonomic classification, with performance at least comparable to conventional approaches. An implementation of the attention-based deep learning network is available at https://github.com/EESI/sequence_attention (a Python package) and https://github.com/EESI/seq2att (a command line tool).

Source
DOI: http://dx.doi.org/10.1371/journal.pcbi.1009345
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8496832
September 2021

Amino Acid k-mer Feature Extraction for Quantitative Antimicrobial Resistance (AMR) Prediction by Machine Learning and Model Interpretation for Biological Insights.

Biology (Basel) 2020 Oct 28;9(11). Epub 2020 Oct 28.

Ecological and Evolutionary Signal-Processing and Informatics Laboratory, Department of Electrical and Computer Engineering, College of Engineering, Drexel University, Philadelphia, PA 19104, USA.

Machine learning algorithms can learn mechanisms of antimicrobial resistance from DNA sequence data without any a priori information. Interpreting a trained machine learning algorithm can be exploited for validating the model and obtaining new information about resistance mechanisms. Different feature extraction methods, such as SNP calling and counting nucleotide k-mers, have been proposed for presenting DNA sequences to the model. However, there are trade-offs between interpretability, computational complexity and accuracy for different feature extraction methods. In this study, we have proposed a new feature extraction method, counting amino acid k-mers or oligopeptides, which provides easier model interpretation compared to counting nucleotide k-mers and reaches the same or even better accuracy in comparison with different methods. Additionally, we have trained machine learning algorithms using different feature extraction methods and compared the results in terms of accuracy, model interpretability and computational complexity. We have built a new feature selection pipeline for extraction of important features so that new AMR determinants can be discovered by analyzing these features. This pipeline allows the construction of models that only use a small number of features and can predict resistance accurately.
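
As an illustration of the feature extraction idea described in this abstract, the following Python sketch translates a coding DNA sequence and counts overlapping amino acid k-mers. It is a minimal sketch under stated assumptions, not the authors' pipeline: the function names `translate` and `count_kmers`, the single forward reading frame, and the use of the standard genetic code are all choices made here for illustration.

```python
from collections import Counter
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(dna):
    """Translate DNA in frame 0, stopping at the first stop codon.

    Assumption: the input is a coding sequence starting in frame;
    codons containing ambiguous bases are translated as 'X'.
    """
    dna = dna.upper()
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE.get(dna[i:i + 3], "X")
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


def count_kmers(seq, k):
    """Count overlapping k-mers (oligopeptides when seq is a protein)."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


# Hypothetical example: a short toy coding sequence.
features = count_kmers(translate("ATGGCCATGGCCTAA"), k=2)
```

Each distinct k-mer would become one column of a feature matrix; because an amino acid k-mer maps directly onto a stretch of protein, a feature flagged as important is easier to relate to a gene product than a nucleotide k-mer.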

Source
DOI: http://dx.doi.org/10.3390/biology9110365
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7694136
October 2020

Genetic grouping of SARS-CoV-2 coronavirus sequences using informative subtype markers for pandemic spread visualization.

PLoS Comput Biol 2020 Sep 17;16(9):e1008269. Epub 2020 Sep 17.

Ecological and Evolutionary Signal-Processing and Informatics Laboratory, Department of Electrical and Computer Engineering, College of Engineering, Drexel University, Philadelphia, PA, USA.

We propose an efficient framework for genetic subtyping of SARS-CoV-2, the novel coronavirus that causes the COVID-19 pandemic. Efficient viral subtyping enables visualization and modeling of the geographic distribution and temporal dynamics of disease spread. Subtyping thereby advances the development of effective containment strategies and, potentially, therapeutic and vaccine strategies. However, identifying viral subtypes in real-time is challenging: SARS-CoV-2 is a novel virus, and the pandemic is rapidly expanding. Viral subtypes may be difficult to detect due to rapid evolution; founder effects are more significant than selection pressure; and the clustering threshold for subtyping is not standardized. We propose to identify mutational signatures of available SARS-CoV-2 sequences using a population-based approach: an entropy measure followed by frequency analysis. These signatures, Informative Subtype Markers (ISMs), define a compact set of nucleotide sites that characterize the most variable (and thus most informative) positions in the viral genomes sequenced from different individuals. Through ISM compression, we find that certain distant nucleotide variants covary, including non-coding and ORF1ab sites covarying with the D614G spike protein mutation which has become increasingly prevalent as the pandemic has spread. ISMs are also useful for downstream analyses, such as spatiotemporal visualization of viral dynamics. By analyzing sequence data available in the GISAID database, we validate the utility of ISM-based subtyping by comparing spatiotemporal analyses using ISMs to epidemiological studies of viral transmission in Asia, Europe, and the United States. In addition, we show the relationship of ISMs to phylogenetic reconstructions of SARS-CoV-2 evolution, and therefore, ISMs can play an important complementary role to phylogenetic tree-based analysis, such as is done in the Nextstrain project. 
The developed pipeline dynamically generates ISMs for newly added SARS-CoV-2 sequences and updates the visualization of pandemic spatiotemporal dynamics, and is available on GitHub at https://github.com/EESI/ISM (Jupyter notebook), https://github.com/EESI/ncov_ism (command line tool) and via an interactive website at https://covid19-ism.coe.drexel.edu/.
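
The "entropy measure followed by frequency analysis" described in this abstract can be sketched roughly as follows. This is an illustrative Python sketch only, not the code in the repositories above: it assumes the sequences are already aligned to equal length, uses plain Shannon entropy per alignment column, and the function names and the threshold value are hypothetical.

```python
import math
from collections import Counter


def column_entropy(column):
    """Shannon entropy (bits) of the characters observed at one aligned site."""
    counts = Counter(column)
    total = sum(counts.values())
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def informative_sites(aligned, threshold):
    """Indices of alignment columns whose entropy exceeds the threshold.

    Assumption: all sequences in `aligned` have the same length.
    """
    length = len(aligned[0])
    return [
        i for i in range(length)
        if column_entropy([seq[i] for seq in aligned]) > threshold
    ]


def marker(seq, sites):
    """Compact signature: the nucleotides of one sequence at the chosen sites."""
    return "".join(seq[i] for i in sites)


# Toy alignment (hypothetical data): columns 1 and 3 vary, columns 0 and 2 do not.
alignment = ["ACGT", "ACGA", "ATGT", "ATGA"]
sites = informative_sites(alignment, threshold=0.5)
# Frequency analysis: how often each signature occurs in the sample.
subtype_counts = Counter(marker(seq, sites) for seq in alignment)
```

Counting the resulting signatures per region and per time window is what would feed the kind of spatiotemporal visualization the abstract mentions.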

Source
DOI: http://dx.doi.org/10.1371/journal.pcbi.1008269
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7523987
September 2020

Discovering the unknown: improving detection of novel species and genera from short reads.

J Biomed Biotechnol 2011 23;2011:495849. Epub 2011 Mar 23.

Department of Electrical and Computer Engineering, Drexel University, Philadelphia, PA 19104, USA.

High-throughput sequencing technologies enable metagenome profiling, the simultaneous sequencing of multiple microbial species present within an environmental sample. Since metagenomic data includes sequence fragments ("reads") from organisms that are absent from any database, new algorithms must be developed for the identification and annotation of novel sequence fragments. Homology-based techniques have been modified to detect novel species and genera, but composition-based methods have not been adapted. We develop a detection technique that can discriminate between "known" and "unknown" taxa, which can be used with composition-based methods, as well as a hybrid method. Unlike previous studies, we rigorously evaluate all algorithms for their ability to detect novel taxa. First, we show that the integration of a detector with a composition-based method performs significantly better than homology-based methods for the detection of novel species and genera, with best performance at finer taxonomic resolutions. Most importantly, we evaluate all the algorithms by introducing an "unknown" class and show that the modified version of PhymmBL has similar or better overall classification performance than the other modified algorithms, especially at the species level and for ultrashort reads. Finally, we evaluate the performance of several algorithms on a real acid mine drainage dataset.

Source
DOI: http://dx.doi.org/10.1155/2011/495849
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3085467
August 2011

Signal processing for metagenomics: extracting information from the soup.

Curr Genomics 2009 Nov;10(7):493-510

Electrical and Computer Engineering Department, Drexel University, Philadelphia, PA, USA.

Traditionally, studies in microbial genomics have focused on single-genomes from cultured species, thereby limiting their focus to the small percentage of species that can be cultured outside their natural environment. Fortunately, recent advances in high-throughput sequencing and computational analyses have ushered in the new field of metagenomics, which aims to decode the genomes of microbes from natural communities without the need for cultivation. Although metagenomic studies have shed a great deal of insight into bacterial diversity and coding capacity, several computational challenges remain due to the massive size and complexity of metagenomic sequence data. Current tools and techniques are reviewed in this paper which address challenges in 1) genomic fragment annotation, 2) phylogenetic reconstruction, 3) functional classification of samples, and 4) interpreting complementary metaproteomics and metametabolomics data. Also surveyed are important applications of metagenomic studies, including microbial forensics and the roles of microbial communities in shaping human health and soil ecology.

Source
DOI: http://dx.doi.org/10.2174/138920209789208255
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2808676
November 2009

Mining, modeling, and evaluation of subnetworks from large biomolecular networks and its comparison study.

IEEE Trans Inf Technol Biomed 2009 Mar;13(2):184-94

College of Information Science and Technology, Drexel University, Philadelphia, PA 19104, USA.

In this paper, we present a novel method to mine, model, and evaluate a regulatory system executing cellular functions that can be represented as a biomolecular network. Our method consists of two steps. First, a novel scale-free network clustering approach is applied to such a biomolecular network to obtain various subnetworks. Second, computational models are generated for the subnetworks and simulated to predict their behavior in the cellular context. We discuss and evaluate some of the advanced computational modeling approaches, in particular, state-space modeling, probabilistic Boolean network modeling, and fuzzy logic modeling. The modeling and simulation results represent hypotheses that are tested against high-throughput biological datasets (microarrays and/or genetic screens) under normal and perturbation conditions. Experimental results on time-series gene expression data for the human cell cycle indicate that our approach is promising for subnetwork mining and simulation from large biomolecular networks.

Source
DOI: http://dx.doi.org/10.1109/TITB.2008.2007649
March 2009

Multi-platform investigation of the metabolome in a leptin receptor defective murine model of type 2 diabetes.

Mol Biosyst 2008 Oct 7;4(10):1015-23. Epub 2008 Aug 7.

School of Biomedical Engineering, Science, and Health Systems, Drexel University, 3141 Chestnut Street, Philadelphia, Pennsylvania 19104, USA.

We describe a multi-platform (¹H NMR, LC-MS, microarray) investigation of metabolic disturbances associated with the leptin receptor defective (db/db) mouse model of type 2 diabetes using novel assignment methodologies. For the first time, several urinary metabolites were found to be associated with diabetes and/or diabetes progression and confirmed in both NMR and LC-MS datasets. The confirmed metabolites were trimethylamine-N-oxide (TMAO), creatine, carnitine, and phenylalanine. TMAO and phenylalanine were both elevated in db/db mice and decreased in these mice with age. Levels of both creatine and carnitine increased in diabetic mice with age, and creatine was also significantly decreased in db/db mice. Additionally, many metabolic markers were found by either NMR or LC-MS, but could not be found in both, due to instrumental limitations. This indicates that the combined use of NMR and LC-MS instrumentation provides complementary information that would be otherwise unattainable. Pathway analyses of urinary metabolites and liver, muscle, and adipose tissue transcripts from the db/db model were also performed to identify altered biochemical processes in the diabetic mice. Metabolite and liver transcript levels associated with the TCA cycle and steroid processes were altered in db/db mice. In addition, gene expression in muscle and liver associated with fatty acid processing was altered in the diabetic mice and similar evidence was observed in the LC-MS data. Our findings highlight the importance of a number of processes known to be associated with diabetes and reveal tissue-specific responses to the condition. When studying metabolic disorders such as diabetes, multiple-platform integrated profiling of metabolite alterations in biofluids can provide important insights into the processes underlying the disease.

Source
DOI: http://dx.doi.org/10.1039/b807332e
October 2008

Integrated modeling methodology for microtubule dynamics and Taxol kinetics with experimentally identifiable parameters.

Comput Methods Programs Biomed 2007 Oct 17;88(1):18-25. Epub 2007 Aug 17.

School of Biomedical Engineering, Science and Health Systems, Drexel University, 3141 Chestnut St., Philadelphia, PA 19104, USA.

Microtubule dynamics play a critical role in cell function and stress response, modulating mitosis, morphology, signaling, and transport. Drugs such as paclitaxel (Taxol) can impact tubulin polymerization and affect microtubule dynamics. While theoretical methods have been previously proposed to simulate microtubule dynamics, we develop a methodology here that can be used to compare model predictions with experimental data. Our model is a hybrid of (1) a simple two-state stochastic formulation of tubulin polymerization kinetics and (2) an equilibrium approximation for the chemical kinetics of Taxol drug binding to microtubule ends. Model parameters are biologically realistic, with values taken directly from experimental measurements. Model validation is conducted against published experimental data comparing optical measurements of microtubule dynamics in cultured cells under normal and Taxol-treated conditions. To compare model predictions with experimental data requires applying a "windowing" strategy on the spatiotemporal resolution of the simulation. From a biological perspective, this is consistent with interpreting the microtubule "pause" phenomenon as at least partially an artifact of spatiotemporal resolution limits on experimental measurement.
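
To make the "two-state stochastic formulation" and the "windowing" idea more concrete, here is a minimal Python sketch. It is not the authors' model: it omits the Taxol binding component entirely, the parameter names and values are hypothetical rather than taken from experimental measurements, and block averaging is only one assumed way to coarsen a simulated trajectory to a lower observation resolution.

```python
import random


def simulate_microtubule(steps, dt, v_grow, v_shrink, f_cat, f_res, seed=0):
    """Two-state (growing/shrinking) stochastic length trajectory.

    f_cat and f_res are the switching frequencies (per unit time) from
    growth to shrinkage and back; all values used here are hypothetical.
    """
    rng = random.Random(seed)
    length, growing = 0.0, True
    trajectory = [length]
    for _ in range(steps):
        # Switch state with probability (frequency * time step).
        if growing:
            if rng.random() < f_cat * dt:
                growing = False
        elif rng.random() < f_res * dt:
            growing = True
        length += v_grow * dt if growing else -v_shrink * dt
        if length <= 0.0:
            # Assumption: a fully depolymerized microtubule regrows.
            length, growing = 0.0, True
        trajectory.append(length)
    return trajectory


def window(trajectory, size):
    """Average consecutive blocks of `size` samples (assumed coarsening step)."""
    return [
        sum(trajectory[i:i + size]) / size
        for i in range(0, len(trajectory) - size + 1, size)
    ]


fine = simulate_microtubule(steps=100, dt=0.1, v_grow=1.0, v_shrink=2.0,
                            f_cat=0.5, f_res=0.5)
coarse = window(fine, size=10)
```

After coarsening, short alternating runs of growth and shrinkage can average out to an apparently constant length, which is one way to picture how a "pause" could arise from limited measurement resolution.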

Source
DOI: http://dx.doi.org/10.1016/j.cmpb.2007.07.004
October 2007

Accelerated search for biomolecular network models to interpret high-throughput experimental data.

BMC Bioinformatics 2007 Jul 18;8:258. Epub 2007 Jul 18.

School of Biomedical Engineering, Science and Health Systems, Drexel University, Philadelphia, PA 19104, USA.

Background: The functions of human cells are carried out by biomolecular networks, which include proteins, genes, and regulatory sites within DNA that encode and control protein expression. Models of biomolecular network structure and dynamics can be inferred from high-throughput measurements of gene and protein expression. We build on our previously developed fuzzy logic method for bridging quantitative and qualitative biological data to address the challenges of noisy, low resolution high-throughput measurements, i.e., from gene expression microarrays. We employ an evolutionary search algorithm to accelerate the search for hypothetical fuzzy biomolecular network models consistent with a biological data set. We also develop a method to estimate the probability of a potential network model fitting a set of data by chance. The resulting metric provides an estimate of both model quality and dataset quality, identifying data that are too noisy to identify meaningful correlations between the measured variables.

Results: Optimal parameters for the evolutionary search were identified based on artificial data, and the algorithm showed scalable and consistent performance for as many as 150 variables. The method was tested on previously published human cell cycle gene expression microarray data sets. The evolutionary search method was found to converge to the results of exhaustive search. The randomized evolutionary search was able to converge on a set of similar best-fitting network models on different training data sets after 30 generations running 30 models per generation. Consistent results were found regardless of which of the published data sets were used to train or verify the quantitative predictions of the best-fitting models for cell cycle gene dynamics.

Conclusion: Our results demonstrate the capability of scalable evolutionary search for fuzzy network models to address the problem of inferring models based on complex, noisy biomolecular data sets. This approach yields multiple alternative models that are consistent with the data, yielding a constrained set of hypotheses that can be used to optimally design subsequent experiments.

Source
DOI: http://dx.doi.org/10.1186/1471-2105-8-258
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC1940030
July 2007

Systems approaches to the networks of aging.

Ageing Res Rev 2006 Nov 14;5(4):434-48. Epub 2006 Aug 14.

School of Biomedical Engineering, Science and Health Systems, Drexel University, Chestnut Street 3401, Philadelphia, PA 19104, USA.

The aging of an organism is the result of complex changes in structure and function of molecules, cells, tissues, and whole body systems. To increase our understanding of how aging works, we have to analyze and integrate quantitative evidence from multiple levels of biological organization. Here, we define a broader conceptual framework for a quantitative, computational systems biology approach to aging. Initially, we consider fractal supply networks that give rise to scaling laws relating body mass, metabolism and lifespan. This approach provides a top-down view of constrained cellular processes. Concomitantly, multi-omics data generation builds such a framework from the bottom up, using modeling strategies to identify key pathways and their physiological capacity. Multiscale spatio-temporal representations finally connect molecular processes with structural organization. As aging manifests on a systems level, it emerges as a highly networked process regulated through feedback loops between levels of biological organization.

Source
DOI: http://dx.doi.org/10.1016/j.arr.2006.06.002
November 2006

Estimating the effect of human base excision repair protein variants on the repair of oxidative DNA base damage.

Cancer Epidemiol Biomarkers Prev 2006 May;15(5):1000-8

School of Biomedical Engineering, Science and Health Systems, Drexel University, 3141 Chestnut Street, Philadelphia, PA 19104, USA.

Epidemiologic studies have revealed a complex association between human genetic variance and cancer risk. Quantitative biological modeling based on experimental data can play a critical role in interpreting the effect of genetic variation on biochemical pathways relevant to cancer development and progression. Defects in human DNA base excision repair (BER) proteins can reduce cellular tolerance to oxidative DNA base damage caused by endogenous and exogenous sources, such as exposure to toxins and ionizing radiation. If not repaired, DNA base damage leads to cell dysfunction and mutagenesis, consequently leading to cancer, disease, and aging. Population screens have identified numerous single-nucleotide polymorphism variants in many BER proteins and some have been purified and found to exhibit mild kinetic defects. Epidemiologic studies have led to conflicting conclusions on the association between single-nucleotide polymorphism variants in BER proteins and cancer risk. Using experimental data for cellular concentration and the kinetics of normal and variant BER proteins, we apply a previously developed and tested human BER pathway model to (i) estimate the effect of mild variants on BER of abasic sites and 8-oxoguanine, a prominent oxidative DNA base modification, (ii) identify ranges of variation associated with substantial BER capacity loss, and (iii) reveal nonintuitive consequences of multiple simultaneous variants. Our findings support previous work suggesting that mild BER variants have a minimal effect on pathway capacity whereas more severe defects and simultaneous variation in several BER proteins can lead to inefficient repair and potentially deleterious consequences of cellular damage.

Source
DOI: http://dx.doi.org/10.1158/1055-9965.EPI-05-0817
May 2006

Temporal global changes in gene expression during temperature transition in Yersinia pestis.

J Bacteriol 2004 Sep;186(18):6298-305

Biology and Biotechnology Research Program, L-452, 7000 East Ave., Livermore, CA 94550, USA.

DNA microarrays encompassing the entire genome of Yersinia pestis were used to characterize global regulatory changes during steady-state vegetative growth occurring after shift from 26 to 37°C in the presence and absence of Ca²⁺. Transcriptional profiles revealed that 51, 4, and 13 respective genes and open reading frames (ORFs) on pCD, pPCP, and pMT were thermoinduced and that the majority of these genes carried by pCD were downregulated by Ca²⁺. In contrast, Ca²⁺ had little effect on chromosomal genes and ORFs, of which 235 were thermally upregulated and 274 were thermally downregulated. The primary consequence of these regulatory events is profligate catabolism of numerous metabolites available in the mammalian host.

Source
DOI: http://dx.doi.org/10.1128/JB.186.18.6298-6305.2004
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC515171
September 2004

Linear fuzzy gene network models obtained from microarray data by exhaustive search.

BMC Bioinformatics 2004 Aug 10;5:108. Epub 2004 Aug 10.

Computational Systems Biology Group, University of California, Lawrence Livermore National Laboratory, L-235, 7000 East Ave, Livermore, CA 94551, USA.

Background: Recent technological advances in high-throughput data collection allow for experimental study of increasingly complex systems on the scale of the whole cellular genome and proteome. Gene network models are needed to interpret the resulting large and complex data sets. Rationally designed perturbations (e.g., gene knock-outs) can be used to iteratively refine hypothetical models, suggesting an approach for high-throughput biological system analysis. We introduce an approach to gene network modeling based on a scalable linear variant of fuzzy logic: a framework with greater resolution than Boolean logic models, but which, while still semi-quantitative, does not require the precise parameter measurement needed for chemical kinetics-based modeling.

Results: We demonstrated our approach with exhaustive search for fuzzy gene interaction models that best fit transcription measurements by microarray of twelve selected genes regulating the yeast cell cycle. Applying an efficient, universally applicable data normalization and fuzzification scheme, the search converged to a small number of models that individually predict experimental data within an error tolerance. Because only gene transcription levels are used to develop the models, they include both direct and indirect regulation of genes.

Conclusion: Biological relationships in the best-fitting fuzzy gene network models successfully recover direct and indirect interactions predicted from previous knowledge to result in transcriptional correlation. Fuzzy models fit on one yeast cell cycle data set robustly predict another experimental data set for the same system. Linear fuzzy gene networks and exhaustive rule search are the first steps towards a framework for an integrated modeling and experiment approach to high-throughput "reverse engineering" of complex biological systems.

Source
DOI: http://dx.doi.org/10.1186/1471-2105-5-108
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC514698
August 2004

Oxidative DNA damage background estimated by a system model of base excision repair.

Free Radic Biol Med 2004 Aug;37(3):422-7

Chemistry and Materials Science Directorate, Lawrence Livermore National Laboratory, Livermore, CA 94500, USA.

Human DNA can be damaged by natural metabolism through free radical production. It has been suggested that the equilibrium between innate damage and cellular DNA repair results in an oxidative DNA damage background that potentially contributes to disease and aging. Efforts to quantitatively characterize the human oxidative DNA damage background level, based on measuring 8-oxoguanine lesions as a biomarker, have led to estimates that vary over three to four orders of magnitude, depending on the method of measurement. We applied a previously developed and validated quantitative pathway model of human DNA base excision repair, integrating experimentally determined endogenous damage rates and model parameters from multiple sources. Our estimates of at most 100 8-oxoguanine lesions per cell are consistent with the low end of data from biochemical and cell biology experiments, a result robust to model limitations and parameter variation. Our findings show the power of quantitative system modeling to interpret composite experimental data and make biologically and physiologically relevant predictions for complex human DNA repair pathway mechanisms and capacity.

Source
DOI: http://dx.doi.org/10.1016/j.freeradbiomed.2004.05.003
August 2004

A quantitative model of human DNA base excision repair. I. Mechanistic insights.

Nucleic Acids Res 2002 Apr;30(8):1817-25

Biology and Biotechnology Research Program, L-441, University of California, Lawrence Livermore National Laboratory, Livermore, CA 94551-9900, USA.

Base excision repair (BER) is a multistep process involving the sequential activity of several proteins that cope with spontaneous and environmentally induced mutagenic and cytotoxic DNA damage. Quantitative kinetic data on single proteins of BER have been used here to develop a mathematical model of the BER pathway. This model was then employed to evaluate mechanistic issues and to determine the sensitivity of pathway throughput to altered enzyme kinetics. Notably, the model predicts considerably less pathway throughput than observed in experimental in vitro assays. This finding, in combination with the effects of pathway cooperativity on model throughput, supports the hypothesis of cooperation during abasic site repair and between the apurinic/apyrimidinic (AP) endonuclease, Ape1, and the 8-oxoguanine DNA glycosylase, Ogg1. The quantitative model also predicts that for 8-oxoguanine and hydrolytic AP site damage, short-patch Polβ-mediated BER dominates, with minimal switching to the long-patch subpathway. Sensitivity analysis of the model indicates that the Polβ-catalyzed reactions have the most control over pathway throughput, although other BER reactions contribute to pathway efficiency as well. The studies within represent a first step in a developing effort to create a predictive model for BER cellular capacity.

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC113225
DOI: http://dx.doi.org/10.1093/nar/30.8.1817
April 2002

Genetic variability of Yersinia pestis isolates as predicted by PCR-based IS100 genotyping and analysis of structural genes encoding glycerol-3-phosphate dehydrogenase (glpD).

J Bacteriol 2002 Feb;184(4):1019-27

Lawrence Livermore National Laboratory, University of California, Livermore, California 94550, USA.

A PCR-based genotyping system that detects divergence of IS100 locations within the Yersinia pestis genome was used to characterize a large collection of isolates of different biovars and geographical origins. Using sequences derived from the glycerol-negative biovar orientalis strain CO92, a set of 27 locus-specific primers was designed to amplify fragments between the end of IS100 and its neighboring gene. Geographically diverse members of the orientalis biovar formed a homogeneous group with identical genotype with the exception of strains isolated in Indochina. In contrast, strains belonging to the glycerol-positive biovar antiqua showed a variety of fingerprinting profiles. Moreover, strains of the biovar medievalis (also glycerol positive) clustered together with the antiqua isolates originating from Southeast Asia, suggesting their close phylogenetic relationships. Interestingly, a Manchurian biovar antiqua strain Nicholisk 51 displayed a genotyping pattern typical of biovar orientalis isolates. Analysis of the glycerol pathway in Y. pestis suggested that a 93-bp deletion within the glpD gene encoding aerobic glycerol-3-phosphate dehydrogenase might account for the glycerol-negative phenotype of the orientalis biovar. The glpD gene of strain Nicholisk 51 did not possess this deletion, although it contained two nucleotide substitutions characteristic of the glpD version found exclusively in biovar orientalis strains. To account for this close relationship between biovar orientalis strains and the antiqua Nicholisk 51 isolate, we postulate that the latter represents a variant of this biovar with restored ability to ferment glycerol. The fact that such a genetic lesion might be repaired as part of the natural evolutionary process suggests the existence of genetic exchange between different Yersinia strains in nature. The relevance of this observation to the emergence of epidemic Y. pestis strains is discussed.

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC134790
DOI: http://dx.doi.org/10.1128/jb.184.4.1019-1027.2002
February 2002