Publications by authors named "Deepak Bandyopadhyay"

20 Publications

Scaffold-Based Analytics: Enabling Hit-to-Lead Decisions by Visualizing Chemical Series Linked across Large Datasets.

J Chem Inf Model 2019 11 29;59(11):4880-4892. Epub 2019 Oct 29.

National Center for Advancing Translational Science, 9800 Medical Center Drive, Rockville, Maryland 20850, United States.

We present a method for visualizing and navigating large screening datasets while also taking into account their activities and properties. Our approach is to annotate the data with all possible scaffolds contained within each molecule. We have developed a Spotfire visualization, coupled to a fuzzy clustering approach based on the scaffold decomposition of the screening deck, used to drive the hit triage process. Progression decisions can be made using aggregate scaffold parameters and data from multiple datasets merged at the scaffold level. This visualization reveals overlaps that help prioritize hits, highlight tractable series, and posit ways to combine aspects of multiple hits. The structure-activity relationship of a large and complex hit is automatically mapped onto all constituent scaffolds, making it possible to navigate, via any shared scaffold, to all related hits. This scaffold "walking" helps address bias toward a handful of potent and ligand-efficient molecules at the expense of coverage of chemical space. We consider two scaffold generation methods and explore their similarities and differences both qualitatively and quantitatively. The workflow of a Spotfire visualization used in combination with fuzzy clustering and structure annotation provides an intuitive view of large and diverse screening datasets. This allows teams to effortlessly navigate between structurally related molecules and enriches the population of leads considered and progressed in a manner complementary to established approaches.
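
The scaffold-level aggregation and scaffold "walking" described in this abstract can be sketched roughly as follows. This is a minimal illustration and not the authors' Spotfire workflow: the scaffold annotations are supplied by hand as hypothetical labels (a real pipeline would derive them with a cheminformatics toolkit), and the function names, field names, and pIC50 values are all assumptions made for the example.

```python
from collections import defaultdict
from statistics import mean


def aggregate_by_scaffold(records):
    """Group compounds under every scaffold they contain and summarize activity."""
    groups = defaultdict(list)
    for rec in records:
        # A compound is filed under each of its scaffolds, so groups overlap.
        for scaffold in rec["scaffolds"]:
            groups[scaffold].append(rec)
    summary = {}
    for scaffold, members in groups.items():
        summary[scaffold] = {
            "n_compounds": len(members),
            "mean_pic50": mean(m["pic50"] for m in members),
            "max_pic50": max(m["pic50"] for m in members),
            # Datasets are merged at the scaffold level.
            "datasets": sorted({m["dataset"] for m in members}),
        }
    return summary


def related_hits(records, compound_id):
    """Scaffold 'walking': other compounds sharing any scaffold with compound_id."""
    by_id = {r["id"]: r for r in records}
    wanted = set(by_id[compound_id]["scaffolds"])
    return sorted(
        r["id"]
        for r in records
        if r["id"] != compound_id and wanted & set(r["scaffolds"])
    )


# Hypothetical toy data: scaffold labels and activities are invented.
records = [
    {"id": "cpd1", "dataset": "screenA", "pic50": 6.0,
     "scaffolds": ["indole", "indole-piperidine"]},
    {"id": "cpd2", "dataset": "screenB", "pic50": 7.0,
     "scaffolds": ["indole"]},
    {"id": "cpd3", "dataset": "screenA", "pic50": 5.0,
     "scaffolds": ["quinoline"]},
]
summary = aggregate_by_scaffold(records)
```

Because each compound appears under every scaffold it contains, a large hit contributes to the aggregate parameters of all its constituent scaffolds, which is what makes navigation between related hits possible in this sketch.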

Source
http://dx.doi.org/10.1021/acs.jcim.9b00243 (DOI)
November 2019

Identification via a Parallel Hit Progression Strategy of Improved Small Molecule Inhibitors of the Malaria Purine Uptake Transporter that Inhibit Parasite Proliferation.

ACS Infect Dis 2019 10 14;5(10):1738-1753. Epub 2019 Aug 14.

Platform Technology & Science and Discovery Partners in Academia, GlaxoSmithKline, 1250 South Collegeville Road, Collegeville, Pennsylvania 19426, United States.

Emerging resistance to current antimalarial medicines underscores the importance of identifying new drug targets and novel compounds. Malaria parasites are purine auxotrophic and import purines via the equilibrative nucleoside transporter type 1 (PfENT1). We previously showed that PfENT1 inhibitors block parasite proliferation in culture. Our goal was to identify additional, possibly more optimal chemical starting points for a drug discovery campaign. We performed a high throughput screen (HTS) of GlaxoSmithKline's 1.8 million compound library with a yeast-based assay to identify PfENT1 inhibitors. We used a parallel progression strategy for hit validation and expansion, with an emphasis on chemical properties in addition to potency. In one arm, the most active hits were tested for human cell toxicity; 201 had minimal toxicity. The second arm, hit expansion, used a scaffold-based substructure search with the HTS hits as templates to identify over 2000 compounds; 123 compounds had activity. Of these 324 compounds, 175 compounds inhibited proliferation of parasite strain 3D7 with IC50 values between 0.8 and ∼180 μM. One hundred forty-two compounds inhibited PfENT1 knockout (Δ) parasite growth, indicating they also hit secondary targets. Thirty-two hits inhibited growth of 3D7 but not Δ parasites. Thus, PfENT1 inhibition was sufficient to block parasite proliferation. Therefore, PfENT1 may be a viable target for antimalarial drug development. Six compounds with novel chemical scaffolds were extensively characterized in yeast-, parasite-, and human-erythrocyte-based assays. The inhibitors showed similar potencies against drug sensitive and resistant strains. They represent attractive starting points for development of novel antimalarial drugs.

Source
http://dx.doi.org/10.1021/acsinfecdis.9b00168 (DOI)
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7171677 (PMC)
October 2019

Discovery and Lead-Optimization of 4,5-Dihydropyrazoles as Mono-Kinase Selective, Orally Bioavailable and Efficacious Inhibitors of Receptor Interacting Protein 1 (RIP1) Kinase.

J Med Chem 2019 05 2;62(10):5096-5110. Epub 2019 May 2.

Flexible Discovery Unit, GlaxoSmithKline, 25-27 avenue du Québec, 91951 Les Ulis Cedex, France.

RIP1 kinase regulates necroptosis and inflammation and may play an important role in contributing to a variety of human pathologies, including inflammatory and neurological diseases. Currently, RIP1 kinase inhibitors have advanced into early clinical trials for evaluation in inflammatory diseases such as psoriasis, rheumatoid arthritis, and ulcerative colitis and neurological diseases such as amyotrophic lateral sclerosis and Alzheimer's disease. In this paper, we report on the design of potent and highly selective dihydropyrazole (DHP) RIP1 kinase inhibitors starting from a high-throughput screen and the lead-optimization of this series from a lead with minimal rat oral exposure to the identification of dihydropyrazole 77 with good pharmacokinetic profiles in multiple species. Additionally, we identified a potent murine RIP1 kinase inhibitor 76 as a valuable in vivo tool molecule suitable for evaluating the role of RIP1 kinase in chronic models of disease. DHP 76 showed efficacy in mouse models of both multiple sclerosis and human retinitis pigmentosa.

Source
http://dx.doi.org/10.1021/acs.jmedchem.9b00318 (DOI)
May 2019

Discovery and electrophysiological characterization of SKF-32802: A novel hERG agonist found through a large-scale structural similarity search.

Eur J Pharmacol 2018 Jan 16;818:306-327. Epub 2017 Oct 16.

Screening, Profiling and Mechanistic Biology, GlaxoSmithKline, 1250 South Collegeville Road, Collegeville, PA 19426, USA.

Despite the importance of the hERG channel in drug discovery and the sizable number of antagonist molecules discovered, only a few hERG agonists have been discovered. Here we report a novel hERG agonist, SKF-32802, and a structural analog of the agonist NS3623, SB-335573. These were discovered through a similarity search of published hERG agonists. SKF-32802 incorporates an amide linker rather than NS3623's urea, resulting in a compound with a different mechanism of action. We find that both compounds decrease the time constant of open channel kinetics, increase the amplitude of the envelope of tails assay, mildly increase the amplitude of the IV curve, bind the hERG channel in either open or closed states, increase the plateau of the voltage dependence of activation and modulate the effects of the hERG antagonist, quinidine. Neither compound affects inactivation or deactivation kinetics, a property unique among hERG agonists. Additionally, SKF-32802 induces a leftward shift in the voltage dependence of activation. Our structural models show that both compounds make strong bridging interactions with multiple channel subunits and are stabilized by internal hydrogen bonding similar to NS3623, PD-307243 and RPR26024. While SB-335573 binds in a fashion nearly identical to NS3623, SKF-32802 makes an additional hydrogen bond with neighboring threonine 623. In summary, SB-335573 is a type 4 agonist which increases open channel probability while SKF-32802 is a type 3 agonist which induces a leftward shift in the voltage dependence of activation.

Source
http://dx.doi.org/10.1016/j.ejphar.2017.10.015 (DOI)
January 2018

Discovery of a First-in-Class Receptor Interacting Protein 1 (RIP1) Kinase Specific Clinical Candidate (GSK2982772) for the Treatment of Inflammatory Diseases.

J Med Chem 2017 02 10;60(4):1247-1261. Epub 2017 Feb 10.

Centre for Immunobiology, Blizard Institute, Barts, and The London School of Medicine and Dentistry, Queen Mary University of London, E1 2AD London, U.K.

RIP1 regulates necroptosis and inflammation and may play an important role in contributing to a variety of human pathologies, including immune-mediated inflammatory diseases. Small-molecule inhibitors of RIP1 kinase that are suitable for advancement into the clinic have yet to be described. Herein, we report our lead optimization of a benzoxazepinone hit from a DNA-encoded library and the discovery and profile of clinical candidate GSK2982772 (compound 5), currently in phase 2a clinical studies for psoriasis, rheumatoid arthritis, and ulcerative colitis. Compound 5 potently binds to RIP1 with exquisite kinase specificity and has excellent activity in blocking many TNF-dependent cellular responses. Highlighting its potential as a novel anti-inflammatory agent, the inhibitor was also able to reduce spontaneous production of cytokines from human ulcerative colitis explants. The highly favorable physicochemical and ADMET properties of 5, combined with high potency, led to a predicted low oral dose in humans.

Source
http://dx.doi.org/10.1021/acs.jmedchem.6b01751 (DOI)
February 2017

DNA-Encoded Library Screening Identifies Benzo[b][1,4]oxazepin-4-ones as Highly Potent and Monoselective Receptor Interacting Protein 1 Kinase Inhibitors.

J Med Chem 2016 Mar 23;59(5):2163-78. Epub 2016 Feb 23.

Platform Technology & Science, GlaxoSmithKline, King of Prussia, Pennsylvania 19406, United States.

The recent discovery of the role of receptor interacting protein 1 (RIP1) kinase in tumor necrosis factor (TNF)-mediated inflammation has led to its emergence as a highly promising target for the treatment of multiple inflammatory diseases. We screened RIP1 against GSK's DNA-encoded small-molecule libraries and identified a novel highly potent benzoxazepinone inhibitor series. We demonstrate that this template possesses complete monokinase selectivity for RIP1 plus unique species selectivity for primate versus nonprimate RIP1. We elucidate the conformation of RIP1 bound to this benzoxazepinone inhibitor driving its high kinase selectivity and design specific mutations in murine RIP1 to restore potency to levels similar to primate RIP1. This series differentiates itself from known RIP1 inhibitors in combining high potency and kinase selectivity with good pharmacokinetic profiles in rodents. The favorable developability profile of this benzoxazepinone template, as exemplified by compound 14 (GSK'481), makes it an excellent starting point for further optimization into a RIP1 clinical candidate.

Source
http://dx.doi.org/10.1021/acs.jmedchem.5b01898 (DOI)
March 2016

Discovery of Small Molecule RIP1 Kinase Inhibitors for the Treatment of Pathologies Associated with Necroptosis.

ACS Med Chem Lett 2013 Dec 4;4(12):1238-43. Epub 2013 Nov 4.

Pattern Recognition Receptor DPU and Platform Technology & Science, GlaxoSmithKline, Collegeville Road, Collegeville, Pennsylvania 19426, United States.

Potent inhibitors of RIP1 kinase from three distinct series, 1-aminoisoquinolines, pyrrolo[2,3-b]pyridines, and furo[2,3-d]pyrimidines, all of the type II class recognizing a DLG-out inactive conformation, were identified from screening of our in-house kinase focused sets. An exemplar from the furo[2,3-d]pyrimidine series showed a dose proportional response in protection from hypothermia in a mouse model of TNFα induced lethal shock.

Source
http://dx.doi.org/10.1021/ml400382p (DOI)
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4027519 (PMC)
December 2013

Mining Natural-Products Screening Data for Target-Class Chemical Motifs.

J Biomol Screen 2014 Jun 11;19(5):749-57. Epub 2014 Feb 11.

Molecular Discovery Research, GlaxoSmithKline R&D Pharmaceuticals, Tres Cantos, Spain.

In this article, we describe two complementary data-mining approaches used to characterize the GlaxoSmithKline (GSK) natural-products set (NPS) based on information from the high-throughput screening (HTS) databases. Both methods rely on the aggregation and analysis of a large set of single-shot screening data for a number of biological assays, with the goal to reveal natural-product chemical motifs. One of them is an established method based on the data-driven clustering of compounds using a wide range of descriptors (1), whereas the other method partitions and hierarchically clusters the data to identify chemical cores (2,3). Both methods successfully find structural scaffolds that significantly hit different groups of discrete drug targets, compared with their relative frequency of demonstrating inhibitory activity in a large number of screens. We describe how these methods can be applied to unveil hidden information in large single-shot HTS data sets. Applied prospectively, this type of information could contribute to the design of new chemical templates for drug-target classes and guide synthetic efforts for lead optimization of tractable hits that are based on natural-product chemical motifs. Relevant findings for 7TM receptors (7TMRs), ion channels, class-7 transferases (protein kinases), hydrolases, and oxidoreductases will be discussed.

Source
http://dx.doi.org/10.1177/1087057114521463 (DOI)
June 2014

Perspectives on the discovery of small-molecule modulators for epigenetic processes.

J Biomol Screen 2012 Jun 5;17(5):555-71. Epub 2012 Mar 5.

GlaxoSmithKline, Collegeville, Pennsylvania, USA.

Epigenetic gene regulation is a critical process controlling differentiation and development, the malfunction of which may underpin a variety of diseases. In this article, we review the current landscape of small-molecule epigenetic modulators including drugs on the market, key compounds in clinical trials, and chemical probes being used in epigenetic mechanistic studies. Hit identification strategies for the discovery of small-molecule epigenetic modulators are summarized with respect to writers, erasers, and readers of histone marks. Perspectives are provided on opportunities for new hit discovery approaches, some of which may define the next generation of therapeutic intervention strategies for epigenetic processes.

Source
http://dx.doi.org/10.1177/1087057112437763 (DOI)
June 2012

Stochastic Proximity Embedding: Methods and Applications.

Mol Inform 2010 Nov 17;29(11):758-70. Epub 2010 Nov 17.

Johnson & Johnson Pharmaceutical Research & Development, L.L.C., Welsh & McKean Roads, Spring House, PA 19477, USA tel: (215) 628-6814.

Since its inception in 1996, the stochastic proximity embedding (SPE) algorithm and its variants have been applied to a wide range of problems in computational chemistry and biology with notable success. At its core, SPE attempts to generate Euclidean coordinates for a set of points so that they satisfy a prescribed set of geometric constraints. The algorithm's appeal rests on three factors: 1) its conceptual and programmatic simplicity; 2) its superior speed and scaling properties; and 3) its broad applicability. Here, we review some of the key applications, outline known limitations and ways to circumvent them, and highlight additional problem domains where the use of this technique could lead to significant breakthroughs.
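
The core idea stated in this abstract, generating Euclidean coordinates that satisfy a prescribed set of distance constraints, can be illustrated with a minimal sketch of a pairwise stochastic update. The code below is an assumption-laden toy and not the published implementation: the function name `spe_embed`, the linear learning-rate schedule, the parameter defaults, and the 3-4-5 triangle used as input are all chosen here for illustration.

```python
import math
import random


def spe_embed(targets, n_points, dim=2, cycles=200, steps=100,
              lam_start=1.0, lam_end=0.01, eps=1e-9, seed=0):
    """Place n_points in `dim` dimensions so pair distances approach `targets`.

    targets maps (i, j) index pairs to the desired distance between them.
    """
    rng = random.Random(seed)
    coords = [[rng.random() for _ in range(dim)] for _ in range(n_points)]
    pairs = list(targets)
    for cycle in range(cycles):
        # Learning rate decays linearly from lam_start to lam_end (assumed schedule).
        lam = lam_start - (lam_start - lam_end) * cycle / (cycles - 1)
        for _ in range(steps):
            i, j = rng.choice(pairs)          # pick one constraint at random
            wanted = targets[(i, j)]
            actual = math.dist(coords[i], coords[j])
            # Each point absorbs half of the (scaled) distance error.
            scale = lam * 0.5 * (wanted - actual) / (actual + eps)
            for k in range(dim):
                delta = scale * (coords[i][k] - coords[j][k])
                coords[i][k] += delta
                coords[j][k] -= delta
    return coords


# Toy constraint set: a 3-4-5 right triangle.
targets = {(0, 1): 3.0, (1, 2): 4.0, (0, 2): 5.0}
coords = spe_embed(targets, n_points=3)
errors = {pair: abs(math.dist(coords[pair[0]], coords[pair[1]]) - d)
          for pair, d in targets.items()}
```

Each update touches only one pair of points, which is why the cost per step is independent of the total number of points; the trade-off is that many random steps are needed before all constraints are satisfied together.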

Source
http://dx.doi.org/10.1002/minf.201000134 (DOI)
November 2010

Functional neighbors: inferring relationships between nonhomologous protein families using family-specific packing motifs.

IEEE Trans Inf Technol Biomed 2010 Sep 21;14(5):1137-43. Epub 2010 Jun 21.

Department of Computational and Structural Chemistry, GlaxoSmithKline, Collegeville, PA UP12-210, USA.

We describe a new approach for inferring the functional relationships between nonhomologous protein families by looking at statistical enrichment of alternative function predictions in classification hierarchies such as Gene Ontology (GO) and Structural Classification of Proteins (SCOP). Protein structures are represented by robust graph representations, and the fast frequent subgraph mining algorithm is applied to protein families to generate sets of family-specific packing motifs, i.e., amino acid residue-packing patterns shared by most family members but infrequent in other proteins. The function of a protein is inferred by identifying in it motifs characteristic of a known family. We employ these family-specific motifs to elucidate functional relationships between families in the GO and SCOP hierarchies. Specifically, we postulate that two families are functionally related if one family is statistically enriched by motifs characteristic of another family, i.e., if the number of proteins in a family containing a motif from another family is greater than expected by chance. This function-inference method can help annotate proteins of unknown function, establish functional neighbors of existing families, and help specify alternate functions for known proteins.
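
The enrichment criterion in this abstract ("greater than expected by chance") can be made concrete with a tail probability. The abstract does not say which statistical test was used, so the hypergeometric model below is an assumption, as are the function name `hypergeom_enrichment_pvalue` and the example counts.

```python
from math import comb


def hypergeom_enrichment_pvalue(population, carriers, family_size, observed):
    """P(X >= observed) under sampling without replacement.

    population  : number of proteins in the background set
    carriers    : background proteins containing the motif of interest
    family_size : number of proteins in the family being tested
    observed    : family members found to contain the motif
    """
    upper = min(carriers, family_size)
    total = comb(population, family_size)
    tail = sum(
        comb(carriers, k) * comb(population - carriers, family_size - k)
        for k in range(observed, upper + 1)
    )
    return tail / total


# Hypothetical counts: 10 background proteins, 5 carry the motif,
# and all 5 members of the tested family carry it.
p_all = hypergeom_enrichment_pvalue(10, 5, 5, 5)
# Observing "at least zero" carriers is certain.
p_none = hypergeom_enrichment_pvalue(10, 5, 5, 0)
```

A small tail probability suggests the family contains the foreign motif more often than random sampling from the background would explain; any threshold for calling two families functionally related would be a further choice not specified here.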

Source
http://dx.doi.org/10.1109/TITB.2010.2053550 (DOI)
September 2010

Identification of family-specific residue packing motifs and their use for structure-based protein function prediction: II. Case studies and applications.

J Comput Aided Mol Des 2009 Nov 23;23(11):785-97. Epub 2009 Jun 23.

GlaxoSmithKline, Collegeville, PA, USA.

This paper describes several case studies concerning protein function inference from its structure using our novel approach described in the accompanying paper. This approach employs family-specific motifs, i.e. three-dimensional amino acid packing patterns that are statistically prevalent within a protein family. For our case studies we have selected families from the SCOP and EC classifications and analyzed the discriminating power of the motifs in depth. We have devised several benchmarks to compare motifs mined from unweighted topological graph representations of protein structures with those from distance-labeled (weighted) representations, demonstrating the superiority of the latter for function inference in most families. We have tested the robustness of our motif library by inferring the function of new members added to SCOP families, and discriminating between several families that are structurally similar but functionally divergent. Furthermore we have applied our method to predict function for several proteins characterized in structural genomics projects, including orphan structures, and we discuss several selected predictions in depth. Some of our predictions have been corroborated by other computational methods, and some have been validated by independent experimental studies, validating our approach for protein function inference from structure.

Source
http://dx.doi.org/10.1007/s10822-009-9277-0 (DOI)
November 2009

Identification of family-specific residue packing motifs and their use for structure-based protein function prediction: I. Method development.

J Comput Aided Mol Des 2009 Nov 20;23(11):773-84. Epub 2009 Jun 20.

GlaxoSmithKline, Collegeville, PA, USA.

Protein function prediction is one of the central problems in computational biology. We present a novel automated protein structure-based function prediction method using libraries of local residue packing patterns that are common to most proteins in a known functional family. Critical to this approach is the representation of a protein structure as a graph where residue vertices (residue name used as a vertex label) are connected by geometrical proximity edges. The approach employs two steps. First, it uses a fast subgraph mining algorithm to find all occurrences of family-specific labeled subgraphs for all well characterized protein structural and functional families. Second, it queries a new structure for occurrences of a set of motifs characteristic of a known family, using a graph index to speed up Ullmann's subgraph isomorphism algorithm. The confidence of function inference from structure depends on the number of family-specific motifs found in the query structure compared with their distribution in a large non-redundant database of proteins. This method can assign a new structure to a specific functional family in cases where sequence alignments, sequence patterns, structural superposition and active site templates fail to provide accurate annotation.

Source
http://dx.doi.org/10.1007/s10822-009-9273-4 (DOI)
November 2009

A self-organizing algorithm for molecular alignment and pharmacophore development.

J Comput Chem 2008 Apr;29(6):965-82

Johnson & Johnson Pharmaceutical Research and Development, L.L.C., 665 Stockton Drive, Exton, Pennsylvania 19341, USA.

We present a method for simultaneous three-dimensional (3D) structure generation and pharmacophore-based alignment using a self-organizing algorithm called Stochastic Proximity Embedding (SPE). Current flexible molecular alignment methods either start from a single low-energy structure for each molecule and tweak bonds or torsion angles, or choose from multiple conformations of each molecule. Methods that generate structures and align them iteratively (e.g., genetic algorithms) are often slow. In earlier work, we used SPE to generate good-quality 3D conformations by iteratively adjusting pairwise distances between atoms based on a set of geometric rules, and showed that it samples conformational space better and runs faster than earlier programs. In this work, we run SPE on the entire ensemble of molecules to be aligned. Additional information about which atoms or groups of atoms in each molecule correspond to points in the pharmacophore can come from an automatically generated hypothesis or be specified manually. We add distance terms to SPE to bring pharmacophore points from different molecules closer in space, and also to line up normal/direction vectors associated with these points. We also permit pharmacophore points to be constrained to lie near external coordinates from a binding site. The aligned 3D molecular structures are nearly correct if the pharmacophore hypothesis is chemically feasible; postprocessing by minimization of suitable distance and energy functions further improves the structures and weeds out infeasible hypotheses. The method can be used to test 3D pharmacophores for a diverse set of active ligands, starting from only a hypothesis about corresponding atoms or groups.

Source
http://dx.doi.org/10.1002/jcc.20854 (DOI)
April 2008

On the effects of permuted input on conformational sampling of drug-like molecules: an evaluation of stochastic proximity embedding.

Chem Biol Drug Des 2007 Aug;70(2):123-33

Johnson & Johnson Pharmaceutical Research & Development, L.L.C., 665 Stockton Drive, Exton, PA 19341, USA.

Conformational sampling is a problem of central importance in computer-aided drug design. A good conformational search method must not exhibit any intrinsic bias, and must provide confidence that important regions of conformational space are not missed during the search. A recent study by Carta et al. showed that this is not always the case, and that several popular conformational search methods, such as Omega, are very sensitive to the relative ordering of atoms and bonds in the connection table. Here, we examine the performance of a newer method known as stochastic proximity embedding, or SPE, using five diverse bioactive ligands extracted from the PDB. Our results confirm that the conformational ensembles produced by SPE using different permuted inputs are statistically indistinguishable, and well within the range of variability that would be expected from the stochastic nature of the method itself. This, along with the results of a more comprehensive comparative study (Agrafiotis et al., J. Chem. Info. Model, 2007, in press), provides further evidence that SPE is one of the most robust and competitive conformational search methods described to date.

Source
http://dx.doi.org/10.1111/j.1747-0285.2007.00538.x (DOI)
August 2007

Recent advances in chemoinformatics.

J Chem Inf Model 2007 Jul-Aug;47(4):1279-93. Epub 2007 May 19.

Johnson & Johnson Pharmaceutical Research and Development, L.L.C., Exton, Pennsylvania 19341, USA.

Chemoinformatics is a large scientific discipline that deals with the storage, organization, management, retrieval, analysis, dissemination, visualization, and use of chemical information. Chemoinformatics techniques are used extensively in drug discovery and development. Although many consider it a mature field, the advent of high-throughput experimental techniques and the need to analyze very large data sets have brought new life and challenges to it. Here, we review a selection of papers published in 2006 that caught our attention with regard to the novelty of the methodology that was presented. The field is seeing significant growth, which will be further catalyzed by the widespread availability of public databases to support the development and validation of new approaches.

Source
http://dx.doi.org/10.1021/ci700059g (DOI)
October 2007

Distance-based identification of structure motifs in proteins using constrained frequent subgraph mining.

Comput Syst Bioinformatics Conf 2006:227-38

Computer Science Department, University of North Carolina at Chapel Hill, NC, USA.

Structure motifs are amino acid packing patterns that occur frequently within a set of protein structures. We define a labeled graph representation of protein structure in which vertices correspond to amino acid residues and edges connect pairs of residues and are labeled by (1) the Euclidean distance between the C(alpha) atoms of the two residues and (2) a boolean indicating whether the two residues are in physical/chemical contact. Using this representation, a structure motif corresponds to a labeled clique that occurs frequently among the graphs representing the protein structures. The pairwise distance constraints on each edge in a clique serve to limit the variation in geometry among different occurrences of a structure motif. We present an efficient constrained subgraph mining algorithm to discover structure motifs in this setting. Compared with contact graph representations, the number of spurious structure motifs is greatly reduced. Using this algorithm, structure motifs were located for several SCOP families including the Eukaryotic Serine Proteases, Nuclear Binding Domains, Papain-like Cysteine Proteases, and FAD/NAD-linked Reductases. For each family, we typically obtain a handful of motifs within seconds of processing time. The occurrences of these motifs throughout the PDB were strongly associated with the original SCOP family, as measured using a hypergeometric distribution. The motifs were found to cover functionally important sites like the catalytic triad for Serine Proteases and co-factor binding sites for Nuclear Binding Domains. The fact that many motifs are highly family-specific can be used to classify new proteins or to provide functional annotation in Structural Genomics Projects.

Source
June 2007

Radial clustergrams: visualizing the aggregate properties of hierarchical clusters.

J Chem Inf Model 2007 Jan-Feb;47(1):69-75

Johnson & Johnson Pharmaceutical Research & Development, L.L.C., 665 Stockton Drive, Exton, Pennsylvania 19341, USA.

A new radial space-filling method for visualizing cluster hierarchies is presented. The method, referred to as a radial clustergram, arranges the clusters into a series of layers, each representing a different level of the tree. It uses adjacency of nodes instead of links to represent parent-child relationships and allocates sufficient screen real estate to each node to allow effective visualization of cluster properties through color-coding. Radial clustergrams combine the most appealing features of other cluster visualization techniques but avoid their pitfalls. Compared to classical dendrograms and hyperbolic trees, they make much more efficient use of space; compared to treemaps, they are more effective in conveying hierarchical structure and displaying properties of nodes higher in the tree. A fisheye lens is used to focus on areas of interest, without losing sight of the global context. The utility of the method is demonstrated using examples from the fields of molecular diversity and conformational analysis.
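
The layout idea in this abstract (one layer per tree level, with children placed adjacent to their parent rather than linked to it) can be sketched as an angular-sector assignment. The details below are assumptions made for illustration and are not taken from the paper: the root occupies the innermost layer, each node's angular span is proportional to the number of leaves beneath it, and the names `count_leaves` and `radial_layout` are hypothetical.

```python
import math


def count_leaves(node):
    """Number of leaf clusters under a node (a childless node counts as one)."""
    children = node.get("children", [])
    return 1 if not children else sum(count_leaves(c) for c in children)


def radial_layout(node, start=0.0, end=2 * math.pi, depth=0, out=None):
    """Assign each node a layer index and an angular sector [start, end)."""
    if out is None:
        out = []
    out.append({"name": node["name"], "layer": depth, "start": start, "end": end})
    total = count_leaves(node)
    angle = start
    for child in node.get("children", []):
        # Children share the parent's sector in proportion to their leaf counts,
        # so adjacency in the next layer encodes the parent-child relationship.
        span = (end - start) * count_leaves(child) / total
        radial_layout(child, angle, angle + span, depth + 1, out)
        angle += span
    return out


# Hypothetical three-leaf hierarchy.
tree = {"name": "root", "children": [
    {"name": "A", "children": [{"name": "a1"}, {"name": "a2"}]},
    {"name": "B"},
]}
sectors = {s["name"]: s for s in radial_layout(tree)}
```

Each sector could then be drawn as an annular wedge and color-coded by a cluster property; the rendering and the fisheye distortion mentioned in the abstract are outside the scope of this sketch.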

Source
http://dx.doi.org/10.1021/ci600427x (DOI)
May 2007

Structure-based function inference using protein family-specific fingerprints.

Protein Sci 2006 Jun;15(6):1537-43

Department of Computer Science, University of North Carolina at Chapel Hill, North Carolina 27599, USA.

We describe a method to assign a protein structure to a functional family using family-specific fingerprints. Fingerprints represent amino acid packing patterns that occur in most members of a family but are rare in the background, a nonredundant subset of PDB; their information is additional to sequence alignments, sequence patterns, structural superposition, and active-site templates. Fingerprints were derived for 120 families in SCOP using Frequent Subgraph Mining. For a new structure, all occurrences of these family-specific fingerprints may be found by a fast algorithm for subgraph isomorphism; the structure can then be assigned to a family with a confidence value derived from the number of fingerprints found and their distribution in background proteins. In validation experiments, we infer the function of new members added to SCOP families and we discriminate between structurally similar, but functionally divergent TIM barrel families. We then apply our method to predict function for several structural genomics proteins, including orphan structures. Some predictions have been corroborated by other computational methods and some validated by subsequent functional characterization.

Source
http://dx.doi.org/10.1110/ps.062189906 (DOI)
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2265098 (PMC)
June 2006

Comparing graph representations of protein structure for mining family-specific residue-based packing motifs.

J Comput Biol 2005 Jul-Aug;12(6):657-71

Department of Computer Science, University of North Carolina, 3239 Sitterson Hall, Chapel Hill, NC 27599, USA.

We find recurring amino-acid residue packing patterns, or spatial motifs, that are characteristic of protein structural families, by applying a novel frequent subgraph mining algorithm to graph representations of protein three-dimensional structure. Graph nodes represent amino acids, and edges are chosen in one of three ways: first, using a threshold for contact distance between residues; second, using Delaunay tessellation; and third, using the recently developed almost-Delaunay edges. For a set of graphs representing a protein family from the Structural Classification of Proteins (SCOP) database, subgraph mining typically identifies several hundred common subgraphs corresponding to spatial motifs that are frequently found in proteins in the family but rarely found outside of it. We find that some of the large motifs map onto known functional regions in two protein families explored in this study, i.e., serine proteases and kinases. We find that graphs based on almost-Delaunay edges significantly reduce the number of edges in the graph representation and hence present computational advantage, yet the patterns extracted from such graphs have a biological interpretation approximately equivalent to that of those extracted from distance based graphs.
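
Of the three edge definitions named in this abstract, the contact-distance threshold is the simplest to illustrate. The sketch below is a toy under stated assumptions: each residue is reduced to a single representative coordinate, the 8.0 Å cutoff is an arbitrary illustrative value rather than the one used in the paper, and the function name `contact_graph` and the coordinates are hypothetical. The Delaunay and almost-Delaunay variants are not shown.

```python
import math
from itertools import combinations


def contact_graph(residues, cutoff=8.0):
    """Build a residue graph: nodes labeled by residue name, edges by proximity.

    residues is a list of (name, (x, y, z)) tuples, one point per residue.
    """
    nodes = {idx: name for idx, (name, _) in enumerate(residues)}
    edges = set()
    for (i, (_, a)), (j, (_, b)) in combinations(enumerate(residues), 2):
        # Connect two residues when their representative points are close enough.
        if math.dist(a, b) <= cutoff:
            edges.add((i, j))
    return nodes, edges


# Hypothetical coordinates in angstroms.
residues = [
    ("SER", (0.0, 0.0, 0.0)),
    ("HIS", (5.0, 0.0, 0.0)),
    ("ASP", (20.0, 0.0, 0.0)),
]
nodes, edges = contact_graph(residues)
```

Graphs built this way for every member of a family would be the input to a frequent subgraph miner; sparser edge definitions reduce the number of edges the miner has to consider, which is the computational advantage the abstract attributes to almost-Delaunay edges.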

Source
http://dx.doi.org/10.1089/cmb.2005.12.657 (DOI)
October 2005