550 results match your criteria Journal of Cheminformatics [Journal]


Software solutions for evaluation and visualization of laser ablation inductively coupled plasma mass spectrometry imaging (LA-ICP-MSI) data: a short overview.

J Cheminform 2019 Feb 18;11(1):16. Epub 2019 Feb 18.

Department of Biochemistry and Biotechnology, Center for Research and Advanced Studies (CINVESTAV) Irapuato, Km. 9.6 Libramiento Norte Carr. Irapuato-León, 36824, Irapuato, Gto., Mexico.

Mass spectrometry imaging (MSI) using laser ablation (LA) inductively coupled plasma (ICP) is an innovative and exciting methodology for performing highly sensitive elemental analyses. LA-ICP-MSI of metals, trace elements or isotopes in tissues has been applied to a range of biological samples. Several LA-ICP-MSI studies have shown that metals have a highly compartmentalized distribution in some organs, which might be altered as a consequence of genetic diseases, intoxication, or malnutrition.

Source
DOI: http://dx.doi.org/10.1186/s13321-019-0338-7
February 2019

Identification of novel small molecule inhibitors for solute carrier SGLT1 using proteochemometric modeling.

J Cheminform 2019 Feb 14;11(1):15. Epub 2019 Feb 14.

Division of Drug Discovery & Safety, Leiden Academic Centre for Drug Research, Leiden University, Einsteinweg 55, 2333 CC, Leiden, The Netherlands.

Sodium-dependent glucose co-transporter 1 (SGLT1) is a solute carrier responsible for active glucose absorption. SGLT1 is present in both the renal tubules and small intestine. In contrast, the closely related sodium-dependent glucose co-transporter 2 (SGLT2), a protein that is targeted in the treatment of type II diabetes, is only expressed in the renal tubules.

Source
DOI: http://dx.doi.org/10.1186/s13321-019-0337-8
February 2019

Dimorphite-DL: an open-source program for enumerating the ionization states of drug-like small molecules.

J Cheminform 2019 Feb 14;11(1):14. Epub 2019 Feb 14.

Department of Biological Sciences, University of Pittsburgh, 4249 Fifth Avenue, Pittsburgh, PA, 15260, USA.

Small-molecule protonation can promote or discourage protein binding by altering hydrogen-bond, electrostatic, and van der Waals interactions. To improve virtual-screen pose and affinity predictions, researchers must account for all major small-molecule ionization states. But existing programs for calculating these states have notable limitations such as high cost, restrictive licenses, slow execution times, and poor modularity.

Source
DOI: http://dx.doi.org/10.1186/s13321-019-0336-9
February 2019

rBAN: retro-biosynthetic analysis of nonribosomal peptides.

J Cheminform 2019 Feb 8;11(1):13. Epub 2019 Feb 8.

Proteome Informatics Group, SIB Swiss Institute of Bioinformatics, CMU, Rue Michel-Servet 1, 1211, Geneva, Switzerland.

Proteinogenic and non-proteinogenic amino acids, fatty acids or glycans are some of the main building blocks of nonribosomal peptides (NRPs) and as such may give insight into the origin, biosynthesis and bioactivities of their constitutive peptides. Hence, the structural representation of NRPs using monomers provides a biologically interesting skeleton of these secondary metabolites. Databases dedicated to NRPs, such as Norine, already integrate monomer-based annotations in order to facilitate the development of structural analysis tools.

Source
DOI: http://dx.doi.org/10.1186/s13321-019-0335-x
February 2019

Programming languages in chemistry: a review of HTML5/JavaScript.

Authors:
Kevin J Theisen

J Cheminform 2019 Feb 5;11(1):11. Epub 2019 Feb 5.

iChemLabs, LLC., 7305 Hancock Village Dr #525, Chesterfield, VA, 23112, USA.

This is one part of a series of reviews concerning the application of programming languages in chemistry, edited by Dr. Rajarshi Guha. This article reviews the JavaScript technology as it applies to the chemistry discipline.

Source
Publisher Site: https://jcheminf.springeropen.com/articles/10.1186/s13321-01
DOI: http://dx.doi.org/10.1186/s13321-019-0331-1
February 2019

Implementing cheminformatics.

Authors:
Rajarshi Guha

J Cheminform 2019 Feb 5;11(1):12. Epub 2019 Feb 5.

Vertex Pharmaceuticals, 50 Northern Ave, Boston, MA, 02210, USA.

Source
Publisher Site: https://jcheminf.springeropen.com/articles/10.1186/s13321-01
DOI: http://dx.doi.org/10.1186/s13321-019-0333-z
February 2019

Chemoinformatics and structural bioinformatics in OCaml.

J Cheminform 2019 Feb 5;11(1):10. Epub 2019 Feb 5.

Department of Bioscience and Bioinformatics, Faculty of Computer Science and Systems Engineering, Kyushu Institute of Technology, Iizuka, Fukuoka, Japan.

Background: OCaml is a functional programming language with strong static types, Hindley-Milner type inference and garbage collection. In this article, we share our experience in prototyping chemoinformatics and structural bioinformatics software in OCaml.

Results: First, we introduce the language, list entry points for chemoinformaticians who would be interested in OCaml and give code examples.

Source
Publisher Site: https://jcheminf.springeropen.com/articles/10.1186/s13321-01
DOI: http://dx.doi.org/10.1186/s13321-019-0332-0
February 2019

Avoiding hERG-liability in drug design via synergetic combinations of different (Q)SAR methodologies and data sources: a case study in an industrial setting.

J Cheminform 2019 Feb 2;11(1). Epub 2019 Feb 2.

Merck KGaA, Darmstadt, Germany.

In this paper, we explore the impact of combining different in silico prediction approaches and data sources on the predictive performance of the resulting system. We use inhibition of the hERG ion channel target as the endpoint for this study as it constitutes a key safety concern in drug development and a potential cause of attrition. We will show that combining data sources can improve the relevance of the training set with regard to the target chemical space, leading to improved performance.

Source
Publisher Site: https://jcheminf.springeropen.com/articles/10.1186/s13321-01
DOI: http://dx.doi.org/10.1186/s13321-019-0334-y
February 2019

The nature of ligand efficiency.

Authors:
Peter W Kenny

J Cheminform 2019 Jan 31;11(1). Epub 2019 Jan 31.

Berwick-on-Sea, North Coast Road, Blanchisseuse, Saint George, Trinidad and Tobago.

Ligand efficiency is a widely used design parameter in drug discovery. It is calculated by scaling affinity by molecular size and has a nontrivial dependency on the concentration unit used to express affinity that stems from the inability of the logarithm function to take dimensioned arguments. Consequently, perception of efficiency varies with the choice of concentration unit, and it is argued that the ligand efficiency metric is not physically meaningful and should not be considered a metric.
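
The unit dependence described in this abstract can be illustrated with a small numerical sketch. It assumes the common form LE = -log10(Kd / C) / N_heavy, where C is the concentration unit used to make the logarithm's argument dimensionless. The function name, the two compounds, and their values are hypothetical and are not taken from the article.

```python
import math

def ligand_efficiency(kd_molar, heavy_atoms, unit_molar=1.0):
    """Return -log10(Kd / unit) divided by the heavy-atom count.

    Assumed illustrative definition; unit_molar is the concentration
    unit (in mol/L) that makes the logarithm's argument dimensionless.
    """
    return -math.log10(kd_molar / unit_molar) / heavy_atoms

# Hypothetical compounds: (Kd in mol/L, number of heavy atoms)
fragment = (1e-3, 10)  # weak binder, small molecule
lead = (1e-9, 30)      # potent binder, larger molecule

for unit in (1.0, 1e-3, 1e-6):  # 1 M, 1 mM, 1 uM
    le_fragment = ligand_efficiency(*fragment, unit_molar=unit)
    le_lead = ligand_efficiency(*lead, unit_molar=unit)
    print(f"unit={unit:g} M  fragment={le_fragment:.2f}  lead={le_lead:.2f}")
```

With a 1 M unit the two hypothetical compounds score about the same, whereas with 1 mM or 1 uM the larger compound scores higher, so the ranking depends on an arbitrary choice of unit.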

Source
Publisher Site: https://jcheminf.springeropen.com/articles/10.1186/s13321-01
DOI: http://dx.doi.org/10.1186/s13321-019-0330-2
January 2019

OGER++: hybrid multi-type entity recognition.

J Cheminform 2019 Jan 21;11(1). Epub 2019 Jan 21.

Institute of Computational Linguistics, University of Zurich, Andreasstr. 15, 8050, Zürich, Switzerland.

Background: We present a text-mining tool for recognizing biomedical entities in scientific literature. OGER++ is a hybrid system for named entity recognition and concept recognition (linking), which combines a dictionary-based annotator with a corpus-based disambiguation component. The annotator uses an efficient look-up strategy combined with a normalization method for matching spelling variants.

Source
DOI: http://dx.doi.org/10.1186/s13321-018-0326-3
January 2019

Universal nanohydrophobicity predictions using virtual nanoparticle library.

J Cheminform 2019 Jan 18;11(1). Epub 2019 Jan 18.

The Rutgers Center for Computational and Integrative Biology, Camden, NJ, 08102, USA.

To facilitate the development of new nanomaterials, especially nanomedicines, a novel computational approach was developed to precisely predict the hydrophobicity of gold nanoparticles (GNPs). The core of this study was to develop a large virtual gold nanoparticle (vGNP) library with computational nanostructure simulations. Based on the vGNP library, a nanohydrophobicity model was developed and then validated against externally synthesized and tested GNPs.

Source
Publisher Site: https://jcheminf.springeropen.com/articles/10.1186/s13321-01
DOI: http://dx.doi.org/10.1186/s13321-019-0329-8
January 2019

QBMG: quasi-biogenic molecule generator with deep recurrent neural network.

J Cheminform 2019 Jan 17;11(1). Epub 2019 Jan 17.

Research Center for Drug Discovery, School of Pharmaceutical Sciences, Sun Yat-Sen University, 132 East Circle at University City, Guangzhou, 510006, China.

Biogenic compounds are important materials for drug discovery and chemical biology. In this work, we report a quasi-biogenic molecule generator (QBMG) to compose virtual quasi-biogenic compound libraries by means of gated recurrent unit recurrent neural networks. The library includes stereochemical properties, which are crucial features of natural products.

Source
DOI: http://dx.doi.org/10.1186/s13321-019-0328-9
January 2019

Large scale comparison of QSAR and conformal prediction methods and their applications in drug discovery.

J Cheminform 2019 Jan 10;11(1). Epub 2019 Jan 10.

Chemogenomics Team, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK.

Structure-activity relationship modelling is frequently used in the early stage of drug discovery to assess the activity of a compound on one or several targets, and can also be used to assess the interaction of compounds with liability targets. QSAR models have been used for these and related applications over many years, with good success. Conformal prediction is a relatively new QSAR approach that provides information on the certainty of a prediction, and so helps in decision-making.

Source
Publisher Site: https://jcheminf.springeropen.com/articles/10.1186/s13321-01
DOI: http://dx.doi.org/10.1186/s13321-018-0325-4
January 2019

LSTMVoter: chemical named entity recognition using a conglomerate of sequence labeling tools.

J Cheminform 2019 Jan 10;11(1). Epub 2019 Jan 10.

Text Technology Lab, Goethe-University Frankfurt, Robert-Mayer-Straße 10, 60325, Frankfurt am Main, Germany.

Background: Chemical and biomedical named entity recognition (NER) is an essential preprocessing task in natural language processing. The identification and extraction of named entities from scientific articles is also attracting increasing interest in many scientific disciplines. Locating chemical named entities in the literature is an essential step in chemical text mining pipelines for identifying chemical mentions, their properties, and relations as discussed in the literature.

Source
Publisher Site: https://jcheminf.springeropen.com/articles/10.1186/s13321-01
DOI: http://dx.doi.org/10.1186/s13321-018-0327-2
January 2019

BioTransformer: a comprehensive computational tool for small molecule metabolism prediction and metabolite identification.

J Cheminform 2019 Jan 5;11(1). Epub 2019 Jan 5.

Department of Biological Sciences, University of Alberta, Edmonton, AB, T6G 2E9, Canada.

Background: A number of computational tools for metabolism prediction have been developed over the last 20 years to predict the structures of small molecules undergoing biological transformation or environmental degradation. These tools were largely developed to facilitate absorption, distribution, metabolism, excretion, and toxicity (ADMET) studies, although there is now a growing interest in using such tools to facilitate metabolomics and exposomics studies. However, their use and widespread adoption are still hampered by several factors, including their limited scope, breadth of coverage, availability, and performance.

Source
Publisher Site: https://jcheminf.springeropen.com/articles/10.1186/s13321-01
DOI: http://dx.doi.org/10.1186/s13321-018-0324-5
January 2019

A retrosynthetic analysis algorithm implementation.

J Cheminform 2019 Jan 3;11(1). Epub 2019 Jan 3.

Discovery Chemistry, Lilly Research Laboratories, Eli Lilly and Company, Indianapolis, IN, 46285, USA.

The need for synthetic route design arises frequently in discovery-oriented chemistry organizations. While finding solutions to this problem has traditionally been the domain of human experts, several computational approaches, aided by algorithmic advances and the availability of large reaction collections, have recently been reported. Herein we present our own implementation of a retrosynthetic analysis method and demonstrate its capabilities in an attempt to identify synthetic routes for a collection of approved drugs.

Source
Publisher Site: https://jcheminf.springeropen.com/articles/10.1186/s13321-01
DOI: http://dx.doi.org/10.1186/s13321-018-0323-6
January 2019

Configurable web-services for biomedical document annotation.

Authors:
Sérgio Matos

J Cheminform 2018 Dec 21;10(1):68. Epub 2018 Dec 21.

DETI/IEETA, University of Aveiro, Campus Universitário de Santiago, Aveiro, Portugal.

The need to efficiently find and extract information from the continuously growing biomedical literature has led to the development of various annotation tools aimed at identifying mentions of entities and relations. Many of these tools have been integrated in user-friendly applications facilitating their use by non-expert text miners and database curators. In this paper we describe the latest version of Neji, a web-services ready text processing and annotation framework.

Source
DOI: http://dx.doi.org/10.1186/s13321-018-0317-4
December 2018

A probabilistic molecular fingerprint for big data settings.

J Cheminform 2018 Dec 18;10(1):66. Epub 2018 Dec 18.

Department of Chemistry and Biochemistry, National Center for Competence in Research NCCR TransCure, University of Berne, Freiestrasse 3, 3012, Bern, Switzerland.

Background: Among the various molecular fingerprints available to describe small organic molecules, extended connectivity fingerprint, up to four bonds (ECFP4) performs best in benchmarking drug analog recovery studies as it encodes substructures with a high level of detail. Unfortunately, ECFP4 requires high dimensional representations (≥ 1024D) to perform well, so ECFP4 nearest neighbor searches in very large databases such as GDB, PubChem or ZINC perform very slowly due to the curse of dimensionality.

Results: Herein we report a new fingerprint, called MinHash fingerprint, up to six bonds (MHFP6), which encodes detailed substructures using the extended connectivity principle of ECFP in a fundamentally different manner, increasing the performance of exact nearest neighbor searches in benchmarking studies and enabling the application of locality sensitive hashing (LSH) approximate nearest neighbor search algorithms.
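
For readers unfamiliar with MinHash, the following is a minimal sketch of the general technique, not the MHFP6 implementation from the article. It assumes each molecule has already been reduced to a set of substructure strings; the tokens, function names, and the choice of 256 hash functions are hypothetical.

```python
import hashlib

def _hash(token, seed):
    """Deterministic 64-bit hash of a token under a given seed."""
    data = f"{seed}:{token}".encode("utf-8")
    return int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "big")

def minhash_signature(tokens, num_hashes=256):
    """For each seed, keep the minimum hash value over the (non-empty) token set."""
    return [min(_hash(t, seed) for t in tokens) for seed in range(num_hashes)]

def estimate_jaccard(sig_a, sig_b):
    """The fraction of agreeing signature positions estimates Jaccard similarity."""
    matches = sum(a == b for a, b in zip(sig_a, sig_b))
    return matches / len(sig_a)

# Hypothetical substructure tokens standing in for molecular fragments
mol_a = {"c1ccccc1", "C(=O)O", "CN", "CCO"}
mol_b = {"c1ccccc1", "C(=O)O", "CCl", "CCO"}

sig_a = minhash_signature(mol_a)
sig_b = minhash_signature(mol_b)
exact = len(mol_a & mol_b) / len(mol_a | mol_b)
print(f"exact Jaccard={exact:.2f}  MinHash estimate={estimate_jaccard(sig_a, sig_b):.2f}")
```

Because the signatures have a fixed length regardless of how many substructures a molecule contains, they can be bucketed by locality sensitive hashing, which is the property the abstract refers to.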

Source
DOI: http://dx.doi.org/10.1186/s13321-018-0321-8
December 2018

"We were here before the Web and hype…": a brief history of and tribute to the Computational Chemistry List.

J Cheminform 2018 Dec 18;10(1):67. Epub 2018 Dec 18.

Archives Poincaré - Philosophie et Recherches sur les Sciences et les Technologies, UMR 7117 CNRS & Université de Lorraine, Nancy, France.

The Computational Chemistry List is a mailing list, portal, and community which brings together people interested in computational chemistry, mostly practitioners. It was formed in 1991 and continues to exist as a vibrant discussion space, highly valued by its members, and serving both its original and new functions. Its duration has been unusual for online communities.

Source
DOI: http://dx.doi.org/10.1186/s13321-018-0322-7
December 2018

A neural network approach to chemical and gene/protein entity recognition in patents.

J Cheminform 2018 Dec 18;10(1):65. Epub 2018 Dec 18.

College of Computer Science and Technology, Dalian University of Technology, Dalian, China.

In biomedical research, patents contain a significant amount of information, and text mining of patents has recently received much attention. To accelerate the development of biomedical text mining for patents, the BioCreative V.5 challenge organized three tracks, i. …

Source
DOI: http://dx.doi.org/10.1186/s13321-018-0318-3
December 2018

Statistical principle-based approach for gene and protein related object recognition.

J Cheminform 2018 Dec 17;10(1):64. Epub 2018 Dec 17.

Intelligent Information Service Research Laboratory, Department of Computer Science and Information Engineering, National Central University, Taoyuan, Taiwan.

The large number of chemical and pharmaceutical patents has attracted researchers doing biomedical text mining to extract valuable information such as chemicals, genes and gene products. To facilitate gene and gene product annotations in patents, BioCreative V.5 organized a gene- and protein-related object (GPRO) recognition task, in which participants were assigned to identify GPRO mentions and determine whether they could be linked to their unique biological database records.

Source
Publisher Site: https://jcheminf.springeropen.com/articles/10.1186/s13321-01
DOI: http://dx.doi.org/10.1186/s13321-018-0314-7
December 2018

JPlogP: an improved logP predictor trained using predicted data.

J Cheminform 2018 Dec 14;10(1):61. Epub 2018 Dec 14.

Lhasa Limited, Granary Wharf House, 2 Canal Wharf, Leeds, LS11 5PS, UK.

The partition coefficient between octanol and water (logP) has been an important descriptor in QSAR predictions for many years and therefore the prediction of logP has been examined countless times. One of the best-performing approaches is to predict logP using multiple methods and average the results. We have used those averaged predictions to develop a training set that was able to distil the information present across the disparate logP methods into one single model.

Source
Publisher Site: https://jcheminf.springeropen.com/articles/10.1186/s13321-01
DOI: http://dx.doi.org/10.1186/s13321-018-0316-5
December 2018

SIA: a scalable interoperable annotation server for biomedical named entities.

J Cheminform 2018 Dec 14;10(1):63. Epub 2018 Dec 14.

DFKI Language Technology Lab, Alt-Moabit 91c, Berlin, Germany.

Recent years have seen strong growth in the biomedical sciences and a corresponding increase in publication volume. Extraction of specific information from these sources requires highly sophisticated text mining and information extraction tools. However, the integration of freely available tools into customized workflows is often cumbersome and difficult.

Source
Publisher Site: https://jcheminf.springeropen.com/articles/10.1186/s13321-01
DOI: http://dx.doi.org/10.1186/s13321-018-0319-2
December 2018

Chaos-embedded particle swarm optimization approach for protein-ligand docking and virtual screening.

J Cheminform 2018 Dec 14;10(1):62. Epub 2018 Dec 14.

Department of Computer and Information Science, University of Macau, Avenida da Universidade, Taipa, Macau, China.

Background: Protein-ligand docking programs are routinely used in structure-based drug design to find the optimal binding pose of a ligand in the protein's active site. These programs are also used to identify potential drug candidates by ranking large sets of compounds. As more accurate and efficient docking programs are always desirable, constant efforts focus on developing better docking algorithms or improving the scoring function.

Source
Publisher Site: https://jcheminf.springeropen.com/articles/10.1186/s13321-01
DOI: http://dx.doi.org/10.1186/s13321-018-0320-9
December 2018

Chemlistem: chemical named entity recognition using recurrent neural networks.

J Cheminform 2018 Dec 6;10(1):59. Epub 2018 Dec 6.

Data Science Group, Technology Department, The Royal Society of Chemistry, Cambridge, UK.

Chemical named entity recognition (NER) has traditionally been dominated by conditional random fields (CRF)-based approaches, but given the success of the artificial neural network techniques known as "deep learning" we decided to examine them as an alternative to CRFs. We present here several chemical named entity recognition systems. The first system translates the traditional CRF-based idioms into a deep learning framework, using rich per-token features and neural word embeddings, and producing a sequence of tags using bidirectional long short term memory (LSTM) networks, a type of recurrent neural net.

Source
Publisher Site: https://jcheminf.springeropen.com/articles/10.1186/s13321-01
DOI: http://dx.doi.org/10.1186/s13321-018-0313-8
December 2018

A new semi-automated workflow for chemical data retrieval and quality checking for modeling applications.

J Cheminform 2018 Dec 10;10(1):60. Epub 2018 Dec 10.

Laboratory of Environmental Chemistry and Toxicology, Department of Environmental Health Sciences, Istituto di Ricerche Farmacologiche Mario Negri IRCCS, Via la Masa 19, 20156, Milan, Italy.

The quality of data used for QSAR model derivation is extremely important as it strongly affects the final robustness and predictive power of the model. Ambiguous or wrong structures need to be carefully checked, because they lead to errors in the calculation of descriptors and hence to meaningless results. The increasing amounts of data, however, have often made it hard to check very large databases manually.

Source
DOI: http://dx.doi.org/10.1186/s13321-018-0315-6
December 2018

MER: a shell script and annotation server for minimal named entity recognition and linking.

J Cheminform 2018 Dec 5;10(1):58. Epub 2018 Dec 5.

LASIGE, Faculdade de Ciências, Universidade de Lisboa, 1749 016, Lisbon, Portugal.

Named-entity recognition aims at identifying the fragments of text that mention entities of interest, which can afterwards be linked to a knowledge base where those entities are described. This manuscript presents our minimal named-entity recognition and linking tool (MER), designed with flexibility, autonomy and efficiency in mind. To annotate a given text, MER only requires: (1) a lexicon (text file) with the list of terms representing the entities of interest; (2) optionally a tab-separated values file with a link for each term; and (3) a Unix shell.
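
The lexicon-plus-links idea in this abstract can be sketched in a few lines. The example below is written in Python purely for illustration and is not MER's shell implementation; the function name, the sample lexicon, and the example.org link are hypothetical.

```python
import re

def annotate(text, lexicon, links=None):
    """Return (start, end, matched_text, link) for every lexicon term found in text.

    lexicon: iterable of terms (as would be listed one per line in a text file).
    links: optional dict mapping a lower-cased term to a link (as would be
    read from a tab-separated values file).
    """
    links = links or {}
    hits = []
    for term in lexicon:
        # Whole-word, case-insensitive match of the literal term
        pattern = r"\b" + re.escape(term) + r"\b"
        for match in re.finditer(pattern, text, flags=re.IGNORECASE):
            hits.append((match.start(), match.end(), match.group(0), links.get(term.lower())))
    return sorted(hits, key=lambda hit: hit[:2])

# Hypothetical inputs
text = "Aspirin and caffeine were detected; aspirin dominated."
lexicon = ["aspirin", "caffeine"]
links = {"aspirin": "https://example.org/aspirin"}

for start, end, mention, link in annotate(text, lexicon, links):
    print(start, end, mention, link)
```

Terms without an entry in the links table are still recognized and are simply reported with no link, which mirrors the optional nature of the second input described in the abstract.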

Source
DOI: http://dx.doi.org/10.1186/s13321-018-0312-9
December 2018

chemmodlab: a cheminformatics modeling laboratory R package for fitting and assessing machine learning models.

J Cheminform 2018 Nov 28;10(1):57. Epub 2018 Nov 28.

Department of Statistics, North Carolina State University, 2311 Stinson Drive, Campus Box 8203, Raleigh, NC, 27695-8203, USA.

The goal of chemmodlab is to streamline the fitting and assessment pipeline for many machine learning models in R, making it easy for researchers to compare the utility of these models. While focused on implementing methods for model fitting and assessment that have been accepted by experts in the cheminformatics field, all of the methods in chemmodlab have broad utility for the machine learning community. chemmodlab contains several assessment utilities, including a plotting function that constructs accumulation curves and a function that computes many performance measures.

Source
DOI: http://dx.doi.org/10.1186/s13321-018-0309-4
November 2018

Statistical-based database fingerprint: chemical space dependent representation of compound databases.

J Cheminform 2018 Nov 22;10(1):55. Epub 2018 Nov 22.

Department of Pharmacy, School of Chemistry, Universidad Nacional Autónoma de México, Avenida Universidad 3000, 04510, Mexico City, Mexico.

Background: Simplified representation of compound databases has several applications in cheminformatics. Herein, we introduce an alternative and general method to build single fingerprint representations of compound databases. The approach is inspired by the previously published modal fingerprints, which aim to capture the most significant bits of a fingerprint representation for a compound data set.

Source
DOI: http://dx.doi.org/10.1186/s13321-018-0311-x
November 2018

Implicit-descriptor ligand-based virtual screening by means of collaborative filtering.

J Cheminform 2018 Nov 22;10(1):56. Epub 2018 Nov 22.

Department of Computer Science and Engineering, Bobby B. Lyle School of Engineering, Southern Methodist University, 3145 Dyer Street, Dallas, TX, 75205, USA.

Current ligand-based machine learning methods in virtual screening rely heavily on molecular fingerprinting for preprocessing, i.e., explicit description of ligands' structural and physicochemical properties in a vectorized form.

Source
DOI: http://dx.doi.org/10.1186/s13321-018-0310-y
November 2018

Improved understanding of aqueous solubility modeling through topological data analysis.

J Cheminform 2018 Nov 20;10(1):54. Epub 2018 Nov 20.

Mathematical Sciences, University of Southampton, Southampton, UK.

Topological data analysis (TDA) is a family of recent mathematical techniques seeking to understand the 'shape' of data, and has been used to understand the structure of the descriptor space produced by a standard chemical informatics software package from the point of view of solubility. We have used the mapper algorithm, a TDA method that creates low-dimensional representations of data, to create a network visualization of the solubility space. While descriptors with clear chemical implications are prominent features in this space, reflecting their importance to the chemical properties, an unexpected and interesting correlation between chlorine content and rings and their implication for solubility prediction is revealed.

Source
DOI: http://dx.doi.org/10.1186/s13321-018-0308-5
November 2018

Cheminformatics-based enumeration and analysis of large libraries of macrolide scaffolds.

J Cheminform 2018 Nov 12;10(1):53. Epub 2018 Nov 12.

Department of Chemistry, North Carolina State University, Raleigh, NC, USA.

We report on the development of a cheminformatics enumeration technology and the analysis of a resulting large dataset of virtual macrolide scaffolds. Although macrolides have been shown to have valuable biological properties, there is no ready-to-screen virtual library of diverse macrolides in the public domain. Conducting molecular modeling (especially virtual screening) of these complex molecules is highly relevant as the organic synthesis of these compounds, when feasible, typically requires many synthetic steps, and thus dramatically slows the discovery of new bioactive macrolides.

Source
Publisher Site: https://jcheminf.springeropen.com/articles/10.1186/s13321-01
DOI: http://dx.doi.org/10.1186/s13321-018-0307-6
November 2018

An automated framework for NMR chemical shift calculations of small organic molecules.

J Cheminform 2018 Oct 26;10(1):52. Epub 2018 Oct 26.

The Gene and Linda Voiland School of Chemical Engineering and Bioengineering, Washington State University, Pullman, WA, USA.

When using nuclear magnetic resonance (NMR) to assist in chemical identification in complex samples, researchers commonly rely on databases for chemical shift spectra. However, building libraries experimentally typically depends on authentic standards. Considering complex biological samples, such as blood and soil, the entirety of NMR spectra required for all possible compounds would be infeasible to ascertain due to limitations of available standards and experimental processing time.

Source
Publisher Site: https://jcheminf.springeropen.com/articles/10.1186/s13321-01
DOI: http://dx.doi.org/10.1186/s13321-018-0305-8
October 2018

Choquet integral-based fuzzy molecular characterizations: when global definitions are computed from the dependency among atom/bond contributions (LOVIs/LOEIs).

J Cheminform 2018 Oct 25;10(1):51. Epub 2018 Oct 25.

Grupo de Química Cuántica y Teórica, Facultad de Ciencias Exactas y Naturales, Programa de Química, Universidad de Cartagena, Campus de San Pablo, Cartagena, Colombia.

Background: Several topological (2D) and geometric (3D) molecular descriptors (MDs) are calculated from local vertex/edge invariants (LOVIs/LOEIs) by performing an aggregation process. To this end, norm-, mean- and statistic-based (non-fuzzy) operators are used, under the assumption that LOVIs/LOEIs are independent of (orthogonal to) one another. These operators are based on additive and/or linear measures and, consequently, they cannot be used to encode information from interrelated criteria.

Source
DOI: http://dx.doi.org/10.1186/s13321-018-0306-7
October 2018

A new chemoinformatics approach with improved strategies for effective predictions of potential drugs.

J Cheminform 2018 Oct 11;10(1):50. Epub 2018 Oct 11.

National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, 20894, USA.

Background: Fast and accurate identification of potential drug candidates against therapeutic targets (i.e., drug-target interactions, DTIs) is a fundamental step in the early drug discovery process.

Source
DOI: http://dx.doi.org/10.1186/s13321-018-0303-x
October 2018

Evaluating parameters for ligand-based modeling with random forest on sparse data sets.

J Cheminform 2018 Oct 11;10(1):49. Epub 2018 Oct 11.

Department of Pharmaceutical Biosciences, Uppsala University, Uppsala, Sweden.

Ligand-based predictive modeling is widely used to generate predictive models aiding decision making in e.g. drug discovery projects.

Source
DOI: http://dx.doi.org/10.1186/s13321-018-0304-9
October 2018

Life beyond the Tanimoto coefficient: similarity measures for interaction fingerprints.

J Cheminform 2018 Oct 4;10(1):48. Epub 2018 Oct 4.

Plasma Chemistry Research Group, Research Centre for Natural Sciences, Hungarian Academy of Sciences, Magyar tudósok krt. 2, Budapest, 1117, Hungary.

Background: Interaction fingerprints (IFP) have been repeatedly shown to be valuable tools in virtual screening to identify novel hit compounds that can subsequently be optimized to drug candidates. As a complementary method to ligand docking, IFPs can be applied to quantify the similarity of predicted binding poses to a reference binding pose. For this purpose, a large number of similarity metrics can be applied, and various parameters of the IFPs themselves can be customized.
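
As a small illustration of comparing a predicted pose to a reference pose, the sketch below computes the Tanimoto coefficient and, for contrast, the Dice coefficient on binary fingerprints stored as sets of on-bit indices. The fingerprints and function names are hypothetical, and returning 0.0 for two empty fingerprints is an assumed convention rather than something stated in the article.

```python
def tanimoto(a, b):
    """Tanimoto (Jaccard) similarity: shared on-bits over all on-bits."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0  # assumed convention for empty inputs

def dice(a, b):
    """Dice similarity: twice the shared on-bits over the summed on-bit counts."""
    total = len(a) + len(b)
    return 2 * len(a & b) / total if total else 0.0  # assumed convention for empty inputs

# Hypothetical interaction fingerprints: indices of bits that are switched on
reference_pose = {0, 3, 4, 7, 9}
docked_pose = {0, 3, 5, 7}

print(f"Tanimoto={tanimoto(reference_pose, docked_pose):.3f}")
print(f"Dice={dice(reference_pose, docked_pose):.3f}")
```

The two measures weight the shared bits differently, so a given pair of poses generally receives a different score under each, which is one reason the choice of metric is worth evaluating.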

DOI: http://dx.doi.org/10.1186/s13321-018-0302-y
October 2018

Exploring non-linear distance metrics in the structure-activity space: QSAR models for human estrogen receptor.

J Cheminform 2018 Sep 18;10(1):47. Epub 2018 Sep 18.

US EPA, 109 TW Alexander Drive, ORD, NCCT, Research Triangle Park, NC, 27711, USA.

Background: Quantitative structure-activity relationship (QSAR) models are important tools used in discovering new drug candidates and identifying potentially harmful environmental chemicals. These models often face two fundamental challenges: limited amount of available biological activity data and noise or uncertainty in the activity data themselves. To address these challenges, we introduce and explore a QSAR model based on custom distance metrics in the structure-activity space.

DOI: http://dx.doi.org/10.1186/s13321-018-0300-0
September 2018

Novel applications of Machine Learning in cheminformatics.

Authors:
Ola Spjuth

J Cheminform 2018 Sep 6;10(1):46. Epub 2018 Sep 6.

Department of Pharmaceutical Biosciences, Uppsala University, Box 591, 751 24, Uppsala, Sweden.

DOI: http://dx.doi.org/10.1186/s13321-018-0301-z
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6127077
September 2018

"MS-Ready" structures for non-targeted high-resolution mass spectrometry screening studies.

J Cheminform 2018 Aug 30;10(1):45. Epub 2018 Aug 30.

National Center for Computational Toxicology, Office of Research and Development, U.S. Environmental Protection Agency, Mail Drop D143-02, 109 T.W. Alexander Dr., Research Triangle Park, NC, 27711, USA.

Chemical database searching has become a fixture in many non-targeted identification workflows based on high-resolution mass spectrometry (HRMS). However, the form of a chemical structure observed in HRMS does not always match the form stored in a database (e.g.

DOI: http://dx.doi.org/10.1186/s13321-018-0299-2
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6117229
August 2018

The influence of solid state information and descriptor selection on statistical models of temperature dependent aqueous solubility.

J Cheminform 2018 Aug 29;10(1):44. Epub 2018 Aug 29.

School of Chemical and Process Engineering, University of Leeds, Leeds, LS2 9JT, UK.

Predicting the equilibrium solubility of organic, crystalline materials at all relevant temperatures is crucial to the digital design of manufacturing unit operations in the chemical industries. The work reported in our current publication builds upon the limited number of recently published quantitative structure-property relationship studies which modelled the temperature dependence of aqueous solubility. One set of models was built to directly predict temperature dependent solubility, including for materials with no solubility data at any temperature.

DOI: http://dx.doi.org/10.1186/s13321-018-0298-3
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6115327
August 2018

Machine learning for the prediction of molecular dipole moments obtained by density functional theory.

J Cheminform 2018 Aug 22;10(1):43. Epub 2018 Aug 22.

LAQV and REQUIMTE, Departamento de Química, Faculdade de Ciências e Tecnologia, Universidade Nova de Lisboa, 2829-516, Caparica, Portugal.

Machine learning (ML) algorithms were explored for the fast estimation of molecular dipole moments calculated by density functional theory (DFT) by B3LYP/6-31G(d,p) on the basis of molecular descriptors generated from DFT-optimized geometries and partial atomic charges obtained by empirical or ML schemes. A database was used with 10,071 structures, new molecular descriptors were designed and the models were validated with external test sets. Several ML algorithms were screened.

DOI: http://dx.doi.org/10.1186/s13321-018-0296-5
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6104469
August 2018

Probing the chemical-biological relationship space with the Drug Target Explorer.

J Cheminform 2018 Aug 20;10(1):41. Epub 2018 Aug 20.

Sage Bionetworks, 1100 Fairview Avenue N, Seattle, WA, 98109, USA.

Modern phenotypic high-throughput screens (HTS) present several challenges including identifying the target(s) that mediate the effect seen in the screen, characterizing 'hits' with a polypharmacologic target profile, and contextualizing screen data within the large space of drugs and screening models. To address these challenges, we developed the Drug-Target Explorer. This tool allows users to query molecules within a database of experimentally-derived and curated compound-target interactions to identify structurally similar molecules and their targets.

DOI: http://dx.doi.org/10.1186/s13321-018-0297-4
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6102167
August 2018

Ambit-SMIRKS: a software module for reaction representation, reaction search and structure transformation.

J Cheminform 2018 Aug 20;10(1):42. Epub 2018 Aug 20.

Ideaconsult Ltd, 4 A. Kanchev Str., 1000, Sofia, Bulgaria.

Ambit-SMIRKS is open-source software that enables structure transformation via the SMIRKS language and is implemented as an extension of Ambit-SMARTS. As part of the Ambit project, it builds on top of The Chemistry Development Kit (The CDK). Ambit-SMIRKS provides the following functionalities: parsing of SMIRKS linear notations into internal reaction (transformation) representations based on The CDK objects, application of the stored reactions against target (reactant) molecules for actual transformation of the target chemical objects, reaction searching, stereo information handling, product post-processing, etc.

DOI: http://dx.doi.org/10.1186/s13321-018-0295-6
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6102164
August 2018

Predictive classification models and targets identification for betulin derivatives as Leishmania donovani inhibitors.

J Cheminform 2018 Aug 17;10(1):40. Epub 2018 Aug 17.

Centre for Drug Research, Division of Pharmaceutical Biosciences, University of Helsinki, Viikinkaari 5E, P.O. Box 56, 00790, Helsinki, Finland.

Betulin derivatives have been proven effective in vitro against Leishmania donovani amastigotes, which cause visceral leishmaniasis. Identifying the molecular targets and molecular mechanisms underlying their action is currently an unmet challenge. In the present study, we tackle this problem using computational methods to establish properties essential for activity as well as to screen betulin derivatives against potential targets.

DOI: http://dx.doi.org/10.1186/s13321-018-0291-x
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6097978
August 2018

P2Rank: machine learning based tool for rapid and accurate prediction of ligand binding sites from protein structure.

J Cheminform 2018 Aug 14;10(1):39. Epub 2018 Aug 14.

Department of Software Engineering, Charles University, Prague, Czech Republic.

Background: Ligand binding site prediction from protein structure has many applications related to elucidation of protein function and structure-based drug discovery. It often represents only one step of many in complex computational drug design efforts. Although many methods have been published to date, only a few of them are suitable for use in automated pipelines or for processing large datasets.

DOI: http://dx.doi.org/10.1186/s13321-018-0285-8
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6091426
August 2018

Annotation and detection of drug effects in text for pharmacovigilance.

J Cheminform 2018 Aug 13;10(1):37. Epub 2018 Aug 13.

National Centre for Text Mining, School of Computer Science, Manchester Institute of Biotechnology, University of Manchester, 131 Princess Street, Manchester, M1 7DN, UK.

Pharmacovigilance (PV) databases record the benefits and risks of different drugs, as a means to ensure their safe and effective use. Creating and maintaining such resources can be complex, since a particular medication may have divergent effects in different individuals, due to specific patient characteristics and/or interactions with other drugs being administered. Textual information from various sources can provide important evidence to curators of PV databases about the usage and effects of drug targets in different medical subjects.

DOI: http://dx.doi.org/10.1186/s13321-018-0290-y
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6089860
August 2018

Chemotion-ELN part 2: adaption of an embedded Ketcher editor to advanced research applications.

J Cheminform 2018 Aug 13;10(1):38. Epub 2018 Aug 13.

Institute of Toxicology and Genetics, Karlsruhe Institute of Technology, Hermann-von-Helmholtz-Platz 1, 76344, Eggenstein-Leopoldshafen, Germany.

The Ketcher editor, available as an open-source software package for drawing chemical structures, has been expanded to include several features that allow storage, management and application of templates, as well as the use of symbols for the planning and processing of solid phase synthesis. In addition, tools for the drawing of coordinative bonds to represent e.g.

DOI: http://dx.doi.org/10.1186/s13321-018-0292-9
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6089857
August 2018

PubChem chemical structure standardization.

J Cheminform 2018 Aug 10;10(1):36. Epub 2018 Aug 10.

National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Department of Health and Human Services, 8600 Rockville Pike, Bethesda, MD, 20894, USA.

Background: PubChem is a chemical information repository, consisting of three primary databases: Substance, Compound, and BioAssay. When individual data contributors submit chemical substance descriptions to Substance, the unique chemical structures are extracted and stored into Compound through an automated process called structure standardization. The present study describes the PubChem standardization approaches and analyzes them for their success rates, reasons that cause structures to be rejected, and modifications applied to structures during the standardization process.

DOI: http://dx.doi.org/10.1186/s13321-018-0293-8
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6086778
August 2018

SPICES: a particle-based molecular structure line notation and support library for mesoscopic simulation.

J Cheminform 2018 Aug 9;10(1):35. Epub 2018 Aug 9.

Institute for Bioinformatics and Chemoinformatics, Westphalian University of Applied Sciences, August-Schmidt-Ring 10, 45665, Recklinghausen, Germany.

Simplified Particle Input ConnEction Specification (SPICES) is a particle-based molecular structure representation derived from straightforward simplifications of the atom-based SMILES line notation. It aims at supporting tedious and error-prone molecular structure definitions for particle-based mesoscopic simulation techniques like Dissipative Particle Dynamics by allowing for an interplay of different molecular encoding levels that range from topological line notations and corresponding particle-graph visualizations to 3D structures with support of their spatial mapping into a simulation box. An open Java library for SPICES structure handling and mesoscopic simulation support in combination with an open Java Graphical User Interface viewer application for visual topological inspection of SPICES definitions are provided.

DOI: http://dx.doi.org/10.1186/s13321-018-0294-7
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6085218
August 2018