Publications by authors named "Kenneth M Merz"

263 Publications

Formation of the Metal-Binding Core of the ZRT/IRT-like Protein (ZIP) Family Zinc Transporter.

Biochemistry 2021 Sep 29;60(36):2727-2738. Epub 2021 Aug 29.

Department of Chemistry, Michigan State University, East Lansing, Michigan 48824, United States.

Zinc homeostasis in mammals is constantly and precisely maintained by sophisticated regulatory proteins. Among them, the Zrt/Irt-like protein (ZIP) regulates the influx of zinc into the cytoplasm. In this work, we have employed all-atom molecular dynamics simulations to investigate the Zn²⁺ transport mechanism in prokaryotic ZIP obtained from Bordetella bronchiseptica (BbZIP) in a membrane bilayer. Additionally, the structural and dynamical transformations of BbZIP during this process have been analyzed. This study allowed us to develop a hypothesis for the zinc influx mechanism and formation of the metal-binding site. We have created a model for the outward-facing form of BbZIP (experimentally only the inward-facing form has been characterized) that has allowed us, for the first time, to observe the Zn²⁺ ion entering the channel and binding to the negatively charged M2 site. It is thought that the M2 site is less favored than the M1 site, which then leads to metal ion egress; however, we have not observed the M1 site being occupied in our simulations. Furthermore, removing both Zn²⁺ ions from this complex resulted in the collapse of the metal-binding site, illustrating the "structural role" of metal ions in maintaining the binding site and holding the proteins together. Finally, due to the long Cd²⁺-residue bond distances observed in the X-ray structures, we have proposed the existence of an H₃O⁺ ion at the M2 site that plays an important role in protein stability in the absence of the metal ion.
http://dx.doi.org/10.1021/acs.biochem.1c00415
September 2021

A critical overview of computational approaches employed for COVID-19 drug discovery.

Chem Soc Rev 2021 Aug 2;50(16):9121-9151. Epub 2021 Jul 2.

UNC Eshelman School of Pharmacy, University of North Carolina, Chapel Hill, NC, USA.

COVID-19 has resulted in huge numbers of infections and deaths worldwide and brought the most severe disruptions to societies and economies since the Great Depression. Massive experimental and computational research effort to understand and characterize the disease and rapidly develop diagnostics, vaccines, and drugs has emerged in response to this devastating pandemic, and more than 130 000 COVID-19-related research papers have been published in peer-reviewed journals or deposited in preprint servers. Much of the research effort has focused on the discovery of novel drug candidates or repurposing of existing drugs against COVID-19, and many such projects have been either exclusively computational or computer-aided experimental studies. Herein, we provide an expert overview of the key computational methods and their applications for the discovery of COVID-19 small-molecule therapeutics that have been reported in the research literature. We further outline that, after the first year of the COVID-19 pandemic, it appears that drug repurposing has not produced rapid and global solutions. However, several known drugs have been used in the clinic to cure COVID-19 patients, and a few repurposed drugs continue to be considered in clinical trials, along with several novel clinical candidates. We posit that truly impactful computational tools must deliver actionable, experimentally testable hypotheses enabling the discovery of novel drugs and drug combinations, and that open science and rapid sharing of research results are critical to accelerate the development of novel, much needed therapeutics for COVID-19.
http://dx.doi.org/10.1039/d0cs01065k
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8371861
August 2021

Harnessing the Power of Multi-GPU Acceleration into the Quantum Interaction Computational Kernel Program.

J Chem Theory Comput 2021 Jul 1;17(7):3955-3966. Epub 2021 Jun 1.

San Diego Supercomputer Center, University of California San Diego, 9500 Gilman Drive, La Jolla, California 92093-0505, United States.

We report a new multi-GPU capable Hartree-Fock/density functional theory implementation integrated into the open source QUantum Interaction Computational Kernel (QUICK) program. Details on the load balancing algorithms for electron repulsion integrals and exchange correlation quadrature across multiple GPUs are described. Benchmarking studies carried out on up to four GPU nodes, each containing four NVIDIA V100-SXM2 type GPUs, demonstrate that our implementation is capable of achieving excellent load balancing and high parallel efficiency. For representative medium to large size protein/organic molecular systems, the observed parallel efficiencies remained above 82% for the Kohn-Sham matrix formation and above 90% for nuclear gradient calculations. The accelerations on NVIDIA A100, P100, and K80 platforms have also realized parallel efficiencies higher than 68% in all tested cases, paving the way for large-scale electronic structure calculations with QUICK.
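
As a reading aid for the efficiency figures quoted in this abstract: parallel efficiency is conventionally the speedup over a single-device run divided by the number of devices. The sketch below is only an illustration of that definition. The function name and the timings are hypothetical and are not taken from the paper or from QUICK.

```python
def parallel_efficiency(t_single: float, t_multi: float, n_gpus: int) -> float:
    """Return parallel efficiency, i.e. (t_single / t_multi) / n_gpus."""
    if t_single <= 0 or t_multi <= 0 or n_gpus < 1:
        raise ValueError("timings must be positive and n_gpus must be >= 1")
    speedup = t_single / t_multi
    return speedup / n_gpus


# Hypothetical wall-clock times in seconds (illustrative only, not from the paper).
timings = {1: 100.0, 2: 52.0, 4: 28.0}
for n_gpus, seconds in timings.items():
    eff = parallel_efficiency(timings[1], seconds, n_gpus)
    print(f"{n_gpus} GPU(s): parallel efficiency = {eff:.2f}")
```

An efficiency of 1.0 corresponds to ideal linear scaling; values such as the 82% and 90% reported above would correspond to 0.82 and 0.90 on this scale.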
http://dx.doi.org/10.1021/acs.jctc.1c00145
July 2021

Quantum Chemistry Calculations for Metabolomics.

Chem Rev 2021 May 12;121(10):5633-5670. Epub 2021 May 12.

Biological Science Division, Pacific Northwest National Laboratory, Richland, Washington 99352, United States.

A primary goal of metabolomics studies is to fully characterize the small-molecule composition of complex biological and environmental samples. However, despite advances in analytical technologies over the past two decades, the majority of small molecules in complex samples are not readily identifiable due to the immense structural and chemical diversity present within the metabolome. Current gold-standard identification methods rely on reference libraries built using authentic chemical materials ("standards"), which are not available for most molecules. Computational quantum chemistry methods, which can be used to calculate chemical properties that are then measured by analytical platforms, offer an alternative route for building reference libraries, i.e., in silico libraries for "standards-free" identification. In this review, we cover the major roadblocks currently facing metabolomics and discuss applications where quantum chemistry calculations offer a solution. Several successful examples for nuclear magnetic resonance spectroscopy, ion mobility spectrometry, infrared spectroscopy, and mass spectrometry methods are reviewed. Finally, we consider current best practices and sources of error, and provide an outlook for quantum chemistry calculations in metabolomics studies. We expect this review will inspire researchers in the field of small-molecule identification to accelerate adoption of methods for generation of reference libraries and to add quantum chemistry calculations as another tool at their disposal to characterize complex samples.
http://dx.doi.org/10.1021/acs.chemrev.0c00901
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8161423
May 2021

Open-Source Multi-GPU-Accelerated QM/MM Simulations with AMBER and QUICK.

J Chem Inf Model 2021 05 29;61(5):2109-2115. Epub 2021 Apr 29.

San Diego Supercomputer Center, University of California San Diego, La Jolla, California 92093, United States.

The quantum mechanics/molecular mechanics (QM/MM) approach is an essential and well-established tool in computational chemistry that has been widely applied in a myriad of biomolecular problems in the literature. In this publication, we report the integration of the QUantum Interaction Computational Kernel (QUICK) program as an engine to perform electronic structure calculations in QM/MM simulations with AMBER. This integration is available through either a file-based interface (FBI) or an application programming interface (API). Since QUICK is an open-source GPU-accelerated code with multi-GPU parallelization, users can take advantage of "free of charge" GPU-acceleration in their QM/MM simulations. In this work, we discuss implementation details and give usage examples. We also investigate energy conservation in typical QM/MM simulations performed in the microcanonical ensemble. Finally, benchmark results for two representative systems in bulk water, the N-methylacetamide (NMA) molecule and the photoactive yellow protein (PYP), show the performance of QM/MM simulations with QUICK and AMBER using a varying number of CPU cores and GPUs. Our results highlight the acceleration obtained from a single or multiple GPUs; we observed speedups of up to 53× between a single GPU vs a single CPU core and of up to 2.6× when comparing four GPUs to a single GPU. Results also reveal speedups of up to 3.5× when the API is used instead of FBI.
http://dx.doi.org/10.1021/acs.jcim.1c00169
May 2021

Parametrization of Trivalent and Tetravalent Metal Ions for the OPC3, OPC, TIP3P-FB, and TIP4P-FB Water Models.

J Chem Theory Comput 2021 Apr 1;17(4):2342-2354. Epub 2021 Apr 1.

Department of Chemistry, Michigan State University, East Lansing, Michigan 48824, United States.

Commonly seen in rare-earth chemistry and materials science, highly charged metal ions play key roles in many chemical processes. Computer simulations have become an important tool for scientific research nowadays. Meaningful simulations require reliable parameters. In the present work, we parametrized 18 M(III) and 6 M(IV) metal ions for four new water models (OPC3, OPC, TIP3P-FB, TIP4P-FB) in conjunction with each of the 12-6 and 12-6-4 nonbonded models. Similar to what was observed previously, issues with the 12-6 model can be fixed by using the 12-6-4 model. Moreover, the four new water models showed comparable performance or considerable improvement over the previous water models (TIP3P, SPC/E, and TIP4P) in the same category (3-point or 4-point water models, respectively). Finally, we reported a study of a metalloprotein system demonstrating the capability of the 12-6-4 model to model metalloproteins. The reported parameters will facilitate accurate simulations of highly charged metal ions in aqueous solution.
http://dx.doi.org/10.1021/acs.jctc.0c01320
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8173366
April 2021

AutoGraph: Autonomous Graph-Based Clustering of Small-Molecule Conformations.

J Chem Inf Model 2021 04 29;61(4):1647-1656. Epub 2021 Mar 29.

Department of Chemistry, Michigan State University, 578 S. Shaw Lane, East Lansing, Michigan 48824, United States.

While accurately modeling the conformational ensemble is required for predicting properties of flexible molecules, the optimal method of obtaining the conformational ensemble appears as varied as their applications. Ensemble structures have been modeled by generation, refinement, and clustering of conformations with a sufficient number of samples. We present a conformational clustering algorithm intended to automate the conformational clustering step through the Louvain algorithm, which requires minimal hyperparameters and, importantly, no predefined number of clusters or threshold values. The conformational graphs produced by this method for O-succinyl-L-homoserine, oxidized nicotinamide adenine dinucleotide, and 200 representative metabolites each preserved the geometric/energetic correlation expected for points on the potential energy surface. Clustering based on these graphs provides partitions informed by the potential energy surface. Automating conformational clustering in a workflow with AutoGraph may mitigate human biases introduced by guess and check over hyperparameter selection while allowing flexibility to the result by not imposing predefined criteria other than optimizing the model's loss function. Associated codes are available at https://github.com/TanemuraKiyoto/AutoGraph.
http://dx.doi.org/10.1021/acs.jcim.0c01492
April 2021

Parameterization of Monovalent Ions for the OPC3, OPC, TIP3P-FB, and TIP4P-FB Water Models.

J Chem Inf Model 2021 02 4;61(2):869-880. Epub 2021 Feb 4.

Department of Chemistry, Michigan State University, East Lansing, Michigan 48824, United States.

Monovalent ions play significant roles in various biological and material systems. Recently, four new water models (OPC3, OPC, TIP3P-FB, and TIP4P-FB), with significantly improved descriptions of condensed phase water, have been developed. The pairwise interaction between the metal ion and water necessitates the development of ion parameters specifically for these water models. Herein, we parameterized the 12-6 and the 12-6-4 nonbonded models for 12 monovalent ions with the respective four new water models. These monovalent ions contain eight cations including alkali metal ions (Li⁺, Na⁺, K⁺, Rb⁺, Cs⁺), transition-metal ions (Cu⁺ and Ag⁺), and Tl⁺ from the boron family, along with four halide anions (F⁻, Cl⁻, Br⁻, I⁻). Our parameters were designed to reproduce the target hydration free energies (the 12-6 hydration free energy (HFE) set), the ion-oxygen distances (the 12-6 ion-oxygen distance (IOD) set), or both of them (the 12-6-4 set). The 12-6-4 parameter set provides highly accurate structural features, overcoming the limitations of the routinely used 12-6 nonbonded model for ions. Specifically, we note that the 12-6-4 parameter set is able to reproduce experimental hydration free energies within 1 kcal/mol and experimental ion-oxygen distances within 0.01 Å simultaneously. We further reproduced the experimentally determined activity derivatives for salt solutions, validating the ion parameters for simulations of ion pairs. The improved performance of the present water models over our previous parameter sets for the TIP3P, TIP4P, and SPC/E water models (Li, P. et al. J. Chem. Theory Comput. 2015, 11, 1645-1657) highlights the importance of the choice of water model in conjunction with the metal ion parameter set.
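
For readers unfamiliar with the two nonbonded models named in this abstract, the following sketch shows how they are commonly written: the 12-6 model is a Lennard-Jones form, and the 12-6-4 model adds an attractive r⁻⁴ term for the ion-induced dipole interaction. The electrostatic term is omitted here, and the function names and coefficient values are hypothetical placeholders, not parameters from the paper.

```python
def nonbonded_12_6(r: float, c12: float, c6: float) -> float:
    """Lennard-Jones 12-6 energy: C12/r^12 - C6/r^6 (electrostatics omitted)."""
    return c12 / r**12 - c6 / r**6


def nonbonded_12_6_4(r: float, c12: float, c6: float, c4: float) -> float:
    """12-6-4 energy: the 12-6 form plus an attractive -C4/r^4 term."""
    return nonbonded_12_6(r, c12, c6) - c4 / r**4


# Hypothetical coefficients in arbitrary units (illustrative only).
C12, C6, C4 = 1000.0, 10.0, 16.0
for r in (2.0, 2.5, 3.0):
    print(r, nonbonded_12_6(r, C12, C6), nonbonded_12_6_4(r, C12, C6, C4))
```

Setting `c4` to zero recovers the 12-6 energy. The extra adjustable term is one way to see why the abstract can describe a single 12-6-4 set matching both the hydration free energy and the ion-oxygen distance targets, whereas the 12-6 model is given separate HFE and IOD sets.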
http://dx.doi.org/10.1021/acs.jcim.0c01390
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8173365
February 2021

Generative Models for Molecular Design.

J Chem Inf Model 2020 12;60(12):5635-5636

Department of Mathematics, Michigan State University, Michigan, East Lansing 48824, United States.

http://dx.doi.org/10.1021/acs.jcim.0c01388
December 2020

What Makes a Paper Be Highly Cited? 60 Years of the Journal of Chemical Information and Modeling.

J Chem Inf Model 2020 12;60(12):5866-5867

Department of Chemistry and the Department of Biochemistry and Molecular Biology, Michigan State University, 578 S. Shaw Lane, East Lansing, Michigan 48824, United States.

http://dx.doi.org/10.1021/acs.jcim.0c01248
December 2020

ReaxFF/AMBER-A Framework for Hybrid Reactive/Nonreactive Force Field Molecular Dynamics Simulations.

J Chem Theory Comput 2020 Dec 3;16(12):7645-7654. Epub 2020 Nov 3.

Department of Computer Science and Engineering, Michigan State University, 428 S. Shaw Lane, East Lansing, Michigan 48824-1322, United States.

Combined quantum mechanical/molecular mechanical (QM/MM) models using semiempirical and ab initio methods have been extensively reported on over the past few decades. These methods have been shown to be capable of providing unique insights into a range of problems, but they are still limited to relatively short time scales, especially QM/MM models using ab initio methods. An intermediate approach between a QM based model and classical mechanics could help fill this time-scale gap and facilitate the study of a range of interesting problems. Reactive force fields represent the intermediate approach explored in this paper. A widely used reactive model is ReaxFF, which has largely been applied to materials science problems and is generally used as a stand-alone model (i.e., the full system is modeled using ReaxFF). We report a hybrid ReaxFF/AMBER molecular dynamics (MD) tool, which introduces ReaxFF capabilities to capture bond breaking and formation within the AMBER MD software package. This tool enables us to study local reactive events in large systems at a fraction of the computational costs of QM/MM models. We describe the implementation of ReaxFF/AMBER, validate this implementation using a benzene molecule solvated in water, and compare its performance against a range of similar approaches. To illustrate the predictive capabilities of ReaxFF/AMBER, we carried out a Claisen rearrangement study in aqueous solution. In a first for ReaxFF, we were able to use AMBER's potential of mean force (PMF) capabilities to perform a PMF study on this organic reaction. The ability to capture local reaction events in large systems using combined ReaxFF/AMBER opens up a range of problems that can be tackled using this model to address both chemical and biological processes.
http://dx.doi.org/10.1021/acs.jctc.0c00874
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8145783
December 2020

Parameterization of a Dioxygen Binding Metal Site Using the MCPB.py Program.

Methods Mol Biol 2021 ;2199:257-275

Department of Chemistry, Michigan State University, East Lansing, MI, USA.

The MCPB.py program greatly facilitates force field parameterization for metal sites in metalloproteins and organometallic compounds. Herein we present an example application of MCPB.py to the parameterization of the dioxygen binding metal site of peptidylglycine α-hydroxylating monooxygenase (PHM), which contains a copper ion. In this example, we also extend the functionality of MCPB.py to support molecular dynamics (MD) simulations in GROMACS through a Python script. Illustrative MD simulations were performed using GROMACS and the results were analyzed. Notes about the program are also provided in this chapter to assist MCPB.py users with metal site parameterizations.
http://dx.doi.org/10.1007/978-1-0716-0892-0_15
March 2021

Converging Interests: Chemoinformatics, History, and Bibliometrics.

J Chem Inf Model 2020 12 15;60(12):5870-5872. Epub 2020 Oct 15.

Department of Computer Science, University of Illinois Urbana-Champaign, Urbana, Illinois 61801, United States.

Modern scientometric techniques, applied at scale, can provide valuable information that complements qualitative investigation of the accumulation of knowledge in a field. We discuss a trio of articles from computational chemistry selected from an analysis of 181 million tri-cited articles.
http://dx.doi.org/10.1021/acs.jcim.0c01098
December 2020

Receptor-Ligand Binding Free Energies from a Consecutive Histograms Monte Carlo Sampling Method.

J Chem Theory Comput 2020 Oct 12;16(10):6645-6655. Epub 2020 Sep 12.

School of Chemistry, Chemical Engineering and Life Science, Wuhan University of Technology, 122 Luoshi Road, Wuhan 430070, PR China.

Obtaining accurate and converged free energy calculations for ligand binding to biomolecular systems requires validated force fields and extensive sampling of the energy landscape, which in turn requires exhaustive and effective conformational searching methods. Herein, we introduce the consecutive histograms Monte Carlo (CHMC) sampling protocol that generates receptor-ligand binding modes within a series of continuously distributed sampling units ranging from placement near the geometric center of the receptor's binding site to fully unbound states. This protocol employs independent energy-state sampling for calculating the ensemble energy within every predefined location along the receptor-ligand dissociation pathway, without the need to traverse the energy barriers as in molecular dynamics simulations during the dissociation procedure. We applied this method to a set of selected receptor targets with their corresponding ligands, providing detailed studies of molecular binding free energy predictions. The results show that the CHMC gives an excellent accounting of the free energy surfaces and binding free energies at a reasonable computational cost.
http://dx.doi.org/10.1021/acs.jctc.0c00457
October 2020

Evolution of Alchemical Free Energy Methods in Drug Discovery.

J Chem Inf Model 2020 11 6;60(11):5308-5318. Epub 2020 Sep 6.

Department of Chemistry and Department of Biochemistry and Molecular Biology, Michigan State University, 578 S. Shaw Lane, East Lansing, Michigan 48824, United States.

The goal of the present manuscript is to succinctly trace the key technological steps in the evolution of alchemical free energy methods (AFEMs) from a purely theoretical construct to a method that is now widely used in the biotechnological and pharmaceutical industries. More specifically, we focus on relative binding free energy (RBFE) computations, which are more routinely applied in computer-aided drug design (CADD) campaigns than the more computationally intensive absolute binding free energy (ABFE) computations. We have not been exhaustive in the development of our timeline but rather try to weave a story about how theoretical ideas were ultimately converted into contemporary free energy capabilities. Necessarily, this story-telling approach limits us from citing all work on AFEMs, and we apologize for this shortcoming. Readers interested in a broad delineation of all the work done in this area are directed to the many excellent reviews that are extant.
http://dx.doi.org/10.1021/acs.jcim.0c00547
November 2020

Refinement of pairwise potentials via logistic regression to score protein-protein interactions.

Proteins 2020 12 30;88(12):1559-1568. Epub 2020 Jul 30.

Department of Chemistry, Michigan State University, East Lansing, Michigan, USA.

Protein-protein interactions (PPIs) are ubiquitous and functionally of great importance in biological systems. Hence, the accurate prediction of PPIs by protein-protein docking and scoring tools is highly desirable in order to characterize their structure and biological function. Ab initio docking protocols are divided into two stages: sampling of docking poses to produce at least one near-native structure, followed by scoring to evaluate the vast number of candidate structures. Concurrent development in both sampling and scoring is crucial for the deployment of protein-protein docking software. In the present work, we apply a machine learning model on pairwise potentials to refine the detection of native protein quaternary structures among decoys. A decoy set was featurized using the Knowledge and Empirical Combined Scoring Algorithm 2 (KECSA2) pairwise potential. The highly unbalanced decoy set was then balanced using a comparison concept between native and decoy structures. The resultant comparison descriptors were used to train a logistic regression (LR) classifier. The LR model yielded the optimal performance for native detection among decoys compared with conventional scoring functions, while exhibiting lesser performance for the detection of low root mean square deviation decoy structures. Its deployment on an independent benchmark set confirms that the scoring function performs competitively relative to other scoring functions. The scripts used are available at https://github.com/TanemuraKiyoto/PPI-native-detection-via-LR.
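
The abstract does not spell out how the "comparison concept" balances the data. One plausible reading, offered here only as an assumption, is that each native-decoy pair yields two difference vectors with opposite signs and opposite labels, so the two classes are equal in size by construction. The function name and feature values below are hypothetical and are not taken from the linked repository.

```python
from typing import List, Sequence, Tuple


def comparison_descriptors(
    native: Sequence[float], decoys: Sequence[Sequence[float]]
) -> Tuple[List[List[float]], List[int]]:
    """Build balanced pairwise-difference features (assumed scheme).

    For every decoy, emit (native - decoy) with label 1 and
    (decoy - native) with label 0.
    """
    features: List[List[float]] = []
    labels: List[int] = []
    for decoy in decoys:
        diff = [n - d for n, d in zip(native, decoy)]
        features.append(diff)
        labels.append(1)
        features.append([-x for x in diff])
        labels.append(0)
    return features, labels


# Hypothetical two-feature descriptors for one native and two decoys.
X, y = comparison_descriptors([1.0, 2.0], [[0.5, 2.5], [1.5, 1.0]])
print(X)
print(y)
```

Under this scheme one native structure and any number of decoys produce exactly as many positive as negative examples, which is the kind of input a logistic regression classifier handles without class reweighting.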
http://dx.doi.org/10.1002/prot.25973
December 2020

MRP.py: A Parametrizer of Post-Translationally Modified Residues.

J Chem Inf Model 2020 10 18;60(10):4424-4428. Epub 2020 Aug 18.

Department of Chemistry and Biochemistry, Auburn University, Auburn, Alabama 36849-5312, United States.

MRP.py is a Python-based parametrization program for covalently modified amino acid residues for molecular dynamics simulations. Charge derivation is performed via an RESP charge fit, and force constants are obtained through rewriting of either protein or GAFF database parameters. This allows for the description of interfacial interactions between the modified residue and protein. MRP.py is capable of working with a variety of protein databases. MRP.py's highly general and systematic method of obtaining parameters allows the user to circumvent the process of parametrizing the modified residue-protein interface. Two examples, a covalently bound inhibitor and a covalent adduct consisting of modified residues, are provided in the Supporting Information.
http://dx.doi.org/10.1021/acs.jcim.0c00472
October 2020

Impact of the Special Issue on Women in Computational Chemistry.

J Chem Inf Model 2020 07 5;60(7):3328-3330. Epub 2020 Jul 5.

Department of Chemistry and the Department of Biochemistry and Molecular Biology, Michigan State University, 578 S. Shaw Lane, East Lansing, Michigan 48824, United States.

In this Viewpoint, we provide a commentary on the impact of the Special Issue on Women in Computational Chemistry published in May 2019 and the feedback we received.
http://dx.doi.org/10.1021/acs.jcim.0c00636
July 2020

Metabolite Structure Assignment Using In Silico NMR Techniques.

Anal Chem 2020 08 15;92(15):10412-10419. Epub 2020 Jul 15.

Department of Chemistry, Michigan State University, 578 S. Shaw Lane, East Lansing, Michigan 48824, United States.

A major challenge for metabolomic analysis is to obtain an unambiguous identification of the metabolites detected in a sample. Among metabolomics techniques, NMR spectroscopy is a sophisticated, powerful, and generally applicable spectroscopic tool that can be used to ascertain the correct structure of newly isolated biogenic molecules. However, accurate structure prediction using computational NMR techniques depends on how much of the relevant conformational space of a particular compound is considered. It is intrinsically challenging to calculate NMR chemical shifts using high-level DFT when the conformational space of a metabolite is extensive. In this work, we developed NMR chemical shift calculation protocols using a machine learning model in conjunction with standard DFT methods. The pipeline encompasses the following steps: (1) conformation generation using a force field (FF)-based method, (2) filtering the FF generated conformations using the ASE-ANI machine learning model, (3) clustering of the optimized conformations based on structural similarity to identify chemically unique conformations, (4) DFT structural optimization of the unique conformations, and (5) DFT NMR chemical shift calculation. This protocol can calculate the NMR chemical shifts of a set of molecules using any available combination of DFT theory, solvent model, and NMR-active nuclei, using both user-selected reference compounds and/or linear regression methods. Our protocol reduces the overall computational time by 2 orders of magnitude over methods that optimize the conformations using fully ab initio methods, while still producing good agreement with experimental observations. The complete protocol is designed in such a manner that makes the computation of chemical shifts tractable for a large number of conformationally flexible metabolites.
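
The abstract mentions converting computed NMR results to chemical shifts using reference compounds and/or linear regression. The sketch below illustrates both generic conversions in plain Python. It is not the authors' pipeline; the function names and all numerical values are hypothetical.

```python
from typing import Sequence, Tuple


def shift_from_reference(sigma: float, sigma_ref: float, delta_ref: float = 0.0) -> float:
    """Chemical shift from isotropic shieldings: delta = sigma_ref - sigma + delta_ref."""
    return sigma_ref - sigma + delta_ref


def fit_linear_scaling(
    shieldings: Sequence[float], experimental_shifts: Sequence[float]
) -> Tuple[float, float]:
    """Least-squares fit of experimental shift = slope * shielding + intercept."""
    n = len(shieldings)
    mean_x = sum(shieldings) / n
    mean_y = sum(experimental_shifts) / n
    sxx = sum((x - mean_x) ** 2 for x in shieldings)
    sxy = sum(
        (x - mean_x) * (y - mean_y)
        for x, y in zip(shieldings, experimental_shifts)
    )
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical computed shieldings and experimental shifts in ppm (illustrative only).
sigmas = [30.0, 29.0, 28.0]
shifts = [1.0, 2.0, 3.0]
slope, intercept = fit_linear_scaling(sigmas, shifts)
print(slope, intercept)
print(shift_from_reference(29.0, 31.0))
```

The reference-compound route needs one shielding calculation per reference at the same level of theory, while the regression route needs a set of molecules with known experimental shifts; the fitted slope is expected to be close to -1 when the computed shieldings track experiment well.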
http://dx.doi.org/10.1021/acs.analchem.0c00768
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8045457
August 2020

Pair Potentials as Machine Learning Features.

J Chem Theory Comput 2020 Aug 6;16(8):5385-5400. Epub 2020 Jul 6.

Department of Chemistry and the Department of Biochemistry and Molecular Biology, Michigan State University, 578 South Shaw Lane, East Lansing, Michigan 48824, United States.

Atom pairwise potential functions make up an essential part of many scoring functions for protein decoy detection. With the development of machine learning (ML) tools, there are multiple ways to combine potential functions to create novel ML models and methods. Potential function parameters can be easily extracted; however, it is usually hard to directly obtain the calculated atom pairwise energies from scoring functions. Amber, as one of the most popular suites of modeling programs, has an extensive history and library of force field potential functions. In this work, we directly used the force field parameters in ff94 and ff14SB from Amber and encoded them to calculate atom pairwise energies for different interactions. Two sets of structures (a single amino acid set and a dipeptide set) were used to evaluate the performance of our encoded Amber potentials. From the comparison results between energy terms obtained from our encoding and Amber, we find energy differences within ±0.06 kcal/mol for all tested structures. Previously we have shown that the Random Forest (RF) model can help to emphasize more important atom pairwise interactions and ignore insignificant ones [Pei, J.; Zheng, Z.; Merz, K. M. J. Chem. Inf. Model. 2019, 59, 1919-1929]. Here, as an example of combining ML methods with traditional potential functions, we followed the same workflow to combine the RF models with force field potential functions from Amber. To determine the performance of our RF models with force field potential functions, 224 different protein native-decoy systems were used as our training and testing sets. We find that the RF models with ff94 and ff14SB force field parameters outperformed all other scoring functions (RF models with KECSA2, RWplus, DFIRE, dDFIRE, and GOAP) considered in this work for native structure detection, and they performed similarly in detecting the best decoy. Through inclusion of best decoy to decoy comparisons in building our RF models, we were able to generate models that outperformed the score functions tested herein both on accuracy and best decoy detection, again showing the performance and flexibility of our RF models to tackle this problem. Finally, the importance of the RF algorithm and force field parameters was also tested, and the comparison results suggest that both the RF algorithm and force field potentials are important, with the ML scoring function achieving its best performance only by combining them together. All code and data used in this work are available at https://github.com/JunPei000/FFENCODER_for_Protein_Folding_Pose_Selection.
http://dx.doi.org/10.1021/acs.jctc.9b01246
August 2020