Publications by authors named "Eugene I Shakhnovich"

162 Publications

Metabolic response to point mutations reveals principles of modulation of in vivo enzyme activity and phenotype.

Mol Syst Biol 2021 06;17(6):e10200

Department of Chemistry and Chemical Biology, Harvard University, Cambridge, MA, USA.

The relationship between sequence variation and phenotype is poorly understood. Here, we use metabolomic analysis to elucidate the molecular mechanism underlying the filamentous phenotype of E. coli strains that carry destabilizing mutations in dihydrofolate reductase (DHFR). We find that partial loss of DHFR activity causes reversible filamentation despite SOS response indicative of DNA damage, in contrast to thymineless death (TLD) achieved by complete inhibition of DHFR activity by high concentrations of antibiotic trimethoprim. This phenotype is triggered by a disproportionate drop in intracellular dTTP, which could not be explained by drop in dTMP based on the Michaelis-Menten-like in vitro activity curve of thymidylate kinase (Tmk), a downstream enzyme that phosphorylates dTMP to dTDP. Instead, we show that a highly cooperative (Hill coefficient 2.5) in vivo activity of Tmk is the cause of suboptimal dTTP levels. dTMP supplementation rescues filamentation and restores in vivo Tmk kinetics to Michaelis-Menten. Overall, this study highlights the important role of cellular environment in sculpting enzymatic kinetics with system-level implications for bacterial phenotype.
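The contrast this abstract draws between Michaelis-Menten and cooperative (Hill) kinetics can be illustrated numerically. The following is only a minimal sketch: the Hill coefficient of 2.5 comes from the abstract, while `vmax`, `km`, `k_half`, and the substrate levels are arbitrary illustrative values, not measured Tmk parameters.

```python
def michaelis_menten(s, vmax=1.0, km=1.0):
    """Hyperbolic rate law: v = vmax * s / (km + s)."""
    return vmax * s / (km + s)


def hill(s, vmax=1.0, k_half=1.0, n=2.5):
    """Cooperative rate law: v = vmax * s^n / (k_half^n + s^n)."""
    return vmax * s**n / (k_half**n + s**n)


def fold_drop(rate_fn, s_high, s_low):
    """Fold decrease in rate when substrate falls from s_high to s_low."""
    return rate_fn(s_high) / rate_fn(s_low)


if __name__ == "__main__":
    # Hypothetical 4-fold drop in substrate (arbitrary units).
    s_high, s_low = 1.0, 0.25
    print("Michaelis-Menten fold drop:", fold_drop(michaelis_menten, s_high, s_low))
    print("Hill (n = 2.5) fold drop:  ", fold_drop(hill, s_high, s_low))
```

Under these assumed parameters, the same 4-fold drop in substrate lowers the hyperbolic rate 2.5-fold but the cooperative rate about 16.5-fold, which is the kind of disproportionate response the abstract attributes to in vivo Tmk activity.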

Source
DOI: http://dx.doi.org/10.15252/msb.202110200
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8236904
June 2021

Accelerating high-throughput virtual screening through molecular pool-based active learning.

Chem Sci 2021 Apr 29;12(22):7866-7881. Epub 2021 Apr 29.

Department of Chemical Engineering, MIT, Cambridge, MA, USA.

Structure-based virtual screening is an important tool in early-stage drug discovery that scores the interactions between a target protein and candidate ligands. As virtual libraries continue to grow (in excess of 10^8 molecules), so too do the resources necessary to conduct exhaustive virtual screening campaigns on these libraries. However, Bayesian optimization techniques, previously employed in other scientific discovery problems, can aid in their exploration: a surrogate structure-property relationship model trained on the predicted affinities of a subset of the library can be applied to the remaining library members, allowing the least promising compounds to be excluded from evaluation. In this study, we explore the application of these techniques to computational docking datasets and assess the impact of surrogate model architecture, acquisition function, and acquisition batch size on optimization performance. We observe significant reductions in computational costs; for example, using a directed-message passing neural network we can identify 94.8% or 89.3% of the top-50 000 ligands in a 100M member library after testing only 2.4% of candidate ligands using an upper confidence bound or greedy acquisition strategy, respectively. Such model-guided searches mitigate the increasing computational costs of screening increasingly large virtual libraries and can accelerate high-throughput virtual screening campaigns with applications beyond docking.
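The pool-based loop described in this abstract (train a surrogate on the scored subset, rank the rest with an acquisition function, score the next batch) can be sketched on toy data. This is a rough illustration under stated assumptions, not the authors' implementation: `true_score` is a hypothetical stand-in for a docking score, the surrogate is a 1-nearest-neighbor lookup rather than a message-passing neural network, and `beta` switches between greedy (`beta=0`) and an upper-confidence-bound-style acquisition (`beta>0`).

```python
import random


def true_score(x):
    """Hypothetical objective standing in for a docking score (higher is better)."""
    return -(x - 0.7) ** 2


def surrogate(x, labeled):
    """1-nearest-neighbor prediction; distance to the neighbor is a crude uncertainty."""
    nearest = min(labeled, key=lambda p: abs(p - x))
    return labeled[nearest], abs(nearest - x)


def active_search(pool, n_init=10, batch=10, rounds=5, beta=0.0, seed=0):
    """Score a random initial batch, then repeatedly score the top-ranked candidates."""
    rng = random.Random(seed)
    labeled = {x: true_score(x) for x in rng.sample(pool, n_init)}
    for _ in range(rounds):
        candidates = [x for x in pool if x not in labeled]

        def acquisition(x):
            mean, uncertainty = surrogate(x, labeled)
            return mean + beta * uncertainty  # beta = 0 is the greedy strategy

        chosen = sorted(candidates, key=acquisition, reverse=True)[:batch]
        for x in chosen:
            labeled[x] = true_score(x)  # the "expensive" evaluation
    return labeled


def top_k_recall(labeled, pool, k=50):
    """Fraction of the true top-k pool members that were actually scored."""
    top = set(sorted(pool, key=true_score, reverse=True)[:k])
    return len(top & set(labeled)) / k


if __name__ == "__main__":
    pool = [i / 999 for i in range(1000)]
    for beta in (0.0, 1.0):
        found = active_search(pool, beta=beta)
        print(f"beta={beta}: scored {len(found)} of {len(pool)}, "
              f"top-50 recall = {top_k_recall(found, pool):.2f}")
```

The point of the sketch is the structure of the loop: only a small fraction of the pool is ever scored, and the acquisition function decides which fraction.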

Source
DOI: http://dx.doi.org/10.1039/d0sc06805e
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8188596
April 2021

Avoidance of protein unfolding constrains protein stability in long-term evolution.

Biophys J 2021 06 29;120(12):2413-2424. Epub 2021 Apr 29.

Department of Chemistry and Chemical Biology, Harvard University, Cambridge, Massachusetts.

Every amino acid residue can influence a protein's overall stability, making stability highly susceptible to change throughout evolution. We consider the distribution of protein stabilities evolutionarily permittable under two previously reported protein fitness functions: flux dynamics and misfolding avoidance. We develop an evolutionary dynamics theory and find that it agrees better with an extensive protein stability data set for dihydrofolate reductase orthologs under the misfolding avoidance fitness function rather than the flux dynamics fitness function. Further investigation with ribonuclease H data demonstrates that not any misfolded state is avoided; rather, it is only the unfolded state. At the end, we discuss how our work pertains to the universal protein abundance-evolutionary rate correlation seen across organisms' proteomes. We derive a closed-form expression relating protein abundance to evolutionary rate that captures Escherichia coli, Saccharomyces cerevisiae, and Homo sapiens experimental trends without fitted parameters.

Source
DOI: http://dx.doi.org/10.1016/j.bpj.2021.03.042
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8390877
June 2021

Effect of Protein Structure on Evolution of Cotranslational Folding.

Biophys J 2020 09 12;119(6):1123-1134. Epub 2020 Aug 12.

Department of Chemistry & Chemical Biology, Harvard University, Cambridge, Massachusetts.

Cotranslational folding depends on the folding speed and stability of the nascent protein. It remains difficult, however, to predict which proteins cotranslationally fold. Here, we simulate evolution of model proteins to investigate how native structure influences evolution of cotranslational folding. We developed a model that connects protein folding during and after translation to cellular fitness. Model proteins evolved improved folding speed and stability, with proteins adopting one of two strategies for folding quickly. Low contact order proteins evolve to fold cotranslationally. Such proteins adopt native conformations early on during the translation process, with each subsequently translated residue establishing additional native contacts. On the other hand, high contact order proteins tend not to be stable in their native conformations until the full chain is nearly extruded. We also simulated evolution of slowly translating codons, finding that slower translation speeds at certain positions enhances cotranslational folding. Finally, we investigated real protein structures using a previously published data set that identified evolutionarily conserved rare codons in Escherichia coli genes and associated such codons with cotranslational folding intermediates. We found that protein substructures preceding conserved rare codons tend to have lower contact orders, in line with our finding that lower contact order proteins are more likely to fold cotranslationally. Our work shows how evolutionary selection pressure can cause proteins with local contact topologies to evolve cotranslational folding.
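The abstract relies on contact order without defining it. A minimal sketch of one widely used definition, relative contact order (the mean sequence separation of contacting residue pairs divided by chain length), is given below; whether the paper uses exactly this variant is an assumption, and the contact lists are made-up examples.

```python
def relative_contact_order(contacts, chain_length):
    """Mean sequence separation |i - j| over native contacts, divided by chain length.

    contacts: iterable of (i, j) residue-index pairs that are in contact.
    chain_length: number of residues in the protein.
    """
    contacts = list(contacts)
    if not contacts or chain_length <= 0:
        raise ValueError("need at least one contact and a positive chain length")
    total_separation = sum(abs(i - j) for i, j in contacts)
    return total_separation / (len(contacts) * chain_length)


if __name__ == "__main__":
    # Hypothetical 10-residue chains: one with local contacts, one with long-range contacts.
    local = [(1, 4), (2, 5), (6, 9)]
    nonlocal_contacts = [(1, 10), (2, 9), (3, 8)]
    print("local topology:     ", relative_contact_order(local, 10))
    print("long-range topology:", relative_contact_order(nonlocal_contacts, 10))
```

A chain dominated by local contacts scores low on this measure, which matches the abstract's picture of low contact order proteins forming native contacts as they emerge during translation.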

Source
DOI: http://dx.doi.org/10.1016/j.bpj.2020.06.037
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7499064
September 2020

Dynamic metastable long-living droplets formed by sticker-spacer proteins.

Elife 2020 06 2;9. Epub 2020 Jun 2.

Department of Chemistry and Chemical Biology, Harvard University, Cambridge, United States.

Multivalent biopolymers phase separate into membrane-less organelles (MLOs) which exhibit liquid-like behavior. Here, we explore formation of prototypical MLOs from multivalent proteins on various time and length scales and show that the kinetically arrested metastable multi-droplet state is a dynamic outcome of the interplay between two competing processes: a diffusion-limited encounter between proteins, and the exhaustion of available valencies within smaller clusters. Clusters with satisfied valencies cannot coalesce readily, resulting in metastable, long-living droplets. In the regime of dense clusters akin to phase-separation, we observe co-existing assemblies, in contrast to the single, large equilibrium-like cluster. A system-spanning network encompassing all multivalent proteins was only observed at high concentrations and large interaction valencies. In the regime favoring large clusters, we observe a slow-down in the dynamics of the condensed phase, potentially resulting in loss of function. Therefore, metastability could be a hallmark of dynamic functional droplets formed by sticker-spacer proteins.

Source
DOI: http://dx.doi.org/10.7554/eLife.56159
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7360371
June 2020

Effects of Single Mutations on Protein Stability Are Gaussian Distributed.

Biophys J 2020 06 1;118(12):2872-2878. Epub 2020 May 1.

Department of Chemistry & Chemical Biology, Harvard University, Cambridge, Massachusetts.

The distribution of protein stability effects is known to be well approximated by a Gaussian distribution from previous empirical fits. Starting from first-principles statistical mechanics, we more rigorously motivate this empirical observation by deriving per-residue-position protein stability effects to be Gaussian. Our derivation requires the number of amino acids to be large, which is satisfied by the standard set of 20 amino acids found in nature. No assumption is needed on the number of residues in close proximity in space, in contrast to previous applications of the central limit theorem to protein energetics. We support our derivation results with computational and experimental data on mutant protein stabilities across all types of protein residues.

Source
DOI: http://dx.doi.org/10.1016/j.bpj.2020.04.027
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7300273
June 2020

Common activation mechanism of class A GPCRs.

Elife 2019 12 19;8. Epub 2019 Dec 19.

iHuman Institute, ShanghaiTech University, Shanghai, China.

Class A G-protein-coupled receptors (GPCRs) influence virtually every aspect of human physiology. Understanding receptor activation mechanism is critical for discovering novel therapeutics since about one-third of all marketed drugs target members of this family. GPCR activation is an allosteric process that couples agonist binding to G-protein recruitment, with the hallmark outward movement of transmembrane helix 6 (TM6). However, what leads to TM6 movement and the key residue level changes of this movement remain less well understood. Here, we report a framework to quantify conformational changes. By analyzing the conformational changes in 234 structures from 45 class A GPCRs, we discovered a common GPCR activation pathway comprising of 34 residue pairs and 35 residues. The pathway unifies previous findings into a common activation mechanism and strings together the scattered key motifs such as CWxP, DRY, Na pocket, NPxxY and PIF, thereby directly linking the bottom of ligand-binding pocket with G-protein coupling region. Site-directed mutagenesis experiments support this proposition and reveal that rational mutations of residues in this pathway can be used to obtain receptors that are constitutively active or inactive. The common activation pathway provides the mechanistic interpretation of constitutively activating, inactivating and disease mutations. As a module responsible for activation, the common pathway allows for decoupling of the evolution of the ligand binding site and G-protein-binding region. Such an architecture might have facilitated GPCRs to emerge as a highly successful family of proteins for signal transduction in nature.

Source
DOI: http://dx.doi.org/10.7554/eLife.50279
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6954041
December 2019

Semi-rational design and molecular dynamics simulations study of the thermostability enhancement of cellobiose 2-epimerases.

Int J Biol Macromol 2020 Jul 13;154:1356-1365. Epub 2019 Nov 13.

State Key Laboratory of Food Science and Technology, Jiangnan University, Wuxi 214122, China; International Joint Laboratory on Food Safety, Jiangnan University, Wuxi 214122, China.

Directed evolution using random mutation in vast sequence space leads to the low probability of obtaining target proteins. Emerging engineering strategies with computational tools are developed for more trustable outcomes. We used some semi-rational design methods to modify an industrial enzyme, namely cellobiose 2-epimerase (CE). A mutant was selected for its better thermostability and isomerization activity. The tradeoffs between thermostability, epimerization activity and isomerization activity of the CE mutants were different. To investigate the computational prediction performance of protein stability upon point mutations, molecular dynamics (MD) simulation analyses were conducted. The root mean square deviation (RMSD) and hydrogen bond analyses reproduced the correct trends in stability changes of the wild-type and mutated CEs with relatively high accuracy (correlation coefficients r ~ 0.5-0.8). The simulation temperature and time are important factors that influence the prediction performance. Our result shows that thermostability predictors calculated from MD simulation do better in predicting the thermostability changes of the mutated enzymes than the predictors using static-state information of the enzymes.

Source
DOI: http://dx.doi.org/10.1016/j.ijbiomac.2019.11.015
July 2020

Adaptation to mutational inactivation of an essential gene converges to an accessible suboptimal fitness peak.

Elife 2019 10 1;8. Epub 2019 Oct 1.

Department of Chemistry and Chemical Biology, Harvard University, Cambridge, United States.

The mechanisms of adaptation to inactivation of essential genes remain unknown. Here we inactivate dihydrofolate reductase (DHFR) by introducing D27G,N,F chromosomal mutations in a key catalytic residue with subsequent adaptation by an automated serial transfer protocol. The partial reversal G27->C occurred in three evolutionary trajectories. Conversely, in one trajectory for D27G and in all trajectories for D27F,N strains adapted to grow at very low metabolic supplement (folAmix) concentrations but did not escape entirely from supplement auxotrophy. Major global shifts in metabolome and proteome occurred upon DHFR inactivation, which were partially reversed in adapted strains. Loss-of-function mutations in two genes, and , ensured adaptation to low folAmix by rerouting the 2-Deoxy-D-ribose-phosphate metabolism from glycolysis towards synthesis of dTMP. Multiple evolutionary pathways of adaptation converged to a suboptimal solution due to the high accessibility to loss-of-function mutations that block the path to the highest, yet least accessible, fitness peak.

Source
DOI: http://dx.doi.org/10.7554/eLife.50509
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6828540
October 2019

The Early Phase of β2m Aggregation: An Integrative Computational Study Framed on the D76N Mutant and the ΔN6 Variant.

Biomolecules 2019 08 14;9(8). Epub 2019 Aug 14.

BioISI-Biosystems & Integrative Sciences Institute and Departamento de Física, Faculdade de Ciências, Universidade de Lisboa, 1749-016 Lisboa, Portugal.

Human β2-microglobulin (β2m) protein is classically associated with dialysis-related amyloidosis (DRA). Recently, the single point mutant D76N was identified as the causative agent of a hereditary systemic amyloidosis affecting visceral organs. To get insight into the early stage of the β2m aggregation mechanism, we used molecular simulations to perform an in-depth comparative analysis of the dimerization phase of the D76N mutant and the ΔN6 variant, a cleaved form lacking the first six N-terminal residues, which is a major component of ex vivo amyloid plaques from DRA patients. We also provide first glimpses into the tetramerization phase of D76N at physiological pH. Results from extensive protein-protein docking simulations predict an essential role of the C- and N-terminal regions (both variants), as well as of the BC-loop (ΔN6 variant), DE-loop (both variants) and EF-loop (D76N mutant) in dimerization. The terminal regions are more relevant under acidic conditions while the BC-, DE- and EF-loops gain importance at physiological pH. Our results recapitulate experimental evidence according to which Tyr10 (A-strand), Phe30 and His31 (BC-loop), Trp60 and Phe62 (DE-loop) and Arg97 (C-terminus) act as dimerization hot-spots, and further predict the occurrence of novel residues with the ability to nucleate dimerization, namely Lys-75 (EF-loop) and Trp-95 (C-terminus). We propose that D76N tetramerization is mainly driven by the self-association of dimers via the N-terminus and DE-loop, and identify Arg3 (N-terminus), Tyr10, Phe56 (D-strand) and Trp60 as potential tetramerization hot-spots.

Source
DOI: http://dx.doi.org/10.3390/biom9080366
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6722664
August 2019

Simulation-guided enzyme discovery: A new microbial source of cellobiose 2-epimerase.

Int J Biol Macromol 2019 Oct 8;139:1002-1008. Epub 2019 Aug 8.

State Key Laboratory of Food Science and Technology, Jiangnan University, Wuxi 214122, China; International Joint Laboratory on Food Safety, Jiangnan University, Wuxi 214122, China.

Cellobiose 2-epimerase (CE) is a promising industrial enzyme that can be utilized in the dairy industry. More thermostable CEs from different microorganisms are still needed for a higher lactulose productivity. This study demonstrated the feasibility of using molecular dynamics (MD) simulation as a preliminary computational filter when screening for thermostable enzymes. Sequence information for eleven uncharacterized CEs was chosen for analysis by MD simulations. The CE from Dictyoglomus thermophilum (Dith-CE) was determined experimentally to be one of the most thermostable CEs, with the highest epimerization (160 ± 6.5 U mg⁻¹) and isomerization activities (3.52 ± 0.23 U mg⁻¹) among all the reported CEs. This enzyme shows the highest isomerization activity at 85 °C and pH 7.0. The kinetic parameters (kcat and Km) of isomerization activity of this CE are 3.98 ± 0.3 s⁻¹ and 235.2 ± 11.2 mM, respectively. These results suggest that Dith-CE is a promising lactulose-producing enzyme.

Source
DOI: http://dx.doi.org/10.1016/j.ijbiomac.2019.08.075
October 2019

Substrate inhibition imposes fitness penalty at high protein stability.

Proc Natl Acad Sci U S A 2019 06 16;116(23):11265-11274. Epub 2019 May 16.

Department of Chemistry and Chemical Biology, Harvard University, Cambridge, MA 02138.

Proteins are only moderately stable. It has long been debated whether this narrow range of stabilities is solely a result of neutral drift toward lower stability or whether purifying selection against excess stability, for which no experimental evidence has been found so far, is also at work. Here, we show that mutations outside the active site in the essential enzyme adenylate kinase (Adk) result in a stability-dependent increase in substrate inhibition by AMP, thereby impairing overall enzyme activity at high stability. Such inhibition caused substantial fitness defects not only in the presence of excess substrate but also under physiological conditions. In the latter case, substrate inhibition caused differential accumulation of AMP in the stationary phase for the inhibition-prone mutants. Furthermore, we show that changes in flux through Adk could accurately describe the variation in fitness effects. Taken together, these data suggest that selection against substrate inhibition and hence excess stability may be an important factor determining stability observed for modern-day Adk.

Source
DOI: http://dx.doi.org/10.1073/pnas.1821447116
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6561296
June 2019

Chimeric dihydrofolate reductases display properties of modularity and biophysical diversity.

Protein Sci 2019 07 30;28(7):1359-1367. Epub 2019 May 30.

Department of Chemistry and Chemical Biology, Harvard University, Cambridge, Massachusetts.

While reverse genetics and functional genomics have long affirmed the role of individual mutations in determining protein function, there have been fewer studies addressing how large-scale changes in protein sequences, such as in entire modular segments, influence protein function and evolution. Given how recombination can reassort protein sequences, these types of changes may play an underappreciated role in how novel protein functions evolve in nature. Such studies could aid our understanding of whether certain organismal phenotypes related to protein function, such as growth in the presence or absence of an antibiotic, are robust with respect to the identity of certain modular segments. In this study, we combine molecular genetics with biochemical and biophysical methods to gain a better understanding of protein modularity in dihydrofolate reductase (DHFR), an enzyme target of antibiotics also widely used as a model for protein evolution. We replace an integral α-helical segment of Escherichia coli DHFR with segments from a number of different organisms (many nonmicrobial) and examine how these chimeric enzymes affect organismal phenotypes (e.g., resistance to an antibiotic) as well as biophysical properties of the enzyme (e.g., thermostability). We find that organismal phenotypes and enzyme properties are highly sensitive to the identity of DHFR modules, and that this chimeric approach can create enzymes with diverse biophysical characteristics.

Source
DOI: http://dx.doi.org/10.1002/pro.3646
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7663999
July 2019

Mutation rate variability as a driving force in adaptive evolution.

Phys Rev E 2019 Feb;99(2-1):022424

Department of Chemistry and Chemical Biology, Harvard University, Cambridge, Massachusetts 02138, USA.

Mutation rate is a key determinant of the pace as well as outcome of evolution, and variability in this rate has been shown in different scenarios to play a key role in evolutionary adaptation and resistance evolution under stress caused by selective pressure. Here we investigate the dynamics of resistance fixation in a bacterial population with variable mutation rates, and we show that evolutionary outcomes are most sensitive to mutation rate variations when the population is subject to environmental and demographic conditions that suppress the evolutionary advantage of high-fitness subpopulations. By directly mapping a biophysical fitness function to the system-level dynamics of the population, we show that both low and very high, but not intermediate, levels of stress in the form of an antibiotic result in a disproportionate effect of hypermutation on resistance fixation. We demonstrate how this behavior is directly tied to the extent of genetic hitchhiking in the system, the propagation of high-mutation rate cells through association with high-fitness mutations. Our results indicate a substantial role for mutation rate flexibility in the evolution of antibiotic resistance under conditions that present a weak advantage over wildtype to resistant cells.

Source
DOI: http://dx.doi.org/10.1103/PhysRevE.99.022424
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6819004
February 2019

Dynamic disulfide exchange in a crystallin protein in the human eye lens promotes cataract-associated aggregation.

J Biol Chem 2018 11 21;293(46):17997-18009. Epub 2018 Sep 21.

From the Department of Chemistry and Chemical Biology, Harvard University, Cambridge, Massachusetts 02138.

Increased light scattering in the eye lens due to aggregation of the long-lived lens proteins, crystallins, is the cause of cataract disease. Several mutations in the gene encoding human γD-crystallin (HγD) cause misfolding and aggregation. Cataract-associated substitutions at Trp cause the protein to aggregate from a partially unfolded intermediate locked by an internal disulfide bridge, and proteomic evidence suggests a similar aggregation precursor is involved in age-onset cataract. Surprisingly, WT HγD can promote aggregation of the W42Q variant while itself remaining soluble. Here, a search for a biochemical mechanism for this interaction has revealed a previously unknown oxidoreductase activity in HγD. Using oxidation, mutational analysis, cysteine labeling, and MS, we have assigned this activity to a redox-active internal disulfide bond that is dynamically exchanged among HγD molecules. The W42Q variant acts as a disulfide sink, reducing oxidized WT and forming a distinct internal disulfide that kinetically traps the aggregation-prone intermediate. Our findings suggest a redox "hot potato" competition among WT and mutant or modified polypeptides wherein variants with the lowest kinetic stability are trapped in aggregation-prone intermediate states upon accepting disulfides from more stable variants. Such reactions may occur in other long-lived proteins that function in oxidizing environments. In these cases, aggregation may be forestalled by inhibiting disulfide flow toward mutant or damaged polypeptides.

Source
DOI: http://dx.doi.org/10.1074/jbc.RA118.004551
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6240864
November 2018

Growth tradeoffs produce complex microbial communities on a single limiting resource.

Nat Commun 2018 08 10;9(1):3214. Epub 2018 Aug 10.

Department of Chemistry and Chemical Biology, Harvard University, 12 Oxford Street, Cambridge, MA, 02138, USA.

The relationship between the dynamics of a community and its constituent pairwise interactions is a fundamental problem in ecology. Higher-order ecological effects beyond pairwise interactions may be key to complex ecosystems, but mechanisms to produce these effects remain poorly understood. Here we model microbial growth and competition to show that higher-order effects can arise from variation in multiple microbial growth traits, such as lag times and growth rates, on a single limiting resource with no other interactions. These effects produce a range of ecological phenomena: an unlimited number of strains can exhibit multistability and neutral coexistence, potentially with a single keystone strain; strains that coexist in pairs do not coexist all together; and a strain that wins all pairwise competitions can go extinct in a mixed competition. Since variation in multiple growth traits is ubiquitous in microbial populations, our results indicate these higher-order effects may also be widespread, especially in laboratory ecology and evolution experiments.
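The tradeoff between lag time and growth rate on a single limiting resource can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not the paper's model as published: each strain waits out a lag and then grows exponentially, all growth stops when the shared resource is exhausted, and every parameter value (`n0`, `rate`, `lag`, `yield_`, `resource`) is hypothetical.

```python
import math


def population(n0, rate, lag, t):
    """No growth during the lag phase, exponential growth afterwards."""
    return n0 * math.exp(rate * max(0.0, t - lag))


def saturation_time(strains, resource, t_max=100.0, tol=1e-9):
    """Bisect for the time at which total resource consumption equals the supply.

    strains: list of (n0, rate, lag, yield_) tuples; yield_ is cells per unit resource.
    Assumes the resource runs out before t_max (keep rate * t_max modest to avoid overflow).
    """
    def consumed(t):
        return sum((population(n0, r, lag, t) - n0) / y for n0, r, lag, y in strains)

    lo, hi = 0.0, t_max
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if consumed(mid) < resource:
            lo = mid
        else:
            hi = mid
    return hi


def final_frequencies(strains, resource):
    """Strain frequencies at the moment the resource is depleted."""
    t_sat = saturation_time(strains, resource)
    finals = [population(n0, r, lag, t_sat) for n0, r, lag, _ in strains]
    total = sum(finals)
    return [n / total for n in finals]


if __name__ == "__main__":
    fast_but_late = (1.0, 1.0, 2.0, 1.0)   # high growth rate, long lag
    slow_but_early = (1.0, 0.5, 0.0, 1.0)  # low growth rate, no lag
    for resource in (5.0, 1000.0):
        freqs = final_frequencies([fast_but_late, slow_but_early], resource)
        print(f"resource={resource}: frequencies = {freqs}")
```

With these assumed values the short-lag strain ends up in the majority when the resource is scarce, and the fast-growing strain does when it is abundant, so the winner of a competition depends on how long growth lasts, and that in turn depends on who else is consuming the resource.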

Source
DOI: http://dx.doi.org/10.1038/s41467-018-05703-6
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6086922
August 2018

Accurate Protein-Folding Transition-Path Statistics from a Simple Free-Energy Landscape.

J Phys Chem B 2018 12 22;122(49):11126-11136. Epub 2018 Aug 22.

Department of Chemistry and Chemical Biology, Harvard University, 12 Oxford Street, Cambridge, Massachusetts 02138, United States.

A central goal of protein-folding theory is to predict the stochastic dynamics of transition paths (the rare trajectories that transit between the folded and unfolded ensembles) using only thermodynamic information, such as a low-dimensional equilibrium free-energy landscape. However, commonly used one-dimensional landscapes typically fall short of this aim, because an empirical coordinate-dependent diffusion coefficient has to be fit to transition-path trajectory data in order to reproduce the transition-path dynamics. We show that an alternative, first-principles free-energy landscape predicts transition-path statistics that agree well with simulations and single-molecule experiments without requiring dynamical data as an input. This "topological configuration" model assumes that distinct, native-like substructures assemble on a time scale that is slower than native-contact formation but faster than the folding of the entire protein. Using only equilibrium simulation data to determine the free energies of these coarse-grained intermediate states, we predict a broad distribution of transition-path transit times that agrees well with the transition-path durations observed in simulations. We further show that both the distribution of finite-time displacements on a one-dimensional order parameter and the ensemble of transition-path trajectories generated by the model are consistent with the simulated transition paths. These results indicate that a landscape based on transient folding intermediates, which are often hidden by one-dimensional projections, can form the basis of a predictive model of protein-folding transition-path dynamics.

Source
DOI: http://dx.doi.org/10.1021/acs.jpcb.8b05842
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6386633
December 2018

Evolution on the Biophysical Fitness Landscape of an RNA Virus.

Mol Biol Evol 2018 10;35(10):2390-2400

Department of Chemistry and Chemical Biology, Harvard University, Cambridge, MA.

Viral evolutionary pathways are determined by the fitness landscape, which maps viral genotype to fitness. However, a quantitative description of the landscape and the evolutionary forces on it remain elusive. Here, we apply a biophysical fitness model based on capsid folding stability and antibody binding affinity to predict the evolutionary pathway of norovirus escaping a neutralizing antibody. The model is validated by experimental evolution in bulk culture and in a drop-based microfluidics that propagates millions of independent small viral subpopulations. We demonstrate that along the axis of binding affinity, selection for escape variants and drift due to random mutations have the same direction, an atypical case in evolution. However, along folding stability, selection and drift are opposing forces whose balance is tuned by viral population size. Our results demonstrate that predictable epistatic tradeoffs between molecular traits of viral proteins shape viral evolution.

Source
DOI: http://dx.doi.org/10.1093/molbev/msy131
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6188569
October 2018

Accessibility of the Shine-Dalgarno Sequence Dictates N-Terminal Codon Bias in E. coli.

Mol Cell 2018 06 7;70(5):894-905.e5. Epub 2018 Jun 7.

Department of Chemistry and Chemical Biology, Harvard University, 12 Oxford Street, Cambridge, MA, USA.

Despite considerable efforts, no physical mechanism has been shown to explain N-terminal codon bias in prokaryotic genomes. Using a systematic study of synonymous substitutions in two endogenous E. coli genes, we show that interactions between the coding region and the upstream Shine-Dalgarno (SD) sequence modulate the efficiency of translation initiation, affecting both intracellular mRNA and protein levels due to the inherent coupling of transcription and translation in E. coli. We further demonstrate that far-downstream mutations can also modulate mRNA levels by occluding the SD sequence through the formation of non-equilibrium secondary structures. By contrast, a non-endogenous RNA polymerase that decouples transcription and translation largely alleviates the effects of synonymous substitutions on mRNA levels. Finally, a complementary statistical analysis of the E. coli genome specifically implicates avoidance of intra-molecular base pairing with the SD sequence. Our results provide general physical insights into the coding-level features that optimize protein expression in prokaryotes.

Source
DOI: http://dx.doi.org/10.1016/j.molcel.2018.05.008
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6311106
June 2018

ProteomeVis: a web app for exploration of protein properties from structure to sequence evolution across organisms' proteomes.

Bioinformatics 2018 10;34(20):3557-3565

Department of Chemistry & Chemical Biology, Harvard University, Cambridge, MA, USA.

Motivation: Protein evolution spans time scales and its effects span the length of an organism. A web app named ProteomeVis is developed to provide a comprehensive view of protein evolution in the Saccharomyces cerevisiae and Escherichia coli proteomes. ProteomeVis interactively creates protein chain graphs, where edges between nodes represent structure and sequence similarities within user-defined ranges, to study the long time scale effects of protein structure evolution. The short time scale effects of protein sequence evolution are studied by sequence evolutionary rate (ER) correlation analyses with protein properties that span from the molecular to the organismal level.

Results: We demonstrate the utility and versatility of ProteomeVis by investigating the distribution of edges per node in organismal protein chain universe graphs (oPCUGs) and putative ER determinants. S. cerevisiae and E. coli oPCUGs are scale-free with scaling constants of 1.79 and 1.56, respectively. Both scaling constants can be explained by a previously reported theoretical model describing protein structure evolution. Protein abundance most strongly correlates with ER among properties in ProteomeVis, with Spearman correlations of -0.49 (P-value < 10^-10) and -0.46 (P-value < 10^-10) for S. cerevisiae and E. coli, respectively. This result is consistent with previous reports that found protein expression to be the most important ER determinant.

Availability And Implementation: ProteomeVis is freely accessible at http://proteomevis.chem.harvard.edu.

Supplementary Information: Supplementary data are available at Bioinformatics online.

Source
DOI: http://dx.doi.org/10.1093/bioinformatics/bty370
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6184454
October 2018

Trade-offs between microbial growth phases lead to frequency-dependent and non-transitive selection.

Proc Biol Sci 2018 02;285(1872)

Department of Chemistry and Chemical Biology, Harvard University, Cambridge, MA 02138, USA

Mutations in a microbial population can increase the frequency of a genotype not only by increasing its exponential growth rate, but also by decreasing its lag time or adjusting the yield (resource efficiency). The contribution of multiple life-history traits to selection is a critical question for evolutionary biology as we seek to predict the evolutionary fates of mutations. Here we use a model of microbial growth to show that there are two distinct components of selection corresponding to the growth and lag phases, while the yield modulates their relative importance. The model predicts rich population dynamics when there are trade-offs between phases: multiple strains can coexist or exhibit bistability due to frequency-dependent selection, and strains can engage in rock-paper-scissors interactions due to non-transitive selection. We characterize the environmental conditions and patterns of traits necessary to realize these phenomena, which we show to be readily accessible to experiments. Our results provide a theoretical framework for analysing high-throughput measurements of microbial growth traits, especially interpreting the pleiotropy and correlations between traits across mutants. This work also highlights the need for more comprehensive measurements of selection in simple microbial systems, where the concept of an ordinary fitness landscape breaks down.
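The interplay of lag time, growth rate, and yield described above can be illustrated with a toy calculation. The sketch below is not the authors' model: it assumes each strain stays at its initial size during its lag, then grows exponentially, and that all growth stops once the combined demand for a single shared resource (cells divided by yield) reaches the available amount. All function names and parameter values are hypothetical.

```python
import math

def population(n0, growth_rate, lag, t):
    """Cell count at time t: flat during the lag, exponential afterwards."""
    return n0 * math.exp(growth_rate * max(0.0, t - lag))

def saturation_time(strains, resources, tol=1e-10):
    """Time at which the strains' combined resource demand reaches `resources`.

    Each strain is a tuple (n0, growth_rate, lag, yield_per_resource).
    Assumes at least one strain has a positive growth rate.
    """
    def demand(t):
        return sum(population(n0, g, lag, t) / y for n0, g, lag, y in strains)

    if demand(0.0) >= resources:
        return 0.0
    lo, hi = 0.0, 1.0
    while demand(hi) < resources:   # expand the bracket until it contains the root
        hi *= 2.0
    while hi - lo > tol:            # plain bisection; demand is non-decreasing in t
        mid = 0.5 * (lo + hi)
        if demand(mid) < resources:
            lo = mid
        else:
            hi = mid
    return hi

def selection_coefficient(strain_a, strain_b, resources):
    """Change in log(N_a / N_b) over one growth cycle in a pairwise competition."""
    tc = saturation_time([strain_a, strain_b], resources)
    na = population(*strain_a[:3], tc)
    nb = population(*strain_b[:3], tc)
    return math.log(na / nb) - math.log(strain_a[0] / strain_b[0])

# Hypothetical trade-off: strain a has the shorter lag, strain b the faster growth.
strain_a = (1.0, 0.9, 1.0, 1.0)
strain_b = (1.0, 1.0, 2.0, 1.0)
s_scarce = selection_coefficient(strain_a, strain_b, resources=1e2)
s_rich = selection_coefficient(strain_a, strain_b, resources=1e9)
print(s_scarce, s_rich)
```

Under these assumptions the sign of the selection coefficient flips with resource abundance: when resources run out early the shorter lag wins, and when growth continues for longer the faster growth rate wins.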

Source
DOI: http://dx.doi.org/10.1098/rspb.2017.2459
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5829199
February 2018

Absence of Selection for Quantum Coherence in the Fenna-Matthews-Olson Complex: A Combined Evolutionary and Excitonic Study.

ACS Cent Sci 2017 Oct 30;3(10):1086-1095. Epub 2017 Aug 30.

Department of Chemistry and Chemical Biology, Harvard University, Cambridge, Massachusetts 02138, United States.

We present a study on the evolution of the Fenna-Matthews-Olson bacterial photosynthetic pigment-protein complex. This protein complex functions as an antenna. It transports absorbed photons (excitons) to a reaction center where photosynthetic reactions initiate. The efficiency of exciton transport is therefore fundamental for the photosynthetic bacterium's survival. We have reconstructed an ancestor of the complex to establish whether coherence in the exciton transport was selected for or optimized over time. We have also investigated the role of optimizing free energy variation upon folding in evolution. We studied whether mutations which connect the ancestor to current-day species were stabilizing or destabilizing from a thermodynamic viewpoint. From this study, we established that most of these mutations were thermodynamically neutral. Furthermore, we did not see a large change in exciton transport efficiency or coherence, and thus our results predict that exciton coherence was not specifically selected for.

Source
DOI: http://dx.doi.org/10.1021/acscentsci.7b00269
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5658757
October 2017

Evidence of evolutionary selection for cotranslational folding.

Proc Natl Acad Sci U S A 2017 10 10;114(43):11434-11439. Epub 2017 Oct 10.

Department of Chemistry and Chemical Biology, Harvard University, Cambridge, MA 02138

Recent experiments and simulations have demonstrated that proteins can fold on the ribosome. However, the extent and generality of fitness effects resulting from cotranslational folding remain open questions. Here we report a genome-wide analysis that uncovers evidence of evolutionary selection for cotranslational folding. We describe a robust statistical approach to identify loci within genes that are both significantly enriched in slowly translated codons and evolutionarily conserved. Surprisingly, we find that domain boundaries can explain only a small fraction of these conserved loci. Instead, we propose that regions enriched in slowly translated codons are associated with cotranslational folding intermediates, which may be smaller than a single domain. We show that the intermediates predicted by a native-centric model of cotranslational folding account for the majority of these loci across more than 500 proteins. By making a direct connection to protein folding, this analysis provides strong evidence that many synonymous substitutions have been selected to optimize translation rates at specific locations within genes. More generally, our results indicate that kinetics, and not just thermodynamics, can significantly alter the efficiency of self-assembly in a biological context.

Source
DOI: http://dx.doi.org/10.1073/pnas.1705772114
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5664504
October 2017

Optimization of lag phase shapes the evolution of a bacterial enzyme.

Nat Ecol Evol 2017 Apr 28;1(6):149. Epub 2017 Apr 28.

Department of Chemistry and Chemical Biology, Harvard University, 12 Oxford Street, Cambridge, Massachusetts 02138, USA.

Mutations provide the variation that drives evolution, yet their effects on fitness remain poorly understood. Here we explore how mutations in the essential enzyme adenylate kinase (Adk) of Escherichia coli affect multiple phases of population growth. We introduce a biophysical fitness landscape for these phases, showing how they depend on molecular and cellular properties of Adk. We find that Adk catalytic capacity in the cell (the product of activity and abundance) is the major determinant of mutational fitness effects. We show that bacterial lag times are at a well-defined optimum with respect to Adk's catalytic capacity, while exponential growth rates are only weakly affected by variation in Adk. Direct pairwise competitions between strains show how environmental conditions modulate the outcome of a competition where growth rates and lag times have a tradeoff, shedding light on the multidimensional nature of fitness and its importance in the evolutionary optimization of enzymes.

Source
DOI: http://dx.doi.org/10.1038/s41559-017-0149
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5640271
April 2017

A tale of two tails: The importance of unstructured termini in the aggregation pathway of β2-microglobulin.

Proteins 2017 Nov 8;85(11):2045-2057. Epub 2017 Aug 8.

BioISI - Biosystems & Integrative Sciences Institute, Faculdade de Ciências, Universidade de Lisboa, Lisboa, Portugal.

The identification of intermediate states for folding and aggregation is important from a fundamental standpoint and for the design of novel therapeutic strategies targeted at conformational disorders. Protein human β2-microglobulin (HB2m) is classically associated with dialysis-related amyloidosis, but the single point mutant D76N was recently identified as the causative agent of a hereditary systemic amyloidosis affecting visceral organs. Here, we use D76N as a model system to explore the early stage of the aggregation mechanism of HB2m by means of an integrative approach framed on molecular simulations. Discrete molecular dynamics simulations of a structure-based model predict the existence of two intermediate states populating the folding landscape. The intermediate I1 features an unstructured C-terminus, while I2, which is exclusively populated by the mutant, exhibits two unstructured termini. Docking simulations indicate that I2 is the key species for aggregation at acidic and physiological pH, contributing to rationalize the higher amyloidogenic potential of D76N relative to the wild-type protein and the ΔN6 variant. The analysis carried out here recapitulates the importance of the DE-loop in HB2m self-association at a neutral pH and predicts a leading role of the C-terminus and the adjacent G-strand in the dimerization process under acidic conditions. The identification of aggregation hot-spots is in line with experimental results that support the importance of Phe56, Asp59, Trp60, Phe62, Tyr63, and Tyr66 in HB2m amyloidogenesis. We further predict the involvement of new residues such as Lys94 and Trp95 in the aggregation process.

Source
DOI: http://dx.doi.org/10.1002/prot.25358
November 2017

Effect of sampling on BACE-1 ligands binding free energy predictions via MM-PBSA calculations.

J Comput Chem 2017 08 1;38(22):1941-1951. Epub 2017 Jun 1.

Department of Chemistry and Chemical Biology, Harvard University, Cambridge, Massachusetts, 02138.

The BACE-1 enzyme is a prime target to find a cure for Alzheimer's disease. In this article, we used the MM-PBSA approach to compute the binding free energies of 46 reported ligands to this enzyme. After showing that the most probable protonation state of the catalytic dyad is mono-protonated (on ASP32), we performed a thorough analysis of the parameters influencing the sampling of the conformational space (in total, more than 35 μs of simulations were performed). We show that ten simulations of 2 ns give better results than one of 50 ns. We also investigated the influence of the protein force field, the water model, the periodic boundary conditions artifacts (box size), as well as the ionic strength. Amber03 with TIP3P, a minimal distance of 1.0 nm between the protein and the box edges, and an ionic strength of I = 0.2 M provide the optimal correlation with experiments. Overall, when using these parameters, a Pearson correlation coefficient of R = 0.84 (R^2 = 0.71) is obtained for the 46 ligands, spanning eight orders of magnitude of K (from 0.017 nM to 2000 μM, i.e., from -14.7 to -3.7 kcal/mol), with a ligand size from 22 to 136 atoms (from 138 to 937 g/mol). After a two-parameter fit of the binding affinities for 12 of the ligands, an error of RMSD = 1.7 kcal/mol was obtained for the remaining ligands. © 2017 Wiley Periodicals, Inc.
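The affinity range quoted above can be checked against the standard relation ΔG = RT ln K. The sketch below is only an arithmetic illustration, not the MM-PBSA workflow of the paper. It assumes that K is a dissociation-type constant expressed in mol/L (with the lower bound read as 0.017 nanomolar), a 1 M standard state, and a temperature of 298.15 K, none of which the abstract states; the function name is hypothetical.

```python
import math

R_KCAL = 1.987204e-3  # molar gas constant in kcal/(mol*K)

def binding_free_energy(k_molar, temperature=298.15):
    """Standard binding free energy (kcal/mol) from a dissociation-type
    constant given in mol/L, using dG = R * T * ln(K)."""
    if k_molar <= 0:
        raise ValueError("the constant must be positive")
    return R_KCAL * temperature * math.log(k_molar)

# Assumed reading of the two ends of the reported range.
tightest = binding_free_energy(0.017e-9)   # 0.017 nM
weakest = binding_free_energy(2000e-6)     # 2000 uM
print(round(tightest, 1), round(weakest, 1))  # -14.7 -3.7
```

Under these assumptions the two bounds reproduce the -14.7 and -3.7 kcal/mol figures given in the abstract, which supports reading the concentration range as nanomolar to micromolar.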

Source
DOI: http://dx.doi.org/10.1002/jcc.24839
August 2017

The Role of Evolutionary Selection in the Dynamics of Protein Structure Evolution.

Biophys J 2017 Apr;112(7):1350-1365

Department of Chemistry and Chemical Biology, Harvard University, Cambridge, Massachusetts.

Homology modeling is a powerful tool for predicting a protein's structure. This approach is successful because proteins whose sequences are only 30% identical still adopt the same structure, while structure similarity rapidly deteriorates beyond the 30% threshold. By studying the divergence of protein structure as sequence evolves in real proteins and in evolutionary simulations, we show that this nonlinear sequence-structure relationship emerges as a result of selection for protein folding stability in divergent evolution. Fitness constraints prevent the emergence of unstable protein evolutionary intermediates, thereby enforcing evolutionary paths that preserve protein structure despite broad sequence divergence. However, on longer timescales, evolution is punctuated by rare events where the fitness barriers obstructing structure evolution are overcome and discovery of new structures occurs. We outline biophysical and evolutionary rationale for broad variation in protein family sizes, prevalence of compact structures among ancient proteins, and more rapid structure evolution of proteins with lower packing density.

Source
DOI: http://dx.doi.org/10.1016/j.bpj.2017.02.029
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5390048
April 2017

Graph's Topology and Free Energy of a Spin Model on the Graph.

Phys Rev Lett 2017 Feb 24;118(8):088302. Epub 2017 Feb 24.

Department of Chemistry and Chemical Biology, Harvard University, 12 Oxford Street, Cambridge, Massachusetts 02138, USA.

In this Letter we investigate a direct relationship between a graph's topology and the free energy of a spin system on the graph. We develop a method of separating topological and energetic contributions to the free energy, and find that considering the topology is sufficient to qualitatively compare the free energies of different graph systems at high temperature, even when the energetics are not fully known. This method was applied to the metal lattice system with defects, and we found that it partially explains why point defects are more stable than high-dimensional defects. Given the energetics, we can even quantitatively compare free energies of different graph structures via a closed form of linear graph contributions. The closed form is applied to predict the sequence-space free energy of lattice proteins, which is a key factor determining the designability of a protein structure.

Source
DOI: http://dx.doi.org/10.1103/PhysRevLett.118.088302
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5668130
February 2017

A Hybrid Knowledge-Based and Empirical Scoring Function for Protein-Ligand Interaction: SMoG2016.

J Chem Inf Model 2017 03 27;57(3):584-593. Epub 2017 Feb 27.

Department of Chemistry and Chemical Biology, Harvard University, Cambridge, Massachusetts 02138, United States.

We present the third generation of our scoring function for the prediction of protein-ligand binding free energy. This function is now a hybrid between a knowledge-based potential and an empirical function. We constructed a diversified set of ∼1000 complexes from the PDBBinding-CN database for the training of the function, and we show that this number of complexes generates enough data to build the potential. The occurrence of 420 different types of atomic pairwise interactions is computed in up to five different ranges of distances to derive the knowledge-based part. All of the parameters were optimized, and we were able to considerably improve the accuracy of the scoring function with a Pearson correlation coefficient against experimental binding free energies of up to 0.57, which ranks our new scoring function as one of the best currently available and the second-best in terms of standard deviation (SD = 1.68 kcal/mol). The function was then further improved by inclusion of different terms taking into account repulsion and loss of entropy upon binding, and we show that it is capable of recovering native binding poses up to 80% of the time. All of the programs, tools, and protein sets are released in the Supporting Information or as open-source programs.

Source
DOI: http://dx.doi.org/10.1021/acs.jcim.6b00610
March 2017