Enzyme Promiscuity in Enolase Superfamily. A Theoretical Study of o-Succinylbenzoate Synthase Using QM/MM Methods

The promiscuous activity of the enzyme o-Succinylbenzoate Synthase (OSBS) from the actinobacteria Amycolatopsis is investigated by means of QM/MM methods, using both Density Functional Theory and Semiempirical Hamiltonians. This enzyme catalyzes not only the dehydration of 2-succinyl-6R-hydroxy-2,4-cyclohexadiene-1R-carboxylate but also catalyzes racemization of different acylaminoacids, being N-succinyl-Rphenylglycine the best substrate. We investigated the molecular mechanisms for both reactions exploring the Potential Energy Surface. Then, Molecular Dynamics simulations were performed to obtain the free energy profiles and the averaged interaction energies of enzymatic residues with the reacting system. Our results confirm the plausibility of the reaction mechanisms proposed in the literature, with a good agreement between theoretical and experimentally-derived activation free energies. Our simulations unravel the role played by the different residues in each of the two possible reactions. The presence of flexible loops in the active site and the selection of structural modifications in the substrate seem to be key elements to promote the promiscuity of this enzyme.


Introduction
Many efforts have been devoted during the last years to improve the understanding of the mechanism of action of natural enzymes [1][2][3][4][5] and apply this knowledge to the development of new biocatalysts. 6,7 Scientists began to use evolutionary strategies (Directed Evolution) 8 to tailor the properties of individual molecules. Random mutations or recombination can, in many cases, be done efficiently, leading in this way to molecular evolution in the laboratory. On the other side, rational design can be used to guide the direct mutations of residues at specific positions of the enzyme. 9 Molecular simulations appear to be nowadays a promising option for computer-based design.
Nevertheless, the most active computationally designed enzymes usually perform modestly in comparison with the natural enzymes. [10][11][12][13] The development of new specific functions in an enzyme is something that Nature has already made during done through evolution, although the specific paths evolutionary routes followed to reach this goal remain, in general, raveled to be revealed. While closely related proteins can have different functions, some distantly related enzymes can have the same or similar function. Evolution could have taken advantage of enzymes presenting promiscuous activities. This promiscuity can be classified in three categories: substrate promiscuity (the enzyme accepts different substrates catalyzing the same reaction), catalytic promiscuity (the enzyme accepts different substrates catalyzing different reactions) and product promiscuity (the enzyme accepts a single substrate and catalyzes the formation of different products). [14][15][16][17][18] Promiscuity provides a raw starting point for the evolution of enzymes, as a new duplicated gene presenting low activity would be the germ for adaptive evolution. 15 This ability endows organisms with a selective advantage and thereby enables their survival and further evolution. Studying protein evolution from promiscuous enzymes offers the possibility to obtain valuable experiences for the design of new biocatalysts.
In this paper we focus on a promiscuous enzyme member of the enolase superfamily, a large group of enzymes that catalyzes different reactions distributed in 7 subgroups. 19 All enzymes belonging to this superfamily use a common mechanistic first step consisting in the abstraction of an alpha proton next to a carboxylate group, leading to the formation of an enediolate anion intermediate stabilized by the presence of a divalent cation in the active site (usually Mg 2+ ). The active site nestled on positioned in a cavity formed by the C-terminal ends of the β-strands of the barrel domain and the substrate lies sandwiched between the catalytic acid/base residues. 20 The Muconate lactonizing subgroup of the family is characterized by the presence of a conserved Lys-X-Lys motif at the end of the second β-strand and a lysine at the end of the sixth βstrand. These lysine residues are located at opposite faces of the active site, allowing proton transfer reactions to/from both faces of the substrate. The o-Succinylbenzoate Synthase (OSBS) enzyme from Amycolatopsis belongs to this subgroup and was first assigned as a racemase and designated as N-acylamino acid racemase (NAAAR).
However it was later discovered that this enzyme is a better catalyst for a specific syn dehydration reaction that happens during o-succinylbenzoate synthesis and was then designated as o-Succinylbenzoate Synthase (OSBS). 21 OSBS is thus an example of catalytic promiscuity of an enzyme with catalytic promiscuity. Other members of the enolase superfamily also present the ability to catalyze dehydration and racemization reactions either for the same (product promiscuity) or different substrates (catalytic promiscuity). 22,23 OSBS from Amycolatopsis catalyzes the two different reactions mentioned before (see Scheme I): i) the dehydration of 2-succinyl-6R-hydroxy-2,4-cyclohexadiene-1Rcarboxylate (SHCHC) to o-Succinylbenzoate (OSB) (OSBS reaction, see Scheme 1 a)) and ii) the N-acyloamino acids racemization (NAAAR reaction). The NAAAR reaction can be enhanced using substrates that closely resembles to SHCHC. One of the best substrates is N-succinyl-(R)-phenylglycine (NSRPG) that is converted into N-succinyl-(S)-phenylglycine (NSSPG) (see Scheme I b)). The first step in both reactions requires the abstraction of a proton next to a carboxylate group of the substrate carried out by the deprotonated Lys 163. It has been proposed that the second step in the OSBS reaction In this work we present a theoretical analysis of the reaction mechanisms for the OSBS/NAAAR enzyme from Amycolatopsis, using SHCHC and NSRPG as substrates.
The Potential Energy Surfaces of both reactions have been explored using Quantum

Preparation of the system
Theoretical studies were performed starting with the crystallized structure of the OSBS enzyme that appears in the Protein Data Bank with code 1SJB. 25 This structure includes the OSB substrate and the magnesium ion. According to experimental data, OSBS from Amycolatopsis assembles as an octamer. 25 Active sites are found inside each subunit, with residues of the closer subunit placed at 15 Å from the active site chosen to model the reaction (chain B). Our model consisted of two subunits, chains A and B and the distance between the active sites of these two subunits is about 30 Å. The crystal structure contains the product of the OSBS reaction (OSB, see Scheme I). A schematic representation of the main interactions established by OSB in the 1SJB structure is shown in Figure S1 as Supporting Information. The OSB molecule was transformed into the reactant (SHCHC) adding a hydrogen atom to the C  atom and a hydroxyl group to the C  atom. After the chemical transformation the internal geometry of the substrate was fully relaxed. The protonation state of titratable residues at pH 7 was determined using the program PropKa 3.1. [26][27][28][29] All the His residues were protonated because they are located at the surface of our model and are accessible to the solvent.
The pK a values predicted by PropKa for Lys 163 and 263 of chain B (the selected one to model the reaction in its active site) were 7.15 and 9.14 respectively. Theses pK a values depend on protein conformation and on the nature of the substrate. Results were similar for PDB structure 1SJC 25 that contains N-succinylmethionine in the active site; 6.47 and 9.00, respectively. However, the computed values were smaller (4.55 and 6.52 for Lys 163 and Lys 263, respectively) for structure 1SJA 25 (chain B) that contains a different substrate: N-succinylacetate. In any case, these values are consistent with a protonation state adequate for the forward reactions as depicted in Scheme I: Lys 163, which acts as a base in both reactions, is considered to be unprotonated at the reactant state, whereas Lys 263, that a acts as acid in the NAAAR reaction, is protonated. We did not use consider the 1SJD 25 structure because it lacks the magnesium ion and the substrate (NSRPG) is found in a non-productive orientation with respect the catalytic lysines.
Hydrogen atoms were added using the fDynamo program. 30,31 The system was solvated with a pre-equilibrated water box of 100 x 80 x 80 Å of size centered on the substrate.
Water molecules placed at less than 2.8 Å from any other non-hydrogen atom of the system were deleted. To neutralize the system, 5 sodium atoms were added. In order to reduce the computational time, a sphere of 25 Å centered on the substrate was defined and the atoms outside that sphere were kept fixed. A representation of the simulated system is provided as Supporting Information ( Figure S2). To study the reaction a hybrid Quantum Mechanics/Molecular Mechanics (QM/MM) Hamiltonian was employed. The quantum region (QM) contains the substrate and the side-chains of those residues directly involved in the reactions: Lys 163 and Lys 263. In addition, in order to model correctly the possible charge transfer between the negatively charged substrate and the metal center, 32 we also included in the QM subsystem the magnesium ion and the side-chains of the residues that are part of its coordination shell (Asp189, Glu 214, Asp 239 and a water molecule) (see Figure 1). Figure 1 displays the QM regions of the OSBS reaction on the left (a) and the QM region of the NAAAR reaction on the right (b). The QM subsystem was described at AM1 33 and M06-2X/6-31+G(d,p) 34,35 levels.
The M06 level is clearly superior to the semiempirical AM1 treatment but, because of the computational cost, the former was used only for the exploration of the Potential Energy Surface, while the latter was employed also for QM/MM Molecular Dynamics simulations. In any case our results show (see below) that the geometrical description obtained at the semiempirical level is in good agreement with the higher QM level and that single-point energy corrections can be applied to obtain a reasonable quantitative picture of the reaction energetics. For the classical region (MM), the OPLS-AA 36,37 and TIP3P 38 force fields, as implemented in fDynamo, 30,31 were used to describe the protein and the water molecules, respectively. The link atom approach 39 was used to saturate the valence of the QM/MM frontier. These link atoms were placed between C  and C  atoms of each residue in the QM subsystem (see Figure 1). Link atoms are depicted as •.
In order to prepare the system for the study of the NAAAR reaction, the SHCHC

Potential Energy Surface Exploration
The potential energy surfaces (PES) for both the OSBS and NAAAR reactions were

Free Energy Calculations
Potentials of Mean Force (PMFs) were obtained for the OSBS and NAAAR reactions at the AM1/MM level. These PMFs provide the free energy change along selected reaction coordinates. Antisymmetric combinations of the distances describing the breaking and forming bonds were employed as distinguished reaction coordinates (RCs). The selected distances depend on the reaction and the step studied (see below). The umbrella sampling approach 46 was used to constraint the system close to a particular value of the RC by means of the addition of a harmonic potential with a force constant of 2500 kJ·mol -1 ·Å -2 , which allowed good overlap between consecutive simulation windows.
The probability distributions obtained from MD simulations within each individual window were combined by means of the weighted histogram analysis method (WHAM) 47 to obtain the whole PMF. Each simulation window consisted in 10 ps of relaxation and 20 ps of production, with a time step of 0.5 fs. Simulations were carried out in the NVT ensemble with a reference temperature of 298 K and the Langevin-Verlet integrator. 40 In the OSBS reaction, at the first step, the PMF was performed exploring the reaction coordinate with a window width of 0.03 Å, and at the second step (a bidimensional PMF) with a window width of 0.05 Å (the total number of windows were 100 and 1600, respectively). At the first step of the NAAAR reaction, the reaction coordinate was then explored with a window width of 0.05 Å, and at the second step the reaction coordinate was explored with a window width of 0.03 Å. The total number of windows were 70 and 150, respectively.
The free energy barriers were computed at AM1/MM level and corrected at M06-2X/MM level for the two steps of both OSBS and NAAAR reactions. In particular, we obtained corrected free energy barriers by applying the following equation: The averaged energy differences appearing in the above expression have been obtained by means of single-point calculations of a set of ¿10? different reactant and TS structures localized using a micro-macro iteration approach 48-50 at the AM1/MM level.
These structures are optimized using a Hessian-based search for the QM atoms while the MM atoms are fully optimized at each step of this search using only gradients. [48][49][50] In equation 1, HL stands for the high level chosen to obtain the corrected estimation of the activation free energy. In our case M06-2X/6-31+G(d,p) 34,35 has been used. HL calculations are carried out under the effect of the field created by the MM environment, where the electrostatic coupling between the QM and MM subsystems is calculated using point charges on the QM atoms. [48][49][50]

Interaction Energies
In order to elucidate the role of the different residues in each of the two reactions catalyzed by this enzyme, we computed the averaged interaction energies of these residues with the reacting subsystem (the substrate and lysines 163 and 263, see Figure   1). The average was performed over 100 ps of MD simulations constrained at values of the reaction coordinate corresponding to reactant and transition states, as determined from the corresponding PMFs. The averaged interaction energy of a given residue j was obtained as: where N is the number of configurations analyzed (1000 in our case), the wave function ( Ψ ) corresponds to that of the polarized reacting subsystem and the interaction Hamiltonian with residue j is defined as: including the electrostatic and van der Waals terms. These calculations were carried out at the AM1/MM level.
We have also compared the averaged interaction energies obtained for each residue at the transition states of both reactions with those obtained for their corresponding reactants. The contribution of the interaction energy of each residue to the barrier can be calculated as the difference of the interaction established in the transition state and reactant state: where TS  and R  denote averages over the transition and reactant states of the reaction, respectively, and the subscripts (j) refers to the residue number.

Exploring the Reaction Mechanisms
Location and characterization of the stationary structures corresponding to a given reaction mechanism (reactants, products, transition structures and possible intermediates) allow determining its feasibility from the inspection of the energy barriers and the total energy change. Tracing the reaction paths along these structures provides detailed information about the molecular characteristics of the mechanism: the step-wise or concerted nature of the process and the degree of synchronicity between bond breaking and forming events. The use of QM/MM methods has facilitated the application of accurate QM methods to the exploration of reaction mechanisms in large systems, such as in enzymatic processes. We investigated the mechanistic proposal of Gerlt and coworkers (see Scheme I) 24 at the M06-2X/MM level. This method has been successfully used to determine the energetics associated to chemical reactions. 34 Table 1. Figure 3 shows the corresponding Potential Energy Profiles together with a representation of the chemical subsystem at the TSs. kcal· mol -1 ) or even slightly more stable than SHCHC (-2.5 kcal· mol -1 ).
The OSBS and NAAAR reactions differ in the second step, the transformation followed

Potentials of Mean Force Free Energy Profiles
For reactions involving a reduced small number of atoms, the characterization of a few stationary structures is usually enough to characterize the process and to calculate free energies that can be compared to experimental equilibrium and rate constants. These free energies are usually obtained by means of the application of the harmonic approximation for the displacements from the stationary structures. However, this approach does not hold for reactions involving many degrees of freedom (such as enzymatic reactions) where the PES is usually rough and many different configurations must be considered in order to obtain averaged values to be compared to experimental properties. The energy profiles presented above must be considered with caution because they represent the energy paths for a particular conformation of the system but,  Figure S6 of the Supporting Information, while the relative AM1/MM free energies and averaged distances are given in Table 3. Free energy differences corrected by means of single-point calculations at the M06-2X/MM level are also presented in Table 3. The free energy profiles obtained from these simulations roughly match the potential energy profiles presented in Figure 3, validating the mechanistic description given above. This is also confirmed by an analysis of the averaged distances presented in Table 3, which are in reasonable agreement with the optimized values presented in Table 1 and because it provides a 3-fold enhancement in the rate of the NAAAR reaction relative to Mg 2+ . 24 However, this difference in the rate constants translates into a small change in the activation free energy (less than 1 kcal·mol -1 ). Our free energy profiles also agree with the experimental observation that the values of the rate constant for the proton exchange exceed those for racemization 24 because our energy profiles show that TS2 of the NAAAR reaction presents a free energy slightly higher than for TS1. It should be taken into account that we have not considered quantum corrections to the classical activation free energy (zero point energies and tunneling) that typically contribute to decrease the calculated activation free energies in H-transfer processes. 54,55 With respect to the reaction energies, our results show that the OSBS process is clearly exergonic (the reaction free energy being -17.5 kcal·mol -1 , see Table 3) in agreement with the experimental observation. 20,51 Instead, the NAAAR reaction is almost thermoneutral (0.3 kcal·mol -1 ) This implies that the activation free energies and the catalytic rate constant will be very similar for the reverse reaction, the transformation of NSSPG into NSRPG, in agreement with the observation that this reaction occurs in both directions at nearly equal rates. 24

Analysis of interactions relevant for catalysis
Once we have presented the results of the QM/MM calculations and checked that these results reasonably agree with experimental observations, we can use our simulations to provide a detailed molecular picture of the processes and the basis for the catalytic promiscuity of the OSBS/NAAAR enzyme from Amycolatopsis. As explained in the Methodology section, the average interaction energy of the different residues with the reacting subsystem (the substrate and lysines 163 and 263) was computed and the results of these interaction energies for the most relevant residues and their standard deviations are provided as Supporting Information (Tables S2 and S3).
We first analyzed the role of each residue in the formation of the corresponding Michaelis complexes for the OSBS and NAAAR reactions. Figure 4  only changes needed to evolve the latter activity. 57 As discussed below other residues contribute to accelerate the racemase activity in Amycolatopsis OSBS/NAAAR. We can analyze the contribution of the interaction energy of each residue to the barrier (calculated using Eq. 4). A negative value means that a particular residue stabilizes the TS more than the reactant state and this contributes to increase k cat . A positive value of the difference means that that residue stabilizes the reactant state more than the TS and this contributes to decrease k cat . Figure 5 presents the results obtained for TS2 of both reactions. A similar representation for TS1 is given as Supporting Information ( Figure   S7). Here we only analyze the results for TS2 because this transition state corresponds to the step that differs between the OSBS and the NAAAR reactions and, in addition, this step seems to be the rate-limiting step in both cases. When analyzing these differential interaction energies it is important to take into account the different nature  Figure   S7 in the Supporting Information. Instead, in the second step of the NAAAR reaction the contribution of the ion (with its coordination shell) to stabilize TS2 relative to the reactant state is negligible (-0.1 kcal·mol -1 ). In this case, Arg 266 play a more important role has a larger catalytic contribution in catalyzing the second step of the NAAAR reaction (-14.8 kcal·mol -1 ) than in the OSBS reaction (-5.5 kcal·mol -1 ). This residue is close to Lys 263 and is decisive to helps lower its pK a , facilitating the proton transfer  OSBS and NAAAR reactions. The red lines corresponds to subunit-A residues, the blue lines to subunit-B, green color is for the magnesium ion and the black one for the sum of this ion and its coordination shell. A negative value means that a particular residue stabilizes the TS more than the reactant state. The insert shows the differences in the average interaction energies of the residues found in loops 20 and 50.

Conclusions
As a summary of our work, QM/MM simulation methods offer a detailed molecular picture of enzymatic reactions and constitute, together to kinetic, structural and mutagenesis studies, a fundamental tool to understand how enzyme specificity evolves.
In the particular case of the Amycolatopsis OSBS/NAAAR enzyme, the evolution of the racemase activity seems to be the consequence of several factors. As a general conclusion, an active site designed to bind a negatively charged residue and to lower the pK a of the base can also be adequate to catalyze a different reaction (the racemization process) provided that the new substrate binds to the active site in the appropriate conformation. In the example analyzed in this work the presence of loops in the active site (and in particular the flexible loop 20) seems to be a key factor to promote promiscuity. This loop is able to provide similar interactions to different substrates, even if they bind into the active site in slightly different orientations. However, other residues, apart from this loop, must also contribute to evolve the racemase activity, lowering the barrier for the proton transfer back to the C  atom. Obviously, another important element is the selection of the substrate that must contain those structural elements needed to compensate the interactions lost by the suppression of the hydroxyl group and to reaccommodate the substrate in the adequate binding pose. The present analysis shows that computational studies of enzymatic promiscuity may provide us guiding principles for the design of new functions into existing enzymes or for the de novo design of new activities in protein scaffolds. 6 Figure S1 shows the interactions of OSB substrate in the 1SJB structure. A representation of the simulated system is provided as Supporting Information in Figure  S2. In Figure S2 is depicted the overlap of two snapshots of SHCHC and NSRPG equilibrated in the active site of the enzyme. Figure S3 shows the energy profiles obtained for the OSBS and NAAAR reactions at the AM1/Mm level. The stationary structures obtained from the potential energy exploration are shown in Figures S4 and  S5. Figure S6 shows the corresponding Potentials of Mean Force obtained for the two steps of both reactions. Finally, Figure S7 shows the difference in the averaged interaction energies of each protein residue with the reacting subsystem at TS1 and reactants states, for both OSBS and NAAAR reactions. Table S1 shows the semiclassical KIEs determined at the M06-2X/MM level at 298 K of the rate-limiting step for the OSBS and NAAAR reactions. Table S2 and S3 provides key coordinates of the structures optimized at the AM1/level and M06-2X/MM, respectively, for the OSBS and NAAAR reactions. Table S4 and S5 give the interaction energies of some relevant residues along the stationary states of OSBS and NAAAR reactions, respectively. This material is available free of charge via the Internet at http://pubs.acs.org.