QM/MM modeling of the hydroxylation of the androstenedione substrate catalyzed by cytochrome P450 aromatase (CYP19A1)

CYP19A1 aromatase is a member of the Cytochrome P450 family of hemeproteins, and is the enzyme responsible for the final step of the androgens conversion into the corresponding estrogens, via a three‐step oxidative process. For this reason, the inhibition of this enzyme plays an important role in the treatment of hormone‐dependent breast cancer. The first catalytic subcycle, corresponding to the hydroxilation of androstenedione, has been proposed to occur through a first hydrogen abstraction and a subsequent oxygen rebound step. In present work, we have studied the mechanism of the first catalytic subcycle by means of hybrid quantum mechanics/molecular mechanics methods. The inclusion of the protein flexibility has been achieved by means of Free Energy Perturbation techniques, giving rise to a free energy of activation for the hydrogen abstraction step of 13.5 kcal/mol. The subsequent oxygen rebound step, characterized by a small free energy barrier (1.5 kcal/mol), leads to the hydroxylated products through a highly exergonic reaction. In addition, an analysis of the primary deuterium kinetic isotopic effects, calculated for the hydrogen abstraction step, reveals values (∼10) overpassing the semiclassical limit for the CH, indicating the presence of a substantial tunnel effect. Finally, a decomposition analysis of the interaction energy for the substrate and cofactor in the active site is also discussed. According to our results, the role of the enzymatic environment consists of a transition state stabilization by means of dispersive and polarization effects. © 2015 Wiley Periodicals, Inc.


Introduction
Cytochrome P450 (CYP or P450) is a superfamily of hemethiolate-containing metalloenzymes, composed by a large number of families and subfamilies, which belong to the three domains of life. Currently there are more than 11,500 CYP named, 57 of which are members of the homo sapiens species. [1] These enzymes play an important role in different biological processes in some organisms such as plants, mammalian, bacteria, fungi, archaea, or protists. [2,3] P450s can metabolize a wide variety of both endogenous and exogenous substrates, being responsible of the metabolism for a large amount of pharmaceuticals [4] ($75%). The most common reactions catalyzed by CYP enzymes involve oxygen insertions, including hydroxylation of unactivated CAH bonds, alkene epoxidations, N-and O-dealkylations, heteroatom oxidations such as sulfoxidations, and so forth. [2,5] Human Aromatase (CYP19A1) is a member of the superfamily of cytochrome P450 enzymes and is located in a number of tissues that include gonads, adrenal glands, ovaries, placenta, testes, adipose tissue, and numerous sites in the brain. [6,7] This enzyme is responsible for a key step in the biosynthesis of steroid hormones, and more specifically it is involved in the final step of the conversion of androgens (androstenedione [ASD], testosterone and 16-a-hydroxytestosterone) into the corresponding estrogens (estrone, 17-b-estradiol and estriol, respectively). [8] This conversion consists of three consecutive oxidation steps of the 19-methyl group of the androgens, which is removed as formic acid during the course of the reaction, resulting in aromatization of ring A of these steroidal substrates (see Scheme 1). [9] First and second steps are believed to proceed through two sequential hydroxylations of the 19-methyl group of ASD to produce 19-gem-diol species that then undergoes dehydration to yield the aldehyde intermediate (19-oxo-ASD). The third step, which is still under discussion, corresponds to the C 10 AC 19 bond cleavage process (lyase reaction). [10] The entire process of aromatization consumes a total of three moles of NADPH and three moles of oxygen. [2] Arotamase is the only known enzyme responsible for the biosynthesis of estrogens from androgens in vertebrates. The estrogens are the primary female sex hormones, being important for sexual and reproductive development in humans, mainly in woman. However, these steroidal hormones also stimulate the growth of hormone-receptor-positive breast cancer cells playing an important role in the development of hormone-dependent breast cancer. The inhibition of this enzyme, and hence cessation of estrogen production, provides one of the first molecular targets for rational drug development in the treatment of breast cancer and of other hormoneresponsive cancers. [11,12] For this reason, the development of potent and selective aromatase inhibitors (AIs) is one of the greatest challenges in the field of endocrinology regarding the breast cancer therapy, as well as a goal to achieve in the pharmaceutical industry.
The knowledge of the structural nature of the enzyme and its mechanism of action provide useful information for designing new AIs and optimal drugs to treat breast cancer. One of the best ways to explore the structure of an enzyme as well as its reaction mechanisms is through the use of computational techniques such as molecular dynamics (MD) simulations and hybrid quantum mechanics/molecular mechanics (QM/ MM) methods. Although the cytochrome P450 superfamily of enzymes has been extensively studied using computational methods in some of its most common isoforms, [13] CYP450 aromatase has not been under theoretical study for a long time, partly because of the lack of a good crystallographic model. This fact prompted a number of homology models based on other P450 isoforms and site-directed mutagenesis data were proposed. [14] The crystal structure of aromatase purified from human placenta in complex with its natural sub-strate ASD (protein data bank [PDB] code 3EQM) was determined by Ghosh et al. [15] in 2009. This crystal structure shows how the natural substrate is accommodated in the active site of this enzyme and provides a good initial molecular structure on which to apply computational tools. In fact, after obtaining this crystal structure, several mechanistic studies on aromatase catalysis using QM/MM methods have been published in recent years. [16,17] The first oxidation step of the 19-methyl group of ASD has been studied experimentally many times, [18][19][20][21][22][23][24][25][26] however, no QM/MM study has been performed for this catalytic subcycle so far, thus remaining the mechanism still unsolved to this day.
This first oxidation step, in which ASD is hydroxylated to 19hydroxy-ASD, is thought to proceed through the hydrogen abstraction/oxygen rebound mechanism, originally proposed by Groves et al., as occurs in other P450s. [27] This mechanism, consists of an initial hydrogen atom abstraction of the substrate by means of the reactive species, a high valent ferryloxo heme complex (Compound I, Cpd I), leading to the formation of the alkyl radical intermediate and the iron-hydroxo complex. Subsequently, an alkyl (or OH) reorientation is produced to facilitate the final oxygen rebound mechanism, where a recombination of the alkyl radical with the ironbound hydroxyl radical leads to the formation of the corresponding hydroxylated product (see Scheme 2). The nature of Scheme 1. Catalytic process of conversion of ASD into aromatic estrone catalyzed by cytochrome CYP19A aromatase.

FULL PAPER
WWW.C-CHEM.ORG the reactive species located in the active site of CYP enzymes, has been extensively discussed in the literature by both theoretical [13] and experimental [28] studies. Currently, the most accepted oxidant is the Cpd I, an iron IV oxo-porphyrin cation radical (Por 1Á -Fe IV @O), whose ability to activate inert CAH bonds has been amply demostrated. [29][30][31][32][33] Even so, due to the elusiveness of this compound and its high reactivity, making it very difficult to detect and capture, other oxidants such as the ferric hydroperoxo complexes (Por-Fe III -OOH 2 , Cpd 0) and perferryl-based species (Por-Fe V @O) have been proposed. [28,[34][35][36][37][38][39] However, Rittle and Green [30] succeeded preparing Cpd I on CYP119 P450 cytochrome in a high yield, being able to spectroscopically and kinetically characterize it, settling the matter on the hydroxylating agent in P450 enzymes. In addition, in a recent study, [40] Khatri and coworkers concluded that the Cpd I is the reactive intermediate not only in the first two steps of aromatization by aromatase (hydroxylations) but also in the third (and last) aromatization step.
The Compound I has a radicalary nature and possesses three singly occupied orbitals. Two of them are the p Ã xz and p Ã yz orbitals of the FeAO moiety, while the third one is delocalized between the a 2u orbital of the porphyrin and the p s of the sulfur atom. A ferromagnetic coupling of the three electrons provides a high-spin quartet state (S 5 3/2), whereas an antiferromagnetic one leads to a low-spin doublet state (S 5 1/2). As a result of this dual spin nature of the Cpd I, Shaik et al. [41] proposed a two-state reactivity mechanism for the hydroxylation and the epoxidation of alkanes, in which product distribution of the reaction is determined by the combination of both spin states.
We herein present a theoretical study for the first oxidation step of androgen substrate ASD via cytochrome CYP19A1 aromatase. In this oxidation step, which involves the hydroxylation of the 19-methyl group of ASD to 19-hydroxy-ASD, the radical rebound mechanism as well as the Cpd I as the oxidizing species have been proposed. The two potential energy surfaces (PESs), corresponding to the two possible spin configurations, doublet, and quartet, have been explored by means of QM/MM methods using the density functional theory (DFT) to describe the QM atoms. In addition, the activation free energies of the biochemical process have been evaluated introducing free energy perturbation (FEP) techniques, thus, taking into account the conformational space of the protein. Also, semiclassical kinetic isotope effects (KIE) of the hydrogen abstraction step from ASD 19-methyl group have been reported. Finally, an exhaustive analysis of the binding nature of both substrate and cofactor has been performed by means of the decomposition of the interaction potential energy along the hydrogen abstraction step.

System setup and MD simulations
The initial geometry used in our calculations was obtained from the X-ray crystal structure of the human placental aromatase cytochrome P450, in complex with its natural substrate ASD, solved at 2.9 Å of resolution (PDB code 3EQM). [15] The original pentacoordinated heme B cofactor found in the PDB file was modeled into an hexacoordinated Iron-Oxo porphyrin Cpd I species, using the atom positions suggested in the literature. [15] All the hydrogen atoms were added using the NAMD 2.8 program, [42] according to the empirical pKa predictions rendered by PROPKA web interface. [43] The standard protonation states in solution were found except for Asp-309, which exhibited a large pKa displacement (7.7). This value indicated a weak acidic character and, therefore, Asp-309 was protonated at pH 7. Histidine residues were protonated as follows: His-171 was doubly protonated, His- (62, 105, 111, 325, 475, and 480) were singly protonated at e position, His-(109, 128, 402, and 459) were singly protonated at d position. A total of four counter ions (Cl 2 ) were placed into optimal electrostatic positions around the enzyme to electroneutralize the excess of positive charge (14), fulfilling the electroneutrality of the system. The model was placed in a prerelaxed orthorhombic box of water molecules with dimensions of 90 3 80 3 80 Å 3 , erasing all those water molecules with an oxygen atom lying less than 2.8 Å from any heavy atom. The resulting model consisted of 452 residues of amino acids, the modeled Cpd I, the substrate ASD, 35 crystallographic water molecules, four counterions, and 16,542 water molecules of the solvation box.
The solvated model was minimized and then equilibrated by means of classical MD at 300 K, using the canonical (NVT) ensemble and the Langevin-Verlet integrator. The MD was run in NAMD for 20 ns with a step size of 1 fs using the OPLS-AA [44] force field and the TIP3P [45] water model. The nonbonding interactions were treated in all calculations by periodic boundary conditions with the Particle Mesh Ewald convention, via a force-switch function with a cutoff distance in the range 14-16 Å . No harmonic restraints were applied between atoms and/or fragments during the simulations.

QM and QM/MM calculations
Starting structure used in QM/MM calculations was extracted from one of the last snapshots obtained in MD simulations based on observation criteria. To study the chemical reactivity, the QM region of the model was defined to include the substrate, the Cpd I and the axial Cys-437 ligand (comprising the S c , the C b , and two H b atoms), giving rise to a total of 124 atoms (see Fig. 1a). The link atom formalism was applied to satisfy the valence of the QM region, due to presence of the classical bond partitioning in the Cys-437 amino acid (C a AC b bond). Because of the complexity of the enzymatic model, a truncation scheme was implemented during all hybrid QM/MM calculations. In this scheme, the lists of interaction for the QM atoms (incorporating all atoms within 20 Å from any atom of both the substrate and heme groups) are defined at the very beginning and remain constant during the calculations. In this way, all residues further than 20 Å from either the Cpd I or the substrate were kept frozen, during the simulations (8638 atoms from a total of 57,252).
As the study and characterization of a transition state (TS) usually make use of the Hessian matrix, its evaluation is not FULL PAPER WWW.C-CHEM.ORG feasible for enzymatic systems due to the large amount of degrees of freedom. One way to overcome this problem is to divide the full coordinates space into two different subsets: the control space, which incorporates all those atoms or molecules involved in the reaction process (usually the QM atoms); and the complementary space comprising the rest of the system. Therefore, the optimization of the model system can be expressed as the combination of iterations in both subsets: at each step of the control space Hessian guided optimization (based on the Baker [46,47] algorithm), the rest of the system is kept fully relaxed merely using gradient vectors (using the L-BFGS [48] procedure). This strategy is also known as the micro/ macroiteration method, and allows taking advantage of efficient optimization algorithms. [49] Furthermore, a dual QM:Charge/MM scheme [49,50] was adopted, where the total QM/MM energy of the system is expressed using different terms, depending on which coordinate space is being optimized. This choice is justified by the fact that a typical complementary space optimization (corresponding to the macroiterations) typically needs a large number of gradient evaluations at each control space movement (or microiteration), and thus some approximations must be introduced to speedup calculations. In this case, the QM atoms are reduced to frozen classical charges during the macroiterations, thereby using different expressions for the energy depending on the current iteration, as shown in following equations: The term "E MM ", common in both equations, is the energy of the MM region, including the Lennard-Jones interaction between QM and MM atoms. The terms "E QM,o " and "E QM,Int " included in eq. (1), correspond to the quantum energy of the QM atoms using a high-level method (such as ab initio or den-sity functional-based methods) under a classical environment described by point charges (q MM ). The term "E QM,o " refers to the selected gas-phase Hamiltonian operator acting on the polarized wave function, meanwhile the term "E QM,Int " involves the electrostatic interaction among this wave function and the classical point charges. Conversely, in eq. (2) the gas-phase quantum energy calculation has been removed, and the QM/ MM electrostatic interaction is replaced by a Coulombic expression, making use of electrostatic potential fitted charges (Merz-Kollman [51,52] ) for the QM atoms. In this way, the TSs were localized by means of this micro/ macroiterations scheme, and were characterized by hessian inspection. Afterward, minimum energy paths were then traced down to their corresponding minima, to ensure that the structures really connect reactants and intermediates. Once the reaction paths were obtained, the Gibbs free energy was evaluated at the stationary points, under the rigid-rotor and the harmonicoscillator (RRHO) approximations, taking into account both zero point energy (ZPE) and thermal corrections.
Moreover, a small model of the system (see Fig. 1b) was built to study the hydrogen abstraction step, performing gasphase calculations (QM-only model). This model, consisting of a reduced version of both Cpd I and ASD, was used to contrast some results obtained for the enzymatic model.
The unrestricted Kohn-Sham formalism with the B3LYP [53,54] hybrid density functional was used to describe the QM region of the enzyme aromatase as well as the quantum atoms of the QM-only model. It has been shown that the DFT and DFT/ MM methodologies provide a reasonably good description of the doublet and quartet ground states of the Cpd I. [55,56] In particular, the B3LYP functional has become the preferred choice for carrying out the modeling of the CYP enzymes as is well-known to give reasonable relative spin state energies as well as to perform well for a number of other properties. [13,57] The LACVP* basis set (B1), [58] consisting of the combination of the 6-31G(d) basis for all the atoms except for the iron, which is represented by the Lanl2dz effective core potential (ECP), was used for the exploration of the PES. Additionally, to obtain more accurate energies and frequency numbers, a larger basis set (B2) was adopted to perform single point calculations on the stationary points previously allocated with the B1 basis set. This B2 basis set corresponds to the combination of the Lanl2tz1 ECP for the iron and the 6-311G(d,p) basis for the rest of the atoms. The rest of the enzyme and the water molecules were described by the OPLS-AA and TIP3P force fields, respectively.
All the electronic calculations were performed using the Gaussian03 [59] family of programs, whereas the fDYNAMO [60] library was used to describe the MM during the hybrid QM/ MM simulations.

Free energy perturbation calculations
FEP techniques were introduced to evaluate the free energies of activation of the chemical process. The use of this method, based on statistical mechanics, allows taking into account the conformational space of the enzyme, obtaining accurate freeenergy surfaces and solvent effects. For this purpose, different energetically equispaced geometries (windows) were selected from the previously traced reaction paths, being 54 for the doublet spin state and 75 for the quartet state. In each window, the QM region (including the link atom) was kept frozen during the calculations. Following the micro/macro scheme presented before, after each QM calculation performed [see eq. (1)], 100 steps of classical MD were performed with a time step of 1 fs [see eq. (2)], under the same conditions as during the equilibration. This procedure was repeated 200 times, resulting in a total of 20 ps per window. Finally, the coordinates of the QM region were swapped between consecutive windows (for the last 100 structures) to estimate the free energy changes, using the following expression: In eq. (3), H i corresponds to the QM/MM potential energy obtained during the ith window of MD. The term H j refers to the QM/MM potential energy, once the coordinates of the atoms for the control space have been exchanged (j 5 i 1 1 for forward or j 5 i 2 1 for backwards), under the same geometry of the classical environment (i). Finally, the triangular brackets denote an average performed over the ith window.
Once the free energy barrier is calculated using eq. (3), the ZPE correction term is included as the difference between the ZPEs of both final and initial states (reactant and TS for the activation energy) by means of QM/MM frequency calculations.

Kinetic isotope effects calculations
Semiclassical KIE were calculated for the hydrogen abstraction step in both QM-only model and the enzymatic system (as an average of 10 window obtained from FEP calculations for both the TS and reactants).
Subsequent corrections to the Eyring semiclassical KIE were incorporated by means of the Wigner correction [61] to estimate the Tunnel effects on these KIEs: where v ‡ is the imaginary frequency at the saddle point, k B is Boltzmann's constant and h is the Planck's constant. This choice seems to be appropriate to the tunneling corrections applied to rate constants for which the values of transmission coefficients are small to moderate ( 2). [62,63] The CAMVIB/ CAMISO [64,65] programs, based on the RRHO approximations, without scaling the vibrational frequencies, were used to calculate KIEs.
As the analysis of the bond orders (BO) may provide relevant information about the geometrical similarity in the TS structures, which is directly related with the KIE values, the C 19 AH 19 and H 19 AO formal BO were calculated. For this purpose, natural bond orbital (NBO) calculations were performed using the Gaussian 03 package at the UB3LYP/B1 level to estimate the Wiberg bond indices in the natural atomic orbitals (NAO) basis. [66]

Results and Discussion
The results presented in this article correspond to the hydroxylation of the ASD 19-methyl group to 19-hydroxy-ASD (see Scheme 2) catalyzed by the cytochrome P450 aromatase. Both the initial hydrogen abstraction step, and the subsequent oxygen rebound step were studied in the active site of the enzyme, in the doublet and the quartet spin states, using the B3LYP[B1]:ESP/MM scheme presented before. Table 1 shows the Gibbs free energies for the QM-only model (using the B1 basis set), and for the enzymatic system (using both B1 and B2 basis set). In addition, the free energies of activation obtained from FEP in the enzymatic model for both basis sets are also presented.
The QM-only model gas-phase calculations, exhibit almost degenerated reactants at both spin states. The Gibbs activation energies, as well as the corresponding imaginary frequencies, for the hydrogen abstraction step show also similar values, leading to the formation of the intermediates through an endergonic process. The comparison of these values with those obtained in the enzymatic environment (using the same B1 basis set), displays a modest breakdown of the reactants degeneracy by 1.4 kcal/mol. The subsequent Gibbs activation free energies for this step are lowered by 3.4 kcal/mol in the doublet and 1.3 kcal/mol in the quartet, when comparing to the gas-phase model. This modest catalytic effect obtained from the PES analysis suggests a mild participation of the enzymatic environment in the chemical process. The intermediates that follow the hydrogen abstraction step give place to the oxygen rebound mechanism by means of a small FULL PAPER WWW.C-CHEM.ORG barrier ($3 kcal/mol), leading to the formation of the hydroxylated products through a highly exergonic process (232 and 243 kcal/mol for the doublet and quartet states, respectively).
The inclusion of the conformational space effects (enzyme flexibility), by means of the FEP methodology, dramatically reduces the free energy of activation for the hydrogen abstraction step. The obtained value of 13.5 kcal/mol for both spin states appears to be as a more realistic prediction for this biochemical reaction. [67] At the same time, the free energy difference between reactants in both spin states obtained with the FEP is increased when compared with the PES results, despite that both spin states are accessible during the dynamics. Once the intermediate species have been formed, the oxygen rebound step takes place after a previous reorientation process (see Scheme 2).
For the doublet electronic state, the rotation has a small free energy barrier of 2.1 kcal/mol leading to a rotation intermediate (23.3 kcal/mol) from which the rebound step takes place with a barrier of 2.3 kcal/mol. This behavior is different in the quartet state, where the rotation has a free energy barrier of 2.2 kcal/mol leading to a rotation intermediate roughly thermoneutral (0.7 kcal/mol) from which the rebound takes place with a barrier of 2.7 kcal/mol. Finally, the inclusion of the ZPE energy term reduces the overall barrier for the rebound step to 1.5 and 2.8 kcal/mol, for the doublet and quartet states, respectively. As can be noted, the free energies of the product species are not reported in Table 1, as we have focused on obtaining only the free energies of activation for the biochemical process with the FEP methodology.
The effect of the basis set has been also inspected by means of the incorporation of a larger basis set (B2), thus, performing single point calculations of the structures obtained in the former PES (B1 basis set). The new Gibbs free energies obtained depict a decrease in the activation energy of the hydrogen abstraction around 2-3 kcal/mol, while the remaining stationary points show similar values of relative energy. Additionally, the basis set correction has been included in the FEP calculation, but only in those terms that can be factored out of the average, to avoid the large amount of expensive QM(B2) calculations: In eq. (6), the interaction energy term (DE B1 Int ) has been previously calculated using the B1 basis set, while both the gasphase energy (DE B2 vac ) and the ZPE (DZPE B2 ) terms correspond to the expensive calculations with the large basis set. The obtained results show a free energy of activation for the hydrogen abstraction lower than the corresponding to the B1 basis set (11.4 and 12.0 kcal/mol for the doublet and quartet states, respectively). This trend, associated to the use of the B2 basis set, is not observed in the rebound step, where the activation barriers remain virtually unchanged (1.3 and 3.3 kcal/ mol for the doublet and quartet state, respectively).
Primary deuterium ( 2 H) KIE values of the H 19 abstraction step for the two models (QM-only and the enzymatic system) were determined using the frequency data obtained with the B1 basis set. This basis set was used to compare the results for the QM-only model and the enzymatic system (obtained as an average of the FEP structures), due to the complexity of the latter.
Inspection of the computed KIEs presented in Table 2 show values around 7 in both models and in both spin states The basis set used in each calculation is also shown in parentheses. The imaginary frequencies associated with the TS obtained in the PES explorations are also reported. All energies are expressed in kcal/mol, and frequencies in cm 21 . In this way, due to the strong parallel drawn between TS structures, we can find similar imaginary frequencies for the hydrogen abstraction step in doublet (1606i) and quartet (1645i) spin states, and even when the deuterium atom replaces the transferred hydrogen atom (1209i and 1236i for doublet and quartet, respectively).
This analogous behavior between doublet and quartet spin states along the H-transfer coordinate, can also be found in the case of the QM-only model. In this model, the geometrical parameters obtained for the TS species along this coordinate are: d(C 19  1601i in doublet and 1713i in quartet spin states, and when the deuterium atom replaces the transferred hydrogen atom are 1200i and 1277i for doublet and quartet, respectively. It is noteworthy that both the geometrical parameters and the BO indices as well as imaginary frequency numbers obtained are very similar between the QM-only model and the enzymatic system, and therefore, it is not surprising that the KIE results are similar in both models. Corrections for barrier tunneling were determined by Wigner approximation and provided values in the range 1.4-1.5 in both models and spin states. As the Wigner correction yields values significantly greater than unity, it suggests the involvement of a significant amount of tunneling in this hydrogen abstraction step. Indeed, the corrected KIE values increase up to values around 10 in both systems and spin states, thus, exceeding the semiclassical Eyring limit (see Table 2). It is worth pointing out that tunneling effect has been reported in previous theoretical [68][69][70][71][72][73] and experimental [30,38,67,[74][75][76][77] studies for this kind of hydrogen atom abstractions.
The Mulliken atomic spin densities obtained for the hydroxylation step using the B2 basis set are reported in Table 3. These results reveal the presence of three unpaired electrons in the reactant species: two electrons with parallel spin are located on the FeO moiety of the Cpd I forming a local triplet (q FeO $2.10) and a third electron is located between the p s orbital at the sulfur atom of the Cys-437 and the a 2u orbital of the porphyrin ring. The sign of the latter (beta or alpha) determines either the doublet (q Por 1 SCH3 5 21.13) or quartet (q Por 1 SCH3 5 0.95) nature of the system, respectively. The results in Table 3 show that the porphyrin p cation radical is predominantly of A 2u character ($80%) in both doublet and quartet spin states, with a significant contribution of G s state, leading to a "green" Cpd I instead of the "red" one. As long as  Information Table S1). It is worth noting that the iron-hydroxo complex formed in intermediate species was found in two different electromeric configurations for the quartet spin state ( 4 Fe III Por •1 and 4 Fe IV-Por) and only one for the doublet state ( 2 Fe III Por •1 ). Both electromers found in the quartet state share the same geometry, but differ in the distribution of electrons on their atoms. [13,78] The 2,4 Fe III Por •1 electromers contain one unpaired electron on the iron center (q Fe 5 0.92 ( 4 Fe III Por •1 ), 1.02 ( 2 Fe III Por •1 )) and the other one distributed on both the a 2u orbital and the sulfur atom (q Por 1 SCH3 5 0.97 ( 4 Fe III Por •1 ), 21.09 ( 2 Fe III Por •1 )). The 4 Fe IV Por electromer has the two unpaired electrons located on the iron center (q Fe 5 2.06 and q Por 1 SCH3 5 20.17). However, all the electromeric situations retain the radical character on the C 19 atom with a singly occupied orbital (q C19 5 1.03 ( 2 Fe III Por •1 ), 1.04 ( 4 Fe III Por •1 ), and 1.05 ( 4 Fe IV Por)).
The oxygen rebound step is characterized by the formation of a new CAO bond, and involves the elimination of the radical character in the C 19 carbon atom and the formation of a closed shell porphyrin in the product species. As a result, two different electronic distributions for Fe III are obtained in products defining the multiplicity state of the molecular system: the doublet state, which have an uncoupled electron on the iron atom (q Fe 5 1.34 and q Por 1 SCH3 5 20.34) and the quartet state which have three uncoupled electrons on the iron (and partially in the sulfur axial ligand) (q Fe 5 2.70 and q Por 1 SCH3 5 0.29). Table 4 illustrates some of the most relevant geometrical parameters of the substrate and cofactor in the active site (for more details see Supporting Information Tables S2 and S3). The values obtained for both the FeAS and FeAO distances in the reactant species, are consistent with those observed experimentally for the Cpd I [79] (2.48 and 1.65 Å , respectively, using EXAFS spectroscopy). Concerning the evolution of these distances as the hydrogen abstraction step proceeds, it can be observed that the FeAO bond is stretched due to the formation of the new hydroxyl group linked to the iron atom (from $1.63 to $1.80 Å ), being this process accompanied by a shortening of the FeAS distance (from $2.70 to $2.50 Å ).
Furthermore, it can be seen that the donor-acceptor distance (C 19 AO) remains practically invariant in the TSs of both doublet and quartet spin states ($2.52 Å ). This fact can also be observed in the QM-only model (see Supporting Information Table S3), where the C 19 AO bond distance in the TSs ($2.53 Å ) is similar to that found in the enzymatic system. However, we can observe that for the reactant species obtained for the enzymatic model, the C 19 AO distance ($2.91 Å ) is shorter than the one found in the QM-only model ($3.51 Å ). This observation suggests that one of the possible roles of the aromatase active site consists of the preorganization of the reactant state toward the TS along the reaction coordinate.
Regarding the C 19 AH 19 AO angle, it shows its maximum value in the geometries of the TSs corresponding to the hydrogen abstraction step. This angle takes values of $1708 in the TS of the enzymatic system and $1658 in the TS of the QM-model. The fact that this angle is virtually linear in the TS structures is what is expected for the hydrogen abstraction step, being this value closer to the linearity in the enzymatic system than in gas-phase model.
Previous to the oxygen rebound step, a reorientation of the hydroxyl group takes place to prepare the formation of the new C 19 AO bond. This rotation can be observed by following the change that occurs in the H 19 AOAFeAN a dihedral angle between intermediates and TS of the rebound step (around 318 for doublet and 478 for quartet), together with the shortening of the C 19 AO bond (from $2.94 to $2.60 Å ). Subsequently, this distance decreases dramatically from TS to products (from $2.60 to $1.42 Å ) leading to the formation of a new C 19 AO bond as well as a penta-coordinated iron complex. Regarding the interactions established among the substrate/ cofactor and the residues present in the active site, the following observations can be made: (i) there is a large network of hydrogen bonds at the binding site of the heme propionate chains involving four charged Arg (115, 145, 375, and 435) and a Trp-141 residues (see Fig. 2). These interactions balance the large repulsion of the carboxylate side chains of the porphyrin ring, facilitating thus an efficient binding site for the heme group in the active site. (ii) The partially charged sulfur atom of the Cys-437 is stabilized by the interaction with three polar peptidic hydrogens (Ala-438, Gly-439, and Lys-440). In fact, hydrogen bonds have been reported to stabilize the heme thiolate coordination and regulate the redox potential of the heme iron, [80] preventing the elongation of the FeAS bond and avoiding the formation of a radical on the sulfur. (iii) The binding of the substrate is accomplished via hydrophobic interactions at the active site with the residues Ile-133, Trp-224, Ala-306, Thr-310, Val-370, Leu-372, Val-373, Phe-430, and Leu-477. (iv) Two hydrogen bonds are established between the polar moieties of the substrate ASD (carbonyl groups) with the Met-374 and protonated Asp-309 residues. Previous research suggests that the Asp-309 residue can participate in the enolization reaction of ASD as a proton donor in the third and last catalytic subcycle of aromatase enzyme. [15] The interaction energy between the substrate/cofactor and the enzymatic environment for the hydrogen atom abstraction step was calculated in the enzymatic system, using the windows corresponding to the reactants and the TSs obtained from the FEP. This energy has been in turn decomposed into different energetic contributions as follows: (i) the electrostatic term (E elec ) which accounts for the interaction between the MM charges of the enzymatic environment and the polarized wave function of the substrate; (ii) the polarization energy (E pol ) of the wave function due to its changes on-going from gas-phase to the enzymatic active site; (iii) the Lennard-Jones term (E LJ ) representing the dispersive interactions between the MM and the QM atoms and (iv) the gas-phase energy (E vac ) of each species. Each of the interaction energy terms was obtained as an average of both doublet and quartet FEP calculations.
As can be seen in Table 5, the electrostatic interaction term (E elec ) for the doublet TS species shows a weaker value when compared with the corresponding reactant, leading to an unfavorable contribution to the activation energy (1.4 kcal/ mol). This effect is similar in the quartet state, but less adverse than in the doublet (1 kcal/mol). Conversely, it is worth pointing that both polarization (E pol ) and Lennard-Jones (E LJ ) terms stabilize the TSs with respect to the corresponding reactant species, being different the weight of each term depending on the spin state. In this way, the polarization term in the quartet (25.9 kcal/mol) is higher than in the doublet (22.9 kcal/mol); however, a different behavior occurs for the Lennard-Jones where both the doublet and quartet TSs are stabilized in a similar way (24.1 and 24.5 kcal/mol, respectively). Furthermore, the E LJ term can in turn be broken down into the contributions of the substrate (DE LJ (ASD)) and cofactor (DE LJ (Cys 1 heme)), showing that the main contribution to this stabilization comes from the substrate (ASD) when comparing with the cofactor (Cys 1 heme), giving rise to values of 22.6 and 24.9 kcal/mol for the doublet and quartet states, respectively.
All in all, both the gas-phase barrier and the unfavorable electrostatic TS stabilization are offset by those contributions from the polarization and Lennard-Jonnes interaction, leading to lower activation free energies for both spin states. Despite the fact that the free energy of activation is lower for the quartet state, the reactant is destabilized compared with the doublet, making the overall process to be less affordable.

Conclusions
The above study investigates the hydroxylation of the substrate ASD to 19-hydroxy-ASD, occurring during the first catalytic subcycle of the enzyme aromatase, by means of MD simulations and hybrid QM/MM calculations. With this study, we attempt to improve the understanding of the conversion mechanism of androgens into estrogens via the enzyme aromatase from the standpoint of theoretical chemistry.
In this oxidation step, the radical rebound mechanism (consisting in a first hydrogen abstraction step followed by an oxygen rebound step) as well as the Cpd I as the oxidizing species have been proposed.
According to the results obtained from the analysis performed on the PES, the rate-limiting step corresponds to the hydrogen abstraction process. The Gibbs free energy barriers for this step (obtained by means of the RRHO approximation) were found to be slightly lower for the enzymatic system (22.3 kcal/mol) than for the QM-only model (25.7 kcal/mol), showing a catalytic effect of the enzyme on the biochemical reaction. An analysis of the 2 H-KIE calculated for this step show almost identical values in both models studied (QM-only model and enzymatic system), indicating that KIE value is intrinsic to the reaction and is not heavily affected by the enzymatic environment. Moreover, the results show a slight difference in KIE values between different spin states, which is explained by the great geometrical similarity found between the TS species obtained in both states. Once the tunneling correction is included, the KIE values exceed the semiclassical limit thus denoting a substantial tunnel effect (as has already been reported in some studies involving CAH abstractions).
Furthermore, we have incorporated (for the first time in this P450 isoform) the enzymatic conformational diversity in the calculations by means of the FEP methodology. As this methodology takes into account the enzyme flexibility, it provides a more realistic view of the role played by the enzyme in the biochemical process. According to the results obtained with this methodology, the free energy barriers for the hydroxylation step are lower than those calculated by way of the RRHO approximation, being the values obtained within the range expected for this kind of biological processes. The hydrogen abstraction step takes place through an endergonic process with an activation barrier of 13.5 kcal/mol, leading to the formation of an alkyl radical on the substrate and a hydroxyl group linked to the heme cofactor. From these intermediates, a reorientation of the hydroxyl group occurs previous to the oxygen rebound step. The energetics of this reorientation process depends on the spin state of the substrate, being the doublet state the more favored. Thereby, the rebinding of the oxygen befalls via the hydroxyl rotation and the formation of the new CAO bond. The coupled reorientation-rebound process is characterized by a remarkably low free activation Absolute values with the corresponding errors are shown in brackets. Lennard-Jones term has been broken down into substrate (ASD) and cofactor (Cys 1 heme). barrier of 1.5 kcal/mol, giving rise to the formation of the products through a highly exergonic reaction. Based in these energetic results and the KIEs obtained, we can conclude that the hydroxylation of substrates via the enzyme aromatase is compatible with the conventional hydrogen abstractionoxygen rebound mechanism observed in other P450 isoforms. Finally, the analysis of the decomposition of the free energy barrier (obtained by FEP calculations) in its different contributions, suggests that the electric field produced by the enzyme during the biochemical reaction is not directly involved in the stabilization of the TS when compared with the reactant species. In fact, it is noteworthy that both the polarization of the wave function and the Lennard-Jones term are the only ones responsible for such stabilization. These findings are consistent with the kind of chemical process and/or substrate. Therefore, we can conclude that the main role of the enzyme aromatase during the hydroxylation of ASD consists of a stabilization of the TS by means of the participation of both dispersive and polarization effects.