Formamide reaction network in gas phase and solution via a unified theoretical approach: Toward a reconciliation of different prebiotic scenarios Fabio Pietrucci1 and Antonino Marco Saitta Institut de Minéralogie, de Physique des Matériaux et de Cosmochimie, Université Pierre et Marie Curie, Sorbonne Universités, CNRS, Muséum National d’Histoire Naturelle, Institut de Recherche pour le Développement, UMR 7590, F-75005 Paris, France Edited by Jerrold Meinwald, Cornell University, Ithaca, NY, and approved October 19, 2015 (received for review July 7, 2015)
Increasing experimental and theoretical evidence points to formamide as a possible hub in the complex network of prebiotic chemical reactions leading from simple precursors like H2, H2O, N2, NH3, CO, and CO2 to key biological molecules like proteins, nucleic acids, and sugars. We present an in-depth computational study of the formation and decomposition reaction channels of formamide by means of ab initio molecular dynamics. To this aim we introduce a new theoretical method combining the metadynamics sampling scheme with a general purpose topological formulation of collective variables able to track a wide range of different reaction mechanisms. Our approach is flexible enough to discover multiple pathways and intermediates starting from minimal insight on the systems, and it allows passing in a seamless way from reactions in gas phase to reactions in liquid phase, with the solvent active role fully taken into account. We obtain crucial new insight into the interplay of the different formamide reaction channels and into environment effects on pathways and barriers. In particular, our results indicate a similar stability of formamide and hydrogen cyanide in solution as well as their relatively facile interconversion, thus reconciling experiments and theory and, possibly, two different and competing prebiotic scenarios. Moreover, although not explicitly sought, formic acid/ammonium formate is produced as an important formamide decomposition byproduct in solution.
chemical reactions free energy landscapes formamide prebiotic scenarios
| ab initio molecular dynamics |
ery different environments have been suggested as possible cradles for the emergence of biomolecules, including a primordial liquid soup (1), rarified gaseous interstellar clouds (2, 3), liquid–solid interfaces at hydrothermal conditions (4), and the extreme case of impact sites of meteorites and comets (5). Biochemically relevant reactions can be fueled by a range of different energy sources, and a large number of possible chemical reactions and reaction paths are suggested to have played an important role in the process leading simple molecules to form small organic molecules and, from them, biological monomers and polymers. Among such small organic molecules, formamide is assuming an ever more central role in prebiotic chemistry research, as recently shown in experiments mimicking some of those origins of life scenarios, such as laser sparks (6), UV light (7), proton irradiation (8), or shock waves (as in meteorite impacts) (9). The main reason for such strong interest in formamide, besides its ubiquitousness in the solar system, is its chemical flexibility: it can be formed from (or, conversely, dissociated into) different molecular species that represent fundamental building blocks, including H2, H2O, NH3, CO, HCN, HNCO, and HCOOH (10, 11). The barriers for these different processes lie in a relatively narrow range, providing many synthetic directions. Additionally, as has been also showed experimentally and computationally, formamide could give rise to all classes of biomolecular monomers, from amino acids to nucleic acids to sugars (12, 13). For example, inspired by the classic Miller experiments, recent ab initio MD (AIMD) simulations demonstrated that an intense 15030–15035 | PNAS | December 8, 2015 | vol. 112 | no. 49
electric field can induce the barrierless (spontaneous) formation of formamide and formic acid from a CH4, CO, NH3, H2O, and N2 mixture in the condensed phase, and under the same conditions, formamide in turn gives birth to more complex organic molecules including glycine (14). In this context, understanding the thermodynamics (free energy differences) and kinetics implied in basic reactions of prebiotic relevance is key to assess the likelihood of the different scenarios put forward in the literature. Computational approaches are a formidable complement to experiments in this field because they exploit the fundamental laws of quantum mechanics to study chemical reactions, interpret experimental results, and predict novel mechanisms. However, contemporary computational physical chemistry is dominated by gas-phase calculations at zero temperature, with effects due to temperature, pressure, and chemical environment relegated to approximated extrapolations. For instance, our knowledge of reaction dynamics in condensed phases is far from complete (15, 16), despite the fact that water is a polyvalent molecule, known to participate also in formamide chemistry under different roles, including as a stabilizer through hydrogen bonds, as an efficient acid–base bifunctional catalyst, and as a coreactant (17). Additional effects should also be considered, including vibrational energy dissipation upon birth of exothermic products or solute trapping into finite-lifetime cages affecting its diffusion and reactivity (16, 18, 19). The large number of possible configurations [already including few water molecules (17)] together with the strong anharmonicity of liquids naturally calls for methods like MD that include from the start the finite-temperature dynamics. Significance The qualitative and quantitative description of chemical reactions is a formidable problem. Determining pathways, energetics, and kinetics requires costly accurate quantum approaches. Typically, this amounts to a simplification of gas-phase reactions into elementary steps, which, in turn, are usually directly translated in solution. The topology-based method hereby presented allows instead an unbiased exploration of both gas-phase and aqueous-phase reactions. Its application to formamide, a centerpiece of prebiotic chemistry, unveils very different reaction networks between gas and solution regimes, indicating, in the liquid phase, new pathways, intermediates, and products. The new scheme allows us also to demonstrate the comparable relative stability of formamide, hydrogen cyanide, and formic acid, thus contributing insight to a heated current debate in the prebiotic chemistry community. Author contributions: F.P. and A.M.S. designed research, performed research, and wrote the paper. The authors declare no conflict of interest. This article is a PNAS Direct Submission. 1
To whom correspondence should be addressed. Email: [email protected]
This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10. 1073/pnas.1512486112/-/DCSupplemental.
P Nf ke−λD½RðtÞ, Rk sðtÞ = Pk=1 , Nf −λD½RðtÞ, Rk′ k′=1 e
! Nf X 1 e−λD½RðtÞ, Rk , zðtÞ = − log λ k=1
where s represents the progress along a pathway defined through an ordered sequence of atomic configurations R1, R2, . . ., RNf , whereas z is a measure of the distance from the pathway itself. This class of variables proved crucial to obtain free energy landscapes of a variety of processes including gas-phase chemical reactions with concerted mechanisms (33–35) and transformations of carbon nanostructures (36). The key ingredient of path collective variables, determining their effectiveness in a given problem, is the definition of distance DðRðtÞ, Rk Þ between the atomic configuration at time t and the kth reference structure. In this work we introduce D½RðtÞ, Rk =
From a prebiotic perspective, it is necessary to have a comparative understanding of reaction networks in different environments (gas or condensed phase, with different solvents and also interfaces with minerals) and at different conditions (T, P, irradiation, shock waves, etc.), eventually embracing also nonequilibrium scenarios, for their role in the emergence of life. The overwhelming gap between the (long) time scale of reactive events and the (short) time scale of ab initio MD simulations can be effectively overcome using the available enhanced sampling techniques, including metadynamics (20) and transition path sampling (21). Despite this, the widespread adoption of MD simulations in the study of chemical reactions has so far been slowed down by the lack of standard, general purpose formulations of reaction coordinates. In particular, it is challenging to design coordinates that fully include the important role of the solvent degrees of freedom (22–24) and that are general enough to be applied to a palette of diverse reaction mechanisms. Here we propose a novel free energy calculation approach able to address in a general way a wide range of chemical reaction mechanisms in solution. The method deals in the same formal way (namely, in the same space of coordinates) with gasphase and solutions, allowing for the first time to our knowledge a direct comparison of the two environments and an easy assessment of the effects of a given solvent on reaction pathways. We exploit the unique advantages of our new approach to reconstruct in detail the multiple formation/decomposition mechanisms of formamide and to disclose the sizable qualitative and quantitative difference of solution chemistry with respect to gas phase. In this way we unveil a reaction network of remarkable complexity that sheds further light on fundamental prebiotic reactions and will serve as the basis of a future comparative understanding of different scenarios for the emergence of biological molecules.
where CIS is the coordination number between atom I of species (element) S′ and all atoms J of species S, defined by means of a smooth switching function: P CIS ðtÞ = J∈S ½1 − ðRIJ ðtÞ=R0SS′ ÞN =½1 − ðRIJ ðtÞ=R0SS′ ÞM . The parameter R0SS′ depends on the two species because, e.g., a C–H bond is shorter than a C–C bond. We remark that apart from the latter parameters, the sole inputs needed to construct the collective variables are the coordination patterns of the reactant and product species (and possibly other intermediates) as in Fig. 1. We used metadynamics both in the fixed Gaussian height and in the well-tempered variants (37). The duration of each simulation was between 50 and 200 ps, and we estimate that free energy surfaces have a statistical precision of about ±2 kcal/mol by comparing bias profiles at different times and across independent simulations (repeating at least three times each simulation changing Gaussian size and rate of deposition). Technical details are discussed in Supporting Information. A particularly convenient feature of the present formulation is that the detailed geometry of the kth reference structure needs not be explicitly given: instead, the matrix of CISk entries can be directly constructed starting simply from structural formulas (Fig. 1). In practice, using the CISk entries obtained from a short MD simulation of the end states can improve the resolution of intermediate and transition states in the free energy landscape. Note that coordination numbers have been repeatedly used as an effective collective variable to study, e.g., (de)protonation reactions in water (38), because they are well suited to describe the active participation of the solvent. On the other hand, a typical reaction may require a set of several coordination numbers to be faithfully described, to track the creation/dissociation of different chemical bonds, with all of the resulting difficulties in reconstructing high-dimensional free energy landscapes. Combining path collective variables with a high-dimensional space of coordination numbers allows exploiting the best of both worlds, providing a handy 2D space able to track complex transformations, leaving the system the freedom to depart from the proposed path, and describing in a seamless way both gas- and condensed-phase chemistry.
Methods We exploited Born–Oppenheimer MD simulations based on the PBE exchange correlation functional (25) including Grimme’s dispersion corrections (26, 27), as implemented in the code Quantum Espresso (28). We simulated periodically repeated supercells including 96 or 192 atoms at a density of 1 g/cm3 and at T = 300 K (29). We performed metadynamics (20, 30) simulations using the plugin Plumed 1.3 (31), where we implemented our new approach (freely available on demand). Following ref. 32, we define two socalled path collective variables s and z as
Pietrucci and Saitta
Fig. 2. Free energy landscape for interconversion of formamide (region A) and CO + NH3 (region B) in the gas phase. Representative atomic configurations of free energy minima and transition states are shown as insets.
PNAS | December 8, 2015 | vol. 112 | no. 49 | 15031
Fig. 1. Construction principle of topological path collective variables. The connectivity patterns of reactants and products are represented by tables having individual, nonhydrogen atoms on rows and atomic species (the set of all atoms of a given element) on columns. Arrows indicate changes of coordination numbers; however, all other matrix elements are free to change as well thanks to the flexibility of path collective variables (see text).
i2 Xh CIS ðtÞ − CISk ,
Fig. 3. Free energy landscape for reactions in aqueous solution between formamide (region A), CO + NH3 (region B), and HCOOH + NH3 (region C). Representative atomic configurations are shown as insets.
Results Decomposition of Formamide into Carbon Monoxide and Ammonia.
In a first set of simulations we investigated decarbonylation reactions and their inverse: HCONH2 ⇄ CO + NH3 .
Within a single gas-phase simulation of 170 ps the trajectory explores seven forward and seven backward reactions. In the resulting free energy landscape (Fig. 2) two basins corresponding to reactants and products are readily identified, differing by less than 1 kcal/mol in stability. Reactants and products are separated by a free energy barrier ΔF p = 79 kcal/mol, very similar to the barrier of 80.5 kcal/mol found in zero-temperature calculations at CCSD(T)/CBS + ZPE level in ref. 10. Compared with gas phase, the landscape of the solution is significantly modified in both quantitative and qualitative features (Fig. 3). In particular, besides the reference states used to build the collective variables (formamide and CO + NH3), the simulation explores a third basin featuring formic acid plus ammonia. This finding is remarkable because our space of collective variables was not explicitly constructed to take into account formic acid, as shown from the higher values of the z coordinate with respect to the reference species. Formamide, CO + NH3, and HCOOH + NH3 show a similar stability within our statistical uncertainty (±2 kcal/mol). Inspection of the landscape results also in a barrier of 35 kcal/mol for the reaction HCONH2 → HCOOH + NH3 and of 40 kcal/mol for the reaction HCONH2 → CO + NH3. For comparison, repeating the simulation in the larger cell containing 62 water molecules (see values in parentheses in Table 1), we found differences in the free energy levels corresponding to minima and saddle points that are within 5 kcal/mol from the smaller model ones. The two decomposition channels, decarbonylation and hydrolysis, thus have quite similar barriers, confirming the effectiveness of topological path collective variables in discovering competitive pathways and even relevant chemical species. Additionally, ionic forms NH4+ and HCOO− are also explored, as expected from the basicity of ammonia and acidity of formic acid. An advantage of our molecular dynamics approach is to have built-in anharmonic temperature effects, so that we can directly break down ΔF into energetic and entropic contributions: ΔE is directly obtained as the average energy observed in a relatively long equilibrium (unbiased) MD simulation performed in each metastable state (Supporting Information), whereas TΔS = ΔE − ΔF. As shown in Table 1, energy differences among the three product states are 15032 | www.pnas.org/cgi/doi/10.1073/pnas.1512486112
small, whereas all decomposition reactions correspond to a sizable increase in energy that is almost exactly compensated by an increase in entropy, resulting in a comparable free energy for all four metastable states. As detailed in Supporting Information, the different solute molecules interact in very different ways with water: formamide and formic acid are highly hydrophilic engaging on average in 4.5 and 5 hydrogen bonds with water, respectively, within an equilibrium MD simulation. Instead, CO appears as a hydrophobic molecule, not engaging in hydrogen bonds and featuring a first solvation shell at the sizable distance of 3.5 Å, similar to the hydrophobic cage of water molecules observed around solvated CO2 (19). In more detail, all observed reactions begin with water donating a proton to the amino group of formamide as the first step (leaving an OH− species in solution), overcoming a barrier ΔF p ≈ 35 kcal/mol. The resulting cation, which was identified as a relevant intermediate for this same reaction in ref. 14, is locally stable only within a time scale of less than 1 ps, as we verified with hundreds of unbiased trajectories started from 10 different configurations featuring this species (see committor analysis below). Experimentally, formamide is a weak base, with the protonated species having a standard free energy almost equivalent (only 0.1 kcal/mol higher) to the neutral species (39); however, the carbonyl is expected to be the most probable protonation site (40), in agreement with the transient nature of the protonated nitrogen in our simulations. Due to the transient nature and very short lifetime of the HCONH3+ intermediate, the mechanism could also be approximately considered as one step. From HCONH3+ the system can evolve either toward carbon monoxide and ammonia or toward formic acid and ammonia: we focus here on the former channel and discuss the latter in next section. The dissociation of the C–N bond is accompanied by the donation of the carbon-bound proton to water. The mechanism appears fully reversible in our simulations, with backward transitions (formamide formation) featuring a barrier of 40 kcal/mol. The barrier can be flattened by a strong enough electric field as demonstrated in ref. 14. We validated the mechanism and assigned transition state structures by performing a committor analysis, as detailed in Supporting Information. We could then identify the transition state structures as those giving approximately 50% probability to fall in either basin, as shown in Fig. 4. Note how both steps of the reaction, nitrogen protonation and carbon deprotonation, are enabled by the solvent, whereas in our gas-phase simulations the very same proton was transferred from the carbon to the nitrogen overcoming a higher barrier. We also remark the striking difference between committor trajectories in solution (Fig. 4), wandering in the low-curvature landscape regions for more than about 100 fs, compared with those in gas phase (Supporting Information) resembling steepest descent relaxation pathways due to the lack of solvent collisions. We could not observe direct transitions between the basins corresponding to CO + NH3 and HCOOH + NH3: transitions pass instead close to the formamide basin, pointing to a catalytic role of ammonia in the conversion process between CO and HCOOH. Our result reinforces the important role of ammonia in prebiotic chemistry from multiple points of view, including as a reactant Table 1. Energetic differences (kcal/mol) of the main states in aqueous solution at T = 300 K, with formamide taken as a reference State HCONH2 CO + NH3 HCOOH + NH3 HCN + H2O
ΔF 0 −5 0 1
(0) (−3) (−5) (−4)
0 21 20 18
0 −26 −20 −17
Models include 30 water molecules (62 in parentheses).
Pietrucci and Saitta
for the synthesis of formamide, as a catalyst for formic acid synthesis, and as pointed out in a recent experimental study as an essential ingredient for the formation of sugar-related aldehydes from UV-irradiated water and methanol ice mixtures mimicking protostellar clouds (41). Hydrolysis of Formamide into Formic Acid and Ammonia. We pro-
ceed to study hydrolysis reactions and their inverse: HCONH2 + H2 O ⇄ HCOOH + NH3 .
This reaction channel emerged (even if it was not explicitly sought) as a formamide decomposition pathway alternative to decarbonylation in our previous liquid-phase simulation. Due to the key role of water molecules as (at least) a coreactant, we uniquely addressed the liquid-phase processes. Experimentally, at pH-neutral conditions the hydrolysis has a pseudo-first-order rate of 3.6 · 10−9 s−1 at 56 °C (42); however, the mechanism and free energy profile of the process have not been compellingly assigned. Based on static calculations including five water molecules, ref. 43 claimed that hydrolysis would proceed preferentially through a water-assisted two-step mechanism, with the formation of an amino-gem-diol intermediate CH(OH)2NH2 (by adding water to the CO bond, the rate limiting step) followed by proton transfer from one OH group to the nitrogen and simultaneous dissociation of the CN bond. Previous Car–Parrinello steered-MD simulations of the aqueous solution (44), based on the C–O distance as reaction coordinate, suggested instead a concerted one-step mechanism with direct addition of water to the carbon in a tetrahedral transition state. Theoretical and experimental results were also interpreted in terms of a zwitterion mechanism. We also recall that the Miller-like ab initio simulations in ref. 14 identified formic acid as an important prebiotic intermediate species, together with formamide, on a pathway toward the synthesis of glycine facilitated by an external electric field. In our simulations the hydrolysis of formamide preferentially follows a two-step mechanism: First, the nitrogen is protonated, forming the transient HCONH3+ cation with a barrier of ΔF p ≈ 35 kcal/mol (as discussed in the previous section). Inspection of the multiple reactive trajectories observed with metadynamics suggests that OH− addition to the carbon leads to the dissociation of the C–N bond, thus completing the two-step reaction (albeit with a quite unstable intermediate). The backward reaction is also observed simply reverting the steps. Our barrier of 35 kcal/mol is similar to the one of ref. 43 and, as shown therein, is in agreement with the available kinetic experimental data (42); Pietrucci and Saitta
Reactions Connecting Formamide with Hydrogen (Iso)cyanide. We
finally address the following reaction: HCONH2 ⇄ HCN + H2 O.
In gas phase, similar to before, we initially defined path collective variables based on the coordination patterns of formamide (reference state 1) and HCN + H2O (reference state 2). However, whereas for decarbonylation we observed trajectories interconverting reactants and products without passing through intermediates, here the simulations revealed a more complex scenario. All reactions begin by the isomerization of formamide into OHCHNH through direct hydrogen transfer from nitrogen to oxygen. Next, the isomer evolves toward HNC + H 2O, it reverts to a different OHCHNH isomer, and only at this point it reacts to form HCN + H2O: The main barriers are 45 kcal/mol for the initial internal proton transfer and 55 kcal/mol for the final dissociation giving HCN and water [to be compared with barriers of 42 and 59 kcal/mol, respectively, from zero temperature calculations at CCSD(T) + ZPE level (10) and similarly at B3LYP level (46)]. The gas-phase scenario is of limited interest here (except for comparison with previous zero-temperature calculations): its detailed analysis is in Supporting Information. In solution the interconversion between formamide and HCN is still a multistep process (Fig. 5). The first step consists again of the formation of OHCHNH, however, with the nitrogen-to-oxygen hydrogen transfer occurring
Fig. 5. Free energy landscape for reactions in aqueous solution between formamide (region A), OHCHNH (region B), and HCN (region C). Representative atomic configurations are shown as insets.
PNAS | December 8, 2015 | vol. 112 | no. 49 | 15033
Fig. 4. Examples of unbiased trajectories observed within committor analysis and representative structures of transition states for reactions among formamide, CO + NH3, and HCOOH + NH3.
however, we remark how our two-step mechanism is different from the one proposed from the calculations in ref. 43 because it does not pass through an amino-gem-diol intermediate (that we sporadically also observe off-pathway). The different result in ref. 43 could be related to the simplified model of solution (including five water molecules) used by the authors. On the other hand, they adopted a theoretical approach for the quantum mechanical treatment of electrons, which is generally more accurate than the DFT approach used in the present work. The one-step mechanism proposed from Car–Parrinello simulations in ref. 44 is also unlikely because it features a higher barrier (45 kcal/mol), possibly due to the simple collective variable used by the authors, the C–O distance, that includes far fewer degrees of freedom than our topological path collective variables. Note that in ref. 45, metadynamics made it possible to explain the experimental barrier [21 kcal/mol (42)] for base-catalized hydrolysis, namely, OH− attack without prior protonation of the nitrogen. We validated the metadynamics scenario with a committor analysis, resulting in the transition state structures displayed in Fig. 4.
Fig. 6. Examples of unbiased trajectories observed within committor analysis and representative structures of transition states for reactions between OHCHNH and HCN. The orange contour indicates atoms belonging to the reacting OHCHNH molecule.
through a proton chain in the solvent, sizably reducing the barrier to only 25 kcal/mol. Next, we observed a number of facile interconversions between the four different OHCHNH isomers, again thanks to proton exchanges with the solvent rather than dihedral rotations (which, however, are also present). HCN is then obtained overcoming a barrier of about 20 kcal/mol (or 45 kcal/mol taking formamide as a reference), the reverse barrier being about 44 kcal/mol. For comparison, barrier heights computed in gasphase clusters including three or four water molecules at zero temperature and CCSD(T) + ZPE level are 21 kcal/mol from formamide to OHCHNH, similar to our value of 25 kcal/mol, and 35 kcal/mol from OHCHNH to HCN, sizably larger than our value of 20 kcal/mol; additionally, HCN is destabilized by 14 kcal/mol with respect to our results (or by 19 kcal/mol considering our larger 62-waters simulation box) (17). These findings show the importance of realistic models of solutions and of the fully anharmonic dynamics at finite temperature. According to committor analysis, the crucial reaction step is a proton transfer from the NH moiety to a water molecule, followed by the dissociation of the OH unit of the isomer and its protonation by the solvent to generate a water molecule (Fig. 6). The process occurs through a hexagonal ring with the participation of three water molecules plus the newly generated one. The free energy difference between formamide and HCN (plus water) is small—as confirmed also using a larger 62-waters model (Table 1)—thanks to the favorable entropic term canceling out the unfavorable energetic one. The relative stability emerging from our simulations is in reasonable agreement with an experiments-based model of primitive oceans giving, at pH = 7, a free energy difference of only 4 kcal/mol (47). This fact may be interpreted in the sense of both molecules having been largely available, given appropriate conditions, for possible prebiotic processes. HCN appears as weakly hydrophilic, engaging on average in 0.5 hydrogen bonds with water and featuring a first solvation shell of radius 2.3 Å. Occasionally, overcoming a small barrier, HCN is also deprotonated to cyanide ion, at higher values of the z-path coordinate. At variance with the gas phase, in solution, HNC is not locally stable, occasionally forming but then immediately disappearing within a time scale of ∼ 0.1 ps. This observation is consistent with HNC being mainly found in interstellar clouds (48) and only in gas-phase laboratory experiments (49), and it could be interpreted in terms of HNC having a much more acidic proton than HCN.
15034 | www.pnas.org/cgi/doi/10.1073/pnas.1512486112
An interesting question is how temperature and pressure modify prebiotic reaction networks, because different scenarios are characterized by different boundary conditions. As a first step in this direction we repeated gas-phase and liquid solution simulations at T = 400 K, keeping identical collective variables and simulation setup. The corresponding reaction landscapes (reported in Supporting Information) show that in gas phase the effect on free energy differences is limited to within 3 kcal/mol, thus compatible with our statistical error bar of ±2 kcal/mol, whereas in solution both the transition state between OHCHNH and HCN and the local minimum of HCN are destabilized by 10 kcal/mol in passing from 300 to 400 K. The destabilization of HCN compared with formamide is traced back to an increase in ΔE (from 18 to 25 kcal/mol) concomitant with a decrease in ΔS [from 57 to 38 cal/ (mol K)]. The latter variation is consistent with an increased freedom of the formamide molecule and decreased freedom of HCN: the average number of solute–solvent hydrogen bonds passes from 4.5 to 3.6 for formamide and from 0.5 to 0.9 for HCN. The sizable deviation from a temperature extrapolation at rigid ΔE and ΔS strongly supports the usefulness of our approach and, more generally, of MD simulations. Conclusions In conclusion, we have developed a novel approach which allows the study of chemical reactivity in gas phase and in liquid phase on the same footing, through the definition of simple, intuitive, and transferable reaction coordinates. In combination with state-of-theart free energy calculation methods such as metadynamics, the new coordinates allow us to discover relevant reaction mechanisms and reconstruct the corresponding free energy landscapes, which can be directly compared between phases and/or at different thermodynamical conditions. Finally, detailed inspection of trajectories and committor analysis allow us to compellingly identify transition states and reaction pathways and mechanisms. We apply this new approach to the emblematic case of formamide, a centerpiece of many prebiotic scenarios recently put forward to explain the chemical origins of life on Earth. On one hand, our method is capable of quantitatively reproducing existing gasphase results. Much more importantly, liquid-phase calculations— for which our approach is a major methodological step forward—of all three fundamental reaction channels, i.e., dehydration, decarbonylation, and hydrolysis, provide novel results of prebiotic significance and implications. In fact, previous gas-phase studies (even including few water molecules) suggested formamide as thermodynamically much more stable than hydrogen cyanide, a fundamental prebiotic molecule and a precursor in crucial biochemical syntheses. Within the approximations of this work, namely, the DFT level treatment of the electronic problem and the classical treatment of protons, we show that formamide and hydrogen cyanide in explicit solution have instead a comparable stability and relatively facile interconversion, as pointed out experimentally by Miller and coworkers (47). Our results thus suggest the formamide-based and the hydrogen cyanide-based prebiotic scenarios to be compatible, contributing to settle a recent high-pitched debate on the question. Finally, our uninformed reaction network exploration unveils ammonium formate as a major species in formamide decarbonylation/ hydrolysis, of comparable stability with respect to both formamide and hydrogen cyanide, and thus supporting its major role among fundamental prebiotic actors. ACKNOWLEDGMENTS. The authors thank G. Cassone, F. Guyot, S. Laporte, F. Saija, and R. Vuilleumier for discussion, insights, and support; the Grand Equipement National de Calcul Intensif (GENCI) French National Supercomputing Facility for computer time (project Grant 91387); and the Université Pierre et Marie Curie Computational Mesocenter.
Pietrucci and Saitta
Pietrucci and Saitta
28. Giannozzi P, et al. (2009) QUANTUM ESPRESSO: A modular and open-source software project for quantum simulations of materials. J Phys Condens Matter 21(39):395502. 29. Bussi G, Donadio D, Parrinello M (2007) Canonical sampling through velocity rescaling. J Chem Phys 126(1):014101. 30. Laio A, Gervasio FL (2008) Metadynamics: A method to simulate rare events and reconstruct the free energy in biophysics, chemistry and material science. Rep Prog Phys 71(12):126601. 31. Bonomi M, et al. (2009) PLUMED: A portable plugin for free-energy calculations with molecular dynamics. Comput Phys Commun 180(10):1961–1972. 32. Branduardi D, Gervasio FL, Parrinello M (2007) From A to B in free energy space. J Chem Phys 126(5):054103. 33. Branduardi D, De Vivo M, Rega N, Barone V, Cavalli A (2011) Methyl phosphate dianion hydrolysis in solution characterized by path collective variables coupled with dft-based enhanced sampling simulations. J Chem Theory Comput 7(3):539–543. 34. Gallet G, Pietrucci F, Andreoni W (2012) Bridging static and dynamical descriptions of chemical reactions: An ab initio study of co2 interacting with water molecules. J Chem Theory Comput 8(11):4029–4039. 35. Saitta AM, Saija F, Pietrucci F, Guyot F (2015) Reply to Bada and Cleaves: Ab initio free-energy landscape of Miller-like prebiotic reactions. Proc Natl Acad Sci USA 112(4): E343–E344. 36. Pietrucci F, Andreoni W (2014) Fate of a graphene flake: A new route toward fullerenes disclosed with ab initio simulations. J Chem Theory Comput 10(3):913–917. 37. Barducci A, Bussi G, Parrinello M (2008) Well-tempered metadynamics: A smoothly converging and tunable free-energy method. Phys Rev Lett 100(2):020603. 38. Sprik M (2000) Computation of the pK of liquid water using coordination constraints. Chem Phys 258(2-3):139–150. 39. Iglesias E, Montenegro L (1996) Basicity of amides in water and in aqueous micellar solutions of SDS. Their influence on micellar structure. J Chem Soc Faraday Trans 92(7):1205–1212. 40. Chaudhuri C, Jiang JC, Wu C-C, Wang X, Chang H-C (2001) Characterization of protonated formamide-containing clusters by infrared spectroscopy and ab initio calculations. II. Hydration of formamide in the gas phase. J Phys Chem A 105(39): 8906–8915. 41. de Marcellus P, et al. (2015) Aldehydes and sugars from evolved precometary ice analogs: Importance of ices in astrochemical and prebiotic evolution. Proc Natl Acad Sci USA 112(4):965–970. 42. Slebocka-Tilk H, Sauriol F, Monette M, Brown R (2002) Aspects of the hydrolysis of formamide: Revisitation of the water reaction and determination of the solvent deuterium kinetic isotope effect in base. Can J Chem 80(10):1343–1350. 43. Gorb L, Asensio A, Tuñón I, Ruiz-López MF (2005) The mechanism of formamide hydrolysis in water from ab initio calculations and simulations. Chemistry 11(22): 6743–6753. 44. Cascella M, Raugei S, Carloni P (2004) Formamide hydrolysis investigated by multiplesteering ab initio molecular dynamics. J Phys Chem B 108(1):369–375. 45. Blumberger J, Ensing B, Klein ML (2006) Formamide hydrolysis in alkaline aqueous solution: Insight from Ab initio metadynamics calculations. Angew Chem Int Ed Engl 45(18):2893–2897. 46. Wang J, Gu J, Nguyen MT, Springsteen G, Leszczynski J (2013) From formamide to purine: An energetically viable mechanistic reaction pathway. J Phys Chem B 117(8): 2314–2320. 47. Miyakawa S, Cleaves HJ, Miller SL (2002) The cold origin of life: A. Implications based on the hydrolytic stabilities of hydrogen cyanide and formamide. Orig Life Evol Biosph 32(3):195–208. 48. Graninger DM, Herbst E, Öberg KI, Vasyunin AI (2014) The HNC/HCN ratio in starforming regions. Astrophys J 787(11):74–85. 49. Ferus M, et al. (2011) HNC/HCN ratio in acetonitrile, formamide, and BrCN discharge. J Phys Chem A 115(10):1885–1899. 50. Rappe AM, Rabe KM, Kaxiras E, Joannopoulos JD (1990) Optimized pseudopotentials. Phys Rev B Condens Matter 41(2):1227–1230. 51. Luzar A, Chandler D (1996) Effect of environment on hydrogen bond dynamics in liquid water. Phys Rev Lett 76(6):928–931.
PNAS | December 8, 2015 | vol. 112 | no. 49 | 15035
1. Darwin CD (1871) Letter to Joseph Hooker . Correspondence of Charles Darwin 19:188–189. 2. Pilling S, Baptista L, Boechat-Roberty HM, Andrade DP (2011) Formation routes of interstellar glycine involving carboxylic acids: Possible favoritism between gas and solid phase. Astrobiology 11(9):883–893. 3. Kahane C, Ceccarelli C, Faure A, Caux E (2013) Detection of formamide, the simplest but crucial amide in a solar type protostar. Astrophys J 763(2):L38. 4. Lambert J-F (2008) Adsorption and polymerization of amino acids on mineral surfaces: A review. Orig Life Evol Biosph 38(3):211–242. 5. Bar-Nun A, Bar-Nun N, Bauer SH, Sagan C (1970) Shock synthesis of amino acids in simulated primitive environments. Science 168(3930):470–473. 6. Ferus M, et al. (2015) High-energy chemistry of formamide: A unified mechanism of nucleobase formation. Proc Natl Acad Sci USA 112(3):657–662. 7. Barks HL, et al. (2010) Guanine, adenine, and hypoxanthine production in UV-irradiated formamide solutions: Relaxation of the requirements for prebiotic purine nucleobase formation. ChemBioChem 11(9):1240–1243. 8. Saladino R, et al. (2015) Meteorite-catalyzed syntheses of nucleosides and of other prebiotic compounds from formamide under proton irradiation. Proc Natl Acad Sci USA 112(21):E2746–E2755. 9. Martins Z, Proce M, Goldman N, Sephton M, Burchell M (2013) Shock synthesis of amino acids from impacting cometary and icy planet surface analogues. Nat Geosci 6: 1045–1049. 10. Nguyen VS, et al. (2011) Theoretical study of formamide decomposition pathways. J Phys Chem A 115(5):841–851. 11. Maeda S, Harabuchi Y, Ono Y, Taketsugu T, Morokuma K (2015) Intrinsic reaction coordinate: Calculation, bifurcation, and automated search. Int J Quantum Chem 115(5):258–269. 12. Saladino R, Botta G, Pino S, Costanzo G, Di Mauro E (2012) Genetics first or metabolism first? The formamide clue. Chem Soc Rev 41(16):5526–5565. 13. Saladino R, Crestini C, Pino S, Costanzo G, Di Mauro E (2012) Formamide and the origin of life. Phys Life Rev 9(1):84–104. 14. Saitta AM, Saija F (2014) Miller experiments in atomistic computer simulations. Proc Natl Acad Sci USA 111(38):13768–13773. 15. Nitzan A (2006) Chemical Dynamics in Condensed Phases (Oxford Univ Press, Oxford). 16. Orr-Ewing AJ (2014) Perspective: Bimolecular chemical reaction dynamics in liquids. J Chem Phys 140(9):090901. 17. Nguyen VS, Orlando TM, Leszczynski J, Nguyen MT (2013) Theoretical study of the decomposition of formamide in the presence of water molecules. J Phys Chem A 117(12):2543–2555. 18. Guido CA, Pietrucci F, Gallet GA, Andreoni W (2012) The fate of a zwitterion in water from ab initio molecular dynamics: Monoethanolamine (MEA)-CO2. J Chem Theory Comput 9(1):28–32. 19. Ma C, Pietrucci F, Andreoni W (2014) Capturing CO2 in monoethanolamine (MEA) aqueous solutions: Fingerprints of carbamate formation assessed with first-principles simulations. J Phys Chem Lett 5(10):1672–1677. 20. Laio A, Parrinello M (2002) Escaping free-energy minima. Proc Natl Acad Sci USA 99(20):12562–12566. 21. Bolhuis PG, Chandler D, Dellago C, Geissler PL (2002) Transition path sampling: throwing ropes over rough mountain passes, in the dark. Annu Rev Phys Chem 53: 291–318. 22. Geissler PL, Dellago C, Chandler D (1999) Kinetic pathways of ion pair dissociation in water. J Phys Chem B 103(18):3706–3710. 23. Ensing B, De Vivo M, Liu Z, Moore P, Klein ML (2006) Metadynamics as a tool for exploring free energy landscapes of chemical reactions. Acc Chem Res 39(2):73–81. 24. Prasad BR, Plotnikov NV, Warshel A (2013) Addressing open questions about phosphate hydrolysis pathways by careful free energy mapping. J Phys Chem B 117(1): 153–163. 25. Perdew JP, Burke K, Ernzerhof M (1996) Generalized gradient approximation made simple. Phys Rev Lett 77(18):3865–3868. 26. Grimme S (2006) Semiempirical GGA-type density functional constructed with a longrange dispersion correction. J Comput Chem 27(15):1787–1799. 27. Barone V, et al. (2009) Role and effective treatment of dispersive forces in materials: Polyethylene and graphite crystals as test cases. J Comput Chem 30(6):934–939.
Supporting Information Pietrucci and Saitta 10.1073/pnas.1512486112 SI Text In Born–Oppenheimer molecular dynamics simulations we used ultrasoft pseudopotentials (50) (included in the quantum espresso distribution) and a plane-wave cutoff of 35 Ry for the expansion of Kohn–Sham orbitals. We used a convergence criterion of 10−7 on the wave functions and a time step of 0.48 fs, replacing hydrogen masses with deuterium ones. We adopted the following parameters for the coordination function in Eq. 3: N = 6, M = 12, R0SS′ = 1.8 Å for S, S′ = C, O, N, 1.5 Å for S = C, O, N, S′ = H, and 1.4 Å for S = S′ = H. In each simulation we chose the parameter λ such that λDðRk , Rk+1 Þ ≈ 2.3. We verified that variations within a factor 2 do not alter significantly the free energy landscapes. The metadynamics potential was composed of Gaussians with widths σ s = 0.025 and σ z = 0.05 and height 3 kcal/mol deposed every 50 fs. In addition to regular (fixed-height) metadynamics, we also repeated simulations within the well-tempered scheme (37), with a bias factor equal to 60 kcal/mol in gas phase and 30 kcal/mol in liquid phase. We repeated each simulation at least three times, changing the Gaussian parameters and/or the bias factor: by comparing instantaneous free energy landscapes at (or beyond) filling time with time averages of landscapes beyond filling time, we found differences not larger than 4 kcal/mol (which we adopt as an estimate of our statistical precision). In all simulations we restricted the maximum z values explored by employing a semi-parabolic wall potential. To assess the magnitude of possible finite-system size effects we repeated liquid-phase simulations in a cell including CHONH2 + 30 water molecules or + 62 water molecules. The comparison of the resulting free energy landscapes is reported in Figs. S1 and S2 for the different reaction channels. The differences in free
Pietrucci and Saitta www.pnas.org/cgi/content/short/1512486112
energy are within 5 kcal/mol for the relevant points in the landscape (minima and saddles). Potential energy contributions to free energy differences are estimated from the average Kohn–Sham energy in 60-ps-long equilibrium (unbiased) MD simulations in each metastable state, discarding the first 10 ps as equilibration. In the case of the larger model (including 62 water molecules) the simulation length is 40 ps. In Figs. S3 and S4 we report the free energy landscapes for gas phase reactions, obtained by including more than only the reactant and product coordination patterns in the construction of the topological path collective variables. Further reference structures are extracted from a reactive trajectory obtained in a preceding simulation, such as to be equally spaced with respect to the topological metric (Eq. 3). This procedure allows us to follow more closely a given reaction path and, in the case of gas-phase simulations of Figs. S3 and S4, facilitated the statistical convergence of the metadynamics free energy landscape. The committor analysis has been performed in the following manner. In complex many-particle systems like solutions, transition states do not typically coincide with saddle points on the full-dimensional potential energy hypersurface, whereas their generally applicable definition coincides with the separatrix (dynamical bottleneck in configuration space), which can be found in practice by shooting many unbiased trajectories from a putative structure and checking whether they fall into A and B with equal probability (21, 22). We therefore extracted from the interbasin region 30 configurations along the reaction pathways, and from each one we shot 40 unbiased MD trajectories of 250 fs each. We identify a configuration as belonging to the transition state ensemble when it is committed to either basin with a probability of 50 ± 10%.
1 of 7
Fig. S1. Comparison of free energy landscapes at T = 300 K computed using (A) a smaller (30 water molecules) or (B) a larger (62 water molecules) simulation box for transitions between formamide (region A), CO + NH3 (region B), and HCOOH + NH3 (region C).
Pietrucci and Saitta www.pnas.org/cgi/content/short/1512486112
2 of 7
Fig. S2. Comparison of free energy landscapes at T = 300 K computed using (A) a smaller (30 water molecules) or (B) a larger (62 water molecules) simulation box for transitions between formamide (region A), OHCHNH (region B), and HCN + H2O (region C). (C) Free energy landscape at T = 400K (including 30 water molecules).
Pietrucci and Saitta www.pnas.org/cgi/content/short/1512486112
3 of 7
Fig. S3. Gas-phase free energy landscape at T = 300 K for the reaction between formamide (region A) and CO + NH3 (region B). The topological path collective variables are defined starting from six different patterns of coordination, as sampled from a reactive trajectory in the simulation reported in Fig. 2. Representative committor analysis trajectories are shown as colored lines superimposed to the free energy landscape.
Pietrucci and Saitta www.pnas.org/cgi/content/short/1512486112
4 of 7
Fig. S4. Gas-phase free energy landscape at T = 300 K (A) for the reaction between formamide (region A) and OHCHNH (region B) and (B) for the reaction between another OHCHNH isomer (region B′) and HCN + H2O (region C). The topological path collective variables are defined starting from two different patterns of coordination in the first case (regions A and C) and from five different patterns in the second case (between regions B′ and C, sampled from a reactive trajectory obtained with the same setup as in A).
Pietrucci and Saitta www.pnas.org/cgi/content/short/1512486112
5 of 7
Fig. S5. Gas-phase free energy landscape at T = 400 K (A) for the reaction between formamide (region A) and OHCHNH (region B) and (B) for the reaction between another CHOHNH isomer (region B′) and HCN + H2O (region C). To be compared with Fig. S4.
Table S1. Solution structure around formamide and its decomposition products, obtained analyzing equilibrium MD simulations of length 30 ps State
Solute–water H bonds
HCONH2 CO HCOOH NH3 HCN
4.5± 0.7 0.0 5.0± 1.2 2.9± 1.0 0.5± 0.6
Radius of solvation shell 1.8 3.5 1.6 1.7 2.3
Å Å Å Å Å
The first 5 ps are discarded as equilibration. The average number of solute–water hydrogen bonds is computed according to ref. 51, whereas the solvation shell radius corresponds to the first maximum of the solute–water all-atom pair correlation function.
Table S2. Relative free energy (kcal/mol) of relevant states for the reactions in gas phase of Figs. S4 and S5 at different T Temperature T = 300 K T = 400 K
The statistical uncertainty is ±2 kcal/mol.
Pietrucci and Saitta www.pnas.org/cgi/content/short/1512486112
6 of 7
Table S3. Relative free energy (kcal/mol) of relevant states for the reactions in solution of Fig. S2 A and C at different T Temperature
T = 300K T = 400K
The statistical uncertainty is ±2 kcal/mol.
Table S4. Differences (kcal/mol) corresponding to the reaction HCONH2 → HCN + H2O in solution for different model sizes and temperatures System size and temperature
30-waters T = 300K 30-waters T = 400K 62-waters T = 300K
1 10 −4
18 25 20
−17 −15 −24
Free energy differences are directly obtained from metadynamics, whereas energy differences are directly obtained from equilibrium molecular dynamics; in both cases the statistical uncertainty is ±2 kcal/mol. The entropic contribution is estimated as the difference between the first two columns; therefore, its statistical uncertainty is at least ±3 kcal/mol.
Pietrucci and Saitta www.pnas.org/cgi/content/short/1512486112
7 of 7