## Abstract

Electron transfer (ET) reactions in biomolecular systems represent an important class of processes at the interface of physics, chemistry and biology. The theoretical description of these reactions constitutes a huge challenge because extensive systems require a quantum-mechanical treatment and a broad range of time scales are involved. Thus, only small model systems may be investigated with the modern density functional theory techniques combined with non-adiabatic dynamics algorithms. On the other hand, model calculations based on Marcus's seminal theory describe the ET involving several assumptions that may not always be met. We review a multi-scale method that combines a non-adiabatic propagation scheme and a linear scaling quantum-chemical method with a molecular mechanics force field in such a way that an unbiased description of the dynamics of excess electron is achieved and the number of degrees of freedom is reduced effectively at the same time. ET reactions taking nanoseconds in systems with hundreds of quantum atoms can be simulated, bridging the gap between non-adiabatic *ab initio* simulations and model approaches such as the Marcus theory. A major recent application is hole transfer in DNA, which represents an archetypal ET reaction in a polarizable medium. Ongoing work focuses on hole transfer in proteins, peptides and organic semi-conductors.

## 1. Introduction

Electron transfer (ET) constitutes an important class of chemical reactions in biomolecular complexes and has been an object of intensive research [1]. These reactions are found at the heart of fundamental biological processes, such as respiration [2], photosynthesis [3], and radiative damage and repair of DNA [4]. The biochemical ET occurs over large distances, often in a heterogeneous medium composed of different kinds of molecules and on a long time scale. All of these features make both the experimental and the computational studies of ET particularly demanding.

The classical tool to describe the ET in complex molecular and biomolecular systems is the Marcus theory [5,6] and its extensions [7–11]. To evaluate the rate of transfer between an electron donor and an acceptor, three parameters are needed: the reaction free energy *Δ**G*, the electronic coupling *V* and the reorganization energy (RE) *λ*. *V* can be calculated with quantum chemistry (QC) [12], and, nowadays, also with highly accurate methods [13]. Particularly cost-efficient methods are based on the fragment molecular orbital (FMO) approaches [14–18] and on constrained density functional theory (DFT) [19,20]. These allow *V* to be calculated along nanosecond molecular dynamics (MD) trajectories [21,22]. In addition, effective sampling of the configuration space is needed to evaluate *Δ**G* and *λ*, which are thermodynamic quantities. A rather accurate estimation of these parameters is required because of their appearance in the argument of an exponential, and the RE is particularly challenging to determine. It can be decomposed into two parts: the inner sphere RE *λ*_{i}, which is also sensitive to quantum effects [23], and the solvent contribution to the outer sphere RE *λ*_{s}, which can be calculated with classical MD simulations [24–27]; the effects of several structural and dynamic factors on *λ*_{s} have been discussed previously [28,29].

The application of the standard Marcus theory is based on several assumptions, as follows. (i) The electronic structures of the electron donor and of the electron acceptor are known, including the information on the de/localization of the charge. (ii) ET occurs significantly more slowly than the structural dynamics of the molecular system.^{1} (iii) The mechanism of transfer is known (adiabatic or non-adiabatic, solvent-controlled, etc.). A good indicator may be the relation of *V* and *λ* [35], although the distinction may not always be easy to make use of [23].

An alternative solution is to perform an unbiased simulation of ET using a non-adiabatic MD scheme and high-level electronic structure calculations [36], yet such methods are limited to several tens of atoms and time intervals of up to a picosecond. Stepping down the ladder of accuracy of QC methods, the DFT may be able to describe slightly larger systems, still being restricted to a picosecond time scale [37–39]. Even worse, semi-empirical QC methods cannot treat systems with hundreds of atoms on nanosecond time scales either, and this has been possible only with more approximate methods [40] so far.

In order to make a rigorous description of such processes practicable, we have developed a novel multi-scale computational scheme, where the number of degrees of freedom is reduced carefully yet an unbiased description of the ET reaction in the spirit of non-adiabatic QC-MD schemes is still provided. The computational scheme is based on a twofold combination of quantum-mechanical (QM) methods with empirical force fields and linear scaling approaches: (i) A QM region is defined, and the remainder of the system will be described with molecular mechanics (MM), with interaction between the regions to be treated with a standard QM/MM coupling. (ii) An FMO approach is applied in the QM region, which leads to a linear scaling of the computational time with the system size. (iii) Only the frontier orbitals (highest-occupied molecular orbitals (HOMOs) or lowest-unoccupied molecular orbitals (LUMOs)) are considered in the QM treatment, while the remainder of the electronic structure is covered by MM; this is similar to the simple Hamiltonians in the Pariser–Parr–Pople (PPP) or Hückel molecular orbital (HMO) methods. Since an approximate total energy expression for the combined system is derived, the atomic forces can be computed and the standard semi-classical techniques can be applied to propagate the nuclear and electronic degrees of freedom simultaneously, such as the mean-field (MF) Ehrenfest or surface-hopping (SH) methods.

This resulting computational scheme is cost-efficient and allows the electronic and nuclear degrees of freedom to be propagated for several hundreds of QM atoms simultaneously. We have applied this method to the problems of hole/ET in DNA, proteins, peptides and organic materials, which will be presented briefly after reviewing the methodology.

## 2. Methodology

The computational scheme for the simulation of ET is derived from an expression for total energy within DFT. A favourable computational efficiency is achieved by the introduction of several approximations, among which the following ones are prominent:

— use of a linear scaling ansatz for the QC problem, by means of an FMO approach;

— efficient calculation of the FMOs with a semi-empirical DFT scheme, self-consistent charge/density functional tight binding (SCC-DFTB) [41];

— application of a hybrid QM/MM scheme to treat the influence of the environment; and

— a combination of QC and MM calculations on yet another level, treating only selected (frontier) orbitals quantum mechanically while the remainder of the system is covered by a force field.

These approximations are discussed in the first section below. The following sections concern the total energy expression, the calculation of energy derivatives, the effects that are described effectively or neglected in the approach, and the QM propagation schemes applied.

### 2.1. Linear combination of fragment orbitals

The QC problem can be solved very efficiently, provided it is possible to decompose the molecular system naturally into a set of *M* spatially non-overlapping fragments, for which separate QC calculations can be performed. This idea is close to that of the FMO approach of Kitaura *et al.* [42]. An example is the purine nucleobases for the hole transfer in DNA (figure 1). Then, the computational problem is reduced to the determination of the molecular orbitals on each of the *M* fragments (indexed as *m*)
2.1where the *i*th molecular orbital (FMO) of fragment *m* is expressed in an atomic basis set *χ*_{μ} with a set of FMO coefficients In the case of DNA, these fragments are the nucleobases over which the excess charge is considered to travel. In the case of proteins, these may be important amino acid side chains and parts of the protein backbone. The methodology may also be applied to organic materials as these consist of individual molecules, which can be treated as fragments.

To solve for the FMOs in equation (2.1), Hartree–Fock (HF) or DFT methods can be used. However, the application of semi-empirical QC methods can speed up these calculations by as much as three orders of magnitude. This is an essential step as it allows the FMOs to be calculated along the MD trajectories in the nanosecond regime, whereas the HF and DFT approaches would be limited to a picosecond range. In our work, the QC calculations are based on the SCC-DFTB method [41]. Evidently, the use of such approximate methods requires a thorough benchmarking, which has been done for hole transfer in DNA in our initial work [16]. It was shown that SCC-DFTB provides orbitals in close agreement with full DFT calculations. Also, the charge-transfer (CT) parameters—the on-site energies *ɛ*_{m} and electronic couplings *H _{mn}* (see below)—agree with those provided by the higher level DFT and wave function methods.

To save computer time, only the region where the charge can localize is treated with QC, while the entire remainder of the system is described with MM. In the case of hole transfer in DNA, the QC part contains the purine nucleobases (guanines and adenines), while the MM part contains the other nucleobases, the sugar–phosphate backbones as well as the solvent (water molecules and counter-ions). The effect of the MM environment is taken into account using a QM/MM approach. To this end, the calculation of FMOs in equation (2.1) includes the electric field induced by the MM environment, represented by point charges, as an external potential. Also, the electrostatic interactions *between* the individual fragments (i.e. between the purine nucleobases) is only included via point-charge electrostatics. Thus, the MOs are constrained to be localized in the pre-determined region, and their spatial support cannot change in the course of the simulation, regardless of any geometry changes.

Actually, only the excess electron or hole is treated quantum mechanically, while the entire remainder of the system (i.e. the charge-neutral variant of the molecular system) is described with an MM force field. This is comparable to the HMO [43,44] and PPP [45,46] models, in which a *π* – *σ* separation is assumed: only the *π*-electrons are treated with QC while the *σ*-electrons are considered to be condensed to the nuclei with classical force-field terms. The approximation introduced here proceeds even further, by considering the quantum system as consisting of a single electron or hole. The basis set for the expansion of the wave function of excess electron is constituted by the FMOs. In the simplest form, the wave function of the excess electron consists of a superposition of the LUMOs of the fragments (*i* = LUMO in ), and the wave function *Ψ* of the hole is built from the HOMOs
2.2where *a _{m}* are the expansion coefficients in the basis of the HOMOs. Thus, it is assumed that the lowest unoccupied orbitals of the complex are linear combinations of the LUMOs of the individual fragments, or that the highest occupied orbitals are linear combinations of the HOMOs of fragments (for hole transfer). This assumption is confirmed for hole transfer in DNA in figure 2: the energies of the highest occupied orbitals in a stack of adenine molecules obtained with a diagonalization of the ‘coarse-grained’ Hamiltonian in the FMO approach (introduced below in equation (2.9)) are compared with those yielded by full atomistic SCC-DFTB calculations. These sets of orbital energies match convincingly, and the deviations do not exceed 0.01 eV. Furthermore, a visual inspection of the orbitals reveals that the five highest occupied orbitals of the stacked complex of six adenines represent superpositions of the weakly coupled HOMOs of the individual adenines.

Note that also further orbitals of the fragments (HOMO–1, HOMO–2, etc. or LUMO+1, LUMO+2, etc.) can be included if required for a more accurate description.

### 2.2. Approximate total energy expression for the hybrid FMO/MM approach

The starting point is the expression for the DFT total energy of the system with *N* – 1 electrons, i.e. a system which contains a hole (radical cation),^{2} with electron density *ρ*. The first step towards an efficient method is to expand this energy around the density *ρ*_{0} of the neutral system with *N* electrons up to the second order, using the differential density *δρ* = *ρ* − *ρ*_{0}
2.3where *n _{i}* are the occupation numbers of the Kohn–Sham orbitals

*Ψ*

_{i}(with

*n*

_{HOMO}= 1 and

*n*= 2 otherwise),

_{i}*E*

_{DC}[

*ρ*

_{0}] is the DFT double-counting contribution [47], and

*E*

^{2nd}is a contribution that is of second order in the differential density and will be analysed in detail later. Here, the aim is to express the energy explicitly as a sum of the form 2.4which will require further approximations.

#### 2.2.1. The Koopmans theorem and the zeroth-order term

If we assume the orbitals of the charged system *Ψ** _{i}* to be identical to those of the neutral system
2.5then we may rearrange the energy expression to obtain
2.6where is the energy of the

*N*-electron, charge-neutral system. This expression has the form of equation (2.4).

Equation (2.5) looks like a drastic approximation at first sight. For discussion, let us assume the wave function of hole to be localized on the fragment *m* completely so that *Ψ*_{HO} = *φ*_{m}. We may expect three kinds of changes upon ionization: (a) the shape of the orbital *φ*_{m} will change, (b) the orbitals on the fragment *m* (indexed as *k*) will mix (rotate), and (c) the orbital energies will change.

Regarding (a), it has to be kept in mind that relatively large molecular fragments are usually considered. Then, the excess charge is distributed over many atoms as the HOMO/LUMO is quite delocalized most likely. In such a case, the frontier orbitals are unlikely to change much upon ionization. Whenever the excess charge is localized on a few atoms in a fragment, the approximation of preserved frontier orbitals may become problematic; then, a third-order expansion may account, for example, for the spatial expansion of the electron density in small anions with respect to the neutral molecules, as discussed for SCC-DFTB recently [48]. Also, it was shown that, after removing an electron from the HOMO, this orbital remains static and the charge migration is driven by lower lying orbitals [49].

We have tested this approximation for the nucleobase guanine, which is the dominant hole-carrying fragment in DNA. As a measure of similarity of orbitals in the radical cation molecule and those in the neutral one, the overlap of these orbitals obtained on the DFT-level B3LYP/def2-TZVP was evaluated (see figure 3 for the results). The HOMO is the same in the neutral and ionized molecule (the overlap is 0.99 and 0.999 for the α- and β-orbitals of the radical cation, respectively). On a few occasions, there is some rotation of lower lying orbitals and changes of their energy order. Therefore, the response to taking an electron out of the HOMO occurs in lower lying orbitals as expected (b). The effect of charging on the orbital energy will be covered by the second-order term, as discussed later, while this term will include also any effects of the kinds (b) and (c) in an effective way. Also, note that the electronic structure obtained with SCC-DFTB agrees well with higher level results [16].

Equation (2.6) can be interpreted as follows. The (*N* – 1)-electron energy is approximated to the zeroth order by the *N*-electron energy *E ^{N}*[

*ρ*

_{0}], corrected to the first order by subtracting the energy of one electron occupying the HOMO. This is in the spirit of the Koopmans theorem completely. The second-order term represents the change of electron–electron interaction upon the removal of the electron, discussed further below, and it will take the form of a Hubbard-type contribution to the on-site energy owing to the change of charge on the fragment. In this way, it will cover the electronic relaxation effects, i.e. points (a), (b) and (c), effectively.

Since *E ^{N}*[

*ρ*

_{0}] denotes the energy of the charge-neutral,

*N*-electron system, it can be computed using HF or DFT. But, it can be represented by any other total energy expression, and, aiming at a computationally efficient scheme, this energy will be obtained with an empirical force field 2.7Here, we see the importance of passing from equation (2.3) to the form of equation (2.4): we can represent a large part of the energy by the energy of the empirical force field.

^{3}

#### 2.2.2. First-order term

The first-order term in equation (2.6) is expanded in the FMO functions according to equation (2.2)
2.8leading to the FMO Hamiltonian matrix . From now on, we consider one orbital *φ*_{m} (HOMO or LUMO) per fragment, instead of . It is possible (and required for certain applications) to include multiple orbitals per fragment, and this will be discussed later.

The Hamiltonian matrix is evaluated easily using the FMO expansion (equation (2.1)) as
2.9where the double sum runs over the relevant atomic-orbital-like basis functions, and is the Hamiltonian in this atomic orbital (AO) basis, which is built for the current geometry of the molecular system. This matrix is easy to set up, as it requires only SCC-DFTB calculations to be performed on the fragments *m* and *n* to determine and respectively, which then enter the sum in equation (2.9) simply. As described earlier, the AO Hamiltonian is set up with SCC-DFTB for the current geometry of the molecular system. These integrals are interpolated from values that have been pre-calculated and tabulated, therefore the set-up of is not time-consuming at all. Furthermore, the AO Hamiltonian contains the interaction with the environment via a QM/MM coupling; the whole approach is described in [16].

Then, the set of FMOs is orthogonalized with the scheme proposed by Löwdin [50], simplifying the following propagation of the wave function, as described later. The Hamiltonian matrix corresponding to these orthogonalized orbitals is considered in all what follows.

In the study by Kubař & Elstner [21], the time series of the on-site energies and of the couplings (with *m*≠*n*) was computed along classical MD trajectories for DNA. Here, the dynamics of solvated DNA was simulated using a classical force field, and the CT parameters *ɛ*_{m} and were recorded as described earlier. Every DNA base was treated with SCC-DFTB, and the electrostatic interaction with the point charges on the sugars, phosphate groups, counter-ions and water, as well as the charges on all of the other DNA bases, were included using the QM/MM scheme.

The off-diagonal elements (*m≠n*) are smaller than 0.1 eV on average and fluctuate in time owing to the structural dynamics of DNA. It is only the relative orientation of the fragments (DNA bases) that induces this variation, while the solvent has only a minor impact. The opposite is true for the diagonal matrix elements . The fluctuations driven by the molecular vibrations of the fragments lead to oscillations with a period of *ca* 20 fs and an amplitude of *ca* 0.1 eV. The solvent, however, introduces much larger fluctuations of 0.3–0.4 eV, and these fluctuations are the major driving force for the CT dynamics, because these fluctuations are of the same magnitude as the energy differences between different fragment energies; note that the difference in ionization potentials (IPs) of guanine and adenine is *ca* 0.4 eV [21]. The magnitude of the fluctuations due to the internal dynamics and due to the environment can be compared visually in figure 4 for the nucleobases in the double-stranded DNA.

There is an issue related to the application of classical MD simulation, namely whether the amplitudes of molecular vibrations of the fragments, which may be of a quantum character, are in agreement with reality. (Importantly, these amplitudes drive the magnitude of oscillations of CT parameters, above all the instantaneous IPs or electron affinities.) This problem was analysed in the context of excitation energies of nucleobases recently [51], and it was concluded that the intra-molecular motion due to the zero-point vibrations—an effect of a purely QM character—determined the shape of electronic spectra. The point is actually that most of the internal modes of motion of a molecule of the size of the nucleobases are in the vibrational ground state, so that the amplitude of motion is given by the zero-point vibrations.^{4} The crucial questions are then (i) how this ‘zero-point dynamics’ compares with the classical dynamics and (ii) how the instantaneous IP depends on the internal structure of the fragments. Notably, the IP of the nucleobases in the solvated double-stranded DNA is driven mainly by the electric field induced by the environment—the solvent; the related modes of motion possess lower frequencies, thus the quantum effects may be of lesser importance in these systems. However, this need not be the case in all of the CT systems. In organic semi-conductors, for instance, the environment is non-polar and does not induce any strong electric field, so that the fluctuations in the internal structure of charge-carrying fragments are the decisive factor. Then, the (possibly quantum) character of the internal modes of motion and the applicability of classical mechanics require close investigation.

#### 2.2.3. Second-order terms

The second-order term in equation (2.6) contains the second derivatives of Hartree and exchange–correlation (XC) energy
2.10The differential density *δρ* = *ρ* − *ρ*_{0} is decomposed into contributions located on the individual molecular fragments *m*,
2.11

Inserting equation (2.11) into equation (2.10), an expression is obtained which describes the electron–electron interaction between the differential densities *δρ*_{m}(*r*) and *δρ*_{n}(*r*), and the second-order term is obtained as a sum of pairwise contributions
2.12

Two cases may be distinguished here: for large distances (greater than 3 Å) between the fragments *m* and *n*, the XC term can be neglected and only the Hartree interaction prevails; this is shown for the atomistic SCC-DFTB method, e.g. fig. 1 in [48]. Note that the actual inter-fragment distances in our applications are always in this range, being *ca* 3.4 Å for stacked organic molecules (DNA bases, molecules of organic semi-conductors) and larger in the other cases. Furthermore, the Hartree interaction may be approximated by the Coulomb interaction of fragment charges *Δ**Q _{m}*
2.13

Here, the differential charge on the fragment *m* due to the presence of the excess charge is obtained from the wave function expansion coefficient *a _{m}* introduced in equation (2.2) as (

*e*is the elementary charge), and

*R*is the centre-of-mass distance of the fragments. In the other limit

_{mn}*m*=

*n*, describes the on-site charge self-interaction. This can be obtained with the effective Hubbard model, and the diagonal term becomes 2.14where

*U*is the Hubbard parameter of the molecular fragment

_{m}*m*, which is a constant that can be calculated as the second derivative of the total energy of the molecule with respect to its charge. Note that the applied approximations are basically the same as those involved in the derivation of the second-order term in SCC-DFTB [41]. The second-order term can be written in a condensed form as 2.15with

*Γ*

_{mm}=

*e*

^{2}

*·U*and, for

_{m}*m ≠ n*,

*Γ*

_{mn}=

*e*

^{2}/

*R*.

_{mn}DFT calculations of radical systems are prone to the self-interaction error (SIE), which would lead to a drastic over-delocalization of the excess charge in the system. In SCC-DFTB, which is formally identical to the formalism derived here, it can be shown that the SIE occurs as a result of the second-order terms *E*^{2nd} [52]. Since it would be too involved to subtract the electron self-interaction orbital by orbital as proposed earlier [53], a better (but, nevertheless, approximate) option is to consider only the spin density in the correction scheme [37,54]; the spin density corresponds to the density of the excess charge in the current computational method. Following this approach, the second-order term is reduced by a scaling with a constant empirical factor *C* = 0.2 [47].

With these approximations, the total energy (equation (2.6)) reads 2.16where the plus sign applies for the transfer of excess electron, and the minus sign shall be used for the hole transfer.

### 2.3. Total energy and its derivatives

To allow for MD simulation, it is crucial to have a total energy expression as derived in equation (2.16). This involves a few approximations, but it still enables the same equations as for full DFT to be derived formally. The total energy is a function of the expansion coefficients of the wave function of excess charge *a _{m}* (equation (2.2)) and of the spatial coordinates

*R*of atoms

_{α}*α*, 2.17

To perform a simulation within the Born–Oppenheimer (BO) approximation, the derivatives of *E ^{N−}*

^{1}with respect to

*a*can be considered, to arrive at a coarse-grained version of the Kohn–Sham equations in a form equivalent to those obtained in full DFT, with the matrix elements of the Hamiltonian taking the form 2.18(again, with plus and minus signs for electron and hole transfer, respectively). These coarse-grained Kohn–Sham equations have to be solved self-consistently.

_{m}At this point, it is apparent how the second-order energy term corrects for the approximations due to the Koopmans theorem (issues (a), (b) and (c) in §2.2.1) effectively. The matrix elements are evaluated according to equation (2.9) for the *electro-neutral* fragments. For instance, the diagonal terms are the Kohn–Sham energies of the HOMOs of neutral fragments. Then, the on-site energy is corrected by an additional term (containing as the major contribution), which is the change of on-site energy due to the removal or addition of one electron from/to the system. Here, the deformation and rotation of the fragment orbitals as well as the change of orbital energy due to the different electron–electron interactions are included in an effective way. The shift of energy is described by the fragment-specific Hubbard parameter *U _{m}*, which is determined for each individual fragment as the second derivative of total energy with respect to the charge.

Kohn–Sham equations with the Hamiltonian in equation (2.18) determine the coefficients *a _{m}* that minimize the total energy for a given geometry

*R*. Once the minimum of electronic energy is determined, the forces on atoms can be computed by taking the derivatives 2.19

_{α}Here, *E ^{N−}*

^{1}contains the three parts discussed earlier. The derivatives of

*E*

^{0}are simply the forces obtained from the MM force field, and they are implemented in the applied MM package already. The first-order term

*E*

^{1}consists of the kinetic energy (

*T*), electron–nucleus (

*v*

_{el}

_{−}_{nuc}), Hartree (

*v*

_{H}) and XC (

*v*

_{XC}) contributions 2.20

The derivatives of kinetic energy and XC originating from *E*^{1} cannot be obtained in a practicable way and are neglected; the consequences will be discussed in §2.4. The derivatives due to the Hartree contribution to *E*^{1} together with the Coulomb interaction in *E*^{2} are approximated by the interaction of atom-centred point charges according to the Coulomb law.

This ansatz, which connects the quantum system (the excess electron or hole) with the classical (atomic) system, is implemented in the following way: the charge on fragment *m* due to the excess electron hole, *Δ**Q _{m}*, is distributed among the atoms of the fragment, providing additional contributions to atomic charges . Then, the Coulomb interaction energy between all of the point charges and (the sum running over all fragments

*m*and atoms

*α*) corresponds to the Hartree term in

*E*

^{1}plus the contributions from

*E*

^{2}. To compute the derivatives of this energy term with respect to

*R*, it suffices to update the point charges of the MM atoms with the additional atom-centred contributions due to the excess charge, 2.21The contributions are obtained by a linear interpolation 2.22between the atomic charges of the charged molecule (a cation for hole transfer, an anion for excess ET) obtained prior to the simulation for the neutral () and for the charged molecular fragments. These sets of atomic charges are obtained with restrained electrostatic potential (RESP) fitting [55] prior to the simulation. The use of RESP charges for MD simulation is an established standard, and it was reported that the choice of atomic charge sets may affect the features of excess charge in simulations [56].

_{α}The resulting simulation scheme is conceptually simple: the solution of equation (2.18) gives coefficients *a _{m}*, which are projected to the differential atomic charges . These are used to update the atomic charges in the MM part of the calculation. Then, cost-efficient MM forces are computed, considering these updated atomic charges in the QM region. The influence of the excess charge on the MD is covered fully in this way. The presence of the excess charge induces a polarization of the environment, i.e. a ‘polaron’ accompanies the transferring charge, and this is the determining factor for ET in molecular systems immersed in an aqueous environment. A remaining problem is the application of a non-polarizable force field, and this will be discussed later in detail.

Note that the only part of the system treated actually with quantum mechanics is the excess charge—a single electron or hole. This is done with a QM/MM approach so that the interaction with the molecular system is accounted for. Still, the entire remainder of the system is described classically, with MM. Thus, there is an implicit underlying assumption that the studied system is described adequately with an available MM force field; this assumption is typically accepted for solvated DNA, peptides and proteins.

Based on the described approximations, a very efficient MD scheme is implemented, making use of BO dynamics and updating the MM atomic charges in every time step as directed by the coarse-grained quantum mechanics. For instance, eight QC calculations have to be performed in every time step of the simulation of hole transfer in a DNA octanucleotide, one for each CT-active nucleobase. Each fragment contains about 15 atoms, and a calculation of such a system using the semi-empirical SCC-DFTB method takes less than 100 ms on one CPU core of a main-stream desktop computer. As these calculations on fragments, which may be performed in parallel, are the most time-consuming part of the calculation, an efficiency of as much as 1 ns per day is obtained.

### 2.4. Effects not taken into account

The computational scheme involves several notable approximations: classical non-polarizable MM calculation is applied to obtain the energy of the charged system. Also, the kinetic energy and XC contributions to the first-order energy term (*T* and *v*_{XC} in equation (2.20)) are neglected in the MM calculation of forces on atoms due to the excess electron. Consequently, several kinds of interactions are not described with the total energy in equation (2.16), and they are discussed in this section in detail.

#### 2.4.1. Intra-fragment interactions

Although the change of Coulomb interactions upon the charging of the system is covered by the scheme formally, its portion corresponding to the interactions within a fragment is missing. The reason for this is the way in which the commonly used empirical force fields are constructed—the non-bonded interaction between a pair of atoms within the distance of fewer than three covalent bonds is not computed, as they are accounted for by the bonded terms. In addition, the non-bonded interaction between atoms connected by three bonds (1–4 interaction) is scaled down by a certain factor. As a matter of fact, most of the atom pairs within a fragment of typical size (a nucleobase or an amino acid side chain) are within this distance. Therefore, the MM energy of the fragment does not actually change upon the arrival of excess charges, thus no change of forces occurs, and the energy cost of the structural deformation of the fragment due to the varying charge cannot be described.

In the Marcus theory, this effect is covered by the inner sphere RE *λ*_{i}. Following an earlier study [57], we introduce *λ*_{i} by means of an additional empirical contribution to the total energy
2.23There have been several estimates of the parameter , which reflects the response of the molecular geometry of the fragment *m* to the varying charge. Applied here are values obtained from QC calculations with an implicit solvent; see the discussion in [47]; amounts to 0.23 eV for both adenine and guanine, which are considered for hole transfer in DNA. Supplying this additional energy term to the total energy, the intra-fragment contributions are corrected for effectively.

Such an effective, empirical description of the energy cost of the molecular geometry change shows an additional advantage over the application of classical MM. As found recently [23], the rearrangement of molecular geometry caused by the change of the charge state may occur via tunnelling—an exclusively QM mechanism. If that is the case then the usual classical way of calculating *λ*_{i} as mentioned earlier would yield values overestimated by a factor of as much as 2. For this reason, it is appropriate to describe the inner sphere reorganization by means of an additional, empirical term in the expression for the total energy, instead of aiming at an explicit account for the changes in molecular geometry.

#### 2.4.2. Inter-fragment interactions

Since the charges in the QM region are updated according to the location of the excess charge, the additional Coulomb interaction between the fragments relative to the neutral system is covered by the present formalism. This is the main contribution from the off-diagonal terms in equation (2.16). However, the van der Waals interaction between the fragments in the charged state may also differ from that in the neutral state. Now, this effect is missing in the description because the XC contribution to the forces on atoms was neglected in the derivative of first-order energy in equation (2.20). The possible consequences are of two kinds: (i) Owing to the change of charge density, the exchange repulsion may differ between the charged and the neutral species. This may result in a slightly different van der Waals distance of the molecular fragments. In principle, an additional charge-dependent term could be introduced into the classical force field to take this interaction into account (as suggested in [47], supplementary information). A possible effect would be a modified distortion of the structure of DNA, i.e. a small compression or expansion of the region where the excess charge is localized. This effect is expected to be small, as discussed for DNA in [58] and in general in [59]. Such a minor change of structure will be overwritten completely by the dynamics of the system, which introduces much larger fluctuations than the geometry change induced by the charge-dependent van der Waals interactions. (ii) Another effect may be the change of dispersive interactions. To inspect this for hole transfer in DNA, we computed the isotropic polarizability of the neutral and the cationic forms of both purine bases on the DFT/PBE/TZVP level. Very similar values were obtained—for A, 13.62 and 13.61 Å, respectively, and for G, 14.58 and 14.67 Å, respectively. Therefore, the effect seems negligible for the hole transfer in DNA, and no correction has to be added in the current scheme.

#### 2.4.3. Electronic polarization of the environment

The electric-charge density of the molecular environment is represented by fixed point charges in the standard force fields, as applied here as well. This approximation inherent to MM-based methods is a source of errors whenever the electronic polarization is changing during the process being simulated. Obviously, the transfer of electric charge is such a process, and it has been established that the solvent RE is overestimated severely if evaluated with a non-polarized force field [60–63].^{5} While this is the most prominent effect, other relevant quantities may be affected as well, such as the magnitude of fluctuation of on-site energies *ɛ*_{m}; as an example, note again the large fluctuations in IP of nucleobases in DNA (figure 4).

The factor by which the energy-related quantities are overestimated is related to the optical dielectric constant of the medium [62,64]. Then, if it is not practicable to pass to a polarizable MM force field, the most straightforward way to correct for the overestimated electrostatic interaction is to scale the relevant interactions down by a suitable factor [64–66]. The effect on the energy barrier can be illustrated by considering the relation between the RE *λ*_{s} (representing the barrier) and the magnitude of fluctuation of one of the energy quantities describing the system; available choices are the energy gap expressed as the difference in IP (*Δ*IP) [26],
2.24the IP itself [24] and the total electronic energy [30]. The relation of the (possibly scaled) electric potential induced by the MM environment to *λ*_{s} becomes clear by noting that the IP is nearly directly proportional to the electric potential [21]. While no such quantity as RE is defined in the developed simulation scheme, the reasoning presented here applies to the effect of scaling of the MM electric field on the energy barrier to ET. Simply speaking, the reduced energy barrier leads to an increased rate of transfer.

Blumberger & Lamoureux [67] described the more complex character of the problem, finding the RE in an aqueous medium to be overestimated by 26–34% when the TIP3P water model was used. Based on this observation is the suggested choice of the value of the scaling factor of 1/1.4–1/1.5. Within the current framework, this approach is used in the QM/MM calculations of the CT-active molecular fragments, where the MM charges of the molecular environment are divided by a factor of 1.5 in the application to hole transfer in DNA. In other recent work [68,69], it was proposed to scale the charge of charged residues and ions in classical MD simulations with the common biomolecular force fields by a factor of *ca* 0.7, to correct for the energetics of ion solvation. While this procedure is clearly debatable, we wish to point out that the introduction of the scaling factor of 1/1.4–1/1.5 in our method as well as our motivation by the overestimated solvent RE is consistent with this reasoning completely. Effectively, the excess charge interacting with the MM environment is reduced by a factor of 0.67–0.71 as a result of the scaling.

The scaling of MM electric field also influences the energy landscape of the simulated ET reaction, which is determined by the differences in IP of the charge-carrying fragments in the case of hole transfer. Whenever this difference is induced by the different character of the molecular environments of the individual fragments, it decreases as a consequence of the scaling of the MM electric field. Based on the analysis in [68,69], where correct solvation energetics was obtained with the scaling of atomic charges, it can be expected that the MM scaling also improves the energetics of ET in our simulations. Note that this applies in the case of hole transfer in *Escherichia coli* DNA photolyase, where the energy landscape will be due to the different MM environment of the charge-carrying fragments, and thus it will be affected by the MM scaling. Contrary to that, this point will not occur in applications where all of the fragments experience the same or a similar environment because of symmetry, as in the case of hole transfer in DNA where the energy differences are due to the different chemical characters of adenine and guanine.

### 2.5. Non-adiabatic propagation schemes

As indicated in §2.3, the computational scheme described up to this point can be used readily to perform a multi-scale simulation of ET within the BO approximation. Note that QM/MM-BO simulations of ET were also performed by other researchers previously [37–40]. Since the validity of BO for the systems of interest is ambiguous or even unlikely, it is preferable to resort to a non-adiabatic representation. Our work makes use of the extensive development in this field, and both main approaches, the MF (Ehrenfest) and the SH method, have been implemented. For a discussion of the general features and applicability of MF and SH, we refer to previous publications [70,71].

#### 2.5.1. Mean-field (Ehrenfest) method

The energy within the MF method [72,73] is obtained by averaging over all (adiabatic) states weighted by their populations. Then, all of the states interact with a common potential as a result of the classical environment, which in turn interacts with the combination of states in the quantum system. The quantum system is propagated with a straightforward numerical solution of the time-dependent Schrödinger equation (TDSE). Figure 5*a* shows schematically how the wave function, which corresponds to one of the adiabatic states at the start of the simulation, is driven into a combination of adiabatic states as soon as these become coupled via the atomic dynamics.

The current MF implementation [47] is in the framework of time-dependent DFT, representing a coarse-grained version of the previous work by Niehaus *et al.* [74]. The Lagrangian formalism is used to derive the equations of motion from the total energy expression presented in the previous sections. The potential energy is taken from the above derivation, and there are kinetic energy contributions related both to the coordinates of atoms and to the expansion coefficients of the wave function for excess charge. Two sets of equations of motion are obtained: the electronic system evolves according to the TDSE with the Hamiltonian *H _{mn}* containing the second-order contribution (equation (2.18))
2.25

For the classical (MM) system, Newton's laws of motion are obtained 2.26Note that these equations of motion contain atomic charges on the CT-active fragments augmented by the differential charges corresponding to the distribution of excess electron,

These two sets of equations are coupled to each other, so that they have to be propagated simultaneously. The MM equations involve the (time-dependent) charges of the (*N* – 1)-electron system *q ^{α}*, which are obtained by a mapping (projection) of the excess charge onto the classical MM system. On the other hand, the time-dependent electronic Hamiltonian depends on the coordinates of all atoms parametrically. Practically, the coarse-grained electronic Hamiltonian is calculated first, and a step of the quantum propagation of the excess charge is performed, followed by a projection of the excess charge to the atomic charges and then a step of classical dynamics of the atoms is done. The following iteration (time step) continues with another calculation of the Hamiltonian and a quantum propagation step, and so on. By performing the projection of the excess charge onto the MM system, it is ensured that the environment (e.g. solvent) is being polarized according to the position and de/localization of the excess charge, and it turns out that this is an important factor for the understanding of the CT process.

#### 2.5.2. Surface hopping

Within an SH scheme [75], the quantum system is only allowed to reside in one of a certain set of states, either adiabatic or diabatic. The principle of SH is to propagate the populations of states in time, while the system occupies one of the states at a time, and a transition to another state (surface hop) is possible in every step of the simulation. This is depicted in figure 5*b*. The key component of the algorithm is the calculation of the probability of such a surface hop. Accurate SH algorithms are known, such as the fewest switches by Tully [76]. Following a recent study [36], the SH algorithm based on a local diabatization of the adiabatic states [77] is implemented in this work. We present a brief outline of the method here, and refer to the original work for details [78].

In every step of the simulation, the eigenproblem for the quantum system—the excess charge—is solved, providing the adiabatic states, and the current wave function of the excess charge is expressed as their combination. Then, the set of diabatic states is redefined to be identical to the adiabatic ones at the beginning of the time step, and a step of propagation of the wave function (population of the states) expressed in this locally diabatic basis is performed according to the TDSE. The newly obtained populations are used to calculate the probabilities of surface hops from the current state to each of the other states, and a random number is drawn to determine whether a transition to another state (surface hop) occurs in this step.

This procedure is an alternative to the direct numerical integration of the TDSE, which is performed in the MF approach; see above. All of the remaining parts of the simulation protocol are the same. The charge distribution of the excess electron or hole, determined by the currently occupied adiabatic state, is projected onto the MM atomic charges of the participating molecular fragments, and the charges of the corresponding atoms are updated in the MM engine. Then, a step of the classical dynamics of the atoms is performed.

It should be noted that the current SH scheme is a particularly simple one. For instance, explicit calculation of non-adiabatic couplings is not involved, and the overlap of the adiabatic state vectors is considered instead. Furthermore, the velocities of atoms are not rescaled after a completed surface hop, which is common with other SH implementations. This way, sudden changes in the dynamics of the system are avoided, but also the total energy of the system is not conserved; this is probably a minor issue because of the application of a thermostat to conserve the temperature rather than the energy. Also, the so-called classically forbidden transitions [79] are not treated in any particular way, so that the classical system is assumed always to contain enough energy to support the surface hop, effectively. These two topics in the coarse-grained QC context are the object of current work in our laboratory. The same is true about the implementation of a possibly necessary decoherence correction [80–82].

#### 2.5.3. Which quantum-mechanical propagator?

Now, it is necessary to choose the propagation scheme to be used. It shall be noted here that while every propagation scheme may have its own distinct issues, as will be discussed later in this section, there are also problems that stem from the introduction of the semi-classical representation. One of them is the too slow quantum decoherence, which is caused by the division of the system into a quantum part and a classical part, and thus this problem is shared by both MF and SH schemes [83]. (Still, it is possible that the problem will manifest itself to a different extent in MF and SH.)

The simplest propagation scheme is the MF (Ehrenfest) method, which does not require the coarse-grained Hamiltonian to be diagonalized. The TDSE is integrated explicitly with a numerical method, including all of the adiabatic states in the calculation implicitly. The current wave function of the hole is transferred directly into the propagation of atomic coordinates (i.e. MD) by means of the mapping onto the atomic point charges. The conceptual simplicity comes at the cost of a potentially severe MF error, increasing the delocalization of the wave function largely. This is evident above all in applications where ET proceeds along a flat energy landscape, as is the case in hole transfer in homogeneous DNA sequences.

The BO approximation is involved in most of the main-stream QM/MM simulations and may be plugged into the current hybrid methodology as well. The coarse-grained electronic eigenproblem has to be solved by a diagonalization of the coarse-grained Hamiltonian, and already this step has turned out to be critical on rare occasions, in the simulations of hole transfer along flat energy landscapes. Still, the most important limitation is that only the lowest adiabatic state—the ground state—is considered. As shown recently [36], a proper description of elementary ET events may require higher adiabatic states to be considered, and so the application of BO is restricted to systems where this is not the case. Another possible problem concerns the fact that artificial transfer over long distances may be obtained with BO; this would happen when the instantaneous IP of a distant fragment is decreased by an accidental environmental fluctuation, even though there is no electronic coupling between the involved molecular fragments whatsoever.

The SH approach seems to be the only one free of such principal limitations. In spite of its complex character and the fact that it is still an approximation, it is possible to construct an implementation of SH that is suitable for the application to ET processes. SH involves the higher adiabatic states naturally, and it suffers neither from the delocalization issue (as a single adiabatic state is mapped to the MD engine at any given instant) nor from the possible artificial transfer when there is no coupling of the involved fragments. This makes it perhaps the best choice of the presented approaches. We note that it is not possible to rule out that further issues will emerge in future applications of the current simulation method to new molecular systems. A conceivable example may be a too large number of fragment orbitals that would need to be considered in the application to organic semiconductor systems (see below), which would imply a too large number of adiabatic states to be considered for SH.

### 2.6. Implementation details and simulation protocols

The described methodology has been implemented in a local version of the Gromacs simulation package [84,85], v. 4.0.3 and (newly) v. 4.6. The most time-consuming procedure in the simulation is the SCC-DFTB calculations of the molecular fragments. The SCC-DFTB method has been included as an integral part of the Gromacs MD engine mdrun. Since the calculations of the different fragments are independent of each other, they may be performed at the same time on different processing units, and this simple parallelization has been implemented. Favourable computational efficiency is achieved as a result of full integration of the QC calculations in the programme and of the described parallelization. A multi-scale simulation of ET in a system such as the DNA sequences described below takes typically 1 day per nanosecond, provided the number of charge-carrying molecular fragments in the system does not exceed the number of available CPU cores.

Production simulations have been performed for DNA and protein systems so far, and the simulation protocols follow the standards established for classical MD simulation closely. The Amber parm99 force field is used [86,87], including the usual corrections specific for DNA [88] and proteins [89]. The biomolecule is placed in a rectangular box filled with TIP3P water molecules, and Na^{+} or Cl^{–} counter-ions are added to neutralize the system. The smooth particle–mesh Ewald method is applied to evaluate the Coulomb interactions while the Lennard–Jones interactions are cut off at 1 nm. The simulations are performed at a constant temperature of 300 K and a pressure of 1 bar, maintained by the Nosé–Hoover thermostat and the Parrinello–Rahman barostat, respectively, with a characteristic time of 0.5 ps. The lengths of bonds involving a hydrogen atom are constrained to the equilibrium lengths with LINCS or SHAKE. The time step is usually 1 fs; for the initial DNA simulations presented below, the time step was 0.2 fs. The number of steps made, and thus the total time simulated differs by application.

Prior to each production simulation, an equilibration is performed with a stationary excess charge placed on a selected molecular fragment. To this end, modified MM charges of the atoms in this fragment are used; the magnitude of these charges is estimated beforehand according to equation (2.22) with *Δ**Q _{m}* = 1. The equilibration procedure starts with a steepest-descents minimization of 100 steps, and it continues with a short dynamics (of 20 ps typically) where the solvent is heated up to 300 K by means of a Berendsen thermostat while the solute is kept at 10 K. The next step is a simulation (of 20 ps again) where the entire system is brought to a temperature of 300 K with the Berendsen thermostat. Finally, a simulation (of 100 ps to 1 ns, differing by application) is performed with the same set-up as described for the production phase.

## 3. Applications

### 3.1. Hole transfer in DNA

The original motivation to develop the presented methodology was to describe the transfer of a radical cation in double-stranded DNA. There has been a long debate about the mechanism, and all of the issues of localization of charge, adiabaticity of transfer and time-scale separation have been raised. While it is established that the mechanism is not band-like because the disorder due to dynamics and solvent prevents the formation of extended orbitals [90], initial studies concentrated on the distinction [91] between one-step superexchange tunnelling and what was interpreted as multi-step hopping [92–94] mechanisms. Static DNA molecules were considered here, and models involving the dynamics of the system—except model-based work, e.g. by Jortner *et al.* [95]—appeared later. A prominent role is played by the idea of the polaron, the deformation or polarization of the molecular environment in the vicinity of the transferring hole [96]. It constitutes the basis of, for example, the phonon-assisted polaron-gated [97] and polaron drift [98] models, and its existence was supported by the studies of sequence-dependent transfer [99].

There is a question that is both crucial and interesting but that is by no means answered unambiguously yet: is the hole in DNA delocalized over several nucleobases or rather confined to a single one? The delocalization of the hole charge would be a prerequisite of band transport or, for example, a conformational gating [100] mechanism, while other experimental work was interpreted in terms of the hopping mechanism assuming a localized hole [101–103]. Generally, the quantum delocalizing effect competes with the localizing forces of the solvent [56]. A certain consensus of a spatially localized hole has been reached recently in the theoretical and computational community [39,40,104,105]. It is possible to understand how peculiar the question of delocalization is by noticing the qualitative change over time of the results reported by several researchers. For instance, while Conwell [98] previously considered delocalized holes, she concluded more recently that the hole is quite localized [39]. Our own previous work [106] involved a numerical integration of TDSE for the hole, without taking into account the polarization of the environment due to the excess charge. Thus, the localizing forces of the environment were not included in the simulation, and a delocalized hole was observed. This has changed upon critical analysis and has passed to the method reviewed here, and a rather localized hole is observed now, as described later.

The hole transfer in DNA was studied by means of atomistic simulation in several laboratories over the last decade. Several studies relied on the post-processing of MD trajectories performed for a system that did not contain the excess charge. Here, the various approaches used a Monte Carlo formalism [107], calculation of the state of the hole for individual snapshots [108], or involved the integration of the TDSE in a simple way [106,109]. Principally, none of this work could describe the polarization response of the environment to the presence of the hole, and so the agreement of the obtained rates of transfer with the experimental results obtained in these studies may have been accidental to a considerable extent. The response of the environment was taken into account in the DFT-based work in Mantz *et al*. [37], but the observation of a localized hole was actually forced by the biased simulation protocol. So, the only work from that time which involved most of the interactions between the excess charge and the environment seems to be that in the study by Steinbrecher *et al*. [40], in which a parametrized model of the electronic structure was used, making nanosecond simulations possible. Still, the model was not quite correct, as it left out the description of inner sphere reorganization. The addition of this term would have made the model complete and self-contained; also, however, the rate of transfer would have decreased considerably. Thus, the good agreement with experimental reports seems to have resulted from a compensation of errors brought about by the approximations in the model. We note that there is also newer work that describes the discussed interactions [39], but the application of full DFT calculations brings an increased computational cost, restricting the available time scale to the picosecond range.

The parts of DNA that exhibit the lowest IP are the purine nucleobases, guanine (G) and adenine (A). Therefore, G and A are the molecular fragments that are considered in the FMO calculations. The pyrimidine nucleobases, the DNA backbones and the solvent compose the MM environment, being represented by point electric charges. In a simulation, the transferring hole (radical cation) evolves via the purines along the DNA double strand. We note that the efficiency of hole transfer is not limited by the possible distribution of purines between both DNA strands as the electronic coupling can be sufficient in such a case as well [21]. The first application deals with two archetypical DNA species [78], one with a homogeneous sequence AAAA in which the IP of all of the considered fragments are equal, and one with a sequence GAG in which the transfer from one G to the other has to overcome a higher energy region represented by the A.

#### 3.1.1. Oligonucleotides with a homogeneous sequence

The different non-adiabatic propagation schemes, MF and SH, provide markedly different pictures of hole transfer in the DNA sequence AAAA; the amount of delocalization and a measure of the mobility of the hole are shown in table 1. The wave function of the hole delocalizes over multiple As within picoseconds after an MF simulation is started, even though the initial conditions are those of a perfectly localized hole. As soon as such delocalization has occurred, there is no way for the wave function to go back into a localized state. This is a consequence of the MF description, where the MM environment interacts with the excess charge assuming a state that corresponds to a combination of adiabatic states. The amount of delocalization is large and probably dependent on the size of the system, i.e. the number of purines considered.

The large delocalization observed in MF simulations leads to very fast transfer. It was shown previously that a much lower RE applies in the transfer of a delocalized hole than in the transfer of a spatially confined hole [110]. Then, the increased transfer rate is a direct consequence. Actually, in the MF simulations of hole dynamics in AAAA, it is merely possible to speak about the mobility of the wave function rather than about the rate of transfer because the ‘position’ of the delocalized hole is difficult to determine.

For this reason, it seems to be inappropriate to use the (conceptually simple) MF scheme to simulate hole transfer in systems such as the homogeneous DNA sequences. Instead, an implementation of the SH method is a good alternative, in spite of the disadvantages of SH, similar to the hopping algorithm, not being unambiguous, etc. [83]. In the SH simulations of hole transfer in AAAA, the wave function of the hole remains localized to a single A for most of the time, which is the same appearance that is observed in BO simulations. The rate of next-neighbour hole transfer can be determined in such a case and amounts to *ca* 100 ns^{–1}, which is roughly an order of magnitude larger than the experimental estimates [101,102]. This discrepancy may be caused by the preliminary character of the SH implementation, which seems to require additional attention regarding the calculation of hopping probabilities and decoherence behaviour. Development along these lines is underway in our laboratory.

#### 3.1.2. Oligonucleotides with a heterogeneous sequence

The DNA species with the nucleobase sequence GAG represents a qualitatively different system with regard to hole transfer. The energy landscape is not flat anymore because of the IP of A being *ca* 0.4 eV larger than that of G. Consequently, a spatial extension of the hole spanning this high-energy region is unfavourable. The wave function of the hole is localized considerably even in MF simulations, and the MF error does not manifest itself notably. Although there are still isolated instants when the wave function is spread over two nucleobases, the wave function always returns to a rather spatially confined state, which is observed for most of the simulation time. The amount of delocalization and the rate of transfer of the hole are shown in table 2.

The appearance of the CT events is similar in MF and SH simulations, perhaps as a consequence of similar localization of the hole. Also, the rate of transfer does not differ between MF and SH simulations, and three or four transfer events are observed per nanosecond on average.^{6} We note that this nearly perfect agreement may be the result of a sort of cancellation of errors to a certain extent as the MF error cannot be excluded entirely and the SH scheme is not in a final form yet. These results will have to be checked as soon as the SH algorithm has been upgraded (see the section on the non-adiabatic dynamics above). It can be expected that slower rates will be obtained with the corrected methodology.

#### 3.1.3. Conclusion on hole transfer in DNA

The hole transfer in homogeneous DNA sequences does not seem to be a great problem for the Marcus theory. The hole is likely to be localized well. On the basis of spectroscopic measurements, the relaxation of water was reported to occur with two significant time scales of 1.4 and 19 ps [111], and similar observations were made in MD simulation studies [112]. Thus, the relaxation occurs just faster than the hole transfer. Lastly, the calculation of the adiabaticity parameter [35], as performed in [78], predicts a solvent-controlled adiabatic mechanism of transfer.

The hole transfer in heterogeneous sequences represents a more complex problem. The simple model of thermally induced hopping [94] may not be sensitive enough to reproduce the difference between various DNA sequences. Notably, Schuster and co-workers [99,113] observed a more intricate dependence of hole transfer efficiency on the exact nucleobase sequence, which was attributed to the so-called phonon-assisted polaron transport involving a delocalized hole. The particular topics of distance dependence and sequence dependence of hole transfer are the prospective applications of the reviewed simulation scheme.

### 3.2. Note on charge transport in DNA

The presented simulation approach is aimed at the temporal evolution of an excess electron in a complex molecular system, following an injection of this electron or hole into the system.^{7} Apart from this ‘chemical’ process, it is possible to consider the charge transport through the molecular system, which is considered rather from a point of view of physics and can also be called conductivity. We would like to mention our recent work in this area briefly.

To this end, our FMO scheme for the calculation of the CT Hamiltonian is combined with the Landauer–Büttiker formalism [114] to compute the electrical conductivity. The structural dynamics of the molecular system is reflected by the time-dependent Hamiltonian, which enters the calculation of electric current under the approximation of instantaneous electron transport by means of tunnelling. The variation of the Hamiltonian in time has a huge impact on the electron transport, and the description of transport improves dramatically in comparison with static models, which were often considered previously.

This flavour of our simulation approach was introduced in the work on charge transport in double-stranded DNA [115], where the effect of structural fluctuations was analysed in detail. Subsequent applications dealt with another class of DNA molecules, G4-DNA [116], double-stranded DNA containing mismatches [117], under mechanical stress [118] and in a micro-hydrated environment [119]. This multi-scale scheme has been developed further, by taking into account the dissipative character of the environment [120,121], and by passing to the time-dependent Green's function approach [122].

### 3.3. Hole transfer in a protein

ET is the most frequently occurring type of enzyme-catalysed reactions in biochemistry. Although many of the ET reactions taking place in proteins, often requiring the presence of a catalytic metal centre, can be described with the Marcus theory with success, examples can be found where this is not so straightforward. This is the case with *E. coli* DNA photolyase [123], in which the photo-activation process proceeds via a sequential transfer of a hole from the flavin adenine dinucleotide (FAD) cofactor via two tryptophan side chains (Trp1 and Trp2) to a terminal tryptophan (Trp3) at the surface of the protein [124] (figure 6). The hole transfer in the subsystem composed of three tryptophan side chains was considered in our work [34], and the main findings will be presented in this review because another type of energy relations in a CT system is observed here.

The fundamental difference between the charge transfer in DNA and in this enzyme concerns the properties of the molecular environment rather than the charge-carrying molecular fragments themselves. Much like in the DNA species AAAA, all of the charge-carrying fragments are chemically identical (all of them are tryptophan residues), and so they possess equal inherent IP. But, the highly heterogeneous structure of the protein affects the energy landscape of the transfer: Trp1 is buried in the interior of the protein molecule whereas Trp3 is located at the surface, exposed to the solvent; Trp2 is moderately solvent-accessible. These structural features have consequences for hole transfer.

A first point is that the solvent is effectively more polarizable than the protein. Consequently, when the hole is located on Trp2, it is stabilized more strongly than on Trp1, and the stabilization is the strongest on Trp3; the energy difference amounts to *ca* 0.6 eV for each of the steps. This fact provides the hole transfer reaction with a driving force, so that the lowest energy state has the hole located on Trp3 and a re-transfer to Trp2 or Trp1 is unlikely. Another point concerns the dynamics of the system. Upon a completed hole transfer event, the environment adjusts—re-polarizes—according to the new location of the hole. The applicability of the classical theory of CT requires this re-polarization to occur substantially faster than the transfer itself. While the component of polarization that corresponds to the solvent meets this criterion, the relaxation of the (albeit smaller) component due to protein takes hundreds of picoseconds, interfering with the time scale of transfer.

The hybrid methodology for ET simulations is an ideal means to describe the transfer under such complex conditions as seen in this example. The rate of hole transfer obtained from simulations in [34] is in reasonable accordance with the experimental reports, deviating by less than an order of magnitude, while the calculations with the Marcus theory predict a rate for the second step Trp2 *→* Trp3 that is three orders of magnitude too slow (table 3). Interestingly, when a hole transfer simulation is started from the system equilibrated with the hole localized on Trp2—so that the environment is relaxed fully—the elementary transfer Trp2 *→* Trp3 is much slower than observed in the simulations performed without the artificial relaxation (see fig. 14 in [34]). The artificial relaxation of the intermediate state performed in this test simulation is equivalent to the assumptions of fully relaxed reactants in the Marcus theory. Therefore, the reason for the second elementary transfer occurring so fast is most probably that the energy barrier for this step increases slowly only after the first step Trp1 *→* Trp2, allowing for the second step to take place well before the environment has relaxed. Note that very slow rates of relaxation, on time scales longer than 100 ps, were indeed reported in protein systems [125]. Such a sequence of events is not in accordance with the assumptions of the Marcus theory, which considers an instantaneous relaxation of the environment. The example of hole transfer in *E. coli* DNA photolyase demonstrates the advantage of an explicit simulation of ET over the classical theory of transfer, in cases where certain assumptions of the classical theory are not fulfilled.

### 3.4. Hole transfer in peptides

Peptides have been a popular class of systems to study ET, and the studies by Giese and co-workers [126,127] concentrated on the hole transfer between two amino acid side chains via a relay amino acid (see figure 7 for an example of such a system). The characteristics of these ET systems differ from both DNA and the *E. coli* DNA photolyase in several respects.

The conformational flexibility of small peptides in an aqueous solution is much larger, making the sampling of conformational space difficult. The consequences for ET become clear by noting the straightforward connection between the structure and the electronic coupling between the individual electron-carrying fragments. Concerning the fragments that are involved in the transfer, it may be necessary to consider not only the amino acid side chains but also the *π*-electron systems of the peptide bonds [128], and this represents another technical challenge because of the difficult definition of the QM/MM boundary.

Our preliminary study of hole transfer in these peptides focused on the basic characterization of the transfer and the calculation of parameters of the Marcus theory [129]. An interesting result is the outer sphere RE for the sequential transfer 1 *→* 2 *→* 3: for the entire transfer 1 *→* 3, it is very similar to that obtained for the elementary steps 1 *→* 2 and 2 *→* 3. This is different from the situation for the hole transfer in DNA, where strong distance dependence was observed [110]. The reason is perhaps the relatively large distance between the individual amino acid side chains involved in the transfer, which are not immediate neighbours on the peptide chain—rather, there are two or three other amino acids in between. Then, the RE is already saturated for the elementary transfer steps. In the current work on this topic in our laboratory, the QM region has involved the *π*-electron systems of the peptide backbone explicitly, and use has been made of a Green function approach to compute the electronic couplings [130] to account for the super-exchange mechanism on long time scales correctly. Also, we plan to perform direct simulations of hole transfer in these systems, in spite of the presumably extensive computational cost due to the long time scale of transfer exceeding nanoseconds.

### 3.5. Hole transfer in organic semi-conductors

Organic semi-conductors are a class of ET systems that are of great interest currently. At the same time, they possess ET characteristics that are markedly different from those observed in the previously mentioned applications [131]. For these reasons, we are working to extend the presented hybrid methodology, to describe the ET in systems such as hexabenzocoronene (HBC) derivatives. Owing to their self-organizing capabilities [132], these materials form phases such as liquid crystals and self-assembling monolayers on surfaces; an example of the latter structures is shown in figure 8.

The difference between these systems and those presented above is clear—there is no aqueous or generally polar solvent around the electron-carrying molecules. Therefore, there is little orientational polarizability in the system that would oppose the transfer of electron, and merely electron polarizability is conceivable. Also, the inner sphere RE seems to be much smaller than in DNA bases or amino acids. On the other hand, the electronic couplings between the individual molecular fragments can be quite large. As pointed out recently [133], the reversed relation of RE and electron coupling probably leads to an entirely different mechanism of transfer—adiabatic in this case. An additional computational challenge is represented by the extended, highly symmetric aromatic character of the electronic systems of these molecules. Two HOMOs are degenerated actually, and another orbital has only a slightly lower energy, so that at least three orbitals per molecular fragment have to be considered for the hole transfer.

Since the charge will probably be rather delocalized over several molecular fragments, it may not be meaningful to describe the transfer as consisting of hopping events. On the other hand, the delocalization will be limited owing to the disorder caused by the structural fluctuations. This point can be inferred from the results of preliminary calculations performed on the complex of six HBC fragments. The electronic structure of the complex was computed along an MD trajectory, and the delocalization of the HOMO was obtained as where *Δ**Q _{i}* is the occupation of each fragment by the HOMO (figure 9). The delocalization amounts to 2.3 on average, which is distinctly smaller than the value of 3.3 obtained for the (static) ideally ordered structure, illustrating the disordering effect of the dynamics, which prevents the formation of extended electronic bands.

Apparently, the hole transfer in organic semi-conductors is an application with particularly high requirements for the flexibility of the theoretical or computational approach that would describe it, and it seems that, unlike the Marcus theory, our hybrid methodology may meet these requirements.

## 4. Discussion and conclusion

We have developed a multi-scale computational scheme that allows us to simulate the electron and hole transfer processes in QM regions containing several hundreds of atoms, embedded in extended non-reactive environments, on a time scale of up to tens or hundreds of nanoseconds. The scheme relies on an approximated expression for the total energy, which allows us to implement state-of-the-art non-adiabatic propagation methods. Therefore, the simulation does not rely on any assumptions about the localization of the electronic structure, the nuclear and electronic time scales and the mechanism of the transport process. The adiabatic as well as the non-adiabatic effects are covered naturally, and the hopping transfer and the band transport are described as well as the super-exchange tunnelling, although the accessible time scales may be exceeded for the latter. The quantum description is reduced to the frontier orbitals, and it is easy to draw parallels with the parameters used in the Marcus theory—in terms of the electronic coupling, the driving force and the RE.

The method makes use of a QM/MM coupling scheme. The reactive region in which the charge can be localized is treated with a QC method. It is split further into the frontier orbitals, which are treated with QC explicitly, while the remainder of this region is described with MM, much like in the HMO and PPP approaches. The main approximations are the following:

— The wave function of the excess electron or hole is considered as a superposition of orbitals of the molecular fragments. This makes the QC calculations, which are the most time-consuming component of the simulation scheme, scale linearly with the system size.

— The total energy to the first order is the energy of the neutral system

*E*^{0}plus the energy of the excess electron*E*^{1}(or its negative in the case of hole transfer), according to the Koopmans approximation. We showed that the basic assumption, a similar shape of the frontier orbitals of the neutral and the ionized fragments, holds well.— All of the energy change due to the different on-site electron–electron interactions as well as that due to the relaxation and rotation of orbitals is covered by the second-order term

*E*^{2}by using a Hubbard model effectively. These effects are evaluated for isolated fragments and are assumed to hold in the weakly interacting complex system as well.— In the calculation of forces on the atoms, the contributions of kinetic energy and XC to

*E*^{1}have to be neglected further. Consequently, two kinds of processes taking place upon the charging of the system are not covered: the geometry relaxation of the fragments and the change of van der Waals interactions between the fragments. While the latter effect is small, the former is significant and is taken into account by an additional inner sphere reorganization contribution to the total energy. This may even be a superior approximation because quantum effects can be taken into consideration, which would be missing in the standard semi-classical dynamics schemes.

The derived approximate expression for the total energy allows us to compute the forces on the atoms, as in the standard QC methods. This is the basis for the application of a non-adiabatic semi-classical scheme, MF (Ehrenfest) or SH. The testing revealed that the application of the MF method is problematic because it leads to a pronounced, overestimated delocalization of the wave function of the excess charge. This resulted in the wrong description of the transfer because the outer sphere reorganization energies were underestimated. This issue does not apply in the SH scheme.

The framework has been applied to the hole transfer in DNA, in *E. coli* DNA photolyase, and in model peptides and organic materials. We summarize the main findings here.

While there has been no real consensus on the fundamental features of hole transfer in DNA, our results showed that this can be resolved on the basis of direct ET simulations. The hole is localized nearly completely as confirmed by non-adiabatic SH simulations, and reasonable hopping rates were obtained, albeit slightly faster than in the experiments. In DNA, the time scale of transfer not shorter than tens of picoseconds is perhaps long enough so that no problems related to the relaxation effects would occur, and it is possible to apply the Marcus theory. Also, RE is much larger than the electronic coupling, so that the hole is localized on a single nucleobase predominantly, and the transfer proceeds with the solvent-controlled adiabatic mechanism in homogeneous sequences. The detailed sequence dependence of transfer can be readily studied with direct simulations.

For the consecutive hole transfer reaction in the protein *E. coli* DNA photolyase, we find the second elementary step to be fast, in accordance with the experiments. This fast transfer cannot be predicted on the basis of the standard Marcus theory because of the time-scale separation problem: the time scale of transfer interferes with that of the structural relaxation. Generally, long time-scale relaxation effects ranging from pico- to nanoseconds [112] can be observed in biomolecular complexes whenever a sudden change of the electrostatics in the system has taken place. It seems to be common that there are slow relaxation modes in proteins, on a time scale exceeding 100 ps [125], which are not present in DNA for instance.^{8} In this case it does not seem possible to construct a simple model since the dynamics of the system seems to depend on structural details of the molecule. Rather, an atomistic approach is needed, particularly regarding the environment, which may be pronouncedly heterogeneous.

The study of hole transfer in model peptides relies on the Marcus theory, which is justified as the transfer is slow. The focus is the calculation of electronic couplings for the transfer between the amino acid side chains, involving Green's function methods and taking into account the *π*-electron systems in the backbone.

An application of the multi-scale simulation scheme for organic semi-conductors faces an additional challenge as it is necessary to include multiple molecular orbitals per fragment. The preliminary results show that the ET in these materials is anything but similar to that in the aqueous biomolecular complexes. RE may be much smaller owing to the non-polar character of the system, being dominated by the inner sphere component *λ*_{i}. The relation of RE to the electronic coupling may not be as clear as in DNA or in the protein, or may even be reversed. Consequently, considerable delocalization of charge is observed, making it possible for the transfer to proceed with an entirely different mechanism from that in the previous cases, similar to band transport. Then, direct dynamics simulations would be a method with clear advantage over the Marcus theory. This particular case demonstrates that our methodology seems to be well suited for the description of fast CT processes for which the detailed mechanism is still in question, in particular when the application of a hopping scheme may not be possible. The organic electronics is perhaps a typical example [133–135].

These entirely different characters of ET in the various molecular systems make it difficult to develop a unified phenomenological model of transfer. Then, a multi-scale simulation scheme as reviewed here is a viable alternative to characterize the features and predict the rate of the ET reaction.

## Acknowledgements

We thank Alexander Heck for valuable discussions and Florian Weigend for help with QC calculations. The reviewed work was supported by the Deutsche Forschungsgemeinschaft under contracts EL 206/5-1, EL 206/5-2 and EL 206/5-3 within the priority programme Quantum Transport at the Molecular Scale (SPP 1243).

## Footnotes

↵1 While this separation of time scales [30,31] may not be valid in some fast processes [32–34], an extension of the Marcus model to account for this effect was proposed [32].

↵2 The derivation for the system with an excess electron (radical anion) is analogous.

↵3 Also, it may be noted that only the relative energetics of the molecular conformations is described by the zeroth-order term.

↵4 The typical frequency of skeletal C–C vibrations is 1600 cm

^{–1}, and 1 kT = 209 cm^{–1}at 300 K.↵5 The atomic charges are overestimated in the commonly used biomolecular force fields to compensate for the missing electronic polarization in polar environments such as aqueous solutions.

↵6 Unfortunately, no experimental work is known to the authors that could be interpreted in terms of the absolute rate of hole transfer in this or similarly simple heterogeneous DNA sequences.

↵7 The injection of charge itself cannot be simulated with the current methodology.

↵8 Whereas the change in electrostatics concerns the dipole moment of the chromophore in spectroscopic studies, the same is expected when an electron is transferring in the molecular system.

- Received May 7, 2013.
- Accepted June 24, 2013.

- © 2013 The Author(s) Published by the Royal Society. All rights reserved.