On time-dependent perturbation theory in matrix mechanics and time averaging

The time-dependent quantum perturbation theory developed by Born, Heisenberg and Jordan in 1926 is revisited. We show that it not only reproduces the standard theory formulated in the interaction picture, but also allows one to construct more accurate approximations if time averaging techniques are employed. The theory can be rendered unitary even if the expansion is truncated by using a transformation previously suggested by Heisenberg. We illustrate the main features of the procedure on a simple example which clearly shows its advantages in comparison with the standard perturbation theory.

purpose was then, in analogy with classical perturbation theory, to construct perturbatively a 'canonical transformation' S on the variables q and p to a new set of variables in which the Hamiltonian W of the perturbed system is diagonal. When dealing with time-dependent external forces, an appropriate perturbation scheme was formulated from the preceding one, this time involving an explicitly time-dependent transformation matrix $S(t)$. At each step of the procedure, a choice had to be made for the matrix W, and subsequently the transformation S was determined by solving the corresponding equation.
By contrast, the procedure followed in the wave mechanics approach and later in the standard formulation of quantum mechanics was more analytical in character: the purpose is to get the quantum time-evolution operator $U(t)$. If the Hamiltonian H does not depend on time, then one has to solve the time-independent Schrödinger equation $H\psi = E\psi$. This is done by determining both the eigenvalues $E_n$ and the eigenfunctions $\psi_n$ as power series in the perturbation parameter. If instead H depends explicitly on time, $U(t)$ is constructed in the interaction representation, i.e., as a factorization into the evolution operator corresponding to the unperturbed part of the Hamiltonian and the evolution operator of an appropriately transformed Hamiltonian describing the perturbation, the latter being approximated as a power series in the perturbation parameter [17].
Both perturbation theories lead of course to the same results when the Hamiltonian is independent of time [23], although the original connection with classical mechanics is somewhat lost in wave mechanics, particularly when dealing with time-dependent problems. Perhaps for this reason, several papers have been published over the years emphasizing this relation by using techniques of classical mechanics and dynamical systems in a quantum mechanical context: normal forms [1,12], averaging theory to avoid secular terms [14,19], unitary transformations in time-dependent perturbation theory [9,11,20,21], etc.
Here we consider once again the time-dependent perturbation theory as originally proposed by Born, Heisenberg and Jordan in [8] and analyze the formalism after expressing it in modern quantum mechanics language. We show that this technique is indeed more general than the standard procedure of wave mechanics, in the sense that different choices for the new Hamiltonian are possible, thus leading to different kinds of approximations. It is with a particular choice of W, namely by taking as W the unperturbed part of the original Hamiltonian, that the standard perturbation theory is recovered. We then explore other options that avoid the presence of secular terms by using high-order averaging, a well-known technique in classical perturbation theory [3]. Although the approximations resulting from truncating the series are not unitary, they turn out to be more accurate than those arising in the standard treatment over longer time intervals. Moreover, a unitary formalism can be obtained even when truncating the series, by following the initial approach designed by Heisenberg.
Here we analyze this alternative and illustrate the different procedures in a simple two-level system whose exact solution can be obtained in closed form.
We believe the approach proposed by Born, Heisenberg and Jordan has several advantages when dealing with time-dependent perturbations in quantum mechanics. On the one hand, the procedure is formally similar to the time-independent case. On the other hand, the treatment is obtained by applying standard techniques of classical perturbation theory (transformations of variables, averaging), so that it highlights the connection with classical mechanics, something that might be helpful for students taking a course in quantum mechanics but already familiar with the mathematical aspects of classical and celestial mechanics. In addition, the standard perturbation theory is easily recovered in this framework, if desired, although by applying averaging it is possible to get more accurate results.

Standard time-dependent perturbation (STDP) theory in quantum mechanics
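In the standard quantum mechanical treatment of a given quantum system interacting with an external environment, such as that created by an external field explicitly depending on time, one considers a constant (unperturbed) Hamiltonian $H_0$ associated with the system and an additional time-dependent part $H'(t,\varepsilon)$,

$$H(t) = H_0 + H'(t,\varepsilon). \qquad (1)$$

The state at time $t$, characterized by the wave function $\psi(t)$, is then related to the initial state by $\psi(t) = U_H(t,t_0)\,\psi(t_0)$, where the evolution operator satisfies

$$i\hbar\,\frac{\partial}{\partial t}U_H(t,t_0) = H(t)\,U_H(t,t_0), \qquad U_H(t_0,t_0) = I. \qquad (2)$$

If $H' = 0$, then $U_H(t,t_0) = U_{H_0}(t,t_0) = e^{-i(t-t_0)H_0/\hbar}$, but the general case is not so simple. A convenient way to proceed consists in transforming the problem into the interaction picture [17], i.e. the solution is factorized as

$$U_H(t,t_0) = U_{H_0}(t,t_0)\,U_I(t,t_0), \qquad (3)$$

where the unknown operator $U_I$ obeys the Schrödinger equation

$$i\hbar\,\frac{\partial}{\partial t}U_I(t,t_0) = H_I(t)\,U_I(t,t_0), \qquad U_I(t_0,t_0) = I, \qquad (4)$$

with

$$H_I(t) = U_{H_0}^{\dagger}(t,t_0)\,H'(t,\varepsilon)\,U_{H_0}(t,t_0). \qquad (5)$$

In this way, the transformation defined by (3) allows one to incorporate the solvable piece $H_0$ of the Hamiltonian and to focus on the (approximate) integration of the perturbation as given by (5). In general, however, it is not possible to get analytical solutions of (4), and so one must turn to a perturbative analysis, looking for an expansion of $U_I$ in terms of the parameter $\varepsilon$,

$$U_I(t,t_0) = I + \sum_{n \ge 1} \varepsilon^n\, U_n(t,t_0), \qquad (6)$$

where $U_n(t,t_0)$ stands for the contribution of order $n$ in $\varepsilon$. This can be obtained by substituting (6) into (4) and then equating terms of the same power in $\varepsilon$. Thus, when the perturbation is linear in $\varepsilon$, so that $H_I(t) = \varepsilon H_{I,1}(t)$, the first terms satisfy

$$i\hbar\,\frac{\partial U_1}{\partial t} = H_{I,1}, \qquad i\hbar\,\frac{\partial U_2}{\partial t} = H_{I,1}\,U_1,$$

and the general expression of $U_n$ in (6) is given by

$$U_n(t,t_0) = \left(-\frac{i}{\hbar}\right)^{n} \int_{t_0}^{t} dt_1 \int_{t_0}^{t_1} dt_2 \cdots \int_{t_0}^{t_{n-1}} dt_n\; H_{I,1}(t_1)\,H_{I,1}(t_2)\cdots H_{I,1}(t_n).$$

Very often, it is sufficient to determine the first few terms to describe transition probabilities between states. The resulting approximations, however, present some undesirable features. First, any truncation of the infinite sum (6) is no longer unitary, and thus the computed transition probability between different quantum states may exceed unity. Second, secular terms in time (i.e., terms of the form $t^m$) may appear at each order, and so the quality of the approximation degrades considerably with time.
As a consequence, one cannot represent the solution for all $t$ using a finite number of terms in the series, and it is only when the infinite series is summed that the unitary character is restored.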
Example. Throughout the paper we use as an illustration a simple two-level quantum system whose Hamiltonian is written in terms of the Pauli matrices $\sigma_j$ and the real parameters $\varepsilon$, $\omega_0$ and $\omega \neq \omega_0$. If 1 and 2 stand for the spin up and down states, respectively, the exact transition probability from state 1 to state 2 is available in closed form. Secular terms already appear at order 2, and so one expects that the truncated (non-unitary) approximation obtained from (3) will be valid only for short times. The resulting transition probability up to this order is given in (12). Secular terms appear in $P(t)$ at higher orders in $\varepsilon$.
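As a rough numerical companion to this example, the sketch below assumes (this specific form is our assumption, with $\hbar = 1$) the rotating-field Hamiltonian $H(t) = \tfrac{1}{2}\omega_0\sigma_3 + \varepsilon(\sigma_1\cos\omega t + \sigma_2\sin\omega t)$. Under that assumption the exact transition probability is the Rabi formula with frequency $\omega' = \sqrt{(\omega_0-\omega)^2 + 4\varepsilon^2}$, whereas the first-order term of the series (6) gives the same expression with $\omega'$ replaced by $\omega_0 - \omega$. All function names are illustrative only.

```python
import cmath
import math


def exact_probability(t, eps, omega0, omega):
    """Rabi formula for the ASSUMED rotating-field Hamiltonian (hbar = 1)."""
    delta = omega0 - omega
    omega_p = math.sqrt(delta ** 2 + 4.0 * eps ** 2)
    return (4.0 * eps ** 2 / omega_p ** 2) * math.sin(0.5 * omega_p * t) ** 2


def first_order_probability(t, eps, omega0, omega):
    """|eps * U_1|^2 for the 1 -> 2 transition: first term of the series (6)."""
    delta = omega0 - omega
    return (4.0 * eps ** 2 / delta ** 2) * math.sin(0.5 * delta * t) ** 2


def numerical_probability(t_final, eps, omega0, omega, steps=20000):
    """RK4 integration of the interaction-picture equations, used as a check.

    dc1/dt = -i eps exp(+i delta t) c2,  dc2/dt = -i eps exp(-i delta t) c1.
    """
    delta = omega0 - omega

    def rhs(t, c1, c2):
        return (-1j * eps * cmath.exp(1j * delta * t) * c2,
                -1j * eps * cmath.exp(-1j * delta * t) * c1)

    h = t_final / steps
    t, c1, c2 = 0.0, 1.0 + 0.0j, 0.0 + 0.0j  # system starts in state 1
    for _ in range(steps):
        k1 = rhs(t, c1, c2)
        k2 = rhs(t + h / 2, c1 + h / 2 * k1[0], c2 + h / 2 * k1[1])
        k3 = rhs(t + h / 2, c1 + h / 2 * k2[0], c2 + h / 2 * k2[1])
        k4 = rhs(t + h, c1 + h * k3[0], c2 + h * k3[1])
        c1 += h / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0])
        c2 += h / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1])
        t += h
    return abs(c2) ** 2


if __name__ == "__main__":
    eps, omega0, omega = 0.1, 1.0, 0.85  # arbitrary illustrative values
    for t in (5.0, 10.0, math.pi / (omega0 - omega)):
        print(t,
              exact_probability(t, eps, omega0, omega),
              first_order_probability(t, eps, omega0, omega))
```

With these arbitrary values the first-order probability peaks at $4\varepsilon^2/(\omega_0-\omega)^2 \approx 1.78$, which illustrates the loss of unitarity, whereas the exact probability never exceeds $0.64$.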

Time-dependent perturbation theory in matrix mechanics
Historically, however, the first treatment of perturbations in quantum mechanics proceeded in a different way, more closely related to the usual approach followed in classical Hamiltonian mechanics. The objective there is to construct a canonical transformation (a symplectic change of coordinates in phase space) such that in the new variables the dynamics of the resulting Hamiltonian is easily solved. This transformation can be obtained in particular by solving the corresponding Hamilton-Jacobi equation [3,15]. Analogously, the idea of Born, Heisenberg and Jordan in [8] when developing a perturbation theory in matrix mechanics was to construct a unitary transformation that renders the Hamiltonian 'solvable' (see e.g. [4]). In their procedure the (perturbed) system to be solved is described by a Hamiltonian H of the form $H = H_0 + \varepsilon H_1 + \varepsilon^2 H_2 + \cdots$, where the dynamics corresponding to $H_0$ is known. This means that the Hamiltonian $H_0$ is diagonal in the momentum $p_0(t)$ and coordinate $q_0(t)$, so that their time evolution is known explicitly. Inserting the expansions of S and W as power series in $\varepsilon$ into (15) and collecting together terms of the same power in $\varepsilon$, a hierarchy of equations for $S_r$ and $W_r$ is obtained [8]. In this way, it is possible to obtain general formulae for the first terms in the expansion of the eigenvalues of W (i.e., the energy levels of H) even for degenerate systems [8].
What happens if time enters explicitly into $H_1, H_2, \ldots$, but not into $H_0$ (as is the case, for instance, when time-dependent external forces act on a system)? Then, according to [p 336, 8], 'simple considerations show that for this case the perturbation formulae ensue from those cited earlier' (i.e., equation (17)) on applying the replacement (18). Again, $W_r$ is determined as the mean value of $F_r$ and $S_r$ is obtained by solving the resulting matrix equation. Moreover, Born, Heisenberg and Jordan assume that the formalism also applies when the external forces are not periodic in time (even though this assumption was incorporated into the derivation of the formulae) [8].

Reformulating the BHJ perturbation theory
At this point, a natural question arises: is this time-dependent perturbation theory equivalent to the standard one developed in section 2? Whereas in the autonomous case the procedure (14)-(17) reproduces the standard perturbative formalism for determining the eigenvalues of H [17], this is by no means obvious when the Hamiltonian depends explicitly on time. Our claim here is that the BHJ formalism is indeed more general than the standard treatment of section 2, but reproduces it as a particular case.
To substantiate this claim, let us first rephrase the BHJ time-dependent perturbation theory in modern quantum mechanics language as follows. Suppose we have a quantum problem defined by the Hamiltonian (1) such that the dynamics associated with $H_0$ has been solved, i.e., we have computed $U_{H_0}(t,t_0)$. Then, a unitary transformation $S(t,\varepsilon)$ is sought such that the equation satisfied by the new wave function $\Psi(t)$, related to $\psi(t)$ through $S(t)$, or equivalently, by the new evolution operator $U_W(t,t_0)$, is easier to solve than (2). It is clear that both evolution operators are then related through the transformation S. Equating terms with the same power of $\varepsilon$ in (23), we get the equations (24) for $n = 1, 2, \ldots$. From (23) and (24) we clearly see the origin of the prescription (18) for getting the perturbation formulae as given in [8].

Recovering the STDP theory
Next we analyze how the present framework allows one to reproduce the standard time-dependent perturbation theory developed in section 2. More specifically, we show that the STDP theory just corresponds to taking the unperturbed part $H_0$ as the new Hamiltonian. In other words, the expression (3) for the evolution operator $U_H$ with (6) is recovered when the unitary transformation S is designed such that $W = H_0$.
To keep the treatment as simple as possible, we analyze the case in which the perturbation in (1) is linear in $\varepsilon$, i.e. equal to $\varepsilon H_1(t)$, and take $t_0 = 0$. We also drop the second argument $t_0$ in the definition of $U_H$, etc.
To determine the inverse of the transformation S, instead of working with equation (24), it is more convenient to formulate a similar equation for the series S t I S t S t , .

Averaging
As a matter of fact, Born, Heisenberg and Jordan pursued a different course of action to deal with equation (24). Inspired by the classical treatment of perturbations in Hamiltonian classical mechanics (see e.g. [3]), they proposed instead to take as the new Hamiltonian $W_n$ the mean value (or time average) of $V_n(t)$ at each order, and subsequently to determine the matrix $S_n(t)$ from (24) [8].
What is the 'mean value' of $V_n(t)$? If the function is periodic with period $T$, then the answer is clear: the mean value is just $\frac{1}{T}\int_0^T V_n(t)\,dt$. The mean value is also well defined for quasi-periodic functions of time. We recall that a function V is said to be quasi-periodic with basic frequencies $\omega_1, \ldots, \omega_r$ when it depends on time only through the phases $\omega_1 t, \ldots, \omega_r t$ and is $2\pi$-periodic in each of them; the frequencies are assumed in addition to satisfy a non-resonance inequality. This inequality prevents of course the presence of small denominators in the expression of $S_n(t)$ [2]. When the procedure is carried out for $m$ steps, the series are truncated at terms of order $\varepsilon^m$ and the transformation defined by the corresponding truncated S is considered. In the new picture the evolution is generated by the truncated (constant) Hamiltonian W, and the approximation to the original evolution operator then follows from (20). Should the series $S(t,\varepsilon)$ for the change of variables converge, the procedure described above would allow us to solve the original perturbed problem.
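The role of the mean value can be illustrated with a scalar toy problem (our own simplification, not the matrix equation (24)): if $S$ obeys $dS/dt = V(t) - W$, choosing $W$ as the time average of $V$ keeps $S$ bounded, whereas $W = 0$ produces a term growing linearly in $t$. The function $V(t) = 0.3 + \cos 2t$ and all names below are hypothetical.

```python
import math


def mean_over_period(v, period, samples=2000):
    """Time average (1/T) * integral_0^T v(t) dt, computed with the midpoint rule."""
    h = period / samples
    return sum(v((k + 0.5) * h) for k in range(samples)) * h / period


def integrate(f, t_final, samples=100000):
    """Midpoint-rule approximation of integral_0^t_final f(s) ds."""
    h = t_final / samples
    return sum(f((k + 0.5) * h) for k in range(samples)) * h


def v(t):
    """Hypothetical periodic 'perturbation' with period pi and mean value 0.3."""
    return 0.3 + math.cos(2.0 * t)


PERIOD = math.pi
W = mean_over_period(v, PERIOD)  # plays the role of the averaged term W_n


def s_without_averaging(t):
    """S(t) when W = 0: contains the secular contribution 0.3 * t."""
    return integrate(v, t)


def s_with_averaging(t):
    """S(t) when W is the mean of V: only the bounded oscillation remains."""
    return integrate(lambda s: v(s) - W, t)


if __name__ == "__main__":
    for n_periods in (1.25, 10.25, 50.25):
        t = n_periods * PERIOD
        print(n_periods, s_without_averaging(t), s_with_averaging(t))
```

In this toy case the averaged choice leaves $S(t) = \tfrac{1}{2}\sin 2t$, bounded for all times, which is the scalar counterpart of removing secular terms order by order.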
Notice that with this procedure we have to compute the terms S t

Unitary perturbation theory with averaging
At this point it is worth remarking that Heisenberg had earlier suggested constructing the transformation as the exponential of an anti-Hermitian operator, $S = \exp(\mathcal{S})$ [22]. Although this approach was considered more involved than the one based on a direct power expansion of S (and even wrong at first sight [p 49, 22], since the corresponding formulae differ from (17) already at order two), we could say that, in a sense, it is qualitatively superior, for the unitary character of the transformation is guaranteed even when the series for $\mathcal{S}$ is truncated, in contrast with the approach (13)-(18), where unitarity is only preserved up to the order of the approximation. When this approach is applied to the explicitly time-dependent case, equation (21) can be rewritten in terms of the exponent $\mathcal{S}$ as equation (38), where the symbol $\mathrm{dexp}_{\mathcal{S}}(C)$ stands for the (everywhere convergent) power series which naturally arises when differentiating the exponential of a non-constant matrix [6]. Here $\mathrm{ad}_{\mathcal{S}}(C) = [\mathcal{S}, C]$ is a short-hand notation for the (iterated) commutator and $B_k$ are the Bernoulli numbers [18]. Inserting the corresponding series for H, W and $\mathcal{S}$ into (38) and collecting terms of the same power in $\varepsilon$, one arrives at the equations (39), whose explicit expression can be determined using the same techniques as in e.g. [5]. By following a similar approach as in section 4.2, one gets a solution of (39) such that $\mathcal{S}_n(t)$ is free of secular terms.
We thus get formally the same expressions as in section 4.2, but there are important differences. First, equations (39) for determining $\mathcal{S}_n$ are formulated only in terms of skew-Hermitian operators and commutators. If, as is often the case, the original problem is defined in a Lie algebra of finite dimension, one can use the structure constants to compute the commutators appearing in $F_n$ and thus obtain a more efficient algorithm. Second, the procedure renders unitary approximations by construction even if the series $\mathcal{S}(t,\varepsilon)$ is truncated after $m$ terms. Third, computing the inverse transformation $S^{-1}(t,\varepsilon)$ is trivial once $\mathcal{S}$ has been obtained: we only have to change the sign of $\mathcal{S}$ and evaluate $S^{-1}(t,\varepsilon) = \exp(-\mathcal{S}(t,\varepsilon))$ (or the corresponding truncation of $\mathcal{S}$).
A variation of this procedure, considered in the context of quantum averaging [21] and the design of unitary transformations [9], is the following: introduce a skew-Hermitian operator $L(t,\varepsilon)$ such that $S(t,\varepsilon)$ is the formal solution of the operator differential equation (43). In other words, the required unitary transformation $S(t,\varepsilon)$ is obtained by shifting a 'time' $\varepsilon$ along the trajectories of the differential equation (43). The operator L can be seen as the generator of the transformation. This is in complete analogy with the procedure proposed by Deprit in classical mechanics [15], and leads to some computational simplifications in the procedure.
... (12), we see that the result furnished by the STDP theory is clearly less accurate than the approximations rendered by both the BHJ and the unitary schemes.
To illustrate how the different approximations behave for longer times, in figure 2 we collect the errors in the transition probability obtained with the STDP theory (solid line), the BHJ formalism with averaging (dashed line) and the unitary version (dotted line), for the same value of the parameters but now up to order $\varepsilon^6$, after 100 periods $T = 2\pi/\omega'$ of the exact result. Whereas BHJ and its unitary version provide almost indistinguishable results (although the former is not unitary), those achieved with the standard perturbation theory are completely useless for large times.
To clarify the nature of the error in the BHJ formalism and its unitary variant, in figure 3 we depict the transition probability in the interval $t \in [280, 283]$, where now the solid line corresponds to the exact result (the result given by STDP theory is far outside the scale of the graph). Only the result obtained by BHJ theory is depicted, since the curve corresponding to the unitary version is almost identical. Notice that the amplitude of the approximations is quite similar to the exact result, and the error is mainly due to a shift in the graph of the function describing the transition probability.

Discussion
In this paper we have reformulated in modern quantum mechanical language the original time-dependent theory developed by Born, Heisenberg and Jordan in matrix mechanics. The formalism is based on designing a unitary transformation S such that in the new representation the dynamics associated with the corresponding Hamiltonian W is easier to solve than the original problem. In the case of autonomous problems, W is taken as diagonal. When time enters explicitly, the new Hamiltonian is chosen as constant. Both S and W are constructed as power series in the perturbation parameter. We have shown that the choice $W = H_0$ leads directly to the STDP theory, whereas the application of averaging eliminates secular terms and allows one to construct approximations that are reasonably accurate for longer time intervals. This is so although the approximations are not unitary when the series are truncated. Nevertheless, by constructing S as the exponential of an anti-Hermitian operator $\mathcal{S}$, as Heisenberg first proposed, the theory can be rendered unitary even after truncation.
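The difference between the two kinds of truncation can be checked on a $2\times 2$ toy case. The sketch below takes a hypothetical first-order term, a skew-Hermitian matrix $A = i(a\sigma_1 + b\sigma_2 + c\sigma_3)$ standing in for the first-order contribution to the transformation, and compares the unitarity defect of the truncated power series $I + A$ with that of $\exp(A)$. The exponential is evaluated with the closed formula $\exp(A) = \cos r\, I + (\sin r / r)\, A$, where $r = \sqrt{a^2+b^2+c^2}$. The matrix and all function names are illustrative assumptions.

```python
import math


def skew_hermitian(a, b, c):
    """A = i (a sigma_1 + b sigma_2 + c sigma_3) as a 2x2 nested list."""
    return [[1j * c, 1j * a + b],
            [1j * a - b, -1j * c]]


def truncated_series(a, b, c):
    """First-order truncation I + A of the power series for the transformation."""
    A = skew_hermitian(a, b, c)
    return [[1 + A[0][0], A[0][1]],
            [A[1][0], 1 + A[1][1]]]


def exponential(a, b, c):
    """exp(A) = cos(r) I + (sin(r)/r) A, valid because A*A = -r^2 I."""
    r = math.sqrt(a * a + b * b + c * c)
    A = skew_hermitian(a, b, c)
    cos_r = math.cos(r)
    sinc_r = math.sin(r) / r if r > 0.0 else 1.0
    return [[cos_r + sinc_r * A[0][0], sinc_r * A[0][1]],
            [sinc_r * A[1][0], cos_r + sinc_r * A[1][1]]]


def unitarity_defect(U):
    """Largest entry (in modulus) of U^dagger U - I."""
    defect = 0.0
    for i in range(2):
        for j in range(2):
            entry = sum(U[k][i].conjugate() * U[k][j] for k in range(2))
            target = 1.0 if i == j else 0.0
            defect = max(defect, abs(entry - target))
    return defect


if __name__ == "__main__":
    a, b, c = 0.3, 0.0, 0.1  # arbitrary illustrative coefficients
    print(unitarity_defect(truncated_series(a, b, c)))
    print(unitarity_defect(exponential(a, b, c)))
```

For $(a, b, c) = (0.3, 0, 0.1)$ the defect of $I + A$ equals $r^2 = 0.1$, while that of $\exp(A)$ stays at round-off level.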
Here we have proceeded on a purely formal level, and several important issues should be further studied, in particular the convergence of the procedure and the conditions to be satisfied by the operators so that the formal solution (30) is well defined in a functional analytic framework. On the other hand, when only matrices are involved, this formalism can be seen as a particular case of application of high order averaging to linear systems [3].
Several studies exist in the literature where quantum mechanical analogues of classical Hamiltonian perturbation methods have been proposed, involving averaging techniques and even several unitary transformations [11,19,20,21]. These transformations can be designed to be unitary by applying an analogue of the Deprit technique in classical mechanics [9]. The novelty of the treatment presented here is that, in one way or another, all these contributions can be obtained from the original BHJ treatment, and that this formalism constitutes indeed an effective alternative to the usual time-dependent perturbation theory when dealing with practical problems in quantum mechanics.