A pedagogical approach to the Magnus expansion

Time-dependent perturbation theory as a tool to compute approximate solutions of the Schrödinger equation does not preserve unitarity. Here we present, in a simple way, how the Magnus expansion (also known as exponential perturbation theory) provides such unitary approximate solutions. The purpose is to illustrate the importance and consequences of such a property. We suggest that the Magnus expansion may be introduced to students in advanced courses of quantum mechanics.


Motivation
In physics courses the key role of a few first principles in the formulation of the basic laws is emphasized. Students are then, from the very beginning, exposed to the idea that the fundamental equations have to strictly obey those requirements. Soon, however, when they come to applications, they face the fact that the exact solution of these equations is a rare possibility and consequently approximations are necessary.
A question then naturally arises: do these approximate treatments still satisfy the requisite basic principles? For some of the more conventional approaches the answer is often in the negative. It is then interesting to discuss to what extent one can devise approximate algorithms which are still respectful of the general principles.
In this paper we consider this issue for the case of time evolution in quantum mechanics. Our presentation will be addressed to physics or engineering students with some general ideas of quantum mechanics and elementary knowledge of differential equations and matrix calculus.
During the evolution in time of a quantum system, the total probability of the possible outcomes of any measurement on it must be conserved. In mathematical terms, the time-evolution operator U(t, t₀) transforming the state at time t₀ into the state at time t is unitary.
However, when the corresponding time-dependent Schrödinger equation is solved by the most familiar approximation schemes, the unitarity condition is lost.
Here we present the Magnus expansion (ME) [1]: a systematic way to build approximations to the time-dependent Schrödinger equation in such a way that, in any order, the evolution operator is unitary. It has a long history of more than 50 years and has been, and still is, widely used. However it seems to have found no place in the graduate curriculum, far less at the undergraduate level, as the lack of elementary expositions indicates. An interesting exception is reference [2], which formally unifies different unitary expansions but without illustrative examples. Our exposition will be kept within pedagogical limits and, consequently, questions of mathematical rigor and further developments will intentionally be left out. The interested reader is referred to [3] for a more detailed recent review where a more complete list of references may also be found.
The paper is organized as follows: in section 2 the conventional textbook treatment of the time-evolution operator is roughly summarized pointing out its conflict with the unitarity requirement. In section 3 the ME is deduced in as direct a way as possible. The first three terms are explicitly given and their relation with the conventional time-dependent perturbation theory established. Section 4 includes two examples to illustrate how the algorithm works. The last section collects some final comments.

Customary time-dependent perturbation theory
The time-evolution operator U(t, t₀) is introduced in quantum mechanics in order to study how a system changes with time. It evolves the wavefunction between times t₀ and t as ψ(t) = U(t, t₀)ψ(t₀). During the evolution the norm of the wavefunction is kept constant, which ensures probability conservation. Mathematically this means that U is a unitary operator, U†(t, t₀)U(t, t₀) = I. The operator U obeys the Schrödinger equation

    iħ ∂U(t, t₀)/∂t = λ H(t) U(t, t₀),    (1)

with the initial condition U(t₀, t₀) = I. Here H(t) is the Hamiltonian of the system, which may depend on time, and λ is a bookkeeping parameter which at the end can be taken as λ = 1. From the mathematical point of view, the basic objects we deal with (H, U and their relatives) are operators in the Hilbert space of quantum states. However, for the sake of simplicity, we suggest that the inexperienced reader think of them as finite-dimensional matrices, and in what follows we will refer to them as such.
Equation (1) is easily solved by iteration. This procedure is equivalent to time-dependent perturbation theory (TDPT) and gives the expansion

    U(t, t₀) = Σ_{n=0}^{∞} λⁿ Pₙ(t, t₀),    (2)

with P₀ = I and Pₙ(t₀, t₀) = 0 for n ≥ 1. Substituting in (1) and comparing the terms of the same order in λ, we find

    iħ ∂Pₙ(t, t₀)/∂t = H(t) Pₙ₋₁(t, t₀),    (3)

whence

    Pₙ(t, t₀) = (iħ)⁻ⁿ ∫_{t₀}^{t} dt₁ ∫_{t₀}^{t₁} dt₂ ··· ∫_{t₀}^{t_{n−1}} dtₙ H₁ H₂ ··· Hₙ,    (4)

where we have shortened Hᵢ ≡ H(tᵢ).
If the Hamiltonian does not depend on time, then (4) yields

    Pₙ(t, t₀) = (1/n!) [(t − t₀) H / (iħ)]ⁿ,    (5)

and the sum in (2) gives the familiar expression

    U(t, t₀) = exp[−iλ(t − t₀)H/ħ],    (6)

which also follows by direct integration of (1) in this case. The last result explicitly shows the unitarity of U(t, t₀): any matrix of the form exp(iA) with A Hermitian is always unitary.
Note that the previous equation may be deceptively simple: matrix exponentials are not always easy to calculate. As we will see, the same comment also applies to the ME. When the Hamiltonian is time dependent, however, no such simple form is obtained. Of course the sum of the whole series (2) provides a unitary operator, but in practice one has to truncate the series, and the result is no longer unitary. It may be instructive to compare the situation with a very familiar one: the sine function is certainly periodic, but any truncation whatsoever of its Taylor series ceases to share that property.
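Both facts above are easy to check numerically: the full exponential of i times a Hermitian matrix is unitary, while a truncation of its Taylor series is not. A minimal sketch with a randomly generated Hermitian matrix (our own illustration):

```python
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(0)
M = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
A = (M + M.conj().T) / 2                         # Hermitian matrix

U = expm(1j * A)                                 # full matrix exponential
print(np.allclose(U.conj().T @ U, np.eye(4)))    # True: exp(iA) is unitary

U2 = np.eye(4) + 1j * A - A @ A / 2              # second-order Taylor truncation
print(np.allclose(U2.conj().T @ U2, np.eye(4)))  # False: truncation spoils unitarity
```

The deviation of the truncated U2 from unitarity appears at the first neglected order, just as it does for truncated TDPT.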
The non-unitary character of finite-order TDPT does not mean that this is a useless scheme. On the contrary, the theory has been successfully in use ever since the beginning of quantum mechanics. Therefore the student should not lose interest in grasping the TDPT techniques. Just keep in mind that every approximate method of solution has its own limitations, and this holds true also for the ME.
Here a further comment is in order: equation (6) may be trivially written as

    U(t, t₀) = exp[−(iλ/ħ) ∫_{t₀}^{t} H dt₁].    (7)

In the time-dependent case, neither equation (6) nor (7) is valid. But, as can be seen in some advanced textbooks on quantum mechanics, it is customary to introduce the so-called time-ordering or Dyson operator T, defined by

    T[H(t₁)H(t₂)] = H(t₁)H(t₂)  if t₁ ≥ t₂,   T[H(t₁)H(t₂)] = H(t₂)H(t₁)  if t₂ > t₁,    (8)

and then U(t, t₀) can symbolically be written as

    U(t, t₀) = T exp[−(iλ/ħ) ∫_{t₀}^{t} H(t₁) dt₁],    (9)

which, in spite of its appearance, does not correspond to a true exponential representation. By this we mean that whereas (7) admits the conventional series expansion of the exponential function in powers of the exponent, (9) cannot be expanded in that way. This should be clear from equation (4), since the multiple integral is not the nth power of a single integral.
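The last remark can be made concrete numerically: for a noncommuting H(t), the ordered double integral in (4) differs from half the square of the single integral, and the discrepancy is a commutator term. A sketch with hypothetical 2 × 2 matrices (midpoint Riemann sums on [0, 1]):

```python
import numpy as np

A = np.array([[1.0, 0.0], [0.0, -1.0]])   # sigma_z
B = np.array([[0.0, 1.0], [1.0, 0.0]])    # sigma_x
H = lambda t: A + t * B                   # H(t1) and H(t2) do not commute

n = 400
h = 1.0 / n
ts = (np.arange(n) + 0.5) * h             # midpoints

# ordered double integral over t1 >= t2 (half weight on the diagonal cells)
ordered = sum(H(t1) @ H(t2) * h * h
              for i, t1 in enumerate(ts) for t2 in ts[:i]) \
          + sum(H(t) @ H(t) for t in ts) * h * h / 2

# half the square of the single integral
half_square = 0.5 * np.linalg.matrix_power(sum(H(t) for t in ts) * h, 2)

# the two differ by a commutator term, here [B, A]/12
print(np.allclose(ordered, half_square))                                  # False
print(np.allclose(ordered - half_square, (B @ A - A @ B) / 12, atol=1e-3))  # True
```

For commuting H(t) the difference would vanish and (9) would reduce to a true exponential.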

The Magnus expansion
The so-called ME [1], also referred to as exponential perturbation theory in the literature, starts by assuming that a true exponential solution of (1) does exist,

    U(t, t₀) = exp[Ω(t, t₀)],    (10)

and proceeds to find and solve, by series expansion, an equation for Ω(t, t₀).
For ease of reading, from now on we change our notation a little. In the previous section we explicitly wrote the i and ħ factors, as is usually done. However, it will prove convenient to hide those factors in the redefined 'Hamiltonian'

    H̃(t) ≡ −(i/ħ) H(t),    (11)

and now the Schrödinger equation for U reads

    ∂U(t, t₀)/∂t = λ H̃(t) U(t, t₀),    (12)

with the initial condition U(t₀, t₀) = I. Note that the Hermitian character of H makes H̃ anti-Hermitian, which simply means H̃† = −H̃. The previously quoted result about the exponential representation of unitary matrices can now be rephrased as saying that any unitary matrix is the exponential of an anti-Hermitian matrix. So we get our first condition on Ω(t, t₀): it has to be anti-Hermitian in order to ensure the unitary character of U(t, t₀).
For future reference, note that if A and B are two anti-Hermitian operators, then their commutator, [A, B] ≡ AB − BA, is also anti-Hermitian. This property will be instrumental in proving the unitary character of the Magnus scheme.
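This closure property is immediate to verify numerically (a quick sketch with randomly generated matrices):

```python
import numpy as np

rng = np.random.default_rng(1)
M1 = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))
M2 = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))
A = M1 - M1.conj().T          # anti-Hermitian: A† = -A
B = M2 - M2.conj().T

C = A @ B - B @ A             # commutator [A, B]
print(np.allclose(C.conj().T, -C))   # True: [A, B] is anti-Hermitian too
```

The one-line proof: [A, B]† = B†A† − A†B† = BA − AB = −[A, B].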
Note that by changing our unknown from U to Ω, we actually look for the logarithm of the time-evolution operator. If the Hamiltonian is time independent then, according to the preceding section, one has simply Ω(t, t₀) = λ(t − t₀)H̃. To see the origin of the difficulties when the Hamiltonian is time dependent, let us consider the differential equation (12), but taking U and H̃ as ordinary scalar functions instead of matrices. From elementary calculus we know that the solution is

    U(t, t₀) = exp[λ ∫_{t₀}^{t} H̃(t₁) dt₁],    (13)

as is immediately verified by direct differentiation.
The key point is that if, instead, we deal with matrices, the familiar rule

    d/dt exp[A(t)] = A′(t) exp[A(t)]

is not generally valid any more, because the matrix A(t) and its derivative A′(t) may not commute. The exponential in equation (13) is not a solution of (12) unless [H̃(t₁), H̃(t₂)] = 0 for arbitrary t₁, t₂ or, at least,

    [∫_{t₀}^{t} H̃(t₁) dt₁, H̃(t)] = 0.    (14)

Obviously, this is certainly the case if H̃ is time independent. Otherwise, the search for Ω becomes much more difficult in general.
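The failure of the familiar differentiation rule can be seen in a short numerical experiment (the matrices below are hypothetical): compare a finite-difference derivative of exp[A(t)] with the naive expression A′(t) exp[A(t)].

```python
import numpy as np
from scipy.linalg import expm

X = np.array([[0., 1.], [0., 0.]])
Y = np.array([[0., 0.], [1., 0.]])
A  = lambda t: t * X + t**2 * Y       # A(t) and A'(t) do not commute
dA = lambda t: X + 2 * t * Y

t, eps = 1.0, 1e-6
num   = (expm(A(t + eps)) - expm(A(t - eps))) / (2 * eps)  # true derivative
naive = dA(t) @ expm(A(t))                                  # familiar scalar rule
print(np.allclose(num, naive, atol=1e-4))                   # False
```

With a commuting family, e.g. A(t) = t X alone, the two expressions would agree.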
The question now is: if U obeys (12), then what is the differential equation for Ω? Inasmuch as we are not allowed to use the familiar differentiation rules, we have to follow a different route. Here two important results come to our rescue.
The first one is the intuitively clear 'group property' of the time-evolution operator:

    U(t₂, t₀) = U(t₂, t₁) U(t₁, t₀).    (15)

The second one is a famous classical result in matrix algebra known as the Baker-Campbell-Hausdorff (BCH) formula [4] for the product of two exponentials. It states that for any two, in general noncommuting, operators X and Y one has

    exp(X) exp(Y) = exp(X + Y + ½[X, Y] + (1/12)[X, [X, Y]] + (1/12)[Y, [Y, X]] + ···).    (16)

The exponent in this equation is an infinite series whose terms are nested commutators of increasing order.
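The first terms of the BCH formula can be checked numerically for small, randomly generated matrices X and Y (a sketch, hypothetical values):

```python
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(2)
X = 0.05 * rng.standard_normal((3, 3))
Y = 0.05 * rng.standard_normal((3, 3))
c = lambda P, Q: P @ Q - Q @ P            # commutator

lhs = expm(X) @ expm(Y)
bch = expm(X + Y + c(X, Y) / 2 + (c(X, c(X, Y)) + c(Y, c(Y, X))) / 12)

# the truncated BCH exponent agrees up to the neglected fourth-order terms,
# while exp(X + Y) alone misses the commutator corrections entirely
print(np.max(np.abs(lhs - bch)))
print(np.max(np.abs(lhs - expm(X + Y))))
```

The first residual is set by the dropped fourth-order commutators; the second, much larger one, by ½[X, Y].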
The simplicity of (16) is misleading, because higher order terms quickly become very involved. The explicit form of the series is not known, although there exist algorithms to compute it to any finite order. As a measure of its complexity, we mention that the number of terms at order m grows as O(2^m/m), and the CPU time increases accordingly.
In spite of this awful complexity, surprisingly, a compact formula exists [5, 6] that gives the piece of the BCH series to all orders in one operator, say Y, and to first order in X, namely

    exp(X) exp(Y) = exp(Y + Σ_{k=0}^{∞} (B_k/k!) [Y, [Y, ···, [Y, X] ···]] + O(X²)),    (17)

where the kth term carries k nested commutators, explicitly indicated, and the B_k are Bernoulli numbers [7] (B₀ = 1, B₁ = −1/2, B₂ = 1/6, ...).

Now, to derive the equation satisfied by Ω, we consider a short time interval δt after time t; at the end, we will let δt → 0. We use the group property (15) with t₂ = t + δt and t₁ = t, written in exponential form:

    exp[Ω(t + δt, t₀)] = exp[Ω(t + δt, t)] exp[Ω(t, t₀)].    (18)

The crucial point now is that, during the interval (t, t + δt), the Hamiltonian can be assumed to take the constant value H̃(t). The Schrödinger equation can then be solved in that interval in the exponential form exp[Ω(t + δt, t)] = exp[λH̃(t)δt], and so

    exp[Ω(t + δt, t₀)] = exp[λH̃(t)δt] exp[Ω(t, t₀)].    (19)

Applying (17) to (19), with X = λH̃(t)δt and Y = Ω(t, t₀), and keeping just the first order in δt, we get

    Ω(t + δt, t₀) = Ω(t, t₀) + λδt Σ_{k=0}^{∞} (B_k/k!) [Ω, [Ω, ···, [Ω, H̃(t)] ···]] + O(δt²).    (20)

In the limit δt → 0, this yields the exact result

    ∂Ω(t, t₀)/∂t = λ Σ_{k=0}^{∞} (B_k/k!) [Ω, [Ω, ···, [Ω, H̃] ···]],    (21)

which is a highly nonlinear differential equation for Ω, in contrast with the linear differential equation obeyed by U. So far, finding approximate solutions for the logarithm of the time-evolution operator may seem an even more difficult task than finding U itself. The gain with the exponential representation of the time-evolution operator emerges when Ω is expanded in a power series in λ. We call Ω = Σ_{k=1}^{∞} λᵏ Ωₖ the Magnus series. By introducing it in (21) and equating terms of the same order in λ, we break the previous differential equation into an infinite set of trivially integrable differential equations for each Ωₖ. After some algebra, the first three terms read

    Ω₁(t, t₀) = ∫_{t₀}^{t} dt₁ H̃₁,
    Ω₂(t, t₀) = ½ ∫_{t₀}^{t} dt₁ ∫_{t₀}^{t₁} dt₂ [H̃₁, H̃₂],    (22)
    Ω₃(t, t₀) = (1/6) ∫_{t₀}^{t} dt₁ ∫_{t₀}^{t₁} dt₂ ∫_{t₀}^{t₂} dt₃ ([H̃₁, [H̃₂, H̃₃]] + [H̃₃, [H̃₂, H̃₁]]),

where again we have abbreviated H̃ᵢ ≡ H̃(tᵢ). Note that λΩ₁ reproduces, after exponentiation, the result in equation (13), but now it is only a part of the solution. Interestingly, the terms Ωₖ with k > 1 are the successive corrections, generated by the non-vanishing commutators, necessary to have an exponential solution for U(t, t₀).
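The first two terms of (22) are easy to evaluate numerically for a toy time-dependent H̃(t); by construction the approximant exp(Ω₁ + Ω₂) is exactly unitary, while its deviation from the true U is set by the neglected Ω₃. A sketch (the two-level H̃(t) below is a hypothetical example, λ = 1):

```python
import numpy as np
from scipy.linalg import expm

# a hypothetical anti-Hermitian H̃(t) = -iH(t) (ħ = 1)
Ht = lambda t: -1j * (np.array([[1., 0.], [0., -1.]]) +
                      np.cos(t) * np.array([[0., 1.], [1., 0.]]))
comm = lambda P, Q: P @ Q - Q @ P

T, n = 0.5, 200
h = T / n
ts = (np.arange(n) + 0.5) * h            # midpoint Riemann sums

O1 = sum(Ht(t) for t in ts) * h          # Omega_1 = ∫ H̃ dt
O2 = 0.5 * sum(comm(Ht(t1), Ht(t2)) * h * h
               for i, t1 in enumerate(ts) for t2 in ts[:i])   # Omega_2

U2 = expm(O1 + O2)                       # second-order Magnus approximant
print(np.allclose(U2.conj().T @ U2, np.eye(2)))   # True: unitary by construction

# reference solution by stepwise integration with small time steps
U = np.eye(2, dtype=complex)
for t in ts:
    U = expm(h * Ht(t)) @ U
print(np.max(np.abs(U - U2)))            # small: set by the neglected Omega_3
```

Truncating TDPT at the same order would give comparable accuracy but would not preserve unitarity.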
As a rule of thumb, we can say that to obtain Ωₖ all one has to do is to commute and integrate. The important point is that, by doing so to any order, the truncated sum of the series for Ω is always anti-Hermitian, because of our earlier comment on the commutator of anti-Hermitian operators. Consequently its exponential furnishes a unitary approximation to U(t, t₀). That is precisely what we were looking for.
From a practical point of view, two further comments are in order. First, equations (22) give the first three terms of the Magnus series explicitly; it can also be proved that all terms in the series can be obtained recursively. Second, as was already mentioned, once an expression for Ω is obtained, one still has to calculate its exponential, and this may not be an easy task.
To connect the ME with TDPT, we substitute the Magnus series into (10), expand the exponential and equate equal powers of λ with the expansion (2):

    exp(λΩ₁ + λ²Ω₂ + λ³Ω₃ + ···) = I + λP₁ + λ²P₂ + λ³P₃ + ···.    (23)

Hence we readily find

    P₁ = Ω₁,   P₂ = Ω₂ + ½Ω₁²,   P₃ = Ω₃ + ½(Ω₁Ω₂ + Ω₂Ω₁) + (1/6)Ω₁³.    (24)
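The relation between the TDPT terms and the Magnus terms, P₁ = Ω₁ and P₂ = Ω₂ + ½Ω₁², follows from expanding the exponential, and holds exactly even for the discretized integrals. A numerical sketch with a hypothetical two-level H̃(t):

```python
import numpy as np

# hypothetical anti-Hermitian H̃(t), ħ = 1
Ht = lambda t: -1j * (np.array([[1., 0.], [0., -1.]]) +
                      np.cos(t) * np.array([[0., 1.], [1., 0.]]))
n = 200
h = 0.5 / n
ts = (np.arange(n) + 0.5) * h

O1 = sum(Ht(t) for t in ts) * h
O2 = 0.5 * sum((Ht(t1) @ Ht(t2) - Ht(t2) @ Ht(t1)) * h * h
               for i, t1 in enumerate(ts) for t2 in ts[:i])

# second-order TDPT term: ordered double integral (half weight on the diagonal)
P2 = sum(Ht(t1) @ Ht(t2) * h * h
         for i, t1 in enumerate(ts) for t2 in ts[:i]) \
     + sum(Ht(t) @ Ht(t) for t in ts) * h * h / 2

print(np.allclose(P2, O2 + O1 @ O1 / 2))   # True: P2 = Omega_2 + Omega_1^2/2
```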

Two illustrative examples
For the sake of illustration of the ME and comparison with TDPT, this section presents a simplified version of application examples given in [3, 8]. In the two examples we treat, the use of approximation methods is purely illustrative, because the exact solution is known in both cases. But the comparison between approximate and exact results buttresses the importance of calculating with unitary approximations. We deal with the first and second approximants of the ME and TDPT for the forced harmonic oscillator and for a two-level spin problem. The Hamiltonian in these examples has the structure H(t) = H₀ + V(t), where H₀ is independent of time and V(t) is a perturbation. The exact time-evolution operator associated with H₀ is known and reads U₀(t, t₀) = exp[(t − t₀)H̃₀], with H̃₀ ≡ −(i/ħ)H₀. Here the bookkeeping parameter of the previous sections has been set to λ = 1. This splitting of the Hamiltonian will allow us to focus the perturbative approximations just on the piece V(t).
As one can find in most textbooks of quantum mechanics, for split Hamiltonians it proves extremely useful to change to what is called the interaction picture, disposing of the (assumed known) action of H₀. This is accomplished by factorizing

    U(t, t₀) = U₀(t, t₀) U_I(t, t₀).

Then the Schrödinger equation for U_I(t, t₀) reads

    ∂U_I(t, t₀)/∂t = H̃_I(t) U_I(t, t₀),   with  H̃_I(t) ≡ U₀⁻¹(t, t₀) Ṽ(t) U₀(t, t₀),  Ṽ(t) ≡ −(i/ħ)V(t).

As a benchmark of our computations, we will study the transition probability between two eigenstates |a⟩ and |b⟩ of H₀:

    P(a → b) = |⟨b|U(t, t₀)|a⟩|².
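The interaction-picture factorization can be checked numerically on a toy split Hamiltonian (the values of ω, v and the time dependence below are hypothetical, ħ = 1):

```python
import numpy as np
from scipy.linalg import expm

w, v = 2.0, 0.3
sz = np.array([[1., 0.], [0., -1.]])
sx = np.array([[0., 1.], [1., 0.]])
H  = lambda t: w / 2 * sz + v * np.cos(t) * sx           # H = H0 + V(t)
U0 = lambda t: expm(-1j * t * (w / 2) * sz)              # exact free evolution
HI = lambda t: U0(t).conj().T @ (v * np.cos(t) * sx) @ U0(t)  # interaction picture

T, n = 1.0, 2000
h = T / n
U  = np.eye(2, dtype=complex)
UI = np.eye(2, dtype=complex)
for k in range(n):
    t = (k + 0.5) * h
    U  = expm(-1j * h * H(t))  @ U    # full evolution, small steps
    UI = expm(-1j * h * HI(t)) @ UI   # interaction-picture evolution

print(np.allclose(U, U0(T) @ UI, atol=1e-4))   # True: U = U0 · U_I
```

All the perturbative work can thus be concentrated on U_I, which evolves only under the (transformed) perturbation.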

Linearly forced harmonic oscillator
The linearly forced harmonic oscillator is a soluble problem in quantum physics that has received a great deal of attention over the years. The Hamiltonian reads

    H(t) = H₀ + V(t),   H₀ = ħω(a†a + ½),   V(t) = f(t)(a + a†),

where H₀ stands for the unperturbed oscillator Hamiltonian, the perturbation V(t) is written in terms of the usual annihilation and creation operators a, a†, and f(t) is a real function of time. The second-order ME needs the commutator [H̃_I(t₁), H̃_I(t₂)], which turns out to be proportional to the identity. As this is a scalar quantity, higher order terms in the ME vanish and the series terminates, eventually yielding the exact solution. Explicitly, introducing

    α(t, t₀) ≡ (1/ħ) |∫_{t₀}^{t} e^{iωs} f(s) ds|,

we have, for the transition probability from the ground state |0⟩ to the eigenstate |m⟩,

    P^I(0 → m) = α^{2m} exp(−α²)/m!,

where the superscript I indicates that the computation has been done in the interaction picture.
In the perturbative expansion, P₁^I and P₂^I can be directly obtained from (24). However, contrary to what happens with the ME, the series (2) does not terminate. It is also clear that the second-order TDPT approximation to U_I is not unitary. This simple example also reveals an important physical difference between the ME and TDPT: due to the fact that in any order of approximation the ME furnishes an exponential representation of U, it compactly involves the action of an infinite number of creation-annihilation operators and therefore connects any pair of eigenstates |m⟩ and |m′⟩. In contrast, TDPT in any finite order of approximation involves exclusively a finite number of a, a† operators, and correspondingly the transition |m⟩ → |m′⟩ has to satisfy strict selection rules in order to be allowed.
For the sake of illustration, let us suppose f(t) even, and take t = +∞, t₀ = −∞. Then U(+∞, −∞) corresponds to the scattering S-matrix. The transition probability from the ground state |0⟩ to the excited state |m⟩ gives in the ME the usual Poisson distribution P(0 → m) = α^{2m} exp(−α²)/m!, whereas TDPT is, according to the comment above, unable to reproduce such a result. Not to mention the further drawback that, due to the loss of unitarity, TDPT may give P > 1, something the ME guarantees will never occur.
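The Poisson law is easy to reproduce in a truncated oscillator basis: since the ME terminates, the evolution operator is (up to a phase) a displacement operator exp(αa† − αa), assuming the standard forced-oscillator coupling V(t) = f(t)(a + a†). The truncation dimension and the value of α below are hypothetical choices made for the check:

```python
import numpy as np
from scipy.linalg import expm
from math import factorial, exp

N = 40                                     # basis truncation (hypothetical)
a = np.diag(np.sqrt(np.arange(1, N)), 1)   # annihilation operator, a|n> = √n |n-1>
alpha = 0.8

D = expm(alpha * a.conj().T - alpha * a)   # displacement operator (unitary)
P = np.abs(D[:, 0])**2                     # probabilities |0> -> |m>

poisson = [alpha**(2 * m) * exp(-alpha**2) / factorial(m) for m in range(6)]
print(np.allclose(P[:6], poisson, atol=1e-6))   # True: Poisson distribution
```

Any finite-order TDPT truncation, by contrast, can only populate finitely many levels.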

Two-level system
A simple time-dependent two-level system in quantum mechanics is described by the Hamiltonian

    H(t) = ½ħω σ_z + V₀ f(t) σ_x,    (35)

written in terms of the Pauli matrices (recall that [σ_x, σ_y] = 2iσ_z and cyclic permutations).
Here we take f(t) = 1 for 0 ≤ t ≤ T, and f(t) = 0 otherwise. Thus, f(t) stands for a rectangular pulse.
In order to analyse the evolution induced by H(t), we take H₀ ≡ ½ħωσ_z as the unperturbed Hamiltonian. Its spin-up and spin-down eigenstates have energies ±ħω/2 and are denoted by |±⟩. Treating the remainder of (35) as a perturbation corresponds to assuming V₀ ≪ ħω. Since |±⟩ are not eigenstates of σ_x, the pulse represented by V₀f(t)σ_x in (35) will generate transitions from |±⟩ to |∓⟩.
From a physical point of view, the energy splitting of the states |±⟩ may be achieved by applying a constant magnetic field along the z direction. A magnetic pulse along the x direction then causes the transitions. Besides, ωT measures the ratio of the duration T of the pulse to the internal time scale of the system, 1/ω. Thus, ωT ≪ 1 and ωT ≫ 1 establish the sudden and the adiabatic regimes of the perturbation f(t), respectively.
If we take t₀ = 0 and t > 0, the exact solution for U(t, 0) reads

    U(t, 0) = exp[−(it/ħ)(½ħω σ_z + V₀ σ_x)]   for 0 ≤ t ≤ T,
    U(t, 0) = exp[−½iω(t − T) σ_z] exp[−(iT/ħ)(½ħω σ_z + V₀ σ_x)]   for t > T.    (36)

The exponentials of the Pauli matrices above may be evaluated by using the formula

    exp(i a·σ) = I cos(|a|) + i (a·σ) sin(|a|)/|a|,    (37)

valid for a linear combination of Pauli matrices that we express as the scalar product a·σ, and where I stands for the 2 × 2 unit matrix. The resulting matrix elements of U(t, 0) are harmonic functions of τ ≡ ½ω(t − T) and η ≡ (γ² + ξ²/4)^{1/2}, where we have introduced the dimensionless parameters γ ≡ V₀T/ħ and ξ ≡ ωT, whose use will prove convenient later on. Note that γ stands for the pulse area and ξ gives its duration, expressed in units of the internal time 1/ω. The time evolution for t > T simply mixes the eigenstates harmonically, through the τ-dependent factors. Hence, the interesting question is: assuming that the system was in a given eigenstate of H₀ at t = 0, what is the relative proportion of each eigenstate just when the pulse vanishes, i.e. at t = T? Thus, as a test, we propose to compute the transition probability P(ξ) between the eigenstates |±⟩ of H₀.
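The Pauli-matrix exponential identity (37) is itself easy to verify numerically (a quick sketch with an arbitrarily chosen vector a):

```python
import numpy as np
from scipy.linalg import expm

sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]])
sz = np.array([[1, 0], [0, -1]], dtype=complex)

a = np.array([0.3, -1.1, 0.7])                  # arbitrary real vector
A = a[0] * sx + a[1] * sy + a[2] * sz           # a · σ
r = np.linalg.norm(a)                           # |a|

formula = np.cos(r) * np.eye(2) + 1j * np.sin(r) / r * A
print(np.allclose(expm(1j * A), formula))       # True
```

The identity follows from (a·σ)² = |a|² I, which truncates the exponential series to cosine and sine parts.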
To apply the ME and TDPT to this problem, we transform the Hamiltonian into the interaction picture defined by H₀ above:

    H_I(t) = e^{iωtσ_z/2} V₀ σ_x e^{−iωtσ_z/2} = V₀(σ_x cos ωt − σ_y sin ωt)   for 0 ≤ t ≤ T,

and H_I = 0 otherwise. The time-evolution operator factorizes as U(t, 0) = exp(−iωtσ_z/2) U_I(t, 0). Due to the diagonal structure of the first exponential, only U_I is involved in the computation of the transition probability, namely |⟨±|U|∓⟩| = |⟨±|U_I|∓⟩|.
The first two orders in the ME read

    Ω₁(T, 0) = −(iγ/ξ)[σ_x sin ξ − σ_y (1 − cos ξ)],
    Ω₂(T, 0) = −(iγ²/ξ)(1 − sin ξ/ξ) σ_z.

Using (37) we get the approximate transition probabilities

    P_M^(1)(T) = sin² a,
    P_M^(2)(T) = [a²/(a² + b²)] sin²[(a² + b²)^{1/2}],

with a ≡ (2γ/ξ) sin(ξ/2) and b ≡ (γ²/ξ)(1 − sin ξ/ξ). Time-dependent perturbation theory gives, for the first and second approximation orders,

    P_pt^(1)(T) = P_pt^(2)(T) = (4γ²/ξ²) sin²(ξ/2).

Meanwhile, from (36) and (37) we get the exact result

    P_ex(T) = [4γ²/(4γ² + ξ²)] sin²[(γ² + ξ²/4)^{1/2}].
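The exact rectangular-pulse result can be cross-checked numerically, assuming the Hamiltonian H(t) = ½ħωσ_z + V₀f(t)σ_x and hypothetical parameter values γ = 0.5, ξ = 2 (a sketch, ħ = 1):

```python
import numpy as np
from scipy.linalg import expm

hbar = 1.0
gamma, xi, T = 0.5, 2.0, 1.0              # hypothetical values
V0, w = gamma * hbar / T, xi / T          # from γ = V0·T/ħ and ξ = ωT
sx = np.array([[0, 1], [1, 0]], dtype=complex)
sz = np.array([[1, 0], [0, -1]], dtype=complex)

# exact U(T, 0): constant Hamiltonian during the pulse
U = expm(-1j * T / hbar * (0.5 * hbar * w * sz + V0 * sx))
P_num = abs(U[1, 0])**2                   # |<-|U|+>|^2

P_formula = 4 * gamma**2 / (4 * gamma**2 + xi**2) \
            * np.sin(np.sqrt(gamma**2 + xi**2 / 4))**2
print(np.allclose(P_num, P_formula))      # True: matches the closed form

# first-order Magnus value for the same parameters, for comparison
P_M1 = np.sin(2 * gamma / xi * np.sin(xi / 2))**2
print(abs(P_M1 - P_num))                  # small: already close at first order
```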
For the sake of illustration, in figure 1 we plot the exact formula and the various approximations for P at the end of the pulse, as functions of ξ, with γ = 0.5. For a fixed value of γ, ξ ≪ 1 corresponds to the sudden limit, whereas ξ ≫ 1 is the adiabatic regime. Thus, the different systems compared in figure 1 correspond to pulses of the same area, embracing both the sudden and adiabatic regimes. Remember that ξ is the pulse duration expressed in units of the internal time.
Note that the sudden-pulse limit (i.e. ξ ≪ 1) is very well described by the ME, unlike TDPT. This is a general characteristic of ME approximations in the interaction picture. An intuitive explanation is that the more sudden the perturbation, the smaller the range of times where the commutator [H_I(t₁), H_I(t₂)] does not vanish. Remember that this commutator is the origin of all the nuisances in the exponential representation of U. Observe that P_pt^(1,2) → γ² as ξ → 0, a result with no physical meaning for γ > 1. One is then forced to consider TDPT at much higher orders (frequently very hard to compute) just to get reasonable results. TDPT always converges to the exact solution, but this convergence can be too slow for practical purposes.
In the opposite limit, the adiabatic regime, the transition probability is very small. Both approaches seem to approximate the exact result well. This is, however, deceptive. In general, both the ME and TDPT have to be preceded by a change of picture, based on the instantaneous diagonal form of the full Hamiltonian, if one wants to obtain meaningful transition probabilities. As a matter of fact, that preliminary transformation may be looked upon as a special adiabatic picture.
In the example at hand, the second-order correction of the ME to the transition probability is very small, whereas for TDPT it vanishes exactly. Moreover, the main corrections come from odd approximation orders.
The inset in figure 1 represents the relative error of the various approximants, i.e. the difference between the approximate and exact probabilities divided by the exact one. For the ME, the relative error follows a power law (a straight line on a log-log scale). For TDPT it is roughly constant in the interval plotted (0 < ξ < 4). This behaviour persists for other values of γ.
Near the zero of P(ξ ), around ξ = 6, the relative error has no meaning. The interpretation of this zero is noteworthy. In that situation the system is unaffected by the perturbing pulse. The potential V (t) presents a kind of transparency for this particular two-level configuration.
Another outcome from figure 1 is that the transitions between the two levels are most efficiently activated with sudden pulses.
The rectangular pulse has been worked out up to fourth order both in the ME and TDPT [3]. Further examples with known exact analytical solution are the Rosen-Zener and Landau-Zener two-level models [3,9], as well as the spin one-half system in a rotating magnetic field [10]. They can also be used as benchmarks for the approximation algorithms.

Conclusions
In this paper we have shown how to build a perturbation expansion for the quantum time-evolution operator in such a way that truncation at any desired order of approximation does not spoil the unitary character of that operator. The proof presented is based on the group property of U, and conceals all the mathematical subtleties in a simplified form of the BCH formula.
We think that with the resources available to the average student in quantum mechanics the whole procedure can be followed and the topic could be introduced either directly in a general syllabus or as assigned work. It will help the student to see the noncommutativity of observables, one of the flagships of quantum mechanics, at work. In this instance one deals not with two different observables like the familiar case of position and momentum, but with a single one taken at two different times.
As has been repeatedly emphasized, from the mathematical point of view, the complicated structure of the expansion will also give the opportunity for students to increase their skills in matrix calculus.
In section 1 we already mentioned our intention to present only the basic ideas of the ME. Important questions, like convergence conditions, alternative approaches and generalizations, have not been touched upon, and the interested reader is referred to the literature. It is important to say that the Magnus proposal is by no means restricted to the quantum Schrödinger equation: it applies directly to any linear initial-value problem of the form Y′(t) = A(t)Y(t), which pervades the entire field of physics, and the student has probably already met such equations in different contexts. This explains why the ME has found applications in a wide spectrum of fields, from elementary particle, nuclear, atomic and molecular physics to classical dynamics, nuclear magnetic resonance and control theory. In recent years the ME has also found its way into the numerical treatment of differential equations, within the field of geometric integration [3, 11], where it has been shown to provide effective algorithms which incorporate essential physical requirements.