My aim in this post is to plot a simple course through elementary dynamical systems theory to arrive quickly at some of the basic elements which appear in physical theories like quantum mechanics. In learning physics we normally encounter many of these concepts only when we first need them, and rarely see clearly how generic they turn out to be. It should help to follow the story from the beginning—all we need is first-order ordinary differential equations.

Direct Integration

Our subject will be the trajectory of a single point $x(t)$ over time, for now in $\mathbb{R}$ or $\mathbb{R}^n$, in response to a velocity function

$$\dot{x}(t) = v.$$

In the simplest case of a constant velocity $v$ the trajectory is given by the integral $x(t) = \int v\,dt$, or

$$x(t) = x_0 + v\,t,$$

where $x_0$ includes a parameter for the initial condition.

For $v$ varying with time,

$$\dot{x}(t) = v(t),$$

the trajectory is the general integral:

$$x(t) = x_0 + \int_0^t v(t')\,dt',$$

where the integral sign stands for nothing more than the repeated application of the linearization $x(t + \Delta t) \approx x(t) + v(t)\,\Delta t$ in the limit $\Delta t \to 0$.

The integral expression must be equivalent to a Taylor series where $v(0)$ hardcodes the first derivative, $\dot{v}(0)$ the second, etc.:

$$x(t) = x_0 + v(0)\,t + \dot{v}(0)\,\frac{t^2}{2} + \ddot{v}(0)\,\frac{t^3}{3!} + \cdots$$

We could generate the individual terms of this series by repeatedly expanding the previous integrand to first order:

$$x(t) = x_0 + \int_0^t dt_1 \left[\, v(0) + \int_0^{t_1} dt_2 \left[\, \dot{v}(0) + \cdots \right] \right] = x_0 + v(0)\,t + \dot{v}(0)\,\frac{t^2}{2} + \cdots$$

This series is suggestive of the series expansion of an exponential $e^{t\,\partial_t}$, where $\partial_t$ is a shorthand for $d/dt$. This “time-translation operator”, applied to the function $x(t)$ and evaluated at $t = 0$, is

$$x(t) = \left[e^{t\,\partial_{t'}}\,x(t')\right]_{t'=0} = \sum_{n=0}^\infty \frac{t^n}{n!}\,x^{(n)}(0).$$

This expression is basically a general solution to time-translation, and could be evaluated term-by-term by substituting $x^{(n)}(0) = v^{(n-1)}(0)$.

The system specified by a velocity function $v(t)$ is fairly trivial. Solving it amounts to mere integration, and if we know $v(0)$ and all its higher derivatives, the entire trajectory is determined by the initial values.
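
Here is a quick numerical sketch of the series solution (my own illustration, assuming $v(t) = \cos t$, so the exact trajectory is $x(t) = x_0 + \sin t$):

```python
import math

def v_deriv_at_0(n):
    # n-th derivative of v(t) = cos(t) at t = 0: cycles through 1, 0, -1, 0
    return (1.0, 0.0, -1.0, 0.0)[n % 4]

def x_series(t, x0, terms=20):
    # x(t) = x0 + sum_{n>=1} t^n/n! * v^{(n-1)}(0), the Taylor series above
    return x0 + sum(t**n / math.factorial(n) * v_deriv_at_0(n - 1)
                    for n in range(1, terms))

print(x_series(1.5, 2.0))        # ~ 2.99749...
print(2.0 + math.sin(1.5))       # exact answer: x0 + sin(t)
```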

Generating a Flow

What if $v$ varies with space as $v(x)$? Here $v(x)$ plays the role of a “velocity field” indicating the direction of motion at any point, but with no momentum (this could represent the motion of a particle in a viscous fluid which damps out all momentum).

The ODE

$$\dot{x} = v(x)$$

can be integrated to an awkward but general solution:

$$\int_{x_0}^{x(t)} \frac{dx'}{v(x')} = t.$$

For example, if $v(x) = kx$ then the solution is

$$\int_{x_0}^{x(t)} \frac{dx'}{kx'} = \frac{1}{k}\ln\left|\frac{x(t)}{x_0}\right| = t,$$

which is readily solved (ignoring the absolute values, which add a wrinkle) to give

$$x(t) = x_0\,e^{kt},$$

representing a trajectory which exponentially grows or decays, depending on the sign of $k$.
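
A small sketch of this (again my own illustration, for $v(x) = kx$): repeatedly applying the linearization $x \mapsto x + v(x)\,\Delta t$ converges to the exponential solution as $\Delta t \to 0$.

```python
import math

def flow(x0, k, t, steps=100_000):
    # repeatedly apply x -> x + v(x) dt for v(x) = k*x
    dt, x = t / steps, x0
    for _ in range(steps):
        x += k * x * dt
    return x

print(flow(1.0, 0.5, 2.0))   # ~ 2.71825...
print(math.exp(0.5 * 2.0))   # exact: x0 * e^{kt} = e^1
```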

Calling the l.h.s. integral $F(x) = \int^x \frac{dx'}{v(x')}$, we see that time evolution amounts to a translation in $F$,

$$F\big(x(t)\big) = F(x_0) + t.$$

If $F$ is simple enough to be invertible, this has an exact solution:

$$x(t) = F^{-1}\big(F(x_0) + t\big).$$

This is not as nice an expression as the others, but it looks like something we can expand in a series. The inverse function theorem for the derivative will be useful: from the fact that $\frac{dx}{dt} = \frac{1}{F'(x)} = v(x)$ we can evaluate the successive derivatives of $x(t)$,

$$\frac{dx}{dt} = v(x), \qquad \frac{d^2x}{dt^2} = v(x)\,\partial_x v(x), \qquad \frac{d^3x}{dt^3} = \big(v(x)\,\partial_x\big)^2\, v(x), \ \ldots$$

Evidently each derivative applies a factor of $v(x)\,\partial_x$ to the previous one.

With these, the series expansion is

$$x(t) = x_0 + t\,v(x_0) + \frac{t^2}{2}\big(v\,\partial_x\big)v\Big|_{x_0} + \cdots = \left[\sum_{n=0}^\infty \frac{t^n}{n!}\big(v(x)\,\partial_x\big)^n\, x\right]_{x = x_0},$$

and the last line looks like the series expansion of an exponential function whose argument is the “generator” $t\,v(x)\,\partial_x$:

$$x(t) = \left[e^{t\,v(x)\,\partial_x}\,x\right]_{x = x_0}.$$

The strange-looking exponential here is really an “operator” acting on functions of $x$,

$$e^{t\,v(x)\,\partial_x}\,f(x) = \sum_{n=0}^\infty \frac{t^n}{n!}\big(v(x)\,\partial_x\big)^n f(x),$$

and the time-evolution of the identity function $\mathrm{id}(x) = x$ would give us $x(t)$ itself:

$$x(t) = \left[e^{t\,v(x)\,\partial_x}\,\mathrm{id}\right](x_0).$$
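
To make the operator concrete, here is a small symbolic sketch (assuming $v(x) = kx$, whose flow we already know is $x_0\,e^{kt}$): applying the truncated series $\sum_n \frac{t^n}{n!}(v\,\partial_x)^n$ to the identity function reproduces the Taylor series of $x\,e^{kt}$.

```python
import sympy as sp

x, t, k = sp.symbols('x t k')
v = k * x                                  # the velocity field v(x) = k*x

term, series = x, sp.Integer(0)            # start from the identity function
for n in range(12):
    series += t**n / sp.factorial(n) * term
    term = v * sp.diff(term, x)            # apply the generator v(x) d/dx

exact = x * sp.series(sp.exp(k * t), t, 0, 12).removeO()
print(sp.simplify(series - exact))         # 0: the series matches x*e^{kt}
```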

We might have been able to guess the form of the generator by observing that

$$\frac{d}{dt}\,f\big(x(t)\big) = \dot{x}\,\partial_x f = v(x)\,\partial_x f,$$

which is suggestive of an ODE with solution $f(t) = e^{t\,v\,\partial_x}\,f(0)$. But this feels sketchy: we seem to be blurring the idea of what $f$ actually is here. We’ll come back to this.

Higher Dimensions

Next we ought to visit the $n$-dimensional case of $\dot{x} = v(x)$ with $x \in \mathbb{R}^n$, to see which features are generic to multi-dimensional systems. We won’t need a time-dependent $v$, so I’ll put that off until later.

What happens? Well, nothing about the derivation in the previous section required one dimension, so we can quote the solution in the function representation:

$$f\big(x(t)\big) = \left[e^{t\,v(x)\cdot\nabla}\,f\right](x_0),$$

and the time evolution of a trajectory is the same expression on the identity $\mathrm{id}(x) = x$:

$$x(t) = \left[e^{t\,v(x)\cdot\nabla}\,\mathrm{id}\right](x_0).$$

Of interest is the simple case $v(x) = Ax$, a pure matrix multiplication, which will serve as a prototype for the local linearization of a generic $v$. Here

$$\big(v(x)\cdot\nabla\big)\,x = Ax$$

and

$$\big(v(x)\cdot\nabla\big)^n\,x = A^n x,$$

so the general solution is simply

$$x(t) = e^{tA}\,x_0.$$

The corresponding 1D case had an exact solution $x(t) = x_0\,e^{kt}$, i.e. exponential growth or decay. With more dimensions, much more can happen:

Exponential growth and decay, at once along different eigenvectors, e.g.:

$$A = \begin{pmatrix} k_1 & 0 \\ 0 & k_2 \end{pmatrix}, \qquad e^{tA} = \begin{pmatrix} e^{k_1 t} & 0 \\ 0 & e^{k_2 t} \end{pmatrix}$$

Rotations, where two dimensions flow into each other while other dimensions do other things:

$$A = \begin{pmatrix} 0 & -\omega \\ \omega & 0 \end{pmatrix},$$

where

$$e^{tA} = \begin{pmatrix} \cos\omega t & -\sin\omega t \\ \sin\omega t & \cos\omega t \end{pmatrix}.$$

Shears, the simplest case of which is a nilpotent matrix leading to linear trajectories of varying speeds:

$$A = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}, \qquad e^{tA} = \begin{pmatrix} 1 & t \\ 0 & 1 \end{pmatrix}$$

The effect is basically predicted by the series solution:

$$e^{tA} = 1 + tA + \frac{t^2}{2}A^2 + \cdots$$

with $A$ a nilpotent matrix, such that the series terminates at whichever power has $A^n = 0$.

Transients of various kinds:

A general real matrix could shear any “subsystem” into any other one, with various results, e.g.:

  • The dimension being sheared-into could have its own growth or decay rate, e.g.:

$$A = \begin{pmatrix} k & 1 \\ 0 & 0 \end{pmatrix}$$

Here, if $k < 0$, the resulting shear would be a transient effect which eventually dies off, while $k > 0$ would boost the already-unbounded growth of $x_1$.

  • $(x_1, x_2)$ could be a pure rotation, with $x_2$ then sheared into $x_3$, causing $x_3$ to couple to the oscillation itself at a lag.
  • $x_1$ could be exponentially decaying, while shearing into $x_2$, which will cause $x_2$ to grow transiently and then level off.
  • Multiple dimensions could shear into the same one, with various transient effects.

In general the classification scheme is:

  • if the matrix is normal, $A A^\top = A^\top A$, then its eigenvectors are orthogonal, and its dynamics “factor” into distinct orthogonal dimensions with their own growth or decay, or into pairs of dimensions exhibiting rotations. Classifying eigenvalues suffices to characterize behavior.
  • if not, the eigenvectors are non-orthogonal and can feed transiently into each other.
  • if furthermore the matrix is “defective” (it has a non-trivial Jordan normal form, and fewer eigenvectors than dimensions), then polynomial-in-time terms like $t\,e^{\lambda t}$ appear.

For various reasons physical systems rarely exhibit these latter exotic behaviors, among them that they tend to violate conservation laws, and that they are unstable w.r.t. perturbations of $A$. But these systems are interesting, as they tend to lie on the “phase change” boundaries of dynamical systems, e.g. near resonances or near the limits of applicability of a particular model.
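
The basic behaviors are easy to exhibit numerically via the matrix exponential. A sketch (the matrices here are my own illustrative choices):

```python
import numpy as np
from scipy.linalg import expm

t = 1.0
examples = {
    "growth/decay": np.array([[0.5, 0.0], [0.0, -0.5]]),  # normal, real eigenvalues
    "rotation":     np.array([[0.0, -1.0], [1.0, 0.0]]),  # normal, imaginary eigenvalues
    "shear":        np.array([[0.0, 1.0], [0.0, 0.0]]),   # defective, nilpotent
}

for name, A in examples.items():
    print(name, np.linalg.eigvals(A), "\n", expm(t * A))
# For the nilpotent shear the series terminates: expm(t*A) = I + t*A exactly.
```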




I find it clarifying to approach the analysis of linear systems with the view of time-evolution as an exponential already in hand. Exponential solutions are generic because exponentiation is generic—it represents the basic feedback loop by which a trajectory at later times experiences the compounding effects of earlier times.

Of course, in full generality we would have to apply that entire analysis to the local linearization of $v(x)$ near a point $x^*$:

$$v(x) \approx v(x^*) + J(x^*)\,(x - x^*), \qquad J_{ij} = \frac{\partial v_i}{\partial x_j}.$$

A careful analysis might proceed first by identifying the fixed points $x^*$ where $v(x^*) = 0$, linearizing in their local neighborhoods to determine stability, and then dividing the overall space into regions according to which fixed point or limiting behavior each region flows into.
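
A sketch of the first two steps (the field $v$ here is an invented example with fixed points at the origin and at $(\pm 1, 0)$):

```python
import numpy as np

def v(x):
    # illustrative field: fixed points at (0,0) and (+-1, 0)
    return np.array([x[0] - x[0]**3, -x[1]])

def jacobian(v, x, eps=1e-6):
    # local linearization of v by central finite differences
    J = np.zeros((len(x), len(x)))
    for j in range(len(x)):
        dx = np.zeros(len(x)); dx[j] = eps
        J[:, j] = (v(x + dx) - v(x - dx)) / (2 * eps)
    return J

for fp in (np.array([0.0, 0.0]), np.array([1.0, 0.0])):
    eigs = np.linalg.eigvals(jacobian(v, fp))
    print(fp, eigs)   # (0,0): eigenvalues 1, -1 (saddle); (1,0): -2, -1 (stable)
```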

For example, the following system is a minimal example of a “limit cycle”: a circle of fixed points of the radial flow where $\dot{r} = 0$, with trajectories on either side of the circle flowing stably towards it,

$$\dot{x} = x - y - x(x^2 + y^2), \qquad \dot{y} = x + y - y(x^2 + y^2),$$

which is easier to see in polar coordinates:

$$\dot{r} = r(1 - r^2), \qquad \dot{\theta} = 1.$$

We can read off the behavior:

  • at $r = 1$, $\dot{r} = 0$,
  • $r$ grows when $r < 1$ and shrinks when $r > 1$, approaching the fixed point $r = 1$ from both sides.
  • $\theta$ circulates at a constant rate all the while.
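
A quick numerical check of this picture, integrating the polar form above from both sides of the circle:

```python
import numpy as np
from scipy.integrate import solve_ivp

def rhs(t, y):
    r, theta = y
    return [r * (1 - r**2), 1.0]       # dr/dt = r(1-r^2), dtheta/dt = 1

for r0 in (0.1, 2.0):                  # start inside and outside the circle
    sol = solve_ivp(rhs, (0.0, 20.0), [r0, 0.0])
    print(r0, "->", sol.y[0, -1])      # both end near the limit cycle r = 1
```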

Three Views of Time Evolution

We first tried to describe the time-evolution of a single function $x(t)$ with initial condition $x_0$, the way one learns to do in a first differential equations course. The result was direct integration: $x(t) = x_0 + \int_0^t v(t')\,dt'$.

This worked well enough for $v(t)$, but when we tried to take on $v(x)$ we found that the derivation led us to a different view of time-evolution as an operator $e^{t\,v\,\partial_x}$ acting on functions $f(x)$.

If we think about it, the first case, whose solution simply added something to $x_0$, clearly cannot be very general: in most spaces where one might want to describe the time-evolution of points, one cannot “add” points at all. We need some kind of lens onto the space, and a function $f$ is one way of supplying this.

Let us denote the operation of “evolve in time by $t$” on an initial point $x_0 \in M$, with $M$ a suitable manifold, by

$$\Phi_t : M \to M.$$

For a given $t$, $\Phi_t$ simply maps initial points to time-evolved points:

$$\Phi_t(x_0) = x(t).$$

The map of this motion under a function $f : M \to \mathbb{R}$ is some trajectory in $\mathbb{R}$:

$$t \mapsto f\big(x(t)\big).$$

That is,

$$f\big(x(t)\big) = f\big(\Phi_t(x_0)\big) = \big(f \circ \Phi_t\big)(x_0).$$

The “time-evolution” operator on functions, like $e^{t\,\partial_t}$ or $e^{t\,v\,\partial_x}$, is an operator which takes a function $f$ to the new function $f \circ \Phi_t$. This operator is called the “pullback” of $\Phi_t$ and is denoted $\Phi_t^*$:

$$\Phi_t^*\,f = f \circ \Phi_t.$$

The time-evolved $f$ will be the function taking an initial point to the value of $f$ at its endpoint after time $t$:

$$\big(\Phi_t^*\,f\big)(x_0) = f\big(x(t)\big).$$

Whence the name “pullback”? We tend to think of $\Phi_t$ as “pushing” or “flowing” the point $x_0$ according to an evolution law, with the sense of evolving “forward”. Then compare $f$ and $\Phi_t^*\,f$:

For a certain input point $x$, whose output under $f$ is $f(x)$, the input point which would produce the same output under $\Phi_t^*\,f$ is simply $\Phi_{-t}(x)$, the point which precedes $x$ under time evolution by $t$. That is, for $f$ and $\Phi_t$ both nicely invertible,

$$\big(\Phi_t^*\,f\big)\big(\Phi_{-t}(x)\big) = f(x),$$

or generally

$$\big(\Phi_t^*\,f\big)^{-1}(y) = \Phi_{-t}\big(f^{-1}(y)\big),$$

where $f^{-1}(y)$ and $\Phi_{-t}\big(f^{-1}(y)\big)$ may be sets rather than single points.

In other words: if we regard $f$ as a simple directed arrow between points in $M$ and $\mathbb{R}$, then $\Phi_t^*$ moves the initial tip of the arrow backwards along lines of time-evolution in $M$; the end of the arrow is “pulled back” in time.1

One effect of this definition is that $\Phi_t^*$ composes in time oppositely to $\Phi_t$ itself:

$$\Phi_{t_2}^*\,\Phi_{t_1}^* = \big(\Phi_{t_1} \circ \Phi_{t_2}\big)^* = \Phi_{t_1 + t_2}^*.$$

The “outermost” time-evolution composes to the right. This doesn’t really matter for the simple operator described above, but it will matter for the general case.
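
A toy sketch of the pullback in code (using the 1-D flow $\Phi_t(x) = x\,e^{kt}$ from before): composing pullbacks indeed reproduces evolution by the total time.

```python
import math

k = 0.3
def Phi(t):
    # the flow map of v(x) = k*x
    return lambda x: x * math.exp(k * t)

def pullback(t, f):
    # (Phi_t^* f)(x) = f(Phi_t(x))
    return lambda x: f(Phi(t)(x))

f = lambda x: x**2
g = pullback(2.0, pullback(1.0, f))    # Phi_2^* Phi_1^* f = Phi_3^* f
print(g(1.5), pullback(3.0, f)(1.5))   # equal
```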




Now for yet another view of time evolution. Frequently in physics we find ourselves wishing to think of a system being “in a state $|x\rangle$”, out of all possible states $\{|x\rangle\}$, rather than as being described by a single point which moves with time. In this view we are led to express a one-dimensional trajectory $x(t)$ as a moving delta function

$$|\psi(t)\rangle = \int dx\,\delta\big(x - x(t)\big)\,|x\rangle,$$

where the “state” $|\psi\rangle$ itself is a vector sum of “basis states” $|x\rangle$ of specific $x$. This description has the advantage of readily generalizing to non-$\delta$-function states with arbitrary “densities” $\rho(x, t)$,

$$|\psi(t)\rangle = \int dx\,\rho(x, t)\,|x\rangle.$$

Here I’ve used a quantum-mechanics-inspired syntax to represent the states themselves, but such a description is generally used for representing any kind of “collection” of states—whether physically-real superpositions (as in quantum mechanics), epistemic ignorance (as in statistical mechanics), or hypothetical ensembles (as used in frequentist stat-mech). All the information is really contained in $\rho(x, t)$, and we could skip writing the state vectors, which gives it more of a stat-mech feel, e.g.

$$\rho(x, t) = \delta\big(x - x(t)\big).$$

For now we’ll limit ourselves to the single particle case $\rho(x, t) = \delta(x - x(t))$. Given that the trajectory is known (it is defined by our original ODE $\dot{x} = v(x)$), how should we represent the evolution of $|\psi(t)\rangle$, and specifically of the density $\rho(x, t)$?

The delta-function case is easy, because the entire density is located at a single coordinate at all times. Let us see what it looks like first with a discretized time coordinate, where in an interval $\Delta t$ the particle moves exactly as

$$x(t + \Delta t) = x(t) + v\big(x(t)\big)\,\Delta t.$$

Evidently the density must change in such a way that it removes all of the density from $x(t)$ and adds it all to $x(t + \Delta t)$. The exact change in $\rho$ would be

$$\Delta\rho(x) = \delta\big(x - x(t) - v\,\Delta t\big) - \delta\big(x - x(t)\big).$$


The answer looks like the negative derivative of a delta function, $-\delta'\big(x - x(t)\big)$, with “width” $v\,\Delta t$. This is easy enough to express in a discrete setting, but continuously?

For the continuous-$t$ case, still for a delta-function density:

$$\partial_t\,\rho(x, t) = \partial_t\,\delta\big(x - x(t)\big) = -\dot{x}(t)\,\delta'\big(x - x(t)\big) = -v\big(x(t)\big)\,\delta'\big(x - x(t)\big).$$

The change in $\rho$ is exactly a delta-derivative $-\delta'$ multiplied by a “width” $v$.2

What I’ve computed here is $\partial_t\,\rho(x, t)$ at fixed $x$. This is because, contrary to the case of $f(x(t))$, we do not want to follow the flow of the density $\rho$. An expression like $\frac{d}{dt}f(x(t))$ treats $x(t)$ as a “pointer” moving through the space $M$, with $f(x(t))$ measuring the value of $f$ at the point pointed-to, and the derivative the change in this value. If we did the same for the density we would be treating it as if the density were fixed in space as a function except for some explicit time variation. For deterministic and measure-preserving evolution the time derivative $\frac{d}{dt}\rho\big(x(t), t\big)$ would just come out to $0$.

Instead, by studying $\partial_t\,\rho(x, t)$, we are fixing a point $x$ and gauging how $\rho$ changes with time due to the flow of the underlying space under $\Phi_t$.

In light of this we should go a step further and rewrite the above in terms of $x$ rather than $x(t)$ for a truly “local” evolution law for $\rho$. Applying the $\delta$-function identity

$$g(x)\,\delta'(x - a) = g(a)\,\delta'(x - a) - g'(a)\,\delta(x - a),$$

we get

$$\partial_t\,\rho(x, t) = -v(x)\,\delta'\big(x - x(t)\big) - v'(x)\,\delta\big(x - x(t)\big) = -\partial_x\big(v(x)\,\rho(x, t)\big).$$

The last line is also the general formula for any $\rho$, though we’ve derived it only for the $\delta$-function case.
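
One way to see this formula at work is to represent the density by an ensemble of sample points and flow each one (a sketch assuming $v(x) = kx$, whose flow is $\Phi_t(x) = x\,e^{kt}$): the evolved density of a Gaussian stays Gaussian with a rescaled width.

```python
import numpy as np

rng = np.random.default_rng(0)
k, t = 0.5, 1.0
samples = rng.normal(0.0, 1.0, 200_000)   # samples of rho(x, 0) = N(0, 1)
evolved = samples * np.exp(k * t)          # flow each sample: Phi_t(x) = x e^{kt}

print(evolved.std())                       # ~ e^{kt} = 1.6487...
print(np.exp(k * t))                       # the width predicted by the density law
```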

Compare with the rule we found for functions:

$$\frac{d}{dt}\,f = v(x)\,\partial_x\,f,$$

for which a formal solution $e^{t\,v\,\partial_x}$ could be written. The r.h.s. operator of the $\rho$ equation, $-\partial_x\,v(x)$, is not so amenable to exponentiation. The term $-v(x)\,\partial_x$ alone could be exponentiated as $e^{-t\,v\,\partial_x}$; evidently it translates backwards by $t$. But the other term, $-(\partial_x v)$, makes a mess of things.

We will proceed another way. If we write an integral which measures the spatial average of a function $f$,

$$\langle f \rangle = \int dx\,\rho(x)\,f(x),$$

then if we were to study the time-evolution of this average, we should only time-evolve the argument to $f$. The $\int dx\,\rho(x)$ part simply represents a fixed integral over the entire space, and should not evolve:

$$\langle f \rangle(t) = \int dx\,\rho(x)\,f\big(\Phi_t(x)\big).$$

Then writing $f\big(\Phi_t(x)\big) = \big(e^{t\,v\,\partial_x}\,f\big)(x)$, we can see the earlier $\frac{d}{dt}f(x(t))$ vs. $\partial_t\,\rho(x,t)$ distinction as associating the time-evolution operator either to the right or left:

$$\langle f \rangle(t) = \int dx\,\rho(x)\,\left[e^{t\,v\,\partial_x}\,f(x)\right] = \int dx\,\left[\rho(x)\,e^{t\,v\,\partial_x}\right] f(x).$$

But now there appears a third option: keep $f$ fixed but transform the coordinate of the entire integral backwards:

$$\langle f \rangle(t) = \int dx\,\rho\big(\Phi_{-t}(x)\big)\,\left|\frac{\partial\,\Phi_{-t}(x)}{\partial x}\right|\,f(x).$$

Here the term $\left|\partial\,\Phi_{-t}(x)/\partial x\right|$ is the Jacobian of the time evolution map $\Phi_{-t}$.

Apparently, if we want to assign the time-dependence to $\rho$, what we need is:

$$\rho(x, t) = \rho\big(\Phi_{-t}(x)\big)\,\left|\frac{\partial\,\Phi_{-t}(x)}{\partial x}\right|.$$

In terms of our diagrams, $\rho(x, t)$ simply “watches” a spot $x$ of the space as $M$ flows under it. The density at $x$ at time $t$ is the density which was at $\Phi_{-t}(x)$ at time $0$. The following diagram represents this informally (with the little reference frame evoking the time-evolving volume element which produces the Jacobian term):

Our expression,

$$\rho(x, t) = \rho\big(\Phi_{-t}(x)\big)\,\left|\frac{\partial\,\Phi_{-t}(x)}{\partial x}\right|,$$

is called the “pushforward”3 of $\rho$ by $\Phi_t$, denoted $(\Phi_t)_*$:

$$\rho(\cdot, t) = (\Phi_t)_*\,\rho.$$

One last note. Pushforwards of densities turn out to compose in forward order,

$$(\Phi_{t_2})_*\,(\Phi_{t_1})_* = \big(\Phi_{t_2} \circ \Phi_{t_1}\big)_* = (\Phi_{t_1 + t_2})_*,$$

like points themselves (and unlike functions/pullbacks). This looks wrong at a glance because the time-evolution operators are applied to the argument of $\rho$ in reverse order. To find the density flowing to a fixed point $x$ at time $t_1 + t_2$, we want to find the point for which application of $\Phi_{t_1 + t_2}$ should give $x$, and this is exactly

$$\Phi_{-t_1}\big(\Phi_{-t_2}(x)\big) = \Phi_{-(t_1 + t_2)}(x).$$




In all we have three “views” of time evolution:

  • Trajectories $x(t) = \Phi_t(x_0)$.
  • Functions $\Phi_t^*\,f = f \circ \Phi_t$.
  • Densities $(\Phi_t)_*\,\rho$, which we think of as being associated with “states” $|\psi(t)\rangle$.

Note that these all describe the same evolution; we cannot really even say which view is primal! While the evolution of trajectories is the most natural pedagogical starting point, there is nothing in the physical world to truly distinguish this from, say, an opposite-in-time variation of the function we use to measure or observe the state.

It turns out that “functions” and “densities” will be easier to talk about than “trajectories” themselves. Outside of the setting of $\mathbb{R}^n$ we cannot directly integrate a trajectory like $x(t) = x_0 + \int_0^t v\,dt'$; there may be no notion of “addition” at all, and a motion will have to be expressed in terms of some other structure or representation, which functions and densities will supply.

These representations are the natural material of physical theories. The “Heisenberg picture” of quantum mechanics is a description of time-evolving functions, and the “Schrödinger picture” of time-evolving states, approximately densities. (It is not a description of trajectories themselves, but as we saw, trajectories and states both evolve in the “forward” direction, and a state is a trajectory in a larger space.)

Time-Ordered Exponentials

Now for one final feature frequently seen in physical theories which, it turns out, arises in the analysis of generic first-order systems.

Return again to our original ODE in one dimension, but now with a time-dependent velocity field:

$$\dot{x} = v(x, t).$$

If we discretize time into intervals of size $\Delta t$, the solution should have the form of a sequence of time translation operators $e^{\Delta t\,v\,\partial_x}$, except that the velocity field itself varies with time:

$$x(t) \approx \left[e^{\Delta t\,v(x,\,t_0)\,\partial_x}\;e^{\Delta t\,v(x,\,t_1)\,\partial_x} \cdots e^{\Delta t\,v(x,\,t_{N-1})\,\partial_x}\;\mathrm{id}\right](x_0).$$

The true trajectory should look like a $\Delta t \to 0$ limit of this expression, with each factor computed at time $t_i$ and applying the vector field at $x(t_i)$, the result of the path so far.

We could express the discretized solution on functions or densities instead:

$$\Phi_t^*\,f \approx e^{\Delta t\,v(x,\,t_0)\,\partial_x} \cdots e^{\Delta t\,v(x,\,t_{N-1})\,\partial_x}\,f.$$

(Note the pullbacks compose in opposite order.)

Which will be easiest to calculate? If we try to integrate directly we get something like

$$x(t) = x_0 + \int_0^t v\big(x(t'),\,t'\big)\,dt',$$

and then we could imagine writing $x(t)$ as a Taylor series in $t$. The time derivative of $\dot{x}$ would be $\ddot{x} = \partial_t v + v\,\partial_x v$, and repeating this procedure would produce something like a formal Taylor series for $x(t)$, involving increasingly-complicated time derivatives of $v$. This might work, but it’s hard to see how it would be enlightening, and it will only really work for $\mathbb{R}^n$ anyway without further machinery.

For a general expression, it’s simplest to work on the pullback $\Phi_t^*$. Let us try to compute the time derivative of this expression:

$$\frac{d}{dt}\big(\Phi_t^*\,f\big)(x_0) = \frac{d}{dt}\,f\big(x(t)\big) = v\big(x(t),\,t\big)\,\big(\partial_x f\big)\big(x(t)\big) = \Big[\Phi_t^*\,\big(v(x, t)\,\partial_x\,f\big)\Big](x_0),$$

where in the last line I’ve treated $v(x, t)\,\partial_x$ as an operator acting on the spatial argument to its right, and then rewritten both instances of $x(t)$-dependence as another application of $\Phi_t^*$ to the whole combination $v\,\partial_x\,f$.

Forgetting about the particular $f$ or $x_0$ now, the above is an operator equation for $\Phi_t^*$:

$$\frac{d}{dt}\,\Phi_t^* = \Phi_t^*\,\big(v(x, t)\,\partial_x\big).$$

It will be useful to give the combination $v(x, t)\,\partial_x$ a name of its own:

$$V(t) = v(x, t)\,\partial_x.$$

Note that $V$ operators at different times do not commute:

$$\big[V(t_1),\,V(t_2)\big] = \big(v_1\,\partial_x v_2 - v_2\,\partial_x v_1\big)\,\partial_x \neq 0 \quad \text{in general, where } v_i = v(x, t_i).$$

With this, the operator equation is

$$\frac{d}{dt}\,\Phi_t^* = \Phi_t^*\,V(t).$$

The $V(t)$ being on the right is important. The equivalent derivation for $(\Phi_t)_*$ would wind up with an operator on the left, corresponding to the fact that time-evolution composes on densities in the normal way.

We can now try to solve this “operator ODE” to find an explicit form of $\Phi_t^*$. The above almost looks like it should have an exponential solution

$$\Phi_t^* = e^{t\,V},$$

which worked in the case of time-independent $v(x)$, but now $V$ also varies with time. The next guess would be an exponential of an integral,

$$\Phi_t^* \overset{?}{=} e^{\int_0^t V(t')\,dt'},$$

but this is suspicious with $V(t)$ not commuting with itself at different times, and indeed, if we just discretize the integral into two bins, we see a problem:

$$e^{\Delta t\,(V_1 + V_2)} = 1 + \Delta t\,(V_1 + V_2) + \frac{\Delta t^2}{2}\big(V_1^2 + V_1 V_2 + V_2 V_1 + V_2^2\big) + \cdots$$

What we get has terms of $V_1$ mixed in with terms of $V_2$. A derivative will not produce an expression with all the factors falling to the left (as it would for commuting $V$s), and we cannot freely rearrange the operators as they don’t commute.

We can make progress by recursively applying the fundamental theorem of calculus for the operator $\Phi_t^*$, using $\frac{d}{dt'}\,\Phi_{t'}^* = \Phi_{t'}^*\,V(t')$ liberally and the fact that $\Phi_0^* = 1$, the identity operator:

$$\Phi_t^* = 1 + \int_0^t dt_1\,\Phi_{t_1}^*\,V(t_1) = 1 + \int_0^t dt_1\,V(t_1) + \int_0^t dt_1 \int_0^{t_1} dt_2\,V(t_2)\,V(t_1) + \cdots$$

This is a fairly tidy infinite-sum-of-nested-integrals, at least. Note that the integration variables obey $t_n \le \cdots \le t_2 \le t_1 \le t$, and the total integration volume of the $n$-th term is therefore that of an $n$-simplex of side length $t$, $t^n/n!$.

Observe also that the factors appear only in reverse time-order ($t_n \le \cdots \le t_1$), out of the $n!$ possible orderings, with each ordering participating in only a single one of the integrals.

Therefore if we simultaneously…

  • modify each integral to the full domain range $[0, t]$, multiplying the overall integration volume by $n!$
  • replace the products-of-$V$ operators with a “reverse time-ordered product” $\bar{T}\big\{V(t_1) \cdots V(t_n)\big\}$, such that the operator arguments always appear in the order of increasing time, even when the time parameters range into the region we just added to the integral,
  • and divide by $n!$ to undo the overcounting introduced by the first two steps…

… we should get an equivalent expression which even better resembles a true exponential series, and is therefore called the “(reverse) time-ordered exponential”:

$$\Phi_t^* = \sum_{n=0}^\infty \frac{1}{n!} \int_0^t dt_1 \cdots \int_0^t dt_n\,\bar{T}\big\{V(t_1) \cdots V(t_n)\big\} \;=\; \bar{T}\,e^{\int_0^t V(t')\,dt'}.$$

This, finally, is a general solution to time evolution, albeit an unwieldy one. The reverse time-ordering operator $\bar{T}$ must be used because time evolution composes in reverse order for pullbacks.

The most interesting thing about this is that the time-ordering operator arises here, in the study of the most generic first-order ODE. I first encountered it in quantum mechanics and later in field theory, but it is in fact quite fundamental to dynamical systems.
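
The ordering issue is easy to exhibit in the linear (matrix) analogue $\dot{x} = A(t)\,x$, where the propagator is a time-ordered product of short-time exponentials. This sketch (my own example, chosen so that $[A(t_1), A(t_2)] \ne 0$) shows the time-ordered product differing from the naive exponential of $\int A\,dt$:

```python
import numpy as np
from scipy.linalg import expm

def A(t):
    # A(t1) and A(t2) do not commute for t1 != t2
    return np.array([[0.0, 1.0 + t], [-1.0, 0.0]])

T, N = 2.0, 20_000
dt = T / N

U = np.eye(2)
for i in range(N):
    U = expm(A(i * dt) * dt) @ U   # time-ordered: later factors compose on the left

naive = expm(np.array([[0.0, T + T**2 / 2], [-T, 0.0]]))  # exp(int_0^T A(t) dt)
print(U)      # the true (time-ordered) propagator...
print(naive)  # ...differs noticeably from the naive exponential
```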




I chose to work out the time-evolution of a function $f$ because the equivalent derivation for densities is a little more complicated. However, it will be useful to write both, and the density formula will have the advantage of utilizing a normal time-ordering $T$. We’ll get to it another way, though.

If we imagine taking the spatial average of a function $f$ with respect to a certain density $\rho$, and plugging in the operator we just found,

$$\langle f \rangle(t) = \int dx\,\rho(x)\,\left[\bar{T}\,e^{\int_0^t V(t')\,dt'}\,f\right](x),$$

then it should be possible to rewrite this integral with a different time-evolution operator applied to $\rho$ instead. This would be the adjoint of $\Phi_t^*$, and will be a similar-looking integral, except a) normal time-ordered, and b) involving the adjoint $V^\dagger(t)$.

What is the adjoint of $V(t)$, then? This we can find by integration-by-parts of the expression:

$$\int dx\,\rho(x)\,\big[V(t)\,f\big](x) = \int dx\,\rho(x)\,v(x, t)\,\partial_x f(x).$$

We move $\partial_x$ off of $f$ and onto the combination $\rho\,v$, giving

$$\int dx\,\rho\,v\,\partial_x f = -\int dx\,\partial_x\big(v\,\rho\big)\,f.$$

Therefore $V^\dagger(t)$ acts on densities as

$$V^\dagger(t)\,\rho = -\partial_x\big(v(x, t)\,\rho\big).$$

Note this is basically the r.h.s. of the equation for $\partial_t\,\rho$, which we perhaps could have expected.

We can immediately write down the general form of the pushforward time-evolution operator on densities:

$$(\Phi_t)_* = T\,e^{\int_0^t V^\dagger(t')\,dt'}.$$

Here we use a normal time-ordering $T$ because $V^\dagger$ is acting on $\rho$ from the left.

But the operator $V^\dagger$ is a little less natural than $V$ itself. To see what’s going on we can split it into two parts by a product rule:

$$V^\dagger\,\rho = -\partial_x\big(v\,\rho\big) = -v\,\partial_x\,\rho - \big(\partial_x v\big)\,\rho.$$

The first term is exactly the negative of $V$, and must implement $\rho \to \rho \circ \Phi_{-t}$, pure “advection” of density, as a time-ordered exponential:

$$\left[T\,e^{-\int_0^t V(t')\,dt'}\,\rho\right](x) = \rho\big(\Phi_{-t}(x)\big).$$

(Here I’ve switched the sign of the upper limit, reversed the time-ordering, and added a negative sign all at once. Effectively this is taking $t \to -t$.)

The second term is simply a multiplication by $-\partial_x v$, the (negative) divergence of the velocity field. It must be responsible for the Jacobian term $\left|\partial\,\Phi_{-t}(x)/\partial x\right|$ in $(\Phi_t)_*\,\rho$.

Therefore there must be some way to factor the time-ordered exponential of the first term of $V^\dagger$ out of its exponential. Writing $(\Phi_t)_*\,\rho = J_t \cdot \big(\rho \circ \Phi_{-t}\big)$, with $J_t$ the Jacobian factor, we have

$$T\,e^{\int_0^t V^\dagger(t')\,dt'} = J_t\;T\,e^{-\int_0^t V(t')\,dt'}.$$

What must be the expression for $J_t$? I tried a few derivations here4, but I can’t find one I like, so I’ll just quote the answer:

$$J_t(x) = \exp\left(-\int_0^t \big(\partial_x v\big)\big(\Phi_{t'-t}(x),\,t'\big)\,dt'\right).$$

Effectively this just adds up all the divergence in $v$ which the density would have encountered on its flow between times $0$ and $t$. This is almost exactly what you’d expect from the exponentiation of $-\partial_x v$ alone, except that the argument moves with the integration time rather than being that of $x$ itself.

In all:

$$(\Phi_t)_*\,\rho = \exp\left(-\int_0^t \big(\partial_x v\big)\big(\Phi_{t'-t}(x),\,t'\big)\,dt'\right)\,\rho\big(\Phi_{-t}(x)\big),$$

where the first term is $J_t$ and the second takes $\rho \to \rho \circ \Phi_{-t}$.
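
A final 1-D sanity check (assuming the time-independent $v(x) = kx$, so $\partial_x v = k$ and $\Phi_{-t}(x) = x\,e^{-kt}$): the formula conserves total mass.

```python
import math

k, t = 0.4, 1.0
rho0 = lambda x: math.exp(-x**2)          # initial (unnormalized) density

def rho_t(x):
    x_back = x * math.exp(-k * t)         # Phi_{-t}(x)
    J = math.exp(-k * t)                  # exp(-int_0^t k dt') = e^{-kt}
    return rho0(x_back) * J

xs = [i * 0.01 - 10.0 for i in range(2001)]
mass_0 = sum(rho0(x) * 0.01 for x in xs)
mass_t = sum(rho_t(x) * 0.01 for x in xs)
print(mass_0, mass_t)                      # equal: mass is conserved
```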




Was this worth it?

Well, here’s the whole point: the standard description of classical mechanics is as the “time evolution of a density function”, though it is simplified somewhat by the Jacobian being $1$ for time-independent Hamiltonians.

And even quantum mechanics can be described in this way, if you treat the real and imaginary parts of the wave function as distinct variables $(\mathrm{Re}\,\psi,\,\mathrm{Im}\,\psi)$. The evolution of $\psi$ itself under the Schrödinger equation is not quite this, but the evolution of the probability density $|\psi|^2$ is, except that its velocity field depends on the underlying wave function $\psi$.

My hope in the next post is to describe these three theories (classical mechanics, quantum mechanics in the $(\mathrm{Re}\,\psi,\,\mathrm{Im}\,\psi)$ variables, and actual quantum mechanics) in these same terms, so as to see how much of Q.M., in particular, can be relegated to general dynamical-systems theory.




  1. I find this naming counterintuitive, as I am for some reason prejudiced to trying to read $\Phi_t^*$ as “pulling $\Phi_t$ back”, or perhaps as “pulling back” (to the initial time, I suppose), rather than pulling its argument, the function $f$, back.

  2. It’s curious that multiplication-by-$v$ acts like a “width” here. Multiplication feels like the wrong sense for this. In the discrete case, the change in $\rho$ would be approximately $\delta\big(x - x(t) - v\,\Delta t\big) - \delta\big(x - x(t)\big)$; it moves the mass between two bins a distance $v\,\Delta t$ apart. But nothing would be multiplied…

  3. Only at this point did I realize that this “pushforward” is different from the first thing by that name one encounters in differential geometry, the linearization $d\phi$ which carries tangent vectors or vector fields along a general map $\phi : M \to N$. These are related, but this in fact is the same “pushforward of a measure” that one encounters in probability theory, e.g. when “pushing forward” a probability measure from a sample space $\Omega$ to the real line by a random variable $X : \Omega \to \mathbb{R}$.

  4. I am certain I have seen a combinatorial-species-like derivation in which the series of sequences of $V^\dagger$ terms in the exponential is combinatorially equivalent to another series generated by the exponentiation of the combination of the advection and divergence terms, an “interaction picture”-like expression. But with all the integrals this gets very hairy, and the final result turns out to be simple anyway.