Introduction

This is part 1/? of a series of posts on the theory of “reaction networks”, a.k.a. “Petri Nets”, a.k.a. “mass-action kinetics”. For the most part this material will be assembled from various links on John Baez’s page. Almost nothing here is my own work, but I will be restating it in my own words and partially in my own notation.

My own interests are in:

  • Elementary examples of parts of quantum-mechanics and statistical-mechanics, e.g. Fock Space, and in decomposing these theories into simpler ones.
  • The relationships between descriptions of the same system at different levels of abstraction, with a renormalization flavor.
  • The zoo of equilibrium behaviors of dynamical systems, and how they relate to the underlying parameters.
  • Generating function methods, wherever they appear.
  • The opportunity to develop methods of diagramming high-dimensional spaces.

Reaction Networks

A “reaction network” is a graph whose edges represent “reactions” (or “processes” or “transitions”) between different “complexes” (or “combinations” or “tuples”) of “species”, each with an associated “rate constant”. For example:

The standard examples of systems which might be described by reaction networks are:

  • Chemistry, where the “species” stand for molecules, and the reactions stand for chemical reactions.
  • Population dynamics, where the “species” represent species, or cells, enzymes, viruses, etc, and reactions stand for interactions between species, like predation or asexual reproduction.

A reaction network represents the set of processes that can change the state of a many-particle system. Only these reactions can occur, and no others; therefore any state which is not accessible via some sequence of reactions is unreachable. The set of possible states is further constrained by the rate constants: some states may turn out to be attractors, others dynamically unreachable. The second reaction above represents a reversible reaction, so in principle this system could move back and forth along that line, but whether it actually does so will depend on the relative rates of the two reactions and on the interactions with the other reactions. If a network has a graph cycle—a “cyclic” or “weakly reversible” network—it could potentially return to its original state from another direction, but again it is not guaranteed to do so.

Levels of Abstraction

I want to emphasize at the outset that a reaction network merely describes a set of processes, which can be “translated” into equations in different ways—in particular, at different “resolutions”, “levels of analysis”, or “levels of abstraction”.

  0. Equilibrium. The highest level of abstraction. Here we characterize a system by its equilibrium or long-time behaviors, which might be an attractor state, but might also include more complicated behaviors like cycles. Whether these states can be deduced from the network is a matter of mathematics.

    Examples: Thermodynamics as equilibrium stat-mech. Expectations of stationary wave functions in quantum mechanics.

  1. Expectations. This is the entrypoint for most analysis of a reaction network: we translate the network into a differential equation for the average populations or concentrations. This is called the “Rate Equation”, an ODE for the species averages. It can be analyzed for long-time behavior, stability, bifurcations, etc., which can tell us things at level 0.

    Examples: in quantum mechanics, the classical limit as an expectation, or in another sense, stationary wave functions. Various models one encounters in diff-eq class, like the predator-prey model and models of viral spread.

  2. Stochastic Dynamics. At this much lower level of abstraction, we can describe the system as a probability distribution over states with exact numbers of each species. Then we can translate our reaction network into a differential equation for the evolution of this probability distribution, giving a “stochastic dynamical system”. This level gives a better picture of the underlying behavior, but it is much higher-dimensional and correspondingly much harder to solve. Nonetheless, under some conditions we can characterize equilibrium solutions at this level of analysis, and can lift these solutions to the Rate-Equation level.

    Examples: the Schrödinger equation, especially when viewed as a path integral of a propagator.

  3. Histories, Timelines, or Sequences of Reactions. At a lower level still is a characterization of the state as a particular sequence of reactions, with which we can associate some probability-weight or “amplitude”. The weighted sum over all possible timelines should lead back to the higher-level characterizations of the system. But this level of analysis is particularly suited to questions like: what is the probability of a transition between specific starting and ending states?

    Examples: a diagram in a path integral, in QFT. I also think of “programs”, in the sense of a sequence of expressions in an abstract language. Or, the probability of a specific sequence of mutations occurring in genetics, and the ways that selection effects would bias the post-hoc distribution of mutations. Or, the probability that some system breaks down over time.

  4. Micro-scale Dynamics. At the lowest level of analysis we have the interactions of individual particles. At this level the reaction network doesn’t tell us anything; in fact this is the level where you might calculate the rate constants used at higher levels of analysis by computing scattering cross sections from the underlying dynamics. Or perhaps we can use statistical-mechanics methods to find a density-of-states or partition function.

    Examples: Hamiltonian mechanics of interacting particles in a gas. Individual vertices in QFT.

Most of the theory here will concentrate on the three highest levels of analysis. In particular, we’ll want to say things about the behavior at higher levels of abstraction, through arguments at lower levels.

I spell all of this out because, first, I think it will be a good way to organize these ideas, but in particular because, for me, the most interesting thing about reaction networks is as a model system for understanding the relationships between the different levels of abstraction.

Level 1: The Rate Equation

Anatomy of a Rate Equation

A reaction network is most simply seen as encoding a differential equation called the “Rate Equation” of the network.

Here is the example from above again:

We translate into the following set of coupled diff-eqs:

The coefficients stand for the “rate constants” of the four reactions, and will also be used as the names for the specific reactions. Note that in this example two of them represent reactions relating the same sets of species, but in reverse directions.

When we want to iterate over reactions, we’ll index by . will be the total number of reactions, and we can also use for the set of reactions without ambiguity. So in this example we have . The rate constant of reaction is a positive real number, and will also be called —we will see how far we can get identifying reactions with their rate constants.

are the “concentrations” or “populations” of each of the species, which are non-negative real numbers—we think of the rate equation variables as representing the time-dependence of averages or expectations rather than integer numbers of molecules or organisms. You could also write concentrations as or like in chemistry class. We use capitals for concentrations because we will later use , and when working with exact integer numbers of particles.

The rate of a reaction depends on the product of the concentrations of its inputs only, e.g. our first reaction occurs with rate . This is the “law of mass action”.
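
To make this concrete, here is the generic form I have in mind (a sketch; the index convention, with $s_{r,i}$ counting the copies of species $i$ in the source complex of reaction $r$, is my own):

$$ \text{rate of reaction } r \;=\; r \,\prod_i X_i^{\,s_{r,i}} $$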

The coefficients like represent the change in the concentration of a species when a reaction occurs. So our first reaction removes two units of and adds one unit of , but doesn’t change the number of species, since it appears on both sides of the reaction.

The groups of species appearing on either side of the reactions are the “complexes” of the network. will denote the set of complexes and also the total number of them. In this network there are five:

We will index complexes by a lower-case . Each reaction has a source complex and a target complex —think of these as the two ends of an edge of a graph. We can index into these complexes to find the component of species like . Here:

  • , so , , and the full tuple is .
  • , so and the full tuple is .

The difference , in terms of species, we will call —this is the change in the exact species population when reaction occurs once. Here .

We could try to write the Rate Equation as a matrix equation:

…but this won’t be very useful because the matrix isn’t square—instead it has dimension , because the rates depend on the complexes, and there can be a lot of these in general. You could also write it as a diff-eq relating complexes to complexes, but you would have to denote the species-in-the-complexes in some other way. So, neither of these approaches will be useful.


The Rate Equation in Tuple Notation

We can condense the Rate Equation somewhat by summing over reactions, where are the “target” and “source” of reaction , and additionally index the species.
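
A sketch of the condensed form, in the spirit of the notation above (again, the exact placement of indices is my guess at the original):

$$ \frac{dX_i}{dt} \;=\; \sum_{r} r\,\bigl(t_{r,i} - s_{r,i}\bigr)\,\prod_j X_j^{\,s_{r,j}} $$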

This is not too bad, but all those species indices are not so easy to read. Some of the literature chooses to clean this up by omitting the species index entirely, asking us to remember which terms refer to species (which obey special rules) and which do not. I want to take a different approach and represent all indexing-over-species as an overbar , which will always represent a tuple over species. So in the example above. You can still index into a barred quantity like .

A tuple over species will act like a vector as , but we won’t assume it has any “vectorial” properties besides those we explicitly define here.

Tuples can be added or subtracted, and it will occasionally be useful to take a dot-product:

We’ll also define operations on tuples which do not have vector analogs. Most operations which would not be defined on vectors we implement by applying component-wise, then multiplying all of the components of the result:

Note that the last line is a succinct way to write a multinomial coefficient , if . We could also write this as .
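
For concreteness, these are the kinds of component-wise definitions I take to be in play (my guesses at the intended forms):

$$ \bar{X}^{\bar{n}} = \prod_i X_i^{\,n_i}, \qquad \bar{n}! = \prod_i n_i!, \qquad \binom{\bar{n}}{\bar{k}} = \prod_i \binom{n_i}{k_i}, \qquad \frac{N!}{\bar{n}!} = \binom{N}{n_1, n_2, \ldots} \;\text{ if } \textstyle\sum_i n_i = N. $$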

Finally we will need a couple more non-standard operations. First we have the “falling power”, which works for integers or tuples:

The last line indicates that the falling power could be related to a multinomial coefficient with both arguments tupled, but this seems a little too wacky, so we won’t use it.

Note that is a product of terms, all but one of which are slightly less than . So it’s reasonable to read it as representing a number “somewhat smaller than ”.
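
For reference, the falling power I have in mind is the standard one; the tuple version just applies it component-wise and multiplies:

$$ n^{\underline{k}} \;=\; n(n-1)\cdots(n-k+1) \;=\; \frac{n!}{(n-k)!}, \qquad \bar{n}^{\underline{\bar{k}}} \;=\; \prod_i n_i^{\,\underline{k_i}}. $$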

Next we have the scalar product of more-than-2 tuples, which we write as a chain of dot-products:

And lastly we implement inequality between tuples, which can be seen as a shorthand for “logical AND” or as a product of indicator functions. We won’t attempt to find a notation for “OR”.

With all of this notation, and , we can write the Rate Equation very concisely. The Rate Equation on column vectors looks like this:

We want to write this as a “tuple equation”, just as we do for vectors, representing a system with one equation for each index . Then the r.h.s. expression will have to be evaluated in each equation. For a two-species system the above stands for a system of two equations, each containing a term which has its own internal tuple structure:

We immediately encounter a weakness of the notation, unfortunately—we would like to write this term as , but we will need to indicate somehow that these tuples are not to participate in the system of equations. I will account for this with absolute-value symbols like . This isn’t ideal because, unlike a vector norm, we’re not really taking an absolute value, and it’s easy to imagine doing this with negative numbers, although we won’t encounter any. Another approach could be for operations like to be defined to apply component-wise but without multiplying-away the tuple structure, like , and only multiplying when wrapped in absolute-value signs: . I don’t care for this—it will be too verbose when we use a lot of tuples. I’ll instead use only when an expression inside a tuple equation might be ambiguous. In general this might collide with any actual absolute values, but we won’t encounter any of these; nearly all of our tuples will be positive anyway.

An alternative to all of this would be to use an Einstein-like notation, with the rule that tuple components multiply for any tuple index appearing on only one side of the , while those that appear on both sides of lead to a tuple of equations. We’d have:

I’m going to stick with tuples, though—we’ll get a lot of use out of them.

Our concise Rate Equation is therefore:
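
A sketch of what I expect this to look like in the bar notation (the placement of bars and indices is my reconstruction):

$$ \frac{d\bar{X}}{dt} \;=\; \sum_{r} r\,\bigl(\bar{t}_r - \bar{s}_r\bigr)\,\bar{X}^{\bar{s}_r} $$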


A Rate Equation Example

For any particular network you could attack the corresponding Rate Equation with the methods of standard differential equations. In general we can’t hope to find exact solutions; the goal of such analysis is usually to make “level 0” statements characterizing equilibria and long-time behavior. But it will be helpful to see the general shape such an analysis takes with a worked example.

As a demonstration of what it takes to analyze a particular Rate Equation, I’ll look at the following simple model of an epidemic, where represents the “healthy population”, is “sick”, and is “dead”.

This leads to the following system of equations:

The tuples are and , and the input complexes are and . So the same equation in “tuple notation” would be:

For such a simple system the tuple notation is no help at all, and the clearest form would probably be to just expand the “input” tuples:

Clearly the total population has zero time derivative, since its derivative is the sum of the three diff-eqs above. This also implies that the matrix representation on complexes is a stochastic matrix, with columns summing to 1:

Now we’ll solve this. This system is simple enough that it admits an exact solution by way of the following trick. We observe that the and equations are not independent because

Then the solutions for and differ only by a constant:

And we can make use of the fact that to solve for exactly:

We now define parameters to make this dimensionless. If we define , we can rearrange:

And now we define constants , , and to write this in a dimensionless form:

We arrive at a fairly simple first-order differential equation, which is most easily analyzed by observing that always has an intercept of , and is shaped like a distorted kind of upward parabola. The minimum of this part of the equation occurs where , or ; hence . For this minimum is to the left of ; for it is to the right:

Then the full system will have a zero (a fixed point) wherever the horizontal line crosses this curve. Note that is always , being the inverse of the initial “healthy” fraction.

Apparently:

  • When , meaning , is a fixed point: a healthy population can’t suddenly get sick.
  • When or , the model develops an “attractor” fixed point —an eventual “death toll”—and while the value of the attractor increases slowly as a function of the initial condition (through , which isn’t really independent of ), it increases very quickly as a function of the “rate of transmission” relative to the rate at which sick patients die off, represented by .

From here you could go on to calculate the location of the fixed point as a function of the parameters and the initial conditions, or the timescale of convergence to the fixed point, or you could modify this comically-simple model to include behaviors like “getting better” and “quarantining”. But it is adequate to demonstrate the typical behavior of reaction networks: fixed points appear and move around the parameter space as a function of the rate constants.
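
For readers who want to poke at this numerically, here is a minimal sketch, assuming the reactions are H + S → 2S (rate a) and S → D (rate b), which is my reading of the model since the original diagram isn’t reproduced here; the rate constants and initial conditions below are made up for illustration:

```python
# A minimal numerical check of the epidemic Rate Equation, assuming the
# reactions are  H + S -> 2S  (rate a)  and  S -> D  (rate b).
# Rate constants and initial conditions are arbitrary illustrative values.
from scipy.integrate import solve_ivp

a, b = 0.5, 0.25                 # transmission rate, death rate (made up)
H0, S0, D0 = 0.99, 0.01, 0.0     # start with a small sick fraction

def rate_eq(t, y):
    H, S, D = y
    return [-a * H * S,          # healthy lost to infection
            a * H * S - b * S,   # sick gained by infection, lost to death
            b * S]               # dead gained

sol = solve_ivp(rate_eq, (0, 400), [H0, S0, D0], rtol=1e-8)
H, S, D = sol.y[:, -1]
print(f"total population stays {H + S + D:.6f}")  # the conserved quantity
print(f"long-time 'death toll' D -> {D:.4f}")     # the attractor fixed point
```

Varying the ratio a/b shows how sharply the eventual death toll responds to the transmission rate, which is the behavior described above.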

Level 2: The Master Equation

Stochastic Evolution

In one sense the Rate Equation is a direct translation of the information in a reaction network. But in most cases this should be thought of as an approximation of the underlying physical system, in a few ways:

  1. The rate constants are either estimated from measurements, or, if they’re determined from the underlying physics, are probably approximations which only capture the highest-order interactions
  2. The law that reactions occur with rate is itself only approximately true
  3. The initial conditions are only approximate measurements and therefore represent a peaked distribution of species numbers , which is in actuality discrete rather than continuous, and may be quantum-mechanical

We could imagine trying to “undo” the abstraction along any of these axes. Here we’ll take the first two as givens—as assumptions of the “reaction network” class of models, essentially—and only broaden our view along the third axis.

Instead of concentrations, we’ll now consider the time-evolution of a probability distribution over the exact particle numbers.

A “state” will now be a distribution over the discrete species numbers, represented by an -tuple . We’ll use for the total number of particles. The equation governing the evolution of such a distribution is called a “Master Equation” of the system.

This is a “stochastic” dynamical system, but will turn out to closely resemble a quantum-mechanical wave function, with the Master Equation corresponding to the Schrödinger Equation for many particles, written in a basis whose elements are labeled by the “occupation numbers” of each energy state, exactly like our .

In quantum mechanics one might arrive at the same thing by first considering some underlying single-particle wave functions; then restricting to only anti-symmetric or symmetric combinations (a Fock Space), and then switching to the occupation basis. We won’t run into those symmetrization questions because we are not trying to say anything about spatial wave functions. (I imagine quantum mechanics would come out cleaner by starting with the occupation numbers. Spatial wave functions are the complicated part. These occupation-number equations will not be difficult to write down, as we’ll see.)

Following that analogy, we can represent a full state by an expression which looks like a wave function:

Here, . Note that in the stochastic case, the coefficients are positive real numbers rather than complex amplitudes, and we normalize this distribution with rather than .

One could write the quantum-mechanical probability as , which is just a distribution over tuples of species. (Telling! We’ll want to follow the two parallel tracks to the exact point at which this fails.)

Our “Master Equation” will represent the time evolution of this distribution by a Hamiltonian-like operator:

The solution can be found for a given initial condition as a time-evolution operator which is just the matrix exponential of the generator :

Now we’ll build it. First we need to represent our reactions in the occupation-number basis. We’ll use the same example from the beginning of this essay:

Each reaction converts the term , where, recall, : it adds to the targets and removes from the sources. Spelled out, this is:

How do we represent the rate of a reaction? Reaction will convert state to state at a rate proportional to:

  1. The prior probability of the initial state
  2. The rate constant
  3. The number of ways to choose all of the reaction’s inputs out of all the candidates in state , which is the tuple “falling power” introduced above:

We also have a condition that , meaning contains enough of each species in for the reaction to actually occur. But this is already implied by the definition of the falling power: if .

Then the total rate of reaction on state is:

Our time-evolution generator will consist of one such term for each initial state and for each reaction. Each should link the initial state to the final state , which we represent by an additional factor , where the first indicator selects the state which is related to by exactly one occurrence of reaction , and the second removes the same weight from the initial . This erases all of the probability mass from , giving it to , but note that, in almost all cases, another instance of the same reaction will also be adding probability mass back to the state . (Here I wonder what can be said about those states for which the inflow and outflow rates are equal, or if it would be useful to represent the system as the difference between inflows and outflows for each state.)

This form implies that our generator matrix does not alter the total probability. is an “infinitesimal stochastic matrix”, whose columns each sum to 0; this turns out to imply that the exponential will be a stochastic matrix whose columns sum to 1. In quantum mechanics, this corresponds to the fact that anti-Hermitian operators are generators of unitary transformations. For example, is the generator of time-evolution: .

The full Master Equation is:

We can isolate the generator of time-evolution ; it is a matrix representing the contribution of state to state :
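
To make the construction concrete, here is a sketch that builds this generator explicitly for the small epidemic network used earlier (H + S → 2S, S → D, again my own illustrative choice of reactions and rates). Because the total particle number is conserved there, the state space is finite and we can exponentiate the matrix directly:

```python
# Build the Master Equation generator for H + S -> 2S (rate a), S -> D (rate b),
# on the finite state space of (h, s, d) with h + s + d = N.  Illustrative only.
import itertools
import numpy as np
from scipy.linalg import expm

a, b, N = 0.5, 0.25, 20
states = [st for st in itertools.product(range(N + 1), repeat=3) if sum(st) == N]
index = {st: i for i, st in enumerate(states)}

# Each reaction: (change in (h, s, d), rate as a function of the state).
# The rate is the rate constant times the falling powers of the input counts.
reactions = [((-1, +1, 0), lambda h, s, d: a * h * s),  # H + S -> 2S
             ((0, -1, +1), lambda h, s, d: b * s)]      # S -> D

G = np.zeros((len(states), len(states)))
for (h, s, d) in states:
    for change, rate in reactions:
        r = rate(h, s, d)
        if r == 0:
            continue                                   # not enough inputs
        target = (h + change[0], s + change[1], d + change[2])
        G[index[target], index[(h, s, d)]] += r        # probability flows in...
        G[index[(h, s, d)], index[(h, s, d)]] -= r     # ...and out of the source

print(np.allclose(G.sum(axis=0), 0))     # infinitesimal stochastic: columns sum to 0
p0 = np.zeros(len(states))
p0[index[(N - 1, 1, 0)]] = 1.0           # start with exactly one sick individual
p_t = expm(10.0 * G) @ p0                # evolve the distribution to time t = 10
print(p_t.sum())                         # total probability is still 1
```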


The Master Equation on Generating Functions

Now, rather than representing our state by a sum over occupation-number basis elements , we could instead represent it by a “probability generating function” (p.g.f.) over monomials , like this:

You could motivate this in a few ways:

  • Out of a general love for generating functions (like mine)
  • Or, by analogy to the Rate Equation , which involved powers of a species tuple as well, perhaps hoping to derive the Rate from the Master (we’ll try this in a bit)
  • Or, by observing that falling powers , which we used above to count the “number of ways of choosing in the reaction inputs ”, correspond exactly to the coefficients you get when taking derivatives of the monomial, as sketched below:
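
The identity I mean is just standard calculus, written once per species and then component-wise for tuples (the bar on the derivative is my own shorthand):

$$ \frac{d^k}{dz^k}\, z^n \;=\; n(n-1)\cdots(n-k+1)\; z^{\,n-k} \;=\; n^{\underline{k}}\, z^{\,n-k}, \qquad \bar{\partial}^{\bar{k}}\, \bar{z}^{\bar{n}} \;=\; \bar{n}^{\underline{\bar{k}}}\; \bar{z}^{\,\bar{n}-\bar{k}}. $$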

(Apparently, the coefficient in a derivative was always counting the number of ways to choose which copy of to remove—a fact which I may have learned in calculus class, but have long since forgotten.)

Guided by this observation, we’ll take our quantum-mechanical analogy even further by defining a “creation” operator and an “annihilation” operator for each species . The effect of these operators on a monomial is to increase or decrease by 1 and to multiply by a factor representing the number of ways of doing so—a falling power for but just 1 for any power of . (I’m not using the exact QM notation , because these are not “Hermitian conjugates”. I never cared for that anyway.)

The implementation of in the g.f. representation will be , as we have just seen, while the implementation of must be multiplication by :

We’ll write each of these as a tuple over all species :

We can also define a number operator which counts the number of particles of . It’s given by—what else?—

The operators don’t commute; their commutator is:

Note the quantum-mechanical analogy. For , the uncertainty principle is just this same identity . It almost looks like the extra factor in the Q.M. version doesn’t need to be there—it has the effect of making operate correctly on functions of the form , where it gives . To me, this looks like yet another feature of Q.M. which could be “factored out” of the theory—I need to start keeping a list.
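
To fix notation, here is the toolkit I take these paragraphs to be defining (a sketch in the usual generating-function conventions; where the original formulas are only implied, the symbols are my guesses):

$$ a_i = \frac{\partial}{\partial z_i}, \qquad a_i^{+} = (\text{multiplication by } z_i), \qquad N_i = a_i^{+} a_i = z_i \frac{\partial}{\partial z_i}, \qquad [\,a_i,\, a_j^{+}\,] = \delta_{ij}. $$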

Applying multiple s followed by the same number of s returns you to the same power you started with, but with a coefficient of a falling power:

So we can also define the “falling power of the number operator ”:

… and that is everything we need to represent the Master Equation as a time-evolution operator in the generating function representation:
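
In this notation I would expect the operator to look roughly like the standard stochastic-mechanics form (following Baez; the exact placement of bars and exponents is my reconstruction):

$$ H \;=\; \sum_{r} r \,\Bigl( (\bar{a}^{+})^{\bar{t}_r} - (\bar{a}^{+})^{\bar{s}_r} \Bigr)\, \bar{a}^{\,\bar{s}_r}, $$

where a tuple exponent on an operator tuple means the same component-wise product as before, e.g. $\bar{a}^{\,\bar{s}} = \prod_i a_i^{\,s_i}$.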

This is a sum across reactions , which we could write as . The term will act on a state of definite species number like:

Read this as:

  1. First, consume the inputs from the input state , which gives the state times the original probability and a coefficient representing the probability of this reaction occurring.
  2. Then, in parallel:
  • move to the new state and add all the probability
  • subtract all of the probability from the original state

For comparison, here is the original Master Equation, with expanded into :

I prefer the operator version. There’s one fewer to think about, and it’s easier to read the operator equation as a “process” linking to . And in some cases we’ll be able to sum the target g.f. series into a nice function of , which will make the derivatives in very easy to use.

Finally, in either representation, we can implement time-evolution by exponentiating this thing and acting on an arbitrary state:
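
As a sanity check on the operator mechanics, here is a small sketch in sympy, using a made-up one-reaction network 2 Z1 → Z2 (the species names, rate, and initial state are all illustrative):

```python
# Operator form of the Master Equation acting on a probability generating
# function, for the made-up reaction 2 Z1 -> Z2 with rate k.
import sympy as sp

z1, z2 = sp.symbols("z1 z2")
k = sp.Rational(1, 2)                      # illustrative rate constant

def H(psi):
    # k * ( (a2+)(a1)(a1)  -  (a1+)(a1+)(a1)(a1) ) applied to psi
    consumed = sp.diff(psi, z1, 2)         # annihilate two Z1 particles
    return sp.expand(k * (z2 - z1**2) * consumed)

psi0 = z1**4                               # definite state: exactly four Z1
print(H(psi0))                             # 6*z1**2*z2 - 6*z1**4
print(H(psi0).subs({z1: 1, z2: 1}))        # 0: probability is conserved

# Crude time evolution: a few terms of exp(t*H) applied to psi0.
t = sp.Rational(1, 10)
psi_t, term = psi0, psi0
for n in range(1, 5):
    term = sp.expand(H(term) * t / n)
    psi_t = sp.expand(psi_t + term)
print(psi_t.subs({z1: 1, z2: 1}))          # still 1, term by term
```

The last two prints are the infinitesimal-stochastic property again: evaluating a g.f. at z1 = z2 = 1 sums its coefficients, and this H always produces something summing to zero, so total probability is preserved.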


Time Evolution as a Path Integral

Now we’ll briefly dip our toes into the next level down the ladder of abstraction.

One advantage of the operator formulation is that it lends itself to a path-integral-like interpretation, as follows. Observe that is a sum over reactions, like . Then, when we exponentiate this operator we get:

Carrying out the powers leads to terms with sequences of operators like with some coefficient, which we read as representing the rate at which that sequence of transitions occurs in the time . When everything’s expanded, the action of time-evolution becomes a sum-over-timelines, or a sum over all possible sequences of transitions. For some initial state , each term will appear in the final state as a sum over all sequences which could produce it, each in proportion to the factors below (sketched schematically after the list):

  • the number of ways to choose candidates for those sequences: a product of terms .
  • the probability that those reactions occur independently, which gives a product of factors for each reaction, representing the underlying hypothesis that each reaction occurs at an independent rate over time.
  • a correction for each sequence of reactions, which (I think) avoids an overcounting which arises because permuting the sequence of reactions is equivalent to permuting the times at which those reactions occur
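
Schematically, the expansion being described is (a sketch, with each single-reaction term of the generator written as a separate operator):

$$ e^{tH} \;=\; \sum_{n \ge 0} \frac{t^n}{n!} \Bigl(\sum_r H_r\Bigr)^{n} \;=\; \sum_{n \ge 0} \frac{t^n}{n!} \sum_{r_1, \ldots, r_n} H_{r_n} \cdots H_{r_1}, $$

so each timeline is an ordered sequence of reactions, weighted by the product of its operator factors and the factorial term in front.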

For example, if our network looks like this:

Then we have two “processes” and . The series expansion will have some term in it like , representing “one reaction followed by one ” (acting on a state to the right). Then if our initial state has 3 “1” particles and 1 “2” particle, which in the g.f. notation is , then this sequence will take us to the state : two “1”s combine to give a “11”; then their product combines with the “2” to give a “21” and a “1”.

For this sequence the combinatorial term would be for the first reaction, there being 6 ways to choose 2 of 3 particles (which treats the particles as distinguishable), and for the second.

We can draw diagrams for three of these combinations; the other three choose the same two inputs, but in the opposite order.

Observe that in each case one of the “1” particles passes through untouched, being affected by only the identity term of instead of any of the interactions.

These are something like Feynman diagrams. They’re mostly a conceptual aid at this level, but they could be useful when analyzing small- systems. If you want to determine the probability of, say, beginning in a state of and ending in the same state, you could formulate a combinatorial series of diagrams representing all the sequences of reactions which meet this constraint: (no change) + (one reaction and the reverse) + (higher order terms). The exact series would depend on the available set of reactions. It would be instructive to come up with exact Feynman rules for some model reaction network.

Once again the most interesting result might be the light this can shed on quantum-mechanics through the analogy. It has always felt to me that QM was a tangled mess of concepts which, individually, are not all that complicated. Once you have an elementary example of each, the full theory ought to “factor”, with the particulars of quantum physics entering at clearly-delineated points. This could at least be an aid to pedagogy; of course any improvement to pedagogy can also improve the reach of pedagogy. If we dare hope for more, we might aim to “compress” the body of physics knowledge better in the minds even of experts, letting everyone see further and grasp subtleties more easily than the current presentation allows. If!