# Petri Nets for Computer Scientists

I am writing my undergraduate thesis on string diagrams and Petri nets, and along the way I will be writing blog posts summarizing things that I have learned, and explaining them to a (hopefully) wider audience. Right now I am reading a book on Petri nets by John Baez and Jacob Biamonte called Quantum Techniques for Stochastic Mechanics. The trouble with John Baez is that he is the king of blog posts explaining math. If I am writing about a subject that he has covered, it is somewhat difficult to add value. I have two choices: I can write about something more specific, or I can appeal to an audience that John Baez does not write for. In this blog post, I have chosen the latter route, and so this one goes out to all my CS friends.

## Data Structures

Suppose you are a programmer, and you want to simulate chemical reactions. For the record, I’m only expecting you to have as much chemistry knowledge as I have, which is almost none.

Before we can define our algorithm, we have to define the data that we are working with. To do this, let’s make some assumptions about chemistry. Suppose we have a test tube. In our test tube, there are some different types of molecules. For instance, we might have H_2 O, we might have CO_2, we might have H_2 and O_2, etc. Because this is computer science, assume that there is a finite number of types of molecules. Then we will use a dictionary to store the number of molecules of each type. The keys for this dictionary will be symbols representing each molecule type, and the values will be natural numbers representing the number of molecules of each type. For instance, W[\mathbf{H_2O}] represents the number of H_2O molecules (we use a bold font to indicate that \mathbf{H_2O} is a formal symbol). We will want to write something that computes the evolution of W through time.
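To make this concrete, here is a minimal sketch in Python (the strings "H2", "O2", "H2O" are just stand-ins for the formal symbols above, and the counts are made up):

```python
# State of the test tube: formal symbol for a molecule type -> count.
W = {"H2": 4, "O2": 3, "H2O": 0}

# W["H2O"] is the number of H2O molecules currently in the tube.
print(W["H2O"])  # prints 0
```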

In order to describe the evolution of W, we will want to describe what changes are allowed to happen to W; what chemical reactions can happen. The description of these chemical reactions is what the Petri net will capture. What do reactions look like? Well, normally we write them down like so: 2H_2 + O_2 \to 2H_2 O. To represent this on a computer, note that all we care about is how many of each molecule is on the left side, and how many of each molecule is on the right side. To represent this, we will use two more dictionaries, which are just like W. For instance, in the above reaction we would have I[\mathbf{H_2}] = 2, I[\mathbf{O_2}] = 1, O[\mathbf{H_2O}] = 2.

However, we might have more than one reaction that can take place! In that case, we have a list of pairs of I and O, which we call R. The dictionary that denotes the input of the kth reaction we will call R_I[k], and the dictionary that denotes the output of the kth reaction we will call R_O[k].

Provisionally, we define a Petri net to be this list.
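In code, the provisional Petri net for 2H_2 + O_2 \to 2H_2 O could be stored like so (a sketch; I write R_I and R_O as helper functions rather than separate lists):

```python
# A reaction is a pair (I, O) of dicts; the Petri net is the list R.
# Molecule types missing from a dict are treated as having count 0.
R = [
    ({"H2": 2, "O2": 1},  # I: two H2 and one O2 go in
     {"H2O": 2}),         # O: two H2O come out
]

def R_I(k):
    """Input dictionary of the kth reaction."""
    return R[k][0]

def R_O(k):
    """Output dictionary of the kth reaction."""
    return R[k][1]
```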

Let’s draw a picture of a Petri net. We are just going to think about the molecule types H_2, O_2, and H_2 O, and we are going to have only one reaction: 2 H_2 + O_2 \to 2H_2 O. We draw a Petri net with a blue circle for each molecule type, and a grey rectangle for each reaction type. Each reaction k has R_I[k][\mathbf{s}] input arrows from molecule s, and R_O[k][\mathbf{s}] output arrows to molecule s. So our example reaction has two arrows coming in from H_2, one arrow coming in from O_2, and two arrows going out to H_2 O.

Now, at this point you may have noticed that we don’t have to restrict ourselves to chemistry. This set-up could describe many other dynamic processes. For instance, suppose we want to simulate the population dynamics of amoebas. There are two “reactions” that can take place; one amoeba can split into two amoebas, and two amoebas can fight each other so that one of them dies (who knows, is this how amoebas work? I’m not a biologist). To describe this, we can use the following Petri net, where F stands for “fight” and D stands for “divide”.
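In the same encoding, the amoeba net has a single type ("amoeba") and two reactions (again a sketch, using the conventions above):

```python
# Amoeba Petri net: one "molecule" type, two reactions.
amoeba_R = [
    ({"amoeba": 1}, {"amoeba": 2}),  # D: one amoeba divides into two
    ({"amoeba": 2}, {"amoeba": 1}),  # F: two amoebas fight, one dies
]
```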

## Simulation

This is all very pretty, but we want to actually write some code to describe how our system (the dictionary W) evolves over time. Because this is computer science, we are working in discrete time steps. Because we are modelling continuous systems, this assumption is false. Such is life.

During each time step, we want to possibly select a reaction to run. To do this, we have to assign a probability to each transition. To do *this*, we have to make a basic assumption.

**Reasonable Assumption**: Any tuple of molecules that forms an input to a transition (so, for instance, a tuple whose first two elements are H_2 molecules and whose third element is an O_2 molecule) has an equal probability of undergoing the transition in a given time step. Therefore, the probability that *some* input tuple undergoes the reaction is proportional to the number of different possible input tuples.

One might guess that the set of possible input tuples for the kth reaction is the set

\{1,\ldots,W[\mathbf{s_1}]\}^{R_I[k][\mathbf{s_1}]} \times \ldots \times \{1,\ldots,W[\mathbf{s_n}]\}^{R_I[k][\mathbf{s_n}]}

This is the set of tuples with R_I[k][s_j] elements from molecule type s_j, for j=1\ldots n. The number of elements in this set is

W[\mathbf{s_1}]^{R_I[k][\mathbf{s_1}]} \cdot \ldots \cdot W[\mathbf{s_n}]^{R_I[k][\mathbf{s_n}]}

However, one molecule cannot appear in two places in the input tuple; that would be cheating! We must account for this in some way. To do so, we first look at the case where there is only one type of molecule. Suppose that we have m molecules of this type. Then to create a tuple of length k, we make k successive choices of molecule. For the first choice, we have m options, because we can take any molecule. For the second choice, however, we only have m-1 options, because we can’t choose the first molecule again. In general, for the jth choice we have m - j + 1 options. Therefore, the total number of tuples is m \cdot (m - 1) \cdot \ldots \cdot (m - k + 1). We call this a “falling power” (also known as a falling factorial), and we give it the notation m^{\underline{k}}. With this in mind, the correct number of input tuples to the kth reaction is:

W[\mathbf{s_1}]^{\underline{R_I[k][\mathbf{s_1}]}} \cdot \ldots \cdot W[\mathbf{s_n}]^{\underline{R_I[k][\mathbf{s_n}]}}
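This count is easy to compute. Here is a sketch (the names falling and num_input_tuples are my own, not standard library functions):

```python
def falling(m, k):
    """Falling power m*(m-1)*...*(m-k+1), with k factors."""
    result = 1
    for j in range(k):
        result *= m - j
    return result

def num_input_tuples(W, I):
    """Number of distinct input tuples for a reaction with input
    dictionary I, given the current state W."""
    total = 1
    for s, count in I.items():
        total *= falling(W.get(s, 0), count)
    return total

# For 2 H2 + O2 -> 2 H2O with 4 H2 and 3 O2 on hand:
# falling(4, 2) * falling(3, 1) = 12 * 3 = 36 input tuples.
```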

We know that the probability should be proportional to this number. But proportional by what constant? To answer this question, we must know something about how fast reactions happen. Specifically, in a time interval \Delta t, how likely is it for a single input tuple to undergo a reaction? From the information I have given you, it is impossible to say. The real answer is that this “reaction rate” is part of the data of a Petri net: each reaction has an input dictionary I, an output dictionary O, and a reaction rate r, and we write R_r[k] for the rate of the kth reaction. The expected number of times for a single input tuple of reaction k to react in a time interval \Delta t is R_r[k]\Delta t. If \Delta t is very small, then there is essentially no chance that the reaction will happen *twice*, so the expected number of times that the reaction happens is essentially equal to the probability that it will happen once. Therefore, if the current state of our system is W, the probability for a reaction k to happen is essentially

P(W,k) = W[\mathbf{s_1}]^{\underline{R_I[k][\mathbf{s_1}]}} \cdot \ldots \cdot W[\mathbf{s_n}]^{\underline{R_I[k][\mathbf{s_n}]}} \cdot R_r[k] \, \Delta t

Something you may have noticed is that if W has large values, this “probability” might be larger than 1. Again, this is because the “probability” is really an expected number of reactions. Except even that is not quite true, because after one reaction, the probability of a subsequent reaction would change. What is really going on is that this is an algorithm that approximates sampling from an underlying distribution on time-histories of W. With smaller values of \Delta t the algorithm becomes more and more accurate, but with large values of \Delta t the algorithm might just make no sense.

To sum up, here is our algorithm for simulating a Petri net. Start with a given state W. At each time step, choose reaction k to run with probability P(W,k), or don’t run a reaction with probability 1 - \sum_{k=1}^nP(W,k) (again, we assume that \Delta t is small enough so that this sum is less than 1). If we run a reaction k, set W[\mathbf{s_i}] to W[\mathbf{s_i}] + R_{O}[k][\mathbf{s_i}] - R_I[k][\mathbf{s_i}] for all i (we add the output of the reaction, and subtract the input). Then repeat until boredom sets in!
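The algorithm above can be sketched in Python as follows (the state W, the reaction list, the rates, and \Delta t are passed in explicitly; falling is the falling power from before, and all names are my own):

```python
import random

def falling(m, k):
    """Falling power m*(m-1)*...*(m-k+1), with k factors."""
    result = 1
    for j in range(k):
        result *= m - j
    return result

def step(W, reactions, rates, dt):
    """Run one time step of the Petri net simulation.
    reactions is a list of (I, O) dict pairs, rates[k] is the rate of
    reaction k, and dt is the time step. Mutates and returns W."""
    # P(W, k) for each reaction k.
    probs = []
    for (I, _), r in zip(reactions, rates):
        p = r * dt
        for s, count in I.items():
            p *= falling(W.get(s, 0), count)
        probs.append(p)
    if sum(probs) > 1:
        raise ValueError("dt is too large for this state")
    # Choose reaction k with probability probs[k], or do nothing.
    u = random.random()
    for k, p in enumerate(probs):
        if u < p:
            I, O = reactions[k]
            for s, count in I.items():
                W[s] -= count               # subtract the input
            for s, count in O.items():
                W[s] = W.get(s, 0) + count  # add the output
            break
        u -= p
    return W
```

Calling step in a loop then traces out a time-history of W; shrinking dt trades speed for accuracy, exactly as discussed above.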

## Food for Thought

This was not a very complicated algorithm, however there are many interesting dynamical behaviors that can be captured by Petri nets. For instance, there is a famous model for predator-prey interaction called the Lotka-Volterra model. This model is captured by a Petri net that looks like this:

This model describes three processes: birth (B), predation (P), and death (D). The standard presentation of the Lotka-Volterra model uses differential equations, but we see here that we can understand the basic structure of the model without using differential equations at all! Moreover, we can simulate it! In short, Petri nets are powerful because they package up a large class of dynamical systems into a format that can be easily presented and understood.

For more information about Petri nets, and for a more mathematical approach, I highly recommend John Baez’s book linked above. I’ll link it again, just in case you are lazy: Quantum Techniques for Stochastic Mechanics. Don’t be scared of the title; you don’t need to know quantum mechanics. Petri nets live in the sweet spot of being simple enough to prove lots of nice theorems about, but complicated enough that they exhibit lots of nontrivial behavior; in fact, Petri nets can even be used as a sort of simple computer!

P.S. Next post will be on a two-dimensional syntax for describing sequences of events in Petri nets, so stay tuned!