Petri Nets for Computer Scientists

Posted on September 20, 2019

I am writing my undergraduate thesis on string diagrams and petri nets, and along the way I will be writing blog posts summarizing things that I have learned, and explaining them to a (hopefully) wider audience. Right now I am reading a book on Petri nets by John Baez called Quantum Techniques for Stochastic Mechanics. The trouble with John Baez is that he is the king of blog posts explaining math. If I am writing about a subject that he has covered, it is somewhat difficult to add value. I have two choices: I can write about something more specific, or I can appeal to an audience that John Baez does not write for. In this blog post, I have chosen the latter route, and so this one goes out to all my CS friends.

Data Structures

Suppose you are a programmer, and you want to simulate chemical reactions. For the record, I’m only expecting you to have as much chemistry knowledge as I have, which is almost none.

Before we can define our algorithm, we have to define the data that we are working with. To do this, let’s make some assumptions about chemistry. Suppose we have a test tube. In our test tube, there are some different types of molecules. For instance, we might have \(H_2 O\), we might have \(CO_2\), we might have \(H_2\) and \(O_2\), etc. Because this is computer science, assume that there is a finite number of molecules, say \(n\). Then the state of our test tube can be described by a length-\(n\) array of integers, let’s call it \(W\). \(W[i]\) represents the number of molecules of type \(i\). So, if \(2\) is the number associated to \(H_2 O\), \(W[2] = 5\) means that we have 5 molecules of \(H_2 O\) in our test tube. We will want to write something that computes the evolution of \(W\) through time.

In order to describe the evolution of \(W\), we will want to describe what changes are allowed to happen to \(W\); what chemical reactions can happen. The description of these chemical reactions is what the petri net will capture. What do reactions look like? Well, normally we write them down like so: \(2H_2 + O_2 \to 2H_2 O\). To represent this on a computer, note that all we care about is how many of each molecule is on the left side, and how many of each molecule is on the right side. To represent this, we might use two length-\(n\) integer arrays, \(I\) and \(O\) (input and output). A reaction takes in \(I[i]\) molecules of type \(i\) for each \(i\), and spits out \(O[j]\) molecules of type \(j\) for each \(j\).

However, we might have more than one reaction that can take place! Suppose there are \(m\) reactions. We encode this by making \(I\) and \(O\) into two-dimensional \(m \times n\) arrays, where \(I[k,i]\) is the number of inputs of type \(i\) to the \(k\)th reaction, and \(O[k,i]\) is the number of outputs of type \(i\) to the \(k\)th reaction.

Provisionally, we define a petri net to be these two arrays.

Let’s draw a picture of a petri net. We are just going to think about \(H_2\), \(O_2\), and \(H_2 O\), and we are going to have only one reaction \(2 H_2 + O_2 \to 2H_2 O\). We draw a petri net with a blue circle for each molecule type, and a grey rectangle for each reaction type. Each reaction \(k\) has \(I[k,i]\) input arrows from molecule \(i\), and \(O[k,j]\) output arrows to molecule \(j\). So our one reaction has 2 arrows coming in from \(H_2\), one arrow coming in from \(O_2\), and two arrows coming out to \(H_2 O\).

Now, at this point you may have noticed that we don’t have to restrict ourselves to chemistry. This set-up could describe many other dynamic processes. For instance, suppose we want to simulate the population dynamics of amoebas. There are two “reactions” that can take place; one amoeba can split into two amoebas, and two amoebas can fight each other so that one of them dies (who knows, is this how amoebas work? I’m not a biologist). To describe this, we can use the following petri net, where \(F\) stands for “fight” and \(D\) stands for “divide”.

Simulation

This is all very pretty, but we want to actually write some code to describe how our system (the array \(W\)) evolves over time. Because this is computer science, we are working in discrete time steps. Because we are modelling continuous systems, this assumption is false. Such is life.

During each time step, we want to possibly select a reaction to run. To do this, we have to assign a probability to each transition. To do this, we have to make a basic assumption, which is that any tuple of molecules which form an input to a transition (so for instance, a tuple where the first two elements are \(H_2\) molecules and the third element is a \(O_2\) molecule) has an equal probability of undergoing a transition in a given time step. Therefore, the probability for any input set to undergo a reaction is proportional to the number of different possible input tuples. One might guess that the set of possible input tuples for the \(k\)th reaction is just the set \(\{1,\ldots,W[1]\}^{I[k,1]} \times \ldots \times \{1,\ldots,W[n]\}^{I[k,n]}\). The number of elements of this is \(W[1]^{I[k,1]} \cdot \ldots \cdot W[n]^{I[k,n]}\). However, one molecule cannot appear in two places in the input tuple; that would be cheating! The number of ways of getting a tuple where all of the elements are distinct from a single type of molecule is not \(W[i]^I[k,i]\), it is \(W[i] \cdot (W[i] - 1) \cdot \ldots \cdot (W[i] - I[k,1] + 1)\). For example, if we have \(4\) amoebas, there are \(4 \cdot 3\) ways of choosing a pair to fight each other, not \(4 \cdot 4\). If we define \(m^{\underline{l}}\) to be \(m \cdot (m - 1) \cdot \ldots \cdot (m - l + 1)\), then we can write the number of input tuples as \(W[1]^{\underline{I[k,1]}} \cdot \ldots \cdot W[n]^{\underline{I[k,n]}}\).

We know that the probability should be proportional to this number. But proportional by what constant? To answer this question, we must know something about how fast reactions happen. Specifically, in a time interval \(\Delta t\), how likely is it for a single input tuple to undergo a reaction? The answer is that the “reaction rate” is part of the data of a petri net. We are given an array \(R\), and the probability for a reaction \(k\) to happen in a time interval \(\Delta t\) is \(R[k]\Delta t\). Therefore, if the current state of our system is \(W\), the probability for a reaction \(k\) to happen is \(P(W,k) = W[1]^{\underline{I[k,1]}} \cdot \ldots \cdot W[n]^{\underline{I[k,n]}}R[k]\Delta t\).

Something you may have noticed is that if \(W\) has large values, this probability might be larger than \(1\). This is because we have discretized our system – if \(\Delta t\) were truly infinitesimal, then the probability would be infinitesimal too. Working discretely, the solution is just to make our time increment really small and hope that \(W\) doesn’t get too big. Hey, at least this way we don’t have to worry what on earth an infinitesimal probability is!

Something else you may have noticed is that if we are being consistent, there should be a probability proportional to \((\Delta t)^2/4\) chance that we run two reactions in a time step \(\Delta t\), because there is a probability proportional to \(\Delta t/2\) that we run a reaction in a time step \(\Delta t/2\). However, \((\Delta t)^2/4\) is really small so we can neglect it.

All of this gives us an algorithm for simulating a petri net. Start with a given state \(W\). At each time step, choose reaction \(k\) to run with probability \(P(W,k)\), or don’t run a reaction with probability \(1 - \sum_{k=1}^nP(W,k)\) (again, we assume that \(\Delta t\) is small enough so that this sum is less than \(1\)). If we run a reaction \(k\), set \(W[i]\) to \(W[i] + O[k,i] - I[k,i]\) for all \(i\) (we add the output of the reaction, and subtract the input). Then repeat forever!

Food for Thought

This was not a very complicated algorithm, however there are many interesting dynamical behaviors that can be captured by petri nets. For instance, there is a famous model for predator-prey interaction called the Lotka-Volterra model. This model is captured by a petri net that looks like this:

This model describes three processes: birth (\(B\)), predation (\(P\)) and death (\(D\)). The standard presentation of the Lotka-Volterra model uses differential equations, but we see here that we can understand the basic structure of the model without using differential equations at all! Moreover, we can simulate it!

For more information about petri nets, and for a more mathematical approach, I highly recommend John Baez’s book linked above. I’ll link again, just in case you are lazy, and don’t be scared of the title; you don’t need to know quantum mechanics. Quantum Techniques for Stochastic Mechanics. Petri nets live in the sweet spot of being simple enough to prove lots of nice theorems about, but being complicated enought that they exhibit lots of nontrivial behavior, and in fact petri nets can even be used as a sort of simple computer!

P.S. Next post will be on a two-dimensional syntax for describing sequences of events in petri nets, so stay tuned!


Site proudly generated by Hakyll with stylistic inspiration from Tufte CSS