Owen's Blog (https://owenlynch.org/atom.xml), by Owen Lynch

The Ultimate In Self Care: Learning from Marvel Villains
https://owenlynch.org/posts/2021-10-18-villain-morning-routine
2021-10-18

I don’t even really remember watching “Daredevil”, but one scene has stuck with me. The show’s main villain, Wilson Fisk, is a powerful businessman and crime lord who tries to reshape Hell’s Kitchen into a better place, but whose methods are violent.

In the show, this is played as very sinister. But really what it signifies is that this man is totally in control of his life, and he starts every day with purpose and focus. I think in media we are supposed to identify with the people who don’t have their life together, who have conflicted souls, who listen to normal-people music and not Bach. Me personally though? I’d rather wake up like a villain.

This website supports webmentions, a standard for collating reactions across many platforms. If you have a blog that supports sending webmentions and you put a link to this post, your blog post will show up as a response here.
You can also respond via twitter; tweets with links to this post will show up below.


Site proudly generated by Hakyll, with stylistic inspiration from Tufte CSS.

The Questioner Ep. 4
https://owenlynch.org/posts/2021-10-03-questioner-ep4
2021-10-03

The fourth episode of The Questioner, the podcast where misunderstandings are diagnosed and fixed in real time, has been released. I try to explain the combinatorial foundations of Shannon entropy to a chemist.


Term Rewriting with (2,1)-Lawvere Theories
https://owenlynch.org/posts/2021-09-17-categorical-term-rewriting
2021-09-17

The purpose of this blog post is to share with the world some thoughts I’ve been having about term rewriting systems in the context of a categorical computer algebra.

Evan Patterson and I are thinking about rewriting the core machinery of Catlab to take more seriously the categorical perspective on algebra. The core machinery of Catlab is data structures and functions for manipulating Generalized Algebraic Theories. As one might guess, these are a more general version of algebraic theories. Thus, as a warm-up for doing this refactor, I have been thinking about regular old algebraic theories in a categorical/computer algebra context.

Specifically, there are three viewpoints that I am trying to “take seriously”, in the hope that they will lead to a more general and elegant view of computer algebra.

The central object of study should be the category where the objects are “contexts”. We will get to exactly what the morphisms in this category are later.

Term rewriting and equational reasoning form a 2-categorical, groupoidal structure on the previous category.

It is important to understand precisely what a presentation is, and how the terms in that presentation are built.

I will be illustrating my ideas with snippets of Julia code.

Note: this post will not be particularly friendly to those without much background in category theory. I apologize for this; I wrote this blog post in a short period of time and don’t want to spend more time writing background.

Lawvere Theories

The framework that I am working in mostly derives from the idea of a Lawvere theory, so we are going to briefly review that first. To be clear, this is a review that will get you and me on the same page for terminology, not a review that will give you an intuition for Lawvere theories if you haven’t seen them before. If you haven’t seen Lawvere theories before, I suggest taking a slow perusal through Lawvere’s PhD thesis; it’s always worth reading Lawvere because you will learn more than you expected!

A Lawvere theory is a category T whose objects are canonically generated as finite products of a generating set \Lambda of sorts. That is, every single object is canonically a product \prod_{i=1}^{n} S_{i} for S_{i} \in \Lambda.

An “evil” way to put this is that there is a functor (\mathsf{FinSet}/\Lambda)^{\mathrm{op}} \to T that is the identity on objects.

Note that we have linked the nLab page for “single-sorted” Lawvere theories, whereas in this blog post we will always be talking about “multi-sorted” Lawvere theories.

Now, this seems like a fairly abstruse definition; it is unclear exactly what this has to do with algebra. The real meat of Lawvere theories comes from how to define them in terms of presentations. A presentation consists of generators and relations, and the generator part of the presentation is called a signature.

Signatures

A signature \Sigma of a Lawvere theory is defined by the following data.

A set \Lambda of sorts

A set \Omega of operations. Each operation w \in \Omega has an arity i(w), which is an element of \Lambda^{\ast} (the set of strings with characters drawn from \Lambda), and a return type o(w) \in \Lambda.

The signature for the theory of groups has one sort, X, and three operations:

e \colon [] \to X (identity)

i \colon [X] \to X (inverse)

m \colon [X, X] \to X (multiplication)

Given a signature \Sigma = (\Lambda, \Omega), a context is simply a labeled set (\Gamma, \mathrm{ty}) \in \mathsf{FinSet}/\Lambda. In a context (\Gamma, \mathrm{ty}), we can define the set of terms of sort S, \mathrm{Term}_{S}(\Gamma, \mathrm{ty}), for each S \in \Lambda, in the following way:

For i \in \Gamma with \mathrm{ty}(i) = S, \mathrm{var}(i) \in \mathrm{Term}_{S}(\Gamma, \mathrm{ty})

For w \in \Omega with o(w) = S, and t_{1},\ldots,t_{n} such that t_{j} \in \mathrm{Term}_{i(w)_{j}}(\Gamma, \mathrm{ty}), \mathrm{appl}(w, [t_{1},\ldots,t_{n}]) \in \mathrm{Term}_{S}(\Gamma, \mathrm{ty}).

Those in computer science should recognize this as an abstract syntax tree whose leaf nodes are either arity-0 operations or variables in the context.
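For example, in the signature for groups above, the term m(x, i(y)) in context [x : X, y : X] unfolds to:

```latex
\mathrm{appl}(m, [\mathrm{var}(x), \mathrm{appl}(i, [\mathrm{var}(y)])]) \in \mathrm{Term}_{X}([x : X, y : X])
```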

We can implement this in Julia with the following data structures. Note that we use an “unlabeled” implementation.

using MLStyle

struct Operation
    arity::Vector{Int}
    ret::Int
end

struct Signature
    # representing the set {1,...,sorts}
    sorts::Int
    operations::Vector{Operation}
end

@data Term begin
    Var(i::Int)
    Appl(op::Int, args::Vector{Term})
end

struct Context
    vartypes::Vector{Int}
end

function check_sort(s::Signature, c::Context, t::Term)
    @match t begin
        Var(i) => c.vartypes[i]
        Appl(op, args) => begin
            ts = map(arg -> check_sort(s, c, arg), args)
            @assert ts == s.operations[op].arity
            s.operations[op].ret
        end
    end
end
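As a sanity check on these definitions, here is a self-contained transliteration into Python (a sketch of my own, not from Catlab, using zero-based indices where the Julia uses one-based), instantiated on the signature for groups from earlier:

```python
from dataclasses import dataclass
from typing import List, Union

@dataclass
class Operation:
    arity: List[int]   # sort index of each argument
    ret: int           # sort index of the return type

@dataclass
class Signature:
    sorts: int                   # represents the set {0, ..., sorts - 1}
    operations: List[Operation]

@dataclass
class Var:
    i: int             # index of a variable in the context

@dataclass
class Appl:
    op: int            # index into signature.operations
    args: List["Term"]

Term = Union[Var, Appl]

@dataclass
class Context:
    vartypes: List[int]  # the sort of each variable

def check_sort(s: Signature, c: Context, t: Term) -> int:
    """Return the sort of t, checking arities along the way."""
    if isinstance(t, Var):
        return c.vartypes[t.i]
    ts = [check_sort(s, c, arg) for arg in t.args]
    assert ts == s.operations[t.op].arity
    return s.operations[t.op].ret

# The signature for groups: e = op 0, i = op 1, m = op 2, and X is sort 0.
groups = Signature(1, [Operation([], 0), Operation([0], 0), Operation([0, 0], 0)])
ctx = Context([0, 0])                     # context [x:X, y:X]
t = Appl(2, [Var(0), Appl(1, [Var(1)])])  # the term m(x, i(y))
assert check_sort(groups, ctx, t) == 0    # the term has sort X
```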

Having introduced this clean “unlabeled” format, we will immediately revert back to using a more conventional notation for actual examples, which can in theory be parsed back into this unlabeled format.

Now, the categorical perspective is always looking for morphisms, and it turns out there is a very natural definition for a morphism between two contexts.

Given two contexts (\Gamma_{1}, \mathrm{ty}_{1}) and (\Gamma_{2}, \mathrm{ty}_{2}), a morphism between them is an assignment of elements of \Gamma_{2} to terms in (\Gamma_{1}, \mathrm{ty}_{1}) with the same type.

That is, f(i) \in \mathrm{Term}_{\mathrm{ty}_{2}(i)}(\Gamma_{1}, \mathrm{ty}_{1}) for i \in \Gamma_{2}.

One can lift a morphism of contexts to act not just on variables in the context but on arbitrary terms, and the clearest way to show this is with code.

struct ContextMorphism
    dom::Context
    codom::Context
    # A term in the dom context for each variable in the codom context
    terms::Vector{Term}
end

# t must be a term in the codomain context
function (f::ContextMorphism)(t::Term)
    @match t begin
        Var(i) => f.terms[i]
        Appl(op, args) => Appl(op, f.(args))
    end
end

This allows us to compose morphisms; again, the easiest way to show this is with code.

function compose(f::ContextMorphism, g::ContextMorphism)
    @assert f.codom == g.dom
    ContextMorphism(f.dom, g.codom, f.(g.terms))
end
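For readers who want to experiment outside Julia, here is the same substitution-and-composition machinery as a self-contained Python sketch (again my own transliteration, with zero-based indexing), exercised on the group-theory signature from earlier:

```python
from dataclasses import dataclass
from typing import List, Union

@dataclass
class Var:
    i: int

@dataclass
class Appl:
    op: int
    args: List["Term"]

Term = Union[Var, Appl]

@dataclass
class Context:
    vartypes: List[int]

@dataclass
class ContextMorphism:
    dom: Context
    codom: Context
    terms: List[Term]  # a term in dom for each variable of codom

    def __call__(self, t: Term) -> Term:
        # substitution: replace each codomain variable by its assigned term
        if isinstance(t, Var):
            return self.terms[t.i]
        return Appl(t.op, [self(a) for a in t.args])

def compose(f: ContextMorphism, g: ContextMorphism) -> ContextMorphism:
    assert f.codom == g.dom
    return ContextMorphism(f.dom, g.codom, [f(t) for t in g.terms])

# Group-theory example (i = op 1, m = op 2; every variable has sort X = 0):
X1, X2 = Context([0]), Context([0, 0])
f = ContextMorphism(X2, X1, [Appl(2, [Var(0), Var(1)])])  # z maps to m(x, y)
g = ContextMorphism(X1, X1, [Appl(1, [Var(0)])])          # w maps to i(z)
h = compose(f, g)                                         # w maps to i(m(x, y))
assert h.terms == [Appl(1, [Appl(2, [Var(0), Var(1)])])]
```

Note that composition is just applying the first morphism's substitution to each of the second morphism's terms, exactly as in the Julia version.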

Given a signature \Sigma, let \mathrm{Ctx}(\Sigma) be the category of contexts and context morphisms. It is easy to show that \mathrm{Ctx}(\Sigma) is a Lawvere theory.

Note that by definition, \mathrm{Term}_{S}(\Gamma, \mathrm{ty}) = \mathrm{Ctx}(\Sigma)((\Gamma,\mathrm{ty}), S) (where by abuse of notation, S is the context with one variable of type S). That is, the set of terms of sort S in context (\Gamma, \mathrm{ty}) is precisely the morphisms from (\Gamma, \mathrm{ty}) to S in the category of contexts.

Presentations

The last ingredient to a presentation is the collection of equations. An equation consists of a context, and then two terms in that context of the same sort.
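For instance, the presentation of the theory of groups includes equations such as the following (the standard group axioms, written here in context notation):

```latex
[x : X] \vdash m(x, e) = x \qquad [x : X] \vdash m(e, x) = x
[x : X] \vdash m(x, i(x)) = e \qquad [x : X] \vdash m(i(x), x) = e
[x : X, y : X, z : X] \vdash m(m(x, y), z) = m(x, m(y, z))
```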

You can see that each equation has a context on the left, and then on the other side of the \vdash, two terms separated by an equals sign.

Note that another way of looking at an equation in context (\Gamma, \mathrm{ty}) and of type S is simply as a pair of morphisms from (\Gamma,\mathrm{ty}) to S.

A presentation for a Lawvere theory consists of a signature (\Lambda, \Omega) and a set \Xi of equations in that signature.

Given a presentation (\Sigma, \Xi), we construct a Lawvere theory T in the following way. We start with the category \mathrm{Ctx}(\Sigma), and then we quotient out by the equations in \Xi by identifying the morphisms corresponding to the left and right hand sides of each equation.

Everything but that very last step is very amenable to computation. However, “quotienting out” is a problematic, undecidable operation. It is this that led me to ask whether a higher-categorical structure that didn’t impose such strict equality could be more amenable to computation.

(2,1)-Lawvere Theories

The essential idea of (2,1)-Lawvere theories is to not quotient out by the equations in the presentation of a Lawvere theory, and instead keep track of equalities between terms as “proof objects”.

A (2,1)-category is a 2-category where all of the 2-morphisms are invertible.

A (2,1)-Lawvere theory is a (2,1)-category where every object is canonically a finite product of a fixed set of “sorts”.

A presentation for a (2,1)-Lawvere theory is just the same as a presentation for a regular Lawvere theory.

To construct a (2,1)-Lawvere theory from a presentation (\Sigma, \Xi), we start with the 1-category \mathrm{Ctx}(\Sigma) and freely add in 2-morphisms corresponding to each of the equations.

There is still a slight problem with this, because “freely adding in 2-morphisms” actually does still involve quotienting out (for instance, taking the inverse of a morphism twice should end up with the same morphism again, and much more complex identities as well). But from a computational standpoint, we may not care about equality of 2-morphisms, only their existence and validity. Thus we could simply say that any two 2-morphisms with the same domain and codomain are equal, trivially making equality decidable.

What we are more concerned about is exactly how to represent a general 2-morphism in this category. However, it turns out that this is not too hard. There are exactly 5 ways of producing a new 2-morphism.

Take a generating 2-morphism (i.e. one of the equations in the presentation)

Take the identity 2-morphism on a 1-morphism

Invert an existing 2-morphism

Vertical composition of two existing 2-morphisms

Horizontal composition of two existing 2-morphisms

Of these 5 ways, the last one is by far the most interesting. It generalizes two classic rules from term rewriting.

The first classic rule is “congruence”, which can be phrased as “substituting equal expressions into the same expression yields equal expressions” (see: Theorem Proving and Algebra). For instance, suppose we are in a theory with a single sort S and an operation + \colon [S,S] \to S. Then suppose that in context \Gamma, t_{1} = t_{1}\prime and t_{2} = t_{2}\prime. Congruence allows us to conclude that t_{1} + t_{2} = t_{1}\prime + t_{2}\prime.

Categorically, this is the horizontal composition of an equality \langle t_{1},t_{2} \rangle = \langle t_{1}\prime, t_{2}\prime \rangle with the identity on the map \langle x + y \rangle \colon [x:S,y:S] \to [t:S]. (We use \langle t_{1}, t_{2} \rangle to refer to the map \Gamma \to [x:S, y:S] that sends x to t_{1} and y to t_{2}.)

The second classic rule is “substitutivity”. This says that if \phi(x_{1},\ldots,x_{n}) = \phi\prime(x_1,\ldots,x_n), where x_{1},\ldots,x_{n} are variables, then we can substitute in any terms t_{1},\ldots,t_{n} to get \phi(t_{1},\ldots,t_{n}) = \phi\prime(t_1,\ldots,t_n).

This is horizontal composition the other way!

The upshot of all of this is that we can store a “witness” for an equality proof with a fairly simple recursive data structure based on these 5 ways of constructing an equality.

@data Equation begin
    GeneratingEq(i::Int)                  # references an equation in the presentation
    Refl(f::ContextMorphism)              # reflexivity of equality
    Sym(e::Equation)                      # symmetry of equality
    VCompose(e1::Equation, e2::Equation)  # transitivity of equality
    HCompose(e1::Equation, e2::Equation)  # congruence and substitutivity
end

Given an Equation, the procedure to check that it is valid is a straightforward recursive function, the implementation of which I leave to the reader.
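For the curious, one possible sketch of such a checker follows, in self-contained Python (my own transliteration, with my own simplifying representation: each generating equation is stored as a parallel pair of context morphisms, and HCompose(e1, e2) assumes e1 sits between morphisms into the domain context of e2's morphisms). A witness is valid exactly when elaborating it into the two morphisms it equates succeeds:

```python
from dataclasses import dataclass
from typing import List, Tuple, Union

# Terms, contexts, and context morphisms, as in the Julia snippets above.
@dataclass
class Var:
    i: int

@dataclass
class Appl:
    op: int
    args: List["Term"]

Term = Union[Var, Appl]

@dataclass
class Context:
    vartypes: List[int]

@dataclass
class ContextMorphism:
    dom: Context
    codom: Context
    terms: List[Term]

    def __call__(self, t: Term) -> Term:
        if isinstance(t, Var):
            return self.terms[t.i]
        return Appl(t.op, [self(a) for a in t.args])

def compose(f: ContextMorphism, g: ContextMorphism) -> ContextMorphism:
    assert f.codom == g.dom
    return ContextMorphism(f.dom, g.codom, [f(t) for t in g.terms])

# The five constructors of equality witnesses.
@dataclass
class GeneratingEq:
    i: int                   # index of an equation in the presentation

@dataclass
class Refl:
    f: ContextMorphism       # identity 2-morphism on f

@dataclass
class Sym:
    e: "Equation"            # inverse

@dataclass
class VCompose:
    e1: "Equation"           # transitivity
    e2: "Equation"

@dataclass
class HCompose:
    e1: "Equation"           # congruence / substitutivity
    e2: "Equation"

Equation = Union[GeneratingEq, Refl, Sym, VCompose, HCompose]
Side = Tuple[ContextMorphism, ContextMorphism]

def sides(pres: List[Side], eq: Equation) -> Side:
    """Elaborate a witness into the parallel pair of morphisms it equates,
    raising AssertionError if the witness is malformed."""
    if isinstance(eq, GeneratingEq):
        return pres[eq.i]
    if isinstance(eq, Refl):
        return (eq.f, eq.f)
    if isinstance(eq, Sym):
        l, r = sides(pres, eq.e)
        return (r, l)
    if isinstance(eq, VCompose):
        l1, r1 = sides(pres, eq.e1)
        l2, r2 = sides(pres, eq.e2)
        assert r1 == l2      # the middle morphism must match
        return (l1, r2)
    l1, r1 = sides(pres, eq.e1)  # HCompose; compose checks composability
    l2, r2 = sides(pres, eq.e2)
    return (compose(l1, l2), compose(r1, r2))

# Example: the unit law [x:X] |- m(x, e()) = x, with e = op 0, i = op 1, m = op 2.
X = Context([0])
unit_lhs = ContextMorphism(X, X, [Appl(2, [Var(0), Appl(0, [])])])
unit_rhs = ContextMorphism(X, X, [Var(0)])
pres = [(unit_lhs, unit_rhs)]

# Substitutivity via horizontal composition: substitute i(a) for x,
# yielding a witness of [a:X] |- m(i(a), e()) = i(a).
sub = ContextMorphism(X, X, [Appl(1, [Var(0)])])
l, r = sides(pres, HCompose(Refl(sub), GeneratingEq(0)))
assert l.terms == [Appl(2, [Appl(1, [Var(0)]), Appl(0, [])])]
assert r.terms == [Appl(1, [Var(0)])]
```

Saying two witnesses are equal whenever `sides` gives them the same parallel pair is precisely the "trivially decidable" equality of 2-morphisms mentioned above.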

Conclusion

My idea here was to show that taking the categorical approach to algebra seriously would lead to an elegant structure for witnesses of rewrites, and I think I have succeeded at this! However, what is still left to do is to figure out how to make this sort of thing smooth for the end-user, not just the implementor. My hope is that the elegance in the categorical structure will allow for a slick, general, and powerful user interface for term rewriting, but I haven’t figured out exactly how to do that yet.

As always, thoughts, comments, and questions are welcome, and instructions for how to leave them are down below.


Legibility
https://owenlynch.org/posts/2021-09-15-legibility
2021-09-15

“Can’t Stop Conducting” is a classic Tom and Jerry cartoon where Tom is conducting an orchestra, and Jerry is trying to conduct along with him. Tom’s pride as a conductor will not allow this, and classic fights ensue.

I want you to watch this right now, and pay attention to how legible every single action in this sketch is. Almost every action has some sort of windup, and some sort of predictable consequence. The emotions of the characters are clearly expressed through their faces and body language, and more than that, the motivations of the characters are clear from the beginning and consistent throughout. Tom wants to be the star of the show, and Jerry wants to be part of the show.

And most importantly, the action is totally in sync with the music, from the rhythm to the thematic content.

I wish that people had made me pay attention to the concept of legibility earlier in my life. I used to think that creation was all about creativity; all about having new ideas that nobody else had ever had before and producing something wacky and original.

But look at that Tom and Jerry sketch. I think that the writers of that Tom and Jerry sketch were very creative; coming up with the action sequences and character motivations that fit perfectly with the music is not easy!

That being said, the sketch holds together because so much of it is very recognizable and predictable. The setting is a generic outdoor concert shell. Both characters are dressed in classic orchestral tuxes. Tom’s mannerisms as a conductor are very familiar to anyone who has been in an orchestra, from the little finger waggles telling the violins to play lushly, to the pushing hands, telling the orchestra that even though this is an accelerando, they should speed up gradually. And in one of my favorite moments, the conductor always has a spare baton.

Even the end, where Tom keeps conducting, not noticing his surroundings while zooming down the highway, is consistent with the logic set up earlier in the sketch, when Jerry doesn’t notice that he’s been flung across the stage and keeps conducting, and even at the very start of the cartoon.

I sum this up by saying that the legibility of the sketch is the key to its humor. Each of the things that I mentioned contributes to legibility in some way:

Synchronized music and action makes both the music and the action more legible

Stock images and gestures help the audience parse the scene into well-known chunks

Repetition of actions creates a consistent logic (even if the logic is not realistic)

Clear and simple motivations help the reader understand why things are happening

In this blog post, I aim to investigate this concept of legibility further, and argue that understanding legibility is an important step for becoming a better creator. To this end, I will compare legibility as it applies in a couple domains, and draw some general conclusions for how to pursue legibility.

Use of Language

One of (in my opinion) the best writers of the decade has very kindly shared with us the secrets to his success. In one of the sections he talks about the use of connecting words.

I lampshade my flow of ideas with a lot of words like “Also”, “But”, “Nevertheless”, “Relatedly”, and “So” (when I’m feeling pretentious, also “Thus”). These are the words your eighth-grade English teacher told you never to start paragraphs with. Your eighth-grade English teacher was wrong. If you’re writing three paragraphs that are three different pieces of evidence for the same conclusion that you’re going to present afterwards, make damn sure your readers know this. It could be as simple as:

It’s pretty obvious that X is true, and we have lots of converging lines of evidence for this. Some of the best evidence comes from the field of augury. For example:

First, A

Second, B

Third, C

Now, some people say that not-A, but that’s totally wrong. It only looks like not-A, because P. Likewise, although Q might make it look like not-B, Q can’t be trusted for several other reasons, for example R. And not-C is too silly to even think about. So despite the objections you always hear, the augurical evidence for X is strong.

Even more evidence comes from the field of haruspicy. All four major haruspical schools hold X as a major principle. School 1 says X because of D. School 2 says X because of E. School 3 says X because of F. And school 4 says X because of G. So although augury and haruspicy disagree on a lot, on the subject of X they are in complete accord.

Notice the connecting words holding up the structure of the argument. Not only is the argument nice and tight, but the role of each part in the whole is telegraphed beforehand. For example, the “now” that comes just after C is saying something like “Take a step back, I’m about to tell you something that might otherwise be controversial, but listen to what I have to say”. And the “likewise” just after P means something like “We just got done talking about not-A because P, here’s another argument with about the same structure”. Before any of the facts are inserted, you already know where they fit into the structure. And you’re able to abstract from the micro-level and get the bigger picture of some fact which is supported by both augury and haruspicy, which was the main point of the argument.

Essentially, these “lampshading” words work to make the flow of ideas legible. I think of lampshading words like the “wind-up” to every action in the Tom and Jerry cartoon.

For instance, consider the classic “run wind-up”. You know that Jerry is about to zip across the screen, and also it’s clear that Jerry feels like he’s late to something and is rushing to catch up, just like when Scott says “Even more”, you know that he’s about to list some more evidence that’s in agreement with the previous paragraph.

I could spend pages talking about other structures in writing that increase legibility, but this post isn’t about how to effectively use legibility in every single domain, it’s about noticing common threads across domains.

Music Theory

One of my biggest complaints when I was learning music theory was that it always seemed to teach you how a piece was put together, and never why.

But now I understand more about exactly what the point of music theory is. By identifying structures common among music in a given style, one can understand how to effectively use musical cliches to convey a certain tone. For instance, the James Bond theme famously ends with a minor major seventh chord (see: James Bond, explanation). From that cultural association, you can bring out a certain spy-like feel in something that you are writing by using a minor-major seventh. This is only partly because of the intrinsic sound of the minor-major seventh; audiences have an association with the minor-major seventh that is purely cultural (even if they might not realize it).

Another example of legibility in music theory is the ballet Petrushka, by Stravinsky. If you just listen to the music of Petrushka, it makes very little sense. But when you see it actually staged, the movements of the dancers make it (more) clear what’s going on in the music. One part of the music that can be particularly confusing to listen to in the introduction is when there are two themes playing at once. When you see the ballet you see that this is because there are two competing music box ladies. Understanding the (vaguely specified) plot of the ballet brings a narrative structure to the music that makes it more legible.

The key is that even though Petrushka doesn’t have a traditional classical structure like a Mozart piece, it holds together and makes sense through the narrative structure, and the use of repetitive and recognizable themes for different characters. If Petrushka simply broke classical rules without having some substitute for creating legibility, it would not be successful at all.

I think that this is a general principle for new art. You can break as many rules as you like, as long as you come up with new rules to replace them, and continue to keep your art very legible.

Design

I hope at this point I’ve conveyed why I think legibility is important, but for the sake of repetition (a key part of legibility), I’m going to give another example.

Taking legibility literally, it is most directly applicable in graphic design. You can see this in Matthew Butterick’s Practical Typography. First of all, as an object of graphic design, the design of that webpage is totally clean. Immediately you can see what is going on: it’s a book and this is the table of contents. There is also a tasteful grid structure that more efficiently uses the space of the table of contents.

But secondly, the book starts by talking about why “the programmer, the scientist, the lawyer” should care about typography.

I’m not here to tell you that typography is more important than the substance of your writing. It’s not.

But typography can enhance your writing. Typography can create a better first impression. Typography can reinforce your key points. Typography can extend reader attention. When you ignore typography, you’re ignoring an opportunity to improve the effectiveness of your writing.

And isn’t that why you write at all? To have an effect on readers? To move them, to persuade them, to spur them to action?

I think that if you replaced every instance of “typography” in that block quote with “legibility”, each point would still hold.

Legibility More Generally

At this point, I have finished making my main argument about the importance of legibility in creative arts. So now I’m going to make some unwarranted speculations on legibility outside of art, and also on mathematical interpretation of legibility.

In James C Scott’s book Seeing Like a State (book review/summary), a central theme is that the legibility of a society is key to the ability of the state to govern that society.

For example, in medieval France, where most men had one of six first names and no last names, it was very hard for tax collectors to figure out who had paid taxes. People in the village could differentiate “Jaque the smith” from “Jaque who lives under the hill” and “Jaque the baker”, but these sorts of loose names were impossible for the state to keep track of; it needed identifiers that were valid in a larger scope. Last names were invented to make the population more legible to the state, and this in turn enabled the state to tax more efficiently.

This sort of thing appears all throughout the book. However, I’m not going to try and analyze it at the moment; I am just bringing up an interesting connection.

Finally, I have a sneaking suspicion that legibility has something to do with information theory. In art, legibility often comes down to repetition and the use of tropes. I feel like this should have something to do with the efficiency of encoding schemes, and making the informational content of a piece clearly and redundantly expressed in an encoding scheme that is easy on the brain. But again, I’m just bringing this up as a connection; to probe this more thoroughly is a whole line of research!

And that’s all for now! Tweet at me or blog at me with any comments (instructions down below).


Book Review: Capital as Power
https://owenlynch.org/posts/2021-09-13-capital-as-power-book-review
2021-09-13

Note: This was originally submitted to the astralcodexten book review contest, but did not win.

Note 2: Sophie Galowitz did the lovely illustrations for this blog post.

Capital As Power

Part A: Overview

1. A Need for Better Theory

If you are a well-educated person in the 21st century, you probably have conflicted views. On the one hand, the grand socialist project has had… problems… over the last century. Serious problems. Problems that kill and hurt people, and are really, really non-dismissable. For a bibliography, check here.

On the other hand, the (main) alternative is capitalism. And that also sucks. A lot. If you haven’t noticed this, you haven’t been paying attention.

Capital as Power by Bichler and Nitzan does not even attempt to talk about an alternative system of government. However, it argues that a necessary precondition for radical system change is a new theory of economics. In their words:

Perhaps the key problem facing young people today is a lack of theoretical alternatives. A new social reality presupposes and implies a new social cosmology. To change the capitalist world, one first needs to re-conceive it; and that re-conception means new ways of thinking, new categories and new measurements.

The purpose of Capital as Power is to provide such a theoretical alternative. However, such a thing is easier said than done. To start with, it is necessary to give a thorough examination of past attempts to put economics on sound theoretical feet.

Bichler and Nitzan provide some blistering polemic towards those who try to build the future without understanding this.

With some obvious exceptions, present-day leftists prefer to avoid ‘the economy’, and many are rather proud about it. To prioritize profit and accumulation, to theorize corporations and the stock market, to empirically research the gyrations of money and prices are all acts of narrow ‘economism’. To do these things is to fetishize the world, to conceal the cultural nuances of human consciousness, to prevent the critic from seeing the true political underpinnings of social affairs. Best to leave them to the dismal scientists. And, so, most self-respecting critics of capitalism remain happily ignorant of its ‘economics’, neoclassical as well as Marxist. They know little about the respective histories, questions and challenges of these theories, and they are oblivious to their triumphs, contradictions and failures. This innocence is certainly liberating. It allows critics to produce ‘critical discourse’ littered with cut-and-paste platitudes, ambiguities and often plain nonsense. Seldom do their ‘critiques’ tell us something important about the forces of contemporary capitalism, let alone about how these forces should be researched, understood and challenged.

Concordant with the spirit of this paragraph, Bichler and Nitzan devote much of the first half of the book to a critical dive into the history over the last two centuries of the “dismal science”, both Marxist and Neoclassical.

I learned a lot about both neoclassical economics and Marxism, and this part of the book would hold up as a good survey, even without the arguments they make for why the theories ultimately fail. It centers the analysis on the idea that both neoclassical economics and Marxism are ultimately tied to theories of value. We are so familiar with the neoclassical theory of value that it is hard to even realize it is a theory. The neoclassical theory of value is that value comes from the utility that a good delivers to its consumer. Often this is how economics textbooks start, and they promise that the arguments you can think of off the top of your head against this model have good counter-arguments, and that in any case it’s a useful model. Students with further questions are told that real economists use better theories than this, but that they are too complicated to put in introductory textbooks. Bichler and Nitzan do a thorough job expounding on arguments that an intro econ student might think of, but could not come close to articulating in enough detail to make headway.

The Marxist theory of value is that value comes from the work that humans put into material goods. On the face of it, this makes a lot of sense. Ultimately, the limiting factor to production comes down to humans: no humans = no production. However, this theory also has holes in it.

The key to Bichler and Nitzan’s arguments against both theories is that they cannot explain how capital accumulates, or provide a framework in which predictions about value can be made. So in short, the motivation for this book is that there are productive, empirical insights to be derived from a new economic point of view. I am used to alternatives to neoliberalism proposed for moral reasons, and it was refreshing to hear someone try to elucidate an alternative proposed for scientific reasons.

2. Business and Industry

One of the key parts of this new theory comes out of the theories of a historian named Thorstein Veblen (according to Wikipedia, Veblen coined the term “conspicuous consumption”). Veblen’s big idea is that there is a fundamental distinction between Business and Industry. Industry is the domain of the kind of people who build giant redstone contraptions in Minecraft or the kind of people who name their lab mice and talk to them in squeaky voices while cleaning up after an experiment that went on until 1am. It is both a collaborative activity and a competitive activity, but it is fundamentally built on creativity, curiosity and a desire to solve problems. Business is the domain of the kind of people who network at parties and care a lot about “corporate strategy”. The point of Business is profit and accumulation.

Rather than theorizing capitalism as a perpetual struggle between classes, Veblen theorizes capitalism as a perpetual struggle between Business and Industry.

Modern capitalists are removed from production: they are absentee owners. Their ownership, says Veblen, doesn’t contribute to industry; it merely controls it for profitable ends. And since the owners are absent from industry, the only way for them to exact their profit is by ‘sabotaging’ industry.

The one thing capitalism was supposed to be good at was high-quality goods at low prices; this is the promise dangled from the hands of every billboard in Times Square and every dense 800-page neoliberal economics book. But actually, business subverts the production of quality goods at low prices for the purpose of profit. One of the most obvious examples of this is intellectual property: industry is siloed into many companies that cannot freely remix and use each other’s designs. Open source software, even though it is massively underfunded compared to proprietary software, often manages to punch above its weight because of the superior development model of sharing, and because it is not sabotaged by profit considerations.

If nothing else, Capital as Power is worth reading for the wealth of examples of this conflict.

This is perhaps the reason why early in the twentieth century the automobile companies bought and dismantled 100 electric railway systems in 45 US cities (Barnet 1980: Ch. 2). And it is also why these companies have long shunned any radical change in energy sources. The electric car, first invented in the 1830s, predates its gasoline and diesel counterparts by half a century, and for a while was more popular than both (Wakefield 1994). But by the early twentieth century, having proved less profitable than the gas guzzlers, it fell out of favour and was forcefully erased from the collective memory. Then came intolerable pollution, which in the 1990s led the state of California to mandate a gradual transition of automobiles to alternative energy. Complying with the new regulations, General Motors had its engineers quickly develop a highly efficient electric car, the EV1. But fearing that this gem of a car would undermine profit from their gas guzzlers, the company’s owners, along with owners of other concerned corporations in the automotive and oil business, also invested in an orchestrated attempt to defeat the California bill. When the regulation was finally overturned, every specimen of the EV1 was recalled and literally shredded (Paine 2006).

The idea of a distinction between Business and Industry was not the most provocative idea in Capital as Power, nor the one with the farthest-reaching implications, but it was the one that stuck in my head the most. It seems to me to be a very productive way of thinking, and sums up a lot of stuff I didn’t have words to describe before. But to Bichler and Nitzan, this is merely a springboard for a much larger theory.

3. Accumulation of Power

Here, Bichler and Nitzan follow the ideas of a historian named Lewis Mumford. Mumford takes us back all the way to the beginning of what we now call civilization, in the Nile River delta. His conceit is that the first “technology” was not mechanical or chemical, it was social. The organizational structure of ancient Egypt, with its intricate hierarchy of politics and religion meshed together, was a form of power previously unmatched in its ability to change its surroundings and to persist through time.

And here I add my own analysis. It was impressed on me thoroughly in ninth-grade ancient history by my most excellent history teacher Audrey Budding that one of the most common threads through history, and one of the most important questions to ask about a society, is how the legitimacy of the ruling class is achieved. In this lens, the novel technology of Egypt was the ability to give its rulers, through religion, tradition, force, and bureaucracy, enough legitimacy that they could impress their will on a massive population.

Mumford calls this new technology the “mega-machine”, and Bichler and Nitzan take an interesting romp through some of the mega-machine’s greatest hits throughout the years since the fall of Egypt.

According to them, the most recent incarnation of the mega-machine is the entirely novel quantification of power through capital. With the dazzling mathematics of the market, the mega-machine has reached heights of sophistication that the gold-plated pharaohs of yester-millennium could only dream of. The mechanism by which the ruling class exerts power is tightly woven into the daily fabric of the lives of their thralls, and legitimized by every interaction that becomes a transaction.

At this point, you are probably thinking “Yes! This is the hot tea that I signed up for when I clicked on a link that claimed to lead to a book review of a book called Capital as Power!”

I hate to disappoint you, but this is the end of the introduction, and we’re not going to get back to the real juicy stuff until after a thousand or so words droning on about theories of value and dead white men.

You see, I’m trying to give you a picture of what it is like to read this book, and the experience of having a tantalizing insight dangled in front of you but then being forced to read far more history and statistics than you would really like to understand it is essentially all of Capital as Power.

Summary of Part A

Capitalism is bad, but before we can improve it, we need to understand it scientifically.

Current economics has some deep flaws in this regard.

Cultural critiques are dumb.

Business and Industry are two distinct things.

Capitalism is, like, Egyptian pharaohs but with more numbers.

Part B: Dilemmas of the Dismal Science

1. Politics and Economy

Before the industrial revolution, one could make a decent argument that political power and economic power could be separated. One strong point in favor was the separation of nobles, merchants, and clergy. Certainly, nobles often happened to be rich, but their wealth mostly derived from land-ownership, and their political power was mainly derived from their birth and connections, rather than their wealth (though connections and wealth could go hand in hand). Merchants, even though they could have influence through their wealth, were excluded from positions of political power. And the clergy, in some cases more important than the nobles or the merchants, were also quite separated from both.

It is from this situation that the “original sin” of economics derives: the belief that economics is productively analyzed outside of political context. The original liberals relied on this duality to expound on their idea that the ideal government simply facilitates the movements of the market.

Marx was the first to question this belief, claiming that it was only through political oppression that industrialists were able to achieve their economic exploitation. However, he still maintained an essential distinction between the two, and the Marxist viewpoint is that the contradictions between the political and economic spheres are what will eventually bring down capitalism and usher in communism, where politics and economics will finally be united.

Bichler and Nitzan take this faulty duality as a jumping-off point, and use it to explain how various problems come up in both theories. For instance, the Great Depression eventually forced the revision of the pure “Newtonian” laws of microeconomics, and the new science of macroeconomics was introduced to account for it. Since then, the systematic differences between the real world and the “spontaneous equilibrium” of the market have been accounted for by an ever-growing pageant of distortions, applied ad hoc. But just as Ptolemaic astronomy eventually drowned under the weight of its many epicycles, so is neoclassical economics struggling under the weight of all the compensations needed to account for politics in a theory that assumes politics is out of the picture.

In Marxism, a similar problem arose as the competition-rich environment where Marx originally made his theories gave way to a new monopolistic capitalism. Without competition, the tendency of the prices of goods to correlate with the price of labor, and of profits to equalize among different sectors, no longer held true. To account for this, Neo-Marxists developed a new theory that attempted to bring power back into the picture. However, like the man who cannot light his cigarette without putting down his teacup, in order to do so the labor theory of value had to be jettisoned, and Neo-Marxism became unmoored from the theoretical framework that birthed it. When the stagflation of the 70s and 80s hit, along with the breakup of “state capitalism”, Neo-Marxism ceased to be an accurate description of the world, and leftists attempted to move back to original Marxism, decrying the period of monopoly as a “historical blip”. However, there was little left of the original theory that still made sense, and many Marxists moved to cultural critiques, abandoning the original attempt of Marxism to put the study of capitalism on a scientific basis.

2. The Nominal and The Material

Another issue that plagues both Marxist and neoliberal analyses is an attempt to make a distinction between what I might call “the map” and “the territory”. On the Marxist side, this comes up as a distinction between “fictional” and “real” capital. “Real” capital is owned by industrial capitalists, who employ productive labor to create surplus value. I imagine that Marx was thinking of factories here. “Fictional” capital, on the other hand, is capital owned by commercial and financial capitalists, who merely appropriate the value generated by productive labor. Intuitively, this is a theory that one is inclined to be sympathetic to. On the one hand, you have farmers and laborers, who are clearly producing surplus, and on the other hand you have stock brokers, who are clearly just siphoning off the top. But to actually give a sturdy definition of what is “productive” and what is “unproductive” is not at all easy.

On the other hand, the liberals face a similar dilemma as they argue that the market ultimately represents the movements of actual industrial processes. Bichler and Nitzan argue that liberals were originally motivated to do this in order to make the argument that, unlike nobles who acquired their wealth by looting or birth, the new rich acquired their money by work. If ultimately the market is just a representation of real processes that bring material changes in the world, then business deals are as legitimate a form of work as manufacturing and farming, and so just as the laborer is entitled to the work brought in by the sweat of his brow, so is the businessman entitled to whatever rewards he can manage to make by fair participation in the market.

However, this has its own problems. The movements of capital markets are empirically quite difficult to correlate with the movements of the underlying material assets that those markets are supposed to represent. Again, neoliberals employ a pageant of epicycles to explain this away, but there are only so many epicycles that a scientist should accept.

In short, applying the “map and territory” analogy to money and “real” assets is bad for two reasons. The first is that it is very hard to draw a line between the two, and the second is that even when we can tell that something is a map, it’s very difficult to figure out what it is a map of.

3. What is Value?

The last problem is more philosophical. It’s best introduced in Bichler and Nitzan’s own words.

To study the rationalist order of capitalism without quantities is like studying feudalism without religion, or physics without mathematics. According to Marx, and here he was right on the mark, capitalism, by its very nature, seeks to turn quality into quantity, to objectify and reify social relations as if they were natural and unassailable. In this sense, a qualitative theory of value necessarily implies a quantitative theory of value; it means a society not only obsessed with numbers, but actually shaped and organized by numbers. This organization is the architecture of capitalist power. To understand capitalism therefore is to decipher the link between quality and quantity, to reduce the multifaceted nature of social power to the universal appearance of capital accumulation. The two aspects of the theory rise and fall together. If one is proven wrong, so is the other.

With this passage, Bichler and Nitzan set a high bar for theories. It is not enough for a theory of capitalism to give a qualitative account of value because the nature of capitalism is quantitative. It is on this cross of numerics that the Marxist “abstract labor” and the neoclassicist “util” are ultimately crucified.

They lay the problem out in the following way. It is well understood that wealth is not well-measured by current market value, because of inflation. Economists get around this by coming up with a “price index”, which measures how much a “standard bundle” of goods and services would cost at different times. By this mechanism, comparisons of value across time can be measured. However, the definition of such a “standard bundle” is highly problematic. If this standard bundle were, say, a single transistor, then an economist would conclude that we are over a million times richer than we were in the 1960s. This is clearly not true. Of course, an economist would not calculate a price index in this way, but almost any commodity has similar problems, though less extreme, from roast passenger pigeon to steel.
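
To make the arbitrariness concrete, here is a toy sketch of the bundle problem. Every price, income, and good below is invented purely for illustration; none of the numbers come from the book:

```python
# A hypothetical two-good economy. The choice of "standard bundle"
# entirely determines the answer the price index gives.
prices_1960 = {"transistor": 8.0, "bread": 0.25}
prices_2020 = {"transistor": 0.000001, "bread": 2.50}

def price_index(bundle, prices):
    """Cost of a fixed bundle of goods at the given prices."""
    return sum(qty * prices[good] for good, qty in bundle.items())

# Deflate the same nominal incomes by two different one-good bundles.
income_1960, income_2020 = 5_000.0, 50_000.0
growth = {}
for name, bundle in [("transistors", {"transistor": 1}), ("bread", {"bread": 1})]:
    deflator = price_index(bundle, prices_2020) / price_index(bundle, prices_1960)
    growth[name] = (income_2020 / deflator) / income_1960

print(growth)  # transistor bundle: tens of millions of times "richer"; bread bundle: 1.0
```

The same nominal incomes, deflated by two different bundles, yield wildly different conclusions: measured against transistors we are tens of millions of times richer, while measured against bread we are no richer at all.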

In practice, some reasonable judgement is made on which commodities to include. But on what basis is this judgement made? The theoretical underpinnings of neoclassicism are that things are ultimately denominated in “utils”. This explains why we don’t simply multiply by the number of transistors: utils don’t scale linearly with the number of transistors. This also explains why roast passenger pigeon would go for a high price now: its scarcity would cause someone to derive great utility from signaling wealth by eating it, or great utility from satisfying their curiosity about how it tastes.

As a qualitative theory, utils certainly have a lot going for them. I buy groceries because I derive utility from eating them. I pay rent because I derive utility from having a roof over my head. But even the most ardent proponents of the new utilitarianism balked at doing actual calculations.

In fact, they admitted quite openly that universal utility is impossible to measure and, indeed, difficult to even fathom. The interesting thing, though, is that this recognition did not deter them in the least. ‘If you cannot measure, measure anyhow,’ complained their in-house critic Frank Knight.

Instead, neoliberal economists developed a theory of “revealed preferences.” This theory says the following:

‘Utility is the quality in commodities that makes individuals want to buy them, and the fact that individuals want to buy commodities shows that they have utility’ (Robinson 1962: 48).

We wanted to compare wealth over time. Prices change due to inflation, technological improvements, and a whole host of other things, so we need a theory of value that allows us to make this comparison. Utility provides that theory of value. How do we measure utility? Prices. Whoops…

I want to make it explicit that Bichler and Nitzan aren’t making the argument that utilitarianism is not a good basis for morality. This kind of argument would not be novel or relevant. They are making the argument that utilitarianism is a bad basis for a theory of capitalism because it fails to make quantitative testable predictions.

While I’m clarifying things, I might as well reiterate (though it should go without saying) that anyone wishing to refute any of the arguments that I set out here should not take my presentation as canonical in any way; Bichler and Nitzan’s arguments are far more comprehensive than my brief summary. Although I promised that I would bore you with history and statistics, I really don’t have the space or time.

A similar circularity happens with the labor theory of value. I won’t go into as much detail, because I think it’s somewhat widely accepted that the labor theory of value doesn’t really work, but I encourage the interested reader to read the relevant parts in “Capital as Power”; I learned a lot about what the labor theory of value actually means that I had never really had explained to me in such detail.

Summary of Part B

Separating politics and economy leads to contradictions.

Separating out “real” from “fictional” capital also leads to contradictions.

Assuming that prices are caused by some sort of externally-defined value leads to contradictions.

Part C: The Machine

Moloch whose blood is running money!

1. Internal and External Logic

If we cannot use utility or abstract labor, then by what means can we determine value? Bichler and Nitzan do not have an answer to this question. They instead say that it is the wrong question to ask. It makes no sense to try to anchor the fluctuating numbers of human society to some fixed and eternal quantity. Rather, prices are just another expression of society.

According to Cornelius Castoriadis (1984), this alternative was articulated some 2,500 years ago, by Aristotle. Equivalence in exchange, Aristotle argued, came not from anything intrinsic to commodities, but from what the Greek called the nomos. It was rooted not in the material sphere of consumption and production, but in the broader social–legal–historical institutions of society. It was not an objective substance, but a human creation.

This articulates the idea that capitalism exists on a continuum. The idea that a gift economy or dictatorship is based on societal context is totally uncontroversial; why should the theory of a transactional economy be any different?

Consider the ratio between the price of petroleum and the wages of oil rig workers; between the value of Enron’s assets and the salaries of accountants; between General Electric’s rate of profit and the price of jet engines; between Halliburton’s earnings and the cost of ‘re-building’ Iraq; between Viacom’s taxes and advertisement rates; between the market capitalization of sub-prime lenders and government bailouts. Why insist that these ratios are somehow determined by — or deviate from — relative utility or relative abstract labour time? Why anchor the logic of capitalism in quanta that cannot be shown to exist, and that no one — not even those who need to know them in order to set prices — has the slightest idea what they are? Isn’t it possible that these capitalist ratios are simply the outcome of social struggles and cooperation?

The title of this section comes from my personal spin on this. Euclid tried to define all of the geometrical terms that he worked with rigorously. However, such an attempt was inevitably circular. Modern mathematics affirms that the proper way to axiomatize math is to start with undefined terms, which derive meaning through their relation to each other. That is, ultimately mathematics cannot be given an “external” grounding; it must be defined through “internal” means.

This is also well-understood in the linguistic sphere. An objective definition of a word does not exist; all that exists is the totality of that word’s relationship with actions and with other words. Analogously, it makes sense that prices and capital and markets are not objectively defined quantities, but exist only in relation to the larger context of civilization.

This resolves the dilemmas of part B. Politics and economics are clearly not separated; the study of one is the study of the other. The private ownership of “financial” assets is just as “real” as the private ownership of material goods; they are simply relations that exist in a certain societal context. And finally, there is no objective standard of value.

But wait, you are surely thinking, isn’t this giving up? After crucifying neoliberal and Marxist theories on a cross of numerics, it’s a pretty poor showing to have the theory that’s supposed to replace them be all “societally-determined” woo.

The answer is that productive empirical tests and theories can be produced from this mindset. Abandoning an “external logic” of capitalism does not mean that scientific inquiry has to pack up and go home. Rather, it frees us to look for theories that are embedded in specific contexts and divorced from pretensions of universality.

An analogy can be drawn here to machine translation: the “rules-based” systems of the early AI days are no match for the deep learning systems of today, and the transition was predicated on an understanding that the meaning of a word can’t be pinned down with formal rules (and of course also predicated on access to GPUs).

Additionally, this mindset allows us to salvage a great deal of neoliberal and Marxist economics. When properly contextualized and removed from their moral and theoretical underpinnings, these theories can have great empirical success.

The famous physicist and captain of the high seas Robert Hooke believed that the law he discovered governing a spring, F = -kx, where x is the displacement from the rest position (equivalently, \ddot{x} = -\frac{k}{m}x for the motion of a mass m on the spring), was a new law of nature. It actually turned out to be just a good approximation in certain cases, and we still use Hooke’s law in engineering for precisely that reason.

I think that many economists are all too aware that they are just using an approximation. And if I wanted to know the answer to a given question, I would look to expert economic judgement. This is because a highly tuned misguided theory often gives much better results than a poorly tuned well-founded theory. I recently bought a subscription to the Economist, and despite its heavy neoliberal bias, I still think it gives a clearer picture of the world than many other news sources; a large helping of meritocracy goes a long way. However, the way that human brains work is that understanding is layered. After working with principles and laws for a long time, we forget the underlying assumptions that led us to those principles and laws, and are therefore less able to ascertain when they no longer hold. Our intuitions about prices are developed by going to the grocery store, not by buying companies, and we should not trust principles just because they seem to make sense at the micro scale.

In this way, implicit assumptions about the nature of economics radically change the type of models that are even considered, and also radically change political views. Therefore, although this indictment of neoliberal theory does not in my mind invalidate expert scientific consensus in many areas, it does undermine political arguments based on neoliberal/Marxist theory, and it opens the door to new scientific ideas. And both of these are needed if this new theory is to be a success; to make a dent in neoliberal consensus where “cultural” theories have failed it must open up previously poorly understood areas to empirical analysis, and to better guide society it must have political implications based on that new empirical analysis.

2. Capital and Capitalization

In the previous section, we moved a bit far afield from the review. We now move back to the book as it attempts to examine the “internal logic” of capitalism: the processes and beliefs that keep it afloat.

Bichler and Nitzan argue that the central process of capitalism is, fittingly, capitalization. Capitalization is the process of taking an asset that is expected to produce a certain stream of future profits and assigning a current price to it. This is achieved through a discount rate, where future profits are valued lower than present profits. That is, we assume that 1 dollar a year from now is worth the same as r dollars today, for some discount factor r < 1. Mathematically, we can then say that if the profit from an asset n years in the future is p_{n}, then the value of that asset now is

\sum_{n=0}^{\infty} r^{n} p_{n}

If p=p_{n} is constant, then this reduces to

\sum_{n=0}^{\infty} r^{n}p = \frac{p}{1-r}
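
Numerically, the closed form agrees with a truncated version of the infinite sum. A minimal sketch (the profit stream, the rate, and the function names are my own invented illustration, not from the book):

```python
# Capitalization: the present value of a stream of expected profits,
# discounted by a factor r per year (r < 1).

def capitalize(profits, r):
    """Present value of a finite stream p_0, p_1, ...: the sum of p_n * r^n."""
    return sum(p * r**n for n, p in enumerate(profits))

def capitalize_perpetuity(p, r):
    """Closed form for a constant profit stream: p / (1 - r)."""
    return p / (1 - r)

# A 5% interest rate corresponds to the discount factor r = 1/1.05.
r = 1 / 1.05
p = 1000.0  # constant annual profit

# Truncating the infinite sum after 500 years already matches the
# closed form to well within a cent.
approx = capitalize([p] * 500, r)
exact = capitalize_perpetuity(p, r)  # roughly 21000
```

Note that the sum starts at n = 0, so the closed form p/(1-r) is slightly larger than the more familiar perpetuity value p/i for an income stream whose first payment arrives a year from now.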

This formula is called the “discounting/capitalization formula”. The idea of a discount rate (and implicitly this formula and variations of it) has been in use since the fourteenth century, when it was first introduced by Italian merchants (according to Bichler and Nitzan). As the centuries went on, it spread further and further, although not everyone really did the math “correctly”.

In 1907, Irving Fisher proposed that this discounting logic was in fact universal.

It is evident that not bonds and notes alone, but all securities, imply in their price and their expected returns a rate of interest. There is thus an implicit rate of interest in stocks as well as in bonds…. It is, to be sure, often difficult to work out this rate definitely, on account of the elusive element of chance; but it has an existence in all capital…. It is not because the orchard is worth $20,000 that the annual crop will be worth $1000, but it is because the annual crop is worth $1000 that the orchard will be worth $20,000. The $20,000 is the discounted value of the expected income of $1000 per annum; and in the process of discounting, a rate of interest of 5 per cent. is implied.

Bichler and Nitzan are really good at picking good quotes from old economists, so I can’t resist giving you another one:

The primitive economy in its choice of enjoyable goods of different epochs of maturity, in its wars for the possession of hunting grounds and pastures, in its slow accumulation of a store of valuable durable tools, weapons, houses, boats, ornaments, flocks and herds, first appropriated from nature, and then carefully guarded and added to by patient effort — in all this and in much else the primitive economy, even though it were quite patriarchal and communistic, without money, without formal trade, without definite arithmetic calculations, was nevertheless capitalizing, and therefore embodying in its economic environment a rate of premium and discount as between present and future. (Fetter 1914b: 77)

After this quote, Bichler and Nitzan quip

In short, if human beings were indeed made in the image of God, the Almighty must have been a bond trader.

However, despite these enthusiastic embraces of capitalization from the dismal science, the general public was not convinced, and the formula was not yet widely adopted. It was not until the dazzling onslaught of complexity in corporate finance that started to unfold in the 1930s and 1940s that the capitalization formula was firmly grasped as a principle with which to evaluate and price assets and make sense of the growing chaos. Later on, risk was incorporated formally into the model, and these modern practices of corporate finance became so firmly entrenched that one would be entirely forgiven for supposing that they were laws of nature.

For me, this was an interestingly different way of looking at capitalism. Capitalism is typically presented as a system that embraces markets, whereas this presents capitalism as a logic for how to price things in the context of markets. In theory, another political system could use markets without using the same pricing logic.

Bichler and Nitzan then devote a whole chapter to showing that capital within the modern capitalist system is determined wholly by variants of the discounting formula, and that capital is not necessarily correlated with any sort of “real capital” in terms of physical assets. Unfortunately we don’t have time to get into this, but this is where they start pulling out the graphs and data, and start to show that the assumptions that economics textbooks make in correlating capital with any sort of measurable, tangible asset are empirically unfounded.

3. Profit and Differential Profit

If capital is “capitalized future profits”, then what are profits? Where do they come from, and how do they relate to accumulation of capital?

Recall the framework of “Business and Industry” that Veblen came up with. According to Veblen, when Industry is allowed to run unchecked, there are no profits. Competition makes the price and cost of a good equalize exactly. In order to make a profit, a firm must somehow restrict industry, or, in Veblen’s words, “sabotage” it. For instance, in section A.2 we saw that the profits of automobile companies directly correspond to their ability to sabotage alternate forms of transportation. To sum up, capital measures and “discounts” a firm’s ability to sabotage industry.

This goes along with a theory of property rights that sees property rights as fundamentally exclusionary. My ownership of a house does not enable me to use it; it only serves to disallow you from using it. Capital measures the access of a firm to revenue streams that other firms cannot access. A textbook of 21st-century materials engineering might be worth billions to a firm in the 1960s that could control access to its knowledge, but once it becomes common knowledge, it cannot be counted as capital, even though the firm’s ability to access it has not changed at all.

As usual, Bichler and Nitzan write about this in a very elegant way.

Business, like other power institutions throughout history, can force people to act, but it cannot make them productive. Moreover, productivity as such, being socially hologramic and therefore open and unrestricted, cannot generate a profit. The only way for capitalists to profit from productivity is by subjugating and limiting it. And since business earnings hinge on strategic sabotage, their capitalization represents nothing but incapacitation. In this particular sense, capital, by its very construction, is a negative industrial magnitude.

Even human and relationship capital can be viewed in this lens. When human capital is weighed on the balance sheet, what is being measured is the firm’s ability to divert the output of those brains away from increasing the common knowledge and towards increasing the profit of the firm.

However, as power is inherently relative, the way it accumulates is not by profit, because a rising tide that lifts all boats does not change their relative standing. Instead, capital accumulates by differential profit. In other words, firms are unconcerned with outrunning the bear market; they are just concerned with outrunning you.

Bichler and Nitzan phrase this culturally, talking about the drive to beat the average among stockbrokers, but one could easily think about this evolutionarily. The firms that have managed to keep a consistent rate of growth that is above the average have, by the miracle of exponential growth, become those firms that dominate the economy.
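
The arithmetic behind that “miracle” is easy to check. A toy sketch (the 5% average and the 1% edge are invented numbers, purely for illustration):

```python
# Compounding a small differential edge over the average growth rate.
average, edge = 0.05, 0.01
cap_average = cap_differential = 1.0
for year in range(100):
    cap_average *= 1 + average          # a firm growing at exactly the average
    cap_differential *= 1 + average + edge  # a firm beating it by one point

ratio = cap_differential / cap_average
print(round(ratio, 2))  # a 1% edge compounds to roughly 2.6x relative size in a century
```

A one-point edge sustained for a century leaves the differential accumulator roughly two and a half times the size of an average firm; sustained for two centuries, the ratio compounds to over six.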

In short, differential profit measures the ability of a firm to sabotage industry more than everyone else. This race to beat the average is what determines which firms end up on top, and which CEOs can buy the most expensive yachts. Not efficiency, not productivity, not innovation, but the ability to sabotage industry more than everyone else.

Summary of Part C

It makes more sense to think of prices as numbers which have internal, not external significance.

The central process of capitalism is capitalization of future profits (measured in prices).

Profit measures the ability of a firm to sabotage industry.

Firms grow when they can sabotage industry more than other firms.

Capital measures the ability of a firm to sabotage industry, discounted over time.

Part D: The Machine in Context

1. Economic Implications

In this section, we talk about some of the predictions that this theory makes, and how well they align with reality.

First of all, no review of Capital as Power would be complete without mentioning their discussion of what they call “Dominant Capital.” Dominant Capital is their name for the conglomerate of the very largest companies in capitalism, the companies that have managed to squeeze above-average profits out of industry year after year and have an outsized influence not only on the markets, but also on the political process.

As a rough approximation of dominant capital, Bichler and Nitzan consider the top 100 companies as measured by capitalization. Using this, they show that while various other measures of growth have no clear trendlines, accumulation of capital by the top 100 companies has proceeded at a steady rate since WWII.

This growth is not driven by expansion of industry. Bichler and Nitzan argue that expansion of industry means loss of control by capitalists, and is thus uncorrelated with accumulation of capital.

To back these arguments up, Bichler and Nitzan make lots of graphs where two trends on different axes are lined up.

These sorts of graphs are pervasive throughout Capital as Power, and although they are suggestive, I’m kind of skeptical of arguments that pull variables out of thin air and show that they are correlated. Presumably Bichler and Nitzan had many different datasets to work with; it’s not too hard to find correlated trends if you squint. Additionally, as always, causation is hard to tease out; the cycles/trends in these graphs could come from common causes that have nothing to do with this theory.
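That worry is easy to demonstrate: two completely independent trending series are quite likely to look correlated, which is exactly why lining trends up on dual axes proves so little. A quick sketch with simulated noise (nothing here is their data):

```python
import random

def corr(xs, ys):
    """Pearson correlation, computed directly."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

def random_walk(n, rng):
    """A driftless random walk: cumulative sum of Gaussian steps."""
    level, path = 0.0, []
    for _ in range(n):
        level += rng.gauss(0, 1)
        path.append(level)
    return path

rng = random.Random(42)
# Measure how often two *independent* 100-step walks correlate strongly.
strong, trials = 0, 200
for _ in range(trials):
    if abs(corr(random_walk(100, rng), random_walk(100, rng))) > 0.5:
        strong += 1
print(f"{strong / trials:.0%} of independent walk pairs have |corr| > 0.5")
```

A substantial fraction of pairs come out strongly "correlated" despite sharing no causal structure at all; this is the classic spurious-regression problem for trending time series.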

As a current statistics grad student, I was both pleased and disappointed at the general level of math in this book. That is, there is very little math. From my perspective, this is far better than coming up with a lot of ad-hoc models with lots of assumptions; better to have a tight qualitative analysis than a sloppy quantitative one. On the other hand, I was kind of hoping for something a little more precise than “these graphs kind of look similar.” I haven’t read much of the rest of Bichler and Nitzan’s work, though, so perhaps they develop these theories more rigorously elsewhere.

Another consequence of the frame of Capital as Power is a new look at inflation. To start this analysis, Bichler and Nitzan claim that one of the most common modes of power is price setting. In neoclassical economics, firms are typically portrayed as having to take prices dictated by the market, but in fact, in order to achieve a “normal rate of return”, firms must exert power and set prices higher than what a truly competitive market would bear.

As a consequence, inflation is the result of power struggles between businesses, and is fundamentally redistributionary rather than an expression of a growing economy. This explains stagflation: price wars without industry growth make perfect sense when business expresses power by both restricting industry and raising prices.

Additionally, Bichler and Nitzan claim that in the US over the last 50 years, dominant capital has accumulated during periods of inflation, when it can raise prices faster than everyone else, and its accumulation has slowed during periods without inflation. However, this is not necessarily true in other countries, where different redistribution patterns occur during inflation. That is, the US economy has become a good engine for the accumulation of dominant capital, but this is by no means universal.

The final consequence that I will mention briefly is mergers and acquisitions. Apparently much of the growth of dominant capital in the last few decades has been mediated by mergers and acquisitions, not green-field growth. This makes sense because mergers and acquisitions allow for more concentrated power and more control over industry, and green-field growth has the potential for letting industry run away from business.

In general, I would be very interested for someone with a better understanding of economic indicators than mine to take a look at the later chapters of Capital as Power and tell me whether Bichler and Nitzan are cherry-picking their data, and whether their analysis is accurate. This was the part where they really started to lose me, but also the part on which all of their claims about the empirical verifiability of their theories rest.

For this book review, however, we must move on!

2. The Space of Political Systems

Capital as Power has a subtitle which I have not mentioned yet. Its full title is “Capital as Power: A Study of Order and Creorder.” What is creorder? I’ll let the authors define it for you:

Historical society is a creorder. At every passing moment, it is both Parmenidean and Heraclitean: a state in process, a construct reconstructed, a form transformed. To have a history is to create order — a verb and a noun whose fusion yields the verb-noun creorder.

A creorder can be hierarchical as in dictatorship or tight bureaucracy, horizontal as in direct democracy, or something in between. Its pace of change can be imperceptibly slow — as it was in many ancient tyrannies — yielding the impression of complete stability; or it can be so fast as to undermine any semblance of structure, as it often is in capitalism. Its transformative pattern can be continuous or discrete, uniform or erratic, singular or multifaceted. But whatever its particular properties, it is always a paradoxical duality — a dynamic creation of a static order….

The use of this idea is to situate capitalism within a broader space of creorders. Capitalism is characterized by the quantification of power through a market, and the accumulation of power through exclusive rights. The creordering of the market, and of these property rights, however, is accomplished through the very powers that the markets and property rights affirm. That makes the exact power structure of a capitalist society very fluid. The internal logic of one capitalist society is due to these power relations, and though the power relations may be mediated through the market, there are no “economic laws” that force the logic of one capitalist society to be the same as that of another. We have mainly been analyzing the logic of the U.S. economy over the last 50 years, and Bichler and Nitzan are able to separate out the universal framework from the particular features of this system.

One example of such a “particular theory” is Bichler and Nitzan’s idea of the “petrodollar-weapondollar coalition” vs. the “technodollar-mergerdollar alliance.” They posit that during the Cold War, dominant capital mainly centered around oil extraction and petroleum-dependent industry, plus weapons manufacture: the “petrodollar-weapondollar coalition”. These two sectors were dependent on each other because oil required certain international relations that the defense industry was happy to supply. However, as the Cold War wound down in the 90s, dominant capital shifted towards technology and consolidation of power through mergers, which they call the “technodollar-mergerdollar alliance”. This new coalition seemed to observers to herald a new economy centered around growth through high-tech knowledge and industry.

However, with the dotcom crash and the new wars in the Middle East, dominant capital shifted back to “petrodollar-weapondollar”. Depending on where dominant capital was centered, different logic applied.

Bichler and Nitzan developed this theory of “petrodollar-weapondollar” and “technodollar-mergerdollar” back in the 90s, and claim that, in published articles, they successfully predicted changes in trends based on this framework that other forecasters were unable to see at the time. This should be easy enough to verify for anyone who wants to dig through their old papers, and it seems like a strong indicator that they know what they are talking about, at least in this area.

The point that I am trying to make is that, unlike the laws of economics, which mostly claim to be universal across time, the strength of Capital as Power is that it can identify what is true of some periods and not of others, and integrate those assumptions into its models. In other words, rather than being a general theory of economics, Capital as Power is a general theory of the space of possible capitalist politics, or, as Bichler and Nitzan seem so happy to coin, a general theory of possible capitalist creorders.

3. Connections with Other Theories

Reading this book, I had the overwhelming desire to introduce Bichler and Nitzan to James C. Scott, the author of “Seeing Like a State.” There seems to be a very strong similarity in how each theorizes power. That is, power mainly serves to organize society in such a way that its resources are more easily extracted. However, Bichler and Nitzan are not anthropologists, and James C. Scott is not an economist, and as such each of their analyses is limited by domain.

Specifically, I think that Capital as Power could be greatly improved by a discussion of “legibility.” It seems like one important asset that is discounted implicitly by capitalization is the “legibility” of industry to a business. If an industry is illegible, it is much more difficult to extract profits.

Conversely, James C. Scott is heavy on examples and creates good language to describe situations that were previously hard to describe; however, he lacks empirical/numerical theories built on his framework. Using the framework of Capital as Power, it might be possible to correlate the legibility of a certain domain with the stock prices of businesses with interests in that domain.

Finally, the idea that I have had in my head ever since reading Seeing Like a State (or more accurately, ever since reading Scott Alexander’s review of Seeing Like a State) was that there should be some way of talking about all of this using some variant of statistical mechanics. If power is the ability to create order, and order is the absence of entropy, then we should be able to talk about power and creorder in the framework of a stochastic dynamical system, whose entropy we can measure at points in time.

The second law of thermodynamics says that entropy always increases in a closed system. In an open system, however, such as the Earth, which constantly receives sunlight and radiates out waste heat, entropy can decrease locally; all of evolution and human society is proof of how much order an open system can accumulate.

This semester, I’m finally taking stochastic mechanics (up to now I’ve only had a vague idea of what it actually is), so maybe I’ll be able to say more after I know what I’m talking about. But if you, dear reader, are catching a glimpse of this vision that I have, and you know something about statistical mechanics, I ask that you keep this idea in your head and toy with it over the next decade. A mathematical framework for creorder would dramatically expand the set of questions that scientists can research about the world.
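For what it’s worth, here is the minimal version of that sketch, with everything past the standard definitions flagged as speculation: model society as a stochastic process (X_t) with state distribution p_t, and use Gibbs/Shannon entropy as the measure of (dis)order.

```latex
% Entropy of the state distribution at time t (standard definition):
S(p_t) = -\sum_{x} p_t(x) \log p_t(x)
% Speculative reading: "power" is the capacity to drive S(p_t) down
% locally. This is consistent with the second law, which constrains
% only the system together with its environment:
\frac{d}{dt}\left( S_{\mathrm{sys}} + S_{\mathrm{env}} \right) \geq 0,
\qquad \text{so } \frac{dS_{\mathrm{sys}}}{dt} < 0
\text{ is allowed whenever }
\frac{dS_{\mathrm{env}}}{dt} \geq \left| \frac{dS_{\mathrm{sys}}}{dt} \right|.
```

The hard (and open) part is choosing a state space and a coarse-graining under which these quantities are measurable for an actual society.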

Summary of Part D

The Capital as Power theory of the capitalist machine has many concrete explanations of graphs of economic indicators over the last 50 years in the United States. Whether or not these are just-so stories must be left to a more-informed reader than I.

Capital as Power provides a framework for talking about specific internal logics of capitalism, and relating them to each other.

Capital as Power seems to be getting at a much larger theory, which is also talked about by James C. Scott, and should be mathematically modeled by stochastic mechanics.

Should You Read This Book?

Capital as Power is long, but extremely full of content. There are some large points in the book that I didn’t talk about at all, and what I did talk about was highly condensed. After reading both Seeing Like a State and the slatestarcodex review of Seeing Like a State, I thought that Scott Alexander managed to fully capture the main points, but I would definitely say that this review has not captured the main points of Capital as Power in full. Put more positively: this review has not spoiled Capital as Power for you!

Additionally, Bichler and Nitzan write in a very engaging way; not necessarily easy to read but certainly action-packed. And there are many, many interesting historical nuggets in the book, like the history of GM’s EV1 car that I referenced in the introduction.

And finally, although Capital as Power is long, there is a “middle way” (between reading this review, and reading the entire book). The first chapter, which is around 80 pages on my ereader, contains a summary and overview of the whole thing, and Bichler and Nitzan have a free copy available on their webpage, so you don’t even have to feel bad about buying an entire book. So go do that!


Owen's Blog -- Three Environmental Proposals (2021-09-13, https://owenlynch.org/posts/2021-09-13-three-proposals)

These are some environmental demands that I want to see more talk about.

Aggressive, phased-in carbon taxes. It’s boring and penny-pinching and not as sexy as “investment in green infrastructure”, and maybe it will be a regressive tax, but I don’t care. Taxation is one of the most powerful levers our government has in our current capitalist system, and by god we have to use every lever we have. Of course, the revenue from this tax should be earmarked for green infrastructure, but that’s beside the point: pollution should have a price tag commensurate with its cost to society.

Immediate large-scale investment in nuclear, and a repeal of the doctrine that nuclear be “as safe as possible”. Nuclear should be held to safety standards similar to those for fossil fuel plants. For more information on how regulation has crippled what should be a bountiful source of green energy, see “Why has nuclear power been a flop”. Ideally, once nuclear has a reasonable level of regulation, energy companies will fall over each other in a rush to decarbonize.

The EPA should be revamped to have powers and independence similar to the Fed’s. The directorship of the EPA would be a 10-year appointment, free from political meddling during those 10 years. The EPA would have a mandate to maintain the environmental safety of the territory of the United States, and broad authority to implement regulations, quotas, and taxes to this end. Perhaps this would require a constitutional amendment, and I don’t know exactly how to separate out powers granted to the new EPA from powers that should still be reserved for the other branches of government, but just as the complexities of modern finance should not be left to the political process, neither should the complexities of modern ecosystems. If this is successful, then environmental protection should fade into the background and become a default fact of life.

When I contrast these proposals to “Green New Deal”-style proposals, my feeling is that mine are more compatible with a “lean, powerful state”, whereas a “Green New Deal” will end up with tons of opportunities for inefficient, flabby, corrupt infrastructure deals whose main benefit goes to the government contractors awarded the various contracts.

Though, maybe the real solution here is to revamp the US’s ability to build things by carefully learning from the Japanese, Chinese, or Germans and implementing similar managerial policies… A boy can dream…

Discuss!


Owen's Blog -- Syndication to Twitter (2021-09-12, https://owenlynch.org/posts/2021-09-12-syndication-to-twitter)

This is a test post to see whether my blog posts are automatically cross-posted to twitter.


Owen's Blog -- Mathematical Classes (2021-05-01, https://owenlynch.org/posts/2021-05-01-mathematical-classes)

If mathematics were a game of Dungeons and Dragons, the character classes that would be most highly venerated are the wizards and sorcerers. Wizards are figures like Grothendieck and Noether, people who built huge towering edifices of theory and were masters of the formal. Sorcerers are people like Von Neumann and Ramanujan, mathematicians whose incredible intuitive faculties led them to make massive leaps at which others could only boggle.

Of course, mathematicians don’t fall so cleanly into categories, and most famous mathematicians are both wizardly and sorcerous. But I think the distinction is interesting. And it also opens up the question, what about the other classes?

There are fighter mathematicians, whose up-to-date and specialized knowledge of a specific field allows them to steadily make progress. I would maybe say that Judea Pearl has fighter-like aspects; it seems like he’s been working away at causality for a long time, steadily building and integrating with other fields. For some reason, I also think of Emily Riehl in this way; it seems like she has a strong goal (i.e., making a language for infinity category theory) and is systematically attacking it piece by piece, though I am not tuned in enough to know whether this is an accurate description.

Cleric mathematicians I could interpret in two ways. The first is that maybe a cleric mathematician is one who refactors knowledge. In this sense, I’d say that Bill Lawvere is cleric-y. Another interpretation is that a cleric mathematician is a very good teacher, one who heals their students’ misunderstandings. As I haven’t been taught by many famous mathematicians, I can’t give a terribly famous example, but I would say that Melody Chan is pretty cleric-y, among the teachers that I have had.

What would a rogue mathematician be? I think a rogue mathematician would be someone who steals lots of ideas from other fields and integrates them into new fields. I think the whole field of applied category theory is rather rogue-ish. I don’t know though, I could think about this in different ways.

A bard mathematician, on the other hand, I have a very firm grasp on. These are the mathematicians who know a bit of everything, and spread their knowledge widely through collaboration, writings, and song. There are two bard mathematicians that immediately jump to my mind: John Baez and Paul Erdos. In my mind, the bards are the mathematicians that glue mathematics together, giving it tradition and forging bonds between theories and people. Another strain of bardy-ness runs through the mathematicians who spend a lot of time illustrating things, like Robert Ghrist or Grant Sanderson of 3Blue1Brown fame.

I think that there is a growing appreciation that wizards and sorcerers are not the be-all and end-all of what the ideal mathematician should look like, and I think this is a good thing! I don’t really have anything more profound to say in terms of the implications of this; I mainly just thought the classification was interesting.

There are more D&D classes, but their correspondence with mathematical archetypes is left to the reader!


Owen's Blog -- Dictator Island Chapter 1 (2021-01-04, https://owenlynch.org/posts/2021-01-04-dictator-island-1)

Ever since he was a child, Generalissimo Carlos van Witteboom-Diaz hated Wednesdays. Wednesdays were the day that they had music class, every week for every year of grades 1-7, until civil war broke out and school ended for Carlos permanently.

Until that happy day, each Wednesday Carlos’s class sang dumb songs about animals on farms and love and life in the country, and Carlos’s voice would squeak and crack until he could bear it no longer and refused to sing another note, at which point he would be forced to sit in a corner. But that wasn’t the worst part. The worst part came after the singing, after the tears of shame had started to dry. The worst part was the recorders.

These recorders had been carved by hand out of wood approximately a millennium ago, which might have made them more valuable than your standard modern plastic recorder in another context, but as things were, it created two key disadvantages.

The first was that they had begun to splinter. One Wednesday, Carlos had been sent to the Nurse’s office with a wood splinter lodged in his gum by an involuntary twitch in response to a particularly shrill and out-of-tune E♭, and the wound had festered for months; he had been unable to eat anything that put up more of a fight than a refried bean.

The second disadvantage of wood was that the slobber from all eight grades of the school never quite seemed to dry. Carlos suspected that a small fraction of the town’s budget went towards recorder-slobber related illnesses.

But even refusing to let the filthy thing go anywhere near his mouth did not exempt him from its torments, for he was still forced to be in the room as all the other children pushed tortured air through the decrepit pipes, tortured air that thought only of escaping the lungs of his oblivious classmates and venting all of its trauma and heartbreak approximately 3 inches away from Carlos’s ear, like a soap opera star who had aged out of the profession but had not lost a single decibel of her voice.

So one of Carlos’s first acts as Generalissimo was to ban both recorders and Wednesdays. He didn’t ban singing, seeing as that would be rather difficult, but it was well known that anyone who allowed singing to come into his presence would suffer a rapid loss of favor, which typically came accompanied with several other unpleasant things.

His advisors were quite happy with the recorders ban, which was easy to implement because nobody other than sadistic music teachers really likes recorders, but the Wednesday ban was trickier.

They first tried to rename Wednesday to Carlosday, but the Generalissimo said through icy gritted teeth that anyone so foolish as to think that the name of their great Generalissimo should adorn that cursed day did not deserve the air in their lungs, and that put an end to that plan.

The next plan they tried was to remove Wednesdays and add Carlosday to the end of the week, so that their week would still sync up with the rest of the world even though they used different names. But Carlos would have none of that either; it would ruin Thursday.

Finally, they gave up and did everything on a six-day schedule. Monday, Tuesday, Thursday, Friday, Saturday and Sunday were the days of the week in New Barcelona, and all tourists and multinational corporations doing business with the country would simply have to print two copies of all of their calendars and schedules.

On Dictator Island, where nobody responded “how far” when ex-dictators asked them to jump, Generalissimo Carlos van Witteboom-Diaz celebrated his first Wednesday in 31 years, 8 months, and 22 days by beating his butler, who had brought him breakfast in bed while whistling an old song, senseless with his oak walking stick, before the U.S. Marshal posted outside kicked the door off its hinges and tackled the 73-year-old to the ground.


Owen's Blog -- Coends and Integrals (2020-11-21, https://owenlynch.org/posts/2020-11-21-ends-and-integrals)

Here I was, browsing through twitter, and I came across an innocuous tweet by Alex Kontorovitch…. Or was it?

First of all, a warning. This is more of a “new result” blog post than an “explaining something that already exists” one, so I hate to say it, but if you are not familiar with coends or differential forms, the rest of this will be pretty gobbledigookish.

As I assume the reader is familiar with, there’s a well-known construction in category theory called a coend, which produces an object of \mathsf{D} from a profunctor P \colon \mathsf{C}^{\mathrm{op}} \times \mathsf{C} \to \mathsf{D}. The notation for it looks like this

\int^{c \in \mathsf{C}} P(c,c) \in \mathsf{D}

It has this notation because it has a lot of properties that are very reminiscent of integrals. Coends (and their dual, ends) are very useful for a lot of constructions in category theory, except for, funnily enough, calculus. Or at least, so I’ve been told; perhaps the correspondence that I am about to make has already been done.
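Two of those integral-like properties, both standard, are worth recalling here: the co-Yoneda lemma, which behaves like integrating against a Dirac delta, and the Fubini rule for iterated coends.

```latex
% Co-Yoneda ("Dirac delta"): for a functor F : C -> Set,
\int^{c \in \mathsf{C}} \mathsf{C}(c, -) \times F(c) \;\cong\; F
% Fubini: iterated coends commute, and both agree with the coend
% over the product category,
\int^{c \in \mathsf{C}} \int^{d \in \mathsf{D}} P(c, d, c, d)
\;\cong\; \int^{d \in \mathsf{D}} \int^{c \in \mathsf{C}} P(c, d, c, d)
\;\cong\; \int^{(c, d) \in \mathsf{C} \times \mathsf{D}} P(c, d, c, d)
```

Both will reappear at the end of the post, where I wonder whether they say anything about the construction below.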

Anyways, plowing ahead, I am going to do a construction that relates coends and actual integrals, and it may or may not be original, so hold onto your seats folks.

The Construction

Let M be a manifold. Then for any k, define C_{k}(M) to be the \mathbb{R}-algebra freely generated by all differentiable maps \Delta^{k} \to M, where \Delta^{k} is the k -simplex. As we are familiar with from simplicial homology, this turns into a chain complex, with a “boundary map” \partial_{k} \colon C_{k}(M) \to C_{k-1}(M).

We also have another chain complex, this one coming from de Rham cohomology, and given in degree k by \Omega^{k}(M), the \mathbb{R}-algebra of k-dimensional differential forms on M.

If \mathbb{N} is the poset of natural numbers, seen as a category by adding a unique arrow m \to n when m \leq n, then C_{-}(M) is a contravariant functor from \mathbb{N}, and \Omega^{-}(M) is a covariant functor from \mathbb{N}.

Now, if \sigma \colon \Delta^{k} \to M is a differentiable map, then we can integrate a k-form \omega \in \Omega^{k}(M) along it to get

\int_{\sigma(\Delta^{k})} \omega \in \mathbb{R}

For notational reasons, I define \mathrm{int}_{k}(\sigma,\omega) to be this integral; from henceforth the integral sign will be reserved for coends. We extend \mathrm{int}_{k}(-,\omega) to be a map C_{k}(M) \to \mathbb{R} using the fact that C_{k}(M) is the free \mathbb{R}-algebra on maps \Delta^{k} \to M. With a bit of work, one can show that this is a map \mathrm{int}_{k} \colon C_{k}(M) \otimes \Omega^{k}(M) \to \mathbb{R}, because \mathrm{int}_{k} \colon C_{k}(M) \times \Omega^{k}(M) \to \mathbb{R} is bilinear.

We can glue together these maps for all k to get a map

\mathrm{int} \colon \bigoplus_{k \in \mathbb{N}} C_{k}(M) \otimes \Omega^{k}(M) \to \mathbb{R}

Now comes the fun part. Stokes’ theorem says that \mathrm{int}_{k}(\partial \sigma, \omega) = \mathrm{int}_{k+1}(\sigma, \delta \omega). This precisely satisfies the universal property of the coend! That is, recall that the coend of the profunctor C_{-}(M) \otimes \Omega^{-}(M) is the colimit of

\bigoplus_{n \leq m} C_{m}(M) \otimes \Omega^{n}(M) \rightrightarrows \bigoplus_{k \in \mathbb{N}} C_{k}(M) \otimes \Omega^{k}(M)

where the two maps apply the morphism n \leq m contravariantly to C_{m}(M) (i.e., \partial^{m - n}), or covariantly to \Omega^{n}(M) (i.e. \delta^{m-n}). As \delta^{2} = 0 and \partial^{2} = 0, the only interesting case is m = n+1, and Stokes’ theorem precisely states that in that case, the images of those two maps integrate to the same thing! Therefore, we can extend \mathrm{int} to an \mathbb{R}-algebra morphism

\mathrm{int} \colon \int^{k \in \mathbb{N}} C_{k}(M) \otimes \Omega^{k}(M) \to \mathbb{R}

What is this strange \mathbb{R}-algebra called \int^{k \in \mathbb{N}} C_{k}(M) \otimes \Omega^{k}(M)? I have no idea. I don’t know if this algebra goes by another name, or is interesting for any reason. But I do believe that this construction is functorial in M, so this gives a functor from manifolds to \mathbb{R}-algebras, which one might hope preserves things that are important.

Mainly, I’m just happy that coends have any relation to integrals. Also, I wonder if the different properties that coends have (e.g., Fubini’s theorem, the Hom-functor as Dirac delta) can somehow be brought to bear on this construction.

Send me thoughts about this!


Owen's Blog -- At Long Last, a Good Julia Setup (2020-10-23, https://owenlynch.org/posts/2020-10-23-julia-setup)

I was inspired to write this post because I’ve been programming Julia for the last year, and only now have I found a setup that I like. This is partly because I am on NixOS, and partly because I like emacs, so you may very well already have a nice Julia setup and in that case can ignore this. But for those of you who are slightly unsatisfied… read on!

The key ingredients are docker, jupyter, and doom-emacs.

First off, I have a Dockerfile that looks like this:

Periodically I add more packages to this; I don’t particularly remember why I needed all of those packages but whatever.
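For concreteness, a minimal Dockerfile of the shape described, starting from the official julia image that the launch scripts below refer to, might look like the following; the particular apt packages are my guesses, not the original list:

```dockerfile
# Base image: the official Julia image (the launch scripts below run
# an image called "julia").
FROM julia

# Illustrative system packages; the original list was longer and
# accumulated ad hoc.
RUN apt-get update && apt-get install -y \
        build-essential \
        git \
    && rm -rf /var/lib/apt/lists/*

WORKDIR /root
```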

I then run this docker container with bind mounts for everything in the root folder. This means that I don’t have to reinstall packages every time I restart the docker container. My script to open up a bash shell in the container looks like this
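A sketch of such a script, where the image name and the two bind mounts are taken from the Jupyter launcher quoted later in this post and the rest is an assumption:

```shell
#!/usr/bin/env bash
# Same image and bind mounts as the jupyter launcher, but drop into
# bash instead of starting jupyter lab.
docker run -it \
    -v /home/o/g:/g \
    -v /home/o/s/docker-julia-home:/root \
    julia bash
```

Bind-mounting the home directory is what makes installed packages survive container restarts.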

When I first set up the container, I run this script to go into a bash shell, and I open a Julia REPL to install IJulia. This installs conda and Jupyter for me automatically, which is nice because then python dependencies are also managed through Julia. I love Nix, but Julia just does not play nicely with it, so I’ve found it’s best to just give Julia free rein to do whatever it wants to a docker container.

Once I’ve installed IJulia and Jupyter, I close the bash shell, and run this

#!/usr/bin/env bash
docker run -it -p 8888:8888 -v /home/o/g:/g -v /home/o/s/docker-julia-home:/root julia /root/.julia/conda/3/bin/jupyter lab --ip=0.0.0.0 --allow-root

At this point, you can access the Jupyter lab through a web browser at localhost:8888, which is convenient. But it doesn’t stop there!

The next step is emacs-jupyter. I have this installed through doom emacs with the layer (org +jupyter), you may have a better way of installing it. I don’t really use it with org mode most of the time so… *shrug*.

Anyways, once this is installed, I run the command jupyter-run-server-repl and then pass in the connection details for the Jupyter server in docker to the prompts. This opens a repl in Emacs that has support for inline images, which is really nice for plots. Finally, I open up a Julia file and run jupyter-repl-associate-buffer. This allows me to use key combinations like C-c C-c to evaluate a region, or C-c C-b to evaluate the whole buffer.

And that’s pretty much it! This will allow you to use Julia in emacs without permanently installing anything on your computer, and should work well anywhere where docker bind mounts work. In the future, I might look into running PackageCompiler to get Jupyter kernels that load IJulia and Plots faster, but I’m not going to worry about that yet.

This setup should also work with Python and R, but I haven’t used it very much for that, so YMMV.

This website supports webmentions, a standard for collating reactions across many platforms. If you have a blog that supports sending webmentions and you put a link to this post, your blog post will show up as a response here.
You can also respond via twitter; tweets with links to this post will show up below.


Site proudly generated by Hakyll, with stylistic inspiration from Tufte CSS.

Owen's Blog -- Rod and Ring
https://owenlynch.org/posts/2020-10-12-rod-and-ring (2020-10-12)

The “rod and ring”, in addition to being a popular method of hanging curtains, is an ancient symbol of kingly legitimacy that typically came from the Gods (source: Wikipedia). The coronation regalia of British monarchs are a similar phenomenon, and the coronation of the king or queen involves the Archbishop of Canterbury bestowing the crown, scepter, orb, etc. on the monarch.

The particular details are not so important. The point is that throughout history there are all sorts of instances of worldly power being invested in objects that are then granted to the ruler by a divine power.

I was biking recently, and it occurred to me that the three things I always have on me when I leave the house are kind of like this.

My keys. These are bequeathed to me by locksmiths, the clerics of the gods of physical property, and they represent my ownership of my house, my bike, and my violin (my violin case locks).

My wallet. This contains cards and cash bequeathed to me by the bankers, the clerics of the gods of economics, and they represent my power over the market. This also contains the official identity cards given to me by the nation-state that I reside in, which represent my legitimacy as a member of society.

My phone. This was bequeathed to me by a tech company, part of the church of information, and it represents my mastery over the communications of the world.

Just like the Crown Jewels, or rod and ring, give power in a divine-right-to-rule monarchy, these three things give me power in a certain context, i.e. modern capitalist society. Also just like the Crown Jewels, none of these would be anywhere close to as much help for me if I were dropped in a different context. Sure, I could use the computational power of my phone, but on the other hand, you could also bop someone on the head with your scepter; the intrinsic value of my phone as a computational device is divorced from its value as a world-network interface.

I’m working on a book review of “Capital as Power”, which will eventually be published on this blog, and this kind of thinking is going to be integral to understanding that book, so think of this as a taster.


Owen's Blog -- Haskell's Children
https://owenlynch.org/posts/2020-09-16-haskells-children (2020-09-16)

If I could travel back in time four years and tell my old self that Haskell was starting to lose its shine, he wouldn’t believe me. I grew up on Haskell, my appetite for category theory was whetted by Haskell, my biggest programming projects have been in Haskell, and my dream job was to work at a company that used Haskell.

But now, I find myself simply not as excited about Haskell as I used to be. What changed?

I think there are a couple of things. One primary factor is that the kind of programming Haskell really excels at (creating abstract, correct interfaces for things) is just not a type of programming that interests me anymore. When I wanted to work on software as a career, a language that offered incredible facilities for not repeating yourself was very useful. Types that ensure correctness of data interchange, or lenses that allow access to complicated data structures, are all very well for implementing, say, a compiler, or complicated business logic in a web backend. However, my interests in software are now primarily as a scientific/mathematical tool. Numerical algorithms can be done in Haskell, but they don’t really gain much benefit from the type system, and they also don’t have great library support.

No doubt Haskell could be made into the kind of language to use for the problems that I am interested in, but given the choice between working on the problems that interest me, and working on infrastructure for the problems that interest me, I would rather work on the problems that interest me. The general feeling that I have is that Haskell is a great tool for a software engineer, but I don’t want to be a software engineer, I want to be a mathematician that sometimes uses computers.

But there is another reason too. While I think that Haskell is still a great language, it is close to 30 years old at this point. It manages to stay fresh and relevant with an ever-growing list of extensions, and constantly changing best practices and libraries (which is itself a problem…), but it would be very sad if we as a society of programmers had failed to surpass it in any respect with any of the programming languages that have had the advantage of starting from a clean slate. In this post, I want to talk about these successor languages, and what I think about them.

Rust

Surprisingly, one of Haskell’s great strengths is as a systems language. It manages to be much faster than most dynamic languages, while allowing a much higher-level interface than traditional systems languages like C (obviously). One great example of a “systems” program written in Haskell is git-annex. It is a git extension that adds tracking of large files, and was my primary backup system for a long time (I eventually decided that I didn’t need the additional power from it, and was better served by a more seamless solution).

However, in 2020, the premier systems language is surely Rust. It would be unfair to compare the performance of Rust and Haskell, because Haskell is optimized for things other than performance. That being said, Rust is faster and lower-latency than Haskell, both of which are important. It also has a great type system, unlike C or C++ (we don’t talk about Go…). The type system in Rust is obviously very influenced by the type system of Haskell, but Rust also implemented “ownership”, which enables its killer feature: garbage-collection-free automatic memory management.

When I first started using Rust, I really missed monads. But here’s the thing: having used lots of monads in Haskell, and read lots of blog posts about monads, I’ve learned that in systems contexts it’s often best to have a simple monad stack that consists of just Reader + IO (with Maybes and Options sprinkled about occasionally). Huge monad transformer stacks often raise more problems than they solve. But Reader + IO is essentially the “default monad stack” of Rust.
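To make that concrete, here is a minimal Rust sketch of the point (Env, greeting, and greet are hypothetical names for illustration, not from the post): the environment is threaded around as a shared reference, which is the Reader part, and effects are performed directly, which is the IO part.

```rust
// The "Reader" part: shared, read-only configuration.
struct Env {
    name: String,
    verbose: bool,
}

// A pure helper, analogous to code living in the Reader monad.
fn greeting(env: &Env) -> String {
    format!("hello, {}", env.name)
}

// The "IO" part: plain side effects, no transformer stack required.
fn greet(env: &Env) {
    if env.verbose {
        println!("{}", greeting(env));
    }
}

fn main() {
    let env = Env { name: "world".to_string(), verbose: true };
    greet(&env);
}
```

Every function that needs configuration just takes &Env; that is the whole “monad stack”.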

Rust also has some other killer features, like the ability to compile to WebAssembly (yes, there is GHCJS, but, really, do you want to use GHCJS?). It was also targeted at industry from the beginning, and consequently has a much more vibrant ecosystem.

This all being said, I think it is worth looking at the features that are prominent in Haskell that ended up going to Rust:

Typeclasses (in Rust they are Traits)

Sum types (you may take this for granted, but a lot of languages don’t have them….)

Pervasive pattern matching

Hindley-Milner type inference (automatic type inference for variables)

Pervasiveness of things being expressions rather than statements

Parametric Polymorphism

The feeling that once your program compiles, it will run
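Most of these features can be seen at once in a few lines of Rust; the Shape/Area toy below is my own illustration, not from the post.

```rust
// A sum type with a type parameter (parametric polymorphism).
enum Shape<T> {
    Circle(T),
    Rect(T, T),
}

// A typeclass, spelled "trait" in Rust.
trait Area {
    fn area(&self) -> f64;
}

impl Area for Shape<f64> {
    // Pervasive pattern matching, and `match` is an expression.
    fn area(&self) -> f64 {
        match self {
            Shape::Circle(r) => std::f64::consts::PI * r * r,
            Shape::Rect(w, h) => w * h,
        }
    }
}

fn main() {
    // Type inference: no annotation needed on `shapes`.
    let shapes = vec![Shape::Circle(1.0), Shape::Rect(2.0, 3.0)];
    let total: f64 = shapes.iter().map(|s| s.area()).sum();
    println!("{}", total);
}
```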

I think that we should recognize Rust for what it is, a child of Haskell and the Haskell community, and like all good parents, we should want it to do better than the previous generation. In as much as Haskell is the ideas that form Haskell, the success of Rust is the success of Haskell.

Idris

OK, mainstream programming languages are great, but sometimes you just want to make the perfect type-based interface to your stuff and show the imperative scrubs what a whiz-kid you are. Or alternatively, sometimes you really care that your software is correct. Or you want to concretize a new category-theory-inspired design for a part of a compiler. Nowadays, the language for that is not Haskell; it is Idris.

There are about six different ways to sort of have dependent types in Haskell (types that depend on values, like a length-n array). I don’t really fully understand any of them, and it is totally unclear to me how they work together. Presumably, there are blogs which outline the One True Way, but… it’s tough. In Idris, it just seems perfectly natural to use dependent types; like, why wouldn’t you be able to have a type parameter which was a value? In many ways other than dependent types, Idris is a much cleaner language than Haskell too. And with Idris 2, it has support for linear types, which allow mutability in a functional context via guarantees that nobody is going to try to use the old value. If I want to play around with a cool type system in a language that can also actually do things with the real world (i.e., unlike Agda or Coq), I would go to Idris rather than Haskell.

But Idris is undeniably Haskell’s child. The first version was written in Haskell (it is now self-hosting). They are similar in more ways than it is worth counting. Enough said.

Julia

Unlike the first two, Julia doesn’t really muscle into Haskell’s territory. Scientific computing was never really Haskell’s forte, despite there being some very cool libraries written in it, like ad for autodifferentiation, or various array-handling packages that automatically fused consecutive array operations.

Also, Julia is a dynamically typed language. How could a filthy dynamically typed language ever claim to be Haskell’s child??

Well, for one, it steals some of those cool libraries and makes them much better! Flux is a neural-network library which is essentially just autodifferentiation plus some nice utilities, and in my mind it is already competitive with TensorFlow. Julia also has StaticArrays, which integrates the size of the array into the type, and Julia has some neat fusion abilities too for making array operations really fast.

But wait, you ask, how can it do this if it’s not a statically typed language? Well, Julia is not your average dynamically typed language. It actually has a very interesting type system, a full discussion of which is beyond the scope of this post, and the focus on types as the unit of programming is (somewhat?) similar to Haskell (now I’m stretching it a little though).

The real reason I include Julia, however, is because for me personally, it has replaced Haskell as the place to do category theory. This is because of a shift of viewpoint: rather than providing a type system into which category theory can be embedded to guide typical software engineering tasks, Julia provides a system in which computations in category theory can be carried out in an efficient way. Specifically, I’m talking about Catlab.jl. A discussion of Catlab.jl is also beyond the scope of this post, but I encourage you to check it out.

Therefore, I count Julia as a child of Haskell (or maybe, I count Catlab.jl as a child of Haskell) because the idea of organizing computation with category theory would not exist in the same way if it weren’t for Haskell.

Conclusion

If I could talk to the Haskell-obsessed teenager that was me four years ago, I would tell him to keep his mind open. Haskell is still great for a lot of things (compilers come to mind), but if Haskell had not been able to inspire superior successors, that would have meant there were no worthwhile ideas in it to begin with. There are those on the internet who are talking about how Haskell is dying, and they may or may not be wrong. Stephen Diehl, one of my main Haskell idols, is distancing himself from the Haskell community because of Haskell’s use as intellectual eye-candy on scam cryptocurrencies, and I think that there may be a tipping point where Haskell loses the zeitgeist of being exciting and, because it never had much of a foothold in industry to begin with, slips into irrelevance. But Haskell will always live on; it had a huge impact on many programmers and many languages, disproportionate to its actual use, and it will always have a special place in my heart.


Owen's Blog -- Defund Some of the Police
https://owenlynch.org/posts/2020-06-13-defund-some-of-the-police (2020-06-13)

The hot topic nowadays is defunding the police. I see some people talking about how it will stop the school-to-prison pipeline and some people saying that the reduction in crime from aggressive policing has been a huge benefit to the black community. As I have no lived experience or empirical data, I really have no idea what defunding the police would do. And it’s unclear to me why people writing angry thinkpieces on either side could be anywhere close to certain that they understand society enough to predict what smashing a large part of it would do.

However, I do see something that has been left unsaid. Why don’t we *experiment*? Instead of demanding that all police departments be defunded, why don’t we say that we randomly pick 10% of police departments to be defunded, and check back five years later? Or 10% fully defunded, 10% half defunded? If the Black Lives Matter protesters are correct that police defunding will improve outcomes for black communities, then after five years we will have the data to show this. If the Black Lives Matter protesters are wrong, and the 10% of communities with defunded police departments turn out to be really bad after five years, then we will have avoided making a huge mistake. Either outcome is better than the worst-case scenario if we defund all the police, or the worst-case scenario if we defund none of them. Additionally, we will have five years of experience in managing policeless communities, so that if we want to defund more communities, we can do it better.

I also know that when there is big systemic change that doesn’t work, minorities are the ones who pay the price. All of my white friends who are clamoring for police defunding, they will not be the ones who are hurt the most if drugs become much easier to sell, if businesses in poor neighborhoods have to close because of shoplifting, if cop murders are replaced by regular murders.

And finally, the silent majority of people who are not active on social media have many doubts about police defunding. People of all colors have friends and family in the police, and personally know good, upright police officers. The battle for full police defunding seems unwinnable at this point, and because of this, the demands for it on social media seem more like performative wokeness than well-founded activism.

Right now, Black Lives Matter has momentum and has power. I think that a randomized controlled trial of police defunding is an attainable goal in 2020, and if we do it right, it could make a hugely positive difference in the lives of minority communities.

Please contact me with thoughts and feelings.


Owen's Blog -- R Love Affair
https://owenlynch.org/posts/2020-03-22-r-love-affair (2020-03-22)

This is going to be a pretty short post, just a life update on what technologies I like. Turns out, well-supported, old languages are really nice to use. R with the tidyverse is wonderful.

The tidyverse is made by people with opinions, who care about making the entire workflow nice rather than just giving users power. It seems like RStudio has solid finances, and this has allowed it to pour serious resources into optimizing the experience.

They also have free hosting for reproducible analyses! Check out this little R Markdown document that I wrote, which computes the last date that the S&P 500 was lower than its current price: contrafunctor.shinyapps.io/market_backward_time.


Owen's Blog -- Mathematical Wool
https://owenlynch.org/posts/2020-03-21-mathematical-wool (2020-03-21)

In Scott Alexander’s Unsong, there is a theory about the creation of the universe that goes something like this (some discursive paragraphs removed):

“Today I will expound unto you the kabbalistic theory of the creation of the world,” said Ana. “It all starts with Leibniz…”

“See, there’s this idea called divine simplicity. People keep asking, okay, so God created the Universe, but who created God? The answer is that God doesn’t need creating. He’s perfectly simple. He’s just a natural thing for there to be. People act like you need God to explain why the universe isn’t just nothing. But why should the Universe be nothing? Why shouldn’t it be, I don’t know, a piece of bread? The only reason people think ‘nothing’ needs no explanation, but a piece of bread does need an explanation, is that nothing is simpler than bread. Well, God is just as simple as nothing. So there.”

“How is this Leibniz?” asked Eli Foss.

“I’m getting to Leibniz! Right now we’re at information theory. A well-defined mathematical explanation of simplicity. We can measure the complexity of a concept in bits. The number of binary digits it would take to specify the concept in some reasonable encoding system. We can do it with numbers. The numbers 0 and 1 are one bit. Two is 10, three is 11; those are two bits. Four is 100, five is 101, six is 110, seven is 111; so three bits. And so on. We can do it with computer programs; just count how many bits and bytes they take up on a computer. We can do it with images if you can get them into a format like .gif or .jpg. And we can do it with material objects. All you have to do is figure out how long it would take to write a program that specifies a description of the material object to the right level of complexity. There are already weather simulators. However many bits the most efficient one of those is, that’s how complex the weather is.”

“And God?” asked Zoe Farr.

“God is one bit. The bit ‘1’”.

“I find that…counterintuitive,” was the best Zoe could answer.

“Well, it’s easy to represent nothingness. That’s just the bit ‘0’. God is the opposite of that. Complete fullness. Perfection in every respect. This kind of stuff is beyond space – our modern theories of space take a bunch of bits to specify – but if it helps, imagine God as being space filled with the maximum amount of power and intelligence and goodness and everything else that it can hold, stretching on to infinity.”

“Leibniz was studying the I Ching, and he noticed that its yin and yang sticks, when arranged in hexagrams, corresponded to a new form of arithmetic, because he was Leibniz and of course he noticed that. So he invented binary numbers and wrote a letter to the Duke of Brunswick saying that he had explained how God could create the universe out of nothing. It goes like this. You’ve got God, who is 1. You’ve got nothingness, which is 0. And that’s all you need to create everything. 1s and 0s arranged in a long enough string.”

“The kabbalistic conception is that God withdrew from Himself to create the world. I, for example, am beautiful and intelligent, but not so physically strong. God is perfectly beautiful and intelligent and strong, so by withdrawing a little bit of His beauty and intelligence, and a lot of His strength, and some other things, we end up with an Ana.”

“How did God decide which 1s to change to 0s?” asked Erica.

“And there’s the rub,” said Ana. “To change any 1s to 0s at all is making the world worse. Less Godly. Creation was taking something that was already perfect – divinity – and making it worse for no reason. A wise woman once said that those who ask how a perfect God create a universe filled with so much that is evil miss a greater conundrum – why would a perfect God create a universe at all?”

In math there is a similar conundrum. Leopold Kronecker once said that “God created the natural numbers, all else is the work of man.”


Owen's Blog -- Greek Phalinks
https://owenlynch.org/posts/2020-02-28-february-links (2020-02-28)

And we continue the grand tradition of the links post (see the beginning of my previous links post January Links). Another tradition in links posts is to have your title be a pun on “link” or “URL”. This comes from Scott Alexander. I’m not as good at puns as he is, so I’m just going to scrape the bottom of the barrel, but at least it will be something.

I wish that the discourse on Socialism moved forward to thoroughly addressing what went wrong in every single socialist attempt. I think there are still arguments for Socialism, because the alternative is Capitalism, but we have to figure out what went wrong before we can fix it. For this reason, I highly recommend this bibliography: Best Books on the Folly of Socialism.

There is a difference between learning a programming language and learning how to use that programming language. A programming language is embedded in a rich ecosystem that is learned socially, and is hard to pick up from the standard documentation. Stephen Diehl has done an admirable job trying to give an introduction to this ecosystem for Haskell in What I Wish I Knew When Learning Haskell, which he has recently updated. Warning: this is a lot of material. Don’t look at this as “everything that you have to know before you learn Haskell”; look at it as “accumulated wisdom for lots of specific domains that you can reference instead of rederiving it yourself.”

Related: ormolu is a nice formatter for Haskell code. Formatting, and indenting Haskell can be finicky; this makes the decision for you.

One of my pet issues is planned obsolescence, so I am happy to see that there is a group in France that is combatting it. They successfully sued Apple for releasing software upgrades that slowed down older iPhones. Halte Obsolescence. Warning: French.

Alice Maz is a fascinating individual who writes fascinating essays, and now is part of a fascinating company that does systems biology, a fascinating subject that is highly related to what I want to do with math.

My friend stumbled across an interesting set of parameters for EZ-Petri, which make it blow up. If anyone wants to figure out why, please be my guest.

This is a really interesting software library: you write graphics and it automatically figures out how to let people drag the graphics to change the parameters.

Jason Collins does a great job of explaining why just looking at the average for something can be very misleading: Ergodicity Economics: A Primer. This is very connected to mean extinction times.

Sometimes I have a vague worry that “hard problems” are just candy for mathematically inclined people, and the greatest good to do in the world is not found in hard problems, but in overlooked, boring problems. Ben Kuhn turns this vague worry into a real worry: You don’t need to work on hard problems.

I’m starting to use Roam to take notes. It’s very neat: you essentially build your own little wikipedia, and the focus is on interlinking your knowledge so that you can make new connections between things. Actually, I lied, I don’t use Roam, I use the Emacs version, org-roam.

And on a personal note, I just found my fountain pen again, which had been lost for three weeks in the couch cushions. I’m so happy about this that I’m going to link to it, because it’s really a fantastic pen and not terribly expensive. Kaweco PERKEO Füllhalter All Black. Warning: German. To be perfectly honest, I started taking notes on my computer because I couldn’t bear to take notes on a regular pen after I got used to this pen. So I might move back to taking notes on paper now.


Owen's Blog -- Markov Processes for Category Theorists
https://owenlynch.org/posts/2020-02-27-markov-functors (2020-02-27)

Recall the definition of an affine space over a ring A: an affine space is a space X equipped with the ability to take “affine combinations” of points in X, i.e. a_1 x_1 + \ldots + a_n x_n where a_1 + \ldots + a_n = 1.

We can similarly define an “affine space” over a rig (ring without negation). An “affine space” over the rig \mathbb{R}_{\geq 0} is precisely a convex space. Using this definition makes it obvious how to define “convex morphisms”: a convex morphism is one that preserves affine combinations over \mathbb{R}_{\geq 0}. Note that this is different from the typical definition of a “convex function”, which has

f(\lambda x + (1-\lambda)y) \leq \lambda f(x) + (1-\lambda)f(y)

However, this definition of “convex morphism” has

f(\lambda x + (1-\lambda)y) = \lambda f(x) + (1-\lambda)f(y)

Soon we will pick a different name for “convex morphism” that fixes this difficulty.

The most important convex spaces are the n-simplices:
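Concretely, the standard n-simplex consists of the nonnegative tuples that sum to one:

\Delta^n = \left\{ (p_0, \ldots, p_n) \in \mathbb{R}^{n+1} \;:\; p_i \geq 0,\; p_0 + \ldots + p_n = 1 \right\}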

These are important for many reasons, one of which is that an element of \Delta^n represents a probability mass function on n+1 elements. More generally, the set of (probability) measures on any space X with sigma-algebra \Omega forms a convex space. What has previously been called a stochastic map is precisely a convex map from a probability simplex to itself.
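For example (this worked example is mine, not from the post), a convex map f : \Delta^1 \to \Delta^1 is determined by where it sends the two vertices; writing those images as the columns of a matrix recovers the familiar notion of a column-stochastic matrix:

f(p) = \begin{pmatrix} 0.9 & 0.5 \\ 0.1 & 0.5 \end{pmatrix} p

Each column lies in \Delta^1, so each column sums to 1, and f sends probability vectors to probability vectors; preserving affine combinations over \mathbb{R}_{\geq 0} is exactly linearity restricted to the simplex.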

I propose that we should generalize the concept of stochastic map to cases where the domain and codomain are not a probability simplex. Namely, I propose that we define a stochastic map to be what we previously called a “convex morphism”.

We then can consider the category of convex spaces and stochastic maps. Call this category \mathbf{Cvx}.


Owen's Blog -- The Combinatorics of Composition
https://owenlynch.org/posts/2020-02-17-combinatorics-of-composition (2020-02-17)

Suppose you are an electronic engineer, and you want to connect two integrated circuits. These are doodads that look something like:

You can think of an integrated circuit as a spider which senses things through its legs. To make two integrated circuits talk to each other, you connect their legs. Then you can think of the two integrated circuits that are connected to each other as a single integrated circuit that can then communicate with other integrated circuits. In fact, inside the black spidery body of the integrated circuit there are many many subcircuits all connected to each other.

Anyways, the point is that we build electronic circuits by joining together smaller electronic circuits. This is uncontroversial.

What is slightly more controversial is that if we want a reasonable theory of electronic circuits, this compositional behavior must be taken into account. In other words, we want to derive the behavior of an electronic circuit from the behavior of its parts and the interactions between those parts.

If you have learned about electronic circuits in a physics class, then you are probably used to thinking about an electronic circuit as a collection of resistors, inductors, and capacitors wired together. But in the real world, electronic circuits are big complicated beasts, and if you try to analyze them by thinking of the entire circuit as a big network of resistors, inductors, and capacitors, you will have a bad time. In other words, for practical reasons we want to be able to analyze parts of a circuit separately, and then put them together afterwards.

For instance, consider a chain of guitar effects pedals. Each effect is composed of many tiny circuits all working together. But the effect as a whole can be understood as, for instance, cutting out high frequencies from the signal, or just amplifying the signal (which seems simple, but is actually somewhat hard to do without distortion, and requires a more complex design than you would think). Complex circuits can have simple behavior, and so it is useful to derive that simple behavior and then forget about the internal details.

Guitar effects pedals are somewhat special, however, because on a guitar effects pedal there is a clearly marked “input” port and a clearly marked “output” port. Given two guitar effects pedals, there is an obvious way to compose them; connect the input port of one to the output port of the other. All you have to choose is which effect comes first.

However, if I were to give you two arbitrary electronic circuits and say “compose these”, you wouldn’t have the slightest idea how to do so. There are a lot of different ways to compose two circuits.

In this post, we are going to discuss a mathematical construction called a “cospan” (pronounced koh-span) that solves this problem. Cospans allow the data of “input” ports and “output” ports to be attached to an electronic circuit, and at the same time give a good account of how to fuse two electronic circuits together. In other words, cospans allow us to go from messy circuits that electronic engineers struggle with to plug and play effects pedals that musicians use without thinking about it.

Graph Composition

The fundamental difficulty in composing electronic circuits comes not from the electronics, but from the shape. Therefore, before we tackle the problems of resistors, inductors, and capacitors, we are first just going to talk about composing graphs.

Recall that a graph G is comprised of a set of nodes N, a set of edges E, a function s : E \to N which assigns a source to each edge, and a function t : E \to N which assigns a target to each edge. In this blog post, we allow multiple edges between nodes, and we also allow edges from a node to itself. A graph homomorphism f between G_1 = (N_1,E_1,s_1,t_1) and G_2 = (N_2,E_2,s_2,t_2) is a pair of functions f_N : N_1 \to N_2 and f_E : E_1 \to E_2 such that the following diagram commutes (that is, f_N \circ s_1 = s_2 \circ f_E and f_N \circ t_1 = t_2 \circ f_E):
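To make the data concrete, here is a minimal Python sketch of graphs and the homomorphism condition (the representation and all names are mine, not from the post):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Graph:
    nodes: frozenset  # N
    edges: frozenset  # E
    s: dict           # source map E -> N
    t: dict           # target map E -> N

def is_homomorphism(f_N, f_E, G1, G2):
    """Check that f_N . s1 = s2 . f_E and f_N . t1 = t2 . f_E."""
    return all(f_N[G1.s[e]] == G2.s[f_E[e]] and
               f_N[G1.t[e]] == G2.t[f_E[e]] for e in G1.edges)

# A single node with a loop maps into any graph that has a loop somewhere:
G1 = Graph(frozenset({'x'}), frozenset({'l'}), {'l': 'x'}, {'l': 'x'})
G2 = Graph(frozenset({0, 1}), frozenset({'a', 'loop'}),
           {'a': 0, 'loop': 1}, {'a': 1, 'loop': 1})
print(is_homomorphism({'x': 1}, {'l': 'loop'}, G1, G2))  # True
```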

Now, to compose two graphs, one simple solution would be to just join two vertices with an edge. Then the data required to specify inputs and outputs could just be restricted to a choice of input node and a choice of output node.

But we want a finer notion of composability. Generally, electronic circuits need to be composed along multiple wires; for example, a USB port has four wires (if you are reading this on a device with a USB port, you can look and verify this!). To get this finer notion, we will compose by gluing together subgraphs. By this I mean that we designate part of the graph to be the “input” and part of the graph to be the “output”. Then we compose the two graphs by gluing together the output subgraph of one graph to the input subgraph of another graph, as in the following picture.

This is fairly intuitive; the question is how do we model this formally? One way would be to just say that the graph is equipped with two subgraphs for input and output. There are a couple of problems with this, however. First of all, even if the input subgraph of one graph is isomorphic to the output subgraph of another, there may be many choices of this isomorphism. The concrete instantiation of this is that we need some way of remembering which wire is which in a USB cable. The other problem is that in category theory it is bad style to talk about the “inside” of an object. Instead, it is better to talk about how the object relates to other objects. This is more a problem of aesthetics; however, it turns out that by following our nose and having good aesthetics, we can fix the first problem as well.

The solution is that we denote the input with a map i : X \to G which is a monomorphism (injective on nodes and edges). For instance, in the diagram above, the subgraph in green is X, and the map i is the inclusion map. Similarly, we also have a monomorphism o : Y \to G that denotes the output (which is red in the figure). So the formal definition of a graph with “inputs and outputs” is a diagram that looks like this

We call this a “cospan”, and we call X and Y the “feet” of the cospan. Now the problem of choosing an identification between input and output goes away: we can only compose cospans that share a common foot.

How can we use this to glue together the graphs? Let’s first simplify the question to the case where G,H,X,Y,Z are all just sets. The setup is two sets G and H, along with monomorphisms (injections) from Y to G and from Y to H.

We then form the disjoint union of G and H. This looks like the following:

The problem with G \sqcup H is that each point in Y gets sent to two different points in G \sqcup H. We want to make these points the same point. The technical term for this (I kid you not) is gluing together the images of Y. The result of this is:
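For finite sets, this gluing can be computed directly with a union-find structure. The following Python sketch is my own representation (elements tagged by which set they came from, with the maps f and g given as dictionaries):

```python
def pushout(Y, G, H, f, g):
    """Pushout of sets G <-f- Y -g-> H: the disjoint union of G and H
    with f(y) and g(y) glued together for each y in Y.
    Elements are tagged ('G', x) or ('H', x); returns a map sending each
    tagged element to a representative of its equivalence class."""
    parent = {('G', x): ('G', x) for x in G}
    parent.update({('H', x): ('H', x) for x in H})

    def find(a):  # union-find with path compression
        while parent[a] != a:
            parent[a] = parent[parent[a]]
            a = parent[a]
        return a

    for y in Y:  # glue the two images of each y
        parent[find(('G', f[y]))] = find(('H', g[y]))

    return {a: find(a) for a in parent}

# Y = {y}, injected into G = {1, 2} at 1 and into H = {'a', 'b'} at 'a':
q = pushout({'y'}, {1, 2}, {'a', 'b'}, {'y': 1}, {'y': 'a'})
# ('G', 1) and ('H', 'a') now share a representative; 3 classes remain.
print(len(set(q.values())))  # 3
```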

We call G \sqcup_Y H the pushout. From now on, I’m going to move back to an abstract notation, because it’s pretty time consuming to draw those big diagrams, and you should practice thinking about that kind of diagram when you see a diagram like this:

The important thing to note is that you can follow the arrows from Y and whichever way you go you will end up in the same place.

The pushout satisfies a universal property. Namely if there is some set K and maps j_G' : G \to K and j_H' : H \to K such that the same thing happens (whichever way you go from Y you end up in the same place), then there exists a unique map from G \sqcup_Y H to K such that any way you go in the following diagram you end up in the same place.

With this more abstract definition, it is possible to generalize pushouts from sets to graphs. Namely, the pushout G \sqcup_Y H is the object in the category of graphs that satisfies the above universal property. The intuition for how this works should just be “gluing together along Y”, but now you know the formal definition as well. You should work out a concrete definition for the pushout in the category of graphs that is akin to the concrete definition I gave in the category of sets.

Anyways, we can use this to compose two cospans. First we take the pushout with respect to the “foot” that the cospans have in common (in this case, Y), and then we compose the pushout maps with the maps from the “outer feet” (in this case X and Z) to get our new input and output maps, j_G \circ i_G and j_H \circ o_H.

The Bigger Picture

So far, I’ve just been talking about composing two specific graphs. However, cospans allow us to do much more than that. First of all, we can create a category in which the objects are graphs, and a morphism between G and H is a cospan with G and H as its feet. Then composition works as above. Identity morphisms are what you would expect: the cospan G \xrightarrow{\mathrm{id}} G \xleftarrow{\mathrm{id}} G,

and it is easy to check that this is the identity for the above composition.

However, there is a slight problem, which is that pushouts are only defined up to isomorphism. To get around this, we want to say that really the morphisms are isomorphism classes of cospans. However, I haven’t given you a good definition for morphism between cospans. It turns out that there is a natural way of defining this: a morphism between X \to G \leftarrow Y and X \to G' \leftarrow Y is a commutative diagram of the form

This points to a deeper structure on the category of cospans: it is in fact a 2-category, which is a category with 2-morphisms between its morphisms. The 2-category that you might be most familiar with is the category of categories. This has functors as morphisms and natural transformations as 2-morphisms. However, for now we are just going to discuss the category that has isomorphism classes of cospans as its morphisms.

You’ve probably noticed at this point that we’ve moved pretty far away from graphs; the only thing that we have used about the category of graphs is that it has pushouts. It turns out cospans are useful for much, much more than just composing graphs. Here is an example. Consider cospans in the category of real vector spaces and linear maps. If the “meaning” of a cospan in the category of graphs was a graph annotated with an input and an output, then what is the “meaning” of a cospan in the category of real vector spaces?

I would like to convince you that the “meaning” of a cospan in the category of real vector spaces is a “linear relation”. Consider a cospan f : V \to A \leftarrow W : g. We can use this to define a relation R_A:

v \mathrel{R_{A}} w \quad \text{if and only if} \quad f(v) = g(w)

However, this relation has some structure on it: if v_1 \mathrel{R_{A}} w_1 and v_2 \mathrel{R_{A}} w_2, then (a_1 v_1 + a_2 v_2) \mathrel{R_{A}} (a_1 w_1 + a_2 w_2). This means that \mathrel{R_{A}} is a linear relation.
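As a quick sanity check, here is a toy Python example (the matrices and vectors are arbitrary choices of mine) that tests the condition f(v) = g(w) and its closure under addition, using exact rational arithmetic:

```python
from fractions import Fraction as F

# A toy cospan f : V -> A <- W : g with V = W = A = Q^2,
# where f and g are given as matrices acting on column vectors.
f = [[F(1), F(0)], [F(0), F(1)]]   # the identity map
g = [[F(2), F(0)], [F(0), F(3)]]   # a scaling map

def apply(M, v):
    """Matrix-vector product over the rationals."""
    return tuple(sum(M[i][j] * v[j] for j in range(len(v)))
                 for i in range(len(M)))

def related(v, w):
    """v R_A w  if and only if  f(v) = g(w)."""
    return apply(f, v) == apply(g, w)

v1, w1 = (F(2), F(3)), (F(1), F(1))   # f(v1) = (2, 3) = g(w1)
v2, w2 = (F(4), F(6)), (F(2), F(2))
print(related(v1, w1), related(v2, w2))           # True True
# Linearity: the sum of two related pairs is again related.
print(related(tuple(a + b for a, b in zip(v1, v2)),
              tuple(a + b for a, b in zip(w1, w2))))  # True
```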

What does composition mean for two relations presented in this way? Well, we can derive it. Recall the cospan composition diagram:

Now, by definition of A \sqcup_Y B, j_A(i_A(x)) = j_B(o_B(z)) if and only if there exists some y \in Y such that o_A(y) = i_A(x) and i_B(y) = o_B(z). Therefore, x \mathrel{R_{A \sqcup_Y B}} z if and only if there exists some y such that x \mathrel{R_{A}} y and y \mathrel{R_{B}} z.

Why do we care about linear relations? Well, one reason is that steady-state solutions to certain circuit diagrams can be modelled by vectors of voltages. Therefore, a circuit gives a relation between voltages of its input and voltages of its output. Moreover, this relation is in fact linear! We can add together two solutions of the circuit laws, and we get another solution!

In light of this, the law for composition of relations makes a lot of sense: x is compatible with z if and only if there is some intermediate y that is compatible with both!

Problems

I’m going to wrap up this post now, because I want to keep it around this length, but I’m going to tell you some things that should nag at you and make you worry and then make you anxious to read my next post.

First of all, I never gave a full description of electronic circuits. The problem here is that we want the “feet” to be a particular kind of simple circuit, namely a collection of single nodes (think about a USB cable: there are just 4 nodes that you connect with another set of 4 nodes). However, the apex of the cospan might be very complicated. We need a way of addressing this.

Also, we would like to provide semantics for electronic circuits: we want a description of how electricity flows through them in such a way that we can take the description for an overdrive pedal, and take the description for a distortion pedal and get a description of the combined effects of overdrive and distortion. We started to get at that with the category of linear relations, but we want to also consider dynamics.

Finally, we want to also be able to compose circuits not just in series but also in parallel. Spoiler alert: this involves string diagrams.

For the answers to these questions and more, tune in in a week or two!


Extinction Times and an Important Lesson (2020-02-12, https://owenlynch.org/posts/2020-02-12-mean-extinction-times)

One feature of Petri nets that I’ve been harping on for a while now is their ability to describe more than the dynamics of the expected value. There are interesting questions that we can ask about the random variables underlying that stochastic process that are not answered by the dynamics of the expected value. For instance, in a decay model, the graph that we get for the expected value never exactly hits 0. However, in the stochastic model, for any given “run”, there is some point at which there are exactly 0 atoms.

In general, in any Petri net in which the only reactions producing a given species s require s as input, when the population of s goes to 0, it will stay at 0 permanently. So if over any period of time there is a chance of the population falling to 0, then eventually the population is guaranteed to go to 0, as long as the chance doesn’t decrease too fast.

For instance, in a predator-prey Petri net, the expected values oscillate forever. However, in an actual stochastic model, after a finite time either all the wolves die off, or all the sheep die off.

To investigate this mathematically, we let T be a random variable that represents the time at which the species s dies off. We want to compute the mean of T.
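For the simple decay model this mean can be computed directly: with k atoms left, the waiting time until the next decay is exponential with rate k\lambda, so the mean extinction time starting from n atoms is \sum_{k=1}^{n} 1/(k\lambda). A quick Monte Carlo sketch in Python (my own code, not from the post) confirms this:

```python
import random
from statistics import mean

def decay_extinction_time(n, lam, rng):
    """One run of the decay model (each atom decays at rate lam):
    with k atoms left, the next decay comes after an Exp(k*lam) wait."""
    t = 0.0
    for k in range(n, 0, -1):
        t += rng.expovariate(k * lam)
    return t

rng = random.Random(0)
n, lam = 10, 1.0
runs = [decay_extinction_time(n, lam, rng) for _ in range(20000)]
exact = sum(1.0 / (k * lam) for k in range(1, n + 1))  # harmonic-sum formula
print(round(exact, 3))  # 2.929
print(abs(mean(runs) - exact) < 0.1)  # the Monte Carlo estimate is close
```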

An Important Lesson

When I first tried to write this post, I was reading an article that had a section on mean extinction times, and I was trying to write about how to solve some equations to calculate the mean extinction time. However, it turns out that we are only able to solve these equations in very special cases, namely in Petri nets with one or two species. Additionally, the amount of math needed to understand the equations is very high, and I ended up writing an absurd amount of background before getting anywhere near mean extinction times. I will write a high-level post on this at some point, but that time is not now.

Moreover, one of the purposes of learning about mean extinction times for me was to add a calculator for mean extinction times on EZ Petri. None of the methods in the article were anywhere close to general enough to be able to create such a calculator.

Fortunately, one other part of the article described an efficient and accurate simulation algorithm for sampling paths from the Markov process of a Petri net (as opposed to my algorithm in “Petri Nets for Computer Scientists”, which is neither efficient nor totally accurate, though it is perhaps more intuitive). Armed with this algorithm, we don’t need to solve any equations. We can just run the algorithm a hundred times, or a thousand times or a million times, and this gives us the data to compute the mean and standard deviation of the extinction time. This is very accurate, and quite fast. I wrote the algorithm in about two hours, and I can run the Lotka-Volterra model to extinction 10,000 times in about two minutes.

The only objection is that this feels like I’m not really solving any equations; I’m giving up on understanding the solution. However, the hard fact of life is that even if I had ended up with an equation to solve, I probably would have solved it numerically, which is what I did for the master equation anyways. From an aesthetic perspective, if I’m already just going to solve something numerically, it’s not so different to solve it with a randomized numerical method.

So, today I learned an important lesson that you should take deeply to heart. The lesson is that if you are trying to solve a problem involving random variables, then instead of solving it as a math problem you can just simulate it on a computer. OK, back to Petri nets.

A (Sketch of an Accurate and Fast) Simulation Algorithm

Recall my previous simulation algorithm for Petri nets in “Petri Nets for Computer Scientists”. Essentially the idea there was to take a continuous process and simulate it in fixed time steps.

The essential difference between this previous algorithm and the algorithm I am about to present is that for the new algorithm, rather than using a fixed time step it instead uses a random variable to govern the time between state changes.

You can think about this in the following way. In the previous algorithm, if you were in a given state, then each time step you roll a die (with any number of sides), and depending on the result of the die, you either move to a different state or stay in the same state. Whenever you are in a given state, you roll the same die.

Now, the die has two types of faces: faces which make you stay in the same state, and faces which make you move to a different state. Therefore, we can have the same effect as the die by having a coin where heads you move to some new state, and tails you go nowhere, and then a new die that you roll after the coin flip to see which new state you end up in.

Notice that you only need to roll the new die if you get heads on the coin. So essentially what happens is that you keep flipping the coin until you get heads, and then you roll a die to see where you go next. The only information that you get from the coin is the number of tails. Therefore, we could replace the coin with some random number generator that tells you how many tails you got.

Remember that the earlier algorithm got more and more accurate as the time step interval got shorter and shorter. In our new scheme, where at each state we wait for a given number of periods and then jump to a new state, the time step interval getting shorter and shorter only changes the granularity of the time that we wait; it doesn’t change the probabilities of moving to another state after we have waited that amount of time. Therefore, we can just take the time step interval to be infinitely small, and then the time that we wait between jumps is a real number.
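This limiting argument can be checked numerically: the probability of still waiting at time t, when each step of length \Delta t flips a coin with heads-probability \lambda \Delta t, is (1 - \lambda \Delta t)^{t/\Delta t}, which tends to e^{-\lambda t} as \Delta t \to 0. A small Python check (the example values of \lambda and t are mine):

```python
import math

lam, t = 2.0, 1.5  # rate and elapsed time (arbitrary example values)

def survival_fixed_step(dt):
    """P(still waiting at time t) when each step of length dt flips a
    coin with heads-probability lam*dt: a geometric number of tails."""
    steps = round(t / dt)
    return (1.0 - lam * dt) ** steps

for dt in (0.1, 0.01, 0.001):
    print(f"dt={dt}: {survival_fixed_step(dt):.5f}")
print(f"limit e^(-lam*t) = {math.exp(-lam * t):.5f}")
```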

So our new algorithm is the following:

1. Initialize the variable S to our first state.

2. Wait an amount of time controlled by a random variable that depends on S.

3. Jump to a new state with probabilities again depending on S, and go to step 2.
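Here is one way these steps might look in Python. The representation of H as a dictionary of off-diagonal rates is my own choice, not something from the post:

```python
import random

def simulate(H, x0, t_max, rng=random):
    """Simulate a continuous-time Markov process.
    H[x] maps each state y != x to the rate of jumping from x to y
    (the off-diagonal entries H_{xy}); the diagonal entry is implied.
    Yields (time, state) pairs until t_max or an absorbing state."""
    t, x = 0.0, x0
    yield t, x
    while t < t_max:
        rates = H.get(x, {})
        total = sum(rates.values())      # this is -H_{xx}
        if total == 0:                   # absorbing state: stop
            return
        t += rng.expovariate(total)      # step 2: exponential waiting time
        ys = list(rates)                 # step 3: jump to y with
        x = rng.choices(ys, weights=[rates[y] for y in ys])[0]  # prob H_{xy}/total
        yield t, x

# A small birth-death chain on {0, 1, 2} where 0 is absorbing:
H = {1: {0: 1.0, 2: 0.5}, 2: {1: 1.0}}
path = list(simulate(H, 2, 100.0, random.Random(1)))
print(path[-1])  # final (time, state); absorption at 0 is overwhelmingly likely
```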

All that remains is to figure out what the random variables in steps 2 and 3 are. To do this, we need to do some more difficult math, so if you want to get off the blog train, now is the time.

The Nitty Gritty

Recall that in a Markov process, the probability vector over states evolves according to a differential equation of the form

\frac{\partial}{\partial t} p(x,t|x_0,t_0) = \sum_{y} H_{yx} \, p(y,t|x_0,t_0)

where x ranges over the state space of the Markov process, and p(x,t|x_0,t_0) is the probability of being in state x at time t given that the system was in state x_0 at time t_0. We call this equation the master equation. We also fix the initial condition p(x_0,t_0|x_0,t_0) = 1.

The interpretation of the above equation is that for y \neq x, H_{yx} (which is positive) is the rate at which probability flows from state y to state x, and H_{xx} (which is negative) is the rate at which probability flows out of state x. In order for probability to be conserved, we require that \sum_{y} H_{xy} = 0 for each state x.

Now, suppose that the process is in the state x_0 at time 0. Let T be a random variable representing the time at which the process leaves state x_0. Intuitively, T is exponentially distributed because it is memoryless. To see this in more detail, consider a Markov process derived from the original. This new Markov process has two states, s_0 and s_1. As long as the first Markov process is in x_0, the second Markov process is in state s_0. Once the first Markov process moves anywhere else, the second Markov process moves to state s_1 and stays there forever. Denote by p^{x_0}(s,t|s_0,t_0) the conditional probability for this Markov process, and denote by H^{x_0} the matrix of probability flows. Then by inspection we can see that

H^{x_0}_{s_0 s_0} = H_{x_0 x_0}, \quad H^{x_0}_{s_0 s_1} = -H_{x_0 x_0}, \quad H^{x_0}_{s_1 s_0} = H^{x_0}_{s_1 s_1} = 0

Solving the master equation for this new Markov process shows that the probability of being in state s_0 at time t given that we were in state s_0 at time 0 is e^{H_{x_0x_0} t}. Therefore, the probability of being in state s_1 at time t is 1 - e^{H_{x_0x_0} t}. Now, note that the probability of being in state s_1 at time t is exactly the value of the cumulative distribution function for T at t. We recognize this as the cumulative distribution for an exponentially distributed random variable with rate \lambda = -H_{x_0x_0}.

To derive the probability mass function for the state that the Markov process jumps to after waiting an exponentially distributed amount of time, consider the conditional probabilities p(x_i,h|x_0,0). By definition of derivative,

p(x_i,h|x_0,0) - p(x_i,0|x_0,0) = h \, p(x_0,0|x_0,0) H_{x_0 x_i} + o(h)

Therefore,

p(x_i,h|x_0,0) = h H_{x_0 x_i} + o(h)

If we condition on the fact that X(h) \neq x_0 (where X(t) is the random variable for the Markov process at time t), then we get

\Pr(X(h) = x_i \mid X(h) \neq x_0) = \frac{h H_{x_0 x_i} + o(h)}{\sum_{x_j \neq x_0} \left( h H_{x_0 x_j} + o(h) \right)}

We can then take h \to 0 to get the probability mass function

\frac{H_{x_0x_i}}{\sum_{x_j \neq x_0} H_{x_0x_j}}

Up until now we have been working with a generic Markov process, and in fact this algorithm does work with a generic Markov process. However, if we want to run this algorithm for a Petri net, we must define our H_{yx} using the data of a Petri net (i.e., rates and transitions). This is not something you can “prove”, because it is part of the definition for the semantics of a Petri net. Therefore, all I can do is try to convince you that it is a reasonable definition. If you read my original post on Petri Nets, Petri Nets for Computer Scientists, then you should have the intuition for the following. Namely, if the transition t has rate r, goes from state x to state y, and has input vector n, then we have

H_{xy} = r x_1^{\underline{n_1}} \ldots x_s^{\underline{n_s}}

where a^{\underline{b}} is the “falling exponential” a(a-1)\ldots(a-b+1). This represents the number of input tuples for the transition, multiplied by the rate of the transition. If there are no transitions between x and y, then H_{xy} = 0. Finally, H_{xx} is, as always, the rate at which probability is leaving state x, so it is the negative of the sum of H_{xy} for y \neq x.
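A small Python sketch of this definition (the function names and the example numbers are mine):

```python
from math import prod

def falling(a, b):
    """Falling exponential a(a-1)...(a-b+1): counts ordered input tuples."""
    return prod(a - i for i in range(b))

def transition_rate(r, x, n):
    """H_{xy} for a transition with rate r and input vector n,
    fired from population state x (one count per species)."""
    return r * prod(falling(x_i, n_i) for x_i, n_i in zip(x, n))

# Lotka-Volterra predation, wolf + sheep -> 2 wolves at rate 0.01,
# from a state with 5 wolves and 20 sheep (input vector (1, 1)):
print(transition_rate(0.01, (5, 20), (1, 1)))  # 0.01 * 5 * 20 = 1.0
```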

If you want to see the implementation, it is up on Github.

Wrap-up

I have been experimenting with different parameters for the stochastic simulation algorithm, specifically for the Lotka-Volterra model. Some have very short extinction times, especially for small numbers of wolves/sheep, and some remain stable for a long time before eventually dying. It seems that changing the value of a parameter by a very small amount can drastically change the mean extinction time. In other words, the mean extinction time, even though it is just a number, has a very complex relationship to the parameters of a Petri net. It is little surprise then that there is not a clear way of calculating it for general Petri nets. In my post from last year, Illegible Science, I talk about sometimes having to give up on analytic solutions and become more reliant on computer models. This post should be read as additional evidence for this viewpoint: even in a case with such a simple model, there is no clear way to solve for basic properties apart from simulation.
