Gatlab.jl: symbolic computing with GATs

Owen Lynch

Topos Institute

Kris Brown

Topos Institute

James Fairbanks

University of Florida

Evan Patterson

Topos Institute

July 31, 2023

Outline

  1. Review of algebraic (Lawvere) theories
  2. Toy implementation
  3. Generalized algebraic theories
  4. Gatlab.jl

Algebraic theories

Figure 1: Algebraic theories: the baby version of generalized algebraic theories

Algebraic signatures

Definition 1 An algebraic signature consists of:

  • A set \(\mathcal{S}\) of sorts
  • A set \(\mathcal{F}\) of function symbols
  • A function \(\mathrm{arity} \colon \mathcal{F} \to \mathcal{S}^\ast \times \mathcal{S}\). We denote an element of \(\mathcal{S}^\ast \times \mathcal{S}\) by \((s_1,\ldots,s_m; s)\).

A single-sorted algebraic signature is an algebraic signature where \(\mathcal{S}\) has exactly one element, written \(\mathcal{S} = 1\).

Example 1 The signature for a monoid action has

  • \(\mathcal{S} = \{ M, X \}\)
  • \(\mathcal{F} = \{ \ast, e, \cdot \}\)
  • \(\mathrm{arity}(f) = \begin{cases} (M, M; M) & f = \ast \\ (; M) & f = e \\ (M, X; X) & f = \cdot \end{cases}\)

Algebraic contexts

Definition 2 A context in an algebraic signature with labels in \(\mathcal{N}\) is an element of \((\mathcal{N} \times \mathcal{S})^\ast\), i.e. a list of pairs of name and sort.

We typically write a context as \(x_1 : s_1, \ldots, x_n : s_n\).

Definition 3 A term of sort \(s \in \mathcal{S}\) in context \(x_1 : s_1, \ldots, x_n : s_n\) is inductively defined to either be:

  • \(\mathtt{var}(i)\) for \(i \in \{1,\ldots,n\}\) where \(s_i = s\)
  • \(\mathtt{ap}(f, t_1, \ldots, t_k)\), where \(f\) is a function symbol with arity \((\sigma_1,\ldots,\sigma_k; \sigma)\), each \(t_i\) is a term of sort \(\sigma_i\), and \(\sigma = s\).

Example 2 Consider the context \(g: M, h: M, x: X\). Then the term we might write informally as \((g \ast h) \cdot x\) would formally be

\[ \mathtt{ap}(\cdot, \mathtt{ap}(\ast, \mathtt{var}(1), \mathtt{var}(2)), \mathtt{var}(3)) \]

Algebraic theories

Definition 4 An equation in an algebraic signature is a pair of a context and two terms of the same sort in that context.

Definition 5 An algebraic theory consists of an algebraic signature along with a collection of equations in that signature.

Example 3 The theory of monoid actions consists of the signature of monoid actions, along with the equations:

\[ a \ast (b \ast c) = (a \ast b) \ast c \dashv [a: M, b: M, c: M] \]

\[ a \ast e = a \dashv [a : M] \]

\[ e \ast a = a \dashv [a : M] \]

\[ (a \ast b) \cdot x = a \cdot (b \cdot x) \dashv [a : M, b : M, x : X] \]

\[ e \cdot x = x \dashv [x : X] \]

Substitutions

Definition 6 If \(\Gamma = [x_1 : s_1, \ldots, x_n : s_n]\) and \(\Delta = [y_1 : t_1, \ldots, y_m : t_m]\) are both contexts, then a substitution \(\phi\) from \(\Gamma\) to \(\Delta\) is a term \(\phi(i)\) of sort \(s_i\) in context \(\Delta\) for each \(i \in \{1,\ldots,n\}\).

Example 4 There is a substitution in the theory of rings from \(a : R, b : R\) to the empty context, given by \(a \mapsto 3, b \mapsto 2\).

Proposition 1 Given a term \(t\) in a context \(\Gamma\) and a substitution \(\phi\) from \(\Gamma\) to \(\Delta\), we can define a term \(\phi^\ast(t)\) in context \(\Delta\) in such a way that

  • \(\phi^\ast(\mathtt{var}(i)) = \phi(i)\)
  • \(\phi^\ast(\mathtt{ap}(f, t_1, \ldots, t_k)) = \mathtt{ap}(f, \phi^\ast(t_1), \ldots, \phi^\ast(t_k))\)

Example 5 Applying the substitution from Example 4 to the term \(ab + b\) creates the term \((3)(2) + 2\).
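To see how Proposition 1 produces this result, write \(ab + b\) formally as \(\mathtt{ap}(+, \mathtt{ap}(\times, \mathtt{var}(1), \mathtt{var}(2)), \mathtt{var}(2))\). The symbols \(+\) and \(\times\) for the ring operations are assumed names, and the numerals \(3\) and \(2\) are read as abbreviations for closed terms. Under those assumptions the two clauses unfold as:

\[ \begin{aligned} \phi^\ast(\mathtt{ap}(+, \mathtt{ap}(\times, \mathtt{var}(1), \mathtt{var}(2)), \mathtt{var}(2))) &= \mathtt{ap}(+, \phi^\ast(\mathtt{ap}(\times, \mathtt{var}(1), \mathtt{var}(2))), \phi^\ast(\mathtt{var}(2))) \\ &= \mathtt{ap}(+, \mathtt{ap}(\times, \phi(1), \phi(2)), \phi(2)) \\ &= \mathtt{ap}(+, \mathtt{ap}(\times, 3, 2), 2) \end{aligned} \]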

Many concepts are algebraic theories

  • semigroup, monoid, group, abelian group
  • C-sets, i.e. copresheaves (only unary function symbols)
  • semigroup action, monoid action, group action, abelian group action
  • group action of a fixed group (single-sorted, with a function symbol for each group element)
  • rig, commutative rig, ring, commutative ring
  • rig with a module over it
  • rational/real/complex vector space (with a function symbol for each number)
  • rational/real/complex algebra

Lawvere theories

Definition 7 A Lawvere theory consists of

  • a set \(\mathcal{S}\)
  • a category \(T\) with finite products
  • a product-preserving, identity on objects functor \((\mathsf{FinSet}/\mathcal{S})^{\mathrm{op}} \to T\)

Theorem 1 Any algebraic signature \(\Sigma = (\mathcal{S}, \mathcal{F}, \mathrm{arity})\) induces a Lawvere theory \(\mathrm{Subst}(\Sigma)\) where the objects are contexts and the morphisms are reverse substitutions.

Theorem 2 Any algebraic theory \((\Sigma, E)\) induces a Lawvere theory \(\mathrm{Subst}(\Sigma, E)\) given by minimally quotienting (via colimit) the Hom-sets of \(\mathrm{Subst}(\Sigma)\) so that the equations in \(E\) (viewed as morphisms into singleton contexts) hold.

Models

Definition 8 A model of a Lawvere theory \(T\) is a product-preserving functor \(T \to \mathsf{Set}\).

Implementing algebraic theories

What does it mean to implement algebraic theories on the computer? Many pieces.

  1. Data structures. Signatures, terms, equations, theories, substitutions, functors between theories, etc.
  2. Type checking/inference.
  3. Rewriting. Figure out when two terms can be reduced to the same term using the laws of the theory; i.e. implement the equivalence relation that we use to quotient \(\mathrm{Subst}(\Sigma)\) to produce \(\mathrm{Subst}(\Sigma, E)\).
  4. Models. It should be possible to make a model of a theory and to interpret terms in that model.
  5. Programmatic manipulation of theories. E.g., pushouts of theories for multiple inheritance, or generating the theory of \(n\)-truncated simplicial sets for any \(n\).
  6. Programmatic manipulation of models. E.g., pulling back a model along a functor between theories.
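For one specific theory, item 3 can be made concrete without any general machinery. The sketch below decides equality of terms in the theory of monoids by flattening each term to the word of variables it contains. This works because the free monoid on the variables is the set of words. It is a special-purpose normal form for illustration, not the rewriting approach used in Gatlab. The plain structs and the names `normalform`, `equal_in_monoid`, `mul` and `unit` are hypothetical, and the constructor indices (1 for multiplication, 2 for the identity) are an assumed convention.

```julia
# Plain-struct terms (an assumption: stand-ins so the block runs on its own).
abstract type Term end
struct Var <: Term; i::Int; end
struct Ap <: Term; head::Int; args::Vector{Term}; end

# Assumed constructor indices: 1 = multiplication, 2 = identity element.
mul(s::Term, t::Term) = Ap(1, Term[s, t])
unit() = Ap(2, Term[])

# Flatten a well-typed monoid term to the list of variable indices it contains,
# read left to right. The identity has no arguments, so it contributes nothing.
normalform(t::Var) = [t.i]
normalform(t::Ap) = reduce(vcat, map(normalform, t.args); init=Int[])

# Two monoid terms are provably equal exactly when their words coincide.
equal_in_monoid(s::Term, t::Term) = normalform(s) == normalform(t)

a, b, c = Var(1), Var(2), Var(3)
equal_in_monoid(mul(mul(a, b), c), mul(a, mul(b, c)))  # associativity: true
equal_in_monoid(mul(a, b), mul(b, a))                  # not commutative: false
```

For theories without such a convenient normal form, a general technique is needed, which is where the e-graphs discussed later come in.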

Toy implementation (data structures)

using MLStyle # provides the @data and @match macros used below

struct Constructor
  name::Symbol
  argtypes::Vector{Int}
  rettype::Int
end

struct AlgebraicSignature
  sorts::Vector{Symbol}
  constructors::Vector{Constructor}
end

const Context = Vector{Tuple{Symbol, Int}}

@data Term begin
  Var(i::Int)
  Ap(head::Int, args::Vector{Term})
end

Note: we refer to constructors and sorts by their index, not by their name; names are just a convenience.
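As an illustration of index-based references, here is a sketch that encodes the monoid action signature of Example 1 and the term of Example 2. Plain structs stand in for the `@data` declaration so that the block runs without MLStyle (an assumption). The constructor name `:act` for the action and the helper `render` are hypothetical.

```julia
struct Constructor
  name::Symbol
  argtypes::Vector{Int}
  rettype::Int
end

struct AlgebraicSignature
  sorts::Vector{Symbol}
  constructors::Vector{Constructor}
end

abstract type Term end
struct Var <: Term; i::Int; end
struct Ap <: Term; head::Int; args::Vector{Term}; end

# Sorts: 1 = M, 2 = X. Constructors: 1 = mul, 2 = e, 3 = act.
SigMonoidAction = AlgebraicSignature(
  [:M, :X],
  [Constructor(:mul, [1, 1], 1),
   Constructor(:e, Int[], 1),
   Constructor(:act, [1, 2], 2)]
)

# The context g: M, h: M, x: X
ctx = [(:g, 1), (:h, 1), (:x, 2)]

# The term (g * h) · x, written purely with indices
t = Ap(3, Term[Ap(1, Term[Var(1), Var(2)]), Var(3)])

# Names are only looked up when displaying a term.
render(sig, ctx, t::Var) = string(ctx[t.i][1])
render(sig, ctx, t::Ap) = string(
  sig.constructors[t.head].name, "(",
  join((render(sig, ctx, a) for a in t.args), ", "), ")")

render(SigMonoidAction, ctx, t)  # "act(mul(g, h), x)"
```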

Toy implementation (typechecking)

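function typecheck(sig::AlgebraicSignature, ctx::Context, t::Term)::Union{Int, Nothing}
  @match t begin
    # A variable has the sort recorded at its position in the context
    Var(i) => ctx[i][2]
    Ap(head, args) => begin
      argtypes = map(arg -> typecheck(sig, ctx, arg), args)
      if any(isnothing, argtypes)
        nothing # an argument failed to typecheck
      else
        f = sig.constructors[head]
        # The argument sorts must match the declared arity exactly
        argtypes == f.argtypes ? f.rettype : nothing
      end
    end
  end
end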
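A possible usage of this kind of checker, as a self-contained sketch: it uses multiple dispatch on plain structs in place of `@match` (a swapped-in technique, so that no package is needed) and returns `nothing` for ill-typed terms. The monoid action signature, the constructor name `:act`, and the variable names `well_typed` and `ill_typed` are illustrative assumptions.

```julia
struct Constructor
  name::Symbol
  argtypes::Vector{Int}
  rettype::Int
end

struct AlgebraicSignature
  sorts::Vector{Symbol}
  constructors::Vector{Constructor}
end

const Context = Vector{Tuple{Symbol, Int}}

abstract type Term end
struct Var <: Term; i::Int; end
struct Ap <: Term; head::Int; args::Vector{Term}; end

# A variable has the sort stored at its index, if the index is in range.
typecheck(sig::AlgebraicSignature, ctx::Context, t::Var) =
  1 <= t.i <= length(ctx) ? ctx[t.i][2] : nothing

# An application is well typed when its argument sorts match the arity.
function typecheck(sig::AlgebraicSignature, ctx::Context, t::Ap)
  1 <= t.head <= length(sig.constructors) || return nothing
  f = sig.constructors[t.head]
  argsorts = [typecheck(sig, ctx, a) for a in t.args]
  argsorts == f.argtypes ? f.rettype : nothing
end

# Sorts: 1 = M, 2 = X. Constructors: 1 = mul, 2 = e, 3 = act.
sig = AlgebraicSignature([:M, :X],
  [Constructor(:mul, [1, 1], 1), Constructor(:e, Int[], 1),
   Constructor(:act, [1, 2], 2)])
ctx = Context([(:g, 1), (:h, 1), (:x, 2)])

well_typed = Ap(3, Term[Ap(1, Term[Var(1), Var(2)]), Var(3)])  # (g * h) · x
ill_typed = Ap(3, Term[Var(3), Var(1)])                        # x · g

typecheck(sig, ctx, well_typed)  # 2, the index of sort X
typecheck(sig, ctx, ill_typed)   # nothing
```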

Toy implementation (theories and substitutions)

struct Equation
  ctx::Context
  lhs::Term
  rhs::Term
end

struct AlgebraicTheory
  sig::AlgebraicSignature
  eqs::Vector{Equation}
end

struct Substitution
  dom::Context
  codom::Context
  terms::Vector{Term}
end
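The operation \(\phi^\ast\) from Proposition 1 could be implemented on these data structures roughly as follows. This is a sketch under assumptions: plain structs replace the `@data` declaration, the function name `substitute` is hypothetical, and structural equality is defined only so that results can be compared.

```julia
abstract type Term end
struct Var <: Term; i::Int; end
struct Ap <: Term; head::Int; args::Vector{Term}; end

# Structural equality, so that two separately built terms compare equal.
Base.:(==)(a::Var, b::Var) = a.i == b.i
Base.:(==)(a::Ap, b::Ap) = a.head == b.head && a.args == b.args

const Context = Vector{Tuple{Symbol, Int}}

struct Substitution
  dom::Context
  codom::Context
  terms::Vector{Term}  # terms[i] is a term in context codom for variable i of dom
end

# Clause 1: a variable is replaced by the term assigned to it.
substitute(ϕ::Substitution, t::Var) = ϕ.terms[t.i]
# Clause 2: an application is rebuilt with substituted arguments.
substitute(ϕ::Substitution, t::Ap) =
  Ap(t.head, Term[substitute(ϕ, a) for a in t.args])

# Monoid signature with constructors 1 = mul, 2 = e, and a single sort 1 = M.
Γ = Context([(:a, 1), (:b, 1)])
Δ = Context([(:x, 1)])

# The substitution a ↦ x * x, b ↦ e
ϕ = Substitution(Γ, Δ, Term[Ap(1, Term[Var(1), Var(1)]), Ap(2, Term[])])

t = Ap(1, Term[Var(1), Var(2)])  # a * b in context Γ
result = substitute(ϕ, t)        # (x * x) * e in context Δ
```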

Example: Monoid

SigMonoid = AlgebraicSignature(
  [:M],
  [Constructor(:mul, [1, 1], 1), Constructor(:e, Int[], 1)]
)

ThMonoid = AlgebraicTheory(
  SigMonoid,
  [Equation( # associativity
    [(:a, 1), (:b, 1), (:c, 1)],
    Ap(1, [Ap(1, [Var(1), Var(2)]), Var(3)]),
    Ap(1, [Var(1), Ap(1, [Var(2), Var(3)])])
  )
  # identity laws omitted
  ]
)

Toy implementation (models)

A model is a struct that subtypes Model and has certain methods defined

abstract type Model end

# gettheory(m::Model)::AlgebraicTheory
function gettheory end

# check(m::Model, sort::Int, x::Any)::Bool
function check end

# ap(m::Model, f::Int, args...)::Any
function ap end

# Assume we've type-checked `t` and the elements of `ctx`
function interpret(m::Model, ctx::Vector{Any}, t::Term)
  @match t begin
    Var(i) => ctx[i]
    Ap(head, args) => ap(m, head, map(arg -> interpret(m, ctx, arg), args)...)
  end
end

We have macros for declaring models that make sure everything is defined properly

Example: Integer monoid

struct IntPlus <: Model end

gettheory(::IntPlus) = ThMonoid

check(::IntPlus, sort, x) = (sort == 1) && x isa Int

ap(::IntPlus, f, args...) =
  if f == 1 # multiplication
    args[1] + args[2]
  elseif f == 2 # identity
    0
  end
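To see the model in action, the following sketch interprets both sides of the associativity law in `IntPlus` and confirms that they agree for one choice of values. It is self-contained and uses dispatch on plain structs rather than `@match` (an assumption made to avoid the MLStyle dependency). The names `lhs`, `rhs` and `vals` are illustrative.

```julia
abstract type Term end
struct Var <: Term; i::Int; end
struct Ap <: Term; head::Int; args::Vector{Term}; end

abstract type Model end
struct IntPlus <: Model end

# Constructor 1 is the monoid multiplication, interpreted as integer addition;
# constructor 2 is the identity, interpreted as 0.
ap(::IntPlus, f, args...) =
  if f == 1
    args[1] + args[2]
  elseif f == 2
    0
  else
    error("unknown constructor index $f")
  end

# Interpret a term given one value per context variable.
interpret(m::Model, vals, t::Var) = vals[t.i]
interpret(m::Model, vals, t::Ap) =
  ap(m, t.head, (interpret(m, vals, a) for a in t.args)...)

lhs = Ap(1, Term[Ap(1, Term[Var(1), Var(2)]), Var(3)])  # (a * b) * c
rhs = Ap(1, Term[Var(1), Ap(1, Term[Var(2), Var(3)])])  # a * (b * c)
vals = Any[2, 3, 4]

interpret(IntPlus(), vals, lhs)  # (2 + 3) + 4 = 9
interpret(IntPlus(), vals, rhs)  # 2 + (3 + 4) = 9
```

Agreement on sample values does not prove that the equations of the theory hold in the model; it only spot-checks them.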

Once more, this time with feeling (= dependent types)

  • There is no algebraic theory of categories: composition is only defined for composable pairs of morphisms
  • Generalized algebraic theories are like algebraic theories, but with dependent types
Ob :: TYPE
Hom(a,b) :: TYPE ⊣ [a::Ob, b::Ob]
compose(f, g) :: Hom(a,c) ⊣ [a::Ob, b::Ob, c::Ob, f::Hom(a, b), g::Hom(b, c)]
  • The core of Catlab has always been an implementation of GATs
  • Gatlab is a refactor of this core to bring many different improvements
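The role of the dependent type `Hom(a,b)` can be illustrated with a very small checker for `compose` alone. This sketch is not Gatlab's implementation; the types `GATType`, `ObType`, `HomType` and the function `compose_type` are hypothetical names introduced for illustration.

```julia
# In this toy setting a type is either Ob or Hom(dom, codom),
# where dom and codom name object variables in the context.
abstract type GATType end
struct ObType <: GATType end
struct HomType <: GATType
  dom::Symbol
  codom::Symbol
end

# Compute the type of compose(f, g), or throw if the types do not line up.
function compose_type(ctx::Dict{Symbol, GATType}, f::Symbol, g::Symbol)
  tf, tg = ctx[f], ctx[g]
  (tf isa HomType && tg isa HomType) || error("compose expects two morphisms")
  tf.codom == tg.dom ||
    error("cannot compose: codomain $(tf.codom) differs from domain $(tg.dom)")
  HomType(tf.dom, tg.codom)
end

ctx = Dict{Symbol, GATType}(
  :a => ObType(), :b => ObType(), :c => ObType(),
  :f => HomType(:a, :b), :g => HomType(:b, :c))

compose_type(ctx, :f, :g)  # HomType(:a, :c)

# Composing in the other order is rejected, because c is not a.
rejected = try
  compose_type(ctx, :g, :f)
  false
catch
  true
end
```

In a plain algebraic signature there would be a single sort of morphisms, so the ill-typed composite could not be ruled out by typechecking.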

What’s new in Gatlab?

  • De Bruijn levels, to ensure correctness of substitution when names are non-unique
  • Functors between theories, pushout of theories for multiple inheritance
  • First-class substitutions
  • Models are now attached to structs, so we can do higher-level constructions
  • E-graphs for rewriting

Multiple inheritance

@theory ThStrictRigCategory <: ThCategory begin
  using ThStrictMonCat: ⊗ as ⊕, I as zero
  using ThStrictMonCat
  a ⊗ (b ⊕ c) == (a ⊗ b) ⊕ (a ⊗ c) :: Ob ⊣ [(a,b,c)::Ob]
  f ⊗ (g ⊕ h) == (f ⊗ g) ⊕ (f ⊗ h) :: Hom(a ⊗ (b ⊕ c), x ⊗ (y ⊕ z)) ⊣
    [(a,b,c,x,y,z)::Ob, f::Hom(a,x), g::Hom(b,y), h::Hom(c,z)]
end

First-class substitutions

Treating substitutions as “symbolic functions”, we can do symbolic lenses

sirv = @lens ThElementary begin
  dom = [S, I, R, V] | [dS, dI, dR, dV]
  codom = [S, I, R, V] | [β, γ, v]
  expose = begin
    S = S
    I = I
    R = R
    V = V
  end
  update = begin
    dS = - β * (I * S) + (-v * S)
    dI = β * (I * S) + (-γ * I)
    dR = γ * I
    dV = v * S
  end
end
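Read as ordinary functions rather than symbolic ones, the `expose` and `update` blocks denote maps on numbers. The sketch below writes those maps out by hand in plain Julia. It is not output generated by `@lens`, and the function names `sirv_expose` and `sirv_update` are hypothetical.

```julia
# expose: the state (S, I, R, V) is passed through unchanged.
sirv_expose(state) = state

# update: given the state and the parameters (β, γ, v), compute the derivatives.
function sirv_update(state, params)
  S, I, R, V = state
  β, γ, v = params
  dS = -β * (I * S) + (-v * S)
  dI = β * (I * S) + (-γ * I)
  dR = γ * I
  dV = v * S
  (dS, dI, dR, dV)
end

# Integer inputs keep the arithmetic exact for this check.
derivs = sirv_update((10, 2, 0, 0), (1, 2, 3))  # (-50, 16, 4, 30)
```

The derivatives sum to zero, reflecting that the model only moves population between compartments.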

Higher-level model constructions

struct SliceOb{Ob, Hom}
  ob::Ob
  hom::Hom
end

@model ThCategory{SliceOb{Ob, Hom}, Hom} (self::SliceC{Ob, Hom, C<:Model{ThCategory.T, Tuple{Ob, Hom}}}) begin
  cat::C
  over::Ob

  Ob(x::SliceOb{Ob, Hom}) =
    checkvalidity(self.cat, ThCategory.Ob, x.ob) &&
    checkvalidity(self.cat, ThCategory.Hom, x.ob, self.over, x.hom)

  Hom(x::SliceOb{Ob, Hom}, y::SliceOb{Ob, Hom}, f::Hom) =
    checkvalidity(self.cat, ThCategory.Hom, x.ob, y.ob, f) &&
    ap(self.cat, ThCategory.compose, x.ob, y.ob, self.over, f, y.hom) == x.hom

  id(x::SliceOb{Ob, Hom}) = ap(self.cat, ThCategory.id, x.ob)

  compose(x, y, z, f, g) = ap(self.cat, ThCategory.compose, x.ob, y.ob, z.ob, f, g)
end

A similar construction (though much more involved) could make the category of lenses out of a cartesian category.

E-Graphs

@theory C <: ThCategory begin
  x :: Ob
  y :: Ob
  z :: Ob
  f :: Hom(x,y)
  g :: Hom(y,z)
end

eg = EGraph(C.T)

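i1 = add!(eg, @term C (f ⋅ g))

# g ⋅ f does not typecheck: the codomain z of g is not the domain x of f
@test_throws Exception add!(eg, @term C (g ⋅ f))

# Identify the objects x and z in the e-graph
merge!(eg, add!(eg, @term C x), add!(eg, @term C z))

# With x and z merged, g ⋅ f is now accepted
i2 = add!(eg, @term C (g ⋅ f))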

Future work

  • Integration into Catlab
  • API stabilization, tests
  • E-Graph based optimization of symbolic functions, and compilation to Julia
  • Macro for higher-level model constructions
  • Symbolic ACSets
  • Much more…