Homomorphisms as 2-cells

I’m working on this idea of considering theory signatures as contexts of some dependent type theory, and theory presentations as a category fibred over the category of contexts. A model of a theory presentation corresponds to a point of said object, that is, a morphism from the terminal theory presentation (i.e. the empty context with no axioms) to said object. The fibration projects a model down to its underlying structure, which is a point of the corresponding theory signature, that is, a morphism from the terminal theory signature (i.e. the empty context) to said theory signature, which in turn corresponds to a term fulfilling the theory signature.

Given that two structures for signature S are arrows M, N : 1 → S, it seems natural to represent a homomorphism from M to N as a 2-cell F : M ⇒ N.

If our category of theory signatures is going to actually be a 2-category then we need to consider 2-fibrations, which seem to have been worked out in Claudio Hermida’s paper entitled "Some Properties of Fib as a fibred 2-category".

More importantly, I need to work out what exactly a 2-cell in the category of contexts would be. Some things are pretty clear: for the context containing x:Type, when the corresponding components of the structures M and N are ⊢ s:Type and ⊢ t:Type, we want the corresponding component of F : M ⇒ N to be a term ⊢ Fx : s → t. If the context contains an operation f : x → x, and the corresponding components of the structures are ⊢ g : s → s and ⊢ h : t → t, then we want the corresponding component of F : M ⇒ N to be the constraint ⊢ Ff : Fx ∘ g =(s → t) h ∘ Fx. But working it out for the general case of dependent types boggles my mind.
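
To make the two clear cases concrete, here is a minimal Coq sketch for the one-type, one-operation signature above (the record and field names are mine): on this reading, the data of a 2-cell F : M ⇒ N is a function between the carriers together with the commuting constraint, stated pointwise.

(* The signature x:Type, f : x -> x, packaged as a record of structures. *)
Record Sig := { carrier : Type ; op : carrier -> carrier }.

(* A 2-cell between structures M and N: the component Fx for the type x,
   and the component Ff for the operation f. *)
Record Hom (M N : Sig) := {
  Fx : carrier M -> carrier N ;
  Ff : forall a, Fx (op M a) = op N (Fx a)  (* Fx ∘ g = h ∘ Fx, pointwise *)
}.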

There is an interesting downward transformation going on here, where types in the signature are realized by (function) terms in the 2-cells, terms in the signature are realized by (equality) propositions in the 2-cells, and propositions (in theory presentations) will come for free in the 2-cells (is this always true?). Perhaps kinds will be realized by types?

This all vaguely resembles Reynolds’s abstraction theorem.

η-expansions

A : Type;
B : A -> Type;
f : forall x : A. B x
=====================
|- f =η fun (x:A) => f x

A : Type;
B : A -> Type;
C : SigT A B -> Type;
f : forall p : SigT A B. C p
============================
p : SigT A B |- f p =η match p as p0 return C p0 with existT x t => f (existT A B x t) end

f : unit
========
|- f =η tt

A : Type;
x, y : A;
C : forall y, Eq A x y -> Type;
f : forall e : Eq A x y. C y e
================================
e : Eq A x y |- f e =η match e as e0 in Eq _ _ z return C z e0 with refl => f (refl A x) end

A, B : Type;
C : A + B -> Type;
f : forall p : A + B. C p
=========================
p : A + B |- f p =η match p as p0 return C p0 with left x => f (left x) | right y => f (right y) end

C : Void -> Type;
f : forall p : Void . C p
=========================
p : Void |- f p =η match p as p0 return C p0 with end

Inductive W (A:Type) (B:A -> Type) : Type :=
node : forall (a:A) (f:B a -> W A B), W A B

A : Type;
B : A -> Type;
C : W A B -> Type;
f0 : forall p : W A B. C p;
f1 : forall (a : A) (f:B a -> W A B), (forall b:B a. C (f b)) -> C (node a f)
-----------------------------------------------------------------------------
a : A, f: B a -> W A B |- f0 (node a f) =βη f1 a f (fun b => f0 (f b))
=============================================================================
             p : W A B |- f0 p =η W_rect A B C f1 p

C : nat -> Type;
f0 : forall n : nat. C n;
f1 : forall n : nat. C n -> C (S n)
-----------------------------------
n:nat |- f0 (S n) =βη f1 n (f0 n)
===================================
n:nat |- f0 n =η nat_rect C (f0 O) f1 n
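
As a sanity check of this last rule, here is a Coq sketch (the lemma and hypothesis names are mine), with the judgmental =βη hypothesis and the =η conclusion both weakened to propositional equalities:

(* The nat rule above, checked propositionally by induction. *)
Lemma nat_eta (C : nat -> Type) (f0 : forall n : nat, C n)
      (f1 : forall n : nat, C n -> C (S n))
      (step : forall n : nat, f0 (S n) = f1 n (f0 n)) :
  forall n : nat, f0 n = nat_rect C (f0 O) f1 n.
Proof.
induction n as [|n IH]; simpl.
- reflexivity.
- rewrite step, IH; reflexivity.
Qed.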

A Semi-plethory of Containers

A container is a specification of S and P and is given by the dependent type s:S ⊢ P s a.k.a. the telescope ΣS:★. S → ★. A container specifies the (strictly positive) functor λX:★. Σs:S. (P s) → X.

An indexed container over I is a specification of S and P and is given by the dependent type ΣS:★. I → S → ★. An indexed container specifies an indexed functor λX:I → ★. Σs:S. Πi:I. (P i s) → X i.
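
Both definitions transcribe directly into Coq; here is a minimal sketch (the names are mine, and universe subtleties are ignored):

Record Container := { Shape : Type ; Pos : Shape -> Type }.

(* The strictly positive functor specified by a container. *)
Definition Ext (c : Container) (X : Type) : Type :=
  { s : Shape c & Pos c s -> X }.

Record ICont (I : Type) := { IShape : Type ; IPos : I -> IShape -> Type }.

(* The indexed functor specified by an indexed container. *)
Definition IExt (I : Type) (c : ICont I) (X : I -> Type) : Type :=
  let (S, P) := c in { s : S & forall i : I, P i s -> X i }.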

Containers form a commutative ★-semi-algebra a.k.a. a ★-semiring (not to be confused with a *-semiring) with the following operations:

  • 0 := ⟨⊥, []⟩
  • 1 := ⟨⊤, λs. ⊥⟩
  • ⟨A, P⟩ + ⟨B,Q⟩ := ⟨A + B, [P, Q]⟩
  • ⟨A, P⟩ × ⟨B,Q⟩ := ⟨A × B, λ⟨a,b⟩. P a + Q b⟩.
  • R·⟨A,P⟩ := ⟨R × A, λ⟨_,a⟩. P a⟩
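
Continuing the Coq sketch above, the five operations transcribe as follows (⊥ and ⊤ become Empty_set and unit; the definition names are mine):

Definition czero : Container :=
  {| Shape := Empty_set ; Pos := fun e => match e return Type with end |}.

Definition cone : Container :=
  {| Shape := unit ; Pos := fun _ => Empty_set |}.

Definition cplus (c d : Container) : Container :=
  {| Shape := (Shape c + Shape d)%type ;
     Pos := fun s => match s with inl a => Pos c a | inr b => Pos d b end |}.

Definition ctimes (c d : Container) : Container :=
  {| Shape := (Shape c * Shape d)%type ;
     Pos := fun s => (Pos c (fst s) + Pos d (snd s))%type |}.

Definition cscale (R : Type) (c : Container) : Container :=
  {| Shape := (R * Shape c)%type ; Pos := fun s => Pos c (snd s) |}.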

Containers (aka ⊤-indexed containers) form a ★-bi-semiring. The tensor of I-indexed containers with J-indexed containers is (I + J)-indexed containers. The unit of this tensor is the ⊥-indexed containers, which are isomorphic to ★. The tensor of an I-indexed container and a J-indexed container is:

  • ⟨A,P⟩ ⊗ ⟨B,Q⟩ := ⟨A × B, λ(ij:I+J) ⟨a,b⟩. [λi. P i a, λj. Q j b](ij)⟩.

In particular the tensor of containers with themselves is the 𝔹-indexed containers. The following are the ★-bi-semiring operations on containers:

  • ε+⟨A,P⟩ := Σa:A. P a → ⊥ (≜ Σa:A. ¬(P a))
  • ε×⟨A,P⟩ := Σa:A. P a → ⊤ (≅ A)
  • Δ+⟨A,P⟩ := ⟨Σa:A. P a → 𝔹, λ(i:𝔹) (⟨a,f⟩:Σa:A. P a → 𝔹). Σp:P a. i = f p⟩
  • Δ×⟨A,P⟩ := ⟨A, λ(i:𝔹). P⟩
  • β(R)⟨A,P⟩ := Σa:A. P a → R

The composition product of I-indexed containers with J-indexed containers is the (I × J)-indexed containers, with containers (i.e. ⊤-indexed containers) as the unit for the composition product. The composition product of an I-indexed container with a J-indexed container is as follows.

  • ⟨A,P⟩ ⊙ ⟨B,Q⟩ := ⟨Σa:A. Πi:I. P i a → B, λ⟨i,j⟩ ⟨a,f⟩. Σp : P i a. Q j (f i p)⟩

Now containers trivially form a semi-plethory since ⊤-indexed containers are clearly isomorphic to (⊤ × ⊤)-indexed containers.

  • ι⟨A,P⟩ := ⟨A,λ⟨⟩. P⟩
  • (∘)⟨A,P⟩ := ⟨A,λ⟨⟩. P⟨⟨⟩,⟨⟩⟩⟩

For convenience we define the infix composition of two containers as:

  • ⟨A,P⟩ ∘ ⟨B,Q⟩ := ⟨Σa:A. P a → B, λ⟨a,f⟩. Σp : P a. Q (f p)⟩ (≅ (∘)(⟨A,P⟩ ⊙ ⟨B,Q⟩))

Composition has an identity element, I, which doesn't appear to be a primitive semi-plethory operation for some reason. Nor do I know how to derive it from the other operations. But here it is anyway.

  • I := ⟨⊤, λs. ⊤⟩
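
In the same Coq sketch, the infix composition and its identity element look like this (mirroring the two definitions above):

Definition ccomp (c d : Container) : Container :=
  {| Shape := { a : Shape c & Pos c a -> Shape d } ;
     Pos := fun af => { p : Pos c (projT1 af) & Pos d (projT2 af p) } |}.

Definition cI : Container := {| Shape := unit ; Pos := fun _ => unit |}.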

TODO: Actually verify these operations are correct by checking that all the required laws hold (up to isomorphism).

References: Containers: Constructing Strictly Positive Types

P.S. Actually containers do form a *-semiring, which makes them a ★-*-semiring. In fact we can get a ★-bi-*-semiring with the following asteration operation on containers and its dual.

  • ⟨A,P⟩* := ⟨ℕ, Fin⟩ ∘ ⟨A,P⟩ (≅ ⟨List A, sum ∘ map P⟩)
  • ⟨A,P⟩⁎ := ⟨A,P⟩ ∘ ⟨ℕ, Fin⟩ (≜ ⟨Σa:A. P a → ℕ, λ⟨a,f⟩. Σp:P a. Fin (f p)⟩)
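
Continuing the Coq sketch once more, with Fin rendered as an inline fixpoint (fin n, a type with n elements), both asteration operations are just compositions with ⟨ℕ, Fin⟩:

Fixpoint fin (n : nat) : Type :=
  match n with
  | O => Empty_set
  | S m => (unit + fin m)%type
  end.

(* The asteration ⟨A,P⟩* and its dual, via composition with <nat, fin>. *)
Definition cstar (c : Container) : Container :=
  ccomp {| Shape := nat ; Pos := fin |} c.

Definition costar (c : Container) : Container :=
  ccomp c {| Shape := nat ; Pos := fin |}.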

Bill's Approach to Trustable Communication

Recalling the problem: On one side we have a computational theory called GMP that consists entirely of Δ0 theorems. On the other side we have a purely axiomatic theory called PA, a high-level theory of Peano Arithmetic that has been conservatively extended to support decimal notation and whatever other reasonable conveniences for the natural numbers we want. The task is to use trustable communication to harness the computational ability of GMP to solve simple arithmetic problems in PA.

The idea here is to extend GMP to a stronger theory GMP0 that is strong enough to define PA. Next we make a conservative extension of GMP0 to GMP' that defines the language of PA and proves the axioms of PA. Specifically GMP' defines

  • nat := { a : int | 0 ≤int a}
  • +nat := λ (a,b) : nat × nat. ι c : nat. c = a +int b

Here we note that PA is a subtheory of GMP' in the sense that it uses a sublanguage of GMP' and every theorem of PA is a theorem of GMP' (since GMP' proves all the axioms of PA).

The next step is to prove that GMP' is, in fact, a conservative extension of PA (thus GMP0 shouldn't be more powerful than PA). The idea here is to make an interpretation from GMP' into PA that fixes the language of PA. Unfortunately we cannot do this directly because interpretations in Chiron must be injective. So instead we note that if we make an interpretation from GMP' into a conservative extension of PA, called PA', that fixes the language of PA, then we can conclude that GMP' is a conservative extension of PA.

To build this interpretation we extend PA to PA' with definitions:

  • int := nat×{0} ∪ {0}×nat
  • +int := λ ((ap,an),(bp,bn)) : int × int. ι (cp,cn) : int. cp + an + bn = ap + bp + cn ∧ (cp = 0 ∨ cn = 0)
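
A small worked example of this encoding (the numbers are mine): taking a = (5,0), representing 5, and b = (0,3), representing -3, the condition cp + an + bn = ap + bp + cn reads cp + 0 + 3 = 5 + 0 + cn, and the side condition cp = 0 ∨ cn = 0 forces the unique solution (cp,cn) = (2,0); that is, 5 +int (-3) = 2, as expected.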

Our interpretation sends int to int, +int to +int, etc., and fixes the language of PA, so it sends nat to nat, and +nat to +nat.

Unfortunately this doesn't work. GMP' has a theorem (actually a definition) that nat = { a : int | 0 ≤int a}. If we look at this theorem under the above alleged interpretation, it is equivalent to nat = nat×{0}, which is clearly false. So our alleged interpretation isn't actually an interpretation, because it isn't truth preserving.

Let us suppose for the sake of argument that we somehow prove that GMP' is a conservative extension of PA. What we then need to do is to write new services for PA that use GMP' services internally to help. We need to do something like writing new services because someone has to be responsible for proving that the operations on integers are, in fact, compatible with the operations on natural numbers. In our case it will be the responsibility of the person writing these new services to prove that the services are conservative (i.e. all the meaning formulas of the services are theorems). (More details on how to make this service are needed.) With new services and new meaning formulas, there is no problem with translating the meaning formula through trustable communication, since the meaning formula is no longer passed through the trustable communication framework.

In this sense we have sort of failed at our task which was to use the trustable communication framework; however I'm inclined to think that the original trustable communication framework cannot actually solve this particular communication problem since it leaves no room for the required proof that the computation of the natural numbers is compatible with computation on the integers. Therefore, I think it is inevitable that something beyond trustable communication is needed in this particular situation.

Other situations that do not have this compatibility issue are likely more amenable to trustable communication. In particular our new natural number services could be served through a trustable communication channel.

Some notes on Plethories

This is my cheat sheet for trying to understand an answer to Jacques's question about the theory of polynomials.

  • An R-ring is a commutative R-algebra.
  • An R-R'-biring is an R-ring A such that for every other R-ring B the set of homomorphisms from A to B has the structure of an R'-ring. This is equivalent to giving the following additional operations on A with the following laws:
    • an R-ring homomorphism Δ+ : A → A ⊗R A.
    • an R-ring homomorphism counit ε+ : A → R .
    • an R-ring homomorphism antipode σ : A → A.
    • Δ+ = γ ∘ Δ+ where γ(a⊗b) = (b⊗a).
    • (Δ+⊗id) ∘ Δ+ = (id⊗Δ+) ∘ Δ+ where (f⊗g)(a⊗b) = (f(a)⊗g(b)).
    • λ ∘ (ε+⊗id) ∘ Δ+ = id where λ(r⊗a) = r⋅a.
    • (∇ ∘ (id⊗σ) ∘ Δ+)(a) = ε+(a)⋅1 where ∇(a⊗b) = ab.
    • an R-ring homomorphism Δ× : A → A ⊗R A.
    • an R-ring homomorphism counit ε× : A → R .
    • Δ× = γ ∘ Δ×
    • (Δ×⊗id) ∘ Δ× = (id⊗Δ×) ∘ Δ×.
    • λ ∘ (ε×⊗id) ∘ Δ× = id.
    • (λ ∘ (ε+⊗id) ∘ Δ×)(a) = ε+(a)⋅1.
    • (Δ+⊗id) ∘ Δ× = τ0 ∘ (Δ×⊗Δ×) ∘ Δ+ where τ0(a⊗b⊗c⊗d) = a⊗c⊗bd.
    • β : R' → A → R
    • β(r) is a R-Ring homomorphism.
    • β(0) = ε+
    • β(1) = ε×
    • β(r + s) = λ ∘ (β(r)⊗β(s)) ∘ Δ+
    • β(-r) = β(r) ∘ σ
    • β(rs) = λ ∘ (β(r)⊗β(s)) ∘ Δ×
    This gives us enough structure to show that for any R-ring B, the homomorphisms from A to B have an R'-ring structure. For instance, given R-ring homomorphisms f, g : A → B and r : R', we can define
    • 0 : A → B is 0(a) = ε+(a)⋅1
    • 1 : A → B is 1(a) = ε×(a)⋅1
    • f + g : A → B is f + g = ∇ ∘ (f⊗g) ∘ Δ+
    • -f : A → B is -f = f ∘ σ
    • fg : A → B is fg = ∇ ∘ (f⊗g) ∘ Δ×
    • r⋅f : A → B is r⋅f = f ∘ λ ∘ (β(r)⊗id) ∘ Δ×
  • A biring over R is an R-R-biring.
  • A plain biring is a biring over ℤ just like a plain ring is a ℤ-ring.
  • R itself can be made into a trivial R-R'-biring where
    • ε+(a) = ε×(a) = a
    • σ(a) = a
    • Δ+(a) = Δ×(a) = a⊗1 = 1⊗a
    • β(r)(a) = a
  • R[X] is a biring over R where
    • ε+(X) = 0 (polynomial evaluation at 0)
    • ε×(X) = 1 (polynomial evaluation at 1)
    • σ(X) = -X (polynomial evaluation at -X)
    • Δ+(X) = X⊗1 + 1⊗X (polynomial evaluation at X+Y)
    • Δ×(X) = X⊗X (polynomial evaluation at XY)
    • β(r)(X) = r (polynomial evaluation at r)
  • An R-R'-biring A induces a functor from R-Rings to R'-Rings that takes an R-Ring B to the R'-Ring of homomorphisms from A to B.
  • Birings are composable. Given an R-R'-biring A and an R'-R''-biring B, there exists an R-R''-biring A⊙R'B. (⊙R') is called the composition product. First we define an R-ring A⊙R'B for an R-R'-biring A and an R'-ring B in a way similar to the definition of a tensor product: A⊙R'B can be constructed as A×B modulo the following equalities:
    • (a + b)⊙x = (a⊙x) + (b⊙x)
    • ab⊙x = (a⊙x)(b⊙x)
    • (r⋅1)⊙x = r⋅(1⊙y) (this y is not a typo: 0⊙x = 0⊙y = 0 and 1⊙x = 1⊙y = 1)
    • a⊙(x + y) = (∇ ∘ τ ∘ Δ+)(a) where τ(w1⊗w2) = (w1⊙x)⊗(w2⊙y)
    • a⊙(xy) = (∇ ∘ τ ∘ Δ×)(a) where τ(w1⊗w2) = (w1⊙x)⊗(w2⊙y)
    • a⊙(r⋅1) = β(r)(a)⋅1
    Think of a⊙x as the polynomial a evaluated at the point x.
    If B also happens to be an R'-R''-biring, then we can make A⊙R'B into an R-R''-biring.
    • ε+(a⊙x) = a⊙(ε+(x)) where a⊙r = β(r)(a)
    • ε×(a⊙x) = a⊙(ε×(x))
    • σ(a⊙x) = a⊙σ(x)
    • Δ+(a⊙x) = a⊙Δ+(x) where a⊙(w1⊗w2) = (a⊙w1)⊗(a⊙w2)
    • Δ×(a⊙x) = a⊙Δ×(x)
    • β(r)(a⊙x) = β(β(r)(x))(a)
  • (⊙R) and R[X] make the category of birings over R into a monoidal category. (TODO: Why is this the Tall-Wraith monoid?). We want to check first that B is isomorphic to R[X]⊙RB. The isomorphism maps x ↔ (X⊙x) (the reverse map extends by the composition product laws). Similarly we want to check that A is isomorphic to A⊙RR[X]. Again, the isomorphism maps a ↔ (a⊙X).
  • an R-plethory is a monoid in this monoidal category of birings over R. This means that an R-plethory is a biring over R, call it A, that also has the following operations forming a monoid:
    • (∘) : A⊙RA → A
    • ι : R[X] → A
  • R[X] is an R-plethory with ι the identity function and (∘)(X⊙X) = X, which effectively entails polynomial composition.
  • According to the MathOverflow answer, R[X] is the initial R-plethory, which seems reasonable.
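
A worked instance of the R[X] case (my example, just unwinding the definitions): since Δ+ is a ring homomorphism, Δ+(X²) = (X⊗1 + 1⊗X)² = X²⊗1 + 2(X⊗X) + 1⊗X², which is exactly the coefficient bookkeeping for (x+y)² = x² + 2xy + y². Accordingly, for R-ring homomorphisms f, g : R[X] → B, we get (f + g)(X²) = (∇ ∘ (f⊗g) ∘ Δ+)(X²) = f(X)² + 2f(X)g(X) + g(X)² = (f(X) + g(X))², so f + g is evaluation at f(X) + g(X), as the "polynomial evaluation" readings above suggest.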

I am not entirely sure this is all correct. But I'm moderately confident.

I will (hopefully) continue to update this post with further notes.

Simpler proof of the Primitive Element Theorem: Finite Case

Here I present what I think is a slightly simpler proof of the Primitive Element Theorem's so-called "finite case" than the one found in A Course in Constructive Algebra. It still follows the most important ideas presented in that book. In particular it is important to use the correct constructive phrasing of the "finite case".

Theorem: Primitive Element Theorem: Finite Case
Given a discrete field F and two algebraic values a and b over F of degrees m and n respectively, then for all N either #|F| > N or there exists some g in F[a,b] such that F[a,b] = F[g].

Of course by #|F| > N we really mean the statement that there exists some finite set of elements of F that has cardinality greater than N.

We can assume that a ≠ 0 and b ≠ 0, since in these cases F[a,b] = F[b] and F[a,b] = F[a] respectively.

Every element x ∈ F[a] is a linear combination of a^0, …, a^(m-1) with coefficients in F. Say x = Σ x_i a^i. Consider the list [a^0, …, a^(N^m)], and then consider the finite set of coefficients S = {(a^i)_j | i ≤ N^m, j < m}. If S has cardinality greater than N then we are done.

Suppose S has cardinality less than or equal to N. Then there are at most N^m linear combinations of coefficients from S on [a^0, …, a^(m-1)], and since the list [a^0, …, a^(N^m)] has length N^m + 1, there must be two elements of this list that are equal. By dividing these two elements we conclude there is some i such that a^i = 1, so a has some multiplicative order i.

By a similar argument we can conclude that either #|F| > N or there is some j such that b^j = 1, so b has some multiplicative order j. Now we can proceed under the assumption that a has multiplicative order i and b has multiplicative order j. The rest of the argument is standard.

Let f : ℤ_i × ℤ_j → F[a,b] be the function that sends (x,y) to a^x b^y. The function f is a group homomorphism into the multiplicative group of F[a,b], and its image, being a finite subgroup of the multiplicative group of a field, must be cyclic, generated by some g = f(z). Then there are some k_a and k_b such that g^(k_a) = a and g^(k_b) = b, and thus F[a,b] = F[g]. That concludes the proof.

The infinite case of the Primitive Element Theorem states that there exists some N (specifically I think m·(n-1)+1 should work) such that if #|F| > N and b is separable over F, then there exists some g in F[a,b] such that F[a,b] = F[g]. This can be proved as done in A Course in Constructive Algebra, or, if you want to avoid the construction of splitting fields, you can follow the argument in David Speyer's or Boyarsky's proof. The infinite case combined with the finite case allows us to prove the primitive element theorem:

Theorem: Primitive Element Theorem
Given a discrete field F and two algebraic values a and b with b separable over F, there exists some g in F[a,b] such that F[a,b] = F[g].

Theories of Morphisms

Inspired by how Bas and Eelis handle morphisms in their work, I've rethought how morphisms can be implemented in Mathscheme.

Consider how to build a theory of monoid morphisms. One can build this theory by first taking the pushout of the theory of monoids with itself over the empty theory (i.e. taking the disjoint union of the theory of monoids with itself). This gives us two copies of the theory of monoids, each copy with a separate universe: U1 and U2. Then we extend this theory with a function f : U1 → U2 and axioms that say that f preserves each operation. This procedure works in general for algebraic theories.

Notice that this gives us a theory of monoid morphisms between some pair of monoids, but not a theory of monoid morphisms between two particular monoids, which is what we are usually interested in. However, by considering this theory of monoid morphisms as parameterized by two monoids, in the sense of my last post, one can build models of the theory of monoid morphisms between particular monoids.
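
Here is a rough Coq rendering of the result of this construction (my encoding; the field names are mine and the monoid axioms are elided). The parameterization by the two monoids is literally the record's parameters, so instantiating A and B at particular monoids yields a theory of morphisms between those two monoids:

Record Monoid := {
  M : Type ;           (* the universe U1 or U2 *)
  unit_ : M ;
  dot : M -> M -> M
  (* monoid axioms elided *)
}.

Record MonoidHom (A B : Monoid) := {
  hom : M A -> M B ;
  pres_unit : hom (unit_ A) = unit_ B ;
  pres_dot : forall x y : M A, hom (dot A x y) = dot B (hom x) (hom y)
}.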

This way of developing theories of morphisms has some nice properties. For example, we usually only care about vector space morphisms (aka linear transformations) between two vector spaces over the same field. This sharing of the same field is easy to achieve with the above technique. Instead of taking the pushout of the theory of vector spaces with itself over the empty theory, one would take the pushout of the theory of vector spaces with itself over the theory of fields. This forces the theory of fields to be shared between the two copies of the theory of vector spaces, ensuring that the domain and codomain of a vector space morphism are both vector spaces over the same field.

Parameterized Models

Some algebraic theories are parameterized by other algebraic theories. The most common example would be vector spaces, which are parameterized by a field that acts on the vector space by scalar multiplication.

In Mathscheme, the theory of vector spaces contains types for both the field and the vector space, and provides (via inheritance) all the axioms of a field for the field type, and then goes on to provide the vector space axioms (via various mix-ins). When this theory is reflected as a type we get a record with both a field component and a vector space component.

However, in many cases we want to reason about vector spaces whose field component is fixed to some particular value. One idea to support this sort of thing is to reflect the theory not as a type, but as a dependent function type. In this case, a function from the type of fields to a type. Loosely speaking it would be

VS : {U:type; +, * : U*U -> U; -, inv : U -> U; 0, 1 : U} -> type
VS F = {V:type; + : V*V -> V; - : V -> V; 0 : V; * : (F.U)*V -> V}

I have left out the axioms above for simplicity, but the axioms would bring the operations of F into play. These parameterized models can probably be created for any theory morphism between theories.
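
To make this concrete, here is a rough Coq rendering of the sketch above (the names are mine; the axioms are again omitted). VSSig has type FieldSig -> Type, which is precisely the dependent function type described, with the operations of the parameter F available for stating the elided axioms:

Record FieldSig := {
  U : Type ;
  add : U -> U -> U ; mul : U -> U -> U ;
  opp : U -> U ; inv : U -> U ;
  zero : U ; one : U
  (* field axioms elided *)
}.

Record VSSig (F : FieldSig) := {
  V : Type ;
  vadd : V -> V -> V ;
  vopp : V -> V ;
  vzero : V ;
  scale : U F -> V -> V  (* scalar multiplication by the fixed field F *)
  (* vector space axioms elided *)
}.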

Unparameterized models could then be simply a degenerate case of parameterized models over the initial empty theory. The models of the empty theory have type {} and there is exactly one value of this type {}.

Pushouts of Diagrams

To further help modularize theory development, it may be useful to take pushouts of not just theory presentations themselves, but of diagrams of theory presentations. The following is an elaboration of what Douglas Smith describes in his various papers on SpecWare, including "Mechanizing the Development of Software". Although we are interested in diagrams of theory presentations, this development is generic over diagrams in any category with sufficient structure (i.e., categories that have colimits).

First we define what a graph is. A graph is a finite collection of nodes and a finite collection of directed edges between those nodes with a composition operation on edges (composition requires the target node of the first edge be the same as the source node of the second edge). The set of edges is closed under composition and for each node there is an identity edge from the node to itself. We require that the identity edge be an identity for the composition operation in the usual sense.

Actually, a graph as defined here is just a category, and I could have simply said that from the beginning. However, I will use the name graph in the cases where we don't really care about the semantics of the category and only care about the shape of the category.

This definition of a graph differs notably from the definition of a directed multigraph in graph theory, because our definition of a graph says whether two paths from a source node to a target node, when composed, give equal edges or not. In particular our graphs tell you whether two particular squares commute or not. Usually this information is left implicit when a diagram is drawn on a board, but such information is a vital component of the definition.

Loosely speaking, a diagram is a graph where the nodes and edges are given labels from another category C in a way that is compatible with the category C. In our particular case we are interested in C = the category of theory presentations, but this development is generic in C. More formally speaking, a diagram is a functor d : G → C, where G is a graph that gives the shape of the diagram. The image of a node of G under d labels the node by an object from C, and the image of an edge of G under d labels the edge by an arrow from C. The functor laws ensure that the labeling of nodes and edges is compatible with C.

Note that this definition of a diagram differs notably from the definition of a subcategory. In a diagram two different nodes could get the same label, and two different edges could also get the same label. In other words, the functor d : G → C need not be injective.

All diagrams are functors (which are arrows in the category of categories) with codomain C. The arrows of a category with a constant codomain form the objects of a slice category, so that is an obvious choice for turning diagrams into a category, but we need something a bit more lax: a lax slice category.

The objects of a lax slice category are all functors that have the same codomain (in this case C). A morphism between d1 : G1 → C and d2 : G2 → C in a lax slice category is a functor f : G1 → G2 along with a natural transformation η_f from d1 to d2 ∘ f.

What does this definition mean for our diagrams? Given two diagrams d1 and d2, a diagram morphism between these two diagrams states two things. Firstly, it assigns to each node in the graph G1 a node in the graph G2, and to each edge in the graph G1 an edge in the graph G2, in a compatible way. This first part is the information held by the functor f : G1 → G2 from the previous paragraph. The second part of a diagram morphism is a collection of arrows from C (in our case theory morphisms), one for each node in G1. These arrows map the label of a node in G1 to the label of the corresponding node (under f) in G2. This collection of arrows is given by the natural transformation η from the previous paragraph. Lastly, this collection of arrows is subject to a compatibility constraint: for every edge e : p → q in G1, there is a commutative square stating that η(q) ∘ d1(e) = d2(f(e)) ∘ η(p). This condition is what makes the transformation η natural.
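
These two paragraphs transcribe into a small Coq sketch (the names are mine, all category laws are elided, and graphs are taken to be categories, as above):

Set Implicit Arguments.

Record Cat := {
  Obj : Type ;
  Hom : Obj -> Obj -> Type ;
  idm : forall a, Hom a a ;
  comp : forall a b c, Hom a b -> Hom b c -> Hom a c  (* diagrammatic order *)
}.

Record Functor (G C : Cat) := {
  onObj :> Obj G -> Obj C ;
  onHom : forall a b : Obj G, Hom a b -> Hom (onObj a) (onObj b)
}.

(* A diagram morphism d1 => d2 in the lax slice over C: a reindexing
   functor on shapes plus one C-arrow per node of G1, naturally. *)
Record DiagMor (G1 G2 C : Cat) (d1 : Functor G1 C) (d2 : Functor G2 C) := {
  reindex : Functor G1 G2 ;
  eta : forall p : Obj G1, Hom (d1 p) (d2 (reindex p)) ;
  (* the square η(q) ∘ d1(e) = d2(f(e)) ∘ η(p), written diagrammatically *)
  eta_natural : forall (p q : Obj G1) (e : Hom p q),
    comp (onHom d1 e) (eta q) = comp (eta p) (onHom d2 (onHom reindex e))
}.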

The laxness of the lax slice category is important. Without it we would end up with the stricter requirement that d1 = d2 ∘ f, or in other words that the natural transformation is always the identity transformation. This stricter requirement would mean that the collection of arrows between diagrams would always consist of identity arrows, and the only diagram arrows that would exist would be diagram inclusions. We require this laxness for pushouts to generate a collection of novel theories. Without laxness we could only create unions of diagrams, which is not so interesting.

I believe it is true that if the category C has colimits, then this (lax) category of diagrams will have colimits, but I don't have a reference that proves this.

Now I will show how to use pushouts of diagrams to generate large batches of novel theories. For simplicity consider the diagram B generated from the theories: Semigroup, Monoid, and LeftNearSemiring with the usual arrow from Semigroup to Monoid, and the "multiplicative" arrow from Semigroup to LeftNearSemiring and the "additive" arrow from Monoid to LeftNearSemiring. This example is small enough to be easy to picture, but large enough to be a non-commutative diagram.

We want to use diagram pushouts to create commutative versions of these theories. To do this we will take as base A the diagram consisting of the single node Semigroup (which implies the single identity arrow on that node), and for C we will take the single node CommutativeSemigroup. The diagram arrow from A to B will map the Semigroup node in A to the Semigroup node in B via the identity morphism. The diagram arrow from A to C will map the Semigroup node in A to the CommutativeSemigroup node in C via the theory inclusion.

When we take the pushout of this span we get a new diagram D along with diagram arrows from B to D and C to D, with the universal properties that pushouts have. What will the contents of D be? The shape of the diagram D will be the disjoint union of the shape of B and the shape of C, modulo the shape of A. So in this case that gives us the shape of B plus another point that will be identified with the Semigroup node of B. This modding out leaves us with a diagram with the same shape as the B we started with.

What are the contents of D? In this case, what was the semigroup of B has been transformed into the commutative semigroup that we gave in C. The morphisms of the diagram arrow from B to D include our inclusion from semigroup to commutativeSemigroup. And the diagram arrow from C to D consists of the identity arrow from commutativeSemigroup to commutativeSemigroup. What was a monoid in B is a commutativeMonoid in D, and the diagram arrow from B to D contains a morphism from monoid to commutativeMonoid. There is no morphism in the diagram arrow from C to D that targets commutativeMonoid. Finally, what was a LeftNearSemiring in B becomes a commutativeLeftNearSemiring where both multiplication and addition are commutative. Both operations are commutative because the two squares generated by the "multiplicative" arrow and the "additive" arrow must commute.

[Diagram: the pushout square, with A = Semigroup, B = Semigroup → Monoid → LeftNearSemiring, C = CommutativeSemigroup, and D = CommutativeSemigroup → CommutativeMonoid → CommutativeLeftNearSemiring.]

Now suppose we don't want to make both operations commutative. Say we only want to make the additive operation commutative. To achieve this we need to remove the "multiplicative" arrow from the diagram B, yielding a diagram B′. Now the pushout of B′ and C over A yields a new diagram D′ which is similar to D, but where the LeftNearSemiring has been replaced with a LeftNearSemiring theory that only has addition commutative.

To be sure, we are not deleting the "multiplicative" arrow from our theory development. We are simply deciding to not include it in the diagram B′ we use to generate the pushout. This shows that we need to take arrows very seriously. Which arrows we include in our diagrams when doing diagram pushouts can have significant consequences on the generated theories.

Now suppose we want to have our cake and eat it too. We want to build all three extensions of LeftNearSemiring: one making the additive operation commutative, one making the multiplicative operation commutative, and one making both operations commutative. This is possible by creating duplicate copies of LeftNearSemiring in our diagram. One copy of LeftNearSemiring has only one arrow pointing to it, from Monoid, corresponding to the "additive" arrow, and another copy of LeftNearSemiring has only one arrow pointing to it, from Semigroup, corresponding to the "multiplicative" arrow. Finally we put in a third copy of LeftNearSemiring with two arrows from the other two LeftNearSemiring nodes, each labeled with the identity morphism.

When we take the pushout with this diagram, each different copy of LeftNearSemiring will be assigned a different resulting theory. The first will get commutative addition only, the second will get commutative multiplication only, and the last will get both commutative addition and commutative multiplication. The two arrows labeled with identity morphisms will be transformed into inclusions.

Suppose we have a diagram d of theories starting from Magma, and we want to extend this diagram with commutativity, associativity, and both commutativity and associativity. To do this we would use the above pushout mechanism twice: once to create dc, the diagram of commutative structures, and another time to create da, the diagram of associative structures. Then to create the diagram dac of commutative and associative structures, we take the pushout of dc and da over d, using the diagram morphisms from d obtained when we created dc and da.

[Diagram: the extended development with three copies of LeftNearSemiring, yielding the additive-commutative, multiplicative-commutative, and fully commutative LeftNearSemiring theories.]

Lenses Are Exactly the Coalgebras for the Store Comonad

The store comonad (better known as the costate comonad) can be defined as

data Store b a = Store (b -> a) b

with the following comonadic operations

extract :: Store b a -> a
extract (Store v b) = v b

fmap :: (a -> c) -> Store b a -> Store b c
fmap f (Store v b) = Store (f . v) b

duplicate :: Store b a -> Store b (Store b a)
duplicate (Store v b) = Store (Store v) b

extend :: (Store b a -> c) -> Store b a -> Store b c
extend f x = fmap f (duplicate x)

Given this we can define a lens (aka a functional reference) as follows

type Lens a b = a -> Store b a

A lens can be used as an accessor to a field of a record, though they have many uses beyond this.

data Entry = Entry { name_ :: String
                   ; city_ :: String
                   ; birthdate_ :: Date
                   }

name :: Lens Entry String
name entry = Store (\newName -> entry {name_ = newName}) (name_ entry)

Above we create a lens to access the name field of this Entry structure. We can build similar lenses to access the other fields.

We can write getter and setter functions for arbitrary lenses.

get :: Lens a b -> a -> b
get lens a = b
 where
  Store v b = lens a

set :: Lens a b -> a -> b -> a
set lens a = v
 where
  Store v b = lens a

However, we expect the get and set functions to obey certain laws. Here are the three laws that Koji Kagawa gives in his paper "Compositional References for Stateful Functional Programming". They are the laws that any second year computer science student would intuitively write down.

  • get l (set l s b) = b
  • set l s (get l s) = s
  • set l (set l s b1) b2 = set l s b2

Today I was trying to rewrite these lens laws to use our comonadic operations on the store comonad instead of get and set. So after playing around I came up with the following alternative set of laws

  • extract . l = id
  • fmap l . l = duplicate . l

Interestingly these laws no longer depend on any special operations of the store comonad. They only use the operations generic for any comonad.

Do these laws look familiar? They should. They are exactly the laws of a coalgebra for a comonad!

I was floored by this. These three almost arbitrary laws that every computer scientist intuits are exactly the laws implied by this abstract nonsense from category theory. I honestly wrote these laws out completely before noticing what they were. I do not recall this fact being noted in any paper about lenses that I have read, but it will definitely be mentioned in my upcoming paper.

Below is the Coq code that proves that these two sets of laws are equivalent.

Set Implicit Arguments.

Section Lens.

Variable B : Type.

Definition W a := ((B -> a) * B)%type.

Definition extract (a:Type) (x : W a) : a := fst x (snd x).

Definition fmap (a b:Type) (f : a -> b) (x : W a) : W b := pair (fun z => f (fst x z)) (snd x).

Definition duplicate (a:Type) (x : W a) : W (W a) := pair (pair (fst x)) (snd x) .

Definition extend (a b:Type) (f : W a -> b) (x : W a) : W b := fmap f (duplicate x).

Variable A : Type.

Variable l : A -> W A.

Definition rd (r : A -> W A) (a:A) : B := snd (r a).
Definition wr (r : A -> W A) (a:A) (b:B) : A := fst (r a) b.

Section Iso1.

Hypothesis coalgebra1 : forall a, extract (l a) = a.
Hypothesis coalgebra2 : forall a, fmap l (l a) = duplicate (l a).

Lemma lawA : forall s t, rd l (wr l s t) = t.
Proof.
intros s t.
generalize (coalgebra2 s).
unfold rd, wr.
destruct (l s) as [f b].
unfold fmap, duplicate.
simpl.
inversion 1.
change (l (f t)) with ((fun z => l (f z)) t).
rewrite H1.
reflexivity.
Qed.

Lemma lawB : forall s, wr l s (rd l s) = s.
Proof.
firstorder.
Qed.

Lemma lawC : forall s t1 t2, wr l (wr l s t1) t2 = wr l s t2.
Proof.
intros s t1 t2.
generalize (coalgebra2 s).
unfold rd, wr.
destruct (l s) as [f b].
unfold fmap, duplicate.
simpl.
inversion 1.
change (l (f t1)) with ((fun z => l (f z)) t1).
rewrite H1.
reflexivity.
Qed.

End Iso1.

Section Iso2.

Hypothesis law1 : forall s t, rd l (wr l s t) = t.
Hypothesis law2 : forall s, wr l s (rd l s) = s.
Hypothesis law3 : forall s t1 t2, wr l (wr l s t1) t2 = wr l s t2.

Lemma coalgebraA : forall a, extract (l a) = a.
Proof.
firstorder.
Qed.

(* One day we will have observational type theory, but for now ... *)
Require Import FunctionalExtensionality.

Lemma coalgebraB : forall a, fmap l (l a) = duplicate (l a).
Proof.
intros a.
apply injective_projections; auto.
apply functional_extensionality; intros x.
simpl.
apply injective_projections.
 apply functional_extensionality; intros y.
 firstorder.
firstorder.
Qed.

End Iso2.
End Lens.

Edit (2011-01-19): I renamed the Selection comonad to the Store comonad and rearranged the order of some parameters to make it more compatible with Control.Comonad.Trans.Store. Some of the comments below use the old name.