At the bottom of the optical hierarchy rests the humble Semantic Editor Combinator (SEC), also known as a Setter, which denotes an isomorphism between sₓ and F(aₓ) for some Functor F. The types s and a are possibly parametrized, which we indicate by a subscript x.
The rest of the mainline optical hierarchy consists of various refinements of the class of functors being quantified over.

SEC :: ∃ F: Functor. sₓ ~ F(aₓ)
Traversal :: ∃ F: FinitaryContainer. sₓ ~ F(aₓ)
AffineTraversal :: ∃ c₀ c₁. sₓ ~ c₀ + c₁ × aₓ
Prism :: ∃ c₀. sₓ ~ c₀ + aₓ
Lens :: ∃ c₁. sₓ ~ c₁ × aₓ
Iso :: sₓ ~ aₓ

For lenses and prisms, the functors are of the form (c₁ ×) and (c₀ +) respectively, for some residual type.
At the top of the hierarchy we have ‘Iso’, where we quantify over only the Identity functor, so the existential effectively disappears.
All these classes of functors are closed under composition, though I am not sure how important this fact is.
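
To make the table concrete, `Maybe` witnesses the Prism row: `Maybe a` is isomorphic to `() + a`, with residual c₀ = (). A small sketch (the helper names `toSum` and `fromSum` are mine, not part of the post):

```haskell
-- Maybe a ≅ Either () a: the Prism row of the table, with residual c₀ = ().
toSum :: Maybe a -> Either () a
toSum = maybe (Left ()) Right

fromSum :: Either () a -> Maybe a
fromSum = either (const Nothing) Just
```

Both composites are the identity, which is exactly the claimed isomorphism.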

To build profunctor implementations of these optics, we add additional methods that allow profunctors to be lifted through these various classes of functors:

> class Profunctor p where
>   dimap :: (a -> b) -> (c -> d) -> p b c -> p a d
>
> class Profunctor p => Strong p where
>   first :: p a b -> p (a, c) (b, c)
>   second :: p a b -> p (c, a) (c, b)
>   second = dimap swap swap . first
>     where
>       swap (a,b) = (b,a)
>
> class Profunctor p => Choice p where
>   left :: p a b -> p (Either a c) (Either b c)
>   right :: p a b -> p (Either c a) (Either c b)
>   right = dimap switch switch . left
>     where
>       switch (Left x) = Right x
>       switch (Right y) = Left y
>
> class (Choice p, Strong p) => Step p where
>   step :: p a b -> p (Either c (a,d)) (Either c (b,d))
>   step = right . first
>
> class Step p => Walk p where
>   walk :: Traversable f => p a b -> p (f a) (f b)

> class Walk p => Settable p where
>   mapping :: Functor f => p a b -> p (f a) (f b)

The profunctor implementation of the optical hierarchy quantifies over these various extensions of profunctors

> type Optic p s t a b = p a b -> p s t
> type Iso s t a b = forall p. Profunctor p => Optic p s t a b
> type Lens s t a b = forall p. Strong p => Optic p s t a b
> type Prism s t a b = forall p. Choice p => Optic p s t a b
> type AffineTraversal s t a b = forall p. Step p => Optic p s t a b
> type Traversal s t a b = forall p. Walk p => Optic p s t a b
> type SEC s t a b = forall p. Settable p => Optic p s t a b
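
As a sanity check that this encoding does real work, here is a self-contained sketch (mine, not part of the post) instantiating p at the function arrow. At p = (->), an Optic is just a function transformer, so running a Lens built with the post's lens combinator recovers ordinary field update:

```haskell
{-# LANGUAGE RankNTypes #-}
import Control.Arrow ((&&&))

-- minimal copies of the classes above, so the sketch stands alone
class Profunctor p where
  dimap :: (a -> b) -> (c -> d) -> p b c -> p a d

class Profunctor p => Strong p where
  first :: p a b -> p (a, c) (b, c)

instance Profunctor (->) where
  dimap f g h = g . h . f

instance Strong (->) where
  first f (a, c) = (f a, c)

type Optic p s t a b = p a b -> p s t
type Lens s t a b = forall p. Strong p => Optic p s t a b

-- the lens constructor from the post
lens :: (s -> a) -> (s -> b -> t) -> Lens s t a b
lens f g = dimap (f &&& id) (uncurry $ flip g) . first

_fst :: Lens (a, c) (b, c) a b
_fst = lens fst (\(_, c) b -> (b, c))

-- at p = (->), running the optic is just function application
over :: Optic (->) s t a b -> (a -> b) -> s -> t
over o = o
```

For example, `over _fst (+1) (1, "x")` updates the first component of the pair.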

Now we want to show that these optics actually correspond to their canonical definitions.
The functions below are all isomorphisms, but I will only give the hard directions for now.

> iso :: forall s t a b. (s -> a) -> (b -> t) -> Iso s t a b
> iso = dimap
>
> lens :: forall s t a b. (s -> a) -> (s -> b -> t) -> Lens s t a b
> lens f g = dimap (f &&& id) (uncurry $ flip g) . first
>
> prism :: forall s t a b. (s -> Either t a) -> (b -> t) -> Prism s t a b
> prism f g = dimap f (either id g) . right
>
> affineTraversal :: forall s t a b. (s -> Either t a) -> (s -> b -> t) -> AffineTraversal s t a b
> affineTraversal f g = dimap from (either id (uncurry $ flip g)) . right . first
>   where
>     from :: s -> Either t (a,s)
>     from s = (id +++ (,s)) (f s)
>
> traversal :: forall s t a b. (forall f. Applicative f => (a -> f b) -> s -> f t) -> Traversal s t a b
> traversal f = dimap (Rotate . from) (to . unrotate) . walk
>   where
>     from :: s -> PKStore a b t
>     from = f (Battery (Unit id))
>     to :: PKStore b b t -> t
>     to = extract
>
> sec :: forall s t a b. ((a -> b) -> s -> t) -> SEC s t a b
> sec f = dimap from to . mapping
>   where
>     from :: s -> PStore s a a
>     from = PStore id
>     to :: PStore s a b -> t
>     to (PStore g s) = f g s

The sec solution places the argument f on the to side, while the traversal places the argument f on the from side.
I suspect there is lots of wiggle room to slide f from one side to the other side.
I also suspect that there are much more elegant solutions to traversal and sec, possibly using different characterizations of the Walk and Settable classes.

This just leaves Views and Reviews, which are created by making one of the two type variables of a profunctor into a phantom variable.

> class Strong p => PhantomR p where
>   contrarmap :: (b -> a) -> p c a -> p c b
>   coerceRight :: p c a -> p c b
>   coerceRight = dimap id absurd . contrarmap absurd
>
> firstDefault :: PhantomR p => p a b -> p (a,c) (b,c)
> firstDefault = coerceRight . dimap fst id
>
> class Choice p => PhantomL p where
>   colmap :: (a -> b) -> p a c -> p b c
>   coerceLeft :: p a c -> p b c
>   coerceLeft = colmap absurd . dimap absurd id
>
> leftDefault :: PhantomL p => p a b -> p (Either a c) (Either b c)
> leftDefault = coerceLeft . dimap id Left
>
> type View s t a b = forall p. PhantomR p => Optic p s t a b
> type Review s t a b = forall p. PhantomL p => Optic p s t a b
>
> view :: forall s t a b. (s -> a) -> View s t a b
> view f = coerceRight . dimap f id
>
> review :: forall s t a b. (b -> t) -> Review s t a b
> review f = coerceLeft . dimap id f
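
To see the View encoding do something, here is a sketch (my own, using the common `Forget` profunctor, which is not part of the post) that extracts a plain getter from a View by instantiating p at a profunctor whose right variable really is phantom:

```haskell
{-# LANGUAGE RankNTypes #-}
-- Forget r ignores its right type variable entirely, so it is PhantomR.
newtype Forget r a b = Forget { runForget :: a -> r }

newtype Void = Void Void

absurd :: Void -> a
absurd (Void x) = absurd x

-- minimal copies of the classes above, so the sketch stands alone
class Profunctor p where
  dimap :: (a -> b) -> (c -> d) -> p b c -> p a d

class Profunctor p => Strong p where
  first :: p a b -> p (a, c) (b, c)

class Strong p => PhantomR p where
  contrarmap :: (b -> a) -> p c a -> p c b
  coerceRight :: p c a -> p c b
  coerceRight = dimap id absurd . contrarmap absurd

instance Profunctor (Forget r) where
  dimap f _ (Forget k) = Forget (k . f)

instance Strong (Forget r) where
  first (Forget k) = Forget (k . fst)

instance PhantomR (Forget r) where
  contrarmap _ (Forget k) = Forget k

type Optic p s t a b = p a b -> p s t
type View s t a b = forall p. PhantomR p => Optic p s t a b

view :: (s -> a) -> View s t a b
view f = coerceRight . dimap f id

-- running a View at p = Forget a recovers the underlying getter
getWith :: View s t a b -> s -> a
getWith v = runForget (v (Forget id))

getter :: (Int, String) -> Int
getter = getWith (view fst)
```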

This file is effectively a copy of part of bennofs's profunctor lenses library.
He is the one who figured out how to implement 'sec', which I had no idea how to do and thought was impossible.
His library is more extensive and is in turn based on his many discussions on the #haskell-lens channel on freenode.
Once he demonstrated to me that sec was in fact possible, I felt that traversal must be possible.
The trick was noticing that PKStore i j a is applicative in its third argument and traversable in its first argument.
This lets us use the PKStore functor in both contexts.

Edward Kmett hinted to me that all this is known and has been derived many times (but nobody told me).
Readers should work under the assumption that everything presented here is not novel and I am simply writing this down for me so that when I do build a profunctor based optics library in the future, I can reference it.

Nothing in this post is tested, none of the laws for the classes have been presented, and I have no proofs that anything is correct.
All I know is that it all typechecks, so it is probably close to correct.

> newtype Void = Void Void
>
> absurd :: Void -> a
> absurd (Void x) = absurd x
>
> data PStore i j x = PStore (j -> x) i
>
> instance Functor (PStore i j) where
>   fmap f (PStore g i) = PStore (f . g) i

> data PKStore i j x = Unit x
>                    | Battery (PKStore i j (j -> x)) i
>
> instance Functor (PKStore i j) where
>   fmap f (Unit x) = Unit (f x)
>   fmap f (Battery g i) = Battery (fmap (f .) g) i
>
> instance Applicative (PKStore i j) where
>   pure = Unit
>   Unit x <*> c = fmap x c
>   Battery g i <*> c = Battery (flip <$> g <*> c) i
>
> newtype RotatePKStore j x i = Rotate { unrotate :: PKStore i j x }
>
> instance Functor (RotatePKStore j x) where
>   fmap f (Rotate (Unit x)) = Rotate (Unit x)
>   fmap f (Rotate (Battery g i)) = Rotate (Battery (unrotate (fmap f (Rotate g))) (f i))
>
> -- I am not 100% certain that the Foldable and Traversable instances are not reversed.
> instance Foldable (RotatePKStore j x) where
>   foldMap f (Rotate (Unit x)) = mempty
>   foldMap f (Rotate (Battery g i)) = f i <> foldMap f (Rotate g)
>
> instance Traversable (RotatePKStore j x) where
>   traverse f (Rotate (Unit x)) = pure (Rotate (Unit x))
>   traverse f (Rotate (Battery g i)) = Rotate <$> (f i <**> (Battery . unrotate <$> traverse f (Rotate g)))
>
> extract :: PKStore i i x -> x
> extract (Unit x) = x
> extract (Battery g i) = extract g i

shachaf taught me prisms today! Here is what I learned.

A prism is dual to the notion of a lens. A lens lets you access a field of a record by letting you get and set its value. A prism lets you access a component of a sum by letting you beget and match the component. Lenses are composable, letting you access fields inside deeply nested records. Prisms are composable, letting you access components of a deep hierarchy of sums.

For example, you may have a large hierarchy of exception types. Using prisms you can compose together the basic prisms that reference components from one level of your hierarchy to the next. The composite prism will let you make an exception by injecting the contents of your specific exception and generating an object of your top level exception type. On the flip side, you can also use the prism to match and catch a specific subtype of exceptions at a particular level of your hierarchy and automatically re-throw exceptions that do not match. Thus prisms can let you have your statically typed extensible exception mechanism without the pain. You can even make prisms that allow you to make and match virtual exception types that aren't part of your concrete representation, just like you can make lenses to access virtual fields of your records that are not actual concrete fields of your records.

A lens a b is an abstract representation of ∃c. a ~ b×c, while a prism a b is an abstract representation of ∃c. a ~ b+c.

The simple concrete implementation of a prism has two components: beget and match.

data Prism a b = Prism { beget :: b -> a, match :: a -> Either a b }

An example of a prism is the value held in a Maybe type:

_just :: Prism (Maybe a) a
_just = Prism { beget = Just, match = maybe (Left Nothing) Right }
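
Prisms compose in exactly the way the exception example above describes. A sketch with my own composePrism helper (not from the post), in which a failed inner match is re-injected on the Left, giving the automatic re-throw behaviour:

```haskell
data Prism a b = Prism { beget :: b -> a, match :: a -> Either a b }

-- compose two prisms; a non-matching value is re-injected at the top level
composePrism :: Prism a b -> Prism b c -> Prism a c
composePrism p q = Prism (beget p . beget q) m
  where
    m a = case match p a of
      Left a' -> Left a'
      Right b -> either (Left . beget p) Right (match q b)

-- a tiny two-level "exception hierarchy"
data Low = LowA Int | LowB        deriving (Eq, Show)
data Top = TopLow Low | TopOther  deriving (Eq, Show)

_topLow :: Prism Top Low
_topLow = Prism TopLow toL
  where
    toL (TopLow l) = Right l
    toL t          = Left t

_lowA :: Prism Low Int
_lowA = Prism LowA toA
  where
    toA (LowA n) = Right n
    toA l        = Left l

_deep :: Prism Top Int
_deep = composePrism _topLow _lowA
```

Matching with `_deep` either reaches the Int two levels down or returns the original Top value unchanged.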

There are three laws for prisms, which are dual to the three laws for lenses. Given a prism, p, we require:

1. match p . beget p === Right
2. (id ||| beget p) . match p === id
3. left (match p) . match p === left Left . match p
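
As a quick sanity check, laws 1 and 2 hold for `_just` on sample values (a sketch that repeats the definitions so it stands alone):

```haskell
import Control.Arrow ((|||))

data Prism a b = Prism { beget :: b -> a, match :: a -> Either a b }

_just :: Prism (Maybe a) a
_just = Prism { beget = Just, match = maybe (Left Nothing) Right }

-- law 1: match p . beget p === Right
law1 :: Bool
law1 = (match _just . beget _just) (3 :: Int) == Right 3

-- law 2: (id ||| beget p) . match p === id
law2 :: Bool
law2 = all (\s -> ((id ||| beget _just) . match _just) s == s)
           [Nothing, Just (4 :: Int)]
```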

To help see the duality, consider the following definition of a lens.

data Lens a b = Lens { get :: a -> b, set :: (a,b) -> a }

Given a lens, l, we can write the lens laws as:

1. get l . set l = snd
2. set l . (id &&& get l) = id
3. set l . first (set l) = set l . first fst

which admittedly is quite awkward, but it makes the duality apparent.
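Checking the three lens laws on a concrete example makes the awkward formulation less mysterious (a sketch; `_fstL` for the pair's first component is my name):

```haskell
import Control.Arrow ((&&&), first)

data Lens a b = Lens { get :: a -> b, set :: (a, b) -> a }

_fstL :: Lens (b, c) b
_fstL = Lens fst (\((_, c), b) -> (b, c))

-- law 1: get l . set l = snd
law1 :: Bool
law1 = (get _fstL . set _fstL) (((1, "x"), 9) :: ((Int, String), Int)) == 9

-- law 2: set l . (id &&& get l) = id
law2 :: Bool
law2 = (set _fstL . (id &&& get _fstL)) ((1, "x") :: (Int, String)) == (1, "x")

-- law 3: set l . first (set l) = set l . first fst
law3 :: Bool
law3 = lhs s == rhs s
  where
    s   = (((1, "x"), 9), 7) :: (((Int, String), Int), Int)
    lhs = set _fstL . first (set _fstL)
    rhs = set _fstL . first fst
```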

I am not yet aware of any nice algebra or coalgebra representation of prisms, but there is a van Laarhoven representation of prisms.

class Functor f => Costrong f where
  costrength :: f (Either a b) -> Either a (f b)

type PrismVL a b = forall g f. (Costrong g, Pointed f) => (g b -> f b) -> g a -> f a

The van Laarhoven representation of a prism family is straightforward.

We can build a PrismVL from a Prism.

toVL :: Prism a b -> PrismVL a b
toVL p f = either pure (fmap (beget p) . f) . costrength . fmap (match p)

In reverse, we can convert a PrismVL back to a Prism.

instance Costrong (Const b) where
  costrength (Const x) = Right (Const x)

instance Costrong Identity where
  costrength (Identity x) = right Identity x

fromVL :: PrismVL a b -> Prism a b
fromVL vlp = Prism begetVL matchVL
  where
    begetVL = runIdentity . vlp (Identity . getConst) . Const
    matchVL = (Right ||| Left) . vlp (Left . runIdentity) . Identity

We can check that fromVL . toVL === id by letting vlp = toVL p in

begetVL x = runIdentity . vlp (Identity . getConst) $ Const x
= runIdentity . toVL p (Identity . getConst) $ (Const x)
= runIdentity . either pure (fmap (beget p) . Identity . getConst) . costrength . fmap (match p) $ (Const x)
= runIdentity . either pure (fmap (beget p) . Identity . getConst) . costrength $ (Const x)
= runIdentity . either pure (fmap (beget p) . Identity . getConst) $ Right (Const x)
= runIdentity (fmap (beget p) . Identity . getConst $ Const x)
= runIdentity (fmap (beget p) . Identity $ x)
= runIdentity (Identity (beget p x))
= beget p x
matchVL x = (Right ||| Left) . vlp (Left . runIdentity) $ Identity x
= (Right ||| Left) . toVL p (Left . runIdentity) $ Identity x
= (Right ||| Left) . either pure (fmap (beget p) . Left . runIdentity) . costrength . fmap (match p) $ Identity x
= (Right ||| Left) . either pure (fmap (beget p) . Left . runIdentity) . costrength $ Identity (match p x)
= (Right ||| Left) . either pure (fmap (beget p) . Left . runIdentity) $ right Identity (match p x)
= (Right ||| Left) $ either pure (fmap (beget p) . Left . runIdentity . Identity) (match p x)
= (Right ||| Left) $ either pure (fmap (beget p) . Left) (match p x)
= either (fmap (beget p) . Left) pure (match p x)
= either Left Right (match p x)
= match p x

To check that toVL . fromVL === id will likely require some heavy use of parametricity theorems, so I will leave that for later.

Lastly, I expect the prism laws to translate to something nice in the van Laarhoven representation.

(* This file proves that the Identity and Composition laws of traversals
   are equivalent to the laws of a coalgebroid for an indexed Kleene store comonad.
   Assumptions:
   - functional extensionality
   - an instance of a parametricity law
*)
Require Import FunctionalExtensionality.
Set Implicit Arguments.
Unset Strict Implicit.
Record Applicative := applicative
{ carrier :> Type -> Type
; pure : forall a, a -> carrier a
; ap : forall a b, carrier (a -> b) -> carrier a -> carrier b
; map := fun a b (f : a -> b) x => ap (pure f) x
; _ : forall (a : Type) x, map _ _ (fun (y : a) => y) x = x
; _ : forall a b c x y z,
ap (ap (map _ _ (fun (f : b -> c) (g : a -> b) (w:a) => f (g w)) x) y) z =
ap x (ap y z)
; _ : forall a b (f : a -> b) x, map _ _ f (pure x) = pure (f x)
; _ : forall a b (f : carrier (a -> b)) x,
ap f (pure x) = map _ _ (fun g => g x) f
}.
Record IdiomaticTransformation (F G:Applicative) := idiomaticTransformation
{ transform :> forall a, F a -> G a
; _ : forall a (x:a), transform (pure F x) = pure G x
; _ : forall a b (f : F (a -> b)) x, transform (ap f x) = ap (transform f) (transform x)
}.
Definition TraversableSig a a' b b' := forall (f:Applicative), (b -> f b') -> (a -> f a').
Axiom parametricity_instance :
forall a a' b b' (t : TraversableSig a a' b b') F G (eta : IdiomaticTransformation F G) f x,
eta _ (t F f x) = t G (fun a => eta _ (f a)) x.
Lemma identity_law (F : Applicative) a (x : F a) : map (fun (y : a) => y) x = x.
Proof.
destruct F.
auto.
Qed.
Lemma composition_law (F : Applicative) a b c x y (z : F a) :
ap (ap (map (fun (f : b -> c) (g : a -> b) (w:a) => f (g w)) x) y) z =
ap x (ap y z).
Proof.
destruct F.
auto.
Qed.
Lemma homomorphism_law (F : Applicative) a b (f : a -> b) x :
ap (pure F f) (pure F x) = pure F (f x).
Proof.
fold (map f (pure F x)).
destruct F.
auto.
Qed.
Lemma interchange_law (F : Applicative) a b (f : F (a -> b)) x :
ap f (pure F x) = map (fun g => g x) f.
Proof.
destruct F.
auto.
Qed.
Definition IdApplicative : Applicative.
exists (fun A => A) (fun a (x : a) => x) (fun a b (f : a -> b) => f);
try reflexivity.
Defined.
Section ComposeApplicative.
Variable (F G :Applicative).
Let FG (a:Type) := F (G a).
Definition Compose_pure (a:Type) (x:a) : FG a :=
pure F (pure G x).
Definition Compose_ap (a b:Type) (f:FG (a -> b)) (x:FG a) : FG b :=
ap (ap (pure F (fun g x => ap g x)) f) x.
Lemma Compose_identity (a:Type) (x : FG a):
Compose_ap (Compose_pure (fun y : a => y)) x = x.
Proof.
unfold Compose_ap, Compose_pure.
rewrite homomorphism_law.
replace (fun x0 : G a => ap (a:=G) (a0:=a) (b:=a) (pure G (fun y : a => y)) x0)
with (fun (y : G a) => y).
apply identity_law.
apply functional_extensionality; intro y.
symmetry.
apply identity_law.
Qed.
Lemma Compose_composition (a b c:Type) x y (z : FG a) :
Compose_ap (Compose_ap (Compose_ap (Compose_pure (fun (f : b -> c) (g : a -> b) (w:a) => f (g w)))
x) y) z =
Compose_ap x (Compose_ap y z).
Proof.
unfold Compose_ap, Compose_pure.
repeat rewrite homomorphism_law.
repeat (unfold map; simpl; (repeat rewrite homomorphism_law); rewrite <- composition_law).
unfold map; simpl; repeat rewrite homomorphism_law.
repeat rewrite interchange_law.
repeat (unfold map; simpl; (repeat rewrite homomorphism_law); rewrite <- composition_law).
unfold map; simpl; repeat rewrite homomorphism_law.
replace (fun (w : G (b -> c)) (w0 : G (a -> b)) (x0 : G a) =>
ap (a:=G) (a0:=a) (b:=c)
(ap (a:=G) (a0:=a -> b) (b:=a -> c)
(ap (a:=G) (a0:=b -> c) (b:=(a -> b) -> a -> c)
(pure G
(fun (f : b -> c) (g : a -> b) (w1 : a) => f (g w1)))
w) w0) x0)
with (fun (w : G (b -> c)) (w0 : G (a -> b)) (w1 : G a) =>
ap (a:=G) (a0:=b) (b:=c) w (ap (a:=G) (a0:=a) (b:=b) w0 w1)).
reflexivity.
repeat (apply functional_extensionality; intro).
symmetry.
apply composition_law.
Qed.
Lemma Compose_homomorphism a b (f : a -> b) x :
Compose_ap (Compose_pure f) (Compose_pure x) = Compose_pure (f x).
Proof.
unfold Compose_ap, Compose_pure.
repeat rewrite homomorphism_law.
reflexivity.
Qed.
Lemma Compose_interchange a b (f : FG (a -> b)) x :
Compose_ap f (Compose_pure x) = Compose_ap (Compose_pure (fun g => g x)) f.
Proof.
unfold Compose_ap, Compose_pure.
repeat rewrite homomorphism_law.
repeat rewrite interchange_law.
repeat (unfold map; simpl; (repeat rewrite homomorphism_law); rewrite <- composition_law).
repeat rewrite interchange_law.
unfold map; simpl; repeat rewrite homomorphism_law.
replace (fun w : G (a -> b) => ap (a:=G) (a0:=a) (b:=b) w (pure G x))
with (fun x0 : G (a -> b) =>
ap (a:=G) (a0:=a -> b) (b:=b) (pure G (fun g : a -> b => g x)) x0).
reflexivity.
apply functional_extensionality; intro.
symmetry.
apply interchange_law.
Qed.
Definition ComposeApplicative : Applicative.
exists FG Compose_pure Compose_ap.
apply Compose_identity.
apply Compose_composition.
apply Compose_homomorphism.
apply Compose_interchange.
Defined.
End ComposeApplicative.
Inductive KleeneStore (i j a : Type) :=
| Unit : a -> KleeneStore i j a
| Battery : KleeneStore i j (j -> a) -> i -> KleeneStore i j a.
Fixpoint KleeneStore_map (i j a b: Type) (f : (a -> b))
(x : KleeneStore i j a) : KleeneStore i j b :=
match x with
| Unit y => Unit _ _ (f y)
| Battery v a => Battery (KleeneStore_map (fun g z => f (g z)) v) a
end.
Lemma KleeneStore_map_id (i j a : Type) (x : KleeneStore i j a) :
KleeneStore_map (fun a => a) x = x.
Proof.
induction x.
reflexivity.
simpl.
replace (fun (g : j -> a) (z : j) => g z) with (fun (g : j -> a) => g).
rewrite IHx.
reflexivity.
apply functional_extensionality.
apply eta_expansion.
Qed.
Lemma KleeneStore_map_comp (i j a b c : Type) (f : b -> c) (g : a -> b)
(x : KleeneStore i j a) :
KleeneStore_map f (KleeneStore_map g x) = KleeneStore_map (fun a => f (g a)) x.
Proof.
revert b c f g.
induction x.
reflexivity.
simpl.
intros b c f g.
rewrite IHx.
reflexivity.
Qed.
Fixpoint KleeneStore_apH (i j a b c : Type) (h : c -> a -> b) (f : KleeneStore i j c) {struct f} :
KleeneStore i j a -> KleeneStore i j b :=
match f with
| Unit g => KleeneStore_map (h g)
| Battery v a => fun y =>
Battery (KleeneStore_apH (fun r s t => (h (r t) s)) v y) a
end.
Definition KleeneStore_ap (i j a b : Type) :
KleeneStore i j (a -> b) -> KleeneStore i j a -> KleeneStore i j b
:= KleeneStore_apH (fun x => x).
Lemma KleeneStore_map_ap i j a b (f : a -> b) x : KleeneStore_map f x =
KleeneStore_ap (Unit i j f) x.
Proof.
reflexivity.
Qed.
Lemma KleeneStore_identity i j a x : KleeneStore_ap (Unit i j (fun (y : a) => y)) x = x.
Proof.
induction x.
reflexivity.
unfold KleeneStore_ap in *; simpl in *.
replace (fun (g : j -> a) (z : j) => g z) with (fun (g : j -> a) => g).
rewrite IHx; reflexivity.
apply functional_extensionality.
apply eta_expansion.
Qed.
Lemma KleeneStore_apH_ap i j a b c h f x : @KleeneStore_apH i j a b c h f x =
KleeneStore_ap (KleeneStore_map h f) x.
Proof.
cut (forall i j a b c d h (g : d -> c) f x,
@KleeneStore_apH i j a b c h (KleeneStore_map g f) x =
KleeneStore_ap (KleeneStore_map (fun z => h (g z)) f) x).
intros H.
assert (H0 := H i j a b c c h (fun z => z) f x).
rewrite KleeneStore_map_ap, KleeneStore_identity in H0.
rewrite H0.
rewrite (eta_expansion h).
reflexivity.
clear i j a b c h f x.
intros i j a b c d h g f x.
revert b c g h.
induction f.
reflexivity.
intros b c g h.
unfold KleeneStore_ap; simpl.
rewrite IHf, IHf.
reflexivity.
Qed.
Lemma KleeneStore_interchange i j a b (f : KleeneStore i j (a -> b)) x :
KleeneStore_ap f (Unit i j x) = KleeneStore_ap (Unit i j (fun g => g x)) f.
Proof.
cut (forall i j a b c (h : c -> a -> b) (f : KleeneStore i j c) x,
KleeneStore_apH h f (Unit i j x) = KleeneStore_ap (Unit i j (fun g => h g x)) f).
intros H.
apply H.
clear i j a b f x.
intros i j a b c h f.
revert a b h.
induction f.
reflexivity.
intros a0 b h x.
unfold KleeneStore_ap; simpl.
rewrite IHf.
reflexivity.
Qed.
Lemma KleeneStore_composition i j a b c x y z :
KleeneStore_ap (KleeneStore_ap (KleeneStore_ap (Unit i j (fun (f : b -> c) (g : a -> b) (w:a) => f (g w))) x) y) z =
KleeneStore_ap x (KleeneStore_ap y z).
Proof.
unfold KleeneStore_ap; simpl.
cut (forall c0 c1 (h0 : c0 -> b -> c) (x0 : KleeneStore i j c0) (h1 : c1 -> a -> b) (y0 : KleeneStore i j c1),
KleeneStore_ap
(KleeneStore_ap
(KleeneStore_ap
(Unit i j (fun (f : b -> c) (g : a -> b) (w : a) => f (g w)))
(KleeneStore_map h0 x0)) (KleeneStore_map h1 y0)) z =
KleeneStore_apH h0 x0 (KleeneStore_apH h1 y0 z)).
intros H.
rewrite <- H, <- KleeneStore_map_ap.
do 2 rewrite (KleeneStore_map_ap (fun g => g)), KleeneStore_identity.
reflexivity.
clear x y.
assert (H : forall (a b c : Type) (h0 : b -> c) c0 (h1 : c0 -> a -> b) (z : KleeneStore i j a) (y : KleeneStore i j c0),
KleeneStore_ap
(KleeneStore_map (fun (a' : c0) (w : a) => h0 (h1 a' w)) y) z =
KleeneStore_map h0 (KleeneStore_apH h1 y z)).
clear - i j.
intros a b c h0 c0 h1 z y.
revert a b c h0 h1 z.
induction y; intros a' b c h0 h1 z; simpl.
rewrite KleeneStore_map_comp.
reflexivity.
rewrite <- IHy.
unfold KleeneStore_ap; simpl.
repeat rewrite KleeneStore_apH_ap.
repeat rewrite KleeneStore_map_comp.
reflexivity.
intros c0 c1 h0 x.
revert b c h0 c1 a z.
unfold KleeneStore_ap; simpl.
induction x; intros b c h0 c1 a' z h1 y; simpl.
generalize (h0 a0).
clear a h0 a0.
intros h0.
rewrite KleeneStore_map_comp.
apply H.
rewrite <- IHx.
repeat rewrite KleeneStore_map_comp.
repeat rewrite KleeneStore_apH_ap.
repeat rewrite KleeneStore_map_comp.
unfold KleeneStore_ap; simpl.
repeat (rewrite <- H; repeat rewrite KleeneStore_map_comp).
reflexivity.
Qed.
Definition KleeneStoreApplicative (i j : Type) : Applicative.
exists (KleeneStore i j) (Unit i j) (@KleeneStore_ap i j).
apply KleeneStore_identity.
apply KleeneStore_composition.
reflexivity.
apply KleeneStore_interchange.
Defined.
Canonical Structure KleeneStoreApplicative.
Fixpoint KleeneExtract i a (f : KleeneStore i i a) : a :=
match f with
| Unit x => x
| Battery v y => KleeneExtract v y
end.
Fixpoint KleeneDuplicate i j k a (f : KleeneStore i k a) :
KleeneStore i j (KleeneStore j k a) :=
match f with
| Unit x => Unit i j (Unit j k x)
| Battery v y => Battery (KleeneStore_map (@Battery _ _ _) (KleeneDuplicate j v)) y
end.
Lemma KleeneExtract_map i a b (f : a -> b) (x : KleeneStore i i a) :
KleeneExtract (KleeneStore_map f x) = f (KleeneExtract x).
Proof.
revert b f.
induction x; intros b f.
reflexivity.
simpl.
rewrite IHx.
reflexivity.
Qed.
Lemma KleeneDuplicate_map i j k a b (f : a -> b) (x : KleeneStore i k a) :
KleeneDuplicate j (KleeneStore_map f x) = KleeneStore_map (KleeneStore_map f) (KleeneDuplicate j x).
Proof.
revert b f.
induction x; intros b f.
reflexivity.
simpl.
rewrite IHx.
repeat rewrite KleeneStore_map_comp.
simpl.
replace (fun (a0 : KleeneStore j k (k -> a)) (z : j) =>
Battery (KleeneStore_map (fun (g : k -> a) (z0 : k) => f (g z0)) a0) z)
with (fun a0 : KleeneStore j k (k -> a) =>
Battery (KleeneStore_map (fun (g : k -> a) (z : k) => f (g z)) a0)).
reflexivity.
apply functional_extensionality; intro.
apply eta_expansion.
Qed.
Lemma KleeneStore_law1 i j a (x : KleeneStore i j a) :
KleeneExtract (KleeneDuplicate _ x) = x.
Proof.
induction x.
reflexivity.
simpl.
rewrite KleeneExtract_map.
rewrite IHx.
reflexivity.
Qed.
Lemma KleeneStore_law2 i j a (x : KleeneStore i j a) :
KleeneStore_map (@KleeneExtract _ _) (KleeneDuplicate _ x) = x.
Proof.
induction x.
reflexivity.
simpl.
rewrite KleeneStore_map_comp.
simpl.
replace (fun (a0 : KleeneStore j j (j -> a)) (z : j) => KleeneExtract a0 z)
with (@KleeneExtract j (j -> a)).
rewrite IHx.
reflexivity.
do 2 (apply functional_extensionality; intro).
reflexivity.
Qed.
Lemma KleeneStore_law3 i j k l a (x : KleeneStore i l a) :
KleeneDuplicate j (KleeneDuplicate k x) = KleeneStore_map (fun z => KleeneDuplicate k z) (KleeneDuplicate j x).
Proof.
induction x.
reflexivity.
simpl.
rewrite KleeneDuplicate_map, IHx.
repeat rewrite KleeneStore_map_comp.
simpl.
replace (fun (a0 : KleeneStore j l (l -> a)) (z : j) =>
@Battery j k (KleeneStore k l a)
(@KleeneStore_map j k (KleeneStore k l (l -> a))
(k -> KleeneStore k l a) (@Battery k l a)
(@KleeneDuplicate j k l (l -> a) a0)) z)
with (fun (a0 : KleeneStore j l (l -> a)) =>
Battery (KleeneStore_map (@Battery k l a) (KleeneDuplicate k a0))).
reflexivity.
do 2 (apply functional_extensionality; intro); reflexivity.
Qed.
Fixpoint experiment i j (F : Applicative) (g : i -> F j) a (k : KleeneStore i j a) :
F a :=
match k with
| Unit x => pure F x
| Battery v b => ap (map (fun x f => f x) (g b)) (experiment g v)
end.
Lemma experiment_map i j a b (F : Applicative) (g : i -> F j) (f : a -> b) (x : KleeneStore i j a) :
experiment g (KleeneStore_map f x) = map f (experiment g x).
Proof.
revert b f.
induction x; intros b f; simpl.
rewrite <- homomorphism_law.
reflexivity.
rewrite IHx.
repeat (unfold map; simpl; rewrite <- composition_law).
unfold map; simpl; repeat rewrite homomorphism_law.
repeat rewrite interchange_law.
repeat (unfold map; simpl; rewrite <- composition_law).
unfold map; simpl; repeat rewrite homomorphism_law.
reflexivity.
Qed.
Lemma experiment_ap i j a b (f : Applicative) (g : i -> f j)
(h : KleeneStore i j (a -> b)) (x : KleeneStore i j a) :
experiment g (ap h x) = ap (experiment g h) (experiment g x).
Proof.
cut (forall c (h0 : c -> a -> b) (h : KleeneStore i j c),
experiment g (KleeneStore_apH h0 h x) = ap (experiment g (map h0 h)) (experiment g x)).
intros H.
simpl; unfold KleeneStore_ap; simpl.
rewrite H, identity_law.
reflexivity.
clear h.
intros c h0 h.
revert b h0.
induction h; intros b h0; simpl.
apply experiment_map.
rewrite IHh.
simpl.
rewrite <- KleeneStore_map_ap.
rewrite experiment_map, experiment_map.
repeat (unfold map; simpl; rewrite <- composition_law).
unfold map; simpl; repeat rewrite homomorphism_law.
repeat rewrite interchange_law.
repeat (unfold map; simpl; rewrite <- composition_law).
unfold map; simpl; repeat rewrite homomorphism_law.
replace (fun (w : j) (w0 : j -> a0) (w1 : a) => h0 (w0 w) w1)
with (fun (w : j) (w0 : j -> a0) => h0 (w0 w)).
reflexivity.
do 2 (apply functional_extensionality; intro); apply eta_expansion.
Qed.
Definition experiment_idomaticTransformation i j (F : Applicative) (g : i -> F j) :
IdiomaticTransformation (KleeneStoreApplicative i j) F.
exists (@experiment i j F g).
reflexivity.
intros a b f x.
apply experiment_ap.
Defined.
Definition impure i j : i -> KleeneStore i j j :=
Battery (Unit _ _ (fun x => x)).
Lemma experiment_impure i j a (x: KleeneStore i j a) : experiment (impure j) x = x.
Proof.
induction x.
reflexivity.
simpl.
rewrite IHx.
unfold KleeneStore_ap; simpl.
replace (fun (s : j -> a) (t : j) => s t) with (fun x : j -> a => x).
rewrite KleeneStore_map_id.
reflexivity.
repeat (apply functional_extensionality; intro); reflexivity.
Qed.
Lemma experiment_duplicate i j k a (x: KleeneStore i k a) :
experiment (F:=ComposeApplicative (KleeneStoreApplicative i j)
(KleeneStoreApplicative j k)) (fun z : i => Battery (Unit i j (impure k)) z) x = KleeneDuplicate j x.
Proof.
induction x.
reflexivity.
simpl; rewrite <- IHx.
simpl; unfold KleeneStore_ap; simpl.
replace (fun (s : KleeneStore j k (k -> a)) (t : j) =>
Battery (KleeneStore_map (fun (s0 : k -> a) (t0 : k) => s0 t0) s) t)
with (@Battery j k a).
reflexivity.
repeat (apply functional_extensionality; intros).
replace (fun (s0 : k -> a) (t0 : k) => s0 t0)
with (fun s0 : k -> a => s0).
rewrite KleeneStore_map_id.
reflexivity.
repeat (apply functional_extensionality; intros).
reflexivity.
Qed.
Definition CoalgebroidSig a a' b b' := a -> KleeneStore b b' a'.
(*** TraversableSig and CoalgebroidSig are isomorphic ***)
Definition Traversable_to_Coalgebroid a a' b b'
(f : TraversableSig a a' b b') : CoalgebroidSig a a' b b' :=
f (KleeneStoreApplicative b b') (Battery (Unit b b' (fun x => x))).
Definition Coalgebroid_to_Traversable a a' b b'
(g : CoalgebroidSig a a' b b') : TraversableSig a a' b b' :=
fun f h x => experiment h (g x).
Lemma Traversable_Coalgebroid_iso1 a a' b b' (x : CoalgebroidSig a a' b b') y :
Traversable_to_Coalgebroid (Coalgebroid_to_Traversable x) y = x y.
Proof.
unfold Traversable_to_Coalgebroid, Coalgebroid_to_Traversable.
generalize (x y).
clear x y.
induction k.
reflexivity.
simpl.
rewrite IHk.
clear a IHk.
unfold KleeneStore_ap; simpl.
rewrite KleeneStore_map_ap.
replace (fun (s : b' -> a0) (t : b') => s t) with (fun (s : b' -> a0) => s).
rewrite KleeneStore_identity.
reflexivity.
apply functional_extensionality.
apply eta_expansion.
Qed.
Lemma Traversable_Coalgebroid_iso2 a a' b b' (x : TraversableSig a a' b b') F g y :
Coalgebroid_to_Traversable (Traversable_to_Coalgebroid x) g y = x F g y.
Proof.
unfold Traversable_to_Coalgebroid, Coalgebroid_to_Traversable.
assert (P := (@parametricity_instance _ _ _ _ x _ _ (experiment_idomaticTransformation g) (Battery (Unit b b' (fun y => y))))) .
simpl in P.
rewrite P.
clear P.
replace (fun a0 : b =>
ap (a:=F) (a0:=b' -> b') (b:=b')
(map (a:=F) (fun (x0 : b') (f : b' -> b') => f x0) (g a0))
(pure F (fun y0 : b' => y0))) with g.
reflexivity.
apply functional_extensionality; intros z.
rewrite interchange_law.
repeat (unfold map; simpl; rewrite <- composition_law).
unfold map; simpl; repeat rewrite homomorphism_law.
fold (map (fun w => w) (g z)).
rewrite identity_law.
reflexivity.
Qed.
Definition TraversableIdLaw a b (t : TraversableSig a a b b) :=
(forall x, t IdApplicative (fun a => a) x = x).
Definition CoalgebroidExtractLaw a b (t : CoalgebroidSig a a b b) :=
(forall x, KleeneExtract (t x) = x).
Lemma equiv_law1 a b (t : CoalgebroidSig a a b b) :
TraversableIdLaw (Coalgebroid_to_Traversable t) <->
CoalgebroidExtractLaw t.
Proof.
cut (forall x, (Coalgebroid_to_Traversable t) IdApplicative (fun a => a) x =
KleeneExtract (t x)).
intros E.
split; intros H x; simpl; rewrite <- H; auto.
intros x.
unfold Coalgebroid_to_Traversable.
generalize (t x).
clear t x.
induction k.
reflexivity.
simpl.
rewrite IHk.
reflexivity.
Qed.
Definition TraversableComposeLaw (i o : Type -> Type)
(t : forall (a b : Type), TraversableSig (o a) (o b) (i a) (i b)) :=
(forall a b c (F G:Applicative) (f : i a -> F (i b)) (g : i b -> G (i c)) x,
t a c (ComposeApplicative F G) (fun z => map g (f z)) x = map (t _ _ _ g) (t _ _ _ f x)).
Definition CoalgebroidDuplicateLaw (i o : Type -> Type)
(t : forall (a b : Type), CoalgebroidSig (o a) (o b) (i a) (i b)) :=
forall a b c x, KleeneDuplicate _ (t a c x) = KleeneStore_map (t _ _) (t a b x).
Lemma equiv_law2 i o t :
@TraversableComposeLaw i o (fun a b => Coalgebroid_to_Traversable (t a b)) <->
@CoalgebroidDuplicateLaw i o t.
Proof.
split.
intros H a b c x.
assert (H0 := H a b c _ _ (impure _) (impure _) x).
unfold impure in H0; simpl in H0.
unfold KleeneStore_ap in H0; simpl in H0.
unfold Coalgebroid_to_Traversable in H0; simpl in H0.
rewrite experiment_impure in H0.
replace (fun x : o b =>
experiment (F:=KleeneStoreApplicative (i b) (i c))
(Battery (Unit (i b) (i c) (fun x0 : i c => x0))) (t b c x))
with (t b c) in H0.
rewrite <- H0.
rewrite <- experiment_duplicate.
unfold impure; simpl.
replace (fun z0 : i b => Battery (Unit (i b) (i c) (fun x0 : i c => x0)) z0)
with (Battery (Unit (i b) (i c) (fun x0 : i c => x0))).
reflexivity.
apply functional_extensionality; reflexivity.
apply functional_extensionality; intros z.
rewrite experiment_impure.
reflexivity.
intros H a b c F G f g x.
assert (H0 := H a b c x).
unfold Coalgebroid_to_Traversable.
set (fg := (fun z : i a => map (a:=F) g (f z))).
rewrite <- experiment_map.
rewrite <- KleeneStore_map_comp.
rewrite <- H0.
generalize (t a c x).
revert f g fg.
clear H0.
generalize (i a) (i b) (i c) (o c).
clear - F G.
(* This probably should be its own algebraic law but...*)
intros a b c x f g fg k.
induction k.
reflexivity.
simpl.
rewrite IHk.
unfold Compose_ap, Compose_pure, map; simpl.
rewrite KleeneStore_map_comp.
repeat (unfold map; simpl; repeat rewrite homomorpishm_law; rewrite <- composition_law).
unfold map; simpl; repeat rewrite homomorphism_law.
repeat rewrite experiment_map.
repeat (unfold map; simpl; repeat rewrite homomorphism_law; rewrite <- composition_law).
unfold map; simpl; repeat rewrite homomorphism_law.
repeat rewrite interchange_law.
repeat (unfold map; simpl; repeat rewrite homomorphism_law; rewrite <- composition_law).
unfold map; simpl; repeat rewrite homomorphism_law.
unfold fg.
repeat (unfold map; simpl; repeat rewrite homomorphism_law; rewrite <- composition_law).
unfold map; simpl; repeat rewrite homomorphism_law.
reflexivity.
Qed.

Modules

After attending the second St Jacob's workshop I think I now understand all the pieces that make up a good module system, though I might not understand how they all fit together yet.

There is also a notable distinction between modular programming and module systems, which turn out to only be loosely related. Modular programming is about isolating software components and using narrow interfaces to communicate between them, while a module system is about assembling larger software modules from smaller modules.

Modules are dependent records / typing contexts for some type system. Modules are abstract over the exact details of the type system, which makes the notion quite versatile. All that is needed of the type system is that it supports variable declarations and substitution. It is also very convenient, from both a theoretical and a practical point of view, for modules to support manifest entries / definitions.

Modules are related to each other through views. A view from a module A to a module B is an assignment of expressions of the type theory in the context A for every declaration in the module B. Thus if you have an instance of a module A you can derive an instance of a module B by substituting the free variables in the expression in B with the value in A. Indeed, one can define an instance of a module A to be a view from the empty context to a module A.

Extensions form an important class of views. An extension from module A to module B is a view where each expression for module B is a unique variable name from module A.

Modules can be combined together to create new modules. I think the Mathscheme project has found a good set of combinators that can be used to create new modules: extend, combine extensions, and mixin a view over an extension (I will discuss rename later as it isn't exactly a combinator, since it only produces isomorphic modules). The empty module gets things started. Arguably combine can be defined as a mixin, but it is probably better to consider it as a separate operation.
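The toy model below makes the notions of module, view, and extension from the preceding paragraphs concrete. It is a sketch in Haskell; all names and representation choices are mine, not Mathscheme's.

```haskell
import Data.List (nub)

type Name = String

data Expr = Var Name | App Expr Expr deriving (Eq, Show)

-- A declaration is either abstract or manifest (i.e. comes with a definition).
data Decl = Abstract Name | Manifest Name Expr deriving (Eq, Show)

type Module = [Decl]

-- A view from module A to module B assigns an expression over A to each
-- declaration of B; applying a view to an expression of B is substitution.
type View = [(Name, Expr)]

subst :: View -> Expr -> Expr
subst v (Var n)   = maybe (Var n) id (lookup n v)
subst v (App f x) = App (subst v f) (subst v x)

-- An extension is a view in which every assignment is a distinct variable.
isExtension :: View -> Bool
isExtension v = all isVar es && length es == length (nub es)
  where es = map snd v
        isVar (Var _) = True
        isVar _       = False
```

In this model an instance of A (a view from the empty module) assigns closed expressions, and composing views is composing substitutions.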

It strikes me that where most existing module systems go wrong is by taking inclusions too seriously. Inclusions are special types of extensions where each variable assignment is identical to the variable being assigned (note, it is only possible to define inclusions when modules are *not* taken up to alpha equivalence). Inclusions have several nice properties: in particular there is at most one inclusion between any two modules, so inclusions form a partial order on modules.

I think module systems tend to only allow combine and mixins over inclusions rather than arbitrary extensions. The problem is that a structure like Ring is an extension of Monoid in two different ways. Both cannot be inclusions since there is at most one inclusion between two modules. The usual symptom of this problem is that abelian monoids are unrelated to monoids, and rings become a composition of these two distinct structures.

Typeclasses are dependent records that are combined with a type inference mechanism. While type inference is important, it is an orthogonal issue. SSreflect type classes are dependent records that use coercions and canonical structures to define extensions and views (in some way that I haven't fully worked out). Coq's coercion mechanism works best when coercions form a partial order, so ssreflect appears to suffer from the inclusion problem.

Spitters and van der Weegen's pebble-style type classes are a combination of modules as dependent records and modules as typing contexts. The domain of a view is a typing context while the codomain is a dependent record. I haven't considered their module system in detail yet.

Bill and I have worked out an approach to using a GMP algorithmic theory to do natural number computation using trustable communication.

To begin with we assume we are given an algebraic theory of integer arithmetic called GMP.
GMP contains transformers for various integer operations, such as addition, multiplication, integer square root, etc. There are meaning formulas for these transformers whose instances are all Δ_{0}, which makes all the "axioms" of GMP these Δ_{0} sentences.

What does it mean for GMP to be correct? It means that all these meaning formulas are theorems of a theory of integer arithmetic. Define IA to be a theory of integer arithmetic in the same vein as Peano arithmetic. For completeness, let us assume that we have some full induction principle for integer arithmetic.

There is a trivial translation from GMP to IA, which is the identity translation, since the two theories share the same language. GMP is "correct" if this identity translation is a theory interpretation. Proving that GMP is "correct" is an extremely difficult task, and so at this point we will simply trust that it is correct.

I think in practice one might unify GMP and IA into one theory, but the separation between the two theories in the above presentation, at the very least, helps to clarify what is going on and what is trusted.

As it stands GMP and IA only do integer arithmetic, but we want to do natural number arithmetic. Natural number arithmetic is compatible with integer arithmetic, but this fact requires justification. To this end we extend IA and GMP to IA' and GMP' to support natural number arithmetic.

IA' defines a type of natural numbers as a subset of integers and defines natural number operations on this type of natural numbers. From this, one should be able to prove Peano's axioms. We also extend GMP to GMP', by again defining a type of natural numbers. We define a second set of meaning formulas for natural number addition. These meaning formulas quantify over numerals for natural numbers and relate natural number operations to the integer transformers.

The new meaning formulas provide a whole host of new "axioms" for GMP', and we need to check that they are correct. We could again trust that it is correct by assuming the identity translation to IA' is an interpretation. However, instead we can prove that it is relatively correct to GMP. That is, we can prove that if GMP is correct then GMP' is correct. To do this we simply prove that all the "new axioms" of GMP' are theorems of IA' under the assumption that the "axioms" of GMP hold. Each "new axiom" of GMP' should follow from the corresponding integer "axiom" of GMP by the theory of IA'.

From here on out, we closely follow the old trustable communication framework. To use this newly created natural number service, say from PA, we first build a translation from PA to GMP', which is easy since GMP' contains the language of PA. We "run the service" which means using the translated data to pick out an instance of a meaning formula for one of the natural number operations in GMP'. Call this instance φ. Next we send φ through the semi-trusted identity interpretation to IA'.

To complete the query we need an interpretation from IA' to PA, but this requires some preparation. We extend PA to PA' to define the integers from the natural numbers using the ambient set theory of Chiron to construct them. It is important to construct the integers in such a way that the natural numbers are a subset of them, not just isomorphic to a subset of them. We include definitions for all the integer operations.

Now we can translate from IA' to PA'. We claim that this translation is an interpretation. In order to prove this, it suffices to prove that the translations of all the axioms of IA' hold in PA', which should be the case. Finally we interpret φ to PA' and then expand away all the definitions of PA' to get a final answer in PA. This expansion of definitions in practice will do nothing since φ will actually already be in the language of PA.

I’m working on this idea of considering theory signatures as contexts of some dependent type theory and theory presentations as a category fibred over the category of contexts. A model of a theory presentation corresponds to a point of said object, that is, a morphism from the terminal theory presentation (i.e. the empty context with no axioms) to said object. The fibration projects models down to their underlying structures, which are points in the corresponding theory signature, that is, morphisms from the terminal theory signature (i.e. the empty context) to said theory signature, each of which in turn corresponds to a term fulfilling the theory signature.

Given that two structures for signature S are arrows M N : 1 → S, it seems natural to represent homomorphisms from M to N as a 2-cell F : M ⇒ N.

If our category of theory signatures is going to actually be a 2-category then we need to consider 2-fibrations, which seem to have been worked out in Claudio Hermida’s paper entitled "Some Properties of Fib as a fibred 2-category".

More importantly I need to work out what a 2-cell in the category of contexts would be exactly. Some things are pretty clear: for the context containing x:Type, when the corresponding components of the structures M and N are ⊢ s:Type and ⊢ t:Type, we want the corresponding component of F : M ⇒ N to be a term ⊢ F_{x} : s → t. If the context contains an operation f : x → x, and the corresponding components of the structures are ⊢ g : s → s and ⊢ h : t → t, then we want the corresponding component of F : M ⇒ N to be the constraint ⊢ F_{f} : F_{x}∘g =_{s → t} h∘F_{x}. But working it out for the general case of dependent types boggles my mind.
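The one-operation case above can be checked concretely. In this Haskell sketch (my own illustrative encoding; the equation F_{x}∘g = h∘F_{x} can only be tested pointwise), a structure for the signature {x : Type, f : x → x} is a carrier with an endofunction:

```haskell
-- A structure for the signature { x : Type, f : x -> x }:
-- a carrier type together with an endofunction on it.
newtype Struct a = Struct { op :: a -> a }

-- A candidate 2-cell from (s, g) to (t, h) is a map u : s -> t; it is a
-- genuine 2-cell when u . g = h . u.  Haskell can only test this on points.
isTwoCell :: Eq t => Struct s -> Struct t -> (s -> t) -> [s] -> Bool
isTwoCell (Struct g) (Struct h) u = all (\x -> u (g x) == h (u x))
```

For example, doubling is a 2-cell from (Int, +1) to (Int, +2), since 2(x+1) = 2x+2.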

There is an interesting downward transformation going on here where types in the signature are realized by (function) terms in the 2-cells, terms in the signature are realized by (equality) propositions in the 2-cells, and propositions (in theory presentations) will come for free in the 2-cells (is this always true?). Perhaps kinds will be realized by types?

This all vaguely resembles Reynolds' abstraction theorem.

A : Type;
B : A -> Type;
f : forall x : A. B x
=====================
|- f =_{η} fun (x:A) => f x

A : Type;
B : A -> Type;
C : SigT A B -> Type;
f : forall p : SigT A B. C p
============================
p : SigT A B |- f p =_{η} match p as p0 return C p0 with existT x t => f (existT A B x t) end

f : unit
========
|- f =_{η} tt

A : Type;
x, y : A;
C : forall y, Eq A x y -> Type;
f : forall e : Eq A x y. C y e
================================
e : Eq A x y |- f e =_{η} match e as e0 in Eq _ _ z return C z e0 with refl => f (refl A x) end

A, B : Type;
C : A + B -> Type;
f : forall p : A + B. C p
=========================
p : A + B |- f p =_{η} match p as p0 return C p0 with left x => f (left x) | right y => f (right y) end

C : Void -> Type;
f : forall p : Void. C p
=========================
p : Void |- f p =_{η} match p as p0 return C p0 with end

Inductive W (A:Type) (B:A -> Type) : Type :=
node : forall (a:A) (f:B a -> W A B), W A B

A : Type;
B : A -> Type;
C : W A B -> Type;
f0 : forall p : W A B. C p;
f1 : forall (a : A) (f:B a -> W A B), (forall b:B a. C (f b)) -> C (node a f)
-----------------------------------------------------------------------------
a : A, f: B a -> W A B |- f0 (node a f) =_{βη} f1 a f (fun b => f0 (f b))
=============================================================================
p : W A B |- f0 p =_{η} W_rect A B C f1 p

C : nat -> Type;
f0 : forall n : nat. C n;
f1 : forall n : nat. C n -> C (S n)
-----------------------------------
n : nat |- f0 (S n) =_{βη} f1 n (f0 n)
===================================
n : nat |- f0 n =_{η} nat_rect C (f0 O) f1 n

A container is a specification of S and P and is given by the dependent type s:S ⊢ P s, a.k.a. the telescope ΣS:★. S → ★. A container specifies the (strictly positive) functor λX:★. Σs:S. (P s) → X.

An indexed container over I is a specification of S and P and is given by the dependent type ΣS:★. I → S → ★.
An indexed container specifies an indexed Functor λX:I → ★. Σs:S. Πi:I. (P i s) → X i.
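As a concrete sanity check, the extension functor of the container ⟨ℕ, Fin⟩ can be sketched in Haskell. Haskell cannot express the dependency of P on the shape, so the sketch fixes shapes to lengths and positions to indices below them; the names here are mine.

```haskell
-- The container <Nat, Fin> denotes the functor  X |-> Σn:Nat. (Fin n -> X).
-- Shapes are lengths; positions of shape n are the indices 0..n-1.
data Ext a = Ext { shape :: Int, payload :: Int -> a }

instance Functor Ext where
  fmap f (Ext n k) = Ext n (f . k)

-- The extension of <Nat, Fin> is (isomorphic to) the list functor.
fromList :: [a] -> Ext a
fromList xs = Ext (length xs) (xs !!)

toList :: Ext a -> [a]
toList (Ext n k) = map k [0 .. n - 1]
```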

Containers form a commutative ★-semi-algebra a.k.a. a ★-semiring (not to be confused with a *-semiring) with the following operations:

0 := ⟨⊥, []⟩

1 := ⟨⊤, λs. ⊥⟩

⟨A, P⟩ + ⟨B,Q⟩ := ⟨A + B, [P, Q]⟩

⟨A, P⟩ × ⟨B,Q⟩ := ⟨A × B, λ⟨a,b⟩. P a + Q b⟩.

R·⟨A,P⟩ := ⟨R × A, λ⟨_,a⟩. P a⟩
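These operations can be sanity-checked at the level of cardinalities: reading a container as the polynomial Σs:S. X^{P s}, a finite container's extension at an n-element X has Σs n^{|P s|} elements, and + and × of containers should add and multiply these counts. A Haskell sketch restricted to finitely many shapes (the model and names are mine):

```haskell
-- A finite container: a list of shapes together with the size of each
-- shape's position set (a cardinality-only model).
data Container s = Container { shapes :: [s], posCount :: s -> Int }

-- |Σs. (P s -> X)| when X has n elements.
extSize :: Container s -> Int -> Int
extSize (Container ss p) n = sum [ n ^ p s | s <- ss ]

-- Sum of containers: disjoint union of shapes, positions by cases.
cplus :: Container s -> Container t -> Container (Either s t)
cplus (Container ss p) (Container ts q) =
  Container (map Left ss ++ map Right ts) (either p q)

-- Product of containers: pairs of shapes, positions P a + Q b.
ctimes :: Container s -> Container t -> Container (s, t)
ctimes (Container ss p) (Container ts q) =
  Container [ (a, b) | a <- ss, b <- ts ] (\(a, b) -> p a + q b)
```

For the truncated list container with shapes 0..3, the extension at a 2-element X has 2⁰+2¹+2²+2³ = 15 elements, and the sum and product containers have 30 and 225 respectively, as the semiring reading predicts.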

Containers (aka ⊤-indexed containers) form a ★-bi-semiring. The tensor of I-indexed containers with J-indexed containers is (I + J)-indexed containers. The identity of this tensor is the ⊥-indexed containers, which are isomorphic to ★. The tensor of an I-indexed container and a J-indexed container is:

⟨A,P⟩ ⊗ ⟨B,Q⟩ := ⟨A × B, λ(ĳ:I+J) ⟨a,b⟩. [λi. P i a, λj. Q j b](ĳ)⟩.

In particular the tensor of containers with themselves are the 𝔹-indexed containers. The following are the ★-bi-semiring operations on containers

ε^{+}⟨A,P⟩ := Σa:A. P a → ⊥ (≜ Σa:A. ¬(P a))

ε^{×}⟨A,P⟩ := Σa:A. P a → ⊤ (≅ A)

Δ^{+}⟨A,P⟩ := ⟨Σa:A. P a → 𝔹, λ(i:𝔹) (⟨a,f⟩:Σa:A. P a → 𝔹). Σp:P a. i = f p⟩

Δ^{×}⟨A,P⟩ := ⟨A, λ(i:𝔹). P⟩

β(R)⟨A,P⟩ := Σa:A. P a → R

The composition product of I-indexed containers with J-indexed containers is the (I × J)-indexed containers, with containers (i.e. ⊤-indexed containers) as the unit for the composition product. The composition product of an I-indexed container with a J-indexed container is as follows.

⟨A,P⟩ ⊙ ⟨B,Q⟩ := ⟨Σa:A. Πi:I. P i a → B, λ⟨i,j⟩ ⟨a,f⟩. Σp : P i a. Q j (f i p)⟩

Now containers trivially form a semi-plethory since ⊤-indexed containers are clearly isomorphic to (⊤ × ⊤)-indexed containers.

ι⟨A,P⟩ := ⟨A,λ⟨⟩. P⟩

(∘)⟨A,P⟩ := ⟨A, λ⟨⟩. P⟨⟨⟩,⟨⟩⟩⟩

For convenience we define the infix composition of two containers as:

⟨A,P⟩ ∘ ⟨B,Q⟩ := ⟨Σa:A. P a → B, λ⟨a,f⟩. Σp : P a. Q (f p)⟩ (≅ (∘)(⟨A,P⟩ ⊙ ⟨B,Q⟩))
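The infix composition can be sanity-checked on cardinalities too: the extension of ⟨A,P⟩ ∘ ⟨B,Q⟩ at a finite X should be exactly as large as the extension of ⟨A,P⟩ taken at a set the size of the extension of ⟨B,Q⟩ at X. A self-contained Haskell sketch restricted to finitely many shapes (model and names are mine):

```haskell
import Control.Monad (replicateM)

-- Cardinality-only model of a finite container: shapes plus the size of
-- each shape's position set.
data Container s = Container { shapes :: [s], posCount :: s -> Int }

extSize :: Container s -> Int -> Int
extSize (Container ss p) n = sum [ n ^ p s | s <- ss ]

-- Composition: a shape is an outer shape a together with an assignment of
-- an inner shape to each of its p a positions; the positions over <a, f>
-- are the positions of the chosen inner shapes, summed.
ccompose :: Container s -> Container t -> Container (s, [t])
ccompose (Container ss p) (Container ts q) =
  Container [ (a, f) | a <- ss, f <- replicateM (p a) ts ]
            (\(_, f) -> sum (map q f))
```

Checking ⟦c ∘ d⟧X ≅ ⟦c⟧(⟦d⟧X) at the level of sizes, extSize (ccompose c d) n should equal extSize c (extSize d n) for every n.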

Composition has an identity element, I, which doesn't appear to be a primitive semi-plethory operation for some reason. Nor do I know how to derive it from other operations. But here it is anyways.

I := ⟨⊤, λs. ⊤⟩

TODO: Actually verify these operations are correct by checking that all the required laws hold (upto isomorphism).

P.S. Actually containers do form a *-semiring, which makes them a ★-*-semiring. In fact we can get a ★-bi-*-semiring with the following asteration operation on containers and its dual.

⟨A,P⟩* := ⟨ℕ, Fin⟩ ∘ ⟨A,P⟩ (≅ ⟨List A, sum ∘ map P⟩)

⟨A,P⟩⁎ := ⟨A,P⟩ ∘ ⟨ℕ, Fin⟩ (≜ ⟨Σa:A. P a → ℕ, λ⟨a,f⟩. Σp:P a. Fin (f p)⟩)

Recalling the problem: On one side we have a computational theory called GMP that consists entirely of Δ_{0} theorems. On the other side we have a purely axiomatic theory called PA that is a high-level theory of Peano Arithmetic that has been conservatively extended to support decimal notation and whatever other reasonable conveniences for the natural numbers we want. The task is to use trustable communication to use the computational ability of GMP to solve simple arithmetic problems in PA.

The idea here is to extend GMP to a stronger theory GMP0 that is strong enough to define PA. Next we make a conservative extension of GMP0 to GMP' that defines the language of PA and proves the axioms of PA. Specifically GMP' defines

nat := { a : int | 0 ≤_{int} a}

+_{nat} := λ (a,b) : nat × nat. ι c : nat. c = a +_{int} b

…

Here we note that PA is a subtheory of GMP', in the sense that it uses a sublanguage of GMP' and every theorem of PA is a theorem of GMP' (since GMP' proves all the axioms of PA).

The next step is to prove that GMP' is, in fact, a conservative extension of PA (thus GMP0 shouldn't be more powerful than PA). The idea here is to make an interpretation from GMP' into PA that fixes the language of PA. Unfortunately we cannot directly do this because interpretations in Chiron must be injective. So instead we note that if we make an interpretation from GMP' into a conservative extension of PA, called PA', that fixes the language of PA, then we can conclude that GMP' is a conservative extension of PA.

To build this interpretation we extend PA to PA' with definitions:

int := nat×{0} ∪ {0}×nat

+_{int} := λ ((ap,an),(bp,bn)) : int × int. ι (cp,cn) : int.
cp + an + bn = ap + bp + cn ∧ (cp = 0 ∨ cn = 0)

…

Our interpretation sends int to int, +_{int} to +_{int}, etc. and fixes the language of PA, so it sends nat to nat, and +_{nat} to +_{nat}.

Unfortunately this doesn't work. GMP' has a theorem (actually a definition) that nat = { a : int | 0 ≤_{int} a}. If we look at this theorem under the above alleged interpretation, it is equivalent to nat = nat×{0}, which is clearly false. So our alleged interpretation isn't actually an interpretation, because it isn't truth preserving.

Let us suppose for the sake of argument that we somehow prove that GMP' is a conservative extension of PA. What we then need to do is to write new services for PA that use GMP' services internally to help. We need to do something like writing new services because someone has to be responsible for proving that the operations on integers are, in fact, compatible with the operations on natural numbers. In our case it will be the responsibility of the person writing these new services to prove that the services are conservative (i.e. all the meaning formulas of the services are theorems). (More details on how to make this service are needed.) With new services and new meaning formulas, there is no problem with translating the meaning formula through trustable communication since the meaning formula is no longer passed through the trustable communication framework.

In this sense we have sort of failed at our task which was to use the trustable communication framework; however I'm inclined to think that the original trustable communication framework cannot actually solve this particular communication problem since it leaves no room for the required proof that the computation of the natural numbers is compatible with computation on the integers. Therefore, I think it is inevitable that something beyond trustable communication is needed in this particular situation.

Other situations that do not have this compatibility issue are likely more amenable to using trustable communication. In particular our new natural number services could be served through a trustable communication channel.

An R-R'-biring is an R-ring A such that for every other R-ring B the set of homomorphisms from A to B has the structure of an R'-ring. This is equivalent to giving the following additional operations on A with the following laws:

This gives us enough structure to show that for any R-ring B, the homomorphisms from A to B have an R'-ring structure. For instance, given R-ring homomorphisms f, g : A → B and r : R', we can define

0 : A → B is 0(a) = ε^{+}(a)⋅1

1 : A → B is 1(a) = ε^{×}(a)⋅1

f + g : A → B is f + g = ∇ ∘ (f⊗g) ∘ Δ^{+}

-f : A → B is -f = f ∘ σ

fg : A → B is fg = ∇ ∘ (f⊗g) ∘ Δ^{×}

r⋅f : A → B is r⋅f = f ∘ λ ∘ (β(r)⊗id) ∘ Δ^{×}

A biring over R is an R-R-biring.

A plain biring is a biring over ℤ just like a plain ring is a ℤ-ring.

R itself can be made into a trivial R-R'-biring where

ε^{+}(a) = ε^{×}(a) = a

σ(a) = a

Δ^{+}(a) = Δ^{×}(a) = a⊗1 = 1⊗a

β(r)(a) = a

R[X] is a biring over R where

ε^{+}(X) = 0 (polynomial evaluation at 0)

ε^{×}(X) = 1 (polynomial evaluation at 1)

σ(X) = -X (polynomial evaluation at -X)

Δ^{+}(X) = X⊗1 + 1⊗X (polynomial evaluation at X+Y)

Δ^{×}(X) = X⊗X (polynomial evaluation at XY)

β(r)(X) = r (polynomial evaluation at r)

An R-R'-biring A induces a functor from R-Rings to R'-Rings that takes an R-Ring B to the R'-Ring of homomorphisms from A to B.

Birings are composable. Given an R-R'-biring A and an R'-R''-biring B, there exists an R-R''-biring A⊙_{R'}B. (⊙_{R'}) is called the composition product.
First we define an R-ring A⊙_{R'}B for an R-R'-biring A and an R'-ring B, in a way similar to the definition of a tensor product: A⊙_{R'}B can be constructed as A×B modulo the following equalities:

(a + b)⊙x = (a⊙x) + (b⊙x)

ab⊙x = (a⊙x)(b⊙x)

(r⋅1)⊙x = r⋅(1⊙y) (this y is not a typo: 0⊙x = 0⊙y = 0 and 1⊙x = 1⊙y = 1)

Think of a⊙x as the polynomial a evaluated at the point x.
If B also happens to be a R'-R''-biring, then we can make A⊙_{R'}B into an R-R''-biring.

ε^{+}(a⊙x) = a⊙(ε^{+}(x)) where a⊙r = β(r)(a)

ε^{×}(a⊙x) = a⊙(ε^{×}(x))

σ(a⊙x) = a⊙σ(x)

Δ^{+}(a⊙x) = a⊙Δ^{+}(x) where a⊙(w_{1}⊗w_{2}) = (a⊙w_{1})⊗(a⊙w_{2})

Δ^{×}(a⊙x) = a⊙Δ^{×}(x)

β(r)(a⊙x) = β(β(r)(x))(a)

(⊙_{R}) and R[X] make the category of birings over R into a monoidal category. (TODO: Why is this the Tall-Wraith monoid?). We want to check first that B is isomorphic to R[X]⊙_{R}B. The isomorphism maps x ↔ (X⊙x) (the reverse map extends by the composition product laws). Similarly we want to check that A is isomorphic to A⊙_{R}R[X]. Again, the isomorphism maps a ↔ (a⊙X).

An R-plethory is a monoid in this monoidal category of birings over R. This means that an R-plethory is a biring over R, call it A, that also has the following operations forming a monoid:

(∘) : A⊙_{R}A → A

ι : R[X] → A

R[X] is an R-plethory with ι the identity function and (∘)(X⊙X) = X, which effectively entails polynomial composition.

According to the mathoverflow answer, R[X] is an initial R-plethory which seems reasonable.

I am not entirely sure this is all correct. But I'm moderately confident.

I will (hopefully) continue to update this post with further notes.

Here I present what I think is a slightly simpler proof of the Primitive Element Theorem's so-called "finite case" than the one found in A Course in Constructive Algebra. It still follows the most important ideas presented in that book. In particular it is important to use the correct constructive phrasing of the "finite case".

Theorem:Primitive Element Theorem: Finite Case Given a discrete field F and two algebraic values a and b over F of degrees m and n respectively, for all N either #|F| > N or there exists some g in F[a,b] such that F[a,b] = F[g].

Of course by #|F| > N we really mean the statement that there exists some finite set of elements of F that has cardinality greater than N.

We can assume that a ≠ 0 and b ≠ 0 since in these cases F[a, b] = F[b] and F[a, b] = F[a] respectively.

Every element x ∈ F[a] is a linear combination of a^{0}, …, a^{m-1} with coefficients in F. Say x = Σx_{i}a^{i}. Consider the list [a^{0}, …, a^{N^{m}}], and then consider the finite set of coefficients S = {(a^{i})_{j} | i ≤ N^{m}, j < m}. If S has cardinality greater than N then we are done.

Suppose S has cardinality less than or equal to N. Then there are at most N^{m} linear combinations of the coefficients from S on [a^{0}, …, a^{m-1}], and since the list [a^{0}, …, a^{N^{m}}] has length N^{m}+1, there must be two elements of this list that are the same. By dividing these two elements we conclude there is some i such that a^{i} = 1, i.e. a has multiplicative order i.

By a similar argument we can conclude that either #|F| > N or there is some j such that b^{j} = 1, i.e. b has multiplicative order j. Now we can proceed under the assumption that a has multiplicative order i and b has multiplicative order j. The rest of the argument is standard.

Let f : ℤ_{i}×ℤ_{j} → F[a,b] be the function that sends (x,y) to a^{x}b^{y}. The function f is a group homomorphism into the multiplicative group of F[a,b], and therefore ℤ_{i}×ℤ_{j} must be cyclic and is generated by some z. Let g = f(z). Then there are some k_{a} and k_{b} such that g^{k_{a}} = a and g^{k_{b}} = b, and thus F[a, b] = F[g]. That concludes the proof.

The infinite case of the Primitive Element Theorem states that there exists some N (specifically I think m*(n-1)+1 should work) such that if #|F|>N and b is separable over F then there exists some g in F[a,b] such that F[a,b] = F[g]. This can be proved as done in A Course in Constructive Algebra, or if you want to avoid the construction of splitting fields you can follow the argument in David Speyer's or Boyarsky's proof. The infinite case combined with the finite case allows us to prove the primitive element theorem:

Theorem:Primitive Element Theorem Given a discrete field F and two algebraic values a and b with b separable over F, there exists some g in F[a,b] such that F[a,b] = F[g].

Inspired by how Bas and Eelis handle morphisms in their work, I've rethought how morphisms can be implemented in Mathscheme.

Consider how to build a theory of monoid morphisms. One can build this theory by first taking a pushout of the theory of monoids with itself over the empty theory (i.e. take the disjoint union of the theory of monoids with itself) this gives us two copies of the theory of monoids and each copy has a separate universe: U_{1} and U_{2}. Then we extend this theory with a function f : U_{1} → U_{2} and axioms that says that f preserves each operation. This procedure works in general for algebraic theories.
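The result of this construction for monoids can be transcribed informally into Haskell (illustrative names; the preservation axioms can only be tested pointwise):

```haskell
-- One copy of the theory of monoids over a universe u (axioms omitted).
data Monoid' u = Monoid' { unit :: u, mult :: u -> u -> u }

-- The theory of monoid morphisms: two copies of the theory of monoids with
-- separate universes u1 and u2, plus a function between them.
data MonoidHom u1 u2 =
  MonoidHom { src :: Monoid' u1, tgt :: Monoid' u2, hom :: u1 -> u2 }

-- The preservation axioms, checked on sample points.
preserves :: Eq u2 => MonoidHom u1 u2 -> [u1] -> Bool
preserves (MonoidHom m n f) xs =
  f (unit m) == unit n
    && and [ f (mult m a b) == mult n (f a) (f b) | a <- xs, b <- xs ]
```

For example, exponentiation is a morphism from the additive monoid of integers to the multiplicative one, since 2^(a+b) = 2^a·2^b and 2^0 = 1.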

Notice that this gives us a theory of monoid morphisms between some monoids, but not the theory of monoid morphisms between two particular monoids, which is what we are usually interested in. However, by considering this theory of monoid morphisms as parameterized by two monoids in the sense of my last post one can build models of the theory of monoid morphisms between particular monoids.

This way of developing theories of morphisms has some nice properties. For example, we usually only care about vector space morphisms (aka linear transformations) between two vector spaces over the same field. This sharing of the same field is easy to do with the above technique. Instead of taking the pushout of the theory of vector spaces with itself over the empty theory, one would take the pushout of the theory of vector spaces with itself over the theory of fields. This forces the theory of fields to be shared between the two copies of vector spaces, ensuring that the domain and codomain of a vector space morphism are both vector spaces over the same field.

Some algebraic theories are parameterized by other algebraic theories. The most common example would be vector spaces, which are parameterized by a field that acts on the vector space by scalar multiplication.

In mathscheme, the theory of vector spaces contains types for both the field and the vector space, and provides (via inheritance) all the axioms of a field for the field type and then goes on to provide the vector space axioms (via various mix-ins). When this theory is reflected as a type we get a record for both the field component and the vector space component.

However, in many cases we want to reason about vector spaces whose field component is fixed to some particular value. One idea to support this sort of thing is to not reflect the theory as a type, but as a dependent function type. In this case, a function from the type of fields to a type. Loosely speaking it would be

VS : {U:type; +, * : U*U -> U; -, inv : U -> U; 0, 1 : U} -> type
VS F = {V:type; + : V*V -> V; - : V -> V; 0 : V; * : (F.U)*V -> V}

I have left out the axioms above for simplicity, but the axioms would bring the operations of F into play. These parameterized models can probably be created for any theory morphism between theories.
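In a non-dependent setting the same idea can be approximated by making the field record an ordinary parameter. A Haskell sketch (axioms omitted as above; all names are invented):

```haskell
-- The theory of fields reflected as a record over a carrier u.
data FieldOps u = FieldOps
  { fadd, fmul  :: u -> u -> u
  , fneg        :: u -> u
  , fzero, fone :: u }

-- VS as a (non-dependent) function of its field parameter.
data VS u v = VS
  { vadd  :: v -> v -> v
  , vneg  :: v -> v
  , vzero :: v
  , smul  :: u -> v -> v }

-- A parameterized model: pairs over any field form a vector space over it.
pairVS :: FieldOps u -> VS u (u, u)
pairVS f = VS { vadd  = \(a, b) (c, d) -> (fadd f a c, fadd f b d)
              , vneg  = \(a, b) -> (fneg f a, fneg f b)
              , vzero = (fzero f, fzero f)
              , smul  = \r (a, b) -> (fmul f r a, fmul f r b) }
```

Fixing the field component to a particular value is then just partial application of pairVS.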

Unparameterized models could then be simply a degenerate case of parameterized models over the initial empty theory. The models of the empty theory have type {} and there is exactly one value of this type {}.

To further help modularize theory development, it may be useful to take pushouts of not just theory presentations themselves, but of diagrams of theory presentations. The following is an elaboration of what Douglas Smith describes in his various papers on SpecWare, including "Mechanizing the Development of Software". Although we are interested in diagrams of theory presentations, this development is generic for diagrams over any category with sufficient structure (i.e., categories that have colimits).

First we define what a graph is. A graph is a finite collection of nodes and a finite collection of directed edges between those nodes with a composition operation on edges (composition requires the target node of the first edge be the same as the source node of the second edge). The set of edges is closed under composition and for each node there is an identity edge from the node to itself. We require that the identity edge be an identity for the composition operation in the usual sense.

Actually, a graph is just a category, and I could have simply said that from the beginning. However, I will use the name graph in the cases where we don't really care about the semantics of the category and we only care about the shape of the category.

This definition of a graph differs notably from the definition of directed multigraph from graph theory, because our definition of a graph records whether two paths from a source node to a target node compose to equal edges or not. In particular our graphs tell you whether two particular squares commute or not. Usually this information is left implicit when a diagram is drawn on a board, but such information is a vital component of the definition.

Loosely speaking, a diagram is a graph where the nodes and edges are given labels from another category C in a way that is compatible with the category C. In our particular case, we are interested in C = the category of theory presentations, but this development is generic in C.
More formally speaking, a diagram is a functor d : G → C, where G is a graph that gives the shape of the diagram. The image of a node of G under d labels the node by an object from C, and the image of an edge of G under d labels the edge by an arrow from C. The functor laws ensure that the labeling of nodes and edges is compatible with C.

Note that this definition of a diagram differs notably from the definition of a subcategory. In a diagram two different nodes could get the same label, and two different edges could also get the same label. In other words, the functor d : G → C need not be injective.

All diagrams are functors (which are arrows in the category of categories) with codomain C. The arrows of a category with a constant codomain form the objects of a slice category, so that is an obvious choice for turning diagrams into a category, but we need something a bit more lax: a lax slice category.

The objects of a lax slice category are all functors that have the same codomain (in this case C). A morphism between d_{1} : G_{1} → C and d_{2} : G_{2} → C in a lax slice category is a functor f : G_{1} → G_{2} along with a natural transformation η_{f} from d_{1} to d_{2}∘f.

What does this definition mean for our diagrams? Given two diagrams d_{1} and d_{2}, a diagram morphism between these two diagrams states two things. Firstly it assigns for each node in the graph G_{1}, a node in the graph G_{2}, and for each edge in the graph G_{1}, an edge in the graph G_{2} in a way that is compatible. This first part is the information held by the functor f : G_{1} → G_{2} from the previous paragraph. The second part of a diagram morphism is a collection of arrows from C (in our case theory morphisms), one for each node in G_{1}. These arrows map the label for a node in G_{1} to the label for the corresponding node (under f) in G_{2}. This collection of arrows is given by the natural transformation η in the previous paragraph. Lastly, this collection of arrows is subject to a compatibility constraint that for every edge e : p → q in G_{1}, there is a commutative square that states that f(q)∘e = η(e)∘f(p). This condition is what makes the transformation η natural.

The laxness of the lax slice category is important. Without it we would end up with the stricter requirement that d_{1} = d_{2}∘f or in other words that the natural transformation is always the identity transformation. This stricter requirement would mean that the collection of arrows between diagrams always be identity arrows, and the only diagram arrows that would exist would be diagram inclusions. We require this laxness to get pushouts to generate a collection of novel theories. Without laxness we could only create the union of diagrams, which is not so interesting.

I believe it is true that if the category C has colimits, then this (lax) category of diagrams will have colimits, but I don't have a reference that proves this.

Now I will show how to use pushouts of diagrams to generate large batches of novel theories. For simplicity consider the diagram B generated from the theories: Semigroup, Monoid, and LeftNearSemiring with the usual arrow from Semigroup to Monoid, and the "multiplicative" arrow from Semigroup to LeftNearSemiring and the "additive" arrow from Monoid to LeftNearSemiring. This example is small enough to be easy to picture, but large enough to be a non-commutative diagram.

We want to use diagram pushouts to create commutative versions of these theories. To do this we will take as base A the diagram consisting of the single node Semigroup (which implies the single identity arrow on that node), and for C we will take the single node CommutativeSemigroup. The diagram arrow from A to B will map the Semigroup node in A to the Semigroup node in B via the identity morphism. The diagram arrow from A to C will map the Semigroup node in A to the CommutativeSemigroup node in C via the theory inclusion.

When we take the pushout of this span we get a new diagram D along with diagram arrows from B to D and from C to D satisfying the usual universal property of pushouts. What will the contents of D be? The shape of D will be the disjoint union of the shape of B and the shape of C, modulo the shape of A. In this case that gives us the shape of B plus one extra node, which gets identified with the Semigroup node of B. This modding out leaves us with a diagram with the same shape as the B that we started with.
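The shape-level computation in this paragraph can be sketched concretely: the pushout of two finite node sets along functions from a common base set is their disjoint union modulo the identifications induced by the base. Below is a toy Haskell model of just this calculation (the `pushout` function and its representation of quotients by canonical representatives are inventions for illustration; labels and edges are omitted):

```haskell
import Data.List (nub)

-- Toy pushout of finite sets: given f : A -> B and g : A -> C,
-- the pushout is (B + C) / ~, where f(a) ~ g(a) for each a in A.
-- We model the quotient by sending each element of B + C to a
-- canonical representative of its equivalence class.
type Set a = [a]

pushout :: (Eq b, Eq c) => Set a -> (a -> b) -> (a -> c)
        -> Set b -> Set c -> Set (Either b c)
pushout as f g bs cs = nub (map canon (map Left bs ++ map Right cs))
  where
    -- identify Right (g a) with Left (f a)
    canon (Right y) = case [f a | a <- as, g a == y] of
                        (b:_) -> Left b
                        []    -> Right y
    canon x = x
```

Running it on the example span (B = {Semigroup, Monoid, LeftNearSemiring}, with the single node of A mapped to Semigroup in B and to CommutativeSemigroup in C) yields three nodes, the same shape as B.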

What are the contents of D? In this case, what was the Semigroup of B has been transformed into the CommutativeSemigroup that we gave in C. The diagram arrow from B to D includes our inclusion from Semigroup to CommutativeSemigroup, and the diagram arrow from C to D consists of the identity arrow from CommutativeSemigroup to CommutativeSemigroup. What was a Monoid in B is a CommutativeMonoid in D, and the diagram arrow from B to D contains a morphism from Monoid to CommutativeMonoid. There is no morphism in the diagram arrow from C to D that targets CommutativeMonoid. Finally, what was a LeftNearSemiring in B becomes a CommutativeLeftNearSemiring where both multiplication and addition are commutative. Both operations are commutative because the two squares generated by the "multiplicative" arrow and the "additive" arrow must be commutative.

Now suppose we don't want to make both operations commutative. Say we only want to make the additive operation commutative. To achieve this we need to remove the "multiplicative" arrow from the diagram B, yielding a diagram B′. Now the pushout of B′ and C over A yields a new diagram D′ which is similar to D, but where the LeftNearSemiring has been replaced with a LeftNearSemiring theory in which only addition is commutative.

To be sure, we are not deleting the "multiplicative" arrow from our theory development. We are simply deciding to not include it in the diagram B′ we use to generate the pushout. This shows that we need to take arrows very seriously. Which arrows we include in our diagrams when doing diagram pushouts can have significant consequences on the generated theories.

Now suppose we want to have our cake and eat it too. We want to build all three extensions of LeftNearSemiring: one making the additive operation commutative, one making the multiplicative operation commutative, and one making both operations commutative. This is possible by creating duplicate copies of LeftNearSemiring in our diagram. One copy of LeftNearSemiring has only one arrow pointing to it, from Monoid, corresponding to the "additive" arrow; another copy of LeftNearSemiring has only one arrow pointing to it, from Semigroup, corresponding to the "multiplicative" arrow. Finally we put in a third copy of LeftNearSemiring with two arrows from the other two LeftNearSemiring nodes, both labeled with the identity morphism.

When we push out with this diagram, each different copy of LeftNearSemiring will be assigned to a different resulting theory. The first will get commutative addition only, the second will get commutative multiplication only, and the last will get both commutative addition and commutative multiplication. The two arrows labeled with identity morphisms will transform into inclusions.

Suppose we have a diagram d of theories starting from Magma, and we want to extend this diagram with commutativity, associativity, and both commutativity and associativity. To do this we would use the above pushout mechanism twice, once to create d_{c} the diagram with commutative structures, and another time to create d_{a} the diagram with associative structures. Then to create the diagram d_{ac} of commutative and associative structures, we do the pushout of d_{c} and d_{a} over d, using the diagram morphisms from d when we created d_{c} and d_{a}.

The store comonad (better known as the costate comonad) can be defined as

data Store b a = Store (b -> a) b

with the following comonadic operations

extract :: Store b a -> a
extract (Store v b) = v b
fmap :: (a -> c) -> Store b a -> Store b c
fmap f (Store v b) = Store (f . v) b
duplicate :: Store b a -> Store b (Store b a)
duplicate (Store v b) = Store (Store v) b
extend :: (Store b a -> c) -> Store b a -> Store b c
extend f x = fmap f (duplicate x)

Given this we can define a lens (aka a functional reference) as follows

type Lens a b = a -> Store b a

A lens can be used as an accessor to a field of a record, though they have many uses beyond this.

data Entry = Entry { name_ :: String
, city_ :: String
, birthdate_ :: Date
}
name :: Lens Entry String
name entry = Store (\newName -> entry {name_ = newName}) (name_ entry)

Above we create a lens to access the name field of this Entry structure. We can build similar lenses to access the other fields.

We can write getter and setter functions for an arbitrary lens.

get :: Lens a b -> a -> b
get lens a = b
where
Store v b = lens a
set :: Lens a b -> a -> b -> a
set lens a = v
where
Store v b = lens a
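Assembling the snippets above into one self-contained module (with the record declaration repaired and, since no date type is defined in the post, birthdate_ stubbed as a String; the sample entry is invented for illustration):

```haskell
-- The store comonad and lenses, assembled from the snippets above.
data Store b a = Store (b -> a) b

type Lens a b = a -> Store b a

get :: Lens a b -> a -> b
get lens a = b where Store _ b = lens a

set :: Lens a b -> a -> b -> a
set lens a = v where Store v _ = lens a

data Entry = Entry { name_ :: String
                   , city_ :: String
                   , birthdate_ :: String  -- stand-in for a date type
                   } deriving (Show, Eq)

name :: Lens Entry String
name entry = Store (\newName -> entry { name_ = newName }) (name_ entry)

-- an illustrative sample entry
alonzo :: Entry
alonzo = Entry "Alonzo Church" "Washington" "1903-06-14"
```

For example, get name alonzo yields "Alonzo Church", and set name alonzo "A. Church" updates only the name field, leaving the other fields untouched.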

However, we expect the get and set functions to obey certain laws. Here are the three laws that Koji Kagawa gives in his paper "Compositional References for Stateful Functional Programming". They are the laws that any second year computer science student would intuitively write down.

get l (set l s b) = b

set l s (get l s) = s

set l (set l s b1) b2 = set l s b2

Today I was trying to rewrite these lens laws to use our comonadic operations on the store comonad instead of get and set. After playing around I came up with the following alternative pair of laws:

extract . l = id

fmap l . l = duplicate . l

Interestingly these laws no longer depend on any special operations of the store comonad. They only use the operations generic for any comonad.

Do these laws look familiar? They should. They are exactly the laws of a coalgebra for a comonad!

I was floored by this. These three almost arbitrary laws that every computer scientist intuits are exactly the laws implied by this abstract nonsense from category theory. I honestly wrote these laws out completely before noticing what they were. I do not recall this fact being noted in any paper about lenses that I have read, but it will definitely be mentioned in my upcoming paper.
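Before turning to Coq, the two law sets can be spot-checked in Haskell for a concrete lens. The example below uses a lens onto the first component of a pair (an invented example; fmap is renamed fmapS to avoid the Prelude clash, and the second coalgebra law is observed at sample points with peek, since Store values contain functions and cannot be compared for equality directly):

```haskell
data Store b a = Store (b -> a) b

extract :: Store b a -> a
extract (Store v b) = v b

fmapS :: (a -> c) -> Store b a -> Store b c
fmapS f (Store v b) = Store (f . v) b

duplicate :: Store b a -> Store b (Store b a)
duplicate (Store v b) = Store (Store v) b

-- observe a Store by supplying an index
peek :: b -> Store b a -> a
peek b (Store v _) = v b

type Lens a b = a -> Store b a

-- a lens focusing on the first component of a pair
fstL :: Lens (Int, Char) Int
fstL (x, y) = Store (\x' -> (x', y)) x

-- extract . l = id, checked at a sample point
coalgebraLaw1 :: Bool
coalgebraLaw1 = extract (fstL (1, 'a')) == (1, 'a')

-- fmap l . l = duplicate . l, checked by peeking both sides
-- at sample indices b and b'
coalgebraLaw2 :: Bool
coalgebraLaw2 = and [ peek b' (peek b (fmapS fstL (fstL s)))
                      == peek b' (peek b (duplicate (fstL s)))
                    | b <- [0 .. 3], b' <- [0 .. 3] ]
  where s = (1, 'a')
```

Both checks come out True for this lens, as the equivalence proved below predicts.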

Below is the Coq code that proves that these two sets of laws are equivalent.

Set Implicit Arguments.
Section Lens.
Variable B : Type.
Definition W a := ((B -> a) * B)%type.
Definition extract (a:Type) (x : W a) : a := fst x (snd x).
Definition fmap (a b:Type) (f : a -> b) (x : W a) : W b := pair (fun z => f (fst x z)) (snd x).
Definition duplicate (a:Type) (x : W a) : W (W a) := pair (pair (fst x)) (snd x) .
Definition extend (a b:Type) (f : W a -> b) (x : W a) : W b := fmap f (duplicate x).
Variable A : Type.
Variable l : A -> W A.
Definition rd (r : A -> W A) (a:A) : B := snd (r a).
Definition wr (r : A -> W A) (a:A) (b:B) : A := fst (r a) b.
Section Iso1.
Hypothesis coalgebra1 : forall a, extract (l a) = a.
Hypothesis coalgebra2 : forall a, fmap l (l a) = duplicate (l a).
Lemma lawA : forall s t, rd l (wr l s t) = t.
Proof.
intros s t.
generalize (coalgebra2 s).
unfold rd, wr.
destruct (l s) as [f b].
unfold fmap, duplicate.
simpl.
inversion 1.
change (l (f t)) with ((fun z => l (f z)) t).
rewrite H1.
reflexivity.
Qed.
Lemma lawB : forall s, wr l s (rd l s) = s.
Proof.
firstorder.
Qed.
Lemma lawC : forall s t1 t2, wr l (wr l s t1) t2 = wr l s t2.
Proof.
intros s t.
generalize (coalgebra2 s).
unfold rd, wr.
destruct (l s) as [f b].
unfold fmap, duplicate.
simpl.
inversion 1.
change (l (f t)) with ((fun z => l (f z)) t).
rewrite H1.
reflexivity.
Qed.
End Iso1.
Section Iso2.
Hypothesis law1 : forall s t, rd l (wr l s t) = t.
Hypothesis law2 : forall s, wr l s (rd l s) = s.
Hypothesis law3 : forall s t1 t2, wr l (wr l s t1) t2 = wr l s t2.
Lemma coalgebraA : forall a, extract (l a) = a.
Proof.
firstorder.
Qed.
(* One day we will have observational type theory, but for now ... *)
Require Import FunctionalExtensionality.
Lemma coalgebraB : forall a, fmap l (l a) = duplicate (l a).
Proof.
intros a.
apply injective_projections; auto.
apply functional_extensionality; intros x.
simpl.
apply injective_projections.
apply functional_extensionality; intros y.
firstorder.
firstorder.
Qed.
End Iso2.
End Lens.

Edit (2011-01-19): I renamed the Selection comonad to the Store comonad and rearranged the order of some parameters to make it more compatible with Control.Comonad.Trans.Store. Some of the comments below use the old name.

Free theorems were popularized by Wadler in his 1989 paper "Theorems for Free!". In that paper only polymorphism over types is considered. In 2009, Voigtländer illustrates how this can be extended to polymorphism over type constructors with class constraints in his paper "Free Theorems Involving Type Constructor Classes". He uses polymorphism over Monads for his example. In order to learn to apply this theory I will use a very simple exercise only involving Functors. Be aware that our reasoning will be "fast and loose" which is a technical term that means that our results only apply to total values.

The exercise here is to prove that the type A → B and the polymorphic type ∀κ:Functor. κA → κB are isomorphic. The intuition is that all a polymorphic function of type ∀κ:Functor. κA → κB can do is map some internally held function of type A → B. The claim is that the following pair of functions witness the isomorphism:

iso_{1} : (A → B) → ∀κ:Functor. κA → κB
iso_{1} x := Λκ. λl. map_{κ} x l

iso_{2} : (∀κ:Functor. κA → κB) → A → B
iso_{2} y := y Id

In the above we use Id, which is the identity functor.
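In Haskell, the two directions of the claimed isomorphism can be written down directly with rank-2 polymorphism (a sketch of the exercise; the names Poly, iso1, and iso2 and the locally defined Id functor are mine):

```haskell
{-# LANGUAGE RankNTypes #-}

-- a local identity functor
newtype Id a = Id { runId :: a }

instance Functor Id where
  fmap f (Id x) = Id (f x)

-- the polymorphic type ∀κ:Functor. κA → κB
type Poly a b = forall f. Functor f => f a -> f b

iso1 :: (a -> b) -> Poly a b
iso1 x = fmap x

iso2 :: Poly a b -> (a -> b)
iso2 y = runId . y . Id
```

That iso2 (iso1 x) = x is the easy direction computed below; the harder direction, iso1 (iso2 y) = y, is exactly what the free-theorem argument that follows establishes.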

To prove this we need to show that both ways of composing these two functions produce the identity function. One way is easy:

iso_{2} (iso_{1} x) a =
map_{Id} x a =
x a

So we see that iso_{2} (iso_{1} x) = x as required. The other direction is harder.

iso_{1} (iso_{2} y) κ l =
map_{κ} (y Id) l

At this point we get stuck. We need to show that map_{κ} (y Id) l = y κ l for an arbitrary functor κ and arbitrary l : κA. To continue the proof, we are going to have to use the parametricity of y : ∀κ:Functor. κA → κB.

Parametricity of y gives us a free theorem which states that given two functors F_{1} and F_{2} and a functor morphism η : ∀α. F_{1}α → F_{2}α we have that:

η_{B} ∘ y F_{1} = y F_{2} ∘ η_{A}

A functor morphism is better known as a natural transformation. A natural transformation η : ∀α. F_{1}α → F_{2}α is required to satisfy the following law: For any types C and D and any function g : C → D, we require that,

map_{F1} g ∘ η_{C} = η_{D} ∘ map_{F2} g... no wait: map_{F2} g ∘ η_{C} = η_{D} ∘ map_{F1} g

My proof will proceed in two stages. The first lemma I will prove is:

Lemma 1

y Id = y (Env A) id_{A}

Here Env A is the environment functor (it is also a monad) aka the reader functor, and id is the identity function. To prove our lemma it suffices to show that y Id a = y (Env A) id a for an arbitrary a : A. We will use the free theorem to prove this with a natural transformation from (Env A) to Id defined as

η_{1} : ∀α. Env A α → Id α
η_{1} := Λα. λh. h a

However, we must first prove that this is a natural transformation; i.e. we need to prove that map_{Id} g ∘ η_{1C} = η_{1D} ∘ map_{(Env A)} g:

map_{Id} g (η_{1C} h) =
g (h a) =
(g ∘ h) a =
η_{1D} (map_{(Env A)} g h)
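This naturality square is easy to spot-check in Haskell, modelling Env A as the standard function functor ((->) a), whose fmap is post-composition (the code below is an illustrative check at sample points, not a proof; the names eta1 and naturalAt are mine):

```haskell
-- a local identity functor
newtype Id a = Id { runId :: a }

instance Functor Id where
  fmap f (Id x) = Id (f x)

-- η₁ at a fixed a : A evaluates an environment (a function out of A) at a.
-- Env A is modelled by the Prelude's Functor ((->) a) instance.
eta1 :: a -> (a -> alpha) -> Id alpha
eta1 a h = Id (h a)

-- the naturality square: fmap g . eta1 a  =  eta1 a . fmap g,
-- observed at a single point
naturalAt :: Eq d => (c -> d) -> a -> (a -> c) -> Bool
naturalAt g a h = runId (fmap g (eta1 a h)) == runId (eta1 a (fmap g h))
```

For instance, naturalAt show a (+1) is True at every sample a, reflecting the calculation above.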

Now we can apply our free theorem using the natural transformation η_{1} to prove the lemma.

y Id a =
y Id (η_{1A} id_{A}) =
η_{1B} (y (Env A) id_{A}) =
y (Env A) id_{A} a

This completes Lemma 1. To finish our proof of the isomorphism we only need to prove Lemma 2:

Lemma 2

map_{κ} (y (Env A) id_{A}) l = y κ l

To prove this we use the free theorem with a new natural transformation η_{2} from Env A to κ, defined using our arbitrary l : κA:

η_{2} : ∀α. Env A α → κ α
η_{2} := Λα. λh. map_{κ} h l

Naturality of η_{2} follows from the functor laws: map_{κ} g (map_{κ} h l) = map_{κ} (g ∘ h) l = η_{2D} (map_{(Env A)} g h).

Now we can apply our free theorem using the natural transformation η_{2} to prove Lemma 2.

map_{κ} (y (Env A) id_{A}) l =
η_{2B} (y (Env A) id_{A}) =
y κ (η_{2A} id_{A}) =
y κ (map_{κ} id_{A} l) =
y κ l

This completes the proof of the isomorphism. Perhaps it can be simplified by combining the two steps into one, but I am satisfied with this two-step process.

Instantiating Meaning Formulas

In my last post I noted that the Trustable Communications paper was unclear about how to instantiate meaning formulas; in particular it doesn't say how much simplification is done.

After reviewing the paper I realize that the word "instantiate" doesn't occur at all in the paper. In fact, it was written before the "Biform Theories in Chiron" paper, and doesn't even mention Chiron at all. Instantiation of meaning formulas is defined in the "Biform Theories in Chiron" paper.

The function mapping inputs to instances of meaning formulas appears to take the place of formuloids in the Trustable Communications paper. Indeed the process of instantiation "maps each E in dom(Pi) to a formula of L" just as the definition of a formuloid requires. However, the Biform Theories in Chiron paper also has a further step where the instantiated formula is "reduced" to the desired form. So the mapping from inputs to reduced instantiated formulas could also be the map that takes the place of formuloids.

Indeed the reduced formulas have a more pleasant form, with the evals removed, which is closer to the outputs of the sample formuloids given in the Trustable Communications paper. The question remains: what is reduction? This is never specified in the Biform Theories in Chiron paper. There is only one example of reduction, and that example only shows one step of reduction of evaluations of quoted expressions.

Should services do reduction before interpretation? How much reduction should be done? A sufficiently smart reduction machine might go ahead and reduce the entire expression down to "true" before interpretation, and we wouldn't want that to happen. Perhaps no reduction should happen after instantiation and the evals and quotes should be translated. But that could be quite a burden on the interpreter.

I'm working out a concrete example of an informed request in Chiron (see Trustable Communication Between Mathematics Systems). I'm considering an example where the source biform theory stack has at its base Peano Arithmetic, whose language has 0, S, +, and *, plus an extension to define ^, and the target biform theory is a big-int theory, which I've been calling GMP even though the current implementation doesn't use GMP. The GMP language has many operators, such as +, *, ^, sqrt, sqr, div, and mod, and it has one symbol for every integer.

The goal here is to use the trustable communications framework to get GMP to compute the product of two "big integers", say (3^15 + 1)*(5^12 + 1). The first thing to notice is that what we have written down isn't in the language of PA. There are various ways to deal with this. We could say that when I write 3 I really mean to write "(S (S (S 0)))" and similarly for other numerals. We could alternatively say that when I write 12 I really mean to write "(2^(2^(2^0) + 2^0) + 2^(2^(2^0)))" where 2 really means "(S (S 0))", or if we want to stay even closer to the string notation, when I write 12 I really mean to write "(1*10^(1*10^0) + 2*10^0)" where 10 really means "(S (S ... 0))" and 9 ... 1 really means .... Or for both the iterated binary and iterated decimal notations we could make an extension of PA to add symbols for the small number of digits needed. All of these choices and more are reasonable; we will not need to commit ourselves to any particular choice for this discussion, but we can assume some choice is made.

GMP has a primitive multiply transformer which appears to be perfect to use as our service to compute the product (3^15 + 1)*(5^12 + 1), but that transformer's input has to be a pair of integer symbols. If the translations of the inputs (3^15 + 1) and (5^12 + 1) have to be integer symbols, then the translation has to do the computation to evaluate them, and the translator would be doing more work than the service itself!

So instead, we give GMP a service called "normalizeT". This transformer takes an expression in the term language of GMP and evaluates it to normal form, a single integer symbol. Now the translator can be as straightforward as we expect. The source language of PA is almost a subset of GMP. The only problem is with "Successor" which we simply map to "1 +" on the GMP side.

The next step in the trustable communications method is to instantiate the meaning formula for the normalize service. The meaning formula appears to be fairly trivial. It says forall e:Expr_GMP. [[normalizeT(e)]] = [[e]], and we instantiate this at (3^15 + 1)*(5^12 + 1) (note that this is the translated input, it just happens that GMP uses identical looking symbols to the language of PA). After simplifying the instance we get 3-5-0-3-1-5-1-3-8-1-5-3-6-4-0-8 = (3^15 + 1)*(5^12 + 1) where 3-5-0-3-1-5-1-3-8-1-5-3-6-4-0-8 is a single symbol in the language of GMP denoting the number "3503151381536408". Now this meaning formula instance needs to be interpreted back into the language of Peano Arithmetic. There are a few problems here.
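The arithmetic here checks out, which we can confirm with a toy model of normalizeT (the GExpr datatype and all names below are invented; in Chiron the transformer operates on quoted syntax, which we model with a plain algebraic data type):

```haskell
-- Toy GMP-style term language; every term evaluates to an integer symbol.
infixl 6 :+:
infixl 7 :*:
infixr 8 :^:

data GExpr = Sym Integer          -- an integer symbol
           | GExpr :+: GExpr
           | GExpr :*: GExpr
           | GExpr :^: GExpr
  deriving Show

-- normalizeT evaluates a term to normal form: a single integer symbol
normalizeT :: GExpr -> GExpr
normalizeT e = Sym (eval e)
  where
    eval (Sym n)   = n
    eval (a :+: b) = eval a + eval b
    eval (a :*: b) = eval a * eval b
    eval (a :^: b) = eval a ^ eval b

-- the example query: (3^15 + 1) * (5^12 + 1)
query :: GExpr
query = (Sym 3 :^: Sym 15 :+: Sym 1) :*: (Sym 5 :^: Sym 12 :+: Sym 1)
```

normalizeT query evaluates to Sym 3503151381536408, the number denoted by the symbol 3-5-0-3-1-5-1-3-8-1-5-3-6-4-0-8 above.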

The RHS, "(3^15 + 1)*(5^12 + 1)", is easy because it contains symbols that were translated from the language of PA, so we convert them back. The only tricky problem is that S was translated to "1 +". So we need to either interpret "1" going back as "S 0", or we need to have a sophisticated translation that finds "1 +" and replaces it in the interpretation with "S". Alternatively we could extend the language of GMP to include a successor function, but that seems a little strange because it would mean we are tailoring the target theory to our source theory, and the point of trustable communication was separation of concerns about these two theories. The translation/interpretation link should be the only thing customized to the two interfaces. (Perhaps it is okay to make a conservative extension of the target theory to facilitate the link?) This is also pointing to a bigger potential problem. In many cases we expect the response to our query to contain our original inputs as part of the resulting formula, usually occurring on one side of an equality. However, if our translation is not injective, then it will be impossible for the interpreted response to always contain the original question. Fortunately in this example the translation is injective, but will it always be so?

The LHS, 3-5-0-3-1-5-1-3-8-1-5-3-6-4-0-8, is more difficult to interpret back because it isn't in the image of the translation. Again we have several not-so-great choices to use for our interpretation. We could return S (S (S ... 0)), which would be infeasible in this case. We could return iterated binary or iterated decimal notations (i.e. (2^(2^(2^(2^(2^0)) + 2^0)+ ...) + ...) or (3*10^(1*10^1 + 5*10^0) + ...)). Again, any one of these choices is reasonable (although frankly, none of these solutions is all that compelling).

One reason these solutions are not compelling is that if we want to make another request to GMP we have to translate back again. It would be nice if we could hold onto the GMP result so that we can reuse it in a further request. One idea would be to suspend the translation. Say we have a mk_iterated_decimal transformer for interpreting GMP integer symbols into iterated decimal notation in the language of PA. We can suspend this processing and instead return [[mk_iterated_decimal(`3-5-0-3-1-5-1-3-8-1-5-3-6-4-0-8')]] or maybe run(`mk_iterated_decimal(`3-5-0-3-1-5-1-3-8-1-5-3-6-4-0-8')') or some other variant of eval/run.

There is another issue with our meaning formula for normalize: forall e:Expr_GMP, [[normalizeT(e)]] = [[e]]. This meaning formula terribly underspecifies what normalize does. In fact, normalize could do nothing at all (the identity function) and it would still have the same meaning formula. The point of normalize is that the syntax of the output should be a single integer symbol.

A proper specification for normalize is forall e:Expr_GMP, [[normalizeT(e)]] = [[e]] /\ is_numeral(normalizeT(e)). It is the meaning formula plus an additional specification that normalizeT(e), as a piece of syntax, is a numeral. Here we are assuming that we extend the "meta-"language of GMP with a new predicate is_numeral. is_numeral is not a transformer because it is taking quoted syntax to values. A transformer always takes syntax to syntax. However we could imagine a transformer version of is_numeral that takes quoted syntax to the quoted syntax of Bool. In this case the specification for normalize would be written forall e:Expr_GMP, [[normalizeT(e)]] = [[e]] /\ [[is_numeralT(normalizeT(e))]].

If we take one of these specifications as the meaning formula for normalize, we are faced with the problem of interpreting this extended meaning formula back. For the first possible specification, after instantiation, we need to translate is_numeral(`3-5-0-3-1-5-1-3-8-1-5-3-6-4-0-8') back to the (meta?)language of PA. But we don't really have any obvious candidates for what to map this back to. The closest thing our (meta?)language has is an is_normal_form predicate which can say that a number is in the form "(S (S (S ... 0)", but that isn't really an appropriate interpretation for is_numeral. We also need to interpret `3-5-0-3-1-5-1-3-8-1-5-3-6-4-0-8' into the (meta?)language of PA. This interpretation might not be the same as our interpretation of 3-5-0-3-1-5-1-3-8-1-5-3-6-4-0-8. One expression denotes syntax whereas the other denotes semantics.

It may be unclear how to interpret this part of the specification, but we do know what the answer should be. We want to say that the result in PA is either is_unary(result), is_iterated_binary(result), or is_iterated_decimal(result), depending on what choice we made for our interpretation of GMP integer symbols. Let's say we picked iterated decimal notation. So the result of interpreting is_numeral(`3-5-0-3-1-5-1-3-8-1-5-3-6-4-0-8') should be is_iterated_decimal(interpret(`3-5-0-3-1-5-1-3-8-1-5-3-6-4-0-8')). Notice that we are using the transformer name interpret in the output of the interpretation! Actually this is more like the suspended version. The output is more likely to be is_iterated_decimal(`3*10^(1*10^1 + 5*10^0) + ...').

If we use [[is_numeralT(e)]] in the clause of our meaning formula, then after some simplification of our instance of the meaning formula, this reduces to true, which is easy to interpret into PA and the problem is avoided. What is unclear to me is how much simplification is done between the instantiation and the interpretation back. This simplification step isn't explicitly mentioned in the Trustable Communication paper.

Perhaps this is all pointing to a defect in our example. Remember that 3-5-0-3-1-5-1-3-8-1-5-3-6-4-0-8 is not in the image of the translations. So we are trying to use a service whose result doesn't occur in (the image of) our language in PA. Perhaps this is just a bad way to communicate. Perhaps we should only be running queries where both the input and outputs of the service are in (the image of) our source language. If we restrict ourselves to such queries, do our problems go away? Is this too restrictive for the types of queries we wish to make?

The Chiron implementation has a function called run which is supposed to resemble what happens when a user passes an input to the top of a computer algebra system and some sort of answer is spit out at the bottom. A central idea of Mathscheme is that this computation should be done through a series of transformers. But getting from user input to output via transformers is not as clear as one might think.

factor should produce a list of (polynomial, natural number) pairs, and ifactor should produce a similar list with integers. expand should produce a polynomial as a linear combination in the power basis. 6*35 should produce 210, and it is unclear to me what (X^2 - 1)*(X^2 + 1) should produce; Maple, for example, will spit it back out to you.

The question is, how do we get from these inputs to these output via transformers? What are the names of the symbols in the input referring to? Are they transformers, or are they mathematical concepts?

Take "6*35" as an example. The * here seems like it refers to the concept of integer multiplication. However, the concept of multiplication cannot actually multiply strings of digits and return an answer. Only a transformer can do that. So perhaps * is the name of the transformer for multiplication. However, transformers must take syntax as parameters, not numbers.

There are two possible ways to deal with this. If * is the concept of multiplication, then run needs to look up the concept of * in the environment, substitute it with an associated transformer, and run the transformer to get 210. If * is a transformer, then the actual input is really going to be "`6' *T `35'", and run simply finds the transformer whose name is *T and evaluates it on the inputs `6' and `35', resulting in `210'.

The other examples lead to some insight. factor looks more like a transformer than a concept, though a concept could be created for it. expand definitely appears to be a transformer. There isn't really a mathematical concept associated with it. So this suggests that the user input consists of transformer names and not concepts.

The problem is since factor is a transformer (let's call it factorT from now on to make it clear that it is a transformer name, though in practice the name would really be "factor") it is expecting syntax as its input. This means the user input needs to really be "factorT(`(X^2 - 1)*(X^2 + 1)')".

Why do we quote the argument to factorT? Do we just quote the input to all transformers? No, because we can chain transformers together. The input to the factor-expand example should really be "factorT(expandT(`(X^2 - 1)*(X^2 + 1)'))" and neither "factorT(`expandT((X^2 - 1)*(X^2 + 1))')" nor "factorT(`expandT(`(X^2 - 1)*(X^2 + 1)')')". The last two don't even typecheck because "expand" isn't part of the language of syntax that factorT accepts.

So why do we push the quotes only in as far as `(X^2 - 1)*(X^2 + 1)' and not any further? Shouldn't we be doing `(X^2 - 1)' *T `(X^2 + 1)'? In fact, shouldn't we push further down and use -T and ^T as well? We really don't want to push the quotes further in because if *T, say, multiplies two polynomials out into an expanded form, then factor has to undo all that work to get the factorization. If factor is smart, and we actually pass it the syntax of a product of polynomials, then it can factor each factor independently and merge the results, which is much more efficient. Similarly for "ifactorT(`6*35')": if we don't run a "`6' *T `35'" transformation, then a smart implementation of ifactorT will have a much easier time.

The key to solving this conundrum is that there is a set of trivial transformers for * that are simply term constructors. In particular there is a transformer named &*T that will take `6' and `35' and return `6*35'. This is distinct from the transformer named *T that takes `6' and `35' and returns `210'. In fact, these two transformers don't even have the same type. *T operates on a language L1 which is essentially [0-9]+, while &*T operates on a language L2 which is a term algebra for integers with symbols for + and * and whatnot.

Notice that &*T and *T both have approximately the same meaning formula. The meaning formula for *T is "forall x,y : E_L1. [[x *T y]] = [[x]] * [[y]]" and the meaning formula for &*T is "forall x,y : E_L2. [[x &*T y]] = [[x]] * [[y]]". It is now clear that meaning formulas are not a complete specification of the transformers. Transformers could produce lots of different syntax that all denote the same result, and meaning formulas of this form do not distinguish between them. I'm not saying this is a bad thing. The purpose of meaning formulas may not be to fully specify the transformer behaviour; however if meaning formulas are intended as specification, then a lot more details need to be put in them than currently appear in the Chiron-Biform theories paper.

So with two different transformers in hand, both "representing" multiplication in different ways, we can now start to disambiguate what is going on. There are really two different possible user inputs that correspond to "ifactor(6*35)". The user could ask for "ifactorT('6' *T '35')", in which case the computer will first multiply 6 and 35 together to get 210 before then factoring 210. Alternatively the user could ask for "ifactorT('6' &*T '35')", in which case the computer would first combine '6' and '35' into the syntax tree '6*35' and pass that tree to ifactor, which would hopefully be clever enough to factor 6 and 35 independently and merge the results. Notice that the results of both functions are the same. Only their operational behaviour is different. We could even have two different ifactorT transformers. One simple version, ifactorSimpleT, that only understands the language L1, and one clever version, ifactorCleverT, that accepts a broader language L2 and can understand how to factor '6*35' without first multiplying the factors together.
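The distinction between the two transformers, and why the clever factorer benefits from unexpanded syntax, can be sketched with toy datatypes for L1 and L2 (all of the Haskell names here are invented, and trial division stands in for a real factorizer):

```haskell
-- L1: numerals only; L2: a term algebra over numerals (a toy model).
newtype L1 = Numeral Integer           deriving (Show, Eq)
data    L2 = Lit Integer | L2 :*& L2   deriving (Show, Eq)

-- *T : computes the product, staying inside L1
mulT :: L1 -> L1 -> L1
mulT (Numeral x) (Numeral y) = Numeral (x * y)

-- &*T : a trivial transformer that just builds the syntax tree in L2
ampMulT :: L1 -> L1 -> L2
ampMulT (Numeral x) (Numeral y) = Lit x :*& Lit y

-- trial-division factorization of a single integer
factorInt :: Integer -> [Integer]
factorInt = go 2
  where
    go _ 1 = []
    go d n | d * d > n      = [n]
           | n `mod` d == 0 = d : go d (n `div` d)
           | otherwise      = go (d + 1) n

-- ifactorCleverT understands L2: it factors each factor independently
ifactorCleverT :: L2 -> [Integer]
ifactorCleverT (Lit n)   = factorInt n
ifactorCleverT (a :*& b) = ifactorCleverT a ++ ifactorCleverT b
```

mulT (Numeral 6) (Numeral 35) computes Numeral 210, while ampMulT (Numeral 6) (Numeral 35) merely builds the tree Lit 6 :*& Lit 35, which ifactorCleverT can then factor piecewise without ever forming 210.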

The conclusion from all this is that run is always given an expression of transformers, although some of them might be trivial transformers. The leaves of the expressions must be quoted constants, or alternatively they could be really trivial transformers that take no parameters and simply return a quoted constant (depending on your formalization these two ideas may simply be identical). If we consider run as a transformer itself, then its input is the syntax of the language of syntax transformers. In this sense I say that run operates on meta-syntax, and that run is a meta-transformer. This is no problem in Chiron which can formalize meta syntax in its syntax. The result of run is regular syntax of some language, which depends on the transformers used. So run maps meta-syntax to syntax.

This means that the output of run cannot be used as the input to run, because they are of different types. However there is always a trivial injection from syntax to meta-syntax that replaces each operation of the language with the trivial transformer that inserts the corresponding operation into the syntax, i.e. "6*35" can be lifted to "'6' &*T '35'".

I don't promise that everything I write here will be correct. All polynomials we will consider are assumed to be separable. All types are going to have decidable equality and choice. The base field F is assumed to be splitable, by which I mean it is decidable whether a polynomial over F is irreducible or not. Since we will only consider finite extensions of F, they should all be splitable as well.

Suppose we are given a polynomial f ∈ F[x] of degree n. How do we compute the Galois group of f? Assume we are also given a splitting field E = F(α_{1},…,α_{n}) of f. The traditional approach is to define the Galois group to be the automorphisms of E that fix F. This is unsuitable for formalization in Coq because the carrier consists of functions, which leads to all sorts of issues about undecidable equality and functional extensionality that we would rather not deal with because, after all, the Galois group is going to be a nice finite set.

The obvious candidate to represent the Galois group is some subgroup of the symmetric group on {α_{1},…,α_{n}}, but which subgroup do we consider? The modern mathematical answer is simply “the one that generates automorphisms of E”, but this isn't a priori a decidable predicate.

The classical answer is to first prove some form of the primitive element theorem: we need to find an element β ∈ E such that F(β) = E. We also need to compute the minimal polynomial of β, μ ∈ F[x] of degree m. Once we have done this, things look much easier: the powers of β, i.e. β^{0},…,β^{m-1}, form a basis for E.
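As a standard worked example (my own illustration, not part of any formalization): take F = ℚ and f = (x² − 2)(x² − 3), so E = ℚ(√2, √3). Then β = √2 + √3 is a primitive element, since

```latex
\beta^2 = 5 + 2\sqrt{6}, \qquad (\beta^2 - 5)^2 = 24,
\qquad\text{so}\qquad \mu(x) = x^4 - 10x^2 + 1, \quad m = 4,
```

and {1, β, β², β³} is a basis for E over ℚ, matching [E : ℚ] = 4.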

Let β_{1},…,β_{m} be all the roots of μ. We also need to compute rational functions θ_{1}(x),…,θ_{m}(x) ∈ F(x) such that each root of μ can be computed from β, i.e. β_{i} = θ_{i}(β). (Actually we can find polynomials θ_{i}, but I say rational functions here because in some cases it might be easier to compute them using Lagrange's rational function theorem.)
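To illustrate with the standard example β = √2 + √3 and μ = x⁴ − 10x² + 1 (my own illustration): since

```latex
\beta^3 = 11\sqrt{2} + 9\sqrt{3}
\quad\Rightarrow\quad
\sqrt{2} = \frac{\beta^3 - 9\beta}{2}, \qquad
\sqrt{3} = \frac{11\beta - \beta^3}{2},
```

the four roots ±√2 ± √3 of μ are given by the polynomials θ_{1}(x) = x, θ_{2}(x) = x³ − 10x, θ_{3}(x) = 10x − x³, θ_{4}(x) = −x; e.g. √2 − √3 = (β³ − 9β)/2 − (11β − β³)/2 = β³ − 10β.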

Since β generates E, E is also the splitting field of μ, so the Galois group of μ is going to be the same as the Galois group of f. But μ is a much nicer polynomial than f because when one root of μ is added to F, the entire polynomial factors completely. In particular, adding β to F causes μ to factor completely because we can compute every other root of μ as a rational function of β using the θ_{i}'s.

The Galois permutations of the roots of μ are also easy to compute. First choose an element of {β_{1},…,β_{m}} for β to map to, say β_{j}. Then the θ_{i}'s dictate how the rest of the roots must be permuted: in particular, β_{i} = θ_{i}(β) must be mapped to θ_{i}(β_{j}). The field automorphism associated with this choice is also easy to see: since E = F(β), an automorphism fixing F is completely determined by its behaviour on β.
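This step is concrete enough to sketch in code. The following Haskell fragment (my own illustration, assuming the example μ = x⁴ − 10x² + 1 with β = √2 + √3 and the θ_{i}'s given as polynomials) computes each permutation by reducing θ_{i} ∘ θ_{j} modulo μ, since the automorphism β ↦ β_{j} sends β_{i} = θ_{i}(β) to θ_{i}(θ_{j}(β)):

```haskell
import Data.List (elemIndex)
import Data.Maybe (fromJust)

-- Polynomials over Q, coefficients in ascending order: [a0,a1,..] ~ a0 + a1*x + ..
type Poly = [Rational]

trim :: Poly -> Poly
trim = reverse . dropWhile (== 0) . reverse

addP :: Poly -> Poly -> Poly
addP xs [] = xs
addP [] ys = ys
addP (x:xs) (y:ys) = (x + y) : addP xs ys

mulP :: Poly -> Poly -> Poly
mulP [] _ = []
mulP (x:xs) ys = addP (map (x *) ys) (0 : mulP xs ys)

-- Remainder of p modulo a *monic* polynomial q.
modP :: Poly -> Poly -> Poly
modP p q
  | length p' < length q' = p'
  | otherwise = modP (addP p' (map (negate c *) (replicate d 0 ++ q'))) q'
  where p' = trim p
        q' = trim q
        d  = length p' - length q'
        c  = last p'

-- Evaluate p at q (Horner's rule), reduced mod m at each step.
compMod :: Poly -> Poly -> Poly -> Poly
compMod m p q = foldr (\c acc -> modP (addP [c] (mulP q acc)) m) [] p

mu :: Poly
mu = [1, 0, -10, 0, 1]            -- x^4 - 10x^2 + 1

-- theta's expressing each root of mu as a polynomial in beta = sqrt 2 + sqrt 3:
-- beta, beta^3 - 10*beta (= sqrt 2 - sqrt 3), 10*beta - beta^3, -beta
thetas :: [Poly]
thetas = [[0, 1], [0, -10, 0, 1], [0, 10, 0, -1], [0, -1]]

-- The permutation of root indices induced by the choice beta |-> beta_j.
perm :: Int -> [Int]
perm j = [ fromJust (elemIndex (trim (compMod mu t (thetas !! j))) thetas)
         | t <- thetas ]

main :: IO ()
main = mapM_ (print . perm) [0 .. 3]
```

Here perm 1 comes out as [1,0,3,2], and the four permutations together form the Klein four-group, as expected for the Galois group of ℚ(√2, √3) over ℚ.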

So an element of the Galois group is specified by which root β is mapped to. There were m choices available, so the Galois group has size m. This is great because m is also the dimension of E as a field extension of F, as it ought to be.

Thus we are left with the question of how to compute β. One way of finding a suitable β is to look for a linear combination of α_{1},…,α_{n} that works. There are only a finite number of combinations that won't work, and it seems to be decidable which combinations will or won't work. This is great when F is infinite. When F is finite it is also easy to find β, because it will be a generator of the cyclic multiplicative group of E. Thus this procedure only works on fields that are either finite or infinite. Perhaps there is a more general procedure for finding β, but this restriction may not be too severe for most purposes.
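The finiteness of the bad combinations is the standard Artin-style argument (sketched here under the stated separability assumption, for two roots; the general case iterates it). With β = α₁ + cα₂, a choice of c ∈ F fails only if β is also fixed by some nontrivial conjugation, i.e.

```latex
\alpha_1 + c\,\alpha_2 = \alpha_i + c\,\alpha_j
\quad\Longleftrightarrow\quad
c = \frac{\alpha_i - \alpha_1}{\alpha_2 - \alpha_j}
\qquad (\alpha_j \neq \alpha_2),
```

so there are at most finitely many bad values of c, and with decidable equality on the field it is decidable whether a given c is one of them; an infinite F always has a good c left over.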

One can also follow Galois's original approach to find a minimal polynomial for β without using the field extension E. At least assuming F has characteristic 0 (perhaps it works more generally) and f is irreducible, one can find a Galois resolvent, V, which is a linear multivariable polynomial in n variables. It is generated in such a way that V(α_{1},…,α_{n}) is a suitable β, but one can compute V using only the field F. One can compute a symmetrized version of this Galois resolvent as a polynomial in n+1 variables. One can evaluate this symmetrized polynomial at (α_{1},…,α_{n}), leaving a polynomial V* ∈ F[x]; it lies in F[x] because of the symmetry of the original multivariable polynomial. V* is a polynomial of degree n!, and every root of it generates the splitting field E. Thus one can factor V* into several irreducible polynomials of degree m, and each one would make a suitable minimal polynomial μ. Finally, we have managed to construct μ without working in the splitting field E. This is nice because we can now construct a splitting field as F[x]/μ and represent the Galois group by the roots of μ in F[x]/μ. Obviously I'm quite sketchy on the details in this last paragraph. A little more research is needed here.