Yoneda’s lemma is not hard, say I, having spent an embarrassing amount of time formalising it in Agda. It’s really just following the types. However, the types are complex and Awodey goes over the details a little too quickly for my taste, so I feel validated in processing my trauma by writing a blog post, millennial style.
I assume familiarity with products, exponentials, naturality and the representable functors. The next section contains very brief discussions of these topics, to serve as a reference and to fix the notation (which is mostly Awodey’s, but I deviate in a few places). Afterwards, we’ll dive into the proof.
1 Prerequisites
1.1 Notation
Categories use boldfaced letters. I write \(A ∈ \mathbf{C}\) for “A is an object of \(\mathbf{C}\)” and \(f : A → B ∈ \mathbf{C}\) for “\(f\) is an arrow from \(A\) to \(B\) in \(\mathbf{C}\)”.
For function application, I sometimes use juxtaposition \((f \;x)\) and sometimes parentheses \(f(x)\). I also write things like \(f(g \;∙, x)\) to describe the function \(λ y → f \;(g \;y) \;x\).
I like to put type information all over the place, so for example \(f = g : A → B\) means that \(f : A → B\), \(g : A → B\) and \(f = g\). \(f \;(x ∈ S)\) is the function \(f\) applied to \(x\), noting that \(x ∈ S\).
1.2 Products
Two objects \(A, B ∈ \mathbf{C}\) have a product, \(A × B\), if we have arrows \(\mathrm{fst}: A × B → A\) and \(\mathrm{snd}: A × B → B\) and for every pair of arrows \(f : X → A\), \(g : X → B\) there is a unique \(⟨ f , g ⟩ : X → A × B\) such that \(\mathrm{fst}∘ ⟨ f , g ⟩ = f\) and \(\mathrm{snd}∘ ⟨ f , g ⟩ = g\).
Given two arrows \(f : A → A′\) and \(g : B → B′\), we can define an arrow that maps \(f\) and \(g\) over the two components of a product: \[\begin{eqnarray} && ⟨ f × g ⟩ : A × B → A′ × B′ \\ && ⟨ f × g ⟩ := ⟨ f ∘ \mathrm{fst}, g ∘ \mathrm{snd}⟩ \end{eqnarray}\]
We can also define an arrow that swaps the two components of a product: \[\begin{eqnarray} && \mathrm{Swap}: A × B → B × A \\ && \mathrm{Swap}:= ⟨ \mathrm{snd}, \mathrm{fst}⟩ \end{eqnarray}\]
1.3 Naturality
A natural transformation \(θ : F → G\) between functors \(F, G : \mathbf{C}→ \mathbf{D}\) is a family of functions \(θ_C : F \;C → G \;C\) (\(∀ C ∈ \mathbf{C}\)), with the property that for all \(C\), \(D\) and \(f : C → D ∈ \mathbf{C}\),
\[ G \;f ∘ θ_C = θ_D ∘ F \;f : F \;C → G \;D \]
\(θ_C\) is called the component of \(θ\) at \(C\).
1.4 Exponentials
The object \(C ⇒ D ∈ \mathbf{C}\) is an exponential object of \(C\) and \(D\) in \(\mathbf{C}\) if we have the following:
An arrow \(\mathrm{Ap}: (C ⇒ D) × C → D\).
For every arrow \(f : B × C → D\) an exponential transpose \(\mathrm{Curry}\;f : B → C ⇒ D\) such that \(\mathrm{Curry}\;f\) is unique with the property that
\[ \mathrm{Ap}∘ ⟨ \mathrm{Curry}\;f × \mathrm{id}⟩ = f \]
\(\mathbf{Cats}\), the category of small1 categories, has all exponentials. Recall the definitions of \(\mathrm{Ap}\) and \(\mathrm{Curry}\) in this category, which we’ll need later (where \(\mathbf{C}⇒ \mathbf{D}\) is the category of functors from \(\mathbf{C}\) to \(\mathbf{D}\), whose arrows are natural transformations): \[\begin{eqnarray} && \mathrm{Ap}\;(F , C) := F \;C \\ && \mathrm{Ap}\;(θ : F → G, f : C → D) : F C → G D \\ && \mathrm{Ap}\;(θ, f) := G \;f ∘ θ_C \\ && \\ && \mathrm{Curry}\;(F : \mathbf{B}× \mathbf{C}→ \mathbf{D}) : \mathbf{B}→ \mathbf{C}⇒ \mathbf{D}\\ && \mathrm{Curry}\;(F : \mathbf{B}× \mathbf{C}→ \mathbf{D}) \;(B ∈ \mathbf{B}) := \widetilde{F}_B \\ && \mathrm{where} \\ && \widetilde{F}_B \;C := F \;B \;C \\ && \widetilde{F}_B \;(f : C → C′) : F \;B \;C → F \;B \;C′ \\ && \widetilde{F}_B \;f := F \;(\mathrm{id}, f) \\ && \\ && \mathrm{Curry}\;(F : \mathbf{B}× \mathbf{C}→ \mathbf{D}) \;(f : B → B′) : \widetilde{F}_B → \widetilde{F}_{B′} \\ && (\mathrm{Curry}\;F \;f)_C : F \;B \;C → F \;B′ \;C \\ && (\mathrm{Curry}\;F \;f)_C := F \;(f, \mathrm{id}) \end{eqnarray}\]
You may want to verify that the above are valid definitions of functors/natural transformations.
Using \(\mathrm{Curry}\), we can also define a function that transposes a bifunctor in its second argument, rather than the first: \[\begin{eqnarray} && \mathrm{Curry}₂ \;F : \mathbf{C}→ \mathbf{B}⇒ \mathbf{D}\\ && \mathrm{Curry}₂ \;F := \mathrm{Curry}\;(F ∘ \mathrm{Swap}) \end{eqnarray}\]
1.5 Representable Functors
For any category \(\mathbf{C}\) and objects \(C, D ∈ \mathbf{C}\), \(\mathrm{Hom}_\mathbf{C}(C, D)\) is the set of arrows \(f : C → D ∈ \mathbf{C}\). We overload the name \(\mathrm{Hom}\) to also mean the representable functor of \(\mathbf{C}\), defined by \[\begin{eqnarray} && \mathrm{Hom}_\mathbf{C}: \mathbf{C}^{\mathrm{op}} × \mathbf{C}→ \mathbf{Sets}\\ && \mathrm{Hom}_\mathbf{C}(C, D) := \mathrm{Hom}_\mathbf{C}(C, D) \\ && \mathrm{Hom}_\mathbf{C}(f : C → C′ ∈ \mathbf{C}^{\mathrm{op}}, g : D → D′ ∈ \mathbf{C}) : \mathrm{Hom}(C, D) → \mathrm{Hom}(C′, D′) \\ && \mathrm{Hom}_\mathbf{C}(f , g) := λ (h : C → D) → g ∘ h ∘ f \end{eqnarray}\]
Note that \(f : C → C′ ∈ \mathbf{C}^{\mathrm{op}}\), so \(f : C′ → C ∈ \mathbf{C}\). I’ll usually omit the index of \(\mathrm{Hom}\).
1.6 Yoneda Embedding
The Yoneda Embedding \(y\) (for a given category \(\mathbf{C}\)) is simply the \(\mathrm{Hom}\) functor transposed in its second argument, i.e. \[\begin{eqnarray} && y : \mathbf{C}→ \mathbf{C}^{\mathrm{op}} ⇒ \mathbf{Sets}\\ && y := \mathrm{Curry}₂ \;\mathrm{Hom} \end{eqnarray}\]
2 Proof
2.1 Theorem Statement
When I formalised the proof, one of the main difficulties was an imprecise understanding of the actual theorem statement, so let’s get that out of the way first. As a nice side effect, we’ll discover a more succinct formulation of the theorem than what I’ve seen elsewhere.
The way the theorem is usually presented goes something like this: For an arbitrary category \(\mathbf{C}\), an object \(C ∈ \mathbf{C}\) and a functor \(F : \mathbf{C}^{\mathrm{op}} → \mathbf{Sets}\),
We have a family of isomorphisms
\[ i_{C,F} : \mathrm{Hom}(y \;C, F) ≅ F \;C \]
Recall that \(y : \mathbf{C}→ \mathbf{C}^{\mathrm{op}} ⇒ \mathbf{Sets}\), so \(y \;C : \mathbf{C}^{\mathrm{op}} → \mathbf{Sets}\). The \(\mathrm{Hom}\) is the set of natural transformations between \(y \;C\) and \(F\).
\(i\) is natural in \(C\). This means that if we hold \(F\) steady and let \(C\) vary, we get a natural transformation. More formally: The function
\[ i_{∙,F} : \mathrm{Hom}(y \;∙, F) → F \;∙ \]
is a natural transformation for all \(F\).
\(i\) is natural in \(F\), meaning that we can also let \(F\) vary while holding \(C\) steady:
\[ i_{C,∙} : \mathrm{Hom}(y \;C, ∙) → ∙ \;C \]
is a natural transformation for all \(C\).
I have two problems with this formulation. The first: When we say that \(i_{C,∙}\) is a natural transformation, then \(\mathrm{Hom}(y \;∙, F)\) must be a functor. But is it? After all, not just any old function is a functor.
Turns out it is, which becomes obvious with a little reformulation:
\[ \mathrm{Hom}(y \;∙, F) = (\mathrm{Hom}∘ ⟨ y × \mathrm{id}⟩) \;(∙, F) \]
where
- \(\mathrm{Hom}: (\mathbf{C}^{\mathrm{op}} ⇒ \mathbf{Sets})^{\mathrm{op}} × (\mathbf{C}^{\mathrm{op}} ⇒ \mathbf{Sets}) → \mathbf{Sets}\).
- \(⟨∙×∙⟩\), defined above, is an operator that maps two functions (here \(y\) and \(\mathrm{id}\)) over the two components of a pair.
Actually, the above equation contains a type error: \(\mathrm{Hom}\)’s domain is \((\mathbf{C}^{\mathrm{op}} ⇒ \mathbf{Sets})^{\mathrm{op}} × (\mathbf{C}^{\mathrm{op}} ⇒ \mathbf{Sets})\), but the codomain of \(⟨ y × \mathrm{id}⟩\) is \((\mathbf{C}^{\mathrm{op}} ⇒ \mathbf{Sets}) × (\mathbf{C}^{\mathrm{op}} ⇒ \mathbf{Sets})\). Luckily, for every functor \(F : \mathbf{C}→ \mathbf{D}\) there is a functor \(F^{\mathrm{op}} : C^{\mathrm{op}} → D^{\mathrm{op}}\), and so we can replace \(y\) with \(y^{\mathrm{op}}\) to fix the types. The construction of \(F^{\mathrm{op}}\) is trivial: \[\begin{eqnarray} && F^{\mathrm{op}} \;C := F \;C \\ && F^{\mathrm{op}} \;(f : C → D ∈ \mathbf{C}^{\mathrm{op}}) : F \;C → F \;D ∈ \mathbf{D}^{\mathrm{op}} \\ && F^{\mathrm{op}} \;f := F \;f \end{eqnarray}\]
This also brings us closer to addressing my second problem: Why are there two seemingly unrelated naturality conditions in the theorem statement? The answer, perhaps unsurprisingly, is that they are not unrelated: Yoneda’s lemma in fact talks about a single natural isomorphism.
To see why, consider that a natural transformation between bifunctors essentially consists of two natural transformations where we hold the first and second argument, respectively, steady. This looks much like the situation we’re in here: \(\mathrm{Hom}(y \;∙, ∙)\) is certainly a bifunctor. And the strange-looking “\(∙ \;∙\)” is the application bifunctor \(\mathrm{Ap}: (\mathbf{C}^{\mathrm{op}} ⇒ \mathbf{Sets}) × \mathbf{C}^{\mathrm{op}} → \mathbf{Sets}\). The lemma then simply states,
\[ \mathrm{Hom}∘ ⟨ y^{\mathrm{op}} × \mathrm{id}⟩ ≅ \mathrm{Ap}∘ \mathrm{Swap}\quad : C^{\mathrm{op}} × (C^{\mathrm{op}} ⇒ \mathbf{Sets}) → \mathbf{Sets}\]
If you’re not convinced, verify that if we have a natural isomorphism with this signature, we can satisfy all three clauses of the original theorem statement.
2.2 Isomorphism
To construct a natural iso, we first need to provide component isos
\[\begin{eqnarray} i_{C,F} && : (\mathrm{Hom}∘ ⟨ y^{\mathrm{op}} × \mathrm{id}⟩) \;(C, F) ≅ (\mathrm{Ap}∘ \mathrm{Swap}) \;(C, F) \\ && = \mathrm{Hom}(y \;C, F) ≅ F \;C \quad (∀ C, F) \end{eqnarray}\]
These isos are in \(\mathbf{Sets}\), so they are simply bijections. Fix \(C\) and \(F\) for the remainder of this section.
2.2.1 Forward
In the forward direction, we need to construct a function \(i : \mathrm{Hom}(y \;C, F) → F \;C\). Given \(θ ∈ \mathrm{Hom}(y \;C, F)\) (a natural transformation), we have a component \(θ_C : y \;C \;C → F \;C\). Recall that \(y \;C \;C = \mathrm{Hom}(C, C)\), so we can apply \(θ_C\) to \(\mathrm{id}_C ∈ \mathrm{Hom}(C, C)\) to obtain some element of the set \(F \;C\). So the definition of \(i\) is
\[ i_{C,F}(θ) := θ_C \;\mathrm{id}_C \]
2.2.2 Backward
In the backward direction, we need to construct a function \(j : F \;C → \mathrm{Hom}(y \;C, F)\). Fix the argument as \(x ∈ F \;C\). To get a natural transformation \(ι : y \;C → F\), define components \(ι_D : y \;C \;D → F \;D\) as \(ι_D(f : C → D) := F \;f \;x ∈ F \;D\).
\(ι\) also has to be natural, meaning
\[ ∀ A, B, (g : A → B). F \;g ∘ ι_A = ι_B ∘ y \;C \;g \]
Both sides of the equation are functions from \(\mathrm{Hom}(C, A)\) to \(F \;B\). Fix \(h ∈ \mathrm{Hom}(C, A)\). Then on the left-hand side, we have
\[\begin{eqnarray} && (F \;g ∘ ι_A) \;h \\ &=& F \;g \;(ι_A \;h) \\ &=& F \;g \;(F \;h \;x) \\ &=& F \;(g ∘ h) \;x \end{eqnarray}\]
On the right-hand side, recall that \(y \;C \;g = λ h → h ∘ g\). With this, we get
\[\begin{eqnarray} && (ι_B ∘ y \;C \;g) \;h \\ &=& ι_B \;(y \;C \;g \;h) \\ &=& ι_B \;(g ∘ h) \\ &=& F \;(g ∘ h) \;x \end{eqnarray}\]
So \(ι\) is indeed natural, and we can define
\[ j_{C,F}(x ∈ F \;C)_D(f : C → D) := F \;f \;x ∈ F \;D \]
2.2.3 Inverses
To see that \(i_{C,F} ∘ j_{C,F} = id : F \;C → F \;C\), take \(x ∈ F \;C\) and calculate
\[\begin{eqnarray} && (i ∘ j) \;x \\ &=& i \;(j \;x) \\ &=& (j \;x)_C \;id_C \\ &=& F \;id_C \;x \\ &=& x \end{eqnarray}\]
To see that \(j_{C,F} ∘ i_{C,F} = id : \mathrm{Hom}(y \;C, F) → \mathrm{Hom}(y \;C, F)\), take \(θ : y \;C → F\), \(D ∈ \mathbf{C}\) and \(f ∈ y \;C \;D = \mathrm{Hom}(C, D)\) and calculate
\[\begin{eqnarray} && ((j ∘ i) \;θ)_D \;f \\ &=& (j \;(i \;θ))_D \;f \\ &=& (j \;(θ_C \;id_C))_D \;f \\ &=& F \;f \;(θ_C \;id_C) \\ &=& (F \;f ∘ θ_C) \;id_C \\ &=& (θ_D ∘ y \;C f) \;id_C \\ &=& θ_D \;(y \;C \;f \;id_C) \\ &=& θ_D \;(id_C ∘ f) \\ &=& θ_D \;f \end{eqnarray}\]
Hence, \(j ∘ i\) maps \(θ\) to \(θ\).
2.3 Naturality
It remains to be seen that \(i\) is natural, i.e.
\[\begin{eqnarray} && ∀ (C, F), (D, G), ((f, θ) : (C, F) → (D, G)). \\ && (\mathrm{Ap}∘ \mathrm{Swap}) \;(f, θ) ∘ i_{C, F} = i_{D, G} ∘ (\mathrm{Hom}∘ ⟨ y^{\mathrm{op}} × \mathrm{id}⟩) \;(f , θ) \end{eqnarray}\]
The types are perhaps a little confusing, so let’s recap:
- \((C, F)\) and \((D, G)\) are objects of \(\mathbf{C}^{\mathrm{op}} × (\mathbf{C}^{\mathrm{op}} ⇒ \mathbf{Sets})\).
- \(f : C → D ∈ \mathbf{C}^{\mathrm{op}}\) (!) and \(θ : F → G ∈ \mathbf{C}^{\mathrm{op}} ⇒ \mathbf{Sets}\) (a natural transformation), so \((f, θ) : (C, F) → (D, G)\).
- \(i : \mathrm{Hom}∘ ⟨ y^{\mathrm{op}} × \mathrm{id}⟩ → \mathrm{Ap}∘ \mathrm{Swap}\) (also a natural transformation, as we will see, but in a different category).
- \(i_{C,F} : \mathrm{Hom}(y \;C, F) → F \;C\) is the component of \(i\) at \((C, F)\).
We’ll need to know what the various functors’ actions on arrows are, so here’s a reminder: \[\begin{eqnarray} && \mathrm{Ap}: (\mathbf{C}⇒ \mathbf{D}) × \mathbf{C}→ \mathbf{D}\\ && \mathrm{Ap}\;(F, C) = F \;C \\ && \mathrm{Ap}\;(θ : F → G, f : C → D) : F \;C → G \;D \\ && \mathrm{Ap}\;(θ , f) = G \;f ∘ θ_C \\ && \\ && \mathrm{Hom}: \mathbf{C}^{\mathrm{op}} × \mathbf{C}→ \mathbf{Sets}\\ && \mathrm{Hom}\;(C, D) = \mathrm{Hom}(C, D) \\ && \mathrm{Hom}\;(f : C′ → C ∈ \mathbf{C}, g : D → D′ ∈ \mathbf{C}) : \mathrm{Hom}(C, D) → \mathrm{Hom}(C′, D′) \\ && \mathrm{Hom}\;(f , g) = λ (h : C → D) → g ∘ h ∘ f \\ && \\ && y : \mathbf{C}→ \mathbf{C}^{\mathrm{op}} ⇒ \mathbf{Sets}\\ && y \;D = \mathrm{Hom}(∙, D) \\ && y \;(f : D → D′) : \mathrm{Hom}(∙, D) → \mathrm{Hom}(∙, D′) \\ && (y \;f)_C = λ (h : C → D) → f ∘ h \end{eqnarray}\]
Now we can simplify the terms on both sides of the equation. They denote functions from \(\mathrm{Hom}(y \;C, F)\) to \(G \;D\), so we apply them to some natural transformation \(ι : y \;C → F\). On the left-hand side:
\[\begin{eqnarray} && ((\mathrm{Ap}∘ \mathrm{Swap}) \;(f, θ) ∘ i_{C, F}) \;ι \\ &=& (\mathrm{Ap}\;(θ, f) ∘ i_{C, F}) \;ι \\ &=& (G f ∘ θ_C ∘ i_{C, F}) \;ι \\ &=& (G f ∘ θ_C) \;(i_{C, F} \;ι) \\ &=& (G f ∘ θ_C) \;(ι_C \;\mathrm{id}_C) \\ &=& (G f ∘ θ_C ∘ ι_C) \;\mathrm{id}_C \\ &=& (G f ∘ (θ ∘ ι)_C) \;\mathrm{id}_C \\ &=& ((θ ∘ ι)_D ∘ y \;C \;f) \;\mathrm{id}_C \\ &=& ((θ ∘ ι)_D \;(y \;C \;f \;\mathrm{id}_C) \\ &=& ((θ ∘ ι)_D \;(id_C ∘ f)) \\ &=& (θ ∘ ι)_D \;f \end{eqnarray}\]
And the right-hand side:
\[\begin{eqnarray} && (i_{D, G} ∘ (\mathrm{Hom}∘ ⟨ y^{\mathrm{op}} × \mathrm{id}⟩) \;(f , θ)) \;ι \\ &=& (i_{D, G} ∘ (\mathrm{Hom}\;(y^{\mathrm{op}} \;f , θ)) \;ι \\ &=& i_{D, G} \;((\mathrm{Hom}\;(y^{\mathrm{op}} \;f , θ)) \;ι) \\ &=& i_{D, G} \;(θ ∘ ι ∘ y^{\mathrm{op}} \;f) \\ &=& i_{D, G} \;(θ ∘ ι ∘ y^{\mathrm{op}} \;f) \\ &=& (θ ∘ ι ∘ y^{\mathrm{op}} \;f)_D \;id_D \\ &=& (θ ∘ ι)_D \;((y^{\mathrm{op}} \;f)_D \;id_D) \\ &=& (θ ∘ ι)_D \;(f ∘ id_D) \\ &=& (θ ∘ ι)_D \;f \end{eqnarray}\]
The equation holds, so \(i\) is indeed natural. Naturality of \(j = i^{-1}\) already follows from this, and so we have a natural iso.
I won’t bother with any formalities regarding small and large categories. The reliance on these crude notions – compared to a proper tower of universes – strikes me as a weakness of the usual presentation of category theory, though I have to admit it mostly does the job.↩︎