\clearpage\null\newpage
Hidden: Macros and libraries
Chapter 1: The λ-calculus is a collection of systems formalizing the notion of functions as formulae, function abstraction and function composition. They can be seen as programming languages and as formal logics at the same time. We start by analyzing the characteristics that make the untyped lambda calculus (\S section-untypedcalculus) a proper programming language: the confluence property (known as the Church-Rosser theorem, \S section-confluence and \S section-churchrossertheorem), the existence of normal forms for terminating computations, and evaluation strategies that allow us to reliably find these normal forms when they exist. We also relate lambda calculus to combinatory logic with the SKI combinators (\S section-ski) and discuss its historical importance in the definition of Turing-completeness (\S section-turing).
There also exist typed variants of lambda calculus (\S sec-simplytypedlambda).
Simply typed lambda calculus, for instance, can include function types, a unit type,
a void type, pairs and unions (\S sec-simpletypes). We focus on the logical interpretation
of this system, called the Curry-Howard isomorphism (\S sec-curryhoward). Brouwer, Heyting and
Kolmogorov proposed in the 1930s a computational interpretation of intuitionistic
logic, a logic that does not assume the law of excluded middle.
Chapter 2: We have developed Mikrokosmos, an untyped and simply typed λ-calculus interpreter written in the purely functional programming language Haskell cite:hudak07_haskell (\S sec-haskelllang). It aims to provide students with a tool to learn and understand λ-calculus and the relation between logic and types. We show how to program and prove statements of intuitionistic propositional logic with it.
We discuss why Haskell and the use of monadic parser libraries are an adequate choice for this project (\S sec-monadicparsers). Our implementation relies on De Bruijn indices to represent variable scope and bindings internally (\S sec-debruijnindexes). We need to implement substitution (\S sec-substitution), evaluation (\S section-evaluation) and type inference (\S sec-typeinference); but all these algorithms can be directly derived from the theoretical discussion in Chapter 1. Our programming environment makes use of version control with Git, test suites with the Tasty testing framework (\S section-testing) and continuous integration with TravisCI (\S sec-versioncontrolci). We also rely on packaging tools that allow us to distribute the software as a library and as an executable on the appropriate channels (\S section-installation). A Jupyter Notebook integration is provided (\S section-jupyterkernel); it has been used to provide students with interactive tutorials on lambda calculus (\S section-jupyterhub). Along with the command-line interpreter, we also include a mechanism that allows the interpreter to be called from Javascript and embedded into a web page (\S section-javascript).
At the end of this chapter, we describe how to program in the untyped
lambda calculus (\S sec-programming-untyped), encoding complex data structures with a technique due
to Church and learning how to use fixpoint operators to achieve
unbounded recursion. On the other hand, programming in the
simply-typed lambda calculus (\S sec-programming-typed) more closely resembles proving statements of
intuitionistic logic with a proof assistant. We show how each program
is, itself, a proof of a proposition in intuitionistic logic; its type can
be seen as the proposition it proves; and evaluation corresponds to proof simplification.
Chapter 3: Categories (\S section-categories) will be the framework we use to study the fundamental notion of function composition inside mathematics. They provide a fruitful language to talk about theories, mathematics and logic. Notions such as naturality, universality or duality can be given an explicit formal meaning with categories, and we have examples of them all around mathematics. Functors (\S sec-functorsnatural) can be seen as homomorphisms that preserve the structure of a category, and we can base many different useful mathematical constructions on them, with the added benefit that they can be formulated in a very abstract setting (\S section-constructions).
The Yoneda lemma (\S section-universality) is one of the foundational results in category theory; intuitively, it says that every object in a category is determined by how it relates to the other objects of the category. This tool will be necessary when constructing categorical models of algebraic theories in Chapter 4. We also define limits and colimits, which allow for more expressive constructions inside categories; in particular, fibered products and coproducts (pullbacks and pushouts) can be expressed this way, and they will also be useful when interpreting syntactic substitution inside categories.
Finally, we present monads, algebras and adjoint functors (\S section-adjoints-monads). Algebras are useful to encode induction in a purely categorical way. Adjoint functors, on the other hand, will be one of the key insights we will use to understand and expand the connection between computation and logic we have described so far. This section is primarily based on cite:maclane78.
Chapter 4: The internal logic of categories is studied. We start by defining presheaf categories (\S section-presheaves). Presheaves can be thought of as a kind of generalized sets: pairs of sets, graphs, or sets depending on a real variable can all be seen as presheaves. As we will see in this same chapter, presheaf categories are examples of categories having all the structure we are interested in. Lawvere’s algebraic theories show that it is possible to give a description of algebraic structures in terms of categories (\S sec-lawveretheories); moreover, this description is independent of the particular axiomatization of the structure we are using. Models for these theories in any mathematical structure can be seen as functors from the category representing the theory.
In particular, cartesian closed categories (\S sec-cartesianclosedlambdacalculus) can be used to model simply typed lambda calculus. Each lambda calculus theory is the internal logical language of a cartesian closed category. Bicartesian categories correspond to intuitionistic logic, and we can model inductive types as algebras over certain functors. Using this correspondence, we can encode theorems and proofs about cartesian closed categories using lambda calculus. As an example, we prove a theorem by Lawvere on fixed points and diagonal arguments (\S section-diagonalarguments). We can then interpret this theorem in many categories to obtain many known results by Cantor, Russell, Tarski and Gödel as corollaries.
We extend this structure by asking the same structure to exist on each
slice of the category, thus defining locally cartesian closed categories
(\S sec-locallydependent).
The existential and universal quantifiers can be interpreted with this new
structure, which also provides a framework for higher-order logics.
Dependent type theory arises as the natural language for these richer
categories, and it carries a natural computational interpretation. In
particular, equality (\S sec-equalitytypes) can be encoded using the J-eliminator, a rule
directly derived from a categorical construction that says that, if
Under this interpretation, categories, logic and computation can be seen as multiple aspects of a single central notion. All the constructions and definitions we have been discussing until this moment nudge us towards considering topoi, a particular kind of category that is itself a model of constructive mathematics (\S section-topoi). In other words, each topos is a different universe of mathematics. In particular, we study Lawvere’s Elementary Theory of the Category of Sets and how it recovers classical set-based mathematics. This section is primarily based on the work of cite:martinlof75 and cite:seely84.
Chapter 5: As we have discussed in previous chapters, the internal logic of categories with enough structure is expressive enough to allow us to develop mathematical theories inside them. Moreover, these theories can be expressed in a programming language with a strong dependent type system (\S sec-mltt). Thus, type theory provides both a foundation of programming language theory and a foundation of mathematics. It is then interesting to consider how to embed Lawvere’s axioms for the category of sets and some classical principles inside this constructive logic, thereby showing that a set-theoretical foundation can be seen as a particular case. The proof we present of Diaconescu’s theorem is an example of the use of this embedding; in particular, we discover that the law of excluded middle and classical mathematics follow from accepting the Axiom of Choice. We also study a first result of Homotopy Type Theory (\S sec-hott), the theory we obtain when we consider the higher dimensional structure of the type of equality, and we relate it to Grothendieck’s homotopy hypothesis using Voevodsky’s Univalence Axiom. This section is mainly based on cite:hottbook.
Two libraries of computer-verified mathematics have been developed (\S sec-verified). The first one implements the positive real numbers as Dedekind cuts and provides a proof of Diaconescu’s theorem; more than a hundred computer-verified lemmas and propositions about natural numbers and dyadic rationals have been needed for this construction. The second one accepts Voevodsky’s Univalence Axiom and proves that the fundamental group of the circle is given by the integers.
Keywords: category theory, lambda calculus, categorical semantics, type theory, constructivism.
Chapter 1: We begin by studying Church's lambda calculus as a
programming language. We give a formal definition of its syntactic
aspects and present proofs of the properties that make it
implementable as a programming language; examples are the confluence
property (the Church-Rosser theorem) and the existence of reduction
strategies that are optimal from the perspective of finding a normal
form. We intend to use these results in practice to later design the
interpreter of a language based on lambda calculus. We briefly mention
the equivalence with the logic of SKI combinators and how, when we
discuss the implementation of the minimization operator of the
There exist typed variants of lambda calculus, more oriented towards
logic, which solve the normalization problems of the untyped
versions. We discuss the difference between Curry-style and
Church-style typing and how the former will provide a simply typed
language with a form of implicit polymorphism; that is, we will be
able to have expressions whose type stays as general as possible until
the very moment of instantiating them. We then pay particular
attention to the relation of these typed variants of lambda calculus
with intuitionistic propositional logic as represented in Gentzen's
natural deduction. Intuitionistic logic is the basis of the mathematics
of Brouwer or Bishop, and it is characterized by not assuming the law
of excluded middle.
We finish by reviewing the literature on possible extensions of the
type systems of lambda calculus with the
Chapter 2: We design an interpreter for two programming languages, based respectively on the untyped lambda calculus and the simply typed lambda calculus. The implementation internally uses De Bruijn indices, a formal notation for representing variables, and applies evaluation and type inference algorithms directly derived from our study in the previous chapter. We choose the pure functional language Haskell as the development language and analyze this choice. We use monadic parsing libraries and an environment of integration and control tools aimed at guaranteeing correctness and, finally, at distributing the software package through the language's own channels. This interpreter will allow us to explore programming techniques in lambda calculus and apply them to the formal verification of proofs in intuitionistic logic. Utilities are also included to integrate it into the Jupyter Notebook interactive computing environment, as well as a transpilation and adaptation to Javascript that makes it suitable for teaching and for being embedded into web pages. Following this idea, links to interactive tutorials on lambda calculus are provided.
In the section on programming in lambda calculus we highlight the possibility of encoding complex data structures with the Church encoding, and the power of the fixed-point operator, which is what allows us to write recursive definitions that are not bounded a priori. In the section on typed lambda calculus we write expressions that are at the same time proofs of theorems of intuitionistic logic and typed programs.
Chapter 3: Category theory provides an abstraction of the concept of composition in mathematics and a formalization of notions such as naturality, duality or universality. Essentially, a category is given by objects and composable arrows or morphisms between them, being a generalization of the idea of a groupoid. We have examples of categories in a wide variety of areas of mathematics, and we can study functors as the homomorphisms that respect their structure; in turn, these functors serve as translations between those theories. We carry out several constructions that justify speaking of the naturality and universality of mathematical structures. The Yoneda lemma is a foundational result of category theory stating, intuitively, that objects are determined by their morphisms; several developments in the following chapters will be based on it. We also define limits and colimits from the notion of universality, which will allow us to construct generalizations of the product and the coproduct; we are especially interested in the case of fibered products, which will later correspond, in type theory, to syntactic substitution.
We finish by working with adjunctions, using Lawvere's ideas regarding their relation to logical inferences. All these developments serve as a theoretical foundation for the following chapters. For example, algebras over functors will model recursion in programming languages, monads have already served us to model parsing in Haskell, and adjunctions are the basis of categorical logic.
Chapter 4: We begin by defining presheaf categories, which will be natural models for all the subsequent discussion; this is a class of categories that generalizes the category of sets and provides non-standard models of the same constructions we would study on sets. The perspective of category theory, and especially the notion of adjunctions as a foundational element, first allows us to give a formalization of the structure of algebraic theories, which shows the ideas we will use to endow other syntactic structures with semantics. In this theory, we can give several axiomatizations of a theory, such as the theory of groups, and obtain from all of them a single category whose functors each represent a model of the theory.
We use this same idea to formalize the logical structure of lambda calculus as the internal language of cartesian closed categories. Using this correspondence, we can recreate proofs about cartesian closed categories using lambda calculus expressions. In particular, we apply this technique to a theorem by William Lawvere (1969) on fixed points in cartesian categories. Its particularization gives us as corollaries proofs of results by Cantor, Russell, Tarski and Gödel, and it synthesizes the idea common to all proofs by diagonalization. Bicartesian categories are a first extension that provides a structure of intuitionistic logic internal to the category, and we check how the existence of initial algebras in a category allows us to use it as a model of the variants of lambda calculus that contain inductive types (such as the natural numbers).
Once this correspondence has been established, we seek to extend it
towards a higher-order logic with quantifiers that allows us to
express more complex mathematical notions. We enrich the structure by
taking locally cartesian closed categories as models and drawing from
them the notion of dependent types. The structure of a locally
cartesian closed category allows substitutions inside types and the
expression of types that act as propositions about elements and other
types. These dependent types provide us with a logical framework that
also preserves a natural computational interpretation. We thus have a
programming language whose types can be propositions and whose
elements can be proofs in a formal system. The existential and
universal quantifiers appear as particular cases of adjunctions.
Equality can be encoded using the J-eliminator, which allows proving
a proposition on two elements,
Categories, logic and computation appear under this interpretation as three manifestations of the same notion. We finish by sketching how all the constructions so far naturally lead us to the algebraic structure of a topos, a category that can contain complete formulations of mathematical theories in its internal logic. Lawvere's set theory is the example we propose for this correspondence.
Chapter 5: As we have detailed up to this point, the internal logic of
certain categories is rich enough to allow us to develop mathematical
theories which, moreover, can be expressed in a programming language
with a sufficiently strong type system, so that their compilation
serves as a check of the correctness of the proof. To demonstrate the
potential of this perspective, we use the Agda programming language
and detail the explicit correspondences between types and categories.
Lawvere's set theory, together with Diaconescu's theorem, allows us to
recover classical mathematics as a particular case; however, we remain
in the most general possible setting in order to develop a definition
of the positive reals by Dedekind cuts in Martin-Löf type theory. As a
side result, some proofs about certain instances of the excluded
middle in this structure become programs that compute reals to
arbitrary precision. We also propose an example of synthetic topology,
using a recently discovered correspondence between the structure of
types, the
Our main references are Selinger's exposition of lambda calculus cite:selinger13 in the first part, the category theory texts by MacLane cite:maclane78 and Awodey cite:awodey10 in the second, and the foundational book of homotopy type theory cite:hottbook in the final part. The nLab project cite:nlab has been remarkably useful for locating the relevant bibliography across a priori disparate topics.
Keywords: categories, lambda calculus, categorical logic, type theory, constructivism.
\centering \raggedleft What is the primary tool for such summing up of
the essence of ongoing mathematics? Algebra! \\ Nodal points in the progress of this kind of \\ research occur when, as in the case with the \\ finite number of axioms for the metacategory \\ of categories, all that we know so far can be \\ expressed in a single sort of algebra. – F. William Lawvere
\centering \raggedleft Mathematics has serious cultural issues.
Too much fame for climbing mountains; \\ not enough for flattening them. – Conor McBride
Assuming
Some years later, in 1945, the notion of category originated in Eilenberg and Mac Lane’s work on cohomology theory cite:eilenberg45. Nowadays, functors, natural transformations and universal properties are used in all the different branches of mathematics. While we generally still prefer to rely on set-theoretical foundations, many axiomatizations of mathematics based on category theory have been proposed (see, for instance, cite:lawvere64). It was observed that any category with enough structure is able to interpret most of mathematics in an internal logic; and remarkably, this internal logic does not in general satisfy the law of excluded middle.
How does all of this relate to programming? Suppose we want to design a programming language. A possible path is to follow the functional paradigm, which revolves around composition and inductive data structures. We would borrow ideas from λ-calculus, a collection of formal systems Alonzo Church proposed in the 1930s cite:church36. They can be seen as models of computation, and many modern programming languages are heavily influenced by them cite:hudak07_haskell, especially in the typed variants of the calculus. But then, we have the realizability interpretation of logic without excluded middle by Heyting and Kolmogorov (see cite:troelstra2014), which regards proofs as programs and propositions as types. Under this interpretation, the simply typed lambda calculus is precisely a language for constructive logic, one that loses its computational nature when excluded middle is added.
Categories are the framework in which we can make sense of this. A cartesian closed category is a category that has exponential objects (whose elements can be regarded as functions) and all finite products. These two constructions are particular cases of adjunctions, a purely categorical description of how certain transformations in mathematics relate (see cite:lawvere1969adjointness). It is easy to see how these adjunctions translate into computations in lambda calculus, thus encoding a language for logic and programming inside categories.
The logic we first obtain this way is a propositional logic. This
identification suffices to prove some theorems that have many known
results as corollaries, such as Lawvere’s diagonal theorem
(cite:lawvere06): a surjective morphism
Finally, Martin-Löf type theory (see cite:martinlof75 and
cite:nordstrom90) helps to synthesize all these ideas into a formal system that
is an active subject of research (cite:kraus2016notions, cite:rabe2011kripke
or cite:booij2017parametricity are examples). This is the environment
in which Voevodsky’s Univalence Axiom gives rise to Homotopy Type Theory
(see cite:hottbook), a recent development relating type theory and topological constructions.
As an example, we will prove that the fundamental group of the circle is the integers.
Our first objective is to build an interpreter for untyped and simply typed lambda calculus. The results in the first part of this text provide solid theoretical grounds for this purpose. We want the interpreter to be a didactic tool, useful for experimentation and with a clear theoretical basis in which to interpret the constructions and programs we can write in it. Our second main objective is to understand how dependent type theory works internally and to develop libraries of computer-verified mathematics, proving simple theorems, undertaking the project of constructing the real numbers from scratch, and formulating homotopy types within this structure.
<<sec-untypedlambda>>
When are two functions equal? Classically in mathematics, functions are graphs.
A function from a domain to a codomain,
From a computational point of view, this perspective could seem incomplete in some cases; we usually care not only about the result but, crucially, about how it can be computed. Classically in computer science, functions are formulae; and two functions mapping equal inputs to equal outputs need not be equal. For instance, two sorting algorithms can have different efficiency or different memory requirements, even if they output the same sorted list. This vision, where two functions are equal if and only if they are given by essentially the same formula, is called /intensional/ (see cite:selinger13, where this difference is detailed).
The λ-calculus is a collection of formal systems, all of them based on the lambda notation introduced by Alonzo Church in the 1930s while trying to develop a foundational notion of functions as formulae in mathematics. It is a logical theory of functions, where application and abstraction are primitive notions; at the same time, it is one of the simplest programming languages, on which many full-fledged languages are based.
The untyped or pure λ-calculus is syntactically the
simplest of these formal systems. In it, a function needs neither a
domain nor a codomain; every function is a formula that can be
directly applied to any expression. It even allows functions to be
applied to themselves, a notion that would be troublesome
in our usual set-theoretical foundations. In particular, if
This presentation of the untyped lambda calculus will follow cite:Hindley08 and cite:selinger13.
<<section-untypedcalculus>>
As a formal language, the untyped λ-calculus is given by a set of equations between expressions called λ-terms; equivalences between them can be computed using some manipulation rules. These λ-terms can stand for functions or arguments indistinctly: they all use the same λ-notation in order to define function abstractions and applications.
This λ-notation allows a function to be written and inlined as any other element
of the language, identifying it with the formula it determines and
admitting a more succinct representation. For example, the polynomial function
The use of λ-notation also eases the writing of
higher-order functions, functions whose arguments or outputs are
functions themselves. For instance,
By convention, we omit outermost parentheses and assume
left-associativity. For example, $M\ N\ P$ stands for $((M\ N)\ P)$.
<<sec-freeboundvars>>
In λ-calculus, the scope of a variable is restricted to the λ-abstraction
where it appears, if any. Thus, the same variable name can be used multiple
times in the same term independently. For example, in
Evaluation in λ-calculus relies on the notion of /substitution/.
Any free occurrence of a variable can be substituted by a term, as we do
when evaluating function applications. For instance, in the previous example, we
can evaluate
To avoid this, the bound variable must be renamed before the substitution is performed.
We could define a criterion for choosing exactly what this new variable should be, or simply accept that this procedure is only well-defined up to a change in the names of the variables. This equivalence relation between terms with the same structure but different variable names is defined formally in the next section. In practice, it is common to follow Barendregt’s variable convention, which simply assumes that bound variables have been renamed to be distinct.
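As an illustration of how such a capture-avoiding substitution could be implemented, the following Haskell sketch (with hypothetical names; the interpreter of Chapter 2 avoids the renaming problem altogether by using De Bruijn indices) renames the bound variable whenever it would capture a free variable of the substituted term.
import Data.List ((\\))

-- Named λ-terms: variables, applications and λ-abstractions.
data Term = Var String | App Term Term | Lam String Term

-- Free variables of a term.
free :: Term -> [String]
free (Var x)   = [x]
free (App m n) = free m ++ free n
free (Lam x m) = free m \\ [x]

-- subst x s m computes m[s/x], renaming the bound variable
-- whenever it would capture a free variable of s.
subst :: String -> Term -> Term -> Term
subst x s (Var y)   = if x == y then s else Var y
subst x s (App m n) = App (subst x s m) (subst x s n)
subst x s (Lam y m)
  | y == x          = Lam y m  -- x is shadowed; nothing to substitute
  | y `elem` free s = Lam y' (subst x s (subst y (Var y') m))
  | otherwise       = Lam y (subst x s m)
  where y' = head [v | i <- [1 ..] :: [Int], let v = y ++ show i
                     , v `notElem` free s ++ free m, v /= x]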
Variables are only placeholders; their names, as we have
seen before, are not relevant. Two λ-terms whose only difference is
the naming of their variables are called α-equivalent. For example,
The relation of α-equivalence formally captures the fact that the name of a bound variable can be changed without changing the meaning of the term. This idea appears repeatedly in mathematics; for example, the renaming of the variable of integration or of the variable in a limit are instances of α-equivalence. \[ ∫_0^1 x^2\ dx = ∫_0^1 y^2\ dy; \qquad \lim_{x → ∞} \frac{1}{x} = \lim_{y → ∞} \frac{1}{y}. \]
The core notion of evaluation in λ-calculus is captured by the idea
of β-reduction. Until now, evaluation has been only informally
described; it is time to define it as a relation, $\tto_β$, going from the
initial term to any of its partial evaluations. We
consider first a one-step reduction relation, called
$→_β$.
Ideally, we would like to define evaluation as a series of reductions into a canonical form which could not be further reduced. Unfortunately, as we will see later, it is not possible to find that canonical form in general.
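Continuing the hypothetical sketch above, a one-step reducer makes the relation $→_β$ computable; iterating it until it returns Nothing is one (not necessarily terminating) way of searching for such a canonical form.
-- One-step β-reduction, contracting the leftmost-outermost redex;
-- Nothing means the term cannot be further reduced.
beta :: Term -> Maybe Term
beta (App (Lam x m) n) = Just (subst x n m)
beta (App m n)         = case beta m of
                           Just m' -> Just (App m' n)
                           Nothing -> App m <$> beta n
beta (Lam x m)         = Lam x <$> beta m
beta (Var _)           = Nothing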
Although we lost the extensional view of functions when we decided to
adopt the functions-as-formulae perspective, some notion of
function extensionality in λ-calculus can be partially recovered
through the notion of η-reduction: any term that simply applies a function to the
argument it takes can be reduced to that function. That is, given any term
<<section-confluence>> It is not possible in general to evaluate a λ-term into a canonical, non-reducible term. We discuss many examples of this phenomenon in the following sections. However, we will be able to prove that, in the cases where it exists, it is unique. This property is a consequence of a slightly more general one called /confluence/, which can be defined in any abstract rewriting system.
Given any binary relation
- the confluence property (also called the Church-Rosser property) we have just defined;
- the quasidiamond property, similar to the confluence property but assuming $M → N$ and $M → P$ instead of the weaker hypotheses $M \tto N$ and $M \tto P$;
- and the diamond property, which is defined by substituting $\tto$ for $→$ in the definition of confluence.
The three properties can be represented respectively as follows
(left: confluence, center: quasidiamond property, right: diamond property).
\[\begin{tikzcd}[column sep=small]
& M \drar[two heads]\dlar[two heads] &&& M \drar\dlar &&& M \drar\dlar & \\
N \drar[dashed,two heads] && P \dlar[dashed,two heads] &
N \drar[dashed,two heads] && P \dlar[dashed,two heads] &
N \drar[dashed] && P \dlar[dashed] \\
& Z &&& Z &&& Z &
\end{tikzcd}\]
We can show that the diamond property implies confluence, while the
quasidiamond property does not. In fact, the following figure provides
a relation satisfying the quasidiamond property but not the
confluence property (example from cite:selinger13). If we want to prove
confluence for a given relation, we must use the diamond property instead
of the quasidiamond property.
\[\begin{tikzcd}[column sep=tiny, row sep=tiny]
& & • \drar\dlar & & & \\
& • \dlar & & • \dlar\drar & & \\
• \drar & & • \drar\dlar & & • \drar & \\
& • \dlar & & • \drar\dlar & & • \dlar \\
• \drar & & • \drar\dlar & & • \drar & \\
& • \dlar & & • \drar\dlar & & • \dlar \\
\makebox[0pt][l]{…}\phantom{•} & & \makebox[0pt][l]{…}\phantom{•} & & \makebox[0pt][l]{…}\phantom{•} &
\end{tikzcd}\]
The statement of $\tto_β$ and $\tto_{βη}$ being confluent is what we call the /Church-Rosser Theorem/. The definition of a relation satisfying the diamond property and whose reflexive transitive closure is $\tto_{βη}$ will be the core of our proof.
<<section-churchrossertheorem>> The proof presented here is due to Tait and Per Martin-Löf; an earlier but more convoluted proof was discovered by Alonzo Church and Barkley Rosser in 1935 (see cite:barendregt84 and cite:pollack95). It is based on the idea of parallel one-step reduction.
In order to prove that this newly defined relation has the diamond property, we will define a reduction of a term with the property that it can be reached from any of its parallel one-step reductions. We first prove a lemma on substitution that will handle later the more challenging cases of the proof.
Once the Church-Rosser theorem is proved, we can formally define the notion of a normal form as a completely reduced lambda term.
Fully evaluating a λ-term means iteratively applying reductions to it until a normal form is reached. We know, by virtue of Theorem theorem-churchrosser, that if a normal form for a particular term exists, then it is unique; but we do not know whether a normal form actually exists. We say that a term has a normal form if it can be reduced to a normal form.
A consequence of Theorem theorem-churchrosser is that a weakly normalizing
term has a unique normal form. Strong normalization implies weak normalization,
but the converse is not true; as an example, the term
We would like to find a β-reduction strategy such that, if a term has a normal form, it can be found by following that strategy. Our basic result will be the standardization theorem, which shows that, if a β-reduction to a normal form exists, then a sequence of β-reductions from left to right on the λ-expression will be able to find it. From this result, we will be able to prove that the reduction strategy that always reduces the leftmost β-abstraction will always find a normal form if it exists. This section follows cite:kashima00, cite:barendsen94 and cite:barendregt84.
We will prove that every term that can be reduced to a normal form can be reduced to it using a standard sequence. This will imply the existence of an optimal beta reduction strategy that will always reach a normal form if one exists.
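A standard example of why the choice of strategy matters is the term $(λ x.y)\ Ω$, where $Ω = (λ x.x\ x)(λ x.x\ x)$ is the archetypal non-terminating term: the leftmost reduction finds its normal form in one step, while reducing the argument first never terminates.
\[ (λ x.y)\ Ω →_β y, \qquad (λ x.y)\ Ω →_β (λ x.y)\ Ω →_β ⋯ \]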
<<section-ski>>
As we have seen in previous sections, untyped λ-calculus is already a syntactically very simple system; but it can be further reduced to a few λ-terms without losing its expressiveness. In particular, untyped λ-calculus can be essentially recovered from only two of its terms; these are
- $S = λ x.λ y.λ z. xz(yz)$, and
- $K = λ x.λ y.x$.
A language can be defined with these combinators and function
application. Every λ-term can be translated to this language and recovered up
to $=_{βη}$ equivalence. For example, the identity λ-term,
It is common to also add the
The language of SKI-terms can be defined by the Backus-Naur form
\[
\mathsf{SKI} ::= x \mid (\mathsf{SKI}\ \mathsf{SKI}) \mid S \mid K \mid I,
\]
where
In general this translation is not an isomorphism. For instance,
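The translation into combinators can be sketched with the standard bracket-abstraction algorithm; the following Haskell fragment (hypothetical names, and without the usual optimizations) eliminates a variable from a combinator term.
-- Combinator terms: variables, S, K, I and application.
infixl 9 :@
data SKI = V String | S | K | I | SKI :@ SKI deriving Show

-- bracket x t builds a term without free occurrences of x such
-- that (bracket x t) :@ u reduces as t would with u in place of x.
bracket :: String -> SKI -> SKI
bracket x (V y) | x == y = I
bracket x (m :@ n)       = S :@ bracket x m :@ bracket x n
bracket _ t              = K :@ t  -- x does not occur in t

-- For example, bracket "x" (V "x" :@ V "x") == S :@ I :@ I,
-- a translation of the self-application λ x.xx.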
<<section-turing>>
Three different notions of computability were proposed in the 1930s:
- the general recursive functions were defined by Herbrand and Gödel; they form a class of functions over the natural numbers closed under composition, recursion and unbounded search;
- the λ-definable functions were proposed by Church; they are functions on the natural numbers that can be represented by λ-terms;
- the Turing computable functions, proposed by Alan Turing as the functions that can be computed using Turing machines, theoretical models of a machine.
In cite:church36 and cite:turing37, Church and Turing proved the equivalence of the three definitions. This led to the metatheoretical /Church-Turing thesis/, which postulated the equivalence between these models of computation and the intuitive notion of effective calculability mathematicians were using. In practice, this means that the λ-calculus, as a programming language, is as expressive as Turing machines; it can define every computable function. It is Turing-complete.
A complete implementation of untyped λ-calculus is discussed in Chapter Implementation of λ-expressions; and a detailed description of how to use the untyped λ-calculus as a programming language is given in Chapter Programming in untyped λ-calculus. General recursive functions, for example, can be encoded using these techniques, thus proving that it is in fact Turing-complete (see sec-fixed-points). Note that the lambda calculus even has a cost model allowing us to develop complexity theory within it (see cite:dallago08); this is, however, beyond the scope of this text.
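For instance, unbounded recursion is recovered through fixed-point combinators such as Curry's
\[ Y = λ f.(λ x.f\ (x\ x))\ (λ x.f\ (x\ x)), \]
which satisfies $Y\ f =_β f\ (Y\ f)$ for every term $f$.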
<<sec-simplytypedlambda>>
/Types/ were introduced in mathematics as a response to Russell’s paradox, found in the first naive axiomatizations of set theory (see Corollary cor-russellparadox). An attempt by Church to use untyped λ-calculus as a foundational logical system suffered from a variant called the Kleene-Rosser paradox, and types were a method to avoid it, as detailed in cite:kleene35 and cite:curry46. Once types are added to the calculus, a deep connection between λ-calculus and logic arises. This connection will be discussed in Section sec-curryhoward.
In programming languages, types indicate how the programmer intends to use the data, prevent errors and enforce certain invariants and levels of abstraction in programs. The role of types in λ-calculus when interpreted as a programming language closely matches the usual notion, and typed λ-calculus has been the basis of many modern type systems for programming languages.
Simply typed λ-calculus is a refinement of the untyped
λ-calculus. In it, each term has a type that limits how it can
be combined with other terms. Only a set of basic types and function
types between any two types are considered in this system. Whereas
functions in untyped λ-calculus can be applied to any term,
once we introduce types, a function of type
We now present an account of simply typed λ-calculus based
on cite:Hindley08. Our description will rely only on the arrow type constructor
<<sec-simpletypes>> We start by assuming a set of basic types. Those basic types would correspond, in a programming language interpretation, to the fundamental types of the language, such as strings or integers. Minimal presentations of λ-calculus tend to use only one basic type.
We define the terms of simply typed λ-calculus using
the same constructors we used in the untyped version. The set of
typed lambda terms is given by the following Backus-Naur form.
\[\mathsf{Term} ::=
x \mid
\mathsf{Term}\ \mathsf{Term} \mid
λ x{:}\mathsf{Type}.\ \mathsf{Term}.
\]
The main difference here with Definition def-lambdaterms is
that every bound variable has a type, and therefore, every λ-abstraction
of the form
However, the set of raw typed λ-terms contains some meaningless terms
under this type interpretation, such as
Every typing rule assumes a typing context, usually denoted by
- The $(var)$ rule simply makes explicit the type of a variable from the context. That is, a context that assumes that $x : A$ can be written as $Γ, x:A$; and we can trivially deduce from it that $x:A$. \begin{prooftree} \RightLabel{($var$)} \AXC{} \UIC{$Γ, x:A \vdash x:A$} \end{prooftree}
- The $(abs)$ rule declares that the type of a λ-abstraction is the type of functions from the variable type to the result type. If a term $m:B$ can be built from the assumption that $x:A$, then $λ x{:}A.\ m : A → B$. It acts as an introduction rule for function terms. \begin{prooftree} \RightLabel{$(abs)$} \AXC{$Γ, x:A \vdash m : B$} \UIC{$Γ \vdash λ x.m : A → B$} \end{prooftree}
- The $(app)$ rule declares the type of a well-typed application. A term $f : A → B$ applied to a term $a : A$ gives a term $f\ a : B$. It acts as an elimination rule for function terms. \begin{prooftree} \RightLabel{$(app)$} \AXC{$Γ \vdash f : A → B$} \AXC{$Γ \vdash a : A$} \BIC{$Γ \vdash f\ a : B$} \end{prooftree}
A term
It can be seen that the typing derivation of a term encodes the complete λ-term: if we were to derive the term bottom-up, there would be only one possible choice at each step of which rule to use. In Section Unification and type inference we will discuss a type inference algorithm that determines whether a term is typeable and what its type should be, and we will make this intuition precise.
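As a small example, the following derivation (with arbitrary basic types in the place of $A$ and $B$) types the term $λ x.λ y.x$; note how each step is forced by the shape of the term. \begin{prooftree} \RightLabel{($var$)} \AXC{} \UIC{$x:A, y:B \vdash x : A$} \RightLabel{$(abs)$} \UIC{$x:A \vdash λ y.x : B → A$} \RightLabel{$(abs)$} \UIC{$\vdash λ x.λ y.x : A → (B → A)$} \end{prooftree}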
<<sec-currystyle>>
Two different approaches to typing in λ-calculus are commonly used.
- Church-style typing, also known as explicit typing, originated from the work of Alonzo Church in cite:church40, where he described a simply-typed lambda calculus with two basic types. The term’s type is defined as an intrinsic property of the term; and the same term has to be always interpreted with the same type.
- Curry-style typing, also known as implicit typing, creates a formalism where every single term can be given an infinite number of possible types. This technique is called polymorphism when it is a formal part of the language; but here, it is only used to allow us to build intermediate terms without having to directly specify their type.
As an example, we can consider the identity term
Assuming a countably infinite set of type variables, we define
type templates as inductively generated by
We consider a type to be more general than another if the latter can be obtained by
applying a substitution to the former. In this case, the latter is called an instance
of the former. For example,
The unification of two type templates is the construction of two substitutions making them equal as type templates; that is, the construction of a type that is a particular instance of both at the same time. We will not only aim for a unifier but for the most general one between them, the universal one.
A substitution
Using unification, we can write an algorithm inferring types.
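The heart of such an algorithm is the computation of most general unifiers. The following Haskell fragment is a minimal sketch of that step, assuming a single basic type $ι$ and representing substitutions as association lists (all names are hypothetical).
infixr 9 :->
-- Type templates: type variables, the basic type ι and arrow types.
data Ty = TVar String | Iota | Ty :-> Ty deriving (Eq, Show)

type Subst = [(String, Ty)]

-- Apply a substitution to a type template.
apply :: Subst -> Ty -> Ty
apply s (TVar a)  = maybe (TVar a) id (lookup a s)
apply _ Iota      = Iota
apply s (t :-> u) = apply s t :-> apply s u

-- Most general unifier of two type templates, when it exists.
unify :: Ty -> Ty -> Maybe Subst
unify (TVar a) t = bindVar a t
unify t (TVar a) = bindVar a t
unify Iota Iota  = Just []
unify (t1 :-> u1) (t2 :-> u2) = do
  s1 <- unify t1 t2
  s2 <- unify (apply s1 u1) (apply s1 u2)
  return (compose s2 s1)
unify _ _ = Nothing

-- The occurs check rules out infinite types such as a = a → a.
bindVar :: String -> Ty -> Maybe Subst
bindVar a t | t == TVar a = Just []
            | occurs a t  = Nothing
            | otherwise   = Just [(a, t)]
  where occurs b (TVar c)  = b == c
        occurs b (u :-> v) = occurs b u || occurs b v
        occurs _ Iota      = False

-- Composition: applying (compose s2 s1) is applying s1, then s2.
compose :: Subst -> Subst -> Subst
compose s2 s1 = [(a, apply s2 t) | (a, t) <- s1] ++ s2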
A crucial property is that type inference and β-reductions do not interfere with each other. A term can be β-reduced without changing its type.
We have seen previously that the term
The notion of /reducibility/ is an abstract concept originally
defined by Tait in cite:tait67 which we will use to ease this
proof. It should not be confused with the notion of β-reduction.
We inductively define the set
- If $t : T$ where $T$ is a basic type, $t ∈ \redu{T}$ if $t$ is strongly normalizable.
- If $t : U → V$, an arrow type, $t ∈ \redu{U → V}$ if $t\ u ∈ \redu{V}$ for all $u ∈ \redu{U}$.
We prove three properties of reducibility at the same time in order to use mutual induction.
A final lemma is needed before the proof of the Strong Normalization Theorem. It is a generalization of the main theorem, useful because of the stronger induction hypothesis it provides.
Every term normalizes in simply typed λ-calculus and every computation ends; therefore, simply typed λ-calculus cannot be Turing-complete.
<<sec-curryhoward>>
<<sec-extendingstlc>> We will now add special syntax for some terms and types, such as pairs, unions and unit types. This new syntax will make our λ-calculus more expressive, but the unification and type inference algorithms will continue to work in a similar way. The previous proofs and algorithms can be extended to cover all the new cases.
The new set of simple types is given by the following Backus-Naur form,
\[\mathsf{Type} ::= ι \mid
\mathsf{Type} → \mathsf{Type} \mid
\mathsf{Type} × \mathsf{Type} \mid
\mathsf{Type} + \mathsf{Type} \mid
1 \mid
0,\]
where
The new set of typed lambda terms is given by the following Backus-Naur form. \[\begin{aligned} \mathsf{Term} ::=\ & x \mid \mathsf{Term}\ \mathsf{Term} \mid λ x. \mathsf{Term} \mid \\ & \left\langle \mathsf{Term},\mathsf{Term} \right\rangle \mid π_1\ \mathsf{Term} \mid π_2\ \mathsf{Term} \mid \\ & \textrm{inl}\ \mathsf{Term} \mid \textrm{inr}\ \mathsf{Term} \mid \textrm{case}\ \mathsf{Term}\ \textrm{of}\ \mathsf{Term};\ \mathsf{Term} \mid \\ & \textrm{abort}\ \mathsf{Term} \mid ∗. \end{aligned}\]
And the use of these new terms is formalized by the following extended set of typing rules.
- The $(var)$ rule simply makes explicit the type of a variable from the context. \begin{prooftree} \LeftLabel{($var$)} \AXC{} \UIC{$Γ, \red{x}:\blue{A} \vdash \red{x}:\blue{A}$} \end{prooftree}
- The $(abs)$ and $(app)$ rules construct and apply function terms. \begin{prooftree} \LeftLabel{$(abs)$} \AXC{$Γ, \red{x}:\blue{A} \vdash \red{m} : \blue{B}$} \UIC{$Γ \vdash \red{λ x.m} : \blue{A → B}$} \LeftLabel{$(app)$} \AXC{$Γ \vdash \red{f} : \blue{A → B}$} \AXC{$Γ \vdash \red{a} : \blue{A}$} \BIC{$Γ \vdash \red{f\ a} : \blue{B}$} \noLine \BIC{} \end{prooftree}
- The $(pair)$ rule constructs pairs of elements. The $(π_1)$ and $(π_2)$ rules destruct a pair into its projections. \begin{prooftree} \LeftLabel{$(pair)$} \AXC{$Γ \vdash \red{a} : \blue{A}$} \AXC{$Γ \vdash \red{b} : \blue{B}$} \BIC{$Γ \vdash \red{\pair{a,b}} : \blue{A × B}$} \LeftLabel{$(π_1)$} \AXC{$Γ \vdash \red{m} : \blue{A × B}$} \UIC{$Γ \vdash \red{π_1\ m} : \blue{A}$} \LeftLabel{$(π_2)$} \AXC{$Γ \vdash \red{m} : \blue{A × B}$} \UIC{$Γ \vdash \red{π_2\ m} : \blue{B}$} \noLine \TIC{} \end{prooftree}
- The $(inl)$ and $(inr)$ rules provide the two ways of creating a tagged union type, while the $(case)$ rule extracts a term from a union type applying case analysis. Note that we write $[a].n$ and $[b].p$ to explicitly indicate that $n$ and $p$ can depend on $a$ and $b$, respectively. \begin{prooftree} \LeftLabel{$(inl)$} \AXC{$Γ \vdash \red{a} : \blue{A}$} \UIC{$Γ \vdash \red{\mathrm{inl}\ a} : \blue{A + B}$} \LeftLabel{$(inr)$} \AXC{$Γ \vdash \red{b} : \blue{B}$} \UIC{$Γ \vdash \red{\mathrm{inr}\ b} : \blue{A + B}$} \noLine \BIC{} \end{prooftree} \begin{prooftree} \LeftLabel{$(case)$} \AXC{$Γ \vdash \red{m} : \blue{A + B}$} \AXC{$Γ, \red{a}:\blue{A} \vdash \red{n} : \blue{C}$} \AXC{$Γ, \red{b}:\blue{B} \vdash \red{p} : \blue{C}$} \TIC{$Γ \vdash (\red{\mathrm{case}\ m\ \mathrm{of}\ [a].n;\ [b].p}) : \blue{C}$} \end{prooftree}
- The $(∗)$ rule simply creates the only element of type $1$. \begin{prooftree} \LeftLabel{$(∗)$} \AXC{$$} \UIC{$Γ \vdash \red{∗} : \blue{1}$} \end{prooftree}
- The $(abort)$ rule extracts a term of any type from the void type. If we reach the void type, we have reached an error, and thus we can throw any typed exception. \begin{prooftree} \LeftLabel{$(abort)$} \AXC{$Γ \vdash \red{m} : \blue{0}$} \UIC{$Γ \vdash \red{\mathrm{abort}_A\ m} : \blue{A}$} \end{prooftree}
The β-reduction of terms is defined in the same way as for the untyped λ-calculus, except for the inclusion of β-rules governing the new terms, one for each new destruction rule.
- Function application, $(λ x.m)\ n →_β m[n/x]$.
- First projection, $π_1 \left\langle m,n \right\rangle →_β m$.
- Second projection, $π_2 \left\langle m,n \right\rangle →_β n$.
- Case rule, $(\mathrm{case}\ m\ \mathrm{of}\ [x].n;\ [y].p) →_β n[a/x]$ if $m$ is of the form $m = \mathrm{inl}\ a$; and $(\mathrm{case}\ m\ \mathrm{of}\ [x].n;\ [y].p) →_β p[b/y]$ if $m$ is of the form $m = \mathrm{inr}\ b$.
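For example, chaining the case rule with the first projection:
\[ π_1\ (\mathrm{case}\ (\mathrm{inl}\ a)\ \mathrm{of}\ [x].\left\langle x,x \right\rangle;\ [y].\left\langle y,y \right\rangle) →_β π_1 \left\langle a,a \right\rangle →_β a. \]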
On the other hand, new η-rules are defined, one for each new construction rule.
- Function extensionality: $λ x.f\ x →_η f$ for any $f : A → B$.
- Definition of product: $\langle π_1\ m, π_2\ m \rangle →_η m$ for any $m : A × B$.
- Uniqueness of the unit: $t →_η ∗$ for any $t : 1$.
- Case rule: $(\mathrm{case}\ m\ \mathrm{of}\ [a].p[ \mathrm{inl}\ a/c ];\ [b].p[ \mathrm{inr}\ b/c ]) →_η p[m/c]$ for any $m : A + B$.
<<sec-naturaldeduction>> Natural deduction is a logical system due to Gentzen. We introduce it here following cite:selinger13 and cite:wadler15. Its relationship with the simply typed lambda calculus will be made explicit in Section Propositions as types.
We use the logical binary connectives
- Every axiom on the context can be used. \begin{prooftree} \RightLabel{(Ax)} \AXC{} \UIC{$Γ,\blue{A} \vdash \blue{A}$} \end{prooftree}
- Introduction and elimination of the $→$ connective. Note that the elimination rule corresponds to modus ponens and the introduction rule corresponds to the deduction theorem. \begin{prooftree} \RightLabel{($I→$)} \AXC{$ Γ, \blue{A} \vdash \blue{B}$} \UIC{$Γ \vdash \blue{A → B}$} \RightLabel{($E→$)} \AXC{$ Γ \vdash \blue{A → B}$} \AXC{$Γ \vdash \blue{A}$} \BIC{$Γ \vdash \blue{B}$} \noLine \BIC{} \end{prooftree}
- Introduction and elimination of the $∧$ connective. Note that the introduction in this case takes two assumptions, and there are two different elimination rules. \begin{prooftree} \RightLabel{($I∧$)} \AXC{$ Γ \vdash \blue{A}$} \AXC{$Γ \vdash \blue{B}$} \BIC{$Γ \vdash \blue{A ∧ B}$} \RightLabel{($E∧^1$)} \AXC{$Γ \vdash \blue{A ∧ B}$} \UIC{$Γ \vdash \blue{A}$} \RightLabel{($E∧^2$)} \AXC{$Γ \vdash \blue{A ∧ B}$} \UIC{$Γ \vdash \blue{B}$} \noLine \TIC{} \end{prooftree}
- Introduction and elimination of the $∨$ connective. Here, we need two introduction rules to match the two assumptions we use on the eliminator. \begin{prooftree} \RightLabel{($I∨^1$)} \AXC{$Γ \vdash \blue{A}$} \UIC{$Γ \vdash \blue{A ∨ B}$} \RightLabel{($I∨^2$)} \AXC{$Γ \vdash \blue{B}$} \UIC{$Γ \vdash \blue{A ∨ B}$} \RightLabel{($E∨$)} \AXC{$ Γ \vdash \blue{A ∨ B}$} \AXC{$Γ,\blue{A} \vdash \blue{C}$} \AXC{$Γ,\blue{B} \vdash \blue{C}$} \TIC{$Γ \vdash \blue{C}$} \noLine \TIC{} \end{prooftree}
- Introduction for $\top$. It needs no assumptions and, consequently, there is no elimination rule for it. \begin{prooftree} \RightLabel{($I\top$)} \AXC{} \UIC{$ Γ \vdash \blue{\top}$} \end{prooftree}
- Elimination for $\bot$. It can be eliminated in all generality and, consequently, there are no introduction rules for it. This elimination rule represents the “ex falso quodlibet” principle, which says that falsity implies anything. \begin{prooftree} \RightLabel{($E\bot$)} \AXC{$ Γ \vdash \blue{\bot}$} \UIC{$Γ \vdash \blue{C}$} \end{prooftree}
Proofs in natural deduction are written as deduction trees, and they can be simplified according to some simplification rules, which can be applied anywhere in the deduction tree. In these rules, a chain of dots represents any given part of the deduction tree.
- An implication and its antecedent can be simplified using the
antecedent directly on the implication.
\begin{prooftree} \AXC{$[\blue{A}]$}\noLine \UIC{$\vdots^1$}\noLine \UIC{$\blue{B}$} \UIC{$\blue{A} → \blue{B}$} \AXC{$\vdots^2$}\noLine \UIC{$\blue{A}$} \BIC{$\blue{B}$}\UIC{$\vdots$}\noLine \AXC{$\Longrightarrow$} \UIC{}\noLine\UIC{}\noLine\UIC{}\noLine
\AXC{$\vdots^2$}\noLine \UIC{$\blue{A}$}\noLine \UIC{$\vdots^1$}\noLine \UIC{$\blue{B}$} \UIC{$\vdots$}\noLine \noLine \TIC{} \end{prooftree}
- The introduction of an unused conjunction can be simplified as
\begin{prooftree} \AXC{$\vdots^1$}\noLine \UIC{$\blue{A}$} \AXC{$\vdots^2$}\noLine \UIC{$\blue{B}$} \BIC{$\blue{A} ∧ \blue{B}$} \UIC{$\blue{A}$} \UIC{$\vdots$}\noLine \AXC{$\Longrightarrow$} \UIC{}\noLine\UIC{}\noLine\UIC{}\noLine \AXC{$\vdots^1$}\noLine \UIC{$\blue{A}$} \UIC{$\vdots$}\noLine \noLine \TIC{} \end{prooftree}
and, similarly, on the other side as follows.
\begin{prooftree} \AXC{$\vdots^1$}\noLine \UIC{$\blue{A}$} \AXC{$\vdots^2$}\noLine \UIC{$\blue{B}$} \BIC{$\blue{A} ∧ \blue{B}$} \UIC{$\blue{B}$} \UIC{$\vdots$}\noLine \AXC{$\Longrightarrow$} \UIC{}\noLine\UIC{}\noLine\UIC{}\noLine \AXC{$\vdots^2$}\noLine \UIC{$\blue{B}$} \UIC{$\vdots$}\noLine \noLine \TIC{} \end{prooftree}
- The introduction of a disjunction followed by its elimination can be also simplified
\begin{prooftree} \AXC{$\vdots^1$}\noLine \UIC{$\blue{A}$} \UIC{$\blue{A} ∨ \blue{B}$} \AXC{$[\blue{A}]$}\noLine \UIC{$\vdots^2$}\noLine \UIC{$\blue{C}$} \AXC{$[\blue{B}]$}\noLine \UIC{$\vdots^3$}\noLine \UIC{$\blue{C}$} \TIC{$\blue{C}$}\UIC{$\vdots$}\noLine \AXC{$\Longrightarrow$} \UIC{}\noLine\UIC{}\noLine\UIC{}\noLine
\AXC{$\vdots^1$}\noLine \UIC{$\blue{A}$}\noLine \UIC{$\vdots^2$}\noLine \UIC{$\blue{C}$} \UIC{$\vdots$}\noLine \noLine \TIC{} \end{prooftree}
and a similar pattern is used on the other side of the disjunction.
\begin{prooftree} \AXC{$\vdots^1$}\noLine \UIC{$\blue{B}$} \UIC{$\blue{A} ∨ \blue{B}$} \AXC{$[\blue{A}]$}\noLine \UIC{$\vdots^2$}\noLine \UIC{$\blue{C}$} \AXC{$[\blue{B}]$}\noLine \UIC{$\vdots^3$}\noLine \UIC{$\blue{C}$} \TIC{$\blue{C}$}\UIC{$\vdots$}\noLine \AXC{$\Longrightarrow$} \UIC{}\noLine\UIC{}\noLine\UIC{}\noLine
\AXC{$\vdots^1$}\noLine \UIC{$\blue{B}$}\noLine \UIC{$\vdots^3$}\noLine \UIC{$\blue{C}$} \UIC{$\vdots$}\noLine \noLine \TIC{} \end{prooftree}
<<sec-propositionstypes>>
In 1934, Curry observed in cite:curry34 that the type of a function
- propositions are types;
- proofs are programs; and
- simplification of proofs is evaluation of programs.
In the case of simply typed λ-calculus and natural deduction, the correspondence starts when we describe the following one-to-one relation between types and propositions.
\begin{center}\begin{tabular}{c|c}
Types & Propositions \\
\hline
Unit type ($1$) & Truth ($\top$) \\
Void type ($0$) & Falsity ($\bot$) \\
Product type ($A × B$) & Conjunction ($A ∧ B$) \\
Union type ($A + B$) & Disjunction ($A ∨ B$) \\
Function type ($A → B$) & Implication ($A → B$) \\
\end{tabular}\end{center}
Here, in particular, the negation of a proposition $A$ is represented by the function type $A → 0$.
Now it is easy to notice that every deduction rule of Section Natural deduction corresponds to a typing rule of Section Extending the simply typed λ-calculus. The only distinction between them is the appearance of λ-terms in the first set of rules. As every typing rule results in the construction of a particular kind of λ-term, typing derivations can be interpreted as encodings of proofs in the form of derivation trees. That is, terms are proofs of the propositions represented by their types.
Furthermore, under this interpretation, /simplification rules are precisely β-reduction rules/. This makes execution of λ-calculus programs correspond to proof simplification on natural deduction. The Curry-Howard correspondence is then not only a simple bijection between types and propositions, but a deeper isomorphism regarding the way they are constructed, used in derivations, and simplified.
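Since the core of Haskell is itself a typed λ-calculus (Section sec-haskelllang), the correspondence can be illustrated directly in it; the following hypothetical terms, read as total programs, witness two intuitionistic tautologies, with conjunction represented by the pair type.
-- A proof of A → (A → B) → B: modus ponens as a program.
modusPonens :: a -> (a -> b) -> b
modusPonens a f = f a

-- A proof of (A ∧ B) → (B ∧ A): commutativity of conjunction.
swapConj :: (a, b) -> (b, a)
swapConj (a, b) = (b, a)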
The λ-cube is a taxonomy for Church-style type systems given by Barendregt in cite:barendregt92. It describes eight type systems based on the λ-calculus along three axes, representing three properties of the systems.
- Parametric polymorphism: terms that depend on types. This is achieved via universal quantification over types. It allows type variables and binders for them. An example is the following parametric identity function, where $Λ$ acts as a $λ$ for types, and $τ$ is a type variable. \[ \mathrm{id} ≡ Λ τ. λ x. x : ∀ τ. τ → τ. \] It can be applied to any particular type $A$ to obtain the specific identity function for that type as \[ \mathrm{id}_A ≡ λ x.x : A → A. \]
- Type operators: types that depend on types. An example of a type operator is $[-]$, which sends each type $A$ to the type $[A]$ of lists of elements of $A$. If we also assume polymorphism, a higher-order function mapping a function argument over a list would have the following type. \[ \mathrm{map} : ∀ τ. ∀ σ. (τ → σ) → [τ] → [σ] \]
- Dependent types: types that depend on terms. An example is the type $\mathrm{Vect}(n)$ of vectors of a fixed length $n$, where $n$ is, itself, an element of a natural numbers type, $n : \mathbb{N}$. The type of vectors of any length, $\mathrm{Vect}(0) + \mathrm{Vect}(1) + \mathrm{Vect}(2) + …$, is written as \[ ∑_{n : \mathbb{N}} \mathrm{Vect}(n). \] Chapters Locally cartesian closed categories and dependent types and Martin-Löf type theory are devoted to the study of dependent types.
The λ-cube is shown in the following figure.
\[\begin{tikzcd}[column sep=small]
&&& |[label={above:\lcubett{System F$ω$}}]| \systemfo \ar{rr}
&& |[label=above:\lcubett{CoC},label=above:\phantom{System Fo}]| \systemcoc
&
\phantom{.} \\
&& |[label={left:\lcubett{System F}}]| \systemf \ar{ur}\ar{rr}
&& \systemfp \ar{ur}
&& \\
&&& \systemo \ar{rr}\ar{uu}
&& |[label=right:\lcubett{wCoC}]| \systemlpo \ar{uu}
&& \phantom{λ}\phantom{PQW} \\
\ar[\lcred]{uu}[\lcred]{\text{\parbox{2cm}{\centering terms depend on types}}}
&& |[label=below:\lcubett{STLC}]| \stlc \ar{uu}\ar{rr}\ar{ur}
&& |[label=below:\lcubett{DTLC}]| \systemlp \ar{uu}\ar{ur}
&& \phantom{.} \\
&& \ar[\lcred]{rr}[swap,\lcred]{\text{\parbox{2cm}{\centering types depend on terms}}}
&& \phantom{.}
& \ar[\lcred]{ur}[swap,\lcred]{\text{\parbox{2cm}{\centering types depend on types}}}
&
\end{tikzcd}\]
It presents the following type systems, some of which are not commonly used; all of them, however, are strongly normalizing.
- Simply typed λ-calculus ($\stlc$), as described in Section Typing rules for simply typed λ-calculus.
- Simply typed λ-calculus with type operators ($\systemo$).
- System F ($\systemf$) and System F-omega ($\systemfo$), which add polymorphism and type operators to the simply typed λ-calculus. The Haskell programming language is based on System F-omega with some restrictions.
- Dependently typed λ-calculus ($\systemlp$), used in the Edinburgh Logical Framework for logic programming (see cite:harper93).
- Calculus of constructions ($\systemcoc$), where full mathematical theories can be developed (see cite:coquand88). It is used in the Coq Proof Assistant.
The λ-cube is generalized by the theory of pure type systems, described in cite:barendregt92 and cite:geuvers93.
<<sec-haskelllang>>
Haskell is the purely functional programming language of our choice to implement our λ-calculus interpreter. Its design is heavily influenced by the λ-calculus, and it is a general-purpose language with a rich ecosystem and plenty of consolidated libraries[fn:hackagelibs] in areas such as parsing, testing or system interaction, matching the requirements of our project. In the following sections, we describe this ecosystem in more detail and justify our choice.
[fn:hackagelibs]: A categorized list of libraries can be found in the central package archive of the Haskell community: https://hackage.haskell.org/packages/
In the 1980s, many lazy programming languages, such as Miranda, Lazy ML, Orwell, Clean or Daisy, were being written independently by different researchers. All of them were similar in expressive power, but their differences were holding back efforts to communicate ideas on functional programming, so the Haskell 98 Report was created as a first standardized reference for a common lazy functional language. A revised version can be read in cite:haskell98. We will use its most standard implementation: the Glasgow Haskell Compiler (GHC), an open source compiler written in Haskell and C. The complete history of Haskell and its design decisions is detailed in cite:hudak07_haskell, but we are interested in the following properties:
Haskell is
- strongly and statically typed, meaning that it only compiles well-typed programs and does not allow implicit type casting; type declarations are then useful in our interpreter to keep track of what kind of data we are dealing with at each specific function;
- lazy, with non-strict semantics, meaning that it will not evaluate a term or the argument of a function until it is needed; this can help to solve the traditional efficiency problems of functional programming (see cite:hughes89);
- purely functional; as the evaluation order is demand-driven and not explicitly known, it is not possible to perform ordered input/output actions or any other side-effects that rely on the evaluation order; this helps the modularity of our code, testing, and verification;
- referentially transparent; as a consequence of its purity, every term in the code can be replaced by its definition without changing the global meaning of the program; this allows equational reasoning with rules that are directly derived from λ-calculus and makes it easier to reason about our functions;
- based on System Fω with some restrictions; crucially, it implements System F adding quantification over type operators, even if it does not allow abstraction over type operators; GHC, however, allows the user to activate extensions that approximate dependent types.
Where most imperative languages use semicolons to separate sequential commands, Haskell has no notion of sequencing, and programs are written in a purely declarative way. A Haskell program essentially consists of a series of definitions (of both types and terms) and type declarations. The following example shows the definition of a binary tree and its preorder.
-- A tree is either empty or a node with two subtrees.
data Tree a = Empty | Node a (Tree a) (Tree a)
-- The preorder function takes a tree and returns a list
preorder :: Tree a -> [a]
preorder Empty = []
preorder (Node x lft rgt) = preorder lft ++ [x] ++ preorder rgt
We can see on the previous example that function definitions allow pattern matching, that is, data constructors can be used in definitions to decompose values of the type. This increases readability when working with algebraic data types or implementing inductive definitions. Note that the majority of the definitions we discussed in Sections sec-untypedlambda and sec-simplytypedlambda are precisely structurally inductive.
Function application is written by juxtaposition and associates to the left; infix operators are also allowed. Definitions using partial application are permitted, meaning that functions of multiple arguments are curried and can be applied to only some of their arguments to define a new function. For example, a function that squares every number on a list could be written in two ways, as the following example shows. The second one, because of its simplicity, is usually preferred.
-- An auxiliary squaring function on integers.
square :: Int -> Int
square x = x * x

squareList :: [Int] -> [Int]
squareList list = map square list

squareList' :: [Int] -> [Int]
squareList' = map square
A characteristic feature of Haskell is type classes, which allow defining common interfaces for different types. In the following example, we define Monad as the type class of types with suitably typed return and >>= operators.
class Monad m where
  return :: a -> m a
  (>>=)  :: m a -> (a -> m b) -> m b
And lists, for example, are monads in this sense.
instance Monad [] where
  return x = [x]               -- returns a one-element list
  xs >>= f = concat (map f xs) -- map and concatenation
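With this instance, the bind operator maps a function over a list and flattens the results; a quick check of the definition:
-- [1,2,3] >>= \x -> [x, 10*x]   evaluates to   [1,10,2,20,3,30]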
Monads are used in I/O, error propagation and stateful computations. Another characteristic syntactic feature of Haskell is the do notation, which provides a nicer, cleaner way to concatenate computations with monads that resembles an imperative language. The following example uses the list monad to compute the list of Pythagorean triples.
-- Uses guard, from Control.Monad, to filter the results.
pythagorean = do
  a <- [1..]               -- let a be any natural
  b <- [1..a]              -- let b be a natural between 1 and a
  c <- [1..b]              -- let c be a natural between 1 and b
  guard (a^2 == b^2 + c^2) -- filter the list
  return (a,b,c)           -- return matching tuples
Note that this list is infinite. As the language is lazy, this does not represent a problem: the list will be evaluated only on demand.
Another common example of an instance of the Monad typeclass is the Maybe monad, used to deal with error propagation. A term of type Maybe a is either a term of type a, written as Just a; or the constant Nothing, signalling an error. The monad is then defined as
instance Monad Maybe where
  return x = Just x
  Nothing >>= _ = Nothing -- errors are propagated
  Just a  >>= f = f a     -- values are passed to the next computation
and can be used as in the following example to use exception-like error handling in a pure declarative language.
roots :: (Float,Float,Float) -> Maybe (Float,Float)
roots (a,b,c) = do
  -- Some errors can occur during this computation
  discriminant <- sqroot (b*b - 4*c*a)         -- roots of negative numbers?
  root1 <- safeDiv ((-b) + discriminant) (2*a) -- division by zero?
  root2 <- safeDiv ((-b) - discriminant) (2*a)
  -- The monad ensures that we return a pair only if no error has been raised
  return (root1,root2)
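The helpers sqroot and safeDiv are not shown above; a minimal sketch of how they could look, under the assumption that they return Nothing on the problematic inputs, is the following.
-- Hypothetical helpers for the previous example.
sqroot :: Float -> Maybe Float
sqroot x
  | x < 0     = Nothing -- no real square root of a negative number
  | otherwise = Just (sqrt x)

safeDiv :: Float -> Float -> Maybe Float
safeDiv _ 0 = Nothing   -- division by zero
safeDiv x y = Just (x / y)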
A more detailed treatment of monads from the perspective of category theory is presented in Section sec-monads.
<<sec-debruijnindexes>>
Nicolaas Govert De Bruijn proposed (see cite:debruijn81) a way of defining λ-terms modulo
α-conversion based on indices. The main goal of De Bruijn
indices is to remove all variables from binders and replace every
variable on the body of an expression with a number, called index,
representing the number of λ-abstractions in scope between the
occurrence and its binder. Consider the following example where we draw
arrows between each term and its intermediate λ-abstractions: the λ-term
\[ \red{λ\tikzmark{2} y}.\ \tikzmark{3}y\
(\blue{λ\tikzmark{1} z}.\ \tikzmark{0}y\
\tikzmark{4}z) \begin{tikzpicture}[remember picture, overlay, bend right=45, -latex, redPRL]
\draw ([yshift=2ex]pic cs:0) to ([yshift=2ex]pic cs:1);
\draw ([yshift=2ex]pic cs:1) to ([yshift=2ex]pic cs:2);
\draw ([yshift=2ex]pic cs:3) to ([yshift=2ex]pic cs:2);
\end{tikzpicture} \begin{tikzpicture}[remember picture, overlay, bend left=45, -latex, bluePRL]
\draw ([yshift=-1ex]pic cs:4) to ([yshift=-1ex]pic cs:1);
\end{tikzpicture}
\]
can be written with De Bruijn indices as $λ\ 1\ (λ\ 2\ 1)$.
De Bruijn also proposed a notation for the λ-calculus changing the order of binders and λ-applications. A review of the syntax of this notation, its advantages, and De Bruijn indices can be found in cite:kamareddine01. In this section, we describe De Bruijn indices while preserving the usual notation of λ-terms; that is, De Bruijn indices and De Bruijn notation are two different concepts, and we will use only the former, for clarity of exposition.
Our internal definition closely matches the formal one. The names of the constructors are Var, Lambda and App, representing variables, abstractions and applications, respectively.
-- | A lambda expression using DeBruijn indexes.
data Exp = Var Integer -- ^ integer indexing the variable
         | Lambda Exp  -- ^ lambda abstraction
         | App Exp Exp -- ^ function application
         deriving (Eq, Ord)
This notation avoids the need for Barendregt’s variable convention and for α-conversions. It will be useful to implement λ-calculus without having to worry about the specific names of variables.
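As an illustration (the name example is ours), the term λy. y (λz. y z) from the previous figure, written λ 1 (λ 2 1) with De Bruijn indices, is represented internally as follows.
-- λy. y (λz. y z), that is, λ 1 (λ 2 1) with De Bruijn indices.
example :: Exp
example = Lambda (App (Var 1) (Lambda (App (Var 2) (Var 1))))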
<<sec-substitution>>
We now implement the substitution operation described in Section sec-freeboundvars, as it will be needed for β-reducing terms with De Bruijn indices. In order to define the substitution of the n-th variable by a λ-term, we must:
- find all the occurrences of the variable; at each level of scope, we look for the successor of the index we were looking for before;
- decrease the higher variables to reflect the disappearance of a lambda;
- replace the occurrences of the variable by the new term, taking into account that its free variables must be increased to avoid them getting captured by the outermost lambda terms.
In our code, we can apply the function subs to any expression. When it is applied to a λ-abstraction, the index and the free variables of the replaced term are increased with an auxiliary function called incrementFreeVars; whenever it is applied to a variable, the previous cases are taken into consideration.
-- | Substitutes an index for a lambda expression
subs :: Integer -> Exp -> Exp -> Exp
subs n p (Lambda e) = Lambda (subs (n+1) (incrementFreeVars 0 p) e)
subs n p (App f g)  = App (subs n p f) (subs n p g)
subs n p (Var m)
  | n == m    = p         -- the bound variable is replaced directly
  | n < m     = Var (m-1) -- a freer variable loses an enclosing lambda
  | otherwise = Var m     -- an unrelated variable remains untouched
Now β-reduction can be defined using this subs function.
betared :: Exp -> Exp
betared (App (Lambda e) x) = subs 1 x e
betared e = e
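For instance, β-reducing the identity applied to a free variable leaves the variable untouched; a small check of the definition:
-- betared (App (Lambda (Var 1)) (Var 2))   evaluates to   Var 2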
The internal language of the interpreter uses De Bruijn expressions, while the user interacts with it using lambda expressions with alphanumeric variables. Our definition of a lambda expression with variables will be used in parsing and output formatting.
data NamedLambda = LambdaVariable String
                 | LambdaAbstraction String NamedLambda
                 | LambdaApplication NamedLambda NamedLambda
The translation from a named lambda expression to De Bruijn notation is done using a dictionary which keeps track of bound variables.
tobruijn :: Map.Map String Integer -- ^ names of the variables used
         -> Context                -- ^ names already bound in the scope
         -> NamedLambda            -- ^ initial expression
         -> Exp
-- Every lambda abstraction is inserted in the variable dictionary,
-- and every number in the dictionary increases to reflect we are entering
-- a deeper context.
tobruijn d context (LambdaAbstraction c e) =
  Lambda $ tobruijn newdict context e
  where newdict = Map.insert c 1 (Map.map succ d)
-- Translation distributes over applications.
tobruijn d context (LambdaApplication f g) =
  App (tobruijn d context f) (tobruijn d context g)
-- We look for every variable on the local dictionary and the current scope.
tobruijn d context (LambdaVariable c) =
  case Map.lookup c d of
    Just n  -> Var n
    Nothing -> fromMaybe (Var 0) (MultiBimap.lookupR c context)
The translation from a De Bruijn expression to a named one is done by considering an infinite list of possible variable names and keeping a list of the variables currently in scope to name the indices.
-- | An infinite list of all possible variable names
-- in lexicographical order.
variableNames :: [String]
variableNames = concatMap (`replicateM` ['a'..'z']) [1..]

-- | A function translating a deBruijn expression into a
-- natural lambda expression.
nameIndexes :: [String] -> [String] -> Exp -> NamedLambda
nameIndexes _ _ (Var 0) = LambdaVariable "undefined"
nameIndexes used _ (Var n) = LambdaVariable (used !! pred (fromInteger n))
nameIndexes used new (Lambda e) =
  LambdaAbstraction (head new) (nameIndexes (head new:used) (tail new) e)
nameIndexes used new (App f g) =
  LambdaApplication (nameIndexes used new f) (nameIndexes used new g)
<<section-evaluation>> As we proved in Corollary cor-leftmosttheorem, the leftmost reduction strategy will find the normal form of any given term, provided that it exists. Consequently, we implement reduction in our interpreter using a function that simply applies the leftmost possible reductions at each step. As a side benefit, this will allow us to easily show step-by-step evaluations to the final user (see Section sec-verbose).
-- | Simplifies the expression recursively.
-- Applies only one parallel beta reduction at each step.
simplify :: Exp -> Exp
simplify (Lambda e) = Lambda (simplify e)
simplify (App (Lambda f) x) = betared (App (Lambda f) x)
simplify (App (Var e) x) = App (Var e) (simplify x)
simplify (App a b) = App (simplify a) (simplify b)
simplify (Var e) = Var e
-- | Applies repeated simplification to the expression until it stabilizes and
-- returns all the intermediate results.
simplifySteps :: Exp -> [Exp]
simplifySteps e
  | e == s    = [e]
  | otherwise = e : simplifySteps s
  where s = simplify e
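As a small example of these two functions, reducing the identity applied to itself takes exactly one step:
-- simplifySteps (App (Lambda (Var 1)) (Lambda (Var 1)))
--   evaluates to   [App (Lambda (Var 1)) (Lambda (Var 1)), Lambda (Var 1)]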
From the code we can see that the evaluation finishes whenever the expression stabilizes. This can happen in two different cases:
- there are no more possible β-reductions, and the algorithm stops; or
- β-reductions do not change the expression. The computation would lead to an infinite loop, so it is immediately stopped. A common example of this is the λ-term $(λx.x\ x)(λx.x\ x)$.
<<sec-typeinference>> The interpreter implements the unification and type inference algorithms described in Lemma lemma-unification and Theorem thm-typeinfer. Their recursive nature makes them very easy to implement directly in Haskell.
We implement a simply-typed lambda calculus with Curry-style typing (see Section sec-currystyle) and type templates. Our type system has a unit type; a void type; product types; union types; and function types.
-- | A type template is a free type variable, an arrow between two
-- types (the function type), a product, a union, the unit type,
-- or the void type.
data Type = Tvar Variable
          | Arrow Type Type
          | Times Type Type
          | Union Type Type
          | Unitty
          | Bottom
          deriving (Eq)
We will work with substitutions on type templates. They can be directly defined as functions from types to types. A basic substitution that inserts a given type in the place of a variable will be our building block for more complex ones.
type Substitution = Type -> Type

-- | A basic substitution. It changes a variable for a type
subs :: Variable -> Type -> Substitution
subs x typ (Tvar y)
  | x == y    = typ
  | otherwise = Tvar y
subs x typ (Arrow a b) = Arrow (subs x typ a) (subs x typ b)
subs x typ (Times a b) = Times (subs x typ a) (subs x typ b)
subs x typ (Union a b) = Union (subs x typ a) (subs x typ b)
subs _ _ Unitty = Unitty
subs _ _ Bottom = Bottom
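For instance, assuming type variables are indexed by integers (as in the test suite shown later), replacing the variable 0 by the unit type acts as follows:
-- subs 0 Unitty (Arrow (Tvar 0) (Tvar 1))   evaluates to   Arrow Unitty (Tvar 1)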
Unification will be implemented making extensive use of the Maybe monad. If the unification fails, it will return an error value, and the error will be propagated to the whole computation. The algorithm is exactly the one defined in Lemma lemma-unification.
-- | Unifies two types with their most general unifier. Returns the substitution
-- that transforms any of the types into the unifier.
unify :: Type -> Type -> Maybe Substitution
unify (Tvar x) (Tvar y)
  | x == y    = Just id
  | otherwise = Just (subs x (Tvar y))
unify (Tvar x) b
  | occurs x b = Nothing
  | otherwise  = Just (subs x b)
unify a (Tvar y)
  | occurs y a = Nothing
  | otherwise  = Just (subs y a)
unify (Arrow a b) (Arrow c d) = unifypair (a,b) (c,d)
unify (Times a b) (Times c d) = unifypair (a,b) (c,d)
unify (Union a b) (Union c d) = unifypair (a,b) (c,d)
unify Unitty Unitty = Just id
unify Bottom Bottom = Just id
unify _ _ = Nothing
-- | Unifies a pair of types
unifypair :: (Type,Type) -> (Type,Type) -> Maybe Substitution
unifypair (a,b) (c,d) = do
  p <- unify b d
  q <- unify (p a) (p c)
  return (q . p)
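As an example of how these definitions interact, unifying the types A → B and B → ⊤ first unifies B with ⊤ and then A with the image of B, so both variables end up mapped to the unit type:
-- fmap ($ Arrow (Tvar 0) (Tvar 1))
--      (unify (Arrow (Tvar 0) (Tvar 1)) (Arrow (Tvar 1) Unitty))
--   evaluates to   Just (Arrow Unitty Unitty)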
The type inference algorithm from Theorem thm-typeinfer is more involved. It takes a list of fresh variables, a type context, a lambda expression and a constraint on the type, expressed as a type template. It outputs a substitution. As an example, the following code shows the case of function application, where the constraint is an arrow type.
-- | Type inference algorithm. Infers a type from a given context and expression
-- with a set of constraints represented by a unifier type. The result type must
-- be unifiable with this given type.
typeinfer :: [Variable] -- ^ List of fresh variables
          -> Context    -- ^ Type context
          -> Exp        -- ^ Lambda expression whose type has to be inferred
          -> Type       -- ^ Constraint
          -> Maybe Substitution
typeinfer (x:vars) ctx (App p q) b = do -- Writing inside the Maybe monad.
  sigma <- typeinfer (evens vars) ctx p (Arrow (Tvar x) b)
  tau   <- typeinfer (odds vars) (applyctx sigma ctx) q (sigma (Tvar x))
  return (tau . sigma)
The final form of the type inference algorithm will use a normalization algorithm shortening the type names and will apply the type inference to the empty type context.
A generalized version of the type inference algorithm is used to generate derivation trees from terms, as it was described in Section sec-propositionstypes. In order to draw these diagrams in Unicode characters, a data type for character blocks has been defined. A monoidal structure is defined over them; blocks can be joined vertically and horizontally; and every deduction step can be drawn independently.
newtype Block = Block { getBlock :: [String] }
  deriving (Eq, Ord)

instance Monoid Block where
  mappend = joinBlocks -- monoid operation, joins blocks vertically
  mempty  = Block [[]] -- neutral element

-- Type signatures
joinBlocks     :: Block -> Block -> Block
stackBlocks    :: String -> Block -> Block -> Block
textBlock      :: String -> Block
deductionBlock :: Block -> String -> [Block] -> Block
box            :: Block -> Block
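The implementations are omitted here; a minimal sketch of two of these operations, under the simplifying assumption that blocks need not be padded to a common width, could be the following.
-- Simplified sketches; the actual implementation also pads lines.
textBlock :: String -> Block
textBlock = Block . lines

joinBlocks :: Block -> Block -> Block
joinBlocks (Block a) (Block b) = Block (a ++ b) -- vertical juxtaposition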
<<sec-monadicparsers>> The common approach to building parsers in functional programming is to model parsers as functions. Higher-order functions on parsers act as combinators, which are used to implement complex parsers in a modular way from a set of primitive ones. In this setting, parsers exhibit a monadic algebraic structure, which can be used to simplify the combination of parsers. A technical report on monadic parser combinators can be found in cite:hutton96.
The use of monads for parsing was discussed for the first time in cite:Wadler85, and later in cite:Wadler90 and cite:hutton98. The parser type is defined as a function taking an input String and returning a list of pairs, each representing a successful parse. The first component of the pair is the parsed value and the second component is the remaining input. The Haskell code for this definition is the following, where the monadic structure is defined by >>= and return.
-- A parser takes a string and returns a list of possible parsings with
-- their remaining string.
newtype Parser a = Parser (String -> [(a,String)])

parse :: Parser a -> String -> [(a,String)]
parse (Parser p) = p

-- A parser can be composed monadically, the composed parser (p >>= q)
-- applies q to every possible parsing of p. A trivial one is defined.
instance Monad Parser where
  return x = Parser (\s -> [(x,s)]) -- Trivial parser, directly returns x.
  p >>= q  = Parser (\s -> concat [parse (q x) s' | (x,s') <- parse p s])
Given a value, the return function creates a parser that consumes no input and simply returns the given value. The >>= function acts as a sequencing operator for parsers; it takes two parsers and applies the second one over the remaining inputs of the first one, using the parsed values of the first parsing as arguments.
An example of a primitive parser is the item parser, which consumes a character from a non-empty string. It is written in Haskell using pattern matching on the string as follows.
item :: Parser Char
item = Parser (\s -> case s of "" -> []; (c:s') -> [(c,s')])
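Under this definition, parsing consumes exactly one character and fails on the empty string:
-- parse item "abc"   evaluates to   [('a',"bc")]
-- parse item ""      evaluates to   []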
An example of a parser combinator is the many function, which creates a parser that applies the given parser repeatedly, as many times as possible. For instance, many item is a parser consuming all characters from the input string. Its recursion needs a base case, which we obtain from a choice combinator that tries a second parser whenever the first one fails.
-- Tries the second parser only when the first one fails.
orElse :: Parser a -> Parser a -> Parser a
orElse p q = Parser (\s -> case parse p s of [] -> parse q s
                                             rs -> rs)

many :: Parser a -> Parser [a]
many p = orElse (do a  <- p
                    as <- many p
                    return (a:as))
                (return [])
Parsec is a monadic parser combinator Haskell library described in cite:leijen2001. We have chosen to use it due to its simplicity and extensive documentation. As we expect to use it to parse live user input, which will tend to be short, performance is not a critical concern. A high-performance library supporting incremental parsing, such as Attoparsec cite:attoparsec, would be suitable otherwise.
<<sec-verbose>>
As we mentioned previously, the evaluation of lambda terms can be analyzed step by step. The interpreter shows the complete evaluation when the verbose mode is activated; to activate it, we can execute :verbose on in the interpreter. The difference can be seen in the following example, which shows the evaluation of the expression plus 1 2.
mikro> plus 1 2
λa.λb.(a (a (a b))) ⇒ 3
mikro> :verbose on
verbose: on
mikro> plus 1 2
((plus 1) 2)
((λλλλ((4 2) ((3 2) 1)) λλ(2 1)) λλ(2 (2 1)))
(λλλ((λλ(2 1) 2) ((3 2) 1)) λλ(2 (2 1)))
λλ((λλ(2 1) 2) ((λλ(2 (2 1)) 2) 1))
λλ(λ(3 1) (λ(3 (3 1)) 1))
λλ(2 (λ(3 (3 1)) 1))
λλ(2 (2 (2 1)))
λa.λb.(a (a (a b))) ⇒ 3
The interpreter output can be colored to show specifically where it is performing reductions. It is activated by default, but can be deactivated by executing :color off. The following code implements verbose mode in both cases.
-- | Shows an expression, coloring the next reduction if necessary
showReduction :: Exp -> String
showReduction (Lambda e) = "λ" ++ showReduction e
showReduction (App (Lambda f) x) = betaColor (App (Lambda f) x)
showReduction (Var e) = show e
showReduction (App rs x) = "("++showReduction rs++" "++showReduction x++")"
showReduction e = show e
Every λ-term can be written in terms of SKI combinators. SKI combinator expressions can be defined as binary trees having variables, S, K, and I as possible leaves.
data Ski = S | K | I | Comb Ski Ski | Cte String
The SKI-abstraction and bracket abstraction algorithms are implemented in Mikrokosmos, and they can be used by activating the SKI mode with :ski on. When this mode is activated, every result is written in terms of SKI combinators.
mikro> 2
λa.λb.(a (a b)) ⇒ S(S(KS)K)I ⇒ 2
mikro> and
λa.λb.((a b) a) ⇒ SSK ⇒ and
The code implementing these algorithms follows directly from the theoretical version in cite:Hindley08.
-- | Bracket abstraction of a SKI term, as defined in Hindley-Seldin
-- (2.18).
bracketabs :: String -> Ski -> Ski
bracketabs x (Cte y) = if x == y then I else Comb K (Cte y)
bracketabs x (Comb u (Cte y))
  | freein x u && x == y = u
  | freein x u           = Comb K (Comb u (Cte y))
  | otherwise            = Comb (Comb S (bracketabs x u)) (bracketabs x (Cte y))
bracketabs x (Comb u v)
  | freein x (Comb u v) = Comb K (Comb u v)
  | otherwise           = Comb (Comb S (bracketabs x u)) (bracketabs x v)
bracketabs _ a = Comb K a
-- | SKI abstraction of a named lambda term. From a lambda expression
-- creates a SKI equivalent expression. The following algorithm is a
-- version of the algorithm (9.10) on the Hindley-Seldin book.
skiabs :: NamedLambda -> Ski
skiabs (LambdaVariable x) = Cte x
skiabs (LambdaApplication m n) = Comb (skiabs m) (skiabs n)
skiabs (LambdaAbstraction x m) = bracketabs x (skiabs m)
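For instance, assuming freein x u checks that x does not occur free in u (as in the Hindley-Seldin rules), abstracting the term λx.λy.x yields the K combinator:
-- skiabs (LambdaAbstraction "x" (LambdaAbstraction "y" (LambdaVariable "x")))
--   evaluates to   K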
<<section-installation>> The complete Mikrokosmos suite is divided into several parts:
- the Mikrokosmos interpreter, written in Haskell;
- the Jupyter kernel, written in Python;
- the CodeMirror Lexer, written in Javascript;
- the Mikrokosmos libraries, written in the Mikrokosmos language;
- the Mikrokosmos-js compilation, which can be used in web browsers.
These parts will be detailed in the following sections. A system that already satisfies all dependencies (Stack, Pip and Jupyter) can install Mikrokosmos using the following script, which is detailed in this section.
stack install mikrokosmos # Mikrokosmos interpreter
sudo pip install imikrokosmos # Jupyter kernel for Mikrokosmos
git clone https://github.com/mroman42/mikrokosmos-lib.git ~/.mikrokosmos # Libs
The Mikrokosmos interpreter is listed in the central Haskell package archive[fn:hackage]. The packaging of Mikrokosmos has been done using the cabal tool, and the configuration of the package can be read in the file mikrokosmos.cabal of the source code. As a result, Mikrokosmos can be installed using the Haskell package managers cabal and stack.
cabal install mikrokosmos # Installation with cabal
stack install mikrokosmos # Installation with stack
The Mikrokosmos Jupyter kernel is listed in the central Python package archive[fn:pipmikro]. Jupyter is a dependency of this kernel, which can only be used in conjunction with it. It can be installed with the pip package manager.
sudo pip install imikrokosmos # Installation with pip
The installation can be checked by listing the available Jupyter kernels.
jupyter kernelspec list # Checks installation
The Mikrokosmos libraries can be downloaded directly from their GitHub repository[fn:mikrokosmoslibgit]. They have to be placed under ~/.mikrokosmos if we want them to be locally available, or under /usr/lib/mikrokosmos if we want them to be globally available.
git clone https://github.com/mroman42/mikrokosmos-lib.git ~/.mikrokosmos
The following script installs the complete Mikrokosmos suite on a fresh system. It has been tested under Ubuntu 16.04.3 LTS (Xenial Xerus).
# 1. Installs Stack, the Haskell package manager
wget -qO- https://get.haskellstack.org | sh
STACK=$(which stack)
# 2. Installs the ncurses library, used by the console interface
sudo apt install libncurses5-dev libncursesw5-dev
# 3. Installs the Mikrokosmos interpreter using Stack
$STACK setup
$STACK install mikrokosmos
# 4. Installs the Mikrokosmos standard libraries
sudo apt install git
git clone https://github.com/mroman42/mikrokosmos-lib.git ~/.mikrokosmos
# 5. Installs the IMikrokosmos kernel for Jupyter
sudo apt install python3-pip
sudo -H pip install --upgrade pip
sudo -H pip install jupyter
sudo -H pip install imikrokosmos
[fn:pipmikro]: The Jupyter-Mikrokosmos package can be found in https://pypi.org/project/imikrokosmos/.
[fn:hackage]: Hackage can be accessed in http://hackage.haskell.org/ and the Mikrokosmos package can be found in https://hackage.haskell.org/package/mikrokosmos
[fn:mikrokosmoslibgit]: The repository can be accessed in: https://github.com/mroman42/mikrokosmos-lib.git
Once installed, the Mikrokosmos λ interpreter can be opened from the terminal with the mikrokosmos command. It will enter a read-eval-print loop where λ-expressions and interpreter commands can be evaluated.
$> mikrokosmos
Welcome to the Mikrokosmos Lambda Interpreter!
Version 0.7.0. GNU General Public License Version 3.
mikro> _
The interpreter evaluates every line as a lambda expression. Examples of the use of the interpreter can be read in the following sections. Apart from the evaluation of expressions, the interpreter accepts the following commands:
- :quit and :restart stop the interpreter;
- :verbose activates verbose mode;
- :ski activates SKI mode;
- :types changes between untyped and simply typed λ-calculus;
- :color deactivates colored output;
- :load loads a library.
Figure mikrosession is an example session on the Mikrokosmos interpreter.
<<section-jupyterkernel>> The Jupyter Project cite:jupyter is an open source project providing support for interactive scientific computing. Specifically, the Jupyter Notebook provides a web application for creating interactive documents with live code and visualizations.
We have developed a Mikrokosmos kernel for the Jupyter Notebook, allowing the user to write and execute arbitrary Mikrokosmos code on this web application. An example session can be seen on Figure jupytersession.
The implementation is based on the pexpect library for Python. It allows direct interaction with any REPL and collects its results. Specifically, the following Python lines represent the central idea of this implementation.
# Initialization
self.mikro = pexpect.spawn('mikrokosmos')
self.mikro.expect('mikro> ')

# Interpreter interaction, with multiple-line support
output = ""
for line in code.split('\n'):
    # Send code to mikrokosmos
    self.mikro.sendline(line)
    self.mikro.expect('mikro> ')
    # Receive and filter output from mikrokosmos
    partialoutput = self.mikro.before.decode('utf8')
    output = output + partialoutput
A pip-installable package has been created following the Python Packaging Authority guidelines[fn:pypaguide]. This allows the kernel to be installed directly using the pip Python package manager.
sudo -H pip install imikrokosmos
[fn:pypaguide]: The PyPA packaging user guide can be found in its official page: https://packaging.python.org/
CodeMirror [fn:codemirror] is a text editor for the browser implemented in Javascript. It is used internally by the Jupyter Notebook.
A CodeMirror lexer for Mikrokosmos has been written. It uses Javascript regular expressions and signals the occurrence of any kind of operator to CodeMirror. It enables syntax highlighting for Mikrokosmos code on Jupyter Notebooks. It comes bundled with the kernel specification and no additional installation is required.
CodeMirror.defineSimpleMode("mikrokosmos", {
  start: [
    // Comments
    {regex: /\#.*/, token: "comment"},
    // Interpreter commands
    {regex: /\:load|\:verbose|\:ski|\:restart|\:types|\:color/, token: "atom"},
    // Bindings
    {regex: /(.*?)(\s*)(=)(\s*)(.*?)$/,
     token: ["def", null, "operator", null, "variable"]},
    // Operators
    {regex: /[=!]+/, token: "operator"},
  ],
  meta: {
    dontIndentStates: ["comment"],
    lineComment: "#"
  }
});
[fn:codemirror]: Documentation for CodeMirror can be found in its official page: https://codemirror.net/
<<section-jupyterhub>>
JupyterHub manages multiple instances of independent single-user Jupyter notebooks. It has been used to serve Mikrokosmos notebooks and tutorials to students. In order to install Mikrokosmos on a server and use it as the root user, we perform the following steps.
- Cloning the libraries into /usr/lib/mikrokosmos, so that they are available system-wide.
- Installing the Mikrokosmos interpreter into /usr/local/bin. In this case, we can choose not to install Mikrokosmos from source, but simply copy the binaries and check the availability of the ncurses library.
- Installing the Mikrokosmos Jupyter kernel as usual.
A JupyterHub server was made available at iemath1.ugr.es. Our server used an SSL certificate and OAuth authentication via GitHub. Mikrokosmos tutorials and exercises were installed for every student.
<<section-javascript>> The GHCjs[fn:ghcjs] compiler allows transpiling from Haskell to Javascript. Its foreign function interface allows a Haskell function to be passed as a continuation to a Javascript function.
A particular version of the Main.hs module of Mikrokosmos was written in order to provide a mikrokosmos function callable from Javascript. This version includes the standard libraries automatically and reads blocks of text as independent Mikrokosmos commands. The relevant use of the foreign function interface is shown in the following code, which provides mikrokosmos as a Javascript function once the code is transpiled.
foreign import javascript unsafe "mikrokosmos = $1"
set_mikrokosmos :: Callback a -> IO ()
In particular, the following is an example of how to call Mikrokosmos from Javascript.
button.onclick = function () {
  editor.save();
  outputcode.getDoc().setValue(mikrokosmos(inputarea.value).mkroutput);
  textAreaAdjust(outputarea);
}
A small script has been written in Javascript to help with the task of embedding Mikrokosmos into a web page. It can be included directly from the following address, using GitHub as a CDN.
The script will convert any HTML script tag written as follows into a CodeMirror pad where Mikrokosmos can be executed.
<div class="mikrojs-console">
<script type="text/mikrokosmos">
(λx.x)
... your code
</script>
</div>
The Mikrokosmos tutorials are an example of this feature and can be seen in Figure mikrokosmosjstutorial. They can be accessed from the following address.
[fn:ghcjs]: The GHCjs documentation is available on its web page https://github.com/ghcjs/ghcjs
The Mikrokosmos documentation as a Haskell library is included in its own code. It uses Haddock, a tool that generates documentation from annotated Haskell code and is the de facto standard for documenting Haskell software.
Dependencies and packaging details for Mikrokosmos are specified in a file distributed with the source code called mikrokosmos.cabal. It is used by the package managers stack and cabal to provide the necessary libraries even if they are not available system-wide. The stack tool is also used to package the software, which is uploaded to *Hackage*.
<<section-testing>> Tasty is the Haskell testing framework of our choice for this project. It allows the user to create a comprehensive test suite combining multiple types of tests. The Mikrokosmos code is tested using the following techniques
- unit tests, in which individual core functions are tested independently of the rest of the application;
- property-based testing, in which multiple test cases are created automatically in order to verify that a specified property always holds; this has been useful to test our implementation of several algorithms on the lambda calculus;
- golden tests, a special case of unit tests in which the expected results of an IO action, as described on a file, are checked to match the actual ones; they have been used to check correctness of the Mikrokosmos language.
We are using the HUnit library for unit tests. It tests particular cases of type inference, unification and parsing. The following is an example of a unit test, as found in tests.hs. It checks that the type inference of the identity term is correct.
-- Checks that the type of λx.x is exactly A → A
testCase "Identity type inference" $
  typeinference (Lambda (Var 1)) @?= Just (Arrow (Tvar 0) (Tvar 0))
We are using the QuickCheck library for property-based tests. It tests transformation properties of lambda expressions. In the following example, it tests that any De Bruijn expression keeps its meaning when translated into a λ-term.
-- Tests if translation preserves meaning
QC.testProperty "Expression -> named -> expression" $
  \expr -> toBruijn emptyContext (nameExp expr) == expr
We are using the tasty-golden package for golden tests. Mikrokosmos can be passed a file as an argument to interpret it and show only the results. This feature is used to create a golden test in which the interpreter is asked to provide the correct interpretation of a given file. This file is called testing.mkr, and contains library definitions and multiple tests. Its expected output is testing.golden. For example, the following Mikrokosmos code can be found on the testing file.
:types on
caseof (inr 3) (plus 2) (mult 2)
While the expected output is the following.
-- types: on
-- λa.λb.(a (a (a (a (a (a b)))))) ⇒ 6 :: (A → A) → A → A
<<sec-versioncontrolci>> Mikrokosmos uses git as its version control system and the code, which is licensed under GPLv3, can be publicly accessed on the following GitHub repository:
Development takes place on the development git branch and permanent changes are released into the master branch. Some smaller repositories have been used in the development; they directly depend on the main one.
- https://github.com/mroman42/mikrokosmos-js
- https://github.com/mroman42/jupyter-mikrokosmos
- https://github.com/mroman42/mikrokosmos-lib
The code uses the Travis CI continuous integration system to run tests and check that the software builds correctly after each change and in a reproducible way on a fresh Linux installation provided by the service.
<<sec-programming-untyped>>
This section explains how to use untyped λ-calculus to encode data structures such as booleans, linked lists, natural numbers or binary trees. All of this is done in pure λ-calculus, avoiding the addition of new syntax or axioms.
This presentation follows the Mikrokosmos tutorial on λ-calculus, which aims to teach how it is possible to program using untyped λ-calculus without discussing more advanced technical topics such as those we addressed on Chapter sec-untypedlambda. It also follows the exposition on cite:selinger13 of the usual Church encodings.
All the code on this section is valid Mikrokosmos code.
In the interpreter, λ-abstractions are written with the symbol \, representing a λ. This is a convention used in some functional languages such as Haskell or Agda. Any alphanumeric string can be a variable, and it can be defined to represent a particular λ-term using the = operator. As a first example, we define the identity function (id), function composition (compose) and a constant function on two arguments which always returns the first one untouched (const).
id = \x.x
compose = \f.\g.\x.f (g x)
const = \x.\y.x
Evaluation of terms will be presented as comments to the code, as in the following example.
compose id id -- [1]: λa.a ⇒ id
It is important to notice that multiple-argument functions are defined as higher-order one-argument functions that return functions as results. These intermediate functions are also valid λ-terms. For example,
discard = const id
is a function that discards one argument and returns the identity, id. This way of defining multiple-argument functions is called the currying of a function, in honor of the American logician Haskell Curry (see cite:haskell58). It is a particular instance of a deeper fact we will detail in Chapter sec-cartesianclosedlambdacalculus: exponentials are defined by the adjunction $\hom(A \times B, C) \cong \hom(A, C^B)$.
<<technique-on-inductive>> We will implicitly use a technique in the majority of our data encodings that allows us to write an encoding for any inductively generated algebraic data. This technique is used without explicit comment in cite:selinger13 and represents the basis of what is called the Church encoding of data in λ-calculus.
We start by considering the usual inductive representation of a data type with constructors, as we do when representing a syntax with a Backus-Naur form. For example, the naturals can be written as \[ \langle \mathit{nat} \rangle ::= \mathsf{Z} \mid \mathsf{S}\ \langle \mathit{nat} \rangle. \]
It is not possible to directly encode constructors in λ-calculus. Even if we were able to declare constants, they would have no computational content; the data structure would not reduce under any λ-term, and we would need at least the ability to pattern-match on the constructors to define functions on them. Our λ-calculus would need to be extended with additional syntax for every new data structure.
Our technique, instead, is to define a data term as a function on multiple arguments representing the missing constructors. For instance, the number 3, generated by the constructors as S (S (S Z)), is encoded as the λ-term λs.λz.s (s (s z)), which takes the interpretations of the two constructors as its arguments.
This acts as the definition of an initial algebra over the constructors (see Section sec-algebras) and lets us compute over instances of that algebra by instantiating it on particular cases. We will develop explicit examples of this technique in the following sections.
Booleans can be defined as the data generated by a pair of nullary constructors, each one representing a truth value, as in
true = \t.\f.t
false = \t.\f.f
Note that true and the const function we defined earlier are exactly the same term up to α-conversion. The same thing happens with false and discard. The absence of types prevents any effort to discriminate between these two uses of the same λ-term. Another side-effect of this definition is that our true and false terms can be interpreted as binary functions choosing between two arguments.
We can test this interpretation on the interpreter to get the expected results.
true id const --- [1]: id
false id const --- [2]: const
And this inspires the definition of an ifelse combinator as the identity, when applied to boolean values.
ifelse = \b.b
(ifelse true) id const --- [1]: id
(ifelse false) id const --- [2]: const
The usual logic gates can be defined using this interpretation of the booleans.
and = \p.\q.p q p
or = \p.\q.p p q
not = \b.b false true
xor = \a.\b.a (not b) b
implies = \p.\q.or (not p) q
xor true true --- [1]: false
and true true --- [2]: true
Our definition of natural numbers is inspired by the Peano natural numbers. We use two constructors:
- zero is a natural number, written as Z;
- the successor of a natural number is a natural number, written as S;
following the Backus-Naur form we gave when discussing how to encode inductive data in Section technique-on-inductive.
0 = \s.\z.z
succ = \n.\s.\z.s (n s z)
This definition of 0 is trivial: given a successor function and a zero, return zero. The successor function seems more complex, but it uses the same underlying idea: given a number, a successor and a zero, apply the successor to the interpretation of that number using the same successor and zero.
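For instance, succ 0 β-reduces to the numeral 1; the interpreter would display something like the following.
succ 0 --- [1]: λa.λb.(a b) ⇒ 1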
We can then name some natural numbers as follows, even if we cannot define an infinite number of terms as we might wish.
1 = succ 0
2 = succ 1
3 = succ 2
4 = succ 3
5 = succ 4
6 = succ 5
...
The interpretation of a natural number is a function that takes a "successor" and a "zero" and applies the successor that number of times; we can test this by using not as the successor and true as the starting value.
5 not true --- [1]: false
4 not true --- [2]: true
A doubling function can then be defined by interpreting a number with the successor applied twice at each step.
double = \n.\s.\z.n (compose s s) z
double 3 --- [3]: 6
Addition and multiplication can be defined along the same lines: a sum interprets m with the successor, starting from the interpretation of n; a product interprets m using the n-fold successor.
plus = \m.\n.\s.\z.m s (n s z)
mult = \m.\n.\s.\z.m (n s) z
plus 2 1 --- [1]: 3
mult 2 4 --- [2]: 8
The predecessor function is much more complex than the previous ones. As we can see, it is not trivial how we could compute the predecessor using the limited form of induction that Church numerals allow.
Stephen Kleene, one of the students of Alonzo Church, discovered how to write the predecessor function only after thinking about it for a long time (and he only discovered it during a long visit to the dentist’s, which is the reason why this definition is often called the wisdom tooth trick, see cite:crossley75). We will use a slightly different version of that definition, as we consider it simpler to understand.
We will start by defining a reversed composition operator, called inc, and we will study what happens when a number is interpreted using it; that is, we define the following operator.
inc = \f.\g.\h.h (g f)
\f.3 (inc f) --- [1]: λa.λb.λc.c (a (a (b a)))
\f.4 (inc f) --- [2]: λa.λb.λc.c (a (a (a (b a))))
\f.5 (inc f) --- [3]: λa.λb.λc.c (a (a (a (a (b a)))))
This allows us to use the b argument to discard the first application of the a argument and return the same number without its last constructor. Thus, our definition of pred is the following.
pred = \n.\s.\z.(n (inc s) (\x.z) (\x.x))
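We can check the definition on a few inputs; note that, with this encoding, the predecessor of zero is again zero.
pred 3 --- [1]: 2
pred 0 --- [2]: 0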
From the definition of pred, some predicates on numbers can be defined. The first predicate is a function distinguishing a successor from a zero; it will be used later to build more complex ones. It is built by applying a const false function n times to a true constant: only if it is applied 0 times will it return a true value.
iszero = \n.(n (const false) true)
iszero 0 --- [1]: true
iszero 2 --- [2]: false
From this predicate we can derive predicates on ordering and equality, using a truncated subtraction minus, defined by iterating the predecessor.
minus = \m.\n.(n pred m)
leq = \m.\n.(iszero (minus m n))
eq = \m.\n.(and (leq m n) (leq n m))
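They can be tested as follows.
leq 2 3 --- [1]: true
eq 2 2 --- [2]: true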
We need two constructors to represent lists: a nil constant, representing the empty list, and a cons constructor, appending an element to the head of a list.
nil = \c.\n.n
cons = \h.\t.\c.\n.(c h (t c n))
The interpretation of a list as a higher-order function is precisely its own fold: the list takes replacements for the cons and nil constructors and evaluates itself on them. The fold function, and several list functions defined in terms of it, can thus be written as follows.
fold = \c.\n.\l.(l c n)
sum = fold plus 0
prod = fold mult 1
all = fold and true
any = fold or false
length = fold (\h.\t.succ t) 0
sum (cons 1 (cons 2 (cons 3 nil))) --- [1]: 6
all (cons true (cons true (cons true nil))) --- [2]: true
The two most commonly used particular cases of fold, and frequent examples of the functional programming paradigm, are the following.
- The map function applies a function $f$ to every element of a list.
- The filter function removes the elements of the list that do not satisfy a given predicate; it filters the list, leaving only elements that satisfy the predicate.
They can be defined as follows.
map = \f.(fold (\h.\t.cons (f h) t) nil)
filter = \p.(fold (\h.\t.((p h) (cons h t) t)) nil)
The following examples show them in action.
mylist = cons 1 (cons 2 (cons 3 nil))
sum (map succ mylist) --- [1]: 9
length (filter (leq 2) mylist) --- [2]: 2
Lists have been defined using two constructors, and binary trees will be defined using the same technique. The only difference with lists is that the cons constructor is replaced by a node constructor, which takes two binary trees as arguments. That is, a binary tree is
- an empty tree; or
- a node, containing a label, a left subtree, and a right subtree.
Defining functions using a fold-like combinator is again very simple due to the chosen representation. We need a variant of the usual fold acting on three arguments: the label, the left subtree and the right subtree.
-- Binary tree definition
node = \x.\l.\r.\f.\n.(f x (l f n) (r f n))
-- Example on natural numbers
mytree = node 4 (node 2 nil nil) (node 3 nil nil)
triplesum = \a.\b.\c.plus (plus a b) c
mytree triplesum 0 --- [1]: 9
<<sec-fixed-points>>
A fixpoint combinator is a term representing a higher-order function that, given any function f, returns a fixed point of f; that is, a term fix f satisfying fix f = f (fix f). Such a combinator exists, and it can be defined and used as follows.
fix := (\f.(\x.f (x x)) (\x.f (x x)))
fix (const id) --- [1]: id
Where := defines a function without trying to evaluate it to a normal form; this is useful in cases like the previous one, where the function has no normal form. Examples of its applications are a factorial function or a Fibonacci function, as in
fact := fix (\f.\n.iszero n 1 (mult n (f (pred n))))
fib := fix (\f.\n.iszero n 1 (plus (f (pred n)) (f (pred (pred n)))))
fact 3 --- [1]: 6
fib 3 --- [2]: 5
Note the use of := in these definitions: the term fix itself has no normal form, so trying to fully evaluate it would not terminate. Our evaluation strategy, however, will always find a way to reduce a term to its normal form when one exists, as we saw in Corollary cor-leftmosttheorem, even if the term contains subterms without a normal form.
fix -- diverges
true id fix -- evaluates to id
false id fix -- diverges
Other examples of the interpreter dealing with non-terminating terms include infinite lists, as in the following examples, where we take the first element of an infinite list without having to evaluate it completely, and compare a finite number with an infinite number arising as the fixed point of the successor function.
-- Head of an infinite list of zeroes
head = fold const false
head (fix (cons 0)) --- [1]: 0
-- Compare infinity with other numbers
infinity := fix succ
leq infinity 6 --- [2]: false
These definitions unfold as
- $\mathsf{fix\ (cons\ 0)} = \mathsf{cons\ 0\ (cons\ 0\ (cons\ 0\ \dots))}$, an infinite list of zeroes;
- $\mathsf{fix\ succ} = \mathsf{succ\ (succ\ (succ\ \dots))}$, an infinite natural number.
As a final example, we define a minimization operator that finds the first natural causing a given predicate to return true. This, with all the previous considerations about how to program with natural numbers, can be used to prove that Gödel’s μ-recursive functions can be encoded inside untyped lambda calculus, and thus that it is Turing-complete.
mu := \p.fix (\f.\n.(p n) n (f (succ n))) 0
mu (\n.eq (mult 2 n) 10) --- [1]: 5
<<sec-programming-typed>> This section explains how to use the simply typed λ-calculus to encode compound data structures and proofs in intuitionistic logic. We will use the interpreter as a typed language and, at the same time, as a proof assistant for the intuitionistic propositional logic.
This presentation of simply typed structures follows the Mikrokosmos tutorial and the previous sections on simply typed λ-calculus (see Section sec-simplytypedlambda). All the code on this section is valid Mikrokosmos code.
Types can be activated with the command :types on. If types are activated, the interpreter will infer (see Section sec-typeinference) the principal type of every term before its evaluation. The type will then be displayed after the result of the computation.
If a term is found to be non-typeable, Mikrokosmos will output an error message signaling the fact. In this way, the evaluation of λ-terms that could potentially not terminate is prevented. Only typed λ-terms will be evaluated while the option :types is on; this ensures the termination of every computation on typed terms.
Note that the evaluation of compound λ-expressions where the fixpoint operators appear applied to other terms can terminate, but the terms are still non typeable.
Until now, we have only used the function type. That is to say, we have been working in the implicational fragment of the simply-typed lambda calculus we described when first presenting the typing rules. We are now going to extend our type system in the same sense in which we extended that simply-typed lambda calculus (see Section sec-extendingstlc). The following types are added to the system.
| Type | Name          | Description                      |
|------+---------------+----------------------------------|
| →    | Function type | Functions from a type to another |
| ×    | Product type  | Cartesian product of types       |
| +    | Union type    | Disjoint union of types          |
| ⊤    | Unit type     | A type with exactly one element  |
| ⊥    | Void type     | A type with no elements          |
And the following typed constructors are added to the language.
| Constructor | Type                            | Description               |
|-------------+---------------------------------+---------------------------|
| (-,-)       | A → B → A × B                   | Pair of elements          |
| fst         | (A × B) → A                     | First projection          |
| snd         | (A × B) → B                     | Second projection         |
| inl         | A → A + B                       | First inclusion           |
| inr         | B → A + B                       | Second inclusion          |
| caseof      | (A + B) → (A → C) → (B → C) → C | Case analysis of a union  |
| unit        | ⊤                               | Unital element            |
| abort       | ⊥ → A                           | Empty function            |
| absurd      | ⊥ → ⊥                           | Particular empty function |
They correspond to the constructors we described in previous sections. The only new term is the absurd function, which is only a particular case of abort, useful when we want to make explicit that we are deriving an instance of the empty type. This addition will only make the logical interpretation in the following sections clearer.
Now it is possible to define a new encoding of the booleans with a uniform type. The type ⊤ + ⊤ has two inhabitants, inl unit and inr unit; and they can be used by case analysis.
btrue = inl unit
bfalse = inr unit
bnot = \a.caseof a (\a.bfalse) (\a.btrue)
bnot btrue --- [1]: (INR UNIT) ⇒ bfalse :: A + ⊤
bnot bfalse --- [2]: (INL UNIT) ⇒ btrue :: ⊤ + A
With these extended types, Mikrokosmos can be used as a proof checker for intuitionistic propositional logic by virtue of the Curry-Howard correspondence.
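As a small example of this use, the K combinator is itself a proof of the proposition $A → (B → A)$, and the first projection proves that a conjunction implies its first component. A hypothetical session (the exact type-variable names and parenthesization may differ):
\a.\b.a --- [1]: λa.λb.a :: A → B → A
fst --- [2]: FST :: (A × B) → A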
<<section-proof-lem>>
Under the logical interpretation of Mikrokosmos, we can transcribe proofs in intuitionistic logic to λ-terms and check them on the interpreter. The translation between logical propositions and types is straightforward, except for the negation of a proposition $A$, which is encoded as the function type $A → ⊥$.
Note that this is, in fact, a constructive proof. Although it seems to use the intuitionistically forbidden technique of proving by contradiction, it is actually only proving a negation. There is a difference between assuming $¬A$ to derive a contradiction, which proves the negation $¬A$; and assuming $¬A$, deriving a contradiction, and concluding $A$, which amounts to double negation elimination and is not intuitionistically valid.
This can be translated into the Mikrokosmos implementation of simply typed λ-calculus as the following term, whose type is precisely the translation of the proposition $¬¬(A ∨ ¬A)$.
notnotlem = \f.absurd (f (inr (\a.f (inl a))))
notnotlem
--- [1]: λa.(ABSURD (a (INR λb.(a (INL b))))) :: ((A + (A → ⊥)) → ⊥) → ⊥
The derivation tree can be seen directly on the interpreter as Figure mikrogentzen shows.
<<section-categories>>
Categories are algebraic structures that capture the notion of composition. They consist of objects linked by composable arrows, to which associativity and identity laws apply. Thus, a category has to rely on some notion of collection of objects. When interpreted inside set theory, this term can denote sets or proper classes. We want to talk about categories containing subsets of the class of all sets, and thus it is necessary to allow the objects to form a proper class (which is not itself a set) in order to avoid inconsistencies such as Russell’s paradox. This is why we will consider a particular class of categories of small set-theoretical size to be especially well-behaved.
A different approach, however, would be to simply take objects and arrows as fundamental concepts of our theory. These foundational concerns will not cause any explicit problem in this presentation of category theory, so we will keep it deliberately open to both interpretations.
The collection of morphisms between two objects $A$ and $B$ of a category is written as $\hom(A,B)$.
Objects in category theory are an atomic concept and can be only studied by their morphisms; that is, by how they relate to each other. In other words, the essence of a category is given not by its objects, but by the morphisms between them and how composition is defined. It is so much so, that we will consider two objects essentially equivalent (and we will call them isomorphic) whenever they relate to other objects in the exact same way. This occurs if we can find an invertible morphism connecting both: composition by this morphism or its inverse will translate arrows from one object to the other.
As expected, being isomorphic is an equivalence relation. In particular, reflexivity follows from the fact that the identity is its own inverse, $\mathrm{id} = \mathrm{id}^{-1}$; symmetry follows from the symmetry of the definition; and transitivity follows from the fact that the composition of two isomorphisms is again an isomorphism.
We can notice that morphisms are an abstraction of the notion of structure-preserving maps between mathematical structures. Two structures can be identified if they are related by an isomorphism. From this perspective, it seems natural to ask how injectivity and surjectivity can be described only in terms of composition. Monomorphisms and epimorphisms will be abstractions of the usual injective and surjective homomorphisms, respectively.
Note that we could have chosen a stronger and non-equivalent notion to generalize the notions of injective and surjective functions.
By virtue of Proposition prop-unicityinverse, a morphism that is both a retraction and a section is an isomorphism. In general, however, not every epimorphism is a section and not every monomorphism is a retraction. Consider for instance a category with two objects and a single morphism connecting them; it is a monomorphism and an epimorphism, but it has no inverse. In any case, we will usually work with the more general notion of monomorphisms and epimorphisms.
Products and sums are very widespread notions in mathematics. Whenever a new structure is defined, it is common to ask what the product or sum of two of these structures would be. Examples of products are the cartesian product of sets, the product topology or the product of abelian groups; examples of sums are the disjoint union of sets, the topological sum or the free product of groups. Following this idea, we can consider certain structures to constitute a category and ask what the product or the sum of two objects of that category should be.
We will abstract categorically these notions in terms of universal properties. This point of view, however, is an important shift with respect to how these properties are classically defined. We will not define the product of two objects in terms of their internal structure (categorically, objects are atomic and do not have any); but in terms of all the other objects, that is, in terms of the complete structure of the category. This turns inside-out the focus of the definitions. Moreover, objects defined in terms of universal properties are usually not uniquely determined, but only determined up to isomorphism. This reinforces our previous idea of considering two isomorphic objects in a category as essentially the same object.
Note that neither the product nor the sum of two objects necessarily exists in a given category; but when they do exist, they are essentially unique, that is, determined up to isomorphism. This justifies writing them simply as $A × B$ and $A + B$.
Many mathematical structures, such as sets, groups, or partial orders, are particular cases of a category. Apart from these, we will be also interested in categories whose objects are known mathematical structures (the category of all groups, the category of all sets, and so on). The following are examples on how general the definition of a category is.
<<sec-functorsnatural>>
“Category” has been defined in order to define “functor” and “functor” has been defined in order to define “natural transformation”.
– Saunders MacLane, Categories for the working mathematician, cite:maclane42.
Functors and natural transformations were defined for the first time by Eilenberg and MacLane in cite:maclane42 and cite:eilenberg45 while studying cohomology theory. While initially they were devised mainly as a language for this study, they have proven its foundational value with the passage of time. The notion of naturality will be a key element of our presentation of algebraic theories and categorical logic.
Functors can be seen as homomorphisms of categories, preserving composition and identity arrows. A functor $F$ between two categories ${\cal C}$ and ${\cal D}$ consists of
- an object function, $F : \mathrm{obj}({\cal C}) → \mathrm{obj}({\cal D})$;
- and an arrow function, $F : \hom(A, B) → \hom(FA, FB)$, for any two objects $A,B$ of the category;
such that $F(g \circ f) = Fg \circ Ff$ for every pair of composable morphisms, and $F(\mathrm{id}_A) = \mathrm{id}_{FA}$ for every object $A$.
We now consider certain properties of functors. We say that a functor is faithful when its arrow functions are injective, and full when they are surjective; that is, these notions correspond to injectivity and surjectivity of the arrow function between any two fixed objects.
Note, however, that a faithful functor need not be injective on objects nor on morphisms. In particular, two morphisms between different pairs of objects may be sent to the same morphism by a faithful functor.
Unfortunately, this notion of isomorphism of categories is too strict in some cases. Sometimes, the two compositions $GF$ and $FG$ are not equal to the identity functors, but only isomorphic to them in a sense we will make precise shortly; this leads to the weaker notion of equivalence of categories.
We have defined functors relating structures on different categories, and now we could ask whether any two given functors can be related in a canonical way, involving no arbitrary choices. An example is the relation between the identity functor and the double dual functor on vector spaces. They are related by the canonical isomorphism that sends each vector to the operator that evaluates a function over it. In this case, the isomorphism can be described without making any explicit reference to the space. On the other hand, the isomorphism relating a vector space to its dual depends on the choice of a basis: a family of canonical linear isomorphisms $σ_V \colon V → V^{*}$ translating between a space and its dual would have to be compatible with all linear maps, and no such family exists. Naturality will make this idea of compatibility precise.
A natural transformation $α \colon F ⇒ G$ between two functors $F, G \colon {\cal C} → {\cal D}$ is a family of morphisms $α_C \colon FC → GC$, indexed by the objects of ${\cal C}$, such that the following square commutes for every morphism $f \colon C → C'$.
\[\begin{tikzcd}
C \dar[swap]{f} & FC \rar{α_C}\dar[swap]{Ff} & GC \dar{Gf} \\
C' & FC' \rar{α_{C'}} & GC'
\end{tikzcd}\]
Sometimes, we also say that the family of morphisms
& B \dlar{g} & & & F B \dlar{F g} \arrow{rrr}{α_B} & & & G B \dlar{G g} \\
C & & & F C \arrow{rrr}{α_C} & & & G C &
\end{tikzcd}\]
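Restricting again to endofunctors on Agda types, a natural transformation can be sketched as a family of functions indexed by the objects; for such parametric families the naturality square holds automatically (a free theorem), although a complete treatment would state it with an equality type. The names Nat and head are our own.
\begin{minted}{agda}
-- A sketch: natural transformations between Set-endofunctors as
-- families of components, one for each object A.
Nat : (Set → Set) → (Set → Set) → Set₁
Nat F G = (A : Set) → F A → G A

data Maybe (A : Set) : Set where
  nothing : Maybe A
  just    : A → Maybe A

data List (A : Set) : Set where
  []  : List A
  _∷_ : A → List A → List A

-- The component at A of a natural transformation List ⇒ Maybe.
head : Nat List Maybe
head A []       = nothing
head A (x ∷ xs) = just x
\end{minted}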
Natural isomorphisms are natural transformations in which every
component, every morphism of the parameterized family, is
invertible. In this case, the inverses form themselves a new
transformation, whose naturality follows from the naturality of the
original transformation. We say that
The notion of a natural isomorphism between functors allows us to
weaken the condition of strict equality we imposed when talking about
isomorphisms of categories (Definition def-isomorphismcategories).
We say that two categories
The next reasonable step is to ask how natural transformations compose. If we draw natural transformations between functors as double arrows, as in the following diagram, \[\begin{tikzcd} {\cal C} \arrow[bend left=50]{r}[name=U,below]{}{F} \arrow[bend right=50]{r}[name=D]{}[swap]{G} & {\cal D} \arrow[Rightarrow,from=U,to=D]{}{α} \end{tikzcd}\] we notice that two different notions of composition arise; we have a vertical composition of natural transformations, which, diagrammatically, composes the two sequential natural transformations on the left side into a single transformation on the right side of the following diagram; \[\begin{tikzcd}[column sep=large] {\cal C} \arrow[bend left=80]{r}[name=U,below]{}{F} \arrow[bend right=80]{r}[name=D]{}[swap]{H} \arrow[]{r}[name=C]{}[swap,xshift=-1.5ex]{G} \arrow[]{r}[name=C2,below]{}[]{} & {\cal D} \arrow[Rightarrow,from=U,to=C]{}{α} \arrow[Rightarrow,from=C2,to=D]{}{β} & {\cal C} \arrow[bend left=50]{r}[name=U,below]{}{F} \arrow[bend right=50]{r}[name=D]{}[swap]{H} & {\cal D} \arrow[Rightarrow,from=U,to=D]{}{α \cdot β} \end{tikzcd}\] and we have a horizontal composition of natural transformations, which composes the two parallel natural transformations on the left side into a single transformation on the right side of the following diagram. \[\begin{tikzcd}[column sep=large] {\cal C} \arrow[bend left=50]{r}[name=U,below]{}{F} \arrow[bend right=50]{r}[name=D]{}[swap]{G} & {\cal D} \arrow[Rightarrow,from=U,to=D]{}{α} \arrow[bend left=50]{r}[name=UU,below]{}{F'} \arrow[bend right=50]{r}[name=DD]{}[swap]{G'} & {\cal E} \arrow[Rightarrow,from=UU,to=DD]{}{α'} & {\cal C} \arrow[bend left=50]{r}[name=U,below]{}{F' ∘ F} \arrow[bend right=50]{r}[name=D]{}[swap]{G' ∘ G} & {\cal E} \arrow[Rightarrow,from=U,to=D]{}{α' ∘ α} \end{tikzcd}\]
Naturality of this family follows from the naturality of its two
factors. Given any morphism
GA \rar{Gf}\dar{β_A} &
GB \dar{β_B} \\
HA \rar{Hf} &
HB
\end{tikzcd}\]
Naturality of this family follows again from the naturality of its two
factors. Given any morphism
F'FB \rar{F'α_B} &
F'GB \rar{α'_{GB}} &
G’GB
\end{tikzcd}\]
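Both compositions can be sketched componentwise in the same restricted Agda setting; passing the arrow function of $F'$ explicitly is an assumption of this sketch, since plain type families do not carry it.
\begin{minted}{agda}
-- A sketch of both compositions for Set-endofunctors.
Nat : (Set → Set) → (Set → Set) → Set₁
Nat F G = (A : Set) → F A → G A

-- Vertical composition: (α · β)_A = β_A ∘ α_A.
vert : {F G H : Set → Set} → Nat F G → Nat G H → Nat F H
vert α β A x = β A (α A x)

-- Horizontal composition: (α' ∘ α)_A = α'_{GA} ∘ F'(α_A),
-- where the first argument is the arrow function of F'.
horiz : {F G F' G' : Set → Set}
      → ({A B : Set} → (A → B) → F' A → F' B)
      → Nat F G → Nat F' G'
      → Nat (λ A → F' (F A)) (λ A → G' (G A))
horiz {G = G} fmap' α α' A x = α' (G A) (fmap' (α A) x)
\end{minted}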
<<section-constructions>> Before continuing our study of functors and universal properties, we provide some constructions and examples of categories we will need in the future. In particular, we consider some variations on the definition of functors and we use them to construct new categories.
The product category (see cite:eilenberg45) of two categories
- objects of the form
$\left\langle C,D \right\rangle$, where $C ∈ {\cal C}$ and $D ∈ {\cal D}$; - and morphisms of the form
$\pair{f,g} : \pair{C,D} → \pair{C',D'}$, where $f \colon C → C'$ and $g \colon D → D'$ are morphisms in their respective categories.
The identity morphism of any object
The product functor of two functors
{\cal C}’ &
{\cal C}’ × {\cal D}’ \rar[swap]{Q’}\lar{P’}&
{\cal D}’
\end{tikzcd}\]
Explicitly, it acts on arrows as
Bifunctors are functors from a product category, but they can also be regarded as functors on two variables. As we will show in the following proposition, they are completely determined by the two families of functors we obtain when we fix any of the arguments. We will also study a characterization of naturality between these functors.
For many constructions in categories, it makes sense to consider how
the process of reversing all the arrows yields a new construction.
The opposite category ${\cal C}^{op}$ of a category
Reversing all the arrows is a process that directly translates every
property of the category into its dual property. A morphism
Given two categories
In cite:lawvere09, multiple examples of usual mathematical constructions in terms of functor categories can be found. Graphs, for instance, can be seen as functors; and graph homomorphisms as the natural transformations between them.
<<section-universality>>
A universal property is commonly given in mathematics by some
conditions of existence and uniqueness on morphisms, representing some
sort of natural isomorphism. They can be used to define certain
constructions up to isomorphism and to operate with them in an
abstract setting. We will formally introduce universal properties
using universal arrows from an object
A universal arrow from
C \rar[swap]{u}\urar{g} &
GD_0 \uar[swap,dashed]{Gf} &
D_0 \uar[dashed,swap]{∃! f}
\end{tikzcd}\]
Note how a universal arrow is, equivalently, the initial object of
the comma category
The categorical dual of a universal arrow from an object to a functor is the notion of universal arrow from a functor to an object. Note how, in this case, we avoid the name couniversal arrow, as both arrows represent what we usually call a universal property.
A dual universal arrow from
D_0 & GD_0 \rar[swap]{v} & C
\end{tikzcd}\]
A representation of a functor from a locally small category to the
category of sets,
The Yoneda lemma is one of the first results in pure category theory. It allows us to embed any small category into a functor category over the category of sets. It will be very useful when studying presheaves and algebraic theories (see Section sec-lawveretheories) to create models of these theories.
In the definition of product, we chose two objects of the category,
considered all possible cones over them, and picked the
universal one.
\[\begin{tikzcd}[column sep=tiny]
&&&&&
D \dar[dashed]{∃!} \ar[bend left,color=myred]{ddr}\ar[bend right,swap,color=myred]{ddl} &\&
C \drar[color=myred]\dlar[color=myred] &&&&
A × B \drar[color=myred]\dlar[color=myred] &
A &&
B &\phantom{gap}&
A &&
B
\end{tikzcd}\]
In this previous commutative diagram,
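In the particular category of Agda types, this universal property can be sketched directly: the dashed mediating arrow is the pairing of the two arrows of the cone, and the commutativity conditions hold by definition. The record and the pairing ⟨_,_⟩ below are our own illustrative definitions.
\begin{minted}{agda}
-- A sketch: the universal property of the product in the category
-- of Agda types, with the mediating morphism ⟨f,g⟩.
record _×_ (A B : Set) : Set where
  constructor _,_
  field
    fst : A
    snd : B
open _×_

-- The projections of the cone A ← A × B → B are fst and snd.
-- Given any other cone f : C → A, g : C → B, the mediating arrow:
⟨_,_⟩ : {C A B : Set} → (C → A) → (C → B) → C → A × B
⟨ f , g ⟩ c = f c , g c
-- Commutativity, fst ∘ ⟨f,g⟩ = f and snd ∘ ⟨f,g⟩ = g,
-- holds definitionally.
\end{minted}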
We will be able to create new constructions on categories by formalizing the notion of cone and generalizing to arbitrary bases, given as functors from arbitrary index categories. Constant functors are the first step toward formalizing the notion of cone.
We could say that
&\phantom{gap}&& &&& & \phantom{gap} && \\
& ⋅ \ar{ld}\ar{r}\ar{drr} & ⋅ \ar{dr} & \ar[Rightarrow, shorten <= 3mm, shorten >= 14mm]{uurrrr}[xshift=-3mm]{Δ} &&&
& ⋅ \ar{ld}\ar{r}\ar{drr} & ⋅ \ar{dr} & \\
⋅ \arrow{rrr}[description, yshift=-5mm]{\big{\cal J}} &&&
⋅ \ar[Rightarrow, yshift=3mm,swap, shorten <= 2mm, shorten >= 2mm]{rrr}{F}
&\phantom{gap}&&
⋅ \arrow{rrr}[description, yshift=-5mm]{\big{F \cal J}} & & & ⋅ \\
%&{\cal J}&& &&& &F{\cal J}&&
\end{tikzcd}\]
The components of the natural transformation appear highlighted in
the diagram. The naturality of the transformation implies that each
triangle
\[\begin{tikzcd}[column sep=tiny,row sep=small]
&⋅
\drar[color=myred, shorten >= -1.5mm, shorten <= -2mm]
\dlar[color=myred, shorten >= -1.5mm, shorten <= -2mm] & \\
⋅
\ar[shorten >= -1mm, shorten <= -1mm]{rr} &&
⋅
\end{tikzcd}\]
on that cone must be commutative. Thus, natural transformations are
a way to recover all the information of an arbitrary *index category*
That is, for every natural transformation
r & Δ r \rar[swap]{v} & F
\end{tikzcd}\]
commutes. This directly reflects the universality of the cone we described
earlier and proves that limits are unique up to isomorphism.
By choosing different index categories, we will be able to define multiple different constructions on categories as limits.
For our first example, we will take the following category,
called
A
\arrow[bend left=15]{rr}{f}
\arrow[bend right=15, swap]{rr}{g} &&
B
\end{tikzcd}\]
Note how the right part of the cone is completely determined
as
Pullbacks are defined as limits whose index category is
& A
\drar[bend left,swap,color=myred]{q}
\dlar[bend right,color=myred]{p}
\arrow[color=myred]{d} & \\
X\rar[swap]{f} & Z & Y\lar{g}
\end{tikzcd}\]
Again, the central arrow of the diagram is determined as
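In the category of sets, and analogously in Agda types, equalizers and pullbacks admit concrete descriptions as subsets of (products of) the given objects; a minimal sketch, with local definitions of Σ-types and equality:
\begin{minted}{agda}
-- A sketch: equalizers and pullbacks computed in the category of
-- Agda types, using Σ-types and an equality type.
data _≡_ {A : Set} (a : A) : A → Set where
  refl : a ≡ a

record Σ (A : Set) (B : A → Set) : Set where
  constructor _,_
  field
    fst : A
    snd : B fst

-- The equalizer of f, g : A → B is the subset where they agree.
Eq : {A B : Set} → (A → B) → (A → B) → Set
Eq {A} f g = Σ A (λ x → f x ≡ g x)

-- The pullback of f : X → Z and g : Y → Z.
Pullback : {X Y Z : Set} → (X → Z) → (Y → Z) → Set
Pullback {X} {Y} f g = Σ X (λ x → Σ Y (λ y → f x ≡ g y))
\end{minted}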
A colimit is the dual notion of a limit. We could consider
cocones to be the dual of cones and pick the universal one.
Once an index category
&\phantom{gap}&& &&& & \phantom{gap} && \\
& ⋅ \ar{ld}\ar{r}\ar{drr} & ⋅ \ar{dr} & \ar[Rightarrow, shorten <= 3mm, shorten >= 14mm]{uurrrr}[xshift=-3mm]{Δ} &&&
& ⋅ \ar{ld}\ar{r}\ar{drr}\ar[color=myblue]{uu} & ⋅ \ar{dr} \ar[color=myblue,bend right]{uul} & \\
⋅ \arrow{rrr}[description, yshift=-5mm]{\big{\cal J}} &&&
⋅ \ar[Rightarrow, yshift=3mm,swap, shorten <= 2mm, shorten >= 2mm]{rrr}{F}
&\phantom{gap}&&
⋅ \arrow[color=myblue, bend left]{uuur} \arrow{rrr}[description, yshift=-5mm]{\big{F \cal J}} &&&
⋅ \arrow[color=myblue, bend right]{uuull} \\
\end{tikzcd}\]
is an example of a cocone. The universal one would be the
That is, for every natural transformation
r\uar[dashed]{∃! f} &
Δ r \uar[dashed]{Δ f} &
F \lar{v}\ular[swap]{w}
\end{tikzcd}\]
commutes. This directly reflects the universality of the cocone we described
earlier and proves that colimits are unique up to isomorphism.
Coequalizers are the dual of equalizers; colimits of functors
from
a
\ar[bend left, color=myblue, swap]{uur}
\ar[color=myblue]{ur}
\arrow[bend left=15]{rr}{f}
\arrow[bend right=15, swap]{rr}{g} &&
b
\ar[color=myblue, swap]{ul}{e}
\ar[color=myblue, bend right, swap]{uul}{e’}
\end{tikzcd}\]
As the arrows on the right of the cocone are completely determined by
the fact that the diagram must be commutative, we can simplify the diagram
as follows.
\[\begin{tikzcd}
a
\ar[bend left=15]{r}{f}
\ar[bend right=15, swap]{r}{g} &
b
\ar{r}{e}
\ar[swap]{dr}{e’}&
\mathrm{coeq}(f,g) \ar[dashed]{d}{∃!} \\
&&
d
\end{tikzcd}\]
Pushouts are the dual of pullbacks; colimits whose index category
is
& a
\arrow[dashed,swap]{u}{∃!} & \\
x
\urar[bend left,color=myblue]{p}
\arrow[bend left,color=myblue]{uur}{p’} &
z
\lar{f}
\rar[swap]{g}
\arrow[color=myblue]{u}
\arrow[bend left,color=myblue]{uu} &
y
\ular[bend right,swap,color=myblue]{q}
\arrow[bend right,swap,color=myblue]{uul}{q'} \\
\end{tikzcd}\]
We can define the pushout of two morphisms
<<section-adjoints-monads>>
An adjunction from categories
Naturality of
hom(FX’,Y)
\rar[swap]{\varphi} &
hom(X’,GY) &
hom(FX,Y’)
\rar[swap]{\varphi} &
hom(X,GY’)
\end{tikzcd}\]
commute for every
A different and perhaps more intuitive way to write adjunctions is by
logical diagrams (see cite:lawvere309). An adjunction
\begin{prooftree}
\AXC{\begin{tikzcd}[fragile,ampersand replacement=\&] FX \rar{f}\& Y \end{tikzcd}}
\doubleLine
\UIC{\begin{tikzcd}[fragile,ampersand replacement=\&] X \rar{\varphi{(f)}}\& GY \end{tikzcd}}
\end{prooftree}
to emphasize that, for each morphism
In the following two propositions, we will characterize all this
information in terms of natural transformations made up of universal
arrows. Given an adjunction
- the unit $η$ as the family of morphisms $η_X = \varphi(\id_{FX}) \colon X → GFX$, for each $X$;
- the counit $ε$ as the family of morphisms $ε_Y = \varphi^{-1}(\id_{GY}) \colon FGY → Y$, for each $Y$.
Diagrammatically, they can be obtained by taking
\begin{prooftree} \AXC{\begin{tikzcd}[fragile,ampersand replacement=\&] FX \rar{\id}\& FX \end{tikzcd}} \doubleLine \UIC{\begin{tikzcd}[fragile,ampersand replacement=\&] X \rar{η_X}\& GFX \end{tikzcd}} \AXC{\begin{tikzcd}[fragile,ampersand replacement=\&] FGY \rar{ε_Y}\& Y \end{tikzcd}} \doubleLine \UIC{\begin{tikzcd}[fragile,ampersand replacement=\&] GY \rar{\id}\& GY \end{tikzcd}} \noLine \BIC{} \end{prooftree}
<<sec-monads>>
The notion of monad is pervasive in category theory. A monad is a certain kind of endofunctor that arises naturally when considering adjoints. Monads will be useful to model algebraic notions inside category theory and to model a variety of effects and contextual computations in functional programming languages. We will only prove here that each adjunction gives rise to a monad, but it is also true that there are multiple ways to turn a monad back into an adjunction. With this result, we aim to illustrate how adjoints capture another fundamental concept of functional programming.
A monad is a functor
- $η \colon \Id \tonat T$, called unit; and
- $μ \colon T^2 \tonat T$, called multiplication;
such that the following two diagrams commute.
\[\begin{tikzcd}
T^3 \rar{Tμ}\dar{μ T} & T^2\dar{μ} \\
T^2 \rar{μ} & T
\end{tikzcd}
\qquad
\begin{tikzcd}
\Id ∘ T \rar{η T}\drar[swap]{≅} & T^2\dar{μ} & \lar[swap]{Tη}\dlar{≅} T ∘ \Id \\
& T &
\end{tikzcd}\]
The first diagram encodes a form of associativity of the
multiplication, while the second one encodes the fact that the unit acts as a two-sided identity for it.
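For endofunctors on Agda types, this definition can be sketched as a record, leaving the two laws as comments since stating them requires an equality type; the Maybe monad is a standard illustrative instance.
\begin{minted}{agda}
-- A sketch: monads on the category of Agda types.
record Monad (T : Set → Set) : Set₁ where
  field
    fmap : {A B : Set} → (A → B) → T A → T B
    η    : {A : Set} → A → T A            -- unit
    μ    : {A : Set} → T (T A) → T A      -- multiplication
  -- Laws, as comments: μ ∘ fmap μ ≡ μ ∘ μ      (associativity)
  --                    μ ∘ η ≡ id ≡ μ ∘ fmap η (unit laws)

data Maybe (A : Set) : Set where
  nothing : Maybe A
  just    : A → Maybe A

maybeMonad : Monad Maybe
maybeMonad = record
  { fmap = λ { f nothing → nothing ; f (just x) → just (f x) }
  ; η    = just
  ; μ    = λ { nothing → nothing ; (just m) → m }
  }
\end{minted}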
<<sec-algebras>> In category theory, the word algebra denotes certain structures that can be described as universal fixed points of endofunctors. They model inductive constructions in terms of category theory.
Let
X \rar{h} & Y
\end{tikzcd}\]
This defines a category
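A minimal Agda sketch of an algebra for an endofunctor follows; the commutativity condition that the morphisms of this category must satisfy is indicated as a comment, since stating it needs an equality type.
\begin{minted}{agda}
-- A sketch: an algebra for an endofunctor F on Agda types.
record Algebra (F : Set → Set) : Set₁ where
  field
    carrier : Set
    algebra : F carrier → carrier
-- A morphism of algebras h : carrier X → carrier Y must satisfy
--   h ∘ algebra X ≡ algebra Y ∘ F h,
-- mirroring the commutative square above.
\end{minted}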
<<section-presheaves>>
Presheaves can be thought of as sets parameterized by a category, or as generalized sets. As we will study in the following chapters, categories of presheaves share a lot of categorical structure with the category of sets and can be used as non-standard models of lambda calculus.
Specifically, a presheaf on the category
<<sec-cartesianclosedlambdacalculus>>
<<sec-lawveretheories>>
The usual notion of algebraic theory is given by a set of k-ary operations
for each
An algebraic theory can be built from its operations and axioms as
follows: objects represent natural numbers,
The category of models, $\Mod{\cal C}(\mathbb{A})$, is the subcategory of the functor category ${\cal C}^{\mathbb{A}}$ containing the functors that preserve all finite products, with the usual natural transformations between them. We say that a category is algebraic if it is equivalent to a category of the form $\Mod{\cal C}(\mathbb{A})$.
By construction we know that, if an equation can be proved from the axioms, it is valid in all models (our semantics are sound); but we would like to also prove that, if every model of the theory satisfies a particular equation, it can actually be proved from the axioms of the theory (our semantics are complete). In general, we can actually prove a stronger, but maybe unsatisfying, result.
This proof feels rather disappointing because this trivial model is not even set-theoretic in general; but we can go further and assert the existence of a universal model in a presheaf category via the Yoneda embedding (Definition def-yoneda-functor).
These adjoints are given by terminal, product and exponential objects
respectively, written as follows.
\begin{prooftree}
\AXC{\begin{tikzcd}[fragile,ampersand replacement=\&] ∗ \rar{}\& ∗ \end{tikzcd}}
\doubleLine
\UIC{\begin{tikzcd}[fragile,ampersand replacement=\&] C \rar{!}\& 1 \end{tikzcd}}
\AXC{\begin{tikzcd}[fragile,ampersand replacement=\&] C,C \rar{f,g}\& A,B \end{tikzcd}}
\doubleLine
\UIC{\begin{tikzcd}[fragile,ampersand replacement=\&] C \rar{\pair{f,g}}\& A × B \end{tikzcd}}
\AXC{\begin{tikzcd}[fragile,ampersand replacement=\&] C × A \rar{f}\& B \end{tikzcd}}
\doubleLine
\UIC{\begin{tikzcd}[fragile,ampersand replacement=\&] C \rar{\widetilde{f}}\& B^A \end{tikzcd}}
\noLine
\TIC{}
\end{prooftree}
In particular, exponentials are characterized by the family of
evaluation morphisms
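In the category of Agda types, the last two of these adjunctions are familiar from programming: pairing is the mediating arrow into the product, and the product/exponential adjunction is currying, whose counit is precisely the evaluation morphism. A sketch, with illustrative names:
\begin{minted}{agda}
-- A sketch: the product/exponential adjunction in the category of
-- Agda types, where hom(C × A, B) ≅ hom(C, B ^ A) is curry/uncurry.
record _×_ (A B : Set) : Set where
  constructor _,_
  field
    fst : A
    snd : B
open _×_

curry : {C A B : Set} → (C × A → B) → (C → (A → B))
curry f c a = f (c , a)

uncurry : {C A B : Set} → (C → (A → B)) → (C × A → B)
uncurry g p = g (fst p) (snd p)

-- The counit of the adjunction is the evaluation morphism.
eval : {A B : Set} → (A → B) × A → B
eval p = fst p (snd p)
\end{minted}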
In general, cartesian-closed categories with functors preserving
products and exponentials form a category called
If we read
A λ-theory
\pair{a,b} ≡ \pair{a',b'}, & \mbox{ for each } a ≡ a',\ b ≡ b'; \\
\fst\pair{a,b} ≡ a,\ \snd\pair{a,b} ≡ b, & \mbox{ for each } a : A, b : B; \\
\fst\ m ≡ \fst\ m',\ \snd\ m ≡ \snd\ m', & \mbox{ for each } m ≡ m'; \\
m ≡ \pair{\fst\ m,\snd\ m}, & \mbox{ for each } m : A × B; \\
f\ x ≡ f'\ x', & \mbox{ for each } f ≡ f', x ≡ x'; \\
(λ x.f\ x) ≡ f, & \mbox{ for each } f : A → B; \\
(λ x.m) ≡ (λ x.m'), & \mbox{ for each } m ≡ m'; \\
(λ x.m)\ n ≡ m[n/x] & \mbox{ for each } m : B, n : A.
\end{array}\end{center}
Two types are isomorphic,
We consider the category
Apart from the natural definition of isomorphism, we consider the
weaker notion of equivalence of theories. Two theories with translations
Thus, we can say that the simply-typed lambda calculus is the internal language of cartesian closed categories; each lambda theory represents a cartesian closed category.
<<section-diagonalarguments>>
We can now talk internally about cartesian closed categories using
lambda calculus. Note how each closed λ-term
We want to note here how abstracting a seemingly banal theorem
has resulted in a myriad of deep results. Moreover, Lawvere’s
original theorem can be replicated without exponentials in any
category with all products, taking adjoints in
The rules of union and void types in simply-typed lambda calculus can be rewritten to match the structure of these adjoints. \begin{prooftree} \AXC{$$} \UIC{$Γ, z:0 \vdash \mathrm{abort} : C$} \AXC{$Γ, a:A \vdash c : C$} \AXC{$Γ, b:B \vdash c’ : C$} \BIC{$Γ, u : A+B \vdash \mathrm{case}\ u\ \mathrm{of}\ c;c’ : C$} \noLine \BIC{} \end{prooftree} In a similar way to how our previous fragment of simply-typed lambda calculus was the internal language of cartesian closed categories, the extended version of lambda calculus defined in Section sec-extendingstlc is the internal language of bicartesian closed categories.
The existence of an initial algebra in a cartesian closed category is equivalent to the existence of a type with an elimination rule representing an induction principle. Types generated by an inductive rule are called inductive types and are common constructs in functional programming languages. Inductive types and their properties can be encoded as initial algebras satisfying a certain universal property. In particular, we will see examples of inductive types and how all of them can be modeled as initial algebras over a certain class of functors, as in the sketch below.
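For instance, the natural numbers arise as the initial algebra of the functor $F(X) = 1 + X$; the unique algebra morphism out of it is the usual fold, from which functions such as addition can be derived. A minimal Agda sketch:
\begin{minted}{agda}
-- A sketch: the natural numbers as the initial algebra of
-- F(X) = 1 + X. The fold below is the unique algebra morphism
-- into any other algebra (z , s), giving the recursion principle.
data ℕ : Set where
  zero : ℕ
  succ : ℕ → ℕ

fold : {A : Set} → A → (A → A) → ℕ → A
fold z s zero     = z
fold z s (succ n) = s (fold z s n)

-- Example: addition, defined through the universal property.
_+_ : ℕ → ℕ → ℕ
n + m = fold m succ n
\end{minted}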
<<sec-locallydependent>>
The motivation for this section is the interpretation of logical formulas
as subsets. Every predicate
Given a function
A \rar{f} & B
\end{tikzcd}\]
A particular case of a substitution is logical weakening: a proposition
on the set
A × B \rar{π} & A
\end{tikzcd}\]
Although it seems like an uninteresting particular case, once we
formalize this operation as a functor, existential and universal
quantifiers can be obtained as adjoints to weakening.
In
- proving the implication
$P(π(a,b)) → Q(a,b)$ for each pair $(a,b)$ amounts to proving that $P(a) → \left(∀ b ∈ B, Q(a,b)\right)$ for each $a$; \begin{prooftree} \AXC{\begin{tikzcd}[fragile,ampersand replacement=\&, row sep=tiny] \& A × B \& \\
\left\{ (a,b) \mid P(a) \right\} \urar[hook] \ar{rr} \&\& \left\{ (a,b) \mid Q(a,b) \right\} \ular[hook] \end{tikzcd}} \doubleLine \UIC{\begin{tikzcd}[fragile,ampersand replacement=\&,row sep=tiny] \left\{ a \mid P(a) \right\} \drar[hook] \ar{rr} \&\& \left\{ a \mid ∀ b∈ B, Q(a,b) \right\} \dlar[hook] \\ \& A \& \end{tikzcd}} \end{prooftree} - and proving that
$(∃ b∈ B, P(a,b)) → Q(a)$ for each $a$ is the same as proving that $P(a,b) → Q(π(a,b))$ for each pair $(a,b)$. \begin{prooftree} \AXC{\begin{tikzcd}[fragile,ampersand replacement=\&, row sep=tiny] \& A × B \& \\
\left\{ (a,b) \mid P(a,b) \right\} \urar[hook] \ar{rr} \&\& \left\{ (a,b) \mid Q(a) \right\} \ular[hook] \end{tikzcd}} \doubleLine \UIC{\begin{tikzcd}[fragile,ampersand replacement=\&,row sep=tiny] \left\{ a \mid ∃ b∈ B, P(a,b) \right\} \drar[hook] \ar{rr} \&\& \left\{ a \mid Q(a) \right\} \dlar[hook] \\ \& A \& \end{tikzcd}} \end{prooftree}
Note how, in this case, we are considering adjunction diagrams in the slice category. A generalization of this idea to other categories will extend our categorical logic with quantifiers.
We define a locally cartesian closed category as a category with a
terminal object and pullbacks
We proved earlier that generalized sets, or presheaves, were cartesian closed (Proposition proposition-presheaves-ccc); we will now prove that they are actually locally cartesian closed.
In the same way that cartesian closed categories model the types of
the simply typed lambda calculus, locally cartesian closed categories
model dependent types that can depend on elements of another type.
Each dependent type
Typing contexts for dependent type theory are given as a list of
variables
\[
Γ = (a_1 : A_1, a_2 : A_2, …, a_n : A_n)
\]
where each type
The interpretation of a dependent type theory as a category takes
contexts
A particular kind of substitution is given by display maps. If a term
can be constructed in a given context,
This way, each type
& \intr{Γ}
\end{tikzcd}\]
The locally cartesian closed structure of a category induces new type constructors in the type theory: dependent pairs and dependent functions. Their meaning under the Curry-Howard interpretation is that of the existential and universal quantifiers, respectively.
Dependent pair types, or Σ-types, can be seen as a
generalized version of product types. Given a family of types parameterized
by another type, $\{B(a)\}_{a : A}$
In a locally cartesian closed category, a type
& \intr{Γ} \rar{a}\drar[equal] & \intr{Γ, x : A} \dar{π_A} \\
&& \intr{Γ}
\end{tikzcd}\]
This can be rewritten as the following introduction rule.
\begin{prooftree}
\AXC{$Γ \vdash a : A$}
\AXC{$Γ \vdash b : B[a/x]$}
\BIC{$Γ \vdash \pair{a,b} : ∑_{x:A}B$}
\end{prooftree}
The adjunction in the slice category can be then particularized
in the following two cases, taking
\intr{Γ,A,B} \ar{rr}{δ\circ π_B} \urar[bend left] \&\&
\intr{Γ,A,A} \ular[bend right]
\end{tikzcd}}
\doubleLine
\UIC{\begin{tikzcd}[fragile,ampersand replacement=\&,column sep=tiny,row sep=small]
\intr{Γ, ∑_{a:A}B} \ar{rr}{\mathrm{fst}} \drar[bend right] \&\&
\intr{Γ,A} \dlar[bend left] \\
\& \intr{Γ} \&
\end{tikzcd}}
\AXC{\begin{tikzcd}[fragile,ampersand replacement=\&,column sep=tiny,row sep=small]
\& \intr{Γ,A} \& \\
\intr{Γ, A, B} \ar{rr}{id} \urar[bend left] \&\&
\intr{Γ, A, B} \ular[bend right]
\end{tikzcd}}
\doubleLine
\UIC{\begin{tikzcd}[fragile,ampersand replacement=\&,column sep=tiny,row sep=small]
\intr{Γ, ∑_{a:A}B} \ar{rr}{\mathrm{snd}} \drar[bend right] \&\&
\intr{Γ, A, B} \dlar[bend left] \\
\& \intr{Γ} \&
\end{tikzcd}}
\noLine
\BIC{}
\end{prooftree}
These two equalities represent two elimination rules.
\begin{prooftree}
\AXC{$Γ \vdash m : ∑_{x:A} C$}
\UIC{$Γ \vdash \fst(m) : A$}
\AXC{$Γ \vdash m : ∑_{x:A} C$}
\UIC{$Γ \vdash \snd(m) : C[\fst(m)/x]$}
\noLine
\BIC{}
\end{prooftree}
We have two beta rules
Dependent function types, or Π-types, can be seen as a generalized
version of function types. Given a family of types parameterized by another
type, $\{B(a)\}_{a : A}$
In a locally cartesian closed category, given a type
Thus, categorically, the type $∏_{a:A}B(a)$ is given in the empty context as the
adjoint $π_A^* \dashv Π_{π_A}$ of the morphism representing the type
& \intr{Γ, a : A}
&
& \intr{Γ}
\end{tikzcd}\]
That is, we have the following adjunction and counit in the slice
category
\begin{prooftree}
\AXC{\begin{tikzcd}[fragile,ampersand replacement=\&,column sep=tiny,row sep=small]
\& \intr{Γ,A} \& \\
\intr{Γ,A} \ar{rr}{b} \urar[bend left] \&\&
\intr{Γ,A,B} \ular[bend right]
\end{tikzcd}}
\doubleLine
\UIC{\begin{tikzcd}[fragile,ampersand replacement=\&,column sep=tiny,row sep=small]
\intr{Γ} \ar{rr}{(λ a.b)} \drar[bend right] \&\&
\intr{Γ, ∏_{a:A}B} \dlar[bend left] \\
\& \intr{Γ} \&
\end{tikzcd}}
\AXC{\begin{tikzcd}[fragile,ampersand replacement=\&,column sep=tiny,row sep=small]
\& \intr{Γ,A} \& \\
\intr{Γ,A,∏_{a:A}B} \ar{rr}{\mathrm{app}} \urar[bend left] \&\&
\intr{Γ,A,B} \ular[bend right]
\end{tikzcd}}
\doubleLine
\UIC{\begin{tikzcd}[fragile,ampersand replacement=\&,column sep=tiny,row sep=small]
\intr{Γ, ∏_{a:A}B} \ar{rr}{\mathrm{id}} \drar[bend right] \&\&
\intr{Γ, ∏_{a:A}B} \dlar[bend left] \\
\& \intr{Γ} \&
\end{tikzcd}}
\noLine
\BIC{}
\end{prooftree}
which can be rewritten as introduction and elimination rules.
\begin{prooftree}
\AXC{$Γ, a : A \vdash b : B$}
\UIC{$Γ \vdash (λ a.b) : ∏_{a:A}B$}
\AXC{$Γ \vdash a : A$}
\AXC{$Γ \vdash f : ∏_{a:A}B$}
\BIC{$Γ \vdash f\ a : B(a)$}
\noLine
\BIC{}
\end{prooftree}
We have the equalities
<<sec-examples-dependent-types>>
Surprisingly, the theorem we have just proved reads classically as
“If for all $a:A$ there exists a $b:B$ such that $R(a,b)$, then there exists a function $g : A → B$ such that for each $a:A$, $R(a,g(a))$.”
and this is a form of the axiom of choice. Have we just proved
the axiom of choice? The key insight here is that
“If for all $a:A$ we have a rule to explicitly construct a $b:B$ such that $R(a,b)$, then we can use that rule to define a function $g : A → B$ such that for each $a:A$, $R(a,g(a))$.”
and this trivial-sounding theorem is known as the theorem of choice. The moral of this example is that, when we work in type theory, we are working inside constructivist mathematics and the existential quantifier has a fundamentally different meaning. Later, we will see a technique that will allow us to recreate the classical existential quantifier.
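A sketch of this theorem of choice in Agda makes its triviality apparent: without truncation, a proof of the premise already carries a witness for each $a$, and the choice function is obtained by projection. The name choice is our own.
\begin{minted}{agda}
-- A sketch: the "theorem of choice" in Martin-Löf type theory.
record Σ (A : Set) (B : A → Set) : Set where
  constructor _,_
  field
    fst : A
    snd : B fst
open Σ

choice : {A B : Set} {R : A → B → Set}
       → ((a : A) → Σ B (λ b → R a b))
       → Σ (A → B) (λ g → (a : A) → R a (g a))
choice f = (λ a → fst (f a)) , (λ a → snd (f a))
\end{minted}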
<<sec-equalitytypes>>
Equality in our theory comes from an adjunction, as was first proposed
in cite:lawvere70. The equality type
between elements of type
In general we have that, for any two objects in a slice category,
B\rar{f}\uar[dashed,bend left]{\widetilde k} & A
\end{tikzcd} &\qquad& \begin{tikzcd}[column sep=tiny]
B \drar[swap]{f}\ar[dashed]{rr}{k} && C\dlar{g} \\
& A &
\end{tikzcd}\end{array}\]
Given any
In particular, let
A\rar{\vartriangle}\uar[dashed,bend left]{\widetilde k} & A× A
\end{tikzcd} &\qquad& \begin{tikzcd}[column sep=tiny]
A \drar[swap]{\vartriangle}\ar[dashed]{rr}{k} && C\dlar{π} \\
& A× A &
\end{tikzcd}\end{array}\]
A section
We can still generalize the rule to allow
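In Agda syntax, the equality type and its induction principle, the J-eliminator, can be sketched as follows; this is the standard based formulation, which may differ cosmetically from the one used in the libraries of Section sec-verified.
\begin{minted}{agda}
-- A sketch: the equality type and its induction principle.
data _≡_ {A : Set} (a : A) : A → Set where
  refl : a ≡ a

J : {A : Set} {a : A} (C : (x : A) → a ≡ x → Set)
  → C a refl
  → {x : A} (p : a ≡ x) → C x p
J C c refl = c
\end{minted}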
<<sec-subobject-propositions>>
A subobject classifier is an object
X\rar[dashed]{χ} & Ω
\end{tikzcd}\]
is a pullback square.
Now, if we have a type given by a monomorphism,
\intr{Γ} \rar[dashed]{χ_P} & \intr{Ω}
\end{tikzcd}\]
A type
<<sec-proptrunc>>
Propositions are types, but not all types are propositions; a type
For instance, there are four distinct elements of
\[
∑_{(n,m) : \mathbb{N} × \mathbb{N}} (n + m = 3),
\]
each one of them providing an ordered pair of natural numbers that add up to three.
How should we represent truncations inside our theory? It can be proved (see cite:awodey04)
that propositional truncation is the left adjoint to the inclusion of propositions
into general types; and, if we assume the existence of this adjoint in the category
we are working in, we can use propositional truncations. That is, if
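The interface this assumption provides can be sketched in Agda as a handful of postulates; the names below are illustrative and need not coincide with those of the libraries described later.
\begin{minted}{agda}
-- A sketch of the interface we assume for propositional truncation,
-- postulated rather than constructed.
data _≡_ {A : Set} (a : A) : A → Set where
  refl : a ≡ a

isProp : Set → Set
isProp P = (x y : P) → x ≡ y

postulate
  ∥_∥    : Set → Set
  ∣_∣    : {A : Set} → A → ∥ A ∥
  trunc  : {A : Set} → isProp ∥ A ∥
  ∥∥-rec : {A P : Set} → isProp P → (A → P) → ∥ A ∥ → P
\end{minted}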
<<section-topoi>>
Topoi (singular topos) are category-theoretic models of constructive mathematics; we can reason in their internal logic and rebuild large parts of mathematics inside their structure. Each topos is thus a universe of mathematics with different axioms and interpretations cite:bauer2017five; for example,
- inside Hyland's realizability topos, every function is computable and we can study Kleene's realizability theory (see cite:vanoosten08);
- the Dubuc topos provides a non-paradoxical formalization of the notion of “infinitesimal” used by Newton and Leibniz, and we can study synthetic differential geometry inside it (see cite:dubuc89);
- inside Johnstone's topological topos, we can reason with topological spaces and continuous functions between them (see cite:johnstone79), and under certain hypotheses, we can assume that all functions we can build are automatically continuous (see cite:escardo15).
Topoi are defined as locally cartesian closed categories with a subobject classifier in which every finite limit exists. Usually, we will be interested in W-topoi, or topoi with a natural numbers object (in the sense of Example example-naturalnumbersobj) cite:leinster10.
The study of any of these theories is beyond the scope of this text. However, as we have been relating type theory to the internal language of locally cartesian closed categories with enough structure, we know that type theory (depending on what constructions we assume) will have models in categories like these. We only describe a particular case developed by W. Lawvere while trying to provide an axiomatization of the category of sets.
Lawvere’s Elementary Theory of the Category of Sets cite:lawvere64
provides a foundation for mathematics based on category theory. It
describes the category
Using the notation we have developed so far, Lawvere’s axioms can be reduced to the following definition. A model of the Elementary Theory of the Category of Sets is
- a topos, that is, a locally cartesian closed category with all finite limits and a subobject classifier,
- which is /well-pointed/, meaning that any two morphisms $f,g \colon A → B$ are equal if and only if $f ∘ a = g ∘ a$ for each $a \colon 1 → A$; morphisms from the terminal object are called global elements, and this property can be thought of as function extensionality;
- which has a natural numbers object, in the sense of Example example-naturalnumbersobj;
- which satisfies the Axiom of Choice, meaning that point-surjective
morphisms have a section; in terms of our previous Example example-type-axiom-of-choice,
we could translate this internally as
\[ \left(∏_{a:A} \left\|∑_{b : B} f(b) = a \right\|\right) → \left\| ∑_{g : A → B}∏_{a:A} f(g(a)) = a \right\|, \]
for any
$f \colon B → A$ .
Note that the category
<<sec-mltt>> In this chapter, we will work exclusively inside dependent type theory and use it as a constructive foundation of mathematics. This has the added benefit that every formal proof in the system is a closed lambda term, and checking that a proof is correct amounts to typechecking the term. Explicitly, the type theory we have been describing in the previous sections corresponds to Martin-Löf type theory cite:nordstrom90. Two libraries of formalized mathematics, corresponding to two different foundational theories, have been built, containing a verified version of all the theorems of this chapter. See Section sec-verified for links to the formalized proofs.
Martin-Löf type theory and the internal language we have been describing so far can also be regarded as a programming language in which it is possible to formally specify and prove theorems about the code itself. Formal proofs in this theory can be written in any language with a powerful enough type system; examples include
- Agda, a programming language that implements a variant of Martin-Löf type theory;
- Coq cite:coq04, developed in 1984 at INRIA; it implements the Calculus of Constructions and was used, for example, to check a proof of the Four Color Theorem;
- Idris, a programming language implementing dependent types and using a slightly modified version of intensional equality types;
- NuPRL cite:nuprl86, a proof assistant implementing Martin-Löf extensional Type Theory, which is different from the intensional theory in how it defines equality types;
- Cubical cite:cohen16 and RedPRL cite:redprl provide experimental implementations of Cubical Type Theory, a different variant of type theory.
In this text, we will use Agda to write mathematics, taking ideas from cite:Conor17. We will check proofs in Martin-Löf type theory using its type system.
We can translate the categorical structure inside Agda as follows. Dependent products exist naturally as dependent functions of the language.
\setlength{\columnseprule}{0.4pt} \begin{multicols}{2} \begin{prooftree} \AXC{$Γ, a : A \vdash b : B$} \UIC{$Γ \vdash (λ a.b) : ∏_{a:A}B$} \end{prooftree} \vfill \columnbreak \vfill \centering\ExecuteMetaData[agda-mltt/latex/Snippets.tex]{mlttproduct} \vfill \end{multicols}
Dependent sums must be explicitly specified in the form of records: data structures in which the type of every element can depend on the previous ones.
\setlength{\columnseprule}{0.4pt} \begin{multicols}{2} \vfill \begin{prooftree} \AXC{$Γ \vdash a : A$} \AXC{$Γ \vdash b : B[a/x]$} \BIC{$Γ \vdash \pair{a,b} : ∑_{x:A}B$} \end{prooftree} \columnbreak \centering\ExecuteMetaData[agda-mltt/latex/Snippets.tex]{mlttsum} \end{multicols}
Naturals, and initial algebras in general, can be defined as inductive types with the data keyword. Functions over these types can be constructed using their universal property, as we did in Example example-naturalnumbersobj.
\setlength{\columnseprule}{0.4pt}
\begin{multicols}{2}
\vfill
\begin{tikzcd}[fragile]
1+\mathbb{N}\rar{} \dar[swap]{\pair{0,\mathrm{succ}}} &
1+hom(\mathbb{N},\mathbb{N}) \dar{\pair{\id, \mathrm{succ}\, ∘\, -}} \\
\mathbb{N}\rar{+} &
hom(\mathbb{N},\mathbb{N})
\end{tikzcd}
\columnbreak
\vfill
\centering\ExecuteMetaData[agda-mltt/latex/Snippets.tex]{mlttnat}
\end{multicols}
Equality types are a particular case of an inductive family of types. The induction principle over equalities is the J-eliminator we described in Section sec-equalitytypes.
\setlength{\columnseprule}{0.4pt} \begin{multicols}{2} \vfill \begin{prooftree} \AXC{$Γ \vdash a : A$} \UIC{$Γ \vdash \refl_a : a = a$} \end{prooftree} \columnbreak \vfill \centering\ExecuteMetaData[agda-mltt/latex/Snippets.tex]{mltteq} \end{multicols}
Using all this machinery, we can prove facts about equality by
induction and prove facts about the natural numbers, as in the
following example, where we prove that
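As an illustration of this style of proof, here is a self-contained example showing that zero is a right identity of addition; the concrete statement proved in the library snippet may differ.
\begin{minted}{agda}
-- An illustrative proof by induction on the natural numbers.
data ℕ : Set where
  zero : ℕ
  succ : ℕ → ℕ

_+_ : ℕ → ℕ → ℕ
zero   + m = m
succ n + m = succ (n + m)

data _≡_ {A : Set} (a : A) : A → Set where
  refl : a ≡ a

cong : {A B : Set} (f : A → B) {x y : A} → x ≡ y → f x ≡ f y
cong f refl = refl

+-zero : (n : ℕ) → (n + zero) ≡ n
+-zero zero     = refl
+-zero (succ n) = cong succ (+-zero n)
\end{minted}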
Finally, we can define propositions as those types where any two elements are equal and postulate that truncations are propositions. The induction principle for truncated types represents the adjunction we described in Section sec-proptrunc.
\setlength{\columnseprule}{0.4pt} \begin{multicols}{2} \vfill \begin{prooftree} \AXC{$Γ \vdash P : Ω$} \AXC{$Γ \vdash a : P$} \AXC{$Γ \vdash b : P$} \TIC{$Γ \vdash \mathrm{isProp}_P(a,b) : a = b$} \end{prooftree} \columnbreak \vfill \centering\ExecuteMetaData[agda-mltt/latex/Snippets.tex]{isprop} \end{multicols}
\setlength{\columnseprule}{0.4pt} \begin{multicols}{2} \vfill \begin{prooftree} \AXC{\begin{tikzcd}[fragile,ampersand replacement=\&] A \rar{}\& P \end{tikzcd}} \doubleLine \UIC{\begin{tikzcd}[fragile,ampersand replacement=\&] \|A\| \rar{}\& P \end{tikzcd}} \end{prooftree} \columnbreak \vfill \centering\ExecuteMetaData[agda-mltt/latex/Snippets.tex]{truncrec} \end{multicols}
Using categories, we have strengthened the propositional logic we
described in Section sec-naturaldeduction to a fully-fledged higher-order
logic in which we can construct mathematical proofs.
However, during all this process, we have not accepted the Law of
Excluded Middle; we do not assume that
This has deep effects on the mathematics we are working with. We are working at a great level of generality, where many results are just independent of our theory. Independence of statements like the following may be shocking to a classical mathematician: we cannot prove that the subset of a finite set is finite; we cannot prove that every real is equal or distinct from zero; we cannot prove that every ideal is contained in a maximal ideal; we cannot prove that every vector space has a basis (see cite:bauer2017five for more examples).
Why are we not simply accepting the excluded middle, as is common practice? The
Law of Excluded Middle would affect the computational properties
of the theory. For instance, whenever we create a natural number
in our theory, we expect it to be a numeral of the form
In any case, classical mathematics is a particular case of constructive mathematics. We can choose to work inside classical mathematics, although we would lose the computational content. When we choose not to, we are working in constructive mathematics with a programming language, as first proposed by Errett Bishop cite:bishop67.
<<sec-diaconescuth>> If we want to recover classical mathematics, we should try to model the axioms of the Elementary Theory of the Category of Sets in our locally cartesian closed categories. We already have natural numbers in Martin-Löf type theory, and well-pointedness, for instance, is called function extensionality in the type theory literature.
We postulate this in Agda as follows.
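A sketch of the postulate, in the form it is commonly stated (the exact statement in the library may differ slightly):
\begin{minted}{agda}
-- A sketch of the function extensionality postulate.
data _≡_ {A : Set} (a : A) : A → Set where
  refl : a ≡ a

postulate
  funext : {A : Set} {B : A → Set} {f g : (a : A) → B a}
         → ((a : A) → f a ≡ g a) → f ≡ g
\end{minted}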
The last ingredient is the Axiom of Choice. This, like in the case of the Law of Excluded Middle, will make us lose the computational content of the theory. In fact, we can prove that, in general, the Axiom of Choice implies the Law of Excluded Middle. This is the statement of Diaconescu’s theorem, for which we provide both a natural proof and a computer-verified proof.
This proof has been implemented as follows. Note that we have only assumed the Axiom of Choice (and therefore, the Law of Excluded Middle) during this particular section of the text. In the next section, we return to constructive mathematics, working with the real numbers and extracting algorithms from proofs.
<<sec-dedekindreals>>
In Reals.agda, we provide a formalized construction of the Dedekind
positive reals in intensional Martin-Löf type theory, implemented in
Agda. This implementation constructs reals as Dedekind cuts over the
positive dyadic rationals
Natural numbers are constructed as initial algebras (as described in
Example example-naturalnumbersobj). Dyadic rationals
are constructed as pairs of naturals endowed with a normalization property:
a fraction
- A Dedekind cut, $\mathrm{cut}_r : D → Ω$, representing a proposition parameterized over the positive dyadic rationals. Given $q : D$, the proposition $\mathrm{cut}_r(q)$ is true if and only if $r < q$.
- A proof $\| ∑_{q : D} \mathrm{cut}_r(q) \|$ witnessing that the Dedekind cut is inhabited, that is, there exists some $q$ such that $r < q$, providing an upper bound on the real number.
- Two functions that witness that the cut is round. That is, a dyadic $q : D$ is in the cut if and only if there is a smaller $p : D$ also in the cut. Symbolically, \small \[ \left( ∏_{q : D} \mathrm{cut}_r(q) → \left\| ∑_{p : D} (p < q) × \mathrm{cut}_r(p) \right\| \right) × \left( ∏_{q : D} \left\| ∑_{p : D} (p < q) × \mathrm{cut}_r(p) \right\| → \mathrm{cut}_r(q) \right). \] \normalsize
The core definition of the Agda implementation has the following shape.
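The sketch below is a simplified reconstruction of that shape: it writes Set in place of the universe of propositions Ω, postulates the dyadics and their order, and omits the propositional truncations; the actual code in Reals.agda differs in these details.
\begin{minted}{agda}
-- A simplified sketch of the positive reals as Dedekind cuts.
postulate
  D   : Set          -- positive dyadic rationals (assumed)
  _<_ : D → D → Set  -- their order (assumed)

record Σ (A : Set) (B : A → Set) : Set where
  constructor _,_
  field
    fst : A
    snd : B fst

record _×_ (A B : Set) : Set where
  constructor _,_
  field
    fst : A
    snd : B

record ℝ⁺ : Set₁ where
  field
    cut       : D → Set
    inhabited : Σ D (λ q → cut q)
    roundDown : (q : D) → cut q → Σ D (λ p → (p < q) × cut p)
    roundUp   : (q : D) → Σ D (λ p → (p < q) × cut p) → cut q
\end{minted}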
Note that, under the intuitionistic interpretation, it is not the case
in general that $∏_{q : D} \mathrm{cut}_r(q) + ¬\, \mathrm{cut}_r(q)$ for an arbitrary
As a proof of concept, we define square roots, $\sqrt{-} : \mathbb{R} → \mathbb{R}$, as
\[
\mathrm{cut}_{\sqrt{r}}(q) =
\left\| ∑_{p : D} \mathrm{cut}_r(p) × \left(p < q^2\right) \right\|.
\]
We prove they are well-defined using several lemmas about the dyadic numbers
and we prove that
Dedekind cuts are not the only path we can take to define the real numbers in type theory. Reals can also be defined in terms of Cauchy sequences, but the resulting construction is not in general equivalent to ours. Only when we assume excluded middle do Dedekind and Cauchy reals become equivalent, recovering the classical notion of real number. A detailed discussion can be found in the last chapters of cite:hottbook.
<<sec-hott>>
And it soon became clear that the only long-term solution was somehow to make it possible for me to use computers to verify my abstract, logical, and mathematical constructions. When I first started to explore the possibility, computer proof verification was almost a forbidden subject among mathematicians. The primary challenge that needed to be addressed was that the foundations of mathematics were unprepared for the requirements of the task.
– Vladimir Voevodsky, cite:voevodsky14.
We have already discussed how equality types and path induction are defined (see Base.agda
and Equality.agda
). However, it is not possible to prove by path
induction that every path is equal to the reflexivity path, that is,
\[
∏_{x : A}∏_{p : x = x} (p = \refl),
\]
is not derivable from the induction principle. This is equivalent to the
principle of uniqueness of identity proofs; it is also equivalent to
Streicher's Axiom K cite:streicher93, and it is independent
of Martin-Löf type theory.
If we do not assume this axiom, we open the possibility of the existence of multiple different proofs of the same equality. The structure of these proofs, endowed with symmetry and transitivity, can be modeled as a groupoid; and this idea allows us to construct models of type theory where the principle of uniqueness of identity proofs does not hold (see cite:hofmann98). If we also consider equalities between proofs of equality, and equalities between proofs of equalities between equalities, and so on, we get a weak ω-groupoid structure cite:berg11. Following Grothendieck's homotopy hypothesis, groupoids can be regarded as homotopical spaces, where equalities are paths and equalities between equalities are homotopies between paths (some work on this hypothesis can be read in cite:tamsamani96).
In any case, as the existence of this non-trivial structure is
independent of the theory, we need to introduce new axioms or new types
with nontrivial equalities if we want to profit from this interpretation.
The introduction of Voevodsky's Univalence Axiom leads to Homotopy Type Theory, an extension of
Martin-Löf type theory where we can work with this groupoid structure.
Under the identification of higher groupoids and homotopical types,
the new axiom allows us to reason in some sort of synthetic homotopy
theory, where paths and homotopies are primitive notions. For instance,
we can define the fundamental group (fundamental groupoid, if we also
want to consider higher structure) of a type
In this setting, results such as the Van Kampen theorem or the construction of Eilenberg-MacLane spaces have been formalized (see cite:hottbook).
We say that there is an equivalence between two types
In practical terms, this implies that isomorphic structures can
be identified. For example, we could consider the type of the
integers. In FundGroupCircle.agda
, we
use this fact to prove, inside type theory, that
We could think that this only proves that $\mathbb{Z}$ is the free group over a single generator. After all, this is what the circle is, when interpreted as a groupoid.
I think that the main problem with that view is that the groupoid structure is not present inside the theory; it is only visible from the outside. When we prove this, we need univalence to internalize the groupoid structure inside the theory, using $\mathbb{Z}$ as a bridge.
<<sec-verified>> All definitions and theorems from Sections sec-mltt and sec-hott have been formalized into two Agda libraries whose complete code is presented in navigable HTML format at the following links. Note that each function name is a link to its definition.
- https://mroman42.github.io/ctlc/agda-mltt/Total.html
- https://mroman42.github.io/ctlc/agda-hott/Total.html
In the Agda-mltt library, we define the basic types of Martin-Löf type theory. We have chosen to add propositions and propositional truncations, as described in Section sec-subobject-propositions, because of their expressive power, even if they are not part of the usual presentations of Martin-Löf type theory. In particular, they allow us to build the type of Dedekind positive reals (see Section sec-dedekindreals). We also prove Diaconescu's theorem and show how a classical setting can be recovered (see Section sec-diaconescuth). More than a hundred propositions about the natural numbers and the dyadic rationals have been formalized in order to construct the real numbers, each one being a function in dependently typed lambda calculus.
This first library assumes uniqueness of identity proofs (Axiom axiom-uip), which eases some proofs but which is incompatible with Univalence (Axiom axiom-univalence). This is why we will need a second library in order to work with the Univalence axiom; this second library will not assume uniqueness of identity proofs. We also assume well-pointedness in the form of function extensionality (Axiom axiom-funext) and, only when working with Diaconescu’s theorem (Theorem th-diaconescu), we assume the Axiom of Choice.
In the Agda-hott library, we define again basic types for Martin-Löf
type theory, and we again assume function extensionality (Axiom axiom-funext),
but this time we also assume the Univalence Axiom. A more detailed study
of the notion of equality and equivalence follows, defining multiple
equivalent notions of equivalence. We again define truncations and
relations, the natural numbers, the integers and some algebraic structures.
All this machinery is used while constructing the higher-inductive type
of the circle to prove that its fundamental group is
These two libraries are original developments. They both follow ideas mainly from the Homotopy Type Theory book cite:hottbook, but also from the Univalent Foundations library written with the Coq proof assistant cite:unimath and the Coq homotopy theory library cite:bauer17. The definition of the type of propositions follows cite:escardoagda. Our second library compares to the far more complete and more technical cite:hott-in:agda by presenting only the small subset of type theory necessary to compute the fundamental group of the circle, thus simplifying the whole structure of the library.
Categories are an exceptionally powerful and natural language to talk about mathematics. I feel that the knowledge about logic and foundations of mathematics I have gained while writing this text has been amplified by the fact that categories have always provided a comfortable framework to work with logic while avoiding delving excessively into more philosophical concerns. In addition, and perhaps surprisingly, categories create very reasonable programming languages! While this may be an oversimplification, it is the practical lesson we can learn from both Mikrokosmos and Agda. When I started to write this text, I was aware of some form of unspecified connection between logic, programming and categories, and I wanted to make these statements precise in my mind. In any case, I was not expecting to find such a fruitful and deeply interesting topic of study right at the intersection between mathematics and computation.
Categories must be taken seriously cite:lawvere1986taking. Many examples of this text refer to basic notions of mathematics seen from the perspective of category theory. Rather than seeing them as trivial identifications, it is sometimes illuminating to take this perspective. Induction, natural numbers and more complex initial algebras are an example of an idea that becomes considerably clearer when we speak in the internal language of a category.
Mikrokosmos was born as a didactic project to teach lambda calculus. While functional programming languages are based on lambda calculus, its essence gets blurred between the many constructions that are needed for a modern programming language to be of any utility, even when they do not have a strong theoretical basis. Many lambda calculus interpreters only aim to provide useful languages showing that programming in lambda calculus can be done, but they sometimes blend together many different aspects, and it can be difficult for the user to determine whether a particular construction can actually be translated back to categories. We could say that we have succeeded in the task of providing a clear interpreter teaching how to translate between lambda calculus and logic. I have had the opportunity to conduct successful seminars on the lambda calculus using Mikrokosmos as a tool, and the interpreter (only in its command-line version) has been downloaded more than a thousand times from the Hackage platform, exceeding my initial expectations and confirming there was a need for this kind of purely didactic experiment.
While proving on the order of a hundred lemmas about the natural numbers only with the help of the Agda type checker, the need for automation tools quickly arises. They are available in the form of tactic languages in more specialized proof assistants (see, for instance, LTac in Coq, cite:delahaye00); but it would be cleaner to use the same language both to prove and to automate. Reflection is an example of a technique that would help to leverage the power of Agda as a programming language to automate Agda as a proof checker; an example of its usage is described in cite:vanderwalt12. Perhaps, using these techniques, proving theorems on real analysis or encoding more complex theories such as synthetic differential geometry (cite:kock06) would be possible in Agda.
In this text, we have discussed how intuitionistic logic provides
a useful generalization of classical logic with semantics in
cartesian closed categories and a computational interpretation
on simply-typed lambda calculus. It seems plausible that a
similar discussion on monoidal categories and linear logic
could have been made. Linear logic can be interpreted as
a logic of resources where we can make a distinction
between, for example, two types of conjunctions:
It is conceivable that, in the future, mathematicians will be able to write and check their proofs directly into a computer as part of their routine. Reaching that stage will require a massive effort into the study of foundations and implementation of languages, coupled with the creation of tools and software easing the task for the working mathematician.
The complete code that generates this text, along with the code here presented can be found at https://github.com/mroman42/ctlc.
The Mikrokosmos lambda interpreter has its documented code published under a GNU General Public License v3.0 at https://github.com/mroman42/mikrokosmos. Code for a previous stable version on the Hackage platform can be found at https://hackage.haskell.org/package/mikrokosmos. An HTML version of the documented code can be accessed at https://mroman42.github.io/mikrokosmos/haddock/.
Mikrokosmos has been released along with the following resources.
- A User’s guide for the command line interpreter. It details the installation procedure and the core functionality. It can be found at https://mroman42.github.io/mikrokosmos/userguide.html.
- The online version of the interpreter, which can be accessed at https://mroman42.github.io/mikrokosmos/.
- An interactive Tutorial on the lambda calculus, following Sections sec-programming-untyped and sec-programming-typed from a more didactic perspective. It can be found at https://mroman42.github.io/mikrokosmos/tutorial.html.
- The Javascript interface and instructions on how to use it at https://github.com/mroman42/mikrokosmos-js.
- The IPython/Jupyter Mikrokosmos kernel, whose code can be found at https://github.com/mroman42/jupyter-mikrokosmos. Tutorials in Jupyter notebook format can be found at https://github.com/mroman42/mikrokosmos-tutorials.
- The standard libraries, which are also part of the interpreter from version 0.5.0. The code can be found at https://github.com/mroman42/mikrokosmos-lib.
The code for the Agda-mltt and Agda-hott libraries can be downloaded from the main repository. The easily-navigable HTML versions can be found at
- https://mroman42.github.io/ctlc/agda-mltt/Total.html,
- https://mroman42.github.io/ctlc/agda-hott/Total.html.
\small The opportunity of devoting my bachelor’s thesis to this fascinating subject has been made possible by Professor Pedro García-Sánchez and Professor Manuel Bullejos. They have provided me with corrections and useful guidelines for the text and they have gone beyond their obligation in their efforts to instruct me on how to expose these ideas clearly. Any deviation from this goal must be attributed to my own inexperience.
I would like to thank the LibreIM community in general and Ignacio Cordón, David Charte, Marta Andrés, José Carlos Entrena, Pablo Baeyens, Antonio Checa, Daniel Pozo, and Sofía Almeida in particular, for testing the interpreter and providing useful discussion and questions on the topics of this text. Professor Luis Merino, Professor Juan Julián Merelo and Braulio Valdivielso have made possible, and participated in, the organization of meetings and workshops on these ideas.
Finally, I would like to express my gratitude to Benedikt Ahrens and the organizers of the School and Workshop on Univalent Mathematics at the University of Birmingham, whose effort has given me the opportunity to learn the rudiments of type theory and the Univalent Foundations program.
This document has been written with Emacs26 and org-mode 9, using the
org
file format and LaTeX as intermediate format. The document takes
some configurations from the classicthesis
template by André Miede
and modifications by Adrián Ranea, Alejandro García and David
Charte. The minted
package has been used for code listings and the
tikzcd
package has been used for commutative diagrams. The document
is released under a Creative Commons BY-SA 3.0 license, while the
source code can be redistributed under the terms of the GNU General
Public License.
\normalsize
bibliographystyle:alpha bibliography:Bibliography.bib