Typechecking: A Typechecker for STLC

The has_type relation of the STLC defines what it means for a term to belong to a type (in some context). But it doesn't, by itself, give us an algorithm for checking whether or not a term is well typed.

Fortunately, the rules defining has_type are syntax directed -- that is, for every syntactic form of the language, there is just one rule that can be used to give a type to terms of that form. This makes it straightforward to translate the typing rules into clauses of a typechecking function that takes a term and a context and either returns the term's type or else signals that the term is not typable.

This short chapter constructs such a function and proves it correct.

Set Warnings "-notation-overridden,-parsing,-deprecated-hint-without-locality".
Set Warnings "-non-recursive".
From Coq Require Import Bool.Bool.
From PLF Require Import Maps.
From PLF Require Import Smallstep.
From PLF Require Import Stlc.
From PLF Require MoreStlc.

Module STLCTypes.
Export STLC.

Comparing Types

First, we need a function to compare two types for equality...

Fixpoint eqb_ty (T₁ T₂:ty) : bool :=
  match T₁,T₂ with
  | <{ Bool }> , <{ Bool }> ⇒
      true
  | <{ T₁₁→T₁₂ }>, <{ T₂₁→T₂₂ }> ⇒
      andb (eqb_ty T₁₁ T₂₁) (eqb_ty T₁₂ T₂₂)
  | _,_ ⇒
      false
  end.
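
As a quick sanity check, we can evaluate eqb_ty on a couple of concrete types. This is only a sketch: the example names are hypothetical, the code is written in plain ASCII Coq source syntax (-> instead of the rendered arrow), and it assumes the type notations from Stlc are in scope through Export STLC.

```coq
(* Two structurally identical arrow types compare equal. *)
Example eqb_ty_same_arrow :
  eqb_ty <{ Bool -> Bool }> <{ Bool -> Bool }> = true.
Proof. reflexivity. Qed.

(* An arrow type and a base type fall through to the wildcard case. *)
Example eqb_ty_arrow_vs_bool :
  eqb_ty <{ Bool -> Bool }> <{ Bool }> = false.
Proof. reflexivity. Qed.
```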

... and we need to establish the usual two-way connection between the boolean result returned by eqb_ty and the logical proposition that its inputs are equal.

Lemma eqb_ty_refl : ∀ T,
  eqb_ty T T = true.

Proof.
  intros T. induction T; simpl.
    reflexivity.
    rewrite IHT1. rewrite IHT2. reflexivity. Qed.

Lemma eqb_ty__eq : ∀ T₁ T₂,
  eqb_ty T₁ T₂ = true → T₁ = T₂.

Proof with auto.
  intros T₁. induction T₁; intros T₂ Hbeq; destruct T₂; inversion Hbeq.
  - (* T₁=Bool *)
    reflexivity.
  - (* T₁ = T1_1->T1_2 *)
    rewrite andb_true_iff in H₀. inversion H₀ as [Hbeq1 Hbeq2].
    apply IHT1_1 in Hbeq1. apply IHT1_2 in Hbeq2. subst... Qed.
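
The two lemmas above can be packaged into a single two-way statement. The following is a minimal sketch, not part of the original development; the name eqb_ty_iff is our own, and the code uses ASCII source syntax (forall, ->, <->).

```coq
(* eqb_ty returns true exactly when its arguments are equal. *)
Lemma eqb_ty_iff : forall T1 T2,
  eqb_ty T1 T2 = true <-> T1 = T2.
Proof.
  intros T1 T2. split.
  - (* left to right: the boolean test implies equality *)
    apply eqb_ty__eq.
  - (* right to left: equal types always pass the test *)
    intros Heq. subst. apply eqb_ty_refl.
Qed.
```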

End STLCTypes.

The Typechecker

The typechecker works by walking over the structure of the given term, returning either Some T or None. Each time we make a recursive call to find out the types of the subterms, we need to pattern-match on the results to make sure that they are not None. Also, in the app case, we use pattern matching to extract the left- and right-hand sides of the function's arrow type (and fail if the type of the function is not T₁₁→T₁₂ for some T₁₁ and T₁₂).

Module FirstTry.
Import STLCTypes.

Fixpoint type_check (Gamma : context) (t : tm) : option ty :=
  match t with
  | tm_var x ⇒
      Gamma x
  | <{\x :T₂, t₁}> ⇒
      match type_check (x ⊢> T₂ ; Gamma) t₁ with
      | Some T₁ ⇒ Some <{T₂→T₁}>
      | _ ⇒ None
      end
  | <{t₁ t₂}> ⇒
      match type_check Gamma t₁, type_check Gamma t₂ with
      | Some <{T₁₁→T₁₂}>, Some T₂ ⇒
          if eqb_ty T₁₁ T₂ then Some T₁₂ else None
      | _,_ ⇒ None
      end
  | <{true }> ⇒
      Some <{Bool }>
  | <{false }> ⇒
      Some <{Bool }>
  | <{if guard then t else f}> ⇒
      match type_check Gamma guard with
      | Some <{Bool }> ⇒
          match type_check Gamma t, type_check Gamma f with
          | Some T₁, Some T₂ ⇒
              if eqb_ty T₁ T₂ then Some T₁ else None
          | _,_ ⇒ None
          end
      | _ ⇒ None
      end
  end.
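
To see the function in action, here are two small test cases, one that succeeds and one that fails. They are a sketch under a few assumptions: the example names are hypothetical, x is the string variable defined in Stlc, empty is the empty partial map from Maps, and the code is written in ASCII source syntax.

```coq
(* The identity function on booleans has type Bool -> Bool. *)
Example type_check_id :
  type_check empty <{ \x:Bool, x }> = Some <{ Bool -> Bool }>.
Proof. reflexivity. Qed.

(* The two branches of a conditional must have the same type. *)
Example type_check_if_mismatch :
  type_check empty <{ if true then false else (\x:Bool, x) }> = None.
Proof. reflexivity. Qed.
```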

End FirstTry.

Digression: Improving the Notation

Before we consider the properties of this algorithm, let's write it out again in a cleaner way, using "monadic" notations in the style of Haskell to streamline the plumbing of options. First, we define a notation for composing two potentially failing (i.e., option-returning) computations:

Notation " x <- e₁ ;; e₂" := (match e₁ with
                              | Some x ⇒ e₂
                              | None ⇒ None
                              end)
         (right associativity, at level 60).

Second, we define return and fail as synonyms for Some and None:

Notation " 'return' e "
:= (Some e) (at level 60).

Notation " 'fail' "
:= None.
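
The following sketch illustrates how these notations behave on plain option nat values: a successful first computation passes its result along, while a failing one short-circuits the rest. The example names are hypothetical, and the code uses ASCII source syntax.

```coq
(* The first computation succeeds, so n is bound to 1
   and the second computation runs. *)
Example bind_success :
  (n <- Some 1 ;; return (S n)) = Some 2.
Proof. reflexivity. Qed.

(* The first computation fails, so the whole expression fails
   without evaluating the second computation. *)
Example bind_failure :
  (n <- (None : option nat) ;; return (S n)) = None.
Proof. reflexivity. Qed.
```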

Module STLCChecker.
Import STLCTypes.

Now we can write the same type-checking function in a more imperative-looking style using these notations.

Fixpoint type_check (Gamma : context) (t : tm) : option ty :=
  match t with
  | tm_var x ⇒
      match Gamma x with
      | Some T ⇒ return T
      | None ⇒ fail
      end
  | <{\x :T₂, t₁}> ⇒
      T₁ <- type_check (x ⊢> T₂ ; Gamma) t₁ ;;
      return <{T₂→T₁ }>
  | <{t₁ t₂}> ⇒
      T₁ <- type_check Gamma t₁ ;;
      T₂ <- type_check Gamma t₂ ;;
      match T₁ with
      | <{T₁₁→T₁₂}> ⇒
          if eqb_ty T₁₁ T₂ then return T₁₂ else fail
      | _ ⇒ fail
      end
  | <{true }> ⇒
      return <{ Bool }>
  | <{false }> ⇒
      return <{ Bool }>
  | <{if guard then t₁ else t₂}> ⇒
      Tguard <- type_check Gamma guard ;;
      T₁ <- type_check Gamma t₁ ;;
      T₂ <- type_check Gamma t₂ ;;
      match Tguard with
      | <{ Bool }> ⇒
          if eqb_ty T₁ T₂ then return T₁ else fail
      | _ ⇒ fail
      end
  end.
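
As a small check of this version, we can run it on one well-typed and one ill-typed application. This is a sketch with hypothetical example names, written in ASCII source syntax and assuming the variable x from Stlc and empty from Maps.

```coq
(* Applying the boolean identity function to true yields a Bool. *)
Example type_check_app_ok :
  type_check empty <{ (\x:Bool, x) true }> = Some <{ Bool }>.
Proof. reflexivity. Qed.

(* true is not a function, so applying it is rejected. *)
Example type_check_app_bad :
  type_check empty <{ true true }> = None.
Proof. reflexivity. Qed.
```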

Properties

To verify that the typechecking algorithm is correct, we show that it is sound and complete for the original has_type relation -- that is, type_check and has_type define the same partial function.

Theorem type_checking_sound : ∀ Gamma t T,
  type_check Gamma t = Some T → has_type Gamma t T.

Proof with eauto.
  intros Gamma t. generalize dependent Gamma.
  induction t; intros Gamma T Htc; inversion Htc.
  - (* var *) rename s into x. destruct (Gamma x) eqn:H.
    rename t into T'. inversion H₀. subst. eauto. solve_by_invert.
  - (* app *)
    remember (type_check Gamma t₁) as TO₁.
    destruct TO₁ as [T₁|]; try solve_by_invert;
    destruct T₁ as [|T₁₁ T₁₂]; try solve_by_invert;
    remember (type_check Gamma t₂) as TO₂;
    destruct TO₂ as [T₂|]; try solve_by_invert.
    destruct (eqb_ty T₁₁ T₂) eqn: Heqb.
    apply eqb_ty__eq in Heqb.
    inversion H₀; subst...
    inversion H₀.
  - (* abs *)
    rename s into x, t into T₁.
    remember (x ⊢> T₁ ; Gamma) as G'.
    remember (type_check G' t₀) as TO₂.
    destruct TO₂; try solve_by_invert.
    inversion H₀; subst...
  - (* tru *) eauto.
  - (* fls *) eauto.
  - (* test *)
    remember (type_check Gamma t₁) as TOc.
    remember (type_check Gamma t₂) as TO₁.
    remember (type_check Gamma t₃) as TO₂.
    destruct TOc as [Tc|]; try solve_by_invert.
    destruct Tc; try solve_by_invert;
    destruct TO₁ as [T₁|]; try solve_by_invert;
    destruct TO₂ as [T₂|]; try solve_by_invert.
    destruct (eqb_ty T₁ T₂) eqn:Heqb;
    try solve_by_invert.
    apply eqb_ty__eq in Heqb.
    inversion H₀. subst. subst...
Qed.

Theorem type_checking_complete : ∀ Gamma t T,
  has_type Gamma t T → type_check Gamma t = Some T.

Proof with auto.
  intros Gamma t T Hty.
  induction Hty; simpl.
  - (* T_Var *) destruct (Gamma _) eqn:H₀; assumption.
  - (* T_Abs *) rewrite IHHty...
  - (* T_App *)
    rewrite IHHty1. rewrite IHHty2.
    rewrite (eqb_ty_refl T₂)...
  - (* T_True *) eauto.
  - (* T_False *) eauto.
  - (* T_If *) rewrite IHHty1. rewrite IHHty2.
    rewrite IHHty3. rewrite (eqb_ty_refl T₁)...
Qed.
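
Taken together, the two theorems say that type_check and has_type agree. The sketch below states this as a single equivalence and then uses soundness to obtain a typing derivation by computation. The names type_check_iff and has_type_id are our own, and the code uses ASCII source syntax.

```coq
(* Soundness and completeness combined into one statement. *)
Corollary type_check_iff : forall Gamma t T,
  type_check Gamma t = Some T <-> has_type Gamma t T.
Proof.
  intros Gamma t T. split.
  - apply type_checking_sound.
  - apply type_checking_complete.
Qed.

(* A typing derivation obtained by running the checker:
   soundness reduces the goal to an equation that computes. *)
Example has_type_id :
  has_type empty <{ \x:Bool, x }> <{ Bool -> Bool }>.
Proof. apply type_checking_sound. reflexivity. Qed.
```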

End STLCChecker.

Exercises

In this exercise we'll extend the typechecker to deal with the extended features discussed in the MoreStlc chapter. Your job is to fill in the omitted cases in what follows.

Module TypecheckerExtensions.
Import MoreStlc.
Import STLCExtended.

Fixpoint eqb_ty (T₁ T₂ : ty) : bool :=
  match T₁,T₂ with
  | <{{Nat }}>, <{{Nat }}> ⇒
      true
  | <{{Unit }}>, <{{Unit }}> ⇒
      true
  | <{{T₁₁ → T₁₂}}>, <{{T₂₁ → T₂₂}}> ⇒
      andb (eqb_ty T₁₁ T₂₁) (eqb_ty T₁₂ T₂₂)
  | <{{T₁₁ × T₁₂}}>, <{{T₂₁ × T₂₂}}> ⇒
      andb (eqb_ty T₁₁ T₂₁) (eqb_ty T₁₂ T₂₂)
  | <{{T₁₁ + T₁₂}}>, <{{T₂₁ + T₂₂}}> ⇒
      andb (eqb_ty T₁₁ T₂₁) (eqb_ty T₁₂ T₂₂)
  | <{{List T₁₁}}>, <{{List T₂₁}}> ⇒
      eqb_ty T₁₁ T₂₁
  | _,_ ⇒
      false
  end.

Lemma eqb_ty_refl : ∀ T,
  eqb_ty T T = true.
Proof.
  intros T.
  induction T; simpl; auto;
    rewrite IHT1; rewrite IHT2; reflexivity. Qed.

Lemma eqb_ty__eq : ∀ T₁ T₂,
  eqb_ty T₁ T₂ = true → T₁ = T₂.
Proof.
  intros T₁.
  induction T₁; intros T₂ Hbeq; destruct T₂; inversion Hbeq;
    try reflexivity;
    try (rewrite andb_true_iff in H₀; inversion H₀ as [Hbeq1 Hbeq2];
         apply IHT1_1 in Hbeq1; apply IHT1_2 in Hbeq2; subst; auto);
    try (apply IHT1 in Hbeq; subst; auto).
Qed.

Just for fun, we'll do the soundness proof with just a bit more automation than above, using these "mega-tactics":

Ltac invert_typecheck Gamma t T :=
  remember (type_check Gamma t) as TO;
  destruct TO as [T|];
  try solve_by_invert; try (inversion H₀; eauto); try (subst; eauto).

Ltac analyze T T₁ T₂ :=
  destruct T as [T₁ T₂| |T₁ T₂|T₁| |T₁ T₂]; try solve_by_invert.

Ltac fully_invert_typecheck Gamma t T T₁ T₂ :=
  let TX := fresh T in
  remember (type_check Gamma t) as TO;
  destruct TO as [TX|]; try solve_by_invert;
  destruct TX as [T₁ T₂| |T₁ T₂|T₁| |T₁ T₂];
  try solve_by_invert; try (inversion H₀; eauto); try (subst; eauto).

Ltac case_equality S T :=
  destruct (eqb_ty S T) eqn: Heqb;
  inversion H₀; apply eqb_ty__eq in Heqb; subst; subst; eauto.

End TypecheckerExtensions.

Above, we showed how to write a typechecking function and prove it sound and complete for the typing relation. Do the same for the operational semantics -- i.e., write a function stepf of type tm → option tm and prove that it is sound and complete with respect to step from chapter MoreStlc.

Module StepFunction.
Import MoreStlc.
Import STLCExtended.

Exercise: 2 stars, standard, optional (valuef_defn)

(* First, we must also redefine value as a boolean function. *)
Fixpoint valuef (t : tm) : bool :=
  match t with
  | tm_var _ ⇒ false
  | <{ \_:_, _ }> ⇒ true
  | <{ _ _ }> ⇒ false
  | tm_const _ ⇒ true
  | <{ succ _ }> | <{ pred _ }> | <{ _ × _ }> | <{ if₀ _ then _ else _ }> ⇒ false
  (* Complete the following cases *)
  (* sums *)
  (* FILL IN HERE *)
  | _ ⇒ false (* ... and delete this line when you complete the exercise. *)
  end.
(* Do not modify the following line: *)
Definition manual_grade_for_valuef_defn : option (nat × string) := None.
☐

(* A little helper to concisely check some boolean properties
(in this case, that some term is a value, with valuef). *)
Definition assert (b : bool) (a : option tm) : option tm :=
if b then a else None.
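
(* A short sketch of how assert behaves, with hypothetical example
   names and ASCII source syntax: a true guard passes the result
   through unchanged, and a false guard discards it. *)

```coq
(* When the check succeeds, the wrapped result is returned as-is. *)
Example assert_true : forall a, assert true a = a.
Proof. reflexivity. Qed.

(* When the check fails, the overall result is None. *)
Example assert_false : forall a, assert false a = None.
Proof. reflexivity. Qed.
```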

(* To prove that stepf is equivalent to step, we start with
a couple of intermediate lemmas. *)

(* We show that valuef is sound and complete with respect to value. *)

Exercise: 2 stars, standard, optional (sound_valuef)

(* valuef is sound with respect to value *)
Lemma sound_valuef : ∀ t,
    valuef t = true → value t.
Proof.
  (* FILL IN HERE *) Admitted.
☐

Exercise: 2 stars, standard, optional (complete_valuef)

(* valuef is complete with respect to value.
   This proof by induction is quite easily done by simplification. *)
Lemma complete_valuef : ∀ t,
    value t → valuef t = true.
Proof.
  (* FILL IN HERE *) Admitted.
☐

(* Soundness of stepf:

   Theorem sound_stepf : forall t t',
       stepf t = Some t'  ->  t --> t'.

   By induction on t. We automate the handling of each case with
   the following tactic auto_stepf. *)

Tactic Notation "auto_stepf" ident(H) :=
  (* Step 1: In every case, the left hand side of the hypothesis
     H : stepf t = Some t' simplifies to some combination of
     match u with ... end, assert u (...) (for some u).
     The tactic auto_stepf then destructs u as required.
     We repeat this step as long as it is possible. *)
  repeat
    match type of H with
    | (match ?u with _ ⇒ _ end = _) ⇒
      let e := fresh "e" in
      destruct u eqn:e
    | (assert ?u _ = _) ⇒
      (* In this case, u is always of the form valuef t₀
         for some term t₀. If valuef t₀ = true, we immediately
         deduce value t₀ via sound_valuef. If valuef t₀ = false,
         then that equation simplifies to None = Some t', which is
         contradictory and can be eliminated with discriminate. *)
      let e := fresh "e" in
      destruct u eqn:e;
      simpl in H; (* assert true (...) must be simplified
                     explicitly. *)
      [apply sound_valuef in e | discriminate]
    end;
  (* Step 2: We are now left with either H : None = Some t' or
     Some (...) = Some t', and the rest of the proof is a
     straightforward combination of the induction hypotheses. *)
  (discriminate + (inversion H; subst; auto)).

Exercise: 2 stars, standard, optional (value_stepf_nf)

(* Now for completeness, another lemma will be useful:
   every value is a normal form for stepf. *)
Lemma value_stepf_nf : ∀ t,
    value t → stepf t = None.
Proof.
  (* FILL IN HERE *) Admitted.
☐

Exercise: 2 stars, standard, optional (complete_stepf)

(* Completeness of stepf. *)
Theorem complete_stepf : ∀ t t',
    t --> t' → stepf t = Some t'.
Proof.
  (* FILL IN HERE *) Admitted.
☐

End StepFunction.

Exercise: 5 stars, standard, optional (stlc_impl)

Using the Imp parser described in the ImpParser chapter of Logical Foundations as a guide, build a parser for extended STLC programs. Combine it with the typechecking and stepping functions from the above exercises to yield a complete typechecker and interpreter for this language.

Module StlcImpl.
Import StepFunction.

(* FILL IN HERE *)
End StlcImpl.
☐

(* 2024-12-27 01:28 *)