Records: Adding Records to STLC

Set Warnings "-notation-overridden,-parsing,-deprecated-hint-without-locality".
From Coq Require Import Strings.String.
From PLF Require Import Maps.
From PLF Require Import Smallstep.
From PLF Require Import Stlc.

Adding Records

We saw in chapter MoreStlc how records can be treated as just syntactic sugar for nested uses of products. This is OK for simple examples, but the encoding is informal (in reality, if we actually treated records this way, the translation would be carried out in the parser, which we are eliding here), and anyway it is not very efficient, nor does it lead to nice error messages for programs that the typechecker rejects. So it is also interesting to see how records can be treated as first-class citizens of the language. This chapter shows how.

Recall the informal definitions we gave before:

Syntax:
       t ::= Terms:
           | {i=t, ..., i=t} record
           | t.i projection
           | ...

       v ::= Values:
           | {i=v, ..., i=v} record value
           | ...

       T ::= Types:
           | {i:T, ..., i:T} record type
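           | ...

Reduction:

                               tn ==> tn'
  --------------------------------------------------------------------  (ST_Rcd)
  {i₁=v₁, ..., im=vm, in=tn, ...} ==> {i₁=v₁, ..., im=vm, in=tn', ...}

                               t₁ ==> t₁'
                             --------------                             (ST_Proj1)
                             t₁.i ==> t₁'.i

                       -------------------------                        (ST_ProjRcd)
                       {..., i=vi, ...}.i ==> vi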

Typing:

          Gamma |-- t₁ : T₁    ...    Gamma |-- tn : Tn
        ---------------------------------------------------             (T_Rcd)
        Gamma |-- {i₁=t₁, ..., in=tn} : {i₁:T₁, ..., in:Tn}

                 Gamma |-- t₀ : {..., i:Ti, ...}
                 -------------------------------                        (T_Proj)
                       Gamma |-- t₀.i : Ti

Formalizing Records

Module STLCExtendedRecords.

Syntax and Operational Semantics

The most obvious way to formalize the syntax of record types would be this:

Module FirstTry.

Definition alist (X : Type) := list (string × X).

Inductive ty : Type :=
  | Base : string → ty
  | Arrow : ty → ty → ty
  | TRcd : (alist ty ) → ty.

Unfortunately, we encounter here a limitation in Coq: this type does not automatically give us the induction principle we expect. The induction hypothesis in the TRcd case doesn't give us any information about the ty elements of the list, making it useless for the proofs we want to do.

(* Check ty_ind.
   ====>
    ty_ind :
      forall P : ty -> Prop,
        (forall s : string, P (Base s)) ->
        (forall t : ty, P t -> forall t₀ : ty, P t₀
                            -> P (Arrow t t₀)) ->
        (forall a : alist ty, P (TRcd a)) ->    (* ??? *)
        forall t : ty, P t
*)

End FirstTry.

It is possible to get a better induction principle out of Coq, but the details of how this is done are not very pretty, and the principle we obtain is not as intuitive to use as the ones Coq generates automatically for simple Inductive definitions.
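For the curious, here is a minimal sketch of how such a principle can be obtained by hand. It is an illustration only and not part of this chapter's development: the module name NestedInductionSketch and the name ty_strong_ind are invented here, and the sketch assumes Forall from the standard List library. The inner fix walks over the field list so that Coq's termination checker can see that every recursive call is made on a subterm of the original type.

```coq
From Coq Require Import Strings.String Lists.List.

Module NestedInductionSketch.

(* Same shape as the FirstTry definition: records carry a list of fields. *)
Inductive ty : Type :=
  | Base : string -> ty
  | Arrow : ty -> ty -> ty
  | TRcd : list (string * ty) -> ty.

Section StrongInd.
  Variable P : ty -> Prop.
  Hypothesis HBase : forall s, P (Base s).
  Hypothesis HArrow : forall T1 T2, P T1 -> P T2 -> P (Arrow T1 T2).
  (* Unlike the generated principle, this case gets a hypothesis
     about the type stored in every field. *)
  Hypothesis HRcd : forall fields : list (string * ty),
      Forall (fun p : string * ty => P (snd p)) fields ->
      P (TRcd fields).

  Fixpoint ty_strong_ind (T : ty) : P T :=
    match T with
    | Base s => HBase s
    | Arrow T1 T2 =>
        HArrow T1 T2 (ty_strong_ind T1) (ty_strong_ind T2)
    | TRcd fields =>
        HRcd fields
          ((fix go (l : list (string * ty))
              : Forall (fun p : string * ty => P (snd p)) l :=
              match l as l0
                    return Forall (fun p : string * ty => P (snd p)) l0
              with
              | nil =>
                  @Forall_nil (string * ty)
                              (fun p : string * ty => P (snd p))
              | cons (s, T') l' =>
                  @Forall_cons (string * ty)
                               (fun p : string * ty => P (snd p))
                               (s, T') l'
                               (ty_strong_ind T') (go l')
              end) fields)
    end.
End StrongInd.

End NestedInductionSketch.
```

Every later proof by induction would then have to invoke ty_strong_ind explicitly and reason through Forall, which is the extra friction alluded to above.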

Fortunately, there is a different way of formalizing records that is, in some ways, even simpler and more natural: instead of using the standard Coq list type, we can essentially incorporate its constructors ("nil" and "cons") in the syntax of our types.

Inductive ty : Type :=
  | Ty_Base : string → ty
  | Ty_Arrow : ty → ty → ty
  | Ty_RNil : ty
  | Ty_RCons : string → ty → ty → ty.

Similarly, at the level of terms, we have constructors tm_rnil, for the empty record, and tm_rcons, which adds a single field to the front of a list of fields.
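The term syntax itself might look like the sketch below. The constructor names and argument orders are read off the notations declared next (tm_var, tm_app, tm_abs, tm_rproj, tm_rnil, tm_rcons); the module name TermSyntaxSketch and the two sample definitions are hypothetical, and the real definition in the development may differ in detail.

```coq
From Coq Require Import Strings.String.

Module TermSyntaxSketch.
Local Open Scope string_scope.

Inductive ty : Type :=
  | Ty_Base : string -> ty
  | Ty_Arrow : ty -> ty -> ty
  | Ty_RNil : ty
  | Ty_RCons : string -> ty -> ty -> ty.

Inductive tm : Type :=
  | tm_var : string -> tm
  | tm_app : tm -> tm -> tm
  | tm_abs : string -> ty -> tm -> tm
  (* records: *)
  | tm_rproj : tm -> string -> tm         (* projection t.i *)
  | tm_rnil : tm                          (* empty record *)
  | tm_rcons : string -> tm -> tm -> tm.  (* field, then the rest *)

(* The one-field record {i1 = \a:A, a}, written with raw constructors. *)
Definition one_field : tm :=
  tm_rcons "i1" (tm_abs "a" (Ty_Base "A") (tm_var "a")) tm_rnil.

(* The projection {i1 = \a:A, a}.i1. *)
Definition project_i1 : tm := tm_rproj one_field "i1".

End TermSyntaxSketch.
```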

Declare Custom Entry stlc_ty.

Notation "<{ e }>" := e (e custom stlc at level 99).
Notation "<{{ e }}>" := e (e custom stlc_ty at level 99).
Notation "( x )" := x (in custom stlc, x at level 99).
Notation "( x )" := x (in custom stlc_ty, x at level 99).
Notation "x" := x (in custom stlc at level 0, x constr at level 0).
Notation "x" := x (in custom stlc_ty at level 0, x constr at level 0).
Notation "S -> T" := (Ty_Arrow S T) (in custom stlc_ty at level 50, right associativity).
Notation "x y" := (tm_app x y) (in custom stlc at level 1, left associativity).
Notation "\ x : t , y" :=
  (tm_abs x t y) (in custom stlc at level 90, x at level 99,
                     t custom stlc_ty at level 99,
                     y custom stlc at level 99,
                     left associativity).
Coercion tm_var : string >-> tm.

Notation "{ x }" := x (in custom stlc at level 1, x constr).

Notation "'Base' x" := (Ty_Base x) (in custom stlc_ty at level 0).

Notation " l ':' t₁ '::' t₂" := (Ty_RCons l t₁ t₂) (in custom stlc_ty at level 3, right associativity).
Notation " l := e₁ '::' e₂" := (tm_rcons l e₁ e₂) (in custom stlc at level 3, right associativity).
Notation "'nil'" := (Ty_RNil) (in custom stlc_ty).
Notation "'nil'" := (tm_rnil) (in custom stlc).
Notation "o --> l" := (tm_rproj o l) (in custom stlc at level 0).

Some examples...

Open Scope string_scope.

Notation a := "a".
Notation f := "f".
Notation g := "g".
Notation l := "l".
Notation A := <{{ Base "A" }}>.
Notation B := <{{ Base "B" }}>.
Notation k := "k".
Notation i₁ := "i₁".
Notation i₂ := "i₂".

{ i₁:A }

(* Check (Ty_RCons i₁ A Ty_RNil). *)

{ i₁:A→B, i₂:A }

(* Check (Ty_RCons i₁ (Ty_Arrow A B)
(Ty_RCons i₂ A Ty_RNil)). *)

Well-Formedness

One issue with generalizing the abstract syntax for records from lists to the nil/cons presentation is that it introduces the possibility of writing strange types like this...

Definition weird_type := <{{ a : A :: B }}>.

where the "tail" of a record type is not actually a record type!

We'll structure our typing judgement so that no ill-formed types like weird_type are ever assigned to terms. To support this, we define predicates record_ty and record_tm, which identify record types and terms, and well_formed_ty which rules out the ill-formed types.

First, a type is a record type if it is built with just Ty_RNil and Ty_RCons at the outermost level.

Inductive record_ty : ty → Prop :=
  | RTnil :
        record_ty <{{ nil }}>
  | RTcons : ∀ i T₁ T₂,
        record_ty <{{ i : T₁ :: T₂ }}>.

With this, we can define well-formed types.

Inductive well_formed_ty : ty → Prop :=
  | wfBase : ∀ (i : string),
        well_formed_ty <{{ Base i }}>
  | wfArrow : ∀ T₁ T₂,
        well_formed_ty T₁ →
        well_formed_ty T₂ →
        well_formed_ty <{{ T₁ → T₂ }}>
  | wfRNil :
        well_formed_ty <{{ nil }}>
  | wfRCons : ∀ i T₁ T₂,
        well_formed_ty T₁ →
        well_formed_ty T₂ →
        record_ty T₂ →
        well_formed_ty <{{ i : T₁ :: T₂ }}>.

Hint Constructors record_ty well_formed_ty : core.

Note that record_ty is not recursive: it just checks the outermost constructor. The well_formed_ty property, on the other hand, verifies that the whole type is well formed in the sense that the tail of every record (the last argument to Ty_RCons) is a record.
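To see the two predicates at work, the following self-contained sketch repeats the definitions with raw constructors instead of the chapter's notations and checks one well-formed type and one ill-formed type. The module name WellFormedSketch and the example names are invented for illustration.

```coq
From Coq Require Import Strings.String.

Module WellFormedSketch.
Local Open Scope string_scope.

Inductive ty : Type :=
  | Ty_Base : string -> ty
  | Ty_Arrow : ty -> ty -> ty
  | Ty_RNil : ty
  | Ty_RCons : string -> ty -> ty -> ty.

Inductive record_ty : ty -> Prop :=
  | RTnil : record_ty Ty_RNil
  | RTcons : forall i T1 T2, record_ty (Ty_RCons i T1 T2).

Inductive well_formed_ty : ty -> Prop :=
  | wfBase : forall i, well_formed_ty (Ty_Base i)
  | wfArrow : forall T1 T2,
      well_formed_ty T1 -> well_formed_ty T2 ->
      well_formed_ty (Ty_Arrow T1 T2)
  | wfRNil : well_formed_ty Ty_RNil
  | wfRCons : forall i T1 T2,
      well_formed_ty T1 -> well_formed_ty T2 -> record_ty T2 ->
      well_formed_ty (Ty_RCons i T1 T2).

(* {i1 : A} is well formed: its tail is the empty record type. *)
Example single_field_wf :
  well_formed_ty (Ty_RCons "i1" (Ty_Base "A") Ty_RNil).
Proof. apply wfRCons; constructor. Qed.

(* a : A :: B is rejected: the tail B is a base type, not a record. *)
Example weird_type_not_wf :
  ~ well_formed_ty (Ty_RCons "a" (Ty_Base "A") (Ty_Base "B")).
Proof.
  intros Hwf. inversion Hwf; subst.
  match goal with
  | Hrec : record_ty (Ty_Base _) |- _ => inversion Hrec
  end.
Qed.

End WellFormedSketch.
```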

Of course, we should also be concerned about ill-formed terms, not just types; but typechecking can rule those out without the help of an extra well_formed_tm definition because it already examines the structure of terms. All we need is an analog of record_ty saying that a term is a record term if it is built with tm_rnil and tm_rcons.

Inductive record_tm : tm → Prop :=
  | rtnil :
        record_tm <{ nil }>
  | rtcons : ∀ i t₁ t₂,
        record_tm <{ i := t₁ :: t₂ }>.

Hint Constructors record_tm : core.

Substitution

Substitution extends easily.

Reserved Notation "'[' x ':=' s ']' t" (in custom stlc at level 20, x constr).

Fixpoint subst (x : string) (s : tm) (t : tm) : tm :=
  match t with
  | tm_var y ⇒
      if String.eqb x y then s else t
  | <{\y:T, t₁}> ⇒
      if String.eqb x y then t else <{\y:T, [x :=s ] t₁}>
  | <{t₁ t₂}> ⇒
      <{([x :=s ] t₁) ([x :=s ] t₂)}>
  | <{ t₁ --> i }> ⇒
      <{ ( [x := s ] t₁) --> i }>
  | <{ nil }> ⇒
      <{ nil }>
  | <{ i := t₁ :: tr }> ⇒
     <{ i := [x := s ] t₁ :: ( [x := s ] tr) }>
  end

where "'[' x ':=' s ']' t" := (subst x s t) (in custom stlc).

Reduction

A record is a value if all of its fields are.

Inductive value : tm → Prop :=
  | v_abs : ∀ x T₂ t₁,
      value <{ \ x : T₂ , t₁ }>
  | v_rnil : value <{ nil }>
  | v_rcons : ∀ i v₁ vr,
      value v₁ →
      value vr →
      value <{ i := v₁ :: vr }>.

Hint Constructors value : core.

To define reduction, we'll need a utility function for extracting one field from a record term:

Fixpoint tlookup (i:string) (tr:tm) : option tm :=
  match tr with
  | <{ i' := t :: tr'}> ⇒ if String.eqb i i' then Some t else tlookup i tr'
  | _ ⇒ None
  end.
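A few small checks make the behavior of tlookup concrete. This sketch is self-contained and uses raw constructors in place of the chapter's notations; the module name TlookupSketch, the sample record rcd, and the example names are hypothetical.

```coq
From Coq Require Import Strings.String.

Module TlookupSketch.
Local Open Scope string_scope.

Inductive ty : Type :=
  | Ty_Base : string -> ty
  | Ty_Arrow : ty -> ty -> ty
  | Ty_RNil : ty
  | Ty_RCons : string -> ty -> ty -> ty.

Inductive tm : Type :=
  | tm_var : string -> tm
  | tm_app : tm -> tm -> tm
  | tm_abs : string -> ty -> tm -> tm
  | tm_rproj : tm -> string -> tm
  | tm_rnil : tm
  | tm_rcons : string -> tm -> tm -> tm.

Fixpoint tlookup (i : string) (tr : tm) : option tm :=
  match tr with
  | tm_rcons i' t tr' =>
      if String.eqb i i' then Some t else tlookup i tr'
  | _ => None
  end.

(* Sample record {i1 = x, i2 = y}. *)
Definition rcd : tm :=
  tm_rcons "i1" (tm_var "x") (tm_rcons "i2" (tm_var "y") tm_rnil).

(* The search skips i1 and finds i2 in the tail. *)
Example lookup_i2 : tlookup "i2" rcd = Some (tm_var "y").
Proof. reflexivity. Qed.

(* A label that does not occur yields None. *)
Example lookup_missing : tlookup "i3" rcd = None.
Proof. reflexivity. Qed.

(* A term that is not a record hits the catch-all branch. *)
Example lookup_non_record : tlookup "i1" (tm_var "x") = None.
Proof. reflexivity. Qed.

End TlookupSketch.
```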

The step relation uses this term-level lookup function in the projection rule.

Inductive step : tm → tm → Prop :=
  | ST_AppAbs : ∀ x T₂ t₁ v₂,
         value v₂ →
         <{(\x :T₂ , t₁ ) v₂ }> --> <{ [x :=v₂ ]t₁ }>
  | ST_App1 : ∀ t₁ t₁' t₂,
         t₁ --> t₁' →
         <{t₁ t₂ }> --> <{t₁' t₂ }>
  | ST_App2 : ∀ v₁ t₂ t₂',
         value v₁ →
         t₂ --> t₂' →
         <{v₁ t₂ }> --> <{v₁ t₂'}>
  | ST_Proj1 : ∀ t₁ t₁' i,
        t₁ --> t₁' →
        <{ t₁ --> i }> --> <{ t₁' --> i }>
  | ST_ProjRcd : ∀ tr i vi,
        value tr →
        tlookup i tr = Some vi →
        <{ tr --> i }> --> vi
  | ST_Rcd_Head : ∀ i t₁ t₁' tr₂,
        t₁ --> t₁' →
        <{ i := t₁ :: tr₂ }> --> <{ i := t₁' :: tr₂ }>
  | ST_Rcd_Tail : ∀ i v₁ tr₂ tr₂',
        value v₁ →
        tr₂ --> tr₂' →
        <{ i := v₁ :: tr₂ }> --> <{ i := v₁ :: tr₂' }>

where "t '-->' t'" := (step t t').

Notation multistep := (multi step).
Notation "t₁ '-->*' t₂" := (multistep t₁ t₂) (at level 40).

Hint Constructors step : core.

Typing

Next we define the typing rules. These are nearly direct transcriptions of the inference rules shown above: the only significant difference is the use of well_formed_ty. In the informal presentation we used a grammar that only allowed well-formed record types, so we didn't have to add a separate check.

One sanity condition that we'd like to maintain is that, whenever has_type Gamma t T holds, it is also the case that well_formed_ty T holds, so that has_type never assigns ill-formed types to terms. In fact, we prove this theorem below. However, we don't want to clutter the definition of has_type with unnecessary uses of well_formed_ty. Instead, we place well_formed_ty checks only where needed: where an inductive call to has_type won't already be checking the well-formedness of a type. For example, we check well_formed_ty T in the T_Var case, because there is no inductive has_type call that would enforce this. Similarly, in the T_Abs case, we require a proof of well_formed_ty T₁₁ because the inductive call to has_type only guarantees that T₁₂ is well-formed.

Fixpoint Tlookup (i:string) (Tr:ty) : option ty :=
  match Tr with
  | <{{ i' : T :: Tr' }}> ⇒
      if String.eqb i i' then Some T else Tlookup i Tr'
  | _ ⇒ None
  end.

Definition context := partial_map ty.
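The typing relation described above could be written along the following lines. This is a sketch reconstructed from the informal rules and from the rule names that the later proofs refer to (T_Var, T_Abs, T_App, T_Proj, T_RNil, T_RCons); the actual definition may differ. It uses raw constructors rather than the chapter's notations, and it assumes a bare function type with hypothetical empty and update helpers as a stand-in for PLF's partial_map. The module name TypingSketch is likewise invented.

```coq
From Coq Require Import Strings.String.

Module TypingSketch.

Inductive ty : Type :=
  | Ty_Base : string -> ty
  | Ty_Arrow : ty -> ty -> ty
  | Ty_RNil : ty
  | Ty_RCons : string -> ty -> ty -> ty.

Inductive tm : Type :=
  | tm_var : string -> tm
  | tm_app : tm -> tm -> tm
  | tm_abs : string -> ty -> tm -> tm
  | tm_rproj : tm -> string -> tm
  | tm_rnil : tm
  | tm_rcons : string -> tm -> tm -> tm.

Inductive record_ty : ty -> Prop :=
  | RTnil : record_ty Ty_RNil
  | RTcons : forall i T1 T2, record_ty (Ty_RCons i T1 T2).

Inductive record_tm : tm -> Prop :=
  | rtnil : record_tm tm_rnil
  | rtcons : forall i t1 t2, record_tm (tm_rcons i t1 t2).

Inductive well_formed_ty : ty -> Prop :=
  | wfBase : forall i, well_formed_ty (Ty_Base i)
  | wfArrow : forall T1 T2,
      well_formed_ty T1 -> well_formed_ty T2 ->
      well_formed_ty (Ty_Arrow T1 T2)
  | wfRNil : well_formed_ty Ty_RNil
  | wfRCons : forall i T1 T2,
      well_formed_ty T1 -> well_formed_ty T2 -> record_ty T2 ->
      well_formed_ty (Ty_RCons i T1 T2).

Fixpoint Tlookup (i : string) (Tr : ty) : option ty :=
  match Tr with
  | Ty_RCons i' T Tr' =>
      if String.eqb i i' then Some T else Tlookup i Tr'
  | _ => None
  end.

(* Assumption: a plain function stands in for PLF's partial_map. *)
Definition context := string -> option ty.
Definition empty : context := fun _ => None.
Definition update (Gamma : context) (x : string) (T : ty) : context :=
  fun y => if String.eqb x y then Some T else Gamma y.

Inductive has_type : context -> tm -> ty -> Prop :=
  | T_Var : forall Gamma x T,
      Gamma x = Some T ->
      well_formed_ty T ->        (* no has_type premise checks T *)
      has_type Gamma (tm_var x) T
  | T_Abs : forall Gamma x T11 T12 t12,
      well_formed_ty T11 ->      (* the premise below only covers T12 *)
      has_type (update Gamma x T11) t12 T12 ->
      has_type Gamma (tm_abs x T11 t12) (Ty_Arrow T11 T12)
  | T_App : forall Gamma T1 T2 t1 t2,
      has_type Gamma t1 (Ty_Arrow T1 T2) ->
      has_type Gamma t2 T1 ->
      has_type Gamma (tm_app t1 t2) T2
  | T_Proj : forall Gamma i t Ti Tr,
      has_type Gamma t Tr ->
      Tlookup i Tr = Some Ti ->
      has_type Gamma (tm_rproj t i) Ti
  | T_RNil : forall Gamma,
      has_type Gamma tm_rnil Ty_RNil
  | T_RCons : forall Gamma i t T tr Tr,
      has_type Gamma t T ->
      has_type Gamma tr Tr ->
      record_ty Tr ->            (* the tail type is a record type *)
      record_tm tr ->            (* the tail term is a record term *)
      has_type Gamma (tm_rcons i t tr) (Ty_RCons i T Tr).

(* The empty record has the empty record type in any context. *)
Example rnil_typed : has_type empty tm_rnil Ty_RNil.
Proof. apply T_RNil. Qed.

End TypingSketch.
```

In this sketch the record_ty and record_tm premises of T_RCons play the role that the grammar played in the informal presentation: they keep the tail of every typed record a record.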

Hint Constructors has_type : core.

Examples

Exercise: 2 stars, standard (examples)

Finish the proofs below. Feel free to use Coq's automation features in these proofs. However, if you are not confident about how the type system works, you may want to carry out the proofs first using the basic features (apply instead of eapply, in particular) and then perhaps compress them using automation.

Before starting to prove anything, make sure you understand what it is saying. :-)

Lemma typing_example_2 :
  empty |-- (\a : ( i₁ : (A → A ) :: i₂ : (B → B ) :: nil ), a --> i₂ )
            ( i₁ := (\a : A , a ) :: i₂ := (\a : B ,a ) :: nil ) \in (B → B ).
Proof.
  (* FILL IN HERE *) Admitted.

Example typing_nonexample :
  ¬ ∃ T ,
     (a ⊢> <{{ i₂ : (A → A ) :: nil }}>) |--
       ( i₁ := (\a : B , a ) :: a ) \in
               T.
Proof.
  (* FILL IN HERE *) Admitted.

Example typing_nonexample_2 : ∀ y,
  ¬ ∃ T ,
    (y ⊢> A ) |--
     (\a : ( i₁ : A :: nil ), a --> i₁ )
      ( i₁ := y :: i₂ := y :: nil ) \in T.
Proof.
  (* FILL IN HERE *) Admitted.
☐

Properties of Typing

The proofs of progress and preservation for this system are essentially the same as for the pure simply typed lambda-calculus, but we need to add some technical lemmas involving records.

Well-Formedness

Lemma wf_rcd_lookup : ∀ i T Ti,
  well_formed_ty T →
  Tlookup i T = Some Ti →
  well_formed_ty Ti.

Proof with eauto.
  intros i T.
  induction T; intros; try solve_by_invert.
  - (* RCons *)
    inversion H. subst. unfold Tlookup in H₀.
    destruct (String.eqb i s)...
    inversion H₀. subst... Qed.

Lemma step_preserves_record_tm : ∀ tr tr',
  record_tm tr →
  tr --> tr' →
  record_tm tr'.

Proof.
intros tr tr' Hrt Hstp.
inversion Hrt; subst; inversion Hstp; subst; auto.
Qed.

Lemma has_type__wf : ∀ Gamma t T,
Gamma |-- t \in T → well_formed_ty T.

Proof with eauto.
  intros Gamma t T Htyp.
  induction Htyp...
  - (* T_App *)
    inversion IHHtyp1...
  - (* T_Proj *)
    eapply wf_rcd_lookup...
Qed.

Field Lookup

Lemma: If empty |-- v : T and Tlookup i T returns Some Ti, then tlookup i v returns Some ti for some term ti such that empty |-- ti ∈ Ti.

Proof: By induction on the typing derivation Htyp. Since Tlookup i T = Some Ti, T must be a record type; this and the fact that v is a value eliminate most cases by inspection, leaving only the T_RCons case.

If the last step in the typing derivation is by T_RCons, then v = tm_rcons i₀ t tr and T = Ty_RCons i₀ T₀ Tr for some i₀, t, tr, T₀ and Tr.

This leaves two possibilities to consider: either i₀ = i or not.

If i = i₀, then since Tlookup i (Ty_RCons i₀ T₀ Tr) = Some Ti we have T₀ = Ti. It follows that t itself satisfies the theorem.
On the other hand, suppose i ≠ i₀. Then
Tlookup i T = Tlookup i Tr and
tlookup i v = tlookup i tr, so the result follows from the induction hypothesis. ☐

Here is the formal statement:

Lemma lookup_field_in_value : ∀ v T i Ti,
  value v →
  empty |-- v \in T →
  Tlookup i T = Some Ti →
  ∃ ti , tlookup i v = Some ti ∧ empty |-- ti \in Ti.

Proof with eauto.
  intros v T i Ti Hval Htyp Hget.
  remember empty as Gamma.
  induction Htyp; subst; try solve_by_invert...
  - (* T_RCons *)
    simpl in Hget. simpl. destruct (String.eqb i i₀).
    + (* i is first *)
      simpl. injection Hget as Hget. subst.
      ∃ t...
    + (* get tail *)
      destruct IHHtyp2 as [vi [Hgeti Htypi] ]...
      inversion Hval... Qed.

Progress

Theorem progress : ∀ t T,
empty |-- t \in T →
value t ∨ ∃ t', t --> t'.

Proof with eauto.
  (* Theorem: Suppose empty |-- t : T.  Then either
       1. t is a value, or
       2. t --> t' for some t'.
     Proof: By induction on the given typing derivation. *)
  intros t T Ht.
  remember empty as Gamma.
  generalize dependent HeqGamma.
  induction Ht; intros HeqGamma; subst.
  - (* T_Var *)
    (* The final rule in the given typing derivation cannot be
       T_Var, since it can never be the case that
       empty |-- x : T (since the context is empty). *)
    inversion H.
  - (* T_Abs *)
    (* If the T_Abs rule was the last used, then
       t = abs x T₁₁ t₁₂, which is a value. *)
    left...
  - (* T_App *)
    (* If the last rule applied was T_App, then t = t₁ t₂,
       and we know from the form of the rule that
         empty |-- t₁ : T₁ → T₂
         empty |-- t₂ : T₁
       By the induction hypothesis, each of t₁ and t₂ either is a value
       or can take a step. *)
    right.
    destruct IHHt1; subst...
    + (* t₁ is a value *)
      destruct IHHt2; subst...
      × (* t₂ is a value *)
      (* If both t₁ and t₂ are values, then we know that
         t₁ = abs x T₁₁ t₁₂, since abstractions are the only
         values that can have an arrow type.  But
         (abs x T₁₁ t₁₂) t₂ --> [x:=t₂]t₁₂ by ST_AppAbs. *)
        inversion H; subst; try solve_by_invert.
        ∃ <{ [x:=t₂]t₀ }>...
      × (* t₂ steps *)
        (* If t₁ is a value and t₂ --> t₂', then
           t₁ t₂ --> t₁ t₂' by ST_App2. *)
        destruct H₀ as [t₂' Hstp]. ∃ <{ t₁ t₂' }>...
    + (* t₁ steps *)
      (* Finally, If t₁ --> t₁', then t₁ t₂ --> t₁' t₂
         by ST_App1. *)
      destruct H as [t₁' Hstp]. ∃ <{ t₁' t₂ }>...
  - (* T_Proj *)
    (* If the last rule in the given derivation is T_Proj, then
       t = rproj t i and
           empty |-- t : (TRcd Tr)
       By the IH, t either is a value or takes a step. *)
    right. destruct IHHt...
    + (* rcd is value *)
      (* If t is a value, then we may use lemma
         lookup_field_in_value to show tlookup i t = Some ti
         for some ti which gives us rproj i t --> ti by
         ST_ProjRcd. *)
      destruct (lookup_field_in_value _ _ _ _ H₀ Ht H)
        as [ti [Hlkup _] ].
      ∃ ti...
    + (* rcd_steps *)
      (* On the other hand, if t --> t', then
         rproj t i --> rproj t' i by ST_Proj1. *)
      destruct H₀ as [t' Hstp]. ∃ <{ t' --> i }>...
  - (* T_RNil *)
    (* If the last rule in the given derivation is T_RNil,
       then t = trnil, which is a value. *)
    left...
  - (* T_RCons *)
    (* If the last rule is T_RCons, then t = rcons i t tr and
         empty |-- t : T
         empty |-- tr : Tr
       By the IH, each of t and tr either is a value or can
       take a step. *)
    destruct IHHt1...
    + (* head is a value *)
      destruct IHHt2; try reflexivity.
      × (* tail is a value *)
      (* If t and tr are both values, then rcons i t tr
         is a value as well. *)
        left...
      × (* tail steps *)
        (* If t is a value and tr --> tr', then
           rcons i t tr --> rcons i t tr' by
           ST_Rcd_Tail. *)
        right. destruct H₂ as [tr' Hstp].
        ∃ <{ i := t :: tr'}>...
    + (* head steps *)
      (* If t --> t', then
         rcons i t tr --> rcons i t' tr
         by ST_Rcd_Head. *)
      right. destruct H₁ as [t' Hstp].
      ∃ <{ i := t' :: tr }>... Qed.

Weakening

The weakening lemma is proved as in pure STLC.

Lemma weakening : ∀ Gamma Gamma' t T,
     includedin Gamma Gamma' →
     Gamma |-- t \in T →
     Gamma' |-- t \in T.
Proof.
  intros Gamma Gamma' t T H Ht.
  generalize dependent Gamma'.
  induction Ht; eauto using includedin_update.
Qed.

Lemma weakening_empty : ∀ Gamma t T,
     empty |-- t \in T →
     Gamma |-- t \in T.
Proof.
  intros Gamma t T.
  eapply weakening.
  discriminate.
Qed.

Preservation

As before, we prove the substitution lemma by induction on the term t. The only new case (compared to the proof in StlcProp.v) is the case of tm_rcons. For this case, we must do a little extra work to show that substituting into a term doesn't change whether it is a record term.

Lemma substitution_preserves_typing : ∀ Gamma x U t v T,
  (x ⊢> U ; Gamma ) |-- t \in T →
  empty |-- v \in U →
  Gamma |-- [x :=v ]t \in T.
Proof.
  intros Gamma x U t v T Ht Hv.
  generalize dependent Gamma. generalize dependent T.
  induction t; intros T Gamma H;
  (* in each case, we'll want to get at the derivation of H *)
    inversion H; clear H; subst; simpl; eauto.
  - (* var *)
    rename s into y. destruct (eqb_spec x y); subst.
    + (* x=y *)
      rewrite update_eq in H₁.
      injection H₁ as H₁; subst.
      apply weakening_empty. assumption.
    + (* x<>y *)
      apply T_Var. rewrite update_neq in H₁; auto. assumption.
  - (* abs *)
    rename s into y, t into T.
    destruct (eqb_spec x y); subst; apply T_Abs; try assumption.
    + (* x=y *)
      rewrite update_shadow in H₅. assumption.
    + (* x<>y *)
      apply IHt.
      rewrite update_permute; auto.
  - (* rcons *) (* <=== only new case compared to pure STLC *)
     apply T_RCons; eauto.
     inversion H₇; subst; simpl; auto.
Qed.

Theorem preservation : ∀ t t' T,
  empty |-- t \in T →
  t --> t' →
  empty |-- t' \in T.
Proof with eauto.
  intros t t' T HT. generalize dependent t'.
  remember empty as Gamma.
  induction HT;
       intros t' HE; subst;
       try solve [inversion HE; subst; auto].
  - (* T_App *)
    inversion HE; subst...
    + (* ST_AppAbs *)
      apply substitution_preserves_typing with T₁...
      inversion HT₁...
  - (* T_Proj *) (* <=== new case compared to pure STLC *)
    (* If the last rule was T_Proj, then t = rproj t₁ i.
       Two rules could have caused t --> t': ST_Proj1 and
       ST_ProjRcd.  The typing of t' follows from the IH
       in the former case, so we only consider ST_ProjRcd.

       Here we have that t is a record value.  Since rule
       T_Proj was used, we know empty |-- t ∈ Tr and
       Tlookup i Tr = Some Ti for some i and Tr.
       We may therefore apply lemma lookup_field_in_value
       to find the record element this projection steps to. *)
    inversion HE; subst...
    destruct (lookup_field_in_value _ _ _ _ H₂ HT H)
      as [vi [Hget Htyp] ].
    rewrite H₄ in Hget. injection Hget as Hget. subst...
  - (* T_RCons *) (* <=== new case compared to pure STLC *)
    (* If the last rule was T_RCons, then t = rcons i t tr
       for some i, t and tr such that record_tm tr.  If
       the step is by ST_Rcd_Head, the result is immediate by
       the IH.  If the step is by ST_Rcd_Tail, tr --> tr₂'
       for some tr₂' and we must also use lemma step_preserves_record_tm
       to show record_tm tr₂'. *)
    inversion HE; subst...
    apply T_RCons... eapply step_preserves_record_tm...
Qed.

End STLCExtendedRecords.

(* 2024-12-27 01:28 *)