RIS.normalise: normal form for words up-to alpha-equivalence.

Set Implicit Arguments.

Require Export transducer representative traces alternate_eq.

Section s.

Adding natural numbers in the set of atoms

Let us fix a set of atoms and an alphabet.

Context {atom X : Set}{𝐀 : Atom atom} {𝐗 : Alphabet 𝐀 X}.

Lifting variables

We write 𝐎 for the set of representatives of X, as constructed in RIS.representative.

Definition 𝐎 := (@Repr _ X 𝐀).

For technical reasons, in the following we need the set of representatives to be a nominal set rather than a nominal setoid. This is due to our definition of an alphabet. Therefore, we assume we have a function ϕ from 𝐎 to some set R, decidable and nominal over pointer, such that:

  Parameter R : Set.
  Context {𝐑 : Alphabet 𝐏 R}.
  Parameter ϕ : 𝐎 → R.

ϕ is injective up-to ≃;

Axiom ϕ_inj : ∀ x y : 𝐎, x ≃ y ↔ ϕ x = ϕ y.

ϕ is equivariant;

Axiom ϕ_equivariant : ∀ p x, ϕ (p ⊙x) = p ∙(ϕ x ).

and ϕ preserves supports.

Axiom ϕ_support : ∀ x, ⌊ϕ x ⌋ ≈ ⌈x ⌉.

Lifting letters

Notation " a ? " := (@bound atom a) (at level 20).

We can build using η, ϕ and ! a map ⦑ ⦒: letter 𝐀 𝐗 → letter 𝐏 𝐑.

  Definition repr_letter : @letter _ _ 𝐀 𝐗 → @letter _ _ 𝐏 𝐑 :=
    fun l ⇒
      match l with
      | ⟨ a ⇒ ⟨ (a!)
      | a ⟩ ⇒ (a!) ⟩
      | var x ⇒ var (ϕ(η x))
      end.
  Notation " ⦑ l ⦒ " := (repr_letter l).

The resulting function is injective and equivariant.

  Lemma repr_letter_injective : injective (fun l ⇒ ⦑l ⦒).

  Lemma repr_letter_equivariant :
    ∀ (x : @letter _ _ 𝐀 𝐗) (π : @perm _ 𝐀), ⦅π ⦆ ∙ ⦑ x ⦒ = ⦑ π ∙x ⦒.

Furthermore, the support of ⦑x⦒ is the set of free names a! such that a is in the support of x.

Lemma repr_letter_support : ∀ x, ⌊⦑x ⦒⌋ ≈ map free ⌊x ⌋.

⦑_⦒ preserves binding powers.

Lemma repr_letter_binding a l : 𝗳 a l = 𝗳 (a !) ⦑l ⦒.

Lifting words

We write W for words built out of ⟨p, p⟩ and var r, were p ranges over pointers and r ranges over R.

Notation W := (@word _ _ _ 𝐑).

We may naturally lift ⦑ ⦒: X → R as ⟪ ⟫ : word 𝐀 𝐗 → W.

Notation " ⟪ w ⟫ " := (map repr_letter w).

⟪_⟫ is injective and equivariant.

Lemma repr_word_injective : injective (fun u ⇒ ⟪u ⟫).

Lemma repr_word_perm π u : ⟪π ∙u ⟫ = ⦅π ⦆ ∙ ⟪u ⟫.

⟪_⟫ behaves well with supports.

Lemma repr_word_support a u : a ∈ ⌊u ⌋ ↔ a ! ∈ ⌊⟪u ⟫⌋.

⟪_⟫ preserves binding powers.

Corollary repr_word_binding a u : 𝗙 a u = 𝗙 (a !) ⟪u ⟫.

⟪_⟫ is injective up-to alpha-equivalence.

Theorem repr_word_equiv u v : u ≡ v ↔ ⟪u ⟫ ≡ ⟪v ⟫.

Valid words and dynamic sequences

Validity

Definition

A word u is valid with respect to opening brackets if for every way of splitting u into u1++⟨n?::u2, n is exactly the length of u1 and there is a match bracket ?n⟩ in u2.

  Definition valid_open (u : W) :=
    ∀ u1 u2 n, u = u1 ++open(n ?)::u2 →
               n = ⎢u1 ⎥ ∧ close(n ?) ∈ u2.

u is valid with respect to variable if for any decomposition u=u1++var r::u2, and any bound number n?∈⌊r⌋, there are match brackets ⟨n? in u1 and ?n⟩ in u2.

  Definition valid_var (u : W) :=
    ∀ u1 u2 r n, u = u1 ++var r ::u2 → n ? ∈ ⌊var r ⌋ →
                 open(n ?)∈ u1 ∧ close (n ?) ∈ u2.

Finally, u is valid with respect to closing brackets if for every decomposition u=u1++?n⟩::u2 there is a matching bracket ⟨?n in u1 and no other n?⟩ anywhere in u1++u2.

  Definition valid_close (u : W) :=
    ∀ u1 u2 n, u = u1 ++close (n ?)::u2 →
               open (n ?) ∈ u1 ∧ ¬ close (n ?) ∈ (u1 ++u2 ).

A word is valid if it is valid with respect to all three kinds of letters.

  Definition valid u := valid_open u ∧ valid_var u ∧ valid_close u.

  Remark is_valid_open u1 u2 n :
    valid (u1 ++open(n ?)::u2) → n = ⎢u1 ⎥ ∧ close(n ?) ∈ u2.

  Remark is_valid_var u1 u2 x n :
    valid (u1 ++var x ::u2) → n ? ∈ ⌊x ⌋ → open(n ?)∈ u1 ∧ close (n ?) ∈ u2.

  Remark is_valid_close u1 u2 n :
    valid(u1 ++close (n ?)::u2) → open (n ?) ∈ u1 ∧ ¬ close (n ?) ∈ (u1 ++u2 ).

Boolean predicate

test_var w1 w2 p is true if p is a free name a! or a bound number k? such that ⟨k? appears in w1 and k?⟩ appears in w2.

  Definition test_var w1 w2 (p : @pointer atom) :=
    match p with
    | a! ⇒ true
    | bound k ⇒ inb (open(k?)) w1 && inb (close(k?)) w2
    end.

test_valid_aux w1 w2 is the auxiliary function that we use to test validity of a string. Intuitively, when testing a word u, we will successively test its suffixes w2, while remembering the matching prefix rev w1 to check for instance the presence of an opening bracket to the left of some closing bracket.

  Fixpoint test_valid_aux w1 w2 :=
    match w2 with
    | [] ⇒ true
    | open (bound k)::w2 ⇒ k =?= ⎢w1 ⎥ && inb (close (k?)) w2
                             && test_valid_aux (open(k?)::w1) w2
    | close (bound k)::w2 ⇒ (inb (open (k?)) w1 )
                              && (negb (inb (close (k?)) (w1 ++w2)))
                              && test_valid_aux (close(k?)::w1) w2
    | var r::w2 ⇒ forallb (test_var w1 w2) ⌊r⌋
                    && test_valid_aux (var r::w1) w2
    | b::w2 ⇒ test_valid_aux (b::w1) w2
    end.

  Definition test_valid w := test_valid_aux [] w.

We now prove that test_valid w correctly tests if w is valid. Before doing so, we establish the following two facts. If test_valid_aux (rev u1++w) u2 is false, then this value is propagated and therefore test_valid_aux w (u1++u2) is also false.

Fact test_valid_aux_false_app w u1 u2 :
test_valid_aux (rev u1 ++w) u2 = false → test_valid_aux w (u1 ++u2) = false.

test_valid w is true if and only if every test of suffixes of w is true, i.e. :

Fact test_valid_true w :
test_valid w = true ↔ ∀ u v, w = rev u ++ v → test_valid_aux u v = true.

We may now check that a word is valid if and only if test_valid of that word is true.

Lemma test_valid_spec w : valid w ↔ test_valid w = true.

Remarks about validity

⟪_⟫ always produces valid words.

Proposition valid_repr_word u : valid ⟪u ⟫.

If a bound number appears in the support of a valid word w, then an opening bracket labelled with that number must be present in w.

Lemma valid_support_open w n : valid w → (n ?) ∈ ⌊w ⌋ → open (n ?) ∈ w.

We may strengthen this remark, by noticing that n? belongs to the support of a valid word w if and only if w's nth letter is ⟨n?.

  Lemma valid_support_open_explicit w n :
    valid w →
    (n ?) ∈ ⌊w ⌋ ↔ ∃ w1 w2, w = w1 ++ open(n ?)::w2 ∧ n = ⎢w1 ⎥.

The support of a valid word may only contain bound numbers smaller than its length.

Lemma valid_bound_support_lt u n : valid u → n ? ∈ ⌊u ⌋ → n < ⎢u ⎥.

Numbers are always α-fresh in valid words.

Lemma valid_bound_αfresh u n : valid u → n ? #α u.

Validity may be characterised by the following property : u is valid if and only if it has no α-free numbers and opening brackets labelled with a number only appear at the correct position.

Proposition alternate_validity u :
valid u ↔ (∀ n, n ? #α u ) ∧ (∀ u1 u2 n, u = u1 ++open(n ?)::u2 → n = ⎢u1 ⎥).

Defects

w has a defect if it can be decomposed as w1++⟨a!::w2++a!⟩++w3.

Definition match_defect w :=
∃ w1 w2 w3 a, w = w1 ++ ⟨(a !) ::w2 ++(a !)⟩::w3.

match_defectb a w checks if w has a decomposition w1++a!⟩::w2 such that w1 does not contain the letter ⟨a!.

  Fixpoint match_defectb a w :=
    match w with
    | [] ⇒ false
    | l::w ⇒ (l=?=close(a !))
             || ((negb (l=?= open (a !)))
                  && match_defectb a w )
    end.

  Remark match_defectb_spec a w :
    match_defectb a w = true ↔ ∃ w1 w2, w = w1 ++ close (a !) :: w2
                                        ∧ ¬ open (a !) ∈ w1
                                        ∧ ¬ close (a !) ∈ w1.

With the predicate match_defectb, we can reformulate match_defect as follows.

  Lemma match_defect_match_defectb w :
    match_defect w ↔ ∃ w1 w2 a, w = w1 ++ open (a !) :: w2
                                ∧ match_defectb a w2 = true.

w has a defect at n means that w can be decomposed as w1++⟨a!::w2, with w1 having length n and match_defectb a w2 holds.

Definition match_defect_at_n w n :=
∃ w1 w2 a, ⎢w1 ⎥ = n ∧ w = w1 ++open (a !)::w2 ∧ match_defectb a w2 = true.

Notice that w has a defect if and only if it has a defect at some n.

  Remark match_defect_at_n_spec (w : @word _ _ _ 𝐑) :
    match_defect w ↔ ∃ n, match_defect_at_n w n.

  Remark match_defect_at_Sn l w n :
    match_defect_at_n (l ::w) (S n) ↔ match_defect_at_n w n.

first_defect w n holds when w has a defect at n and n is the minimal such number.

  Definition first_defect w n :=
    match_defect_at_n w n ∧ ∀ k, match_defect_at_n w k → n ≤ k.

  Remark first_defect_Sn l w n : first_defect (l ::w) (S n) → first_defect w n.

We may also check that if w has a defect then it has a first defect.

Remark match_defect_first_defect (w : @word _ _ _ 𝐑) :
match_defect w → ∃ n, first_defect w n.

defect_count will be used later on to guess the number of rewrite steps needed to put a word in normal form. It counts the number of decompositions w1++⟨a!::w2 such that a! is not open-balanced in w2.

  Fixpoint defect_count w :=
    match w with
    | [] ⇒ 0
    | open (a!)::w ⇒
      if d_binding (𝗙 (a!) w) =?= 0
      then defect_count w
      else S (defect_count w)
    | b::w ⇒ defect_count w
    end.

Notice that the following case distinction holds:

  Remark defect_count_disj b w :
    (∃ a, b = open(a !))
    ∨ ((∀ a, b ≠ open(a !))/\ defect_count (b ::w) = defect_count w ).

In particular, having a positive defect_count is equivalent to having a defect.

Lemma defect_count_spec w : match_defect w ↔ 0 < defect_count w.

The following case distinction will be useful: either w has no defect, otherwise it may be decomposed as w1++open(a!)::w2++close(a!)::w3 such that the first defect of w is at ⎢w1⎥ and w2 does not contain a bracket labelled with a!.

  Remark case_defect w :
    {¬ match_defect w } + {∃ w1 w2 w3 a, w = w1 ++open(a !)::w2 ++close(a !)::w3
                                        ∧ first_defect w ⎢w1 ⎥
                                        ∧ ¬ open (a !) ∈ w2
                                        ∧ ¬ close (a !) ∈ w2 }.

Dynamic sequences

Dynamic sequences will be our normal forms. A dynamic sequence is a valid word that has no defect.

Definition dynamic_sequence w := valid w ∧ ¬ match_defect w.

This property is decidable.

  Definition test_dyn_seq (w : @word _ _ _ 𝐑) :=
    test_valid w && defect_count w =?= 0.

  Lemma test_dyn_seq_spec w : test_dyn_seq w = true ↔ dynamic_sequence w.

Dynamic sequences enjoy the following property: two dyn. seq. u,v are alpha-equivalent if and only if they are equal.

Theorem equiv_dynamic_sequence_eq u v :
dynamic_sequence u → dynamic_sequence v → u ≡ v → u = v.

Normalisation

Rewrite steps

  Inductive red_def : relation W :=
  | step u v w a : ¬ open (a !) ∈ v → ¬ close (a !) ∈ v →
                   red_def (u ++open (a !)::v ++close (a !)::w)
                           (u ++open (⎢u ⎥?)::([(a !,⎢u ⎥?)]∙v )++close(⎢u ⎥?)::w).

  Lemma red_def_valid u v : valid u → red_def u v → valid v.

  Lemma red_def_defect_count u v :
    valid u → red_def u v → defect_count v = defect_count u - 1.

  Lemma red_def_equiv u v :
    valid u → red_def u v → u ≡ v.

hide a n w goes through the word w, applying to each letter the transposition (a!,n?) until the letter a!⟩ is found, after which the procedure stops.

  Fixpoint hide a n w :=
    match w with
    | [] ⇒ []
    | b::w ⇒ if close (a !) =?= b
             then ([(a !,n ?)]∙b)::w
             else ([(a !,n ?)]∙b)::(hide a n w )
    end.

More directly, if w1++a!⟩::w2 is a decomposition of w such that a!⟩ does not appear in w1, then hide a n w produces the word ((a!,n?)∙w1)++n?⟩::w2.

  Lemma hide_close a n w1 w2 :
    ¬ close(a !) ∈ w1 →
    hide a n (w1 ++close(a !)::w2) = ([(a !,n ?)] ∙w1 )++close(n ?)::w2.

This function is meant to remove the first defect of a word. The argument n is used to store the length of list visited so far.

  Fixpoint remove_defect_aux n w :=
    match w with
    | [] ⇒ []
    | open (a!)::w ⇒
      if match_defectb a w
      then open(n ?)::hide a n w
      else open(a!)::remove_defect_aux (S n) w
    | b::w ⇒ b::remove_defect_aux (S n) w
    end.

  Definition remove_defect := remove_defect_aux 0.

If the word has no defect, then remove_defect leaves it unchanged (whatever the numeric argument). If on the other hand w has a first defect, then remove_defect will perform a rewrite step.

Lemma case_remove_defect w :
{¬match_defect w ∧ remove_defect w = w } + {red_def w (remove_defect w)}.

remove_defect preserves validity.

Proposition remove_defect_valid w : valid w → valid (remove_defect w).

The number of defects of remove_defect w is exactly defect_count w - 1, meaning that if w is not a dynamic sequence, remove_defect will indeed remove one defect.

Lemma defect_count_remove_defect w :
valid w → defect_count (remove_defect w) = defect_count w - 1.

We may also show rather simply that remove_defect w is alpha-equivalent to w, provided w was valid.

Lemma remove_defect_equiv w :
valid w → remove_defect w ≡ w.

Normalisation procedure

iter n f x applies f to x successively n times. iter n is in fact the Church numeral n.

  Fixpoint iter {A} n (f : A → A) x :=
    match n with
    | 0 ⇒ x
    | S n ⇒ iter n f (f x)
    end.

To normalise a word w, we apply to it the remove_defect function defect_count w times.

Definition normalise w := iter (defect_count w) remove_defect w.

If we start with a valid word, normalise w will produce a dynamic sequence that is alpha-equivalent to w.

Proposition normalise_spec w :
valid w → dynamic_sequence (normalise w) ∧ w ≡ normalise w.

Combining what we have established so far, it immediately follows that two words over X are alpha-equivalent if and only if their normal form after ⟪_⟫-translation coincide.

Theorem normal_form :
∀ u v, u ≡ v ↔ normalise ⟪u ⟫ = normalise ⟪v ⟫.

Concatenation of valid words and dynamic sequences

Validity and the property of being a dynamic sequence are not stable under concatenation. However, we may define alternative products on words that does preserve these properties while remaining alpha-equivalent to the concatenation.

Addition as a permutation

Obviously the operation n+_ is not a finitely supported permutation, but if we know of a bound on the argument, we may build a permutation that mimics this operation. shift n produces a permutation of pointers that bound numbers smaller that n to their successors, n to 0 and maps free names and bound numbers greater than n to themselves.

  Fixpoint shift n : (@perm (@pointer atom) _):=
    match n with
    | 0 ⇒ []
    | S n ⇒ (shift n )++[(n ?,(S n )?)]
    end.

  Lemma shift_free n (a : atom) : shift n ∙ (a !) = a !.

  Lemma shift_bound_n n : shift n ∙ (n ?) = 0?.

  Lemma shift_bound_gt n m : n < m → shift n ∙ (m ?) = m ?.

  Lemma shift_bound_lt n m : m < n → shift n ∙ (m ?) = (S m )?.

By iterating shift cleverly, we can mimic addition by n for numbers smaller that m. This is the meaning of the permutation Shift n m.

  Fixpoint Shift n m :=
    match n with
    | 0 ⇒ []
    | S n ⇒ shift (n +m) ++ Shift n m
    end.

The specification of Shift is described by the next two lemmas.

Lemma Shift_free (a : atom) n m : Shift n m ∙ (a !) = a !.

Lemma Shift_bound_lt n m k : k < m → Shift n m ∙ (k ?) = (n +k )?.

Notice that applying Shift n m to any bound number yields some bound number.

Remark Shift_bound n m k : ∃ l, Shift n m ∙ (k ?) = l ?.

Validity preserving concatenation

If u and v are valid, then so is u++Shift ⎢u⎥⎢v⎥∙v.

Lemma valid_app u v : valid u → valid v → valid (u ++Shift ⎢u ⎥⎢v ⎥∙v).

To concatenate dynamic sequences, we need in addition to normalise the result:

Definition app_dyn u v := normalise (u ++Shift ⎢u ⎥⎢v ⎥∙v).
Infix " +++ " := app_dyn (at level 50).

Indeed, whenever u and v are valid, u+++v is a dynamic sequence.

Lemma app_dyn_dynamic_sequence u v :
valid u → valid v → dynamic_sequence (u +++ v).

Furthermore, u+++v is alpha-equivalent to u++v.

Lemma app_dyn_equiv_app u v : valid u → valid v → u +++ v ≡ u ++ v.

This operation allows us to generalise our normal_form theorem as follows: u and v are alpha-equivalent if and only if for any valid word w we have w +++ ⟪u⟫ = w +++ ⟪v⟫.

Proposition app_dyn_equiv u v :
u ≡ v ↔ ∀ w, valid w → w +++ ⟪u ⟫ = w +++ ⟪v ⟫.

Relational interpretation

We now build a relational interpretation of words over X, and prove that this interpretation is fully abstract with respect to alpha-equivalence, in the sense that two words will be alpha-equivalent if and only if they are mapped to the same binary relation by the interpretation. We define the type of dynamic sequences using a sigma-type construction. This will form the base type for our relations.

Definition Dyn :={w : word | test_dyn_seq w = true }.

Lemma Dyn_eq (o1 o2 : Dyn) : $ o1 = $ o2 ↔ o1 = o2.

Recall that we consider two relations to be equal if they contain the same pairs.

Global Instance relseq : SemEquiv (relation Dyn) := eqRel.

The interpretation ρ l of a letter l consists of the pairs (w,w+++⦑l⦒) for every dynamic sequence w.

Definition ρ : letter → relation Dyn :=
fun l w1 w2 ⇒ $ w2 = ($ w1 ) +++ [⦑l ⦒].

This interpretation lifts to words, in the sense that the relation associated with u contains exactly the pairs (w,w+++⟪u⟫).

Lemma ρ _spec u : mapRel ρ u ≃ fun w1 w2 ⇒ $ w2 = $ w1 +++ ⟪u ⟫.

It is then fairly routine to check that this relational interpretation is indeed fully-abstract.

Theorem free_relational_interpretation :
∀ u v, u ≡ v ↔ mapRel ρ u ≃ mapRel ρ v.

End s.