Require Import Coq.Init.Notations.
Require Import Coq.Init.Datatypes.
Inductive Wrapper (T : Type) :=
| Wrap : T -> Wrapper T
.
Inductive Unwrapper :=
| Empty : Unwrapper
| Unwrap : Wrapper Unwrapper -> Unwrapper
.
Fixpoint Unwrapper_size (u : Unwrapper) {struct u} : nat :=
match u with
| Empty => O
| Unwrap w => Wrapper_size w
end
with Wrapper_size (w : Wrapper Unwrapper) {struct w} : nat :=
match w with
| Wrap _ t => Unwrapper_size t
end.
produces the error:
Recursive definition of Wrapper_size is ill-formed.
In environment
Unwrapper_size : Unwrapper -> nat
Wrapper_size : Wrapper Unwrapper -> nat
w : Wrapper Unwrapper
t : Unwrapper
Recursive call to Unwrapper_size has principal argument equal to
"t" instead of a subterm of "w".
Recursive definition is:
"fun w : Wrapper Unwrapper =>
match w with
| Wrap _ t => Unwrapper_size t
end".
The error states that the principal argument of Unwrapper_size is t instead of a subterm of w, but t is precisely a subterm of w.
This is on Coq version 8.8.2, macOS 10.12.6.
The error states that the principal argument of Unwrapper_size is t instead of a subterm of w, but t is precisely a subterm of w.
It is a subterm, but Coq is not able to see it. This is an unfortunate weakness.
The fix is not so complicated: the recursive structure of Unwrapper should be propagated to the parameter A of Wrapper but the function responsible of the guard condition, (check_fix in file inductive.ml if you know a bit the code), is not doing it.
Duplicate of #1632 (in French).
The fix is not so complicated: the recursive structure of Unwrapper should be propagated to the parameter A of Wrapper but the function responsible of the guard condition, (check_fix in file inductive.ml if you know a bit the code), is not doing it.
I don't know the code, but I've just taken a look at check_fix and it's not immediately clear what's going on (though it doesn't seem too complex). How difficult do you think implementing a fix would be for someone new to the code? Could you provide any mentoring instructions on making a fix?
How difficult do you think implementing a fix would be for someone new to the code?
It is difficult for me to evaluate. I now took some time to look at how to proceed and it is maybe 2 or 3 days of work for someone who does not know the code at all but who has a background, say, in typed lambda-calculus, which is your case according to your profile.
Could you provide any mentoring instructions on making a fix?
Let me explain the picture. This shall be a pretty long explanation, so I hope I shall not frighten you, but it is also to explain things step by step.
Inductive types can be mutual, so the implementation of inductive types always manipulates array of inductive types, even if most of the time, the array has only one component.
To each block of mutual inductive types is associated a precomputed recursive structure of the constructors of the inductive types of the block. This is an object of type Declarations.wf_paths, which is registered in the fields Declarations.mind_recargs of a block of mutual inductive types.
When checking the well-guardedness, the function responsible of dispatching the precomputed recursive structure towards the branch of a match is Inductive.branches_specif.
The recursive structure of a block of inductive types is represented using a generic notion of recursive trees called Rtree.t (in lib/rtree.ml). This structure is a bit difficult to understand, but, basically, if we focus on the instance of the generic structure which the guard checker uses, it is defined by the following informal syntax:
R ::= μ(X1..Xn).{I1..In}.i | Xi | ⊥
I ::= {C1..Cn}
C ::= {R1..Rn}
An R is the recursive structure associated to a type. A μ{X1..Xn}.{I1..In} represents the n-tuple of recursive structures of each types of a block of mutual inductive types where the Xis bind the recursive occurrences of the types. A I is thus the recursive structure for one type of the block. It is composed from the collection of recursive structure for its constructors. A C is thus the recursive structure for one constructor. It is composed from the information associated to the array of types of the arguments of the constructors. If an argument of the constructor is of type one of the recursive types of the block, it will be Xi where Xi is the type variable associated to the given type of the block. If an argument of the constructor is of a previously defined inductive type, it will recursively be a recursive structure of the form μ(X1..Xn).{I1..In}.i. If an argument of the constructor is not an inductive type nor a recursive occurrence of the block of inductive types being under consideration, then it is a ⊥.
For instance, for list, the recursive structure is:
μ(X1).{{{},{⊥,X1}}}.1
where the left {} is for nil and the {⊥,X1} is for cons. Note in particular that the argument of type A of cons is marked as non-recursive (⊥).
As another example, the type:
Inductive tree A := node : A -> list (tree A) -> tree A.
has recursive structure
μ(X1).{{{⊥,μ(Y1).{{{},{X1,Y1}}}.1}}}.1
Note here that the argument A of list has been bound to tree A, thus to X1.
One shall see that what we are going to have to do is to be able to dynamically replaced a precomputed recursive structure of the form μ(Y1).{{{},{⊥,Y1}}}.1 (for list A) by a more precise recursive structure μ(Y1).{{{},{X1,Y1}}}.1 where A has been instantiated by tree A (mutatis mutandis, this is the same situation as for Wrapper and Unwrapper in your fixpoint).
Now, the OCaml representation of the above syntax in the code is a bit tricky, if not actually confusing.
The generic Rtree.t structure is instantiated in practice with Declarations.recarg, resulting to a type which is basically the following:
type recarg =
| Norec
| Mrec of inductive
| Imbr of inductive
type Rtree.t =
| Param of int * int
| Node of recarg * Rtree.t array
| Rec of int * Rtree.t array
There is now a bit of encoding. The dictionary is:
μ(X1..Xn).{I1..In}.i = Rec (i, {|I1;...;In|})
Xi = Param (X,i)
⊥ = Node (Norec, [||])
{C1..Cn} = Node (MRec J, [||C1;...;Cn|]) for the toplevel inductive type J
{C1..Cn} = Node (Imbr J, [||C1;...;Cn|]) for an inner inductive type J
{R1..Rn} = Node (Norec, [||R1;...;Rn|])
Note that we use de Bruijn indices for the Xi variables. In Param (X,i), the X is then now a number n referring the n-th Rec block of binder upwards. And the i is the index in the block.
I'm saying that the encoding is somehow confusing because it is reusing the name Norec as a tag for the block {R1..Rn} which has nothing to do with the information of being non-recursive. The name Norec is reused only because we need a name for the encoding but the name does not matter, so any name around is good enough to use (even if anti-pedagogical).
Let's now come to the meat. My proposal is to extend the syntax of the recursive structure so that we remember when the argument of a constructor is of type a parameter of the block of inductive types. This means the following syntax:
S ::= λZ1..Zn.R
R ::= μ(X1..Xn).{I1..In}.i | Xi | ⊥ | Zi t1..tn
I ::= {C1..Cn}
C ::= {R1..Rn}
where λZ1..Zn.R denotes a recursive structure abstracted over the parameters of the block and Z t1..tn denotes a parameter applied to (arbitrary) terms t1...tn (these terms shall be of type Constr.t which is the type of Coq terms).
Then, every time a match on a term v is traversed in the guard check, we don't assume anymore that the recursive structure of the inductive type J param1...paramn index1...indexp of v has parameters with no recursive contents. We instead take the generalized recursive structure S and instantiate it with the actual parameters param1..paramn of this particular instance of the inductive type, dynamically recomputing the recursive substructure of the Zi t1..tn subterms of the recursive structure of J.
The function to instantiate a S to get an R is basically the only new proper function to write. To dynamically compute the recursive substructure of the Z t1..tn, this should be doable by calling the existing function Inductive.check_positive_nested (which is however currently restricted to unary blocks of mutual inductive types).
For returning a S rather than an R, this means changing the return type of the function check_positivity which is the one in charge of precomputing the recursive structure.
All in all, this would also mean changing a few types in the implementation, adding a field in recarg to encode the new case Zi t1..tn, and changing wf_paths into a type representing S.
In any case, I believe that the above is enough to start investigating what can be done if you'd like to do so.
Don't hesitate to ask questions at any time.
[_Edited to fix errors in the first version of the description of the encoding_]
@herbelin Slightly out of topic, but I don't remember reading such a precise explanation of the recursion structure in the comments. You should probably write down the above post in some place in the repository.
@herbelin: thank you very much for your detailed guidance on the issue! Your instructions make sense (without yet having looked at the specific code) and sounds like it could be fairly straightforward. I don't have time immediately to attempt to implement the feature, but I should very much like to and possibly next month or so when things are less busy, I intend to try! Thanks again!
@varkor: ok. Thanks for your reply.
As a coincidence, I attended yesterday a talk from Véronique Benzaken on her Coq formalization of the SQL specification with Évelyne Contejean and @ckeller. They exactly fell on the artificial limitation of Inductive.check_positive_nested to unary blocks of mutual inductive types. This was reported at #6865.
Most helpful comment
It is difficult for me to evaluate. I now took some time to look at how to proceed and it is maybe 2 or 3 days of work for someone who does not know the code at all but who has a background, say, in typed lambda-calculus, which is your case according to your profile.
Let me explain the picture. This shall be a pretty long explanation, so I hope I shall not frighten you, but it is also to explain things step by step.
Mutual inductive types
Inductive types can be mutual, so the implementation of inductive types always manipulates array of inductive types, even if most of the time, the array has only one component.
Recursive structure of a block of mutual inductive types
To each block of mutual inductive types is associated a precomputed recursive structure of the constructors of the inductive types of the block. This is an object of type
Declarations.wf_paths, which is registered in the fieldsDeclarations.mind_recargsof a block of mutual inductive types.When checking the well-guardedness, the function responsible of dispatching the precomputed recursive structure towards the branch of a
matchisInductive.branches_specif.The recursive structure of a block of inductive types is represented using a generic notion of recursive trees called
Rtree.t(inlib/rtree.ml). This structure is a bit difficult to understand, but, basically, if we focus on the instance of the generic structure which the guard checker uses, it is defined by the following informal syntax:An
Ris the recursive structure associated to a type. Aμ{X1..Xn}.{I1..In}represents the n-tuple of recursive structures of each types of a block of mutual inductive types where theXis bind the recursive occurrences of the types. AIis thus the recursive structure for one type of the block. It is composed from the collection of recursive structure for its constructors. ACis thus the recursive structure for one constructor. It is composed from the information associated to the array of types of the arguments of the constructors. If an argument of the constructor is of type one of the recursive types of the block, it will beXiwhereXiis the type variable associated to the given type of the block. If an argument of the constructor is of a previously defined inductive type, it will recursively be a recursive structure of the formμ(X1..Xn).{I1..In}.i. If an argument of the constructor is not an inductive type nor a recursive occurrence of the block of inductive types being under consideration, then it is a⊥.For instance, for
list, the recursive structure is:where the left
{}is forniland the{⊥,X1}is forcons. Note in particular that the argument of typeAofconsis marked as non-recursive (⊥).As another example, the type:
has recursive structure
Note here that the argument
Aoflisthas been bound totree A, thus toX1.One shall see that what we are going to have to do is to be able to dynamically replaced a precomputed recursive structure of the form
μ(Y1).{{{},{⊥,Y1}}}.1(forlist A) by a more precise recursive structureμ(Y1).{{{},{X1,Y1}}}.1whereAhas been instantiated bytree A(mutatis mutandis, this is the same situation as forWrapperandUnwrapperin your fixpoint).OCaml encoding of the recursive structure of a block of mutual inductive types
Now, the OCaml representation of the above syntax in the code is a bit tricky, if not actually confusing.
The generic
Rtree.tstructure is instantiated in practice withDeclarations.recarg, resulting to a type which is basically the following:There is now a bit of encoding. The dictionary is:
Note that we use de Bruijn indices for the
Xivariables. InParam (X,i), theXis then now a numbernreferring then-thRecblock of binder upwards. And theiis the index in the block.I'm saying that the encoding is somehow confusing because it is reusing the name
Norecas a tag for the block{R1..Rn}which has nothing to do with the information of being non-recursive. The nameNorecis reused only because we need a name for the encoding but the name does not matter, so any name around is good enough to use (even if anti-pedagogical).How to take parameters into account in computing the recursive structure
Let's now come to the meat. My proposal is to extend the syntax of the recursive structure so that we remember when the argument of a constructor is of type a parameter of the block of inductive types. This means the following syntax:
where
λZ1..Zn.Rdenotes a recursive structure abstracted over the parameters of the block andZ t1..tndenotes a parameter applied to (arbitrary) termst1...tn(these terms shall be of typeConstr.twhich is the type of Coq terms).Then, every time a
matchon a termvis traversed in the guard check, we don't assume anymore that the recursive structure of the inductive typeJ param1...paramn index1...indexpofvhas parameters with no recursive contents. We instead take the generalized recursive structureSand instantiate it with the actual parametersparam1..paramnof this particular instance of the inductive type, dynamically recomputing the recursive substructure of theZi t1..tnsubterms of the recursive structure ofJ.Implementation sketch
The function to instantiate a
Sto get anRis basically the only new proper function to write. To dynamically compute the recursive substructure of theZ t1..tn, this should be doable by calling the existing functionInductive.check_positive_nested(which is however currently restricted to unary blocks of mutual inductive types).For returning a
Srather than anR, this means changing the return type of the functioncheck_positivitywhich is the one in charge of precomputing the recursive structure.All in all, this would also mean changing a few types in the implementation, adding a field in
recargto encode the new caseZi t1..tn, and changingwf_pathsinto a type representingS.In any case, I believe that the above is enough to start investigating what can be done if you'd like to do so.
Don't hesitate to ask questions at any time.
[_Edited to fix errors in the first version of the description of the encoding_]