Recursive operators and functions
While TLA+ allows the use of recursive operators and functions, we have decided to no longer support them in Apalache from version 0.23.1
onward,
and suggest the use of fold operators, described in Folding sets and sequences instead:
* Similar to Skolem, this has to be done carefully. Apalache automatically
* places this hint by static analysis.
*)
ConstCardinality(__cardExpr) == __cardExpr
(**
THEN __v
ELSE LET __w == CHOOSE __x \in __S: TRUE IN
LET __T == __S \ {__w} IN
ApaFoldSet(__Op, __Op(__v,__w), __T)
These operators are treated by Apalache in a more efficient manner than
recursive operators. They always take at most |S|
or Len(seq)
steps to evaluate and require no additional annotations.
Note that the remainder of this section discusses only recursive operators, for brevity. Recursive functions share the same issues.
The problem with recursive operators
In the preprocessing phase, Apalache replaces every application of a user-defined operator with its body. We call this process "operator inlining". This obviously cannot be done for recursive operators, since the process would never terminate. Additionally, even if inlining wasn't problematic, we would still face the following issues when attempting to construct a symbolic encoding:
-
A recursive operator may be non-terminating (although a non-terminating operator is useless in TLA+);
-
A terminating call to an operator may take an unpredictable number of iterations.
A note on (2): In practice, when one fixes specification parameters (that is, CONSTANTS
),
it is sometimes possible to find a bound on the number of operator iterations.
For instance, consider the following specification:
--------- MODULE Rec6 -----------------
EXTENDS Integers
N == 5
VARIABLES
\* @type: Set(Int);
set,
\* @type: Int;
count
RECURSIVE Sum(_)
Sum(S) ==
IF S = {}
THEN 0
ELSE LET x == CHOOSE y \in S: TRUE IN
x + Sum(S \ {x})
UNROLL_DEFAULT_Sum == 0
UNROLL_TIMES_Sum == N
Init ==
/\ set = {}
/\ count = 0
Next ==
\E x \in (1..N) \ set:
/\ count' = count + x
/\ set' = set \union {x}
Inv == count = Sum(set)
=======================================
It is clear that the expression Sum(S)
requires Cardinality(S)
steps of recursive computation.
Moreover, as the unspecified invariant set \subseteq 1..N
always holds for this specification,
every call Sum(set)
requires up to N
recursive steps.
The previously supported approach
Previously, when it was possible to find an upper bound on the number of iterations of an operator Op
,
such as N
for Sum
in the example above, Apalache would unroll the recursive operator up to this bound.
Two additional operators, UNROLL_DEFAULT_Op
and UNROLL_TIMES_Op
, were required, for instance:
UNROLL_DEFAULT_Sum == 0
UNROLL_TIMES_Sum == N
With the operators UNROLL_DEFAULT_Op
and UNROLL_TIMES_Op
,
Apalache would internally replace the definition of Op
with an operator OpInternal
, that had the following property:
OpInternal
was non-recursive- If computing
Op(x)
required a recursion stack of depth at mostUNROLL_TIMES_Op
, thenOpInternal(x) = Op(x)
- Otherwise,
OpInternal(x)
would return the value, which would have been produced by the computation ofOp(x)
, if all applications ofOp
while the recursion stack height wasUNROLL_TIMES_Op
returnedUNROLL_DEFAULT_Op
instead of the value produced by another recursive call toOp
Unsurprisingly, (3) caused a lot of confusion, particularly w.r.t. the meaning of the value UNROLL_DEFAULT_Op
.
Consider the following example:
RECURSIVE Max(_)
Max(S) ==
IF S = {}
THEN 0
ELSE
LET x == CHOOSE v \in S: TRUE IN
LET maxRest == Max(S \ {x})
IN IF x < maxRest THEN maxRest ELSE x
As computing Max(S)
requires |S|
recursive calls, there is no static upper bound to the recursion stack height that works for all set sizes.
Therefore, if one wanted to use this operator in Apalache, one would have to guess (or externally compute) a value N
,
such that, in the particular specification, Max
would never be called on an argument, the cardinality of which exceeded N
, e.g.
UNROLL_TIMES_Max = 2
In this case, Apalache would produce something equivalent to
MaxInternal(S) ==
IF S = {}
THEN 0
ELSE
LET x1 == CHOOSE v \in S: TRUE IN
LET maxRest1 ==
IF S \ {x1} = {}
THEN 0
ELSE
LET x2 == CHOOSE v \in S \ {x1}: TRUE IN
LET maxRest2 ==
IF S \ {x1, x2} = {}
THEN 0
ELSE
LET x3 == CHOOSE v \in S \ {x1, x2}: TRUE IN
LET maxRest3 == UNROLL_DEFAULT_Max
IN IF x3 < maxRest3 THEN maxRest3 ELSE x3
IN IF x2 < maxRest2 THEN maxRest2 ELSE x2
IN IF x1 < maxRest1 THEN maxRest1 ELSE x1
In this case, MaxInternal({1,42}) = 42 = Max({1,42})
, by property (2) as expected, but MaxInternal(1..10)
can be any one of
3..10 \union {UNROLL_DEFAULT_Max}
(depending on the value of
UNROLL_DEFAULT_Max
and the order in which elements are selected by CHOOSE
), by property (3).
So how does one select a sensible value for UNROLL_DEFAULT_Op
? The problem is, one generally cannot.
In the Max
case, one could pick a "very large" N
and then assume that Max
computation has "failed"
(exceeded the UNROLL_TIMES_Max
recursion limit) if the result was ever equal to N
,
though "very large" is of course subjective and gives absolutely no guarantees that Max
won't be called on a set containing some element M > N
.
As the recursion becomes more complex (e.g. non-primitive or non-tail), attempting to implement a sort of "monitor" via default values quickly becomes impractical, if not impossible.
Fundamentally though, it is very easy to accidentally either introduce spurious invariant violations, or hide actual invariant violations by doing this. For instance, in a specification with
UNROLL_DEFAULT_Max = 42
UNROLL_TIMES_Max = 2
Inv == \A n \in 10..20: Max(1..n) = 42
Apalache could "prove" Inv
holds, as it would rewrite this Inv
to
\A n \in 10..20: MaxInternal(1..n) = 99
and MaxInternal(1..n)
evaluates to 99
for all n \in 3..99
(and might still evaluate to 99
for n > 99
,
based on the CHOOSE
order), despite the fact that Max(1..n) = n
in the mathematical sense.
Consider now the much simpler alternative:
NonRecursiveMax(S) ==
LET Max2(a,b) == IF a < b THEN b ELSE a IN
ApaFoldSet(Max2, 0, S)
In this case, the user doesn't have to think about defaults (aside from the empty-set case), or recursion,
as ApaFoldSet
ensures |S|
-step "iteration". As an additional benefit, one also doesn't need to use CHOOSE
to select elements this way.
So ultimately, the reasons for abandoning support for recursive operators boils down to the following:
- In the vast majority of cases, equivalent functionality can be achieved by using
ApaFoldSet
orApaFoldSeqLeft
UNROLL_TIMES_Op
is hard to determine, or doesn't exist statically,UNROLL_DEFAULT_Op
is hard to determine,- Apalache doesn't have runtime evaluation of recursion, so it can't natively determine when a call to a recursive
Op
would have required more thanUNROLL_TIMES_Op
steps - The use of recursive operators produces unpredictable results, particularly when used in invariants