How to write type annotations
Revision: August 24, 2022
Important updates:
-
Version 0.29.0: The new syntax for records and variants is enabled by default (previously, enabled with
--features=rows
). For the transition period, the old type syntax can be activated with--features=no-rows
. See Recipe 9 on transitioning to Type System 1.2. -
Version 0.25.10: This HOWTO introduces new syntax for type aliases. See Type Aliases in ADR-002.
-
Version 0.25.9: This HOWTO introduces new syntax for record types and variants, which is currently under testing. This syntax is activated via the option
--features=rows
. See Type System 1.2 in ADR-002. -
Version 0.23.1: The example specification uses recursive operators, which were removed in version 0.23.1.
-
Version 0.15.0: This HOWTO discusses how to write type annotations for the type checker Snowcat, which is used in Apalache since version 0.15.0 (introduced in 2021).
This HOWTO gives you concrete steps to extend TLA+ specifications with type annotations. You can find the detailed syntax of type annotations in ADR002. The first rule of writing type annotations:
Do not write any annotations at all, until the type checker Snowcat is asking you to write a type annotation.
Of course, there must be an exception to this rule. You have to write type annotations for CONSTANTS and VARIABLES. This is because Snowcat infers types of declarations in isolation instead of analyzing the whole specification. The good news is that the type checker finds the types of many operators automatically.
Recipe 1: Annotating variables
Consider the example HourClock.tla from Specifying Systems:
---------------------- MODULE HourClock ----------------------
\* This is a local copy of the example from Specifying Systems:
\* https://github.com/tlaplus/Examples/blob/master/specifications/SpecifyingSystems/RealTime/HourClock.tla
EXTENDS Naturals
VARIABLE
\* @type: Int;
hr
HCini == hr \in (1 .. 12)
HCnxt == hr' = IF hr # 12 THEN hr + 1 ELSE 1
HC == HCini /\ [][HCnxt]_hr
TypeOK == hr \in (1 .. 12)
--------------------------------------------------------------
THEOREM HC => []HCini
==============================================================
Without thinking much about the types, run the type checker:
$ apalache-mc typecheck HourClock.tla
The type checker complains about not knowing the type of the variable hr
:
...
Typing input error: Expected a type annotation for VARIABLE hr
...
Annotate the type of variable hr
as below. Note carefully that the type
annotation should be between the keyword VARIABLE
and the variable name.
This is because variable declarations may declare several variables at once.
In this case, you have to write one type annotation per name.
VARIABLE
\* @type: Int;
hr
Run the type checker again. You should see the following message:
...
> Running Snowcat .::.
> Your types are purrfect!
> All expressions are typed
...
Recipe 2: Annotating constants
Consider the example Channel.tla from Specifying Systems:
-------------------------- MODULE Channel -----------------------------
\* This is a typed version of the example from Specifying Systems:
\* https://github.com/tlaplus/Examples/blob/master/specifications/SpecifyingSystems/FIFO/Channel.tla
EXTENDS Naturals
CONSTANT Data
VARIABLE chan
TypeInvariant == chan \in [val : Data, rdy : {0, 1}, ack : {0, 1}]
-----------------------------------------------------------------------
Init == /\ TypeInvariant
/\ chan.ack = chan.rdy
Send(d) == /\ chan.rdy = chan.ack
/\ chan' = [chan EXCEPT !.val = d, !.rdy = 1 - @]
Rcv == /\ chan.rdy # chan.ack
/\ chan' = [chan EXCEPT !.ack = 1 - @]
Next == (\E d \in Data : Send(d)) \/ Rcv
Spec == Init /\ [][Next]_chan
-----------------------------------------------------------------------
THEOREM Spec => []TypeInvariant
=======================================================================
Run the type checker:
$ apalache-mc typecheck Channel.tla
The type checker does not know the type of the variable chan
:
Typing input error: Expected a type annotation for VARIABLE chan
According to TypeInvariant
, the variable chan
is a record that has three
fields: val
, rdy
, and ack
. The field val
ranges over a set Data
,
which is actually defined as CONSTANT
. In principle, we can annotate the
constant Data
with a set of any type, e.g., Set(Int)
or Set(BOOLEAN)
.
Since the specification is not using any operators over Data
except equality,
we can use an uninterpreted type as a type for set elements, e.g.,
we can define Data
to have the type Set(DATUM)
. Uninterpreted types are
always written in CAPITALS. Now we can annotate Data
and chan
as follows:
CONSTANT
\* @type: Set(DATUM);
Data
VARIABLE
\* @type: { val: DATUM, rdy: Int, ack: Int };
chan
Note carefully that the type annotation should be between the keyword
CONSTANT
and the constant name. This is because constant declarations may
declare several constants at once. In this case, you have to write one type
annotation per name.
Have a look at the type of chan
:
\* @type: { val: DATUM, rdy: Int, ack: Int };
The type of chan
is a record that has three fields: field val
of type
DATUM
, field rdy
of type Int
, field ack
of type Int
.
The record type syntax is similar to dictionary syntax from programming languages (e.g. Python).
We made it different from TLA+'s syntax for records [ val |-> v, rdy |-> r, ack |-> a ]
and record sets [ val: V, rdy: R, ack: A ]
, to avoid confusion between types and values.
Run the type checker again. You should see the following message:
$ apalache-mc typecheck ChannelTyped.tla
...
> Running Snowcat .::.
> Your types are purrfect!
> All expressions are typed
Recipe 3: Annotating operators
Check the example CarTalkPuzzle.tla from the repository of TLA+
examples. This example has 160 lines of code, so we do not inline it here.
By running the type checker as in previous sections, you should figure out
that the constants N
and P
should be annotated with the type Int
.
Annotate N
and P
with Int
and run the type checker:
$ apalache-mc typecheck CarTalkPuzzle.tla
Now you should see the following error:
[CarTalkPuzzle.tla:52:32-52:35]: Cannot apply f to the argument x() in f[x()].
[CarTalkPuzzle.tla:50:1-52:53]: Error when computing the type of Sum
Although the error message may look confusing, the reason is simple: The type
checker cannot figure out whether the operator Sum
expects a sequence
or a function of integers as its first parameter. By looking carefully at
the definition of Sum
, we can see that it expects: (1) a function from
integers to integers as its first parameter, (2) a set of integers
as its second parameter, and (3) an integer as a result. Hence, we annotate
Sum
as follows:
RECURSIVE Sum(_,_)
\* type: (Int -> Int, Set(Int)) => Int;
Sum(f,S) ==
...
Note that the annotation has to be written between RECURSIVE Sum(_, _)
and
the definition of Sum
. This might change later, see Issue 578 at tlaplus.
After providing the type checker with the annotation for Sum
, we get one
more type error:
[CarTalkPuzzle.tla:160:23-160:26]: Cannot apply B to the argument x in B[x].
[CarTalkPuzzle.tla:160:7-160:37]: Error when computing the type of Image
This time the type checker cannot choose between two options for the second
parameter of Image
: It could be a function, or a sequence. We help the
type checker by writing that the second parameter should be a function
of integers to integers, that is, Int -> Int
:
\* @type: (Set(Int), Int -> Int) => Set(Int);
Image(S, B) == {B[x] : x \in S}
This time the type checker can find the types of all expressions:
...
> Running Snowcat .::.
> Your types are purrfect!
> All expressions are typed
...
Recipe 4: Using variants in heterogenous sets
Check the example TwoPhase.tla from the repository of TLA+ examples (you will also need TCommit.tla, which is imported by TwoPhase.tla). This example has 176 lines of code, so we do not inline it here.
As you probably expected, the type checker complains about not knowing
the types of constants and variables. As for constant RM
, we opt for using
an uninterpreted type that we call RM
. That is:
CONSTANT
\* @type: Set(RM);
RM \* The set of resource managers
By looking at the spec, it is easy to guess the types of the variables
rmState
, tmState
, and tmPrepared
:
VARIABLES
\* @type: RM -> Str;
rmState, \* $rmState[rm]$ is the state of resource manager RM.
\* @type: Str;
tmState, \* The state of the transaction manager.
\* @type: Set(RM);
tmPrepared, \* The set of RMs from which the TM has received $"Prepared"$
\* messages.
The type of the variable msgs
is less obvious. We can check the original
(untyped) definitions of TPTypeOK
and Message
to get an idea about the
type of msgs
:
Message ==
({[type |-> t, rm |-> r]: t \in {"Prepared"}, r \in RM }
\union
{[type |-> t] : t \in {"Commit", "Abort"}})
TPTypeOK ==
...
/\ msgs \in SUBSET Message
From these (untyped) definitions, you can see that msgs
is a set that
contains records of two types: { type: Str }
and { type: Str, rm: RM }
.
This seems to be problematic, as we have to mix in two records types in a
single set, which requires us to specify its only type.
To this end, we have to use the Variants module, which is distributed with
Apalache. For reference, check the Chapter on variants. First, we declare a
type alias for the type of messages in a separate file called
TwoPhaseTyped_typedefs.tla
:
----------------------- MODULE TwoPhaseTyped_typedefs ----------------
(*
@typeAlias: message = Commit(NIL) | Abort(NIL) | Prepared(RM);
*)
TwoPhaseTyped_aliases == TRUE
======================================================================
Usually, we place type aliases in a separate file for when we have to use the same type alias in different specifications, e.g., the specification and its instance for model checking.
With the type alias MESSAGE
, we specify that a message is a variant type,
that is, it can represent three kinds of different values:
-
A value tagged with
Commit
. Since we do not require the variant to carry any value here, we simply declare that the value has the uninterpreted typeNIL
. This is simply a convention, we could use any type in this case. -
A value tagged with
Abort
. Similar toCommit
, we are using theNIL
type. -
A value tagged with
Prepared
. In this case, the value is of importance. We are using the valueRM
, that is, the (uninterpreted) type of a resource manager.
Once we have specified the variant type, we introduce three constructors, one per variant option:
\* @type: $message;
MkCommit == Variant("Commit", "0_OF_NIL")
\* @type: $message;
MkAbort == Variant("Abort", "0_OF_NIL")
\* @type: RM => $message;
MkPrepared(rm) == Variant("Prepared", rm)
Since the values carried by the Commit
and Abort
messages are not
important, we use the uninterpeted value "0_OF_NIL"
. This is merely a
convention. We could use any value of type NIL
. Importantly, the operators
MkAbort
, MkCommit
, and MkPrepared
all produce values of type MESSAGE
,
which makes it possible to add them to a single set of messages.
Now it should be clear how to specify the type of the variable msgs
:
\* @type: Set($message);
msgs
We run the type checker once again:
$ apalache-mc typecheck TwoPhaseTyped.tla
...
> All expressions are typed
Type checker [OK]
As you can see, variants require quite a bit of boilerplate. If you can simply
introduce a set of records of the same type, this is usually a simpler
solution. For instance, we could partition msgs
into three subsets: the
subset of Commit
messages, the subset of Abort
messages, and the subset of
Prepared
messages. See the discussion in Idiom 15.
Recipe 5: functions as sequences
Check the example Queens.tla from the repository of TLA+ examples. It has 85 lines of code, so we do not include it here. Similar to the previous sections, we annotate constants and variables:
CONSTANT
\* @type: Int;
N \** number of queens and size of the board
...
VARIABLES
\* @type: Set(Seq(Int));
todo,
\* @type: Set(Seq(Int));
sols
After having inspected the type errors reported by Snowcat, we annotate the
operators Attacks
, IsSolution
, and vars
as follows:
\* @type: (Seq(Int), Int, Int) => Bool;
Attacks(queens,i,j) ==
...
\* @type: Seq(Int) => Bool;
IsSolution(queens) ==
...
\* @type: <<Set(Seq(Int)), Set(Seq(Int))>>;
vars == <<todo,sols>>
Now we run the type checker and receive the following type error:
[Queens.tla:35:44-35:61]: The operator IsSolution of type ((Seq(Int)) => Bool) is applied to arguments of incompatible types in IsSolution(queens):
Argument queens should have type Seq(Int) but has type (Int -> Int). E@11:07:53.285
[Queens.tla:35:1-35:63]: Error when computing the type of Solutions
Let's have a closer look at the problematic operator definition of Solutions
:
Solutions ==
{ queens \in [1..N -> 1..N]: IsSolution(queens) }
This looks interesting: IsSolution
expects a sequence, whereas
Solutions
produces a set of functions. This is obviously not a
problem in untyped TLA+. In fact, it is a well-known idiom: Construct a
function by using the function set operator, and then apply sequence operators to it.
In Apalache we have to explicitly write that a function should be reinterpreted
as a sequence. To this end, we have to use the operator FunAsSeq
from the
module Apalache.tla. Hence, we add Apalache
to the EXTENDS
clause and
apply the operator FunAsSeq
as follows:
EXTENDS Naturals, Sequences, Apalache
...
Solutions ==
LET Queens == { FunAsSeq(queens, N, N): queens \in [1..N -> 1..N] } IN
{queens \in Queens : IsSolution(queens)}
This time the type checker can find the types of all expressions:
> Running Snowcat .::.
> Your types are purrfect!
> All expressions are typed
Recipe 6: type aliases
Type aliases can be used to provide a concise label for complex types, or to clarify the intended meaning of a simple types in the given context.
Type aliases are declared with the @typeAlias
annotation, as follows:
\* @typeAlias: aliasNameInCamelCase = <type>;
For example, suppose we have annotated some constants as follows:
CONSTANTS
\* @type: Set(PERSON);
Missionaries,
\* @type: Set(PERSON);
Cannibals
If we continue annotating other declarations in the specification, we will see
that the type Set(PERSON)
is used frequently. Type aliases let us provide a
shortcut.
By convention, we introduce all type aliases by annotating an operator called
<PREFIX>_typedefs
, where the <PREFIX>
is replaced with a unique prefix to
prevent name clashes across different modules. Typically <PREFIX>
is just the
module name. For the MissionariesAndCannibalsTyped.tla example, we have:
\* @typeAlias: persons = Set(PERSON);
MissionariesAndCannibals_typedefs = TRUE
Having defined the type alias, we can use it in other definitions anywhere else in the file:
CONSTANTS
\* @type: $persons;
Missionaries,
\* @type: $persons;
Cannibals
VARIABLES
\* @type: Str;
bank_of_boat,
\* @type: Str -> $persons;
who_is_on_bank
Surely, we did not gain much by writing $persons
instead of Set(PERSON)
.
But if your specification has complex types (e.g., records), aliases may help
you in minimizing the burden of specification maintenance. If you add one
more field to the record type, it suffices to change the definition of the type
alias, instead of changing the record type everywhere.
For more details on the design and usage, see Type Aliases in ADR-002.
Recipe 7: Multi-line annotations
A type annotation may span over multiple lines. You may use both the (* ... *)
syntax as well as the single-line syntax \* ...
.
All three examples below are accepted by the parser:
VARIABLES
(*
@type: Int
=> Bool;
*)
f,
\* @type:
\* Int
\* => Bool;
g,
\* @type("Int
\* => Bool
\* ")
h
Note that the parser removes the leading strings " \*"
from the annotations,
similar to how multi-line strings are treated in modern programming languages.
Recipe 8: Comments in annotations
Sometimes, it helps to document the meaning of type components. Consider the following example from Recipe 5:
\* @type: (Seq(Int), Int, Int) => Bool;
Attacks(queens,i,j)
If you think that an explanation of the arguments would help, you can do that as follows:
(*
@type:
(
// the column of an n-th queen, for n in the sequence domain
Seq(Int),
// the index (line number) of the first queen
Int,
// the index (line number) of the second queen
Int
) => Bool;
*)
Attacks(queens,i,j)
You don't have to do that, but if you feel that types can also help you in documenting your specification, you have this option.
Recipe 9: Migrate from Type System 1 to Type System 1.2
As explained in ADR002, Type System 1.2 (TS1.2) differs from Type System 1 (TS1) as follows:
-
TS1 allows one to mix records of varying domains, as long as the records agree on the types of the common fields. Hence, record access is not enforced by the type checker and thus is error-prone.
-
TS1 is using the syntax
[ field_n: T_1, ..., field_n: T_n ]
, which is sometimes confused with the TLA+ expression[ field_n: e_1, ..., field_n: e_n ]
. -
TS1.2 is using the syntax
{ field_n: T_1, ..., field_n: T_n }
for record types and the syntaxTag_1(T_1) | ... | Tag_n(T_n)
for variant types. -
TS1.2 differentiates between records of different domains and does not allow the specification writer to mix them. As a result, TS1.2 can catch incorrect record access. Instead of mixing records, TS1.2 allows one to mix Variants.
-
TS1.2 supports Row polymorphism and thus lets the user write type annotations over records and variants, whose shape is only partially-defined. For example,
{ foo: Int, bar: Bool, a }
defines a record type that has at least two fields (that is,foo
of typeInt
andbar
of typeBool
), but may have more fields, which are captured with the row variablea
.
Case 1: plain records
Many specifications are using plain records. For instance, they do not assign records of different domains to the same variable. Nor do they mix records of different domains in the same set. Plenty of specifications fall into this class.
For example, check Recipe 2. In this recipe, the variable chan
is
always carrying a record with the domain { "val", "rdy", "ack" }
.
In this case, all you have to do is to replace the old record types of the form
[ field_n: T_1, ..., field_n: T_n ]
with the new record types of the form { field_n: T_1, ..., field_n: T_n }
. That is, replace [
and ]
with {
and
}
, respectively.
Case 2: mixed records
Some specifications are using mixed records, which are similar to unions in C.
For example, check Recipe 4. In this recipe, the variable
tmPrepared
is a set that contains records of different domains. For instance,
tmPrepared
may be equal to:
{ [ type |-> "Commit" ], [ type |-> "Prepared", rm |-> "0_OF_RM" ] }
In this case, you have two choices:
-
Partition the single variable into multiple variables, see Idiom 15.
-
Introduce variant types, see Recipe 4.
Known issues
Annotations of LOCAL operators
In contrast to all other cases, a local operator definition does require
a type annotation before the keyword LOCAL
, not after it. For example:
\* @type: Int => Int;
LOCAL LocalInc(x) == x + 1
This may change later when the tlaplus Issue 578 is resolved.