# Introduction to formal modeling

A model in our settings is a transition system. The goal of the following exercises is to build models that represents the behaviors of given system, with specific verification goals in mind. We will first write models as transition systems, and then write the corresponding PlusCal code.

## Foreword on the PlusCal language

PlusCal is an algorithmic language. It differs from programming languages since:

• PlusCal algorithms can manipulate arbitrary mathematical objects (graphs, etc). Programming languages require to implement these objects using low-level types (integers, booleans, etc).
• PlusCal algorithms can be non-deterministic. Programs need to be deterministic in order to be executable.
• PlusCal has a well-defined notion of an atomic step. In general, the instructions in a program are not atomic (mechanisms are usually provided to group instructions in atomic steps)
• PlusCal has a formal semantics defined as TLA+ expressions, that, in turns, formally defines the state-space and the runs of the algorithm.

The TLA tools offer three main features:

• The translation of PlusCal programs into TLA+ expressions.
• The TLC model-checker that allows to find bugs in algorithms, and to prove algorithms with finite resources.
• The TLA prover which is not in the scope of this class.

Please refer to the PlusCal documentation ~herbrete/public/TLA/c-manual.pdf and ~herbrete/public/TLA/pluscal.pdf if needed.

## A simple C function

We consider the following C function:

int f(int x)
{
while (x != 0) {
if (x < -1)
x = 0;
x = x - 1;
}
}
1. Model the behaviors of the C function f above as a transition system assuming that int values range from -2 to 2. What defines a state?
2. Does it have infinite runs? What can you conclude?
3. The file simple_process.tla contains a PlusCal model of the function f above. Compare the model to the C code, and check for termination using the TLC model-checker.

## A multi-process C program

We consider the C program below which has a shared variable x and a function increment that is executed by several processes. This function first stores the value of x in a local variable y. Then, it writes y+1 into the shared variable x. We assume that each assignment is an atomic instruction.

int x = 0;

void increment()
{
int y = x;
x = y + 1;
}
1. What defines a state of this program with N processes?
2. Model the behavior of this program for N=2 threads as a transition system
3. What can you conclude regarding the final value of x?

The following code is a PlusCal model of the C program above, with N threads executing the function increment.

EXTENDS Naturals, TLC

CONSTANTS N (* Number of processes *)

(*
--algorithm non_atomic_increment {
variables
x = 0;

process (P \in 1..N)
variables
y = 0;
{
l0:    y := x;
l1:    x := y + 1
}

}
*)

In this algorithm:

• EXTENDS is used to import libraries (here, the TLC library and the natural numbers library).
• CONSTANT introduces new symbolic names. Here N is the number of processes in the algorithm. Its value is unspecified yet.
• The algorithm itself is placed inside a comment bloc (* ... *). Its specification starts with --algorithm followed by a name.
• The variable x is declared in the algorithm scope. It is a global variable that is shared by all the processes. The variable y is declared in the process scope. Each process of type P has its own local variable y.
• The declarations x = 0 and y = 0 define the initial values of variables x and y. PlusCal allows non-deterministic initial values. For instance, replacing y = 0 by y \in 1..10 result in choosing the initial value of y as any value between 1 and 10. Our algorithm would thus have 10 initial states: one for each possible initial value of y.
• l0 and l1 are labels that identify statements. Labels are used to define atomic blocks of statements as we will see later.

Use the parse module command to check for syntax errors, and get a translation of the PlusCal code in TLA+. The TLA+ translation corresponds to a logical description of the algorithm. It is used by the model-checking tool TLC to assess the correcteness of the algorithm.

Our goal is to check that, when the algorithm terminates, the value of x is equal to N:

AllDone == \A self \in 1..N: pc[self] = "Done"
Correctness == [](AllDone => x=N)
1. What is expressed by the TLA formulas above?
2. Create a PlusCal model atomicity, and add this specification after the translation of the algorithm. Check the correctness of the algorithm for N=1 process, and then N=2 processes. What do you observe? Does this correspond to the transition system model?

Labels in PlusCal algorithms define which blocs of statements are executed in an atomic way. Labeling is a key to the modeling of concurrent and distributed algorithms where correctness heavily depends on which statements can be executed atomically. PlusCal puts restrictions on labels in order to let the translation in TLA+ be as close as possible to the algorithm. These constraints are described in section 3.7 of the PlusCal user manual (available at ~herbrete/public/TLA/c-manual.pdf).

1. Remove the label l1 in the previous algorithm. Draw the corresponding transition system and check that the specification hold.
2. Explain why removing label l1 result in an algorithm that does not faithfully model the C program multiprocess.c

## A small model for a C program

We consider the following C program where the lock is initially free, and variables old and new are 8-bytes pointers. The unspecified condition ... in the if instruction has been abstracted for simplicity and should be modelled by a non-deterministic choice.

// lock is free, old and new are 8-bytes pointers
do {
lock();
old = new;
if (...) {
unlock();
new++;
}
}
while (new != old);
1. How many states has this program?
2. Build an abstract model of the program where states only keep track of: the current instruction, the status of the lock (boolean), and whether old is equal to new (i.e. the boolean value of old == new).
3. How many states has your model? Can you prove that the lock() instruction is never blocking from your model?