# Introduction to PlusCal

The following exercices use PlusCal and TLA to write and check algorithms. Please refer to the documentation ~herbrete/public/TLA/c-manual.pdf and ~herbrete/public/TLA/pluscal.pdf.

## Introduction to the PlusCal language

PlusCal is an algorithmic language. It differs from programming languages since:

• PlusCal algorithms can manipulate arbitrary mathematical objects (graphs, etc). Programming languages require to implement these objects using low-level types (integers, booleans, etc).
• PlusCal algorithms can be non-deterministic. Programs need to be deterministic in order to be executable.
• PlusCal has a well-defined notion of an atomic step. In general, the instructions in a program are not atomic (mechanisms are usually provided to group instructions in atomic steps)
• PlusCal has a formal semantics defined as TLA+ expressions, that, in turns, formally defines the state-space and the runs of the algorithm.

The TLA tools offer three main features:

• The translation of PlusCal programs into TLA+ expressions.
• The TLC model-checker that allows to find bugs in algorithms, and to prove algorithms with finite resources.
• The TLA prover which is not in the scope of this class.

## Semantics of multi-process algorithms and Atomicity

The following algorithm has one global variable x that is shared by N processes. Each process increments x in two steps. It first reads the value of x and stores it in a local variable y. It then writes back y+1 to x. These two steps model read and write operations on a shared variable in a context where they cannot be achieved in a single atomic step (e.g. a C program using threads or unix processes).

EXTENDS Naturals, TLC

CONSTANTS N (* Number of processes *)

(*
--algorithm non_atomic_increment {
variables
x = 0;

process (P \in 1..N)
variables
y = 0;
{
l0:    y := x;
l1:    x := y + 1
}

}
*)

In this algorithm:

• EXTENDS is used to import libraries (here, the TLC library and the natural numbers library).
• CONSTANT introduces new symbolic names. Here N is the number of processes in the algorithm. Its value is unspecified yet.
• The algorithm itself is placed inside a comment bloc (* ... *). Its specification starts with --algorithm followed by a name.
• The variable x is declared in the algorithm scope. It is a global variable that is shared by all the processes. The variable y is declared in the process scope. Each process of type P has its own local variable y.
• The declarations x = 0 and y = 0 define the initial values of variables x and y. PlusCal allows non-deterministic initial values. For instance, replacing y = 0 by y \in 1..10 result in choosing the initial value of y as any value between 1 and 10. Our algorithm would thus have 10 initial states: one for each possible initial value of y.
• l0 and l1 are labels that identify statements. Labels are used to define atomic blocks of statements as we will see later.

Use the parse module command to check for syntax errors, and get a translation of Euclid’s algorithm in TLA+. The TLA+ translation corresponds to a logical description of the algorithm. It is used by the model-checking tool TLC to assess the correcteness of the algorithm.

Our goal is to check that, when the algorithm terminates, the value of x is equal to N:

AllDone == \A self \in 1..N: pc[self] = "Done"
Correctness == [](AllDone => x=N)
1. What is expressed by the TLA formulas above?
2. Create a PlusCal model atomicity, and add this specification after the translation of the algorithm. Check the correctness of the algorithm for N=1 process, and then N=2 processes. What do you observe?
3. Draw the transition systems corresponding to the PlusCal algorithm above for N=1 process and for N=2 processes. Explain the counter-example obtained with N=2 processes.

Labels in PlusCal algorithms define which blocs of statements are executed in an atomic way. Labeling is a key to the modeling of concurrent and distributed algorithms where correctness heavily depends on which statements can be executed atomically. PlusCal puts restrictions on labels in order to let the translation in TLA+ be as close as possible to the algorithm. These constraints are described in section 3.7 of the PlusCal user manual (available at ~herbrete/public/TLA/c-manual.pdf).

1. Remove the label l1 in the previous algorithm. Draw the corresponding transition system and check that the specification hold.
2. Explain why removing label l1 result in an algorithm that does not faithfully model the C program multiprocess.c

## Implementing atomic blocks with a semaphore

A solution to the error in the previous algorithm is to implement a mutual exclusion access to the shared variable x. A classical way to implement synchronizations (in particular mutual exclusion) is to use semaphores (see Wikipedia page). A semaphore s can be seen as a shared integer variable that can be manipulated using two operations: lock(s) and unlock(s). The operation lock(s) is blocking if the semaphore s is already locked, and non-blocking otherwise. The operation unlock(s) unlocks the semaphore s. As a result, one of the locked process (if any) is unblocked.

Our goal is to write a PlusCal algoithm that models the behavior of the C program multiprocess_sem.c. The following PlusCal algorithm gives a skeleton of the model where s is the semaphore, and operations lock and unlock need to be implemented (where the ... need to be filled).

EXTENDS Naturals, TLC

CONSTANTS N  (* Number of processes *)

(*
--algorithm mutex_increment {
variables
x = 0;
s = ...;      (* Shared semaphore *)

macro lock(sem) { ... }
macro unlock(sem) { ... }

process (P \in 1..N)
variables
y = 0;
{
lock:    lock(s);
l0:      y := x;
l1:      x := y + 1;
unlock:  unlock(s)
}
}
*)
1. Implement the macros lock(sem) and unlock(sem) above and initialize the shared variable s in such a way to implement a mutual exclusive access to x
2. Check that the resulting algorithm is correct for the specification (see previous part) using the model-checker.