Multi-process algorithms in PlusCal

Semantics of multi-process algorithms and atomicity

The following algorithm has one global variable x shared by N processes. Each process increments x in two steps: it first reads the value of x and stores it in a local variable y, then writes y + 1 back to x. These two steps model read and write operations on a shared variable in a context where they cannot be performed in a single atomic step (e.g. a C program using threads or Unix processes).

EXTENDS Naturals, TLC

CONSTANTS N (* Number of processes *)

(*
--algorithm non_atomic_increment {
  variables
    x = 0;

  process (P \in 1..N)
  variables
    y = 0;
  {
    l0:    y := x;
    l1:    x := y + 1
  }

}
*)

Our goal is to check that, when the algorithm terminates, the value of x is equal to N:

AllDone == \A self \in 1..N: pc[self] = "Done"
Correctness == [](AllDone => x = N)
  1. What is expressed by the TLA+ formulas above?
  2. Can we specify this property as an assertion?
  3. Create a PlusCal model atomicity and add this specification after the translation of the algorithm. Check the correctness of the algorithm for N=1 process, then for N=2 processes. What do you observe?
  4. Draw the transition systems corresponding to the PlusCal algorithm above for N=1 process and for N=2 processes. Explain the counter-example obtained with N=2 processes.
  5. Compile the file multiprocess.c with the command cc -std=c99 -Wall -o multiprocess multiprocess.c -lpthread. The resulting executable multiprocess takes a number N of processes on the command line, and runs N concurrent processes as in the PlusCal model above. How many runs, or which value of N, do you need to observe the error identified by the model-checker? What can you conclude?

Labels in PlusCal algorithms define which blocks of statements are executed atomically. Labeling is not mandatory; in particular, it is useless for sequential programs. However, labeling is key to modeling concurrent and distributed algorithms, whose correctness heavily depends on which statements can be executed atomically. PlusCal puts restrictions on labels in order to keep the TLA+ translation as close as possible to the algorithm. These constraints are described in section 3.7 of the PlusCal user manual (available at ~herbrete/public/TLA/c-manual.pdf).

Adding -label to the PlusCal options turns on automatic labeling of the algorithm by the translator. This is the default behavior for single-process algorithms. The default labeling adds a minimal set of labels that satisfies the constraints mentioned above, and thus maximizes the size of the atomic blocks of statements. Automatic labeling is however not suitable for multi-process algorithms, where atomic steps need to be modeled explicitly.

Implementing atomic blocks with a semaphore

A solution to the error in the previous algorithm is to implement mutually exclusive access to the shared variable x. A classical way to implement synchronization (in particular mutual exclusion) is to use semaphores (see the Wikipedia page). A semaphore s can be seen as a shared integer variable that can be manipulated through two operations: P(s) and V(s). The operation P(s) first checks the value of s: if s is smaller than or equal to 0, P(s) blocks until s becomes greater than 0; when s is greater than 0, P(s) decrements s. The operation V(s) simply increments s; if s becomes positive, then one of the waiting processes (if any) is unblocked. Both P(s) and V(s) can be made atomic thanks to test-and-set operations implemented by CPUs.

The following PlusCal algorithm shows how a semaphore s can be used to make the increment of shared variable x atomic (where the ... need to be filled):

EXTENDS Naturals, TLC

CONSTANTS N  (* Number of processes *)

(*
--algorithm mutex_increment {
  variables
    x = 0;
    s = ...;      (* Shared semaphore *)

  macro P(sem) { ... }
  macro V(sem) { ... }

  process (P \in 1..N)
  variables
    y = 0;
  {
    lock:    P(s);
    l0:      y := x;
    l1:      x := y + 1;
    unlock:  V(s)
  }
}
*)
  1. Implement the macros P(sem) and V(sem) above, and initialize the shared variable s in such a way as to implement mutually exclusive access to x.
  2. Check that the resulting algorithm satisfies the specification (see the previous part) using the model-checker.
  3. Compile and test multiprocess_sem.c. In particular, check that the new implementation is correct for large values of N for which the previous implementation failed. Check that the new implementation faithfully implements our model.