Formal Methods Paparazzi: 2015

Wednesday, June 10, 2015

SAT solvers

So far, we have been focusing on model checking (post, post), in this post we will start discussing one tool several model checkers use: SAT solvers.

The Boolean Satisfiability Problem (or SAT) is very famous in computer science. Mainly, since it was the first problem to be shown to be NP-complete (a complexity class) but I will not get into complexity in this post.

The problem is to decide whether a given a propositional logic formula is satisfiable or not. A possible (brute force) solution, would be to check every possible assignment for the variables and check whether the formula is satisfied for that assignment. We can see that this solution would imply checking (in the worst case) 2^|VARS| assignments, which as the number of variables increases, the number of checks increases exponentially!

Fortunately, there are many heuristics and algorithms that make SAT feasible in practice for certain inputs even having many many variables. Thus multiple tools are available and are used in the context of formal verification.

For example, SAT solvers are used for Bounded Model Checking (checking the prefixes of the model of length <= k). To translate a state machine over a set of boolean variables V to sat in order to consider k steps, for each variable v in V the variables v_i (i <= k) is created (i.e. the value of variable v in step i) and the initial and transition relation constraints are expressed in terms of step i for every i lower than the bound.

Then by calling the SAT solver, we can check whether some problematic state is reachable (within these k steps).

For example:

if we want to express the first two steps of this state machine (initial and next state), a possible translation could be as follows:

s0_0, s1_0, s2_0, s3_0, s4_0, s5_0: boolean //i.e. the value of each state in step 0
s0_1, s1_1, s2_1, s3_1, s4_1, s5_1: boolean //i.e. the value of each state in step 1
a0, b0, c0: boolean // the value of the variables in step 0
a1, b1, c1: boolean // the value of the variables in step 1

Represent each state at step 0:
assert (s0_0 implies (!a0 and b0 and c0)) // the notation is just to explain the ideas, not the actual input to the sat solver: this assertion indicates which propositions (a, b or c) hold when in s0 at step 0; the ! character represents negation.
assert (s1_0 implies (a0 and b0 and !c0))
assert (s2_0 implies (a0 and !b0 and c0))
assert (s3_0 implies (!a0 and b0 and !c0))
assert (s4_0 implies (!a0 and !b0 and c0))
assert (s5_0 implies (a0 and b0 and c0))

Represent each state at step 1:
assert (s0_1 implies (!a1 and b1 and c1))
assert (s1_1 implies (a1 and b1 and !c1))
assert (s2_1 implies (a1 and !b1 and c1))
assert (s3_1 implies (!a1 and b1 and !c1))
assert (s4_1 implies (!a1 and !b1 and c1))
assert (s5_1 implies (a1 and b1 and c1))

represent the initial states

assert (s0_0 and !s1_0 and !s2_0 and !s3_0 and !s4_0 and !s5_0)

according to the current state at step 0, define the next state

assert (s0_0 implies ((s1_1 or s2_1) and (!s0_1 and !s3_1 and !s4_1 and !s5_1)))
and the same for the remaining states..
assert (s1_0 implies (s3_1 and (!s0_1 and !s1_1 and !s2_1 and !s4_1 and !s5_1)))
...

If we want to check the property: "if a, then not b and c"

Then we can check for each of the steps, that is:

assert ( (a0 implies not (b0 and c0)) or (a1 implies not (b1 and c1)) )

In this case, the model is satisfiable: for the first two steps it holds that "if a, then not b and c".

If we added two more steps, then when checking:

assert ( (a0 implies not (b0 and c0)) or (a1 implies not (b1 and c1)) or (a2 implies not (b2 and c2)) or (a3 implies not (b3 and c3)) )

the model is shown unsatisfiable.

In the next post I'll explain how to run a sat solver (Minisat) to check these examples

Sunday, May 10, 2015

CTL - Computation Tree Logic

In the previous posts we have been talking about Linear Temporal Logic formulas, but there is another kind of temporal logic that allows expressing properties such as "there exists a path starting from ... ". In CTL (Computation Tree Logic), the temporal operators are the same as in LTL (X, U, G, F), and includes quantifiers (A, E) which express "for every path" or "exists path" respectively.
Then, valid formulas are those that include: Quantifier + Temporal Operator

Examples:
AX p : In every path, the next state satisfies p
EX p : There is a path such that the next state satisfies p

Then the following model:

satisfies the property EX p (there is a path starting from the first state such that in the next state p holds), but it doesn't satisfy AXp (the path that continues below doesn't have p in the next state).

The temporal operators can be combined with boolean ones to build more complex formulas, for example

E(p U AX q) : there is a path where p holds until one state where every next state satisfies q.

Challenge:

1) for what CTL formula there isn't an LTL equivalent formula ?

2) for what LTL formula there isn't a CTL equivalent formula ?

Friday, April 10, 2015

Interactive nusmv model checking - 3

Continuing with the interactive mode of NuSMV, let's consider our example from the previous posts (nusmv1.smv).

Recall that to enter the interactive mode, we simply execute the application with the -int argument and then build the model (previous post):

NuSMV -int nusmv1.smv

NuSMV > go

Now, in order to start simulating paths, we first choose an initial state (with the pick_state command)

If you don't remember exactly how the commands are called, it is possible to type help and view all the available commands, i.e.

NuSMV > help

and then just to check that is the command we are interested in:

NuSMV > pick_state -h

We can see all the available initial states by passing the argument -i. That is,

NuSMV > pick_state -i

The output looks as follows:

*************** AVAILABLE STATES *************

================= State =================
0) -------------------------
request = TRUE
st = ready

================= State =================
1) -------------------------
request = FALSE

Choose a state from the above (0-1):

There are two possible states. The first one includes the values for every possible variable, while the second one includes only the valuable of request. This means that the value of st is the same as in the previous state. Let's say we choose the first state which generates a request (i.e. state 0) ).
Now by writing

NuSMV > show_traces -v

The output is

Trace Description: Simulation Trace
Trace Type: Simulation
-> State: 1.1 <-
request = TRUE
st = ready

we can simulate randomly other 5 steps:

NuSMV > simulate -r -k 5

Trace Description: Simulation Trace
Trace Type: Simulation
-> State: 1.1 <-
request = TRUE
st = ready
-> State: 1.2 <-
request = FALSE
st = busy
-> State: 1.3 <-
request = TRUE
st = ready
-> State: 1.4 <-
request = TRUE
st = busy
-> State: 1.5 <-
request = FALSE
st = ready
-> State: 1.6 <-
request = TRUE
st = ready

or we can be asked to interactively select the next k steps (in the example 2 steps)

NuSMV > simulate -i -k 2

So, -r represents choosing randomly, -i interactively and -k the number of steps.

An additional useful option for this command is -c which allows adding constraints that must be satisfied by the next steps. For example, we can start a new simulation path, and the condition we require is that there is a request:

NuSMV > pick_state -r -c "request"

These are the main commands and options I use for simulating paths in a model, for more options, it is always possible to use the help command (get all available commands), command -h (prints the options for any command) or revise the documentation on the site of NuSMV.

Tuesday, March 10, 2015

Interactive nusmv model checking - 2

Continuing from the last post and the model example, where we could check some LTL formulas, and see the traces, we know show how to apply interactive simulation.
This is useful for when we want to analyze different paths, and see what happens at each step, and what are the possible values of each variable at each step.

For example, we want to step over each step of a trace interactively, choosing the next step at each step.

To run NuSMV in the interactive mode we add the -int option, i.e.
NuSMV -int nusmv1.smv

Then, the NuSMV prompt appears:

NuSMV >

From now on, we will write the commands next to the prompt, but you dont need to write the prompt, just what appears after the > symbol.

The first step is to tell NuSMV to build the model, which is achieved with the command go

NuSMV > go

And now, we can check different properties. For instance:

NuSMV > check_ltlspec -p "G (st = ready -> X st = busy)"

outputs the counterexample while

NuSMV > check_ltlspec -p "G ((st = ready & request) -> X st = busy)"

outputs that the property holds:

-- specification G ((st = ready & request) -> X st = busy) is true

we can check more information about the check_ltlspec option by typing

NuSMV > check_ltlspec -h

which outputs the different options for this command.

Then, we see that for a file that includes some LTL specification properties, we can check the property by number, for instance

NuSMV > check_ltlspec -n 0

To quit the command is "quit"

NuSMV > quit

In the next post I will write about simulating paths within the interactive mode.

Tuesday, February 10, 2015

valgrind

In this post, I'm going to present a small program written in C and show how to use a tool (Valgrind [1]) to understand the memory access problems.

Let's consider the following program:

main.c:

#include <stdlib.h>
void f(void)
{
int* x = malloc(10 * sizeof(int));
x[10] = 0; // problem 1: heap block overrun
} // problem 2: memory leak -- x not freed

int main(void)
{
f();
return 0;
}

Once the program has been compiled (with debug info (-g) into ./main in unix-based systems), we can check for memory issues by running valgrind:

valgrind --leak-check=yes ./main

The output is the following (after the header):

==3938== Invalid write of size 4
==3938== at 0x40054B: f (main.c:6)
==3938== by 0x40055B: main (main.c:11)
==3938== Address 0x51fd068 is 0 bytes after a block of size 40 alloc'd
==3938== at 0x4C2AB80: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==3938== by 0x40053E: f (main.c:5)
==3938== by 0x40055B: main (main.c:11)
==3938==
==3938==
==3938== HEAP SUMMARY:
==3938== in use at exit: 40 bytes in 1 blocks
==3938== total heap usage: 1 allocs, 0 frees, 40 bytes allocated
==3938==
==3938== 40 bytes in 1 blocks are definitely lost in loss record 1 of 1
==3938== at 0x4C2AB80: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==3938== by 0x40053E: f (main.c:5)
==3938== by 0x40055B: main (main.c:11)
==3938==
==3938== LEAK SUMMARY:
==3938== definitely lost: 40 bytes in 1 blocks
==3938== indirectly lost: 0 bytes in 0 blocks
==3938== possibly lost: 0 bytes in 0 blocks
==3938== still reachable: 0 bytes in 0 blocks
==3938== suppressed: 0 bytes in 0 blocks

The prefix "==3938== " represents the process id.

The first main issue is the invalid write: the first part of the output indicates where the write occurs, while the second part indicates where the memory incorrectly written has been allocated. From this information, we can see that the write within f() is invalid and observe that the problem is because x receives memory for 10 ints (i.e. the accessible indexes are between 0 and 9). Thus, the problem is trying to write to x[10].

The second issue is the memory lost indicated both in the sentence : "40 bytes in 1 blocks are definitely lost in loss record 1 of 1"
and in the LEAK SUMMARY. We can see here which malloc wasn't freed (then one in f (main.c:5)).

We can write now the following similar program:

#include <stdlib.h>

void f(void)
{
int* x = malloc(10 * sizeof(int));
x[9] = 0; // problem 1: we now write to a valid inded
free(x); // problem 2: solved, now x was freed
}

int main(void)
{
f();
return 0;
}

and running valgrind we obtain the following output (after the general header):

==4060== HEAP SUMMARY:
==4060== in use at exit: 0 bytes in 0 blocks
==4060== total heap usage: 1 allocs, 1 frees, 40 bytes allocated
==4060==
==4060== All heap blocks were freed -- no leaks are possible

The program has no memory issues!

[1] http://valgrind.org/ . The example is from the the tool's quick start guide.