Formal Methods Paparazzi

Wednesday, June 10, 2015

SAT solvers

So far, we have been focusing on model checking (post, post), in this post we will start discussing one tool several model checkers use: SAT solvers.

The Boolean Satisfiability Problem (or SAT) is very famous in computer science. Mainly, since it was the first problem to be shown to be NP-complete (a complexity class) but I will not get into complexity in this post.

The problem is to decide whether a given a propositional logic formula is satisfiable or not. A possible (brute force) solution, would be to check every possible assignment for the variables and check whether the formula is satisfied for that assignment. We can see that this solution would imply checking (in the worst case) 2^|VARS| assignments, which as the number of variables increases, the number of checks increases exponentially!

Fortunately, there are many heuristics and algorithms that make SAT feasible in practice for certain inputs even having many many variables. Thus multiple tools are available and are used in the context of formal verification.

For example, SAT solvers are used for Bounded Model Checking (checking the prefixes of the model of length <= k). To translate a state machine over a set of boolean variables V to sat in order to consider k steps, for each variable v in V the variables v_i (i <= k) is created (i.e. the value of variable v in step i) and the initial and transition relation constraints are expressed in terms of step i for every i lower than the bound.

Then by calling the SAT solver, we can check whether some problematic state is reachable (within these k steps).

For example:

if we want to express the first two steps of this state machine (initial and next state), a possible translation could be as follows:

s0_0, s1_0, s2_0, s3_0, s4_0, s5_0: boolean //i.e. the value of each state in step 0
s0_1, s1_1, s2_1, s3_1, s4_1, s5_1: boolean //i.e. the value of each state in step 1
a0, b0, c0: boolean // the value of the variables in step 0
a1, b1, c1: boolean // the value of the variables in step 1

Represent each state at step 0:
assert (s0_0 implies (!a0 and b0 and c0)) // the notation is just to explain the ideas, not the actual input to the sat solver: this assertion indicates which propositions (a, b or c) hold when in s0 at step 0; the ! character represents negation.
assert (s1_0 implies (a0 and b0 and !c0))
assert (s2_0 implies (a0 and !b0 and c0))
assert (s3_0 implies (!a0 and b0 and !c0))
assert (s4_0 implies (!a0 and !b0 and c0))
assert (s5_0 implies (a0 and b0 and c0))

Represent each state at step 1:
assert (s0_1 implies (!a1 and b1 and c1))
assert (s1_1 implies (a1 and b1 and !c1))
assert (s2_1 implies (a1 and !b1 and c1))
assert (s3_1 implies (!a1 and b1 and !c1))
assert (s4_1 implies (!a1 and !b1 and c1))
assert (s5_1 implies (a1 and b1 and c1))

represent the initial states

assert (s0_0 and !s1_0 and !s2_0 and !s3_0 and !s4_0 and !s5_0)

according to the current state at step 0, define the next state

assert (s0_0 implies ((s1_1 or s2_1) and (!s0_1 and !s3_1 and !s4_1 and !s5_1)))
and the same for the remaining states..
assert (s1_0 implies (s3_1 and (!s0_1 and !s1_1 and !s2_1 and !s4_1 and !s5_1)))
...

If we want to check the property: "if a, then not b and c"

Then we can check for each of the steps, that is:

assert ( (a0 implies not (b0 and c0)) or (a1 implies not (b1 and c1)) )

In this case, the model is satisfiable: for the first two steps it holds that "if a, then not b and c".

If we added two more steps, then when checking:

assert ( (a0 implies not (b0 and c0)) or (a1 implies not (b1 and c1)) or (a2 implies not (b2 and c2)) or (a3 implies not (b3 and c3)) )

the model is shown unsatisfiable.

In the next post I'll explain how to run a sat solver (Minisat) to check these examples

Sunday, May 10, 2015

CTL - Computation Tree Logic

In the previous posts we have been talking about Linear Temporal Logic formulas, but there is another kind of temporal logic that allows expressing properties such as "there exists a path starting from ... ". In CTL (Computation Tree Logic), the temporal operators are the same as in LTL (X, U, G, F), and includes quantifiers (A, E) which express "for every path" or "exists path" respectively.
Then, valid formulas are those that include: Quantifier + Temporal Operator

Examples:
AX p : In every path, the next state satisfies p
EX p : There is a path such that the next state satisfies p

Then the following model:

satisfies the property EX p (there is a path starting from the first state such that in the next state p holds), but it doesn't satisfy AXp (the path that continues below doesn't have p in the next state).

The temporal operators can be combined with boolean ones to build more complex formulas, for example

E(p U AX q) : there is a path where p holds until one state where every next state satisfies q.

Challenge:

1) for what CTL formula there isn't an LTL equivalent formula ?

2) for what LTL formula there isn't a CTL equivalent formula ?

Friday, April 10, 2015

Interactive nusmv model checking - 3

Continuing with the interactive mode of NuSMV, let's consider our example from the previous posts (nusmv1.smv).

Recall that to enter the interactive mode, we simply execute the application with the -int argument and then build the model (previous post):

NuSMV -int nusmv1.smv

NuSMV > go

Now, in order to start simulating paths, we first choose an initial state (with the pick_state command)

If you don't remember exactly how the commands are called, it is possible to type help and view all the available commands, i.e.

NuSMV > help

and then just to check that is the command we are interested in:

NuSMV > pick_state -h

We can see all the available initial states by passing the argument -i. That is,

NuSMV > pick_state -i

The output looks as follows:

*************** AVAILABLE STATES *************

================= State =================
0) -------------------------
request = TRUE
st = ready

================= State =================
1) -------------------------
request = FALSE

Choose a state from the above (0-1):

There are two possible states. The first one includes the values for every possible variable, while the second one includes only the valuable of request. This means that the value of st is the same as in the previous state. Let's say we choose the first state which generates a request (i.e. state 0) ).
Now by writing

NuSMV > show_traces -v

The output is

Trace Description: Simulation Trace
Trace Type: Simulation
-> State: 1.1 <-
request = TRUE
st = ready

we can simulate randomly other 5 steps:

NuSMV > simulate -r -k 5

Trace Description: Simulation Trace
Trace Type: Simulation
-> State: 1.1 <-
request = TRUE
st = ready
-> State: 1.2 <-
request = FALSE
st = busy
-> State: 1.3 <-
request = TRUE
st = ready
-> State: 1.4 <-
request = TRUE
st = busy
-> State: 1.5 <-
request = FALSE
st = ready
-> State: 1.6 <-
request = TRUE
st = ready

or we can be asked to interactively select the next k steps (in the example 2 steps)

NuSMV > simulate -i -k 2

So, -r represents choosing randomly, -i interactively and -k the number of steps.

An additional useful option for this command is -c which allows adding constraints that must be satisfied by the next steps. For example, we can start a new simulation path, and the condition we require is that there is a request:

NuSMV > pick_state -r -c "request"

These are the main commands and options I use for simulating paths in a model, for more options, it is always possible to use the help command (get all available commands), command -h (prints the options for any command) or revise the documentation on the site of NuSMV.

Tuesday, March 10, 2015

Interactive nusmv model checking - 2

Continuing from the last post and the model example, where we could check some LTL formulas, and see the traces, we know show how to apply interactive simulation.
This is useful for when we want to analyze different paths, and see what happens at each step, and what are the possible values of each variable at each step.

For example, we want to step over each step of a trace interactively, choosing the next step at each step.

To run NuSMV in the interactive mode we add the -int option, i.e.
NuSMV -int nusmv1.smv

Then, the NuSMV prompt appears:

NuSMV >

From now on, we will write the commands next to the prompt, but you dont need to write the prompt, just what appears after the > symbol.

The first step is to tell NuSMV to build the model, which is achieved with the command go

NuSMV > go

And now, we can check different properties. For instance:

NuSMV > check_ltlspec -p "G (st = ready -> X st = busy)"

outputs the counterexample while

NuSMV > check_ltlspec -p "G ((st = ready & request) -> X st = busy)"

outputs that the property holds:

-- specification G ((st = ready & request) -> X st = busy) is true

we can check more information about the check_ltlspec option by typing

NuSMV > check_ltlspec -h

which outputs the different options for this command.

Then, we see that for a file that includes some LTL specification properties, we can check the property by number, for instance

NuSMV > check_ltlspec -n 0

To quit the command is "quit"

NuSMV > quit

In the next post I will write about simulating paths within the interactive mode.

Tuesday, February 10, 2015

valgrind

In this post, I'm going to present a small program written in C and show how to use a tool (Valgrind [1]) to understand the memory access problems.

Let's consider the following program:

main.c:

#include <stdlib.h>
void f(void)
{
int* x = malloc(10 * sizeof(int));
x[10] = 0; // problem 1: heap block overrun
} // problem 2: memory leak -- x not freed

int main(void)
{
f();
return 0;
}

Once the program has been compiled (with debug info (-g) into ./main in unix-based systems), we can check for memory issues by running valgrind:

valgrind --leak-check=yes ./main

The output is the following (after the header):

==3938== Invalid write of size 4
==3938== at 0x40054B: f (main.c:6)
==3938== by 0x40055B: main (main.c:11)
==3938== Address 0x51fd068 is 0 bytes after a block of size 40 alloc'd
==3938== at 0x4C2AB80: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==3938== by 0x40053E: f (main.c:5)
==3938== by 0x40055B: main (main.c:11)
==3938==
==3938==
==3938== HEAP SUMMARY:
==3938== in use at exit: 40 bytes in 1 blocks
==3938== total heap usage: 1 allocs, 0 frees, 40 bytes allocated
==3938==
==3938== 40 bytes in 1 blocks are definitely lost in loss record 1 of 1
==3938== at 0x4C2AB80: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==3938== by 0x40053E: f (main.c:5)
==3938== by 0x40055B: main (main.c:11)
==3938==
==3938== LEAK SUMMARY:
==3938== definitely lost: 40 bytes in 1 blocks
==3938== indirectly lost: 0 bytes in 0 blocks
==3938== possibly lost: 0 bytes in 0 blocks
==3938== still reachable: 0 bytes in 0 blocks
==3938== suppressed: 0 bytes in 0 blocks

The prefix "==3938== " represents the process id.

The first main issue is the invalid write: the first part of the output indicates where the write occurs, while the second part indicates where the memory incorrectly written has been allocated. From this information, we can see that the write within f() is invalid and observe that the problem is because x receives memory for 10 ints (i.e. the accessible indexes are between 0 and 9). Thus, the problem is trying to write to x[10].

The second issue is the memory lost indicated both in the sentence : "40 bytes in 1 blocks are definitely lost in loss record 1 of 1"
and in the LEAK SUMMARY. We can see here which malloc wasn't freed (then one in f (main.c:5)).

We can write now the following similar program:

#include <stdlib.h>

void f(void)
{
int* x = malloc(10 * sizeof(int));
x[9] = 0; // problem 1: we now write to a valid inded
free(x); // problem 2: solved, now x was freed
}

int main(void)
{
f();
return 0;
}

and running valgrind we obtain the following output (after the general header):

==4060== HEAP SUMMARY:
==4060== in use at exit: 0 bytes in 0 blocks
==4060== total heap usage: 1 allocs, 1 frees, 40 bytes allocated
==4060==
==4060== All heap blocks were freed -- no leaks are possible

The program has no memory issues!

[1] http://valgrind.org/ . The example is from the the tool's quick start guide.

Tuesday, October 14, 2014

Interactive nusmv model checking - 1

In the last post we saw about model checking LTL formulas with NuSMV by writing the model, the formula to be satisfied, and calling the program from the command line. We will now see a few things useful to interact with the model.

if we run NuSMV with the option -int, NuSMV remains open to ask commands.

We can execute:

NuSMV -int nusmv1.smv

After the header including the NuSMV version, the prompt appears:

NuSMV >

In the next lines this prompt (NuSMV > ) will appear to show where the commands go, but you don't have to write it. We can now write go so that the model is built. That is,

NuSMV > go

And now we can check formulas. For example,

NuSMV > check_ltlspec -p "G st = ready"

outputs a counterexample, while

NuSMV > check_ltlspec -p "st = ready"

indicating that in the first state st is ready, is true.

NuSMV > show_traces 1

shows the first trace (the counterexample to "G st = ready"

At any moment we can use

NuSMV > help
to see all the available commands

and command -h to see the options of a command for instance

NuSMV > check_ltlspec -h

Finally we use the quit command to exit.

NuSMV > quit

Tuesday, October 7, 2014

First NuSMV model check

There are several tools to apply check LTL formulas (LTL - 3, etc.). Among these, NuSMV is an open source tool that I have been using. This tool uses BDDs or SAT solvers to check properties, but I'll leave these topics for later and now focus on actual model checking a system..

In the tutorial, the following example is given:

MODULE main
VAR
request: boolean;
st: {ready, busy};
ASSIGN
init(st) := ready;
next(st) := case
              st = ready & request : busy;
              TRUE : {ready, busy};
            esac;

The given model contains two variables: request is a boolean and st - representing the state of the system - has two possible values (ready or busy).

In the ASSIGN section the initial and transition relation constraints are defined. The variable st is initialized as ready, and then for each transition:
if in the current state st is ready and there is a request, in the next state the value of st will be busy. Otherwise ("TRUE : ..."), the next value of st will be either ready or busy, non-deterministically chosen.
There are no constraints regarding initial or next values of request.

This model represents the given state machine:

The two states on the left are the initial states, since these are the states satisfying that initially st = ready.
Then, almost every transition is possible, besides when st = ready and there is a request, according to the state machine definition in the next state st must be busy.

We can check the following LTL formula:
\(\textbf{G} (st = ready \implies \textbf{X}\ st = busy)\)

That is, at any state, if st = ready, then in the next state st = busy.

To do so, we add the following to the model:

LTLSPEC
G (st = ready -> X st = busy)

Let's assume we have a file nusmv1.smv (open with any text editor).

Now (from the command line) we can run NuSMV and check if the guarantee holds.

After having downloaded it, we can execute

NuSMV nusmv1.smv

You may need to access the whole path, for instance (in Windows)

C:\Program Files\NuSMV\bin\NuSMV nusmv1.smv

The output is the following:

-- specification G (st = ready -> X st = busy) is false
-- as demonstrated by the following execution sequence
Trace Description: LTL Counterexample
Trace Type: Counterexample
-> State: 1.1 <-
request = 1
st = ready
-> Input: 1.2 <-
-> State: 1.2 <-
request = 0
st = busy
-- Loop starts here-> State: 1.3 <-
st = ready
-> State: 1.4 <--> Input: 1.5 <-
-> State: 1.5 <-
st = busy
-> Input: 1.6 <-
-> State: 1.6 <-
st = ready

In the first state of the counterexample, request is true (1) and st is ready. In the next state request is false and st is busy. So far the formula holds.
Then it is inidicated the start of the loop. This is since infinite paths are analyzed, and every counterexample includes a prefix and a loop. The first state of the loop (State 1.3) changes the value of st but not of request (only the variables changed appear). So in State 1.3 request is false and st=ready. However, in the next state st keeps being ready! Our formula is false, since there could be states when the state is ready but the next state is not busy (for instance, when there is no request).

If we change the formula and check

LTLSPEC
G ((request & st = ready) -> X st = busy)

The output obtained is:

-- specification G ((request & st = ready) -> X st = busy) is true