• 主页
  • 相册
  • 随笔
  • 目录
  • 存档
Total 244
Search AboutMe

  • 主页
  • 相册
  • 随笔
  • 目录
  • 存档

Bounded Model Checking

2022-03-30

1. Circuit Satisfiability

  • problem
  • 由此引入了Formula Satisfiability
    • The SAT problem asks whether a given Boolean formula is satisfiable
    • The SMT decides the satisfiability of first-order logic formulae using the combination of different background theories

2. BMC

BMC: 有界模型检测

2.1. MC vs BMC

  • 注意completeness threshold

2.2. example

2.3. Step


3. SAT/SMT-based BMC tools for C

3.1. Typical features

  • full language support
    • bit-precise operations, structs, arrays, …
    • heap-allocated memory
    • concurrency
  • built-in safety checks
    • overflow, div-by-zero, array out-of-bounds indexing, …
    • memory safety: nil pointer deref, memory leaks, …
    • deadlocks, race conditions
  • user-specified assertions and error labels
  • non-deterministic modelling
    • nondeterministic assignments
    • assume-statements

3.2. General Architecture and Approach

General approach

  1. Simplify control flow
  2. Unwind all of the loops (指的是都会被unwind,但unwind多少次还是由bound决定)
  3. Convert into single static assignment (SSA) form
  4. Convert into equations and simplify
  5. (Bit-blast)

    “Bit-blasting”比特爆破 is a technique for reducing first-order terms and predicates in the theory of bit-vectors 1 to propositional logic. Every bit in each bit-vector term is represented by a propositional literal and clauses are added that link them.

  6. Solve with a SAT/SMT solver
  7. Convert SAT assignment into a counterexample

3.2.1. Control flow simplifications

  • remove all side effects
    • e.g., j=++i; becomes i =i+1; j = i;
  • simplify all control flow structures into core forms
    • e.g., replace for, do while by while
    • e.g., replace case by if
  • make control flow explicit
    • e.g., replace continue, break by goto
    • e.g., replace if, while by goto

3.2.2. Loop unwinding

  • all loops are “unwound”
    • i.e., replaced by several guarded copies of the loop body
      • same for backward gotos and recursive functions
      • can use different unwinding bounds for different loops
    • each statement is executed at most once
      • to check whether unwinding is sufficient special “unwinding assertion” claims are added
    • ⇒if a program satisfies all of its claims and all unwinding assertions then it is correct!
  • multiple loops
    • use --partial-loops to suppress insertion (?)
    • nsound

3.2.3. Safety conditions

  • Built-in safety checks converted into explicit assertions
    • e.g., array safety: a[i]=...; $\leftrightarrow$ assert(0 <= i && i < N); a[i]=...;
  • sometimes easier at intermediate representation or formula level
    • e.g., word-aligned pointer access, overflow, …

3.2.4. Transforming straight-line programs into equations

  • each variable is assigned only once
    • x=1; y=2; $\leftrightarrow$ x0=1; y0=2'
  • variables are assigned multiple times
    • x=1; x=x+1; $\leftrightarrow$ x0=1; x1=x0+1;'
  • in control flow branches (if-statements)
    • for each control flow join point连接点, add a new variable with guarded assignment as definition
      • also called $ϕ$-function

3.2.5. Handling arrays

  • Arrays can be replaced by individual variables, with a “demux”解码 at each access
    • surprisingly effective (for N<1000) because value of i can often be determined statically
      • due to constant propagation

Handling arrays with theories

  • Arrays can be seen as ADT (abstract data type, 抽象数据结构) with two operations:
    • read
    • write

3.2.6. Modeling with non-determinism

Extend C with three modeling features

  • assert(e): aborts execution when e is false, no-op otherwise
    • void assert (_Bool e) { if (!e) exit(); }
  • nondet_int(): returns non-deterministic非确定(也就是随机?) int-value
    • int nondet_int () { int x; return x; }
  • assume(e): “ignores” execution when e is false, no-op otherwise
    • void assume (_Bool e) { while (!e) ; }

General approach

  • use C program to set up structure and deterministic computations
  • use non-determinism to set up search space
  • use assumptions to constrain search space
  • use failing assertion to start search
1
2
3
4
5
6
7
int main() { 
int x=nondet_int(),y=nondet_int(),z=nondet_int();
__ESBMC_assume(x > 0 && y > 0 && z > 0);
__ESBMC_assume(x < 16384 && y < 16384 && z < 16384);
assert(x*x + y*y != z*z);
return 0;
}

4. BMC of Multi-threaded Software

  • 见Detection of Software Vulnerabilities: Static Analysis

5. SAT vs SMT

6. SMT-based BMC

  • In SMT-based BMC, ψ is a quantifier-free formula in a decidable subset of first-order logic which is then checked for satisfiability by an SMT solver
  • SMT-based BMC must consider a number of issues that are not easily mapped into the theories supported by SMT solvers (i.e. embedded software)

    In previous work on SMTbased BMC for software only the theories of uninterpreted functions, arrays and linear arithmetic were considered, but no encoding was provided for ANSI-C constructs such as bit-level operations, fixedpoint arithmetic, pointers (i.e., pointer arithmetic and comparisons) and unions.

6.1. Generation of VCs

  • VC: verification condition

    BMC unrolls the system k times and translates it into a verification condition (VC) $ψ$ such that $ψ$ is satisfiable if and only if $φ$ has a counterexample of depth k or less

    The VC $ψ$ is a quantifier-free formula in a decidable subset of first-order logic, which is then checked for satisfiability by an SMT solver

  • model checking problem is formulated by constructing the following logical formula:$ψ_k=I(s_0) \land \bigvee_{i=0}^k \bigwedge_{j=0}^{i-1} γ(s_j, s_{j+1}) \land \lnot φ(s_i)$

    • $φ$ is a safety property
    • $I$ the set of initial states of $M$
    • $γ(s_j, s_{j+1})$ the transition relation of M between time steps $j$ and $j + 1$
    • $I(s_0) \land \bigvee_{i=0}^k \bigwedge_{j=0}^{i-1} γ(s_j,s_{j+1})$
      • represents the executions of M of length i and the original formula can be satisfied if and only if for some $i ≤ k$ there exists a reachable state at time step i in which φ is violated
  • If it is satisfiable, then the SMT solver provides a satisfying assignment, from which we can extract the values of the program variables to construct a counter-example

    • A counter-example for a property φ is a sequence of states $s_0, s_1,…,s_k$ with $s_0 ∈ S_0$, $s_k ∈ S$, and $γ(s_i, s_{i+1})$ for $0 ≤ i < k$.
  • If it is unsatisfiable, we can conclude that no error state is reachable in k steps or less

It is important to note that this approach can be used only to find violations of the property up to the bound k. In order to prove properties we need to compute the completeness threshold (CT), which can be smaller than or equal to the maximum number of loop-iterations occurring in the program (完备性阈值(CT),它可以小于或等于程序中出现的最大循环迭代次数)

  • 也就是有不有界都差不多了

points

  • the encoding techniques that we use to convert the constraints and properties from the ANSI-C programs into the different background theories of the SMT solvers
  • approach to decide the best encoding and solver to be used during the verification process.

7. CBMC

  • CBMC implements BMC for ANSI-C/C++ programs using SAT solvers
    • process C/C++ code using the goto-cc tool
    • process the C/C++ files and to build an abstract syntax tree (AST) (via internal parser based on Flex/Bison)

      The typechecker of CBMC’s front-end annotates this AST with types and generates a symbol table. CBMC’s IRep class then converts the annotated AST into an internal, language-independent format used by the remaining phase of the front-end.

8. ESBMC

8.1. Tool Architecture

8.1.1. Frontend

  • The white boxes (except for the SMT solver) represent the components that we reused from the CBMC model checker without any modification

8.1.2. Middleware

  • The gray boxes with dashed lines represent the components that we modified in order to:

    • generate VCs to check for memory leaks (implemented in GOTO program
    • to simplify the unrolled formula (implemented in GOTO symex
    • to perform an up-front analysis先期分析 in the CFG of the program to determine the best encoding and solver for a particular program (implemented in GOTO symex
  • GOTO program component

    • converts the ANSI-C program into a GOTO-program, which simplifies the representation (e.g., replacement of switch and while by if and goto statements)
    • and handles the unrolling of the loops and the elimination of recursive functions
  • GOTO symex component

    • performs a symbolic simulation of the program

8.1.3. Backend

  • In the back-end of ESBMC, we build two sets of quantifier-free formulae C (for the constraints) and P (for the properties)
    • C encodes the first part of $ψ_k$ (more precisely, $I(s_0) \land \bigvee_{i=0}^k \bigwedge_{j=0}^{i-1} γ(s_j, s_{j+1})$ )
    • $¬P$ encodes the second part (more precisely, $\bigvee_{i=0}^k \lnot φ(s_i)$)
    • we check $C \models_{\tau} P$ using an SMT solver
      • If the answer is satisfiable, we have found a violation of the property φ, which is encoded in $ψ_k$.
      • If not, the property holds up to the bound k.
  • sec
  • Security
  • ESBMC
软件安全备忘录:Mem Management
计算机数理逻辑-一阶逻辑-SMT
  1. 1. 1. Circuit Satisfiability
  2. 2. 2. BMC
    1. 2.1. 2.1. MC vs BMC
    2. 2.2. 2.2. example
    3. 2.3. 2.3. Step
  3. 3. 3. SAT/SMT-based BMC tools for C
    1. 3.1. 3.1. Typical features
    2. 3.2. 3.2. General Architecture and Approach
      1. 3.2.1. 3.2.1. Control flow simplifications
      2. 3.2.2. 3.2.2. Loop unwinding
      3. 3.2.3. 3.2.3. Safety conditions
      4. 3.2.4. 3.2.4. Transforming straight-line programs into equations
      5. 3.2.5. 3.2.5. Handling arrays
      6. 3.2.6. 3.2.6. Modeling with non-determinism
  4. 4. 4. BMC of Multi-threaded Software
  5. 5. 5. SAT vs SMT
  6. 6. 6. SMT-based BMC
    1. 6.1. 6.1. Generation of VCs
  7. 7. 7. CBMC
  8. 8. 8. ESBMC
    1. 8.1. 8.1. Tool Architecture
      1. 8.1.1. 8.1.1. Frontend
      2. 8.1.2. 8.1.2. Middleware
      3. 8.1.3. 8.1.3. Backend
© 2024 何决云 载入天数...