cppcheck/man/cppcheck-design.docbook

<?xml version="1.0" encoding="UTF-8"?>
<appendix>
  <info>
    <title>Cppcheck Design</title>

    <author>
      <personname><firstname>Daniel</firstname><surname>Marjamäki</surname></personname>

      <affiliation>
        <orgname>Cppcheck</orgname>
      </affiliation>
    </author>

    <pubdate>2010</pubdate>
  </info>

  <section>
    <title>Introduction</title>

    <para>The goal with this article is to give users an idea of how Cppcheck
    works.</para>

    <para>Cppcheck is a static analysis tool that tries to completely avoid
    false warnings. A false warning is when the tool reports that there is an
    error even though there is no error.</para>

    <para>Cppcheck is a relatively simple tool. I hope that this article will
    highlight that it is possible to avoid false warnings even with simple
    analysis.</para>
  </section>

  <section>
    <title>Limitations of static analysis</title>

    <para>There are many bugs in programs that are really hard to detect for
    tools. Here is an example:</para>

    <programlisting>    // calculate the number of days
    int days = hours / 23;</programlisting>

    <para>A human programmer knows that there are 24 hours in a day and
    therefore he could see that "23" is wrong. A tool will probably not know
    that there are 24 hours in a day.</para>

    <para>A tool that tries to detect all bugs could write a warning message
    for every calculation in the program. Then it will correctly report that
    "hours / 23" is wrong but incorrectly warn about "hours / 24".</para>

    <para>Cppcheck will only write a warning message if it can determine that
    the calculation is wrong. In this case, no error will be written.</para>
  </section>

  <section>
    <title>Control flow analysis</title>

    <para>When you review code you will probably use "control flow analysis"
    in your head to determine if there are bugs or not.</para>

    <para>Control flow analysis is when you try to determine what the possible
    execution paths are.</para>

    <para>The control flow analysis in Cppcheck is quite simple.</para>
  </section>

  <section>
    <title>Buffer overflows</title>

    <para>This is a simple description of how buffer overflows are detected by
    Cppcheck.</para>

    <para>If an array is accessed out of bounds somewhere in its scope then an
    error message will be written. An example code:</para>

    <programlisting>void f()
{
    char a[10];
    if (x + y == 2) {
        a[20] = 0;
    }
}</programlisting>

    <para>Cppcheck will report this message:</para>

    <programlisting>Array 'a[10]' index 20 out of bounds</programlisting>

    <para>No control flow analysis is used. Cppcheck will not try to determine
    how execution can reach the "a[20] = 0;" statement. It is assumed that all
    statements are reachable. Cppcheck will detect the error even if it is
    really impossible that "x + y == 2" is true. I still claim that this is a
    correct warning because the statement is there and it has the
    error.</para>

    <para>Cppcheck will also investigate function calls. But then control flow
    analysis can be needed to avoid false warnings. Here is an example that
    logically is the same as the previous example:</para>

    <programlisting>void f1(char *s)
{
    s[20] = 0;
}

void f2()
{
    char a[10];
    if (x + y == 2) {
        f1(a);
    }
}</programlisting>

    <para>Cppcheck will report this message:</para>

    <programlisting>Array 'a[10]' index 20 out of bounds</programlisting>

    <para>If the execution reaches the function call then there will be an
    error.</para>

    <para>But if the condition is moved into "f1" then it will be necessary to
    prove that "x+y==2" can be true when the function is called from "f2". No
    error message is reported for this code:</para>

    <programlisting>void f1(char *s)
{
    if (x + y == 2) {
        s[20] = 0;
    }
}

void f2()
{
    char a[10];
    f1(a);
}</programlisting>
  </section>

  <section>
    <title>Memory leaks</title>

    <para>The check uses simple control-flow analysis. The control flow
    analysis assumes that all conditions can always be either true or false.
    It is assumed that all statements are reachable. Here is an
    example:</para>

    <programlisting>void f()
{
    char *a = malloc(10);
    if (x + y == 2) {
        return;
    }
    free(a);
}</programlisting>

    <para>Cppcheck will determine that there is a leak at the "return;"
    statement:</para>

    <programlisting>Memory leak: a</programlisting>

    <para>Cppcheck doesn't try to determine how the execution reaches the
    "return;" statement. It will only see that if the execution reaches the
    "return;" then there will be a memory leak.</para>

    <para>Lack of advanced control-flow analysis means that many bugs are not
    detected:</para>

    <programlisting>void f(int x)
{
    char *a = 0;

    if (x == 10)
        a = malloc(10);

    if (x == 20)
        free(a);
}</programlisting>

    <para>Cppcheck doesn't detect any error. The "all conditions can be either
    true/false" means that cppcheck doesn't know that "if (x==20)" is always
    false when "if (x==10)" is true. So Cppcheck can't establish that there is
    a leak. Many other static analysis tools will probably detect that there
    will be a leak if x is 10.</para>
  </section>

  <section>
    <title>Final thoughts</title>

    <para>You cannot trust that Cppcheck will detect all bugs.</para>

    <para>Cppcheck will just find some bugs. It is likely that you won't find
    these bugs unless you use Cppcheck.</para>
  </section>
</appendix>