<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article PUBLIC "-//OASIS//DTD DocBook XML V4.5//EN"
"http://www.oasis-open.org/docbook/xml/4.5/docbookx.dtd">
<article>
<articleinfo>
<title>Writing Cppcheck rules</title>
<author>
<firstname>Daniel</firstname>
<surname>Marjamäki</surname>
<affiliation>
<orgname>Cppcheck</orgname>
</affiliation>
</author>
<pubdate>2010</pubdate>
</articleinfo>
<section>
<title>Introduction</title>
<para>This manual is for developers who want to write
Cppcheck rules.</para>
<para>There are two ways to write rules.</para>
<variablelist>
<varlistentry>
<term>Regular expressions</term>
<listitem>
<para>Simple rules can be created by using regular expressions. No
compilation is required.</para>
</listitem>
</varlistentry>
<varlistentry>
<term>C++</term>
<listitem>
<para>Advanced rules must be created with C++. These rules must be
compiled and linked statically with Cppcheck.</para>
</listitem>
</varlistentry>
</variablelist>
<para>The rules do not operate on the raw source code. Cppcheck
reads the source code and processes it before the rules are applied.</para>
</section>
<section>
<title>Data representation of the source code</title>
<para>There are two types of data you can use: the symbol database and the
token lists.</para>
<section>
<title>Token lists</title>
<para>The code is stored in token lists (simple doubly linked
lists).</para>
<para>The token lists are designed for rule matching. All redundant
information is removed. A number of transformations are made
automatically on the token lists to simplify writing rules.</para>
<para>The class <literal>Tokenizer</literal> creates the token lists and
performs all simplifications.</para>
<para>The class <literal>Token</literal> is used for every token in the
token list. The <literal>Token</literal> class also contains
functionality for matching tokens.</para>
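<para>As a rough illustration of this data structure, the sketch below
builds a doubly linked list of tokens from a space-separated string and
checks whether a fixed token sequence starts at a given token. The names
<literal>SketchToken</literal>, <literal>makeTokenList</literal> and
<literal>matchSequence</literal> are hypothetical and are not part of
Cppcheck; the real <literal>Token</literal> class is considerably
richer.</para>
<programlisting><![CDATA[#include <memory>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a token; NOT the real Cppcheck Token class.
struct SketchToken {
    std::string str;              // text of this token
    SketchToken *next = nullptr;  // following token, or nullptr at the end
    SketchToken *prev = nullptr;  // preceding token, or nullptr at the start
};

// Owns the tokens; the vector keeps them alive while next/prev link them.
using SketchTokenList = std::vector<std::unique_ptr<SketchToken>>;

// Build a doubly linked list from a space-separated string such as "a = 1 ;".
SketchTokenList makeTokenList(const std::string &code) {
    SketchTokenList list;
    std::istringstream in(code);
    std::string word;
    while (in >> word) {
        auto tok = std::make_unique<SketchToken>();
        tok->str = word;
        if (!list.empty()) {
            tok->prev = list.back().get();
            list.back()->next = tok.get();
        }
        list.push_back(std::move(tok));
    }
    return list;
}

// Return true if the tokens starting at 'tok' spell out 'pattern' exactly.
bool matchSequence(const SketchToken *tok,
                   const std::vector<std::string> &pattern) {
    for (const std::string &expected : pattern) {
        if (tok == nullptr || tok->str != expected)
            return false;
        tok = tok->next;
    }
    return true;
}]]></programlisting>
<para>Under these assumptions, calling
<literal>makeTokenList("f2 ( 1 / 0 ) ;")</literal> yields seven linked
tokens, and <literal>matchSequence</literal> reports a match for the
sequence "/", "0" when started at the fourth token.</para>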
<section>
<title>Normal token list</title>
<para>The first token list that is created has many basic
simplifications. For example:</para>
<itemizedlist>
<listitem>
<para>There are no templates. Templates have been
instantiated.</para>
</listitem>
<listitem>
<para>There is no "else if". These are converted into "else { if
.. }".</para>
</listitem>
<listitem>
<para>The bodies of "if", "else", "while", "do" and "for" are
always enclosed in "{" and "}".</para>
</listitem>
<listitem>
<para>A declaration of multiple variables is split up into
multiple variable declarations. "int a,b;" =&gt; "int a; int
b;"</para>
</listitem>
<listitem>
<para>All variables have unique ID numbers.</para>
</listitem>
</itemizedlist>
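<para>To make some of these transformations concrete, the sketch below
pairs a function as a developer might write it with a hand-rewritten
equivalent that follows the rules listed above: braces around every body,
"else if" expanded, and one declaration per variable. Both function names
are hypothetical, and the second function only illustrates the rules; it
is not actual Cppcheck output. Template instantiation and variable IDs
are not shown.</para>
<programlisting><![CDATA[// As a developer might write it.
int classifyOriginal(int x) {
    int a = 0, b = 0;
    if (x == 1)
        a = 10;
    else if (x == 2)
        b = 20;
    else
        a = 30;
    return a + b;
}

// The same logic rewritten by hand according to the rules above.
int classifyNormalized(int x) {
    int a = 0;   // "int a = 0, b = 0;" split into two declarations
    int b = 0;
    if (x == 1) {
        a = 10;
    } else {     // "else if" becomes "else { if"
        if (x == 2) {
            b = 20;
        } else {
            a = 30;
        }
    }
    return a + b;
}]]></programlisting>
<para>Both functions return the same value for every input, so a rule
only has to recognise the second, more regular form.</para>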
</section>
<section>
<title>Simplified token list</title>
<para>The second token list that is created has all simplifications
the normal token list has and then many more simplifications. For
example:</para>
<itemizedlist>
<listitem>
<para>There is no <literal>sizeof</literal>.</para>
</listitem>
<listitem>
<para>There are no templates.</para>
</listitem>
<listitem>
<para>Control flow transformations.</para>
</listitem>
<listitem>
<para>NULL is replaced with 0.</para>
</listitem>
<listitem>
<para>Static value flow analysis is made. Known values are
inserted into the code.</para>
</listitem>
<listitem>
<para>Variable initialization is replaced with assignment.</para>
</listitem>
</itemizedlist>
<para>The simplified token list is printed if you use
<literal>--debug</literal>. For example, run <literal>cppcheck --debug
test1.cpp</literal> to check this code:</para>
<programlisting>void f1() {
    int a = 1;
    f2(a++);
}</programlisting>
<para>The result is:</para>
<programlisting>##file test1.cpp
1: void f1 ( ) {
2: ; ;
3: f2 ( 1 ) ;
4: }</programlisting>
</section>
<section>
<title>Reference</title>
<para>To learn more about the token lists, see the Doxygen documentation
for the <literal>Tokenizer</literal> class:</para>
<para><ulink url="http://cppcheck.sourceforge.net/doxyoutput/classTokenizer.html">http://cppcheck.sourceforge.net/doxyoutput/classTokenizer.html</ulink></para>
</section>
</section>
</section>
<section>
<title>Regular expressions</title>
<para>Simple rules can be defined through regular expressions.</para>
<para>A rule consists of:</para>
<itemizedlist>
<listitem>
<para>a pattern to search for</para>
</listitem>
<listitem>
<para>an error message that is reported when the pattern is found</para>
</listitem>
</itemizedlist>
<para>Here is an example:</para>
<programlisting>&lt;?xml version="1.0"?&gt;
&lt;rule data="simple"&gt;
  &lt;pattern&gt;/ 0&lt;/pattern&gt;
  &lt;message&gt;
    &lt;id&gt;divbyzero&lt;/id&gt;
    &lt;severity&gt;error&lt;/severity&gt;
    &lt;summary&gt;Division by zero&lt;/summary&gt;
  &lt;/message&gt;
&lt;/rule&gt;</programlisting>
<para>Use the <literal>simple</literal> token list whenever you can. If
you need information that has been removed from it, try the
<literal>normal</literal> token list instead.</para>
<para>When you write the patterns, remember that:</para>
<itemizedlist>
<listitem>
<para>Tokens are always separated by spaces, so "1+2" never occurs; it
appears as "1 + 2".</para>
</listitem>
<listitem>
<para>There is no indentation, and there are no extra spaces, comments
or line breaks.</para>
</listitem>
</itemizedlist>
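<para>These two points can be tried out without Cppcheck. The sketch
below searches a token list, flattened to one line with a single space
between tokens, using <literal>std::regex</literal>. The helper name
<literal>patternMatches</literal> is hypothetical, and
<literal>std::regex</literal> only stands in for whatever regular
expression engine Cppcheck uses, so syntax details may differ.</para>
<programlisting><![CDATA[#include <regex>
#include <string>

// Hypothetical helper, not part of Cppcheck: returns true if 'pattern'
// is found anywhere in 'tokens'. 'tokens' is assumed to be a token list
// flattened to one line, with exactly one space between tokens.
bool patternMatches(const std::string &tokens, const std::string &pattern) {
    const std::regex re(pattern);
    return std::regex_search(tokens, re);
}]]></programlisting>
<para>With this helper,
<literal>patternMatches("x = y / 0 ;", "/ 0")</literal> is true. The
pattern <literal>1\+2</literal> does not match
<literal>x = 1 + 2 ;</literal> because the tokens are separated by
spaces, whereas <literal>1 \+ 2</literal> does.</para>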
</section>
</article>