Documentation update for fuzz support.

Philip.Hazel 2016-11-01 11:56:07 +00:00
parent ee3b0feec0
commit d3b60a9b7d
2 changed files with 52 additions and 19 deletions

README

@@ -204,13 +204,6 @@ library. They are also documented in the pcre2build man page.
 --enable-newline-is-crlf, --enable-newline-is-anycrlf, or
 --enable-newline-is-any to the "configure" command, respectively.
-If you specify --enable-newline-is-cr or --enable-newline-is-crlf, some of
-the standard tests will fail, because the lines in the test files end with
-LF. Even if the files are edited to change the line endings, there are likely
-to be some failures. With --enable-newline-is-anycrlf or
---enable-newline-is-any, many tests should succeed, but there may be some
-failures.
 . By default, the sequence \R in a pattern matches any Unicode line ending
 sequence. This is independent of the option specifying what PCRE2 considers
 to be the end of a line (see above). However, the caller of PCRE2 can
@@ -253,13 +246,13 @@ library. They are also documented in the pcre2build man page.
 sizes in the pcre2stack man page.
 . In the 8-bit library, the default maximum compiled pattern size is around
-64K. You can increase this by adding --with-link-size=3 to the "configure"
-command. PCRE2 then uses three bytes instead of two for offsets to different
-parts of the compiled pattern. In the 16-bit library, --with-link-size=3 is
-the same as --with-link-size=4, which (in both libraries) uses four-byte
-offsets. Increasing the internal link size reduces performance in the 8-bit
-and 16-bit libraries. In the 32-bit library, the link size setting is
-ignored, as 4-byte offsets are always used.
+64K bytes. You can increase this by adding --with-link-size=3 to the
+"configure" command. PCRE2 then uses three bytes instead of two for offsets
+to different parts of the compiled pattern. In the 16-bit library,
+--with-link-size=3 is the same as --with-link-size=4, which (in both
+libraries) uses four-byte offsets. Increasing the internal link size reduces
+performance in the 8-bit and 16-bit libraries. In the 32-bit library, the
+link size setting is ignored, as 4-byte offsets are always used.
 . You can build PCRE2 so that its internal match() function that is called from
 pcre2_match() does not call itself recursively. Instead, it uses memory
@@ -346,7 +339,8 @@ library. They are also documented in the pcre2build man page.
 The value must be a plain integer. The default is 20480. The amount of memory
 used by pcre2grep is actually three times this number, to allow for "before"
-and "after" lines.
+and "after" lines. If very long lines are encountered, the buffer is
+automatically enlarged, up to a fixed maximum size.
 . The default maximum size of pcre2grep's internal buffer can be set by, for
 example:
@@ -378,6 +372,22 @@ library. They are also documented in the pcre2build man page.
 If you get error messages about missing functions tgetstr, tgetent, tputs,
 tgetflag, or tgoto, this is the problem, and linking with the ncurses library
 should fix it.
+. There is a special option called --enable-fuzz-support for use by people who
+want to run fuzzing tests on PCRE2. At present this applies only to the 8-bit
+library. If set, it causes an extra library called libpcre2-fuzzsupport.a to
+be built, but not installed. This contains a single function called
+LLVMFuzzerTestOneInput() whose arguments are a pointer to a string and the
+length of the string. When called, this function tries to compile the string
+as a pattern, and if that succeeds, to match it. This is done both with no
+options and with some random options bits that are generated from the string.
+Setting --enable-fuzz-support also causes a binary called pcre2fuzzcheck to
+be created. This is normally run under valgrind or used when PCRE2 is
+compiled with address sanitizing enabled. It calls the fuzzing function and
+outputs information about what it is doing. The input strings are specified by
+arguments: if an argument starts with "=" the rest of it is a literal input
+string. Otherwise, it is assumed to be a file name, and the contents of the
+file are the test string.
 The "configure" script builds the following files for the basic C library:
@@ -553,7 +563,7 @@ script creates the .txt and HTML forms of the documentation from the man pages.
 Testing PCRE2
-------------
+-------------
 To test the basic PCRE2 library on a Unix-like system, run the RunTest script.
 There is another script called RunGrepTest that tests the pcre2grep command.
@@ -767,6 +777,7 @@ The distribution should contain the files listed below.
 src/pcre2_xclass.c )
 src/pcre2_printint.c debugging function that is used by pcre2test,
+src/pcre2_fuzzsupport.c function for (optional) fuzzing support
 src/config.h.in template for config.h, when built by "configure"
 src/pcre2.h.in template for pcre2.h when built by "configure"
@@ -855,4 +866,4 @@ The distribution should contain the files listed below.
 Philip Hazel
 Email local part: ph10
 Email domain: cam.ac.uk
-Last updated: 07 October 2016
+Last updated: 01 November 2016
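
Note on the link-size hunk above: the arithmetic behind "around 64K bytes" can be sketched in C. This is only an illustration for the 8-bit library, where an offset of N bytes can address at most 2^(8N) - 1 bytes. The function name max_link_offset is hypothetical and is not part of PCRE2, and the real size limit may differ slightly from this upper bound.

```c
#include <assert.h>
#include <stdint.h>

/* Largest offset that fits in link_size bytes.
   Assumption: 8-bit library, link sizes 2, 3 or 4 as in the README text. */
static uint64_t max_link_offset(unsigned link_size)
{
    assert(link_size >= 2 && link_size <= 4);
    return (UINT64_C(1) << (8u * link_size)) - 1;
}
```

With the default link size of 2 this gives 65535, which matches the "around 64K bytes" figure; --with-link-size=3 raises the bound to 16777215.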


@@ -1,4 +1,4 @@
-.TH PCRE2BUILD 3 "07 October 2016" "PCRE2 10.23"
+.TH PCRE2BUILD 3 "01 November 2016" "PCRE2 10.23"
 .SH NAME
 PCRE2 - Perl-compatible regular expressions (revised API)
 .
@@ -515,6 +515,28 @@ information about code coverage, see the \fBgcov\fP and \fBlcov\fP
 documentation.
 .
 .
+.SH "SUPPORT FOR FUZZERS"
+.rs
+.sp
+There is a special option for use by people who want to run fuzzing tests on
+PCRE2:
+.sp
+--enable-fuzz-support
+.sp
+At present this applies only to the 8-bit library. If set, it causes an extra
+library called libpcre2-fuzzsupport.a to be built, but not installed. This
+contains a single function called LLVMFuzzerTestOneInput() whose arguments are
+a pointer to a string and the length of the string. When called, this function
+tries to compile the string as a pattern, and if that succeeds, to match it.
+This is done both with no options and with some random options bits that are
+generated from the string. Setting --enable-fuzz-support also causes a binary
+called \fBpcre2fuzzcheck\fP to be created. This is normally run under valgrind
+or used when PCRE2 is compiled with address sanitizing enabled. It calls the
+fuzzing function and outputs information about what it is doing. The input strings
+are specified by arguments: if an argument starts with "=" the rest of it is a
+literal input string. Otherwise, it is assumed to be a file name, and the
+contents of the file are the test string.
+.
 .SH "SEE ALSO"
 .rs
 .sp
@@ -535,6 +557,6 @@ Cambridge, England.
 .rs
 .sp
 .nf
-Last updated: 07 October 2016
+Last updated: 01 November 2016
 Copyright (c) 1997-2016 University of Cambridge.
 .fi