Test binary zero in callout strings; change offset to PCRE2_SIZE; some documentation tidies.
Philip.Hazel 2015-03-16 15:38:26 +00:00
parent 2ec7cbf9b5
commit aa8d7342da
7 changed files with 51 additions and 16 deletions

@ -1,4 +1,4 @@
.TH PCRE2CALLOUT 3 "15 March 2015" "PCRE2 10.20"
.TH PCRE2CALLOUT 3 "16 March 2015" "PCRE2 10.20"
.SH NAME
PCRE2 - Perl-compatible regular expressions (revised API)
.SH SYNOPSIS
@ -197,8 +197,8 @@ documentation). The callout block structure contains the following fields:
PCRE2_SIZE \fIpattern_position\fP;
PCRE2_SIZE \fInext_item_length\fP;
PCRE2_SIZE \fIcallout_string_offset\fP;
PCRE2_SIZE \fIcallout_string_length\fP;
PCRE2_SPTR \fIcallout_string\fP;
uint32_t \fIcallout_string_length\fP;
.sp
The \fIversion\fP field contains the version number of the block format. The
@ -225,11 +225,12 @@ For callouts with string arguments, \fIcallout_number\fP is always zero, and
\fIcallout_string\fP points to the string that is contained within the compiled
pattern. Its length is given by \fIcallout_string_length\fP. Duplicated ending
delimiters that were present in the original pattern string have been turned
into single characters. An additional code unit containing binary zero is
present after the string, but is not included in the length. The delimiter that
was used to start the string is also stored within the pattern, immediately
before the string itself. You can therefore access this delimiter as
\fIcallout_string\fP[-1] if you need it.
into single characters, but there is no other processing of the callout string
argument. An additional code unit containing binary zero is present after the
string, but is not included in the length. The delimiter that was used to start
the string is also stored within the pattern, immediately before the string
itself. You can access this delimiter as \fIcallout_string\fP[-1] if you need
it.
.P
The \fIcallout_string_offset\fP field is the code unit offset to the start of
the callout argument string within the original pattern string. This is
@ -327,6 +328,6 @@ Cambridge, England.
.rs
.sp
.nf
Last updated: 15 March 2015
Last updated: 16 March 2015
Copyright (c) 1997-2015 University of Cambridge.
.fi
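
The note about the trailing binary zero and the explicit length matters once a callout string can itself contain a zero, which is what the new test in this commit exercises. The following is a minimal standalone sketch, not PCRE2 code: `format_callout_string` is a hypothetical helper, and plain `unsigned char *` and `size_t` stand in for `PCRE2_SPTR` and `PCRE2_SIZE` in the 8-bit library. It walks exactly `len` code units instead of stopping at the first zero, reads the starting delimiter from `str[-1]`, and assumes that an opening `{` is closed by `}` while every other delimiter closes itself.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper (not part of PCRE2): render a callout string with its
   delimiters, escaping non-printing code units as \xhh.
   `str` plays the role of callout_string and `len` of callout_string_length.
   Returns the number of characters written, or -1 if `out` is too small. */
static int format_callout_string(const unsigned char *str, size_t len,
                                 char *out, size_t outsize)
{
  size_t used = 0;
  unsigned char open_delim, close_delim;

  assert(str != NULL && out != NULL);

  /* The starting delimiter is stored immediately before the string. */
  open_delim = str[-1];
  /* Assumption: '{' is closed by '}', other delimiters close themselves. */
  close_delim = (open_delim == '{') ? '}' : open_delim;

  /* Positions 0 and len+1 are the delimiters; 1..len are the string itself.
     The loop is driven by `len`, so an embedded zero does not end it. */
  for (size_t i = 0; i < len + 2; i++) {
    unsigned char c = (i == 0) ? open_delim
                    : (i == len + 1) ? close_delim
                    : str[i - 1];
    int n;

    if (c >= 32 && c < 127)
      n = snprintf(out + used, outsize - used, "%c", c);
    else
      n = snprintf(out + used, outsize - used, "\\x%02x", (unsigned int)c);

    if (n < 0 || (size_t)n >= outsize - used) return -1;
    used += (size_t)n;
  }
  return (int)used;
}
```

For the stored bytes `'`, `x`, 0x00, `z`, 0x00 with a length of 3, this yields the same `'x\x00z'` text that appears in the test output of this commit, whereas `strlen()` on the same pointer would report 1.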

@ -1,4 +1,4 @@
.TH PCRE2TEST 1 "14 March 2015" "PCRE 10.20"
.TH PCRE2TEST 1 "16 March 2015" "PCRE 10.20"
.SH NAME
pcre2test - a program for testing Perl-compatible regular expressions.
.SH SYNOPSIS
@ -61,11 +61,17 @@ names used in the libraries have a suffix _8, _16, or _32, as appropriate.
.sp
Input to \fBpcre2test\fP is processed line by line, either by calling the C
library's \fBfgets()\fP function, or via the \fBlibreadline\fP library (see
below). In Unix-like environments, \fBfgets()\fP treats any bytes other than
newline as data characters. However, in some Windows environments character 26
(hex 1A) causes an immediate end of file, and no further data is read. For
maximum portability, therefore, it is safest to avoid non-printing characters
in \fBpcre2test\fP input files.
below). The input is processed using C's string functions, so must not
contain binary zeroes, even though in Unix-like environments, \fBfgets()\fP
treats any bytes other than newline as data characters. In some Windows
environments character 26 (hex 1A) causes an immediate end of file, and no
further data is read.
.P
For maximum portability, therefore, it is safest to avoid non-printing
characters in \fBpcre2test\fP input files. There is a facility for specifying a
pattern's characters as hexadecimal pairs, thus making it possible to include
binary zeroes in a pattern for testing purposes. Subject lines are processed
for backslash escapes, which makes it possible to include any data value.
.
.
.SH "COMMAND LINE OPTIONS"
@ -1431,6 +1437,6 @@ Cambridge, England.
.rs
.sp
.nf
Last updated: 14 March 2015
Last updated: 16 March 2015
Copyright (c) 1997-2015 University of Cambridge.
.fi
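
To make the hexadecimal-pair facility concrete, here is a small sketch of how a line of hex pairs maps to pattern bytes, including a binary zero that could not appear in the input file directly. This is an illustration only and not pcre2test's own parser: `hex_value` and `decode_hex_pairs` are hypothetical names, and the sketch assumes pairs separated by optional white space and an ASCII execution character set.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Value of one hexadecimal digit, or -1 if `c` is not a hex digit. */
static int hex_value(int c)
{
  if (c >= '0' && c <= '9') return c - '0';
  c = tolower(c);
  if (c >= 'a' && c <= 'f') return c - 'a' + 10;
  return -1;
}

/* Hypothetical helper (not pcre2test code): decode white-space separated
   hex pairs from `text` into `out`. Returns the number of bytes produced,
   or -1 on a malformed pair or if `out` is too small. */
static long decode_hex_pairs(const char *text, unsigned char *out,
                             size_t outsize)
{
  size_t n = 0;

  assert(text != NULL && out != NULL);

  while (*text != '\0') {
    int hi, lo;

    if (isspace((unsigned char)*text)) { text++; continue; }

    hi = hex_value((unsigned char)text[0]);
    /* Only look at the second digit when the first one is valid, so the
       terminating zero of `text` is never stepped over. */
    lo = (hi < 0) ? -1 : hex_value((unsigned char)text[1]);
    if (hi < 0 || lo < 0 || n >= outsize) return -1;

    out[n++] = (unsigned char)(hi * 16 + lo);
    text += 2;
  }
  return (long)n;
}
```

Applied to ` 61 28 3f 43 27 78 00 7a 27 29 62`, the pattern used by the tests added in this commit, the result is the 11 bytes of `a(?C'x`, a binary zero, and `z')b`. The decoded length has to be carried alongside the buffer, because the embedded zero means the result is no longer a usable C string.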

@ -339,8 +339,8 @@ typedef struct pcre2_callout_block { \
PCRE2_SIZE next_item_length; /* Length of next item in the pattern */ \
/* ------------------- Added for Version 1 -------------------------- */ \
PCRE2_SIZE callout_string_offset; /* Offset to string within pattern */ \
PCRE2_SIZE callout_string_length; /* Length of string compiled into pattern */ \
PCRE2_SPTR callout_string; /* String compiled into pattern */ \
uint32_t callout_string_length; /* Length of string compiled into pattern */ \
/* ------------------------------------------------------------------ */ \
} pcre2_callout_block;

testdata/testinput2 (5 lines changed)

@ -4224,4 +4224,9 @@ a random value. /Ix
/(?:a(?C`code`)){3}X/
aaaXY
# Binary zero in callout string
# a ( ? C ' x z ' ) b
/ 61 28 3f 43 27 78 00 7a 27 29 62/hex
abcdefgh
# End of testinput2

testdata/testinput6 (5 lines changed)

@ -4841,4 +4841,9 @@
/(?:a(?C`code`)){3}X/
aaaXY
# Binary zero in callout string
# a ( ? C ' x z ' ) b
/ 61 28 3f 43 27 78 00 7a 27 29 62/hex
abcdefgh
# End of testinput6

@ -14169,4 +14169,13 @@ Callout (8): `code`
^ ^ )
0: aaaX
# Binary zero in callout string
# a ( ? C ' x z ' ) b
/ 61 28 3f 43 27 78 00 7a 27 29 62/hex
abcdefgh
Callout (5): 'x\x00z'
--->abcdefgh
^^ b
0: ab
# End of testinput2

@ -7910,4 +7910,13 @@ Callout (8): `code`
^ ^ )
0: aaaX
# Binary zero in callout string
# a ( ? C ' x z ' ) b
/ 61 28 3f 43 27 78 00 7a 27 29 62/hex
abcdefgh
Callout (5): 'x\x00z'
--->abcdefgh
^^ b
0: ab
# End of testinput6