Documentation update.
parent bcba497c0b
commit 42549d089b
@@ -1,4 +1,4 @@
-.TH PCRE2TEST 1 "03 June 2017" "PCRE 10.30"
+.TH PCRE2TEST 1 "06 June 2017" "PCRE 10.30"
 .SH NAME
 pcre2test - a program for testing Perl-compatible regular expressions.
 .SH SYNOPSIS
@@ -67,7 +67,7 @@ no further data is read, so this character should be avoided unless you really
 want that action.
 .P
 The input is processed using using C's string functions, so must not
-contain binary zeroes, even though in Unix-like environments, \fBfgets()\fP
+contain binary zeros, even though in Unix-like environments, \fBfgets()\fP
 treats any bytes other than newline as data characters. An error is generated
 if a binary zero is encountered. Subject lines are processed for backslash
 escapes, which makes it possible to include any data value in strings that are
@@ -334,8 +334,9 @@ of the standard test input files.
 .P
 When the POSIX API is being tested there is no way to override the default
 newline convention, though it is possible to set the newline convention from
-within the pattern. A warning is given if the \fBposix\fP modifier is used when
-\fB#newline_default\fP would set a default for the non-POSIX API.
+within the pattern. A warning is given if the \fBposix\fP or \fBposix_nosub\fP
+modifier is used when \fB#newline_default\fP would set a default for the
+non-POSIX API.
 .sp
 #pattern <modifier-list>
 .sp
@@ -685,18 +686,6 @@ testing that \fBpcre2_compile()\fP behaves correctly in this case (it uses
 default values).
 .
 .
-.SS "Specifying the pattern's length"
-.rs
-.sp
-By default, patterns are passed to the compiling functions as zero-terminated
-strings. When using the POSIX wrapper API, there is no other option. However,
-when using PCRE2's native API, patterns can be passed by length instead of
-being zero-terminated. The \fBuse_length\fP modifier causes this to happen.
-Using a length happens automatically (whether or not \fBuse_length\fP is set)
-when \fBhex\fP is set, because patterns specified in hexadecimal may contain
-binary zeros.
-.
-.
 .SS "Specifying pattern characters in hexadecimal"
 .rs
 .sp
@@ -717,11 +706,23 @@ nine characters, only two of which are specified in hexadecimal:
 Either single or double quotes may be used. There is no way of including
 the delimiter within a substring. The \fBhex\fP and \fBexpand\fP modifiers are
 mutually exclusive.
+.
+.
+.SS "Specifying the pattern's length"
+.rs
+.sp
+By default, patterns are passed to the compiling functions as zero-terminated
+strings but can be passed by length instead of being zero-terminated. The
+\fBuse_length\fP modifier causes this to happen. Using a length happens
+automatically (whether or not \fBuse_length\fP is set) when \fBhex\fP is set,
+because patterns specified in hexadecimal may contain binary zeros.
 .P
-The POSIX API cannot be used with patterns specified in hexadecimal because
-they may contain binary zeros, which conflicts with \fBregcomp()\fP's
-requirement for a zero-terminated string. Such patterns are always passed to
-\fBpcre2_compile()\fP as a string with a length, not as zero-terminated.
+If \fBhex\fP or \fBuse_length\fP is used with the POSIX wrapper API (see
+.\" HTML <a href="#posixwrapper">
+.\" </a>
+"Using the POSIX wrapper API"
+.\"
+below), the REG_PEND extension is used to pass the pattern's length.
 .
 .
 .SS "Specifying wide characters in 16-bit and 32-bit modes"
@@ -787,7 +788,7 @@ below
 .\"
 for details of how these options are specified for each match attempt.
 .P
-JIT compilation is requested by the \fB/jit\fP pattern modifier, which may
+JIT compilation is requested by the \fBjit\fP pattern modifier, which may
 optionally be followed by an equals sign and a number in the range 0 to 7.
 The three bits that make up the number specify which of the three JIT operating
 modes are to be compiled:
@@ -811,7 +812,7 @@ to \fBpcre2_match()\fP with either the PCRE2_PARTIAL_SOFT or the
 PCRE2_PARTIAL_HARD option set. Note that such a call may return a complete
 match; the options enable the possibility of a partial match, but do not
 require it. Note also that if you request JIT compilation only for partial
-matching (for example, /jit=2) but do not set the \fBpartial\fP modifier on a
+matching (for example, jit=2) but do not set the \fBpartial\fP modifier on a
 subject line, that match will not use JIT code because none was compiled for
 non-partial matching.
 .P
@@ -888,10 +889,11 @@ causes a compilation error. The default is the largest number a PCRE2_SIZE
 variable can hold (essentially unlimited).
 .
 .
+.\" HTML <a name="posixwrapper"></a>
 .SS "Using the POSIX wrapper API"
 .rs
 .sp
-The \fB/posix\fP and \fBposix_nosub\fP modifiers cause \fBpcre2test\fP to call
+The \fBposix\fP and \fBposix_nosub\fP modifiers cause \fBpcre2test\fP to call
 PCRE2 via the POSIX wrapper API rather than its native API. When
 \fBposix_nosub\fP is used, the POSIX option REG_NOSUB is passed to
 \fBregcomp()\fP. The POSIX wrapper supports only the 8-bit library. Note that
@@ -921,6 +923,10 @@ large buffer is used.
 The \fBaftertext\fP and \fBallaftertext\fP subject modifiers work as described
 below. All other modifiers are either ignored, with a warning message, or cause
 an error.
+.P
+The pattern is passed to \fBregcomp()\fP as a zero-terminated string by
+default, but if the \fBuse_length\fP or \fBhex\fP modifiers are set, the
+REG_PEND extension is used to pass it by length.
 .
 .
 .SS "Testing the stack guard feature"
@@ -1041,11 +1047,11 @@ for a description of their effects.
 The partial matching modifiers are provided with abbreviations because they
 appear frequently in tests.
 .P
-If the \fBposix\fP modifier was present on the pattern, causing the POSIX
-wrapper API to be used, the only option-setting modifiers that have any effect
-are \fBnotbol\fP, \fBnotempty\fP, and \fBnoteol\fP, causing REG_NOTBOL,
-REG_NOTEMPTY, and REG_NOTEOL, respectively, to be passed to \fBregexec()\fP.
-The other modifiers are ignored, with a warning message.
+If the \fBposix\fP or \fBposix_nosub\fP modifier was present on the pattern,
+causing the POSIX wrapper API to be used, the only option-setting modifiers
+that have any effect are \fBnotbol\fP, \fBnotempty\fP, and \fBnoteol\fP,
+causing REG_NOTBOL, REG_NOTEMPTY, and REG_NOTEOL, respectively, to be passed to
+\fBregexec()\fP. The other modifiers are ignored, with a warning message.
 .P
 There is one additional modifier that can be used with the POSIX wrapper. It is
 ignored (with a warning) if used for non-POSIX matching.
@@ -1053,13 +1059,15 @@ ignored (with a warning) if used for non-POSIX matching.
 posix_startend=<n>[:<m>]
 .sp
 This causes the subject string to be passed to \fBregexec()\fP using the
-REG_STARTEND option, which uses offsets to restrict which part of the string is
+REG_STARTEND option, which uses offsets to specify which part of the string is
 searched. If only one number is given, the end offset is passed as the end of
 the subject string. For more detail of REG_STARTEND, see the
 .\" HREF
 \fBpcre2posix\fP
 .\"
-documentation.
+documentation. If the subject string contains binary zeros (coded as escapes
+such as \ex{00} because \fBpcre2test\fP does not support actual binary zeros in
+its input), you must use \fBposix_startend\fP to specify its length.
 .
 .
 .SS "Setting match controls"
@@ -1416,8 +1424,8 @@ pair of offsets.)
 By default, the subject string is passed to a native API matching function with
 its correct length. In order to test the facility for passing a zero-terminated
 string, the \fBzero_terminate\fP modifier is provided. It causes the length to
-be passed as PCRE2_ZERO_TERMINATED. (When matching via the POSIX interface,
-this modifier has no effect, as there is no facility for passing a length.)
+be passed as PCRE2_ZERO_TERMINATED. When matching via the POSIX interface,
+this modifier is ignored, with a warning.
 .P
 When testing \fBpcre2_substitute()\fP, this modifier also has the effect of
 passing the replacement string as zero-terminated.
@@ -1636,7 +1644,7 @@ the current position precedes the start position, which can happen if the
 callout is in a lookbehind assertion.
 .P
 Callouts numbered 255 are assumed to be automatic callouts, inserted as a
-result of the \fB/auto_callout\fP pattern modifier. In this case, instead of
+result of the \fBauto_callout\fP pattern modifier. In this case, instead of
 showing the callout number, the offset in the pattern, preceded by a plus, is
 output. For example:
 .sp
@@ -1807,6 +1815,6 @@ Cambridge, England.
 .rs
 .sp
 .nf
-Last updated: 03 June 2017
+Last updated: 06 June 2017
 Copyright (c) 1997-2017 University of Cambridge.
 .fi