Minor pcre2test code fix for framesize modifier and documentation update.

This commit is contained in:
Philip.Hazel 2017-03-21 12:22:58 +00:00
parent 20cafacaa4
commit 31455a7665
2 changed files with 27 additions and 21 deletions

View File

@ -1,4 +1,4 @@
.TH PCRE2TEST 1 "28 December 2016" "PCRE 10.23"
.TH PCRE2TEST 1 "21 March 2017" "PCRE 10.30"
.SH NAME
pcre2test - a program for testing Perl-compatible regular expressions.
.SH SYNOPSIS
@ -98,12 +98,13 @@ character values up to 0x7fffffff. Each character is placed in one 16-bit or
32-bit code unit (in the 16-bit case, values greater than 0xffff cause an error
to occur).
.P
UTF-8 is not capable of encoding values greater than 0x7fffffff, but such
values can be handled by the 32-bit library. When testing this library in
non-UTF mode with \fButf8_input\fP set, if any character is preceded by the
byte 0xff (which is an illegal byte in UTF-8) 0x80000000 is added to the
character's value. This is the only way of passing such code points in a
pattern string. For subject strings, using an escape sequence is preferable.
UTF-8 (in its original definition) is not capable of encoding values greater
than 0x7fffffff, but such values can be handled by the 32-bit library. When
testing this library in non-UTF mode with \fButf8_input\fP set, if any
character is preceded by the byte 0xff (which is an illegal byte in UTF-8)
0x80000000 is added to the character's value. This is the only way of passing
such code points in a pattern string. For subject strings, using an escape
sequence is preferable.
.
.
.SH "COMMAND LINE OPTIONS"
@ -566,6 +567,7 @@ about the pattern:
/B bincode show binary code without lengths
callout_info show callout information
debug same as info,fullbincode
framesize show matching frame size
fullbincode show binary code with lengths
/I info show info about compiled pattern
hex unquoted characters are hexadecimal
@ -648,6 +650,10 @@ unit" is the last literal code unit that must be present in any match. This is
not necessarily the last character. These lines are omitted if no starting or
ending code units are recorded.
.P
The \fBframesize\fP modifier shows the size, in bytes, of the storage frames
used by \fBpcre2_match()\fP for handling backtracking. The size depends on the
number of capturing parentheses in the pattern.
.P
The \fBcallout_info\fP modifier requests information about all the callouts in
the pattern. A list of them is output at the end of any other information that
is requested. For each callout, either its number or string is given, followed
@ -1044,6 +1050,7 @@ pattern.
callout_fail=<n>[:<m>] control callout failure
callout_none do not supply a callout function
copy=<number or name> copy captured substring
depth_limit=<n> set a depth limit
dfa use \fBpcre2_dfa_match()\fP
find_limits find match and recursion limits
get=<number or name> extract captured substring
@ -1057,7 +1064,7 @@ pattern.
offset=<n> set starting offset
offset_limit=<n> set offset limit
ovector=<n> set size of output vector
recursion_limit=<n> set a recursion limit
recursion_limit=<n> obsolete synonym for depth_limit
replace=<string> specify a replacement string
startchar show startchar when relevant
startoffset=<n> same as offset=<n>
@ -1279,10 +1286,10 @@ stack that is larger than the default 32K is necessary only for very
complicated patterns.
.
.
.SS "Setting match and recursion limits"
.SS "Setting match and depth limits"
.rs
.sp
The \fBmatch_limit\fP and \fBrecursion_limit\fP modifiers set the appropriate
The \fBmatch_limit\fP and \fBdepth_limit\fP modifiers set the appropriate
limits in the match context. These values are ignored when the
\fBfind_limits\fP modifier is specified.
.
@ -1292,21 +1299,21 @@ limits in the match context. These values are ignored when the
.sp
If the \fBfind_limits\fP modifier is present, \fBpcre2test\fP calls
\fBpcre2_match()\fP several times, setting different values in the match
context via \fBpcre2_set_match_limit()\fP and \fBpcre2_set_recursion_limit()\fP
context via \fBpcre2_set_match_limit()\fP and \fBpcre2_set_depth_limit()\fP
until it finds the minimum values for each parameter that allow
\fBpcre2_match()\fP to complete without error.
.P
If JIT is being used, only the match limit is relevant. If DFA matching is
being used, neither limit is relevant, and this modifier is ignored (with a
warning message).
being used, only the depth limit is relevant, but at present this modifier is
ignored (with a warning message).
.P
The \fImatch_limit\fP number is a measure of the amount of backtracking
that takes place, and learning the minimum value can be instructive. For most
simple matches, the number is quite small, but for patterns with very large
numbers of matching possibilities, it can become large very quickly with
increasing length of subject string. The \fImatch_limit_recursion\fP number is
a measure of how much stack (or, if PCRE2 is compiled with NO_RECURSE, how much
heap) memory is needed to complete the match attempt.
increasing length of subject string. The \fIdepth_limit\fP number is
a measure of how much memory for recording backtracking points is needed to
complete the match attempt.
.
.
.SS "Showing MARK names"
@ -1425,7 +1432,7 @@ code unit offset of the start of the failing character is also output. Here is
an example of an interactive \fBpcre2test\fP run.
.sp
$ pcre2test
PCRE2 version 9.00 2014-05-10
PCRE2 version 10.22 2016-07-29
.sp
re> /^abc(\ed+)/
data> abc123
@ -1758,6 +1765,6 @@ Cambridge, England.
.rs
.sp
.nf
Last updated: 28 December 2016
Copyright (c) 1997-2016 University of Cambridge.
Last updated: 21 March 2017
Copyright (c) 1997-2017 University of Cambridge.
.fi

View File

@ -467,7 +467,6 @@ data line. */
CTL_ALLCAPTURES|\
CTL_ALLUSEDTEXT|\
CTL_ALTGLOBAL|\
CTL_FRAMESIZE|\
CTL_GLOBAL|\
CTL_MARK|\
CTL_MEMORY|\
@ -577,7 +576,7 @@ static modstruct modlist[] = {
{ "extended", MOD_PATP, MOD_OPT, PCRE2_EXTENDED, PO(options) },
{ "find_limits", MOD_DAT, MOD_CTL, CTL_FINDLIMITS, DO(control) },
{ "firstline", MOD_PAT, MOD_OPT, PCRE2_FIRSTLINE, PO(options) },
{ "framesize", MOD_PD, MOD_CTL, CTL_FRAMESIZE, PD(control) },
{ "framesize", MOD_PAT, MOD_CTL, CTL_FRAMESIZE, PO(control) },
{ "fullbincode", MOD_PAT, MOD_CTL, CTL_FULLBINCODE, PO(control) },
{ "get", MOD_DAT, MOD_NN, DO(get_numbers), DO(get_names) },
{ "getall", MOD_DAT, MOD_CTL, CTL_GETALL, DO(control) },