Documentation for offset limits.

Philip.Hazel 2015-09-22 16:51:11 +00:00
parent 74affd9210
commit c70450d08b
11 changed files with 5259 additions and 746 deletions

@ -64,6 +64,7 @@ dist_html_DATA = \
doc/html/pcre2_set_character_tables.html \
doc/html/pcre2_set_compile_recursion_guard.html \
doc/html/pcre2_set_match_limit.html \
doc/html/pcre2_set_offset_limit.html \
doc/html/pcre2_set_newline.html \
doc/html/pcre2_set_parens_nest_limit.html \
doc/html/pcre2_set_recursion_limit.html \
@ -143,6 +144,7 @@ dist_man_MANS = \
doc/pcre2_set_character_tables.3 \
doc/pcre2_set_compile_recursion_guard.3 \
doc/pcre2_set_match_limit.3 \
doc/pcre2_set_offset_limit.3 \
doc/pcre2_set_newline.3 \
doc/pcre2_set_parens_nest_limit.3 \
doc/pcre2_set_recursion_limit.3 \

@ -210,6 +210,9 @@ in the library.
<tr><td><a href="pcre2_set_match_limit.html">pcre2_set_match_limit</a></td>
<td>&nbsp;&nbsp;Set the match limit</td></tr>
<tr><td><a href="pcre2_set_offset_limit.html">pcre2_set_offset_limit</a></td>
<td>&nbsp;&nbsp;Set the offset limit</td></tr>
<tr><td><a href="pcre2_set_newline.html">pcre2_set_newline</a></td>
<td>&nbsp;&nbsp;Set the newline convention</td></tr>

@ -0,0 +1,40 @@
<html>
<head>
<title>pcre2_set_offset_limit specification</title>
</head>
<body bgcolor="#FFFFFF" text="#00005A" link="#0066FF" alink="#3399FF" vlink="#2222BB">
<h1>pcre2_set_offset_limit man page</h1>
<p>
Return to the <a href="index.html">PCRE2 index page</a>.
</p>
<p>
This page is part of the PCRE2 HTML documentation. It was generated
automatically from the original man page. If there is any nonsense in it,
please consult the man page, in case the conversion went wrong.
<br>
<br><b>
SYNOPSIS
</b><br>
<P>
<b>#include &#60;pcre2.h&#62;</b>
</P>
<P>
<b>int pcre2_set_offset_limit(pcre2_match_context *<i>mcontext</i>,</b>
<b> PCRE2_SIZE <i>value</i>);</b>
</P>
<br><b>
DESCRIPTION
</b><br>
<P>
This function sets the offset limit field in a match context. The result is
always zero.
</P>
<P>
There is a complete description of the PCRE2 native API in the
<a href="pcre2api.html"><b>pcre2api</b></a>
page and a description of the POSIX API in the
<a href="pcre2posix.html"><b>pcre2posix</b></a>
page.
<p>
Return to the <a href="index.html">PCRE2 index page</a>.
</p>

@ -176,6 +176,10 @@ document for an overview of all the PCRE2 documentation.
<b> uint32_t <i>value</i>);</b>
<br>
<br>
<b>int pcre2_set_offset_limit(pcre2_match_context *<i>mcontext</i>,</b>
<b> PCRE2_SIZE <i>value</i>);</b>
<br>
<br>
<b>int pcre2_set_recursion_limit(pcre2_match_context *<i>mcontext</i>,</b>
<b> uint32_t <i>value</i>);</b>
<br>
@ -697,6 +701,7 @@ A match context is required if you want to change the default values of any
of the following match-time parameters:
<pre>
A callout function
The offset limit for matching an unanchored pattern
The limit for calling <i>match()</i>
The limit for calling <i>match()</i> recursively
</pre>
@ -729,6 +734,30 @@ This sets up a "callout" function, which PCRE2 will call at specified points
during a matching operation. Details are given in the
<a href="pcre2callout.html"><b>pcre2callout</b></a>
documentation.
<b>int pcre2_set_offset_limit(pcre2_match_context *<i>mcontext</i>,</b>
<b> PCRE2_SIZE <i>value</i>);</b>
<br>
<br>
The <i>offset_limit</i> parameter limits how far an unanchored search can
advance in the subject string. The default value is PCRE2_UNSET. The
<b>pcre2_match()</b> and <b>pcre2_dfa_match()</b> functions return
PCRE2_ERROR_NOMATCH if a match with a starting point before or at the given
offset is not found. For example, if the pattern /abc/ is matched against
"123abc" with an offset limit less than 3, the result is PCRE2_ERROR_NOMATCH.
A match can never be found if the <i>startoffset</i> argument of
<b>pcre2_match()</b> or <b>pcre2_dfa_match()</b> is greater than the offset
limit.
</P>
<P>
When using this facility, you must set PCRE2_USE_OFFSET_LIMIT when calling
<b>pcre2_compile()</b> so that when JIT is in use, different code can be
compiled. If a match is started with a non-default offset limit when
PCRE2_USE_OFFSET_LIMIT is not set, an error is generated.
</P>
<P>
The offset limit facility can be used to track progress when searching large
subject strings. See also the PCRE2_FIRSTLINE option, which requires a match to
start within the first line of the subject.
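The offset-limit rule described above can be illustrated with a small stand-alone model. The sketch below is not PCRE2 code and does not call the library: it assumes a literal pattern, and the names model_search, MODEL_UNSET, and MODEL_NOMATCH are hypothetical stand-ins for the matching function, PCRE2_UNSET, and PCRE2_ERROR_NOMATCH. It only shows which starting positions an unanchored search is allowed to try.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MODEL_UNSET   SIZE_MAX  /* hypothetical: "no offset limit" */
#define MODEL_NOMATCH (-1L)     /* hypothetical: "no match" result */

/* Model of an unanchored search for a literal needle.
   Tries each starting position from start_offset up to and including
   offset_limit, and returns the first position that matches, or
   MODEL_NOMATCH if none of the permitted positions matches. */
long model_search(const char *subject, const char *needle,
                  size_t start_offset, size_t offset_limit)
{
    size_t subject_len = strlen(subject);
    size_t needle_len = strlen(needle);

    if (needle_len > subject_len) return MODEL_NOMATCH;

    /* Last position at which the needle still fits in the subject. */
    size_t last = subject_len - needle_len;

    /* The offset limit can only lower that bound; MODEL_UNSET never does. */
    if (offset_limit < last) last = offset_limit;

    /* If start_offset is already beyond the bound, the loop body never runs. */
    for (size_t pos = start_offset; pos <= last; pos++) {
        if (memcmp(subject + pos, needle, needle_len) == 0) return (long)pos;
    }
    return MODEL_NOMATCH;
}
```

Under these assumptions, model_search("123abc", "abc", 0, 2) reports no match, while a limit of 3 or MODEL_UNSET finds the match at offset 3. With the real library, the corresponding steps are to compile the pattern with PCRE2_USE_OFFSET_LIMIT and to call pcre2_set_offset_limit() on the match context that is passed to pcre2_match().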
<b>int pcre2_set_match_limit(pcre2_match_context *<i>mcontext</i>,</b>
<b> uint32_t <i>value</i>);</b>
<br>
@ -1168,7 +1197,8 @@ built.
</pre>
If this option is set, an unanchored pattern is required to match before or at
the first newline in the subject string, though the matched text may continue
over the newline.
over the newline. See also PCRE2_USE_OFFSET_LIMIT, which provides a more
general limiting facility.
<pre>
PCRE2_MATCH_UNSET_BACKREF
</pre>
@ -1350,6 +1380,17 @@ support.
This option inverts the "greediness" of the quantifiers so that they are not
greedy by default, but become greedy if followed by "?". It is not compatible
with Perl. It can also be set by a (?U) option setting within the pattern.
<pre>
PCRE2_USE_OFFSET_LIMIT
</pre>
This option must be set for <b>pcre2_compile()</b> if
<b>pcre2_set_offset_limit()</b> is going to be used to set a non-default offset
limit in a match context for matches that use this pattern. An error is
generated if an offset limit is set without this option. For more details, see
the description of <b>pcre2_set_offset_limit()</b> in the
<a href="#matchcontext">section</a>
that describes match contexts. See also the PCRE2_FIRSTLINE
option above.
<pre>
PCRE2_UTF
</pre>
@ -2912,7 +2953,7 @@ Cambridge, England.
</P>
<br><a name="SEC40" href="#TOC1">REVISION</a><br>
<P>
Last updated: 02 September 2015
Last updated: 22 September 2015
<br>
Copyright &copy; 1997-2015 University of Cambridge.
<br>

@ -485,6 +485,12 @@ the start of a modifier list. For example:
<pre>
abc\=notbol,notempty
</pre>
If the subject string is empty and \= is followed by whitespace, the line is
treated as a comment line, and is not used for matching. For example:
<pre>
\= This is a comment.
abc\= This is an invalid modifier list.
</pre>
A backslash followed by any other non-alphanumeric character just escapes that
character. A backslash followed by anything else causes an error. However, if
the very last character in the line is a backslash (and there is no modifier
@ -691,9 +697,9 @@ If no number is given, 7 is assumed. The phrase "partial matching" means a call
to <b>pcre2_match()</b> with either the PCRE2_PARTIAL_SOFT or the
PCRE2_PARTIAL_HARD option set. Note that such a call may return a complete
match; the options enable the possibility of a partial match, but do not
require it. Note also that if you request JIT compilation only for partial
matching (for example, /jit=2) but do not set the <b>partial</b> modifier on a
subject line, that match will not use JIT code because none was compiled for
require it. Note also that if you request JIT compilation only for partial
matching (for example, /jit=2) but do not set the <b>partial</b> modifier on a
subject line, that match will not use JIT code because none was compiled for
non-partial matching.
</P>
<P>
@ -1533,7 +1539,7 @@ Cambridge, England.
</P>
<br><a name="SEC21" href="#TOC1">REVISION</a><br>
<P>
Last updated: 12 September 2015
Last updated: 14 September 2015
<br>
Copyright &copy; 1997-2015 University of Cambridge.
<br>

@ -210,6 +210,9 @@ in the library.
<tr><td><a href="pcre2_set_match_limit.html">pcre2_set_match_limit</a></td>
<td>&nbsp;&nbsp;Set the match limit</td></tr>
<tr><td><a href="pcre2_set_offset_limit.html">pcre2_set_offset_limit</a></td>
<td>&nbsp;&nbsp;Set the offset limit</td></tr>
<tr><td><a href="pcre2_set_newline.html">pcre2_set_newline</a></td>
<td>&nbsp;&nbsp;Set the newline convention</td></tr>

File diff suppressed because it is too large.

@ -0,0 +1,28 @@
.TH PCRE2_SET_OFFSET_LIMIT 3 "22 September 2015" "PCRE2 10.21"
.SH NAME
PCRE2 - Perl-compatible regular expressions (revised API)
.SH SYNOPSIS
.rs
.sp
.B #include <pcre2.h>
.PP
.nf
.B int pcre2_set_offset_limit(pcre2_match_context *\fImcontext\fP,
.B " PCRE2_SIZE \fIvalue\fP);"
.fi
.
.SH DESCRIPTION
.rs
.sp
This function sets the offset limit field in a match context. The result is
always zero.
.P
There is a complete description of the PCRE2 native API in the
.\" HREF
\fBpcre2api\fP
.\"
page and a description of the POSIX API in the
.\" HREF
\fBpcre2posix\fP
.\"
page.

@ -1,4 +1,4 @@
.TH PCRE2API 3 "02 September 2015" "PCRE2 10.21"
.TH PCRE2API 3 "22 September 2015" "PCRE2 10.21"
.SH NAME
PCRE2 - Perl-compatible regular expressions (revised API)
.sp
@ -120,6 +120,9 @@ document for an overview of all the PCRE2 documentation.
.B int pcre2_set_match_limit(pcre2_match_context *\fImcontext\fP,
.B " uint32_t \fIvalue\fP);"
.sp
.B int pcre2_set_offset_limit(pcre2_match_context *\fImcontext\fP,
.B " PCRE2_SIZE \fIvalue\fP);"
.sp
.B int pcre2_set_recursion_limit(pcre2_match_context *\fImcontext\fP,
.B " uint32_t \fIvalue\fP);"
.sp
@ -659,6 +662,7 @@ A match context is required if you want to change the default values of any
of the following match-time parameters:
.sp
A callout function
The offset limit for matching an unanchored pattern
The limit for calling \fImatch()\fP
The limit for calling \fImatch()\fP recursively
.sp
@ -696,6 +700,30 @@ during a matching operation. Details are given in the
documentation.
.sp
.nf
.B int pcre2_set_offset_limit(pcre2_match_context *\fImcontext\fP,
.B " PCRE2_SIZE \fIvalue\fP);"
.fi
.sp
The \fIoffset_limit\fP parameter limits how far an unanchored search can
advance in the subject string. The default value is PCRE2_UNSET. The
\fBpcre2_match()\fP and \fBpcre2_dfa_match()\fP functions return
PCRE2_ERROR_NOMATCH if a match with a starting point before or at the given
offset is not found. For example, if the pattern /abc/ is matched against
"123abc" with an offset limit less than 3, the result is PCRE2_ERROR_NOMATCH.
A match can never be found if the \fIstartoffset\fP argument of
\fBpcre2_match()\fP or \fBpcre2_dfa_match()\fP is greater than the offset
limit.
.P
When using this facility, you must set PCRE2_USE_OFFSET_LIMIT when calling
\fBpcre2_compile()\fP so that when JIT is in use, different code can be
compiled. If a match is started with a non-default offset limit when
PCRE2_USE_OFFSET_LIMIT is not set, an error is generated.
.P
The offset limit facility can be used to track progress when searching large
subject strings. See also the PCRE2_FIRSTLINE option, which requires a match to
start within the first line of the subject.
.sp
.nf
.B int pcre2_set_match_limit(pcre2_match_context *\fImcontext\fP,
.B " uint32_t \fIvalue\fP);"
.fi
@ -1142,7 +1170,8 @@ built.
.sp
If this option is set, an unanchored pattern is required to match before or at
the first newline in the subject string, though the matched text may continue
over the newline.
over the newline. See also PCRE2_USE_OFFSET_LIMIT, which provides a more
general limiting facility.
.sp
PCRE2_MATCH_UNSET_BACKREF
.sp
@ -1335,6 +1364,20 @@ support.
This option inverts the "greediness" of the quantifiers so that they are not
greedy by default, but become greedy if followed by "?". It is not compatible
with Perl. It can also be set by a (?U) option setting within the pattern.
.sp
PCRE2_USE_OFFSET_LIMIT
.sp
This option must be set for \fBpcre2_compile()\fP if
\fBpcre2_set_offset_limit()\fP is going to be used to set a non-default offset
limit in a match context for matches that use this pattern. An error is
generated if an offset limit is set without this option. For more details, see
the description of \fBpcre2_set_offset_limit()\fP in the
.\" HTML <a href="#matchcontext">
.\" </a>
section
.\"
that describes match contexts. See also the PCRE2_FIRSTLINE
option above.
.sp
PCRE2_UTF
.sp
@ -2965,6 +3008,6 @@ Cambridge, England.
.rs
.sp
.nf
Last updated: 02 September 2015
Last updated: 22 September 2015
Copyright (c) 1997-2015 University of Cambridge.
.fi

@ -1,4 +1,4 @@
.TH PCRE2TEST 1 "14 September 2015" "PCRE 10.21"
.TH PCRE2TEST 1 "22 September 2015" "PCRE 10.21"
.SH NAME
pcre2test - a program for testing Perl-compatible regular expressions.
.SH SYNOPSIS
@ -472,7 +472,7 @@ can add to or override default modifiers that were set by a previous
The following modifiers set options for \fBpcre2_compile()\fP. The most common
ones have single-letter abbreviations. See
.\" HREF
\fBpcreapi\fP
\fBpcre2api\fP
.\"
for a description of their effects.
.sp
@ -500,6 +500,7 @@ for a description of their effects.
no_utf_check set PCRE2_NO_UTF_CHECK
ucp set PCRE2_UCP
ungreedy set PCRE2_UNGREEDY
use_offset_limit set PCRE2_USE_OFFSET_LIMIT
utf set PCRE2_UTF
.sp
As well as turning on the PCRE2_UTF option, the \fButf\fP modifier causes all
@ -892,9 +893,10 @@ pattern.
/g global global matching
jitstack=<n> set size of JIT stack
mark show mark values
match_limit=>n> set a match limit
match_limit=<n> set a match limit
memory show memory usage
offset=<n> set starting offset
offset_limit=<n> set offset limit
ovector=<n> set size of output vector
recursion_limit=<n> set a recursion limit
replace=<string> specify a replacement string
@ -1133,6 +1135,16 @@ The \fBoffset\fP modifier sets an offset in the subject string at which
matching starts. Its value is a number of code units, not characters.
.
.
.SS "Setting an offset limit"
.rs
.sp
The \fBoffset_limit\fP modifier sets a limit for unanchored matches. If a match
cannot be found starting at or before this offset in the subject, a "no match"
return is given. The data value is a number of code units, not characters. When
this modifier is used, the \fBuse_offset_limit\fP modifier must have been set
for the pattern; if not, an error is generated.
.
.
.SS "Setting the size of the output vector"
.rs
.sp
@ -1525,6 +1537,6 @@ Cambridge, England.
.rs
.sp
.nf
Last updated: 14 September 2015
Last updated: 22 September 2015
Copyright (c) 1997-2015 University of Cambridge.
.fi

@ -432,6 +432,13 @@ SUBJECT LINE SYNTAX
abc\=notbol,notempty
If the subject string is empty and \= is followed by whitespace, the
line is treated as a comment line, and is not used for matching. For
example:
\= This is a comment.
abc\= This is an invalid modifier list.
A backslash followed by any other non-alphanumeric character just
escapes that character. A backslash followed by anything else causes an
error. However, if the very last character in the line is a backslash
@ -1391,5 +1398,5 @@ AUTHOR
REVISION
Last updated: 12 September 2015
Last updated: 14 September 2015
Copyright (c) 1997-2015 University of Cambridge.