Documentation update.

Philip.Hazel 2019-06-21 16:10:17 +00:00
parent 175b4919f7
commit a89423624d
5 changed files with 666 additions and 627 deletions


@ -355,7 +355,14 @@ characters:
</PRE> </PRE>
</P> </P>
<P> <P>
3. Because a partial match must always contain at least one character, what 3. The maximum lookbehind count is also important when the result of a partial
match attempt is "no match". In this case, the maximum lookbehind characters
from the end of the current segment must be retained at the start of the next
segment, in case the lookbehind is at the start of the pattern. Matching the
next segment must then start at the appropriate offset.
</P>
<P>
4. Because a partial match must always contain at least one character, what
might be considered a partial match of an empty string actually gives a "no might be considered a partial match of an empty string actually gives a "no
match" result. For example: match" result. For example:
<pre> <pre>
@ -369,7 +376,7 @@ happen if characters from the previous segment are retained. For this reason, a
when the pattern contains lookbehinds. when the pattern contains lookbehinds.
</P> </P>
<P> <P>
4. Matching a subject string that is split into multiple segments may not 5. Matching a subject string that is split into multiple segments may not
always produce exactly the same result as matching over one single long string, always produce exactly the same result as matching over one single long string,
especially when PCRE2_PARTIAL_SOFT is used. The section "Partial Matching and especially when PCRE2_PARTIAL_SOFT is used. The section "Partial Matching and
Word Boundaries" above describes an issue that arises if the pattern ends with Word Boundaries" above describes an issue that arises if the pattern ends with
@ -411,7 +418,7 @@ multi-segment data. The example above then behaves differently:
data&#62; gsb\=ph,dfa,dfa_restart data&#62; gsb\=ph,dfa,dfa_restart
Partial match: gsb Partial match: gsb
</pre> </pre>
5. Patterns that contain alternatives at the top level which do not all start 6. Patterns that contain alternatives at the top level which do not all start
with the same pattern item may not work as expected when PCRE2_DFA_RESTART is with the same pattern item may not work as expected when PCRE2_DFA_RESTART is
used. For example, consider this pattern: used. For example, consider this pattern:
<pre> <pre>
@ -456,9 +463,9 @@ Cambridge, England.
</P> </P>
<br><a name="SEC10" href="#TOC1">REVISION</a><br> <br><a name="SEC10" href="#TOC1">REVISION</a><br>
<P> <P>
Last updated: 22 December 2014 Last updated: 21 June 2019
<br> <br>
Copyright &copy; 1997-2014 University of Cambridge. Copyright &copy; 1997-2019 University of Cambridge.
<br> <br>
<p> <p>
Return to the <a href="index.html">PCRE2 index page</a>. Return to the <a href="index.html">PCRE2 index page</a>.
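
As an illustration of the new item 3 in the hunk above (not part of the committed text), the retention rule can be sketched in Python. Python's standard `re` module has no partial matching, but it does let a lookbehind inspect characters that lie before the search start offset, which is enough to show why the tail of a "no match" segment has to be kept. The pattern, the segment contents, the helper name `match_segments`, and the hand-written `MAX_LOOKBEHIND` value are all assumptions made for this sketch; with PCRE2 itself the count would normally be read from the compiled pattern (PCRE2_INFO_MAXLOOKBEHIND) rather than hard-coded.

```python
import re

# Hypothetical pattern: the lookbehind sits at the start of the pattern.
PATTERN = re.compile(r"(?<=abc)123")
MAX_LOOKBEHIND = 3  # assumed by hand: length of the longest lookbehind


def match_segments(segments):
    """Search each segment in turn, carrying lookbehind context forward."""
    carry = ""
    for segment in segments:
        buffer = carry + segment
        # Start at the first new character; the lookbehind may still
        # inspect the retained characters before that offset.
        m = PATTERN.search(buffer, len(carry))
        if m:
            return m.group(0)
        # "No match": keep the tail of the buffer for the next segment.
        carry = buffer[-MAX_LOOKBEHIND:]
    return None


# Searching each segment on its own finds nothing ...
naive = [PATTERN.search(s) for s in ("xxabc", "123yy")]
# ... while retaining the last three characters lets the lookbehind succeed.
result = match_segments(["xxabc", "123yy"])
```

The sketch only covers the "no match" case described in item 3. A match that itself straddles two segments needs real partial matching, which this Python stand-in does not attempt.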


@ -2014,8 +2014,10 @@ no characters with a quantifier that has no upper limit, for example:
</pre> </pre>
Earlier versions of Perl and PCRE1 used to give an error at compile time for Earlier versions of Perl and PCRE1 used to give an error at compile time for
such patterns. However, because there are cases where this can be useful, such such patterns. However, because there are cases where this can be useful, such
patterns are now accepted, but if any repetition of the group does in fact patterns are now accepted, but whenever an iteration of such a group matches no
match no characters, the loop is forcibly broken. characters, matching moves on to the next item in the pattern instead of
repeatedly matching an empty string. This does not prevent backtracking into
any of the iterations if a subsequent item fails to match.
</P> </P>
<P> <P>
By default, quantifiers are "greedy", that is, they match as much as possible By default, quantifiers are "greedy", that is, they match as much as possible
@ -2371,6 +2373,10 @@ A lookaround assertion may also appear as the condition in a
which branch of the condition is followed. which branch of the condition is followed.
</P> </P>
<P> <P>
Lookaround assertions are atomic. If an assertion is true, but there is a
subsequent matching failure, there is no backtracking into the assertion.
</P>
<P>
Assertion groups are not capture groups. If an assertion contains capture Assertion groups are not capture groups. If an assertion contains capture
groups within it, these are counted for the purposes of numbering the capture groups within it, these are counted for the purposes of numbering the capture
groups in the whole pattern. Within each branch of an assertion, locally groups in the whole pattern. Within each branch of an assertion, locally
@ -3519,9 +3525,9 @@ first match attempt, the second attempt would start at the second character
instead of skipping on to "c". instead of skipping on to "c".
</P> </P>
<P> <P>
If (*SKIP) is used inside a lookbehind to specify a new starting point that is If (*SKIP) is used inside a lookbehind to specify a new starting position that
not later than the starting point of the current match, it is ignored, and the is not later than the starting point of the current match, the position
normal "bumpalong" occurs. specified by (*SKIP) is ignored, and instead the normal "bumpalong" occurs.
<pre> <pre>
(*SKIP:NAME) (*SKIP:NAME)
</pre> </pre>
@ -3748,7 +3754,7 @@ Cambridge, England.
</P> </P>
<br><a name="SEC31" href="#TOC1">REVISION</a><br> <br><a name="SEC31" href="#TOC1">REVISION</a><br>
<P> <P>
Last updated: 20 June 2019 Last updated: 21 June 2019
<br> <br>
Copyright &copy; 1997-2019 University of Cambridge. Copyright &copy; 1997-2019 University of Cambridge.
<br> <br>
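
The added paragraph on atomic lookaround assertions can be illustrated with a small example. This is a sketch using Python's standard `re` module rather than PCRE2, on the assumption (true for this simple case) that both engines treat a lookahead as atomic; the patterns and subject string are made up for the illustration.

```python
import re

# The lookahead captures the longest run of "a", and \1 then consumes it.
# When the final "a" fails, the engine does not go back into the assertion
# to try a shorter capture, so no starting position succeeds.
atomic = re.search(r"(?=(a+))\1a", "aaa")

# The same capture outside an assertion can be backtracked into:
# the group gives up one "a" so that the final "a" can match.
plain = re.search(r"(a+)a", "aaa")
```

Here `atomic` is `None`, while `plain` matches the whole of "aaa" with "aa" in group 1.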

File diff suppressed because it is too large.


@ -1,4 +1,4 @@
.TH PCRE2PARTIAL 3 "22 December 2014" "PCRE2 10.00" .TH PCRE2PARTIAL 3 "21 June 2019" "PCRE2 10.34"
.SH NAME .SH NAME
PCRE2 - Perl-compatible regular expressions PCRE2 - Perl-compatible regular expressions
.SH "PARTIAL MATCHING IN PCRE2" .SH "PARTIAL MATCHING IN PCRE2"
@ -326,7 +326,13 @@ characters:
Partial match: 123ab Partial match: 123ab
<<< <<<
.P .P
3. Because a partial match must always contain at least one character, what 3. The maximum lookbehind count is also important when the result of a partial
match attempt is "no match". In this case, the maximum lookbehind characters
from the end of the current segment must be retained at the start of the next
segment, in case the lookbehind is at the start of the pattern. Matching the
next segment must then start at the appropriate offset.
.P
4. Because a partial match must always contain at least one character, what
might be considered a partial match of an empty string actually gives a "no might be considered a partial match of an empty string actually gives a "no
match" result. For example: match" result. For example:
.sp .sp
@ -339,7 +345,7 @@ happen if characters from the previous segment are retained. For this reason, a
"no match" result should be interpreted as "partial match of an empty string" "no match" result should be interpreted as "partial match of an empty string"
when the pattern contains lookbehinds. when the pattern contains lookbehinds.
.P .P
4. Matching a subject string that is split into multiple segments may not 5. Matching a subject string that is split into multiple segments may not
always produce exactly the same result as matching over one single long string, always produce exactly the same result as matching over one single long string,
especially when PCRE2_PARTIAL_SOFT is used. The section "Partial Matching and especially when PCRE2_PARTIAL_SOFT is used. The section "Partial Matching and
Word Boundaries" above describes an issue that arises if the pattern ends with Word Boundaries" above describes an issue that arises if the pattern ends with
@ -380,7 +386,7 @@ multi-segment data. The example above then behaves differently:
data> gsb\e=ph,dfa,dfa_restart data> gsb\e=ph,dfa,dfa_restart
Partial match: gsb Partial match: gsb
.sp .sp
5. Patterns that contain alternatives at the top level which do not all start 6. Patterns that contain alternatives at the top level which do not all start
with the same pattern item may not work as expected when PCRE2_DFA_RESTART is with the same pattern item may not work as expected when PCRE2_DFA_RESTART is
used. For example, consider this pattern: used. For example, consider this pattern:
.sp .sp
@ -429,6 +435,6 @@ Cambridge, England.
.rs .rs
.sp .sp
.nf .nf
Last updated: 22 December 2014 Last updated: 21 June 2019
Copyright (c) 1997-2014 University of Cambridge. Copyright (c) 1997-2019 University of Cambridge.
.fi .fi


@ -1,4 +1,4 @@
.TH PCRE2PATTERN 3 "20 June 2019" "PCRE2 10.34" .TH PCRE2PATTERN 3 "21 June 2019" "PCRE2 10.34"
.SH NAME .SH NAME
PCRE2 - Perl-compatible regular expressions (revised API) PCRE2 - Perl-compatible regular expressions (revised API)
.SH "PCRE2 REGULAR EXPRESSION DETAILS" .SH "PCRE2 REGULAR EXPRESSION DETAILS"
@ -2022,8 +2022,10 @@ no characters with a quantifier that has no upper limit, for example:
.sp .sp
Earlier versions of Perl and PCRE1 used to give an error at compile time for Earlier versions of Perl and PCRE1 used to give an error at compile time for
such patterns. However, because there are cases where this can be useful, such such patterns. However, because there are cases where this can be useful, such
patterns are now accepted, but if any repetition of the group does in fact patterns are now accepted, but whenever an iteration of such a group matches no
match no characters, the loop is forcibly broken. characters, matching moves on to the next item in the pattern instead of
repeatedly matching an empty string. This does not prevent backtracking into
any of the iterations if a subsequent item fails to match.
.P .P
By default, quantifiers are "greedy", that is, they match as much as possible By default, quantifiers are "greedy", that is, they match as much as possible
(up to the maximum number of permitted times), without causing the rest of the (up to the maximum number of permitted times), without causing the rest of the
@ -2378,6 +2380,9 @@ conditional group
(see below). In this case, the result of matching the assertion determines (see below). In this case, the result of matching the assertion determines
which branch of the condition is followed. which branch of the condition is followed.
.P .P
Lookaround assertions are atomic. If an assertion is true, but there is a
subsequent matching failure, there is no backtracking into the assertion.
.P
Assertion groups are not capture groups. If an assertion contains capture Assertion groups are not capture groups. If an assertion contains capture
groups within it, these are counted for the purposes of numbering the capture groups within it, these are counted for the purposes of numbering the capture
groups in the whole pattern. Within each branch of an assertion, locally groups in the whole pattern. Within each branch of an assertion, locally
@ -3559,9 +3564,9 @@ effect as this example; although it would suppress backtracking during the
first match attempt, the second attempt would start at the second character first match attempt, the second attempt would start at the second character
instead of skipping on to "c". instead of skipping on to "c".
.P .P
If (*SKIP) is used inside a lookbehind to specify a new starting point that is If (*SKIP) is used inside a lookbehind to specify a new starting position that
not later than the starting point of the current match, it is ignored, and the is not later than the starting point of the current match, the position
normal "bumpalong" occurs. specified by (*SKIP) is ignored, and instead the normal "bumpalong" occurs.
.sp .sp
(*SKIP:NAME) (*SKIP:NAME)
.sp .sp
@ -3782,6 +3787,6 @@ Cambridge, England.
.rs .rs
.sp .sp
.nf .nf
Last updated: 20 June 2019 Last updated: 21 June 2019
Copyright (c) 1997-2019 University of Cambridge. Copyright (c) 1997-2019 University of Cambridge.
.fi .fi
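
The reworded paragraph about unlimited repeats of a group that can match no characters describes two things: an empty iteration ends the repetition, and earlier iterations can still be backtracked into. A minimal sketch of both, using Python's standard `re` module as a stand-in (an assumption: Python is not PCRE2, but it behaves the same way for these simple patterns, which are invented for the example):

```python
import re

# The group can match the empty string and has no upper repeat limit.
# After consuming "aa", the next iteration matches nothing, so matching
# moves on to "b" instead of looping on the empty string.
ok = re.match(r"(?:a?)*b", "aab")

# A failing subject also terminates rather than looping forever.
fail = re.match(r"(?:a?)*b", "aac")

# Backtracking into the iterations still works: the loop first takes
# both "a" characters, then gives one back so that "ab" can match.
backtracked = re.match(r"(?:a?)*ab", "aab")
```

Both successful matches cover the whole of "aab", and `fail` is `None`.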