Documentation update.
parent 175b4919f7
commit a89423624d
@@ -355,7 +355,14 @@ characters:
 </PRE>
 </P>
 <P>
-3. Because a partial match must always contain at least one character, what
+3. The maximum lookbehind count is also important when the result of a partial
+match attempt is "no match". In this case, the maximum lookbehind characters
+from the end of the current segment must be retained at the start of the next
+segment, in case the lookbehind is at the start of the pattern. Matching the
+next segment must then start at the appropriate offset.
+</P>
+<P>
+4. Because a partial match must always contain at least one character, what
 might be considered a partial match of an empty string actually gives a "no
 match" result. For example:
 <pre>
@@ -369,7 +376,7 @@ happen if characters from the previous segment are retained. For this reason, a
 when the pattern contains lookbehinds.
 </P>
 <P>
-4. Matching a subject string that is split into multiple segments may not
+5. Matching a subject string that is split into multiple segments may not
 always produce exactly the same result as matching over one single long string,
 especially when PCRE2_PARTIAL_SOFT is used. The section "Partial Matching and
 Word Boundaries" above describes an issue that arises if the pattern ends with
@@ -411,7 +418,7 @@ multi-segment data. The example above then behaves differently:
 data> gsb\=ph,dfa,dfa_restart
 Partial match: gsb
 </pre>
-5. Patterns that contain alternatives at the top level which do not all start
+6. Patterns that contain alternatives at the top level which do not all start
 with the same pattern item may not work as expected when PCRE2_DFA_RESTART is
 used. For example, consider this pattern:
 <pre>
@@ -456,9 +463,9 @@ Cambridge, England.
 </P>
 <br><a name="SEC10" href="#TOC1">REVISION</a><br>
 <P>
-Last updated: 22 December 2014
+Last updated: 21 June 2019
 <br>
-Copyright © 1997-2014 University of Cambridge.
+Copyright © 1997-2019 University of Cambridge.
 <br>
 <p>
 Return to the <a href="index.html">PCRE2 index page</a>.
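Review note: the new item 3 in the first hunk of this file says that after a "no match" the caller should keep the last N characters of the segment, where N is the maximum lookbehind, and start matching the next segment past them. A minimal sketch of that bookkeeping follows. It uses Python's standard `re` module purely as a stand-in and is not the PCRE2 API; `match_segments`, the pattern, and the hard-coded `MAX_LOOKBEHIND` are hypothetical names chosen for illustration (with PCRE2 the count would be queried via `pcre2_pattern_info()` and `PCRE2_INFO_MAXLOOKBEHIND`). Partial matches that straddle a segment boundary are deliberately not handled here.

```python
import re

# Illustration only: Python's re stands in for a PCRE2 matcher here.
PATTERN = re.compile(r"(?<=abc)123")
MAX_LOOKBEHIND = 3  # assumed known; the pattern looks back over "abc"


def match_segments(segments):
    """Search each segment in turn, carrying lookbehind context forward.

    After a "no match", the last MAX_LOOKBEHIND characters of the buffer are
    retained and the next search starts just past them, so a lookbehind at
    the start of the pattern can still see them.
    """
    retained = ""
    for segment in segments:
        buffer = retained + segment
        # Start at the offset of the new data; re lookbehinds may inspect
        # characters before `pos`, which is what makes the retention useful.
        found = PATTERN.search(buffer, len(retained))
        if found:
            return found.group(0)
        retained = buffer[-MAX_LOOKBEHIND:]
    return None


print(match_segments(["xyzabc", "123"]))  # retention lets the lookbehind succeed
print(PATTERN.search("123"))              # the second segment on its own fails
```

Without the retained characters the second segment cannot match on its own, which is the situation the added paragraph warns about.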
@@ -2014,8 +2014,10 @@ no characters with a quantifier that has no upper limit, for example:
 </pre>
 Earlier versions of Perl and PCRE1 used to give an error at compile time for
 such patterns. However, because there are cases where this can be useful, such
-patterns are now accepted, but if any repetition of the group does in fact
-match no characters, the loop is forcibly broken.
+patterns are now accepted, but whenever an iteration of such a group matches no
+characters, matching moves on to the next item in the pattern instead of
+repeatedly matching an empty string. This does not prevent backtracking into
+any of the iterations if a subsequent item fails to match.
 </P>
 <P>
 By default, quantifiers are "greedy", that is, they match as much as possible
@@ -2371,6 +2373,10 @@ A lookaround assertion may also appear as the condition in a
 which branch of the condition is followed.
 </P>
 <P>
+Lookaround assertions are atomic. If an assertion is true, but there is a
+subsequent matching failure, there is no backtracking into the assertion.
+</P>
+<P>
 Assertion groups are not capture groups. If an assertion contains capture
 groups within it, these are counted for the purposes of numbering the capture
 groups in the whole pattern. Within each branch of an assertion, locally
@@ -3519,9 +3525,9 @@ first match attempt, the second attempt would start at the second character
 instead of skipping on to "c".
 </P>
 <P>
-If (*SKIP) is used inside a lookbehind to specify a new starting point that is
-not later than the starting point of the current match, it is ignored, and the
-normal "bumpalong" occurs.
+If (*SKIP) is used inside a lookbehind to specify a new starting position that
+is not later than the starting point of the current match, the position
+specified by (*SKIP) is ignored, and instead the normal "bumpalong" occurs.
 <pre>
 (*SKIP:NAME)
 </pre>
@@ -3748,7 +3754,7 @@ Cambridge, England.
 </P>
 <br><a name="SEC31" href="#TOC1">REVISION</a><br>
 <P>
-Last updated: 20 June 2019
+Last updated: 21 June 2019
 <br>
 Copyright © 1997-2019 University of Cambridge.
 <br>
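Review note: the paragraph added in this file's second hunk states that lookaround assertions are atomic. A small sketch of what that means in practice is given here, using Python's `re` module (whose lookaheads behave the same way in this respect) rather than PCRE2 itself. The patterns and subject are illustrative assumptions and are not taken from the PCRE2 documentation.

```python
import re

subject = "aaa"

# The lookahead captures the longest run of "a" into group 1. Once it has
# succeeded, the engine does not go back inside it to try a shorter capture.
atomic = re.match(r"(?=(a+))\1a", subject)

# The same capture outside an assertion can be backtracked into: (a+) gives
# up one "a" so that the final "a" can match.
ordinary = re.match(r"(a+)a", subject)

print(atomic)             # None: \1 consumed "aaa", leaving nothing for "a"
print(ordinary.group(1))  # aa
```

In the first pattern the only way to succeed would be to re-enter the assertion and capture "aa" instead of "aaa", which is exactly the backtracking the new paragraph rules out.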
@@ -5970,7 +5970,14 @@ ISSUES WITH MULTI-SEGMENT MATCHING
 Partial match: 123ab
 <<<
 
-3. Because a partial match must always contain at least one character,
+3. The maximum lookbehind count is also important when the result of a
+partial match attempt is "no match". In this case, the maximum lookbe-
+hind characters from the end of the current segment must be retained at
+the start of the next segment, in case the lookbehind is at the start
+of the pattern. Matching the next segment must then start at the appro-
+priate offset.
+
+4. Because a partial match must always contain at least one character,
 what might be considered a partial match of an empty string actually
 gives a "no match" result. For example:
 
@@ -5983,7 +5990,7 @@ ISSUES WITH MULTI-SEGMENT MATCHING
 this reason, a "no match" result should be interpreted as "partial
 match of an empty string" when the pattern contains lookbehinds.
 
-4. Matching a subject string that is split into multiple segments may
+5. Matching a subject string that is split into multiple segments may
 not always produce exactly the same result as matching over one single
 long string, especially when PCRE2_PARTIAL_SOFT is used. The section
 "Partial Matching and Word Boundaries" above describes an issue that
@@ -6027,7 +6034,7 @@ ISSUES WITH MULTI-SEGMENT MATCHING
 data> gsb\=ph,dfa,dfa_restart
 Partial match: gsb
 
-5. Patterns that contain alternatives at the top level which do not all
+6. Patterns that contain alternatives at the top level which do not all
 start with the same pattern item may not work as expected when
 PCRE2_DFA_RESTART is used. For example, consider this pattern:
 
@@ -6072,8 +6079,8 @@ AUTHOR
 
 REVISION
 
-Last updated: 22 December 2014
-Copyright (c) 1997-2014 University of Cambridge.
+Last updated: 21 June 2019
+Copyright (c) 1997-2019 University of Cambridge.
 ------------------------------------------------------------------------------
 
 
@@ -7801,8 +7808,11 @@ REPETITION
 
 Earlier versions of Perl and PCRE1 used to give an error at compile
 time for such patterns. However, because there are cases where this can
-be useful, such patterns are now accepted, but if any repetition of the
-group does in fact match no characters, the loop is forcibly broken.
+be useful, such patterns are now accepted, but whenever an iteration of
+such a group matches no characters, matching moves on to the next item
+in the pattern instead of repeatedly matching an empty string. This
+does not prevent backtracking into any of the iterations if a subse-
+quent item fails to match.
 
 By default, quantifiers are "greedy", that is, they match as much as
 possible (up to the maximum number of permitted times), without causing
@@ -8143,6 +8153,10 @@ ASSERTIONS
 tional group (see below). In this case, the result of matching the
 assertion determines which branch of the condition is followed.
 
+Lookaround assertions are atomic. If an assertion is true, but there is
+a subsequent matching failure, there is no backtracking into the asser-
+tion.
+
 Assertion groups are not capture groups. If an assertion contains cap-
 ture groups within it, these are counted for the purposes of numbering
 the capture groups in the whole pattern. Within each branch of an
@@ -9219,9 +9233,10 @@ BACKTRACKING CONTROL
 attempt would start at the second character instead of skipping on to
 "c".
 
-If (*SKIP) is used inside a lookbehind to specify a new starting point
-that is not later than the starting point of the current match, it is
-ignored, and the normal "bumpalong" occurs.
+If (*SKIP) is used inside a lookbehind to specify a new starting posi-
+tion that is not later than the starting point of the current match,
+the position specified by (*SKIP) is ignored, and instead the normal
+"bumpalong" occurs.
 
 (*SKIP:NAME)
 
@@ -9432,7 +9447,7 @@ AUTHOR
 
 REVISION
 
-Last updated: 20 June 2019
+Last updated: 21 June 2019
 Copyright (c) 1997-2019 University of Cambridge.
 ------------------------------------------------------------------------------
 
@@ -1,4 +1,4 @@
-.TH PCRE2PARTIAL 3 "22 December 2014" "PCRE2 10.00"
+.TH PCRE2PARTIAL 3 "21 June 2019" "PCRE2 10.34"
 .SH NAME
 PCRE2 - Perl-compatible regular expressions
 .SH "PARTIAL MATCHING IN PCRE2"
@@ -326,7 +326,13 @@ characters:
 Partial match: 123ab
 <<<
 .P
-3. Because a partial match must always contain at least one character, what
+3. The maximum lookbehind count is also important when the result of a partial
+match attempt is "no match". In this case, the maximum lookbehind characters
+from the end of the current segment must be retained at the start of the next
+segment, in case the lookbehind is at the start of the pattern. Matching the
+next segment must then start at the appropriate offset.
+.P
+4. Because a partial match must always contain at least one character, what
 might be considered a partial match of an empty string actually gives a "no
 match" result. For example:
 .sp
@@ -339,7 +345,7 @@ happen if characters from the previous segment are retained. For this reason, a
 "no match" result should be interpreted as "partial match of an empty string"
 when the pattern contains lookbehinds.
 .P
-4. Matching a subject string that is split into multiple segments may not
+5. Matching a subject string that is split into multiple segments may not
 always produce exactly the same result as matching over one single long string,
 especially when PCRE2_PARTIAL_SOFT is used. The section "Partial Matching and
 Word Boundaries" above describes an issue that arises if the pattern ends with
@@ -380,7 +386,7 @@ multi-segment data. The example above then behaves differently:
 data> gsb\e=ph,dfa,dfa_restart
 Partial match: gsb
 .sp
-5. Patterns that contain alternatives at the top level which do not all start
+6. Patterns that contain alternatives at the top level which do not all start
 with the same pattern item may not work as expected when PCRE2_DFA_RESTART is
 used. For example, consider this pattern:
 .sp
@@ -429,6 +435,6 @@ Cambridge, England.
 .rs
 .sp
 .nf
-Last updated: 22 December 2014
-Copyright (c) 1997-2014 University of Cambridge.
+Last updated: 21 June 2019
+Copyright (c) 1997-2019 University of Cambridge.
 .fi
@@ -1,4 +1,4 @@
-.TH PCRE2PATTERN 3 "20 June 2019" "PCRE2 10.34"
+.TH PCRE2PATTERN 3 "21 June 2019" "PCRE2 10.34"
 .SH NAME
 PCRE2 - Perl-compatible regular expressions (revised API)
 .SH "PCRE2 REGULAR EXPRESSION DETAILS"
@@ -2022,8 +2022,10 @@ no characters with a quantifier that has no upper limit, for example:
 .sp
 Earlier versions of Perl and PCRE1 used to give an error at compile time for
 such patterns. However, because there are cases where this can be useful, such
-patterns are now accepted, but if any repetition of the group does in fact
-match no characters, the loop is forcibly broken.
+patterns are now accepted, but whenever an iteration of such a group matches no
+characters, matching moves on to the next item in the pattern instead of
+repeatedly matching an empty string. This does not prevent backtracking into
+any of the iterations if a subsequent item fails to match.
 .P
 By default, quantifiers are "greedy", that is, they match as much as possible
 (up to the maximum number of permitted times), without causing the rest of the
@@ -2378,6 +2380,9 @@ conditional group
 (see below). In this case, the result of matching the assertion determines
 which branch of the condition is followed.
 .P
+Lookaround assertions are atomic. If an assertion is true, but there is a
+subsequent matching failure, there is no backtracking into the assertion.
+.P
 Assertion groups are not capture groups. If an assertion contains capture
 groups within it, these are counted for the purposes of numbering the capture
 groups in the whole pattern. Within each branch of an assertion, locally
@@ -3559,9 +3564,9 @@ effect as this example; although it would suppress backtracking during the
 first match attempt, the second attempt would start at the second character
 instead of skipping on to "c".
 .P
-If (*SKIP) is used inside a lookbehind to specify a new starting point that is
-not later than the starting point of the current match, it is ignored, and the
-normal "bumpalong" occurs.
+If (*SKIP) is used inside a lookbehind to specify a new starting position that
+is not later than the starting point of the current match, the position
+specified by (*SKIP) is ignored, and instead the normal "bumpalong" occurs.
 .sp
 (*SKIP:NAME)
 .sp
@@ -3782,6 +3787,6 @@ Cambridge, England.
 .rs
 .sp
 .nf
-Last updated: 20 June 2019
+Last updated: 21 June 2019
 Copyright (c) 1997-2019 University of Cambridge.
 .fi