Documentation update.
parent 175b4919f7
commit a89423624d
|
@ -355,7 +355,14 @@ characters:
|
||||||
</PRE>
|
</PRE>
|
||||||
</P>
|
</P>
|
||||||
<P>
|
<P>
|
||||||
3. Because a partial match must always contain at least one character, what
|
3. The maximum lookbehind count is also important when the result of a partial
|
||||||
|
match attempt is "no match". In this case, the maximum lookbehind characters
|
||||||
|
from the end of the current segment must be retained at the start of the next
|
||||||
|
segment, in case the lookbehind is at the start of the pattern. Matching the
|
||||||
|
next segment must then start at the appropriate offset.
|
||||||
|
</P>
|
||||||
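Note on the new item 3: the retention rule can be illustrated with a small sketch. It uses Python's `re` module as a stand-in, which has no partial-matching feature, so it shows only the carry-over of lookbehind characters and the start offset, not PCRE2's API. The pattern, the segment strings, the helper name `match_next_segment` and the hard-coded lookbehind length of 3 are all assumptions made for illustration.

```python
import re

# Hypothetical pattern whose lookbehind sits at the very start.
PATTERN = re.compile(r"(?<=123)abc")
MAX_LOOKBEHIND = 3  # assumed known in advance; hard-coded for this sketch


def match_next_segment(previous, current, keep=MAX_LOOKBEHIND):
    """Search `current`, retaining the last `keep` characters of `previous`
    so that a lookbehind at the start of the pattern can still see them."""
    retained = previous[-keep:] if keep else ""
    buffer = retained + current
    # Start at the offset of the first new character; the lookbehind may
    # still inspect the retained characters that precede that offset.
    return PATTERN.search(buffer, len(retained))


segment1, segment2 = "xx123", "abcyy"

# The first segment alone gives "no match".
first_result = PATTERN.search(segment1)

# Without retention the second segment also fails, because "123" is gone.
no_match_alone = PATTERN.search(segment2)

# With retention the match is found, starting at offset 3 of the new buffer.
with_retention = match_next_segment(segment1, segment2)
```

Passing the start offset to `search` rather than slicing the buffer is what lets the lookbehind see the retained characters, which mirrors the "start at the appropriate offset" wording in the paragraph.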
|
<P>
|
||||||
|
4. Because a partial match must always contain at least one character, what
|
||||||
might be considered a partial match of an empty string actually gives a "no
|
might be considered a partial match of an empty string actually gives a "no
|
||||||
match" result. For example:
|
match" result. For example:
|
||||||
<pre>
|
<pre>
|
||||||
|
@ -369,7 +376,7 @@ happen if characters from the previous segment are retained. For this reason, a
|
||||||
when the pattern contains lookbehinds.
|
when the pattern contains lookbehinds.
|
||||||
</P>
|
</P>
|
||||||
<P>
|
<P>
|
||||||
4. Matching a subject string that is split into multiple segments may not
|
5. Matching a subject string that is split into multiple segments may not
|
||||||
always produce exactly the same result as matching over one single long string,
|
always produce exactly the same result as matching over one single long string,
|
||||||
especially when PCRE2_PARTIAL_SOFT is used. The section "Partial Matching and
|
especially when PCRE2_PARTIAL_SOFT is used. The section "Partial Matching and
|
||||||
Word Boundaries" above describes an issue that arises if the pattern ends with
|
Word Boundaries" above describes an issue that arises if the pattern ends with
|
||||||
|
@ -411,7 +418,7 @@ multi-segment data. The example above then behaves differently:
|
||||||
data> gsb\=ph,dfa,dfa_restart
|
data> gsb\=ph,dfa,dfa_restart
|
||||||
Partial match: gsb
|
Partial match: gsb
|
||||||
</pre>
|
</pre>
|
||||||
5. Patterns that contain alternatives at the top level which do not all start
|
6. Patterns that contain alternatives at the top level which do not all start
|
||||||
with the same pattern item may not work as expected when PCRE2_DFA_RESTART is
|
with the same pattern item may not work as expected when PCRE2_DFA_RESTART is
|
||||||
used. For example, consider this pattern:
|
used. For example, consider this pattern:
|
||||||
<pre>
|
<pre>
|
||||||
|
@ -456,9 +463,9 @@ Cambridge, England.
|
||||||
</P>
|
</P>
|
||||||
<br><a name="SEC10" href="#TOC1">REVISION</a><br>
|
<br><a name="SEC10" href="#TOC1">REVISION</a><br>
|
||||||
<P>
|
<P>
|
||||||
Last updated: 22 December 2014
|
Last updated: 21 June 2019
|
||||||
<br>
|
<br>
|
||||||
Copyright © 1997-2014 University of Cambridge.
|
Copyright © 1997-2019 University of Cambridge.
|
||||||
<br>
|
<br>
|
||||||
<p>
|
<p>
|
||||||
Return to the <a href="index.html">PCRE2 index page</a>.
|
Return to the <a href="index.html">PCRE2 index page</a>.
|
||||||
|
|
|
@ -2014,8 +2014,10 @@ no characters with a quantifier that has no upper limit, for example:
|
||||||
</pre>
|
</pre>
|
||||||
Earlier versions of Perl and PCRE1 used to give an error at compile time for
|
Earlier versions of Perl and PCRE1 used to give an error at compile time for
|
||||||
such patterns. However, because there are cases where this can be useful, such
|
such patterns. However, because there are cases where this can be useful, such
|
||||||
patterns are now accepted, but if any repetition of the group does in fact
|
patterns are now accepted, but whenever an iteration of such a group matches no
|
||||||
match no characters, the loop is forcibly broken.
|
characters, matching moves on to the next item in the pattern instead of
|
||||||
|
repeatedly matching an empty string. This does not prevent backtracking into
|
||||||
|
any of the iterations if a subsequent item fails to match.
|
||||||
</P>
|
</P>
|
||||||
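Note on the reworded paragraph: the behaviour it describes can be observed with a short sketch. This uses Python's `re` module as a stand-in, on the assumption that it treats empty iterations in a comparable way for these inputs. It is not a test of PCRE2 itself, and the patterns are chosen only for illustration.

```python
import re

# A group that can match the empty string, under an unbounded quantifier.
# Matching terminates instead of looping forever on empty iterations.
loop_result = re.fullmatch(r"(?:a?)*b", "aab")

# Backtracking into earlier iterations still works: the group first consumes
# all three "a"s, then has to give one back so that "ab" can match.
backtrack_result = re.fullmatch(r"(?:a?)*ab", "aaab")
```

In the second pattern the match succeeds only because an earlier iteration of the group is re-entered and made to match less, which is the case the new final sentence is careful not to rule out.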
<P>
|
<P>
|
||||||
By default, quantifiers are "greedy", that is, they match as much as possible
|
By default, quantifiers are "greedy", that is, they match as much as possible
|
||||||
|
@ -2371,6 +2373,10 @@ A lookaround assertion may also appear as the condition in a
|
||||||
which branch of the condition is followed.
|
which branch of the condition is followed.
|
||||||
</P>
|
</P>
|
||||||
<P>
|
<P>
|
||||||
|
Lookaround assertions are atomic. If an assertion is true, but there is a
|
||||||
|
subsequent matching failure, there is no backtracking into the assertion.
|
||||||
|
</P>
|
||||||
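Note on the added paragraph: atomicity can be made visible with a capture group inside a lookahead. The sketch below uses Python's `re` module as a stand-in, whose lookaround assertions are likewise not backtracked into. The pattern and subject string are illustrative assumptions, not taken from the PCRE2 documentation.

```python
import re

# The lookahead captures a greedy run of "a"s; \1 must later repeat that run.
# If the engine could backtrack into the assertion, group 1 could shrink to
# "a" at offset 1 and the overall match would be "aaaba".
m = re.search(r"(?=(a+))a*b\1", "baaabac")

matched_text = m.group(0)   # "aba"
captured_run = m.group(1)   # "a"
match_span = m.span()       # (3, 6)
```

Because the assertion is atomic, the attempts at offsets 1 and 2 fail outright (the captured run is fixed at "aaa" and then "aa"), and the match is only found at offset 3.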
|
<P>
|
||||||
Assertion groups are not capture groups. If an assertion contains capture
|
Assertion groups are not capture groups. If an assertion contains capture
|
||||||
groups within it, these are counted for the purposes of numbering the capture
|
groups within it, these are counted for the purposes of numbering the capture
|
||||||
groups in the whole pattern. Within each branch of an assertion, locally
|
groups in the whole pattern. Within each branch of an assertion, locally
|
||||||
|
@ -3519,9 +3525,9 @@ first match attempt, the second attempt would start at the second character
|
||||||
instead of skipping on to "c".
|
instead of skipping on to "c".
|
||||||
</P>
|
</P>
|
||||||
<P>
|
<P>
|
||||||
If (*SKIP) is used inside a lookbehind to specify a new starting point that is
|
If (*SKIP) is used inside a lookbehind to specify a new starting position that
|
||||||
not later than the starting point of the current match, it is ignored, and the
|
is not later than the starting point of the current match, the position
|
||||||
normal "bumpalong" occurs.
|
specified by (*SKIP) is ignored, and instead the normal "bumpalong" occurs.
|
||||||
<pre>
|
<pre>
|
||||||
(*SKIP:NAME)
|
(*SKIP:NAME)
|
||||||
</pre>
|
</pre>
|
||||||
|
@ -3748,7 +3754,7 @@ Cambridge, England.
|
||||||
</P>
|
</P>
|
||||||
<br><a name="SEC31" href="#TOC1">REVISION</a><br>
|
<br><a name="SEC31" href="#TOC1">REVISION</a><br>
|
||||||
<P>
|
<P>
|
||||||
Last updated: 20 June 2019
|
Last updated: 21 June 2019
|
||||||
<br>
|
<br>
|
||||||
Copyright © 1997-2019 University of Cambridge.
|
Copyright © 1997-2019 University of Cambridge.
|
||||||
<br>
|
<br>
|
||||||
|
|
|
@ -5970,7 +5970,14 @@ ISSUES WITH MULTI-SEGMENT MATCHING
|
||||||
Partial match: 123ab
|
Partial match: 123ab
|
||||||
<<<
|
<<<
|
||||||
|
|
||||||
3. Because a partial match must always contain at least one character,
|
3. The maximum lookbehind count is also important when the result of a
|
||||||
|
partial match attempt is "no match". In this case, the maximum lookbe-
|
||||||
|
hind characters from the end of the current segment must be retained at
|
||||||
|
the start of the next segment, in case the lookbehind is at the start
|
||||||
|
of the pattern. Matching the next segment must then start at the appro-
|
||||||
|
priate offset.
|
||||||
|
|
||||||
|
4. Because a partial match must always contain at least one character,
|
||||||
what might be considered a partial match of an empty string actually
|
what might be considered a partial match of an empty string actually
|
||||||
gives a "no match" result. For example:
|
gives a "no match" result. For example:
|
||||||
|
|
||||||
|
@ -5983,7 +5990,7 @@ ISSUES WITH MULTI-SEGMENT MATCHING
|
||||||
this reason, a "no match" result should be interpreted as "partial
|
this reason, a "no match" result should be interpreted as "partial
|
||||||
match of an empty string" when the pattern contains lookbehinds.
|
match of an empty string" when the pattern contains lookbehinds.
|
||||||
|
|
||||||
4. Matching a subject string that is split into multiple segments may
|
5. Matching a subject string that is split into multiple segments may
|
||||||
not always produce exactly the same result as matching over one single
|
not always produce exactly the same result as matching over one single
|
||||||
long string, especially when PCRE2_PARTIAL_SOFT is used. The section
|
long string, especially when PCRE2_PARTIAL_SOFT is used. The section
|
||||||
"Partial Matching and Word Boundaries" above describes an issue that
|
"Partial Matching and Word Boundaries" above describes an issue that
|
||||||
|
@ -6027,7 +6034,7 @@ ISSUES WITH MULTI-SEGMENT MATCHING
|
||||||
data> gsb\=ph,dfa,dfa_restart
|
data> gsb\=ph,dfa,dfa_restart
|
||||||
Partial match: gsb
|
Partial match: gsb
|
||||||
|
|
||||||
5. Patterns that contain alternatives at the top level which do not all
|
6. Patterns that contain alternatives at the top level which do not all
|
||||||
start with the same pattern item may not work as expected when
|
start with the same pattern item may not work as expected when
|
||||||
PCRE2_DFA_RESTART is used. For example, consider this pattern:
|
PCRE2_DFA_RESTART is used. For example, consider this pattern:
|
||||||
|
|
||||||
|
@ -6072,8 +6079,8 @@ AUTHOR
|
||||||
|
|
||||||
REVISION
|
REVISION
|
||||||
|
|
||||||
Last updated: 22 December 2014
|
Last updated: 21 June 2019
|
||||||
Copyright (c) 1997-2014 University of Cambridge.
|
Copyright (c) 1997-2019 University of Cambridge.
|
||||||
------------------------------------------------------------------------------
|
------------------------------------------------------------------------------
|
||||||
|
|
||||||
|
|
||||||
|
@ -7801,8 +7808,11 @@ REPETITION
|
||||||
|
|
||||||
Earlier versions of Perl and PCRE1 used to give an error at compile
|
Earlier versions of Perl and PCRE1 used to give an error at compile
|
||||||
time for such patterns. However, because there are cases where this can
|
time for such patterns. However, because there are cases where this can
|
||||||
be useful, such patterns are now accepted, but if any repetition of the
|
be useful, such patterns are now accepted, but whenever an iteration of
|
||||||
group does in fact match no characters, the loop is forcibly broken.
|
such a group matches no characters, matching moves on to the next item
|
||||||
|
in the pattern instead of repeatedly matching an empty string. This
|
||||||
|
does not prevent backtracking into any of the iterations if a subse-
|
||||||
|
quent item fails to match.
|
||||||
|
|
||||||
By default, quantifiers are "greedy", that is, they match as much as
|
By default, quantifiers are "greedy", that is, they match as much as
|
||||||
possible (up to the maximum number of permitted times), without causing
|
possible (up to the maximum number of permitted times), without causing
|
||||||
|
@ -8143,6 +8153,10 @@ ASSERTIONS
|
||||||
tional group (see below). In this case, the result of matching the
|
tional group (see below). In this case, the result of matching the
|
||||||
assertion determines which branch of the condition is followed.
|
assertion determines which branch of the condition is followed.
|
||||||
|
|
||||||
|
Lookaround assertions are atomic. If an assertion is true, but there is
|
||||||
|
a subsequent matching failure, there is no backtracking into the asser-
|
||||||
|
tion.
|
||||||
|
|
||||||
Assertion groups are not capture groups. If an assertion contains cap-
|
Assertion groups are not capture groups. If an assertion contains cap-
|
||||||
ture groups within it, these are counted for the purposes of numbering
|
ture groups within it, these are counted for the purposes of numbering
|
||||||
the capture groups in the whole pattern. Within each branch of an
|
the capture groups in the whole pattern. Within each branch of an
|
||||||
|
@ -9219,9 +9233,10 @@ BACKTRACKING CONTROL
|
||||||
attempt would start at the second character instead of skipping on to
|
attempt would start at the second character instead of skipping on to
|
||||||
"c".
|
"c".
|
||||||
|
|
||||||
If (*SKIP) is used inside a lookbehind to specify a new starting point
|
If (*SKIP) is used inside a lookbehind to specify a new starting posi-
|
||||||
that is not later than the starting point of the current match, it is
|
tion that is not later than the starting point of the current match,
|
||||||
ignored, and the normal "bumpalong" occurs.
|
the position specified by (*SKIP) is ignored, and instead the normal
|
||||||
|
"bumpalong" occurs.
|
||||||
|
|
||||||
(*SKIP:NAME)
|
(*SKIP:NAME)
|
||||||
|
|
||||||
|
@ -9432,7 +9447,7 @@ AUTHOR
|
||||||
|
|
||||||
REVISION
|
REVISION
|
||||||
|
|
||||||
Last updated: 20 June 2019
|
Last updated: 21 June 2019
|
||||||
Copyright (c) 1997-2019 University of Cambridge.
|
Copyright (c) 1997-2019 University of Cambridge.
|
||||||
------------------------------------------------------------------------------
|
------------------------------------------------------------------------------
|
||||||
|
|
||||||
|
|
|
@ -1,4 +1,4 @@
|
||||||
.TH PCRE2PARTIAL 3 "22 December 2014" "PCRE2 10.00"
|
.TH PCRE2PARTIAL 3 "21 June 2019" "PCRE2 10.34"
|
||||||
.SH NAME
|
.SH NAME
|
||||||
PCRE2 - Perl-compatible regular expressions
|
PCRE2 - Perl-compatible regular expressions
|
||||||
.SH "PARTIAL MATCHING IN PCRE2"
|
.SH "PARTIAL MATCHING IN PCRE2"
|
||||||
|
@ -326,7 +326,13 @@ characters:
|
||||||
Partial match: 123ab
|
Partial match: 123ab
|
||||||
<<<
|
<<<
|
||||||
.P
|
.P
|
||||||
3. Because a partial match must always contain at least one character, what
|
3. The maximum lookbehind count is also important when the result of a partial
|
||||||
|
match attempt is "no match". In this case, the maximum lookbehind characters
|
||||||
|
from the end of the current segment must be retained at the start of the next
|
||||||
|
segment, in case the lookbehind is at the start of the pattern. Matching the
|
||||||
|
next segment must then start at the appropriate offset.
|
||||||
|
.P
|
||||||
|
4. Because a partial match must always contain at least one character, what
|
||||||
might be considered a partial match of an empty string actually gives a "no
|
might be considered a partial match of an empty string actually gives a "no
|
||||||
match" result. For example:
|
match" result. For example:
|
||||||
.sp
|
.sp
|
||||||
|
@ -339,7 +345,7 @@ happen if characters from the previous segment are retained. For this reason, a
|
||||||
"no match" result should be interpreted as "partial match of an empty string"
|
"no match" result should be interpreted as "partial match of an empty string"
|
||||||
when the pattern contains lookbehinds.
|
when the pattern contains lookbehinds.
|
||||||
.P
|
.P
|
||||||
4. Matching a subject string that is split into multiple segments may not
|
5. Matching a subject string that is split into multiple segments may not
|
||||||
always produce exactly the same result as matching over one single long string,
|
always produce exactly the same result as matching over one single long string,
|
||||||
especially when PCRE2_PARTIAL_SOFT is used. The section "Partial Matching and
|
especially when PCRE2_PARTIAL_SOFT is used. The section "Partial Matching and
|
||||||
Word Boundaries" above describes an issue that arises if the pattern ends with
|
Word Boundaries" above describes an issue that arises if the pattern ends with
|
||||||
|
@ -380,7 +386,7 @@ multi-segment data. The example above then behaves differently:
|
||||||
data> gsb\e=ph,dfa,dfa_restart
|
data> gsb\e=ph,dfa,dfa_restart
|
||||||
Partial match: gsb
|
Partial match: gsb
|
||||||
.sp
|
.sp
|
||||||
5. Patterns that contain alternatives at the top level which do not all start
|
6. Patterns that contain alternatives at the top level which do not all start
|
||||||
with the same pattern item may not work as expected when PCRE2_DFA_RESTART is
|
with the same pattern item may not work as expected when PCRE2_DFA_RESTART is
|
||||||
used. For example, consider this pattern:
|
used. For example, consider this pattern:
|
||||||
.sp
|
.sp
|
||||||
|
@ -429,6 +435,6 @@ Cambridge, England.
|
||||||
.rs
|
.rs
|
||||||
.sp
|
.sp
|
||||||
.nf
|
.nf
|
||||||
Last updated: 22 December 2014
|
Last updated: 21 June 2019
|
||||||
Copyright (c) 1997-2014 University of Cambridge.
|
Copyright (c) 1997-2019 University of Cambridge.
|
||||||
.fi
|
.fi
|
||||||
|
|
|
@ -1,4 +1,4 @@
|
||||||
.TH PCRE2PATTERN 3 "20 June 2019" "PCRE2 10.34"
|
.TH PCRE2PATTERN 3 "21 June 2019" "PCRE2 10.34"
|
||||||
.SH NAME
|
.SH NAME
|
||||||
PCRE2 - Perl-compatible regular expressions (revised API)
|
PCRE2 - Perl-compatible regular expressions (revised API)
|
||||||
.SH "PCRE2 REGULAR EXPRESSION DETAILS"
|
.SH "PCRE2 REGULAR EXPRESSION DETAILS"
|
||||||
|
@ -2022,8 +2022,10 @@ no characters with a quantifier that has no upper limit, for example:
|
||||||
.sp
|
.sp
|
||||||
Earlier versions of Perl and PCRE1 used to give an error at compile time for
|
Earlier versions of Perl and PCRE1 used to give an error at compile time for
|
||||||
such patterns. However, because there are cases where this can be useful, such
|
such patterns. However, because there are cases where this can be useful, such
|
||||||
patterns are now accepted, but if any repetition of the group does in fact
|
patterns are now accepted, but whenever an iteration of such a group matches no
|
||||||
match no characters, the loop is forcibly broken.
|
characters, matching moves on to the next item in the pattern instead of
|
||||||
|
repeatedly matching an empty string. This does not prevent backtracking into
|
||||||
|
any of the iterations if a subsequent item fails to match.
|
||||||
.P
|
.P
|
||||||
By default, quantifiers are "greedy", that is, they match as much as possible
|
By default, quantifiers are "greedy", that is, they match as much as possible
|
||||||
(up to the maximum number of permitted times), without causing the rest of the
|
(up to the maximum number of permitted times), without causing the rest of the
|
||||||
|
@ -2378,6 +2380,9 @@ conditional group
|
||||||
(see below). In this case, the result of matching the assertion determines
|
(see below). In this case, the result of matching the assertion determines
|
||||||
which branch of the condition is followed.
|
which branch of the condition is followed.
|
||||||
.P
|
.P
|
||||||
|
Lookaround assertions are atomic. If an assertion is true, but there is a
|
||||||
|
subsequent matching failure, there is no backtracking into the assertion.
|
||||||
|
.P
|
||||||
Assertion groups are not capture groups. If an assertion contains capture
|
Assertion groups are not capture groups. If an assertion contains capture
|
||||||
groups within it, these are counted for the purposes of numbering the capture
|
groups within it, these are counted for the purposes of numbering the capture
|
||||||
groups in the whole pattern. Within each branch of an assertion, locally
|
groups in the whole pattern. Within each branch of an assertion, locally
|
||||||
|
@ -3559,9 +3564,9 @@ effect as this example; although it would suppress backtracking during the
|
||||||
first match attempt, the second attempt would start at the second character
|
first match attempt, the second attempt would start at the second character
|
||||||
instead of skipping on to "c".
|
instead of skipping on to "c".
|
||||||
.P
|
.P
|
||||||
If (*SKIP) is used inside a lookbehind to specify a new starting point that is
|
If (*SKIP) is used inside a lookbehind to specify a new starting position that
|
||||||
not later than the starting point of the current match, it is ignored, and the
|
is not later than the starting point of the current match, the position
|
||||||
normal "bumpalong" occurs.
|
specified by (*SKIP) is ignored, and instead the normal "bumpalong" occurs.
|
||||||
.sp
|
.sp
|
||||||
(*SKIP:NAME)
|
(*SKIP:NAME)
|
||||||
.sp
|
.sp
|
||||||
|
@ -3782,6 +3787,6 @@ Cambridge, England.
|
||||||
.rs
|
.rs
|
||||||
.sp
|
.sp
|
||||||
.nf
|
.nf
|
||||||
Last updated: 20 June 2019
|
Last updated: 21 June 2019
|
||||||
Copyright (c) 1997-2019 University of Cambridge.
|
Copyright (c) 1997-2019 University of Cambridge.
|
||||||
.fi
|
.fi
|
||||||
|
|