Documentation update.
parent c37995d2bd
commit 7be3fef0ea
|
@@ -37,9 +37,9 @@ for example, \b* (but not \b{3}), but these do not seem to have any use.
|
||||||
</P>
|
</P>
|
||||||
<P>
|
<P>
|
||||||
3. Capturing subpatterns that occur inside negative lookaround assertions are
|
3. Capturing subpatterns that occur inside negative lookaround assertions are
|
||||||
counted, but their entries in the offsets vector are set only if the assertion
|
counted, but their entries in the offsets vector are set only when a negative
|
||||||
is a condition. Perl has changed its behaviour in this regard from time to
|
assertion is a condition that has a matching branch (that is, the condition is
|
||||||
time.
|
false).
|
||||||
</P>
|
</P>
|
||||||
<P>
|
<P>
|
||||||
4. The following Perl escape sequences are not supported: \l, \u, \L,
|
4. The following Perl escape sequences are not supported: \l, \u, \L,
|
||||||
|
@@ -214,7 +214,7 @@ Cambridge, England.
|
||||||
REVISION
|
REVISION
|
||||||
</b><br>
|
</b><br>
|
||||||
<P>
|
<P>
|
||||||
Last updated: 29 March 2017
|
Last updated: 03 April 2017
|
||||||
<br>
|
<br>
|
||||||
Copyright © 1997-2017 University of Cambridge.
|
Copyright © 1997-2017 University of Cambridge.
|
||||||
<br>
|
<br>
|
||||||
|
|
|
@@ -2216,15 +2216,27 @@ coded as \b, \B, \A, \G, \Z, \z, ^ and $ are described
|
||||||
<P>
|
<P>
|
||||||
More complicated assertions are coded as subpatterns. There are two kinds:
|
More complicated assertions are coded as subpatterns. There are two kinds:
|
||||||
those that look ahead of the current position in the subject string, and those
|
those that look ahead of the current position in the subject string, and those
|
||||||
that look behind it. An assertion subpattern is matched in the normal way,
|
that look behind it, and in each case an assertion may be positive (must
|
||||||
except that it does not cause the current matching position to be changed.
|
succeed for matching to continue) or negative (must not succeed for matching to
|
||||||
|
continue). An assertion subpattern is matched in the normal way, except that,
|
||||||
|
when matching continues afterwards, the matching position in the subject string
|
||||||
|
is as it was at the start of the assertion.
|
||||||
</P>
|
</P>
|
||||||
<P>
|
<P>
|
||||||
Assertion subpatterns are not capturing subpatterns. If such an assertion
|
Assertion subpatterns are not capturing subpatterns. If an assertion contains
|
||||||
contains capturing subpatterns within it, these are counted for the purposes of
|
capturing subpatterns within it, these are counted for the purposes of
|
||||||
numbering the capturing subpatterns in the whole pattern. However, substring
|
numbering the capturing subpatterns in the whole pattern. However, substring
|
||||||
capturing is normally carried out only for positive assertions (but see the
|
capturing is carried out only for positive assertions that succeed, that is,
|
||||||
discussion of conditional subpatterns below).
|
one of their branches matches, so matching continues after the assertion. If
|
||||||
|
all branches of a positive assertion fail to match, nothing is captured, and
|
||||||
|
control is passed to the previous backtracking point.
|
||||||
|
</P>
|
||||||
|
<P>
|
||||||
|
No capturing is done for a negative assertion unless it is being used as a
|
||||||
|
condition in a
|
||||||
|
<a href="#subpatternsassubroutines">conditional subpattern</a>
|
||||||
|
(see the discussion below). Matching continues after a non-conditional negative
|
||||||
|
assertion only if all its branches fail to match.
|
||||||
</P>
|
</P>
|
||||||
<P>
|
<P>
|
||||||
For compatibility with Perl, most assertion subpatterns may be repeated; though
|
For compatibility with Perl, most assertion subpatterns may be repeated; though
|
||||||
|
@@ -2604,10 +2616,11 @@ against the second. This pattern matches strings in one of the two forms
|
||||||
dd-aaa-dd or dd-dd-dd, where aaa are letters and dd are digits.
|
dd-aaa-dd or dd-dd-dd, where aaa are letters and dd are digits.
|
||||||
</P>
|
</P>
|
||||||
<P>
|
<P>
|
||||||
For Perl compatibility, if an assertion that is a condition contains capturing
|
When an assertion that is a condition contains capturing subpatterns, any
|
||||||
subpatterns, any capturing that occurs is retained afterwards, for both
|
capturing that occurs in a matching branch is retained afterwards, for both
|
||||||
positive and negative assertions. (Compare non-conditional assertions, when
|
positive and negative assertions, because matching always continues after the
|
||||||
captures are retained only for positive assertions.)
|
assertion, whether it succeeds or fails. (Compare non-conditional assertions,
|
||||||
|
when captures are retained only for positive assertions that succeed.)
|
||||||
<a name="comments"></a></P>
|
<a name="comments"></a></P>
|
||||||
<br><a name="SEC22" href="#TOC1">COMMENTS</a><br>
|
<br><a name="SEC22" href="#TOC1">COMMENTS</a><br>
|
||||||
<P>
|
<P>
|
||||||
|
@@ -3351,28 +3364,34 @@ in the second repeat of the group acts.
|
||||||
Backtracking verbs in assertions
|
Backtracking verbs in assertions
|
||||||
</b><br>
|
</b><br>
|
||||||
<P>
|
<P>
|
||||||
(*FAIL) in an assertion has its normal effect: it forces an immediate
|
(*FAIL) in any assertion has its normal effect: it forces an immediate
|
||||||
backtrack.
|
backtrack. The behaviour of the other backtracking verbs depends on whether or
|
||||||
|
not the assertion is standalone or acting as the condition in a conditional
|
||||||
|
subpattern.
|
||||||
</P>
|
</P>
|
||||||
<P>
|
<P>
|
||||||
(*ACCEPT) in a positive assertion causes the assertion to succeed without any
|
(*ACCEPT) in a standalone positive assertion causes the assertion to succeed
|
||||||
further processing. In a negative assertion, (*ACCEPT) causes the assertion to
|
without any further processing; captured strings are retained. In a standalone
|
||||||
fail without any further processing.
|
negative assertion, (*ACCEPT) causes the assertion to fail without any further
|
||||||
|
processing; captured substrings are discarded.
|
||||||
|
</P>
|
||||||
|
<P>
|
||||||
|
If the assertion is a condition, (*ACCEPT) causes the condition to be true for
|
||||||
|
a positive assertion and false for a negative one; captured substrings are
|
||||||
|
retained in both cases.
|
||||||
|
</P>
|
||||||
|
<P>
|
||||||
|
The effect of (*THEN) is not allowed to escape beyond an assertion. If there
|
||||||
|
are no more branches to try, (*THEN) causes a positive assertion to be false,
|
||||||
|
and a negative assertion to be true.
|
||||||
</P>
|
</P>
|
||||||
<P>
|
<P>
|
||||||
The other backtracking verbs are not treated specially if they appear in a
|
The other backtracking verbs are not treated specially if they appear in a
|
||||||
positive assertion. In particular, (*THEN) skips to the next alternative in the
|
standalone positive assertion. In a conditional positive assertion,
|
||||||
innermost enclosing group that has alternations, whether or not this is within
|
backtracking into (*COMMIT), (*SKIP), or (*PRUNE) causes the condition to be
|
||||||
the assertion.
|
false. However, for both standalone and conditional negative assertions,
|
||||||
</P>
|
backtracking into (*COMMIT), (*SKIP), or (*PRUNE) causes the assertion to be
|
||||||
<P>
|
true, without considering any further alternative branches.
|
||||||
Negative assertions are, however, different, in order to ensure that changing a
|
|
||||||
positive assertion into a negative assertion changes its result. Backtracking
|
|
||||||
into (*COMMIT), (*SKIP), or (*PRUNE) causes a negative assertion to be true,
|
|
||||||
without considering any further alternative branches in the assertion.
|
|
||||||
Backtracking into (*THEN) causes it to skip to the next enclosing alternative
|
|
||||||
within the assertion (the normal behaviour), but if the assertion does not have
|
|
||||||
such an alternative, (*THEN) behaves like (*PRUNE).
|
|
||||||
<a name="btsub"></a></P>
|
<a name="btsub"></a></P>
|
||||||
<br><b>
|
<br><b>
|
||||||
Backtracking verbs in subroutines
|
Backtracking verbs in subroutines
|
||||||
|
@@ -3415,7 +3434,7 @@ Cambridge, England.
|
||||||
</P>
|
</P>
|
||||||
<br><a name="SEC30" href="#TOC1">REVISION</a><br>
|
<br><a name="SEC30" href="#TOC1">REVISION</a><br>
|
||||||
<P>
|
<P>
|
||||||
Last updated: 18 March 2017
|
Last updated: 03 April 2017
|
||||||
<br>
|
<br>
|
||||||
Copyright © 1997-2017 University of Cambridge.
|
Copyright © 1997-2017 University of Cambridge.
|
||||||
<br>
|
<br>
|
||||||
|
|
|
@@ -1,4 +1,4 @@
|
||||||
.TH PCRE2COMPAT 3 "29 March 2017" "PCRE2 10.30"
|
.TH PCRE2COMPAT 3 "03 April 2017" "PCRE2 10.30"
|
||||||
.SH NAME
|
.SH NAME
|
||||||
PCRE2 - Perl-compatible regular expressions (revised API)
|
PCRE2 - Perl-compatible regular expressions (revised API)
|
||||||
.SH "DIFFERENCES BETWEEN PCRE2 AND PERL"
|
.SH "DIFFERENCES BETWEEN PCRE2 AND PERL"
|
||||||
|
@@ -24,9 +24,9 @@ assertion just once). Perl allows some repeat quantifiers on other assertions,
|
||||||
for example, \eb* (but not \eb{3}), but these do not seem to have any use.
|
for example, \eb* (but not \eb{3}), but these do not seem to have any use.
|
||||||
.P
|
.P
|
||||||
3. Capturing subpatterns that occur inside negative lookaround assertions are
|
3. Capturing subpatterns that occur inside negative lookaround assertions are
|
||||||
counted, but their entries in the offsets vector are set only if the assertion
|
counted, but their entries in the offsets vector are set only when a negative
|
||||||
is a condition. Perl has changed its behaviour in this regard from time to
|
assertion is a condition that has a matching branch (that is, the condition is
|
||||||
time.
|
false).
|
||||||
.P
|
.P
|
||||||
4. The following Perl escape sequences are not supported: \el, \eu, \eL,
|
4. The following Perl escape sequences are not supported: \el, \eu, \eL,
|
||||||
\eU, and \eN when followed by a character name or Unicode value. (\eN on its
|
\eU, and \eN when followed by a character name or Unicode value. (\eN on its
|
||||||
|
@@ -179,6 +179,6 @@ Cambridge, England.
|
||||||
.rs
|
.rs
|
||||||
.sp
|
.sp
|
||||||
.nf
|
.nf
|
||||||
Last updated: 29 March 2017
|
Last updated: 03 April 2017
|
||||||
Copyright (c) 1997-2017 University of Cambridge.
|
Copyright (c) 1997-2017 University of Cambridge.
|
||||||
.fi
|
.fi
|
||||||
|
|
|
@@ -1,4 +1,4 @@
|
||||||
.TH PCRE2PATTERN 3 "18 March 2017" "PCRE2 10.30"
|
.TH PCRE2PATTERN 3 "03 April 2017" "PCRE2 10.30"
|
||||||
.SH NAME
|
.SH NAME
|
||||||
PCRE2 - Perl-compatible regular expressions (revised API)
|
PCRE2 - Perl-compatible regular expressions (revised API)
|
||||||
.SH "PCRE2 REGULAR EXPRESSION DETAILS"
|
.SH "PCRE2 REGULAR EXPRESSION DETAILS"
|
||||||
|
@@ -2225,14 +2225,28 @@ above.
|
||||||
.P
|
.P
|
||||||
More complicated assertions are coded as subpatterns. There are two kinds:
|
More complicated assertions are coded as subpatterns. There are two kinds:
|
||||||
those that look ahead of the current position in the subject string, and those
|
those that look ahead of the current position in the subject string, and those
|
||||||
that look behind it. An assertion subpattern is matched in the normal way,
|
that look behind it, and in each case an assertion may be positive (must
|
||||||
except that it does not cause the current matching position to be changed.
|
succeed for matching to continue) or negative (must not succeed for matching to
|
||||||
|
continue). An assertion subpattern is matched in the normal way, except that,
|
||||||
|
when matching continues afterwards, the matching position in the subject string
|
||||||
|
is as it was at the start of the assertion.
|
||||||
.P
|
.P
|
||||||
Assertion subpatterns are not capturing subpatterns. If such an assertion
|
Assertion subpatterns are not capturing subpatterns. If an assertion contains
|
||||||
contains capturing subpatterns within it, these are counted for the purposes of
|
capturing subpatterns within it, these are counted for the purposes of
|
||||||
numbering the capturing subpatterns in the whole pattern. However, substring
|
numbering the capturing subpatterns in the whole pattern. However, substring
|
||||||
capturing is normally carried out only for positive assertions (but see the
|
capturing is carried out only for positive assertions that succeed, that is,
|
||||||
discussion of conditional subpatterns below).
|
one of their branches matches, so matching continues after the assertion. If
|
||||||
|
all branches of a positive assertion fail to match, nothing is captured, and
|
||||||
|
control is passed to the previous backtracking point.
|
||||||
|
.P
|
||||||
|
No capturing is done for a negative assertion unless it is being used as a
|
||||||
|
condition in a
|
||||||
|
.\" HTML <a href="#subpatternsassubroutines">
|
||||||
|
.\" </a>
|
||||||
|
conditional subpattern
|
||||||
|
.\"
|
||||||
|
(see the discussion below). Matching continues after a non-conditional negative
|
||||||
|
assertion only if all its branches fail to match.
|
||||||
.P
|
.P
|
||||||
For compatibility with Perl, most assertion subpatterns may be repeated; though
|
For compatibility with Perl, most assertion subpatterns may be repeated; though
|
||||||
it makes no sense to assert the same thing several times, the side effect of
|
it makes no sense to assert the same thing several times, the side effect of
|
||||||
|
@@ -2620,10 +2634,11 @@ subject is matched against the first alternative; otherwise it is matched
|
||||||
against the second. This pattern matches strings in one of the two forms
|
against the second. This pattern matches strings in one of the two forms
|
||||||
dd-aaa-dd or dd-dd-dd, where aaa are letters and dd are digits.
|
dd-aaa-dd or dd-dd-dd, where aaa are letters and dd are digits.
|
||||||
.P
|
.P
|
||||||
For Perl compatibility, if an assertion that is a condition contains capturing
|
When an assertion that is a condition contains capturing subpatterns, any
|
||||||
subpatterns, any capturing that occurs is retained afterwards, for both
|
capturing that occurs in a matching branch is retained afterwards, for both
|
||||||
positive and negative assertions. (Compare non-conditional assertions, when
|
positive and negative assertions, because matching always continues after the
|
||||||
captures are retained only for positive assertions.)
|
assertion, whether it succeeds or fails. (Compare non-conditional assertions,
|
||||||
|
when captures are retained only for positive assertions that succeed.)
|
||||||
.
|
.
|
||||||
.
|
.
|
||||||
.\" HTML <a name="comments"></a>
|
.\" HTML <a name="comments"></a>
|
||||||
|
@@ -3381,25 +3396,30 @@ in the second repeat of the group acts.
|
||||||
.SS "Backtracking verbs in assertions"
|
.SS "Backtracking verbs in assertions"
|
||||||
.rs
|
.rs
|
||||||
.sp
|
.sp
|
||||||
(*FAIL) in an assertion has its normal effect: it forces an immediate
|
(*FAIL) in any assertion has its normal effect: it forces an immediate
|
||||||
backtrack.
|
backtrack. The behaviour of the other backtracking verbs depends on whether or
|
||||||
|
not the assertion is standalone or acting as the condition in a conditional
|
||||||
|
subpattern.
|
||||||
.P
|
.P
|
||||||
(*ACCEPT) in a positive assertion causes the assertion to succeed without any
|
(*ACCEPT) in a standalone positive assertion causes the assertion to succeed
|
||||||
further processing. In a negative assertion, (*ACCEPT) causes the assertion to
|
without any further processing; captured strings are retained. In a standalone
|
||||||
fail without any further processing.
|
negative assertion, (*ACCEPT) causes the assertion to fail without any further
|
||||||
|
processing; captured substrings are discarded.
|
||||||
|
.P
|
||||||
|
If the assertion is a condition, (*ACCEPT) causes the condition to be true for
|
||||||
|
a positive assertion and false for a negative one; captured substrings are
|
||||||
|
retained in both cases.
|
||||||
|
.P
|
||||||
|
The effect of (*THEN) is not allowed to escape beyond an assertion. If there
|
||||||
|
are no more branches to try, (*THEN) causes a positive assertion to be false,
|
||||||
|
and a negative assertion to be true.
|
||||||
.P
|
.P
|
||||||
The other backtracking verbs are not treated specially if they appear in a
|
The other backtracking verbs are not treated specially if they appear in a
|
||||||
positive assertion. In particular, (*THEN) skips to the next alternative in the
|
standalone positive assertion. In a conditional positive assertion,
|
||||||
innermost enclosing group that has alternations, whether or not this is within
|
backtracking into (*COMMIT), (*SKIP), or (*PRUNE) causes the condition to be
|
||||||
the assertion.
|
false. However, for both standalone and conditional negative assertions,
|
||||||
.P
|
backtracking into (*COMMIT), (*SKIP), or (*PRUNE) causes the assertion to be
|
||||||
Negative assertions are, however, different, in order to ensure that changing a
|
true, without considering any further alternative branches.
|
||||||
positive assertion into a negative assertion changes its result. Backtracking
|
|
||||||
into (*COMMIT), (*SKIP), or (*PRUNE) causes a negative assertion to be true,
|
|
||||||
without considering any further alternative branches in the assertion.
|
|
||||||
Backtracking into (*THEN) causes it to skip to the next enclosing alternative
|
|
||||||
within the assertion (the normal behaviour), but if the assertion does not have
|
|
||||||
such an alternative, (*THEN) behaves like (*PRUNE).
|
|
||||||
.
|
.
|
||||||
.
|
.
|
||||||
.\" HTML <a name="btsub"></a>
|
.\" HTML <a name="btsub"></a>
|
||||||
|
@@ -3445,6 +3465,6 @@ Cambridge, England.
|
||||||
.rs
|
.rs
|
||||||
.sp
|
.sp
|
||||||
.nf
|
.nf
|
||||||
Last updated: 18 March 2017
|
Last updated: 03 April 2017
|
||||||
Copyright (c) 1997-2017 University of Cambridge.
|
Copyright (c) 1997-2017 University of Cambridge.
|
||||||
.fi
|
.fi
|
||||||
|
|