Documentation update.

Philip.Hazel 2018-06-28 16:56:56 +00:00
parent 5a45a0712a
commit b87a1b5e31
5 changed files with 970 additions and 945 deletions

@@ -1085,12 +1085,19 @@ Resetting the match start
</b><br> </b><br>
<P> <P>
The escape sequence \K causes any previously matched characters not to be The escape sequence \K causes any previously matched characters not to be
included in the final matched sequence. For example, the pattern: included in the final matched sequence that is returned. For example, the
pattern:
<pre> <pre>
foo\Kbar foo\Kbar
</pre> </pre>
matches "foobar", but reports that it has matched "bar". This feature is matches "foobar", but reports that it has matched "bar". \K does not interact
similar to a lookbehind assertion with anchoring in any way. The pattern:
<pre>
^foo\Kbar
</pre>
matches only when the subject begins with "foobar" (in single line mode),
though it again reports the matched string as "bar". This feature is similar to
a lookbehind assertion
<a href="#lookbehind">(described below).</a> <a href="#lookbehind">(described below).</a>
However, in this case, the part of the subject before the real match does not However, in this case, the part of the subject before the real match does not
have to be of fixed length, as lookbehind assertions do. The use of \K does have to be of fixed length, as lookbehind assertions do. The use of \K does
@@ -1107,7 +1114,8 @@ Perl documents that the use of \K within assertions is "not well defined". In
PCRE2, \K is acted upon when it occurs inside positive assertions, but is PCRE2, \K is acted upon when it occurs inside positive assertions, but is
ignored in negative assertions. Note that when a pattern such as (?=ab\K) ignored in negative assertions. Note that when a pattern such as (?=ab\K)
matches, the reported start of the match can be greater than the end of the matches, the reported start of the match can be greater than the end of the
match. match. Using \K in a lookbehind assertion at the start of a pattern can also
lead to odd effects.
<a name="smallassertions"></a></P> <a name="smallassertions"></a></P>
<br><b> <br><b>
Simple assertions Simple assertions
@@ -1158,18 +1166,18 @@ end.
</P> </P>
<P> <P>
The \G assertion is true only when the current matching position is at the The \G assertion is true only when the current matching position is at the
start point of the match, as specified by the <i>startoffset</i> argument of start point of the matching process, as specified by the <i>startoffset</i>
<b>pcre2_match()</b>. It differs from \A when the value of <i>startoffset</i> is argument of <b>pcre2_match()</b>. It differs from \A when the value of
non-zero. By calling <b>pcre2_match()</b> multiple times with appropriate <i>startoffset</i> is non-zero. By calling <b>pcre2_match()</b> multiple times
arguments, you can mimic Perl's /g option, and it is in this kind of with appropriate arguments, you can mimic Perl's /g option, and it is in this
implementation where \G can be useful. kind of implementation where \G can be useful.
</P> </P>
<P> <P>
Note, however, that PCRE2's interpretation of \G, as the start of the current Note, however, that PCRE2's implementation of \G, being true at the starting
match, is subtly different from Perl's, which defines it as the end of the character of the matching process, is subtly different from Perl's, which
previous match. In Perl, these can be different when the previously matched defines it as true at the end of the previous match. In Perl, these can be
string was empty. Because PCRE2 does just one match at a time, it cannot different when the previously matched string was empty. Because PCRE2 does just
reproduce this behaviour. one match at a time, it cannot reproduce this behaviour.
</P> </P>
<P> <P>
If all the alternatives of a pattern begin with \G, the expression is anchored If all the alternatives of a pattern begin with \G, the expression is anchored
@@ -3476,7 +3484,7 @@ Cambridge, England.
</P> </P>
<br><a name="SEC30" href="#TOC1">REVISION</a><br> <br><a name="SEC30" href="#TOC1">REVISION</a><br>
<P> <P>
Last updated: 25 April 2018 Last updated: 28 June 2018
<br> <br>
Copyright &copy; 1997-2018 University of Cambridge. Copyright &copy; 1997-2018 University of Cambridge.
<br> <br>

@@ -23,7 +23,7 @@ please consult the man page, in case the conversion went wrong.
<li><a name="TOC8" href="#SEC8">CHARACTER CLASSES</a> <li><a name="TOC8" href="#SEC8">CHARACTER CLASSES</a>
<li><a name="TOC9" href="#SEC9">QUANTIFIERS</a> <li><a name="TOC9" href="#SEC9">QUANTIFIERS</a>
<li><a name="TOC10" href="#SEC10">ANCHORS AND SIMPLE ASSERTIONS</a> <li><a name="TOC10" href="#SEC10">ANCHORS AND SIMPLE ASSERTIONS</a>
<li><a name="TOC11" href="#SEC11">MATCH POINT RESET</a> <li><a name="TOC11" href="#SEC11">REPORTED MATCH POINT SETTING</a>
<li><a name="TOC12" href="#SEC12">ALTERNATION</a> <li><a name="TOC12" href="#SEC12">ALTERNATION</a>
<li><a name="TOC13" href="#SEC13">CAPTURING</a> <li><a name="TOC13" href="#SEC13">CAPTURING</a>
<li><a name="TOC14" href="#SEC14">ATOMIC GROUPS</a> <li><a name="TOC14" href="#SEC14">ATOMIC GROUPS</a>
@@ -387,10 +387,10 @@ but some of them use Unicode properties if PCRE2_UCP is set. You can use
\G first matching position in subject \G first matching position in subject
</PRE> </PRE>
</P> </P>
<br><a name="SEC11" href="#TOC1">MATCH POINT RESET</a><br> <br><a name="SEC11" href="#TOC1">REPORTED MATCH POINT SETTING</a><br>
<P> <P>
<pre> <pre>
\K reset start of match \K set reported start of match
</pre> </pre>
\K is honoured in positive assertions, but ignored in negative ones. \K is honoured in positive assertions, but ignored in negative ones.
</P> </P>
@@ -600,9 +600,9 @@ Cambridge, England.
</P> </P>
<br><a name="SEC27" href="#TOC1">REVISION</a><br> <br><a name="SEC27" href="#TOC1">REVISION</a><br>
<P> <P>
Last updated: 17 June 2017 Last updated: 28 June 2018
<br> <br>
Copyright &copy; 1997-2017 University of Cambridge. Copyright &copy; 1997-2018 University of Cambridge.
<br> <br>
<p> <p>
Return to the <a href="index.html">PCRE2 index page</a>. Return to the <a href="index.html">PCRE2 index page</a>.

@@ -6665,16 +6665,23 @@ BACKSLASH
Resetting the match start Resetting the match start
The escape sequence \K causes any previously matched characters not to The escape sequence \K causes any previously matched characters not to
be included in the final matched sequence. For example, the pattern: be included in the final matched sequence that is returned. For exam-
ple, the pattern:
foo\Kbar foo\Kbar
matches "foobar", but reports that it has matched "bar". This feature matches "foobar", but reports that it has matched "bar". \K does not
is similar to a lookbehind assertion (described below). However, in interact with anchoring in any way. The pattern:
this case, the part of the subject before the real match does not have
to be of fixed length, as lookbehind assertions do. The use of \K does ^foo\Kbar
not interfere with the setting of captured substrings. For example,
when the pattern matches only when the subject begins with "foobar" (in single line
mode), though it again reports the matched string as "bar". This fea-
ture is similar to a lookbehind assertion (described below). However,
in this case, the part of the subject before the real match does not
have to be of fixed length, as lookbehind assertions do. The use of \K
does not interfere with the setting of captured substrings. For exam-
ple, when the pattern
(foo)\Kbar (foo)\Kbar
@@ -6684,7 +6691,8 @@ BACKSLASH
defined". In PCRE2, \K is acted upon when it occurs inside positive defined". In PCRE2, \K is acted upon when it occurs inside positive
assertions, but is ignored in negative assertions. Note that when a assertions, but is ignored in negative assertions. Note that when a
pattern such as (?=ab\K) matches, the reported start of the match can pattern such as (?=ab\K) matches, the reported start of the match can
be greater than the end of the match. be greater than the end of the match. Using \K in a lookbehind asser-
tion at the start of a pattern can also lead to odd effects.
Simple assertions Simple assertions
@@ -6729,17 +6737,18 @@ BACKSLASH
as well as at the very end, whereas \z matches only at the end. as well as at the very end, whereas \z matches only at the end.
The \G assertion is true only when the current matching position is at The \G assertion is true only when the current matching position is at
the start point of the match, as specified by the startoffset argument the start point of the matching process, as specified by the startoff-
of pcre2_match(). It differs from \A when the value of startoffset is set argument of pcre2_match(). It differs from \A when the value of
non-zero. By calling pcre2_match() multiple times with appropriate startoffset is non-zero. By calling pcre2_match() multiple times with
arguments, you can mimic Perl's /g option, and it is in this kind of appropriate arguments, you can mimic Perl's /g option, and it is in
implementation where \G can be useful. this kind of implementation where \G can be useful.
Note, however, that PCRE2's interpretation of \G, as the start of the Note, however, that PCRE2's implementation of \G, being true at the
current match, is subtly different from Perl's, which defines it as the starting character of the matching process, is subtly different from
end of the previous match. In Perl, these can be different when the Perl's, which defines it as true at the end of the previous match. In
previously matched string was empty. Because PCRE2 does just one match Perl, these can be different when the previously matched string was
at a time, it cannot reproduce this behaviour. empty. Because PCRE2 does just one match at a time, it cannot reproduce
this behaviour.
If all the alternatives of a pattern begin with \G, the expression is If all the alternatives of a pattern begin with \G, the expression is
anchored to the starting match position, and the "anchored" flag is set anchored to the starting match position, and the "anchored" flag is set
@@ -8921,7 +8930,7 @@ AUTHOR
REVISION REVISION
Last updated: 25 April 2018 Last updated: 28 June 2018
Copyright (c) 1997-2018 University of Cambridge. Copyright (c) 1997-2018 University of Cambridge.
------------------------------------------------------------------------------ ------------------------------------------------------------------------------
@@ -9982,9 +9991,9 @@ ANCHORS AND SIMPLE ASSERTIONS
\G first matching position in subject \G first matching position in subject
MATCH POINT RESET REPORTED MATCH POINT SETTING
\K reset start of match \K set reported start of match
\K is honoured in positive assertions, but ignored in negative ones. \K is honoured in positive assertions, but ignored in negative ones.
@@ -10190,8 +10199,8 @@ AUTHOR
REVISION REVISION
Last updated: 17 June 2017 Last updated: 28 June 2018
Copyright (c) 1997-2017 University of Cambridge. Copyright (c) 1997-2018 University of Cambridge.
------------------------------------------------------------------------------ ------------------------------------------------------------------------------

@@ -1,4 +1,4 @@
.TH PCRE2PATTERN 3 "25 April 2018" "PCRE2 10.32" .TH PCRE2PATTERN 3 "28 June 2018" "PCRE2 10.32"
.SH NAME .SH NAME
PCRE2 - Perl-compatible regular expressions (revised API) PCRE2 - Perl-compatible regular expressions (revised API)
.SH "PCRE2 REGULAR EXPRESSION DETAILS" .SH "PCRE2 REGULAR EXPRESSION DETAILS"
@@ -1073,12 +1073,19 @@ sequences but the characters that they represent.)
.rs .rs
.sp .sp
The escape sequence \eK causes any previously matched characters not to be The escape sequence \eK causes any previously matched characters not to be
included in the final matched sequence. For example, the pattern: included in the final matched sequence that is returned. For example, the
pattern:
.sp .sp
foo\eKbar foo\eKbar
.sp .sp
matches "foobar", but reports that it has matched "bar". This feature is matches "foobar", but reports that it has matched "bar". \eK does not interact
similar to a lookbehind assertion with anchoring in any way. The pattern:
.sp
^foo\eKbar
.sp
matches only when the subject begins with "foobar" (in single line mode),
though it again reports the matched string as "bar". This feature is similar to
a lookbehind assertion
.\" HTML <a href="#lookbehind"> .\" HTML <a href="#lookbehind">
.\" </a> .\" </a>
(described below). (described below).
@@ -1100,7 +1107,8 @@ Perl documents that the use of \eK within assertions is "not well defined". In
PCRE2, \eK is acted upon when it occurs inside positive assertions, but is PCRE2, \eK is acted upon when it occurs inside positive assertions, but is
ignored in negative assertions. Note that when a pattern such as (?=ab\eK) ignored in negative assertions. Note that when a pattern such as (?=ab\eK)
matches, the reported start of the match can be greater than the end of the matches, the reported start of the match can be greater than the end of the
match. match. Using \eK in a lookbehind assertion at the start of a pattern can also
lead to odd effects.
. .
. .
.\" HTML <a name="smallassertions"></a> .\" HTML <a name="smallassertions"></a>
@@ -1152,17 +1160,17 @@ end of the string as well as at the very end, whereas \ez matches only at the
end. end.
.P .P
The \eG assertion is true only when the current matching position is at the The \eG assertion is true only when the current matching position is at the
start point of the match, as specified by the \fIstartoffset\fP argument of start point of the matching process, as specified by the \fIstartoffset\fP
\fBpcre2_match()\fP. It differs from \eA when the value of \fIstartoffset\fP is argument of \fBpcre2_match()\fP. It differs from \eA when the value of
non-zero. By calling \fBpcre2_match()\fP multiple times with appropriate \fIstartoffset\fP is non-zero. By calling \fBpcre2_match()\fP multiple times
arguments, you can mimic Perl's /g option, and it is in this kind of with appropriate arguments, you can mimic Perl's /g option, and it is in this
implementation where \eG can be useful. kind of implementation where \eG can be useful.
.P .P
Note, however, that PCRE2's interpretation of \eG, as the start of the current Note, however, that PCRE2's implementation of \eG, being true at the starting
match, is subtly different from Perl's, which defines it as the end of the character of the matching process, is subtly different from Perl's, which
previous match. In Perl, these can be different when the previously matched defines it as true at the end of the previous match. In Perl, these can be
string was empty. Because PCRE2 does just one match at a time, it cannot different when the previously matched string was empty. Because PCRE2 does just
reproduce this behaviour. one match at a time, it cannot reproduce this behaviour.
.P .P
If all the alternatives of a pattern begin with \eG, the expression is anchored If all the alternatives of a pattern begin with \eG, the expression is anchored
to the starting match position, and the "anchored" flag is set in the compiled to the starting match position, and the "anchored" flag is set in the compiled
@@ -3503,6 +3511,6 @@ Cambridge, England.
.rs .rs
.sp .sp
.nf .nf
Last updated: 25 April 2018 Last updated: 28 June 2018
Copyright (c) 1997-2018 University of Cambridge. Copyright (c) 1997-2018 University of Cambridge.
.fi .fi

@@ -1,4 +1,4 @@
.TH PCRE2SYNTAX 3 "17 June 2017" "PCRE2 10.30" .TH PCRE2SYNTAX 3 "28 June 2018" "PCRE2 10.32"
.SH NAME .SH NAME
PCRE2 - Perl-compatible regular expressions (revised API) PCRE2 - Perl-compatible regular expressions (revised API)
.SH "PCRE2 REGULAR EXPRESSION SYNTAX SUMMARY" .SH "PCRE2 REGULAR EXPRESSION SYNTAX SUMMARY"
@@ -361,10 +361,10 @@ but some of them use Unicode properties if PCRE2_UCP is set. You can use
\eG first matching position in subject \eG first matching position in subject
. .
. .
.SH "MATCH POINT RESET" .SH "REPORTED MATCH POINT SETTING"
.rs .rs
.sp .sp
\eK reset start of match \eK set reported start of match
.sp .sp
\eK is honoured in positive assertions, but ignored in negative ones. \eK is honoured in positive assertions, but ignored in negative ones.
. .
@@ -589,6 +589,6 @@ Cambridge, England.
.rs .rs
.sp .sp
.nf .nf
Last updated: 17 June 2017 Last updated: 28 June 2018
Copyright (c) 1997-2017 University of Cambridge. Copyright (c) 1997-2018 University of Cambridge.
.fi .fi