Doc file tidies for 10.38-RC1
parent
e2fde18833
commit
8f3e11a355
2
README
2
README
|
@ -12,7 +12,7 @@ repository:
|
||||||
https://github.com/PhilipHazel/pcre2/releases
|
https://github.com/PhilipHazel/pcre2/releases
|
||||||
|
|
||||||
There is a mailing list for discussion about the development of PCRE2 at
|
There is a mailing list for discussion about the development of PCRE2 at
|
||||||
pcre2-dev@googlegroups.com. You can subscribe by sending an email to
|
pcre2-dev@googlegroups.com. You can subscribe by sending an email to
|
||||||
pcre2-dev+subscribe@googlegroups.com.
|
pcre2-dev+subscribe@googlegroups.com.
|
||||||
|
|
||||||
You can access the archives and also subscribe or manage your subscription
|
You can access the archives and also subscribe or manage your subscription
|
||||||
|
|
|
@ -12,7 +12,7 @@ repository:
|
||||||
https://github.com/PhilipHazel/pcre2/releases
|
https://github.com/PhilipHazel/pcre2/releases
|
||||||
|
|
||||||
There is a mailing list for discussion about the development of PCRE2 at
|
There is a mailing list for discussion about the development of PCRE2 at
|
||||||
pcre2-dev@googlegroups.com. You can subscribe by sending an email to
|
pcre2-dev@googlegroups.com. You can subscribe by sending an email to
|
||||||
pcre2-dev+subscribe@googlegroups.com.
|
pcre2-dev+subscribe@googlegroups.com.
|
||||||
|
|
||||||
You can access the archives and also subscribe or manage your subscription
|
You can access the archives and also subscribe or manage your subscription
|
||||||
|
|
|
@ -28,7 +28,7 @@ nearly two decades, the limitations of the original API were making development
|
||||||
increasingly difficult. The new API is more extensible, and it was simplified
|
increasingly difficult. The new API is more extensible, and it was simplified
|
||||||
by abolishing the separate "study" optimizing function; in PCRE2, patterns are
|
by abolishing the separate "study" optimizing function; in PCRE2, patterns are
|
||||||
automatically optimized where possible. Since forking from PCRE1, the code has
|
automatically optimized where possible. Since forking from PCRE1, the code has
|
||||||
been extensively refactored and new features introduced. The old library is now
|
been extensively refactored and new features introduced. The old library is now
|
||||||
obsolete and is no longer maintained.
|
obsolete and is no longer maintained.
|
||||||
</P>
|
</P>
|
||||||
<P>
|
<P>
|
||||||
|
|
|
@ -45,10 +45,10 @@ just once (except when processing lookaround assertions). This function is
|
||||||
<i>workspace</i> Points to a vector of ints used as working space
|
<i>workspace</i> Points to a vector of ints used as working space
|
||||||
<i>wscount</i> Number of elements in the vector
|
<i>wscount</i> Number of elements in the vector
|
||||||
</pre>
|
</pre>
|
||||||
The size of output vector needed to contain all the results depends on the
|
The size of output vector needed to contain all the results depends on the
|
||||||
number of simultaneous matches, not on the number of parentheses in the
|
number of simultaneous matches, not on the number of parentheses in the
|
||||||
pattern. Using <b>pcre2_match_data_create_from_pattern()</b> to create the match
|
pattern. Using <b>pcre2_match_data_create_from_pattern()</b> to create the match
|
||||||
data block is therefore not advisable when using this function.
|
data block is therefore not advisable when using this function.
|
||||||
</P>
|
</P>
|
||||||
<P>
|
<P>
|
||||||
A match context is needed only if you want to set up a callout function or
|
A match context is needed only if you want to set up a callout function or
|
||||||
|
|
|
@ -1917,10 +1917,10 @@ The option bits that can be set in a compile context by calling the
|
||||||
<pre>
|
<pre>
|
||||||
PCRE2_EXTRA_ALLOW_LOOKAROUND_BSK
|
PCRE2_EXTRA_ALLOW_LOOKAROUND_BSK
|
||||||
</pre>
|
</pre>
|
||||||
Since release 10.38 PCRE2 has forbidden the use of \K within lookaround
|
Since release 10.38 PCRE2 has forbidden the use of \K within lookaround
|
||||||
assertions, following Perl's lead. This option is provided to re-enable the
|
assertions, following Perl's lead. This option is provided to re-enable the
|
||||||
previous behaviour (act in positive lookarounds, ignore in negative ones) in
|
previous behaviour (act in positive lookarounds, ignore in negative ones) in
|
||||||
case anybody is relying on it.
|
case anybody is relying on it.
|
||||||
<pre>
|
<pre>
|
||||||
PCRE2_EXTRA_ALLOW_SURROGATE_ESCAPES
|
PCRE2_EXTRA_ALLOW_SURROGATE_ESCAPES
|
||||||
</pre>
|
</pre>
|
||||||
|
@ -2526,7 +2526,7 @@ string that define the matched parts of the subject. This is known as the
|
||||||
Before calling <b>pcre2_match()</b>, <b>pcre2_dfa_match()</b>, or
|
Before calling <b>pcre2_match()</b>, <b>pcre2_dfa_match()</b>, or
|
||||||
<b>pcre2_jit_match()</b> you must create a match data block by calling one of
|
<b>pcre2_jit_match()</b> you must create a match data block by calling one of
|
||||||
the creation functions above. For <b>pcre2_match_data_create()</b>, the first
|
the creation functions above. For <b>pcre2_match_data_create()</b>, the first
|
||||||
argument is the number of pairs of offsets in the <i>ovector</i>.
|
argument is the number of pairs of offsets in the <i>ovector</i>.
|
||||||
</P>
|
</P>
|
||||||
<P>
|
<P>
|
||||||
When using <b>pcre2_match()</b>, one pair of offsets is required to identify the
|
When using <b>pcre2_match()</b>, one pair of offsets is required to identify the
|
||||||
|
@ -2535,14 +2535,14 @@ captured substring. For example, a value of 4 creates enough space to record
|
||||||
the matched portion of the subject plus three captured substrings.
|
the matched portion of the subject plus three captured substrings.
|
||||||
</P>
|
</P>
|
||||||
<P>
|
<P>
|
||||||
When using <b>pcre2_dfa_match()</b> there may be multiple matched substrings of
|
When using <b>pcre2_dfa_match()</b> there may be multiple matched substrings of
|
||||||
different lengths at the same point in the subject. The ovector should be made
|
different lengths at the same point in the subject. The ovector should be made
|
||||||
large enough to hold as many as are expected.
|
large enough to hold as many as are expected.
|
||||||
</P>
|
</P>
|
||||||
<P>
|
<P>
|
||||||
A minimum of at least 1 pair is imposed by <b>pcre2_match_data_create()</b>, so
|
A minimum of at least 1 pair is imposed by <b>pcre2_match_data_create()</b>, so
|
||||||
it is always possible to return the overall matched string in the case of
|
it is always possible to return the overall matched string in the case of
|
||||||
<b>pcre2_match()</b> or the longest match in the case of
|
<b>pcre2_match()</b> or the longest match in the case of
|
||||||
<b>pcre2_dfa_match()</b>.
|
<b>pcre2_dfa_match()</b>.
|
||||||
</P>
|
</P>
|
||||||
<P>
|
<P>
|
||||||
|
|
|
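The sizing rule for pcre2_match() quoted in the hunk above can be restated as a small sketch. The helper names `ovector_pairs_needed` and `ovector_elements` are hypothetical and are not part of the PCRE2 API; the arithmetic only mirrors the text (one pair for the whole match, one more per captured substring, and a minimum of one pair).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of PCRE2: number of offset pairs that
   pcre2_match() needs in order to report the overall match plus
   `captures` captured substrings. */
size_t ovector_pairs_needed(size_t captures)
{
  return captures + 1;   /* pair 0 describes the whole match */
}

/* Hypothetical helper: each pair holds a start and an end offset, so the
   ovector has two elements per pair. */
size_t ovector_elements(size_t pairs)
{
  assert(pairs >= 1);    /* the text states a minimum of one pair */
  return 2 * pairs;
}
```

With three capture groups, `ovector_pairs_needed(3)` gives 4, which is the value used in the documentation's own example. This rule does not carry over to pcre2_dfa_match(), where the text says the size depends on the number of simultaneous matches.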
@ -234,11 +234,11 @@ pcre2_match_data_create_from_pattern() above. */
|
||||||
if (rc == 0)
|
if (rc == 0)
|
||||||
printf("ovector was not big enough for all the captured substrings\n");
|
printf("ovector was not big enough for all the captured substrings\n");
|
||||||
|
|
||||||
/* Since release 10.38 PCRE2 has locked out the use of \K in lookaround
|
/* Since release 10.38 PCRE2 has locked out the use of \K in lookaround
|
||||||
assertions. However, there is an option to re-enable the old behaviour. If that
|
assertions. However, there is an option to re-enable the old behaviour. If that
|
||||||
is set, it is possible to run patterns such as /(?=.\K)/ that use \K in an
|
is set, it is possible to run patterns such as /(?=.\K)/ that use \K in an
|
||||||
assertion to set the start of a match later than its end. In this demonstration
|
assertion to set the start of a match later than its end. In this demonstration
|
||||||
program, we show how to detect this case, but it shouldn't arise because the
|
program, we show how to detect this case, but it shouldn't arise because the
|
||||||
option is never set. */
|
option is never set. */
|
||||||
|
|
||||||
if (ovector[0] > ovector[1])
|
if (ovector[0] > ovector[1])
|
||||||
|
|
|
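The check at the end of the hunk above can be isolated as a minimal sketch. The function name `match_start_after_end` is hypothetical, and `size_t` stands in for PCRE2_SIZE so that the fragment compiles without the PCRE2 header.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of PCRE2. Returns 1 when the reported
   start of the overall match lies after its end, which the comment in
   the hunk says can happen with a pattern such as /(?=.\K)/ when the
   old \K behaviour is re-enabled. ovector[0] is the start offset and
   ovector[1] is the end offset. */
int match_start_after_end(const size_t *ovector)
{
  assert(ovector != NULL);
  return ovector[0] > ovector[1];
}
```

An empty match, where both offsets are equal, is not flagged by this test; only a start that is strictly greater than the end is.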
@ -1175,10 +1175,10 @@ For example, when the pattern
|
||||||
matches "foobar", the first substring is still set to "foo".
|
matches "foobar", the first substring is still set to "foo".
|
||||||
</P>
|
</P>
|
||||||
<P>
|
<P>
|
||||||
From version 5.32.0 Perl forbids the use of \K in lookaround assertions. From
|
From version 5.32.0 Perl forbids the use of \K in lookaround assertions. From
|
||||||
release 10.38 PCRE2 also forbids this by default. However, the
|
release 10.38 PCRE2 also forbids this by default. However, the
|
||||||
PCRE2_EXTRA_ALLOW_LOOKAROUND_BSK option can be used when calling
|
PCRE2_EXTRA_ALLOW_LOOKAROUND_BSK option can be used when calling
|
||||||
<b>pcre2_compile()</b> to re-enable the previous behaviour. When this option is
|
<b>pcre2_compile()</b> to re-enable the previous behaviour. When this option is
|
||||||
set, \K is acted upon when it occurs inside positive assertions, but is
|
set, \K is acted upon when it occurs inside positive assertions, but is
|
||||||
ignored in negative assertions. Note that when a pattern such as (?=ab\K)
|
ignored in negative assertions. Note that when a pattern such as (?=ab\K)
|
||||||
matches, the reported start of the match can be greater than the end of the
|
matches, the reported start of the match can be greater than the end of the
|
||||||
|
|
|
@ -429,7 +429,7 @@ but some of them use Unicode properties if PCRE2_UCP is set. You can use
|
||||||
<pre>
|
<pre>
|
||||||
\K set reported start of match
|
\K set reported start of match
|
||||||
</pre>
|
</pre>
|
||||||
From release 10.38 \K is not permitted by default in lookaround assertions,
|
From release 10.38 \K is not permitted by default in lookaround assertions,
|
||||||
for compatibility with Perl. However, if the PCRE2_EXTRA_ALLOW_LOOKAROUND_BSK
|
for compatibility with Perl. However, if the PCRE2_EXTRA_ALLOW_LOOKAROUND_BSK
|
||||||
option is set, the previous behaviour is re-enabled. When this option is set,
|
option is set, the previous behaviour is re-enabled. When this option is set,
|
||||||
\K is honoured in positive assertions, but ignored in negative ones.
|
\K is honoured in positive assertions, but ignored in negative ones.
|
||||||
|
|
|
@ -84,7 +84,7 @@ names used in the libraries have a suffix _8, _16, or _32, as appropriate.
|
||||||
<br><a name="SEC3" href="#TOC1">INPUT ENCODING</a><br>
|
<br><a name="SEC3" href="#TOC1">INPUT ENCODING</a><br>
|
||||||
<P>
|
<P>
|
||||||
Input to <b>pcre2test</b> is processed line by line, either by calling the C
|
Input to <b>pcre2test</b> is processed line by line, either by calling the C
|
||||||
library's <b>fgets()</b> function, or via the <b>libreadline</b> or <b>libedit</b>
|
library's <b>fgets()</b> function, or via the <b>libreadline</b> or <b>libedit</b>
|
||||||
library. In some Windows environments character 26 (hex 1A) causes an immediate
|
library. In some Windows environments character 26 (hex 1A) causes an immediate
|
||||||
end of file, and no further data is read, so this character should be avoided
|
end of file, and no further data is read, so this character should be avoided
|
||||||
unless you really want that action.
|
unless you really want that action.
|
||||||
|
@ -610,7 +610,7 @@ way <b>pcre2_compile()</b> behaves. See
|
||||||
for a description of the effects of these options.
|
for a description of the effects of these options.
|
||||||
<pre>
|
<pre>
|
||||||
allow_empty_class set PCRE2_ALLOW_EMPTY_CLASS
|
allow_empty_class set PCRE2_ALLOW_EMPTY_CLASS
|
||||||
allow_lookaround_bsk set PCRE2_EXTRA_ALLOW_LOOKAROUND_BSK
|
allow_lookaround_bsk set PCRE2_EXTRA_ALLOW_LOOKAROUND_BSK
|
||||||
allow_surrogate_escapes set PCRE2_EXTRA_ALLOW_SURROGATE_ESCAPES
|
allow_surrogate_escapes set PCRE2_EXTRA_ALLOW_SURROGATE_ESCAPES
|
||||||
alt_bsux set PCRE2_ALT_BSUX
|
alt_bsux set PCRE2_ALT_BSUX
|
||||||
alt_circumflex set PCRE2_ALT_CIRCUMFLEX
|
alt_circumflex set PCRE2_ALT_CIRCUMFLEX
|
||||||
|
|
|
@ -11,7 +11,7 @@ nearly two decades, the limitations of the original API were making development
|
||||||
increasingly difficult. The new API is more extensible, and it was simplified
|
increasingly difficult. The new API is more extensible, and it was simplified
|
||||||
by abolishing the separate "study" optimizing function; in PCRE2, patterns are
|
by abolishing the separate "study" optimizing function; in PCRE2, patterns are
|
||||||
automatically optimized where possible. Since forking from PCRE1, the code has
|
automatically optimized where possible. Since forking from PCRE1, the code has
|
||||||
been extensively refactored and new features introduced. The old library is now
|
been extensively refactored and new features introduced. The old library is now
|
||||||
obsolete and is no longer maintained.
|
obsolete and is no longer maintained.
|
||||||
.P
|
.P
|
||||||
As well as Perl-style regular expression patterns, some features that appeared
|
As well as Perl-style regular expression patterns, some features that appeared
|
||||||
|
|
|
@ -185,8 +185,8 @@ REVISION
|
||||||
Last updated: 27 August 2021
|
Last updated: 27 August 2021
|
||||||
Copyright (c) 1997-2021 University of Cambridge.
|
Copyright (c) 1997-2021 University of Cambridge.
|
||||||
------------------------------------------------------------------------------
|
------------------------------------------------------------------------------
|
||||||
|
|
||||||
|
|
||||||
PCRE2API(3) Library Functions Manual PCRE2API(3)
|
PCRE2API(3) Library Functions Manual PCRE2API(3)
|
||||||
|
|
||||||
|
|
||||||
|
@ -3851,8 +3851,8 @@ REVISION
|
||||||
Last updated: 30 August 2021
|
Last updated: 30 August 2021
|
||||||
Copyright (c) 1997-2021 University of Cambridge.
|
Copyright (c) 1997-2021 University of Cambridge.
|
||||||
------------------------------------------------------------------------------
|
------------------------------------------------------------------------------
|
||||||
|
|
||||||
|
|
||||||
PCRE2BUILD(3) Library Functions Manual PCRE2BUILD(3)
|
PCRE2BUILD(3) Library Functions Manual PCRE2BUILD(3)
|
||||||
|
|
||||||
|
|
||||||
|
@ -4445,8 +4445,8 @@ REVISION
|
||||||
Last updated: 20 March 2020
|
Last updated: 20 March 2020
|
||||||
Copyright (c) 1997-2020 University of Cambridge.
|
Copyright (c) 1997-2020 University of Cambridge.
|
||||||
------------------------------------------------------------------------------
|
------------------------------------------------------------------------------
|
||||||
|
|
||||||
|
|
||||||
PCRE2CALLOUT(3) Library Functions Manual PCRE2CALLOUT(3)
|
PCRE2CALLOUT(3) Library Functions Manual PCRE2CALLOUT(3)
|
||||||
|
|
||||||
|
|
||||||
|
@ -4875,8 +4875,8 @@ REVISION
|
||||||
Last updated: 03 February 2019
|
Last updated: 03 February 2019
|
||||||
Copyright (c) 1997-2019 University of Cambridge.
|
Copyright (c) 1997-2019 University of Cambridge.
|
||||||
------------------------------------------------------------------------------
|
------------------------------------------------------------------------------
|
||||||
|
|
||||||
|
|
||||||
PCRE2COMPAT(3) Library Functions Manual PCRE2COMPAT(3)
|
PCRE2COMPAT(3) Library Functions Manual PCRE2COMPAT(3)
|
||||||
|
|
||||||
|
|
||||||
|
@ -5090,8 +5090,8 @@ REVISION
|
||||||
Last updated: 30 August 2021
|
Last updated: 30 August 2021
|
||||||
Copyright (c) 1997-2021 University of Cambridge.
|
Copyright (c) 1997-2021 University of Cambridge.
|
||||||
------------------------------------------------------------------------------
|
------------------------------------------------------------------------------
|
||||||
|
|
||||||
|
|
||||||
PCRE2JIT(3) Library Functions Manual PCRE2JIT(3)
|
PCRE2JIT(3) Library Functions Manual PCRE2JIT(3)
|
||||||
|
|
||||||
|
|
||||||
|
@ -5516,8 +5516,8 @@ REVISION
|
||||||
Last updated: 23 May 2019
|
Last updated: 23 May 2019
|
||||||
Copyright (c) 1997-2019 University of Cambridge.
|
Copyright (c) 1997-2019 University of Cambridge.
|
||||||
------------------------------------------------------------------------------
|
------------------------------------------------------------------------------
|
||||||
|
|
||||||
|
|
||||||
PCRE2LIMITS(3) Library Functions Manual PCRE2LIMITS(3)
|
PCRE2LIMITS(3) Library Functions Manual PCRE2LIMITS(3)
|
||||||
|
|
||||||
|
|
||||||
|
@ -5586,8 +5586,8 @@ REVISION
|
||||||
Last updated: 02 February 2019
|
Last updated: 02 February 2019
|
||||||
Copyright (c) 1997-2019 University of Cambridge.
|
Copyright (c) 1997-2019 University of Cambridge.
|
||||||
------------------------------------------------------------------------------
|
------------------------------------------------------------------------------
|
||||||
|
|
||||||
|
|
||||||
PCRE2MATCHING(3) Library Functions Manual PCRE2MATCHING(3)
|
PCRE2MATCHING(3) Library Functions Manual PCRE2MATCHING(3)
|
||||||
|
|
||||||
|
|
||||||
|
@ -5811,8 +5811,8 @@ REVISION
|
||||||
Last updated: 28 August 2021
|
Last updated: 28 August 2021
|
||||||
Copyright (c) 1997-2021 University of Cambridge.
|
Copyright (c) 1997-2021 University of Cambridge.
|
||||||
------------------------------------------------------------------------------
|
------------------------------------------------------------------------------
|
||||||
|
|
||||||
|
|
||||||
PCRE2PARTIAL(3) Library Functions Manual PCRE2PARTIAL(3)
|
PCRE2PARTIAL(3) Library Functions Manual PCRE2PARTIAL(3)
|
||||||
|
|
||||||
|
|
||||||
|
@ -6191,8 +6191,8 @@ REVISION
|
||||||
Last updated: 04 September 2019
|
Last updated: 04 September 2019
|
||||||
Copyright (c) 1997-2019 University of Cambridge.
|
Copyright (c) 1997-2019 University of Cambridge.
|
||||||
------------------------------------------------------------------------------
|
------------------------------------------------------------------------------
|
||||||
|
|
||||||
|
|
||||||
PCRE2PATTERN(3) Library Functions Manual PCRE2PATTERN(3)
|
PCRE2PATTERN(3) Library Functions Manual PCRE2PATTERN(3)
|
||||||
|
|
||||||
|
|
||||||
|
@ -9641,8 +9641,8 @@ REVISION
|
||||||
Last updated: 30 August 2021
|
Last updated: 30 August 2021
|
||||||
Copyright (c) 1997-2021 University of Cambridge.
|
Copyright (c) 1997-2021 University of Cambridge.
|
||||||
------------------------------------------------------------------------------
|
------------------------------------------------------------------------------
|
||||||
|
|
||||||
|
|
||||||
PCRE2PERFORM(3) Library Functions Manual PCRE2PERFORM(3)
|
PCRE2PERFORM(3) Library Functions Manual PCRE2PERFORM(3)
|
||||||
|
|
||||||
|
|
||||||
|
@ -9876,8 +9876,8 @@ REVISION
|
||||||
Last updated: 03 February 2019
|
Last updated: 03 February 2019
|
||||||
Copyright (c) 1997-2019 University of Cambridge.
|
Copyright (c) 1997-2019 University of Cambridge.
|
||||||
------------------------------------------------------------------------------
|
------------------------------------------------------------------------------
|
||||||
|
|
||||||
|
|
||||||
PCRE2POSIX(3) Library Functions Manual PCRE2POSIX(3)
|
PCRE2POSIX(3) Library Functions Manual PCRE2POSIX(3)
|
||||||
|
|
||||||
|
|
||||||
|
@ -10210,8 +10210,8 @@ REVISION
|
||||||
Last updated: 26 April 2021
|
Last updated: 26 April 2021
|
||||||
Copyright (c) 1997-2021 University of Cambridge.
|
Copyright (c) 1997-2021 University of Cambridge.
|
||||||
------------------------------------------------------------------------------
|
------------------------------------------------------------------------------
|
||||||
|
|
||||||
|
|
||||||
PCRE2SAMPLE(3) Library Functions Manual PCRE2SAMPLE(3)
|
PCRE2SAMPLE(3) Library Functions Manual PCRE2SAMPLE(3)
|
||||||
|
|
||||||
|
|
||||||
|
@ -10489,8 +10489,8 @@ REVISION
|
||||||
Last updated: 27 June 2018
|
Last updated: 27 June 2018
|
||||||
Copyright (c) 1997-2018 University of Cambridge.
|
Copyright (c) 1997-2018 University of Cambridge.
|
||||||
------------------------------------------------------------------------------
|
------------------------------------------------------------------------------
|
||||||
|
|
||||||
|
|
||||||
PCRE2SYNTAX(3) Library Functions Manual PCRE2SYNTAX(3)
|
PCRE2SYNTAX(3) Library Functions Manual PCRE2SYNTAX(3)
|
||||||
|
|
||||||
|
|
||||||
|
@ -11009,8 +11009,8 @@ REVISION
|
||||||
Last updated: 30 August 2021
|
Last updated: 30 August 2021
|
||||||
Copyright (c) 1997-2021 University of Cambridge.
|
Copyright (c) 1997-2021 University of Cambridge.
|
||||||
------------------------------------------------------------------------------
|
------------------------------------------------------------------------------
|
||||||
|
|
||||||
|
|
||||||
PCRE2UNICODE(3) Library Functions Manual PCRE2UNICODE(3)
|
PCRE2UNICODE(3) Library Functions Manual PCRE2UNICODE(3)
|
||||||
|
|
||||||
|
|
||||||
|
@ -11444,5 +11444,5 @@ REVISION
|
||||||
Last updated: 23 February 2020
|
Last updated: 23 February 2020
|
||||||
Copyright (c) 1997-2020 University of Cambridge.
|
Copyright (c) 1997-2020 University of Cambridge.
|
||||||
------------------------------------------------------------------------------
|
------------------------------------------------------------------------------
|
||||||
|
|
||||||
|
|
||||||
|
|
|
@ -33,10 +33,10 @@ just once (except when processing lookaround assertions). This function is
|
||||||
\fIworkspace\fP Points to a vector of ints used as working space
|
\fIworkspace\fP Points to a vector of ints used as working space
|
||||||
\fIwscount\fP Number of elements in the vector
|
\fIwscount\fP Number of elements in the vector
|
||||||
.sp
|
.sp
|
||||||
The size of output vector needed to contain all the results depends on the
|
The size of output vector needed to contain all the results depends on the
|
||||||
number of simultaneous matches, not on the number of parentheses in the
|
number of simultaneous matches, not on the number of parentheses in the
|
||||||
pattern. Using \fBpcre2_match_data_create_from_pattern()\fP to create the match
|
pattern. Using \fBpcre2_match_data_create_from_pattern()\fP to create the match
|
||||||
data block is therefore not advisable when using this function.
|
data block is therefore not advisable when using this function.
|
||||||
.P
|
.P
|
||||||
A match context is needed only if you want to set up a callout function or
|
A match context is needed only if you want to set up a callout function or
|
||||||
specify the heap limit or the match or the recursion depth limits. The
|
specify the heap limit or the match or the recursion depth limits. The
|
||||||
|
|
|
@ -19,7 +19,7 @@ housed in a compile context. It completely replaces all the bits. The extra
|
||||||
options are:
|
options are:
|
||||||
.sp
|
.sp
|
||||||
.\" JOIN
|
.\" JOIN
|
||||||
PCRE2_EXTRA_ALLOW_LOOKAROUND_BSK Allow \eK in lookarounds
|
PCRE2_EXTRA_ALLOW_LOOKAROUND_BSK Allow \eK in lookarounds
|
||||||
PCRE2_EXTRA_ALLOW_SURROGATE_ESCAPES Allow \ex{df800} to \ex{dfff}
|
PCRE2_EXTRA_ALLOW_SURROGATE_ESCAPES Allow \ex{df800} to \ex{dfff}
|
||||||
in UTF-8 and UTF-32 modes
|
in UTF-8 and UTF-32 modes
|
||||||
.\" JOIN
|
.\" JOIN
|
||||||
|
|
|
@ -1878,10 +1878,10 @@ The option bits that can be set in a compile context by calling the
|
||||||
.sp
|
.sp
|
||||||
PCRE2_EXTRA_ALLOW_LOOKAROUND_BSK
|
PCRE2_EXTRA_ALLOW_LOOKAROUND_BSK
|
||||||
.sp
|
.sp
|
||||||
Since release 10.38 PCRE2 has forbidden the use of \eK within lookaround
|
Since release 10.38 PCRE2 has forbidden the use of \eK within lookaround
|
||||||
assertions, following Perl's lead. This option is provided to re-enable the
|
assertions, following Perl's lead. This option is provided to re-enable the
|
||||||
previous behaviour (act in positive lookarounds, ignore in negative ones) in
|
previous behaviour (act in positive lookarounds, ignore in negative ones) in
|
||||||
case anybody is relying on it.
|
case anybody is relying on it.
|
||||||
.sp
|
.sp
|
||||||
PCRE2_EXTRA_ALLOW_SURROGATE_ESCAPES
|
PCRE2_EXTRA_ALLOW_SURROGATE_ESCAPES
|
||||||
.sp
|
.sp
|
||||||
|
@ -2503,20 +2503,20 @@ string that define the matched parts of the subject. This is known as the
|
||||||
Before calling \fBpcre2_match()\fP, \fBpcre2_dfa_match()\fP, or
|
Before calling \fBpcre2_match()\fP, \fBpcre2_dfa_match()\fP, or
|
||||||
\fBpcre2_jit_match()\fP you must create a match data block by calling one of
|
\fBpcre2_jit_match()\fP you must create a match data block by calling one of
|
||||||
the creation functions above. For \fBpcre2_match_data_create()\fP, the first
|
the creation functions above. For \fBpcre2_match_data_create()\fP, the first
|
||||||
argument is the number of pairs of offsets in the \fIovector\fP.
|
argument is the number of pairs of offsets in the \fIovector\fP.
|
||||||
.P
|
.P
|
||||||
When using \fBpcre2_match()\fP, one pair of offsets is required to identify the
|
When using \fBpcre2_match()\fP, one pair of offsets is required to identify the
|
||||||
string that matched the whole pattern, with an additional pair for each
|
string that matched the whole pattern, with an additional pair for each
|
||||||
captured substring. For example, a value of 4 creates enough space to record
|
captured substring. For example, a value of 4 creates enough space to record
|
||||||
the matched portion of the subject plus three captured substrings.
|
the matched portion of the subject plus three captured substrings.
|
||||||
.P
|
.P
|
||||||
When using \fBpcre2_dfa_match()\fP there may be multiple matched substrings of
|
When using \fBpcre2_dfa_match()\fP there may be multiple matched substrings of
|
||||||
different lengths at the same point in the subject. The ovector should be made
|
different lengths at the same point in the subject. The ovector should be made
|
||||||
large enough to hold as many as are expected.
|
large enough to hold as many as are expected.
|
||||||
.P
|
.P
|
||||||
A minimum of at least 1 pair is imposed by \fBpcre2_match_data_create()\fP, so
|
A minimum of at least 1 pair is imposed by \fBpcre2_match_data_create()\fP, so
|
||||||
it is always possible to return the overall matched string in the case of
|
it is always possible to return the overall matched string in the case of
|
||||||
\fBpcre2_match()\fP or the longest match in the case of
|
\fBpcre2_match()\fP or the longest match in the case of
|
||||||
\fBpcre2_dfa_match()\fP.
|
\fBpcre2_dfa_match()\fP.
|
||||||
.P
|
.P
|
||||||
The second argument of \fBpcre2_match_data_create()\fP is a pointer to a
|
The second argument of \fBpcre2_match_data_create()\fP is a pointer to a
|
||||||
|
|
|
@ -234,11 +234,11 @@ pcre2_match_data_create_from_pattern() above. */
|
||||||
if (rc == 0)
|
if (rc == 0)
|
||||||
printf("ovector was not big enough for all the captured substrings\en");
|
printf("ovector was not big enough for all the captured substrings\en");
|
||||||
|
|
||||||
/* Since release 10.38 PCRE2 has locked out the use of \eK in lookaround
|
/* Since release 10.38 PCRE2 has locked out the use of \eK in lookaround
|
||||||
assertions. However, there is an option to re-enable the old behaviour. If that
|
assertions. However, there is an option to re-enable the old behaviour. If that
|
||||||
is set, it is possible to run patterns such as /(?=.\eK)/ that use \eK in an
|
is set, it is possible to run patterns such as /(?=.\eK)/ that use \eK in an
|
||||||
assertion to set the start of a match later than its end. In this demonstration
|
assertion to set the start of a match later than its end. In this demonstration
|
||||||
program, we show how to detect this case, but it shouldn't arise because the
|
program, we show how to detect this case, but it shouldn't arise because the
|
||||||
option is never set. */
|
option is never set. */
|
||||||
|
|
||||||
if (ovector[0] > ovector[1])
|
if (ovector[0] > ovector[1])
|
||||||
|
|
|
@ -1168,10 +1168,10 @@ For example, when the pattern
|
||||||
.sp
|
.sp
|
||||||
matches "foobar", the first substring is still set to "foo".
|
matches "foobar", the first substring is still set to "foo".
|
||||||
.P
|
.P
|
||||||
From version 5.32.0 Perl forbids the use of \eK in lookaround assertions. From
|
From version 5.32.0 Perl forbids the use of \eK in lookaround assertions. From
|
||||||
release 10.38 PCRE2 also forbids this by default. However, the
|
release 10.38 PCRE2 also forbids this by default. However, the
|
||||||
PCRE2_EXTRA_ALLOW_LOOKAROUND_BSK option can be used when calling
|
PCRE2_EXTRA_ALLOW_LOOKAROUND_BSK option can be used when calling
|
||||||
\fBpcre2_compile()\fP to re-enable the previous behaviour. When this option is
|
\fBpcre2_compile()\fP to re-enable the previous behaviour. When this option is
|
||||||
set, \eK is acted upon when it occurs inside positive assertions, but is
|
set, \eK is acted upon when it occurs inside positive assertions, but is
|
||||||
ignored in negative assertions. Note that when a pattern such as (?=ab\eK)
|
ignored in negative assertions. Note that when a pattern such as (?=ab\eK)
|
||||||
matches, the reported start of the match can be greater than the end of the
|
matches, the reported start of the match can be greater than the end of the
|
||||||
|
|
|
@ -401,7 +401,7 @@ but some of them use Unicode properties if PCRE2_UCP is set. You can use
|
||||||
.sp
|
.sp
|
||||||
\eK set reported start of match
|
\eK set reported start of match
|
||||||
.sp
|
.sp
|
||||||
From release 10.38 \eK is not permitted by default in lookaround assertions,
|
From release 10.38 \eK is not permitted by default in lookaround assertions,
|
||||||
for compatibility with Perl. However, if the PCRE2_EXTRA_ALLOW_LOOKAROUND_BSK
|
for compatibility with Perl. However, if the PCRE2_EXTRA_ALLOW_LOOKAROUND_BSK
|
||||||
option is set, the previous behaviour is re-enabled. When this option is set,
|
option is set, the previous behaviour is re-enabled. When this option is set,
|
||||||
\eK is honoured in positive assertions, but ignored in negative ones.
|
\eK is honoured in positive assertions, but ignored in negative ones.
|
||||||
|
|
|
@ -56,7 +56,7 @@ names used in the libraries have a suffix _8, _16, or _32, as appropriate.
|
||||||
.rs
|
.rs
|
||||||
.sp
|
.sp
|
||||||
Input to \fBpcre2test\fP is processed line by line, either by calling the C
|
Input to \fBpcre2test\fP is processed line by line, either by calling the C
|
||||||
library's \fBfgets()\fP function, or via the \fBlibreadline\fP or \fBlibedit\fP
|
library's \fBfgets()\fP function, or via the \fBlibreadline\fP or \fBlibedit\fP
|
||||||
library. In some Windows environments character 26 (hex 1A) causes an immediate
|
library. In some Windows environments character 26 (hex 1A) causes an immediate
|
||||||
end of file, and no further data is read, so this character should be avoided
|
end of file, and no further data is read, so this character should be avoided
|
||||||
unless you really want that action.
|
unless you really want that action.
|
||||||
|
@ -567,7 +567,7 @@ way \fBpcre2_compile()\fP behaves. See
|
||||||
for a description of the effects of these options.
|
for a description of the effects of these options.
|
||||||
.sp
|
.sp
|
||||||
allow_empty_class set PCRE2_ALLOW_EMPTY_CLASS
|
allow_empty_class set PCRE2_ALLOW_EMPTY_CLASS
|
||||||
allow_lookaround_bsk set PCRE2_EXTRA_ALLOW_LOOKAROUND_BSK
|
allow_lookaround_bsk set PCRE2_EXTRA_ALLOW_LOOKAROUND_BSK
|
||||||
allow_surrogate_escapes set PCRE2_EXTRA_ALLOW_SURROGATE_ESCAPES
|
allow_surrogate_escapes set PCRE2_EXTRA_ALLOW_SURROGATE_ESCAPES
|
||||||
alt_bsux set PCRE2_ALT_BSUX
|
alt_bsux set PCRE2_ALT_BSUX
|
||||||
alt_circumflex set PCRE2_ALT_CIRCUMFLEX
|
alt_circumflex set PCRE2_ALT_CIRCUMFLEX
|
||||||
|
|
|
@ -5,7 +5,7 @@
|
||||||
/* This is the public header file for the PCRE library, second API, to be
|
/* This is the public header file for the PCRE library, second API, to be
|
||||||
#included by applications that call PCRE2 functions.
|
#included by applications that call PCRE2 functions.
|
||||||
|
|
||||||
Copyright (c) 2016-2020 University of Cambridge
|
Copyright (c) 2016-2021 University of Cambridge
|
||||||
|
|
||||||
-----------------------------------------------------------------------------
|
-----------------------------------------------------------------------------
|
||||||
Redistribution and use in source and binary forms, with or without
|
Redistribution and use in source and binary forms, with or without
|
||||||
|
@ -42,9 +42,9 @@ POSSIBILITY OF SUCH DAMAGE.
|
||||||
/* The current PCRE version information. */
|
/* The current PCRE version information. */
|
||||||
|
|
||||||
#define PCRE2_MAJOR 10
|
#define PCRE2_MAJOR 10
|
||||||
#define PCRE2_MINOR 37
|
#define PCRE2_MINOR 38
|
||||||
#define PCRE2_PRERELEASE
|
#define PCRE2_PRERELEASE -RC1
|
||||||
#define PCRE2_DATE 2021-05-26
|
#define PCRE2_DATE 2021-08-31
|
||||||
|
|
||||||
/* When an application links to a PCRE DLL in Windows, the symbols that are
|
/* When an application links to a PCRE DLL in Windows, the symbols that are
|
||||||
imported have to be identified as such. When building PCRE2, the appropriate
|
imported have to be identified as such. When building PCRE2, the appropriate
|
||||||
|
@ -152,6 +152,7 @@ D is inspected during pcre2_dfa_match() execution
|
||||||
#define PCRE2_EXTRA_MATCH_LINE 0x00000008u /* C */
|
#define PCRE2_EXTRA_MATCH_LINE 0x00000008u /* C */
|
||||||
#define PCRE2_EXTRA_ESCAPED_CR_IS_LF 0x00000010u /* C */
|
#define PCRE2_EXTRA_ESCAPED_CR_IS_LF 0x00000010u /* C */
|
||||||
#define PCRE2_EXTRA_ALT_BSUX 0x00000020u /* C */
|
#define PCRE2_EXTRA_ALT_BSUX 0x00000020u /* C */
|
||||||
|
#define PCRE2_EXTRA_ALLOW_LOOKAROUND_BSK 0x00000040u /* C */
|
||||||
|
|
||||||
/* These are for pcre2_jit_compile(). */
|
/* These are for pcre2_jit_compile(). */
|
||||||
|
|
||||||
|
@ -311,6 +312,7 @@ pcre2_pattern_convert(). */
|
||||||
#define PCRE2_ERROR_SCRIPT_RUN_NOT_AVAILABLE 196
|
#define PCRE2_ERROR_SCRIPT_RUN_NOT_AVAILABLE 196
|
||||||
#define PCRE2_ERROR_TOO_MANY_CAPTURES 197
|
#define PCRE2_ERROR_TOO_MANY_CAPTURES 197
|
||||||
#define PCRE2_ERROR_CONDITION_ATOMIC_ASSERTION_EXPECTED 198
|
#define PCRE2_ERROR_CONDITION_ATOMIC_ASSERTION_EXPECTED 198
|
||||||
|
#define PCRE2_ERROR_BACKSLASH_K_IN_LOOKAROUND 199
|
||||||
|
|
||||||
|
|
||||||
/* "Expected" matching error codes: no match and partial match. */
|
/* "Expected" matching error codes: no match and partial match. */
|
||||||
|
|
|
@ -788,8 +788,8 @@ are allowed. */
|
||||||
/* Compile time error code numbers. They are given names so that they can more
|
/* Compile time error code numbers. They are given names so that they can more
|
||||||
easily be tracked. When a new number is added, the tables called eint1 and
|
easily be tracked. When a new number is added, the tables called eint1 and
|
||||||
eint2 in pcre2posix.c may need to be updated, and a new error text must be
|
eint2 in pcre2posix.c may need to be updated, and a new error text must be
|
||||||
added to compile_error_texts in pcre2_error.c. Also, the error codes in
|
added to compile_error_texts in pcre2_error.c. Also, the error codes in
|
||||||
pcre2.h.in must be updated - their values are exactly 100 greater than these
|
pcre2.h.in must be updated - their values are exactly 100 greater than these
|
||||||
values. */
|
values. */
|
||||||
|
|
||||||
enum { ERR0 = COMPILE_ERROR_BASE,
|
enum { ERR0 = COMPILE_ERROR_BASE,
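
The numbering relationship described in the comment above can be sketched in isolation. This is only an illustration, not PCRE2 code: the `SKETCH_` names and both helper functions are hypothetical, and the base value of 100 is an assumption taken from the comment's "exactly 100 greater" wording. The value 199 and the table key 99 are the ones that appear elsewhere in this diff for the new \K-in-lookaround error.

```c
/* Hypothetical stand-ins for illustration; the real definitions live in
   PCRE2's own sources and are not reproduced here. */
#define SKETCH_COMPILE_ERROR_BASE 100                     /* assumed base */
#define SKETCH_ERROR_BACKSLASH_K_IN_LOOKAROUND 199        /* value shown in the pcre2.h hunk */

/* Map a small error number (the 99 in "ERR99") to the public code that an
   application would see in pcre2.h. */
static int sketch_public_error_code(int small_number)
{
return SKETCH_COMPILE_ERROR_BASE + small_number;
}

/* Map a public code back to the small number that tables such as eint2
   in pcre2posix.c appear to be keyed on. */
static int sketch_table_key(int public_code)
{
return public_code - SKETCH_COMPILE_ERROR_BASE;
}
```

Under these assumptions, `sketch_public_error_code(99)` gives 199, matching the new `PCRE2_ERROR_BACKSLASH_K_IN_LOOKAROUND` define, and `sketch_table_key(199)` gives 99, matching the new `eint2` entry.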
|
||||||
|
@ -7802,15 +7802,15 @@ for (;; pptr++)
|
||||||
}
|
}
|
||||||
#endif
|
#endif
|
||||||
|
|
||||||
/* \K is forbidden in lookarounds since 10.38 because that's what Perl has
|
/* \K is forbidden in lookarounds since 10.38 because that's what Perl has
|
||||||
done. However, there's an option, in case anyone was relying on it. */
|
done. However, there's an option, in case anyone was relying on it. */
|
||||||
|
|
||||||
if (cb->assert_depth > 0 && meta_arg == ESC_K &&
|
if (cb->assert_depth > 0 && meta_arg == ESC_K &&
|
||||||
(cb->cx->extra_options & PCRE2_EXTRA_ALLOW_LOOKAROUND_BSK) == 0)
|
(cb->cx->extra_options & PCRE2_EXTRA_ALLOW_LOOKAROUND_BSK) == 0)
|
||||||
{
|
{
|
||||||
*errorcodeptr = ERR99;
|
*errorcodeptr = ERR99;
|
||||||
return 0;
|
return 0;
|
||||||
}
|
}
|
||||||
|
|
||||||
/* For the rest (including \X when Unicode is supported - if not it's
|
/* For the rest (including \X when Unicode is supported - if not it's
|
||||||
faulted at parse time), the OP value is the escape value when PCRE2_UCP is
|
faulted at parse time), the OP value is the escape value when PCRE2_UCP is
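
The \K check in the hunk above reduces to a three-part condition, which can be sketched as a standalone predicate. The function name and its parameters are hypothetical; in the real code the inputs come from `cb->assert_depth`, `meta_arg == ESC_K`, and `cb->cx->extra_options`. The bit value 0x00000040u is copied from the pcre2.h hunk in this diff.

```c
#include <stdint.h>

/* Bit value copied from the pcre2.h hunk; the SKETCH_ name is hypothetical. */
#define SKETCH_EXTRA_ALLOW_LOOKAROUND_BSK 0x00000040u

/* Returns 1 when \K should be rejected: we are inside at least one
   lookaround assertion, the escape is \K, and the caller has not set the
   option that restores the pre-10.38 behaviour. Returns 0 otherwise. */
static int sketch_bsk_is_forbidden(int assert_depth, int is_escape_k,
  uint32_t extra_options)
{
return assert_depth > 0 && is_escape_k &&
  (extra_options & SKETCH_EXTRA_ALLOW_LOOKAROUND_BSK) == 0;
}
```

An application that wants the old behaviour would set the option bit in a compile context (PCRE2 provides `pcre2_set_compile_extra_options()` for extra options); in this sketch that corresponds to passing an `extra_options` value with bit 0x40 set, which makes the predicate return 0.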
|
||||||
|
|
|
@ -3713,7 +3713,7 @@ for (;;)
|
||||||
start_match = (pp2 == NULL)? end_subject : pp2;
|
start_match = (pp2 == NULL)? end_subject : pp2;
|
||||||
else
|
else
|
||||||
start_match = (pp2 == NULL || pp1 < pp2)? pp1 : pp2;
|
start_match = (pp2 == NULL || pp1 < pp2)? pp1 : pp2;
|
||||||
|
|
||||||
#endif /* 8-bit handling */
|
#endif /* 8-bit handling */
|
||||||
}
|
}
|
||||||
|
|
||||||
|
|
|
@ -6847,7 +6847,7 @@ for(;;)
|
||||||
start_match = (pp2 == NULL)? end_subject : pp2;
|
start_match = (pp2 == NULL)? end_subject : pp2;
|
||||||
else
|
else
|
||||||
start_match = (pp2 == NULL || pp1 < pp2)? pp1 : pp2;
|
start_match = (pp2 == NULL || pp1 < pp2)? pp1 : pp2;
|
||||||
|
|
||||||
#endif /* 8-bit handling */
|
#endif /* 8-bit handling */
|
||||||
}
|
}
|
||||||
|
|
||||||
|
|
|
@ -217,11 +217,11 @@ pcre2_match_data_create_from_pattern() above. */
|
||||||
if (rc == 0)
|
if (rc == 0)
|
||||||
printf("ovector was not big enough for all the captured substrings\n");
|
printf("ovector was not big enough for all the captured substrings\n");
|
||||||
|
|
||||||
/* Since release 10.38 PCRE2 has locked out the use of \K in lookaround
|
/* Since release 10.38 PCRE2 has locked out the use of \K in lookaround
|
||||||
assertions. However, there is an option to re-enable the old behaviour. If that
|
assertions. However, there is an option to re-enable the old behaviour. If that
|
||||||
is set, it is possible to run patterns such as /(?=.\K)/ that use \K in an
|
is set, it is possible to run patterns such as /(?=.\K)/ that use \K in an
|
||||||
assertion to set the start of a match later than its end. In this demonstration
|
assertion to set the start of a match later than its end. In this demonstration
|
||||||
program, we show how to detect this case, but it shouldn't arise because the
|
program, we show how to detect this case, but it shouldn't arise because the
|
||||||
option is never set. */
|
option is never set. */
|
||||||
|
|
||||||
if (ovector[0] > ovector[1])
|
if (ovector[0] > ovector[1])
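
The detection described in the comment above amounts to comparing the first two output-vector entries. A minimal sketch, assuming the offsets are `size_t` values (the demo program uses `PCRE2_SIZE`) and using a hypothetical helper name:

```c
#include <stddef.h>

/* Returns 1 if the reported match start (ovector[0]) lies after the
   reported match end (ovector[1]), which can only happen when \K is used
   inside a lookaround such as /(?=.\K)/. Returns 0 for a normal match. */
static int sketch_start_after_end(const size_t *ovector)
{
return ovector[0] > ovector[1];
}
```

A caller would typically treat a return value of 1 as an error condition and stop, because extracting a substring from such a pair of offsets would yield a negative length.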
|
||||||
|
|
|
@ -148,7 +148,7 @@ static const int eint2[] = {
|
||||||
37, REG_EESCAPE, /* PCRE2 does not support \L, \l, \N{name}, \U, or \u */
|
37, REG_EESCAPE, /* PCRE2 does not support \L, \l, \N{name}, \U, or \u */
|
||||||
56, REG_INVARG, /* internal error: unknown newline setting */
|
56, REG_INVARG, /* internal error: unknown newline setting */
|
||||||
92, REG_INVARG, /* invalid option bits with PCRE2_LITERAL */
|
92, REG_INVARG, /* invalid option bits with PCRE2_LITERAL */
|
||||||
99, REG_EESCAPE /* \K in lookaround */
|
99, REG_EESCAPE /* \K in lookaround */
|
||||||
};
|
};
|
||||||
|
|
||||||
/* Table of texts corresponding to POSIX error codes */
|
/* Table of texts corresponding to POSIX error codes */
|
||||||
|
|