Doc file tidies for 10.38-RC1

Philip Hazel 2021-08-31 17:14:42 +01:00
parent e2fde18833
commit 8f3e11a355
24 changed files with 97 additions and 95 deletions

2
README
@ -12,7 +12,7 @@ repository:
https://github.com/PhilipHazel/pcre2/releases
There is a mailing list for discussion about the development of PCRE2 at
-pcre2-dev@googlegroups.com. You can subscribe by sending an email to
+pcre2-dev@googlegroups.com. You can subscribe by sending an email to
pcre2-dev+subscribe@googlegroups.com.
You can access the archives and also subscribe or manage your subscription

@ -12,7 +12,7 @@ repository:
https://github.com/PhilipHazel/pcre2/releases
There is a mailing list for discussion about the development of PCRE2 at
-pcre2-dev@googlegroups.com. You can subscribe by sending an email to
+pcre2-dev@googlegroups.com. You can subscribe by sending an email to
pcre2-dev+subscribe@googlegroups.com.
You can access the archives and also subscribe or manage your subscription

@ -28,7 +28,7 @@ nearly two decades, the limitations of the original API were making development
increasingly difficult. The new API is more extensible, and it was simplified
by abolishing the separate "study" optimizing function; in PCRE2, patterns are
automatically optimized where possible. Since forking from PCRE1, the code has
-been extensively refactored and new features introduced. The old library is now
+been extensively refactored and new features introduced. The old library is now
obsolete and is no longer maintained.
</P>
<P>

@ -45,10 +45,10 @@ just once (except when processing lookaround assertions). This function is
<i>workspace</i> Points to a vector of ints used as working space
<i>wscount</i> Number of elements in the vector
</pre>
-The size of output vector needed to contain all the results depends on the
-number of simultaneous matches, not on the number of parentheses in the
-pattern. Using <b>pcre2_match_data_create_from_pattern()</b> to create the match
-data block is therefore not advisable when using this function.
+The size of output vector needed to contain all the results depends on the
+number of simultaneous matches, not on the number of parentheses in the
+pattern. Using <b>pcre2_match_data_create_from_pattern()</b> to create the match
+data block is therefore not advisable when using this function.
</P>
<P>
A match context is needed only if you want to set up a callout function or

@ -1917,10 +1917,10 @@ The option bits that can be set in a compile context by calling the
<pre>
PCRE2_EXTRA_ALLOW_LOOKAROUND_BSK
</pre>
-Since release 10.38 PCRE2 has forbidden the use of \K within lookaround
-assertions, following Perl's lead. This option is provided to re-enable the
-previous behaviour (act in positive lookarounds, ignore in negative ones) in
-case anybody is relying on it.
+Since release 10.38 PCRE2 has forbidden the use of \K within lookaround
+assertions, following Perl's lead. This option is provided to re-enable the
+previous behaviour (act in positive lookarounds, ignore in negative ones) in
+case anybody is relying on it.
<pre>
PCRE2_EXTRA_ALLOW_SURROGATE_ESCAPES
</pre>
@ -2526,7 +2526,7 @@ string that define the matched parts of the subject. This is known as the
Before calling <b>pcre2_match()</b>, <b>pcre2_dfa_match()</b>, or
<b>pcre2_jit_match()</b> you must create a match data block by calling one of
the creation functions above. For <b>pcre2_match_data_create()</b>, the first
-argument is the number of pairs of offsets in the <i>ovector</i>.
+argument is the number of pairs of offsets in the <i>ovector</i>.
</P>
<P>
When using <b>pcre2_match()</b>, one pair of offsets is required to identify the
@ -2535,14 +2535,14 @@ captured substring. For example, a value of 4 creates enough space to record
the matched portion of the subject plus three captured substrings.
</P>
<P>
-When using <b>pcre2_dfa_match()</b> there may be multiple matched substrings of
+When using <b>pcre2_dfa_match()</b> there may be multiple matched substrings of
different lengths at the same point in the subject. The ovector should be made
large enough to hold as many as are expected.
</P>
<P>
A minimum of at least 1 pair is imposed by <b>pcre2_match_data_create()</b>, so
-it is always possible to return the overall matched string in the case of
-<b>pcre2_match()</b> or the longest match in the case of
+it is always possible to return the overall matched string in the case of
+<b>pcre2_match()</b> or the longest match in the case of
<b>pcre2_dfa_match()</b>.
</P>
<P>

@ -234,11 +234,11 @@ pcre2_match_data_create_from_pattern() above. */
if (rc == 0)
printf("ovector was not big enough for all the captured substrings\n");
-/* Since release 10.38 PCRE2 has locked out the use of \K in lookaround
-assertions. However, there is an option to re-enable the old behaviour. If that
+/* Since release 10.38 PCRE2 has locked out the use of \K in lookaround
+assertions. However, there is an option to re-enable the old behaviour. If that
is set, it is possible to run patterns such as /(?=.\K)/ that use \K in an
assertion to set the start of a match later than its end. In this demonstration
-program, we show how to detect this case, but it shouldn't arise because the
+program, we show how to detect this case, but it shouldn't arise because the
option is never set. */
if (ovector[0] &gt; ovector[1])

@ -1175,10 +1175,10 @@ For example, when the pattern
matches "foobar", the first substring is still set to "foo".
</P>
<P>
-From version 5.32.0 Perl forbids the use of \K in lookaround assertions. From
-release 10.38 PCRE2 also forbids this by default. However, the
-PCRE2_EXTRA_ALLOW_LOOKAROUND_BSK option can be used when calling
-<b>pcre2_compile()</b> to re-enable the previous behaviour. When this option is
+From version 5.32.0 Perl forbids the use of \K in lookaround assertions. From
+release 10.38 PCRE2 also forbids this by default. However, the
+PCRE2_EXTRA_ALLOW_LOOKAROUND_BSK option can be used when calling
+<b>pcre2_compile()</b> to re-enable the previous behaviour. When this option is
set, \K is acted upon when it occurs inside positive assertions, but is
ignored in negative assertions. Note that when a pattern such as (?=ab\K)
matches, the reported start of the match can be greater than the end of the

@ -429,7 +429,7 @@ but some of them use Unicode properties if PCRE2_UCP is set. You can use
<pre>
\K set reported start of match
</pre>
-From release 10.38 \K is not permitted by default in lookaround assertions,
+From release 10.38 \K is not permitted by default in lookaround assertions,
for compatibility with Perl. However, if the PCRE2_EXTRA_ALLOW_LOOKAROUND_BSK
option is set, the previous behaviour is re-enabled. When this option is set,
\K is honoured in positive assertions, but ignored in negative ones.

@ -84,7 +84,7 @@ names used in the libraries have a suffix _8, _16, or _32, as appropriate.
<br><a name="SEC3" href="#TOC1">INPUT ENCODING</a><br>
<P>
Input to <b>pcre2test</b> is processed line by line, either by calling the C
-library's <b>fgets()</b> function, or via the <b>libreadline</b> or <b>libedit</b>
+library's <b>fgets()</b> function, or via the <b>libreadline</b> or <b>libedit</b>
library. In some Windows environments character 26 (hex 1A) causes an immediate
end of file, and no further data is read, so this character should be avoided
unless you really want that action.
@ -610,7 +610,7 @@ way <b>pcre2_compile()</b> behaves. See
for a description of the effects of these options.
<pre>
allow_empty_class set PCRE2_ALLOW_EMPTY_CLASS
-allow_lookaround_bsk set PCRE2_EXTRA_ALLOW_LOOKAROUND_BSK
+allow_lookaround_bsk set PCRE2_EXTRA_ALLOW_LOOKAROUND_BSK
allow_surrogate_escapes set PCRE2_EXTRA_ALLOW_SURROGATE_ESCAPES
alt_bsux set PCRE2_ALT_BSUX
alt_circumflex set PCRE2_ALT_CIRCUMFLEX

@ -11,7 +11,7 @@ nearly two decades, the limitations of the original API were making development
increasingly difficult. The new API is more extensible, and it was simplified
by abolishing the separate "study" optimizing function; in PCRE2, patterns are
automatically optimized where possible. Since forking from PCRE1, the code has
-been extensively refactored and new features introduced. The old library is now
+been extensively refactored and new features introduced. The old library is now
obsolete and is no longer maintained.
.P
As well as Perl-style regular expression patterns, some features that appeared

@ -185,8 +185,8 @@ REVISION
Last updated: 27 August 2021
Copyright (c) 1997-2021 University of Cambridge.
------------------------------------------------------------------------------
PCRE2API(3) Library Functions Manual PCRE2API(3)
@ -3851,8 +3851,8 @@ REVISION
Last updated: 30 August 2021
Copyright (c) 1997-2021 University of Cambridge.
------------------------------------------------------------------------------
PCRE2BUILD(3) Library Functions Manual PCRE2BUILD(3)
@ -4445,8 +4445,8 @@ REVISION
Last updated: 20 March 2020
Copyright (c) 1997-2020 University of Cambridge.
------------------------------------------------------------------------------
PCRE2CALLOUT(3) Library Functions Manual PCRE2CALLOUT(3)
@ -4875,8 +4875,8 @@ REVISION
Last updated: 03 February 2019
Copyright (c) 1997-2019 University of Cambridge.
------------------------------------------------------------------------------
PCRE2COMPAT(3) Library Functions Manual PCRE2COMPAT(3)
@ -5090,8 +5090,8 @@ REVISION
Last updated: 30 August 2021
Copyright (c) 1997-2021 University of Cambridge.
------------------------------------------------------------------------------
PCRE2JIT(3) Library Functions Manual PCRE2JIT(3)
@ -5516,8 +5516,8 @@ REVISION
Last updated: 23 May 2019
Copyright (c) 1997-2019 University of Cambridge.
------------------------------------------------------------------------------
PCRE2LIMITS(3) Library Functions Manual PCRE2LIMITS(3)
@ -5586,8 +5586,8 @@ REVISION
Last updated: 02 February 2019
Copyright (c) 1997-2019 University of Cambridge.
------------------------------------------------------------------------------
PCRE2MATCHING(3) Library Functions Manual PCRE2MATCHING(3)
@ -5811,8 +5811,8 @@ REVISION
Last updated: 28 August 2021
Copyright (c) 1997-2021 University of Cambridge.
------------------------------------------------------------------------------
PCRE2PARTIAL(3) Library Functions Manual PCRE2PARTIAL(3)
@ -6191,8 +6191,8 @@ REVISION
Last updated: 04 September 2019
Copyright (c) 1997-2019 University of Cambridge.
------------------------------------------------------------------------------
PCRE2PATTERN(3) Library Functions Manual PCRE2PATTERN(3)
@ -9641,8 +9641,8 @@ REVISION
Last updated: 30 August 2021
Copyright (c) 1997-2021 University of Cambridge.
------------------------------------------------------------------------------
PCRE2PERFORM(3) Library Functions Manual PCRE2PERFORM(3)
@ -9876,8 +9876,8 @@ REVISION
Last updated: 03 February 2019
Copyright (c) 1997-2019 University of Cambridge.
------------------------------------------------------------------------------
PCRE2POSIX(3) Library Functions Manual PCRE2POSIX(3)
@ -10210,8 +10210,8 @@ REVISION
Last updated: 26 April 2021
Copyright (c) 1997-2021 University of Cambridge.
------------------------------------------------------------------------------
PCRE2SAMPLE(3) Library Functions Manual PCRE2SAMPLE(3)
@ -10489,8 +10489,8 @@ REVISION
Last updated: 27 June 2018
Copyright (c) 1997-2018 University of Cambridge.
------------------------------------------------------------------------------
PCRE2SYNTAX(3) Library Functions Manual PCRE2SYNTAX(3)
@ -11009,8 +11009,8 @@ REVISION
Last updated: 30 August 2021
Copyright (c) 1997-2021 University of Cambridge.
------------------------------------------------------------------------------
PCRE2UNICODE(3) Library Functions Manual PCRE2UNICODE(3)
@ -11444,5 +11444,5 @@ REVISION
Last updated: 23 February 2020
Copyright (c) 1997-2020 University of Cambridge.
------------------------------------------------------------------------------

@ -33,10 +33,10 @@ just once (except when processing lookaround assertions). This function is
\fIworkspace\fP Points to a vector of ints used as working space
\fIwscount\fP Number of elements in the vector
.sp
-The size of output vector needed to contain all the results depends on the
-number of simultaneous matches, not on the number of parentheses in the
-pattern. Using \fBpcre2_match_data_create_from_pattern()\fP to create the match
-data block is therefore not advisable when using this function.
+The size of output vector needed to contain all the results depends on the
+number of simultaneous matches, not on the number of parentheses in the
+pattern. Using \fBpcre2_match_data_create_from_pattern()\fP to create the match
+data block is therefore not advisable when using this function.
.P
A match context is needed only if you want to set up a callout function or
specify the heap limit or the match or the recursion depth limits. The

@ -19,7 +19,7 @@ housed in a compile context. It completely replaces all the bits. The extra
options are:
.sp
.\" JOIN
-PCRE2_EXTRA_ALLOW_LOOKAROUND_BSK Allow \eK in lookarounds
+PCRE2_EXTRA_ALLOW_LOOKAROUND_BSK Allow \eK in lookarounds
PCRE2_EXTRA_ALLOW_SURROGATE_ESCAPES Allow \ex{df800} to \ex{dfff}
in UTF-8 and UTF-32 modes
.\" JOIN

@ -1878,10 +1878,10 @@ The option bits that can be set in a compile context by calling the
.sp
PCRE2_EXTRA_ALLOW_LOOKAROUND_BSK
.sp
-Since release 10.38 PCRE2 has forbidden the use of \eK within lookaround
-assertions, following Perl's lead. This option is provided to re-enable the
-previous behaviour (act in positive lookarounds, ignore in negative ones) in
-case anybody is relying on it.
+Since release 10.38 PCRE2 has forbidden the use of \eK within lookaround
+assertions, following Perl's lead. This option is provided to re-enable the
+previous behaviour (act in positive lookarounds, ignore in negative ones) in
+case anybody is relying on it.
.sp
PCRE2_EXTRA_ALLOW_SURROGATE_ESCAPES
.sp
@ -2503,20 +2503,20 @@ string that define the matched parts of the subject. This is known as the
Before calling \fBpcre2_match()\fP, \fBpcre2_dfa_match()\fP, or
\fBpcre2_jit_match()\fP you must create a match data block by calling one of
the creation functions above. For \fBpcre2_match_data_create()\fP, the first
-argument is the number of pairs of offsets in the \fIovector\fP.
+argument is the number of pairs of offsets in the \fIovector\fP.
.P
When using \fBpcre2_match()\fP, one pair of offsets is required to identify the
string that matched the whole pattern, with an additional pair for each
captured substring. For example, a value of 4 creates enough space to record
the matched portion of the subject plus three captured substrings.
.P
-When using \fBpcre2_dfa_match()\fP there may be multiple matched substrings of
+When using \fBpcre2_dfa_match()\fP there may be multiple matched substrings of
different lengths at the same point in the subject. The ovector should be made
large enough to hold as many as are expected.
.P
A minimum of at least 1 pair is imposed by \fBpcre2_match_data_create()\fP, so
-it is always possible to return the overall matched string in the case of
-\fBpcre2_match()\fP or the longest match in the case of
+it is always possible to return the overall matched string in the case of
+\fBpcre2_match()\fP or the longest match in the case of
\fBpcre2_dfa_match()\fP.
.P
The second argument of \fBpcre2_match_data_create()\fP is a pointer to a

@ -234,11 +234,11 @@ pcre2_match_data_create_from_pattern() above. */
if (rc == 0)
printf("ovector was not big enough for all the captured substrings\en");
-/* Since release 10.38 PCRE2 has locked out the use of \eK in lookaround
-assertions. However, there is an option to re-enable the old behaviour. If that
+/* Since release 10.38 PCRE2 has locked out the use of \eK in lookaround
+assertions. However, there is an option to re-enable the old behaviour. If that
is set, it is possible to run patterns such as /(?=.\eK)/ that use \eK in an
assertion to set the start of a match later than its end. In this demonstration
-program, we show how to detect this case, but it shouldn't arise because the
+program, we show how to detect this case, but it shouldn't arise because the
option is never set. */
if (ovector[0] > ovector[1])

@ -1168,10 +1168,10 @@ For example, when the pattern
.sp
matches "foobar", the first substring is still set to "foo".
.P
-From version 5.32.0 Perl forbids the use of \eK in lookaround assertions. From
-release 10.38 PCRE2 also forbids this by default. However, the
-PCRE2_EXTRA_ALLOW_LOOKAROUND_BSK option can be used when calling
-\fBpcre2_compile()\fP to re-enable the previous behaviour. When this option is
+From version 5.32.0 Perl forbids the use of \eK in lookaround assertions. From
+release 10.38 PCRE2 also forbids this by default. However, the
+PCRE2_EXTRA_ALLOW_LOOKAROUND_BSK option can be used when calling
+\fBpcre2_compile()\fP to re-enable the previous behaviour. When this option is
set, \eK is acted upon when it occurs inside positive assertions, but is
ignored in negative assertions. Note that when a pattern such as (?=ab\eK)
matches, the reported start of the match can be greater than the end of the

@ -401,7 +401,7 @@ but some of them use Unicode properties if PCRE2_UCP is set. You can use
.sp
\eK set reported start of match
.sp
-From release 10.38 \eK is not permitted by default in lookaround assertions,
+From release 10.38 \eK is not permitted by default in lookaround assertions,
for compatibility with Perl. However, if the PCRE2_EXTRA_ALLOW_LOOKAROUND_BSK
option is set, the previous behaviour is re-enabled. When this option is set,
\eK is honoured in positive assertions, but ignored in negative ones.

@ -56,7 +56,7 @@ names used in the libraries have a suffix _8, _16, or _32, as appropriate.
.rs
.sp
Input to \fBpcre2test\fP is processed line by line, either by calling the C
-library's \fBfgets()\fP function, or via the \fBlibreadline\fP or \fBlibedit\fP
+library's \fBfgets()\fP function, or via the \fBlibreadline\fP or \fBlibedit\fP
library. In some Windows environments character 26 (hex 1A) causes an immediate
end of file, and no further data is read, so this character should be avoided
unless you really want that action.
@ -567,7 +567,7 @@ way \fBpcre2_compile()\fP behaves. See
for a description of the effects of these options.
.sp
allow_empty_class set PCRE2_ALLOW_EMPTY_CLASS
-allow_lookaround_bsk set PCRE2_EXTRA_ALLOW_LOOKAROUND_BSK
+allow_lookaround_bsk set PCRE2_EXTRA_ALLOW_LOOKAROUND_BSK
allow_surrogate_escapes set PCRE2_EXTRA_ALLOW_SURROGATE_ESCAPES
alt_bsux set PCRE2_ALT_BSUX
alt_circumflex set PCRE2_ALT_CIRCUMFLEX

@ -5,7 +5,7 @@
/* This is the public header file for the PCRE library, second API, to be
#included by applications that call PCRE2 functions.
-Copyright (c) 2016-2020 University of Cambridge
+Copyright (c) 2016-2021 University of Cambridge
-----------------------------------------------------------------------------
Redistribution and use in source and binary forms, with or without
@ -42,9 +42,9 @@ POSSIBILITY OF SUCH DAMAGE.
/* The current PCRE version information. */
#define PCRE2_MAJOR 10
-#define PCRE2_MINOR 37
-#define PCRE2_PRERELEASE
-#define PCRE2_DATE 2021-05-26
+#define PCRE2_MINOR 38
+#define PCRE2_PRERELEASE -RC1
+#define PCRE2_DATE 2021-08-31
/* When an application links to a PCRE DLL in Windows, the symbols that are
imported have to be identified as such. When building PCRE2, the appropriate
@ -152,6 +152,7 @@ D is inspected during pcre2_dfa_match() execution
#define PCRE2_EXTRA_MATCH_LINE 0x00000008u /* C */
#define PCRE2_EXTRA_ESCAPED_CR_IS_LF 0x00000010u /* C */
#define PCRE2_EXTRA_ALT_BSUX 0x00000020u /* C */
+#define PCRE2_EXTRA_ALLOW_LOOKAROUND_BSK 0x00000040u /* C */
/* These are for pcre2_jit_compile(). */
@ -311,6 +312,7 @@ pcre2_pattern_convert(). */
#define PCRE2_ERROR_SCRIPT_RUN_NOT_AVAILABLE 196
#define PCRE2_ERROR_TOO_MANY_CAPTURES 197
#define PCRE2_ERROR_CONDITION_ATOMIC_ASSERTION_EXPECTED 198
+#define PCRE2_ERROR_BACKSLASH_K_IN_LOOKAROUND 199
/* "Expected" matching error codes: no match and partial match. */
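The option bit and the error code in the two hunks above work together at compile time. A minimal sketch of that interaction follows. The two constants are copied from the header lines shown here; the helper `check_backslash_k()` and its parameters are hypothetical names for illustration, not PCRE2 functions.

```c
#include <stdint.h>

/* Values copied from the header hunks above. */
#define PCRE2_EXTRA_ALLOW_LOOKAROUND_BSK      0x00000040u
#define PCRE2_ERROR_BACKSLASH_K_IN_LOOKAROUND 199

/* Hypothetical helper, not part of PCRE2. Given the extra option bits and the
   current assertion nesting depth, return 0 if \K is acceptable at this point
   in the pattern, or the compile error code if it must be rejected. */
static int check_backslash_k(uint32_t extra_options, int assert_depth)
{
  if (assert_depth > 0 &&
      (extra_options & PCRE2_EXTRA_ALLOW_LOOKAROUND_BSK) == 0)
    return PCRE2_ERROR_BACKSLASH_K_IN_LOOKAROUND;
  return 0;
}
```

With no extra options, \K at assertion depth 1 yields 199. With the option bit set, or at depth 0 (outside any lookaround), the sketch returns 0.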

@ -788,8 +788,8 @@ are allowed. */
/* Compile time error code numbers. They are given names so that they can more
easily be tracked. When a new number is added, the tables called eint1 and
eint2 in pcre2posix.c may need to be updated, and a new error text must be
added to compile_error_texts in pcre2_error.c. Also, the error codes in
pcre2.h.in must be updated - their values are exactly 100 greater than these
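-added to compile_error_texts in pcre2_error.c. Also, the error codes in
-pcre2.h.in must be updated - their values are exactly 100 greater than these
+added to compile_error_texts in pcre2_error.c. Also, the error codes in
+pcre2.h.in must be updated - their values are exactly 100 greater than these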
values. */
enum { ERR0 = COMPILE_ERROR_BASE,
@ -7802,15 +7802,15 @@ for (;; pptr++)
}
#endif
-/* \K is forbidden in lookarounds since 10.38 because that's what Perl has
+/* \K is forbidden in lookarounds since 10.38 because that's what Perl has
done. However, there's an option, in case anyone was relying on it. */
if (cb->assert_depth > 0 && meta_arg == ESC_K &&
(cb->cx->extra_options & PCRE2_EXTRA_ALLOW_LOOKAROUND_BSK) == 0)
{
*errorcodeptr = ERR99;
-return 0;
-}
+return 0;
+}
/* For the rest (including \X when Unicode is supported - if not it's
faulted at parse time), the OP value is the escape value when PCRE2_UCP is

@ -3713,7 +3713,7 @@ for (;;)
start_match = (pp2 == NULL)? end_subject : pp2;
else
start_match = (pp2 == NULL || pp1 < pp2)? pp1 : pp2;
#endif /* 8-bit handling */
}

@ -6847,7 +6847,7 @@ for(;;)
start_match = (pp2 == NULL)? end_subject : pp2;
else
start_match = (pp2 == NULL || pp1 < pp2)? pp1 : pp2;
#endif /* 8-bit handling */
}

@ -217,11 +217,11 @@ pcre2_match_data_create_from_pattern() above. */
if (rc == 0)
printf("ovector was not big enough for all the captured substrings\n");
-/* Since release 10.38 PCRE2 has locked out the use of \K in lookaround
-assertions. However, there is an option to re-enable the old behaviour. If that
+/* Since release 10.38 PCRE2 has locked out the use of \K in lookaround
+assertions. However, there is an option to re-enable the old behaviour. If that
is set, it is possible to run patterns such as /(?=.\K)/ that use \K in an
assertion to set the start of a match later than its end. In this demonstration
-program, we show how to detect this case, but it shouldn't arise because the
+program, we show how to detect this case, but it shouldn't arise because the
option is never set. */
if (ovector[0] > ovector[1])
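The hunk stops at the test itself, so here is a small sketch of why the check matters. Offsets are unsigned, so subtracting a start that lies after the end would wrap around instead of going negative. The helper names below are hypothetical, and `size_t` is assumed as the offset type in place of PCRE2_SIZE.

```c
#include <stddef.h>

/* Hypothetical helper, not part of PCRE2: true when the first ovector pair
   reports a match start that lies after the match end, as can happen with a
   pattern such as /(?=.\K)/ when the old behaviour is re-enabled. */
static int match_start_after_end(const size_t *ovector)
{
  return ovector[0] > ovector[1];
}

/* Hypothetical helper: length of the overall match, or 0 in the inverted
   case, so that the unsigned subtraction can never wrap around. */
static size_t safe_match_length(const size_t *ovector)
{
  if (match_start_after_end(ovector)) return 0;
  return ovector[1] - ovector[0];
}
```

A caller would typically test `match_start_after_end()` immediately after a successful match and report the condition, rather than go on to extract substrings from an inverted range.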

@ -148,7 +148,7 @@ static const int eint2[] = {
37, REG_EESCAPE, /* PCRE2 does not support \L, \l, \N{name}, \U, or \u */
56, REG_INVARG, /* internal error: unknown newline setting */
92, REG_INVARG, /* invalid option bits with PCRE2_LITERAL */
-99, REG_EESCAPE /* \K in lookaround */
+99, REG_EESCAPE /* \K in lookaround */
};
/* Table of texts corresponding to POSIX error codes */
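The eint2 table above is laid out as pairs: an internal compile error number followed by the POSIX code it maps to. A hedged sketch of how such a table might be consulted is given below. It reproduces only the four rows visible in this hunk and uses stand-in `X_REG_*` values in place of the real POSIX constants. The linear scan, the `X_REG_BADPAT` fallback, and the name `posix_code_for()` are all assumptions for illustration. The subtraction of 100 follows the comment in pcre2_compile.c, which says the public error codes are exactly 100 greater than the internal ones.

```c
#include <stddef.h>

/* Stand-in values for illustration only; real code would take the REG_*
   constants from the POSIX wrapper header. */
enum { X_REG_BADPAT = 1, X_REG_EESCAPE = 2, X_REG_INVARG = 3 };

/* Pairs of (internal error number, POSIX code): only the rows shown above. */
static const int eint2_sketch[] = {
  37, X_REG_EESCAPE,
  56, X_REG_INVARG,
  92, X_REG_INVARG,
  99, X_REG_EESCAPE   /* \K in lookaround */
};

/* Hypothetical lookup: convert a public compile error code such as 199 to a
   POSIX-style code, with a generic fallback when no row matches. */
static int posix_code_for(int public_error)
{
  int internal = public_error - 100;
  size_t i;
  for (i = 0; i < sizeof(eint2_sketch) / sizeof(eint2_sketch[0]); i += 2)
    if (eint2_sketch[i] == internal) return eint2_sketch[i + 1];
  return X_REG_BADPAT;
}
```

Under these assumptions, the new error 199 (PCRE2_ERROR_BACKSLASH_K_IN_LOOKAROUND) maps to the escape-error code, matching the row added to the table.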