Implement REG_PEND (GNU extension) for the POSIX wrapper.
Parent: f850015168
Commit: bcba497c0b
|
@@ -182,6 +182,8 @@ deeply. (Compare item 10.23/36.) This should fix oss-fuzz #1761.
|
|||
38. Fix returned offsets from regexec() when REG_STARTEND is used with a
|
||||
starting offset greater than zero.
|
||||
|
||||
39. Implement REG_PEND (GNU extension) for the POSIX wrapper.
|
||||
|
||||
|
||||
Version 10.23 14-February-2017
|
||||
------------------------------
|
||||
|
|
|
@@ -69,7 +69,7 @@ replacement library. Other POSIX options are not even defined.
|
|||
<P>
|
||||
There are also some options that are not defined by POSIX. These have been
|
||||
added at the request of users who want to make use of certain PCRE2-specific
|
||||
features via the POSIX calling interface.
|
||||
features via the POSIX calling interface or to add BSD or GNU functionality.
|
||||
</P>
|
||||
<P>
|
||||
When PCRE2 is called via these functions, it is only the API that is POSIX-like
|
||||
|
@@ -91,10 +91,11 @@ identifying error codes.
|
|||
<br><a name="SEC3" href="#TOC1">COMPILING A PATTERN</a><br>
|
||||
<P>
|
||||
The function <b>regcomp()</b> is called to compile a pattern into an
|
||||
internal form. The pattern is a C string terminated by a binary zero, and
|
||||
is passed in the argument <i>pattern</i>. The <i>preg</i> argument is a pointer
|
||||
to a <b>regex_t</b> structure that is used as a base for storing information
|
||||
about the compiled regular expression.
|
||||
internal form. By default, the pattern is a C string terminated by a binary
|
||||
zero (but see REG_PEND below). The <i>preg</i> argument is a pointer to a
|
||||
<b>regex_t</b> structure that is used as a base for storing information about
|
||||
the compiled regular expression. (It is also used for input when REG_PEND is
|
||||
set.)
|
||||
</P>
|
||||
<P>
|
||||
The argument <i>cflags</i> is either zero, or contains one or more of the bits
|
||||
|
@@ -124,6 +125,16 @@ matching, the <i>nmatch</i> and <i>pmatch</i> arguments are ignored, and no
|
|||
captured strings are returned. Versions of the PCRE library prior to 10.22 used
|
||||
to set the PCRE2_NO_AUTO_CAPTURE compile option, but this no longer happens
|
||||
because it disables the use of back references.
|
||||
<pre>
|
||||
REG_PEND
|
||||
</pre>
|
||||
If this option is set, the <b>re_endp</b> field in the <i>preg</i> structure
|
||||
(which has the type const char *) must be set to point to the character beyond
|
||||
the end of the pattern before calling <b>regcomp()</b>. The pattern itself may
|
||||
now contain binary zeroes, which are treated as data characters. Without
|
||||
REG_PEND, a binary zero terminates the pattern and the <b>re_endp</b> field is
|
||||
ignored. This is a GNU extension to the POSIX standard and should be used with
|
||||
caution in software intended to be portable to other systems.
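The pointer arithmetic that REG_PEND implies can be sketched as follows. This is a minimal, self-contained illustration and does not call the real regcomp(): `demo_regex_t`, `DEMO_REG_PEND`, and `demo_pattern_length()` are hypothetical stand-ins that copy only the re_endp field and the 0x0800 flag value shown in this commit, and strlen() stands in for the wrapper's zero-terminated default.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins, not the real pcre2posix.h definitions. */
#define DEMO_REG_PEND 0x0800

typedef struct {
  const char *re_endp;   /* points one past the last pattern byte */
} demo_regex_t;

/* Return the pattern length a wrapper would pass to the compiler:
   re_endp - pattern when the flag is set, otherwise the length up to
   the first binary zero (re_endp is ignored in that case). */
static size_t demo_pattern_length(const demo_regex_t *preg,
                                  const char *pattern, int cflags)
{
  if ((cflags & DEMO_REG_PEND) != 0)
    {
    assert(preg->re_endp >= pattern);  /* caller must set re_endp first */
    return (size_t)(preg->re_endp - pattern);
    }
  return strlen(pattern);
}
```

For the three-byte pattern `A`, binary zero, `B`, with re_endp set to pattern + 3, the sketch yields 3 when the flag is set and 1 when it is not, because the binary zero then terminates the pattern.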
|
||||
<pre>
|
||||
REG_UCP
|
||||
</pre>
|
||||
|
@@ -156,9 +167,10 @@ class such as [^a] (they are).
|
|||
</P>
|
||||
<P>
|
||||
The yield of <b>regcomp()</b> is zero on success, and non-zero otherwise. The
|
||||
<i>preg</i> structure is filled in on success, and one member of the structure
|
||||
is public: <i>re_nsub</i> contains the number of capturing subpatterns in
|
||||
the regular expression. Various error codes are defined in the header file.
|
||||
<i>preg</i> structure is filled in on success, and one other member of the
|
||||
structure (as well as <i>re_endp</i>) is public: <i>re_nsub</i> contains the
|
||||
number of capturing subpatterns in the regular expression. Various error codes
|
||||
are defined in the header file.
|
||||
</P>
|
||||
<P>
|
||||
NOTE: If the yield of <b>regcomp()</b> is non-zero, you must not attempt to
|
||||
|
@@ -228,15 +240,26 @@ function.
|
|||
<pre>
|
||||
REG_STARTEND
|
||||
</pre>
|
||||
The string is considered to start at <i>string</i> + <i>pmatch[0].rm_so</i> and
|
||||
to have a terminating NUL located at <i>string</i> + <i>pmatch[0].rm_eo</i>
|
||||
(there need not actually be a NUL at that location), regardless of the value of
|
||||
<i>nmatch</i>. This is a BSD extension, compatible with but not specified by
|
||||
IEEE Standard 1003.2 (POSIX.2), and should be used with caution in software
|
||||
intended to be portable to other systems. Note that a non-zero <i>rm_so</i> does
|
||||
not imply REG_NOTBOL; REG_STARTEND affects only the location of the string, not
|
||||
how it is matched. Setting REG_STARTEND and passing <i>pmatch</i> as NULL are
|
||||
mutually exclusive; the error REG_INVARG is returned.
|
||||
When this option is set, the subject string starts at <i>string</i> +
|
||||
<i>pmatch[0].rm_so</i> and ends at <i>string</i> + <i>pmatch[0].rm_eo</i>, which
|
||||
should point to the first character beyond the string. There may be binary
|
||||
zeroes within the subject string, and indeed, using REG_STARTEND is the only
|
||||
way to pass a subject string that contains a binary zero.
|
||||
</P>
|
||||
<P>
|
||||
Whatever the value of <i>pmatch[0].rm_so</i>, the offsets of the matched string
|
||||
and any captured substrings are still given relative to the start of
|
||||
<i>string</i> itself. (Before PCRE2 release 10.30 these were given relative to
|
||||
<i>string</i> + <i>pmatch[0].rm_so</i>, but this differs from other
|
||||
implementations.)
|
||||
</P>
|
||||
<P>
|
||||
This is a BSD extension, compatible with but not specified by IEEE Standard
|
||||
1003.2 (POSIX.2), and should be used with caution in software intended to be
|
||||
portable to other systems. Note that a non-zero <i>rm_so</i> does not imply
|
||||
REG_NOTBOL; REG_STARTEND affects only the location and length of the string,
|
||||
not how it is matched. Setting REG_STARTEND and passing <i>pmatch</i> as NULL
|
||||
are mutually exclusive; the error REG_INVARG is returned.
|
||||
</P>
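The offset conventions described for REG_STARTEND can be sketched without calling regexec(). This is only an illustration of the arithmetic: `demo_match_t`, `demo_set_window()`, and `demo_rebase()` are hypothetical names, with `demo_match_t` standing in for regmatch_t.

```c
#include <assert.h>

/* Hypothetical stand-in for regmatch_t. */
typedef struct {
  long rm_so;
  long rm_eo;
} demo_match_t;

/* Describe the subject window [start, start + length) the way
   REG_STARTEND expects to find it in pmatch[0]. */
static void demo_set_window(demo_match_t *pmatch, long start, long length)
{
  assert(start >= 0 && length >= 0);
  pmatch[0].rm_so = start;
  pmatch[0].rm_eo = start + length;   /* first byte beyond the subject */
}

/* Convert a match reported relative to string + window_start (the
   behaviour before release 10.30) into one relative to string itself
   (the behaviour documented here). */
static demo_match_t demo_rebase(demo_match_t rel, long window_start)
{
  demo_match_t out;
  out.rm_so = rel.rm_so + window_start;
  out.rm_eo = rel.rm_eo + window_start;
  return out;
}
```

With a window that starts at offset 2, a match found one byte into the window is reported as starting at offset 3 under the current convention, whereas the older convention would have reported 1.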
|
||||
<P>
|
||||
If the pattern was compiled with the REG_NOSUB flag, no data about any matched
|
||||
|
@@ -291,9 +314,9 @@ Cambridge, England.
|
|||
</P>
|
||||
<br><a name="SEC9" href="#TOC1">REVISION</a><br>
|
||||
<P>
|
||||
Last updated: 31 January 2016
|
||||
Last updated: 05 June 2017
|
||||
<br>
|
||||
Copyright © 1997-2016 University of Cambridge.
|
||||
Copyright © 1997-2017 University of Cambridge.
|
||||
<br>
|
||||
<p>
|
||||
Return to the <a href="index.html">PCRE2 index page</a>.
|
||||
|
|
|
@@ -1078,6 +1078,19 @@ are <b>notbol</b>, <b>notempty</b>, and <b>noteol</b>, causing REG_NOTBOL,
|
|||
REG_NOTEMPTY, and REG_NOTEOL, respectively, to be passed to <b>regexec()</b>.
|
||||
The other modifiers are ignored, with a warning message.
|
||||
</P>
|
||||
<P>
|
||||
There is one additional modifier that can be used with the POSIX wrapper. It is
|
||||
ignored (with a warning) if used for non-POSIX matching.
|
||||
<pre>
|
||||
posix_startend=<n>[:<m>]
|
||||
</pre>
|
||||
This causes the subject string to be passed to <b>regexec()</b> using the
|
||||
REG_STARTEND option, which uses offsets to restrict which part of the string is
|
||||
searched. If only one number is given, the end offset is passed as the end of
|
||||
the subject string. For more detail of REG_STARTEND, see the
|
||||
<a href="pcre2posix.html"><b>pcre2posix</b></a>
|
||||
documentation.
|
||||
</P>
|
||||
<br><b>
|
||||
Setting match controls
|
||||
</b><br>
|
||||
|
@@ -1817,7 +1830,7 @@ Cambridge, England.
|
|||
</P>
|
||||
<br><a name="SEC21" href="#TOC1">REVISION</a><br>
|
||||
<P>
|
||||
Last updated: 01 June 2017
|
||||
Last updated: 03 June 2017
|
||||
<br>
|
||||
Copyright © 1997-2017 University of Cambridge.
|
||||
<br>
|
||||
|
|
|
@@ -8986,7 +8986,8 @@ DESCRIPTION
|
|||
|
||||
There are also some options that are not defined by POSIX. These have
|
||||
been added at the request of users who want to make use of certain
|
||||
PCRE2-specific features via the POSIX calling interface.
|
||||
PCRE2-specific features via the POSIX calling interface or to add BSD
|
||||
or GNU functionality.
|
||||
|
||||
When PCRE2 is called via these functions, it is only the API that is
|
||||
POSIX-like in style. The syntax and semantics of the regular expres-
|
||||
|
@@ -9008,10 +9009,11 @@ DESCRIPTION
|
|||
COMPILING A PATTERN
|
||||
|
||||
The function regcomp() is called to compile a pattern into an internal
|
||||
form. The pattern is a C string terminated by a binary zero, and is
|
||||
passed in the argument pattern. The preg argument is a pointer to a
|
||||
regex_t structure that is used as a base for storing information about
|
||||
the compiled regular expression.
|
||||
form. By default, the pattern is a C string terminated by a binary zero
|
||||
(but see REG_PEND below). The preg argument is a pointer to a regex_t
|
||||
structure that is used as a base for storing information about the com-
|
||||
piled regular expression. (It is also used for input when REG_PEND is
|
||||
set.)
|
||||
|
||||
The argument cflags is either zero, or contains one or more of the bits
|
||||
defined by the following macros:
|
||||
|
@@ -9042,6 +9044,17 @@ COMPILING A PATTERN
|
|||
used to set the PCRE2_NO_AUTO_CAPTURE compile option, but this no
|
||||
longer happens because it disables the use of back references.
|
||||
|
||||
REG_PEND
|
||||
|
||||
If this option is set, the re_endp field in the preg structure (which
|
||||
has the type const char *) must be set to point to the character beyond
|
||||
the end of the pattern before calling regcomp(). The pattern itself may
|
||||
now contain binary zeroes, which are treated as data characters. With-
|
||||
out REG_PEND, a binary zero terminates the pattern and the re_endp
|
||||
field is ignored. This is a GNU extension to the POSIX standard and
|
||||
should be used with caution in software intended to be portable to
|
||||
other systems.
|
||||
|
||||
REG_UCP
|
||||
|
||||
The PCRE2_UCP option is set when the regular expression is passed for
|
||||
|
@@ -9071,9 +9084,10 @@ COMPILING A PATTERN
|
|||
ter (they are not) or by a negative class such as [^a] (they are).
|
||||
|
||||
The yield of regcomp() is zero on success, and non-zero otherwise. The
|
||||
preg structure is filled in on success, and one member of the structure
|
||||
is public: re_nsub contains the number of capturing subpatterns in the
|
||||
regular expression. Various error codes are defined in the header file.
|
||||
preg structure is filled in on success, and one other member of the
|
||||
structure (as well as re_endp) is public: re_nsub contains the number
|
||||
of capturing subpatterns in the regular expression. Various error codes
|
||||
are defined in the header file.
|
||||
|
||||
NOTE: If the yield of regcomp() is non-zero, you must not attempt to
|
||||
use the contents of the preg structure. If, for example, you pass it to
|
||||
|
@@ -9146,15 +9160,24 @@ MATCHING A PATTERN
|
|||
|
||||
REG_STARTEND
|
||||
|
||||
The string is considered to start at string + pmatch[0].rm_so and to
|
||||
have a terminating NUL located at string + pmatch[0].rm_eo (there need
|
||||
not actually be a NUL at that location), regardless of the value of
|
||||
nmatch. This is a BSD extension, compatible with but not specified by
|
||||
IEEE Standard 1003.2 (POSIX.2), and should be used with caution in
|
||||
software intended to be portable to other systems. Note that a non-zero
|
||||
rm_so does not imply REG_NOTBOL; REG_STARTEND affects only the location
|
||||
of the string, not how it is matched. Setting REG_STARTEND and passing
|
||||
pmatch as NULL are mutually exclusive; the error REG_INVARG is
|
||||
When this option is set, the subject string starts at string +
|
||||
pmatch[0].rm_so and ends at string + pmatch[0].rm_eo, which should
|
||||
point to the first character beyond the string. There may be binary
|
||||
zeroes within the subject string, and indeed, using REG_STARTEND is the
|
||||
only way to pass a subject string that contains a binary zero.
|
||||
|
||||
Whatever the value of pmatch[0].rm_so, the offsets of the matched
|
||||
string and any captured substrings are still given relative to the
|
||||
start of string itself. (Before PCRE2 release 10.30 these were given
|
||||
relative to string + pmatch[0].rm_so, but this differs from other
|
||||
implementations.)
|
||||
|
||||
This is a BSD extension, compatible with but not specified by IEEE
|
||||
Standard 1003.2 (POSIX.2), and should be used with caution in software
|
||||
intended to be portable to other systems. Note that a non-zero rm_so
|
||||
does not imply REG_NOTBOL; REG_STARTEND affects only the location and
|
||||
length of the string, not how it is matched. Setting REG_STARTEND and
|
||||
passing pmatch as NULL are mutually exclusive; the error REG_INVARG is
|
||||
returned.
|
||||
|
||||
If the pattern was compiled with the REG_NOSUB flag, no data about any
|
||||
|
@@ -9209,8 +9232,8 @@ AUTHOR
|
|||
|
||||
REVISION
|
||||
|
||||
Last updated: 31 January 2016
|
||||
Copyright (c) 1997-2016 University of Cambridge.
|
||||
Last updated: 05 June 2017
|
||||
Copyright (c) 1997-2017 University of Cambridge.
|
||||
------------------------------------------------------------------------------
|
||||
|
||||
|
||||
|
|
|
@@ -1,4 +1,4 @@
|
|||
.TH PCRE2POSIX 3 "03 June 2017" "PCRE2 10.30"
|
||||
.TH PCRE2POSIX 3 "05 June 2017" "PCRE2 10.30"
|
||||
.SH NAME
|
||||
PCRE2 - Perl-compatible regular expressions (revised API)
|
||||
.SH "SYNOPSIS"
|
||||
|
@@ -46,7 +46,7 @@ replacement library. Other POSIX options are not even defined.
|
|||
.P
|
||||
There are also some options that are not defined by POSIX. These have been
|
||||
added at the request of users who want to make use of certain PCRE2-specific
|
||||
features via the POSIX calling interface.
|
||||
features via the POSIX calling interface or to add BSD or GNU functionality.
|
||||
.P
|
||||
When PCRE2 is called via these functions, it is only the API that is POSIX-like
|
||||
in style. The syntax and semantics of the regular expressions themselves are
|
||||
|
@@ -68,10 +68,11 @@ identifying error codes.
|
|||
.rs
|
||||
.sp
|
||||
The function \fBregcomp()\fP is called to compile a pattern into an
|
||||
internal form. The pattern is a C string terminated by a binary zero, and
|
||||
is passed in the argument \fIpattern\fP. The \fIpreg\fP argument is a pointer
|
||||
to a \fBregex_t\fP structure that is used as a base for storing information
|
||||
about the compiled regular expression.
|
||||
internal form. By default, the pattern is a C string terminated by a binary
|
||||
zero (but see REG_PEND below). The \fIpreg\fP argument is a pointer to a
|
||||
\fBregex_t\fP structure that is used as a base for storing information about
|
||||
the compiled regular expression. (It is also used for input when REG_PEND is
|
||||
set.)
|
||||
.P
|
||||
The argument \fIcflags\fP is either zero, or contains one or more of the bits
|
||||
defined by the following macros:
|
||||
|
@@ -100,6 +101,16 @@ matching, the \fInmatch\fP and \fIpmatch\fP arguments are ignored, and no
|
|||
captured strings are returned. Versions of the PCRE library prior to 10.22 used
|
||||
to set the PCRE2_NO_AUTO_CAPTURE compile option, but this no longer happens
|
||||
because it disables the use of back references.
|
||||
.sp
|
||||
REG_PEND
|
||||
.sp
|
||||
If this option is set, the \fBre_endp\fP field in the \fIpreg\fP structure
|
||||
(which has the type const char *) must be set to point to the character beyond
|
||||
the end of the pattern before calling \fBregcomp()\fP. The pattern itself may
|
||||
now contain binary zeroes, which are treated as data characters. Without
|
||||
REG_PEND, a binary zero terminates the pattern and the \fBre_endp\fP field is
|
||||
ignored. This is a GNU extension to the POSIX standard and should be used with
|
||||
caution in software intended to be portable to other systems.
|
||||
.sp
|
||||
REG_UCP
|
||||
.sp
|
||||
|
@@ -130,9 +141,10 @@ newlines are matched by the dot metacharacter (they are not) or by a negative
|
|||
class such as [^a] (they are).
|
||||
.P
|
||||
The yield of \fBregcomp()\fP is zero on success, and non-zero otherwise. The
|
||||
\fIpreg\fP structure is filled in on success, and one member of the structure
|
||||
is public: \fIre_nsub\fP contains the number of capturing subpatterns in
|
||||
the regular expression. Various error codes are defined in the header file.
|
||||
\fIpreg\fP structure is filled in on success, and one other member of the
|
||||
structure (as well as \fIre_endp\fP) is public: \fIre_nsub\fP contains the
|
||||
number of capturing subpatterns in the regular expression. Various error codes
|
||||
are defined in the header file.
|
||||
.P
|
||||
NOTE: If the yield of \fBregcomp()\fP is non-zero, you must not attempt to
|
||||
use the contents of the \fIpreg\fP structure. If, for example, you pass it to
|
||||
|
@@ -204,21 +216,24 @@ function.
|
|||
.sp
|
||||
REG_STARTEND
|
||||
.sp
|
||||
When this option is set, the string is considered to start at \fIstring\fP +
|
||||
\fIpmatch[0].rm_so\fP and to have a terminating NUL located at \fIstring\fP +
|
||||
\fIpmatch[0].rm_eo\fP (there need not actually be a NUL at that location),
|
||||
regardless of the value of \fInmatch\fP. However, the offsets of the matched
|
||||
string and any captured substrings are still given relative to the start of
|
||||
\fIstring\fP. (Before PCRE2 release 10.30 these were given relative to
|
||||
When this option is set, the subject string starts at \fIstring\fP +
|
||||
\fIpmatch[0].rm_so\fP and ends at \fIstring\fP + \fIpmatch[0].rm_eo\fP, which
|
||||
should point to the first character beyond the string. There may be binary
|
||||
zeroes within the subject string, and indeed, using REG_STARTEND is the only
|
||||
way to pass a subject string that contains a binary zero.
|
||||
.P
|
||||
Whatever the value of \fIpmatch[0].rm_so\fP, the offsets of the matched string
|
||||
and any captured substrings are still given relative to the start of
|
||||
\fIstring\fP itself. (Before PCRE2 release 10.30 these were given relative to
|
||||
\fIstring\fP + \fIpmatch[0].rm_so\fP, but this differs from other
|
||||
implementations.)
|
||||
.P
|
||||
This is a BSD extension, compatible with but not specified by IEEE Standard
|
||||
1003.2 (POSIX.2), and should be used with caution in software intended to be
|
||||
portable to other systems. Note that a non-zero \fIrm_so\fP does not imply
|
||||
REG_NOTBOL; REG_STARTEND affects only the location of the string, not how it is
|
||||
matched. Setting REG_STARTEND and passing \fIpmatch\fP as NULL are mutually
|
||||
exclusive; the error REG_INVARG is returned.
|
||||
REG_NOTBOL; REG_STARTEND affects only the location and length of the string,
|
||||
not how it is matched. Setting REG_STARTEND and passing \fIpmatch\fP as NULL
|
||||
are mutually exclusive; the error REG_INVARG is returned.
|
||||
.P
|
||||
If the pattern was compiled with the REG_NOSUB flag, no data about any matched
|
||||
strings is returned. The \fInmatch\fP and \fIpmatch\fP arguments of
|
||||
|
@@ -277,6 +292,6 @@ Cambridge, England.
|
|||
.rs
|
||||
.sp
|
||||
.nf
|
||||
Last updated: 03 June 2017
|
||||
Last updated: 05 June 2017
|
||||
Copyright (c) 1997-2017 University of Cambridge.
|
||||
.fi
|
||||
|
|
|
@@ -965,6 +965,17 @@ SUBJECT MODIFIERS
|
|||
REG_NOTEMPTY, and REG_NOTEOL, respectively, to be passed to regexec().
|
||||
The other modifiers are ignored, with a warning message.
|
||||
|
||||
There is one additional modifier that can be used with the POSIX wrap-
|
||||
per. It is ignored (with a warning) if used for non-POSIX matching.
|
||||
|
||||
posix_startend=<n>[:<m>]
|
||||
|
||||
This causes the subject string to be passed to regexec() using the
|
||||
REG_STARTEND option, which uses offsets to restrict which part of the
|
||||
string is searched. If only one number is given, the end offset is
|
||||
passed as the end of the subject string. For more detail of REG_STAR-
|
||||
TEND, see the pcre2posix documentation.
|
||||
|
||||
Setting match controls
|
||||
|
||||
The following modifiers affect the matching process or request addi-
|
||||
|
@@ -1651,5 +1662,5 @@ AUTHOR
|
|||
|
||||
REVISION
|
||||
|
||||
Last updated: 01 June 2017
|
||||
Last updated: 03 June 2017
|
||||
Copyright (c) 1997-2017 University of Cambridge.
|
||||
|
|
|
@@ -231,10 +231,14 @@ PCRE2POSIX_EXP_DEFN int PCRE2_CALL_CONVENTION
|
|||
regcomp(regex_t *preg, const char *pattern, int cflags)
|
||||
{
|
||||
PCRE2_SIZE erroffset;
|
||||
PCRE2_SIZE patlen;
|
||||
int errorcode;
|
||||
int options = 0;
|
||||
int re_nsub = 0;
|
||||
|
||||
patlen = ((cflags & REG_PEND) != 0)? (PCRE2_SIZE)(preg->re_endp - pattern) :
|
||||
PCRE2_ZERO_TERMINATED;
|
||||
|
||||
if ((cflags & REG_ICASE) != 0) options |= PCRE2_CASELESS;
|
||||
if ((cflags & REG_NEWLINE) != 0) options |= PCRE2_MULTILINE;
|
||||
if ((cflags & REG_DOTALL) != 0) options |= PCRE2_DOTALL;
|
||||
|
@@ -243,8 +247,8 @@ if ((cflags & REG_UCP) != 0) options |= PCRE2_UCP;
|
|||
if ((cflags & REG_UNGREEDY) != 0) options |= PCRE2_UNGREEDY;
|
||||
|
||||
preg->re_cflags = cflags;
|
||||
preg->re_pcre2_code = pcre2_compile((PCRE2_SPTR)pattern, PCRE2_ZERO_TERMINATED,
|
||||
options, &errorcode, &erroffset, NULL);
|
||||
preg->re_pcre2_code = pcre2_compile((PCRE2_SPTR)pattern, patlen, options,
|
||||
&errorcode, &erroffset, NULL);
|
||||
preg->re_erroffset = erroffset;
|
||||
|
||||
if (preg->re_pcre2_code == NULL)
|
||||
|
|
|
@@ -62,6 +62,7 @@ extern "C" {
|
|||
#define REG_NOTEMPTY 0x0100 /* NOT defined by POSIX; maps to PCRE2_NOTEMPTY */
|
||||
#define REG_UNGREEDY 0x0200 /* NOT defined by POSIX; maps to PCRE2_UNGREEDY */
|
||||
#define REG_UCP 0x0400 /* NOT defined by POSIX; maps to PCRE2_UCP */
|
||||
#define REG_PEND 0x0800 /* GNU feature: pass end of pattern via re_endp */
|
||||
|
||||
/* This is not used by PCRE2, but by defining it we make it easier
|
||||
to slot PCRE2 into existing programs that make POSIX calls. */
|
||||
|
@@ -91,11 +92,13 @@ enum {
|
|||
};
|
||||
|
||||
|
||||
/* The structure representing a compiled regular expression. */
|
||||
/* The structure representing a compiled regular expression. It is also used
|
||||
for passing the pattern end pointer when REG_PEND is set. */
|
||||
|
||||
typedef struct {
|
||||
void *re_pcre2_code;
|
||||
void *re_match_data;
|
||||
const char *re_endp;
|
||||
size_t re_nsub;
|
||||
size_t re_erroffset;
|
||||
int re_cflags;
|
||||
|
|
|
@@ -699,7 +699,8 @@ static modstruct modlist[] = {
|
|||
#define POSIX_SUPPORTED_COMPILE_EXTRA_OPTIONS (0)
|
||||
|
||||
#define POSIX_SUPPORTED_COMPILE_CONTROLS ( \
|
||||
CTL_AFTERTEXT|CTL_ALLAFTERTEXT|CTL_EXPAND|CTL_POSIX|CTL_POSIX_NOSUB)
|
||||
CTL_AFTERTEXT|CTL_ALLAFTERTEXT|CTL_EXPAND|CTL_HEXPAT|CTL_POSIX| \
|
||||
CTL_POSIX_NOSUB|CTL_USE_LENGTH)
|
||||
|
||||
#define POSIX_SUPPORTED_COMPILE_CONTROLS2 (0)
|
||||
|
||||
|
@@ -733,11 +734,9 @@ the first control word. Note that CTL_POSIX_NOSUB is always accompanied by
|
|||
CTL_POSIX, so it doesn't need its own entries. */
|
||||
|
||||
static uint32_t exclusive_pat_controls[] = {
|
||||
CTL_POSIX | CTL_HEXPAT,
|
||||
CTL_POSIX | CTL_PUSH,
|
||||
CTL_POSIX | CTL_PUSHCOPY,
|
||||
CTL_POSIX | CTL_PUSHTABLESCOPY,
|
||||
CTL_POSIX | CTL_USE_LENGTH,
|
||||
CTL_PUSH | CTL_PUSHCOPY,
|
||||
CTL_PUSH | CTL_PUSHTABLESCOPY,
|
||||
CTL_PUSHCOPY | CTL_PUSHTABLESCOPY,
|
||||
|
@@ -896,7 +895,7 @@ static PCRE2_SIZE malloclistlength[MALLOCLISTSIZE];
|
|||
static uint32_t malloclistptr = 0;
|
||||
|
||||
#ifdef SUPPORT_PCRE2_8
|
||||
static regex_t preg = { NULL, NULL, 0, 0, 0 };
|
||||
static regex_t preg = { NULL, NULL, 0, 0, 0, 0 };
|
||||
#endif
|
||||
|
||||
static int *dfa_workspace = NULL;
|
||||
|
@@ -5264,6 +5263,12 @@ if ((pat_patctl.control & CTL_POSIX) != 0)
|
|||
if ((pat_patctl.options & PCRE2_DOTALL) != 0) cflags |= REG_DOTALL;
|
||||
if ((pat_patctl.options & PCRE2_UNGREEDY) != 0) cflags |= REG_UNGREEDY;
|
||||
|
||||
if ((pat_patctl.control & (CTL_HEXPAT|CTL_USE_LENGTH)) != 0)
|
||||
{
|
||||
preg.re_endp = (char *)pbuffer8 + patlen;
|
||||
cflags |= REG_PEND;
|
||||
}
|
||||
|
||||
rc = regcomp(&preg, (char *)pbuffer8, cflags);
|
||||
|
||||
/* Compiling failed */
|
||||
|
|
|
@@ -123,4 +123,10 @@
|
|||
/^a\x{00}b$/posix
|
||||
a\x{00}b\=posix_startend=0:3
|
||||
|
||||
/"A" 00 "B"/hex
|
||||
A\x{00}B\=posix_startend=0:3
|
||||
|
||||
/ABC/use_length
|
||||
ABC
|
||||
|
||||
# End of testdata/testinput18
|
||||
|
|
|
@@ -15,4 +15,7 @@
|
|||
/\w/ucp
|
||||
+++\x{c2}
|
||||
|
||||
/"^AB" 00 "\x{1234}$"/hex,utf
|
||||
AB\x{00}\x{1234}\=posix_startend=0:6
|
||||
|
||||
# End of testdata/testinput19
|
||||
|
|
|
@@ -191,4 +191,12 @@ No match: POSIX code 17: match failed
|
|||
a\x{00}b\=posix_startend=0:3
|
||||
0: a\x00b
|
||||
|
||||
/"A" 00 "B"/hex
|
||||
A\x{00}B\=posix_startend=0:3
|
||||
0: A\x00B
|
||||
|
||||
/ABC/use_length
|
||||
ABC
|
||||
0: ABC
|
||||
|
||||
# End of testdata/testinput18
|
||||
|
|
|
@@ -18,4 +18,8 @@ No match: POSIX code 17: match failed
|
|||
+++\x{c2}
|
||||
0: \xc2
|
||||
|
||||
/"^AB" 00 "\x{1234}$"/hex,utf
|
||||
AB\x{00}\x{1234}\=posix_startend=0:6
|
||||
0: AB\x{00}\x{1234}
|
||||
|
||||
# End of testdata/testinput19
|
||||
|
|