Add support for (?^) as now supported by Perl.
parent
27337495dc
commit
6e245572b8
|
@ -131,6 +131,8 @@ present.
|
|||
terminated by (*ACCEPT).
|
||||
|
||||
29. Add support for \N{U+dddd}, but not in EBCDIC environments.
|
||||
|
||||
30. Add support for (?^) for unsetting all imnsx options.
|
||||
|
||||
|
||||
Version 10.31 12-February-2018
|
||||
|
|
|
@ -1466,7 +1466,8 @@ character, even if newlines are coded as CRLF. Without this option, a dot does
|
|||
not match when the current position in the subject is at a newline. This option
|
||||
is equivalent to Perl's /s option, and it can be changed within a pattern by a
|
||||
(?s) option setting. A negative class such as [^a] always matches newline
|
||||
characters, independent of the setting of this option.
|
||||
characters, and the \N escape sequence always matches a non-newline character,
|
||||
independent of the setting of PCRE2_DOTALL.
|
||||
<pre>
|
||||
PCRE2_DUPNAMES
|
||||
</pre>
|
||||
|
@ -3634,7 +3635,7 @@ Cambridge, England.
|
|||
</P>
|
||||
<br><a name="SEC42" href="#TOC1">REVISION</a><br>
|
||||
<P>
|
||||
Last updated: 02 July 2018
|
||||
Last updated: 27 July 2018
|
||||
<br>
|
||||
Copyright © 1997-2018 University of Cambridge.
|
||||
<br>
|
||||
|
|
|
@ -42,13 +42,14 @@ assertion is a condition that has a matching branch (that is, the condition is
|
|||
false).
|
||||
</P>
|
||||
<P>
|
||||
4. The following Perl escape sequences are not supported: \l, \u, \L,
|
||||
\U, and \N when followed by a character name or Unicode value. (\N on its
|
||||
own, matching a non-newline character, is supported.) In fact these are
|
||||
4. The following Perl escape sequences are not supported: \F, \l, \L, \u,
|
||||
\U, and \N when followed by a character name. \N on its own, matching a
|
||||
non-newline character, and \N{U+dd..}, matching a Unicode code point, are
|
||||
supported. The escapes that modify the case of following letters are
|
||||
implemented by Perl's general string-handling and are not part of its pattern
|
||||
matching engine. If any of these are encountered by PCRE2, an error is
|
||||
generated by default. However, if the PCRE2_ALT_BSUX option is set,
|
||||
\U and \u are interpreted as ECMAScript interprets them.
|
||||
generated by default. However, if the PCRE2_ALT_BSUX option is set, \U and \u
|
||||
are interpreted as ECMAScript interprets them.
|
||||
</P>
|
||||
<P>
|
||||
5. The Perl escape sequences \p, \P, and \X are supported only if PCRE2 is
|
||||
|
@ -61,17 +62,22 @@ internal representation of Unicode characters, there is no need to implement
|
|||
the somewhat messy concept of surrogates."
|
||||
</P>
|
||||
<P>
|
||||
6. PCRE2 does support the \Q...\E escape for quoting substrings. Characters
|
||||
in between are treated as literals. This is slightly different from Perl in
|
||||
that $ and @ are also handled as literals inside the quotes. In Perl, they
|
||||
cause variable interpolation (but of course PCRE2 does not have variables).
|
||||
Note the following examples:
|
||||
6. PCRE2 supports the \Q...\E escape for quoting substrings. Characters
|
||||
in between are treated as literals. However, this is slightly different from
|
||||
Perl in that $ and @ are also handled as literals inside the quotes. In Perl,
|
||||
they cause variable interpolation (but of course PCRE2 does not have
|
||||
variables). Also, Perl does "double-quotish backslash interpolation" on any
|
||||
backslashes between \Q and \E which, its documentation says, "may lead to
|
||||
confusing results". PCRE2 treats a backslash between \Q and \E just like any
|
||||
other character. Note the following examples:
|
||||
<pre>
|
||||
Pattern PCRE2 matches Perl matches
|
||||
Pattern PCRE2 matches Perl matches
|
||||
|
||||
\Qabc$xyz\E abc$xyz abc followed by the contents of $xyz
|
||||
\Qabc\$xyz\E abc\$xyz abc\$xyz
|
||||
\Qabc\E\$\Qxyz\E abc$xyz abc$xyz
|
||||
\QA\B\E A\B A\B
|
||||
\Q\\E \ \\E
|
||||
</pre>
|
||||
The \Q...\E sequence is recognized both inside and outside character classes.
|
||||
</P>
|
||||
|
@ -229,9 +235,9 @@ Cambridge, England.
|
|||
REVISION
|
||||
</b><br>
|
||||
<P>
|
||||
Last updated: 18 April 2017
|
||||
Last updated: 28 July 2018
|
||||
<br>
|
||||
Copyright © 1997-2017 University of Cambridge.
|
||||
Copyright © 1997-2018 University of Cambridge.
|
||||
<br>
|
||||
<p>
|
||||
Return to the <a href="index.html">PCRE2 index page</a>.
|
||||
|
|
|
@ -357,13 +357,18 @@ of the pattern.
|
|||
If you want to remove the special meaning from a sequence of characters, you
|
||||
can do so by putting them between \Q and \E. This is different from Perl in
|
||||
that $ and @ are handled as literals in \Q...\E sequences in PCRE2, whereas
|
||||
in Perl, $ and @ cause variable interpolation. Note the following examples:
|
||||
in Perl, $ and @ cause variable interpolation. Also, Perl does "double-quotish
|
||||
backslash interpolation" on any backslashes between \Q and \E which, its
|
||||
documentation says, "may lead to confusing results". PCRE2 treats a backslash
|
||||
between \Q and \E just like any other character. Note the following examples:
|
||||
<pre>
|
||||
Pattern PCRE2 matches Perl matches
|
||||
|
||||
\Qabc$xyz\E abc$xyz abc followed by the contents of $xyz
|
||||
\Qabc\$xyz\E abc\$xyz abc\$xyz
|
||||
\Qabc\E\$\Qxyz\E abc$xyz abc$xyz
|
||||
\QA\B\E A\B A\B
|
||||
\Q\\E \ \\E
|
||||
</pre>
|
||||
The \Q...\E sequence is recognized both inside and outside character classes.
|
||||
An isolated \E that is not preceded by \Q is ignored. If \Q is not followed
|
||||
|
@ -545,7 +550,7 @@ character class, these sequences have different meanings.
|
|||
Unsupported escape sequences
|
||||
</b><br>
|
||||
<P>
|
||||
In Perl, the sequences \l, \L, \u, and \U are recognized by its string
|
||||
In Perl, the sequences \F, \l, \L, \u, and \U are recognized by its string
|
||||
handler and used to modify the case of following characters. By default, PCRE2
|
||||
does not support these escape sequences. However, if the PCRE2_ALT_BSUX option
|
||||
is set, \U matches a "U" character, and \u can be used to define a character
|
||||
|
@ -1635,21 +1640,27 @@ Perl option letters enclosed between "(?" and ")". The option letters are
|
|||
xx for PCRE2_EXTENDED_MORE
|
||||
</pre>
|
||||
For example, (?im) sets caseless, multiline matching. It is also possible to
|
||||
unset these options by preceding the letter with a hyphen. The two "extended"
|
||||
options are not independent; unsetting either one cancels the effects of both
|
||||
of them.
|
||||
unset these options by preceding the relevant letters with a hyphen, for
|
||||
example (?-im). The two "extended" options are not independent; unsetting either
|
||||
one cancels the effects of both of them.
|
||||
</P>
|
||||
<P>
|
||||
A combined setting and unsetting such as (?im-sx), which sets PCRE2_CASELESS
|
||||
and PCRE2_MULTILINE while unsetting PCRE2_DOTALL and PCRE2_EXTENDED, is also
|
||||
permitted. If a letter appears both before and after the hyphen, the option is
|
||||
unset. An empty options setting "(?)" is allowed. Needless to say, it has no
|
||||
effect.
|
||||
permitted. Only one hyphen may appear in the options string. If a letter
|
||||
appears both before and after the hyphen, the option is unset. An empty options
|
||||
setting "(?)" is allowed. Needless to say, it has no effect.
|
||||
</P>
|
||||
<P>
|
||||
If the first character following (? is a circumflex, it causes all of the above
|
||||
options to be unset. Thus, (?^) is equivalent to (?-imnsx). Letters may follow
|
||||
the circumflex to cause some options to be re-instated, but a hyphen may not
|
||||
appear.
|
||||
</P>
|
||||
<P>
|
||||
The PCRE2-specific options PCRE2_DUPNAMES and PCRE2_UNGREEDY can be changed in
|
||||
the same way as the Perl-compatible options by using the characters J and U
|
||||
respectively.
|
||||
respectively. However, these are not unset by (?^).
|
||||
</P>
|
||||
<P>
|
||||
When one of these option changes occurs at top level (that is, not inside
|
||||
|
@ -3579,7 +3590,7 @@ Cambridge, England.
|
|||
</P>
|
||||
<br><a name="SEC30" href="#TOC1">REVISION</a><br>
|
||||
<P>
|
||||
Last updated: 27 July 2018
|
||||
Last updated: 28 July 2018
|
||||
<br>
|
||||
Copyright © 1997-2018 University of Cambridge.
|
||||
<br>
|
||||
|
|
|
@ -456,7 +456,15 @@ but some of them use Unicode properties if PCRE2_UCP is set. You can use
|
|||
(?x) extended: ignore white space except in classes
|
||||
(?xx) as (?x) but also ignore space and tab in classes
|
||||
(?-...) unset option(s)
|
||||
(?^) unset imnsx options
|
||||
</pre>
|
||||
Unsetting x or xx unsets both. Several options may be set at once, and a
|
||||
mixture of setting and unsetting such as (?i-x) is allowed, but there may be
|
||||
only one hyphen. Setting (but no unsetting) is allowed after (?^ for example
|
||||
(?^in). An option setting may appear at the start of a non-capturing group, for
|
||||
example (?i:...).
|
||||
</P>
|
||||
<P>
|
||||
The following are recognized only at the very start of a pattern or after one
|
||||
of the newline or \R options with similar syntax. More than one of them may
|
||||
appear. For the first three, d is a decimal number.
|
||||
|
@ -624,7 +632,7 @@ Cambridge, England.
|
|||
</P>
|
||||
<br><a name="SEC27" href="#TOC1">REVISION</a><br>
|
||||
<P>
|
||||
Last updated: 27 July 2018
|
||||
Last updated: 28 July 2018
|
||||
<br>
|
||||
Copyright © 1997-2018 University of Cambridge.
|
||||
<br>
|
||||
|
|
4176
doc/pcre2.txt
File diff suppressed because it is too large
|
@ -1639,19 +1639,24 @@ Perl option letters enclosed between "(?" and ")". The option letters are
|
|||
xx for PCRE2_EXTENDED_MORE
|
||||
.sp
|
||||
For example, (?im) sets caseless, multiline matching. It is also possible to
|
||||
unset these options by preceding the letter with a hyphen. The two "extended"
|
||||
options are not independent; unsetting either one cancels the effects of both
|
||||
of them.
|
||||
unset these options by preceding the relevant letters with a hyphen, for
|
||||
example (?-im). The two "extended" options are not independent; unsetting either
|
||||
one cancels the effects of both of them.
|
||||
.P
|
||||
A combined setting and unsetting such as (?im-sx), which sets PCRE2_CASELESS
|
||||
and PCRE2_MULTILINE while unsetting PCRE2_DOTALL and PCRE2_EXTENDED, is also
|
||||
permitted. If a letter appears both before and after the hyphen, the option is
|
||||
unset. An empty options setting "(?)" is allowed. Needless to say, it has no
|
||||
effect.
|
||||
permitted. Only one hyphen may appear in the options string. If a letter
|
||||
appears both before and after the hyphen, the option is unset. An empty options
|
||||
setting "(?)" is allowed. Needless to say, it has no effect.
|
||||
.P
|
||||
If the first character following (? is a circumflex, it causes all of the above
|
||||
options to be unset. Thus, (?^) is equivalent to (?-imnsx). Letters may follow
|
||||
the circumflex to cause some options to be re-instated, but a hyphen may not
|
||||
appear.
|
||||
.P
|
||||
The PCRE2-specific options PCRE2_DUPNAMES and PCRE2_UNGREEDY can be changed in
|
||||
the same way as the Perl-compatible options by using the characters J and U
|
||||
respectively.
|
||||
respectively. However, these are not unset by (?^).
|
||||
.P
|
||||
When one of these option changes occurs at top level (that is, not inside
|
||||
subpattern parentheses), the change applies to the remainder of the pattern
|
||||
|
|
|
@ -1,4 +1,4 @@
|
|||
.TH PCRE2SYNTAX 3 "27 July 2018" "PCRE2 10.32"
|
||||
.TH PCRE2SYNTAX 3 "28 July 2018" "PCRE2 10.32"
|
||||
.SH NAME
|
||||
PCRE2 - Perl-compatible regular expressions (revised API)
|
||||
.SH "PCRE2 REGULAR EXPRESSION SYNTAX SUMMARY"
|
||||
|
@ -431,7 +431,14 @@ but some of them use Unicode properties if PCRE2_UCP is set. You can use
|
|||
(?x) extended: ignore white space except in classes
|
||||
(?xx) as (?x) but also ignore space and tab in classes
|
||||
(?-...) unset option(s)
|
||||
(?^) unset imnsx options
|
||||
.sp
|
||||
Unsetting x or xx unsets both. Several options may be set at once, and a
|
||||
mixture of setting and unsetting such as (?i-x) is allowed, but there may be
|
||||
only one hyphen. Setting (but no unsetting) is allowed after (?^ for example
|
||||
(?^in). An option setting may appear at the start of a non-capturing group, for
|
||||
example (?i:...).
|
||||
.P
|
||||
The following are recognized only at the very start of a pattern or after one
|
||||
of the newline or \eR options with similar syntax. More than one of them may
|
||||
appear. For the first three, d is a decimal number.
|
||||
|
@ -612,6 +619,6 @@ Cambridge, England.
|
|||
.rs
|
||||
.sp
|
||||
.nf
|
||||
Last updated: 27 July 2018
|
||||
Last updated: 28 July 2018
|
||||
Copyright (c) 1997-2018 University of Cambridge.
|
||||
.fi
|
||||
|
|
|
@ -317,6 +317,7 @@ pcre2_pattern_convert(). */
|
|||
#define PCRE2_ERROR_NO_SURROGATES_IN_UTF16 191
|
||||
#define PCRE2_ERROR_BAD_LITERAL_OPTIONS 192
|
||||
#define PCRE2_ERROR_NOT_SUPPORTED_IN_EBCDIC 193
|
||||
#define PCRE2_ERROR_INVALID_HYPHEN_IN_OPTIONS 194
|
||||
|
||||
|
||||
/* "Expected" matching error codes: no match and partial match. */
|
||||
|
|
|
@ -263,7 +263,7 @@ versions. */
|
|||
#define META_SKIP 0x802d0000u /* kept */
|
||||
#define META_SKIP_ARG 0x802e0000u /* in */
|
||||
#define META_THEN 0x802f0000u /* this */
|
||||
#define META_THEN_ARG 0x80300000u /* order */
|
||||
#define META_THEN_ARG 0x80300000u /* order */
|
||||
|
||||
/* These must be kept in groups of adjacent 3 values, and all together. */
|
||||
|
||||
|
@ -330,7 +330,7 @@ static unsigned char meta_extra_lengths[] = {
|
|||
0, /* META_ACCEPT */
|
||||
0, /* META_FAIL */
|
||||
0, /* META_COMMIT */
|
||||
1, /* META_COMMIT_ARG - plus the string length */
|
||||
1, /* META_COMMIT_ARG - plus the string length */
|
||||
0, /* META_PRUNE */
|
||||
1, /* META_PRUNE_ARG - plus the string length */
|
||||
0, /* META_SKIP */
|
||||
|
@ -612,7 +612,7 @@ static const int verbcount = sizeof(verbs)/sizeof(verbitem);
|
|||
/* Verb opcodes, indexed by their META code offset from META_MARK. */
|
||||
|
||||
static const uint32_t verbops[] = {
|
||||
OP_MARK, OP_ACCEPT, OP_FAIL, OP_COMMIT, OP_COMMIT_ARG, OP_PRUNE,
|
||||
OP_MARK, OP_ACCEPT, OP_FAIL, OP_COMMIT, OP_COMMIT_ARG, OP_PRUNE,
|
||||
OP_PRUNE_ARG, OP_SKIP, OP_SKIP_ARG, OP_THEN, OP_THEN_ARG };
|
||||
|
||||
/* Offsets from OP_STAR for case-independent and negative repeat opcodes. */
|
||||
|
@ -731,7 +731,7 @@ enum { ERR0 = COMPILE_ERROR_BASE,
|
|||
ERR61, ERR62, ERR63, ERR64, ERR65, ERR66, ERR67, ERR68, ERR69, ERR70,
|
||||
ERR71, ERR72, ERR73, ERR74, ERR75, ERR76, ERR77, ERR78, ERR79, ERR80,
|
||||
ERR81, ERR82, ERR83, ERR84, ERR85, ERR86, ERR87, ERR88, ERR89, ERR90,
|
||||
ERR91, ERR92, ERR93 };
|
||||
ERR91, ERR92, ERR93, ERR94 };
|
||||
|
||||
/* This is a table of start-of-pattern options such as (*UTF) and settings such
|
||||
as (*LIMIT_MATCH=nnnn) and (*CRLF). For completeness and backward
|
||||
|
@ -1441,41 +1441,41 @@ else if ((i = escapes[c - ESCAPES_FIRST]) != 0)
|
|||
escape = -i; /* Else return a special escape */
|
||||
if (cb != NULL && (escape == ESC_P || escape == ESC_p || escape == ESC_X))
|
||||
cb->external_flags |= PCRE2_HASBKPORX; /* Note \P, \p, or \X */
|
||||
|
||||
|
||||
/* Perl supports \N{name} for character names and \N{U+dddd} for numerical
|
||||
Unicode code points, as well as plain \N for "not newline". PCRE does not
|
||||
support \N{name}. However, it does support quantification such as \N{2,3},
|
||||
support \N{name}. However, it does support quantification such as \N{2,3},
|
||||
so if \N{ is not followed by U+dddd we check for a quantifier. */
|
||||
|
||||
if (escape == ESC_N && ptr < ptrend && *ptr == CHAR_LEFT_CURLY_BRACKET)
|
||||
{
|
||||
PCRE2_SPTR p = ptr + 1;
|
||||
|
||||
/* \N{U+ can be handled by the \x{ code. However, this construction is
|
||||
not valid in EBCDIC environments because it specifies a Unicode
|
||||
character, not a codepoint in the local code. For example \N{U+0041}
|
||||
|
||||
/* \N{U+ can be handled by the \x{ code. However, this construction is
|
||||
not valid in EBCDIC environments because it specifies a Unicode
|
||||
character, not a codepoint in the local code. For example \N{U+0041}
|
||||
must be "A" in all environments. */
|
||||
|
||||
|
||||
if (ptrend - p > 1 && *p == CHAR_U && p[1] == CHAR_PLUS)
|
||||
{
|
||||
#ifdef EBCDIC
|
||||
*errorcodeptr = ERR93;
|
||||
#else
|
||||
#else
|
||||
ptr = p + 1;
|
||||
escape = 0; /* Not a fancy escape after all */
|
||||
escape = 0; /* Not a fancy escape after all */
|
||||
goto COME_FROM_NU;
|
||||
#endif
|
||||
}
|
||||
|
||||
/* Give an error if what follows is not a quantifier, but don't override
|
||||
#endif
|
||||
}
|
||||
|
||||
/* Give an error if what follows is not a quantifier, but don't override
|
||||
an error set by the quantifier reader (e.g. number overflow). */
|
||||
|
||||
|
||||
else
|
||||
{
|
||||
{
|
||||
if (!read_repeat_counts(&p, ptrend, NULL, NULL, errorcodeptr) &&
|
||||
*errorcodeptr == 0)
|
||||
*errorcodeptr = ERR37;
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
|
@ -1762,9 +1762,9 @@ else
|
|||
{
|
||||
if (ptr < ptrend && *ptr == CHAR_LEFT_CURLY_BRACKET)
|
||||
{
|
||||
#ifndef EBCDIC
|
||||
COME_FROM_NU:
|
||||
#endif
|
||||
#ifndef EBCDIC
|
||||
COME_FROM_NU:
|
||||
#endif
|
||||
if (++ptr >= ptrend || *ptr == CHAR_RIGHT_CURLY_BRACKET)
|
||||
{
|
||||
*errorcodeptr = ERR78;
|
||||
|
@ -2495,15 +2495,15 @@ while (ptr < ptrend)
|
|||
goto FAILED;
|
||||
}
|
||||
*verblengthptr = (uint32_t)verbnamelength;
|
||||
|
||||
|
||||
/* If this name was on a verb such as (*ACCEPT) which does not continue,
|
||||
a (*MARK) was generated for the name. We now add the original verb as the
|
||||
next item. */
|
||||
a (*MARK) was generated for the name. We now add the original verb as the
|
||||
next item. */
|
||||
|
||||
if (add_after_mark != 0)
|
||||
{
|
||||
*parsed_pattern++ = add_after_mark;
|
||||
add_after_mark = 0;
|
||||
add_after_mark = 0;
|
||||
}
|
||||
break;
|
||||
|
||||
|
@ -3498,22 +3498,22 @@ while (ptr < ptrend)
|
|||
if (*ptr++ == CHAR_COLON) /* Skip past : or ) */
|
||||
{
|
||||
/* Some optional arguments can be treated as a preceding (*MARK) */
|
||||
|
||||
|
||||
if (verbs[i].has_arg < 0)
|
||||
{
|
||||
add_after_mark = verbs[i].meta;
|
||||
*parsed_pattern++ = META_MARK;
|
||||
*parsed_pattern++ = META_MARK;
|
||||
}
|
||||
|
||||
|
||||
/* The remaining verbs with arguments (except *MARK) need a different
|
||||
opcode. */
|
||||
|
||||
|
||||
else
|
||||
{
|
||||
{
|
||||
*parsed_pattern++ = verbs[i].meta +
|
||||
((verbs[i].meta != META_MARK)? 0x00010000u:0);
|
||||
}
|
||||
|
||||
}
|
||||
|
||||
/* Set up for reading the name in the main loop. */
|
||||
|
||||
verblengthptr = parsed_pattern++;
|
||||
|
@ -3576,17 +3576,37 @@ while (ptr < ptrend)
|
|||
|
||||
else
|
||||
{
|
||||
BOOL hyphenok = TRUE;
|
||||
top_nest->reset_group = 0;
|
||||
top_nest->max_group = 0;
|
||||
set = unset = 0;
|
||||
optset = &set;
|
||||
|
||||
/* ^ at the start unsets imnsx and disables the subsequent use of - */
|
||||
|
||||
if (ptr < ptrend && *ptr == CHAR_CIRCUMFLEX_ACCENT)
|
||||
{
|
||||
options &= ~(PCRE2_CASELESS|PCRE2_MULTILINE|PCRE2_NO_AUTO_CAPTURE|
|
||||
PCRE2_DOTALL|PCRE2_EXTENDED|PCRE2_EXTENDED_MORE);
|
||||
hyphenok = FALSE;
|
||||
ptr++;
|
||||
}
|
||||
|
||||
while (ptr < ptrend && *ptr != CHAR_RIGHT_PARENTHESIS &&
|
||||
*ptr != CHAR_COLON)
|
||||
{
|
||||
switch (*ptr++)
|
||||
{
|
||||
case CHAR_MINUS: optset = &unset; break;
|
||||
case CHAR_MINUS:
|
||||
if (!hyphenok)
|
||||
{
|
||||
errorcode = ERR94;
|
||||
ptr--; /* Correct the offset */
|
||||
goto FAILED;
|
||||
}
|
||||
optset = &unset;
|
||||
hyphenok = FALSE;
|
||||
break;
|
||||
|
||||
case CHAR_J: /* Record that it changed in the external options */
|
||||
*optset |= PCRE2_DUPNAMES;
|
||||
|
@ -3644,9 +3664,10 @@ while (ptr < ptrend)
|
|||
}
|
||||
else *parsed_pattern++ = META_NOCAPTURE;
|
||||
|
||||
/* If nothing changed, no need to record. */
|
||||
/* If nothing changed, no need to record. The check of hyphenok catches
|
||||
the (?^) case. */
|
||||
|
||||
if (set != 0 || unset != 0)
|
||||
if (set != 0 || unset != 0 || !hyphenok)
|
||||
{
|
||||
*parsed_pattern++ = META_OPTIONS;
|
||||
*parsed_pattern++ = options;
|
||||
|
@ -3952,7 +3973,7 @@ while (ptr < ptrend)
|
|||
{
|
||||
if (++ptr >= ptrend || !IS_DIGIT(*ptr)) goto BAD_VERSION_CONDITION;
|
||||
minor = (*ptr++ - CHAR_0) * 10;
|
||||
if (IS_DIGIT(*ptr)) minor += *ptr++ - CHAR_0;
|
||||
if (IS_DIGIT(*ptr)) minor += *ptr++ - CHAR_0;
|
||||
if (ptr >= ptrend || *ptr != CHAR_RIGHT_PARENTHESIS)
|
||||
goto BAD_VERSION_CONDITION;
|
||||
}
|
||||
|
@ -5709,7 +5730,7 @@ for (;; pptr++)
|
|||
cb->had_pruneorskip = TRUE;
|
||||
/* Fall through */
|
||||
case META_MARK:
|
||||
case META_COMMIT_ARG:
|
||||
case META_COMMIT_ARG:
|
||||
VERB_ARG:
|
||||
*code++ = verbops[(meta - META_MARK) >> 16];
|
||||
/* The length is in characters. */
|
||||
|
@ -8058,7 +8079,7 @@ for (;;)
|
|||
break;
|
||||
|
||||
case OP_MARK:
|
||||
case OP_COMMIT_ARG:
|
||||
case OP_COMMIT_ARG:
|
||||
case OP_PRUNE_ARG:
|
||||
case OP_SKIP_ARG:
|
||||
case OP_THEN_ARG:
|
||||
|
@ -8367,7 +8388,7 @@ for (;; pptr++)
|
|||
break;
|
||||
|
||||
case META_MARK: /* Add the length of the name. */
|
||||
case META_COMMIT_ARG:
|
||||
case META_COMMIT_ARG:
|
||||
case META_PRUNE_ARG:
|
||||
case META_SKIP_ARG:
|
||||
case META_THEN_ARG:
|
||||
|
@ -8558,7 +8579,7 @@ for (;; pptr++)
|
|||
goto EXIT;
|
||||
|
||||
case META_MARK:
|
||||
case META_COMMIT_ARG:
|
||||
case META_COMMIT_ARG:
|
||||
case META_PRUNE_ARG:
|
||||
case META_SKIP_ARG:
|
||||
case META_THEN_ARG:
|
||||
|
@ -8630,31 +8651,31 @@ for (;; pptr++)
|
|||
case META_LOOKAHEADNOT:
|
||||
pptr = parsed_skip(pptr + 1, PSKIP_KET);
|
||||
if (pptr == NULL) goto PARSED_SKIP_FAILED;
|
||||
|
||||
|
||||
/* Also ignore any qualifiers that follow a lookahead assertion. */
|
||||
|
||||
|
||||
switch (pptr[1])
|
||||
{
|
||||
case META_ASTERISK:
|
||||
case META_ASTERISK_PLUS:
|
||||
case META_ASTERISK_QUERY:
|
||||
case META_ASTERISK_QUERY:
|
||||
case META_PLUS:
|
||||
case META_PLUS_PLUS:
|
||||
case META_PLUS_PLUS:
|
||||
case META_PLUS_QUERY:
|
||||
case META_QUERY:
|
||||
case META_QUERY_PLUS:
|
||||
case META_QUERY_QUERY:
|
||||
case META_QUERY_QUERY:
|
||||
pptr++;
|
||||
break;
|
||||
|
||||
|
||||
case META_MINMAX:
|
||||
case META_MINMAX_PLUS:
|
||||
case META_MINMAX_QUERY:
|
||||
pptr += 3;
|
||||
break;
|
||||
|
||||
|
||||
default:
|
||||
break;
|
||||
break;
|
||||
}
|
||||
break;
|
||||
|
||||
|
@ -9026,7 +9047,7 @@ for (pptr = cb->parsed_pattern; *pptr != META_END; pptr++)
|
|||
break;
|
||||
|
||||
case META_MARK:
|
||||
case META_COMMIT_ARG:
|
||||
case META_COMMIT_ARG:
|
||||
case META_PRUNE_ARG:
|
||||
case META_SKIP_ARG:
|
||||
case META_THEN_ARG:
|
||||
|
|
|
@ -179,7 +179,8 @@ static const unsigned char compile_error_texts[] =
|
|||
"internal error: bad code value in parsed_skip()\0"
|
||||
"PCRE2_EXTRA_ALLOW_SURROGATE_ESCAPES is not allowed in UTF-16 mode\0"
|
||||
"invalid option bits with PCRE2_LITERAL\0"
|
||||
"\\N{U+dddd} is not supported in EBCDIC mode\0"
|
||||
"\\N{U+dddd} is not supported in EBCDIC mode\0"
|
||||
"invalid hyphen in option setting\0"
|
||||
;
|
||||
|
||||
/* Match-time and UTF error texts are in the same format. */
|
||||
|
|
|
@ -6252,4 +6252,10 @@ ef) x/x,mark
|
|||
|
||||
/(*COMMIT:]w)/
|
||||
|
||||
/(?i)A(?^)B(?^x:C D)(?^i)e f/
|
||||
aBCDE F
|
||||
\= Expect no match
|
||||
aBCDEF
|
||||
AbCDe f
|
||||
|
||||
# End of testinput1
|
||||
|
|
|
@ -5453,4 +5453,10 @@ a)"xI
|
|||
\= Expect no match
|
||||
axy
|
||||
|
||||
/(?^x-i)AB/
|
||||
|
||||
/(?^-i)AB/
|
||||
|
||||
/(?x-i-i)/
|
||||
|
||||
# End of testinput2
|
||||
|
|
|
@ -9912,4 +9912,13 @@ No match, mark = X
|
|||
|
||||
/(*COMMIT:]w)/
|
||||
|
||||
/(?i)A(?^)B(?^x:C D)(?^i)e f/
|
||||
aBCDE F
|
||||
0: aBCDE F
|
||||
\= Expect no match
|
||||
aBCDEF
|
||||
No match
|
||||
AbCDe f
|
||||
No match
|
||||
|
||||
# End of testinput1
|
||||
|
|
|
@ -16622,6 +16622,15 @@ No match, mark = X
|
|||
axy
|
||||
No match, mark = X
|
||||
|
||||
/(?^x-i)AB/
|
||||
Failed: error 194 at offset 4: invalid hyphen in option setting
|
||||
|
||||
/(?^-i)AB/
|
||||
Failed: error 194 at offset 3: invalid hyphen in option setting
|
||||
|
||||
/(?x-i-i)/
|
||||
Failed: error 194 at offset 5: invalid hyphen in option setting
|
||||
|
||||
# End of testinput2
|
||||
Error -70: PCRE2_ERROR_BADDATA (unknown error number)
|
||||
Error -62: bad serialized data
|
||||
|
|