Implement PCRE2_EXTRA_BAD_ESCAPE_IS_LITERAL.
parent c0902e176f
commit e3a0f22349
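The parser change in this commit follows one pattern: remember the position just after the backslash, and if escape processing fails while PCRE2_EXTRA_BAD_ESCAPE_IS_LITERAL is set, rewind to that position and take the next character as a literal (or a lone backslash at the end of the pattern). The following is a minimal standalone sketch of that idea, not PCRE2 code. The names `toy_check_escape` and `toy_parse` are hypothetical, and the toy checker recognizes only \n and \t.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the real escape checker. It accepts only \n and
\t, stores the resulting character in *c, and advances *pp past the escape
letter. Returns 0 on success or a nonzero error code. */

static int toy_check_escape(const char **pp, const char *end, int *c)
{
  const char *p = *pp;
  if (p >= end) return 1;          /* Backslash at end of pattern */
  switch (*p)
    {
    case 'n': *c = '\n'; break;
    case 't': *c = '\t'; break;
    default: return 2;             /* Unrecognized escape */
    }
  *pp = p + 1;
  return 0;
}

/* Expand a pattern into the literal characters it denotes. On a bad escape,
either fail (return -1) or, when bad_escape_is_literal is nonzero, rewind to
just after the backslash and take the next character literally. Returns the
number of characters written, or -1 on error or if out is too small. */

static int toy_parse(const char *pattern, int bad_escape_is_literal,
  char *out, size_t outsize)
{
  const char *ptr = pattern;
  const char *end = pattern + strlen(pattern);
  size_t n = 0;

  assert(out != NULL && outsize > 0);

  while (ptr < end)
    {
    int c = (unsigned char)*ptr++;
    if (c == '\\')
      {
      const char *tempptr = ptr;   /* Position just after the backslash */
      if (toy_check_escape(&ptr, end, &c) != 0)
        {
        if (!bad_escape_is_literal) return -1;
        ptr = tempptr;             /* Rewind, as the commit does */
        if (ptr >= end) c = '\\'; else c = (unsigned char)*ptr++;
        }
      }
    if (n + 1 >= outsize) return -1;
    out[n++] = (char)c;
    }

  out[n] = 0;
  return (int)n;
}
```

Under these assumptions, `toy_parse("\\j\\x{2z}", 1, buf, sizeof buf)` fills `buf` with `jx{2z}`, which mirrors the documented behaviour for \j and \x{2z}, while the same call with the flag set to 0 returns -1.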
|
@ -27,10 +27,11 @@ DESCRIPTION
|
||||||
</b><br>
|
</b><br>
|
||||||
<P>
|
<P>
|
||||||
This function sets additional option bits for <b>pcre2_compile()</b> that are
|
This function sets additional option bits for <b>pcre2_compile()</b> that are
|
||||||
housed in a compile context. It completely replaces all the bits. The extra
|
housed in a compile context. It completely replaces all the bits. The extra
|
||||||
options are:
|
options are:
|
||||||
<pre>
|
<pre>
|
||||||
PCRE2_EXTRA_ALLOW_SURROGATE_ESCAPES Allow \x{d800} to \x{dfff} in UTF-8 and UTF-32 modes
|
PCRE2_EXTRA_ALLOW_SURROGATE_ESCAPES Allow \x{d800} to \x{dfff} in UTF-8 and UTF-32 modes
|
||||||
|
PCRE2_EXTRA_BAD_ESCAPE_IS_LITERAL Treat all invalid escapes as a literal following character
|
||||||
</pre>
|
</pre>
|
||||||
There is a complete description of the PCRE2 native API in the
|
There is a complete description of the PCRE2 native API in the
|
||||||
<a href="pcre2api.html"><b>pcre2api</b></a>
|
<a href="pcre2api.html"><b>pcre2api</b></a>
|
||||||
|
|
|
@ -1706,6 +1706,24 @@ If the extra option PCRE2_EXTRA_ALLOW_SURROGATE_ESCAPES is set, surrogate code
|
||||||
point values in UTF-8 and UTF-32 patterns no longer provoke errors and are
|
point values in UTF-8 and UTF-32 patterns no longer provoke errors and are
|
||||||
incorporated in the compiled pattern. However, they can only match subject
|
incorporated in the compiled pattern. However, they can only match subject
|
||||||
characters if the matching function is called with PCRE2_NO_UTF_CHECK set.
|
characters if the matching function is called with PCRE2_NO_UTF_CHECK set.
|
||||||
|
<pre>
|
||||||
|
PCRE2_EXTRA_BAD_ESCAPE_IS_LITERAL
|
||||||
|
</pre>
|
||||||
|
This is a dangerous option. Use with care. By default, an unrecognized escape
|
||||||
|
such as \j or a malformed one such as \x{2z} causes a compile-time error when
|
||||||
|
detected by <b>pcre2_compile()</b>. Perl is somewhat inconsistent in handling
|
||||||
|
such items: for example, \j is treated as a literal "j", and non-hexadecimal
|
||||||
|
digits in \x{} are just ignored, though warnings are given in both cases if
|
||||||
|
Perl's warning switch is enabled. However, a malformed octal number after \o{
|
||||||
|
always causes an error in Perl.
|
||||||
|
</P>
|
||||||
|
<P>
|
||||||
|
If the PCRE2_EXTRA_BAD_ESCAPE_IS_LITERAL extra option is passed to
|
||||||
|
<b>pcre2_compile()</b>, all unrecognized or erroneous escape sequences are
|
||||||
|
treated as single-character escapes. For example, \j is a literal "j" and
|
||||||
|
\x{2z} is treated as the literal string "x{2z}". Setting this option means
|
||||||
|
that typos in patterns may go undetected and have unexpected results. This is a
|
||||||
|
dangerous option. Use with care.
|
||||||
</P>
|
</P>
|
||||||
<br><a name="SEC20" href="#TOC1">COMPILATION ERROR CODES</a><br>
|
<br><a name="SEC20" href="#TOC1">COMPILATION ERROR CODES</a><br>
|
||||||
<P>
|
<P>
|
||||||
|
@ -3471,7 +3489,7 @@ Cambridge, England.
|
||||||
</P>
|
</P>
|
||||||
<br><a name="SEC42" href="#TOC1">REVISION</a><br>
|
<br><a name="SEC42" href="#TOC1">REVISION</a><br>
|
||||||
<P>
|
<P>
|
||||||
Last updated: 30 May 2017
|
Last updated: 01 June 2017
|
||||||
<br>
|
<br>
|
||||||
Copyright © 1997-2017 University of Cambridge.
|
Copyright © 1997-2017 University of Cambridge.
|
||||||
<br>
|
<br>
|
||||||
|
|
|
@ -577,6 +577,7 @@ for a description of the effects of these options.
|
||||||
alt_verbnames set PCRE2_ALT_VERBNAMES
|
alt_verbnames set PCRE2_ALT_VERBNAMES
|
||||||
anchored set PCRE2_ANCHORED
|
anchored set PCRE2_ANCHORED
|
||||||
auto_callout set PCRE2_AUTO_CALLOUT
|
auto_callout set PCRE2_AUTO_CALLOUT
|
||||||
|
bad_escape_is_literal set PCRE2_EXTRA_BAD_ESCAPE_IS_LITERAL
|
||||||
/i caseless set PCRE2_CASELESS
|
/i caseless set PCRE2_CASELESS
|
||||||
dollar_endonly set PCRE2_DOLLAR_ENDONLY
|
dollar_endonly set PCRE2_DOLLAR_ENDONLY
|
||||||
/s dotall set PCRE2_DOTALL
|
/s dotall set PCRE2_DOTALL
|
||||||
|
@ -1816,7 +1817,7 @@ Cambridge, England.
|
||||||
</P>
|
</P>
|
||||||
<br><a name="SEC21" href="#TOC1">REVISION</a><br>
|
<br><a name="SEC21" href="#TOC1">REVISION</a><br>
|
||||||
<P>
|
<P>
|
||||||
Last updated: 26 May 2017
|
Last updated: 01 June 2017
|
||||||
<br>
|
<br>
|
||||||
Copyright © 1997-2017 University of Cambridge.
|
Copyright © 1997-2017 University of Cambridge.
|
||||||
<br>
|
<br>
|
||||||
|
|
|
@ -1688,6 +1688,24 @@ COMPILING A PATTERN
|
||||||
only match subject characters if the matching function is called with
|
only match subject characters if the matching function is called with
|
||||||
PCRE2_NO_UTF_CHECK set.
|
PCRE2_NO_UTF_CHECK set.
|
||||||
|
|
||||||
|
PCRE2_EXTRA_BAD_ESCAPE_IS_LITERAL
|
||||||
|
|
||||||
|
This is a dangerous option. Use with care. By default, an unrecognized
|
||||||
|
escape such as \j or a malformed one such as \x{2z} causes a compile-
|
||||||
|
time error when detected by pcre2_compile(). Perl is somewhat inconsis-
|
||||||
|
tent in handling such items: for example, \j is treated as a literal
|
||||||
|
"j", and non-hexadecimal digits in \x{} are just ignored, though warn-
|
||||||
|
ings are given in both cases if Perl's warning switch is enabled. How-
|
||||||
|
ever, a malformed octal number after \o{ always causes an error in
|
||||||
|
Perl.
|
||||||
|
|
||||||
|
If the PCRE2_EXTRA_BAD_ESCAPE_IS_LITERAL extra option is passed to
|
||||||
|
pcre2_compile(), all unrecognized or erroneous escape sequences are
|
||||||
|
treated as single-character escapes. For example, \j is a literal "j"
|
||||||
|
and \x{2z} is treated as the literal string "x{2z}". Setting this
|
||||||
|
option means that typos in patterns may go undetected and have unex-
|
||||||
|
pected results. This is a dangerous option. Use with care.
|
||||||
|
|
||||||
|
|
||||||
COMPILATION ERROR CODES
|
COMPILATION ERROR CODES
|
||||||
|
|
||||||
|
@ -3350,7 +3368,7 @@ AUTHOR
|
||||||
|
|
||||||
REVISION
|
REVISION
|
||||||
|
|
||||||
Last updated: 30 May 2017
|
Last updated: 01 June 2017
|
||||||
Copyright (c) 1997-2017 University of Cambridge.
|
Copyright (c) 1997-2017 University of Cambridge.
|
||||||
------------------------------------------------------------------------------
|
------------------------------------------------------------------------------
|
||||||
|
|
||||||
|
|
|
@ -1,4 +1,4 @@
|
||||||
.TH PCRE2_SET_MAX_PATTERN_LENGTH 3 "17 May 2017" "PCRE2 10.30"
|
.TH PCRE2_SET_MAX_PATTERN_LENGTH 3 "01 June 2017" "PCRE2 10.30"
|
||||||
.SH NAME
|
.SH NAME
|
||||||
PCRE2 - Perl-compatible regular expressions (revised API)
|
PCRE2 - Perl-compatible regular expressions (revised API)
|
||||||
.SH SYNOPSIS
|
.SH SYNOPSIS
|
||||||
|
@ -15,12 +15,15 @@ PCRE2 - Perl-compatible regular expressions (revised API)
|
||||||
.rs
|
.rs
|
||||||
.sp
|
.sp
|
||||||
This function sets additional option bits for \fBpcre2_compile()\fP that are
|
This function sets additional option bits for \fBpcre2_compile()\fP that are
|
||||||
housed in a compile context. It completely replaces all the bits. The extra
|
housed in a compile context. It completely replaces all the bits. The extra
|
||||||
options are:
|
options are:
|
||||||
.sp
|
.sp
|
||||||
.\" JOIN
|
.\" JOIN
|
||||||
PCRE2_EXTRA_ALLOW_SURROGATE_ESCAPES Allow \ex{d800} to \ex{dfff}
|
PCRE2_EXTRA_ALLOW_SURROGATE_ESCAPES Allow \ex{d800} to \ex{dfff}
|
||||||
in UTF-8 and UTF-32 modes
|
in UTF-8 and UTF-32 modes
|
||||||
|
.\" JOIN
|
||||||
|
PCRE2_EXTRA_BAD_ESCAPE_IS_LITERAL Treat all invalid escapes as
|
||||||
|
a literal following character
|
||||||
.sp
|
.sp
|
||||||
There is a complete description of the PCRE2 native API in the
|
There is a complete description of the PCRE2 native API in the
|
||||||
.\" HREF
|
.\" HREF
|
||||||
|
|
|
@ -1,4 +1,4 @@
|
||||||
.TH PCRE2API 3 "30 May 2017" "PCRE2 10.30"
|
.TH PCRE2API 3 "01 June 2017" "PCRE2 10.30"
|
||||||
.SH NAME
|
.SH NAME
|
||||||
PCRE2 - Perl-compatible regular expressions (revised API)
|
PCRE2 - Perl-compatible regular expressions (revised API)
|
||||||
.sp
|
.sp
|
||||||
|
@ -1661,6 +1661,23 @@ If the extra option PCRE2_EXTRA_ALLOW_SURROGATE_ESCAPES is set, surrogate code
|
||||||
point values in UTF-8 and UTF-32 patterns no longer provoke errors and are
|
point values in UTF-8 and UTF-32 patterns no longer provoke errors and are
|
||||||
incorporated in the compiled pattern. However, they can only match subject
|
incorporated in the compiled pattern. However, they can only match subject
|
||||||
characters if the matching function is called with PCRE2_NO_UTF_CHECK set.
|
characters if the matching function is called with PCRE2_NO_UTF_CHECK set.
|
||||||
|
.sp
|
||||||
|
PCRE2_EXTRA_BAD_ESCAPE_IS_LITERAL
|
||||||
|
.sp
|
||||||
|
This is a dangerous option. Use with care. By default, an unrecognized escape
|
||||||
|
such as \ej or a malformed one such as \ex{2z} causes a compile-time error when
|
||||||
|
detected by \fBpcre2_compile()\fP. Perl is somewhat inconsistent in handling
|
||||||
|
such items: for example, \ej is treated as a literal "j", and non-hexadecimal
|
||||||
|
digits in \ex{} are just ignored, though warnings are given in both cases if
|
||||||
|
Perl's warning switch is enabled. However, a malformed octal number after \eo{
|
||||||
|
always causes an error in Perl.
|
||||||
|
.P
|
||||||
|
If the PCRE2_EXTRA_BAD_ESCAPE_IS_LITERAL extra option is passed to
|
||||||
|
\fBpcre2_compile()\fP, all unrecognized or erroneous escape sequences are
|
||||||
|
treated as single-character escapes. For example, \ej is a literal "j" and
|
||||||
|
\ex{2z} is treated as the literal string "x{2z}". Setting this option means
|
||||||
|
that typos in patterns may go undetected and have unexpected results. This is a
|
||||||
|
dangerous option. Use with care.
|
||||||
.
|
.
|
||||||
.
|
.
|
||||||
.SH "COMPILATION ERROR CODES"
|
.SH "COMPILATION ERROR CODES"
|
||||||
|
@ -3491,6 +3508,6 @@ Cambridge, England.
|
||||||
.rs
|
.rs
|
||||||
.sp
|
.sp
|
||||||
.nf
|
.nf
|
||||||
Last updated: 30 May 2017
|
Last updated: 01 June 2017
|
||||||
Copyright (c) 1997-2017 University of Cambridge.
|
Copyright (c) 1997-2017 University of Cambridge.
|
||||||
.fi
|
.fi
|
||||||
|
|
|
@ -1,4 +1,4 @@
|
||||||
.TH PCRE2TEST 1 "26 May 2017" "PCRE 10.30"
|
.TH PCRE2TEST 1 "01 June 2017" "PCRE 10.30"
|
||||||
.SH NAME
|
.SH NAME
|
||||||
pcre2test - a program for testing Perl-compatible regular expressions.
|
pcre2test - a program for testing Perl-compatible regular expressions.
|
||||||
.SH SYNOPSIS
|
.SH SYNOPSIS
|
||||||
|
@ -539,6 +539,7 @@ for a description of the effects of these options.
|
||||||
alt_verbnames set PCRE2_ALT_VERBNAMES
|
alt_verbnames set PCRE2_ALT_VERBNAMES
|
||||||
anchored set PCRE2_ANCHORED
|
anchored set PCRE2_ANCHORED
|
||||||
auto_callout set PCRE2_AUTO_CALLOUT
|
auto_callout set PCRE2_AUTO_CALLOUT
|
||||||
|
bad_escape_is_literal set PCRE2_EXTRA_BAD_ESCAPE_IS_LITERAL
|
||||||
/i caseless set PCRE2_CASELESS
|
/i caseless set PCRE2_CASELESS
|
||||||
dollar_endonly set PCRE2_DOLLAR_ENDONLY
|
dollar_endonly set PCRE2_DOLLAR_ENDONLY
|
||||||
/s dotall set PCRE2_DOTALL
|
/s dotall set PCRE2_DOTALL
|
||||||
|
@ -1792,6 +1793,6 @@ Cambridge, England.
|
||||||
.rs
|
.rs
|
||||||
.sp
|
.sp
|
||||||
.nf
|
.nf
|
||||||
Last updated: 26 May 2017
|
Last updated: 01 June 2017
|
||||||
Copyright (c) 1997-2017 University of Cambridge.
|
Copyright (c) 1997-2017 University of Cambridge.
|
||||||
.fi
|
.fi
|
||||||
|
|
|
@ -521,6 +521,7 @@ PATTERN MODIFIERS
|
||||||
alt_verbnames set PCRE2_ALT_VERBNAMES
|
alt_verbnames set PCRE2_ALT_VERBNAMES
|
||||||
anchored set PCRE2_ANCHORED
|
anchored set PCRE2_ANCHORED
|
||||||
auto_callout set PCRE2_AUTO_CALLOUT
|
auto_callout set PCRE2_AUTO_CALLOUT
|
||||||
|
bad_escape_is_literal set PCRE2_EXTRA_BAD_ESCAPE_IS_LITERAL
|
||||||
/i caseless set PCRE2_CASELESS
|
/i caseless set PCRE2_CASELESS
|
||||||
dollar_endonly set PCRE2_DOLLAR_ENDONLY
|
dollar_endonly set PCRE2_DOLLAR_ENDONLY
|
||||||
/s dotall set PCRE2_DOTALL
|
/s dotall set PCRE2_DOTALL
|
||||||
|
@ -1650,5 +1651,5 @@ AUTHOR
|
||||||
|
|
||||||
REVISION
|
REVISION
|
||||||
|
|
||||||
Last updated: 26 May 2017
|
Last updated: 01 June 2017
|
||||||
Copyright (c) 1997-2017 University of Cambridge.
|
Copyright (c) 1997-2017 University of Cambridge.
|
||||||
|
|
|
@ -142,6 +142,7 @@ D is inspected during pcre2_dfa_match() execution
|
||||||
/* An additional compile options word is available in the compile context. */
|
/* An additional compile options word is available in the compile context. */
|
||||||
|
|
||||||
#define PCRE2_EXTRA_ALLOW_SURROGATE_ESCAPES 0x00000001u /* C */
|
#define PCRE2_EXTRA_ALLOW_SURROGATE_ESCAPES 0x00000001u /* C */
|
||||||
|
#define PCRE2_EXTRA_BAD_ESCAPE_IS_LITERAL 0x00000002u /* C */
|
||||||
|
|
||||||
/* These are for pcre2_jit_compile(). */
|
/* These are for pcre2_jit_compile(). */
|
||||||
|
|
||||||
|
|
|
@ -142,6 +142,7 @@ D is inspected during pcre2_dfa_match() execution
|
||||||
/* An additional compile options word is available in the compile context. */
|
/* An additional compile options word is available in the compile context. */
|
||||||
|
|
||||||
#define PCRE2_EXTRA_ALLOW_SURROGATE_ESCAPES 0x00000001u /* C */
|
#define PCRE2_EXTRA_ALLOW_SURROGATE_ESCAPES 0x00000001u /* C */
|
||||||
|
#define PCRE2_EXTRA_BAD_ESCAPE_IS_LITERAL 0x00000002u /* C */
|
||||||
|
|
||||||
/* These are for pcre2_jit_compile(). */
|
/* These are for pcre2_jit_compile(). */
|
||||||
|
|
||||||
|
|
|
@ -2591,11 +2591,23 @@ while (ptr < ptrend)
|
||||||
/* ---- Escape sequence ---- */
|
/* ---- Escape sequence ---- */
|
||||||
|
|
||||||
case CHAR_BACKSLASH:
|
case CHAR_BACKSLASH:
|
||||||
|
tempptr = ptr;
|
||||||
escape = PRIV(check_escape)(&ptr, ptrend, &c, &errorcode, options,
|
escape = PRIV(check_escape)(&ptr, ptrend, &c, &errorcode, options,
|
||||||
FALSE, cb);
|
FALSE, cb);
|
||||||
if (errorcode != 0) goto FAILED;
|
if (errorcode != 0)
|
||||||
|
{
|
||||||
|
ESCAPE_FAILED:
|
||||||
|
if ((cb->cx->extra_options & PCRE2_EXTRA_BAD_ESCAPE_IS_LITERAL) == 0)
|
||||||
|
goto FAILED;
|
||||||
|
ptr = tempptr;
|
||||||
|
if (ptr >= ptrend) c = CHAR_BACKSLASH; else
|
||||||
|
{
|
||||||
|
GETCHARINCTEST(c, ptr); /* Get character value, increment pointer */
|
||||||
|
}
|
||||||
|
escape = 0; /* Treat as literal character */
|
||||||
|
}
|
||||||
|
|
||||||
/* The escape was a data character. */
|
/* The escape was a data escape or literal character. */
|
||||||
|
|
||||||
if (escape == 0)
|
if (escape == 0)
|
||||||
{
|
{
|
||||||
|
@ -2647,12 +2659,12 @@ while (ptr < ptrend)
|
||||||
case ESC_C:
|
case ESC_C:
|
||||||
#ifdef NEVER_BACKSLASH_C
|
#ifdef NEVER_BACKSLASH_C
|
||||||
errorcode = ERR85;
|
errorcode = ERR85;
|
||||||
goto FAILED;
|
goto ESCAPE_FAILED;
|
||||||
#else
|
#else
|
||||||
if ((options & PCRE2_NEVER_BACKSLASH_C) != 0)
|
if ((options & PCRE2_NEVER_BACKSLASH_C) != 0)
|
||||||
{
|
{
|
||||||
errorcode = ERR83;
|
errorcode = ERR83;
|
||||||
goto FAILED;
|
goto ESCAPE_FAILED;
|
||||||
}
|
}
|
||||||
#endif
|
#endif
|
||||||
okquantifier = TRUE;
|
okquantifier = TRUE;
|
||||||
|
@ -2662,7 +2674,7 @@ while (ptr < ptrend)
|
||||||
case ESC_X:
|
case ESC_X:
|
||||||
#ifndef SUPPORT_UNICODE
|
#ifndef SUPPORT_UNICODE
|
||||||
errorcode = ERR45; /* Supported only with Unicode support */
|
errorcode = ERR45; /* Supported only with Unicode support */
|
||||||
goto FAILED;
|
goto ESCAPE_FAILED;
|
||||||
#endif
|
#endif
|
||||||
case ESC_H:
|
case ESC_H:
|
||||||
case ESC_h:
|
case ESC_h:
|
||||||
|
@ -2727,7 +2739,7 @@ while (ptr < ptrend)
|
||||||
BOOL negated;
|
BOOL negated;
|
||||||
uint16_t ptype = 0, pdata = 0;
|
uint16_t ptype = 0, pdata = 0;
|
||||||
if (!get_ucp(&ptr, &negated, &ptype, &pdata, &errorcode, cb))
|
if (!get_ucp(&ptr, &negated, &ptype, &pdata, &errorcode, cb))
|
||||||
goto FAILED;
|
goto ESCAPE_FAILED;
|
||||||
if (negated) escape = (escape == ESC_P)? ESC_p : ESC_P;
|
if (negated) escape = (escape == ESC_P)? ESC_p : ESC_P;
|
||||||
*parsed_pattern++ = META_ESCAPE + escape;
|
*parsed_pattern++ = META_ESCAPE + escape;
|
||||||
*parsed_pattern++ = (ptype << 16) | pdata;
|
*parsed_pattern++ = (ptype << 16) | pdata;
|
||||||
|
@ -2735,7 +2747,7 @@ while (ptr < ptrend)
|
||||||
}
|
}
|
||||||
#else
|
#else
|
||||||
errorcode = ERR45;
|
errorcode = ERR45;
|
||||||
goto FAILED;
|
goto ESCAPE_FAILED;
|
||||||
#endif
|
#endif
|
||||||
break; /* End \P and \p */
|
break; /* End \P and \p */
|
||||||
|
|
||||||
|
@ -2751,7 +2763,7 @@ while (ptr < ptrend)
|
||||||
*ptr != CHAR_LESS_THAN_SIGN && *ptr != CHAR_APOSTROPHE))
|
*ptr != CHAR_LESS_THAN_SIGN && *ptr != CHAR_APOSTROPHE))
|
||||||
{
|
{
|
||||||
errorcode = (escape == ESC_g)? ERR57 : ERR69;
|
errorcode = (escape == ESC_g)? ERR57 : ERR69;
|
||||||
goto FAILED;
|
goto ESCAPE_FAILED;
|
||||||
}
|
}
|
||||||
terminator = (*ptr == CHAR_LESS_THAN_SIGN)?
|
terminator = (*ptr == CHAR_LESS_THAN_SIGN)?
|
||||||
CHAR_GREATER_THAN_SIGN : (*ptr == CHAR_APOSTROPHE)?
|
CHAR_GREATER_THAN_SIGN : (*ptr == CHAR_APOSTROPHE)?
|
||||||
|
@ -2769,18 +2781,18 @@ while (ptr < ptrend)
|
||||||
if (p >= ptrend || *p != terminator)
|
if (p >= ptrend || *p != terminator)
|
||||||
{
|
{
|
||||||
errorcode = ERR57;
|
errorcode = ERR57;
|
||||||
goto FAILED;
|
goto ESCAPE_FAILED;
|
||||||
}
|
}
|
||||||
ptr = p;
|
ptr = p;
|
||||||
goto SET_RECURSION;
|
goto SET_RECURSION;
|
||||||
}
|
}
|
||||||
if (errorcode != 0) goto FAILED;
|
if (errorcode != 0) goto ESCAPE_FAILED;
|
||||||
}
|
}
|
||||||
|
|
||||||
/* Not a numerical recursion */
|
/* Not a numerical recursion */
|
||||||
|
|
||||||
if (!read_name(&ptr, ptrend, terminator, &offset, &name, &namelen,
|
if (!read_name(&ptr, ptrend, terminator, &offset, &name, &namelen,
|
||||||
&errorcode, cb)) goto FAILED;
|
&errorcode, cb)) goto ESCAPE_FAILED;
|
||||||
|
|
||||||
/* \k and \g when used with braces are back references, whereas \g used
|
/* \k and \g when used with braces are back references, whereas \g used
|
||||||
with quotes or angle brackets is a recursion */
|
with quotes or angle brackets is a recursion */
|
||||||
|
@ -2792,7 +2804,7 @@ while (ptr < ptrend)
|
||||||
|
|
||||||
PUTOFFSET(offset, parsed_pattern);
|
PUTOFFSET(offset, parsed_pattern);
|
||||||
okquantifier = TRUE;
|
okquantifier = TRUE;
|
||||||
break;
|
break; /* End special escape processing */
|
||||||
}
|
}
|
||||||
break; /* End escape sequence processing */
|
break; /* End escape sequence processing */
|
||||||
|
|
||||||
|
@ -3139,10 +3151,23 @@ while (ptr < ptrend)
|
||||||
|
|
||||||
else
|
else
|
||||||
{
|
{
|
||||||
|
tempptr = ptr;
|
||||||
escape = PRIV(check_escape)(&ptr, ptrend, &c, &errorcode,
|
escape = PRIV(check_escape)(&ptr, ptrend, &c, &errorcode,
|
||||||
options, TRUE, cb);
|
options, TRUE, cb);
|
||||||
|
|
||||||
if (errorcode != 0) goto FAILED;
|
if (errorcode != 0)
|
||||||
|
{
|
||||||
|
CLASS_ESCAPE_FAILED:
|
||||||
|
if ((cb->cx->extra_options & PCRE2_EXTRA_BAD_ESCAPE_IS_LITERAL) == 0)
|
||||||
|
goto FAILED;
|
||||||
|
ptr = tempptr;
|
||||||
|
if (ptr >= ptrend) c = CHAR_BACKSLASH; else
|
||||||
|
{
|
||||||
|
GETCHARINCTEST(c, ptr); /* Get character value, increment pointer */
|
||||||
|
}
|
||||||
|
escape = 0; /* Treat as literal character */
|
||||||
|
}
|
||||||
|
|
||||||
if (escape == 0) /* Escaped character code point is in c */
|
if (escape == 0) /* Escaped character code point is in c */
|
||||||
{
|
{
|
||||||
char_is_literal = FALSE;
|
char_is_literal = FALSE;
|
||||||
|
@ -3176,7 +3201,7 @@ while (ptr < ptrend)
|
||||||
if (class_range_state == RANGE_STARTED)
|
if (class_range_state == RANGE_STARTED)
|
||||||
{
|
{
|
||||||
errorcode = ERR50;
|
errorcode = ERR50;
|
||||||
goto FAILED;
|
goto CLASS_ESCAPE_FAILED;
|
||||||
}
|
}
|
||||||
|
|
||||||
/* Of the remaining escapes, only those that define characters are
|
/* Of the remaining escapes, only those that define characters are
|
||||||
|
@ -3187,7 +3212,7 @@ while (ptr < ptrend)
|
||||||
{
|
{
|
||||||
case ESC_N:
|
case ESC_N:
|
||||||
errorcode = ERR71; /* Not supported in a class */
|
errorcode = ERR71; /* Not supported in a class */
|
||||||
goto FAILED;
|
goto CLASS_ESCAPE_FAILED;
|
||||||
|
|
||||||
case ESC_H:
|
case ESC_H:
|
||||||
case ESC_h:
|
case ESC_h:
|
||||||
|
@ -3250,13 +3275,14 @@ while (ptr < ptrend)
|
||||||
}
|
}
|
||||||
#else
|
#else
|
||||||
errorcode = ERR45;
|
errorcode = ERR45;
|
||||||
goto FAILED;
|
goto CLASS_ESCAPE_FAILED;
|
||||||
#endif
|
#endif
|
||||||
break; /* End \P and \p */
|
break; /* End \P and \p */
|
||||||
|
|
||||||
default: /* All others are not allowed in a class */
|
default: /* All others are not allowed in a class */
|
||||||
errorcode = ERR7;
|
errorcode = ERR7;
|
||||||
goto FAILED_BACK;
|
ptr--;
|
||||||
|
goto CLASS_ESCAPE_FAILED;
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
|
|
|
@ -402,9 +402,9 @@ typedef struct convertstruct {
|
||||||
static convertstruct convertlist[] = {
|
static convertstruct convertlist[] = {
|
||||||
{ "glob", PCRE2_CONVERT_GLOB },
|
{ "glob", PCRE2_CONVERT_GLOB },
|
||||||
{ "glob_basic", PCRE2_CONVERT_GLOB_BASIC },
|
{ "glob_basic", PCRE2_CONVERT_GLOB_BASIC },
|
||||||
{ "glob_ignore_dot_start", PCRE2_CONVERT_GLOB_IGNORE_DOT_START },
|
{ "glob_ignore_dot_start", PCRE2_CONVERT_GLOB_IGNORE_DOT_START },
|
||||||
{ "glob_no_starstar", PCRE2_CONVERT_GLOB_NO_STARSTAR },
|
{ "glob_no_starstar", PCRE2_CONVERT_GLOB_NO_STARSTAR },
|
||||||
{ "glob_no_wild_separator", PCRE2_CONVERT_GLOB_NO_WILD_SEPARATOR },
|
{ "glob_no_wild_separator", PCRE2_CONVERT_GLOB_NO_WILD_SEPARATOR },
|
||||||
{ "posix_basic", PCRE2_CONVERT_POSIX_BASIC },
|
{ "posix_basic", PCRE2_CONVERT_POSIX_BASIC },
|
||||||
{ "posix_extended", PCRE2_CONVERT_POSIX_EXTENDED },
|
{ "posix_extended", PCRE2_CONVERT_POSIX_EXTENDED },
|
||||||
{ "unset", CONVERT_UNSET }};
|
{ "unset", CONVERT_UNSET }};
|
||||||
|
@ -590,6 +590,7 @@ static modstruct modlist[] = {
|
||||||
{ "altglobal", MOD_PND, MOD_CTL, CTL_ALTGLOBAL, PO(control) },
|
{ "altglobal", MOD_PND, MOD_CTL, CTL_ALTGLOBAL, PO(control) },
|
||||||
{ "anchored", MOD_PD, MOD_OPT, PCRE2_ANCHORED, PD(options) },
|
{ "anchored", MOD_PD, MOD_OPT, PCRE2_ANCHORED, PD(options) },
|
||||||
{ "auto_callout", MOD_PAT, MOD_OPT, PCRE2_AUTO_CALLOUT, PO(options) },
|
{ "auto_callout", MOD_PAT, MOD_OPT, PCRE2_AUTO_CALLOUT, PO(options) },
|
||||||
|
{ "bad_escape_is_literal", MOD_CTC, MOD_OPT, PCRE2_EXTRA_BAD_ESCAPE_IS_LITERAL, CO(extra_options) },
|
||||||
{ "bincode", MOD_PAT, MOD_CTL, CTL_BINCODE, PO(control) },
|
{ "bincode", MOD_PAT, MOD_CTL, CTL_BINCODE, PO(control) },
|
||||||
{ "bsr", MOD_CTC, MOD_BSR, 0, CO(bsr_convention) },
|
{ "bsr", MOD_CTC, MOD_BSR, 0, CO(bsr_convention) },
|
||||||
{ "callout_capture", MOD_DAT, MOD_CTL, CTL_CALLOUT_CAPTURE, DO(control) },
|
{ "callout_capture", MOD_DAT, MOD_CTL, CTL_CALLOUT_CAPTURE, DO(control) },
|
||||||
|
@ -692,8 +693,8 @@ static modstruct modlist[] = {
|
||||||
#define POSIX_SUPPORTED_COMPILE_OPTIONS ( \
|
#define POSIX_SUPPORTED_COMPILE_OPTIONS ( \
|
||||||
PCRE2_CASELESS|PCRE2_DOTALL|PCRE2_MULTILINE|PCRE2_UCP|PCRE2_UTF| \
|
PCRE2_CASELESS|PCRE2_DOTALL|PCRE2_MULTILINE|PCRE2_UCP|PCRE2_UTF| \
|
||||||
PCRE2_UNGREEDY)
|
PCRE2_UNGREEDY)
|
||||||
|
|
||||||
#define POSIX_SUPPORTED_COMPILE_EXTRA_OPTIONS (0)
|
#define POSIX_SUPPORTED_COMPILE_EXTRA_OPTIONS (0)
|
||||||
|
|
||||||
#define POSIX_SUPPORTED_COMPILE_CONTROLS ( \
|
#define POSIX_SUPPORTED_COMPILE_CONTROLS ( \
|
||||||
CTL_AFTERTEXT|CTL_ALLAFTERTEXT|CTL_EXPAND|CTL_POSIX|CTL_POSIX_NOSUB)
|
CTL_AFTERTEXT|CTL_ALLAFTERTEXT|CTL_EXPAND|CTL_POSIX|CTL_POSIX_NOSUB)
|
||||||
|
@ -3701,7 +3702,7 @@ for (;;)
|
||||||
|
|
||||||
case MOD_CON: /* A convert type/options list */
|
case MOD_CON: /* A convert type/options list */
|
||||||
for (;; pp++)
|
for (;; pp++)
|
||||||
{
|
{
|
||||||
uint8_t *colon = (uint8_t *)strchr((const char *)pp, ':');
|
uint8_t *colon = (uint8_t *)strchr((const char *)pp, ':');
|
||||||
len = ((colon != NULL && colon < ep)? colon:ep) - pp;
|
len = ((colon != NULL && colon < ep)? colon:ep) - pp;
|
||||||
for (i = 0; i < convertlistcount; i++)
|
for (i = 0; i < convertlistcount; i++)
|
||||||
|
@ -4073,13 +4074,14 @@ Returns: nothing
|
||||||
*/
|
*/
|
||||||
|
|
||||||
static void
|
static void
|
||||||
show_compile_extra_options(uint32_t options, const char *before,
|
show_compile_extra_options(uint32_t options, const char *before,
|
||||||
const char *after)
|
const char *after)
|
||||||
{
|
{
|
||||||
if (options == 0) fprintf(outfile, "%s <none>%s", before, after);
|
if (options == 0) fprintf(outfile, "%s <none>%s", before, after);
|
||||||
else fprintf(outfile, "%s%s%s",
|
else fprintf(outfile, "%s%s%s%s",
|
||||||
before,
|
before,
|
||||||
((options & PCRE2_EXTRA_ALLOW_SURROGATE_ESCAPES) != 0)? " allow_surrogate_escapes" : "",
|
((options & PCRE2_EXTRA_ALLOW_SURROGATE_ESCAPES) != 0)? " allow_surrogate_escapes" : "",
|
||||||
|
((options & PCRE2_EXTRA_BAD_ESCAPE_IS_LITERAL) != 0)? " bad_escape_is_literal" : "",
|
||||||
after);
|
after);
|
||||||
}
|
}
|
||||||
|
|
||||||
|
@ -5225,14 +5227,14 @@ if ((pat_patctl.control & CTL_POSIX) != 0)
|
||||||
msg = "";
|
msg = "";
|
||||||
}
|
}
|
||||||
|
|
||||||
if ((FLD(pat_context, extra_options) &
|
if ((FLD(pat_context, extra_options) &
|
||||||
~POSIX_SUPPORTED_COMPILE_EXTRA_OPTIONS) != 0)
|
~POSIX_SUPPORTED_COMPILE_EXTRA_OPTIONS) != 0)
|
||||||
{
|
{
|
||||||
show_compile_extra_options(
|
show_compile_extra_options(
|
||||||
FLD(pat_context, extra_options) & ~POSIX_SUPPORTED_COMPILE_EXTRA_OPTIONS,
|
FLD(pat_context, extra_options) & ~POSIX_SUPPORTED_COMPILE_EXTRA_OPTIONS,
|
||||||
msg, "");
|
msg, "");
|
||||||
msg = "";
|
msg = "";
|
||||||
}
|
}
|
||||||
|
|
||||||
if ((pat_patctl.control & ~POSIX_SUPPORTED_COMPILE_CONTROLS) != 0 ||
|
if ((pat_patctl.control & ~POSIX_SUPPORTED_COMPILE_CONTROLS) != 0 ||
|
||||||
(pat_patctl.control2 & ~POSIX_SUPPORTED_COMPILE_CONTROLS2) != 0)
|
(pat_patctl.control2 & ~POSIX_SUPPORTED_COMPILE_CONTROLS2) != 0)
|
||||||
|
@ -5246,8 +5248,8 @@ if ((pat_patctl.control & CTL_POSIX) != 0)
|
||||||
if (FLD(pat_context, max_pattern_length) != PCRE2_UNSET)
|
if (FLD(pat_context, max_pattern_length) != PCRE2_UNSET)
|
||||||
prmsg(&msg, "max_pattern_length");
|
prmsg(&msg, "max_pattern_length");
|
||||||
if (FLD(pat_context, parens_nest_limit) != PARENS_NEST_DEFAULT)
|
if (FLD(pat_context, parens_nest_limit) != PARENS_NEST_DEFAULT)
|
||||||
prmsg(&msg, "parens_nest_limit");
|
prmsg(&msg, "parens_nest_limit");
|
||||||
|
|
||||||
if (msg[0] == 0) fprintf(outfile, "\n");
|
if (msg[0] == 0) fprintf(outfile, "\n");
|
||||||
|
|
||||||
/* Translate PCRE2 options to POSIX options and then compile. */
|
/* Translate PCRE2 options to POSIX options and then compile. */
|
||||||
|
@ -5413,7 +5415,7 @@ if (pat_patctl.convert_type != CONVERT_UNSET)
|
||||||
if (pat_patctl.convert_glob_escape != 0)
|
if (pat_patctl.convert_glob_escape != 0)
|
||||||
{
|
{
|
||||||
uint32_t escape = (pat_patctl.convert_glob_escape == '0')? 0 :
|
uint32_t escape = (pat_patctl.convert_glob_escape == '0')? 0 :
|
||||||
pat_patctl.convert_glob_escape;
|
pat_patctl.convert_glob_escape;
|
||||||
PCRE2_SET_GLOB_ESCAPE(rc, con_context, escape);
|
PCRE2_SET_GLOB_ESCAPE(rc, con_context, escape);
|
||||||
if (rc != 0)
|
if (rc != 0)
|
||||||
{
|
{
|
||||||
|
@ -7057,10 +7059,10 @@ else for (gmatched = 0;; gmatched++)
|
||||||
if ((dat_datctl.control & CTL_DFA) == 0 &&
|
if ((dat_datctl.control & CTL_DFA) == 0 &&
|
||||||
(FLD(compiled_code, executable_jit) == NULL ||
|
(FLD(compiled_code, executable_jit) == NULL ||
|
||||||
(dat_datctl.options & PCRE2_NO_JIT) != 0))
|
(dat_datctl.options & PCRE2_NO_JIT) != 0))
|
||||||
{
|
{
|
||||||
(void)check_match_limit(pp, arg_ulen, PCRE2_ERROR_HEAPLIMIT, "heap");
|
(void)check_match_limit(pp, arg_ulen, PCRE2_ERROR_HEAPLIMIT, "heap");
|
||||||
}
|
}
|
||||||
|
|
||||||
capcount = check_match_limit(pp, arg_ulen, PCRE2_ERROR_MATCHLIMIT,
|
capcount = check_match_limit(pp, arg_ulen, PCRE2_ERROR_MATCHLIMIT,
|
||||||
"match");
|
"match");
|
||||||
|
|
||||||
|
|
|
@ -5279,4 +5279,17 @@ a)"xI
|
||||||
|
|
||||||
/(a)(?-n:(b))(c)/nB
|
/(a)(?-n:(b))(c)/nB
|
||||||
|
|
||||||
|
# ----------------------------------------------------------------------
|
||||||
|
# These test the dangerous PCRE2_EXTRA_BAD_ESCAPE_IS_LITERAL option.
|
||||||
|
|
||||||
|
/\j\x{z}\o{82}\L\uabcd\u\U\g{\g/B,\bad_escape_is_literal
|
||||||
|
|
||||||
|
/\N{\c/B,bad_escape_is_literal
|
||||||
|
|
||||||
|
/[\j\x{z}\o\gA-\Nb-\g]/B,bad_escape_is_literal
|
||||||
|
|
||||||
|
/[Q-\N]/B,bad_escape_is_literal
|
||||||
|
|
||||||
|
# ----------------------------------------------------------------------
|
||||||
|
|
||||||
# End of testinput2
|
# End of testinput2
|
||||||
|
|
|
@ -2015,6 +2015,13 @@
|
||||||
\= Expect no match
|
\= Expect no match
|
||||||
X$
|
X$
|
||||||
|
|
||||||
# ---------------------------------------------------------------------------
|
# ----------------------------------------------------------------------
|
||||||
|
# These test the dangerous PCRE2_EXTRA_BAD_ESCAPE_IS_LITERAL option.
|
||||||
|
|
||||||
|
/\x{d800}/B,utf,bad_escape_is_literal
|
||||||
|
|
||||||
|
/\ud800/B,utf,alt_bsux,bad_escape_is_literal
|
||||||
|
|
||||||
|
# ----------------------------------------------------------------------
|
||||||
|
|
||||||
# End of testinput5
|
# End of testinput5
|
||||||
|
|
|
@ -15988,6 +15988,33 @@ Subject length lower bound = 1
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
|
|
||||||
|
# ----------------------------------------------------------------------
|
||||||
|
# These test the dangerous PCRE2_EXTRA_BAD_ESCAPE_IS_LITERAL option.
|
||||||
|
|
||||||
|
/\j\x{z}\o{82}\L\uabcd\u\U\g{\g/B,\bad_escape_is_literal
|
||||||
|
** Unrecognized modifier '\' in '\bad_escape_is_literal'
|
||||||
|
|
||||||
|
/\N{\c/B,bad_escape_is_literal
|
||||||
|
------------------------------------------------------------------
|
||||||
|
Bra
|
||||||
|
N{c
|
||||||
|
Ket
|
||||||
|
End
|
||||||
|
------------------------------------------------------------------
|
||||||
|
|
||||||
|
/[\j\x{z}\o\gA-\Nb-\g]/B,bad_escape_is_literal
|
||||||
|
------------------------------------------------------------------
|
||||||
|
Bra
|
||||||
|
[A-Nb-gjoxz{}]
|
||||||
|
Ket
|
||||||
|
End
|
||||||
|
------------------------------------------------------------------
|
||||||
|
|
||||||
|
/[Q-\N]/B,bad_escape_is_literal
|
||||||
|
Failed: error 108 at offset 4: range out of order in character class
|
||||||
|
|
||||||
|
# ----------------------------------------------------------------------
|
||||||
|
|
||||||
# End of testinput2
|
# End of testinput2
|
||||||
Error -65: PCRE2_ERROR_BADDATA (unknown error number)
|
Error -65: PCRE2_ERROR_BADDATA (unknown error number)
|
||||||
Error -62: bad serialized data
|
Error -62: bad serialized data
|
||||||
|
|
|
@ -4579,6 +4579,25 @@ No match
|
||||||
X$
|
X$
|
||||||
No match
|
No match
|
||||||
|
|
||||||
# ---------------------------------------------------------------------------
|
# ----------------------------------------------------------------------
|
||||||
|
# These test the dangerous PCRE2_EXTRA_BAD_ESCAPE_IS_LITERAL option.
|
||||||
|
|
||||||
|
/\x{d800}/B,utf,bad_escape_is_literal
|
||||||
|
------------------------------------------------------------------
|
||||||
|
Bra
|
||||||
|
x{d800}
|
||||||
|
Ket
|
||||||
|
End
|
||||||
|
------------------------------------------------------------------
|
||||||
|
|
||||||
|
/\ud800/B,utf,alt_bsux,bad_escape_is_literal
|
||||||
|
------------------------------------------------------------------
|
||||||
|
Bra
|
||||||
|
ud800
|
||||||
|
Ket
|
||||||
|
End
|
||||||
|
------------------------------------------------------------------
|
||||||
|
|
||||||
|
# ----------------------------------------------------------------------
|
||||||
|
|
||||||
# End of testinput5
|
# End of testinput5
|
||||||
|
|