Fix overrun bug in recent property name parsing change

Philip Hazel 2022-01-14 12:24:23 +00:00
parent 360a84e80b
commit 504ff06fff
4 changed files with 19 additions and 2 deletions

@@ -40,7 +40,7 @@ is NULL and the length is zero, treat as an empty string. Apparently a number
 of applications treat NULL/0 in this way.
 14. Added support for Bidi_Class and a number of binary Unicode properties,
 including Bidi_Control.
 15. Fix some minor issues raised by clang sanitize.
@@ -61,6 +61,10 @@ including Bidi_Control.
 (d) The standard Unicode 4-letter abbreviations for script names are now
 recognized.
+(e) In accordance with Unicode and Perl's "loose matching" rules, spaces,
+hyphens, and underscores are ignored in property names, which are then
+matched independent of case.
 18. The Python scripts in the maint directory have been refactored. There are
 now three scripts that generate pcre2_ucd.c, pcre2_ucp.h, and pcre2_ucptables.c

@@ -2115,7 +2115,11 @@ if (c == CHAR_LEFT_CURLY_BRACKET)
 {
 if (ptr >= cb->end_pattern) goto ERROR_RETURN;
 c = *ptr++;
-while (c == '_' || c == '-' || isspace(c)) c = *ptr++;
+while (c == '_' || c == '-' || isspace(c))
+  {
+  if (ptr >= cb->end_pattern) goto ERROR_RETURN;
+  c = *ptr++;
+  }
 if (c == CHAR_NUL) goto ERROR_RETURN;
 if (c == CHAR_RIGHT_CURLY_BRACKET) break;
 name[i] = tolower(c);
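
For illustration only, the pattern in this hunk can be sketched as a standalone function. This is not PCRE2 code: read_property_name, its parameters, and its -1 error return are hypothetical stand-ins for the surrounding loop, cb->end_pattern, and ERROR_RETURN. The point it shows is that every read inside the separator-skipping loop needs its own end-of-pattern check; otherwise a pattern that ends in '_', '-', or white space is read past its end.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper, not part of PCRE2. Copies the property name that
   follows '{' into name[], skipping '_', '-', and white space and folding
   to lower case. [ptr, end) is the rest of the pattern. Returns the name
   length, or -1 if the pattern ends before '}' or the name does not fit. */
static int read_property_name(const char *ptr, const char *end,
                              char *name, size_t name_size)
{
  size_t i = 0;
  for (;;)
    {
    unsigned char c;
    if (ptr >= end) return -1;
    c = (unsigned char)*ptr++;
    while (c == '_' || c == '-' || isspace(c))
      {
      if (ptr >= end) return -1;    /* the check this commit adds */
      c = (unsigned char)*ptr++;
      }
    if (c == '\0') return -1;
    if (c == '}') break;
    if (i + 1 >= name_size) return -1;  /* keep room for the terminator */
    name[i++] = (char)tolower(c);
    }
  name[i] = '\0';
  return (int)i;
}
```

Under these assumptions, "Soft_Dotted}" yields the name "softdotted", while a pattern tail such as "sd_" with no closing brace is rejected instead of being read beyond its last character.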

testdata/testinput5

@@ -2207,5 +2207,9 @@
 /\p{soft dotted}\p{sd}/g,utf
 >AF23<>\x{1df1a}\x{69}<>yz<
+# ------------------------------------------------
+/\p{\2b[:xäigi:t:_/
 # End of testinput5

@@ -5010,5 +5010,10 @@ Failed: error 147 at offset 8: unknown property after \P or \p
 /\p{soft dotted}\p{sd}/g,utf
 >AF23<>\x{1df1a}\x{69}<>yz<
 0: \x{1df1a}i
+# ------------------------------------------------
+/\p{\2b[:xäigi:t:_/
+Failed: error 146 at offset 17: malformed \P or \p sequence
 # End of testinput5