Fix [:punct:] bug in UCP mode (matching chars in the range 128-255)

Philip.Hazel 2015-11-17 17:59:35 +00:00
parent 6650a2fd9a
commit c4b8531a8f
4 changed files with 11 additions and 1 deletions

@@ -318,6 +318,9 @@ with JIT (possibly caused by SSE2?).
 by a single ASCII character in a class item, was incorrectly compiled in UCP
 mode. The POSIX class got lost, but only if the single character followed it.
+96. [:punct:] in UCP mode was matching some characters in the range 128-255
+that should not have been matched.
 Version 10.20 30-June-2015
 --------------------------

@@ -247,7 +247,7 @@ while ((t = *data++) != XCL_END)
 case PT_PXPUNCT:
 if ((PRIV(ucp_gentype)[prop->chartype] == ucp_P ||
-      (c < 256 && PRIV(ucp_gentype)[prop->chartype] == ucp_S)) == isprop)
+      (c < 128 && PRIV(ucp_gentype)[prop->chartype] == ucp_S)) == isprop)
   return !negated;
 break;
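
Note on the hunk above: the check accepts a character as [:punct:] when its general category is P, or when it is S and the code point is below the limit. A minimal standalone sketch of that predicate follows. It is not PCRE2 code: `gen_cat`, `category_of` and `is_pxpunct` are hypothetical names, and the category table is hard-coded for a handful of code points only. Passing 256 as the limit models the old behaviour, 128 the fixed one.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified category groups (not PCRE2's ucp_* values). */
typedef enum { GC_PUNCT, GC_SYMBOL, GC_OTHER } gen_cat;

/* Hypothetical lookup covering only a few code points, hard-coded from
   their Unicode general categories; everything else is GC_OTHER. */
static gen_cat category_of(uint32_t c)
{
  switch (c)
    {
    case 0x21: return GC_PUNCT;   /* U+0021 '!', category Po */
    case 0x24: return GC_SYMBOL;  /* U+0024 '$', category Sc */
    case 0x2B: return GC_SYMBOL;  /* U+002B '+', category Sm */
    case 0xA1: return GC_PUNCT;   /* U+00A1 inverted '!', category Po */
    case 0xB4: return GC_SYMBOL;  /* U+00B4 acute accent, category Sk */
    default:   return GC_OTHER;
    }
}

/* symbol_limit is the exclusive upper bound below which a Symbol (S)
   character also counts as punctuation. */
static bool is_pxpunct(uint32_t c, uint32_t symbol_limit)
{
  gen_cat cat = category_of(c);
  return cat == GC_PUNCT || (c < symbol_limit && cat == GC_SYMBOL);
}
```

With these assumptions, `is_pxpunct(0xB4, 256)` is true and `is_pxpunct(0xB4, 128)` is false, which corresponds to the `\x{b4}` test case added in this commit. ASCII symbols such as `$` and non-ASCII punctuation such as U+00A1 are accepted under either limit.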

testdata/testinput4

@@ -2233,4 +2233,7 @@
 /[^\p{Any}]*+x/utf
     x
+/[[:punct:]]/utf,ucp
+    \x{b4}
 # End of testinput4

@@ -3620,4 +3620,8 @@ No match
     x
  0: x
+/[[:punct:]]/utf,ucp
+    \x{b4}
+No match
 # End of testinput4