Fix "running for ever" bug for deeply nested [: sequences.

This commit is contained in:
Philip.Hazel 2015-07-21 13:42:14 +00:00
parent 31241914a5
commit 01c4647b02
4 changed files with 18 additions and 13 deletions

View File

@ -58,6 +58,10 @@ compiled and could cause reading from uninitialized memory or an incorrect
error diagnosis. Examples are: /[[:\\](?<[::]/ and /[[:\\](?'abc')[a:]. The
first of these bugs was discovered by Karl Skomski with the LLVM fuzzer.
16. Pathological patterns containing many nested occurrences of [: caused
pcre2_compile() to run for a very long time. This bug was found by the LLVM
fuzzer.
Version 10.20 30-June-2015

View File

@ -2583,7 +2583,9 @@ when Perl does, I think.
A user pointed out that PCRE was rejecting [:a[:digit:]] whereas Perl was not.
It seems that the appearance of a nested POSIX class supersedes an apparent
external class. For example, [:a[:digit:]b:] matches "a", "b", ":", or
a digit.
a digit. This is handled by returning FALSE if the start of a new group with
the same terminator is encountered, since the next closing sequence must close
the nested group, not the outer one.
In Perl, unescaped square brackets may also appear as part of class names. For
example, [:a[:abc]b:] gives unknown POSIX class "[:abc]b:]". However, for
@ -2609,21 +2611,15 @@ for (++ptr; *ptr != CHAR_NULL; ptr++)
if (*ptr == CHAR_BACKSLASH &&
(ptr[1] == CHAR_RIGHT_SQUARE_BRACKET || ptr[1] == CHAR_BACKSLASH))
ptr++;
else if (*ptr == CHAR_RIGHT_SQUARE_BRACKET) return FALSE;
else
{
if (*ptr == terminator && ptr[1] == CHAR_RIGHT_SQUARE_BRACKET)
else if ((*ptr == CHAR_LEFT_SQUARE_BRACKET && ptr[1] == terminator) ||
*ptr == CHAR_RIGHT_SQUARE_BRACKET) return FALSE;
else if (*ptr == terminator && ptr[1] == CHAR_RIGHT_SQUARE_BRACKET)
{
*endptr = ptr;
return TRUE;
}
if (*ptr == CHAR_LEFT_SQUARE_BRACKET &&
(ptr[1] == CHAR_COLON || ptr[1] == CHAR_DOT ||
ptr[1] == CHAR_EQUALS_SIGN) &&
check_posix_syntax(ptr, endptr))
return FALSE;
}
}
return FALSE;
}

2
testdata/testinput2 vendored
View File

@ -4350,4 +4350,6 @@ a random value. /Ix
/[[:\\](?'abc')[a:]/I
"[[[.\xe8Nq\xffq\xff\xe0\x2|||::Nq\xffq\xff\xe0\x6\x2|||::[[[:[::::::[[[[[::::::::[:[[[:[:::[[[[[[[[[[[[:::::::::::::::::[[.\xe8Nq\xffq\xff\xe0\x2|||::Nq\xffq\xff\xe0\x6\x2|||::[[[:[::::::[[[[[::::::::[:[[[:[:::[[[[[[[[[[[[[[:::E[[[:[:[[:[:::[[:::E[[[:[:[[:'[:::::E[[[:[::::::[[[:[[[[[[[::E[[[:[::::::[[[:[[[[[[[[:[[::[::::[[:::::::[[:[[[[[[[:[[::[:[[:[~"
# End of testinput2

View File

@ -14534,4 +14534,7 @@ Named capturing subpatterns:
Starting code units: : [ \
Subject length lower bound = 2
"[[[.\xe8Nq\xffq\xff\xe0\x2|||::Nq\xffq\xff\xe0\x6\x2|||::[[[:[::::::[[[[[::::::::[:[[[:[:::[[[[[[[[[[[[:::::::::::::::::[[.\xe8Nq\xffq\xff\xe0\x2|||::Nq\xffq\xff\xe0\x6\x2|||::[[[:[::::::[[[[[::::::::[:[[[:[:::[[[[[[[[[[[[[[:::E[[[:[:[[:[:::[[:::E[[[:[:[[:'[:::::E[[[:[::::::[[[:[[[[[[[::E[[[:[::::::[[[:[[[[[[[[:[[::[::::[[:::::::[[:[[[[[[[:[[::[:[[:[~"
Failed: error 106 at offset 353: missing terminating ] for character class
# End of testinput2