Fix mis-parsing of a conditional group with callout but a question mark where the assertion should start.
Philip.Hazel 2016-12-23 18:34:10 +00:00
parent 482b6a1f0a
commit 6c48775955
4 changed files with 60 additions and 40 deletions


@@ -111,6 +111,10 @@ are noted here for the record.
 (p) A buffer overflow could occur while sorting the names in the group name
 list (depending on the order in which the names were seen).
+(q) A conditional group that started with a callout was not doing the right
+check for a following assertion, leading to compiling bad code. Example:
+/(?(C'XX))?!XX/
 4. Back references are now permitted in lookbehind assertions when there are
 no duplicated group numbers (that is, (?| has not been used), and, if the
 reference is by name, there is only one group of that name. The referenced


@@ -2475,13 +2475,16 @@ while (ptr < ptrend)
 /* If expect_cond_assert is 2, we have just passed (?( and are expecting an
 assertion, possibly preceded by a callout. If the value is 1, we have just
 had the callout and expect an assertion. There must be at least 3 more
-characters in all cases. We know that the current character is an opening
-parenthesis, as otherwise we wouldn't be here. Note that expect_cond_assert
-may be negative, since all callouts just decrement it. */
+characters in all cases. When expect_cond_assert is 2, we know that the
+current character is an opening parenthesis, as otherwise we wouldn't be
+here. However, when it is 1, we need to check, and it's easiest just to check
+always. Note that expect_cond_assert may be negative, since all callouts just
+decrement it. */
 if (expect_cond_assert > 0)
   {
-  BOOL ok = ptrend - ptr >= 3 && ptr[0] == CHAR_QUESTION_MARK;
+  BOOL ok = c == CHAR_LEFT_PARENTHESIS && ptrend - ptr >= 3 &&
+            ptr[0] == CHAR_QUESTION_MARK;
   if (ok) switch(ptr[1])
     {
     case CHAR_C:
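
Note on the hunk above: the effect of the extra test can be illustrated with a small stand-alone sketch. This is not the PCRE2 source. The function names old_check and new_check are hypothetical, plain char stands in for PCRE2's code-unit type, and the sketch assumes that c is the character just consumed while ptr points at the character after it.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the pre-fix test: it never looks at c, because
   the old code assumed the current character was always '('. */
static bool old_check(char c, const char *ptr, const char *ptrend)
{
    (void)c;
    return ptrend - ptr >= 3 && ptr[0] == '?';
}

/* Hypothetical stand-in for the post-fix test: the current character must
   really be an opening parenthesis before "(?" is taken as the start of an
   assertion (or callout). */
static bool new_check(char c, const char *ptr, const char *ptrend)
{
    return c == '(' && ptrend - ptr >= 3 && ptr[0] == '?';
}
```

Once the callout in a pattern such as (?(?C'XX'))?!XX has been consumed, the remaining text is )?!XX. With c set to ')' and ptr pointing at ?!XX, old_check returns true, so the switch on ptr[1] would go on to see '!' as if a negative lookahead had started. new_check returns false, which corresponds to the "assertion expected" compile error recorded in the test output.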

testdata/testinput2

@@ -4944,4 +4944,10 @@ a)"xI
 /(?:\[A|B|C|D|E|F|G|H|I|J|]{200}Z)/expand
+# This one used to compile rubbish instead of a compile error, and then
+# behave unpredictably at match time.
+/.+(?(?C'XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX'))?!XXXX.=X/
+.+(?(?C'XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX'))?!XXXX.=X
 # End of testinput2


@@ -15424,6 +15424,13 @@ Subject length lower bound = 0
 /(?:\[A|B|C|D|E|F|G|H|I|J|]{200}Z)/expand
+# This one used to compile rubbish instead of a compile error, and then
+# behave unpredictably at match time.
+/.+(?(?C'XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX'))?!XXXX.=X/
+Failed: error 128 at offset 63: assertion expected after (?( or (?(?C)
+.+(?(?C'XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX'))?!XXXX.=X
 # End of testinput2
 Error -63: PCRE2_ERROR_BADDATA (unknown error number)
 Error -62: bad serialized data