Fix bugs when (?!) is used as a condition.

Philip.Hazel 2015-03-24 10:21:34 +00:00
parent 69cda6bc70
commit a4ceadbf47
9 changed files with 54 additions and 10 deletions

@@ -14,8 +14,16 @@ error. This bug was discovered by the LLVM fuzzer.
 4. Implemented pcre2_callout_enumerate().
-5. Fix JIT compilation of conditional blocks whose assertion
-is converted to (*FAIL). E.g: /(?(?!))/.
+5. Fix JIT compilation of conditional blocks whose assertion is converted to
+(*FAIL). E.g: /(?(?!))/.
+6. The pattern /(?(?!)^)/ caused references to random memory. This bug was
+discovered by the LLVM fuzzer.
+7. The assertion (?!) is optimized to (*FAIL). This was not handled correctly
+when this assertion was used as a condition, for example (?(?!)a|b). In
+pcre2_match() it worked by luck; in pcre2_dfa_match() it gave an incorrect
+error about an unsupported item.
 Version 10.10 06-March-2015
@@ -120,12 +128,13 @@ repeated outer group that has a zero minimum quantifier, caused incorrect code
 to be compiled, leading to the error "internal error: previously-checked
 referenced subpattern not found" when an incorrect memory address was read.
 This bug was reported as "heap overflow", discovered by Kai Lu of Fortinet's
-FortiGuard Labs.
+FortiGuard Labs. (Added 24-March-2015: CVE-2015-2325 was given to this.)
 23. A pattern such as "((?+1)(\1))/" containing a forward reference subroutine
 call within a group that also contained a recursive back reference caused
 incorrect code to be compiled. This bug was reported as "heap overflow",
-discovered by Kai Lu of Fortinet's FortiGuard Labs.
+discovered by Kai Lu of Fortinet's FortiGuard Labs. (Added 24-March-2015:
+CVE-2015-2326 was given to this.)
 24. Computing the size of the JIT read-only data in advance has been a source
 of various issues, and new ones are still appear unfortunately. To fix

@@ -210,7 +210,8 @@ These items are all just one unit long
 OP_THEN )
 OP_ASSERT_ACCEPT is used when (*ACCEPT) is encountered within an assertion.
-This ends the assertion, not the entire pattern match.
+This ends the assertion, not the entire pattern match. The assertion (?!) is
+always optimized to OP_FAIL.
 Backtracking control verbs with optional data
@@ -528,6 +529,10 @@ immediately before the assertion. It is also possible to insert a manual
 callout at this point. Only assertion conditions may have callouts preceding
 the condition.
+A condition that is the negative assertion (?!) is optimized to OP_FAIL in all
+parts of the pattern, so this is another opcode that may appear as a condition.
+It is treated the same as OP_FALSE.
 Recursion
 ---------

@@ -7284,7 +7284,7 @@ inside atomic brackets or in a pattern that contains *PRUNE or *SKIP does not
 count, because once again the assumption no longer holds.
 Arguments:
-code points to start of the compiled pattern
+code points to start of the compiled pattern or a group
 bracket_map a bitmap of which brackets we are inside while testing; this
 handles up to substring 31; after that we just have to take
 the less precise approach
@@ -7321,6 +7321,7 @@ do {
 case OP_DNCREF:
 case OP_RREF:
 case OP_DNRREF:
+case OP_FAIL:
 case OP_FALSE:
 case OP_TRUE:
 return FALSE;

@@ -2660,14 +2660,15 @@ for (;;)
 condcode == OP_DNRREF)
 return PCRE2_ERROR_DFA_UCOND;
-/* The DEFINE condition is always false */
+/* The DEFINE condition is always false, and the assertion (?!) is
+converted to OP_FAIL. */
-if (condcode == OP_FALSE)
+if (condcode == OP_FALSE || condcode == OP_FAIL)
 { ADD_ACTIVE(state_offset + codelink + LINK_SIZE + 1, 0); }
 /* There is also an always-true condition */
-if (condcode == OP_TRUE)
+else if (condcode == OP_TRUE)
 { ADD_ACTIVE(state_offset + LINK_SIZE + 2 + IMM2_SIZE, 0); }
 /* The only supported version of OP_RREF is for the value RREF_ANY,
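
The branch selection changed in this hunk can be illustrated in isolation. What follows is a minimal C sketch, not PCRE2 code: the enum cond_op and the functions select_branch() and toy_match() are hypothetical names invented for illustration, and the enum values are not PCRE2's real opcode numbers. It assumes only what the hunk shows, namely that an always-false condition (OP_FALSE or OP_FAIL) follows the "no" branch and an always-true condition follows the "yes" branch. The toy matcher stands in for /(?(?!)a|b)/ by searching for the selected branch as a literal string.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for the condition opcodes named in the diff.
   The values are illustrative only, not PCRE2's real opcode numbers. */
typedef enum { COND_FALSE, COND_FAIL, COND_TRUE } cond_op;

/* Return the branch that a conditional group (?(cond)yes|no) follows.
   COND_FAIL models the assertion (?!) after it has been optimized,
   and is treated exactly like the always-false condition. */
static const char *select_branch(cond_op cond, const char *yes, const char *no)
{
  assert(yes != NULL && no != NULL);
  if (cond == COND_FALSE || cond == COND_FAIL)
    return no;                  /* always false: take the alternative */
  else if (cond == COND_TRUE)
    return yes;                 /* always true: take the first branch */
  return NULL;                  /* other condition kinds are not modelled */
}

/* Toy model of /(?(?!)a|b)/: the condition is (?!), so only the
   literal "no" branch "b" can match anywhere in the subject.
   Returns 1 on a match and 0 otherwise. */
static int toy_match(const char *subject)
{
  const char *branch = select_branch(COND_FAIL, "a", "b");
  return branch != NULL && strstr(subject, branch) != NULL;
}
```

Under this model, toy_match("bbb") returns 1 and toy_match("aaa") returns 0, which mirrors the results this commit adds to the test output for /(?(?!)a|b)/.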

@@ -1408,6 +1408,7 @@ for (;;)
 break;
 case OP_FALSE:
+case OP_FAIL: /* The assertion (?!) becomes OP_FAIL */
 break;
 case OP_TRUE:

testdata/testinput2

@@ -4229,4 +4229,11 @@ a random value. /Ix
 / 61 28 3f 43 27 78 00 7a 27 29 62/hex,callout_info
 abcdefgh
+/(?(?!)^)/
+/(?(?!)a|b)/
+bbb
+** Failers
+aaa
 # End of testinput2

testdata/testinput6

@@ -4846,4 +4846,8 @@
 / 61 28 3f 43 27 78 00 7a 27 29 62/hex
 abcdefgh
+/(?(?!)a|b)/
+bbb
+aaa
 # End of testinput6

testdata/testoutput2

@@ -14188,4 +14188,14 @@ Callout (5): 'x\x00z'
 ^^ b
 0: ab
+/(?(?!)^)/
+/(?(?!)a|b)/
+bbb
+0: b
+** Failers
+No match
+aaa
+No match
 # End of testinput2

@@ -7919,4 +7919,10 @@ Callout (5): 'x\x00z'
 ^^ b
 0: ab
+/(?(?!)a|b)/
+bbb
+0: b
+aaa
+No match
 # End of testinput6