Fix buffer overflow for recursive byname back reference when duplicate names
exist.
This commit is contained in:
parent
92739ef5d8
commit
56444e9978
54
ChangeLog
54
ChangeLog
|
@ -26,12 +26,12 @@ pcre2_match() it worked by luck; in pcre2_dfa_match() it gave an incorrect
|
||||||
error about an unsupported item.
|
error about an unsupported item.
|
||||||
|
|
||||||
8. For some types of pattern, for example /Z*(|d*){216}/, the auto-
|
8. For some types of pattern, for example /Z*(|d*){216}/, the auto-
|
||||||
possessification code could take exponential time to complete. A recursion
|
possessification code could take exponential time to complete. A recursion
|
||||||
depth limit of 1000 has been imposed to limit the resources used by this
|
depth limit of 1000 has been imposed to limit the resources used by this
|
||||||
optimization. This infelicity was discovered by the LLVM fuzzer.
|
optimization. This infelicity was discovered by the LLVM fuzzer.
|
||||||
|
|
||||||
9. A pattern such as /(*UTF)[\S\V\H]/, which contains a negated special class
|
9. A pattern such as /(*UTF)[\S\V\H]/, which contains a negated special class
|
||||||
such as \S in non-UCP mode, explicit wide characters (> 255) can be ignored
|
such as \S in non-UCP mode, explicit wide characters (> 255) can be ignored
|
||||||
because \S ensures they are all in the class. The code for doing this was
|
because \S ensures they are all in the class. The code for doing this was
|
||||||
interacting badly with the code for computing the amount of space needed to
|
interacting badly with the code for computing the amount of space needed to
|
||||||
compile the pattern, leading to a buffer overflow. This bug was discovered by
|
compile the pattern, leading to a buffer overflow. This bug was discovered by
|
||||||
|
@ -45,21 +45,21 @@ discovered by the LLVM fuzzer.
|
||||||
between a subroutine call and its quantifier was incorrectly compiled, leading
|
between a subroutine call and its quantifier was incorrectly compiled, leading
|
||||||
to buffer overflow or other errors. This bug was discovered by the LLVM fuzzer.
|
to buffer overflow or other errors. This bug was discovered by the LLVM fuzzer.
|
||||||
|
|
||||||
12. The illegal pattern /(?(?<E>.*!.*)?)/ was not being diagnosed as missing an
|
12. The illegal pattern /(?(?<E>.*!.*)?)/ was not being diagnosed as missing an
|
||||||
assertion after (?(. The code was failing to check the character after (?(?<
|
assertion after (?(. The code was failing to check the character after (?(?<
|
||||||
for the ! or = that would indicate a lookbehind assertion. This bug was
|
for the ! or = that would indicate a lookbehind assertion. This bug was
|
||||||
discovered by the LLVM fuzzer.
|
discovered by the LLVM fuzzer.
|
||||||
|
|
||||||
13. A pattern such as /X((?2)()*+){2}+/ which has a possessive quantifier with
|
13. A pattern such as /X((?2)()*+){2}+/ which has a possessive quantifier with
|
||||||
a fixed maximum following a group that contains a subroutine reference was
|
a fixed maximum following a group that contains a subroutine reference was
|
||||||
incorrectly compiled and could trigger buffer overflow. This bug was discovered
|
incorrectly compiled and could trigger buffer overflow. This bug was discovered
|
||||||
by the LLVM fuzzer.
|
by the LLVM fuzzer.
|
||||||
|
|
||||||
14. Negative relative recursive references such as (?-7) to non-existent
|
14. Negative relative recursive references such as (?-7) to non-existent
|
||||||
subpatterns were not being diagnosed and could lead to unpredictable behaviour.
|
subpatterns were not being diagnosed and could lead to unpredictable behaviour.
|
||||||
This bug was discovered by the LLVM fuzzer.
|
This bug was discovered by the LLVM fuzzer.
|
||||||
|
|
||||||
15. The bug fixed in 14 was due to an integer variable that was unsigned when
|
15. The bug fixed in 14 was due to an integer variable that was unsigned when
|
||||||
it should have been signed. Some other "int" variables, having been checked,
|
it should have been signed. Some other "int" variables, having been checked,
|
||||||
have either been changed to uint32_t or commented as "must be signed".
|
have either been changed to uint32_t or commented as "must be signed".
|
||||||
|
|
||||||
|
@ -73,41 +73,41 @@ lookbehind assertion. This bug was discovered by the LLVM fuzzer.
|
||||||
18. There was a similar problem to 17 in pcre2test for global matches, though
|
18. There was a similar problem to 17 in pcre2test for global matches, though
|
||||||
the code there did catch the loop.
|
the code there did catch the loop.
|
||||||
|
|
||||||
19. If a greedy quantified \X was preceded by \C in UTF mode (e.g. \C\X*),
|
19. If a greedy quantified \X was preceded by \C in UTF mode (e.g. \C\X*),
|
||||||
and a subsequent item in the pattern caused a non-match, backtracking over the
|
and a subsequent item in the pattern caused a non-match, backtracking over the
|
||||||
repeated \X did not stop, but carried on past the start of the subject, causing
|
repeated \X did not stop, but carried on past the start of the subject, causing
|
||||||
reference to random memory and/or a segfault. There were also some other cases
|
reference to random memory and/or a segfault. There were also some other cases
|
||||||
where backtracking after \C could crash. This set of bugs was discovered by the
|
where backtracking after \C could crash. This set of bugs was discovered by the
|
||||||
LLVM fuzzer.
|
LLVM fuzzer.
|
||||||
|
|
||||||
20. The function for finding the minimum length of a matching string could take
|
20. The function for finding the minimum length of a matching string could take
|
||||||
a very long time if mutual recursion was present many times in a pattern, for
|
a very long time if mutual recursion was present many times in a pattern, for
|
||||||
example, /((?2){73}(?2))((?1))/. A better mutual recursion detection method has
|
example, /((?2){73}(?2))((?1))/. A better mutual recursion detection method has
|
||||||
been implemented. This infelicity was discovered by the LLVM fuzzer.
|
been implemented. This infelicity was discovered by the LLVM fuzzer.
|
||||||
|
|
||||||
21. Implemented PCRE2_NEVER_BACKSLASH_C.
|
21. Implemented PCRE2_NEVER_BACKSLASH_C.
|
||||||
|
|
||||||
22. The feature for string replication in pcre2test could read from freed
|
22. The feature for string replication in pcre2test could read from freed
|
||||||
memory if the replication required a buffer to be extended, and it was not
|
memory if the replication required a buffer to be extended, and it was not
|
||||||
working properly in 16-bit and 32-bit modes. This issue was discovered by a
|
working properly in 16-bit and 32-bit modes. This issue was discovered by a
|
||||||
fuzzer: see http://lcamtuf.coredump.cx/afl/.
|
fuzzer: see http://lcamtuf.coredump.cx/afl/.
|
||||||
|
|
||||||
23. Added the PCRE2_ALT_CIRCUMFLEX option.
|
23. Added the PCRE2_ALT_CIRCUMFLEX option.
|
||||||
|
|
||||||
24. Adjust the treatment of \8 and \9 to be the same as the current Perl
|
24. Adjust the treatment of \8 and \9 to be the same as the current Perl
|
||||||
behaviour.
|
behaviour.
|
||||||
|
|
||||||
25. Static linking against the PCRE2 library using the pkg-config module was
|
25. Static linking against the PCRE2 library using the pkg-config module was
|
||||||
failing on missing pthread symbols.
|
failing on missing pthread symbols.
|
||||||
|
|
||||||
26. If a group that contained a recursive back reference also contained a
|
26. If a group that contained a recursive back reference also contained a
|
||||||
forward reference subroutine call followed by a non-forward-reference
|
forward reference subroutine call followed by a non-forward-reference
|
||||||
subroutine call, for example /.((?2)(?R)\1)()/, pcre2_compile() failed to
|
subroutine call, for example /.((?2)(?R)\1)()/, pcre2_compile() failed to
|
||||||
compile correct code, leading to undefined behaviour or an internally detected
|
compile correct code, leading to undefined behaviour or an internally detected
|
||||||
error. This bug was discovered by the LLVM fuzzer.
|
error. This bug was discovered by the LLVM fuzzer.
|
||||||
|
|
||||||
27. Quantification of certain items (e.g. atomic back references) could cause
|
27. Quantification of certain items (e.g. atomic back references) could cause
|
||||||
incorrect code to be compiled when recursive forward references were involved.
|
incorrect code to be compiled when recursive forward references were involved.
|
||||||
For example, in this pattern: /(?1)()((((((\1++))\x85)+)|))/. This bug was
|
For example, in this pattern: /(?1)()((((((\1++))\x85)+)|))/. This bug was
|
||||||
discovered by the LLVM fuzzer.
|
discovered by the LLVM fuzzer.
|
||||||
|
|
||||||
|
@ -115,6 +115,10 @@ discovered by the LLVM fuzzer.
|
||||||
a buffer overflow if there was more than one group with the given name. This
|
a buffer overflow if there was more than one group with the given name. This
|
||||||
bug was discovered by the LLVM fuzzer.
|
bug was discovered by the LLVM fuzzer.
|
||||||
|
|
||||||
|
29. A recursive back reference by name within a group that had the same name as
|
||||||
|
another group caused a buffer overflow. For example: /(?J)(?'d'(?'d'\g{d}))/.
|
||||||
|
This bug was discovered by the LLVM fuzzer.
|
||||||
|
|
||||||
|
|
||||||
Version 10.10 06-March-2015
|
Version 10.10 06-March-2015
|
||||||
---------------------------
|
---------------------------
|
||||||
|
|
|
@ -5946,18 +5946,34 @@ for (;; ptr++)
|
||||||
}
|
}
|
||||||
|
|
||||||
/* The name table does not exist in the first pass; instead we must
|
/* The name table does not exist in the first pass; instead we must
|
||||||
scan the list of names encountered so far in order to get the
|
scan the list of names encountered so far in order to get a number.
|
||||||
number. If the name is not found, set the value to 0 for a forward
|
If there are duplicates, there may be more than one number. For each
|
||||||
reference. */
|
one, if handling a back reference, we must check to see if it is
|
||||||
|
recursive, that is, it is inside the group that it references. A flag
|
||||||
|
is set so that the group can be made atomic. If the name is not
|
||||||
|
found, set the value of recno to 0 for a forward reference. */
|
||||||
|
|
||||||
|
recno = 0;
|
||||||
ng = cb->named_groups;
|
ng = cb->named_groups;
|
||||||
|
|
||||||
for (i = 0; i < cb->names_found; i++, ng++)
|
for (i = 0; i < cb->names_found; i++, ng++)
|
||||||
{
|
{
|
||||||
if (namelen == ng->length &&
|
if (namelen == ng->length &&
|
||||||
PRIV(strncmp)(name, ng->name, namelen) == 0)
|
PRIV(strncmp)(name, ng->name, namelen) == 0)
|
||||||
break;
|
{
|
||||||
|
open_capitem *oc;
|
||||||
|
recno = ng->number;
|
||||||
|
if (is_recurse) break;
|
||||||
|
for (oc = cb->open_caps; oc != NULL; oc = oc->next)
|
||||||
|
{
|
||||||
|
if (oc->number == recno)
|
||||||
|
{
|
||||||
|
oc->flag = TRUE;
|
||||||
|
break;
|
||||||
|
}
|
||||||
|
}
|
||||||
|
}
|
||||||
}
|
}
|
||||||
recno = (i < cb->names_found)? ng->number : 0;
|
|
||||||
|
|
||||||
/* If duplicate names are permitted, we have to allow for a named
|
/* If duplicate names are permitted, we have to allow for a named
|
||||||
reference to a duplicated name (this cannot be determined until the
|
reference to a duplicated name (this cannot be determined until the
|
||||||
|
@ -6002,8 +6018,8 @@ for (;; ptr++)
|
||||||
|
|
||||||
if (is_recurse) goto HANDLE_RECURSION;
|
if (is_recurse) goto HANDLE_RECURSION;
|
||||||
|
|
||||||
/* In the second pass we must see if the name is duplicated. If so, we
|
/* For back references, in the second pass we must see if the name is
|
||||||
generate a different opcode. */
|
duplicated. If so, we generate a different opcode. */
|
||||||
|
|
||||||
if (lengthptr == NULL && cb->dupnames)
|
if (lengthptr == NULL && cb->dupnames)
|
||||||
{
|
{
|
||||||
|
@ -6036,7 +6052,7 @@ for (;; ptr++)
|
||||||
cb->backref_map |= (recno < 32)? (1 << recno) : 1;
|
cb->backref_map |= (recno < 32)? (1 << recno) : 1;
|
||||||
if ((uint32_t)recno > cb->top_backref) cb->top_backref = recno;
|
if ((uint32_t)recno > cb->top_backref) cb->top_backref = recno;
|
||||||
|
|
||||||
/* Check to see if this back reference is recursive, that it, it
|
/* Check to see if this back reference is recursive, that is, it
|
||||||
is inside the group that it references. A flag is set so that the
|
is inside the group that it references. A flag is set so that the
|
||||||
group can be made atomic. */
|
group can be made atomic. */
|
||||||
|
|
||||||
|
@ -7138,7 +7154,7 @@ for (;;)
|
||||||
Because we are moving code along, we must ensure that any pending recursive
|
Because we are moving code along, we must ensure that any pending recursive
|
||||||
or forward subroutine references are updated. In any event, remove the
|
or forward subroutine references are updated. In any event, remove the
|
||||||
block from the chain. */
|
block from the chain. */
|
||||||
|
|
||||||
if (capnumber > 0)
|
if (capnumber > 0)
|
||||||
{
|
{
|
||||||
if (cb->open_caps->flag)
|
if (cb->open_caps->flag)
|
||||||
|
@ -8050,6 +8066,7 @@ at this stage. */
|
||||||
|
|
||||||
#ifdef CALL_PRINTINT
|
#ifdef CALL_PRINTINT
|
||||||
pcre2_printint(re, stderr, TRUE);
|
pcre2_printint(re, stderr, TRUE);
|
||||||
|
fprintf(stderr, "Length=%lu Used=%lu\n", length, usedlength);
|
||||||
#endif
|
#endif
|
||||||
|
|
||||||
/* Fill in any forward references that are required. There may be repeated
|
/* Fill in any forward references that are required. There may be repeated
|
||||||
|
|
|
@ -4306,4 +4306,8 @@ a random value. /Ix
|
||||||
|
|
||||||
/$(&.+[\p{Me}].\s\xdcC*?(?(<y>))(?<!^)$C((;*?(R))+(?(R)){0,6}?|){12\x8a\X*?\x8a\x0b\xd1^9\3*+(\xc1,\k'P'\xb4)\xcc(z\z(?JJ)(?''8};(\x0b\xd1^9\?'3*+(\xc1.]k+\x0b'Pm'\xb4\xcc4'\xd1'(?''))?-%--\x95$9*\4'|\xd1(''%\x95*$9)#(?'R')3\x07?('P\xed')\\x16:;()\x1e\x10*:(?<y>)\xd1+!~:(?)''(d'E:yD!\s(?'R'\x1e;\x10:U))|')g!\xb0*){29+))#(?'P'})*?/
|
/$(&.+[\p{Me}].\s\xdcC*?(?(<y>))(?<!^)$C((;*?(R))+(?(R)){0,6}?|){12\x8a\X*?\x8a\x0b\xd1^9\3*+(\xc1,\k'P'\xb4)\xcc(z\z(?JJ)(?''8};(\x0b\xd1^9\?'3*+(\xc1.]k+\x0b'Pm'\xb4\xcc4'\xd1'(?''))?-%--\x95$9*\4'|\xd1(''%\x95*$9)#(?'R')3\x07?('P\xed')\\x16:;()\x1e\x10*:(?<y>)\xd1+!~:(?)''(d'E:yD!\s(?'R'\x1e;\x10:U))|')g!\xb0*){29+))#(?'P'})*?/
|
||||||
|
|
||||||
|
"\xa\xf<(.\pZ*\P{Xwd}+^\xa8\3'3yq.::?(?J:()\xd1+!~:3'(8?:)':(?'d'(?'d'^u]!.+.+\\A\Ah(n+?9){7}+\K;(?''u'(?'c'(?'z'(?<y>\xb::\xf0'|\xd3(\xae?'w(z\x8?P>l)\x8?P>a)'\H\R\xd1+!!~:3'(?:h$N{26875}\W+?\\=D{2}\x89(?i:Uy0\N({2\xa(\v\x85*){y*\A(()\p{L}+?\P{^Xan}'+?\xff\+pS\?|).{;y*\A(()\p{L}+?\8}\d?1(|)(/1){7}.+[Lp{Me}].\s\xdcC*?(?(<y>))(?<!^)$C((;*?(R))+(\xbf(R))\x8a\X*?\x8a\xb\xd1^9\3*+(\xc1,\k'R'\xb4)\xcc(z\z(?J)(?''\x1b(\xb\xd1^9\?'3*+P{^Xan}+?\xff\+(\xc1.]k+\xb'Pm'\xb4)\xcc4f\xa7'\xd1V(?i:U,{2,2})'(?''))?-%--\x95$9*\4'|\xd1(\x9c''%\x94$9)#(?'R')3\x7?('P\xed7'\xa8\xb1^u\xeaw\1\0\0\(|(?1){7}.+[\p{Me}].\s\xdcC*^\x14?(?(<y>))(?<!^)$C((;*?(R*?))+(?(R)\x8a\X*?\x8a\xb\xd1^9\3*+|(\xc1,\k'R'\xb4)\xcc! z)\z(?JJ)(?'';(\xb\xd1^9\?'3*+(\xc1.]k+\xb'Pm'\xb4))':(?'d')(?'RD'(d')|)|$)'|(?<x>\g{d});\g{x}\x11\g{d}\x81\|$((?''\'X'(?'W''\x92()'9'\x83*))\xba*\!?^ <){)':;\xcc4'\xd1'(?''28))?-%--\x95$9*\4'|\xd1((''e\x94*$9:)*#(?'R')3)\x7?('P\xed')\\x16:;()\x1e\x10*:(?<y>)\xd1+0!~:(?)'d'E:yD!\s(?'R'\x1e;\x10:U))|'\x9g!\xb0*){)\\x16:;()\x1e\x10\x87*:(?<y>)\xd1+!~:(?)'}'\d'E:yD!\s(?'R'\x1e;\x10:U))|'))|)g!\xb0*R+9{29+)#(?'P'})*?pS\{3,}\x85,{0,}l{*UTF)(\xe{7}){3722,{9,}d{2,?|))|{)\(A?&d}}{\xa,}2}){3,}7,l{)22}(,}l:7{2,4}}29\x19+)#?'P'})*v?))\x5"
|
||||||
|
|
||||||
|
"(?J)(?'d'(?'d'\g{d}))"
|
||||||
|
|
||||||
# End of testinput2
|
# End of testinput2
|
||||||
|
|
|
@ -14405,4 +14405,9 @@ Failed: error 115 at offset 26: reference to non-existent subpattern
|
||||||
|
|
||||||
/$(&.+[\p{Me}].\s\xdcC*?(?(<y>))(?<!^)$C((;*?(R))+(?(R)){0,6}?|){12\x8a\X*?\x8a\x0b\xd1^9\3*+(\xc1,\k'P'\xb4)\xcc(z\z(?JJ)(?''8};(\x0b\xd1^9\?'3*+(\xc1.]k+\x0b'Pm'\xb4\xcc4'\xd1'(?''))?-%--\x95$9*\4'|\xd1(''%\x95*$9)#(?'R')3\x07?('P\xed')\\x16:;()\x1e\x10*:(?<y>)\xd1+!~:(?)''(d'E:yD!\s(?'R'\x1e;\x10:U))|')g!\xb0*){29+))#(?'P'})*?/
|
/$(&.+[\p{Me}].\s\xdcC*?(?(<y>))(?<!^)$C((;*?(R))+(?(R)){0,6}?|){12\x8a\X*?\x8a\x0b\xd1^9\3*+(\xc1,\k'P'\xb4)\xcc(z\z(?JJ)(?''8};(\x0b\xd1^9\?'3*+(\xc1.]k+\x0b'Pm'\xb4\xcc4'\xd1'(?''))?-%--\x95$9*\4'|\xd1(''%\x95*$9)#(?'R')3\x07?('P\xed')\\x16:;()\x1e\x10*:(?<y>)\xd1+!~:(?)''(d'E:yD!\s(?'R'\x1e;\x10:U))|')g!\xb0*){29+))#(?'P'})*?/
|
||||||
|
|
||||||
|
"\xa\xf<(.\pZ*\P{Xwd}+^\xa8\3'3yq.::?(?J:()\xd1+!~:3'(8?:)':(?'d'(?'d'^u]!.+.+\\A\Ah(n+?9){7}+\K;(?''u'(?'c'(?'z'(?<y>\xb::\xf0'|\xd3(\xae?'w(z\x8?P>l)\x8?P>a)'\H\R\xd1+!!~:3'(?:h$N{26875}\W+?\\=D{2}\x89(?i:Uy0\N({2\xa(\v\x85*){y*\A(()\p{L}+?\P{^Xan}'+?\xff\+pS\?|).{;y*\A(()\p{L}+?\8}\d?1(|)(/1){7}.+[Lp{Me}].\s\xdcC*?(?(<y>))(?<!^)$C((;*?(R))+(\xbf(R))\x8a\X*?\x8a\xb\xd1^9\3*+(\xc1,\k'R'\xb4)\xcc(z\z(?J)(?''\x1b(\xb\xd1^9\?'3*+P{^Xan}+?\xff\+(\xc1.]k+\xb'Pm'\xb4)\xcc4f\xa7'\xd1V(?i:U,{2,2})'(?''))?-%--\x95$9*\4'|\xd1(\x9c''%\x94$9)#(?'R')3\x7?('P\xed7'\xa8\xb1^u\xeaw\1\0\0\(|(?1){7}.+[\p{Me}].\s\xdcC*^\x14?(?(<y>))(?<!^)$C((;*?(R*?))+(?(R)\x8a\X*?\x8a\xb\xd1^9\3*+|(\xc1,\k'R'\xb4)\xcc! z)\z(?JJ)(?'';(\xb\xd1^9\?'3*+(\xc1.]k+\xb'Pm'\xb4))':(?'d')(?'RD'(d')|)|$)'|(?<x>\g{d});\g{x}\x11\g{d}\x81\|$((?''\'X'(?'W''\x92()'9'\x83*))\xba*\!?^ <){)':;\xcc4'\xd1'(?''28))?-%--\x95$9*\4'|\xd1((''e\x94*$9:)*#(?'R')3)\x7?('P\xed')\\x16:;()\x1e\x10*:(?<y>)\xd1+0!~:(?)'d'E:yD!\s(?'R'\x1e;\x10:U))|'\x9g!\xb0*){)\\x16:;()\x1e\x10\x87*:(?<y>)\xd1+!~:(?)'}'\d'E:yD!\s(?'R'\x1e;\x10:U))|'))|)g!\xb0*R+9{29+)#(?'P'})*?pS\{3,}\x85,{0,}l{*UTF)(\xe{7}){3722,{9,}d{2,?|))|{)\(A?&d}}{\xa,}2}){3,}7,l{)22}(,}l:7{2,4}}29\x19+)#?'P'})*v?))\x5"
|
||||||
|
Failed: error 122 at offset 1221: unmatched closing parenthesis
|
||||||
|
|
||||||
|
"(?J)(?'d'(?'d'\g{d}))"
|
||||||
|
|
||||||
# End of testinput2
|
# End of testinput2
|
||||||
|
|
Loading…
Reference in New Issue