Dynamically check the name length in (*MARK) etc. to avoid the possibility of overflow.
parent
3e24a1b351
commit
4ad83f7103
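The idea behind this commit can be illustrated with a minimal sketch. It is not the PCRE2 source: the function name `sketch_verb_name_length` and the constant `SKETCH_MAX_MARK` are hypothetical, and the limit of 255 is an assumption chosen to match the 8-bit test cases in this diff (a 255-character name is accepted, a 256-character name is rejected). The point it shows is that the length is counted and checked inside the scanning loop, so the counter can never grow far past the limit, instead of computing a pointer difference after the scan and casting it to `int`.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical limit for this sketch; PCRE2 defines its own MAX_MARK. */
#define SKETCH_MAX_MARK 255

/* Scan a verb name that starts at arg and ends at ')' or at the
   terminating NUL. Returns the name length, or -1 as soon as the
   length exceeds SKETCH_MAX_MARK. Because the check happens on every
   step, arglen never exceeds SKETCH_MAX_MARK + 1 and cannot overflow. */
static int sketch_verb_name_length(const char *arg)
{
  int arglen = 0;

  assert(arg != NULL);

  while (*arg != '\0' && *arg != ')')
    {
    arg++;
    arglen++;
    if (arglen > SKETCH_MAX_MARK) return -1;   /* Name is too long */
    }

  return arglen;
}
```

Under these assumptions, a caller would treat a return value of -1 as the "name is too long" error and stop compiling, which mirrors how the diff sets ERR76 and leaves the loop early.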
129
ChangeLog
|
@ -9,41 +9,41 @@ Version 10.21 xx-xxx-xxxx
|
||||||
2. Use memchr() to find the first character in an unanchored match in 8-bit
|
2. Use memchr() to find the first character in an unanchored match in 8-bit
|
||||||
mode in the interpreter. This gives a significant speed improvement.
|
mode in the interpreter. This gives a significant speed improvement.
|
||||||
|
|
||||||
3. Removed a redundant copy of the opcode_possessify table in the
|
3. Removed a redundant copy of the opcode_possessify table in the
|
||||||
pcre2_auto_possessify.c source.
|
pcre2_auto_possessify.c source.
|
||||||
|
|
||||||
4. Fix typos in dftables.c for z/OS.
|
4. Fix typos in dftables.c for z/OS.
|
||||||
|
|
||||||
5. Change 36 for 10.20 broke the handling of [[:>:]] and [[:<:]] in that
|
5. Change 36 for 10.20 broke the handling of [[:>:]] and [[:<:]] in that
|
||||||
processing them could involve a buffer overflow if the following character was
|
processing them could involve a buffer overflow if the following character was
|
||||||
an opening parenthesis.
|
an opening parenthesis.
|
||||||
|
|
||||||
6. Change 36 for 10.20 also introduced a bug in processing this pattern:
|
6. Change 36 for 10.20 also introduced a bug in processing this pattern:
|
||||||
/((?x)(*:0))#(?'/. Specifically: if a setting of (?x) was followed by a (*MARK)
|
/((?x)(*:0))#(?'/. Specifically: if a setting of (?x) was followed by a (*MARK)
|
||||||
setting (which (*:0) is), then (?x) did not get unset at the end of its group
|
setting (which (*:0) is), then (?x) did not get unset at the end of its group
|
||||||
during the scan for named groups, and hence the external # was incorrectly
|
during the scan for named groups, and hence the external # was incorrectly
|
||||||
treated as a comment and the invalid (?' at the end of the pattern was not
|
treated as a comment and the invalid (?' at the end of the pattern was not
|
||||||
diagnosed. This caused a buffer overflow during the real compile. This bug was
|
diagnosed. This caused a buffer overflow during the real compile. This bug was
|
||||||
discovered by Karl Skomski with the LLVM fuzzer.
|
discovered by Karl Skomski with the LLVM fuzzer.
|
||||||
|
|
||||||
7. Moved the pcre2_find_bracket() function from src/pcre2_compile.c into its
|
7. Moved the pcre2_find_bracket() function from src/pcre2_compile.c into its
|
||||||
own source module to avoid a circular dependency between src/pcre2_compile.c
|
own source module to avoid a circular dependency between src/pcre2_compile.c
|
||||||
and src/pcre2_study.c
|
and src/pcre2_study.c
|
||||||
|
|
||||||
8. A callout with a string argument containing an opening square bracket, for
|
8. A callout with a string argument containing an opening square bracket, for
|
||||||
example /(?C$[$)(?<]/, was incorrectly processed and could provoke a buffer
|
example /(?C$[$)(?<]/, was incorrectly processed and could provoke a buffer
|
||||||
overflow. This bug was discovered by Karl Skomski with the LLVM fuzzer.
|
overflow. This bug was discovered by Karl Skomski with the LLVM fuzzer.
|
||||||
|
|
||||||
9. The handling of callouts during the pre-pass for named group identification
|
9. The handling of callouts during the pre-pass for named group identification
|
||||||
has been tightened up.
|
has been tightened up.
|
||||||
|
|
||||||
10. The quantifier {1} can be ignored, whether greedy, non-greedy, or
|
10. The quantifier {1} can be ignored, whether greedy, non-greedy, or
|
||||||
possessive. This is a very minor optimization.
|
possessive. This is a very minor optimization.
|
||||||
|
|
||||||
11. A possessively repeated conditional group that could match an empty string,
|
11. A possessively repeated conditional group that could match an empty string,
|
||||||
for example, /(?(R))*+/, was incorrectly compiled.
|
for example, /(?(R))*+/, was incorrectly compiled.
|
||||||
|
|
||||||
12. The Unicode tables have been updated to Unicode 8.0.0 (thanks to Christian
|
12. The Unicode tables have been updated to Unicode 8.0.0 (thanks to Christian
|
||||||
Persch).
|
Persch).
|
||||||
|
|
||||||
13. An empty comment (?#) in a pattern was incorrectly processed and could
|
13. An empty comment (?#) in a pattern was incorrectly processed and could
|
||||||
|
@ -58,72 +58,72 @@ compiled and could cause reading from uninitialized memory or an incorrect
|
||||||
error diagnosis. Examples are: /[[:\\](?<[::]/ and /[[:\\](?'abc')[a:]. The
|
error diagnosis. Examples are: /[[:\\](?<[::]/ and /[[:\\](?'abc')[a:]. The
|
||||||
first of these bugs was discovered by Karl Skomski with the LLVM fuzzer.
|
first of these bugs was discovered by Karl Skomski with the LLVM fuzzer.
|
||||||
|
|
||||||
16. Pathological patterns containing many nested occurrences of [: caused
|
16. Pathological patterns containing many nested occurrences of [: caused
|
||||||
pcre2_compile() to run for a very long time. This bug was found by the LLVM
|
pcre2_compile() to run for a very long time. This bug was found by the LLVM
|
||||||
fuzzer.
|
fuzzer.
|
||||||
|
|
||||||
17. A missing closing parenthesis for a callout with a string argument was not
|
17. A missing closing parenthesis for a callout with a string argument was not
|
||||||
being diagnosed, possibly leading to a buffer overflow. This bug was found by
|
being diagnosed, possibly leading to a buffer overflow. This bug was found by
|
||||||
the LLVM fuzzer.
|
the LLVM fuzzer.
|
||||||
|
|
||||||
18. A conditional group with only one branch has an implicit empty alternative
|
18. A conditional group with only one branch has an implicit empty alternative
|
||||||
branch and must therefore be treated as potentially matching an empty string.
|
branch and must therefore be treated as potentially matching an empty string.
|
||||||
|
|
||||||
19. If (?R was followed by - or + incorrect behaviour happened instead of a
|
19. If (?R was followed by - or + incorrect behaviour happened instead of a
|
||||||
diagnostic. This bug was discovered by Karl Skomski with the LLVM fuzzer.
|
diagnostic. This bug was discovered by Karl Skomski with the LLVM fuzzer.
|
||||||
|
|
||||||
20. Another bug that was introduced by change 36 for 10.20: conditional groups
|
20. Another bug that was introduced by change 36 for 10.20: conditional groups
|
||||||
whose condition was an assertion preceded by an explicit callout with a string
|
whose condition was an assertion preceded by an explicit callout with a string
|
||||||
argument might be incorrectly processed, especially if the string contained \Q.
|
argument might be incorrectly processed, especially if the string contained \Q.
|
||||||
This bug was discovered by Karl Skomski with the LLVM fuzzer.
|
This bug was discovered by Karl Skomski with the LLVM fuzzer.
|
||||||
|
|
||||||
21. Compiling PCRE2 with the sanitize options of clang showed up a number of
|
21. Compiling PCRE2 with the sanitize options of clang showed up a number of
|
||||||
very pedantic coding infelicities and a buffer overflow while checking a UTF-8
|
very pedantic coding infelicities and a buffer overflow while checking a UTF-8
|
||||||
string if the final multi-byte UTF-8 character was truncated.
|
string if the final multi-byte UTF-8 character was truncated.
|
||||||
|
|
||||||
22. For Perl compatibility in EBCDIC environments, ranges such as a-z in a
|
22. For Perl compatibility in EBCDIC environments, ranges such as a-z in a
|
||||||
class, where both values are literal letters in the same case, omit the
|
class, where both values are literal letters in the same case, omit the
|
||||||
non-letter EBCDIC code points within the range.
|
non-letter EBCDIC code points within the range.
|
||||||
|
|
||||||
23. Finding the minimum matching length of complex patterns with back
|
23. Finding the minimum matching length of complex patterns with back
|
||||||
references and/or recursions can take a long time. There is now a cut-off that
|
references and/or recursions can take a long time. There is now a cut-off that
|
||||||
gives up trying to find a minimum length when things get too complex.
|
gives up trying to find a minimum length when things get too complex.
|
||||||
|
|
||||||
24. An optimization has been added that speeds up finding the minimum matching
|
24. An optimization has been added that speeds up finding the minimum matching
|
||||||
length for patterns containing repeated capturing groups or recursions.
|
length for patterns containing repeated capturing groups or recursions.
|
||||||
|
|
||||||
25. If a pattern contained a back reference to a group whose number was
|
25. If a pattern contained a back reference to a group whose number was
|
||||||
duplicated as a result of appearing in a (?|...) group, the computation of the
|
duplicated as a result of appearing in a (?|...) group, the computation of the
|
||||||
minimum matching length gave a wrong result, which could cause incorrect "no
|
minimum matching length gave a wrong result, which could cause incorrect "no
|
||||||
match" errors. For such patterns, a minimum matching length cannot at present
|
match" errors. For such patterns, a minimum matching length cannot at present
|
||||||
be computed.
|
be computed.
|
||||||
|
|
||||||
26. Added a check for integer overflow in conditions (?(<digits>) and
|
26. Added a check for integer overflow in conditions (?(<digits>) and
|
||||||
(?(R<digits>). This omission was discovered by Karl Skomski with the LLVM
|
(?(R<digits>). This omission was discovered by Karl Skomski with the LLVM
|
||||||
fuzzer.
|
fuzzer.
|
||||||
|
|
||||||
27. Fixed an issue when \p{Any} inside an xclass did not read the current
|
27. Fixed an issue when \p{Any} inside an xclass did not read the current
|
||||||
character.
|
character.
|
||||||
|
|
||||||
28. If pcre2grep was given the -q option with -c or -l, or when handling a
|
28. If pcre2grep was given the -q option with -c or -l, or when handling a
|
||||||
binary file, it incorrectly wrote output to stdout.
|
binary file, it incorrectly wrote output to stdout.
|
||||||
|
|
||||||
29. The JIT compiler did not restore the control verb head in case of *THEN
|
29. The JIT compiler did not restore the control verb head in case of *THEN
|
||||||
control verbs. This issue was found by Karl Skomski with a custom LLVM fuzzer.
|
control verbs. This issue was found by Karl Skomski with a custom LLVM fuzzer.
|
||||||
|
|
||||||
30. The way recursive references such as (?3) are compiled has been re-written
|
30. The way recursive references such as (?3) are compiled has been re-written
|
||||||
because the old way was the cause of many issues. Now, conversion of the group
|
because the old way was the cause of many issues. Now, conversion of the group
|
||||||
number into a pattern offset does not happen until the pattern has been
|
number into a pattern offset does not happen until the pattern has been
|
||||||
completely compiled. This does mean that detection of all infinitely looping
|
completely compiled. This does mean that detection of all infinitely looping
|
||||||
recursions is postponed till match time. In the past, some easy ones were
|
recursions is postponed till match time. In the past, some easy ones were
|
||||||
detected at compile time. This re-writing was done in response to yet another
|
detected at compile time. This re-writing was done in response to yet another
|
||||||
bug found by the LLVM fuzzer.
|
bug found by the LLVM fuzzer.
|
||||||
|
|
||||||
31. A test for a back reference to a non-existent group was missing for items
|
31. A test for a back reference to a non-existent group was missing for items
|
||||||
such as \987. This caused incorrect code to be compiled. This issue was found
|
such as \987. This caused incorrect code to be compiled. This issue was found
|
||||||
by Karl Skomski with a custom LLVM fuzzer.
|
by Karl Skomski with a custom LLVM fuzzer.
|
||||||
|
|
||||||
32. Error messages for syntax errors following \g and \k were giving inaccurate
|
32. Error messages for syntax errors following \g and \k were giving inaccurate
|
||||||
offsets in the pattern.
|
offsets in the pattern.
|
||||||
|
|
||||||
33. Improve the performance of starting single character repetitions in JIT.
|
33. Improve the performance of starting single character repetitions in JIT.
|
||||||
|
@ -145,8 +145,8 @@ was fixed.
|
||||||
39. Match limit check added to recursion. This issue was found by Karl Skomski
|
39. Match limit check added to recursion. This issue was found by Karl Skomski
|
||||||
with a custom LLVM fuzzer.
|
with a custom LLVM fuzzer.
|
||||||
|
|
||||||
40. Arrange for the UTF check in pcre2_match() and pcre2_dfa_match() to look
|
40. Arrange for the UTF check in pcre2_match() and pcre2_dfa_match() to look
|
||||||
only at the part of the subject that is relevant when the starting offset is
|
only at the part of the subject that is relevant when the starting offset is
|
||||||
non-zero.
|
non-zero.
|
||||||
|
|
||||||
41. Improve first character match in JIT with SSE2 on x86.
|
41. Improve first character match in JIT with SSE2 on x86.
|
||||||
|
@ -154,17 +154,17 @@ non-zero.
|
||||||
42. Fix two assertion fails in JIT. These issues were found by Karl Skomski
|
42. Fix two assertion fails in JIT. These issues were found by Karl Skomski
|
||||||
with a custom LLVM fuzzer.
|
with a custom LLVM fuzzer.
|
||||||
|
|
||||||
43. Correct the setting of CMAKE_C_FLAGS in CMakeLists.txt (patch from Roy Ivy
|
43. Correct the setting of CMAKE_C_FLAGS in CMakeLists.txt (patch from Roy Ivy
|
||||||
III).
|
III).
|
||||||
|
|
||||||
44. Fix bug in RunTest.bat for new test 14, and adjust the script for the added
|
44. Fix bug in RunTest.bat for new test 14, and adjust the script for the added
|
||||||
test (there are now 20 in total).
|
test (there are now 20 in total).
|
||||||
|
|
||||||
45. Fixed a corner case of range optimization in JIT.
|
45. Fixed a corner case of range optimization in JIT.
|
||||||
|
|
||||||
46. Add the ${*MARK} facility to pcre2_substitute().
|
46. Add the ${*MARK} facility to pcre2_substitute().
|
||||||
|
|
||||||
47. Modifier lists in pcre2test were splitting at spaces without the required
|
47. Modifier lists in pcre2test were splitting at spaces without the required
|
||||||
commas.
|
commas.
|
||||||
|
|
||||||
48. Implemented PCRE2_ALT_VERBNAMES.
|
48. Implemented PCRE2_ALT_VERBNAMES.
|
||||||
|
@ -172,11 +172,11 @@ commas.
|
||||||
49. Fixed two issues in JIT. These were found by Karl Skomski with a custom
|
49. Fixed two issues in JIT. These were found by Karl Skomski with a custom
|
||||||
LLVM fuzzer.
|
LLVM fuzzer.
|
||||||
|
|
||||||
50. The pcre2test program has been extended by adding the #newline_default
|
50. The pcre2test program has been extended by adding the #newline_default
|
||||||
command. This has made it possible to run the standard tests when PCRE2 is
|
command. This has made it possible to run the standard tests when PCRE2 is
|
||||||
compiled with either CR or CRLF as the default newline convention. As part of
|
compiled with either CR or CRLF as the default newline convention. As part of
|
||||||
this work, the new command was added to several test files and the testing
|
this work, the new command was added to several test files and the testing
|
||||||
scripts were modified. The pcre2grep tests can now also be run when there is no
|
scripts were modified. The pcre2grep tests can now also be run when there is no
|
||||||
LF in the default newline convention.
|
LF in the default newline convention.
|
||||||
|
|
||||||
51. The RunTest script has been modified so that, when JIT is used and valgrind
|
51. The RunTest script has been modified so that, when JIT is used and valgrind
|
||||||
|
@ -184,72 +184,72 @@ is specified, a valgrind suppressions file is set up to ignore "Invalid read of
|
||||||
size 16" errors because these are false positives when the hardware supports
|
size 16" errors because these are false positives when the hardware supports
|
||||||
the SSE2 instruction set.
|
the SSE2 instruction set.
|
||||||
|
|
||||||
52. It is now possible to have comment lines amid the subject strings in
|
52. It is now possible to have comment lines amid the subject strings in
|
||||||
pcre2test (and perltest.sh) input.
|
pcre2test (and perltest.sh) input.
|
||||||
|
|
||||||
53. Implemented PCRE2_USE_OFFSET_LIMIT and pcre2_set_offset_limit().
|
53. Implemented PCRE2_USE_OFFSET_LIMIT and pcre2_set_offset_limit().
|
||||||
|
|
||||||
54. Add the null_context modifier to pcre2test so that calling pcre2_compile()
|
54. Add the null_context modifier to pcre2test so that calling pcre2_compile()
|
||||||
and the matching functions with NULL contexts can be tested.
|
and the matching functions with NULL contexts can be tested.
|
||||||
|
|
||||||
55. Implemented PCRE2_SUBSTITUTE_EXTENDED.
|
55. Implemented PCRE2_SUBSTITUTE_EXTENDED.
|
||||||
|
|
||||||
56. In a character class such as [\W\p{Any}] where both a negative-type escape
|
56. In a character class such as [\W\p{Any}] where both a negative-type escape
|
||||||
("not a word character") and a property escape were present, the property
|
("not a word character") and a property escape were present, the property
|
||||||
escape was being ignored.
|
escape was being ignored.
|
||||||
|
|
||||||
57. Fixed integer overflow for patterns whose minimum matching length is very,
|
57. Fixed integer overflow for patterns whose minimum matching length is very,
|
||||||
very large.
|
very large.
|
||||||
|
|
||||||
58. Implemented --never-backslash-C.
|
58. Implemented --never-backslash-C.
|
||||||
|
|
||||||
59. Change 55 above introduced a bug by which certain patterns provoked the
|
59. Change 55 above introduced a bug by which certain patterns provoked the
|
||||||
erroneous error "\ at end of pattern".
|
erroneous error "\ at end of pattern".
|
||||||
|
|
||||||
60. The special sequences [[:<:]] and [[:>:]] gave rise to incorrect compiling
|
60. The special sequences [[:<:]] and [[:>:]] gave rise to incorrect compiling
|
||||||
errors or other strange effects if compiled in UCP mode. Found with libFuzzer
|
errors or other strange effects if compiled in UCP mode. Found with libFuzzer
|
||||||
and AddressSanitizer.
|
and AddressSanitizer.
|
||||||
|
|
||||||
61. Whitespace at the end of a pcre2test pattern line caused a spurious error
|
61. Whitespace at the end of a pcre2test pattern line caused a spurious error
|
||||||
message if there were only single-character modifiers. It should be ignored.
|
message if there were only single-character modifiers. It should be ignored.
|
||||||
|
|
||||||
62. The use of PCRE2_NO_AUTO_CAPTURE could cause incorrect compilation results
|
62. The use of PCRE2_NO_AUTO_CAPTURE could cause incorrect compilation results
|
||||||
or segmentation errors for some patterns. Found with libFuzzer and
|
or segmentation errors for some patterns. Found with libFuzzer and
|
||||||
AddressSanitizer.
|
AddressSanitizer.
|
||||||
|
|
||||||
63. Very long names in (*MARK) or (*THEN) etc. items could provoke a buffer
|
63. Very long names in (*MARK) or (*THEN) etc. items could provoke a buffer
|
||||||
overflow.
|
overflow.
|
||||||
|
|
||||||
64. Improve error message for overly-complicated patterns.
|
64. Improve error message for overly-complicated patterns.
|
||||||
|
|
||||||
65. Implemented an optional replication feature for patterns in pcre2test, to
|
65. Implemented an optional replication feature for patterns in pcre2test, to
|
||||||
make it easier to test long repetitive patterns. The tests for 63 above are
|
make it easier to test long repetitive patterns. The tests for 63 above are
|
||||||
converted to use the new feature.
|
converted to use the new feature.
|
||||||
|
|
||||||
66. In the POSIX wrapper, if regerror() was given too small a buffer, it could
|
66. In the POSIX wrapper, if regerror() was given too small a buffer, it could
|
||||||
misbehave.
|
misbehave.
|
||||||
|
|
||||||
67. In pcre2_substitute() in UTF mode, the UTF validity check on the
|
67. In pcre2_substitute() in UTF mode, the UTF validity check on the
|
||||||
replacement string was happening before the length setting when the replacement
|
replacement string was happening before the length setting when the replacement
|
||||||
string was zero-terminated.
|
string was zero-terminated.
|
||||||
|
|
||||||
68. In pcre2_substitute() in UTF mode, PCRE2_NO_UTF_CHECK can be set for the
|
68. In pcre2_substitute() in UTF mode, PCRE2_NO_UTF_CHECK can be set for the
|
||||||
second and subsequent calls to pcre2_match().
|
second and subsequent calls to pcre2_match().
|
||||||
|
|
||||||
69. There was no check for integer overflow for a replacement group number in
|
69. There was no check for integer overflow for a replacement group number in
|
||||||
pcre2_substitute(). An added check for a number greater than the largest group
|
pcre2_substitute(). An added check for a number greater than the largest group
|
||||||
number in the pattern means this is not now needed.
|
number in the pattern means this is not now needed.
|
||||||
|
|
||||||
70. The PCRE2-specific VERSION condition didn't work correctly if only one
|
70. The PCRE2-specific VERSION condition didn't work correctly if only one
|
||||||
digit was given after the decimal point, or if more than two digits were given.
|
digit was given after the decimal point, or if more than two digits were given.
|
||||||
It now works with one or two digits, and gives a compile time error if more are
|
It now works with one or two digits, and gives a compile time error if more are
|
||||||
given.
|
given.
|
||||||
|
|
||||||
71. In pcre2_substitute() there was the possibility of reading one code unit
|
71. In pcre2_substitute() there was the possibility of reading one code unit
|
||||||
beyond the end of the replacement string.
|
beyond the end of the replacement string.
|
||||||
|
|
||||||
72. The code for checking a subject's UTF-32 validity for a pattern with a
|
72. The code for checking a subject's UTF-32 validity for a pattern with a
|
||||||
lookbehind involved an out-of-bounds pointer, which could potentially cause
|
lookbehind involved an out-of-bounds pointer, which could potentially cause
|
||||||
trouble in some environments.
|
trouble in some environments.
|
||||||
|
|
||||||
73. The maximum lookbehind length was incorrectly calculated for patterns such
|
73. The maximum lookbehind length was incorrectly calculated for patterns such
|
||||||
|
@ -260,6 +260,9 @@ as /(?<=(a)(?-1))x/ which have a recursion within a backreference.
|
||||||
75. Give an error in pcre2_substitute() if a match ends before it starts (as a
|
75. Give an error in pcre2_substitute() if a match ends before it starts (as a
|
||||||
result of the use of \K).
|
result of the use of \K).
|
||||||
|
|
||||||
|
76. Check the length of the name in (*MARK:xx) etc. dynamically to avoid the
|
||||||
|
possibility of integer overflow.
|
||||||
|
|
||||||
|
|
||||||
Version 10.20 30-June-2015
|
Version 10.20 30-June-2015
|
||||||
--------------------------
|
--------------------------
|
||||||
|
|
|
@ -584,19 +584,19 @@ enum { ERR0 = COMPILE_ERROR_BASE,
|
||||||
ERR61, ERR62, ERR63, ERR64, ERR65, ERR66, ERR67, ERR68, ERR69, ERR70,
|
ERR61, ERR62, ERR63, ERR64, ERR65, ERR66, ERR67, ERR68, ERR69, ERR70,
|
||||||
ERR71, ERR72, ERR73, ERR74, ERR75, ERR76, ERR77, ERR78, ERR79, ERR80,
|
ERR71, ERR72, ERR73, ERR74, ERR75, ERR76, ERR77, ERR78, ERR79, ERR80,
|
||||||
ERR81, ERR82, ERR83, ERR84, ERR85, ERR86, ERR87 };
|
ERR81, ERR82, ERR83, ERR84, ERR85, ERR86, ERR87 };
|
||||||
|
|
||||||
/* Error codes that correspond to negative error codes returned by
|
/* Error codes that correspond to negative error codes returned by
|
||||||
find_fixedlength(). */
|
find_fixedlength(). */
|
||||||
|
|
||||||
static int fixed_length_errors[] =
|
static int fixed_length_errors[] =
|
||||||
{
|
{
|
||||||
ERR0, /* Not an error */
|
ERR0, /* Not an error */
|
||||||
ERR0, /* Not an error; -1 is used for "process later" */
|
ERR0, /* Not an error; -1 is used for "process later" */
|
||||||
ERR25, /* Lookbehind is not fixed length */
|
ERR25, /* Lookbehind is not fixed length */
|
||||||
ERR36, /* \C in lookbehind is not allowed */
|
ERR36, /* \C in lookbehind is not allowed */
|
||||||
ERR87, /* Lookbehind is too long */
|
ERR87, /* Lookbehind is too long */
|
||||||
ERR70 /* Internal error: unknown opcode encountered */
|
ERR70 /* Internal error: unknown opcode encountered */
|
||||||
};
|
};
|
||||||
|
|
||||||
/* This is a table of start-of-pattern options such as (*UTF) and settings such
|
/* This is a table of start-of-pattern options such as (*UTF) and settings such
|
||||||
as (*LIMIT_MATCH=nnnn) and (*CRLF). For completeness and backward
|
as (*LIMIT_MATCH=nnnn) and (*CRLF). For completeness and backward
|
||||||
|
@ -803,7 +803,7 @@ However, we cannot do this when the assertion contains subroutine calls,
|
||||||
because they can be forward references. We solve this by remembering this case
|
because they can be forward references. We solve this by remembering this case
|
||||||
and doing the check at the end; a flag specifies which mode we are running in.
|
and doing the check at the end; a flag specifies which mode we are running in.
|
||||||
|
|
||||||
Lookbehind lengths are held in 16-bit fields and the maximum value is defined
|
Lookbehind lengths are held in 16-bit fields and the maximum value is defined
|
||||||
as LOOKBEHIND_MAX.
|
as LOOKBEHIND_MAX.
|
||||||
|
|
||||||
Arguments:
|
Arguments:
|
||||||
|
@ -817,7 +817,7 @@ Returns: if non-negative, the fixed length,
|
||||||
or -1 if an OP_RECURSE item was encountered and atend is FALSE
|
or -1 if an OP_RECURSE item was encountered and atend is FALSE
|
||||||
or -2 if there is no fixed length,
|
or -2 if there is no fixed length,
|
||||||
or -3 if \C was encountered (in UTF-8 mode only)
|
or -3 if \C was encountered (in UTF-8 mode only)
|
||||||
or -4 length is too long
|
or -4 length is too long
|
||||||
or -5 if an unknown opcode was encountered (internal error)
|
or -5 if an unknown opcode was encountered (internal error)
|
||||||
*/
|
*/
|
||||||
|
|
||||||
|
@ -844,8 +844,8 @@ for (;;)
|
||||||
int d;
|
int d;
|
||||||
PCRE2_UCHAR *ce, *cs;
|
PCRE2_UCHAR *ce, *cs;
|
||||||
register PCRE2_UCHAR op = *cc;
|
register PCRE2_UCHAR op = *cc;
|
||||||
|
|
||||||
if (branchlength > LOOKBEHIND_MAX) return FFL_TOOLONG;
|
if (branchlength > LOOKBEHIND_MAX) return FFL_TOOLONG;
|
||||||
|
|
||||||
switch (op)
|
switch (op)
|
||||||
{
|
{
|
||||||
|
@ -2875,7 +2875,7 @@ static int
|
||||||
process_verb_name(PCRE2_SPTR *ptrptr, PCRE2_UCHAR **codeptr, int *errorcodeptr,
|
process_verb_name(PCRE2_SPTR *ptrptr, PCRE2_UCHAR **codeptr, int *errorcodeptr,
|
||||||
uint32_t options, BOOL utf, compile_block *cb)
|
uint32_t options, BOOL utf, compile_block *cb)
|
||||||
{
|
{
|
||||||
int arglen = 0;
|
int32_t arglen = 0;
|
||||||
BOOL inescq = FALSE;
|
BOOL inescq = FALSE;
|
||||||
PCRE2_SPTR ptr = *ptrptr;
|
PCRE2_SPTR ptr = *ptrptr;
|
||||||
PCRE2_UCHAR *code = (codeptr == NULL)? NULL : *codeptr;
|
PCRE2_UCHAR *code = (codeptr == NULL)? NULL : *codeptr;
|
||||||
|
@ -2900,8 +2900,8 @@ for (; ptr < cb->end_pattern; ptr++)
|
||||||
{
|
{
|
||||||
if (x == CHAR_RIGHT_PARENTHESIS) break;
|
if (x == CHAR_RIGHT_PARENTHESIS) break;
|
||||||
|
|
||||||
/* Skip over comments and whitespace in extended mode. Need a loop to handle
|
/* Skip over comments and whitespace in extended mode. Need a loop to
|
||||||
whitespace after a comment. */
|
handle whitespace after a comment. */
|
||||||
|
|
||||||
if ((options & PCRE2_EXTENDED) != 0)
|
if ((options & PCRE2_EXTENDED) != 0)
|
||||||
{
|
{
|
||||||
|
@ -2912,8 +2912,8 @@ for (; ptr < cb->end_pattern; ptr++)
|
||||||
ptr++;
|
ptr++;
|
||||||
while (*ptr != CHAR_NULL)
|
while (*ptr != CHAR_NULL)
|
||||||
{
|
{
|
||||||
if (IS_NEWLINE(ptr)) /* For non-fixed-length newline cases, */
|
if (IS_NEWLINE(ptr)) /* For non-fixed-length newline cases, */
|
||||||
{ /* IS_NEWLINE sets cb->nllen. */
|
{ /* IS_NEWLINE sets cb->nllen. */
|
||||||
ptr += cb->nllen;
|
ptr += cb->nllen;
|
||||||
break;
|
break;
|
||||||
}
|
}
|
||||||
|
@ -2984,6 +2984,12 @@ for (; ptr < cb->end_pattern; ptr++)
|
||||||
}
|
}
|
||||||
|
|
||||||
arglen++;
|
arglen++;
|
||||||
|
|
||||||
|
if ((unsigned int)arglen > MAX_MARK)
|
||||||
|
{
|
||||||
|
*errorcodeptr = ERR76;
|
||||||
|
return -1;
|
||||||
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
/* Update the pointers before returning. */
|
/* Update the pointers before returning. */
|
||||||
|
@ -5613,21 +5619,25 @@ for (;; ptr++)
|
||||||
|
|
||||||
if ((options & PCRE2_ALT_VERBNAMES) == 0)
|
if ((options & PCRE2_ALT_VERBNAMES) == 0)
|
||||||
{
|
{
|
||||||
while (*ptr != CHAR_NULL && *ptr != CHAR_RIGHT_PARENTHESIS) ptr++;
|
arglen = 0;
|
||||||
arglen = (int)(ptr - arg);
|
while (*ptr != CHAR_NULL && *ptr != CHAR_RIGHT_PARENTHESIS)
|
||||||
|
{
|
||||||
|
ptr++; /* Check length as we go */
|
||||||
|
arglen++; /* along, to avoid the */
|
||||||
|
if ((unsigned int)arglen > MAX_MARK) /* possibility of overflow. */
|
||||||
|
{
|
||||||
|
*errorcodeptr = ERR76;
|
||||||
|
goto FAILED;
|
||||||
|
}
|
||||||
|
}
|
||||||
}
|
}
|
||||||
else
|
else
|
||||||
{
|
{
|
||||||
|
/* The length check is in process_verb_name() */
|
||||||
arglen = process_verb_name(&ptr, NULL, errorcodeptr, options,
|
arglen = process_verb_name(&ptr, NULL, errorcodeptr, options,
|
||||||
utf, cb);
|
utf, cb);
|
||||||
if (arglen < 0) goto FAILED;
|
if (arglen < 0) goto FAILED;
|
||||||
}
|
}
|
||||||
|
|
||||||
if ((unsigned int)arglen > MAX_MARK)
|
|
||||||
{
|
|
||||||
*errorcodeptr = ERR76;
|
|
||||||
goto FAILED;
|
|
||||||
}
|
|
||||||
}
|
}
|
||||||
|
|
||||||
if (*ptr != CHAR_RIGHT_PARENTHESIS)
|
if (*ptr != CHAR_RIGHT_PARENTHESIS)
|
||||||
|
@ -7484,7 +7494,7 @@ for (;;)
|
||||||
|
|
||||||
/* If lookbehind, check that this branch matches a fixed-length string, and
|
/* If lookbehind, check that this branch matches a fixed-length string, and
|
||||||
put the length into the OP_REVERSE item. Temporarily mark the end of the
|
put the length into the OP_REVERSE item. Temporarily mark the end of the
|
||||||
branch with OP_END. If the branch contains OP_RECURSE, the result is
|
branch with OP_END. If the branch contains OP_RECURSE, the result is
|
||||||
FFL_LATER (a negative value) because there may be forward references that
|
FFL_LATER (a negative value) because there may be forward references that
|
||||||
we can't check here. Set a flag to cause another lookbehind check at the
|
we can't check here. Set a flag to cause another lookbehind check at the
|
||||||
end. Why not do it all at the end? Because common errors can be picked up
|
end. Why not do it all at the end? Because common errors can be picked up
|
||||||
|
|
|
@ -239,9 +239,15 @@
|
||||||
/(*:0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF)XX/mark
|
/(*:0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF)XX/mark
|
||||||
XX
|
XX
|
||||||
|
|
||||||
|
/(*:0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF)XX/mark,alt_verbnames
|
||||||
|
XX
|
||||||
|
|
||||||
/(*:0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDE)XX/mark
|
/(*:0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDE)XX/mark
|
||||||
XX
|
XX
|
||||||
|
|
||||||
|
/(*:0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDE)XX/mark,alt_verbnames
|
||||||
|
XX
|
||||||
|
|
||||||
/\u0100/alt_bsux,allow_empty_class,match_unset_backref,dupnames
|
/\u0100/alt_bsux,allow_empty_class,match_unset_backref,dupnames
|
||||||
|
|
||||||
/[\u0100-\u0200]/alt_bsux,allow_empty_class,match_unset_backref,dupnames
|
/[\u0100-\u0200]/alt_bsux,allow_empty_class,match_unset_backref,dupnames
|
||||||
|
|
|
@ -313,11 +313,20 @@ Failed: error 151 at offset 3: octal value is greater than \377 in 8-bit non-UTF
|
||||||
Failed: error 176 at offset 259: name is too long in (*MARK), (*PRUNE), (*SKIP), or (*THEN)
|
Failed: error 176 at offset 259: name is too long in (*MARK), (*PRUNE), (*SKIP), or (*THEN)
|
||||||
XX
|
XX
|
||||||
|
|
||||||
|
/(*:0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF)XX/mark,alt_verbnames
|
||||||
|
Failed: error 176 at offset 3: name is too long in (*MARK), (*PRUNE), (*SKIP), or (*THEN)
|
||||||
|
XX
|
||||||
|
|
||||||
/(*:0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDE)XX/mark
|
/(*:0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDE)XX/mark
|
||||||
XX
|
XX
|
||||||
0: XX
|
0: XX
|
||||||
MK: 0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDE
|
MK: 0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDE
|
||||||
|
|
||||||
|
/(*:0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDE)XX/mark,alt_verbnames
|
||||||
|
XX
|
||||||
|
0: XX
|
||||||
|
MK: 0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDE
|
||||||
|
|
||||||
/\u0100/alt_bsux,allow_empty_class,match_unset_backref,dupnames
|
/\u0100/alt_bsux,allow_empty_class,match_unset_backref,dupnames
|
||||||
Failed: error 177 at offset 5: character code point value in \u.... sequence is too large
|
Failed: error 177 at offset 5: character code point value in \u.... sequence is too large
|
||||||
|
|
||||||
|
|