Code tidies for 10.30-RC1 release candidate.
parent
e3052af6fd
commit
810d9b6da5
|
@ -429,7 +429,7 @@ SET(PCRE2_SOURCES
|
||||||
src/pcre2_compile.c
|
src/pcre2_compile.c
|
||||||
src/pcre2_config.c
|
src/pcre2_config.c
|
||||||
src/pcre2_context.c
|
src/pcre2_context.c
|
||||||
src/pcre2_convert.c
|
src/pcre2_convert.c
|
||||||
src/pcre2_dfa_match.c
|
src/pcre2_dfa_match.c
|
||||||
src/pcre2_error.c
|
src/pcre2_error.c
|
||||||
src/pcre2_find_bracket.c
|
src/pcre2_find_bracket.c
|
||||||
|
|
172
ChangeLog
|
@ -2,105 +2,105 @@ Change Log for PCRE2
|
||||||
--------------------
|
--------------------
|
||||||
|
|
||||||
|
|
||||||
Version 10.30-DEV 09-March-2017
|
Version 10.30-RC1 18-July-2017
|
||||||
-------------------------------
|
------------------------------
|
||||||
|
|
||||||
1. The main interpreter, pcre2_match(), has been refactored into a new version
|
1. The main interpreter, pcre2_match(), has been refactored into a new version
|
||||||
that does not use recursive function calls (and therefore the stack) for
|
that does not use recursive function calls (and therefore the stack) for
|
||||||
remembering backtracking positions. This makes --disable-stack-for-recursion a
|
remembering backtracking positions. This makes --disable-stack-for-recursion a
|
||||||
NOOP. The new implementation allows backtracking into recursive group calls in
|
NOOP. The new implementation allows backtracking into recursive group calls in
|
||||||
patterns, making it more compatible with Perl, and also fixes some other
|
patterns, making it more compatible with Perl, and also fixes some other
|
||||||
hard-to-do issues such as #1887 in Bugzilla. The code is also cleaner because
|
hard-to-do issues such as #1887 in Bugzilla. The code is also cleaner because
|
||||||
the old code had a number of fudges to try to reduce stack usage. It seems to
|
the old code had a number of fudges to try to reduce stack usage. It seems to
|
||||||
run no slower than the old code.
|
run no slower than the old code.
|
||||||
|
|
||||||
A number of bugs in the refactored code were subsequently fixed during testing
|
A number of bugs in the refactored code were subsequently fixed during testing
|
||||||
before release, but after the code was made available in the repository. These
|
before release, but after the code was made available in the repository. These
|
||||||
bugs were never in fully released code, but are noted here for the record.
|
bugs were never in fully released code, but are noted here for the record.
|
||||||
|
|
||||||
(a) If a pattern had fewer capturing parentheses than the ovector supplied in
|
(a) If a pattern had fewer capturing parentheses than the ovector supplied in
|
||||||
the match data block, a memory error (detectable by ASAN) occurred after
|
the match data block, a memory error (detectable by ASAN) occurred after
|
||||||
a match, because the external block was being set from non-existent
|
a match, because the external block was being set from non-existent
|
||||||
internal ovector fields. Fixes oss-fuzz issue 781.
|
internal ovector fields. Fixes oss-fuzz issue 781.
|
||||||
|
|
||||||
(b) A pattern with very many capturing parentheses (when the internal frame
|
(b) A pattern with very many capturing parentheses (when the internal frame
|
||||||
size was greater than the initial frame vector on the stack) caused a
|
size was greater than the initial frame vector on the stack) caused a
|
||||||
crash. A vector on the heap is now set up at the start of matching if the
|
crash. A vector on the heap is now set up at the start of matching if the
|
||||||
vector on the stack is not big enough to handle at least 10 frames.
|
vector on the stack is not big enough to handle at least 10 frames.
|
||||||
Fixes oss-fuzz issue 783.
|
Fixes oss-fuzz issue 783.
|
||||||
|
|
||||||
(c) Handling of (*VERB)s in recursions was wrong in some cases.
|
(c) Handling of (*VERB)s in recursions was wrong in some cases.
|
||||||
|
|
||||||
(d) Captures in negative assertions that were used as conditions were not
|
(d) Captures in negative assertions that were used as conditions were not
|
||||||
happening if the assertion matched via (*ACCEPT).
|
happening if the assertion matched via (*ACCEPT).
|
||||||
|
|
||||||
(e) Mark values were not being passed out of recursions.
|
(e) Mark values were not being passed out of recursions.
|
||||||
|
|
||||||
(f) Refactor some code in do_callout() to avoid picky compiler warnings about
|
(f) Refactor some code in do_callout() to avoid picky compiler warnings about
|
||||||
negative indices. Fixes oss-fuzz issue 1454.
|
negative indices. Fixes oss-fuzz issue 1454.
|
||||||
|
|
||||||
(g) Similarly refactor the way the variable length ovector is addressed for
|
(g) Similarly refactor the way the variable length ovector is addressed for
|
||||||
similar reasons. Fixes oss-fuzz issue 1465.
|
similar reasons. Fixes oss-fuzz issue 1465.
|
||||||
|
|
||||||
2. Now that pcre2_match() no longer uses recursive function calls (see above),
|
2. Now that pcre2_match() no longer uses recursive function calls (see above),
|
||||||
the "match limit recursion" value seems misnamed. It still exists, and limits
|
the "match limit recursion" value seems misnamed. It still exists, and limits
|
||||||
the depth of tree that is searched. To avoid future confusion, it has been
|
the depth of tree that is searched. To avoid future confusion, it has been
|
||||||
renamed as "depth limit" in all relevant places (--with-depth-limit,
|
renamed as "depth limit" in all relevant places (--with-depth-limit,
|
||||||
(*LIMIT_DEPTH), pcre2_set_depth_limit(), etc) but the old names are still
|
(*LIMIT_DEPTH), pcre2_set_depth_limit(), etc) but the old names are still
|
||||||
available for backwards compatibility.
|
available for backwards compatibility.
|
||||||
|
|
||||||
3. Hardened pcre2test so as to reduce the number of bugs reported by fuzzers:
|
3. Hardened pcre2test so as to reduce the number of bugs reported by fuzzers:
|
||||||
|
|
||||||
(a) Check for malloc failures when getting memory for the ovector (POSIX) or
|
(a) Check for malloc failures when getting memory for the ovector (POSIX) or
|
||||||
the match data block (non-POSIX).
|
the match data block (non-POSIX).
|
||||||
|
|
||||||
4. In the 32-bit library in non-UTF mode, an attempt to find a Unicode property
|
4. In the 32-bit library in non-UTF mode, an attempt to find a Unicode property
|
||||||
for a character with a code point greater than 0x10ffff (the Unicode maximum)
|
for a character with a code point greater than 0x10ffff (the Unicode maximum)
|
||||||
caused a crash.
|
caused a crash.
|
||||||
|
|
||||||
5. If a lookbehind assertion that contained a back reference to a group
|
5. If a lookbehind assertion that contained a back reference to a group
|
||||||
appearing later in the pattern was compiled with the PCRE2_ANCHORED option,
|
appearing later in the pattern was compiled with the PCRE2_ANCHORED option,
|
||||||
undefined actions (often a segmentation fault) could occur, depending on what
|
undefined actions (often a segmentation fault) could occur, depending on what
|
||||||
other options were set. An example assertion is (?<!\1(abc)) where the
|
other options were set. An example assertion is (?<!\1(abc)) where the
|
||||||
reference \1 precedes the group (abc). This fixes oss-fuzz issue 865.
|
reference \1 precedes the group (abc). This fixes oss-fuzz issue 865.
|
||||||
|
|
||||||
6. Added the PCRE2_INFO_FRAMESIZE item to pcre2_pattern_info() and arranged for
|
6. Added the PCRE2_INFO_FRAMESIZE item to pcre2_pattern_info() and arranged for
|
||||||
pcre2test to use it to output the frame size when the "framesize" modifier is
|
pcre2test to use it to output the frame size when the "framesize" modifier is
|
||||||
given.
|
given.
|
||||||
|
|
||||||
7. Reworked the recursive pattern matching in the JIT compiler to follow the
|
7. Reworked the recursive pattern matching in the JIT compiler to follow the
|
||||||
interpreter changes.
|
interpreter changes.
|
||||||
|
|
||||||
8. When the zero_terminate modifier was specified on a pcre2test subject line
|
8. When the zero_terminate modifier was specified on a pcre2test subject line
|
||||||
for global matching, unpredictable things could happen. For example, in UTF-8
|
for global matching, unpredictable things could happen. For example, in UTF-8
|
||||||
mode, the pattern //g,zero_terminate read random memory when matched against an
|
mode, the pattern //g,zero_terminate read random memory when matched against an
|
||||||
empty string with zero_terminate. This was a bug in pcre2test, not the library.
|
empty string with zero_terminate. This was a bug in pcre2test, not the library.
|
||||||
|
|
||||||
9. Moved some Windows-specific code in pcre2grep (introduced in 10.23/13) out
|
9. Moved some Windows-specific code in pcre2grep (introduced in 10.23/13) out
|
||||||
of the section that is compiled when Unix-style directory scanning is
|
of the section that is compiled when Unix-style directory scanning is
|
||||||
available, and into a new section that is always compiled for Windows.
|
available, and into a new section that is always compiled for Windows.
|
||||||
|
|
||||||
10. In pcre2test, explicitly close the file after an error during serialization
|
10. In pcre2test, explicitly close the file after an error during serialization
|
||||||
or deserialization (the "load" or "save" commands).
|
or deserialization (the "load" or "save" commands).
|
||||||
|
|
||||||
11. Fix memory leak in pcre2_serialize_decode() when the input is invalid.
|
11. Fix memory leak in pcre2_serialize_decode() when the input is invalid.
|
||||||
|
|
||||||
12. Fix potential NULL dereference in pcre2_callout_enumerate() if called with
|
12. Fix potential NULL dereference in pcre2_callout_enumerate() if called with
|
||||||
a NULL pattern pointer when Unicode support is available.
|
a NULL pattern pointer when Unicode support is available.
|
||||||
|
|
||||||
13. When the 32-bit library was being tested by pcre2test, error messages that
|
13. When the 32-bit library was being tested by pcre2test, error messages that
|
||||||
were longer than 64 code units could cause a buffer overflow. This was a bug in
|
were longer than 64 code units could cause a buffer overflow. This was a bug in
|
||||||
pcre2test.
|
pcre2test.
|
||||||
|
|
||||||
14. The alternative matching function, pcre2_dfa_match() misbehaved if it
|
14. The alternative matching function, pcre2_dfa_match() misbehaved if it
|
||||||
encountered a character class with a possessive repeat, for example [a-f]{3}+.
|
encountered a character class with a possessive repeat, for example [a-f]{3}+.
|
||||||
|
|
||||||
15. The depth (formerly recursion) limit now applies to DFA matching (as
|
15. The depth (formerly recursion) limit now applies to DFA matching (as
|
||||||
of 10.23/36); pcre2test has been upgraded so that \=find_limits works with DFA
|
of 10.23/36); pcre2test has been upgraded so that \=find_limits works with DFA
|
||||||
matching to find the minimum value for this limit.
|
matching to find the minimum value for this limit.
|
||||||
|
|
||||||
16. Since 10.21, if pcre2_match() was called with a null context, default
|
16. Since 10.21, if pcre2_match() was called with a null context, default
|
||||||
memory allocation functions were used instead of whatever was used when the
|
memory allocation functions were used instead of whatever was used when the
|
||||||
pattern was compiled.
|
pattern was compiled.
|
||||||
|
|
||||||
17. Changes to the pcre2test "memory" modifier on a subject line. These apply
|
17. Changes to the pcre2test "memory" modifier on a subject line. These apply
|
||||||
|
@ -108,33 +108,33 @@ only to pcre2_match():
|
||||||
|
|
||||||
(a) Warn if null_context is set on both pattern and subject, because the
|
(a) Warn if null_context is set on both pattern and subject, because the
|
||||||
memory details cannot then be shown.
|
memory details cannot then be shown.
|
||||||
|
|
||||||
(b) Remember (up to a certain number of) memory allocations and their
|
(b) Remember (up to a certain number of) memory allocations and their
|
||||||
lengths, and list only the lengths, so as to be system-independent.
|
lengths, and list only the lengths, so as to be system-independent.
|
||||||
(In practice, the new interpreter never has more than 2 blocks allocated
|
(In practice, the new interpreter never has more than 2 blocks allocated
|
||||||
simultaneously.)
|
simultaneously.)
|
||||||
|
|
||||||
18. Make pcre2test detect an error return from pcre2_get_error_message(), give
|
18. Make pcre2test detect an error return from pcre2_get_error_message(), give
|
||||||
a message, and abandon the run (this would have detected #13 above).
|
a message, and abandon the run (this would have detected #13 above).
|
||||||
|
|
||||||
19. Implemented PCRE2_ENDANCHORED.
|
19. Implemented PCRE2_ENDANCHORED.
|
||||||
|
|
||||||
20. Applied Jason Hood's patches (slightly modified) to pcre2grep, to implement
|
20. Applied Jason Hood's patches (slightly modified) to pcre2grep, to implement
|
||||||
the --output=text (-O) option and the inbuilt callout echo.
|
the --output=text (-O) option and the inbuilt callout echo.
|
||||||
|
|
||||||
21. Extend auto-anchoring etc. to ignore groups with a zero quantifier and
|
21. Extend auto-anchoring etc. to ignore groups with a zero quantifier and
|
||||||
single-branch conditions with a false condition (e.g. DEFINE) at the start of a
|
single-branch conditions with a false condition (e.g. DEFINE) at the start of a
|
||||||
branch. For example, /(?(DEFINE)...)^A/ and /(...){0}^B/ are now flagged as
|
branch. For example, /(?(DEFINE)...)^A/ and /(...){0}^B/ are now flagged as
|
||||||
anchored.
|
anchored.
|
||||||
|
|
||||||
22. Added an explicit limit on the amount of heap used by pcre2_match(), set by
|
22. Added an explicit limit on the amount of heap used by pcre2_match(), set by
|
||||||
pcre2_set_heap_limit() or (*LIMIT_HEAP=xxx). Upgraded pcre2test to show the
|
pcre2_set_heap_limit() or (*LIMIT_HEAP=xxx). Upgraded pcre2test to show the
|
||||||
heap limit along with other pattern information, and to find the minimum when
|
heap limit along with other pattern information, and to find the minimum when
|
||||||
the find_limits modifier is set.
|
the find_limits modifier is set.
|
||||||
|
|
||||||
23. Write to the last 8 bytes of the pcre2_real_code structure when a compiled
|
23. Write to the last 8 bytes of the pcre2_real_code structure when a compiled
|
||||||
pattern is set up so as to initialize any padding the compiler might have
|
pattern is set up so as to initialize any padding the compiler might have
|
||||||
included. This avoids valgrind warnings when a compiled pattern is copied, in
|
included. This avoids valgrind warnings when a compiled pattern is copied, in
|
||||||
particular when it is serialized.
|
particular when it is serialized.
|
||||||
|
|
||||||
24. Remove a redundant line of code left in accidentally a long time ago.
|
24. Remove a redundant line of code left in accidentally a long time ago.
|
||||||
|
@ -143,80 +143,80 @@ particular when it is serialized.
|
||||||
|
|
||||||
26. Correct an incorrect cast in pcre2_valid_utf.c
|
26. Correct an incorrect cast in pcre2_valid_utf.c
|
||||||
|
|
||||||
27. Update pcre2test, remove some unused code in pcre2_match(), and upgrade the
|
27. Update pcre2test, remove some unused code in pcre2_match(), and upgrade the
|
||||||
tests to improve coverage.
|
tests to improve coverage.
|
||||||
|
|
||||||
28. Some fixes/tidies as a result of looking at Coverity Scan output:
|
28. Some fixes/tidies as a result of looking at Coverity Scan output:
|
||||||
|
|
||||||
(a) Typo: ">" should be ">=" in opcode check in pcre2_auto_possess.c.
|
(a) Typo: ">" should be ">=" in opcode check in pcre2_auto_possess.c.
|
||||||
(b) Added some casts to avoid "suspicious implicit sign extension".
|
(b) Added some casts to avoid "suspicious implicit sign extension".
|
||||||
(c) Resource leaks in pcre2test in rare error cases.
|
(c) Resource leaks in pcre2test in rare error cases.
|
||||||
(d) Avoid warning for never-used case OP_TABLE_LENGTH which is just a fudge
|
(d) Avoid warning for never-used case OP_TABLE_LENGTH which is just a fudge
|
||||||
for checking at compile time that tables are the right size.
|
for checking at compile time that tables are the right size.
|
||||||
(e) Add missing "fall through" comment.
|
(e) Add missing "fall through" comment.
|
||||||
|
|
||||||
29. Implemented PCRE2_EXTENDED_MORE and related /xx and (?xx) features.
|
29. Implemented PCRE2_EXTENDED_MORE and related /xx and (?xx) features.
|
||||||
|
|
||||||
30. Implement (?n: for PCRE2_NO_AUTO_CAPTURE, because Perl now has this.
|
30. Implement (?n: for PCRE2_NO_AUTO_CAPTURE, because Perl now has this.
|
||||||
|
|
||||||
31. If more than one of "push", "pushcopy", or "pushtablescopy" was set in
|
31. If more than one of "push", "pushcopy", or "pushtablescopy" was set in
|
||||||
pcre2test, a crash could occur.
|
pcre2test, a crash could occur.
|
||||||
|
|
||||||
32. Make -bigstack in RunTest allocate a 64 MB stack (instead of 16 MB) so that
|
32. Make -bigstack in RunTest allocate a 64 MB stack (instead of 16 MB) so that
|
||||||
all the tests can run with clang's sanitizing options.
|
all the tests can run with clang's sanitizing options.
|
||||||
|
|
||||||
33. Implement extra compile options in the compile context and add the first
|
33. Implement extra compile options in the compile context and add the first
|
||||||
one: PCRE2_EXTRA_ALLOW_SURROGATE_ESCAPES.
|
one: PCRE2_EXTRA_ALLOW_SURROGATE_ESCAPES.
|
||||||
|
|
||||||
34. Implement newline type PCRE2_NEWLINE_NUL.
|
34. Implement newline type PCRE2_NEWLINE_NUL.
|
||||||
|
|
||||||
35. A lookbehind assertion that had a zero-length branch caused undefined
|
35. A lookbehind assertion that had a zero-length branch caused undefined
|
||||||
behaviour when processed by pcre2_dfa_match(). This is oss-fuzz issue 1859.
|
behaviour when processed by pcre2_dfa_match(). This is oss-fuzz issue 1859.
|
||||||
|
|
||||||
36. The match limit value now also applies to pcre2_dfa_match() as there are
|
36. The match limit value now also applies to pcre2_dfa_match() as there are
|
||||||
patterns that can use up a lot of resources without necessarily recursing very
|
patterns that can use up a lot of resources without necessarily recursing very
|
||||||
deeply. (Compare item 10.23/36.) This should fix oss-fuzz #1761.
|
deeply. (Compare item 10.23/36.) This should fix oss-fuzz #1761.
|
||||||
|
|
||||||
37. Implement PCRE2_EXTRA_BAD_ESCAPE_IS_LITERAL.
|
37. Implement PCRE2_EXTRA_BAD_ESCAPE_IS_LITERAL.
|
||||||
|
|
||||||
38. Fix returned offsets from regexec() when REG_STARTEND is used with a
|
38. Fix returned offsets from regexec() when REG_STARTEND is used with a
|
||||||
starting offset greater than zero.
|
starting offset greater than zero.
|
||||||
|
|
||||||
39. Implement REG_PEND (GNU extension) for the POSIX wrapper.
|
39. Implement REG_PEND (GNU extension) for the POSIX wrapper.
|
||||||
|
|
||||||
40. Implement the subject_literal modifier in pcre2test, and allow jitstack on
|
40. Implement the subject_literal modifier in pcre2test, and allow jitstack on
|
||||||
pattern lines.
|
pattern lines.
|
||||||
|
|
||||||
41. Implement PCRE2_LITERAL and use it to support REG_NOSPEC.
|
41. Implement PCRE2_LITERAL and use it to support REG_NOSPEC.
|
||||||
|
|
||||||
42. Implement PCRE2_EXTRA_MATCH_LINE and PCRE2_EXTRA_MATCH_WORD for the benefit
|
42. Implement PCRE2_EXTRA_MATCH_LINE and PCRE2_EXTRA_MATCH_WORD for the benefit
|
||||||
of pcre2grep.
|
of pcre2grep.
|
||||||
|
|
||||||
43. Re-implement pcre2grep's -F, -w, and -x options using PCRE2_LITERAL,
|
43. Re-implement pcre2grep's -F, -w, and -x options using PCRE2_LITERAL,
|
||||||
PCRE2_EXTRA_MATCH_WORD, and PCRE2_EXTRA_MATCH_LINE. This fixes two bugs:
|
PCRE2_EXTRA_MATCH_WORD, and PCRE2_EXTRA_MATCH_LINE. This fixes two bugs:
|
||||||
|
|
||||||
(a) The -F option did not work for fixed strings containing \E.
|
(a) The -F option did not work for fixed strings containing \E.
|
||||||
(b) The -w option did not work for patterns with multiple branches.
|
(b) The -w option did not work for patterns with multiple branches.
|
||||||
|
|
||||||
44. Added configuration options for the SELinux compatible execmem allocator in
|
44. Added configuration options for the SELinux compatible execmem allocator in
|
||||||
JIT.
|
JIT.
|
||||||
|
|
||||||
45. Increased the limit for searching for a "must be present" code unit in
|
45. Increased the limit for searching for a "must be present" code unit in
|
||||||
subjects from 1000 to 2000 for 8-bit searches, since they use memchr() and are
|
subjects from 1000 to 2000 for 8-bit searches, since they use memchr() and are
|
||||||
much faster.
|
much faster.
|
||||||
|
|
||||||
46. Arrange for anchored patterns to record and use "first code unit" data,
|
46. Arrange for anchored patterns to record and use "first code unit" data,
|
||||||
because this can give a fast "no match" without searching for a "required code
|
because this can give a fast "no match" without searching for a "required code
|
||||||
unit". Previously only non-anchored patterns did this.
|
unit". Previously only non-anchored patterns did this.
|
||||||
|
|
||||||
47. Upgraded the Unicode tables from Unicode 8.0.0 to Unicode 10.0.0.
|
47. Upgraded the Unicode tables from Unicode 8.0.0 to Unicode 10.0.0.
|
||||||
|
|
||||||
48. Add the callout_no_where modifier to pcre2test.
|
48. Add the callout_no_where modifier to pcre2test.
|
||||||
|
|
||||||
49. Update extended grapheme breaking rules to the latest set that are in
|
49. Update extended grapheme breaking rules to the latest set that are in
|
||||||
Unicode Standard Annex #29.
|
Unicode Standard Annex #29.
|
||||||
|
|
||||||
50. Added experimental foreign pattern conversion facilities
|
50. Added experimental foreign pattern conversion facilities
|
||||||
(pcre2_pattern_convert() and friends).
|
(pcre2_pattern_convert() and friends).
|
||||||
|
|
||||||
|
|
||||||
|
|
57
NEWS
|
@ -1,6 +1,63 @@
|
||||||
News about PCRE2 releases
|
News about PCRE2 releases
|
||||||
-------------------------
|
-------------------------
|
||||||
|
|
||||||
|
Version 10.30-RC1 18-July-2017
|
||||||
|
------------------------------
|
||||||
|
|
||||||
|
The full list of changes that includes bugfixes and tidies is, as always, in
|
||||||
|
ChangeLog. These are the most important new features:
|
||||||
|
|
||||||
|
1. The main interpreter, pcre2_match(), has been refactored into a new version
|
||||||
|
that does not use recursive function calls (and therefore the system stack) for
|
||||||
|
remembering backtracking positions. This makes --disable-stack-for-recursion a
|
||||||
|
NOOP. The new implementation allows backtracking into recursive group calls in
|
||||||
|
patterns, making it more compatible with Perl, and also fixes some other
|
||||||
|
previously hard-to-do issues. For patterns that have a lot of backtracking, the
|
||||||
|
heap is now used, and there is an explicit limit on the amount, settable by
|
||||||
|
pcre2_set_heap_limit() or (*LIMIT_HEAP=xxx). The "recursion limit" is retained,
|
||||||
|
but is renamed as "depth limit" (though the old names remain for
|
||||||
|
compatibility).
|
||||||
|
|
||||||
|
There is also a change in the way callouts from pcre2_match() are handled. The
|
||||||
|
offset_vector field in the callout block is no longer a pointer to the
|
||||||
|
actual ovector that was passed to the matching function in the match data
|
||||||
|
block. Instead it points to an internal ovector of a size large enough to hold
|
||||||
|
all possible captured substrings in the pattern.
|
||||||
|
|
||||||
|
2. The new option PCRE2_ENDANCHORED insists that a pattern match must end at
|
||||||
|
the end of the subject.
|
||||||
|
|
||||||
|
3. The new option PCRE2_EXTENDED_MORE implements Perl's /xx feature, and
|
||||||
|
pcre2test is upgraded to support it. Setting within the pattern by (?xx) is
|
||||||
|
also supported.
|
||||||
|
|
||||||
|
4. (?n) can be used to set PCRE2_NO_AUTO_CAPTURE, because Perl now has this.
|
||||||
|
|
||||||
|
5. Additional compile options in the compile context are now available, and the
|
||||||
|
first two are: PCRE2_EXTRA_ALLOW_SURROGATE_ESCAPES and
|
||||||
|
PCRE2_EXTRA_BAD_ESCAPE_IS_LITERAL.
|
||||||
|
|
||||||
|
6. The newline type PCRE2_NEWLINE_NUL is now available.
|
||||||
|
|
||||||
|
7. The match limit value now also applies to pcre2_dfa_match() as there are
|
||||||
|
patterns that can use up a lot of resources without necessarily recursing very
|
||||||
|
deeply.
|
||||||
|
|
||||||
|
8. The option REG_PEND (a GNU extension) is now available for the POSIX
|
||||||
|
wrapper. Also there is a new option PCRE2_LITERAL which is used to support
|
||||||
|
REG_NOSPEC.
|
||||||
|
|
||||||
|
9. PCRE2_EXTRA_MATCH_LINE and PCRE2_EXTRA_MATCH_WORD are implemented for the
|
||||||
|
benefit of pcre2grep, and pcre2grep's -F, -w, and -x options are re-implemented
|
||||||
|
using PCRE2_LITERAL, PCRE2_EXTRA_MATCH_WORD, and PCRE2_EXTRA_MATCH_LINE. This
|
||||||
|
is tidier and also fixes some bugs.
|
||||||
|
|
||||||
|
10. The Unicode tables are upgraded from Unicode 8.0.0 to Unicode 10.0.0.
|
||||||
|
|
||||||
|
11. There are some experimental functions for converting foreign patterns
|
||||||
|
(globs and POSIX patterns) into PCRE2 patterns.
|
||||||
|
|
||||||
|
|
||||||
Version 10.23 14-February-2017
|
Version 10.23 14-February-2017
|
||||||
------------------------------
|
------------------------------
|
||||||
|
|
||||||
|
|
|
@ -179,8 +179,8 @@ can skip ahead to the CMake section.
|
||||||
|
|
||||||
STACK SIZE IN WINDOWS ENVIRONMENTS
|
STACK SIZE IN WINDOWS ENVIRONMENTS
|
||||||
|
|
||||||
Prior to release 10.30 the default system stack size of 1 MB in some Windows
|
Prior to release 10.30 the default system stack size of 1 MB in some Windows
|
||||||
environments caused issues with some tests. This should no longer be the case
|
environments caused issues with some tests. This should no longer be the case
|
||||||
for 10.30 and later releases.
|
for 10.30 and later releases.
|
||||||
|
|
||||||
|
|
||||||
|
|
63
README
|
@ -173,7 +173,7 @@ library. They are also documented in the pcre2build man page.
|
||||||
architectures. If you try to enable it on an unsupported architecture, there
|
architectures. If you try to enable it on an unsupported architecture, there
|
||||||
will be a compile time error. If you are running under SELinux you may also
|
will be a compile time error. If you are running under SELinux you may also
|
||||||
want to add --enable-jit-sealloc, which enables the use of an execmem
|
want to add --enable-jit-sealloc, which enables the use of an execmem
|
||||||
allocator in JIT that is compatible with SELinux. This has no effect if JIT
|
allocator in JIT that is compatible with SELinux. This has no effect if JIT
|
||||||
is not enabled.
|
is not enabled.
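As an illustration of the option described above, a build for an SELinux host might be configured as follows. This is only a sketch: it assumes JIT support is switched on in the same command with --enable-jit (the usual PCRE2 option for that, which is not shown in this hunk), since --enable-jit-sealloc has no effect without it.

```shell
# Enable JIT support together with the SELinux-compatible execmem allocator.
./configure --enable-jit --enable-jit-sealloc
```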
|
||||||
|
|
||||||
. If you do not want to make use of the default support for UTF-8 Unicode
|
. If you do not want to make use of the default support for UTF-8 Unicode
|
||||||
|
@ -198,13 +198,14 @@ library. They are also documented in the pcre2build man page.
|
||||||
or starting a pattern with (*UCP).
|
or starting a pattern with (*UCP).
|
||||||
|
|
||||||
. You can build PCRE2 to recognize either CR or LF or the sequence CRLF, or any
|
. You can build PCRE2 to recognize either CR or LF or the sequence CRLF, or any
|
||||||
of the preceding, or any of the Unicode newline sequences, as indicating the
|
of the preceding, or any of the Unicode newline sequences, or the NUL (zero)
|
||||||
end of a line. Whatever you specify at build time is the default; the caller
|
character as indicating the end of a line. Whatever you specify at build time
|
||||||
of PCRE2 can change the selection at run time. The default newline indicator
|
is the default; the caller of PCRE2 can change the selection at run time. The
|
||||||
is a single LF character (the Unix standard). You can specify the default
|
default newline indicator is a single LF character (the Unix standard). You
|
||||||
newline indicator by adding --enable-newline-is-cr, --enable-newline-is-lf,
|
can specify the default newline indicator by adding --enable-newline-is-cr,
|
||||||
--enable-newline-is-crlf, --enable-newline-is-anycrlf, or
|
--enable-newline-is-lf, --enable-newline-is-crlf,
|
||||||
--enable-newline-is-any to the "configure" command, respectively.
|
--enable-newline-is-anycrlf, --enable-newline-is-any, or
|
||||||
|
--enable-newline-is-nul to the "configure" command, respectively.
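For example, a build that treats the NUL character as the default newline indicator could be configured as sketched below. Only the option named above is used; callers of PCRE2 can still override this default at run time.

```shell
# Make the NUL (zero) character the build-time default newline indicator.
./configure --enable-newline-is-nul
```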
|
||||||
|
|
||||||
. By default, the sequence \R in a pattern matches any Unicode line ending
|
. By default, the sequence \R in a pattern matches any Unicode line ending
|
||||||
sequence. This is independent of the option specifying what PCRE2 considers
|
sequence. This is independent of the option specifying what PCRE2 considers
|
||||||
|
@ -227,15 +228,15 @@ library. They are also documented in the pcre2build man page.
|
||||||
--with-parens-nest-limit=500
|
--with-parens-nest-limit=500
|
||||||
|
|
||||||
. PCRE2 has a counter that can be set to limit the amount of computing resource
|
. PCRE2 has a counter that can be set to limit the amount of computing resource
|
||||||
it uses when matching a pattern with the Perl-compatible matching function.
|
it uses when matching a pattern. If the limit is exceeded during a match, the
|
||||||
If the limit is exceeded during a match, the match fails. The default is ten
|
match fails. The default is ten million. You can change the default by
|
||||||
million. You can change the default by setting, for example,
|
setting, for example,
|
||||||
|
|
||||||
--with-match-limit=500000
|
--with-match-limit=500000
|
||||||
|
|
||||||
on the "configure" command. This is just the default; individual calls to
|
on the "configure" command. This is just the default; individual calls to
|
||||||
pcre2_match() can supply their own value. There is more discussion in the
|
pcre2_match() or pcre2_dfa_match() can supply their own value. There is more
|
||||||
pcre2api man page (search for pcre2_set_match_limit).
|
discussion in the pcre2api man page (search for pcre2_set_match_limit).
|
||||||
|
|
||||||
. There is a separate counter that limits the depth of nested backtracking
|
. There is a separate counter that limits the depth of nested backtracking
|
||||||
during a matching process, which indirectly limits the amount of heap memory
|
during a matching process, which indirectly limits the amount of heap memory
|
||||||
|
@ -246,15 +247,15 @@ library. They are also documented in the pcre2build man page.
|
||||||
|
|
||||||
There is more discussion in the pcre2api man page (search for
|
There is more discussion in the pcre2api man page (search for
|
||||||
pcre2_set_depth_limit).
|
pcre2_set_depth_limit).
|
||||||
|
|
||||||
. You can also set an explicit limit on the amount of heap memory used by
|
. You can also set an explicit limit on the amount of heap memory used by
|
||||||
the pcre2_match() interpreter:
|
the pcre2_match() interpreter:
|
||||||
|
|
||||||
--with-heap-limit=500
|
--with-heap-limit=500
|
||||||
|
|
||||||
The units are kilobytes. This limit does not apply when the JIT optimization
|
The units are kilobytes. This limit does not apply when the JIT optimization
|
||||||
(which has its own memory control features) is used. There is more discussion
|
(which has its own memory control features) is used. There is more discussion
|
||||||
on the pcre2api man page (search for pcre2_set_heap_limit).
|
on the pcre2api man page (search for pcre2_set_heap_limit).
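The limit options can be given together on one "configure" command. The sketch below simply reuses the example values quoted in this file; they are illustrative rather than recommended settings, and they set defaults that individual calls can still override.

```shell
# Default match limit of 500000 and a default heap limit of 500 kilobytes.
./configure --with-match-limit=500000 --with-heap-limit=500
```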
|
||||||
|
|
||||||
. In the 8-bit library, the default maximum compiled pattern size is around
|
. In the 8-bit library, the default maximum compiled pattern size is around
|
||||||
64K bytes. You can increase this by adding --with-link-size=3 to the
|
64K bytes. You can increase this by adding --with-link-size=3 to the
|
||||||
|
@ -659,9 +660,10 @@ with the perltest.sh script, and test 5 checking PCRE2-specific things.
|
||||||
Tests 6 and 7 check the pcre2_dfa_match() alternative matching function, in
|
Tests 6 and 7 check the pcre2_dfa_match() alternative matching function, in
|
||||||
non-UTF mode and UTF-mode with Unicode property support, respectively.
|
non-UTF mode and UTF-mode with Unicode property support, respectively.
|
||||||
|
|
||||||
Test 8 checks some internal offsets and code size features; it is run only when
|
Test 8 checks some internal offsets and code size features, but it is run only
|
||||||
the default "link size" of 2 is set (in other cases the sizes change) and when
|
when Unicode support is enabled. The output is different in 8-bit, 16-bit, and
|
||||||
Unicode support is enabled.
|
32-bit modes and for different link sizes, so there are different output files
|
||||||
|
for each mode and link size.
|
||||||
|
|
||||||
Tests 9 and 10 are run only in 8-bit mode, and tests 11 and 12 are run only in
|
Tests 9 and 10 are run only in 8-bit mode, and tests 11 and 12 are run only in
|
||||||
16-bit and 32-bit modes. These are tests that generate different output in
|
16-bit and 32-bit modes. These are tests that generate different output in
|
||||||
|
@ -671,7 +673,7 @@ Test 13 checks the handling of non-UTF characters greater than 255 by
|
||||||
pcre2_dfa_match() in 16-bit and 32-bit modes.
|
pcre2_dfa_match() in 16-bit and 32-bit modes.
|
||||||
|
|
||||||
Test 14 contains some special UTF and UCP tests that give different output for
|
Test 14 contains some special UTF and UCP tests that give different output for
|
||||||
the different widths.
|
different code unit widths.
|
||||||
|
|
||||||
Test 15 contains a number of tests that must not be run with JIT. They check,
|
Test 15 contains a number of tests that must not be run with JIT. They check,
|
||||||
among other non-JIT things, the match-limiting features of the interpretive
|
among other non-JIT things, the match-limiting features of the interpretive
|
||||||
|
@ -692,6 +694,9 @@ patterns to a file, and then reloading and checking them.
|
||||||
Tests 21 and 22 test \C support when the use of \C is not locked out, without
|
Tests 21 and 22 test \C support when the use of \C is not locked out, without
|
||||||
and with UTF support, respectively. Test 23 tests \C when it is locked out.
|
and with UTF support, respectively. Test 23 tests \C when it is locked out.
|
||||||
|
|
||||||
|
Tests 24 and 25 test the experimental pattern conversion functions, without and
|
||||||
|
with UTF support, respectively.
|
||||||
|
|
||||||
|
|
||||||
Character tables
|
Character tables
|
||||||
----------------
|
----------------
|
||||||
|
@ -710,7 +715,7 @@ specified for ./configure, a different version of pcre2_chartables.c is built
|
||||||
by the program dftables (compiled from dftables.c), which uses the ANSI C
|
by the program dftables (compiled from dftables.c), which uses the ANSI C
|
||||||
character handling functions such as isalnum(), isalpha(), isupper(),
|
character handling functions such as isalnum(), isalpha(), isupper(),
|
||||||
islower(), etc. to build the table sources. This means that the default C
|
islower(), etc. to build the table sources. This means that the default C
|
||||||
locale which is set for your system will control the contents of these default
|
locale that is set for your system will control the contents of these default
|
||||||
tables. You can change the default tables by editing pcre2_chartables.c and
|
tables. You can change the default tables by editing pcre2_chartables.c and
|
||||||
then re-building PCRE2. If you do this, you should take care to ensure that the
|
then re-building PCRE2. If you do this, you should take care to ensure that the
|
||||||
file does not get automatically re-generated. The best way to do this is to
|
file does not get automatically re-generated. The best way to do this is to
|
||||||
|
@ -765,6 +770,7 @@ The distribution should contain the files listed below.
|
||||||
src/pcre2_compile.c )
|
src/pcre2_compile.c )
|
||||||
src/pcre2_config.c )
|
src/pcre2_config.c )
|
||||||
src/pcre2_context.c )
|
src/pcre2_context.c )
|
||||||
|
src/pcre2_convert.c )
|
||||||
src/pcre2_dfa_match.c )
|
src/pcre2_dfa_match.c )
|
||||||
src/pcre2_error.c )
|
src/pcre2_error.c )
|
||||||
src/pcre2_find_bracket.c )
|
src/pcre2_find_bracket.c )
|
||||||
|
@ -804,7 +810,6 @@ The distribution should contain the files listed below.
|
||||||
src/pcre2demo.c simple demonstration of coding calls to PCRE2
|
src/pcre2demo.c simple demonstration of coding calls to PCRE2
|
||||||
src/pcre2grep.c source of a grep utility that uses PCRE2
|
src/pcre2grep.c source of a grep utility that uses PCRE2
|
||||||
src/pcre2test.c comprehensive test program
|
src/pcre2test.c comprehensive test program
|
||||||
src/pcre2_printint.c part of pcre2test
|
|
||||||
src/pcre2_jit_test.c JIT test program
|
src/pcre2_jit_test.c JIT test program
|
||||||
|
|
||||||
(C) Auxiliary files:
|
(C) Auxiliary files:
|
||||||
|
@ -869,12 +874,12 @@ The distribution should contain the files listed below.
|
||||||
|
|
||||||
(E) Auxiliary files for building PCRE2 "by hand"
|
(E) Auxiliary files for building PCRE2 "by hand"
|
||||||
|
|
||||||
pcre2.h.generic ) a version of the public PCRE2 header file
|
src/pcre2.h.generic ) a version of the public PCRE2 header file
|
||||||
) for use in non-"configure" environments
|
) for use in non-"configure" environments
|
||||||
config.h.generic ) a version of config.h for use in non-"configure"
|
src/config.h.generic ) a version of config.h for use in non-"configure"
|
||||||
) environments
|
) environments
|
||||||
|
|
||||||
Philip Hazel
|
Philip Hazel
|
||||||
Email local part: ph10
|
Email local part: ph10
|
||||||
Email domain: cam.ac.uk
|
Email domain: cam.ac.uk
|
||||||
Last updated: 17 June 2017
|
Last updated: 18 July 2017
|
||||||
|
|
6
RunTest
|
@ -830,7 +830,7 @@ for bmode in "$test8" "$test16" "$test32"; do
|
||||||
if [ $supportBSC -ne 0 ] ; then
|
if [ $supportBSC -ne 0 ] ; then
|
||||||
echo " Skipped because \C is not disabled"
|
echo " Skipped because \C is not disabled"
|
||||||
else
|
else
|
||||||
$sim $valgrind ${opt:+$vjs} ./pcre2test -q $setstack $bmode $opt $testdata/testinput23 testtry
|
$sim $valgrind ./pcre2test -q $setstack $bmode $testdata/testinput23 testtry
|
||||||
checkresult $? 23 ""
|
checkresult $? 23 ""
|
||||||
fi
|
fi
|
||||||
fi
|
fi
|
||||||
|
@ -839,7 +839,7 @@ for bmode in "$test8" "$test16" "$test32"; do
|
||||||
|
|
||||||
if [ "$do24" = yes ] ; then
|
if [ "$do24" = yes ] ; then
|
||||||
echo $title24
|
echo $title24
|
||||||
$sim $valgrind ${opt:+$vjs} ./pcre2test -q $setstack $bmode $opt $testdata/testinput24 testtry
|
$sim $valgrind ./pcre2test -q $setstack $bmode $testdata/testinput24 testtry
|
||||||
checkresult $? 24 ""
|
checkresult $? 24 ""
|
||||||
fi
|
fi
|
||||||
|
|
||||||
|
@ -850,7 +850,7 @@ for bmode in "$test8" "$test16" "$test32"; do
|
||||||
if [ $utf -eq 0 ] ; then
|
if [ $utf -eq 0 ] ; then
|
||||||
echo " Skipped because UTF-$bits support is not available"
|
echo " Skipped because UTF-$bits support is not available"
|
||||||
else
|
else
|
||||||
$sim $valgrind ${opt:+$vjs} ./pcre2test -q $setstack $bmode $opt $testdata/testinput25 testtry
|
$sim $valgrind ./pcre2test -q $setstack $bmode $testdata/testinput25 testtry
|
||||||
checkresult $? 25 ""
|
checkresult $? 25 ""
|
||||||
fi
|
fi
|
||||||
fi
|
fi
|
||||||
|
|
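Note on the RunTest change: the old command lines for tests 23, 24 and 25 used the expansion ${opt:+$vjs}, which produces the value of $vjs only when $opt is set and non-empty. The following is a minimal sketch of that behaviour, not a copy of RunTest; the value given to vjs and the helper name show_prefix are placeholders invented for the illustration.

```shell
# Placeholder value: in RunTest, $vjs is set elsewhere in the script.
vjs="--placeholder-valgrind-jit-arg"

# Hypothetical helper that prints what ${opt:+$vjs} expands to.
show_prefix() {
  opt=$1
  # ${opt:+$vjs} yields $vjs when $opt is non-empty, and nothing otherwise.
  echo "[${opt:+$vjs}]"
}

show_prefix ""       # prints []
show_prefix "-jit"   # prints [--placeholder-valgrind-jit-arg]
```

With the expansion and $opt removed, the new command lines for these three tests no longer vary with $opt.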
20
configure.ac
|
@ -10,17 +10,17 @@ dnl be defined as -RC2, for example. For real releases, it should be empty.
|
||||||
|
|
||||||
m4_define(pcre2_major, [10])
|
m4_define(pcre2_major, [10])
|
||||||
m4_define(pcre2_minor, [30])
|
m4_define(pcre2_minor, [30])
|
||||||
m4_define(pcre2_prerelease, [-DEV])
|
m4_define(pcre2_prerelease, [-RC1])
|
||||||
m4_define(pcre2_date, [2017-03-05])
|
m4_define(pcre2_date, [2017-07-18])
|
||||||
|
|
||||||
# NOTE: The CMakeLists.txt file searches for the above variables in the first
|
# NOTE: The CMakeLists.txt file searches for the above variables in the first
|
||||||
# 50 lines of this file. Please update that if the variables above are moved.
|
# 50 lines of this file. Please update that if the variables above are moved.
|
||||||
|
|
||||||
# Libtool shared library interface versions (current:revision:age)
|
# Libtool shared library interface versions (current:revision:age)
|
||||||
m4_define(libpcre2_8_version, [5:0:5])
|
m4_define(libpcre2_8_version, [6:0:6])
|
||||||
m4_define(libpcre2_16_version, [5:0:5])
|
m4_define(libpcre2_16_version, [6:0:6])
|
||||||
m4_define(libpcre2_32_version, [5:0:5])
|
m4_define(libpcre2_32_version, [6:0:6])
|
||||||
m4_define(libpcre2_posix_version, [1:1:0])
|
m4_define(libpcre2_posix_version, [2:0:0])
|
||||||
|
|
||||||
AC_PREREQ(2.57)
|
AC_PREREQ(2.57)
|
||||||
AC_INIT(PCRE2, pcre2_major.pcre2_minor[]pcre2_prerelease, , pcre2)
|
AC_INIT(PCRE2, pcre2_major.pcre2_minor[]pcre2_prerelease, , pcre2)
|
||||||
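Note on the libtool version bump above: the triples are current:revision:age. As a rough sketch, assuming the usual libtool convention on Linux/ELF systems (other platforms name libraries differently), the installed file suffix is (current - age).age.revision. The helper name libtool_suffix is hypothetical and is not part of configure.ac.

```shell
# Hypothetical helper: derive the Linux/ELF version suffix from a
# libtool "current:revision:age" triple.
libtool_suffix() {
  current=${1%%:*}      # text before the first colon
  rest=${1#*:}          # "revision:age"
  revision=${rest%%:*}
  age=${rest#*:}
  echo "$((current - age)).$age.$revision"
}

libtool_suffix 5:0:5   # old 8/16/32-bit value, prints 0.5.0
libtool_suffix 6:0:6   # new 8/16/32-bit value, prints 0.6.0
libtool_suffix 1:1:0   # old posix value, prints 1.0.1
libtool_suffix 2:0:0   # new posix value, prints 2.0.0
```

Under that assumption, the main libraries keep major number 0 (the age grows with current), while the posix wrapper's major number moves from 1 to 2.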
|
@ -277,7 +277,7 @@ AC_ARG_WITH(parens-nest-limit,
|
||||||
AC_ARG_WITH(heap-limit,
|
AC_ARG_WITH(heap-limit,
|
||||||
AS_HELP_STRING([--with-heap-limit=N],
|
AS_HELP_STRING([--with-heap-limit=N],
|
||||||
[default limit on heap memory (kilobytes, default=20000000)]),
|
[default limit on heap memory (kilobytes, default=20000000)]),
|
||||||
, with_heap_limit=20000000)
|
, with_heap_limit=20000000)
|
||||||
|
|
||||||
# Handle --with-match-limit=N
|
# Handle --with-match-limit=N
|
||||||
AC_ARG_WITH(match-limit,
|
AC_ARG_WITH(match-limit,
|
||||||
|
@ -301,7 +301,7 @@ AC_ARG_WITH(match-limit-depth,
|
||||||
|
|
||||||
AC_ARG_WITH(match-limit-recursion,,
|
AC_ARG_WITH(match-limit-recursion,,
|
||||||
, with_match_limit_recursion=UNSET)
|
, with_match_limit_recursion=UNSET)
|
||||||
|
|
||||||
# Handle --enable-valgrind
|
# Handle --enable-valgrind
|
||||||
AC_ARG_ENABLE(valgrind,
|
AC_ARG_ENABLE(valgrind,
|
||||||
AS_HELP_STRING([--enable-valgrind],
|
AS_HELP_STRING([--enable-valgrind],
|
||||||
|
@ -370,7 +370,7 @@ case "$enable_newline" in
|
||||||
crlf) ac_pcre2_newline_value=3 ;;
|
crlf) ac_pcre2_newline_value=3 ;;
|
||||||
any) ac_pcre2_newline_value=4 ;;
|
any) ac_pcre2_newline_value=4 ;;
|
||||||
anycrlf) ac_pcre2_newline_value=5 ;;
|
anycrlf) ac_pcre2_newline_value=5 ;;
|
||||||
nul) ac_pcre2_newline_value=6 ;;
|
nul) ac_pcre2_newline_value=6 ;;
|
||||||
*)
|
*)
|
||||||
AC_MSG_ERROR([invalid argument \"$enable_newline\" to --enable-newline option])
|
AC_MSG_ERROR([invalid argument \"$enable_newline\" to --enable-newline option])
|
||||||
;;
|
;;
|
||||||
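Note on the newline case statement above: it maps the --enable-newline argument to a numeric code. A standalone sketch of the same mapping follows. The values for crlf, any, anycrlf and nul are the ones visible in the hunk; the values for cr and lf are not shown there and are assumed here to be 1 and 2, matching the PCRE2_NEWLINE_CR and PCRE2_NEWLINE_LF constants. The function name newline_value is hypothetical.

```shell
# Hypothetical helper mirroring the case statement in configure.ac.
newline_value() {
  case "$1" in
    cr)      echo 1 ;;   # assumed, not shown in the hunk
    lf)      echo 2 ;;   # assumed, not shown in the hunk
    crlf)    echo 3 ;;
    any)     echo 4 ;;
    anycrlf) echo 5 ;;
    nul)     echo 6 ;;
    *)
      echo "invalid argument \"$1\" to --enable-newline option" >&2
      return 1
      ;;
  esac
}

newline_value nul       # prints 6
newline_value anycrlf   # prints 5
```

An unknown name prints a message on standard error and returns a non-zero status, which is the role AC_MSG_ERROR plays in the real script.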
|
@ -737,7 +737,7 @@ AC_DEFINE_UNQUOTED([MATCH_LIMIT_DEPTH], [$with_match_limit_depth], [
|
||||||
|
|
||||||
AC_DEFINE_UNQUOTED([HEAP_LIMIT], [$with_heap_limit], [
|
AC_DEFINE_UNQUOTED([HEAP_LIMIT], [$with_heap_limit], [
|
||||||
This limits the amount of memory that pcre2_match() may use while matching
|
This limits the amount of memory that pcre2_match() may use while matching
|
||||||
a pattern. The value is in kilobytes.])
|
a pattern. The value is in kilobytes.])
|
||||||
|
|
||||||
AC_DEFINE([MAX_NAME_SIZE], [32], [
|
AC_DEFINE([MAX_NAME_SIZE], [32], [
|
||||||
This limit is parameterized just in case anybody ever wants to
|
This limit is parameterized just in case anybody ever wants to
|
||||||
|
|
|
@ -179,8 +179,8 @@ can skip ahead to the CMake section.
|
||||||
|
|
||||||
STACK SIZE IN WINDOWS ENVIRONMENTS
|
STACK SIZE IN WINDOWS ENVIRONMENTS
|
||||||
|
|
||||||
Prior to release 10.30 the default system stack size of 1Mb in some Windows
|
Prior to release 10.30 the default system stack size of 1Mb in some Windows
|
||||||
environments caused issues with some tests. This should no longer be the case
|
environments caused issues with some tests. This should no longer be the case
|
||||||
for 10.30 and later releases.
|
for 10.30 and later releases.
|
||||||
|
|
||||||
|
|
||||||
|
|
|
@ -173,7 +173,7 @@ library. They are also documented in the pcre2build man page.
|
||||||
architectures. If you try to enable it on an unsupported architecture, there
|
architectures. If you try to enable it on an unsupported architecture, there
|
||||||
will be a compile time error. If you are running under SELinux you may also
|
will be a compile time error. If you are running under SELinux you may also
|
||||||
want to add --enable-jit-sealloc, which enables the use of an execmem
|
want to add --enable-jit-sealloc, which enables the use of an execmem
|
||||||
allocator in JIT that is compatible with SELinux. This has no effect if JIT
|
allocator in JIT that is compatible with SELinux. This has no effect if JIT
|
||||||
is not enabled.
|
is not enabled.
|
||||||
|
|
||||||
. If you do not want to make use of the default support for UTF-8 Unicode
|
. If you do not want to make use of the default support for UTF-8 Unicode
|
||||||
|
@ -198,13 +198,14 @@ library. They are also documented in the pcre2build man page.
|
||||||
or starting a pattern with (*UCP).
|
or starting a pattern with (*UCP).
|
||||||
|
|
||||||
. You can build PCRE2 to recognize either CR or LF or the sequence CRLF, or any
|
. You can build PCRE2 to recognize either CR or LF or the sequence CRLF, or any
|
||||||
of the preceding, or any of the Unicode newline sequences, as indicating the
|
of the preceding, or any of the Unicode newline sequences, or the NUL (zero)
|
||||||
end of a line. Whatever you specify at build time is the default; the caller
|
character as indicating the end of a line. Whatever you specify at build time
|
||||||
of PCRE2 can change the selection at run time. The default newline indicator
|
is the default; the caller of PCRE2 can change the selection at run time. The
|
||||||
is a single LF character (the Unix standard). You can specify the default
|
default newline indicator is a single LF character (the Unix standard). You
|
||||||
newline indicator by adding --enable-newline-is-cr, --enable-newline-is-lf,
|
can specify the default newline indicator by adding --enable-newline-is-cr,
|
||||||
--enable-newline-is-crlf, --enable-newline-is-anycrlf, or
|
--enable-newline-is-lf, --enable-newline-is-crlf,
|
||||||
--enable-newline-is-any to the "configure" command, respectively.
|
--enable-newline-is-anycrlf, --enable-newline-is-any, or
|
||||||
|
--enable-newline-is-nul to the "configure" command, respectively.
|
||||||
|
|
||||||
. By default, the sequence \R in a pattern matches any Unicode line ending
|
. By default, the sequence \R in a pattern matches any Unicode line ending
|
||||||
sequence. This is independent of the option specifying what PCRE2 considers
|
sequence. This is independent of the option specifying what PCRE2 considers
|
||||||
|
@ -227,15 +228,15 @@ library. They are also documented in the pcre2build man page.
|
||||||
--with-parens-nest-limit=500
|
--with-parens-nest-limit=500
|
||||||
|
|
||||||
. PCRE2 has a counter that can be set to limit the amount of computing resource
|
. PCRE2 has a counter that can be set to limit the amount of computing resource
|
||||||
it uses when matching a pattern with the Perl-compatible matching function.
|
it uses when matching a pattern. If the limit is exceeded during a match, the
|
||||||
If the limit is exceeded during a match, the match fails. The default is ten
|
match fails. The default is ten million. You can change the default by
|
||||||
million. You can change the default by setting, for example,
|
setting, for example,
|
||||||
|
|
||||||
--with-match-limit=500000
|
--with-match-limit=500000
|
||||||
|
|
||||||
on the "configure" command. This is just the default; individual calls to
|
on the "configure" command. This is just the default; individual calls to
|
||||||
pcre2_match() can supply their own value. There is more discussion in the
|
pcre2_match() or pcre2_dfa_match() can supply their own value. There is more
|
||||||
pcre2api man page (search for pcre2_set_match_limit).
|
discussion in the pcre2api man page (search for pcre2_set_match_limit).
|
||||||
|
|
||||||
. There is a separate counter that limits the depth of nested backtracking
|
. There is a separate counter that limits the depth of nested backtracking
|
||||||
during a matching process, which indirectly limits the amount of heap memory
|
during a matching process, which indirectly limits the amount of heap memory
|
||||||
|
@ -246,15 +247,15 @@ library. They are also documented in the pcre2build man page.
|
||||||
|
|
||||||
There is more discussion in the pcre2api man page (search for
|
There is more discussion in the pcre2api man page (search for
|
||||||
pcre2_set_depth_limit).
|
pcre2_set_depth_limit).
|
||||||
|
|
||||||
. You can also set an explicit limit on the amount of heap memory used by
|
. You can also set an explicit limit on the amount of heap memory used by
|
||||||
the pcre2_match() interpreter:
|
the pcre2_match() interpreter:
|
||||||
|
|
||||||
--with-heap-limit=500
|
--with-heap-limit=500
|
||||||
|
|
||||||
The units are kilobytes. This limit does not apply when the JIT optimization
|
The units are kilobytes. This limit does not apply when the JIT optimization
|
||||||
(which has its own memory control features) is used. There is more discussion
|
(which has its own memory control features) is used. There is more discussion
|
||||||
on the pcre2api man page (search for pcre2_set_heap_limit).
|
on the pcre2api man page (search for pcre2_set_heap_limit).
|
||||||
|
|
||||||
. In the 8-bit library, the default maximum compiled pattern size is around
|
. In the 8-bit library, the default maximum compiled pattern size is around
|
||||||
64K bytes. You can increase this by adding --with-link-size=3 to the
|
64K bytes. You can increase this by adding --with-link-size=3 to the
|
||||||
|
@ -659,9 +660,10 @@ with the perltest.sh script, and test 5 checking PCRE2-specific things.
|
||||||
Tests 6 and 7 check the pcre2_dfa_match() alternative matching function, in
|
Tests 6 and 7 check the pcre2_dfa_match() alternative matching function, in
|
||||||
non-UTF mode and UTF-mode with Unicode property support, respectively.
|
non-UTF mode and UTF-mode with Unicode property support, respectively.
|
||||||
|
|
||||||
Test 8 checks some internal offsets and code size features; it is run only when
|
Test 8 checks some internal offsets and code size features, but it is run only
|
||||||
the default "link size" of 2 is set (in other cases the sizes change) and when
|
when Unicode support is enabled. The output is different in 8-bit, 16-bit, and
|
||||||
Unicode support is enabled.
|
32-bit modes and for different link sizes, so there are different output files
|
||||||
|
for each mode and link size.
|
||||||
|
|
||||||
Tests 9 and 10 are run only in 8-bit mode, and tests 11 and 12 are run only in
|
Tests 9 and 10 are run only in 8-bit mode, and tests 11 and 12 are run only in
|
||||||
16-bit and 32-bit modes. These are tests that generate different output in
|
16-bit and 32-bit modes. These are tests that generate different output in
|
||||||
|
@ -671,7 +673,7 @@ Test 13 checks the handling of non-UTF characters greater than 255 by
|
||||||
pcre2_dfa_match() in 16-bit and 32-bit modes.
|
pcre2_dfa_match() in 16-bit and 32-bit modes.
|
||||||
|
|
||||||
Test 14 contains some special UTF and UCP tests that give different output for
|
Test 14 contains some special UTF and UCP tests that give different output for
|
||||||
the different widths.
|
different code unit widths.
|
||||||
|
|
||||||
Test 15 contains a number of tests that must not be run with JIT. They check,
|
Test 15 contains a number of tests that must not be run with JIT. They check,
|
||||||
among other non-JIT things, the match-limiting features of the interpretive
|
among other non-JIT things, the match-limiting features of the interpretive
|
||||||
|
@ -692,6 +694,9 @@ patterns to a file, and then reloading and checking them.
|
||||||
Tests 21 and 22 test \C support when the use of \C is not locked out, without
|
Tests 21 and 22 test \C support when the use of \C is not locked out, without
|
||||||
and with UTF support, respectively. Test 23 tests \C when it is locked out.
|
and with UTF support, respectively. Test 23 tests \C when it is locked out.
|
||||||
|
|
||||||
|
Tests 24 and 25 test the experimental pattern conversion functions, without and
|
||||||
|
with UTF support, respectively.
|
||||||
|
|
||||||
|
|
||||||
Character tables
|
Character tables
|
||||||
----------------
|
----------------
|
||||||
|
@ -710,7 +715,7 @@ specified for ./configure, a different version of pcre2_chartables.c is built
|
||||||
by the program dftables (compiled from dftables.c), which uses the ANSI C
|
by the program dftables (compiled from dftables.c), which uses the ANSI C
|
||||||
character handling functions such as isalnum(), isalpha(), isupper(),
|
character handling functions such as isalnum(), isalpha(), isupper(),
|
||||||
islower(), etc. to build the table sources. This means that the default C
|
islower(), etc. to build the table sources. This means that the default C
|
||||||
locale which is set for your system will control the contents of these default
|
locale that is set for your system will control the contents of these default
|
||||||
tables. You can change the default tables by editing pcre2_chartables.c and
|
tables. You can change the default tables by editing pcre2_chartables.c and
|
||||||
then re-building PCRE2. If you do this, you should take care to ensure that the
|
then re-building PCRE2. If you do this, you should take care to ensure that the
|
||||||
file does not get automatically re-generated. The best way to do this is to
|
file does not get automatically re-generated. The best way to do this is to
|
||||||
|
@ -765,6 +770,7 @@ The distribution should contain the files listed below.
|
||||||
src/pcre2_compile.c )
|
src/pcre2_compile.c )
|
||||||
src/pcre2_config.c )
|
src/pcre2_config.c )
|
||||||
src/pcre2_context.c )
|
src/pcre2_context.c )
|
||||||
|
src/pcre2_convert.c )
|
||||||
src/pcre2_dfa_match.c )
|
src/pcre2_dfa_match.c )
|
||||||
src/pcre2_error.c )
|
src/pcre2_error.c )
|
||||||
src/pcre2_find_bracket.c )
|
src/pcre2_find_bracket.c )
|
||||||
|
@ -804,7 +810,6 @@ The distribution should contain the files listed below.
|
||||||
src/pcre2demo.c simple demonstration of coding calls to PCRE2
|
src/pcre2demo.c simple demonstration of coding calls to PCRE2
|
||||||
src/pcre2grep.c source of a grep utility that uses PCRE2
|
src/pcre2grep.c source of a grep utility that uses PCRE2
|
||||||
src/pcre2test.c comprehensive test program
|
src/pcre2test.c comprehensive test program
|
||||||
src/pcre2_printint.c part of pcre2test
|
|
||||||
src/pcre2_jit_test.c JIT test program
|
src/pcre2_jit_test.c JIT test program
|
||||||
|
|
||||||
(C) Auxiliary files:
|
(C) Auxiliary files:
|
||||||
|
@ -869,12 +874,12 @@ The distribution should contain the files listed below.
|
||||||
|
|
||||||
(E) Auxiliary files for building PCRE2 "by hand"
|
(E) Auxiliary files for building PCRE2 "by hand"
|
||||||
|
|
||||||
pcre2.h.generic ) a version of the public PCRE2 header file
|
src/pcre2.h.generic ) a version of the public PCRE2 header file
|
||||||
) for use in non-"configure" environments
|
) for use in non-"configure" environments
|
||||||
config.h.generic ) a version of config.h for use in non-"configure"
|
src/config.h.generic ) a version of config.h for use in non-"configure"
|
||||||
) environments
|
) environments
|
||||||
|
|
||||||
Philip Hazel
|
Philip Hazel
|
||||||
Email local part: ph10
|
Email local part: ph10
|
||||||
Email domain: cam.ac.uk
|
Email domain: cam.ac.uk
|
||||||
Last updated: 17 June 2017
|
Last updated: 18 July 2017
|
||||||
|
|
|
@ -137,7 +137,7 @@ large search tree against a string that will never match. Nested unlimited
|
||||||
repeats in a pattern are a common example. PCRE2 provides some protection
|
repeats in a pattern are a common example. PCRE2 provides some protection
|
||||||
against this: see the <b>pcre2_set_match_limit()</b> function in the
|
against this: see the <b>pcre2_set_match_limit()</b> function in the
|
||||||
<a href="pcre2api.html"><b>pcre2api</b></a>
|
<a href="pcre2api.html"><b>pcre2api</b></a>
|
||||||
page. There is a similar function called <b>pcre2_set_depth_limit()</b> that can
|
page. There is a similar function called <b>pcre2_set_depth_limit()</b> that can
|
||||||
be used to restrict the amount of memory that is used.
|
be used to restrict the amount of memory that is used.
|
||||||
</P>
|
</P>
|
||||||
<br><a name="SEC3" href="#TOC1">USER DOCUMENTATION</a><br>
|
<br><a name="SEC3" href="#TOC1">USER DOCUMENTATION</a><br>
|
||||||
|
|
|
@ -26,8 +26,8 @@ DESCRIPTION
|
||||||
</b><br>
|
</b><br>
|
||||||
<P>
|
<P>
|
||||||
This function frees the memory used for a compiled pattern, including any
|
This function frees the memory used for a compiled pattern, including any
|
||||||
memory used by the JIT compiler. If the compiled pattern was created by a call
|
memory used by the JIT compiler. If the compiled pattern was created by a call
|
||||||
to <b>pcre2_code_copy_with_tables()</b>, the memory for the character tables is
|
to <b>pcre2_code_copy_with_tables()</b>, the memory for the character tables is
|
||||||
also freed.
|
also freed.
|
||||||
</P>
|
</P>
|
||||||
<P>
|
<P>
|
||||||
|
|
|
@ -64,7 +64,7 @@ The option bits are:
|
||||||
PCRE2_ENDANCHORED Pattern can match only at end of subject
|
PCRE2_ENDANCHORED Pattern can match only at end of subject
|
||||||
PCRE2_EXTENDED Ignore white space and # comments
|
PCRE2_EXTENDED Ignore white space and # comments
|
||||||
PCRE2_FIRSTLINE Force matching to be before newline
|
PCRE2_FIRSTLINE Force matching to be before newline
|
||||||
PCRE2_LITERAL Pattern characters are all literal
|
PCRE2_LITERAL Pattern characters are all literal
|
||||||
PCRE2_MATCH_UNSET_BACKREF Match unset back references
|
PCRE2_MATCH_UNSET_BACKREF Match unset back references
|
||||||
PCRE2_MULTILINE ^ and $ match newlines within data
|
PCRE2_MULTILINE ^ and $ match newlines within data
|
||||||
PCRE2_NEVER_BACKSLASH_C Lock out the use of \C in patterns
|
PCRE2_NEVER_BACKSLASH_C Lock out the use of \C in patterns
|
||||||
|
|
|
@ -45,7 +45,7 @@ point to a uint32_t integer variable. The available codes are:
|
||||||
PCRE2_CONFIG_BSR Indicates what \R matches by default:
|
PCRE2_CONFIG_BSR Indicates what \R matches by default:
|
||||||
PCRE2_BSR_UNICODE
|
PCRE2_BSR_UNICODE
|
||||||
PCRE2_BSR_ANYCRLF
|
PCRE2_BSR_ANYCRLF
|
||||||
PCRE2_CONFIG_HEAPLIMIT Default heap memory limit
|
PCRE2_CONFIG_HEAPLIMIT Default heap memory limit
|
||||||
PCRE2_CONFIG_DEPTHLIMIT Default backtracking depth limit
|
PCRE2_CONFIG_DEPTHLIMIT Default backtracking depth limit
|
||||||
PCRE2_CONFIG_JIT Availability of just-in-time compiler support (1=yes 0=no)
|
PCRE2_CONFIG_JIT Availability of just-in-time compiler support (1=yes 0=no)
|
||||||
PCRE2_CONFIG_JITTARGET Information (a string) about the target architecture for the JIT compiler
|
PCRE2_CONFIG_JITTARGET Information (a string) about the target architecture for the JIT compiler
|
||||||
|
@ -57,7 +57,7 @@ point to a uint32_t integer variable. The available codes are:
|
||||||
PCRE2_NEWLINE_CRLF
|
PCRE2_NEWLINE_CRLF
|
||||||
PCRE2_NEWLINE_ANY
|
PCRE2_NEWLINE_ANY
|
||||||
PCRE2_NEWLINE_ANYCRLF
|
PCRE2_NEWLINE_ANYCRLF
|
||||||
PCRE2_NEWLINE_NUL
|
PCRE2_NEWLINE_NUL
|
||||||
PCRE2_CONFIG_PARENSLIMIT Default parentheses nesting limit
|
PCRE2_CONFIG_PARENSLIMIT Default parentheses nesting limit
|
||||||
PCRE2_CONFIG_RECURSIONLIMIT Obsolete: use PCRE2_CONFIG_DEPTHLIMIT
|
PCRE2_CONFIG_RECURSIONLIMIT Obsolete: use PCRE2_CONFIG_DEPTHLIMIT
|
||||||
PCRE2_CONFIG_STACKRECURSE Obsolete: always returns 0
|
PCRE2_CONFIG_STACKRECURSE Obsolete: always returns 0
|
||||||
|
|
|
@ -26,8 +26,8 @@ DESCRIPTION
|
||||||
</b><br>
|
</b><br>
|
||||||
<P>
|
<P>
|
||||||
This function is part of an experimental set of pattern conversion functions.
|
This function is part of an experimental set of pattern conversion functions.
|
||||||
It frees the memory occupied by a converted pattern that was obtained by
|
It frees the memory occupied by a converted pattern that was obtained by
|
||||||
calling <b>pcre2_pattern_convert()</b> with arguments that caused it to place
|
calling <b>pcre2_pattern_convert()</b> with arguments that caused it to place
|
||||||
the converted pattern into newly obtained heap memory.
|
the converted pattern into newly obtained heap memory.
|
||||||
</P>
|
</P>
|
||||||
<P>
|
<P>
|
||||||
|
|
|
@ -25,7 +25,7 @@ SYNOPSIS
|
||||||
DESCRIPTION
|
DESCRIPTION
|
||||||
</b><br>
|
</b><br>
|
||||||
<P>
|
<P>
|
||||||
This function builds a set of character tables for character code points that
|
This function builds a set of character tables for character code points that
|
||||||
are less than 256. These can be passed to <b>pcre2_compile()</b> in a compile
|
are less than 256. These can be passed to <b>pcre2_compile()</b> in a compile
|
||||||
context in order to override the internal, built-in tables (which were either
|
context in order to override the internal, built-in tables (which were either
|
||||||
defaulted or made by <b>pcre2_maketables()</b> when PCRE2 was compiled). See the
|
defaulted or made by <b>pcre2_maketables()</b> when PCRE2 was compiled). See the
|
||||||
|
|
|
@ -43,14 +43,14 @@ offsets to captured substrings. Its arguments are:
|
||||||
A match context is needed only if you want to:
|
A match context is needed only if you want to:
|
||||||
<pre>
|
<pre>
|
||||||
Set up a callout function
|
Set up a callout function
|
||||||
Set a matching offset limit
|
Set a matching offset limit
|
||||||
Change the heap memory limit
|
Change the heap memory limit
|
||||||
Change the backtracking match limit
|
Change the backtracking match limit
|
||||||
Change the backtracking depth limit
|
Change the backtracking depth limit
|
||||||
Set custom memory management specifically for the match
|
Set custom memory management specifically for the match
|
||||||
</pre>
|
</pre>
|
||||||
The <i>length</i> and <i>startoffset</i> values are code
|
The <i>length</i> and <i>startoffset</i> values are code
|
||||||
units, not characters. The length may be given as PCRE2_ZERO_TERMINATE for a
|
units, not characters. The length may be given as PCRE2_ZERO_TERMINATE for a
|
||||||
subject that is terminated by a binary zero code unit. The options are:
|
subject that is terminated by a binary zero code unit. The options are:
|
||||||
<pre>
|
<pre>
|
||||||
PCRE2_ANCHORED Match only at the first position
|
PCRE2_ANCHORED Match only at the first position
|
||||||
|
@ -59,7 +59,7 @@ subject that is terminated by a binary zero code unit. The options are:
|
||||||
PCRE2_NOTEOL Subject string is not the end of a line
|
PCRE2_NOTEOL Subject string is not the end of a line
|
||||||
PCRE2_NOTEMPTY An empty string is not a valid match
|
PCRE2_NOTEMPTY An empty string is not a valid match
|
||||||
PCRE2_NOTEMPTY_ATSTART An empty string at the start of the subject is not a valid match
|
PCRE2_NOTEMPTY_ATSTART An empty string at the start of the subject is not a valid match
|
||||||
PCRE2_NO_JIT Do not use JIT matching
|
PCRE2_NO_JIT Do not use JIT matching
|
||||||
PCRE2_NO_UTF_CHECK Do not check the subject for UTF validity (only relevant if PCRE2_UTF
|
PCRE2_NO_UTF_CHECK Do not check the subject for UTF validity (only relevant if PCRE2_UTF
|
||||||
was set at compile time)
|
was set at compile time)
|
||||||
PCRE2_PARTIAL_HARD Return PCRE2_ERROR_PARTIAL for a partial match even if there is a full match
|
PCRE2_PARTIAL_HARD Return PCRE2_ERROR_PARTIAL for a partial match even if there is a full match
|
||||||
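Note on the "code units, not characters" wording in the pcre2_match hunk: the following is a minimal sketch for the 8-bit library only, assuming a well-formed, zero-terminated UTF-8 subject. The helper names `utf8_char_count` and `utf8_code_unit_count` are hypothetical and are not part of PCRE2. The sketch only shows that a length or start offset corresponds to the byte count, which can differ from the character count.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not a PCRE2 function): counts characters in a
   well-formed, zero-terminated UTF-8 string by skipping continuation
   bytes, which have the bit pattern 10xxxxxx. */
static size_t utf8_char_count(const char *s)
{
    size_t count = 0;
    assert(s != NULL);
    for (; *s != '\0'; s++) {
        if (((unsigned char)*s & 0xC0) != 0x80)
            count++;
    }
    return count;
}

/* Hypothetical helper: length in 8-bit code units, which is the unit in
   which lengths and offsets are expressed for the 8-bit library. */
static size_t utf8_code_unit_count(const char *s)
{
    assert(s != NULL);
    return strlen(s);
}
```

For a subject containing a two-byte character, the two counts differ by one, so an offset computed in characters would point at the wrong place.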
|
|
|
@ -48,7 +48,7 @@ request are as follows:
|
||||||
1 first code unit is set
|
1 first code unit is set
|
||||||
2 start of string or after newline
|
2 start of string or after newline
|
||||||
PCRE2_INFO_FIRSTCODEUNIT First code unit when type is 1
|
PCRE2_INFO_FIRSTCODEUNIT First code unit when type is 1
|
||||||
PCRE2_INFO_FRAMESIZE Size of backtracking frame
|
PCRE2_INFO_FRAMESIZE Size of backtracking frame
|
||||||
PCRE2_INFO_HASBACKSLASHC Return 1 if pattern contains \C
|
PCRE2_INFO_HASBACKSLASHC Return 1 if pattern contains \C
|
||||||
PCRE2_INFO_HASCRORLF Return 1 if explicit CR or LF matches exist in the pattern
|
PCRE2_INFO_HASCRORLF Return 1 if explicit CR or LF matches exist in the pattern
|
||||||
PCRE2_INFO_HEAPLIMIT Heap memory limit if set, otherwise PCRE2_ERROR_UNSET
|
PCRE2_INFO_HEAPLIMIT Heap memory limit if set, otherwise PCRE2_ERROR_UNSET
|
||||||
|
@ -71,7 +71,7 @@ request are as follows:
|
||||||
PCRE2_NEWLINE_CRLF
|
PCRE2_NEWLINE_CRLF
|
||||||
PCRE2_NEWLINE_ANY
|
PCRE2_NEWLINE_ANY
|
||||||
PCRE2_NEWLINE_ANYCRLF
|
PCRE2_NEWLINE_ANYCRLF
|
||||||
PCRE2_NEWLINE_NUL
|
PCRE2_NEWLINE_NUL
|
||||||
PCRE2_INFO_RECURSIONLIMIT Obsolete synonym for PCRE2_INFO_DEPTHLIMIT
|
PCRE2_INFO_RECURSIONLIMIT Obsolete synonym for PCRE2_INFO_DEPTHLIMIT
|
||||||
PCRE2_INFO_SIZE Size of compiled pattern
|
PCRE2_INFO_SIZE Size of compiled pattern
|
||||||
</pre>
|
</pre>
|
||||||
|
|
|
@ -35,7 +35,7 @@ matching patterns. The second argument must be one of:
|
||||||
PCRE2_NEWLINE_CRLF CR followed by LF only
|
PCRE2_NEWLINE_CRLF CR followed by LF only
|
||||||
PCRE2_NEWLINE_ANYCRLF Any of the above
|
PCRE2_NEWLINE_ANYCRLF Any of the above
|
||||||
PCRE2_NEWLINE_ANY Any Unicode newline sequence
|
PCRE2_NEWLINE_ANY Any Unicode newline sequence
|
||||||
PCRE2_NEWLINE_NUL The NUL character (binary zero)
|
PCRE2_NEWLINE_NUL The NUL character (binary zero)
|
||||||
</pre>
|
</pre>
|
||||||
The result is zero for success or PCRE2_ERROR_BADDATA if the second argument is
|
The result is zero for success or PCRE2_ERROR_BADDATA if the second argument is
|
||||||
invalid.
|
invalid.
|
||||||
|
|
|
@ -26,7 +26,7 @@ SYNOPSIS
|
||||||
DESCRIPTION
|
DESCRIPTION
|
||||||
</b><br>
|
</b><br>
|
||||||
<P>
|
<P>
|
||||||
This function is obsolete and should not be used in new code. Use
|
This function is obsolete and should not be used in new code. Use
|
||||||
<b>pcre2_set_depth_limit()</b> instead.
|
<b>pcre2_set_depth_limit()</b> instead.
|
||||||
</P>
|
</P>
|
||||||
<P>
|
<P>
|
||||||
|
|
|
@ -60,7 +60,7 @@ want to:
|
||||||
The <i>length</i>, <i>startoffset</i> and <i>rlength</i> values are code
|
The <i>length</i>, <i>startoffset</i> and <i>rlength</i> values are code
|
||||||
units, not characters, as is the contents of the variable pointed at by
|
units, not characters, as is the contents of the variable pointed at by
|
||||||
<i>outlengthptr</i>, which is updated to the actual length of the new string.
|
<i>outlengthptr</i>, which is updated to the actual length of the new string.
|
||||||
The subject and replacement lengths can be given as PCRE2_ZERO_TERMINATED for
|
The subject and replacement lengths can be given as PCRE2_ZERO_TERMINATED for
|
||||||
zero-terminated strings. The options are:
|
zero-terminated strings. The options are:
|
||||||
<pre>
|
<pre>
|
||||||
PCRE2_ANCHORED Match only at the first position
|
PCRE2_ANCHORED Match only at the first position
|
||||||
|
|
|
@ -87,10 +87,10 @@ Options that specify values have names that start with --with.
|
||||||
<br><a name="SEC3" href="#TOC1">BUILDING 8-BIT, 16-BIT AND 32-BIT LIBRARIES</a><br>
|
<br><a name="SEC3" href="#TOC1">BUILDING 8-BIT, 16-BIT AND 32-BIT LIBRARIES</a><br>
|
||||||
<P>
|
<P>
|
||||||
By default, a library called <b>libpcre2-8</b> is built, containing functions
|
By default, a library called <b>libpcre2-8</b> is built, containing functions
|
||||||
that take string arguments contained in vectors of bytes, interpreted either as
|
that take string arguments contained in arrays of bytes, interpreted either as
|
||||||
single-byte characters, or UTF-8 strings. You can also build two other
|
single-byte characters, or UTF-8 strings. You can also build two other
|
||||||
libraries, called <b>libpcre2-16</b> and <b>libpcre2-32</b>, which process
|
libraries, called <b>libpcre2-16</b> and <b>libpcre2-32</b>, which process
|
||||||
strings that are contained in vectors of 16-bit and 32-bit code units,
|
strings that are contained in arrays of 16-bit and 32-bit code units,
|
||||||
respectively. These can be interpreted either as single-unit characters or
|
respectively. These can be interpreted either as single-unit characters or
|
||||||
UTF-16/UTF-32 strings. To build these additional libraries, add one or both of
|
UTF-16/UTF-32 strings. To build these additional libraries, add one or both of
|
||||||
the following to the <b>configure</b> command:
|
the following to the <b>configure</b> command:
|
||||||
|
@ -208,19 +208,23 @@ to the <b>configure</b> command. There is a fourth option, specified by
|
||||||
--enable-newline-is-anycrlf
|
--enable-newline-is-anycrlf
|
||||||
</pre>
|
</pre>
|
||||||
which causes PCRE2 to recognize any of the three sequences CR, LF, or CRLF as
|
which causes PCRE2 to recognize any of the three sequences CR, LF, or CRLF as
|
||||||
indicating a line ending. Finally, a fifth option, specified by
|
indicating a line ending. A fifth option, specified by
|
||||||
<pre>
|
<pre>
|
||||||
--enable-newline-is-any
|
--enable-newline-is-any
|
||||||
</pre>
|
</pre>
|
||||||
causes PCRE2 to recognize any Unicode newline sequence. The Unicode newline
|
causes PCRE2 to recognize any Unicode newline sequence. The Unicode newline
|
||||||
sequences are the three just mentioned, plus the single characters VT (vertical
|
sequences are the three just mentioned, plus the single characters VT (vertical
|
||||||
tab, U+000B), FF (form feed, U+000C), NEL (next line, U+0085), LS (line
|
tab, U+000B), FF (form feed, U+000C), NEL (next line, U+0085), LS (line
|
||||||
separator, U+2028), and PS (paragraph separator, U+2029).
|
separator, U+2028), and PS (paragraph separator, U+2029). The final option is
|
||||||
|
<pre>
|
||||||
|
--enable-newline-is-nul
|
||||||
|
</pre>
|
||||||
|
which causes NUL (binary zero) to be set as the default line-ending character.
|
||||||
</P>
|
</P>
|
||||||
<P>
|
<P>
|
||||||
Whatever default line ending convention is selected when PCRE2 is built can be
|
Whatever default line ending convention is selected when PCRE2 is built can be
|
||||||
overridden by applications that use the library. At build time it is
|
overridden by applications that use the library. At build time it is
|
||||||
conventional to use the standard for your operating system.
|
recommended to use the standard for your operating system.
|
||||||
</P>
|
</P>
|
||||||
<br><a name="SEC9" href="#TOC1">WHAT \R MATCHES</a><br>
|
<br><a name="SEC9" href="#TOC1">WHAT \R MATCHES</a><br>
|
||||||
<P>
|
<P>
|
||||||
|
@ -301,7 +305,9 @@ because the size of each backtracking "frame" depends on the number of
|
||||||
capturing parentheses in a pattern, the amount of heap that is used before the
|
capturing parentheses in a pattern, the amount of heap that is used before the
|
||||||
limit is reached varies from pattern to pattern. This limit was more useful in
|
limit is reached varies from pattern to pattern. This limit was more useful in
|
||||||
versions before 10.30, where function recursion was used for backtracking.
|
versions before 10.30, where function recursion was used for backtracking.
|
||||||
However, as well as applying to <b>pcre2_match()</b>, this limit also controls
|
</P>
|
||||||
|
<P>
|
||||||
|
As well as applying to <b>pcre2_match()</b>, the depth limit also controls
|
||||||
the depth of recursive function calls in <b>pcre2_dfa_match()</b>. These are
|
the depth of recursive function calls in <b>pcre2_dfa_match()</b>. These are
|
||||||
used for lookaround assertions, atomic groups, and recursion within patterns.
|
used for lookaround assertions, atomic groups, and recursion within patterns.
|
||||||
The limit does not apply to JIT matching.
|
The limit does not apply to JIT matching.
|
||||||
|
@ -559,7 +565,7 @@ Cambridge, England.
|
||||||
</P>
|
</P>
|
||||||
<br><a name="SEC25" href="#TOC1">REVISION</a><br>
|
<br><a name="SEC25" href="#TOC1">REVISION</a><br>
|
||||||
<P>
|
<P>
|
||||||
Last updated: 17 June 2017
|
Last updated: 18 July 2017
|
||||||
<br>
|
<br>
|
||||||
Copyright © 1997-2017 University of Cambridge.
|
Copyright © 1997-2017 University of Cambridge.
|
||||||
<br>
|
<br>
|
||||||
|
|
|
@ -85,7 +85,7 @@ documentation for details.
|
||||||
<P>
|
<P>
|
||||||
8. Subroutine calls (whether recursive or not) were treated as atomic groups up
|
8. Subroutine calls (whether recursive or not) were treated as atomic groups up
|
||||||
to PCRE2 release 10.23, but from release 10.30 this changed, and backtracking
|
to PCRE2 release 10.23, but from release 10.30 this changed, and backtracking
|
||||||
into subroutine calls is now supported, as in Perl.
|
into subroutine calls is now supported, as in Perl.
|
||||||
</P>
|
</P>
|
||||||
<P>
|
<P>
|
||||||
9. If any of the backtracking control verbs are used in a subpattern that is
|
9. If any of the backtracking control verbs are used in a subpattern that is
|
||||||
|
|
|
@ -517,20 +517,20 @@ memory. There are three options that set resource limits for matching.
|
||||||
The <b>--match-limit</b> option provides a means of limiting computing resource
|
The <b>--match-limit</b> option provides a means of limiting computing resource
|
||||||
usage when processing patterns that are not going to match, but which have a
|
usage when processing patterns that are not going to match, but which have a
|
||||||
very large number of possibilities in their search trees. The classic example
|
very large number of possibilities in their search trees. The classic example
|
||||||
is a pattern that uses nested unlimited repeats. Internally, PCRE2 has a
|
is a pattern that uses nested unlimited repeats. Internally, PCRE2 has a
|
||||||
counter that is incremented each time around its main processing loop. If the
|
counter that is incremented each time around its main processing loop. If the
|
||||||
value set by <b>--match-limit</b> is reached, an error occurs.
|
value set by <b>--match-limit</b> is reached, an error occurs.
|
||||||
<br>
|
<br>
|
||||||
<br>
|
<br>
|
||||||
The <b>--heap-limit</b> option specifies, as a number of kilobytes, the amount
|
The <b>--heap-limit</b> option specifies, as a number of kilobytes, the amount
|
||||||
of heap memory that may be used for matching. Heap memory is needed only if
|
of heap memory that may be used for matching. Heap memory is needed only if
|
||||||
matching the pattern requires a significant number of nested backtracking
|
matching the pattern requires a significant number of nested backtracking
|
||||||
points to be remembered. This parameter can be set to zero to forbid the use of
|
points to be remembered. This parameter can be set to zero to forbid the use of
|
||||||
heap memory altogether.
|
heap memory altogether.
|
||||||
<br>
|
<br>
|
||||||
<br>
|
<br>
|
||||||
The <b>--depth-limit</b> option limits the depth of nested backtracking points,
|
The <b>--depth-limit</b> option limits the depth of nested backtracking points,
|
||||||
which indirectly limits the amount of memory that is used. The amount of memory
|
which indirectly limits the amount of memory that is used. The amount of memory
|
||||||
needed for each backtracking point depends on the number of capturing
|
needed for each backtracking point depends on the number of capturing
|
||||||
parentheses in the pattern, so the amount of memory that is used before this
|
parentheses in the pattern, so the amount of memory that is used before this
|
||||||
limit acts varies from pattern to pattern. This limit is of use only if it is
|
limit acts varies from pattern to pattern. This limit is of use only if it is
|
||||||
|
@ -538,7 +538,7 @@ set smaller than <b>--match-limit</b>.
|
||||||
<br>
|
<br>
|
||||||
<br>
|
<br>
|
||||||
There are no short forms for these options. The default settings are specified
|
There are no short forms for these options. The default settings are specified
|
||||||
when the PCRE2 library is compiled, with the default defaults being very large
|
when the PCRE2 library is compiled, with the default defaults being very large
|
||||||
and so effectively unlimited.
|
and so effectively unlimited.
|
||||||
</P>
|
</P>
|
||||||
<P>
|
<P>
|
||||||
|
@ -841,7 +841,7 @@ patterns are ignored by <b>pcre2grep</b>.
|
||||||
A callout in a PCRE2 pattern is of the form (?C<arg>) where the argument is
|
A callout in a PCRE2 pattern is of the form (?C<arg>) where the argument is
|
||||||
either a number or a quoted string (see the
|
either a number or a quoted string (see the
|
||||||
<a href="pcre2callout.html"><b>pcre2callout</b></a>
|
<a href="pcre2callout.html"><b>pcre2callout</b></a>
|
||||||
documentation for details). Numbered callouts are ignored by <b>pcre2grep</b>;
|
documentation for details). Numbered callouts are ignored by <b>pcre2grep</b>;
|
||||||
only callouts with string arguments are useful.
|
only callouts with string arguments are useful.
|
||||||
</P>
|
</P>
|
||||||
<br><b>
|
<br><b>
|
||||||
|
@ -892,10 +892,10 @@ Echoing a specific string
|
||||||
If the callout string starts with a pipe (vertical bar) character, the rest of
|
If the callout string starts with a pipe (vertical bar) character, the rest of
|
||||||
the string is written to the output, having been passed through the same escape
|
the string is written to the output, having been passed through the same escape
|
||||||
processing as text from the --output option. This provides a simple echoing
|
processing as text from the --output option. This provides a simple echoing
|
||||||
facility that avoids calling an external program or script. No terminator is
|
facility that avoids calling an external program or script. No terminator is
|
||||||
added to the string, so if you want a newline, you must include it explicitly.
|
added to the string, so if you want a newline, you must include it explicitly.
|
||||||
Matching continues normally after the string is output. If you want to see only
|
Matching continues normally after the string is output. If you want to see only
|
||||||
the callout output but not any output from an actual match, you should end the
|
the callout output but not any output from an actual match, you should end the
|
||||||
relevant pattern with (*FAIL).
|
relevant pattern with (*FAIL).
|
||||||
</P>
|
</P>
|
||||||
<br><a name="SEC11" href="#TOC1">MATCHING ERRORS</a><br>
|
<br><a name="SEC11" href="#TOC1">MATCHING ERRORS</a><br>
|
||||||
|
@ -910,8 +910,8 @@ there are more than 20 such errors, <b>pcre2grep</b> gives up.
|
||||||
</P>
|
</P>
|
||||||
<P>
|
<P>
|
||||||
The <b>--match-limit</b> option of <b>pcre2grep</b> can be used to set the
|
The <b>--match-limit</b> option of <b>pcre2grep</b> can be used to set the
|
||||||
overall resource limit. There are also other limits that affect the amount of
|
overall resource limit. There are also other limits that affect the amount of
|
||||||
memory used during matching; see the discussion of <b>--heap-limit</b> and
|
memory used during matching; see the discussion of <b>--heap-limit</b> and
|
||||||
<b>--depth-limit</b> above.
|
<b>--depth-limit</b> above.
|
||||||
</P>
|
</P>
|
||||||
<br><a name="SEC12" href="#TOC1">DIAGNOSTICS</a><br>
|
<br><a name="SEC12" href="#TOC1">DIAGNOSTICS</a><br>
|
||||||
|
|
|
@ -29,7 +29,7 @@ of them.
|
||||||
<br><a name="SEC2" href="#TOC1">COMPILED PATTERN MEMORY USAGE</a><br>
|
<br><a name="SEC2" href="#TOC1">COMPILED PATTERN MEMORY USAGE</a><br>
|
||||||
<P>
|
<P>
|
||||||
Patterns are compiled by PCRE2 into a reasonably efficient interpretive code,
|
Patterns are compiled by PCRE2 into a reasonably efficient interpretive code,
|
||||||
so that most simple patterns do not use much memory for storing the compiled
|
so that most simple patterns do not use much memory for storing the compiled
|
||||||
version. However, there is one case where the memory usage of a compiled
|
version. However, there is one case where the memory usage of a compiled
|
||||||
pattern can be unexpectedly large. If a parenthesized subpattern has a
|
pattern can be unexpectedly large. If a parenthesized subpattern has a
|
||||||
quantifier with a minimum greater than 1 and/or a limited maximum, the whole
|
quantifier with a minimum greater than 1 and/or a limited maximum, the whole
|
||||||
|
@ -91,7 +91,7 @@ vector is used. Rewriting patterns to be time-efficient, as described below,
|
||||||
may also reduce the memory requirements.
|
may also reduce the memory requirements.
|
||||||
</P>
|
</P>
|
||||||
<P>
|
<P>
|
||||||
In contrast to <b>pcre2_match()</b>, <b>pcre2_dfa_match()</b> does use recursive
|
In contrast to <b>pcre2_match()</b>, <b>pcre2_dfa_match()</b> does use recursive
|
||||||
function calls, but only for processing atomic groups, lookaround assertions,
|
function calls, but only for processing atomic groups, lookaround assertions,
|
||||||
and recursion within the pattern. Too much nested recursion may cause stack
|
and recursion within the pattern. Too much nested recursion may cause stack
|
||||||
issues. The "match depth" parameter can be used to limit the depth of function
|
issues. The "match depth" parameter can be used to limit the depth of function
|
||||||
|
@ -184,7 +184,7 @@ appreciable time with strings longer than about 20 characters.
|
||||||
</P>
|
</P>
|
||||||
<P>
|
<P>
|
||||||
In many cases, the solution to this kind of performance issue is to use an
|
In many cases, the solution to this kind of performance issue is to use an
|
||||||
atomic group or a possessive quantifier. This can often reduce memory
|
atomic group or a possessive quantifier. This can often reduce memory
|
||||||
requirements as well. As another example, consider this pattern:
|
requirements as well. As another example, consider this pattern:
|
||||||
<pre>
|
<pre>
|
||||||
([^<]|<(?!inet))+
|
([^<]|<(?!inet))+
|
||||||
|
@ -205,7 +205,7 @@ are "swallowed" in one item inside the parentheses, and a possessive quantifier
|
||||||
is used to stop any backtracking into the runs of non-"<" characters. This
|
is used to stop any backtracking into the runs of non-"<" characters. This
|
||||||
version also uses a lot less memory because entry to a new set of parentheses
|
version also uses a lot less memory because entry to a new set of parentheses
|
||||||
happens only when a "<" character that is not followed by "inet" is encountered
|
happens only when a "<" character that is not followed by "inet" is encountered
|
||||||
(and we assume this is relatively rare).
|
(and we assume this is relatively rare).
|
||||||
</P>
|
</P>
|
||||||
<P>
|
<P>
|
||||||
This example shows that one way of optimizing performance when matching long
|
This example shows that one way of optimizing performance when matching long
|
||||||
|
@ -216,10 +216,10 @@ than one character whenever possible.
|
||||||
SETTING RESOURCE LIMITS
|
SETTING RESOURCE LIMITS
|
||||||
</b><br>
|
</b><br>
|
||||||
<P>
|
<P>
|
||||||
You can set limits on the amount of processing that takes place when matching,
|
You can set limits on the amount of processing that takes place when matching,
|
||||||
and on the amount of heap memory that is used. The default values of the limits
|
and on the amount of heap memory that is used. The default values of the limits
|
||||||
are very large, and unlikely ever to operate. They can be changed when PCRE2 is
|
are very large, and unlikely ever to operate. They can be changed when PCRE2 is
|
||||||
built, and they can also be set when <b>pcre2_match()</b> or
|
built, and they can also be set when <b>pcre2_match()</b> or
|
||||||
<b>pcre2_dfa_match()</b> is called. For details of these interfaces, see the
|
<b>pcre2_dfa_match()</b> is called. For details of these interfaces, see the
|
||||||
<a href="pcre2build.html"><b>pcre2build</b></a>
|
<a href="pcre2build.html"><b>pcre2build</b></a>
|
||||||
documentation and the section entitled
|
documentation and the section entitled
|
||||||
|
|
|
@ -430,11 +430,11 @@ but some of them use Unicode properties if PCRE2_UCP is set. You can use
|
||||||
(?i) caseless
|
(?i) caseless
|
||||||
(?J) allow duplicate names
|
(?J) allow duplicate names
|
||||||
(?m) multiline
|
(?m) multiline
|
||||||
(?n) no auto capture
|
(?n) no auto capture
|
||||||
(?s) single line (dotall)
|
(?s) single line (dotall)
|
||||||
(?U) default ungreedy (lazy)
|
(?U) default ungreedy (lazy)
|
||||||
(?x) extended: ignore white space except in classes
|
(?x) extended: ignore white space except in classes
|
||||||
(?xx) as (?x) but also ignore space and tab in classes
|
(?xx) as (?x) but also ignore space and tab in classes
|
||||||
(?-...) unset option(s)
|
(?-...) unset option(s)
|
||||||
</pre>
|
</pre>
|
||||||
The following are recognized only at the very start of a pattern or after one
|
The following are recognized only at the very start of a pattern or after one
|
||||||
|
|
|
@ -130,7 +130,7 @@ against this: see the \fBpcre2_set_match_limit()\fP function in the
|
||||||
.\" HREF
|
.\" HREF
|
||||||
\fBpcre2api\fP
|
\fBpcre2api\fP
|
||||||
.\"
|
.\"
|
||||||
page. There is a similar function called \fBpcre2_set_depth_limit()\fP that can
|
page. There is a similar function called \fBpcre2_set_depth_limit()\fP that can
|
||||||
be used to restrict the amount of memory that is used.
|
be used to restrict the amount of memory that is used.
|
||||||
.
|
.
|
||||||
.
|
.
|
||||||
|
|
162
doc/pcre2.txt
162
doc/pcre2.txt
|
@ -170,8 +170,8 @@ REVISION
|
||||||
Last updated: 01 April 2017
|
Last updated: 01 April 2017
|
||||||
Copyright (c) 1997-2017 University of Cambridge.
|
Copyright (c) 1997-2017 University of Cambridge.
|
||||||
------------------------------------------------------------------------------
|
------------------------------------------------------------------------------
|
||||||
|
|
||||||
|
|
||||||
PCRE2API(3) Library Functions Manual PCRE2API(3)
|
PCRE2API(3) Library Functions Manual PCRE2API(3)
|
||||||
|
|
||||||
|
|
||||||
|
@ -3432,8 +3432,8 @@ REVISION
|
||||||
Last updated: 10 July 2017
|
Last updated: 10 July 2017
|
||||||
Copyright (c) 1997-2017 University of Cambridge.
|
Copyright (c) 1997-2017 University of Cambridge.
|
||||||
------------------------------------------------------------------------------
|
------------------------------------------------------------------------------
|
||||||
|
|
||||||
|
|
||||||
PCRE2BUILD(3) Library Functions Manual PCRE2BUILD(3)
|
PCRE2BUILD(3) Library Functions Manual PCRE2BUILD(3)
|
||||||
|
|
||||||
|
|
||||||
|
@ -3487,10 +3487,10 @@ PCRE2 BUILD-TIME OPTIONS
|
||||||
BUILDING 8-BIT, 16-BIT AND 32-BIT LIBRARIES
|
BUILDING 8-BIT, 16-BIT AND 32-BIT LIBRARIES
|
||||||
|
|
||||||
By default, a library called libpcre2-8 is built, containing functions
|
By default, a library called libpcre2-8 is built, containing functions
|
||||||
that take string arguments contained in vectors of bytes, interpreted
|
that take string arguments contained in arrays of bytes, interpreted
|
||||||
either as single-byte characters, or UTF-8 strings. You can also build
|
either as single-byte characters, or UTF-8 strings. You can also build
|
||||||
two other libraries, called libpcre2-16 and libpcre2-32, which process
|
two other libraries, called libpcre2-16 and libpcre2-32, which process
|
||||||
strings that are contained in vectors of 16-bit and 32-bit code units,
|
strings that are contained in arrays of 16-bit and 32-bit code units,
|
||||||
respectively. These can be interpreted either as single-unit characters
|
respectively. These can be interpreted either as single-unit characters
|
||||||
or UTF-16/UTF-32 strings. To build these additional libraries, add one
|
or UTF-16/UTF-32 strings. To build these additional libraries, add one
|
||||||
or both of the following to the configure command:
|
or both of the following to the configure command:
|
||||||
|
@ -3609,7 +3609,7 @@ NEWLINE RECOGNITION
|
||||||
--enable-newline-is-anycrlf
|
--enable-newline-is-anycrlf
|
||||||
|
|
||||||
which causes PCRE2 to recognize any of the three sequences CR, LF, or
|
which causes PCRE2 to recognize any of the three sequences CR, LF, or
|
||||||
CRLF as indicating a line ending. Finally, a fifth option, specified by
|
CRLF as indicating a line ending. A fifth option, specified by
|
||||||
|
|
||||||
--enable-newline-is-any
|
--enable-newline-is-any
|
||||||
|
|
||||||
|
@ -3617,97 +3617,103 @@ NEWLINE RECOGNITION
|
||||||
newline sequences are the three just mentioned, plus the single charac-
|
newline sequences are the three just mentioned, plus the single charac-
|
||||||
ters VT (vertical tab, U+000B), FF (form feed, U+000C), NEL (next line,
|
ters VT (vertical tab, U+000B), FF (form feed, U+000C), NEL (next line,
|
||||||
U+0085), LS (line separator, U+2028), and PS (paragraph separator,
|
U+0085), LS (line separator, U+2028), and PS (paragraph separator,
|
||||||
U+2029).
|
U+2029). The final option is
|
||||||
|
|
||||||
|
--enable-newline-is-nul
|
||||||
|
|
||||||
|
which causes NUL (binary zero) to be set as the default line-ending char-
|
||||||
|
acter.
|
||||||
|
|
||||||
Whatever default line ending convention is selected when PCRE2 is built
|
Whatever default line ending convention is selected when PCRE2 is built
|
||||||
can be overridden by applications that use the library. At build time
|
can be overridden by applications that use the library. At build time
|
||||||
it is conventional to use the standard for your operating system.
|
it is recommended to use the standard for your operating system.
|
||||||
|
|
||||||
|
|
||||||
WHAT \R MATCHES
|
WHAT \R MATCHES
|
||||||
|
|
||||||
By default, the sequence \R in a pattern matches any Unicode newline
|
By default, the sequence \R in a pattern matches any Unicode newline
|
||||||
sequence, independently of what has been selected as the line ending
|
sequence, independently of what has been selected as the line ending
|
||||||
sequence. If you specify
|
sequence. If you specify
|
||||||
|
|
||||||
--enable-bsr-anycrlf
|
--enable-bsr-anycrlf
|
||||||
|
|
||||||
the default is changed so that \R matches only CR, LF, or CRLF. What-
|
the default is changed so that \R matches only CR, LF, or CRLF. What-
|
||||||
ever is selected when PCRE2 is built can be overridden by applications
|
ever is selected when PCRE2 is built can be overridden by applications
|
||||||
that use the library.
|
that use the library.
|
||||||
|
|
||||||
|
|
||||||
HANDLING VERY LARGE PATTERNS
|
HANDLING VERY LARGE PATTERNS
|
||||||
|
|
||||||
Within a compiled pattern, offset values are used to point from one
|
Within a compiled pattern, offset values are used to point from one
|
||||||
part to another (for example, from an opening parenthesis to an alter-
|
part to another (for example, from an opening parenthesis to an alter-
|
||||||
nation metacharacter). By default, in the 8-bit and 16-bit libraries,
|
nation metacharacter). By default, in the 8-bit and 16-bit libraries,
|
||||||
two-byte values are used for these offsets, leading to a maximum size
|
two-byte values are used for these offsets, leading to a maximum size
|
||||||
for a compiled pattern of around 64K code units. This is sufficient to
|
for a compiled pattern of around 64K code units. This is sufficient to
|
||||||
handle all but the most gigantic patterns. Nevertheless, some people do
|
handle all but the most gigantic patterns. Nevertheless, some people do
|
||||||
want to process truly enormous patterns, so it is possible to compile
|
want to process truly enormous patterns, so it is possible to compile
|
||||||
PCRE2 to use three-byte or four-byte offsets by adding a setting such
|
PCRE2 to use three-byte or four-byte offsets by adding a setting such
|
||||||
as
|
as
|
||||||
|
|
||||||
--with-link-size=3
|
--with-link-size=3
|
||||||
|
|
||||||
to the configure command. The value given must be 2, 3, or 4. For the
|
to the configure command. The value given must be 2, 3, or 4. For the
|
||||||
16-bit library, a value of 3 is rounded up to 4. In these libraries,
|
16-bit library, a value of 3 is rounded up to 4. In these libraries,
|
||||||
using longer offsets slows down the operation of PCRE2 because it has
|
using longer offsets slows down the operation of PCRE2 because it has
|
||||||
to load additional data when handling them. For the 32-bit library the
|
to load additional data when handling them. For the 32-bit library the
|
||||||
value is always 4 and cannot be overridden; the value of --with-link-
|
value is always 4 and cannot be overridden; the value of --with-link-
|
||||||
size is ignored.
|
size is ignored.
|
||||||
|
|
||||||
|
|
||||||
LIMITING PCRE2 RESOURCE USAGE
|
LIMITING PCRE2 RESOURCE USAGE
|
||||||
|
|
||||||
The pcre2_match() function increments a counter each time it goes round
|
The pcre2_match() function increments a counter each time it goes round
|
||||||
its main loop. Putting a limit on this counter controls the amount of
|
its main loop. Putting a limit on this counter controls the amount of
|
||||||
computing resource used by a single call to pcre2_match(). The limit
|
computing resource used by a single call to pcre2_match(). The limit
|
||||||
can be changed at run time, as described in the pcre2api documentation.
|
can be changed at run time, as described in the pcre2api documentation.
|
||||||
The default is 10 million, but this can be changed by adding a setting
|
The default is 10 million, but this can be changed by adding a setting
|
||||||
such as
|
such as
|
||||||
|
|
||||||
--with-match-limit=500000
|
--with-match-limit=500000
|
||||||
|
|
||||||
to the configure command. This setting also applies to the
|
to the configure command. This setting also applies to the
|
||||||
pcre2_dfa_match() matching function, and to JIT matching (though the
|
pcre2_dfa_match() matching function, and to JIT matching (though the
|
||||||
counting is done differently).
|
counting is done differently).
|
||||||
|
|
||||||
The pcre2_match() function starts out using a 20K vector on the system
|
The pcre2_match() function starts out using a 20K vector on the system
|
||||||
stack to record backtracking points. The more nested backtracking
|
stack to record backtracking points. The more nested backtracking
|
||||||
points there are (that is, the deeper the search tree), the more memory
|
points there are (that is, the deeper the search tree), the more memory
|
||||||
is needed. If the initial vector is not large enough, heap memory is
|
is needed. If the initial vector is not large enough, heap memory is
|
||||||
used, up to a certain limit, which is specified in kilobytes. The limit
|
used, up to a certain limit, which is specified in kilobytes. The limit
|
||||||
can be changed at run time, as described in the pcre2api documentation.
|
can be changed at run time, as described in the pcre2api documentation.
|
||||||
The default limit (in effect unlimited) is 20 million. You can change
|
The default limit (in effect unlimited) is 20 million. You can change
|
||||||
this by a setting such as
|
this by a setting such as
|
||||||
|
|
||||||
--with-heap-limit=500
|
--with-heap-limit=500
|
||||||
|
|
||||||
which limits the amount of heap to 500 kilobytes. This limit applies
|
which limits the amount of heap to 500 kilobytes. This limit applies
|
||||||
only to interpretive matching in pcre2_match(). It does not apply when
|
only to interpretive matching in pcre2_match(). It does not apply when
|
||||||
JIT (which has its own memory arrangements) is used, nor does it apply
|
JIT (which has its own memory arrangements) is used, nor does it apply
|
||||||
to pcre2_dfa_match().
|
to pcre2_dfa_match().
|
||||||
|
|
||||||
You can also explicitly limit the depth of nested backtracking in the
|
You can also explicitly limit the depth of nested backtracking in the
|
||||||
pcre2_match() interpreter. This limit defaults to the value that is set
|
pcre2_match() interpreter. This limit defaults to the value that is set
|
||||||
for --with-match-limit. You can set a lower default limit by adding,
|
for --with-match-limit. You can set a lower default limit by adding,
|
||||||
for example,
|
for example,
|
||||||
|
|
||||||
--with-match-limit-depth=10000
|
--with-match-limit-depth=10000
|
||||||
|
|
||||||
to the configure command. This value can be overridden at run time.
|
to the configure command. This value can be overridden at run time.
|
||||||
This depth limit indirectly limits the amount of heap memory that is
|
This depth limit indirectly limits the amount of heap memory that is
|
||||||
used, but because the size of each backtracking "frame" depends on the
|
used, but because the size of each backtracking "frame" depends on the
|
||||||
number of capturing parentheses in a pattern, the amount of heap that
|
number of capturing parentheses in a pattern, the amount of heap that
|
||||||
is used before the limit is reached varies from pattern to pattern.
|
is used before the limit is reached varies from pattern to pattern.
|
||||||
This limit was more useful in versions before 10.30, where function
|
This limit was more useful in versions before 10.30, where function
|
||||||
recursion was used for backtracking. However, as well as applying to
|
recursion was used for backtracking.
|
||||||
pcre2_match(), this limit also controls the depth of recursive function
|
|
||||||
calls in pcre2_dfa_match(). These are used for lookaround assertions,
|
As well as applying to pcre2_match(), the depth limit also controls the
|
||||||
atomic groups, and recursion within patterns. The limit does not apply
|
depth of recursive function calls in pcre2_dfa_match(). These are used
|
||||||
to JIT matching.
|
for lookaround assertions, atomic groups, and recursion within pat-
|
||||||
|
terns. The limit does not apply to JIT matching.
|
||||||
|
|
||||||
|
|
||||||
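The relationship between the heap limit, the depth limit, and the frame size described above can be sketched with some rough arithmetic. This is only an illustrative model, not PCRE2 code: the helper names are hypothetical, the frame sizes used in the note below are made-up values (a real one can be read with PCRE2_INFO_FRAMESIZE), and the model ignores the initial 20K stack vector and any allocation overhead.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: a rough upper bound on how many backtracking
   frames of frame_size bytes fit into a heap limit given in kilobytes. */
uint64_t frames_within_heap_limit(uint32_t heap_limit_kb, size_t frame_size)
{
    if (frame_size == 0) return 0;   /* avoid division by zero */
    return ((uint64_t)heap_limit_kb * 1024u) / frame_size;
}

/* Hypothetical helper: kilobytes of memory (rounded up) that `depth`
   nested frames of frame_size bytes would occupy. */
uint64_t heap_kb_for_depth(uint32_t depth, size_t frame_size)
{
    uint64_t bytes = (uint64_t)depth * frame_size;
    return (bytes + 1023u) / 1024u;
}
```

Under this model, a 500-kilobyte heap limit holds about 4000 frames of 128 bytes but only about 2000 frames of 256 bytes, which is why the same depth limit corresponds to different amounts of memory for different patterns.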
CREATING CHARACTER TABLES AT BUILD TIME
|
CREATING CHARACTER TABLES AT BUILD TIME
|
||||||
|
@ -3969,11 +3975,11 @@ AUTHOR
|
||||||
|
|
||||||
REVISION
|
REVISION
|
||||||
|
|
||||||
Last updated: 17 June 2017
|
Last updated: 18 July 2017
|
||||||
Copyright (c) 1997-2017 University of Cambridge.
|
Copyright (c) 1997-2017 University of Cambridge.
|
||||||
------------------------------------------------------------------------------
|
------------------------------------------------------------------------------
|
||||||
|
|
||||||
|
|
||||||
PCRE2CALLOUT(3) Library Functions Manual PCRE2CALLOUT(3)
|
PCRE2CALLOUT(3) Library Functions Manual PCRE2CALLOUT(3)
|
||||||
|
|
||||||
|
|
||||||
|
@ -4366,8 +4372,8 @@ REVISION
|
||||||
Last updated: 14 April 2017
|
Last updated: 14 April 2017
|
||||||
Copyright (c) 1997-2017 University of Cambridge.
|
Copyright (c) 1997-2017 University of Cambridge.
|
||||||
------------------------------------------------------------------------------
|
------------------------------------------------------------------------------
|
||||||
|
|
||||||
|
|
||||||
PCRE2COMPAT(3) Library Functions Manual PCRE2COMPAT(3)
|
PCRE2COMPAT(3) Library Functions Manual PCRE2COMPAT(3)
|
||||||
|
|
||||||
|
|
||||||
|
@ -4564,8 +4570,8 @@ REVISION
|
||||||
Last updated: 18 April 2017
|
Last updated: 18 April 2017
|
||||||
Copyright (c) 1997-2017 University of Cambridge.
|
Copyright (c) 1997-2017 University of Cambridge.
|
||||||
------------------------------------------------------------------------------
|
------------------------------------------------------------------------------
|
||||||
|
|
||||||
|
|
||||||
PCRE2JIT(3) Library Functions Manual PCRE2JIT(3)
|
PCRE2JIT(3) Library Functions Manual PCRE2JIT(3)
|
||||||
|
|
||||||
|
|
||||||
|
@ -4958,8 +4964,8 @@ REVISION
|
||||||
Last updated: 31 March 2017
|
Last updated: 31 March 2017
|
||||||
Copyright (c) 1997-2017 University of Cambridge.
|
Copyright (c) 1997-2017 University of Cambridge.
|
||||||
------------------------------------------------------------------------------
|
------------------------------------------------------------------------------
|
||||||
|
|
||||||
|
|
||||||
PCRE2LIMITS(3) Library Functions Manual PCRE2LIMITS(3)
|
PCRE2LIMITS(3) Library Functions Manual PCRE2LIMITS(3)
|
||||||
|
|
||||||
|
|
||||||
|
@ -5029,8 +5035,8 @@ REVISION
|
||||||
Last updated: 30 March 2017
|
Last updated: 30 March 2017
|
||||||
Copyright (c) 1997-2017 University of Cambridge.
|
Copyright (c) 1997-2017 University of Cambridge.
|
||||||
------------------------------------------------------------------------------
|
------------------------------------------------------------------------------
|
||||||
|
|
||||||
|
|
||||||
PCRE2MATCHING(3) Library Functions Manual PCRE2MATCHING(3)
|
PCRE2MATCHING(3) Library Functions Manual PCRE2MATCHING(3)
|
||||||
|
|
||||||
|
|
||||||
|
@ -5248,8 +5254,8 @@ REVISION
|
||||||
Last updated: 29 September 2014
|
Last updated: 29 September 2014
|
||||||
Copyright (c) 1997-2014 University of Cambridge.
|
Copyright (c) 1997-2014 University of Cambridge.
|
||||||
------------------------------------------------------------------------------
|
------------------------------------------------------------------------------
|
||||||
|
|
||||||
|
|
||||||
PCRE2PARTIAL(3) Library Functions Manual PCRE2PARTIAL(3)
|
PCRE2PARTIAL(3) Library Functions Manual PCRE2PARTIAL(3)
|
||||||
|
|
||||||
|
|
||||||
|
@ -5688,8 +5694,8 @@ REVISION
|
||||||
Last updated: 22 December 2014
|
Last updated: 22 December 2014
|
||||||
Copyright (c) 1997-2014 University of Cambridge.
|
Copyright (c) 1997-2014 University of Cambridge.
|
||||||
------------------------------------------------------------------------------
|
------------------------------------------------------------------------------
|
||||||
|
|
||||||
|
|
||||||
PCRE2PATTERN(3) Library Functions Manual PCRE2PATTERN(3)
|
PCRE2PATTERN(3) Library Functions Manual PCRE2PATTERN(3)
|
||||||
|
|
||||||
|
|
||||||
|
@ -8790,8 +8796,8 @@ REVISION
|
||||||
Last updated: 05 July 2017
|
Last updated: 05 July 2017
|
||||||
Copyright (c) 1997-2017 University of Cambridge.
|
Copyright (c) 1997-2017 University of Cambridge.
|
||||||
------------------------------------------------------------------------------
|
------------------------------------------------------------------------------
|
||||||
|
|
||||||
|
|
||||||
PCRE2PERFORM(3) Library Functions Manual PCRE2PERFORM(3)
|
PCRE2PERFORM(3) Library Functions Manual PCRE2PERFORM(3)
|
||||||
|
|
||||||
|
|
||||||
|
@ -9018,8 +9024,8 @@ REVISION
|
||||||
Last updated: 08 April 2017
|
Last updated: 08 April 2017
|
||||||
Copyright (c) 1997-2017 University of Cambridge.
|
Copyright (c) 1997-2017 University of Cambridge.
|
||||||
------------------------------------------------------------------------------
|
------------------------------------------------------------------------------
|
||||||
|
|
||||||
|
|
||||||
PCRE2POSIX(3) Library Functions Manual PCRE2POSIX(3)
|
PCRE2POSIX(3) Library Functions Manual PCRE2POSIX(3)
|
||||||
|
|
||||||
|
|
||||||
|
@ -9326,8 +9332,8 @@ REVISION
|
||||||
Last updated: 15 June 2017
|
Last updated: 15 June 2017
|
||||||
Copyright (c) 1997-2017 University of Cambridge.
|
Copyright (c) 1997-2017 University of Cambridge.
|
||||||
------------------------------------------------------------------------------
|
------------------------------------------------------------------------------
|
||||||
|
|
||||||
|
|
||||||
PCRE2SAMPLE(3) Library Functions Manual PCRE2SAMPLE(3)
|
PCRE2SAMPLE(3) Library Functions Manual PCRE2SAMPLE(3)
|
||||||
|
|
||||||
|
|
||||||
|
@ -9595,8 +9601,8 @@ REVISION
|
||||||
Last updated: 21 March 2017
|
Last updated: 21 March 2017
|
||||||
Copyright (c) 1997-2017 University of Cambridge.
|
Copyright (c) 1997-2017 University of Cambridge.
|
||||||
------------------------------------------------------------------------------
|
------------------------------------------------------------------------------
|
||||||
|
|
||||||
|
|
||||||
PCRE2SYNTAX(3) Library Functions Manual PCRE2SYNTAX(3)
|
PCRE2SYNTAX(3) Library Functions Manual PCRE2SYNTAX(3)
|
||||||
|
|
||||||
|
|
||||||
|
@ -10043,8 +10049,8 @@ REVISION
|
||||||
Last updated: 17 June 2017
|
Last updated: 17 June 2017
|
||||||
Copyright (c) 1997-2017 University of Cambridge.
|
Copyright (c) 1997-2017 University of Cambridge.
|
||||||
------------------------------------------------------------------------------
|
------------------------------------------------------------------------------
|
||||||
|
|
||||||
|
|
||||||
PCRE2UNICODE(3) Library Functions Manual PCRE2UNICODE(3)
|
PCRE2UNICODE(3) Library Functions Manual PCRE2UNICODE(3)
|
||||||
|
|
||||||
|
|
||||||
|
@ -10300,5 +10306,5 @@ REVISION
|
||||||
Last updated: 17 May 2017
|
Last updated: 17 May 2017
|
||||||
Copyright (c) 1997-2017 University of Cambridge.
|
Copyright (c) 1997-2017 University of Cambridge.
|
||||||
------------------------------------------------------------------------------
|
------------------------------------------------------------------------------
|
||||||
|
|
||||||
|
|
||||||
|
|
|
@ -14,8 +14,8 @@ PCRE2 - Perl-compatible regular expressions (revised API)
|
||||||
.rs
|
.rs
|
||||||
.sp
|
.sp
|
||||||
This function frees the memory used for a compiled pattern, including any
|
This function frees the memory used for a compiled pattern, including any
|
||||||
memory used by the JIT compiler. If the compiled pattern was created by a call
|
memory used by the JIT compiler. If the compiled pattern was created by a call
|
||||||
to \fBpcre2_code_copy_with_tables()\fP, the memory for the character tables is
|
to \fBpcre2_code_copy_with_tables()\fP, the memory for the character tables is
|
||||||
also freed.
|
also freed.
|
||||||
.P
|
.P
|
||||||
There is a complete description of the PCRE2 native API in the
|
There is a complete description of the PCRE2 native API in the
|
||||||
|
|
|
@ -52,7 +52,7 @@ The option bits are:
|
||||||
PCRE2_ENDANCHORED Pattern can match only at end of subject
|
PCRE2_ENDANCHORED Pattern can match only at end of subject
|
||||||
PCRE2_EXTENDED Ignore white space and # comments
|
PCRE2_EXTENDED Ignore white space and # comments
|
||||||
PCRE2_FIRSTLINE Force matching to be before newline
|
PCRE2_FIRSTLINE Force matching to be before newline
|
||||||
PCRE2_LITERAL Pattern characters are all literal
|
PCRE2_LITERAL Pattern characters are all literal
|
||||||
PCRE2_MATCH_UNSET_BACKREF Match unset back references
|
PCRE2_MATCH_UNSET_BACKREF Match unset back references
|
||||||
PCRE2_MULTILINE ^ and $ match newlines within data
|
PCRE2_MULTILINE ^ and $ match newlines within data
|
||||||
PCRE2_NEVER_BACKSLASH_C Lock out the use of \eC in patterns
|
PCRE2_NEVER_BACKSLASH_C Lock out the use of \eC in patterns
|
||||||
|
|
|
@ -31,7 +31,7 @@ point to a uint32_t integer variable. The available codes are:
|
||||||
PCRE2_CONFIG_BSR Indicates what \eR matches by default:
|
PCRE2_CONFIG_BSR Indicates what \eR matches by default:
|
||||||
PCRE2_BSR_UNICODE
|
PCRE2_BSR_UNICODE
|
||||||
PCRE2_BSR_ANYCRLF
|
PCRE2_BSR_ANYCRLF
|
||||||
PCRE2_CONFIG_HEAPLIMIT Default heap memory limit
|
PCRE2_CONFIG_HEAPLIMIT Default heap memory limit
|
||||||
PCRE2_CONFIG_DEPTHLIMIT Default backtracking depth limit
|
PCRE2_CONFIG_DEPTHLIMIT Default backtracking depth limit
|
||||||
.\" JOIN
|
.\" JOIN
|
||||||
PCRE2_CONFIG_JIT Availability of just-in-time compiler
|
PCRE2_CONFIG_JIT Availability of just-in-time compiler
|
||||||
|
@ -47,7 +47,7 @@ point to a uint32_t integer variable. The available codes are:
|
||||||
PCRE2_NEWLINE_CRLF
|
PCRE2_NEWLINE_CRLF
|
||||||
PCRE2_NEWLINE_ANY
|
PCRE2_NEWLINE_ANY
|
||||||
PCRE2_NEWLINE_ANYCRLF
|
PCRE2_NEWLINE_ANYCRLF
|
||||||
PCRE2_NEWLINE_NUL
|
PCRE2_NEWLINE_NUL
|
||||||
PCRE2_CONFIG_PARENSLIMIT Default parentheses nesting limit
|
PCRE2_CONFIG_PARENSLIMIT Default parentheses nesting limit
|
||||||
PCRE2_CONFIG_RECURSIONLIMIT Obsolete: use PCRE2_CONFIG_DEPTHLIMIT
|
PCRE2_CONFIG_RECURSIONLIMIT Obsolete: use PCRE2_CONFIG_DEPTHLIMIT
|
||||||
PCRE2_CONFIG_STACKRECURSE Obsolete: always returns 0
|
PCRE2_CONFIG_STACKRECURSE Obsolete: always returns 0
|
||||||
|
|
|
@ -14,8 +14,8 @@ PCRE2 - Perl-compatible regular expressions (revised API)
|
||||||
.rs
|
.rs
|
||||||
.sp
|
.sp
|
||||||
This function is part of an experimental set of pattern conversion functions.
|
This function is part of an experimental set of pattern conversion functions.
|
||||||
It frees the memory occupied by a converted pattern that was obtained by
|
It frees the memory occupied by a converted pattern that was obtained by
|
||||||
calling \fBpcre2_pattern_convert()\fP with arguments that caused it to place
|
calling \fBpcre2_pattern_convert()\fP with arguments that caused it to place
|
||||||
the converted pattern into newly obtained heap memory.
|
the converted pattern into newly obtained heap memory.
|
||||||
.P
|
.P
|
||||||
The pattern conversion functions are described in the
|
The pattern conversion functions are described in the
|
||||||
|
|
|
@ -43,17 +43,17 @@ The options are:
|
||||||
PCRE2_NOTBOL Subject is not the beginning of a line
|
PCRE2_NOTBOL Subject is not the beginning of a line
|
||||||
PCRE2_NOTEOL Subject is not the end of a line
|
PCRE2_NOTEOL Subject is not the end of a line
|
||||||
PCRE2_NOTEMPTY An empty string is not a valid match
|
PCRE2_NOTEMPTY An empty string is not a valid match
|
||||||
.\" JOIN
|
.\" JOIN
|
||||||
PCRE2_NOTEMPTY_ATSTART An empty string at the start of the subject
|
PCRE2_NOTEMPTY_ATSTART An empty string at the start of the subject
|
||||||
is not a valid match
|
is not a valid match
|
||||||
.\" JOIN
|
.\" JOIN
|
||||||
PCRE2_NO_UTF_CHECK Do not check the subject for UTF
|
PCRE2_NO_UTF_CHECK Do not check the subject for UTF
|
||||||
validity (only relevant if PCRE2_UTF
|
validity (only relevant if PCRE2_UTF
|
||||||
was set at compile time)
|
was set at compile time)
|
||||||
.\" JOIN
|
.\" JOIN
|
||||||
PCRE2_PARTIAL_HARD Return PCRE2_ERROR_PARTIAL for a partial
|
PCRE2_PARTIAL_HARD Return PCRE2_ERROR_PARTIAL for a partial
|
||||||
match even if there is a full match
|
match even if there is a full match
|
||||||
.\" JOIN
|
.\" JOIN
|
||||||
PCRE2_PARTIAL_SOFT Return PCRE2_ERROR_PARTIAL for a partial
|
PCRE2_PARTIAL_SOFT Return PCRE2_ERROR_PARTIAL for a partial
|
||||||
match if no full matches are found
|
match if no full matches are found
|
||||||
PCRE2_DFA_RESTART Restart after a partial match
|
PCRE2_DFA_RESTART Restart after a partial match
|
||||||
|
|
|
@ -12,7 +12,7 @@ PCRE2 - Perl-compatible regular expressions (revised API)
|
||||||
.SH DESCRIPTION
|
.SH DESCRIPTION
|
||||||
.rs
|
.rs
|
||||||
.sp
|
.sp
|
||||||
This function builds a set of character tables for character code points that
|
This function builds a set of character tables for character code points that
|
||||||
are less than 256. These can be passed to \fBpcre2_compile()\fP in a compile
|
are less than 256. These can be passed to \fBpcre2_compile()\fP in a compile
|
||||||
context in order to override the internal, built-in tables (which were either
|
context in order to override the internal, built-in tables (which were either
|
||||||
defaulted or made by \fBpcre2_maketables()\fP when PCRE2 was compiled). See the
|
defaulted or made by \fBpcre2_maketables()\fP when PCRE2 was compiled). See the
|
||||||
|
|
|
@ -31,14 +31,14 @@ offsets to captured substrings. Its arguments are:
|
||||||
A match context is needed only if you want to:
|
A match context is needed only if you want to:
|
||||||
.sp
|
.sp
|
||||||
Set up a callout function
|
Set up a callout function
|
||||||
Set a matching offset limit
|
Set a matching offset limit
|
||||||
Change the heap memory limit
|
Change the heap memory limit
|
||||||
Change the backtracking match limit
|
Change the backtracking match limit
|
||||||
Change the backtracking depth limit
|
Change the backtracking depth limit
|
||||||
Set custom memory management specifically for the match
|
Set custom memory management specifically for the match
|
||||||
.sp
|
.sp
|
||||||
The \fIlength\fP and \fIstartoffset\fP values are code
|
The \fIlength\fP and \fIstartoffset\fP values are code
|
||||||
units, not characters. The length may be given as PCRE2_ZERO_TERMINATED for a
|
units, not characters. The length may be given as PCRE2_ZERO_TERMINATED for a
|
||||||
subject that is terminated by a binary zero code unit. The options are:
|
subject that is terminated by a binary zero code unit. The options are:
|
||||||
.sp
|
.sp
|
||||||
PCRE2_ANCHORED Match only at the first position
|
PCRE2_ANCHORED Match only at the first position
|
||||||
|
@ -49,7 +49,7 @@ subject that is terminated by a binary zero code unit. The options are:
|
||||||
.\" JOIN
|
.\" JOIN
|
||||||
PCRE2_NOTEMPTY_ATSTART An empty string at the start of the subject
|
PCRE2_NOTEMPTY_ATSTART An empty string at the start of the subject
|
||||||
is not a valid match
|
is not a valid match
|
||||||
PCRE2_NO_JIT Do not use JIT matching
|
PCRE2_NO_JIT Do not use JIT matching
|
||||||
.\" JOIN
|
.\" JOIN
|
||||||
PCRE2_NO_UTF_CHECK Do not check the subject for UTF
|
PCRE2_NO_UTF_CHECK Do not check the subject for UTF
|
||||||
validity (only relevant if PCRE2_UTF
|
validity (only relevant if PCRE2_UTF
|
||||||
|
|
|
@ -38,7 +38,7 @@ request are as follows:
|
||||||
1 first code unit is set
|
1 first code unit is set
|
||||||
2 start of string or after newline
|
2 start of string or after newline
|
||||||
PCRE2_INFO_FIRSTCODEUNIT First code unit when type is 1
|
PCRE2_INFO_FIRSTCODEUNIT First code unit when type is 1
|
||||||
PCRE2_INFO_FRAMESIZE Size of backtracking frame
|
PCRE2_INFO_FRAMESIZE Size of backtracking frame
|
||||||
PCRE2_INFO_HASBACKSLASHC Return 1 if pattern contains \eC
|
PCRE2_INFO_HASBACKSLASHC Return 1 if pattern contains \eC
|
||||||
.\" JOIN
|
.\" JOIN
|
||||||
PCRE2_INFO_HASCRORLF Return 1 if explicit CR or LF matches
|
PCRE2_INFO_HASCRORLF Return 1 if explicit CR or LF matches
|
||||||
|
@ -71,7 +71,7 @@ request are as follows:
|
||||||
PCRE2_NEWLINE_CRLF
|
PCRE2_NEWLINE_CRLF
|
||||||
PCRE2_NEWLINE_ANY
|
PCRE2_NEWLINE_ANY
|
||||||
PCRE2_NEWLINE_ANYCRLF
|
PCRE2_NEWLINE_ANYCRLF
|
||||||
PCRE2_NEWLINE_NUL
|
PCRE2_NEWLINE_NUL
|
||||||
PCRE2_INFO_RECURSIONLIMIT Obsolete synonym for PCRE2_INFO_DEPTHLIMIT
|
PCRE2_INFO_RECURSIONLIMIT Obsolete synonym for PCRE2_INFO_DEPTHLIMIT
|
||||||
PCRE2_INFO_SIZE Size of compiled pattern
|
PCRE2_INFO_SIZE Size of compiled pattern
|
||||||
.sp
|
.sp
|
||||||
|
|
|
@ -23,7 +23,7 @@ matching patterns. The second argument must be one of:
|
||||||
PCRE2_NEWLINE_CRLF CR followed by LF only
|
PCRE2_NEWLINE_CRLF CR followed by LF only
|
||||||
PCRE2_NEWLINE_ANYCRLF Any of the above
|
PCRE2_NEWLINE_ANYCRLF Any of the above
|
||||||
PCRE2_NEWLINE_ANY Any Unicode newline sequence
|
PCRE2_NEWLINE_ANY Any Unicode newline sequence
|
||||||
PCRE2_NEWLINE_NUL The NUL character (binary zero)
|
PCRE2_NEWLINE_NUL The NUL character (binary zero)
|
||||||
.sp
|
.sp
|
||||||
The result is zero for success or PCRE2_ERROR_BADDATA if the second argument is
|
The result is zero for success or PCRE2_ERROR_BADDATA if the second argument is
|
||||||
invalid.
|
invalid.
|
||||||
|
|
|
@ -14,7 +14,7 @@ PCRE2 - Perl-compatible regular expressions (revised API)
|
||||||
.SH DESCRIPTION
|
.SH DESCRIPTION
|
||||||
.rs
|
.rs
|
||||||
.sp
|
.sp
|
||||||
This function is obsolete and should not be used in new code. Use
|
This function is obsolete and should not be used in new code. Use
|
||||||
\fBpcre2_set_depth_limit()\fP instead.
|
\fBpcre2_set_depth_limit()\fP instead.
|
||||||
.P
|
.P
|
||||||
There is a complete description of the PCRE2 native API in the
|
There is a complete description of the PCRE2 native API in the
|
||||||
|
|
|
@ -48,7 +48,7 @@ want to:
|
||||||
The \fIlength\fP, \fIstartoffset\fP and \fIrlength\fP values are code
|
The \fIlength\fP, \fIstartoffset\fP and \fIrlength\fP values are code
|
||||||
units, not characters, as is the contents of the variable pointed at by
|
units, not characters, as is the contents of the variable pointed at by
|
||||||
\fIoutlengthptr\fP, which is updated to the actual length of the new string.
|
\fIoutlengthptr\fP, which is updated to the actual length of the new string.
|
||||||
The subject and replacement lengths can be given as PCRE2_ZERO_TERMINATED for
|
The subject and replacement lengths can be given as PCRE2_ZERO_TERMINATED for
|
||||||
zero-terminated strings. The options are:
|
zero-terminated strings. The options are:
|
||||||
.sp
|
.sp
|
||||||
PCRE2_ANCHORED Match only at the first position
|
PCRE2_ANCHORED Match only at the first position
|
||||||
|
|
|
@ -1,4 +1,4 @@
|
||||||
.TH PCRE2BUILD 3 "17 June 2017" "PCRE2 10.30"
|
.TH PCRE2BUILD 3 "18 July 2017" "PCRE2 10.30"
|
||||||
.SH NAME
|
.SH NAME
|
||||||
PCRE2 - Perl-compatible regular expressions (revised API)
|
PCRE2 - Perl-compatible regular expressions (revised API)
|
||||||
.
|
.
|
||||||
|
@ -66,10 +66,10 @@ Options that specify values have names that start with --with.
|
||||||
.rs
|
.rs
|
||||||
.sp
|
.sp
|
||||||
By default, a library called \fBlibpcre2-8\fP is built, containing functions
|
By default, a library called \fBlibpcre2-8\fP is built, containing functions
|
||||||
that take string arguments contained in vectors of bytes, interpreted either as
|
that take string arguments contained in arrays of bytes, interpreted either as
|
||||||
single-byte characters, or UTF-8 strings. You can also build two other
|
single-byte characters, or UTF-8 strings. You can also build two other
|
||||||
libraries, called \fBlibpcre2-16\fP and \fBlibpcre2-32\fP, which process
|
libraries, called \fBlibpcre2-16\fP and \fBlibpcre2-32\fP, which process
|
||||||
strings that are contained in vectors of 16-bit and 32-bit code units,
|
strings that are contained in arrays of 16-bit and 32-bit code units,
|
||||||
respectively. These can be interpreted either as single-unit characters or
|
respectively. These can be interpreted either as single-unit characters or
|
||||||
UTF-16/UTF-32 strings. To build these additional libraries, add one or both of
|
UTF-16/UTF-32 strings. To build these additional libraries, add one or both of
|
||||||
the following to the \fBconfigure\fP command:
|
the following to the \fBconfigure\fP command:
|
||||||
|
@ -197,18 +197,22 @@ to the \fBconfigure\fP command. There is a fourth option, specified by
|
||||||
--enable-newline-is-anycrlf
|
--enable-newline-is-anycrlf
|
||||||
.sp
|
.sp
|
||||||
which causes PCRE2 to recognize any of the three sequences CR, LF, or CRLF as
|
which causes PCRE2 to recognize any of the three sequences CR, LF, or CRLF as
|
||||||
indicating a line ending. Finally, a fifth option, specified by
|
indicating a line ending. A fifth option, specified by
|
||||||
.sp
|
.sp
|
||||||
--enable-newline-is-any
|
--enable-newline-is-any
|
||||||
.sp
|
.sp
|
||||||
causes PCRE2 to recognize any Unicode newline sequence. The Unicode newline
|
causes PCRE2 to recognize any Unicode newline sequence. The Unicode newline
|
||||||
sequences are the three just mentioned, plus the single characters VT (vertical
|
sequences are the three just mentioned, plus the single characters VT (vertical
|
||||||
tab, U+000B), FF (form feed, U+000C), NEL (next line, U+0085), LS (line
|
tab, U+000B), FF (form feed, U+000C), NEL (next line, U+0085), LS (line
|
||||||
separator, U+2028), and PS (paragraph separator, U+2029).
|
separator, U+2028), and PS (paragraph separator, U+2029). The final option is
|
||||||
|
.sp
|
||||||
|
--enable-newline-is-nul
|
||||||
|
.sp
|
||||||
|
which causes NUL (binary zero) to be set as the default line-ending character.
|
||||||
.P
|
.P
|
||||||
Whatever default line ending convention is selected when PCRE2 is built can be
|
Whatever default line ending convention is selected when PCRE2 is built can be
|
||||||
overridden by applications that use the library. At build time it is
|
overridden by applications that use the library. At build time it is
|
||||||
conventional to use the standard for your operating system.
|
recommended to use the standard for your operating system.
|
||||||
.
|
.
|
||||||
.
|
.
|
||||||
.SH "WHAT \eR MATCHES"
|
.SH "WHAT \eR MATCHES"
|
||||||
|
@ -297,7 +301,8 @@ because the size of each backtracking "frame" depends on the number of
|
||||||
capturing parentheses in a pattern, the amount of heap that is used before the
|
capturing parentheses in a pattern, the amount of heap that is used before the
|
||||||
limit is reached varies from pattern to pattern. This limit was more useful in
|
limit is reached varies from pattern to pattern. This limit was more useful in
|
||||||
versions before 10.30, where function recursion was used for backtracking.
|
versions before 10.30, where function recursion was used for backtracking.
|
||||||
However, as well as applying to \fBpcre2_match()\fP, this limit also controls
|
.P
|
||||||
|
As well as applying to \fBpcre2_match()\fP, the depth limit also controls
|
||||||
the depth of recursive function calls in \fBpcre2_dfa_match()\fP. These are
|
the depth of recursive function calls in \fBpcre2_dfa_match()\fP. These are
|
||||||
used for lookaround assertions, atomic groups, and recursion within patterns.
|
used for lookaround assertions, atomic groups, and recursion within patterns.
|
||||||
The limit does not apply to JIT matching.
|
The limit does not apply to JIT matching.
|
||||||
|
@ -577,6 +582,6 @@ Cambridge, England.
|
||||||
.rs
|
.rs
|
||||||
.sp
|
.sp
|
||||||
.nf
|
.nf
|
||||||
Last updated: 17 June 2017
|
Last updated: 18 July 2017
|
||||||
Copyright (c) 1997-2017 University of Cambridge.
|
Copyright (c) 1997-2017 University of Cambridge.
|
||||||
.fi
|
.fi
|
||||||
|
|
|
@ -71,7 +71,7 @@ documentation for details.
|
||||||
.P
|
.P
|
||||||
8. Subroutine calls (whether recursive or not) were treated as atomic groups up
|
8. Subroutine calls (whether recursive or not) were treated as atomic groups up
|
||||||
to PCRE2 release 10.23, but from release 10.30 this changed, and backtracking
|
to PCRE2 release 10.23, but from release 10.30 this changed, and backtracking
|
||||||
into subroutine calls is now supported, as in Perl.
|
into subroutine calls is now supported, as in Perl.
|
||||||
.P
|
.P
|
||||||
9. If any of the backtracking control verbs are used in a subpattern that is
|
9. If any of the backtracking control verbs are used in a subpattern that is
|
||||||
called as a subroutine (whether or not recursively), their effect is confined
|
called as a subroutine (whether or not recursively), their effect is confined
|
||||||
|
|
|
@ -446,25 +446,25 @@ memory. There are three options that set resource limits for matching.
|
||||||
The \fB--match-limit\fP option provides a means of limiting computing resource
|
The \fB--match-limit\fP option provides a means of limiting computing resource
|
||||||
usage when processing patterns that are not going to match, but which have a
|
usage when processing patterns that are not going to match, but which have a
|
||||||
very large number of possibilities in their search trees. The classic example
|
very large number of possibilities in their search trees. The classic example
|
||||||
is a pattern that uses nested unlimited repeats. Internally, PCRE2 has a
|
is a pattern that uses nested unlimited repeats. Internally, PCRE2 has a
|
||||||
counter that is incremented each time around its main processing loop. If the
|
counter that is incremented each time around its main processing loop. If the
|
||||||
value set by \fB--match-limit\fP is reached, an error occurs.
|
value set by \fB--match-limit\fP is reached, an error occurs.
|
||||||
.sp
|
.sp
|
||||||
The \fB--heap-limit\fP option specifies, as a number of kilobytes, the amount
|
The \fB--heap-limit\fP option specifies, as a number of kilobytes, the amount
|
||||||
of heap memory that may be used for matching. Heap memory is needed only if
|
of heap memory that may be used for matching. Heap memory is needed only if
|
||||||
matching the pattern requires a significant number of nested backtracking
|
matching the pattern requires a significant number of nested backtracking
|
||||||
points to be remembered. This parameter can be set to zero to forbid the use of
|
points to be remembered. This parameter can be set to zero to forbid the use of
|
||||||
heap memory altogether.
|
heap memory altogether.
|
||||||
.sp
|
.sp
|
||||||
The \fB--depth-limit\fP option limits the depth of nested backtracking points,
|
The \fB--depth-limit\fP option limits the depth of nested backtracking points,
|
||||||
which indirectly limits the amount of memory that is used. The amount of memory
|
which indirectly limits the amount of memory that is used. The amount of memory
|
||||||
needed for each backtracking point depends on the number of capturing
|
needed for each backtracking point depends on the number of capturing
|
||||||
parentheses in the pattern, so the amount of memory that is used before this
|
parentheses in the pattern, so the amount of memory that is used before this
|
||||||
limit acts varies from pattern to pattern. This limit is of use only if it is
|
limit acts varies from pattern to pattern. This limit is of use only if it is
|
||||||
set smaller than \fB--match-limit\fP.
|
set smaller than \fB--match-limit\fP.
|
||||||
.sp
|
.sp
|
||||||
There are no short forms for these options. The default settings are specified
|
There are no short forms for these options. The default settings are specified
|
||||||
when the PCRE2 library is compiled, with the default defaults being very large
|
when the PCRE2 library is compiled, with the default defaults being very large
|
||||||
and so effectively unlimited.
|
and so effectively unlimited.
|
||||||
.TP
|
.TP
|
||||||
\fB--max-buffer-size=\fInumber\fP
|
\fB--max-buffer-size=\fInumber\fP
|
||||||
|
@ -747,7 +747,7 @@ either a number or a quoted string (see the
|
||||||
.\" HREF
|
.\" HREF
|
||||||
\fBpcre2callout\fP
|
\fBpcre2callout\fP
|
||||||
.\"
|
.\"
|
||||||
documentation for details). Numbered callouts are ignored by \fBpcre2grep\fP;
|
documentation for details). Numbered callouts are ignored by \fBpcre2grep\fP;
|
||||||
only callouts with string arguments are useful.
|
only callouts with string arguments are useful.
|
||||||
.
|
.
|
||||||
.
|
.
|
||||||
|
@ -797,10 +797,10 @@ matcher backtracks in the normal way.
|
||||||
If the callout string starts with a pipe (vertical bar) character, the rest of
|
If the callout string starts with a pipe (vertical bar) character, the rest of
|
||||||
the string is written to the output, having been passed through the same escape
|
the string is written to the output, having been passed through the same escape
|
||||||
processing as text from the --output option. This provides a simple echoing
|
processing as text from the --output option. This provides a simple echoing
|
||||||
facility that avoids calling an external program or script. No terminator is
|
facility that avoids calling an external program or script. No terminator is
|
||||||
added to the string, so if you want a newline, you must include it explicitly.
|
added to the string, so if you want a newline, you must include it explicitly.
|
||||||
Matching continues normally after the string is output. If you want to see only
|
Matching continues normally after the string is output. If you want to see only
|
||||||
the callout output but not any output from an actual match, you should end the
|
the callout output but not any output from an actual match, you should end the
|
||||||
relevant pattern with (*FAIL).
|
relevant pattern with (*FAIL).
|
||||||
.
|
.
|
||||||
.
|
.
|
||||||
|
@ -816,8 +816,8 @@ message and the line that caused the problem to the standard error stream. If
|
||||||
there are more than 20 such errors, \fBpcre2grep\fP gives up.
|
there are more than 20 such errors, \fBpcre2grep\fP gives up.
|
||||||
.P
|
.P
|
||||||
The \fB--match-limit\fP option of \fBpcre2grep\fP can be used to set the
|
The \fB--match-limit\fP option of \fBpcre2grep\fP can be used to set the
|
||||||
overall resource limit. There are also other limits that affect the amount of
|
overall resource limit. There are also other limits that affect the amount of
|
||||||
memory used during matching; see the discussion of \fB--heap-limit\fP and
|
memory used during matching; see the discussion of \fB--heap-limit\fP and
|
||||||
\fB--depth-limit\fP above.
|
\fB--depth-limit\fP above.
|
||||||
.
|
.
|
||||||
.
|
.
|
||||||
|
|
|
@ -12,7 +12,7 @@ of them.
|
||||||
.rs
|
.rs
|
||||||
.sp
|
.sp
|
||||||
Patterns are compiled by PCRE2 into a reasonably efficient interpretive code,
|
Patterns are compiled by PCRE2 into a reasonably efficient interpretive code,
|
||||||
so that most simple patterns do not use much memory for storing the compiled
|
so that most simple patterns do not use much memory for storing the compiled
|
||||||
version. However, there is one case where the memory usage of a compiled
|
version. However, there is one case where the memory usage of a compiled
|
||||||
pattern can be unexpectedly large. If a parenthesized subpattern has a
|
pattern can be unexpectedly large. If a parenthesized subpattern has a
|
||||||
quantifier with a minimum greater than 1 and/or a limited maximum, the whole
|
quantifier with a minimum greater than 1 and/or a limited maximum, the whole
|
||||||
|
@ -76,7 +76,7 @@ memory can be limited; if the limit is set to zero, only the initial stack
|
||||||
vector is used. Rewriting patterns to be time-efficient, as described below,
|
vector is used. Rewriting patterns to be time-efficient, as described below,
|
||||||
may also reduce the memory requirements.
|
may also reduce the memory requirements.
|
||||||
.P
|
.P
|
||||||
In contrast to \fBpcre2_match()\fP, \fBpcre2_dfa_match()\fP does use recursive
|
In contrast to \fBpcre2_match()\fP, \fBpcre2_dfa_match()\fP does use recursive
|
||||||
function calls, but only for processing atomic groups, lookaround assertions,
|
function calls, but only for processing atomic groups, lookaround assertions,
|
||||||
and recursion within the pattern. Too much nested recursion may cause stack
|
and recursion within the pattern. Too much nested recursion may cause stack
|
||||||
issues. The "match depth" parameter can be used to limit the depth of function
|
issues. The "match depth" parameter can be used to limit the depth of function
|
||||||
|
@ -163,7 +163,7 @@ applied to a whole line of "a" characters, whereas the latter takes an
|
||||||
appreciable time with strings longer than about 20 characters.
|
appreciable time with strings longer than about 20 characters.
|
||||||
.P
|
.P
|
||||||
In many cases, the solution to this kind of performance issue is to use an
|
In many cases, the solution to this kind of performance issue is to use an
|
||||||
atomic group or a possessive quantifier. This can often reduce memory
|
atomic group or a possessive quantifier. This can often reduce memory
|
||||||
requirements as well. As another example, consider this pattern:
|
requirements as well. As another example, consider this pattern:
|
||||||
.sp
|
.sp
|
||||||
([^<]|<(?!inet))+
|
([^<]|<(?!inet))+
|
||||||
|
@ -184,7 +184,7 @@ are "swallowed" in one item inside the parentheses, and a possessive quantifier
|
||||||
is used to stop any backtracking into the runs of non-"<" characters. This
|
is used to stop any backtracking into the runs of non-"<" characters. This
|
||||||
version also uses a lot less memory because entry to a new set of parentheses
|
version also uses a lot less memory because entry to a new set of parentheses
|
||||||
happens only when a "<" character that is not followed by "inet" is encountered
|
happens only when a "<" character that is not followed by "inet" is encountered
|
||||||
(and we assume this is relatively rare).
|
(and we assume this is relatively rare).
|
||||||
.P
|
.P
|
||||||
This example shows that one way of optimizing performance when matching long
|
This example shows that one way of optimizing performance when matching long
|
||||||
subject strings is to write repeated parenthesized subpatterns to match more
|
subject strings is to write repeated parenthesized subpatterns to match more
|
||||||
|
@ -194,10 +194,10 @@ than one character whenever possible.
|
||||||
.SS "SETTING RESOURCE LIMITS"
|
.SS "SETTING RESOURCE LIMITS"
|
||||||
.rs
|
.rs
|
||||||
.sp
|
.sp
|
||||||
You can set limits on the amount of processing that takes place when matching,
|
You can set limits on the amount of processing that takes place when matching,
|
||||||
and on the amount of heap memory that is used. The default values of the limits
|
and on the amount of heap memory that is used. The default values of the limits
|
||||||
are very large, and unlikely ever to operate. They can be changed when PCRE2 is
|
are very large, and unlikely ever to operate. They can be changed when PCRE2 is
|
||||||
built, and they can also be set when \fBpcre2_match()\fP or
|
built, and they can also be set when \fBpcre2_match()\fP or
|
||||||
\fBpcre2_dfa_match()\fP is called. For details of these interfaces, see the
|
\fBpcre2_dfa_match()\fP is called. For details of these interfaces, see the
|
||||||
.\" HREF
|
.\" HREF
|
||||||
\fBpcre2build\fP
|
\fBpcre2build\fP
|
||||||
|
|
|
@ -407,11 +407,11 @@ but some of them use Unicode properties if PCRE2_UCP is set. You can use
|
||||||
(?i) caseless
|
(?i) caseless
|
||||||
(?J) allow duplicate names
|
(?J) allow duplicate names
|
||||||
(?m) multiline
|
(?m) multiline
|
||||||
(?n) no auto capture
|
(?n) no auto capture
|
||||||
(?s) single line (dotall)
|
(?s) single line (dotall)
|
||||||
(?U) default ungreedy (lazy)
|
(?U) default ungreedy (lazy)
|
||||||
(?x) extended: ignore white space except in classes
|
(?x) extended: ignore white space except in classes
|
||||||
(?xx) as (?x) but also ignore space and tab in classes
|
(?xx) as (?x) but also ignore space and tab in classes
|
||||||
(?-...) unset option(s)
|
(?-...) unset option(s)
|
||||||
.sp
|
.sp
|
||||||
The following are recognized only at the very start of a pattern or after one
|
The following are recognized only at the very start of a pattern or after one
|
||||||
|
|
14
perltest.sh
|
@ -50,7 +50,7 @@ fi
|
||||||
# ucp sets Perl's /u modifier
|
# ucp sets Perl's /u modifier
|
||||||
# utf invoke UTF-8 functionality
|
# utf invoke UTF-8 functionality
|
||||||
#
|
#
|
||||||
# The data lines must not have any pcre2test modifiers. Unless
|
# The data lines must not have any pcre2test modifiers. Unless
|
||||||
# "subject_literal" is on the pattern, data lines are processed as
|
# "subject_literal" is on the pattern, data lines are processed as
|
||||||
# Perl double-quoted strings, so if they contain " $ or @ characters, these
|
# Perl double-quoted strings, so if they contain " $ or @ characters, these
|
||||||
# have to be escaped. For this reason, all such characters in the
|
# have to be escaped. For this reason, all such characters in the
|
||||||
|
@ -141,20 +141,20 @@ for (;;)
|
||||||
|
|
||||||
chomp($pattern);
|
chomp($pattern);
|
||||||
$pattern =~ s/\s+$//;
|
$pattern =~ s/\s+$//;
|
||||||
|
|
||||||
# Split the pattern from the modifiers and adjust them as necessary.
|
# Split the pattern from the modifiers and adjust them as necessary.
|
||||||
|
|
||||||
$pattern =~ /^\s*((.).*\2)(.*)$/s;
|
$pattern =~ /^\s*((.).*\2)(.*)$/s;
|
||||||
$pat = $1;
|
$pat = $1;
|
||||||
$mod = $3;
|
$mod = $3;
|
||||||
|
|
||||||
# The private "aftertext" modifier means "print $' afterwards".
|
# The private "aftertext" modifier means "print $' afterwards".
|
||||||
|
|
||||||
$showrest = ($mod =~ s/aftertext,?//);
|
$showrest = ($mod =~ s/aftertext,?//);
|
||||||
|
|
||||||
# The "subject_literal" modifier disables escapes in subjects.
|
# The "subject_literal" modifier disables escapes in subjects.
|
||||||
|
|
||||||
$subject_literal = ($mod =~ s/subject_literal,?//);
|
$subject_literal = ($mod =~ s/subject_literal,?//);
|
||||||
|
|
||||||
# "allaftertext" is used by pcre2test to print remainders after captures
|
# "allaftertext" is used by pcre2test to print remainders after captures
|
||||||
|
|
||||||
|
@ -238,7 +238,7 @@ for (;;)
|
||||||
$x = $_;
|
$x = $_;
|
||||||
}
|
}
|
||||||
else
|
else
|
||||||
{
|
{
|
||||||
$x = eval "\"$_\""; # To get escapes processed
|
$x = eval "\"$_\""; # To get escapes processed
|
||||||
}
|
}
|
||||||
|
|
||||||
|
|
|
@ -132,6 +132,12 @@ sure both macros are undefined; an emulation function will then be used. */
|
||||||
/* Define to 1 if you have the <zlib.h> header file. */
|
/* Define to 1 if you have the <zlib.h> header file. */
|
||||||
/* #undef HAVE_ZLIB_H */
|
/* #undef HAVE_ZLIB_H */
|
||||||
|
|
||||||
|
/* This limits the amount of memory that pcre2_match() may use while matching
|
||||||
|
a pattern. The value is in kilobytes. */
|
||||||
|
#ifndef HEAP_LIMIT
|
||||||
|
#define HEAP_LIMIT 20000000
|
||||||
|
#endif
|
||||||
|
|
||||||
/* The value of LINK_SIZE determines the number of bytes used to store links
|
/* The value of LINK_SIZE determines the number of bytes used to store links
|
||||||
as offsets within the compiled regex. The default is 2, which allows for
|
as offsets within the compiled regex. The default is 2, which allows for
|
||||||
compiled patterns up to 64K long. This covers the vast majority of cases.
|
compiled patterns up to 64K long. This covers the vast majority of cases.
|
||||||
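Review note: the new HEAP_LIMIT default is stated in kilobytes, so the byte value does not fit in 32 bits. The sketch below is not part of the commit. The helper names are hypothetical, and it assumes 1 kilobyte = 1024 bytes, which the hunk itself does not state.

```c
#include <assert.h>
#include <stdint.h>

/* Value copied from the config.h hunk above. */
#define HEAP_LIMIT 20000000

/* Hypothetical helper (not a PCRE2 function): convert a heap limit given in
   kilobytes to bytes, ASSUMING 1 kilobyte = 1024 bytes. The result is 64-bit
   because the default limit overflows a 32-bit byte count. */
uint64_t heap_limit_bytes(uint32_t limit_kb)
{
  return (uint64_t)limit_kb * 1024u;
}

/* Hypothetical helper: nonzero if a request of `bytes` fits within the limit. */
int within_heap_limit(uint64_t bytes, uint32_t limit_kb)
{
  return bytes <= heap_limit_bytes(limit_kb);
}

/* Sanity check: the default really does exceed what uint32_t can hold. */
void heap_limit_demo(void)
{
  assert(heap_limit_bytes(HEAP_LIMIT) > (uint64_t)UINT32_MAX);
}
```

Under that assumption the default works out to roughly 19 GiB, which fits the man page text describing the defaults as effectively unlimited.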
|
@ -148,7 +154,7 @@ sure both macros are undefined; an emulation function will then be used. */
|
||||||
#endif
|
#endif
|
||||||
|
|
||||||
/* The value of MATCH_LIMIT determines the default number of times the
|
/* The value of MATCH_LIMIT determines the default number of times the
|
||||||
internal match() function can record a backtrack position during a single
|
pcre2_match() function can record a backtrack position during a single
|
||||||
matching attempt. There is a runtime interface for setting a different
|
matching attempt. There is a runtime interface for setting a different
|
||||||
limit. The limit exists in order to catch runaway regular expressions that
|
limit. The limit exists in order to catch runaway regular expressions that
|
||||||
take for ever to determine that they do not match. The default is set very
|
take for ever to determine that they do not match. The default is set very
|
||||||
|
@ -188,8 +194,8 @@ sure both macros are undefined; an emulation function will then be used. */
|
||||||
|
|
||||||
/* The value of NEWLINE_DEFAULT determines the default newline character
|
/* The value of NEWLINE_DEFAULT determines the default newline character
|
||||||
sequence. PCRE2 client programs can override this by selecting other values
|
sequence. PCRE2 client programs can override this by selecting other values
|
||||||
at run time. The valid values are 1 (CR), 2 (LF), 3 (CRLF), 4 (ANY), and 5
|
at run time. The valid values are 1 (CR), 2 (LF), 3 (CRLF), 4 (ANY), 5
|
||||||
(ANYCRLF). */
|
(ANYCRLF), and 6 (NUL). */
|
||||||
#ifndef NEWLINE_DEFAULT
|
#ifndef NEWLINE_DEFAULT
|
||||||
#define NEWLINE_DEFAULT 2
|
#define NEWLINE_DEFAULT 2
|
||||||
#endif
|
#endif
|
||||||
|
@ -204,7 +210,7 @@ sure both macros are undefined; an emulation function will then be used. */
|
||||||
#define PACKAGE_NAME "PCRE2"
|
#define PACKAGE_NAME "PCRE2"
|
||||||
|
|
||||||
/* Define to the full name and version of this package. */
|
/* Define to the full name and version of this package. */
|
||||||
#define PACKAGE_STRING "PCRE2 10.30-DEV"
|
#define PACKAGE_STRING "PCRE2 10.30-RC1"
|
||||||
|
|
||||||
/* Define to the one symbol short name of this package. */
|
/* Define to the one symbol short name of this package. */
|
||||||
#define PACKAGE_TARNAME "pcre2"
|
#define PACKAGE_TARNAME "pcre2"
|
||||||
|
@ -213,7 +219,7 @@ sure both macros are undefined; an emulation function will then be used. */
|
||||||
#define PACKAGE_URL ""
|
#define PACKAGE_URL ""
|
||||||
|
|
||||||
/* Define to the version of this package. */
|
/* Define to the version of this package. */
|
||||||
#define PACKAGE_VERSION "10.30-DEV"
|
#define PACKAGE_VERSION "10.30-RC1"
|
||||||
|
|
||||||
/* The value of PARENS_NEST_LIMIT specifies the maximum depth of nested
|
/* The value of PARENS_NEST_LIMIT specifies the maximum depth of nested
|
||||||
parentheses (of any kind) in a pattern. This limits the amount of system
|
parentheses (of any kind) in a pattern. This limits the amount of system
|
||||||
|
@ -261,6 +267,11 @@ sure both macros are undefined; an emulation function will then be used. */
|
||||||
your system. */
|
your system. */
|
||||||
/* #undef PTHREAD_CREATE_JOINABLE */
|
/* #undef PTHREAD_CREATE_JOINABLE */
|
||||||
|
|
||||||
|
/* Define to any non-zero number to enable support for SELinux compatible
|
||||||
|
executable memory allocator in JIT. Note that this will have no effect
|
||||||
|
unless SUPPORT_JIT is also defined. */
|
||||||
|
/* #undef SLJIT_PROT_EXECUTABLE_ALLOCATOR */
|
||||||
|
|
||||||
/* Define to 1 if you have the ANSI C header files. */
|
/* Define to 1 if you have the ANSI C header files. */
|
||||||
/* #undef STDC_HEADERS */
|
/* #undef STDC_HEADERS */
|
||||||
|
|
||||||
|
@ -328,7 +339,7 @@ sure both macros are undefined; an emulation function will then be used. */
|
||||||
#endif
|
#endif
|
||||||
|
|
||||||
/* Version number of package */
|
/* Version number of package */
|
||||||
#define VERSION "10.30-DEV"
|
#define VERSION "10.30-RC1"
|
||||||
|
|
||||||
/* Define to 1 if on MINIX. */
|
/* Define to 1 if on MINIX. */
|
||||||
/* #undef _MINIX */
|
/* #undef _MINIX */
|
||||||
|
|
|
@ -43,8 +43,8 @@ POSSIBILITY OF SUCH DAMAGE.
|
||||||
|
|
||||||
#define PCRE2_MAJOR 10
|
#define PCRE2_MAJOR 10
|
||||||
#define PCRE2_MINOR 30
|
#define PCRE2_MINOR 30
|
||||||
#define PCRE2_PRERELEASE -DEV
|
#define PCRE2_PRERELEASE -RC1
|
||||||
#define PCRE2_DATE 2017-03-05
|
#define PCRE2_DATE 2017-07-18
|
||||||
|
|
||||||
/* When an application links to a PCRE DLL in Windows, the symbols that are
|
/* When an application links to a PCRE DLL in Windows, the symbols that are
|
||||||
imported have to be identified as such. When building PCRE2, the appropriate
|
imported have to be identified as such. When building PCRE2, the appropriate
|
||||||
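Review note: the three macros changed in this hunk combine into the "10.30-RC1" string that also appears in config.h. The sketch below shows one way to build that string with preprocessor stringification. The STRING/XSTRING helpers and the version_string() function are assumptions made for illustration and are not taken from this diff.

```c
#include <assert.h>
#include <string.h>

/* Values copied from the pcre2.h hunk above. */
#define PCRE2_MAJOR       10
#define PCRE2_MINOR       30
#define PCRE2_PRERELEASE  -RC1

/* Two-level stringification so that the macros are expanded before '#'
   turns them into a string literal. These helper names are illustrative. */
#define STRING(a)  #a
#define XSTRING(s) STRING(s)

/* Hypothetical helper: adjacent string literals are concatenated by the
   compiler, giving "10.30" followed by "-RC1". */
const char *version_string(void)
{
  return XSTRING(PCRE2_MAJOR.PCRE2_MINOR) XSTRING(PCRE2_PRERELEASE);
}
```

Because PCRE2_PRERELEASE is a bare token sequence rather than a quoted string, it has to be stringified before it can be compared with or appended to other text.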
|
|
|
@ -43,8 +43,8 @@ POSSIBILITY OF SUCH DAMAGE.
|
||||||
|
|
||||||
#define PCRE2_MAJOR 10
|
#define PCRE2_MAJOR 10
|
||||||
#define PCRE2_MINOR 30
|
#define PCRE2_MINOR 30
|
||||||
#define PCRE2_PRERELEASE -DEV
|
#define PCRE2_PRERELEASE -RC1
|
||||||
#define PCRE2_DATE 2017-03-05
|
#define PCRE2_DATE 2017-07-18
|
||||||
|
|
||||||
/* When an application links to a PCRE DLL in Windows, the symbols that are
|
/* When an application links to a PCRE DLL in Windows, the symbols that are
|
||||||
imported have to be identified as such. When building PCRE2, the appropriate
|
imported have to be identified as such. When building PCRE2, the appropriate
|
||||||
|
@ -138,6 +138,14 @@ D is inspected during pcre2_dfa_match() execution
|
||||||
#define PCRE2_ALT_VERBNAMES 0x00400000u /* C */
|
#define PCRE2_ALT_VERBNAMES 0x00400000u /* C */
|
||||||
#define PCRE2_USE_OFFSET_LIMIT 0x00800000u /* J M D */
|
#define PCRE2_USE_OFFSET_LIMIT 0x00800000u /* J M D */
|
||||||
#define PCRE2_EXTENDED_MORE 0x01000000u /* C */
|
#define PCRE2_EXTENDED_MORE 0x01000000u /* C */
|
||||||
|
#define PCRE2_LITERAL 0x02000000u /* C */
|
||||||
|
|
||||||
|
/* An additional compile options word is available in the compile context. */
|
||||||
|
|
||||||
|
#define PCRE2_EXTRA_ALLOW_SURROGATE_ESCAPES 0x00000001u /* C */
|
||||||
|
#define PCRE2_EXTRA_BAD_ESCAPE_IS_LITERAL 0x00000002u /* C */
|
||||||
|
#define PCRE2_EXTRA_MATCH_WORD 0x00000004u /* C */
|
||||||
|
#define PCRE2_EXTRA_MATCH_LINE 0x00000008u /* C */
|
||||||
|
|
||||||
/* These are for pcre2_jit_compile(). */
|
/* These are for pcre2_jit_compile(). */
|
||||||
|
|
||||||
|
@ -176,6 +184,16 @@ ignored for pcre2_jit_match(). */
|
||||||
|
|
||||||
#define PCRE2_NO_JIT 0x00002000u
|
#define PCRE2_NO_JIT 0x00002000u
|
||||||
|
|
||||||
|
/* Options for pcre2_pattern_convert(). */
|
||||||
|
|
||||||
|
#define PCRE2_CONVERT_UTF 0x00000001u
|
||||||
|
#define PCRE2_CONVERT_NO_UTF_CHECK 0x00000002u
|
||||||
|
#define PCRE2_CONVERT_POSIX_BASIC 0x00000004u
|
||||||
|
#define PCRE2_CONVERT_POSIX_EXTENDED 0x00000008u
|
||||||
|
#define PCRE2_CONVERT_GLOB 0x00000010u
|
||||||
|
#define PCRE2_CONVERT_GLOB_NO_WILD_SEPARATOR 0x00000030u
|
||||||
|
#define PCRE2_CONVERT_GLOB_NO_STARSTAR 0x00000050u
|
||||||
|
|
||||||
/* Newline and \R settings, for use in compile contexts. The newline values
|
/* Newline and \R settings, for use in compile contexts. The newline values
|
||||||
must be kept in step with values set in config.h and both sets must all be
|
must be kept in step with values set in config.h and both sets must all be
|
||||||
greater than zero. */
|
greater than zero. */
|
||||||
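Review note: the values of the two glob modifier options added in this hunk include the PCRE2_CONVERT_GLOB bit (0x30 = 0x10 | 0x20 and 0x50 = 0x10 | 0x40), so selecting either modifier appears to select glob conversion as well. The sketch below only checks that arithmetic. The helper functions are hypothetical and are not part of the PCRE2 API.

```c
#include <assert.h>
#include <stdint.h>

/* Values copied from the pcre2.h hunk above. */
#define PCRE2_CONVERT_GLOB                    0x00000010u
#define PCRE2_CONVERT_GLOB_NO_WILD_SEPARATOR  0x00000030u
#define PCRE2_CONVERT_GLOB_NO_STARSTAR        0x00000050u

/* Hypothetical helper: nonzero if every bit of `flag` is set in `options`. */
int has_all_bits(uint32_t options, uint32_t flag)
{
  return (options & flag) == flag;
}

/* Hypothetical helper: clear the base glob bit, leaving all other bits. */
uint32_t without_glob_bit(uint32_t options)
{
  return options & ~PCRE2_CONVERT_GLOB;
}

/* Sanity check: both modifiers carry the base glob bit. */
void convert_flags_demo(void)
{
  assert(has_all_bits(PCRE2_CONVERT_GLOB_NO_WILD_SEPARATOR, PCRE2_CONVERT_GLOB));
  assert(has_all_bits(PCRE2_CONVERT_GLOB_NO_STARSTAR, PCRE2_CONVERT_GLOB));
}
```

One consequence of this layout is that a test such as (options & PCRE2_CONVERT_GLOB_NO_STARSTAR) != 0 would also be true for plain PCRE2_CONVERT_GLOB, so a full-mask comparison like has_all_bits() is the safer way to test for a modifier.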
|
@ -185,6 +203,7 @@ greater than zero. */
|
||||||
#define PCRE2_NEWLINE_CRLF 3
|
#define PCRE2_NEWLINE_CRLF 3
|
||||||
#define PCRE2_NEWLINE_ANY 4
|
#define PCRE2_NEWLINE_ANY 4
|
||||||
#define PCRE2_NEWLINE_ANYCRLF 5
|
#define PCRE2_NEWLINE_ANYCRLF 5
|
||||||
|
#define PCRE2_NEWLINE_NUL 6
|
||||||
|
|
||||||
#define PCRE2_BSR_UNICODE 1
|
#define PCRE2_BSR_UNICODE 1
|
||||||
#define PCRE2_BSR_ANYCRLF 2
|
#define PCRE2_BSR_ANYCRLF 2
|
||||||
|
@ -270,6 +289,8 @@ numbers must not be changed. */
|
||||||
#define PCRE2_ERROR_TOOMANYREPLACE (-61)
|
#define PCRE2_ERROR_TOOMANYREPLACE (-61)
|
||||||
#define PCRE2_ERROR_BADSERIALIZEDDATA (-62)
|
#define PCRE2_ERROR_BADSERIALIZEDDATA (-62)
|
||||||
#define PCRE2_ERROR_HEAPLIMIT (-63)
|
#define PCRE2_ERROR_HEAPLIMIT (-63)
|
||||||
|
#define PCRE2_ERROR_CONVERT_SYNTAX (-64)
|
||||||
|
|
||||||
|
|
||||||
/* Request types for pcre2_pattern_info() */
|
/* Request types for pcre2_pattern_info() */
|
||||||
|
|
||||||
|
@ -351,6 +372,9 @@ typedef struct pcre2_real_compile_context pcre2_compile_context; \
|
||||||
struct pcre2_real_match_context; \
|
struct pcre2_real_match_context; \
|
||||||
typedef struct pcre2_real_match_context pcre2_match_context; \
|
typedef struct pcre2_real_match_context pcre2_match_context; \
|
||||||
\
|
\
|
||||||
|
struct pcre2_real_convert_context; \
|
||||||
|
typedef struct pcre2_real_convert_context pcre2_convert_context; \
|
||||||
|
\
|
||||||
struct pcre2_real_code; \
|
struct pcre2_real_code; \
|
||||||
typedef struct pcre2_real_code pcre2_code; \
|
typedef struct pcre2_real_code pcre2_code; \
|
||||||
\
|
\
|
||||||
|
@ -434,6 +458,8 @@ PCRE2_EXP_DECL int PCRE2_CALL_CONVENTION \
|
||||||
pcre2_set_bsr(pcre2_compile_context *, uint32_t); \
|
pcre2_set_bsr(pcre2_compile_context *, uint32_t); \
|
||||||
PCRE2_EXP_DECL int PCRE2_CALL_CONVENTION \
|
PCRE2_EXP_DECL int PCRE2_CALL_CONVENTION \
|
||||||
pcre2_set_character_tables(pcre2_compile_context *, const unsigned char *); \
|
pcre2_set_character_tables(pcre2_compile_context *, const unsigned char *); \
|
||||||
|
PCRE2_EXP_DECL int PCRE2_CALL_CONVENTION \
|
||||||
|
pcre2_set_compile_extra_options(pcre2_compile_context *, uint32_t); \
|
||||||
PCRE2_EXP_DECL int PCRE2_CALL_CONVENTION \
|
PCRE2_EXP_DECL int PCRE2_CALL_CONVENTION \
|
||||||
pcre2_set_max_pattern_length(pcre2_compile_context *, PCRE2_SIZE); \
|
pcre2_set_max_pattern_length(pcre2_compile_context *, PCRE2_SIZE); \
|
||||||
PCRE2_EXP_DECL int PCRE2_CALL_CONVENTION \
|
PCRE2_EXP_DECL int PCRE2_CALL_CONVENTION \
|
||||||
|
@ -466,6 +492,18 @@ PCRE2_EXP_DECL int PCRE2_CALL_CONVENTION \
|
||||||
pcre2_set_recursion_memory_management(pcre2_match_context *, \
|
pcre2_set_recursion_memory_management(pcre2_match_context *, \
|
||||||
void *(*)(PCRE2_SIZE, void *), void (*)(void *, void *), void *);
|
void *(*)(PCRE2_SIZE, void *), void (*)(void *, void *), void *);
|
||||||
|
|
||||||
|
#define PCRE2_CONVERT_CONTEXT_FUNCTIONS \
|
||||||
|
PCRE2_EXP_DECL pcre2_convert_context PCRE2_CALL_CONVENTION \
|
||||||
|
*pcre2_convert_context_copy(pcre2_convert_context *); \
|
||||||
|
PCRE2_EXP_DECL pcre2_convert_context PCRE2_CALL_CONVENTION \
|
||||||
|
*pcre2_convert_context_create(pcre2_general_context *); \
|
||||||
|
PCRE2_EXP_DECL void PCRE2_CALL_CONVENTION \
|
||||||
|
pcre2_convert_context_free(pcre2_convert_context *); \
|
||||||
|
PCRE2_EXP_DECL int PCRE2_CALL_CONVENTION \
|
||||||
|
pcre2_set_glob_escape(pcre2_convert_context *, uint32_t); \
|
||||||
|
PCRE2_EXP_DECL int PCRE2_CALL_CONVENTION \
|
||||||
|
pcre2_set_glob_separator(pcre2_convert_context *, uint32_t);
|
||||||
|
|
||||||
|
|
||||||
/* Functions concerned with compiling a pattern to PCRE internal code. */
|
/* Functions concerned with compiling a pattern to PCRE internal code. */
|
||||||
|
|
||||||
|
@ -572,6 +610,16 @@ PCRE2_EXP_DECL int PCRE2_CALL_CONVENTION \
|
||||||
PCRE2_SIZE, PCRE2_UCHAR *, PCRE2_SIZE *);
|
PCRE2_SIZE, PCRE2_UCHAR *, PCRE2_SIZE *);
|
||||||
|
|
||||||
|
|
||||||
|
/* Functions for converting pattern source strings. */
|
||||||
|
|
||||||
|
#define PCRE2_CONVERT_FUNCTIONS \
|
||||||
|
PCRE2_EXP_DECL int PCRE2_CALL_CONVENTION \
|
||||||
|
pcre2_pattern_convert(PCRE2_SPTR, PCRE2_SIZE, uint32_t, PCRE2_UCHAR **, \
|
||||||
|
PCRE2_SIZE *, pcre2_convert_context *); \
|
||||||
|
PCRE2_EXP_DECL void PCRE2_CALL_CONVENTION \
|
||||||
|
pcre2_converted_pattern_free(PCRE2_UCHAR *);
|
||||||
|
|
||||||
|
|
||||||
/* Functions for JIT processing */
|
/* Functions for JIT processing */
|
||||||
|
|
||||||
#define PCRE2_JIT_FUNCTIONS \
|
#define PCRE2_JIT_FUNCTIONS \
|
||||||
|
@ -623,6 +671,7 @@ pcre2_compile are called by application code. */
|
||||||
#define pcre2_real_code PCRE2_SUFFIX(pcre2_real_code_)
|
#define pcre2_real_code PCRE2_SUFFIX(pcre2_real_code_)
|
||||||
#define pcre2_real_general_context PCRE2_SUFFIX(pcre2_real_general_context_)
|
#define pcre2_real_general_context PCRE2_SUFFIX(pcre2_real_general_context_)
|
||||||
#define pcre2_real_compile_context PCRE2_SUFFIX(pcre2_real_compile_context_)
|
#define pcre2_real_compile_context PCRE2_SUFFIX(pcre2_real_compile_context_)
|
||||||
|
#define pcre2_real_convert_context PCRE2_SUFFIX(pcre2_real_convert_context_)
|
||||||
#define pcre2_real_match_context PCRE2_SUFFIX(pcre2_real_match_context_)
|
#define pcre2_real_match_context PCRE2_SUFFIX(pcre2_real_match_context_)
|
||||||
#define pcre2_real_jit_stack PCRE2_SUFFIX(pcre2_real_jit_stack_)
|
#define pcre2_real_jit_stack PCRE2_SUFFIX(pcre2_real_jit_stack_)
|
||||||
#define pcre2_real_match_data PCRE2_SUFFIX(pcre2_real_match_data_)
|
#define pcre2_real_match_data PCRE2_SUFFIX(pcre2_real_match_data_)
|
||||||
|
@ -634,6 +683,7 @@ pcre2_compile are called by application code. */
|
||||||
#define pcre2_callout_enumerate_block PCRE2_SUFFIX(pcre2_callout_enumerate_block_)
|
#define pcre2_callout_enumerate_block PCRE2_SUFFIX(pcre2_callout_enumerate_block_)
|
||||||
#define pcre2_general_context PCRE2_SUFFIX(pcre2_general_context_)
|
#define pcre2_general_context PCRE2_SUFFIX(pcre2_general_context_)
|
||||||
#define pcre2_compile_context PCRE2_SUFFIX(pcre2_compile_context_)
|
#define pcre2_compile_context PCRE2_SUFFIX(pcre2_compile_context_)
|
||||||
|
#define pcre2_convert_context PCRE2_SUFFIX(pcre2_convert_context_)
|
||||||
#define pcre2_match_context PCRE2_SUFFIX(pcre2_match_context_)
|
#define pcre2_match_context PCRE2_SUFFIX(pcre2_match_context_)
|
||||||
#define pcre2_match_data PCRE2_SUFFIX(pcre2_match_data_)
|
#define pcre2_match_data PCRE2_SUFFIX(pcre2_match_data_)
|
||||||
|
|
||||||
|
@ -649,6 +699,10 @@ pcre2_compile are called by application code. */
|
||||||
#define pcre2_compile_context_create PCRE2_SUFFIX(pcre2_compile_context_create_)
|
#define pcre2_compile_context_create PCRE2_SUFFIX(pcre2_compile_context_create_)
|
||||||
#define pcre2_compile_context_free PCRE2_SUFFIX(pcre2_compile_context_free_)
|
#define pcre2_compile_context_free PCRE2_SUFFIX(pcre2_compile_context_free_)
|
||||||
#define pcre2_config PCRE2_SUFFIX(pcre2_config_)
|
#define pcre2_config PCRE2_SUFFIX(pcre2_config_)
|
||||||
|
#define pcre2_convert_context_copy PCRE2_SUFFIX(pcre2_convert_context_copy_)
|
||||||
|
#define pcre2_convert_context_create PCRE2_SUFFIX(pcre2_convert_context_create_)
|
||||||
|
#define pcre2_convert_context_free PCRE2_SUFFIX(pcre2_convert_context_free_)
|
||||||
|
#define pcre2_converted_pattern_free PCRE2_SUFFIX(pcre2_converted_pattern_free_)
|
||||||
#define pcre2_dfa_match PCRE2_SUFFIX(pcre2_dfa_match_)
|
#define pcre2_dfa_match PCRE2_SUFFIX(pcre2_dfa_match_)
|
||||||
#define pcre2_general_context_copy PCRE2_SUFFIX(pcre2_general_context_copy_)
|
#define pcre2_general_context_copy PCRE2_SUFFIX(pcre2_general_context_copy_)
|
||||||
#define pcre2_general_context_create PCRE2_SUFFIX(pcre2_general_context_create_)
|
#define pcre2_general_context_create PCRE2_SUFFIX(pcre2_general_context_create_)
|
||||||
|
@ -672,6 +726,7 @@ pcre2_compile are called by application code. */
|
||||||
#define pcre2_match_data_create PCRE2_SUFFIX(pcre2_match_data_create_)
|
#define pcre2_match_data_create PCRE2_SUFFIX(pcre2_match_data_create_)
|
||||||
#define pcre2_match_data_create_from_pattern PCRE2_SUFFIX(pcre2_match_data_create_from_pattern_)
|
#define pcre2_match_data_create_from_pattern PCRE2_SUFFIX(pcre2_match_data_create_from_pattern_)
|
||||||
#define pcre2_match_data_free PCRE2_SUFFIX(pcre2_match_data_free_)
|
#define pcre2_match_data_free PCRE2_SUFFIX(pcre2_match_data_free_)
|
||||||
|
#define pcre2_pattern_convert PCRE2_SUFFIX(pcre2_pattern_convert_)
|
||||||
#define pcre2_pattern_info PCRE2_SUFFIX(pcre2_pattern_info_)
|
#define pcre2_pattern_info PCRE2_SUFFIX(pcre2_pattern_info_)
|
||||||
#define pcre2_serialize_decode PCRE2_SUFFIX(pcre2_serialize_decode_)
|
#define pcre2_serialize_decode PCRE2_SUFFIX(pcre2_serialize_decode_)
|
||||||
#define pcre2_serialize_encode PCRE2_SUFFIX(pcre2_serialize_encode_)
|
#define pcre2_serialize_encode PCRE2_SUFFIX(pcre2_serialize_encode_)
|
||||||
|
@ -680,8 +735,11 @@ pcre2_compile are called by application code. */
|
||||||
#define pcre2_set_bsr PCRE2_SUFFIX(pcre2_set_bsr_)
|
#define pcre2_set_bsr PCRE2_SUFFIX(pcre2_set_bsr_)
|
||||||
#define pcre2_set_callout PCRE2_SUFFIX(pcre2_set_callout_)
|
#define pcre2_set_callout PCRE2_SUFFIX(pcre2_set_callout_)
|
||||||
#define pcre2_set_character_tables PCRE2_SUFFIX(pcre2_set_character_tables_)
|
#define pcre2_set_character_tables PCRE2_SUFFIX(pcre2_set_character_tables_)
|
||||||
|
#define pcre2_set_compile_extra_options PCRE2_SUFFIX(pcre2_set_compile_extra_options_)
|
||||||
#define pcre2_set_compile_recursion_guard PCRE2_SUFFIX(pcre2_set_compile_recursion_guard_)
|
#define pcre2_set_compile_recursion_guard PCRE2_SUFFIX(pcre2_set_compile_recursion_guard_)
|
||||||
#define pcre2_set_depth_limit PCRE2_SUFFIX(pcre2_set_depth_limit_)
|
#define pcre2_set_depth_limit PCRE2_SUFFIX(pcre2_set_depth_limit_)
|
||||||
|
#define pcre2_set_glob_escape PCRE2_SUFFIX(pcre2_set_glob_escape_)
|
||||||
|
#define pcre2_set_glob_separator PCRE2_SUFFIX(pcre2_set_glob_separator_)
|
||||||
#define pcre2_set_heap_limit PCRE2_SUFFIX(pcre2_set_heap_limit_)
|
#define pcre2_set_heap_limit PCRE2_SUFFIX(pcre2_set_heap_limit_)
|
||||||
#define pcre2_set_match_limit PCRE2_SUFFIX(pcre2_set_match_limit_)
|
#define pcre2_set_match_limit PCRE2_SUFFIX(pcre2_set_match_limit_)
|
||||||
#define pcre2_set_max_pattern_length PCRE2_SUFFIX(pcre2_set_max_pattern_length_)
|
#define pcre2_set_max_pattern_length PCRE2_SUFFIX(pcre2_set_max_pattern_length_)
|
||||||
|
@ -716,6 +774,8 @@ PCRE2_STRUCTURE_LIST \
|
||||||
PCRE2_GENERAL_INFO_FUNCTIONS \
|
PCRE2_GENERAL_INFO_FUNCTIONS \
|
||||||
PCRE2_GENERAL_CONTEXT_FUNCTIONS \
|
PCRE2_GENERAL_CONTEXT_FUNCTIONS \
|
||||||
PCRE2_COMPILE_CONTEXT_FUNCTIONS \
|
PCRE2_COMPILE_CONTEXT_FUNCTIONS \
|
||||||
|
PCRE2_CONVERT_CONTEXT_FUNCTIONS \
|
||||||
|
PCRE2_CONVERT_FUNCTIONS \
|
||||||
PCRE2_MATCH_CONTEXT_FUNCTIONS \
|
PCRE2_MATCH_CONTEXT_FUNCTIONS \
|
||||||
PCRE2_COMPILE_FUNCTIONS \
|
PCRE2_COMPILE_FUNCTIONS \
|
||||||
PCRE2_PATTERN_INFO_FUNCTIONS \
|
PCRE2_PATTERN_INFO_FUNCTIONS \
|
||||||
|
@ -745,6 +805,7 @@ PCRE2_TYPES_STRUCTURES_AND_FUNCTIONS
|
||||||
#undef PCRE2_GENERAL_INFO_FUNCTIONS
|
#undef PCRE2_GENERAL_INFO_FUNCTIONS
|
||||||
#undef PCRE2_GENERAL_CONTEXT_FUNCTIONS
|
#undef PCRE2_GENERAL_CONTEXT_FUNCTIONS
|
||||||
#undef PCRE2_COMPILE_CONTEXT_FUNCTIONS
|
#undef PCRE2_COMPILE_CONTEXT_FUNCTIONS
|
||||||
|
#undef PCRE2_CONVERT_CONTEXT_FUNCTIONS
|
||||||
#undef PCRE2_MATCH_CONTEXT_FUNCTIONS
|
#undef PCRE2_MATCH_CONTEXT_FUNCTIONS
|
||||||
#undef PCRE2_COMPILE_FUNCTIONS
|
#undef PCRE2_COMPILE_FUNCTIONS
|
||||||
#undef PCRE2_PATTERN_INFO_FUNCTIONS
|
#undef PCRE2_PATTERN_INFO_FUNCTIONS
|
||||||
|
|
|
@ -84,7 +84,7 @@ if (where == NULL) /* Requests a length */
|
||||||
return PCRE2_ERROR_BADOPTION;
|
return PCRE2_ERROR_BADOPTION;
|
||||||
|
|
||||||
case PCRE2_CONFIG_BSR:
|
case PCRE2_CONFIG_BSR:
|
||||||
case PCRE2_CONFIG_HEAPLIMIT:
|
case PCRE2_CONFIG_HEAPLIMIT:
|
||||||
case PCRE2_CONFIG_JIT:
|
case PCRE2_CONFIG_JIT:
|
||||||
case PCRE2_CONFIG_LINKSIZE:
|
case PCRE2_CONFIG_LINKSIZE:
|
||||||
case PCRE2_CONFIG_MATCHLIMIT:
|
case PCRE2_CONFIG_MATCHLIMIT:
|
||||||
|
@ -151,7 +151,7 @@ switch (what)
|
||||||
case PCRE2_CONFIG_DEPTHLIMIT:
|
case PCRE2_CONFIG_DEPTHLIMIT:
|
||||||
*((uint32_t *)where) = MATCH_LIMIT_DEPTH;
|
*((uint32_t *)where) = MATCH_LIMIT_DEPTH;
|
||||||
break;
|
break;
|
||||||
|
|
||||||
case PCRE2_CONFIG_NEWLINE:
|
case PCRE2_CONFIG_NEWLINE:
|
||||||
*((uint32_t *)where) = NEWLINE_DEFAULT;
|
*((uint32_t *)where) = NEWLINE_DEFAULT;
|
||||||
break;
|
break;
|
||||||
|
@ -160,7 +160,7 @@ switch (what)
|
||||||
*((uint32_t *)where) = PARENS_NEST_LIMIT;
|
*((uint32_t *)where) = PARENS_NEST_LIMIT;
|
||||||
break;
|
break;
|
||||||
|
|
||||||
/* This is now obsolete. The stack is no longer used via recursion for
|
/* This is now obsolete. The stack is no longer used via recursion for
|
||||||
handling backtracking in pcre2_match(). */
|
handling backtracking in pcre2_match(). */
|
||||||
|
|
||||||
case PCRE2_CONFIG_STACKRECURSE:
|
case PCRE2_CONFIG_STACKRECURSE:
|
||||||
|
|
|
@ -198,7 +198,7 @@ const pcre2_convert_context PRIV(default_convert_context) = {
|
||||||
CHAR_BACKSLASH, /* Default path separator */
|
CHAR_BACKSLASH, /* Default path separator */
|
||||||
CHAR_GRAVE_ACCENT /* Default escape character */
|
CHAR_GRAVE_ACCENT /* Default escape character */
|
||||||
#else /* Not Windows */
|
#else /* Not Windows */
|
||||||
CHAR_SLASH, /* Default path separator */
|
CHAR_SLASH, /* Default path separator */
|
||||||
CHAR_BACKSLASH /* Default escape character */
|
CHAR_BACKSLASH /* Default escape character */
|
||||||
#endif
|
#endif
|
||||||
};
|
};
|
||||||
|
@ -359,7 +359,7 @@ switch(newline)
|
||||||
case PCRE2_NEWLINE_CRLF:
|
case PCRE2_NEWLINE_CRLF:
|
||||||
case PCRE2_NEWLINE_ANY:
|
case PCRE2_NEWLINE_ANY:
|
||||||
case PCRE2_NEWLINE_ANYCRLF:
|
case PCRE2_NEWLINE_ANYCRLF:
|
||||||
case PCRE2_NEWLINE_NUL:
|
case PCRE2_NEWLINE_NUL:
|
||||||
ccontext->newline_convention = newline;
|
ccontext->newline_convention = newline;
|
||||||
return 0;
|
return 0;
|
||||||
|
|
||||||
|
|
|
@ -177,7 +177,7 @@ static const unsigned char compile_error_texts[] =
|
||||||
/* 90 */
|
/* 90 */
|
||||||
"internal error: bad code value in parsed_skip()\0"
|
"internal error: bad code value in parsed_skip()\0"
|
||||||
"PCRE2_EXTRA_ALLOW_SURROGATE_ESCAPES is not allowed in UTF-16 mode\0"
|
"PCRE2_EXTRA_ALLOW_SURROGATE_ESCAPES is not allowed in UTF-16 mode\0"
|
||||||
"invalid option bits with PCRE2_LITERAL\0"
|
"invalid option bits with PCRE2_LITERAL\0"
|
||||||
;
|
;
|
||||||
|
|
||||||
/* Match-time and UTF error texts are in the same format. */
|
/* Match-time and UTF error texts are in the same format. */
|
||||||
|
|
|
@ -240,7 +240,7 @@ not rely on this. */
|
||||||
|
|
||||||
#define COMPILE_ERROR_BASE 100
|
#define COMPILE_ERROR_BASE 100
|
||||||
|
|
||||||
/* The initial frames vector for remembering backtracking points in
|
/* The initial frames vector for remembering backtracking points in
|
||||||
pcre2_match() is allocated on the system stack, of this size (bytes). The size
|
pcre2_match() is allocated on the system stack, of this size (bytes). The size
|
||||||
must be a multiple of sizeof(PCRE2_SPTR) in all environments, so making it a
|
must be a multiple of sizeof(PCRE2_SPTR) in all environments, so making it a
|
||||||
multiple of 8 is best. Typical frame sizes are a few hundred bytes (it depends
|
multiple of 8 is best. Typical frame sizes are a few hundred bytes (it depends
|
||||||
|
@ -557,7 +557,7 @@ enum { PCRE2_MATCHEDBY_INTERPRETER, /* pcre2_match() */
|
||||||
#define MAGIC_NUMBER 0x50435245UL /* 'PCRE' */
|
#define MAGIC_NUMBER 0x50435245UL /* 'PCRE' */
|
||||||
|
|
||||||
/* The maximum remaining length of subject we are prepared to search for a
|
/* The maximum remaining length of subject we are prepared to search for a
|
||||||
req_unit match. In 8-bit mode, memchr() is used and is much faster than the
|
req_unit match. In 8-bit mode, memchr() is used and is much faster than the
|
||||||
search loop that has to be used in 16-bit and 32-bit modes. */
|
search loop that has to be used in 16-bit and 32-bit modes. */
|
||||||
|
|
||||||
#if PCRE2_CODE_UNIT_WIDTH == 8
|
#if PCRE2_CODE_UNIT_WIDTH == 8
|
||||||
|
|
|
@ -3829,7 +3829,7 @@ while (TRUE)
|
||||||
{
|
{
|
||||||
case OP_CHARI:
|
case OP_CHARI:
|
||||||
caseless = TRUE;
|
caseless = TRUE;
|
||||||
/* Fall through */
|
/* Fall through */
|
||||||
case OP_CHAR:
|
case OP_CHAR:
|
||||||
last = FALSE;
|
last = FALSE;
|
||||||
cc++;
|
cc++;
|
||||||
|
@ -3861,7 +3861,7 @@ while (TRUE)
|
||||||
case OP_MINPLUSI:
|
case OP_MINPLUSI:
|
||||||
case OP_POSPLUSI:
|
case OP_POSPLUSI:
|
||||||
caseless = TRUE;
|
caseless = TRUE;
|
||||||
/* Fall through */
|
/* Fall through */
|
||||||
case OP_PLUS:
|
case OP_PLUS:
|
||||||
case OP_MINPLUS:
|
case OP_MINPLUS:
|
||||||
case OP_POSPLUS:
|
case OP_POSPLUS:
|
||||||
|
@ -3870,7 +3870,7 @@ while (TRUE)
|
||||||
|
|
||||||
case OP_EXACTI:
|
case OP_EXACTI:
|
||||||
caseless = TRUE;
|
caseless = TRUE;
|
||||||
/* Fall through */
|
/* Fall through */
|
||||||
case OP_EXACT:
|
case OP_EXACT:
|
||||||
repeat = GET2(cc, 1);
|
repeat = GET2(cc, 1);
|
||||||
last = FALSE;
|
last = FALSE;
|
||||||
|
@ -3881,7 +3881,7 @@ while (TRUE)
|
||||||
case OP_MINQUERYI:
|
case OP_MINQUERYI:
|
||||||
case OP_POSQUERYI:
|
case OP_POSQUERYI:
|
||||||
caseless = TRUE;
|
caseless = TRUE;
|
||||||
/* Fall through */
|
/* Fall through */
|
||||||
case OP_QUERY:
|
case OP_QUERY:
|
||||||
case OP_MINQUERY:
|
case OP_MINQUERY:
|
||||||
case OP_POSQUERY:
|
case OP_POSQUERY:
|
||||||
|
@ -4351,7 +4351,7 @@ struct sljit_jump *quit;
|
||||||
struct sljit_jump *partial_quit[2];
|
struct sljit_jump *partial_quit[2];
|
||||||
sljit_u8 instruction[8];
|
sljit_u8 instruction[8];
|
||||||
sljit_s32 tmp1_ind = sljit_get_register_index(TMP1);
|
sljit_s32 tmp1_ind = sljit_get_register_index(TMP1);
|
||||||
sljit_s32 tmp2_ind = sljit_get_register_index(TMP2);
|
// sljit_s32 tmp2_ind = sljit_get_register_index(TMP2);
|
||||||
sljit_s32 str_ptr_ind = sljit_get_register_index(STR_PTR);
|
sljit_s32 str_ptr_ind = sljit_get_register_index(STR_PTR);
|
||||||
sljit_s32 data_ind = 0;
|
sljit_s32 data_ind = 0;
|
||||||
sljit_s32 tmp_ind = 1;
|
sljit_s32 tmp_ind = 1;
|
||||||
|
@ -4376,7 +4376,9 @@ if (common->mode == PCRE2_JIT_COMPLETE)
|
||||||
|
|
||||||
OP1(SLJIT_MOV, TMP1, 0, SLJIT_IMM, character_to_int32(char1 | bit));
|
OP1(SLJIT_MOV, TMP1, 0, SLJIT_IMM, character_to_int32(char1 | bit));
|
||||||
|
|
||||||
SLJIT_ASSERT(tmp1_ind < 8 && tmp2_ind == 1);
|
// SLJIT_ASSERT(tmp1_ind < 8 && tmp2_ind == 1);
|
||||||
|
|
||||||
|
SLJIT_ASSERT(tmp1_ind < 8);
|
||||||
|
|
||||||
/* MOVD xmm, r/m32 */
|
/* MOVD xmm, r/m32 */
|
||||||
instruction[0] = 0x66;
|
instruction[0] = 0x66;
|
||||||
|
|
|
@ -80,7 +80,7 @@ if (where == NULL) /* Requests field length */
|
||||||
case PCRE2_INFO_FIRSTCODEUNIT:
|
case PCRE2_INFO_FIRSTCODEUNIT:
|
||||||
case PCRE2_INFO_HASBACKSLASHC:
|
case PCRE2_INFO_HASBACKSLASHC:
|
||||||
case PCRE2_INFO_HASCRORLF:
|
case PCRE2_INFO_HASCRORLF:
|
||||||
case PCRE2_INFO_HEAPLIMIT:
|
case PCRE2_INFO_HEAPLIMIT:
|
||||||
case PCRE2_INFO_JCHANGED:
|
case PCRE2_INFO_JCHANGED:
|
||||||
case PCRE2_INFO_LASTCODETYPE:
|
case PCRE2_INFO_LASTCODETYPE:
|
||||||
case PCRE2_INFO_LASTCODEUNIT:
|
case PCRE2_INFO_LASTCODEUNIT:
|
||||||
|
|
|
@ -167,15 +167,15 @@ are implementing).
|
||||||
6. Do not break after Prepend characters.
|
6. Do not break after Prepend characters.
|
||||||
|
|
||||||
7. Do not break within emoji modifier sequences (E_Base or E_Base_GAZ followed
|
7. Do not break within emoji modifier sequences (E_Base or E_Base_GAZ followed
|
||||||
by E_Modifier). Extend characters are allowed before the modifier; this
|
by E_Modifier). Extend characters are allowed before the modifier; this
|
||||||
cannot be represented in this table, the code has to deal with it.
|
cannot be represented in this table, the code has to deal with it.
|
||||||
|
|
||||||
8. Do not break within emoji zwj sequences (ZWJ followed by Glue_After_Zwj or
|
8. Do not break within emoji zwj sequences (ZWJ followed by Glue_After_Zwj or
|
||||||
E_Base_GAZ).
|
E_Base_GAZ).
|
||||||
|
|
||||||
9. Do not break within emoji flag sequences. That is, do not break between
|
9. Do not break within emoji flag sequences. That is, do not break between
|
||||||
regional indicator (RI) symbols if there are an odd number of RI characters
|
regional indicator (RI) symbols if there are an odd number of RI characters
|
||||||
before the break point. This table encodes "join RI characters"; the code
|
before the break point. This table encodes "join RI characters"; the code
|
||||||
has to deal with checking for previous adjoining RIs.
|
has to deal with checking for previous adjoining RIs.
|
||||||
|
|
||||||
10. Otherwise, break everywhere.
|
10. Otherwise, break everywhere.
|
||||||
|
|
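Rule 9 in the comment above says the table only records "join RI characters" and leaves the parity check to the code. A minimal standalone sketch of that check, assuming the subject is already an array of code points, might look like this. The names is_regional_indicator() and ri_break_allowed() are hypothetical and are not taken from the PCRE2 sources.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: Regional Indicator symbols occupy U+1F1E6..U+1F1FF. */
static bool is_regional_indicator(uint32_t c)
{
return c >= 0x1F1E6 && c <= 0x1F1FF;
}

/* Considering rule 9 only, report whether a break is allowed between
s[pos-1] and s[pos]. If both neighbours are RI characters, count the run of
RI characters that ends just before the break point; an odd count means the
break would split a flag pair, so it is refused. */
static bool ri_break_allowed(const uint32_t *s, size_t len, size_t pos)
{
size_t count = 0;
size_t i;

assert(s != NULL);
if (pos == 0 || pos >= len) return true;
if (!is_regional_indicator(s[pos - 1]) || !is_regional_indicator(s[pos]))
  return true;

/* Walk backwards over the adjoining RI characters before the break point. */
for (i = pos; i > 0 && is_regional_indicator(s[i - 1]); i--) count++;
return (count % 2) == 0;
}
```

For a run of four RI characters this permits a break only in the middle, so the run is treated as two flag pairs.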
|
@ -264,7 +264,7 @@ enum {
|
||||||
ucp_Multani,
|
ucp_Multani,
|
||||||
ucp_Old_Hungarian,
|
ucp_Old_Hungarian,
|
||||||
ucp_SignWriting,
|
ucp_SignWriting,
|
||||||
/* New for Unicode 10.0.0 (no update since 8.0.0) */
|
/* New for Unicode 10.0.0 (no update since 8.0.0) */
|
||||||
ucp_Adlam,
|
ucp_Adlam,
|
||||||
ucp_Bhaiksuki,
|
ucp_Bhaiksuki,
|
||||||
ucp_Marchen,
|
ucp_Marchen,
|
||||||
|
@ -272,7 +272,7 @@ enum {
|
||||||
ucp_Osage,
|
ucp_Osage,
|
||||||
ucp_Tangut,
|
ucp_Tangut,
|
||||||
ucp_Masaram_Gondi,
|
ucp_Masaram_Gondi,
|
||||||
ucp_Nushu,
|
ucp_Nushu,
|
||||||
ucp_Soyombo,
|
ucp_Soyombo,
|
||||||
ucp_Zanabazar_Square
|
ucp_Zanabazar_Square
|
||||||
};
|
};
|
||||||
|
|
|
@ -142,7 +142,7 @@ static const int eint2[] = {
|
||||||
32, REG_INVARG, /* this version of PCRE2 does not have Unicode support */
|
32, REG_INVARG, /* this version of PCRE2 does not have Unicode support */
|
||||||
37, REG_EESCAPE, /* PCRE2 does not support \L, \l, \N{name}, \U, or \u */
|
37, REG_EESCAPE, /* PCRE2 does not support \L, \l, \N{name}, \U, or \u */
|
||||||
56, REG_INVARG, /* internal error: unknown newline setting */
|
56, REG_INVARG, /* internal error: unknown newline setting */
|
||||||
92, REG_INVARG, /* invalid option bits with PCRE2_LITERAL */
|
92, REG_INVARG, /* invalid option bits with PCRE2_LITERAL */
|
||||||
};
|
};
|
||||||
|
|
||||||
/* Table of texts corresponding to POSIX error codes */
|
/* Table of texts corresponding to POSIX error codes */
|
||||||
|
@ -239,7 +239,7 @@ int re_nsub = 0;
|
||||||
|
|
||||||
patlen = ((cflags & REG_PEND) != 0)? (PCRE2_SIZE)(preg->re_endp - pattern) :
|
patlen = ((cflags & REG_PEND) != 0)? (PCRE2_SIZE)(preg->re_endp - pattern) :
|
||||||
PCRE2_ZERO_TERMINATED;
|
PCRE2_ZERO_TERMINATED;
|
||||||
|
|
||||||
if ((cflags & REG_ICASE) != 0) options |= PCRE2_CASELESS;
|
if ((cflags & REG_ICASE) != 0) options |= PCRE2_CASELESS;
|
||||||
if ((cflags & REG_NEWLINE) != 0) options |= PCRE2_MULTILINE;
|
if ((cflags & REG_NEWLINE) != 0) options |= PCRE2_MULTILINE;
|
||||||
if ((cflags & REG_DOTALL) != 0) options |= PCRE2_DOTALL;
|
if ((cflags & REG_DOTALL) != 0) options |= PCRE2_DOTALL;
|
||||||
|
@ -249,7 +249,7 @@ if ((cflags & REG_UCP) != 0) options |= PCRE2_UCP;
|
||||||
if ((cflags & REG_UNGREEDY) != 0) options |= PCRE2_UNGREEDY;
|
if ((cflags & REG_UNGREEDY) != 0) options |= PCRE2_UNGREEDY;
|
||||||
|
|
||||||
preg->re_cflags = cflags;
|
preg->re_cflags = cflags;
|
||||||
preg->re_pcre2_code = pcre2_compile((PCRE2_SPTR)pattern, patlen, options,
|
preg->re_pcre2_code = pcre2_compile((PCRE2_SPTR)pattern, patlen, options,
|
||||||
&errorcode, &erroffset, NULL);
|
&errorcode, &erroffset, NULL);
|
||||||
preg->re_erroffset = erroffset;
|
preg->re_erroffset = erroffset;
|
||||||
|
|
||||||
|
@ -262,7 +262,7 @@ if (preg->re_pcre2_code == NULL)
|
||||||
|
|
||||||
if (errorcode < COMPILE_ERROR_BASE) return REG_BADPAT;
|
if (errorcode < COMPILE_ERROR_BASE) return REG_BADPAT;
|
||||||
errorcode -= COMPILE_ERROR_BASE;
|
errorcode -= COMPILE_ERROR_BASE;
|
||||||
|
|
||||||
if (errorcode < (int)(sizeof(eint1)/sizeof(const int)))
|
if (errorcode < (int)(sizeof(eint1)/sizeof(const int)))
|
||||||
return eint1[errorcode];
|
return eint1[errorcode];
|
||||||
for (i = 0; i < sizeof(eint2)/sizeof(const int); i += 2)
|
for (i = 0; i < sizeof(eint2)/sizeof(const int); i += 2)
|
||||||
|
|
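The hunk above translates a PCRE2 compile error into a POSIX code in two stages: a dense table indexed directly by the offset from COMPILE_ERROR_BASE, then a sparse table of (offset, code) pairs. A minimal sketch of that lookup is given below. The DEMO_ names and the dense table contents are placeholders, and the body of the final loop and the fallback return are assumptions, since the diff does not show them. Only the four sparse pairs are copied from the eint2 entries visible in this diff.

```c
#include <assert.h>
#include <stddef.h>

/* Placeholder result codes standing in for the POSIX REG_ values. */
enum { DEMO_BADPAT = 1, DEMO_INVARG = 2, DEMO_EESCAPE = 3 };

#define DEMO_ERROR_BASE 100

/* Dense table: indexed by (errorcode - DEMO_ERROR_BASE). Contents invented. */
static const int demo_dense[] = { 0, DEMO_EESCAPE, DEMO_EESCAPE };

/* Sparse table: (offset, result) pairs, as in the eint2 entries shown above. */
static const int demo_sparse[] = {
  32, DEMO_INVARG,
  37, DEMO_EESCAPE,
  56, DEMO_INVARG,
  92, DEMO_INVARG
};

static int demo_map_error(int errorcode)
{
size_t i;
const size_t nsparse = sizeof(demo_sparse)/sizeof(demo_sparse[0]);

assert(nsparse % 2 == 0);               /* Table must hold whole pairs */
if (errorcode < DEMO_ERROR_BASE) return DEMO_BADPAT;
errorcode -= DEMO_ERROR_BASE;

if (errorcode < (int)(sizeof(demo_dense)/sizeof(demo_dense[0])))
  return demo_dense[errorcode];

/* Assumed loop body: linear search of the pairs for a matching offset. */
for (i = 0; i < nsparse; i += 2)
  if (errorcode == demo_sparse[i]) return demo_sparse[i + 1];

return DEMO_BADPAT;                     /* Assumed fallback */
}
```

Stepping the loop index by two keeps each offset next to its result in a single array, at the cost of a linear search over a handful of entries.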
|
@ -93,13 +93,13 @@ enum {
|
||||||
};
|
};
|
||||||
|
|
||||||
|
|
||||||
/* The structure representing a compiled regular expression. It is also used
|
/* The structure representing a compiled regular expression. It is also used
|
||||||
for passing the pattern end pointer when REG_PEND is set. */
|
for passing the pattern end pointer when REG_PEND is set. */
|
||||||
|
|
||||||
typedef struct {
|
typedef struct {
|
||||||
void *re_pcre2_code;
|
void *re_pcre2_code;
|
||||||
void *re_match_data;
|
void *re_match_data;
|
||||||
const char *re_endp;
|
const char *re_endp;
|
||||||
size_t re_nsub;
|
size_t re_nsub;
|
||||||
size_t re_erroffset;
|
size_t re_erroffset;
|
||||||
int re_cflags;
|
int re_cflags;
|
||||||
|
|
|
@ -4073,7 +4073,8 @@ else fprintf(outfile, "%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%
|
||||||
* Show compile extra options *
|
* Show compile extra options *
|
||||||
*************************************************/
|
*************************************************/
|
||||||
|
|
||||||
/* Called for unsupported POSIX options.
|
/* Called only for unsupported POSIX options at present, and therefore needed
|
||||||
|
only when the 8-bit library is being compiled.
|
||||||
|
|
||||||
Arguments:
|
Arguments:
|
||||||
options an options word
|
options an options word
|
||||||
|
@ -4083,17 +4084,21 @@ Arguments:
|
||||||
Returns: nothing
|
Returns: nothing
|
||||||
*/
|
*/
|
||||||
|
|
||||||
|
#ifdef SUPPORT_PCRE2_8
|
||||||
static void
|
static void
|
||||||
show_compile_extra_options(uint32_t options, const char *before,
|
show_compile_extra_options(uint32_t options, const char *before,
|
||||||
const char *after)
|
const char *after)
|
||||||
{
|
{
|
||||||
if (options == 0) fprintf(outfile, "%s <none>%s", before, after);
|
if (options == 0) fprintf(outfile, "%s <none>%s", before, after);
|
||||||
else fprintf(outfile, "%s%s%s%s",
|
else fprintf(outfile, "%s%s%s%s%s%s",
|
||||||
before,
|
before,
|
||||||
((options & PCRE2_EXTRA_ALLOW_SURROGATE_ESCAPES) != 0)? " allow_surrogate_escapes" : "",
|
((options & PCRE2_EXTRA_ALLOW_SURROGATE_ESCAPES) != 0)? " allow_surrogate_escapes" : "",
|
||||||
((options & PCRE2_EXTRA_BAD_ESCAPE_IS_LITERAL) != 0)? " bad_escape_is_literal" : "",
|
((options & PCRE2_EXTRA_BAD_ESCAPE_IS_LITERAL) != 0)? " bad_escape_is_literal" : "",
|
||||||
|
((options & PCRE2_EXTRA_MATCH_WORD) != 0)? " match_word" : "",
|
||||||
|
((options & PCRE2_EXTRA_MATCH_LINE) != 0)? " match_line" : "",
|
||||||
after);
|
after);
|
||||||
}
|
}
|
||||||
|
#endif
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
|
@ -124,10 +124,10 @@
|
||||||
/* SLJIT_REWRITABLE_JUMP is 0x1000. */
|
/* SLJIT_REWRITABLE_JUMP is 0x1000. */
|
||||||
|
|
||||||
#if (defined SLJIT_CONFIG_X86 && SLJIT_CONFIG_X86)
|
#if (defined SLJIT_CONFIG_X86 && SLJIT_CONFIG_X86)
|
||||||
# define PATCH_MB 0x4
|
# define PATCH_MB 0x4
|
||||||
# define PATCH_MW 0x8
|
# define PATCH_MW 0x8
|
||||||
#if (defined SLJIT_CONFIG_X86_64 && SLJIT_CONFIG_X86_64)
|
#if (defined SLJIT_CONFIG_X86_64 && SLJIT_CONFIG_X86_64)
|
||||||
# define PATCH_MD 0x10
|
# define PATCH_MD 0x10
|
||||||
#endif
|
#endif
|
||||||
#endif
|
#endif
|
||||||
|
|
||||||
|
@ -1555,6 +1555,7 @@ static SLJIT_INLINE CHECK_RETURN_TYPE check_sljit_emit_cmov(struct sljit_compile
|
||||||
sljit_s32 dst_reg,
|
sljit_s32 dst_reg,
|
||||||
sljit_s32 src, sljit_sw srcw)
|
sljit_s32 src, sljit_sw srcw)
|
||||||
{
|
{
|
||||||
|
(void)srcw; /* To stop compiler warning */
|
||||||
#if (defined SLJIT_ARGUMENT_CHECKS && SLJIT_ARGUMENT_CHECKS)
|
#if (defined SLJIT_ARGUMENT_CHECKS && SLJIT_ARGUMENT_CHECKS)
|
||||||
CHECK_ARGUMENT(!(type & ~(0xff | SLJIT_I32_OP)));
|
CHECK_ARGUMENT(!(type & ~(0xff | SLJIT_I32_OP)));
|
||||||
CHECK_ARGUMENT((type & 0xff) >= SLJIT_EQUAL && (type & 0xff) <= SLJIT_ORDERED_F64);
|
CHECK_ARGUMENT((type & 0xff) >= SLJIT_EQUAL && (type & 0xff) <= SLJIT_ORDERED_F64);
|
||||||
|
|
|
@ -95,17 +95,6 @@
|
||||||
aaac
|
aaac
|
||||||
abbbbbbbbbbbac
|
abbbbbbbbbbbac
|
||||||
|
|
||||||
/^(b+|a){1,2}?bc/
|
|
||||||
bbc
|
|
||||||
|
|
||||||
/^(b*|ba){1,2}?bc/
|
|
||||||
babc
|
|
||||||
bbabc
|
|
||||||
bababc
|
|
||||||
\= Expect no match
|
|
||||||
bababbc
|
|
||||||
babababc
|
|
||||||
|
|
||||||
/^(ba|b*){1,2}?bc/
|
/^(ba|b*){1,2}?bc/
|
||||||
babc
|
babc
|
||||||
bbabc
|
bbabc
|
||||||
|
|
File diff suppressed because it is too large
|
@ -183,27 +183,6 @@ No match
|
||||||
abbbbbbbbbbbac
|
abbbbbbbbbbbac
|
||||||
No match
|
No match
|
||||||
|
|
||||||
/^(b+|a){1,2}?bc/
|
|
||||||
bbc
|
|
||||||
0: bbc
|
|
||||||
1: b
|
|
||||||
|
|
||||||
/^(b*|ba){1,2}?bc/
|
|
||||||
babc
|
|
||||||
0: babc
|
|
||||||
1: ba
|
|
||||||
bbabc
|
|
||||||
0: bbabc
|
|
||||||
1: ba
|
|
||||||
bababc
|
|
||||||
0: bababc
|
|
||||||
1: ba
|
|
||||||
\= Expect no match
|
|
||||||
bababbc
|
|
||||||
No match
|
|
||||||
babababc
|
|
||||||
No match
|
|
||||||
|
|
||||||
/^(ba|b*){1,2}?bc/
|
/^(ba|b*){1,2}?bc/
|
||||||
babc
|
babc
|
||||||
0: babc
|
0: babc
|
||||||
|
|
File diff suppressed because it is too large
|
@ -853,10 +853,8 @@ Memory allocation (code space): 28
|
||||||
# with link size - hence multiple tests with different values.
|
# with link size - hence multiple tests with different values.
|
||||||
|
|
||||||
/(?'ABC'\[[bar](]{792}*THEN:\[A]{255}\[)]{793}/expand,-fullbincode,parens_nest_limit=1000
|
/(?'ABC'\[[bar](]{792}*THEN:\[A]{255}\[)]{793}/expand,-fullbincode,parens_nest_limit=1000
|
||||||
Failed: error 186 at offset 5813: regular expression is too complicated
|
|
||||||
|
|
||||||
/(?'ABC'\[[bar](]{793}*THEN:\[A]{255}\[)]{794}/expand,-fullbincode,parens_nest_limit=1000
|
/(?'ABC'\[[bar](]{793}*THEN:\[A]{255}\[)]{794}/expand,-fullbincode,parens_nest_limit=1000
|
||||||
Failed: error 186 at offset 5820: regular expression is too complicated
|
|
||||||
|
|
||||||
/(?'ABC'\[[bar](]{1793}*THEN:\[A]{255}\[)]{1794}/expand,-fullbincode,parens_nest_limit=2000
|
/(?'ABC'\[[bar](]{1793}*THEN:\[A]{255}\[)]{1794}/expand,-fullbincode,parens_nest_limit=2000
|
||||||
Failed: error 186 at offset 12820: regular expression is too complicated
|
Failed: error 186 at offset 12820: regular expression is too complicated
|
||||||
|
|
|
@ -853,10 +853,8 @@ Memory allocation (code space): 28
|
||||||
# with link size - hence multiple tests with different values.
|
# with link size - hence multiple tests with different values.
|
||||||
|
|
||||||
/(?'ABC'\[[bar](]{792}*THEN:\[A]{255}\[)]{793}/expand,-fullbincode,parens_nest_limit=1000
|
/(?'ABC'\[[bar](]{792}*THEN:\[A]{255}\[)]{793}/expand,-fullbincode,parens_nest_limit=1000
|
||||||
Failed: error 186 at offset 5813: regular expression is too complicated
|
|
||||||
|
|
||||||
/(?'ABC'\[[bar](]{793}*THEN:\[A]{255}\[)]{794}/expand,-fullbincode,parens_nest_limit=1000
|
/(?'ABC'\[[bar](]{793}*THEN:\[A]{255}\[)]{794}/expand,-fullbincode,parens_nest_limit=1000
|
||||||
Failed: error 186 at offset 5820: regular expression is too complicated
|
|
||||||
|
|
||||||
/(?'ABC'\[[bar](]{1793}*THEN:\[A]{255}\[)]{1794}/expand,-fullbincode,parens_nest_limit=2000
|
/(?'ABC'\[[bar](]{1793}*THEN:\[A]{255}\[)]{1794}/expand,-fullbincode,parens_nest_limit=2000
|
||||||
Failed: error 186 at offset 12820: regular expression is too complicated
|
Failed: error 186 at offset 12820: regular expression is too complicated
|
||||||
|
|