Philip.Hazel
3c869816ac
Fix sometimes failing caseless non-ASCII matching in assertion.
2019-11-16 17:30:07 +00:00
Philip.Hazel
7ecc9cdfaf
Fix error offset bug introduced at 1176.
2019-10-16 17:12:13 +00:00
Philip.Hazel
e413f3147c
Optimize certain starting code unit bit maps into a single starting code unit.
2019-09-13 17:02:06 +00:00
Philip.Hazel
d917899be5
Improve starting-byte bit map for UTF-8 patterns with wide characters in
classes.
2019-09-10 15:38:42 +00:00
Philip.Hazel
bf15267c30
Optimize classes such as [Aa] to be a single caseless character.
2019-09-09 17:00:19 +00:00
Philip.Hazel
963b570fd0
Back off failed attempt to handle nested lookbehinds for estimating how much of
a partial match to retain for multi-segment matching. Document the current
difficulty if the whole first segment cannot be retained.
2019-09-04 18:14:54 +00:00
Philip.Hazel
45b219e6bc
Fix bug introduced in commit 1133. Lookbehinds that follow a condition were not
always properly handled.
2019-08-26 16:28:26 +00:00
Philip.Hazel
71eb916d79
Fix allusedtext bug, rightmost consulted character incorrect in negative
lookaheads.
2019-08-10 11:34:50 +00:00
Philip.Hazel
59c7c5d100
Fix incorrect computation of group length when one branch exceeded 65535.
2019-08-03 08:30:40 +00:00
Philip.Hazel
630e4bb516
Fix typo in test data comment.
2019-08-01 16:42:36 +00:00
Philip.Hazel
aff5a78056
Upgrade to Unicode 12.1.0
2019-07-29 15:32:36 +00:00
Philip.Hazel
fe2df37c9f
Documentation and test update.
2019-07-28 15:32:11 +00:00
Philip.Hazel
24c62fc0d0
(*ACCEPT) at start of branch was not recording "may match empty string".
2019-07-23 16:58:57 +00:00
Zoltán Herczeg
82a4729e13
Follow the partial matching changes in JIT.
2019-07-23 12:34:58 +00:00
Philip.Hazel
3572634086
More partial match tweaks.
2019-07-22 16:30:44 +00:00
Philip.Hazel
c84a06c96e
Update definition of partial match and fix \z and \Z (as documented).
2019-07-21 16:48:13 +00:00
Philip.Hazel
c30815f5a1
Fix bug in recent patch for lookbehinds within lookaheads. Fixes ClusterFuzz
15933.
2019-07-18 17:20:29 +00:00
Philip.Hazel
0d0ee67eb0
Check start code unit bit map for setting minimum length.
2019-07-16 16:16:45 +00:00
Philip.Hazel
046c5cd21c
Fix lookbehind within lookahead within lookbehind misbehaviour bug.
2019-07-16 15:06:21 +00:00
Philip.Hazel
66811c6c73
Fix oversights in recent non-atomic assertions patch. Fixes ClusterFuzz 15837.
2019-07-15 16:04:13 +00:00
Philip.Hazel
620f3a1307
Implement non-atomic positive assertions.
2019-07-13 11:12:03 +00:00
Philip.Hazel
f985a68ea5
Additional overflow test.
2019-07-05 15:49:37 +00:00
Philip.Hazel
2e06fdcdc1
Check for integer overflow when computing lookbehind lengths. Fixes ClusterFuzz
issue 13656.
2019-07-04 17:01:53 +00:00
Philip.Hazel
4866bd3652
Fix bugs in recent patch for setting the maximum lookbehind.
2019-06-28 16:58:08 +00:00
Philip.Hazel
c0d0ee5365
Fix partial matching bug in pcre2_dfa_match().
2019-06-26 16:13:28 +00:00
Philip.Hazel
434e3f7468
Make pcre2test show actual pre-match consulted characters for a partial match,
not the length of the longest lookbehind. Control this by "allusedtext".
2019-06-26 08:23:47 +00:00
Philip.Hazel
d21f7daf9b
Improve maximum lookbehind calculation for nested lookbehinds.
2019-06-25 15:40:42 +00:00
Philip.Hazel
175b4919f7
Update tests.
2019-06-20 17:19:13 +00:00
Philip.Hazel
8eb01ad8a9
Typo in doc and update tests
2019-06-20 16:37:30 +00:00
Philip.Hazel
da5155fed3
Don't ignore {1}+ when it is applied to a parenthesized item.
2019-06-19 16:27:50 +00:00
Philip.Hazel
ef79b978a6
Fix minimum length bug for patterns containing (*ACCEPT).
2019-06-18 16:07:43 +00:00
Philip.Hazel
1ebc2c50cc
Another extension to minimum length calculation.
2019-06-17 16:26:44 +00:00
Philip.Hazel
ead78198d1
Improve minimum length finder in the presence of back references when there are
multiple groups with the same number.
2019-06-16 15:37:45 +00:00
Philip.Hazel
0d1ab8515f
Fix pcre2grep -o bug when ovector overflows; add option to adjust the limit;
raise the default limit; give error if -o requests an uncaptured parens.
2019-06-15 15:51:07 +00:00
Philip.Hazel
300bf6e2d6
Another fix to the recent (*ACCEPT) patch. Fixes ClusterFuzz 15242.
2019-06-14 15:44:57 +00:00
Philip.Hazel
49f174ef78
Make pcre2_match() return (*MARK) names from successful conditional assertions,
as Perl and the JIT do.
2019-06-13 16:49:40 +00:00
Philip.Hazel
1f6b9097f4
Minor improvement to minimum length calculation.
2019-06-13 16:00:11 +00:00
Philip.Hazel
f0c06ee212
Fix minor oversight in previous patch. Fixes ClusterFuzz 15199.
2019-06-11 07:37:29 +00:00
Philip.Hazel
306f2b9c57
Allow (*ACCEPT) to be quantified.
2019-06-10 16:41:22 +00:00
Philip.Hazel
4f31de2866
Add support for invalid UTF-8 matching to pcre2grep.
2019-05-28 14:14:22 +00:00
Philip.Hazel
16c046ce50
Implement support for invalid UTF in the pcre2_match() interpreter.
2019-05-24 17:15:48 +00:00
Philip.Hazel
e118e60a68
Fix crash when \X is used without UTF in JIT.
2019-05-13 16:26:17 +00:00
Philip.Hazel
16de9003e5
Implement a check on the number of capturing parentheses, which for some reason
has never existed. This fixes ClusterFuzz issue 14376.
2019-04-22 12:39:38 +00:00
Philip.Hazel
e85de98d0a
Fix crash in pcre2_substitute() with NULL match context.
2019-03-11 17:29:08 +00:00
Philip.Hazel
255f5e741b
Compile \p{Any} the same as . in DOTALL mode, to benefit from auto-anchoring.
2019-02-13 17:30:24 +00:00
Philip.Hazel
f2e1cea288
Fix overflow bug in new /u code. Fixes ClusterFuzz 13073.
2019-02-13 16:48:30 +00:00
Philip.Hazel
8c8deae8eb
Implement PCRE2_EXTRA_ALT_BSUX to support ECMAScript 6's \u{hhh..} syntax.
2019-02-12 17:50:19 +00:00
Philip.Hazel
d90de8b053
Previous bug-fix was bad. This properly fixes an overrun while reading a
Unicode group name.
2019-02-07 17:59:37 +00:00
Philip.Hazel
d7b10a57d1
Allow non-ASCII in group names when UTF is set; revise group naming terminology
in documentation to use "capture group", as Perl does.
2019-02-06 18:11:36 +00:00
Philip.Hazel
86349f8814
Fix bug in VERSION conditional test in DFA matching.
2019-01-29 14:34:59 +00:00
Philip.Hazel
7de013bac3
Fix issues with BAD_ESCAPE_IS_LITERAL in character classes.
2019-01-04 16:41:32 +00:00
Philip.Hazel
0b64d9cfca
Fix non-recognition of anchoring when preceded by (*MARK) etc.
2018-11-27 16:00:58 +00:00
Philip.Hazel
0ad7ff1549
Add --disable-pcre2grep-callout-fork configuration setting.
2018-11-17 16:45:57 +00:00
Philip.Hazel
9bc81d5229
Upgrade the as yet unreleased substitute callout facility.
2018-11-12 16:02:01 +00:00
Philip.Hazel
996892434f
Fix zero-repeated subroutine call at start of pattern bug, which recorded an
incorrect first code unit.
2018-10-20 09:28:02 +00:00
Philip.Hazel
f90ce1a333
Implement PCRE2_COPY_MATCHED_SUBJECT.
2018-10-17 08:33:38 +00:00
Philip.Hazel
2ba22647d1
Update Makefile.am for compiling with gcov. Add Script Run tests to improve
coverage.
2018-10-14 15:56:36 +00:00
Philip.Hazel
1c4dc562e4
Upgrade the ucptest program (used only by maintainer) and script run tests.
2018-10-14 14:27:16 +00:00
Philip.Hazel
0fc5cda13b
Documentation and tests update for script runs.
2018-10-12 17:02:34 +00:00
Philip.Hazel
4e7a204d18
Update Script Run code to use the Script Extension property instead of the
Script property.
2018-10-09 16:42:21 +00:00
Philip.Hazel
cda4780fb6
Fix bugs of omission in new script run code.
2018-10-03 15:41:47 +00:00
Philip.Hazel
866750fd53
Basic "script run" implementation. Not yet complete, and not yet documented.
2018-10-02 15:25:58 +00:00
Philip.Hazel
f26b0b0bae
Implement Perl 5.28's alphabetic lookaround syntax, e.g. (*pla:...) and also
(*atomic:...).
2018-09-24 16:23:53 +00:00
Philip.Hazel
69254c77f1
Implement PCRE2_EXTRA_ESCAPED_CR_IS_LF
2018-09-21 16:59:48 +00:00
Philip.Hazel
a69267246f
Implement callouts from pcre2_substitute().
2018-09-18 16:31:30 +00:00
Philip.Hazel
3fce7c75e9
Add "allvector" to pcre2test.
2018-09-15 17:10:39 +00:00
Philip.Hazel
bfad956b34
Treat empty-string-matching repeated conditionals the same as ordinary ones
when checking for an anchored pattern.
2018-09-03 15:20:40 +00:00
Philip.Hazel
59c2175ed9
Fix anchoring bug in conditionals with only one branch.
2018-09-02 16:53:29 +00:00
Philip.Hazel
50f0de6015
Lock out \N{U+hhhh} in non-UTF (non-Unicode) modes.
2018-09-02 16:03:27 +00:00
Philip.Hazel
6e6bb40a3d
Fix bad auto-possessification of certain types of class.
2018-08-17 14:45:35 +00:00
Philip.Hazel
9332d4be69
Fix dynamic options changing bug.
2018-08-04 08:20:18 +00:00
Philip.Hazel
b196143523
Make /x more Perl-compatible by recognizing all of Unicode's "Pattern White
Space" characters, not just the ASCII ones.
2018-08-03 09:38:36 +00:00
Philip.Hazel
6e245572b8
Add support for (?^) as now supported by Perl.
2018-07-28 16:23:24 +00:00
Philip.Hazel
a9453f096f
Give specific error for \F as for \L, \U etc.
2018-07-27 16:55:52 +00:00
Philip.Hazel
e9aa3c0a21
Add support for \N{U+dd...}, for ASCII and Unicode modes only.
2018-07-27 16:30:40 +00:00
Philip.Hazel
775481293a
Add more tests for further ClusterFuzz issues, all were fixed by the previous
patch; they just crashed in different ways. The fixed issues are ClusterFuzz
numbers 9522, 9534, 9535, 9541, 9542. The bug was a new one, introduced by a
recent code update (never in a release).
2018-07-22 15:43:00 +00:00
Philip.Hazel
7d97c226c7
Fix oversight in recent OP_COMMIT_ARG update.
2018-07-22 15:19:43 +00:00
Philip.Hazel
5ea9f6b0f1
Some places where the new opcode OP_COMMIT_ARG needs to be handled and which I
forgot.
2018-07-21 14:52:26 +00:00
Philip.Hazel
192b82cf6e
Allow :NAME on (*ACCEPT), (*FAIL), and (*COMMIT) and fix bug with (*MARK)
followed by (*ACCEPT) in an assertion. More small updates to perltest.sh.
2018-07-21 14:34:51 +00:00
Philip.Hazel
666e94cd59
Fixed atomic group backtracking bug.
2018-07-16 15:24:32 +00:00
Philip.Hazel
a0e367f5b6
Update Perl tester to allow for optimization to be turned off. Required moving
some tests out of the Perl-compatible files.
2018-07-14 16:16:51 +00:00
Philip.Hazel
7db5904b9f
Documentation and tests update and minor tweak to perltest.sh.
2018-07-12 17:04:43 +00:00
Philip.Hazel
937617f343
Update to Unicode 11.0.0
2018-07-07 16:10:29 +00:00
Philip.Hazel
50aa69657e
Fix bug in VERSION number reading.
2018-07-02 12:26:04 +00:00
Philip.Hazel
b2294373d7
Ignore qualifiers on lookaheads within lookbehinds when checking for a fixed
length.
2018-07-02 11:23:45 +00:00
Philip.Hazel
1c79bdf36f
Fix global search/replace in pcre2test and pcre2_substitute() when the pattern
matches an empty string, but never at the starting offset.
2018-07-02 10:54:03 +00:00
Philip.Hazel
89c2a02027
Fix bug when \K is used in a lookbehind in a substitute pattern.
2018-06-22 16:29:56 +00:00
Philip.Hazel
e75410a5d8
More typos and changes to "Kibibytes" for "Kilobytes".
2018-06-18 14:03:33 +00:00
Philip.Hazel
3fb01b0443
Ensure all match limit tests set a limit, don't rely on the default.
2018-04-29 15:07:44 +00:00
Philip.Hazel
fb15b37b2c
Remove ctrl/Z from the input for test 6.
2018-04-28 16:05:48 +00:00
Philip.Hazel
75747ebb11
Re-factor pcre2_dfa_match() to use the heap instead of the stack for workspace
vectors when doing recursive function calls.
2018-04-27 16:48:35 +00:00
Philip.Hazel
04919e9d03
Add support to pcre2grep for binary zeros in -f files.
2018-02-24 17:09:19 +00:00
Philip.Hazel
c440473190
Add another test.
2018-02-20 15:37:49 +00:00
Philip.Hazel
b26aa366ba
Fix \C bug with repeated character classes in UTF-8 mode.
2018-02-19 17:26:33 +00:00
Philip.Hazel
aff77100bb
Fix the value passed back for POSIX unset groups when REG_STARTEND has a
non-zero starting offset, and make pcre2test show relevant POSIX unset groups.
2018-02-19 14:49:42 +00:00
Philip.Hazel
53a588431c
Fix auto-possessification bug at the end of a capturing group that is called
recursively.
2018-01-31 17:53:56 +00:00
Zoltán Herczeg
940627c83a
Fix a typo in JIT and add a test.
2018-01-10 09:28:03 +00:00
Zoltán Herczeg
4a4389fa50
Support the new EXTUNI in JIT.
2018-01-06 08:48:11 +00:00
Philip.Hazel
807f37095d
Previous FIRSTLINE patch was broken. Fix it.
2018-01-01 14:54:06 +00:00
Philip.Hazel
7a6e8a4454
Fix PCRE2_FIRSTLINE bug when a pattern match starts with the first code unit of
a newline sequence.
2018-01-01 14:12:35 +00:00