490 Commits

Author SHA1 Message Date
Philip Hazel
194a15315a Correct comment in test 2021-12-14 15:54:48 +00:00
Zoltan Herczeg
4243515033 JIT support for Bidi_Control and Bidi_Class 2021-12-13 07:04:19 +00:00
Philip Hazel
49b29f837d Add short synonyms for Bidi_Control and Bidi_Class 2021-12-10 16:32:10 +00:00
Philip Hazel
0246c6bf64 Add support for Bidi_Control and Bidi_Class properties 2021-12-08 15:34:27 +00:00
Philip Hazel
4ef0c51d2b Interpret NULL pointer, zero length as an empty string for subjects and replacements. 2021-11-30 16:34:39 +00:00
Carlo Marcelo Arenas Belón
587b94277b
doc: formatting/typo fixes to documentation (#47)
* doc: fix incorrect use of JOIN and typo

Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>

* doc: reformat of pcre2_substitute to align options

includes some rewording to fit better in an 80 char wide troff output.

Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>

* doc: update names to pcre2
2021-11-27 16:27:49 +00:00
Philip Hazel
1ed34b9cb1 Update version to 10.40-RC1 and fix consequent version test issue. 2021-11-09 17:12:50 +00:00
Carlo Marcelo Arenas Belón
7db8784296
pcre2grep: correctly handle multiple passes (#35)
* tests: use a explicit filehandle to share in testing -m

The way stdin is shared to all participants of a subshell varies
per shell, and at least the standard /bin/sh in Solaris seem to
create a new copy for each command, defeating the purpose of the
test.

Use instead exec to create a filehandle that could then be used
explicitly in the test to confirm that the stream is set.

* pcre2grep: correctly handle multiple passes

When the -m option is used, pcre2grep is meant to exit after enough
matches are found but while leaving the stream pinned to the next position
after the last match.

Unfortunately, it wasn't tracking correctly the beginning of the stream
on subsequent passes, and therefore it will fail to use the right seek
value.

Grab the position of the stream at the beginning and while at it, make
sure that the stream passed hasn't been consumed already.
2021-11-09 16:57:48 +00:00
Philip Hazel
21c26698b3 Lock out \K in lookaround assertions by default, but provide an option to
re-enable the old behaviour, just in case.
2021-08-30 16:57:44 +01:00
Philip.Hazel
85fc061dcf Documentation and tests update. 2021-04-28 14:21:38 +00:00
Philip.Hazel
8c1df186ab Add another test, tidy ChangeLog. 2021-02-19 12:05:57 +00:00
Philip.Hazel
2c4d3942e4 Fix \K within recursion bug in interpreter. 2021-02-18 09:46:08 +00:00
Philip.Hazel
8144ae04e9 Fix some numerical checking bugs, Bugzilla 2690. 2021-02-01 17:56:12 +00:00
Philip.Hazel
027c9375c0 Update RunGrepTest to use tr for handling binary zeros instead of sed, which it
is hoped with increase portability. Bugzilla #2681.
2021-01-04 17:17:48 +00:00
Philip.Hazel
9e15c97b6d Fix bug in RunTest: not reporting failure in test 2, and fix bugs in RunTest
and RunTest.bat causing test 2 to fail when not building in source directory.
2020-11-22 15:16:05 +00:00
Zoltán Herczeg
2451870e3c Fixed a word boundary check bug in JIT when partial matching is enabled. 2020-10-27 08:16:04 +00:00
Philip.Hazel
81da2b97e3 pcre2grep update: -m and $x{..}, $o{..} escapes. Also some doc updates. 2020-10-04 16:34:31 +00:00
Philip.Hazel
f8cbb1f58d Fix Bugzilla #2642: no match bug in 8-bit mode for caseless invalid utf
matching.
2020-09-15 14:36:23 +00:00
Philip.Hazel
a2f0fd01c7 Update pcre2test to check delimiters after #perltest and fix some in test 1. 2020-09-14 15:39:39 +00:00
Philip.Hazel
5652d41209 Fix delimiters in tests 1 and 4 for correct Perl behaviour (Bugzilla #2641).
Also move \K in lookaround tests to test 2 (Perl no longer supports).
2020-09-13 15:56:32 +00:00
Philip.Hazel
0ad89ab06d Fix read overflow for invalid VERSION test with one fractional digit at the end
of a pattern. Fixes ClusterFuzz 23779.
2020-06-29 15:35:49 +00:00
Philip.Hazel
ce558bbff1 Second attempt at getting rid of gcc 10 warning. 2020-04-24 15:36:53 +00:00
Philip.Hazel
c472f3f91a Update to Unicode 13.0.0. 2020-03-25 17:18:33 +00:00
Philip.Hazel
8057c3c8b9 Renamed dftables as pcre2_dftables and enable it to write the tables in binary.
Update documentation about character tables.
2020-03-20 18:09:59 +00:00
Philip.Hazel
3155a6951f Fix bugs in new UCP casing code for back references and characters with more
than 2 cases.
2020-02-26 16:53:39 +00:00
Zoltán Herczeg
305e273e99 Follow ucp changes in JIT. 2020-02-26 10:18:43 +00:00
Philip.Hazel
68f9c49517 Fix bug introduced in recent UCP changes (writing outside starting code unit
bitmap for non-UTF caseless character U+00DF).
2020-02-25 16:47:36 +00:00
Philip.Hazel
3be538015b Fix bad lookbehind compilation when preceded by a DEFINE group. 2020-02-24 17:29:00 +00:00
Philip.Hazel
f50ee03f5d Fix bug in UTF-16 checker returning wrong offset for missing low surrogate. 2020-02-24 15:39:56 +00:00
Philip.Hazel
4a7dfab0ec Unicode upper/lower casing is now used when UCP is set, even if UTF is not set.
This is not yet documented, and it not yet implemented in JIT.
2020-02-23 16:40:05 +00:00
Philip.Hazel
a57787b7cd Fix problems with new PCRE2_SUBSTITUTE_MATCHED code. 2020-02-16 17:46:40 +00:00
Philip.Hazel
3a6b4948d1 Fix bug in processing (?(DEFINE)...) within lookbehind assertions. 2020-01-26 15:31:27 +00:00
Philip.Hazel
9e960f5465 Ensure a newline after the final line in a file is output by pcre2grep. 2020-01-25 15:50:44 +00:00
Philip.Hazel
e8d70e2459 Implement PCRE2_SUBSTITUTE_REPLACEMENT_ONLY. 2020-01-22 17:50:12 +00:00
Philip.Hazel
7171d86587 Update Windows-specific test output (overlooked wording change). 2020-01-15 16:50:45 +00:00
Philip.Hazel
03720de840 Documentation update and another cunning test pattern. 2020-01-05 12:32:29 +00:00
Philip.Hazel
5ba5230b82 Allow real repetition of assertions. 2020-01-01 12:07:02 +00:00
Philip.Hazel
eaf4572ff8 Some test files needed updating for link sizes 3 and 4. 2019-12-29 11:56:45 +00:00
Philip.Hazel
ac4ab7186d Add (?* and (?<* synonyms for non-atomic lookarounds. 2019-12-28 13:53:59 +00:00
Philip.Hazel
d170829b26 Implement PCRE2_SUBSTITUTE_MATCHED. 2019-12-27 13:35:17 +00:00
Philip.Hazel
f3fd8b18cb Implement PCRE2_SUBSTITUTE_LITERAL. 2019-12-26 14:53:24 +00:00
Philip.Hazel
0a2033f0f7 Remove atomic restriction on capture groups containing recursive back
references, as since 10.30 it has been unnecessary.
2019-12-18 16:16:12 +00:00
Philip.Hazel
3c869816ac Fix sometimes failing caseless non-ASCII matching in assertion. 2019-11-16 17:30:07 +00:00
Philip.Hazel
7ecc9cdfaf Fix error offset bug introduced at 1176. 2019-10-16 17:12:13 +00:00
Philip.Hazel
e413f3147c Optimize certain starting code unit bit maps into a single starting code unit. 2019-09-13 17:02:06 +00:00
Philip.Hazel
d917899be5 Improve starting-byte bit map for UTF-8 patterns with wide characters in
classes.
2019-09-10 15:38:42 +00:00
Philip.Hazel
bf15267c30 Optimize classes such as [Aa] to be a single caseless character. 2019-09-09 17:00:19 +00:00
Philip.Hazel
963b570fd0 Back off failed attempt to handle nested lookbehinds for estimating how much of
a partial match to retain for multi-segment matching. Document the current 
difficulty if the whole first segment cannot be retained.
2019-09-04 18:14:54 +00:00
Philip.Hazel
45b219e6bc Fix bug introduced in commit 1133. Lookbehinds that follow a condition were not
always properly handled.
2019-08-26 16:28:26 +00:00
Philip.Hazel
71eb916d79 Fix allusedtext bug, rightmost consulted character incorrect in negative
lookaheads.
2019-08-10 11:34:50 +00:00