Commit Graph

505 Commits

Author SHA1 Message Date
Philip Hazel 8b133fa0ba Implement -Z in pcre2grep and update documentation 2022-07-30 17:41:49 +01:00
Philip Hazel cc5e121c8e Added some special heap tests 2022-07-28 17:58:19 +01:00
Philip Hazel d90fb23878 Refactor match_data() to always use the heap instead of having an initial frames vector on the stack; some consequential adjustmentsneeded. 2022-07-27 17:44:55 +01:00
Philip Hazel 1bb2b97b29 Update build workflow to add test in an Alpine container 2022-04-22 09:31:05 +01:00
Philip Hazel fdd9479108 Fix incorrect compiling when [Aa] etc. are quantified 2022-01-26 08:37:18 +00:00
Philip Hazel 504ff06fff Fix overrun bug in recent property name parsing change 2022-01-14 12:24:23 +00:00
Philip Hazel 628a804102 Tests for new Boolean properties 2022-01-10 12:41:28 +00:00
Philip Hazel 636569a957 Initial code for Boolean property support 2022-01-09 14:46:43 +00:00
Zoltan Herczeg f90542a209
Improve unicode property abbreviation support (#74)
* Improve unicode property abbreviation support

* Auto-generate script names

Co-authored-by: Zoltan Herczeg <hzmester@freemail.hu>
2022-01-07 10:01:18 +00:00
Zoltan Herczeg 435140a0ac
Fix script extension support on jit (#69)
Co-authored-by: Zoltan Herczeg <hzmester@freemail.hu>
2022-01-03 15:49:26 +00:00
Zoltan Herczeg e7457003cd
Auto generate unicode property tests. (#67)
Co-authored-by: Zoltan Herczeg <hzmester@freemail.hu>
2021-12-31 16:47:37 +00:00
Philip Hazel d888d36013 Update script run code to work with new script extensions coding 2021-12-31 16:06:05 +00:00
Zoltan Herczeg 6614b281bc
Implement script extension support in JIT. (#66)
Fix incorect operator in GenerateUcd.py (modulo -> bitwise and)

Co-authored-by: Zoltan Herczeg <hzmester@freemail.hu>
2021-12-29 15:57:32 +00:00
Philip Hazel 7713f33e46 Add support for 4-character script abbreviations 2021-12-28 15:10:12 +00:00
Philip Hazel b29732063b Revised script handling (see ChangeLog) 2021-12-21 16:11:30 +00:00
Philip Hazel 194a15315a Correct comment in test 2021-12-14 15:54:48 +00:00
Zoltan Herczeg 4243515033 JIT support for Bidi_Control and Bidi_Class 2021-12-13 07:04:19 +00:00
Philip Hazel 49b29f837d Add short synonyms for Bidi_Control and Bidi_Class 2021-12-10 16:32:10 +00:00
Philip Hazel 0246c6bf64 Add support for Bidi_Control and Bidi_Class properties 2021-12-08 15:34:27 +00:00
Philip Hazel 4ef0c51d2b Interpret NULL pointer, zero length as an empty string for subjects and replacements. 2021-11-30 16:34:39 +00:00
Carlo Marcelo Arenas Belón 587b94277b
doc: formatting/typo fixes to documentation (#47)
* doc: fix incorrect use of JOIN and typo

Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>

* doc: reformat of pcre2_substitute to align options

includes some rewording to fit better in an 80 char wide troff output.

Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>

* doc: update names to pcre2
2021-11-27 16:27:49 +00:00
Philip Hazel 1ed34b9cb1 Update version to 10.40-RC1 and fix consequent version test issue. 2021-11-09 17:12:50 +00:00
Carlo Marcelo Arenas Belón 7db8784296
pcre2grep: correctly handle multiple passes (#35)
* tests: use a explicit filehandle to share in testing -m

The way stdin is shared to all participants of a subshell varies
per shell, and at least the standard /bin/sh in Solaris seem to
create a new copy for each command, defeating the purpose of the
test.

Use instead exec to create a filehandle that could then be used
explicitly in the test to confirm that the stream is set.

* pcre2grep: correctly handle multiple passes

When the -m option is used, pcre2grep is meant to exit after enough
matches are found but while leaving the stream pinned to the next position
after the last match.

Unfortunately, it wasn't tracking correctly the beginning of the stream
on subsequent passes, and therefore it will fail to use the right seek
value.

Grab the position of the stream at the beginning and while at it, make
sure that the stream passed hasn't been consumed already.
2021-11-09 16:57:48 +00:00
Philip Hazel 21c26698b3 Lock out \K in lookaround assertions by default, but provide an option to
re-enable the old behaviour, just in case.
2021-08-30 16:57:44 +01:00
Philip.Hazel 85fc061dcf Documentation and tests update. 2021-04-28 14:21:38 +00:00
Philip.Hazel 8c1df186ab Add another test, tidy ChangeLog. 2021-02-19 12:05:57 +00:00
Philip.Hazel 2c4d3942e4 Fix \K within recursion bug in interpreter. 2021-02-18 09:46:08 +00:00
Philip.Hazel 8144ae04e9 Fix some numerical checking bugs, Bugzilla 2690. 2021-02-01 17:56:12 +00:00
Philip.Hazel 027c9375c0 Update RunGrepTest to use tr for handling binary zeros instead of sed, which it
is hoped with increase portability. Bugzilla #2681.
2021-01-04 17:17:48 +00:00
Philip.Hazel 9e15c97b6d Fix bug in RunTest: not reporting failure in test 2, and fix bugs in RunTest
and RunTest.bat causing test 2 to fail when not building in source directory.
2020-11-22 15:16:05 +00:00
Zoltán Herczeg 2451870e3c Fixed a word boundary check bug in JIT when partial matching is enabled. 2020-10-27 08:16:04 +00:00
Philip.Hazel 81da2b97e3 pcre2grep update: -m and $x{..}, $o{..} escapes. Also some doc updates. 2020-10-04 16:34:31 +00:00
Philip.Hazel f8cbb1f58d Fix Bugzilla #2642: no match bug in 8-bit mode for caseless invalid utf
matching.
2020-09-15 14:36:23 +00:00
Philip.Hazel a2f0fd01c7 Update pcre2test to check delimiters after #perltest and fix some in test 1. 2020-09-14 15:39:39 +00:00
Philip.Hazel 5652d41209 Fix delimiters in tests 1 and 4 for correct Perl behaviour (Bugzilla #2641).
Also move \K in lookaround tests to test 2 (Perl no longer supports).
2020-09-13 15:56:32 +00:00
Philip.Hazel 0ad89ab06d Fix read overflow for invalid VERSION test with one fractional digit at the end
of a pattern. Fixes ClusterFuzz 23779.
2020-06-29 15:35:49 +00:00
Philip.Hazel ce558bbff1 Second attempt at getting rid of gcc 10 warning. 2020-04-24 15:36:53 +00:00
Philip.Hazel c472f3f91a Update to Unicode 13.0.0. 2020-03-25 17:18:33 +00:00
Philip.Hazel 8057c3c8b9 Renamed dftables as pcre2_dftables and enable it to write the tables in binary.
Update documentation about character tables.
2020-03-20 18:09:59 +00:00
Philip.Hazel 3155a6951f Fix bugs in new UCP casing code for back references and characters with more
than 2 cases.
2020-02-26 16:53:39 +00:00
Zoltán Herczeg 305e273e99 Follow ucp changes in JIT. 2020-02-26 10:18:43 +00:00
Philip.Hazel 68f9c49517 Fix bug introduced in recent UCP changes (writing outside starting code unit
bitmap for non-UTF caseless character U+00DF).
2020-02-25 16:47:36 +00:00
Philip.Hazel 3be538015b Fix bad lookbehind compilation when preceded by a DEFINE group. 2020-02-24 17:29:00 +00:00
Philip.Hazel f50ee03f5d Fix bug in UTF-16 checker returning wrong offset for missing low surrogate. 2020-02-24 15:39:56 +00:00
Philip.Hazel 4a7dfab0ec Unicode upper/lower casing is now used when UCP is set, even if UTF is not set.
This is not yet documented, and it not yet implemented in JIT.
2020-02-23 16:40:05 +00:00
Philip.Hazel a57787b7cd Fix problems with new PCRE2_SUBSTITUTE_MATCHED code. 2020-02-16 17:46:40 +00:00
Philip.Hazel 3a6b4948d1 Fix bug in processing (?(DEFINE)...) within lookbehind assertions. 2020-01-26 15:31:27 +00:00
Philip.Hazel 9e960f5465 Ensure a newline after the final line in a file is output by pcre2grep. 2020-01-25 15:50:44 +00:00
Philip.Hazel e8d70e2459 Implement PCRE2_SUBSTITUTE_REPLACEMENT_ONLY. 2020-01-22 17:50:12 +00:00
Philip.Hazel 7171d86587 Update Windows-specific test output (overlooked wording change). 2020-01-15 16:50:45 +00:00