Commit Graph

135 Commits

Author SHA1 Message Date
Philip Hazel 0246c6bf64 Add support for Bidi_Control and Bidi_Class properties 2021-12-08 15:34:27 +00:00
Philip Hazel 4ef0c51d2b Interpret NULL pointer, zero length as an empty string for subjects and replacements. 2021-11-30 16:34:39 +00:00
Carlo Marcelo Arenas Belón ae4e6261e5
match: avoid crash if subject NULL and PCRE2_ZERO_TERMINATED (#53)
* pcre2_match: avoid crash if subject NULL and PCRE2_ZERO_TERMINATED

When length of subject is PCRE2_ZERO_TERMINATED strlen is used
to calculate its size, which will trigger a crash if subject is
also NULL.

Move the NULL check before strlen on it would be used, and make
sure or dependent variables are set after the NULL validation
as well.

While at it, fix a typo in a debug flag in the same file, which
is otherwise unrelated and make sure the full section of constrain
checks can be identified clearly using the leading comment alone.

* pcre2_dfa_match: avoid crash if subject NULL and PCRE2_ZERO_TERMINATED

When length of subject is PCRE2_ZERO_TERMINATED strlen is used
to calculate its size, which will trigger a crash if subject is
also NULL.

Move the NULL check before the detection for subject sizes to
avoid this issue.

* pcre2_substitute: avoid crash if subject or replacement are NULL

The underlying pcre2_match() function will validate the subject if
needed, but will crash when length is PCRE2_ZERO_TERMINATED or if
subject == NULL and pcre2_match() is not being called because
match_data was provided.

The replacement parameter is missing NULL checks, and so currently
allows for an equivalent response to "" if rlength == 0.

Restrict all other cases to avoid strlen(NULL) crashes in the same
way that is done for subject, but also make sure to reject invalid
length values as early as possible.
2021-11-27 16:49:31 +00:00
Philip Hazel 8f3e11a355 Doc file tidies for 10.38-RC1 2021-08-31 17:14:42 +01:00
Philip Hazel eea410b33a Improve code for "starts with" optimization in the interpreters. 2021-08-29 17:25:59 +01:00
Philip.Hazel cd45050ee4 Final file tidies for 10.37-RC1 2021-04-28 16:44:51 +00:00
Philip.Hazel 2c4d3942e4 Fix \K within recursion bug in interpreter. 2021-02-18 09:46:08 +00:00
Philip.Hazel 000bbf2ea7 File tidies for 10.36-RC1 2020-11-06 17:27:35 +00:00
Philip.Hazel f8cbb1f58d Fix Bugzilla #2642: no match bug in 8-bit mode for caseless invalid utf
matching.
2020-09-15 14:36:23 +00:00
Philip.Hazel 5ec5c45423 Added tests for __attribute__((uninitialized)) to both the configure and
CMake build files. Used to disable initialization of the match stack frames
vector (clang has an automatic initialization feature).
2020-04-23 16:50:45 +00:00
Philip.Hazel 8b3f8af535 File tidies for 10.35-RC1 release candidate. 2020-04-15 16:34:36 +00:00
Philip.Hazel 3155a6951f Fix bugs in new UCP casing code for back references and characters with more
than 2 cases.
2020-02-26 16:53:39 +00:00
Philip.Hazel 4a7dfab0ec Unicode upper/lower casing is now used when UCP is set, even if UTF is not set.
This is not yet documented, and it not yet implemented in JIT.
2020-02-23 16:40:05 +00:00
Philip.Hazel 777582d4de Avoid some VS compiler warnings. 2019-12-26 15:10:26 +00:00
Philip.Hazel 7ecc9cdfaf Fix error offset bug introduced at 1176. 2019-10-16 17:12:13 +00:00
Philip.Hazel 7bbdc58513 Fix pessimizing optimization of start-of-match code units in the interpreters. 2019-09-06 16:08:45 +00:00
Philip.Hazel 71eb916d79 Fix allusedtext bug, rightmost consulted character incorrect in negative
lookaheads.
2019-08-10 11:34:50 +00:00
Philip.Hazel 81ad92820a Comments updates. 2019-08-01 16:59:50 +00:00
Philip.Hazel 3572634086 More partial match tweaks. 2019-07-22 16:30:44 +00:00
Philip.Hazel c84a06c96e Update definition of partial match and fix \z and \Z (as documented). 2019-07-21 16:48:13 +00:00
Philip.Hazel 4677b1b0bb Tidy partial matching code; prepare for possible future change. 2019-07-14 16:44:46 +00:00
Philip.Hazel 620f3a1307 Implement non-atomic positive assertions. 2019-07-13 11:12:03 +00:00
Philip.Hazel 49f174ef78 Make pcre2_match() return (*MARK) names from successful conditional assertions,
as Perl and the JIT do.
2019-06-13 16:49:40 +00:00
Philip.Hazel d5dc4e0c33 Tweak limits on "must have" code unit searches (improves some performance). 2019-05-28 16:34:28 +00:00
Philip.Hazel 16c046ce50 Implement support for invalid UTF in the pcre2_match() interpreter. 2019-05-24 17:15:48 +00:00
Philip.Hazel 95c9d011e3 Change a number of expressions like 1<<10 to 1u<<10. 2019-04-12 14:40:27 +00:00
Philip.Hazel 9e4e6feee7 Update explanatory comment. 2018-11-27 10:42:59 +00:00
Philip.Hazel 8a0dd8955a Set subject field in match data to NULL after failed match. 2018-10-19 15:31:16 +00:00
Philip.Hazel f90ce1a333 Implement PCRE2_COPY_MATCHED_SUBJECT. 2018-10-17 08:33:38 +00:00
Philip.Hazel 866750fd53 Basic "script run" implementation. Not yet complete, and not yet documented. 2018-10-02 15:25:58 +00:00
Philip.Hazel 392974a0cb File tidies and documentation update for 10.32-RC1 Release Candidate. 2018-08-13 11:57:09 +00:00
Philip.Hazel 192b82cf6e Allow :NAME on (*ACCEPT), (*FAIL), and (*COMMIT) and fix bug with (*MARK)
followed by (*ACCEPT) in an assertion. More small updates to perltest.sh.
2018-07-21 14:34:51 +00:00
Philip.Hazel 666e94cd59 Fixed atomic group backtracking bug. 2018-07-16 15:24:32 +00:00
Philip.Hazel 9d87fcb727 Patches for portability. 2018-06-20 17:05:31 +00:00
Philip.Hazel e75410a5d8 More typos and changes to "Kibibytes" for "Kilobytes". 2018-06-18 14:03:33 +00:00
Philip.Hazel fabea723cf Typos in documentation and comments noted by Jason Hood. 2018-06-17 14:13:28 +00:00
Philip.Hazel b26aa366ba Fix \C bug with repeated character classes in UTF-8 mode. 2018-02-19 17:26:33 +00:00
Philip.Hazel 85f8ecba58 Tidy ACROSSCHAR macro to take same form as FORWARDCHAR and BACKCHAR. 2018-01-01 15:13:24 +00:00
Philip.Hazel 4048606896 Small tidy to start of match optimizations. 2018-01-01 15:05:27 +00:00
Philip.Hazel 807f37095d Previous FIRSTLINE patch was broken. Fix it. 2018-01-01 14:54:06 +00:00
Philip.Hazel 7a6e8a4454 Fix PCRE2_FIRSTLINE bug when a pattern match starts with the first code unit of
a newline sequence.
2018-01-01 14:12:35 +00:00
Philip.Hazel 94d5f4a050 Add callout_flags to callout blocks, and set bits within it from pcre2_match()
interpretation.
2017-12-22 15:56:27 +00:00
Philip.Hazel 5cbab74c97 Rejig how callout blocks are allocated in pcre2_match(). 2017-12-16 16:43:47 +00:00
Philip.Hazel 9e38537b87 A small code tidy for one error return. 2017-12-16 16:07:29 +00:00
Philip.Hazel 6f4ee08469 Add some casts to avoid compiler warnings. 2017-09-26 17:01:23 +00:00
Philip.Hazel 42f547bf4d Replace multiple copies of extended grapheme sequence code with a single
subroutine.
2017-09-12 16:28:42 +00:00
Philip.Hazel 4f7a608d56 Update grapheme breaking rules for Unicode 10.0.0. 2017-07-05 08:55:49 +00:00
Philip.Hazel b7d5cee61f Allow anchored patterns to use "first code unit" optimization. 2017-06-30 16:00:33 +00:00
Philip.Hazel f850015168 Add suitable "fall through" comments for latest gcc warnings. 2017-06-03 17:50:03 +00:00
Philip.Hazel 3d80fa4fc2 Implement PCRE2_NEWLINE_NUL. 2017-05-26 17:14:36 +00:00