Commit Graph

1472 Commits

Author SHA1 Message Date
Zoltan Herczeg dea56d2df9 JIT compiler update. 2022-02-24 14:15:15 +00:00
Adam 111cd470b5
Fix typo `with-match-limit_depth` -> `with-match-limit-depth` (#83) 2022-01-26 12:15:11 +00:00
Philip Hazel fdd9479108 Fix incorrect compiling when [Aa] etc. are quantified 2022-01-26 08:37:18 +00:00
Philip Hazel 419e3c68a3 Tidy comments 2022-01-14 16:05:30 +00:00
Zoltan Herczeg e21345de97
Extend unicode boolean property bitset index to 12 bit (#81)
Co-authored-by: Zoltan Herczeg <hzmester@freemail.hu>
2022-01-14 15:51:03 +00:00
Philip Hazel e85a81ebac Correct CMakeLists.txt for MSVC debugger file names 2022-01-14 12:37:24 +00:00
Philip Hazel 504ff06fff Fix overrun bug in recent property name parsing change 2022-01-14 12:24:23 +00:00
Philip Hazel 360a84e80b Update descriptive comments in UCD generation. 2022-01-12 17:38:48 +00:00
Zoltan Herczeg 061e57695a
Merge scriptx and bidi fields (#78)
Co-authored-by: Zoltan Herczeg <hzmester@freemail.hu>
2022-01-12 17:00:12 +00:00
Philip Hazel 7f7d3e8521 Documentation update for binary property support 2022-01-12 15:30:22 +00:00
Philip Hazel bf35c0518c Add -LP and -LS (list properties, list scripts) features to pcre2test. 2022-01-12 15:01:14 +00:00
Zoltan Herczeg 68fbc1982e
Support boolean properties in JIT (#76)
Co-authored-by: Zoltan Herczeg <hzmester@freemail.hu>
2022-01-11 16:03:34 +00:00
Philip Hazel 06d3a66065 Fix bug in modifier listing 2022-01-11 09:21:27 +00:00
Philip Hazel 87571b5af3 Update documentation and comments for UCD generation 2022-01-10 16:26:41 +00:00
Philip Hazel 838cdac4dc Remove vestiges of previous Bidi_Class coding 2022-01-10 14:57:45 +00:00
Philip Hazel 628a804102 Tests for new Boolean properties 2022-01-10 12:41:28 +00:00
Philip Hazel ec091e2e44 Restore lost de-duplication 2022-01-10 11:31:27 +00:00
Philip Hazel 636569a957 Initial code for Boolean property support 2022-01-09 14:46:43 +00:00
Philip Hazel 81d3729c66 Temporary note in maint/README and update ucptestdata for changes to script numbers 2022-01-07 10:21:09 +00:00
Zoltan Herczeg f90542a209
Improve unicode property abbreviation support (#74)
* Improve unicode property abbreviation support

* Auto-generate script names

Co-authored-by: Zoltan Herczeg <hzmester@freemail.hu>
2022-01-07 10:01:18 +00:00
Carlo Marcelo Arenas Belón 14dbc6e6ec
jit: use correct type when checking for max value (#73)
eb42305f (jit: avoid integer wraparound in stack size definition (#42),
2021-11-19) introduces a check to avoid an integer overflow when
allocating stack size for JIT.

Unfortunately the maximum value was using PCRE2_SIZE_MAX, eventhough
the variable is of type size_t, so correct it.

Practically; the issue shouldn't affect the most common configurations
where both values are the same, and it will be unlikely that there would
be a configuration where PCRE2_SIZE_MAX > SIZE_MAX, hence the mistake
is unlikely to have reintroduced the original bug and this change should
be therefore mostly equivalent.

Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
2022-01-06 14:46:43 +01:00
Philip Hazel 80205ee2a0 ChangeLog entry for PR#72 2022-01-04 17:11:57 +00:00
Jessica Clarke 04ecb267c0
match: Properly align heapframes for CHERI/Arm's Morello prototype (#72)
On CHERI, and thus Arm's Morello prototype, pointers are represented as
hardware capabilities, which consist of both an integer address and
additional metadata, meaning they are twice the size of the platform's
size_t type, i.e. 16 bytes on a 64-bit system. The ovector member of
heapframe happens to only be 8 byte aligned, and so computing frame_size
ends up with a multiple of 8 but not 16. Whilst the first frame is
always suitably aligned, this then misaligns the frame that follows it,
resulting in an alignment fault when storing a pointer to Fecode at the
start of match.

Thus, round up frame_size to a multiple of heapframe's alignment to
ensure alignment is preserved. This can be completely optimised away on
traditional architectures and, since CHERI's capabilities are in fact
2 * sizeof(PCRE2_SIZE) bytes in size, the variable part of the
expression is also proven to be a multiple of the alignment and so the
aligning gets folded into the offsetof part by adding an additional 8,
so no dynamic alignment code is needed even on CHERI architectures.
2022-01-04 17:06:14 +00:00
Jessica Clarke 534b4760e3
RunGrepTest: Fix tests 132 and 133 when srcdir is relative (#71)
Notably, running the script directly from a build subdirectory will
infer srcdir as .. if not otherwise set, but doesn't work for these.
With this commit sh pcre2_grep_test.sh works as expected.
2022-01-04 16:59:03 +00:00
Philip Hazel 31fb2e58a1 Suppress compiler fall-through warnings 2022-01-03 15:57:48 +00:00
Zoltan Herczeg 435140a0ac
Fix script extension support on jit (#69)
Co-authored-by: Zoltan Herczeg <hzmester@freemail.hu>
2022-01-03 15:49:26 +00:00
Philip Hazel c24047f15d Documentation update 2021-12-31 16:59:44 +00:00
Zoltan Herczeg e7457003cd
Auto generate unicode property tests. (#67)
Co-authored-by: Zoltan Herczeg <hzmester@freemail.hu>
2021-12-31 16:47:37 +00:00
Philip Hazel d888d36013 Update script run code to work with new script extensions coding 2021-12-31 16:06:05 +00:00
Zoltan Herczeg 6614b281bc
Implement script extension support in JIT. (#66)
Fix incorect operator in GenerateUcd.py (modulo -> bitwise and)

Co-authored-by: Zoltan Herczeg <hzmester@freemail.hu>
2021-12-29 15:57:32 +00:00
Zoltan Herczeg afa4756d19
Rework script extension handling (#64)
Co-authored-by: Zoltan Herczeg <hzmester@freemail.hu>
2021-12-29 09:35:22 +00:00
Philip Hazel 7713f33e46 Add support for 4-character script abbreviations 2021-12-28 15:10:12 +00:00
Michael Kaufmann af2637ee5e
Fix parameter types in the pcre2serialize man page (#63) 2021-12-27 11:57:28 +00:00
Philip Hazel 98e7d70bc6 Refactor Python scripts for generating Unicode property data 2021-12-26 17:49:58 +00:00
Philip Hazel 321b559ed4 Ignore Python cache 2021-12-24 16:20:26 +00:00
Philip Hazel 16c8a84cce Arrange to distribute pcre2_ucptables.c 2021-12-23 16:13:45 +00:00
Philip Hazel 4514ddd2a2 Split generated tables from fixed tables 2021-12-22 16:55:30 +00:00
Philip Hazel 944f0e10a1 Documentation for script handling update 2021-12-22 15:02:26 +00:00
Philip Hazel b29732063b Revised script handling (see ChangeLog) 2021-12-21 16:11:30 +00:00
Philip Hazel 92d7cf1dd0 Very minor code speed up for maximizing character property matches 2021-12-17 12:30:05 +00:00
Philip Hazel 1d432ee3cf Do bidi synonyms properly 2021-12-15 11:48:23 +00:00
Philip Hazel 194a15315a Correct comment in test 2021-12-14 15:54:48 +00:00
Philip Hazel 1c41a5b815 Fix minor issues raised by Clang sanitize 2021-12-14 15:52:24 +00:00
Zoltan Herczeg 4243515033 JIT support for Bidi_Control and Bidi_Class 2021-12-13 07:04:19 +00:00
Philip Hazel 49b29f837d Add short synonyms for Bidi_Control and Bidi_Class 2021-12-10 16:32:10 +00:00
Philip Hazel 30abd0ac8d Documentation for Bidi_Control and Bidi_Class 2021-12-08 16:37:34 +00:00
Philip Hazel 0246c6bf64 Add support for Bidi_Control and Bidi_Class properties 2021-12-08 15:34:27 +00:00
Philip Hazel 823d4ac956 Add bidi class and control information to Unicode property data 2021-12-05 18:00:10 +00:00
Philip Hazel ba3d0edcbd Documentation update 2021-12-01 16:21:08 +00:00
Philip Hazel 4ef0c51d2b Interpret NULL pointer, zero length as an empty string for subjects and replacements. 2021-11-30 16:34:39 +00:00