eb42305f (jit: avoid integer wraparound in stack size definition (#42),
2021-11-19) introduces a check to avoid an integer overflow when
allocating stack size for JIT.
Unfortunately the maximum value was using PCRE2_SIZE_MAX, even though
the variable is of type size_t, so correct it.
In practice, the issue shouldn't affect the most common configurations,
where both values are the same, and a configuration where
PCRE2_SIZE_MAX > SIZE_MAX is unlikely to exist, hence the mistake
is unlikely to have reintroduced the original bug and this change should
therefore be mostly equivalent.
Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
On CHERI, and thus Arm's Morello prototype, pointers are represented as
hardware capabilities, which consist of both an integer address and
additional metadata, meaning they are twice the size of the platform's
size_t type, i.e. 16 bytes on a 64-bit system. The ovector member of
heapframe happens to only be 8 byte aligned, and so computing frame_size
ends up with a multiple of 8 but not 16. Whilst the first frame is
always suitably aligned, this then misaligns the frame that follows it,
resulting in an alignment fault when storing a pointer to Fecode at the
start of match.
Thus, round up frame_size to a multiple of heapframe's alignment to
ensure alignment is preserved. This can be completely optimised away on
traditional architectures and, since CHERI's capabilities are in fact
2 * sizeof(PCRE2_SIZE) bytes in size, the variable part of the
expression is also proven to be a multiple of the alignment and so the
aligning gets folded into the offsetof part by adding an additional 8,
so no dynamic alignment code is needed even on CHERI architectures.
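The rounding described above can be sketched as follows. This is only an
illustration: the `frame` struct, `round_up()` and `frame_size_for()` are
hypothetical stand-ins and not the actual heapframe definition or the code
in pcre2_match.

```c
#include <stddef.h>
#include <stdalign.h>

/* Hypothetical stand-in for heapframe: a pointer-bearing header followed
   by a variable-length vector of size_t offsets. */
typedef struct frame {
  const void *ecode;       /* pointer member that drives the alignment */
  size_t      capture_top;
  size_t      ovector[1];  /* really 2 * pairs entries at run time */
} frame;

/* Round size up to the next multiple of align. align must be a power of
   two, and size + align - 1 is assumed not to overflow. */
static size_t round_up(size_t size, size_t align)
{
  return (size + align - 1) & ~(align - 1);
}

/* Size of one frame holding `pairs` offset pairs, padded so that the
   frame following it in a contiguous block is still suitably aligned. */
static size_t frame_size_for(size_t pairs)
{
  size_t raw = offsetof(frame, ovector) + 2 * pairs * sizeof(size_t);
  return round_up(raw, alignof(frame));
}
```

Where pointers and size_t share the same size and alignment, `raw` is
already a multiple of `alignof(frame)` and the rounding is a no-op.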
Notably, running the script directly from a build subdirectory will
infer srcdir as .. if not otherwise set, but doesn't work for these.
With this commit sh pcre2_grep_test.sh works as expected.
* pcre2_match: avoid crash if subject NULL and PCRE2_ZERO_TERMINATED
When length of subject is PCRE2_ZERO_TERMINATED strlen is used
to calculate its size, which will trigger a crash if subject is
also NULL.
Move the NULL check before the point where strlen would be used on it,
and make sure any dependent variables are set after the NULL validation
as well.
While at it, fix a typo in a debug flag in the same file, which
is otherwise unrelated, and make sure the full section of constraint
checks can be identified clearly using the leading comment alone.
* pcre2_dfa_match: avoid crash if subject NULL and PCRE2_ZERO_TERMINATED
When length of subject is PCRE2_ZERO_TERMINATED strlen is used
to calculate its size, which will trigger a crash if subject is
also NULL.
Move the NULL check before the detection for subject sizes to
avoid this issue.
* pcre2_substitute: avoid crash if subject or replacement are NULL
The underlying pcre2_match() function will validate the subject if
needed, but will crash when length is PCRE2_ZERO_TERMINATED or if
subject == NULL and pcre2_match() is not being called because
match_data was provided.
The replacement parameter is missing NULL checks, and so currently
allows for an equivalent response to "" if rlength == 0.
Restrict all other cases to avoid strlen(NULL) crashes in the same
way that is done for subject, but also make sure to reject invalid
length values as early as possible.
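The ordering that these three changes share (validate the pointer first,
only then measure it) could look roughly like the sketch below.
`ZERO_TERMINATED`, `ERR_NULL` and `resolve_length()` are hypothetical names
used for illustration and are not the library's identifiers or its actual
code.

```c
#include <stddef.h>
#include <string.h>

#define ZERO_TERMINATED (~(size_t)0)  /* hypothetical "use strlen" marker */
#define ERR_NULL        (-1)          /* hypothetical error code */

/* Resolve the effective length of subject without ever calling strlen()
   on a NULL pointer. Returns 0 on success or ERR_NULL. */
static int resolve_length(const char *subject, size_t *length)
{
  if (subject == NULL) {
    /* NULL is only tolerated as an empty string (length 0). Any other
       length, including ZERO_TERMINATED, is rejected up front. */
    return (*length == 0) ? 0 : ERR_NULL;
  }
  if (*length == ZERO_TERMINATED)
    *length = strlen(subject);  /* safe: subject is known to be non-NULL */
  return 0;
}
```

Doing the NULL test first means every later use of the pointer, including
values derived from it, can assume it is valid.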
It doesn't seem needed, and is apparently resulting in at least one
duplicated entry in the installation list that causes problems for
uninstalling.
Fixes: #46
Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
* doc: fix incorrect use of JOIN and typo
Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
* doc: reformat of pcre2_substitute to align options
includes some rewording to fit better in an 80 char wide troff output.
Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
* doc: update names to pcre2
Since d5a61ee8 (Patch to detect (and ignore) symlink loops in
pcre2grep., 2021-08-28), there is optional code that depends
on readlink and PATH_MAX, but detection was only added for
the former.
GNU Hurd doesn't have the latter, so it fails to build.
Improve the detection to include both dependencies in autotools
and cmake to fix that.
Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
pcre2_jit_stack_create() allows the user to indicate how big of a
stack size JIT should be able to allocate and use, using a size_t
variable which should be able to hold bigger values than reasonable.
Internally, the value is rounded up to the next 8K, but if the value
is unreasonably large, the rounding would overflow and could result in
a smaller than expected stack or a maximum size that is smaller than
the minimum.
Avoid the overflow by checking the value and failing early, and
while at it make the check clearer while documenting the failure
mode.
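A minimal sketch of such an early check, assuming an 8K rounding
granularity as described above. `STACK_GROWTH_RATE` and
`round_stack_sizes()` are illustrative names and the function is not the
actual pcre2_jit_stack_create() implementation.

```c
#include <stddef.h>
#include <stdint.h>

#define STACK_GROWTH_RATE ((size_t)8192)  /* assumed 8K granularity */

/* Round both sizes up to a multiple of STACK_GROWTH_RATE.
   Returns 1 on success, or 0 if a size is zero or the rounding
   would wrap around. */
static int round_stack_sizes(size_t startsize, size_t maxsize,
                             size_t *start_out, size_t *max_out)
{
  if (startsize == 0 || maxsize == 0)
    return 0;
  if (maxsize > SIZE_MAX - STACK_GROWTH_RATE)
    return 0;                         /* adding the granularity would wrap */
  if (startsize > maxsize)
    startsize = maxsize;              /* the minimum may not exceed the maximum */

  *start_out = (startsize + STACK_GROWTH_RATE - 1) & ~(STACK_GROWTH_RATE - 1);
  *max_out   = (maxsize   + STACK_GROWTH_RATE - 1) & ~(STACK_GROWTH_RATE - 1);
  return 1;
}
```

Since startsize is clamped to maxsize before rounding, checking maxsize
alone is enough to cover both additions.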
Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
* test: avoid failing RunTest if pcre2test -S is not supported
If `pcre2test -S` is not supported then avoid checking for it
in a test.
There is already a conditional check for it to be used when it is
needed and it is available, so adjust that as well.
* pcre2test: update list of platform support for -S
Minix 3 has a BSD userspace and now works fine, but Haiku still
doesn't support stack limits, so update accordingly.
To allow pcre2grep to do an early exit in a resumable way, -m uses
fseek on stdin, which is sadly not supported on several platforms.
Most of the conflicting issues come from the fact that managing the
position while buffering is not trivial, and is therefore an optional
feature[1] of POSIX.1-2017.
Work around this by removing the buffer from stdin if the -m option is
being used. There is likely not a significant performance benefit
even for the platforms that support it, but it could be conditionally
added in that case, later.
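In standard C the buffer can be removed with setvbuf(), roughly as in the
sketch below. `prepare_input()` and its `max_count_requested` flag are
hypothetical and only stand in for however pcre2grep records that -m was
given.

```c
#include <stdio.h>

/* Switch the stream to unbuffered mode when a match limit was requested,
   so that no input beyond what was actually consumed sits in a stdio
   buffer. Must run before any other operation on the stream.
   Returns 0 on success (or when there is nothing to do). */
static int prepare_input(FILE *in, int max_count_requested)
{
  if (!max_count_requested)
    return 0;                          /* keep the default buffering */
  return setvbuf(in, NULL, _IONBF, 0);
}
```

A caller would invoke this as `prepare_input(stdin, 1)` right after option
parsing and before the first read.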
Fixes: #10
[1] https://pubs.opengroup.org/onlinepubs/9699919799/functions/fseek.html
* tests: use an explicit filehandle to share in testing -m
The way stdin is shared to all participants of a subshell varies
per shell, and at least the standard /bin/sh in Solaris seems to
create a new copy for each command, defeating the purpose of the
test.
Use instead exec to create a filehandle that could then be used
explicitly in the test to confirm that the stream is set.
* pcre2grep: correctly handle multiple passes
When the -m option is used, pcre2grep is meant to exit after enough
matches are found, while leaving the stream positioned right after the
last match.
Unfortunately, it wasn't correctly tracking the beginning of the stream
on subsequent passes, and therefore it would fail to use the right seek
value.
Grab the position of the stream at the beginning and while at it, make
sure that the stream passed hasn't been consumed already.
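The idea of anchoring the seek value to where a pass started can be
sketched as below. `consume_lines()` is a hypothetical helper that counts
lines rather than matches, reads a single block, and is not taken from
pcre2grep.

```c
#include <stdio.h>

/* Consume up to max_lines lines from a seekable stream and leave it
   positioned right after the last line consumed. Returns the new
   absolute position, or -1 on error. */
static long consume_lines(FILE *in, int max_lines)
{
  char buf[256];
  long start = ftell(in);   /* where this pass begins, not always 0 */
  long used = 0;
  int lines = 0;
  size_t n, i;

  if (start < 0)
    return -1;

  n = fread(buf, 1, sizeof buf, in);   /* may read past what is needed */
  for (i = 0; i < n && lines < max_lines; i++) {
    used = (long)i + 1;
    if (buf[i] == '\n')
      lines++;
  }

  /* Seek relative to the start of this pass. Using `used` alone would
     only be correct for the very first pass over the stream. */
  if (fseek(in, start + used, SEEK_SET) != 0)
    return -1;
  return start + used;
}
```

Calling it twice on a stream holding "a\nb\nc\n" with max_lines set to 1
leaves the stream at offset 2 and then at offset 4, so a later reader
resumes at "c".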