Interpret NULL pointer, zero length as an empty string for subjects and replacements.
parent 7ab2769728
commit 4ef0c51d2b
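The rule this commit introduces for subject arguments can be sketched in isolation. This is a minimal standalone illustration, not PCRE2 code: `normalize_subject` and `SKETCH_ERROR_NULL` are hypothetical names, and the value -51 is taken from the error number shown in the test output of this commit.

```c
#include <stddef.h>

/* Hypothetical stand-in for PCRE2_ERROR_NULL; -51 is the number that
   appears in the test output of this commit. */
#define SKETCH_ERROR_NULL (-51)

/* Hypothetical helper, not part of PCRE2: apply the NULL/zero-length rule
   to a (pointer, length) pair. Returns 0 on success. */
int normalize_subject(const char **subject, size_t length)
{
  if (*subject == NULL)
    {
    /* A NULL pointer that claims to carry data is still an error. */
    if (length != 0) return SKETCH_ERROR_NULL;
    /* NULL with zero length is treated as an empty string. */
    *subject = "";
    }
  return 0;
}
```

A caller that holds `NULL` and a length of 0 gets back a pointer to an empty string, while `NULL` with any non-zero length is rejected and the pointer is left untouched.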
@@ -34,6 +34,11 @@ substituting.

 12. Add check for NULL replacement to pcre2_substitute().

+13. For the subject arguments of pcre2_match(), pcre2_dfa_match(), and
+pcre2_substitute(), and the replacement argument of the latter, if the pointer
+is NULL and the length is zero, treat as an empty string. Apparently a number
+of applications treat NULL/0 in this way.
+

 Version 10.39 29-October-2021
 -----------------------------
@@ -2640,7 +2640,9 @@ The subject string is passed to <b>pcre2_match()</b> as a pointer in
 <i>startoffset</i>. The length and offset are in code units, not characters.
 That is, they are in bytes for the 8-bit library, 16-bit code units for the
 16-bit library, and 32-bit code units for the 32-bit library, whether or not
-UTF processing is enabled.
+UTF processing is enabled. As a special case, if <i>subject</i> is NULL and
+<i>length</i> is zero, the subject is assumed to be an empty string. If
+<i>length</i> is non-zero, an error occurs if <i>subject</i> is NULL.
 </P>
 <P>
 If <i>startoffset</i> is greater than the length of the subject,
@@ -3394,12 +3396,17 @@ same number causes an error at compile time.
 <P>
 This function optionally calls <b>pcre2_match()</b> and then makes a copy of the
 subject string in <i>outputbuffer</i>, replacing parts that were matched with
-the <i>replacement</i> string, whose length is supplied in <b>rlength</b>. This
-can be given as PCRE2_ZERO_TERMINATED for a zero-terminated string. There is an
-option (see PCRE2_SUBSTITUTE_REPLACEMENT_ONLY below) to return just the
-replacement string(s). The default action is to perform just one replacement if
-the pattern matches, but there is an option that requests multiple replacements
-(see PCRE2_SUBSTITUTE_GLOBAL below).
+the <i>replacement</i> string, whose length is supplied in <b>rlength</b>, which
+can be given as PCRE2_ZERO_TERMINATED for a zero-terminated string. As a
+special case, if <i>replacement</i> is NULL and <i>rlength</i> is zero, the
+replacement is assumed to be an empty string. If <i>rlength</i> is non-zero, an
+error occurs if <i>replacement</i> is NULL.
+</P>
+<P>
+There is an option (see PCRE2_SUBSTITUTE_REPLACEMENT_ONLY below) to return just
+the replacement string(s). The default action is to perform just one
+replacement if the pattern matches, but there is an option that requests
+multiple replacements (see PCRE2_SUBSTITUTE_GLOBAL below).
 </P>
 <P>
 If successful, <b>pcre2_substitute()</b> returns the number of substitutions
@@ -3812,12 +3819,13 @@ other alternatives. Ultimately, when it runs out of matches,
 <P>
 The function <b>pcre2_dfa_match()</b> is called to match a subject string
 against a compiled pattern, using a matching algorithm that scans the subject
-string just once (not counting lookaround assertions), and does not backtrack.
-This has different characteristics to the normal algorithm, and is not
-compatible with Perl. Some of the features of PCRE2 patterns are not supported.
-Nevertheless, there are times when this kind of matching can be useful. For a
-discussion of the two matching algorithms, and a list of features that
-<b>pcre2_dfa_match()</b> does not support, see the
+string just once (not counting lookaround assertions), and does not backtrack
+(except when processing lookaround assertions). This has different
+characteristics to the normal algorithm, and is not compatible with Perl. Some
+of the features of PCRE2 patterns are not supported. Nevertheless, there are
+times when this kind of matching can be useful. For a discussion of the two
+matching algorithms, and a list of features that <b>pcre2_dfa_match()</b> does
+not support, see the
 <a href="pcre2matching.html"><b>pcre2matching</b></a>
 documentation.
 </P>
@@ -4010,7 +4018,7 @@ Cambridge, England.
 </P>
 <br><a name="SEC42" href="#TOC1">REVISION</a><br>
 <P>
-Last updated: 30 August 2021
+Last updated: 30 November 2021
 <br>
 Copyright © 1997-2021 University of Cambridge.
 <br>
@@ -269,11 +269,11 @@ starts another match, that match must use a different JIT stack to the one used
 for currently suspended match(es).
 </P>
 <P>
-In a multithread application, if you do not
-specify a JIT stack, or if you assign or pass back NULL from a callback, that
-is thread-safe, because each thread has its own machine stack. However, if you
-assign or pass back a non-NULL JIT stack, this must be a different stack for
-each thread so that the application is thread-safe.
+In a multithread application, if you do not specify a JIT stack, or if you
+assign or pass back NULL from a callback, that is thread-safe, because each
+thread has its own machine stack. However, if you assign or pass back a
+non-NULL JIT stack, this must be a different stack for each thread so that the
+application is thread-safe.
 </P>
 <P>
 Strictly speaking, even more is allowed. You can assign the same non-NULL stack
@@ -382,8 +382,8 @@ out this complicated API.
 <b>void pcre2_jit_free_unused_memory(pcre2_general_context *<i>gcontext</i>);</b>
 </P>
 <P>
-The JIT executable allocator does not free all memory when it is possible.
-It expects new allocations, and keeps some free memory around to improve
+The JIT executable allocator does not free all memory when it is possible. It
+expects new allocations, and keeps some free memory around to improve
 allocation speed. However, in low memory conditions, it might be better to free
 all possible memory. You can cause this to happen by calling
 pcre2_jit_free_unused_memory(). Its argument is a general context, for custom
@@ -442,10 +442,10 @@ that was not compiled.
 <P>
 When you call <b>pcre2_match()</b>, as well as testing for invalid options, a
 number of other sanity checks are performed on the arguments. For example, if
-the subject pointer is NULL, an immediate error is given. Also, unless
-PCRE2_NO_UTF_CHECK is set, a UTF subject string is tested for validity. In the
-interests of speed, these checks do not happen on the JIT fast path, and if
-invalid data is passed, the result is undefined.
+the subject pointer is NULL but the length is non-zero, an immediate error is
+given. Also, unless PCRE2_NO_UTF_CHECK is set, a UTF subject string is tested
+for validity. In the interests of speed, these checks do not happen on the JIT
+fast path, and if invalid data is passed, the result is undefined.
 </P>
 <P>
 Bypassing the sanity checks and the <b>pcre2_match()</b> wrapping can give
@@ -466,9 +466,9 @@ Cambridge, England.
 </P>
 <br><a name="SEC14" href="#TOC1">REVISION</a><br>
 <P>
-Last updated: 23 May 2019
+Last updated: 30 November 2021
 <br>
-Copyright © 1997-2019 University of Cambridge.
+Copyright © 1997-2021 University of Cambridge.
 <br>
 <p>
 Return to the <a href="index.html">PCRE2 index page</a>.
@@ -2579,7 +2579,9 @@ MATCHING A PATTERN: THE TRADITIONAL FUNCTION
 and offset are in code units, not characters. That is, they are in
 bytes for the 8-bit library, 16-bit code units for the 16-bit library,
 and 32-bit code units for the 32-bit library, whether or not UTF pro-
-cessing is enabled.
+cessing is enabled. As a special case, if subject is NULL and length is
+zero, the subject is assumed to be an empty string. If length is non-
+zero, an error occurs if subject is NULL.

 If startoffset is greater than the length of the subject, pcre2_match()
 returns PCRE2_ERROR_BADOFFSET. When the starting offset is zero, the
@@ -3280,8 +3282,12 @@ CREATING A NEW STRING WITH SUBSTITUTIONS

 This function optionally calls pcre2_match() and then makes a copy of
 the subject string in outputbuffer, replacing parts that were matched
-with the replacement string, whose length is supplied in rlength. This
-can be given as PCRE2_ZERO_TERMINATED for a zero-terminated string.
+with the replacement string, whose length is supplied in rlength, which
+can be given as PCRE2_ZERO_TERMINATED for a zero-terminated string. As
+a special case, if replacement is NULL and rlength is zero, the re-
+placement is assumed to be an empty string. If rlength is non-zero, an
+error occurs if replacement is NULL.
+
 There is an option (see PCRE2_SUBSTITUTE_REPLACEMENT_ONLY below) to re-
 turn just the replacement string(s). The default action is to perform
 just one replacement if the pattern matches, but there is an option
@@ -3666,12 +3672,13 @@ MATCHING A PATTERN: THE ALTERNATIVE FUNCTION
 The function pcre2_dfa_match() is called to match a subject string
 against a compiled pattern, using a matching algorithm that scans the
 subject string just once (not counting lookaround assertions), and does
-not backtrack. This has different characteristics to the normal algo-
-rithm, and is not compatible with Perl. Some of the features of PCRE2
-patterns are not supported. Nevertheless, there are times when this
-kind of matching can be useful. For a discussion of the two matching
-algorithms, and a list of features that pcre2_dfa_match() does not sup-
-port, see the pcre2matching documentation.
+not backtrack (except when processing lookaround assertions). This has
+different characteristics to the normal algorithm, and is not compati-
+ble with Perl. Some of the features of PCRE2 patterns are not sup-
+ported. Nevertheless, there are times when this kind of matching can be
+useful. For a discussion of the two matching algorithms, and a list of
+features that pcre2_dfa_match() does not support, see the pcre2matching
+documentation.

 The arguments for the pcre2_dfa_match() function are the same as for
 pcre2_match(), plus two extras. The ovector within the match data block
@@ -3850,7 +3857,7 @@ AUTHOR

 REVISION

-Last updated: 30 August 2021
+Last updated: 30 November 2021
 Copyright (c) 1997-2021 University of Cambridge.
 ------------------------------------------------------------------------------

@@ -5494,10 +5501,11 @@ JIT FAST PATH API

 When you call pcre2_match(), as well as testing for invalid options, a
 number of other sanity checks are performed on the arguments. For exam-
-ple, if the subject pointer is NULL, an immediate error is given. Also,
-unless PCRE2_NO_UTF_CHECK is set, a UTF subject string is tested for
-validity. In the interests of speed, these checks do not happen on the
-JIT fast path, and if invalid data is passed, the result is undefined.
+ple, if the subject pointer is NULL but the length is non-zero, an im-
+mediate error is given. Also, unless PCRE2_NO_UTF_CHECK is set, a UTF
+subject string is tested for validity. In the interests of speed, these
+checks do not happen on the JIT fast path, and if invalid data is
+passed, the result is undefined.

 Bypassing the sanity checks and the pcre2_match() wrapping can give
 speedups of more than 10%.
@@ -5517,8 +5525,8 @@ AUTHOR

 REVISION

-Last updated: 23 May 2019
-Copyright (c) 1997-2019 University of Cambridge.
+Last updated: 30 November 2021
+Copyright (c) 1997-2021 University of Cambridge.
 ------------------------------------------------------------------------------


@@ -1,4 +1,4 @@
-.TH PCRE2API 3 "30 August 2021" "PCRE2 10.38"
+.TH PCRE2API 3 "30 November 2021" "PCRE2 10.40"
 .SH NAME
 PCRE2 - Perl-compatible regular expressions (revised API)
 .sp
@@ -2624,7 +2624,9 @@ The subject string is passed to \fBpcre2_match()\fP as a pointer in
 \fIstartoffset\fP. The length and offset are in code units, not characters.
 That is, they are in bytes for the 8-bit library, 16-bit code units for the
 16-bit library, and 32-bit code units for the 32-bit library, whether or not
-UTF processing is enabled.
+UTF processing is enabled. As a special case, if \fIsubject\fP is NULL and
+\fIlength\fP is zero, the subject is assumed to be an empty string. If
+\fIlength\fP is non-zero, an error occurs if \fIsubject\fP is NULL.
 .P
 If \fIstartoffset\fP is greater than the length of the subject,
 \fBpcre2_match()\fP returns PCRE2_ERROR_BADOFFSET. When the starting offset is
@@ -3413,12 +3415,16 @@ same number causes an error at compile time.
 .P
 This function optionally calls \fBpcre2_match()\fP and then makes a copy of the
 subject string in \fIoutputbuffer\fP, replacing parts that were matched with
-the \fIreplacement\fP string, whose length is supplied in \fBrlength\fP. This
-can be given as PCRE2_ZERO_TERMINATED for a zero-terminated string. There is an
-option (see PCRE2_SUBSTITUTE_REPLACEMENT_ONLY below) to return just the
-replacement string(s). The default action is to perform just one replacement if
-the pattern matches, but there is an option that requests multiple replacements
-(see PCRE2_SUBSTITUTE_GLOBAL below).
+the \fIreplacement\fP string, whose length is supplied in \fBrlength\fP, which
+can be given as PCRE2_ZERO_TERMINATED for a zero-terminated string. As a
+special case, if \fIreplacement\fP is NULL and \fIrlength\fP is zero, the
+replacement is assumed to be an empty string. If \fIrlength\fP is non-zero, an
+error occurs if \fIreplacement\fP is NULL.
+.P
+There is an option (see PCRE2_SUBSTITUTE_REPLACEMENT_ONLY below) to return just
+the replacement string(s). The default action is to perform just one
+replacement if the pattern matches, but there is an option that requests
+multiple replacements (see PCRE2_SUBSTITUTE_GLOBAL below).
 .P
 If successful, \fBpcre2_substitute()\fP returns the number of substitutions
 that were carried out. This may be zero if no match was found, and is never
@@ -3813,12 +3819,13 @@ other alternatives. Ultimately, when it runs out of matches,
 .P
 The function \fBpcre2_dfa_match()\fP is called to match a subject string
 against a compiled pattern, using a matching algorithm that scans the subject
-string just once (not counting lookaround assertions), and does not backtrack.
-This has different characteristics to the normal algorithm, and is not
-compatible with Perl. Some of the features of PCRE2 patterns are not supported.
-Nevertheless, there are times when this kind of matching can be useful. For a
-discussion of the two matching algorithms, and a list of features that
-\fBpcre2_dfa_match()\fP does not support, see the
+string just once (not counting lookaround assertions), and does not backtrack
+(except when processing lookaround assertions). This has different
+characteristics to the normal algorithm, and is not compatible with Perl. Some
+of the features of PCRE2 patterns are not supported. Nevertheless, there are
+times when this kind of matching can be useful. For a discussion of the two
+matching algorithms, and a list of features that \fBpcre2_dfa_match()\fP does
+not support, see the
 .\" HREF
 \fBpcre2matching\fP
 .\"
@@ -4018,6 +4025,6 @@ Cambridge, England.
 .rs
 .sp
 .nf
-Last updated: 30 August 2021
+Last updated: 30 November 2021
 Copyright (c) 1997-2021 University of Cambridge.
 .fi
@@ -1,4 +1,4 @@
-.TH PCRE2JIT 3 "23 May 2019" "PCRE2 10.34"
+.TH PCRE2JIT 3 "30 November 2021" "PCRE2 10.40"
 .SH NAME
 PCRE2 - Perl-compatible regular expressions (revised API)
 .SH "PCRE2 JUST-IN-TIME COMPILER SUPPORT"
@@ -251,11 +251,11 @@ non-sequential matches in one thread is to use callouts: if a callout function
 starts another match, that match must use a different JIT stack to the one used
 for currently suspended match(es).
 .P
-In a multithread application, if you do not
-specify a JIT stack, or if you assign or pass back NULL from a callback, that
-is thread-safe, because each thread has its own machine stack. However, if you
-assign or pass back a non-NULL JIT stack, this must be a different stack for
-each thread so that the application is thread-safe.
+In a multithread application, if you do not specify a JIT stack, or if you
+assign or pass back NULL from a callback, that is thread-safe, because each
+thread has its own machine stack. However, if you assign or pass back a
+non-NULL JIT stack, this must be a different stack for each thread so that the
+application is thread-safe.
 .P
 Strictly speaking, even more is allowed. You can assign the same non-NULL stack
 to a match context that is used by any number of patterns, as long as they are
@@ -355,8 +355,8 @@ out this complicated API.
 .B void pcre2_jit_free_unused_memory(pcre2_general_context *\fIgcontext\fP);
 .fi
 .P
-The JIT executable allocator does not free all memory when it is possible.
-It expects new allocations, and keeps some free memory around to improve
+The JIT executable allocator does not free all memory when it is possible. It
+expects new allocations, and keeps some free memory around to improve
 allocation speed. However, in low memory conditions, it might be better to free
 all possible memory. You can cause this to happen by calling
 pcre2_jit_free_unused_memory(). Its argument is a general context, for custom
@@ -416,10 +416,10 @@ that was not compiled.
 .P
 When you call \fBpcre2_match()\fP, as well as testing for invalid options, a
 number of other sanity checks are performed on the arguments. For example, if
-the subject pointer is NULL, an immediate error is given. Also, unless
-PCRE2_NO_UTF_CHECK is set, a UTF subject string is tested for validity. In the
-interests of speed, these checks do not happen on the JIT fast path, and if
-invalid data is passed, the result is undefined.
+the subject pointer is NULL but the length is non-zero, an immediate error is
+given. Also, unless PCRE2_NO_UTF_CHECK is set, a UTF subject string is tested
+for validity. In the interests of speed, these checks do not happen on the JIT
+fast path, and if invalid data is passed, the result is undefined.
 .P
 Bypassing the sanity checks and the \fBpcre2_match()\fP wrapping can give
 speedups of more than 10%.
@@ -445,6 +445,6 @@ Cambridge, England.
 .rs
 .sp
 .nf
-Last updated: 23 May 2019
-Copyright (c) 1997-2019 University of Cambridge.
+Last updated: 30 November 2021
+Copyright (c) 1997-2021 University of Cambridge.
 .fi
@@ -3285,6 +3285,10 @@ rws->next = NULL;
 rws->size = RWS_BASE_SIZE;
 rws->free = RWS_BASE_SIZE - RWS_ANCHOR_SIZE;

+/* Recognize NULL, length 0 as an empty string. */
+
+if (subject == NULL && length == 0) subject = (PCRE2_SPTR)"";
+
 /* Plausibility checks */

 if ((options & ~PUBLIC_DFA_MATCH_OPTIONS) != 0) return PCRE2_ERROR_BADOPTION;
@@ -253,7 +253,7 @@ static const unsigned char match_error_texts[] =
 "unknown substring\0"
 /* 50 */
 "non-unique substring name\0"
-"NULL argument passed\0"
+"NULL argument passed with non-zero length\0"
 "nested recursion at the same subject position\0"
 "matching depth limit exceeded\0"
 "requested value is not available\0"
@@ -6170,6 +6170,10 @@ PCRE2_SPTR stack_frames_vector[START_FRAMES_SIZE/sizeof(PCRE2_SPTR)]
 PCRE2_KEEP_UNINITIALIZED;
 mb->stack_frames = (heapframe *)stack_frames_vector;

+/* Recognize NULL, length 0 as an empty string. */
+
+if (subject == NULL && length == 0) subject = (PCRE2_SPTR)"";
+
 /* Plausibility checks */

 if ((options & ~PUBLIC_MATCH_OPTIONS) != 0) return PCRE2_ERROR_BADOPTION;
@@ -260,9 +260,15 @@ PCRE2_UNSET, so as not to imply an offset in the replacement. */
 if ((options & (PCRE2_PARTIAL_HARD|PCRE2_PARTIAL_SOFT)) != 0)
   return PCRE2_ERROR_BADOPTION;

-/* Validate length and find the end of the replacement. */
+/* Validate length and find the end of the replacement. A NULL replacement of
+zero length is interpreted as an empty string. */

-if (replacement == NULL) return PCRE2_ERROR_NULL;
+if (replacement == NULL)
+  {
+  if (rlength != 0) return PCRE2_ERROR_NULL;
+  replacement = (PCRE2_SPTR)"";
+  }
+
 if (rlength == PCRE2_ZERO_TERMINATED) rlength = PRIV(strlen)(replacement);
 repend = replacement + rlength;

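The replacement handling in the hunk above also interacts with PCRE2_ZERO_TERMINATED: because that value is non-zero, a NULL replacement passed with it is still rejected. The following standalone sketch mirrors that order of checks using plain `char` strings. It is an illustration under assumptions, not the library code: `find_replacement_end`, `SKETCH_ERROR_NULL`, and `SKETCH_ZERO_TERMINATED` are hypothetical names, and `strlen` stands in for the library's internal length function.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins: -51 is the error number shown in the test output,
   and PCRE2_ZERO_TERMINATED is defined by PCRE2 as ~(PCRE2_SIZE)0. */
#define SKETCH_ERROR_NULL      (-51)
#define SKETCH_ZERO_TERMINATED (~(size_t)0)

/* Hypothetical helper: validate the replacement and locate its end.
   Returns 0 on success and stores the end pointer in *repend. */
int find_replacement_end(const char **replacement, size_t rlength,
  const char **repend)
{
  if (*replacement == NULL)
    {
    /* Any non-zero length, including the zero-terminated marker, fails. */
    if (rlength != 0) return SKETCH_ERROR_NULL;
    *replacement = "";
    }
  /* Only reached with a non-NULL pointer, so strlen() is safe here. */
  if (rlength == SKETCH_ZERO_TERMINATED) rlength = strlen(*replacement);
  *repend = *replacement + rlength;
  return 0;
}
```

Doing the NULL check first means the length computation never sees a NULL pointer, which matches the ordering of the two `if` statements in the diff.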
@@ -304,4 +304,7 @@
 /[aCz]/mg,firstline,newline=lf
 match\nmatch

+//jitfast
+\=null_subject
+
 # End of testinput17
@@ -135,4 +135,9 @@
 123ace
 123ace\=posix_startend=2:6

+//posix
+\= Expect errors
+\=null_subject
+abc\=null_subject
+
 # End of testdata/testinput18
@@ -5902,4 +5902,25 @@ a)"xI

 # ---------

+# Tests for zero-length NULL to be treated as an empty string.
+
+//
+\=null_subject
+\= Expect error
+abc\=null_subject
+
+//replace=[20]
+abc\=null_replacement
+\=null_subject
+\=null_replacement
+
+/X*/g,replace=xy
+\= Expect error
+>X<\=null_replacement
+
+/X+/replace=[20]
+>XX<\=null_replacement
+
+# ---------
+
 # End of testinput2
@@ -550,4 +550,8 @@ Failed: error -47: match limit exceeded
 match\nmatch
 0: a (JIT)

+//jitfast
+\=null_subject
+0: (JIT)
+
 # End of testinput17
@@ -215,4 +215,11 @@ Failed: POSIX code 16: bad argument at offset 0
 3: <unset>
 4: c

+//posix
+\= Expect errors
+\=null_subject
+No match: POSIX code 16: bad argument
+abc\=null_subject
+No match: POSIX code 16: bad argument
+
 # End of testdata/testinput18
@@ -17674,6 +17674,34 @@ Failed: error 199 at offset 14: \K is not allowed in lookarounds (but see PCRE2_

 # ---------

+# Tests for zero-length NULL to be treated as an empty string.
+
+//
+\=null_subject
+0:
+\= Expect error
+abc\=null_subject
+Failed: error -51: NULL argument passed with non-zero length
+
+//replace=[20]
+abc\=null_replacement
+1: abc
+\=null_subject
+1:
+\=null_replacement
+1:
+
+/X*/g,replace=xy
+\= Expect error
+>X<\=null_replacement
+Failed: error -51: NULL argument passed with non-zero length
+
+/X+/replace=[20]
+>XX<\=null_replacement
+1: ><
+
+# ---------
+
 # End of testinput2
 Error -70: PCRE2_ERROR_BADDATA (unknown error number)
 Error -62: bad serialized data