Implement PCRE2_SUBSTITUTE_{OVERFLOW_LENGTH,UNKNOWN_UNSET}.
parent
215e2185e4
commit
35e0f55783
|
@ -386,6 +386,9 @@ possible to test it.
|
||||||
111. "Harden" pcre2test against ridiculously large values in modifiers and
|
111. "Harden" pcre2test against ridiculously large values in modifiers and
|
||||||
command line arguments.
|
command line arguments.
|
||||||
|
|
||||||
|
112. Implemented PCRE2_SUBSTITUTE_UNKNOWN_UNSET and PCRE2_SUBSTITUTE_OVERFLOW_
|
||||||
|
LENGTH.
|
||||||
|
|
||||||
|
|
||||||
Version 10.20 30-June-2015
|
Version 10.20 30-June-2015
|
||||||
--------------------------
|
--------------------------
|
||||||
|
|
|
@ -1,4 +1,4 @@
|
||||||
.TH PCRE2_SUBSTITUTE 3 "04 December 2015" "PCRE2 10.21"
|
.TH PCRE2_SUBSTITUTE 3 "12 December 2015" "PCRE2 10.21"
|
||||||
.SH NAME
|
.SH NAME
|
||||||
PCRE2 - Perl-compatible regular expressions (revised API)
|
PCRE2 - Perl-compatible regular expressions (revised API)
|
||||||
.SH SYNOPSIS
|
.SH SYNOPSIS
|
||||||
|
@ -58,6 +58,8 @@ The options are:
|
||||||
PCRE2_UTF was set at compile time)
|
PCRE2_UTF was set at compile time)
|
||||||
PCRE2_SUBSTITUTE_EXTENDED Do extended replacement processing
|
PCRE2_SUBSTITUTE_EXTENDED Do extended replacement processing
|
||||||
PCRE2_SUBSTITUTE_GLOBAL Replace all occurrences in the subject
|
PCRE2_SUBSTITUTE_GLOBAL Replace all occurrences in the subject
|
||||||
|
PCRE2_SUBSTITUTE_OVERFLOW_LENGTH If overflow, compute needed length
|
||||||
|
PCRE2_SUBSTITUTE_UNKNOWN_UNSET Treat unknown group as unset
|
||||||
PCRE2_SUBSTITUTE_UNSET_EMPTY Simple unset insert = empty string
|
PCRE2_SUBSTITUTE_UNSET_EMPTY Simple unset insert = empty string
|
||||||
.sp
|
.sp
|
||||||
The function returns the number of substitutions, which may be zero if there
|
The function returns the number of substitutions, which may be zero if there
|
||||||
|
|
|
@ -1,4 +1,4 @@
|
||||||
.TH PCRE2API 3 "04 December 2015" "PCRE2 10.21"
|
.TH PCRE2API 3 "12 December 2015" "PCRE2 10.21"
|
||||||
.SH NAME
|
.SH NAME
|
||||||
PCRE2 - Perl-compatible regular expressions (revised API)
|
PCRE2 - Perl-compatible regular expressions (revised API)
|
||||||
.sp
|
.sp
|
||||||
|
@ -2704,12 +2704,20 @@ functions from the match context, if provided, or else those that were used to
|
||||||
allocate memory for the compiled code.
|
allocate memory for the compiled code.
|
||||||
.P
|
.P
|
||||||
The \fIoutlengthptr\fP argument must point to a variable that contains the
|
The \fIoutlengthptr\fP argument must point to a variable that contains the
|
||||||
length, in code units, of the output buffer. If the function is successful,
|
length, in code units, of the output buffer. If the function is successful, the
|
||||||
the value is updated to contain the length of the new string, excluding the
|
value is updated to contain the length of the new string, excluding the
|
||||||
trailing zero that is automatically added. If the function is not successful,
|
trailing zero that is automatically added.
|
||||||
the value is set to PCRE2_UNSET for general errors (such as output buffer too
|
.P
|
||||||
small). For syntax errors in the replacement string, the value is set to the
|
If the function is not successful, the value set via \fIoutlengthptr\fP depends
|
||||||
offset in the replacement string where the error was detected.
|
on the type of error. For syntax errors in the replacement string, the value is
|
||||||
|
the offset in the replacement string where the error was detected. For other
|
||||||
|
errors, the value is PCRE2_UNSET by default. This includes the case of the
|
||||||
|
output buffer being too small, unless PCRE2_SUBSTITUTE_OVERFLOW_LENGTH is set
|
||||||
|
(see below), in which case the value is the minimum length needed, including
|
||||||
|
space for the trailing zero. Note that in order to compute the required length,
|
||||||
|
\fBpcre2_substitute()\fP has to simulate all the matching and copying, instead
|
||||||
|
of giving an error return as soon as the buffer overflows. Note also that the
|
||||||
|
length is in code units, not bytes.
|
||||||
.P
|
.P
|
||||||
In the replacement string, which is interpreted as a UTF string in UTF mode,
|
In the replacement string, which is interpreted as a UTF string in UTF mode,
|
||||||
and is checked for UTF validity unless the PCRE2_NO_UTF_CHECK option is set, a
|
and is checked for UTF validity unless the PCRE2_NO_UTF_CHECK option is set, a
|
||||||
|
@ -2734,7 +2742,8 @@ simultaneous substitutions, as this \fBpcre2test\fP example shows:
|
||||||
apple lemon
|
apple lemon
|
||||||
2: pear orange
|
2: pear orange
|
||||||
.sp
|
.sp
|
||||||
Three additional options are available:
|
As well as the usual options for \fBpcre2_match()\fP, a number of additional
|
||||||
|
options can be set in the \fIoptions\fP argument.
|
||||||
.P
|
.P
|
||||||
PCRE2_SUBSTITUTE_GLOBAL causes the function to iterate over the subject string,
|
PCRE2_SUBSTITUTE_GLOBAL causes the function to iterate over the subject string,
|
||||||
replacing every matching substring. If this is not set, only the first matching
|
replacing every matching substring. If this is not set, only the first matching
|
||||||
|
@ -2745,10 +2754,30 @@ advanced by one character except when CRLF is a valid newline sequence and the
|
||||||
next two characters are CR, LF. In this case, the current position is advanced
|
next two characters are CR, LF. In this case, the current position is advanced
|
||||||
by two characters.
|
by two characters.
|
||||||
.P
|
.P
|
||||||
PCRE2_SUBSTITUTE_UNSET_EMPTY causes unset capturing groups to be treated as
|
PCRE2_SUBSTITUTE_OVERFLOW_LENGTH changes what happens when the output buffer is
|
||||||
empty strings when inserted as described above. If this option is not set, an
|
too small. The default action is to return PCRE2_ERROR_NOMEMORY immediately. If
|
||||||
attempt to insert an unset group causes the PCRE2_ERROR_UNSET error. This
|
this option is set, however, \fBpcre2_substitute()\fP continues to go through
|
||||||
option does not influence the extended substitution syntax described below.
|
the motions of matching and substituting (without, of course, writing anything)
|
||||||
|
in order to compute the size of buffer that is needed. This value is passed
|
||||||
|
back via the \fIoutlengthptr\fP variable, with the result of the function still
|
||||||
|
being PCRE2_ERROR_NOMEMORY.
|
||||||
|
.P
|
||||||
|
Passing a buffer size of zero is a permitted way of finding out how much memory
|
||||||
|
is needed for a given substitution. However, this does mean that the entire
|
||||||
|
operation is carried out twice. Depending on the application, it may be more
|
||||||
|
efficient to allocate a large buffer and free the excess afterwards, instead of
|
||||||
|
using PCRE2_SUBSTITUTE_OVERFLOW_LENGTH.
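The probe-then-allocate pattern described in the paragraph above can be sketched as follows. This is a minimal illustration, not PCRE2 code: `demo_substitute()` is a hypothetical stand-in that only mimics the \fIoutlengthptr\fP contract (length without the trailing zero on success, required size including the trailing zero on overflow), and `DEMO_ERROR_NOMEMORY` and `substitute_alloc()` are likewise invented names, so that the block compiles without the library.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define DEMO_ERROR_NOMEMORY (-1)  /* hypothetical stand-in for PCRE2_ERROR_NOMEMORY */

/* Hypothetical stand-in for a substitute call made with the overflow-length
   option set. `result` is the text the substitution would produce. */
int demo_substitute(const char *result, char *out, size_t *outlen)
{
  size_t need = strlen(result) + 1;      /* includes the trailing zero */
  if (*outlen < need) {
    *outlen = need;                      /* report the required size */
    return DEMO_ERROR_NOMEMORY;
  }
  memcpy(out, result, need);
  *outlen = need - 1;                    /* excludes the trailing zero */
  return 1;                              /* one substitution made */
}

/* First call with size zero to learn the required size, then allocate
   exactly that much and call again. The caller frees the result. */
char *substitute_alloc(const char *result, size_t *final_len)
{
  char probe[1];
  size_t len = 0;
  char *out;

  if (demo_substitute(result, probe, &len) != DEMO_ERROR_NOMEMORY) return NULL;
  out = malloc(len);
  if (out == NULL) return NULL;
  if (demo_substitute(result, out, &len) < 0) {
    free(out);
    return NULL;
  }
  *final_len = len;
  return out;
}
```

With a real library call the work is done twice, once per call, which is the cost the paragraph above warns about.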
|
||||||
|
.P
|
||||||
|
PCRE2_SUBSTITUTE_UNKNOWN_UNSET causes references to capturing groups that do
|
||||||
|
not appear in the pattern to be treated as unset groups. This option should be
|
||||||
|
used with care, because it means that a typo in a group name or number no
|
||||||
|
longer causes the PCRE2_ERROR_NOSUBSTRING error.
|
||||||
|
.P
|
||||||
|
PCRE2_SUBSTITUTE_UNSET_EMPTY causes unset capturing groups (including unknown
|
||||||
|
groups when PCRE2_SUBSTITUTE_UNKNOWN_UNSET is set) to be treated as empty
|
||||||
|
strings when inserted as described above. If this option is not set, an attempt
|
||||||
|
to insert an unset group causes the PCRE2_ERROR_UNSET error. This option does
|
||||||
|
not influence the extended substitution syntax described below.
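How the two options combine for a simple (non-extended) group insertion can be summarised as a small decision function. This is a sketch of the documented behaviour only, not the library's implementation: the two flag values are copied from the header change in this commit, while `insert_outcome` and `simple_insert_outcome()` are hypothetical names introduced for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Values as defined in the pcre2.h hunk of this commit. */
#define PCRE2_SUBSTITUTE_UNSET_EMPTY   0x00000400u
#define PCRE2_SUBSTITUTE_UNKNOWN_UNSET 0x00000800u

/* Hypothetical result type for the sketch. */
typedef enum {
  INSERT_VALUE,      /* group is set: insert its text */
  INSERT_EMPTY,      /* treated as an empty string */
  FAIL_NOSUBSTRING,  /* would give PCRE2_ERROR_NOSUBSTRING */
  FAIL_UNSET         /* would give PCRE2_ERROR_UNSET */
} insert_outcome;

/* `known` says whether the group appears in the pattern; `set` says whether
   it took part in the match. */
insert_outcome simple_insert_outcome(uint32_t options, bool known, bool set)
{
  if (!known) {
    if ((options & PCRE2_SUBSTITUTE_UNKNOWN_UNSET) == 0)
      return FAIL_NOSUBSTRING;
    set = false;                 /* an unknown group is handled as unset */
  }
  if (set) return INSERT_VALUE;
  return ((options & PCRE2_SUBSTITUTE_UNSET_EMPTY) != 0)
    ? INSERT_EMPTY : FAIL_UNSET;
}
```

Note that PCRE2_SUBSTITUTE_UNSET_EMPTY on its own does not rescue a reference to an unknown group; both options are needed for that reference to become an empty string.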
|
||||||
.P
|
.P
|
||||||
PCRE2_SUBSTITUTE_EXTENDED causes extra processing to be applied to the
|
PCRE2_SUBSTITUTE_EXTENDED causes extra processing to be applied to the
|
||||||
replacement string. Without this option, only the dollar character is special,
|
replacement string. Without this option, only the dollar character is special,
|
||||||
|
@ -2800,26 +2829,38 @@ string remains in force afterwards, as shown in this \fBpcre2test\fP example:
|
||||||
1: HELLO
|
1: HELLO
|
||||||
.sp
|
.sp
|
||||||
The PCRE2_SUBSTITUTE_UNSET_EMPTY option does not affect these extended
|
The PCRE2_SUBSTITUTE_UNSET_EMPTY option does not affect these extended
|
||||||
substitutions.
|
substitutions. However, PCRE2_SUBSTITUTE_UNKNOWN_UNSET does cause unknown
|
||||||
|
groups in the extended syntax forms to be treated as unset.
|
||||||
.P
|
.P
|
||||||
If successful, the function returns the number of replacements that were made.
|
If successful, \fBpcre2_substitute()\fP returns the number of replacements that
|
||||||
This may be zero if no matches were found, and is never greater than 1 unless
|
were made. This may be zero if no matches were found, and is never greater than
|
||||||
PCRE2_SUBSTITUTE_GLOBAL is set.
|
1 unless PCRE2_SUBSTITUTE_GLOBAL is set.
|
||||||
.P
|
.P
|
||||||
In the event of an error, a negative error code is returned. Except for
|
In the event of an error, a negative error code is returned. Except for
|
||||||
PCRE2_ERROR_NOMATCH (which is never returned), errors from \fBpcre2_match()\fP
|
PCRE2_ERROR_NOMATCH (which is never returned), errors from \fBpcre2_match()\fP
|
||||||
are passed straight back. PCRE2_ERROR_NOSUBSTRING is returned for a
|
are passed straight back.
|
||||||
non-existent substring insertion, and PCRE2_ERROR_UNSET is returned for an
|
.P
|
||||||
unset substring insertion when the simple (non-extended) syntax is used and
|
PCRE2_ERROR_NOSUBSTRING is returned for a non-existent substring insertion,
|
||||||
PCRE2_SUBSTITUTE_UNSET_EMPTY is not set. PCRE2_ERROR_NOMEMORY is returned if
|
unless PCRE2_SUBSTITUTE_UNKNOWN_UNSET is set.
|
||||||
the output buffer is not big enough. PCRE2_ERROR_BADREPLACEMENT is used for
|
.P
|
||||||
miscellaneous syntax errors in the replacement string, with more particular
|
PCRE2_ERROR_UNSET is returned for an unset substring insertion (including an
|
||||||
errors being PCRE2_ERROR_BADREPESCAPE (invalid escape sequence),
|
unknown substring when PCRE2_SUBSTITUTE_UNKNOWN_UNSET is set) when the simple
|
||||||
PCRE2_ERROR_REPMISSING_BRACE (closing curly bracket not found),
|
(non-extended) syntax is used and PCRE2_SUBSTITUTE_UNSET_EMPTY is not set.
|
||||||
PCRE2_BADSUBSTITUTION (syntax error in extended group substitution), and
|
.P
|
||||||
PCRE2_BADSUBPATTERN (the pattern match ended before it started). As for all
|
PCRE2_ERROR_NOMEMORY is returned if the output buffer is not big enough. If the
|
||||||
PCRE2 errors, a text message that describes the error can be obtained by
|
PCRE2_SUBSTITUTE_OVERFLOW_LENGTH option is set, the size of buffer that is
|
||||||
calling \fBpcre2_get_error_message()\fP.
|
needed is returned via \fIoutlengthptr\fP. Note that this does not happen by
|
||||||
|
default.
|
||||||
|
.P
|
||||||
|
PCRE2_ERROR_BADREPLACEMENT is used for miscellaneous syntax errors in the
|
||||||
|
replacement string, with more particular errors being PCRE2_ERROR_BADREPESCAPE
|
||||||
|
(invalid escape sequence), PCRE2_ERROR_REPMISSING_BRACE (closing curly bracket
|
||||||
|
not found), PCRE2_ERROR_BADSUBSTITUTION (syntax error in extended group
|
||||||
|
substitution), and PCRE2_ERROR_BADSUBSPATTERN (the pattern match ended before it
|
||||||
|
started, which can happen if \eK is used in an assertion).
|
||||||
|
.P
|
||||||
|
As for all PCRE2 errors, a text message that describes the error can be
|
||||||
|
obtained by calling \fBpcre2_get_error_message()\fP.
|
||||||
.
|
.
|
||||||
.
|
.
|
||||||
.SH "DUPLICATE SUBPATTERN NAMES"
|
.SH "DUPLICATE SUBPATTERN NAMES"
|
||||||
|
@ -3113,6 +3154,6 @@ Cambridge, England.
|
||||||
.rs
|
.rs
|
||||||
.sp
|
.sp
|
||||||
.nf
|
.nf
|
||||||
Last updated: 04 December 2015
|
Last updated: 21 December 2015
|
||||||
Copyright (c) 1997-2015 University of Cambridge.
|
Copyright (c) 1997-2015 University of Cambridge.
|
||||||
.fi
|
.fi
|
||||||
|
|
114
doc/pcre2test.1
|
@ -1,4 +1,4 @@
|
||||||
.TH PCRE2TEST 1 "04 December 2015" "PCRE 10.21"
|
.TH PCRE2TEST 1 "12 December 2015" "PCRE 10.21"
|
||||||
.SH NAME
|
.SH NAME
|
||||||
pcre2test - a program for testing Perl-compatible regular expressions.
|
pcre2test - a program for testing Perl-compatible regular expressions.
|
||||||
.SH SYNOPSIS
|
.SH SYNOPSIS
|
||||||
|
@ -854,16 +854,18 @@ are applied to every subject line that is processed with that pattern. They may
|
||||||
not appear in \fB#pattern\fP commands. These modifiers do not affect the
|
not appear in \fB#pattern\fP commands. These modifiers do not affect the
|
||||||
compilation process.
|
compilation process.
|
||||||
.sp
|
.sp
|
||||||
aftertext show text after match
|
aftertext show text after match
|
||||||
allaftertext show text after captures
|
allaftertext show text after captures
|
||||||
allcaptures show all captures
|
allcaptures show all captures
|
||||||
allusedtext show all consulted text
|
allusedtext show all consulted text
|
||||||
/g global global matching
|
/g global global matching
|
||||||
mark show mark values
|
mark show mark values
|
||||||
replace=<string> specify a replacement string
|
replace=<string> specify a replacement string
|
||||||
startchar show starting character when relevant
|
startchar show starting character when relevant
|
||||||
substitute_extended use PCRE2_SUBSTITUTE_EXTENDED
|
substitute_extended use PCRE2_SUBSTITUTE_EXTENDED
|
||||||
substitute_unset_empty use PCRE2_SUBSTITUTE_UNSET_EMPTY
|
substitute_overflow_length use PCRE2_SUBSTITUTE_OVERFLOW_LENGTH
|
||||||
|
substitute_unknown_unset use PCRE2_SUBSTITUTE_UNKNOWN_UNSET
|
||||||
|
substitute_unset_empty use PCRE2_SUBSTITUTE_UNSET_EMPTY
|
||||||
.sp
|
.sp
|
||||||
These modifiers may not appear in a \fB#pattern\fP command. If you want them as
|
These modifiers may not appear in a \fB#pattern\fP command. If you want them as
|
||||||
defaults, set them in a \fB#subject\fP command.
|
defaults, set them in a \fB#subject\fP command.
|
||||||
|
@ -935,36 +937,38 @@ information. Some of them may also be specified on a pattern line (see above),
|
||||||
in which case they apply to every subject line that is matched against that
|
in which case they apply to every subject line that is matched against that
|
||||||
pattern.
|
pattern.
|
||||||
.sp
|
.sp
|
||||||
aftertext show text after match
|
aftertext show text after match
|
||||||
allaftertext show text after captures
|
allaftertext show text after captures
|
||||||
allcaptures show all captures
|
allcaptures show all captures
|
||||||
allusedtext show all consulted text (non-JIT only)
|
allusedtext show all consulted text (non-JIT only)
|
||||||
altglobal alternative global matching
|
altglobal alternative global matching
|
||||||
callout_capture show captures at callout time
|
callout_capture show captures at callout time
|
||||||
callout_data=<n> set a value to pass via callouts
|
callout_data=<n> set a value to pass via callouts
|
||||||
callout_fail=<n>[:<m>] control callout failure
|
callout_fail=<n>[:<m>] control callout failure
|
||||||
callout_none do not supply a callout function
|
callout_none do not supply a callout function
|
||||||
copy=<number or name> copy captured substring
|
copy=<number or name> copy captured substring
|
||||||
dfa use \fBpcre2_dfa_match()\fP
|
dfa use \fBpcre2_dfa_match()\fP
|
||||||
find_limits find match and recursion limits
|
find_limits find match and recursion limits
|
||||||
get=<number or name> extract captured substring
|
get=<number or name> extract captured substring
|
||||||
getall extract all captured substrings
|
getall extract all captured substrings
|
||||||
/g global global matching
|
/g global global matching
|
||||||
jitstack=<n> set size of JIT stack
|
jitstack=<n> set size of JIT stack
|
||||||
mark show mark values
|
mark show mark values
|
||||||
match_limit=<n> set a match limit
|
match_limit=<n> set a match limit
|
||||||
memory show memory usage
|
memory show memory usage
|
||||||
null_context match with a NULL context
|
null_context match with a NULL context
|
||||||
offset=<n> set starting offset
|
offset=<n> set starting offset
|
||||||
offset_limit=<n> set offset limit
|
offset_limit=<n> set offset limit
|
||||||
ovector=<n> set size of output vector
|
ovector=<n> set size of output vector
|
||||||
recursion_limit=<n> set a recursion limit
|
recursion_limit=<n> set a recursion limit
|
||||||
replace=<string> specify a replacement string
|
replace=<string> specify a replacement string
|
||||||
startchar show startchar when relevant
|
startchar show startchar when relevant
|
||||||
startoffset=<n> same as offset=<n>
|
startoffset=<n> same as offset=<n>
|
||||||
substitute_extended use PCRE2_SUBSTITUTE_EXTENDED
|
substitute_extended use PCRE2_SUBSTITUTE_EXTENDED
|
||||||
substitute_unset_empty use PCRE2_SUBSTITUTE_UNSET_EMPTY
|
substitute_overflow_length use PCRE2_SUBSTITUTE_OVERFLOW_LENGTH
|
||||||
zero_terminate pass the subject as zero-terminated
|
substitute_unknown_unset use PCRE2_SUBSTITUTE_UNKNOWN_UNSET
|
||||||
|
substitute_unset_empty use PCRE2_SUBSTITUTE_UNSET_EMPTY
|
||||||
|
zero_terminate pass the subject as zero-terminated
|
||||||
.sp
|
.sp
|
||||||
The effects of these modifiers are described in the following sections.
|
The effects of these modifiers are described in the following sections.
|
||||||
.
|
.
|
||||||
|
@ -1107,10 +1111,15 @@ the appropriate code unit width. If it is not a valid UTF-8 string, the
|
||||||
individual code units are copied directly. This provides a means of passing an
|
individual code units are copied directly. This provides a means of passing an
|
||||||
invalid UTF-8 string for testing purposes.
|
invalid UTF-8 string for testing purposes.
|
||||||
.P
|
.P
|
||||||
If the \fBglobal\fP modifier is set, PCRE2_SUBSTITUTE_GLOBAL is passed to
|
The following modifiers set options (in addition to the normal match options)
|
||||||
\fBpcre2_substitute()\fP. The \fBsubstitute_extended\fP and
|
for \fBpcre2_substitute()\fP:
|
||||||
\fBsubstitute_unset_empty\fP modifiers set PCRE2_SUBSTITUTE_EXTENDED and
|
.sp
|
||||||
PCRE2_SUBSTITUTE_UNSET_EMPTY, respectively.
|
global PCRE2_SUBSTITUTE_GLOBAL
|
||||||
|
substitute_extended PCRE2_SUBSTITUTE_EXTENDED
|
||||||
|
substitute_overflow_length PCRE2_SUBSTITUTE_OVERFLOW_LENGTH
|
||||||
|
substitute_unknown_unset PCRE2_SUBSTITUTE_UNKNOWN_UNSET
|
||||||
|
substitute_unset_empty PCRE2_SUBSTITUTE_UNSET_EMPTY
|
||||||
|
.sp
|
||||||
.P
|
.P
|
||||||
After a successful substitution, the modified string is output, preceded by the
|
After a successful substitution, the modified string is output, preceded by the
|
||||||
number of replacements. This may be zero if there were no matches. Here is a
|
number of replacements. This may be zero if there were no matches. Here is a
|
||||||
|
@ -1135,6 +1144,19 @@ character. Here is an example that tests the edge case:
|
||||||
123abc123\e=replace=[9]XYZ
|
123abc123\e=replace=[9]XYZ
|
||||||
Failed: error -47: no more memory
|
Failed: error -47: no more memory
|
||||||
.sp
|
.sp
|
||||||
|
The default action of \fBpcre2_substitute()\fP is to return
|
||||||
|
PCRE2_ERROR_NOMEMORY when the output buffer is too small. However, if the
|
||||||
|
PCRE2_SUBSTITUTE_OVERFLOW_LENGTH option is set (by using the
|
||||||
|
\fBsubstitute_overflow_length\fP modifier), \fBpcre2_substitute()\fP continues
|
||||||
|
to go through the motions of matching and substituting, in order to compute the
|
||||||
|
size of buffer that is required. When this happens, \fBpcre2test\fP shows the
|
||||||
|
required buffer length (which includes space for the trailing zero) as part of
|
||||||
|
the error message. For example:
|
||||||
|
.sp
|
||||||
|
/abc/substitute_overflow_length
|
||||||
|
123abc123\e=replace=[9]XYZ
|
||||||
|
Failed: error -47: no more memory: 10 code units are needed
|
||||||
|
.sp
|
||||||
A replacement string is ignored with POSIX and DFA matching. Specifying partial
|
A replacement string is ignored with POSIX and DFA matching. Specifying partial
|
||||||
matching provokes an error return ("bad option value") from
|
matching provokes an error return ("bad option value") from
|
||||||
\fBpcre2_substitute()\fP.
|
\fBpcre2_substitute()\fP.
|
||||||
|
@ -1618,6 +1640,6 @@ Cambridge, England.
|
||||||
.rs
|
.rs
|
||||||
.sp
|
.sp
|
||||||
.nf
|
.nf
|
||||||
Last updated: 04 December 2015
|
Last updated: 12 December 2015
|
||||||
Copyright (c) 1997-2015 University of Cambridge.
|
Copyright (c) 1997-2015 University of Cambridge.
|
||||||
.fi
|
.fi
|
||||||
|
|
|
@ -148,9 +148,11 @@ sanity checks). */
|
||||||
|
|
||||||
/* These are additional options for pcre2_substitute(). */
|
/* These are additional options for pcre2_substitute(). */
|
||||||
|
|
||||||
#define PCRE2_SUBSTITUTE_GLOBAL 0x00000100u
|
#define PCRE2_SUBSTITUTE_GLOBAL 0x00000100u
|
||||||
#define PCRE2_SUBSTITUTE_EXTENDED 0x00000200u
|
#define PCRE2_SUBSTITUTE_EXTENDED 0x00000200u
|
||||||
#define PCRE2_SUBSTITUTE_UNSET_EMPTY 0x00000400u
|
#define PCRE2_SUBSTITUTE_UNSET_EMPTY 0x00000400u
|
||||||
|
#define PCRE2_SUBSTITUTE_UNKNOWN_UNSET 0x00000800u
|
||||||
|
#define PCRE2_SUBSTITUTE_OVERFLOW_LENGTH 0x00001000u
|
||||||
|
|
||||||
/* Newline and \R settings, for use in compile contexts. The newline values
|
/* Newline and \R settings, for use in compile contexts. The newline values
|
||||||
must be kept in step with values set in config.h and both sets must all be
|
must be kept in step with values set in config.h and both sets must all be
|
||||||
|
|
|
@ -148,9 +148,11 @@ sanity checks). */
|
||||||
|
|
||||||
/* These are additional options for pcre2_substitute(). */
|
/* These are additional options for pcre2_substitute(). */
|
||||||
|
|
||||||
#define PCRE2_SUBSTITUTE_GLOBAL 0x00000100u
|
#define PCRE2_SUBSTITUTE_GLOBAL 0x00000100u
|
||||||
#define PCRE2_SUBSTITUTE_EXTENDED 0x00000200u
|
#define PCRE2_SUBSTITUTE_EXTENDED 0x00000200u
|
||||||
#define PCRE2_SUBSTITUTE_UNSET_EMPTY 0x00000400u
|
#define PCRE2_SUBSTITUTE_UNSET_EMPTY 0x00000400u
|
||||||
|
#define PCRE2_SUBSTITUTE_UNKNOWN_UNSET 0x00000800u
|
||||||
|
#define PCRE2_SUBSTITUTE_OVERFLOW_LENGTH 0x00001000u
|
||||||
|
|
||||||
/* Newline and \R settings, for use in compile contexts. The newline values
|
/* Newline and \R settings, for use in compile contexts. The newline values
|
||||||
must be kept in step with values set in config.h and both sets must all be
|
must be kept in step with values set in config.h and both sets must all be
|
||||||
|
|
|
@ -47,6 +47,12 @@ POSSIBILITY OF SUCH DAMAGE.
|
||||||
|
|
||||||
#define PTR_STACK_SIZE 20
|
#define PTR_STACK_SIZE 20
|
||||||
|
|
||||||
|
#define SUBSTITUTE_OPTIONS \
|
||||||
|
(PCRE2_SUBSTITUTE_EXTENDED|PCRE2_SUBSTITUTE_GLOBAL| \
|
||||||
|
PCRE2_SUBSTITUTE_OVERFLOW_LENGTH|PCRE2_SUBSTITUTE_UNKNOWN_UNSET| \
|
||||||
|
PCRE2_SUBSTITUTE_UNSET_EMPTY)
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
/*************************************************
|
/*************************************************
|
||||||
* Find end of substitute text *
|
* Find end of substitute text *
|
||||||
|
@ -181,6 +187,30 @@ Returns: >= 0 number of substitutions made
|
||||||
PCRE2_ERROR_BADREPLACEMENT means invalid use of $
|
PCRE2_ERROR_BADREPLACEMENT means invalid use of $
|
||||||
*/
|
*/
|
||||||
|
|
||||||
|
/* This macro checks for space in the buffer before copying into it. On
|
||||||
|
overflow, either give an error immediately, or keep on, accumulating the
|
||||||
|
length. */
|
||||||
|
|
||||||
|
#define CHECKMEMCPY(from,length) \
|
||||||
|
if (!overflowed && lengthleft < length) \
|
||||||
|
{ \
|
||||||
|
if ((suboptions & PCRE2_SUBSTITUTE_OVERFLOW_LENGTH) == 0) goto NOROOM; \
|
||||||
|
overflowed = TRUE; \
|
||||||
|
extra_needed = length - lengthleft; \
|
||||||
|
} \
|
||||||
|
else if (overflowed) \
|
||||||
|
{ \
|
||||||
|
extra_needed += length; \
|
||||||
|
} \
|
||||||
|
else \
|
||||||
|
{ \
|
||||||
|
memcpy(buffer + buff_offset, from, CU2BYTES(length)); \
|
||||||
|
buff_offset += length; \
|
||||||
|
lengthleft -= length; \
|
||||||
|
}
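The accumulation logic of CHECKMEMCPY can be exercised in isolation. The block below is a sketch that restates the macro as a function over a hypothetical `outbuf` struct whose fields mirror the locals in this diff; `want_length` stands in for the PCRE2_SUBSTITUTE_OVERFLOW_LENGTH test, and `char` stands in for the code unit type. The final computation (`buff_length + extra_needed`, with the trailing zero counted as one more copied unit) is an assumption, since that part of the function is not shown in this hunk.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical state holder; field names mirror the locals in the diff. */
typedef struct {
  char  *buffer;
  size_t buff_length;   /* total buffer size, in code units */
  size_t buff_offset;   /* next write position */
  size_t lengthleft;    /* unused space remaining */
  size_t extra_needed;  /* units that did not fit, once overflowed */
  bool   overflowed;
  bool   want_length;   /* stands in for PCRE2_SUBSTITUTE_OVERFLOW_LENGTH */
} outbuf;

/* Function form of CHECKMEMCPY. Returns false where the macro would
   "goto NOROOM"; otherwise it either copies or just counts. */
bool check_memcpy(outbuf *ob, const char *from, size_t length)
{
  if (!ob->overflowed && ob->lengthleft < length) {
    if (!ob->want_length) return false;
    ob->overflowed = true;
    ob->extra_needed = length - ob->lengthleft;
  } else if (ob->overflowed) {
    ob->extra_needed += length;
  } else {
    memcpy(ob->buffer + ob->buff_offset, from, length);
    ob->buff_offset += length;
    ob->lengthleft -= length;
  }
  return true;
}

/* Emit "123" + "XYZ" + "123" + trailing zero into a buffer of `size` units.
   Returns the required size on overflow, 0 if everything fitted, and
   (size_t)-1 for an immediate failure. */
size_t demo_required(size_t size, bool want_length)
{
  char storage[16];
  outbuf ob = { storage, size, 0, size, 0, false, want_length };
  const char *parts[] = { "123", "XYZ", "123" };
  size_t i;

  assert(size <= sizeof storage);
  for (i = 0; i < 3; i++)
    if (!check_memcpy(&ob, parts[i], 3)) return (size_t)-1;
  if (!check_memcpy(&ob, "", 1)) return (size_t)-1;   /* trailing zero */
  return ob.overflowed ? ob.buff_length + ob.extra_needed : 0;
}
```

With a 9-unit buffer the sketch reports 10, the same figure as the pcre2test example for `123abc123` with replacement `XYZ`; the answer does not depend on where in the output the overflow first happens.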
|
||||||
|
|
||||||
|
/* Here's the function */
|
||||||
|
|
||||||
PCRE2_EXP_DEFN int PCRE2_CALL_CONVENTION
|
PCRE2_EXP_DEFN int PCRE2_CALL_CONVENTION
|
||||||
pcre2_substitute(const pcre2_code *code, PCRE2_SPTR subject, PCRE2_SIZE length,
|
pcre2_substitute(const pcre2_code *code, PCRE2_SPTR subject, PCRE2_SIZE length,
|
||||||
PCRE2_SIZE start_offset, uint32_t options, pcre2_match_data *match_data,
|
PCRE2_SIZE start_offset, uint32_t options, pcre2_match_data *match_data,
|
||||||
|
@ -193,20 +223,22 @@ int forcecase = 0;
|
||||||
int forcecasereset = 0;
|
int forcecasereset = 0;
|
||||||
uint32_t ovector_count;
|
uint32_t ovector_count;
|
||||||
uint32_t goptions = 0;
|
uint32_t goptions = 0;
|
||||||
|
uint32_t suboptions;
|
||||||
BOOL match_data_created = FALSE;
|
BOOL match_data_created = FALSE;
|
||||||
BOOL global = FALSE;
|
|
||||||
BOOL extended = FALSE;
|
|
||||||
BOOL literal = FALSE;
|
BOOL literal = FALSE;
|
||||||
BOOL uempty = FALSE; /* Unset/unknown groups => empty string */
|
BOOL overflowed = FALSE;
|
||||||
#ifdef SUPPORT_UNICODE
|
#ifdef SUPPORT_UNICODE
|
||||||
BOOL utf = (code->overall_options & PCRE2_UTF) != 0;
|
BOOL utf = (code->overall_options & PCRE2_UTF) != 0;
|
||||||
#endif
|
#endif
|
||||||
|
PCRE2_UCHAR temp[6];
|
||||||
PCRE2_SPTR ptr;
|
PCRE2_SPTR ptr;
|
||||||
PCRE2_SPTR repend;
|
PCRE2_SPTR repend;
|
||||||
|
PCRE2_SIZE extra_needed = 0;
|
||||||
PCRE2_SIZE buff_offset, buff_length, lengthleft, fraglength;
|
PCRE2_SIZE buff_offset, buff_length, lengthleft, fraglength;
|
||||||
PCRE2_SIZE *ovector;
|
PCRE2_SIZE *ovector;
|
||||||
|
|
||||||
buff_length = *blength;
|
buff_offset = 0;
|
||||||
|
lengthleft = buff_length = *blength;
|
||||||
*blength = PCRE2_UNSET;
|
*blength = PCRE2_UNSET;
|
||||||
|
|
||||||
/* Partial matching is not valid. */
|
/* Partial matching is not valid. */
|
||||||
|
@ -248,33 +280,14 @@ if (utf && (options & PCRE2_NO_UTF_CHECK) == 0)
|
||||||
}
|
}
|
||||||
#endif /* SUPPORT_UNICODE */
|
#endif /* SUPPORT_UNICODE */
|
||||||
|
|
||||||
/* Notice the global and extended options and remove them from the options that
|
/* Save the substitute options and remove them from the match options. */
|
||||||
are passed to pcre2_match(). */
|
|
||||||
|
|
||||||
if ((options & PCRE2_SUBSTITUTE_GLOBAL) != 0)
|
suboptions = options & SUBSTITUTE_OPTIONS;
|
||||||
{
|
options &= ~SUBSTITUTE_OPTIONS;
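The effect of the two mask operations can be checked with a short standalone sketch. The flag values and the SUBSTITUTE_OPTIONS mask are copied from this commit; `split_options()` and `DEMO_OTHER_BIT` are hypothetical names used only for illustration, the latter representing any option bit that is meant for the match call.

```c
#include <assert.h>
#include <stdint.h>

/* Values as defined in the pcre2.h hunk of this commit. */
#define PCRE2_SUBSTITUTE_GLOBAL           0x00000100u
#define PCRE2_SUBSTITUTE_EXTENDED         0x00000200u
#define PCRE2_SUBSTITUTE_UNSET_EMPTY      0x00000400u
#define PCRE2_SUBSTITUTE_UNKNOWN_UNSET    0x00000800u
#define PCRE2_SUBSTITUTE_OVERFLOW_LENGTH  0x00001000u

#define SUBSTITUTE_OPTIONS \
  (PCRE2_SUBSTITUTE_EXTENDED|PCRE2_SUBSTITUTE_GLOBAL| \
   PCRE2_SUBSTITUTE_OVERFLOW_LENGTH|PCRE2_SUBSTITUTE_UNKNOWN_UNSET| \
   PCRE2_SUBSTITUTE_UNSET_EMPTY)

#define DEMO_OTHER_BIT 0x80000000u  /* hypothetical non-substitute option bit */

/* Separate the substitute-only bits from those passed on to matching. */
void split_options(uint32_t options, uint32_t *suboptions,
                   uint32_t *match_options)
{
  *suboptions = options & SUBSTITUTE_OPTIONS;
  *match_options = options & ~SUBSTITUTE_OPTIONS;
}
```

Keeping all the substitute bits in one saved word is what lets the later code test `suboptions` directly instead of carrying one BOOL per option.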
|
||||||
options &= ~PCRE2_SUBSTITUTE_GLOBAL;
|
|
||||||
global = TRUE;
|
|
||||||
}
|
|
||||||
|
|
||||||
if ((options & PCRE2_SUBSTITUTE_EXTENDED) != 0)
|
|
||||||
{
|
|
||||||
options &= ~PCRE2_SUBSTITUTE_EXTENDED;
|
|
||||||
extended = TRUE;
|
|
||||||
}
|
|
||||||
|
|
||||||
if ((options & PCRE2_SUBSTITUTE_UNSET_EMPTY) != 0)
|
|
||||||
{
|
|
||||||
options &= ~PCRE2_SUBSTITUTE_UNSET_EMPTY;
|
|
||||||
uempty = TRUE;
|
|
||||||
}
|
|
||||||
|
|
||||||
/* Copy up to the start offset */
|
/* Copy up to the start offset */
|
||||||
|
|
||||||
if (start_offset > buff_length) goto NOROOM;
|
CHECKMEMCPY(subject, start_offset);
|
||||||
memcpy(buffer, subject, start_offset * (PCRE2_CODE_UNIT_WIDTH/8));
|
|
||||||
buff_offset = start_offset;
|
|
||||||
lengthleft = buff_length - start_offset;
|
|
||||||
|
|
||||||
/* Loop for global substituting. */
|
/* Loop for global substituting. */
|
||||||
|
|
||||||
|
@ -330,13 +343,11 @@ do
|
||||||
#endif
|
#endif
|
||||||
}
|
}
|
||||||
|
|
||||||
fraglength = start_offset - save_start;
|
/* Copy what we have advanced past, reset the special global options, and
|
||||||
if (lengthleft < fraglength) goto NOROOM;
|
continue to the next match. */
|
||||||
memcpy(buffer + buff_offset, subject + save_start,
|
|
||||||
fraglength*(PCRE2_CODE_UNIT_WIDTH/8));
|
|
||||||
buff_offset += fraglength;
|
|
||||||
lengthleft -= fraglength;
|
|
||||||
|
|
||||||
|
fraglength = start_offset - save_start;
|
||||||
|
CHECKMEMCPY(subject + save_start, fraglength);
|
||||||
goptions = 0;
|
goptions = 0;
|
||||||
continue;
|
continue;
|
||||||
}
|
}
|
||||||
|
@ -350,25 +361,21 @@ do
|
||||||
goto EXIT;
|
goto EXIT;
|
||||||
}
|
}
|
||||||
|
|
||||||
/* Paranoid check for integer overflow; surely no real call to this function
|
/* Count substitutions with a paranoid check for integer overflow; surely no
|
||||||
would ever hit this! */
|
real call to this function would ever hit this! */
|
||||||
|
|
||||||
if (subs == INT_MAX)
|
if (subs == INT_MAX)
|
||||||
{
|
{
|
||||||
rc = PCRE2_ERROR_TOOMANYREPLACE;
|
rc = PCRE2_ERROR_TOOMANYREPLACE;
|
||||||
goto EXIT;
|
goto EXIT;
|
||||||
}
|
}
|
||||||
|
|
||||||
/* Count substitutions and proceed */
|
|
||||||
|
|
||||||
subs++;
|
subs++;
|
||||||
|
|
||||||
|
/* Copy the text leading up to the match. */
|
||||||
|
|
||||||
if (rc == 0) rc = ovector_count;
|
if (rc == 0) rc = ovector_count;
|
||||||
fraglength = ovector[0] - start_offset;
|
fraglength = ovector[0] - start_offset;
|
||||||
if (fraglength >= lengthleft) goto NOROOM;
|
CHECKMEMCPY(subject + start_offset, fraglength);
|
||||||
memcpy(buffer + buff_offset, subject + start_offset,
|
|
||||||
fraglength*(PCRE2_CODE_UNIT_WIDTH/8));
|
|
||||||
buff_offset += fraglength;
|
|
||||||
lengthleft -= fraglength;
|
|
||||||
|
|
||||||
/* Process the replacement string. Literal mode is set by \Q, but only in
|
/* Process the replacement string. Literal mode is set by \Q, but only in
|
||||||
extended mode when backslashes are being interpreted. In extended mode we
|
extended mode when backslashes are being interpreted. In extended mode we
|
||||||
|
@ -378,12 +385,13 @@ do
|
||||||
for (;;)
|
for (;;)
|
||||||
{
|
{
|
||||||
uint32_t ch;
|
uint32_t ch;
|
||||||
|
unsigned int chlen;
|
||||||
|
|
||||||
/* If at the end of a nested substring, pop the stack. */
|
/* If at the end of a nested substring, pop the stack. */
|
||||||
|
|
||||||
if (ptr >= repend)
|
if (ptr >= repend)
|
||||||
{
|
{
|
||||||
if (ptrstackptr <= 0) break;
|
if (ptrstackptr <= 0) break; /* End of replacement string */
|
||||||
repend = ptrstack[--ptrstackptr];
|
repend = ptrstack[--ptrstackptr];
|
||||||
ptr = ptrstack[--ptrstackptr];
|
ptr = ptrstack[--ptrstackptr];
|
||||||
continue;
|
continue;
|
||||||
|
@ -450,12 +458,22 @@ do
|
||||||
group = group * 10 + next - CHAR_0;
|
group = group * 10 + next - CHAR_0;
|
||||||
|
|
||||||
/* A check for a number greater than the highest captured group
|
/* A check for a number greater than the highest captured group
|
||||||
is sufficient here; no need for a separate overflow check. */
|
is sufficient here; no need for a separate overflow check. If unknown
|
||||||
|
groups are to be treated as unset, just skip over any remaining
|
||||||
|
digits and carry on. */
|
||||||
|
|
||||||
if (group > code->top_bracket)
|
if (group > code->top_bracket)
|
||||||
{
|
{
|
||||||
rc = PCRE2_ERROR_NOSUBSTRING;
|
if ((suboptions & PCRE2_SUBSTITUTE_UNKNOWN_UNSET) != 0)
|
||||||
goto PTREXIT;
|
{
|
||||||
|
while (++ptr < repend && *ptr >= CHAR_0 && *ptr <= CHAR_9);
|
||||||
|
break;
|
||||||
|
}
|
||||||
|
else
|
||||||
|
{
|
||||||
|
rc = PCRE2_ERROR_NOSUBSTRING;
|
||||||
|
goto PTREXIT;
|
||||||
|
}
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
@ -478,7 +496,8 @@ do
|
||||||
|
|
||||||
if (inparens)
|
if (inparens)
|
||||||
{
|
{
|
||||||
if (extended && !star && ptr < repend - 2 && next == CHAR_COLON)
|
if ((suboptions & PCRE2_SUBSTITUTE_EXTENDED) != 0 &&
|
||||||
|
!star && ptr < repend - 2 && next == CHAR_COLON)
|
||||||
{
|
{
|
||||||
special = *(++ptr);
|
special = *(++ptr);
|
||||||
if (special != CHAR_PLUS && special != CHAR_MINUS)
|
if (special != CHAR_PLUS && special != CHAR_MINUS)
|
||||||
|
@ -513,8 +532,8 @@ do
|
||||||
ptr++;
|
ptr++;
|
||||||
}
|
}
|
||||||
|
|
||||||
/* Have found a syntactically correct group number or name, or
|
/* Have found a syntactically correct group number or name, or *name.
|
||||||
*name. Only *MARK is currently recognized. */
|
Only *MARK is currently recognized. */
|
||||||
|
|
||||||
if (star)
|
if (star)
|
||||||
{
|
{
|
||||||
|
@ -523,11 +542,10 @@ do
|
||||||
PCRE2_SPTR mark = pcre2_get_mark(match_data);
|
PCRE2_SPTR mark = pcre2_get_mark(match_data);
|
||||||
if (mark != NULL)
|
if (mark != NULL)
|
||||||
{
|
{
|
||||||
while (*mark != 0)
|
PCRE2_SPTR mark_start = mark;
|
||||||
{
|
while (*mark != 0) mark++;
|
||||||
if (lengthleft-- < 1) goto NOROOM;
|
fraglength = mark - mark_start;
|
||||||
buffer[buff_offset++] = *mark++;
|
CHECKMEMCPY(mark_start, fraglength);
|
||||||
}
|
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
else goto BAD;
|
else goto BAD;
|
||||||
|
@ -541,31 +559,41 @@ do
|
||||||
PCRE2_SPTR subptr, subptrend;
|
PCRE2_SPTR subptr, subptrend;
|
||||||
|
|
||||||
/* Find a number for a named group. In case there are duplicate names,
|
/* Find a number for a named group. In case there are duplicate names,
|
||||||
search for the first one that is set. */
|
search for the first one that is set. If the name is not found when
|
||||||
|
PCRE2_SUBSTITUTE_UNKNOWN_UNSET is set, set the group number to a
|
||||||
|
non-existent group. */
|
||||||
|
|
||||||
if (group < 0)
|
if (group < 0)
|
||||||
{
|
{
|
||||||
PCRE2_SPTR first, last, entry;
|
PCRE2_SPTR first, last, entry;
|
||||||
rc = pcre2_substring_nametable_scan(code, name, &first, &last);
|
rc = pcre2_substring_nametable_scan(code, name, &first, &last);
|
||||||
if (rc < 0) goto PTREXIT;
|
if (rc == PCRE2_ERROR_NOSUBSTRING &&
|
||||||
for (entry = first; entry <= last; entry += rc)
|
(suboptions & PCRE2_SUBSTITUTE_UNKNOWN_UNSET) != 0)
|
||||||
{
|
{
|
||||||
uint32_t ng = GET2(entry, 0);
|
group = code->top_bracket + 1;
|
||||||
if (ng < ovector_count)
|
}
|
||||||
|
else
|
||||||
|
{
|
||||||
|
if (rc < 0) goto PTREXIT;
|
||||||
|
for (entry = first; entry <= last; entry += rc)
|
||||||
{
|
{
|
||||||
if (group < 0) group = ng; /* First in ovector */
|
uint32_t ng = GET2(entry, 0);
|
||||||
if (ovector[ng*2] != PCRE2_UNSET)
|
if (ng < ovector_count)
|
||||||
{
|
{
|
||||||
group = ng; /* First that is set */
|
if (group < 0) group = ng; /* First in ovector */
|
||||||
break;
|
if (ovector[ng*2] != PCRE2_UNSET)
|
||||||
|
{
|
||||||
|
group = ng; /* First that is set */
|
||||||
|
break;
|
||||||
|
}
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
|
/* If group is still negative, it means we did not find a group
|
||||||
|
that is in the ovector. Just set the first group. */
|
||||||
|
|
||||||
|
if (group < 0) group = GET2(first, 0);
|
||||||
}
|
}
|
||||||
|
|
||||||
/* If group is still negative, it means we did not find a group that
|
|
||||||
is in the ovector. Just set the first group. */
|
|
||||||
|
|
||||||
if (group < 0) group = GET2(first, 0);
|
|
||||||
}
|
}
|
||||||
|
|
||||||
/* We now have a group that is identified by number. Find the length of
|
/* We now have a group that is identified by number. Find the length of
|
||||||
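Review note: the duplicate-name search in the hunk above prefers the first group that is set, then the first group that lies inside the ovector, then simply the first name-table entry. A small sketch of that selection order follows, assuming the candidate group numbers have already been extracted into an array. `pick_group` and `SKETCH_UNSET` are hypothetical names, not PCRE2 API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_UNSET (~(size_t)0)   /* Hypothetical stand-in for PCRE2_UNSET */

/* Choose among the group numbers that share one name (ncand must be >= 1).
Preference order: first that is set; else first within the ovector; else the
first candidate. */
static uint32_t pick_group(const uint32_t *candidates, size_t ncand,
  const size_t *ovector, uint32_t ovector_count)
{
int64_t group = -1;
for (size_t i = 0; i < ncand; i++)
  {
  uint32_t ng = candidates[i];
  if (ng < ovector_count)
    {
    if (group < 0) group = ng;            /* First in ovector */
    if (ovector[ng*2] != SKETCH_UNSET)
      {
      group = ng;                         /* First that is set */
      break;
      }
    }
  }
if (group < 0) group = candidates[0];     /* None is in the ovector */
return (uint32_t)group;
}

static void pick_group_demo(void)
{
/* Four pairs: group 0 and group 2 are set, groups 1 and 3 are unset. */
size_t ov[8] = { 0, 5, SKETCH_UNSET, SKETCH_UNSET, 2, 3,
  SKETCH_UNSET, SKETCH_UNSET };
uint32_t a[] = { 1, 2, 3 };
uint32_t b[] = { 1, 3 };
uint32_t c[] = { 7, 9 };
assert(pick_group(a, 3, ov, 4) == 2);   /* First that is set */
assert(pick_group(b, 2, ov, 4) == 1);   /* None set: first in ovector */
assert(pick_group(c, 2, ov, 4) == 7);   /* None in ovector: first entry */
}
```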
|
@ -575,10 +603,15 @@ do
|
||||||
rc = pcre2_substring_length_bynumber(match_data, group, &sublength);
|
rc = pcre2_substring_length_bynumber(match_data, group, &sublength);
|
||||||
if (rc < 0)
|
if (rc < 0)
|
||||||
{
|
{
|
||||||
|
if (rc == PCRE2_ERROR_NOSUBSTRING &&
|
||||||
|
(suboptions & PCRE2_SUBSTITUTE_UNKNOWN_UNSET) != 0)
|
||||||
|
{
|
||||||
|
rc = PCRE2_ERROR_UNSET;
|
||||||
|
}
|
||||||
if (rc != PCRE2_ERROR_UNSET) goto PTREXIT; /* Non-unset errors */
|
if (rc != PCRE2_ERROR_UNSET) goto PTREXIT; /* Non-unset errors */
|
||||||
if (special == 0) /* Plain substitution */
|
if (special == 0) /* Plain substitution */
|
||||||
{
|
{
|
||||||
if (uempty) continue; /* Treat as empty */
|
if ((suboptions & PCRE2_SUBSTITUTE_UNSET_EMPTY) != 0) continue;
|
||||||
goto PTREXIT; /* Else error */
|
goto PTREXIT; /* Else error */
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
@ -646,26 +679,13 @@ do
|
||||||
}
|
}
|
||||||
|
|
||||||
#ifdef SUPPORT_UNICODE
|
#ifdef SUPPORT_UNICODE
|
||||||
if (utf)
|
if (utf) chlen = PRIV(ord2utf)(ch, temp); else
|
||||||
{
|
|
||||||
unsigned int chlen;
|
|
||||||
#if PCRE2_CODE_UNIT_WIDTH == 8
|
|
||||||
if (lengthleft < 6) goto NOROOM;
|
|
||||||
#elif PCRE2_CODE_UNIT_WIDTH == 16
|
|
||||||
if (lengthleft < 2) goto NOROOM;
|
|
||||||
#else
|
|
||||||
if (lengthleft < 1) goto NOROOM;
|
|
||||||
#endif
|
|
||||||
chlen = PRIV(ord2utf)(ch, buffer + buff_offset);
|
|
||||||
buff_offset += chlen;
|
|
||||||
lengthleft -= chlen;
|
|
||||||
}
|
|
||||||
else
|
|
||||||
#endif
|
#endif
|
||||||
{
|
{
|
||||||
if (lengthleft-- < 1) goto NOROOM;
|
temp[0] = ch;
|
||||||
buffer[buff_offset++] = ch;
|
chlen = 1;
|
||||||
}
|
}
|
||||||
|
CHECKMEMCPY(temp, chlen);
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
@ -675,7 +695,8 @@ do
|
||||||
the case-forcing escapes are not supported in pcre2_compile() so must be
|
the case-forcing escapes are not supported in pcre2_compile() so must be
|
||||||
recognized here. */
|
recognized here. */
|
||||||
|
|
||||||
else if (extended && *ptr == CHAR_BACKSLASH)
|
else if ((suboptions & PCRE2_SUBSTITUTE_EXTENDED) != 0 &&
|
||||||
|
*ptr == CHAR_BACKSLASH)
|
||||||
{
|
{
|
||||||
int errorcode = 0;
|
int errorcode = 0;
|
||||||
|
|
||||||
|
@ -756,33 +777,19 @@ do
|
||||||
)[ch/8] & (1 << (ch%8))) == 0)
|
)[ch/8] & (1 << (ch%8))) == 0)
|
||||||
ch = (code->tables + fcc_offset)[ch];
|
ch = (code->tables + fcc_offset)[ch];
|
||||||
}
|
}
|
||||||
|
|
||||||
forcecase = forcecasereset;
|
forcecase = forcecasereset;
|
||||||
}
|
}
|
||||||
|
|
||||||
#ifdef SUPPORT_UNICODE
|
#ifdef SUPPORT_UNICODE
|
||||||
if (utf)
|
if (utf) chlen = PRIV(ord2utf)(ch, temp); else
|
||||||
{
|
|
||||||
unsigned int chlen;
|
|
||||||
#if PCRE2_CODE_UNIT_WIDTH == 8
|
|
||||||
if (lengthleft < 6) goto NOROOM;
|
|
||||||
#elif PCRE2_CODE_UNIT_WIDTH == 16
|
|
||||||
if (lengthleft < 2) goto NOROOM;
|
|
||||||
#else
|
|
||||||
if (lengthleft < 1) goto NOROOM;
|
|
||||||
#endif
|
|
||||||
chlen = PRIV(ord2utf)(ch, buffer + buff_offset);
|
|
||||||
buff_offset += chlen;
|
|
||||||
lengthleft -= chlen;
|
|
||||||
}
|
|
||||||
else
|
|
||||||
#endif
|
#endif
|
||||||
{
|
{
|
||||||
if (lengthleft-- < 1) goto NOROOM;
|
temp[0] = ch;
|
||||||
buffer[buff_offset++] = ch;
|
chlen = 1;
|
||||||
}
|
}
|
||||||
}
|
CHECKMEMCPY(temp, chlen);
|
||||||
}
|
} /* End handling a literal code unit */
|
||||||
|
} /* End of loop for scanning the replacement. */
|
||||||
|
|
||||||
/* The replacement has been copied to the output. Update the start offset to
|
/* The replacement has been copied to the output. Update the start offset to
|
||||||
point to the rest of the subject string. If we matched an empty string,
|
point to the rest of the subject string. If we matched an empty string,
|
||||||
|
@ -791,18 +798,33 @@ do
|
||||||
start_offset = ovector[1];
|
start_offset = ovector[1];
|
||||||
goptions = (ovector[0] != ovector[1])? 0 :
|
goptions = (ovector[0] != ovector[1])? 0 :
|
||||||
PCRE2_ANCHORED|PCRE2_NOTEMPTY_ATSTART;
|
PCRE2_ANCHORED|PCRE2_NOTEMPTY_ATSTART;
|
||||||
} while (global); /* Repeat "do" loop */
|
} while ((suboptions & PCRE2_SUBSTITUTE_GLOBAL) != 0); /* Repeat "do" loop */
|
||||||
|
|
||||||
/* Copy the rest of the subject and return the number of substitutions. */
|
/* Copy the rest of the subject. */
|
||||||
|
|
||||||
rc = subs;
|
|
||||||
fraglength = length - start_offset;
|
fraglength = length - start_offset;
|
||||||
if (fraglength + 1 > lengthleft) goto NOROOM;
|
CHECKMEMCPY(subject + start_offset, fraglength);
|
||||||
memcpy(buffer + buff_offset, subject + start_offset,
|
temp[0] = 0;
|
||||||
fraglength*(PCRE2_CODE_UNIT_WIDTH/8));
|
CHECKMEMCPY(temp, 1);
|
||||||
buff_offset += fraglength;
|
|
||||||
buffer[buff_offset] = 0;
|
/* If overflowed is set it means the PCRE2_SUBSTITUTE_OVERFLOW_LENGTH is set,
|
||||||
*blength = buff_offset;
|
and matching has carried on after a full buffer, in order to compute the length
|
||||||
|
needed. Otherwise, an overflow generates an immediate error return. */
|
||||||
|
|
||||||
|
if (overflowed)
|
||||||
|
{
|
||||||
|
rc = PCRE2_ERROR_NOMEMORY;
|
||||||
|
*blength = buff_length + extra_needed;
|
||||||
|
}
|
||||||
|
|
||||||
|
/* After a successful execution, return the number of substitutions and set the
|
||||||
|
length of buffer used, excluding the trailing zero. */
|
||||||
|
|
||||||
|
else
|
||||||
|
{
|
||||||
|
rc = subs;
|
||||||
|
*blength = buff_offset - 1;
|
||||||
|
}
|
||||||
|
|
||||||
EXIT:
|
EXIT:
|
||||||
if (match_data_created) pcre2_match_data_free(match_data);
|
if (match_data_created) pcre2_match_data_free(match_data);
|
||||||
|
|
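Review note: the CHECKMEMCPY macro itself is defined outside the hunks shown here, so the following is only a sketch of the kind of bookkeeping that the final `overflowed` / `extra_needed` logic implies. It is an assumption, not the macro's actual definition. The `outbuf` struct and the functions `checked_copy` and `finish` are hypothetical, and `char` stands in for a code unit.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct {
  char *buffer;         /* Output buffer */
  size_t buff_length;   /* Its total size */
  size_t buff_offset;   /* Units written so far */
  int compute_length;   /* Non-zero: keep counting after an overflow */
  int overflowed;       /* Set once a fragment did not fit */
  size_t extra_needed;  /* Units that did not fit */
} outbuf;

/* Copy a fragment if it fits. After an overflow nothing more is copied; only
the shortfall is accumulated. Returns -1 for an immediate overflow error. */
static int checked_copy(outbuf *o, const char *from, size_t length)
{
size_t lengthleft = o->buff_length - o->buff_offset;
if (o->overflowed) { o->extra_needed += length; return 0; }
if (length > lengthleft)
  {
  if (!o->compute_length) return -1;
  o->overflowed = 1;
  o->extra_needed = length - lengthleft;
  return 0;
  }
memcpy(o->buffer + o->buff_offset, from, length);
o->buff_offset += length;
return 0;
}

/* Mirror of the final step in the diff: the needed size on overflow, else the
length used, excluding the trailing zero. */
static int finish(const outbuf *o, size_t *blength)
{
if (o->overflowed) { *blength = o->buff_length + o->extra_needed; return -1; }
*blength = o->buff_offset - 1;
return 0;
}

static void checked_copy_demo(void)
{
char small[8], big[12];
size_t blength = 0;
outbuf o = { small, sizeof(small), 0, 1, 0, 0 };
outbuf o2 = { big, sizeof(big), 0, 1, 0, 0 };
checked_copy(&o, "hello ", 6);
checked_copy(&o, "world", 5);
checked_copy(&o, "", 1);                      /* Trailing zero */
assert(finish(&o, &blength) == -1 && blength == 12);
checked_copy(&o2, "hello ", 6);               /* Retry with the reported size */
checked_copy(&o2, "world", 5);
checked_copy(&o2, "", 1);
assert(finish(&o2, &blength) == 0 && blength == 11);
assert(strcmp(big, "hello world") == 0);
}
```

The design point is that a caller can make one call with a guess-sized buffer and, on overflow, learn the exact size to allocate before retrying, at the cost of the scan continuing after the buffer is full.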
338
src/pcre2test.c
|
@ -399,39 +399,50 @@ enum { MOD_CTC, /* Applies to a compile context */
|
||||||
MOD_STR }; /* Is a string */
|
MOD_STR }; /* Is a string */
|
||||||
|
|
||||||
/* Control bits. Some apply to compiling, some to matching, but some can be set
|
/* Control bits. Some apply to compiling, some to matching, but some can be set
|
||||||
either on a pattern or a data line, so they must all be distinct. */
|
either on a pattern or a data line, so they must all be distinct. There are now
|
||||||
|
so many of them that they are split into two fields. */
|
||||||
|
|
||||||
#define CTL_AFTERTEXT 0x00000001u
|
#define CTL_AFTERTEXT 0x00000001u
|
||||||
#define CTL_ALLAFTERTEXT 0x00000002u
|
#define CTL_ALLAFTERTEXT 0x00000002u
|
||||||
#define CTL_ALLCAPTURES 0x00000004u
|
#define CTL_ALLCAPTURES 0x00000004u
|
||||||
#define CTL_ALLUSEDTEXT 0x00000008u
|
#define CTL_ALLUSEDTEXT 0x00000008u
|
||||||
#define CTL_ALTGLOBAL 0x00000010u
|
#define CTL_ALTGLOBAL 0x00000010u
|
||||||
#define CTL_BINCODE 0x00000020u
|
#define CTL_BINCODE 0x00000020u
|
||||||
#define CTL_CALLOUT_CAPTURE 0x00000040u
|
#define CTL_CALLOUT_CAPTURE 0x00000040u
|
||||||
#define CTL_CALLOUT_INFO 0x00000080u
|
#define CTL_CALLOUT_INFO 0x00000080u
|
||||||
#define CTL_CALLOUT_NONE 0x00000100u
|
#define CTL_CALLOUT_NONE 0x00000100u
|
||||||
#define CTL_DFA 0x00000200u
|
#define CTL_DFA 0x00000200u
|
||||||
#define CTL_EXPAND 0x00000400u
|
#define CTL_EXPAND 0x00000400u
|
||||||
#define CTL_FINDLIMITS 0x00000800u
|
#define CTL_FINDLIMITS 0x00000800u
|
||||||
#define CTL_FULLBINCODE 0x00001000u
|
#define CTL_FULLBINCODE 0x00001000u
|
||||||
#define CTL_GETALL 0x00002000u
|
#define CTL_GETALL 0x00002000u
|
||||||
#define CTL_GLOBAL 0x00004000u
|
#define CTL_GLOBAL 0x00004000u
|
||||||
#define CTL_HEXPAT 0x00008000u
|
#define CTL_HEXPAT 0x00008000u
|
||||||
#define CTL_INFO 0x00010000u
|
#define CTL_INFO 0x00010000u
|
||||||
#define CTL_JITFAST 0x00020000u
|
#define CTL_JITFAST 0x00020000u
|
||||||
#define CTL_JITVERIFY 0x00040000u
|
#define CTL_JITVERIFY 0x00040000u
|
||||||
#define CTL_MARK 0x00080000u
|
#define CTL_MARK 0x00080000u
|
||||||
#define CTL_MEMORY 0x00100000u
|
#define CTL_MEMORY 0x00100000u
|
||||||
#define CTL_NULLCONTEXT 0x00200000u
|
#define CTL_NULLCONTEXT 0x00200000u
|
||||||
#define CTL_POSIX 0x00400000u
|
#define CTL_POSIX 0x00400000u
|
||||||
#define CTL_PUSH 0x00800000u
|
#define CTL_PUSH 0x00800000u
|
||||||
#define CTL_STARTCHAR 0x01000000u
|
#define CTL_STARTCHAR 0x01000000u
|
||||||
#define CTL_SUBSTITUTE_EXTENDED 0x02000000u
|
#define CTL_ZERO_TERMINATE 0x02000000u
|
||||||
#define CTL_SUBSTITUTE_UNSET_EMPTY 0x04000000u
|
/* Spare 0x04000000u */
|
||||||
#define CTL_ZERO_TERMINATE 0x08000000u
|
/* Spare 0x08000000u */
|
||||||
|
/* Spare 0x10000000u */
|
||||||
|
/* Spare 0x20000000u */
|
||||||
|
#define CTL_NL_SET 0x40000000u /* Informational */
|
||||||
|
#define CTL_BSR_SET 0x80000000u /* Informational */
|
||||||
|
|
||||||
#define CTL_BSR_SET 0x80000000u /* This is informational */
|
/* Second control word */
|
||||||
#define CTL_NL_SET 0x40000000u /* This is informational */
|
|
||||||
|
#define CTL2_SUBSTITUTE_EXTENDED 0x00000001u
|
||||||
|
#define CTL2_SUBSTITUTE_OVERFLOW_LENGTH 0x00000002u
|
||||||
|
#define CTL2_SUBSTITUTE_UNKNOWN_UNSET 0x00000004u
|
||||||
|
#define CTL2_SUBSTITUTE_UNSET_EMPTY 0x00000008u
|
||||||
|
|
||||||
|
/* Combinations */
|
||||||
|
|
||||||
#define CTL_DEBUG (CTL_FULLBINCODE|CTL_INFO) /* For setting */
|
#define CTL_DEBUG (CTL_FULLBINCODE|CTL_INFO) /* For setting */
|
||||||
#define CTL_ANYINFO (CTL_DEBUG|CTL_BINCODE|CTL_CALLOUT_INFO)
|
#define CTL_ANYINFO (CTL_DEBUG|CTL_BINCODE|CTL_CALLOUT_INFO)
|
||||||
|
@ -448,9 +459,12 @@ data line. */
|
||||||
CTL_GLOBAL|\
|
CTL_GLOBAL|\
|
||||||
CTL_MARK|\
|
CTL_MARK|\
|
||||||
CTL_MEMORY|\
|
CTL_MEMORY|\
|
||||||
CTL_STARTCHAR|\
|
CTL_STARTCHAR)
|
||||||
CTL_SUBSTITUTE_EXTENDED|\
|
|
||||||
CTL_SUBSTITUTE_UNSET_EMPTY)
|
#define CTL2_ALLPD (CTL2_SUBSTITUTE_EXTENDED|\
|
||||||
|
CTL2_SUBSTITUTE_OVERFLOW_LENGTH|\
|
||||||
|
CTL2_SUBSTITUTE_UNKNOWN_UNSET|\
|
||||||
|
CTL2_SUBSTITUTE_UNSET_EMPTY)
|
||||||
|
|
||||||
/* Structures for holding modifier information for patterns and subject strings
|
/* Structures for holding modifier information for patterns and subject strings
|
||||||
(data). Fields containing modifiers that can be set either for a pattern or a
|
(data). Fields containing modifiers that can be set either for a pattern or a
|
||||||
|
@ -460,6 +474,7 @@ same offset in the big table below works for both. */
|
||||||
typedef struct patctl { /* Structure for pattern modifiers. */
|
typedef struct patctl { /* Structure for pattern modifiers. */
|
||||||
uint32_t options; /* Must be in same position as datctl */
|
uint32_t options; /* Must be in same position as datctl */
|
||||||
uint32_t control; /* Must be in same position as datctl */
|
uint32_t control; /* Must be in same position as datctl */
|
||||||
|
uint32_t control2; /* Must be in same position as datctl */
|
||||||
uint8_t replacement[REPLACE_MODSIZE]; /* So must this */
|
uint8_t replacement[REPLACE_MODSIZE]; /* So must this */
|
||||||
uint32_t jit;
|
uint32_t jit;
|
||||||
uint32_t stackguard_test;
|
uint32_t stackguard_test;
|
||||||
|
@ -474,6 +489,7 @@ typedef struct patctl { /* Structure for pattern modifiers. */
|
||||||
typedef struct datctl { /* Structure for data line modifiers. */
|
typedef struct datctl { /* Structure for data line modifiers. */
|
||||||
uint32_t options; /* Must be in same position as patctl */
|
uint32_t options; /* Must be in same position as patctl */
|
||||||
uint32_t control; /* Must be in same position as patctl */
|
uint32_t control; /* Must be in same position as patctl */
|
||||||
|
uint32_t control2; /* Must be in same position as patctl */
|
||||||
uint8_t replacement[REPLACE_MODSIZE]; /* So must this */
|
uint8_t replacement[REPLACE_MODSIZE]; /* So must this */
|
||||||
uint32_t cfail[2];
|
uint32_t cfail[2];
|
||||||
int32_t callout_data;
|
int32_t callout_data;
|
||||||
|
@ -514,92 +530,94 @@ typedef struct modstruct {
|
||||||
} modstruct;
|
} modstruct;
|
||||||
|
|
||||||
static modstruct modlist[] = {
|
static modstruct modlist[] = {
|
||||||
{ "aftertext", MOD_PNDP, MOD_CTL, CTL_AFTERTEXT, PO(control) },
|
{ "aftertext", MOD_PNDP, MOD_CTL, CTL_AFTERTEXT, PO(control) },
|
||||||
{ "allaftertext", MOD_PNDP, MOD_CTL, CTL_ALLAFTERTEXT, PO(control) },
|
{ "allaftertext", MOD_PNDP, MOD_CTL, CTL_ALLAFTERTEXT, PO(control) },
|
||||||
{ "allcaptures", MOD_PND, MOD_CTL, CTL_ALLCAPTURES, PO(control) },
|
{ "allcaptures", MOD_PND, MOD_CTL, CTL_ALLCAPTURES, PO(control) },
|
||||||
{ "allow_empty_class", MOD_PAT, MOD_OPT, PCRE2_ALLOW_EMPTY_CLASS, PO(options) },
|
{ "allow_empty_class", MOD_PAT, MOD_OPT, PCRE2_ALLOW_EMPTY_CLASS, PO(options) },
|
||||||
{ "allusedtext", MOD_PNDP, MOD_CTL, CTL_ALLUSEDTEXT, PO(control) },
|
{ "allusedtext", MOD_PNDP, MOD_CTL, CTL_ALLUSEDTEXT, PO(control) },
|
||||||
{ "alt_bsux", MOD_PAT, MOD_OPT, PCRE2_ALT_BSUX, PO(options) },
|
{ "alt_bsux", MOD_PAT, MOD_OPT, PCRE2_ALT_BSUX, PO(options) },
|
||||||
{ "alt_circumflex", MOD_PAT, MOD_OPT, PCRE2_ALT_CIRCUMFLEX, PO(options) },
|
{ "alt_circumflex", MOD_PAT, MOD_OPT, PCRE2_ALT_CIRCUMFLEX, PO(options) },
|
||||||
{ "alt_verbnames", MOD_PAT, MOD_OPT, PCRE2_ALT_VERBNAMES, PO(options) },
|
{ "alt_verbnames", MOD_PAT, MOD_OPT, PCRE2_ALT_VERBNAMES, PO(options) },
|
||||||
{ "altglobal", MOD_PND, MOD_CTL, CTL_ALTGLOBAL, PO(control) },
|
{ "altglobal", MOD_PND, MOD_CTL, CTL_ALTGLOBAL, PO(control) },
|
||||||
{ "anchored", MOD_PD, MOD_OPT, PCRE2_ANCHORED, PD(options) },
|
{ "anchored", MOD_PD, MOD_OPT, PCRE2_ANCHORED, PD(options) },
|
||||||
{ "auto_callout", MOD_PAT, MOD_OPT, PCRE2_AUTO_CALLOUT, PO(options) },
|
{ "auto_callout", MOD_PAT, MOD_OPT, PCRE2_AUTO_CALLOUT, PO(options) },
|
||||||
{ "bincode", MOD_PAT, MOD_CTL, CTL_BINCODE, PO(control) },
|
{ "bincode", MOD_PAT, MOD_CTL, CTL_BINCODE, PO(control) },
|
||||||
{ "bsr", MOD_CTC, MOD_BSR, 0, CO(bsr_convention) },
|
{ "bsr", MOD_CTC, MOD_BSR, 0, CO(bsr_convention) },
|
||||||
{ "callout_capture", MOD_DAT, MOD_CTL, CTL_CALLOUT_CAPTURE, DO(control) },
|
{ "callout_capture", MOD_DAT, MOD_CTL, CTL_CALLOUT_CAPTURE, DO(control) },
|
||||||
{ "callout_data", MOD_DAT, MOD_INS, 0, DO(callout_data) },
|
{ "callout_data", MOD_DAT, MOD_INS, 0, DO(callout_data) },
|
||||||
{ "callout_fail", MOD_DAT, MOD_IN2, 0, DO(cfail) },
|
{ "callout_fail", MOD_DAT, MOD_IN2, 0, DO(cfail) },
|
||||||
{ "callout_info", MOD_PAT, MOD_CTL, CTL_CALLOUT_INFO, PO(control) },
|
{ "callout_info", MOD_PAT, MOD_CTL, CTL_CALLOUT_INFO, PO(control) },
|
||||||
{ "callout_none", MOD_DAT, MOD_CTL, CTL_CALLOUT_NONE, DO(control) },
|
{ "callout_none", MOD_DAT, MOD_CTL, CTL_CALLOUT_NONE, DO(control) },
|
||||||
{ "caseless", MOD_PATP, MOD_OPT, PCRE2_CASELESS, PO(options) },
|
{ "caseless", MOD_PATP, MOD_OPT, PCRE2_CASELESS, PO(options) },
|
||||||
{ "copy", MOD_DAT, MOD_NN, DO(copy_numbers), DO(copy_names) },
|
{ "copy", MOD_DAT, MOD_NN, DO(copy_numbers), DO(copy_names) },
|
||||||
{ "debug", MOD_PAT, MOD_CTL, CTL_DEBUG, PO(control) },
|
{ "debug", MOD_PAT, MOD_CTL, CTL_DEBUG, PO(control) },
|
||||||
{ "dfa", MOD_DAT, MOD_CTL, CTL_DFA, DO(control) },
|
{ "dfa", MOD_DAT, MOD_CTL, CTL_DFA, DO(control) },
|
||||||
{ "dfa_restart", MOD_DAT, MOD_OPT, PCRE2_DFA_RESTART, DO(options) },
|
{ "dfa_restart", MOD_DAT, MOD_OPT, PCRE2_DFA_RESTART, DO(options) },
|
||||||
{ "dfa_shortest", MOD_DAT, MOD_OPT, PCRE2_DFA_SHORTEST, DO(options) },
|
{ "dfa_shortest", MOD_DAT, MOD_OPT, PCRE2_DFA_SHORTEST, DO(options) },
|
||||||
{ "dollar_endonly", MOD_PAT, MOD_OPT, PCRE2_DOLLAR_ENDONLY, PO(options) },
|
{ "dollar_endonly", MOD_PAT, MOD_OPT, PCRE2_DOLLAR_ENDONLY, PO(options) },
|
||||||
{ "dotall", MOD_PATP, MOD_OPT, PCRE2_DOTALL, PO(options) },
|
{ "dotall", MOD_PATP, MOD_OPT, PCRE2_DOTALL, PO(options) },
|
||||||
{ "dupnames", MOD_PATP, MOD_OPT, PCRE2_DUPNAMES, PO(options) },
|
{ "dupnames", MOD_PATP, MOD_OPT, PCRE2_DUPNAMES, PO(options) },
|
||||||
{ "expand", MOD_PAT, MOD_CTL, CTL_EXPAND, PO(control) },
|
{ "expand", MOD_PAT, MOD_CTL, CTL_EXPAND, PO(control) },
|
||||||
{ "extended", MOD_PATP, MOD_OPT, PCRE2_EXTENDED, PO(options) },
|
{ "extended", MOD_PATP, MOD_OPT, PCRE2_EXTENDED, PO(options) },
|
||||||
{ "find_limits", MOD_DAT, MOD_CTL, CTL_FINDLIMITS, DO(control) },
|
{ "find_limits", MOD_DAT, MOD_CTL, CTL_FINDLIMITS, DO(control) },
|
||||||
{ "firstline", MOD_PAT, MOD_OPT, PCRE2_FIRSTLINE, PO(options) },
|
{ "firstline", MOD_PAT, MOD_OPT, PCRE2_FIRSTLINE, PO(options) },
|
||||||
{ "fullbincode", MOD_PAT, MOD_CTL, CTL_FULLBINCODE, PO(control) },
|
{ "fullbincode", MOD_PAT, MOD_CTL, CTL_FULLBINCODE, PO(control) },
|
||||||
{ "get", MOD_DAT, MOD_NN, DO(get_numbers), DO(get_names) },
|
{ "get", MOD_DAT, MOD_NN, DO(get_numbers), DO(get_names) },
|
||||||
{ "getall", MOD_DAT, MOD_CTL, CTL_GETALL, DO(control) },
|
{ "getall", MOD_DAT, MOD_CTL, CTL_GETALL, DO(control) },
|
||||||
{ "global", MOD_PNDP, MOD_CTL, CTL_GLOBAL, PO(control) },
|
{ "global", MOD_PNDP, MOD_CTL, CTL_GLOBAL, PO(control) },
|
||||||
{ "hex", MOD_PAT, MOD_CTL, CTL_HEXPAT, PO(control) },
|
{ "hex", MOD_PAT, MOD_CTL, CTL_HEXPAT, PO(control) },
|
||||||
{ "info", MOD_PAT, MOD_CTL, CTL_INFO, PO(control) },
|
{ "info", MOD_PAT, MOD_CTL, CTL_INFO, PO(control) },
|
||||||
{ "jit", MOD_PAT, MOD_IND, 7, PO(jit) },
|
{ "jit", MOD_PAT, MOD_IND, 7, PO(jit) },
|
||||||
{ "jitfast", MOD_PAT, MOD_CTL, CTL_JITFAST, PO(control) },
|
{ "jitfast", MOD_PAT, MOD_CTL, CTL_JITFAST, PO(control) },
|
||||||
{ "jitstack", MOD_DAT, MOD_INT, 0, DO(jitstack) },
|
{ "jitstack", MOD_DAT, MOD_INT, 0, DO(jitstack) },
|
||||||
{ "jitverify", MOD_PAT, MOD_CTL, CTL_JITVERIFY, PO(control) },
|
{ "jitverify", MOD_PAT, MOD_CTL, CTL_JITVERIFY, PO(control) },
|
||||||
{ "locale", MOD_PAT, MOD_STR, LOCALESIZE, PO(locale) },
|
{ "locale", MOD_PAT, MOD_STR, LOCALESIZE, PO(locale) },
|
||||||
{ "mark", MOD_PNDP, MOD_CTL, CTL_MARK, PO(control) },
|
{ "mark", MOD_PNDP, MOD_CTL, CTL_MARK, PO(control) },
|
||||||
{ "match_limit", MOD_CTM, MOD_INT, 0, MO(match_limit) },
|
{ "match_limit", MOD_CTM, MOD_INT, 0, MO(match_limit) },
|
||||||
{ "match_unset_backref", MOD_PAT, MOD_OPT, PCRE2_MATCH_UNSET_BACKREF, PO(options) },
|
{ "match_unset_backref", MOD_PAT, MOD_OPT, PCRE2_MATCH_UNSET_BACKREF, PO(options) },
|
||||||
{ "max_pattern_length", MOD_CTC, MOD_SIZ, 0, CO(max_pattern_length) },
|
{ "max_pattern_length", MOD_CTC, MOD_SIZ, 0, CO(max_pattern_length) },
|
||||||
{ "memory", MOD_PD, MOD_CTL, CTL_MEMORY, PD(control) },
|
{ "memory", MOD_PD, MOD_CTL, CTL_MEMORY, PD(control) },
|
||||||
{ "multiline", MOD_PATP, MOD_OPT, PCRE2_MULTILINE, PO(options) },
|
{ "multiline", MOD_PATP, MOD_OPT, PCRE2_MULTILINE, PO(options) },
|
||||||
{ "never_backslash_c", MOD_PAT, MOD_OPT, PCRE2_NEVER_BACKSLASH_C, PO(options) },
|
{ "never_backslash_c", MOD_PAT, MOD_OPT, PCRE2_NEVER_BACKSLASH_C, PO(options) },
|
||||||
{ "never_ucp", MOD_PAT, MOD_OPT, PCRE2_NEVER_UCP, PO(options) },
|
{ "never_ucp", MOD_PAT, MOD_OPT, PCRE2_NEVER_UCP, PO(options) },
|
||||||
{ "never_utf", MOD_PAT, MOD_OPT, PCRE2_NEVER_UTF, PO(options) },
|
{ "never_utf", MOD_PAT, MOD_OPT, PCRE2_NEVER_UTF, PO(options) },
|
||||||
{ "newline", MOD_CTC, MOD_NL, 0, CO(newline_convention) },
|
{ "newline", MOD_CTC, MOD_NL, 0, CO(newline_convention) },
|
||||||
{ "no_auto_capture", MOD_PAT, MOD_OPT, PCRE2_NO_AUTO_CAPTURE, PO(options) },
|
{ "no_auto_capture", MOD_PAT, MOD_OPT, PCRE2_NO_AUTO_CAPTURE, PO(options) },
|
||||||
{ "no_auto_possess", MOD_PATP, MOD_OPT, PCRE2_NO_AUTO_POSSESS, PO(options) },
|
{ "no_auto_possess", MOD_PATP, MOD_OPT, PCRE2_NO_AUTO_POSSESS, PO(options) },
|
||||||
{ "no_dotstar_anchor", MOD_PAT, MOD_OPT, PCRE2_NO_DOTSTAR_ANCHOR, PO(options) },
|
{ "no_dotstar_anchor", MOD_PAT, MOD_OPT, PCRE2_NO_DOTSTAR_ANCHOR, PO(options) },
|
||||||
{ "no_start_optimize", MOD_PATP, MOD_OPT, PCRE2_NO_START_OPTIMIZE, PO(options) },
|
{ "no_start_optimize", MOD_PATP, MOD_OPT, PCRE2_NO_START_OPTIMIZE, PO(options) },
|
||||||
{ "no_utf_check", MOD_PD, MOD_OPT, PCRE2_NO_UTF_CHECK, PD(options) },
|
{ "no_utf_check", MOD_PD, MOD_OPT, PCRE2_NO_UTF_CHECK, PD(options) },
|
||||||
{ "notbol", MOD_DAT, MOD_OPT, PCRE2_NOTBOL, DO(options) },
|
{ "notbol", MOD_DAT, MOD_OPT, PCRE2_NOTBOL, DO(options) },
|
||||||
{ "notempty", MOD_DAT, MOD_OPT, PCRE2_NOTEMPTY, DO(options) },
|
{ "notempty", MOD_DAT, MOD_OPT, PCRE2_NOTEMPTY, DO(options) },
|
||||||
{ "notempty_atstart", MOD_DAT, MOD_OPT, PCRE2_NOTEMPTY_ATSTART, DO(options) },
|
{ "notempty_atstart", MOD_DAT, MOD_OPT, PCRE2_NOTEMPTY_ATSTART, DO(options) },
|
||||||
{ "noteol", MOD_DAT, MOD_OPT, PCRE2_NOTEOL, DO(options) },
|
{ "noteol", MOD_DAT, MOD_OPT, PCRE2_NOTEOL, DO(options) },
|
||||||
{ "null_context", MOD_PD, MOD_CTL, CTL_NULLCONTEXT, PO(control) },
|
{ "null_context", MOD_PD, MOD_CTL, CTL_NULLCONTEXT, PO(control) },
|
||||||
{ "offset", MOD_DAT, MOD_INT, 0, DO(offset) },
|
{ "offset", MOD_DAT, MOD_INT, 0, DO(offset) },
|
||||||
{ "offset_limit", MOD_CTM, MOD_SIZ, 0, MO(offset_limit)},
|
{ "offset_limit", MOD_CTM, MOD_SIZ, 0, MO(offset_limit)},
|
||||||
{ "ovector", MOD_DAT, MOD_INT, 0, DO(oveccount) },
|
{ "ovector", MOD_DAT, MOD_INT, 0, DO(oveccount) },
|
||||||
{ "parens_nest_limit", MOD_CTC, MOD_INT, 0, CO(parens_nest_limit) },
|
{ "parens_nest_limit", MOD_CTC, MOD_INT, 0, CO(parens_nest_limit) },
|
||||||
{ "partial_hard", MOD_DAT, MOD_OPT, PCRE2_PARTIAL_HARD, DO(options) },
|
{ "partial_hard", MOD_DAT, MOD_OPT, PCRE2_PARTIAL_HARD, DO(options) },
|
||||||
{ "partial_soft", MOD_DAT, MOD_OPT, PCRE2_PARTIAL_SOFT, DO(options) },
|
{ "partial_soft", MOD_DAT, MOD_OPT, PCRE2_PARTIAL_SOFT, DO(options) },
|
||||||
{ "ph", MOD_DAT, MOD_OPT, PCRE2_PARTIAL_HARD, DO(options) },
|
{ "ph", MOD_DAT, MOD_OPT, PCRE2_PARTIAL_HARD, DO(options) },
|
||||||
{ "posix", MOD_PAT, MOD_CTL, CTL_POSIX, PO(control) },
|
{ "posix", MOD_PAT, MOD_CTL, CTL_POSIX, PO(control) },
|
||||||
{ "ps", MOD_DAT, MOD_OPT, PCRE2_PARTIAL_SOFT, DO(options) },
|
{ "ps", MOD_DAT, MOD_OPT, PCRE2_PARTIAL_SOFT, DO(options) },
|
||||||
{ "push", MOD_PAT, MOD_CTL, CTL_PUSH, PO(control) },
|
{ "push", MOD_PAT, MOD_CTL, CTL_PUSH, PO(control) },
|
||||||
{ "recursion_limit", MOD_CTM, MOD_INT, 0, MO(recursion_limit) },
|
{ "recursion_limit", MOD_CTM, MOD_INT, 0, MO(recursion_limit) },
|
||||||
{ "regerror_buffsize", MOD_PAT, MOD_INT, 0, PO(regerror_buffsize) },
|
{ "regerror_buffsize", MOD_PAT, MOD_INT, 0, PO(regerror_buffsize) },
|
||||||
{ "replace", MOD_PND, MOD_STR, REPLACE_MODSIZE, PO(replacement) },
|
{ "replace", MOD_PND, MOD_STR, REPLACE_MODSIZE, PO(replacement) },
|
||||||
{ "stackguard", MOD_PAT, MOD_INT, 0, PO(stackguard_test) },
|
{ "stackguard", MOD_PAT, MOD_INT, 0, PO(stackguard_test) },
|
||||||
{ "startchar", MOD_PND, MOD_CTL, CTL_STARTCHAR, PO(control) },
|
{ "startchar", MOD_PND, MOD_CTL, CTL_STARTCHAR, PO(control) },
|
||||||
{ "startoffset", MOD_DAT, MOD_INT, 0, DO(offset) },
|
{ "startoffset", MOD_DAT, MOD_INT, 0, DO(offset) },
|
||||||
{ "substitute_extended", MOD_PND, MOD_CTL, CTL_SUBSTITUTE_EXTENDED, PO(control) },
|
{ "substitute_extended", MOD_PND, MOD_CTL, CTL2_SUBSTITUTE_EXTENDED, PO(control2) },
|
||||||
{ "substitute_unset_empty", MOD_PND, MOD_CTL, CTL_SUBSTITUTE_UNSET_EMPTY, PO(control) },
|
{ "substitute_overflow_length", MOD_PND, MOD_CTL, CTL2_SUBSTITUTE_OVERFLOW_LENGTH, PO(control2) },
|
||||||
{ "tables", MOD_PAT, MOD_INT, 0, PO(tables_id) },
|
{ "substitute_unknown_unset", MOD_PND, MOD_CTL, CTL2_SUBSTITUTE_UNKNOWN_UNSET, PO(control2) },
|
||||||
{ "ucp", MOD_PATP, MOD_OPT, PCRE2_UCP, PO(options) },
|
{ "substitute_unset_empty", MOD_PND, MOD_CTL, CTL2_SUBSTITUTE_UNSET_EMPTY, PO(control2) },
|
||||||
{ "ungreedy", MOD_PAT, MOD_OPT, PCRE2_UNGREEDY, PO(options) },
|
{ "tables", MOD_PAT, MOD_INT, 0, PO(tables_id) },
|
||||||
{ "use_offset_limit", MOD_PAT, MOD_OPT, PCRE2_USE_OFFSET_LIMIT, PO(options) },
|
{ "ucp", MOD_PATP, MOD_OPT, PCRE2_UCP, PO(options) },
|
||||||
{ "utf", MOD_PATP, MOD_OPT, PCRE2_UTF, PO(options) },
|
{ "ungreedy", MOD_PAT, MOD_OPT, PCRE2_UNGREEDY, PO(options) },
|
||||||
{ "zero_terminate", MOD_DAT, MOD_CTL, CTL_ZERO_TERMINATE, DO(control) }
|
{ "use_offset_limit", MOD_PAT, MOD_OPT, PCRE2_USE_OFFSET_LIMIT, PO(options) },
|
||||||
|
{ "utf", MOD_PATP, MOD_OPT, PCRE2_UTF, PO(options) },
|
||||||
|
{ "zero_terminate", MOD_DAT, MOD_CTL, CTL_ZERO_TERMINATE, DO(control) }
|
||||||
};
|
};
|
||||||
|
|
||||||
#define MODLISTCOUNT sizeof(modlist)/sizeof(modstruct)
|
#define MODLISTCOUNT sizeof(modlist)/sizeof(modstruct)
|
||||||
|
@ -613,10 +631,13 @@ static modstruct modlist[] = {
|
||||||
#define POSIX_SUPPORTED_COMPILE_CONTROLS ( \
|
#define POSIX_SUPPORTED_COMPILE_CONTROLS ( \
|
||||||
CTL_AFTERTEXT|CTL_ALLAFTERTEXT|CTL_EXPAND|CTL_POSIX)
|
CTL_AFTERTEXT|CTL_ALLAFTERTEXT|CTL_EXPAND|CTL_POSIX)
|
||||||
|
|
||||||
|
#define POSIX_SUPPORTED_COMPILE_CONTROLS2 (0)
|
||||||
|
|
||||||
#define POSIX_SUPPORTED_MATCH_OPTIONS ( \
|
#define POSIX_SUPPORTED_MATCH_OPTIONS ( \
|
||||||
PCRE2_NOTBOL|PCRE2_NOTEMPTY|PCRE2_NOTEOL)
|
PCRE2_NOTBOL|PCRE2_NOTEMPTY|PCRE2_NOTEOL)
|
||||||
|
|
||||||
#define POSIX_SUPPORTED_MATCH_CONTROLS (CTL_AFTERTEXT|CTL_ALLAFTERTEXT)
|
#define POSIX_SUPPORTED_MATCH_CONTROLS (CTL_AFTERTEXT|CTL_ALLAFTERTEXT)
|
||||||
|
#define POSIX_SUPPORTED_MATCH_CONTROLS2 (0)
|
||||||
|
|
||||||
/* Control bits that are not ignored with 'push'. */
|
/* Control bits that are not ignored with 'push'. */
|
||||||
|
|
||||||
|
@ -624,23 +645,27 @@ static modstruct modlist[] = {
|
||||||
CTL_BINCODE|CTL_CALLOUT_INFO|CTL_FULLBINCODE|CTL_HEXPAT|CTL_INFO| \
|
CTL_BINCODE|CTL_CALLOUT_INFO|CTL_FULLBINCODE|CTL_HEXPAT|CTL_INFO| \
|
||||||
CTL_JITVERIFY|CTL_MEMORY|CTL_PUSH|CTL_BSR_SET|CTL_NL_SET)
|
CTL_JITVERIFY|CTL_MEMORY|CTL_PUSH|CTL_BSR_SET|CTL_NL_SET)
|
||||||
|
|
||||||
|
#define PUSH_SUPPORTED_COMPILE_CONTROLS2 (0)
|
||||||
|
|
||||||
/* Controls that apply only at compile time with 'push'. */
|
/* Controls that apply only at compile time with 'push'. */
|
||||||
|
|
||||||
#define PUSH_COMPILE_ONLY_CONTROLS CTL_JITVERIFY
|
#define PUSH_COMPILE_ONLY_CONTROLS CTL_JITVERIFY
|
||||||
|
#define PUSH_COMPILE_ONLY_CONTROLS2 (0)
|
||||||
|
|
||||||
/* Controls that are forbidden with #pop. */
|
/* Controls that are forbidden with #pop. */
|
||||||
|
|
||||||
#define NOTPOP_CONTROLS (CTL_HEXPAT|CTL_POSIX|CTL_PUSH)
|
#define NOTPOP_CONTROLS (CTL_HEXPAT|CTL_POSIX|CTL_PUSH)
|
||||||
|
|
||||||
/* Pattern controls that are mutually exclusive. */
|
/* Pattern controls that are mutually exclusive. At present these are all in
|
||||||
|
the first control word. */
|
||||||
|
|
||||||
static uint32_t exclusive_pat_controls[] = {
|
static uint32_t exclusive_pat_controls[] = {
|
||||||
CTL_POSIX | CTL_HEXPAT,
|
CTL_POSIX | CTL_HEXPAT,
|
||||||
CTL_POSIX | CTL_PUSH,
|
CTL_POSIX | CTL_PUSH,
|
||||||
CTL_EXPAND | CTL_HEXPAT };
|
CTL_EXPAND | CTL_HEXPAT };
|
||||||
|
|
||||||
/* Data controls that are mutually exclusive. */
|
/* Data controls that are mutually exclusive. At present these are all in the
|
||||||
|
first control word. */
|
||||||
static uint32_t exclusive_dat_controls[] = {
|
static uint32_t exclusive_dat_controls[] = {
|
||||||
CTL_ALLUSEDTEXT | CTL_STARTCHAR,
|
CTL_ALLUSEDTEXT | CTL_STARTCHAR,
|
||||||
CTL_FINDLIMITS | CTL_NULLCONTEXT };
|
CTL_FINDLIMITS | CTL_NULLCONTEXT };
|
||||||
|
@ -3528,15 +3553,16 @@ words.
|
||||||
|
|
||||||
Arguments:
|
Arguments:
|
||||||
controls control bits
|
controls control bits
|
||||||
|
controls2 more control bits
|
||||||
before text to print before
|
before text to print before
|
||||||
|
|
||||||
Returns: nothing
|
Returns: nothing
|
||||||
*/
|
*/
|
||||||
|
|
||||||
static void
|
static void
|
||||||
show_controls(uint32_t controls, const char *before)
|
show_controls(uint32_t controls, uint32_t controls2, const char *before)
|
||||||
{
|
{
|
||||||
fprintf(outfile, "%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s",
|
fprintf(outfile, "%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s",
|
||||||
before,
|
before,
|
||||||
((controls & CTL_AFTERTEXT) != 0)? " aftertext" : "",
|
((controls & CTL_AFTERTEXT) != 0)? " aftertext" : "",
|
||||||
((controls & CTL_ALLAFTERTEXT) != 0)? " allaftertext" : "",
|
((controls & CTL_ALLAFTERTEXT) != 0)? " allaftertext" : "",
|
||||||
|
@ -3565,8 +3591,10 @@ fprintf(outfile, "%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s
|
||||||
((controls & CTL_POSIX) != 0)? " posix" : "",
|
((controls & CTL_POSIX) != 0)? " posix" : "",
|
||||||
((controls & CTL_PUSH) != 0)? " push" : "",
|
((controls & CTL_PUSH) != 0)? " push" : "",
|
||||||
((controls & CTL_STARTCHAR) != 0)? " startchar" : "",
|
((controls & CTL_STARTCHAR) != 0)? " startchar" : "",
|
||||||
((controls & CTL_SUBSTITUTE_EXTENDED) != 0)? " substitute_extended" : "",
|
((controls2 & CTL2_SUBSTITUTE_EXTENDED) != 0)? " substitute_extended" : "",
|
||||||
((controls & CTL_SUBSTITUTE_UNSET_EMPTY) != 0)? " substitute_unset_empty" : "",
|
((controls2 & CTL2_SUBSTITUTE_OVERFLOW_LENGTH) != 0)? " substitute_overflow_length" : "",
|
||||||
|
((controls2 & CTL2_SUBSTITUTE_UNKNOWN_UNSET) != 0)? " substitute_unknown_unset" : "",
|
||||||
|
((controls2 & CTL2_SUBSTITUTE_UNSET_EMPTY) != 0)? " substitute_unset_empty" : "",
|
||||||
((controls & CTL_ZERO_TERMINATE) != 0)? " zero_terminate" : "");
|
((controls & CTL_ZERO_TERMINATE) != 0)? " zero_terminate" : "");
|
||||||
}
|
}
|
||||||
|
|
||||||
|
@ -4398,14 +4426,15 @@ patlen = p - buffer - 2;
|
||||||
if (!decode_modifiers(p, CTX_PAT, &pat_patctl, NULL)) return PR_SKIP;
|
if (!decode_modifiers(p, CTX_PAT, &pat_patctl, NULL)) return PR_SKIP;
|
||||||
utf = (pat_patctl.options & PCRE2_UTF) != 0;
|
utf = (pat_patctl.options & PCRE2_UTF) != 0;
|
||||||
|
|
||||||
/* Check for mutually exclusive modifiers. */
|
/* Check for mutually exclusive modifiers. At present, these are all in the
|
||||||
|
first control word. */
|
||||||
|
|
||||||
for (k = 0; k < sizeof(exclusive_pat_controls)/sizeof(uint32_t); k++)
|
for (k = 0; k < sizeof(exclusive_pat_controls)/sizeof(uint32_t); k++)
|
||||||
{
|
{
|
||||||
uint32_t c = pat_patctl.control & exclusive_pat_controls[k];
|
uint32_t c = pat_patctl.control & exclusive_pat_controls[k];
|
||||||
if (c != 0 && c != (c & (~c+1)))
|
if (c != 0 && c != (c & (~c+1)))
|
||||||
{
|
{
|
||||||
show_controls(c, "** Not allowed together:");
|
show_controls(c, 0, "** Not allowed together:");
|
||||||
fprintf(outfile, "\n");
|
fprintf(outfile, "\n");
|
||||||
return PR_SKIP;
|
return PR_SKIP;
|
||||||
}
|
}
|
||||||
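Note on the exclusivity test in the hunk above: `c != (c & (~c+1))` is true exactly when more than one bit of `c` is set, because `c & (~c+1)` isolates the lowest set bit. The following is a minimal standalone sketch of that check, not code from pcre2test; the `DEMO_*` flags and the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical flag bits, for illustration only. */
#define DEMO_A 0x01u
#define DEMO_B 0x02u
#define DEMO_C 0x04u

/* Returns 1 when more than one bit of c is set, 0 otherwise.
   Unsigned arithmetic wraps, so (~c + 1) is the two's complement of c,
   and c & (~c + 1) keeps only the lowest set bit. */
int more_than_one_bit(uint32_t c)
{
  uint32_t lowest = c & (~c + 1u);
  return c != 0 && c != lowest;
}

/* Returns 1 when controls contains two or more flags from exclusive_mask. */
int conflicts(uint32_t controls, uint32_t exclusive_mask)
{
  assert(exclusive_mask != 0);   /* an empty mask cannot express a conflict */
  return more_than_one_bit(controls & exclusive_mask);
}
```

With `DEMO_A | DEMO_B` as the mask, `conflicts(DEMO_A | DEMO_C, mask)` is 0 because only one flag from the exclusive set is present, while `conflicts(DEMO_A | DEMO_B, mask)` is 1.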
|
@ -4605,9 +4634,11 @@ if ((pat_patctl.control & CTL_POSIX) != 0)
|
||||||
pat_patctl.options & ~POSIX_SUPPORTED_COMPILE_OPTIONS, msg, "");
|
pat_patctl.options & ~POSIX_SUPPORTED_COMPILE_OPTIONS, msg, "");
|
||||||
msg = "";
|
msg = "";
|
||||||
}
|
}
|
||||||
if ((pat_patctl.control & ~POSIX_SUPPORTED_COMPILE_CONTROLS) != 0)
|
if ((pat_patctl.control & ~POSIX_SUPPORTED_COMPILE_CONTROLS) != 0 ||
|
||||||
|
(pat_patctl.control2 & ~POSIX_SUPPORTED_COMPILE_CONTROLS2) != 0)
|
||||||
{
|
{
|
||||||
show_controls(pat_patctl.control & ~POSIX_SUPPORTED_COMPILE_CONTROLS, msg);
|
show_controls(pat_patctl.control & ~POSIX_SUPPORTED_COMPILE_CONTROLS,
|
||||||
|
pat_patctl.control2 & ~POSIX_SUPPORTED_COMPILE_CONTROLS2, msg);
|
||||||
msg = "";
|
msg = "";
|
||||||
}
|
}
|
||||||
|
|
||||||
|
@ -4663,15 +4694,19 @@ if ((pat_patctl.control & CTL_PUSH) != 0)
|
||||||
fprintf(outfile, "** Replacement text is not supported with 'push'.\n");
|
fprintf(outfile, "** Replacement text is not supported with 'push'.\n");
|
||||||
return PR_OK;
|
return PR_OK;
|
||||||
}
|
}
|
||||||
if ((pat_patctl.control & ~PUSH_SUPPORTED_COMPILE_CONTROLS) != 0)
|
if ((pat_patctl.control & ~PUSH_SUPPORTED_COMPILE_CONTROLS) != 0 ||
|
||||||
|
(pat_patctl.control2 & ~PUSH_SUPPORTED_COMPILE_CONTROLS2) != 0)
|
||||||
{
|
{
|
||||||
show_controls(pat_patctl.control & ~PUSH_SUPPORTED_COMPILE_CONTROLS,
|
show_controls(pat_patctl.control & ~PUSH_SUPPORTED_COMPILE_CONTROLS,
|
||||||
|
pat_patctl.control2 & ~PUSH_SUPPORTED_COMPILE_CONTROLS2,
|
||||||
"** Ignored when compiled pattern is stacked with 'push':");
|
"** Ignored when compiled pattern is stacked with 'push':");
|
||||||
fprintf(outfile, "\n");
|
fprintf(outfile, "\n");
|
||||||
}
|
}
|
||||||
if ((pat_patctl.control & PUSH_COMPILE_ONLY_CONTROLS) != 0)
|
if ((pat_patctl.control & PUSH_COMPILE_ONLY_CONTROLS) != 0 ||
|
||||||
|
(pat_patctl.control2 & PUSH_COMPILE_ONLY_CONTROLS2) != 0)
|
||||||
{
|
{
|
||||||
show_controls(pat_patctl.control & PUSH_COMPILE_ONLY_CONTROLS,
|
show_controls(pat_patctl.control & PUSH_COMPILE_ONLY_CONTROLS,
|
||||||
|
pat_patctl.control2 & PUSH_COMPILE_ONLY_CONTROLS2,
|
||||||
"** Applies only to compile when pattern is stacked with 'push':");
|
"** Applies only to compile when pattern is stacked with 'push':");
|
||||||
fprintf(outfile, "\n");
|
fprintf(outfile, "\n");
|
||||||
}
|
}
|
||||||
|
@ -5340,6 +5375,7 @@ matching. */
|
||||||
DATCTXCPY(dat_context, default_dat_context);
|
DATCTXCPY(dat_context, default_dat_context);
|
||||||
memcpy(&dat_datctl, &def_datctl, sizeof(datctl));
|
memcpy(&dat_datctl, &def_datctl, sizeof(datctl));
|
||||||
dat_datctl.control |= (pat_patctl.control & CTL_ALLPD);
|
dat_datctl.control |= (pat_patctl.control & CTL_ALLPD);
|
||||||
|
dat_datctl.control2 |= (pat_patctl.control2 & CTL2_ALLPD);
|
||||||
strcpy((char *)dat_datctl.replacement, (char *)pat_patctl.replacement);
|
strcpy((char *)dat_datctl.replacement, (char *)pat_patctl.replacement);
|
||||||
|
|
||||||
/* Initialize for scanning the data line. */
|
/* Initialize for scanning the data line. */
|
||||||
|
@ -5657,14 +5693,15 @@ ulen = len/code_unit_size; /* Length in code units */
|
||||||
if (p[-1] != 0 && !decode_modifiers(p, CTX_DAT, NULL, &dat_datctl))
|
if (p[-1] != 0 && !decode_modifiers(p, CTX_DAT, NULL, &dat_datctl))
|
||||||
return PR_OK;
|
return PR_OK;
|
||||||
|
|
||||||
/* Check for mutually exclusive modifiers. */
|
/* Check for mutually exclusive modifiers. At present, these are all in the
|
||||||
|
first control word. */
|
||||||
|
|
||||||
for (k = 0; k < sizeof(exclusive_dat_controls)/sizeof(uint32_t); k++)
|
for (k = 0; k < sizeof(exclusive_dat_controls)/sizeof(uint32_t); k++)
|
||||||
{
|
{
|
||||||
c = dat_datctl.control & exclusive_dat_controls[k];
|
c = dat_datctl.control & exclusive_dat_controls[k];
|
||||||
if (c != 0 && c != (c & (~c+1)))
|
if (c != 0 && c != (c & (~c+1)))
|
||||||
{
|
{
|
||||||
show_controls(c, "** Not allowed together:");
|
show_controls(c, 0, "** Not allowed together:");
|
||||||
fprintf(outfile, "\n");
|
fprintf(outfile, "\n");
|
||||||
return PR_OK;
|
return PR_OK;
|
||||||
}
|
}
|
||||||
|
@ -5717,9 +5754,11 @@ if ((pat_patctl.control & CTL_POSIX) != 0)
|
||||||
show_match_options(dat_datctl.options & ~POSIX_SUPPORTED_MATCH_OPTIONS);
|
show_match_options(dat_datctl.options & ~POSIX_SUPPORTED_MATCH_OPTIONS);
|
||||||
msg = "";
|
msg = "";
|
||||||
}
|
}
|
||||||
if ((dat_datctl.control & ~POSIX_SUPPORTED_MATCH_CONTROLS) != 0)
|
if ((dat_datctl.control & ~POSIX_SUPPORTED_MATCH_CONTROLS) != 0 ||
|
||||||
|
(dat_datctl.control2 & ~POSIX_SUPPORTED_MATCH_CONTROLS2) != 0)
|
||||||
{
|
{
|
||||||
show_controls(dat_datctl.control & ~POSIX_SUPPORTED_MATCH_CONTROLS, msg);
|
show_controls(dat_datctl.control & ~POSIX_SUPPORTED_MATCH_CONTROLS,
|
||||||
|
dat_datctl.control2 & ~POSIX_SUPPORTED_MATCH_CONTROLS2, msg);
|
||||||
msg = "";
|
msg = "";
|
||||||
}
|
}
|
||||||
|
|
||||||
|
@ -5891,9 +5930,13 @@ if (dat_datctl.replacement[0] != 0)
|
||||||
|
|
||||||
xoptions = (((dat_datctl.control & CTL_GLOBAL) == 0)? 0 :
|
xoptions = (((dat_datctl.control & CTL_GLOBAL) == 0)? 0 :
|
||||||
PCRE2_SUBSTITUTE_GLOBAL) |
|
PCRE2_SUBSTITUTE_GLOBAL) |
|
||||||
(((dat_datctl.control & CTL_SUBSTITUTE_EXTENDED) == 0)? 0 :
|
(((dat_datctl.control2 & CTL2_SUBSTITUTE_EXTENDED) == 0)? 0 :
|
||||||
PCRE2_SUBSTITUTE_EXTENDED) |
|
PCRE2_SUBSTITUTE_EXTENDED) |
|
||||||
(((dat_datctl.control & CTL_SUBSTITUTE_UNSET_EMPTY) == 0)? 0 :
|
(((dat_datctl.control2 & CTL2_SUBSTITUTE_OVERFLOW_LENGTH) == 0)? 0 :
|
||||||
|
PCRE2_SUBSTITUTE_OVERFLOW_LENGTH) |
|
||||||
|
(((dat_datctl.control2 & CTL2_SUBSTITUTE_UNKNOWN_UNSET) == 0)? 0 :
|
||||||
|
PCRE2_SUBSTITUTE_UNKNOWN_UNSET) |
|
||||||
|
(((dat_datctl.control2 & CTL2_SUBSTITUTE_UNSET_EMPTY) == 0)? 0 :
|
||||||
PCRE2_SUBSTITUTE_UNSET_EMPTY);
|
PCRE2_SUBSTITUTE_UNSET_EMPTY);
|
||||||
|
|
||||||
SETCASTPTR(r, rbuffer); /* Sets r8, r16, or r32, as appropriate. */
|
SETCASTPTR(r, rbuffer); /* Sets r8, r16, or r32, as appropriate. */
|
||||||
|
@ -5987,12 +6030,16 @@ if (dat_datctl.replacement[0] != 0)
|
||||||
|
|
||||||
if (rc < 0)
|
if (rc < 0)
|
||||||
{
|
{
|
||||||
|
PCRE2_SIZE msize;
|
||||||
fprintf(outfile, "Failed: error %d", rc);
|
fprintf(outfile, "Failed: error %d", rc);
|
||||||
if (nsize != PCRE2_UNSET)
|
if (rc != PCRE2_ERROR_NOMEMORY && nsize != PCRE2_UNSET)
|
||||||
fprintf(outfile, " at offset %ld in replacement", nsize);
|
fprintf(outfile, " at offset %ld in replacement", nsize);
|
||||||
fprintf(outfile, ": ");
|
fprintf(outfile, ": ");
|
||||||
PCRE2_GET_ERROR_MESSAGE(nsize, rc, pbuffer);
|
PCRE2_GET_ERROR_MESSAGE(msize, rc, pbuffer);
|
||||||
PCHARSV(CASTVAR(void *, pbuffer), 0, nsize, FALSE, outfile);
|
PCHARSV(CASTVAR(void *, pbuffer), 0, msize, FALSE, outfile);
|
||||||
|
if (rc == PCRE2_ERROR_NOMEMORY &&
|
||||||
|
(xoptions & PCRE2_SUBSTITUTE_OVERFLOW_LENGTH) != 0)
|
||||||
|
fprintf(outfile, ": %ld code units are needed", nsize);
|
||||||
}
|
}
|
||||||
else
|
else
|
||||||
{
|
{
|
||||||
|
@ -6850,7 +6897,8 @@ control blocks must be the same so that common options and controls such as
|
||||||
We cannot test this till runtime because "offsetof" does not work in the
|
We cannot test this till runtime because "offsetof" does not work in the
|
||||||
preprocessor. */
|
preprocessor. */
|
||||||
|
|
||||||
if (PO(options) != DO(options) || PO(control) != DO(control))
|
if (PO(options) != DO(options) || PO(control) != DO(control) ||
|
||||||
|
PO(control2) != DO(control2))
|
||||||
{
|
{
|
||||||
fprintf(stderr, "** Coding error: "
|
fprintf(stderr, "** Coding error: "
|
||||||
"options and control offsets for pattern and data must be the same.\n");
|
"options and control offsets for pattern and data must be the same.\n");
|
||||||
|
|
|
@ -4042,8 +4042,6 @@
|
||||||
|
|
||||||
/(((((a)))))/parens_nest_limit=2
|
/(((((a)))))/parens_nest_limit=2
|
||||||
|
|
||||||
# Tests for pcre2_substitute()
|
|
||||||
|
|
||||||
/abc/replace=XYZ
|
/abc/replace=XYZ
|
||||||
123123
|
123123
|
||||||
123abc123
|
123abc123
|
||||||
|
@ -4149,11 +4147,24 @@
|
||||||
|
|
||||||
/(*:pear)apple|(*:orange)lemon|(*:strawberry)blackberry/g,replace=[22]${*MARK}
|
/(*:pear)apple|(*:orange)lemon|(*:strawberry)blackberry/g,replace=[22]${*MARK}
|
||||||
apple lemon blackberry
|
apple lemon blackberry
|
||||||
|
apple lemon blackberry\=substitute_overflow_length
|
||||||
|
|
||||||
/(*:pear)apple|(*:orange)lemon|(*:strawberry)blackberry/g,replace=[23]${*MARK}
|
/(*:pear)apple|(*:orange)lemon|(*:strawberry)blackberry/g,replace=[23]${*MARK}
|
||||||
apple lemon blackberry
|
apple lemon blackberry
|
||||||
|
|
||||||
# End of substitute tests
|
/abc/
|
||||||
|
123abc123\=replace=[9]XYZ
|
||||||
|
123abc123\=substitute_overflow_length,replace=[9]XYZ
|
||||||
|
123abc123\=substitute_overflow_length,replace=[6]XYZ
|
||||||
|
123abc123\=substitute_overflow_length,replace=[1]XYZ
|
||||||
|
123abc123\=substitute_overflow_length,replace=[0]XYZ
|
||||||
|
|
||||||
|
/a(b)c/
|
||||||
|
123abc123\=replace=[9]x$1z
|
||||||
|
123abc123\=substitute_overflow_length,replace=[9]x$1z
|
||||||
|
123abc123\=substitute_overflow_length,replace=[6]x$1z
|
||||||
|
123abc123\=substitute_overflow_length,replace=[1]x$1z
|
||||||
|
123abc123\=substitute_overflow_length,replace=[0]x$1z
|
||||||
|
|
||||||
"((?=(?(?=(?(?=(?(?=()))))))))"
|
"((?=(?(?=(?(?=(?(?=()))))))))"
|
||||||
a
|
a
|
||||||
|
@ -4749,12 +4760,24 @@ a)"xI
|
||||||
cat\=replace=>$1<,substitute_unset_empty
|
cat\=replace=>$1<,substitute_unset_empty
|
||||||
xbcom\=replace=>$1<,substitute_unset_empty
|
xbcom\=replace=>$1<,substitute_unset_empty
|
||||||
|
|
||||||
|
/a|(b)c/substitute_extended
|
||||||
|
cat\=replace=>${2:-xx}<
|
||||||
|
cat\=replace=>${2:-xx}<,substitute_unknown_unset
|
||||||
|
cat\=replace=>${X:-xx}<,substitute_unknown_unset
|
||||||
|
|
||||||
/a|(?'X'b)c/replace=>$X<,substitute_unset_empty
|
/a|(?'X'b)c/replace=>$X<,substitute_unset_empty
|
||||||
cat
|
cat
|
||||||
xbcom
|
xbcom
|
||||||
|
|
||||||
|
/a|(?'X'b)c/replace=>$Y<,substitute_unset_empty
|
||||||
|
cat
|
||||||
|
cat\=substitute_unknown_unset
|
||||||
|
cat\=substitute_unknown_unset,-substitute_unset_empty
|
||||||
|
|
||||||
/a|(b)c/replace=>$2<,substitute_unset_empty
|
/a|(b)c/replace=>$2<,substitute_unset_empty
|
||||||
cat
|
cat
|
||||||
|
cat\=substitute_unknown_unset
|
||||||
|
cat\=substitute_unknown_unset,-substitute_unset_empty
|
||||||
|
|
||||||
/()()()/use_offset_limit
|
/()()()/use_offset_limit
|
||||||
\=ovector=11000000000
|
\=ovector=11000000000
|
||||||
|
|
|
@ -13432,8 +13432,6 @@ Subject length lower bound = 0
|
||||||
/(((((a)))))/parens_nest_limit=2
|
/(((((a)))))/parens_nest_limit=2
|
||||||
Failed: error 119 at offset 3: parentheses are too deeply nested
|
Failed: error 119 at offset 3: parentheses are too deeply nested
|
||||||
|
|
||||||
# Tests for pcre2_substitute()
|
|
||||||
|
|
||||||
/abc/replace=XYZ
|
/abc/replace=XYZ
|
||||||
123123
|
123123
|
||||||
0: 123123
|
0: 123123
|
||||||
|
@ -13583,12 +13581,36 @@ Failed: error -35 at offset 9 in replacement: invalid replacement string
|
||||||
/(*:pear)apple|(*:orange)lemon|(*:strawberry)blackberry/g,replace=[22]${*MARK}
|
/(*:pear)apple|(*:orange)lemon|(*:strawberry)blackberry/g,replace=[22]${*MARK}
|
||||||
apple lemon blackberry
|
apple lemon blackberry
|
||||||
Failed: error -48: no more memory
|
Failed: error -48: no more memory
|
||||||
|
apple lemon blackberry\=substitute_overflow_length
|
||||||
|
Failed: error -48: no more memory: 23 code units are needed
|
||||||
|
|
||||||
/(*:pear)apple|(*:orange)lemon|(*:strawberry)blackberry/g,replace=[23]${*MARK}
|
/(*:pear)apple|(*:orange)lemon|(*:strawberry)blackberry/g,replace=[23]${*MARK}
|
||||||
apple lemon blackberry
|
apple lemon blackberry
|
||||||
3: pear orange strawberry
|
3: pear orange strawberry
|
||||||
|
|
||||||
# End of substitute tests
|
/abc/
|
||||||
|
123abc123\=replace=[9]XYZ
|
||||||
|
Failed: error -48: no more memory
|
||||||
|
123abc123\=substitute_overflow_length,replace=[9]XYZ
|
||||||
|
Failed: error -48: no more memory: 10 code units are needed
|
||||||
|
123abc123\=substitute_overflow_length,replace=[6]XYZ
|
||||||
|
Failed: error -48: no more memory: 10 code units are needed
|
||||||
|
123abc123\=substitute_overflow_length,replace=[1]XYZ
|
||||||
|
Failed: error -48: no more memory: 10 code units are needed
|
||||||
|
123abc123\=substitute_overflow_length,replace=[0]XYZ
|
||||||
|
Failed: error -48: no more memory: 10 code units are needed
|
||||||
|
|
||||||
|
/a(b)c/
|
||||||
|
123abc123\=replace=[9]x$1z
|
||||||
|
Failed: error -48: no more memory
|
||||||
|
123abc123\=substitute_overflow_length,replace=[9]x$1z
|
||||||
|
Failed: error -48: no more memory: 10 code units are needed
|
||||||
|
123abc123\=substitute_overflow_length,replace=[6]x$1z
|
||||||
|
Failed: error -48: no more memory: 10 code units are needed
|
||||||
|
123abc123\=substitute_overflow_length,replace=[1]x$1z
|
||||||
|
Failed: error -48: no more memory: 10 code units are needed
|
||||||
|
123abc123\=substitute_overflow_length,replace=[0]x$1z
|
||||||
|
Failed: error -48: no more memory: 10 code units are needed
|
||||||
|
|
||||||
"((?=(?(?=(?(?=(?(?=()))))))))"
|
"((?=(?(?=(?(?=(?(?=()))))))))"
|
||||||
a
|
a
|
||||||
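Note on the expected output above: with `substitute_overflow_length`, every undersized buffer (`[9]`, `[6]`, `[1]`, `[0]`) reports the same requirement of 10 code units, which is the 9 code units of the result plus the terminating zero. The sketch below imitates that behaviour with a plain literal replacement. It is an illustration under stated assumptions, not the PCRE2 implementation: `demo_substitute` and the `DEMO_*` names are hypothetical, only the error number -48 is taken from the test output, and leaving `*outlen` untouched when the option is absent is a choice made for this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define DEMO_OVERFLOW_LENGTH 0x1u  /* hypothetical stand-in for the new option */
#define DEMO_ERROR_NOMEMORY  (-48) /* error number shown in the test output */

/* Replace the first occurrence of find in subject with repl.
   On entry *outlen is the capacity of out in code units. On success the
   zero-terminated result is in out and *outlen is its length without the
   terminator. On overflow the return value is DEMO_ERROR_NOMEMORY; if
   DEMO_OVERFLOW_LENGTH is set, the scan continues without writing and
   *outlen becomes the capacity that would have been enough. */
int demo_substitute(const char *subject, const char *find, const char *repl,
                    unsigned options, char *out, size_t *outlen)
{
  size_t cap = *outlen;
  const char *hit = (*find != '\0') ? strstr(subject, find) : NULL;
  const char *piece[3];
  size_t len[3];
  size_t need = 0;
  int overflow = 0;

  assert(out != NULL || cap == 0);

  /* The result is: text before the match, the replacement, text after. */
  piece[0] = subject;
  len[0] = hit ? (size_t)(hit - subject) : strlen(subject);
  piece[1] = hit ? repl : "";
  len[1] = strlen(piece[1]);
  piece[2] = hit ? hit + strlen(find) : "";
  len[2] = strlen(piece[2]);

  for (int i = 0; i < 3; i++)
    {
    if (!overflow && need + len[i] < cap)  /* leaves room for the terminator */
      memcpy(out + need, piece[i], len[i]);
    else
      {
      if ((options & DEMO_OVERFLOW_LENGTH) == 0) return DEMO_ERROR_NOMEMORY;
      overflow = 1;                        /* keep counting, stop writing */
      }
    need += len[i];
    }

  if (overflow)
    {
    *outlen = need + 1;                    /* include the terminating zero */
    return DEMO_ERROR_NOMEMORY;
    }

  out[need] = '\0';
  *outlen = need;
  return hit ? 1 : 0;
}
```

A caller can therefore make one attempt with a small buffer, read the required size from `*outlen` after the -48 return, allocate that many code units, and call again.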
|
@ -15075,15 +15097,35 @@ Failed: error -55 at offset 3 in replacement: requested value is not set
|
||||||
xbcom\=replace=>$1<,substitute_unset_empty
|
xbcom\=replace=>$1<,substitute_unset_empty
|
||||||
1: x>b<om
|
1: x>b<om
|
||||||
|
|
||||||
|
/a|(b)c/substitute_extended
|
||||||
|
cat\=replace=>${2:-xx}<
|
||||||
|
Failed: error -49 at offset 9 in replacement: unknown substring
|
||||||
|
cat\=replace=>${2:-xx}<,substitute_unknown_unset
|
||||||
|
1: c>xx<t
|
||||||
|
cat\=replace=>${X:-xx}<,substitute_unknown_unset
|
||||||
|
1: c>xx<t
|
||||||
|
|
||||||
/a|(?'X'b)c/replace=>$X<,substitute_unset_empty
|
/a|(?'X'b)c/replace=>$X<,substitute_unset_empty
|
||||||
cat
|
cat
|
||||||
1: c><t
|
1: c><t
|
||||||
xbcom
|
xbcom
|
||||||
1: x>b<om
|
1: x>b<om
|
||||||
|
|
||||||
|
/a|(?'X'b)c/replace=>$Y<,substitute_unset_empty
|
||||||
|
cat
|
||||||
|
Failed: error -49 at offset 3 in replacement: unknown substring
|
||||||
|
cat\=substitute_unknown_unset
|
||||||
|
1: c><t
|
||||||
|
cat\=substitute_unknown_unset,-substitute_unset_empty
|
||||||
|
Failed: error -55 at offset 3 in replacement: requested value is not set
|
||||||
|
|
||||||
/a|(b)c/replace=>$2<,substitute_unset_empty
|
/a|(b)c/replace=>$2<,substitute_unset_empty
|
||||||
cat
|
cat
|
||||||
Failed: error -49 at offset 3 in replacement: unknown substring
|
Failed: error -49 at offset 3 in replacement: unknown substring
|
||||||
|
cat\=substitute_unknown_unset
|
||||||
|
1: c><t
|
||||||
|
cat\=substitute_unknown_unset,-substitute_unset_empty
|
||||||
|
Failed: error -55 at offset 3 in replacement: requested value is not set
|
||||||
|
|
||||||
/()()()/use_offset_limit
|
/()()()/use_offset_limit
|
||||||
\=ovector=11000000000
|
\=ovector=11000000000
|
||||||
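Note on the expected output above: for a simple insert such as `$2` or `$Y`, the results follow a two-step rule. An unknown group is an error (-49) unless `substitute_unknown_unset` is given, in which case it is treated as unset; an unset group is an error (-55) unless `substitute_unset_empty` is given, in which case it inserts the empty string. The sketch below encodes only that decision. It is a hypothetical model derived from the test results, not PCRE2 code: all names are invented, and it does not cover the extended `${2:-xx}` form, where the default text is used instead.

```c
#include <assert.h>

#define DEMO_UNKNOWN_UNSET     0x1u   /* hypothetical stand-ins for the options */
#define DEMO_UNSET_EMPTY       0x2u
#define DEMO_ERROR_NOSUBSTRING (-49)  /* "unknown substring" */
#define DEMO_ERROR_UNSET       (-55)  /* "requested value is not set" */

enum group_state   { GROUP_SET, GROUP_UNSET, GROUP_UNKNOWN };
enum insert_action { INSERT_VALUE = 0, INSERT_EMPTY = 1 };

/* Decide what a simple $n or $name insert does. Returns INSERT_VALUE,
   INSERT_EMPTY, or one of the negative error numbers above. */
int resolve_simple_insert(enum group_state state, unsigned options)
{
  assert(state == GROUP_SET || state == GROUP_UNSET || state == GROUP_UNKNOWN);

  if (state == GROUP_UNKNOWN)
    {
    if ((options & DEMO_UNKNOWN_UNSET) == 0) return DEMO_ERROR_NOSUBSTRING;
    state = GROUP_UNSET;              /* from here on, same as an unset group */
    }

  if (state == GROUP_UNSET)
    return ((options & DEMO_UNSET_EMPTY) != 0)? INSERT_EMPTY : DEMO_ERROR_UNSET;

  return INSERT_VALUE;
}
```

This matches the three `cat` lines for `/a|(b)c/replace=>$2<,substitute_unset_empty`: error -49 by default, `c><t` once `substitute_unknown_unset` is added, and error -55 when `substitute_unset_empty` is then switched off.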
|
|