Upgrade the as yet unreleased substitute callout facility.
parent
900f457222
commit
9bc81d5229
|
@ -20,7 +20,7 @@ SYNOPSIS
|
|||
</P>
|
||||
<P>
|
||||
<b>int pcre2_set_substitute_callout(pcre2_match_context *<i>mcontext</i>,</b>
|
||||
<b> void (*<i>callout_function</i>)(pcre2_substitute_callout_block *),</b>
|
||||
<b> int (*<i>callout_function</i>)(pcre2_substitute_callout_block *),</b>
|
||||
<b> void *<i>callout_data</i>);</b>
|
||||
</P>
|
||||
<br><b>
|
||||
|
|
|
@ -183,7 +183,7 @@ document for an overview of all the PCRE2 documentation.
|
|||
<br>
|
||||
<br>
|
||||
<b>int pcre2_set_substitute_callout(pcre2_match_context *<i>mcontext</i>,</b>
|
||||
<b> void (*<i>callout_function</i>)(pcre2_substitute_callout_block *, void *),</b>
|
||||
<b> int (*<i>callout_function</i>)(pcre2_substitute_callout_block *, void *),</b>
|
||||
<b> void *<i>callout_data</i>);</b>
|
||||
<br>
|
||||
<br>
|
||||
|
@ -924,7 +924,7 @@ documentation.
|
|||
<br>
|
||||
<br>
|
||||
<b>int pcre2_set_substitute_callout(pcre2_match_context *<i>mcontext</i>,</b>
|
||||
<b> void (*<i>callout_function</i>)(pcre2_substitute_callout_block *, void *),</b>
|
||||
<b> int (*<i>callout_function</i>)(pcre2_substitute_callout_block *, void *),</b>
|
||||
<b> void *<i>callout_data</i>);</b>
|
||||
<br>
|
||||
<br>
|
||||
|
@ -3413,9 +3413,9 @@ substitutions. However, PCRE2_SUBSTITUTE_UNKNOWN_UNSET does cause unknown
|
|||
groups in the extended syntax forms to be treated as unset.
|
||||
</P>
|
||||
<P>
|
||||
If successful, <b>pcre2_substitute()</b> returns the number of replacements that
|
||||
were made. This may be zero if no matches were found, and is never greater than
|
||||
1 unless PCRE2_SUBSTITUTE_GLOBAL is set.
|
||||
If successful, <b>pcre2_substitute()</b> returns the number of successful
|
||||
matches. This may be zero if no matches were found, and is never greater than 1
|
||||
unless PCRE2_SUBSTITUTE_GLOBAL is set.
|
||||
</P>
|
||||
<P>
|
||||
In the event of an error, a negative error code is returned. Except for
|
||||
|
@ -3457,16 +3457,16 @@ Substitution callouts
|
|||
</b><br>
|
||||
<P>
|
||||
<b>int pcre2_set_substitute_callout(pcre2_match_context *<i>mcontext</i>,</b>
|
||||
<b> void (*<i>callout_function</i>)(pcre2_substitute_callout_block *, void *),</b>
|
||||
<b> int (*<i>callout_function</i>)(pcre2_substitute_callout_block *, void *),</b>
|
||||
<b> void *<i>callout_data</i>);</b>
|
||||
<br>
|
||||
<br>
|
||||
The <b>pcre2_set_substitute_callout()</b> function can be used to specify a
|
||||
callout function for <b>pcre2_substitute()</b>. This information is passed in
|
||||
a match context. The callout function is called after each substitution. It is
|
||||
not called for simulated substitutions that happen as a result of the
|
||||
PCRE2_SUBSTITUTE_OVERFLOW_LENGTH option. A callout function should not return
|
||||
any value.
|
||||
a match context. The callout function is called after each substitution has
|
||||
been processed, but it can cause the replacement not to happen. The callout
|
||||
function is not called for simulated substitutions that happen as a result of
|
||||
the PCRE2_SUBSTITUTE_OVERFLOW_LENGTH option.
|
||||
</P>
|
||||
<P>
|
||||
The first argument of the callout function is a pointer to a substitute callout
|
||||
|
@ -3474,7 +3474,11 @@ block structure, which contains the following fields, not necessarily in this
|
|||
order:
|
||||
<pre>
|
||||
uint32_t <i>version</i>;
|
||||
PCRE2_SIZE <i>input_offsets[2]</i>;
|
||||
uint32_t <i>subscount</i>;
|
||||
PCRE2_SPTR <i>input</i>;
|
||||
PCRE2_SPTR <i>output</i>;
|
||||
PCRE2_SIZE <i>*ovector</i>;
|
||||
uint32_t <i>oveccount</i>;
|
||||
PCRE2_SIZE <i>output_offsets[2]</i>;
|
||||
</pre>
|
||||
The <i>version</i> field contains the version number of the block format. The
|
||||
|
@ -3482,13 +3486,34 @@ current version is 0. The version number will increase in future if more fields
|
|||
are added, but the intention is never to remove any of the existing fields.
|
||||
</P>
|
||||
<P>
|
||||
The <i>input_offsets</i> vector contains the code unit offsets in the input
|
||||
string of the matched substring, and the <i>output_offsets</i> vector contains
|
||||
the offsets of the replacement in the output string.
|
||||
The <i>subscount</i> field is the number of the current match. It is 1 for the
|
||||
first callout, 2 for the second, and so on. The <i>input</i> and <i>output</i>
|
||||
pointers are copies of the values passed to <b>pcre2_substitute()</b>.
|
||||
</P>
|
||||
<P>
|
||||
The <i>ovector</i> field points to the ovector, which contains the result of the
|
||||
most recent match. The <i>oveccount</i> field contains the number of pairs that
|
||||
are set in the ovector, and is always greater than zero.
|
||||
</P>
|
||||
<P>
|
||||
The <i>output_offsets</i> vector contains the offsets of the replacement in the
|
||||
output string. This has already been processed for dollar and (if requested)
|
||||
backslash substitutions as described above.
|
||||
</P>
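As a rough illustration of how the offset fields relate to the two strings, the sketch below formats one line per callout from a block. It is not PCRE2 code: `demo_callout_block` is a hypothetical stand-in that mirrors the documented fields, using `size_t` and `const char *` in place of `PCRE2_SIZE` and `PCRE2_SPTR`, and `describe_callout` is an invented helper name.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for pcre2_substitute_callout_block (assumption:
   8-bit code units, so plain char pointers and size_t offsets suffice). */
typedef struct {
    uint32_t    version;            /* block format version, currently 0 */
    size_t      input_offsets[2];   /* matched portion of the input */
    const char *input;              /* subject string */
    const char *output;             /* output buffer */
    size_t      output_offsets[2];  /* replacement within the output */
    size_t     *ovector;            /* offsets from the most recent match */
    uint32_t    oveccount;          /* pairs set in ovector, always > 0 */
    uint32_t    subscount;          /* 1 for the first callout, 2 for the next */
} demo_callout_block;

/* Writes "<subscount>(<oveccount>) Old s e "match" New s e "replacement""
   into buf and returns the value from snprintf. */
static int describe_callout(const demo_callout_block *b, char *buf, size_t buflen)
{
    /* Each offset pair is [start, end), so end - start is the length. */
    int in_len  = (int)(b->input_offsets[1] - b->input_offsets[0]);
    int out_len = (int)(b->output_offsets[1] - b->output_offsets[0]);

    return snprintf(buf, buflen,
                    "%u(%u) Old %zu %zu \"%.*s\" New %zu %zu \"%.*s\"",
                    (unsigned)b->subscount, (unsigned)b->oveccount,
                    b->input_offsets[0], b->input_offsets[1],
                    in_len, b->input + b->input_offsets[0],
                    b->output_offsets[0], b->output_offsets[1],
                    out_len, b->output + b->output_offsets[0]);
}
```

The strings are sliced by offset and length rather than read to a terminator, because the offsets describe substrings of larger buffers.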
|
||||
<P>
|
||||
The second argument of the callout function is the value passed as
|
||||
<i>callout_data</i> when the function was registered.
|
||||
<i>callout_data</i> when the function was registered. The value returned by the
|
||||
callout function is interpreted as follows:
|
||||
</P>
|
||||
<P>
|
||||
If the value is zero, the replacement is accepted, and, if
|
||||
PCRE2_SUBSTITUTE_GLOBAL is set, processing continues with a search for the next
|
||||
match. If the value is not zero, the current replacement is not accepted. If
|
||||
the value is greater than zero, processing continues when
|
||||
PCRE2_SUBSTITUTE_GLOBAL is set. Otherwise (the value is less than zero or
|
||||
PCRE2_SUBSTITUTE_GLOBAL is not set), the rest of the input is copied to the
|
||||
output and the call to <b>pcre2_substitute()</b> exits, returning the number of
|
||||
matches so far.
|
||||
</P>
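The return-value rules above can be summarized as a small decision function. This is only a sketch of the documented behaviour, not PCRE2 source: the `subst_action` enum and `classify_callout_result` are hypothetical names introduced for illustration, and `global` stands for whether PCRE2_SUBSTITUTE_GLOBAL is set.

```c
#include <stdbool.h>

/* Hypothetical outcome labels; PCRE2 itself defines no such enum. */
typedef enum {
    SUBST_ACCEPT_CONTINUE,  /* replacement kept, search for the next match */
    SUBST_ACCEPT_FINISH,    /* replacement kept, single substitution is done */
    SUBST_REJECT_CONTINUE,  /* replacement dropped, search for the next match */
    SUBST_REJECT_FINISH     /* replacement dropped, rest of input copied, exit */
} subst_action;

/* rc is the value returned by the substitute callout function. */
static subst_action classify_callout_result(int rc, bool global)
{
    if (rc == 0)
        return global ? SUBST_ACCEPT_CONTINUE : SUBST_ACCEPT_FINISH;
    if (rc > 0 && global)
        return SUBST_REJECT_CONTINUE;
    /* rc < 0, or rc > 0 without the global option */
    return SUBST_REJECT_FINISH;
}
```

A positive return without the global option therefore ends up in the same place as a negative one: the replacement is rejected and no further matching takes place.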
|
||||
<br><a name="SEC37" href="#TOC1">DUPLICATE SUBPATTERN NAMES</a><br>
|
||||
<P>
|
||||
|
@ -3757,7 +3782,7 @@ Cambridge, England.
|
|||
</P>
|
||||
<br><a name="SEC42" href="#TOC1">REVISION</a><br>
|
||||
<P>
|
||||
Last updated: 19 October 2018
|
||||
Last updated: 12 November 2018
|
||||
<br>
|
||||
Copyright © 1997-2018 University of Cambridge.
|
||||
<br>
|
||||
|
|
|
@ -1052,7 +1052,9 @@ process.
|
|||
startchar show starting character when relevant
|
||||
substitute_callout use substitution callouts
|
||||
substitute_extended use PCRE2_SUBSTITUTE_EXTENDED
|
||||
substitute_skip=<n> skip substitution number n
|
||||
substitute_overflow_length use PCRE2_SUBSTITUTE_OVERFLOW_LENGTH
|
||||
substitute_stop=<n> skip substitution number n and greater
|
||||
substitute_unknown_unset use PCRE2_SUBSTITUTE_UNKNOWN_UNSET
|
||||
substitute_unset_empty use PCRE2_SUBSTITUTE_UNSET_EMPTY
|
||||
</pre>
|
||||
|
@ -1220,7 +1222,9 @@ pattern.
|
|||
startoffset=<n> same as offset=<n>
|
||||
substitute_callout use substitution callouts
|
||||
substitute_extended use PCRE2_SUBSTITUTE_EXTENDED
|
||||
substitute_skip=<n> skip substitution number n
|
||||
substitute_overflow_length use PCRE2_SUBSTITUTE_OVERFLOW_LENGTH
|
||||
substitute_stop=<n> skip substitution number n and greater
|
||||
substitute_unknown_unset use PCRE2_SUBSTITUTE_UNKNOWN_UNSET
|
||||
substitute_unset_empty use PCRE2_SUBSTITUTE_UNSET_EMPTY
|
||||
zero_terminate pass the subject as zero-terminated
|
||||
|
@ -1410,16 +1414,6 @@ simple example of a substitution test:
|
|||
=abc=abc=\=global
|
||||
2: =xxx=xxx=
|
||||
</pre>
|
||||
If the <b>substitute_callout</b> modifier is set, a substitution callout
|
||||
function is set up. When it is called (after each substitution), the offsets in
|
||||
the input and output strings are output. For example:
|
||||
<pre>
|
||||
/abc/g,replace=<$0>,substitute_callout
|
||||
abcdefabcpqr
|
||||
Old 0 3 New 0 5
|
||||
Old 6 9 New 8 13
|
||||
2: <abc>def<abc>pqr
|
||||
</pre>
|
||||
Subject and replacement strings should be kept relatively short (fewer than 256
|
||||
characters) for substitution tests, as fixed-size buffers are used. To make it
|
||||
easy to test for buffer overflow, if the replacement string starts with a
|
||||
|
@ -1451,6 +1445,47 @@ matching provokes an error return ("bad option value") from
|
|||
<b>pcre2_substitute()</b>.
|
||||
</P>
|
||||
<br><b>
|
||||
Testing substitute callouts
|
||||
</b><br>
|
||||
<P>
|
||||
If the <b>substitute_callout</b> modifier is set, a substitution callout
|
||||
function is set up. When it is called (after each substitution), details of the
|
||||
input and output strings are output. For example:
|
||||
<pre>
|
||||
/abc/g,replace=<$0>,substitute_callout
|
||||
abcdefabcpqr
|
||||
1(1) Old 0 3 "abc" New 0 5 "<abc>"
|
||||
2(1) Old 6 9 "abc" New 8 13 "<abc>"
|
||||
2: <abc>def<abc>pqr
|
||||
</pre>
|
||||
The first number on each callout line is the count of matches. The
|
||||
parenthesized number is the number of pairs that are set in the ovector (that
|
||||
is, one more than the number of capturing groups that were set). Then are
|
||||
listed the offsets of the old substring, its contents, and the same for the
|
||||
replacement.
|
||||
</P>
|
||||
<P>
|
||||
By default, the substitution callout function returns zero, which accepts the
|
||||
replacement and causes matching to continue if /g was used. Two further
|
||||
modifiers can be used to test other return values. If <b>substitute_skip</b> is
|
||||
set to a value greater than zero the callout function returns +1 for the match
|
||||
of that number, and similarly <b>substitute_stop</b> returns -1. These cause the
|
||||
replacement to be rejected, and -1 causes no further matching to take place. If
|
||||
either of them is set, <b>substitute_callout</b> is assumed. For example:
|
||||
<pre>
|
||||
/abc/g,replace=<$0>,substitute_skip=1
|
||||
abcdefabcpqr
|
||||
1(1) Old 0 3 "abc" New 0 5 "<abc> SKIPPED"
|
||||
2(1) Old 6 9 "abc" New 6 11 "<abc>"
|
||||
2: abcdef<abc>pqr
|
||||
abcdefabcpqr\=substitute_stop=1
|
||||
1(1) Old 0 3 "abc" New 0 5 "<abc> STOPPED"
|
||||
1: abcdefabcpqr
|
||||
</pre>
|
||||
If both are set for the same number, stop takes precedence. Only a single skip
|
||||
or stop is supported, which is sufficient for testing that the feature works.
|
||||
</P>
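The skip and stop results shown above can be reproduced with a small simulation. This is a sketch under stated assumptions rather than the pcre2test or PCRE2 implementation: it searches for a literal string with `strstr` instead of running a regular expression, it always wraps the match in angle brackets (mimicking `replace=<$0>`), and every name in it (`demo_substitute_global`, `demo_policy`, `demo_policy_callout`, `append`) is hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical callout type: gets the 1-based match number and user data. */
typedef int (*demo_callout)(unsigned match_number, void *data);

/* Append len bytes to out, keeping it NUL-terminated; -1 if it will not fit. */
static int append(char *out, size_t outsize, size_t *used,
                  const char *src, size_t len)
{
    if (*used + len + 1 > outsize) return -1;
    memcpy(out + *used, src, len);
    *used += len;
    out[*used] = '\0';
    return 0;
}

/* Replace each occurrence of needle with "<needle>", consulting the callout
   after each tentative replacement. Returns the number of matches seen,
   or -1 on bad arguments or insufficient output space. */
static int demo_substitute_global(const char *subject, const char *needle,
                                  char *out, size_t outsize,
                                  demo_callout callout, void *data)
{
    size_t nlen = strlen(needle), used = 0;
    const char *pos = subject, *hit;
    int matches = 0;

    if (outsize == 0 || nlen == 0) return -1;
    out[0] = '\0';

    while ((hit = strstr(pos, needle)) != NULL) {
        size_t before_replacement;
        int rc;

        matches++;
        /* Copy the text before the match, then write the replacement. */
        if (append(out, outsize, &used, pos, (size_t)(hit - pos)) != 0) return -1;
        before_replacement = used;
        if (append(out, outsize, &used, "<", 1) != 0 ||
            append(out, outsize, &used, needle, nlen) != 0 ||
            append(out, outsize, &used, ">", 1) != 0) return -1;

        rc = callout ? callout((unsigned)matches, data) : 0;
        if (rc != 0) {
            /* Rejected: discard the replacement, keep the original text. */
            used = before_replacement;
            if (append(out, outsize, &used, hit, nlen) != 0) return -1;
        }
        pos = hit + nlen;
        if (rc < 0) break;  /* stop: no further matching */
    }

    /* Copy whatever input remains after the last processed match. */
    if (append(out, outsize, &used, pos, strlen(pos)) != 0) return -1;
    return matches;
}

/* Mirrors the substitute_skip / substitute_stop modifiers; 0 means unset. */
struct demo_policy { unsigned skip; unsigned stop; };

static int demo_policy_callout(unsigned match_number, void *data)
{
    const struct demo_policy *p = data;
    if (p->stop != 0 && match_number == p->stop) return -1;  /* stop wins */
    if (p->skip != 0 && match_number == p->skip) return +1;
    return 0;
}
```

With the subject `abcdefabcpqr` and needle `abc`, a policy of skip=1 leaves the first match untouched and replaces the second, while stop=1 leaves the whole subject unchanged and reports one match, in line with the two pcre2test results above.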
|
||||
<br><b>
|
||||
Setting the JIT stack size
|
||||
</b><br>
|
||||
<P>
|
||||
|
@ -2040,7 +2075,7 @@ Cambridge, England.
|
|||
</P>
|
||||
<br><a name="SEC21" href="#TOC1">REVISION</a><br>
|
||||
<P>
|
||||
Last updated: 21 September 2018
|
||||
Last updated: 12 November 2018
|
||||
<br>
|
||||
Copyright © 1997-2018 University of Cambridge.
|
||||
<br>
|
||||
|
|
|
@ -294,7 +294,7 @@ PCRE2 NATIVE API MATCH CONTEXT FUNCTIONS
|
|||
void *callout_data);
|
||||
|
||||
int pcre2_set_substitute_callout(pcre2_match_context *mcontext,
|
||||
void (*callout_function)(pcre2_substitute_callout_block *, void *),
|
||||
int (*callout_function)(pcre2_substitute_callout_block *, void *),
|
||||
void *callout_data);
|
||||
|
||||
int pcre2_set_offset_limit(pcre2_match_context *mcontext,
|
||||
|
@ -942,7 +942,7 @@ PCRE2 CONTEXTS
|
|||
umentation.
|
||||
|
||||
int pcre2_set_substitute_callout(pcre2_match_context *mcontext,
|
||||
void (*callout_function)(pcre2_substitute_callout_block *, void *),
|
||||
int (*callout_function)(pcre2_substitute_callout_block *, void *),
|
||||
void *callout_data);
|
||||
|
||||
This sets up a callout function for PCRE2 to call after each substitu-
|
||||
|
@ -3318,8 +3318,8 @@ CREATING A NEW STRING WITH SUBSTITUTIONS
|
|||
substitutions. However, PCRE2_SUBSTITUTE_UNKNOWN_UNSET does cause
|
||||
unknown groups in the extended syntax forms to be treated as unset.
|
||||
|
||||
If successful, pcre2_substitute() returns the number of replacements
|
||||
that were made. This may be zero if no matches were found, and is never
|
||||
If successful, pcre2_substitute() returns the number of successful
|
||||
matches. This may be zero if no matches were found, and is never
|
||||
greater than 1 unless PCRE2_SUBSTITUTE_GLOBAL is set.
|
||||
|
||||
In the event of an error, a negative error code is returned. Except for
|
||||
|
@ -3355,22 +3355,26 @@ CREATING A NEW STRING WITH SUBSTITUTIONS
|
|||
Substitution callouts
|
||||
|
||||
int pcre2_set_substitute_callout(pcre2_match_context *mcontext,
|
||||
void (*callout_function)(pcre2_substitute_callout_block *, void *),
|
||||
int (*callout_function)(pcre2_substitute_callout_block *, void *),
|
||||
void *callout_data);
|
||||
|
||||
The pcre2_set_substitute_callout() function can be used to specify a
|
||||
callout function for pcre2_substitute(). This information is passed in
|
||||
a match context. The callout function is called after each substitu-
|
||||
tion. It is not called for simulated substitutions that happen as a
|
||||
result of the PCRE2_SUBSTITUTE_OVERFLOW_LENGTH option. A callout func-
|
||||
tion should not return any value.
|
||||
a match context. The callout function is called after each substitution
|
||||
has been processed, but it can cause the replacement not to happen. The
|
||||
callout function is not called for simulated substitutions that happen
|
||||
as a result of the PCRE2_SUBSTITUTE_OVERFLOW_LENGTH option.
|
||||
|
||||
The first argument of the callout function is a pointer to a substitute
|
||||
callout block structure, which contains the following fields, not nec-
|
||||
essarily in this order:
|
||||
|
||||
uint32_t version;
|
||||
PCRE2_SIZE input_offsets[2];
|
||||
uint32_t subscount;
|
||||
PCRE2_SPTR input;
|
||||
PCRE2_SPTR output;
|
||||
PCRE2_SIZE *ovector;
|
||||
uint32_t oveccount;
|
||||
PCRE2_SIZE output_offsets[2];
|
||||
|
||||
The version field contains the version number of the block format. The
|
||||
|
@ -3378,12 +3382,30 @@ CREATING A NEW STRING WITH SUBSTITUTIONS
|
|||
more fields are added, but the intention is never to remove any of the
|
||||
existing fields.
|
||||
|
||||
The input_offsets vector contains the code unit offsets in the input
|
||||
string of the matched substring, and the output_offsets vector contains
|
||||
the offsets of the replacement in the output string.
|
||||
The subscount field is the number of the current match. It is 1 for the
|
||||
first callout, 2 for the second, and so on. The input and output point-
|
||||
ers are copies of the values passed to pcre2_substitute().
|
||||
|
||||
The ovector field points to the ovector, which contains the result of
|
||||
the most recent match. The oveccount field contains the number of pairs
|
||||
that are set in the ovector, and is always greater than zero.
|
||||
|
||||
The output_offsets vector contains the offsets of the replacement in
|
||||
the output string. This has already been processed for dollar and (if
|
||||
requested) backslash substitutions as described above.
|
||||
|
||||
The second argument of the callout function is the value passed as
|
||||
callout_data when the function was registered.
|
||||
callout_data when the function was registered. The value returned by
|
||||
the callout function is interpreted as follows:
|
||||
|
||||
If the value is zero, the replacement is accepted, and, if PCRE2_SUB-
|
||||
STITUTE_GLOBAL is set, processing continues with a search for the next
|
||||
match. If the value is not zero, the current replacement is not
|
||||
accepted. If the value is greater than zero, processing continues when
|
||||
PCRE2_SUBSTITUTE_GLOBAL is set. Otherwise (the value is less than zero
|
||||
or PCRE2_SUBSTITUTE_GLOBAL is not set), the rest of the input is
|
||||
copied to the output and the call to pcre2_substitute() exits, return-
|
||||
ing the number of matches so far.
|
||||
|
||||
|
||||
DUPLICATE SUBPATTERN NAMES
|
||||
|
@ -3633,7 +3655,7 @@ AUTHOR
|
|||
|
||||
REVISION
|
||||
|
||||
Last updated: 19 October 2018
|
||||
Last updated: 12 November 2018
|
||||
Copyright (c) 1997-2018 University of Cambridge.
|
||||
------------------------------------------------------------------------------
|
||||
|
||||
|
|
|
@ -1,4 +1,4 @@
|
|||
.TH PCRE2_SET_SUBSTITUTE_CALLOUT 3 "17 September 2018" "PCRE2 10.33"
|
||||
.TH PCRE2_SET_SUBSTITUTE_CALLOUT 3 "12 November 2018" "PCRE2 10.33"
|
||||
.SH NAME
|
||||
PCRE2 - Perl-compatible regular expressions (revised API)
|
||||
.SH SYNOPSIS
|
||||
|
@ -8,7 +8,7 @@ PCRE2 - Perl-compatible regular expressions (revised API)
|
|||
.PP
|
||||
.nf
|
||||
.B int pcre2_set_substitute_callout(pcre2_match_context *\fImcontext\fP,
|
||||
.B " void (*\fIcallout_function\fP)(pcre2_substitute_callout_block *),"
|
||||
.B " int (*\fIcallout_function\fP)(pcre2_substitute_callout_block *),"
|
||||
.B " void *\fIcallout_data\fP);"
|
||||
.fi
|
||||
.
|
||||
|
|
|
@ -1,4 +1,4 @@
|
|||
.TH PCRE2API 3 "19 October 2018" "PCRE2 10.33"
|
||||
.TH PCRE2API 3 "12 November 2018" "PCRE2 10.33"
|
||||
.SH NAME
|
||||
PCRE2 - Perl-compatible regular expressions (revised API)
|
||||
.sp
|
||||
|
@ -124,7 +124,7 @@ document for an overview of all the PCRE2 documentation.
|
|||
.B " void *\fIcallout_data\fP);"
|
||||
.sp
|
||||
.B int pcre2_set_substitute_callout(pcre2_match_context *\fImcontext\fP,
|
||||
.B " void (*\fIcallout_function\fP)(pcre2_substitute_callout_block *, void *),"
|
||||
.B " int (*\fIcallout_function\fP)(pcre2_substitute_callout_block *, void *),"
|
||||
.B " void *\fIcallout_data\fP);"
|
||||
.sp
|
||||
.B int pcre2_set_offset_limit(pcre2_match_context *\fImcontext\fP,
|
||||
|
@ -860,7 +860,7 @@ documentation.
|
|||
.sp
|
||||
.nf
|
||||
.B int pcre2_set_substitute_callout(pcre2_match_context *\fImcontext\fP,
|
||||
.B " void (*\fIcallout_function\fP)(pcre2_substitute_callout_block *, void *),"
|
||||
.B " int (*\fIcallout_function\fP)(pcre2_substitute_callout_block *, void *),"
|
||||
.B " void *\fIcallout_data\fP);"
|
||||
.fi
|
||||
.sp
|
||||
|
@ -3412,9 +3412,9 @@ The PCRE2_SUBSTITUTE_UNSET_EMPTY option does not affect these extended
|
|||
substitutions. However, PCRE2_SUBSTITUTE_UNKNOWN_UNSET does cause unknown
|
||||
groups in the extended syntax forms to be treated as unset.
|
||||
.P
|
||||
If successful, \fBpcre2_substitute()\fP returns the number of replacements that
|
||||
were made. This may be zero if no matches were found, and is never greater than
|
||||
1 unless PCRE2_SUBSTITUTE_GLOBAL is set.
|
||||
If successful, \fBpcre2_substitute()\fP returns the number of successful
|
||||
matches. This may be zero if no matches were found, and is never greater than 1
|
||||
unless PCRE2_SUBSTITUTE_GLOBAL is set.
|
||||
.P
|
||||
In the event of an error, a negative error code is returned. Except for
|
||||
PCRE2_ERROR_NOMATCH (which is never returned), errors from \fBpcre2_match()\fP
|
||||
|
@ -3454,35 +3454,57 @@ above).
|
|||
.sp
|
||||
.nf
|
||||
.B int pcre2_set_substitute_callout(pcre2_match_context *\fImcontext\fP,
|
||||
.B " void (*\fIcallout_function\fP)(pcre2_substitute_callout_block *, void *),"
|
||||
.B " int (*\fIcallout_function\fP)(pcre2_substitute_callout_block *, void *),"
|
||||
.B " void *\fIcallout_data\fP);"
|
||||
.fi
|
||||
.sp
|
||||
The \fBpcre2_set_substitute_callout()\fP function can be used to specify a
|
||||
callout function for \fBpcre2_substitute()\fP. This information is passed in
|
||||
a match context. The callout function is called after each substitution. It is
|
||||
not called for simulated substitutions that happen as a result of the
|
||||
PCRE2_SUBSTITUTE_OVERFLOW_LENGTH option. A callout function should not return
|
||||
any value.
|
||||
a match context. The callout function is called after each substitution has
|
||||
been processed, but it can cause the replacement not to happen. The callout
|
||||
function is not called for simulated substitutions that happen as a result of
|
||||
the PCRE2_SUBSTITUTE_OVERFLOW_LENGTH option.
|
||||
.P
|
||||
The first argument of the callout function is a pointer to a substitute callout
|
||||
block structure, which contains the following fields, not necessarily in this
|
||||
order:
|
||||
.sp
|
||||
uint32_t \fIversion\fP;
|
||||
PCRE2_SIZE \fIinput_offsets[2]\fP;
|
||||
uint32_t \fIsubscount\fP;
|
||||
PCRE2_SPTR \fIinput\fP;
|
||||
PCRE2_SPTR \fIoutput\fP;
|
||||
PCRE2_SIZE \fI*ovector\fP;
|
||||
uint32_t \fIoveccount\fP;
|
||||
PCRE2_SIZE \fIoutput_offsets[2]\fP;
|
||||
.sp
|
||||
The \fIversion\fP field contains the version number of the block format. The
|
||||
current version is 0. The version number will increase in future if more fields
|
||||
are added, but the intention is never to remove any of the existing fields.
|
||||
.P
|
||||
The \fIinput_offsets\fP vector contains the code unit offsets in the input
|
||||
string of the matched substring, and the \fIoutput_offsets\fP vector contains
|
||||
the offsets of the replacement in the output string.
|
||||
The \fIsubscount\fP field is the number of the current match. It is 1 for the
|
||||
first callout, 2 for the second, and so on. The \fIinput\fP and \fIoutput\fP
|
||||
pointers are copies of the values passed to \fBpcre2_substitute()\fP.
|
||||
.P
|
||||
The \fIovector\fP field points to the ovector, which contains the result of the
|
||||
most recent match. The \fIoveccount\fP field contains the number of pairs that
|
||||
are set in the ovector, and is always greater than zero.
|
||||
.P
|
||||
The \fIoutput_offsets\fP vector contains the offsets of the replacement in the
|
||||
output string. This has already been processed for dollar and (if requested)
|
||||
backslash substitutions as described above.
|
||||
.P
|
||||
The second argument of the callout function is the value passed as
|
||||
\fIcallout_data\fP when the function was registered.
|
||||
\fIcallout_data\fP when the function was registered. The value returned by the
|
||||
callout function is interpreted as follows:
|
||||
.P
|
||||
If the value is zero, the replacement is accepted, and, if
|
||||
PCRE2_SUBSTITUTE_GLOBAL is set, processing continues with a search for the next
|
||||
match. If the value is not zero, the current replacement is not accepted. If
|
||||
the value is greater than zero, processing continues when
|
||||
PCRE2_SUBSTITUTE_GLOBAL is set. Otherwise (the value is less than zero or
|
||||
PCRE2_SUBSTITUTE_GLOBAL is not set), the rest of the input is copied to the
|
||||
output and the call to \fBpcre2_substitute()\fP exits, returning the number of
|
||||
matches so far.
|
||||
.
|
||||
.
|
||||
.SH "DUPLICATE SUBPATTERN NAMES"
|
||||
|
@ -3768,6 +3790,6 @@ Cambridge, England.
|
|||
.rs
|
||||
.sp
|
||||
.nf
|
||||
Last updated: 19 October 2018
|
||||
Last updated: 12 November 2018
|
||||
Copyright (c) 1997-2018 University of Cambridge.
|
||||
.fi
|
||||
|
|
|
@ -1,4 +1,4 @@
|
|||
.TH PCRE2TEST 1 "21 September 2018" "PCRE 10.33"
|
||||
.TH PCRE2TEST 1 "12 November 2018" "PCRE 10.33"
|
||||
.SH NAME
|
||||
pcre2test - a program for testing Perl-compatible regular expressions.
|
||||
.SH SYNOPSIS
|
||||
|
@ -1014,7 +1014,9 @@ process.
|
|||
startchar show starting character when relevant
|
||||
substitute_callout use substitution callouts
|
||||
substitute_extended use PCRE2_SUBSTITUTE_EXTENDED
|
||||
substitute_skip=<n> skip substitution number n
|
||||
substitute_overflow_length use PCRE2_SUBSTITUTE_OVERFLOW_LENGTH
|
||||
substitute_stop=<n> skip substitution number n and greater
|
||||
substitute_unknown_unset use PCRE2_SUBSTITUTE_UNKNOWN_UNSET
|
||||
substitute_unset_empty use PCRE2_SUBSTITUTE_UNSET_EMPTY
|
||||
.sp
|
||||
|
@ -1189,7 +1191,9 @@ pattern.
|
|||
startoffset=<n> same as offset=<n>
|
||||
substitute_callout use substitution callouts
|
||||
substitute_extended use PCRE2_SUBSTITUTE_EXTENDED
|
||||
substitute_skip=<n> skip substitution number n
|
||||
substitute_overflow_length use PCRE2_SUBSTITUTE_OVERFLOW_LENGTH
|
||||
substitute_stop=<n> skip substitution number n and greater
|
||||
substitute_unknown_unset use PCRE2_SUBSTITUTE_UNKNOWN_UNSET
|
||||
substitute_unset_empty use PCRE2_SUBSTITUTE_UNSET_EMPTY
|
||||
zero_terminate pass the subject as zero-terminated
|
||||
|
@ -1377,16 +1381,6 @@ simple example of a substitution test:
|
|||
=abc=abc=\e=global
|
||||
2: =xxx=xxx=
|
||||
.sp
|
||||
If the \fBsubstitute_callout\fP modifier is set, a substitution callout
|
||||
function is set up. When it is called (after each substitution), the offsets in
|
||||
the input and output strings are output. For example:
|
||||
.sp
|
||||
/abc/g,replace=<$0>,substitute_callout
|
||||
abcdefabcpqr
|
||||
Old 0 3 New 0 5
|
||||
Old 6 9 New 8 13
|
||||
2: <abc>def<abc>pqr
|
||||
.sp
|
||||
Subject and replacement strings should be kept relatively short (fewer than 256
|
||||
characters) for substitution tests, as fixed-size buffers are used. To make it
|
||||
easy to test for buffer overflow, if the replacement string starts with a
|
||||
|
@ -1418,6 +1412,46 @@ matching provokes an error return ("bad option value") from
|
|||
\fBpcre2_substitute()\fP.
|
||||
.
|
||||
.
|
||||
.SS "Testing substitute callouts"
|
||||
.rs
|
||||
.sp
|
||||
If the \fBsubstitute_callout\fP modifier is set, a substitution callout
|
||||
function is set up. When it is called (after each substitution), details of the
|
||||
input and output strings are output. For example:
|
||||
.sp
|
||||
/abc/g,replace=<$0>,substitute_callout
|
||||
abcdefabcpqr
|
||||
1(1) Old 0 3 "abc" New 0 5 "<abc>"
|
||||
2(1) Old 6 9 "abc" New 8 13 "<abc>"
|
||||
2: <abc>def<abc>pqr
|
||||
.sp
|
||||
The first number on each callout line is the count of matches. The
|
||||
parenthesized number is the number of pairs that are set in the ovector (that
|
||||
is, one more than the number of capturing groups that were set). Then are
|
||||
listed the offsets of the old substring, its contents, and the same for the
|
||||
replacement.
|
||||
.P
|
||||
By default, the substitution callout function returns zero, which accepts the
|
||||
replacement and causes matching to continue if /g was used. Two further
|
||||
modifiers can be used to test other return values. If \fBsubstitute_skip\fP is
|
||||
set to a value greater than zero the callout function returns +1 for the match
|
||||
of that number, and similarly \fBsubstitute_stop\fP returns -1. These cause the
|
||||
replacement to be rejected, and -1 causes no further matching to take place. If
|
||||
either of them is set, \fBsubstitute_callout\fP is assumed. For example:
|
||||
.sp
|
||||
/abc/g,replace=<$0>,substitute_skip=1
|
||||
abcdefabcpqr
|
||||
1(1) Old 0 3 "abc" New 0 5 "<abc> SKIPPED"
|
||||
2(1) Old 6 9 "abc" New 6 11 "<abc>"
|
||||
2: abcdef<abc>pqr
|
||||
abcdefabcpqr\e=substitute_stop=1
|
||||
1(1) Old 0 3 "abc" New 0 5 "<abc> STOPPED"
|
||||
1: abcdefabcpqr
|
||||
.sp
|
||||
If both are set for the same number, stop takes precedence. Only a single skip
|
||||
or stop is supported, which is sufficient for testing that the feature works.
|
||||
.
|
||||
.
|
||||
.SS "Setting the JIT stack size"
|
||||
.rs
|
||||
.sp
|
||||
|
@ -2022,6 +2056,6 @@ Cambridge, England.
|
|||
.rs
|
||||
.sp
|
||||
.nf
|
||||
Last updated: 21 September 2018
|
||||
Last updated: 12 November 2018
|
||||
Copyright (c) 1997-2018 University of Cambridge.
|
||||
.fi
|
||||
|
|
|
@ -940,7 +940,9 @@ PATTERN MODIFIERS
|
|||
startchar show starting character when relevant
|
||||
substitute_callout use substitution callouts
|
||||
substitute_extended use PCRE2_SUBSTITUTE_EXTENDED
|
||||
substitute_skip=<n> skip substitution number n
|
||||
substitute_overflow_length use PCRE2_SUBSTITUTE_OVERFLOW_LENGTH
|
||||
substitute_stop=<n> skip substitution number n and greater
|
||||
substitute_unknown_unset use PCRE2_SUBSTITUTE_UNKNOWN_UNSET
|
||||
substitute_unset_empty use PCRE2_SUBSTITUTE_UNSET_EMPTY
|
||||
|
||||
|
@ -1092,7 +1094,9 @@ SUBJECT MODIFIERS
|
|||
startoffset=<n> same as offset=<n>
|
||||
substitute_callout use substitution callouts
|
||||
substitute_extended use PCRE2_SUBSTITUTE_EXTENDED
|
||||
substitute_skip=<n> skip substitution number n
|
||||
substitute_overflow_length use PCRE2_SUBSTITUTE_OVERFLOW_LENGTH
|
||||
substitute_stop=<n> skip substitution number n and greater
|
||||
substitute_unknown_unset use PCRE2_SUBSTITUTE_UNKNOWN_UNSET
|
||||
substitute_unset_empty use PCRE2_SUBSTITUTE_UNSET_EMPTY
|
||||
zero_terminate pass the subject as zero-terminated
|
||||
|
@ -1263,16 +1267,6 @@ SUBJECT MODIFIERS
|
|||
=abc=abc=\=global
|
||||
2: =xxx=xxx=
|
||||
|
||||
If the substitute_callout modifier is set, a substitution callout func-
|
||||
tion is set up. When it is called (after each substitution), the off-
|
||||
sets in the input and output strings are output. For example:
|
||||
|
||||
/abc/g,replace=<$0>,substitute_callout
|
||||
abcdefabcpqr
|
||||
Old 0 3 New 0 5
|
||||
Old 6 9 New 8 13
|
||||
2: <abc>def<abc>pqr
|
||||
|
||||
Subject and replacement strings should be kept relatively short (fewer
|
||||
than 256 characters) for substitution tests, as fixed-size buffers are
|
||||
used. To make it easy to test for buffer overflow, if the replacement
|
||||
|
@ -1305,6 +1299,46 @@ SUBJECT MODIFIERS
|
|||
partial matching provokes an error return ("bad option value") from
|
||||
pcre2_substitute().
|
||||
|
||||
Testing substitute callouts
|
||||
|
||||
If the substitute_callout modifier is set, a substitution callout func-
|
||||
tion is set up. When it is called (after each substitution), details of
|
||||
the input and output strings are output. For example:
|
||||
|
||||
/abc/g,replace=<$0>,substitute_callout
|
||||
abcdefabcpqr
|
||||
1(1) Old 0 3 "abc" New 0 5 "<abc>"
|
||||
2(1) Old 6 9 "abc" New 8 13 "<abc>"
|
||||
2: <abc>def<abc>pqr
|
||||
|
||||
The first number on each callout line is the count of matches. The
|
||||
parenthesized number is the number of pairs that are set in the ovector
|
||||
(that is, one more than the number of capturing groups that were set).
|
||||
Then are listed the offsets of the old substring, its contents, and the
|
||||
same for the replacement.
|
||||
|
||||
By default, the substitution callout function returns zero, which
|
||||
accepts the replacement and causes matching to continue if /g was used.
|
||||
Two further modifiers can be used to test other return values. If sub-
|
||||
stitute_skip is set to a value greater than zero the callout function
|
||||
returns +1 for the match of that number, and similarly substitute_stop
|
||||
returns -1. These cause the replacement to be rejected, and -1 causes
|
||||
no further matching to take place. If either of them is set,
|
||||
substitute_callout is assumed. For example:
|
||||
|
||||
/abc/g,replace=<$0>,substitute_skip=1
|
||||
abcdefabcpqr
|
||||
1(1) Old 0 3 "abc" New 0 5 "<abc> SKIPPED"
|
||||
2(1) Old 6 9 "abc" New 6 11 "<abc>"
|
||||
2: abcdef<abc>pqr
|
||||
abcdefabcpqr\=substitute_stop=1
|
||||
1(1) Old 0 3 "abc" New 0 5 "<abc> STOPPED"
|
||||
1: abcdefabcpqr
|
||||
|
||||
If both are set for the same number, stop takes precedence. Only a sin-
|
||||
gle skip or stop is supported, which is sufficient for testing that the
|
||||
feature works.
|
||||
|
||||
Setting the JIT stack size
|
||||
|
||||
The jitstack modifier provides a way of setting the maximum stack size
|
||||
|
@ -1853,5 +1887,5 @@ AUTHOR
|
|||
|
||||
REVISION
|
||||
|
||||
Last updated: 21 September 2018
|
||||
Last updated: 12 November 2018
|
||||
Copyright (c) 1997-2018 University of Cambridge.
|
||||
|
|
|
@ -549,8 +549,12 @@ typedef struct pcre2_callout_enumerate_block { \
|
|||
typedef struct pcre2_substitute_callout_block { \
|
||||
uint32_t version; /* Identifies version of block */ \
|
||||
/* ------------------------ Version 0 ------------------------------- */ \
|
||||
PCRE2_SIZE input_offsets[2]; /* Matched portion of the input */ \
|
||||
PCRE2_SPTR input; /* Pointer to input subject string */ \
|
||||
PCRE2_SPTR output; /* Pointer to output buffer */ \
|
||||
PCRE2_SIZE output_offsets[2]; /* Changed portion of the output */ \
|
||||
PCRE2_SIZE *ovector; /* Pointer to current ovector */ \
|
||||
uint32_t oveccount; /* Count of pairs set in ovector */ \
|
||||
uint32_t subscount; /* Substitution number */ \
|
||||
/* ------------------------------------------------------------------ */ \
|
||||
} pcre2_substitute_callout_block;
|
||||
|
||||
|
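
As an illustration of how a callout might read the block fields shown in this hunk, here is a hedged stand-alone sketch. It does not include pcre2.h: `demo_block` is a simplified stand-in with plain C types in place of PCRE2_SPTR and PCRE2_SIZE, and the `demo_*` function names are hypothetical. It assumes 8-bit code units and that `output_offsets` and the first `ovector` pair are byte offsets into `output` and `input` respectively.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Simplified stand-in for the callout block: only the fields used here. */
typedef struct demo_block {
  const unsigned char *input;   /* subject string */
  const unsigned char *output;  /* output buffer */
  size_t output_offsets[2];     /* changed portion of the output */
  size_t *ovector;              /* current ovector; pair 0 is the match */
} demo_block;

/* Copy the proposed replacement text into dst as a NUL-terminated string.
   Returns its length, or 0 if it does not fit in dstsize bytes. */
size_t demo_copy_replacement(const demo_block *b, char *dst, size_t dstsize)
{
  size_t len;
  assert(b->output_offsets[1] >= b->output_offsets[0]);
  len = b->output_offsets[1] - b->output_offsets[0];
  if (dstsize == 0 || len >= dstsize) return 0;
  memcpy(dst, b->output + b->output_offsets[0], len);
  dst[len] = '\0';
  return len;
}

/* Length of the matched part of the input, from the first ovector pair. */
size_t demo_match_length(const demo_block *b)
{
  return b->ovector[1] - b->ovector[0];
}
```

A real callout would receive a `pcre2_substitute_callout_block *` instead and could use the same offset arithmetic to decide whether to accept the replacement.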
@ -609,7 +613,7 @@ PCRE2_EXP_DECL int PCRE2_CALL_CONVENTION \
|
|||
int (*)(pcre2_callout_block *, void *), void *); \
|
||||
PCRE2_EXP_DECL int PCRE2_CALL_CONVENTION \
|
||||
pcre2_set_substitute_callout(pcre2_match_context *, \
|
||||
void (*)(pcre2_substitute_callout_block *, void *), void *); \
|
||||
int (*)(pcre2_substitute_callout_block *, void *), void *); \
|
||||
PCRE2_EXP_DECL int PCRE2_CALL_CONVENTION \
|
||||
pcre2_set_depth_limit(pcre2_match_context *, uint32_t); \
|
||||
PCRE2_EXP_DECL int PCRE2_CALL_CONVENTION \
|
||||
|
|
|
@ -407,7 +407,7 @@ return 0;
|
|||
|
||||
PCRE2_EXP_DEFN int PCRE2_CALL_CONVENTION
|
||||
pcre2_set_substitute_callout(pcre2_match_context *mcontext,
|
||||
void (*substitute_callout)(pcre2_substitute_callout_block *, void *),
|
||||
int (*substitute_callout)(pcre2_substitute_callout_block *, void *),
|
||||
void *substitute_callout_data)
|
||||
{
|
||||
mcontext->substitute_callout = substitute_callout;
|
||||
|
|
|
@ -585,7 +585,7 @@ typedef struct pcre2_real_match_context {
|
|||
#endif
|
||||
int (*callout)(pcre2_callout_block *, void *);
|
||||
void *callout_data;
|
||||
void (*substitute_callout)(pcre2_substitute_callout_block *, void *);
|
||||
int (*substitute_callout)(pcre2_substitute_callout_block *, void *);
|
||||
void *substitute_callout_data;
|
||||
PCRE2_SIZE offset_limit;
|
||||
uint32_t heap_limit;
|
||||
|
|
|
@ -241,13 +241,15 @@ PCRE2_SIZE *ovector;
|
|||
PCRE2_SIZE ovecsave[3];
|
||||
pcre2_substitute_callout_block scb;
|
||||
|
||||
scb.version = 0;
|
||||
/* General initialization */
|
||||
|
||||
buff_offset = 0;
|
||||
lengthleft = buff_length = *blength;
|
||||
*blength = PCRE2_UNSET;
|
||||
ovecsave[0] = ovecsave[1] = ovecsave[2] = PCRE2_UNSET;
|
||||
|
||||
/* Partial matching is not valid. */
|
||||
/* Partial matching is not valid. This must come after setting *blength to
|
||||
PCRE2_UNSET, so as not to imply an offset in the replacement. */
|
||||
|
||||
if ((options & (PCRE2_PARTIAL_HARD|PCRE2_PARTIAL_SOFT)) != 0)
|
||||
return PCRE2_ERROR_BADOPTION;
|
||||
|
@ -266,6 +268,13 @@ if (match_data == NULL)
|
|||
ovector = pcre2_get_ovector_pointer(match_data);
|
||||
ovector_count = pcre2_get_ovector_count(match_data);
|
||||
|
||||
/* Fixed things in the callout block */
|
||||
|
||||
scb.version = 0;
|
||||
scb.input = subject;
|
||||
scb.output = (PCRE2_SPTR)buffer;
|
||||
scb.ovector = ovector;
|
||||
|
||||
/* Find lengths of zero-terminated strings and the end of the replacement. */
|
||||
|
||||
if (length == PCRE2_ZERO_TERMINATED) length = PRIV(strlen)(subject);
|
||||
|
@ -393,11 +402,6 @@ do
|
|||
goto EXIT;
|
||||
}
|
||||
|
||||
/* Save the match point for a possible callout */
|
||||
|
||||
scb.input_offsets[0] = ovector[0];
|
||||
scb.input_offsets[1] = ovector[1];
|
||||
|
||||
/* Count substitutions with a paranoid check for integer overflow; surely no
|
||||
real call to this function would ever hit this! */
|
||||
|
||||
|
@ -409,12 +413,13 @@ do
|
|||
subs++;
|
||||
|
||||
/* Copy the text leading up to the match, and remember where the insert
|
||||
begins. */
|
||||
begins and how many ovector pairs are set. */
|
||||
|
||||
if (rc == 0) rc = ovector_count;
|
||||
fraglength = ovector[0] - start_offset;
|
||||
CHECKMEMCPY(subject + start_offset, fraglength);
|
||||
scb.output_offsets[0] = buff_offset;
|
||||
scb.oveccount = rc;
|
||||
|
||||
/* Process the replacement string. Literal mode is set by \Q, but only in
|
||||
extended mode when backslashes are being interpreted. In extended mode we
|
||||
|
@ -836,8 +841,26 @@ do
|
|||
|
||||
if (!overflowed && mcontext->substitute_callout != NULL)
|
||||
{
|
||||
scb.subscount = subs;
|
||||
scb.output_offsets[1] = buff_offset;
|
||||
mcontext->substitute_callout(&scb, mcontext->substitute_callout_data);
|
||||
rc = mcontext->substitute_callout(&scb, mcontext->substitute_callout_data);
|
||||
|
||||
/* A non-zero return means cancel this substitution. Instead, copy the
|
||||
matched string fragment. */
|
||||
|
||||
if (rc != 0)
|
||||
{
|
||||
PCRE2_SIZE newlength = scb.output_offsets[1] - scb.output_offsets[0];
|
||||
PCRE2_SIZE oldlength = ovector[1] - ovector[0];
|
||||
|
||||
buff_offset -= newlength;
|
||||
lengthleft += newlength;
|
||||
CHECKMEMCPY(subject + ovector[0], oldlength);
|
||||
|
||||
/* A negative return means do not do any more. */
|
||||
|
||||
if (rc < 0) suboptions &= (~PCRE2_SUBSTITUTE_GLOBAL);
|
||||
}
|
||||
}
|
||||
|
||||
/* Save the details of this match. See above for how this data is used. If we
|
||||
|
|
|
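
The cancellation step in the hunk above (wind the output back over the replacement, then copy the matched text instead) can be tried out with a small stand-alone model. This is a sketch under stated assumptions, not the library code: `demo_out`, `demo_append` and `demo_cancel` are hypothetical names, the buffer is a fixed 64 bytes, and overflow is reported with a plain error return rather than through the CHECKMEMCPY macro.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical miniature of the output state kept during substitution. */
typedef struct demo_out {
  char   buf[64];   /* output buffer */
  size_t offset;    /* next free position (buff_offset in the hunk) */
  size_t left;      /* space remaining (lengthleft in the hunk) */
} demo_out;

/* Append n bytes; returns 0 on success, -1 if they do not fit. */
int demo_append(demo_out *o, const char *s, size_t n)
{
  if (n > o->left) return -1;
  memcpy(o->buf + o->offset, s, n);
  o->offset += n;
  o->left -= n;
  return 0;
}

/* Undo a replacement that began at out_start in the output, then copy
   the matched text subject[m0..m1) in its place. */
int demo_cancel(demo_out *o, size_t out_start, const char *subject,
                size_t m0, size_t m1)
{
  size_t newlength;
  assert(m1 >= m0 && o->offset >= out_start);
  newlength = o->offset - out_start;  /* bytes the replacement added */
  o->offset -= newlength;             /* give them back */
  o->left += newlength;
  return demo_append(o, subject + m0, m1 - m0);
}
```

After `demo_cancel` the output contains the original matched text, which corresponds to the SKIPPED and STOPPED cases in the test output.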
@ -537,6 +537,8 @@ typedef struct patctl { /* Structure for pattern modifiers. */
|
|||
uint32_t control2; /* Must be in same position as datctl */
|
||||
uint32_t jitstack; /* Must be in same position as datctl */
|
||||
uint8_t replacement[REPLACE_MODSIZE]; /* So must this */
|
||||
uint32_t substitute_skip; /* Must be in same position as patctl */
|
||||
uint32_t substitute_stop; /* Must be in same position as patctl */
|
||||
uint32_t jit;
|
||||
uint32_t stackguard_test;
|
||||
uint32_t tables_id;
|
||||
|
@ -557,6 +559,8 @@ typedef struct datctl { /* Structure for data line modifiers. */
|
|||
uint32_t control2; /* Must be in same position as patctl */
|
||||
uint32_t jitstack; /* Must be in same position as patctl */
|
||||
uint8_t replacement[REPLACE_MODSIZE]; /* So must this */
|
||||
uint32_t substitute_skip; /* Must be in same position as patctl */
|
||||
uint32_t substitute_stop; /* Must be in same position as patctl */
|
||||
uint32_t startend[2];
|
||||
uint32_t cerror[2];
|
||||
uint32_t cfail[2];
|
||||
|
@ -704,6 +708,8 @@ static modstruct modlist[] = {
|
|||
{ "substitute_callout", MOD_PND, MOD_CTL, CTL2_SUBSTITUTE_CALLOUT, PO(control2) },
|
||||
{ "substitute_extended", MOD_PND, MOD_CTL, CTL2_SUBSTITUTE_EXTENDED, PO(control2) },
|
||||
{ "substitute_overflow_length", MOD_PND, MOD_CTL, CTL2_SUBSTITUTE_OVERFLOW_LENGTH, PO(control2) },
|
||||
{ "substitute_skip", MOD_PND, MOD_INT, 0, PO(substitute_skip) },
|
||||
{ "substitute_stop", MOD_PND, MOD_INT, 0, PO(substitute_stop) },
|
||||
{ "substitute_unknown_unset", MOD_PND, MOD_CTL, CTL2_SUBSTITUTE_UNKNOWN_UNSET, PO(control2) },
|
||||
{ "substitute_unset_empty", MOD_PND, MOD_CTL, CTL2_SUBSTITUTE_UNSET_EMPTY, PO(control2) },
|
||||
{ "tables", MOD_PAT, MOD_INT, 0, PO(tables_id) },
|
||||
|
@ -1370,13 +1376,13 @@ are supported. */
|
|||
#define PCRE2_SET_SUBSTITUTE_CALLOUT(a,b,c) \
|
||||
if (test_mode == PCRE8_MODE) \
|
||||
pcre2_set_substitute_callout_8(G(a,8), \
|
||||
(void (*)(pcre2_substitute_callout_block_8 *, void *))b,c); \
|
||||
(int (*)(pcre2_substitute_callout_block_8 *, void *))b,c); \
|
||||
else if (test_mode == PCRE16_MODE) \
|
||||
pcre2_set_substitute_callout_16(G(a,16), \
|
||||
(void (*)(pcre2_substitute_callout_block_16 *, void *))b,c); \
|
||||
(int (*)(pcre2_substitute_callout_block_16 *, void *))b,c); \
|
||||
else \
|
||||
pcre2_set_substitute_callout_32(G(a,32), \
|
||||
(void (*)(pcre2_substitute_callout_block_32 *, void *))b,c)
|
||||
(int (*)(pcre2_substitute_callout_block_32 *, void *))b,c)
|
||||
|
||||
#define PCRE2_SUBSTITUTE(a,b,c,d,e,f,g,h,i,j,k,l) \
|
||||
if (test_mode == PCRE8_MODE) \
|
||||
|
@ -1850,10 +1856,10 @@ the three different cases. */
|
|||
#define PCRE2_SET_SUBSTITUTE_CALLOUT(a,b,c) \
|
||||
if (test_mode == G(G(PCRE,BITONE),_MODE)) \
|
||||
G(pcre2_set_substitute_callout_,BITONE)(G(a,BITONE), \
|
||||
(void (*)(G(pcre2_substitute_callout_block_,BITONE) *, void *))b,c); \
|
||||
(int (*)(G(pcre2_substitute_callout_block_,BITONE) *, void *))b,c); \
|
||||
else \
|
||||
G(pcre2_set_substitute_callout_,BITTWO)(G(a,BITTWO), \
|
||||
(void (*)(G(pcre2_substitute_callout_block_,BITTWO) *, void *))b,c)
|
||||
(int (*)(G(pcre2_substitute_callout_block_,BITTWO) *, void *))b,c)
|
||||
|
||||
#define PCRE2_SUBSTITUTE(a,b,c,d,e,f,g,h,i,j,k,l) \
|
||||
if (test_mode == G(G(PCRE,BITONE),_MODE)) \
|
||||
|
@ -2058,7 +2064,7 @@ the three different cases. */
|
|||
#define PCRE2_SET_PARENS_NEST_LIMIT(a,b) pcre2_set_parens_nest_limit_8(G(a,8),b)
|
||||
#define PCRE2_SET_SUBSTITUTE_CALLOUT(a,b,c) \
|
||||
pcre2_set_substitute_callout_8(G(a,8), \
|
||||
(void (*)(pcre2_substitute_callout_block_8 *, void *))b,c)
|
||||
(int (*)(pcre2_substitute_callout_block_8 *, void *))b,c)
|
||||
#define PCRE2_SUBSTITUTE(a,b,c,d,e,f,g,h,i,j,k,l) \
|
||||
a = pcre2_substitute_8(G(b,8),(PCRE2_SPTR8)c,d,e,f,G(g,8),G(h,8), \
|
||||
(PCRE2_SPTR8)i,j,(PCRE2_UCHAR8 *)k,l)
|
||||
|
@ -2165,7 +2171,7 @@ the three different cases. */
|
|||
#define PCRE2_SET_PARENS_NEST_LIMIT(a,b) pcre2_set_parens_nest_limit_16(G(a,16),b)
|
||||
#define PCRE2_SET_SUBSTITUTE_CALLOUT(a,b,c) \
|
||||
pcre2_set_substitute_callout_16(G(a,16), \
|
||||
(void (*)(pcre2_substitute_callout_block_16 *, void *))b,c)
|
||||
(int (*)(pcre2_substitute_callout_block_16 *, void *))b,c)
|
||||
#define PCRE2_SUBSTITUTE(a,b,c,d,e,f,g,h,i,j,k,l) \
|
||||
a = pcre2_substitute_16(G(b,16),(PCRE2_SPTR16)c,d,e,f,G(g,16),G(h,16), \
|
||||
(PCRE2_SPTR16)i,j,(PCRE2_UCHAR16 *)k,l)
|
||||
|
@ -2272,7 +2278,7 @@ the three different cases. */
|
|||
#define PCRE2_SET_PARENS_NEST_LIMIT(a,b) pcre2_set_parens_nest_limit_32(G(a,32),b)
|
||||
#define PCRE2_SET_SUBSTITUTE_CALLOUT(a,b,c) \
|
||||
pcre2_set_substitute_callout_32(G(a,32), \
|
||||
(void (*)(pcre2_substitute_callout_block_32 *, void *))b,c)
|
||||
(int (*)(pcre2_substitute_callout_block_32 *, void *))b,c)
|
||||
#define PCRE2_SUBSTITUTE(a,b,c,d,e,f,g,h,i,j,k,l) \
|
||||
a = pcre2_substitute_32(G(b,32),(PCRE2_SPTR32)c,d,e,f,G(g,32),G(h,32), \
|
||||
(PCRE2_SPTR32)i,j,(PCRE2_UCHAR32 *)k,l)
|
||||
|
@ -5955,17 +5961,40 @@ Arguments:
|
|||
Returns: nothing
|
||||
*/
|
||||
|
||||
static void
|
||||
static int
|
||||
substitute_callout_function(pcre2_substitute_callout_block_8 *scb,
|
||||
void *data_ptr)
|
||||
{
|
||||
int yield = 0;
|
||||
BOOL utf = (FLD(compiled_code, overall_options) & PCRE2_UTF) != 0;
|
||||
(void)data_ptr; /* Not used */
|
||||
fprintf(outfile, "Old %" SIZ_FORM " %" SIZ_FORM " New %" SIZ_FORM
|
||||
" %" SIZ_FORM "\n",
|
||||
SIZ_CAST scb->input_offsets[0],
|
||||
SIZ_CAST scb->input_offsets[1],
|
||||
SIZ_CAST scb->output_offsets[0],
|
||||
SIZ_CAST scb->output_offsets[1]);
|
||||
|
||||
fprintf(outfile, "%2d(%d) Old %" SIZ_FORM " %" SIZ_FORM " \"",
|
||||
scb->subscount, scb->oveccount,
|
||||
SIZ_CAST scb->ovector[0], SIZ_CAST scb->ovector[1]);
|
||||
|
||||
PCHARSV(scb->input, scb->ovector[0], scb->ovector[1] - scb->ovector[0],
|
||||
utf, outfile);
|
||||
|
||||
fprintf(outfile, "\" New %" SIZ_FORM " %" SIZ_FORM " \"",
|
||||
SIZ_CAST scb->output_offsets[0], SIZ_CAST scb->output_offsets[1]);
|
||||
|
||||
PCHARSV(scb->output, scb->output_offsets[0],
|
||||
scb->output_offsets[1] - scb->output_offsets[0], utf, outfile);
|
||||
|
||||
if (scb->subscount == dat_datctl.substitute_stop)
|
||||
{
|
||||
yield = -1;
|
||||
fprintf(outfile, " STOPPED");
|
||||
}
|
||||
else if (scb->subscount == dat_datctl.substitute_skip)
|
||||
{
|
||||
yield = +1;
|
||||
fprintf(outfile, " SKIPPED");
|
||||
}
|
||||
|
||||
fprintf(outfile, "\"\n");
|
||||
return yield;
|
||||
}
|
||||
|
||||
|
||||
|
@ -6494,6 +6523,11 @@ dat_datctl.control2 |= (pat_patctl.control2 & CTL2_ALLPD);
|
|||
strcpy((char *)dat_datctl.replacement, (char *)pat_patctl.replacement);
|
||||
if (dat_datctl.jitstack == 0) dat_datctl.jitstack = pat_patctl.jitstack;
|
||||
|
||||
if (dat_datctl.substitute_skip == 0)
|
||||
dat_datctl.substitute_skip = pat_patctl.substitute_skip;
|
||||
if (dat_datctl.substitute_stop == 0)
|
||||
dat_datctl.substitute_stop = pat_patctl.substitute_stop;
|
||||
|
||||
/* Initialize for scanning the data line. */
|
||||
|
||||
#ifdef SUPPORT_PCRE2_8
|
||||
|
@ -6833,6 +6867,11 @@ arg_ulen = ulen; /* Value to use in match arg */
|
|||
if (p[-1] != 0 && !decode_modifiers(p, CTX_DAT, NULL, &dat_datctl))
|
||||
return PR_OK;
|
||||
|
||||
/* Setting substitute_{skip,stop} implies a substitute callout. */
|
||||
|
||||
if (dat_datctl.substitute_skip != 0 || dat_datctl.substitute_stop != 0)
|
||||
dat_datctl.control2 |= CTL2_SUBSTITUTE_CALLOUT;
|
||||
|
||||
/* Check for mutually exclusive modifiers. At present, these are all in the
|
||||
first control word. */
|
||||
|
||||
|
|
|
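
The two pcre2test hunks above combine to give this behaviour: a data-line value of zero falls back to the pattern's value, and any non-zero skip or stop turns the substitute callout on. The sketch below models that in one function for illustration only. In the real program the two steps happen at different points in data-line processing. `demo_ctl`, `demo_merge` and the `DEMO_SUBSTITUTE_CALLOUT` bit are hypothetical stand-ins for `datctl`/`patctl` and `CTL2_SUBSTITUTE_CALLOUT`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_SUBSTITUTE_CALLOUT 0x01u  /* hypothetical flag bit */

/* Hypothetical reduced control block with only the fields of interest. */
typedef struct demo_ctl {
  uint32_t control2;
  uint32_t substitute_skip;
  uint32_t substitute_stop;
} demo_ctl;

/* Fill unset (zero) data-line values from the pattern, then enable the
   callout if either modifier ended up non-zero. */
void demo_merge(demo_ctl *dat, const demo_ctl *pat)
{
  assert(dat != NULL && pat != NULL);
  if (dat->substitute_skip == 0) dat->substitute_skip = pat->substitute_skip;
  if (dat->substitute_stop == 0) dat->substitute_stop = pat->substitute_stop;
  if (dat->substitute_skip != 0 || dat->substitute_stop != 0)
    dat->control2 |= DEMO_SUBSTITUTE_CALLOUT;
}
```

This models the documentation example in which substitute_skip=1 on the pattern and substitute_stop=1 on the data line are both in effect for the same subject.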
@ -5516,6 +5516,21 @@ a)"xI
|
|||
|
||||
/a(b)c|xyz/g,replace=<$0>,substitute_callout
|
||||
abcdefabcpqr
|
||||
abxyzpqrabcxyz
|
||||
12abc34xyz99abc55\=substitute_stop=2
|
||||
12abc34xyz99abc55\=substitute_skip=1
|
||||
12abc34xyz99abc55\=substitute_skip=2
|
||||
|
||||
/a(b)c|xyz/g,replace=<$0>
|
||||
abcdefabcpqr
|
||||
abxyzpqrabcxyz
|
||||
12abc34xyz\=substitute_stop=2
|
||||
12abc34xyz\=substitute_skip=1
|
||||
|
||||
/a(b)c|xyz/replace=<$0>
|
||||
abcdefabcpqr
|
||||
12abc34xyz\=substitute_skip=1
|
||||
12abc34xyz\=substitute_stop=1
|
||||
|
||||
/abc\rdef/
|
||||
abc\ndef
|
||||
|
|
|
@ -1630,10 +1630,10 @@ No match
|
|||
|
||||
/(?<=abc)(|def)/g,utf,replace=<$0>,substitute_callout
|
||||
123abcáyzabcdef789abcሴqr
|
||||
Old 6 6 New 6 8
|
||||
Old 13 13 New 15 17
|
||||
Old 13 16 New 17 22
|
||||
Old 22 22 New 28 30
|
||||
1(2) Old 6 6 "" New 6 8 "<>"
|
||||
2(2) Old 13 13 "" New 15 17 "<>"
|
||||
3(2) Old 13 16 "def" New 17 22 "<def>"
|
||||
4(2) Old 22 22 "" New 28 30 "<>"
|
||||
4: 123abc<>\x{e1}yzabc<><def>789abc<>\x{1234}qr
|
||||
|
||||
# End of testinput10
|
||||
|
|
|
@ -1475,10 +1475,10 @@ No match
|
|||
|
||||
/(?<=abc)(|def)/g,utf,replace=<$0>,substitute_callout
|
||||
123abcáyzabcdef789abcሴqr
|
||||
Old 6 6 New 6 8
|
||||
Old 12 12 New 14 16
|
||||
Old 12 15 New 16 21
|
||||
Old 21 21 New 27 29
|
||||
1(2) Old 6 6 "" New 6 8 "<>"
|
||||
2(2) Old 12 12 "" New 14 16 "<>"
|
||||
3(2) Old 12 15 "def" New 16 21 "<def>"
|
||||
4(2) Old 21 21 "" New 27 29 "<>"
|
||||
4: 123abc<>\x{e1}yzabc<><def>789abc<>\x{1234}qr
|
||||
|
||||
# A few script run tests in non-UTF mode (but they need Unicode support)
|
||||
|
|
|
@ -1472,10 +1472,10 @@ No match
|
|||
|
||||
/(?<=abc)(|def)/g,utf,replace=<$0>,substitute_callout
|
||||
123abcáyzabcdef789abcሴqr
|
||||
Old 6 6 New 6 8
|
||||
Old 12 12 New 14 16
|
||||
Old 12 15 New 16 21
|
||||
Old 21 21 New 27 29
|
||||
1(2) Old 6 6 "" New 6 8 "<>"
|
||||
2(2) Old 12 12 "" New 14 16 "<>"
|
||||
3(2) Old 12 15 "def" New 16 21 "<def>"
|
||||
4(2) Old 21 21 "" New 27 29 "<>"
|
||||
4: 123abc<>\x{e1}yzabc<><def>789abc<>\x{1234}qr
|
||||
|
||||
# A few script run tests in non-UTF mode (but they need Unicode support)
|
||||
|
|
|
@ -16797,9 +16797,52 @@ Subject length lower bound = 1
|
|||
|
||||
/a(b)c|xyz/g,replace=<$0>,substitute_callout
|
||||
abcdefabcpqr
|
||||
Old 0 3 New 0 5
|
||||
Old 6 9 New 8 13
|
||||
1(2) Old 0 3 "abc" New 0 5 "<abc>"
|
||||
2(2) Old 6 9 "abc" New 8 13 "<abc>"
|
||||
2: <abc>def<abc>pqr
|
||||
abxyzpqrabcxyz
|
||||
1(1) Old 2 5 "xyz" New 2 7 "<xyz>"
|
||||
2(2) Old 8 11 "abc" New 10 15 "<abc>"
|
||||
3(1) Old 11 14 "xyz" New 15 20 "<xyz>"
|
||||
3: ab<xyz>pqr<abc><xyz>
|
||||
12abc34xyz99abc55\=substitute_stop=2
|
||||
1(2) Old 2 5 "abc" New 2 7 "<abc>"
|
||||
2(1) Old 7 10 "xyz" New 9 14 "<xyz> STOPPED"
|
||||
2: 12<abc>34xyz99abc55
|
||||
12abc34xyz99abc55\=substitute_skip=1
|
||||
1(2) Old 2 5 "abc" New 2 7 "<abc> SKIPPED"
|
||||
2(1) Old 7 10 "xyz" New 7 12 "<xyz>"
|
||||
3(2) Old 12 15 "abc" New 14 19 "<abc>"
|
||||
3: 12abc34<xyz>99<abc>55
|
||||
12abc34xyz99abc55\=substitute_skip=2
|
||||
1(2) Old 2 5 "abc" New 2 7 "<abc>"
|
||||
2(1) Old 7 10 "xyz" New 9 14 "<xyz> SKIPPED"
|
||||
3(2) Old 12 15 "abc" New 14 19 "<abc>"
|
||||
3: 12<abc>34xyz99<abc>55
|
||||
|
||||
/a(b)c|xyz/g,replace=<$0>
|
||||
abcdefabcpqr
|
||||
2: <abc>def<abc>pqr
|
||||
abxyzpqrabcxyz
|
||||
3: ab<xyz>pqr<abc><xyz>
|
||||
12abc34xyz\=substitute_stop=2
|
||||
1(2) Old 2 5 "abc" New 2 7 "<abc>"
|
||||
2(1) Old 7 10 "xyz" New 9 14 "<xyz> STOPPED"
|
||||
2: 12<abc>34xyz
|
||||
12abc34xyz\=substitute_skip=1
|
||||
1(2) Old 2 5 "abc" New 2 7 "<abc> SKIPPED"
|
||||
2(1) Old 7 10 "xyz" New 7 12 "<xyz>"
|
||||
2: 12abc34<xyz>
|
||||
|
||||
/a(b)c|xyz/replace=<$0>
|
||||
abcdefabcpqr
|
||||
1: <abc>defabcpqr
|
||||
12abc34xyz\=substitute_skip=1
|
||||
1(2) Old 2 5 "abc" New 2 7 "<abc> SKIPPED"
|
||||
1: 12abc34xyz
|
||||
12abc34xyz\=substitute_stop=1
|
||||
1(2) Old 2 5 "abc" New 2 7 "<abc> STOPPED"
|
||||
1: 12abc34xyz
|
||||
|
||||
/abc\rdef/
|
||||
abc\ndef
|
||||
|
|