Upgrade the as yet unreleased substitute callout facility.
parent
900f457222
commit
9bc81d5229
|
@ -20,7 +20,7 @@ SYNOPSIS
|
||||||
</P>
|
</P>
|
||||||
<P>
|
<P>
|
||||||
<b>int pcre2_set_substitute_callout(pcre2_match_context *<i>mcontext</i>,</b>
|
<b>int pcre2_set_substitute_callout(pcre2_match_context *<i>mcontext</i>,</b>
|
||||||
<b> void (*<i>callout_function</i>)(pcre2_substitute_callout_block *),</b>
|
<b> int (*<i>callout_function</i>)(pcre2_substitute_callout_block *),</b>
|
||||||
<b> void *<i>callout_data</i>);</b>
|
<b> void *<i>callout_data</i>);</b>
|
||||||
</P>
|
</P>
|
||||||
<br><b>
|
<br><b>
|
||||||
|
|
|
@ -183,7 +183,7 @@ document for an overview of all the PCRE2 documentation.
|
||||||
<br>
|
<br>
|
||||||
<br>
|
<br>
|
||||||
<b>int pcre2_set_substitute_callout(pcre2_match_context *<i>mcontext</i>,</b>
|
<b>int pcre2_set_substitute_callout(pcre2_match_context *<i>mcontext</i>,</b>
|
||||||
<b> void (*<i>callout_function</i>)(pcre2_substitute_callout_block *, void *),</b>
|
<b> int (*<i>callout_function</i>)(pcre2_substitute_callout_block *, void *),</b>
|
||||||
<b> void *<i>callout_data</i>);</b>
|
<b> void *<i>callout_data</i>);</b>
|
||||||
<br>
|
<br>
|
||||||
<br>
|
<br>
|
||||||
|
@ -924,7 +924,7 @@ documentation.
|
||||||
<br>
|
<br>
|
||||||
<br>
|
<br>
|
||||||
<b>int pcre2_set_substitute_callout(pcre2_match_context *<i>mcontext</i>,</b>
|
<b>int pcre2_set_substitute_callout(pcre2_match_context *<i>mcontext</i>,</b>
|
||||||
<b> void (*<i>callout_function</i>)(pcre2_substitute_callout_block *, void *),</b>
|
<b> int (*<i>callout_function</i>)(pcre2_substitute_callout_block *, void *),</b>
|
||||||
<b> void *<i>callout_data</i>);</b>
|
<b> void *<i>callout_data</i>);</b>
|
||||||
<br>
|
<br>
|
||||||
<br>
|
<br>
|
||||||
|
@ -3413,9 +3413,9 @@ substitutions. However, PCRE2_SUBSTITUTE_UNKNOWN_UNSET does cause unknown
|
||||||
groups in the extended syntax forms to be treated as unset.
|
groups in the extended syntax forms to be treated as unset.
|
||||||
</P>
|
</P>
|
||||||
<P>
|
<P>
|
||||||
If successful, <b>pcre2_substitute()</b> returns the number of replacements that
|
If successful, <b>pcre2_substitute()</b> returns the number of successful
|
||||||
were made. This may be zero if no matches were found, and is never greater than
|
matches. This may be zero if no matches were found, and is never greater than 1
|
||||||
1 unless PCRE2_SUBSTITUTE_GLOBAL is set.
|
unless PCRE2_SUBSTITUTE_GLOBAL is set.
|
||||||
</P>
|
</P>
|
||||||
<P>
|
<P>
|
||||||
In the event of an error, a negative error code is returned. Except for
|
In the event of an error, a negative error code is returned. Except for
|
||||||
|
@ -3457,16 +3457,16 @@ Substitution callouts
|
||||||
</b><br>
|
</b><br>
|
||||||
<P>
|
<P>
|
||||||
<b>int pcre2_set_substitute_callout(pcre2_match_context *<i>mcontext</i>,</b>
|
<b>int pcre2_set_substitute_callout(pcre2_match_context *<i>mcontext</i>,</b>
|
||||||
<b> void (*<i>callout_function</i>)(pcre2_substitute_callout_block *, void *),</b>
|
<b> int (*<i>callout_function</i>)(pcre2_substitute_callout_block *, void *),</b>
|
||||||
<b> void *<i>callout_data</i>);</b>
|
<b> void *<i>callout_data</i>);</b>
|
||||||
<br>
|
<br>
|
||||||
<br>
|
<br>
|
||||||
The <b>pcre2_set_substitute_callout()</b> function can be used to specify a
|
The <b>pcre2_set_substitute_callout()</b> function can be used to specify a
|
||||||
callout function for <b>pcre2_substitute()</b>. This information is passed in
|
callout function for <b>pcre2_substitute()</b>. This information is passed in
|
||||||
a match context. The callout function is called after each substitution. It is
|
a match context. The callout function is called after each substitution has
|
||||||
not called for simulated substitutions that happen as a result of the
|
been processed, but it can cause the replacement not to happen. The callout
|
||||||
PCRE2_SUBSTITUTE_OVERFLOW_LENGTH option. A callout function should not return
|
function is not called for simulated substitutions that happen as a result of
|
||||||
any value.
|
the PCRE2_SUBSTITUTE_OVERFLOW_LENGTH option.
|
||||||
</P>
|
</P>
|
||||||
<P>
|
<P>
|
||||||
The first argument of the callout function is a pointer to a substitute callout
|
The first argument of the callout function is a pointer to a substitute callout
|
||||||
|
@ -3474,7 +3474,11 @@ block structure, which contains the following fields, not necessarily in this
|
||||||
order:
|
order:
|
||||||
<pre>
|
<pre>
|
||||||
uint32_t <i>version</i>;
|
uint32_t <i>version</i>;
|
||||||
PCRE2_SIZE <i>input_offsets[2]</i>;
|
uint32_t <i>subscount</i>;
|
||||||
|
PCRE2_SPTR <i>input</i>;
|
||||||
|
PCRE2_SPTR <i>output</i>;
|
||||||
|
PCRE2_SIZE <i>*ovector</i>;
|
||||||
|
uint32_t <i>oveccount</i>;
|
||||||
PCRE2_SIZE <i>output_offsets[2]</i>;
|
PCRE2_SIZE <i>output_offsets[2]</i>;
|
||||||
</pre>
|
</pre>
|
||||||
The <i>version</i> field contains the version number of the block format. The
|
The <i>version</i> field contains the version number of the block format. The
|
||||||
|
@ -3482,13 +3486,34 @@ current version is 0. The version number will increase in future if more fields
|
||||||
are added, but the intention is never to remove any of the existing fields.
|
are added, but the intention is never to remove any of the existing fields.
|
||||||
</P>
|
</P>
|
||||||
<P>
|
<P>
|
||||||
The <i>input_offsets</i> vector contains the code unit offsets in the input
|
The <i>subscount</i> field is the number of the current match. It is 1 for the
|
||||||
string of the matched substring, and the <i>output_offsets</i> vector contains
|
first callout, 2 for the second, and so on. The <i>input</i> and <i>output</i>
|
||||||
the offsets of the replacement in the output string.
|
pointers are copies of the values passed to <b>pcre2_substitute()</b>.
|
||||||
|
</P>
|
||||||
|
<P>
|
||||||
|
The <i>ovector</i> field points to the ovector, which contains the result of the
|
||||||
|
most recent match. The <i>oveccount</i> field contains the number of pairs that
|
||||||
|
are set in the ovector, and is always greater than zero.
|
||||||
|
</P>
|
||||||
|
<P>
|
||||||
|
The <i>output_offsets</i> vector contains the offsets of the replacement in the
|
||||||
|
output string. This has already been processed for dollar and (if requested)
|
||||||
|
backslash substitutions as described above.
|
||||||
</P>
|
</P>
|
||||||
<P>
|
<P>
|
||||||
The second argument of the callout function is the value passed as
|
The second argument of the callout function is the value passed as
|
||||||
<i>callout_data</i> when the function was registered.
|
<i>callout_data</i> when the function was registered. The value returned by the
|
||||||
|
callout function is interpreted as follows:
|
||||||
|
</P>
|
||||||
|
<P>
|
||||||
|
If the value is zero, the replacement is accepted, and, if
|
||||||
|
PCRE2_SUBSTITUTE_GLOBAL is set, processing continues with a search for the next
|
||||||
|
match. If the value is not zero, the current replacement is not accepted. If
|
||||||
|
the value is greater than zero, processing continues when
|
||||||
|
PCRE2_SUBSTITUTE_GLOBAL is set. Otherwise (the value is less than zero or
|
||||||
|
PCRE2_SUBSTITUTE_GLOBAL is not set), the rest of the input is copied to the
|
||||||
|
output and the call to <b>pcre2_substitute()</b> exits, returning the number of
|
||||||
|
matches so far.
|
||||||
</P>
|
</P>
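The return-value rules above can be summarised as a small decision function. This is only an illustrative sketch: the names callout_outcome and interpret_callout_return are hypothetical and are not part of the PCRE2 API, and the global flag stands in for whether PCRE2_SUBSTITUTE_GLOBAL was set.

```c
#include <stdbool.h>

/* Hypothetical model of how a substitute callout's return value is
   interpreted. These names are assumptions for illustration only;
   they do not exist in PCRE2. */
typedef struct {
  bool replacement_accepted;  /* the replacement is kept in the output */
  bool continue_matching;     /* a search for the next match takes place */
} callout_outcome;

static callout_outcome interpret_callout_return(int rc, bool global)
{
  callout_outcome outcome;

  /* Only a zero return accepts the current replacement. */
  outcome.replacement_accepted = (rc == 0);

  /* Processing continues only for a global substitution and a
     non-negative return; otherwise the rest of the input is copied
     to the output and the call exits. */
  outcome.continue_matching = global && (rc >= 0);

  return outcome;
}
```

A real callout function would simply return 0, a positive value, or a negative value; the model only makes explicit what each of those choices leads to.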
|
||||||
<br><a name="SEC37" href="#TOC1">DUPLICATE SUBPATTERN NAMES</a><br>
|
<br><a name="SEC37" href="#TOC1">DUPLICATE SUBPATTERN NAMES</a><br>
|
||||||
<P>
|
<P>
|
||||||
|
@ -3757,7 +3782,7 @@ Cambridge, England.
|
||||||
</P>
|
</P>
|
||||||
<br><a name="SEC42" href="#TOC1">REVISION</a><br>
|
<br><a name="SEC42" href="#TOC1">REVISION</a><br>
|
||||||
<P>
|
<P>
|
||||||
Last updated: 19 October 2018
|
Last updated: 12 November 2018
|
||||||
<br>
|
<br>
|
||||||
Copyright © 1997-2018 University of Cambridge.
|
Copyright © 1997-2018 University of Cambridge.
|
||||||
<br>
|
<br>
|
||||||
|
|
|
@ -1052,7 +1052,9 @@ process.
|
||||||
startchar show starting character when relevant
|
startchar show starting character when relevant
|
||||||
substitute_callout use substitution callouts
|
substitute_callout use substitution callouts
|
||||||
substitute_extended use PCRE2_SUBSTITUTE_EXTENDED
|
substitute_extended use PCRE2_SUBSTITUTE_EXTENDED
|
||||||
|
substitute_skip=<n> skip substitution number n
|
||||||
substitute_overflow_length use PCRE2_SUBSTITUTE_OVERFLOW_LENGTH
|
substitute_overflow_length use PCRE2_SUBSTITUTE_OVERFLOW_LENGTH
|
||||||
|
substitute_stop=<n> skip substitution number n and greater
|
||||||
substitute_unknown_unset use PCRE2_SUBSTITUTE_UNKNOWN_UNSET
|
substitute_unknown_unset use PCRE2_SUBSTITUTE_UNKNOWN_UNSET
|
||||||
substitute_unset_empty use PCRE2_SUBSTITUTE_UNSET_EMPTY
|
substitute_unset_empty use PCRE2_SUBSTITUTE_UNSET_EMPTY
|
||||||
</pre>
|
</pre>
|
||||||
|
@ -1220,7 +1222,9 @@ pattern.
|
||||||
startoffset=<n> same as offset=<n>
|
startoffset=<n> same as offset=<n>
|
||||||
substitute_callout use substitution callouts
|
substitute_callout use substitution callouts
|
||||||
substitute_extended use PCRE2_SUBSTITUTE_EXTENDED
|
substitute_extended use PCRE2_SUBSTITUTE_EXTENDED
|
||||||
|
substitute_skip=<n> skip substitution number n
|
||||||
substitute_overflow_length use PCRE2_SUBSTITUTE_OVERFLOW_LENGTH
|
substitute_overflow_length use PCRE2_SUBSTITUTE_OVERFLOW_LENGTH
|
||||||
|
substitute_stop=<n> skip substitution number n and greater
|
||||||
substitute_unknown_unset use PCRE2_SUBSTITUTE_UNKNOWN_UNSET
|
substitute_unknown_unset use PCRE2_SUBSTITUTE_UNKNOWN_UNSET
|
||||||
substitute_unset_empty use PCRE2_SUBSTITUTE_UNSET_EMPTY
|
substitute_unset_empty use PCRE2_SUBSTITUTE_UNSET_EMPTY
|
||||||
zero_terminate pass the subject as zero-terminated
|
zero_terminate pass the subject as zero-terminated
|
||||||
|
@ -1410,16 +1414,6 @@ simple example of a substitution test:
|
||||||
=abc=abc=\=global
|
=abc=abc=\=global
|
||||||
2: =xxx=xxx=
|
2: =xxx=xxx=
|
||||||
</pre>
|
</pre>
|
||||||
If the <b>substitute_callout</b> modifier is set, a substitution callout
|
|
||||||
function is set up. When it is called (after each substitution), the offsets in
|
|
||||||
the input and output strings are output. For example:
|
|
||||||
<pre>
|
|
||||||
/abc/g,replace=<$0>,substitute_callout
|
|
||||||
abcdefabcpqr
|
|
||||||
Old 0 3 New 0 5
|
|
||||||
Old 6 9 New 8 13
|
|
||||||
2: <abc>def<abc>pqr
|
|
||||||
</pre>
|
|
||||||
Subject and replacement strings should be kept relatively short (fewer than 256
|
Subject and replacement strings should be kept relatively short (fewer than 256
|
||||||
characters) for substitution tests, as fixed-size buffers are used. To make it
|
characters) for substitution tests, as fixed-size buffers are used. To make it
|
||||||
easy to test for buffer overflow, if the replacement string starts with a
|
easy to test for buffer overflow, if the replacement string starts with a
|
||||||
|
@ -1451,6 +1445,47 @@ matching provokes an error return ("bad option value") from
|
||||||
<b>pcre2_substitute()</b>.
|
<b>pcre2_substitute()</b>.
|
||||||
</P>
|
</P>
|
||||||
<br><b>
|
<br><b>
|
||||||
|
Testing substitute callouts
|
||||||
|
</b><br>
|
||||||
|
<P>
|
||||||
|
If the <b>substitute_callout</b> modifier is set, a substitution callout
|
||||||
|
function is set up. When it is called (after each substitution), details of the
|
||||||
|
input and output strings are output. For example:
|
||||||
|
<pre>
|
||||||
|
/abc/g,replace=<$0>,substitute_callout
|
||||||
|
abcdefabcpqr
|
||||||
|
1(1) Old 0 3 "abc" New 0 5 "<abc>"
|
||||||
|
2(1) Old 6 9 "abc" New 8 13 "<abc>"
|
||||||
|
2: <abc>def<abc>pqr
|
||||||
|
</pre>
|
||||||
|
The first number on each callout line is the count of matches. The
|
||||||
|
parenthesized number is the number of pairs that are set in the ovector (that
|
||||||
|
is, one more than the number of capturing groups that were set). These are
|
||||||
|
followed by the offsets of the old substring, its contents, and the same for the
|
||||||
|
replacement.
|
||||||
|
</P>
|
||||||
|
<P>
|
||||||
|
By default, the substitution callout function returns zero, which accepts the
|
||||||
|
replacement and causes matching to continue if /g was used. Two further
|
||||||
|
modifiers can be used to test other return values. If <b>substitute_skip</b> is
|
||||||
|
set to a value greater than zero, the callout function returns +1 for the match
|
||||||
|
of that number, and similarly <b>substitute_stop</b> returns -1. These cause the
|
||||||
|
replacement to be rejected, and -1 causes no further matching to take place. If
|
||||||
|
either of them is set, <b>substitute_callout</b> is assumed. For example:
|
||||||
|
<pre>
|
||||||
|
/abc/g,replace=<$0>,substitute_skip=1
|
||||||
|
abcdefabcpqr
|
||||||
|
1(1) Old 0 3 "abc" New 0 5 "<abc> SKIPPED"
|
||||||
|
2(1) Old 6 9 "abc" New 6 11 "<abc>"
|
||||||
|
2: abcdef<abc>pqr
|
||||||
|
abcdefabcpqr\=substitute_stop=1
|
||||||
|
1(1) Old 0 3 "abc" New 0 5 "<abc> STOPPED"
|
||||||
|
1: abcdefabcpqr
|
||||||
|
</pre>
|
||||||
|
If both are set for the same number, stop takes precedence. Only a single skip
|
||||||
|
or stop is supported, which is sufficient for testing that the feature works.
|
||||||
|
</P>
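As a rough model of the behaviour just described, the value returned by the test callout for a given match can be sketched as follows. This is not the pcre2test source: the function name test_callout_return and its parameters are hypothetical, with skip and stop standing in for the values of substitute_skip and substitute_stop (zero meaning "not set").

```c
/* Hypothetical model of the return value chosen by the pcre2test
   substitution callout. The name and parameters are assumptions for
   illustration; this is not code taken from pcre2test. */
static int test_callout_return(unsigned int subscount,
                               unsigned int skip,
                               unsigned int stop)
{
  /* Stop is checked first, so it takes precedence when both
     modifiers name the same match number. */
  if (stop != 0 && subscount == stop) return -1;

  /* Reject this one replacement but allow matching to continue. */
  if (skip != 0 && subscount == skip) return +1;

  /* Default: accept the replacement. */
  return 0;
}
```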
|
||||||
|
<br><b>
|
||||||
Setting the JIT stack size
|
Setting the JIT stack size
|
||||||
</b><br>
|
</b><br>
|
||||||
<P>
|
<P>
|
||||||
|
@ -2040,7 +2075,7 @@ Cambridge, England.
|
||||||
</P>
|
</P>
|
||||||
<br><a name="SEC21" href="#TOC1">REVISION</a><br>
|
<br><a name="SEC21" href="#TOC1">REVISION</a><br>
|
||||||
<P>
|
<P>
|
||||||
Last updated: 21 September 2018
|
Last updated: 12 November 2018
|
||||||
<br>
|
<br>
|
||||||
Copyright © 1997-2018 University of Cambridge.
|
Copyright © 1997-2018 University of Cambridge.
|
||||||
<br>
|
<br>
|
||||||
|
|
|
@ -294,7 +294,7 @@ PCRE2 NATIVE API MATCH CONTEXT FUNCTIONS
|
||||||
void *callout_data);
|
void *callout_data);
|
||||||
|
|
||||||
int pcre2_set_substitute_callout(pcre2_match_context *mcontext,
|
int pcre2_set_substitute_callout(pcre2_match_context *mcontext,
|
||||||
void (*callout_function)(pcre2_substitute_callout_block *, void *),
|
int (*callout_function)(pcre2_substitute_callout_block *, void *),
|
||||||
void *callout_data);
|
void *callout_data);
|
||||||
|
|
||||||
int pcre2_set_offset_limit(pcre2_match_context *mcontext,
|
int pcre2_set_offset_limit(pcre2_match_context *mcontext,
|
||||||
|
@ -942,7 +942,7 @@ PCRE2 CONTEXTS
|
||||||
umentation.
|
umentation.
|
||||||
|
|
||||||
int pcre2_set_substitute_callout(pcre2_match_context *mcontext,
|
int pcre2_set_substitute_callout(pcre2_match_context *mcontext,
|
||||||
void (*callout_function)(pcre2_substitute_callout_block *, void *),
|
int (*callout_function)(pcre2_substitute_callout_block *, void *),
|
||||||
void *callout_data);
|
void *callout_data);
|
||||||
|
|
||||||
This sets up a callout function for PCRE2 to call after each substitu-
|
This sets up a callout function for PCRE2 to call after each substitu-
|
||||||
|
@ -3318,8 +3318,8 @@ CREATING A NEW STRING WITH SUBSTITUTIONS
|
||||||
substitutions. However, PCRE2_SUBSTITUTE_UNKNOWN_UNSET does cause
|
substitutions. However, PCRE2_SUBSTITUTE_UNKNOWN_UNSET does cause
|
||||||
unknown groups in the extended syntax forms to be treated as unset.
|
unknown groups in the extended syntax forms to be treated as unset.
|
||||||
|
|
||||||
If successful, pcre2_substitute() returns the number of replacements
|
If successful, pcre2_substitute() returns the number of successful
|
||||||
that were made. This may be zero if no matches were found, and is never
|
matches. This may be zero if no matches were found, and is never
|
||||||
greater than 1 unless PCRE2_SUBSTITUTE_GLOBAL is set.
|
greater than 1 unless PCRE2_SUBSTITUTE_GLOBAL is set.
|
||||||
|
|
||||||
In the event of an error, a negative error code is returned. Except for
|
In the event of an error, a negative error code is returned. Except for
|
||||||
|
@ -3355,22 +3355,26 @@ CREATING A NEW STRING WITH SUBSTITUTIONS
|
||||||
Substitution callouts
|
Substitution callouts
|
||||||
|
|
||||||
int pcre2_set_substitute_callout(pcre2_match_context *mcontext,
|
int pcre2_set_substitute_callout(pcre2_match_context *mcontext,
|
||||||
void (*callout_function)(pcre2_substitute_callout_block *, void *),
|
int (*callout_function)(pcre2_substitute_callout_block *, void *),
|
||||||
void *callout_data);
|
void *callout_data);
|
||||||
|
|
||||||
The pcre2_set_substitute_callout() function can be used to specify a
|
The pcre2_set_substitute_callout() function can be used to specify a
|
||||||
callout function for pcre2_substitute(). This information is passed in
|
callout function for pcre2_substitute(). This information is passed in
|
||||||
a match context. The callout function is called after each substitu-
|
a match context. The callout function is called after each substitution
|
||||||
tion. It is not called for simulated substitutions that happen as a
|
has been processed, but it can cause the replacement not to happen. The
|
||||||
result of the PCRE2_SUBSTITUTE_OVERFLOW_LENGTH option. A callout func-
|
callout function is not called for simulated substitutions that happen
|
||||||
tion should not return any value.
|
as a result of the PCRE2_SUBSTITUTE_OVERFLOW_LENGTH option.
|
||||||
|
|
||||||
The first argument of the callout function is a pointer to a substitute
|
The first argument of the callout function is a pointer to a substitute
|
||||||
callout block structure, which contains the following fields, not nec-
|
callout block structure, which contains the following fields, not nec-
|
||||||
essarily in this order:
|
essarily in this order:
|
||||||
|
|
||||||
uint32_t version;
|
uint32_t version;
|
||||||
PCRE2_SIZE input_offsets[2];
|
uint32_t subscount;
|
||||||
|
PCRE2_SPTR input;
|
||||||
|
PCRE2_SPTR output;
|
||||||
|
PCRE2_SIZE *ovector;
|
||||||
|
uint32_t oveccount;
|
||||||
PCRE2_SIZE output_offsets[2];
|
PCRE2_SIZE output_offsets[2];
|
||||||
|
|
||||||
The version field contains the version number of the block format. The
|
The version field contains the version number of the block format. The
|
||||||
|
@ -3378,12 +3382,30 @@ CREATING A NEW STRING WITH SUBSTITUTIONS
|
||||||
more fields are added, but the intention is never to remove any of the
|
more fields are added, but the intention is never to remove any of the
|
||||||
existing fields.
|
existing fields.
|
||||||
|
|
||||||
The input_offsets vector contains the code unit offsets in the input
|
The subscount field is the number of the current match. It is 1 for the
|
||||||
string of the matched substring, and the output_offsets vector contains
|
first callout, 2 for the second, and so on. The input and output point-
|
||||||
the offsets of the replacement in the output string.
|
ers are copies of the values passed to pcre2_substitute().
|
||||||
|
|
||||||
|
The ovector field points to the ovector, which contains the result of
|
||||||
|
the most recent match. The oveccount field contains the number of pairs
|
||||||
|
that are set in the ovector, and is always greater than zero.
|
||||||
|
|
||||||
|
The output_offsets vector contains the offsets of the replacement in
|
||||||
|
the output string. This has already been processed for dollar and (if
|
||||||
|
requested) backslash substitutions as described above.
|
||||||
|
|
||||||
The second argument of the callout function is the value passed as
|
The second argument of the callout function is the value passed as
|
||||||
callout_data when the function was registered.
|
callout_data when the function was registered. The value returned by
|
||||||
|
the callout function is interpreted as follows:
|
||||||
|
|
||||||
|
If the value is zero, the replacement is accepted, and, if PCRE2_SUB-
|
||||||
|
STITUTE_GLOBAL is set, processing continues with a search for the next
|
||||||
|
match. If the value is not zero, the current replacement is not
|
||||||
|
accepted. If the value is greater than zero, processing continues when
|
||||||
|
PCRE2_SUBSTITUTE_GLOBAL is set. Otherwise (the value is less than zero
|
||||||
|
or PCRE2_SUBSTITUTE_GLOBAL is not set), the rest of the input is
|
||||||
|
copied to the output and the call to pcre2_substitute() exits, return-
|
||||||
|
ing the number of matches so far.
|
||||||
|
|
||||||
|
|
||||||
DUPLICATE SUBPATTERN NAMES
|
DUPLICATE SUBPATTERN NAMES
|
||||||
|
@ -3633,7 +3655,7 @@ AUTHOR
|
||||||
|
|
||||||
REVISION
|
REVISION
|
||||||
|
|
||||||
Last updated: 19 October 2018
|
Last updated: 12 November 2018
|
||||||
Copyright (c) 1997-2018 University of Cambridge.
|
Copyright (c) 1997-2018 University of Cambridge.
|
||||||
------------------------------------------------------------------------------
|
------------------------------------------------------------------------------
|
||||||
|
|
||||||
|
|
|
@ -1,4 +1,4 @@
|
||||||
.TH PCRE2_SET_SUBSTITUTE_CALLOUT 3 "17 September 2018" "PCRE2 10.33"
|
.TH PCRE2_SET_SUBSTITUTE_CALLOUT 3 "12 November 2018" "PCRE2 10.33"
|
||||||
.SH NAME
|
.SH NAME
|
||||||
PCRE2 - Perl-compatible regular expressions (revised API)
|
PCRE2 - Perl-compatible regular expressions (revised API)
|
||||||
.SH SYNOPSIS
|
.SH SYNOPSIS
|
||||||
|
@ -8,7 +8,7 @@ PCRE2 - Perl-compatible regular expressions (revised API)
|
||||||
.PP
|
.PP
|
||||||
.nf
|
.nf
|
||||||
.B int pcre2_set_substitute_callout(pcre2_match_context *\fImcontext\fP,
|
.B int pcre2_set_substitute_callout(pcre2_match_context *\fImcontext\fP,
|
||||||
.B " void (*\fIcallout_function\fP)(pcre2_substitute_callout_block *),"
|
.B " int (*\fIcallout_function\fP)(pcre2_substitute_callout_block *),"
|
||||||
.B " void *\fIcallout_data\fP);"
|
.B " void *\fIcallout_data\fP);"
|
||||||
.fi
|
.fi
|
||||||
.
|
.
|
||||||
|
|
|
@ -1,4 +1,4 @@
|
||||||
.TH PCRE2API 3 "19 October 2018" "PCRE2 10.33"
|
.TH PCRE2API 3 "12 November 2018" "PCRE2 10.33"
|
||||||
.SH NAME
|
.SH NAME
|
||||||
PCRE2 - Perl-compatible regular expressions (revised API)
|
PCRE2 - Perl-compatible regular expressions (revised API)
|
||||||
.sp
|
.sp
|
||||||
|
@ -124,7 +124,7 @@ document for an overview of all the PCRE2 documentation.
|
||||||
.B " void *\fIcallout_data\fP);"
|
.B " void *\fIcallout_data\fP);"
|
||||||
.sp
|
.sp
|
||||||
.B int pcre2_set_substitute_callout(pcre2_match_context *\fImcontext\fP,
|
.B int pcre2_set_substitute_callout(pcre2_match_context *\fImcontext\fP,
|
||||||
.B " void (*\fIcallout_function\fP)(pcre2_substitute_callout_block *, void *),"
|
.B " int (*\fIcallout_function\fP)(pcre2_substitute_callout_block *, void *),"
|
||||||
.B " void *\fIcallout_data\fP);"
|
.B " void *\fIcallout_data\fP);"
|
||||||
.sp
|
.sp
|
||||||
.B int pcre2_set_offset_limit(pcre2_match_context *\fImcontext\fP,
|
.B int pcre2_set_offset_limit(pcre2_match_context *\fImcontext\fP,
|
||||||
|
@ -860,7 +860,7 @@ documentation.
|
||||||
.sp
|
.sp
|
||||||
.nf
|
.nf
|
||||||
.B int pcre2_set_substitute_callout(pcre2_match_context *\fImcontext\fP,
|
.B int pcre2_set_substitute_callout(pcre2_match_context *\fImcontext\fP,
|
||||||
.B " void (*\fIcallout_function\fP)(pcre2_substitute_callout_block *, void *),"
|
.B " int (*\fIcallout_function\fP)(pcre2_substitute_callout_block *, void *),"
|
||||||
.B " void *\fIcallout_data\fP);"
|
.B " void *\fIcallout_data\fP);"
|
||||||
.fi
|
.fi
|
||||||
.sp
|
.sp
|
||||||
|
@ -3412,9 +3412,9 @@ The PCRE2_SUBSTITUTE_UNSET_EMPTY option does not affect these extended
|
||||||
substitutions. However, PCRE2_SUBSTITUTE_UNKNOWN_UNSET does cause unknown
|
substitutions. However, PCRE2_SUBSTITUTE_UNKNOWN_UNSET does cause unknown
|
||||||
groups in the extended syntax forms to be treated as unset.
|
groups in the extended syntax forms to be treated as unset.
|
||||||
.P
|
.P
|
||||||
If successful, \fBpcre2_substitute()\fP returns the number of replacements that
|
If successful, \fBpcre2_substitute()\fP returns the number of successful
|
||||||
were made. This may be zero if no matches were found, and is never greater than
|
matches. This may be zero if no matches were found, and is never greater than 1
|
||||||
1 unless PCRE2_SUBSTITUTE_GLOBAL is set.
|
unless PCRE2_SUBSTITUTE_GLOBAL is set.
|
||||||
.P
|
.P
|
||||||
In the event of an error, a negative error code is returned. Except for
|
In the event of an error, a negative error code is returned. Except for
|
||||||
PCRE2_ERROR_NOMATCH (which is never returned), errors from \fBpcre2_match()\fP
|
PCRE2_ERROR_NOMATCH (which is never returned), errors from \fBpcre2_match()\fP
|
||||||
|
@ -3454,35 +3454,57 @@ above).
|
||||||
.sp
|
.sp
|
||||||
.nf
|
.nf
|
||||||
.B int pcre2_set_substitute_callout(pcre2_match_context *\fImcontext\fP,
|
.B int pcre2_set_substitute_callout(pcre2_match_context *\fImcontext\fP,
|
||||||
.B " void (*\fIcallout_function\fP)(pcre2_substitute_callout_block *, void *),"
|
.B " int (*\fIcallout_function\fP)(pcre2_substitute_callout_block *, void *),"
|
||||||
.B " void *\fIcallout_data\fP);"
|
.B " void *\fIcallout_data\fP);"
|
||||||
.fi
|
.fi
|
||||||
.sp
|
.sp
|
||||||
The \fBpcre2_set_substitute_callout()\fP function can be used to specify a
|
The \fBpcre2_set_substitute_callout()\fP function can be used to specify a
|
||||||
callout function for \fBpcre2_substitute()\fP. This information is passed in
|
callout function for \fBpcre2_substitute()\fP. This information is passed in
|
||||||
a match context. The callout function is called after each substitution. It is
|
a match context. The callout function is called after each substitution has
|
||||||
not called for simulated substitutions that happen as a result of the
|
been processed, but it can cause the replacement not to happen. The callout
|
||||||
PCRE2_SUBSTITUTE_OVERFLOW_LENGTH option. A callout function should not return
|
function is not called for simulated substitutions that happen as a result of
|
||||||
any value.
|
the PCRE2_SUBSTITUTE_OVERFLOW_LENGTH option.
|
||||||
.P
|
.P
|
||||||
The first argument of the callout function is a pointer to a substitute callout
|
The first argument of the callout function is a pointer to a substitute callout
|
||||||
block structure, which contains the following fields, not necessarily in this
|
block structure, which contains the following fields, not necessarily in this
|
||||||
order:
|
order:
|
||||||
.sp
|
.sp
|
||||||
uint32_t \fIversion\fP;
|
uint32_t \fIversion\fP;
|
||||||
PCRE2_SIZE \fIinput_offsets[2]\fP;
|
uint32_t \fIsubscount\fP;
|
||||||
|
PCRE2_SPTR \fIinput\fP;
|
||||||
|
PCRE2_SPTR \fIoutput\fP;
|
||||||
|
PCRE2_SIZE \fI*ovector\fP;
|
||||||
|
uint32_t \fIoveccount\fP;
|
||||||
PCRE2_SIZE \fIoutput_offsets[2]\fP;
|
PCRE2_SIZE \fIoutput_offsets[2]\fP;
|
||||||
.sp
|
.sp
|
||||||
The \fIversion\fP field contains the version number of the block format. The
|
The \fIversion\fP field contains the version number of the block format. The
|
||||||
current version is 0. The version number will increase in future if more fields
|
current version is 0. The version number will increase in future if more fields
|
||||||
are added, but the intention is never to remove any of the existing fields.
|
are added, but the intention is never to remove any of the existing fields.
|
||||||
.P
|
.P
|
||||||
The \fIinput_offsets\fP vector contains the code unit offsets in the input
|
The \fIsubscount\fP field is the number of the current match. It is 1 for the
|
||||||
string of the matched substring, and the \fIoutput_offsets\fP vector contains
|
first callout, 2 for the second, and so on. The \fIinput\fP and \fIoutput\fP
|
||||||
the offsets of the replacement in the output string.
|
pointers are copies of the values passed to \fBpcre2_substitute()\fP.
|
||||||
|
.P
|
||||||
|
The \fIovector\fP field points to the ovector, which contains the result of the
|
||||||
|
most recent match. The \fIoveccount\fP field contains the number of pairs that
|
||||||
|
are set in the ovector, and is always greater than zero.
|
||||||
|
.P
|
||||||
|
The \fIoutput_offsets\fP vector contains the offsets of the replacement in the
|
||||||
|
output string. This has already been processed for dollar and (if requested)
|
||||||
|
backslash substitutions as described above.
|
||||||
.P
|
.P
|
||||||
The second argument of the callout function is the value passed as
|
The second argument of the callout function is the value passed as
|
||||||
\fIcallout_data\fP when the function was registered.
|
\fIcallout_data\fP when the function was registered. The value returned by the
|
||||||
|
callout function is interpreted as follows:
|
||||||
|
.P
|
||||||
|
If the value is zero, the replacement is accepted, and, if
|
||||||
|
PCRE2_SUBSTITUTE_GLOBAL is set, processing continues with a search for the next
|
||||||
|
match. If the value is not zero, the current replacement is not accepted. If
|
||||||
|
the value is greater than zero, processing continues when
|
||||||
|
PCRE2_SUBSTITUTE_GLOBAL is set. Otherwise (the value is less than zero or
|
||||||
|
PCRE2_SUBSTITUTE_GLOBAL is not set), the rest of the input is copied to the
|
||||||
|
output and the call to \fBpcre2_substitute()\fP exits, returning the number of
|
||||||
|
matches so far.
|
||||||
.
|
.
|
||||||
.
|
.
|
||||||
.SH "DUPLICATE SUBPATTERN NAMES"
|
.SH "DUPLICATE SUBPATTERN NAMES"
|
||||||
|
@ -3768,6 +3790,6 @@ Cambridge, England.
|
||||||
.rs
|
.rs
|
||||||
.sp
|
.sp
|
||||||
.nf
|
.nf
|
||||||
Last updated: 19 October 2018
|
Last updated: 12 November 2018
|
||||||
Copyright (c) 1997-2018 University of Cambridge.
|
Copyright (c) 1997-2018 University of Cambridge.
|
||||||
.fi
|
.fi
|
||||||
|
|
|
@ -1,4 +1,4 @@
|
||||||
.TH PCRE2TEST 1 "21 September 2018" "PCRE 10.33"
|
.TH PCRE2TEST 1 "12 November 2018" "PCRE 10.33"
|
||||||
.SH NAME
|
.SH NAME
|
||||||
pcre2test - a program for testing Perl-compatible regular expressions.
|
pcre2test - a program for testing Perl-compatible regular expressions.
|
||||||
.SH SYNOPSIS
|
.SH SYNOPSIS
|
||||||
|
@ -1014,7 +1014,9 @@ process.
|
||||||
startchar show starting character when relevant
|
startchar show starting character when relevant
|
||||||
substitute_callout use substitution callouts
|
substitute_callout use substitution callouts
|
||||||
substitute_extended use PCRE2_SUBSTITUTE_EXTENDED
|
substitute_extended use PCRE2_SUBSTITUTE_EXTENDED
|
||||||
|
substitute_skip=<n> skip substitution number n
|
||||||
substitute_overflow_length use PCRE2_SUBSTITUTE_OVERFLOW_LENGTH
|
substitute_overflow_length use PCRE2_SUBSTITUTE_OVERFLOW_LENGTH
|
||||||
|
substitute_stop=<n> skip substitution number n and greater
|
||||||
substitute_unknown_unset use PCRE2_SUBSTITUTE_UNKNOWN_UNSET
|
substitute_unknown_unset use PCRE2_SUBSTITUTE_UNKNOWN_UNSET
|
||||||
substitute_unset_empty use PCRE2_SUBSTITUTE_UNSET_EMPTY
|
substitute_unset_empty use PCRE2_SUBSTITUTE_UNSET_EMPTY
|
||||||
.sp
|
.sp
|
||||||
|
@ -1189,7 +1191,9 @@ pattern.
|
||||||
startoffset=<n> same as offset=<n>
|
startoffset=<n> same as offset=<n>
|
||||||
substitute_callout use substitution callouts
|
substitute_callout use substitution callouts
|
||||||
substitute_extended use PCRE2_SUBSTITUTE_EXTENDED
|
substitute_extended use PCRE2_SUBSTITUTE_EXTENDED
|
||||||
|
substitute_skip=<n> skip substitution number n
|
||||||
substitute_overflow_length use PCRE2_SUBSTITUTE_OVERFLOW_LENGTH
|
substitute_overflow_length use PCRE2_SUBSTITUTE_OVERFLOW_LENGTH
|
||||||
|
substitute_stop=<n> skip substitution number n and greater
|
||||||
substitute_unknown_unset use PCRE2_SUBSTITUTE_UNKNOWN_UNSET
|
substitute_unknown_unset use PCRE2_SUBSTITUTE_UNKNOWN_UNSET
|
||||||
substitute_unset_empty use PCRE2_SUBSTITUTE_UNSET_EMPTY
|
substitute_unset_empty use PCRE2_SUBSTITUTE_UNSET_EMPTY
|
||||||
zero_terminate pass the subject as zero-terminated
|
zero_terminate pass the subject as zero-terminated
|
||||||
|
@ -1377,16 +1381,6 @@ simple example of a substitution test:
|
||||||
=abc=abc=\e=global
|
=abc=abc=\e=global
|
||||||
2: =xxx=xxx=
|
2: =xxx=xxx=
|
||||||
.sp
|
.sp
|
||||||
If the \fBsubstitute_callout\fP modifier is set, a substitution callout
|
|
||||||
function is set up. When it is called (after each substitution), the offsets in
|
|
||||||
the input and output strings are output. For example:
|
|
||||||
.sp
|
|
||||||
/abc/g,replace=<$0>,substitute_callout
|
|
||||||
abcdefabcpqr
|
|
||||||
Old 0 3 New 0 5
|
|
||||||
Old 6 9 New 8 13
|
|
||||||
2: <abc>def<abc>pqr
|
|
||||||
.sp
|
|
||||||
Subject and replacement strings should be kept relatively short (fewer than 256
|
Subject and replacement strings should be kept relatively short (fewer than 256
|
||||||
characters) for substitution tests, as fixed-size buffers are used. To make it
|
characters) for substitution tests, as fixed-size buffers are used. To make it
|
||||||
easy to test for buffer overflow, if the replacement string starts with a
|
easy to test for buffer overflow, if the replacement string starts with a
|
||||||
|
@ -1418,6 +1412,46 @@ matching provokes an error return ("bad option value") from
|
||||||
\fBpcre2_substitute()\fP.
|
\fBpcre2_substitute()\fP.
|
||||||
.
|
.
|
||||||
.
|
.
|
||||||
|
.SS "Testing substitute callouts"
|
||||||
|
.rs
|
||||||
|
.sp
|
||||||
|
If the \fBsubstitute_callout\fP modifier is set, a substitution callout
|
||||||
|
function is set up. When it is called (after each substitution), details of the
|
||||||
|
input and output strings are output. For example:
|
||||||
|
.sp
|
||||||
|
/abc/g,replace=<$0>,substitute_callout
|
||||||
|
abcdefabcpqr
|
||||||
|
1(1) Old 0 3 "abc" New 0 5 "<abc>"
|
||||||
|
2(1) Old 6 9 "abc" New 8 13 "<abc>"
|
||||||
|
2: <abc>def<abc>pqr
|
||||||
|
.sp
|
||||||
|
The first number on each callout line is the count of matches. The
|
||||||
|
parenthesized number is the number of pairs that are set in the ovector (that
|
||||||
|
is, one more than the number of capturing groups that were set). Then are
|
||||||
|
listed the offsets of the old substring, its contents, and the same for the
|
||||||
|
replacement.
|
||||||
|
.P
|
||||||
|
By default, the substitution callout function returns zero, which accepts the
|
||||||
|
replacement and causes matching to continue if /g was used. Two further
|
||||||
|
modifiers can be used to test other return values. If \fBsubstitute_skip\fP is
|
||||||
|
set to a value greater than zero the callout function returns +1 for the match
|
||||||
|
of that number, and similarly \fBsubstitute_stop\fP returns -1. These cause the
|
||||||
|
replacement to be rejected, and -1 causes no further matching to take place. If
|
||||||
|
either of them is set, \fBsubstitute_callout\fP is assumed. For example:
|
||||||
|
.sp
|
||||||
|
/abc/g,replace=<$0>,substitute_skip=1
|
||||||
|
abcdefabcpqr
|
||||||
|
1(1) Old 0 3 "abc" New 0 5 "<abc> SKIPPED"
|
||||||
|
2(1) Old 6 9 "abc" New 6 11 "<abc>"
|
||||||
|
2: abcdef<abc>pqr
|
||||||
|
abcdefabcpqr\e=substitute_stop=1
|
||||||
|
1(1) Old 0 3 "abc" New 0 5 "<abc> STOPPED"
|
||||||
|
1: abcdefabcpqr
|
||||||
|
.sp
|
||||||
|
If both are set for the same number, stop takes precedence. Only a single skip
|
||||||
|
or stop is supported, which is sufficient for testing that the feature works.
|
||||||
|
.
|
||||||
|
.
|
||||||
.SS "Setting the JIT stack size"
|
.SS "Setting the JIT stack size"
|
||||||
.rs
|
.rs
|
||||||
.sp
|
.sp
|
||||||
|
@ -2022,6 +2056,6 @@ Cambridge, England.
|
||||||
.rs
|
.rs
|
||||||
.sp
|
.sp
|
||||||
.nf
|
.nf
|
||||||
Last updated: 21 September 2018
|
Last updated: 12 November 2018
|
||||||
Copyright (c) 1997-2018 University of Cambridge.
|
Copyright (c) 1997-2018 University of Cambridge.
|
||||||
.fi
|
.fi
|
||||||
|
|
|
@ -940,7 +940,9 @@ PATTERN MODIFIERS
|
||||||
startchar show starting character when relevant
|
startchar show starting character when relevant
|
||||||
substitute_callout use substitution callouts
|
substitute_callout use substitution callouts
|
||||||
substitute_extended use PCRE2_SUBSTITUTE_EXTENDED
|
substitute_extended use PCRE2_SUBSTITUTE_EXTENDED
|
||||||
|
substitute_skip=<n> skip substitution number n
|
||||||
substitute_overflow_length use PCRE2_SUBSTITUTE_OVERFLOW_LENGTH
|
substitute_overflow_length use PCRE2_SUBSTITUTE_OVERFLOW_LENGTH
|
||||||
|
substitute_stop=<n> skip substitution number n and greater
|
||||||
substitute_unknown_unset use PCRE2_SUBSTITUTE_UNKNOWN_UNSET
|
substitute_unknown_unset use PCRE2_SUBSTITUTE_UNKNOWN_UNSET
|
||||||
substitute_unset_empty use PCRE2_SUBSTITUTE_UNSET_EMPTY
|
substitute_unset_empty use PCRE2_SUBSTITUTE_UNSET_EMPTY
|
||||||
|
|
||||||
|
@ -1092,7 +1094,9 @@ SUBJECT MODIFIERS
|
||||||
startoffset=<n> same as offset=<n>
|
startoffset=<n> same as offset=<n>
|
||||||
substitute_callout use substitution callouts
|
substitute_callout use substitution callouts
|
||||||
substitute_extended use PCRE2_SUBSTITUTE_EXTENDED
|
substitute_extended use PCRE2_SUBSTITUTE_EXTENDED
|
||||||
|
substitute_skip=<n> skip substitution number n
|
||||||
substitute_overflow_length use PCRE2_SUBSTITUTE_OVERFLOW_LENGTH
|
substitute_overflow_length use PCRE2_SUBSTITUTE_OVERFLOW_LENGTH
|
||||||
|
substitute_stop=<n> skip substitution number n and greater
|
||||||
substitute_unknown_unset use PCRE2_SUBSTITUTE_UNKNOWN_UNSET
|
substitute_unknown_unset use PCRE2_SUBSTITUTE_UNKNOWN_UNSET
|
||||||
substitute_unset_empty use PCRE2_SUBSTITUTE_UNSET_EMPTY
|
substitute_unset_empty use PCRE2_SUBSTITUTE_UNSET_EMPTY
|
||||||
zero_terminate pass the subject as zero-terminated
|
zero_terminate pass the subject as zero-terminated
|
||||||
|
@ -1263,16 +1267,6 @@ SUBJECT MODIFIERS
|
||||||
=abc=abc=\=global
|
=abc=abc=\=global
|
||||||
2: =xxx=xxx=
|
2: =xxx=xxx=
|
||||||
|
|
||||||
If the substitute_callout modifier is set, a substitution callout func-
|
|
||||||
tion is set up. When it is called (after each substitution), the off-
|
|
||||||
sets in the input and output strings are output. For example:
|
|
||||||
|
|
||||||
/abc/g,replace=<$0>,substitute_callout
|
|
||||||
abcdefabcpqr
|
|
||||||
Old 0 3 New 0 5
|
|
||||||
Old 6 9 New 8 13
|
|
||||||
2: <abc>def<abc>pqr
|
|
||||||
|
|
||||||
Subject and replacement strings should be kept relatively short (fewer
|
Subject and replacement strings should be kept relatively short (fewer
|
||||||
than 256 characters) for substitution tests, as fixed-size buffers are
|
than 256 characters) for substitution tests, as fixed-size buffers are
|
||||||
used. To make it easy to test for buffer overflow, if the replacement
|
used. To make it easy to test for buffer overflow, if the replacement
|
||||||
|
@ -1305,162 +1299,202 @@ SUBJECT MODIFIERS
|
||||||
partial matching provokes an error return ("bad option value") from
|
partial matching provokes an error return ("bad option value") from
|
||||||
pcre2_substitute().
|
pcre2_substitute().
|
||||||
|
|
||||||
|
Testing substitute callouts
|
||||||
|
|
||||||
|
If the substitute_callout modifier is set, a substitution callout func-
|
||||||
|
tion is set up. When it is called (after each substitution), details of
|
||||||
|
the input and output strings are output. For example:
|
||||||
|
|
||||||
|
/abc/g,replace=<$0>,substitute_callout
|
||||||
|
abcdefabcpqr
|
||||||
|
1(1) Old 0 3 "abc" New 0 5 "<abc>"
|
||||||
|
2(1) Old 6 9 "abc" New 8 13 "<abc>"
|
||||||
|
2: <abc>def<abc>pqr
|
||||||
|
|
||||||
|
The first number on each callout line is the count of matches. The
|
||||||
|
parenthesized number is the number of pairs that are set in the ovector
|
||||||
|
(that is, one more than the number of capturing groups that were set).
|
||||||
|
Then are listed the offsets of the old substring, its contents, and the
|
||||||
|
same for the replacement.
|
||||||
|
|
||||||
|
By default, the substitution callout function returns zero, which
|
||||||
|
accepts the replacement and causes matching to continue if /g was used.
|
||||||
|
Two further modifiers can be used to test other return values. If sub-
|
||||||
|
stitute_skip is set to a value greater than zero the callout function
|
||||||
|
returns +1 for the match of that number, and similarly substitute_stop
|
||||||
|
returns -1. These cause the replacement to be rejected, and -1 causes
|
||||||
|
no further matching to take place. If either of them is set,
|
||||||
|
substitute_callout is assumed. For example:
|
||||||
|
|
||||||
|
/abc/g,replace=<$0>,substitute_skip=1
|
||||||
|
abcdefabcpqr
|
||||||
|
1(1) Old 0 3 "abc" New 0 5 "<abc> SKIPPED"
|
||||||
|
2(1) Old 6 9 "abc" New 6 11 "<abc>"
|
||||||
|
2: abcdef<abc>pqr
|
||||||
|
abcdefabcpqr\=substitute_stop=1
|
||||||
|
1(1) Old 0 3 "abc" New 0 5 "<abc> STOPPED"
|
||||||
|
1: abcdefabcpqr
|
||||||
|
|
||||||
|
If both are set for the same number, stop takes precedence. Only a sin-
|
||||||
|
gle skip or stop is supported, which is sufficient for testing that the
|
||||||
|
feature works.
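
The return-value behaviour described above can be illustrated with a small toy model. This is only a sketch: it is written in Python, matches a literal string rather than a regular expression, and does not call PCRE2 at all. The function name toy_substitute and its parameters are hypothetical; only the return codes (0, +1, -1) and the layout of the callout lines are taken from the examples above.

```python
def toy_substitute(subject, needle, template, skip=0, stop=0):
    """Toy model of global substitution with a skip/stop callout.

    Hypothetical helper for illustration only; it is NOT the PCRE2 API.
    Returns (result string, number of matches seen, callout log lines).
    """
    if not needle:
        raise ValueError("needle must be non-empty")
    out = ""      # output string built so far
    pos = 0       # current scan position in the subject
    count = 0     # number of matches passed to the callout
    log = []
    while True:
        start = subject.find(needle, pos)
        if start < 0:
            break
        end = start + len(needle)
        out += subject[pos:start]            # copy text before the match
        count += 1
        replacement = template.replace("$0", needle)
        new_start = len(out)
        new_end = new_start + len(replacement)
        # Choose the callout return value; stop takes precedence over skip.
        if stop and count == stop:
            rc = -1
        elif skip and count == skip:
            rc = 1
        else:
            rc = 0
        tag = {0: "", 1: " SKIPPED", -1: " STOPPED"}[rc]
        # A literal match has no capture groups, so one ovector pair is set.
        log.append('%d(1) Old %d %d "%s" New %d %d "%s%s"'
                   % (count, start, end, needle,
                      new_start, new_end, replacement, tag))
        # A non-zero return rejects the replacement: keep the original text.
        out += replacement if rc == 0 else needle
        pos = end
        if rc < 0:
            break                            # -1: no further matching
    out += subject[pos:]                     # copy the rest of the subject
    return out, count, log
```

Calling toy_substitute("abcdefabcpqr", "abc", "<$0>", skip=1) reproduces the offsets of the skip example: the rejected first match leaves "abc" in place, so the second replacement starts at output offset 6 rather than 8.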
|
||||||
|
|
||||||
Setting the JIT stack size
|
Setting the JIT stack size
|
||||||
|
|
||||||
The jitstack modifier provides a way of setting the maximum stack size
|
The jitstack modifier provides a way of setting the maximum stack size
|
||||||
that is used by the just-in-time optimization code. It is ignored if
|
that is used by the just-in-time optimization code. It is ignored if
|
||||||
JIT optimization is not being used. The value is a number of kibibytes
|
JIT optimization is not being used. The value is a number of kibibytes
|
||||||
(units of 1024 bytes). Setting zero reverts to the default of 32KiB.
|
(units of 1024 bytes). Setting zero reverts to the default of 32KiB.
|
||||||
Providing a stack that is larger than the default is necessary only for
|
Providing a stack that is larger than the default is necessary only for
|
||||||
very complicated patterns. If jitstack is set non-zero on a subject
|
very complicated patterns. If jitstack is set non-zero on a subject
|
||||||
line it overrides any value that was set on the pattern.
|
line it overrides any value that was set on the pattern.
|
||||||
|
|
||||||
Setting heap, match, and depth limits
|
Setting heap, match, and depth limits
|
||||||
|
|
||||||
The heap_limit, match_limit, and depth_limit modifiers set the appro-
|
The heap_limit, match_limit, and depth_limit modifiers set the appro-
|
||||||
priate limits in the match context. These values are ignored when the
|
priate limits in the match context. These values are ignored when the
|
||||||
find_limits modifier is specified.
|
find_limits modifier is specified.
|
||||||
|
|
||||||
Finding minimum limits
|
Finding minimum limits
|
||||||
|
|
||||||
If the find_limits modifier is present on a subject line, pcre2test
|
If the find_limits modifier is present on a subject line, pcre2test
|
||||||
calls the relevant matching function several times, setting different
|
calls the relevant matching function several times, setting different
|
||||||
values in the match context via pcre2_set_heap_limit(),
|
values in the match context via pcre2_set_heap_limit(),
|
||||||
pcre2_set_match_limit(), or pcre2_set_depth_limit() until it finds the
|
pcre2_set_match_limit(), or pcre2_set_depth_limit() until it finds the
|
||||||
minimum values for each parameter that allows the match to complete
|
minimum values for each parameter that allows the match to complete
|
||||||
without error. If JIT is being used, only the match limit is relevant.
|
without error. If JIT is being used, only the match limit is relevant.
|
||||||
|
|
||||||
When using this modifier, the pattern should not contain any limit set-
|
When using this modifier, the pattern should not contain any limit set-
|
||||||
tings such as (*LIMIT_MATCH=...) within it. If such a setting is
|
tings such as (*LIMIT_MATCH=...) within it. If such a setting is
|
||||||
present and is lower than the minimum matching value, the minimum value
|
present and is lower than the minimum matching value, the minimum value
|
||||||
cannot be found because pcre2_set_match_limit() etc. are only able to
|
cannot be found because pcre2_set_match_limit() etc. are only able to
|
||||||
reduce the value of an in-pattern limit; they cannot increase it.
|
reduce the value of an in-pattern limit; they cannot increase it.
|
||||||
|
|
||||||
For non-DFA matching, the minimum depth_limit number is a measure of
|
For non-DFA matching, the minimum depth_limit number is a measure of
|
||||||
how much nested backtracking happens (that is, how deeply the pattern's
|
how much nested backtracking happens (that is, how deeply the pattern's
|
||||||
tree is searched). In the case of DFA matching, depth_limit controls
|
tree is searched). In the case of DFA matching, depth_limit controls
|
||||||
the depth of recursive calls of the internal function that is used for
|
the depth of recursive calls of the internal function that is used for
|
||||||
handling pattern recursion, lookaround assertions, and atomic groups.
|
handling pattern recursion, lookaround assertions, and atomic groups.
|
||||||
|
|
||||||
For non-DFA matching, the match_limit number is a measure of the amount
|
For non-DFA matching, the match_limit number is a measure of the amount
|
||||||
of backtracking that takes place, and learning the minimum value can be
|
of backtracking that takes place, and learning the minimum value can be
|
||||||
instructive. For most simple matches, the number is quite small, but
|
instructive. For most simple matches, the number is quite small, but
|
||||||
for patterns with very large numbers of matching possibilities, it can
|
for patterns with very large numbers of matching possibilities, it can
|
||||||
become large very quickly with increasing length of subject string. In
|
become large very quickly with increasing length of subject string. In
|
||||||
the case of DFA matching, match_limit controls the total number of
|
the case of DFA matching, match_limit controls the total number of
|
||||||
calls, both recursive and non-recursive, to the internal matching func-
|
calls, both recursive and non-recursive, to the internal matching func-
|
||||||
tion, thus controlling the overall amount of computing resource that is
|
tion, thus controlling the overall amount of computing resource that is
|
||||||
used.
|
used.
|
||||||
|
|
||||||
For both kinds of matching, the heap_limit number, which is in
|
For both kinds of matching, the heap_limit number, which is in
|
||||||
kibibytes (units of 1024 bytes), limits the amount of heap memory used
|
kibibytes (units of 1024 bytes), limits the amount of heap memory used
|
||||||
for matching. A value of zero disables the use of any heap memory; many
|
for matching. A value of zero disables the use of any heap memory; many
|
||||||
simple pattern matches can be done without using the heap, so zero is
|
simple pattern matches can be done without using the heap, so zero is
|
||||||
not an unreasonable setting.
|
not an unreasonable setting.
|
||||||
|
|
||||||
Showing MARK names
|
Showing MARK names
|
||||||
|
|
||||||
|
|
||||||
The mark modifier causes the names from backtracking control verbs that
|
The mark modifier causes the names from backtracking control verbs that
|
||||||
are returned from calls to pcre2_match() to be displayed. If a mark is
|
are returned from calls to pcre2_match() to be displayed. If a mark is
|
||||||
returned for a match, non-match, or partial match, pcre2test shows it.
|
returned for a match, non-match, or partial match, pcre2test shows it.
|
||||||
For a match, it is on a line by itself, tagged with "MK:". Otherwise,
|
For a match, it is on a line by itself, tagged with "MK:". Otherwise,
|
||||||
it is added to the non-match message.
|
it is added to the non-match message.
|
||||||
|
|
||||||
Showing memory usage
|
Showing memory usage
|
||||||
|
|
||||||
The memory modifier causes pcre2test to log the sizes of all heap mem-
|
The memory modifier causes pcre2test to log the sizes of all heap mem-
|
||||||
ory allocation and freeing calls that occur during a call to
|
ory allocation and freeing calls that occur during a call to
|
||||||
pcre2_match() or pcre2_dfa_match(). These occur only when a match
|
pcre2_match() or pcre2_dfa_match(). These occur only when a match
|
||||||
requires a bigger vector than the default for remembering backtracking
|
requires a bigger vector than the default for remembering backtracking
|
||||||
points (pcre2_match()) or for internal workspace (pcre2_dfa_match()).
|
points (pcre2_match()) or for internal workspace (pcre2_dfa_match()).
|
||||||
In many cases there will be no heap memory used and therefore no addi-
|
In many cases there will be no heap memory used and therefore no addi-
|
||||||
tional output. No heap memory is allocated during matching with JIT, so
|
tional output. No heap memory is allocated during matching with JIT, so
|
||||||
in that case the memory modifier never has any effect. For this modi-
|
in that case the memory modifier never has any effect. For this modi-
|
||||||
fier to work, the null_context modifier must not be set on both the
|
fier to work, the null_context modifier must not be set on both the
|
||||||
pattern and the subject, though it can be set on one or the other.
|
pattern and the subject, though it can be set on one or the other.
|
||||||
|
|
||||||
Setting a starting offset
|
Setting a starting offset
|
||||||
|
|
||||||
The offset modifier sets an offset in the subject string at which
|
The offset modifier sets an offset in the subject string at which
|
||||||
matching starts. Its value is a number of code units, not characters.
|
matching starts. Its value is a number of code units, not characters.
|
||||||
|
|
||||||
Setting an offset limit
|
Setting an offset limit
|
||||||
|
|
||||||
The offset_limit modifier sets a limit for unanchored matches. If a
|
The offset_limit modifier sets a limit for unanchored matches. If a
|
||||||
match cannot be found starting at or before this offset in the subject,
|
match cannot be found starting at or before this offset in the subject,
|
||||||
a "no match" return is given. The data value is a number of code units,
|
a "no match" return is given. The data value is a number of code units,
|
||||||
not characters. When this modifier is used, the use_offset_limit modi-
|
not characters. When this modifier is used, the use_offset_limit modi-
|
||||||
fier must have been set for the pattern; if not, an error is generated.
|
fier must have been set for the pattern; if not, an error is generated.
|
||||||
|
|
||||||
Setting the size of the output vector
|
Setting the size of the output vector
|
||||||
|
|
||||||
The ovector modifier applies only to the subject line in which it
|
The ovector modifier applies only to the subject line in which it
|
||||||
appears, though of course it can also be used to set a default in a
|
appears, though of course it can also be used to set a default in a
|
||||||
#subject command. It specifies the number of pairs of offsets that are
|
#subject command. It specifies the number of pairs of offsets that are
|
||||||
available for storing matching information. The default is 15.
|
available for storing matching information. The default is 15.
|
||||||
|
|
||||||
A value of zero is useful when testing the POSIX API because it causes
|
A value of zero is useful when testing the POSIX API because it causes
|
||||||
regexec() to be called with a NULL capture vector. When not testing the
|
regexec() to be called with a NULL capture vector. When not testing the
|
||||||
POSIX API, a value of zero is used to cause pcre2_match_data_cre-
|
POSIX API, a value of zero is used to cause pcre2_match_data_cre-
|
||||||
ate_from_pattern() to be called, in order to create a match block of
|
ate_from_pattern() to be called, in order to create a match block of
|
||||||
exactly the right size for the pattern. (It is not possible to create a
|
exactly the right size for the pattern. (It is not possible to create a
|
||||||
match block with a zero-length ovector; there is always at least one
|
match block with a zero-length ovector; there is always at least one
|
||||||
pair of offsets.)
|
pair of offsets.)
|
||||||
|
|
||||||
Passing the subject as zero-terminated
|
Passing the subject as zero-terminated
|
||||||
|
|
||||||
By default, the subject string is passed to a native API matching func-
|
By default, the subject string is passed to a native API matching func-
|
||||||
tion with its correct length. In order to test the facility for passing
|
tion with its correct length. In order to test the facility for passing
|
||||||
a zero-terminated string, the zero_terminate modifier is provided. It
|
a zero-terminated string, the zero_terminate modifier is provided. It
|
||||||
causes the length to be passed as PCRE2_ZERO_TERMINATED. When matching
|
causes the length to be passed as PCRE2_ZERO_TERMINATED. When matching
|
||||||
via the POSIX interface, this modifier is ignored, with a warning.
|
via the POSIX interface, this modifier is ignored, with a warning.
|
||||||
|
|
||||||
When testing pcre2_substitute(), this modifier also has the effect of
|
When testing pcre2_substitute(), this modifier also has the effect of
|
||||||
passing the replacement string as zero-terminated.
|
passing the replacement string as zero-terminated.
|
||||||
|
|
||||||
Passing a NULL context
|
Passing a NULL context
|
||||||
|
|
||||||
Normally, pcre2test passes a context block to pcre2_match(),
|
Normally, pcre2test passes a context block to pcre2_match(),
|
||||||
pcre2_dfa_match() or pcre2_jit_match(). If the null_context modifier is
|
pcre2_dfa_match() or pcre2_jit_match(). If the null_context modifier is
|
||||||
set, however, NULL is passed. This is for testing that the matching
|
set, however, NULL is passed. This is for testing that the matching
|
||||||
functions behave correctly in this case (they use default values). This
|
functions behave correctly in this case (they use default values). This
|
||||||
modifier cannot be used with the find_limits modifier or when testing
|
modifier cannot be used with the find_limits modifier or when testing
|
||||||
the substitution function.
|
the substitution function.
|
||||||
|
|
||||||
|
|
||||||
THE ALTERNATIVE MATCHING FUNCTION
|
THE ALTERNATIVE MATCHING FUNCTION
|
||||||
|
|
||||||
By default, pcre2test uses the standard PCRE2 matching function,
|
By default, pcre2test uses the standard PCRE2 matching function,
|
||||||
pcre2_match() to match each subject line. PCRE2 also supports an alter-
|
pcre2_match() to match each subject line. PCRE2 also supports an alter-
|
||||||
native matching function, pcre2_dfa_match(), which operates in a dif-
|
native matching function, pcre2_dfa_match(), which operates in a dif-
|
||||||
ferent way, and has some restrictions. The differences between the two
|
ferent way, and has some restrictions. The differences between the two
|
||||||
functions are described in the pcre2matching documentation.
|
functions are described in the pcre2matching documentation.
|
||||||
|
|
||||||
If the dfa modifier is set, the alternative matching function is used.
|
If the dfa modifier is set, the alternative matching function is used.
|
||||||
This function finds all possible matches at a given point in the sub-
|
This function finds all possible matches at a given point in the sub-
|
||||||
ject. If, however, the dfa_shortest modifier is set, processing stops
|
ject. If, however, the dfa_shortest modifier is set, processing stops
|
||||||
after the first match is found. This is always the shortest possible
|
after the first match is found. This is always the shortest possible
|
||||||
match.
|
match.
|
||||||
|
|
||||||
|
|
||||||
DEFAULT OUTPUT FROM pcre2test
|
DEFAULT OUTPUT FROM pcre2test
|
||||||
|
|
||||||
This section describes the output when the normal matching function,
|
This section describes the output when the normal matching function,
|
||||||
pcre2_match(), is being used.
|
pcre2_match(), is being used.
|
||||||
|
|
||||||
When a match succeeds, pcre2test outputs the list of captured sub-
|
When a match succeeds, pcre2test outputs the list of captured sub-
|
||||||
strings, starting with number 0 for the string that matched the whole
|
strings, starting with number 0 for the string that matched the whole
|
||||||
pattern. Otherwise, it outputs "No match" when the return is
|
pattern. Otherwise, it outputs "No match" when the return is
|
||||||
PCRE2_ERROR_NOMATCH, or "Partial match:" followed by the partially
|
PCRE2_ERROR_NOMATCH, or "Partial match:" followed by the partially
|
||||||
matching substring when the return is PCRE2_ERROR_PARTIAL. (Note that
|
matching substring when the return is PCRE2_ERROR_PARTIAL. (Note that
|
||||||
this is the entire substring that was inspected during the partial
|
this is the entire substring that was inspected during the partial
|
||||||
match; it may include characters before the actual match start if a
|
match; it may include characters before the actual match start if a
|
||||||
lookbehind assertion, \K, \b, or \B was involved.)
|
lookbehind assertion, \K, \b, or \B was involved.)
|
||||||
|
|
||||||
For any other return, pcre2test outputs the PCRE2 negative error number
|
For any other return, pcre2test outputs the PCRE2 negative error number
|
||||||
and a short descriptive phrase. If the error is a failed UTF string
|
and a short descriptive phrase. If the error is a failed UTF string
|
||||||
check, the code unit offset of the start of the failing character is
|
check, the code unit offset of the start of the failing character is
|
||||||
also output. Here is an example of an interactive pcre2test run.
|
also output. Here is an example of an interactive pcre2test run.
|
||||||
|
|
||||||
$ pcre2test
|
$ pcre2test
|
||||||
|
@ -1476,8 +1510,8 @@ DEFAULT OUTPUT FROM pcre2test
|
||||||
Unset capturing substrings that are not followed by one that is set are
|
Unset capturing substrings that are not followed by one that is set are
|
||||||
not shown by pcre2test unless the allcaptures modifier is specified. In
|
not shown by pcre2test unless the allcaptures modifier is specified. In
|
||||||
the following example, there are two capturing substrings, but when the
|
the following example, there are two capturing substrings, but when the
|
||||||
first data line is matched, the second, unset substring is not shown.
|
first data line is matched, the second, unset substring is not shown.
|
||||||
An "internal" unset substring is shown as "<unset>", as for the second
|
An "internal" unset substring is shown as "<unset>", as for the second
|
||||||
data line.
|
data line.
|
||||||
|
|
||||||
re> /(a)|(b)/
|
re> /(a)|(b)/
|
||||||
|
@ -1489,11 +1523,11 @@ DEFAULT OUTPUT FROM pcre2test
|
||||||
1: <unset>
|
1: <unset>
|
||||||
2: b
|
2: b
|
||||||
|
|
||||||
If the strings contain any non-printing characters, they are output as
|
If the strings contain any non-printing characters, they are output as
|
||||||
\xhh escapes if the value is less than 256 and UTF mode is not set.
|
\xhh escapes if the value is less than 256 and UTF mode is not set.
|
||||||
Otherwise they are output as \x{hh...} escapes. See below for the defi-
|
Otherwise they are output as \x{hh...} escapes. See below for the defi-
|
||||||
nition of non-printing characters. If the aftertext modifier is set,
|
nition of non-printing characters. If the aftertext modifier is set,
|
||||||
the output for substring 0 is followed by the rest of the subject
|
the output for substring 0 is followed by the rest of the subject
|
||||||
string, identified by "0+" like this:
|
string, identified by "0+" like this:
|
||||||
|
|
||||||
re> /cat/aftertext
|
re> /cat/aftertext
|
||||||
|
@ -1501,7 +1535,7 @@ DEFAULT OUTPUT FROM pcre2test
|
||||||
0: cat
|
0: cat
|
||||||
0+ aract
|
0+ aract
|
||||||
|
|
||||||
If global matching is requested, the results of successive matching
|
If global matching is requested, the results of successive matching
|
||||||
attempts are output in sequence, like this:
|
attempts are output in sequence, like this:
|
||||||
|
|
||||||
re> /\Bi(\w\w)/g
|
re> /\Bi(\w\w)/g
|
||||||
|
@ -1513,8 +1547,8 @@ DEFAULT OUTPUT FROM pcre2test
|
||||||
0: ipp
|
0: ipp
|
||||||
1: pp
|
1: pp
|
||||||
|
|
||||||
"No match" is output only if the first match attempt fails. Here is an
|
"No match" is output only if the first match attempt fails. Here is an
|
||||||
example of a failure message (the offset 4 that is specified by the
|
example of a failure message (the offset 4 that is specified by the
|
||||||
offset modifier is past the end of the subject string):
|
offset modifier is past the end of the subject string):
|
||||||
|
|
||||||
re> /xyz/
|
re> /xyz/
|
||||||
|
@ -1522,7 +1556,7 @@ DEFAULT OUTPUT FROM pcre2test
|
||||||
Error -24 (bad offset value)
|
Error -24 (bad offset value)
|
||||||
|
|
||||||
Note that whereas patterns can be continued over several lines (a plain
|
Note that whereas patterns can be continued over several lines (a plain
|
||||||
">" prompt is used for continuations), subject lines may not. However
|
">" prompt is used for continuations), subject lines may not. However
|
||||||
newlines can be included in a subject by means of the \n escape (or \r,
|
newlines can be included in a subject by means of the \n escape (or \r,
|
||||||
\r\n, etc., depending on the newline sequence setting).
|
\r\n, etc., depending on the newline sequence setting).
|
||||||
|
|
||||||
|
@ -1530,7 +1564,7 @@ DEFAULT OUTPUT FROM pcre2test
|
||||||
OUTPUT FROM THE ALTERNATIVE MATCHING FUNCTION
|
OUTPUT FROM THE ALTERNATIVE MATCHING FUNCTION
|
||||||
|
|
||||||
When the alternative matching function, pcre2_dfa_match(), is used, the
|
When the alternative matching function, pcre2_dfa_match(), is used, the
|
||||||
output consists of a list of all the matches that start at the first
|
output consists of a list of all the matches that start at the first
|
||||||
point in the subject where there is at least one match. For example:
|
point in the subject where there is at least one match. For example:
|
||||||
|
|
||||||
re> /(tang|tangerine|tan)/
|
re> /(tang|tangerine|tan)/
|
||||||
|
@ -1539,11 +1573,11 @@ OUTPUT FROM THE ALTERNATIVE MATCHING FUNCTION
|
||||||
1: tang
|
1: tang
|
||||||
2: tan
|
2: tan
|
||||||
|
|
||||||
Using the normal matching function on this data finds only "tang". The
|
Using the normal matching function on this data finds only "tang". The
|
||||||
longest matching string is always given first (and numbered zero).
|
longest matching string is always given first (and numbered zero).
|
||||||
After a PCRE2_ERROR_PARTIAL return, the output is "Partial match:",
|
After a PCRE2_ERROR_PARTIAL return, the output is "Partial match:",
|
||||||
followed by the partially matching substring. Note that this is the
|
followed by the partially matching substring. Note that this is the
|
||||||
entire substring that was inspected during the partial match; it may
|
entire substring that was inspected during the partial match; it may
|
||||||
include characters before the actual match start if a lookbehind asser-
|
include characters before the actual match start if a lookbehind asser-
|
||||||
tion, \b, or \B was involved. (\K is not supported for DFA matching.)
|
tion, \b, or \B was involved. (\K is not supported for DFA matching.)
|
||||||
|
|
||||||
|
@ -1559,16 +1593,16 @@ OUTPUT FROM THE ALTERNATIVE MATCHING FUNCTION
|
||||||
1: tan
|
1: tan
|
||||||
0: tan
|
0: tan
|
||||||
|
|
||||||
The alternative matching function does not support substring capture,
|
The alternative matching function does not support substring capture,
|
||||||
so the modifiers that are concerned with captured substrings are not
|
so the modifiers that are concerned with captured substrings are not
|
||||||
relevant.
|
relevant.
|
||||||
|
|
||||||
|
|
||||||
RESTARTING AFTER A PARTIAL MATCH
|
RESTARTING AFTER A PARTIAL MATCH
|
||||||
|
|
||||||
When the alternative matching function has given the PCRE2_ERROR_PAR-
|
When the alternative matching function has given the PCRE2_ERROR_PAR-
|
||||||
TIAL return, indicating that the subject partially matched the pattern,
|
TIAL return, indicating that the subject partially matched the pattern,
|
||||||
you can restart the match with additional subject data by means of the
|
you can restart the match with additional subject data by means of the
|
||||||
dfa_restart modifier. For example:
|
dfa_restart modifier. For example:
|
||||||
|
|
||||||
re> /^\d?\d(jan|feb|mar|apr|may|jun|jul|aug|sep|oct|nov|dec)\d\d$/
|
re> /^\d?\d(jan|feb|mar|apr|may|jun|jul|aug|sep|oct|nov|dec)\d\d$/
|
||||||
|
@ -1577,37 +1611,37 @@ RESTARTING AFTER A PARTIAL MATCH
|
||||||
data> n05\=dfa,dfa_restart
|
data> n05\=dfa,dfa_restart
|
||||||
0: n05
|
0: n05
|
||||||
|
|
||||||
For further information about partial matching, see the pcre2partial
|
For further information about partial matching, see the pcre2partial
|
||||||
documentation.
|
documentation.
|
||||||
|
|
||||||
|
|
||||||
CALLOUTS
|
CALLOUTS
|
||||||
|
|
||||||
If the pattern contains any callout requests, pcre2test's callout func-
|
If the pattern contains any callout requests, pcre2test's callout func-
|
||||||
tion is called during matching unless callout_none is specified. This
|
tion is called during matching unless callout_none is specified. This
|
||||||
works with both matching functions, and with JIT, though there are some
|
works with both matching functions, and with JIT, though there are some
|
||||||
differences in behaviour. The output for callouts with numerical argu-
|
differences in behaviour. The output for callouts with numerical argu-
|
||||||
ments and those with string arguments is slightly different.
|
ments and those with string arguments is slightly different.
|
||||||
|
|
||||||
Callouts with numerical arguments
|
Callouts with numerical arguments
|
||||||
|
|
||||||
By default, the callout function displays the callout number, the start
|
By default, the callout function displays the callout number, the start
|
||||||
and current positions in the subject text at the callout time, and the
|
and current positions in the subject text at the callout time, and the
|
||||||
next pattern item to be tested. For example:
|
next pattern item to be tested. For example:
|
||||||
|
|
||||||
--->pqrabcdef
|
--->pqrabcdef
|
||||||
0 ^ ^ \d
|
0 ^ ^ \d
|
||||||
|
|
||||||
This output indicates that callout number 0 occurred for a match
|
This output indicates that callout number 0 occurred for a match
|
||||||
attempt starting at the fourth character of the subject string, when
|
attempt starting at the fourth character of the subject string, when
|
||||||
the pointer was at the seventh character, and when the next pattern
|
the pointer was at the seventh character, and when the next pattern
|
||||||
item was \d. Just one circumflex is output if the start and current
|
item was \d. Just one circumflex is output if the start and current
|
||||||
positions are the same, or if the current position precedes the start
|
positions are the same, or if the current position precedes the start
|
||||||
position, which can happen if the callout is in a lookbehind assertion.
|
position, which can happen if the callout is in a lookbehind assertion.
|
||||||
|
|
||||||
Callouts numbered 255 are assumed to be automatic callouts, inserted as
|
Callouts numbered 255 are assumed to be automatic callouts, inserted as
|
||||||
a result of the auto_callout pattern modifier. In this case, instead of
|
a result of the auto_callout pattern modifier. In this case, instead of
|
||||||
showing the callout number, the offset in the pattern, preceded by a
|
showing the callout number, the offset in the pattern, preceded by a
|
||||||
plus, is output. For example:
|
plus, is output. For example:
|
||||||
|
|
||||||
re> /\d?[A-E]\*/auto_callout
|
re> /\d?[A-E]\*/auto_callout
|
||||||
|
@ -1620,7 +1654,7 @@ CALLOUTS
|
||||||
0: E*
|
0: E*
|
||||||
|
|
||||||
If a pattern contains (*MARK) items, an additional line is output when-
|
If a pattern contains (*MARK) items, an additional line is output when-
|
||||||
ever a change of latest mark is passed to the callout function. For
|
ever a change of latest mark is passed to the callout function. For
|
||||||
example:
|
example:
|
||||||
|
|
||||||
re> /a(*MARK:X)bc/auto_callout
|
re> /a(*MARK:X)bc/auto_callout
|
||||||
|
@ -1634,17 +1668,17 @@ CALLOUTS
|
||||||
+12 ^ ^
|
+12 ^ ^
|
||||||
0: abc
|
0: abc
|
||||||
|
|
||||||
The mark changes between matching "a" and "b", but stays the same for
|
The mark changes between matching "a" and "b", but stays the same for
|
||||||
the rest of the match, so nothing more is output. If, as a result of
|
the rest of the match, so nothing more is output. If, as a result of
|
||||||
backtracking, the mark reverts to being unset, the text "<unset>" is
|
backtracking, the mark reverts to being unset, the text "<unset>" is
|
||||||
output.
|
output.
|
||||||
|
|
||||||
Callouts with string arguments
|
Callouts with string arguments
|
||||||
|
|
||||||
The output for a callout with a string argument is similar, except that
|
The output for a callout with a string argument is similar, except that
|
||||||
instead of outputting a callout number before the position indicators,
|
instead of outputting a callout number before the position indicators,
|
||||||
the callout string and its offset in the pattern string are output
|
the callout string and its offset in the pattern string are output
|
||||||
before the reflection of the subject string, and the subject string is
|
before the reflection of the subject string, and the subject string is
|
||||||
reflected for each callout. For example:
|
reflected for each callout. For example:
|
||||||
|
|
||||||
re> /^ab(?C'first')cd(?C"second")ef/
|
re> /^ab(?C'first')cd(?C"second")ef/
|
||||||
|
@ -1660,26 +1694,26 @@ CALLOUTS
|
||||||
|
|
||||||
Callout modifiers
|
Callout modifiers
|
||||||
|
|
||||||
The callout function in pcre2test returns zero (carry on matching) by
|
The callout function in pcre2test returns zero (carry on matching) by
|
||||||
default, but you can use a callout_fail modifier in a subject line to
|
default, but you can use a callout_fail modifier in a subject line to
|
||||||
change this and other parameters of the callout (see below).
|
change this and other parameters of the callout (see below).
|
||||||
|
|
||||||
If the callout_capture modifier is set, the current captured groups are
|
If the callout_capture modifier is set, the current captured groups are
|
||||||
output when a callout occurs. This is useful only for non-DFA matching,
|
output when a callout occurs. This is useful only for non-DFA matching,
|
||||||
as pcre2_dfa_match() does not support capturing, so no captures are
|
as pcre2_dfa_match() does not support capturing, so no captures are
|
||||||
ever shown.
|
ever shown.
|
||||||
|
|
||||||
The normal callout output, showing the callout number or pattern offset
|
The normal callout output, showing the callout number or pattern offset
|
||||||
(as described above) is suppressed if the callout_no_where modifier is
|
(as described above) is suppressed if the callout_no_where modifier is
|
||||||
set.
|
set.
|
||||||
|
|
||||||
When using the interpretive matching function pcre2_match() without
|
When using the interpretive matching function pcre2_match() without
|
||||||
JIT, setting the callout_extra modifier causes additional output from
|
JIT, setting the callout_extra modifier causes additional output from
|
||||||
pcre2test's callout function to be generated. For the first callout in
|
pcre2test's callout function to be generated. For the first callout in
|
||||||
a match attempt at a new starting position in the subject, "New match
|
a match attempt at a new starting position in the subject, "New match
|
||||||
attempt" is output. If there has been a backtrack since the last call-
|
attempt" is output. If there has been a backtrack since the last call-
|
||||||
out (or start of matching if this is the first callout), "Backtrack" is
|
out (or start of matching if this is the first callout), "Backtrack" is
|
||||||
output, followed by "No other matching paths" if the backtrack ended
|
output, followed by "No other matching paths" if the backtrack ended
|
||||||
the previous match attempt. For example:
|
the previous match attempt. For example:
|
||||||
|
|
||||||
re> /(a+)b/auto_callout,no_start_optimize,no_auto_possess
|
re> /(a+)b/auto_callout,no_start_optimize,no_auto_possess
|
||||||
|
@ -1716,86 +1750,86 @@ CALLOUTS
|
||||||
+1 ^ a+
|
+1 ^ a+
|
||||||
No match
|
No match
|
||||||
|
|
||||||
Notice that various optimizations must be turned off if you want all
|
Notice that various optimizations must be turned off if you want all
|
||||||
possible matching paths to be scanned. If no_start_optimize is not
|
possible matching paths to be scanned. If no_start_optimize is not
|
||||||
used, there is an immediate "no match", without any callouts, because
|
used, there is an immediate "no match", without any callouts, because
|
||||||
the starting optimization fails to find "b" in the subject, which it
|
the starting optimization fails to find "b" in the subject, which it
|
||||||
knows must be present for any match. If no_auto_possess is not used,
|
knows must be present for any match. If no_auto_possess is not used,
|
||||||
the "a+" item is turned into "a++", which reduces the number of back-
|
the "a+" item is turned into "a++", which reduces the number of back-
|
||||||
tracks.
|
tracks.
|
||||||
|
|
||||||
The callout_extra modifier has no effect if used with the DFA matching
|
The callout_extra modifier has no effect if used with the DFA matching
|
||||||
function, or with JIT.
|
function, or with JIT.
|
||||||
|
|
||||||
Return values from callouts
|
Return values from callouts
|
||||||
|
|
||||||
The default return from the callout function is zero, which allows
|
The default return from the callout function is zero, which allows
|
||||||
matching to continue. The callout_fail modifier can be given one or two
|
matching to continue. The callout_fail modifier can be given one or two
|
||||||
numbers. If there is only one number, 1 is returned instead of 0 (caus-
|
numbers. If there is only one number, 1 is returned instead of 0 (caus-
|
||||||
ing matching to backtrack) when a callout of that number is reached. If
|
ing matching to backtrack) when a callout of that number is reached. If
|
||||||
two numbers (<n>:<m>) are given, 1 is returned when callout <n> is
|
two numbers (<n>:<m>) are given, 1 is returned when callout <n> is
|
||||||
reached and there have been at least <m> callouts. The callout_error
|
reached and there have been at least <m> callouts. The callout_error
|
||||||
modifier is similar, except that PCRE2_ERROR_CALLOUT is returned, caus-
|
modifier is similar, except that PCRE2_ERROR_CALLOUT is returned, caus-
|
||||||
ing the entire matching process to be aborted. If both these modifiers
|
ing the entire matching process to be aborted. If both these modifiers
|
||||||
are set for the same callout number, callout_error takes precedence.
|
are set for the same callout number, callout_error takes precedence.
|
||||||
Note that callouts with string arguments are always given the number
|
Note that callouts with string arguments are always given the number
|
||||||
zero.
|
zero.
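The return-value rules in the paragraph above can be sketched in C. This is only an illustration of the documented behaviour, not pcre2test's actual code: the names demo_trigger, demo_callout_result and DEMO_ERROR_CALLOUT are hypothetical, a min_count of zero is assumed to mean "no count requirement", and "seen" is assumed to count the callouts reached so far.

```c
#include <stddef.h>

/* Stand-in for PCRE2_ERROR_CALLOUT; real code should use the macro from
   pcre2.h rather than this hypothetical constant. */
#define DEMO_ERROR_CALLOUT (-37)

/* Hypothetical model of one callout_fail or callout_error setting. */
typedef struct {
  int set;             /* non-zero if the modifier was given */
  unsigned number;     /* callout number <n> */
  unsigned min_count;  /* <m>, or 0 when only one number was given (assumed) */
} demo_trigger;

/* A trigger fires when its callout number is reached and enough callouts
   have been seen. */
static int demo_fires(const demo_trigger *t, unsigned number, unsigned seen)
{
  return t->set && t->number == number && seen >= t->min_count;
}

/* Decide what the callout function returns: callout_error takes precedence
   over callout_fail for the same callout number; otherwise return zero so
   that matching carries on. */
int demo_callout_result(const demo_trigger *fail, const demo_trigger *error,
  unsigned number, unsigned seen)
{
  if (demo_fires(error, number, seen)) return DEMO_ERROR_CALLOUT;
  if (demo_fires(fail, number, seen)) return 1;
  return 0;
}
```

Checking the error trigger first is what gives callout_error its precedence when both modifiers name the same callout number.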
|
||||||
|
|
||||||
The callout_data modifier can be given an unsigned or a negative num-
|
The callout_data modifier can be given an unsigned or a negative num-
|
||||||
ber. This is set as the "user data" that is passed to the matching
|
ber. This is set as the "user data" that is passed to the matching
|
||||||
function, and passed back when the callout function is invoked. Any
|
function, and passed back when the callout function is invoked. Any
|
||||||
value other than zero is used as a return from pcre2test's callout
|
value other than zero is used as a return from pcre2test's callout
|
||||||
function.
|
function.
|
||||||
|
|
||||||
Inserting callouts can be helpful when using pcre2test to check compli-
|
Inserting callouts can be helpful when using pcre2test to check compli-
|
||||||
cated regular expressions. For further information about callouts, see
|
cated regular expressions. For further information about callouts, see
|
||||||
the pcre2callout documentation.
|
the pcre2callout documentation.
|
||||||
|
|
||||||
|
|
||||||
NON-PRINTING CHARACTERS
|
NON-PRINTING CHARACTERS
|
||||||
|
|
||||||
When pcre2test is outputting text in the compiled version of a pattern,
|
When pcre2test is outputting text in the compiled version of a pattern,
|
||||||
bytes other than 32-126 are always treated as non-printing characters
|
bytes other than 32-126 are always treated as non-printing characters
|
||||||
and are therefore shown as hex escapes.
|
and are therefore shown as hex escapes.
|
||||||
|
|
||||||
When pcre2test is outputting text that is a matched part of a subject
|
When pcre2test is outputting text that is a matched part of a subject
|
||||||
string, it behaves in the same way, unless a different locale has been
|
string, it behaves in the same way, unless a different locale has been
|
||||||
set for the pattern (using the locale modifier). In this case, the
|
set for the pattern (using the locale modifier). In this case, the
|
||||||
isprint() function is used to distinguish printing and non-printing
|
isprint() function is used to distinguish printing and non-printing
|
||||||
characters.
|
characters.
|
||||||
|
|
||||||
|
|
||||||
SAVING AND RESTORING COMPILED PATTERNS
|
SAVING AND RESTORING COMPILED PATTERNS
|
||||||
|
|
||||||
It is possible to save compiled patterns on disc or elsewhere, and
|
It is possible to save compiled patterns on disc or elsewhere, and
|
||||||
reload them later, subject to a number of restrictions. JIT data cannot
|
reload them later, subject to a number of restrictions. JIT data cannot
|
||||||
be saved. The host on which the patterns are reloaded must be running
|
be saved. The host on which the patterns are reloaded must be running
|
||||||
the same version of PCRE2, with the same code unit width, and must also
|
the same version of PCRE2, with the same code unit width, and must also
|
||||||
have the same endianness, pointer width and PCRE2_SIZE type. Before
|
have the same endianness, pointer width and PCRE2_SIZE type. Before
|
||||||
compiled patterns can be saved they must be serialized, that is, con-
|
compiled patterns can be saved they must be serialized, that is, con-
|
||||||
verted to a stream of bytes. A single byte stream may contain any num-
|
verted to a stream of bytes. A single byte stream may contain any num-
|
||||||
ber of compiled patterns, but they must all use the same character
|
ber of compiled patterns, but they must all use the same character
|
||||||
tables. A single copy of the tables is included in the byte stream (its
|
tables. A single copy of the tables is included in the byte stream (its
|
||||||
size is 1088 bytes).
|
size is 1088 bytes).
|
||||||
|
|
||||||
The functions whose names begin with pcre2_serialize_ are used for
|
The functions whose names begin with pcre2_serialize_ are used for
|
||||||
serializing and de-serializing. They are described in the pcre2serial-
|
serializing and de-serializing. They are described in the pcre2serial-
|
||||||
ize documentation. In this section we describe the features of
|
ize documentation. In this section we describe the features of
|
||||||
pcre2test that can be used to test these functions.
|
pcre2test that can be used to test these functions.
|
||||||
|
|
||||||
Note that "serialization" in PCRE2 does not convert compiled patterns
|
Note that "serialization" in PCRE2 does not convert compiled patterns
|
||||||
to an abstract format like Java or .NET. It just makes a reloadable
|
to an abstract format like Java or .NET. It just makes a reloadable
|
||||||
byte code stream. Hence the restrictions on reloading mentioned above.
|
byte code stream. Hence the restrictions on reloading mentioned above.
|
||||||
|
|
||||||
In pcre2test, when a pattern with push modifier is successfully com-
|
In pcre2test, when a pattern with push modifier is successfully com-
|
||||||
piled, it is pushed onto a stack of compiled patterns, and pcre2test
|
piled, it is pushed onto a stack of compiled patterns, and pcre2test
|
||||||
expects the next line to contain a new pattern (or command) instead of
|
expects the next line to contain a new pattern (or command) instead of
|
||||||
a subject line. By contrast, the pushcopy modifier causes a copy of the
|
a subject line. By contrast, the pushcopy modifier causes a copy of the
|
||||||
compiled pattern to be stacked, leaving the original available for
|
compiled pattern to be stacked, leaving the original available for
|
||||||
immediate matching. By using push and/or pushcopy, a number of patterns
|
immediate matching. By using push and/or pushcopy, a number of patterns
|
||||||
can be compiled and retained. These modifiers are incompatible with
|
can be compiled and retained. These modifiers are incompatible with
|
||||||
posix, and control modifiers that act at match time are ignored (with a
|
posix, and control modifiers that act at match time are ignored (with a
|
||||||
message) for the stacked patterns. The jitverify modifier applies only
|
message) for the stacked patterns. The jitverify modifier applies only
|
||||||
at compile time.
|
at compile time.
|
||||||
|
|
||||||
The command
|
The command
|
||||||
|
@ -1803,21 +1837,21 @@ SAVING AND RESTORING COMPILED PATTERNS
|
||||||
#save <filename>
|
#save <filename>
|
||||||
|
|
||||||
causes all the stacked patterns to be serialized and the result written
|
causes all the stacked patterns to be serialized and the result written
|
||||||
to the named file. Afterwards, all the stacked patterns are freed. The
|
to the named file. Afterwards, all the stacked patterns are freed. The
|
||||||
command
|
command
|
||||||
|
|
||||||
#load <filename>
|
#load <filename>
|
||||||
|
|
||||||
reads the data in the file, and then arranges for it to be de-serial-
|
reads the data in the file, and then arranges for it to be de-serial-
|
||||||
ized, with the resulting compiled patterns added to the pattern stack.
|
ized, with the resulting compiled patterns added to the pattern stack.
|
||||||
The pattern on the top of the stack can be retrieved by the #pop com-
|
The pattern on the top of the stack can be retrieved by the #pop com-
|
||||||
mand, which must be followed by lines of subjects that are to be
|
mand, which must be followed by lines of subjects that are to be
|
||||||
matched with the pattern, terminated as usual by an empty line or end
|
matched with the pattern, terminated as usual by an empty line or end
|
||||||
of file. This command may be followed by a modifier list containing
|
of file. This command may be followed by a modifier list containing
|
||||||
only control modifiers that act after a pattern has been compiled. In
|
only control modifiers that act after a pattern has been compiled. In
|
||||||
particular, hex, posix, posix_nosub, push, and pushcopy are not
|
particular, hex, posix, posix_nosub, push, and pushcopy are not
|
||||||
allowed, nor are any option-setting modifiers. The JIT modifiers are,
|
allowed, nor are any option-setting modifiers. The JIT modifiers are,
|
||||||
however permitted. Here is an example that saves and reloads two pat-
|
however permitted. Here is an example that saves and reloads two pat-
|
||||||
terns.
|
terns.
|
||||||
|
|
||||||
/abc/push
|
/abc/push
|
||||||
|
@ -1830,10 +1864,10 @@ SAVING AND RESTORING COMPILED PATTERNS
|
||||||
#pop jit,bincode
|
#pop jit,bincode
|
||||||
abc
|
abc
|
||||||
|
|
||||||
If jitverify is used with #pop, it does not automatically imply jit,
|
If jitverify is used with #pop, it does not automatically imply jit,
|
||||||
which is different behaviour from when it is used on a pattern.
|
which is different behaviour from when it is used on a pattern.
|
||||||
|
|
||||||
The #popcopy command is analogous to the pushcopy modifier in that it
|
The #popcopy command is analogous to the pushcopy modifier in that it
|
||||||
makes current a copy of the topmost stack pattern, leaving the original
|
makes current a copy of the topmost stack pattern, leaving the original
|
||||||
still on the stack.
|
still on the stack.
|
||||||
|
|
||||||
|
@ -1853,5 +1887,5 @@ AUTHOR
|
||||||
|
|
||||||
REVISION
|
REVISION
|
||||||
|
|
||||||
Last updated: 21 September 2018
|
Last updated: 12 November 2018
|
||||||
Copyright (c) 1997-2018 University of Cambridge.
|
Copyright (c) 1997-2018 University of Cambridge.
|
||||||
|
|
|
@ -549,8 +549,12 @@ typedef struct pcre2_callout_enumerate_block { \
|
||||||
typedef struct pcre2_substitute_callout_block { \
|
typedef struct pcre2_substitute_callout_block { \
|
||||||
uint32_t version; /* Identifies version of block */ \
|
uint32_t version; /* Identifies version of block */ \
|
||||||
/* ------------------------ Version 0 ------------------------------- */ \
|
/* ------------------------ Version 0 ------------------------------- */ \
|
||||||
PCRE2_SIZE input_offsets[2]; /* Matched portion of the input */ \
|
PCRE2_SPTR input; /* Pointer to input subject string */ \
|
||||||
|
PCRE2_SPTR output; /* Pointer to output buffer */ \
|
||||||
PCRE2_SIZE output_offsets[2]; /* Changed portion of the output */ \
|
PCRE2_SIZE output_offsets[2]; /* Changed portion of the output */ \
|
||||||
|
PCRE2_SIZE *ovector; /* Pointer to current ovector */ \
|
||||||
|
uint32_t oveccount; /* Count of pairs set in ovector */ \
|
||||||
|
uint32_t subscount; /* Substitution number */ \
|
||||||
/* ------------------------------------------------------------------ */ \
|
/* ------------------------------------------------------------------ */ \
|
||||||
} pcre2_substitute_callout_block;
|
} pcre2_substitute_callout_block;
|
||||||
|
|
||||||
|
@ -609,7 +613,7 @@ PCRE2_EXP_DECL int PCRE2_CALL_CONVENTION \
|
||||||
int (*)(pcre2_callout_block *, void *), void *); \
|
int (*)(pcre2_callout_block *, void *), void *); \
|
||||||
PCRE2_EXP_DECL int PCRE2_CALL_CONVENTION \
|
PCRE2_EXP_DECL int PCRE2_CALL_CONVENTION \
|
||||||
pcre2_set_substitute_callout(pcre2_match_context *, \
|
pcre2_set_substitute_callout(pcre2_match_context *, \
|
||||||
void (*)(pcre2_substitute_callout_block *, void *), void *); \
|
int (*)(pcre2_substitute_callout_block *, void *), void *); \
|
||||||
PCRE2_EXP_DECL int PCRE2_CALL_CONVENTION \
|
PCRE2_EXP_DECL int PCRE2_CALL_CONVENTION \
|
||||||
pcre2_set_depth_limit(pcre2_match_context *, uint32_t); \
|
pcre2_set_depth_limit(pcre2_match_context *, uint32_t); \
|
||||||
PCRE2_EXP_DECL int PCRE2_CALL_CONVENTION \
|
PCRE2_EXP_DECL int PCRE2_CALL_CONVENTION \
|
||||||
|
|
|
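With the callout now returning int, a callout function can veto individual substitutions. The following is a minimal sketch of such a function, assuming the block layout shown on the new side of this diff. To stay self-contained it uses a stand-in struct (demo_callout_block) with plain C types in place of PCRE2_SPTR and PCRE2_SIZE; demo_policy and demo_callout are hypothetical names. Real code would include pcre2.h and register the function with pcre2_set_substitute_callout().

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the 8-bit pcre2_substitute_callout_block, mirroring the
   fields on the new side of the diff. Not the real type. */
typedef struct {
  uint32_t version;             /* Identifies version of block */
  const unsigned char *input;   /* Pointer to input subject string */
  const unsigned char *output;  /* Pointer to output buffer */
  size_t output_offsets[2];     /* Changed portion of the output */
  size_t *ovector;              /* Pointer to current ovector */
  uint32_t oveccount;           /* Count of pairs set in ovector */
  uint32_t subscount;           /* Substitution number */
} demo_callout_block;

/* Hypothetical user data: which substitution to skip and at which one to
   stop. Zero means "not set". */
typedef struct {
  uint32_t skip;
  uint32_t stop;
} demo_policy;

/* Returns 0 to accept the substitution, a positive value to cancel only
   this one, and a negative value to cancel this one and do no more. */
int demo_callout(demo_callout_block *scb, void *data)
{
  const demo_policy *p = data;
  if (scb->version != 0) return 0;   /* unknown layout: accept as is */
  if (p->stop != 0 && scb->subscount == p->stop) return -1;
  if (p->skip != 0 && scb->subscount == p->skip) return 1;
  return 0;
}
```

The return convention follows the comments added to pcre2_substitute.c in this commit: any non-zero value cancels the current substitution, and a negative value also stops further substitutions.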
@ -407,7 +407,7 @@ return 0;
|
||||||
|
|
||||||
PCRE2_EXP_DEFN int PCRE2_CALL_CONVENTION
|
PCRE2_EXP_DEFN int PCRE2_CALL_CONVENTION
|
||||||
pcre2_set_substitute_callout(pcre2_match_context *mcontext,
|
pcre2_set_substitute_callout(pcre2_match_context *mcontext,
|
||||||
void (*substitute_callout)(pcre2_substitute_callout_block *, void *),
|
int (*substitute_callout)(pcre2_substitute_callout_block *, void *),
|
||||||
void *substitute_callout_data)
|
void *substitute_callout_data)
|
||||||
{
|
{
|
||||||
mcontext->substitute_callout = substitute_callout;
|
mcontext->substitute_callout = substitute_callout;
|
||||||
|
|
|
@ -585,7 +585,7 @@ typedef struct pcre2_real_match_context {
|
||||||
#endif
|
#endif
|
||||||
int (*callout)(pcre2_callout_block *, void *);
|
int (*callout)(pcre2_callout_block *, void *);
|
||||||
void *callout_data;
|
void *callout_data;
|
||||||
void (*substitute_callout)(pcre2_substitute_callout_block *, void *);
|
int (*substitute_callout)(pcre2_substitute_callout_block *, void *);
|
||||||
void *substitute_callout_data;
|
void *substitute_callout_data;
|
||||||
PCRE2_SIZE offset_limit;
|
PCRE2_SIZE offset_limit;
|
||||||
uint32_t heap_limit;
|
uint32_t heap_limit;
|
||||||
|
|
|
@ -241,13 +241,15 @@ PCRE2_SIZE *ovector;
|
||||||
PCRE2_SIZE ovecsave[3];
|
PCRE2_SIZE ovecsave[3];
|
||||||
pcre2_substitute_callout_block scb;
|
pcre2_substitute_callout_block scb;
|
||||||
|
|
||||||
scb.version = 0;
|
/* General initialization */
|
||||||
|
|
||||||
buff_offset = 0;
|
buff_offset = 0;
|
||||||
lengthleft = buff_length = *blength;
|
lengthleft = buff_length = *blength;
|
||||||
*blength = PCRE2_UNSET;
|
*blength = PCRE2_UNSET;
|
||||||
ovecsave[0] = ovecsave[1] = ovecsave[2] = PCRE2_UNSET;
|
ovecsave[0] = ovecsave[1] = ovecsave[2] = PCRE2_UNSET;
|
||||||
|
|
||||||
/* Partial matching is not valid. */
|
/* Partial matching is not valid. This must come after setting *blength to
|
||||||
|
PCRE2_UNSET, so as not to imply an offset in the replacement. */
|
||||||
|
|
||||||
if ((options & (PCRE2_PARTIAL_HARD|PCRE2_PARTIAL_SOFT)) != 0)
|
if ((options & (PCRE2_PARTIAL_HARD|PCRE2_PARTIAL_SOFT)) != 0)
|
||||||
return PCRE2_ERROR_BADOPTION;
|
return PCRE2_ERROR_BADOPTION;
|
||||||
|
@ -266,6 +268,13 @@ if (match_data == NULL)
|
||||||
ovector = pcre2_get_ovector_pointer(match_data);
|
ovector = pcre2_get_ovector_pointer(match_data);
|
||||||
ovector_count = pcre2_get_ovector_count(match_data);
|
ovector_count = pcre2_get_ovector_count(match_data);
|
||||||
|
|
||||||
|
/* Fixed things in the callout block */
|
||||||
|
|
||||||
|
scb.version = 0;
|
||||||
|
scb.input = subject;
|
||||||
|
scb.output = (PCRE2_SPTR)buffer;
|
||||||
|
scb.ovector = ovector;
|
||||||
|
|
||||||
/* Find lengths of zero-terminated strings and the end of the replacement. */
|
/* Find lengths of zero-terminated strings and the end of the replacement. */
|
||||||
|
|
||||||
if (length == PCRE2_ZERO_TERMINATED) length = PRIV(strlen)(subject);
|
if (length == PCRE2_ZERO_TERMINATED) length = PRIV(strlen)(subject);
|
||||||
|
@ -393,11 +402,6 @@ do
|
||||||
goto EXIT;
|
goto EXIT;
|
||||||
}
|
}
|
||||||
|
|
||||||
/* Save the match point for a possible callout */
|
|
||||||
|
|
||||||
scb.input_offsets[0] = ovector[0];
|
|
||||||
scb.input_offsets[1] = ovector[1];
|
|
||||||
|
|
||||||
/* Count substitutions with a paranoid check for integer overflow; surely no
|
/* Count substitutions with a paranoid check for integer overflow; surely no
|
||||||
real call to this function would ever hit this! */
|
real call to this function would ever hit this! */
|
||||||
|
|
||||||
|
@ -409,12 +413,13 @@ do
|
||||||
subs++;
|
subs++;
|
||||||
|
|
||||||
/* Copy the text leading up to the match, and remember where the insert
|
/* Copy the text leading up to the match, and remember where the insert
|
||||||
begins. */
|
begins and how many ovector pairs are set. */
|
||||||
|
|
||||||
if (rc == 0) rc = ovector_count;
|
if (rc == 0) rc = ovector_count;
|
||||||
fraglength = ovector[0] - start_offset;
|
fraglength = ovector[0] - start_offset;
|
||||||
CHECKMEMCPY(subject + start_offset, fraglength);
|
CHECKMEMCPY(subject + start_offset, fraglength);
|
||||||
scb.output_offsets[0] = buff_offset;
|
scb.output_offsets[0] = buff_offset;
|
||||||
|
scb.oveccount = rc;
|
||||||
|
|
||||||
/* Process the replacement string. Literal mode is set by \Q, but only in
|
/* Process the replacement string. Literal mode is set by \Q, but only in
|
||||||
extended mode when backslashes are being interpreted. In extended mode we
|
extended mode when backslashes are being interpreted. In extended mode we
|
||||||
|
@ -836,8 +841,26 @@ do
|
||||||
|
|
||||||
if (!overflowed && mcontext->substitute_callout != NULL)
|
if (!overflowed && mcontext->substitute_callout != NULL)
|
||||||
{
|
{
|
||||||
|
scb.subscount = subs;
|
||||||
scb.output_offsets[1] = buff_offset;
|
scb.output_offsets[1] = buff_offset;
|
||||||
mcontext->substitute_callout(&scb, mcontext->substitute_callout_data);
|
rc = mcontext->substitute_callout(&scb, mcontext->substitute_callout_data);
|
||||||
|
|
||||||
|
/* A non-zero return means cancel this substitution. Instead, copy the
|
||||||
|
matched string fragment. */
|
||||||
|
|
||||||
|
if (rc != 0)
|
||||||
|
{
|
||||||
|
PCRE2_SIZE newlength = scb.output_offsets[1] - scb.output_offsets[0];
|
||||||
|
PCRE2_SIZE oldlength = ovector[1] - ovector[0];
|
||||||
|
|
||||||
|
buff_offset -= newlength;
|
||||||
|
lengthleft += newlength;
|
||||||
|
CHECKMEMCPY(subject + ovector[0], oldlength);
|
||||||
|
|
||||||
|
/* A negative return means do not do any more. */
|
||||||
|
|
||||||
|
if (rc < 0) suboptions &= (~PCRE2_SUBSTITUTE_GLOBAL);
|
||||||
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
/* Save the details of this match. See above for how this data is used. If we
|
/* Save the details of this match. See above for how this data is used. If we
|
||||||
|
|
|
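The cancellation path added above (wind the output back by the length of the inserted replacement, then copy the matched fragment instead) can be illustrated in isolation. This is a simplified sketch, not the library code: demo_out, demo_append and demo_cancel are hypothetical names, and demo_append simply fails when the buffer is full, whereas the real CHECKMEMCPY macro has its own overflow handling.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical model of the output buffer state kept by the substitute
   function (buff_offset and lengthleft in the diff). */
typedef struct {
  char *buffer;
  size_t buff_offset;  /* next free position in the buffer */
  size_t lengthleft;   /* space remaining */
} demo_out;

/* Append n bytes; returns 0 on success, -1 if there is no room. */
int demo_append(demo_out *o, const char *src, size_t n)
{
  if (n > o->lengthleft) return -1;
  memcpy(o->buffer + o->buff_offset, src, n);
  o->buff_offset += n;
  o->lengthleft -= n;
  return 0;
}

/* Undo the replacement written since insert_start, then copy the original
   matched text subject[match_start..match_end) in its place. */
int demo_cancel(demo_out *o, size_t insert_start, const char *subject,
  size_t match_start, size_t match_end)
{
  size_t newlength = o->buff_offset - insert_start;
  o->buff_offset -= newlength;
  o->lengthleft += newlength;
  return demo_append(o, subject + match_start, match_end - match_start);
}
```

Because the text leading up to the match has already been copied before the insert point, only the replacement itself needs to be undone.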
@ -531,12 +531,14 @@ different things in the two cases. */
|
||||||
subject must be at the start and in the same order in both cases so that the
|
subject must be at the start and in the same order in both cases so that the
|
||||||
same offset in the big table below works for both. */
|
same offset in the big table below works for both. */
|
||||||
|
|
||||||
typedef struct patctl { /* Structure for pattern modifiers. */
|
typedef struct patctl { /* Structure for pattern modifiers. */
|
||||||
uint32_t options; /* Must be in same position as datctl */
|
uint32_t options; /* Must be in same position as datctl */
|
||||||
uint32_t control; /* Must be in same position as datctl */
|
uint32_t control; /* Must be in same position as datctl */
|
||||||
uint32_t control2; /* Must be in same position as datctl */
|
uint32_t control2; /* Must be in same position as datctl */
|
||||||
uint32_t jitstack; /* Must be in same position as datctl */
|
uint32_t jitstack; /* Must be in same position as datctl */
|
||||||
uint8_t replacement[REPLACE_MODSIZE]; /* So must this */
|
uint8_t replacement[REPLACE_MODSIZE]; /* So must this */
|
||||||
|
uint32_t substitute_skip; /* Must be in same position as datctl */
|
||||||
|
uint32_t substitute_stop; /* Must be in same position as datctl */
|
||||||
uint32_t jit;
|
uint32_t jit;
|
||||||
uint32_t stackguard_test;
|
uint32_t stackguard_test;
|
||||||
uint32_t tables_id;
|
uint32_t tables_id;
|
||||||
|
@ -551,12 +553,14 @@ typedef struct patctl { /* Structure for pattern modifiers. */
|
||||||
#define MAXCPYGET 10
|
#define MAXCPYGET 10
|
||||||
#define LENCPYGET 64
|
#define LENCPYGET 64
|
||||||
|
|
||||||
typedef struct datctl { /* Structure for data line modifiers. */
|
typedef struct datctl { /* Structure for data line modifiers. */
|
||||||
uint32_t options; /* Must be in same position as patctl */
|
uint32_t options; /* Must be in same position as patctl */
|
||||||
uint32_t control; /* Must be in same position as patctl */
|
uint32_t control; /* Must be in same position as patctl */
|
||||||
uint32_t control2; /* Must be in same position as patctl */
|
uint32_t control2; /* Must be in same position as patctl */
|
||||||
uint32_t jitstack; /* Must be in same position as patctl */
|
uint32_t jitstack; /* Must be in same position as patctl */
|
||||||
uint8_t replacement[REPLACE_MODSIZE]; /* So must this */
|
uint8_t replacement[REPLACE_MODSIZE]; /* So must this */
|
||||||
|
uint32_t substitute_skip; /* Must be in same position as patctl */
|
||||||
|
uint32_t substitute_stop; /* Must be in same position as patctl */
|
||||||
uint32_t startend[2];
|
uint32_t startend[2];
|
||||||
uint32_t cerror[2];
|
uint32_t cerror[2];
|
||||||
uint32_t cfail[2];
|
uint32_t cfail[2];
|
||||||
|
@ -704,6 +708,8 @@ static modstruct modlist[] = {
|
||||||
{ "substitute_callout", MOD_PND, MOD_CTL, CTL2_SUBSTITUTE_CALLOUT, PO(control2) },
|
{ "substitute_callout", MOD_PND, MOD_CTL, CTL2_SUBSTITUTE_CALLOUT, PO(control2) },
|
||||||
{ "substitute_extended", MOD_PND, MOD_CTL, CTL2_SUBSTITUTE_EXTENDED, PO(control2) },
|
{ "substitute_extended", MOD_PND, MOD_CTL, CTL2_SUBSTITUTE_EXTENDED, PO(control2) },
|
||||||
{ "substitute_overflow_length", MOD_PND, MOD_CTL, CTL2_SUBSTITUTE_OVERFLOW_LENGTH, PO(control2) },
|
{ "substitute_overflow_length", MOD_PND, MOD_CTL, CTL2_SUBSTITUTE_OVERFLOW_LENGTH, PO(control2) },
|
||||||
|
{ "substitute_skip", MOD_PND, MOD_INT, 0, PO(substitute_skip) },
|
||||||
|
{ "substitute_stop", MOD_PND, MOD_INT, 0, PO(substitute_stop) },
|
||||||
{ "substitute_unknown_unset", MOD_PND, MOD_CTL, CTL2_SUBSTITUTE_UNKNOWN_UNSET, PO(control2) },
|
{ "substitute_unknown_unset", MOD_PND, MOD_CTL, CTL2_SUBSTITUTE_UNKNOWN_UNSET, PO(control2) },
|
||||||
{ "substitute_unset_empty", MOD_PND, MOD_CTL, CTL2_SUBSTITUTE_UNSET_EMPTY, PO(control2) },
|
{ "substitute_unset_empty", MOD_PND, MOD_CTL, CTL2_SUBSTITUTE_UNSET_EMPTY, PO(control2) },
|
||||||
{ "tables", MOD_PAT, MOD_INT, 0, PO(tables_id) },
|
{ "tables", MOD_PAT, MOD_INT, 0, PO(tables_id) },
|
||||||
|
@ -1370,13 +1376,13 @@ are supported. */
|
||||||
#define PCRE2_SET_SUBSTITUTE_CALLOUT(a,b,c) \
|
#define PCRE2_SET_SUBSTITUTE_CALLOUT(a,b,c) \
|
||||||
if (test_mode == PCRE8_MODE) \
|
if (test_mode == PCRE8_MODE) \
|
||||||
pcre2_set_substitute_callout_8(G(a,8), \
|
pcre2_set_substitute_callout_8(G(a,8), \
|
||||||
(void (*)(pcre2_substitute_callout_block_8 *, void *))b,c); \
|
(int (*)(pcre2_substitute_callout_block_8 *, void *))b,c); \
|
||||||
else if (test_mode == PCRE16_MODE) \
|
else if (test_mode == PCRE16_MODE) \
|
||||||
pcre2_set_substitute_callout_16(G(a,16), \
|
pcre2_set_substitute_callout_16(G(a,16), \
|
||||||
(void (*)(pcre2_substitute_callout_block_16 *, void *))b,c); \
|
(int (*)(pcre2_substitute_callout_block_16 *, void *))b,c); \
|
||||||
else \
|
else \
|
||||||
pcre2_set_substitute_callout_32(G(a,32), \
|
pcre2_set_substitute_callout_32(G(a,32), \
|
||||||
(void (*)(pcre2_substitute_callout_block_32 *, void *))b,c)
|
(int (*)(pcre2_substitute_callout_block_32 *, void *))b,c)
|
||||||
|
|
||||||
#define PCRE2_SUBSTITUTE(a,b,c,d,e,f,g,h,i,j,k,l) \
|
#define PCRE2_SUBSTITUTE(a,b,c,d,e,f,g,h,i,j,k,l) \
|
||||||
if (test_mode == PCRE8_MODE) \
|
if (test_mode == PCRE8_MODE) \
|
||||||
|
@ -1850,10 +1856,10 @@ the three different cases. */
|
||||||
#define PCRE2_SET_SUBSTITUTE_CALLOUT(a,b,c) \
|
#define PCRE2_SET_SUBSTITUTE_CALLOUT(a,b,c) \
|
||||||
if (test_mode == G(G(PCRE,BITONE),_MODE)) \
|
if (test_mode == G(G(PCRE,BITONE),_MODE)) \
|
||||||
G(pcre2_set_substitute_callout_,BITONE)(G(a,BITONE), \
|
G(pcre2_set_substitute_callout_,BITONE)(G(a,BITONE), \
|
||||||
(void (*)(G(pcre2_substitute_callout_block_,BITONE) *, void *))b,c); \
|
(int (*)(G(pcre2_substitute_callout_block_,BITONE) *, void *))b,c); \
|
||||||
else \
|
else \
|
||||||
G(pcre2_set_substitute_callout_,BITTWO)(G(a,BITTWO), \
|
G(pcre2_set_substitute_callout_,BITTWO)(G(a,BITTWO), \
|
||||||
(void (*)(G(pcre2_substitute_callout_block_,BITTWO) *, void *))b,c)
|
(int (*)(G(pcre2_substitute_callout_block_,BITTWO) *, void *))b,c)
|
||||||
|
|
||||||
#define PCRE2_SUBSTITUTE(a,b,c,d,e,f,g,h,i,j,k,l) \
|
#define PCRE2_SUBSTITUTE(a,b,c,d,e,f,g,h,i,j,k,l) \
|
||||||
if (test_mode == G(G(PCRE,BITONE),_MODE)) \
|
if (test_mode == G(G(PCRE,BITONE),_MODE)) \
|
||||||
|
@ -2058,7 +2064,7 @@ the three different cases. */
|
||||||
#define PCRE2_SET_PARENS_NEST_LIMIT(a,b) pcre2_set_parens_nest_limit_8(G(a,8),b)
|
#define PCRE2_SET_PARENS_NEST_LIMIT(a,b) pcre2_set_parens_nest_limit_8(G(a,8),b)
|
||||||
#define PCRE2_SET_SUBSTITUTE_CALLOUT(a,b,c) \
|
#define PCRE2_SET_SUBSTITUTE_CALLOUT(a,b,c) \
|
||||||
pcre2_set_substitute_callout_8(G(a,8), \
|
pcre2_set_substitute_callout_8(G(a,8), \
|
||||||
(void (*)(pcre2_substitute_callout_block_8 *, void *))b,c)
|
(int (*)(pcre2_substitute_callout_block_8 *, void *))b,c)
|
||||||
#define PCRE2_SUBSTITUTE(a,b,c,d,e,f,g,h,i,j,k,l) \
|
#define PCRE2_SUBSTITUTE(a,b,c,d,e,f,g,h,i,j,k,l) \
|
||||||
a = pcre2_substitute_8(G(b,8),(PCRE2_SPTR8)c,d,e,f,G(g,8),G(h,8), \
|
a = pcre2_substitute_8(G(b,8),(PCRE2_SPTR8)c,d,e,f,G(g,8),G(h,8), \
|
||||||
(PCRE2_SPTR8)i,j,(PCRE2_UCHAR8 *)k,l)
|
(PCRE2_SPTR8)i,j,(PCRE2_UCHAR8 *)k,l)
|
||||||
|
@ -2165,7 +2171,7 @@ the three different cases. */
|
||||||
#define PCRE2_SET_PARENS_NEST_LIMIT(a,b) pcre2_set_parens_nest_limit_16(G(a,16),b)
|
#define PCRE2_SET_PARENS_NEST_LIMIT(a,b) pcre2_set_parens_nest_limit_16(G(a,16),b)
|
||||||
#define PCRE2_SET_SUBSTITUTE_CALLOUT(a,b,c) \
|
#define PCRE2_SET_SUBSTITUTE_CALLOUT(a,b,c) \
|
||||||
pcre2_set_substitute_callout_16(G(a,16), \
|
pcre2_set_substitute_callout_16(G(a,16), \
|
||||||
(void (*)(pcre2_substitute_callout_block_16 *, void *))b,c)
|
(int (*)(pcre2_substitute_callout_block_16 *, void *))b,c)
|
||||||
#define PCRE2_SUBSTITUTE(a,b,c,d,e,f,g,h,i,j,k,l) \
|
#define PCRE2_SUBSTITUTE(a,b,c,d,e,f,g,h,i,j,k,l) \
|
||||||
a = pcre2_substitute_16(G(b,16),(PCRE2_SPTR16)c,d,e,f,G(g,16),G(h,16), \
|
a = pcre2_substitute_16(G(b,16),(PCRE2_SPTR16)c,d,e,f,G(g,16),G(h,16), \
|
||||||
(PCRE2_SPTR16)i,j,(PCRE2_UCHAR16 *)k,l)
|
(PCRE2_SPTR16)i,j,(PCRE2_UCHAR16 *)k,l)
|
||||||
|
@ -2272,7 +2278,7 @@ the three different cases. */
|
||||||
#define PCRE2_SET_PARENS_NEST_LIMIT(a,b) pcre2_set_parens_nest_limit_32(G(a,32),b)
|
#define PCRE2_SET_PARENS_NEST_LIMIT(a,b) pcre2_set_parens_nest_limit_32(G(a,32),b)
|
||||||
#define PCRE2_SET_SUBSTITUTE_CALLOUT(a,b,c) \
|
#define PCRE2_SET_SUBSTITUTE_CALLOUT(a,b,c) \
|
||||||
pcre2_set_substitute_callout_32(G(a,32), \
|
pcre2_set_substitute_callout_32(G(a,32), \
|
||||||
(void (*)(pcre2_substitute_callout_block_32 *, void *))b,c)
|
(int (*)(pcre2_substitute_callout_block_32 *, void *))b,c)
|
||||||
#define PCRE2_SUBSTITUTE(a,b,c,d,e,f,g,h,i,j,k,l) \
|
#define PCRE2_SUBSTITUTE(a,b,c,d,e,f,g,h,i,j,k,l) \
|
||||||
a = pcre2_substitute_32(G(b,32),(PCRE2_SPTR32)c,d,e,f,G(g,32),G(h,32), \
|
a = pcre2_substitute_32(G(b,32),(PCRE2_SPTR32)c,d,e,f,G(g,32),G(h,32), \
|
||||||
(PCRE2_SPTR32)i,j,(PCRE2_UCHAR32 *)k,l)
|
(PCRE2_SPTR32)i,j,(PCRE2_UCHAR32 *)k,l)
|
||||||
|
@ -5955,17 +5961,40 @@ Arguments:
|
||||||
Returns: nothing
|
Returns: nothing
|
||||||
*/
|
*/
|
||||||
|
|
||||||
static void
|
static int
|
||||||
substitute_callout_function(pcre2_substitute_callout_block_8 *scb,
|
substitute_callout_function(pcre2_substitute_callout_block_8 *scb,
|
||||||
void *data_ptr)
|
void *data_ptr)
|
||||||
{
|
{
|
||||||
|
int yield = 0;
|
||||||
|
BOOL utf = (FLD(compiled_code, overall_options) & PCRE2_UTF) != 0;
|
||||||
(void)data_ptr; /* Not used */
|
(void)data_ptr; /* Not used */
|
||||||
fprintf(outfile, "Old %" SIZ_FORM " %" SIZ_FORM " New %" SIZ_FORM
|
|
||||||
" %" SIZ_FORM "\n",
|
fprintf(outfile, "%2d(%d) Old %" SIZ_FORM " %" SIZ_FORM " \"",
|
||||||
SIZ_CAST scb->input_offsets[0],
|
scb->subscount, scb->oveccount,
|
||||||
SIZ_CAST scb->input_offsets[1],
|
SIZ_CAST scb->ovector[0], SIZ_CAST scb->ovector[1]);
|
||||||
SIZ_CAST scb->output_offsets[0],
|
|
||||||
SIZ_CAST scb->output_offsets[1]);
|
PCHARSV(scb->input, scb->ovector[0], scb->ovector[1] - scb->ovector[0],
|
||||||
|
utf, outfile);
|
||||||
|
|
||||||
|
fprintf(outfile, "\" New %" SIZ_FORM " %" SIZ_FORM " \"",
|
||||||
|
SIZ_CAST scb->output_offsets[0], SIZ_CAST scb->output_offsets[1]);
|
||||||
|
|
||||||
|
PCHARSV(scb->output, scb->output_offsets[0],
|
||||||
|
scb->output_offsets[1] - scb->output_offsets[0], utf, outfile);
|
||||||
|
|
||||||
|
if (scb->subscount == dat_datctl.substitute_stop)
|
||||||
|
{
|
||||||
|
yield = -1;
|
||||||
|
fprintf(outfile, " STOPPED");
|
||||||
|
}
|
||||||
|
else if (scb->subscount == dat_datctl.substitute_skip)
|
||||||
|
{
|
||||||
|
yield = +1;
|
||||||
|
fprintf(outfile, " SKIPPED");
|
||||||
|
}
|
||||||
|
|
||||||
|
fprintf(outfile, "\"\n");
|
||||||
|
return yield;
|
||||||
}
|
}
|
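The return-value convention used by the new `substitute_callout_function` (0 to accept the replacement, `+1` to skip it, `-1` to stop) can be shown in isolation. The following is a minimal sketch, not the pcre2test code: `demo_callout_block`, `demo_limits`, and `demo_substitute_callout` are hypothetical names, and the struct stands in only for the `subscount` field of the real callout block. As in the diff, the stop check is made before the skip check.

```c
#include <stdint.h>

/* Hypothetical stand-in for the one callout-block field used here. */
typedef struct {
  uint32_t subscount;   /* number of the current substitution, counted from 1 */
} demo_callout_block;

/* Hypothetical user data: which substitution to skip or stop at (0 = none). */
typedef struct {
  uint32_t skip_at;
  uint32_t stop_at;
} demo_limits;

/* Returns 0 to accept the replacement, +1 to skip this substitution,
   -1 to stop substituting. Stop is tested first, mirroring the diff. */
static int demo_substitute_callout(const demo_callout_block *scb, void *data)
{
  const demo_limits *lim = (const demo_limits *)data;
  if (scb->subscount == lim->stop_at) return -1;
  if (scb->subscount == lim->skip_at) return +1;
  return 0;
}
```

Because `subscount` starts at 1, a limit of 0 never matches, which is why 0 can serve as "not set" for both modifiers.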
||||||
|
|
||||||
|
|
||||||
|
@ -6494,6 +6523,11 @@ dat_datctl.control2 |= (pat_patctl.control2 & CTL2_ALLPD);
|
||||||
strcpy((char *)dat_datctl.replacement, (char *)pat_patctl.replacement);
|
strcpy((char *)dat_datctl.replacement, (char *)pat_patctl.replacement);
|
||||||
if (dat_datctl.jitstack == 0) dat_datctl.jitstack = pat_patctl.jitstack;
|
if (dat_datctl.jitstack == 0) dat_datctl.jitstack = pat_patctl.jitstack;
|
||||||
|
|
||||||
|
if (dat_datctl.substitute_skip == 0)
|
||||||
|
dat_datctl.substitute_skip = pat_patctl.substitute_skip;
|
||||||
|
if (dat_datctl.substitute_stop == 0)
|
||||||
|
dat_datctl.substitute_stop = pat_patctl.substitute_stop;
|
||||||
|
|
||||||
/* Initialize for scanning the data line. */
|
/* Initialize for scanning the data line. */
|
||||||
|
|
||||||
#ifdef SUPPORT_PCRE2_8
|
#ifdef SUPPORT_PCRE2_8
|
||||||
|
@ -6832,6 +6866,11 @@ arg_ulen = ulen; /* Value to use in match arg */
|
||||||
|
|
||||||
if (p[-1] != 0 && !decode_modifiers(p, CTX_DAT, NULL, &dat_datctl))
|
if (p[-1] != 0 && !decode_modifiers(p, CTX_DAT, NULL, &dat_datctl))
|
||||||
return PR_OK;
|
return PR_OK;
|
||||||
|
|
||||||
|
/* Setting substitute_{skip,stop} implies a substitute callout. */
|
||||||
|
|
||||||
|
if (dat_datctl.substitute_skip != 0 || dat_datctl.substitute_stop != 0)
|
||||||
|
dat_datctl.control2 |= CTL2_SUBSTITUTE_CALLOUT;
|
||||||
|
|
||||||
/* Check for mutually exclusive modifiers. At present, these are all in the
|
/* Check for mutually exclusive modifiers. At present, these are all in the
|
||||||
first control word. */
|
first control word. */
|
||||||
|
|
|
@ -5516,6 +5516,21 @@ a)"xI
|
||||||
|
|
||||||
/a(b)c|xyz/g,replace=<$0>,substitute_callout
|
/a(b)c|xyz/g,replace=<$0>,substitute_callout
|
||||||
abcdefabcpqr
|
abcdefabcpqr
|
||||||
|
abxyzpqrabcxyz
|
||||||
|
12abc34xyz99abc55\=substitute_stop=2
|
||||||
|
12abc34xyz99abc55\=substitute_skip=1
|
||||||
|
12abc34xyz99abc55\=substitute_skip=2
|
||||||
|
|
||||||
|
/a(b)c|xyz/g,replace=<$0>
|
||||||
|
abcdefabcpqr
|
||||||
|
abxyzpqrabcxyz
|
||||||
|
12abc34xyz\=substitute_stop=2
|
||||||
|
12abc34xyz\=substitute_skip=1
|
||||||
|
|
||||||
|
/a(b)c|xyz/replace=<$0>
|
||||||
|
abcdefabcpqr
|
||||||
|
12abc34xyz\=substitute_skip=1
|
||||||
|
12abc34xyz\=substitute_stop=1
|
||||||
|
|
||||||
/abc\rdef/
|
/abc\rdef/
|
||||||
abc\ndef
|
abc\ndef
|
||||||
|
|
|
@ -1630,10 +1630,10 @@ No match
|
||||||
|
|
||||||
/(?<=abc)(|def)/g,utf,replace=<$0>,substitute_callout
|
/(?<=abc)(|def)/g,utf,replace=<$0>,substitute_callout
|
||||||
123abcáyzabcdef789abcሴqr
|
123abcáyzabcdef789abcሴqr
|
||||||
Old 6 6 New 6 8
|
1(2) Old 6 6 "" New 6 8 "<>"
|
||||||
Old 13 13 New 15 17
|
2(2) Old 13 13 "" New 15 17 "<>"
|
||||||
Old 13 16 New 17 22
|
3(2) Old 13 16 "def" New 17 22 "<def>"
|
||||||
Old 22 22 New 28 30
|
4(2) Old 22 22 "" New 28 30 "<>"
|
||||||
4: 123abc<>\x{e1}yzabc<><def>789abc<>\x{1234}qr
|
4: 123abc<>\x{e1}yzabc<><def>789abc<>\x{1234}qr
|
||||||
|
|
||||||
# End of testinput10
|
# End of testinput10
|
||||||
|
|
|
@ -1475,10 +1475,10 @@ No match
|
||||||
|
|
||||||
/(?<=abc)(|def)/g,utf,replace=<$0>,substitute_callout
|
/(?<=abc)(|def)/g,utf,replace=<$0>,substitute_callout
|
||||||
123abcáyzabcdef789abcሴqr
|
123abcáyzabcdef789abcሴqr
|
||||||
Old 6 6 New 6 8
|
1(2) Old 6 6 "" New 6 8 "<>"
|
||||||
Old 12 12 New 14 16
|
2(2) Old 12 12 "" New 14 16 "<>"
|
||||||
Old 12 15 New 16 21
|
3(2) Old 12 15 "def" New 16 21 "<def>"
|
||||||
Old 21 21 New 27 29
|
4(2) Old 21 21 "" New 27 29 "<>"
|
||||||
4: 123abc<>\x{e1}yzabc<><def>789abc<>\x{1234}qr
|
4: 123abc<>\x{e1}yzabc<><def>789abc<>\x{1234}qr
|
||||||
|
|
||||||
# A few script run tests in non-UTF mode (but they need Unicode support)
|
# A few script run tests in non-UTF mode (but they need Unicode support)
|
||||||
|
|
|
@ -1472,10 +1472,10 @@ No match
|
||||||
|
|
||||||
/(?<=abc)(|def)/g,utf,replace=<$0>,substitute_callout
|
/(?<=abc)(|def)/g,utf,replace=<$0>,substitute_callout
|
||||||
123abcáyzabcdef789abcሴqr
|
123abcáyzabcdef789abcሴqr
|
||||||
Old 6 6 New 6 8
|
1(2) Old 6 6 "" New 6 8 "<>"
|
||||||
Old 12 12 New 14 16
|
2(2) Old 12 12 "" New 14 16 "<>"
|
||||||
Old 12 15 New 16 21
|
3(2) Old 12 15 "def" New 16 21 "<def>"
|
||||||
Old 21 21 New 27 29
|
4(2) Old 21 21 "" New 27 29 "<>"
|
||||||
4: 123abc<>\x{e1}yzabc<><def>789abc<>\x{1234}qr
|
4: 123abc<>\x{e1}yzabc<><def>789abc<>\x{1234}qr
|
||||||
|
|
||||||
# A few script run tests in non-UTF mode (but they need Unicode support)
|
# A few script run tests in non-UTF mode (but they need Unicode support)
|
||||||
|
|
|
@ -16797,9 +16797,52 @@ Subject length lower bound = 1
|
||||||
|
|
||||||
/a(b)c|xyz/g,replace=<$0>,substitute_callout
|
/a(b)c|xyz/g,replace=<$0>,substitute_callout
|
||||||
abcdefabcpqr
|
abcdefabcpqr
|
||||||
Old 0 3 New 0 5
|
1(2) Old 0 3 "abc" New 0 5 "<abc>"
|
||||||
Old 6 9 New 8 13
|
2(2) Old 6 9 "abc" New 8 13 "<abc>"
|
||||||
2: <abc>def<abc>pqr
|
2: <abc>def<abc>pqr
|
||||||
|
abxyzpqrabcxyz
|
||||||
|
1(1) Old 2 5 "xyz" New 2 7 "<xyz>"
|
||||||
|
2(2) Old 8 11 "abc" New 10 15 "<abc>"
|
||||||
|
3(1) Old 11 14 "xyz" New 15 20 "<xyz>"
|
||||||
|
3: ab<xyz>pqr<abc><xyz>
|
||||||
|
12abc34xyz99abc55\=substitute_stop=2
|
||||||
|
1(2) Old 2 5 "abc" New 2 7 "<abc>"
|
||||||
|
2(1) Old 7 10 "xyz" New 9 14 "<xyz> STOPPED"
|
||||||
|
2: 12<abc>34xyz99abc55
|
||||||
|
12abc34xyz99abc55\=substitute_skip=1
|
||||||
|
1(2) Old 2 5 "abc" New 2 7 "<abc> SKIPPED"
|
||||||
|
2(1) Old 7 10 "xyz" New 7 12 "<xyz>"
|
||||||
|
3(2) Old 12 15 "abc" New 14 19 "<abc>"
|
||||||
|
3: 12abc34<xyz>99<abc>55
|
||||||
|
12abc34xyz99abc55\=substitute_skip=2
|
||||||
|
1(2) Old 2 5 "abc" New 2 7 "<abc>"
|
||||||
|
2(1) Old 7 10 "xyz" New 9 14 "<xyz> SKIPPED"
|
||||||
|
3(2) Old 12 15 "abc" New 14 19 "<abc>"
|
||||||
|
3: 12<abc>34xyz99<abc>55
|
||||||
|
|
||||||
|
/a(b)c|xyz/g,replace=<$0>
|
||||||
|
abcdefabcpqr
|
||||||
|
2: <abc>def<abc>pqr
|
||||||
|
abxyzpqrabcxyz
|
||||||
|
3: ab<xyz>pqr<abc><xyz>
|
||||||
|
12abc34xyz\=substitute_stop=2
|
||||||
|
1(2) Old 2 5 "abc" New 2 7 "<abc>"
|
||||||
|
2(1) Old 7 10 "xyz" New 9 14 "<xyz> STOPPED"
|
||||||
|
2: 12<abc>34xyz
|
||||||
|
12abc34xyz\=substitute_skip=1
|
||||||
|
1(2) Old 2 5 "abc" New 2 7 "<abc> SKIPPED"
|
||||||
|
2(1) Old 7 10 "xyz" New 7 12 "<xyz>"
|
||||||
|
2: 12abc34<xyz>
|
||||||
|
|
||||||
|
/a(b)c|xyz/replace=<$0>
|
||||||
|
abcdefabcpqr
|
||||||
|
1: <abc>defabcpqr
|
||||||
|
12abc34xyz\=substitute_skip=1
|
||||||
|
1(2) Old 2 5 "abc" New 2 7 "<abc> SKIPPED"
|
||||||
|
1: 12abc34xyz
|
||||||
|
12abc34xyz\=substitute_stop=1
|
||||||
|
1(2) Old 2 5 "abc" New 2 7 "<abc> STOPPED"
|
||||||
|
1: 12abc34xyz
|
||||||
|
|
||||||
/abc\rdef/
|
/abc\rdef/
|
||||||
abc\ndef
|
abc\ndef
|
||||||
|
|
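The expected outputs above for `substitute_skip` and `substitute_stop` can be reproduced with a small model of the visible behaviour. This is a sketch under stated assumptions, not PCRE2's implementation: `demo_match`, `demo_append`, and `demo_substitute` are hypothetical names, the match offsets are supplied by the caller rather than found by a regex engine, and the replacement is hard-coded to the `<$0>` form used in the tests.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical match record: [start, end) offsets into the subject. */
typedef struct { size_t start, end; } demo_match;

/* Append n bytes of src to the NUL-terminated string in out. */
static void demo_append(char *out, const char *src, size_t n)
{
  size_t len = strlen(out);
  memcpy(out + len, src, n);
  out[len + n] = '\0';
}

/* Model of replace=<$0> with skip/stop limits (0 = not set).
   out must hold at least strlen(subject) + 2 * nmatches + 1 bytes. */
static void demo_substitute(const char *subject, const demo_match *m,
                            size_t nmatches, size_t skip_at, size_t stop_at,
                            char *out)
{
  size_t pos = 0;                 /* next unread offset in subject */
  out[0] = '\0';
  for (size_t i = 0; i < nmatches; i++)
  {
    size_t n = i + 1;             /* substitutions are numbered from 1 */
    demo_append(out, subject + pos, m[i].start - pos);  /* text before match */
    pos = m[i].start;
    if (n == stop_at) break;      /* stop: the rest is copied unchanged */
    if (n == skip_at)
      demo_append(out, subject + pos, m[i].end - pos);  /* keep original text */
    else
    {
      demo_append(out, "<", 1);
      demo_append(out, subject + pos, m[i].end - pos);
      demo_append(out, ">", 1);
    }
    pos = m[i].end;
  }
  demo_append(out, subject + pos, strlen(subject) - pos);  /* remaining tail */
}
```

With the subject `12abc34xyz99abc55` and matches at offsets 2-5, 7-10, and 12-15, a stop limit of 2 gives `12<abc>34xyz99abc55` and a skip limit of 1 gives `12abc34<xyz>99<abc>55`, matching the test output above.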