Documentation update.
parent 0b64d9cfca
commit e7a762ddff
@ -2841,22 +2841,23 @@ undefined.
</P>
<P>
After a successful match, a partial match (PCRE2_ERROR_PARTIAL), or a failure
to match (PCRE2_ERROR_NOMATCH), a (*MARK), (*PRUNE), or (*THEN) name may be
available. The function <b>pcre2_get_mark()</b> can be called to access this
name. The same function applies to all three verbs. It returns a pointer to the
zero-terminated name, which is within the compiled pattern. If no name is
to match (PCRE2_ERROR_NOMATCH), a mark name may be available. The function
<b>pcre2_get_mark()</b> can be called to access this name, which can be
specified in the pattern by any of the backtracking control verbs, not just
(*MARK). The same function applies to all the verbs. It returns a pointer to
the zero-terminated name, which is within the compiled pattern. If no name is
available, NULL is returned. The length of the name (excluding the terminating
zero) is stored in the code unit that precedes the name. You should use this
length instead of relying on the terminating zero if the name might contain a
binary zero.
</P>
<P>
After a successful match, the name that is returned is the last (*MARK),
(*PRUNE), or (*THEN) name encountered on the matching path through the pattern.
Instances of (*PRUNE) and (*THEN) without names are ignored. Thus, for example,
if the matching path contains (*MARK:A)(*PRUNE), the name "A" is returned.
After a "no match" or a partial match, the last encountered name is returned.
For example, consider this pattern:
After a successful match, the name that is returned is the last mark name
encountered on the matching path through the pattern. Instances of backtracking
verbs without names do not count. Thus, for example, if the matching path
contains (*MARK:A)(*PRUNE), the name "A" is returned. After a "no match" or a
partial match, the last encountered name is returned. For example, consider
this pattern:
<pre>
^(*MARK:A)((*MARK:B)a|b)c
</pre>
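As a rough illustration of the rule described in this hunk (the last named verb on the matching path supplies the name, and verbs without names do not count), here is a minimal Python sketch. It is a toy model, not PCRE2 code: the function `last_mark_name` and the list-of-tuples representation of a matching path are assumptions made for this example only.

```python
from typing import List, Optional, Tuple

# Hypothetical model: a matching path is a list of (verb, name) pairs,
# where name is None for a verb written without a NAME argument.
Path = List[Tuple[str, Optional[str]]]


def last_mark_name(path: Path) -> Optional[str]:
    """Return the name of the last named verb on the path, or None."""
    name = None
    for _verb, verb_name in path:
        if verb_name is not None:  # verbs without names do not count
            name = verb_name
    return name


print(last_mark_name([("MARK", "A"), ("PRUNE", None)]))  # A
print(last_mark_name([("MARK", "A"), ("PRUNE", "B")]))   # B
```

Returning `None` for an empty path mirrors the documented NULL result when no name is available.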
@ -2871,7 +2872,7 @@ is removed from the pattern above, there is an initial check for the presence
of "c" in the subject before running the matching engine. This check fails for
"bx", causing a match failure without seeing any marks. You can disable the
start-of-match optimizations by setting the PCRE2_NO_START_OPTIMIZE option for
<b>pcre2_compile()</b> or starting the pattern with (*NO_START_OPT).
<b>pcre2_compile()</b> or by starting the pattern with (*NO_START_OPT).
</P>
<P>
After a successful match, a partial match, or one of the invalid UTF errors
@ -3286,13 +3287,12 @@ For example, if the pattern a(b)c is matched with "=abc=" and the replacement
string "+$1$0$1+", the result is "=+babcb+=".
</P>
<P>
$*MARK inserts the name from the last encountered (*ACCEPT), (*COMMIT),
(*MARK), (*PRUNE), or (*THEN) on the matching path that has a name. (*MARK)
must always include a name, but the other verbs need not. For example, in
the case of (*MARK:A)(*PRUNE) the name inserted is "A", but for
(*MARK:A)(*PRUNE:B) the relevant name is "B". This facility can be used to
perform simple simultaneous substitutions, as this <b>pcre2test</b> example
shows:
$*MARK inserts the name from the last encountered backtracking control verb on
the matching path that has a name. (*MARK) must always include a name, but the
other verbs need not. For example, in the case of (*MARK:A)(*PRUNE) the name
inserted is "A", but for (*MARK:A)(*PRUNE:B) the relevant name is "B". This
facility can be used to perform simple simultaneous substitutions, as this
<b>pcre2test</b> example shows:
<pre>
/(*MARK:pear)apple|(*MARK:orange)lemon/g,replace=${*MARK}
apple lemon
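The simultaneous substitution in this pcre2test example can be approximated outside PCRE2. The sketch below uses Python's standard `re` module, which has no (*MARK) verb, so named groups stand in for the mark names and `Match.lastgroup` plays the role of $*MARK. It is an analogue under those assumptions, not the PCRE2 mechanism itself; `swap_fruit` is a hypothetical helper name.

```python
import re

# Named groups stand in for (*MARK:pear) and (*MARK:orange).
PATTERN = re.compile(r"(?P<pear>apple)|(?P<orange>lemon)")


def swap_fruit(subject: str) -> str:
    """Replace each match with the name of the alternative that matched."""
    # lastgroup is the name of the last capture group that took part in
    # the match; only one alternative can match, so it identifies the branch.
    return PATTERN.sub(lambda m: m.lastgroup, subject)


print(swap_fruit("apple lemon"))  # pear orange
```

Both replacements happen in a single pass, so text produced by one substitution is never rescanned by the other.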
@ -3782,7 +3782,7 @@ Cambridge, England.
</P>
<br><a name="SEC42" href="#TOC1">REVISION</a><br>
<P>
Last updated: 12 November 2018
Last updated: 27 November 2018
<br>
Copyright © 1997-2018 University of Cambridge.
<br>
@ -871,9 +871,14 @@ only callouts with string arguments are useful.
Calling external programs or scripts
</b><br>
<P>
This facility can be independently disabled when <b>pcre2grep</b> is built. If
the callout string does not start with a pipe (vertical bar) character, it is
parsed into a list of substrings separated by pipe characters. The first
This facility can be independently disabled when <b>pcre2grep</b> is built. It
is supported for Windows, where a call to <b>_spawnvp()</b> is used, for VMS,
where <b>lib$spawn()</b> is used, and for any other Unix-like environment where
<b>fork()</b> and <b>execv()</b> are available.
</P>
<P>
If the callout string does not start with a pipe (vertical bar) character, it
is parsed into a list of substrings separated by pipe characters. The first
substring must be an executable name, with the following substrings specifying
arguments:
<pre>
@ -900,7 +905,7 @@ a single dollar and $| is replaced by a pipe character. Here is an example:
Arg1: [1] [234] [4] Arg2: |1| ()
12345
</pre>
The parameters for the <b>execv()</b> system call that is used to run the
The parameters for the system call that is used to run the
program or script are zero-terminated strings. This means that binary zero
characters in the callout argument will cause premature termination of their
substrings, and therefore should not be present. Any syntax errors in the
@ -966,7 +971,7 @@ Cambridge, England.
</P>
<br><a name="SEC16" href="#TOC1">REVISION</a><br>
<P>
Last updated: 17 November 2018
Last updated: 24 November 2018
<br>
Copyright © 1997-2018 University of Cambridge.
<br>
@ -3223,6 +3223,7 @@ There are a number of special "Backtracking Control Verbs" (to use Perl's
terminology) that modify the behaviour of backtracking during matching. They
are generally of the form (*VERB) or (*VERB:NAME). Some verbs take either form,
possibly behaving differently depending on whether or not a name is present.
The names are not required to be unique within the pattern.
</P>
<P>
By default, for compatibility with Perl, a name is any sequence of characters
@ -3331,8 +3332,8 @@ A match with the string "aaaa" always fails, but the callout is taken before
each backtrack happens (in this example, 10 times).
</P>
<P>
(*ACCEPT:NAME) and (*FAIL:NAME) behave exactly the same as
(*MARK:NAME)(*ACCEPT) and (*MARK:NAME)(*FAIL), respectively.
(*ACCEPT:NAME) and (*FAIL:NAME) are treated as (*MARK:NAME)(*ACCEPT) and
(*MARK:NAME)(*FAIL), respectively.
</P>
<br><b>
Recording which path was taken
@ -3344,27 +3345,25 @@ starting point (see (*SKIP) below).
<pre>
(*MARK:NAME) or (*:NAME)
</pre>
A name is always required with this verb. There may be as many instances of
(*MARK) as you like in a pattern, and their names do not have to be unique.
A name is always required with this verb. For all the other backtracking
control verbs, a NAME argument is optional.
</P>
<P>
When a match succeeds, the name of the last-encountered (*MARK:NAME) on the
When a match succeeds, the name of the last-encountered mark name on the
matching path is passed back to the caller as described in the section entitled
<a href="pcre2api.html#matchotherdata">"Other information about the match"</a>
in the
<a href="pcre2api.html"><b>pcre2api</b></a>
documentation. This applies to all instances of (*MARK), including those inside
assertions and atomic groups. (There are differences in those cases when
(*MARK) is used in conjunction with (*SKIP) as described below.)
documentation. This applies to all instances of (*MARK) and other verbs,
including those inside assertions and atomic groups. However, there are
differences in those cases when (*MARK) is used in conjunction with (*SKIP) as
described below.
</P>
<P>
As well as (*MARK), the (*COMMIT), (*PRUNE) and (*THEN) verbs may have
associated NAME arguments. Whichever is last on the matching path is passed
back. See below for more details of these other verbs.
</P>
<P>
Here is an example of <b>pcre2test</b> output, where the "mark" modifier
requests the retrieval and outputting of (*MARK) data:
The mark name that was last encountered on the matching path is passed back. A
verb without a NAME argument is ignored for this purpose. Here is an example of
<b>pcre2test</b> output, where the "mark" modifier requests the retrieval and
outputting of (*MARK) data:
<pre>
re> /X(*MARK:A)Y|X(*MARK:B)Z/mark
data> XY
@ -3414,7 +3413,7 @@ to the left of the verb. However, when one of these verbs appears inside an
atomic group or in a lookaround assertion that is true, its effect is confined
to that group, because once the group has been matched, there is never any
backtracking into it. Backtracking from beyond an assertion or an atomic group
ignores the entire group, and seeks a preceeding backtracking point.
ignores the entire group, and seeks a preceding backtracking point.
</P>
<P>
These verbs differ in exactly what kind of failure occurs when backtracking
@ -3439,8 +3438,8 @@ dynamic anchor, or "I've started, so I must finish."
<P>
The behaviour of (*COMMIT:NAME) is not the same as (*MARK:NAME)(*COMMIT). It is
like (*MARK:NAME) in that the name is remembered for passing back to the
caller. However, (*SKIP:NAME) searches only for names set with (*MARK),
ignoring those set by (*COMMIT), (*PRUNE) and (*THEN).
caller. However, (*SKIP:NAME) searches only for names that are set with
(*MARK), ignoring those set by any of the other backtracking verbs.
</P>
<P>
If there is more than one backtracking verb in a pattern, a different one that
@ -3484,7 +3483,7 @@ as (*COMMIT).
The behaviour of (*PRUNE:NAME) is not the same as (*MARK:NAME)(*PRUNE). It is
like (*MARK:NAME) in that the name is remembered for passing back to the
caller. However, (*SKIP:NAME) searches only for names set with (*MARK),
ignoring those set by (*COMMIT), (*PRUNE) or (*THEN).
ignoring those set by other backtracking verbs.
<pre>
(*SKIP)
</pre>
@ -3539,7 +3538,7 @@ the second branch of the pattern.
</P>
<P>
Note that (*SKIP:NAME) searches only for names set by (*MARK:NAME). It ignores
names that are set by (*COMMIT:NAME), (*PRUNE:NAME) or (*THEN:NAME).
names that are set by other backtracking verbs.
<pre>
(*THEN) or (*THEN:NAME)
</pre>
@ -3561,7 +3560,7 @@ group. If (*THEN) is not inside an alternation, it acts like (*PRUNE).
The behaviour of (*THEN:NAME) is not the same as (*MARK:NAME)(*THEN). It is
like (*MARK:NAME) in that the name is remembered for passing back to the
caller. However, (*SKIP:NAME) searches only for names set with (*MARK),
ignoring those set by (*COMMIT), (*PRUNE) and (*THEN).
ignoring those set by other backtracking verbs.
</P>
<P>
A subpattern that does not contain a | character is just a part of the
@ -3656,10 +3655,10 @@ subpattern.
</P>
<P>
(*ACCEPT) in a standalone positive assertion causes the assertion to succeed
without any further processing; captured strings and a (*MARK) name (if set)
are retained. In a standalone negative assertion, (*ACCEPT) causes the
assertion to fail without any further processing; captured substrings and any
(*MARK) name are discarded.
without any further processing; captured strings and a mark name (if set) are
retained. In a standalone negative assertion, (*ACCEPT) causes the assertion to
fail without any further processing; captured substrings and any mark name are
discarded.
</P>
<P>
If the assertion is a condition, (*ACCEPT) causes the condition to be true for
@ -3731,7 +3730,7 @@ Cambridge, England.
</P>
<br><a name="SEC31" href="#TOC1">REVISION</a><br>
<P>
Last updated: 12 October 2018
Last updated: 27 November 2018
<br>
Copyright © 1997-2018 University of Cambridge.
<br>
@ -2772,22 +2772,22 @@ OTHER INFORMATION ABOUT A MATCH
times, the result is undefined.

After a successful match, a partial match (PCRE2_ERROR_PARTIAL), or a
failure to match (PCRE2_ERROR_NOMATCH), a (*MARK), (*PRUNE), or (*THEN)
name may be available. The function pcre2_get_mark() can be called to
access this name. The same function applies to all three verbs. It
failure to match (PCRE2_ERROR_NOMATCH), a mark name may be available.
The function pcre2_get_mark() can be called to access this name, which
can be specified in the pattern by any of the backtracking control
verbs, not just (*MARK). The same function applies to all the verbs. It
returns a pointer to the zero-terminated name, which is within the com-
piled pattern. If no name is available, NULL is returned. The length of
the name (excluding the terminating zero) is stored in the code unit
that precedes the name. You should use this length instead of relying
on the terminating zero if the name might contain a binary zero.

After a successful match, the name that is returned is the last
(*MARK), (*PRUNE), or (*THEN) name encountered on the matching path
through the pattern. Instances of (*PRUNE) and (*THEN) without names
are ignored. Thus, for example, if the matching path contains
(*MARK:A)(*PRUNE), the name "A" is returned. After a "no match" or a
partial match, the last encountered name is returned. For example,
consider this pattern:
After a successful match, the name that is returned is the last mark
name encountered on the matching path through the pattern. Instances of
backtracking verbs without names do not count. Thus, for example, if
the matching path contains (*MARK:A)(*PRUNE), the name "A" is returned.
After a "no match" or a partial match, the last encountered name is
returned. For example, consider this pattern:

^(*MARK:A)((*MARK:B)a|b)c
@ -2802,8 +2802,8 @@ OTHER INFORMATION ABOUT A MATCH
for the presence of "c" in the subject before running the matching
engine. This check fails for "bx", causing a match failure without see-
ing any marks. You can disable the start-of-match optimizations by set-
ting the PCRE2_NO_START_OPTIMIZE option for pcre2_compile() or starting
the pattern with (*NO_START_OPT).
ting the PCRE2_NO_START_OPTIMIZE option for pcre2_compile() or by
starting the pattern with (*NO_START_OPT).

After a successful match, a partial match, or one of the invalid UTF
errors (for example, PCRE2_ERROR_UTF8_ERR5), pcre2_get_startchar() can
@ -3193,13 +3193,12 @@ CREATING A NEW STRING WITH SUBSTITUTIONS
matched with "=abc=" and the replacement string "+$1$0$1+", the result
is "=+babcb+=".

$*MARK inserts the name from the last encountered (*ACCEPT), (*COMMIT),
(*MARK), (*PRUNE), or (*THEN) on the matching path that has a name.
(*MARK) must always include a name, but the other verbs need not. For
example, in the case of (*MARK:A)(*PRUNE) the name inserted is "A", but
for (*MARK:A)(*PRUNE:B) the relevant name is "B". This facility can be
used to perform simple simultaneous substitutions, as this pcre2test
example shows:
$*MARK inserts the name from the last encountered backtracking control
verb on the matching path that has a name. (*MARK) must always include
a name, but the other verbs need not. For example, in the case of
(*MARK:A)(*PRUNE) the name inserted is "A", but for (*MARK:A)(*PRUNE:B)
the relevant name is "B". This facility can be used to perform simple
simultaneous substitutions, as this pcre2test example shows:

/(*MARK:pear)apple|(*MARK:orange)lemon/g,replace=${*MARK}
apple lemon
@ -3655,7 +3654,7 @@ AUTHOR
REVISION

Last updated: 12 November 2018
Last updated: 27 November 2018
Copyright (c) 1997-2018 University of Cambridge.
------------------------------------------------------------------------------
@ -8865,7 +8864,8 @@ BACKTRACKING CONTROL
Perl's terminology) that modify the behaviour of backtracking during
matching. They are generally of the form (*VERB) or (*VERB:NAME). Some
verbs take either form, possibly behaving differently depending on
whether or not a name is present.
whether or not a name is present. The names are not required to be
unique within the pattern.

By default, for compatibility with Perl, a name is any sequence of
characters that does not include a closing parenthesis. The name is not
@ -8959,8 +8959,8 @@ BACKTRACKING CONTROL
A match with the string "aaaa" always fails, but the callout is taken
before each backtrack happens (in this example, 10 times).

(*ACCEPT:NAME) and (*FAIL:NAME) behave exactly the same as
(*MARK:NAME)(*ACCEPT) and (*MARK:NAME)(*FAIL), respectively.
(*ACCEPT:NAME) and (*FAIL:NAME) are treated as (*MARK:NAME)(*ACCEPT)
and (*MARK:NAME)(*FAIL), respectively.

Recording which path was taken
@ -8970,24 +8970,21 @@ BACKTRACKING CONTROL
(*MARK:NAME) or (*:NAME)

A name is always required with this verb. There may be as many
instances of (*MARK) as you like in a pattern, and their names do not
have to be unique.
A name is always required with this verb. For all the other backtrack-
ing control verbs, a NAME argument is optional.

When a match succeeds, the name of the last-encountered (*MARK:NAME) on
When a match succeeds, the name of the last-encountered mark name on
the matching path is passed back to the caller as described in the sec-
tion entitled "Other information about the match" in the pcre2api docu-
mentation. This applies to all instances of (*MARK), including those
inside assertions and atomic groups. (There are differences in those
cases when (*MARK) is used in conjunction with (*SKIP) as described
below.)
mentation. This applies to all instances of (*MARK) and other verbs,
including those inside assertions and atomic groups. However, there are
differences in those cases when (*MARK) is used in conjunction with
(*SKIP) as described below.

As well as (*MARK), the (*COMMIT), (*PRUNE) and (*THEN) verbs may have
associated NAME arguments. Whichever is last on the matching path is
passed back. See below for more details of these other verbs.

Here is an example of pcre2test output, where the "mark" modifier
requests the retrieval and outputting of (*MARK) data:
The mark name that was last encountered on the matching path is passed
back. A verb without a NAME argument is ignored for this purpose. Here
is an example of pcre2test output, where the "mark" modifier requests
the retrieval and outputting of (*MARK) data:

re> /X(*MARK:A)Y|X(*MARK:B)Z/mark
data> XY
@ -9033,7 +9030,7 @@ BACKTRACKING CONTROL
that is true, its effect is confined to that group, because once the
group has been matched, there is never any backtracking into it. Back-
tracking from beyond an assertion or an atomic group ignores the entire
group, and seeks a preceeding backtracking point.
group, and seeks a preceding backtracking point.

These verbs differ in exactly what kind of failure occurs when back-
tracking reaches them. The behaviour described below is what happens
@ -9058,8 +9055,8 @@ BACKTRACKING CONTROL
The behaviour of (*COMMIT:NAME) is not the same as (*MARK:NAME)(*COM-
MIT). It is like (*MARK:NAME) in that the name is remembered for pass-
ing back to the caller. However, (*SKIP:NAME) searches only for names
set with (*MARK), ignoring those set by (*COMMIT), (*PRUNE) and
(*THEN).
that are set with (*MARK), ignoring those set by any of the other back-
tracking verbs.

If there is more than one backtracking verb in a pattern, a different
one that follows (*COMMIT) may be triggered first, so merely passing
@ -9103,7 +9100,7 @@ BACKTRACKING CONTROL
The behaviour of (*PRUNE:NAME) is not the same as (*MARK:NAME)(*PRUNE).
It is like (*MARK:NAME) in that the name is remembered for passing back
to the caller. However, (*SKIP:NAME) searches only for names set with
(*MARK), ignoring those set by (*COMMIT), (*PRUNE) or (*THEN).
(*MARK), ignoring those set by other backtracking verbs.

(*SKIP)
@ -9159,8 +9156,7 @@ BACKTRACKING CONTROL
the pattern.

Note that (*SKIP:NAME) searches only for names set by (*MARK:NAME). It
ignores names that are set by (*COMMIT:NAME), (*PRUNE:NAME) or
(*THEN:NAME).
ignores names that are set by other backtracking verbs.

(*THEN) or (*THEN:NAME)
@ -9182,7 +9178,7 @@ BACKTRACKING CONTROL
The behaviour of (*THEN:NAME) is not the same as (*MARK:NAME)(*THEN).
It is like (*MARK:NAME) in that the name is remembered for passing back
to the caller. However, (*SKIP:NAME) searches only for names set with
(*MARK), ignoring those set by (*COMMIT), (*PRUNE) and (*THEN).
(*MARK), ignoring those set by other backtracking verbs.

A subpattern that does not contain a | character is just a part of the
enclosing alternative; it is not a nested alternation with only one
@ -9269,10 +9265,10 @@ BACKTRACKING CONTROL
in a conditional subpattern.

(*ACCEPT) in a standalone positive assertion causes the assertion to
succeed without any further processing; captured strings and a (*MARK)
succeed without any further processing; captured strings and a mark
name (if set) are retained. In a standalone negative assertion,
(*ACCEPT) causes the assertion to fail without any further processing;
captured substrings and any (*MARK) name are discarded.
captured substrings and any mark name are discarded.

If the assertion is a condition, (*ACCEPT) causes the condition to be
true for a positive assertion and false for a negative one; captured
@ -9336,7 +9332,7 @@ AUTHOR
REVISION

Last updated: 12 October 2018
Last updated: 27 November 2018
Copyright (c) 1997-2018 University of Cambridge.
------------------------------------------------------------------------------
@ -1,4 +1,4 @@
.TH PCRE2API 3 "12 November 2018" "PCRE2 10.33"
.TH PCRE2API 3 "27 November 2018" "PCRE2 10.33"
.SH NAME
PCRE2 - Perl-compatible regular expressions (revised API)
.sp
@ -2842,21 +2842,22 @@ appropriate circumstances. If they are called at other times, the result is
undefined.
.P
After a successful match, a partial match (PCRE2_ERROR_PARTIAL), or a failure
to match (PCRE2_ERROR_NOMATCH), a (*MARK), (*PRUNE), or (*THEN) name may be
available. The function \fBpcre2_get_mark()\fP can be called to access this
name. The same function applies to all three verbs. It returns a pointer to the
zero-terminated name, which is within the compiled pattern. If no name is
to match (PCRE2_ERROR_NOMATCH), a mark name may be available. The function
\fBpcre2_get_mark()\fP can be called to access this name, which can be
specified in the pattern by any of the backtracking control verbs, not just
(*MARK). The same function applies to all the verbs. It returns a pointer to
the zero-terminated name, which is within the compiled pattern. If no name is
available, NULL is returned. The length of the name (excluding the terminating
zero) is stored in the code unit that precedes the name. You should use this
length instead of relying on the terminating zero if the name might contain a
binary zero.
.P
After a successful match, the name that is returned is the last (*MARK),
(*PRUNE), or (*THEN) name encountered on the matching path through the pattern.
Instances of (*PRUNE) and (*THEN) without names are ignored. Thus, for example,
if the matching path contains (*MARK:A)(*PRUNE), the name "A" is returned.
After a "no match" or a partial match, the last encountered name is returned.
For example, consider this pattern:
After a successful match, the name that is returned is the last mark name
encountered on the matching path through the pattern. Instances of backtracking
verbs without names do not count. Thus, for example, if the matching path
contains (*MARK:A)(*PRUNE), the name "A" is returned. After a "no match" or a
partial match, the last encountered name is returned. For example, consider
this pattern:
.sp
^(*MARK:A)((*MARK:B)a|b)c
.sp
@ -2870,7 +2871,7 @@ is removed from the pattern above, there is an initial check for the presence
of "c" in the subject before running the matching engine. This check fails for
"bx", causing a match failure without seeing any marks. You can disable the
start-of-match optimizations by setting the PCRE2_NO_START_OPTIMIZE option for
\fBpcre2_compile()\fP or starting the pattern with (*NO_START_OPT).
\fBpcre2_compile()\fP or by starting the pattern with (*NO_START_OPT).
.P
After a successful match, a partial match, or one of the invalid UTF errors
(for example, PCRE2_ERROR_UTF8_ERR5), \fBpcre2_get_startchar()\fP can be
@ -3297,13 +3298,12 @@ number or name. The number may be zero to include the entire matched string.
For example, if the pattern a(b)c is matched with "=abc=" and the replacement
string "+$1$0$1+", the result is "=+babcb+=".
.P
$*MARK inserts the name from the last encountered (*ACCEPT), (*COMMIT),
(*MARK), (*PRUNE), or (*THEN) on the matching path that has a name. (*MARK)
must always include a name, but the other verbs need not. For example, in
the case of (*MARK:A)(*PRUNE) the name inserted is "A", but for
(*MARK:A)(*PRUNE:B) the relevant name is "B". This facility can be used to
perform simple simultaneous substitutions, as this \fBpcre2test\fP example
shows:
$*MARK inserts the name from the last encountered backtracking control verb on
the matching path that has a name. (*MARK) must always include a name, but the
other verbs need not. For example, in the case of (*MARK:A)(*PRUNE) the name
inserted is "A", but for (*MARK:A)(*PRUNE:B) the relevant name is "B". This
facility can be used to perform simple simultaneous substitutions, as this
\fBpcre2test\fP example shows:
.sp
/(*MARK:pear)apple|(*MARK:orange)lemon/g,replace=${*MARK}
apple lemon
@ -3790,6 +3790,6 @@ Cambridge, England.
.rs
.sp
.nf
Last updated: 12 November 2018
Last updated: 27 November 2018
Copyright (c) 1997-2018 University of Cambridge.
.fi
@ -847,11 +847,15 @@ USING PCRE2'S CALLOUT FACILITY
Calling external programs or scripts

This facility can be independently disabled when pcre2grep is built. If
the callout string does not start with a pipe (vertical bar) character,
it is parsed into a list of substrings separated by pipe characters.
The first substring must be an executable name, with the following sub-
strings specifying arguments:
This facility can be independently disabled when pcre2grep is built. It
is supported for Windows, where a call to _spawnvp() is used, for VMS,
where lib$spawn() is used, and for any other Unix-like environment
where fork() and execv() are available.

If the callout string does not start with a pipe (vertical bar) charac-
ter, it is parsed into a list of substrings separated by pipe charac-
ters. The first substring must be an executable name, with the follow-
ing substrings specifying arguments:

executable_name|arg1|arg2|...
@ -877,15 +881,14 @@ USING PCRE2'S CALLOUT FACILITY
Arg1: [1] [234] [4] Arg2: |1| ()
12345

The parameters for the execv() system call that is used to run the pro-
gram or script are zero-terminated strings. This means that binary zero
characters in the callout argument will cause premature termination of
their substrings, and therefore should not be present. Any syntax
errors in the string (for example, a dollar not followed by another
character) cause the callout to be ignored. If running the program
fails for any reason (including the non-existence of the executable), a
local matching failure occurs and the matcher backtracks in the normal
way.
The parameters for the system call that is used to run the program or
script are zero-terminated strings. This means that binary zero charac-
ters in the callout argument will cause premature termination of their
substrings, and therefore should not be present. Any syntax errors in
the string (for example, a dollar not followed by another character)
cause the callout to be ignored. If running the program fails for any
reason (including the non-existence of the executable), a local match-
reason (including the non-existence of the executable), a local match-
ing failure occurs and the matcher backtracks in the normal way.
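The parsing rules in this hunk can be sketched in a few lines of Python. This is a simplified model rather than pcre2grep's implementation: `parse_callout` is a hypothetical name, only the `$|` escape and the assumed `$$` escape are handled, capture substitutions are deliberately left out, and a trailing dollar is the only syntax error modelled.

```python
from typing import List, Optional


def parse_callout(callout: str) -> Optional[List[str]]:
    """Split a callout string into [executable, arg1, ...].

    Returns None for a trailing dollar, mirroring "the callout is ignored".
    """
    args: List[str] = []
    current: List[str] = []
    i = 0
    while i < len(callout):
        ch = callout[i]
        if ch == "$":
            if i + 1 >= len(callout):
                return None  # dollar not followed by another character
            nxt = callout[i + 1]
            if nxt not in "$|":
                raise NotImplementedError("capture substitutions are not modelled")
            current.append(nxt)  # $$ -> $ (assumed), $| -> |
            i += 2
        elif ch == "|":
            args.append("".join(current))  # an unescaped pipe ends a substring
            current = []
            i += 1
        else:
            current.append(ch)
            i += 1
    args.append("".join(current))
    # Zero-terminated strings end at the first binary zero.
    return [a.split("\0", 1)[0] for a in args]


print(parse_callout("echo|a$|b|c$$"))   # ['echo', 'a|b', 'c$']
print(parse_callout("prog|ab\0cd"))     # ['prog', 'ab']
```

The second call shows why binary zeros should not appear in callout arguments: everything after the zero is silently lost.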

Echoing a specific string
@ -945,5 +948,5 @@ AUTHOR
REVISION

Last updated: 17 November 2018
Last updated: 24 November 2018
Copyright (c) 1997-2018 University of Cambridge.
@ -1,4 +1,4 @@
.TH PCRE2PATTERN 3 "12 October 2018" "PCRE2 10.33"
.TH PCRE2PATTERN 3 "27 November 2018" "PCRE2 10.33"
.SH NAME
PCRE2 - Perl-compatible regular expressions (revised API)
.SH "PCRE2 REGULAR EXPRESSION DETAILS"
@ -3262,6 +3262,7 @@ There are a number of special "Backtracking Control Verbs" (to use Perl's
terminology) that modify the behaviour of backtracking during matching. They
are generally of the form (*VERB) or (*VERB:NAME). Some verbs take either form,
possibly behaving differently depending on whether or not a name is present.
The names are not required to be unique within the pattern.
.P
By default, for compatibility with Perl, a name is any sequence of characters
that does not include a closing parenthesis. The name is not processed in
@ -3376,8 +3377,8 @@ nearest equivalent is the callout feature, as for example in this pattern:
|
|||
A match with the string "aaaa" always fails, but the callout is taken before
|
||||
each backtrack happens (in this example, 10 times).
|
||||
.P
|
||||
(*ACCEPT:NAME) and (*FAIL:NAME) behave exactly the same as
|
||||
(*MARK:NAME)(*ACCEPT) and (*MARK:NAME)(*FAIL), respectively.
|
||||
(*ACCEPT:NAME) and (*FAIL:NAME) are treated as (*MARK:NAME)(*ACCEPT) and
|
||||
(*MARK:NAME)(*FAIL), respectively.
|
||||
.
|
||||
.
|
||||
.SS "Recording which path was taken"
|
||||
|
@ -3389,10 +3390,10 @@ starting point (see (*SKIP) below).
|
|||
.sp
|
||||
(*MARK:NAME) or (*:NAME)
|
||||
.sp
|
||||
A name is always required with this verb. There may be as many instances of
|
||||
(*MARK) as you like in a pattern, and their names do not have to be unique.
|
||||
A name is always required with this verb. For all the other backtracking
|
||||
control verbs, a NAME argument is optional.
|
||||
.P
|
||||
When a match succeeds, the name of the last-encountered (*MARK:NAME) on the
|
||||
When a match succeeds, the name of the last-encountered mark name on the
|
||||
matching path is passed back to the caller as described in the section entitled
|
||||
.\" HTML <a href="pcre2api.html#matchotherdata">
|
||||
.\" </a>
|
||||
|
@ -3402,16 +3403,15 @@ in the
|
|||
.\" HREF
|
||||
\fBpcre2api\fP
|
||||
.\"
|
||||
documentation. This applies to all instances of (*MARK), including those inside
|
||||
assertions and atomic groups. (There are differences in those cases when
|
||||
(*MARK) is used in conjunction with (*SKIP) as described below.)
|
||||
documentation. This applies to all instances of (*MARK) and other verbs,
|
||||
including those inside assertions and atomic groups. However, there are
|
||||
differences in those cases when (*MARK) is used in conjunction with (*SKIP) as
|
||||
described below.
|
||||
.P
|
||||
As well as (*MARK), the (*COMMIT), (*PRUNE) and (*THEN) verbs may have
|
||||
associated NAME arguments. Whichever is last on the matching path is passed
|
||||
back. See below for more details of these other verbs.
|
||||
.P
|
||||
Here is an example of \fBpcre2test\fP output, where the "mark" modifier
|
||||
requests the retrieval and outputting of (*MARK) data:
|
||||
The mark name that was last encountered on the matching path is passed back. A
|
||||
verb without a NAME argument is ignored for this purpose. Here is an example of
|
||||
\fBpcre2test\fP output, where the "mark" modifier requests the retrieval and
|
||||
outputting of (*MARK) data:
|
||||
.sp
|
||||
re> /X(*MARK:A)Y|X(*MARK:B)Z/mark
|
||||
data> XY
|
||||
|
@ -3461,7 +3461,7 @@ to the left of the verb. However, when one of these verbs appears inside an
|
|||
atomic group or in a lookaround assertion that is true, its effect is confined
|
||||
to that group, because once the group has been matched, there is never any
|
||||
backtracking into it. Backtracking from beyond an assertion or an atomic group
|
||||
ignores the entire group, and seeks a preceeding backtracking point.
|
||||
ignores the entire group, and seeks a preceding backtracking point.
|
||||
.P
|
||||
These verbs differ in exactly what kind of failure occurs when backtracking
|
||||
reaches them. The behaviour described below is what happens when the verb is
|
||||
|
@ -3484,8 +3484,8 @@ dynamic anchor, or "I've started, so I must finish."
|
|||
.P
|
||||
The behaviour of (*COMMIT:NAME) is not the same as (*MARK:NAME)(*COMMIT). It is
|
||||
like (*MARK:NAME) in that the name is remembered for passing back to the
|
||||
caller. However, (*SKIP:NAME) searches only for names set with (*MARK),
|
||||
ignoring those set by (*COMMIT), (*PRUNE) and (*THEN).
|
||||
caller. However, (*SKIP:NAME) searches only for names that are set with
|
||||
(*MARK), ignoring those set by any of the other backtracking verbs.
|
||||
.P
|
||||
If there is more than one backtracking verb in a pattern, a different one that
|
||||
follows (*COMMIT) may be triggered first, so merely passing (*COMMIT) during a
|
||||
|
@ -3526,7 +3526,7 @@ as (*COMMIT).
|
|||
The behaviour of (*PRUNE:NAME) is not the same as (*MARK:NAME)(*PRUNE). It is
|
||||
like (*MARK:NAME) in that the name is remembered for passing back to the
|
||||
caller. However, (*SKIP:NAME) searches only for names set with (*MARK),
|
||||
ignoring those set by (*COMMIT), (*PRUNE) or (*THEN).
|
||||
ignoring those set by other backtracking verbs.
|
||||
.sp
|
||||
(*SKIP)
|
||||
.sp
|
||||
|
@ -3579,7 +3579,7 @@ never seen because "a" does not match "b", so the matcher immediately jumps to
|
|||
the second branch of the pattern.
|
||||
.P
|
||||
Note that (*SKIP:NAME) searches only for names set by (*MARK:NAME). It ignores
|
||||
names that are set by (*COMMIT:NAME), (*PRUNE:NAME) or (*THEN:NAME).
|
||||
names that are set by other backtracking verbs.
|
||||
.sp
|
||||
(*THEN) or (*THEN:NAME)
|
||||
.sp
|
||||
|
@ -3600,7 +3600,7 @@ group. If (*THEN) is not inside an alternation, it acts like (*PRUNE).
|
|||
The behaviour of (*THEN:NAME) is not the same as (*MARK:NAME)(*THEN). It is
|
||||
like (*MARK:NAME) in that the name is remembered for passing back to the
|
||||
caller. However, (*SKIP:NAME) searches only for names set with (*MARK),
|
||||
ignoring those set by (*COMMIT), (*PRUNE) and (*THEN).
|
||||
ignoring those set by other backtracking verbs.
|
||||
.P
|
||||
A subpattern that does not contain a | character is just a part of the
|
||||
enclosing alternative; it is not a nested alternation with only one
|
||||
|
@ -3693,10 +3693,10 @@ not the assertion is standalone or acting as the condition in a conditional
|
|||
subpattern.
|
||||
.P
|
||||
(*ACCEPT) in a standalone positive assertion causes the assertion to succeed
|
||||
without any further processing; captured strings and a (*MARK) name (if set)
|
||||
are retained. In a standalone negative assertion, (*ACCEPT) causes the
|
||||
assertion to fail without any further processing; captured substrings and any
|
||||
(*MARK) name are discarded.
|
||||
without any further processing; captured strings and a mark name (if set) are
|
||||
retained. In a standalone negative assertion, (*ACCEPT) causes the assertion to
|
||||
fail without any further processing; captured substrings and any mark name are
|
||||
discarded.
|
||||
.P
|
||||
If the assertion is a condition, (*ACCEPT) causes the condition to be true for
|
||||
a positive assertion and false for a negative one; captured substrings are
|
||||
|
@ -3767,6 +3767,6 @@ Cambridge, England.
|
|||
.rs
|
||||
.sp
|
||||
.nf
|
||||
Last updated: 12 October 2018
|
||||
Last updated: 27 November 2018
|
||||
Copyright (c) 1997-2018 University of Cambridge.
|
||||
.fi
|
||||
|
|
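The pcre2pattern hunks above state two rules: the name passed back to the caller is the last mark name on the matching path, with unnamed verbs ignored, and (*SKIP:NAME) considers only names set by (*MARK). The following is a minimal toy model of those two rules in C. It is not PCRE2 code and does not call the PCRE2 API. The types `verb_kind` and `path_verb` and the functions `last_mark_name` and `skip_target` are hypothetical names introduced only for illustration, and searching from the most recent verb backwards is an assumption of this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy model only: the kind of a backtracking verb seen on the matching path. */
typedef enum { VERB_MARK, VERB_COMMIT, VERB_PRUNE, VERB_THEN } verb_kind;

/* One verb on the path; name is NULL when the verb has no NAME argument. */
typedef struct {
    verb_kind kind;
    const char *name;
} path_verb;

/* Rule 1: the name passed back is the last named verb on the path,
 * whichever verb set it. Verbs without a name are ignored.
 * Returns NULL if no verb on the path carries a name. */
static const char *last_mark_name(const path_verb *path, size_t count)
{
    const char *result = NULL;
    assert(path != NULL || count == 0);
    for (size_t i = 0; i < count; i++) {
        if (path[i].name != NULL)
            result = path[i].name;
    }
    return result;
}

/* Rule 2: a (*SKIP:NAME) lookup considers only names set by (*MARK).
 * Returns the index of the most recent matching (*MARK), or -1 if none. */
static int skip_target(const path_verb *path, size_t count, const char *name)
{
    assert(name != NULL);
    assert(path != NULL || count == 0);
    for (size_t i = count; i > 0; i--) {
        const path_verb *v = &path[i - 1];
        if (v->kind == VERB_MARK && v->name != NULL &&
            strcmp(v->name, name) == 0)
            return (int)(i - 1);
    }
    return -1;
}
```

For a path modelling (*MARK:A)(*PRUNE), `last_mark_name` yields "A", as in the documentation's own example. For a path modelling (*MARK:A)(*PRUNE:B), it yields "B", while `skip_target` with the name "B" finds nothing because that name was not set by (*MARK).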