Documentation update.

Philip.Hazel 2018-11-27 16:41:20 +00:00
parent 0b64d9cfca
commit e7a762ddff
7 changed files with 451 additions and 448 deletions

@ -2841,22 +2841,23 @@ undefined.
</P> </P>
<P> <P>
After a successful match, a partial match (PCRE2_ERROR_PARTIAL), or a failure After a successful match, a partial match (PCRE2_ERROR_PARTIAL), or a failure
to match (PCRE2_ERROR_NOMATCH), a (*MARK), (*PRUNE), or (*THEN) name may be to match (PCRE2_ERROR_NOMATCH), a mark name may be available. The function
available. The function <b>pcre2_get_mark()</b> can be called to access this <b>pcre2_get_mark()</b> can be called to access this name, which can be
name. The same function applies to all three verbs. It returns a pointer to the specified in the pattern by any of the backtracking control verbs, not just
zero-terminated name, which is within the compiled pattern. If no name is (*MARK). The same function applies to all the verbs. It returns a pointer to
the zero-terminated name, which is within the compiled pattern. If no name is
available, NULL is returned. The length of the name (excluding the terminating available, NULL is returned. The length of the name (excluding the terminating
zero) is stored in the code unit that precedes the name. You should use this zero) is stored in the code unit that precedes the name. You should use this
length instead of relying on the terminating zero if the name might contain a length instead of relying on the terminating zero if the name might contain a
binary zero. binary zero.
</P> </P>
<P> <P>
After a successful match, the name that is returned is the last (*MARK), After a successful match, the name that is returned is the last mark name
(*PRUNE), or (*THEN) name encountered on the matching path through the pattern. encountered on the matching path through the pattern. Instances of backtracking
Instances of (*PRUNE) and (*THEN) without names are ignored. Thus, for example, verbs without names do not count. Thus, for example, if the matching path
if the matching path contains (*MARK:A)(*PRUNE), the name "A" is returned. contains (*MARK:A)(*PRUNE), the name "A" is returned. After a "no match" or a
After a "no match" or a partial match, the last encountered name is returned. partial match, the last encountered name is returned. For example, consider
For example, consider this pattern: this pattern:
<pre> <pre>
^(*MARK:A)((*MARK:B)a|b)c ^(*MARK:A)((*MARK:B)a|b)c
</pre> </pre>
@ -2871,7 +2872,7 @@ is removed from the pattern above, there is an initial check for the presence
of "c" in the subject before running the matching engine. This check fails for of "c" in the subject before running the matching engine. This check fails for
"bx", causing a match failure without seeing any marks. You can disable the "bx", causing a match failure without seeing any marks. You can disable the
start-of-match optimizations by setting the PCRE2_NO_START_OPTIMIZE option for start-of-match optimizations by setting the PCRE2_NO_START_OPTIMIZE option for
<b>pcre2_compile()</b> or starting the pattern with (*NO_START_OPT). <b>pcre2_compile()</b> or by starting the pattern with (*NO_START_OPT).
</P> </P>
<P> <P>
After a successful match, a partial match, or one of the invalid UTF errors After a successful match, a partial match, or one of the invalid UTF errors
@ -3286,13 +3287,12 @@ For example, if the pattern a(b)c is matched with "=abc=" and the replacement
string "+$1$0$1+", the result is "=+babcb+=". string "+$1$0$1+", the result is "=+babcb+=".
</P> </P>
<P> <P>
$*MARK inserts the name from the last encountered (*ACCEPT), (*COMMIT), $*MARK inserts the name from the last encountered backtracking control verb on
(*MARK), (*PRUNE), or (*THEN) on the matching path that has a name. (*MARK) the matching path that has a name. (*MARK) must always include a name, but the
must always include a name, but the other verbs need not. For example, in other verbs need not. For example, in the case of (*MARK:A)(*PRUNE) the name
the case of (*MARK:A)(*PRUNE) the name inserted is "A", but for inserted is "A", but for (*MARK:A)(*PRUNE:B) the relevant name is "B". This
(*MARK:A)(*PRUNE:B) the relevant name is "B". This facility can be used to facility can be used to perform simple simultaneous substitutions, as this
perform simple simultaneous substitutions, as this <b>pcre2test</b> example <b>pcre2test</b> example shows:
shows:
<pre> <pre>
/(*MARK:pear)apple|(*MARK:orange)lemon/g,replace=${*MARK} /(*MARK:pear)apple|(*MARK:orange)lemon/g,replace=${*MARK}
apple lemon apple lemon
@ -3782,7 +3782,7 @@ Cambridge, England.
</P> </P>
<br><a name="SEC42" href="#TOC1">REVISION</a><br> <br><a name="SEC42" href="#TOC1">REVISION</a><br>
<P> <P>
Last updated: 12 November 2018 Last updated: 27 November 2018
<br> <br>
Copyright &copy; 1997-2018 University of Cambridge. Copyright &copy; 1997-2018 University of Cambridge.
<br> <br>
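The pcre2api text changed above says that the length of a mark name is stored in the code unit that precedes the name, and that this length should be preferred when the name might contain a binary zero. The following is a minimal sketch of that layout for the 8-bit library only. It does not call PCRE2: `mark_name_length()`, `demo_storage`, `demo_mark()` and `mark_length_demo()` are hypothetical names invented for illustration, and `demo_mark()` merely stands in for the pointer that `pcre2_get_mark()` would return.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: read the name length from the code unit that
   precedes the name, as the documentation describes. Returns 0 when no
   name is available (NULL pointer). */
size_t mark_name_length(const uint8_t *mark)
{
    return (mark == NULL) ? 0 : (size_t)mark[-1];
}

/* Hypothetical stand-in for storage inside a compiled pattern:
   one length code unit, then the name 'a', 0, 'b', then the
   terminating zero. */
static const uint8_t demo_storage[] = { 3, 'a', 0, 'b', 0 };

/* Plays the role of the pointer returned by pcre2_get_mark(). */
const uint8_t *demo_mark(void)
{
    return demo_storage + 1;
}

/* Small self-check showing why the stored length matters. */
void mark_length_demo(void)
{
    const uint8_t *mark = demo_mark();
    assert(strlen((const char *)mark) == 1);  /* stops at the embedded zero */
    assert(mark_name_length(mark) == 3);      /* full name: 'a', 0, 'b' */
}
```

With a real match, the same `mark[-1]` read would presumably be applied to the result of `pcre2_get_mark()` after checking it for NULL.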

@ -871,9 +871,14 @@ only callouts with string arguments are useful.
Calling external programs or scripts Calling external programs or scripts
</b><br> </b><br>
<P> <P>
This facility can be independently disabled when <b>pcre2grep</b> is built. If This facility can be independently disabled when <b>pcre2grep</b> is built. It
the callout string does not start with a pipe (vertical bar) character, it is is supported for Windows, where a call to <b>_spawnvp()</b> is used, for VMS,
parsed into a list of substrings separated by pipe characters. The first where <b>lib$spawn()</b> is used, and for any other Unix-like environment where
<b>fork()</b> and <b>execv()</b> are available.
</P>
<P>
If the callout string does not start with a pipe (vertical bar) character, it
is parsed into a list of substrings separated by pipe characters. The first
substring must be an executable name, with the following substrings specifying substring must be an executable name, with the following substrings specifying
arguments: arguments:
<pre> <pre>
@ -900,7 +905,7 @@ a single dollar and $| is replaced by a pipe character. Here is an example:
Arg1: [1] [234] [4] Arg2: |1| () Arg1: [1] [234] [4] Arg2: |1| ()
12345 12345
</pre> </pre>
The parameters for the <b>execv()</b> system call that is used to run the The parameters for the system call that is used to run the
program or script are zero-terminated strings. This means that binary zero program or script are zero-terminated strings. This means that binary zero
characters in the callout argument will cause premature termination of their characters in the callout argument will cause premature termination of their
substrings, and therefore should not be present. Any syntax errors in the substrings, and therefore should not be present. Any syntax errors in the
@ -966,7 +971,7 @@ Cambridge, England.
</P> </P>
<br><a name="SEC16" href="#TOC1">REVISION</a><br> <br><a name="SEC16" href="#TOC1">REVISION</a><br>
<P> <P>
Last updated: 17 November 2018 Last updated: 24 November 2018
<br> <br>
Copyright &copy; 1997-2018 University of Cambridge. Copyright &copy; 1997-2018 University of Cambridge.
<br> <br>
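The pcre2grep text changed above describes how a callout string that does not start with a pipe character is split at pipe characters into an executable name followed by arguments, with `$|` standing for a literal pipe and `$$` for a single dollar. The sketch below illustrates only that splitting step under simplifying assumptions. It is not pcre2grep's implementation: `callout_args`, `split_callout()`, `split_callout_demo()` and the fixed size limits are hypothetical, and other `$` sequences (such as capture references) are copied through unchanged.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_ARGS    16   /* hypothetical limit on substrings */
#define MAX_ARG_LEN 128  /* hypothetical limit per substring */

/* Hypothetical container for the parsed result; argv[0] is the
   executable name, the rest are its arguments. */
typedef struct {
    int  argc;
    char argv[MAX_ARGS][MAX_ARG_LEN];
} callout_args;

/* Split a callout string at unescaped '|' characters.
   Returns 0 on success, -1 if the string is empty, starts with '|'
   (so it is not a program call), or exceeds the limits above. */
int split_callout(const char *s, callout_args *out)
{
    size_t len = 0;
    out->argc = 0;
    if (s == NULL || *s == '\0' || *s == '|') return -1;

    for (;; s++) {
        if (*s == '|' || *s == '\0') {
            /* End of the current substring: terminate and store it. */
            out->argv[out->argc][len] = '\0';
            out->argc++;
            len = 0;
            if (*s == '\0') return 0;
            if (out->argc == MAX_ARGS) return -1;
            continue;
        }
        char c = *s;
        /* "$|" yields a literal pipe and "$$" a single dollar. */
        if (c == '$' && (s[1] == '|' || s[1] == '$')) c = *++s;
        if (len + 1 >= MAX_ARG_LEN) return -1;
        out->argv[out->argc][len++] = c;
    }
}

/* Small self-check of the splitting and escape handling. */
void split_callout_demo(void)
{
    callout_args a;
    assert(split_callout("echo|a$|b|c$$", &a) == 0);
    assert(a.argc == 3);
    assert(strcmp(a.argv[0], "echo") == 0);
    assert(strcmp(a.argv[1], "a|b") == 0);
    assert(strcmp(a.argv[2], "c$") == 0);
}
```

Because the resulting substrings are zero-terminated, a binary zero inside an argument would cut that argument short, which matches the caution in the documentation.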

@ -3223,6 +3223,7 @@ There are a number of special "Backtracking Control Verbs" (to use Perl's
terminology) that modify the behaviour of backtracking during matching. They terminology) that modify the behaviour of backtracking during matching. They
are generally of the form (*VERB) or (*VERB:NAME). Some verbs take either form, are generally of the form (*VERB) or (*VERB:NAME). Some verbs take either form,
possibly behaving differently depending on whether or not a name is present. possibly behaving differently depending on whether or not a name is present.
The names are not required to be unique within the pattern.
</P> </P>
<P> <P>
By default, for compatibility with Perl, a name is any sequence of characters By default, for compatibility with Perl, a name is any sequence of characters
@ -3331,8 +3332,8 @@ A match with the string "aaaa" always fails, but the callout is taken before
each backtrack happens (in this example, 10 times). each backtrack happens (in this example, 10 times).
</P> </P>
<P> <P>
(*ACCEPT:NAME) and (*FAIL:NAME) behave exactly the same as (*ACCEPT:NAME) and (*FAIL:NAME) are treated as (*MARK:NAME)(*ACCEPT) and
(*MARK:NAME)(*ACCEPT) and (*MARK:NAME)(*FAIL), respectively. (*MARK:NAME)(*FAIL), respectively.
</P> </P>
<br><b> <br><b>
Recording which path was taken Recording which path was taken
@ -3344,27 +3345,25 @@ starting point (see (*SKIP) below).
<pre> <pre>
(*MARK:NAME) or (*:NAME) (*MARK:NAME) or (*:NAME)
</pre> </pre>
A name is always required with this verb. There may be as many instances of A name is always required with this verb. For all the other backtracking
(*MARK) as you like in a pattern, and their names do not have to be unique. control verbs, a NAME argument is optional.
</P> </P>
<P> <P>
When a match succeeds, the name of the last-encountered (*MARK:NAME) on the When a match succeeds, the name of the last-encountered mark name on the
matching path is passed back to the caller as described in the section entitled matching path is passed back to the caller as described in the section entitled
<a href="pcre2api.html#matchotherdata">"Other information about the match"</a> <a href="pcre2api.html#matchotherdata">"Other information about the match"</a>
in the in the
<a href="pcre2api.html"><b>pcre2api</b></a> <a href="pcre2api.html"><b>pcre2api</b></a>
documentation. This applies to all instances of (*MARK), including those inside documentation. This applies to all instances of (*MARK) and other verbs,
assertions and atomic groups. (There are differences in those cases when including those inside assertions and atomic groups. However, there are
(*MARK) is used in conjunction with (*SKIP) as described below.) differences in those cases when (*MARK) is used in conjunction with (*SKIP) as
described below.
</P> </P>
<P> <P>
As well as (*MARK), the (*COMMIT), (*PRUNE) and (*THEN) verbs may have The mark name that was last encountered on the matching path is passed back. A
associated NAME arguments. Whichever is last on the matching path is passed verb without a NAME argument is ignored for this purpose. Here is an example of
back. See below for more details of these other verbs. <b>pcre2test</b> output, where the "mark" modifier requests the retrieval and
</P> outputting of (*MARK) data:
<P>
Here is an example of <b>pcre2test</b> output, where the "mark" modifier
requests the retrieval and outputting of (*MARK) data:
<pre> <pre>
re&#62; /X(*MARK:A)Y|X(*MARK:B)Z/mark re&#62; /X(*MARK:A)Y|X(*MARK:B)Z/mark
data&#62; XY data&#62; XY
@ -3414,7 +3413,7 @@ to the left of the verb. However, when one of these verbs appears inside an
atomic group or in a lookaround assertion that is true, its effect is confined atomic group or in a lookaround assertion that is true, its effect is confined
to that group, because once the group has been matched, there is never any to that group, because once the group has been matched, there is never any
backtracking into it. Backtracking from beyond an assertion or an atomic group backtracking into it. Backtracking from beyond an assertion or an atomic group
ignores the entire group, and seeks a preceeding backtracking point. ignores the entire group, and seeks a preceding backtracking point.
</P> </P>
<P> <P>
These verbs differ in exactly what kind of failure occurs when backtracking These verbs differ in exactly what kind of failure occurs when backtracking
@ -3439,8 +3438,8 @@ dynamic anchor, or "I've started, so I must finish."
<P> <P>
The behaviour of (*COMMIT:NAME) is not the same as (*MARK:NAME)(*COMMIT). It is The behaviour of (*COMMIT:NAME) is not the same as (*MARK:NAME)(*COMMIT). It is
like (*MARK:NAME) in that the name is remembered for passing back to the like (*MARK:NAME) in that the name is remembered for passing back to the
caller. However, (*SKIP:NAME) searches only for names set with (*MARK), caller. However, (*SKIP:NAME) searches only for names that are set with
ignoring those set by (*COMMIT), (*PRUNE) and (*THEN). (*MARK), ignoring those set by any of the other backtracking verbs.
</P> </P>
<P> <P>
If there is more than one backtracking verb in a pattern, a different one that If there is more than one backtracking verb in a pattern, a different one that
@ -3484,7 +3483,7 @@ as (*COMMIT).
The behaviour of (*PRUNE:NAME) is not the same as (*MARK:NAME)(*PRUNE). It is The behaviour of (*PRUNE:NAME) is not the same as (*MARK:NAME)(*PRUNE). It is
like (*MARK:NAME) in that the name is remembered for passing back to the like (*MARK:NAME) in that the name is remembered for passing back to the
caller. However, (*SKIP:NAME) searches only for names set with (*MARK), caller. However, (*SKIP:NAME) searches only for names set with (*MARK),
ignoring those set by (*COMMIT), (*PRUNE) or (*THEN). ignoring those set by other backtracking verbs.
<pre> <pre>
(*SKIP) (*SKIP)
</pre> </pre>
@ -3539,7 +3538,7 @@ the second branch of the pattern.
</P> </P>
<P> <P>
Note that (*SKIP:NAME) searches only for names set by (*MARK:NAME). It ignores Note that (*SKIP:NAME) searches only for names set by (*MARK:NAME). It ignores
names that are set by (*COMMIT:NAME), (*PRUNE:NAME) or (*THEN:NAME). names that are set by other backtracking verbs.
<pre> <pre>
(*THEN) or (*THEN:NAME) (*THEN) or (*THEN:NAME)
</pre> </pre>
@ -3561,7 +3560,7 @@ group. If (*THEN) is not inside an alternation, it acts like (*PRUNE).
The behaviour of (*THEN:NAME) is not the same as (*MARK:NAME)(*THEN). It is The behaviour of (*THEN:NAME) is not the same as (*MARK:NAME)(*THEN). It is
like (*MARK:NAME) in that the name is remembered for passing back to the like (*MARK:NAME) in that the name is remembered for passing back to the
caller. However, (*SKIP:NAME) searches only for names set with (*MARK), caller. However, (*SKIP:NAME) searches only for names set with (*MARK),
ignoring those set by (*COMMIT), (*PRUNE) and (*THEN). ignoring those set by other backtracking verbs.
</P> </P>
<P> <P>
A subpattern that does not contain a | character is just a part of the A subpattern that does not contain a | character is just a part of the
@ -3656,10 +3655,10 @@ subpattern.
</P> </P>
<P> <P>
(*ACCEPT) in a standalone positive assertion causes the assertion to succeed (*ACCEPT) in a standalone positive assertion causes the assertion to succeed
without any further processing; captured strings and a (*MARK) name (if set) without any further processing; captured strings and a mark name (if set) are
are retained. In a standalone negative assertion, (*ACCEPT) causes the retained. In a standalone negative assertion, (*ACCEPT) causes the assertion to
assertion to fail without any further processing; captured substrings and any fail without any further processing; captured substrings and any mark name are
(*MARK) name are discarded. discarded.
</P> </P>
<P> <P>
If the assertion is a condition, (*ACCEPT) causes the condition to be true for If the assertion is a condition, (*ACCEPT) causes the condition to be true for
@ -3731,7 +3730,7 @@ Cambridge, England.
</P> </P>
<br><a name="SEC31" href="#TOC1">REVISION</a><br> <br><a name="SEC31" href="#TOC1">REVISION</a><br>
<P> <P>
Last updated: 12 October 2018 Last updated: 27 November 2018
<br> <br>
Copyright &copy; 1997-2018 University of Cambridge. Copyright &copy; 1997-2018 University of Cambridge.
<br> <br>

@ -2772,22 +2772,22 @@ OTHER INFORMATION ABOUT A MATCH
times, the result is undefined. times, the result is undefined.
After a successful match, a partial match (PCRE2_ERROR_PARTIAL), or a After a successful match, a partial match (PCRE2_ERROR_PARTIAL), or a
failure to match (PCRE2_ERROR_NOMATCH), a (*MARK), (*PRUNE), or (*THEN) failure to match (PCRE2_ERROR_NOMATCH), a mark name may be available.
name may be available. The function pcre2_get_mark() can be called to The function pcre2_get_mark() can be called to access this name, which
access this name. The same function applies to all three verbs. It can be specified in the pattern by any of the backtracking control
verbs, not just (*MARK). The same function applies to all the verbs. It
returns a pointer to the zero-terminated name, which is within the com- returns a pointer to the zero-terminated name, which is within the com-
piled pattern. If no name is available, NULL is returned. The length of piled pattern. If no name is available, NULL is returned. The length of
the name (excluding the terminating zero) is stored in the code unit the name (excluding the terminating zero) is stored in the code unit
that precedes the name. You should use this length instead of relying that precedes the name. You should use this length instead of relying
on the terminating zero if the name might contain a binary zero. on the terminating zero if the name might contain a binary zero.
After a successful match, the name that is returned is the last After a successful match, the name that is returned is the last mark
(*MARK), (*PRUNE), or (*THEN) name encountered on the matching path name encountered on the matching path through the pattern. Instances of
through the pattern. Instances of (*PRUNE) and (*THEN) without names backtracking verbs without names do not count. Thus, for example, if
are ignored. Thus, for example, if the matching path contains the matching path contains (*MARK:A)(*PRUNE), the name "A" is returned.
(*MARK:A)(*PRUNE), the name "A" is returned. After a "no match" or a After a "no match" or a partial match, the last encountered name is
partial match, the last encountered name is returned. For example, returned. For example, consider this pattern:
consider this pattern:
^(*MARK:A)((*MARK:B)a|b)c ^(*MARK:A)((*MARK:B)a|b)c
@ -2802,8 +2802,8 @@ OTHER INFORMATION ABOUT A MATCH
for the presence of "c" in the subject before running the matching for the presence of "c" in the subject before running the matching
engine. This check fails for "bx", causing a match failure without see- engine. This check fails for "bx", causing a match failure without see-
ing any marks. You can disable the start-of-match optimizations by set- ing any marks. You can disable the start-of-match optimizations by set-
ting the PCRE2_NO_START_OPTIMIZE option for pcre2_compile() or starting ting the PCRE2_NO_START_OPTIMIZE option for pcre2_compile() or by
the pattern with (*NO_START_OPT). starting the pattern with (*NO_START_OPT).
After a successful match, a partial match, or one of the invalid UTF After a successful match, a partial match, or one of the invalid UTF
errors (for example, PCRE2_ERROR_UTF8_ERR5), pcre2_get_startchar() can errors (for example, PCRE2_ERROR_UTF8_ERR5), pcre2_get_startchar() can
@ -3193,13 +3193,12 @@ CREATING A NEW STRING WITH SUBSTITUTIONS
matched with "=abc=" and the replacement string "+$1$0$1+", the result matched with "=abc=" and the replacement string "+$1$0$1+", the result
is "=+babcb+=". is "=+babcb+=".
$*MARK inserts the name from the last encountered (*ACCEPT), (*COMMIT), $*MARK inserts the name from the last encountered backtracking control
(*MARK), (*PRUNE), or (*THEN) on the matching path that has a name. verb on the matching path that has a name. (*MARK) must always include
(*MARK) must always include a name, but the other verbs need not. For a name, but the other verbs need not. For example, in the case of
example, in the case of (*MARK:A)(*PRUNE) the name inserted is "A", but (*MARK:A)(*PRUNE) the name inserted is "A", but for (*MARK:A)(*PRUNE:B)
for (*MARK:A)(*PRUNE:B) the relevant name is "B". This facility can be the relevant name is "B". This facility can be used to perform simple
used to perform simple simultaneous substitutions, as this pcre2test simultaneous substitutions, as this pcre2test example shows:
example shows:
/(*MARK:pear)apple|(*MARK:orange)lemon/g,replace=${*MARK} /(*MARK:pear)apple|(*MARK:orange)lemon/g,replace=${*MARK}
apple lemon apple lemon
@ -3655,7 +3654,7 @@ AUTHOR
REVISION REVISION
Last updated: 12 November 2018 Last updated: 27 November 2018
Copyright (c) 1997-2018 University of Cambridge. Copyright (c) 1997-2018 University of Cambridge.
------------------------------------------------------------------------------ ------------------------------------------------------------------------------
@ -8865,7 +8864,8 @@ BACKTRACKING CONTROL
Perl's terminology) that modify the behaviour of backtracking during Perl's terminology) that modify the behaviour of backtracking during
matching. They are generally of the form (*VERB) or (*VERB:NAME). Some matching. They are generally of the form (*VERB) or (*VERB:NAME). Some
verbs take either form, possibly behaving differently depending on verbs take either form, possibly behaving differently depending on
whether or not a name is present. whether or not a name is present. The names are not required to be
unique within the pattern.
By default, for compatibility with Perl, a name is any sequence of By default, for compatibility with Perl, a name is any sequence of
characters that does not include a closing parenthesis. The name is not characters that does not include a closing parenthesis. The name is not
@ -8959,8 +8959,8 @@ BACKTRACKING CONTROL
A match with the string "aaaa" always fails, but the callout is taken A match with the string "aaaa" always fails, but the callout is taken
before each backtrack happens (in this example, 10 times). before each backtrack happens (in this example, 10 times).
(*ACCEPT:NAME) and (*FAIL:NAME) behave exactly the same as (*ACCEPT:NAME) and (*FAIL:NAME) are treated as (*MARK:NAME)(*ACCEPT)
(*MARK:NAME)(*ACCEPT) and (*MARK:NAME)(*FAIL), respectively. and (*MARK:NAME)(*FAIL), respectively.
Recording which path was taken Recording which path was taken
@ -8970,24 +8970,21 @@ BACKTRACKING CONTROL
(*MARK:NAME) or (*:NAME) (*MARK:NAME) or (*:NAME)
A name is always required with this verb. There may be as many A name is always required with this verb. For all the other backtrack-
instances of (*MARK) as you like in a pattern, and their names do not ing control verbs, a NAME argument is optional.
have to be unique.
When a match succeeds, the name of the last-encountered (*MARK:NAME) on When a match succeeds, the name of the last-encountered mark name on
the matching path is passed back to the caller as described in the sec- the matching path is passed back to the caller as described in the sec-
tion entitled "Other information about the match" in the pcre2api docu- tion entitled "Other information about the match" in the pcre2api docu-
mentation. This applies to all instances of (*MARK), including those mentation. This applies to all instances of (*MARK) and other verbs,
inside assertions and atomic groups. (There are differences in those including those inside assertions and atomic groups. However, there are
cases when (*MARK) is used in conjunction with (*SKIP) as described differences in those cases when (*MARK) is used in conjunction with
below.) (*SKIP) as described below.
As well as (*MARK), the (*COMMIT), (*PRUNE) and (*THEN) verbs may have The mark name that was last encountered on the matching path is passed
associated NAME arguments. Whichever is last on the matching path is back. A verb without a NAME argument is ignored for this purpose. Here
passed back. See below for more details of these other verbs. is an example of pcre2test output, where the "mark" modifier requests
the retrieval and outputting of (*MARK) data:
Here is an example of pcre2test output, where the "mark" modifier
requests the retrieval and outputting of (*MARK) data:
re> /X(*MARK:A)Y|X(*MARK:B)Z/mark re> /X(*MARK:A)Y|X(*MARK:B)Z/mark
data> XY data> XY
@ -9033,7 +9030,7 @@ BACKTRACKING CONTROL
that is true, its effect is confined to that group, because once the that is true, its effect is confined to that group, because once the
group has been matched, there is never any backtracking into it. Back- group has been matched, there is never any backtracking into it. Back-
tracking from beyond an assertion or an atomic group ignores the entire tracking from beyond an assertion or an atomic group ignores the entire
group, and seeks a preceeding backtracking point. group, and seeks a preceding backtracking point.
These verbs differ in exactly what kind of failure occurs when back- These verbs differ in exactly what kind of failure occurs when back-
tracking reaches them. The behaviour described below is what happens tracking reaches them. The behaviour described below is what happens
@ -9058,8 +9055,8 @@ BACKTRACKING CONTROL
The behaviour of (*COMMIT:NAME) is not the same as (*MARK:NAME)(*COM- The behaviour of (*COMMIT:NAME) is not the same as (*MARK:NAME)(*COM-
MIT). It is like (*MARK:NAME) in that the name is remembered for pass- MIT). It is like (*MARK:NAME) in that the name is remembered for pass-
ing back to the caller. However, (*SKIP:NAME) searches only for names ing back to the caller. However, (*SKIP:NAME) searches only for names
set with (*MARK), ignoring those set by (*COMMIT), (*PRUNE) and that are set with (*MARK), ignoring those set by any of the other back-
(*THEN). tracking verbs.
If there is more than one backtracking verb in a pattern, a different If there is more than one backtracking verb in a pattern, a different
one that follows (*COMMIT) may be triggered first, so merely passing one that follows (*COMMIT) may be triggered first, so merely passing
@ -9103,7 +9100,7 @@ BACKTRACKING CONTROL
The behaviour of (*PRUNE:NAME) is not the same as (*MARK:NAME)(*PRUNE). The behaviour of (*PRUNE:NAME) is not the same as (*MARK:NAME)(*PRUNE).
It is like (*MARK:NAME) in that the name is remembered for passing back It is like (*MARK:NAME) in that the name is remembered for passing back
to the caller. However, (*SKIP:NAME) searches only for names set with to the caller. However, (*SKIP:NAME) searches only for names set with
(*MARK), ignoring those set by (*COMMIT), (*PRUNE) or (*THEN). (*MARK), ignoring those set by other backtracking verbs.
(*SKIP) (*SKIP)
@ -9159,8 +9156,7 @@ BACKTRACKING CONTROL
the pattern. the pattern.
Note that (*SKIP:NAME) searches only for names set by (*MARK:NAME). It Note that (*SKIP:NAME) searches only for names set by (*MARK:NAME). It
ignores names that are set by (*COMMIT:NAME), (*PRUNE:NAME) or ignores names that are set by other backtracking verbs.
(*THEN:NAME).
(*THEN) or (*THEN:NAME) (*THEN) or (*THEN:NAME)
@ -9182,7 +9178,7 @@ BACKTRACKING CONTROL
The behaviour of (*THEN:NAME) is not the same as (*MARK:NAME)(*THEN). The behaviour of (*THEN:NAME) is not the same as (*MARK:NAME)(*THEN).
It is like (*MARK:NAME) in that the name is remembered for passing back It is like (*MARK:NAME) in that the name is remembered for passing back
to the caller. However, (*SKIP:NAME) searches only for names set with to the caller. However, (*SKIP:NAME) searches only for names set with
(*MARK), ignoring those set by (*COMMIT), (*PRUNE) and (*THEN). (*MARK), ignoring those set by other backtracking verbs.
A subpattern that does not contain a | character is just a part of the A subpattern that does not contain a | character is just a part of the
enclosing alternative; it is not a nested alternation with only one enclosing alternative; it is not a nested alternation with only one
@ -9269,10 +9265,10 @@ BACKTRACKING CONTROL
in a conditional subpattern. in a conditional subpattern.
(*ACCEPT) in a standalone positive assertion causes the assertion to (*ACCEPT) in a standalone positive assertion causes the assertion to
succeed without any further processing; captured strings and a (*MARK) succeed without any further processing; captured strings and a mark
name (if set) are retained. In a standalone negative assertion, name (if set) are retained. In a standalone negative assertion,
(*ACCEPT) causes the assertion to fail without any further processing; (*ACCEPT) causes the assertion to fail without any further processing;
captured substrings and any (*MARK) name are discarded. captured substrings and any mark name are discarded.
If the assertion is a condition, (*ACCEPT) causes the condition to be If the assertion is a condition, (*ACCEPT) causes the condition to be
true for a positive assertion and false for a negative one; captured true for a positive assertion and false for a negative one; captured
@ -9336,7 +9332,7 @@ AUTHOR
REVISION REVISION
Last updated: 12 October 2018 Last updated: 27 November 2018
Copyright (c) 1997-2018 University of Cambridge. Copyright (c) 1997-2018 University of Cambridge.
------------------------------------------------------------------------------ ------------------------------------------------------------------------------

@ -1,4 +1,4 @@
.TH PCRE2API 3 "12 November 2018" "PCRE2 10.33" .TH PCRE2API 3 "27 November 2018" "PCRE2 10.33"
.SH NAME .SH NAME
PCRE2 - Perl-compatible regular expressions (revised API) PCRE2 - Perl-compatible regular expressions (revised API)
.sp .sp
@ -2842,21 +2842,22 @@ appropriate circumstances. If they are called at other times, the result is
undefined. undefined.
.P .P
After a successful match, a partial match (PCRE2_ERROR_PARTIAL), or a failure After a successful match, a partial match (PCRE2_ERROR_PARTIAL), or a failure
to match (PCRE2_ERROR_NOMATCH), a (*MARK), (*PRUNE), or (*THEN) name may be to match (PCRE2_ERROR_NOMATCH), a mark name may be available. The function
available. The function \fBpcre2_get_mark()\fP can be called to access this \fBpcre2_get_mark()\fP can be called to access this name, which can be
name. The same function applies to all three verbs. It returns a pointer to the specified in the pattern by any of the backtracking control verbs, not just
zero-terminated name, which is within the compiled pattern. If no name is (*MARK). The same function applies to all the verbs. It returns a pointer to
the zero-terminated name, which is within the compiled pattern. If no name is
available, NULL is returned. The length of the name (excluding the terminating available, NULL is returned. The length of the name (excluding the terminating
zero) is stored in the code unit that precedes the name. You should use this zero) is stored in the code unit that precedes the name. You should use this
length instead of relying on the terminating zero if the name might contain a length instead of relying on the terminating zero if the name might contain a
binary zero. binary zero.
.P .P
After a successful match, the name that is returned is the last (*MARK), After a successful match, the name that is returned is the last mark name
(*PRUNE), or (*THEN) name encountered on the matching path through the pattern. encountered on the matching path through the pattern. Instances of backtracking
Instances of (*PRUNE) and (*THEN) without names are ignored. Thus, for example, verbs without names do not count. Thus, for example, if the matching path
if the matching path contains (*MARK:A)(*PRUNE), the name "A" is returned. contains (*MARK:A)(*PRUNE), the name "A" is returned. After a "no match" or a
After a "no match" or a partial match, the last encountered name is returned. partial match, the last encountered name is returned. For example, consider
For example, consider this pattern: this pattern:
.sp .sp
^(*MARK:A)((*MARK:B)a|b)c ^(*MARK:A)((*MARK:B)a|b)c
.sp .sp
@ -2870,7 +2871,7 @@ is removed from the pattern above, there is an initial check for the presence
of "c" in the subject before running the matching engine. This check fails for of "c" in the subject before running the matching engine. This check fails for
"bx", causing a match failure without seeing any marks. You can disable the "bx", causing a match failure without seeing any marks. You can disable the
start-of-match optimizations by setting the PCRE2_NO_START_OPTIMIZE option for start-of-match optimizations by setting the PCRE2_NO_START_OPTIMIZE option for
\fBpcre2_compile()\fP or starting the pattern with (*NO_START_OPT). \fBpcre2_compile()\fP or by starting the pattern with (*NO_START_OPT).
.P .P
After a successful match, a partial match, or one of the invalid UTF errors After a successful match, a partial match, or one of the invalid UTF errors
(for example, PCRE2_ERROR_UTF8_ERR5), \fBpcre2_get_startchar()\fP can be (for example, PCRE2_ERROR_UTF8_ERR5), \fBpcre2_get_startchar()\fP can be
@ -3297,13 +3298,12 @@ number or name. The number may be zero to include the entire matched string.
For example, if the pattern a(b)c is matched with "=abc=" and the replacement For example, if the pattern a(b)c is matched with "=abc=" and the replacement
string "+$1$0$1+", the result is "=+babcb+=". string "+$1$0$1+", the result is "=+babcb+=".
.P .P
$*MARK inserts the name from the last encountered (*ACCEPT), (*COMMIT), $*MARK inserts the name from the last encountered backtracking control verb on
(*MARK), (*PRUNE), or (*THEN) on the matching path that has a name. (*MARK) the matching path that has a name. (*MARK) must always include a name, but the
must always include a name, but the other verbs need not. For example, in other verbs need not. For example, in the case of (*MARK:A)(*PRUNE) the name
the case of (*MARK:A)(*PRUNE) the name inserted is "A", but for inserted is "A", but for (*MARK:A)(*PRUNE:B) the relevant name is "B". This
(*MARK:A)(*PRUNE:B) the relevant name is "B". This facility can be used to facility can be used to perform simple simultaneous substitutions, as this
perform simple simultaneous substitutions, as this \fBpcre2test\fP example \fBpcre2test\fP example shows:
shows:
.sp .sp
/(*MARK:pear)apple|(*MARK:orange)lemon/g,replace=${*MARK} /(*MARK:pear)apple|(*MARK:orange)lemon/g,replace=${*MARK}
apple lemon apple lemon
@ -3790,6 +3790,6 @@ Cambridge, England.
.rs .rs
.sp .sp
.nf .nf
Last updated: 12 November 2018 Last updated: 27 November 2018
Copyright (c) 1997-2018 University of Cambridge. Copyright (c) 1997-2018 University of Cambridge.
.fi .fi
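
The pcre2api hunks above say that the length of a mark name is stored in the code unit that precedes the name, and that this length should be preferred over the terminating zero when the name might contain a binary zero. The following is a minimal sketch of that idea for the 8-bit library only, where one code unit is one byte. The helper names (mark_name_length, copy_mark_name, simulated_mark_name) are hypothetical and not part of PCRE2, and the buffer is simulated storage standing in for the pointer that pcre2_get_mark() would return.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of PCRE2: given a pointer of the kind that
   pcre2_get_mark() returns in the 8-bit library, report the name length
   held in the code unit immediately before the name. */
size_t mark_name_length(const unsigned char *mark)
{
    assert(mark != NULL);   /* NULL would mean that no name is available */
    return (size_t)mark[-1];
}

/* Hypothetical helper: copy the whole name, including any binary zeros,
   into out (capacity out_size bytes). Returns the number of bytes copied,
   or (size_t)-1 if the buffer is too small. */
size_t copy_mark_name(const unsigned char *mark,
                      unsigned char *out, size_t out_size)
{
    size_t len = mark_name_length(mark);
    if (len > out_size) return (size_t)-1;
    memcpy(out, mark, len);
    return len;
}

/* Simulated storage laid out as the text describes: a length code unit,
   the three-unit name 'A', zero, 'B', then the terminating zero. */
static const unsigned char simulated_mark[] = { 3, 'A', 0, 'B', 0 };

/* Stand-in for the pointer that pcre2_get_mark() would hand back. */
const unsigned char *simulated_mark_name(void)
{
    return simulated_mark + 1;
}
```

With this simulated name, strlen() stops at the embedded zero and reports one code unit, whereas the stored length reports three, which is why the documentation recommends the stored length.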


@ -847,11 +847,15 @@ USING PCRE2'S CALLOUT FACILITY
Calling external programs or scripts Calling external programs or scripts
This facility can be independently disabled when pcre2grep is built. If This facility can be independently disabled when pcre2grep is built. It
the callout string does not start with a pipe (vertical bar) character, is supported for Windows, where a call to _spawnvp() is used, for VMS,
it is parsed into a list of substrings separated by pipe characters. where lib$spawn() is used, and for any other Unix-like environment
The first substring must be an executable name, with the following sub- where fork() and execv() are available.
strings specifying arguments:
If the callout string does not start with a pipe (vertical bar) charac-
ter, it is parsed into a list of substrings separated by pipe charac-
ters. The first substring must be an executable name, with the follow-
ing substrings specifying arguments:
executable_name|arg1|arg2|... executable_name|arg1|arg2|...
@ -877,15 +881,14 @@ USING PCRE2'S CALLOUT FACILITY
Arg1: [1] [234] [4] Arg2: |1| () Arg1: [1] [234] [4] Arg2: |1| ()
12345 12345
The parameters for the execv() system call that is used to run the pro- The parameters for the system call that is used to run the program or
gram or script are zero-terminated strings. This means that binary zero script are zero-terminated strings. This means that binary zero charac-
characters in the callout argument will cause premature termination of ters in the callout argument will cause premature termination of their
their substrings, and therefore should not be present. Any syntax substrings, and therefore should not be present. Any syntax errors in
errors in the string (for example, a dollar not followed by another the string (for example, a dollar not followed by another character)
character) cause the callout to be ignored. If running the program cause the callout to be ignored. If running the program fails for any
fails for any reason (including the non-existence of the executable), a reason (including the non-existence of the executable), a local match-
local matching failure occurs and the matcher backtracks in the normal ing failure occurs and the matcher backtracks in the normal way.
way.
Echoing a specific string Echoing a specific string
@ -945,5 +948,5 @@ AUTHOR
REVISION REVISION
Last updated: 17 November 2018 Last updated: 24 November 2018
Copyright (c) 1997-2018 University of Cambridge. Copyright (c) 1997-2018 University of Cambridge.
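
The pcre2grep text above describes a callout string that does not start with a pipe character as a list of substrings separated by pipe characters, the first being an executable name and the rest its arguments. The following is a rough sketch of that splitting step only. The function name split_callout is hypothetical, this is not pcre2grep's own code, and it makes no attempt at the dollar-sequence handling that the documentation mentions.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not pcre2grep's code: split callout in place at each
   '|' and store pointers to the pieces in argv, followed by a NULL entry in
   the style that execv() expects. Returns the number of pieces, or -1 if the
   string is empty, starts with '|' (a different kind of callout), or has
   more pieces than argv can hold. */
int split_callout(char *callout, char **argv, size_t argv_size)
{
    size_t argc = 0;
    char *p = callout;

    assert(callout != NULL && argv != NULL);   /* caller errors */
    if (argv_size < 2) return -1;
    if (*callout == '\0' || *callout == '|') return -1;

    for (;;) {
        char *bar;
        if (argc + 1 >= argv_size) return -1;  /* keep room for the NULL */
        argv[argc++] = p;                      /* start of this substring */
        bar = strchr(p, '|');
        if (bar == NULL) break;                /* last substring reached */
        *bar = '\0';                           /* terminate this substring */
        p = bar + 1;
    }
    argv[argc] = NULL;
    return (int)argc;
}
```

Because each piece is an ordinary zero-terminated C string, a binary zero inside a substring would cut it short, which matches the documentation's warning that such characters should not be present in the callout argument.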


@ -1,4 +1,4 @@
.TH PCRE2PATTERN 3 "12 October 2018" "PCRE2 10.33" .TH PCRE2PATTERN 3 "27 November 2018" "PCRE2 10.33"
.SH NAME .SH NAME
PCRE2 - Perl-compatible regular expressions (revised API) PCRE2 - Perl-compatible regular expressions (revised API)
.SH "PCRE2 REGULAR EXPRESSION DETAILS" .SH "PCRE2 REGULAR EXPRESSION DETAILS"
@ -3262,6 +3262,7 @@ There are a number of special "Backtracking Control Verbs" (to use Perl's
terminology) that modify the behaviour of backtracking during matching. They terminology) that modify the behaviour of backtracking during matching. They
are generally of the form (*VERB) or (*VERB:NAME). Some verbs take either form, are generally of the form (*VERB) or (*VERB:NAME). Some verbs take either form,
possibly behaving differently depending on whether or not a name is present. possibly behaving differently depending on whether or not a name is present.
The names are not required to be unique within the pattern.
.P .P
By default, for compatibility with Perl, a name is any sequence of characters By default, for compatibility with Perl, a name is any sequence of characters
that does not include a closing parenthesis. The name is not processed in that does not include a closing parenthesis. The name is not processed in
@ -3376,8 +3377,8 @@ nearest equivalent is the callout feature, as for example in this pattern:
A match with the string "aaaa" always fails, but the callout is taken before A match with the string "aaaa" always fails, but the callout is taken before
each backtrack happens (in this example, 10 times). each backtrack happens (in this example, 10 times).
.P .P
(*ACCEPT:NAME) and (*FAIL:NAME) behave exactly the same as (*ACCEPT:NAME) and (*FAIL:NAME) are treated as (*MARK:NAME)(*ACCEPT) and
(*MARK:NAME)(*ACCEPT) and (*MARK:NAME)(*FAIL), respectively. (*MARK:NAME)(*FAIL), respectively.
. .
. .
.SS "Recording which path was taken" .SS "Recording which path was taken"
@ -3389,10 +3390,10 @@ starting point (see (*SKIP) below).
.sp .sp
(*MARK:NAME) or (*:NAME) (*MARK:NAME) or (*:NAME)
.sp .sp
A name is always required with this verb. There may be as many instances of A name is always required with this verb. For all the other backtracking
(*MARK) as you like in a pattern, and their names do not have to be unique. control verbs, a NAME argument is optional.
.P .P
When a match succeeds, the name of the last-encountered (*MARK:NAME) on the When a match succeeds, the name of the last-encountered mark name on the
matching path is passed back to the caller as described in the section entitled matching path is passed back to the caller as described in the section entitled
.\" HTML <a href="pcre2api.html#matchotherdata"> .\" HTML <a href="pcre2api.html#matchotherdata">
.\" </a> .\" </a>
@ -3402,16 +3403,15 @@ in the
.\" HREF .\" HREF
\fBpcre2api\fP \fBpcre2api\fP
.\" .\"
documentation. This applies to all instances of (*MARK), including those inside documentation. This applies to all instances of (*MARK) and other verbs,
assertions and atomic groups. (There are differences in those cases when including those inside assertions and atomic groups. However, there are
(*MARK) is used in conjunction with (*SKIP) as described below.) differences in those cases when (*MARK) is used in conjunction with (*SKIP) as
described below.
.P .P
As well as (*MARK), the (*COMMIT), (*PRUNE) and (*THEN) verbs may have The mark name that was last encountered on the matching path is passed back. A
associated NAME arguments. Whichever is last on the matching path is passed verb without a NAME argument is ignored for this purpose. Here is an example of
back. See below for more details of these other verbs. \fBpcre2test\fP output, where the "mark" modifier requests the retrieval and
.P outputting of (*MARK) data:
Here is an example of \fBpcre2test\fP output, where the "mark" modifier
requests the retrieval and outputting of (*MARK) data:
.sp .sp
re> /X(*MARK:A)Y|X(*MARK:B)Z/mark re> /X(*MARK:A)Y|X(*MARK:B)Z/mark
data> XY data> XY
@ -3461,7 +3461,7 @@ to the left of the verb. However, when one of these verbs appears inside an
atomic group or in a lookaround assertion that is true, its effect is confined atomic group or in a lookaround assertion that is true, its effect is confined
to that group, because once the group has been matched, there is never any to that group, because once the group has been matched, there is never any
backtracking into it. Backtracking from beyond an assertion or an atomic group backtracking into it. Backtracking from beyond an assertion or an atomic group
ignores the entire group, and seeks a preceeding backtracking point. ignores the entire group, and seeks a preceding backtracking point.
.P .P
These verbs differ in exactly what kind of failure occurs when backtracking These verbs differ in exactly what kind of failure occurs when backtracking
reaches them. The behaviour described below is what happens when the verb is reaches them. The behaviour described below is what happens when the verb is
@ -3484,8 +3484,8 @@ dynamic anchor, or "I've started, so I must finish."
.P .P
The behaviour of (*COMMIT:NAME) is not the same as (*MARK:NAME)(*COMMIT). It is The behaviour of (*COMMIT:NAME) is not the same as (*MARK:NAME)(*COMMIT). It is
like (*MARK:NAME) in that the name is remembered for passing back to the like (*MARK:NAME) in that the name is remembered for passing back to the
caller. However, (*SKIP:NAME) searches only for names set with (*MARK), caller. However, (*SKIP:NAME) searches only for names that are set with
ignoring those set by (*COMMIT), (*PRUNE) and (*THEN). (*MARK), ignoring those set by any of the other backtracking verbs.
.P .P
If there is more than one backtracking verb in a pattern, a different one that If there is more than one backtracking verb in a pattern, a different one that
follows (*COMMIT) may be triggered first, so merely passing (*COMMIT) during a follows (*COMMIT) may be triggered first, so merely passing (*COMMIT) during a
@ -3526,7 +3526,7 @@ as (*COMMIT).
The behaviour of (*PRUNE:NAME) is not the same as (*MARK:NAME)(*PRUNE). It is The behaviour of (*PRUNE:NAME) is not the same as (*MARK:NAME)(*PRUNE). It is
like (*MARK:NAME) in that the name is remembered for passing back to the like (*MARK:NAME) in that the name is remembered for passing back to the
caller. However, (*SKIP:NAME) searches only for names set with (*MARK), caller. However, (*SKIP:NAME) searches only for names set with (*MARK),
ignoring those set by (*COMMIT), (*PRUNE) or (*THEN). ignoring those set by other backtracking verbs.
.sp .sp
(*SKIP) (*SKIP)
.sp .sp
@ -3579,7 +3579,7 @@ never seen because "a" does not match "b", so the matcher immediately jumps to
the second branch of the pattern. the second branch of the pattern.
.P .P
Note that (*SKIP:NAME) searches only for names set by (*MARK:NAME). It ignores Note that (*SKIP:NAME) searches only for names set by (*MARK:NAME). It ignores
names that are set by (*COMMIT:NAME), (*PRUNE:NAME) or (*THEN:NAME). names that are set by other backtracking verbs.
.sp .sp
(*THEN) or (*THEN:NAME) (*THEN) or (*THEN:NAME)
.sp .sp
@ -3600,7 +3600,7 @@ group. If (*THEN) is not inside an alternation, it acts like (*PRUNE).
The behaviour of (*THEN:NAME) is not the same as (*MARK:NAME)(*THEN). It is The behaviour of (*THEN:NAME) is not the same as (*MARK:NAME)(*THEN). It is
like (*MARK:NAME) in that the name is remembered for passing back to the like (*MARK:NAME) in that the name is remembered for passing back to the
caller. However, (*SKIP:NAME) searches only for names set with (*MARK), caller. However, (*SKIP:NAME) searches only for names set with (*MARK),
ignoring those set by (*COMMIT), (*PRUNE) and (*THEN). ignoring those set by other backtracking verbs.
.P .P
A subpattern that does not contain a | character is just a part of the A subpattern that does not contain a | character is just a part of the
enclosing alternative; it is not a nested alternation with only one enclosing alternative; it is not a nested alternation with only one
@ -3693,10 +3693,10 @@ not the assertion is standalone or acting as the condition in a conditional
subpattern. subpattern.
.P .P
(*ACCEPT) in a standalone positive assertion causes the assertion to succeed (*ACCEPT) in a standalone positive assertion causes the assertion to succeed
without any further processing; captured strings and a (*MARK) name (if set) without any further processing; captured strings and a mark name (if set) are
are retained. In a standalone negative assertion, (*ACCEPT) causes the retained. In a standalone negative assertion, (*ACCEPT) causes the assertion to
assertion to fail without any further processing; captured substrings and any fail without any further processing; captured substrings and any mark name are
(*MARK) name are discarded. discarded.
.P .P
If the assertion is a condition, (*ACCEPT) causes the condition to be true for If the assertion is a condition, (*ACCEPT) causes the condition to be true for
a positive assertion and false for a negative one; captured substrings are a positive assertion and false for a negative one; captured substrings are
@ -3767,6 +3767,6 @@ Cambridge, England.
.rs .rs
.sp .sp
.nf .nf
Last updated: 12 October 2018 Last updated: 27 November 2018
Copyright (c) 1997-2018 University of Cambridge. Copyright (c) 1997-2018 University of Cambridge.
.fi .fi