Derived documentation update.

Philip.Hazel 2017-10-17 16:26:40 +00:00
parent cc2182261a
commit 71d0ee75d2
6 changed files with 94 additions and 57 deletions

@ -26,11 +26,15 @@ DESCRIPTION
</b><br> </b><br>
<P> <P>
After a call of <b>pcre2_match()</b> that was passed the match block that is After a call of <b>pcre2_match()</b> that was passed the match block that is
this function's argument, this function returns a pointer to the last (*MARK) this function's argument, this function returns a pointer to the last (*MARK),
name that was encountered. The name is zero-terminated, and is within the (*PRUNE), or (*THEN) name that was encountered during the matching process. The
compiled pattern. If no (*MARK) name is available, NULL is returned. A (*MARK) name is zero-terminated, and is within the compiled pattern. The length of the
name may be available after a failed match or a partial match, as well as after name is in the preceding code unit. If no name is available, NULL is returned.
a successful one. </P>
<P>
After a successful match, the name that is returned is the last one on the
matching path. After a failed match or a partial match, the last encountered
name is returned.
</P> </P>
<P> <P>
There is a complete description of the PCRE2 native API in the There is a complete description of the PCRE2 native API in the

@ -2719,25 +2719,28 @@ undefined.
</P> </P>
<P> <P>
After a successful match, a partial match (PCRE2_ERROR_PARTIAL), or a failure After a successful match, a partial match (PCRE2_ERROR_PARTIAL), or a failure
to match (PCRE2_ERROR_NOMATCH), a (*MARK) name may be available, and to match (PCRE2_ERROR_NOMATCH), a (*MARK), (*PRUNE), or (*THEN) name may be
<b>pcre2_get_mark()</b> can be called. It returns a pointer to the available. The function <b>pcre2_get_mark()</b> can be called to access this
zero-terminated name, which is within the compiled pattern. Otherwise NULL is name. The same function applies to all three verbs. It returns a pointer to the
returned. The length of the (*MARK) name (excluding the terminating zero) is zero-terminated name, which is within the compiled pattern. If no name is
stored in the code unit that preceeds the name. You should use this instead of available, NULL is returned. The length of the name (excluding the terminating
relying on the terminating zero if the (*MARK) name might contain a binary zero) is stored in the code unit that precedes the name. You should use this
zero. length instead of relying on the terminating zero if the name might contain a
binary zero.
</P> </P>
<P> <P>
After a successful match, the (*MARK) name that is returned is the After a successful match, the name that is returned is the last (*MARK),
last one encountered on the matching path through the pattern. After a "no (*PRUNE), or (*THEN) name encountered on the matching path through the pattern.
match" or a partial match, the last encountered (*MARK) name is returned. For Instances of (*PRUNE) and (*THEN) without names are ignored. Thus, for example,
example, consider this pattern: if the matching path contains (*MARK:A)(*PRUNE), the name "A" is returned.
After a "no match" or a partial match, the last encountered name is returned.
For example, consider this pattern:
<pre> <pre>
^(*MARK:A)((*MARK:B)a|b)c ^(*MARK:A)((*MARK:B)a|b)c
</pre> </pre>
When it matches "bc", the returned mark is A. The B mark is "seen" in the first When it matches "bc", the returned name is A. The B mark is "seen" in the first
branch of the group, but it is not on the matching path. On the other hand, branch of the group, but it is not on the matching path. On the other hand,
when this pattern fails to match "bx", the returned mark is B. when this pattern fails to match "bx", the returned name is B.
</P> </P>
<P> <P>
After a successful match, a partial match, or one of the invalid UTF errors After a successful match, a partial match, or one of the invalid UTF errors
@ -3124,12 +3127,12 @@ length is in code units, not bytes.
In the replacement string, which is interpreted as a UTF string in UTF mode, In the replacement string, which is interpreted as a UTF string in UTF mode,
and is checked for UTF validity unless the PCRE2_NO_UTF_CHECK option is set, a and is checked for UTF validity unless the PCRE2_NO_UTF_CHECK option is set, a
dollar character is an escape character that can specify the insertion of dollar character is an escape character that can specify the insertion of
characters from capturing groups or (*MARK) items in the pattern. The following characters from capturing groups or (*MARK), (*PRUNE), or (*THEN) items in the
forms are always recognized: pattern. The following forms are always recognized:
<pre> <pre>
$$ insert a dollar character $$ insert a dollar character
$&#60;n&#62; or ${&#60;n&#62;} insert the contents of group &#60;n&#62; $&#60;n&#62; or ${&#60;n&#62;} insert the contents of group &#60;n&#62;
$*MARK or ${*MARK} insert the name of the last (*MARK) encountered $*MARK or ${*MARK} insert a (*MARK), (*PRUNE), or (*THEN) name
</pre> </pre>
Either a group number or a group name can be given for &#60;n&#62;. Curly brackets are Either a group number or a group name can be given for &#60;n&#62;. Curly brackets are
required only if the following character would be interpreted as part of the required only if the following character would be interpreted as part of the
@ -3138,15 +3141,19 @@ For example, if the pattern a(b)c is matched with "=abc=" and the replacement
string "+$1$0$1+", the result is "=+babcb+=". string "+$1$0$1+", the result is "=+babcb+=".
</P> </P>
<P> <P>
The facility for inserting a (*MARK) name can be used to perform simple $*MARK inserts the name from the last encountered (*MARK), (*PRUNE), or (*THEN)
simultaneous substitutions, as this <b>pcre2test</b> example shows: on the matching path that has a name. (*MARK) must always include a name, but
(*PRUNE) and (*THEN) need not. For example, in the case of (*MARK:A)(*PRUNE)
the name inserted is "A", but for (*MARK:A)(*PRUNE:B) the relevant name is "B".
This facility can be used to perform simple simultaneous substitutions, as this
<b>pcre2test</b> example shows:
<pre> <pre>
/(*:pear)apple|(*:orange)lemon/g,replace=${*MARK} /(*MARK:pear)apple|(*MARK:orange)lemon/g,replace=${*MARK}
apple lemon apple lemon
2: pear orange 2: pear orange
</pre> </pre>
As well as the usual options for <b>pcre2_match()</b>, a number of additional As well as the usual options for <b>pcre2_match()</b>, a number of additional
options can be set in the <i>options</i> argument. options can be set in the <i>options</i> argument of <b>pcre2_substitute()</b>.
</P> </P>
<P> <P>
PCRE2_SUBSTITUTE_GLOBAL causes the function to iterate over the subject string, PCRE2_SUBSTITUTE_GLOBAL causes the function to iterate over the subject string,
@ -3560,7 +3567,7 @@ Cambridge, England.
</P> </P>
<br><a name="SEC42" href="#TOC1">REVISION</a><br> <br><a name="SEC42" href="#TOC1">REVISION</a><br>
<P> <P>
Last updated: 25 September 2017 Last updated: 13 October 2017
<br> <br>
Copyright &copy; 1997-2017 University of Cambridge. Copyright &copy; 1997-2017 University of Cambridge.
<br> <br>
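
Reviewer note on the pcre2_get_mark() hunks above: the revised text says the name's length is stored in the code unit that precedes the name, and that this length should be used when the name might contain a binary zero. The sketch below illustrates only that layout for the 8-bit case. It does not call PCRE2: code_unit, simulated_mark, simulated_get_mark(), mark_name_length() and copy_mark_name() are hypothetical names invented for this illustration, and the buffer is a hand-built stand-in for what the documentation describes.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumption: stand-in for an 8-bit code unit (one byte per unit). */
typedef unsigned char code_unit;

/* Hand-built stand-in for the documented layout:
   [length][name code units...][terminating zero].
   The name here is 'A', binary zero, 'B', so its length is 3. */
const code_unit simulated_mark[] = { 3, 'A', 0, 'B', 0 };

/* Hypothetical substitute for the pointer a real pcre2_get_mark() call
   would return: it points at the name, one unit past the length. */
const code_unit *simulated_get_mark(void)
{
    return simulated_mark + 1;
}

/* Read the length from the code unit that precedes the name. */
size_t mark_name_length(const code_unit *name)
{
    assert(name != NULL);
    return (size_t)name[-1];
}

/* Copy the whole name, embedded zeros included, into out.
   Returns the number of code units copied, or 0 if out is too small. */
size_t copy_mark_name(const code_unit *name, code_unit *out, size_t out_size)
{
    size_t len = mark_name_length(name);
    if (len > out_size)
        return 0;
    memcpy(out, name, len);
    return len;
}
```

With this buffer, strlen() on the name pointer stops at the embedded zero and reports 1, whereas mark_name_length() reports 3, which is the case the revised wording warns about. In real use the pointer would come from pcre2_get_mark() and must be checked for NULL before the preceding code unit is read.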

@ -922,6 +922,10 @@ matches were found in other files) or too many matching errors. Using the
<b>-s</b> option to suppress error messages about inaccessible files does not <b>-s</b> option to suppress error messages about inaccessible files does not
affect the return code. affect the return code.
</P> </P>
<P>
When run under VMS, the return code is placed in the symbol PCRE2GREP_RC
because VMS does not distinguish between exit(0) and exit(1).
</P>
<br><a name="SEC13" href="#TOC1">SEE ALSO</a><br> <br><a name="SEC13" href="#TOC1">SEE ALSO</a><br>
<P> <P>
<b>pcre2pattern</b>(3), <b>pcre2syntax</b>(3), <b>pcre2callout</b>(3). <b>pcre2pattern</b>(3), <b>pcre2syntax</b>(3), <b>pcre2callout</b>(3).
@ -937,7 +941,7 @@ Cambridge, England.
</P> </P>
<br><a name="SEC15" href="#TOC1">REVISION</a><br> <br><a name="SEC15" href="#TOC1">REVISION</a><br>
<P> <P>
Last updated: 17 June 2017 Last updated: 11 October 2017
<br> <br>
Copyright &copy; 1997-2017 University of Cambridge. Copyright &copy; 1997-2017 University of Cambridge.
<br> <br>

@ -167,7 +167,8 @@ internal binary form of the pattern is output after compilation.
<b>-C</b> <b>-C</b>
Output the version number of the PCRE2 library, and all available information Output the version number of the PCRE2 library, and all available information
about the optional features that are included, and then exit with zero exit about the optional features that are included, and then exit with zero exit
code. All other options are ignored. code. All other options are ignored. If both -C and -LM are present, whichever
is first is recognized.
</P> </P>
<P> <P>
<b>-C</b> <i>option</i> <b>-C</b> <i>option</i>
@ -241,6 +242,12 @@ successful compilation, each pattern is passed to the just-in-time compiler, if
available, and the use of JIT is verified. available, and the use of JIT is verified.
</P> </P>
<P> <P>
<b>-LM</b>
List modifiers: write a list of available pattern and subject modifiers to the
standard output, then exit with zero exit code. All other options are ignored.
If both -C and -LM are present, whichever is first is recognized.
</P>
<P>
\fB-pattern\fB <i>modifier-list</i> \fB-pattern\fB <i>modifier-list</i>
Behave as if each pattern line contains the given modifiers. Behave as if each pattern line contains the given modifiers.
</P> </P>
@ -1020,13 +1027,14 @@ Setting certain match controls
The following modifiers are really subject modifiers, and are described under The following modifiers are really subject modifiers, and are described under
"Subject Modifiers" below. However, they may be included in a pattern's "Subject Modifiers" below. However, they may be included in a pattern's
modifier list, in which case they are applied to every subject line that is modifier list, in which case they are applied to every subject line that is
processed with that pattern. They may not appear in <b>#pattern</b> commands. processed with that pattern. These modifiers do not affect the compilation
These modifiers do not affect the compilation process. process.
<pre> <pre>
aftertext show text after match aftertext show text after match
allaftertext show text after captures allaftertext show text after captures
allcaptures show all captures allcaptures show all captures
allusedtext show all consulted text allusedtext show all consulted text
altglobal alternative global matching
/g global global matching /g global global matching
jitstack=&#60;n&#62; set size of JIT stack jitstack=&#60;n&#62; set size of JIT stack
mark show mark values mark show mark values
@ -1905,7 +1913,7 @@ Cambridge, England.
</P> </P>
<br><a name="SEC21" href="#TOC1">REVISION</a><br> <br><a name="SEC21" href="#TOC1">REVISION</a><br>
<P> <P>
Last updated: 12 July 2017 Last updated: 17 October 2017
<br> <br>
Copyright &copy; 1997-2017 University of Cambridge. Copyright &copy; 1997-2017 University of Cambridge.
<br> <br>

@ -2647,25 +2647,29 @@ OTHER INFORMATION ABOUT A MATCH
times, the result is undefined. times, the result is undefined.
After a successful match, a partial match (PCRE2_ERROR_PARTIAL), or a After a successful match, a partial match (PCRE2_ERROR_PARTIAL), or a
failure to match (PCRE2_ERROR_NOMATCH), a (*MARK) name may be avail- failure to match (PCRE2_ERROR_NOMATCH), a (*MARK), (*PRUNE), or (*THEN)
able, and pcre2_get_mark() can be called. It returns a pointer to the name may be available. The function pcre2_get_mark() can be called to
zero-terminated name, which is within the compiled pattern. Otherwise access this name. The same function applies to all three verbs. It
NULL is returned. The length of the (*MARK) name (excluding the termi- returns a pointer to the zero-terminated name, which is within the com-
nating zero) is stored in the code unit that preceeds the name. You piled pattern. If no name is available, NULL is returned. The length of
should use this instead of relying on the terminating zero if the the name (excluding the terminating zero) is stored in the code unit
(*MARK) name might contain a binary zero. that precedes the name. You should use this length instead of relying
on the terminating zero if the name might contain a binary zero.
After a successful match, the (*MARK) name that is returned is the last After a successful match, the name that is returned is the last
one encountered on the matching path through the pattern. After a "no (*MARK), (*PRUNE), or (*THEN) name encountered on the matching path
match" or a partial match, the last encountered (*MARK) name is through the pattern. Instances of (*PRUNE) and (*THEN) without names
returned. For example, consider this pattern: are ignored. Thus, for example, if the matching path contains
(*MARK:A)(*PRUNE), the name "A" is returned. After a "no match" or a
partial match, the last encountered name is returned. For example,
consider this pattern:
^(*MARK:A)((*MARK:B)a|b)c ^(*MARK:A)((*MARK:B)a|b)c
When it matches "bc", the returned mark is A. The B mark is "seen" in When it matches "bc", the returned name is A. The B mark is "seen" in
the first branch of the group, but it is not on the matching path. On the first branch of the group, but it is not on the matching path. On
the other hand, when this pattern fails to match "bx", the returned the other hand, when this pattern fails to match "bx", the returned
mark is B. name is B.
After a successful match, a partial match, or one of the invalid UTF After a successful match, a partial match, or one of the invalid UTF
errors (for example, PCRE2_ERROR_UTF8_ERR5), pcre2_get_startchar() can errors (for example, PCRE2_ERROR_UTF8_ERR5), pcre2_get_startchar() can
@ -3027,29 +3031,35 @@ CREATING A NEW STRING WITH SUBSTITUTIONS
In the replacement string, which is interpreted as a UTF string in UTF In the replacement string, which is interpreted as a UTF string in UTF
mode, and is checked for UTF validity unless the PCRE2_NO_UTF_CHECK mode, and is checked for UTF validity unless the PCRE2_NO_UTF_CHECK
option is set, a dollar character is an escape character that can spec- option is set, a dollar character is an escape character that can spec-
ify the insertion of characters from capturing groups or (*MARK) items ify the insertion of characters from capturing groups or (*MARK),
in the pattern. The following forms are always recognized: (*PRUNE), or (*THEN) items in the pattern. The following forms are
always recognized:
$$ insert a dollar character $$ insert a dollar character
$<n> or ${<n>} insert the contents of group <n> $<n> or ${<n>} insert the contents of group <n>
$*MARK or ${*MARK} insert the name of the last (*MARK) encountered $*MARK or ${*MARK} insert a (*MARK), (*PRUNE), or (*THEN) name
Either a group number or a group name can be given for <n>. Curly Either a group number or a group name can be given for <n>. Curly
brackets are required only if the following character would be inter- brackets are required only if the following character would be inter-
preted as part of the number or name. The number may be zero to include preted as part of the number or name. The number may be zero to include
the entire matched string. For example, if the pattern a(b)c is the entire matched string. For example, if the pattern a(b)c is
matched with "=abc=" and the replacement string "+$1$0$1+", the result matched with "=abc=" and the replacement string "+$1$0$1+", the result
is "=+babcb+=". is "=+babcb+=".
The facility for inserting a (*MARK) name can be used to perform simple $*MARK inserts the name from the last encountered (*MARK), (*PRUNE), or
simultaneous substitutions, as this pcre2test example shows: (*THEN) on the matching path that has a name. (*MARK) must always
include a name, but (*PRUNE) and (*THEN) need not. For example, in the
case of (*MARK:A)(*PRUNE) the name inserted is "A", but for
(*MARK:A)(*PRUNE:B) the relevant name is "B". This facility can be
used to perform simple simultaneous substitutions, as this pcre2test
example shows:
/(*:pear)apple|(*:orange)lemon/g,replace=${*MARK} /(*MARK:pear)apple|(*MARK:orange)lemon/g,replace=${*MARK}
apple lemon apple lemon
2: pear orange 2: pear orange
As well as the usual options for pcre2_match(), a number of additional As well as the usual options for pcre2_match(), a number of additional
options can be set in the options argument. options can be set in the options argument of pcre2_substitute().
PCRE2_SUBSTITUTE_GLOBAL causes the function to iterate over the subject PCRE2_SUBSTITUTE_GLOBAL causes the function to iterate over the subject
string, replacing every matching substring. If this is not set, only string, replacing every matching substring. If this is not set, only
@ -3437,7 +3447,7 @@ AUTHOR
REVISION REVISION
Last updated: 25 September 2017 Last updated: 13 October 2017
Copyright (c) 1997-2017 University of Cambridge. Copyright (c) 1997-2017 University of Cambridge.
------------------------------------------------------------------------------ ------------------------------------------------------------------------------

@ -903,6 +903,10 @@ DIAGNOSTICS
errors. Using the -s option to suppress error messages about inaccessi- errors. Using the -s option to suppress error messages about inaccessi-
ble files does not affect the return code. ble files does not affect the return code.
When run under VMS, the return code is placed in the symbol
PCRE2GREP_RC because VMS does not distinguish between exit(0) and
exit(1).
SEE ALSO SEE ALSO
@ -918,5 +922,5 @@ AUTHOR
REVISION REVISION
Last updated: 17 June 2017 Last updated: 11 October 2017
Copyright (c) 1997-2017 University of Cambridge. Copyright (c) 1997-2017 University of Cambridge.