Update HTML documentation.

Philip.Hazel 2015-09-12 18:12:01 +00:00
parent 06477b27af
commit 6aa0f3e56f
8 changed files with 854 additions and 721 deletions

View File

@ -20,7 +20,7 @@ SYNOPSIS
</P> </P>
<P> <P>
<b>int32_t pcre2_serialize_decode(pcre2_code **<i>codes</i>,</b> <b>int32_t pcre2_serialize_decode(pcre2_code **<i>codes</i>,</b>
<b> int32_t <i>number_of_codes</i>, const uint32_t *<i>bytes</i>,</b> <b> int32_t <i>number_of_codes</i>, const uint8_t *<i>bytes</i>,</b>
<b> pcre2_general_context *<i>gcontext</i>);</b> <b> pcre2_general_context *<i>gcontext</i>);</b>
</P> </P>
<br><b> <br><b>

View File

@ -19,8 +19,8 @@ SYNOPSIS
<b>#include &#60;pcre2.h&#62;</b> <b>#include &#60;pcre2.h&#62;</b>
</P> </P>
<P> <P>
<b>int32_t pcre2_serialize_encode(pcre2_code **<i>codes</i>,</b> <b>int32_t pcre2_serialize_encode(const pcre2_code **<i>codes</i>,</b>
<b> int32_t <i>number_of_codes</i>, uint32_t **<i>serialized_bytes</i>,</b> <b> int32_t <i>number_of_codes</i>, uint8_t **<i>serialized_bytes</i>,</b>
<b> PCRE2_SIZE *<i>serialized_size</i>, pcre2_general_context *<i>gcontext</i>);</b> <b> PCRE2_SIZE *<i>serialized_size</i>, pcre2_general_context *<i>gcontext</i>);</b>
</P> </P>
<br><b> <br><b>

View File

@ -266,12 +266,12 @@ document for an overview of all the PCRE2 documentation.
<br><a name="SEC9" href="#TOC1">PCRE2 NATIVE API SERIALIZATION FUNCTIONS</a><br> <br><a name="SEC9" href="#TOC1">PCRE2 NATIVE API SERIALIZATION FUNCTIONS</a><br>
<P> <P>
<b>int32_t pcre2_serialize_decode(pcre2_code **<i>codes</i>,</b> <b>int32_t pcre2_serialize_decode(pcre2_code **<i>codes</i>,</b>
<b> int32_t <i>number_of_codes</i>, const uint32_t *<i>bytes</i>,</b> <b> int32_t <i>number_of_codes</i>, const uint8_t *<i>bytes</i>,</b>
<b> pcre2_general_context *<i>gcontext</i>);</b> <b> pcre2_general_context *<i>gcontext</i>);</b>
<br> <br>
<br> <br>
<b>int32_t pcre2_serialize_encode(pcre2_code **<i>codes</i>,</b> <b>int32_t pcre2_serialize_encode(const pcre2_code **<i>codes</i>,</b>
<b> int32_t <i>number_of_codes</i>, uint32_t **<i>serialized_bytes</i>,</b> <b> int32_t <i>number_of_codes</i>, uint8_t **<i>serialized_bytes</i>,</b>
<b> PCRE2_SIZE *<i>serialized_size</i>, pcre2_general_context *<i>gcontext</i>);</b> <b> PCRE2_SIZE *<i>serialized_size</i>, pcre2_general_context *<i>gcontext</i>);</b>
<br> <br>
<br> <br>
@ -1091,7 +1091,10 @@ By default, for compatibility with Perl, the name in any verb sequence such as
parenthesis. The name is not processed in any way, and it is not possible to parenthesis. The name is not processed in any way, and it is not possible to
include a closing parenthesis in the name. However, if the PCRE2_ALT_VERBNAMES include a closing parenthesis in the name. However, if the PCRE2_ALT_VERBNAMES
option is set, normal backslash processing is applied to verb names and only an option is set, normal backslash processing is applied to verb names and only an
unescaped closing parenthesis terminates the name. unescaped closing parenthesis terminates the name. A closing parenthesis can be
included in a name either as \) or between \Q and \E. If the PCRE2_EXTENDED
option is set, unescaped whitespace in verb names is skipped and #-comments are
recognized, exactly as in the rest of the pattern.
<pre> <pre>
PCRE2_AUTO_CALLOUT PCRE2_AUTO_CALLOUT
</pre> </pre>
@ -2909,7 +2912,7 @@ Cambridge, England.
</P> </P>
<br><a name="SEC40" href="#TOC1">REVISION</a><br> <br><a name="SEC40" href="#TOC1">REVISION</a><br>
<P> <P>
Last updated: 30 August 2015 Last updated: 02 September 2015
<br> <br>
Copyright &copy; 1997-2015 University of Cambridge. Copyright &copy; 1997-2015 University of Cambridge.
<br> <br>

View File

@ -2925,7 +2925,10 @@ that does not include a closing parenthesis. The name is not processed in
any way, and it is not possible to include a closing parenthesis in the name. any way, and it is not possible to include a closing parenthesis in the name.
However, if the PCRE2_ALT_VERBNAMES option is set, normal backslash processing However, if the PCRE2_ALT_VERBNAMES option is set, normal backslash processing
is applied to verb names and only an unescaped closing parenthesis terminates is applied to verb names and only an unescaped closing parenthesis terminates
the name. the name. A closing parenthesis can be included in a name either as \) or
between \Q and \E. If the PCRE2_EXTENDED option is set, unescaped whitespace
in verb names is skipped and #-comments are recognized, exactly as in the rest
of the pattern.
</P> </P>
<P> <P>
The maximum length of a name is 255 in the 8-bit library and 65535 in the The maximum length of a name is 255 in the 8-bit library and 65535 in the
@ -3348,7 +3351,7 @@ Cambridge, England.
</P> </P>
<br><a name="SEC30" href="#TOC1">REVISION</a><br> <br><a name="SEC30" href="#TOC1">REVISION</a><br>
<P> <P>
Last updated: 30 August 2015 Last updated: 01 September 2015
<br> <br>
Copyright &copy; 1997-2015 University of Cambridge. Copyright &copy; 1997-2015 University of Cambridge.
<br> <br>

View File

@ -170,7 +170,7 @@ use the contents of the <i>preg</i> structure. If, for example, you pass it to
This area is not simple, because POSIX and Perl take different views of things. This area is not simple, because POSIX and Perl take different views of things.
It is not possible to get PCRE2 to obey POSIX semantics, but then PCRE2 was It is not possible to get PCRE2 to obey POSIX semantics, but then PCRE2 was
never intended to be a POSIX engine. The following table lists the different never intended to be a POSIX engine. The following table lists the different
possibilities for matching newline characters in PCRE2: possibilities for matching newline characters in Perl and PCRE2:
<pre> <pre>
Default Change with Default Change with
@ -180,7 +180,7 @@ possibilities for matching newline characters in PCRE2:
$ matches \n in middle no PCRE2_MULTILINE $ matches \n in middle no PCRE2_MULTILINE
^ matches \n in middle no PCRE2_MULTILINE ^ matches \n in middle no PCRE2_MULTILINE
</pre> </pre>
This is the equivalent table for POSIX: This is the equivalent table for a POSIX-compatible pattern matcher:
<pre> <pre>
Default Change with Default Change with
@ -190,14 +190,18 @@ This is the equivalent table for POSIX:
$ matches \n in middle no REG_NEWLINE $ matches \n in middle no REG_NEWLINE
^ matches \n in middle no REG_NEWLINE ^ matches \n in middle no REG_NEWLINE
</pre> </pre>
PCRE2's behaviour is the same as Perl's, except that there is no equivalent for This behaviour is not what happens when PCRE2 is called via its POSIX
PCRE2_DOLLAR_ENDONLY in Perl. In both PCRE2 and Perl, there is no way to stop API. By default, PCRE2's behaviour is the same as Perl's, except that there is
newline from matching [^a]. no equivalent for PCRE2_DOLLAR_ENDONLY in Perl. In both PCRE2 and Perl, there
is no way to stop newline from matching [^a].
</P> </P>
<P> <P>
The default POSIX newline handling can be obtained by setting PCRE2_DOTALL and Default POSIX newline handling can be obtained by setting PCRE2_DOTALL and
PCRE2_DOLLAR_ENDONLY, but there is no way to make PCRE2 behave exactly as for PCRE2_DOLLAR_ENDONLY when calling <b>pcre2_compile()</b> directly, but there is
the REG_NEWLINE action. no way to make PCRE2 behave exactly as for the REG_NEWLINE action. When using
the POSIX API, passing REG_NEWLINE to PCRE2's <b>regcomp()</b> function
causes PCRE2_MULTILINE to be passed to <b>pcre2_compile()</b>, and REG_DOTALL
passes PCRE2_DOTALL. There is no way to pass PCRE2_DOLLAR_ENDONLY.
</P> </P>
<br><a name="SEC5" href="#TOC1">MATCHING A PATTERN</a><br> <br><a name="SEC5" href="#TOC1">MATCHING A PATTERN</a><br>
<P> <P>
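The option translation described in the new pcre2posix text can be sketched as a small mapping function. This is only an illustration under stated assumptions, not the wrapper's actual source: every `SK_` constant and the name `sketch_translate_cflags` are hypothetical placeholders, and the real values are the `REG_*` and `PCRE2_*` constants defined by the PCRE2 headers.

```c
#include <assert.h>

/* Placeholder bit values, for illustration only. The real constants are
   REG_NEWLINE and REG_DOTALL (POSIX wrapper header) and PCRE2_MULTILINE,
   PCRE2_DOTALL and PCRE2_DOLLAR_ENDONLY (native header). */
#define SK_REG_NEWLINE          0x01u
#define SK_REG_DOTALL           0x02u
#define SK_PCRE2_MULTILINE      0x10u
#define SK_PCRE2_DOTALL         0x20u
#define SK_PCRE2_DOLLAR_ENDONLY 0x40u

/* Hypothetical helper: translate the two regcomp() flags discussed in the
   text into the options that would be handed to pcre2_compile(). */
unsigned sketch_translate_cflags(unsigned cflags)
{
    unsigned options = 0;

    /* This sketch only knows about the two flags named in the text. */
    assert((cflags & ~(SK_REG_NEWLINE | SK_REG_DOTALL)) == 0);

    if (cflags & SK_REG_NEWLINE) options |= SK_PCRE2_MULTILINE;
    if (cflags & SK_REG_DOTALL)  options |= SK_PCRE2_DOTALL;

    /* No input flag ever sets SK_PCRE2_DOLLAR_ENDONLY: the text says there
       is no way to pass that option through the POSIX API. */
    return options;
}
```

Because no flag maps to the dollar-endonly option, a caller who needs the default POSIX newline handling has to call `pcre2_compile()` directly, as the revised paragraph says.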
@ -283,9 +287,9 @@ Cambridge, England.
</P> </P>
<br><a name="SEC9" href="#TOC1">REVISION</a><br> <br><a name="SEC9" href="#TOC1">REVISION</a><br>
<P> <P>
Last updated: 20 October 2014 Last updated: 03 September 2015
<br> <br>
Copyright &copy; 1997-2014 University of Cambridge. Copyright &copy; 1997-2015 University of Cambridge.
<br> <br>
<p> <p>
Return to the <a href="index.html">PCRE2 index page</a>. Return to the <a href="index.html">PCRE2 index page</a>.

View File

@ -304,6 +304,36 @@ output.
This command is used to load a set of precompiled patterns from a file, as This command is used to load a set of precompiled patterns from a file, as
described in the section entitled "Saving and restoring compiled patterns" described in the section entitled "Saving and restoring compiled patterns"
<a href="#saverestore">below.</a> <a href="#saverestore">below.</a>
<pre>
#newline_default [&#60;newline-list&#62;]
</pre>
When PCRE2 is built, a default newline convention can be specified. This
determines which characters and/or character pairs are recognized as indicating
a newline in a pattern or subject string. The default can be overridden when a
pattern is compiled. The standard test files contain tests of various newline
conventions, but the majority of the tests expect a single linefeed to be
recognized as a newline by default. Without special action the tests would fail
when PCRE2 is compiled with either CR or CRLF as the default newline.
</P>
<P>
The #newline_default command specifies a list of newline types that are
acceptable as the default. The types must be one of CR, LF, CRLF, ANYCRLF, or
ANY (in upper or lower case), for example:
<pre>
#newline_default LF Any anyCRLF
</pre>
If the default newline is in the list, this command has no effect. Otherwise,
except when testing the POSIX API, a <b>newline</b> modifier that specifies the
first newline convention in the list (LF in the above example) is added to any
pattern that does not already have a <b>newline</b> modifier. If the newline
list is empty, the feature is turned off. This command is present in a number
of the standard test input files.
</P>
<P>
When the POSIX API is being tested there is no way to override the default
newline convention, though it is possible to set the newline convention from
within the pattern. A warning is given if the <b>posix</b> modifier is used when
<b>#newline_default</b> would set a default for the non-POSIX API.
<pre> <pre>
#pattern &#60;modifier-list&#62; #pattern &#60;modifier-list&#62;
</pre> </pre>
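The decision rule that the new #newline_default text describes can be sketched in C as follows. This is a minimal sketch written from the prose alone, not pcre2test's implementation: the function names `same_newline_name` and `newline_to_add`, and the two boolean parameters, are assumptions made for illustration.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Compare two newline type names, ignoring case ("anyCRLF" == "ANYCRLF"). */
int same_newline_name(const char *a, const char *b)
{
    while (*a != '\0' && *b != '\0') {
        if (toupper((unsigned char)*a) != toupper((unsigned char)*b))
            return 0;
        a++;
        b++;
    }
    return *a == '\0' && *b == '\0';
}

/* Hypothetical helper. Returns the newline convention to add to a pattern
   as a newline modifier, or NULL when nothing should be added.
     build_default   the default newline chosen when PCRE2 was built
     list, count     the types given to #newline_default
     has_modifier    nonzero if the pattern already has a newline modifier
     posix_api       nonzero if the pattern is tested via the POSIX API */
const char *newline_to_add(const char *build_default,
                           const char *const *list, size_t count,
                           int has_modifier, int posix_api)
{
    size_t i;

    assert(build_default != NULL);

    if (count == 0)
        return NULL;                 /* empty list: feature is turned off */

    for (i = 0; i < count; i++)
        if (same_newline_name(build_default, list[i]))
            return NULL;             /* default is acceptable: no effect */

    if (posix_api || has_modifier)
        return NULL;                 /* cannot override, or already set */

    return list[0];                  /* first convention in the list */
}
```

With the list from the example line (LF, Any, anyCRLF) and a build whose default is CR, the sketch returns the first entry, LF, for any native-API pattern that has no newline modifier of its own.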
@ -480,7 +510,7 @@ for a description of their effects.
allow_empty_class set PCRE2_ALLOW_EMPTY_CLASS allow_empty_class set PCRE2_ALLOW_EMPTY_CLASS
alt_bsux set PCRE2_ALT_BSUX alt_bsux set PCRE2_ALT_BSUX
alt_circumflex set PCRE2_ALT_CIRCUMFLEX alt_circumflex set PCRE2_ALT_CIRCUMFLEX
alt_verbnames set PCRE2_ALT_VERBNAMES alt_verbnames set PCRE2_ALT_VERBNAMES
anchored set PCRE2_ANCHORED anchored set PCRE2_ANCHORED
auto_callout set PCRE2_AUTO_CALLOUT auto_callout set PCRE2_AUTO_CALLOUT
/i caseless set PCRE2_CASELESS /i caseless set PCRE2_CASELESS
@ -625,21 +655,51 @@ actual length of the pattern is passed.
JIT compilation JIT compilation
</b><br> </b><br>
<P> <P>
The <b>/jit</b> modifier may optionally be followed by an equals sign and a Just-in-time (JIT) compiling is a heavyweight optimization that can greatly
number in the range 0 to 7: speed up pattern matching. See the
<a href="pcre2jit.html"><b>pcre2jit</b></a>
documentation for details. JIT compiling happens, optionally, after a pattern
has been successfully compiled into an internal form. The JIT compiler converts
this to optimized machine code. It needs to know whether the match-time options
PCRE2_PARTIAL_HARD and PCRE2_PARTIAL_SOFT are going to be used, because
different code is generated for the different cases. See the <b>partial</b>
modifier in "Subject Modifiers"
<a href="#subjectmodifiers">below</a>
for details of how these options are specified for each match attempt.
</P>
<P>
JIT compilation is requested by the <b>/jit</b> pattern modifier, which may
optionally be followed by an equals sign and a number in the range 0 to 7.
The three bits that make up the number specify which of the three JIT operating
modes are to be compiled:
<pre>
1 compile JIT code for non-partial matching
2 compile JIT code for soft partial matching
4 compile JIT code for hard partial matching
</pre>
The possible values for the <b>/jit</b> modifier are therefore:
<pre> <pre>
0 disable JIT 0 disable JIT
1 use JIT for normal match only 1 normal matching only
2 use JIT for soft partial match only 2 soft partial matching only
3 use JIT for normal match and soft partial match 3 normal and soft partial matching
4 use JIT for hard partial match only 4 hard partial matching only
6 use JIT for soft and hard partial match 6 soft and hard partial matching only
7 all three modes 7 all three modes
</pre> </pre>
If no number is given, 7 is assumed. If JIT compilation is successful, the If no number is given, 7 is assumed. The phrase "partial matching" means a call
compiled JIT code will automatically be used when <b>pcre2_match()</b> is run to <b>pcre2_match()</b> with either the PCRE2_PARTIAL_SOFT or the
for the appropriate type of match, except when incompatible run-time options PCRE2_PARTIAL_HARD option set. Note that such a call may return a complete
are specified. For more details, see the match; the options enable the possibility of a partial match, but do not
require it. Note also that if you request JIT compilation only for partial
matching (for example, /jit=2) but do not set the <b>partial</b> modifier on a
subject line, that match will not use JIT code because none was compiled for
non-partial matching.
</P>
<P>
If JIT compilation is successful, the compiled JIT code will automatically be
used when an appropriate type of match is run, except when incompatible
run-time options are specified. For more details, see the
<a href="pcre2jit.html"><b>pcre2jit</b></a> <a href="pcre2jit.html"><b>pcre2jit</b></a>
documentation. See also the <b>jitstack</b> modifier below for a way of documentation. See also the <b>jitstack</b> modifier below for a way of
setting the size of the JIT stack. setting the size of the JIT stack.
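The three-bit encoding of the /jit value described in this hunk can be checked with a short C sketch. The enum names, `match_kind`, and `jit_code_available` are hypothetical and chosen only for illustration; the bit meanings (1, 2, 4) are taken from the table in the new text.

```c
#include <assert.h>

/* One bit per JIT operating mode, as listed for the /jit modifier. */
enum {
    JIT_MODE_NORMAL       = 1,   /* non-partial matching  */
    JIT_MODE_PARTIAL_SOFT = 2,   /* soft partial matching */
    JIT_MODE_PARTIAL_HARD = 4    /* hard partial matching */
};

/* Kind of match requested for a subject line (hypothetical type). */
typedef enum {
    MATCH_NORMAL,
    MATCH_PARTIAL_SOFT,
    MATCH_PARTIAL_HARD
} match_kind;

/* Return 1 if a pattern compiled with /jit=jit_value has JIT code for the
   requested kind of match, otherwise 0. */
int jit_code_available(unsigned jit_value, match_kind kind)
{
    assert(jit_value <= 7);      /* the modifier accepts 0 to 7 */

    switch (kind) {
    case MATCH_NORMAL:       return (jit_value & JIT_MODE_NORMAL) != 0;
    case MATCH_PARTIAL_SOFT: return (jit_value & JIT_MODE_PARTIAL_SOFT) != 0;
    case MATCH_PARTIAL_HARD: return (jit_value & JIT_MODE_PARTIAL_HARD) != 0;
    }
    return 0;
}
```

For /jit=2 the sketch reports JIT code for a soft partial match but none for a normal match, which is the case the new text warns about. The table of possible values omits 5; by the same bit rule, 5 would select normal and hard partial matching.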
@ -707,8 +767,10 @@ Using the POSIX wrapper API
<P> <P>
The <b>/posix</b> modifier causes <b>pcre2test</b> to call PCRE2 via the POSIX The <b>/posix</b> modifier causes <b>pcre2test</b> to call PCRE2 via the POSIX
wrapper API rather than its native API. This supports only the 8-bit library. wrapper API rather than its native API. This supports only the 8-bit library.
When the POSIX API is being used, the following pattern modifiers set options Note that it does not imply POSIX matching semantics; for more detail see the
for the <b>regcomp()</b> function: <a href="pcre2posix.html"><b>pcre2posix</b></a>
documentation. When the POSIX API is being used, the following pattern
modifiers set options for the <b>regcomp()</b> function:
<pre> <pre>
caseless REG_ICASE caseless REG_ICASE
multiline REG_NEWLINE multiline REG_NEWLINE
@ -790,7 +852,7 @@ The <b>push</b> modifier is incompatible with compilation modifiers such as
warning message, except for <b>replace</b>, which causes an error. Note that, warning message, except for <b>replace</b>, which causes an error. Note that,
<b>jitverify</b>, which is allowed, does not carry through to any subsequent <b>jitverify</b>, which is allowed, does not carry through to any subsequent
matching that uses this pattern. matching that uses this pattern.
</P> <a name="subjectmodifiers"></a></P>
<br><a name="SEC11" href="#TOC1">SUBJECT MODIFIERS</a><br> <br><a name="SEC11" href="#TOC1">SUBJECT MODIFIERS</a><br>
<P> <P>
The modifiers that can appear in subject lines and the <b>#subject</b> The modifiers that can appear in subject lines and the <b>#subject</b>
@ -1471,7 +1533,7 @@ Cambridge, England.
</P> </P>
<br><a name="SEC21" href="#TOC1">REVISION</a><br> <br><a name="SEC21" href="#TOC1">REVISION</a><br>
<P> <P>
Last updated: 30 August 2015 Last updated: 12 September 2015
<br> <br>
Copyright &copy; 1997-2015 University of Cambridge. Copyright &copy; 1997-2015 University of Cambridge.
<br> <br>

File diff suppressed because it is too large.

View File

@ -247,6 +247,36 @@ COMMAND LINES
as described in the section entitled "Saving and restoring compiled as described in the section entitled "Saving and restoring compiled
patterns" below. patterns" below.
#newline_default [<newline-list>]
When PCRE2 is built, a default newline convention can be specified.
This determines which characters and/or character pairs are recognized
as indicating a newline in a pattern or subject string. The default can
be overridden when a pattern is compiled. The standard test files con-
tain tests of various newline conventions, but the majority of the
tests expect a single linefeed to be recognized as a newline by
default. Without special action the tests would fail when PCRE2 is com-
piled with either CR or CRLF as the default newline.
The #newline_default command specifies a list of newline types that are
acceptable as the default. The types must be one of CR, LF, CRLF, ANY-
CRLF, or ANY (in upper or lower case), for example:
#newline_default LF Any anyCRLF
If the default newline is in the list, this command has no effect. Oth-
erwise, except when testing the POSIX API, a newline modifier that
specifies the first newline convention in the list (LF in the above
example) is added to any pattern that does not already have a newline
modifier. If the newline list is empty, the feature is turned off. This
command is present in a number of the standard test input files.
When the POSIX API is being tested there is no way to override the
default newline convention, though it is possible to set the newline
convention from within the pattern. A warning is given if the posix
modifier is used when #newline_default would set a default for the non-
POSIX API.
#pattern <modifier-list> #pattern <modifier-list>
This command sets a default modifier list that applies to all subse- This command sets a default modifier list that applies to all subse-
@ -558,23 +588,49 @@ PATTERN MODIFIERS
JIT compilation JIT compilation
The /jit modifier may optionally be followed by an equals sign and a Just-in-time (JIT) compiling is a heavyweight optimization that can
number in the range 0 to 7: greatly speed up pattern matching. See the pcre2jit documentation for
details. JIT compiling happens, optionally, after a pattern has been
successfully compiled into an internal form. The JIT compiler converts
this to optimized machine code. It needs to know whether the match-time
options PCRE2_PARTIAL_HARD and PCRE2_PARTIAL_SOFT are going to be used,
because different code is generated for the different cases. See the
partial modifier in "Subject Modifiers" below for details of how these
options are specified for each match attempt.
JIT compilation is requested by the /jit pattern modifier, which may
optionally be followed by an equals sign and a number in the range 0 to
7. The three bits that make up the number specify which of the three
JIT operating modes are to be compiled:
1 compile JIT code for non-partial matching
2 compile JIT code for soft partial matching
4 compile JIT code for hard partial matching
The possible values for the /jit modifier are therefore:
0 disable JIT 0 disable JIT
1 use JIT for normal match only 1 normal matching only
2 use JIT for soft partial match only 2 soft partial matching only
3 use JIT for normal match and soft partial match 3 normal and soft partial matching
4 use JIT for hard partial match only 4 hard partial matching only
6 use JIT for soft and hard partial match 6 soft and hard partial matching only
7 all three modes 7 all three modes
If no number is given, 7 is assumed. If JIT compilation is successful, If no number is given, 7 is assumed. The phrase "partial matching"
the compiled JIT code will automatically be used when pcre2_match() is means a call to pcre2_match() with either the PCRE2_PARTIAL_SOFT or the
run for the appropriate type of match, except when incompatible run- PCRE2_PARTIAL_HARD option set. Note that such a call may return a com-
time options are specified. For more details, see the pcre2jit documen- plete match; the options enable the possibility of a partial match, but
tation. See also the jitstack modifier below for a way of setting the do not require it. Note also that if you request JIT compilation only
size of the JIT stack. for partial matching (for example, /jit=2) but do not set the partial
modifier on a subject line, that match will not use JIT code because
none was compiled for non-partial matching.
If JIT compilation is successful, the compiled JIT code will automati-
cally be used when an appropriate type of match is run, except when
incompatible run-time options are specified. For more details, see the
pcre2jit documentation. See also the jitstack modifier below for a way
of setting the size of the JIT stack.
If the jitfast modifier is specified, matching is done using the JIT If the jitfast modifier is specified, matching is done using the JIT
"fast path" interface, pcre2_jit_match(), which skips some of the san- "fast path" interface, pcre2_jit_match(), which skips some of the san-
@ -628,8 +684,10 @@ PATTERN MODIFIERS
The /posix modifier causes pcre2test to call PCRE2 via the POSIX wrap- The /posix modifier causes pcre2test to call PCRE2 via the POSIX wrap-
per API rather than its native API. This supports only the 8-bit per API rather than its native API. This supports only the 8-bit
library. When the POSIX API is being used, the following pattern modi- library. Note that it does not imply POSIX matching semantics; for
fiers set options for the regcomp() function: more detail see the pcre2posix documentation. When the POSIX API is
being used, the following pattern modifiers set options for the reg-
comp() function:
caseless REG_ICASE caseless REG_ICASE
multiline REG_NEWLINE multiline REG_NEWLINE
@ -1333,5 +1391,5 @@ AUTHOR
REVISION REVISION
Last updated: 30 August 2015 Last updated: 12 September 2015
Copyright (c) 1997-2015 University of Cambridge. Copyright (c) 1997-2015 University of Cambridge.