Implement PCRE2_NO_JIT, update HTML docs as well.
parent afa3c56afd
commit d243224a60

@@ -128,6 +128,8 @@ Memcheck warnings Addr16 and Cond in unknown objects (that is, JIT-compiled
code). Also changed smc-check=all to smc-check=all-non-file as was done for
RunTest (see 4 above).

32. Implemented the PCRE2_NO_JIT option for pcre2_match().


Version 10.21 12-January-2016
-----------------------------

@@ -168,15 +168,12 @@ library. They are also documented in the pcre2build man page.
built. If you want only the 16-bit or 32-bit library, use --disable-pcre2-8
to disable building the 8-bit library.

. If you want to include support for just-in-time compiling, which can give
large performance improvements on certain platforms, add --enable-jit to the
"configure" command. This support is available only for certain hardware
. If you want to include support for just-in-time (JIT) compiling, which can
give large performance improvements on certain platforms, add --enable-jit to
the "configure" command. This support is available only for certain hardware
architectures. If you try to enable it on an unsupported architecture, there
will be a compile time error.

. When JIT support is enabled, pcre2grep automatically makes use of it, unless
you add --disable-pcre2grep-jit to the "configure" command.

. If you do not want to make use of the support for UTF-8 Unicode character
strings in the 8-bit library, UTF-16 Unicode character strings in the 16-bit
library, or UTF-32 Unicode character strings in the 32-bit library, you can
@@ -324,6 +321,14 @@ library. They are also documented in the pcre2build man page.
running "make" to build PCRE2. There is more information about coverage
reporting in the "pcre2build" documentation.

. When JIT support is enabled, pcre2grep automatically makes use of it, unless
you add --disable-pcre2grep-jit to the "configure" command.

. On non-Windows systems there is support for calling external scripts during
matching in the pcre2grep command via PCRE2's callout facility with string
arguments. This support can be disabled by adding --disable-pcre2grep-callout
to the "configure" command.

. The pcre2grep program currently supports only 8-bit data files, and so
requires the 8-bit PCRE2 library. It is possible to compile pcre2grep to use
libz and/or libbz2, in order to read .gz and .bz2 files (respectively), by
@@ -840,4 +845,4 @@ The distribution should contain the files listed below.
Philip Hazel
Email local part: ph10
Email domain: cam.ac.uk
Last updated: 16 October 2015
Last updated: 01 April 2016

@@ -417,9 +417,10 @@ More complicated programs might need to make use of the specialist functions
<b>pcre2_jit_stack_assign()</b> in order to control the JIT code's memory usage.
</P>
<P>
JIT matching is automatically used by <b>pcre2_match()</b> if it is available.
There is also a direct interface for JIT matching, which gives improved
performance. The JIT-specific functions are discussed in the
JIT matching is automatically used by <b>pcre2_match()</b> if it is available,
unless the PCRE2_NO_JIT option is set. There is also a direct interface for JIT
matching, which gives improved performance. The JIT-specific functions are
discussed in the
<a href="pcre2jit.html"><b>pcre2jit</b></a>
documentation.
</P>
@@ -1630,10 +1631,15 @@ are as follows:
Return a copy of the pattern's options. The third argument should point to a
<b>uint32_t</b> variable. PCRE2_INFO_ARGOPTIONS returns exactly the options that
were passed to <b>pcre2_compile()</b>, whereas PCRE2_INFO_ALLOPTIONS returns
the compile options as modified by any top-level option settings such as (*UTF)
at the start of the pattern itself. For example, if the pattern /(*UTF)abc/ is
compiled with the PCRE2_EXTENDED option, the result is PCRE2_EXTENDED and
PCRE2_UTF.
the compile options as modified by any top-level (*XXX) option settings such as
(*UTF) at the start of the pattern itself.
</P>
<P>
For example, if the pattern /(*UTF)abc/ is compiled with the PCRE2_EXTENDED
option, the result for PCRE2_INFO_ALLOPTIONS is PCRE2_EXTENDED and PCRE2_UTF.
Option settings such as (?i) that can change within a pattern do not affect the
result of PCRE2_INFO_ALLOPTIONS, even if they appear right at the start of the
pattern. (This was different in some earlier releases.)
</P>
<P>
A pattern compiled without PCRE2_ANCHORED is automatically anchored by PCRE2 if
@@ -2088,14 +2094,15 @@ Option bits for <b>pcre2_match()</b>
<P>
The unused bits of the <i>options</i> argument for <b>pcre2_match()</b> must be
zero. The only bits that may be set are PCRE2_ANCHORED, PCRE2_NOTBOL,
PCRE2_NOTEOL, PCRE2_NOTEMPTY, PCRE2_NOTEMPTY_ATSTART, PCRE2_NO_UTF_CHECK,
PCRE2_PARTIAL_HARD, and PCRE2_PARTIAL_SOFT. Their action is described below.
PCRE2_NOTEOL, PCRE2_NOTEMPTY, PCRE2_NOTEMPTY_ATSTART, PCRE2_NO_JIT,
PCRE2_NO_UTF_CHECK, PCRE2_PARTIAL_HARD, and PCRE2_PARTIAL_SOFT. Their action is
described below.
</P>
<P>
Setting PCRE2_ANCHORED at match time is not supported by the just-in-time (JIT)
compiler. If it is set, JIT matching is disabled and the normal interpretive
code in <b>pcre2_match()</b> is run. The remaining options are supported for JIT
matching.
code in <b>pcre2_match()</b> is run. Apart from PCRE2_NO_JIT (obviously), the
remaining options are supported for JIT matching.
<pre>
PCRE2_ANCHORED
</pre>
|
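The rule quoted in the hunk above (unused bits of the options argument must be zero) amounts to a simple mask test. The following is a minimal, self-contained sketch of that check and not the PCRE2 source: the DEMO_ names, the bit values assigned to them, and both functions are assumptions made for illustration. Real programs should use the PCRE2_* macros from pcre2.h.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the match-time option macros. The bit values
   are illustrative assumptions; real programs must take them from pcre2.h. */
#define DEMO_NOTBOL            0x00000001u
#define DEMO_NOTEOL            0x00000002u
#define DEMO_NOTEMPTY          0x00000004u
#define DEMO_NOTEMPTY_ATSTART  0x00000008u
#define DEMO_PARTIAL_SOFT      0x00000010u
#define DEMO_PARTIAL_HARD      0x00000020u
#define DEMO_NO_JIT            0x00002000u
#define DEMO_NO_UTF_CHECK      0x40000000u
#define DEMO_ANCHORED          0x80000000u

/* Every bit that the documentation lists as permitted at match time. */
#define DEMO_MATCH_ALLOWED \
  (DEMO_ANCHORED | DEMO_NOTBOL | DEMO_NOTEOL | DEMO_NOTEMPTY | \
   DEMO_NOTEMPTY_ATSTART | DEMO_NO_JIT | DEMO_NO_UTF_CHECK | \
   DEMO_PARTIAL_HARD | DEMO_PARTIAL_SOFT)

/* Returns true when no bit outside the permitted set is present. */
bool demo_match_options_valid(uint32_t options)
{
  return (options & ~DEMO_MATCH_ALLOWED) == 0;
}

/* Small self-check of the sketch. */
void demo_match_options_self_check(void)
{
  assert(demo_match_options_valid(0));
  assert(demo_match_options_valid(DEMO_NO_JIT | DEMO_NOTBOL));
  assert(!demo_match_options_valid(0x00000040u)); /* bit not in the demo set */
}
```

Adding a new permitted option, as this commit does with PCRE2_NO_JIT, then comes down to adding one bit to the allowed mask.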
@@ -2142,6 +2149,13 @@ only at the first matching position, that is, at the start of the subject plus
the starting offset. An empty string match later in the subject is permitted.
If the pattern is anchored, such a match can occur only if the pattern contains
\K.
<pre>
PCRE2_NO_JIT
</pre>
By default, if a pattern has been successfully processed by
<b>pcre2_jit_compile()</b>, JIT is automatically used when <b>pcre2_match()</b>
is called with options that JIT supports. Setting PCRE2_NO_JIT disables the use
of JIT; it forces matching to be done by the interpreter.
<pre>
PCRE2_NO_UTF_CHECK
</pre>
|
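The PCRE2_NO_JIT text above, together with the earlier note that PCRE2_ANCHORED at match time disables JIT, describes a small decision procedure. Below is a hedged sketch of that decision as a standalone model. It is not the library's implementation: the DEMO_ macros, their bit values, the demo_engine type, and demo_choose_engine() are all hypothetical names introduced for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for two match-time options; the bit values are
   illustrative assumptions, not taken from this diff. */
#define DEMO_NO_JIT    0x00002000u
#define DEMO_ANCHORED  0x80000000u

typedef enum { DEMO_USE_INTERPRETER, DEMO_USE_JIT } demo_engine;

/* Models the documented behaviour: JIT is used only when the pattern was
   JIT-compiled, the caller did not ask for the interpreter, and no option
   that JIT does not support at match time is set. */
demo_engine demo_choose_engine(bool jit_compiled, uint32_t options)
{
  if (!jit_compiled) return DEMO_USE_INTERPRETER;
  if ((options & DEMO_NO_JIT) != 0) return DEMO_USE_INTERPRETER;
  if ((options & DEMO_ANCHORED) != 0) return DEMO_USE_INTERPRETER;
  return DEMO_USE_JIT;
}

/* Small self-check of the sketch. */
void demo_choose_engine_self_check(void)
{
  assert(demo_choose_engine(true, 0) == DEMO_USE_JIT);
  assert(demo_choose_engine(true, DEMO_NO_JIT) == DEMO_USE_INTERPRETER);
  assert(demo_choose_engine(true, DEMO_ANCHORED) == DEMO_USE_INTERPRETER);
  assert(demo_choose_engine(false, 0) == DEMO_USE_INTERPRETER);
}
```

In a real program the caller would not make this choice itself; it would pass PCRE2_NO_JIT in the options argument of pcre2_match() and let the library fall back to the interpreter.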
@@ -3184,7 +3198,7 @@ Cambridge, England.
</P>
<br><a name="SEC40" href="#TOC1">REVISION</a><br>
<P>
Last updated: 26 February 2016
Last updated: 05 June 2016
<br>
Copyright © 1997-2016 University of Cambridge.
<br>

@@ -27,15 +27,16 @@ please consult the man page, in case the conversion went wrong.
<li><a name="TOC12" href="#SEC12">LIMITING PCRE2 RESOURCE USAGE</a>
<li><a name="TOC13" href="#SEC13">CREATING CHARACTER TABLES AT BUILD TIME</a>
<li><a name="TOC14" href="#SEC14">USING EBCDIC CODE</a>
<li><a name="TOC15" href="#SEC15">PCRE2GREP OPTIONS FOR COMPRESSED FILE SUPPORT</a>
<li><a name="TOC16" href="#SEC16">PCRE2GREP BUFFER SIZE</a>
<li><a name="TOC17" href="#SEC17">PCRE2TEST OPTION FOR LIBREADLINE SUPPORT</a>
<li><a name="TOC18" href="#SEC18">INCLUDING DEBUGGING CODE</a>
<li><a name="TOC19" href="#SEC19">DEBUGGING WITH VALGRIND SUPPORT</a>
<li><a name="TOC20" href="#SEC20">CODE COVERAGE REPORTING</a>
<li><a name="TOC21" href="#SEC21">SEE ALSO</a>
<li><a name="TOC22" href="#SEC22">AUTHOR</a>
<li><a name="TOC23" href="#SEC23">REVISION</a>
<li><a name="TOC15" href="#SEC15">PCRE2GREP SUPPORT FOR EXTERNAL SCRIPTS</a>
<li><a name="TOC16" href="#SEC16">PCRE2GREP OPTIONS FOR COMPRESSED FILE SUPPORT</a>
<li><a name="TOC17" href="#SEC17">PCRE2GREP BUFFER SIZE</a>
<li><a name="TOC18" href="#SEC18">PCRE2TEST OPTION FOR LIBREADLINE SUPPORT</a>
<li><a name="TOC19" href="#SEC19">INCLUDING DEBUGGING CODE</a>
<li><a name="TOC20" href="#SEC20">DEBUGGING WITH VALGRIND SUPPORT</a>
<li><a name="TOC21" href="#SEC21">CODE COVERAGE REPORTING</a>
<li><a name="TOC22" href="#SEC22">SEE ALSO</a>
<li><a name="TOC23" href="#SEC23">AUTHOR</a>
<li><a name="TOC24" href="#SEC24">REVISION</a>
</ul>
<br><a name="SEC1" href="#TOC1">BUILDING PCRE2</a><br>
<P>
@@ -349,7 +350,16 @@ The options that select newline behaviour, such as --enable-newline-is-cr,
and equivalent run-time options, refer to these character values in an EBCDIC
environment.
</P>
<br><a name="SEC15" href="#TOC1">PCRE2GREP OPTIONS FOR COMPRESSED FILE SUPPORT</a><br>
<br><a name="SEC15" href="#TOC1">PCRE2GREP SUPPORT FOR EXTERNAL SCRIPTS</a><br>
<P>
By default, on non-Windows systems, <b>pcre2grep</b> supports the use of
callouts with string arguments within the patterns it is matching, in order to
run external scripts. For details, see the
<a href="pcre2grep.html"><b>pcre2grep</b></a>
documentation. This support can be disabled by adding
--disable-pcre2grep-callout to the <b>configure</b> command.
</P>
<br><a name="SEC16" href="#TOC1">PCRE2GREP OPTIONS FOR COMPRESSED FILE SUPPORT</a><br>
<P>
By default, <b>pcre2grep</b> reads all files as plain text. You can build it so
that it recognizes files whose names end in <b>.gz</b> or <b>.bz2</b>, and reads
@@ -362,7 +372,7 @@ to the <b>configure</b> command. These options naturally require that the
relevant libraries are installed on your system. Configuration will fail if
they are not.
</P>
<br><a name="SEC16" href="#TOC1">PCRE2GREP BUFFER SIZE</a><br>
<br><a name="SEC17" href="#TOC1">PCRE2GREP BUFFER SIZE</a><br>
<P>
<b>pcre2grep</b> uses an internal buffer to hold a "window" on the file it is
scanning, in order to be able to output "before" and "after" lines when it
@@ -375,9 +385,9 @@ parameter value by adding, for example,
--with-pcre2grep-bufsize=50K
</pre>
to the <b>configure</b> command. The caller of <b>pcre2grep</b> can override this
value by using --buffer-size on the command line..
value by using --buffer-size on the command line.
</P>
<br><a name="SEC17" href="#TOC1">PCRE2TEST OPTION FOR LIBREADLINE SUPPORT</a><br>
<br><a name="SEC18" href="#TOC1">PCRE2TEST OPTION FOR LIBREADLINE SUPPORT</a><br>
<P>
If you add one of
<pre>
@@ -411,7 +421,7 @@ automatically included, you may need to add something like
</pre>
immediately before the <b>configure</b> command.
</P>
<br><a name="SEC18" href="#TOC1">INCLUDING DEBUGGING CODE</a><br>
<br><a name="SEC19" href="#TOC1">INCLUDING DEBUGGING CODE</a><br>
<P>
If you add
<pre>
@@ -420,7 +430,7 @@ If you add
to the <b>configure</b> command, additional debugging code is included in the
build. This feature is intended for use by the PCRE2 maintainers.
</P>
<br><a name="SEC19" href="#TOC1">DEBUGGING WITH VALGRIND SUPPORT</a><br>
<br><a name="SEC20" href="#TOC1">DEBUGGING WITH VALGRIND SUPPORT</a><br>
<P>
If you add
<pre>
@@ -430,7 +440,7 @@ to the <b>configure</b> command, PCRE2 will use valgrind annotations to mark
certain memory regions as unaddressable. This allows it to detect invalid
memory accesses, and is mostly useful for debugging PCRE2 itself.
</P>
<br><a name="SEC20" href="#TOC1">CODE COVERAGE REPORTING</a><br>
<br><a name="SEC21" href="#TOC1">CODE COVERAGE REPORTING</a><br>
<P>
If your C compiler is gcc, you can build a version of PCRE2 that can generate a
code coverage report for its test suite. To enable this, you must install
@@ -487,11 +497,11 @@ This cleans all coverage data including the generated coverage report. For more
information about code coverage, see the <b>gcov</b> and <b>lcov</b>
documentation.
</P>
<br><a name="SEC21" href="#TOC1">SEE ALSO</a><br>
<br><a name="SEC22" href="#TOC1">SEE ALSO</a><br>
<P>
<b>pcre2api</b>(3), <b>pcre2-config</b>(3).
</P>
<br><a name="SEC22" href="#TOC1">AUTHOR</a><br>
<br><a name="SEC23" href="#TOC1">AUTHOR</a><br>
<P>
Philip Hazel
<br>
@@ -500,11 +510,11 @@ University Computing Service
Cambridge, England.
<br>
</P>
<br><a name="SEC23" href="#TOC1">REVISION</a><br>
<br><a name="SEC24" href="#TOC1">REVISION</a><br>
<P>
Last updated: 16 October 2015
Last updated: 01 April 2016
<br>
Copyright © 1997-2015 University of Cambridge.
Copyright © 1997-2016 University of Cambridge.
<br>
<p>
Return to the <a href="index.html">PCRE2 index page</a>.

@@ -152,6 +152,10 @@ PCRE2_NO_UTF_CHECK, PCRE2_PARTIAL_HARD, and PCRE2_PARTIAL_SOFT. The
PCRE2_ANCHORED option is not supported at match time.
</P>
<P>
If the PCRE2_NO_JIT option is passed to <b>pcre2_match()</b> it disables the
use of JIT, forcing matching by the interpreter code.
</P>
<P>
The only unsupported pattern items are \C (match a single data unit) when
running in a UTF mode, and a callout immediately before an assertion condition
in a conditional group.
@@ -403,7 +407,7 @@ The fast path function is called <b>pcre2_jit_match()</b>, and it takes exactly
the same arguments as <b>pcre2_match()</b>. The return values are also the same,
plus PCRE2_ERROR_JIT_BADOPTION if a matching mode (partial or complete) is
requested that was not compiled. Unsupported option bits (for example,
PCRE2_ANCHORED) are ignored.
PCRE2_ANCHORED) are ignored, as is the PCRE2_NO_JIT option.
</P>
<P>
When you call <b>pcre2_match()</b>, as well as testing for invalid options, a
@@ -432,9 +436,9 @@ Cambridge, England.
</P>
<br><a name="SEC13" href="#TOC1">REVISION</a><br>
<P>
Last updated: 14 November 2015
Last updated: 05 June 2016
<br>
Copyright © 1997-2015 University of Cambridge.
Copyright © 1997-2016 University of Cambridge.
<br>
<p>
Return to the <a href="index.html">PCRE2 index page</a>.

@@ -14,10 +14,11 @@ please consult the man page, in case the conversion went wrong.
<br>
<ul>
<li><a name="TOC1" href="#SEC1">SAVING AND RE-USING PRECOMPILED PCRE2 PATTERNS</a>
<li><a name="TOC2" href="#SEC2">SAVING COMPILED PATTERNS</a>
<li><a name="TOC3" href="#SEC3">RE-USING PRECOMPILED PATTERNS</a>
<li><a name="TOC4" href="#SEC4">AUTHOR</a>
<li><a name="TOC5" href="#SEC5">REVISION</a>
<li><a name="TOC2" href="#SEC2">SECURITY CONCERNS</a>
<li><a name="TOC3" href="#SEC3">SAVING COMPILED PATTERNS</a>
<li><a name="TOC4" href="#SEC4">RE-USING PRECOMPILED PATTERNS</a>
<li><a name="TOC5" href="#SEC5">AUTHOR</a>
<li><a name="TOC6" href="#SEC6">REVISION</a>
</ul>
<br><a name="SEC1" href="#TOC1">SAVING AND RE-USING PRECOMPILED PCRE2 PATTERNS</a><br>
<P>
@@ -48,7 +49,15 @@ and PCRE2_SIZE type. For example, patterns compiled on a 32-bit system using
PCRE2's 16-bit library cannot be reloaded on a 64-bit system, nor can they be
reloaded using the 8-bit library.
</P>
<br><a name="SEC2" href="#TOC1">SAVING COMPILED PATTERNS</a><br>
<br><a name="SEC2" href="#TOC1">SECURITY CONCERNS</a><br>
<P>
The facility for saving and restoring compiled patterns is intended for use
within individual applications. As such, the data supplied to
<b>pcre2_serialize_decode()</b> is expected to be trusted data, not data from
arbitrary external sources. There is only some simple consistency checking, not
complete validation of what is being re-loaded.
</P>
<br><a name="SEC3" href="#TOC1">SAVING COMPILED PATTERNS</a><br>
<P>
Before compiled patterns can be saved they must be serialized, that is,
converted to a stream of bytes. A single byte stream may contain any number of
@@ -110,7 +119,7 @@ still be used for matching. Their memory must eventually be freed in the usual
way by calling <b>pcre2_code_free()</b>. When you have finished with the byte
stream, it too must be freed by calling <b>pcre2_serialize_free()</b>.
</P>
<br><a name="SEC3" href="#TOC1">RE-USING PRECOMPILED PATTERNS</a><br>
<br><a name="SEC4" href="#TOC1">RE-USING PRECOMPILED PATTERNS</a><br>
<P>
In order to re-use a set of saved patterns you must first make the serialized
byte stream available in main memory (for example, by reading from a file). The
@@ -144,7 +153,8 @@ error codes:
<pre>
PCRE2_ERROR_BADDATA            second argument is zero or less
PCRE2_ERROR_BADMAGIC           mismatch of id bytes in the data
PCRE2_ERROR_BADMODE            mismatch of variable unit size or PCRE2 version
PCRE2_ERROR_BADMODE            mismatch of code unit size or PCRE2 version
PCRE2_ERROR_BADSERIALIZEDDATA  other sanity check failure
PCRE2_ERROR_MEMORY             memory allocation failed
PCRE2_ERROR_NULL               first or third argument is NULL
</pre>
@@ -169,7 +179,7 @@ serialized, the JIT data is discarded and so is no longer available after a
save/restore cycle. You can, however, process a restored pattern with
<b>pcre2_jit_compile()</b> if you wish.
</P>
<br><a name="SEC4" href="#TOC1">AUTHOR</a><br>
<br><a name="SEC5" href="#TOC1">AUTHOR</a><br>
<P>
Philip Hazel
<br>
@@ -178,11 +188,11 @@ University Computing Service
Cambridge, England.
<br>
</P>
<br><a name="SEC5" href="#TOC1">REVISION</a><br>
<br><a name="SEC6" href="#TOC1">REVISION</a><br>
<P>
Last updated: 03 November 2015
Last updated: 24 May 2016
<br>
Copyright © 1997-2015 University of Cambridge.
Copyright © 1997-2016 University of Cambridge.
<br>
<p>
Return to the <a href="index.html">PCRE2 index page</a>.

@@ -962,6 +962,7 @@ for a description of their effects.
anchored      set PCRE2_ANCHORED
dfa_restart   set PCRE2_DFA_RESTART
dfa_shortest  set PCRE2_DFA_SHORTEST
no_jit        set PCRE2_NO_JIT
no_utf_check  set PCRE2_NO_UTF_CHECK
notbol        set PCRE2_NOTBOL
notempty      set PCRE2_NOTEMPTY
@@ -1697,7 +1698,7 @@ Cambridge, England.
</P>
<br><a name="SEC21" href="#TOC1">REVISION</a><br>
<P>
Last updated: 06 February 2016
Last updated: 05 June 2016
<br>
Copyright © 1997-2016 University of Cambridge.
<br>

@@ -489,10 +489,10 @@ PCRE2 API OVERVIEW
pcre2_jit_stack_assign() in order to control the JIT code's memory
usage.

JIT matching is automatically used by pcre2_match() if it is available.
There is also a direct interface for JIT matching, which gives improved
performance. The JIT-specific functions are discussed in the pcre2jit
documentation.
JIT matching is automatically used by pcre2_match() if it is available,
unless the PCRE2_NO_JIT option is set. There is also a direct interface
for JIT matching, which gives improved performance. The JIT-specific
functions are discussed in the pcre2jit documentation.

A second matching function, pcre2_dfa_match(), which is not Perl-com-
patible, is also provided. This uses a different algorithm for the
@@ -1649,10 +1649,15 @@ INFORMATION ABOUT A COMPILED PATTERN
Return a copy of the pattern's options. The third argument should point
to a uint32_t variable. PCRE2_INFO_ARGOPTIONS returns exactly the
options that were passed to pcre2_compile(), whereas PCRE2_INFO_ALLOP-
TIONS returns the compile options as modified by any top-level option
settings such as (*UTF) at the start of the pattern itself. For exam-
ple, if the pattern /(*UTF)abc/ is compiled with the PCRE2_EXTENDED
option, the result is PCRE2_EXTENDED and PCRE2_UTF.
TIONS returns the compile options as modified by any top-level (*XXX)
option settings such as (*UTF) at the start of the pattern itself.

For example, if the pattern /(*UTF)abc/ is compiled with the
PCRE2_EXTENDED option, the result for PCRE2_INFO_ALLOPTIONS is
PCRE2_EXTENDED and PCRE2_UTF. Option settings such as (?i) that can
change within a pattern do not affect the result of PCRE2_INFO_ALLOP-
TIONS, even if they appear right at the start of the pattern. (This was
different in some earlier releases.)

A pattern compiled without PCRE2_ANCHORED is automatically anchored by
PCRE2 if the first significant item in every top-level branch is one of
@@ -2092,14 +2097,15 @@ MATCHING A PATTERN: THE TRADITIONAL FUNCTION

The unused bits of the options argument for pcre2_match() must be zero.
The only bits that may be set are PCRE2_ANCHORED, PCRE2_NOTBOL,
PCRE2_NOTEOL, PCRE2_NOTEMPTY, PCRE2_NOTEMPTY_ATSTART,
PCRE2_NOTEOL, PCRE2_NOTEMPTY, PCRE2_NOTEMPTY_ATSTART, PCRE2_NO_JIT,
PCRE2_NO_UTF_CHECK, PCRE2_PARTIAL_HARD, and PCRE2_PARTIAL_SOFT. Their
action is described below.

Setting PCRE2_ANCHORED at match time is not supported by the just-in-
time (JIT) compiler. If it is set, JIT matching is disabled and the
normal interpretive code in pcre2_match() is run. The remaining options
are supported for JIT matching.
normal interpretive code in pcre2_match() is run. Apart from
PCRE2_NO_JIT (obviously), the remaining options are supported for JIT
matching.

PCRE2_ANCHORED

@@ -2148,6 +2154,13 @@ MATCHING A PATTERN: THE TRADITIONAL FUNCTION
subject is permitted. If the pattern is anchored, such a match can
occur only if the pattern contains \K.

PCRE2_NO_JIT

By default, if a pattern has been successfully processed by
pcre2_jit_compile(), JIT is automatically used when pcre2_match() is
called with options that JIT supports. Setting PCRE2_NO_JIT disables
the use of JIT; it forces matching to be done by the interpreter.

PCRE2_NO_UTF_CHECK

When PCRE2_UTF is set at compile time, the validity of the subject as a
@@ -3109,7 +3122,7 @@ AUTHOR

REVISION

Last updated: 26 February 2016
Last updated: 05 June 2016
Copyright (c) 1997-2016 University of Cambridge.
------------------------------------------------------------------------------

@@ -3437,6 +3450,15 @@ USING EBCDIC CODE
an EBCDIC environment.


PCRE2GREP SUPPORT FOR EXTERNAL SCRIPTS

By default, on non-Windows systems, pcre2grep supports the use of call-
outs with string arguments within the patterns it is matching, in order
to run external scripts. For details, see the pcre2grep documentation.
This support can be disabled by adding --disable-pcre2grep-callout to
the configure command.


PCRE2GREP OPTIONS FOR COMPRESSED FILE SUPPORT

By default, pcre2grep reads all files as plain text. You can build it
@@ -3464,7 +3486,7 @@ PCRE2GREP BUFFER SIZE
--with-pcre2grep-bufsize=50K

to the configure command. The caller of pcre2grep can override this
value by using --buffer-size on the command line..
value by using --buffer-size on the command line.


PCRE2TEST OPTION FOR LIBREADLINE SUPPORT
@@ -3593,8 +3615,8 @@ AUTHOR

REVISION

Last updated: 16 October 2015
Copyright (c) 1997-2015 University of Cambridge.
Last updated: 01 April 2016
Copyright (c) 1997-2016 University of Cambridge.
------------------------------------------------------------------------------


@@ -4272,6 +4294,9 @@ UNSUPPORTED OPTIONS AND PATTERN ITEMS
PCRE2_NO_UTF_CHECK, PCRE2_PARTIAL_HARD, and PCRE2_PARTIAL_SOFT. The
PCRE2_ANCHORED option is not supported at match time.

If the PCRE2_NO_JIT option is passed to pcre2_match() it disables the
use of JIT, forcing matching by the interpreter code.

The only unsupported pattern items are \C (match a single data unit)
when running in a UTF mode, and a callout immediately before an asser-
tion condition in a conditional group.
@@ -4508,7 +4533,8 @@ JIT FAST PATH API
exactly the same arguments as pcre2_match(). The return values are also
the same, plus PCRE2_ERROR_JIT_BADOPTION if a matching mode (partial or
complete) is requested that was not compiled. Unsupported option bits
(for example, PCRE2_ANCHORED) are ignored.
(for example, PCRE2_ANCHORED) are ignored, as is the PCRE2_NO_JIT
option.

When you call pcre2_match(), as well as testing for invalid options, a
number of other sanity checks are performed on the arguments. For exam-
@@ -4535,8 +4561,8 @@ AUTHOR

REVISION

Last updated: 14 November 2015
Copyright (c) 1997-2015 University of Cambridge.
Last updated: 05 June 2016
Copyright (c) 1997-2016 University of Cambridge.
------------------------------------------------------------------------------


@@ -8884,6 +8910,15 @@ SAVING AND RE-USING PRECOMPILED PCRE2 PATTERNS
using the 8-bit library.


SECURITY CONCERNS

The facility for saving and restoring compiled patterns is intended for
use within individual applications. As such, the data supplied to
pcre2_serialize_decode() is expected to be trusted data, not data from
arbitrary external sources. There is only some simple consistency
checking, not complete validation of what is being re-loaded.


SAVING COMPILED PATTERNS

Before compiled patterns can be saved they must be serialized, that is,
@@ -8979,7 +9014,8 @@ RE-USING PRECOMPILED PATTERNS

PCRE2_ERROR_BADDATA            second argument is zero or less
PCRE2_ERROR_BADMAGIC           mismatch of id bytes in the data
PCRE2_ERROR_BADMODE            mismatch of variable unit size or PCRE2 version
PCRE2_ERROR_BADMODE            mismatch of code unit size or PCRE2 version
PCRE2_ERROR_BADSERIALIZEDDATA  other sanity check failure
PCRE2_ERROR_MEMORY             memory allocation failed
PCRE2_ERROR_NULL               first or third argument is NULL

@@ -9013,8 +9049,8 @@ AUTHOR

REVISION

Last updated: 03 November 2015
Copyright (c) 1997-2015 University of Cambridge.
Last updated: 24 May 2016
Copyright (c) 1997-2016 University of Cambridge.
------------------------------------------------------------------------------


@@ -1,4 +1,4 @@
.TH PCRE2API 3 "27 February 2016" "PCRE2 10.22"
.TH PCRE2API 3 "05 June 2016" "PCRE2 10.22"
.SH NAME
PCRE2 - Perl-compatible regular expressions (revised API)
.sp
@@ -354,9 +354,10 @@ More complicated programs might need to make use of the specialist functions
\fBpcre2_jit_stack_create()\fP, \fBpcre2_jit_stack_free()\fP, and
\fBpcre2_jit_stack_assign()\fP in order to control the JIT code's memory usage.
.P
JIT matching is automatically used by \fBpcre2_match()\fP if it is available.
There is also a direct interface for JIT matching, which gives improved
performance. The JIT-specific functions are discussed in the
JIT matching is automatically used by \fBpcre2_match()\fP if it is available,
unless the PCRE2_NO_JIT option is set. There is also a direct interface for JIT
matching, which gives improved performance. The JIT-specific functions are
discussed in the
.\" HREF
\fBpcre2jit\fP
.\"
@@ -2110,13 +2111,14 @@ pattern does not require the match to be at the start of the subject.
.sp
The unused bits of the \fIoptions\fP argument for \fBpcre2_match()\fP must be
zero. The only bits that may be set are PCRE2_ANCHORED, PCRE2_NOTBOL,
PCRE2_NOTEOL, PCRE2_NOTEMPTY, PCRE2_NOTEMPTY_ATSTART, PCRE2_NO_UTF_CHECK,
PCRE2_PARTIAL_HARD, and PCRE2_PARTIAL_SOFT. Their action is described below.
PCRE2_NOTEOL, PCRE2_NOTEMPTY, PCRE2_NOTEMPTY_ATSTART, PCRE2_NO_JIT,
PCRE2_NO_UTF_CHECK, PCRE2_PARTIAL_HARD, and PCRE2_PARTIAL_SOFT. Their action is
described below.
.P
Setting PCRE2_ANCHORED at match time is not supported by the just-in-time (JIT)
compiler. If it is set, JIT matching is disabled and the normal interpretive
code in \fBpcre2_match()\fP is run. The remaining options are supported for JIT
matching.
code in \fBpcre2_match()\fP is run. Apart from PCRE2_NO_JIT (obviously), the
remaining options are supported for JIT matching.
.sp
PCRE2_ANCHORED
.sp
@@ -2163,6 +2165,13 @@ only at the first matching position, that is, at the start of the subject plus
the starting offset. An empty string match later in the subject is permitted.
If the pattern is anchored, such a match can occur only if the pattern contains
\eK.
.sp
PCRE2_NO_JIT
.sp
By default, if a pattern has been successfully processed by
\fBpcre2_jit_compile()\fP, JIT is automatically used when \fBpcre2_match()\fP
is called with options that JIT supports. Setting PCRE2_NO_JIT disables the use
of JIT; it forces matching to be done by the interpreter.
.sp
PCRE2_NO_UTF_CHECK
.sp
@@ -3233,6 +3242,6 @@ Cambridge, England.
.rs
.sp
.nf
Last updated: 27 February 2016
Last updated: 05 June 2016
Copyright (c) 1997-2016 University of Cambridge.
.fi

@@ -1,4 +1,4 @@
.TH PCRE2JIT 3 "14 November 2015" "PCRE2 10.21"
.TH PCRE2JIT 3 "05 June 2016" "PCRE2 10.22"
.SH NAME
PCRE2 - Perl-compatible regular expressions (revised API)
.SH "PCRE2 JUST-IN-TIME COMPILER SUPPORT"
@@ -128,6 +128,9 @@ PCRE2_NOTBOL, PCRE2_NOTEOL, PCRE2_NOTEMPTY, PCRE2_NOTEMPTY_ATSTART,
PCRE2_NO_UTF_CHECK, PCRE2_PARTIAL_HARD, and PCRE2_PARTIAL_SOFT. The
PCRE2_ANCHORED option is not supported at match time.
.P
If the PCRE2_NO_JIT option is passed to \fBpcre2_match()\fP it disables the
use of JIT, forcing matching by the interpreter code.
.P
The only unsupported pattern items are \eC (match a single data unit) when
running in a UTF mode, and a callout immediately before an assertion condition
in a conditional group.
@@ -377,7 +380,7 @@ The fast path function is called \fBpcre2_jit_match()\fP, and it takes exactly
the same arguments as \fBpcre2_match()\fP. The return values are also the same,
plus PCRE2_ERROR_JIT_BADOPTION if a matching mode (partial or complete) is
requested that was not compiled. Unsupported option bits (for example,
PCRE2_ANCHORED) are ignored.
PCRE2_ANCHORED) are ignored, as is the PCRE2_NO_JIT option.
.P
When you call \fBpcre2_match()\fP, as well as testing for invalid options, a
number of other sanity checks are performed on the arguments. For example, if
@ -410,6 +413,6 @@ Cambridge, England.
|
|||
.rs
|
||||
.sp
|
||||
.nf
|
||||
Last updated: 14 November 2015
|
||||
Copyright (c) 1997-2015 University of Cambridge.
|
||||
Last updated: 05 June 2016
|
||||
Copyright (c) 1997-2016 University of Cambridge.
|
||||
.fi
|
||||
|
|
|
@ -1,4 +1,4 @@
|
|||
.TH PCRE2TEST 1 "26 February 2016" "PCRE 10.22"
|
||||
.TH PCRE2TEST 1 "05 June 2016" "PCRE 10.22"
|
||||
.SH NAME
|
||||
pcre2test - a program for testing Perl-compatible regular expressions.
|
||||
.SH SYNOPSIS
|
||||
|
@ -931,6 +931,7 @@ for a description of their effects.
|
|||
anchored set PCRE2_ANCHORED
|
||||
dfa_restart set PCRE2_DFA_RESTART
|
||||
dfa_shortest set PCRE2_DFA_SHORTEST
|
||||
no_jit set PCRE2_NO_JIT
|
||||
no_utf_check set PCRE2_NO_UTF_CHECK
|
||||
notbol set PCRE2_NOTBOL
|
||||
notempty set PCRE2_NOTEMPTY
|
||||
|
@ -1674,6 +1675,6 @@ Cambridge, England.
|
|||
.rs
|
||||
.sp
|
||||
.nf
|
||||
Last updated: 06 February 2016
|
||||
Last updated: 05 June 2016
|
||||
Copyright (c) 1997-2016 University of Cambridge.
|
||||
.fi
|
||||
|
|
|
@ -859,6 +859,7 @@ SUBJECT MODIFIERS
|
|||
anchored set PCRE2_ANCHORED
|
||||
dfa_restart set PCRE2_DFA_RESTART
|
||||
dfa_shortest set PCRE2_DFA_SHORTEST
|
||||
no_jit set PCRE2_NO_JIT
|
||||
no_utf_check set PCRE2_NO_UTF_CHECK
|
||||
notbol set PCRE2_NOTBOL
|
||||
notempty set PCRE2_NOTEMPTY
|
||||
|
@ -1538,5 +1539,5 @@ AUTHOR
|
|||
|
||||
REVISION
|
||||
|
||||
Last updated: 06 February 2016
|
||||
Last updated: 05 June 2016
|
||||
Copyright (c) 1997-2016 University of Cambridge.
|
||||
|
|
|
@ -146,7 +146,8 @@ sanity checks). */
|
|||
#define PCRE2_DFA_RESTART 0x00000040u
|
||||
#define PCRE2_DFA_SHORTEST 0x00000080u
|
||||
|
||||
/* These are additional options for pcre2_substitute(). */
|
||||
/* These are additional options for pcre2_substitute(), which passes any others
|
||||
through to pcre2_match(). */
|
||||
|
||||
#define PCRE2_SUBSTITUTE_GLOBAL 0x00000100u
|
||||
#define PCRE2_SUBSTITUTE_EXTENDED 0x00000200u
|
||||
|
@ -154,6 +155,11 @@ sanity checks). */
|
|||
#define PCRE2_SUBSTITUTE_UNKNOWN_UNSET 0x00000800u
|
||||
#define PCRE2_SUBSTITUTE_OVERFLOW_LENGTH 0x00001000u
|
||||
|
||||
/* A further option for pcre2_match(), not allowed for pcre2_dfa_match(),
|
||||
ignored for pcre2_jit_match(). */
|
||||
|
||||
#define PCRE2_NO_JIT 0x00002000u
|
||||
|
||||
/* Newline and \R settings, for use in compile contexts. The newline values
|
||||
must be kept in step with values set in config.h and both sets must all be
|
||||
greater than zero. */
|
||||
|
|
|
@ -146,7 +146,8 @@ sanity checks). */
|
|||
#define PCRE2_DFA_RESTART 0x00000040u
|
||||
#define PCRE2_DFA_SHORTEST 0x00000080u
|
||||
|
||||
/* These are additional options for pcre2_substitute(). */
|
||||
/* These are additional options for pcre2_substitute(), which passes any others
|
||||
through to pcre2_match(). */
|
||||
|
||||
#define PCRE2_SUBSTITUTE_GLOBAL 0x00000100u
|
||||
#define PCRE2_SUBSTITUTE_EXTENDED 0x00000200u
|
||||
|
@ -154,6 +155,11 @@ sanity checks). */
|
|||
#define PCRE2_SUBSTITUTE_UNKNOWN_UNSET 0x00000800u
|
||||
#define PCRE2_SUBSTITUTE_OVERFLOW_LENGTH 0x00001000u
|
||||
|
||||
/* A further option for pcre2_match(), not allowed for pcre2_dfa_match(),
|
||||
ignored for pcre2_jit_match(). */
|
||||
|
||||
#define PCRE2_NO_JIT 0x00002000u
|
||||
|
||||
/* Newline and \R settings, for use in compile contexts. The newline values
|
||||
must be kept in step with values set in config.h and both sets must all be
|
||||
greater than zero. */
|
||||
|
|
|
@ -55,7 +55,7 @@ POSSIBILITY OF SUCH DAMAGE.
|
|||
#define PUBLIC_MATCH_OPTIONS \
|
||||
(PCRE2_ANCHORED|PCRE2_NOTBOL|PCRE2_NOTEOL|PCRE2_NOTEMPTY| \
|
||||
PCRE2_NOTEMPTY_ATSTART|PCRE2_NO_UTF_CHECK|PCRE2_PARTIAL_HARD| \
|
||||
PCRE2_PARTIAL_SOFT)
|
||||
PCRE2_PARTIAL_SOFT|PCRE2_NO_JIT)
|
||||
|
||||
#define PUBLIC_JIT_MATCH_OPTIONS \
|
||||
(PCRE2_NO_UTF_CHECK|PCRE2_NOTBOL|PCRE2_NOTEOL|PCRE2_NOTEMPTY|\
|
||||
|
|
|
@ -586,6 +586,7 @@ static modstruct modlist[] = {
|
|||
{ "no_auto_capture", MOD_PAT, MOD_OPT, PCRE2_NO_AUTO_CAPTURE, PO(options) },
|
||||
{ "no_auto_possess", MOD_PATP, MOD_OPT, PCRE2_NO_AUTO_POSSESS, PO(options) },
|
||||
{ "no_dotstar_anchor", MOD_PAT, MOD_OPT, PCRE2_NO_DOTSTAR_ANCHOR, PO(options) },
|
||||
{ "no_jit", MOD_DAT, MOD_OPT, PCRE2_NO_JIT, DO(options) },
|
||||
{ "no_start_optimize", MOD_PATP, MOD_OPT, PCRE2_NO_START_OPTIMIZE, PO(options) },
|
||||
{ "no_utf_check", MOD_PD, MOD_OPT, PCRE2_NO_UTF_CHECK, PD(options) },
|
||||
{ "notbol", MOD_DAT, MOD_OPT, PCRE2_NOTBOL, DO(options) },
|
||||
|
|
|
@ -279,4 +279,14 @@
|
|||
/(.|.)*?bx/
|
||||
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaabax
|
||||
|
||||
# Test JIT disable
|
||||
|
||||
/abc/
|
||||
abc
|
||||
abc\=no_jit
|
||||
|
||||
/abc/jitfast
|
||||
abc
|
||||
abc\=no_jit
|
||||
|
||||
# End of testinput17
|
||||
|
|
|
@ -517,4 +517,18 @@ Failed: error -46: JIT stack limit reached
|
|||
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaabax
|
||||
Failed: error -47: match limit exceeded
|
||||
|
||||
# Test JIT disable
|
||||
|
||||
/abc/
|
||||
abc
|
||||
0: abc (JIT)
|
||||
abc\=no_jit
|
||||
0: abc
|
||||
|
||||
/abc/jitfast
|
||||
abc
|
||||
0: abc (JIT)
|
||||
abc\=no_jit
|
||||
0: abc (JIT)
|
||||
|
||||
# End of testinput17
|
||||
|
|
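Note on the hunks above: the only source change outside the headers and pcre2test is the addition of PCRE2_NO_JIT to PUBLIC_MATCH_OPTIONS, while PUBLIC_JIT_MATCH_OPTIONS is left alone. One plausible reading is that pcre2_match() picks JIT only when every requested option is in the JIT-supported set, so an accepted but JIT-unsupported bit falls through to the interpreter. The sketch below models that reading without linking against PCRE2. The dispatch condition, the helper names (match_engine, jit_match_engine, self_check) and the engine_t type are assumptions made for illustration and are not taken from this commit. Of the option values, only PCRE2_NO_JIT appears in the diff; the rest are quoted from memory of pcre2.h and should be checked against the real header.

```c
#include <assert.h>
#include <stdint.h>

/* Local copies of option bits so the sketch compiles without pcre2.h.
   Only PCRE2_NO_JIT is taken from the diff; the others are assumed. */
#define PCRE2_NOTBOL            0x00000001u
#define PCRE2_NOTEOL            0x00000002u
#define PCRE2_NOTEMPTY          0x00000004u
#define PCRE2_NOTEMPTY_ATSTART  0x00000008u
#define PCRE2_PARTIAL_SOFT      0x00000010u
#define PCRE2_PARTIAL_HARD      0x00000020u
#define PCRE2_NO_JIT            0x00002000u
#define PCRE2_NO_UTF_CHECK      0x40000000u
#define PCRE2_ANCHORED          0x80000000u

/* Match-time options that the pcre2jit page lists as supported by JIT. */
#define PUBLIC_JIT_MATCH_OPTIONS \
  (PCRE2_NO_UTF_CHECK|PCRE2_NOTBOL|PCRE2_NOTEOL|PCRE2_NOTEMPTY| \
   PCRE2_NOTEMPTY_ATSTART|PCRE2_PARTIAL_SOFT|PCRE2_PARTIAL_HARD)

typedef enum { ENGINE_INTERPRETER, ENGINE_JIT } engine_t;

/* Hypothetical model of the choice made inside pcre2_match(): JIT is used
   only if JIT code exists and no option outside the JIT set is requested. */
static engine_t match_engine(int has_jit_code, uint32_t options)
{
  if (has_jit_code && (options & ~PUBLIC_JIT_MATCH_OPTIONS) == 0)
    return ENGINE_JIT;
  return ENGINE_INTERPRETER;
}

/* Hypothetical model of pcre2_jit_match(): the fast path always runs JIT
   code and ignores unsupported bits, PCRE2_NO_JIT included. */
static engine_t jit_match_engine(uint32_t options)
{
  (void)options;
  return ENGINE_JIT;
}

/* Mirrors the four data lines added to testinput17, plus PCRE2_ANCHORED. */
static void self_check(void)
{
  assert(match_engine(1, 0) == ENGINE_JIT);                    /* abc          */
  assert(match_engine(1, PCRE2_NO_JIT) == ENGINE_INTERPRETER); /* abc\=no_jit  */
  assert(jit_match_engine(0) == ENGINE_JIT);                   /* jitfast      */
  assert(jit_match_engine(PCRE2_NO_JIT) == ENGINE_JIT);        /* still JIT    */
  assert(match_engine(1, PCRE2_ANCHORED) == ENGINE_INTERPRETER);
}
```

Calling self_check() from a test harness exercises the same cases as the new testoutput17 lines: plain matching reports "(JIT)", no_jit drops the marker, and under jitfast the option has no effect.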