Implement PCRE2_NO_JIT, update HTML docs as well.
parent
afa3c56afd
commit
d243224a60
|
@ -128,6 +128,8 @@ Memcheck warnings Addr16 and Cond in unknown objects (that is, JIT-compiled
|
|||
code). Also changed smc-check=all to smc-check=all-non-file as was done for
|
||||
RunTest (see 4 above).
|
||||
|
||||
32. Implemented the PCRE2_NO_JIT option for pcre2_match().
|
||||
|
||||
|
||||
Version 10.21 12-January-2016
|
||||
-----------------------------
|
||||
|
|
|
@ -168,15 +168,12 @@ library. They are also documented in the pcre2build man page.
|
|||
built. If you want only the 16-bit or 32-bit library, use --disable-pcre2-8
|
||||
to disable building the 8-bit library.
|
||||
|
||||
. If you want to include support for just-in-time compiling, which can give
|
||||
large performance improvements on certain platforms, add --enable-jit to the
|
||||
"configure" command. This support is available only for certain hardware
|
||||
. If you want to include support for just-in-time (JIT) compiling, which can
|
||||
give large performance improvements on certain platforms, add --enable-jit to
|
||||
the "configure" command. This support is available only for certain hardware
|
||||
architectures. If you try to enable it on an unsupported architecture, there
|
||||
will be a compile time error.
|
||||
|
||||
. When JIT support is enabled, pcre2grep automatically makes use of it, unless
|
||||
you add --disable-pcre2grep-jit to the "configure" command.
|
||||
|
||||
. If you do not want to make use of the support for UTF-8 Unicode character
|
||||
strings in the 8-bit library, UTF-16 Unicode character strings in the 16-bit
|
||||
library, or UTF-32 Unicode character strings in the 32-bit library, you can
|
||||
|
@ -324,6 +321,14 @@ library. They are also documented in the pcre2build man page.
|
|||
running "make" to build PCRE2. There is more information about coverage
|
||||
reporting in the "pcre2build" documentation.
|
||||
|
||||
. When JIT support is enabled, pcre2grep automatically makes use of it, unless
|
||||
you add --disable-pcre2grep-jit to the "configure" command.
|
||||
|
||||
. On non-Windows systems there is support for calling external scripts during
|
||||
matching in the pcre2grep command via PCRE2's callout facility with string
|
||||
arguments. This support can be disabled by adding --disable-pcre2grep-callout
|
||||
to the "configure" command.
|
||||
|
||||
. The pcre2grep program currently supports only 8-bit data files, and so
|
||||
requires the 8-bit PCRE2 library. It is possible to compile pcre2grep to use
|
||||
libz and/or libbz2, in order to read .gz and .bz2 files (respectively), by
|
||||
|
@ -840,4 +845,4 @@ The distribution should contain the files listed below.
|
|||
Philip Hazel
|
||||
Email local part: ph10
|
||||
Email domain: cam.ac.uk
|
||||
Last updated: 16 October 2015
|
||||
Last updated: 01 April 2016
|
||||
|
|
|
@ -417,9 +417,10 @@ More complicated programs might need to make use of the specialist functions
|
|||
<b>pcre2_jit_stack_assign()</b> in order to control the JIT code's memory usage.
|
||||
</P>
|
||||
<P>
|
||||
JIT matching is automatically used by <b>pcre2_match()</b> if it is available.
|
||||
There is also a direct interface for JIT matching, which gives improved
|
||||
performance. The JIT-specific functions are discussed in the
|
||||
JIT matching is automatically used by <b>pcre2_match()</b> if it is available,
|
||||
unless the PCRE2_NO_JIT option is set. There is also a direct interface for JIT
|
||||
matching, which gives improved performance. The JIT-specific functions are
|
||||
discussed in the
|
||||
<a href="pcre2jit.html"><b>pcre2jit</b></a>
|
||||
documentation.
|
||||
</P>
|
||||
|
@ -555,7 +556,7 @@ least until a pattern has been compiled. The logic can be something like this:
|
|||
Get a write (unique) lock for pointer
|
||||
pointer = pcre2_compile(...
|
||||
}
|
||||
Release the lock
|
||||
Release the lock
|
||||
Use pointer in pcre2_match()
|
||||
</pre>
|
||||
Of course, testing for compilation errors should also be included in the code.
|
||||
|
@ -563,9 +564,9 @@ Of course, testing for compilation errors should also be included in the code.
|
|||
<P>
|
||||
If JIT is being used, but the JIT compilation is not being done immediately,
|
||||
(perhaps waiting to see if the pattern is used often enough) similar logic is
|
||||
required. JIT compilation updates a pointer within the compiled code block, so
|
||||
a thread must gain unique write access to the pointer before calling
|
||||
<b>pcre2_jit_compile()</b>. Alternatively, <b>pcre2_code_copy()</b> can be used
|
||||
required. JIT compilation updates a pointer within the compiled code block, so
|
||||
a thread must gain unique write access to the pointer before calling
|
||||
<b>pcre2_jit_compile()</b>. Alternatively, <b>pcre2_code_copy()</b> can be used
|
||||
to obtain a private copy of the compiled code.
|
||||
</P>
|
||||
<br><b>
|
||||
|
@ -1062,7 +1063,7 @@ The <b>pcre2_compile()</b> function compiles a pattern into an internal form.
|
|||
The pattern is defined by a pointer to a string of code units and a length. If
|
||||
the pattern is zero-terminated, the length can be specified as
|
||||
PCRE2_ZERO_TERMINATED. The function returns a pointer to a block of memory that
|
||||
contains the compiled pattern and related data.
|
||||
contains the compiled pattern and related data.
|
||||
</P>
|
||||
<P>
|
||||
If the compile context argument <i>ccontext</i> is NULL, memory for the compiled
|
||||
|
@ -1071,12 +1072,12 @@ the same memory function that was used for the compile context. The caller must
|
|||
free the memory by calling <b>pcre2_code_free()</b> when it is no longer needed.
|
||||
</P>
|
||||
<P>
|
||||
The function <b>pcre2_code_copy()</b> makes a copy of the compiled code in new
|
||||
memory, using the same memory allocator as was used for the original. However,
|
||||
The function <b>pcre2_code_copy()</b> makes a copy of the compiled code in new
|
||||
memory, using the same memory allocator as was used for the original. However,
|
||||
if the code has been processed by the JIT compiler (see
|
||||
<a href="#jitcompiling">below),</a>
|
||||
the JIT information cannot be copied (because it is position-dependent).
|
||||
The new copy can initially be used only for non-JIT matching, though it can be
|
||||
the JIT information cannot be copied (because it is position-dependent).
|
||||
The new copy can initially be used only for non-JIT matching, though it can be
|
||||
passed to <b>pcre2_jit_compile()</b> if required. The <b>pcre2_code_copy()</b>
|
||||
function provides a way for individual threads in a multithreaded application
|
||||
to acquire a private copy of shared compiled code.
|
||||
|
@ -1630,10 +1631,15 @@ are as follows:
|
|||
Return a copy of the pattern's options. The third argument should point to a
|
||||
<b>uint32_t</b> variable. PCRE2_INFO_ARGOPTIONS returns exactly the options that
|
||||
were passed to <b>pcre2_compile()</b>, whereas PCRE2_INFO_ALLOPTIONS returns
|
||||
the compile options as modified by any top-level option settings such as (*UTF)
|
||||
at the start of the pattern itself. For example, if the pattern /(*UTF)abc/ is
|
||||
compiled with the PCRE2_EXTENDED option, the result is PCRE2_EXTENDED and
|
||||
PCRE2_UTF.
|
||||
the compile options as modified by any top-level (*XXX) option settings such as
|
||||
(*UTF) at the start of the pattern itself.
|
||||
</P>
|
||||
<P>
|
||||
For example, if the pattern /(*UTF)abc/ is compiled with the PCRE2_EXTENDED
|
||||
option, the result for PCRE2_INFO_ALLOPTIONS is PCRE2_EXTENDED and PCRE2_UTF.
|
||||
Option settings such as (?i) that can change within a pattern do not affect the
|
||||
result of PCRE2_INFO_ALLOPTIONS, even if they appear right at the start of the
|
||||
pattern. (This was different in some earlier releases.)
|
||||
</P>
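The difference between the two option queries can be illustrated with a toy model. This is only a sketch, not how PCRE2 computes the value: the helper name toy_alloptions() is hypothetical, it recognizes only a leading (*UTF), and the two option values are assumptions recalled from pcre2.h rather than taken from this diff.

```c
#include <stdint.h>
#include <string.h>

/* Assumed values, recalled from pcre2.h (not shown in this diff). */
#define PCRE2_EXTENDED 0x00000080u
#define PCRE2_UTF      0x00080000u

/* Hypothetical helper: models what PCRE2_INFO_ALLOPTIONS would report for a
   pattern compiled with 'argoptions'. Only a leading (*UTF) setting is
   recognized; in-pattern settings such as (?i) leave the result unchanged. */
static uint32_t toy_alloptions(const char *pattern, uint32_t argoptions)
{
  uint32_t all = argoptions;          /* what PCRE2_INFO_ARGOPTIONS reports */
  if (strncmp(pattern, "(*UTF)", 6) == 0)
    all |= PCRE2_UTF;                 /* top-level setting is folded in */
  return all;
}
```

Under this model, "(*UTF)abc" with PCRE2_EXTENDED yields both bits, while "(?i)abc" with PCRE2_EXTENDED yields PCRE2_EXTENDED alone.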
|
||||
<P>
|
||||
A pattern compiled without PCRE2_ANCHORED is automatically anchored by PCRE2 if
|
||||
|
@ -2088,14 +2094,15 @@ Option bits for <b>pcre2_match()</b>
|
|||
<P>
|
||||
The unused bits of the <i>options</i> argument for <b>pcre2_match()</b> must be
|
||||
zero. The only bits that may be set are PCRE2_ANCHORED, PCRE2_NOTBOL,
|
||||
PCRE2_NOTEOL, PCRE2_NOTEMPTY, PCRE2_NOTEMPTY_ATSTART, PCRE2_NO_UTF_CHECK,
|
||||
PCRE2_PARTIAL_HARD, and PCRE2_PARTIAL_SOFT. Their action is described below.
|
||||
PCRE2_NOTEOL, PCRE2_NOTEMPTY, PCRE2_NOTEMPTY_ATSTART, PCRE2_NO_JIT,
|
||||
PCRE2_NO_UTF_CHECK, PCRE2_PARTIAL_HARD, and PCRE2_PARTIAL_SOFT. Their action is
|
||||
described below.
|
||||
</P>
|
||||
<P>
|
||||
Setting PCRE2_ANCHORED at match time is not supported by the just-in-time (JIT)
|
||||
compiler. If it is set, JIT matching is disabled and the normal interpretive
|
||||
code in <b>pcre2_match()</b> is run. The remaining options are supported for JIT
|
||||
matching.
|
||||
code in <b>pcre2_match()</b> is run. Apart from PCRE2_NO_JIT (obviously), the
|
||||
remaining options are supported for JIT matching.
|
||||
<pre>
|
||||
PCRE2_ANCHORED
|
||||
</pre>
|
||||
|
@ -2142,6 +2149,13 @@ only at the first matching position, that is, at the start of the subject plus
|
|||
the starting offset. An empty string match later in the subject is permitted.
|
||||
If the pattern is anchored, such a match can occur only if the pattern contains
|
||||
\K.
|
||||
<pre>
|
||||
PCRE2_NO_JIT
|
||||
</pre>
|
||||
By default, if a pattern has been successfully processed by
|
||||
<b>pcre2_jit_compile()</b>, JIT is automatically used when <b>pcre2_match()</b>
|
||||
is called with options that JIT supports. Setting PCRE2_NO_JIT disables the use
|
||||
of JIT; it forces matching to be done by the interpreter.
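The decision described above can be sketched as a small predicate. This is an illustration under stated assumptions, not the library's internal code: would_use_jit() is a hypothetical name, the PCRE2_NO_JIT value comes from the pcre2.h hunk in this commit, and the PCRE2_ANCHORED value is an assumption recalled from pcre2.h.

```c
#include <stdint.h>

#define PCRE2_ANCHORED 0x80000000u  /* assumed value, recalled from pcre2.h */
#define PCRE2_NO_JIT   0x00002000u  /* value from the pcre2.h hunk in this commit */

/* Hypothetical helper: returns 1 if a pcre2_match() call would be expected to
   take the JIT path, 0 if it would run the interpreter instead. */
static int would_use_jit(int jit_compiled, uint32_t options)
{
  if (!jit_compiled) return 0;             /* pcre2_jit_compile() not done */
  if (options & PCRE2_NO_JIT) return 0;    /* caller explicitly disabled JIT */
  if (options & PCRE2_ANCHORED) return 0;  /* not supported by JIT at match time */
  return 1;
}
```

A caller that wants to compare JIT and interpreter results for the same JIT-compiled pattern would match once with options 0 and once with PCRE2_NO_JIT.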
|
||||
<pre>
|
||||
PCRE2_NO_UTF_CHECK
|
||||
</pre>
|
||||
|
@ -3184,7 +3198,7 @@ Cambridge, England.
|
|||
</P>
|
||||
<br><a name="SEC40" href="#TOC1">REVISION</a><br>
|
||||
<P>
|
||||
Last updated: 26 February 2016
|
||||
Last updated: 05 June 2016
|
||||
<br>
|
||||
Copyright © 1997-2016 University of Cambridge.
|
||||
<br>
|
||||
|
|
|
@ -27,15 +27,16 @@ please consult the man page, in case the conversion went wrong.
|
|||
<li><a name="TOC12" href="#SEC12">LIMITING PCRE2 RESOURCE USAGE</a>
|
||||
<li><a name="TOC13" href="#SEC13">CREATING CHARACTER TABLES AT BUILD TIME</a>
|
||||
<li><a name="TOC14" href="#SEC14">USING EBCDIC CODE</a>
|
||||
<li><a name="TOC15" href="#SEC15">PCRE2GREP OPTIONS FOR COMPRESSED FILE SUPPORT</a>
|
||||
<li><a name="TOC16" href="#SEC16">PCRE2GREP BUFFER SIZE</a>
|
||||
<li><a name="TOC17" href="#SEC17">PCRE2TEST OPTION FOR LIBREADLINE SUPPORT</a>
|
||||
<li><a name="TOC18" href="#SEC18">INCLUDING DEBUGGING CODE</a>
|
||||
<li><a name="TOC19" href="#SEC19">DEBUGGING WITH VALGRIND SUPPORT</a>
|
||||
<li><a name="TOC20" href="#SEC20">CODE COVERAGE REPORTING</a>
|
||||
<li><a name="TOC21" href="#SEC21">SEE ALSO</a>
|
||||
<li><a name="TOC22" href="#SEC22">AUTHOR</a>
|
||||
<li><a name="TOC23" href="#SEC23">REVISION</a>
|
||||
<li><a name="TOC15" href="#SEC15">PCRE2GREP SUPPORT FOR EXTERNAL SCRIPTS</a>
|
||||
<li><a name="TOC16" href="#SEC16">PCRE2GREP OPTIONS FOR COMPRESSED FILE SUPPORT</a>
|
||||
<li><a name="TOC17" href="#SEC17">PCRE2GREP BUFFER SIZE</a>
|
||||
<li><a name="TOC18" href="#SEC18">PCRE2TEST OPTION FOR LIBREADLINE SUPPORT</a>
|
||||
<li><a name="TOC19" href="#SEC19">INCLUDING DEBUGGING CODE</a>
|
||||
<li><a name="TOC20" href="#SEC20">DEBUGGING WITH VALGRIND SUPPORT</a>
|
||||
<li><a name="TOC21" href="#SEC21">CODE COVERAGE REPORTING</a>
|
||||
<li><a name="TOC22" href="#SEC22">SEE ALSO</a>
|
||||
<li><a name="TOC23" href="#SEC23">AUTHOR</a>
|
||||
<li><a name="TOC24" href="#SEC24">REVISION</a>
|
||||
</ul>
|
||||
<br><a name="SEC1" href="#TOC1">BUILDING PCRE2</a><br>
|
||||
<P>
|
||||
|
@ -349,7 +350,16 @@ The options that select newline behaviour, such as --enable-newline-is-cr,
|
|||
and equivalent run-time options, refer to these character values in an EBCDIC
|
||||
environment.
|
||||
</P>
|
||||
<br><a name="SEC15" href="#TOC1">PCRE2GREP OPTIONS FOR COMPRESSED FILE SUPPORT</a><br>
|
||||
<br><a name="SEC15" href="#TOC1">PCRE2GREP SUPPORT FOR EXTERNAL SCRIPTS</a><br>
|
||||
<P>
|
||||
By default, on non-Windows systems, <b>pcre2grep</b> supports the use of
|
||||
callouts with string arguments within the patterns it is matching, in order to
|
||||
run external scripts. For details, see the
|
||||
<a href="pcre2grep.html"><b>pcre2grep</b></a>
|
||||
documentation. This support can be disabled by adding
|
||||
--disable-pcre2grep-callout to the <b>configure</b> command.
|
||||
</P>
|
||||
<br><a name="SEC16" href="#TOC1">PCRE2GREP OPTIONS FOR COMPRESSED FILE SUPPORT</a><br>
|
||||
<P>
|
||||
By default, <b>pcre2grep</b> reads all files as plain text. You can build it so
|
||||
that it recognizes files whose names end in <b>.gz</b> or <b>.bz2</b>, and reads
|
||||
|
@ -362,7 +372,7 @@ to the <b>configure</b> command. These options naturally require that the
|
|||
relevant libraries are installed on your system. Configuration will fail if
|
||||
they are not.
|
||||
</P>
|
||||
<br><a name="SEC16" href="#TOC1">PCRE2GREP BUFFER SIZE</a><br>
|
||||
<br><a name="SEC17" href="#TOC1">PCRE2GREP BUFFER SIZE</a><br>
|
||||
<P>
|
||||
<b>pcre2grep</b> uses an internal buffer to hold a "window" on the file it is
|
||||
scanning, in order to be able to output "before" and "after" lines when it
|
||||
|
@ -375,9 +385,9 @@ parameter value by adding, for example,
|
|||
--with-pcre2grep-bufsize=50K
|
||||
</pre>
|
||||
to the <b>configure</b> command. The caller of <b>pcre2grep</b> can override this
|
||||
value by using --buffer-size on the command line..
|
||||
value by using --buffer-size on the command line.
|
||||
</P>
|
||||
<br><a name="SEC17" href="#TOC1">PCRE2TEST OPTION FOR LIBREADLINE SUPPORT</a><br>
|
||||
<br><a name="SEC18" href="#TOC1">PCRE2TEST OPTION FOR LIBREADLINE SUPPORT</a><br>
|
||||
<P>
|
||||
If you add one of
|
||||
<pre>
|
||||
|
@ -411,7 +421,7 @@ automatically included, you may need to add something like
|
|||
</pre>
|
||||
immediately before the <b>configure</b> command.
|
||||
</P>
|
||||
<br><a name="SEC18" href="#TOC1">INCLUDING DEBUGGING CODE</a><br>
|
||||
<br><a name="SEC19" href="#TOC1">INCLUDING DEBUGGING CODE</a><br>
|
||||
<P>
|
||||
If you add
|
||||
<pre>
|
||||
|
@ -420,7 +430,7 @@ If you add
|
|||
to the <b>configure</b> command, additional debugging code is included in the
|
||||
build. This feature is intended for use by the PCRE2 maintainers.
|
||||
</P>
|
||||
<br><a name="SEC19" href="#TOC1">DEBUGGING WITH VALGRIND SUPPORT</a><br>
|
||||
<br><a name="SEC20" href="#TOC1">DEBUGGING WITH VALGRIND SUPPORT</a><br>
|
||||
<P>
|
||||
If you add
|
||||
<pre>
|
||||
|
@ -430,7 +440,7 @@ to the <b>configure</b> command, PCRE2 will use valgrind annotations to mark
|
|||
certain memory regions as unaddressable. This allows it to detect invalid
|
||||
memory accesses, and is mostly useful for debugging PCRE2 itself.
|
||||
</P>
|
||||
<br><a name="SEC20" href="#TOC1">CODE COVERAGE REPORTING</a><br>
|
||||
<br><a name="SEC21" href="#TOC1">CODE COVERAGE REPORTING</a><br>
|
||||
<P>
|
||||
If your C compiler is gcc, you can build a version of PCRE2 that can generate a
|
||||
code coverage report for its test suite. To enable this, you must install
|
||||
|
@ -487,11 +497,11 @@ This cleans all coverage data including the generated coverage report. For more
|
|||
information about code coverage, see the <b>gcov</b> and <b>lcov</b>
|
||||
documentation.
|
||||
</P>
|
||||
<br><a name="SEC21" href="#TOC1">SEE ALSO</a><br>
|
||||
<br><a name="SEC22" href="#TOC1">SEE ALSO</a><br>
|
||||
<P>
|
||||
<b>pcre2api</b>(3), <b>pcre2-config</b>(3).
|
||||
</P>
|
||||
<br><a name="SEC22" href="#TOC1">AUTHOR</a><br>
|
||||
<br><a name="SEC23" href="#TOC1">AUTHOR</a><br>
|
||||
<P>
|
||||
Philip Hazel
|
||||
<br>
|
||||
|
@ -500,11 +510,11 @@ University Computing Service
|
|||
Cambridge, England.
|
||||
<br>
|
||||
</P>
|
||||
<br><a name="SEC23" href="#TOC1">REVISION</a><br>
|
||||
<br><a name="SEC24" href="#TOC1">REVISION</a><br>
|
||||
<P>
|
||||
Last updated: 16 October 2015
|
||||
Last updated: 01 April 2016
|
||||
<br>
|
||||
Copyright © 1997-2015 University of Cambridge.
|
||||
Copyright © 1997-2016 University of Cambridge.
|
||||
<br>
|
||||
<p>
|
||||
Return to the <a href="index.html">PCRE2 index page</a>.
|
||||
|
|
|
@ -152,6 +152,10 @@ PCRE2_NO_UTF_CHECK, PCRE2_PARTIAL_HARD, and PCRE2_PARTIAL_SOFT. The
|
|||
PCRE2_ANCHORED option is not supported at match time.
|
||||
</P>
|
||||
<P>
|
||||
If the PCRE2_NO_JIT option is passed to <b>pcre2_match()</b> it disables the
|
||||
use of JIT, forcing matching by the interpreter code.
|
||||
</P>
|
||||
<P>
|
||||
The only unsupported pattern items are \C (match a single data unit) when
|
||||
running in a UTF mode, and a callout immediately before an assertion condition
|
||||
in a conditional group.
|
||||
|
@ -403,7 +407,7 @@ The fast path function is called <b>pcre2_jit_match()</b>, and it takes exactly
|
|||
the same arguments as <b>pcre2_match()</b>. The return values are also the same,
|
||||
plus PCRE2_ERROR_JIT_BADOPTION if a matching mode (partial or complete) is
|
||||
requested that was not compiled. Unsupported option bits (for example,
|
||||
PCRE2_ANCHORED) are ignored.
|
||||
PCRE2_ANCHORED) are ignored, as is the PCRE2_NO_JIT option.
|
||||
</P>
|
||||
<P>
|
||||
When you call <b>pcre2_match()</b>, as well as testing for invalid options, a
|
||||
|
@ -432,9 +436,9 @@ Cambridge, England.
|
|||
</P>
|
||||
<br><a name="SEC13" href="#TOC1">REVISION</a><br>
|
||||
<P>
|
||||
Last updated: 14 November 2015
|
||||
Last updated: 05 June 2016
|
||||
<br>
|
||||
Copyright © 1997-2015 University of Cambridge.
|
||||
Copyright © 1997-2016 University of Cambridge.
|
||||
<br>
|
||||
<p>
|
||||
Return to the <a href="index.html">PCRE2 index page</a>.
|
||||
|
|
|
@ -14,10 +14,11 @@ please consult the man page, in case the conversion went wrong.
|
|||
<br>
|
||||
<ul>
|
||||
<li><a name="TOC1" href="#SEC1">SAVING AND RE-USING PRECOMPILED PCRE2 PATTERNS</a>
|
||||
<li><a name="TOC2" href="#SEC2">SAVING COMPILED PATTERNS</a>
|
||||
<li><a name="TOC3" href="#SEC3">RE-USING PRECOMPILED PATTERNS</a>
|
||||
<li><a name="TOC4" href="#SEC4">AUTHOR</a>
|
||||
<li><a name="TOC5" href="#SEC5">REVISION</a>
|
||||
<li><a name="TOC2" href="#SEC2">SECURITY CONCERNS</a>
|
||||
<li><a name="TOC3" href="#SEC3">SAVING COMPILED PATTERNS</a>
|
||||
<li><a name="TOC4" href="#SEC4">RE-USING PRECOMPILED PATTERNS</a>
|
||||
<li><a name="TOC5" href="#SEC5">AUTHOR</a>
|
||||
<li><a name="TOC6" href="#SEC6">REVISION</a>
|
||||
</ul>
|
||||
<br><a name="SEC1" href="#TOC1">SAVING AND RE-USING PRECOMPILED PCRE2 PATTERNS</a><br>
|
||||
<P>
|
||||
|
@ -48,7 +49,15 @@ and PCRE2_SIZE type. For example, patterns compiled on a 32-bit system using
|
|||
PCRE2's 16-bit library cannot be reloaded on a 64-bit system, nor can they be
|
||||
reloaded using the 8-bit library.
|
||||
</P>
|
||||
<br><a name="SEC2" href="#TOC1">SAVING COMPILED PATTERNS</a><br>
|
||||
<br><a name="SEC2" href="#TOC1">SECURITY CONCERNS</a><br>
|
||||
<P>
|
||||
The facility for saving and restoring compiled patterns is intended for use
|
||||
within individual applications. As such, the data supplied to
|
||||
<b>pcre2_serialize_decode()</b> is expected to be trusted data, not data from
|
||||
arbitrary external sources. There is only some simple consistency checking, not
|
||||
complete validation of what is being re-loaded.
|
||||
</P>
|
||||
<br><a name="SEC3" href="#TOC1">SAVING COMPILED PATTERNS</a><br>
|
||||
<P>
|
||||
Before compiled patterns can be saved they must be serialized, that is,
|
||||
converted to a stream of bytes. A single byte stream may contain any number of
|
||||
|
@ -110,7 +119,7 @@ still be used for matching. Their memory must eventually be freed in the usual
|
|||
way by calling <b>pcre2_code_free()</b>. When you have finished with the byte
|
||||
stream, it too must be freed by calling <b>pcre2_serialize_free()</b>.
|
||||
</P>
|
||||
<br><a name="SEC3" href="#TOC1">RE-USING PRECOMPILED PATTERNS</a><br>
|
||||
<br><a name="SEC4" href="#TOC1">RE-USING PRECOMPILED PATTERNS</a><br>
|
||||
<P>
|
||||
In order to re-use a set of saved patterns you must first make the serialized
|
||||
byte stream available in main memory (for example, by reading from a file). The
|
||||
|
@ -142,11 +151,12 @@ is filled with those that fit, and the remainder are ignored. The yield of the
|
|||
function is the number of decoded patterns, or one of the following negative
|
||||
error codes:
|
||||
<pre>
|
||||
PCRE2_ERROR_BADDATA second argument is zero or less
|
||||
PCRE2_ERROR_BADMAGIC mismatch of id bytes in the data
|
||||
PCRE2_ERROR_BADMODE mismatch of variable unit size or PCRE2 version
|
||||
PCRE2_ERROR_MEMORY memory allocation failed
|
||||
PCRE2_ERROR_NULL first or third argument is NULL
|
||||
PCRE2_ERROR_BADDATA second argument is zero or less
|
||||
PCRE2_ERROR_BADMAGIC mismatch of id bytes in the data
|
||||
PCRE2_ERROR_BADMODE mismatch of code unit size or PCRE2 version
|
||||
PCRE2_ERROR_BADSERIALIZEDDATA other sanity check failure
|
||||
PCRE2_ERROR_MEMORY memory allocation failed
|
||||
PCRE2_ERROR_NULL first or third argument is NULL
|
||||
</pre>
|
||||
PCRE2_ERROR_BADMAGIC may mean that the data is corrupt, or that it was compiled
|
||||
on a system with different endianness.
|
||||
|
@ -169,7 +179,7 @@ serialized, the JIT data is discarded and so is no longer available after a
|
|||
save/restore cycle. You can, however, process a restored pattern with
|
||||
<b>pcre2_jit_compile()</b> if you wish.
|
||||
</P>
|
||||
<br><a name="SEC4" href="#TOC1">AUTHOR</a><br>
|
||||
<br><a name="SEC5" href="#TOC1">AUTHOR</a><br>
|
||||
<P>
|
||||
Philip Hazel
|
||||
<br>
|
||||
|
@ -178,11 +188,11 @@ University Computing Service
|
|||
Cambridge, England.
|
||||
<br>
|
||||
</P>
|
||||
<br><a name="SEC5" href="#TOC1">REVISION</a><br>
|
||||
<br><a name="SEC6" href="#TOC1">REVISION</a><br>
|
||||
<P>
|
||||
Last updated: 03 November 2015
|
||||
Last updated: 24 May 2016
|
||||
<br>
|
||||
Copyright © 1997-2015 University of Cambridge.
|
||||
Copyright © 1997-2016 University of Cambridge.
|
||||
<br>
|
||||
<p>
|
||||
Return to the <a href="index.html">PCRE2 index page</a>.
|
||||
|
|
|
@ -962,6 +962,7 @@ for a description of their effects.
|
|||
anchored set PCRE2_ANCHORED
|
||||
dfa_restart set PCRE2_DFA_RESTART
|
||||
dfa_shortest set PCRE2_DFA_SHORTEST
|
||||
no_jit set PCRE2_NO_JIT
|
||||
no_utf_check set PCRE2_NO_UTF_CHECK
|
||||
notbol set PCRE2_NOTBOL
|
||||
notempty set PCRE2_NOTEMPTY
|
||||
|
@ -1697,7 +1698,7 @@ Cambridge, England.
|
|||
</P>
|
||||
<br><a name="SEC21" href="#TOC1">REVISION</a><br>
|
||||
<P>
|
||||
Last updated: 06 February 2016
|
||||
Last updated: 05 June 2016
|
||||
<br>
|
||||
Copyright © 1997-2016 University of Cambridge.
|
||||
<br>
|
||||
|
|
1336
doc/pcre2.txt
File diff suppressed because it is too large
|
@ -1,4 +1,4 @@
|
|||
.TH PCRE2API 3 "27 February 2016" "PCRE2 10.22"
|
||||
.TH PCRE2API 3 "05 June 2016" "PCRE2 10.22"
|
||||
.SH NAME
|
||||
PCRE2 - Perl-compatible regular expressions (revised API)
|
||||
.sp
|
||||
|
@ -354,9 +354,10 @@ More complicated programs might need to make use of the specialist functions
|
|||
\fBpcre2_jit_stack_create()\fP, \fBpcre2_jit_stack_free()\fP, and
|
||||
\fBpcre2_jit_stack_assign()\fP in order to control the JIT code's memory usage.
|
||||
.P
|
||||
JIT matching is automatically used by \fBpcre2_match()\fP if it is available.
|
||||
There is also a direct interface for JIT matching, which gives improved
|
||||
performance. The JIT-specific functions are discussed in the
|
||||
JIT matching is automatically used by \fBpcre2_match()\fP if it is available,
|
||||
unless the PCRE2_NO_JIT option is set. There is also a direct interface for JIT
|
||||
matching, which gives improved performance. The JIT-specific functions are
|
||||
discussed in the
|
||||
.\" HREF
|
||||
\fBpcre2jit\fP
|
||||
.\"
|
||||
|
@ -499,16 +500,16 @@ least until a pattern has been compiled. The logic can be something like this:
|
|||
Get a write (unique) lock for pointer
|
||||
pointer = pcre2_compile(...
|
||||
}
|
||||
Release the lock
|
||||
Release the lock
|
||||
Use pointer in pcre2_match()
|
||||
.sp
|
||||
Of course, testing for compilation errors should also be included in the code.
|
||||
.P
|
||||
If JIT is being used, but the JIT compilation is not being done immediately,
|
||||
(perhaps waiting to see if the pattern is used often enough) similar logic is
|
||||
required. JIT compilation updates a pointer within the compiled code block, so
|
||||
a thread must gain unique write access to the pointer before calling
|
||||
\fBpcre2_jit_compile()\fP. Alternatively, \fBpcre2_code_copy()\fP can be used
|
||||
required. JIT compilation updates a pointer within the compiled code block, so
|
||||
a thread must gain unique write access to the pointer before calling
|
||||
\fBpcre2_jit_compile()\fP. Alternatively, \fBpcre2_code_copy()\fP can be used
|
||||
to obtain a private copy of the compiled code.
|
||||
.
|
||||
.
|
||||
|
@ -1031,22 +1032,22 @@ The \fBpcre2_compile()\fP function compiles a pattern into an internal form.
|
|||
The pattern is defined by a pointer to a string of code units and a length. If
|
||||
the pattern is zero-terminated, the length can be specified as
|
||||
PCRE2_ZERO_TERMINATED. The function returns a pointer to a block of memory that
|
||||
contains the compiled pattern and related data.
|
||||
contains the compiled pattern and related data.
|
||||
.P
|
||||
If the compile context argument \fIccontext\fP is NULL, memory for the compiled
|
||||
pattern is obtained by calling \fBmalloc()\fP. Otherwise, it is obtained from
|
||||
the same memory function that was used for the compile context. The caller must
|
||||
free the memory by calling \fBpcre2_code_free()\fP when it is no longer needed.
|
||||
.P
|
||||
The function \fBpcre2_code_copy()\fP makes a copy of the compiled code in new
|
||||
memory, using the same memory allocator as was used for the original. However,
|
||||
The function \fBpcre2_code_copy()\fP makes a copy of the compiled code in new
|
||||
memory, using the same memory allocator as was used for the original. However,
|
||||
if the code has been processed by the JIT compiler (see
|
||||
.\" HTML <a href="#jitcompiling">
|
||||
.\" </a>
|
||||
below),
|
||||
.\"
|
||||
the JIT information cannot be copied (because it is position-dependent).
|
||||
The new copy can initially be used only for non-JIT matching, though it can be
|
||||
the JIT information cannot be copied (because it is position-dependent).
|
||||
The new copy can initially be used only for non-JIT matching, though it can be
|
||||
passed to \fBpcre2_jit_compile()\fP if required. The \fBpcre2_code_copy()\fP
|
||||
function provides a way for individual threads in a multithreaded application
|
||||
to acquire a private copy of shared compiled code.
|
||||
|
@ -1629,7 +1630,7 @@ Return a copy of the pattern's options. The third argument should point to a
|
|||
\fBuint32_t\fP variable. PCRE2_INFO_ARGOPTIONS returns exactly the options that
|
||||
were passed to \fBpcre2_compile()\fP, whereas PCRE2_INFO_ALLOPTIONS returns
|
||||
the compile options as modified by any top-level (*XXX) option settings such as
|
||||
(*UTF) at the start of the pattern itself.
|
||||
(*UTF) at the start of the pattern itself.
|
||||
.P
|
||||
For example, if the pattern /(*UTF)abc/ is compiled with the PCRE2_EXTENDED
|
||||
option, the result for PCRE2_INFO_ALLOPTIONS is PCRE2_EXTENDED and PCRE2_UTF.
|
||||
|
@ -2110,13 +2111,14 @@ pattern does not require the match to be at the start of the subject.
|
|||
.sp
|
||||
The unused bits of the \fIoptions\fP argument for \fBpcre2_match()\fP must be
|
||||
zero. The only bits that may be set are PCRE2_ANCHORED, PCRE2_NOTBOL,
|
||||
PCRE2_NOTEOL, PCRE2_NOTEMPTY, PCRE2_NOTEMPTY_ATSTART, PCRE2_NO_UTF_CHECK,
|
||||
PCRE2_PARTIAL_HARD, and PCRE2_PARTIAL_SOFT. Their action is described below.
|
||||
PCRE2_NOTEOL, PCRE2_NOTEMPTY, PCRE2_NOTEMPTY_ATSTART, PCRE2_NO_JIT,
|
||||
PCRE2_NO_UTF_CHECK, PCRE2_PARTIAL_HARD, and PCRE2_PARTIAL_SOFT. Their action is
|
||||
described below.
|
||||
.P
|
||||
Setting PCRE2_ANCHORED at match time is not supported by the just-in-time (JIT)
|
||||
compiler. If it is set, JIT matching is disabled and the normal interpretive
|
||||
code in \fBpcre2_match()\fP is run. The remaining options are supported for JIT
|
||||
matching.
|
||||
code in \fBpcre2_match()\fP is run. Apart from PCRE2_NO_JIT (obviously), the
|
||||
remaining options are supported for JIT matching.
|
||||
.sp
|
||||
PCRE2_ANCHORED
|
||||
.sp
|
||||
|
@ -2163,6 +2165,13 @@ only at the first matching position, that is, at the start of the subject plus
|
|||
the starting offset. An empty string match later in the subject is permitted.
|
||||
If the pattern is anchored, such a match can occur only if the pattern contains
|
||||
\eK.
|
||||
.sp
|
||||
PCRE2_NO_JIT
|
||||
.sp
|
||||
By default, if a pattern has been successfully processed by
|
||||
\fBpcre2_jit_compile()\fP, JIT is automatically used when \fBpcre2_match()\fP
|
||||
is called with options that JIT supports. Setting PCRE2_NO_JIT disables the use
|
||||
of JIT; it forces matching to be done by the interpreter.
|
||||
.sp
|
||||
PCRE2_NO_UTF_CHECK
|
||||
.sp
|
||||
|
@ -3233,6 +3242,6 @@ Cambridge, England.
|
|||
.rs
|
||||
.sp
|
||||
.nf
|
||||
Last updated: 27 February 2016
|
||||
Last updated: 05 June 2016
|
||||
Copyright (c) 1997-2016 University of Cambridge.
|
||||
.fi
|
||||
|
|
|
@ -1,4 +1,4 @@
|
|||
.TH PCRE2JIT 3 "14 November 2015" "PCRE2 10.21"
|
||||
.TH PCRE2JIT 3 "05 June 2016" "PCRE2 10.22"
|
||||
.SH NAME
|
||||
PCRE2 - Perl-compatible regular expressions (revised API)
|
||||
.SH "PCRE2 JUST-IN-TIME COMPILER SUPPORT"
|
||||
|
@ -128,6 +128,9 @@ PCRE2_NOTBOL, PCRE2_NOTEOL, PCRE2_NOTEMPTY, PCRE2_NOTEMPTY_ATSTART,
|
|||
PCRE2_NO_UTF_CHECK, PCRE2_PARTIAL_HARD, and PCRE2_PARTIAL_SOFT. The
|
||||
PCRE2_ANCHORED option is not supported at match time.
|
||||
.P
|
||||
If the PCRE2_NO_JIT option is passed to \fBpcre2_match()\fP it disables the
|
||||
use of JIT, forcing matching by the interpreter code.
|
||||
.P
|
||||
The only unsupported pattern items are \eC (match a single data unit) when
|
||||
running in a UTF mode, and a callout immediately before an assertion condition
|
||||
in a conditional group.
|
||||
|
@ -377,7 +380,7 @@ The fast path function is called \fBpcre2_jit_match()\fP, and it takes exactly
|
|||
the same arguments as \fBpcre2_match()\fP. The return values are also the same,
|
||||
plus PCRE2_ERROR_JIT_BADOPTION if a matching mode (partial or complete) is
|
||||
requested that was not compiled. Unsupported option bits (for example,
|
||||
PCRE2_ANCHORED) are ignored.
|
||||
PCRE2_ANCHORED) are ignored, as is the PCRE2_NO_JIT option.
|
||||
.P
|
||||
When you call \fBpcre2_match()\fP, as well as testing for invalid options, a
|
||||
number of other sanity checks are performed on the arguments. For example, if
|
||||
|
@ -410,6 +413,6 @@ Cambridge, England.
|
|||
.rs
|
||||
.sp
|
||||
.nf
|
||||
Last updated: 14 November 2015
|
||||
Copyright (c) 1997-2015 University of Cambridge.
|
||||
Last updated: 05 June 2016
|
||||
Copyright (c) 1997-2016 University of Cambridge.
|
||||
.fi
|
||||
|
|
|
@ -1,4 +1,4 @@
|
|||
.TH PCRE2TEST 1 "26 February 2016" "PCRE 10.22"
|
||||
.TH PCRE2TEST 1 "05 June 2016" "PCRE 10.22"
|
||||
.SH NAME
|
||||
pcre2test - a program for testing Perl-compatible regular expressions.
|
||||
.SH SYNOPSIS
|
||||
|
@ -931,6 +931,7 @@ for a description of their effects.
|
|||
anchored set PCRE2_ANCHORED
|
||||
dfa_restart set PCRE2_DFA_RESTART
|
||||
dfa_shortest set PCRE2_DFA_SHORTEST
|
||||
no_jit set PCRE2_NO_JIT
|
||||
no_utf_check set PCRE2_NO_UTF_CHECK
|
||||
notbol set PCRE2_NOTBOL
|
||||
notempty set PCRE2_NOTEMPTY
|
||||
|
@ -1674,6 +1675,6 @@ Cambridge, England.
|
|||
.rs
|
||||
.sp
|
||||
.nf
|
||||
Last updated: 06 February 2016
|
||||
Last updated: 05 June 2016
|
||||
Copyright (c) 1997-2016 University of Cambridge.
|
||||
.fi
|
||||
|
|
|
@ -859,6 +859,7 @@ SUBJECT MODIFIERS
|
|||
anchored set PCRE2_ANCHORED
|
||||
dfa_restart set PCRE2_DFA_RESTART
|
||||
dfa_shortest set PCRE2_DFA_SHORTEST
|
||||
no_jit set PCRE2_NO_JIT
|
||||
no_utf_check set PCRE2_NO_UTF_CHECK
|
||||
notbol set PCRE2_NOTBOL
|
||||
notempty set PCRE2_NOTEMPTY
|
||||
|
@ -1538,5 +1539,5 @@ AUTHOR
|
|||
|
||||
REVISION
|
||||
|
||||
Last updated: 06 February 2016
|
||||
Last updated: 05 June 2016
|
||||
Copyright (c) 1997-2016 University of Cambridge.
|
||||
|
|
|
@ -146,7 +146,8 @@ sanity checks). */
|
|||
#define PCRE2_DFA_RESTART 0x00000040u
|
||||
#define PCRE2_DFA_SHORTEST 0x00000080u
|
||||
|
||||
/* These are additional options for pcre2_substitute(). */
|
||||
/* These are additional options for pcre2_substitute(), which passes any others
|
||||
through to pcre2_match(). */
|
||||
|
||||
#define PCRE2_SUBSTITUTE_GLOBAL 0x00000100u
|
||||
#define PCRE2_SUBSTITUTE_EXTENDED 0x00000200u
|
||||
|
@ -154,6 +155,11 @@ sanity checks). */
|
|||
#define PCRE2_SUBSTITUTE_UNKNOWN_UNSET 0x00000800u
|
||||
#define PCRE2_SUBSTITUTE_OVERFLOW_LENGTH 0x00001000u
|
||||
|
||||
/* A further option for pcre2_match(), not allowed for pcre2_dfa_match(),
|
||||
ignored for pcre2_jit_match(). */
|
||||
|
||||
#define PCRE2_NO_JIT 0x00002000u
|
||||
|
||||
/* Newline and \R settings, for use in compile contexts. The newline values
|
||||
must be kept in step with values set in config.h and both sets must all be
|
||||
greater than zero. */
|
||||
|
|
|
@ -146,7 +146,8 @@ sanity checks). */
|
|||
#define PCRE2_DFA_RESTART 0x00000040u
|
||||
#define PCRE2_DFA_SHORTEST 0x00000080u
|
||||
|
||||
/* These are additional options for pcre2_substitute(). */
|
||||
/* These are additional options for pcre2_substitute(), which passes any others
|
||||
through to pcre2_match(). */
|
||||
|
||||
#define PCRE2_SUBSTITUTE_GLOBAL 0x00000100u
|
||||
#define PCRE2_SUBSTITUTE_EXTENDED 0x00000200u
|
||||
|
@ -154,6 +155,11 @@ sanity checks). */
|
|||
#define PCRE2_SUBSTITUTE_UNKNOWN_UNSET 0x00000800u
|
||||
#define PCRE2_SUBSTITUTE_OVERFLOW_LENGTH 0x00001000u
|
||||
|
||||
/* A further option for pcre2_match(), not allowed for pcre2_dfa_match(),
|
||||
ignored for pcre2_jit_match(). */
|
||||
|
||||
#define PCRE2_NO_JIT 0x00002000u
|
||||
|
||||
/* Newline and \R settings, for use in compile contexts. The newline values
|
||||
must be kept in step with values set in config.h and both sets must all be
|
||||
greater than zero. */
|
||||
|
|
|
@ -55,7 +55,7 @@ POSSIBILITY OF SUCH DAMAGE.
|
|||
#define PUBLIC_MATCH_OPTIONS \
|
||||
(PCRE2_ANCHORED|PCRE2_NOTBOL|PCRE2_NOTEOL|PCRE2_NOTEMPTY| \
|
||||
PCRE2_NOTEMPTY_ATSTART|PCRE2_NO_UTF_CHECK|PCRE2_PARTIAL_HARD| \
|
||||
PCRE2_PARTIAL_SOFT)
|
||||
PCRE2_PARTIAL_SOFT|PCRE2_NO_JIT)
|
||||
|
||||
#define PUBLIC_JIT_MATCH_OPTIONS \
|
||||
(PCRE2_NO_UTF_CHECK|PCRE2_NOTBOL|PCRE2_NOTEOL|PCRE2_NOTEMPTY|\
|
||||
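The hunk above adds PCRE2_NO_JIT to PUBLIC_MATCH_OPTIONS but not to PUBLIC_JIT_MATCH_OPTIONS. A simplified sketch of how two such masks could route a call between JIT and the interpreter is given here. It is not the library's actual code. The names `SKETCH_JIT_MATCH_OPTIONS`, `sketch_path`, and `sketch_choose_path` are hypothetical. Only the PCRE2_NO_JIT, PCRE2_DFA_RESTART, and PCRE2_DFA_SHORTEST values appear in this diff; the other bit values and the full contents of the JIT mask (which the diff truncates) are assumptions that should be checked against pcre2.h and the source.

```c
#include <stdint.h>

/* Option bits. Only PCRE2_NO_JIT and the two DFA values are shown in the
   diff; the rest are assumed here and should be verified against pcre2.h. */
#define PCRE2_NOTBOL            0x00000001u
#define PCRE2_NOTEOL            0x00000002u
#define PCRE2_NOTEMPTY          0x00000004u
#define PCRE2_NOTEMPTY_ATSTART  0x00000008u
#define PCRE2_PARTIAL_SOFT      0x00000010u
#define PCRE2_PARTIAL_HARD      0x00000020u
#define PCRE2_DFA_RESTART       0x00000040u
#define PCRE2_NO_JIT            0x00002000u
#define PCRE2_NO_UTF_CHECK      0x40000000u
#define PCRE2_ANCHORED          0x80000000u

/* As in the hunk above, after the change. */
#define PUBLIC_MATCH_OPTIONS \
  (PCRE2_ANCHORED|PCRE2_NOTBOL|PCRE2_NOTEOL|PCRE2_NOTEMPTY| \
   PCRE2_NOTEMPTY_ATSTART|PCRE2_NO_UTF_CHECK|PCRE2_PARTIAL_HARD| \
   PCRE2_PARTIAL_SOFT|PCRE2_NO_JIT)

/* Hypothetical stand-in for PUBLIC_JIT_MATCH_OPTIONS: the diff shows only
   its first line, so everything on the second line is an assumption. */
#define SKETCH_JIT_MATCH_OPTIONS \
  (PCRE2_NO_UTF_CHECK|PCRE2_NOTBOL|PCRE2_NOTEOL|PCRE2_NOTEMPTY| \
   PCRE2_NOTEMPTY_ATSTART|PCRE2_PARTIAL_SOFT|PCRE2_PARTIAL_HARD)

enum sketch_path { SKETCH_BAD_OPTION, SKETCH_USE_JIT, SKETCH_USE_INTERPRETER };

/* Decide which matching path a call would take.
   options:      option bits passed by the caller
   has_jit_code: nonzero if the pattern was successfully JIT-compiled */
enum sketch_path sketch_choose_path(uint32_t options, int has_jit_code)
{
  /* Any bit outside the public match mask is rejected outright. */
  if ((options & ~PUBLIC_MATCH_OPTIONS) != 0) return SKETCH_BAD_OPTION;

  /* JIT is used only when every requested bit is one JIT supports.
     PCRE2_NO_JIT is not in the JIT mask, so setting it fails this test. */
  if (has_jit_code && (options & ~SKETCH_JIT_MATCH_OPTIONS) == 0)
    return SKETCH_USE_JIT;

  return SKETCH_USE_INTERPRETER;
}
```

Under these assumptions, `sketch_choose_path(PCRE2_NO_JIT, 1)` falls through to the interpreter while `sketch_choose_path(0, 1)` takes the JIT path, which is consistent with the expected test output later in this commit ("0: abc (JIT)" without the modifier, "0: abc" with no_jit).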
|
|
|
@ -586,6 +586,7 @@ static modstruct modlist[] = {
|
|||
{ "no_auto_capture", MOD_PAT, MOD_OPT, PCRE2_NO_AUTO_CAPTURE, PO(options) },
|
||||
{ "no_auto_possess", MOD_PATP, MOD_OPT, PCRE2_NO_AUTO_POSSESS, PO(options) },
|
||||
{ "no_dotstar_anchor", MOD_PAT, MOD_OPT, PCRE2_NO_DOTSTAR_ANCHOR, PO(options) },
|
||||
{ "no_jit", MOD_DAT, MOD_OPT, PCRE2_NO_JIT, DO(options) },
|
||||
{ "no_start_optimize", MOD_PATP, MOD_OPT, PCRE2_NO_START_OPTIMIZE, PO(options) },
|
||||
{ "no_utf_check", MOD_PD, MOD_OPT, PCRE2_NO_UTF_CHECK, PD(options) },
|
||||
{ "notbol", MOD_DAT, MOD_OPT, PCRE2_NOTBOL, DO(options) },
|
||||
|
|
|
@ -278,5 +278,15 @@
|
|||
|
||||
/(.|.)*?bx/
|
||||
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaabax
|
||||
|
||||
# Test JIT disable
|
||||
|
||||
/abc/
|
||||
abc
|
||||
abc\=no_jit
|
||||
|
||||
/abc/jitfast
|
||||
abc
|
||||
abc\=no_jit
|
||||
|
||||
# End of testinput17
|
||||
|
|
|
@ -516,5 +516,19 @@ Failed: error -46: JIT stack limit reached
|
|||
/(.|.)*?bx/
|
||||
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaabax
|
||||
Failed: error -47: match limit exceeded
|
||||
|
||||
# Test JIT disable
|
||||
|
||||
/abc/
|
||||
abc
|
||||
0: abc (JIT)
|
||||
abc\=no_jit
|
||||
0: abc
|
||||
|
||||
/abc/jitfast
|
||||
abc
|
||||
0: abc (JIT)
|
||||
abc\=no_jit
|
||||
0: abc (JIT)
|
||||
|
||||
# End of testinput17
|
||||
|
|