Documentation detrail and make HTML for 10.22-RC1.

Philip.Hazel 2016-06-29 16:52:05 +00:00
parent 431d41cb2d
commit 921636f6fc
33 changed files with 258 additions and 221 deletions

120
ChangeLog

@ -5,36 +5,36 @@ Change Log for PCRE2
Version 10.22 29-June-2016 Version 10.22 29-June-2016
-------------------------- --------------------------
1. Applied Jason Hood's patches to RunTest.bat and testdata/wintestoutput3 1. Applied Jason Hood's patches to RunTest.bat and testdata/wintestoutput3
to fix problems with running the tests under Windows. to fix problems with running the tests under Windows.
2. Implemented a facility for quoting literal characters within hexadecimal 2. Implemented a facility for quoting literal characters within hexadecimal
patterns in pcre2test, to make it easier to create patterns with just a few patterns in pcre2test, to make it easier to create patterns with just a few
non-printing characters. non-printing characters.
3. Binary zeros are not supported in pcre2test input files. It now detects them 3. Binary zeros are not supported in pcre2test input files. It now detects them
and gives an error. and gives an error.
4. Updated the valgrind parameters in RunTest: (a) changed smc-check=all to 4. Updated the valgrind parameters in RunTest: (a) changed smc-check=all to
smc-check=all-non-file; (b) changed obj:* in the suppression file to obj:??? so smc-check=all-non-file; (b) changed obj:* in the suppression file to obj:??? so
that it matches only unknown objects. that it matches only unknown objects.
5. Updated the maintenance script maint/ManyConfigTests to make it easier to 5. Updated the maintenance script maint/ManyConfigTests to make it easier to
select individual groups of tests. select individual groups of tests.
6. When the POSIX wrapper function regcomp() is called, the REG_NOSUB option 6. When the POSIX wrapper function regcomp() is called, the REG_NOSUB option
used to set PCRE2_NO_AUTO_CAPTURE when calling pcre2_compile(). However, this used to set PCRE2_NO_AUTO_CAPTURE when calling pcre2_compile(). However, this
disables the use of back references (and subroutine calls), which are supported disables the use of back references (and subroutine calls), which are supported
by other implementations of regcomp() with REG_NOSUB. Therefore, REG_NOSUB no by other implementations of regcomp() with REG_NOSUB. Therefore, REG_NOSUB no
longer causes PCRE2_NO_AUTO_CAPTURE to be set, though it still ignores nmatch longer causes PCRE2_NO_AUTO_CAPTURE to be set, though it still ignores nmatch
and pmatch when regexec() is called. and pmatch when regexec() is called.
7. Because of 6 above, pcre2test has been modified with a new modifier called 7. Because of 6 above, pcre2test has been modified with a new modifier called
posix_nosub, to call regcomp() with REG_NOSUB. Previously the no_auto_capture posix_nosub, to call regcomp() with REG_NOSUB. Previously the no_auto_capture
modifier had this effect. That option is now ignored when the POSIX API is in modifier had this effect. That option is now ignored when the POSIX API is in
use. use.
8. Minor tidies to the pcre2demo.c sample program, including more comments 8. Minor tidies to the pcre2demo.c sample program, including more comments
about its 8-bit-ness. about its 8-bit-ness.
9. Detect unmatched closing parentheses and give the error in the pre-scan 9. Detect unmatched closing parentheses and give the error in the pre-scan
@ -48,88 +48,88 @@ regex library instead of libpcre2-posix. In this situation, a call to regcomp()
own data into the regex_t block. In one example the re_pcre2_code field was own data into the regex_t block. In one example the re_pcre2_code field was
left as NULL, which made pcre2test think it had not got a compiled POSIX regex, left as NULL, which made pcre2test think it had not got a compiled POSIX regex,
so it treated the next line as another pattern line, resulting in a confusing so it treated the next line as another pattern line, resulting in a confusing
error message. A check has been added to pcre2test to see if the data returned error message. A check has been added to pcre2test to see if the data returned
from a successful call of regcomp() are valid for PCRE2's regcomp(). If they from a successful call of regcomp() are valid for PCRE2's regcomp(). If they
are not, an error message is output and the pcre2test run is abandoned. The are not, an error message is output and the pcre2test run is abandoned. The
message points out the possibility of a mis-linking. Hopefully this will avoid message points out the possibility of a mis-linking. Hopefully this will avoid
some head-scratching the next time this happens. some head-scratching the next time this happens.
11. A pattern such as /(?<=((?C)0))/, which has a callout inside a lookbehind 11. A pattern such as /(?<=((?C)0))/, which has a callout inside a lookbehind
assertion, caused pcre2test to output a very large number of spaces when the assertion, caused pcre2test to output a very large number of spaces when the
callout was taken, making the program appear to loop. callout was taken, making the program appear to loop.
12. A pattern that included (*ACCEPT) in the middle of a sufficiently deeply 12. A pattern that included (*ACCEPT) in the middle of a sufficiently deeply
nested set of parentheses of sufficient size caused an overflow of the nested set of parentheses of sufficient size caused an overflow of the
compiling workspace (which was diagnosed, but of course is not desirable). compiling workspace (which was diagnosed, but of course is not desirable).
13. Detect missing closing parentheses during the pre-pass for group 13. Detect missing closing parentheses during the pre-pass for group
identification. identification.
14. Changed some integer variable types and put in a number of casts, following 14. Changed some integer variable types and put in a number of casts, following
a report of compiler warnings from Visual Studio 2013 and a few tests with a report of compiler warnings from Visual Studio 2013 and a few tests with
gcc's -Wconversion (which still throws up a lot). gcc's -Wconversion (which still throws up a lot).
15. Implemented pcre2_code_copy(), and added pushcopy and #popcopy to pcre2test 15. Implemented pcre2_code_copy(), and added pushcopy and #popcopy to pcre2test
for testing it. for testing it.
16. Change 66 for 10.21 introduced the use of snprintf() in PCRE2's version of 16. Change 66 for 10.21 introduced the use of snprintf() in PCRE2's version of
regerror(). When the error buffer is too small, my version of snprintf() puts a regerror(). When the error buffer is too small, my version of snprintf() puts a
binary zero in the final byte. Bug #1801 seems to show that other versions do binary zero in the final byte. Bug #1801 seems to show that other versions do
not do this, leading to bad output from pcre2test when it was checking for not do this, leading to bad output from pcre2test when it was checking for
buffer overflow. It no longer assumes a binary zero at the end of a too-small buffer overflow. It no longer assumes a binary zero at the end of a too-small
regerror() buffer. regerror() buffer.
17. Fixed typo ("&&" for "&") in pcre2_study(). Fortunately, this could not 17. Fixed typo ("&&" for "&") in pcre2_study(). Fortunately, this could not
actually affect anything, by sheer luck. actually affect anything, by sheer luck.
18. Two minor fixes for MSVC compilation: (a) removal of apparently incorrect 18. Two minor fixes for MSVC compilation: (a) removal of apparently incorrect
"const" qualifiers in pcre2test and (b) defining snprintf as _snprintf for "const" qualifiers in pcre2test and (b) defining snprintf as _snprintf for
older MSVC compilers. This has been done both in src/pcre2_internal.h for most older MSVC compilers. This has been done both in src/pcre2_internal.h for most
of the library, and also in src/pcre2posix.c, which no longer includes of the library, and also in src/pcre2posix.c, which no longer includes
pcre2_internal.h (see 24 below). pcre2_internal.h (see 24 below).
19. Applied Chris Wilson's patch (Bugzilla #1681) to CMakeLists.txt for MSVC 19. Applied Chris Wilson's patch (Bugzilla #1681) to CMakeLists.txt for MSVC
static compilation. Subsequently applied Chris Wilson's second patch, putting static compilation. Subsequently applied Chris Wilson's second patch, putting
the first patch under a new option instead of being unconditional when the first patch under a new option instead of being unconditional when
PCRE_STATIC is set. PCRE_STATIC is set.
20. Updated pcre2grep to set stdout as binary when run under Windows, so as not 20. Updated pcre2grep to set stdout as binary when run under Windows, so as not
to convert \r\n at the ends of reflected lines into \r\r\n. This required to convert \r\n at the ends of reflected lines into \r\r\n. This required
ensuring that other output that is written to stdout (e.g. file names) uses the ensuring that other output that is written to stdout (e.g. file names) uses the
appropriate line terminator: \r\n for Windows, \n otherwise. appropriate line terminator: \r\n for Windows, \n otherwise.
21. When a line is too long for pcre2grep's internal buffer, show the maximum 21. When a line is too long for pcre2grep's internal buffer, show the maximum
length in the error message. length in the error message.
22. Added support for string callouts to pcre2grep (Zoltan's patch with PH 22. Added support for string callouts to pcre2grep (Zoltan's patch with PH
additions). additions).
23. RunTest.bat was missing a "set type" line for test 22. 23. RunTest.bat was missing a "set type" line for test 22.
24. The pcre2posix.c file was including pcre2_internal.h, and using some 24. The pcre2posix.c file was including pcre2_internal.h, and using some
"private" knowledge of the data structures. This is unnecessary; the code has "private" knowledge of the data structures. This is unnecessary; the code has
been re-factored and no longer includes pcre2_internal.h. been re-factored and no longer includes pcre2_internal.h.
25. A race condition in JIT, reported by Mozilla, is fixed. 25. A race condition in JIT, reported by Mozilla, is fixed.
26. Minor code refactor to avoid "array subscript is below array bounds" 26. Minor code refactor to avoid "array subscript is below array bounds"
compiler warning. compiler warning.
27. Minor code refactor to avoid "left shift of negative number" warning. 27. Minor code refactor to avoid "left shift of negative number" warning.
28. Add a bit more sanity checking to pcre2_serialize_decode() and document 28. Add a bit more sanity checking to pcre2_serialize_decode() and document
that it expects trusted data. that it expects trusted data.
29. Fix typo in pcre2_jit_test.c 29. Fix typo in pcre2_jit_test.c
30. Due to an oversight, pcre2grep was not making use of JIT when available. 30. Due to an oversight, pcre2grep was not making use of JIT when available.
This is now fixed. This is now fixed.
31. The RunGrepTest script is updated to use the valgrind suppressions file 31. The RunGrepTest script is updated to use the valgrind suppressions file
when testing with JIT under valgrind (compare 10.21/51 below). The suppressions when testing with JIT under valgrind (compare 10.21/51 below). The suppressions
file is updated so that it is now the same as for PCRE1: it suppresses the file is updated so that it is now the same as for PCRE1: it suppresses the
Memcheck warnings Addr16 and Cond in unknown objects (that is, JIT-compiled Memcheck warnings Addr16 and Cond in unknown objects (that is, JIT-compiled
code). Also changed smc-check=all to smc-check=all-non-file as was done for code). Also changed smc-check=all to smc-check=all-non-file as was done for
RunTest (see 4 above). RunTest (see 4 above).
32. Implemented the PCRE2_NO_JIT option for pcre2_match(). 32. Implemented the PCRE2_NO_JIT option for pcre2_match().
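Items 16 and 18 above turn on a portability detail: a C99 snprintf() always writes a terminating zero when the buffer size is non-zero, but some older implementations (for example the _snprintf that item 18 maps snprintf to on older MSVC compilers) may leave a truncated buffer unterminated. The following is a minimal sketch of a defensive wrapper, not the code used in pcre2posix.c or pcre2test; the name format_message is hypothetical.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not part of PCRE2: copy a message into buf and
   guarantee a terminating zero even if the underlying snprintf() does not
   write one when the output is truncated. Returns the length of the full
   message (excluding the terminator); the message was truncated if the
   result is greater than or equal to size. */
static size_t format_message(char *buf, size_t size, const char *message)
{
    size_t needed = strlen(message);

    if (size > 0) {
        (void)snprintf(buf, size, "%s", message);
        buf[size - 1] = '\0';   /* do not rely on the library to terminate */
    }
    return needed;
}
```

A caller can compare the return value with the buffer size to decide whether the text was cut short, instead of inspecting the final byte of the buffer.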
@ -140,30 +140,30 @@ RunTest (see 4 above).
35. Fix potential negative index in pcre2test. 35. Fix potential negative index in pcre2test.
36. Calls to pcre2_get_error_message() with error numbers that are never 36. Calls to pcre2_get_error_message() with error numbers that are never
returned by PCRE2 functions were returning empty strings. Now the error code returned by PCRE2 functions were returning empty strings. Now the error code
PCRE2_ERROR_BADDATA is returned. A facility has been added to pcre2test to PCRE2_ERROR_BADDATA is returned. A facility has been added to pcre2test to
show the texts for given error numbers (i.e. to call pcre2_get_error_message() show the texts for given error numbers (i.e. to call pcre2_get_error_message()
and display what it returns) and a few representative error codes are now and display what it returns) and a few representative error codes are now
checked in RunTest. checked in RunTest.
37. Added "&& !defined(__INTEL_COMPILER)" to the test for __GNUC__ in 37. Added "&& !defined(__INTEL_COMPILER)" to the test for __GNUC__ in
pcre2_match.c, in anticipation that this is needed for the same reason it was pcre2_match.c, in anticipation that this is needed for the same reason it was
recently added to pcrecpp.cc in PCRE1. recently added to pcrecpp.cc in PCRE1.
38. Using -o with -M in pcre2grep could cause unnecessary repeated output when 38. Using -o with -M in pcre2grep could cause unnecessary repeated output when
the match extended over a line boundary, as it tried to find more matches "on the match extended over a line boundary, as it tried to find more matches "on
the same line" - but it was already over the end. the same line" - but it was already over the end.
39. Allow \C in lookbehinds and DFA matching in UTF-32 mode (by converting it 39. Allow \C in lookbehinds and DFA matching in UTF-32 mode (by converting it
to the same code as '.' when PCRE2_DOTALL is set). to the same code as '.' when PCRE2_DOTALL is set).
40. Fix two clang compiler warnings in pcre2test when only one code unit width 40. Fix two clang compiler warnings in pcre2test when only one code unit width
is supported. is supported.
41. Upgrade RunTest to automatically re-run test 2 with a large (64M) stack if 41. Upgrade RunTest to automatically re-run test 2 with a large (64M) stack if
it fails when running the interpreter with a 16M stack (and if changing the it fails when running the interpreter with a 16M stack (and if changing the
stack size via pcre2test is possible). This avoids having to manually set a stack size via pcre2test is possible). This avoids having to manually set a
large stack size when testing with clang. large stack size when testing with clang.
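The behaviour described in item 6 (a back reference in a pattern compiled with REG_NOSUB) can be sketched with the standard POSIX calls. This is an illustration only: the helper name nosub_match is hypothetical, and the pattern shown in the usage note assumes a native POSIX regcomp() using basic regular expression syntax. When linking against PCRE2's POSIX wrapper instead, patterns use Perl syntax, so the same test would be written as ([a-z])\1.

```c
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper: compile pattern with REG_NOSUB and test subject.
   Returns 1 on a match, 0 on no match, and -1 if compilation fails. */
static int nosub_match(const char *pattern, const char *subject)
{
    regex_t re;
    int rc;

    if (regcomp(&re, pattern, REG_NOSUB) != 0)
        return -1;

    /* With REG_NOSUB, nmatch and pmatch are ignored, so pass 0 and NULL. */
    rc = regexec(&re, subject, 0, NULL, 0);
    regfree(&re);
    return rc == 0 ? 1 : 0;
}
```

For example, nosub_match("\\([a-z]\\)\\1", "hello") looks for a doubled lower-case letter. No captured strings are returned to the caller, yet the back reference still has to work during matching, which is why REG_NOSUB no longer sets PCRE2_NO_AUTO_CAPTURE.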

6
NEWS

@ -6,17 +6,17 @@ Version 10.22 29-June-2016
1. ChangeLog has the details of a number of bug fixes. 1. ChangeLog has the details of a number of bug fixes.
2. The POSIX wrapper function regcomp() did not previously support back references 2. The POSIX wrapper function regcomp() did not previously support back references
and subroutine calls if called with the REG_NOSUB option. It now does. and subroutine calls if called with the REG_NOSUB option. It now does.
3. A new function, pcre2_code_copy(), is added, to make a copy of a compiled 3. A new function, pcre2_code_copy(), is added, to make a copy of a compiled
pattern. pattern.
4. Support for string callouts is added to pcre2grep. 4. Support for string callouts is added to pcre2grep.
5. Added the PCRE2_NO_JIT option to pcre2_match(). 5. Added the PCRE2_NO_JIT option to pcre2_match().
6. The pcre2_get_error_message() function now returns with a negative error 6. The pcre2_get_error_message() function now returns with a negative error
code if the error number it is given is unknown. code if the error number it is given is unknown.
7. Several updates have been made to pcre2test and test scripts (see 7. Several updates have been made to pcre2test and test scripts (see

6
README

@ -168,7 +168,7 @@ library. They are also documented in the pcre2build man page.
built. If you want only the 16-bit or 32-bit library, use --disable-pcre2-8 built. If you want only the 16-bit or 32-bit library, use --disable-pcre2-8
to disable building the 8-bit library. to disable building the 8-bit library.
. If you want to include support for just-in-time (JIT) compiling, which can . If you want to include support for just-in-time (JIT) compiling, which can
give large performance improvements on certain platforms, add --enable-jit to give large performance improvements on certain platforms, add --enable-jit to
the "configure" command. This support is available only for certain hardware the "configure" command. This support is available only for certain hardware
architectures. If you try to enable it on an unsupported architecture, there architectures. If you try to enable it on an unsupported architecture, there
@ -323,8 +323,8 @@ library. They are also documented in the pcre2build man page.
. When JIT support is enabled, pcre2grep automatically makes use of it, unless . When JIT support is enabled, pcre2grep automatically makes use of it, unless
you add --disable-pcre2grep-jit to the "configure" command. you add --disable-pcre2grep-jit to the "configure" command.
. On non-Windows systems there is support for calling external scripts during . On non-Windows systems there is support for calling external scripts during
matching in the pcre2grep command via PCRE2's callout facility with string matching in the pcre2grep command via PCRE2's callout facility with string
arguments. This support can be disabled by adding --disable-pcre2grep-callout arguments. This support can be disabled by adding --disable-pcre2grep-callout
to the "configure" command. to the "configure" command.


@ -48,8 +48,8 @@ else
echo "Testing $pcre2grep_version using valgrind" echo "Testing $pcre2grep_version using valgrind"
$pcre2test -C jit >/dev/null $pcre2test -C jit >/dev/null
if [ $? -ne 0 ]; then if [ $? -ne 0 ]; then
vjs="--suppressions=./testdata/valgrind-jit.supp" vjs="--suppressions=./testdata/valgrind-jit.supp"
fi fi
fi fi
# Set up a suitable "diff" command for comparison. Some systems have a diff # Set up a suitable "diff" command for comparison. Some systems have a diff
@ -633,15 +633,15 @@ if [ $? != 0 ] ; then exit 1; fi
# If pcre2grep supports script callouts, run some tests on them. # If pcre2grep supports script callouts, run some tests on them.
if $valgrind $vjs $pcre2grep --help | $valgrind $vjs $pcre2grep -q 'Callout scripts in patterns are supported'; then if $valgrind $vjs $pcre2grep --help | $valgrind $vjs $pcre2grep -q 'Callout scripts in patterns are supported'; then
echo "Testing pcre2grep script callouts" echo "Testing pcre2grep script callouts"
$valgrind $vjs $pcre2grep '(T)(..(.))(?C"/bin/echo|Arg1: [$1] [$2] [$3]|Arg2: $|${1}$| ($4) ($14) ($0)")()' $srcdir/testdata/grepinputv >testtrygrep $valgrind $vjs $pcre2grep '(T)(..(.))(?C"/bin/echo|Arg1: [$1] [$2] [$3]|Arg2: $|${1}$| ($4) ($14) ($0)")()' $srcdir/testdata/grepinputv >testtrygrep
$valgrind $vjs $pcre2grep '(T)(..(.))()()()()()()()(..)(?C"/bin/echo|Arg1: [$11] [${11}]")' $srcdir/testdata/grepinputv >>testtrygrep $valgrind $vjs $pcre2grep '(T)(..(.))()()()()()()()(..)(?C"/bin/echo|Arg1: [$11] [${11}]")' $srcdir/testdata/grepinputv >>testtrygrep
$cf $srcdir/testdata/grepoutputC testtrygrep $cf $srcdir/testdata/grepoutputC testtrygrep
if [ $? != 0 ] ; then exit 1; fi if [ $? != 0 ] ; then exit 1; fi
else else
echo "Script callouts are not supported" echo "Script callouts are not supported"
fi fi
# Finally, some tests to exercise code that is not tested above, just to be # Finally, some tests to exercise code that is not tested above, just to be
# sure that it runs OK. Doing this improves the coverage statistics. The output # sure that it runs OK. Doing this improves the coverage statistics. The output

10
RunTest

@ -314,7 +314,7 @@ fi
# line, set even bigger numbers. When the compiler is clang, sanitize options # line, set even bigger numbers. When the compiler is clang, sanitize options
# require an even bigger stack for test 2, and an increased stack for some of # require an even bigger stack for test 2, and an increased stack for some of
# the other tests. Test 2 now has code to automatically try again with a 64M # the other tests. Test 2 now has code to automatically try again with a 64M
# stack if it crashes when test2stack is "-S 16" when matching with the # stack if it crashes when test2stack is "-S 16" when matching with the
# interpreter. # interpreter.
$sim ./pcre2test -S 1 /dev/null /dev/null $sim ./pcre2test -S 1 /dev/null /dev/null
@ -502,7 +502,7 @@ for bmode in "$test8" "$test16" "$test32"; do
for opt in "" $jitopt; do for opt in "" $jitopt; do
$sim $valgrind ${opt:+$vjs} ./pcre2test -q $test2stack $bmode $opt $testdata/testinput2 testtry $sim $valgrind ${opt:+$vjs} ./pcre2test -q $test2stack $bmode $opt $testdata/testinput2 testtry
if [ $? = 0 ] ; then if [ $? = 0 ] ; then
$sim $valgrind ${opt:+$vjs} ./pcre2test -q $bmode $opt -error -63,-62,-2,-1,0,100,188,189 >>testtry $sim $valgrind ${opt:+$vjs} ./pcre2test -q $bmode $opt -error -63,-62,-2,-1,0,100,188,189 >>testtry
checkresult $? 2 "$opt" checkresult $? 2 "$opt"
else else
echo " " echo " "
@ -520,14 +520,14 @@ for bmode in "$test8" "$test16" "$test32"; do
echo $title2 "(excluding UTF-$bits) (64M stack)" echo $title2 "(excluding UTF-$bits) (64M stack)"
$sim $valgrind ${opt:+$vjs} ./pcre2test -q -S 64 $bmode $opt $testdata/testinput2 testtry $sim $valgrind ${opt:+$vjs} ./pcre2test -q -S 64 $bmode $opt $testdata/testinput2 testtry
if [ $? = 0 ] ; then if [ $? = 0 ] ; then
$sim $valgrind ${opt:+$vjs} ./pcre2test -q $bmode $opt -error -63,-62,-2,-1,0,100,188,189 >>testtry $sim $valgrind ${opt:+$vjs} ./pcre2test -q $bmode $opt -error -63,-62,-2,-1,0,100,188,189 >>testtry
checkresult $? 2 "$opt" checkresult $? 2 "$opt"
else else
echo " " echo " "
echo "** Failed with an increased stack size. Tests abandoned." echo "** Failed with an increased stack size. Tests abandoned."
echo " " echo " "
exit 1 exit 1
fi fi
fi fi
done done
fi fi


@ -156,8 +156,8 @@ if test "$HAVE_WINDOWS_H" != "1"; then
[disable callout script support in pcre2grep]), [disable callout script support in pcre2grep]),
, enable_pcre2grep_callout=yes) , enable_pcre2grep_callout=yes)
else else
enable_pcre2grep_callout=no enable_pcre2grep_callout=no
fi fi
# Handle --enable-rebuild-chartables # Handle --enable-rebuild-chartables
AC_ARG_ENABLE(rebuild-chartables, AC_ARG_ENABLE(rebuild-chartables,
@ -564,12 +564,12 @@ if test "$enable_pcre2grep_callout" = "yes"; then
if test "$HAVE_WINDOWS_H" != "1"; then if test "$HAVE_WINDOWS_H" != "1"; then
if test "$HAVE_SYS_WAIT_H" != "1"; then if test "$HAVE_SYS_WAIT_H" != "1"; then
AC_MSG_ERROR([Callout script support needs sys/wait.h.]) AC_MSG_ERROR([Callout script support needs sys/wait.h.])
fi fi
AC_DEFINE([SUPPORT_PCRE2GREP_CALLOUT], [], [ AC_DEFINE([SUPPORT_PCRE2GREP_CALLOUT], [], [
Define to any value to enable callout script support in pcre2grep.]) Define to any value to enable callout script support in pcre2grep.])
else else
AC_MSG_WARN([Callout script support is not available for Windows: disabled]) AC_MSG_WARN([Callout script support is not available for Windows: disabled])
enable_pcre2grep_callout=no enable_pcre2grep_callout=no
fi fi
fi fi


@ -168,7 +168,7 @@ library. They are also documented in the pcre2build man page.
built. If you want only the 16-bit or 32-bit library, use --disable-pcre2-8 built. If you want only the 16-bit or 32-bit library, use --disable-pcre2-8
to disable building the 8-bit library. to disable building the 8-bit library.
. If you want to include support for just-in-time (JIT) compiling, which can . If you want to include support for just-in-time (JIT) compiling, which can
give large performance improvements on certain platforms, add --enable-jit to give large performance improvements on certain platforms, add --enable-jit to
the "configure" command. This support is available only for certain hardware the "configure" command. This support is available only for certain hardware
architectures. If you try to enable it on an unsupported architecture, there architectures. If you try to enable it on an unsupported architecture, there
@ -323,8 +323,8 @@ library. They are also documented in the pcre2build man page.
. When JIT support is enabled, pcre2grep automatically makes use of it, unless . When JIT support is enabled, pcre2grep automatically makes use of it, unless
you add --disable-pcre2grep-jit to the "configure" command. you add --disable-pcre2grep-jit to the "configure" command.
. On non-Windows systems there is support for calling external scripts during . On non-Windows systems there is support for calling external scripts during
matching in the pcre2grep command via PCRE2's callout facility with string matching in the pcre2grep command via PCRE2's callout facility with string
arguments. This support can be disabled by adding --disable-pcre2grep-callout arguments. This support can be disabled by adding --disable-pcre2grep-callout
to the "configure" command. to the "configure" command.


@ -2542,12 +2542,12 @@ The internal recursion limit was reached.
<b> PCRE2_SIZE <i>bufflen</i>);</b> <b> PCRE2_SIZE <i>bufflen</i>);</b>
</P> </P>
<P> <P>
A text message for an error code from any PCRE2 function (compile, match, or A text message for an error code from any PCRE2 function (compile, match, or
auxiliary) can be obtained by calling <b>pcre2_get_error_message()</b>. The code auxiliary) can be obtained by calling <b>pcre2_get_error_message()</b>. The code
is passed as the first argument, with the remaining two arguments specifying a is passed as the first argument, with the remaining two arguments specifying a
code unit buffer and its length, into which the text message is placed. Note code unit buffer and its length, into which the text message is placed. Note
that the message is returned in code units of the appropriate width for the that the message is returned in code units of the appropriate width for the
library that is being used. library that is being used.
</P> </P>
<P> <P>
The returned message is terminated with a trailing zero, and the function The returned message is terminated with a trailing zero, and the function


@ -352,12 +352,12 @@ environment.
</P> </P>
<br><a name="SEC15" href="#TOC1">PCRE2GREP SUPPORT FOR EXTERNAL SCRIPTS</a><br> <br><a name="SEC15" href="#TOC1">PCRE2GREP SUPPORT FOR EXTERNAL SCRIPTS</a><br>
<P> <P>
By default, on non-Windows systems, <b>pcre2grep</b> supports the use of By default, on non-Windows systems, <b>pcre2grep</b> supports the use of
callouts with string arguments within the patterns it is matching, in order to callouts with string arguments within the patterns it is matching, in order to
run external scripts. For details, see the run external scripts. For details, see the
<a href="pcre2grep.html"><b>pcre2grep</b></a> <a href="pcre2grep.html"><b>pcre2grep</b></a>
documentation. This support can be disabled by adding documentation. This support can be disabled by adding
--disable-pcre2grep-callout to the <b>configure</b> command. --disable-pcre2grep-callout to the <b>configure</b> command.
</P> </P>
<br><a name="SEC16" href="#TOC1">PCRE2GREP OPTIONS FOR COMPRESSED FILE SUPPORT</a><br> <br><a name="SEC16" href="#TOC1">PCRE2GREP OPTIONS FOR COMPRESSED FILE SUPPORT</a><br>
<P> <P>


@ -511,10 +511,16 @@ line in which the match ended. If the matched string ends with a newline
sequence the output ends at the end of that line. sequence the output ends at the end of that line.
<br> <br>
<br> <br>
When this option is set, the PCRE2 library is called in "multiline" mode. When this option is set, the PCRE2 library is called in "multiline" mode. This
However, <b>pcre2grep</b> still processes the input line by line. The difference allows a matched string to extend past the end of a line and continue on one or
is that a matched string may extend past the end of a line and continue on more subsequent lines. However, <b>pcre2grep</b> still processes the input line
one or more subsequent lines. The newline sequence must be matched as part of by line. Once a match has been handled, scanning restarts at the beginning of
the next line, just as it does when <b>-M</b> is not present. This means that it
is possible for the second or subsequent lines in a multiline match to be
output again as part of another match.
<br>
<br>
The newline sequence that separates multiple lines must be matched as part of
the pattern. For example, to find the phrase "regular expression" in a file the pattern. For example, to find the phrase "regular expression" in a file
where "regular" might be at the end of a line and "expression" at the start of where "regular" might be at the end of a line and "expression" at the start of
the next line, you could use this command: the next line, you could use this command:
@ -746,7 +752,7 @@ with the <b>--help</b> option. If the support is not enabled, all callouts in
patterns are ignored by <b>pcre2grep</b>. patterns are ignored by <b>pcre2grep</b>.
</P> </P>
<P> <P>
A callout in a PCRE2 pattern is of the form (?C&#60;arg&#62;) where the argument is A callout in a PCRE2 pattern is of the form (?C&#60;arg&#62;) where the argument is
either a number or a quoted string (see the either a number or a quoted string (see the
<a href="pcre2callout.html"><b>pcre2callout</b></a> <a href="pcre2callout.html"><b>pcre2callout</b></a>
documentation for details). Numbered callouts are ignored by <b>pcre2grep</b>. documentation for details). Numbered callouts are ignored by <b>pcre2grep</b>.
@ -825,7 +831,7 @@ Cambridge, England.
</P> </P>
<br><a name="SEC15" href="#TOC1">REVISION</a><br> <br><a name="SEC15" href="#TOC1">REVISION</a><br>
<P> <P>
Last updated: 06 April 2016 Last updated: 19 June 2016
<br> <br>
Copyright &copy; 1997-2016 University of Cambridge. Copyright &copy; 1997-2016 University of Cambridge.
<br> <br>


@ -152,7 +152,7 @@ PCRE2_NO_UTF_CHECK, PCRE2_PARTIAL_HARD, and PCRE2_PARTIAL_SOFT. The
PCRE2_ANCHORED option is not supported at match time. PCRE2_ANCHORED option is not supported at match time.
</P> </P>
<P> <P>
If the PCRE2_NO_JIT option is passed to <b>pcre2_match()</b> it disables the If the PCRE2_NO_JIT option is passed to <b>pcre2_match()</b> it disables the
use of JIT, forcing matching by the interpreter code. use of JIT, forcing matching by the interpreter code.
</P> </P>
<P> <P>


@ -1256,17 +1256,22 @@ build PCRE2 with the use of \C permanently disabled.
<P> <P>
PCRE2 does not allow \C to appear in lookbehind assertions PCRE2 does not allow \C to appear in lookbehind assertions
<a href="#lookbehind">(described below)</a> <a href="#lookbehind">(described below)</a>
in a UTF mode, because this would make it impossible to calculate the length of in UTF-8 or UTF-16 modes, because this would make it impossible to calculate
the lookbehind. Neither the alternative matching function the length of the lookbehind. Neither the alternative matching function
<b>pcre2_dfa_match()</b> nor the JIT optimizer support \C in a UTF mode. The <b>pcre2_dfa_match()</b> nor the JIT optimizer support \C in these UTF modes.
former gives a match-time error; the latter fails to optimize and so the match The former gives a match-time error; the latter fails to optimize and so the
is always run using the interpreter. match is always run using the interpreter.
</P>
<P>
In the 32-bit library, however, \C is always supported (when not explicitly
locked out) because it always matches a single code unit, whether or not UTF-32
is specified.
</P> </P>
<P> <P>
In general, the \C escape sequence is best avoided. However, one way of using In general, the \C escape sequence is best avoided. However, one way of using
it that avoids the problem of malformed UTF characters is to use a lookahead to it that avoids the problem of malformed UTF-8 or UTF-16 characters is to use a
check the length of the next character, as in this pattern, which could be used lookahead to check the length of the next character, as in this pattern, which
with a UTF-8 string (ignore white space and line breaks): could be used with a UTF-8 string (ignore white space and line breaks):
<pre> <pre>
(?| (?=[\x00-\x7f])(\C) | (?| (?=[\x00-\x7f])(\C) |
(?=[\x80-\x{7ff}])(\C)(\C) | (?=[\x80-\x{7ff}])(\C)(\C) |
@ -3388,9 +3393,9 @@ Cambridge, England.
</P> </P>
<br><a name="SEC30" href="#TOC1">REVISION</a><br> <br><a name="SEC30" href="#TOC1">REVISION</a><br>
<P> <P>
Last updated: 13 November 2015 Last updated: 20 June 2016
<br> <br>
Copyright &copy; 1997-2015 University of Cambridge. Copyright &copy; 1997-2016 University of Cambridge.
<br> <br>
<p> <p>
Return to the <a href="index.html">PCRE2 index page</a>. Return to the <a href="index.html">PCRE2 index page</a>.


@ -121,8 +121,8 @@ defined POSIX behaviour for REG_NEWLINE (see the following section).
</pre> </pre>
When a pattern that is compiled with this flag is passed to <b>regexec()</b> for When a pattern that is compiled with this flag is passed to <b>regexec()</b> for
matching, the <i>nmatch</i> and <i>pmatch</i> arguments are ignored, and no matching, the <i>nmatch</i> and <i>pmatch</i> arguments are ignored, and no
captured strings are returned. Versions of the PCRE library prior to 10.22 used captured strings are returned. Versions of the PCRE library prior to 10.22 used
to set the PCRE2_NO_AUTO_CAPTURE compile option, but this no longer happens to set the PCRE2_NO_AUTO_CAPTURE compile option, but this no longer happens
because it disables the use of back references. because it disables the use of back references.
<pre> <pre>
REG_UCP REG_UCP


@ -39,8 +39,8 @@ an empty string. Comments in the code explain what is going on.
<P> <P>
The code in <b>pcre2demo.c</b> is an 8-bit program that uses the PCRE2 8-bit The code in <b>pcre2demo.c</b> is an 8-bit program that uses the PCRE2 8-bit
library. It handles strings and characters that are stored in 8-bit code units. library. It handles strings and characters that are stored in 8-bit code units.
By default, one character corresponds to one code unit, but if the pattern By default, one character corresponds to one code unit, but if the pattern
starts with "(*UTF)", both it and the subject are treated as UTF-8 strings, starts with "(*UTF)", both it and the subject are treated as UTF-8 strings,
where characters may occupy multiple code units. where characters may occupy multiple code units.
</P> </P>
<P> <P>


@ -51,10 +51,10 @@ reloaded using the 8-bit library.
</P> </P>
<br><a name="SEC2" href="#TOC1">SECURITY CONCERNS</a><br> <br><a name="SEC2" href="#TOC1">SECURITY CONCERNS</a><br>
<P> <P>
The facility for saving and restoring compiled patterns is intended for use The facility for saving and restoring compiled patterns is intended for use
within individual applications. As such, the data supplied to within individual applications. As such, the data supplied to
<b>pcre2_serialize_decode()</b> is expected to be trusted data, not data from <b>pcre2_serialize_decode()</b> is expected to be trusted data, not data from
arbitrary external sources. There is only some simple consistency checking, not arbitrary external sources. There is only some simple consistency checking, not
complete validation of what is being re-loaded. complete validation of what is being re-loaded.
</P> </P>
<br><a name="SEC3" href="#TOC1">SAVING COMPILED PATTERNS</a><br> <br><a name="SEC3" href="#TOC1">SAVING COMPILED PATTERNS</a><br>


@ -169,8 +169,8 @@ REVISION
Last updated: 16 October 2015 Last updated: 16 October 2015
Copyright (c) 1997-2015 University of Cambridge. Copyright (c) 1997-2015 University of Cambridge.
------------------------------------------------------------------------------ ------------------------------------------------------------------------------
PCRE2API(3) Library Functions Manual PCRE2API(3) PCRE2API(3) Library Functions Manual PCRE2API(3)
@ -3154,8 +3154,8 @@ REVISION
Last updated: 17 June 2016 Last updated: 17 June 2016
Copyright (c) 1997-2016 University of Cambridge. Copyright (c) 1997-2016 University of Cambridge.
------------------------------------------------------------------------------ ------------------------------------------------------------------------------
PCRE2BUILD(3) Library Functions Manual PCRE2BUILD(3) PCRE2BUILD(3) Library Functions Manual PCRE2BUILD(3)
@ -3647,8 +3647,8 @@ REVISION
Last updated: 01 April 2016 Last updated: 01 April 2016
Copyright (c) 1997-2016 University of Cambridge. Copyright (c) 1997-2016 University of Cambridge.
------------------------------------------------------------------------------ ------------------------------------------------------------------------------
PCRE2CALLOUT(3) Library Functions Manual PCRE2CALLOUT(3) PCRE2CALLOUT(3) Library Functions Manual PCRE2CALLOUT(3)
@ -4011,8 +4011,8 @@ REVISION
Last updated: 23 March 2015 Last updated: 23 March 2015
Copyright (c) 1997-2015 University of Cambridge. Copyright (c) 1997-2015 University of Cambridge.
------------------------------------------------------------------------------ ------------------------------------------------------------------------------
PCRE2COMPAT(3) Library Functions Manual PCRE2COMPAT(3) PCRE2COMPAT(3) Library Functions Manual PCRE2COMPAT(3)
@ -4196,8 +4196,8 @@ REVISION
Last updated: 15 March 2015 Last updated: 15 March 2015
Copyright (c) 1997-2015 University of Cambridge. Copyright (c) 1997-2015 University of Cambridge.
------------------------------------------------------------------------------ ------------------------------------------------------------------------------
PCRE2JIT(3) Library Functions Manual PCRE2JIT(3) PCRE2JIT(3) Library Functions Manual PCRE2JIT(3)
@ -4593,8 +4593,8 @@ REVISION
Last updated: 05 June 2016 Last updated: 05 June 2016
Copyright (c) 1997-2016 University of Cambridge. Copyright (c) 1997-2016 University of Cambridge.
------------------------------------------------------------------------------ ------------------------------------------------------------------------------
PCRE2LIMITS(3) Library Functions Manual PCRE2LIMITS(3) PCRE2LIMITS(3) Library Functions Manual PCRE2LIMITS(3)
@ -4671,8 +4671,8 @@ REVISION
Last updated: 05 November 2015 Last updated: 05 November 2015
Copyright (c) 1997-2015 University of Cambridge. Copyright (c) 1997-2015 University of Cambridge.
------------------------------------------------------------------------------ ------------------------------------------------------------------------------
PCRE2MATCHING(3) Library Functions Manual PCRE2MATCHING(3) PCRE2MATCHING(3) Library Functions Manual PCRE2MATCHING(3)
@ -4890,8 +4890,8 @@ REVISION
Last updated: 29 September 2014 Last updated: 29 September 2014
Copyright (c) 1997-2014 University of Cambridge. Copyright (c) 1997-2014 University of Cambridge.
------------------------------------------------------------------------------ ------------------------------------------------------------------------------
PCRE2PARTIAL(3) Library Functions Manual PCRE2PARTIAL(3) PCRE2PARTIAL(3) Library Functions Manual PCRE2PARTIAL(3)
@ -5330,8 +5330,8 @@ REVISION
Last updated: 22 December 2014 Last updated: 22 December 2014
Copyright (c) 1997-2014 University of Cambridge. Copyright (c) 1997-2014 University of Cambridge.
------------------------------------------------------------------------------ ------------------------------------------------------------------------------
PCRE2PATTERN(3) Library Functions Manual PCRE2PATTERN(3) PCRE2PATTERN(3) Library Functions Manual PCRE2PATTERN(3)
@ -6338,17 +6338,21 @@ MATCHING A SINGLE CODE UNIT
possible to build PCRE2 with the use of \C permanently disabled. possible to build PCRE2 with the use of \C permanently disabled.
PCRE2 does not allow \C to appear in lookbehind assertions (described PCRE2 does not allow \C to appear in lookbehind assertions (described
below) in a UTF mode, because this would make it impossible to calcu- below) in UTF-8 or UTF-16 modes, because this would make it impossible
late the length of the lookbehind. Neither the alternative matching to calculate the length of the lookbehind. Neither the alternative
function pcre2_dfa_match() nor the JIT optimizer support \C in a UTF matching function pcre2_dfa_match() nor the JIT optimizer support \C in
mode. The former gives a match-time error; the latter fails to optimize these UTF modes. The former gives a match-time error; the latter fails
and so the match is always run using the interpreter. to optimize and so the match is always run using the interpreter.
In the 32-bit library, however, \C is always supported (when not
explicitly locked out) because it always matches a single code unit,
whether or not UTF-32 is specified.
In general, the \C escape sequence is best avoided. However, one way of In general, the \C escape sequence is best avoided. However, one way of
using it that avoids the problem of malformed UTF characters is to use using it that avoids the problem of malformed UTF-8 or UTF-16 charac-
a lookahead to check the length of the next character, as in this pat- ters is to use a lookahead to check the length of the next character,
tern, which could be used with a UTF-8 string (ignore white space and as in this pattern, which could be used with a UTF-8 string (ignore
line breaks): white space and line breaks):
(?| (?=[\x00-\x7f])(\C) | (?| (?=[\x00-\x7f])(\C) |
(?=[\x80-\x{7ff}])(\C)(\C) | (?=[\x80-\x{7ff}])(\C)(\C) |
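The lookaheads in the pattern above pick the number of \C items from the range that the next code point falls in. A minimal sketch of that mapping in C follows. The first two ranges are the ones visible in the pattern; the three-unit and four-unit ranges are assumed from the standard UTF-8 encoding rules. The names utf8_length and utf8_encode are hypothetical helpers for illustration, not PCRE2 functions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Number of 8-bit code units needed for a code point in UTF-8.
   Returns 0 for surrogates and for values above U+10FFFF, which
   cannot be encoded. Hypothetical helper, not part of PCRE2. */
static int utf8_length(uint32_t cp)
{
  if (cp <= 0x7f) return 1;                    /* [\x00-\x7f]    */
  if (cp <= 0x7ff) return 2;                   /* [\x80-\x{7ff}] */
  if (cp >= 0xd800 && cp <= 0xdfff) return 0;  /* surrogates     */
  if (cp <= 0xffff) return 3;
  if (cp <= 0x10ffff) return 4;
  return 0;
}

/* Write the UTF-8 code units for cp into out and return how many were
   written (0 if cp cannot be encoded). Each code unit written here is
   what one \C would match in the 8-bit library. */
static int utf8_encode(uint32_t cp, unsigned char out[4])
{
  int len = utf8_length(cp);
  assert(out != NULL);
  switch (len)
    {
    case 1:
    out[0] = (unsigned char)cp;
    break;
    case 2:
    out[0] = (unsigned char)(0xc0 | (cp >> 6));
    out[1] = (unsigned char)(0x80 | (cp & 0x3f));
    break;
    case 3:
    out[0] = (unsigned char)(0xe0 | (cp >> 12));
    out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3f));
    out[2] = (unsigned char)(0x80 | (cp & 0x3f));
    break;
    case 4:
    out[0] = (unsigned char)(0xf0 | (cp >> 18));
    out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3f));
    out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3f));
    out[3] = (unsigned char)(0x80 | (cp & 0x3f));
    break;
    default:
    break;
    }
  return len;
}
```

For example, utf8_encode(0x20ac, buf) writes three code units, so the three-\C alternative would be the one that captures that character.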
@ -8363,11 +8367,11 @@ AUTHOR
REVISION REVISION
Last updated: 13 November 2015 Last updated: 20 June 2016
Copyright (c) 1997-2015 University of Cambridge. Copyright (c) 1997-2016 University of Cambridge.
------------------------------------------------------------------------------ ------------------------------------------------------------------------------
PCRE2PERFORM(3) Library Functions Manual PCRE2PERFORM(3) PCRE2PERFORM(3) Library Functions Manual PCRE2PERFORM(3)
@ -8539,8 +8543,8 @@ REVISION
Last updated: 02 January 2015 Last updated: 02 January 2015
Copyright (c) 1997-2015 University of Cambridge. Copyright (c) 1997-2015 University of Cambridge.
------------------------------------------------------------------------------ ------------------------------------------------------------------------------
PCRE2POSIX(3) Library Functions Manual PCRE2POSIX(3) PCRE2POSIX(3) Library Functions Manual PCRE2POSIX(3)
@ -8815,8 +8819,8 @@ REVISION
Last updated: 31 January 2016 Last updated: 31 January 2016
Copyright (c) 1997-2016 University of Cambridge. Copyright (c) 1997-2016 University of Cambridge.
------------------------------------------------------------------------------ ------------------------------------------------------------------------------
PCRE2SAMPLE(3) Library Functions Manual PCRE2SAMPLE(3) PCRE2SAMPLE(3) Library Functions Manual PCRE2SAMPLE(3)
@ -9081,8 +9085,8 @@ REVISION
Last updated: 24 May 2016 Last updated: 24 May 2016
Copyright (c) 1997-2016 University of Cambridge. Copyright (c) 1997-2016 University of Cambridge.
------------------------------------------------------------------------------ ------------------------------------------------------------------------------
PCRE2STACK(3) Library Functions Manual PCRE2STACK(3) PCRE2STACK(3) Library Functions Manual PCRE2STACK(3)
@ -9247,8 +9251,8 @@ REVISION
Last updated: 21 November 2014 Last updated: 21 November 2014
Copyright (c) 1997-2014 University of Cambridge. Copyright (c) 1997-2014 University of Cambridge.
------------------------------------------------------------------------------ ------------------------------------------------------------------------------
PCRE2SYNTAX(3) Library Functions Manual PCRE2SYNTAX(3) PCRE2SYNTAX(3) Library Functions Manual PCRE2SYNTAX(3)
@ -9683,8 +9687,8 @@ REVISION
Last updated: 16 October 2015 Last updated: 16 October 2015
Copyright (c) 1997-2015 University of Cambridge. Copyright (c) 1997-2015 University of Cambridge.
------------------------------------------------------------------------------ ------------------------------------------------------------------------------
PCRE2UNICODE(3) Library Functions Manual PCRE2UNICODE(3) PCRE2UNICODE(3) Library Functions Manual PCRE2UNICODE(3)
@ -9922,5 +9926,5 @@ REVISION
Last updated: 16 October 2015 Last updated: 16 October 2015
Copyright (c) 1997-2015 University of Cambridge. Copyright (c) 1997-2015 University of Cambridge.
------------------------------------------------------------------------------ ------------------------------------------------------------------------------


@ -2601,12 +2601,12 @@ The internal recursion limit was reached.
.B " PCRE2_SIZE \fIbufflen\fP);" .B " PCRE2_SIZE \fIbufflen\fP);"
.fi .fi
.P .P
A text message for an error code from any PCRE2 function (compile, match, or A text message for an error code from any PCRE2 function (compile, match, or
auxiliary) can be obtained by calling \fBpcre2_get_error_message()\fP. The code auxiliary) can be obtained by calling \fBpcre2_get_error_message()\fP. The code
is passed as the first argument, with the remaining two arguments specifying a is passed as the first argument, with the remaining two arguments specifying a
code unit buffer and its length, into which the text message is placed. Note code unit buffer and its length, into which the text message is placed. Note
that the message is returned in code units of the appropriate width for the that the message is returned in code units of the appropriate width for the
library that is being used. library that is being used.
.P .P
The returned message is terminated with a trailing zero, and the function The returned message is terminated with a trailing zero, and the function
returns the number of code units used, excluding the trailing zero. If the returns the number of code units used, excluding the trailing zero. If the


@ -355,14 +355,14 @@ environment.
.SH "PCRE2GREP SUPPORT FOR EXTERNAL SCRIPTS" .SH "PCRE2GREP SUPPORT FOR EXTERNAL SCRIPTS"
.rs .rs
.sp .sp
By default, on non-Windows systems, \fBpcre2grep\fP supports the use of By default, on non-Windows systems, \fBpcre2grep\fP supports the use of
callouts with string arguments within the patterns it is matching, in order to callouts with string arguments within the patterns it is matching, in order to
run external scripts. For details, see the run external scripts. For details, see the
.\" HREF .\" HREF
\fBpcre2grep\fP \fBpcre2grep\fP
.\" .\"
documentation. This support can be disabled by adding documentation. This support can be disabled by adding
--disable-pcre2grep-callout to the \fBconfigure\fP command. --disable-pcre2grep-callout to the \fBconfigure\fP command.
. .
. .
.SH "PCRE2GREP OPTIONS FOR COMPRESSED FILE SUPPORT" .SH "PCRE2GREP OPTIONS FOR COMPRESSED FILE SUPPORT"


@ -444,7 +444,7 @@ When this option is set, the PCRE2 library is called in "multiline" mode. This
allows a matched string to extend past the end of a line and continue on one or allows a matched string to extend past the end of a line and continue on one or
more subsequent lines. However, \fBpcre2grep\fP still processes the input line more subsequent lines. However, \fBpcre2grep\fP still processes the input line
by line. Once a match has been handled, scanning restarts at the beginning of by line. Once a match has been handled, scanning restarts at the beginning of
the next line, just as it does when \fB-M\fP is not present. This means that it the next line, just as it does when \fB-M\fP is not present. This means that it
is possible for the second or subsequent lines in a multiline match to be is possible for the second or subsequent lines in a multiline match to be
output again as part of another match. output again as part of another match.
.sp .sp
@ -668,7 +668,7 @@ You can find out whether your binary has support for callouts by running it
with the \fB--help\fP option. If the support is not enabled, all callouts in with the \fB--help\fP option. If the support is not enabled, all callouts in
patterns are ignored by \fBpcre2grep\fP. patterns are ignored by \fBpcre2grep\fP.
.P .P
A callout in a PCRE2 pattern is of the form (?C<arg>) where the argument is A callout in a PCRE2 pattern is of the form (?C<arg>) where the argument is
either a number or a quoted string (see the either a number or a quoted string (see the
.\" HREF .\" HREF
\fBpcre2callout\fP \fBpcre2callout\fP


@ -493,14 +493,20 @@ OPTIONS
end of that line. end of that line.
When this option is set, the PCRE2 library is called in "mul- When this option is set, the PCRE2 library is called in "mul-
tiline" mode. However, pcre2grep still processes the input tiline" mode. This allows a matched string to extend past the
line by line. The difference is that a matched string may end of a line and continue on one or more subsequent lines.
extend past the end of a line and continue on one or more However, pcre2grep still processes the input line by line.
subsequent lines. The newline sequence must be matched as Once a match has been handled, scanning restarts at the
part of the pattern. For example, to find the phrase "regular beginning of the next line, just as it does when -M is not
expression" in a file where "regular" might be at the end of present. This means that it is possible for the second or
a line and "expression" at the start of the next line, you subsequent lines in a multiline match to be output again as
could use this command: part of another match.
The newline sequence that separates multiple lines must be
matched as part of the pattern. For example, to find the
phrase "regular expression" in a file where "regular" might
be at the end of a line and "expression" at the start of the
next line, you could use this command:
pcre2grep -M 'regular\s+expression' <file> pcre2grep -M 'regular\s+expression' <file>
@ -816,5 +822,5 @@ AUTHOR
REVISION REVISION
Last updated: 06 April 2016 Last updated: 19 June 2016
Copyright (c) 1997-2016 University of Cambridge. Copyright (c) 1997-2016 University of Cambridge.


@ -128,7 +128,7 @@ PCRE2_NOTBOL, PCRE2_NOTEOL, PCRE2_NOTEMPTY, PCRE2_NOTEMPTY_ATSTART,
PCRE2_NO_UTF_CHECK, PCRE2_PARTIAL_HARD, and PCRE2_PARTIAL_SOFT. The PCRE2_NO_UTF_CHECK, PCRE2_PARTIAL_HARD, and PCRE2_PARTIAL_SOFT. The
PCRE2_ANCHORED option is not supported at match time. PCRE2_ANCHORED option is not supported at match time.
.P .P
If the PCRE2_NO_JIT option is passed to \fBpcre2_match()\fP it disables the If the PCRE2_NO_JIT option is passed to \fBpcre2_match()\fP it disables the
use of JIT, forcing matching by the interpreter code. use of JIT, forcing matching by the interpreter code.
.P .P
The only unsupported pattern items are \eC (match a single data unit) when The only unsupported pattern items are \eC (match a single data unit) when


@ -1262,7 +1262,7 @@ the length of the lookbehind. Neither the alternative matching function
The former gives a match-time error; the latter fails to optimize and so the The former gives a match-time error; the latter fails to optimize and so the
match is always run using the interpreter. match is always run using the interpreter.
.P .P
In the 32-bit library, however, \eC is always supported (when not explicitly In the 32-bit library, however, \eC is always supported (when not explicitly
locked out) because it always matches a single code unit, whether or not UTF-32 locked out) because it always matches a single code unit, whether or not UTF-32
is specified. is specified.
.P .P


@ -97,8 +97,8 @@ defined POSIX behaviour for REG_NEWLINE (see the following section).
.sp .sp
When a pattern that is compiled with this flag is passed to \fBregexec()\fP for When a pattern that is compiled with this flag is passed to \fBregexec()\fP for
matching, the \fInmatch\fP and \fIpmatch\fP arguments are ignored, and no matching, the \fInmatch\fP and \fIpmatch\fP arguments are ignored, and no
captured strings are returned. Versions of the PCRE library prior to 10.22 used captured strings are returned. Versions of the PCRE library prior to 10.22 used
to set the PCRE2_NO_AUTO_CAPTURE compile option, but this no longer happens to set the PCRE2_NO_AUTO_CAPTURE compile option, but this no longer happens
because it disables the use of back references. because it disables the use of back references.
.sp .sp
REG_UCP REG_UCP


@ -26,8 +26,8 @@ an empty string. Comments in the code explain what is going on.
.P .P
The code in \fBpcre2demo.c\fP is an 8-bit program that uses the PCRE2 8-bit The code in \fBpcre2demo.c\fP is an 8-bit program that uses the PCRE2 8-bit
library. It handles strings and characters that are stored in 8-bit code units. library. It handles strings and characters that are stored in 8-bit code units.
By default, one character corresponds to one code unit, but if the pattern By default, one character corresponds to one code unit, but if the pattern
starts with "(*UTF)", both it and the subject are treated as UTF-8 strings, starts with "(*UTF)", both it and the subject are treated as UTF-8 strings,
where characters may occupy multiple code units. where characters may occupy multiple code units.
.P .P
If PCRE2 is installed in the standard include and library directories for your If PCRE2 is installed in the standard include and library directories for your


@ -33,10 +33,10 @@ reloaded using the 8-bit library.
.SH "SECURITY CONCERNS" .SH "SECURITY CONCERNS"
.rs .rs
.sp .sp
The facility for saving and restoring compiled patterns is intended for use The facility for saving and restoring compiled patterns is intended for use
within individual applications. As such, the data supplied to within individual applications. As such, the data supplied to
\fBpcre2_serialize_decode()\fP is expected to be trusted data, not data from \fBpcre2_serialize_decode()\fP is expected to be trusted data, not data from
arbitrary external sources. There is only some simple consistency checking, not arbitrary external sources. There is only some simple consistency checking, not
complete validation of what is being re-loaded. complete validation of what is being re-loaded.
. .
. .


@ -111,6 +111,9 @@ sure both macros are undefined; an emulation function will then be used. */
/* Define to 1 if you have the <sys/types.h> header file. */ /* Define to 1 if you have the <sys/types.h> header file. */
/* #undef HAVE_SYS_TYPES_H */ /* #undef HAVE_SYS_TYPES_H */
/* Define to 1 if you have the <sys/wait.h> header file. */
/* #undef HAVE_SYS_WAIT_H */
/* Define to 1 if you have the <unistd.h> header file. */ /* Define to 1 if you have the <unistd.h> header file. */
/* #undef HAVE_UNISTD_H */ /* #undef HAVE_UNISTD_H */
@ -203,7 +206,7 @@ sure both macros are undefined; an emulation function will then be used. */
#define PACKAGE_NAME "PCRE2" #define PACKAGE_NAME "PCRE2"
/* Define to the full name and version of this package. */ /* Define to the full name and version of this package. */
#define PACKAGE_STRING "PCRE2 10.21" #define PACKAGE_STRING "PCRE2 10.22-RC1"
/* Define to the one symbol short name of this package. */ /* Define to the one symbol short name of this package. */
#define PACKAGE_TARNAME "pcre2" #define PACKAGE_TARNAME "pcre2"
@ -212,7 +215,7 @@ sure both macros are undefined; an emulation function will then be used. */
#define PACKAGE_URL "" #define PACKAGE_URL ""
/* Define to the version of this package. */ /* Define to the version of this package. */
#define PACKAGE_VERSION "10.21" #define PACKAGE_VERSION "10.22-RC1"
/* The value of PARENS_NEST_LIMIT specifies the maximum depth of nested /* The value of PARENS_NEST_LIMIT specifies the maximum depth of nested
parentheses (of any kind) in a pattern. This limits the amount of system parentheses (of any kind) in a pattern. This limits the amount of system
@ -271,6 +274,9 @@ sure both macros are undefined; an emulation function will then be used. */
is able to handle .gz files. */ is able to handle .gz files. */
/* #undef SUPPORT_LIBZ */ /* #undef SUPPORT_LIBZ */
/* Define to any value to enable callout script support in pcre2grep. */
/* #undef SUPPORT_PCRE2GREP_CALLOUT */
/* Define to any value to enable JIT support in pcre2grep. */ /* Define to any value to enable JIT support in pcre2grep. */
/* #undef SUPPORT_PCRE2GREP_JIT */ /* #undef SUPPORT_PCRE2GREP_JIT */
@ -293,7 +299,7 @@ sure both macros are undefined; an emulation function will then be used. */
/* #undef SUPPORT_VALGRIND */ /* #undef SUPPORT_VALGRIND */
/* Version number of package */ /* Version number of package */
#define VERSION "10.21" #define VERSION "10.22-RC1"
/* Define to empty if `const' does not conform to ANSI C. */ /* Define to empty if `const' does not conform to ANSI C. */
/* #undef const */ /* #undef const */


@ -155,7 +155,7 @@ through to pcre2_match(). */
#define PCRE2_SUBSTITUTE_UNKNOWN_UNSET 0x00000800u #define PCRE2_SUBSTITUTE_UNKNOWN_UNSET 0x00000800u
#define PCRE2_SUBSTITUTE_OVERFLOW_LENGTH 0x00001000u #define PCRE2_SUBSTITUTE_OVERFLOW_LENGTH 0x00001000u
/* A further option for pcre2_match(), not allowed for pcre2_dfa_match(), /* A further option for pcre2_match(), not allowed for pcre2_dfa_match(),
ignored for pcre2_jit_match(). */ ignored for pcre2_jit_match(). */
#define PCRE2_NO_JIT 0x00002000u #define PCRE2_NO_JIT 0x00002000u


@ -42,9 +42,9 @@ POSSIBILITY OF SUCH DAMAGE.
/* The current PCRE version information. */ /* The current PCRE version information. */
#define PCRE2_MAJOR 10 #define PCRE2_MAJOR 10
#define PCRE2_MINOR 21 #define PCRE2_MINOR 22
#define PCRE2_PRERELEASE #define PCRE2_PRERELEASE -RC1
#define PCRE2_DATE 2016-01-12 #define PCRE2_DATE 2016-06-29
/* When an application links to a PCRE DLL in Windows, the symbols that are /* When an application links to a PCRE DLL in Windows, the symbols that are
imported have to be identified as such. When building PCRE2, the appropriate imported have to be identified as such. When building PCRE2, the appropriate
@ -146,7 +146,8 @@ sanity checks). */
#define PCRE2_DFA_RESTART 0x00000040u #define PCRE2_DFA_RESTART 0x00000040u
#define PCRE2_DFA_SHORTEST 0x00000080u #define PCRE2_DFA_SHORTEST 0x00000080u
/* These are additional options for pcre2_substitute(). */ /* These are additional options for pcre2_substitute(), which passes any others
through to pcre2_match(). */
#define PCRE2_SUBSTITUTE_GLOBAL 0x00000100u #define PCRE2_SUBSTITUTE_GLOBAL 0x00000100u
#define PCRE2_SUBSTITUTE_EXTENDED 0x00000200u #define PCRE2_SUBSTITUTE_EXTENDED 0x00000200u
@ -154,6 +155,11 @@ sanity checks). */
#define PCRE2_SUBSTITUTE_UNKNOWN_UNSET 0x00000800u #define PCRE2_SUBSTITUTE_UNKNOWN_UNSET 0x00000800u
#define PCRE2_SUBSTITUTE_OVERFLOW_LENGTH 0x00001000u #define PCRE2_SUBSTITUTE_OVERFLOW_LENGTH 0x00001000u
/* A further option for pcre2_match(), not allowed for pcre2_dfa_match(),
ignored for pcre2_jit_match(). */
#define PCRE2_NO_JIT 0x00002000u
/* Newline and \R settings, for use in compile contexts. The newline values /* Newline and \R settings, for use in compile contexts. The newline values
must be kept in step with values set in config.h and both sets must all be must be kept in step with values set in config.h and both sets must all be
greater than zero. */ greater than zero. */
@ -245,6 +251,7 @@ numbers must not be changed. */
#define PCRE2_ERROR_BADSUBSTITUTION (-59) #define PCRE2_ERROR_BADSUBSTITUTION (-59)
#define PCRE2_ERROR_BADSUBSPATTERN (-60) #define PCRE2_ERROR_BADSUBSPATTERN (-60)
#define PCRE2_ERROR_TOOMANYREPLACE (-61) #define PCRE2_ERROR_TOOMANYREPLACE (-61)
#define PCRE2_ERROR_BADSERIALIZEDDATA (-62)
/* Request types for pcre2_pattern_info() */ /* Request types for pcre2_pattern_info() */
@ -436,7 +443,9 @@ PCRE2_EXP_DECL int pcre2_set_recursion_memory_management( \
PCRE2_EXP_DECL \ PCRE2_EXP_DECL \
pcre2_code *pcre2_compile(PCRE2_SPTR, PCRE2_SIZE, uint32_t, \ pcre2_code *pcre2_compile(PCRE2_SPTR, PCRE2_SIZE, uint32_t, \
int *, PCRE2_SIZE *, pcre2_compile_context *); \ int *, PCRE2_SIZE *, pcre2_compile_context *); \
PCRE2_EXP_DECL void pcre2_code_free(pcre2_code *); PCRE2_EXP_DECL void pcre2_code_free(pcre2_code *); \
PCRE2_EXP_DECL \
pcre2_code *pcre2_code_copy(const pcre2_code *);
/* Functions that give information about a compiled pattern. */ /* Functions that give information about a compiled pattern. */
@ -585,6 +594,7 @@ pcre2_compile are called by application code. */
/* Functions: the complete list in alphabetical order */ /* Functions: the complete list in alphabetical order */
#define pcre2_callout_enumerate PCRE2_SUFFIX(pcre2_callout_enumerate_) #define pcre2_callout_enumerate PCRE2_SUFFIX(pcre2_callout_enumerate_)
#define pcre2_code_copy PCRE2_SUFFIX(pcre2_code_copy_)
#define pcre2_code_free PCRE2_SUFFIX(pcre2_code_free_) #define pcre2_code_free PCRE2_SUFFIX(pcre2_code_free_)
#define pcre2_compile PCRE2_SUFFIX(pcre2_compile_) #define pcre2_compile PCRE2_SUFFIX(pcre2_compile_)
#define pcre2_compile_context_copy PCRE2_SUFFIX(pcre2_compile_context_copy_) #define pcre2_compile_context_copy PCRE2_SUFFIX(pcre2_compile_context_copy_)


@ -155,7 +155,7 @@ through to pcre2_match(). */
#define PCRE2_SUBSTITUTE_UNKNOWN_UNSET 0x00000800u #define PCRE2_SUBSTITUTE_UNKNOWN_UNSET 0x00000800u
#define PCRE2_SUBSTITUTE_OVERFLOW_LENGTH 0x00001000u #define PCRE2_SUBSTITUTE_OVERFLOW_LENGTH 0x00001000u
/* A further option for pcre2_match(), not allowed for pcre2_dfa_match(), /* A further option for pcre2_match(), not allowed for pcre2_dfa_match(),
ignored for pcre2_jit_match(). */ ignored for pcre2_jit_match(). */
#define PCRE2_NO_JIT 0x00002000u #define PCRE2_NO_JIT 0x00002000u
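
The option values in this hunk are single bits, so a caller can OR them together and a function can test for one of them with a mask. The sketch below copies the three values shown above and tests for PCRE2_NO_JIT. The function demo_may_use_jit is a hypothetical helper written for illustration; it is not how the library itself decides, only one way the documented meaning of the flag ("disables the use of JIT, forcing matching by the interpreter code") could be expressed.

```c
#include <assert.h>
#include <stdint.h>

/* Option values copied from the header text shown in this diff. */
#define PCRE2_SUBSTITUTE_UNKNOWN_UNSET    0x00000800u
#define PCRE2_SUBSTITUTE_OVERFLOW_LENGTH  0x00001000u
#define PCRE2_NO_JIT                      0x00002000u

/* Hypothetical helper, not a PCRE2 function: returns 1 if JIT code is
   available and the caller has not asked for the interpreter. */
static int demo_may_use_jit(uint32_t options, int have_jit_code)
{
  /* The new bit must not collide with the neighbouring option bits. */
  assert((PCRE2_NO_JIT &
    (PCRE2_SUBSTITUTE_UNKNOWN_UNSET | PCRE2_SUBSTITUTE_OVERFLOW_LENGTH)) == 0);
  return have_jit_code && (options & PCRE2_NO_JIT) == 0;
}
```

Because the test is a mask, other option bits set in the same word do not affect the result.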


@ -149,13 +149,13 @@ have to check them every time. */
#define OFLOW_MAX (INT_MAX - 20) #define OFLOW_MAX (INT_MAX - 20)
/* Macro for setting individual bits in class bitmaps. It took some /* Macro for setting individual bits in class bitmaps. It took some
experimenting to figure out how to stop gcc 5.3.0 from warning with experimenting to figure out how to stop gcc 5.3.0 from warning with
-Wconversion. This version gets a warning: -Wconversion. This version gets a warning:
#define SETBIT(a,b) a[(b)/8] |= (uint8_t)(1 << ((b)&7)) #define SETBIT(a,b) a[(b)/8] |= (uint8_t)(1 << ((b)&7))
Let's hope the apparently less efficient version isn't actually so bad if the Let's hope the apparently less efficient version isn't actually so bad if the
compiler is clever with identical subexpressions. */ compiler is clever with identical subexpressions. */
#define SETBIT(a,b) a[(b)/8] = (uint8_t)(a[(b)/8] | (1 << ((b)&7))) #define SETBIT(a,b) a[(b)/8] = (uint8_t)(a[(b)/8] | (1 << ((b)&7)))
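
As a rough illustration of how the SETBIT macro above behaves, the sketch below applies it to a 32-byte (256-bit) class bitmap. The macro text is copied from the hunk; the helpers class_add_range and class_has, and the 32-byte map size, are assumptions made for this example rather than code from pcre2_compile.c.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Copied from the hunk above: the explicit cast of the whole OR
   expression is the form that avoids the -Wconversion warning. */
#define SETBIT(a,b) a[(b)/8] = (uint8_t)(a[(b)/8] | (1 << ((b)&7)))

/* Hypothetical helper: set the bits for every character in first..last. */
static void class_add_range(uint8_t map[32], unsigned first, unsigned last)
{
  unsigned c;
  assert(first <= last && last <= 255);
  for (c = first; c <= last; c++) SETBIT(map, c);
}

/* Hypothetical helper: test whether the bit for character c is set. */
static int class_has(const uint8_t map[32], unsigned c)
{
  return (map[c/8] & (1u << (c & 7))) != 0;
}
```

Character c lives in byte c/8 at bit position c&7, so adding 'a' to 'c' (97 to 99) sets bits 1 to 3 of byte 12.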
@ -733,7 +733,7 @@ static const uint8_t opcode_possessify[] = {
* Copy compiled code * * Copy compiled code *
*************************************************/ *************************************************/
/* Compiled JIT code cannot be copied, so the new compiled block has no /* Compiled JIT code cannot be copied, so the new compiled block has no
associated JIT data. */ associated JIT data. */
PCRE2_EXP_DEFN pcre2_code * PCRE2_CALL_CONVENTION PCRE2_EXP_DEFN pcre2_code * PCRE2_CALL_CONVENTION
@ -748,14 +748,14 @@ if (newcode == NULL) return NULL;
memcpy(newcode, code, code->blocksize); memcpy(newcode, code, code->blocksize);
newcode->executable_jit = NULL; newcode->executable_jit = NULL;
/* If the code is one that has been deserialized, increment the reference count /* If the code is one that has been deserialized, increment the reference count
in the decoded tables. */ in the decoded tables. */
if ((code->flags & PCRE2_DEREF_TABLES) != 0) if ((code->flags & PCRE2_DEREF_TABLES) != 0)
{ {
ref_count = (PCRE2_SIZE *)(code->tables + tables_length); ref_count = (PCRE2_SIZE *)(code->tables + tables_length);
(*ref_count)++; (*ref_count)++;
} }
return newcode; return newcode;
} }
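
The copy logic in this hunk can be sketched in isolation as follows. This is a simplified model, not the library code: demo_code, DEMO_DEREF_TABLES, and demo_code_copy are hypothetical names, and the shared reference count is reached through a plain pointer, whereas the real code finds it immediately after the character tables. The three steps are the ones the hunk shows: duplicate the block, clear the JIT pointer in the copy, and bump the shared count when the tables came from deserialization.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define DEMO_DEREF_TABLES 0x1u   /* hypothetical stand-in for PCRE2_DEREF_TABLES */

/* Hypothetical, much reduced model of a compiled pattern block. */
typedef struct demo_code {
  size_t blocksize;         /* total size of this block in bytes */
  unsigned flags;
  void *executable_jit;     /* JIT data; cannot be shared by a copy */
  size_t *tables_refcount;  /* count shared by all users of the tables */
} demo_code;

static demo_code *demo_code_copy(const demo_code *code)
{
  demo_code *newcode;
  if (code == NULL) return NULL;
  assert(code->blocksize >= sizeof(demo_code));
  newcode = malloc(code->blocksize);
  if (newcode == NULL) return NULL;
  memcpy(newcode, code, code->blocksize);
  newcode->executable_jit = NULL;   /* the copy has no associated JIT data */

  /* Tables obtained by deserialization are shared, so record one more user. */
  if ((code->flags & DEMO_DEREF_TABLES) != 0) (*code->tables_refcount)++;
  return newcode;
}
```

Incrementing the count at copy time means whichever of the two blocks is freed last is the one that releases the shared tables.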
@ -881,7 +881,7 @@ Returns: if non-negative, the fixed length,
or -2 if there is no fixed length, or -2 if there is no fixed length,
or -3 if \C was encountered (in UTF mode only) or -3 if \C was encountered (in UTF mode only)
or -4 if length is too long or -4 if length is too long
or -5 if regex is too complicated or -5 if regex is too complicated
or -6 if an unknown opcode was encountered (internal error) or -6 if an unknown opcode was encountered (internal error)
*/ */
@ -1117,7 +1117,7 @@ for (;;)
cc++; cc++;
break; break;
/* The single-byte matcher isn't allowed. This only happens in UTF-8 or /* The single-byte matcher isn't allowed. This only happens in UTF-8 or
UTF-16 mode; otherwise \C is coded as OP_ALLANY. */ UTF-16 mode; otherwise \C is coded as OP_ALLANY. */
case OP_ANYBYTE: case OP_ANYBYTE:
@ -3515,7 +3515,7 @@ for (; ptr < cb->end_pattern; ptr++)
case CHAR_U: case CHAR_U:
break; break;
default: default:
errorcode = ERR11; errorcode = ERR11;
ptr--; /* Correct the offset */ ptr--; /* Correct the offset */
goto FAILED; goto FAILED;
@ -3810,7 +3810,7 @@ if (nest_depth == 0)
return 0; return 0;
} }
/* We give a special error for a missing closing parenthesis after (?# because /* We give a special error for a missing closing parenthesis after (?# because
it might otherwise be hard to see where the missing character is. */ it might otherwise be hard to see where the missing character is. */
errorcode = (skiptoket == CHAR_NUMBER_SIGN)? ERR18 : ERR14; errorcode = (skiptoket == CHAR_NUMBER_SIGN)? ERR18 : ERR14;
@ -3963,10 +3963,10 @@ for (;; ptr++)
uint32_t subreqcu, subfirstcu; uint32_t subreqcu, subfirstcu;
int32_t subreqcuflags, subfirstcuflags; /* Must be signed */ int32_t subreqcuflags, subfirstcuflags; /* Must be signed */
PCRE2_UCHAR mcbuffer[8]; PCRE2_UCHAR mcbuffer[8];
/* Come here to restart the loop. */ /* Come here to restart the loop. */
REDO_LOOP: REDO_LOOP:
/* Get next character in the pattern */ /* Get next character in the pattern */
@ -4249,14 +4249,14 @@ for (;; ptr++)
{ {
cb->nestptr[0] = ptr + 7; cb->nestptr[0] = ptr + 7;
ptr = sub_start_of_word; ptr = sub_start_of_word;
goto REDO_LOOP; goto REDO_LOOP;
} }
if (PRIV(strncmp_c8)(ptr+1, STRING_WEIRD_ENDWORD, 6) == 0) if (PRIV(strncmp_c8)(ptr+1, STRING_WEIRD_ENDWORD, 6) == 0)
{ {
cb->nestptr[0] = ptr + 7; cb->nestptr[0] = ptr + 7;
ptr = sub_end_of_word; ptr = sub_end_of_word;
goto REDO_LOOP; goto REDO_LOOP;
} }
/* Handle a real character class. */ /* Handle a real character class. */
@ -7430,7 +7430,7 @@ for (;; ptr++)
*code++ = (escape == ESC_C)? OP_ALLANY : escape; *code++ = (escape == ESC_C)? OP_ALLANY : escape;
#else #else
*code++ = (!utf && escape == ESC_C)? OP_ALLANY : escape; *code++ = (!utf && escape == ESC_C)? OP_ALLANY : escape;
#endif #endif
} }
} }
continue; continue;

@ -68,7 +68,7 @@ Arguments:
where where to put the information where where to put the information
Returns: 0 if a numerical value is returned Returns: 0 if a numerical value is returned
>= 0 if a string value >= 0 if a string value
PCRE2_ERROR_BADOPTION if "where" not recognized PCRE2_ERROR_BADOPTION if "where" not recognized
or JIT target requested when JIT not enabled or JIT target requested when JIT not enabled
*/ */

@ -2783,7 +2783,7 @@ for (;;)
#endif #endif
if (charcount > 0) if (charcount > 0)
{ {
ADD_NEW_DATA(-(state_offset + LINK_SIZE + 1), 0, ADD_NEW_DATA(-(state_offset + LINK_SIZE + 1), 0,
(int)(charcount - 1)); (int)(charcount - 1));
} }
else else
@ -3337,7 +3337,7 @@ if (!anchored)
{ {
first_cu2 = TABLE_GET(first_cu, mb->tables + fcc_offset, first_cu); first_cu2 = TABLE_GET(first_cu, mb->tables + fcc_offset, first_cu);
#if defined SUPPORT_UNICODE && PCRE2_CODE_UNIT_WIDTH != 8 #if defined SUPPORT_UNICODE && PCRE2_CODE_UNIT_WIDTH != 8
if (utf && first_cu > 127) if (utf && first_cu > 127)
first_cu2 = (PCRE2_UCHAR)UCD_OTHERCASE(first_cu); first_cu2 = (PCRE2_UCHAR)UCD_OTHERCASE(first_cu);
#endif #endif
} }

@ -1977,7 +1977,7 @@ while (ptr < endptr)
match = match_patterns(matchptr, length, options, startoffset, &mrc); match = match_patterns(matchptr, length, options, startoffset, &mrc);
options = PCRE2_NOTEMPTY; options = PCRE2_NOTEMPTY;
/* If it's a match or a not-match (as required), do what's wanted. */ /* If it's a match or a not-match (as required), do what's wanted. */
if (match != invert) if (match != invert)
@ -2794,7 +2794,7 @@ if ((popts & PO_FIXED_STRINGS) != 0)
} }
sprintf((char *)buffer, "%s%.*s%s", prefix[popts], patlen, ps, suffix[popts]); sprintf((char *)buffer, "%s%.*s%s", prefix[popts], patlen, ps, suffix[popts]);
p->compiled = pcre2_compile(buffer, PCRE2_ZERO_TERMINATED, options, &errcode, p->compiled = pcre2_compile(buffer, PCRE2_ZERO_TERMINATED, options, &errcode,
&erroffset, compile_context); &erroffset, compile_context);
/* Handle successful compile */ /* Handle successful compile */