Code tidies for 10.30-RC1 release candidate.
parent e3052af6fd
commit 810d9b6da5
|
@ -2,8 +2,8 @@ Change Log for PCRE2
|
||||||
--------------------
|
--------------------
|
||||||
|
|
||||||
|
|
||||||
Version 10.30-DEV 09-March-2017
|
Version 10.30-RC1 18-July-2017
|
||||||
-------------------------------
|
------------------------------
|
||||||
|
|
||||||
1. The main interpreter, pcre2_match(), has been refactored into a new version
|
1. The main interpreter, pcre2_match(), has been refactored into a new version
|
||||||
that does not use recursive function calls (and therefore the stack) for
|
that does not use recursive function calls (and therefore the stack) for
|
||||||
|
|
57
NEWS
|
@ -1,6 +1,63 @@
|
||||||
News about PCRE2 releases
|
News about PCRE2 releases
|
||||||
-------------------------
|
-------------------------
|
||||||
|
|
||||||
|
Version 10.30-RC1 18-July-2017
|
||||||
|
------------------------------
|
||||||
|
|
||||||
|
The full list of changes that includes bugfixes and tidies is, as always, in
|
||||||
|
ChangeLog. These are the most important new features:
|
||||||
|
|
||||||
|
1. The main interpreter, pcre2_match(), has been refactored into a new version
|
||||||
|
that does not use recursive function calls (and therefore the system stack) for
|
||||||
|
remembering backtracking positions. This makes --disable-stack-for-recursion a
|
||||||
|
NOOP. The new implementation allows backtracking into recursive group calls in
|
||||||
|
patterns, making it more compatible with Perl, and also fixes some other
|
||||||
|
previously hard-to-do issues. For patterns that have a lot of backtracking, the
|
||||||
|
heap is now used, and there is an explicit limit on the amount, settable by
|
||||||
|
pcre2_set_heap_limit() or (*LIMIT_HEAP=xxx). The "recursion limit" is retained,
|
||||||
|
but is renamed as "depth limit" (though the old names remain for
|
||||||
|
compatibility).
|
||||||
|
|
||||||
|
There is also a change in the way callouts from pcre2_match() are handled. The
|
||||||
|
offset_vector field in the callout block is no longer a pointer to the
|
||||||
|
actual ovector that was passed to the matching function in the match data
|
||||||
|
block. Instead it points to an internal ovector of a size large enough to hold
|
||||||
|
all possible captured substrings in the pattern.
|
||||||
|
|
||||||
|
2. The new option PCRE2_ENDANCHORED insists that a pattern match must end at
|
||||||
|
the end of the subject.
|
||||||
|
|
||||||
|
3. The new option PCRE2_EXTENDED_MORE implements Perl's /xx feature, and
|
||||||
|
pcre2test is upgraded to support it. Setting within the pattern by (?xx) is
|
||||||
|
also supported.
|
||||||
|
|
||||||
|
4. (?n) can be used to set PCRE2_NO_AUTO_CAPTURE, because Perl now has this.
|
||||||
|
|
||||||
|
5. Additional compile options in the compile context are now available, and the
|
||||||
|
first two are: PCRE2_EXTRA_ALLOW_SURROGATE_ESCAPES and
|
||||||
|
PCRE2_EXTRA_BAD_ESCAPE_IS_LITERAL.
|
||||||
|
|
||||||
|
6. The newline type PCRE2_NEWLINE_NUL is now available.
|
||||||
|
|
||||||
|
7. The match limit value now also applies to pcre2_dfa_match() as there are
|
||||||
|
patterns that can use up a lot of resources without necessarily recursing very
|
||||||
|
deeply.
|
||||||
|
|
||||||
|
8. The option REG_PEND (a GNU extension) is now available for the POSIX
|
||||||
|
wrapper. Also there is a new option PCRE2_LITERAL which is used to support
|
||||||
|
REG_NOSPEC.
|
||||||
|
|
||||||
|
9. PCRE2_EXTRA_MATCH_LINE and PCRE2_EXTRA_MATCH_WORD are implemented for the
|
||||||
|
benefit of pcre2grep, and pcre2grep's -F, -w, and -x options are re-implemented
|
||||||
|
using PCRE2_LITERAL, PCRE2_EXTRA_MATCH_WORD, and PCRE2_EXTRA_MATCH_LINE. This
|
||||||
|
is tidier and also fixes some bugs.
|
||||||
|
|
||||||
|
10. The Unicode tables are upgraded from Unicode 8.0.0 to Unicode 10.0.0.
|
||||||
|
|
||||||
|
11. There are some experimental functions for converting foreign patterns
|
||||||
|
(globs and POSIX patterns) into PCRE2 patterns.
|
||||||
|
|
||||||
|
|
||||||
Version 10.23 14-February-2017
|
Version 10.23 14-February-2017
|
||||||
------------------------------
|
------------------------------
|
||||||
|
|
||||||
|
|
47
README
|
@ -198,13 +198,14 @@ library. They are also documented in the pcre2build man page.
|
||||||
or starting a pattern with (*UCP).
|
or starting a pattern with (*UCP).
|
||||||
|
|
||||||
. You can build PCRE2 to recognize either CR or LF or the sequence CRLF, or any
|
. You can build PCRE2 to recognize either CR or LF or the sequence CRLF, or any
|
||||||
of the preceding, or any of the Unicode newline sequences, as indicating the
|
of the preceding, or any of the Unicode newline sequences, or the NUL (zero)
|
||||||
end of a line. Whatever you specify at build time is the default; the caller
|
character as indicating the end of a line. Whatever you specify at build time
|
||||||
of PCRE2 can change the selection at run time. The default newline indicator
|
is the default; the caller of PCRE2 can change the selection at run time. The
|
||||||
is a single LF character (the Unix standard). You can specify the default
|
default newline indicator is a single LF character (the Unix standard). You
|
||||||
newline indicator by adding --enable-newline-is-cr, --enable-newline-is-lf,
|
can specify the default newline indicator by adding --enable-newline-is-cr,
|
||||||
--enable-newline-is-crlf, --enable-newline-is-anycrlf, or
|
--enable-newline-is-lf, --enable-newline-is-crlf,
|
||||||
--enable-newline-is-any to the "configure" command, respectively.
|
--enable-newline-is-anycrlf, --enable-newline-is-any, or
|
||||||
|
--enable-newline-is-nul to the "configure" command, respectively.
|
||||||
|
|
||||||
. By default, the sequence \R in a pattern matches any Unicode line ending
|
. By default, the sequence \R in a pattern matches any Unicode line ending
|
||||||
sequence. This is independent of the option specifying what PCRE2 considers
|
sequence. This is independent of the option specifying what PCRE2 considers
|
||||||
|
@ -227,15 +228,15 @@ library. They are also documented in the pcre2build man page.
|
||||||
--with-parens-nest-limit=500
|
--with-parens-nest-limit=500
|
||||||
|
|
||||||
. PCRE2 has a counter that can be set to limit the amount of computing resource
|
. PCRE2 has a counter that can be set to limit the amount of computing resource
|
||||||
it uses when matching a pattern with the Perl-compatible matching function.
|
it uses when matching a pattern. If the limit is exceeded during a match, the
|
||||||
If the limit is exceeded during a match, the match fails. The default is ten
|
match fails. The default is ten million. You can change the default by
|
||||||
million. You can change the default by setting, for example,
|
setting, for example,
|
||||||
|
|
||||||
--with-match-limit=500000
|
--with-match-limit=500000
|
||||||
|
|
||||||
on the "configure" command. This is just the default; individual calls to
|
on the "configure" command. This is just the default; individual calls to
|
||||||
pcre2_match() can supply their own value. There is more discussion in the
|
pcre2_match() or pcre2_dfa_match() can supply their own value. There is more
|
||||||
pcre2api man page (search for pcre2_set_match_limit).
|
discussion in the pcre2api man page (search for pcre2_set_match_limit).
|
||||||
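As a rough illustration of the two build-time settings described above, the following sketch assembles a configure command line that selects NUL as the default newline and lowers the default match limit. The option names come from the README text; the helper name `pcre2_configure_cmd`, and the choice to print the command rather than run it, are assumptions made for this example.

```shell
# Hypothetical helper: print (rather than run) a configure command line.
# $1 is a newline convention (cr, lf, crlf, anycrlf, any, nul);
# $2 is the default match limit.
pcre2_configure_cmd() {
  echo "./configure --enable-newline-is-$1 --with-match-limit=$2"
}

pcre2_configure_cmd nul 500000
```

Both settings only establish defaults: as the README notes, the caller can still change the newline convention at run time, and individual calls to pcre2_match() or pcre2_dfa_match() can supply their own match limit.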
|
|
||||||
. There is a separate counter that limits the depth of nested backtracking
|
. There is a separate counter that limits the depth of nested backtracking
|
||||||
during a matching process, which indirectly limits the amount of heap memory
|
during a matching process, which indirectly limits the amount of heap memory
|
||||||
|
@ -659,9 +660,10 @@ with the perltest.sh script, and test 5 checking PCRE2-specific things.
|
||||||
Tests 6 and 7 check the pcre2_dfa_match() alternative matching function, in
|
Tests 6 and 7 check the pcre2_dfa_match() alternative matching function, in
|
||||||
non-UTF mode and UTF-mode with Unicode property support, respectively.
|
non-UTF mode and UTF-mode with Unicode property support, respectively.
|
||||||
|
|
||||||
Test 8 checks some internal offsets and code size features; it is run only when
|
Test 8 checks some internal offsets and code size features, but it is run only
|
||||||
the default "link size" of 2 is set (in other cases the sizes change) and when
|
when Unicode support is enabled. The output is different in 8-bit, 16-bit, and
|
||||||
Unicode support is enabled.
|
32-bit modes and for different link sizes, so there are different output files
|
||||||
|
for each mode and link size.
|
||||||
|
|
||||||
Tests 9 and 10 are run only in 8-bit mode, and tests 11 and 12 are run only in
|
Tests 9 and 10 are run only in 8-bit mode, and tests 11 and 12 are run only in
|
||||||
16-bit and 32-bit modes. These are tests that generate different output in
|
16-bit and 32-bit modes. These are tests that generate different output in
|
||||||
|
@ -671,7 +673,7 @@ Test 13 checks the handling of non-UTF characters greater than 255 by
|
||||||
pcre2_dfa_match() in 16-bit and 32-bit modes.
|
pcre2_dfa_match() in 16-bit and 32-bit modes.
|
||||||
|
|
||||||
Test 14 contains some special UTF and UCP tests that give different output for
|
Test 14 contains some special UTF and UCP tests that give different output for
|
||||||
the different widths.
|
different code unit widths.
|
||||||
|
|
||||||
Test 15 contains a number of tests that must not be run with JIT. They check,
|
Test 15 contains a number of tests that must not be run with JIT. They check,
|
||||||
among other non-JIT things, the match-limiting features of the interpretive
|
among other non-JIT things, the match-limiting features of the interpretive
|
||||||
|
@ -692,6 +694,9 @@ patterns to a file, and then reloading and checking them.
|
||||||
Tests 21 and 22 test \C support when the use of \C is not locked out, without
|
Tests 21 and 22 test \C support when the use of \C is not locked out, without
|
||||||
and with UTF support, respectively. Test 23 tests \C when it is locked out.
|
and with UTF support, respectively. Test 23 tests \C when it is locked out.
|
||||||
|
|
||||||
|
Tests 24 and 25 test the experimental pattern conversion functions, without and
|
||||||
|
with UTF support, respectively.
|
||||||
|
|
||||||
|
|
||||||
Character tables
|
Character tables
|
||||||
----------------
|
----------------
|
||||||
|
@ -710,7 +715,7 @@ specified for ./configure, a different version of pcre2_chartables.c is built
|
||||||
by the program dftables (compiled from dftables.c), which uses the ANSI C
|
by the program dftables (compiled from dftables.c), which uses the ANSI C
|
||||||
character handling functions such as isalnum(), isalpha(), isupper(),
|
character handling functions such as isalnum(), isalpha(), isupper(),
|
||||||
islower(), etc. to build the table sources. This means that the default C
|
islower(), etc. to build the table sources. This means that the default C
|
||||||
locale which is set for your system will control the contents of these default
|
locale that is set for your system will control the contents of these default
|
||||||
tables. You can change the default tables by editing pcre2_chartables.c and
|
tables. You can change the default tables by editing pcre2_chartables.c and
|
||||||
then re-building PCRE2. If you do this, you should take care to ensure that the
|
then re-building PCRE2. If you do this, you should take care to ensure that the
|
||||||
file does not get automatically re-generated. The best way to do this is to
|
file does not get automatically re-generated. The best way to do this is to
|
||||||
|
@ -765,6 +770,7 @@ The distribution should contain the files listed below.
|
||||||
src/pcre2_compile.c )
|
src/pcre2_compile.c )
|
||||||
src/pcre2_config.c )
|
src/pcre2_config.c )
|
||||||
src/pcre2_context.c )
|
src/pcre2_context.c )
|
||||||
|
src/pcre2_convert.c )
|
||||||
src/pcre2_dfa_match.c )
|
src/pcre2_dfa_match.c )
|
||||||
src/pcre2_error.c )
|
src/pcre2_error.c )
|
||||||
src/pcre2_find_bracket.c )
|
src/pcre2_find_bracket.c )
|
||||||
|
@ -804,7 +810,6 @@ The distribution should contain the files listed below.
|
||||||
src/pcre2demo.c simple demonstration of coding calls to PCRE2
|
src/pcre2demo.c simple demonstration of coding calls to PCRE2
|
||||||
src/pcre2grep.c source of a grep utility that uses PCRE2
|
src/pcre2grep.c source of a grep utility that uses PCRE2
|
||||||
src/pcre2test.c comprehensive test program
|
src/pcre2test.c comprehensive test program
|
||||||
src/pcre2_printint.c part of pcre2test
|
|
||||||
src/pcre2_jit_test.c JIT test program
|
src/pcre2_jit_test.c JIT test program
|
||||||
|
|
||||||
(C) Auxiliary files:
|
(C) Auxiliary files:
|
||||||
|
@ -869,12 +874,12 @@ The distribution should contain the files listed below.
|
||||||
|
|
||||||
(E) Auxiliary files for building PCRE2 "by hand"
|
(E) Auxiliary files for building PCRE2 "by hand"
|
||||||
|
|
||||||
pcre2.h.generic ) a version of the public PCRE2 header file
|
src/pcre2.h.generic ) a version of the public PCRE2 header file
|
||||||
) for use in non-"configure" environments
|
) for use in non-"configure" environments
|
||||||
config.h.generic ) a version of config.h for use in non-"configure"
|
src/config.h.generic ) a version of config.h for use in non-"configure"
|
||||||
) environments
|
) environments
|
||||||
|
|
||||||
Philip Hazel
|
Philip Hazel
|
||||||
Email local part: ph10
|
Email local part: ph10
|
||||||
Email domain: cam.ac.uk
|
Email domain: cam.ac.uk
|
||||||
Last updated: 17 June 2017
|
Last updated: 18 July 2017
|
||||||
|
|
6
RunTest
|
@ -830,7 +830,7 @@ for bmode in "$test8" "$test16" "$test32"; do
|
||||||
if [ $supportBSC -ne 0 ] ; then
|
if [ $supportBSC -ne 0 ] ; then
|
||||||
echo " Skipped because \C is not disabled"
|
echo " Skipped because \C is not disabled"
|
||||||
else
|
else
|
||||||
$sim $valgrind ${opt:+$vjs} ./pcre2test -q $setstack $bmode $opt $testdata/testinput23 testtry
|
$sim $valgrind ./pcre2test -q $setstack $bmode $testdata/testinput23 testtry
|
||||||
checkresult $? 23 ""
|
checkresult $? 23 ""
|
||||||
fi
|
fi
|
||||||
fi
|
fi
|
||||||
|
@ -839,7 +839,7 @@ for bmode in "$test8" "$test16" "$test32"; do
|
||||||
|
|
||||||
if [ "$do24" = yes ] ; then
|
if [ "$do24" = yes ] ; then
|
||||||
echo $title24
|
echo $title24
|
||||||
$sim $valgrind ${opt:+$vjs} ./pcre2test -q $setstack $bmode $opt $testdata/testinput24 testtry
|
$sim $valgrind ./pcre2test -q $setstack $bmode $testdata/testinput24 testtry
|
||||||
checkresult $? 24 ""
|
checkresult $? 24 ""
|
||||||
fi
|
fi
|
||||||
|
|
||||||
|
@ -850,7 +850,7 @@ for bmode in "$test8" "$test16" "$test32"; do
|
||||||
if [ $utf -eq 0 ] ; then
|
if [ $utf -eq 0 ] ; then
|
||||||
echo " Skipped because UTF-$bits support is not available"
|
echo " Skipped because UTF-$bits support is not available"
|
||||||
else
|
else
|
||||||
$sim $valgrind ${opt:+$vjs} ./pcre2test -q $setstack $bmode $opt $testdata/testinput25 testtry
|
$sim $valgrind ./pcre2test -q $setstack $bmode $testdata/testinput25 testtry
|
||||||
checkresult $? 25 ""
|
checkresult $? 25 ""
|
||||||
fi
|
fi
|
||||||
fi
|
fi
|
||||||
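The three RunTest hunks above drop `${opt:+$vjs}` and `$opt` from the pcre2test command lines for tests 23, 24 and 25. For readers unfamiliar with the `${name:+word}` form, here is a small sketch of how it behaves. The value given to `vjs` and the function name `show_expansion` are placeholders invented for the example, not what RunTest actually uses.

```shell
# Placeholder value; the real $vjs is set elsewhere in RunTest.
vjs="--extra-flag"

show_expansion() {
  opt=$1
  # ${opt:+$vjs} expands to $vjs only when opt is set and non-empty;
  # otherwise it expands to nothing.
  echo "[${opt:+$vjs}]"
}

show_expansion ""      # prints []
show_expansion "-jit"  # prints [--extra-flag]
```

With the expansion and `$opt` removed, the command lines for these three tests no longer vary with the value of `$opt`.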
|
|
12
configure.ac
|
@ -10,17 +10,17 @@ dnl be defined as -RC2, for example. For real releases, it should be empty.
|
||||||
|
|
||||||
m4_define(pcre2_major, [10])
|
m4_define(pcre2_major, [10])
|
||||||
m4_define(pcre2_minor, [30])
|
m4_define(pcre2_minor, [30])
|
||||||
m4_define(pcre2_prerelease, [-DEV])
|
m4_define(pcre2_prerelease, [-RC1])
|
||||||
m4_define(pcre2_date, [2017-03-05])
|
m4_define(pcre2_date, [2017-07-18])
|
||||||
|
|
||||||
# NOTE: The CMakeLists.txt file searches for the above variables in the first
|
# NOTE: The CMakeLists.txt file searches for the above variables in the first
|
||||||
# 50 lines of this file. Please update that if the variables above are moved.
|
# 50 lines of this file. Please update that if the variables above are moved.
|
||||||
|
|
||||||
# Libtool shared library interface versions (current:revision:age)
|
# Libtool shared library interface versions (current:revision:age)
|
||||||
m4_define(libpcre2_8_version, [5:0:5])
|
m4_define(libpcre2_8_version, [6:0:6])
|
||||||
m4_define(libpcre2_16_version, [5:0:5])
|
m4_define(libpcre2_16_version, [6:0:6])
|
||||||
m4_define(libpcre2_32_version, [5:0:5])
|
m4_define(libpcre2_32_version, [6:0:6])
|
||||||
m4_define(libpcre2_posix_version, [1:1:0])
|
m4_define(libpcre2_posix_version, [2:0:0])
|
||||||
|
|
||||||
AC_PREREQ(2.57)
|
AC_PREREQ(2.57)
|
||||||
AC_INIT(PCRE2, pcre2_major.pcre2_minor[]pcre2_prerelease, , pcre2)
|
AC_INIT(PCRE2, pcre2_major.pcre2_minor[]pcre2_prerelease, , pcre2)
|
||||||
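The libtool triples changed above are written as current:revision:age. As a sketch of what the new values mean in practice, the helper below derives the numeric suffix that libtool typically gives the shared library file on Linux/ELF systems (major = current - age, followed by age and revision). The function name `libtool_suffix` is hypothetical, and other platforms map the triple differently.

```shell
# Hypothetical helper: map a libtool "current:revision:age" triple to the
# "major.age.revision" suffix used on Linux/ELF systems.
libtool_suffix() {
  # Split the triple on colons into three variables.
  IFS=: read -r current revision age <<EOF
$1
EOF
  echo "$((current - age)).$age.$revision"
}

libtool_suffix 5:0:5   # old value for libpcre2-8/16/32, prints 0.5.0
libtool_suffix 6:0:6   # new value for libpcre2-8/16/32, prints 0.6.0
libtool_suffix 1:1:0   # old value for libpcre2-posix, prints 1.0.1
libtool_suffix 2:0:0   # new value for libpcre2-posix, prints 2.0.0
```

Under this convention the 8-bit, 16-bit and 32-bit libraries keep major number 0, because current and age rise together, while the POSIX wrapper's major number moves from 1 to 2.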
|
|
|
@ -198,13 +198,14 @@ library. They are also documented in the pcre2build man page.
|
||||||
or starting a pattern with (*UCP).
|
or starting a pattern with (*UCP).
|
||||||
|
|
||||||
. You can build PCRE2 to recognize either CR or LF or the sequence CRLF, or any
|
. You can build PCRE2 to recognize either CR or LF or the sequence CRLF, or any
|
||||||
of the preceding, or any of the Unicode newline sequences, as indicating the
|
of the preceding, or any of the Unicode newline sequences, or the NUL (zero)
|
||||||
end of a line. Whatever you specify at build time is the default; the caller
|
character as indicating the end of a line. Whatever you specify at build time
|
||||||
of PCRE2 can change the selection at run time. The default newline indicator
|
is the default; the caller of PCRE2 can change the selection at run time. The
|
||||||
is a single LF character (the Unix standard). You can specify the default
|
default newline indicator is a single LF character (the Unix standard). You
|
||||||
newline indicator by adding --enable-newline-is-cr, --enable-newline-is-lf,
|
can specify the default newline indicator by adding --enable-newline-is-cr,
|
||||||
--enable-newline-is-crlf, --enable-newline-is-anycrlf, or
|
--enable-newline-is-lf, --enable-newline-is-crlf,
|
||||||
--enable-newline-is-any to the "configure" command, respectively.
|
--enable-newline-is-anycrlf, --enable-newline-is-any, or
|
||||||
|
--enable-newline-is-nul to the "configure" command, respectively.
|
||||||
|
|
||||||
. By default, the sequence \R in a pattern matches any Unicode line ending
|
. By default, the sequence \R in a pattern matches any Unicode line ending
|
||||||
sequence. This is independent of the option specifying what PCRE2 considers
|
sequence. This is independent of the option specifying what PCRE2 considers
|
||||||
|
@ -227,15 +228,15 @@ library. They are also documented in the pcre2build man page.
|
||||||
--with-parens-nest-limit=500
|
--with-parens-nest-limit=500
|
||||||
|
|
||||||
. PCRE2 has a counter that can be set to limit the amount of computing resource
|
. PCRE2 has a counter that can be set to limit the amount of computing resource
|
||||||
it uses when matching a pattern with the Perl-compatible matching function.
|
it uses when matching a pattern. If the limit is exceeded during a match, the
|
||||||
If the limit is exceeded during a match, the match fails. The default is ten
|
match fails. The default is ten million. You can change the default by
|
||||||
million. You can change the default by setting, for example,
|
setting, for example,
|
||||||
|
|
||||||
--with-match-limit=500000
|
--with-match-limit=500000
|
||||||
|
|
||||||
on the "configure" command. This is just the default; individual calls to
|
on the "configure" command. This is just the default; individual calls to
|
||||||
pcre2_match() can supply their own value. There is more discussion in the
|
pcre2_match() or pcre2_dfa_match() can supply their own value. There is more
|
||||||
pcre2api man page (search for pcre2_set_match_limit).
|
discussion in the pcre2api man page (search for pcre2_set_match_limit).
|
||||||
|
|
||||||
. There is a separate counter that limits the depth of nested backtracking
|
. There is a separate counter that limits the depth of nested backtracking
|
||||||
during a matching process, which indirectly limits the amount of heap memory
|
during a matching process, which indirectly limits the amount of heap memory
|
||||||
|
@ -659,9 +660,10 @@ with the perltest.sh script, and test 5 checking PCRE2-specific things.
|
||||||
Tests 6 and 7 check the pcre2_dfa_match() alternative matching function, in
|
Tests 6 and 7 check the pcre2_dfa_match() alternative matching function, in
|
||||||
non-UTF mode and UTF-mode with Unicode property support, respectively.
|
non-UTF mode and UTF-mode with Unicode property support, respectively.
|
||||||
|
|
||||||
Test 8 checks some internal offsets and code size features; it is run only when
|
Test 8 checks some internal offsets and code size features, but it is run only
|
||||||
the default "link size" of 2 is set (in other cases the sizes change) and when
|
when Unicode support is enabled. The output is different in 8-bit, 16-bit, and
|
||||||
Unicode support is enabled.
|
32-bit modes and for different link sizes, so there are different output files
|
||||||
|
for each mode and link size.
|
||||||
|
|
||||||
Tests 9 and 10 are run only in 8-bit mode, and tests 11 and 12 are run only in
|
Tests 9 and 10 are run only in 8-bit mode, and tests 11 and 12 are run only in
|
||||||
16-bit and 32-bit modes. These are tests that generate different output in
|
16-bit and 32-bit modes. These are tests that generate different output in
|
||||||
|
@ -671,7 +673,7 @@ Test 13 checks the handling of non-UTF characters greater than 255 by
|
||||||
pcre2_dfa_match() in 16-bit and 32-bit modes.
|
pcre2_dfa_match() in 16-bit and 32-bit modes.
|
||||||
|
|
||||||
Test 14 contains some special UTF and UCP tests that give different output for
|
Test 14 contains some special UTF and UCP tests that give different output for
|
||||||
the different widths.
|
different code unit widths.
|
||||||
|
|
||||||
Test 15 contains a number of tests that must not be run with JIT. They check,
|
Test 15 contains a number of tests that must not be run with JIT. They check,
|
||||||
among other non-JIT things, the match-limiting features of the interpretive
|
among other non-JIT things, the match-limiting features of the interpretive
|
||||||
|
@ -692,6 +694,9 @@ patterns to a file, and then reloading and checking them.
|
||||||
Tests 21 and 22 test \C support when the use of \C is not locked out, without
|
Tests 21 and 22 test \C support when the use of \C is not locked out, without
|
||||||
and with UTF support, respectively. Test 23 tests \C when it is locked out.
|
and with UTF support, respectively. Test 23 tests \C when it is locked out.
|
||||||
|
|
||||||
|
Tests 24 and 25 test the experimental pattern conversion functions, without and
|
||||||
|
with UTF support, respectively.
|
||||||
|
|
||||||
|
|
||||||
Character tables
|
Character tables
|
||||||
----------------
|
----------------
|
||||||
|
@ -710,7 +715,7 @@ specified for ./configure, a different version of pcre2_chartables.c is built
|
||||||
by the program dftables (compiled from dftables.c), which uses the ANSI C
|
by the program dftables (compiled from dftables.c), which uses the ANSI C
|
||||||
character handling functions such as isalnum(), isalpha(), isupper(),
|
character handling functions such as isalnum(), isalpha(), isupper(),
|
||||||
islower(), etc. to build the table sources. This means that the default C
|
islower(), etc. to build the table sources. This means that the default C
|
||||||
locale which is set for your system will control the contents of these default
|
locale that is set for your system will control the contents of these default
|
||||||
tables. You can change the default tables by editing pcre2_chartables.c and
|
tables. You can change the default tables by editing pcre2_chartables.c and
|
||||||
then re-building PCRE2. If you do this, you should take care to ensure that the
|
then re-building PCRE2. If you do this, you should take care to ensure that the
|
||||||
file does not get automatically re-generated. The best way to do this is to
|
file does not get automatically re-generated. The best way to do this is to
|
||||||
|
@ -765,6 +770,7 @@ The distribution should contain the files listed below.
|
||||||
src/pcre2_compile.c )
|
src/pcre2_compile.c )
|
||||||
src/pcre2_config.c )
|
src/pcre2_config.c )
|
||||||
src/pcre2_context.c )
|
src/pcre2_context.c )
|
||||||
|
src/pcre2_convert.c )
|
||||||
src/pcre2_dfa_match.c )
|
src/pcre2_dfa_match.c )
|
||||||
src/pcre2_error.c )
|
src/pcre2_error.c )
|
||||||
src/pcre2_find_bracket.c )
|
src/pcre2_find_bracket.c )
|
||||||
|
@ -804,7 +810,6 @@ The distribution should contain the files listed below.
|
||||||
src/pcre2demo.c simple demonstration of coding calls to PCRE2
|
src/pcre2demo.c simple demonstration of coding calls to PCRE2
|
||||||
src/pcre2grep.c source of a grep utility that uses PCRE2
|
src/pcre2grep.c source of a grep utility that uses PCRE2
|
||||||
src/pcre2test.c comprehensive test program
|
src/pcre2test.c comprehensive test program
|
||||||
src/pcre2_printint.c part of pcre2test
|
|
||||||
src/pcre2_jit_test.c JIT test program
|
src/pcre2_jit_test.c JIT test program
|
||||||
|
|
||||||
(C) Auxiliary files:
|
(C) Auxiliary files:
|
||||||
|
@ -869,12 +874,12 @@ The distribution should contain the files listed below.
|
||||||
|
|
||||||
(E) Auxiliary files for building PCRE2 "by hand"
|
(E) Auxiliary files for building PCRE2 "by hand"
|
||||||
|
|
||||||
pcre2.h.generic ) a version of the public PCRE2 header file
|
src/pcre2.h.generic ) a version of the public PCRE2 header file
|
||||||
) for use in non-"configure" environments
|
) for use in non-"configure" environments
|
||||||
config.h.generic ) a version of config.h for use in non-"configure"
|
src/config.h.generic ) a version of config.h for use in non-"configure"
|
||||||
) environments
|
) environments
|
||||||
|
|
||||||
Philip Hazel
|
Philip Hazel
|
||||||
Email local part: ph10
|
Email local part: ph10
|
||||||
Email domain: cam.ac.uk
|
Email domain: cam.ac.uk
|
||||||
Last updated: 17 June 2017
|
Last updated: 18 July 2017
|
||||||
|
|
|
@ -87,10 +87,10 @@ Options that specify values have names that start with --with.
|
||||||
<br><a name="SEC3" href="#TOC1">BUILDING 8-BIT, 16-BIT AND 32-BIT LIBRARIES</a><br>
|
<br><a name="SEC3" href="#TOC1">BUILDING 8-BIT, 16-BIT AND 32-BIT LIBRARIES</a><br>
|
||||||
<P>
|
<P>
|
||||||
By default, a library called <b>libpcre2-8</b> is built, containing functions
|
By default, a library called <b>libpcre2-8</b> is built, containing functions
|
||||||
that take string arguments contained in vectors of bytes, interpreted either as
|
that take string arguments contained in arrays of bytes, interpreted either as
|
||||||
single-byte characters, or UTF-8 strings. You can also build two other
|
single-byte characters, or UTF-8 strings. You can also build two other
|
||||||
libraries, called <b>libpcre2-16</b> and <b>libpcre2-32</b>, which process
|
libraries, called <b>libpcre2-16</b> and <b>libpcre2-32</b>, which process
|
||||||
strings that are contained in vectors of 16-bit and 32-bit code units,
|
strings that are contained in arrays of 16-bit and 32-bit code units,
|
||||||
respectively. These can be interpreted either as single-unit characters or
|
respectively. These can be interpreted either as single-unit characters or
|
||||||
UTF-16/UTF-32 strings. To build these additional libraries, add one or both of
|
UTF-16/UTF-32 strings. To build these additional libraries, add one or both of
|
||||||
the following to the <b>configure</b> command:
|
the following to the <b>configure</b> command:
|
||||||
|
@ -208,19 +208,23 @@ to the <b>configure</b> command. There is a fourth option, specified by
|
||||||
--enable-newline-is-anycrlf
|
--enable-newline-is-anycrlf
|
||||||
</pre>
|
</pre>
|
||||||
which causes PCRE2 to recognize any of the three sequences CR, LF, or CRLF as
|
which causes PCRE2 to recognize any of the three sequences CR, LF, or CRLF as
|
||||||
indicating a line ending. Finally, a fifth option, specified by
|
indicating a line ending. A fifth option, specified by
|
||||||
<pre>
|
<pre>
|
||||||
--enable-newline-is-any
|
--enable-newline-is-any
|
||||||
</pre>
|
</pre>
|
||||||
causes PCRE2 to recognize any Unicode newline sequence. The Unicode newline
|
causes PCRE2 to recognize any Unicode newline sequence. The Unicode newline
|
||||||
sequences are the three just mentioned, plus the single characters VT (vertical
|
sequences are the three just mentioned, plus the single characters VT (vertical
|
||||||
tab, U+000B), FF (form feed, U+000C), NEL (next line, U+0085), LS (line
|
tab, U+000B), FF (form feed, U+000C), NEL (next line, U+0085), LS (line
|
||||||
separator, U+2028), and PS (paragraph separator, U+2029).
|
separator, U+2028), and PS (paragraph separator, U+2029). The final option is
|
||||||
|
<pre>
|
||||||
|
--enable-newline-is-nul
|
||||||
|
</pre>
|
||||||
|
which causes NUL (binary zero) to be set as the default line-ending character.
|
||||||
</P>
|
</P>
|
||||||
<P>
|
<P>
|
||||||
Whatever default line ending convention is selected when PCRE2 is built can be
|
Whatever default line ending convention is selected when PCRE2 is built can be
|
||||||
overridden by applications that use the library. At build time it is
|
overridden by applications that use the library. At build time it is
|
||||||
conventional to use the standard for your operating system.
|
recommended to use the standard for your operating system.
|
||||||
</P>
|
</P>
|
||||||
<br><a name="SEC9" href="#TOC1">WHAT \R MATCHES</a><br>
|
<br><a name="SEC9" href="#TOC1">WHAT \R MATCHES</a><br>
|
||||||
<P>
|
<P>
|
||||||
|
@ -301,7 +305,9 @@ because the size of each backtracking "frame" depends on the number of
|
||||||
capturing parentheses in a pattern, the amount of heap that is used before the
|
capturing parentheses in a pattern, the amount of heap that is used before the
|
||||||
limit is reached varies from pattern to pattern. This limit was more useful in
|
limit is reached varies from pattern to pattern. This limit was more useful in
|
||||||
versions before 10.30, where function recursion was used for backtracking.
|
versions before 10.30, where function recursion was used for backtracking.
|
||||||
However, as well as applying to <b>pcre2_match()</b>, this limit also controls
|
</P>
|
||||||
|
<P>
|
||||||
|
As well as applying to <b>pcre2_match()</b>, the depth limit also controls
|
||||||
the depth of recursive function calls in <b>pcre2_dfa_match()</b>. These are
|
the depth of recursive function calls in <b>pcre2_dfa_match()</b>. These are
|
||||||
used for lookaround assertions, atomic groups, and recursion within patterns.
|
used for lookaround assertions, atomic groups, and recursion within patterns.
|
||||||
The limit does not apply to JIT matching.
|
The limit does not apply to JIT matching.
|
||||||
|
@ -559,7 +565,7 @@ Cambridge, England.
|
||||||
</P>
|
</P>
|
||||||
<br><a name="SEC25" href="#TOC1">REVISION</a><br>
|
<br><a name="SEC25" href="#TOC1">REVISION</a><br>
|
||||||
<P>
|
<P>
|
||||||
Last updated: 17 June 2017
|
Last updated: 18 July 2017
|
||||||
<br>
|
<br>
|
||||||
Copyright © 1997-2017 University of Cambridge.
|
Copyright © 1997-2017 University of Cambridge.
|
||||||
<br>
|
<br>
|
||||||
|
|
102
doc/pcre2.txt
|
@ -3487,10 +3487,10 @@ PCRE2 BUILD-TIME OPTIONS
|
||||||
BUILDING 8-BIT, 16-BIT AND 32-BIT LIBRARIES
|
BUILDING 8-BIT, 16-BIT AND 32-BIT LIBRARIES
|
||||||
|
|
||||||
By default, a library called libpcre2-8 is built, containing functions
|
By default, a library called libpcre2-8 is built, containing functions
|
||||||
that take string arguments contained in vectors of bytes, interpreted
|
that take string arguments contained in arrays of bytes, interpreted
|
||||||
either as single-byte characters, or UTF-8 strings. You can also build
|
either as single-byte characters, or UTF-8 strings. You can also build
|
||||||
two other libraries, called libpcre2-16 and libpcre2-32, which process
|
two other libraries, called libpcre2-16 and libpcre2-32, which process
|
||||||
strings that are contained in vectors of 16-bit and 32-bit code units,
|
strings that are contained in arrays of 16-bit and 32-bit code units,
|
||||||
respectively. These can be interpreted either as single-unit characters
|
respectively. These can be interpreted either as single-unit characters
|
||||||
or UTF-16/UTF-32 strings. To build these additional libraries, add one
|
or UTF-16/UTF-32 strings. To build these additional libraries, add one
|
||||||
or both of the following to the configure command:
|
or both of the following to the configure command:
|
||||||
|
@ -3609,7 +3609,7 @@ NEWLINE RECOGNITION
|
||||||
--enable-newline-is-anycrlf
|
--enable-newline-is-anycrlf
|
||||||
|
|
||||||
which causes PCRE2 to recognize any of the three sequences CR, LF, or
|
which causes PCRE2 to recognize any of the three sequences CR, LF, or
|
||||||
CRLF as indicating a line ending. Finally, a fifth option, specified by
|
CRLF as indicating a line ending. A fifth option, specified by
|
||||||
|
|
||||||
--enable-newline-is-any
|
--enable-newline-is-any
|
||||||
|
|
||||||
|
@ -3617,97 +3617,103 @@ NEWLINE RECOGNITION
|
||||||
newline sequences are the three just mentioned, plus the single charac-
|
newline sequences are the three just mentioned, plus the single charac-
|
||||||
ters VT (vertical tab, U+000B), FF (form feed, U+000C), NEL (next line,
|
ters VT (vertical tab, U+000B), FF (form feed, U+000C), NEL (next line,
|
||||||
U+0085), LS (line separator, U+2028), and PS (paragraph separator,
|
U+0085), LS (line separator, U+2028), and PS (paragraph separator,
|
||||||
U+2029).
|
U+2029). The final option is
|
||||||
|
|
||||||
|
--enable-newline-is-nul
|
||||||
|
|
||||||
|
which causes NUL (binary zero) to be set as the default line-ending
|
||||||
|
character.
|
||||||
|
|
||||||
Whatever default line ending convention is selected when PCRE2 is built
|
Whatever default line ending convention is selected when PCRE2 is built
|
||||||
can be overridden by applications that use the library. At build time
|
can be overridden by applications that use the library. At build time
|
||||||
it is conventional to use the standard for your operating system.
|
it is recommended to use the standard for your operating system.
|
||||||
|
|
||||||
|
|
||||||
WHAT \R MATCHES
|
WHAT \R MATCHES
|
||||||
|
|
||||||
By default, the sequence \R in a pattern matches any Unicode newline
|
By default, the sequence \R in a pattern matches any Unicode newline
|
||||||
sequence, independently of what has been selected as the line ending
|
sequence, independently of what has been selected as the line ending
|
||||||
sequence. If you specify
|
sequence. If you specify
|
||||||
|
|
||||||
--enable-bsr-anycrlf
|
--enable-bsr-anycrlf
|
||||||
|
|
||||||
the default is changed so that \R matches only CR, LF, or CRLF. What-
|
the default is changed so that \R matches only CR, LF, or CRLF. What-
|
||||||
ever is selected when PCRE2 is built can be overridden by applications
|
ever is selected when PCRE2 is built can be overridden by applications
|
||||||
that use the library.
|
that use the library.
|
||||||
|
|
||||||
|
|
||||||
HANDLING VERY LARGE PATTERNS
|
HANDLING VERY LARGE PATTERNS
|
||||||
|
|
||||||
Within a compiled pattern, offset values are used to point from one
|
Within a compiled pattern, offset values are used to point from one
|
||||||
part to another (for example, from an opening parenthesis to an alter-
|
part to another (for example, from an opening parenthesis to an alter-
|
||||||
nation metacharacter). By default, in the 8-bit and 16-bit libraries,
|
nation metacharacter). By default, in the 8-bit and 16-bit libraries,
|
||||||
two-byte values are used for these offsets, leading to a maximum size
|
two-byte values are used for these offsets, leading to a maximum size
|
||||||
for a compiled pattern of around 64K code units. This is sufficient to
|
for a compiled pattern of around 64K code units. This is sufficient to
|
||||||
handle all but the most gigantic patterns. Nevertheless, some people do
|
handle all but the most gigantic patterns. Nevertheless, some people do
|
||||||
want to process truly enormous patterns, so it is possible to compile
|
want to process truly enormous patterns, so it is possible to compile
|
||||||
PCRE2 to use three-byte or four-byte offsets by adding a setting such
|
PCRE2 to use three-byte or four-byte offsets by adding a setting such
|
||||||
as
|
as
|
||||||
|
|
||||||
--with-link-size=3
|
--with-link-size=3
|
||||||
|
|
||||||
to the configure command. The value given must be 2, 3, or 4. For the
|
to the configure command. The value given must be 2, 3, or 4. For the
|
||||||
16-bit library, a value of 3 is rounded up to 4. In these libraries,
|
16-bit library, a value of 3 is rounded up to 4. In these libraries,
|
||||||
using longer offsets slows down the operation of PCRE2 because it has
|
using longer offsets slows down the operation of PCRE2 because it has
|
||||||
to load additional data when handling them. For the 32-bit library the
|
to load additional data when handling them. For the 32-bit library the
|
||||||
value is always 4 and cannot be overridden; the value of --with-link-
|
value is always 4 and cannot be overridden; the value of --with-link-
|
||||||
size is ignored.
|
size is ignored.
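The link-size rules in the paragraph above can be summarised in a few lines of C. This is only a sketch of the documented behaviour: `effective_link_size()` is a hypothetical helper written for illustration, not a function in PCRE2 or its configure script, and the choice to reject out-of-range values before looking at the library width is an assumption.

```c
/* Hypothetical helper, not part of PCRE2. Returns the link size in bytes
   that a build would end up using for a library of the given code unit
   width, or 0 if the requested value is not 2, 3, or 4. */
int effective_link_size(int requested, int code_unit_bits)
{
  if (requested < 2 || requested > 4)
    return 0;                       /* the value given must be 2, 3, or 4 */
  if (code_unit_bits == 32)
    return 4;                       /* always 4; the request is ignored */
  if (code_unit_bits == 16 && requested == 3)
    return 4;                       /* 3 is rounded up to 4 */
  return requested;                 /* 8-bit library, or an even request */
}
```

For example, `effective_link_size(3, 16)` yields 4, while the same request for the 8-bit library stays at 3.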
|
||||||
|
|
||||||
|
|
||||||
LIMITING PCRE2 RESOURCE USAGE
|
LIMITING PCRE2 RESOURCE USAGE
|
||||||
|
|
||||||
The pcre2_match() function increments a counter each time it goes round
|
The pcre2_match() function increments a counter each time it goes round
|
||||||
its main loop. Putting a limit on this counter controls the amount of
|
its main loop. Putting a limit on this counter controls the amount of
|
||||||
computing resource used by a single call to pcre2_match(). The limit
|
computing resource used by a single call to pcre2_match(). The limit
|
||||||
can be changed at run time, as described in the pcre2api documentation.
|
can be changed at run time, as described in the pcre2api documentation.
|
||||||
The default is 10 million, but this can be changed by adding a setting
|
The default is 10 million, but this can be changed by adding a setting
|
||||||
such as
|
such as
|
||||||
|
|
||||||
--with-match-limit=500000
|
--with-match-limit=500000
|
||||||
|
|
||||||
to the configure command. This setting also applies to the
|
to the configure command. This setting also applies to the
|
||||||
pcre2_dfa_match() matching function, and to JIT matching (though the
|
pcre2_dfa_match() matching function, and to JIT matching (though the
|
||||||
counting is done differently).
|
counting is done differently).
|
||||||
|
|
||||||
The pcre2_match() function starts out using a 20K vector on the system
|
The pcre2_match() function starts out using a 20K vector on the system
|
||||||
stack to record backtracking points. The more nested backtracking
|
stack to record backtracking points. The more nested backtracking
|
||||||
points there are (that is, the deeper the search tree), the more memory
|
points there are (that is, the deeper the search tree), the more memory
|
||||||
is needed. If the initial vector is not large enough, heap memory is
|
is needed. If the initial vector is not large enough, heap memory is
|
||||||
used, up to a certain limit, which is specified in kilobytes. The limit
|
used, up to a certain limit, which is specified in kilobytes. The limit
|
||||||
can be changed at run time, as described in the pcre2api documentation.
|
can be changed at run time, as described in the pcre2api documentation.
|
||||||
The default limit (in effect unlimited) is 20 million. You can change
|
The default limit (in effect unlimited) is 20 million. You can change
|
||||||
this by a setting such as
|
this by a setting such as
|
||||||
|
|
||||||
--with-heap-limit=500
|
--with-heap-limit=500
|
||||||
|
|
||||||
which limits the amount of heap to 500 kilobytes. This limit applies
|
which limits the amount of heap to 500 kilobytes. This limit applies
|
||||||
only to interpretive matching in pcre2_match(). It does not apply when
|
only to interpretive matching in pcre2_match(). It does not apply when
|
||||||
JIT (which has its own memory arrangements) is used, nor does it apply
|
JIT (which has its own memory arrangements) is used, nor does it apply
|
||||||
to pcre2_dfa_match().
|
to pcre2_dfa_match().
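Because the heap limit is expressed in kilobytes rather than bytes, a comparison against a byte count needs a unit conversion. The sketch below models that comparison only; `within_heap_limit()` is a hypothetical name, and the rounding rule is an assumption rather than a description of what pcre2_match() does internally.

```c
#include <stddef.h>

/* Hypothetical check, not PCRE2 code: non-zero if total_bytes of
   backtracking memory fits within a limit given in kilobytes. */
int within_heap_limit(size_t total_bytes, unsigned long limit_kb)
{
  /* Convert the request to kilobytes, rounding up, so that the
     comparison cannot overflow for large limits. */
  size_t needed_kb = total_bytes / 1024 + (total_bytes % 1024 != 0);
  return needed_kb <= limit_kb;
}
```

With a limit of 500, as in the --with-heap-limit=500 example, a request for exactly 500 * 1024 bytes passes and one byte more does not.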
|
||||||
|
|
||||||
You can also explicitly limit the depth of nested backtracking in the
|
You can also explicitly limit the depth of nested backtracking in the
|
||||||
pcre2_match() interpreter. This limit defaults to the value that is set
|
pcre2_match() interpreter. This limit defaults to the value that is set
|
||||||
for --with-match-limit. You can set a lower default limit by adding,
|
for --with-match-limit. You can set a lower default limit by adding,
|
||||||
for example,
|
for example,
|
||||||
|
|
||||||
--with-match-limit-depth=10000
|
--with-match-limit-depth=10000
|
||||||
|
|
||||||
to the configure command. This value can be overridden at run time.
|
to the configure command. This value can be overridden at run time.
|
||||||
This depth limit indirectly limits the amount of heap memory that is
|
This depth limit indirectly limits the amount of heap memory that is
|
||||||
used, but because the size of each backtracking "frame" depends on the
|
used, but because the size of each backtracking "frame" depends on the
|
||||||
number of capturing parentheses in a pattern, the amount of heap that
|
number of capturing parentheses in a pattern, the amount of heap that
|
||||||
is used before the limit is reached varies from pattern to pattern.
|
is used before the limit is reached varies from pattern to pattern.
|
||||||
This limit was more useful in versions before 10.30, where function
|
This limit was more useful in versions before 10.30, where function
|
||||||
recursion was used for backtracking. However, as well as applying to
|
recursion was used for backtracking.
|
||||||
pcre2_match(), this limit also controls the depth of recursive function
|
|
||||||
calls in pcre2_dfa_match(). These are used for lookaround assertions,
|
As well as applying to pcre2_match(), the depth limit also controls the
|
||||||
atomic groups, and recursion within patterns. The limit does not apply
|
depth of recursive function calls in pcre2_dfa_match(). These are used
|
||||||
to JIT matching.
|
for lookaround assertions, atomic groups, and recursion within pat-
|
||||||
|
terns. The limit does not apply to JIT matching.
|
||||||
|
|
||||||
|
|
||||||
CREATING CHARACTER TABLES AT BUILD TIME
|
CREATING CHARACTER TABLES AT BUILD TIME
|
||||||
|
@ -3969,7 +3975,7 @@ AUTHOR
|
||||||
|
|
||||||
REVISION
|
REVISION
|
||||||
|
|
||||||
Last updated: 17 June 2017
|
Last updated: 18 July 2017
|
||||||
Copyright (c) 1997-2017 University of Cambridge.
|
Copyright (c) 1997-2017 University of Cambridge.
|
||||||
------------------------------------------------------------------------------
|
------------------------------------------------------------------------------
|
||||||
|
|
||||||
|
|
|
@ -1,4 +1,4 @@
|
||||||
.TH PCRE2BUILD 3 "17 June 2017" "PCRE2 10.30"
|
.TH PCRE2BUILD 3 "18 July 2017" "PCRE2 10.30"
|
||||||
.SH NAME
|
.SH NAME
|
||||||
PCRE2 - Perl-compatible regular expressions (revised API)
|
PCRE2 - Perl-compatible regular expressions (revised API)
|
||||||
.
|
.
|
||||||
|
@ -66,10 +66,10 @@ Options that specify values have names that start with --with.
|
||||||
.rs
|
.rs
|
||||||
.sp
|
.sp
|
||||||
By default, a library called \fBlibpcre2-8\fP is built, containing functions
|
By default, a library called \fBlibpcre2-8\fP is built, containing functions
|
||||||
that take string arguments contained in vectors of bytes, interpreted either as
|
that take string arguments contained in arrays of bytes, interpreted either as
|
||||||
single-byte characters, or UTF-8 strings. You can also build two other
|
single-byte characters, or UTF-8 strings. You can also build two other
|
||||||
libraries, called \fBlibpcre2-16\fP and \fBlibpcre2-32\fP, which process
|
libraries, called \fBlibpcre2-16\fP and \fBlibpcre2-32\fP, which process
|
||||||
strings that are contained in vectors of 16-bit and 32-bit code units,
|
strings that are contained in arrays of 16-bit and 32-bit code units,
|
||||||
respectively. These can be interpreted either as single-unit characters or
|
respectively. These can be interpreted either as single-unit characters or
|
||||||
UTF-16/UTF-32 strings. To build these additional libraries, add one or both of
|
UTF-16/UTF-32 strings. To build these additional libraries, add one or both of
|
||||||
the following to the \fBconfigure\fP command:
|
the following to the \fBconfigure\fP command:
|
||||||
|
@ -197,18 +197,22 @@ to the \fBconfigure\fP command. There is a fourth option, specified by
|
||||||
--enable-newline-is-anycrlf
|
--enable-newline-is-anycrlf
|
||||||
.sp
|
.sp
|
||||||
which causes PCRE2 to recognize any of the three sequences CR, LF, or CRLF as
|
which causes PCRE2 to recognize any of the three sequences CR, LF, or CRLF as
|
||||||
indicating a line ending. Finally, a fifth option, specified by
|
indicating a line ending. A fifth option, specified by
|
||||||
.sp
|
.sp
|
||||||
--enable-newline-is-any
|
--enable-newline-is-any
|
||||||
.sp
|
.sp
|
||||||
causes PCRE2 to recognize any Unicode newline sequence. The Unicode newline
|
causes PCRE2 to recognize any Unicode newline sequence. The Unicode newline
|
||||||
sequences are the three just mentioned, plus the single characters VT (vertical
|
sequences are the three just mentioned, plus the single characters VT (vertical
|
||||||
tab, U+000B), FF (form feed, U+000C), NEL (next line, U+0085), LS (line
|
tab, U+000B), FF (form feed, U+000C), NEL (next line, U+0085), LS (line
|
||||||
separator, U+2028), and PS (paragraph separator, U+2029).
|
separator, U+2028), and PS (paragraph separator, U+2029). The final option is
|
||||||
|
.sp
|
||||||
|
--enable-newline-is-nul
|
||||||
|
.sp
|
||||||
|
which causes NUL (binary zero) to be set as the default line-ending character.
|
||||||
.P
|
.P
|
||||||
Whatever default line ending convention is selected when PCRE2 is built can be
|
Whatever default line ending convention is selected when PCRE2 is built can be
|
||||||
overridden by applications that use the library. At build time it is
|
overridden by applications that use the library. At build time it is
|
||||||
conventional to use the standard for your operating system.
|
recommended to use the standard for your operating system.
|
||||||
.
|
.
|
||||||
.
|
.
|
||||||
.SH "WHAT \eR MATCHES"
|
.SH "WHAT \eR MATCHES"
|
||||||
|
@ -297,7 +301,8 @@ because the size of each backtracking "frame" depends on the number of
|
||||||
capturing parentheses in a pattern, the amount of heap that is used before the
|
capturing parentheses in a pattern, the amount of heap that is used before the
|
||||||
limit is reached varies from pattern to pattern. This limit was more useful in
|
limit is reached varies from pattern to pattern. This limit was more useful in
|
||||||
versions before 10.30, where function recursion was used for backtracking.
|
versions before 10.30, where function recursion was used for backtracking.
|
||||||
However, as well as applying to \fBpcre2_match()\fP, this limit also controls
|
.P
|
||||||
|
As well as applying to \fBpcre2_match()\fP, the depth limit also controls
|
||||||
the depth of recursive function calls in \fBpcre2_dfa_match()\fP. These are
|
the depth of recursive function calls in \fBpcre2_dfa_match()\fP. These are
|
||||||
used for lookaround assertions, atomic groups, and recursion within patterns.
|
used for lookaround assertions, atomic groups, and recursion within patterns.
|
||||||
The limit does not apply to JIT matching.
|
The limit does not apply to JIT matching.
|
||||||
|
@ -577,6 +582,6 @@ Cambridge, England.
|
||||||
.rs
|
.rs
|
||||||
.sp
|
.sp
|
||||||
.nf
|
.nf
|
||||||
Last updated: 17 June 2017
|
Last updated: 18 July 2017
|
||||||
Copyright (c) 1997-2017 University of Cambridge.
|
Copyright (c) 1997-2017 University of Cambridge.
|
||||||
.fi
|
.fi
|
||||||
|
|
|
@ -132,6 +132,12 @@ sure both macros are undefined; an emulation function will then be used. */
|
||||||
/* Define to 1 if you have the <zlib.h> header file. */
|
/* Define to 1 if you have the <zlib.h> header file. */
|
||||||
/* #undef HAVE_ZLIB_H */
|
/* #undef HAVE_ZLIB_H */
|
||||||
|
|
||||||
|
/* This limits the amount of memory that pcre2_match() may use while matching
|
||||||
|
a pattern. The value is in kilobytes. */
|
||||||
|
#ifndef HEAP_LIMIT
|
||||||
|
#define HEAP_LIMIT 20000000
|
||||||
|
#endif
|
||||||
|
|
||||||
/* The value of LINK_SIZE determines the number of bytes used to store links
|
/* The value of LINK_SIZE determines the number of bytes used to store links
|
||||||
as offsets within the compiled regex. The default is 2, which allows for
|
as offsets within the compiled regex. The default is 2, which allows for
|
||||||
compiled patterns up to 64K long. This covers the vast majority of cases.
|
compiled patterns up to 64K long. This covers the vast majority of cases.
|
||||||
|
@ -148,7 +154,7 @@ sure both macros are undefined; an emulation function will then be used. */
|
||||||
#endif
|
#endif
|
||||||
|
|
||||||
/* The value of MATCH_LIMIT determines the default number of times the
|
/* The value of MATCH_LIMIT determines the default number of times the
|
||||||
internal match() function can record a backtrack position during a single
|
pcre2_match() function can record a backtrack position during a single
|
||||||
matching attempt. There is a runtime interface for setting a different
|
matching attempt. There is a runtime interface for setting a different
|
||||||
limit. The limit exists in order to catch runaway regular expressions that
|
limit. The limit exists in order to catch runaway regular expressions that
|
||||||
take for ever to determine that they do not match. The default is set very
|
take for ever to determine that they do not match. The default is set very
|
||||||
|
@ -188,8 +194,8 @@ sure both macros are undefined; an emulation function will then be used. */
|
||||||
|
|
||||||
/* The value of NEWLINE_DEFAULT determines the default newline character
|
/* The value of NEWLINE_DEFAULT determines the default newline character
|
||||||
sequence. PCRE2 client programs can override this by selecting other values
|
sequence. PCRE2 client programs can override this by selecting other values
|
||||||
at run time. The valid values are 1 (CR), 2 (LF), 3 (CRLF), 4 (ANY), and 5
|
at run time. The valid values are 1 (CR), 2 (LF), 3 (CRLF), 4 (ANY), 5
|
||||||
(ANYCRLF). */
|
(ANYCRLF), and 6 (NUL). */
|
||||||
#ifndef NEWLINE_DEFAULT
|
#ifndef NEWLINE_DEFAULT
|
||||||
#define NEWLINE_DEFAULT 2
|
#define NEWLINE_DEFAULT 2
|
||||||
#endif
|
#endif
|
||||||
|
@ -204,7 +210,7 @@ sure both macros are undefined; an emulation function will then be used. */
|
||||||
#define PACKAGE_NAME "PCRE2"
|
#define PACKAGE_NAME "PCRE2"
|
||||||
|
|
||||||
/* Define to the full name and version of this package. */
|
/* Define to the full name and version of this package. */
|
||||||
#define PACKAGE_STRING "PCRE2 10.30-DEV"
|
#define PACKAGE_STRING "PCRE2 10.30-RC1"
|
||||||
|
|
||||||
/* Define to the one symbol short name of this package. */
|
/* Define to the one symbol short name of this package. */
|
||||||
#define PACKAGE_TARNAME "pcre2"
|
#define PACKAGE_TARNAME "pcre2"
|
||||||
|
@ -213,7 +219,7 @@ sure both macros are undefined; an emulation function will then be used. */
|
||||||
#define PACKAGE_URL ""
|
#define PACKAGE_URL ""
|
||||||
|
|
||||||
/* Define to the version of this package. */
|
/* Define to the version of this package. */
|
||||||
#define PACKAGE_VERSION "10.30-DEV"
|
#define PACKAGE_VERSION "10.30-RC1"
|
||||||
|
|
||||||
/* The value of PARENS_NEST_LIMIT specifies the maximum depth of nested
|
/* The value of PARENS_NEST_LIMIT specifies the maximum depth of nested
|
||||||
parentheses (of any kind) in a pattern. This limits the amount of system
|
parentheses (of any kind) in a pattern. This limits the amount of system
|
||||||
|
@ -261,6 +267,11 @@ sure both macros are undefined; an emulation function will then be used. */
|
||||||
your system. */
|
your system. */
|
||||||
/* #undef PTHREAD_CREATE_JOINABLE */
|
/* #undef PTHREAD_CREATE_JOINABLE */
|
||||||
|
|
||||||
|
/* Define to any non-zero number to enable support for SELinux compatible
|
||||||
|
executable memory allocator in JIT. Note that this will have no effect
|
||||||
|
unless SUPPORT_JIT is also defined. */
|
||||||
|
/* #undef SLJIT_PROT_EXECUTABLE_ALLOCATOR */
|
||||||
|
|
||||||
/* Define to 1 if you have the ANSI C header files. */
|
/* Define to 1 if you have the ANSI C header files. */
|
||||||
/* #undef STDC_HEADERS */
|
/* #undef STDC_HEADERS */
|
||||||
|
|
||||||
|
@ -328,7 +339,7 @@ sure both macros are undefined; an emulation function will then be used. */
|
||||||
#endif
|
#endif
|
||||||
|
|
||||||
/* Version number of package */
|
/* Version number of package */
|
||||||
#define VERSION "10.30-DEV"
|
#define VERSION "10.30-RC1"
|
||||||
|
|
||||||
/* Define to 1 if on MINIX. */
|
/* Define to 1 if on MINIX. */
|
||||||
/* #undef _MINIX */
|
/* #undef _MINIX */
|
||||||
|
|
|
@ -43,8 +43,8 @@ POSSIBILITY OF SUCH DAMAGE.
|
||||||
|
|
||||||
#define PCRE2_MAJOR 10
|
#define PCRE2_MAJOR 10
|
||||||
#define PCRE2_MINOR 30
|
#define PCRE2_MINOR 30
|
||||||
#define PCRE2_PRERELEASE -DEV
|
#define PCRE2_PRERELEASE -RC1
|
||||||
#define PCRE2_DATE 2017-03-05
|
#define PCRE2_DATE 2017-07-18
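The version macros above are bare tokens rather than strings, so turning them into text such as "10.30-RC1" takes two-level stringification. The following is a minimal sketch of that technique; the `DEMO_` macros and both functions are illustrative names and are not taken from PCRE2. It also assumes that a final release would leave PCRE2_PRERELEASE empty.

```c
#include <string.h>

/* Values copied from the hunk above. */
#define PCRE2_MAJOR       10
#define PCRE2_MINOR       30
#define PCRE2_PRERELEASE  -RC1

/* Two levels are needed so that the argument is macro-expanded before
   the # operator turns it into a string literal. */
#define DEMO_STR(x)   #x
#define DEMO_XSTR(x)  DEMO_STR(x)

/* Builds the version text by concatenating adjacent string literals. */
const char *demo_version_string(void)
{
  return DEMO_XSTR(PCRE2_MAJOR) "." DEMO_XSTR(PCRE2_MINOR)
         DEMO_XSTR(PCRE2_PRERELEASE);
}

/* Non-zero when PCRE2_PRERELEASE expands to something non-empty. */
int demo_is_prerelease(void)
{
  return strlen(DEMO_XSTR(PCRE2_PRERELEASE)) > 0;
}
```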
|
||||||
|
|
||||||
/* When an application links to a PCRE DLL in Windows, the symbols that are
|
/* When an application links to a PCRE DLL in Windows, the symbols that are
|
||||||
imported have to be identified as such. When building PCRE2, the appropriate
|
imported have to be identified as such. When building PCRE2, the appropriate
|
||||||
|
|
|
@ -43,8 +43,8 @@ POSSIBILITY OF SUCH DAMAGE.
|
||||||
|
|
||||||
#define PCRE2_MAJOR 10
|
#define PCRE2_MAJOR 10
|
||||||
#define PCRE2_MINOR 30
|
#define PCRE2_MINOR 30
|
||||||
#define PCRE2_PRERELEASE -DEV
|
#define PCRE2_PRERELEASE -RC1
|
||||||
#define PCRE2_DATE 2017-03-05
|
#define PCRE2_DATE 2017-07-18
|
||||||
|
|
||||||
/* When an application links to a PCRE DLL in Windows, the symbols that are
|
/* When an application links to a PCRE DLL in Windows, the symbols that are
|
||||||
imported have to be identified as such. When building PCRE2, the appropriate
|
imported have to be identified as such. When building PCRE2, the appropriate
|
||||||
|
@ -138,6 +138,14 @@ D is inspected during pcre2_dfa_match() execution
|
||||||
#define PCRE2_ALT_VERBNAMES 0x00400000u /* C */
|
#define PCRE2_ALT_VERBNAMES 0x00400000u /* C */
|
||||||
#define PCRE2_USE_OFFSET_LIMIT 0x00800000u /* J M D */
|
#define PCRE2_USE_OFFSET_LIMIT 0x00800000u /* J M D */
|
||||||
#define PCRE2_EXTENDED_MORE 0x01000000u /* C */
|
#define PCRE2_EXTENDED_MORE 0x01000000u /* C */
|
||||||
|
#define PCRE2_LITERAL 0x02000000u /* C */
|
||||||
|
|
||||||
|
/* An additional compile options word is available in the compile context. */
|
||||||
|
|
||||||
|
#define PCRE2_EXTRA_ALLOW_SURROGATE_ESCAPES 0x00000001u /* C */
|
||||||
|
#define PCRE2_EXTRA_BAD_ESCAPE_IS_LITERAL 0x00000002u /* C */
|
||||||
|
#define PCRE2_EXTRA_MATCH_WORD 0x00000004u /* C */
|
||||||
|
#define PCRE2_EXTRA_MATCH_LINE 0x00000008u /* C */
|
||||||
|
|
||||||
/* These are for pcre2_jit_compile(). */
|
/* These are for pcre2_jit_compile(). */
|
||||||
|
|
||||||
|
@ -176,6 +184,16 @@ ignored for pcre2_jit_match(). */
|
||||||
|
|
||||||
#define PCRE2_NO_JIT 0x00002000u
|
#define PCRE2_NO_JIT 0x00002000u
|
||||||
|
|
||||||
|
/* Options for pcre2_pattern_convert(). */
|
||||||
|
|
||||||
|
#define PCRE2_CONVERT_UTF 0x00000001u
|
||||||
|
#define PCRE2_CONVERT_NO_UTF_CHECK 0x00000002u
|
||||||
|
#define PCRE2_CONVERT_POSIX_BASIC 0x00000004u
|
||||||
|
#define PCRE2_CONVERT_POSIX_EXTENDED 0x00000008u
|
||||||
|
#define PCRE2_CONVERT_GLOB 0x00000010u
|
||||||
|
#define PCRE2_CONVERT_GLOB_NO_WILD_SEPARATOR 0x00000030u
|
||||||
|
#define PCRE2_CONVERT_GLOB_NO_STARSTAR 0x00000050u
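The two composite glob options added above have values that include the PCRE2_CONVERT_GLOB bit (0x30 is 0x10 plus 0x20, and 0x50 is 0x10 plus 0x40). The sketch below shows what follows from that for code that tests an option word; the two `demo_` helpers are hypothetical and not part of the PCRE2 API.

```c
#include <stdint.h>

/* Values copied from the hunk above. */
#define PCRE2_CONVERT_GLOB                    0x00000010u
#define PCRE2_CONVERT_GLOB_NO_WILD_SEPARATOR  0x00000030u
#define PCRE2_CONVERT_GLOB_NO_STARSTAR        0x00000050u

/* Hypothetical helper: non-zero if the option word selects glob
   conversion, directly or through one of the composite options. */
int demo_is_glob(uint32_t options)
{
  return (options & PCRE2_CONVERT_GLOB) != 0;
}

/* Hypothetical helper: the option bits that remain once the GLOB bit
   itself has been masked off. */
uint32_t demo_without_glob_bit(uint32_t options)
{
  return options & ~PCRE2_CONVERT_GLOB;
}
```

Setting either composite option therefore makes a plain test for PCRE2_CONVERT_GLOB succeed as well.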
|
||||||
|
|
||||||
/* Newline and \R settings, for use in compile contexts. The newline values
|
/* Newline and \R settings, for use in compile contexts. The newline values
|
||||||
must be kept in step with values set in config.h and both sets must all be
|
must be kept in step with values set in config.h and both sets must all be
|
||||||
greater than zero. */
|
greater than zero. */
|
||||||
|
@ -185,6 +203,7 @@ greater than zero. */
|
||||||
#define PCRE2_NEWLINE_CRLF 3
|
#define PCRE2_NEWLINE_CRLF 3
|
||||||
#define PCRE2_NEWLINE_ANY 4
|
#define PCRE2_NEWLINE_ANY 4
|
||||||
#define PCRE2_NEWLINE_ANYCRLF 5
|
#define PCRE2_NEWLINE_ANYCRLF 5
|
||||||
|
#define PCRE2_NEWLINE_NUL 6
|
||||||
|
|
||||||
#define PCRE2_BSR_UNICODE 1
|
#define PCRE2_BSR_UNICODE 1
|
||||||
#define PCRE2_BSR_ANYCRLF 2
|
#define PCRE2_BSR_ANYCRLF 2
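With PCRE2_NEWLINE_NUL added, the newline settings run from 1 to 6, matching the values listed in the NEWLINE_DEFAULT comment in config.h. A small lookup such as the one below can make a configured value readable in diagnostics. It is a sketch only: `demo_newline_name()` is a hypothetical helper, and the defines are repeated here so that the block stands alone.

```c
#include <stddef.h>

/* Values as given in the header and in the config.h comment. */
#define PCRE2_NEWLINE_CR       1
#define PCRE2_NEWLINE_LF       2
#define PCRE2_NEWLINE_CRLF     3
#define PCRE2_NEWLINE_ANY      4
#define PCRE2_NEWLINE_ANYCRLF  5
#define PCRE2_NEWLINE_NUL      6

/* Hypothetical helper: returns a short name for a newline setting, or
   NULL if the value is not one of the six valid settings. */
const char *demo_newline_name(int value)
{
  switch (value)
    {
    case PCRE2_NEWLINE_CR:      return "CR";
    case PCRE2_NEWLINE_LF:      return "LF";
    case PCRE2_NEWLINE_CRLF:    return "CRLF";
    case PCRE2_NEWLINE_ANY:     return "ANY";
    case PCRE2_NEWLINE_ANYCRLF: return "ANYCRLF";
    case PCRE2_NEWLINE_NUL:     return "NUL";
    default:                    return NULL;   /* zero is never valid */
    }
}
```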
|
||||||
|
@ -270,6 +289,8 @@ numbers must not be changed. */
|
||||||
#define PCRE2_ERROR_TOOMANYREPLACE (-61)
|
#define PCRE2_ERROR_TOOMANYREPLACE (-61)
|
||||||
#define PCRE2_ERROR_BADSERIALIZEDDATA (-62)
|
#define PCRE2_ERROR_BADSERIALIZEDDATA (-62)
|
||||||
#define PCRE2_ERROR_HEAPLIMIT (-63)
|
#define PCRE2_ERROR_HEAPLIMIT (-63)
|
||||||
|
#define PCRE2_ERROR_CONVERT_SYNTAX (-64)
|
||||||
|
|
||||||
|
|
||||||
/* Request types for pcre2_pattern_info() */
|
/* Request types for pcre2_pattern_info() */
|
||||||
|
|
||||||
|
@ -351,6 +372,9 @@ typedef struct pcre2_real_compile_context pcre2_compile_context; \
|
||||||
struct pcre2_real_match_context; \
|
struct pcre2_real_match_context; \
|
||||||
typedef struct pcre2_real_match_context pcre2_match_context; \
|
typedef struct pcre2_real_match_context pcre2_match_context; \
|
||||||
\
|
\
|
||||||
|
struct pcre2_real_convert_context; \
|
||||||
|
typedef struct pcre2_real_convert_context pcre2_convert_context; \
|
||||||
|
\
|
||||||
struct pcre2_real_code; \
|
struct pcre2_real_code; \
|
||||||
typedef struct pcre2_real_code pcre2_code; \
|
typedef struct pcre2_real_code pcre2_code; \
|
||||||
\
|
\
|
||||||
|
@ -434,6 +458,8 @@ PCRE2_EXP_DECL int PCRE2_CALL_CONVENTION \
|
||||||
pcre2_set_bsr(pcre2_compile_context *, uint32_t); \
|
pcre2_set_bsr(pcre2_compile_context *, uint32_t); \
|
||||||
PCRE2_EXP_DECL int PCRE2_CALL_CONVENTION \
|
PCRE2_EXP_DECL int PCRE2_CALL_CONVENTION \
|
||||||
pcre2_set_character_tables(pcre2_compile_context *, const unsigned char *); \
|
pcre2_set_character_tables(pcre2_compile_context *, const unsigned char *); \
|
||||||
|
PCRE2_EXP_DECL int PCRE2_CALL_CONVENTION \
|
||||||
|
pcre2_set_compile_extra_options(pcre2_compile_context *, uint32_t); \
|
||||||
PCRE2_EXP_DECL int PCRE2_CALL_CONVENTION \
|
PCRE2_EXP_DECL int PCRE2_CALL_CONVENTION \
|
||||||
pcre2_set_max_pattern_length(pcre2_compile_context *, PCRE2_SIZE); \
|
pcre2_set_max_pattern_length(pcre2_compile_context *, PCRE2_SIZE); \
|
||||||
PCRE2_EXP_DECL int PCRE2_CALL_CONVENTION \
|
PCRE2_EXP_DECL int PCRE2_CALL_CONVENTION \
|
||||||
|
@ -466,6 +492,18 @@ PCRE2_EXP_DECL int PCRE2_CALL_CONVENTION \
|
||||||
pcre2_set_recursion_memory_management(pcre2_match_context *, \
|
pcre2_set_recursion_memory_management(pcre2_match_context *, \
|
||||||
void *(*)(PCRE2_SIZE, void *), void (*)(void *, void *), void *);
|
void *(*)(PCRE2_SIZE, void *), void (*)(void *, void *), void *);
|
||||||
|
|
||||||
|
#define PCRE2_CONVERT_CONTEXT_FUNCTIONS \
|
||||||
|
PCRE2_EXP_DECL pcre2_convert_context PCRE2_CALL_CONVENTION \
|
||||||
|
*pcre2_convert_context_copy(pcre2_convert_context *); \
|
||||||
|
PCRE2_EXP_DECL pcre2_convert_context PCRE2_CALL_CONVENTION \
|
||||||
|
*pcre2_convert_context_create(pcre2_general_context *); \
|
||||||
|
PCRE2_EXP_DECL void PCRE2_CALL_CONVENTION \
|
||||||
|
pcre2_convert_context_free(pcre2_convert_context *); \
|
||||||
|
PCRE2_EXP_DECL int PCRE2_CALL_CONVENTION \
|
||||||
|
pcre2_set_glob_escape(pcre2_convert_context *, uint32_t); \
|
||||||
|
PCRE2_EXP_DECL int PCRE2_CALL_CONVENTION \
|
||||||
|
pcre2_set_glob_separator(pcre2_convert_context *, uint32_t);
|
||||||
|
|
||||||
|
|
||||||
/* Functions concerned with compiling a pattern to PCRE internal code. */
|
/* Functions concerned with compiling a pattern to PCRE internal code. */
|
||||||
|
|
||||||
|
@ -572,6 +610,16 @@ PCRE2_EXP_DECL int PCRE2_CALL_CONVENTION \
|
||||||
PCRE2_SIZE, PCRE2_UCHAR *, PCRE2_SIZE *);
|
PCRE2_SIZE, PCRE2_UCHAR *, PCRE2_SIZE *);
|
||||||
|
|
||||||
|
|
||||||
|
/* Functions for converting pattern source strings. */
|
||||||
|
|
||||||
|
#define PCRE2_CONVERT_FUNCTIONS \
|
||||||
|
PCRE2_EXP_DECL int PCRE2_CALL_CONVENTION \
|
||||||
|
pcre2_pattern_convert(PCRE2_SPTR, PCRE2_SIZE, uint32_t, PCRE2_UCHAR **, \
|
||||||
|
PCRE2_SIZE *, pcre2_convert_context *); \
|
||||||
|
PCRE2_EXP_DECL void PCRE2_CALL_CONVENTION \
|
||||||
|
pcre2_converted_pattern_free(PCRE2_UCHAR *);
|
||||||
|
|
||||||
|
|
||||||
/* Functions for JIT processing */
|
/* Functions for JIT processing */
|
||||||
|
|
||||||
#define PCRE2_JIT_FUNCTIONS \
|
#define PCRE2_JIT_FUNCTIONS \
|
||||||
|
@ -623,6 +671,7 @@ pcre2_compile are called by application code. */
|
||||||
#define pcre2_real_code PCRE2_SUFFIX(pcre2_real_code_)
|
#define pcre2_real_code PCRE2_SUFFIX(pcre2_real_code_)
|
||||||
#define pcre2_real_general_context PCRE2_SUFFIX(pcre2_real_general_context_)
|
#define pcre2_real_general_context PCRE2_SUFFIX(pcre2_real_general_context_)
|
||||||
#define pcre2_real_compile_context PCRE2_SUFFIX(pcre2_real_compile_context_)
|
#define pcre2_real_compile_context PCRE2_SUFFIX(pcre2_real_compile_context_)
|
||||||
|
#define pcre2_real_convert_context PCRE2_SUFFIX(pcre2_real_convert_context_)
|
||||||
#define pcre2_real_match_context PCRE2_SUFFIX(pcre2_real_match_context_)
|
#define pcre2_real_match_context PCRE2_SUFFIX(pcre2_real_match_context_)
|
||||||
#define pcre2_real_jit_stack PCRE2_SUFFIX(pcre2_real_jit_stack_)
|
#define pcre2_real_jit_stack PCRE2_SUFFIX(pcre2_real_jit_stack_)
|
||||||
#define pcre2_real_match_data PCRE2_SUFFIX(pcre2_real_match_data_)
|
#define pcre2_real_match_data PCRE2_SUFFIX(pcre2_real_match_data_)
|
||||||
|
@ -634,6 +683,7 @@ pcre2_compile are called by application code. */
|
||||||
#define pcre2_callout_enumerate_block PCRE2_SUFFIX(pcre2_callout_enumerate_block_)
|
#define pcre2_callout_enumerate_block PCRE2_SUFFIX(pcre2_callout_enumerate_block_)
|
||||||
#define pcre2_general_context PCRE2_SUFFIX(pcre2_general_context_)
|
#define pcre2_general_context PCRE2_SUFFIX(pcre2_general_context_)
|
||||||
#define pcre2_compile_context PCRE2_SUFFIX(pcre2_compile_context_)
|
#define pcre2_compile_context PCRE2_SUFFIX(pcre2_compile_context_)
|
||||||
|
#define pcre2_convert_context PCRE2_SUFFIX(pcre2_convert_context_)
|
||||||
#define pcre2_match_context PCRE2_SUFFIX(pcre2_match_context_)
|
#define pcre2_match_context PCRE2_SUFFIX(pcre2_match_context_)
|
||||||
#define pcre2_match_data PCRE2_SUFFIX(pcre2_match_data_)
|
#define pcre2_match_data PCRE2_SUFFIX(pcre2_match_data_)
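Each of the defines above maps a generic name to a width-specific one through PCRE2_SUFFIX. The definition of PCRE2_SUFFIX is not part of this diff, so the block below is an illustrative re-creation of the usual token-pasting technique under assumed names (`DEMO_JOIN`, `DEMO_GLUE`, `DEMO_SUFFIX`, `DEMO_CODE_UNIT_WIDTH`), not a copy of the header.

```c
/* Assumed stand-in for the configured code unit width. */
#ifndef DEMO_CODE_UNIT_WIDTH
#define DEMO_CODE_UNIT_WIDTH 8
#endif

#define DEMO_JOIN(a, b)  a ## b
#define DEMO_GLUE(a, b)  DEMO_JOIN(a, b)   /* expands b before pasting */
#define DEMO_SUFFIX(a)   DEMO_GLUE(a, DEMO_CODE_UNIT_WIDTH)

/* Generic name mapped to a width-specific name, in the style of the
   defines above. */
#define demo_unit_bytes  DEMO_SUFFIX(demo_unit_bytes_)

/* With the width set to 8, this defines a function whose real name is
   demo_unit_bytes_8. */
int demo_unit_bytes(void)
{
  return DEMO_CODE_UNIT_WIDTH / 8;
}
```

Callers can keep writing the generic name, while the linker sees only the suffixed symbol, which is what lets libraries of different widths coexist in one program.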
|
||||||
|
|
||||||
|
@ -649,6 +699,10 @@ pcre2_compile are called by application code. */
|
||||||
#define pcre2_compile_context_create PCRE2_SUFFIX(pcre2_compile_context_create_)
|
#define pcre2_compile_context_create PCRE2_SUFFIX(pcre2_compile_context_create_)
|
||||||
#define pcre2_compile_context_free PCRE2_SUFFIX(pcre2_compile_context_free_)
|
#define pcre2_compile_context_free PCRE2_SUFFIX(pcre2_compile_context_free_)
|
||||||
#define pcre2_config PCRE2_SUFFIX(pcre2_config_)
|
#define pcre2_config PCRE2_SUFFIX(pcre2_config_)
|
||||||
|
#define pcre2_convert_context_copy PCRE2_SUFFIX(pcre2_convert_context_copy_)
|
||||||
|
#define pcre2_convert_context_create PCRE2_SUFFIX(pcre2_convert_context_create_)
|
||||||
|
#define pcre2_convert_context_free PCRE2_SUFFIX(pcre2_convert_context_free_)
|
||||||
|
#define pcre2_converted_pattern_free PCRE2_SUFFIX(pcre2_converted_pattern_free_)
|
||||||
#define pcre2_dfa_match PCRE2_SUFFIX(pcre2_dfa_match_)
|
#define pcre2_dfa_match PCRE2_SUFFIX(pcre2_dfa_match_)
|
||||||
#define pcre2_general_context_copy PCRE2_SUFFIX(pcre2_general_context_copy_)
|
#define pcre2_general_context_copy PCRE2_SUFFIX(pcre2_general_context_copy_)
|
||||||
#define pcre2_general_context_create PCRE2_SUFFIX(pcre2_general_context_create_)
|
#define pcre2_general_context_create PCRE2_SUFFIX(pcre2_general_context_create_)
|
||||||
|
@ -672,6 +726,7 @@ pcre2_compile are called by application code. */
|
||||||
#define pcre2_match_data_create PCRE2_SUFFIX(pcre2_match_data_create_)
|
#define pcre2_match_data_create PCRE2_SUFFIX(pcre2_match_data_create_)
|
||||||
#define pcre2_match_data_create_from_pattern PCRE2_SUFFIX(pcre2_match_data_create_from_pattern_)
|
#define pcre2_match_data_create_from_pattern PCRE2_SUFFIX(pcre2_match_data_create_from_pattern_)
|
||||||
#define pcre2_match_data_free PCRE2_SUFFIX(pcre2_match_data_free_)
|
#define pcre2_match_data_free PCRE2_SUFFIX(pcre2_match_data_free_)
|
||||||
|
#define pcre2_pattern_convert PCRE2_SUFFIX(pcre2_pattern_convert_)
|
||||||
#define pcre2_pattern_info PCRE2_SUFFIX(pcre2_pattern_info_)
|
#define pcre2_pattern_info PCRE2_SUFFIX(pcre2_pattern_info_)
|
||||||
#define pcre2_serialize_decode PCRE2_SUFFIX(pcre2_serialize_decode_)
|
#define pcre2_serialize_decode PCRE2_SUFFIX(pcre2_serialize_decode_)
|
||||||
#define pcre2_serialize_encode PCRE2_SUFFIX(pcre2_serialize_encode_)
|
#define pcre2_serialize_encode PCRE2_SUFFIX(pcre2_serialize_encode_)
|
||||||
|
@ -680,8 +735,11 @@ pcre2_compile are called by application code. */
|
||||||
#define pcre2_set_bsr PCRE2_SUFFIX(pcre2_set_bsr_)
|
#define pcre2_set_bsr PCRE2_SUFFIX(pcre2_set_bsr_)
|
||||||
#define pcre2_set_callout PCRE2_SUFFIX(pcre2_set_callout_)
|
#define pcre2_set_callout PCRE2_SUFFIX(pcre2_set_callout_)
|
||||||
#define pcre2_set_character_tables PCRE2_SUFFIX(pcre2_set_character_tables_)
|
#define pcre2_set_character_tables PCRE2_SUFFIX(pcre2_set_character_tables_)
|
||||||
|
#define pcre2_set_compile_extra_options PCRE2_SUFFIX(pcre2_set_compile_extra_options_)
|
||||||
#define pcre2_set_compile_recursion_guard PCRE2_SUFFIX(pcre2_set_compile_recursion_guard_)
|
#define pcre2_set_compile_recursion_guard PCRE2_SUFFIX(pcre2_set_compile_recursion_guard_)
|
||||||
#define pcre2_set_depth_limit PCRE2_SUFFIX(pcre2_set_depth_limit_)
|
#define pcre2_set_depth_limit PCRE2_SUFFIX(pcre2_set_depth_limit_)
|
||||||
|
#define pcre2_set_glob_escape PCRE2_SUFFIX(pcre2_set_glob_escape_)
|
||||||
|
#define pcre2_set_glob_separator PCRE2_SUFFIX(pcre2_set_glob_separator_)
|
||||||
#define pcre2_set_heap_limit PCRE2_SUFFIX(pcre2_set_heap_limit_)
|
#define pcre2_set_heap_limit PCRE2_SUFFIX(pcre2_set_heap_limit_)
|
||||||
#define pcre2_set_match_limit PCRE2_SUFFIX(pcre2_set_match_limit_)
|
#define pcre2_set_match_limit PCRE2_SUFFIX(pcre2_set_match_limit_)
|
||||||
#define pcre2_set_max_pattern_length PCRE2_SUFFIX(pcre2_set_max_pattern_length_)
|
#define pcre2_set_max_pattern_length PCRE2_SUFFIX(pcre2_set_max_pattern_length_)
|
||||||
|
@ -716,6 +774,8 @@ PCRE2_STRUCTURE_LIST \
|
||||||
PCRE2_GENERAL_INFO_FUNCTIONS \
|
PCRE2_GENERAL_INFO_FUNCTIONS \
|
||||||
PCRE2_GENERAL_CONTEXT_FUNCTIONS \
|
PCRE2_GENERAL_CONTEXT_FUNCTIONS \
|
||||||
PCRE2_COMPILE_CONTEXT_FUNCTIONS \
|
PCRE2_COMPILE_CONTEXT_FUNCTIONS \
|
||||||
|
PCRE2_CONVERT_CONTEXT_FUNCTIONS \
|
||||||
|
PCRE2_CONVERT_FUNCTIONS \
|
||||||
PCRE2_MATCH_CONTEXT_FUNCTIONS \
|
PCRE2_MATCH_CONTEXT_FUNCTIONS \
|
||||||
PCRE2_COMPILE_FUNCTIONS \
|
PCRE2_COMPILE_FUNCTIONS \
|
||||||
PCRE2_PATTERN_INFO_FUNCTIONS \
|
PCRE2_PATTERN_INFO_FUNCTIONS \
|
||||||
|
@ -745,6 +805,7 @@ PCRE2_TYPES_STRUCTURES_AND_FUNCTIONS
|
||||||
#undef PCRE2_GENERAL_INFO_FUNCTIONS
|
#undef PCRE2_GENERAL_INFO_FUNCTIONS
|
||||||
#undef PCRE2_GENERAL_CONTEXT_FUNCTIONS
|
#undef PCRE2_GENERAL_CONTEXT_FUNCTIONS
|
||||||
#undef PCRE2_COMPILE_CONTEXT_FUNCTIONS
|
#undef PCRE2_COMPILE_CONTEXT_FUNCTIONS
|
||||||
|
#undef PCRE2_CONVERT_CONTEXT_FUNCTIONS
|
||||||
#undef PCRE2_MATCH_CONTEXT_FUNCTIONS
|
#undef PCRE2_MATCH_CONTEXT_FUNCTIONS
|
||||||
#undef PCRE2_COMPILE_FUNCTIONS
|
#undef PCRE2_COMPILE_FUNCTIONS
|
||||||
#undef PCRE2_PATTERN_INFO_FUNCTIONS
|
#undef PCRE2_PATTERN_INFO_FUNCTIONS
|
||||||
|
|
|
@ -4351,7 +4351,7 @@ struct sljit_jump *quit;
|
||||||
struct sljit_jump *partial_quit[2];
|
struct sljit_jump *partial_quit[2];
|
||||||
sljit_u8 instruction[8];
|
sljit_u8 instruction[8];
|
||||||
sljit_s32 tmp1_ind = sljit_get_register_index(TMP1);
|
sljit_s32 tmp1_ind = sljit_get_register_index(TMP1);
|
||||||
sljit_s32 tmp2_ind = sljit_get_register_index(TMP2);
|
// sljit_s32 tmp2_ind = sljit_get_register_index(TMP2);
|
||||||
sljit_s32 str_ptr_ind = sljit_get_register_index(STR_PTR);
|
sljit_s32 str_ptr_ind = sljit_get_register_index(STR_PTR);
|
||||||
sljit_s32 data_ind = 0;
|
sljit_s32 data_ind = 0;
|
||||||
sljit_s32 tmp_ind = 1;
|
sljit_s32 tmp_ind = 1;
|
||||||
|
@ -4376,7 +4376,9 @@ if (common->mode == PCRE2_JIT_COMPLETE)
|
||||||
|
|
||||||
OP1(SLJIT_MOV, TMP1, 0, SLJIT_IMM, character_to_int32(char1 | bit));
|
OP1(SLJIT_MOV, TMP1, 0, SLJIT_IMM, character_to_int32(char1 | bit));
|
||||||
|
|
||||||
SLJIT_ASSERT(tmp1_ind < 8 && tmp2_ind == 1);
|
// SLJIT_ASSERT(tmp1_ind < 8 && tmp2_ind == 1);
|
||||||
|
|
||||||
|
SLJIT_ASSERT(tmp1_ind < 8);
|
||||||
|
|
||||||
/* MOVD xmm, r/m32 */
|
/* MOVD xmm, r/m32 */
|
||||||
instruction[0] = 0x66;
|
instruction[0] = 0x66;
|
||||||
|
|
|
@ -4073,7 +4073,8 @@ else fprintf(outfile, "%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%
|
||||||
* Show compile extra options *
|
* Show compile extra options *
|
||||||
*************************************************/
|
*************************************************/
|
||||||
|
|
||||||
/* Called for unsupported POSIX options.
|
/* Called only for unsupported POSIX options at present, and therefore needed
|
||||||
|
only when the 8-bit library is being compiled.
|
||||||
|
|
||||||
Arguments:
|
Arguments:
|
||||||
options an options word
|
options an options word
|
||||||
|
@ -4083,17 +4084,21 @@ Arguments:
|
||||||
Returns: nothing
|
Returns: nothing
|
||||||
*/
|
*/
|
||||||
|
|
||||||
|
#ifdef SUPPORT_PCRE2_8
|
||||||
static void
|
static void
|
||||||
show_compile_extra_options(uint32_t options, const char *before,
|
show_compile_extra_options(uint32_t options, const char *before,
|
||||||
const char *after)
|
const char *after)
|
||||||
{
|
{
|
||||||
if (options == 0) fprintf(outfile, "%s <none>%s", before, after);
|
if (options == 0) fprintf(outfile, "%s <none>%s", before, after);
|
||||||
else fprintf(outfile, "%s%s%s%s",
|
else fprintf(outfile, "%s%s%s%s%s%s",
|
||||||
before,
|
before,
|
||||||
((options & PCRE2_EXTRA_ALLOW_SURROGATE_ESCAPES) != 0)? " allow_surrogate_escapes" : "",
|
((options & PCRE2_EXTRA_ALLOW_SURROGATE_ESCAPES) != 0)? " allow_surrogate_escapes" : "",
|
||||||
((options & PCRE2_EXTRA_BAD_ESCAPE_IS_LITERAL) != 0)? " bad_escape_is_literal" : "",
|
((options & PCRE2_EXTRA_BAD_ESCAPE_IS_LITERAL) != 0)? " bad_escape_is_literal" : "",
|
||||||
|
((options & PCRE2_EXTRA_MATCH_WORD) != 0)? " match_word" : "",
|
||||||
|
((options & PCRE2_EXTRA_MATCH_LINE) != 0)? " match_line" : "",
|
||||||
after);
|
after);
|
||||||
}
|
}
|
||||||
|
#endif
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
|
@ -124,10 +124,10 @@
|
||||||
/* SLJIT_REWRITABLE_JUMP is 0x1000. */
|
/* SLJIT_REWRITABLE_JUMP is 0x1000. */
|
||||||
|
|
||||||
#if (defined SLJIT_CONFIG_X86 && SLJIT_CONFIG_X86)
|
#if (defined SLJIT_CONFIG_X86 && SLJIT_CONFIG_X86)
|
||||||
# define PATCH_MB 0x4
|
# define PATCH_MB 0x4
|
||||||
# define PATCH_MW 0x8
|
# define PATCH_MW 0x8
|
||||||
#if (defined SLJIT_CONFIG_X86_64 && SLJIT_CONFIG_X86_64)
|
#if (defined SLJIT_CONFIG_X86_64 && SLJIT_CONFIG_X86_64)
|
||||||
# define PATCH_MD 0x10
|
# define PATCH_MD 0x10
|
||||||
#endif
|
#endif
|
||||||
#endif
|
#endif
|
||||||
|
|
||||||
|
@ -1555,6 +1555,7 @@ static SLJIT_INLINE CHECK_RETURN_TYPE check_sljit_emit_cmov(struct sljit_compile
|
||||||
sljit_s32 dst_reg,
|
sljit_s32 dst_reg,
|
||||||
sljit_s32 src, sljit_sw srcw)
|
sljit_s32 src, sljit_sw srcw)
|
||||||
{
|
{
|
||||||
|
(void)srcw; /* To stop compiler warning */
|
||||||
#if (defined SLJIT_ARGUMENT_CHECKS && SLJIT_ARGUMENT_CHECKS)
|
#if (defined SLJIT_ARGUMENT_CHECKS && SLJIT_ARGUMENT_CHECKS)
|
||||||
CHECK_ARGUMENT(!(type & ~(0xff | SLJIT_I32_OP)));
|
CHECK_ARGUMENT(!(type & ~(0xff | SLJIT_I32_OP)));
|
||||||
CHECK_ARGUMENT((type & 0xff) >= SLJIT_EQUAL && (type & 0xff) <= SLJIT_ORDERED_F64);
|
CHECK_ARGUMENT((type & 0xff) >= SLJIT_EQUAL && (type & 0xff) <= SLJIT_ORDERED_F64);
|
||||||
|
|
|
@ -95,17 +95,6 @@
|
||||||
aaac
|
aaac
|
||||||
abbbbbbbbbbbac
|
abbbbbbbbbbbac
|
||||||
|
|
||||||
/^(b+|a){1,2}?bc/
|
|
||||||
bbc
|
|
||||||
|
|
||||||
/^(b*|ba){1,2}?bc/
|
|
||||||
babc
|
|
||||||
bbabc
|
|
||||||
bababc
|
|
||||||
\= Expect no match
|
|
||||||
bababbc
|
|
||||||
babababc
|
|
||||||
|
|
||||||
/^(ba|b*){1,2}?bc/
|
/^(ba|b*){1,2}?bc/
|
||||||
babc
|
babc
|
||||||
bbabc
|
bbabc
|
||||||
|
|
|
@ -5350,4 +5350,19 @@ a)"xI
|
||||||
\= Expect no match
|
\= Expect no match
|
||||||
Not a whole line
|
Not a whole line
|
||||||
|
|
||||||
|
# Perl gets this wrong, failing to capture 'b' in group 1.
|
||||||
|
|
||||||
|
/^(b+|a){1,2}?bc/
|
||||||
|
bbc
|
||||||
|
|
||||||
|
# And again here, for the "babc" subject string.
|
||||||
|
|
||||||
|
/^(b*|ba){1,2}?bc/
|
||||||
|
babc
|
||||||
|
bbabc
|
||||||
|
bababc
|
||||||
|
\= Expect no match
|
||||||
|
bababbc
|
||||||
|
babababc
|
||||||
|
|
||||||
# End of testinput2
|
# End of testinput2
|
||||||
|
|
|
@ -183,27 +183,6 @@ No match
|
||||||
abbbbbbbbbbbac
|
abbbbbbbbbbbac
|
||||||
No match
|
No match
|
||||||
|
|
||||||
/^(b+|a){1,2}?bc/
|
|
||||||
bbc
|
|
||||||
0: bbc
|
|
||||||
1: b
|
|
||||||
|
|
||||||
/^(b*|ba){1,2}?bc/
|
|
||||||
babc
|
|
||||||
0: babc
|
|
||||||
1: ba
|
|
||||||
bbabc
|
|
||||||
0: bbabc
|
|
||||||
1: ba
|
|
||||||
bababc
|
|
||||||
0: bababc
|
|
||||||
1: ba
|
|
||||||
\= Expect no match
|
|
||||||
bababbc
|
|
||||||
No match
|
|
||||||
babababc
|
|
||||||
No match
|
|
||||||
|
|
||||||
/^(ba|b*){1,2}?bc/
|
/^(ba|b*){1,2}?bc/
|
||||||
babc
|
babc
|
||||||
0: babc
|
0: babc
|
||||||
|
|
|
@ -16300,6 +16300,31 @@ No match
|
||||||
Not a whole line
|
Not a whole line
|
||||||
No match
|
No match
|
||||||
|
|
||||||
|
# Perl gets this wrong, failing to capture 'b' in group 1.
|
||||||
|
|
||||||
|
/^(b+|a){1,2}?bc/
|
||||||
|
bbc
|
||||||
|
0: bbc
|
||||||
|
1: b
|
||||||
|
|
||||||
|
# And again here, for the "babc" subject string.
|
||||||
|
|
||||||
|
/^(b*|ba){1,2}?bc/
|
||||||
|
babc
|
||||||
|
0: babc
|
||||||
|
1: ba
|
||||||
|
bbabc
|
||||||
|
0: bbabc
|
||||||
|
1: ba
|
||||||
|
bababc
|
||||||
|
0: bababc
|
||||||
|
1: ba
|
||||||
|
\= Expect no match
|
||||||
|
bababbc
|
||||||
|
No match
|
||||||
|
babababc
|
||||||
|
No match
|
||||||
|
|
||||||
# End of testinput2
|
# End of testinput2
|
||||||
Error -65: PCRE2_ERROR_BADDATA (unknown error number)
|
Error -65: PCRE2_ERROR_BADDATA (unknown error number)
|
||||||
Error -62: bad serialized data
|
Error -62: bad serialized data
|
||||||
|
|
|
@ -853,10 +853,8 @@ Memory allocation (code space): 28
|
||||||
# with link size - hence multiple tests with different values.
|
# with link size - hence multiple tests with different values.
|
||||||
|
|
||||||
/(?'ABC'\[[bar](]{792}*THEN:\[A]{255}\[)]{793}/expand,-fullbincode,parens_nest_limit=1000
|
/(?'ABC'\[[bar](]{792}*THEN:\[A]{255}\[)]{793}/expand,-fullbincode,parens_nest_limit=1000
|
||||||
Failed: error 186 at offset 5813: regular expression is too complicated
|
|
||||||
|
|
||||||
/(?'ABC'\[[bar](]{793}*THEN:\[A]{255}\[)]{794}/expand,-fullbincode,parens_nest_limit=1000
|
/(?'ABC'\[[bar](]{793}*THEN:\[A]{255}\[)]{794}/expand,-fullbincode,parens_nest_limit=1000
|
||||||
Failed: error 186 at offset 5820: regular expression is too complicated
|
|
||||||
|
|
||||||
/(?'ABC'\[[bar](]{1793}*THEN:\[A]{255}\[)]{1794}/expand,-fullbincode,parens_nest_limit=2000
|
/(?'ABC'\[[bar](]{1793}*THEN:\[A]{255}\[)]{1794}/expand,-fullbincode,parens_nest_limit=2000
|
||||||
Failed: error 186 at offset 12820: regular expression is too complicated
|
Failed: error 186 at offset 12820: regular expression is too complicated
|
||||||
|
|
|
@ -853,10 +853,8 @@ Memory allocation (code space): 28
|
||||||
# with link size - hence multiple tests with different values.
|
# with link size - hence multiple tests with different values.
|
||||||
|
|
||||||
/(?'ABC'\[[bar](]{792}*THEN:\[A]{255}\[)]{793}/expand,-fullbincode,parens_nest_limit=1000
|
/(?'ABC'\[[bar](]{792}*THEN:\[A]{255}\[)]{793}/expand,-fullbincode,parens_nest_limit=1000
|
||||||
Failed: error 186 at offset 5813: regular expression is too complicated
|
|
||||||
|
|
||||||
/(?'ABC'\[[bar](]{793}*THEN:\[A]{255}\[)]{794}/expand,-fullbincode,parens_nest_limit=1000
|
/(?'ABC'\[[bar](]{793}*THEN:\[A]{255}\[)]{794}/expand,-fullbincode,parens_nest_limit=1000
|
||||||
Failed: error 186 at offset 5820: regular expression is too complicated
|
|
||||||
|
|
||||||
/(?'ABC'\[[bar](]{1793}*THEN:\[A]{255}\[)]{1794}/expand,-fullbincode,parens_nest_limit=2000
|
/(?'ABC'\[[bar](]{1793}*THEN:\[A]{255}\[)]{1794}/expand,-fullbincode,parens_nest_limit=2000
|
||||||
Failed: error 186 at offset 12820: regular expression is too complicated
|
Failed: error 186 at offset 12820: regular expression is too complicated
|
||||||
|
|