Make PCRE2_NO_START_OPTIMIZE a compile-only option.
parent
313245365d
commit
a0410efc56
124
doc/pcre2api.3
|
@ -930,9 +930,8 @@ documentation).
|
||||||
.P
|
.P
|
||||||
For those options that can be different in different parts of the pattern, the
|
For those options that can be different in different parts of the pattern, the
|
||||||
contents of the \fIoptions\fP argument specifies their settings at the start of
|
contents of the \fIoptions\fP argument specifies their settings at the start of
|
||||||
compilation. The PCRE2_ANCHORED, PCRE2_NO_UTF_CHECK, and
|
compilation. The PCRE2_ANCHORED and PCRE2_NO_UTF_CHECK options can be set at
|
||||||
PCRE2_NO_START_OPTIMIZE options can be set at the time of matching as well as
|
the time of matching as well as at compile time.
|
||||||
at compile time.
|
|
||||||
.P
|
.P
|
||||||
Other, less frequently required compile-time parameters (for example, the
|
Other, less frequently required compile-time parameters (for example, the
|
||||||
newline setting) can be provided in a compile context (as described
|
newline setting) can be provided in a compile context (as described
|
||||||
|
@ -1150,17 +1149,52 @@ purposes.
|
||||||
.sp
|
.sp
|
||||||
PCRE2_NO_START_OPTIMIZE
|
PCRE2_NO_START_OPTIMIZE
|
||||||
.sp
|
.sp
|
||||||
This is an option that acts at matching time; that is, it is really an option
|
This is an option whose main effect is at matching time. It does not change
|
||||||
for \fBpcre2_match()\fP or \fBpcre_dfa_match()\fP. If it is set at compile
|
what \fBpcre2_compile()\fP generates, but it does affect the output of the JIT
|
||||||
time, it is remembered with the compiled pattern and assumed at matching time.
|
compiler.
|
||||||
This is necessary if you want to use JIT execution, because the JIT compiler
|
.P
|
||||||
needs to know whether or not this option is set. For details, see the
|
There are a number of optimizations that may occur at the start of a match, in
|
||||||
discussion of PCRE2_NO_START_OPTIMIZE in the section on \fBpcre2_match()\fP
|
order to speed up the process. For example, if it is known that an unanchored
|
||||||
options
|
match must start with a specific character, the matching code searches the
|
||||||
.\" HTML <a href="#matchoptions">
|
subject for that character, and fails immediately if it cannot find it, without
|
||||||
.\" </a>
|
actually running the main matching function. This means that a special item
|
||||||
below.
|
such as (*COMMIT) at the start of a pattern is not considered until after a
|
||||||
.\"
|
suitable starting point for the match has been found. Also, when callouts or
|
||||||
|
(*MARK) items are in use, these "start-up" optimizations can cause them to be
|
||||||
|
skipped if the pattern is never actually used. The start-up optimizations are
|
||||||
|
in effect a pre-scan of the subject that takes place before the pattern is run.
|
||||||
|
.P
|
||||||
|
The PCRE2_NO_START_OPTIMIZE option disables the start-up optimizations,
|
||||||
|
possibly causing performance to suffer, but ensuring that in cases where the
|
||||||
|
result is "no match", the callouts do occur, and that items such as (*COMMIT)
|
||||||
|
and (*MARK) are considered at every possible starting position in the subject
|
||||||
|
string.
|
||||||
|
.P
|
||||||
|
Setting PCRE2_NO_START_OPTIMIZE may change the outcome of a matching operation.
|
||||||
|
Consider the pattern
|
||||||
|
.sp
|
||||||
|
(*COMMIT)ABC
|
||||||
|
.sp
|
||||||
|
When this is compiled, PCRE2 records the fact that a match must start with the
|
||||||
|
character "A". Suppose the subject string is "DEFABC". The start-up
|
||||||
|
optimization scans along the subject, finds "A" and runs the first match
|
||||||
|
attempt from there. The (*COMMIT) item means that the pattern must match the
|
||||||
|
current starting position, which in this case, it does. However, if the same
|
||||||
|
match is run with PCRE2_NO_START_OPTIMIZE set, the initial scan along the
|
||||||
|
subject string does not happen. The first match attempt is run starting from
|
||||||
|
"D" and when this fails, (*COMMIT) prevents any further matches being tried, so
|
||||||
|
the overall result is "no match". There are also other start-up optimizations.
|
||||||
|
For example, a minimum length for the subject may be recorded. Consider the
|
||||||
|
pattern
|
||||||
|
.sp
|
||||||
|
(*MARK:A)(X|Y)
|
||||||
|
.sp
|
||||||
|
The minimum length for a match is one character. If the subject is "ABC", there
|
||||||
|
will be attempts to match "ABC", "BC", and "C". An attempt to match an empty
|
||||||
|
string at the end of the subject does not take place, because PCRE2 knows that
|
||||||
|
the subject is now too short, and so the (*MARK) is never encountered. In this
|
||||||
|
case, the optimization does not affect the overall match result, which is still
|
||||||
|
"no match", but it does affect the auxiliary information that is returned.
|
||||||
.sp
|
.sp
|
||||||
PCRE2_NO_UTF_CHECK
|
PCRE2_NO_UTF_CHECK
|
||||||
.sp
|
.sp
|
||||||
|
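The (*COMMIT)ABC example in the new documentation text can be modelled with a toy matcher. This is only a sketch of the behaviour described above, not PCRE2 code: `toy_commit_match` and its `no_start_optimize` flag are hypothetical names, the pattern is reduced to a non-empty literal preceded by (*COMMIT), and the only start-up optimization modelled is the scan for the first character.

```c
#include <stddef.h>
#include <string.h>

/* Toy model of the pattern (*COMMIT)<literal>. Hypothetical helper, not part
   of PCRE2. Returns the match offset, or -1 for "no match". The literal is
   assumed to be non-empty. */
static int toy_commit_match(const char *subject, const char *literal,
                            int no_start_optimize)
{
    size_t start = 0;
    size_t lit_len = strlen(literal);

    if (!no_start_optimize) {
        /* Start-up optimization: pre-scan for the required first character
           and begin the first match attempt there. */
        const char *p = strchr(subject, literal[0]);
        if (p == NULL) return -1;          /* fail without running a match */
        start = (size_t)(p - subject);
    }

    /* Exactly one attempt: if it fails, (*COMMIT) prevents the matcher from
       advancing to a later starting position. */
    if (strncmp(subject + start, literal, lit_len) == 0) return (int)start;
    return -1;
}
```

With the subject "DEFABC", the call with `no_start_optimize` set to 0 finds the match at offset 3, whereas the call with it set to 1 tries only offset 0 and reports no match, which mirrors the outcome described in the text.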
@ -1787,10 +1821,9 @@ pattern does not require the match to be at the start of the subject.
|
||||||
.rs
|
.rs
|
||||||
.sp
|
.sp
|
||||||
The unused bits of the \fIoptions\fP argument for \fBpcre2_match()\fP must be
|
The unused bits of the \fIoptions\fP argument for \fBpcre2_match()\fP must be
|
||||||
zero. The only bits that may be set are PCRE2_ANCHORED,
|
zero. The only bits that may be set are PCRE2_ANCHORED, PCRE2_NOTBOL,
|
||||||
PCRE2_NOTBOL, PCRE2_NOTEOL, PCRE2_NOTEMPTY, PCRE2_NOTEMPTY_ATSTART,
|
PCRE2_NOTEOL, PCRE2_NOTEMPTY, PCRE2_NOTEMPTY_ATSTART, PCRE2_NO_UTF_CHECK,
|
||||||
PCRE2_NO_START_OPTIMIZE, PCRE2_NO_UTF_CHECK, PCRE2_PARTIAL_HARD, and
|
PCRE2_PARTIAL_HARD, and PCRE2_PARTIAL_SOFT. Their action is described below.
|
||||||
PCRE2_PARTIAL_SOFT. Their action is described below.
|
|
||||||
.P
|
.P
|
||||||
If the pattern was successfully processed by the just-in-time (JIT) compiler,
|
If the pattern was successfully processed by the just-in-time (JIT) compiler,
|
||||||
the only supported options for matching using the JIT code are PCRE2_NOTBOL,
|
the only supported options for matching using the JIT code are PCRE2_NOTBOL,
|
||||||
|
@ -1840,54 +1873,6 @@ valid, so PCRE2 searches further into the string for occurrences of "a" or "b".
|
||||||
This is like PCRE2_NOTEMPTY, except that an empty string match that is not at
|
This is like PCRE2_NOTEMPTY, except that an empty string match that is not at
|
||||||
the start of the subject is permitted. If the pattern is anchored, such a match
|
the start of the subject is permitted. If the pattern is anchored, such a match
|
||||||
can occur only if the pattern contains \eK.
|
can occur only if the pattern contains \eK.
|
||||||
.sp
|
|
||||||
PCRE2_NO_START_OPTIMIZE
|
|
||||||
.sp
|
|
||||||
There are a number of optimizations that \fBpcre2_match()\fP uses at the start
|
|
||||||
of a match, in order to speed up the process. For example, if it is known that
|
|
||||||
an unanchored match must start with a specific character, it searches the
|
|
||||||
subject for that character, and fails immediately if it cannot find it, without
|
|
||||||
actually running the main matching function. This means that a special item
|
|
||||||
such as (*COMMIT) at the start of a pattern is not considered until after a
|
|
||||||
suitable starting point for the match has been found. Also, when callouts or
|
|
||||||
(*MARK) items are in use, these "start-up" optimizations can cause them to be
|
|
||||||
skipped if the pattern is never actually used. The start-up optimizations are
|
|
||||||
in effect a pre-scan of the subject that takes place before the pattern is run.
|
|
||||||
.P
|
|
||||||
The PCRE2_NO_START_OPTIMIZE option disables the start-up optimizations,
|
|
||||||
possibly causing performance to suffer, but ensuring that in cases where the
|
|
||||||
result is "no match", the callouts do occur, and that items such as (*COMMIT)
|
|
||||||
and (*MARK) are considered at every possible starting position in the subject
|
|
||||||
string. If PCRE2_NO_START_OPTIMIZE is set at compile time, it cannot be unset
|
|
||||||
at matching time. The use of PCRE2_NO_START_OPTIMIZE at matching time (that is,
|
|
||||||
passing it to \fBpcre2_match()\fP) disables JIT execution; in this situation,
|
|
||||||
matching is always done using interpretively.
|
|
||||||
.P
|
|
||||||
Setting PCRE2_NO_START_OPTIMIZE can change the outcome of a matching operation.
|
|
||||||
Consider the pattern
|
|
||||||
.sp
|
|
||||||
(*COMMIT)ABC
|
|
||||||
.sp
|
|
||||||
When this is compiled, PCRE2 records the fact that a match must start with the
|
|
||||||
character "A". Suppose the subject string is "DEFABC". The start-up
|
|
||||||
optimization scans along the subject, finds "A" and runs the first match
|
|
||||||
attempt from there. The (*COMMIT) item means that the pattern must match the
|
|
||||||
current starting position, which in this case, it does. However, if the same
|
|
||||||
match is run with PCRE2_NO_START_OPTIMIZE set, the initial scan along the
|
|
||||||
subject string does not happen. The first match attempt is run starting from
|
|
||||||
"D" and when this fails, (*COMMIT) prevents any further matches being tried, so
|
|
||||||
the overall result is "no match". There are also other start-up optimizations.
|
|
||||||
For example, a minimum length for the subject may be recorded. Consider the
|
|
||||||
pattern
|
|
||||||
.sp
|
|
||||||
(*MARK:A)(X|Y)
|
|
||||||
.sp
|
|
||||||
The minimum length for a match is one character. If the subject is "ABC", there
|
|
||||||
will be attempts to match "ABC", "BC", and "C". An attempt to match an empty
|
|
||||||
string at the end of the subject does not take place, because PCRE2 knows that
|
|
||||||
the subject is now too short, and so the (*MARK) is never encountered. In this
|
|
||||||
case, the optimization does not affect the overall match result, which is still
|
|
||||||
"no match", but it does affect the auxiliary information that is returned.
|
|
||||||
.sp
|
.sp
|
||||||
PCRE2_NO_UTF_CHECK
|
PCRE2_NO_UTF_CHECK
|
||||||
.sp
|
.sp
|
||||||
|
@ -2550,10 +2535,9 @@ Here is an example of a simple call to \fBpcre2_dfa_match()\fP:
|
||||||
The unused bits of the \fIoptions\fP argument for \fBpcre2_dfa_match()\fP must
|
The unused bits of the \fIoptions\fP argument for \fBpcre2_dfa_match()\fP must
|
||||||
be zero. The only bits that may be set are PCRE2_ANCHORED, PCRE2_NOTBOL,
|
be zero. The only bits that may be set are PCRE2_ANCHORED, PCRE2_NOTBOL,
|
||||||
PCRE2_NOTEOL, PCRE2_NOTEMPTY, PCRE2_NOTEMPTY_ATSTART, PCRE2_NO_UTF_CHECK,
|
PCRE2_NOTEOL, PCRE2_NOTEMPTY, PCRE2_NOTEMPTY_ATSTART, PCRE2_NO_UTF_CHECK,
|
||||||
PCRE2_NO_START_OPTIMIZE, PCRE2_PARTIAL_HARD, PCRE2_PARTIAL_SOFT,
|
PCRE2_PARTIAL_HARD, PCRE2_PARTIAL_SOFT, PCRE2_DFA_SHORTEST, and
|
||||||
PCRE2_DFA_SHORTEST, and PCRE2_DFA_RESTART. All but the last four of these are
|
PCRE2_DFA_RESTART. All but the last four of these are exactly the same as for
|
||||||
exactly the same as for \fBpcre2_match()\fP, so their description is not
|
\fBpcre2_match()\fP, so their description is not repeated here.
|
||||||
repeated here.
|
|
||||||
.sp
|
.sp
|
||||||
PCRE2_PARTIAL_HARD
|
PCRE2_PARTIAL_HARD
|
||||||
PCRE2_PARTIAL_SOFT
|
PCRE2_PARTIAL_SOFT
|
||||||
|
|
|
@ -111,7 +111,7 @@ give a "no match" return without actually running a match if the subject is not
|
||||||
long enough, or, for unanchored patterns, if it has been scanned far enough.
|
long enough, or, for unanchored patterns, if it has been scanned far enough.
|
||||||
.P
|
.P
|
||||||
You can disable these optimizations by passing the PCRE2_NO_START_OPTIMIZE
|
You can disable these optimizations by passing the PCRE2_NO_START_OPTIMIZE
|
||||||
option to the matching function, or by starting the pattern with
|
option to \fBpcre2_compile()\fP, or by starting the pattern with
|
||||||
(*NO_START_OPT). This slows down the matching process, but does ensure that
|
(*NO_START_OPT). This slows down the matching process, but does ensure that
|
||||||
callouts such as the example above are obeyed.
|
callouts such as the example above are obeyed.
|
||||||
.
|
.
|
||||||
|
|
|
@ -107,9 +107,8 @@ or the JIT compiler was not able to handle the pattern.
|
||||||
.sp
|
.sp
|
||||||
The \fBpcre2_match()\fP options that are supported for JIT matching are
|
The \fBpcre2_match()\fP options that are supported for JIT matching are
|
||||||
PCRE2_NOTBOL, PCRE2_NOTEOL, PCRE2_NOTEMPTY, PCRE2_NOTEMPTY_ATSTART,
|
PCRE2_NOTBOL, PCRE2_NOTEOL, PCRE2_NOTEMPTY, PCRE2_NOTEMPTY_ATSTART,
|
||||||
PCRE2_NO_UTF_CHECK, PCRE2_PARTIAL_HARD, and PCRE2_PARTIAL_SOFT. The options
|
PCRE2_NO_UTF_CHECK, PCRE2_PARTIAL_HARD, and PCRE2_PARTIAL_SOFT. The
|
||||||
that are not supported at match time are PCRE2_ANCHORED and
|
PCRE2_ANCHORED option is not supported at match time.
|
||||||
PCRE2_NO_START_OPTIMIZE, though they are supported if given at compile time.
|
|
||||||
.P
|
.P
|
||||||
The only unsupported pattern items are \eC (match a single data unit) when
|
The only unsupported pattern items are \eC (match a single data unit) when
|
||||||
running in a UTF mode, and a callout immediately before an assertion condition
|
running in a UTF mode, and a callout immediately before an assertion condition
|
||||||
|
|
|
@ -662,7 +662,6 @@ for a description of their effects.
|
||||||
anchored set PCRE2_ANCHORED
|
anchored set PCRE2_ANCHORED
|
||||||
dfa_restart set PCRE2_DFA_RESTART
|
dfa_restart set PCRE2_DFA_RESTART
|
||||||
dfa_shortest set PCRE2_DFA_SHORTEST
|
dfa_shortest set PCRE2_DFA_SHORTEST
|
||||||
no_start_optimize set PCRE2_NO_START_OPTIMIZE
|
|
||||||
no_utf_check set PCRE2_NO_UTF_CHECK
|
no_utf_check set PCRE2_NO_UTF_CHECK
|
||||||
notbol set PCRE2_NOTBOL
|
notbol set PCRE2_NOTBOL
|
||||||
notempty set PCRE2_NOTEMPTY
|
notempty set PCRE2_NOTEMPTY
|
||||||
|
|
|
@ -86,8 +86,7 @@ passed. Put these bits at the most significant end of the options word so
|
||||||
others can be added next to them */
|
others can be added next to them */
|
||||||
|
|
||||||
#define PCRE2_ANCHORED 0x80000000u
|
#define PCRE2_ANCHORED 0x80000000u
|
||||||
#define PCRE2_NO_START_OPTIMIZE 0x40000000u
|
#define PCRE2_NO_UTF_CHECK 0x40000000u
|
||||||
#define PCRE2_NO_UTF_CHECK 0x20000000u
|
|
||||||
|
|
||||||
/* Other options that can be passed to pcre2_compile(). They may affect
|
/* Other options that can be passed to pcre2_compile(). They may affect
|
||||||
compilation, JIT compilation, and/or interpretive execution. The following tags
|
compilation, JIT compilation, and/or interpretive execution. The following tags
|
||||||
|
@ -95,7 +94,7 @@ indicate which:
|
||||||
|
|
||||||
C alters what is compiled
|
C alters what is compiled
|
||||||
J alters what JIT compiles
|
J alters what JIT compiles
|
||||||
E is inspected during pcre2_match() execution
|
M is inspected during pcre2_match() execution
|
||||||
D is inspected during pcre2_dfa_match() execution
|
D is inspected during pcre2_dfa_match() execution
|
||||||
*/
|
*/
|
||||||
|
|
||||||
|
@ -103,20 +102,21 @@ D is inspected during pcre2_dfa_match() execution
|
||||||
#define PCRE2_ALT_BSUX 0x00000002u /* C */
|
#define PCRE2_ALT_BSUX 0x00000002u /* C */
|
||||||
#define PCRE2_AUTO_CALLOUT 0x00000004u /* C */
|
#define PCRE2_AUTO_CALLOUT 0x00000004u /* C */
|
||||||
#define PCRE2_CASELESS 0x00000008u /* C */
|
#define PCRE2_CASELESS 0x00000008u /* C */
|
||||||
#define PCRE2_DOLLAR_ENDONLY 0x00000010u /* J E D */
|
#define PCRE2_DOLLAR_ENDONLY 0x00000010u /* J M D */
|
||||||
#define PCRE2_DOTALL 0x00000020u /* C */
|
#define PCRE2_DOTALL 0x00000020u /* C */
|
||||||
#define PCRE2_DUPNAMES 0x00000040u /* C */
|
#define PCRE2_DUPNAMES 0x00000040u /* C */
|
||||||
#define PCRE2_EXTENDED 0x00000080u /* C */
|
#define PCRE2_EXTENDED 0x00000080u /* C */
|
||||||
#define PCRE2_FIRSTLINE 0x00000100u /* J E D */
|
#define PCRE2_FIRSTLINE 0x00000100u /* J M D */
|
||||||
#define PCRE2_MATCH_UNSET_BACKREF 0x00000200u /* C J E */
|
#define PCRE2_MATCH_UNSET_BACKREF 0x00000200u /* C J M */
|
||||||
#define PCRE2_MULTILINE 0x00000400u /* C */
|
#define PCRE2_MULTILINE 0x00000400u /* C */
|
||||||
#define PCRE2_NEVER_UCP 0x00000800u /* C */
|
#define PCRE2_NEVER_UCP 0x00000800u /* C */
|
||||||
#define PCRE2_NEVER_UTF 0x00001000u /* C */
|
#define PCRE2_NEVER_UTF 0x00001000u /* C */
|
||||||
#define PCRE2_NO_AUTO_CAPTURE 0x00002000u /* C */
|
#define PCRE2_NO_AUTO_CAPTURE 0x00002000u /* C */
|
||||||
#define PCRE2_NO_AUTO_POSSESS 0x00004000u /* C */
|
#define PCRE2_NO_AUTO_POSSESS 0x00004000u /* C */
|
||||||
#define PCRE2_UCP 0x00008000u /* C J E D */
|
#define PCRE2_NO_START_OPTIMIZE 0x00008000u /* J M D */
|
||||||
#define PCRE2_UNGREEDY 0x00010000u /* C */
|
#define PCRE2_UCP 0x00010000u /* C J M D */
|
||||||
#define PCRE2_UTF 0x00020000u /* C J E D */
|
#define PCRE2_UNGREEDY 0x00020000u /* C */
|
||||||
|
#define PCRE2_UTF 0x00040000u /* C J M D */
|
||||||
|
|
||||||
/* These are for pcre2_jit_compile(). */
|
/* These are for pcre2_jit_compile(). */
|
||||||
|
|
||||||
|
|
|
@ -85,8 +85,7 @@ in others, so I abandoned this code. */
|
||||||
#define PUBLIC_DFA_MATCH_OPTIONS \
|
#define PUBLIC_DFA_MATCH_OPTIONS \
|
||||||
(PCRE2_ANCHORED|PCRE2_NOTBOL|PCRE2_NOTEOL|PCRE2_NOTEMPTY| \
|
(PCRE2_ANCHORED|PCRE2_NOTBOL|PCRE2_NOTEOL|PCRE2_NOTEMPTY| \
|
||||||
PCRE2_NOTEMPTY_ATSTART|PCRE2_NO_UTF_CHECK|PCRE2_PARTIAL_HARD| \
|
PCRE2_NOTEMPTY_ATSTART|PCRE2_NO_UTF_CHECK|PCRE2_PARTIAL_HARD| \
|
||||||
PCRE2_PARTIAL_SOFT|PCRE2_DFA_SHORTEST|PCRE2_DFA_RESTART| \
|
PCRE2_PARTIAL_SOFT|PCRE2_DFA_SHORTEST|PCRE2_DFA_RESTART)
|
||||||
PCRE2_NO_START_OPTIMIZE)
|
|
||||||
|
|
||||||
|
|
||||||
/*************************************************
|
/*************************************************
|
||||||
|
@ -3319,12 +3318,12 @@ for (;;)
|
||||||
|
|
||||||
/* There are some optimizations that avoid running the match if a known
|
/* There are some optimizations that avoid running the match if a known
|
||||||
starting point is not found, or if a known later code unit is not present.
|
starting point is not found, or if a known later code unit is not present.
|
||||||
However, there is an option (settable at compile or match time) that disables
|
However, there is an option (settable at compile time) that disables
|
||||||
these, for testing and for ensuring that all callouts do actually occur.
|
these, for testing and for ensuring that all callouts do actually occur.
|
||||||
The must also be avoided when restarting a DFA match. */
|
The optimizations must also be avoided when restarting a DFA match. */
|
||||||
|
|
||||||
if (((options | re->overall_options) &
|
if ((re->overall_options & PCRE2_NO_START_OPTIMIZE) == 0 &&
|
||||||
(PCRE2_NO_START_OPTIMIZE|PCRE2_DFA_RESTART)) == 0)
|
(options & PCRE2_DFA_RESTART) == 0)
|
||||||
{
|
{
|
||||||
PCRE2_SPTR save_end_subject = end_subject;
|
PCRE2_SPTR save_end_subject = end_subject;
|
||||||
|
|
||||||
|
|
|
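The change to the condition in this hunk can be isolated as two small predicates. The sketch below is an illustration under stated assumptions, not the library source: the function names and the `SKETCH_` constants are hypothetical, the value for the no-start-optimize bit is the new one from the header change in this commit, and the DFA-restart value is a placeholder because the real one is not shown in this diff.

```c
#include <stdint.h>

/* New value of PCRE2_NO_START_OPTIMIZE, as shown in the header diff. */
#define SKETCH_NO_START_OPTIMIZE 0x00008000u
/* Placeholder only: the real PCRE2_DFA_RESTART value is not in this diff. */
#define SKETCH_DFA_RESTART       0x00100000u

/* Old rule: the option could come from the compiled pattern or from the
   match call, so both option words were OR-ed together before testing. */
static int start_opt_enabled_old(uint32_t compile_options,
                                 uint32_t match_options)
{
    return ((match_options | compile_options) &
            (SKETCH_NO_START_OPTIMIZE | SKETCH_DFA_RESTART)) == 0;
}

/* New rule: only the compiled pattern can disable the optimizations;
   the match-time word is consulted solely for the DFA restart flag. */
static int start_opt_enabled_new(uint32_t compile_options,
                                 uint32_t match_options)
{
    return (compile_options & SKETCH_NO_START_OPTIMIZE) == 0 &&
           (match_options & SKETCH_DFA_RESTART) == 0;
}
```

Under the new rule a no-start-optimize bit in `match_options` no longer influences this test. In the real functions such a bit would presumably never reach this point, since it is no longer part of the public match-option masks.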
@ -55,7 +55,7 @@ POSSIBILITY OF SUCH DAMAGE.
|
||||||
#define PUBLIC_MATCH_OPTIONS \
|
#define PUBLIC_MATCH_OPTIONS \
|
||||||
(PCRE2_ANCHORED|PCRE2_NOTBOL|PCRE2_NOTEOL|PCRE2_NOTEMPTY| \
|
(PCRE2_ANCHORED|PCRE2_NOTBOL|PCRE2_NOTEOL|PCRE2_NOTEMPTY| \
|
||||||
PCRE2_NOTEMPTY_ATSTART|PCRE2_NO_UTF_CHECK|PCRE2_PARTIAL_HARD| \
|
PCRE2_NOTEMPTY_ATSTART|PCRE2_NO_UTF_CHECK|PCRE2_PARTIAL_HARD| \
|
||||||
PCRE2_PARTIAL_SOFT|PCRE2_NO_START_OPTIMIZE)
|
PCRE2_PARTIAL_SOFT)
|
||||||
|
|
||||||
#define PUBLIC_JIT_MATCH_OPTIONS \
|
#define PUBLIC_JIT_MATCH_OPTIONS \
|
||||||
(PCRE2_NO_UTF_CHECK|PCRE2_NOTBOL|PCRE2_NOTEOL|PCRE2_NOTEMPTY|\
|
(PCRE2_NO_UTF_CHECK|PCRE2_NOTBOL|PCRE2_NOTEOL|PCRE2_NOTEMPTY|\
|
||||||
|
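Removing the bit from PUBLIC_MATCH_OPTIONS matters because the documentation requires unused bits of the match-time options word to be zero. A mask of this kind is typically applied as below. This is a hedged sketch: `check_match_options`, the `SKETCH_` names, and the error value are hypothetical, and the mask is abbreviated to the two match-time bits whose values appear in this commit's header diff.

```c
#include <stdint.h>

/* Values taken from the header lines shown in this commit. */
#define SKETCH_ANCHORED              0x80000000u
#define SKETCH_NO_UTF_CHECK          0x40000000u  /* new value */
#define SKETCH_NO_START_OPTIMIZE     0x00008000u  /* new value */
#define SKETCH_OLD_NO_START_OPTIMIZE 0x40000000u  /* value before this commit */

/* Abbreviated mask: the other match-time bits are left out because their
   values are not shown in this diff. */
#define SKETCH_MATCH_OPTIONS (SKETCH_ANCHORED | SKETCH_NO_UTF_CHECK)

/* Hypothetical error value standing in for a "bad option" return code. */
#define SKETCH_ERROR_BADOPTION (-1)

/* Reject any option word that has bits outside the permitted mask. */
static int check_match_options(uint32_t options)
{
    return ((options & ~SKETCH_MATCH_OPTIONS) != 0)
        ? SKETCH_ERROR_BADOPTION : 0;
}
```

With these values, passing the new no-start-optimize bit at match time is rejected. Note also that the old bit value, 0x40000000u, is now the value of PCRE2_NO_UTF_CHECK, so code built against the old header and passing the old bit at match time would appear to request the UTF check to be skipped instead; rebuilding against the new header avoids that.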
@ -6687,10 +6687,10 @@ for(;;)
|
||||||
|
|
||||||
/* There are some optimizations that avoid running the match if a known
|
/* There are some optimizations that avoid running the match if a known
|
||||||
starting point is not found, or if a known later code unit is not present.
|
starting point is not found, or if a known later code unit is not present.
|
||||||
However, there is an option (settable at compile or match time) that disables
|
However, there is an option (settable at compile time) that disables these,
|
||||||
these, for testing and for ensuring that all callouts do actually occur. */
|
for testing and for ensuring that all callouts do actually occur. */
|
||||||
|
|
||||||
if (((options | re->overall_options) & PCRE2_NO_START_OPTIMIZE) == 0)
|
if ((re->overall_options & PCRE2_NO_START_OPTIMIZE) == 0)
|
||||||
{
|
{
|
||||||
PCRE2_SPTR save_end_subject = end_subject;
|
PCRE2_SPTR save_end_subject = end_subject;
|
||||||
|
|
||||||
|
|
|
@ -461,7 +461,7 @@ static modstruct modlist[] = {
|
||||||
{ "newline", MOD_CTB, MOD_NL, MO(newline_convention), CO(newline_convention) },
|
{ "newline", MOD_CTB, MOD_NL, MO(newline_convention), CO(newline_convention) },
|
||||||
{ "no_auto_capture", MOD_PAT, MOD_OPT, PCRE2_NO_AUTO_CAPTURE, PO(options) },
|
{ "no_auto_capture", MOD_PAT, MOD_OPT, PCRE2_NO_AUTO_CAPTURE, PO(options) },
|
||||||
{ "no_auto_possess", MOD_PATP, MOD_OPT, PCRE2_NO_AUTO_POSSESS, PO(options) },
|
{ "no_auto_possess", MOD_PATP, MOD_OPT, PCRE2_NO_AUTO_POSSESS, PO(options) },
|
||||||
{ "no_start_optimize", MOD_PDP, MOD_OPT, PCRE2_NO_START_OPTIMIZE, PD(options) },
|
{ "no_start_optimize", MOD_PATP, MOD_OPT, PCRE2_NO_START_OPTIMIZE, PO(options) },
|
||||||
{ "no_utf_check", MOD_PD, MOD_OPT, PCRE2_NO_UTF_CHECK, PD(options) },
|
{ "no_utf_check", MOD_PD, MOD_OPT, PCRE2_NO_UTF_CHECK, PD(options) },
|
||||||
{ "notbol", MOD_DAT, MOD_OPT, PCRE2_NOTBOL, DO(options) },
|
{ "notbol", MOD_DAT, MOD_OPT, PCRE2_NOTBOL, DO(options) },
|
||||||
{ "notempty", MOD_DAT, MOD_OPT, PCRE2_NOTEMPTY, DO(options) },
|
{ "notempty", MOD_DAT, MOD_OPT, PCRE2_NOTEMPTY, DO(options) },
|
||||||
|
@ -3058,11 +3058,10 @@ fprintf(outfile, "%s%s%s%s%s%s%s%s%s%s%s%s%s",
|
||||||
static void
|
static void
|
||||||
show_match_options(uint32_t options)
|
show_match_options(uint32_t options)
|
||||||
{
|
{
|
||||||
fprintf(outfile, "%s%s%s%s%s%s%s%s%s%s%s",
|
fprintf(outfile, "%s%s%s%s%s%s%s%s%s%s",
|
||||||
((options & PCRE2_ANCHORED) != 0)? " anchored" : "",
|
((options & PCRE2_ANCHORED) != 0)? " anchored" : "",
|
||||||
((options & PCRE2_DFA_RESTART) != 0)? " dfa_restart" : "",
|
((options & PCRE2_DFA_RESTART) != 0)? " dfa_restart" : "",
|
||||||
((options & PCRE2_DFA_SHORTEST) != 0)? " dfa_shortest" : "",
|
((options & PCRE2_DFA_SHORTEST) != 0)? " dfa_shortest" : "",
|
||||||
((options & PCRE2_NO_START_OPTIMIZE) != 0)? " no_start_optimize" : "",
|
|
||||||
((options & PCRE2_NO_UTF_CHECK) != 0)? " no_utf_check" : "",
|
((options & PCRE2_NO_UTF_CHECK) != 0)? " no_utf_check" : "",
|
||||||
((options & PCRE2_NOTBOL) != 0)? " notbol" : "",
|
((options & PCRE2_NOTBOL) != 0)? " notbol" : "",
|
||||||
((options & PCRE2_NOTEMPTY) != 0)? " notempty" : "",
|
((options & PCRE2_NOTEMPTY) != 0)? " notempty" : "",
|
||||||
|
|
|
@ -2491,12 +2491,15 @@ a random value. /Ix
|
||||||
/xyz/auto_callout
|
/xyz/auto_callout
|
||||||
xyz
|
xyz
|
||||||
abcxyz
|
abcxyz
|
||||||
abcxyz\=no_start_optimize
|
|
||||||
** Failers
|
** Failers
|
||||||
abc
|
abc
|
||||||
abc\=no_start_optimize
|
|
||||||
abcxypqr
|
abcxypqr
|
||||||
abcxypqr\=no_start_optimize
|
|
||||||
|
/xyz/auto_callout,no_start_optimize
|
||||||
|
abcxyz
|
||||||
|
** Failers
|
||||||
|
abc
|
||||||
|
abcxypqr
|
||||||
|
|
||||||
/(*NO_START_OPT)xyz/auto_callout
|
/(*NO_START_OPT)xyz/auto_callout
|
||||||
abcxyz
|
abcxyz
|
||||||
|
@ -2987,8 +2990,10 @@ a random value. /Ix
|
||||||
|
|
||||||
/(*COMMIT)ABC/
|
/(*COMMIT)ABC/
|
||||||
ABCDEFG
|
ABCDEFG
|
||||||
|
|
||||||
|
/(*COMMIT)ABC/no_start_optimize
|
||||||
** Failers
|
** Failers
|
||||||
DEFGABC\=no_start_optimize
|
DEFGABC
|
||||||
|
|
||||||
/^(ab (c+(*THEN)cd) | xyz)/x
|
/^(ab (c+(*THEN)cd) | xyz)/x
|
||||||
abcccd
|
abcccd
|
||||||
|
|
|
@ -4349,12 +4349,15 @@
|
||||||
/xyz/auto_callout
|
/xyz/auto_callout
|
||||||
xyz
|
xyz
|
||||||
abcxyz
|
abcxyz
|
||||||
abcxyz\=no_start_optimize
|
|
||||||
** Failers
|
** Failers
|
||||||
abc
|
abc
|
||||||
abc\=no_start_optimize
|
|
||||||
abcxypqr
|
abcxypqr
|
||||||
abcxypqr\=no_start_optimize
|
|
||||||
|
/xyz/auto_callout,no_start_optimize
|
||||||
|
abcxyz
|
||||||
|
** Failers
|
||||||
|
abc
|
||||||
|
abcxypqr
|
||||||
|
|
||||||
/(*NO_START_OPT)xyz/auto_callout
|
/(*NO_START_OPT)xyz/auto_callout
|
||||||
abcxyz
|
abcxyz
|
||||||
|
@ -4439,20 +4442,14 @@
|
||||||
/(abc|def|xyz)/I
|
/(abc|def|xyz)/I
|
||||||
terhjk;abcdaadsfe
|
terhjk;abcdaadsfe
|
||||||
the quick xyz brown fox
|
the quick xyz brown fox
|
||||||
terhjk;abcdaadsfe\=no_start_optimize
|
|
||||||
the quick xyz brown fox\=no_start_optimize
|
|
||||||
** Failers
|
** Failers
|
||||||
thejk;adlfj aenjl;fda asdfasd ehj;kjxyasiupd
|
thejk;adlfj aenjl;fda asdfasd ehj;kjxyasiupd
|
||||||
thejk;adlfj aenjl;fda asdfasd ehj;kjxyasiupd\=no_start_optimize
|
|
||||||
|
|
||||||
/(abc|def|xyz)/I
|
/(abc|def|xyz)/I,no_start_optimize
|
||||||
terhjk;abcdaadsfe
|
terhjk;abcdaadsfe
|
||||||
the quick xyz brown fox
|
the quick xyz brown fox
|
||||||
terhjk;abcdaadsfe\=no_start_optimize
|
|
||||||
the quick xyz brown fox\=no_start_optimize
|
|
||||||
** Failers
|
** Failers
|
||||||
thejk;adlfj aenjl;fda asdfasd ehj;kjxyasiupd
|
thejk;adlfj aenjl;fda asdfasd ehj;kjxyasiupd
|
||||||
thejk;adlfj aenjl;fda asdfasd ehj;kjxyasiupd\=no_start_optimize
|
|
||||||
|
|
||||||
/abcd*/aftertext
|
/abcd*/aftertext
|
||||||
xxxxabcd\=ps
|
xxxxabcd\=ps
|
||||||
|
|
|
@ -8941,7 +8941,15 @@ Subject length lower bound = 1
|
||||||
+2 ^ ^ z
|
+2 ^ ^ z
|
||||||
+3 ^ ^
|
+3 ^ ^
|
||||||
0: xyz
|
0: xyz
|
||||||
abcxyz\=no_start_optimize
|
** Failers
|
||||||
|
No match
|
||||||
|
abc
|
||||||
|
No match
|
||||||
|
abcxypqr
|
||||||
|
No match
|
||||||
|
|
||||||
|
/xyz/auto_callout,no_start_optimize
|
||||||
|
abcxyz
|
||||||
--->abcxyz
|
--->abcxyz
|
||||||
+0 ^ x
|
+0 ^ x
|
||||||
+0 ^ x
|
+0 ^ x
|
||||||
|
@ -8952,10 +8960,20 @@ Subject length lower bound = 1
|
||||||
+3 ^ ^
|
+3 ^ ^
|
||||||
0: xyz
|
0: xyz
|
||||||
** Failers
|
** Failers
|
||||||
|
--->** Failers
|
||||||
|
+0 ^ x
|
||||||
|
+0 ^ x
|
||||||
|
+0 ^ x
|
||||||
|
+0 ^ x
|
||||||
|
+0 ^ x
|
||||||
|
+0 ^ x
|
||||||
|
+0 ^ x
|
||||||
|
+0 ^ x
|
||||||
|
+0 ^ x
|
||||||
|
+0 ^ x
|
||||||
|
+0 ^ x
|
||||||
No match
|
No match
|
||||||
abc
|
abc
|
||||||
No match
|
|
||||||
abc\=no_start_optimize
|
|
||||||
--->abc
|
--->abc
|
||||||
+0 ^ x
|
+0 ^ x
|
||||||
+0 ^ x
|
+0 ^ x
|
||||||
|
@ -8963,8 +8981,6 @@ No match
|
||||||
+0 ^ x
|
+0 ^ x
|
||||||
No match
|
No match
|
||||||
abcxypqr
|
abcxypqr
|
||||||
No match
|
|
||||||
abcxypqr\=no_start_optimize
|
|
||||||
--->abcxypqr
|
--->abcxypqr
|
||||||
+0 ^ x
|
+0 ^ x
|
||||||
+0 ^ x
|
+0 ^ x
|
||||||
|
@ -10182,9 +10198,11 @@ No match, mark = A
|
||||||
/(*COMMIT)ABC/
|
/(*COMMIT)ABC/
|
||||||
ABCDEFG
|
ABCDEFG
|
||||||
0: ABC
|
0: ABC
|
||||||
|
|
||||||
|
/(*COMMIT)ABC/no_start_optimize
|
||||||
** Failers
|
** Failers
|
||||||
No match
|
No match
|
||||||
DEFGABC\=no_start_optimize
|
DEFGABC
|
||||||
No match
|
No match
|
||||||
|
|
||||||
/^(ab (c+(*THEN)cd) | xyz)/x
|
/^(ab (c+(*THEN)cd) | xyz)/x
|
||||||
|
|
|
@ -6882,7 +6882,15 @@ No match
|
||||||
+2 ^ ^ z
|
+2 ^ ^ z
|
||||||
+3 ^ ^
|
+3 ^ ^
|
||||||
0: xyz
|
0: xyz
|
||||||
abcxyz\=no_start_optimize
|
** Failers
|
||||||
|
No match
|
||||||
|
abc
|
||||||
|
No match
|
||||||
|
abcxypqr
|
||||||
|
No match
|
||||||
|
|
||||||
|
/xyz/auto_callout,no_start_optimize
|
||||||
|
abcxyz
|
||||||
--->abcxyz
|
--->abcxyz
|
||||||
+0 ^ x
|
+0 ^ x
|
||||||
+0 ^ x
|
+0 ^ x
|
||||||
|
@ -6893,10 +6901,20 @@ No match
|
||||||
+3 ^ ^
|
+3 ^ ^
|
||||||
0: xyz
|
0: xyz
|
||||||
** Failers
|
** Failers
|
||||||
|
--->** Failers
|
||||||
|
+0 ^ x
|
||||||
|
+0 ^ x
|
||||||
|
+0 ^ x
|
||||||
|
+0 ^ x
|
||||||
|
+0 ^ x
|
||||||
|
+0 ^ x
|
||||||
|
+0 ^ x
|
||||||
|
+0 ^ x
|
||||||
|
+0 ^ x
|
||||||
|
+0 ^ x
|
||||||
|
+0 ^ x
|
||||||
No match
|
No match
|
||||||
abc
|
abc
|
||||||
No match
|
|
||||||
abc\=no_start_optimize
|
|
||||||
--->abc
|
--->abc
|
||||||
+0 ^ x
|
+0 ^ x
|
||||||
+0 ^ x
|
+0 ^ x
|
||||||
|
@ -6904,8 +6922,6 @@ No match
|
||||||
+0 ^ x
|
+0 ^ x
|
||||||
No match
|
No match
|
||||||
abcxypqr
|
abcxypqr
|
||||||
No match
|
|
||||||
abcxypqr\=no_start_optimize
|
|
||||||
--->abcxypqr
|
--->abcxypqr
|
||||||
+0 ^ x
|
+0 ^ x
|
||||||
+0 ^ x
|
+0 ^ x
|
||||||
|
@ -7091,36 +7107,25 @@ Subject length lower bound = 3
|
||||||
terhjk;abcdaadsfe
|
terhjk;abcdaadsfe
|
||||||
0: abc
|
0: abc
|
||||||
the quick xyz brown fox
|
the quick xyz brown fox
|
||||||
0: xyz
|
|
||||||
terhjk;abcdaadsfe\=no_start_optimize
|
|
||||||
0: abc
|
|
||||||
the quick xyz brown fox\=no_start_optimize
|
|
||||||
0: xyz
|
0: xyz
|
||||||
** Failers
|
** Failers
|
||||||
No match
|
No match
|
||||||
thejk;adlfj aenjl;fda asdfasd ehj;kjxyasiupd
|
thejk;adlfj aenjl;fda asdfasd ehj;kjxyasiupd
|
||||||
No match
|
No match
|
||||||
thejk;adlfj aenjl;fda asdfasd ehj;kjxyasiupd\=no_start_optimize
|
|
||||||
No match
|
|
||||||
|
|
||||||
/(abc|def|xyz)/I
|
/(abc|def|xyz)/I,no_start_optimize
|
||||||
Capturing subpattern count = 1
|
Capturing subpattern count = 1
|
||||||
|
Options: no_start_optimize
|
||||||
Starting code units: a d x
|
Starting code units: a d x
|
||||||
Subject length lower bound = 3
|
Subject length lower bound = 3
|
||||||
terhjk;abcdaadsfe
|
terhjk;abcdaadsfe
|
||||||
0: abc
|
0: abc
|
||||||
the quick xyz brown fox
|
the quick xyz brown fox
|
||||||
0: xyz
|
|
||||||
terhjk;abcdaadsfe\=no_start_optimize
|
|
||||||
0: abc
|
|
||||||
the quick xyz brown fox\=no_start_optimize
|
|
||||||
0: xyz
|
0: xyz
|
||||||
** Failers
|
** Failers
|
||||||
No match
|
No match
|
||||||
thejk;adlfj aenjl;fda asdfasd ehj;kjxyasiupd
|
thejk;adlfj aenjl;fda asdfasd ehj;kjxyasiupd
|
||||||
No match
|
No match
|
||||||
thejk;adlfj aenjl;fda asdfasd ehj;kjxyasiupd\=no_start_optimize
|
|
||||||
No match
|
|
||||||
|
|
||||||
/abcd*/aftertext
|
/abcd*/aftertext
|
||||||
xxxxabcd\=ps
|
xxxxabcd\=ps
|
||||||
|
|