More partial match tweaks.
parent f7e21162fa
commit 3572634086
@@ -104,7 +104,11 @@ within it, the nested lookbehind was not correctly processed. For example, if
 is another situation where adding characters to the current subject can
 lead to a full match. Example: /c*+(?<=[bc])/ with subject "ab".

-(b) An empty string partial hard match can be returned for \z and \Z as it
+(b) Similarly, if a pattern could match an empty string, an empty partial
+match may be given. Example: /(?![ab]).*/ with subject "ab". This case
+applies only to PCRE2_PARTIAL_HARD.
+
+(c) An empty string partial hard match can be returned for \z and \Z as it
 is documented that they shouldn't match.


@@ -2057,7 +2057,7 @@ the following negative numbers:
 PCRE2_ERROR_BADOPTION the value of <i>what</i> was invalid
 PCRE2_ERROR_UNSET the requested field is not set
 </pre>
-The "magic number" is placed at the start of each compiled pattern as an simple
+The "magic number" is placed at the start of each compiled pattern as a simple
 check against passing an arbitrary memory pointer. Here is a typical call of
 <b>pcre2_pattern_info()</b>, to obtain the length of the compiled pattern:
 <pre>
@@ -2114,7 +2114,7 @@ options returned for PCRE2_INFO_ALLOPTIONS.
 PCRE2_INFO_BACKREFMAX
 </pre>
 Return the number of the highest backreference in the pattern. The third
-argument should point to an <b>uint32_t</b> variable. Named capture groups
+argument should point to a <b>uint32_t</b> variable. Named capture groups
 acquire numbers as well as names, and these count towards the highest
 backreference. Backreferences such as \4 or \g{12} match the captured
 characters of the given group, but in addition, the check that a capture
@@ -2132,7 +2132,7 @@ that \R matches only CR, LF, or CRLF.
 </pre>
 Return the highest capture group number in the pattern. In patterns where (?|
 is not used, this is also the total number of capture groups. The third
-argument should point to an <b>uint32_t</b> variable.
+argument should point to a <b>uint32_t</b> variable.
 <pre>
 PCRE2_INFO_DEPTHLIMIT
 </pre>
@@ -2157,7 +2157,7 @@ returned. Otherwise NULL is returned. The third argument should point to a
 PCRE2_INFO_FIRSTCODETYPE
 </pre>
 Return information about the first code unit of any matched string, for a
-non-anchored pattern. The third argument should point to an <b>uint32_t</b>
+non-anchored pattern. The third argument should point to a <b>uint32_t</b>
 variable. If there is a fixed first value, for example, the letter "c" from a
 pattern such as (cat|cow|coyote), 1 is returned, and the value can be retrieved
 using PCRE2_INFO_FIRSTCODEUNIT. If there is no fixed first value, but it is
@@ -2169,7 +2169,7 @@ is returned.
 </pre>
 Return the value of the first code unit of any matched string for a pattern
 where PCRE2_INFO_FIRSTCODETYPE returns 1; otherwise return 0. The third
-argument should point to an <b>uint32_t</b> variable. In the 8-bit library, the
+argument should point to a <b>uint32_t</b> variable. In the 8-bit library, the
 value is always less than 256. In the 16-bit library the value can be up to
 0xffff. In the 32-bit library in UTF-32 mode the value can be up to 0x10ffff,
 and up to 0xffffffff when not using UTF-32 mode.
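The value ranges quoted in the hunk above explain why the third argument is a uint32_t in every library width. A minimal sketch of those bounds follows; `max_first_code_unit` is a hypothetical helper written for illustration, not a PCRE2 function.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper (not part of PCRE2): the largest first-code-unit
   value that the documentation text allows for a library with the given
   code unit width in bits. */
static uint32_t max_first_code_unit(int width_bits, bool utf)
{
  switch (width_bits)
    {
    case 8:  return 0xffu;                          /* "always less than 256" */
    case 16: return 0xffffu;                        /* up to 0xffff */
    case 32: return utf ? 0x10ffffu : 0xffffffffu;  /* UTF-32 vs. non-UTF */
    default: return 0;                              /* width not covered here */
    }
}
```

Only the 32-bit non-UTF case needs the full range of a uint32_t; using the same type for all widths keeps the calling code identical across libraries.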
@@ -2185,12 +2185,12 @@ pattern. Each additional capture group adds two PCRE2_SIZE variables.
 PCRE2_INFO_HASBACKSLASHC
 </pre>
 Return 1 if the pattern contains any instances of \C, otherwise 0. The third
-argument should point to an <b>uint32_t</b> variable.
+argument should point to a <b>uint32_t</b> variable.
 <pre>
 PCRE2_INFO_HASCRORLF
 </pre>
 Return 1 if the pattern contains any explicit matches for CR or LF characters,
-otherwise 0. The third argument should point to an <b>uint32_t</b> variable. An
+otherwise 0. The third argument should point to a <b>uint32_t</b> variable. An
 explicit match is either a literal CR or LF character, or \r or \n or one of
 the equivalent hexadecimal or octal escape sequences.
 <pre>
@@ -2206,7 +2206,7 @@ defaulted by the caller of the match function.
 PCRE2_INFO_JCHANGED
 </pre>
 Return 1 if the (?J) or (?-J) option setting is used in the pattern, otherwise
-0. The third argument should point to an <b>uint32_t</b> variable. (?J) and
+0. The third argument should point to a <b>uint32_t</b> variable. (?J) and
 (?-J) set and unset the local PCRE2_DUPNAMES option, respectively.
 <pre>
 PCRE2_INFO_JITSIZE
@@ -2218,7 +2218,7 @@ return zero. The third argument should point to a <b>size_t</b> variable.
 PCRE2_INFO_LASTCODETYPE
 </pre>
 Returns 1 if there is a rightmost literal code unit that must exist in any
-matched string, other than at its start. The third argument should point to an
+matched string, other than at its start. The third argument should point to a
 <b>uint32_t</b> variable. If there is no such value, 0 is returned. When 1 is
 returned, the code unit value itself can be retrieved using
 PCRE2_INFO_LASTCODEUNIT. For anchored patterns, a last literal value is
@@ -2231,12 +2231,12 @@ PCRE2_INFO_LASTCODEUNIT), but for /^a\dz\d/ the returned value is 0.
 Return the value of the rightmost literal code unit that must exist in any
 matched string, other than at its start, for a pattern where
 PCRE2_INFO_LASTCODETYPE returns 1. Otherwise, return 0. The third argument
-should point to an <b>uint32_t</b> variable.
+should point to a <b>uint32_t</b> variable.
 <pre>
 PCRE2_INFO_MATCHEMPTY
 </pre>
 Return 1 if the pattern might match an empty string, otherwise 0. The third
-argument should point to an <b>uint32_t</b> variable. When a pattern contains
+argument should point to a <b>uint32_t</b> variable. When a pattern contains
 recursive subroutine calls it is not always possible to determine whether or
 not it can match an empty string. PCRE2 takes a cautious approach and returns 1
 in such cases.
@@ -2279,7 +2279,7 @@ If a minimum length for matching subject strings was computed, its value is
 returned. Otherwise the returned value is 0. This value is not computed when
 PCRE2_NO_START_OPTIMIZE is set. The value is a number of characters, which in
 UTF mode may be different from the number of code units. The third argument
-should point to an <b>uint32_t</b> variable. The value is a lower bound to the
+should point to a <b>uint32_t</b> variable. The value is a lower bound to the
 length of any matching string. There may not be any strings of that length that
 do actually match, but every string that does match is at least that long.
 <pre>
@@ -2726,7 +2726,8 @@ Your program may crash or loop indefinitely or give wrong results.
 These options turn on the partial matching feature. A partial match occurs if
 the end of the subject string is reached successfully, but there are not enough
 subject characters to complete the match. In addition, either at least one
-character must have been inspected or the pattern must contain a lookbehind.
+character must have been inspected or the pattern must contain a lookbehind, or
+the pattern must be one that could match an empty string.
 </P>
 <P>
 If this situation arises when PCRE2_PARTIAL_SOFT (but not PCRE2_PARTIAL_HARD)
@@ -3850,7 +3851,7 @@ Cambridge, England.
 </P>
 <br><a name="SEC42" href="#TOC1">REVISION</a><br>
 <P>
-Last updated: 20 July 2019
+Last updated: 22 July 2019
 <br>
 Copyright © 1997-2019 University of Cambridge.
 <br>

@@ -80,15 +80,17 @@ is also disabled for partial matching.
 A partial match occurs during a call to <b>pcre2_match()</b> when the end of the
 subject string is reached successfully, but matching cannot continue because
 more characters are needed, and in addition, either at least one character in
-the subject has been inspected or the pattern contains a lookbehind. An
+the subject has been inspected or the pattern contains a lookbehind, or (when
+PCRE2_PARTIAL_HARD is set) the pattern could match an empty string. An
 inspected character need not form part of the final matched string; lookbehind
 assertions and the \K escape sequence provide ways of inspecting characters
 before the start of a matched string.
 </P>
 <P>
-The two additional requirements define the cases where adding more characters
-to the existing subject may complete the match. Without these conditions there
-would be a partial match of an empty string at the end of the subject for all
+The three additional requirements define the cases where adding more characters
+to the existing subject may complete the same match that would occur if they
+had all been present in the first place. Without these conditions there would
+be a partial match of an empty string at the end of the subject for all
 unanchored patterns (and also for anchored patterns if the subject itself is
 empty).
 </P>
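The three conditions listed in the documentation hunk above can be read as a single predicate. The following is a self-contained sketch of that reading, not code taken from PCRE2: the enum `partial_mode`, the function `partial_match_eligible`, and its parameters are hypothetical names chosen for illustration, and it assumes the end of the subject has already been reached with more characters needed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for "no partial matching", PCRE2_PARTIAL_SOFT and
   PCRE2_PARTIAL_HARD. The real options are bit flags given to pcre2_match(). */
typedef enum { PARTIAL_NONE, PARTIAL_SOFT, PARTIAL_HARD } partial_mode;

/* Returns true when running out of subject may be reported as a partial
   match, following the three additional requirements in the text. */
static bool partial_match_eligible(partial_mode mode,
                                   size_t chars_inspected,
                                   bool has_lookbehind,
                                   bool can_match_empty)
{
  if (mode == PARTIAL_NONE) return false;          /* feature not requested */
  if (chars_inspected > 0) return true;            /* 1: something was inspected */
  if (has_lookbehind) return true;                 /* 2: pattern has a lookbehind */
  return mode == PARTIAL_HARD && can_match_empty;  /* 3: hard partial only */
}
```

Under this reading, a pattern such as /(?![ab]).*/ tried at the very end of subject "ab" with hard partial matching would satisfy only the third condition, which agrees with the empty "Partial match:" lines added to the test output in this commit.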
@@ -449,7 +451,7 @@ Cambridge, England.
 </P>
 <br><a name="SEC10" href="#TOC1">REVISION</a><br>
 <P>
-Last updated: 21 July 2019
+Last updated: 22 July 2019
 <br>
 Copyright © 1997-2019 University of Cambridge.
 <br>

@@ -2013,8 +2013,8 @@ INFORMATION ABOUT A COMPILED PATTERN
 PCRE2_ERROR_BADOPTION the value of what was invalid
 PCRE2_ERROR_UNSET the requested field is not set

-The "magic number" is placed at the start of each compiled pattern as
-an simple check against passing an arbitrary memory pointer. Here is a
+The "magic number" is placed at the start of each compiled pattern as a
+simple check against passing an arbitrary memory pointer. Here is a
 typical call of pcre2_pattern_info(), to obtain the length of the com-
 piled pattern:

@@ -2073,7 +2073,7 @@ INFORMATION ABOUT A COMPILED PATTERN
 PCRE2_INFO_BACKREFMAX

 Return the number of the highest backreference in the pattern. The
-third argument should point to an uint32_t variable. Named capture
+third argument should point to a uint32_t variable. Named capture
 groups acquire numbers as well as names, and these count towards the
 highest backreference. Backreferences such as \4 or \g{12} match the
 captured characters of the given group, but in addition, the check that
@@ -2091,7 +2091,7 @@ INFORMATION ABOUT A COMPILED PATTERN

 Return the highest capture group number in the pattern. In patterns
 where (?| is not used, this is also the total number of capture groups.
-The third argument should point to an uint32_t variable.
+The third argument should point to a uint32_t variable.

 PCRE2_INFO_DEPTHLIMIT

@@ -2117,7 +2117,7 @@ INFORMATION ABOUT A COMPILED PATTERN
 PCRE2_INFO_FIRSTCODETYPE

 Return information about the first code unit of any matched string, for
-a non-anchored pattern. The third argument should point to an uint32_t
+a non-anchored pattern. The third argument should point to a uint32_t
 variable. If there is a fixed first value, for example, the letter "c"
 from a pattern such as (cat|cow|coyote), 1 is returned, and the value
 can be retrieved using PCRE2_INFO_FIRSTCODEUNIT. If there is no fixed
@@ -2129,7 +2129,7 @@ INFORMATION ABOUT A COMPILED PATTERN

 Return the value of the first code unit of any matched string for a
 pattern where PCRE2_INFO_FIRSTCODETYPE returns 1; otherwise return 0.
-The third argument should point to an uint32_t variable. In the 8-bit
+The third argument should point to a uint32_t variable. In the 8-bit
 library, the value is always less than 256. In the 16-bit library the
 value can be up to 0xffff. In the 32-bit library in UTF-32 mode the
 value can be up to 0x10ffff, and up to 0xffffffff when not using UTF-32
@@ -2147,12 +2147,12 @@ INFORMATION ABOUT A COMPILED PATTERN
 PCRE2_INFO_HASBACKSLASHC

 Return 1 if the pattern contains any instances of \C, otherwise 0. The
-third argument should point to an uint32_t variable.
+third argument should point to a uint32_t variable.

 PCRE2_INFO_HASCRORLF

 Return 1 if the pattern contains any explicit matches for CR or LF
-characters, otherwise 0. The third argument should point to an uint32_t
+characters, otherwise 0. The third argument should point to a uint32_t
 variable. An explicit match is either a literal CR or LF character, or
 \r or \n or one of the equivalent hexadecimal or octal escape se-
 quences.
@@ -2169,7 +2169,7 @@ INFORMATION ABOUT A COMPILED PATTERN
 PCRE2_INFO_JCHANGED

 Return 1 if the (?J) or (?-J) option setting is used in the pattern,
-otherwise 0. The third argument should point to an uint32_t variable.
+otherwise 0. The third argument should point to a uint32_t variable.
 (?J) and (?-J) set and unset the local PCRE2_DUPNAMES option, respec-
 tively.

@@ -2183,28 +2183,28 @@ INFORMATION ABOUT A COMPILED PATTERN

 Returns 1 if there is a rightmost literal code unit that must exist in
 any matched string, other than at its start. The third argument should
-point to an uint32_t variable. If there is no such value, 0 is re-
-turned. When 1 is returned, the code unit value itself can be retrieved
-using PCRE2_INFO_LASTCODEUNIT. For anchored patterns, a last literal
-value is recorded only if it follows something of variable length. For
-example, for the pattern /^a\d+z\d+/ the returned value is 1 (with "z"
-returned from PCRE2_INFO_LASTCODEUNIT), but for /^a\dz\d/ the returned
-value is 0.
+point to a uint32_t variable. If there is no such value, 0 is returned.
+When 1 is returned, the code unit value itself can be retrieved using
+PCRE2_INFO_LASTCODEUNIT. For anchored patterns, a last literal value is
+recorded only if it follows something of variable length. For example,
+for the pattern /^a\d+z\d+/ the returned value is 1 (with "z" returned
+from PCRE2_INFO_LASTCODEUNIT), but for /^a\dz\d/ the returned value is
+0.

 PCRE2_INFO_LASTCODEUNIT

 Return the value of the rightmost literal code unit that must exist in
 any matched string, other than at its start, for a pattern where
 PCRE2_INFO_LASTCODETYPE returns 1. Otherwise, return 0. The third argu-
-ment should point to an uint32_t variable.
+ment should point to a uint32_t variable.

 PCRE2_INFO_MATCHEMPTY

 Return 1 if the pattern might match an empty string, otherwise 0. The
-third argument should point to an uint32_t variable. When a pattern
-contains recursive subroutine calls it is not always possible to deter-
-mine whether or not it can match an empty string. PCRE2 takes a cau-
-tious approach and returns 1 in such cases.
+third argument should point to a uint32_t variable. When a pattern con-
+tains recursive subroutine calls it is not always possible to determine
+whether or not it can match an empty string. PCRE2 takes a cautious ap-
+proach and returns 1 in such cases.

 PCRE2_INFO_MATCHLIMIT

@@ -2244,7 +2244,7 @@ INFORMATION ABOUT A COMPILED PATTERN
 value is returned. Otherwise the returned value is 0. This value is not
 computed when PCRE2_NO_START_OPTIMIZE is set. The value is a number of
 characters, which in UTF mode may be different from the number of code
-units. The third argument should point to an uint32_t variable. The
+units. The third argument should point to a uint32_t variable. The
 value is a lower bound to the length of any matching string. There may
 not be any strings of that length that do actually match, but every
 string that does match is at least that long.
@@ -2663,7 +2663,8 @@ MATCHING A PATTERN: THE TRADITIONAL FUNCTION
 curs if the end of the subject string is reached successfully, but
 there are not enough subject characters to complete the match. In addi-
 tion, either at least one character must have been inspected or the
-pattern must contain a lookbehind.
+pattern must contain a lookbehind, or the pattern must be one that
+could match an empty string.

 If this situation arises when PCRE2_PARTIAL_SOFT (but not PCRE2_PAR-
 TIAL_HARD) is set, matching continues by testing any remaining alterna-
@@ -3705,7 +3706,7 @@ AUTHOR

 REVISION

-Last updated: 20 July 2019
+Last updated: 22 July 2019
 Copyright (c) 1997-2019 University of Cambridge.
 ------------------------------------------------------------------------------

@@ -5703,13 +5704,15 @@ PARTIAL MATCHING USING pcre2_match()
 the subject string is reached successfully, but matching cannot con-
 tinue because more characters are needed, and in addition, either at
 least one character in the subject has been inspected or the pattern
-contains a lookbehind. An inspected character need not form part of the
-final matched string; lookbehind assertions and the \K escape sequence
-provide ways of inspecting characters before the start of a matched
-string.
+contains a lookbehind, or (when PCRE2_PARTIAL_HARD is set) the pattern
+could match an empty string. An inspected character need not form part
+of the final matched string; lookbehind assertions and the \K escape
+sequence provide ways of inspecting characters before the start of a
+matched string.

-The two additional requirements define the cases where adding more
-characters to the existing subject may complete the match. Without
+The three additional requirements define the cases where adding more
+characters to the existing subject may complete the same match that
+would occur if they had all been present in the first place. Without
 these conditions there would be a partial match of an empty string at
 the end of the subject for all unanchored patterns (and also for an-
 chored patterns if the subject itself is empty).
@@ -6068,7 +6071,7 @@ AUTHOR

 REVISION

-Last updated: 21 July 2019
+Last updated: 22 July 2019
 Copyright (c) 1997-2019 University of Cambridge.
 ------------------------------------------------------------------------------


@@ -2720,7 +2720,8 @@ Your program may crash or loop indefinitely or give wrong results.
 These options turn on the partial matching feature. A partial match occurs if
 the end of the subject string is reached successfully, but there are not enough
 subject characters to complete the match. In addition, either at least one
-character must have been inspected or the pattern must contain a lookbehind.
+character must have been inspected or the pattern must contain a lookbehind, or
+the pattern must be one that could match an empty string.
 .P
 If this situation arises when PCRE2_PARTIAL_SOFT (but not PCRE2_PARTIAL_HARD)
 is set, matching continues by testing any remaining alternatives. Only if no

@@ -1,4 +1,4 @@
-.TH PCRE2PARTIAL 3 "21 July 2019" "PCRE2 10.34"
+.TH PCRE2PARTIAL 3 "22 July 2019" "PCRE2 10.34"
 .SH NAME
 PCRE2 - Perl-compatible regular expressions
 .SH "PARTIAL MATCHING IN PCRE2"
@@ -56,14 +56,16 @@ is also disabled for partial matching.
 A partial match occurs during a call to \fBpcre2_match()\fP when the end of the
 subject string is reached successfully, but matching cannot continue because
 more characters are needed, and in addition, either at least one character in
-the subject has been inspected or the pattern contains a lookbehind. An
+the subject has been inspected or the pattern contains a lookbehind, or (when
+PCRE2_PARTIAL_HARD is set) the pattern could match an empty string. An
 inspected character need not form part of the final matched string; lookbehind
 assertions and the \eK escape sequence provide ways of inspecting characters
 before the start of a matched string.
 .P
-The two additional requirements define the cases where adding more characters
-to the existing subject may complete the match. Without these conditions there
-would be a partial match of an empty string at the end of the subject for all
+The three additional requirements define the cases where adding more characters
+to the existing subject may complete the same match that would occur if they
+had all been present in the first place. Without these conditions there would
+be a partial match of an empty string at the end of the subject for all
 unanchored patterns (and also for anchored patterns if the subject itself is
 empty).
 .P
@@ -422,6 +424,6 @@ Cambridge, England.
 .rs
 .sp
 .nf
-Last updated: 21 July 2019
+Last updated: 22 July 2019
 Copyright (c) 1997-2019 University of Cambridge.
 .fi

@@ -3185,8 +3185,8 @@ for (;;)
 ptr >= end_subject && /* End of subject and */
 ( /* either */
 ptr > mb->start_used_ptr || /* Inspected non-empty string */
-mb->haslookbehind /* or pattern has lookbehind */
-)
+mb->allowemptypartial /* or pattern has lookbehind */
+) /* or could match empty */
 )
 ))
 match_count = PCRE2_ERROR_PARTIAL;
@@ -3417,7 +3417,8 @@ mb->tables = re->tables;
 mb->start_subject = subject;
 mb->end_subject = end_subject;
 mb->start_offset = start_offset;
-mb->haslookbehind = (re->max_lookbehind > 0);
+mb->allowemptypartial = (re->max_lookbehind > 0) ||
+(re->flags & PCRE2_MATCH_EMPTY) != 0;
 mb->moptions = options;
 mb->poptions = re->overall_options;
 mb->match_call_count = 0;

@@ -854,7 +854,7 @@ typedef struct match_block {
 uint32_t match_call_count; /* Number of times a new frame is created */
 BOOL hitend; /* Hit the end of the subject at some point */
 BOOL hasthen; /* Pattern contains (*THEN) */
-BOOL haslookbehind; /* Pattern contains sigificant lookbehind */
+BOOL allowemptypartial; /* Allow empty hard partial */
 const uint8_t *lcc; /* Points to lower casing table */
 const uint8_t *fcc; /* Points to case-flipping table */
 const uint8_t *ctypes; /* Points to table of type maps */
@@ -910,7 +910,7 @@ typedef struct dfa_match_block {
 uint32_t poptions; /* Pattern options */
 uint32_t nltype; /* Newline type */
 uint32_t nllen; /* Newline string length */
-BOOL haslookbehind; /* Pattern contains significant lookbehind */
+BOOL allowemptypartial; /* Allow empty hard partial */
 PCRE2_UCHAR nl[4]; /* Newline string when fixed */
 uint16_t bsr_convention; /* \R interpretation */
 pcre2_callout_block *cb; /* Points to a callout block */

@@ -508,7 +508,8 @@ A partial match is returned only if no complete match can be found. */
 }

 #define SCHECK_PARTIAL()\
-if (mb->partial != 0 && (Feptr > mb->start_used_ptr || mb->haslookbehind)) \
+if (mb->partial != 0 && \
+(Feptr > mb->start_used_ptr || mb->allowemptypartial)) \
 { \
 mb->hitend = TRUE; \
 if (mb->partial > 1) return PCRE2_ERROR_PARTIAL; \
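The part of the revised SCHECK_PARTIAL() macro visible in the hunk above can be modelled as an ordinary function to show how soft and hard partial matching differ. This is only a sketch under stated assumptions: `sketch_match_block`, `sketch_check_partial`, and the two `SKETCH_` constants are hypothetical names, the struct holds just the fields the hunk mentions, and it assumes that `partial` is 1 for soft and 2 for hard partial matching.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_CONTINUE 0          /* hypothetical: carry on matching */
#define SKETCH_ERROR_PARTIAL (-2)  /* stand-in for PCRE2_ERROR_PARTIAL */

/* Hypothetical cut-down match block with only the fields used here. */
typedef struct {
  int partial;                 /* assumed: 0 = off, 1 = soft, 2 = hard */
  bool hitend;                 /* set when the end of the subject was hit */
  bool allowemptypartial;      /* lookbehind present or pattern may match empty */
  const char *start_used_ptr;  /* earliest subject position inspected */
} sketch_match_block;

/* Function form of the visible macro lines: record that the end was hit,
   and give up at once with a partial result only for hard partial matching. */
static int sketch_check_partial(sketch_match_block *mb, const char *eptr)
{
  if (mb->partial != 0 &&
      (eptr > mb->start_used_ptr || mb->allowemptypartial))
    {
    mb->hitend = true;
    if (mb->partial > 1) return SKETCH_ERROR_PARTIAL;
    }
  return SKETCH_CONTINUE;
}
```

With soft partial matching the function only sets `hitend`, so the matcher can go on to look for a complete match. With hard partial matching it returns immediately, which is the case where an empty partial match can now be reported.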
@@ -6451,7 +6452,8 @@ mb->start_subject = subject;
 mb->start_offset = start_offset;
 mb->end_subject = end_subject;
 mb->hasthen = (re->flags & PCRE2_HASTHEN) != 0;
-mb->haslookbehind = (re->max_lookbehind > 0);
+mb->allowemptypartial = (re->max_lookbehind > 0) ||
+(re->flags & PCRE2_MATCH_EMPTY) != 0;
 mb->poptions = re->overall_options; /* Pattern options */
 mb->ignore_skip_arg = 0;
 mb->mark = mb->nomatch_mark = NULL; /* In case never set */

@@ -5719,4 +5719,10 @@ a)"xI
 abc\n\=ph,no_jit
 abc\n\=ps

+/(?![ab]).*/
+ab\=ph,no_jit
+
+/c*+/
+ab\=ph,offset=2,no_jit
+
 # End of testinput2

@@ -5020,4 +5020,10 @@
 bxyz
 xyz

+/(?![ab]).*/
+ab\=ph
+
+/c*+/
+ab\=ph,offset=2
+
 # End of testinput6

@@ -17231,6 +17231,14 @@ Partial match: \x0a
 abc\n\=ps
 0:

+/(?![ab]).*/
+ab\=ph,no_jit
+Partial match:
+
+/c*+/
+ab\=ph,offset=2,no_jit
+Partial match:
+
 # End of testinput2
 Error -70: PCRE2_ERROR_BADDATA (unknown error number)
 Error -62: bad serialized data

@@ -7887,4 +7887,12 @@ Partial match:
 xyz
 0:

+/(?![ab]).*/
+ab\=ph
+Partial match:
+
+/c*+/
+ab\=ph,offset=2
+Partial match:
+
 # End of testinput6