Allow non-ASCII in group names when UTF is set; revise group naming terminology
in documentation to use "capture group", as Perl does.
parent a657d4cff8
commit d7b10a57d1
|
@@ -121,6 +121,9 @@ the option applies only to unrecognized or malformed escape sequences.
|
||||||
tests such as (?(VERSION>=0)...) when the version test was true. Incorrect
|
tests such as (?(VERSION>=0)...) when the version test was true. Incorrect
|
||||||
processing or a crash could result.
|
processing or a crash could result.
|
||||||
|
|
||||||
|
30. When PCRE2_UTF is set, allow non-ASCII letters and decimal digits in group
|
||||||
|
names, as Perl does.
|
||||||
|
|
||||||
|
|
||||||
Version 10.32 10-September-2018
|
Version 10.32 10-September-2018
|
||||||
-------------------------------
|
-------------------------------
|
||||||
|
|
|
@@ -27,8 +27,8 @@ DESCRIPTION
|
||||||
</b><br>
|
</b><br>
|
||||||
<P>
|
<P>
|
||||||
This convenience function finds, for a compiled pattern, the first and last
|
This convenience function finds, for a compiled pattern, the first and last
|
||||||
entries for a given name in the table that translates capturing parenthesis
|
entries for a given name in the table that translates capture group names into
|
||||||
names into numbers.
|
numbers.
|
||||||
<pre>
|
<pre>
|
||||||
<i>code</i> Compiled regular expression
|
<i>code</i> Compiled regular expression
|
||||||
<i>name</i> Name whose entries required
|
<i>name</i> Name whose entries required
|
||||||
|
|
|
@@ -49,7 +49,7 @@ please consult the man page, in case the conversion went wrong.
|
||||||
<li><a name="TOC34" href="#SEC34">EXTRACTING A LIST OF ALL CAPTURED SUBSTRINGS</a>
|
<li><a name="TOC34" href="#SEC34">EXTRACTING A LIST OF ALL CAPTURED SUBSTRINGS</a>
|
||||||
<li><a name="TOC35" href="#SEC35">EXTRACTING CAPTURED SUBSTRINGS BY NAME</a>
|
<li><a name="TOC35" href="#SEC35">EXTRACTING CAPTURED SUBSTRINGS BY NAME</a>
|
||||||
<li><a name="TOC36" href="#SEC36">CREATING A NEW STRING WITH SUBSTITUTIONS</a>
|
<li><a name="TOC36" href="#SEC36">CREATING A NEW STRING WITH SUBSTITUTIONS</a>
|
||||||
<li><a name="TOC37" href="#SEC37">DUPLICATE SUBPATTERN NAMES</a>
|
<li><a name="TOC37" href="#SEC37">DUPLICATE CAPTURE GROUP NAMES</a>
|
||||||
<li><a name="TOC38" href="#SEC38">FINDING ALL POSSIBLE MATCHES AT ONE POSITION</a>
|
<li><a name="TOC38" href="#SEC38">FINDING ALL POSSIBLE MATCHES AT ONE POSITION</a>
|
||||||
<li><a name="TOC39" href="#SEC39">MATCHING A PATTERN: THE ALTERNATIVE FUNCTION</a>
|
<li><a name="TOC39" href="#SEC39">MATCHING A PATTERN: THE ALTERNATIVE FUNCTION</a>
|
||||||
<li><a name="TOC40" href="#SEC40">SEE ALSO</a>
|
<li><a name="TOC40" href="#SEC40">SEE ALSO</a>
|
||||||
|
@@ -1490,10 +1490,10 @@ independent of the setting of PCRE2_DOTALL.
|
||||||
<pre>
|
<pre>
|
||||||
PCRE2_DUPNAMES
|
PCRE2_DUPNAMES
|
||||||
</pre>
|
</pre>
|
||||||
If this bit is set, names used to identify capturing subpatterns need not be
|
If this bit is set, names used to identify capture groups need not be unique.
|
||||||
unique. This can be helpful for certain types of pattern when it is known that
|
This can be helpful for certain types of pattern when it is known that only one
|
||||||
only one instance of the named subpattern can ever be matched. There are more
|
instance of the named group can ever be matched. There are more details of
|
||||||
details of named subpatterns below; see also the
|
named capture groups below; see also the
|
||||||
<a href="pcre2pattern.html"><b>pcre2pattern</b></a>
|
<a href="pcre2pattern.html"><b>pcre2pattern</b></a>
|
||||||
documentation.
|
documentation.
|
||||||
<pre>
|
<pre>
|
||||||
|
@@ -1526,11 +1526,11 @@ the end of the subject.
|
||||||
If this bit is set, most white space characters in the pattern are totally
|
If this bit is set, most white space characters in the pattern are totally
|
||||||
ignored except when escaped or inside a character class. However, white space
|
ignored except when escaped or inside a character class. However, white space
|
||||||
is not allowed within sequences such as (?> that introduce various
|
is not allowed within sequences such as (?> that introduce various
|
||||||
parenthesized subpatterns, nor within numerical quantifiers such as {1,3}.
|
parenthesized groups, nor within numerical quantifiers such as {1,3}. Ignorable
|
||||||
Ignorable white space is permitted between an item and a following quantifier
|
white space is permitted between an item and a following quantifier and between
|
||||||
and between a quantifier and a following + that indicates possessiveness.
|
a quantifier and a following + that indicates possessiveness. PCRE2_EXTENDED is
|
||||||
PCRE2_EXTENDED is equivalent to Perl's /x option, and it can be changed within
|
equivalent to Perl's /x option, and it can be changed within a pattern by a
|
||||||
a pattern by a (?x) option setting.
|
(?x) option setting.
|
||||||
</P>
|
</P>
|
||||||
<P>
|
<P>
|
||||||
When PCRE2 is compiled without Unicode support, PCRE2_EXTENDED recognizes as
|
When PCRE2 is compiled without Unicode support, PCRE2_EXTENDED recognizes as
|
||||||
|
@@ -1606,7 +1606,7 @@ error.
|
||||||
<pre>
|
<pre>
|
||||||
PCRE2_MATCH_UNSET_BACKREF
|
PCRE2_MATCH_UNSET_BACKREF
|
||||||
</pre>
|
</pre>
|
||||||
If this option is set, a backreference to an unset subpattern group matches an
|
If this option is set, a backreference to an unset capture group matches an
|
||||||
empty string (by default this causes the current matching alternative to fail).
|
empty string (by default this causes the current matching alternative to fail).
|
||||||
A pattern such as (\1)(a) succeeds when this option is set (assuming it can
|
A pattern such as (\1)(a) succeeds when this option is set (assuming it can
|
||||||
find an "a" in the subject), whereas it fails by default, for Perl
|
find an "a" in the subject), whereas it fails by default, for Perl
|
||||||
|
@@ -1668,7 +1668,7 @@ If this option is set, it disables the use of numbered capturing parentheses in
|
||||||
the pattern. Any opening parenthesis that is not followed by ? behaves as if it
|
the pattern. Any opening parenthesis that is not followed by ? behaves as if it
|
||||||
were followed by ?: but named parentheses can still be used for capturing (and
|
were followed by ?: but named parentheses can still be used for capturing (and
|
||||||
they acquire numbers in the usual way). This is the same as Perl's /n option.
|
they acquire numbers in the usual way). This is the same as Perl's /n option.
|
||||||
Note that, when this option is set, references to capturing groups
|
Note that, when this option is set, references to capture groups
|
||||||
(backreferences or recursion/subroutine calls) may only refer to named groups,
|
(backreferences or recursion/subroutine calls) may only refer to named groups,
|
||||||
though the reference can be by name or by number.
|
though the reference can be by name or by number.
|
||||||
<pre>
|
<pre>
|
||||||
|
@@ -1687,7 +1687,7 @@ purposes.
|
||||||
If this option is set, it disables an optimization that is applied when .* is
|
If this option is set, it disables an optimization that is applied when .* is
|
||||||
the first significant item in a top-level branch of a pattern, and all the
|
the first significant item in a top-level branch of a pattern, and all the
|
||||||
other branches also start with .* or with \A or \G or ^. The optimization is
|
other branches also start with .* or with \A or \G or ^. The optimization is
|
||||||
automatically disabled for .* if it is inside an atomic group or a capturing
|
automatically disabled for .* if it is inside an atomic group or a capture
|
||||||
group that is the subject of a backreference, or if the pattern contains
|
group that is the subject of a backreference, or if the pattern contains
|
||||||
(*PRUNE) or (*SKIP). When the optimization is not disabled, such a pattern is
|
(*PRUNE) or (*SKIP). When the optimization is not disabled, such a pattern is
|
||||||
automatically anchored if PCRE2_DOTALL is set for all the .* items and
|
automatically anchored if PCRE2_DOTALL is set for all the .* items and
|
||||||
|
@@ -2066,7 +2066,7 @@ When .* is the first significant item, anchoring is possible only when all the
|
||||||
following are true:
|
following are true:
|
||||||
<pre>
|
<pre>
|
||||||
.* is not in an atomic group
|
.* is not in an atomic group
|
||||||
.* is not in a capturing group that is the subject of a backreference
|
.* is not in a capture group that is the subject of a backreference
|
||||||
PCRE2_DOTALL is in force for .*
|
PCRE2_DOTALL is in force for .*
|
||||||
Neither (*PRUNE) nor (*SKIP) appears in the pattern
|
Neither (*PRUNE) nor (*SKIP) appears in the pattern
|
||||||
PCRE2_NO_DOTSTAR_ANCHOR is not set
|
PCRE2_NO_DOTSTAR_ANCHOR is not set
|
||||||
|
@@ -2077,12 +2077,12 @@ options returned for PCRE2_INFO_ALLOPTIONS.
|
||||||
PCRE2_INFO_BACKREFMAX
|
PCRE2_INFO_BACKREFMAX
|
||||||
</pre>
|
</pre>
|
||||||
Return the number of the highest backreference in the pattern. The third
|
Return the number of the highest backreference in the pattern. The third
|
||||||
argument should point to an <b>uint32_t</b> variable. Named subpatterns acquire
|
argument should point to an <b>uint32_t</b> variable. Named capture groups
|
||||||
numbers as well as names, and these count towards the highest backreference.
|
acquire numbers as well as names, and these count towards the highest
|
||||||
Backreferences such as \4 or \g{12} match the captured characters of the
|
backreference. Backreferences such as \4 or \g{12} match the captured
|
||||||
given group, but in addition, the check that a capturing group is set in a
|
characters of the given group, but in addition, the check that a capture
|
||||||
conditional subpattern such as (?(3)a|b) is also a backreference. Zero is
|
group is set in a conditional group such as (?(3)a|b) is also a backreference.
|
||||||
returned if there are no backreferences.
|
Zero is returned if there are no backreferences.
|
||||||
<pre>
|
<pre>
|
||||||
PCRE2_INFO_BSR
|
PCRE2_INFO_BSR
|
||||||
</pre>
|
</pre>
|
||||||
|
@@ -2093,9 +2093,9 @@ that \R matches only CR, LF, or CRLF.
|
||||||
<pre>
|
<pre>
|
||||||
PCRE2_INFO_CAPTURECOUNT
|
PCRE2_INFO_CAPTURECOUNT
|
||||||
</pre>
|
</pre>
|
||||||
Return the highest capturing subpattern number in the pattern. In patterns
|
Return the highest capture group number in the pattern. In patterns where (?|
|
||||||
where (?| is not used, this is also the total number of capturing subpatterns.
|
is not used, this is also the total number of capture groups. The third
|
||||||
The third argument should point to an <b>uint32_t</b> variable.
|
argument should point to an <b>uint32_t</b> variable.
|
||||||
<pre>
|
<pre>
|
||||||
PCRE2_INFO_DEPTHLIMIT
|
PCRE2_INFO_DEPTHLIMIT
|
||||||
</pre>
|
</pre>
|
||||||
|
@@ -2143,7 +2143,7 @@ Return the size (in bytes) of the data frames that are used to remember
|
||||||
backtracking positions when the pattern is processed by <b>pcre2_match()</b>
|
backtracking positions when the pattern is processed by <b>pcre2_match()</b>
|
||||||
without the use of JIT. The third argument should point to a <b>size_t</b>
|
without the use of JIT. The third argument should point to a <b>size_t</b>
|
||||||
variable. The frame size depends on the number of capturing parentheses in the
|
variable. The frame size depends on the number of capturing parentheses in the
|
||||||
pattern. Each additional capturing group adds two PCRE2_SIZE variables.
|
pattern. Each additional capture group adds two PCRE2_SIZE variables.
|
||||||
<pre>
|
<pre>
|
||||||
PCRE2_INFO_HASBACKSLASHC
|
PCRE2_INFO_HASBACKSLASHC
|
||||||
</pre>
|
</pre>
|
||||||
|
@@ -2267,20 +2267,20 @@ the parenthesis number. The rest of the entry is the corresponding name, zero
|
||||||
terminated.
|
terminated.
|
||||||
</P>
|
</P>
|
||||||
<P>
|
<P>
|
||||||
The names are in alphabetical order. If (?| is used to create multiple groups
|
The names are in alphabetical order. If (?| is used to create multiple capture
|
||||||
with the same number, as described in the
|
groups with the same number, as described in the
|
||||||
<a href="pcre2pattern.html#dupsubpatternnumber">section on duplicate subpattern numbers</a>
|
<a href="pcre2pattern.html#dupgroupnumber">section on duplicate group numbers</a>
|
||||||
in the
|
in the
|
||||||
<a href="pcre2pattern.html"><b>pcre2pattern</b></a>
|
<a href="pcre2pattern.html"><b>pcre2pattern</b></a>
|
||||||
page, the groups may be given the same name, but there is only one entry in the
|
page, the groups may be given the same name, but there is only one entry in the
|
||||||
table. Different names for groups of the same number are not permitted.
|
table. Different names for groups of the same number are not permitted.
|
||||||
</P>
|
</P>
|
||||||
<P>
|
<P>
|
||||||
Duplicate names for subpatterns with different numbers are permitted, but only
|
Duplicate names for capture groups with different numbers are permitted, but
|
||||||
if PCRE2_DUPNAMES is set. They appear in the table in the order in which they
|
only if PCRE2_DUPNAMES is set. They appear in the table in the order in which
|
||||||
were found in the pattern. In the absence of (?| this is the order of
|
they were found in the pattern. In the absence of (?| this is the order of
|
||||||
increasing number; when (?| is used this is not necessarily the case because
|
increasing number; when (?| is used this is not necessarily the case because
|
||||||
later subpatterns may have lower numbers.
|
later capture groups may have lower numbers.
|
||||||
</P>
|
</P>
|
||||||
<P>
|
<P>
|
||||||
As a simple example of the name/number table, consider the following pattern
|
As a simple example of the name/number table, consider the following pattern
|
||||||
|
@@ -2289,16 +2289,16 @@ space - including newlines - is ignored):
|
||||||
<pre>
|
<pre>
|
||||||
(?<date> (?<year>(\d\d)?\d\d) - (?<month>\d\d) - (?<day>\d\d) )
|
(?<date> (?<year>(\d\d)?\d\d) - (?<month>\d\d) - (?<day>\d\d) )
|
||||||
</pre>
|
</pre>
|
||||||
There are four named subpatterns, so the table has four entries, and each entry
|
There are four named capture groups, so the table has four entries, and each
|
||||||
in the table is eight bytes long. The table is as follows, with non-printing
|
entry in the table is eight bytes long. The table is as follows, with
|
||||||
bytes shown in hexadecimal, and undefined bytes shown as ??:
|
non-printing bytes shown in hexadecimal, and undefined bytes shown as ??:
|
||||||
<pre>
|
<pre>
|
||||||
00 01 d a t e 00 ??
|
00 01 d a t e 00 ??
|
||||||
00 05 d a y 00 ?? ??
|
00 05 d a y 00 ?? ??
|
||||||
00 04 m o n t h 00
|
00 04 m o n t h 00
|
||||||
00 02 y e a r 00 ??
|
00 02 y e a r 00 ??
|
||||||
</pre>
|
</pre>
|
||||||
When writing code to extract data from named subpatterns using the
|
When writing code to extract data from named capture groups using the
|
||||||
name-to-number map, remember that the length of the entries is likely to be
|
name-to-number map, remember that the length of the entries is likely to be
|
||||||
different for each compiled pattern.
|
different for each compiled pattern.
|
||||||
<pre>
|
<pre>
|
||||||
|
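The name/number table described in this hunk can be walked with a short loop. The following is a minimal sketch in C, assuming the 8-bit library's layout (two bytes of group number, most significant byte first, followed by the zero-terminated name). The helper `find_group_number` and the hard-coded `example_table` are hypothetical: in real code the table, the entry count and the entry size would come from `pcre2_pattern_info()` with PCRE2_INFO_NAMETABLE, PCRE2_INFO_NAMECOUNT and PCRE2_INFO_NAMEENTRYSIZE.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of PCRE2): look up a group name in a name
   table laid out as in the example above. Each entry is entry_size bytes;
   the first two bytes hold the group number (most significant byte first)
   and the rest is the zero-terminated name. Returns the group number, or
   -1 if the name is not in the table. */
int find_group_number(const unsigned char *table, size_t name_count,
                      size_t entry_size, const char *name)
{
  assert(table != NULL && name != NULL && entry_size > 2);
  for (size_t i = 0; i < name_count; i++)
    {
    const unsigned char *entry = table + i * entry_size;
    if (strcmp((const char *)(entry + 2), name) == 0)
      return (entry[0] << 8) | entry[1];
    }
  return -1;
}

/* The four-entry, eight-byte-per-entry table from the example. Bytes shown
   as ?? in the documentation are undefined; they are zero here only so the
   array has a definite value. */
const unsigned char example_table[4 * 8] = {
  0, 1, 'd', 'a', 't', 'e', 0, 0,
  0, 5, 'd', 'a', 'y', 0, 0, 0,
  0, 4, 'm', 'o', 'n', 't', 'h', 0,
  0, 2, 'y', 'e', 'a', 'r', 0, 0
};
```

Because the entry size is passed in rather than fixed at eight, the same loop works for other compiled patterns, whose entry length is likely to differ.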
@@ -2741,12 +2741,12 @@ valid newline sequence and explicit \r or \n escapes appear in the pattern.
|
||||||
In general, a pattern matches a certain portion of the subject, and in
|
In general, a pattern matches a certain portion of the subject, and in
|
||||||
addition, further substrings from the subject may be picked out by
|
addition, further substrings from the subject may be picked out by
|
||||||
parenthesized parts of the pattern. Following the usage in Jeffrey Friedl's
|
parenthesized parts of the pattern. Following the usage in Jeffrey Friedl's
|
||||||
book, this is called "capturing" in what follows, and the phrase "capturing
|
book, this is called "capturing" in what follows, and the phrase "capture
|
||||||
subpattern" or "capturing group" is used for a fragment of a pattern that picks
|
group" (Perl terminology) is used for a fragment of a pattern that picks out a
|
||||||
out a substring. PCRE2 supports several other kinds of parenthesized subpattern
|
substring. PCRE2 supports several other kinds of parenthesized group that do
|
||||||
that do not cause substrings to be captured. The <b>pcre2_pattern_info()</b>
|
not cause substrings to be captured. The <b>pcre2_pattern_info()</b> function
|
||||||
function can be used to find out how many capturing subpatterns there are in a
|
can be used to find out how many capture groups there are in a compiled
|
||||||
compiled pattern.
|
pattern.
|
||||||
</P>
|
</P>
|
||||||
<P>
|
<P>
|
||||||
You can use auxiliary functions for accessing captured substrings
|
You can use auxiliary functions for accessing captured substrings
|
||||||
|
@@ -2795,9 +2795,8 @@ For example, if the pattern (?=ab\K) is matched against "ab", the start and
|
||||||
end offset values for the match are 2 and 0.
|
end offset values for the match are 2 and 0.
|
||||||
</P>
|
</P>
|
||||||
<P>
|
<P>
|
||||||
If a capturing subpattern group is matched repeatedly within a single match
|
If a capture group is matched repeatedly within a single match operation, it is
|
||||||
operation, it is the last portion of the subject that it matched that is
|
the last portion of the subject that it matched that is returned.
|
||||||
returned.
|
|
||||||
</P>
|
</P>
|
||||||
<P>
|
<P>
|
||||||
If the ovector is too small to hold all the captured substring offsets, as much
|
If the ovector is too small to hold all the captured substring offsets, as much
|
||||||
|
@@ -2806,21 +2805,20 @@ substrings are not of interest, <b>pcre2_match()</b> may be called with a match
|
||||||
data block whose ovector is of minimum length (that is, one pair).
|
data block whose ovector is of minimum length (that is, one pair).
|
||||||
</P>
|
</P>
|
||||||
<P>
|
<P>
|
||||||
It is possible for capturing subpattern number <i>n+1</i> to match some part of
|
It is possible for capture group number <i>n+1</i> to match some part of the
|
||||||
the subject when subpattern <i>n</i> has not been used at all. For example, if
|
subject when group <i>n</i> has not been used at all. For example, if the string
|
||||||
the string "abc" is matched against the pattern (a|(z))(bc) the return from the
|
"abc" is matched against the pattern (a|(z))(bc) the return from the function
|
||||||
function is 4, and subpatterns 1 and 3 are matched, but 2 is not. When this
|
is 4, and groups 1 and 3 are matched, but 2 is not. When this happens, both
|
||||||
happens, both values in the offset pairs corresponding to unused subpatterns
|
values in the offset pairs corresponding to unused groups are set to
|
||||||
are set to PCRE2_UNSET.
|
PCRE2_UNSET.
|
||||||
</P>
|
</P>
|
||||||
<P>
|
<P>
|
||||||
Offset values that correspond to unused subpatterns at the end of the
|
Offset values that correspond to unused groups at the end of the expression are
|
||||||
expression are also set to PCRE2_UNSET. For example, if the string "abc" is
|
also set to PCRE2_UNSET. For example, if the string "abc" is matched against
|
||||||
matched against the pattern (abc)(x(yz)?)? subpatterns 2 and 3 are not matched.
|
the pattern (abc)(x(yz)?)? groups 2 and 3 are not matched. The return from the
|
||||||
The return from the function is 2, because the highest used capturing
|
function is 2, because the highest used capture group number is 1. The offsets
|
||||||
subpattern number is 1. The offsets for the second and third capturing
|
for the second and third capture groups (assuming the vector is large
|
||||||
subpatterns (assuming the vector is large enough, of course) are set to
|
enough, of course) are set to PCRE2_UNSET.
|
||||||
PCRE2_UNSET.
|
|
||||||
</P>
|
</P>
|
||||||
<P>
|
<P>
|
||||||
Elements in the ovector that do not correspond to capturing parentheses in the
|
Elements in the ovector that do not correspond to capturing parentheses in the
|
||||||
|
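To make the unset-group behaviour in this hunk concrete, here is a minimal sketch in C that inspects an offset vector by hand. It assumes the ovector holds pairs of start and end offsets, and it uses a local `UNSET` constant as a stand-in for PCRE2_UNSET. The helper `group_is_set` and the literal ovector contents are illustrative; they are written out from the description of the (a|(z))(bc) example, not captured from a real call to `pcre2_match()`.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for PCRE2_UNSET (assumption for this sketch: all bits set). */
#define UNSET (~(size_t)0)

/* Hypothetical helper (not part of PCRE2): returns 1 if group n has a
   captured substring, 0 otherwise. rc is the value returned by the match
   function, that is, the highest used group number plus one, so any group
   numbered rc or above did not capture anything. */
int group_is_set(const size_t *ovector, int rc, int n)
{
  assert(ovector != NULL && n >= 0);
  if (n >= rc) return 0;
  return ovector[2 * n] != UNSET && ovector[2 * n + 1] != UNSET;
}

/* Offsets for "abc" matched against (a|(z))(bc): the return value is 4,
   groups 1 and 3 are set, and group 2 is not. */
const size_t example_ovector[8] = {
  0, 3,          /* group 0: whole match "abc" */
  0, 1,          /* group 1: "a" */
  UNSET, UNSET,  /* group 2: (z) did not participate */
  1, 3           /* group 3: "bc" */
};
```

Checking both values of the pair is slightly redundant, since both are set to PCRE2_UNSET together, but it keeps the sketch defensive.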
@@ -2993,11 +2991,11 @@ as NULL.
|
||||||
</pre>
|
</pre>
|
||||||
This error is returned when <b>pcre2_match()</b> detects a recursion loop within
|
This error is returned when <b>pcre2_match()</b> detects a recursion loop within
|
||||||
the pattern. Specifically, it means that either the whole pattern or a
|
the pattern. Specifically, it means that either the whole pattern or a
|
||||||
subpattern has been called recursively for the second time at the same position
|
capture group has been called recursively for the second time at the same
|
||||||
in the subject string. Some simple patterns that might do this are detected and
|
position in the subject string. Some simple patterns that might do this are
|
||||||
faulted at compile time, but more complicated cases, in particular mutual
|
detected and faulted at compile time, but more complicated cases, in particular
|
||||||
recursions between two different subpatterns, cannot be detected until matching
|
mutual recursions between two different groups, cannot be detected until
|
||||||
is attempted.
|
matching is attempted.
|
||||||
<a name="geterrormessage"></a></P>
|
<a name="geterrormessage"></a></P>
|
||||||
<br><a name="SEC32" href="#TOC1">OBTAINING A TEXTUAL ERROR MESSAGE</a><br>
|
<br><a name="SEC32" href="#TOC1">OBTAINING A TEXTUAL ERROR MESSAGE</a><br>
|
||||||
<P>
|
<P>
|
||||||
|
@@ -3074,7 +3072,7 @@ The <b>pcre2_substring_copy_bynumber()</b> function copies a captured substring
|
||||||
into a supplied buffer, whereas <b>pcre2_substring_get_bynumber()</b> copies it
|
into a supplied buffer, whereas <b>pcre2_substring_get_bynumber()</b> copies it
|
||||||
into new memory, obtained using the same memory allocation function that was
|
into new memory, obtained using the same memory allocation function that was
|
||||||
used for the match data block. The first two arguments of these functions are a
|
used for the match data block. The first two arguments of these functions are a
|
||||||
pointer to the match data block and a capturing group number.
|
pointer to the match data block and a capture group number.
|
||||||
</P>
|
</P>
|
||||||
<P>
|
<P>
|
||||||
The final arguments of <b>pcre2_substring_copy_bynumber()</b> are a pointer to
|
The final arguments of <b>pcre2_substring_copy_bynumber()</b> are a pointer to
|
||||||
|
@@ -3150,9 +3148,9 @@ calling <b>pcre2_substring_list_free()</b>.
|
||||||
</P>
|
</P>
|
||||||
<P>
|
<P>
|
||||||
If this function encounters a substring that is unset, which can happen when
|
If this function encounters a substring that is unset, which can happen when
|
||||||
capturing subpattern number <i>n+1</i> matches some part of the subject, but
|
capture group number <i>n+1</i> matches some part of the subject, but group
|
||||||
subpattern <i>n</i> has not been used at all, it returns an empty string. This
|
<i>n</i> has not been used at all, it returns an empty string. This can be
|
||||||
can be distinguished from a genuine zero-length substring by inspecting the
|
distinguished from a genuine zero-length substring by inspecting the
|
||||||
appropriate offset in the ovector, which contains PCRE2_UNSET for unset
|
appropriate offset in the ovector, which contains PCRE2_UNSET for unset
|
||||||
substrings, or by calling <b>pcre2_substring_length_bynumber()</b>.
|
substrings, or by calling <b>pcre2_substring_length_bynumber()</b>.
|
||||||
<a name="extractbyname"></a></P>
|
<a name="extractbyname"></a></P>
|
||||||
|
@@ -3182,21 +3180,21 @@ For example, for this pattern:
|
||||||
<pre>
|
<pre>
|
||||||
(a+)b(?<xxx>\d+)...
|
(a+)b(?<xxx>\d+)...
|
||||||
</pre>
|
</pre>
|
||||||
the number of the subpattern called "xxx" is 2. If the name is known to be
|
the number of the capture group called "xxx" is 2. If the name is known to be
|
||||||
unique (PCRE2_DUPNAMES was not set), you can find the number from the name by
|
unique (PCRE2_DUPNAMES was not set), you can find the number from the name by
|
||||||
calling <b>pcre2_substring_number_from_name()</b>. The first argument is the
|
calling <b>pcre2_substring_number_from_name()</b>. The first argument is the
|
||||||
compiled pattern, and the second is the name. The yield of the function is the
|
compiled pattern, and the second is the name. The yield of the function is the
|
||||||
subpattern number, PCRE2_ERROR_NOSUBSTRING if there is no subpattern of that
|
group number, PCRE2_ERROR_NOSUBSTRING if there is no group with that name, or
|
||||||
name, or PCRE2_ERROR_NOUNIQUESUBSTRING if there is more than one subpattern of
|
PCRE2_ERROR_NOUNIQUESUBSTRING if there is more than one group with that name.
|
||||||
that name. Given the number, you can extract the substring directly from the
|
Given the number, you can extract the substring directly from the ovector, or
|
||||||
ovector, or use one of the "bynumber" functions described above.
|
use one of the "bynumber" functions described above.
|
||||||
</P>
|
</P>
|
||||||
<P>
|
<P>
|
||||||
For convenience, there are also "byname" functions that correspond to the
|
For convenience, there are also "byname" functions that correspond to the
|
||||||
"bynumber" functions, the only difference being that the second argument is a
|
"bynumber" functions, the only difference being that the second argument is a
|
||||||
name instead of a number. If PCRE2_DUPNAMES is set and there are duplicate
|
name instead of a number. If PCRE2_DUPNAMES is set and there are duplicate
|
||||||
names, these functions scan all the groups with the given name, and return the
|
names, these functions scan all the groups with the given name, and return the
|
||||||
first named string that is set.
|
captured substring from the first named group that is set.
|
||||||
</P>
|
</P>
|
||||||
<P>
|
<P>
|
||||||
If there are no groups with the given name, PCRE2_ERROR_NOSUBSTRING is
|
If there are no groups with the given name, PCRE2_ERROR_NOSUBSTRING is
|
||||||
|
@@ -3207,13 +3205,13 @@ set, PCRE2_ERROR_UNSET is returned.
|
||||||
</P>
|
</P>
|
||||||
<P>
|
<P>
|
||||||
<b>Warning:</b> If the pattern uses the (?| feature to set up multiple
|
<b>Warning:</b> If the pattern uses the (?| feature to set up multiple
|
||||||
subpatterns with the same number, as described in the
|
capture groups with the same number, as described in the
|
||||||
<a href="pcre2pattern.html#dupsubpatternnumber">section on duplicate subpattern numbers</a>
|
<a href="pcre2pattern.html#dupgroupnumber">section on duplicate group numbers</a>
|
||||||
in the
|
in the
|
||||||
<a href="pcre2pattern.html"><b>pcre2pattern</b></a>
|
<a href="pcre2pattern.html"><b>pcre2pattern</b></a>
|
||||||
page, you cannot use names to distinguish the different subpatterns, because
|
page, you cannot use names to distinguish the different capture groups, because
|
||||||
names are not included in the compiled code. The matching process uses only
|
names are not included in the compiled code. The matching process uses only
|
||||||
numbers. For this reason, the use of different names for subpatterns of the
|
numbers. For this reason, the use of different names for groups with the
|
||||||
same number causes an error at compile time.
|
same number causes an error at compile time.
|
||||||
<a name="substitutions"></a></P>
|
<a name="substitutions"></a></P>
|
||||||
<br><a name="SEC36" href="#TOC1">CREATING A NEW STRING WITH SUBSTITUTIONS</a><br>
|
<br><a name="SEC36" href="#TOC1">CREATING A NEW STRING WITH SUBSTITUTIONS</a><br>
|
||||||
|
@@ -3276,7 +3274,7 @@ length is in code units, not bytes.
|
||||||
In the replacement string, which is interpreted as a UTF string in UTF mode,
|
In the replacement string, which is interpreted as a UTF string in UTF mode,
|
||||||
and is checked for UTF validity unless the PCRE2_NO_UTF_CHECK option is set, a
|
and is checked for UTF validity unless the PCRE2_NO_UTF_CHECK option is set, a
|
||||||
dollar character is an escape character that can specify the insertion of
|
dollar character is an escape character that can specify the insertion of
|
||||||
characters from capturing groups or names from (*MARK) or other control verbs
|
characters from capture groups or names from (*MARK) or other control verbs
|
||||||
in the pattern. The following forms are always recognized:
|
in the pattern. The following forms are always recognized:
|
||||||
<pre>
|
<pre>
|
||||||
$$ insert a dollar character
|
$$ insert a dollar character
|
||||||
|
@@ -3345,13 +3343,13 @@ efficient to allocate a large buffer and free the excess afterwards, instead of
|
||||||
using PCRE2_SUBSTITUTE_OVERFLOW_LENGTH.
|
using PCRE2_SUBSTITUTE_OVERFLOW_LENGTH.
|
||||||
</P>
|
</P>
|
||||||
<P>
|
<P>
|
||||||
PCRE2_SUBSTITUTE_UNKNOWN_UNSET causes references to capturing groups that do
|
PCRE2_SUBSTITUTE_UNKNOWN_UNSET causes references to capture groups that do
|
||||||
not appear in the pattern to be treated as unset groups. This option should be
|
not appear in the pattern to be treated as unset groups. This option should be
|
||||||
used with care, because it means that a typo in a group name or number no
|
used with care, because it means that a typo in a group name or number no
|
||||||
longer causes the PCRE2_ERROR_NOSUBSTRING error.
|
longer causes the PCRE2_ERROR_NOSUBSTRING error.
|
||||||
</P>
|
</P>
|
||||||
<P>
|
<P>
|
||||||
PCRE2_SUBSTITUTE_UNSET_EMPTY causes unset capturing groups (including unknown
|
PCRE2_SUBSTITUTE_UNSET_EMPTY causes unset capture groups (including unknown
|
||||||
groups when PCRE2_SUBSTITUTE_UNKNOWN_UNSET is set) to be treated as empty
|
groups when PCRE2_SUBSTITUTE_UNKNOWN_UNSET is set) to be treated as empty
|
||||||
strings when inserted as described above. If this option is not set, an attempt
|
strings when inserted as described above. If this option is not set, an attempt
|
||||||
to insert an unset group causes the PCRE2_ERROR_UNSET error. This option does
|
to insert an unset group causes the PCRE2_ERROR_UNSET error. This option does
|
||||||
|
@@ -3379,7 +3377,7 @@ terminating a \Q quoted sequence) reverts to no case forcing. The sequences
|
||||||
\u and \l force the next character (if it is a letter) to upper or lower
|
\u and \l force the next character (if it is a letter) to upper or lower
|
||||||
case, respectively, and then the state automatically reverts to no case
|
case, respectively, and then the state automatically reverts to no case
|
||||||
forcing. Case forcing applies to all inserted characters, including those from
|
forcing. Case forcing applies to all inserted characters, including those from
|
||||||
captured groups and letters within \Q...\E quoted sequences.
|
capture groups and letters within \Q...\E quoted sequences.
|
||||||
</P>
|
</P>
|
||||||
<P>
|
<P>
|
||||||
Note that case forcing sequences such as \U...\E do not nest. For example,
|
Note that case forcing sequences such as \U...\E do not nest. For example,
|
||||||
|
@@ -3388,7 +3386,8 @@ effect.
|
||||||
</P>
|
</P>
|
||||||
<P>
|
<P>
|
||||||
The second effect of setting PCRE2_SUBSTITUTE_EXTENDED is to add more
|
The second effect of setting PCRE2_SUBSTITUTE_EXTENDED is to add more
|
||||||
flexibility to group substitution. The syntax is similar to that used by Bash:
|
flexibility to capture group substitution. The syntax is similar to that used
|
||||||
|
by Bash:
|
||||||
<pre>
|
<pre>
|
||||||
${<n>:-<string>}
|
${<n>:-<string>}
|
||||||
${<n>:+<string1>:<string2>}
|
${<n>:+<string1>:<string2>}
|
||||||
|
@@ -3518,20 +3517,21 @@ PCRE2_SUBSTITUTE_GLOBAL is not set), the rest of the input is copied to the
|
||||||
output and the call to <b>pcre2_substitute()</b> exits, returning the number of
|
output and the call to <b>pcre2_substitute()</b> exits, returning the number of
|
||||||
matches so far.
|
matches so far.
|
||||||
</P>
|
</P>
|
||||||
<br><a name="SEC37" href="#TOC1">DUPLICATE SUBPATTERN NAMES</a><br>
|
<br><a name="SEC37" href="#TOC1">DUPLICATE CAPTURE GROUP NAMES</a><br>
|
||||||
<P>
|
<P>
|
||||||
<b>int pcre2_substring_nametable_scan(const pcre2_code *<i>code</i>,</b>
|
<b>int pcre2_substring_nametable_scan(const pcre2_code *<i>code</i>,</b>
|
||||||
<b> PCRE2_SPTR <i>name</i>, PCRE2_SPTR *<i>first</i>, PCRE2_SPTR *<i>last</i>);</b>
|
<b> PCRE2_SPTR <i>name</i>, PCRE2_SPTR *<i>first</i>, PCRE2_SPTR *<i>last</i>);</b>
|
||||||
</P>
|
</P>
|
||||||
<P>
|
<P>
|
||||||
When a pattern is compiled with the PCRE2_DUPNAMES option, names for
|
When a pattern is compiled with the PCRE2_DUPNAMES option, names for capture
|
||||||
subpatterns are not required to be unique. Duplicate names are always allowed
|
groups are not required to be unique. Duplicate names are always allowed for
|
||||||
for subpatterns with the same number, created by using the (?| feature. Indeed,
|
groups with the same number, created by using the (?| feature. Indeed, if such
|
||||||
if such subpatterns are named, they are required to use the same names.
|
groups are named, they are required to use the same names.
|
||||||
</P>
|
</P>
|
||||||
<P>
|
<P>
|
||||||
Normally, patterns with duplicate names are such that in any one match, only
|
Normally, patterns that use duplicate names are such that in any one match,
|
||||||
one of the named subpatterns participates. An example is shown in the
|
only one of each set of identically-named groups participates. An example is
|
||||||
|
shown in the
|
||||||
<a href="pcre2pattern.html"><b>pcre2pattern</b></a>
|
<a href="pcre2pattern.html"><b>pcre2pattern</b></a>
|
||||||
documentation.
|
documentation.
|
||||||
</P>
|
</P>
|
||||||
|
@ -3703,9 +3703,8 @@ the three matched strings are
|
||||||
On success, the yield of the function is a number greater than zero, which is
|
On success, the yield of the function is a number greater than zero, which is
|
||||||
the number of matched substrings. The offsets of the substrings are returned in
|
the number of matched substrings. The offsets of the substrings are returned in
|
||||||
the ovector, and can be extracted by number in the same way as for
|
the ovector, and can be extracted by number in the same way as for
|
||||||
<b>pcre2_match()</b>, but the numbers bear no relation to any capturing groups
|
<b>pcre2_match()</b>, but the numbers bear no relation to any capture groups
|
||||||
that may exist in the pattern, because DFA matching does not support group
|
that may exist in the pattern, because DFA matching does not support capturing.
|
||||||
capture.
|
|
||||||
</P>
|
</P>
|
||||||
<P>
|
<P>
|
||||||
Calls to the convenience functions that extract substrings by name
|
Calls to the convenience functions that extract substrings by name
|
||||||
|
@ -3747,7 +3746,7 @@ a backreference.
|
||||||
</pre>
|
</pre>
|
||||||
This return is given if <b>pcre2_dfa_match()</b> encounters a condition item
|
This return is given if <b>pcre2_dfa_match()</b> encounters a condition item
|
||||||
that uses a backreference for the condition, or a test for recursion in a
|
that uses a backreference for the condition, or a test for recursion in a
|
||||||
specific group. These are not supported.
|
specific capture group. These are not supported.
|
||||||
<pre>
|
<pre>
|
||||||
PCRE2_ERROR_DFA_WSSIZE
|
PCRE2_ERROR_DFA_WSSIZE
|
||||||
</pre>
|
</pre>
|
||||||
|
@ -3756,9 +3755,9 @@ This return is given if <b>pcre2_dfa_match()</b> runs out of space in the
|
||||||
<pre>
|
<pre>
|
||||||
PCRE2_ERROR_DFA_RECURSE
|
PCRE2_ERROR_DFA_RECURSE
|
||||||
</pre>
|
</pre>
|
||||||
When a recursive subpattern is processed, the matching function calls itself
|
When a recursion or subroutine call is processed, the matching function calls
|
||||||
recursively, using private memory for the ovector and <i>workspace</i>. This
|
itself recursively, using private memory for the ovector and <i>workspace</i>.
|
||||||
error is given if the internal ovector is not large enough. This should be
|
This error is given if the internal ovector is not large enough. This should be
|
||||||
extremely rare, as a vector of size 1000 is used.
|
extremely rare, as a vector of size 1000 is used.
|
||||||
<pre>
|
<pre>
|
||||||
PCRE2_ERROR_DFA_BADRESTART
|
PCRE2_ERROR_DFA_BADRESTART
|
||||||
|
@ -3785,7 +3784,7 @@ Cambridge, England.
|
||||||
</P>
|
</P>
|
||||||
<br><a name="SEC42" href="#TOC1">REVISION</a><br>
|
<br><a name="SEC42" href="#TOC1">REVISION</a><br>
|
||||||
<P>
|
<P>
|
||||||
Last updated: 04 January 2019
|
Last updated: 04 February 2019
|
||||||
<br>
|
<br>
|
||||||
Copyright © 1997-2019 University of Cambridge.
|
Copyright © 1997-2019 University of Cambridge.
|
||||||
<br>
|
<br>
|
||||||
|
|
|
@ -151,7 +151,7 @@ branch, automatic anchoring occurs if all branches are anchorable.
|
||||||
</P>
|
</P>
|
||||||
<P>
|
<P>
|
||||||
This optimization is disabled, however, if .* is in an atomic group or if there
|
This optimization is disabled, however, if .* is in an atomic group or if there
|
||||||
is a backreference to the capturing group in which it appears. It is also
|
is a backreference to the capture group in which it appears. It is also
|
||||||
disabled if the pattern contains (*PRUNE) or (*SKIP). However, the presence of
|
disabled if the pattern contains (*PRUNE) or (*SKIP). However, the presence of
|
||||||
callouts does not affect it.
|
callouts does not affect it.
|
||||||
</P>
|
</P>
|
||||||
|
@ -354,8 +354,8 @@ callout before an assertion such as (?=ab) the length is 3. For an
|
||||||
alternation bar or a closing parenthesis, the length is one, unless a closing
|
alternation bar or a closing parenthesis, the length is one, unless a closing
|
||||||
parenthesis is followed by a quantifier, in which case its length is included.
|
parenthesis is followed by a quantifier, in which case its length is included.
|
||||||
(This changed in release 10.23. In earlier releases, before an opening
|
(This changed in release 10.23. In earlier releases, before an opening
|
||||||
parenthesis the length was that of the entire subpattern, and before an
|
parenthesis the length was that of the entire group, and before an alternation
|
||||||
alternation bar or a closing parenthesis the length was zero.)
|
bar or a closing parenthesis the length was zero.)
|
||||||
</P>
|
</P>
|
||||||
<P>
|
<P>
|
||||||
The <i>pattern_position</i> and <i>next_item_length</i> fields are intended to
|
The <i>pattern_position</i> and <i>next_item_length</i> fields are intended to
|
||||||
|
@ -471,9 +471,9 @@ Cambridge, England.
|
||||||
</P>
|
</P>
|
||||||
<br><a name="SEC8" href="#TOC1">REVISION</a><br>
|
<br><a name="SEC8" href="#TOC1">REVISION</a><br>
|
||||||
<P>
|
<P>
|
||||||
Last updated: 17 September 2018
|
Last updated: 03 February 2019
|
||||||
<br>
|
<br>
|
||||||
Copyright © 1997-2018 University of Cambridge.
|
Copyright © 1997-2019 University of Cambridge.
|
||||||
<br>
|
<br>
|
||||||
<p>
|
<p>
|
||||||
Return to the <a href="index.html">PCRE2 index page</a>.
|
Return to the <a href="index.html">PCRE2 index page</a>.
|
||||||
|
|
|
@ -36,10 +36,9 @@ assertion just once). Perl allows some repeat quantifiers on other assertions,
|
||||||
for example, \b* (but not \b{3}), but these do not seem to have any use.
|
for example, \b* (but not \b{3}), but these do not seem to have any use.
|
||||||
</P>
|
</P>
|
||||||
<P>
|
<P>
|
||||||
3. Capturing subpatterns that occur inside negative lookaround assertions are
|
3. Capture groups that occur inside negative lookaround assertions are counted,
|
||||||
counted, but their entries in the offsets vector are set only when a negative
|
but their entries in the offsets vector are set only when a negative assertion
|
||||||
assertion is a condition that has a matching branch (that is, the condition is
|
is a condition that has a matching branch (that is, the condition is false).
|
||||||
false).
|
|
||||||
</P>
|
</P>
|
||||||
<P>
|
<P>
|
||||||
4. The following Perl escape sequences are not supported: \F, \l, \L, \u,
|
4. The following Perl escape sequences are not supported: \F, \l, \L, \u,
|
||||||
|
@ -94,13 +93,13 @@ to PCRE2 release 10.23, but from release 10.30 this changed, and backtracking
|
||||||
into subroutine calls is now supported, as in Perl.
|
into subroutine calls is now supported, as in Perl.
|
||||||
</P>
|
</P>
|
||||||
<P>
|
<P>
|
||||||
9. If any of the backtracking control verbs are used in a subpattern that is
|
9. If any of the backtracking control verbs are used in a group that is called
|
||||||
called as a subroutine (whether or not recursively), their effect is confined
|
as a subroutine (whether or not recursively), their effect is confined to that
|
||||||
to that subpattern; it does not extend to the surrounding pattern. This is not
|
group; it does not extend to the surrounding pattern. This is not always the
|
||||||
always the case in Perl. In particular, if (*THEN) is present in a group that
|
case in Perl. In particular, if (*THEN) is present in a group that is called as
|
||||||
is called as a subroutine, its action is limited to that group, even if the
|
a subroutine, its action is limited to that group, even if the group does not
|
||||||
group does not contain any | characters. Note that such subpatterns are
|
contain any | characters. Note that such groups are processed as anchored
|
||||||
processed as anchored at the point where they are tested.
|
at the point where they are tested.
|
||||||
</P>
|
</P>
|
||||||
<P>
|
<P>
|
||||||
10. If a pattern contains more than one backtracking control verb, the first
|
10. If a pattern contains more than one backtracking control verb, the first
|
||||||
|
@ -120,22 +119,21 @@ the pattern /^(a(b)?)+$/ in Perl leaves $2 unset, but in PCRE2 it is set to
|
||||||
"b".
|
"b".
|
||||||
</P>
|
</P>
|
||||||
<P>
|
<P>
|
||||||
13. PCRE2's handling of duplicate subpattern numbers and duplicate subpattern
|
13. PCRE2's handling of duplicate capture group numbers and names is not as
|
||||||
names is not as general as Perl's. This is a consequence of the fact the PCRE2
|
general as Perl's. This is a consequence of the fact that PCRE2 works internally
|
||||||
works internally just with numbers, using an external table to translate
|
just with numbers, using an external table to translate between numbers and
|
||||||
between numbers and names. In particular, a pattern such as (?|(?<a>A)|(?<b>B),
|
names. In particular, a pattern such as (?|(?<a>A)|(?<b>B)), where the two
|
||||||
where the two capturing parentheses have the same number but different names,
|
capture groups have the same number but different names, is not supported, and
|
||||||
is not supported, and causes an error at compile time. If it were allowed, it
|
causes an error at compile time. If it were allowed, it would not be possible
|
||||||
would not be possible to distinguish which parentheses matched, because both
|
to distinguish which group matched, because both names map to capture group
|
||||||
names map to capturing subpattern number 1. To avoid this confusing situation,
|
number 1. To avoid this confusing situation, an error is given at compile time.
|
||||||
an error is given at compile time.
|
|
||||||
</P>
|
</P>
|
||||||
<P>
|
<P>
|
||||||
14. Perl used to recognize comments in some places that PCRE2 does not, for
|
14. Perl used to recognize comments in some places that PCRE2 does not, for
|
||||||
example, between the ( and ? at the start of a subpattern. If the /x modifier
|
example, between the ( and ? at the start of a group. If the /x modifier is
|
||||||
is set, Perl allowed white space between ( and ? though the latest Perls give
|
set, Perl allowed white space between ( and ? though the latest Perls give an
|
||||||
an error (for a while it was just deprecated). There may still be some cases
|
error (for a while it was just deprecated). There may still be some cases where
|
||||||
where Perl behaves differently.
|
Perl behaves differently.
|
||||||
</P>
|
</P>
|
||||||
<P>
|
<P>
|
||||||
15. Perl, when in warning mode, gives warnings for character classes such as
|
15. Perl, when in warning mode, gives warnings for character classes such as
|
||||||
|
@ -235,9 +233,9 @@ Cambridge, England.
|
||||||
REVISION
|
REVISION
|
||||||
</b><br>
|
</b><br>
|
||||||
<P>
|
<P>
|
||||||
Last updated: 28 July 2018
|
Last updated: 03 February 2019
|
||||||
<br>
|
<br>
|
||||||
Copyright © 1997-2018 University of Cambridge.
|
Copyright © 1997-2019 University of Cambridge.
|
||||||
<br>
|
<br>
|
||||||
<p>
|
<p>
|
||||||
Return to the <a href="index.html">PCRE2 index page</a>.
|
Return to the <a href="index.html">PCRE2 index page</a>.
|
||||||
|
|
|
@ -50,17 +50,17 @@ All values in repeating quantifiers must be less than 65536.
|
||||||
The maximum length of a lookbehind assertion is 65535 characters.
|
The maximum length of a lookbehind assertion is 65535 characters.
|
||||||
</P>
|
</P>
|
||||||
<P>
|
<P>
|
||||||
There is no limit to the number of parenthesized subpatterns, but there can be
|
There is no limit to the number of parenthesized groups, but there can be no
|
||||||
no more than 65535 capturing subpatterns. There is, however, a limit to the
|
more than 65535 capture groups, and there is a limit to the depth of nesting of
|
||||||
depth of nesting of parenthesized subpatterns of all kinds. This is imposed in
|
parenthesized subpatterns of all kinds. This is imposed in order to limit the
|
||||||
order to limit the amount of system stack used at compile time. The default
|
amount of system stack used at compile time. The default limit can be specified
|
||||||
limit can be specified when PCRE2 is built; if not, the default is set to 250.
|
when PCRE2 is built; if not, the default is set to 250. An application can
|
||||||
An application can change this limit by calling pcre2_set_parens_nest_limit()
|
change this limit by calling pcre2_set_parens_nest_limit() to set the limit in
|
||||||
to set the limit in a compile context.
|
a compile context.
|
||||||
</P>
|
</P>
|
||||||
<P>
|
<P>
|
||||||
The maximum length of name for a named subpattern is 32 code units, and the
|
The maximum length of a name for a named capture group is 32 code units, and the
|
||||||
maximum number of named subpatterns is 10000.
|
maximum number of such groups is 10000.
|
||||||
</P>
|
</P>
|
||||||
<P>
|
<P>
|
||||||
The maximum length of a name in a (*MARK), (*PRUNE), (*SKIP), or (*THEN) verb
|
The maximum length of a name in a (*MARK), (*PRUNE), (*SKIP), or (*THEN) verb
|
||||||
|
@ -86,9 +86,9 @@ Cambridge, England.
|
||||||
REVISION
|
REVISION
|
||||||
</b><br>
|
</b><br>
|
||||||
<P>
|
<P>
|
||||||
Last updated: 30 March 2017
|
Last updated: 02 February 2019
|
||||||
<br>
|
<br>
|
||||||
Copyright © 1997-2017 University of Cambridge.
|
Copyright © 1997-2019 University of Cambridge.
|
||||||
<br>
|
<br>
|
||||||
<p>
|
<p>
|
||||||
Return to the <a href="index.html">PCRE2 index page</a>.
|
Return to the <a href="index.html">PCRE2 index page</a>.
|
||||||
|
|
File diff suppressed because it is too large
|
@ -31,9 +31,9 @@ of them.
|
||||||
Patterns are compiled by PCRE2 into a reasonably efficient interpretive code,
|
Patterns are compiled by PCRE2 into a reasonably efficient interpretive code,
|
||||||
so that most simple patterns do not use much memory for storing the compiled
|
so that most simple patterns do not use much memory for storing the compiled
|
||||||
version. However, there is one case where the memory usage of a compiled
|
version. However, there is one case where the memory usage of a compiled
|
||||||
pattern can be unexpectedly large. If a parenthesized subpattern has a
|
pattern can be unexpectedly large. If a parenthesized group has a quantifier
|
||||||
quantifier with a minimum greater than 1 and/or a limited maximum, the whole
|
with a minimum greater than 1 and/or a limited maximum, the whole group is
|
||||||
subpattern is repeated in the compiled code. For example, the pattern
|
repeated in the compiled code. For example, the pattern
|
||||||
<pre>
|
<pre>
|
||||||
(abc|def){2,4}
|
(abc|def){2,4}
|
||||||
</pre>
|
</pre>
|
||||||
|
@ -252,9 +252,9 @@ Cambridge, England.
|
||||||
</P>
|
</P>
|
||||||
<br><a name="SEC6" href="#TOC1">REVISION</a><br>
|
<br><a name="SEC6" href="#TOC1">REVISION</a><br>
|
||||||
<P>
|
<P>
|
||||||
Last updated: 25 April 2018
|
Last updated: 03 February 2019
|
||||||
<br>
|
<br>
|
||||||
Copyright © 1997-2018 University of Cambridge.
|
Copyright © 1997-2019 University of Cambridge.
|
||||||
<br>
|
<br>
|
||||||
<p>
|
<p>
|
||||||
Return to the <a href="index.html">PCRE2 index page</a>.
|
Return to the <a href="index.html">PCRE2 index page</a>.
|
||||||
|
|
|
@ -424,20 +424,23 @@ but some of them use Unicode properties if PCRE2_UCP is set. You can use
|
||||||
<br><a name="SEC13" href="#TOC1">CAPTURING</a><br>
|
<br><a name="SEC13" href="#TOC1">CAPTURING</a><br>
|
||||||
<P>
|
<P>
|
||||||
<pre>
|
<pre>
|
||||||
(...) capturing group
|
(...) capture group
|
||||||
(?<name>...) named capturing group (Perl)
|
(?<name>...) named capture group (Perl)
|
||||||
(?'name'...) named capturing group (Perl)
|
(?'name'...) named capture group (Perl)
|
||||||
(?P<name>...) named capturing group (Python)
|
(?P<name>...) named capture group (Python)
|
||||||
(?:...) non-capturing group
|
(?:...) non-capture group
|
||||||
(?|...) non-capturing group; reset group numbers for
|
(?|...) non-capture group; reset group numbers for
|
||||||
capturing groups in each alternative
|
capture groups in each alternative
|
||||||
</PRE>
|
</pre>
|
||||||
|
In non-UTF modes, names may contain underscores and ASCII letters and digits;
|
||||||
|
in UTF modes, any Unicode letters and Unicode decimal digits are permitted. In
|
||||||
|
both cases, a name must not start with a digit.
|
||||||
</P>
|
</P>
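The non-UTF naming rule in the paragraph above can be sketched as a small check. This is only an illustration, not PCRE2's own code: the helper is_valid_ascii_group_name() and the macro SKETCH_MAX_NAME_LEN are hypothetical names, the 32-unit limit is taken from the pcre2limits text and assumes the 8-bit library (one code unit per byte), and rejecting an empty name is an assumption.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical limit for this sketch: 32 code units, assuming 8-bit units. */
#define SKETCH_MAX_NAME_LEN 32

/* Hypothetical helper, not part of the PCRE2 API. Returns true if name
   contains only ASCII letters, ASCII digits, and underscores, does not
   start with a digit, and is between 1 and 32 bytes long. */
static bool is_valid_ascii_group_name(const char *name)
{
    size_t len = strlen(name);

    /* Assumption: an empty name is rejected. */
    if (len == 0 || len > SKETCH_MAX_NAME_LEN) return false;

    /* A name must not start with a digit. */
    if (isdigit((unsigned char)name[0])) return false;

    for (size_t i = 0; i < len; i++) {
        unsigned char c = (unsigned char)name[i];
        /* Bytes above 127 are never ASCII, whatever the locale says. */
        if (c > 127) return false;
        if (!isalnum(c) && c != '_') return false;
    }
    return true;
}
```

Under these assumptions "year_1" and "_x" are accepted, while "1st", "na-me", and any name containing a non-ASCII byte are rejected. The UTF-mode rule is not covered, because it needs Unicode property data to recognize letters and decimal digits.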
|
||||||
<br><a name="SEC14" href="#TOC1">ATOMIC GROUPS</a><br>
|
<br><a name="SEC14" href="#TOC1">ATOMIC GROUPS</a><br>
|
||||||
<P>
|
<P>
|
||||||
<pre>
|
<pre>
|
||||||
(?>...) atomic, non-capturing group
|
(?>...) atomic non-capture group
|
||||||
(*atomic:...) atomic, non-capturing group
|
(*atomic:...) atomic non-capture group
|
||||||
</PRE>
|
</PRE>
|
||||||
</P>
|
</P>
|
||||||
<br><a name="SEC15" href="#TOC1">COMMENT</a><br>
|
<br><a name="SEC15" href="#TOC1">COMMENT</a><br>
|
||||||
|
@ -465,7 +468,7 @@ of the group.
|
||||||
Unsetting x or xx unsets both. Several options may be set at once, and a
|
Unsetting x or xx unsets both. Several options may be set at once, and a
|
||||||
mixture of setting and unsetting such as (?i-x) is allowed, but there may be
|
mixture of setting and unsetting such as (?i-x) is allowed, but there may be
|
||||||
only one hyphen. Setting (but no unsetting) is allowed after (?^ for example
|
only one hyphen. Setting (but no unsetting) is allowed after (?^ for example
|
||||||
(?^in). An option setting may appear at the start of a non-capturing group, for
|
(?^in). An option setting may appear at the start of a non-capture group, for
|
||||||
example (?i:...).
|
example (?i:...).
|
||||||
</P>
|
</P>
|
||||||
<P>
|
<P>
|
||||||
|
@ -565,19 +568,19 @@ Each top-level branch of a lookbehind must be of a fixed length.
|
||||||
<P>
|
<P>
|
||||||
<pre>
|
<pre>
|
||||||
(?R) recurse whole pattern
|
(?R) recurse whole pattern
|
||||||
(?n) call subpattern by absolute number
|
(?n) call subroutine by absolute number
|
||||||
(?+n) call subpattern by relative number
|
(?+n) call subroutine by relative number
|
||||||
(?-n) call subpattern by relative number
|
(?-n) call subroutine by relative number
|
||||||
(?&name) call subpattern by name (Perl)
|
(?&name) call subroutine by name (Perl)
|
||||||
(?P>name) call subpattern by name (Python)
|
(?P>name) call subroutine by name (Python)
|
||||||
\g<name> call subpattern by name (Oniguruma)
|
\g<name> call subroutine by name (Oniguruma)
|
||||||
\g'name' call subpattern by name (Oniguruma)
|
\g'name' call subroutine by name (Oniguruma)
|
||||||
\g<n> call subpattern by absolute number (Oniguruma)
|
\g<n> call subroutine by absolute number (Oniguruma)
|
||||||
\g'n' call subpattern by absolute number (Oniguruma)
|
\g'n' call subroutine by absolute number (Oniguruma)
|
||||||
\g<+n> call subpattern by relative number (PCRE2 extension)
|
\g<+n> call subroutine by relative number (PCRE2 extension)
|
||||||
\g'+n' call subpattern by relative number (PCRE2 extension)
|
\g'+n' call subroutine by relative number (PCRE2 extension)
|
||||||
\g<-n> call subpattern by relative number (PCRE2 extension)
|
\g<-n> call subroutine by relative number (PCRE2 extension)
|
||||||
\g'-n' call subpattern by relative number (PCRE2 extension)
|
\g'-n' call subroutine by relative number (PCRE2 extension)
|
||||||
</PRE>
|
</PRE>
|
||||||
</P>
|
</P>
|
||||||
<br><a name="SEC23" href="#TOC1">CONDITIONAL PATTERNS</a><br>
|
<br><a name="SEC23" href="#TOC1">CONDITIONAL PATTERNS</a><br>
|
||||||
|
@ -595,7 +598,7 @@ Each top-level branch of a lookbehind must be of a fixed length.
|
||||||
(?(R) overall recursion condition
|
(?(R) overall recursion condition
|
||||||
(?(Rn) specific numbered group recursion condition
|
(?(Rn) specific numbered group recursion condition
|
||||||
(?(R&name) specific named group recursion condition
|
(?(R&name) specific named group recursion condition
|
||||||
(?(DEFINE) define subpattern for reference
|
(?(DEFINE) define groups for reference
|
||||||
(?(VERSION[>]=n.m) test PCRE2 version
|
(?(VERSION[>]=n.m) test PCRE2 version
|
||||||
(?(assert) assertion condition
|
(?(assert) assertion condition
|
||||||
</pre>
|
</pre>
|
||||||
|
@ -657,9 +660,9 @@ Cambridge, England.
|
||||||
</P>
|
</P>
|
||||||
<br><a name="SEC28" href="#TOC1">REVISION</a><br>
|
<br><a name="SEC28" href="#TOC1">REVISION</a><br>
|
||||||
<P>
|
<P>
|
||||||
Last updated: 10 October 2018
|
Last updated: 03 February 2019
|
||||||
<br>
|
<br>
|
||||||
Copyright © 1997-2018 University of Cambridge.
|
Copyright © 1997-2019 University of Cambridge.
|
||||||
<br>
|
<br>
|
||||||
<p>
|
<p>
|
||||||
Return to the <a href="index.html">PCRE2 index page</a>.
|
Return to the <a href="index.html">PCRE2 index page</a>.
|
||||||
|
|
|
@ -716,14 +716,14 @@ information is obtained from the <b>pcre2_pattern_info()</b> function. Here are
|
||||||
some typical examples:
|
some typical examples:
|
||||||
<pre>
|
<pre>
|
||||||
re> /(?i)(^a|^b)/m,info
|
re> /(?i)(^a|^b)/m,info
|
||||||
Capturing subpattern count = 1
|
Capture group count = 1
|
||||||
Compile options: multiline
|
Compile options: multiline
|
||||||
Overall options: caseless multiline
|
Overall options: caseless multiline
|
||||||
First code unit at start or follows newline
|
First code unit at start or follows newline
|
||||||
Subject length lower bound = 1
|
Subject length lower bound = 1
|
||||||
|
|
||||||
re> /(?i)abc/info
|
re> /(?i)abc/info
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Compile options: <none>
|
Compile options: <none>
|
||||||
Overall options: caseless
|
Overall options: caseless
|
||||||
First code unit = 'a' (caseless)
|
First code unit = 'a' (caseless)
|
||||||
|
@ -1353,8 +1353,8 @@ Testing substring extraction functions
|
||||||
<P>
|
<P>
|
||||||
The <b>copy</b> and <b>get</b> modifiers can be used to test the
|
The <b>copy</b> and <b>get</b> modifiers can be used to test the
|
||||||
<b>pcre2_substring_copy_xxx()</b> and <b>pcre2_substring_get_xxx()</b> functions.
|
<b>pcre2_substring_copy_xxx()</b> and <b>pcre2_substring_get_xxx()</b> functions.
|
||||||
They can be given more than once, and each can specify a group name or number,
|
They can be given more than once, and each can specify a capture group name or
|
||||||
for example:
|
number, for example:
|
||||||
<pre>
|
<pre>
|
||||||
abcd\=copy=1,copy=3,get=G1
|
abcd\=copy=1,copy=3,get=G1
|
||||||
</pre>
|
</pre>
|
||||||
|
@ -2075,9 +2075,9 @@ Cambridge, England.
|
||||||
</P>
|
</P>
|
||||||
<br><a name="SEC21" href="#TOC1">REVISION</a><br>
|
<br><a name="SEC21" href="#TOC1">REVISION</a><br>
|
||||||
<P>
|
<P>
|
||||||
Last updated: 12 November 2018
|
Last updated: 03 February 2019
|
||||||
<br>
|
<br>
|
||||||
Copyright © 1997-2018 University of Cambridge.
|
Copyright © 1997-2019 University of Cambridge.
|
||||||
<br>
|
<br>
|
||||||
<p>
|
<p>
|
||||||
Return to the <a href="index.html">PCRE2 index page</a>.
|
Return to the <a href="index.html">PCRE2 index page</a>.
|
||||||
|
|
|
@ -38,10 +38,11 @@ UNICODE PROPERTY SUPPORT
|
||||||
</b><br>
|
</b><br>
|
||||||
<P>
|
<P>
|
||||||
When PCRE2 is built with Unicode support, the escape sequences \p{..},
|
When PCRE2 is built with Unicode support, the escape sequences \p{..},
|
||||||
\P{..}, and \X can be used. The Unicode properties that can be tested are
|
\P{..}, and \X can be used. This is not dependent on the PCRE2_UTF setting.
|
||||||
limited to the general category properties such as Lu for an upper case letter
|
The Unicode properties that can be tested are limited to the general category
|
||||||
or Nd for a decimal number, the Unicode script names such as Arabic or Han, and
|
properties such as Lu for an upper case letter or Nd for a decimal number, the
|
||||||
the derived properties Any and L&. Full lists are given in the
|
Unicode script names such as Arabic or Han, and the derived properties Any and
|
||||||
|
L&. Full lists are given in the
|
||||||
<a href="pcre2pattern.html"><b>pcre2pattern</b></a>
|
<a href="pcre2pattern.html"><b>pcre2pattern</b></a>
|
||||||
and
|
and
|
||||||
<a href="pcre2syntax.html"><b>pcre2syntax</b></a>
|
<a href="pcre2syntax.html"><b>pcre2syntax</b></a>
|
||||||
|
@ -73,11 +74,17 @@ In UTF modes, the dot metacharacter matches one UTF character instead of a
|
||||||
single code unit.
|
single code unit.
|
||||||
</P>
|
</P>
|
||||||
<P>
|
<P>
|
||||||
|
In UTF modes, capture group names are not restricted to ASCII, and may contain
|
||||||
|
any Unicode letters and decimal digits, as well as underscore.
|
||||||
|
</P>
|
||||||
|
<P>
|
||||||
The escape sequence \C can be used to match a single code unit in a UTF mode,
|
The escape sequence \C can be used to match a single code unit in a UTF mode,
|
||||||
but its use can lead to some strange effects because it breaks up multi-unit
|
but its use can lead to some strange effects because it breaks up multi-unit
|
||||||
characters (see the description of \C in the
|
characters (see the description of \C in the
|
||||||
<a href="pcre2pattern.html"><b>pcre2pattern</b></a>
|
<a href="pcre2pattern.html"><b>pcre2pattern</b></a>
|
||||||
documentation).
|
documentation). For this reason, there is a build-time option that disables
|
||||||
|
support for \C completely. There is also a less draconian compile-time option
|
||||||
|
for locking out the use of \C when a pattern is compiled.
|
||||||
</P>
|
</P>
|
||||||
<P>
|
<P>
|
||||||
The use of \C is not supported by the alternative matching function
|
The use of \C is not supported by the alternative matching function
|
||||||
|
@ -410,9 +417,9 @@ Cambridge, England.
|
||||||
REVISION
|
REVISION
|
||||||
</b><br>
|
</b><br>
|
||||||
<P>
|
<P>
|
||||||
Last updated: 12 October 2018
|
Last updated: 03 February 2019
|
||||||
<br>
|
<br>
|
||||||
Copyright © 1997-2018 University of Cambridge.
|
Copyright © 1997-2019 University of Cambridge.
|
||||||
<br>
|
<br>
|
||||||
<p>
|
<p>
|
||||||
Return to the <a href="index.html">PCRE2 index page</a>.
|
Return to the <a href="index.html">PCRE2 index page</a>.
|
||||||
|
|
3281
doc/pcre2.txt
File diff suppressed because it is too large
|
@ -1,4 +1,4 @@
|
||||||
.TH PCRE2_SUBSTRING_NAMETABLE_SCAN 3 "21 October 2014" "PCRE2 10.00"
|
.TH PCRE2_SUBSTRING_NAMETABLE_SCAN 3 "03 February 2019" "PCRE2 10.33"
|
||||||
.SH NAME
|
.SH NAME
|
||||||
PCRE2 - Perl-compatible regular expressions (revised API)
|
PCRE2 - Perl-compatible regular expressions (revised API)
|
||||||
.SH SYNOPSIS
|
.SH SYNOPSIS
|
||||||
|
@ -15,8 +15,8 @@ PCRE2 - Perl-compatible regular expressions (revised API)
|
||||||
.rs
|
.rs
|
||||||
.sp
|
.sp
|
||||||
This convenience function finds, for a compiled pattern, the first and last
|
This convenience function finds, for a compiled pattern, the first and last
|
||||||
entries for a given name in the table that translates capturing parenthesis
|
entries for a given name in the table that translates capture group names into
|
||||||
names into numbers.
|
numbers.
|
||||||
.sp
|
.sp
|
||||||
\fIcode\fP Compiled regular expression
|
\fIcode\fP Compiled regular expression
|
||||||
\fIname\fP Name whose entries required
|
\fIname\fP Name whose entries required
|
||||||
|
|
195
doc/pcre2api.3
|
@ -1,4 +1,4 @@
|
||||||
.TH PCRE2API 3 "04 January 2019" "PCRE2 10.33"
|
.TH PCRE2API 3 "04 February 2019" "PCRE2 10.33"
|
||||||
.SH NAME
|
.SH NAME
|
||||||
PCRE2 - Perl-compatible regular expressions (revised API)
|
PCRE2 - Perl-compatible regular expressions (revised API)
|
||||||
.sp
|
.sp
|
||||||
|
@ -1429,10 +1429,10 @@ independent of the setting of PCRE2_DOTALL.
|
||||||
.sp
|
.sp
|
||||||
PCRE2_DUPNAMES
|
PCRE2_DUPNAMES
|
||||||
.sp
|
.sp
|
||||||
If this bit is set, names used to identify capturing subpatterns need not be
|
If this bit is set, names used to identify capture groups need not be unique.
|
||||||
unique. This can be helpful for certain types of pattern when it is known that
|
This can be helpful for certain types of pattern when it is known that only one
|
||||||
only one instance of the named subpattern can ever be matched. There are more
|
instance of the named group can ever be matched. There are more details of
|
||||||
details of named subpatterns below; see also the
|
named capture groups below; see also the
|
||||||
.\" HREF
|
.\" HREF
|
||||||
\fBpcre2pattern\fP
|
\fBpcre2pattern\fP
|
||||||
.\"
|
.\"
|
||||||
|
@ -1466,11 +1466,11 @@ the end of the subject.
|
||||||
If this bit is set, most white space characters in the pattern are totally
|
If this bit is set, most white space characters in the pattern are totally
|
||||||
ignored except when escaped or inside a character class. However, white space
|
ignored except when escaped or inside a character class. However, white space
|
||||||
is not allowed within sequences such as (?> that introduce various
|
is not allowed within sequences such as (?> that introduce various
|
||||||
parenthesized subpatterns, nor within numerical quantifiers such as {1,3}.
|
parenthesized groups, nor within numerical quantifiers such as {1,3}. Ignorable
|
||||||
Ignorable white space is permitted between an item and a following quantifier
|
white space is permitted between an item and a following quantifier and between
|
||||||
and between a quantifier and a following + that indicates possessiveness.
|
a quantifier and a following + that indicates possessiveness. PCRE2_EXTENDED is
|
||||||
PCRE2_EXTENDED is equivalent to Perl's /x option, and it can be changed within
|
equivalent to Perl's /x option, and it can be changed within a pattern by a
|
||||||
a pattern by a (?x) option setting.
|
(?x) option setting.
|
||||||
.P
|
.P
|
||||||
When PCRE2 is compiled without Unicode support, PCRE2_EXTENDED recognizes as
|
When PCRE2 is compiled without Unicode support, PCRE2_EXTENDED recognizes as
|
||||||
white space only those characters with code points less than 256 that are
|
white space only those characters with code points less than 256 that are
|
||||||
|
@ -1547,7 +1547,7 @@ error.
|
||||||
.sp
|
.sp
|
||||||
PCRE2_MATCH_UNSET_BACKREF
|
PCRE2_MATCH_UNSET_BACKREF
|
||||||
.sp
|
.sp
|
||||||
If this option is set, a backreference to an unset subpattern group matches an
|
If this option is set, a backreference to an unset capture group matches an
|
||||||
empty string (by default this causes the current matching alternative to fail).
|
empty string (by default this causes the current matching alternative to fail).
|
||||||
A pattern such as (\e1)(a) succeeds when this option is set (assuming it can
|
A pattern such as (\e1)(a) succeeds when this option is set (assuming it can
|
||||||
find an "a" in the subject), whereas it fails by default, for Perl
|
find an "a" in the subject), whereas it fails by default, for Perl
|
||||||
|
@ -1608,7 +1608,7 @@ If this option is set, it disables the use of numbered capturing parentheses in
|
||||||
the pattern. Any opening parenthesis that is not followed by ? behaves as if it
|
the pattern. Any opening parenthesis that is not followed by ? behaves as if it
|
||||||
were followed by ?: but named parentheses can still be used for capturing (and
|
were followed by ?: but named parentheses can still be used for capturing (and
|
||||||
they acquire numbers in the usual way). This is the same as Perl's /n option.
|
they acquire numbers in the usual way). This is the same as Perl's /n option.
|
||||||
Note that, when this option is set, references to capturing groups
|
Note that, when this option is set, references to capture groups
|
||||||
(backreferences or recursion/subroutine calls) may only refer to named groups,
|
(backreferences or recursion/subroutine calls) may only refer to named groups,
|
||||||
though the reference can be by name or by number.
|
though the reference can be by name or by number.
|
||||||
.sp
|
.sp
|
||||||
|
@ -1627,7 +1627,7 @@ purposes.
|
||||||
If this option is set, it disables an optimization that is applied when .* is
|
If this option is set, it disables an optimization that is applied when .* is
|
||||||
the first significant item in a top-level branch of a pattern, and all the
|
the first significant item in a top-level branch of a pattern, and all the
|
||||||
other branches also start with .* or with \eA or \eG or ^. The optimization is
|
other branches also start with .* or with \eA or \eG or ^. The optimization is
|
||||||
automatically disabled for .* if it is inside an atomic group or a capturing
|
automatically disabled for .* if it is inside an atomic group or a capture
|
||||||
group that is the subject of a backreference, or if the pattern contains
|
group that is the subject of a backreference, or if the pattern contains
|
||||||
(*PRUNE) or (*SKIP). When the optimization is not disabled, such a pattern is
|
(*PRUNE) or (*SKIP). When the optimization is not disabled, such a pattern is
|
||||||
automatically anchored if PCRE2_DOTALL is set for all the .* items and
|
automatically anchored if PCRE2_DOTALL is set for all the .* items and
|
||||||
|
@ -2025,7 +2025,7 @@ following are true:
|
||||||
.sp
|
.sp
|
||||||
.* is not in an atomic group
|
.* is not in an atomic group
|
||||||
.\" JOIN
|
.\" JOIN
|
||||||
.* is not in a capturing group that is the subject
|
.* is not in a capture group that is the subject
|
||||||
of a backreference
|
of a backreference
|
||||||
PCRE2_DOTALL is in force for .*
|
PCRE2_DOTALL is in force for .*
|
||||||
Neither (*PRUNE) nor (*SKIP) appears in the pattern
|
Neither (*PRUNE) nor (*SKIP) appears in the pattern
|
||||||
|
@ -2037,12 +2037,12 @@ options returned for PCRE2_INFO_ALLOPTIONS.
|
||||||
PCRE2_INFO_BACKREFMAX
|
PCRE2_INFO_BACKREFMAX
|
||||||
.sp
|
.sp
|
||||||
Return the number of the highest backreference in the pattern. The third
|
Return the number of the highest backreference in the pattern. The third
|
||||||
argument should point to an \fBuint32_t\fP variable. Named subpatterns acquire
|
argument should point to an \fBuint32_t\fP variable. Named capture groups
|
||||||
numbers as well as names, and these count towards the highest backreference.
|
acquire numbers as well as names, and these count towards the highest
|
||||||
Backreferences such as \e4 or \eg{12} match the captured characters of the
|
backreference. Backreferences such as \e4 or \eg{12} match the captured
|
||||||
given group, but in addition, the check that a capturing group is set in a
|
characters of the given group, but in addition, the check that a capture
|
||||||
conditional subpattern such as (?(3)a|b) is also a backreference. Zero is
|
group is set in a conditional group such as (?(3)a|b) is also a backreference.
|
||||||
returned if there are no backreferences.
|
Zero is returned if there are no backreferences.
|
||||||
.sp
|
.sp
|
||||||
PCRE2_INFO_BSR
|
PCRE2_INFO_BSR
|
||||||
.sp
|
.sp
|
||||||
|
@ -2053,9 +2053,9 @@ that \eR matches only CR, LF, or CRLF.
|
||||||
.sp
|
.sp
|
||||||
PCRE2_INFO_CAPTURECOUNT
|
PCRE2_INFO_CAPTURECOUNT
|
||||||
.sp
|
.sp
|
||||||
Return the highest capturing subpattern number in the pattern. In patterns
|
Return the highest capture group number in the pattern. In patterns where (?|
|
||||||
where (?| is not used, this is also the total number of capturing subpatterns.
|
is not used, this is also the total number of capture groups. The third
|
||||||
The third argument should point to an \fBuint32_t\fP variable.
|
argument should point to an \fBuint32_t\fP variable.
|
||||||
.sp
|
.sp
|
||||||
PCRE2_INFO_DEPTHLIMIT
|
PCRE2_INFO_DEPTHLIMIT
|
||||||
.sp
|
.sp
|
||||||
|
@ -2103,7 +2103,7 @@ Return the size (in bytes) of the data frames that are used to remember
|
||||||
backtracking positions when the pattern is processed by \fBpcre2_match()\fP
|
backtracking positions when the pattern is processed by \fBpcre2_match()\fP
|
||||||
without the use of JIT. The third argument should point to a \fBsize_t\fP
|
without the use of JIT. The third argument should point to a \fBsize_t\fP
|
||||||
variable. The frame size depends on the number of capturing parentheses in the
|
variable. The frame size depends on the number of capturing parentheses in the
|
||||||
pattern. Each additional capturing group adds two PCRE2_SIZE variables.
|
pattern. Each additional capture group adds two PCRE2_SIZE variables.
|
||||||
.sp
|
.sp
|
||||||
PCRE2_INFO_HASBACKSLASHC
|
PCRE2_INFO_HASBACKSLASHC
|
||||||
.sp
|
.sp
|
||||||
|
@ -2224,11 +2224,11 @@ library, the pointer points to 32-bit code units, the first of which contains
|
||||||
the parenthesis number. The rest of the entry is the corresponding name, zero
|
the parenthesis number. The rest of the entry is the corresponding name, zero
|
||||||
terminated.
|
terminated.
|
||||||
.P
|
.P
|
||||||
The names are in alphabetical order. If (?| is used to create multiple groups
|
The names are in alphabetical order. If (?| is used to create multiple capture
|
||||||
with the same number, as described in the
|
groups with the same number, as described in the
|
||||||
.\" HTML <a href="pcre2pattern.html#dupsubpatternnumber">
|
.\" HTML <a href="pcre2pattern.html#dupgroupnumber">
|
||||||
.\" </a>
|
.\" </a>
|
||||||
section on duplicate subpattern numbers
|
section on duplicate group numbers
|
||||||
.\"
|
.\"
|
||||||
in the
|
in the
|
||||||
.\" HREF
|
.\" HREF
|
||||||
|
@ -2237,11 +2237,11 @@ in the
|
||||||
page, the groups may be given the same name, but there is only one entry in the
|
page, the groups may be given the same name, but there is only one entry in the
|
||||||
table. Different names for groups of the same number are not permitted.
|
table. Different names for groups of the same number are not permitted.
|
||||||
.P
|
.P
|
||||||
Duplicate names for subpatterns with different numbers are permitted, but only
|
Duplicate names for capture groups with different numbers are permitted, but
|
||||||
if PCRE2_DUPNAMES is set. They appear in the table in the order in which they
|
only if PCRE2_DUPNAMES is set. They appear in the table in the order in which
|
||||||
were found in the pattern. In the absence of (?| this is the order of
|
they were found in the pattern. In the absence of (?| this is the order of
|
||||||
increasing number; when (?| is used this is not necessarily the case because
|
increasing number; when (?| is used this is not necessarily the case because
|
||||||
later subpatterns may have lower numbers.
|
later capture groups may have lower numbers.
|
||||||
.P
|
.P
|
||||||
As a simple example of the name/number table, consider the following pattern
|
As a simple example of the name/number table, consider the following pattern
|
||||||
after compilation by the 8-bit library (assume PCRE2_EXTENDED is set, so white
|
after compilation by the 8-bit library (assume PCRE2_EXTENDED is set, so white
|
||||||
|
@ -2251,16 +2251,16 @@ space - including newlines - is ignored):
|
||||||
(?<date> (?<year>(\ed\ed)?\ed\ed) -
|
(?<date> (?<year>(\ed\ed)?\ed\ed) -
|
||||||
(?<month>\ed\ed) - (?<day>\ed\ed) )
|
(?<month>\ed\ed) - (?<day>\ed\ed) )
|
||||||
.sp
|
.sp
|
||||||
There are four named subpatterns, so the table has four entries, and each entry
|
There are four named capture groups, so the table has four entries, and each
|
||||||
in the table is eight bytes long. The table is as follows, with non-printing
|
entry in the table is eight bytes long. The table is as follows, with
|
||||||
bytes shown in hexadecimal, and undefined bytes shown as ??:
|
non-printing bytes shown in hexadecimal, and undefined bytes shown as ??:
|
||||||
.sp
|
.sp
|
||||||
00 01 d a t e 00 ??
|
00 01 d a t e 00 ??
|
||||||
00 05 d a y 00 ?? ??
|
00 05 d a y 00 ?? ??
|
||||||
00 04 m o n t h 00
|
00 04 m o n t h 00
|
||||||
00 02 y e a r 00 ??
|
00 02 y e a r 00 ??
|
||||||
.sp
|
.sp
|
||||||
When writing code to extract data from named subpatterns using the
|
When writing code to extract data from named capture groups using the
|
||||||
name-to-number map, remember that the length of the entries is likely to be
|
name-to-number map, remember that the length of the entries is likely to be
|
||||||
different for each compiled pattern.
|
different for each compiled pattern.
|
||||||
.sp
|
.sp
|
||||||
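The name/number table in the hunk above can be walked with a few lines of C. This is a minimal sketch rather than library code: the byte array copies the example table for the 8-bit library (with the undefined ?? bytes written as zero), and lookup_group_number() is a hypothetical helper, not a PCRE2 function. It assumes, as the example shows, that each entry starts with a two-byte group number, most significant byte first, followed by the zero-terminated name.

```c
#include <assert.h>
#include <string.h>

/* Bytes of the example table (8-bit library): four entries of eight bytes.
   The undefined padding bytes (shown as ?? in the text) are set to 0 here. */
static const unsigned char example_table[] = {
  0x00, 0x01, 'd', 'a', 't', 'e', 0x00, 0x00,
  0x00, 0x05, 'd', 'a', 'y', 0x00, 0x00, 0x00,
  0x00, 0x04, 'm', 'o', 'n', 't', 'h', 0x00,
  0x00, 0x02, 'y', 'e', 'a', 'r', 0x00, 0x00
};

/* Hypothetical helper: return the group number for a name, or -1 if the
   name is not in the table. The entry size is passed in because it differs
   from one compiled pattern to another. */
static int lookup_group_number(const unsigned char *table, unsigned count,
                               unsigned entry_size, const char *name)
{
  /* Two bytes of number plus at least a terminating zero. */
  assert(entry_size >= 3);
  for (unsigned i = 0; i < count; i++)
    {
    const unsigned char *entry = table + (size_t)i * entry_size;
    /* First two bytes: group number, most significant byte first. */
    int number = (entry[0] << 8) | entry[1];
    if (strcmp((const char *)(entry + 2), name) == 0) return number;
    }
  return -1;
}
```

Calling lookup_group_number(example_table, 4, 8, "month") walks the four entries and stops at the third. In real code the table pointer, entry count and entry size would be obtained from the compiled pattern instead of being hard-coded.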
|
@ -2740,12 +2740,12 @@ valid newline sequence and explicit \er or \en escapes appear in the pattern.
|
||||||
In general, a pattern matches a certain portion of the subject, and in
|
In general, a pattern matches a certain portion of the subject, and in
|
||||||
addition, further substrings from the subject may be picked out by
|
addition, further substrings from the subject may be picked out by
|
||||||
parenthesized parts of the pattern. Following the usage in Jeffrey Friedl's
|
parenthesized parts of the pattern. Following the usage in Jeffrey Friedl's
|
||||||
book, this is called "capturing" in what follows, and the phrase "capturing
|
book, this is called "capturing" in what follows, and the phrase "capture
|
||||||
subpattern" or "capturing group" is used for a fragment of a pattern that picks
|
group" (Perl terminology) is used for a fragment of a pattern that picks out a
|
||||||
out a substring. PCRE2 supports several other kinds of parenthesized subpattern
|
substring. PCRE2 supports several other kinds of parenthesized group that do
|
||||||
that do not cause substrings to be captured. The \fBpcre2_pattern_info()\fP
|
not cause substrings to be captured. The \fBpcre2_pattern_info()\fP function
|
||||||
function can be used to find out how many capturing subpatterns there are in a
|
can be used to find out how many capture groups there are in a compiled
|
||||||
compiled pattern.
|
pattern.
|
||||||
.P
|
.P
|
||||||
You can use auxiliary functions for accessing captured substrings
|
You can use auxiliary functions for accessing captured substrings
|
||||||
.\" HTML <a href="#extractbynumber">
|
.\" HTML <a href="#extractbynumber">
|
||||||
|
@ -2798,30 +2798,28 @@ reported start of a successful match can be greater than the end of the match.
|
||||||
For example, if the pattern (?=ab\eK) is matched against "ab", the start and
|
For example, if the pattern (?=ab\eK) is matched against "ab", the start and
|
||||||
end offset values for the match are 2 and 0.
|
end offset values for the match are 2 and 0.
|
||||||
.P
|
.P
|
||||||
If a capturing subpattern group is matched repeatedly within a single match
|
If a capture group is matched repeatedly within a single match operation, it is
|
||||||
operation, it is the last portion of the subject that it matched that is
|
the last portion of the subject that it matched that is returned.
|
||||||
returned.
|
|
||||||
.P
|
.P
|
||||||
If the ovector is too small to hold all the captured substring offsets, as much
|
If the ovector is too small to hold all the captured substring offsets, as much
|
||||||
as possible is filled in, and the function returns a value of zero. If captured
|
as possible is filled in, and the function returns a value of zero. If captured
|
||||||
substrings are not of interest, \fBpcre2_match()\fP may be called with a match
|
substrings are not of interest, \fBpcre2_match()\fP may be called with a match
|
||||||
data block whose ovector is of minimum length (that is, one pair).
|
data block whose ovector is of minimum length (that is, one pair).
|
||||||
.P
|
.P
|
||||||
It is possible for capturing subpattern number \fIn+1\fP to match some part of
|
It is possible for capture group number \fIn+1\fP to match some part of the
|
||||||
the subject when subpattern \fIn\fP has not been used at all. For example, if
|
subject when group \fIn\fP has not been used at all. For example, if the string
|
||||||
the string "abc" is matched against the pattern (a|(z))(bc) the return from the
|
"abc" is matched against the pattern (a|(z))(bc) the return from the function
|
||||||
function is 4, and subpatterns 1 and 3 are matched, but 2 is not. When this
|
is 4, and groups 1 and 3 are matched, but 2 is not. When this happens, both
|
||||||
happens, both values in the offset pairs corresponding to unused subpatterns
|
values in the offset pairs corresponding to unused groups are set to
|
||||||
are set to PCRE2_UNSET.
|
|
||||||
.P
|
|
||||||
Offset values that correspond to unused subpatterns at the end of the
|
|
||||||
expression are also set to PCRE2_UNSET. For example, if the string "abc" is
|
|
||||||
matched against the pattern (abc)(x(yz)?)? subpatterns 2 and 3 are not matched.
|
|
||||||
The return from the function is 2, because the highest used capturing
|
|
||||||
subpattern number is 1. The offsets for the second and third capturing
|
|
||||||
subpatterns (assuming the vector is large enough, of course) are set to
|
|
||||||
PCRE2_UNSET.
|
PCRE2_UNSET.
|
||||||
.P
|
.P
|
||||||
|
Offset values that correspond to unused groups at the end of the expression are
|
||||||
|
also set to PCRE2_UNSET. For example, if the string "abc" is matched against
|
||||||
|
the pattern (abc)(x(yz)?)? groups 2 and 3 are not matched. The return from the
|
||||||
|
function is 2, because the highest used capture group number is 1. The offsets
|
||||||
|
for the second and third capture groups (assuming the vector is large
|
||||||
|
enough, of course) are set to PCRE2_UNSET.
|
||||||
|
.P
|
||||||
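The handling of unset groups described above can be illustrated without linking the library. The following is only a sketch under stated assumptions: SKETCH_UNSET stands in for PCRE2_UNSET, the ovector contents were worked out by hand for "abc" matched against (a|(z))(bc), and count_set_groups() is a hypothetical helper.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for PCRE2_UNSET so that the sketch compiles without pcre2.h. */
#define SKETCH_UNSET (~(size_t)0)

/* Hand-written offsets for "abc" against (a|(z))(bc); the match function
   would return 4 in this case. */
static const size_t example_ovector[] = {
  0, 3,                        /* whole match "abc" */
  0, 1,                        /* group 1 matched "a" */
  SKETCH_UNSET, SKETCH_UNSET,  /* group 2 was not used */
  1, 3                         /* group 3 matched "bc" */
};

/* Hypothetical helper: count the groups 1..rc-1 that captured something.
   rc is the positive return value of the match function, which is one more
   than the highest group number that was set. */
static int count_set_groups(const size_t *ovector, int rc)
{
  int set = 0;
  assert(rc >= 1);
  for (int i = 1; i < rc; i++)
    {
    /* Both values of an unused pair are unset; testing the first suffices. */
    if (ovector[2*i] != SKETCH_UNSET) set++;
    }
  return set;
}
```

With the example data, count_set_groups(example_ovector, 4) visits groups 1 to 3 and skips group 2. The point of the sketch is that a return value of 4 does not mean that three groups captured text, so each pair must be tested before its offsets are used.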
Elements in the ovector that do not correspond to capturing parentheses in the
|
Elements in the ovector that do not correspond to capturing parentheses in the
|
||||||
pattern are never changed. That is, if a pattern contains \fIn\fP capturing
|
pattern are never changed. That is, if a pattern contains \fIn\fP capturing
|
||||||
parentheses, no more than \fIovector[0]\fP to \fIovector[2n+1]\fP are set by
|
parentheses, no more than \fIovector[0]\fP to \fIovector[2n+1]\fP are set by
|
||||||
|
@ -3006,11 +3004,11 @@ as NULL.
|
||||||
.sp
|
.sp
|
||||||
This error is returned when \fBpcre2_match()\fP detects a recursion loop within
|
This error is returned when \fBpcre2_match()\fP detects a recursion loop within
|
||||||
the pattern. Specifically, it means that either the whole pattern or a
|
the pattern. Specifically, it means that either the whole pattern or a
|
||||||
subpattern has been called recursively for the second time at the same position
|
capture group has been called recursively for the second time at the same
|
||||||
in the subject string. Some simple patterns that might do this are detected and
|
position in the subject string. Some simple patterns that might do this are
|
||||||
faulted at compile time, but more complicated cases, in particular mutual
|
detected and faulted at compile time, but more complicated cases, in particular
|
||||||
recursions between two different subpatterns, cannot be detected until matching
|
mutual recursions between two different groups, cannot be detected until
|
||||||
is attempted.
|
matching is attempted.
|
||||||
.
|
.
|
||||||
.
|
.
|
||||||
.\" HTML <a name="geterrormessage"></a>
|
.\" HTML <a name="geterrormessage"></a>
|
||||||
|
@ -3090,7 +3088,7 @@ The \fBpcre2_substring_copy_bynumber()\fP function copies a captured substring
|
||||||
into a supplied buffer, whereas \fBpcre2_substring_get_bynumber()\fP copies it
|
into a supplied buffer, whereas \fBpcre2_substring_get_bynumber()\fP copies it
|
||||||
into new memory, obtained using the same memory allocation function that was
|
into new memory, obtained using the same memory allocation function that was
|
||||||
used for the match data block. The first two arguments of these functions are a
|
used for the match data block. The first two arguments of these functions are a
|
||||||
pointer to the match data block and a capturing group number.
|
pointer to the match data block and a capture group number.
|
||||||
.P
|
.P
|
||||||
The final arguments of \fBpcre2_substring_copy_bynumber()\fP are a pointer to
|
The final arguments of \fBpcre2_substring_copy_bynumber()\fP are a pointer to
|
||||||
the buffer and a pointer to a variable that contains its length in code units.
|
the buffer and a pointer to a variable that contains its length in code units.
|
||||||
|
@ -3162,9 +3160,9 @@ could not be obtained. When the list is no longer needed, it should be freed by
|
||||||
calling \fBpcre2_substring_list_free()\fP.
|
calling \fBpcre2_substring_list_free()\fP.
|
||||||
.P
|
.P
|
||||||
If this function encounters a substring that is unset, which can happen when
|
If this function encounters a substring that is unset, which can happen when
|
||||||
capturing subpattern number \fIn+1\fP matches some part of the subject, but
|
capture group number \fIn+1\fP matches some part of the subject, but group
|
||||||
subpattern \fIn\fP has not been used at all, it returns an empty string. This
|
\fIn\fP has not been used at all, it returns an empty string. This can be
|
||||||
can be distinguished from a genuine zero-length substring by inspecting the
|
distinguished from a genuine zero-length substring by inspecting the
|
||||||
appropriate offset in the ovector, which contains PCRE2_UNSET for unset
|
appropriate offset in the ovector, which contains PCRE2_UNSET for unset
|
||||||
substrings, or by calling \fBpcre2_substring_length_bynumber()\fP.
|
substrings, or by calling \fBpcre2_substring_length_bynumber()\fP.
|
||||||
.
|
.
|
||||||
|
@ -3194,20 +3192,20 @@ For example, for this pattern:
|
||||||
.sp
|
.sp
|
||||||
(a+)b(?<xxx>\ed+)...
|
(a+)b(?<xxx>\ed+)...
|
||||||
.sp
|
.sp
|
||||||
the number of the subpattern called "xxx" is 2. If the name is known to be
|
the number of the capture group called "xxx" is 2. If the name is known to be
|
||||||
unique (PCRE2_DUPNAMES was not set), you can find the number from the name by
|
unique (PCRE2_DUPNAMES was not set), you can find the number from the name by
|
||||||
calling \fBpcre2_substring_number_from_name()\fP. The first argument is the
|
calling \fBpcre2_substring_number_from_name()\fP. The first argument is the
|
||||||
compiled pattern, and the second is the name. The yield of the function is the
|
compiled pattern, and the second is the name. The yield of the function is the
|
||||||
subpattern number, PCRE2_ERROR_NOSUBSTRING if there is no subpattern of that
|
group number, PCRE2_ERROR_NOSUBSTRING if there is no group with that name, or
|
||||||
name, or PCRE2_ERROR_NOUNIQUESUBSTRING if there is more than one subpattern of
|
PCRE2_ERROR_NOUNIQUESUBSTRING if there is more than one group with that name.
|
||||||
that name. Given the number, you can extract the substring directly from the
|
Given the number, you can extract the substring directly from the ovector, or
|
||||||
ovector, or use one of the "bynumber" functions described above.
|
use one of the "bynumber" functions described above.
|
||||||
.P
|
.P
|
||||||
For convenience, there are also "byname" functions that correspond to the
|
For convenience, there are also "byname" functions that correspond to the
|
||||||
"bynumber" functions, the only difference being that the second argument is a
|
"bynumber" functions, the only difference being that the second argument is a
|
||||||
name instead of a number. If PCRE2_DUPNAMES is set and there are duplicate
|
name instead of a number. If PCRE2_DUPNAMES is set and there are duplicate
|
||||||
names, these functions scan all the groups with the given name, and return the
|
names, these functions scan all the groups with the given name, and return the
|
||||||
first named string that is set.
|
captured substring from the first named group that is set.
|
||||||
.P
|
.P
|
||||||
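The scan that the "byname" functions perform when names are duplicated can be sketched as follows. This is an illustration, not the library's implementation: struct named_group, first_set_group() and the SKETCH_ constants are hypothetical. For brevity the sketch reports a single "nothing set" result where the library distinguishes between groups that have no slot in the ovector and groups that are unset.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Stand-in for PCRE2_UNSET so that the sketch compiles without pcre2.h. */
#define SKETCH_UNSET (~(size_t)0)

/* Hypothetical result codes; the library uses PCRE2_ERROR_ values. */
enum { SKETCH_NOSUBSTRING = -1, SKETCH_NOTHING_SET = -2 };

/* One name-table entry, already decoded (hypothetical representation). */
struct named_group { const char *name; int number; };

/* Example: two groups share the name "dn"; only group 2 took part. */
static const struct named_group example_groups[] = { {"dn", 1}, {"dn", 2} };
static const size_t example_dup_ovector[] = {
  0, 3, SKETCH_UNSET, SKETCH_UNSET, 0, 3
};

/* Return the number of the first group with the given name that is set. */
static int first_set_group(const struct named_group *groups, size_t count,
                           const char *name, const size_t *ovector,
                           size_t pairs)
{
  int seen = 0;
  assert(pairs >= 1);  /* an ovector always has at least one pair */
  for (size_t i = 0; i < count; i++)
    {
    if (strcmp(groups[i].name, name) != 0) continue;
    seen = 1;
    size_t n = (size_t)groups[i].number;
    /* Use the group only if it has a slot and that slot is set. */
    if (n < pairs && ovector[2*n] != SKETCH_UNSET) return groups[i].number;
    }
  return seen ? SKETCH_NOTHING_SET : SKETCH_NOSUBSTRING;
}
```

For the example data, a lookup of "dn" passes over group 1, which is unset, and yields group 2. Entries are visited in table order, which for duplicate names is the order in which the groups appear in the pattern.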
If there are no groups with the given name, PCRE2_ERROR_NOSUBSTRING is
|
If there are no groups with the given name, PCRE2_ERROR_NOSUBSTRING is
|
||||||
returned. If all groups with the name have numbers that are greater than the
|
returned. If all groups with the name have numbers that are greater than the
|
||||||
|
@ -3216,18 +3214,18 @@ is at least one group with a slot in the ovector, but no group is found to be
|
||||||
set, PCRE2_ERROR_UNSET is returned.
|
set, PCRE2_ERROR_UNSET is returned.
|
||||||
.P
|
.P
|
||||||
\fBWarning:\fP If the pattern uses the (?| feature to set up multiple
|
\fBWarning:\fP If the pattern uses the (?| feature to set up multiple
|
||||||
subpatterns with the same number, as described in the
|
capture groups with the same number, as described in the
|
||||||
.\" HTML <a href="pcre2pattern.html#dupsubpatternnumber">
|
.\" HTML <a href="pcre2pattern.html#dupgroupnumber">
|
||||||
.\" </a>
|
.\" </a>
|
||||||
section on duplicate subpattern numbers
|
section on duplicate group numbers
|
||||||
.\"
|
.\"
|
||||||
in the
|
in the
|
||||||
.\" HREF
|
.\" HREF
|
||||||
\fBpcre2pattern\fP
|
\fBpcre2pattern\fP
|
||||||
.\"
|
.\"
|
||||||
page, you cannot use names to distinguish the different subpatterns, because
|
page, you cannot use names to distinguish the different capture groups, because
|
||||||
names are not included in the compiled code. The matching process uses only
|
names are not included in the compiled code. The matching process uses only
|
||||||
numbers. For this reason, the use of different names for subpatterns of the
|
numbers. For this reason, the use of different names for groups with the
|
||||||
same number causes an error at compile time.
|
same number causes an error at compile time.
|
||||||
.
|
.
|
||||||
.
|
.
|
||||||
|
@ -3288,7 +3286,7 @@ length is in code units, not bytes.
|
||||||
In the replacement string, which is interpreted as a UTF string in UTF mode,
|
In the replacement string, which is interpreted as a UTF string in UTF mode,
|
||||||
and is checked for UTF validity unless the PCRE2_NO_UTF_CHECK option is set, a
|
and is checked for UTF validity unless the PCRE2_NO_UTF_CHECK option is set, a
|
||||||
dollar character is an escape character that can specify the insertion of
|
dollar character is an escape character that can specify the insertion of
|
||||||
characters from capturing groups or names from (*MARK) or other control verbs
|
characters from capture groups or names from (*MARK) or other control verbs
|
||||||
in the pattern. The following forms are always recognized:
|
in the pattern. The following forms are always recognized:
|
||||||
.sp
|
.sp
|
||||||
$$ insert a dollar character
|
$$ insert a dollar character
|
||||||
|
@ -3351,12 +3349,12 @@ operation is carried out twice. Depending on the application, it may be more
|
||||||
efficient to allocate a large buffer and free the excess afterwards, instead of
|
efficient to allocate a large buffer and free the excess afterwards, instead of
|
||||||
using PCRE2_SUBSTITUTE_OVERFLOW_LENGTH.
|
using PCRE2_SUBSTITUTE_OVERFLOW_LENGTH.
|
||||||
.P
|
.P
|
||||||
PCRE2_SUBSTITUTE_UNKNOWN_UNSET causes references to capturing groups that do
|
PCRE2_SUBSTITUTE_UNKNOWN_UNSET causes references to capture groups that do
|
||||||
not appear in the pattern to be treated as unset groups. This option should be
|
not appear in the pattern to be treated as unset groups. This option should be
|
||||||
used with care, because it means that a typo in a group name or number no
|
used with care, because it means that a typo in a group name or number no
|
||||||
longer causes the PCRE2_ERROR_NOSUBSTRING error.
|
longer causes the PCRE2_ERROR_NOSUBSTRING error.
|
||||||
.P
|
.P
|
||||||
PCRE2_SUBSTITUTE_UNSET_EMPTY causes unset capturing groups (including unknown
|
PCRE2_SUBSTITUTE_UNSET_EMPTY causes unset capture groups (including unknown
|
||||||
groups when PCRE2_SUBSTITUTE_UNKNOWN_UNSET is set) to be treated as empty
|
groups when PCRE2_SUBSTITUTE_UNKNOWN_UNSET is set) to be treated as empty
|
||||||
strings when inserted as described above. If this option is not set, an attempt
|
strings when inserted as described above. If this option is not set, an attempt
|
||||||
to insert an unset group causes the PCRE2_ERROR_UNSET error. This option does
|
to insert an unset group causes the PCRE2_ERROR_UNSET error. This option does
|
||||||
|
@ -3381,14 +3379,15 @@ terminating a \eQ quoted sequence) reverts to no case forcing. The sequences
|
||||||
\eu and \el force the next character (if it is a letter) to upper or lower
|
\eu and \el force the next character (if it is a letter) to upper or lower
|
||||||
case, respectively, and then the state automatically reverts to no case
|
case, respectively, and then the state automatically reverts to no case
|
||||||
forcing. Case forcing applies to all inserted characters, including those from
|
forcing. Case forcing applies to all inserted characters, including those from
|
||||||
captured groups and letters within \eQ...\eE quoted sequences.
|
capture groups and letters within \eQ...\eE quoted sequences.
|
||||||
.P
|
.P
|
||||||
Note that case forcing sequences such as \eU...\eE do not nest. For example,
|
Note that case forcing sequences such as \eU...\eE do not nest. For example,
|
||||||
the result of processing "\eUaa\eLBB\eEcc\eE" is "AAbbcc"; the final \eE has no
|
the result of processing "\eUaa\eLBB\eEcc\eE" is "AAbbcc"; the final \eE has no
|
||||||
effect.
|
effect.
|
||||||
.P
|
.P
|
||||||
The second effect of setting PCRE2_SUBSTITUTE_EXTENDED is to add more
|
The second effect of setting PCRE2_SUBSTITUTE_EXTENDED is to add more
|
||||||
flexibility to group substitution. The syntax is similar to that used by Bash:
|
flexibility to capture group substitution. The syntax is similar to that used
|
||||||
|
by Bash:
|
||||||
.sp
|
.sp
|
||||||
${<n>:-<string>}
|
${<n>:-<string>}
|
||||||
${<n>:+<string1>:<string2>}
|
${<n>:+<string1>:<string2>}
|
||||||
|
@ -3510,7 +3509,7 @@ output and the call to \fBpcre2_substitute()\fP exits, returning the number of
|
||||||
matches so far.
|
matches so far.
|
||||||
.
|
.
|
||||||
.
|
.
|
||||||
.SH "DUPLICATE SUBPATTERN NAMES"
|
.SH "DUPLICATE CAPTURE GROUP NAMES"
|
||||||
.rs
|
.rs
|
||||||
.sp
|
.sp
|
||||||
.nf
|
.nf
|
||||||
|
@ -3518,13 +3517,14 @@ matches so far.
|
||||||
.B " PCRE2_SPTR \fIname\fP, PCRE2_SPTR *\fIfirst\fP, PCRE2_SPTR *\fIlast\fP);"
|
.B " PCRE2_SPTR \fIname\fP, PCRE2_SPTR *\fIfirst\fP, PCRE2_SPTR *\fIlast\fP);"
|
||||||
.fi
|
.fi
|
||||||
.P
|
.P
|
||||||
When a pattern is compiled with the PCRE2_DUPNAMES option, names for
|
When a pattern is compiled with the PCRE2_DUPNAMES option, names for capture
|
||||||
subpatterns are not required to be unique. Duplicate names are always allowed
|
groups are not required to be unique. Duplicate names are always allowed for
|
||||||
for subpatterns with the same number, created by using the (?| feature. Indeed,
|
groups with the same number, created by using the (?| feature. Indeed, if such
|
||||||
if such subpatterns are named, they are required to use the same names.
|
groups are named, they are required to use the same names.
|
||||||
.P
|
.P
|
||||||
Normally, patterns with duplicate names are such that in any one match, only
|
Normally, patterns that use duplicate names are such that in any one match,
|
||||||
one of the named subpatterns participates. An example is shown in the
|
only one of each set of identically-named groups participates. An example is
|
||||||
|
shown in the
|
||||||
.\" HREF
|
.\" HREF
|
||||||
\fBpcre2pattern\fP
|
\fBpcre2pattern\fP
|
||||||
.\"
|
.\"
|
||||||
|
@ -3705,9 +3705,8 @@ the three matched strings are
|
||||||
On success, the yield of the function is a number greater than zero, which is
|
On success, the yield of the function is a number greater than zero, which is
|
||||||
the number of matched substrings. The offsets of the substrings are returned in
|
the number of matched substrings. The offsets of the substrings are returned in
|
||||||
the ovector, and can be extracted by number in the same way as for
|
the ovector, and can be extracted by number in the same way as for
|
||||||
\fBpcre2_match()\fP, but the numbers bear no relation to any capturing groups
|
\fBpcre2_match()\fP, but the numbers bear no relation to any capture groups
|
||||||
that may exist in the pattern, because DFA matching does not support group
|
that may exist in the pattern, because DFA matching does not support capturing.
|
||||||
capture.
|
|
||||||
.P
|
.P
|
||||||
Calls to the convenience functions that extract substrings by name
|
Calls to the convenience functions that extract substrings by name
|
||||||
return the error PCRE2_ERROR_DFA_UFUNC (unsupported function) if used after a
|
return the error PCRE2_ERROR_DFA_UFUNC (unsupported function) if used after a
|
||||||
|
@ -3749,7 +3748,7 @@ a backreference.
|
||||||
.sp
|
.sp
|
||||||
This return is given if \fBpcre2_dfa_match()\fP encounters a condition item
|
This return is given if \fBpcre2_dfa_match()\fP encounters a condition item
|
||||||
that uses a backreference for the condition, or a test for recursion in a
|
that uses a backreference for the condition, or a test for recursion in a
|
||||||
specific group. These are not supported.
|
specific capture group. These are not supported.
|
||||||
.sp
|
.sp
|
||||||
PCRE2_ERROR_DFA_WSSIZE
|
PCRE2_ERROR_DFA_WSSIZE
|
||||||
.sp
|
.sp
|
||||||
|
@ -3758,9 +3757,9 @@ This return is given if \fBpcre2_dfa_match()\fP runs out of space in the
|
||||||
.sp
|
.sp
|
||||||
PCRE2_ERROR_DFA_RECURSE
|
PCRE2_ERROR_DFA_RECURSE
|
||||||
.sp
|
.sp
|
||||||
When a recursive subpattern is processed, the matching function calls itself
|
When a recursion or subroutine call is processed, the matching function calls
|
||||||
recursively, using private memory for the ovector and \fIworkspace\fP. This
|
itself recursively, using private memory for the ovector and \fIworkspace\fP.
|
||||||
error is given if the internal ovector is not large enough. This should be
|
This error is given if the internal ovector is not large enough. This should be
|
||||||
extremely rare, as a vector of size 1000 is used.
|
extremely rare, as a vector of size 1000 is used.
|
||||||
.sp
|
.sp
|
||||||
PCRE2_ERROR_DFA_BADRESTART
|
PCRE2_ERROR_DFA_BADRESTART
|
||||||
|
@ -3793,6 +3792,6 @@ Cambridge, England.
|
||||||
.rs
|
.rs
|
||||||
.sp
|
.sp
|
||||||
.nf
|
.nf
|
||||||
Last updated: 04 January 2019
|
Last updated: 04 February 2019
|
||||||
Copyright (c) 1997-2019 University of Cambridge.
|
Copyright (c) 1997-2019 University of Cambridge.
|
||||||
.fi
|
.fi
|
||||||
|
|
|
@ -1,4 +1,4 @@
|
||||||
.TH PCRE2CALLOUT 3 "17 September 2018" "PCRE2 10.33"
|
.TH PCRE2CALLOUT 3 "03 February 2019" "PCRE2 10.33"
|
||||||
.SH NAME
|
.SH NAME
|
||||||
PCRE2 - Perl-compatible regular expressions (revised API)
|
PCRE2 - Perl-compatible regular expressions (revised API)
|
||||||
.SH SYNOPSIS
|
.SH SYNOPSIS
|
||||||
|
@ -137,7 +137,7 @@ start only after an internal newline or at the beginning of the subject, and
|
||||||
branch, automatic anchoring occurs if all branches are anchorable.
|
branch, automatic anchoring occurs if all branches are anchorable.
|
||||||
.P
|
.P
|
||||||
This optimization is disabled, however, if .* is in an atomic group or if there
|
This optimization is disabled, however, if .* is in an atomic group or if there
|
||||||
is a backreference to the capturing group in which it appears. It is also
|
is a backreference to the capture group in which it appears. It is also
|
||||||
disabled if the pattern contains (*PRUNE) or (*SKIP). However, the presence of
|
disabled if the pattern contains (*PRUNE) or (*SKIP). However, the presence of
|
||||||
callouts does not affect it.
|
callouts does not affect it.
|
||||||
.P
|
.P
|
||||||
|
@ -331,8 +331,8 @@ callout before an assertion such as (?=ab) the length is 3. For an an
|
||||||
alternation bar or a closing parenthesis, the length is one, unless a closing
|
alternation bar or a closing parenthesis, the length is one, unless a closing
|
||||||
parenthesis is followed by a quantifier, in which case its length is included.
|
parenthesis is followed by a quantifier, in which case its length is included.
|
||||||
(This changed in release 10.23. In earlier releases, before an opening
|
(This changed in release 10.23. In earlier releases, before an opening
|
||||||
parenthesis the length was that of the entire subpattern, and before an
|
parenthesis the length was that of the entire group, and before an alternation
|
||||||
alternation bar or a closing parenthesis the length was zero.)
|
bar or a closing parenthesis the length was zero.)
|
||||||
.P
|
.P
|
||||||
The \fIpattern_position\fP and \fInext_item_length\fP fields are intended to
|
The \fIpattern_position\fP and \fInext_item_length\fP fields are intended to
|
||||||
help in distinguishing between different automatic callouts, which all have the
|
help in distinguishing between different automatic callouts, which all have the
|
||||||
|
@ -452,6 +452,6 @@ Cambridge, England.
|
||||||
.rs
|
.rs
|
||||||
.sp
|
.sp
|
||||||
.nf
|
.nf
|
||||||
Last updated: 17 September 2018
|
Last updated: 03 February 2019
|
||||||
Copyright (c) 1997-2018 University of Cambridge.
|
Copyright (c) 1997-2019 University of Cambridge.
|
||||||
.fi
|
.fi
|
||||||
|
|
|
@ -1,4 +1,4 @@
|
||||||
.TH PCRE2COMPAT 3 "28 July 2018" "PCRE2 10.32"
|
.TH PCRE2COMPAT 3 "03 February 2019" "PCRE2 10.33"
|
||||||
.SH NAME
|
.SH NAME
|
||||||
PCRE2 - Perl-compatible regular expressions (revised API)
|
PCRE2 - Perl-compatible regular expressions (revised API)
|
||||||
.SH "DIFFERENCES BETWEEN PCRE2 AND PERL"
|
.SH "DIFFERENCES BETWEEN PCRE2 AND PERL"
|
||||||
|
@ -23,10 +23,9 @@ character is not "a" three times (in principle; PCRE2 optimizes this to run the
|
||||||
assertion just once). Perl allows some repeat quantifiers on other assertions,
|
assertion just once). Perl allows some repeat quantifiers on other assertions,
|
||||||
for example, \eb* (but not \eb{3}), but these do not seem to have any use.
|
for example, \eb* (but not \eb{3}), but these do not seem to have any use.
|
||||||
.P
|
.P
|
||||||
3. Capturing subpatterns that occur inside negative lookaround assertions are
|
3. Capture groups that occur inside negative lookaround assertions are counted,
|
||||||
counted, but their entries in the offsets vector are set only when a negative
|
but their entries in the offsets vector are set only when a negative assertion
|
||||||
assertion is a condition that has a matching branch (that is, the condition is
|
is a condition that has a matching branch (that is, the condition is false).
|
||||||
false).
|
|
||||||
.P
|
.P
|
||||||
4. The following Perl escape sequences are not supported: \eF, \el, \eL, \eu,
|
4. The following Perl escape sequences are not supported: \eF, \el, \eL, \eu,
|
||||||
\eU, and \eN when followed by a character name. \eN on its own, matching a
|
\eU, and \eN when followed by a character name. \eN on its own, matching a
|
||||||
|
@ -79,13 +78,13 @@ documentation for details.
|
||||||
to PCRE2 release 10.23, but from release 10.30 this changed, and backtracking
|
to PCRE2 release 10.23, but from release 10.30 this changed, and backtracking
|
||||||
into subroutine calls is now supported, as in Perl.
|
into subroutine calls is now supported, as in Perl.
|
||||||
.P
|
.P
|
||||||
9. If any of the backtracking control verbs are used in a subpattern that is
|
9. If any of the backtracking control verbs are used in a group that is called
|
||||||
called as a subroutine (whether or not recursively), their effect is confined
|
as a subroutine (whether or not recursively), their effect is confined to that
|
||||||
to that subpattern; it does not extend to the surrounding pattern. This is not
|
group; it does not extend to the surrounding pattern. This is not always the
|
||||||
always the case in Perl. In particular, if (*THEN) is present in a group that
|
case in Perl. In particular, if (*THEN) is present in a group that is called as
|
||||||
is called as a subroutine, its action is limited to that group, even if the
|
a subroutine, its action is limited to that group, even if the group does not
|
||||||
group does not contain any | characters. Note that such subpatterns are
|
contain any | characters. Note that such groups are processed as anchored
|
||||||
processed as anchored at the point where they are tested.
|
at the point where they are tested.
|
||||||
.P
|
.P
|
||||||
10. If a pattern contains more than one backtracking control verb, the first
|
10. If a pattern contains more than one backtracking control verb, the first
|
||||||
one that is backtracked onto acts. For example, in the pattern
|
one that is backtracked onto acts. For example, in the pattern
|
||||||
|
@ -101,21 +100,20 @@ strings when part of a pattern is repeated. For example, matching "aba" against
|
||||||
the pattern /^(a(b)?)+$/ in Perl leaves $2 unset, but in PCRE2 it is set to
|
the pattern /^(a(b)?)+$/ in Perl leaves $2 unset, but in PCRE2 it is set to
|
||||||
"b".
|
"b".
|
||||||
.P
|
.P
|
||||||
13. PCRE2's handling of duplicate subpattern numbers and duplicate subpattern
|
13. PCRE2's handling of duplicate capture group numbers and names is not as
|
||||||
names is not as general as Perl's. This is a consequence of the fact the PCRE2
|
general as Perl's. This is a consequence of the fact the PCRE2 works internally
|
||||||
works internally just with numbers, using an external table to translate
|
just with numbers, using an external table to translate between numbers and
|
||||||
between numbers and names. In particular, a pattern such as (?|(?<a>A)|(?<b>B),
|
names. In particular, a pattern such as (?|(?<a>A)|(?<b>B), where the two
|
||||||
where the two capturing parentheses have the same number but different names,
|
capture groups have the same number but different names, is not supported, and
|
||||||
is not supported, and causes an error at compile time. If it were allowed, it
|
causes an error at compile time. If it were allowed, it would not be possible
|
||||||
would not be possible to distinguish which parentheses matched, because both
|
to distinguish which group matched, because both names map to capture group
|
||||||
names map to capturing subpattern number 1. To avoid this confusing situation,
|
number 1. To avoid this confusing situation, an error is given at compile time.
|
||||||
an error is given at compile time.
|
|
||||||
.P
|
.P
|
||||||
14. Perl used to recognize comments in some places that PCRE2 does not, for
|
14. Perl used to recognize comments in some places that PCRE2 does not, for
|
||||||
example, between the ( and ? at the start of a subpattern. If the /x modifier
|
example, between the ( and ? at the start of a group. If the /x modifier is
|
||||||
is set, Perl allowed white space between ( and ? though the latest Perls give
|
set, Perl allowed white space between ( and ? though the latest Perls give an
|
||||||
an error (for a while it was just deprecated). There may still be some cases
|
error (for a while it was just deprecated). There may still be some cases where
|
||||||
where Perl behaves differently.
|
Perl behaves differently.
|
||||||
.P
|
.P
|
||||||
15. Perl, when in warning mode, gives warnings for character classes such as
|
15. Perl, when in warning mode, gives warnings for character classes such as
|
||||||
[A-\ed] or [a-[:digit:]]. It then treats the hyphens as literals. PCRE2 has no
|
[A-\ed] or [a-[:digit:]]. It then treats the hyphens as literals. PCRE2 has no
|
||||||
|
@ -200,6 +198,6 @@ Cambridge, England.
|
||||||
.rs
|
.rs
|
||||||
.sp
|
.sp
|
||||||
.nf
|
.nf
|
||||||
Last updated: 28 July 2018
|
Last updated: 03 February 2019
|
||||||
Copyright (c) 1997-2018 University of Cambridge.
|
Copyright (c) 1997-2019 University of Cambridge.
|
||||||
.fi
|
.fi
|
||||||
|
|
|
@ -1,4 +1,4 @@
|
||||||
.TH PCRE2LIMITS 3 "30 March 2017" "PCRE2 10.30"
|
.TH PCRE2LIMITS 3 "03 February 2019" "PCRE2 10.33"
|
||||||
.SH NAME
|
.SH NAME
|
||||||
PCRE2 - Perl-compatible regular expressions (revised API)
|
PCRE2 - Perl-compatible regular expressions (revised API)
|
||||||
.SH "SIZE AND OTHER LIMITATIONS"
|
.SH "SIZE AND OTHER LIMITATIONS"
|
||||||
|
@ -34,16 +34,16 @@ All values in repeating quantifiers must be less than 65536.
|
||||||
.P
|
.P
|
||||||
The maximum length of a lookbehind assertion is 65535 characters.
|
The maximum length of a lookbehind assertion is 65535 characters.
|
||||||
.P
|
.P
|
||||||
There is no limit to the number of parenthesized subpatterns, but there can be
|
There is no limit to the number of parenthesized groups, but there can be no
|
||||||
no more than 65535 capturing subpatterns. There is, however, a limit to the
|
more than 65535 capture groups, and there is a limit to the depth of nesting of
|
||||||
depth of nesting of parenthesized subpatterns of all kinds. This is imposed in
|
parenthesized subpatterns of all kinds. This is imposed in order to limit the
|
||||||
order to limit the amount of system stack used at compile time. The default
|
amount of system stack used at compile time. The default limit can be specified
|
||||||
limit can be specified when PCRE2 is built; if not, the default is set to 250.
|
when PCRE2 is built; if not, the default is set to 250. An application can
|
||||||
An application can change this limit by calling pcre2_set_parens_nest_limit()
|
change this limit by calling pcre2_set_parens_nest_limit() to set the limit in
|
||||||
to set the limit in a compile context.
|
a compile context.
|
||||||
.P
|
.P
|
||||||
The maximum length of name for a named subpattern is 32 code units, and the
|
The maximum length of name for a named capture group is 32 code units, and the
|
||||||
maximum number of named subpatterns is 10000.
|
maximum number of such groups is 10000.
|
||||||
.P
|
.P
|
||||||
The maximum length of a name in a (*MARK), (*PRUNE), (*SKIP), or (*THEN) verb
|
The maximum length of a name in a (*MARK), (*PRUNE), (*SKIP), or (*THEN) verb
|
||||||
is 255 code units for the 8-bit library and 65535 code units for the 16-bit and
|
is 255 code units for the 8-bit library and 65535 code units for the 16-bit and
|
||||||
|
@ -67,6 +67,6 @@ Cambridge, England.
|
||||||
.rs
|
.rs
|
||||||
.sp
|
.sp
|
||||||
.nf
|
.nf
|
||||||
Last updated: 30 March 2017
|
Last updated: 02 February 2019
|
||||||
Copyright (c) 1997-2017 University of Cambridge.
|
Copyright (c) 1997-2019 University of Cambridge.
|
||||||
.fi
|
.fi
|
||||||
|
|
File diff suppressed because it is too large
|
@ -1,4 +1,4 @@
|
||||||
.TH PCRE2PERFORM 3 "25 April 2018" "PCRE2 10.32"
|
.TH PCRE2PERFORM 3 "03 February 2019" "PCRE2 10.33"
|
||||||
.SH NAME
|
.SH NAME
|
||||||
PCRE2 - Perl-compatible regular expressions (revised API)
|
PCRE2 - Perl-compatible regular expressions (revised API)
|
||||||
.SH "PCRE2 PERFORMANCE"
|
.SH "PCRE2 PERFORMANCE"
|
||||||
|
@ -14,9 +14,9 @@ of them.
|
||||||
Patterns are compiled by PCRE2 into a reasonably efficient interpretive code,
|
Patterns are compiled by PCRE2 into a reasonably efficient interpretive code,
|
||||||
so that most simple patterns do not use much memory for storing the compiled
|
so that most simple patterns do not use much memory for storing the compiled
|
||||||
version. However, there is one case where the memory usage of a compiled
|
version. However, there is one case where the memory usage of a compiled
|
||||||
pattern can be unexpectedly large. If a parenthesized subpattern has a
|
pattern can be unexpectedly large. If a parenthesized group has a quantifier
|
||||||
quantifier with a minimum greater than 1 and/or a limited maximum, the whole
|
with a minimum greater than 1 and/or a limited maximum, the whole group is
|
||||||
subpattern is repeated in the compiled code. For example, the pattern
|
repeated in the compiled code. For example, the pattern
|
||||||
.sp
|
.sp
|
||||||
(abc|def){2,4}
|
(abc|def){2,4}
|
||||||
.sp
|
.sp
|
||||||
|
@ -239,6 +239,6 @@ Cambridge, England.
|
||||||
.rs
|
.rs
|
||||||
.sp
|
.sp
|
||||||
.nf
|
.nf
|
||||||
Last updated: 25 April 2018
|
Last updated: 03 February 2019
|
||||||
Copyright (c) 1997-2018 University of Cambridge.
|
Copyright (c) 1997-2019 University of Cambridge.
|
||||||
.fi
|
.fi
|
||||||
|
|
|
@ -1,4 +1,4 @@
|
||||||
.TH PCRE2SYNTAX 3 "10 October 2018" "PCRE2 10.33"
|
.TH PCRE2SYNTAX 3 "03 February 2019" "PCRE2 10.33"
|
||||||
.SH NAME
|
.SH NAME
|
||||||
PCRE2 - Perl-compatible regular expressions (revised API)
|
PCRE2 - Perl-compatible regular expressions (revised API)
|
||||||
.SH "PCRE2 REGULAR EXPRESSION SYNTAX SUMMARY"
|
.SH "PCRE2 REGULAR EXPRESSION SYNTAX SUMMARY"
|
||||||
|
@ -398,20 +398,24 @@ but some of them use Unicode properties if PCRE2_UCP is set. You can use
|
||||||
.SH "CAPTURING"
|
.SH "CAPTURING"
|
||||||
.rs
|
.rs
|
||||||
.sp
|
.sp
|
||||||
(...) capturing group
|
(...) capture group
|
||||||
(?<name>...) named capturing group (Perl)
|
(?<name>...) named capture group (Perl)
|
||||||
(?'name'...) named capturing group (Perl)
|
(?'name'...) named capture group (Perl)
|
||||||
(?P<name>...) named capturing group (Python)
|
(?P<name>...) named capture group (Python)
|
||||||
(?:...) non-capturing group
|
(?:...) non-capture group
|
||||||
(?|...) non-capturing group; reset group numbers for
|
(?|...) non-capture group; reset group numbers for
|
||||||
capturing groups in each alternative
|
capture groups in each alternative
|
||||||
|
.sp
|
||||||
|
In non-UTF modes, names may contain underscores and ASCII letters and digits;
|
||||||
|
in UTF modes, any Unicode letters and Unicode decimal digits are permitted. In
|
||||||
|
both cases, a name must not start with a digit.
|
||||||
.
|
.
|
||||||
.
|
.
|
||||||
.SH "ATOMIC GROUPS"
|
.SH "ATOMIC GROUPS"
|
||||||
.rs
|
.rs
|
||||||
.sp
|
.sp
|
||||||
(?>...) atomic, non-capturing group
|
(?>...) atomic non-capture group
|
||||||
(*atomic:...) atomic, non-capturing group
|
(*atomic:...) atomic non-capture group
|
||||||
.
|
.
|
||||||
.
|
.
|
||||||
.SH "COMMENT"
|
.SH "COMMENT"
|
||||||
|
@ -439,7 +443,7 @@ of the group.
|
||||||
Unsetting x or xx unsets both. Several options may be set at once, and a
|
Unsetting x or xx unsets both. Several options may be set at once, and a
|
||||||
mixture of setting and unsetting such as (?i-x) is allowed, but there may be
|
mixture of setting and unsetting such as (?i-x) is allowed, but there may be
|
||||||
only one hyphen. Setting (but no unsetting) is allowed after (?^ for example
|
only one hyphen. Setting (but no unsetting) is allowed after (?^ for example
|
||||||
(?^in). An option setting may appear at the start of a non-capturing group, for
|
(?^in). An option setting may appear at the start of a non-capture group, for
|
||||||
example (?i:...).
|
example (?i:...).
|
||||||
.P
|
.P
|
||||||
The following are recognized only at the very start of a pattern or after one
|
The following are recognized only at the very start of a pattern or after one
|
||||||
|
@ -542,19 +546,19 @@ Each top-level branch of a lookbehind must be of a fixed length.
|
||||||
.rs
|
.rs
|
||||||
.sp
|
.sp
|
||||||
(?R) recurse whole pattern
|
(?R) recurse whole pattern
|
||||||
(?n) call subpattern by absolute number
|
(?n) call subroutine by absolute number
|
||||||
(?+n) call subpattern by relative number
|
(?+n) call subroutine by relative number
|
||||||
(?-n) call subpattern by relative number
|
(?-n) call subroutine by relative number
|
||||||
(?&name) call subpattern by name (Perl)
|
(?&name) call subroutine by name (Perl)
|
||||||
(?P>name) call subpattern by name (Python)
|
(?P>name) call subroutine by name (Python)
|
||||||
\eg<name> call subpattern by name (Oniguruma)
|
\eg<name> call subroutine by name (Oniguruma)
|
||||||
\eg'name' call subpattern by name (Oniguruma)
|
\eg'name' call subroutine by name (Oniguruma)
|
||||||
\eg<n> call subpattern by absolute number (Oniguruma)
|
\eg<n> call subroutine by absolute number (Oniguruma)
|
||||||
\eg'n' call subpattern by absolute number (Oniguruma)
|
\eg'n' call subroutine by absolute number (Oniguruma)
|
||||||
\eg<+n> call subpattern by relative number (PCRE2 extension)
|
\eg<+n> call subroutine by relative number (PCRE2 extension)
|
||||||
\eg'+n' call subpattern by relative number (PCRE2 extension)
|
\eg'+n' call subroutine by relative number (PCRE2 extension)
|
||||||
\eg<-n> call subpattern by relative number (PCRE2 extension)
|
\eg<-n> call subroutine by relative number (PCRE2 extension)
|
||||||
\eg'-n' call subpattern by relative number (PCRE2 extension)
|
\eg'-n' call subroutine by relative number (PCRE2 extension)
|
||||||
.
|
.
|
||||||
.
|
.
|
||||||
.SH "CONDITIONAL PATTERNS"
|
.SH "CONDITIONAL PATTERNS"
|
||||||
|
@ -572,7 +576,7 @@ Each top-level branch of a lookbehind must be of a fixed length.
|
||||||
(?(R) overall recursion condition
|
(?(R) overall recursion condition
|
||||||
(?(Rn) specific numbered group recursion condition
|
(?(Rn) specific numbered group recursion condition
|
||||||
(?(R&name) specific named group recursion condition
|
(?(R&name) specific named group recursion condition
|
||||||
(?(DEFINE) define subpattern for reference
|
(?(DEFINE) define groups for reference
|
||||||
(?(VERSION[>]=n.m) test PCRE2 version
|
(?(VERSION[>]=n.m) test PCRE2 version
|
||||||
(?(assert) assertion condition
|
(?(assert) assertion condition
|
||||||
.sp
|
.sp
|
||||||
|
@ -643,6 +647,6 @@ Cambridge, England.
|
||||||
.rs
|
.rs
|
||||||
.sp
|
.sp
|
||||||
.nf
|
.nf
|
||||||
Last updated: 10 October 2018
|
Last updated: 03 February 2019
|
||||||
Copyright (c) 1997-2018 University of Cambridge.
|
Copyright (c) 1997-2019 University of Cambridge.
|
||||||
.fi
|
.fi
|
||||||
|
|
|
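Reviewer note (not part of the commit): the non-UTF naming rule added to the CAPTURING summary above can be sketched as a small standalone check. This is only an illustration under stated assumptions. `sketch_valid_group_name` and `SKETCH_MAX_NAME_SIZE` are hypothetical names, not PCRE2 identifiers. The 32-code-unit limit is taken from the pcre2limits text in this diff. The sketch also assumes an ASCII environment and default character tables, whereas the real code consults `cb->ctypes`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumed limit: pcre2limits gives 32 code units as the maximum name length. */
#define SKETCH_MAX_NAME_SIZE 32

/* Explicit ranges rather than <ctype.h>, so the result does not depend on
   the current locale. Assumes an ASCII-compatible character set. */
static bool is_ascii_digit(unsigned char c)
{
  return c >= '0' && c <= '9';
}

static bool is_ascii_word(unsigned char c)
{
  return is_ascii_digit(c) ||
         (c >= 'a' && c <= 'z') ||
         (c >= 'A' && c <= 'Z') ||
         c == '_';
}

/* Hypothetical helper, not a PCRE2 function: returns true if name[0..len-1]
   would be acceptable as a capture group name in a non-UTF mode. */
bool sketch_valid_group_name(const char *name, size_t len)
{
  if (len == 0 || len > SKETCH_MAX_NAME_SIZE) return false;  /* empty or too long */
  if (is_ascii_digit((unsigned char)name[0])) return false;  /* leading digit */
  for (size_t i = 0; i < len; i++)
    if (!is_ascii_word((unsigned char)name[i])) return false;
  return true;
}

/* A few illustrative cases. */
void sketch_self_check(void)
{
  assert(sketch_valid_group_name("year", 4));
  assert(sketch_valid_group_name("_g1", 3));
  assert(!sketch_valid_group_name("1st", 3));
}
```

In a UTF mode the commit widens the character test to any Unicode letter or decimal digit, which this sketch does not attempt to model.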
@ -1,4 +1,4 @@
|
||||||
.TH PCRE2TEST 1 "12 November 2018" "PCRE 10.33"
|
.TH PCRE2TEST 1 "03 February 2019" "PCRE 10.33"
|
||||||
.SH NAME
|
.SH NAME
|
||||||
pcre2test - a program for testing Perl-compatible regular expressions.
|
pcre2test - a program for testing Perl-compatible regular expressions.
|
||||||
.SH SYNOPSIS
|
.SH SYNOPSIS
|
||||||
|
@ -672,14 +672,14 @@ information is obtained from the \fBpcre2_pattern_info()\fP function. Here are
|
||||||
some typical examples:
|
some typical examples:
|
||||||
.sp
|
.sp
|
||||||
re> /(?i)(^a|^b)/m,info
|
re> /(?i)(^a|^b)/m,info
|
||||||
Capturing subpattern count = 1
|
Capture group count = 1
|
||||||
Compile options: multiline
|
Compile options: multiline
|
||||||
Overall options: caseless multiline
|
Overall options: caseless multiline
|
||||||
First code unit at start or follows newline
|
First code unit at start or follows newline
|
||||||
Subject length lower bound = 1
|
Subject length lower bound = 1
|
||||||
.sp
|
.sp
|
||||||
re> /(?i)abc/info
|
re> /(?i)abc/info
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Compile options: <none>
|
Compile options: <none>
|
||||||
Overall options: caseless
|
Overall options: caseless
|
||||||
First code unit = 'a' (caseless)
|
First code unit = 'a' (caseless)
|
||||||
|
@ -1325,8 +1325,8 @@ current character is CR followed by LF, an advance of two characters occurs.
|
||||||
.sp
|
.sp
|
||||||
The \fBcopy\fP and \fBget\fP modifiers can be used to test the
|
The \fBcopy\fP and \fBget\fP modifiers can be used to test the
|
||||||
\fBpcre2_substring_copy_xxx()\fP and \fBpcre2_substring_get_xxx()\fP functions.
|
\fBpcre2_substring_copy_xxx()\fP and \fBpcre2_substring_get_xxx()\fP functions.
|
||||||
They can be given more than once, and each can specify a group name or number,
|
They can be given more than once, and each can specify a capture group name or
|
||||||
for example:
|
number, for example:
|
||||||
.sp
|
.sp
|
||||||
abcd\e=copy=1,copy=3,get=G1
|
abcd\e=copy=1,copy=3,get=G1
|
||||||
.sp
|
.sp
|
||||||
|
@ -2056,6 +2056,6 @@ Cambridge, England.
|
||||||
.rs
|
.rs
|
||||||
.sp
|
.sp
|
||||||
.nf
|
.nf
|
||||||
Last updated: 12 November 2018
|
Last updated: 03 February 2019
|
||||||
Copyright (c) 1997-2018 University of Cambridge.
|
Copyright (c) 1997-2019 University of Cambridge.
|
||||||
.fi
|
.fi
|
||||||
|
|
|
@ -646,14 +646,14 @@ PATTERN MODIFIERS
|
||||||
are some typical examples:
|
are some typical examples:
|
||||||
|
|
||||||
re> /(?i)(^a|^b)/m,info
|
re> /(?i)(^a|^b)/m,info
|
||||||
Capturing subpattern count = 1
|
Capture group count = 1
|
||||||
Compile options: multiline
|
Compile options: multiline
|
||||||
Overall options: caseless multiline
|
Overall options: caseless multiline
|
||||||
First code unit at start or follows newline
|
First code unit at start or follows newline
|
||||||
Subject length lower bound = 1
|
Subject length lower bound = 1
|
||||||
|
|
||||||
re> /(?i)abc/info
|
re> /(?i)abc/info
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Compile options: <none>
|
Compile options: <none>
|
||||||
Overall options: caseless
|
Overall options: caseless
|
||||||
First code unit = 'a' (caseless)
|
First code unit = 'a' (caseless)
|
||||||
|
@ -1214,8 +1214,8 @@ SUBJECT MODIFIERS
|
||||||
|
|
||||||
The copy and get modifiers can be used to test the pcre2_sub-
|
The copy and get modifiers can be used to test the pcre2_sub-
|
||||||
string_copy_xxx() and pcre2_substring_get_xxx() functions. They can be
|
string_copy_xxx() and pcre2_substring_get_xxx() functions. They can be
|
||||||
given more than once, and each can specify a group name or number, for
|
given more than once, and each can specify a capture group name or num-
|
||||||
example:
|
ber, for example:
|
||||||
|
|
||||||
abcd\=copy=1,copy=3,get=G1
|
abcd\=copy=1,copy=3,get=G1
|
||||||
|
|
||||||
|
@ -1887,5 +1887,5 @@ AUTHOR
|
||||||
|
|
||||||
REVISION
|
REVISION
|
||||||
|
|
||||||
Last updated: 12 November 2018
|
Last updated: 03 February 2019
|
||||||
Copyright (c) 1997-2018 University of Cambridge.
|
Copyright (c) 1997-2019 University of Cambridge.
|
||||||
|
|
|
@ -1,4 +1,4 @@
|
||||||
.TH PCRE2UNICODE 3 "12 October 2018" "PCRE2 10.33"
|
.TH PCRE2UNICODE 3 "03 February 2019" "PCRE2 10.33"
|
||||||
.SH NAME
|
.SH NAME
|
||||||
PCRE - Perl-compatible regular expressions (revised API)
|
PCRE - Perl-compatible regular expressions (revised API)
|
||||||
.SH "UNICODE AND UTF SUPPORT"
|
.SH "UNICODE AND UTF SUPPORT"
|
||||||
|
@ -27,10 +27,11 @@ case the library will be smaller.
|
||||||
.rs
|
.rs
|
||||||
.sp
|
.sp
|
||||||
When PCRE2 is built with Unicode support, the escape sequences \ep{..},
|
When PCRE2 is built with Unicode support, the escape sequences \ep{..},
|
||||||
\eP{..}, and \eX can be used. The Unicode properties that can be tested are
|
\eP{..}, and \eX can be used. This is not dependent on the PCRE2_UTF setting.
|
||||||
limited to the general category properties such as Lu for an upper case letter
|
The Unicode properties that can be tested are limited to the general category
|
||||||
or Nd for a decimal number, the Unicode script names such as Arabic or Han, and
|
properties such as Lu for an upper case letter or Nd for a decimal number, the
|
||||||
the derived properties Any and L&. Full lists are given in the
|
Unicode script names such as Arabic or Han, and the derived properties Any and
|
||||||
|
L&. Full lists are given in the
|
||||||
.\" HREF
|
.\" HREF
|
||||||
\fBpcre2pattern\fP
|
\fBpcre2pattern\fP
|
||||||
.\"
|
.\"
|
||||||
|
@ -62,13 +63,18 @@ individual code units.
|
||||||
In UTF modes, the dot metacharacter matches one UTF character instead of a
|
In UTF modes, the dot metacharacter matches one UTF character instead of a
|
||||||
single code unit.
|
single code unit.
|
||||||
.P
|
.P
|
||||||
|
In UTF modes, capture group names are not restricted to ASCII, and may contain
|
||||||
|
any Unicode letters and decimal digits, as well as underscore.
|
||||||
|
.P
|
||||||
The escape sequence \eC can be used to match a single code unit in a UTF mode,
|
The escape sequence \eC can be used to match a single code unit in a UTF mode,
|
||||||
but its use can lead to some strange effects because it breaks up multi-unit
|
but its use can lead to some strange effects because it breaks up multi-unit
|
||||||
characters (see the description of \eC in the
|
characters (see the description of \eC in the
|
||||||
.\" HREF
|
.\" HREF
|
||||||
\fBpcre2pattern\fP
|
\fBpcre2pattern\fP
|
||||||
.\"
|
.\"
|
||||||
documentation).
|
documentation). For this reason, there is a build-time option that disables
|
||||||
|
support for \eC completely. There is also a less draconian compile-time option
|
||||||
|
for locking out the use of \eC when a pattern is compiled.
|
||||||
.P
|
.P
|
||||||
The use of \eC is not supported by the alternative matching function
|
The use of \eC is not supported by the alternative matching function
|
||||||
\fBpcre2_dfa_match()\fP when in UTF-8 or UTF-16 mode, that is, when a character
|
\fBpcre2_dfa_match()\fP when in UTF-8 or UTF-16 mode, that is, when a character
|
||||||
|
@ -387,6 +393,6 @@ Cambridge, England.
|
||||||
.rs
|
.rs
|
||||||
.sp
|
.sp
|
||||||
.nf
|
.nf
|
||||||
Last updated: 12 October 2018
|
Last updated: 03 February 2019
|
||||||
Copyright (c) 1997-2018 University of Cambridge.
|
Copyright (c) 1997-2019 University of Cambridge.
|
||||||
.fi
|
.fi
|
||||||
|
|
|
@ -2194,6 +2194,7 @@ so it is simplest just to return both.
|
||||||
Arguments:
|
Arguments:
|
||||||
ptrptr points to the character pointer variable
|
ptrptr points to the character pointer variable
|
||||||
ptrend points to the end of the input string
|
ptrend points to the end of the input string
|
||||||
|
utf true if the input is UTF-encoded
|
||||||
terminator the terminator of a subpattern name must be this
|
terminator the terminator of a subpattern name must be this
|
||||||
offsetptr where to put the offset from the start of the pattern
|
offsetptr where to put the offset from the start of the pattern
|
||||||
nameptr where to put a pointer to the name in the input
|
nameptr where to put a pointer to the name in the input
|
||||||
|
@ -2206,13 +2207,12 @@ Returns: TRUE if a name was read
|
||||||
*/
|
*/
|
||||||
|
|
||||||
static BOOL
|
static BOOL
|
||||||
read_name(PCRE2_SPTR *ptrptr, PCRE2_SPTR ptrend, uint32_t terminator,
|
read_name(PCRE2_SPTR *ptrptr, PCRE2_SPTR ptrend, BOOL utf, uint32_t terminator,
|
||||||
PCRE2_SIZE *offsetptr, PCRE2_SPTR *nameptr, uint32_t *namelenptr,
|
PCRE2_SIZE *offsetptr, PCRE2_SPTR *nameptr, uint32_t *namelenptr,
|
||||||
int *errorcodeptr, compile_block *cb)
|
int *errorcodeptr, compile_block *cb)
|
||||||
{
|
{
|
||||||
PCRE2_SPTR ptr = *ptrptr;
|
PCRE2_SPTR ptr = *ptrptr;
|
||||||
BOOL is_group = (*ptr != CHAR_ASTERISK);
|
BOOL is_group = (*ptr != CHAR_ASTERISK);
|
||||||
uint32_t namelen = 0;
|
|
||||||
|
|
||||||
if (++ptr >= ptrend) /* No characters in name */
|
if (++ptr >= ptrend) /* No characters in name */
|
||||||
{
|
{
|
||||||
|
@ -2221,35 +2221,74 @@ if (++ptr >= ptrend) /* No characters in name */
|
||||||
goto FAILED;
|
goto FAILED;
|
||||||
}
|
}
|
||||||
|
|
||||||
/* A group name must not start with a digit. If either of the others start with
|
|
||||||
a digit it just won't be recognized. */
|
|
||||||
|
|
||||||
if (is_group && IS_DIGIT(*ptr))
|
|
||||||
{
|
|
||||||
*errorcodeptr = ERR44;
|
|
||||||
goto FAILED;
|
|
||||||
}
|
|
||||||
|
|
||||||
*nameptr = ptr;
|
*nameptr = ptr;
|
||||||
*offsetptr = (PCRE2_SIZE)(ptr - cb->start_pattern);
|
*offsetptr = (PCRE2_SIZE)(ptr - cb->start_pattern);
|
||||||
|
|
||||||
while (ptr < ptrend && MAX_255(*ptr) && (cb->ctypes[*ptr] & ctype_word) != 0)
|
/* In UTF mode, a group name may contain letters and decimal digits as defined
|
||||||
|
by Unicode properties, and underscores, but must not start with a digit. */
|
||||||
|
|
||||||
|
#ifdef SUPPORT_UNICODE
|
||||||
|
if (utf && is_group)
|
||||||
{
|
{
|
||||||
ptr++;
|
uint32_t c, type;
|
||||||
namelen++;
|
|
||||||
if (namelen > MAX_NAME_SIZE)
|
GETCHAR(c, ptr);
|
||||||
|
type = UCD_CHARTYPE(c);
|
||||||
|
|
||||||
|
if (type == ucp_Nd)
|
||||||
{
|
{
|
||||||
*errorcodeptr = ERR48;
|
*errorcodeptr = ERR44;
|
||||||
goto FAILED;
|
goto FAILED;
|
||||||
}
|
}
|
||||||
|
|
||||||
|
for(;;)
|
||||||
|
{
|
||||||
|
if (type != ucp_Nd && PRIV(ucp_gentype)[type] != ucp_L &&
|
||||||
|
c != CHAR_UNDERSCORE) break;
|
||||||
|
ptr++;
|
||||||
|
FORWARDCHAR(ptr);
|
||||||
|
if (ptr >= ptrend) break;
|
||||||
|
GETCHAR(c, ptr);
|
||||||
|
type = UCD_CHARTYPE(c);
|
||||||
|
}
|
||||||
}
|
}
|
||||||
|
else
|
||||||
|
#else
|
||||||
|
(void)utf; /* Avoid compiler warning */
|
||||||
|
#endif /* SUPPORT_UNICODE */
|
||||||
|
|
||||||
|
/* Handle non-group names and group names in non-UTF modes. A group name must
|
||||||
|
not start with a digit. If either of the others start with a digit it just
|
||||||
|
won't be recognized. */
|
||||||
|
|
||||||
|
{
|
||||||
|
if (is_group && IS_DIGIT(*ptr))
|
||||||
|
{
|
||||||
|
*errorcodeptr = ERR44;
|
||||||
|
goto FAILED;
|
||||||
|
}
|
||||||
|
|
||||||
|
while (ptr < ptrend && MAX_255(*ptr) && (cb->ctypes[*ptr] & ctype_word) != 0)
|
||||||
|
{
|
||||||
|
ptr++;
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
/* Check name length */
|
||||||
|
|
||||||
|
if (ptr > *nameptr + MAX_NAME_SIZE)
|
||||||
|
{
|
||||||
|
*errorcodeptr = ERR48;
|
||||||
|
goto FAILED;
|
||||||
|
}
|
||||||
|
*namelenptr = ptr - *nameptr;
|
||||||
|
|
||||||
/* Subpattern names must not be empty, and their terminator is checked here.
|
/* Subpattern names must not be empty, and their terminator is checked here.
|
||||||
(What follows a verb or alpha assertion name is checked separately.) */
|
(What follows a verb or alpha assertion name is checked separately.) */
|
||||||
|
|
||||||
if (is_group)
|
if (is_group)
|
||||||
{
|
{
|
||||||
if (namelen == 0)
|
if (ptr == *nameptr)
|
||||||
{
|
{
|
||||||
*errorcodeptr = ERR62; /* Subpattern name expected */
|
*errorcodeptr = ERR62; /* Subpattern name expected */
|
||||||
goto FAILED;
|
goto FAILED;
|
||||||
|
@ -2262,7 +2301,6 @@ if (is_group)
|
||||||
ptr++;
|
ptr++;
|
||||||
}
|
}
|
||||||
|
|
||||||
*namelenptr = namelen;
|
|
||||||
*ptrptr = ptr;
|
*ptrptr = ptr;
|
||||||
return TRUE;
|
return TRUE;
|
||||||
|
|
||||||
|
@ -2981,7 +3019,7 @@ while (ptr < ptrend)
|
||||||
|
|
||||||
/* Not a numerical recursion */
|
/* Not a numerical recursion */
|
||||||
|
|
||||||
if (!read_name(&ptr, ptrend, terminator, &offset, &name, &namelen,
|
if (!read_name(&ptr, ptrend, utf, terminator, &offset, &name, &namelen,
|
||||||
&errorcode, cb)) goto ESCAPE_FAILED;
|
&errorcode, cb)) goto ESCAPE_FAILED;
|
||||||
|
|
||||||
/* \k and \g when used with braces are back references, whereas \g used
|
/* \k and \g when used with braces are back references, whereas \g used
|
||||||
|
@ -3554,8 +3592,8 @@ while (ptr < ptrend)
|
||||||
uint32_t meta;
|
uint32_t meta;
|
||||||
|
|
||||||
vn = alasnames;
|
vn = alasnames;
|
||||||
if (!read_name(&ptr, ptrend, 0, &offset, &name, &namelen, &errorcode,
|
if (!read_name(&ptr, ptrend, utf, 0, &offset, &name, &namelen,
|
||||||
cb)) goto FAILED;
|
&errorcode, cb)) goto FAILED;
|
||||||
if (ptr >= ptrend || *ptr != CHAR_COLON)
|
if (ptr >= ptrend || *ptr != CHAR_COLON)
|
||||||
{
|
{
|
||||||
errorcode = ERR95; /* Malformed */
|
errorcode = ERR95; /* Malformed */
|
||||||
|
@ -3651,8 +3689,8 @@ while (ptr < ptrend)
|
||||||
else
|
else
|
||||||
{
|
{
|
||||||
vn = verbnames;
|
vn = verbnames;
|
||||||
if (!read_name(&ptr, ptrend, 0, &offset, &name, &namelen, &errorcode,
|
if (!read_name(&ptr, ptrend, utf, 0, &offset, &name, &namelen,
|
||||||
cb)) goto FAILED;
|
&errorcode, cb)) goto FAILED;
|
||||||
if (ptr >= ptrend || (*ptr != CHAR_COLON &&
|
if (ptr >= ptrend || (*ptr != CHAR_COLON &&
|
||||||
*ptr != CHAR_RIGHT_PARENTHESIS))
|
*ptr != CHAR_RIGHT_PARENTHESIS))
|
||||||
{
|
{
|
||||||
|
@ -3907,7 +3945,7 @@ while (ptr < ptrend)
|
||||||
errorcode = ERR41;
|
errorcode = ERR41;
|
||||||
goto FAILED;
|
goto FAILED;
|
||||||
}
|
}
|
||||||
if (!read_name(&ptr, ptrend, CHAR_RIGHT_PARENTHESIS, &offset, &name,
|
if (!read_name(&ptr, ptrend, utf, CHAR_RIGHT_PARENTHESIS, &offset, &name,
|
||||||
&namelen, &errorcode, cb)) goto FAILED;
|
&namelen, &errorcode, cb)) goto FAILED;
|
||||||
*parsed_pattern++ = META_BACKREF_BYNAME;
|
*parsed_pattern++ = META_BACKREF_BYNAME;
|
||||||
*parsed_pattern++ = namelen;
|
*parsed_pattern++ = namelen;
|
||||||
|
@ -3967,7 +4005,7 @@ while (ptr < ptrend)
|
||||||
|
|
||||||
case CHAR_AMPERSAND:
|
case CHAR_AMPERSAND:
|
||||||
RECURSE_BY_NAME:
|
RECURSE_BY_NAME:
|
||||||
if (!read_name(&ptr, ptrend, CHAR_RIGHT_PARENTHESIS, &offset, &name,
|
if (!read_name(&ptr, ptrend, utf, CHAR_RIGHT_PARENTHESIS, &offset, &name,
|
||||||
&namelen, &errorcode, cb)) goto FAILED;
|
&namelen, &errorcode, cb)) goto FAILED;
|
||||||
*parsed_pattern++ = META_RECURSE_BYNAME;
|
*parsed_pattern++ = META_RECURSE_BYNAME;
|
||||||
*parsed_pattern++ = namelen;
|
*parsed_pattern++ = namelen;
|
||||||
|
@ -4215,7 +4253,7 @@ while (ptr < ptrend)
|
||||||
terminator = CHAR_RIGHT_PARENTHESIS;
|
terminator = CHAR_RIGHT_PARENTHESIS;
|
||||||
ptr--; /* Point to char before name */
|
ptr--; /* Point to char before name */
|
||||||
}
|
}
|
||||||
if (!read_name(&ptr, ptrend, terminator, &offset, &name, &namelen,
|
if (!read_name(&ptr, ptrend, utf, terminator, &offset, &name, &namelen,
|
||||||
&errorcode, cb)) goto FAILED;
|
&errorcode, cb)) goto FAILED;
|
||||||
|
|
||||||
/* Handle (?(R&name) */
|
/* Handle (?(R&name) */
|
||||||
|
@ -4349,7 +4387,7 @@ while (ptr < ptrend)
|
||||||
terminator = CHAR_APOSTROPHE; /* Terminator */
|
terminator = CHAR_APOSTROPHE; /* Terminator */
|
||||||
|
|
||||||
DEFINE_NAME:
|
DEFINE_NAME:
|
||||||
if (!read_name(&ptr, ptrend, terminator, &offset, &name, &namelen,
|
if (!read_name(&ptr, ptrend, utf, terminator, &offset, &name, &namelen,
|
||||||
&errorcode, cb)) goto FAILED;
|
&errorcode, cb)) goto FAILED;
|
||||||
|
|
||||||
/* We have a name for this capturing group. It is also assigned a number,
|
/* We have a name for this capturing group. It is also assigned a number,
|
||||||
|
|
|
@ -95,7 +95,7 @@ static const unsigned char compile_error_texts[] =
|
||||||
/* 25 */
|
/* 25 */
|
||||||
"lookbehind assertion is not fixed length\0"
|
"lookbehind assertion is not fixed length\0"
|
||||||
"a relative value of zero is not allowed\0"
|
"a relative value of zero is not allowed\0"
|
||||||
"conditional group contains more than two branches\0"
|
"conditional subpattern contains more than two branches\0"
|
||||||
"assertion expected after (?( or (?(?C)\0"
|
"assertion expected after (?( or (?(?C)\0"
|
||||||
"digit expected after (?+ or (?-\0"
|
"digit expected after (?+ or (?-\0"
|
||||||
/* 30 */
|
/* 30 */
|
||||||
|
@ -113,21 +113,21 @@ static const unsigned char compile_error_texts[] =
|
||||||
/* 40 */
|
/* 40 */
|
||||||
"invalid escape sequence in (*VERB) name\0"
|
"invalid escape sequence in (*VERB) name\0"
|
||||||
"unrecognized character after (?P\0"
|
"unrecognized character after (?P\0"
|
||||||
"syntax error in subpattern name (missing terminator)\0"
|
"syntax error in subpattern name (missing terminator?)\0"
|
||||||
"two named subpatterns have the same name (PCRE2_DUPNAMES not set)\0"
|
"two named subpatterns have the same name (PCRE2_DUPNAMES not set)\0"
|
||||||
"group name must start with a non-digit\0"
|
"subpattern name must start with a non-digit\0"
|
||||||
/* 45 */
|
/* 45 */
|
||||||
"this version of PCRE2 does not have support for \\P, \\p, or \\X\0"
|
"this version of PCRE2 does not have support for \\P, \\p, or \\X\0"
|
||||||
"malformed \\P or \\p sequence\0"
|
"malformed \\P or \\p sequence\0"
|
||||||
"unknown property name after \\P or \\p\0"
|
"unknown property name after \\P or \\p\0"
|
||||||
"subpattern name is too long (maximum " XSTRING(MAX_NAME_SIZE) " characters)\0"
|
"subpattern name is too long (maximum " XSTRING(MAX_NAME_SIZE) " code units)\0"
|
||||||
"too many named subpatterns (maximum " XSTRING(MAX_NAME_COUNT) ")\0"
|
"too many named subpatterns (maximum " XSTRING(MAX_NAME_COUNT) ")\0"
|
||||||
/* 50 */
|
/* 50 */
|
||||||
"invalid range in character class\0"
|
"invalid range in character class\0"
|
||||||
"octal value is greater than \\377 in 8-bit non-UTF-8 mode\0"
|
"octal value is greater than \\377 in 8-bit non-UTF-8 mode\0"
|
||||||
"internal error: overran compiling workspace\0"
|
"internal error: overran compiling workspace\0"
|
||||||
"internal error: previously-checked referenced subpattern not found\0"
|
"internal error: previously-checked referenced subpattern not found\0"
|
||||||
"DEFINE group contains more than one branch\0"
|
"DEFINE subpattern contains more than one branch\0"
|
||||||
/* 55 */
|
/* 55 */
|
||||||
"missing opening brace after \\o\0"
|
"missing opening brace after \\o\0"
|
||||||
"internal error: unknown newline setting\0"
|
"internal error: unknown newline setting\0"
|
||||||
|
@ -137,7 +137,7 @@ static const unsigned char compile_error_texts[] =
|
||||||
"obsolete error (should not occur)\0" /* Was the above */
|
"obsolete error (should not occur)\0" /* Was the above */
|
||||||
/* 60 */
|
/* 60 */
|
||||||
"(*VERB) not recognized or malformed\0"
|
"(*VERB) not recognized or malformed\0"
|
||||||
"group number is too big\0"
|
"subpattern number is too big\0"
|
||||||
"subpattern name expected\0"
|
"subpattern name expected\0"
|
||||||
"internal error: parsed pattern overflow\0"
|
"internal error: parsed pattern overflow\0"
|
||||||
"non-octal character in \\o{} (closing brace missing?)\0"
|
"non-octal character in \\o{} (closing brace missing?)\0"
|
||||||
|
|
|
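The compile_error_texts table edited above is one byte array of zero-terminated messages laid end to end, indexed by error number. As a rough illustration of how the nth message can be located in a table of that shape, here is a minimal sketch. The names sample_texts and nth_text are hypothetical, and the sample table is a shortened stand-in, not the library's actual lookup code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Shortened stand-in table. Each message ends with \0, and the implicit
   terminator of the literal adds a final empty string marking the end. */
static const char sample_texts[] =
  "no error\0"
  "subpattern name expected\0"
  "subpattern number is too big\0"
  ;

/* Return the nth (0-based) message, or NULL if n is past the end. */
static const char *nth_text(const char *table, size_t n)
{
assert(table != NULL);
while (n-- > 0)
  {
  if (*table == 0) return NULL;      /* hit the empty end marker */
  table += strlen(table) + 1;        /* skip this message and its \0 */
  }
return (*table == 0) ? NULL : table;
}
```

Because a message is found by walking the table, rewording a message (as this commit does) changes no offsets that callers depend on. Only the order of the entries matters.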
@ -3049,13 +3049,14 @@ return yield;
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
#ifdef SUPPORT_PCRE2_8
|
|
||||||
/*************************************************
|
/*************************************************
|
||||||
* Convert character value to UTF-8 *
|
* Convert character value to UTF-8 *
|
||||||
*************************************************/
|
*************************************************/
|
||||||
|
|
||||||
/* This function takes an integer value in the range 0 - 0x7fffffff
|
/* This function takes an integer value in the range 0 - 0x7fffffff
|
||||||
and encodes it as a UTF-8 character in 0 to 6 bytes.
|
and encodes it as a UTF-8 character in 0 to 6 bytes. It is needed even when the
|
||||||
|
8-bit library is not supported, to generate UTF-8 output for non-ASCII
|
||||||
|
characters.
|
||||||
|
|
||||||
Arguments:
|
Arguments:
|
||||||
cvalue the character value
|
cvalue the character value
|
||||||
|
@ -3081,7 +3082,6 @@ for (j = i; j > 0; j--)
|
||||||
*utf8bytes = utf8_table2[i] | cvalue;
|
*utf8bytes = utf8_table2[i] | cvalue;
|
||||||
return i + 1;
|
return i + 1;
|
||||||
}
|
}
|
||||||
#endif /* SUPPORT_PCRE2_8 */
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
@ -4374,6 +4374,7 @@ static int
|
||||||
show_pattern_info(void)
|
show_pattern_info(void)
|
||||||
{
|
{
|
||||||
uint32_t compile_options, overall_options, extra_options;
|
uint32_t compile_options, overall_options, extra_options;
|
||||||
|
BOOL utf = (FLD(compiled_code, overall_options) & PCRE2_UTF) != 0;
|
||||||
|
|
||||||
if ((pat_patctl.control & (CTL_BINCODE|CTL_FULLBINCODE)) != 0)
|
if ((pat_patctl.control & (CTL_BINCODE|CTL_FULLBINCODE)) != 0)
|
||||||
{
|
{
|
||||||
|
@ -4463,7 +4464,7 @@ if ((pat_patctl.control & CTL_INFO) != 0)
|
||||||
!= 0)
|
!= 0)
|
||||||
return PR_ABEND;
|
return PR_ABEND;
|
||||||
|
|
||||||
fprintf(outfile, "Capturing subpattern count = %d\n", capture_count);
|
fprintf(outfile, "Capture group count = %d\n", capture_count);
|
||||||
|
|
||||||
if (backrefmax > 0)
|
if (backrefmax > 0)
|
||||||
fprintf(outfile, "Max back reference = %d\n", backrefmax);
|
fprintf(outfile, "Max back reference = %d\n", backrefmax);
|
||||||
|
@ -4482,14 +4483,60 @@ if ((pat_patctl.control & CTL_INFO) != 0)
|
||||||
|
|
||||||
if (namecount > 0)
|
if (namecount > 0)
|
||||||
{
|
{
|
||||||
fprintf(outfile, "Named capturing subpatterns:\n");
|
fprintf(outfile, "Named capture groups:\n");
|
||||||
for (; namecount > 0; namecount--)
|
for (; namecount > 0; namecount--)
|
||||||
{
|
{
|
||||||
int imm2_size = test_mode == PCRE8_MODE ? 2 : 1;
|
int imm2_size = test_mode == PCRE8_MODE ? 2 : 1;
|
||||||
uint32_t length = (uint32_t)STRLEN(nametable + imm2_size);
|
uint32_t length = (uint32_t)STRLEN(nametable + imm2_size);
|
||||||
fprintf(outfile, " ");
|
fprintf(outfile, " ");
|
||||||
PCHARSV(nametable, imm2_size, length, FALSE, outfile);
|
|
||||||
|
/* In UTF mode the name may be a UTF string containing non-ASCII
|
||||||
|
letters and digits. We must output it as a UTF-8 string. In non-UTF mode,
|
||||||
|
use the normal string printing functions, which use escapes for all
|
||||||
|
non-ASCII characters. */
|
||||||
|
|
||||||
|
if (utf)
|
||||||
|
{
|
||||||
|
#ifdef SUPPORT_PCRE2_32
|
||||||
|
if (test_mode == PCRE32_MODE)
|
||||||
|
{
|
||||||
|
PCRE2_SPTR32 nameptr = (PCRE2_SPTR32)nametable + imm2_size;
|
||||||
|
while (*nameptr != 0)
|
||||||
|
{
|
||||||
|
uint8_t u8buff[6];
|
||||||
|
int len = ord2utf8(*nameptr++, u8buff);
|
||||||
|
fprintf(outfile, "%.*s", len, u8buff);
|
||||||
|
}
|
||||||
|
}
|
||||||
|
#endif
|
||||||
|
#ifdef SUPPORT_PCRE2_16
|
||||||
|
if (test_mode == PCRE16_MODE)
|
||||||
|
{
|
||||||
|
PCRE2_SPTR16 nameptr = (PCRE2_SPTR16)nametable + imm2_size;
|
||||||
|
while (*nameptr != 0)
|
||||||
|
{
|
||||||
|
int len;
|
||||||
|
uint8_t u8buff[6];
|
||||||
|
uint32_t c = *nameptr++ & 0xffff;
|
||||||
|
if (c >= 0xD800 && c < 0xDC00)
|
||||||
|
c = ((c & 0x3ff) << 10) + (*nameptr++ & 0x3ff) + 0x10000;
|
||||||
|
len = ord2utf8(c, u8buff);
|
||||||
|
fprintf(outfile, "%.*s", len, u8buff);
|
||||||
|
}
|
||||||
|
}
|
||||||
|
#endif
|
||||||
|
#ifdef SUPPORT_PCRE2_8
|
||||||
|
if (test_mode == PCRE8_MODE)
|
||||||
|
fprintf(outfile, "%s", (PCRE2_SPTR8)nametable + imm2_size);
|
||||||
|
#endif
|
||||||
|
}
|
||||||
|
else /* Not UTF mode */
|
||||||
|
{
|
||||||
|
PCHARSV(nametable, imm2_size, length, FALSE, outfile);
|
||||||
|
}
|
||||||
|
|
||||||
while (length++ < nameentrysize - imm2_size) putc(' ', outfile);
|
while (length++ < nameentrysize - imm2_size) putc(' ', outfile);
|
||||||
|
|
||||||
#ifdef SUPPORT_PCRE2_32
|
#ifdef SUPPORT_PCRE2_32
|
||||||
if (test_mode == PCRE32_MODE)
|
if (test_mode == PCRE32_MODE)
|
||||||
fprintf(outfile, "%3d\n", (int)(((PCRE2_SPTR32)nametable)[0]));
|
fprintf(outfile, "%3d\n", (int)(((PCRE2_SPTR32)nametable)[0]));
|
||||||
|
@ -4503,6 +4550,7 @@ if ((pat_patctl.control & CTL_INFO) != 0)
|
||||||
fprintf(outfile, "%3d\n", (int)(
|
fprintf(outfile, "%3d\n", (int)(
|
||||||
((((PCRE2_SPTR8)nametable)[0]) << 8) | ((PCRE2_SPTR8)nametable)[1]));
|
((((PCRE2_SPTR8)nametable)[0]) << 8) | ((PCRE2_SPTR8)nametable)[1]));
|
||||||
#endif
|
#endif
|
||||||
|
|
||||||
nametable = (void*)((PCRE2_SPTR8)nametable + nameentrysize * code_unit_size);
|
nametable = (void*)((PCRE2_SPTR8)nametable + nameentrysize * code_unit_size);
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
|
|
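The 16-bit branch added to show_pattern_info() combines a UTF-16 surrogate pair into one code point before passing it to ord2utf8(). The following standalone sketch shows that step in isolation. encode_utf8() is a hypothetical stand-in for ord2utf8(), limited here to the Unicode range up to U+10FFFF, and utf16_name_to_utf8() is likewise an illustrative name. The input is assumed to be well-formed UTF-16.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for ord2utf8(): encode one code point as UTF-8
   and return the number of bytes written (1 to 4). */
static int encode_utf8(uint32_t c, uint8_t *out)
{
assert(c <= 0x10FFFF);
if (c < 0x80)
  {
  out[0] = (uint8_t)c;
  return 1;
  }
if (c < 0x800)
  {
  out[0] = (uint8_t)(0xC0 | (c >> 6));
  out[1] = (uint8_t)(0x80 | (c & 0x3F));
  return 2;
  }
if (c < 0x10000)
  {
  out[0] = (uint8_t)(0xE0 | (c >> 12));
  out[1] = (uint8_t)(0x80 | ((c >> 6) & 0x3F));
  out[2] = (uint8_t)(0x80 | (c & 0x3F));
  return 3;
  }
out[0] = (uint8_t)(0xF0 | (c >> 18));
out[1] = (uint8_t)(0x80 | ((c >> 12) & 0x3F));
out[2] = (uint8_t)(0x80 | ((c >> 6) & 0x3F));
out[3] = (uint8_t)(0x80 | (c & 0x3F));
return 4;
}

/* Convert a zero-terminated, well-formed UTF-16 name to UTF-8. The output
   buffer needs 3 bytes per input code unit plus one for the terminator.
   Returns the number of bytes written, excluding the terminator. */
static size_t utf16_name_to_utf8(const uint16_t *name, uint8_t *out)
{
size_t n = 0;
while (*name != 0)
  {
  uint32_t c = *name++;
  if (c >= 0xD800 && c < 0xDC00)  /* high surrogate: merge with the low one */
    c = ((c & 0x3ff) << 10) + (uint32_t)(*name++ & 0x3ff) + 0x10000;
  n += (size_t)encode_utf8(c, out + n);
  }
out[n] = 0;
return n;
}
```

For example, the pair 0xD802 0xDE10 decodes to U+10A10, which encodes as the four bytes F0 90 A8 90.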
@ -481,4 +481,12 @@
|
||||||
/(?<=abc)(|def)/g,utf,replace=<$0>,substitute_callout
|
/(?<=abc)(|def)/g,utf,replace=<$0>,substitute_callout
|
||||||
123abcáyzabcdef789abcሴqr
|
123abcáyzabcdef789abcሴqr
|
||||||
|
|
||||||
|
# Check name length with non-ASCII characters
|
||||||
|
|
||||||
|
/(?'ABáC678901234567890123456789012'...)/utf
|
||||||
|
|
||||||
|
/(?'ABáC6789012345678901234567890123'...)/utf
|
||||||
|
|
||||||
|
/(?'ABZC6789012345678901234567890123'...)/utf
|
||||||
|
|
||||||
# End of testinput10
|
# End of testinput10
|
||||||
|
|
|
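The three patterns added to testinput10 probe the name-length limit, which the revised error text states in code units rather than characters. The sketch below shows why the distinction matters in the 8-bit library: á occupies two UTF-8 code units, so a name can hit the limit with fewer characters than an all-ASCII name. The limit of 32 is an assumed value for MAX_NAME_SIZE, and both helper names are hypothetical. The expected test output is not part of this section.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define NAME_LIMIT 32   /* assumed value of MAX_NAME_SIZE */

/* Count characters in a valid UTF-8 string by skipping continuation bytes. */
static size_t utf8_char_count(const char *s)
{
size_t n = 0;
assert(s != NULL);
for (; *s != 0; s++)
  if (((unsigned char)*s & 0xC0) != 0x80) n++;
return n;
}

/* The limit is applied to code units, which are bytes in the 8-bit library. */
static int name_fits(const char *name)
{
return strlen(name) <= NAME_LIMIT;
}
```

Under that assumption, the first test name (31 characters, 32 bytes) and the third (32 ASCII characters) fit, while the second (32 characters, 33 bytes) does not.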
@ -2457,4 +2457,27 @@
|
||||||
|
|
||||||
# -------
|
# -------
|
||||||
|
|
||||||
|
# Test group names containing non-ASCII letters and digits
|
||||||
|
|
||||||
|
/(?'ABáC'...)\g{ABáC}/utf
|
||||||
|
abcabcdefg
|
||||||
|
|
||||||
|
/(?'XʰABC'...)/utf
|
||||||
|
xyzpq
|
||||||
|
|
||||||
|
/(?'XאABC'...)/utf
|
||||||
|
12345
|
||||||
|
|
||||||
|
/(?'XᾈABC'...)/utf
|
||||||
|
%^&*(...
|
||||||
|
|
||||||
|
/(?'𐨐ABC'...)/utf
|
||||||
|
abcde
|
||||||
|
|
||||||
|
/^(?'אABC'...)(?&אABC)(?P=אABC)/utf
|
||||||
|
123123123456
|
||||||
|
|
||||||
|
/^(?'אABC'...)(?&אABC)/utf
|
||||||
|
123123123456
|
||||||
|
|
||||||
# End of testinput4
|
# End of testinput4
|
||||||
|
|
|
@ -2149,4 +2149,19 @@
|
||||||
|
|
||||||
# -------
|
# -------
|
||||||
|
|
||||||
|
# Test reference and errors in non-ASCII characters in group names
|
||||||
|
|
||||||
|
/(?'𑠅ABC'...)/I,utf
|
||||||
|
abcde\=copy=𑠅ABC
|
||||||
|
|
||||||
|
# Bad ones
|
||||||
|
|
||||||
|
/(?'AB၌C'...)\g{AB၌C}/utf
|
||||||
|
|
||||||
|
/(?'٠ABC'...)/utf
|
||||||
|
|
||||||
|
/(?'²ABC'...)/utf
|
||||||
|
|
||||||
|
/(?'X²ABC'...)/utf
|
||||||
|
|
||||||
# End of testinput5
|
# End of testinput5
|
||||||
|
|
|
@ -248,7 +248,7 @@ No match
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: utf
|
Options: utf
|
||||||
First code unit = \xc4
|
First code unit = \xc4
|
||||||
Last code unit = \x80
|
Last code unit = \x80
|
||||||
|
@ -261,7 +261,7 @@ Subject length lower bound = 1
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: utf
|
Options: utf
|
||||||
First code unit = \xe1
|
First code unit = \xe1
|
||||||
Last code unit = \x80
|
Last code unit = \x80
|
||||||
|
@ -274,7 +274,7 @@ Subject length lower bound = 1
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: utf
|
Options: utf
|
||||||
First code unit = \xf0
|
First code unit = \xf0
|
||||||
Last code unit = \x80
|
Last code unit = \x80
|
||||||
|
@ -287,7 +287,7 @@ Subject length lower bound = 1
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: utf
|
Options: utf
|
||||||
First code unit = \xf4
|
First code unit = \xf4
|
||||||
Last code unit = \x80
|
Last code unit = \x80
|
||||||
|
@ -300,7 +300,7 @@ Subject length lower bound = 1
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: utf
|
Options: utf
|
||||||
First code unit = \xf4
|
First code unit = \xf4
|
||||||
Last code unit = \xbf
|
Last code unit = \xbf
|
||||||
|
@ -313,7 +313,7 @@ Subject length lower bound = 1
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: utf
|
Options: utf
|
||||||
First code unit = \xc3
|
First code unit = \xc3
|
||||||
Last code unit = \xbf
|
Last code unit = \xbf
|
||||||
|
@ -326,7 +326,7 @@ Subject length lower bound = 1
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: utf
|
Options: utf
|
||||||
First code unit = \xc4
|
First code unit = \xc4
|
||||||
Last code unit = \x80
|
Last code unit = \x80
|
||||||
|
@ -339,7 +339,7 @@ Subject length lower bound = 1
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: utf
|
Options: utf
|
||||||
First code unit = \xc2
|
First code unit = \xc2
|
||||||
Last code unit = \x80
|
Last code unit = \x80
|
||||||
|
@ -352,7 +352,7 @@ Subject length lower bound = 1
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: utf
|
Options: utf
|
||||||
First code unit = \xc3
|
First code unit = \xc3
|
||||||
Last code unit = \xbf
|
Last code unit = \xbf
|
||||||
|
@ -365,7 +365,7 @@ Subject length lower bound = 1
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: utf
|
Options: utf
|
||||||
First code unit = \xed
|
First code unit = \xed
|
||||||
Last code unit = \xb4
|
Last code unit = \xb4
|
||||||
|
@ -380,7 +380,7 @@ Subject length lower bound = 3
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: utf
|
Options: utf
|
||||||
First code unit = \xe6
|
First code unit = \xe6
|
||||||
Last code unit = \x9e
|
Last code unit = \x9e
|
||||||
|
@ -395,7 +395,7 @@ Subject length lower bound = 3
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: utf
|
Options: utf
|
||||||
First code unit = \xc2
|
First code unit = \xc2
|
||||||
Last code unit = \x80
|
Last code unit = \x80
|
||||||
|
@ -408,7 +408,7 @@ Subject length lower bound = 1
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: utf
|
Options: utf
|
||||||
First code unit = \xc2
|
First code unit = \xc2
|
||||||
Last code unit = \x84
|
Last code unit = \x84
|
||||||
|
@ -421,7 +421,7 @@ Subject length lower bound = 1
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: utf
|
Options: utf
|
||||||
First code unit = \xc4
|
First code unit = \xc4
|
||||||
Last code unit = \x84
|
Last code unit = \x84
|
||||||
|
@ -434,7 +434,7 @@ Subject length lower bound = 1
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: utf
|
Options: utf
|
||||||
First code unit = \xe0
|
First code unit = \xe0
|
||||||
Last code unit = \xa1
|
Last code unit = \xa1
|
||||||
|
@ -447,7 +447,7 @@ Subject length lower bound = 1
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: utf
|
Options: utf
|
||||||
First code unit = \xf0
|
First code unit = \xf0
|
||||||
Last code unit = \xab
|
Last code unit = \xab
|
||||||
|
@ -460,7 +460,7 @@ Subject length lower bound = 1
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: utf
|
Options: utf
|
||||||
Starting code units: \x00 \x01 \x02 \x03 \x04 \x05 \x06 \x07 \x08 \x09 \x0a
|
Starting code units: \x00 \x01 \x02 \x03 \x04 \x05 \x06 \x07 \x08 \x09 \x0a
|
||||||
\x0b \x0c \x0d \x0e \x0f \x10 \x11 \x12 \x13 \x14 \x15 \x16 \x17 \x18 \x19
|
\x0b \x0c \x0d \x0e \x0f \x10 \x11 \x12 \x13 \x14 \x15 \x16 \x17 \x18 \x19
|
||||||
|
@ -495,7 +495,7 @@ No match
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: utf
|
Options: utf
|
||||||
First code unit = \xc4
|
First code unit = \xc4
|
||||||
Last code unit = \x80
|
Last code unit = \x80
|
||||||
|
@ -514,7 +514,7 @@ Subject length lower bound = 3
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 1
|
Capture group count = 1
|
||||||
Options: utf
|
Options: utf
|
||||||
Starting code units: x \xc4
|
Starting code units: x \xc4
|
||||||
Subject length lower bound = 1
|
Subject length lower bound = 1
|
||||||
|
@ -531,7 +531,7 @@ Subject length lower bound = 1
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 1
|
Capture group count = 1
|
||||||
Options: utf
|
Options: utf
|
||||||
Starting code units: a x \xc4
|
Starting code units: a x \xc4
|
||||||
Subject length lower bound = 1
|
Subject length lower bound = 1
|
||||||
|
@ -548,7 +548,7 @@ Subject length lower bound = 1
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 1
|
Capture group count = 1
|
||||||
Options: utf
|
Options: utf
|
||||||
Starting code units: a x \xc4
|
Starting code units: a x \xc4
|
||||||
Subject length lower bound = 1
|
Subject length lower bound = 1
|
||||||
|
@ -566,7 +566,7 @@ Subject length lower bound = 1
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 1
|
Capture group count = 1
|
||||||
Options: utf
|
Options: utf
|
||||||
Starting code units: x \xc4
|
Starting code units: x \xc4
|
||||||
Subject length lower bound = 1
|
Subject length lower bound = 1
|
||||||
|
@ -578,7 +578,7 @@ Subject length lower bound = 1
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: utf
|
Options: utf
|
||||||
First code unit = \xc4
|
First code unit = \xc4
|
||||||
Last code unit = \x80
|
Last code unit = \x80
|
||||||
|
@ -592,7 +592,7 @@ Subject length lower bound = 1
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: utf
|
Options: utf
|
||||||
First code unit = 'a'
|
First code unit = 'a'
|
||||||
Last code unit = \x80
|
Last code unit = \x80
|
||||||
|
@ -606,7 +606,7 @@ Subject length lower bound = 2
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: utf
|
Options: utf
|
||||||
First code unit = 'a'
|
First code unit = 'a'
|
||||||
Last code unit = \x81
|
Last code unit = \x81
|
||||||
|
@ -619,7 +619,7 @@ Subject length lower bound = 3
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Subject length lower bound = 1
|
Subject length lower bound = 1
|
||||||
|
|
||||||
/[\x{100}]/IB,utf
|
/[\x{100}]/IB,utf
|
||||||
|
@ -629,7 +629,7 @@ Subject length lower bound = 1
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: utf
|
Options: utf
|
||||||
First code unit = \xc4
|
First code unit = \xc4
|
||||||
Last code unit = \x80
|
Last code unit = \x80
|
||||||
|
@ -648,7 +648,7 @@ Subject length lower bound = 1
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: utf
|
Options: utf
|
||||||
First code unit = \xc3
|
First code unit = \xc3
|
||||||
Last code unit = \xbf
|
Last code unit = \xbf
|
||||||
|
@ -663,7 +663,7 @@ Subject length lower bound = 1
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: utf
|
Options: utf
|
||||||
Subject length lower bound = 1
|
Subject length lower bound = 1
|
||||||
|
|
||||||
|
@ -678,14 +678,14 @@ Subject length lower bound = 1
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 1
|
Capture group count = 1
|
||||||
Options: utf
|
Options: utf
|
||||||
First code unit = \xc4
|
First code unit = \xc4
|
||||||
Last code unit = 'z'
|
Last code unit = 'z'
|
||||||
Subject length lower bound = 7
|
Subject length lower bound = 7
|
||||||
|
|
||||||
/\777/I,utf
|
/\777/I,utf
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: utf
|
Options: utf
|
||||||
First code unit = \xc7
|
First code unit = \xc7
|
||||||
Last code unit = \xbf
|
Last code unit = \xbf
|
||||||
|
@ -703,7 +703,7 @@ Subject length lower bound = 1
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: utf
|
Options: utf
|
||||||
First code unit = \xc4
|
First code unit = \xc4
|
||||||
Last code unit = \x80
|
Last code unit = \x80
|
||||||
|
@ -717,7 +717,7 @@ Subject length lower bound = 2
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: utf
|
Options: utf
|
||||||
First code unit = \xc4
|
First code unit = \xc4
|
||||||
Last code unit = 'X'
|
Last code unit = 'X'
|
||||||
|
@ -761,7 +761,7 @@ No match
|
||||||
0: \x{1234}
|
0: \x{1234}
|
||||||
|
|
||||||
/(*CRLF)(*UTF)(*BSR_UNICODE)a\Rb/I
|
/(*CRLF)(*UTF)(*BSR_UNICODE)a\Rb/I
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Compile options: <none>
|
Compile options: <none>
|
||||||
Overall options: utf
|
Overall options: utf
|
||||||
\R matches any Unicode newline
|
\R matches any Unicode newline
|
||||||
|
@ -771,7 +771,7 @@ Last code unit = 'b'
|
||||||
Subject length lower bound = 3
|
Subject length lower bound = 3
|
||||||
|
|
||||||
/\h/I,utf
|
/\h/I,utf
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: utf
|
Options: utf
|
||||||
Starting code units: \x09 \x20 \xc2 \xe1 \xe2 \xe3
|
Starting code units: \x09 \x20 \xc2 \xe1 \xe2 \xe3
|
||||||
Subject length lower bound = 1
|
Subject length lower bound = 1
|
||||||
|
@ -795,7 +795,7 @@ Subject length lower bound = 1
|
||||||
0: \x{3000}
|
0: \x{3000}
|
||||||
|
|
||||||
/\v/I,utf
|
/\v/I,utf
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: utf
|
Options: utf
|
||||||
Starting code units: \x0a \x0b \x0c \x0d \xc2 \xe2
|
Starting code units: \x0a \x0b \x0c \x0d \xc2 \xe2
|
||||||
Subject length lower bound = 1
|
Subject length lower bound = 1
|
||||||
|
@ -813,7 +813,7 @@ Subject length lower bound = 1
|
||||||
0: \x{2028}
|
0: \x{2028}
|
||||||
|
|
||||||
/\h*A/I,utf
|
/\h*A/I,utf
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: utf
|
Options: utf
|
||||||
Starting code units: \x09 \x20 A \xc2 \xe1 \xe2 \xe3
|
Starting code units: \x09 \x20 A \xc2 \xe1 \xe2 \xe3
|
||||||
Last code unit = 'A'
|
Last code unit = 'A'
|
||||||
|
@ -822,21 +822,21 @@ Subject length lower bound = 1
|
||||||
0: A
|
0: A
|
||||||
|
|
||||||
/\v+A/I,utf
|
/\v+A/I,utf
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: utf
|
Options: utf
|
||||||
Starting code units: \x0a \x0b \x0c \x0d \xc2 \xe2
|
Starting code units: \x0a \x0b \x0c \x0d \xc2 \xe2
|
||||||
Last code unit = 'A'
|
Last code unit = 'A'
|
||||||
Subject length lower bound = 2
|
Subject length lower bound = 2
|
||||||
|
|
||||||
/\s?xxx\s/I,utf
|
/\s?xxx\s/I,utf
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: utf
|
Options: utf
|
||||||
Starting code units: \x09 \x0a \x0b \x0c \x0d \x20 x
|
Starting code units: \x09 \x0a \x0b \x0c \x0d \x20 x
|
||||||
Last code unit = 'x'
|
Last code unit = 'x'
|
||||||
Subject length lower bound = 4
|
Subject length lower bound = 4
|
||||||
|
|
||||||
/\sxxx\s/I,utf,tables=2
|
/\sxxx\s/I,utf,tables=2
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: utf
|
Options: utf
|
||||||
Starting code units: \x09 \x0a \x0b \x0c \x0d \x20 \xc2
|
Starting code units: \x09 \x0a \x0b \x0c \x0d \x20 \xc2
|
||||||
Last code unit = 'x'
|
Last code unit = 'x'
|
||||||
|
@ -847,7 +847,7 @@ Subject length lower bound = 5
|
||||||
0: \x{a0}xxx\x{85}
|
0: \x{a0}xxx\x{85}
|
||||||
|
|
||||||
/\S \S/I,utf,tables=2
|
/\S \S/I,utf,tables=2
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: utf
|
Options: utf
|
||||||
Starting code units: \x00 \x01 \x02 \x03 \x04 \x05 \x06 \x07 \x08 \x0e \x0f
|
Starting code units: \x00 \x01 \x02 \x03 \x04 \x05 \x06 \x07 \x08 \x0e \x0f
|
||||||
\x10 \x11 \x12 \x13 \x14 \x15 \x16 \x17 \x18 \x19 \x1a \x1b \x1c \x1d \x1e
|
\x10 \x11 \x12 \x13 \x14 \x15 \x16 \x17 \x18 \x19 \x1a \x1b \x1c \x1d \x1e
|
||||||
|
@ -883,25 +883,25 @@ Error -36 (bad UTF-8 offset)
|
||||||
No match
|
No match
|
||||||
|
|
||||||
/\x{1234}+/Ii,utf
|
/\x{1234}+/Ii,utf
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: caseless utf
|
Options: caseless utf
|
||||||
Starting code units: \xe1
|
Starting code units: \xe1
|
||||||
Subject length lower bound = 1
|
Subject length lower bound = 1
|
||||||
|
|
||||||
/\x{1234}+?/Ii,utf
|
/\x{1234}+?/Ii,utf
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: caseless utf
|
Options: caseless utf
|
||||||
Starting code units: \xe1
|
Starting code units: \xe1
|
||||||
Subject length lower bound = 1
|
Subject length lower bound = 1
|
||||||
|
|
||||||
/\x{1234}++/Ii,utf
|
/\x{1234}++/Ii,utf
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: caseless utf
|
Options: caseless utf
|
||||||
Starting code units: \xe1
|
Starting code units: \xe1
|
||||||
Subject length lower bound = 1
|
Subject length lower bound = 1
|
||||||
|
|
||||||
/\x{1234}{2}/Ii,utf
|
/\x{1234}{2}/Ii,utf
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: caseless utf
|
Options: caseless utf
|
||||||
Starting code units: \xe1
|
Starting code units: \xe1
|
||||||
Subject length lower bound = 2
|
Subject length lower bound = 2
|
||||||
|
@ -913,7 +913,7 @@ Subject length lower bound = 2
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: utf
|
Options: utf
|
||||||
Subject length lower bound = 1
|
Subject length lower bound = 1
|
||||||
|
|
||||||
|
@ -925,14 +925,14 @@ Subject length lower bound = 1
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: utf
|
Options: utf
|
||||||
First code unit = 'X'
|
First code unit = 'X'
|
||||||
Last code unit = \x80
|
Last code unit = \x80
|
||||||
Subject length lower bound = 2
|
Subject length lower bound = 2
|
||||||
|
|
||||||
/\R/I,utf
|
/\R/I,utf
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: utf
|
Options: utf
|
||||||
Starting code units: \x0a \x0b \x0c \x0d \xc2 \xe2
|
Starting code units: \x0a \x0b \x0c \x0d \xc2 \xe2
|
||||||
Subject length lower bound = 1
|
Subject length lower bound = 1
|
||||||
|
@ -944,7 +944,7 @@ Subject length lower bound = 1
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: utf
|
Options: utf
|
||||||
First code unit = \xc7
|
First code unit = \xc7
|
||||||
Last code unit = \xbf
|
Last code unit = \xbf
|
||||||
|
@ -1105,7 +1105,7 @@ Failed: error 174 at offset 0: using UTF is disabled by the application
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: caseless utf
|
Options: caseless utf
|
||||||
First code unit = 'A' (caseless)
|
First code unit = 'A' (caseless)
|
||||||
Subject length lower bound = 5
|
Subject length lower bound = 5
|
||||||
|
@ -1117,7 +1117,7 @@ Subject length lower bound = 5
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: utf
|
Options: utf
|
||||||
First code unit = 'A'
|
First code unit = 'A'
|
||||||
Last code unit = \xb0
|
Last code unit = \xb0
|
||||||
|
@ -1130,7 +1130,7 @@ Subject length lower bound = 5
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: utf
|
Options: utf
|
||||||
First code unit = 'A'
|
First code unit = 'A'
|
||||||
Last code unit = \xb0
|
Last code unit = \xb0
|
||||||
|
@ -1143,14 +1143,14 @@ Subject length lower bound = 3
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: caseless utf
|
Options: caseless utf
|
||||||
First code unit = 'A' (caseless)
|
First code unit = 'A' (caseless)
|
||||||
Last code unit = 'B' (caseless)
|
Last code unit = 'B' (caseless)
|
||||||
Subject length lower bound = 3
|
Subject length lower bound = 3
|
||||||
|
|
||||||
/\x{401}\x{420}\x{421}\x{422}\x{423}\x{424}\x{425}\x{426}\x{427}\x{428}\x{429}\x{42a}\x{42b}\x{42c}\x{42d}\x{42e}\x{42f}/Ii,utf
|
/\x{401}\x{420}\x{421}\x{422}\x{423}\x{424}\x{425}\x{426}\x{427}\x{428}\x{429}\x{42a}\x{42b}\x{42c}\x{42d}\x{42e}\x{42f}/Ii,utf
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: caseless utf
|
Options: caseless utf
|
||||||
Starting code units: \xd0 \xd1
|
Starting code units: \xd0 \xd1
|
||||||
Subject length lower bound = 17
|
Subject length lower bound = 17
|
||||||
|
@ -1176,17 +1176,17 @@ Subject length lower bound = 17
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
|
|
||||||
/\h/I
|
/\h/I
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Starting code units: \x09 \x20 \xa0
|
Starting code units: \x09 \x20 \xa0
|
||||||
Subject length lower bound = 1
|
Subject length lower bound = 1
|
||||||
|
|
||||||
/\v/I
|
/\v/I
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Starting code units: \x0a \x0b \x0c \x0d \x85
|
Starting code units: \x0a \x0b \x0c \x0d \x85
|
||||||
Subject length lower bound = 1
|
Subject length lower bound = 1
|
||||||
|
|
||||||
/\R/I
|
/\R/I
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Starting code units: \x0a \x0b \x0c \x0d \x85
|
Starting code units: \x0a \x0b \x0c \x0d \x85
|
||||||
Subject length lower bound = 1
|
Subject length lower bound = 1
|
||||||
|
|
||||||
|
@ -1199,7 +1199,7 @@ Subject length lower bound = 1
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
|
|
||||||
/\x{212a}+/Ii,utf
|
/\x{212a}+/Ii,utf
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: caseless utf
|
Options: caseless utf
|
||||||
Starting code units: K k \xe2
|
Starting code units: K k \xe2
|
||||||
Subject length lower bound = 1
|
Subject length lower bound = 1
|
||||||
|
@ -1207,7 +1207,7 @@ Subject length lower bound = 1
|
||||||
0: KKkk\x{212a}
|
0: KKkk\x{212a}
|
||||||
|
|
||||||
/s+/Ii,utf
|
/s+/Ii,utf
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: caseless utf
|
Options: caseless utf
|
||||||
Starting code units: S s \xc5
|
Starting code units: S s \xc5
|
||||||
Subject length lower bound = 1
|
Subject length lower bound = 1
|
||||||
|
@ -1222,7 +1222,7 @@ Subject length lower bound = 1
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: utf
|
Options: utf
|
||||||
Starting code units: A \xc4
|
Starting code units: A \xc4
|
||||||
Last code unit = 'A'
|
Last code unit = 'A'
|
||||||
|
@ -1239,7 +1239,7 @@ Subject length lower bound = 1
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: utf
|
Options: utf
|
||||||
Starting code units: 0 1 2 3 4 5 6 7 8 9 \xc4
|
Starting code units: 0 1 2 3 4 5 6 7 8 9 \xc4
|
||||||
Subject length lower bound = 1
|
Subject length lower bound = 1
|
||||||
|
@ -1251,7 +1251,7 @@ Subject length lower bound = 1
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: utf
|
Options: utf
|
||||||
Starting code units: Z \xc4 \xc5 \xc6 \xc7 \xc8 \xc9 \xca \xcb \xcc \xcd
|
Starting code units: Z \xc4 \xc5 \xc6 \xc7 \xc8 \xc9 \xca \xcb \xcc \xcd
|
||||||
\xce \xcf \xd0 \xd1 \xd2 \xd3 \xd4 \xd5 \xd6 \xd7 \xd8 \xd9 \xda \xdb \xdc
|
\xce \xcf \xd0 \xd1 \xd2 \xd3 \xd4 \xd5 \xd6 \xd7 \xd8 \xd9 \xda \xdb \xdc
|
||||||
|
@ -1273,7 +1273,7 @@ Subject length lower bound = 1
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: utf
|
Options: utf
|
||||||
Starting code units: z { | } ~ \x7f \xc2 \xc3 \xc4 \xc5 \xc6 \xc7 \xc8 \xc9
|
Starting code units: z { | } ~ \x7f \xc2 \xc3 \xc4 \xc5 \xc6 \xc7 \xc8 \xc9
|
||||||
\xca \xcb \xcc \xcd \xce \xcf \xd0 \xd1 \xd2 \xd3 \xd4 \xd5 \xd6 \xd7 \xd8
|
\xca \xcb \xcc \xcd \xce \xcf \xd0 \xd1 \xd2 \xd3 \xd4 \xd5 \xd6 \xd7 \xd8
|
||||||
|
@ -1289,7 +1289,7 @@ Subject length lower bound = 1
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: utf
|
Options: utf
|
||||||
Starting code units: - ] a d z \xc4 \xc5 \xc6 \xc7 \xc8 \xc9 \xca \xcb \xcc
|
Starting code units: - ] a d z \xc4 \xc5 \xc6 \xc7 \xc8 \xc9 \xca \xcb \xcc
|
||||||
\xcd \xce \xcf \xd0 \xd1 \xd2 \xd3 \xd4 \xd5 \xd6 \xd7 \xd8 \xd9 \xda \xdb
|
\xcd \xce \xcf \xd0 \xd1 \xd2 \xd3 \xd4 \xd5 \xd6 \xd7 \xd8 \xd9 \xda \xdb
|
||||||
|
@ -1314,7 +1314,7 @@ Subject length lower bound = 1
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 1
|
Capture group count = 1
|
||||||
Options: utf
|
Options: utf
|
||||||
Starting code units: a b \xc4 \xc5 \xc6 \xc7 \xc8 \xc9 \xca \xcb \xcc \xcd
|
Starting code units: a b \xc4 \xc5 \xc6 \xc7 \xc8 \xc9 \xca \xcb \xcc \xcd
|
||||||
\xce \xcf \xd0 \xd1 \xd2 \xd3 \xd4 \xd5 \xd6 \xd7 \xd8 \xd9 \xda \xdb \xdc
|
\xce \xcf \xd0 \xd1 \xd2 \xd3 \xd4 \xd5 \xd6 \xd7 \xd8 \xd9 \xda \xdb \xdc
|
||||||
|
@ -1332,7 +1332,7 @@ Subject length lower bound = 7
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: utf
|
Options: utf
|
||||||
Starting code units: \x09 \x0a \x0b \x0c \x0d \x20 \xc4
|
Starting code units: \x09 \x0a \x0b \x0c \x0d \x20 \xc4
|
||||||
Subject length lower bound = 1
|
Subject length lower bound = 1
|
||||||
|
@ -1345,7 +1345,7 @@ Subject length lower bound = 1
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: utf
|
Options: utf
|
||||||
Starting code units: 0 1 2 3 4 5 6 7 8 9 \xc4
|
Starting code units: 0 1 2 3 4 5 6 7 8 9 \xc4
|
||||||
Subject length lower bound = 1
|
Subject length lower bound = 1
|
||||||
|
@ -1358,7 +1358,7 @@ Subject length lower bound = 1
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: utf
|
Options: utf
|
||||||
Starting code units: 0 1 2 3 4 5 6 7 8 9 A B C D E F G H I J K L M N O P
|
Starting code units: 0 1 2 3 4 5 6 7 8 9 A B C D E F G H I J K L M N O P
|
||||||
Q R S T U V W X Y Z _ a b c d e f g h i j k l m n o p q r s t u v w x y z
|
Q R S T U V W X Y Z _ a b c d e f g h i j k l m n o p q r s t u v w x y z
|
||||||
|
@ -1373,7 +1373,7 @@ Subject length lower bound = 1
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: utf
|
Options: utf
|
||||||
Starting code units: \x00 \x01 \x02 \x03 \x04 \x05 \x06 \x07 \x08 \x09 \x0a
|
Starting code units: \x00 \x01 \x02 \x03 \x04 \x05 \x06 \x07 \x08 \x09 \x0a
|
||||||
\x0b \x0c \x0d \x0e \x0f \x10 \x11 \x12 \x13 \x14 \x15 \x16 \x17 \x18 \x19
|
\x0b \x0c \x0d \x0e \x0f \x10 \x11 \x12 \x13 \x14 \x15 \x16 \x17 \x18 \x19
|
||||||
|
@ -1395,7 +1395,7 @@ Subject length lower bound = 1
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: utf
|
Options: utf
|
||||||
Starting code units: \x00 \x01 \x02 \x03 \x04 \x05 \x06 \x07 \x08 \x0e \x0f
|
Starting code units: \x00 \x01 \x02 \x03 \x04 \x05 \x06 \x07 \x08 \x0e \x0f
|
||||||
\x10 \x11 \x12 \x13 \x14 \x15 \x16 \x17 \x18 \x19 \x1a \x1b \x1c \x1d \x1e
|
\x10 \x11 \x12 \x13 \x14 \x15 \x16 \x17 \x18 \x19 \x1a \x1b \x1c \x1d \x1e
|
||||||
|
@ -1416,7 +1416,7 @@ Subject length lower bound = 1
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: utf
|
Options: utf
|
||||||
Starting code units: \x00 \x01 \x02 \x03 \x04 \x05 \x06 \x07 \x08 \x09 \x0a
|
Starting code units: \x00 \x01 \x02 \x03 \x04 \x05 \x06 \x07 \x08 \x09 \x0a
|
||||||
\x0b \x0c \x0d \x0e \x0f \x10 \x11 \x12 \x13 \x14 \x15 \x16 \x17 \x18 \x19
|
\x0b \x0c \x0d \x0e \x0f \x10 \x11 \x12 \x13 \x14 \x15 \x16 \x17 \x18 \x19
|
||||||
|
@ -1435,7 +1435,7 @@ Subject length lower bound = 1
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: caseless utf
|
Options: caseless utf
|
||||||
Starting code units: \xc4 \xc5 \xc6 \xc7 \xc8 \xc9 \xca \xcb \xcc \xcd \xce
|
Starting code units: \xc4 \xc5 \xc6 \xc7 \xc8 \xc9 \xca \xcb \xcc \xcd \xce
|
||||||
\xcf \xd0 \xd1 \xd2 \xd3 \xd4 \xd5 \xd6 \xd7 \xd8 \xd9 \xda \xdb \xdc \xdd
|
\xcf \xd0 \xd1 \xd2 \xd3 \xd4 \xd5 \xd6 \xd7 \xd8 \xd9 \xda \xdb \xdc \xdd
|
||||||
|
@ -1462,7 +1462,7 @@ No match
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: caseless utf
|
Options: caseless utf
|
||||||
Starting code units: Z z { | } ~ \x7f \xc2 \xc3 \xc4 \xc5 \xc6 \xc7 \xc8
|
Starting code units: Z z { | } ~ \x7f \xc2 \xc3 \xc4 \xc5 \xc6 \xc7 \xc8
|
||||||
\xc9 \xca \xcb \xcc \xcd \xce \xcf \xd0 \xd1 \xd2 \xd3 \xd4 \xd5 \xd6 \xd7
|
\xc9 \xca \xcb \xcc \xcd \xce \xcf \xd0 \xd1 \xd2 \xd3 \xd4 \xd5 \xd6 \xd7
|
||||||
|
@ -1503,7 +1503,7 @@ No match
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: caseless utf
|
Options: caseless utf
|
||||||
Starting code units: Z z { | } ~ \x7f \xc2 \xc3 \xc4 \xc5 \xc6 \xc7 \xc8
|
Starting code units: Z z { | } ~ \x7f \xc2 \xc3 \xc4 \xc5 \xc6 \xc7 \xc8
|
||||||
\xc9 \xca \xcb \xcc \xcd \xce \xcf \xd0 \xd1 \xd2 \xd3 \xd4 \xd5 \xd6 \xd7
|
\xc9 \xca \xcb \xcc \xcd \xce \xcf \xd0 \xd1 \xd2 \xd3 \xd4 \xd5 \xd6 \xd7
|
||||||
|
@ -1520,7 +1520,7 @@ Subject length lower bound = 1
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: caseless utf
|
Options: caseless utf
|
||||||
Starting code units: \xce \xcf
|
Starting code units: \xce \xcf
|
||||||
Last code unit = 'B' (caseless)
|
Last code unit = 'B' (caseless)
|
||||||
|
@ -1531,7 +1531,7 @@ Subject length lower bound = 2
|
||||||
Failed: error -3: UTF-8 error: 1 byte missing at end
|
Failed: error -3: UTF-8 error: 1 byte missing at end
|
||||||
|
|
||||||
/(?<=(a)(?-1))x/I,utf
|
/(?<=(a)(?-1))x/I,utf
|
||||||
Capturing subpattern count = 1
|
Capture group count = 1
|
||||||
Max lookbehind = 2
|
Max lookbehind = 2
|
||||||
Options: utf
|
Options: utf
|
||||||
First code unit = 'x'
|
First code unit = 'x'
|
||||||
|
@ -1579,7 +1579,7 @@ Failed: error 176 at offset 259: name is too long in (*MARK), (*PRUNE), (*SKIP),
|
||||||
# but subjects containing them must not be UTF-checked.
|
# but subjects containing them must not be UTF-checked.
|
||||||
|
|
||||||
/\x{d800}/I,utf,allow_surrogate_escapes
|
/\x{d800}/I,utf,allow_surrogate_escapes
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: utf
|
Options: utf
|
||||||
Extra options: allow_surrogate_escapes
|
Extra options: allow_surrogate_escapes
|
||||||
First code unit = \xed
|
First code unit = \xed
|
||||||
|
@ -1602,7 +1602,7 @@ Subject length lower bound = 1
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Compile options: utf
|
Compile options: utf
|
||||||
Overall options: anchored utf
|
Overall options: anchored utf
|
||||||
Starting code units: \x00 \x01 \x02 \x03 \x04 \x05 \x06 \x07 \x08 \x09 \x0a
|
Starting code units: \x00 \x01 \x02 \x03 \x04 \x05 \x06 \x07 \x08 \x09 \x0a
|
||||||
|
@ -1636,4 +1636,13 @@ No match
|
||||||
4(2) Old 22 22 "" New 28 30 "<>"
|
4(2) Old 22 22 "" New 28 30 "<>"
|
||||||
4: 123abc<>\x{e1}yzabc<><def>789abc<>\x{1234}qr
|
4: 123abc<>\x{e1}yzabc<><def>789abc<>\x{1234}qr
|
||||||
|
|
||||||
|
# Check name length with non-ASCII characters
|
||||||
|
|
||||||
|
/(?'ABáC678901234567890123456789012'...)/utf
|
||||||
|
|
||||||
|
/(?'ABáC6789012345678901234567890123'...)/utf
|
||||||
|
Failed: error 148 at offset 36: subpattern name is too long (maximum 32 code units)
|
||||||
|
|
||||||
|
/(?'ABZC6789012345678901234567890123'...)/utf
|
||||||
|
|
||||||
# End of testinput10
|
# End of testinput10
|
||||||
|
|
|
@ -13,11 +13,11 @@
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Subject length lower bound = 1
|
Subject length lower bound = 1
|
||||||
|
|
||||||
/\x{100}/I
|
/\x{100}/I
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
First code unit = \x{100}
|
First code unit = \x{100}
|
||||||
Subject length lower bound = 1
|
Subject length lower bound = 1
|
||||||
|
|
||||||
|
@ -215,7 +215,7 @@ Subject length lower bound = 1
|
||||||
(?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] | \( (?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] )* \) )*
|
(?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] | \( (?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] )* \) )*
|
||||||
\) )* # optional trailing comment
|
\) )* # optional trailing comment
|
||||||
/Ix
|
/Ix
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Contains explicit CR or LF match
|
Contains explicit CR or LF match
|
||||||
Options: extended
|
Options: extended
|
||||||
Starting code units: \x09 \x20 ! " # $ % & ' ( * + - / 0 1 2 3 4 5 6 7 8
|
Starting code units: \x09 \x20 ! " # $ % & ' ( * + - / 0 1 2 3 4 5 6 7 8
|
||||||
|
@ -260,7 +260,7 @@ Subject length lower bound = 3
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
|
|
||||||
/\h+/I
|
/\h+/I
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Starting code units: \x09 \x20 \xa0 \xff
|
Starting code units: \x09 \x20 \xa0 \xff
|
||||||
Subject length lower bound = 1
|
Subject length lower bound = 1
|
||||||
\x{1681}\x{200b}\x{1680}\x{2000}\x{202f}\x{3000}
|
\x{1681}\x{200b}\x{1680}\x{2000}\x{202f}\x{3000}
|
||||||
|
@ -275,7 +275,7 @@ Subject length lower bound = 1
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Starting code units: \x09 \x20 \xa0 \xff
|
Starting code units: \x09 \x20 \xa0 \xff
|
||||||
Subject length lower bound = 1
|
Subject length lower bound = 1
|
||||||
\x{1681}\x{200b}\x{1680}\x{2000}\x{202f}\x{3000}
|
\x{1681}\x{200b}\x{1680}\x{2000}\x{202f}\x{3000}
|
||||||
|
@ -284,7 +284,7 @@ Subject length lower bound = 1
|
||||||
0: \x{200a}\xa0\x{2000}
|
0: \x{200a}\xa0\x{2000}
|
||||||
|
|
||||||
/\H+/I
|
/\H+/I
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Subject length lower bound = 1
|
Subject length lower bound = 1
|
||||||
\x{1680}\x{180e}\x{167f}\x{1681}\x{180d}\x{180f}
|
\x{1680}\x{180e}\x{167f}\x{1681}\x{180d}\x{180f}
|
||||||
0: \x{167f}\x{1681}\x{180d}\x{180f}
|
0: \x{167f}\x{1681}\x{180d}\x{180f}
|
||||||
|
@ -306,7 +306,7 @@ Subject length lower bound = 1
|
||||||
0: \x9f\xa1\x{2fff}\x{3001}
|
0: \x9f\xa1\x{2fff}\x{3001}
|
||||||
|
|
||||||
/\v+/I
|
/\v+/I
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Starting code units: \x0a \x0b \x0c \x0d \x85 \xff
|
Starting code units: \x0a \x0b \x0c \x0d \x85 \xff
|
||||||
Subject length lower bound = 1
|
Subject length lower bound = 1
|
||||||
\x{2027}\x{2030}\x{2028}\x{2029}
|
\x{2027}\x{2030}\x{2028}\x{2029}
|
||||||
|
@ -321,7 +321,7 @@ Subject length lower bound = 1
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Starting code units: \x0a \x0b \x0c \x0d \x85 \xff
|
Starting code units: \x0a \x0b \x0c \x0d \x85 \xff
|
||||||
Subject length lower bound = 1
|
Subject length lower bound = 1
|
||||||
\x{2027}\x{2030}\x{2028}\x{2029}
|
\x{2027}\x{2030}\x{2028}\x{2029}
|
||||||
|
@ -330,7 +330,7 @@ Subject length lower bound = 1
|
||||||
0: \x85\x0a\x0b\x0c\x0d
|
0: \x85\x0a\x0b\x0c\x0d
|
||||||
|
|
||||||
/\V+/I
|
/\V+/I
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Subject length lower bound = 1
|
Subject length lower bound = 1
|
||||||
\x{2028}\x{2029}\x{2027}\x{2030}
|
\x{2028}\x{2029}\x{2027}\x{2030}
|
||||||
0: \x{2027}\x{2030}
|
0: \x{2027}\x{2030}
|
||||||
|
@ -344,7 +344,7 @@ Subject length lower bound = 1
|
||||||
0: \x09\x0e\x84\x86
|
0: \x09\x0e\x84\x86
|
||||||
|
|
||||||
/\R+/I,bsr=unicode
|
/\R+/I,bsr=unicode
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
\R matches any Unicode newline
|
\R matches any Unicode newline
|
||||||
Starting code units: \x0a \x0b \x0c \x0d \x85 \xff
|
Starting code units: \x0a \x0b \x0c \x0d \x85 \xff
|
||||||
Subject length lower bound = 1
|
Subject length lower bound = 1
|
||||||
|
@ -354,7 +354,7 @@ Subject length lower bound = 1
|
||||||
0: \x85\x0a\x0b\x0c\x0d
|
0: \x85\x0a\x0b\x0c\x0d
|
||||||
|
|
||||||
/\x{d800}\x{d7ff}\x{dc00}\x{dc00}\x{dcff}\x{dd00}/I
|
/\x{d800}\x{d7ff}\x{dc00}\x{dc00}\x{dcff}\x{dd00}/I
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
First code unit = \x{d800}
|
First code unit = \x{d800}
|
||||||
Last code unit = \x{dd00}
|
Last code unit = \x{dd00}
|
||||||
Subject length lower bound = 6
|
Subject length lower bound = 6
|
||||||
|
@ -600,7 +600,7 @@ Failed: error 134 at offset 9: character code point value in \x{} or \o{} is too
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Starting code units: \x00 \x01 \x02 \x03 \x04 \x05 \x06 \x07 \x08 \x0a \x0b
|
Starting code units: \x00 \x01 \x02 \x03 \x04 \x05 \x06 \x07 \x08 \x0a \x0b
|
||||||
\x0c \x0d \x0e \x0f \x10 \x11 \x12 \x13 \x14 \x15 \x16 \x17 \x18 \x19 \x1a
|
\x0c \x0d \x0e \x0f \x10 \x11 \x12 \x13 \x14 \x15 \x16 \x17 \x18 \x19 \x1a
|
||||||
\x1b \x1c \x1d \x1e \x1f ! " # $ % & ' ( ) * + , - . / 0 1 2 3 4 5 6 7 8 9
|
\x1b \x1c \x1d \x1e \x1f ! " # $ % & ' ( ) * + , - . / 0 1 2 3 4 5 6 7 8 9
|
||||||
|
@ -624,7 +624,7 @@ Subject length lower bound = 1
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Starting code units: \x00 \x01 \x02 \x03 \x04 \x05 \x06 \x07 \x08 \x09 \x0e
|
Starting code units: \x00 \x01 \x02 \x03 \x04 \x05 \x06 \x07 \x08 \x09 \x0e
|
||||||
\x0f \x10 \x11 \x12 \x13 \x14 \x15 \x16 \x17 \x18 \x19 \x1a \x1b \x1c \x1d
|
\x0f \x10 \x11 \x12 \x13 \x14 \x15 \x16 \x17 \x18 \x19 \x1a \x1b \x1c \x1d
|
||||||
\x1e \x1f \x20 ! " # $ % & ' ( ) * + , - . / 0 1 2 3 4 5 6 7 8 9 : ; < = >
|
\x1e \x1f \x20 ! " # $ % & ' ( ) * + , - . / 0 1 2 3 4 5 6 7 8 9 : ; < = >
|
||||||
|
|
|
@ -13,11 +13,11 @@
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Subject length lower bound = 1
|
Subject length lower bound = 1
|
||||||
|
|
||||||
/\x{100}/I
|
/\x{100}/I
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
First code unit = \x{100}
|
First code unit = \x{100}
|
||||||
Subject length lower bound = 1
|
Subject length lower bound = 1
|
||||||
|
|
||||||
|
@ -215,7 +215,7 @@ Subject length lower bound = 1
|
||||||
(?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] | \( (?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] )* \) )*
|
(?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] | \( (?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] )* \) )*
|
||||||
\) )* # optional trailing comment
|
\) )* # optional trailing comment
|
||||||
/Ix
|
/Ix
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Contains explicit CR or LF match
|
Contains explicit CR or LF match
|
||||||
Options: extended
|
Options: extended
|
||||||
Starting code units: \x09 \x20 ! " # $ % & ' ( * + - / 0 1 2 3 4 5 6 7 8
|
Starting code units: \x09 \x20 ! " # $ % & ' ( * + - / 0 1 2 3 4 5 6 7 8
|
||||||
|
@ -260,7 +260,7 @@ Subject length lower bound = 3
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
|
|
||||||
/\h+/I
|
/\h+/I
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Starting code units: \x09 \x20 \xa0 \xff
|
Starting code units: \x09 \x20 \xa0 \xff
|
||||||
Subject length lower bound = 1
|
Subject length lower bound = 1
|
||||||
\x{1681}\x{200b}\x{1680}\x{2000}\x{202f}\x{3000}
|
\x{1681}\x{200b}\x{1680}\x{2000}\x{202f}\x{3000}
|
||||||
|
@ -275,7 +275,7 @@ Subject length lower bound = 1
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Starting code units: \x09 \x20 \xa0 \xff
|
Starting code units: \x09 \x20 \xa0 \xff
|
||||||
Subject length lower bound = 1
|
Subject length lower bound = 1
|
||||||
\x{1681}\x{200b}\x{1680}\x{2000}\x{202f}\x{3000}
|
\x{1681}\x{200b}\x{1680}\x{2000}\x{202f}\x{3000}
|
||||||
|
@ -284,7 +284,7 @@ Subject length lower bound = 1
|
||||||
0: \x{200a}\xa0\x{2000}
|
0: \x{200a}\xa0\x{2000}
|
||||||
|
|
||||||
/\H+/I
|
/\H+/I
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Subject length lower bound = 1
|
Subject length lower bound = 1
|
||||||
\x{1680}\x{180e}\x{167f}\x{1681}\x{180d}\x{180f}
|
\x{1680}\x{180e}\x{167f}\x{1681}\x{180d}\x{180f}
|
||||||
0: \x{167f}\x{1681}\x{180d}\x{180f}
|
0: \x{167f}\x{1681}\x{180d}\x{180f}
|
||||||
|
@ -306,7 +306,7 @@ Subject length lower bound = 1
|
||||||
0: \x9f\xa1\x{2fff}\x{3001}
|
0: \x9f\xa1\x{2fff}\x{3001}
|
||||||
|
|
||||||
/\v+/I
|
/\v+/I
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Starting code units: \x0a \x0b \x0c \x0d \x85 \xff
|
Starting code units: \x0a \x0b \x0c \x0d \x85 \xff
|
||||||
Subject length lower bound = 1
|
Subject length lower bound = 1
|
||||||
\x{2027}\x{2030}\x{2028}\x{2029}
|
\x{2027}\x{2030}\x{2028}\x{2029}
|
||||||
|
@ -321,7 +321,7 @@ Subject length lower bound = 1
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Starting code units: \x0a \x0b \x0c \x0d \x85 \xff
|
Starting code units: \x0a \x0b \x0c \x0d \x85 \xff
|
||||||
Subject length lower bound = 1
|
Subject length lower bound = 1
|
||||||
\x{2027}\x{2030}\x{2028}\x{2029}
|
\x{2027}\x{2030}\x{2028}\x{2029}
|
||||||
|
@ -330,7 +330,7 @@ Subject length lower bound = 1
|
||||||
0: \x85\x0a\x0b\x0c\x0d
|
0: \x85\x0a\x0b\x0c\x0d
|
||||||
|
|
||||||
/\V+/I
|
/\V+/I
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Subject length lower bound = 1
|
Subject length lower bound = 1
|
||||||
\x{2028}\x{2029}\x{2027}\x{2030}
|
\x{2028}\x{2029}\x{2027}\x{2030}
|
||||||
0: \x{2027}\x{2030}
|
0: \x{2027}\x{2030}
|
||||||
|
@ -344,7 +344,7 @@ Subject length lower bound = 1
|
||||||
0: \x09\x0e\x84\x86
|
0: \x09\x0e\x84\x86
|
||||||
|
|
||||||
/\R+/I,bsr=unicode
|
/\R+/I,bsr=unicode
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
\R matches any Unicode newline
|
\R matches any Unicode newline
|
||||||
Starting code units: \x0a \x0b \x0c \x0d \x85 \xff
|
Starting code units: \x0a \x0b \x0c \x0d \x85 \xff
|
||||||
Subject length lower bound = 1
|
Subject length lower bound = 1
|
||||||
|
@ -354,7 +354,7 @@ Subject length lower bound = 1
|
||||||
0: \x85\x0a\x0b\x0c\x0d
|
0: \x85\x0a\x0b\x0c\x0d
|
||||||
|
|
||||||
/\x{d800}\x{d7ff}\x{dc00}\x{dc00}\x{dcff}\x{dd00}/I
|
/\x{d800}\x{d7ff}\x{dc00}\x{dc00}\x{dcff}\x{dd00}/I
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
First code unit = \x{d800}
|
First code unit = \x{d800}
|
||||||
Last code unit = \x{dd00}
|
Last code unit = \x{dd00}
|
||||||
Subject length lower bound = 6
|
Subject length lower bound = 6
|
||||||
|
@ -558,19 +558,19 @@ Failed: error 134 at offset 12: character code point value in \x{} or \o{} is to
|
||||||
Failed: error 134 at offset 14: character code point value in \x{} or \o{} is too large
|
Failed: error 134 at offset 14: character code point value in \x{} or \o{} is too large
|
||||||
|
|
||||||
/\x{7fffffff}\x{7fffffff}/I
|
/\x{7fffffff}\x{7fffffff}/I
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
First code unit = \x{7fffffff}
|
First code unit = \x{7fffffff}
|
||||||
Last code unit = \x{7fffffff}
|
Last code unit = \x{7fffffff}
|
||||||
Subject length lower bound = 2
|
Subject length lower bound = 2
|
||||||
|
|
||||||
/\x{80000000}\x{80000000}/I
|
/\x{80000000}\x{80000000}/I
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
First code unit = \x{80000000}
|
First code unit = \x{80000000}
|
||||||
Last code unit = \x{80000000}
|
Last code unit = \x{80000000}
|
||||||
Subject length lower bound = 2
|
Subject length lower bound = 2
|
||||||
|
|
||||||
/\x{ffffffff}\x{ffffffff}/I
|
/\x{ffffffff}\x{ffffffff}/I
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
First code unit = \x{ffffffff}
|
First code unit = \x{ffffffff}
|
||||||
Last code unit = \x{ffffffff}
|
Last code unit = \x{ffffffff}
|
||||||
Subject length lower bound = 2
|
Subject length lower bound = 2
|
||||||
|
@ -588,7 +588,7 @@ Subject length lower bound = 2
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: caseless
|
Options: caseless
|
||||||
First code unit = \x{400000}
|
First code unit = \x{400000}
|
||||||
Last code unit = \x{800000}
|
Last code unit = \x{800000}
|
||||||
|
@ -603,7 +603,7 @@ Subject length lower bound = 2
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Starting code units: \x00 \x01 \x02 \x03 \x04 \x05 \x06 \x07 \x08 \x0a \x0b
|
Starting code units: \x00 \x01 \x02 \x03 \x04 \x05 \x06 \x07 \x08 \x0a \x0b
|
||||||
\x0c \x0d \x0e \x0f \x10 \x11 \x12 \x13 \x14 \x15 \x16 \x17 \x18 \x19 \x1a
|
\x0c \x0d \x0e \x0f \x10 \x11 \x12 \x13 \x14 \x15 \x16 \x17 \x18 \x19 \x1a
|
||||||
\x1b \x1c \x1d \x1e \x1f ! " # $ % & ' ( ) * + , - . / 0 1 2 3 4 5 6 7 8 9
|
\x1b \x1c \x1d \x1e \x1f ! " # $ % & ' ( ) * + , - . / 0 1 2 3 4 5 6 7 8 9
|
||||||
|
@ -627,7 +627,7 @@ Subject length lower bound = 1
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Starting code units: \x00 \x01 \x02 \x03 \x04 \x05 \x06 \x07 \x08 \x09 \x0e
|
Starting code units: \x00 \x01 \x02 \x03 \x04 \x05 \x06 \x07 \x08 \x09 \x0e
|
||||||
\x0f \x10 \x11 \x12 \x13 \x14 \x15 \x16 \x17 \x18 \x19 \x1a \x1b \x1c \x1d
|
\x0f \x10 \x11 \x12 \x13 \x14 \x15 \x16 \x17 \x18 \x19 \x1a \x1b \x1c \x1d
|
||||||
\x1e \x1f \x20 ! " # $ % & ' ( ) * + , - . / 0 1 2 3 4 5 6 7 8 9 : ; < = >
|
\x1e \x1f \x20 ! " # $ % & ' ( ) * + , - . / 0 1 2 3 4 5 6 7 8 9 : ; < = >
|
||||||
|
|
|
@ -18,7 +18,7 @@
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: utf
|
Options: utf
|
||||||
First code unit = \x{ffff}
|
First code unit = \x{ffff}
|
||||||
Subject length lower bound = 1
|
Subject length lower bound = 1
|
||||||
|
@ -30,7 +30,7 @@ Subject length lower bound = 1
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: utf
|
Options: utf
|
||||||
First code unit = \x{d800}
|
First code unit = \x{d800}
|
||||||
Last code unit = \x{dc00}
|
Last code unit = \x{dc00}
|
||||||
|
@ -43,7 +43,7 @@ Subject length lower bound = 1
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: utf
|
Options: utf
|
||||||
First code unit = \x{100}
|
First code unit = \x{100}
|
||||||
Subject length lower bound = 1
|
Subject length lower bound = 1
|
||||||
|
@ -55,7 +55,7 @@ Subject length lower bound = 1
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: utf
|
Options: utf
|
||||||
First code unit = \x{1000}
|
First code unit = \x{1000}
|
||||||
Subject length lower bound = 1
|
Subject length lower bound = 1
|
||||||
|
@ -67,7 +67,7 @@ Subject length lower bound = 1
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: utf
|
Options: utf
|
||||||
First code unit = \x{d800}
|
First code unit = \x{d800}
|
||||||
Last code unit = \x{dc00}
|
Last code unit = \x{dc00}
|
||||||
|
@ -80,7 +80,7 @@ Subject length lower bound = 1
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: utf
|
Options: utf
|
||||||
First code unit = \x{dbc0}
|
First code unit = \x{dbc0}
|
||||||
Last code unit = \x{dc00}
|
Last code unit = \x{dc00}
|
||||||
|
@ -93,7 +93,7 @@ Subject length lower bound = 1
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: utf
|
Options: utf
|
||||||
First code unit = \x{dbff}
|
First code unit = \x{dbff}
|
||||||
Last code unit = \x{dfff}
|
Last code unit = \x{dfff}
|
||||||
|
@ -106,7 +106,7 @@ Subject length lower bound = 1
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: utf
|
Options: utf
|
||||||
First code unit = \xff
|
First code unit = \xff
|
||||||
Subject length lower bound = 1
|
Subject length lower bound = 1
|
||||||
|
@ -118,7 +118,7 @@ Subject length lower bound = 1
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: utf
|
Options: utf
|
||||||
First code unit = \x{100}
|
First code unit = \x{100}
|
||||||
Subject length lower bound = 1
|
Subject length lower bound = 1
|
||||||
|
@ -130,7 +130,7 @@ Subject length lower bound = 1
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: utf
|
Options: utf
|
||||||
First code unit = \x80
|
First code unit = \x80
|
||||||
Subject length lower bound = 1
|
Subject length lower bound = 1
|
||||||
|
@ -142,7 +142,7 @@ Subject length lower bound = 1
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: utf
|
Options: utf
|
||||||
First code unit = \xff
|
First code unit = \xff
|
||||||
Subject length lower bound = 1
|
Subject length lower bound = 1
|
||||||
|
@ -154,7 +154,7 @@ Subject length lower bound = 1
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: utf
|
Options: utf
|
||||||
First code unit = \x{d55c}
|
First code unit = \x{d55c}
|
||||||
Last code unit = \x{c5b4}
|
Last code unit = \x{c5b4}
|
||||||
|
@ -169,7 +169,7 @@ Subject length lower bound = 3
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: utf
|
Options: utf
|
||||||
First code unit = \x{65e5}
|
First code unit = \x{65e5}
|
||||||
Last code unit = \x{8a9e}
|
Last code unit = \x{8a9e}
|
||||||
|
@ -184,7 +184,7 @@ Subject length lower bound = 3
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: utf
|
Options: utf
|
||||||
First code unit = \x80
|
First code unit = \x80
|
||||||
Subject length lower bound = 1
|
Subject length lower bound = 1
|
||||||
|
@ -196,7 +196,7 @@ Subject length lower bound = 1
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: utf
|
Options: utf
|
||||||
First code unit = \x84
|
First code unit = \x84
|
||||||
Subject length lower bound = 1
|
Subject length lower bound = 1
|
||||||
|
@ -208,7 +208,7 @@ Subject length lower bound = 1
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: utf
|
Options: utf
|
||||||
First code unit = \x{104}
|
First code unit = \x{104}
|
||||||
Subject length lower bound = 1
|
Subject length lower bound = 1
|
||||||
|
@ -220,7 +220,7 @@ Subject length lower bound = 1
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: utf
|
Options: utf
|
||||||
First code unit = \x{861}
|
First code unit = \x{861}
|
||||||
Subject length lower bound = 1
|
Subject length lower bound = 1
|
||||||
|
@ -232,7 +232,7 @@ Subject length lower bound = 1
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: utf
|
Options: utf
|
||||||
First code unit = \x{d844}
|
First code unit = \x{d844}
|
||||||
Last code unit = \x{deab}
|
Last code unit = \x{deab}
|
||||||
|
@ -245,7 +245,7 @@ Subject length lower bound = 1
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: utf
|
Options: utf
|
||||||
Starting code units: \x00 \x01 \x02 \x03 \x04 \x05 \x06 \x07 \x08 \x09 \x0a
|
Starting code units: \x00 \x01 \x02 \x03 \x04 \x05 \x06 \x07 \x08 \x09 \x0a
|
||||||
\x0b \x0c \x0d \x0e \x0f \x10 \x11 \x12 \x13 \x14 \x15 \x16 \x17 \x18 \x19
|
\x0b \x0c \x0d \x0e \x0f \x10 \x11 \x12 \x13 \x14 \x15 \x16 \x17 \x18 \x19
|
||||||
|
@ -281,7 +281,7 @@ No match
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: utf
|
Options: utf
|
||||||
First code unit = \x{100}
|
First code unit = \x{100}
|
||||||
Last code unit = \x{100}
|
Last code unit = \x{100}
|
||||||
|
@ -300,7 +300,7 @@ Subject length lower bound = 3
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 1
|
Capture group count = 1
|
||||||
Options: utf
|
Options: utf
|
||||||
Starting code units: x \xff
|
Starting code units: x \xff
|
||||||
Subject length lower bound = 1
|
Subject length lower bound = 1
|
||||||
|
@ -317,7 +317,7 @@ Subject length lower bound = 1
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 1
|
Capture group count = 1
|
||||||
Options: utf
|
Options: utf
|
||||||
Starting code units: a x \xff
|
Starting code units: a x \xff
|
||||||
Subject length lower bound = 1
|
Subject length lower bound = 1
|
||||||
|
@ -334,7 +334,7 @@ Subject length lower bound = 1
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 1
|
Capture group count = 1
|
||||||
Options: utf
|
Options: utf
|
||||||
Starting code units: a x \xff
|
Starting code units: a x \xff
|
||||||
Subject length lower bound = 1
|
Subject length lower bound = 1
|
||||||
|
@ -352,7 +352,7 @@ Subject length lower bound = 1
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 1
|
Capture group count = 1
|
||||||
Options: utf
|
Options: utf
|
||||||
Starting code units: x \xff
|
Starting code units: x \xff
|
||||||
Subject length lower bound = 1
|
Subject length lower bound = 1
|
||||||
|
@ -364,7 +364,7 @@ Subject length lower bound = 1
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: utf
|
Options: utf
|
||||||
First code unit = \x{100}
|
First code unit = \x{100}
|
||||||
Subject length lower bound = 1
|
Subject length lower bound = 1
|
||||||
|
@ -377,7 +377,7 @@ Subject length lower bound = 1
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: utf
|
Options: utf
|
||||||
First code unit = 'a'
|
First code unit = 'a'
|
||||||
Last code unit = \x{100}
|
Last code unit = \x{100}
|
||||||
|
@ -391,7 +391,7 @@ Subject length lower bound = 2
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: utf
|
Options: utf
|
||||||
First code unit = 'a'
|
First code unit = 'a'
|
||||||
Last code unit = \x{101}
|
Last code unit = \x{101}
|
||||||
|
@ -404,7 +404,7 @@ Subject length lower bound = 3
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Subject length lower bound = 1
|
Subject length lower bound = 1
|
||||||
|
|
||||||
/[\x{100}]/IB,utf
|
/[\x{100}]/IB,utf
|
||||||
|
@ -414,7 +414,7 @@ Subject length lower bound = 1
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: utf
|
Options: utf
|
||||||
First code unit = \x{100}
|
First code unit = \x{100}
|
||||||
Subject length lower bound = 1
|
Subject length lower bound = 1
|
||||||
|
@ -432,7 +432,7 @@ Subject length lower bound = 1
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: utf
|
Options: utf
|
||||||
First code unit = \xff
|
First code unit = \xff
|
||||||
Subject length lower bound = 1
|
Subject length lower bound = 1
|
||||||
|
@ -446,7 +446,7 @@ Subject length lower bound = 1
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: utf
|
Options: utf
|
||||||
Subject length lower bound = 1
|
Subject length lower bound = 1
|
||||||
|
|
||||||
|
@ -461,14 +461,14 @@ Subject length lower bound = 1
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 1
|
Capture group count = 1
|
||||||
Options: utf
|
Options: utf
|
||||||
First code unit = \x{100}
|
First code unit = \x{100}
|
||||||
Last code unit = 'z'
|
Last code unit = 'z'
|
||||||
Subject length lower bound = 7
|
Subject length lower bound = 7
|
||||||
|
|
||||||
/\777/I,utf
|
/\777/I,utf
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: utf
|
Options: utf
|
||||||
First code unit = \x{1ff}
|
First code unit = \x{1ff}
|
||||||
Subject length lower bound = 1
|
Subject length lower bound = 1
|
||||||
|
@ -485,7 +485,7 @@ Subject length lower bound = 1
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: utf
|
Options: utf
|
||||||
First code unit = \x{100}
|
First code unit = \x{100}
|
||||||
Last code unit = \x{200}
|
Last code unit = \x{200}
|
||||||
|
@ -499,7 +499,7 @@ Subject length lower bound = 2
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: utf
|
Options: utf
|
||||||
First code unit = \x{100}
|
First code unit = \x{100}
|
||||||
Last code unit = 'X'
|
Last code unit = 'X'
|
||||||
|
@ -547,7 +547,7 @@ Failed: error -24: UTF-16 error: missing low surrogate at end at offset 2
|
||||||
0: \x{11234}
|
0: \x{11234}
|
||||||
|
|
||||||
/(*UTF)\x{11234}/I
|
/(*UTF)\x{11234}/I
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Compile options: <none>
|
Compile options: <none>
|
||||||
Overall options: utf
|
Overall options: utf
|
||||||
First code unit = \x{d804}
|
First code unit = \x{d804}
|
||||||
|
@ -565,7 +565,7 @@ Failed: error 160 at offset 5: (*VERB) not recognized or malformed
|
||||||
abcd\x{11234}pqr
|
abcd\x{11234}pqr
|
||||||
|
|
||||||
/(*CRLF)(*UTF16)(*BSR_UNICODE)a\Rb/I
|
/(*CRLF)(*UTF16)(*BSR_UNICODE)a\Rb/I
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Compile options: <none>
|
Compile options: <none>
|
||||||
Overall options: utf
|
Overall options: utf
|
||||||
\R matches any Unicode newline
|
\R matches any Unicode newline
|
||||||
|
@ -578,7 +578,7 @@ Subject length lower bound = 3
|
||||||
Failed: error 160 at offset 14: (*VERB) not recognized or malformed
|
Failed: error 160 at offset 14: (*VERB) not recognized or malformed
|
||||||
|
|
||||||
/\h/I,utf
|
/\h/I,utf
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: utf
|
Options: utf
|
||||||
Starting code units: \x09 \x20 \xa0 \xff
|
Starting code units: \x09 \x20 \xa0 \xff
|
||||||
Subject length lower bound = 1
|
Subject length lower bound = 1
|
||||||
|
@ -602,7 +602,7 @@ Subject length lower bound = 1
|
||||||
0: \x{3000}
|
0: \x{3000}
|
||||||
|
|
||||||
/\v/I,utf
|
/\v/I,utf
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: utf
|
Options: utf
|
||||||
Starting code units: \x0a \x0b \x0c \x0d \x85 \xff
|
Starting code units: \x0a \x0b \x0c \x0d \x85 \xff
|
||||||
Subject length lower bound = 1
|
Subject length lower bound = 1
|
||||||
|
@ -620,7 +620,7 @@ Subject length lower bound = 1
|
||||||
0: \x{2028}
|
0: \x{2028}
|
||||||
|
|
||||||
/\h*A/I,utf
|
/\h*A/I,utf
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: utf
|
Options: utf
|
||||||
Starting code units: \x09 \x20 A \xa0 \xff
|
Starting code units: \x09 \x20 A \xa0 \xff
|
||||||
Last code unit = 'A'
|
Last code unit = 'A'
|
||||||
|
@ -631,7 +631,7 @@ Subject length lower bound = 1
|
||||||
0: \x{2000}A
|
0: \x{2000}A
|
||||||
|
|
||||||
/\R*A/I,bsr=unicode,utf
|
/\R*A/I,bsr=unicode,utf
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: utf
|
Options: utf
|
||||||
\R matches any Unicode newline
|
\R matches any Unicode newline
|
||||||
Starting code units: \x0a \x0b \x0c \x0d A \x85 \xff
|
Starting code units: \x0a \x0b \x0c \x0d A \x85 \xff
|
||||||
|
@ -643,21 +643,21 @@ Subject length lower bound = 1
|
||||||
0: \x{2028}A
|
0: \x{2028}A
|
||||||
|
|
||||||
/\v+A/I,utf
|
/\v+A/I,utf
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: utf
|
Options: utf
|
||||||
Starting code units: \x0a \x0b \x0c \x0d \x85 \xff
|
Starting code units: \x0a \x0b \x0c \x0d \x85 \xff
|
||||||
Last code unit = 'A'
|
Last code unit = 'A'
|
||||||
Subject length lower bound = 2
|
Subject length lower bound = 2
|
||||||
|
|
||||||
/\s?xxx\s/I,utf
|
/\s?xxx\s/I,utf
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: utf
|
Options: utf
|
||||||
Starting code units: \x09 \x0a \x0b \x0c \x0d \x20 x
|
Starting code units: \x09 \x0a \x0b \x0c \x0d \x20 x
|
||||||
Last code unit = 'x'
|
Last code unit = 'x'
|
||||||
Subject length lower bound = 4
|
Subject length lower bound = 4
|
||||||
|
|
||||||
/\sxxx\s/I,utf,tables=2
|
/\sxxx\s/I,utf,tables=2
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: utf
|
Options: utf
|
||||||
Starting code units: \x09 \x0a \x0b \x0c \x0d \x20 \x85 \xa0
|
Starting code units: \x09 \x0a \x0b \x0c \x0d \x20 \x85 \xa0
|
||||||
Last code unit = 'x'
|
Last code unit = 'x'
|
||||||
|
@ -668,7 +668,7 @@ Subject length lower bound = 5
|
||||||
0: \x{a0}xxx\x{85}
|
0: \x{a0}xxx\x{85}
|
||||||
|
|
||||||
/\S \S/I,utf,tables=2
|
/\S \S/I,utf,tables=2
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: utf
|
Options: utf
|
||||||
Starting code units: \x00 \x01 \x02 \x03 \x04 \x05 \x06 \x07 \x08 \x0e \x0f
|
Starting code units: \x00 \x01 \x02 \x03 \x04 \x05 \x06 \x07 \x08 \x0e \x0f
|
||||||
\x10 \x11 \x12 \x13 \x14 \x15 \x16 \x17 \x18 \x19 \x1a \x1b \x1c \x1d \x1e
|
\x10 \x11 \x12 \x13 \x14 \x15 \x16 \x17 \x18 \x19 \x1a \x1b \x1c \x1d \x1e
|
||||||
|
@ -708,25 +708,25 @@ Failed: error -33: bad offset value
|
||||||
Failed: error -33: bad offset value
|
Failed: error -33: bad offset value
|
||||||
|
|
||||||
/\x{1234}+/Ii,utf
|
/\x{1234}+/Ii,utf
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: caseless utf
|
Options: caseless utf
|
||||||
First code unit = \x{1234}
|
First code unit = \x{1234}
|
||||||
Subject length lower bound = 1
|
Subject length lower bound = 1
|
||||||
|
|
||||||
/\x{1234}+?/Ii,utf
|
/\x{1234}+?/Ii,utf
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: caseless utf
|
Options: caseless utf
|
||||||
First code unit = \x{1234}
|
First code unit = \x{1234}
|
||||||
Subject length lower bound = 1
|
Subject length lower bound = 1
|
||||||
|
|
||||||
/\x{1234}++/Ii,utf
|
/\x{1234}++/Ii,utf
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: caseless utf
|
Options: caseless utf
|
||||||
First code unit = \x{1234}
|
First code unit = \x{1234}
|
||||||
Subject length lower bound = 1
|
Subject length lower bound = 1
|
||||||
|
|
||||||
/\x{1234}{2}/Ii,utf
|
/\x{1234}{2}/Ii,utf
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: caseless utf
|
Options: caseless utf
|
||||||
First code unit = \x{1234}
|
First code unit = \x{1234}
|
||||||
Last code unit = \x{1234}
|
Last code unit = \x{1234}
|
||||||
|
@ -739,7 +739,7 @@ Subject length lower bound = 2
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: utf
|
Options: utf
|
||||||
Subject length lower bound = 1
|
Subject length lower bound = 1
|
||||||
|
|
||||||
|
@ -751,14 +751,14 @@ Subject length lower bound = 1
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: utf
|
Options: utf
|
||||||
First code unit = 'X'
|
First code unit = 'X'
|
||||||
Last code unit = \x{200}
|
Last code unit = \x{200}
|
||||||
Subject length lower bound = 2
|
Subject length lower bound = 2
|
||||||
|
|
||||||
/\R/I,utf
|
/\R/I,utf
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: utf
|
Options: utf
|
||||||
Starting code units: \x0a \x0b \x0c \x0d \x85 \xff
|
Starting code units: \x0a \x0b \x0c \x0d \x85 \xff
|
||||||
Subject length lower bound = 1
|
Subject length lower bound = 1
|
||||||
|
@ -936,7 +936,7 @@ Failed: error 174 at offset 0: using UTF is disabled by the application
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: caseless utf
|
Options: caseless utf
|
||||||
First code unit = 'A' (caseless)
|
First code unit = 'A' (caseless)
|
||||||
Last code unit = \x{1fb0} (caseless)
|
Last code unit = \x{1fb0} (caseless)
|
||||||
|
@ -949,7 +949,7 @@ Subject length lower bound = 5
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: utf
|
Options: utf
|
||||||
First code unit = 'A'
|
First code unit = 'A'
|
||||||
Last code unit = \x{1fb0}
|
Last code unit = \x{1fb0}
|
||||||
|
@ -962,7 +962,7 @@ Subject length lower bound = 5
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: utf
|
Options: utf
|
||||||
First code unit = 'A'
|
First code unit = 'A'
|
||||||
Last code unit = \x{1fb0}
|
Last code unit = \x{1fb0}
|
||||||
|
@ -975,14 +975,14 @@ Subject length lower bound = 3
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: caseless utf
|
Options: caseless utf
|
||||||
First code unit = 'A' (caseless)
|
First code unit = 'A' (caseless)
|
||||||
Last code unit = \x{1fb0} (caseless)
|
Last code unit = \x{1fb0} (caseless)
|
||||||
Subject length lower bound = 3
|
Subject length lower bound = 3
|
||||||
|
|
||||||
/\x{401}\x{420}\x{421}\x{422}\x{423}\x{424}\x{425}\x{426}\x{427}\x{428}\x{429}\x{42a}\x{42b}\x{42c}\x{42d}\x{42e}\x{42f}/Ii,utf
|
/\x{401}\x{420}\x{421}\x{422}\x{423}\x{424}\x{425}\x{426}\x{427}\x{428}\x{429}\x{42a}\x{42b}\x{42c}\x{42d}\x{42e}\x{42f}/Ii,utf
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: caseless utf
|
Options: caseless utf
|
||||||
First code unit = \x{401} (caseless)
|
First code unit = \x{401} (caseless)
|
||||||
Last code unit = \x{42f} (caseless)
|
Last code unit = \x{42f} (caseless)
|
||||||
|
@ -1017,7 +1017,7 @@ Subject length lower bound = 17
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
|
|
||||||
/\x{212a}+/Ii,utf
|
/\x{212a}+/Ii,utf
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: caseless utf
|
Options: caseless utf
|
||||||
Starting code units: K k \xff
|
Starting code units: K k \xff
|
||||||
Subject length lower bound = 1
|
Subject length lower bound = 1
|
||||||
|
@ -1025,7 +1025,7 @@ Subject length lower bound = 1
|
||||||
0: KKkk\x{212a}
|
0: KKkk\x{212a}
|
||||||
|
|
||||||
/s+/Ii,utf
|
/s+/Ii,utf
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: caseless utf
|
Options: caseless utf
|
||||||
Starting code units: S s \xff
|
Starting code units: S s \xff
|
||||||
Subject length lower bound = 1
|
Subject length lower bound = 1
|
||||||
|
@ -1048,7 +1048,7 @@ Failed: error 134 at offset 10: character code point value in \x{} or \o{} is to
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: utf
|
Options: utf
|
||||||
Starting code units: A \xff
|
Starting code units: A \xff
|
||||||
Last code unit = 'A'
|
Last code unit = 'A'
|
||||||
|
@ -1065,7 +1065,7 @@ Subject length lower bound = 1
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: utf
|
Options: utf
|
||||||
Starting code units: 0 1 2 3 4 5 6 7 8 9 \xff
|
Starting code units: 0 1 2 3 4 5 6 7 8 9 \xff
|
||||||
Subject length lower bound = 1
|
Subject length lower bound = 1
|
||||||
|
@ -1077,7 +1077,7 @@ Subject length lower bound = 1
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: utf
|
Options: utf
|
||||||
Starting code units: Z \xff
|
Starting code units: Z \xff
|
||||||
Subject length lower bound = 1
|
Subject length lower bound = 1
|
||||||
|
@ -1095,7 +1095,7 @@ Subject length lower bound = 1
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: utf
|
Options: utf
|
||||||
Starting code units: z { | } ~ \x7f \x80 \x81 \x82 \x83 \x84 \x85 \x86 \x87
|
Starting code units: z { | } ~ \x7f \x80 \x81 \x82 \x83 \x84 \x85 \x86 \x87
|
||||||
\x88 \x89 \x8a \x8b \x8c \x8d \x8e \x8f \x90 \x91 \x92 \x93 \x94 \x95 \x96
|
\x88 \x89 \x8a \x8b \x8c \x8d \x8e \x8f \x90 \x91 \x92 \x93 \x94 \x95 \x96
|
||||||
|
@ -1115,7 +1115,7 @@ Subject length lower bound = 1
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: utf
|
Options: utf
|
||||||
Starting code units: - ] a d z \xff
|
Starting code units: - ] a d z \xff
|
||||||
Subject length lower bound = 1
|
Subject length lower bound = 1
|
||||||
|
@ -1136,7 +1136,7 @@ Subject length lower bound = 1
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 1
|
Capture group count = 1
|
||||||
Options: utf
|
Options: utf
|
||||||
Starting code units: a b \xff
|
Starting code units: a b \xff
|
||||||
Last code unit = 'z'
|
Last code unit = 'z'
|
||||||
|
@ -1150,7 +1150,7 @@ Subject length lower bound = 7
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: utf
|
Options: utf
|
||||||
Starting code units: \x09 \x0a \x0b \x0c \x0d \x20 \xff
|
Starting code units: \x09 \x0a \x0b \x0c \x0d \x20 \xff
|
||||||
Subject length lower bound = 1
|
Subject length lower bound = 1
|
||||||
|
@ -1163,7 +1163,7 @@ Subject length lower bound = 1
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: utf
|
Options: utf
|
||||||
Starting code units: 0 1 2 3 4 5 6 7 8 9 \xff
|
Starting code units: 0 1 2 3 4 5 6 7 8 9 \xff
|
||||||
Subject length lower bound = 1
|
Subject length lower bound = 1
|
||||||
|
@ -1176,7 +1176,7 @@ Subject length lower bound = 1
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: utf
|
Options: utf
|
||||||
Starting code units: 0 1 2 3 4 5 6 7 8 9 A B C D E F G H I J K L M N O P
|
Starting code units: 0 1 2 3 4 5 6 7 8 9 A B C D E F G H I J K L M N O P
|
||||||
Q R S T U V W X Y Z _ a b c d e f g h i j k l m n o p q r s t u v w x y z
|
Q R S T U V W X Y Z _ a b c d e f g h i j k l m n o p q r s t u v w x y z
|
||||||
|
@ -1191,7 +1191,7 @@ Subject length lower bound = 1
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: utf
|
Options: utf
|
||||||
Starting code units: \x00 \x01 \x02 \x03 \x04 \x05 \x06 \x07 \x08 \x09 \x0a
|
Starting code units: \x00 \x01 \x02 \x03 \x04 \x05 \x06 \x07 \x08 \x09 \x0a
|
||||||
\x0b \x0c \x0d \x0e \x0f \x10 \x11 \x12 \x13 \x14 \x15 \x16 \x17 \x18 \x19
|
\x0b \x0c \x0d \x0e \x0f \x10 \x11 \x12 \x13 \x14 \x15 \x16 \x17 \x18 \x19
|
||||||
|
@ -1217,7 +1217,7 @@ Subject length lower bound = 1
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: utf
|
Options: utf
|
||||||
Starting code units: \x00 \x01 \x02 \x03 \x04 \x05 \x06 \x07 \x08 \x0e \x0f
|
Starting code units: \x00 \x01 \x02 \x03 \x04 \x05 \x06 \x07 \x08 \x0e \x0f
|
||||||
\x10 \x11 \x12 \x13 \x14 \x15 \x16 \x17 \x18 \x19 \x1a \x1b \x1c \x1d \x1e
|
\x10 \x11 \x12 \x13 \x14 \x15 \x16 \x17 \x18 \x19 \x1a \x1b \x1c \x1d \x1e
|
||||||
|
@ -1243,7 +1243,7 @@ Subject length lower bound = 1
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: utf
|
Options: utf
|
||||||
Starting code units: \x00 \x01 \x02 \x03 \x04 \x05 \x06 \x07 \x08 \x09 \x0a
|
Starting code units: \x00 \x01 \x02 \x03 \x04 \x05 \x06 \x07 \x08 \x09 \x0a
|
||||||
\x0b \x0c \x0d \x0e \x0f \x10 \x11 \x12 \x13 \x14 \x15 \x16 \x17 \x18 \x19
|
\x0b \x0c \x0d \x0e \x0f \x10 \x11 \x12 \x13 \x14 \x15 \x16 \x17 \x18 \x19
|
||||||
|
@ -1266,7 +1266,7 @@ Subject length lower bound = 1
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: caseless utf
|
Options: caseless utf
|
||||||
Starting code units: \xff
|
Starting code units: \xff
|
||||||
Subject length lower bound = 1
|
Subject length lower bound = 1
|
||||||
|
@ -1289,7 +1289,7 @@ No match
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: caseless utf
|
Options: caseless utf
|
||||||
Starting code units: Z z { | } ~ \x7f \x80 \x81 \x82 \x83 \x84 \x85 \x86
|
Starting code units: Z z { | } ~ \x7f \x80 \x81 \x82 \x83 \x84 \x85 \x86
|
||||||
\x87 \x88 \x89 \x8a \x8b \x8c \x8d \x8e \x8f \x90 \x91 \x92 \x93 \x94 \x95
|
\x87 \x88 \x89 \x8a \x8b \x8c \x8d \x8e \x8f \x90 \x91 \x92 \x93 \x94 \x95
|
||||||
|
@ -1335,7 +1335,7 @@ No match
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: caseless utf
|
Options: caseless utf
|
||||||
Starting code units: Z z { | } ~ \x7f \x80 \x81 \x82 \x83 \x84 \x85 \x86
|
Starting code units: Z z { | } ~ \x7f \x80 \x81 \x82 \x83 \x84 \x85 \x86
|
||||||
\x87 \x88 \x89 \x8a \x8b \x8c \x8d \x8e \x8f \x90 \x91 \x92 \x93 \x94 \x95
|
\x87 \x88 \x89 \x8a \x8b \x8c \x8d \x8e \x8f \x90 \x91 \x92 \x93 \x94 \x95
|
||||||
|
@ -1357,7 +1357,7 @@ Subject length lower bound = 1
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: caseless utf
|
Options: caseless utf
|
||||||
Starting code units: \xff
|
Starting code units: \xff
|
||||||
Last code unit = 'B' (caseless)
|
Last code unit = 'B' (caseless)
|
||||||
|
@ -1443,7 +1443,7 @@ Failed: error 191 at offset 0: PCRE2_EXTRA_ALLOW_SURROGATE_ESCAPES is not allowe
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Compile options: utf
|
Compile options: utf
|
||||||
Overall options: anchored utf
|
Overall options: anchored utf
|
||||||
Starting code units: \x00 \x01 \x02 \x03 \x04 \x05 \x06 \x07 \x08 \x09 \x0a
|
Starting code units: \x00 \x01 \x02 \x03 \x04 \x05 \x06 \x07 \x08 \x09 \x0a
|
||||||
|
|
|
@ -18,7 +18,7 @@
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: utf
|
Options: utf
|
||||||
First code unit = \x{ffff}
|
First code unit = \x{ffff}
|
||||||
Subject length lower bound = 1
|
Subject length lower bound = 1
|
||||||
|
@ -30,7 +30,7 @@ Subject length lower bound = 1
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: utf
|
Options: utf
|
||||||
First code unit = \x{10000}
|
First code unit = \x{10000}
|
||||||
Subject length lower bound = 1
|
Subject length lower bound = 1
|
||||||
|
@ -42,7 +42,7 @@ Subject length lower bound = 1
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: utf
|
Options: utf
|
||||||
First code unit = \x{100}
|
First code unit = \x{100}
|
||||||
Subject length lower bound = 1
|
Subject length lower bound = 1
|
||||||
|
@ -54,7 +54,7 @@ Subject length lower bound = 1
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: utf
|
Options: utf
|
||||||
First code unit = \x{1000}
|
First code unit = \x{1000}
|
||||||
Subject length lower bound = 1
|
Subject length lower bound = 1
|
||||||
|
@ -66,7 +66,7 @@ Subject length lower bound = 1
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: utf
|
Options: utf
|
||||||
First code unit = \x{10000}
|
First code unit = \x{10000}
|
||||||
Subject length lower bound = 1
|
Subject length lower bound = 1
|
||||||
|
@ -78,7 +78,7 @@ Subject length lower bound = 1
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: utf
|
Options: utf
|
||||||
First code unit = \x{100000}
|
First code unit = \x{100000}
|
||||||
Subject length lower bound = 1
|
Subject length lower bound = 1
|
||||||
|
@ -90,7 +90,7 @@ Subject length lower bound = 1
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: utf
|
Options: utf
|
||||||
First code unit = \x{10ffff}
|
First code unit = \x{10ffff}
|
||||||
Subject length lower bound = 1
|
Subject length lower bound = 1
|
||||||
|
@ -102,7 +102,7 @@ Subject length lower bound = 1
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: utf
|
Options: utf
|
||||||
First code unit = \xff
|
First code unit = \xff
|
||||||
Subject length lower bound = 1
|
Subject length lower bound = 1
|
||||||
|
@ -114,7 +114,7 @@ Subject length lower bound = 1
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: utf
|
Options: utf
|
||||||
First code unit = \x{100}
|
First code unit = \x{100}
|
||||||
Subject length lower bound = 1
|
Subject length lower bound = 1
|
||||||
|
@ -126,7 +126,7 @@ Subject length lower bound = 1
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: utf
|
Options: utf
|
||||||
First code unit = \x80
|
First code unit = \x80
|
||||||
Subject length lower bound = 1
|
Subject length lower bound = 1
|
||||||
|
@ -138,7 +138,7 @@ Subject length lower bound = 1
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: utf
|
Options: utf
|
||||||
First code unit = \xff
|
First code unit = \xff
|
||||||
Subject length lower bound = 1
|
Subject length lower bound = 1
|
||||||
|
@ -150,7 +150,7 @@ Subject length lower bound = 1
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: utf
|
Options: utf
|
||||||
First code unit = \x{d55c}
|
First code unit = \x{d55c}
|
||||||
Last code unit = \x{c5b4}
|
Last code unit = \x{c5b4}
|
||||||
|
@ -165,7 +165,7 @@ Subject length lower bound = 3
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: utf
|
Options: utf
|
||||||
First code unit = \x{65e5}
|
First code unit = \x{65e5}
|
||||||
Last code unit = \x{8a9e}
|
Last code unit = \x{8a9e}
|
||||||
|
@ -180,7 +180,7 @@ Subject length lower bound = 3
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: utf
|
Options: utf
|
||||||
First code unit = \x80
|
First code unit = \x80
|
||||||
Subject length lower bound = 1
|
Subject length lower bound = 1
|
||||||
|
@ -192,7 +192,7 @@ Subject length lower bound = 1
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: utf
|
Options: utf
|
||||||
First code unit = \x84
|
First code unit = \x84
|
||||||
Subject length lower bound = 1
|
Subject length lower bound = 1
|
||||||
|
@ -204,7 +204,7 @@ Subject length lower bound = 1
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: utf
|
Options: utf
|
||||||
First code unit = \x{104}
|
First code unit = \x{104}
|
||||||
Subject length lower bound = 1
|
Subject length lower bound = 1
|
||||||
|
@ -216,7 +216,7 @@ Subject length lower bound = 1
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: utf
|
Options: utf
|
||||||
First code unit = \x{861}
|
First code unit = \x{861}
|
||||||
Subject length lower bound = 1
|
Subject length lower bound = 1
|
||||||
|
@ -228,7 +228,7 @@ Subject length lower bound = 1
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: utf
|
Options: utf
|
||||||
First code unit = \x{212ab}
|
First code unit = \x{212ab}
|
||||||
Subject length lower bound = 1
|
Subject length lower bound = 1
|
||||||
|
@ -240,7 +240,7 @@ Subject length lower bound = 1
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: utf
|
Options: utf
|
||||||
Starting code units: \x00 \x01 \x02 \x03 \x04 \x05 \x06 \x07 \x08 \x09 \x0a
|
Starting code units: \x00 \x01 \x02 \x03 \x04 \x05 \x06 \x07 \x08 \x09 \x0a
|
||||||
\x0b \x0c \x0d \x0e \x0f \x10 \x11 \x12 \x13 \x14 \x15 \x16 \x17 \x18 \x19
|
\x0b \x0c \x0d \x0e \x0f \x10 \x11 \x12 \x13 \x14 \x15 \x16 \x17 \x18 \x19
|
||||||
|
@ -276,7 +276,7 @@ No match
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: utf
|
Options: utf
|
||||||
First code unit = \x{100}
|
First code unit = \x{100}
|
||||||
Last code unit = \x{100}
|
Last code unit = \x{100}
|
||||||
|
@ -295,7 +295,7 @@ Subject length lower bound = 3
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 1
|
Capture group count = 1
|
||||||
Options: utf
|
Options: utf
|
||||||
Starting code units: x \xff
|
Starting code units: x \xff
|
||||||
Subject length lower bound = 1
|
Subject length lower bound = 1
|
||||||
|
@ -312,7 +312,7 @@ Subject length lower bound = 1
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 1
|
Capture group count = 1
|
||||||
Options: utf
|
Options: utf
|
||||||
Starting code units: a x \xff
|
Starting code units: a x \xff
|
||||||
Subject length lower bound = 1
|
Subject length lower bound = 1
|
||||||
|
@ -329,7 +329,7 @@ Subject length lower bound = 1
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 1
|
Capture group count = 1
|
||||||
Options: utf
|
Options: utf
|
||||||
Starting code units: a x \xff
|
Starting code units: a x \xff
|
||||||
Subject length lower bound = 1
|
Subject length lower bound = 1
|
||||||
|
@ -347,7 +347,7 @@ Subject length lower bound = 1
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 1
|
Capture group count = 1
|
||||||
Options: utf
|
Options: utf
|
||||||
Starting code units: x \xff
|
Starting code units: x \xff
|
||||||
Subject length lower bound = 1
|
Subject length lower bound = 1
|
||||||
|
@ -359,7 +359,7 @@ Subject length lower bound = 1
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: utf
|
Options: utf
|
||||||
First code unit = \x{100}
|
First code unit = \x{100}
|
||||||
Subject length lower bound = 1
|
Subject length lower bound = 1
|
||||||
|
@ -372,7 +372,7 @@ Subject length lower bound = 1
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: utf
|
Options: utf
|
||||||
First code unit = 'a'
|
First code unit = 'a'
|
||||||
Last code unit = \x{100}
|
Last code unit = \x{100}
|
||||||
|
@ -386,7 +386,7 @@ Subject length lower bound = 2
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: utf
|
Options: utf
|
||||||
First code unit = 'a'
|
First code unit = 'a'
|
||||||
Last code unit = \x{101}
|
Last code unit = \x{101}
|
||||||
|
@ -399,7 +399,7 @@ Subject length lower bound = 3
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Subject length lower bound = 1
|
Subject length lower bound = 1
|
||||||
|
|
||||||
/[\x{100}]/IB,utf
|
/[\x{100}]/IB,utf
|
||||||
|
@ -409,7 +409,7 @@ Subject length lower bound = 1
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: utf
|
Options: utf
|
||||||
First code unit = \x{100}
|
First code unit = \x{100}
|
||||||
Subject length lower bound = 1
|
Subject length lower bound = 1
|
||||||
|
@ -427,7 +427,7 @@ Subject length lower bound = 1
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: utf
|
Options: utf
|
||||||
First code unit = \xff
|
First code unit = \xff
|
||||||
Subject length lower bound = 1
|
Subject length lower bound = 1
|
||||||
|
@ -441,7 +441,7 @@ Subject length lower bound = 1
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: utf
|
Options: utf
|
||||||
Subject length lower bound = 1
|
Subject length lower bound = 1
|
||||||
|
|
||||||
|
@ -456,14 +456,14 @@ Subject length lower bound = 1
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 1
|
Capture group count = 1
|
||||||
Options: utf
|
Options: utf
|
||||||
First code unit = \x{100}
|
First code unit = \x{100}
|
||||||
Last code unit = 'z'
|
Last code unit = 'z'
|
||||||
Subject length lower bound = 7
|
Subject length lower bound = 7
|
||||||
|
|
||||||
/\777/I,utf
|
/\777/I,utf
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: utf
|
Options: utf
|
||||||
First code unit = \x{1ff}
|
First code unit = \x{1ff}
|
||||||
Subject length lower bound = 1
|
Subject length lower bound = 1
|
||||||
|
@ -480,7 +480,7 @@ Subject length lower bound = 1
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: utf
|
Options: utf
|
||||||
First code unit = \x{100}
|
First code unit = \x{100}
|
||||||
Last code unit = \x{200}
|
Last code unit = \x{200}
|
||||||
|
@ -494,7 +494,7 @@ Subject length lower bound = 2
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: utf
|
Options: utf
|
||||||
First code unit = \x{100}
|
First code unit = \x{100}
|
||||||
Last code unit = 'X'
|
Last code unit = 'X'
|
||||||
|
@ -542,7 +542,7 @@ Failed: error 160 at offset 7: (*VERB) not recognized or malformed
|
||||||
abcd\x{11234}pqr
|
abcd\x{11234}pqr
|
||||||
|
|
||||||
/(*UTF)\x{11234}/I
|
/(*UTF)\x{11234}/I
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Compile options: <none>
|
Compile options: <none>
|
||||||
Overall options: utf
|
Overall options: utf
|
||||||
First code unit = \x{11234}
|
First code unit = \x{11234}
|
||||||
|
@ -562,7 +562,7 @@ Failed: error 160 at offset 5: (*VERB) not recognized or malformed
|
||||||
Failed: error 160 at offset 14: (*VERB) not recognized or malformed
|
Failed: error 160 at offset 14: (*VERB) not recognized or malformed
|
||||||
|
|
||||||
/(*CRLF)(*UTF32)(*BSR_UNICODE)a\Rb/I
|
/(*CRLF)(*UTF32)(*BSR_UNICODE)a\Rb/I
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Compile options: <none>
|
Compile options: <none>
|
||||||
Overall options: utf
|
Overall options: utf
|
||||||
\R matches any Unicode newline
|
\R matches any Unicode newline
|
||||||
|
@ -572,7 +572,7 @@ Last code unit = 'b'
|
||||||
Subject length lower bound = 3
|
Subject length lower bound = 3
|
||||||
|
|
||||||
/\h/I,utf
|
/\h/I,utf
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: utf
|
Options: utf
|
||||||
Starting code units: \x09 \x20 \xa0 \xff
|
Starting code units: \x09 \x20 \xa0 \xff
|
||||||
Subject length lower bound = 1
|
Subject length lower bound = 1
|
||||||
|
@ -596,7 +596,7 @@ Subject length lower bound = 1
|
||||||
0: \x{3000}
|
0: \x{3000}
|
||||||
|
|
||||||
/\v/I,utf
|
/\v/I,utf
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: utf
|
Options: utf
|
||||||
Starting code units: \x0a \x0b \x0c \x0d \x85 \xff
|
Starting code units: \x0a \x0b \x0c \x0d \x85 \xff
|
||||||
Subject length lower bound = 1
|
Subject length lower bound = 1
|
||||||
|
@ -614,7 +614,7 @@ Subject length lower bound = 1
|
||||||
0: \x{2028}
|
0: \x{2028}
|
||||||
|
|
||||||
/\h*A/I,utf
|
/\h*A/I,utf
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: utf
|
Options: utf
|
||||||
Starting code units: \x09 \x20 A \xa0 \xff
|
Starting code units: \x09 \x20 A \xa0 \xff
|
||||||
Last code unit = 'A'
|
Last code unit = 'A'
|
||||||
|
@ -625,7 +625,7 @@ Subject length lower bound = 1
|
||||||
0: \x{2000}A
|
0: \x{2000}A
|
||||||
|
|
||||||
/\R*A/I,bsr=unicode,utf
|
/\R*A/I,bsr=unicode,utf
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: utf
|
Options: utf
|
||||||
\R matches any Unicode newline
|
\R matches any Unicode newline
|
||||||
Starting code units: \x0a \x0b \x0c \x0d A \x85 \xff
|
Starting code units: \x0a \x0b \x0c \x0d A \x85 \xff
|
||||||
|
@ -637,21 +637,21 @@ Subject length lower bound = 1
|
||||||
0: \x{2028}A
|
0: \x{2028}A
|
||||||
|
|
||||||
/\v+A/I,utf
|
/\v+A/I,utf
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: utf
|
Options: utf
|
||||||
Starting code units: \x0a \x0b \x0c \x0d \x85 \xff
|
Starting code units: \x0a \x0b \x0c \x0d \x85 \xff
|
||||||
Last code unit = 'A'
|
Last code unit = 'A'
|
||||||
Subject length lower bound = 2
|
Subject length lower bound = 2
|
||||||
|
|
||||||
/\s?xxx\s/I,utf
|
/\s?xxx\s/I,utf
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: utf
|
Options: utf
|
||||||
Starting code units: \x09 \x0a \x0b \x0c \x0d \x20 x
|
Starting code units: \x09 \x0a \x0b \x0c \x0d \x20 x
|
||||||
Last code unit = 'x'
|
Last code unit = 'x'
|
||||||
Subject length lower bound = 4
|
Subject length lower bound = 4
|
||||||
|
|
||||||
/\sxxx\s/I,utf,tables=2
|
/\sxxx\s/I,utf,tables=2
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: utf
|
Options: utf
|
||||||
Starting code units: \x09 \x0a \x0b \x0c \x0d \x20 \x85 \xa0
|
Starting code units: \x09 \x0a \x0b \x0c \x0d \x20 \x85 \xa0
|
||||||
Last code unit = 'x'
|
Last code unit = 'x'
|
||||||
|
@ -662,7 +662,7 @@ Subject length lower bound = 5
|
||||||
0: \x{a0}xxx\x{85}
|
0: \x{a0}xxx\x{85}
|
||||||
|
|
||||||
/\S \S/I,utf,tables=2
|
/\S \S/I,utf,tables=2
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: utf
|
Options: utf
|
||||||
Starting code units: \x00 \x01 \x02 \x03 \x04 \x05 \x06 \x07 \x08 \x0e \x0f
|
Starting code units: \x00 \x01 \x02 \x03 \x04 \x05 \x06 \x07 \x08 \x0e \x0f
|
||||||
\x10 \x11 \x12 \x13 \x14 \x15 \x16 \x17 \x18 \x19 \x1a \x1b \x1c \x1d \x1e
|
\x10 \x11 \x12 \x13 \x14 \x15 \x16 \x17 \x18 \x19 \x1a \x1b \x1c \x1d \x1e
|
||||||
|
@ -702,25 +702,25 @@ Failed: error -33: bad offset value
|
||||||
Failed: error -33: bad offset value
|
Failed: error -33: bad offset value
|
||||||
|
|
||||||
/\x{1234}+/Ii,utf
|
/\x{1234}+/Ii,utf
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: caseless utf
|
Options: caseless utf
|
||||||
First code unit = \x{1234}
|
First code unit = \x{1234}
|
||||||
Subject length lower bound = 1
|
Subject length lower bound = 1
|
||||||
|
|
||||||
/\x{1234}+?/Ii,utf
|
/\x{1234}+?/Ii,utf
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: caseless utf
|
Options: caseless utf
|
||||||
First code unit = \x{1234}
|
First code unit = \x{1234}
|
||||||
Subject length lower bound = 1
|
Subject length lower bound = 1
|
||||||
|
|
||||||
/\x{1234}++/Ii,utf
|
/\x{1234}++/Ii,utf
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: caseless utf
|
Options: caseless utf
|
||||||
First code unit = \x{1234}
|
First code unit = \x{1234}
|
||||||
Subject length lower bound = 1
|
Subject length lower bound = 1
|
||||||
|
|
||||||
/\x{1234}{2}/Ii,utf
|
/\x{1234}{2}/Ii,utf
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: caseless utf
|
Options: caseless utf
|
||||||
First code unit = \x{1234}
|
First code unit = \x{1234}
|
||||||
Last code unit = \x{1234}
|
Last code unit = \x{1234}
|
||||||
|
@ -733,7 +733,7 @@ Subject length lower bound = 2
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: utf
|
Options: utf
|
||||||
Subject length lower bound = 1
|
Subject length lower bound = 1
|
||||||
|
|
||||||
|
@ -745,14 +745,14 @@ Subject length lower bound = 1
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: utf
|
Options: utf
|
||||||
First code unit = 'X'
|
First code unit = 'X'
|
||||||
Last code unit = \x{200}
|
Last code unit = \x{200}
|
||||||
Subject length lower bound = 2
|
Subject length lower bound = 2
|
||||||
|
|
||||||
/\R/I,utf
|
/\R/I,utf
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: utf
|
Options: utf
|
||||||
Starting code units: \x0a \x0b \x0c \x0d \x85 \xff
|
Starting code units: \x0a \x0b \x0c \x0d \x85 \xff
|
||||||
Subject length lower bound = 1
|
Subject length lower bound = 1
|
||||||
|
@ -930,7 +930,7 @@ Failed: error 174 at offset 0: using UTF is disabled by the application
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: caseless utf
|
Options: caseless utf
|
||||||
First code unit = 'A' (caseless)
|
First code unit = 'A' (caseless)
|
||||||
Last code unit = \x{1fb0} (caseless)
|
Last code unit = \x{1fb0} (caseless)
|
||||||
|
@ -943,7 +943,7 @@ Subject length lower bound = 5
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: utf
|
Options: utf
|
||||||
First code unit = 'A'
|
First code unit = 'A'
|
||||||
Last code unit = \x{1fb0}
|
Last code unit = \x{1fb0}
|
||||||
|
@ -956,7 +956,7 @@ Subject length lower bound = 5
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: utf
|
Options: utf
|
||||||
First code unit = 'A'
|
First code unit = 'A'
|
||||||
Last code unit = \x{1fb0}
|
Last code unit = \x{1fb0}
|
||||||
|
@ -969,14 +969,14 @@ Subject length lower bound = 3
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: caseless utf
|
Options: caseless utf
|
||||||
First code unit = 'A' (caseless)
|
First code unit = 'A' (caseless)
|
||||||
Last code unit = \x{1fb0} (caseless)
|
Last code unit = \x{1fb0} (caseless)
|
||||||
Subject length lower bound = 3
|
Subject length lower bound = 3
|
||||||
|
|
||||||
/\x{401}\x{420}\x{421}\x{422}\x{423}\x{424}\x{425}\x{426}\x{427}\x{428}\x{429}\x{42a}\x{42b}\x{42c}\x{42d}\x{42e}\x{42f}/Ii,utf
|
/\x{401}\x{420}\x{421}\x{422}\x{423}\x{424}\x{425}\x{426}\x{427}\x{428}\x{429}\x{42a}\x{42b}\x{42c}\x{42d}\x{42e}\x{42f}/Ii,utf
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: caseless utf
|
Options: caseless utf
|
||||||
First code unit = \x{401} (caseless)
|
First code unit = \x{401} (caseless)
|
||||||
Last code unit = \x{42f} (caseless)
|
Last code unit = \x{42f} (caseless)
|
||||||
|
@ -1011,7 +1011,7 @@ Subject length lower bound = 17
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
|
|
||||||
/\x{212a}+/Ii,utf
|
/\x{212a}+/Ii,utf
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: caseless utf
|
Options: caseless utf
|
||||||
Starting code units: K k \xff
|
Starting code units: K k \xff
|
||||||
Subject length lower bound = 1
|
Subject length lower bound = 1
|
||||||
|
@ -1019,7 +1019,7 @@ Subject length lower bound = 1
|
||||||
0: KKkk\x{212a}
|
0: KKkk\x{212a}
|
||||||
|
|
||||||
/s+/Ii,utf
|
/s+/Ii,utf
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: caseless utf
|
Options: caseless utf
|
||||||
Starting code units: S s \xff
|
Starting code units: S s \xff
|
||||||
Subject length lower bound = 1
|
Subject length lower bound = 1
|
||||||
|
@ -1042,7 +1042,7 @@ Failed: error 134 at offset 10: character code point value in \x{} or \o{} is to
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: utf
|
Options: utf
|
||||||
Starting code units: A \xff
|
Starting code units: A \xff
|
||||||
Last code unit = 'A'
|
Last code unit = 'A'
|
||||||
|
@ -1059,7 +1059,7 @@ Subject length lower bound = 1
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: utf
|
Options: utf
|
||||||
Starting code units: 0 1 2 3 4 5 6 7 8 9 \xff
|
Starting code units: 0 1 2 3 4 5 6 7 8 9 \xff
|
||||||
Subject length lower bound = 1
|
Subject length lower bound = 1
|
||||||
|
@ -1071,7 +1071,7 @@ Subject length lower bound = 1
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: utf
|
Options: utf
|
||||||
Starting code units: Z \xff
|
Starting code units: Z \xff
|
||||||
Subject length lower bound = 1
|
Subject length lower bound = 1
|
||||||
|
@ -1089,7 +1089,7 @@ Subject length lower bound = 1
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: utf
|
Options: utf
|
||||||
Starting code units: z { | } ~ \x7f \x80 \x81 \x82 \x83 \x84 \x85 \x86 \x87
|
Starting code units: z { | } ~ \x7f \x80 \x81 \x82 \x83 \x84 \x85 \x86 \x87
|
||||||
\x88 \x89 \x8a \x8b \x8c \x8d \x8e \x8f \x90 \x91 \x92 \x93 \x94 \x95 \x96
|
\x88 \x89 \x8a \x8b \x8c \x8d \x8e \x8f \x90 \x91 \x92 \x93 \x94 \x95 \x96
|
||||||
|
@ -1109,7 +1109,7 @@ Subject length lower bound = 1
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: utf
|
Options: utf
|
||||||
Starting code units: - ] a d z \xff
|
Starting code units: - ] a d z \xff
|
||||||
Subject length lower bound = 1
|
Subject length lower bound = 1
|
||||||
|
@ -1130,7 +1130,7 @@ Subject length lower bound = 1
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 1
|
Capture group count = 1
|
||||||
Options: utf
|
Options: utf
|
||||||
Starting code units: a b \xff
|
Starting code units: a b \xff
|
||||||
Last code unit = 'z'
|
Last code unit = 'z'
|
||||||
|
@ -1144,7 +1144,7 @@ Subject length lower bound = 7
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: utf
|
Options: utf
|
||||||
Starting code units: \x09 \x0a \x0b \x0c \x0d \x20 \xff
|
Starting code units: \x09 \x0a \x0b \x0c \x0d \x20 \xff
|
||||||
Subject length lower bound = 1
|
Subject length lower bound = 1
|
||||||
|
@ -1157,7 +1157,7 @@ Subject length lower bound = 1
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: utf
|
Options: utf
|
||||||
Starting code units: 0 1 2 3 4 5 6 7 8 9 \xff
|
Starting code units: 0 1 2 3 4 5 6 7 8 9 \xff
|
||||||
Subject length lower bound = 1
|
Subject length lower bound = 1
|
||||||
|
@ -1170,7 +1170,7 @@ Subject length lower bound = 1
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: utf
|
Options: utf
|
||||||
Starting code units: 0 1 2 3 4 5 6 7 8 9 A B C D E F G H I J K L M N O P
|
Starting code units: 0 1 2 3 4 5 6 7 8 9 A B C D E F G H I J K L M N O P
|
||||||
Q R S T U V W X Y Z _ a b c d e f g h i j k l m n o p q r s t u v w x y z
|
Q R S T U V W X Y Z _ a b c d e f g h i j k l m n o p q r s t u v w x y z
|
||||||
|
@ -1185,7 +1185,7 @@ Subject length lower bound = 1
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: utf
|
Options: utf
|
||||||
Starting code units: \x00 \x01 \x02 \x03 \x04 \x05 \x06 \x07 \x08 \x09 \x0a
|
Starting code units: \x00 \x01 \x02 \x03 \x04 \x05 \x06 \x07 \x08 \x09 \x0a
|
||||||
\x0b \x0c \x0d \x0e \x0f \x10 \x11 \x12 \x13 \x14 \x15 \x16 \x17 \x18 \x19
|
\x0b \x0c \x0d \x0e \x0f \x10 \x11 \x12 \x13 \x14 \x15 \x16 \x17 \x18 \x19
|
||||||
|
@ -1211,7 +1211,7 @@ Subject length lower bound = 1
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: utf
|
Options: utf
|
||||||
Starting code units: \x00 \x01 \x02 \x03 \x04 \x05 \x06 \x07 \x08 \x0e \x0f
|
Starting code units: \x00 \x01 \x02 \x03 \x04 \x05 \x06 \x07 \x08 \x0e \x0f
|
||||||
\x10 \x11 \x12 \x13 \x14 \x15 \x16 \x17 \x18 \x19 \x1a \x1b \x1c \x1d \x1e
|
\x10 \x11 \x12 \x13 \x14 \x15 \x16 \x17 \x18 \x19 \x1a \x1b \x1c \x1d \x1e
|
||||||
|
@ -1237,7 +1237,7 @@ Subject length lower bound = 1
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: utf
|
Options: utf
|
||||||
Starting code units: \x00 \x01 \x02 \x03 \x04 \x05 \x06 \x07 \x08 \x09 \x0a
|
Starting code units: \x00 \x01 \x02 \x03 \x04 \x05 \x06 \x07 \x08 \x09 \x0a
|
||||||
\x0b \x0c \x0d \x0e \x0f \x10 \x11 \x12 \x13 \x14 \x15 \x16 \x17 \x18 \x19
|
\x0b \x0c \x0d \x0e \x0f \x10 \x11 \x12 \x13 \x14 \x15 \x16 \x17 \x18 \x19
|
||||||
|
@ -1260,7 +1260,7 @@ Subject length lower bound = 1
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: caseless utf
|
Options: caseless utf
|
||||||
Starting code units: \xff
|
Starting code units: \xff
|
||||||
Subject length lower bound = 1
|
Subject length lower bound = 1
|
||||||
|
@ -1283,7 +1283,7 @@ No match
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: caseless utf
|
Options: caseless utf
|
||||||
Starting code units: Z z { | } ~ \x7f \x80 \x81 \x82 \x83 \x84 \x85 \x86
|
Starting code units: Z z { | } ~ \x7f \x80 \x81 \x82 \x83 \x84 \x85 \x86
|
||||||
\x87 \x88 \x89 \x8a \x8b \x8c \x8d \x8e \x8f \x90 \x91 \x92 \x93 \x94 \x95
|
\x87 \x88 \x89 \x8a \x8b \x8c \x8d \x8e \x8f \x90 \x91 \x92 \x93 \x94 \x95
|
||||||
|
@ -1329,7 +1329,7 @@ No match
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: caseless utf
|
Options: caseless utf
|
||||||
Starting code units: Z z { | } ~ \x7f \x80 \x81 \x82 \x83 \x84 \x85 \x86
|
Starting code units: Z z { | } ~ \x7f \x80 \x81 \x82 \x83 \x84 \x85 \x86
|
||||||
\x87 \x88 \x89 \x8a \x8b \x8c \x8d \x8e \x8f \x90 \x91 \x92 \x93 \x94 \x95
|
\x87 \x88 \x89 \x8a \x8b \x8c \x8d \x8e \x8f \x90 \x91 \x92 \x93 \x94 \x95
|
||||||
|
@ -1351,7 +1351,7 @@ Subject length lower bound = 1
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: caseless utf
|
Options: caseless utf
|
||||||
Starting code units: \xff
|
Starting code units: \xff
|
||||||
Last code unit = 'B' (caseless)
|
Last code unit = 'B' (caseless)
|
||||||
|
@ -1418,7 +1418,7 @@ No match
|
||||||
# errors in 16-bit mode.
|
# errors in 16-bit mode.
|
||||||
|
|
||||||
/\x{d800}/I,utf,allow_surrogate_escapes
|
/\x{d800}/I,utf,allow_surrogate_escapes
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: utf
|
Options: utf
|
||||||
Extra options: allow_surrogate_escapes
|
Extra options: allow_surrogate_escapes
|
||||||
First code unit = \x{d800}
|
First code unit = \x{d800}
|
||||||
|
@ -1440,7 +1440,7 @@ Subject length lower bound = 1
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Compile options: utf
|
Compile options: utf
|
||||||
Overall options: anchored utf
|
Overall options: anchored utf
|
||||||
Starting code units: \x00 \x01 \x02 \x03 \x04 \x05 \x06 \x07 \x08 \x09 \x0a
|
Starting code units: \x00 \x01 \x02 \x03 \x04 \x05 \x06 \x07 \x08 \x09 \x0a
|
||||||
|
|
|
@ -7,7 +7,7 @@
|
||||||
# (2) Other tests that must not be run with JIT.
|
# (2) Other tests that must not be run with JIT.
|
||||||
|
|
||||||
/(a+)*zz/I
|
/(a+)*zz/I
|
||||||
Capturing subpattern count = 1
|
Capture group count = 1
|
||||||
Starting code units: a z
|
Starting code units: a z
|
||||||
Last code unit = 'z'
|
Last code unit = 'z'
|
||||||
Subject length lower bound = 2
|
Subject length lower bound = 2
|
||||||
|
@ -24,7 +24,7 @@ Minimum depth limit = 30
|
||||||
No match
|
No match
|
||||||
|
|
||||||
!((?:\s|//.*\\n|/[*](?:\\n|.)*?[*]/)*)!I
|
!((?:\s|//.*\\n|/[*](?:\\n|.)*?[*]/)*)!I
|
||||||
Capturing subpattern count = 1
|
Capture group count = 1
|
||||||
May match empty string
|
May match empty string
|
||||||
Subject length lower bound = 0
|
Subject length lower bound = 0
|
||||||
/* this is a C style comment */\=find_limits
|
/* this is a C style comment */\=find_limits
|
||||||
|
@ -117,7 +117,7 @@ Failed: error 160 at offset 17: (*VERB) not recognized or malformed
|
||||||
Failed: error 160 at offset 24: (*VERB) not recognized or malformed
|
Failed: error 160 at offset 24: (*VERB) not recognized or malformed
|
||||||
|
|
||||||
/(*LIMIT_DEPTH=4294967280)abc/I
|
/(*LIMIT_DEPTH=4294967280)abc/I
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Depth limit = 4294967280
|
Depth limit = 4294967280
|
||||||
First code unit = 'a'
|
First code unit = 'a'
|
||||||
Last code unit = 'c'
|
Last code unit = 'c'
|
||||||
|
@ -137,7 +137,7 @@ Failed: error -47: match limit exceeded
|
||||||
Failed: error -53: matching depth limit exceeded
|
Failed: error -53: matching depth limit exceeded
|
||||||
|
|
||||||
/(*LIMIT_MATCH=3000)(a+)*zz/I
|
/(*LIMIT_MATCH=3000)(a+)*zz/I
|
||||||
Capturing subpattern count = 1
|
Capture group count = 1
|
||||||
Match limit = 3000
|
Match limit = 3000
|
||||||
Starting code units: a z
|
Starting code units: a z
|
||||||
Last code unit = 'z'
|
Last code unit = 'z'
|
||||||
|
@ -150,7 +150,7 @@ Failed: error -47: match limit exceeded
|
||||||
Failed: error -47: match limit exceeded
|
Failed: error -47: match limit exceeded
|
||||||
|
|
||||||
/(*LIMIT_MATCH=60000)(*LIMIT_MATCH=3000)(a+)*zz/I
|
/(*LIMIT_MATCH=60000)(*LIMIT_MATCH=3000)(a+)*zz/I
|
||||||
Capturing subpattern count = 1
|
Capture group count = 1
|
||||||
Match limit = 3000
|
Match limit = 3000
|
||||||
Starting code units: a z
|
Starting code units: a z
|
||||||
Last code unit = 'z'
|
Last code unit = 'z'
|
||||||
|
@ -160,7 +160,7 @@ Subject length lower bound = 2
|
||||||
Failed: error -47: match limit exceeded
|
Failed: error -47: match limit exceeded
|
||||||
|
|
||||||
/(*LIMIT_MATCH=60000)(a+)*zz/I
|
/(*LIMIT_MATCH=60000)(a+)*zz/I
|
||||||
Capturing subpattern count = 1
|
Capture group count = 1
|
||||||
Match limit = 60000
|
Match limit = 60000
|
||||||
Starting code units: a z
|
Starting code units: a z
|
||||||
Last code unit = 'z'
|
Last code unit = 'z'
|
||||||
|
@ -173,7 +173,7 @@ No match
|
||||||
Failed: error -47: match limit exceeded
|
Failed: error -47: match limit exceeded
|
||||||
|
|
||||||
/(*LIMIT_DEPTH=10)(a+)*zz/I
|
/(*LIMIT_DEPTH=10)(a+)*zz/I
|
||||||
Capturing subpattern count = 1
|
Capture group count = 1
|
||||||
Depth limit = 10
|
Depth limit = 10
|
||||||
Starting code units: a z
|
Starting code units: a z
|
||||||
Last code unit = 'z'
|
Last code unit = 'z'
|
||||||
|
@ -186,7 +186,7 @@ Failed: error -53: matching depth limit exceeded
|
||||||
Failed: error -53: matching depth limit exceeded
|
Failed: error -53: matching depth limit exceeded
|
||||||
|
|
||||||
/(*LIMIT_DEPTH=10)(*LIMIT_DEPTH=1000)(a+)*zz/I
|
/(*LIMIT_DEPTH=10)(*LIMIT_DEPTH=1000)(a+)*zz/I
|
||||||
Capturing subpattern count = 1
|
Capture group count = 1
|
||||||
Depth limit = 1000
|
Depth limit = 1000
|
||||||
Starting code units: a z
|
Starting code units: a z
|
||||||
Last code unit = 'z'
|
Last code unit = 'z'
|
||||||
|
@ -196,7 +196,7 @@ Subject length lower bound = 2
|
||||||
No match
|
No match
|
||||||
|
|
||||||
/(*LIMIT_DEPTH=1000)(a+)*zz/I
|
/(*LIMIT_DEPTH=1000)(a+)*zz/I
|
||||||
Capturing subpattern count = 1
|
Capture group count = 1
|
||||||
Depth limit = 1000
|
Depth limit = 1000
|
||||||
Starting code units: a z
|
Starting code units: a z
|
||||||
Last code unit = 'z'
|
Last code unit = 'z'
|
||||||
|
@ -269,14 +269,14 @@ Failed: error -52: nested recursion at the same subject position
|
||||||
# when JIT is used.
|
# when JIT is used.
|
||||||
|
|
||||||
/(?R)/I
|
/(?R)/I
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
May match empty string
|
May match empty string
|
||||||
Subject length lower bound = 0
|
Subject length lower bound = 0
|
||||||
abcd
|
abcd
|
||||||
Failed: error -52: nested recursion at the same subject position
|
Failed: error -52: nested recursion at the same subject position
|
||||||
|
|
||||||
/(a|(?R))/I
|
/(a|(?R))/I
|
||||||
Capturing subpattern count = 1
|
Capture group count = 1
|
||||||
May match empty string
|
May match empty string
|
||||||
Subject length lower bound = 0
|
Subject length lower bound = 0
|
||||||
abcd
|
abcd
|
||||||
|
@ -286,7 +286,7 @@ Subject length lower bound = 0
|
||||||
Failed: error -52: nested recursion at the same subject position
|
Failed: error -52: nested recursion at the same subject position
|
||||||
|
|
||||||
/(ab|(bc|(de|(?R))))/I
|
/(ab|(bc|(de|(?R))))/I
|
||||||
Capturing subpattern count = 3
|
Capture group count = 3
|
||||||
May match empty string
|
May match empty string
|
||||||
Subject length lower bound = 0
|
Subject length lower bound = 0
|
||||||
abcd
|
abcd
|
||||||
|
@ -296,7 +296,7 @@ Subject length lower bound = 0
|
||||||
Failed: error -52: nested recursion at the same subject position
|
Failed: error -52: nested recursion at the same subject position
|
||||||
|
|
||||||
/(ab|(bc|(de|(?1))))/I
|
/(ab|(bc|(de|(?1))))/I
|
||||||
Capturing subpattern count = 3
|
Capture group count = 3
|
||||||
May match empty string
|
May match empty string
|
||||||
Subject length lower bound = 0
|
Subject length lower bound = 0
|
||||||
abcd
|
abcd
|
||||||
|
@ -306,7 +306,7 @@ Subject length lower bound = 0
|
||||||
Failed: error -52: nested recursion at the same subject position
|
Failed: error -52: nested recursion at the same subject position
|
||||||
|
|
||||||
/x(ab|(bc|(de|(?1)x)x)x)/I
|
/x(ab|(bc|(de|(?1)x)x)x)/I
|
||||||
Capturing subpattern count = 3
|
Capture group count = 3
|
||||||
First code unit = 'x'
|
First code unit = 'x'
|
||||||
Subject length lower bound = 3
|
Subject length lower bound = 3
|
||||||
xab123
|
xab123
|
||||||
|
@ -352,7 +352,7 @@ Failed: error -52: nested recursion at the same subject position
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 1
|
Capture group count = 1
|
||||||
Subject length lower bound = 1
|
Subject length lower bound = 1
|
||||||
abcd
|
abcd
|
||||||
Failed: error -52: nested recursion at the same subject position
|
Failed: error -52: nested recursion at the same subject position
|
||||||
|
@ -367,7 +367,7 @@ Failed: error -52: nested recursion at the same subject position
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: no_auto_possess
|
Options: no_auto_possess
|
||||||
Starting code units: 0 1 2 3 4 5 6 7 8 9 A B C D E F G H I J K L M N O P
|
Starting code units: 0 1 2 3 4 5 6 7 8 9 A B C D E F G H I J K L M N O P
|
||||||
Q R S T U V W X Y Z _ a b c d e f g h i j k l m n o p q r s t u v w x y z
|
Q R S T U V W X Y Z _ a b c d e f g h i j k l m n o p q r s t u v w x y z
|
||||||
|
@ -390,7 +390,7 @@ No match
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Compile options: <none>
|
Compile options: <none>
|
||||||
Overall options: no_auto_possess
|
Overall options: no_auto_possess
|
||||||
Starting code units: 0 1 2 3 4 5 6 7 8 9 A B C D E F G H I J K L M N O P
|
Starting code units: 0 1 2 3 4 5 6 7 8 9 A B C D E F G H I J K L M N O P
|
||||||
|
|
|
@ -3,14 +3,14 @@
|
||||||
# are different without JIT.
|
# are different without JIT.
|
||||||
|
|
||||||
/abc/I,jit,jitverify
|
/abc/I,jit,jitverify
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
First code unit = 'a'
|
First code unit = 'a'
|
||||||
Last code unit = 'c'
|
Last code unit = 'c'
|
||||||
Subject length lower bound = 3
|
Subject length lower bound = 3
|
||||||
JIT support is not available in this version of PCRE2
|
JIT support is not available in this version of PCRE2
|
||||||
|
|
||||||
/a*/I
|
/a*/I
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
May match empty string
|
May match empty string
|
||||||
Subject length lower bound = 0
|
Subject length lower bound = 0
|
||||||
|
|
||||||
|
|
File diff suppressed because one or more lines are too long
File diff suppressed because it is too large
|
@ -32,9 +32,9 @@
|
||||||
#load testsaved2
|
#load testsaved2
|
||||||
|
|
||||||
#pop info
|
#pop info
|
||||||
Capturing subpattern count = 2
|
Capture group count = 2
|
||||||
Max back reference = 2
|
Max back reference = 2
|
||||||
Named capturing subpatterns:
|
Named capture groups:
|
||||||
n 1
|
n 1
|
||||||
n 2
|
n 2
|
||||||
Options: dupnames
|
Options: dupnames
|
||||||
|
@ -66,8 +66,8 @@ No match, mark = A
|
||||||
4: A
|
4: A
|
||||||
|
|
||||||
#pop info
|
#pop info
|
||||||
Capturing subpattern count = 4
|
Capture group count = 4
|
||||||
Named capturing subpatterns:
|
Named capture groups:
|
||||||
ADDR 2
|
ADDR 2
|
||||||
ADDRESS_PAT 4
|
ADDRESS_PAT 4
|
||||||
NAME 1
|
NAME 1
|
||||||
|
|
|
@ -79,7 +79,7 @@
|
||||||
Failed: error 183 at offset 4: using \C is disabled by the application
|
Failed: error 183 at offset 4: using \C is disabled by the application
|
||||||
|
|
||||||
/ab\Cde/info
|
/ab\Cde/info
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Contains \C
|
Contains \C
|
||||||
First code unit = 'a'
|
First code unit = 'a'
|
||||||
Last code unit = 'e'
|
Last code unit = 'e'
|
||||||
|
|
|
@ -4,7 +4,7 @@
|
||||||
# in some widths and not in others.
|
# in some widths and not in others.
|
||||||
|
|
||||||
/ab\Cde/utf,info
|
/ab\Cde/utf,info
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Contains \C
|
Contains \C
|
||||||
Options: utf
|
Options: utf
|
||||||
First code unit = 'a'
|
First code unit = 'a'
|
||||||
|
|
|
@ -4,7 +4,7 @@
|
||||||
# in some widths and not in others.
|
# in some widths and not in others.
|
||||||
|
|
||||||
/ab\Cde/utf,info
|
/ab\Cde/utf,info
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Contains \C
|
Contains \C
|
||||||
Options: utf
|
Options: utf
|
||||||
First code unit = 'a'
|
First code unit = 'a'
|
||||||
|
|
|
@ -4,7 +4,7 @@
|
||||||
# in some widths and not in others.
|
# in some widths and not in others.
|
||||||
|
|
||||||
/ab\Cde/utf,info
|
/ab\Cde/utf,info
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Contains \C
|
Contains \C
|
||||||
Options: utf
|
Options: utf
|
||||||
First code unit = 'a'
|
First code unit = 'a'
|
||||||
|
|
|
@ -78,13 +78,13 @@ No match
|
||||||
0: école
|
0: école
|
||||||
|
|
||||||
/\w/I
|
/\w/I
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Starting code units: 0 1 2 3 4 5 6 7 8 9 A B C D E F G H I J K L M N O P
|
Starting code units: 0 1 2 3 4 5 6 7 8 9 A B C D E F G H I J K L M N O P
|
||||||
Q R S T U V W X Y Z _ a b c d e f g h i j k l m n o p q r s t u v w x y z
|
Q R S T U V W X Y Z _ a b c d e f g h i j k l m n o p q r s t u v w x y z
|
||||||
Subject length lower bound = 1
|
Subject length lower bound = 1
|
||||||
|
|
||||||
/\w/I,locale=fr_FR
|
/\w/I,locale=fr_FR
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Starting code units: 0 1 2 3 4 5 6 7 8 9 A B C D E F G H I J K L M N O P
|
Starting code units: 0 1 2 3 4 5 6 7 8 9 A B C D E F G H I J K L M N O P
|
||||||
Q R S T U V W X Y Z _ a b c d e f g h i j k l m n o p q r s t u v w x y z
|
Q R S T U V W X Y Z _ a b c d e f g h i j k l m n o p q r s t u v w x y z
|
||||||
ª µ º À Á Â Ã Ä Å Æ Ç È É Ê Ë Ì Í Î Ï Ð Ñ Ò Ó Ô Õ Ö Ø Ù Ú Û Ü Ý Þ ß à á â
|
ª µ º À Á Â Ã Ä Å Æ Ç È É Ê Ë Ì Í Î Ï Ð Ñ Ò Ó Ô Õ Ö Ø Ù Ú Û Ü Ý Þ ß à á â
|
||||||
|
@ -153,7 +153,7 @@ No match
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Starting code units: A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
|
Starting code units: A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
|
||||||
a b c d e f g h i j k l m n o p q r s t u v w x y z ª µ º À Á Â Ã Ä Å Æ Ç
|
a b c d e f g h i j k l m n o p q r s t u v w x y z ª µ º À Á Â Ã Ä Å Æ Ç
|
||||||
È É Ê Ë Ì Í Î Ï Ð Ñ Ò Ó Ô Õ Ö Ø Ù Ú Û Ü Ý Þ ß à á â ã ä å æ ç è é ê ë ì í
|
È É Ê Ë Ì Í Î Ï Ð Ñ Ò Ó Ô Õ Ö Ø Ù Ú Û Ü Ý Þ ß à á â ã ä å æ ç è é ê ë ì í
|
||||||
|
|
|
@ -78,13 +78,13 @@ No match
|
||||||
0: école
|
0: école
|
||||||
|
|
||||||
/\w/I
|
/\w/I
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Starting code units: 0 1 2 3 4 5 6 7 8 9 A B C D E F G H I J K L M N O P
|
Starting code units: 0 1 2 3 4 5 6 7 8 9 A B C D E F G H I J K L M N O P
|
||||||
Q R S T U V W X Y Z _ a b c d e f g h i j k l m n o p q r s t u v w x y z
|
Q R S T U V W X Y Z _ a b c d e f g h i j k l m n o p q r s t u v w x y z
|
||||||
Subject length lower bound = 1
|
Subject length lower bound = 1
|
||||||
|
|
||||||
/\w/I,locale=fr_FR
|
/\w/I,locale=fr_FR
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Starting code units: 0 1 2 3 4 5 6 7 8 9 A B C D E F G H I J K L M N O P
|
Starting code units: 0 1 2 3 4 5 6 7 8 9 A B C D E F G H I J K L M N O P
|
||||||
Q R S T U V W X Y Z _ a b c d e f g h i j k l m n o p q r s t u v w x y z
|
Q R S T U V W X Y Z _ a b c d e f g h i j k l m n o p q r s t u v w x y z
|
||||||
ª µ º À Á Â Ã Ä Å Æ Ç È É Ê Ë Ì Í Î Ï Ð Ñ Ò Ó Ô Õ Ö Ø Ù Ú Û Ü Ý Þ ß à á â
|
ª µ º À Á Â Ã Ä Å Æ Ç È É Ê Ë Ì Í Î Ï Ð Ñ Ò Ó Ô Õ Ö Ø Ù Ú Û Ü Ý Þ ß à á â
|
||||||
|
@ -153,7 +153,7 @@ No match
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Starting code units: A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
|
Starting code units: A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
|
||||||
a b c d e f g h i j k l m n o p q r s t u v w x y z ª µ º À Á Â Ã Ä Å Æ Ç
|
a b c d e f g h i j k l m n o p q r s t u v w x y z ª µ º À Á Â Ã Ä Å Æ Ç
|
||||||
È É Ê Ë Ì Í Î Ï Ð Ñ Ò Ó Ô Õ Ö Ø Ù Ú Û Ü Ý Þ ß à á â ã ä å æ ç è é ê ë ì í
|
È É Ê Ë Ì Í Î Ï Ð Ñ Ò Ó Ô Õ Ö Ø Ù Ú Û Ü Ý Þ ß à á â ã ä å æ ç è é ê ë ì í
|
||||||
|
|
|
@ -78,13 +78,13 @@ No match
|
||||||
0: école
|
0: école
|
||||||
|
|
||||||
/\w/I
|
/\w/I
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Starting code units: 0 1 2 3 4 5 6 7 8 9 A B C D E F G H I J K L M N O P
|
Starting code units: 0 1 2 3 4 5 6 7 8 9 A B C D E F G H I J K L M N O P
|
||||||
Q R S T U V W X Y Z _ a b c d e f g h i j k l m n o p q r s t u v w x y z
|
Q R S T U V W X Y Z _ a b c d e f g h i j k l m n o p q r s t u v w x y z
|
||||||
Subject length lower bound = 1
|
Subject length lower bound = 1
|
||||||
|
|
||||||
/\w/I,locale=fr_FR
|
/\w/I,locale=fr_FR
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Starting code units: 0 1 2 3 4 5 6 7 8 9 A B C D E F G H I J K L M N O P
|
Starting code units: 0 1 2 3 4 5 6 7 8 9 A B C D E F G H I J K L M N O P
|
||||||
Q R S T U V W X Y Z _ a b c d e f g h i j k l m n o p q r s t u v w x y z
|
Q R S T U V W X Y Z _ a b c d e f g h i j k l m n o p q r s t u v w x y z
|
||||||
ª µ º À Á Â Ã Ä Å Æ Ç È É Ê Ë Ì Í Î Ï Ð Ñ Ò Ó Ô Õ Ö Ø Ù Ú Û Ü Ý Þ ß à á â
|
ª µ º À Á Â Ã Ä Å Æ Ç È É Ê Ë Ì Í Î Ï Ð Ñ Ò Ó Ô Õ Ö Ø Ù Ú Û Ü Ý Þ ß à á â
|
||||||
|
@ -153,7 +153,7 @@ No match
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Starting code units: A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
|
Starting code units: A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
|
||||||
a b c d e f g h i j k l m n o p q r s t u v w x y z ª µ º À Á Â Ã Ä Å Æ Ç
|
a b c d e f g h i j k l m n o p q r s t u v w x y z ª µ º À Á Â Ã Ä Å Æ Ç
|
||||||
È É Ê Ë Ì Í Î Ï Ð Ñ Ò Ó Ô Õ Ö Ø Ù Ú Û Ü Ý Þ ß à á â ã ä å æ ç è é ê ë ì í
|
È É Ê Ë Ì Í Î Ï Ð Ñ Ò Ó Ô Õ Ö Ø Ù Ú Û Ü Ý Þ ß à á â ã ä å æ ç è é ê ë ì í
|
||||||
|
|
|
@ -3975,4 +3975,41 @@ No match
|
||||||
|
|
||||||
# -------
|
# -------
|
||||||
|
|
||||||
|
# Test group names containing non-ASCII letters and digits
|
||||||
|
|
||||||
|
/(?'ABáC'...)\g{ABáC}/utf
|
||||||
|
abcabcdefg
|
||||||
|
0: abcabc
|
||||||
|
1: abc
|
||||||
|
|
||||||
|
/(?'XʰABC'...)/utf
|
||||||
|
xyzpq
|
||||||
|
0: xyz
|
||||||
|
1: xyz
|
||||||
|
|
||||||
|
/(?'XאABC'...)/utf
|
||||||
|
12345
|
||||||
|
0: 123
|
||||||
|
1: 123
|
||||||
|
|
||||||
|
/(?'XᾈABC'...)/utf
|
||||||
|
%^&*(...
|
||||||
|
0: %^&
|
||||||
|
1: %^&
|
||||||
|
|
||||||
|
/(?'𐨐ABC'...)/utf
|
||||||
|
abcde
|
||||||
|
0: abc
|
||||||
|
1: abc
|
||||||
|
|
||||||
|
/^(?'אABC'...)(?&אABC)(?P=אABC)/utf
|
||||||
|
123123123456
|
||||||
|
0: 123123123
|
||||||
|
1: 123
|
||||||
|
|
||||||
|
/^(?'אABC'...)(?&אABC)/utf
|
||||||
|
123123123456
|
||||||
|
0: 123123
|
||||||
|
1: 123
|
||||||
|
|
||||||
# End of testinput4
|
# End of testinput4
|
||||||
|
|
|
@ -147,7 +147,7 @@ Failed: error 173 at offset 9: disallowed Unicode code point (>= 0xd800 && <= 0x
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: utf
|
Options: utf
|
||||||
First code unit = 'A'
|
First code unit = 'A'
|
||||||
Last code unit = '.'
|
Last code unit = '.'
|
||||||
|
@ -164,7 +164,7 @@ Subject length lower bound = 4
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: utf
|
Options: utf
|
||||||
Last code unit = 'X'
|
Last code unit = 'X'
|
||||||
Subject length lower bound = 4
|
Subject length lower bound = 4
|
||||||
|
@ -179,7 +179,7 @@ Subject length lower bound = 4
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: utf
|
Options: utf
|
||||||
Subject length lower bound = 3
|
Subject length lower bound = 3
|
||||||
\x{212ab}\x{212ab}\x{212ab}\x{861}
|
\x{212ab}\x{212ab}\x{212ab}\x{861}
|
||||||
|
@ -193,7 +193,7 @@ Subject length lower bound = 3
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Compile options: utf
|
Compile options: utf
|
||||||
Overall options: anchored utf
|
Overall options: anchored utf
|
||||||
Starting code units: a b
|
Starting code units: a b
|
||||||
|
@ -238,7 +238,7 @@ No match
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
May match empty string
|
May match empty string
|
||||||
Options: utf
|
Options: utf
|
||||||
Subject length lower bound = 0
|
Subject length lower bound = 0
|
||||||
|
@ -251,7 +251,7 @@ Subject length lower bound = 0
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: utf
|
Options: utf
|
||||||
First code unit = 'a'
|
First code unit = 'a'
|
||||||
Subject length lower bound = 1
|
Subject length lower bound = 1
|
||||||
|
@ -264,7 +264,7 @@ Subject length lower bound = 1
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: utf
|
Options: utf
|
||||||
First code unit = 'a'
|
First code unit = 'a'
|
||||||
Last code unit = 'b'
|
Last code unit = 'b'
|
||||||
|
@ -291,7 +291,7 @@ No match
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
First code unit = \xff
|
First code unit = \xff
|
||||||
Subject length lower bound = 1
|
Subject length lower bound = 1
|
||||||
>\xff<
|
>\xff<
|
||||||
|
@ -304,7 +304,7 @@ Subject length lower bound = 1
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Subject length lower bound = 1
|
Subject length lower bound = 1
|
||||||
|
|
||||||
/[Ä-Ü]/utf
|
/[Ä-Ü]/utf
|
||||||
|
@ -343,7 +343,7 @@ Subject length lower bound = 1
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 1
|
Capture group count = 1
|
||||||
Options: utf
|
Options: utf
|
||||||
Last code unit = 'z'
|
Last code unit = 'z'
|
||||||
Subject length lower bound = 7
|
Subject length lower bound = 7
|
||||||
|
@ -363,7 +363,7 @@ Subject length lower bound = 7
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 2
|
Capture group count = 2
|
||||||
May match empty string
|
May match empty string
|
||||||
Options: utf
|
Options: utf
|
||||||
Subject length lower bound = 0
|
Subject length lower bound = 0
|
||||||
|
@ -394,7 +394,7 @@ Subject length lower bound = 0
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 2
|
Capture group count = 2
|
||||||
May match empty string
|
May match empty string
|
||||||
Options: utf
|
Options: utf
|
||||||
Subject length lower bound = 0
|
Subject length lower bound = 0
|
||||||
|
@ -414,7 +414,7 @@ Subject length lower bound = 0
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 2
|
Capture group count = 2
|
||||||
May match empty string
|
May match empty string
|
||||||
Options: utf
|
Options: utf
|
||||||
Subject length lower bound = 0
|
Subject length lower bound = 0
|
||||||
|
@ -445,7 +445,7 @@ Subject length lower bound = 0
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 2
|
Capture group count = 2
|
||||||
May match empty string
|
May match empty string
|
||||||
Options: utf
|
Options: utf
|
||||||
Subject length lower bound = 0
|
Subject length lower bound = 0
|
||||||
|
@ -471,7 +471,7 @@ Subject length lower bound = 0
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Compile options: no_start_optimize utf
|
Compile options: no_start_optimize utf
|
||||||
Overall options: anchored no_start_optimize utf
|
Overall options: anchored no_start_optimize utf
|
||||||
Subject length lower bound = 0
|
Subject length lower bound = 0
|
||||||
|
@ -713,7 +713,7 @@ No match
|
||||||
0: \x{1ec5}
|
0: \x{1ec5}
|
||||||
|
|
||||||
/a\Rb/I,bsr=anycrlf,utf
|
/a\Rb/I,bsr=anycrlf,utf
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: utf
|
Options: utf
|
||||||
\R matches CR, LF, or CRLF
|
\R matches CR, LF, or CRLF
|
||||||
First code unit = 'a'
|
First code unit = 'a'
|
||||||
|
@ -732,7 +732,7 @@ No match
|
||||||
No match
|
No match
|
||||||
|
|
||||||
/a\Rb/I,bsr=unicode,utf
|
/a\Rb/I,bsr=unicode,utf
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: utf
|
Options: utf
|
||||||
\R matches any Unicode newline
|
\R matches any Unicode newline
|
||||||
First code unit = 'a'
|
First code unit = 'a'
|
||||||
|
@ -750,7 +750,7 @@ Subject length lower bound = 3
|
||||||
0: a\x{0b}b
|
0: a\x{0b}b
|
||||||
|
|
||||||
/a\R?b/I,bsr=anycrlf,utf
|
/a\R?b/I,bsr=anycrlf,utf
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: utf
|
Options: utf
|
||||||
\R matches CR, LF, or CRLF
|
\R matches CR, LF, or CRLF
|
||||||
First code unit = 'a'
|
First code unit = 'a'
|
||||||
|
@ -769,7 +769,7 @@ No match
|
||||||
No match
|
No match
|
||||||
|
|
||||||
/a\R?b/I,bsr=unicode,utf
|
/a\R?b/I,bsr=unicode,utf
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: utf
|
Options: utf
|
||||||
\R matches any Unicode newline
|
\R matches any Unicode newline
|
||||||
First code unit = 'a'
|
First code unit = 'a'
|
||||||
|
@ -1408,22 +1408,22 @@ Failed: error 168 at offset 3: \c must be followed by a printable ASCII characte
|
||||||
2: \x{0d}
|
2: \x{0d}
|
||||||
|
|
||||||
/[^\x{1234}]+/Ii,utf
|
/[^\x{1234}]+/Ii,utf
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: caseless utf
|
Options: caseless utf
|
||||||
Subject length lower bound = 1
|
Subject length lower bound = 1
|
||||||
|
|
||||||
/[^\x{1234}]+?/Ii,utf
|
/[^\x{1234}]+?/Ii,utf
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: caseless utf
|
Options: caseless utf
|
||||||
Subject length lower bound = 1
|
Subject length lower bound = 1
|
||||||
|
|
||||||
/[^\x{1234}]++/Ii,utf
|
/[^\x{1234}]++/Ii,utf
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: caseless utf
|
Options: caseless utf
|
||||||
Subject length lower bound = 1
|
Subject length lower bound = 1
|
||||||
|
|
||||||
/[^\x{1234}]{2}/Ii,utf
|
/[^\x{1234}]{2}/Ii,utf
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: caseless utf
|
Options: caseless utf
|
||||||
Subject length lower bound = 2
|
Subject length lower bound = 2
|
||||||
|
|
||||||
|
@ -1703,7 +1703,7 @@ Partial match: \x{0d}\x{0d}
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
|
|
||||||
/(?<=\x{1234}\x{1234})\bxy/I,utf
|
/(?<=\x{1234}\x{1234})\bxy/I,utf
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Max lookbehind = 2
|
Max lookbehind = 2
|
||||||
Options: utf
|
Options: utf
|
||||||
First code unit = 'x'
|
First code unit = 'x'
|
||||||
|
@ -1768,7 +1768,7 @@ Failed: error 173 at offset 6: disallowed Unicode code point (>= 0xd800 && <= 0x
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Subject length lower bound = 1
|
Subject length lower bound = 1
|
||||||
|
|
||||||
/[\p{^L}]/IB
|
/[\p{^L}]/IB
|
||||||
|
@ -1778,7 +1778,7 @@ Subject length lower bound = 1
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Subject length lower bound = 1
|
Subject length lower bound = 1
|
||||||
|
|
||||||
/[\P{L}]/IB
|
/[\P{L}]/IB
|
||||||
|
@ -1788,7 +1788,7 @@ Subject length lower bound = 1
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Subject length lower bound = 1
|
Subject length lower bound = 1
|
||||||
|
|
||||||
/[\P{^L}]/IB
|
/[\P{^L}]/IB
|
||||||
|
@ -1798,7 +1798,7 @@ Subject length lower bound = 1
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Subject length lower bound = 1
|
Subject length lower bound = 1
|
||||||
|
|
||||||
/[abc\p{L}\x{0660}]/IB,utf
|
/[abc\p{L}\x{0660}]/IB,utf
|
||||||
|
@ -1808,7 +1808,7 @@ Subject length lower bound = 1
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: utf
|
Options: utf
|
||||||
Subject length lower bound = 1
|
Subject length lower bound = 1
|
||||||
|
|
||||||
|
@ -1819,7 +1819,7 @@ Subject length lower bound = 1
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: utf
|
Options: utf
|
||||||
Subject length lower bound = 1
|
Subject length lower bound = 1
|
||||||
1234
|
1234
|
||||||
|
@ -1832,7 +1832,7 @@ Subject length lower bound = 1
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: utf
|
Options: utf
|
||||||
Subject length lower bound = 1
|
Subject length lower bound = 1
|
||||||
1234
|
1234
|
||||||
|
@ -2998,7 +2998,7 @@ Partial match: AA
|
||||||
Ket
|
Ket
|
||||||
End
|
End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: caseless utf
|
Options: caseless utf
|
||||||
First code unit = 'A' (caseless)
|
First code unit = 'A' (caseless)
|
||||||
Last code unit = 'B' (caseless)
|
Last code unit = 'B' (caseless)
|
||||||
|
@ -3914,7 +3914,7 @@ No match
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
|
|
||||||
/^s?c/Iim,utf
|
/^s?c/Iim,utf
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: caseless multiline utf
|
Options: caseless multiline utf
|
||||||
First code unit at start or follows newline
|
First code unit at start or follows newline
|
||||||
Last code unit = 'c' (caseless)
|
Last code unit = 'c' (caseless)
|
||||||
|
@ -4889,4 +4889,31 @@ MK: ABC
|
||||||
|
|
||||||
# -------
|
# -------
|
||||||
|
|
||||||
|
# Test reference and errors in non-ASCII characters in group names
|
||||||
|
|
||||||
|
/(?'𑠅ABC'...)/I,utf
|
||||||
|
Capture group count = 1
|
||||||
|
Named capture groups:
|
||||||
|
𑠅ABC 1
|
||||||
|
Options: utf
|
||||||
|
Subject length lower bound = 3
|
||||||
|
abcde\=copy=𑠅ABC
|
||||||
|
0: abc
|
||||||
|
1: abc
|
||||||
|
C abc (3) 𑠅ABC (group 1)
|
||||||
|
|
||||||
|
# Bad ones
|
||||||
|
|
||||||
|
/(?'AB၌C'...)\g{AB၌C}/utf
|
||||||
|
Failed: error 142 at offset 5: syntax error in subpattern name (missing terminator?)
|
||||||
|
|
||||||
|
/(?'٠ABC'...)/utf
|
||||||
|
Failed: error 144 at offset 3: subpattern name must start with a non-digit
|
||||||
|
|
||||||
|
/(?'²ABC'...)/utf
|
||||||
|
Failed: error 162 at offset 3: subpattern name expected
|
||||||
|
|
||||||
|
/(?'X²ABC'...)/utf
|
||||||
|
Failed: error 142 at offset 4: syntax error in subpattern name (missing terminator?)
|
||||||
|
|
||||||
# End of testinput5
|
# End of testinput5
|
||||||
|
|
|
@ -5978,7 +5978,7 @@ Partial match: 123
|
||||||
0: Content-Type:xxxyyyz
|
0: Content-Type:xxxyyyz
|
||||||
|
|
||||||
/^abc/Im,newline=lf
|
/^abc/Im,newline=lf
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: multiline
|
Options: multiline
|
||||||
Forced newline is LF
|
Forced newline is LF
|
||||||
First code unit at start or follows newline
|
First code unit at start or follows newline
|
||||||
|
@ -6001,7 +6001,7 @@ No match
|
||||||
No match
|
No match
|
||||||
|
|
||||||
/^abc/Im,newline=crlf
|
/^abc/Im,newline=crlf
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: multiline
|
Options: multiline
|
||||||
Forced newline is CRLF
|
Forced newline is CRLF
|
||||||
First code unit at start or follows newline
|
First code unit at start or follows newline
|
||||||
|
@ -6016,7 +6016,7 @@ No match
|
||||||
No match
|
No match
|
||||||
|
|
||||||
/^abc/Im,newline=cr
|
/^abc/Im,newline=cr
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: multiline
|
Options: multiline
|
||||||
Forced newline is CR
|
Forced newline is CR
|
||||||
First code unit at start or follows newline
|
First code unit at start or follows newline
|
||||||
|
@ -6031,7 +6031,7 @@ No match
|
||||||
No match
|
No match
|
||||||
|
|
||||||
/.*/I,newline=lf
|
/.*/I,newline=lf
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
May match empty string
|
May match empty string
|
||||||
Forced newline is LF
|
Forced newline is LF
|
||||||
First code unit at start or follows newline
|
First code unit at start or follows newline
|
||||||
|
@ -6044,7 +6044,7 @@ Subject length lower bound = 0
|
||||||
0: abc\x0d
|
0: abc\x0d
|
||||||
|
|
||||||
/.*/I,newline=cr
|
/.*/I,newline=cr
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
May match empty string
|
May match empty string
|
||||||
Forced newline is CR
|
Forced newline is CR
|
||||||
First code unit at start or follows newline
|
First code unit at start or follows newline
|
||||||
|
@ -6057,7 +6057,7 @@ Subject length lower bound = 0
|
||||||
0: abc
|
0: abc
|
||||||
|
|
||||||
/.*/I,newline=crlf
|
/.*/I,newline=crlf
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
May match empty string
|
May match empty string
|
||||||
Forced newline is CRLF
|
Forced newline is CRLF
|
||||||
First code unit at start or follows newline
|
First code unit at start or follows newline
|
||||||
|
@ -6070,7 +6070,7 @@ Subject length lower bound = 0
|
||||||
0: abc
|
0: abc
|
||||||
|
|
||||||
/\w+(.)(.)?def/Is
|
/\w+(.)(.)?def/Is
|
||||||
Capturing subpattern count = 2
|
Capture group count = 2
|
||||||
Options: dotall
|
Options: dotall
|
||||||
Starting code units: 0 1 2 3 4 5 6 7 8 9 A B C D E F G H I J K L M N O P
|
Starting code units: 0 1 2 3 4 5 6 7 8 9 A B C D E F G H I J K L M N O P
|
||||||
Q R S T U V W X Y Z _ a b c d e f g h i j k l m n o p q r s t u v w x y z
|
Q R S T U V W X Y Z _ a b c d e f g h i j k l m n o p q r s t u v w x y z
|
||||||
|
@ -6447,7 +6447,7 @@ No match
|
||||||
0: \x0aA
|
0: \x0aA
|
||||||
|
|
||||||
/a\Rb/I,bsr=anycrlf
|
/a\Rb/I,bsr=anycrlf
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
\R matches CR, LF, or CRLF
|
\R matches CR, LF, or CRLF
|
||||||
First code unit = 'a'
|
First code unit = 'a'
|
||||||
Last code unit = 'b'
|
Last code unit = 'b'
|
||||||
|
@ -6465,7 +6465,7 @@ No match
|
||||||
No match
|
No match
|
||||||
|
|
||||||
/a\Rb/I,bsr=unicode
|
/a\Rb/I,bsr=unicode
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
\R matches any Unicode newline
|
\R matches any Unicode newline
|
||||||
First code unit = 'a'
|
First code unit = 'a'
|
||||||
Last code unit = 'b'
|
Last code unit = 'b'
|
||||||
|
@ -6482,7 +6482,7 @@ Subject length lower bound = 3
|
||||||
0: a\x0bb
|
0: a\x0bb
|
||||||
|
|
||||||
/a\R?b/I,bsr=anycrlf
|
/a\R?b/I,bsr=anycrlf
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
\R matches CR, LF, or CRLF
|
\R matches CR, LF, or CRLF
|
||||||
First code unit = 'a'
|
First code unit = 'a'
|
||||||
Last code unit = 'b'
|
Last code unit = 'b'
|
||||||
|
@ -6500,7 +6500,7 @@ No match
|
||||||
No match
|
No match
|
||||||
|
|
||||||
/a\R?b/I,bsr=unicode
|
/a\R?b/I,bsr=unicode
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
\R matches any Unicode newline
|
\R matches any Unicode newline
|
||||||
First code unit = 'a'
|
First code unit = 'a'
|
||||||
Last code unit = 'b'
|
Last code unit = 'b'
|
||||||
|
@ -6517,7 +6517,7 @@ Subject length lower bound = 2
|
||||||
0: a\x0bb
|
0: a\x0bb
|
||||||
|
|
||||||
/a\R{2,4}b/I,bsr=anycrlf
|
/a\R{2,4}b/I,bsr=anycrlf
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
\R matches CR, LF, or CRLF
|
\R matches CR, LF, or CRLF
|
||||||
First code unit = 'a'
|
First code unit = 'a'
|
||||||
Last code unit = 'b'
|
Last code unit = 'b'
|
||||||
|
@ -6535,7 +6535,7 @@ No match
|
||||||
No match
|
No match
|
||||||
|
|
||||||
/a\R{2,4}b/I,bsr=unicode
|
/a\R{2,4}b/I,bsr=unicode
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
\R matches any Unicode newline
|
\R matches any Unicode newline
|
||||||
First code unit = 'a'
|
First code unit = 'a'
|
||||||
Last code unit = 'b'
|
Last code unit = 'b'
|
||||||
|
@ -6831,7 +6831,7 @@ Partial match: +ab
|
||||||
0+ CBA
|
0+ CBA
|
||||||
|
|
||||||
/(abc|def|xyz)/I
|
/(abc|def|xyz)/I
|
||||||
Capturing subpattern count = 1
|
Capture group count = 1
|
||||||
Starting code units: a d x
|
Starting code units: a d x
|
||||||
Subject length lower bound = 3
|
Subject length lower bound = 3
|
||||||
terhjk;abcdaadsfe
|
terhjk;abcdaadsfe
|
||||||
|
@ -6843,7 +6843,7 @@ Subject length lower bound = 3
|
||||||
No match
|
No match
|
||||||
|
|
||||||
/(abc|def|xyz)/I,no_start_optimize
|
/(abc|def|xyz)/I,no_start_optimize
|
||||||
Capturing subpattern count = 1
|
Capture group count = 1
|
||||||
Options: no_start_optimize
|
Options: no_start_optimize
|
||||||
Subject length lower bound = 0
|
Subject length lower bound = 0
|
||||||
terhjk;abcdaadsfe
|
terhjk;abcdaadsfe
|
||||||
|
|
|
@ -1030,7 +1030,7 @@ No match
|
||||||
No match
|
No match
|
||||||
|
|
||||||
/a\Rb/I,bsr=anycrlf,utf
|
/a\Rb/I,bsr=anycrlf,utf
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: utf
|
Options: utf
|
||||||
\R matches CR, LF, or CRLF
|
\R matches CR, LF, or CRLF
|
||||||
First code unit = 'a'
|
First code unit = 'a'
|
||||||
|
@ -1049,7 +1049,7 @@ No match
|
||||||
No match
|
No match
|
||||||
|
|
||||||
/a\Rb/I,bsr=unicode,utf
|
/a\Rb/I,bsr=unicode,utf
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: utf
|
Options: utf
|
||||||
\R matches any Unicode newline
|
\R matches any Unicode newline
|
||||||
First code unit = 'a'
|
First code unit = 'a'
|
||||||
|
@ -1067,7 +1067,7 @@ Subject length lower bound = 3
|
||||||
0: a\x{0b}b
|
0: a\x{0b}b
|
||||||
|
|
||||||
/a\R?b/I,bsr=anycrlf,utf
|
/a\R?b/I,bsr=anycrlf,utf
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: utf
|
Options: utf
|
||||||
\R matches CR, LF, or CRLF
|
\R matches CR, LF, or CRLF
|
||||||
First code unit = 'a'
|
First code unit = 'a'
|
||||||
|
@ -1086,7 +1086,7 @@ No match
|
||||||
No match
|
No match
|
||||||
|
|
||||||
/a\R?b/I,bsr=unicode,utf
|
/a\R?b/I,bsr=unicode,utf
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: utf
|
Options: utf
|
||||||
\R matches any Unicode newline
|
\R matches any Unicode newline
|
||||||
First code unit = 'a'
|
First code unit = 'a'
|
||||||
|
|
|
@ -67,7 +67,7 @@ Memory allocation (code space): 10
|
||||||
2 2 Ket
|
2 2 Ket
|
||||||
4 End
|
4 End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
May match empty string
|
May match empty string
|
||||||
Options: extended
|
Options: extended
|
||||||
Subject length lower bound = 0
|
Subject length lower bound = 0
|
||||||
|
@ -80,7 +80,7 @@ Memory allocation (code space): 14
|
||||||
4 4 Ket
|
4 4 Ket
|
||||||
6 End
|
6 End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: extended
|
Options: extended
|
||||||
First code unit = 'a'
|
First code unit = 'a'
|
||||||
Subject length lower bound = 1
|
Subject length lower bound = 1
|
||||||
|
@ -376,7 +376,7 @@ Memory allocation (code space): 26
|
||||||
10 10 Ket
|
10 10 Ket
|
||||||
12 End
|
12 End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: utf
|
Options: utf
|
||||||
First code unit = 'A'
|
First code unit = 'A'
|
||||||
Last code unit = '.'
|
Last code unit = '.'
|
||||||
|
@ -390,7 +390,7 @@ Memory allocation (code space): 22
|
||||||
8 8 Ket
|
8 8 Ket
|
||||||
10 End
|
10 End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: utf
|
Options: utf
|
||||||
First code unit = \x{d55c}
|
First code unit = \x{d55c}
|
||||||
Last code unit = \x{c5b4}
|
Last code unit = \x{c5b4}
|
||||||
|
@ -404,7 +404,7 @@ Memory allocation (code space): 22
|
||||||
8 8 Ket
|
8 8 Ket
|
||||||
10 End
|
10 End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: utf
|
Options: utf
|
||||||
First code unit = \x{65e5}
|
First code unit = \x{65e5}
|
||||||
Last code unit = \x{8a9e}
|
Last code unit = \x{8a9e}
|
||||||
|
@ -904,7 +904,7 @@ Failed: error 186 at offset 12820: regular expression is too complicated
|
||||||
79 79 Ket
|
79 79 Ket
|
||||||
81 End
|
81 End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 1
|
Capture group count = 1
|
||||||
Max back reference = 1
|
Max back reference = 1
|
||||||
May match empty string
|
May match empty string
|
||||||
Subject length lower bound = 0
|
Subject length lower bound = 0
|
||||||
|
@ -938,7 +938,7 @@ Subject length lower bound = 0
|
||||||
43 43 Ket
|
43 43 Ket
|
||||||
45 End
|
45 End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 1
|
Capture group count = 1
|
||||||
Max back reference = 1
|
Max back reference = 1
|
||||||
May match empty string
|
May match empty string
|
||||||
Subject length lower bound = 0
|
Subject length lower bound = 0
|
||||||
|
@ -1011,7 +1011,7 @@ No match
|
||||||
133 133 Ket
|
133 133 Ket
|
||||||
135 End
|
135 End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 10
|
Capture group count = 10
|
||||||
May match empty string
|
May match empty string
|
||||||
Subject length lower bound = 0
|
Subject length lower bound = 0
|
||||||
|
|
||||||
|
|
|
@ -67,7 +67,7 @@ Memory allocation (code space): 14
|
||||||
3 3 Ket
|
3 3 Ket
|
||||||
6 End
|
6 End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
May match empty string
|
May match empty string
|
||||||
Options: extended
|
Options: extended
|
||||||
Subject length lower bound = 0
|
Subject length lower bound = 0
|
||||||
|
@ -80,7 +80,7 @@ Memory allocation (code space): 18
|
||||||
5 5 Ket
|
5 5 Ket
|
||||||
8 End
|
8 End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: extended
|
Options: extended
|
||||||
First code unit = 'a'
|
First code unit = 'a'
|
||||||
Subject length lower bound = 1
|
Subject length lower bound = 1
|
||||||
|
@ -376,7 +376,7 @@ Memory allocation (code space): 30
|
||||||
11 11 Ket
|
11 11 Ket
|
||||||
14 End
|
14 End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: utf
|
Options: utf
|
||||||
First code unit = 'A'
|
First code unit = 'A'
|
||||||
Last code unit = '.'
|
Last code unit = '.'
|
||||||
|
@ -390,7 +390,7 @@ Memory allocation (code space): 26
|
||||||
9 9 Ket
|
9 9 Ket
|
||||||
12 End
|
12 End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: utf
|
Options: utf
|
||||||
First code unit = \x{d55c}
|
First code unit = \x{d55c}
|
||||||
Last code unit = \x{c5b4}
|
Last code unit = \x{c5b4}
|
||||||
|
@ -404,7 +404,7 @@ Memory allocation (code space): 26
|
||||||
9 9 Ket
|
9 9 Ket
|
||||||
12 End
|
12 End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: utf
|
Options: utf
|
||||||
First code unit = \x{65e5}
|
First code unit = \x{65e5}
|
||||||
Last code unit = \x{8a9e}
|
Last code unit = \x{8a9e}
|
||||||
|
@ -903,7 +903,7 @@ Failed: error 186 at offset 12820: regular expression is too complicated
|
||||||
110 110 Ket
|
110 110 Ket
|
||||||
113 End
|
113 End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 1
|
Capture group count = 1
|
||||||
Max back reference = 1
|
Max back reference = 1
|
||||||
May match empty string
|
May match empty string
|
||||||
Subject length lower bound = 0
|
Subject length lower bound = 0
|
||||||
|
@ -937,7 +937,7 @@ Subject length lower bound = 0
|
||||||
58 58 Ket
|
58 58 Ket
|
||||||
61 End
|
61 End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 1
|
Capture group count = 1
|
||||||
Max back reference = 1
|
Max back reference = 1
|
||||||
May match empty string
|
May match empty string
|
||||||
Subject length lower bound = 0
|
Subject length lower bound = 0
|
||||||
|
@ -1010,7 +1010,7 @@ No match
|
||||||
194 194 Ket
|
194 194 Ket
|
||||||
197 End
|
197 End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 10
|
Capture group count = 10
|
||||||
May match empty string
|
May match empty string
|
||||||
Subject length lower bound = 0
|
Subject length lower bound = 0
|
||||||
|
|
||||||
|
|
|
@ -67,7 +67,7 @@ Memory allocation (code space): 14
|
||||||
3 3 Ket
|
3 3 Ket
|
||||||
6 End
|
6 End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
May match empty string
|
May match empty string
|
||||||
Options: extended
|
Options: extended
|
||||||
Subject length lower bound = 0
|
Subject length lower bound = 0
|
||||||
|
@ -80,7 +80,7 @@ Memory allocation (code space): 18
|
||||||
5 5 Ket
|
5 5 Ket
|
||||||
8 End
|
8 End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: extended
|
Options: extended
|
||||||
First code unit = 'a'
|
First code unit = 'a'
|
||||||
Subject length lower bound = 1
|
Subject length lower bound = 1
|
||||||
|
@ -376,7 +376,7 @@ Memory allocation (code space): 30
|
||||||
11 11 Ket
|
11 11 Ket
|
||||||
14 End
|
14 End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: utf
|
Options: utf
|
||||||
First code unit = 'A'
|
First code unit = 'A'
|
||||||
Last code unit = '.'
|
Last code unit = '.'
|
||||||
|
@ -390,7 +390,7 @@ Memory allocation (code space): 26
|
||||||
9 9 Ket
|
9 9 Ket
|
||||||
12 End
|
12 End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: utf
|
Options: utf
|
||||||
First code unit = \x{d55c}
|
First code unit = \x{d55c}
|
||||||
Last code unit = \x{c5b4}
|
Last code unit = \x{c5b4}
|
||||||
|
@ -404,7 +404,7 @@ Memory allocation (code space): 26
|
||||||
9 9 Ket
|
9 9 Ket
|
||||||
12 End
|
12 End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: utf
|
Options: utf
|
||||||
First code unit = \x{65e5}
|
First code unit = \x{65e5}
|
||||||
Last code unit = \x{8a9e}
|
Last code unit = \x{8a9e}
|
||||||
|
@ -903,7 +903,7 @@ Failed: error 186 at offset 12820: regular expression is too complicated
|
||||||
110 110 Ket
|
110 110 Ket
|
||||||
113 End
|
113 End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 1
|
Capture group count = 1
|
||||||
Max back reference = 1
|
Max back reference = 1
|
||||||
May match empty string
|
May match empty string
|
||||||
Subject length lower bound = 0
|
Subject length lower bound = 0
|
||||||
|
@ -937,7 +937,7 @@ Subject length lower bound = 0
|
||||||
58 58 Ket
|
58 58 Ket
|
||||||
61 End
|
61 End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 1
|
Capture group count = 1
|
||||||
Max back reference = 1
|
Max back reference = 1
|
||||||
May match empty string
|
May match empty string
|
||||||
Subject length lower bound = 0
|
Subject length lower bound = 0
|
||||||
|
@ -1010,7 +1010,7 @@ No match
|
||||||
194 194 Ket
|
194 194 Ket
|
||||||
197 End
|
197 End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 10
|
Capture group count = 10
|
||||||
May match empty string
|
May match empty string
|
||||||
Subject length lower bound = 0
|
Subject length lower bound = 0
|
||||||
|
|
||||||
|
|
|
@ -67,7 +67,7 @@ Memory allocation (code space): 20
|
||||||
2 2 Ket
|
2 2 Ket
|
||||||
4 End
|
4 End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
May match empty string
|
May match empty string
|
||||||
Options: extended
|
Options: extended
|
||||||
Subject length lower bound = 0
|
Subject length lower bound = 0
|
||||||
|
@ -80,7 +80,7 @@ Memory allocation (code space): 28
|
||||||
4 4 Ket
|
4 4 Ket
|
||||||
6 End
|
6 End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: extended
|
Options: extended
|
||||||
First code unit = 'a'
|
First code unit = 'a'
|
||||||
Subject length lower bound = 1
|
Subject length lower bound = 1
|
||||||
|
@ -376,7 +376,7 @@ Memory allocation (code space): 52
|
||||||
10 10 Ket
|
10 10 Ket
|
||||||
12 End
|
12 End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: utf
|
Options: utf
|
||||||
First code unit = 'A'
|
First code unit = 'A'
|
||||||
Last code unit = '.'
|
Last code unit = '.'
|
||||||
|
@ -390,7 +390,7 @@ Memory allocation (code space): 44
|
||||||
8 8 Ket
|
8 8 Ket
|
||||||
10 End
|
10 End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: utf
|
Options: utf
|
||||||
First code unit = \x{d55c}
|
First code unit = \x{d55c}
|
||||||
Last code unit = \x{c5b4}
|
Last code unit = \x{c5b4}
|
||||||
|
@ -404,7 +404,7 @@ Memory allocation (code space): 44
|
||||||
8 8 Ket
|
8 8 Ket
|
||||||
10 End
|
10 End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: utf
|
Options: utf
|
||||||
First code unit = \x{65e5}
|
First code unit = \x{65e5}
|
||||||
Last code unit = \x{8a9e}
|
Last code unit = \x{8a9e}
|
||||||
|
@ -903,7 +903,7 @@ Failed: error 186 at offset 12820: regular expression is too complicated
|
||||||
79 79 Ket
|
79 79 Ket
|
||||||
81 End
|
81 End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 1
|
Capture group count = 1
|
||||||
Max back reference = 1
|
Max back reference = 1
|
||||||
May match empty string
|
May match empty string
|
||||||
Subject length lower bound = 0
|
Subject length lower bound = 0
|
||||||
|
@ -937,7 +937,7 @@ Subject length lower bound = 0
|
||||||
43 43 Ket
|
43 43 Ket
|
||||||
45 End
|
45 End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 1
|
Capture group count = 1
|
||||||
Max back reference = 1
|
Max back reference = 1
|
||||||
May match empty string
|
May match empty string
|
||||||
Subject length lower bound = 0
|
Subject length lower bound = 0
|
||||||
|
@ -1010,7 +1010,7 @@ No match
|
||||||
133 133 Ket
|
133 133 Ket
|
||||||
135 End
|
135 End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 10
|
Capture group count = 10
|
||||||
May match empty string
|
May match empty string
|
||||||
Subject length lower bound = 0
|
Subject length lower bound = 0
|
||||||
|
|
||||||
|
|
|
@ -67,7 +67,7 @@ Memory allocation (code space): 20
|
||||||
2 2 Ket
|
2 2 Ket
|
||||||
4 End
|
4 End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
May match empty string
|
May match empty string
|
||||||
Options: extended
|
Options: extended
|
||||||
Subject length lower bound = 0
|
Subject length lower bound = 0
|
||||||
|
@ -80,7 +80,7 @@ Memory allocation (code space): 28
|
||||||
4 4 Ket
|
4 4 Ket
|
||||||
6 End
|
6 End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: extended
|
Options: extended
|
||||||
First code unit = 'a'
|
First code unit = 'a'
|
||||||
Subject length lower bound = 1
|
Subject length lower bound = 1
|
||||||
|
@ -376,7 +376,7 @@ Memory allocation (code space): 52
|
||||||
10 10 Ket
|
10 10 Ket
|
||||||
12 End
|
12 End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: utf
|
Options: utf
|
||||||
First code unit = 'A'
|
First code unit = 'A'
|
||||||
Last code unit = '.'
|
Last code unit = '.'
|
||||||
|
@ -390,7 +390,7 @@ Memory allocation (code space): 44
|
||||||
8 8 Ket
|
8 8 Ket
|
||||||
10 End
|
10 End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: utf
|
Options: utf
|
||||||
First code unit = \x{d55c}
|
First code unit = \x{d55c}
|
||||||
Last code unit = \x{c5b4}
|
Last code unit = \x{c5b4}
|
||||||
|
@ -404,7 +404,7 @@ Memory allocation (code space): 44
|
||||||
8 8 Ket
|
8 8 Ket
|
||||||
10 End
|
10 End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: utf
|
Options: utf
|
||||||
First code unit = \x{65e5}
|
First code unit = \x{65e5}
|
||||||
Last code unit = \x{8a9e}
|
Last code unit = \x{8a9e}
|
||||||
|
@ -903,7 +903,7 @@ Failed: error 186 at offset 12820: regular expression is too complicated
|
||||||
79 79 Ket
|
79 79 Ket
|
||||||
81 End
|
81 End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 1
|
Capture group count = 1
|
||||||
Max back reference = 1
|
Max back reference = 1
|
||||||
May match empty string
|
May match empty string
|
||||||
Subject length lower bound = 0
|
Subject length lower bound = 0
|
||||||
|
@ -937,7 +937,7 @@ Subject length lower bound = 0
|
||||||
43 43 Ket
|
43 43 Ket
|
||||||
45 End
|
45 End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 1
|
Capture group count = 1
|
||||||
Max back reference = 1
|
Max back reference = 1
|
||||||
May match empty string
|
May match empty string
|
||||||
Subject length lower bound = 0
|
Subject length lower bound = 0
|
||||||
|
@ -1010,7 +1010,7 @@ No match
|
||||||
133 133 Ket
|
133 133 Ket
|
||||||
135 End
|
135 End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 10
|
Capture group count = 10
|
||||||
May match empty string
|
May match empty string
|
||||||
Subject length lower bound = 0
|
Subject length lower bound = 0
|
||||||
|
|
||||||
|
|
|
@ -67,7 +67,7 @@ Memory allocation (code space): 20
|
||||||
2 2 Ket
|
2 2 Ket
|
||||||
4 End
|
4 End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
May match empty string
|
May match empty string
|
||||||
Options: extended
|
Options: extended
|
||||||
Subject length lower bound = 0
|
Subject length lower bound = 0
|
||||||
|
@ -80,7 +80,7 @@ Memory allocation (code space): 28
|
||||||
4 4 Ket
|
4 4 Ket
|
||||||
6 End
|
6 End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: extended
|
Options: extended
|
||||||
First code unit = 'a'
|
First code unit = 'a'
|
||||||
Subject length lower bound = 1
|
Subject length lower bound = 1
|
||||||
|
@ -376,7 +376,7 @@ Memory allocation (code space): 52
|
||||||
10 10 Ket
|
10 10 Ket
|
||||||
12 End
|
12 End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: utf
|
Options: utf
|
||||||
First code unit = 'A'
|
First code unit = 'A'
|
||||||
Last code unit = '.'
|
Last code unit = '.'
|
||||||
|
@ -390,7 +390,7 @@ Memory allocation (code space): 44
|
||||||
8 8 Ket
|
8 8 Ket
|
||||||
10 End
|
10 End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: utf
|
Options: utf
|
||||||
First code unit = \x{d55c}
|
First code unit = \x{d55c}
|
||||||
Last code unit = \x{c5b4}
|
Last code unit = \x{c5b4}
|
||||||
|
@ -404,7 +404,7 @@ Memory allocation (code space): 44
|
||||||
8 8 Ket
|
8 8 Ket
|
||||||
10 End
|
10 End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: utf
|
Options: utf
|
||||||
First code unit = \x{65e5}
|
First code unit = \x{65e5}
|
||||||
Last code unit = \x{8a9e}
|
Last code unit = \x{8a9e}
|
||||||
|
@ -903,7 +903,7 @@ Failed: error 186 at offset 12820: regular expression is too complicated
|
||||||
79 79 Ket
|
79 79 Ket
|
||||||
81 End
|
81 End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 1
|
Capture group count = 1
|
||||||
Max back reference = 1
|
Max back reference = 1
|
||||||
May match empty string
|
May match empty string
|
||||||
Subject length lower bound = 0
|
Subject length lower bound = 0
|
||||||
|
@ -937,7 +937,7 @@ Subject length lower bound = 0
|
||||||
43 43 Ket
|
43 43 Ket
|
||||||
45 End
|
45 End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 1
|
Capture group count = 1
|
||||||
Max back reference = 1
|
Max back reference = 1
|
||||||
May match empty string
|
May match empty string
|
||||||
Subject length lower bound = 0
|
Subject length lower bound = 0
|
||||||
|
@ -1010,7 +1010,7 @@ No match
|
||||||
133 133 Ket
|
133 133 Ket
|
||||||
135 End
|
135 End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 10
|
Capture group count = 10
|
||||||
May match empty string
|
May match empty string
|
||||||
Subject length lower bound = 0
|
Subject length lower bound = 0
|
||||||
|
|
||||||
|
|
|
@ -67,7 +67,7 @@ Memory allocation (code space): 7
|
||||||
3 3 Ket
|
3 3 Ket
|
||||||
6 End
|
6 End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
May match empty string
|
May match empty string
|
||||||
Options: extended
|
Options: extended
|
||||||
Subject length lower bound = 0
|
Subject length lower bound = 0
|
||||||
|
@ -80,7 +80,7 @@ Memory allocation (code space): 9
|
||||||
5 5 Ket
|
5 5 Ket
|
||||||
8 End
|
8 End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: extended
|
Options: extended
|
||||||
First code unit = 'a'
|
First code unit = 'a'
|
||||||
Subject length lower bound = 1
|
Subject length lower bound = 1
|
||||||
|
@ -376,7 +376,7 @@ Memory allocation (code space): 18
|
||||||
14 14 Ket
|
14 14 Ket
|
||||||
17 End
|
17 End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: utf
|
Options: utf
|
||||||
First code unit = 'A'
|
First code unit = 'A'
|
||||||
Last code unit = '.'
|
Last code unit = '.'
|
||||||
|
@ -390,7 +390,7 @@ Memory allocation (code space): 19
|
||||||
15 15 Ket
|
15 15 Ket
|
||||||
18 End
|
18 End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: utf
|
Options: utf
|
||||||
First code unit = \xed
|
First code unit = \xed
|
||||||
Last code unit = \xb4
|
Last code unit = \xb4
|
||||||
|
@ -404,7 +404,7 @@ Memory allocation (code space): 19
|
||||||
15 15 Ket
|
15 15 Ket
|
||||||
18 End
|
18 End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: utf
|
Options: utf
|
||||||
First code unit = \xe6
|
First code unit = \xe6
|
||||||
Last code unit = \x9e
|
Last code unit = \x9e
|
||||||
|
@ -904,7 +904,7 @@ Failed: error 186 at offset 12820: regular expression is too complicated
|
||||||
119 119 Ket
|
119 119 Ket
|
||||||
122 End
|
122 End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 1
|
Capture group count = 1
|
||||||
Max back reference = 1
|
Max back reference = 1
|
||||||
May match empty string
|
May match empty string
|
||||||
Subject length lower bound = 0
|
Subject length lower bound = 0
|
||||||
|
@ -938,7 +938,7 @@ Subject length lower bound = 0
|
||||||
61 61 Ket
|
61 61 Ket
|
||||||
64 End
|
64 End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 1
|
Capture group count = 1
|
||||||
Max back reference = 1
|
Max back reference = 1
|
||||||
May match empty string
|
May match empty string
|
||||||
Subject length lower bound = 0
|
Subject length lower bound = 0
|
||||||
|
@ -1011,7 +1011,7 @@ No match
|
||||||
205 205 Ket
|
205 205 Ket
|
||||||
208 End
|
208 End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 10
|
Capture group count = 10
|
||||||
May match empty string
|
May match empty string
|
||||||
Subject length lower bound = 0
|
Subject length lower bound = 0
|
||||||
|
|
||||||
|
|
|
@ -67,7 +67,7 @@ Memory allocation (code space): 9
|
||||||
4 4 Ket
|
4 4 Ket
|
||||||
8 End
|
8 End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
May match empty string
|
May match empty string
|
||||||
Options: extended
|
Options: extended
|
||||||
Subject length lower bound = 0
|
Subject length lower bound = 0
|
||||||
|
@ -80,7 +80,7 @@ Memory allocation (code space): 11
|
||||||
6 6 Ket
|
6 6 Ket
|
||||||
10 End
|
10 End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: extended
|
Options: extended
|
||||||
First code unit = 'a'
|
First code unit = 'a'
|
||||||
Subject length lower bound = 1
|
Subject length lower bound = 1
|
||||||
|
@ -376,7 +376,7 @@ Memory allocation (code space): 20
|
||||||
15 15 Ket
|
15 15 Ket
|
||||||
19 End
|
19 End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: utf
|
Options: utf
|
||||||
First code unit = 'A'
|
First code unit = 'A'
|
||||||
Last code unit = '.'
|
Last code unit = '.'
|
||||||
|
@ -390,7 +390,7 @@ Memory allocation (code space): 21
|
||||||
16 16 Ket
|
16 16 Ket
|
||||||
20 End
|
20 End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: utf
|
Options: utf
|
||||||
First code unit = \xed
|
First code unit = \xed
|
||||||
Last code unit = \xb4
|
Last code unit = \xb4
|
||||||
|
@ -404,7 +404,7 @@ Memory allocation (code space): 21
|
||||||
16 16 Ket
|
16 16 Ket
|
||||||
20 End
|
20 End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: utf
|
Options: utf
|
||||||
First code unit = \xe6
|
First code unit = \xe6
|
||||||
Last code unit = \x9e
|
Last code unit = \x9e
|
||||||
|
@ -903,7 +903,7 @@ Failed: error 186 at offset 12820: regular expression is too complicated
|
||||||
150 150 Ket
|
150 150 Ket
|
||||||
154 End
|
154 End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 1
|
Capture group count = 1
|
||||||
Max back reference = 1
|
Max back reference = 1
|
||||||
May match empty string
|
May match empty string
|
||||||
Subject length lower bound = 0
|
Subject length lower bound = 0
|
||||||
|
@ -937,7 +937,7 @@ Subject length lower bound = 0
|
||||||
76 76 Ket
|
76 76 Ket
|
||||||
80 End
|
80 End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 1
|
Capture group count = 1
|
||||||
Max back reference = 1
|
Max back reference = 1
|
||||||
May match empty string
|
May match empty string
|
||||||
Subject length lower bound = 0
|
Subject length lower bound = 0
|
||||||
|
@ -1010,7 +1010,7 @@ No match
|
||||||
266 266 Ket
|
266 266 Ket
|
||||||
270 End
|
270 End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 10
|
Capture group count = 10
|
||||||
May match empty string
|
May match empty string
|
||||||
Subject length lower bound = 0
|
Subject length lower bound = 0
|
||||||
|
|
||||||
|
|
|
@ -67,7 +67,7 @@ Memory allocation (code space): 11
|
||||||
5 5 Ket
|
5 5 Ket
|
||||||
10 End
|
10 End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
May match empty string
|
May match empty string
|
||||||
Options: extended
|
Options: extended
|
||||||
Subject length lower bound = 0
|
Subject length lower bound = 0
|
||||||
|
@ -80,7 +80,7 @@ Memory allocation (code space): 13
|
||||||
7 7 Ket
|
7 7 Ket
|
||||||
12 End
|
12 End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: extended
|
Options: extended
|
||||||
First code unit = 'a'
|
First code unit = 'a'
|
||||||
Subject length lower bound = 1
|
Subject length lower bound = 1
|
||||||
|
@ -376,7 +376,7 @@ Memory allocation (code space): 22
|
||||||
16 16 Ket
|
16 16 Ket
|
||||||
21 End
|
21 End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: utf
|
Options: utf
|
||||||
First code unit = 'A'
|
First code unit = 'A'
|
||||||
Last code unit = '.'
|
Last code unit = '.'
|
||||||
|
@ -390,7 +390,7 @@ Memory allocation (code space): 23
|
||||||
17 17 Ket
|
17 17 Ket
|
||||||
22 End
|
22 End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: utf
|
Options: utf
|
||||||
First code unit = \xed
|
First code unit = \xed
|
||||||
Last code unit = \xb4
|
Last code unit = \xb4
|
||||||
|
@ -404,7 +404,7 @@ Memory allocation (code space): 23
|
||||||
17 17 Ket
|
17 17 Ket
|
||||||
22 End
|
22 End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Options: utf
|
Options: utf
|
||||||
First code unit = \xe6
|
First code unit = \xe6
|
||||||
Last code unit = \x9e
|
Last code unit = \x9e
|
||||||
|
@ -903,7 +903,7 @@ Failed: error 186 at offset 12820: regular expression is too complicated
|
||||||
181 181 Ket
|
181 181 Ket
|
||||||
186 End
|
186 End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 1
|
Capture group count = 1
|
||||||
Max back reference = 1
|
Max back reference = 1
|
||||||
May match empty string
|
May match empty string
|
||||||
Subject length lower bound = 0
|
Subject length lower bound = 0
|
||||||
|
@ -937,7 +937,7 @@ Subject length lower bound = 0
|
||||||
91 91 Ket
|
91 91 Ket
|
||||||
96 End
|
96 End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 1
|
Capture group count = 1
|
||||||
Max back reference = 1
|
Max back reference = 1
|
||||||
May match empty string
|
May match empty string
|
||||||
Subject length lower bound = 0
|
Subject length lower bound = 0
|
||||||
|
@ -1010,7 +1010,7 @@ No match
|
||||||
327 327 Ket
|
327 327 Ket
|
||||||
332 End
|
332 End
|
||||||
------------------------------------------------------------------
|
------------------------------------------------------------------
|
||||||
Capturing subpattern count = 10
|
Capture group count = 10
|
||||||
May match empty string
|
May match empty string
|
||||||
Subject length lower bound = 0
|
Subject length lower bound = 0
|
||||||
|
|
||||||
|
|
|
@ -215,7 +215,7 @@ Failed: error 134 at offset 6: character code point value in \x{} or \o{} is too
|
||||||
(?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] | \( (?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] )* \) )*
|
(?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] | \( (?: [^\\\x80-\xff\n\015()] | \\ [^\x80-\xff] )* \) )*
|
||||||
\) )* # optional trailing comment
|
\) )* # optional trailing comment
|
||||||
/Ix
|
/Ix
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Contains explicit CR or LF match
|
Contains explicit CR or LF match
|
||||||
Options: extended
|
Options: extended
|
||||||
Starting code units: \x09 \x20 ! " # $ % & ' ( * + - / 0 1 2 3 4 5 6 7 8
|
Starting code units: \x09 \x20 ! " # $ % & ' ( * + - / 0 1 2 3 4 5 6 7 8
|
||||||
|
@ -224,25 +224,25 @@ Starting code units: \x09 \x20 ! " # $ % & ' ( * + - / 0 1 2 3 4 5 6 7 8
|
||||||
Subject length lower bound = 3
|
Subject length lower bound = 3
|
||||||
|
|
||||||
/\h/I
|
/\h/I
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Starting code units: \x09 \x20 \xa0
|
Starting code units: \x09 \x20 \xa0
|
||||||
Subject length lower bound = 1
|
Subject length lower bound = 1
|
||||||
|
|
||||||
/\H/I
|
/\H/I
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Subject length lower bound = 1
|
Subject length lower bound = 1
|
||||||
|
|
||||||
/\v/I
|
/\v/I
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Starting code units: \x0a \x0b \x0c \x0d \x85
|
Starting code units: \x0a \x0b \x0c \x0d \x85
|
||||||
Subject length lower bound = 1
|
Subject length lower bound = 1
|
||||||
|
|
||||||
/\V/I
|
/\V/I
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Subject length lower bound = 1
|
Subject length lower bound = 1
|
||||||
|
|
||||||
/\R/I
|
/\R/I
|
||||||
Capturing subpattern count = 0
|
Capture group count = 0
|
||||||
Starting code units: \x0a \x0b \x0c \x0d \x85
|
Starting code units: \x0a \x0b \x0c \x0d \x85
|
||||||
Subject length lower bound = 1
|
Subject length lower bound = 1
|
||||||
|
|
||||||
|
|