Add subject_literal and allow jitstack in pcre2test pattern modifiers, and add
another big pattern test.
parent
1381c3fe28
commit
6e30ed1b40
|
@ -184,6 +184,9 @@ starting offset greater than zero.
|
|||
|
||||
39. Implement REG_PEND (GNU extension) for the POSIX wrapper.
|
||||
|
||||
40. Implement the subject_literal modifier in pcre2test, and allow jitstack on
|
||||
pattern lines.
|
||||
|
||||
|
||||
Version 10.23 14-February-2017
|
||||
------------------------------
|
||||
|
|
|
@ -1,4 +1,4 @@
|
|||
.TH PCRE2TEST 1 "06 June 2017" "PCRE 10.30"
|
||||
.TH PCRE2TEST 1 "12 June 2017" "PCRE 10.30"
|
||||
.SH NAME
|
||||
pcre2test - a program for testing Perl-compatible regular expressions.
|
||||
.SH SYNOPSIS
|
||||
|
@ -69,10 +69,10 @@ want that action.
|
|||
The input is processed using C's string functions, so must not
|
||||
contain binary zeros, even though in Unix-like environments, \fBfgets()\fP
|
||||
treats any bytes other than newline as data characters. An error is generated
|
||||
if a binary zero is encountered. Subject lines are processed for backslash
|
||||
escapes, which makes it possible to include any data value in strings that are
|
||||
passed to the library for matching. For patterns, there is a facility for
|
||||
specifying some or all of the 8-bit input characters as hexadecimal pairs,
|
||||
if a binary zero is encountered. By default subject lines are processed for
|
||||
backslash escapes, which makes it possible to include any data value in strings
|
||||
that are passed to the library for matching. For patterns, there is a facility
|
||||
for specifying some or all of the 8-bit input characters as hexadecimal pairs,
|
||||
which makes it possible to include binary zeros.
|
||||
.
|
||||
.
|
||||
|
@ -442,8 +442,9 @@ A pattern can be followed by a modifier list (details below).
|
|||
.sp
|
||||
Before each subject line is passed to \fBpcre2_match()\fP or
|
||||
\fBpcre2_dfa_match()\fP, leading and trailing white space is removed, and the
|
||||
line is scanned for backslash escapes. The following provide a means of
|
||||
encoding non-printing characters in a visible way:
|
||||
line is scanned for backslash escapes, unless the \fBsubject_literal\fP
|
||||
modifier was set for the pattern. The following provide a means of encoding
|
||||
non-printing characters in a visible way:
|
||||
.sp
|
||||
\ea alarm (BEL, \ex07)
|
||||
\eb backspace (\ex08)
|
||||
|
@ -505,6 +506,11 @@ character. A backslash followed by anything else causes an error. However, if
|
|||
the very last character in the line is a backslash (and there is no modifier
|
||||
list), it is ignored. This gives a way of passing an empty line as data, since
|
||||
a real empty line terminates the data input.
|
||||
.P
|
||||
If the \fBsubject_literal\fP modifier is set for a pattern, all subject lines
|
||||
that follow are treated as literals, with no special treatment of backslashes.
|
||||
No replication is possible, and any subject modifiers must be set as defaults
|
||||
by a \fB#subject\fP command.
|
||||
.
|
||||
.
|
||||
.SH "PATTERN MODIFIERS"
|
||||
|
@ -602,6 +608,7 @@ heavily used in the test files.
|
|||
push push compiled pattern onto the stack
|
||||
pushcopy push a copy onto the stack
|
||||
stackguard=<number> test the stackguard feature
|
||||
subject_literal treat all subject lines as literal
|
||||
tables=[0|1|2] select internal tables
|
||||
use_length do not zero-terminate the pattern
|
||||
utf8_input treat input as UTF-8
|
||||
|
@ -967,17 +974,18 @@ are mutually exclusive.
|
|||
.SS "Setting certain match controls"
|
||||
.rs
|
||||
.sp
|
||||
The following modifiers are really subject modifiers, and are described below.
|
||||
However, they may be included in a pattern's modifier list, in which case they
|
||||
are applied to every subject line that is processed with that pattern. They may
|
||||
not appear in \fB#pattern\fP commands. These modifiers do not affect the
|
||||
compilation process.
|
||||
The following modifiers are really subject modifiers, and are described under
|
||||
"Subject Modifiers" below. However, they may be included in a pattern's
|
||||
modifier list, in which case they are applied to every subject line that is
|
||||
processed with that pattern. They may not appear in \fB#pattern\fP commands.
|
||||
These modifiers do not affect the compilation process.
|
||||
.sp
|
||||
aftertext show text after match
|
||||
allaftertext show text after captures
|
||||
allcaptures show all captures
|
||||
allusedtext show all consulted text
|
||||
/g global global matching
|
||||
jitstack=<n> set size of JIT stack
|
||||
mark show mark values
|
||||
replace=<string> specify a replacement string
|
||||
startchar show starting character when relevant
|
||||
|
@ -990,6 +998,15 @@ These modifiers may not appear in a \fB#pattern\fP command. If you want them as
|
|||
defaults, set them in a \fB#subject\fP command.
|
||||
.
|
||||
.
|
||||
.SS "Specifying literal subject lines"
|
||||
.rs
|
||||
.sp
|
||||
If the \fBsubject_literal\fP modifier is present on a pattern, all the subject
|
||||
lines that it matches are taken as literal strings, with no interpretation of
|
||||
backslashes. It is not possible to set subject modifiers on such lines, but any
|
||||
that are set as defaults by a \fB#subject\fP command are recognized.
|
||||
.
|
||||
.
|
||||
.SS "Saving a compiled pattern"
|
||||
.rs
|
||||
.sp
|
||||
|
@ -1321,9 +1338,11 @@ matching provokes an error return ("bad option value") from
|
|||
.sp
|
||||
The \fBjitstack\fP modifier provides a way of setting the maximum stack size
|
||||
that is used by the just-in-time optimization code. It is ignored if JIT
|
||||
optimization is not being used. The value is a number of kilobytes. Providing a
|
||||
stack that is larger than the default 32K is necessary only for very
|
||||
complicated patterns.
|
||||
optimization is not being used. The value is a number of kilobytes. Setting
|
||||
zero reverts to the default of 32K. Providing a stack that is larger than the
|
||||
default is necessary only for very complicated patterns. If \fBjitstack\fP is
|
||||
set non-zero on a subject line it overrides any value that was set on the
|
||||
pattern.
|
||||
.
|
||||
.
|
||||
.SS "Setting heap, match, and depth limits"
|
||||
|
@ -1815,6 +1834,6 @@ Cambridge, England.
|
|||
.rs
|
||||
.sp
|
||||
.nf
|
||||
Last updated: 06 June 2017
|
||||
Last updated: 12 June 2017
|
||||
Copyright (c) 1997-2017 University of Cambridge.
|
||||
.fi
|
||||
|
|
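The jitstack wording in the man page hunk above (zero reverts to the 32K default, and a non-zero value on a subject line overrides the pattern's value) can be illustrated with a small C sketch. This is not the code in pcre2test.c: the helper names `effective_jitstack_kb` and `jitstack_bytes` and the constant `DEFAULT_JITSTACK_KB` are hypothetical, chosen only to model the documented precedence.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Default taken from the man page text ("the default of 32K"). */
#define DEFAULT_JITSTACK_KB 32u

/* Hypothetical helper: choose the JIT stack size, in kilobytes, for one
match. A non-zero subject-line value wins over the pattern's value, and
zero everywhere falls back to the documented default. */

static uint32_t
effective_jitstack_kb(uint32_t pattern_kb, uint32_t subject_kb)
{
uint32_t kb = (subject_kb != 0)? subject_kb : pattern_kb;
return (kb != 0)? kb : DEFAULT_JITSTACK_KB;
}

/* Hypothetical helper: convert a resolved, non-zero size to bytes. */

static size_t
jitstack_bytes(uint32_t kb)
{
assert(kb != 0);
return (size_t)kb * 1024u;
}
```

Under these assumptions, a pattern line with jitstack=256 and a subject line with no jitstack setting resolves to 256, while a subject line with jitstack=64 resolves to 64.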
26
perltest.sh
|
@ -42,13 +42,16 @@ fi
|
|||
# aftertext interpreted as "print $' afterwards"
|
||||
# afteralltext ignored
|
||||
# dupnames ignored (Perl always allows)
|
||||
# jitstack ignored
|
||||
# mark ignored
|
||||
# no_auto_possess ignored
|
||||
# no_start_optimize ignored
|
||||
# subject_literal does not process subjects for escapes
|
||||
# ucp sets Perl's /u modifier
|
||||
# utf invoke UTF-8 functionality
|
||||
#
|
||||
# The data lines must not have any pcre2test modifiers. They are processed as
|
||||
# The data lines must not have any pcre2test modifiers. Unless
|
||||
# "subject_literal" is on the pattern, data lines are processed as
|
||||
# Perl double-quoted strings, so if they contain " $ or @ characters, these
|
||||
# have to be escaped. For this reason, all such characters in the
|
||||
# Perl-compatible testinput1 and testinput4 files are escaped so that they can
|
||||
|
@ -138,16 +141,20 @@ for (;;)
|
|||
|
||||
chomp($pattern);
|
||||
$pattern =~ s/\s+$//;
|
||||
|
||||
|
||||
# Split the pattern from the modifiers and adjust them as necessary.
|
||||
|
||||
$pattern =~ /^\s*((.).*\2)(.*)$/s;
|
||||
$pat = $1;
|
||||
$mod = $3;
|
||||
|
||||
|
||||
# The private "aftertext" modifier means "print $' afterwards".
|
||||
|
||||
$showrest = ($mod =~ s/aftertext,?//);
|
||||
|
||||
# The "subject_literal" modifier disables escapes in subjects.
|
||||
|
||||
$subject_literal = ($mod =~ s/subject_literal,?//);
|
||||
|
||||
# "allaftertext" is used by pcre2test to print remainders after captures
|
||||
|
||||
|
@ -161,6 +168,10 @@ for (;;)
|
|||
|
||||
$mod =~ s/dupnames,?//;
|
||||
|
||||
# Remove "jitstack".
|
||||
|
||||
$mod =~ s/jitstack=\d+,?//;
|
||||
|
||||
# Remove "mark" (asks pcre2test to check MARK data) */
|
||||
|
||||
$mod =~ s/mark,?//;
|
||||
|
@ -222,7 +233,14 @@ for (;;)
|
|||
last if ($_ eq "");
|
||||
next if $_ =~ /^\\=(?:\s|$)/; # Comment line
|
||||
|
||||
$x = eval "\"$_\""; # To get escapes processed
|
||||
if ($subject_literal)
|
||||
{
|
||||
$x = $_;
|
||||
}
|
||||
else
|
||||
{
|
||||
$x = eval "\"$_\""; # To get escapes processed
|
||||
}
|
||||
|
||||
# Empty array for holding results, ensure $REGERROR and $REGMARK are
|
||||
# unset, then do the matching.
|
||||
|
|
|
@ -479,6 +479,7 @@ so many of them that they are split into two fields. */
|
|||
#define CTL2_SUBSTITUTE_OVERFLOW_LENGTH 0x00000002u
|
||||
#define CTL2_SUBSTITUTE_UNKNOWN_UNSET 0x00000004u
|
||||
#define CTL2_SUBSTITUTE_UNSET_EMPTY 0x00000008u
|
||||
#define CTL2_SUBJECT_LITERAL 0x00000010u
|
||||
|
||||
#define CTL_NL_SET 0x40000000u /* Informational */
|
||||
#define CTL_BSR_SET 0x80000000u /* Informational */
|
||||
|
@ -518,6 +519,7 @@ typedef struct patctl { /* Structure for pattern modifiers. */
|
|||
uint32_t options; /* Must be in same position as datctl */
|
||||
uint32_t control; /* Must be in same position as datctl */
|
||||
uint32_t control2; /* Must be in same position as datctl */
|
||||
uint32_t jitstack; /* Must be in same position as datctl */
|
||||
uint8_t replacement[REPLACE_MODSIZE]; /* So must this */
|
||||
uint32_t jit;
|
||||
uint32_t stackguard_test;
|
||||
|
@ -537,6 +539,7 @@ typedef struct datctl { /* Structure for data line modifiers. */
|
|||
uint32_t options; /* Must be in same position as patctl */
|
||||
uint32_t control; /* Must be in same position as patctl */
|
||||
uint32_t control2; /* Must be in same position as patctl */
|
||||
uint32_t jitstack; /* Must be in same position as patctl */
|
||||
uint8_t replacement[REPLACE_MODSIZE]; /* So must this */
|
||||
uint32_t startend[2];
|
||||
uint32_t cerror[2];
|
||||
|
@ -544,7 +547,6 @@ typedef struct datctl { /* Structure for data line modifiers. */
|
|||
int32_t callout_data;
|
||||
int32_t copy_numbers[MAXCPYGET];
|
||||
int32_t get_numbers[MAXCPYGET];
|
||||
uint32_t jitstack;
|
||||
uint32_t oveccount;
|
||||
uint32_t offset;
|
||||
uint8_t copy_names[LENCPYGET];
|
||||
|
@ -630,7 +632,7 @@ static modstruct modlist[] = {
|
|||
{ "info", MOD_PAT, MOD_CTL, CTL_INFO, PO(control) },
|
||||
{ "jit", MOD_PAT, MOD_IND, 7, PO(jit) },
|
||||
{ "jitfast", MOD_PAT, MOD_CTL, CTL_JITFAST, PO(control) },
|
||||
{ "jitstack", MOD_DAT, MOD_INT, 0, DO(jitstack) },
|
||||
{ "jitstack", MOD_PNDP, MOD_INT, 0, PO(jitstack) },
|
||||
{ "jitverify", MOD_PAT, MOD_CTL, CTL_JITVERIFY, PO(control) },
|
||||
{ "locale", MOD_PAT, MOD_STR, LOCALESIZE, PO(locale) },
|
||||
{ "mark", MOD_PNDP, MOD_CTL, CTL_MARK, PO(control) },
|
||||
|
@ -674,6 +676,7 @@ static modstruct modlist[] = {
|
|||
{ "stackguard", MOD_PAT, MOD_INT, 0, PO(stackguard_test) },
|
||||
{ "startchar", MOD_PND, MOD_CTL, CTL_STARTCHAR, PO(control) },
|
||||
{ "startoffset", MOD_DAT, MOD_INT, 0, DO(offset) },
|
||||
{ "subject_literal", MOD_PATP, MOD_CTL, CTL2_SUBJECT_LITERAL, PO(control2) },
|
||||
{ "substitute_extended", MOD_PND, MOD_CTL, CTL2_SUBSTITUTE_EXTENDED, PO(control2) },
|
||||
{ "substitute_overflow_length", MOD_PND, MOD_CTL, CTL2_SUBSTITUTE_OVERFLOW_LENGTH, PO(control2) },
|
||||
{ "substitute_unknown_unset", MOD_PND, MOD_CTL, CTL2_SUBSTITUTE_UNKNOWN_UNSET, PO(control2) },
|
||||
|
@ -3477,7 +3480,8 @@ switch (m->which)
|
|||
case MOD_PND: /* Ditto, but not default pattern */
|
||||
case MOD_PNDP: /* Ditto, allowed for Perl test */
|
||||
if (dctl != NULL) field = dctl;
|
||||
else if (pctl != NULL && (m->which == MOD_PD || ctx != CTX_DEFPAT))
|
||||
else if (pctl != NULL && (m->which == MOD_PD || m->which == MOD_PDP ||
|
||||
ctx != CTX_DEFPAT))
|
||||
field = pctl;
|
||||
break;
|
||||
}
|
||||
|
@ -6216,6 +6220,7 @@ uint8_t *p, *pp, *start_rep;
|
|||
size_t needlen;
|
||||
void *use_dat_context;
|
||||
BOOL utf;
|
||||
BOOL subject_literal;
|
||||
|
||||
#ifdef SUPPORT_PCRE2_8
|
||||
uint8_t *q8 = NULL;
|
||||
|
@ -6227,6 +6232,8 @@ uint16_t *q16 = NULL;
|
|||
uint32_t *q32 = NULL;
|
||||
#endif
|
||||
|
||||
subject_literal = (pat_patctl.control2 & CTL2_SUBJECT_LITERAL) != 0;
|
||||
|
||||
/* Copy the default context and data control blocks to the active ones. Then
|
||||
copy from the pattern the controls that can be set in either the pattern or the
|
||||
data. This allows them to be overridden in the data line. We do not do this for
|
||||
|
@ -6238,6 +6245,7 @@ memcpy(&dat_datctl, &def_datctl, sizeof(datctl));
|
|||
dat_datctl.control |= (pat_patctl.control & CTL_ALLPD);
|
||||
dat_datctl.control2 |= (pat_patctl.control2 & CTL2_ALLPD);
|
||||
strcpy((char *)dat_datctl.replacement, (char *)pat_patctl.replacement);
|
||||
if (dat_datctl.jitstack == 0) dat_datctl.jitstack = pat_patctl.jitstack;
|
||||
|
||||
/* Initialize for scanning the data line. */
|
||||
|
||||
|
@ -6373,7 +6381,7 @@ while ((c = *p++) != 0)
|
|||
/* Handle a non-escaped character. In non-UTF 32-bit mode with utf8_input
|
||||
set, do the fudge for setting the top bit. */
|
||||
|
||||
if (c != '\\')
|
||||
if (c != '\\' || subject_literal)
|
||||
{
|
||||
uint32_t topbit = 0;
|
||||
if (test_mode == PCRE32_MODE && c == 0xff && *p != 0)
|
||||
|
|
|
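The effect of the `c != '\\' || subject_literal` test in the last pcre2test.c hunk can be shown with a much-reduced C model of subject-line scanning. This is a sketch under stated assumptions, not the real scanner: `scan_subject` is a hypothetical function, it handles only the `\n`, `\t` and `\\` escapes, and it ignores subject modifiers, replication and UTF handling.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, reduced model of subject-line scanning. When subject_literal
is non-zero every character, including backslash, is copied unchanged.
Otherwise a backslash introduces an escape; only \n, \t and \\ are modelled
here, and any other escape is reported as an error. A backslash that is the
very last character of the line is ignored, as the man page describes.
Returns the output length, or -1 on error or if the buffer is too small. */

static int
scan_subject(const char *line, int subject_literal, char *out, size_t outsize)
{
size_t n = 0;

assert(line != NULL && out != NULL);
if (outsize == 0) return -1;

while (*line != 0)
  {
  char c = *line++;
  if (c == '\\' && !subject_literal)
    {
    if (*line == 0) break;          /* Trailing backslash is ignored */
    switch (*line)
      {
      case 'n':  c = '\n'; break;
      case 't':  c = '\t'; break;
      case '\\': c = '\\'; break;
      default:   return -1;         /* Escape not modelled in this sketch */
      }
    line++;
    }
  if (n + 1 >= outsize) return -1;  /* Keep room for the terminating zero */
  out[n++] = c;
  }

out[n] = 0;
return (int)n;
}
```

With this model, the four-character line `a\tb` yields three characters in the default mode and is copied unchanged in literal mode, which is why a pattern whose subjects are themselves regular expressions is much easier to test with subject_literal set.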
@ -5924,9 +5924,9 @@ ef) x/x,mark
|
|||
# addresses in various formats. It's a heavy test for named subpatterns. In the
|
||||
# <atext> group, slash is coded as \x{2f} so that this pattern can also be
|
||||
# processed by perltest.sh, which does not cater for an escaped delimiter
|
||||
# within the pattern. All $ and @ characters in subject strings are escaped so
|
||||
# that Perl doesn't interpret them as variable insertions and " characters must
|
||||
# also be escaped for Perl.
|
||||
# within the pattern. $ within the pattern must also be escaped. All $ and @
|
||||
# characters in subject strings are escaped so that Perl doesn't interpret them
|
||||
# as variable insertions and " characters must also be escaped for Perl.
|
||||
|
||||
# This set of subpatterns is more or less a direct transliteration of the BNF
|
||||
# definitions in RFC2822, without any of the obsolete features. The addition of
|
||||
|
@ -5937,7 +5937,7 @@ ef) x/x,mark
|
|||
/(?ix)(?(DEFINE)
|
||||
(?<addr_spec> (?&local_part) \@ (?&domain) )
|
||||
(?<angle_addr> (?&CFWS)?+ < (?&addr_spec) > (?&CFWS)?+ )
|
||||
(?<atext> [a-z\d!#$%&'*+-\x{2f}=?^_`{|}~] )
|
||||
(?<atext> [a-z\d!#\$%&'*+-\x{2f}=?^_`{|}~] )
|
||||
(?<atom> (?&CFWS)?+ (?&atext)+ (?&CFWS)?+ )
|
||||
(?<ccontent> (?&ctext) | (?&quoted_pair) | (?&comment) )
|
||||
(?<ctext> [^\x{9}\x{10}\x{13}\x{7f}-\x{ff}\ ()\\] )
|
||||
|
@ -5981,4 +5981,180 @@ ef) x/x,mark
|
|||
|
||||
# --------------------------------------------------------------------------
|
||||
|
||||
# This pattern uses named groups to match default PCRE2 patterns. It's another
|
||||
# heavy test for named subpatterns. Once again, code slash as \x{2f} and escape
|
||||
# $ even in classes so that this works with pcre2test.
|
||||
|
||||
/(?sx)(?(DEFINE)
|
||||
|
||||
(?<assertion> (?&simple_assertion) | (?&lookaround) )
|
||||
|
||||
(?<atomic_group> \( \? > (?&regex) \) )
|
||||
|
||||
(?<back_reference> \\ \d+ |
|
||||
\\g (?: [+-]?\d+ | \{ (?: [+-]?\d+ | (?&groupname) ) \} ) |
|
||||
\\k <(?&groupname)> |
|
||||
\\k '(?&groupname)' |
|
||||
\\k \{ (?&groupname) \} |
|
||||
\( \? P= (?&groupname) \) )
|
||||
|
||||
(?<branch> (?:(?&assertion) |
|
||||
(?&callout) |
|
||||
(?&comment) |
|
||||
(?&option_setting) |
|
||||
(?&qualified_item) |
|
||||
(?&quoted_string) |
|
||||
(?&quoted_string_empty) |
|
||||
(?&special_escape) |
|
||||
(?&verb)
|
||||
)* )
|
||||
|
||||
(?<callout> \(\?C (?: \d+ |
|
||||
(?: (?<D>["'`^%\#\$])
|
||||
(?: \k'D'\k'D' | (?!\k'D') . )* \k'D' |
|
||||
\{ (?: \}\} | [^}]*+ )* \} )
|
||||
)? \) )
|
||||
|
||||
(?<capturing_group> \( (?: \? P? < (?&groupname) > | \? ' (?&groupname) ' )?
|
||||
(?&regex) \) )
|
||||
|
||||
(?<character_class> \[ \^?+ (?: \] (?&class_item)* | (?&class_item)+ ) \] )
|
||||
|
||||
(?<character_type> (?! \\N\{\w+\} ) \\ [dDsSwWhHvVRN] )
|
||||
|
||||
(?<class_item> (?: \[ : (?:
|
||||
alnum|alpha|ascii|blank|cntrl|digit|graph|lower|print|
|
||||
punct|space|upper|word|xdigit
|
||||
) : \] |
|
||||
(?&quoted_string) |
|
||||
(?&quoted_string_empty) |
|
||||
(?&escaped_character) |
|
||||
(?&character_type) |
|
||||
[^]] ) )
|
||||
|
||||
(?<comment> \(\?\# [^)]* \) | (?&quoted_string_empty) | \\E )
|
||||
|
||||
(?<condition> (?: \( [+-]? \d+ \) |
|
||||
\( < (?&groupname) > \) |
|
||||
\( ' (?&groupname) ' \) |
|
||||
\( R \d* \) |
|
||||
\( R & (?&groupname) \) |
|
||||
\( (?&groupname) \) |
|
||||
\( DEFINE \) |
|
||||
\( VERSION >?=\d+(?:\.\d\d?)? \) |
|
||||
(?&callout)?+ (?&comment)* (?&lookaround) ) )
|
||||
|
||||
(?<conditional_group> \(\? (?&condition) (?&branch) (?: \| (?&branch) )? \) )
|
||||
|
||||
(?<delimited_regex> (?<delimiter> [-\x{2f}!"'`=_:;,%&@~]) (?&regex)
|
||||
\k'delimiter' .* )
|
||||
|
||||
(?<escaped_character> \\ (?: 0[0-7]{1,2} | [0-7]{1,3} | o\{ [0-7]+ \} |
|
||||
x \{ (*COMMIT) [[:xdigit:]]* \} | x [[:xdigit:]]{0,2} |
|
||||
[aefnrt] | c[[:print:]] |
|
||||
[^[:alnum:]] ) )
|
||||
|
||||
(?<group> (?&capturing_group) | (?&non_capturing_group) |
|
||||
(?&resetting_group) | (?&atomic_group) |
|
||||
(?&conditional_group) )
|
||||
|
||||
(?<groupname> [a-zA-Z_]\w* )
|
||||
|
||||
(?<literal_character> (?! (?&range_qualifier) ) [^[()|*+?.\$\\] )
|
||||
|
||||
(?<lookaround> \(\? (?: = | ! | <= | <! ) (?&regex) \) )
|
||||
|
||||
(?<non_capturing_group> \(\? [iJmnsUx-]* : (?&regex) \) )
|
||||
|
||||
(?<option_setting> \(\? [iJmnsUx-]* \) )
|
||||
|
||||
(?<qualified_item> (?:\. |
|
||||
(?&lookaround) |
|
||||
(?&back_reference) |
|
||||
(?&character_class) |
|
||||
(?&character_type) |
|
||||
(?&escaped_character) |
|
||||
(?&group) |
|
||||
(?&subroutine_call) |
|
||||
(?&literal_character) |
|
||||
(?&quoted_string)
|
||||
) (?&comment)? (?&qualifier)? )
|
||||
|
||||
(?<qualifier> (?: [?*+] | (?&range_qualifier) ) [+?]? )
|
||||
|
||||
(?<quoted_string> (?: \\Q (?: (?!\\E | \k'delimiter') . )++ (?: \\E | ) ) )
|
||||
|
||||
(?<quoted_string_empty> \\Q\\E )
|
||||
|
||||
(?<range_qualifier> \{ (?: \d+ (?: , \d* )? | , \d+ ) \} )
|
||||
|
||||
(?<regex> (?&start_item)* (?&branch) (?: \| (?&branch) )* )
|
||||
|
||||
(?<resetting_group> \( \? \| (?&regex) \) )
|
||||
|
||||
(?<simple_assertion> \^ | \$ | \\A | \\b | \\B | \\G | \\z | \\Z )
|
||||
|
||||
(?<special_escape> \\K )
|
||||
|
||||
(?<start_item> \( \* (?:
|
||||
ANY |
|
||||
ANYCRLF |
|
||||
BSR_ANYCRLF |
|
||||
BSR_UNICODE |
|
||||
CR |
|
||||
CRLF |
|
||||
LF |
|
||||
LIMIT_MATCH=\d+ |
|
||||
LIMIT_DEPTH=\d+ |
|
||||
LIMIT_HEAP=\d+ |
|
||||
NOTEMPTY |
|
||||
NOTEMPTY_ATSTART |
|
||||
NO_AUTO_POSSESS |
|
||||
NO_DOTSTAR_ANCHOR |
|
||||
NO_JIT |
|
||||
NO_START_OPT |
|
||||
NUL |
|
||||
UTF |
|
||||
UCP ) \) )
|
||||
|
||||
(?<subroutine_call> (?: \(\?R\) | \(\?[+-]?\d+\) |
|
||||
\(\? (?: & | P> ) (?&groupname) \) |
|
||||
\\g < (?&groupname) > |
|
||||
\\g ' (?&groupname) ' |
|
||||
\\g < [+-]? \d+ > |
|
||||
\\g ' [+-]? \d+ ) )
|
||||
|
||||
(?<verb> \(\* (?: ACCEPT | FAIL | F | COMMIT |
|
||||
(?:MARK)?:(?&verbname) |
|
||||
(?:PRUNE|SKIP|THEN) (?: : (?&verbname)? )? ) \) )
|
||||
|
||||
(?<verbname> [^)]+ )
|
||||
|
||||
) # End DEFINE
|
||||
# Kick it all off...
|
||||
^(?&delimited_regex)$/subject_literal,jitstack=256
|
||||
/^(a)(b)(c)(d)(e)(f)(g)(h)(i)(j)(k)\11*(\3\4)\1(?#)2$/
|
||||
/(cat(a(ract|tonic)|erpillar)) \1()2(3)/
|
||||
/^From +([^ ]+) +[a-zA-Z][a-zA-Z][a-zA-Z] +[a-zA-Z][a-zA-Z][a-zA-Z] +[0-9]?[0-9] +[0-9][0-9]:[0-9][0-9]/
|
||||
/^From\s+\S+\s+([a-zA-Z]{3}\s+){2}\d{1,2}\s+\d\d:\d\d/
|
||||
/<tr([\w\W\s\d][^<>]{0,})><TD([\w\W\s\d][^<>]{0,})>([\d]{0,}\.)(.*)((<BR>([\w\W\s\d][^<>]{0,})|[\s]{0,}))<\/a><\/TD><TD([\w\W\s\d][^<>]{0,})>([\w\W\s\d][^<>]{0,})<\/TD><TD([\w\W\s\d][^<>]{0,})>([\w\W\s\d][^<>]{0,})<\/TD><\/TR>/is
|
||||
/^(?(DEFINE) (?<A> a) (?<B> b) ) (?&A) (?&B) /
|
||||
/(?(DEFINE)(?<byte>2[0-4]\d|25[0-5]|1\d\d|[1-9]?\d))\b(?&byte)(\.(?&byte)){3}/
|
||||
/\b(?&byte)(\.(?&byte)){3}(?(DEFINE)(?<byte>2[0-4]\d|25[0-5]|1\d\d|[1-9]?\d))/
|
||||
/^(\w++|\s++)*$/
|
||||
/a+b?(*THEN)c+(*FAIL)/
|
||||
/(A (A|B(*ACCEPT)|C) D)(E)/x
|
||||
/^\W*+(?:((.)\W*+(?1)\W*+\2|)|((.)\W*+(?3)\W*+\4|\W*+.\W*+))\W*+$/i
|
||||
/A(*PRUNE)B(*SKIP)C(*THEN)D(*COMMIT)E(*F)F(*FAIL)G(?!)H(*ACCEPT)I/B
|
||||
/(?C`a``b`)(?C'a''b')(?C"a""b")(?C^a^^b^)(?C%a%%b%)(?C#a##b#)(?C$a$$b$)(?C{a}}b})/B,callout_info
|
||||
/(?sx)(?(DEFINE)(?<assertion> (?&simple_assertion) | (?&lookaround) )(?<atomic_group> \( \? > (?&regex) \) )(?<back_reference> \\ \d+ | \\g (?: [+-]?\d+ | \{ (?: [+-]?\d+ | (?&groupname) ) \} ) | \\k <(?&groupname)> | \\k '(?&groupname)' | \\k \{ (?&groupname) \} | \( \? P= (?&groupname) \) )(?<branch> (?:(?&assertion) | (?&callout) | (?&comment) | (?&option_setting) | (?&qualified_item) | (?&quoted_string) | (?&quoted_string_empty) | (?&special_escape) | (?&verb) )* )(?<callout> \(\?C (?: \d+ | (?: (?<D>["'`^%\#\$]) (?: \k'D'\k'D' | (?!\k'D') . )* \k'D' | \{ (?: \}\} | [^}]*+ )* \} ) )? \) )(?<capturing_group> \( (?: \? P? < (?&groupname) > | \? ' (?&groupname) ' )? (?&regex) \) )(?<character_class> \[ \^?+ (?: \] (?&class_item)* | (?&class_item)+ ) \] )(?<character_type> (?! \\N\{\w+\} ) \\ [dDsSwWhHvVRN] )(?<class_item> (?: \[ : (?: alnum|alpha|ascii|blank|cntrl|digit|graph|lower|print| punct|space|upper|word|xdigit ) : \] | (?&quoted_string) | (?&quoted_string_empty) | (?&escaped_character) | (?&character_type) | [^]] ) )(?<comment> \(\?\# [^)]* \) | (?&quoted_string_empty) | \\E )(?<condition> (?: \( [+-]? \d+ \) | \( < (?&groupname) > \) | \( ' (?&groupname) ' \) | \( R \d* \) | \( R & (?&groupname) \) | \( (?&groupname) \) | \( DEFINE \) | \( VERSION >?=\d+(?:\.\d\d?)? \) | (?&callout)?+ (?&comment)* (?&lookaround) ) )(?<conditional_group> \(\? (?&condition) (?&branch) (?: \| (?&branch) )? \) )(?<delimited_regex> (?<delimiter> [-\x{2f}!"'`=_:;,%&@~]) (?&regex) \k'delimiter' .* )(?<escaped_character> \\ (?: 0[0-7]{1,2} | [0-7]{1,3} | o\{ [0-7]+ \} | x \{ (*COMMIT) [[:xdigit:]]* \} | x [[:xdigit:]]{0,2} | [aefnrt] | c[[:print:]] | [^[:alnum:]] ) )(?<group> (?&capturing_group) | (?&non_capturing_group) | (?&resetting_group) | (?&atomic_group) | (?&conditional_group) )(?<groupname> [a-zA-Z_]\w* )(?<literal_character> (?! (?&range_qualifier) ) [^[()|*+?.\$\\] )(?<lookaround> \(\? (?: = | ! | <= | <! ) (?&regex) \) )(?<non_capturing_group> \(\? [iJmnsUx-]* : (?&regex) \) )(?<option_setting> \(\? [iJmnsUx-]* \) )(?<qualified_item> (?:\. | (?&lookaround) | (?&back_reference) | (?&character_class) | (?&character_type) | (?&escaped_character) | (?&group) | (?&subroutine_call) | (?&literal_character) | (?&quoted_string) ) (?&comment)? (?&qualifier)? )(?<qualifier> (?: [?*+] | (?&range_qualifier) ) [+?]? )(?<quoted_string> (?: \\Q (?: (?!\\E | \k'delimiter') . )++ (?: \\E | ) ) ) (?<quoted_string_empty> \\Q\\E ) (?<range_qualifier> \{ (?: \d+ (?: , \d* )? | , \d+ ) \} )(?<regex> (?&start_item)* (?&branch) (?: \| (?&branch) )* )(?<resetting_group> \( \? \| (?&regex) \) )(?<simple_assertion> \^ | \$ | \\A | \\b | \\B | \\G | \\z | \\Z )(?<special_escape> \\K )(?<start_item> \( \* (?: ANY | ANYCRLF | BSR_ANYCRLF | BSR_UNICODE | CR | CRLF | LF | LIMIT_MATCH=\d+ | LIMIT_DEPTH=\d+ | LIMIT_HEAP=\d+ | NOTEMPTY | NOTEMPTY_ATSTART | NO_AUTO_POSSESS | NO_DOTSTAR_ANCHOR | NO_JIT | NO_START_OPT | NUL | UTF | UCP ) \) )(?<subroutine_call> (?: \(\?R\) | \(\?[+-]?\d+\) | \(\? (?: & | P> ) (?&groupname) \) | \\g < (?&groupname) > | \\g ' (?&groupname) ' | \\g < [+-]? \d+ > | \\g ' [+-]? \d+ ) )(?<verb> \(\* (?: ACCEPT | FAIL | F | COMMIT | (?:MARK)?:(?&verbname) | (?:PRUNE|SKIP|THEN) (?: : (?&verbname)? )? ) \) )(?<verbname> [^)]+ ))^(?&delimited_regex)$/
|
||||
\= Expect no match
|
||||
/((?(?C'')\QX\E(?!((?(?C'')(?!X=X));=)r*X=X));=)/
|
||||
/(?:(?(2y)a|b)(X))+/
|
||||
/a(*MARK)b/
|
||||
/a(*CR)b/
|
||||
/(?P<abn>(?P=abn)(?<badstufxxx)/
|
||||
|
||||
# --------------------------------------------------------------------------
|
||||
|
||||
# End of testinput1
|
||||
|
|
|
@ -9496,9 +9496,9 @@ No match
|
|||
# addresses in various formats. It's a heavy test for named subpatterns. In the
|
||||
# <atext> group, slash is coded as \x{2f} so that this pattern can also be
|
||||
# processed by perltest.sh, which does not cater for an escaped delimiter
|
||||
# within the pattern. All $ and @ characters in subject strings are escaped so
|
||||
# that Perl doesn't interpret them as variable insertions and " characters must
|
||||
# also be escaped for Perl.
|
||||
# within the pattern. $ within the pattern must also be escaped. All $ and @
|
||||
# characters in subject strings are escaped so that Perl doesn't interpret them
|
||||
# as variable insertions and " characters must also be escaped for Perl.
|
||||
|
||||
# This set of subpatterns is more or less a direct transliteration of the BNF
|
||||
# definitions in RFC2822, without any of the obsolete features. The addition of
|
||||
|
@ -9509,7 +9509,7 @@ No match
|
|||
/(?ix)(?(DEFINE)
|
||||
(?<addr_spec> (?&local_part) \@ (?&domain) )
|
||||
(?<angle_addr> (?&CFWS)?+ < (?&addr_spec) > (?&CFWS)?+ )
|
||||
(?<atext> [a-z\d!#$%&'*+-\x{2f}=?^_`{|}~] )
|
||||
(?<atext> [a-z\d!#\$%&'*+-\x{2f}=?^_`{|}~] )
|
||||
(?<atom> (?&CFWS)?+ (?&atext)+ (?&CFWS)?+ )
|
||||
(?<ccontent> (?&ctext) | (?&quoted_pair) | (?&comment) )
|
||||
(?<ctext> [^\x{9}\x{10}\x{13}\x{7f}-\x{ff}\ ()\\] )
|
||||
|
@ -9564,4 +9564,200 @@ No match
|
|||
|
||||
# --------------------------------------------------------------------------
|
||||
|
||||
# This pattern uses named groups to match default PCRE2 patterns. It's another
|
||||
# heavy test for named subpatterns. Once again, code slash as \x{2f} and escape
|
||||
# $ even in classes so that this works with pcre2test.
|
||||
|
||||
/(?sx)(?(DEFINE)
|
||||
|
||||
(?<assertion> (?&simple_assertion) | (?&lookaround) )
|
||||
|
||||
(?<atomic_group> \( \? > (?&regex) \) )
|
||||
|
||||
(?<back_reference> \\ \d+ |
|
||||
\\g (?: [+-]?\d+ | \{ (?: [+-]?\d+ | (?&groupname) ) \} ) |
|
||||
\\k <(?&groupname)> |
|
||||
\\k '(?&groupname)' |
|
||||
\\k \{ (?&groupname) \} |
|
||||
\( \? P= (?&groupname) \) )
|
||||
|
||||
(?<branch> (?:(?&assertion) |
|
||||
(?&callout) |
|
||||
(?&comment) |
|
||||
(?&option_setting) |
|
||||
(?&qualified_item) |
|
||||
(?&quoted_string) |
|
||||
(?&quoted_string_empty) |
|
||||
(?&special_escape) |
|
||||
(?&verb)
|
||||
)* )
|
||||
|
||||
(?<callout> \(\?C (?: \d+ |
|
||||
(?: (?<D>["'`^%\#\$])
|
||||
(?: \k'D'\k'D' | (?!\k'D') . )* \k'D' |
|
||||
\{ (?: \}\} | [^}]*+ )* \} )
|
||||
)? \) )
|
||||
|
||||
(?<capturing_group> \( (?: \? P? < (?&groupname) > | \? ' (?&groupname) ' )?
|
||||
(?&regex) \) )
|
||||
|
||||
(?<character_class> \[ \^?+ (?: \] (?&class_item)* | (?&class_item)+ ) \] )
|
||||
|
||||
(?<character_type> (?! \\N\{\w+\} ) \\ [dDsSwWhHvVRN] )
|
||||
|
||||
(?<class_item> (?: \[ : (?:
|
||||
alnum|alpha|ascii|blank|cntrl|digit|graph|lower|print|
|
||||
punct|space|upper|word|xdigit
|
||||
) : \] |
|
||||
(?&quoted_string) |
|
||||
(?&quoted_string_empty) |
|
||||
(?&escaped_character) |
|
||||
(?&character_type) |
|
||||
[^]] ) )
|
||||
|
||||
(?<comment> \(\?\# [^)]* \) | (?&quoted_string_empty) | \\E )
|
||||
|
||||
(?<condition> (?: \( [+-]? \d+ \) |
|
||||
\( < (?&groupname) > \) |
|
||||
\( ' (?&groupname) ' \) |
|
||||
\( R \d* \) |
|
||||
\( R & (?&groupname) \) |
|
||||
\( (?&groupname) \) |
|
||||
\( DEFINE \) |
|
||||
\( VERSION >?=\d+(?:\.\d\d?)? \) |
|
||||
(?&callout)?+ (?&comment)* (?&lookaround) ) )
|
||||
|
||||
(?<conditional_group> \(\? (?&condition) (?&branch) (?: \| (?&branch) )? \) )
|
||||
|
||||
(?<delimited_regex> (?<delimiter> [-\x{2f}!"'`=_:;,%&@~]) (?&regex)
|
||||
\k'delimiter' .* )
|
||||
|
||||
(?<escaped_character> \\ (?: 0[0-7]{1,2} | [0-7]{1,3} | o\{ [0-7]+ \} |
|
||||
x \{ (*COMMIT) [[:xdigit:]]* \} | x [[:xdigit:]]{0,2} |
|
||||
[aefnrt] | c[[:print:]] |
|
||||
[^[:alnum:]] ) )
|
||||
|
||||
(?<group> (?&capturing_group) | (?&non_capturing_group) |
|
||||
(?&resetting_group) | (?&atomic_group) |
|
||||
(?&conditional_group) )
|
||||
|
||||
(?<groupname> [a-zA-Z_]\w* )
|
||||
|
||||
(?<literal_character> (?! (?&range_qualifier) ) [^[()|*+?.\$\\] )
|
||||
|
||||
(?<lookaround> \(\? (?: = | ! | <= | <! ) (?&regex) \) )
|
||||
|
||||
(?<non_capturing_group> \(\? [iJmnsUx-]* : (?&regex) \) )
|
||||
|
||||
(?<option_setting> \(\? [iJmnsUx-]* \) )
|
||||
|
||||
(?<qualified_item> (?:\. |
|
||||
(?&lookaround) |
|
||||
(?&back_reference) |
|
||||
(?&character_class) |
|
||||
(?&character_type) |
|
||||
(?&escaped_character) |
|
||||
(?&group) |
|
||||
(?&subroutine_call) |
|
||||
(?&literal_character) |
|
||||
(?&quoted_string)
|
||||
) (?&comment)? (?&qualifier)? )
|
||||
|
||||
(?<qualifier> (?: [?*+] | (?&range_qualifier) ) [+?]? )
|
||||
|
||||
(?<quoted_string> (?: \\Q (?: (?!\\E | \k'delimiter') . )++ (?: \\E | ) ) )
|
||||
|
||||
(?<quoted_string_empty> \\Q\\E )
|
||||
|
||||
(?<range_qualifier> \{ (?: \d+ (?: , \d* )? | , \d+ ) \} )
|
||||
|
||||
(?<regex> (?&start_item)* (?&branch) (?: \| (?&branch) )* )
|
||||
|
||||
(?<resetting_group> \( \? \| (?&regex) \) )
|
||||
|
||||
(?<simple_assertion> \^ | \$ | \\A | \\b | \\B | \\G | \\z | \\Z )
|
||||
|
||||
(?<special_escape> \\K )
|
||||
|
||||
(?<start_item> \( \* (?:
|
||||
ANY |
|
||||
ANYCRLF |
|
||||
BSR_ANYCRLF |
|
||||
BSR_UNICODE |
|
||||
CR |
|
||||
CRLF |
|
||||
LF |
|
||||
LIMIT_MATCH=\d+ |
|
||||
LIMIT_DEPTH=\d+ |
|
||||
LIMIT_HEAP=\d+ |
|
||||
NOTEMPTY |
|
||||
NOTEMPTY_ATSTART |
|
||||
NO_AUTO_POSSESS |
|
||||
NO_DOTSTAR_ANCHOR |
|
||||
NO_JIT |
|
||||
NO_START_OPT |
|
||||
NUL |
|
||||
UTF |
|
||||
UCP ) \) )
|
||||
|
||||
(?<subroutine_call> (?: \(\?R\) | \(\?[+-]?\d+\) |
|
||||
\(\? (?: & | P> ) (?&groupname) \) |
|
||||
\\g < (?&groupname) > |
|
||||
\\g ' (?&groupname) ' |
|
||||
\\g < [+-]? \d+ > |
|
||||
\\g ' [+-]? \d+ ) )
|
||||
|
||||
(?<verb> \(\* (?: ACCEPT | FAIL | F | COMMIT |
|
||||
(?:MARK)?:(?&verbname) |
|
||||
(?:PRUNE|SKIP|THEN) (?: : (?&verbname)? )? ) \) )
|
||||
|
||||
(?<verbname> [^)]+ )
|
||||
|
||||
) # End DEFINE
|
||||
# Kick it all off...
|
||||
^(?&delimited_regex)$/subject_literal,jitstack=256
|
||||
/^(a)(b)(c)(d)(e)(f)(g)(h)(i)(j)(k)\11*(\3\4)\1(?#)2$/
|
||||
0: /^(a)(b)(c)(d)(e)(f)(g)(h)(i)(j)(k)\11*(\3\4)\1(?#)2$/
|
||||
/(cat(a(ract|tonic)|erpillar)) \1()2(3)/
|
||||
0: /(cat(a(ract|tonic)|erpillar)) \1()2(3)/
|
||||
/^From +([^ ]+) +[a-zA-Z][a-zA-Z][a-zA-Z] +[a-zA-Z][a-zA-Z][a-zA-Z] +[0-9]?[0-9] +[0-9][0-9]:[0-9][0-9]/
|
||||
0: /^From +([^ ]+) +[a-zA-Z][a-zA-Z][a-zA-Z] +[a-zA-Z][a-zA-Z][a-zA-Z] +[0-9]?[0-9] +[0-9][0-9]:[0-9][0-9]/
|
||||
/^From\s+\S+\s+([a-zA-Z]{3}\s+){2}\d{1,2}\s+\d\d:\d\d/
|
||||
0: /^From\s+\S+\s+([a-zA-Z]{3}\s+){2}\d{1,2}\s+\d\d:\d\d/
|
||||
/<tr([\w\W\s\d][^<>]{0,})><TD([\w\W\s\d][^<>]{0,})>([\d]{0,}\.)(.*)((<BR>([\w\W\s\d][^<>]{0,})|[\s]{0,}))<\/a><\/TD><TD([\w\W\s\d][^<>]{0,})>([\w\W\s\d][^<>]{0,})<\/TD><TD([\w\W\s\d][^<>]{0,})>([\w\W\s\d][^<>]{0,})<\/TD><\/TR>/is
|
||||
0: /<tr([\w\W\s\d][^<>]{0,})><TD([\w\W\s\d][^<>]{0,})>([\d]{0,}\.)(.*)((<BR>([\w\W\s\d][^<>]{0,})|[\s]{0,}))<\/a><\/TD><TD([\w\W\s\d][^<>]{0,})>([\w\W\s\d][^<>]{0,})<\/TD><TD([\w\W\s\d][^<>]{0,})>([\w\W\s\d][^<>]{0,})<\/TD><\/TR>/is
|
||||
/^(?(DEFINE) (?<A> a) (?<B> b) ) (?&A) (?&B) /
|
||||
0: /^(?(DEFINE) (?<A> a) (?<B> b) ) (?&A) (?&B) /
|
||||
/(?(DEFINE)(?<byte>2[0-4]\d|25[0-5]|1\d\d|[1-9]?\d))\b(?&byte)(\.(?&byte)){3}/
|
||||
0: /(?(DEFINE)(?<byte>2[0-4]\d|25[0-5]|1\d\d|[1-9]?\d))\b(?&byte)(\.(?&byte)){3}/
|
||||
/\b(?&byte)(\.(?&byte)){3}(?(DEFINE)(?<byte>2[0-4]\d|25[0-5]|1\d\d|[1-9]?\d))/
|
||||
0: /\b(?&byte)(\.(?&byte)){3}(?(DEFINE)(?<byte>2[0-4]\d|25[0-5]|1\d\d|[1-9]?\d))/
|
||||
/^(\w++|\s++)*$/
|
||||
0: /^(\w++|\s++)*$/
|
||||
/a+b?(*THEN)c+(*FAIL)/
|
||||
0: /a+b?(*THEN)c+(*FAIL)/
|
||||
/(A (A|B(*ACCEPT)|C) D)(E)/x
|
||||
0: /(A (A|B(*ACCEPT)|C) D)(E)/x
|
||||
/^\W*+(?:((.)\W*+(?1)\W*+\2|)|((.)\W*+(?3)\W*+\4|\W*+.\W*+))\W*+$/i
|
||||
0: /^\W*+(?:((.)\W*+(?1)\W*+\2|)|((.)\W*+(?3)\W*+\4|\W*+.\W*+))\W*+$/i
|
||||
/A(*PRUNE)B(*SKIP)C(*THEN)D(*COMMIT)E(*F)F(*FAIL)G(?!)H(*ACCEPT)I/B
|
||||
0: /A(*PRUNE)B(*SKIP)C(*THEN)D(*COMMIT)E(*F)F(*FAIL)G(?!)H(*ACCEPT)I/B
|
||||
/(?C`a``b`)(?C'a''b')(?C"a""b")(?C^a^^b^)(?C%a%%b%)(?C#a##b#)(?C$a$$b$)(?C{a}}b})/B,callout_info
|
||||
0: /(?C`a``b`)(?C'a''b')(?C"a""b")(?C^a^^b^)(?C%a%%b%)(?C#a##b#)(?C$a$$b$)(?C{a}}b})/B,callout_info
|
||||
/(?sx)(?(DEFINE)(?<assertion> (?&simple_assertion) | (?&lookaround) )(?<atomic_group> \( \? > (?&regex) \) )(?<back_reference> \\ \d+ | \\g (?: [+-]?\d+ | \{ (?: [+-]?\d+ | (?&groupname) ) \} ) | \\k <(?&groupname)> | \\k '(?&groupname)' | \\k \{ (?&groupname) \} | \( \? P= (?&groupname) \) )(?<branch> (?:(?&assertion) | (?&callout) | (?&comment) | (?&option_setting) | (?&qualified_item) | (?&quoted_string) | (?&quoted_string_empty) | (?&special_escape) | (?&verb) )* )(?<callout> \(\?C (?: \d+ | (?: (?<D>["'`^%\#\$]) (?: \k'D'\k'D' | (?!\k'D') . )* \k'D' | \{ (?: \}\} | [^}]*+ )* \} ) )? \) )(?<capturing_group> \( (?: \? P? < (?&groupname) > | \? ' (?&groupname) ' )? (?&regex) \) )(?<character_class> \[ \^?+ (?: \] (?&class_item)* | (?&class_item)+ ) \] )(?<character_type> (?! \\N\{\w+\} ) \\ [dDsSwWhHvVRN] )(?<class_item> (?: \[ : (?: alnum|alpha|ascii|blank|cntrl|digit|graph|lower|print| punct|space|upper|word|xdigit ) : \] | (?&quoted_string) | (?&quoted_string_empty) | (?&escaped_character) | (?&character_type) | [^]] ) )(?<comment> \(\?\# [^)]* \) | (?&quoted_string_empty) | \\E )(?<condition> (?: \( [+-]? \d+ \) | \( < (?&groupname) > \) | \( ' (?&groupname) ' \) | \( R \d* \) | \( R & (?&groupname) \) | \( (?&groupname) \) | \( DEFINE \) | \( VERSION >?=\d+(?:\.\d\d?)? \) | (?&callout)?+ (?&comment)* (?&lookaround) ) )(?<conditional_group> \(\? (?&condition) (?&branch) (?: \| (?&branch) )? \) )(?<delimited_regex> (?<delimiter> [-\x{2f}!"'`=_:;,%&@~]) (?&regex) \k'delimiter' .* )(?<escaped_character> \\ (?: 0[0-7]{1,2} | [0-7]{1,3} | o\{ [0-7]+ \} | x \{ (*COMMIT) [[:xdigit:]]* \} | x [[:xdigit:]]{0,2} | [aefnrt] | c[[:print:]] | [^[:alnum:]] ) )(?<group> (?&capturing_group) | (?&non_capturing_group) | (?&resetting_group) | (?&atomic_group) | (?&conditional_group) )(?<groupname> [a-zA-Z_]\w* )(?<literal_character> (?! (?&range_qualifier) ) [^[()|*+?.\$\\] )(?<lookaround> \(\? (?: = | ! | <= | <! ) (?&regex) \) )(?<non_capturing_group> \(\? [iJmnsUx-]* : (?&regex) \) )(?<option_setting> \(\? 
[iJmnsUx-]* \) )(?<qualified_item> (?:\. | (?&lookaround) | (?&back_reference) | (?&character_class) | (?&character_type) | (?&escaped_character) | (?&group) | (?&subroutine_call) | (?&literal_character) | (?&quoted_string) ) (?&comment)? (?&qualifier)? )(?<qualifier> (?: [?*+] | (?&range_qualifier) ) [+?]? )(?<quoted_string> (?: \\Q (?: (?!\\E | \k'delimiter') . )++ (?: \\E | ) ) ) (?<quoted_string_empty> \\Q\\E ) (?<range_qualifier> \{ (?: \d+ (?: , \d* )? | , \d+ ) \} )(?<regex> (?&start_item)* (?&branch) (?: \| (?&branch) )* )(?<resetting_group> \( \? \| (?&regex) \) )(?<simple_assertion> \^ | \$ | \\A | \\b | \\B | \\G | \\z | \\Z )(?<special_escape> \\K )(?<start_item> \( \* (?: ANY | ANYCRLF | BSR_ANYCRLF | BSR_UNICODE | CR | CRLF | LF | LIMIT_MATCH=\d+ | LIMIT_DEPTH=\d+ | LIMIT_HEAP=\d+ | NOTEMPTY | NOTEMPTY_ATSTART | NO_AUTO_POSSESS | NO_DOTSTAR_ANCHOR | NO_JIT | NO_START_OPT | NUL | UTF | UCP ) \) )(?<subroutine_call> (?: \(\?R\) | \(\?[+-]?\d+\) | \(\? (?: & | P> ) (?&groupname) \) | \\g < (?&groupname) > | \\g ' (?&groupname) ' | \\g < [+-]? \d+ > | \\g ' [+-]? \d+ ) )(?<verb> \(\* (?: ACCEPT | FAIL | F | COMMIT | (?:MARK)?:(?&verbname) | (?:PRUNE|SKIP|THEN) (?: : (?&verbname)? )? ) \) )(?<verbname> [^)]+ ))^(?&delimited_regex)$/
|
||||
0: /(?sx)(?(DEFINE)(?<assertion> (?&simple_assertion) | (?&lookaround) )(?<atomic_group> \( \? > (?&regex) \) )(?<back_reference> \\ \d+ | \\g (?: [+-]?\d+ | \{ (?: [+-]?\d+ | (?&groupname) ) \} ) | \\k <(?&groupname)> | \\k '(?&groupname)' | \\k \{ (?&groupname) \} | \( \? P= (?&groupname) \) )(?<branch> (?:(?&assertion) | (?&callout) | (?&comment) | (?&option_setting) | (?&qualified_item) | (?&quoted_string) | (?&quoted_string_empty) | (?&special_escape) | (?&verb) )* )(?<callout> \(\?C (?: \d+ | (?: (?<D>["'`^%\#\$]) (?: \k'D'\k'D' | (?!\k'D') . )* \k'D' | \{ (?: \}\} | [^}]*+ )* \} ) )? \) )(?<capturing_group> \( (?: \? P? < (?&groupname) > | \? ' (?&groupname) ' )? (?&regex) \) )(?<character_class> \[ \^?+ (?: \] (?&class_item)* | (?&class_item)+ ) \] )(?<character_type> (?! \\N\{\w+\} ) \\ [dDsSwWhHvVRN] )(?<class_item> (?: \[ : (?: alnum|alpha|ascii|blank|cntrl|digit|graph|lower|print| punct|space|upper|word|xdigit ) : \] | (?&quoted_string) | (?&quoted_string_empty) | (?&escaped_character) | (?&character_type) | [^]] ) )(?<comment> \(\?\# [^)]* \) | (?&quoted_string_empty) | \\E )(?<condition> (?: \( [+-]? \d+ \) | \( < (?&groupname) > \) | \( ' (?&groupname) ' \) | \( R \d* \) | \( R & (?&groupname) \) | \( (?&groupname) \) | \( DEFINE \) | \( VERSION >?=\d+(?:\.\d\d?)? \) | (?&callout)?+ (?&comment)* (?&lookaround) ) )(?<conditional_group> \(\? (?&condition) (?&branch) (?: \| (?&branch) )? \) )(?<delimited_regex> (?<delimiter> [-\x{2f}!"'`=_:;,%&@~]) (?&regex) \k'delimiter' .* )(?<escaped_character> \\ (?: 0[0-7]{1,2} | [0-7]{1,3} | o\{ [0-7]+ \} | x \{ (*COMMIT) [[:xdigit:]]* \} | x [[:xdigit:]]{0,2} | [aefnrt] | c[[:print:]] | [^[:alnum:]] ) )(?<group> (?&capturing_group) | (?&non_capturing_group) | (?&resetting_group) | (?&atomic_group) | (?&conditional_group) )(?<groupname> [a-zA-Z_]\w* )(?<literal_character> (?! (?&range_qualifier) ) [^[()|*+?.\$\\] )(?<lookaround> \(\? (?: = | ! | <= | <! ) (?&regex) \) )(?<non_capturing_group> \(\? [iJmnsUx-]* : (?&regex) \) )(?<option_setting> \(\? 
[iJmnsUx-]* \) )(?<qualified_item> (?:\. | (?&lookaround) | (?&back_reference) | (?&character_class) | (?&character_type) | (?&escaped_character) | (?&group) | (?&subroutine_call) | (?&literal_character) | (?&quoted_string) ) (?&comment)? (?&qualifier)? )(?<qualifier> (?: [?*+] | (?&range_qualifier) ) [+?]? )(?<quoted_string> (?: \\Q (?: (?!\\E | \k'delimiter') . )++ (?: \\E | ) ) ) (?<quoted_string_empty> \\Q\\E ) (?<range_qualifier> \{ (?: \d+ (?: , \d* )? | , \d+ ) \} )(?<regex> (?&start_item)* (?&branch) (?: \| (?&branch) )* )(?<resetting_group> \( \? \| (?&regex) \) )(?<simple_assertion> \^ | \$ | \\A | \\b | \\B | \\G | \\z | \\Z )(?<special_escape> \\K )(?<start_item> \( \* (?: ANY | ANYCRLF | BSR_ANYCRLF | BSR_UNICODE | CR | CRLF | LF | LIMIT_MATCH=\d+ | LIMIT_DEPTH=\d+ | LIMIT_HEAP=\d+ | NOTEMPTY | NOTEMPTY_ATSTART | NO_AUTO_POSSESS | NO_DOTSTAR_ANCHOR | NO_JIT | NO_START_OPT | NUL | UTF | UCP ) \) )(?<subroutine_call> (?: \(\?R\) | \(\?[+-]?\d+\) | \(\? (?: & | P> ) (?&groupname) \) | \\g < (?&groupname) > | \\g ' (?&groupname) ' | \\g < [+-]? \d+ > | \\g ' [+-]? \d+ ) )(?<verb> \(\* (?: ACCEPT | FAIL | F | COMMIT | (?:MARK)?:(?&verbname) | (?:PRUNE|SKIP|THEN) (?: : (?&verbname)? )? ) \) )(?<verbname> [^)]+ ))^(?&delimited_regex)$/
|
||||
\= Expect no match
|
||||
/((?(?C'')\QX\E(?!((?(?C'')(?!X=X));=)r*X=X));=)/
|
||||
No match
|
||||
/(?:(?(2y)a|b)(X))+/
|
||||
No match
|
||||
/a(*MARK)b/
|
||||
No match
|
||||
/a(*CR)b/
|
||||
No match
|
||||
/(?P<abn>(?P=abn)(?<badstufxxx)/
|
||||
No match
|
||||
|
||||
# --------------------------------------------------------------------------
|
||||
|
||||
# End of testinput1
|
||||
|
|