Add replication feature for patterns to pcre2test.
parent
d1b4d99bc5
commit
4ce7652a0e
|
@ -222,6 +222,10 @@ overflow.
|
||||||
|
|
||||||
64. Improve error message for overly-complicated patterns.
|
64. Improve error message for overly-complicated patterns.
|
||||||
|
|
||||||
|
65. Implemented an optional replication feature for patterns in pcre2test, to
|
||||||
|
make it easier to test long repetitive patterns. The tests for 63 above are
|
||||||
|
converted to use the new feature.
|
||||||
|
|
||||||
|
|
||||||
Version 10.20 30-June-2015
|
Version 10.20 30-June-2015
|
||||||
--------------------------
|
--------------------------
|
||||||
|
|
|
@ -1,4 +1,4 @@
|
||||||
.TH PCRE2TEST 1 "17 October 2015" "PCRE 10.21"
|
.TH PCRE2TEST 1 "30 October 2015" "PCRE 10.21"
|
||||||
.SH NAME
|
.SH NAME
|
||||||
pcre2test - a program for testing Perl-compatible regular expressions.
|
pcre2test - a program for testing Perl-compatible regular expressions.
|
||||||
.SH SYNOPSIS
|
.SH SYNOPSIS
|
||||||
|
@ -218,9 +218,9 @@ Each subject line is matched separately and independently. If you want to do
|
||||||
multi-line matches, you have to use the \en escape sequence (or \er or \er\en,
|
multi-line matches, you have to use the \en escape sequence (or \er or \er\en,
|
||||||
etc., depending on the newline setting) in a single line of input to encode the
|
etc., depending on the newline setting) in a single line of input to encode the
|
||||||
newline sequences. There is no limit on the length of subject lines; the input
|
newline sequences. There is no limit on the length of subject lines; the input
|
||||||
buffer is automatically extended if it is too small. There is a replication
|
buffer is automatically extended if it is too small. There are replication
|
||||||
feature that makes it possible to generate long subject lines without having to
|
features that make it possible to generate long repetitive pattern or subject
|
||||||
supply them explicitly.
|
lines without having to supply them explicitly.
|
||||||
.P
|
.P
|
||||||
An empty line or the end of the file signals the end of the subject lines for a
|
An empty line or the end of the file signals the end of the subject lines for a
|
||||||
test, at which point a new pattern or command line is expected if there is
|
test, at which point a new pattern or command line is expected if there is
|
||||||
|
@ -460,10 +460,10 @@ a real empty line terminates the data input.
|
||||||
.SH "PATTERN MODIFIERS"
|
.SH "PATTERN MODIFIERS"
|
||||||
.rs
|
.rs
|
||||||
.sp
|
.sp
|
||||||
There are three types of modifier that can appear in pattern lines, two of
|
There are several types of modifier that can appear in pattern lines. Except
|
||||||
which may also be used in a \fB#pattern\fP command. A pattern's modifier list
|
where noted below, they may also be used in \fB#pattern\fP commands. A
|
||||||
can add to or override default modifiers that were set by a previous
|
pattern's modifier list can add to or override default modifiers that were set
|
||||||
\fB#pattern\fP command.
|
by a previous \fB#pattern\fP command.
|
||||||
.
|
.
|
||||||
.
|
.
|
||||||
.\" HTML <a name="optionmodifiers"></a>
|
.\" HTML <a name="optionmodifiers"></a>
|
||||||
|
@ -629,6 +629,32 @@ PCRE2_ZERO_TERMINATED. However, for patterns specified in hexadecimal, the
|
||||||
actual length of the pattern is passed.
|
actual length of the pattern is passed.
|
||||||
.
|
.
|
||||||
.
|
.
|
||||||
|
.SS "Generating long repetitive patterns"
|
||||||
|
.rs
|
||||||
|
.sp
|
||||||
|
Some tests use long patterns that are very repetitive. Instead of creating a
|
||||||
|
very long input line for such a pattern, you can use a special repetition
|
||||||
|
feature, similar to the one described for subject lines above. If the
|
||||||
|
\fBexpand\fP modifier is present on a pattern, parts of the pattern that have
|
||||||
|
the form
|
||||||
|
.sp
|
||||||
|
\e[<characters>]{<count>}
|
||||||
|
.sp
|
||||||
|
are expanded before the pattern is passed to \fBpcre2_compile()\fP. For
|
||||||
|
example, \e[AB]{6000} is expanded to "ABAB..." 6000 times. This construction
|
||||||
|
cannot be nested. An initial "\e[" sequence is recognized only if "]{" followed
|
||||||
|
by decimal digits and "}" is found later in the pattern. If not, the characters
|
||||||
|
remain in the pattern unaltered.
|
||||||
|
.P
|
||||||
|
If part of an expanded pattern looks like an expansion, but is really part of
|
||||||
|
the actual pattern, unwanted expansion can be avoided by giving two values in
|
||||||
|
the quantifier. For example, \e[AB]{6000,6000} is not recognized as an
|
||||||
|
expansion item.
|
||||||
|
.P
|
||||||
|
If the \fBinfo\fP modifier is set on an expanded pattern, the result of the
|
||||||
|
expansion is included in the information that is output.
|
||||||
|
.
|
||||||
|
.
|
||||||
.SS "JIT compilation"
|
.SS "JIT compilation"
|
||||||
.rs
|
.rs
|
||||||
.sp
|
.sp
|
||||||
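The expansion rule documented in the man page hunk above can be sketched as a small standalone function. This is only an illustration under stated assumptions: `expand_sketch` is a hypothetical name, it works on a plain C string with `malloc`/`realloc` rather than on pcre2test's `buffer`/`pbuffer8` pair, and it reports a zero repeat count by returning NULL instead of printing a message.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not part of pcre2test. Returns a malloc'd copy of `in`
   with every \[<characters>]{<count>} item replicated, or NULL on a zero
   repeat count or allocation failure. The caller frees the result. */
static char *expand_sketch(const char *in)
{
  size_t cap = strlen(in) + 1, len = 0;
  char *out = malloc(cap);
  assert(in != NULL);
  if (out == NULL) return NULL;

  for (const char *pp = in; *pp != 0; pp++)
    {
    const char *pc = pp;       /* start of the text to copy */
    size_t length = 1;         /* default: copy one character once */
    unsigned long count = 1;

    if (pp[0] == '\\' && pp[1] == '[')
      {
      for (const char *pe = pp + 2; *pe != 0; pe++)
        {
        if (pe[0] != ']' || pe[1] != '{') continue;
        const char *q = pe + 2;
        unsigned long n = 0;
        while (isdigit((unsigned char)*q)) n = n * 10 + (unsigned long)(*q++ - '0');
        if (*q == '}')         /* only a single count is an expansion item */
          {
          if (n == 0) { free(out); return NULL; }
          pc = pp + 2;
          length = (size_t)(pe - pc);
          count = n;
          pp = q;              /* the loop increment steps past '}' */
          break;
          }
        }
      }

    /* Grow the output if needed, leaving room for the terminating zero. */
    size_t need = len + count * length + 1;
    if (need > cap)
      {
      while (cap < need) cap *= 2;
      char *tmp = realloc(out, cap);
      if (tmp == NULL) { free(out); return NULL; }
      out = tmp;
      }

    while (count-- > 0) { memcpy(out + len, pc, length); len += length; }
    }

  out[len] = 0;
  return out;
}
```

With this sketch, `x\[AB]{3}y` becomes `xABABABy`, while `\[AB]{2,2}` is copied through unchanged because its quantifier has two values.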
|
@ -805,8 +831,9 @@ are mutually exclusive.
|
||||||
.sp
|
.sp
|
||||||
The following modifiers are really subject modifiers, and are described below.
|
The following modifiers are really subject modifiers, and are described below.
|
||||||
However, they may be included in a pattern's modifier list, in which case they
|
However, they may be included in a pattern's modifier list, in which case they
|
||||||
are applied to every subject line that is processed with that pattern. They do
|
are applied to every subject line that is processed with that pattern. They may
|
||||||
not affect the compilation process.
|
not appear in \fB#pattern\fP commands. These modifiers do not affect the
|
||||||
|
compilation process.
|
||||||
.sp
|
.sp
|
||||||
aftertext show text after match
|
aftertext show text after match
|
||||||
allaftertext show text after captures
|
allaftertext show text after captures
|
||||||
|
@ -1560,6 +1587,6 @@ Cambridge, England.
|
||||||
.rs
|
.rs
|
||||||
.sp
|
.sp
|
||||||
.nf
|
.nf
|
||||||
Last updated: 17 October 2015
|
Last updated: 30 October 2015
|
||||||
Copyright (c) 1997-2015 University of Cambridge.
|
Copyright (c) 1997-2015 University of Cambridge.
|
||||||
.fi
|
.fi
|
||||||
|
|
248
src/pcre2test.c
|
@ -379,7 +379,7 @@ enum { MOD_CTC, /* Applies to a compile context */
|
||||||
MOD_NL, /* Is a newline value */
|
MOD_NL, /* Is a newline value */
|
||||||
MOD_NN, /* Is a number or a name; more than one may occur */
|
MOD_NN, /* Is a number or a name; more than one may occur */
|
||||||
MOD_OPT, /* Is an option bit */
|
MOD_OPT, /* Is an option bit */
|
||||||
MOD_SIZ, /* Is a PCRE2_SIZE value */
|
MOD_SIZ, /* Is a PCRE2_SIZE value */
|
||||||
MOD_STR }; /* Is a string */
|
MOD_STR }; /* Is a string */
|
||||||
|
|
||||||
/* Control bits. Some apply to compiling, some to matching, but some can be set
|
/* Control bits. Some apply to compiling, some to matching, but some can be set
|
||||||
|
@ -395,22 +395,23 @@ either on a pattern or a data line, so they must all be distinct. */
|
||||||
#define CTL_CALLOUT_INFO 0x00000080u
|
#define CTL_CALLOUT_INFO 0x00000080u
|
||||||
#define CTL_CALLOUT_NONE 0x00000100u
|
#define CTL_CALLOUT_NONE 0x00000100u
|
||||||
#define CTL_DFA 0x00000200u
|
#define CTL_DFA 0x00000200u
|
||||||
#define CTL_FINDLIMITS 0x00000400u
|
#define CTL_EXPAND 0x00000400u
|
||||||
#define CTL_FULLBINCODE 0x00000800u
|
#define CTL_FINDLIMITS 0x00000800u
|
||||||
#define CTL_GETALL 0x00001000u
|
#define CTL_FULLBINCODE 0x00001000u
|
||||||
#define CTL_GLOBAL 0x00002000u
|
#define CTL_GETALL 0x00002000u
|
||||||
#define CTL_HEXPAT 0x00004000u
|
#define CTL_GLOBAL 0x00004000u
|
||||||
#define CTL_INFO 0x00008000u
|
#define CTL_HEXPAT 0x00008000u
|
||||||
#define CTL_JITFAST 0x00010000u
|
#define CTL_INFO 0x00010000u
|
||||||
#define CTL_JITVERIFY 0x00020000u
|
#define CTL_JITFAST 0x00020000u
|
||||||
#define CTL_MARK 0x00040000u
|
#define CTL_JITVERIFY 0x00040000u
|
||||||
#define CTL_MEMORY 0x00080000u
|
#define CTL_MARK 0x00080000u
|
||||||
#define CTL_NULLCONTEXT 0x00100000u
|
#define CTL_MEMORY 0x00100000u
|
||||||
#define CTL_POSIX 0x00200000u
|
#define CTL_NULLCONTEXT 0x00200000u
|
||||||
#define CTL_PUSH 0x00400000u
|
#define CTL_POSIX 0x00400000u
|
||||||
#define CTL_STARTCHAR 0x00800000u
|
#define CTL_PUSH 0x00800000u
|
||||||
#define CTL_SUBSTITUTE_EXTENDED 0x01000000u
|
#define CTL_STARTCHAR 0x01000000u
|
||||||
#define CTL_ZERO_TERMINATE 0x02000000u
|
#define CTL_SUBSTITUTE_EXTENDED 0x02000000u
|
||||||
|
#define CTL_ZERO_TERMINATE 0x04000000u
|
||||||
|
|
||||||
#define CTL_BSR_SET 0x80000000u /* This is informational */
|
#define CTL_BSR_SET 0x80000000u /* This is informational */
|
||||||
#define CTL_NL_SET 0x40000000u /* This is informational */
|
#define CTL_NL_SET 0x40000000u /* This is informational */
|
||||||
|
@ -520,6 +521,7 @@ static modstruct modlist[] = {
|
||||||
{ "dollar_endonly", MOD_PAT, MOD_OPT, PCRE2_DOLLAR_ENDONLY, PO(options) },
|
{ "dollar_endonly", MOD_PAT, MOD_OPT, PCRE2_DOLLAR_ENDONLY, PO(options) },
|
||||||
{ "dotall", MOD_PATP, MOD_OPT, PCRE2_DOTALL, PO(options) },
|
{ "dotall", MOD_PATP, MOD_OPT, PCRE2_DOTALL, PO(options) },
|
||||||
{ "dupnames", MOD_PATP, MOD_OPT, PCRE2_DUPNAMES, PO(options) },
|
{ "dupnames", MOD_PATP, MOD_OPT, PCRE2_DUPNAMES, PO(options) },
|
||||||
|
{ "expand", MOD_PAT, MOD_CTL, CTL_EXPAND, PO(control) },
|
||||||
{ "extended", MOD_PATP, MOD_OPT, PCRE2_EXTENDED, PO(options) },
|
{ "extended", MOD_PATP, MOD_OPT, PCRE2_EXTENDED, PO(options) },
|
||||||
{ "find_limits", MOD_DAT, MOD_CTL, CTL_FINDLIMITS, DO(control) },
|
{ "find_limits", MOD_DAT, MOD_CTL, CTL_FINDLIMITS, DO(control) },
|
||||||
{ "firstline", MOD_PAT, MOD_OPT, PCRE2_FIRSTLINE, PO(options) },
|
{ "firstline", MOD_PAT, MOD_OPT, PCRE2_FIRSTLINE, PO(options) },
|
||||||
|
@ -606,11 +608,18 @@ static modstruct modlist[] = {
|
||||||
|
|
||||||
#define NOTPOP_CONTROLS (CTL_HEXPAT|CTL_POSIX|CTL_PUSH)
|
#define NOTPOP_CONTROLS (CTL_HEXPAT|CTL_POSIX|CTL_PUSH)
|
||||||
|
|
||||||
/* Controls that are mutually exclusive. */
|
/* Pattern controls that are mutually exclusive. */
|
||||||
|
|
||||||
|
static uint32_t exclusive_pat_controls[] = {
|
||||||
|
CTL_POSIX | CTL_HEXPAT,
|
||||||
|
CTL_POSIX | CTL_PUSH,
|
||||||
|
CTL_EXPAND | CTL_HEXPAT };
|
||||||
|
|
||||||
|
/* Data controls that are mutually exclusive. */
|
||||||
|
|
||||||
static uint32_t exclusive_dat_controls[] = {
|
static uint32_t exclusive_dat_controls[] = {
|
||||||
CTL_ALLUSEDTEXT | CTL_STARTCHAR,
|
CTL_ALLUSEDTEXT | CTL_STARTCHAR,
|
||||||
CTL_FINDLIMITS | CTL_NULLCONTEXT };
|
CTL_FINDLIMITS | CTL_NULLCONTEXT };
|
||||||
|
|
||||||
/* Table of single-character abbreviated modifiers. The index field is
|
/* Table of single-character abbreviated modifiers. The index field is
|
||||||
initialized to -1, but the first time the modifier is encountered, it is filled
|
initialized to -1, but the first time the modifier is encountered, it is filled
|
||||||
|
@ -2787,6 +2796,46 @@ else /* 16-bit mode */
|
||||||
}
|
}
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
/*************************************************
|
||||||
|
* Expand input buffers *
|
||||||
|
*************************************************/
|
||||||
|
|
||||||
|
/* This function doubles the size of the input buffer and the buffer for
|
||||||
|
keeping an 8-bit copy of patterns (pbuffer8), and copies the current buffers to
|
||||||
|
the new ones.
|
||||||
|
|
||||||
|
Arguments: none
|
||||||
|
Returns: nothing (aborts if malloc() fails)
|
||||||
|
*/
|
||||||
|
|
||||||
|
static void
|
||||||
|
expand_input_buffers(void)
|
||||||
|
{
|
||||||
|
int new_pbuffer8_size = 2*pbuffer8_size;
|
||||||
|
uint8_t *new_buffer = (uint8_t *)malloc(new_pbuffer8_size);
|
||||||
|
uint8_t *new_pbuffer8 = (uint8_t *)malloc(new_pbuffer8_size);
|
||||||
|
|
||||||
|
if (new_buffer == NULL || new_pbuffer8 == NULL)
|
||||||
|
{
|
||||||
|
fprintf(stderr, "pcre2test: malloc(%d) failed\n", new_pbuffer8_size);
|
||||||
|
exit(1);
|
||||||
|
}
|
||||||
|
|
||||||
|
memcpy(new_buffer, buffer, pbuffer8_size);
|
||||||
|
memcpy(new_pbuffer8, pbuffer8, pbuffer8_size);
|
||||||
|
|
||||||
|
pbuffer8_size = new_pbuffer8_size;
|
||||||
|
|
||||||
|
free(buffer);
|
||||||
|
free(pbuffer8);
|
||||||
|
|
||||||
|
buffer = new_buffer;
|
||||||
|
pbuffer8 = new_pbuffer8;
|
||||||
|
}
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
/*************************************************
|
/*************************************************
|
||||||
* Read or extend an input line *
|
* Read or extend an input line *
|
||||||
*************************************************/
|
*************************************************/
|
||||||
|
@ -2859,29 +2908,11 @@ for (;;)
|
||||||
|
|
||||||
else
|
else
|
||||||
{
|
{
|
||||||
int new_pbuffer8_size = 2*pbuffer8_size;
|
size_t start_offset = start - buffer;
|
||||||
uint8_t *new_buffer = (uint8_t *)malloc(new_pbuffer8_size);
|
size_t here_offset = here - buffer;
|
||||||
uint8_t *new_pbuffer8 = (uint8_t *)malloc(new_pbuffer8_size);
|
expand_input_buffers();
|
||||||
|
start = buffer + start_offset;
|
||||||
if (new_buffer == NULL || new_pbuffer8 == NULL)
|
here = buffer + here_offset;
|
||||||
{
|
|
||||||
fprintf(stderr, "pcre2test: malloc(%d) failed\n", new_pbuffer8_size);
|
|
||||||
exit(1);
|
|
||||||
}
|
|
||||||
|
|
||||||
memcpy(new_buffer, buffer, pbuffer8_size);
|
|
||||||
memcpy(new_pbuffer8, pbuffer8, pbuffer8_size);
|
|
||||||
|
|
||||||
pbuffer8_size = new_pbuffer8_size;
|
|
||||||
|
|
||||||
start = new_buffer + (start - buffer);
|
|
||||||
here = new_buffer + (here - buffer);
|
|
||||||
|
|
||||||
free(buffer);
|
|
||||||
free(pbuffer8);
|
|
||||||
|
|
||||||
buffer = new_buffer;
|
|
||||||
pbuffer8 = new_pbuffer8;
|
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
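The hunk above moves the buffer-doubling code into expand_input_buffers() and has the caller save offsets before the call, because the old buffer is freed inside the function and pointers into it can no longer be re-based afterwards. A minimal sketch of that pattern follows, assuming a single global buffer; `buf`, `buf_size`, `grow` and `append_byte` are hypothetical names, and unlike pcre2test the sketch returns an error instead of calling exit().

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical single buffer standing in for pcre2test's buffer/pbuffer8. */
static uint8_t *buf;
static size_t buf_size;

/* Double the buffer, copy the old contents, and free the old block.
   Returns 0 on success, -1 if malloc() fails (the old buffer is kept). */
static int grow(void)
{
  size_t new_size = 2 * buf_size;
  uint8_t *new_buf = malloc(new_size);
  if (new_buf == NULL) return -1;
  memcpy(new_buf, buf, buf_size);
  free(buf);
  buf = new_buf;
  buf_size = new_size;
  return 0;
}

/* Store one byte at `here`, growing the buffer first when it is full.
   The offset is saved before grow() because `here` points into memory
   that grow() frees; the pointer is rebuilt from the new base afterwards. */
static uint8_t *append_byte(uint8_t *here, uint8_t value)
{
  if (here == buf + buf_size)
    {
    size_t here_offset = (size_t)(here - buf);
    if (grow() != 0) return NULL;
    here = buf + here_offset;
    }
  *here = value;
  return here + 1;
}
```

Saving an offset rather than a pointer difference against the old base keeps the caller independent of how the helper allocates and frees memory.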
|
@ -3111,11 +3142,11 @@ for (;;)
|
||||||
/* Find the end of the item; lose trailing whitespace at end of line. */
|
/* Find the end of the item; lose trailing whitespace at end of line. */
|
||||||
|
|
||||||
for (ep = p; *ep != 0 && *ep != ','; ep++);
|
for (ep = p; *ep != 0 && *ep != ','; ep++);
|
||||||
if (*ep == 0)
|
if (*ep == 0)
|
||||||
{
|
{
|
||||||
while (ep > p && isspace(ep[-1])) ep--;
|
while (ep > p && isspace(ep[-1])) ep--;
|
||||||
*ep = 0;
|
*ep = 0;
|
||||||
}
|
}
|
||||||
|
|
||||||
/* Remember if the first character is '-'. */
|
/* Remember if the first character is '-'. */
|
||||||
|
|
||||||
|
@ -3466,7 +3497,7 @@ Returns: nothing
|
||||||
static void
|
static void
|
||||||
show_controls(uint32_t controls, const char *before)
|
show_controls(uint32_t controls, const char *before)
|
||||||
{
|
{
|
||||||
fprintf(outfile, "%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s",
|
fprintf(outfile, "%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s",
|
||||||
before,
|
before,
|
||||||
((controls & CTL_AFTERTEXT) != 0)? " aftertext" : "",
|
((controls & CTL_AFTERTEXT) != 0)? " aftertext" : "",
|
||||||
((controls & CTL_ALLAFTERTEXT) != 0)? " allaftertext" : "",
|
((controls & CTL_ALLAFTERTEXT) != 0)? " allaftertext" : "",
|
||||||
|
@ -3479,6 +3510,7 @@ fprintf(outfile, "%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s",
|
||||||
((controls & CTL_CALLOUT_INFO) != 0)? " callout_info" : "",
|
((controls & CTL_CALLOUT_INFO) != 0)? " callout_info" : "",
|
||||||
((controls & CTL_CALLOUT_NONE) != 0)? " callout_none" : "",
|
((controls & CTL_CALLOUT_NONE) != 0)? " callout_none" : "",
|
||||||
((controls & CTL_DFA) != 0)? " dfa" : "",
|
((controls & CTL_DFA) != 0)? " dfa" : "",
|
||||||
|
((controls & CTL_EXPAND) != 0)? " expand" : "",
|
||||||
((controls & CTL_FINDLIMITS) != 0)? " find_limits" : "",
|
((controls & CTL_FINDLIMITS) != 0)? " find_limits" : "",
|
||||||
((controls & CTL_FULLBINCODE) != 0)? " fullbincode" : "",
|
((controls & CTL_FULLBINCODE) != 0)? " fullbincode" : "",
|
||||||
((controls & CTL_GETALL) != 0)? " getall" : "",
|
((controls & CTL_GETALL) != 0)? " getall" : "",
|
||||||
|
@ -3490,7 +3522,7 @@ fprintf(outfile, "%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s",
|
||||||
((controls & CTL_MARK) != 0)? " mark" : "",
|
((controls & CTL_MARK) != 0)? " mark" : "",
|
||||||
((controls & CTL_MEMORY) != 0)? " memory" : "",
|
((controls & CTL_MEMORY) != 0)? " memory" : "",
|
||||||
((controls & CTL_NL_SET) != 0)? " newline" : "",
|
((controls & CTL_NL_SET) != 0)? " newline" : "",
|
||||||
((controls & CTL_NULLCONTEXT) != 0)? " null_context" : "",
|
((controls & CTL_NULLCONTEXT) != 0)? " null_context" : "",
|
||||||
((controls & CTL_POSIX) != 0)? " posix" : "",
|
((controls & CTL_POSIX) != 0)? " posix" : "",
|
||||||
((controls & CTL_PUSH) != 0)? " push" : "",
|
((controls & CTL_PUSH) != 0)? " push" : "",
|
||||||
((controls & CTL_STARTCHAR) != 0)? " startchar" : "",
|
((controls & CTL_STARTCHAR) != 0)? " startchar" : "",
|
||||||
|
@ -4262,6 +4294,7 @@ static int
|
||||||
process_pattern(void)
|
process_pattern(void)
|
||||||
{
|
{
|
||||||
BOOL utf;
|
BOOL utf;
|
||||||
|
uint32_t k;
|
||||||
uint8_t *p = buffer;
|
uint8_t *p = buffer;
|
||||||
const uint8_t *use_tables;
|
const uint8_t *use_tables;
|
||||||
unsigned int delimiter = *p++;
|
unsigned int delimiter = *p++;
|
||||||
|
@ -4311,6 +4344,19 @@ patlen = p - buffer - 2;
|
||||||
if (!decode_modifiers(p, CTX_PAT, &pat_patctl, NULL)) return PR_SKIP;
|
if (!decode_modifiers(p, CTX_PAT, &pat_patctl, NULL)) return PR_SKIP;
|
||||||
utf = (pat_patctl.options & PCRE2_UTF) != 0;
|
utf = (pat_patctl.options & PCRE2_UTF) != 0;
|
||||||
|
|
||||||
|
/* Check for mutually exclusive modifiers. */
|
||||||
|
|
||||||
|
for (k = 0; k < sizeof(exclusive_pat_controls)/sizeof(uint32_t); k++)
|
||||||
|
{
|
||||||
|
uint32_t c = pat_patctl.control & exclusive_pat_controls[k];
|
||||||
|
if (c != 0 && c != (c & (~c+1)))
|
||||||
|
{
|
||||||
|
show_controls(c, "** Not allowed together:");
|
||||||
|
fprintf(outfile, "\n");
|
||||||
|
return PR_SKIP;
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
/* Assume full JIT compile for jitverify and/or jitfast if nothing else was
|
/* Assume full JIT compile for jitverify and/or jitfast if nothing else was
|
||||||
specified. */
|
specified. */
|
||||||
|
|
||||||
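The exclusivity test in the hunk above relies on `c & (~c+1)`, which isolates the lowest set bit of `c`; if `c` differs from that value, at least two of the mutually exclusive bits are set. A self-contained sketch of the same check, using the CTL_ values from the new side of this diff, might look like this. The names `exclusive_sets` and `find_conflict` are hypothetical, and the sketch returns the conflicting bits instead of calling show_controls().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Values copied from the new CTL_ definitions in this commit. */
#define CTL_EXPAND 0x00000400u
#define CTL_HEXPAT 0x00008000u
#define CTL_POSIX  0x00400000u
#define CTL_PUSH   0x00800000u

/* Each entry is a set of controls of which at most one may be present. */
static const uint32_t exclusive_sets[] = {
  CTL_POSIX  | CTL_HEXPAT,
  CTL_POSIX  | CTL_PUSH,
  CTL_EXPAND | CTL_HEXPAT };

/* Returns the conflicting bits of the first violated set, or 0 if none. */
static uint32_t find_conflict(uint32_t control)
{
  for (size_t k = 0; k < sizeof(exclusive_sets)/sizeof(exclusive_sets[0]); k++)
    {
    uint32_t c = control & exclusive_sets[k];
    /* c & (~c + 1) keeps only the lowest set bit of c. */
    if (c != 0 && c != (c & (~c + 1))) return c;
    }
  return 0;
}
```

Keeping the sets in a table means a new restriction, such as expand with hex patterns, is one added entry rather than another hand-written `if`.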
|
@ -4318,28 +4364,16 @@ if (pat_patctl.jit == 0 &&
|
||||||
(pat_patctl.control & (CTL_JITVERIFY|CTL_JITFAST)) != 0)
|
(pat_patctl.control & (CTL_JITVERIFY|CTL_JITFAST)) != 0)
|
||||||
pat_patctl.jit = 7;
|
pat_patctl.jit = 7;
|
||||||
|
|
||||||
/* POSIX and 'push' do not play together. */
|
|
||||||
|
|
||||||
if ((pat_patctl.control & (CTL_POSIX|CTL_PUSH)) == (CTL_POSIX|CTL_PUSH))
|
|
||||||
{
|
|
||||||
fprintf(outfile, "** The POSIX interface is incompatible with 'push'\n");
|
|
||||||
return PR_ABEND;
|
|
||||||
}
|
|
||||||
|
|
||||||
/* Now copy the pattern to pbuffer8 for use in 8-bit testing and for reflecting
|
/* Now copy the pattern to pbuffer8 for use in 8-bit testing and for reflecting
|
||||||
in callouts. Convert to binary if required. */
|
in callouts. Convert from hex if required; this must necessarily be fewer
|
||||||
|
characters so will always fit in pbuffer8. Alternatively, process for
|
||||||
|
repetition if requested. */
|
||||||
|
|
||||||
if ((pat_patctl.control & CTL_HEXPAT) != 0)
|
if ((pat_patctl.control & CTL_HEXPAT) != 0)
|
||||||
{
|
{
|
||||||
uint8_t *pp, *pt;
|
uint8_t *pp, *pt;
|
||||||
uint32_t c, d;
|
uint32_t c, d;
|
||||||
|
|
||||||
if ((pat_patctl.control & CTL_POSIX) != 0)
|
|
||||||
{
|
|
||||||
fprintf(outfile, "** Hex patterns are not supported for the POSIX API\n");
|
|
||||||
return PR_SKIP;
|
|
||||||
}
|
|
||||||
|
|
||||||
pt = pbuffer8;
|
pt = pbuffer8;
|
||||||
for (pp = buffer + 1; *pp != 0; pp++)
|
for (pp = buffer + 1; *pp != 0; pp++)
|
||||||
{
|
{
|
||||||
|
@ -4362,6 +4396,80 @@ if ((pat_patctl.control & CTL_HEXPAT) != 0)
|
||||||
*pt = 0;
|
*pt = 0;
|
||||||
patlen = pt - pbuffer8;
|
patlen = pt - pbuffer8;
|
||||||
}
|
}
|
||||||
|
|
||||||
|
else if ((pat_patctl.control & CTL_EXPAND) != 0)
|
||||||
|
{
|
||||||
|
uint8_t *pp, *pt;
|
||||||
|
|
||||||
|
pt = pbuffer8;
|
||||||
|
for (pp = buffer + 1; *pp != 0; pp++)
|
||||||
|
{
|
||||||
|
uint8_t *pc = pp;
|
||||||
|
uint32_t count = 1;
|
||||||
|
size_t length = 1;
|
||||||
|
|
||||||
|
/* Check for replication syntax; if not found, the defaults just set will
|
||||||
|
prevail and one character will be copied. */
|
||||||
|
|
||||||
|
if (pp[0] == '\\' && pp[1] == '[')
|
||||||
|
{
|
||||||
|
uint8_t *pe;
|
||||||
|
for (pe = pp + 2; *pe != 0; pe++)
|
||||||
|
{
|
||||||
|
if (pe[0] == ']' && pe[1] == '{')
|
||||||
|
{
|
||||||
|
uint32_t clen = pe - pc - 2;
|
||||||
|
uint32_t i = 0;
|
||||||
|
pe += 2;
|
||||||
|
while (isdigit(*pe)) i = i * 10 + *pe++ - '0';
|
||||||
|
if (*pe == '}')
|
||||||
|
{
|
||||||
|
if (i == 0)
|
||||||
|
{
|
||||||
|
fprintf(outfile, "** Zero repeat not allowed\n");
|
||||||
|
return PR_SKIP;
|
||||||
|
}
|
||||||
|
pc += 2;
|
||||||
|
count = i;
|
||||||
|
length = clen;
|
||||||
|
pp = pe;
|
||||||
|
break;
|
||||||
|
}
|
||||||
|
}
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
/* Add to output. If the buffer is too small expand it. The function for
|
||||||
|
expanding buffers always keeps buffer and pbuffer8 in step as far as their
|
||||||
|
size goes. */
|
||||||
|
|
||||||
|
while (pt + count * length > pbuffer8 + pbuffer8_size)
|
||||||
|
{
|
||||||
|
size_t pc_offset = pc - buffer;
|
||||||
|
size_t pp_offset = pp - buffer;
|
||||||
|
size_t pt_offset = pt - pbuffer8;
|
||||||
|
expand_input_buffers();
|
||||||
|
pc = buffer + pc_offset;
|
||||||
|
pp = buffer + pp_offset;
|
||||||
|
pt = pbuffer8 + pt_offset;
|
||||||
|
}
|
||||||
|
|
||||||
|
while (count-- > 0)
|
||||||
|
{
|
||||||
|
memcpy(pt, pc, length);
|
||||||
|
pt += length;
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
*pt = 0;
|
||||||
|
patlen = pt - pbuffer8;
|
||||||
|
|
||||||
|
if ((pat_patctl.control & CTL_INFO) != 0)
|
||||||
|
fprintf(outfile, "Expanded: %s\n", pbuffer8);
|
||||||
|
}
|
||||||
|
|
||||||
|
/* Neither hex nor expanded, just copy the input verbatim. */
|
||||||
|
|
||||||
else
|
else
|
||||||
{
|
{
|
||||||
strncpy((char *)pbuffer8, (char *)(buffer+1), patlen + 1);
|
strncpy((char *)pbuffer8, (char *)(buffer+1), patlen + 1);
|
||||||
|
@ -4548,11 +4656,11 @@ if ((pat_patctl.control & CTL_NL_SET) == 0 && local_newline_default != 0)
|
||||||
{
|
{
|
||||||
SETFLD(pat_context, newline_convention, local_newline_default);
|
SETFLD(pat_context, newline_convention, local_newline_default);
|
||||||
}
|
}
|
||||||
|
|
||||||
/* The nullcontext modifier is used to test calling pcre2_compile() with a NULL
|
/* The nullcontext modifier is used to test calling pcre2_compile() with a NULL
|
||||||
context. */
|
context. */
|
||||||
|
|
||||||
use_pat_context = ((pat_patctl.control & CTL_NULLCONTEXT) != 0)?
|
use_pat_context = ((pat_patctl.control & CTL_NULLCONTEXT) != 0)?
|
||||||
NULL : PTR(pat_context);
|
NULL : PTR(pat_context);
|
||||||
|
|
||||||
/* Compile many times when timing. */
|
/* Compile many times when timing. */
|
||||||
|
@ -4629,7 +4737,7 @@ if (pat_patctl.jit != 0)
|
||||||
clock_t start_time;
|
clock_t start_time;
|
||||||
SUB1(pcre2_code_free, compiled_code);
|
SUB1(pcre2_code_free, compiled_code);
|
||||||
PCRE2_COMPILE(compiled_code, pbuffer, patlen,
|
PCRE2_COMPILE(compiled_code, pbuffer, patlen,
|
||||||
pat_patctl.options|forbid_utf, &errorcode, &erroroffset,
|
pat_patctl.options|forbid_utf, &errorcode, &erroroffset,
|
||||||
use_pat_context);
|
use_pat_context);
|
||||||
start_time = clock();
|
start_time = clock();
|
||||||
PCRE2_JIT_COMPILE(compiled_code, pat_patctl.jit);
|
PCRE2_JIT_COMPILE(compiled_code, pat_patctl.jit);
|
||||||
|
@ -5490,9 +5598,9 @@ for (k = 0; k < sizeof(exclusive_dat_controls)/sizeof(uint32_t); k++)
|
||||||
fprintf(outfile, "\n");
|
fprintf(outfile, "\n");
|
||||||
return PR_OK;
|
return PR_OK;
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
if (pat_patctl.replacement[0] != 0 &&
|
if (pat_patctl.replacement[0] != 0 &&
|
||||||
(dat_datctl.control & CTL_NULLCONTEXT) != 0)
|
(dat_datctl.control & CTL_NULLCONTEXT) != 0)
|
||||||
{
|
{
|
||||||
fprintf(outfile, "** Replacement text is not supported with null_context.\n");
|
fprintf(outfile, "** Replacement text is not supported with null_context.\n");
|
||||||
|
@ -5623,7 +5731,7 @@ if ((dat_datctl.control & CTL_ZERO_TERMINATE) != 0)
|
||||||
/* The nullcontext modifier is used to test calling pcre2_[jit_]match() with a
|
/* The nullcontext modifier is used to test calling pcre2_[jit_]match() with a
|
||||||
NULL context. */
|
NULL context. */
|
||||||
|
|
||||||
use_dat_context = ((dat_datctl.control & CTL_NULLCONTEXT) != 0)?
|
use_dat_context = ((dat_datctl.control & CTL_NULLCONTEXT) != 0)?
|
||||||
NULL : PTR(dat_context);
|
NULL : PTR(dat_context);
|
||||||
|
|
||||||
/* Enable display of malloc/free if wanted. */
|
/* Enable display of malloc/free if wanted. */
|
||||||
|
@ -5719,8 +5827,8 @@ if (dat_datctl.replacement[0] != 0)
|
||||||
xoptions = (((dat_datctl.control & CTL_GLOBAL) == 0)? 0 :
|
xoptions = (((dat_datctl.control & CTL_GLOBAL) == 0)? 0 :
|
||||||
PCRE2_SUBSTITUTE_GLOBAL) |
|
PCRE2_SUBSTITUTE_GLOBAL) |
|
||||||
(((pat_patctl.control & CTL_SUBSTITUTE_EXTENDED) == 0)? 0 :
|
(((pat_patctl.control & CTL_SUBSTITUTE_EXTENDED) == 0)? 0 :
|
||||||
PCRE2_SUBSTITUTE_EXTENDED);
|
PCRE2_SUBSTITUTE_EXTENDED);
|
||||||
|
|
||||||
SETCASTPTR(r, rbuffer); /* Sets r8, r16, or r32, as appropriate. */
|
SETCASTPTR(r, rbuffer); /* Sets r8, r16, or r32, as appropriate. */
|
||||||
pr = dat_datctl.replacement;
|
pr = dat_datctl.replacement;
|
||||||
|
|
||||||
|
@ -5814,7 +5922,7 @@ if (dat_datctl.replacement[0] != 0)
|
||||||
{
|
{
|
||||||
fprintf(outfile, "Failed: error %d", rc);
|
fprintf(outfile, "Failed: error %d", rc);
|
||||||
if (nsize != PCRE2_UNSET)
|
if (nsize != PCRE2_UNSET)
|
||||||
fprintf(outfile, " at offset %ld in replacement", nsize);
|
fprintf(outfile, " at offset %ld in replacement", nsize);
|
||||||
fprintf(outfile, ": ");
|
fprintf(outfile, ": ");
|
||||||
PCRE2_GET_ERROR_MESSAGE(nsize, rc, pbuffer);
|
PCRE2_GET_ERROR_MESSAGE(nsize, rc, pbuffer);
|
||||||
PCHARSV(CASTVAR(void *, pbuffer), 0, nsize, FALSE, outfile);
|
PCHARSV(CASTVAR(void *, pbuffer), 0, nsize, FALSE, outfile);
|
||||||
|
|
File diff suppressed because one or more lines are too long
|
@ -252,8 +252,10 @@
|
||||||
|
|
||||||
/(*MARK:a\x{100}b)z/alt_verbnames
|
/(*MARK:a\x{100}b)z/alt_verbnames
|
||||||
|
|
||||||
/(?'ABC'[bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar](*THEN:AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))/
|
# Use "expand" to create some very long patterns
|
||||||
|
|
||||||
/(?'ABC'[bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))/
|
/(?'ABC'\[[bar](]{105}*THEN:\[A]{255}\[)]{106}/expand
|
||||||
|
|
||||||
|
/(?'ABC'\[[bar](]{106}*THEN:\[A]{255}\[)]{107}/expand
|
||||||
|
|
||||||
# End of testinput9
|
# End of testinput9
|
||||||
|
|
File diff suppressed because one or more lines are too long
File diff suppressed because one or more lines are too long
|
@ -355,9 +355,11 @@ Failed: error 177 at offset 6: character code point value in \u.... sequence is
|
||||||
/(*MARK:a\x{100}b)z/alt_verbnames
|
/(*MARK:a\x{100}b)z/alt_verbnames
|
||||||
Failed: error 134 at offset 14: character code point value in \x{} or \o{} is too large
|
Failed: error 134 at offset 14: character code point value in \x{} or \o{} is too large
|
||||||
|
|
||||||
/(?'ABC'[bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar](*THEN:AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))/
|
# Use "expand" to create some very long patterns
|
||||||
|
|
||||||
/(?'ABC'[bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]([bar]))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))/
|
/(?'ABC'\[[bar](]{105}*THEN:\[A]{255}\[)]{106}/expand
|
||||||
|
|
||||||
|
/(?'ABC'\[[bar](]{106}*THEN:\[A]{255}\[)]{107}/expand
|
||||||
Failed: error 186 at offset 637: regular expression is too complicated
|
Failed: error 186 at offset 637: regular expression is too complicated
|
||||||
|
|
||||||
# End of testinput9
|
# End of testinput9
|
||||||
|
|