Upgrade perltest.sh to support (some) #pattern modifiers.

Philip.Hazel 2018-07-17 16:00:09 +00:00
parent 455ce731dc
commit 635d04fbb7
5 changed files with 535 additions and 500 deletions
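
As a rough illustration of the behaviour this commit adds to perltest.sh, the sketch below re-expresses the #-line handling and the modifier merge in Python rather than the script's own Perl. The names classify, merge_modifiers and the state dictionary are hypothetical and exist only for this example; the regular expressions are copied from the diff further down. Although an in-code comment says that unrecognised #-commands "cause an error", the code itself prints a warning and moves on, and that is what the sketch models.

```python
import re

# pcre2test commands that the new code skips without a warning.
SILENT = re.compile(r"^#newline_default|^#perltest|^#forbid_utf")


def classify(line, state):
    # Returns "skip", "warn" or "pattern" for one input line.
    # state["extra"] is updated when a #pattern line is seen.
    if re.match(r"^\s*$", line) or re.match(r"^#[\s!]", line):
        return "skip"  # blank line or plain comment
    m = re.match(r"^#pattern(.*)", line)
    if m:
        # Remember the modifiers for every subsequent pattern.
        state["extra"] = m.group(1).rstrip()
        return "skip"
    if line.startswith("#"):
        # Known-but-unneeded commands are silent; others get a warning.
        return "skip" if SILENT.match(line) else "warn"
    return "pattern"


def merge_modifiers(own, extra):
    # Append the #pattern modifiers to the pattern's own modifiers,
    # then drop a leading comma left behind when "own" is empty.
    return re.sub(r"^,\s*", "", f"{own},{extra}")
```

Under these assumptions, after a "#pattern mark" line a pattern whose own modifier string is "i" ends up with "i, mark", and a pattern with no modifiers of its own ends up with "mark". Because the extra modifiers are appended rather than treated as defaults, an individual pattern cannot override them, which matches the NOTE in the script's header comments.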

@@ -118,6 +118,11 @@ backtrack into the first of the atomic groups. A complicated example is
 /(?>a(*:1))(?>b)(*SKIP:1)x|.*/ matched against "abc", where the *SKIP
 shouldn't find a MARK (because is in an atomic group), but it did.
+26. Upgraded the perltest.sh script: (1) #pattern lines can now be used to set
+certain modifiers that the script recognizes; (2) Unsupported #command lines
+give a warning when they are ignored; (3) Mark data is output only if the
+"mark" modifier is present.
 Version 10.31 12-February-2018
 ------------------------------

@@ -315,7 +315,8 @@ number of subject lines to be matched against that pattern. In between sets of
 test data, command lines that begin with # may appear. This file format, with
 some restrictions, can also be processed by the <b>perltest.sh</b> script that
 is distributed with PCRE2 as a means of checking that the behaviour of PCRE2
-and Perl is the same.
+and Perl is the same. For a specification of <b>perltest.sh</b>, see the
+comments near its beginning.
 </P>
 <P>
 When the input is a terminal, <b>pcre2test</b> prompts for each line of input,
@@ -2002,7 +2003,7 @@ Cambridge, England.
 </P>
 <br><a name="SEC21" href="#TOC1">REVISION</a><br>
 <P>
-Last updated: 27 June 2018
+Last updated: 16 July 2018
 <br>
 Copyright &copy; 1997-2018 University of Cambridge.
 <br>

@@ -1,4 +1,4 @@
-.TH PCRE2TEST 1 "27 June 2018" "PCRE 10.32"
+.TH PCRE2TEST 1 "16 July 2018" "PCRE 10.32"
 .SH NAME
 pcre2test - a program for testing Perl-compatible regular expressions.
 .SH SYNOPSIS
@@ -266,7 +266,8 @@ number of subject lines to be matched against that pattern. In between sets of
 test data, command lines that begin with # may appear. This file format, with
 some restrictions, can also be processed by the \fBperltest.sh\fP script that
 is distributed with PCRE2 as a means of checking that the behaviour of PCRE2
-and Perl is the same.
+and Perl is the same. For a specification of \fBperltest.sh\fP, see the
+comments near its beginning.
 .P
 When the input is a terminal, \fBpcre2test\fP prompts for each line of input,
 using "re>" to prompt for regular expression patterns, and "data>" to prompt
@@ -1980,6 +1981,6 @@ Cambridge, England.
 .rs
 .sp
 .nf
-Last updated: 27 June 2018
+Last updated: 16 July 2018
 Copyright (c) 1997-2018 University of Cambridge.
 .fi

@@ -251,7 +251,8 @@ DESCRIPTION
 tern. In between sets of test data, command lines that begin with # may
 appear. This file format, with some restrictions, can also be processed
 by the perltest.sh script that is distributed with PCRE2 as a means of
-checking that the behaviour of PCRE2 and Perl is the same.
+checking that the behaviour of PCRE2 and Perl is the same. For a speci-
+fication of perltest.sh, see the comments near its beginning.
 When the input is a terminal, pcre2test prompts for each line of input,
 using "re>" to prompt for regular expression patterns, and "data>" to
@@ -1817,5 +1818,5 @@ AUTHOR
 REVISION
-Last updated: 27 June 2018
+Last updated: 16 July 2018
 Copyright (c) 1997-2018 University of Cambridge.

@@ -50,6 +50,13 @@ fi
 # ucp sets Perl's /u modifier
 # utf invoke UTF-8 functionality
 #
+# Comment lines are ignored. The #pattern command can be used to set modifiers
+# that will be added to each subsequent pattern. NOTE: this is different to
+# pcre2test where #pattern sets defaults, some of which can be overridden on
+# individual patterns. The #perltest, #forbid_utf, and #newline_default
+# commands, which are needed in the relevant pcre2test files, are ignored. Any
+# other #-command is ignored, with a warning message.
+#
 # The data lines must not have any pcre2test modifiers. Unless
 # "subject_literal" is on the pattern, data lines are processed as
 # Perl double-quoted strings, so if they contain " $ or @ characters, these
@@ -127,7 +134,26 @@ for (;;)
 printf " re> " if $interact;
 last if ! ($_ = <$infile>);
 printf $outfile "$_" if ! $interact;
-next if ($_ =~ /^\s*$/ || $_ =~ /^#/);
+next if ($_ =~ /^\s*$/ || $_ =~ /^#[\s!]/);
+# A few of pcre2test's #-commands are supported, or just ignored. Any others
+# cause an error.
+if ($_ =~ /^#pattern(.*)/)
+{
+$extra_modifiers = $1;
+chomp($extra_modifiers);
+$extra_modifiers =~ s/\s+$//;
+next;
+}
+elsif ($_ =~ /^#/)
+{
+if ($_ !~ /^#newline_default|^#perltest|^#forbid_utf/)
+{
+printf $outfile "** Warning: #-command ignored: %s", $_;
+}
+next;
+}
 $pattern = $_;
@@ -146,7 +172,8 @@ for (;;)
 $pattern =~ /^\s*((.).*\2)(.*)$/s;
 $pat = $1;
-$mod = $3;
+$mod = "$3,$extra_modifiers";
+$mod =~ s/^,\s*//;
 $del = $2;
 # The private "aftertext" modifier means "print $' afterwards".