Impose a minimum of 1 for the number of pairs in the ovector.

Philip.Hazel 2014-10-05 17:55:25 +00:00
parent 4ca4ad688d
commit 4bdfd990af
6 changed files with 61 additions and 46 deletions

@ -1,4 +1,4 @@
.TH PCRE2API 3 "01 October 2014" "PCRE2 10.00" .TH PCRE2API 3 "05 October 2014" "PCRE2 10.00"
.SH NAME .SH NAME
PCRE2 - Perl-compatible regular expressions (revised API) PCRE2 - Perl-compatible regular expressions (revised API)
.sp .sp
@ -1655,8 +1655,10 @@ match data block by calling one of the creation functions above. For
\fBpcre2_match_data_create()\fP, the first argument is the number of pairs of \fBpcre2_match_data_create()\fP, the first argument is the number of pairs of
offsets in the \fIovector\fP. One pair of offsets is required to identify the offsets in the \fIovector\fP. One pair of offsets is required to identify the
string that matched the whole pattern, with another pair for each captured string that matched the whole pattern, with another pair for each captured
substring. For example, a value of 4 creates enough space to record the substring. For example, a value of 4 creates enough space to record the matched
matched portion of the subject plus three captured substrings. portion of the subject plus three captured substrings. A minimum of at least 1
pair is imposed by \fBpcre2_match_data_create()\fP, so it is always possible to
return the overall matched string.
.P .P
For \fBpcre2_match_data_create_from_pattern()\fP, the first argument is a For \fBpcre2_match_data_create_from_pattern()\fP, the first argument is a
pointer to a compiled pattern. In this case the ovector is created to be pointer to a compiled pattern. In this case the ovector is created to be
@ -2015,13 +2017,13 @@ operation, it is the last portion of the string that it matched that is
returned. returned.
.P .P
If the ovector is too small to hold all the captured substring offsets, as much If the ovector is too small to hold all the captured substring offsets, as much
as possible is filled in, and the function returns a value of zero. If neither as possible is filled in, and the function returns a value of zero. If captured
the actual string matched nor any captured substrings are of interest, substrings are not of interest, \fBpcre2_match()\fP may be called with a match
\fBpcre2_match()\fP may be called with a match data block whose ovector is of data block whose ovector is of minimum length (that is, one pair). However, if
zero length. However, if the pattern contains back references and the the pattern contains back references and the \fIovector\fP is not big enough to
\fIovector\fP is not big enough to remember the related substrings, PCRE2 has remember the related substrings, PCRE2 has to get additional memory for use
to get additional memory for use during matching. Thus it is usually advisable during matching. Thus it is usually advisable to set up a match data block
to set up a match data block containing an ovector of reasonable size. containing an ovector of reasonable size.
.P .P
It is possible for capturing subpattern number \fIn+1\fP to match some part of It is possible for capturing subpattern number \fIn+1\fP to match some part of
the subject when subpattern \fIn\fP has not been used at all. For example, if the subject when subpattern \fIn\fP has not been used at all. For example, if
@ -2652,6 +2654,6 @@ Cambridge CB2 3QH, England.
.rs .rs
.sp .sp
.nf .nf
Last updated: 01 October 2014 Last updated: 05 October 2014
Copyright (c) 1997-2014 University of Cambridge. Copyright (c) 1997-2014 University of Cambridge.
.fi .fi

@ -1,4 +1,4 @@
.TH PCRE2TEST 1 "19 August 2014" "PCRE 10.00" .TH PCRE2TEST 1 "05 October 2014" "PCRE 10.00"
.SH NAME .SH NAME
pcre2test - a program for testing Perl-compatible regular expressions. pcre2test - a program for testing Perl-compatible regular expressions.
.SH SYNOPSIS .SH SYNOPSIS
@ -881,6 +881,12 @@ The \fBovector\fP modifier applies only to the subject line in which it
appears, though of course it can also be used to set a default in a appears, though of course it can also be used to set a default in a
\fB#subject\fP command. It specifies the number of pairs of offsets that are \fB#subject\fP command. It specifies the number of pairs of offsets that are
available for storing matching information. The default is 15. available for storing matching information. The default is 15.
.P
At least one pair of offsets is always created by
\fBpcre2_match_data_create()\fP, for matching with PCRE2's native API, so a
value of 0 is the same as 1. However a value of 0 is useful when testing the
POSIX API because it causes \fBregexec()\fP to be called with a NULL capture
vector.
. .
. .
.SH "THE ALTERNATIVE MATCHING FUNCTION" .SH "THE ALTERNATIVE MATCHING FUNCTION"
@ -1145,6 +1151,6 @@ Cambridge CB2 3QH, England.
.rs .rs
.sp .sp
.nf .nf
Last updated: 19 August 2014 Last updated: 05 October 2014
Copyright (c) 1997-2014 University of Cambridge. Copyright (c) 1997-2014 University of Cambridge.
.fi .fi

@ -51,10 +51,14 @@ POSSIBILITY OF SUCH DAMAGE.
* Create a match data block given ovector size * * Create a match data block given ovector size *
*************************************************/ *************************************************/
/* A minimum of 1 is imposed on the number of ovector triplets. */
PCRE2_EXP_DEFN pcre2_match_data * PCRE2_CALL_CONVENTION PCRE2_EXP_DEFN pcre2_match_data * PCRE2_CALL_CONVENTION
pcre2_match_data_create(uint32_t oveccount, pcre2_general_context *gcontext) pcre2_match_data_create(uint32_t oveccount, pcre2_general_context *gcontext)
{ {
pcre2_match_data *yield = PRIV(memctl_malloc)( pcre2_match_data *yield;
if (oveccount < 1) oveccount = 1;
yield = PRIV(memctl_malloc)(
sizeof(pcre2_match_data) + 3*oveccount*sizeof(PCRE2_SIZE), sizeof(pcre2_match_data) + 3*oveccount*sizeof(PCRE2_SIZE),
(pcre2_memctl *)gcontext); (pcre2_memctl *)gcontext);
yield->oveccount = oveccount; yield->oveccount = oveccount;

@ -4385,11 +4385,10 @@ if ((dat_datctl.control & (CTL_DFA|CTL_FINDLIMITS)) == (CTL_DFA|CTL_FINDLIMITS))
dat_datctl.control &= ~CTL_FINDLIMITS; dat_datctl.control &= ~CTL_FINDLIMITS;
} }
if ((dat_datctl.control & CTL_ANYGLOB) != 0 && dat_datctl.oveccount < 1) /* As pcre2_match_data_create() imposes a minimum of 1 on the ovector count, we
{ must do so too. */
printf("** Global matching requires a non-zero ovector count: ignored\n");
dat_datctl.control &= ~CTL_ANYGLOB; if (dat_datctl.oveccount < 1) dat_datctl.oveccount = 1;
}
/* Enable display of malloc/free if wanted. */ /* Enable display of malloc/free if wanted. */
@ -4875,8 +4874,7 @@ for (gmatched = 0;; gmatched++)
If that is the case, this is not necessarily the end. We want to advance the If that is the case, this is not necessarily the end. We want to advance the
start offset, and continue. We won't be at the end of the string - that was start offset, and continue. We won't be at the end of the string - that was
checked before setting g_notempty. We achieve the effect by pretending that a checked before setting g_notempty. We achieve the effect by pretending that a
single character was matched. We know that match_data->oveccount is at least single character was matched.
1 because that was checked above.
Complication arises in the case when the newline convention is "any", "crlf", Complication arises in the case when the newline convention is "any", "crlf",
or "anycrlf". If the previous match was at the end of a line terminated by or "anycrlf". If the previous match was at the end of a line terminated by

@ -245,6 +245,7 @@ Subject length lower bound = 4
3: c 3: c
abcb\=ovector=0 abcb\=ovector=0
Matched, but too many substrings Matched, but too many substrings
0: abcb
abcb\=ovector=1 abcb\=ovector=1
Matched, but too many substrings Matched, but too many substrings
0: abcb 0: abcb
@ -273,6 +274,7 @@ Subject length lower bound = 3
1: a 1: a
abc\=ovector=0 abc\=ovector=0
Matched, but too many substrings Matched, but too many substrings
0: abc
abc\=ovector=1 abc\=ovector=1
Matched, but too many substrings Matched, but too many substrings
0: abc 0: abc
@ -286,6 +288,7 @@ Matched, but too many substrings
3: b 3: b
aba\=ovector=0 aba\=ovector=0
Matched, but too many substrings Matched, but too many substrings
0: aba
aba\=ovector=1 aba\=ovector=1
Matched, but too many substrings Matched, but too many substrings
0: aba 0: aba
@ -7404,6 +7407,7 @@ Subject length lower bound = 3
No match No match
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa4\=ovector=0 aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa4\=ovector=0
Matched, but too many substrings Matched, but too many substrings
0: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa4
/^a.b/newline=lf /^a.b/newline=lf
a\rb a\rb
@ -10922,6 +10926,7 @@ Minimum recursion limit = 4
3: baz 3: baz
bazfooX\=ovector=0 bazfooX\=ovector=0
Matched, but too many substrings Matched, but too many substrings
0: fooX
bazfooX\=ovector=1 bazfooX\=ovector=1
Matched, but too many substrings Matched, but too many substrings
0: fooX 0: fooX
@ -11970,7 +11975,7 @@ Callout 2: last capture = 0
/(ab)x|ab/ /(ab)x|ab/
ab\=ovector=0 ab\=ovector=0
Matched, but too many substrings 0: ab
ab\=ovector=1 ab\=ovector=1
0: ab 0: ab

@ -7611,7 +7611,7 @@ Failed: error -37: invalid data in workspace for DFA restart
/abcd/ /abcd/
abcd\=ovector=0 abcd\=ovector=0
Matched, but offsets vector is too small to show all matches 0: abcd
# These tests show up auto-possessification # These tests show up auto-possessification
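
The clamp that this commit adds to pcre2_match_data_create() can be sketched in isolation as follows. This is a minimal illustration, not the PCRE2 source: the type demo_match_data and the function demo_match_data_create() are hypothetical stand-ins, plain malloc() replaces PRIV(memctl_malloc), size_t replaces PCRE2_SIZE, and the NULL check is an addition for the sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for pcre2_match_data; not the real PCRE2 layout. */
typedef struct demo_match_data {
  uint32_t oveccount;   /* number of offset pairs recorded for the caller */
  size_t ovector[];     /* storage sized as 3 * oveccount, as in the diff */
} demo_match_data;

/* Models the clamp from the diff: a request for 0 pairs is silently raised
   to 1, so the slot for the overall matched string always exists. */
demo_match_data *demo_match_data_create(uint32_t oveccount)
{
  demo_match_data *yield;

  if (oveccount < 1) oveccount = 1;
  assert(oveccount >= 1);   /* postcondition of the clamp */

  yield = malloc(sizeof(demo_match_data) +
                 3 * (size_t)oveccount * sizeof(size_t));
  if (yield == NULL) return NULL;   /* assumption: caller handles failure */

  yield->oveccount = oveccount;
  return yield;
}
```

Under these assumptions, demo_match_data_create(0) and demo_match_data_create(1) yield blocks of the same size with oveccount set to 1. This matches the updated test output above, where ovector=0 now reports the "0:" line just as ovector=1 does.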