Improve interfaces to substring functions, and fix bugs.

Philip.Hazel 2014-12-13 17:43:26 +00:00
parent 45b4ec3f8d
commit a85d15cbd1
11 changed files with 191 additions and 111 deletions


@ -24,8 +24,16 @@ by name, into a given buffer. The arguments are:
.sp .sp
The \fIbufflen\fP variable is updated to contain the length of the extracted The \fIbufflen\fP variable is updated to contain the length of the extracted
string, excluding the trailing zero. The yield of the function is zero for string, excluding the trailing zero. The yield of the function is zero for
success, PCRE2_ERROR_NOMEMORY if the buffer is too small, or success or one of the following error numbers:
PCRE2_ERROR_NOSUBSTRING if the string name is invalid. .sp
PCRE2_ERROR_NOSUBSTRING there are no groups of that name
PCRE2_ERROR_UNAVAILABLE the ovector was too small for that group
PCRE2_ERROR_UNSET the group did not participate in the match
PCRE2_ERROR_NOMEMORY the buffer is not big enough
.sp
If there is more than one group with the given name, the first one that is set
is returned. In this situation PCRE2_ERROR_UNSET means that no group with the
given name was set.
.P .P
There is a complete description of the PCRE2 native API in the There is a complete description of the PCRE2 native API in the
.\" HREF .\" HREF


@ -1,4 +1,4 @@
.TH PCRE2_SUBSTRING_COPY_BYNUMBER 3 "01 December 2014" "PCRE2 10.00" .TH PCRE2_SUBSTRING_COPY_BYNUMBER 3 "13 December 2014" "PCRE2 10.00"
.SH NAME .SH NAME
PCRE2 - Perl-compatible regular expressions (revised API) PCRE2 - Perl-compatible regular expressions (revised API)
.SH SYNOPSIS .SH SYNOPSIS
@ -24,9 +24,14 @@ buffer. The arguments are:
\fIbufflen\fP Length of buffer \fIbufflen\fP Length of buffer
.sp .sp
The \fIbufflen\fP variable is updated with the length of the extracted string, The \fIbufflen\fP variable is updated with the length of the extracted string,
excluding the terminating zero. The yield of the function is zero for success, excluding the terminating zero. The yield of the function is zero for success
PCRE2_ERROR_NOMEMORY if the buffer was too small, or PCRE2_ERROR_NOSUBSTRING if or one of the following error numbers:
the string number is invalid. .sp
PCRE2_ERROR_NOSUBSTRING there are no groups of that number
PCRE2_ERROR_UNAVAILABLE the ovector was too small for that group
PCRE2_ERROR_UNSET the group did not participate in the match
PCRE2_ERROR_NOMEMORY the buffer is too small
.sp
.P .P
There is a complete description of the PCRE2 native API in the There is a complete description of the PCRE2 native API in the
.\" HREF .\" HREF


@ -25,9 +25,17 @@ newly acquired memory. The arguments are:
The memory in which the substring is placed is obtained by calling the same The memory in which the substring is placed is obtained by calling the same
memory allocation function that was used for the match data block. The memory allocation function that was used for the match data block. The
convenience function \fBpcre2_substring_free()\fP can be used to free it when convenience function \fBpcre2_substring_free()\fP can be used to free it when
it is no longer needed. The yield of the function is zero for success, it is no longer needed. The yield of the function is zero for success or one of
PCRE2_ERROR_NOMEMORY if sufficient memory could not be obtained, or the following error numbers:
PCRE2_ERROR_NOSUBSTRING if the string name is invalid. .sp
PCRE2_ERROR_NOSUBSTRING there are no groups of that name
PCRE2_ERROR_UNAVAILABLE the ovector was too small for that group
PCRE2_ERROR_UNSET the group did not participate in the match
PCRE2_ERROR_NOMEMORY memory could not be obtained
.sp
If there is more than one group with the given name, the first one that is set
is returned. In this situation PCRE2_ERROR_UNSET means that no group with the
given name was set.
.P .P
There is a complete description of the PCRE2 native API in the There is a complete description of the PCRE2 native API in the
.\" HREF .\" HREF


@ -1,4 +1,4 @@
.TH PCRE2_SUBSTRING_GET_BYNUMBER 3 "01 December 2014" "PCRE2 10.00" .TH PCRE2_SUBSTRING_GET_BYNUMBER 3 "13 December 2014" "PCRE2 10.00"
.SH NAME .SH NAME
PCRE2 - Perl-compatible regular expressions (revised API) PCRE2 - Perl-compatible regular expressions (revised API)
.SH SYNOPSIS .SH SYNOPSIS
@ -25,9 +25,14 @@ into newly acquired memory. The arguments are:
The memory in which the substring is placed is obtained by calling the same The memory in which the substring is placed is obtained by calling the same
memory allocation function that was used for the match data block. The memory allocation function that was used for the match data block. The
convenience function \fBpcre2_substring_free()\fP can be used to free it when convenience function \fBpcre2_substring_free()\fP can be used to free it when
it is no longer needed. The yield of the function is zero for success, it is no longer needed. The yield of the function is zero for success or one of
PCRE2_ERROR_NOMEMORY if sufficient memory could not be obtained, or the following error numbers:
PCRE2_ERROR_NOSUBSTRING if the string number is invalid. .sp
PCRE2_ERROR_NOSUBSTRING there are no groups of that number
PCRE2_ERROR_UNAVAILABLE the ovector was too small for that group
PCRE2_ERROR_UNSET the group did not participate in the match
PCRE2_ERROR_NOMEMORY memory could not be obtained
.sp
.P .P
There is a complete description of the PCRE2 native API in the There is a complete description of the PCRE2 native API in the
.\" HREF .\" HREF


@ -1,4 +1,4 @@
.TH PCRE2API 3 "01 December 2014" "PCRE2 10.00" .TH PCRE2API 3 "13 December 2014" "PCRE2 10.00"
.SH NAME .SH NAME
PCRE2 - Perl-compatible regular expressions (revised API) PCRE2 - Perl-compatible regular expressions (revised API)
.sp .sp
@ -2307,10 +2307,19 @@ attempt to get memory failed for \fBpcre2_substring_get_bynumber()\fP.
.sp .sp
PCRE2_ERROR_NOSUBSTRING PCRE2_ERROR_NOSUBSTRING
.sp .sp
No substring with the given number was captured. This could be because there is There is no substring with that number in the pattern, that is, the number is
no capturing group of that number in the pattern, or because the group with greater than the number of capturing parentheses.
that number did not participate in the match, or because the ovector was too .sp
small to capture that group. PCRE2_ERROR_UNAVAILABLE
.sp
The substring number, though not greater than the number of captures in the
pattern, is greater than the number of slots in the ovector, so the substring
could not be captured.
.sp
PCRE2_ERROR_UNSET
.sp
The substring did not participate in the match. For example, if the pattern is
(abc)|(def) and the subject is "def", substring number 1 is unset.
. .
. .
.SH "EXTRACTING A LIST OF ALL CAPTURED SUBSTRINGS" .SH "EXTRACTING A LIST OF ALL CAPTURED SUBSTRINGS"
@ -2345,7 +2354,7 @@ capturing subpattern number \fIn+1\fP matches some part of the subject, but
subpattern \fIn\fP has not been used at all, it returns an empty string. This subpattern \fIn\fP has not been used at all, it returns an empty string. This
can be distinguished from a genuine zero-length substring by inspecting the can be distinguished from a genuine zero-length substring by inspecting the
appropriate offset in the ovector, which contains PCRE2_UNSET for unset appropriate offset in the ovector, which contains PCRE2_UNSET for unset
substrings. substrings, or by calling \fBpcre2_substring_length_bynumber()\fP.
. .
. .
.\" HTML <a name="extractbyname"></a> .\" HTML <a name="extractbyname"></a>
@ -2384,8 +2393,10 @@ that name.
Given the number, you can extract the substring directly, or use one of the Given the number, you can extract the substring directly, or use one of the
functions described above. For convenience, there are also "byname" functions functions described above. For convenience, there are also "byname" functions
that correspond to the "bynumber" functions, the only difference being that the that correspond to the "bynumber" functions, the only difference being that the
second argument is a name instead of a number. However, if PCRE2_DUPNAMES is second argument is a name instead of a number. If PCRE2_DUPNAMES is
set and there are duplicate names, the behaviour may not be what you want. set and there are duplicate names, these functions return the first named
string that is set. PCRE2_ERROR_UNSET is returned only if all groups of the
same name are unset.
.P .P
\fBWarning:\fP If the pattern uses the (?| feature to set up multiple \fBWarning:\fP If the pattern uses the (?| feature to set up multiple
subpatterns with the same number, as described in the subpatterns with the same number, as described in the
@ -2485,9 +2496,9 @@ documentation.
.P .P
When duplicates are present, \fBpcre2_substring_copy_byname()\fP and When duplicates are present, \fBpcre2_substring_copy_byname()\fP and
\fBpcre2_substring_get_byname()\fP return the first substring corresponding to \fBpcre2_substring_get_byname()\fP return the first substring corresponding to
the given name that is set. If none are set, PCRE2_ERROR_NOSUBSTRING is the given name that is set. Only if none are set is PCRE2_ERROR_UNSET
returned. The \fBpcre2_substring_number_from_name()\fP function returns returned. The \fBpcre2_substring_number_from_name()\fP function returns the
the error PCRE2_ERROR_NOUNIQUESUBSTRING. error PCRE2_ERROR_NOUNIQUESUBSTRING when there are duplicate names.
.P .P
If you want to get full details of all captured substrings for a given name, If you want to get full details of all captured substrings for a given name,
you must use the \fBpcre2_substring_nametable_scan()\fP function. The first you must use the \fBpcre2_substring_nametable_scan()\fP function. The first
@ -2735,6 +2746,6 @@ Cambridge, England.
.rs .rs
.sp .sp
.nf .nf
Last updated: 01 December 2014 Last updated: 13 December 2014
Copyright (c) 1997-2014 University of Cambridge. Copyright (c) 1997-2014 University of Cambridge.
.fi .fi


@ -80,20 +80,20 @@ uint8_t, UCHAR_MAX, etc are defined. */
extern "C" { extern "C" {
#endif #endif
/* The following options can be passed to pcre2_compile(), pcre2_match(), or /* The following option bits can be passed to pcre2_compile(), pcre2_match(),
pcre2_dfa_match(). PCRE2_NO_UTF_CHECK affects only the function to which it is or pcre2_dfa_match(). PCRE2_NO_UTF_CHECK affects only the function to which it
passed. Put these bits at the most significant end of the options word so is passed. Put these bits at the most significant end of the options word so
others can be added next to them */ others can be added next to them */
#define PCRE2_ANCHORED 0x80000000u #define PCRE2_ANCHORED 0x80000000u
#define PCRE2_NO_UTF_CHECK 0x40000000u #define PCRE2_NO_UTF_CHECK 0x40000000u
/* Other options that can be passed to pcre2_compile(). They may affect /* The following option bits can be passed only to pcre2_compile(). However,
compilation, JIT compilation, and/or interpretive execution. The following tags they may affect compilation, JIT compilation, and/or interpretive execution.
indicate which: The following tags indicate which:
C alters what is compiled C alters what is compiled by pcre2_compile()
J alters what JIT compiles J alters what is compiled by pcre2_jit_compile()
M is inspected during pcre2_match() execution M is inspected during pcre2_match() execution
D is inspected during pcre2_dfa_match() execution D is inspected during pcre2_dfa_match() execution
*/ */
@ -224,7 +224,8 @@ context functions. */
#define PCRE2_ERROR_NULL (-50) #define PCRE2_ERROR_NULL (-50)
#define PCRE2_ERROR_RECURSELOOP (-51) #define PCRE2_ERROR_RECURSELOOP (-51)
#define PCRE2_ERROR_RECURSIONLIMIT (-52) #define PCRE2_ERROR_RECURSIONLIMIT (-52)
#define PCRE2_ERROR_UNSET (-53) #define PCRE2_ERROR_UNAVAILABLE (-53)
#define PCRE2_ERROR_UNSET (-54)
/* Request types for pcre2_pattern_info() */ /* Request types for pcre2_pattern_info() */


@ -221,12 +221,13 @@ static const char match_error_texts[] =
"JIT stack limit reached\0" "JIT stack limit reached\0"
"match limit exceeded\0" "match limit exceeded\0"
"no more memory\0" "no more memory\0"
"unknown or unset substring\0" "unknown substring\0"
"non-unique substring name\0" "non-unique substring name\0"
/* 50 */ /* 50 */
"NULL argument passed\0" "NULL argument passed\0"
"nested recursion at the same subject position\0" "nested recursion at the same subject position\0"
"recursion limit exceeded\0" "recursion limit exceeded\0"
"requested value is not available\0"
"requested value is not set\0" "requested value is not set\0"
; ;
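
The message table above is a single character array in which each text ends with an explicit `\0`, so a message is found by skipping over the preceding strings. The lookup code itself is not part of this diff; the following is only a minimal sketch of that technique. The names `demo_texts` and `demo_error_text` are hypothetical, the excerpt covers only codes -48 to -54, and the position of -49 is inferred from the `/* 50 */` marker rather than stated in the diff.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical excerpt of the table: messages for -48 .. -54, in order. */
static const char demo_texts[] =
  "unknown substring\0"                              /* -48 */
  "non-unique substring name\0"                      /* -49 (inferred) */
  "NULL argument passed\0"                           /* -50 */
  "nested recursion at the same subject position\0"  /* -51 */
  "recursion limit exceeded\0"                       /* -52 */
  "requested value is not available\0"               /* -53 */
  "requested value is not set\0"                     /* -54 */
  ;

#define DEMO_FIRST_CODE (-48)
#define DEMO_LAST_CODE  (-54)

/* Return the text for a code in the excerpt, or NULL if out of range. */
static const char *demo_error_text(int code)
{
  const char *p = demo_texts;
  int skip;
  if (code > DEMO_FIRST_CODE || code < DEMO_LAST_CODE) return NULL;
  for (skip = DEMO_FIRST_CODE - code; skip > 0; skip--)
    {
    assert(*p != '\0');     /* never walk past the end of the table */
    p += strlen(p) + 1;     /* step over one text and its terminator */
    }
  return p;
}
```

Because the new "not available" text is inserted before "not set", every later entry shifts by one position, which is why PCRE2_ERROR_UNSET moves from -53 to -54 in the header.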


@ -7023,8 +7023,7 @@ if (rc == MATCH_MATCH || rc == MATCH_ACCEPT)
/* Set the return code to the number of captured strings, or 0 if there were /* Set the return code to the number of captured strings, or 0 if there were
too many to fit into the ovector. */ too many to fit into the ovector. */
match_data->rc = ((mb->capture_last & OVFLBIT) != 0 && match_data->rc = ((mb->capture_last & OVFLBIT) != 0)?
mb->end_offset_top >= arg_offset_max)?
0 : mb->end_offset_top/2; 0 : mb->end_offset_top/2;
/* If there is space in the offset vector, set any unused pairs at the end to /* If there is space in the offset vector, set any unused pairs at the end to


@ -63,8 +63,9 @@ Arguments:
Returns: if successful: zero Returns: if successful: zero
if not successful, a negative error code: if not successful, a negative error code:
PCRE2_ERROR_NOMEMORY: buffer too small (1) an error from nametable_scan()
PCRE2_ERROR_NOSUBSTRING: no such captured substring (2) an error from copy_bynumber()
(3) PCRE2_ERROR_UNSET: all named groups are unset
*/ */
PCRE2_EXP_DEFN int PCRE2_CALL_CONVENTION PCRE2_EXP_DEFN int PCRE2_CALL_CONVENTION
@ -83,7 +84,7 @@ for (entry = first; entry <= last; entry += entrysize)
if (n < match_data->oveccount && match_data->ovector[n*2] != PCRE2_UNSET) if (n < match_data->oveccount && match_data->ovector[n*2] != PCRE2_UNSET)
return pcre2_substring_copy_bynumber(match_data, n, buffer, sizeptr); return pcre2_substring_copy_bynumber(match_data, n, buffer, sizeptr);
} }
return PCRE2_ERROR_NOSUBSTRING; return PCRE2_ERROR_UNSET;
} }
@ -104,25 +105,24 @@ Arguments:
Returns: if successful: 0 Returns: if successful: 0
if not successful, a negative error code: if not successful, a negative error code:
PCRE2_ERROR_NOMEMORY: buffer too small PCRE2_ERROR_NOMEMORY: buffer too small
PCRE2_ERROR_NOSUBSTRING: no such captured substring PCRE2_ERROR_NOSUBSTRING: no such substring
PCRE2_ERROR_UNAVAILABLE: ovector too small
PCRE2_ERROR_UNSET: substring is not set
*/ */
PCRE2_EXP_DEFN int PCRE2_CALL_CONVENTION PCRE2_EXP_DEFN int PCRE2_CALL_CONVENTION
pcre2_substring_copy_bynumber(pcre2_match_data *match_data, pcre2_substring_copy_bynumber(pcre2_match_data *match_data,
uint32_t stringnumber, PCRE2_UCHAR *buffer, PCRE2_SIZE *sizeptr) uint32_t stringnumber, PCRE2_UCHAR *buffer, PCRE2_SIZE *sizeptr)
{ {
PCRE2_SIZE left, right; int rc;
PCRE2_SIZE p = 0; PCRE2_SIZE size;
PCRE2_SPTR subject = match_data->subject; rc = pcre2_substring_length_bynumber(match_data, stringnumber, &size);
if (stringnumber >= match_data->oveccount || if (rc < 0) return rc;
stringnumber > match_data->code->top_bracket || if (size + 1 > *sizeptr) return PCRE2_ERROR_NOMEMORY;
(left = match_data->ovector[stringnumber*2]) == PCRE2_UNSET) memcpy(buffer, match_data->subject + match_data->ovector[stringnumber*2],
return PCRE2_ERROR_NOSUBSTRING; CU2BYTES(size));
right = match_data->ovector[stringnumber*2+1]; buffer[size] = 0;
if (right - left + 1 > *sizeptr) return PCRE2_ERROR_NOMEMORY; *sizeptr = size;
while (left < right) buffer[p++] = subject[left++];
buffer[p] = 0;
*sizeptr = p;
return 0; return 0;
} }
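
The buffer contract of the rewritten copy function (capacity in, length out, one extra unit for the terminating zero) can be sketched without the library. This is a simplified model under stated assumptions: it works on plain `char` strings rather than PCRE2 code units, `demo_copy` is a hypothetical name, and `DEMO_ERROR_NOMEMORY` is a placeholder value, not the library's.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define DEMO_ERROR_NOMEMORY (-1)  /* placeholder value, not the library's */

/* Copy subject[left..right) into buffer as a zero-terminated string.
   On entry *sizeptr is the buffer capacity in characters; on success it
   becomes the string length, excluding the terminating zero. */
static int demo_copy(const char *subject, size_t left, size_t right,
  char *buffer, size_t *sizeptr)
{
  size_t size;
  assert(subject != NULL && buffer != NULL && sizeptr != NULL);
  size = (left > right)? 0 : right - left;   /* start beyond end: empty */
  if (size + 1 > *sizeptr) return DEMO_ERROR_NOMEMORY;
  memcpy(buffer, subject + left, size);
  buffer[size] = 0;
  *sizeptr = size;
  return 0;
}
```

On failure the capacity in `*sizeptr` is left untouched, so a caller can retry with a larger buffer.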
@ -144,8 +144,9 @@ Arguments:
Returns: if successful: zero Returns: if successful: zero
if not successful, a negative value: if not successful, a negative value:
PCRE2_ERROR_NOMEMORY: couldn't get memory (1) an error from nametable_scan()
PCRE2_ERROR_NOSUBSTRING: no such captured substring (2) an error from get_bynumber()
(3) PCRE2_ERROR_UNSET: all named groups are unset
*/ */
PCRE2_EXP_DEFN int PCRE2_CALL_CONVENTION PCRE2_EXP_DEFN int PCRE2_CALL_CONVENTION
@ -164,7 +165,7 @@ for (entry = first; entry <= last; entry += entrysize)
if (n < match_data->oveccount && match_data->ovector[n*2] != PCRE2_UNSET) if (n < match_data->oveccount && match_data->ovector[n*2] != PCRE2_UNSET)
return pcre2_substring_get_bynumber(match_data, n, stringptr, sizeptr); return pcre2_substring_get_bynumber(match_data, n, stringptr, sizeptr);
} }
return PCRE2_ERROR_NOSUBSTRING; return PCRE2_ERROR_UNSET;
} }
@ -182,37 +183,32 @@ Arguments:
stringptr where to put a pointer to the new memory stringptr where to put a pointer to the new memory
sizeptr where to put the size of the substring sizeptr where to put the size of the substring
Returns: if successful: zero Returns: if successful: 0
if not successful a negative error code: if not successful, a negative error code:
PCRE2_ERROR_NOMEMORY: failed to get memory PCRE2_ERROR_NOMEMORY: failed to get memory
PCRE2_ERROR_NOSUBSTRING: substring not present PCRE2_ERROR_NOSUBSTRING: no such substring
PCRE2_ERROR_UNAVAILABLE: ovector too small
PCRE2_ERROR_UNSET: substring is not set
*/ */
PCRE2_EXP_DEFN int PCRE2_CALL_CONVENTION PCRE2_EXP_DEFN int PCRE2_CALL_CONVENTION
pcre2_substring_get_bynumber(pcre2_match_data *match_data, pcre2_substring_get_bynumber(pcre2_match_data *match_data,
uint32_t stringnumber, PCRE2_UCHAR **stringptr, PCRE2_SIZE *sizeptr) uint32_t stringnumber, PCRE2_UCHAR **stringptr, PCRE2_SIZE *sizeptr)
{ {
PCRE2_SIZE left, right; int rc;
PCRE2_SIZE p = 0; PCRE2_SIZE size;
void *block;
PCRE2_UCHAR *yield; PCRE2_UCHAR *yield;
rc = pcre2_substring_length_bynumber(match_data, stringnumber, &size);
PCRE2_SPTR subject = match_data->subject; if (rc < 0) return rc;
if (stringnumber >= match_data->oveccount || yield = PRIV(memctl_malloc)(sizeof(pcre2_memctl) +
stringnumber > match_data->code->top_bracket || (size + 1)*PCRE2_CODE_UNIT_WIDTH, (pcre2_memctl *)match_data);
(left = match_data->ovector[stringnumber*2]) == PCRE2_UNSET) if (yield == NULL) return PCRE2_ERROR_NOMEMORY;
return PCRE2_ERROR_NOSUBSTRING; yield = (PCRE2_UCHAR *)(((char *)yield) + sizeof(pcre2_memctl));
right = match_data->ovector[stringnumber*2+1]; memcpy(yield, match_data->subject + match_data->ovector[stringnumber*2],
CU2BYTES(size));
block = PRIV(memctl_malloc)(sizeof(pcre2_memctl) + yield[size] = 0;
(right-left+1)*PCRE2_CODE_UNIT_WIDTH, (pcre2_memctl *)match_data);
if (block == NULL) return PCRE2_ERROR_NOMEMORY;
yield = (PCRE2_UCHAR *)((char *)block + sizeof(pcre2_memctl));
while (left < right) yield[p++] = subject[left++];
yield[p] = 0;
*stringptr = yield; *stringptr = yield;
*sizeptr = p; *sizeptr = size;
return 0; return 0;
} }
@ -260,14 +256,14 @@ PCRE2_SPTR last;
PCRE2_SPTR entry; PCRE2_SPTR entry;
int entrysize = pcre2_substring_nametable_scan(match_data->code, stringname, int entrysize = pcre2_substring_nametable_scan(match_data->code, stringname,
&first, &last); &first, &last);
if (entrysize <= 0) return entrysize; if (entrysize < 0) return entrysize;
for (entry = first; entry <= last; entry += entrysize) for (entry = first; entry <= last; entry += entrysize)
{ {
uint32_t n = GET2(entry, 0); uint32_t n = GET2(entry, 0);
if (n < match_data->oveccount && match_data->ovector[n*2] != PCRE2_UNSET) if (n < match_data->oveccount && match_data->ovector[n*2] != PCRE2_UNSET)
return pcre2_substring_length_bynumber(match_data, n, sizeptr); return pcre2_substring_length_bynumber(match_data, n, sizeptr);
} }
return PCRE2_ERROR_NOSUBSTRING; return PCRE2_ERROR_UNSET;
} }
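
The three "byname" loops in this file share one rule: among the groups that carry the name, use the first whose ovector slot exists and is set, and report PCRE2_ERROR_UNSET only when none qualifies. A minimal sketch of that rule follows, assuming the group numbers have already been extracted into a plain array. `demo_first_set_group` is a hypothetical helper, and `DEMO_UNSET` is assumed to be the all-ones size value.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_UNSET        (~(size_t)0)  /* assumed stand-in for PCRE2_UNSET */
#define DEMO_ERROR_UNSET  (-54)         /* value taken from the header hunk */

/* groups[] lists the numbers of all groups sharing one name, in table order.
   Returns the first number whose ovector slot exists and is set. */
static int demo_first_set_group(const uint32_t *groups, size_t ngroups,
  const size_t *ovector, uint32_t oveccount)
{
  size_t i;
  assert(groups != NULL && ovector != NULL);
  for (i = 0; i < ngroups; i++)
    {
    uint32_t n = groups[i];
    if (n < oveccount && ovector[n*2] != DEMO_UNSET) return (int)n;
    }
  return DEMO_ERROR_UNSET;
}
```

Note that a group lying beyond the end of the ovector is skipped in the same way as an unset one, so the loop never yields PCRE2_ERROR_UNAVAILABLE by itself.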
@ -276,27 +272,37 @@ return PCRE2_ERROR_NOSUBSTRING;
* Get length of a numbered substring * * Get length of a numbered substring *
*************************************************/ *************************************************/
/* This function returns the length of a captured substring. /* This function returns the length of a captured substring. If the start is
beyond the end (which can happen when \K is used in an assertion), it sets the
length to zero.
Arguments: Arguments:
match_data pointer to match data match_data pointer to match data
stringnumber the number of the required substring stringnumber the number of the required substring
sizeptr where to put the length sizeptr where to put the length, if not NULL
Returns: 0 if successful, else a negative error number Returns: if successful: 0
if not successful, a negative error code:
PCRE2_ERROR_NOSUBSTRING: no such substring
PCRE2_ERROR_UNAVAILABLE: ovector is too small
PCRE2_ERROR_UNSET: substring is not set
*/ */
PCRE2_EXP_DEFN int PCRE2_CALL_CONVENTION PCRE2_EXP_DEFN int PCRE2_CALL_CONVENTION
pcre2_substring_length_bynumber(pcre2_match_data *match_data, pcre2_substring_length_bynumber(pcre2_match_data *match_data,
uint32_t stringnumber, PCRE2_SIZE *sizeptr) uint32_t stringnumber, PCRE2_SIZE *sizeptr)
{ {
if (stringnumber >= match_data->oveccount || PCRE2_SIZE left, right;
stringnumber > match_data->code->top_bracket || if (stringnumber > match_data->code->top_bracket)
match_data->ovector[stringnumber*2] == PCRE2_UNSET)
return PCRE2_ERROR_NOSUBSTRING; return PCRE2_ERROR_NOSUBSTRING;
*sizeptr = match_data->ovector[stringnumber*2 + 1] - if (stringnumber >= match_data->oveccount)
match_data->ovector[stringnumber*2]; return PCRE2_ERROR_UNAVAILABLE;
return 0; if (match_data->ovector[stringnumber*2] == PCRE2_UNSET)
return PCRE2_ERROR_UNSET;
left = match_data->ovector[stringnumber*2];
right = match_data->ovector[stringnumber*2+1];
if (sizeptr != NULL) *sizeptr = (left > right)? 0 : right - left;
return 0;
} }
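
The order of the three checks decides which error a caller sees. The sketch below models the new function with a reduced, hypothetical `demo_match` structure in place of the real match data block. The error values are the ones shown in the header hunk and the test output, and `DEMO_UNSET` is assumed to be the all-ones size value.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_UNSET              (~(size_t)0)
#define DEMO_ERROR_NOSUBSTRING  (-48)
#define DEMO_ERROR_UNAVAILABLE  (-53)
#define DEMO_ERROR_UNSET        (-54)

typedef struct {
  uint32_t top_bracket;   /* highest group number in the pattern */
  uint32_t oveccount;     /* number of offset pairs in the ovector */
  const size_t *ovector;  /* oveccount pairs of (start, end) */
} demo_match;

static int demo_length(const demo_match *md, uint32_t n, size_t *sizeptr)
{
  size_t left, right;
  assert(md != NULL && md->ovector != NULL);
  if (n > md->top_bracket) return DEMO_ERROR_NOSUBSTRING;   /* no such group */
  if (n >= md->oveccount) return DEMO_ERROR_UNAVAILABLE;    /* no slot for it */
  if (md->ovector[n*2] == DEMO_UNSET) return DEMO_ERROR_UNSET;
  left = md->ovector[n*2];
  right = md->ovector[n*2 + 1];
  /* With \K in an assertion the start can lie beyond the end: report 0. */
  if (sizeptr != NULL) *sizeptr = (left > right)? 0 : right - left;
  return 0;
}
```

With three groups in the pattern and a two-pair ovector in which group 1 is unset, this model gives -54 for group 1, -53 for groups 2 and 3, and -48 for group 4, which is the sequence in the new `/(a)(b)|(c)/` test added by this commit.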
@ -334,7 +340,8 @@ PCRE2_UCHAR **listp;
PCRE2_UCHAR *sp; PCRE2_UCHAR *sp;
PCRE2_SIZE *ovector; PCRE2_SIZE *ovector;
if ((count = match_data->rc) < 0) return count; if ((count = match_data->rc) < 0) return count; /* Match failed */
if (count == 0) count = match_data->oveccount; /* Ovector too small */
count2 = 2*count; count2 = 2*count;
ovector = match_data->ovector; ovector = match_data->ovector;
@ -342,7 +349,11 @@ size = sizeof(pcre2_memctl) + sizeof(PCRE2_UCHAR *); /* For final NULL */
if (lengthsptr != NULL) size += sizeof(PCRE2_SIZE) * count; /* For lengths */ if (lengthsptr != NULL) size += sizeof(PCRE2_SIZE) * count; /* For lengths */
for (i = 0; i < count2; i += 2) for (i = 0; i < count2; i += 2)
size += sizeof(PCRE2_UCHAR *) + CU2BYTES(ovector[i+1] - ovector[i] + 1); {
size += sizeof(PCRE2_UCHAR *) + CU2BYTES(1);
if (ovector[i+1] > ovector[i]) size += CU2BYTES(ovector[i+1] - ovector[i]);
}
memp = PRIV(memctl_malloc)(size, (pcre2_memctl *)match_data); memp = PRIV(memctl_malloc)(size, (pcre2_memctl *)match_data);
if (memp == NULL) return PCRE2_ERROR_NOMEMORY; if (memp == NULL) return PCRE2_ERROR_NOMEMORY;
@ -362,7 +373,7 @@ else
for (i = 0; i < count2; i += 2) for (i = 0; i < count2; i += 2)
{ {
size = ovector[i+1] - ovector[i]; size = (ovector[i+1] > ovector[i])? (ovector[i+1] - ovector[i]) : 0;
memcpy(sp, match_data->subject + ovector[i], CU2BYTES(size)); memcpy(sp, match_data->subject + ovector[i], CU2BYTES(size));
*listp++ = sp; *listp++ = sp;
if (lensp != NULL) *lensp++ = size; if (lensp != NULL) *lensp++ = size;
@ -400,8 +411,8 @@ memctl->free(memctl, memctl->memory_data);
/* This function scans the nametable for a given name, using binary chop. It /* This function scans the nametable for a given name, using binary chop. It
returns either two pointers to the entries in the table, or, if no pointers are returns either two pointers to the entries in the table, or, if no pointers are
given, the number of a group with the given name. If duplicate names are given, the number of a unique group with the given name. If duplicate names are
permitted, this may not be unique. permitted, and the name is not unique, an error is generated.
Arguments: Arguments:
code the compiled regex code the compiled regex
@ -409,10 +420,12 @@ Arguments:
firstptr where to put the pointer to the first entry firstptr where to put the pointer to the first entry
lastptr where to put the pointer to the last entry lastptr where to put the pointer to the last entry
Returns: if firstptr and lastptr are NULL, a group number for a Returns: PCRE2_ERROR_NOSUBSTRING if the name is not found
unique substring, or PCRE2_ERROR_NOUNIQUESUBSTRING otherwise, if firstptr and lastptr are NULL:
otherwise, the length of each entry, or a negative number a group number for a unique substring
(PCRE2_ERROR_NOSUBSTRING) if not found else PCRE2_ERROR_NOUNIQUESUBSTRING
otherwise:
the length of each entry, having set firstptr and lastptr
*/ */
PCRE2_EXP_DEFN int PCRE2_CALL_CONVENTION PCRE2_EXP_DEFN int PCRE2_CALL_CONVENTION
@ -446,8 +459,8 @@ while (top > bot)
if (PRIV(strcmp)(stringname, (last + entrysize + IMM2_SIZE)) != 0) break; if (PRIV(strcmp)(stringname, (last + entrysize + IMM2_SIZE)) != 0) break;
last += entrysize; last += entrysize;
} }
if (firstptr == NULL) if (firstptr == NULL) return (first == last)?
return (first == last)? (int)GET2(entry, 0) : PCRE2_ERROR_NOUNIQUESUBSTRING; (int)GET2(entry, 0) : PCRE2_ERROR_NOUNIQUESUBSTRING;
*firstptr = first; *firstptr = first;
*lastptr = last; *lastptr = last;
return entrysize; return entrysize;
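
The return convention described in the revised comment (a number for a unique name, an error for a duplicate when no pointers are passed, otherwise the entry size plus the first and last entries) can be modelled with a sorted array of structs. This is only a sketch: the real name table is a packed byte array, `demo_entry` and `demo_scan` are hypothetical, and the value -49 for the non-unique error is inferred from the message table order.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define DEMO_ERROR_NOSUBSTRING        (-48)
#define DEMO_ERROR_NOUNIQUESUBSTRING  (-49)  /* inferred, see lead-in */

typedef struct { const char *name; int number; } demo_entry;

/* table[] must be sorted by name, so that equal names are adjacent. */
static int demo_scan(const demo_entry *table, size_t count, const char *name,
  const demo_entry **firstptr, const demo_entry **lastptr)
{
  size_t bot = 0, top = count;
  assert(table != NULL && name != NULL);
  while (top > bot)
    {
    size_t mid = bot + (top - bot)/2;
    int c = strcmp(name, table[mid].name);
    if (c == 0)
      {
      /* Widen the hit to cover every adjacent entry with the same name. */
      size_t first = mid, last = mid;
      while (first > 0 && strcmp(name, table[first-1].name) == 0) first--;
      while (last + 1 < count && strcmp(name, table[last+1].name) == 0) last++;
      if (firstptr == NULL)
        return (first == last)? table[mid].number : DEMO_ERROR_NOUNIQUESUBSTRING;
      *firstptr = table + first;
      if (lastptr != NULL) *lastptr = table + last;
      return (int)sizeof(demo_entry);
      }
    if (c > 0) bot = mid + 1; else top = mid;
    }
  return DEMO_ERROR_NOSUBSTRING;
}
```

A caller that needs every group of a duplicated name passes both pointers and steps from the first entry to the last by the returned size, as the "byname" functions in this file do.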

testdata/testinput2

@ -4084,4 +4084,11 @@ a random value. /Ix
"(?(?=)?==)(((((((((?=)))))))))" "(?(?=)?==)(((((((((?=)))))))))"
a a
/(a)(b)|(c)/
XcX\=ovector=2,get=1,get=2,get=3,get=4,getall
/x(?=ab\K)/
xab\=get=0
xab\=copy=0
# End of testinput2 # End of testinput2

testdata/testoutput2

@ -993,7 +993,7 @@ Subject length lower bound = 4
0: abcd 0: abcd
1: a 1: a
2: d 2: d
Copy substring 5 failed (-48): unknown or unset substring Copy substring 5 failed (-48): unknown substring
/(.{20})/I /(.{20})/I
Capturing subpattern count = 1 Capturing subpattern count = 1
@ -1047,9 +1047,9 @@ Subject length lower bound = 4
2: <unset> 2: <unset>
3: f 3: f
1G a (1) 1G a (1)
Get substring 2 failed (-48): unknown or unset substring Get substring 2 failed (-54): requested value is not set
3G f (1) 3G f (1)
Get substring 4 failed (-48): unknown or unset substring Get substring 4 failed (-48): unknown substring
0L adef 0L adef
1L a 1L a
2L 2L
@ -1062,7 +1062,7 @@ Get substring 4 failed (-48): unknown or unset substring
1G bc (2) 1G bc (2)
2G bc (2) 2G bc (2)
3G f (1) 3G f (1)
Get substring 4 failed (-48): unknown or unset substring Get substring 4 failed (-48): unknown substring
0L bcdef 0L bcdef
1L bc 1L bc
2L bc 2L bc
@ -4363,7 +4363,7 @@ Subject length lower bound = 8
1: cd 1: cd
2: gh 2: gh
Number not found for group 'three' Number not found for group 'three'
Copy substring 'three' failed (-48): unknown or unset substring Copy substring 'three' failed (-48): unknown substring
/(?P<Tes>)(?P<Test>)/IB /(?P<Tes>)(?P<Test>)/IB
------------------------------------------------------------------ ------------------------------------------------------------------
@ -5731,7 +5731,7 @@ No match
1: a1 1: a1
2: a1 2: a1
Number not found for group 'Z' Number not found for group 'Z'
Copy substring 'Z' failed (-48): unknown or unset substring Copy substring 'Z' failed (-48): unknown substring
C a1 (2) A (non-unique) C a1 (2) A (non-unique)
/(?|(?<a>)(?<b>)(?<a>)|(?<a>)(?<b>)(?<a>))/I,dupnames /(?|(?<a>)(?<b>)(?<a>)|(?<a>)(?<b>)(?<a>))/I,dupnames
@ -5772,7 +5772,7 @@ Subject length lower bound = 2
C a (1) A (non-unique) C a (1) A (non-unique)
cd\=copy=A cd\=copy=A
0: cd 0: cd
Copy substring 'A' failed (-48): unknown or unset substring Copy substring 'A' failed (-54): requested value is not set
/^(?P<A>a)(?P<A>b)|cd(?P<A>ef)(?P<A>gh)/I,dupnames /^(?P<A>a)(?P<A>b)|cd(?P<A>ef)(?P<A>gh)/I,dupnames
Capturing subpattern count = 4 Capturing subpattern count = 4
@ -5817,7 +5817,7 @@ No match
1: a1 1: a1
2: a1 2: a1
Number not found for group 'Z' Number not found for group 'Z'
Get substring 'Z' failed (-48): unknown or unset substring Get substring 'Z' failed (-48): unknown substring
G a1 (2) A (non-unique) G a1 (2) A (non-unique)
/^(?P<A>a)(?P<A>b)/I,dupnames /^(?P<A>a)(?P<A>b)/I,dupnames
@ -5848,7 +5848,7 @@ Subject length lower bound = 2
G a (1) A (non-unique) G a (1) A (non-unique)
cd\=get=A cd\=get=A
0: cd 0: cd
Get substring 'A' failed (-48): unknown or unset substring Get substring 'A' failed (-54): requested value is not set
/^(?P<A>a)(?P<A>b)|cd(?P<A>ef)(?P<A>gh)/I,dupnames /^(?P<A>a)(?P<A>b)|cd(?P<A>ef)(?P<A>gh)/I,dupnames
Capturing subpattern count = 4 Capturing subpattern count = 4
@ -13659,11 +13659,11 @@ Failed: error -35: invalid replacement string
/abc/replace=a$bad /abc/replace=a$bad
123abc 123abc
Failed: error -48: unknown or unset substring Failed: error -48: unknown substring
/abc/replace=a${A234567890123456789_123456789012}z /abc/replace=a${A234567890123456789_123456789012}z
123abc 123abc
Failed: error -48: unknown or unset substring Failed: error -48: unknown substring
/abc/replace=a${A23456789012345678901234567890123}z /abc/replace=a${A23456789012345678901234567890123}z
123abc 123abc
@ -13715,4 +13715,26 @@ Failed: error -34: bad option value
a a
No match No match
/(a)(b)|(c)/
XcX\=ovector=2,get=1,get=2,get=3,get=4,getall
Matched, but too many substrings
0: c
1: <unset>
Get substring 1 failed (-54): requested value is not set
Get substring 2 failed (-53): requested value is not available
Get substring 3 failed (-53): requested value is not available
Get substring 4 failed (-48): unknown substring
0L c
1L
/x(?=ab\K)/
xab\=get=0
Start of matched string is beyond its end - displaying from end to start.
0: ab
0G (0)
xab\=copy=0
Start of matched string is beyond its end - displaying from end to start.
0: ab
0C (0)
# End of testinput2 # End of testinput2