Set subject field in match data to NULL after failed match.
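
The rule this commit applies can be modelled in a few lines of C. This is a minimal sketch under stated assumptions, not PCRE2 source: `demo_match_data`, `demo_set_result()`, `demo_subject_is_set()` and the `DEMO_ERROR_*` constants are hypothetical stand-ins, and only the condition "full match or partial match" is taken from the diff below.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; NOT PCRE2's real definitions. */
#define DEMO_ERROR_NOMATCH (-1)
#define DEMO_ERROR_PARTIAL (-2)

typedef struct {
  const char *subject;  /* models the subject field in the match data */
  int rc;               /* models the stored return code */
} demo_match_data;

/* Record the outcome of a match attempt. The subject pointer is kept only
   for a full match (rc >= 0) or a partial match; otherwise it is NULL. */
void demo_set_result(demo_match_data *md, const char *subject, int rc)
{
  assert(md != NULL);
  md->rc = rc;
  md->subject = (rc >= 0 || rc == DEMO_ERROR_PARTIAL) ? subject : NULL;
}

/* Returns 1 if the block still refers to a subject, 0 otherwise. */
int demo_subject_is_set(const demo_match_data *md)
{
  return md->subject != NULL;
}
```

For callers, the practical consequence is to check the return code of the matching function before passing the match data to any substring extractor.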

Philip.Hazel 2018-10-19 15:31:16 +00:00
parent 7f801fb800
commit 8a0dd8955a
7 changed files with 535 additions and 529 deletions

View File

@@ -39,7 +39,9 @@ worked, though there were no bug reports.
 10. Implement PCRE2_COPY_MATCHED_SUBJECT for pcre2_match() (including JIT via
 pcre2_match()) and pcre2_dfa_match(), but *not* the pcre2_jit_match() fast
-path.
+path. Also, when a match fails, set the subject field in the match data to NULL
+for tidiness - none of the substring extractors should reference this after
+match failure.
 Version 10.32 10-September-2018

View File

@@ -1304,9 +1304,9 @@ NULL.
 <P>
 NOTE: When one of the matching functions is called, pointers to the compiled
 pattern and the subject string are set in the match data block so that they can
-be referenced by the substring extraction functions. After running a match, you
-must not free a compiled pattern or a subject string until after all
-operations on the
+be referenced by the substring extraction functions after a successful match.
+After running a match, you must not free a compiled pattern or a subject string
+until after all operations on the
 <a href="#matchdatablock">match data block</a>
 have taken place, unless, in the case of the subject string, you have used the
 PCRE2_COPY_MATCHED_SUBJECT option, which is described in the section entitled
@@ -2420,11 +2420,12 @@ on the error, and is detailed below.
 <P>
 When one of the matching functions is called, pointers to the compiled pattern
 and the subject string are set in the match data block so that they can be
-referenced by the extraction functions. After running a match, you must not
-free a compiled pattern or a subject string until after all operations on the
-match data block (for that match) have taken place, unless, in the case of the
-subject string, you have used the PCRE2_COPY_MATCHED_SUBJECT option, which is
-described in the section entitled "Option bits for <b>pcre2_match()</b>"
+referenced by the extraction functions after a successful match. After running
+a match, you must not free a compiled pattern or a subject string until after
+all operations on the match data block (for that match) have taken place,
+unless, in the case of the subject string, you have used the
+PCRE2_COPY_MATCHED_SUBJECT option, which is described in the section entitled
+"Option bits for <b>pcre2_match()</b>"
 <a href="#matchoptions>">below.</a>
 </P>
 <P>
@@ -3756,7 +3757,7 @@ Cambridge, England.
 </P>
 <br><a name="SEC42" href="#TOC1">REVISION</a><br>
 <P>
-Last updated: 17 October 2018
+Last updated: 19 October 2018
 <br>
 Copyright &copy; 1997-2018 University of Cambridge.
 <br>

File diff suppressed because it is too large

View File

@@ -1,4 +1,4 @@
-.TH PCRE2API 3 "17 October 2018" "PCRE2 10.33"
+.TH PCRE2API 3 "19 October 2018" "PCRE2 10.33"
 .SH NAME
 PCRE2 - Perl-compatible regular expressions (revised API)
 .sp
@@ -1236,9 +1236,9 @@ NULL.
 .P
 NOTE: When one of the matching functions is called, pointers to the compiled
 pattern and the subject string are set in the match data block so that they can
-be referenced by the substring extraction functions. After running a match, you
-must not free a compiled pattern or a subject string until after all
-operations on the
+be referenced by the substring extraction functions after a successful match.
+After running a match, you must not free a compiled pattern or a subject string
+until after all operations on the
 .\" HTML <a href="#matchdatablock">
 .\" </a>
 match data block
@@ -2394,11 +2394,12 @@ on the error, and is detailed below.
 .P
 When one of the matching functions is called, pointers to the compiled pattern
 and the subject string are set in the match data block so that they can be
-referenced by the extraction functions. After running a match, you must not
-free a compiled pattern or a subject string until after all operations on the
-match data block (for that match) have taken place, unless, in the case of the
-subject string, you have used the PCRE2_COPY_MATCHED_SUBJECT option, which is
-described in the section entitled "Option bits for \fBpcre2_match()\fP"
+referenced by the extraction functions after a successful match. After running
+a match, you must not free a compiled pattern or a subject string until after
+all operations on the match data block (for that match) have taken place,
+unless, in the case of the subject string, you have used the
+PCRE2_COPY_MATCHED_SUBJECT option, which is described in the section entitled
+"Option bits for \fBpcre2_match()\fP"
 .\" HTML <a href="#matchoptions>">
 .\" </a>
 below.
@@ -3767,6 +3768,6 @@ Cambridge, England.
 .rs
 .sp
 .nf
-Last updated: 17 October 2018
+Last updated: 19 October 2018
 Copyright (c) 1997-2018 University of Cambridge.
 .fi

View File

@@ -3540,8 +3540,7 @@ if ((match_data->flags & PCRE2_MD_COPIED_SUBJECT) != 0)
 /* Fill in fields that are always returned in the match data. */
 match_data->code = re;
-match_data->subject = subject;
-match_data->flags = 0;
+match_data->subject = NULL; /* Default for no match */
 match_data->mark = NULL;
 match_data->matchedby = PCRE2_MATCHEDBY_DFA_INTERPRETER;
@@ -3846,7 +3845,10 @@ for (;;)
 memcpy((void *)match_data->subject, subject, length);
 match_data->flags |= PCRE2_MD_COPIED_SUBJECT;
 }
+else
+{
+if (rc >= 0 || rc == PCRE2_ERROR_PARTIAL) match_data->subject = subject;
+}
 goto EXIT;
 }

View File

@@ -173,8 +173,7 @@ else
 if (rc > (int)oveccount)
 rc = 0;
 match_data->code = re;
-match_data->subject = subject;
-match_data->flags = 0;
+match_data->subject = (rc >= 0 || rc == PCRE2_ERROR_PARTIAL)? subject : NULL;
 match_data->rc = rc;
 match_data->startchar = arguments.startchar_ptr - subject;
 match_data->leftchar = 0;

View File

@@ -6174,7 +6174,7 @@ if (mcontext != NULL && mcontext->offset_limit != PCRE2_UNSET &&
 return PCRE2_ERROR_BADOFFSETLIMIT;
 /* If the match data block was previously used with PCRE2_COPY_MATCHED_SUBJECT,
-free the memory that was obtained. */
+free the memory that was obtained. Set the field to NULL for no match cases. */
 if ((match_data->flags & PCRE2_MD_COPIED_SUBJECT) != 0)
 {
@@ -6182,6 +6182,7 @@ if ((match_data->flags & PCRE2_MD_COPIED_SUBJECT) != 0)
 match_data->memctl.memory_data);
 match_data->flags &= ~PCRE2_MD_COPIED_SUBJECT;
 }
+match_data->subject = NULL;
 /* If the pattern was successfully studied with JIT support, run the JIT
 executable instead of the rest of this function. Most options must be set at
@@ -6846,8 +6847,6 @@ if (mb->match_frames != mb->stack_frames)
 /* Fill in fields that are always returned in the match data. */
 match_data->code = re;
-match_data->subject = subject;
-match_data->flags = 0;
 match_data->mark = mb->mark;
 match_data->matchedby = PCRE2_MATCHEDBY_INTERPRETER;
@@ -6864,7 +6863,6 @@ if (rc == MATCH_MATCH)
 match_data->leftchar = mb->start_used_ptr - subject;
 match_data->rightchar = ((mb->last_used_ptr > mb->end_match_ptr)?
 mb->last_used_ptr : mb->end_match_ptr) - subject;
 if ((options & PCRE2_COPY_MATCHED_SUBJECT) != 0)
 {
 length = CU2BYTES(length + was_zero_terminated);
@@ -6874,7 +6872,7 @@ if (rc == MATCH_MATCH)
 memcpy((void *)match_data->subject, subject, length);
 match_data->flags |= PCRE2_MD_COPIED_SUBJECT;
 }
+else match_data->subject = subject;
 return match_data->rc;
 }
@@ -6892,6 +6890,7 @@ if (rc != MATCH_NOMATCH && rc != PCRE2_ERROR_PARTIAL) match_data->rc = rc;
 else if (match_partial != NULL)
 {
+match_data->subject = subject;
 match_data->ovector[0] = match_partial - subject;
 match_data->ovector[1] = end_subject - subject;
 match_data->startchar = match_partial - subject;