Set subject field in match data to NULL after failed match.

Philip.Hazel 2018-10-19 15:31:16 +00:00
parent 7f801fb800
commit 8a0dd8955a
7 changed files with 535 additions and 529 deletions

View File

@@ -39,7 +39,9 @@ worked, though there were no bug reports.
 10. Implement PCRE2_COPY_MATCHED_SUBJECT for pcre2_match() (including JIT via
 pcre2_match()) and pcre2_dfa_match(), but *not* the pcre2_jit_match() fast
-path.
+path. Also, when a match fails, set the subject field in the match data to NULL
+for tidiness - none of the substring extractors should reference this after
+match failure.
 Version 10.32 10-September-2018

View File

@@ -1304,9 +1304,9 @@ NULL.
 <P>
 NOTE: When one of the matching functions is called, pointers to the compiled
 pattern and the subject string are set in the match data block so that they can
-be referenced by the substring extraction functions. After running a match, you
-must not free a compiled pattern or a subject string until after all
-operations on the
+be referenced by the substring extraction functions after a successful match.
+After running a match, you must not free a compiled pattern or a subject string
+until after all operations on the
 <a href="#matchdatablock">match data block</a>
 have taken place, unless, in the case of the subject string, you have used the
 PCRE2_COPY_MATCHED_SUBJECT option, which is described in the section entitled
@@ -2420,11 +2420,12 @@ on the error, and is detailed below.
 <P>
 When one of the matching functions is called, pointers to the compiled pattern
 and the subject string are set in the match data block so that they can be
-referenced by the extraction functions. After running a match, you must not
-free a compiled pattern or a subject string until after all operations on the
-match data block (for that match) have taken place, unless, in the case of the
-subject string, you have used the PCRE2_COPY_MATCHED_SUBJECT option, which is
-described in the section entitled "Option bits for <b>pcre2_match()</b>"
+referenced by the extraction functions after a successful match. After running
+a match, you must not free a compiled pattern or a subject string until after
+all operations on the match data block (for that match) have taken place,
+unless, in the case of the subject string, you have used the
+PCRE2_COPY_MATCHED_SUBJECT option, which is described in the section entitled
+"Option bits for <b>pcre2_match()</b>"
 <a href="#matchoptions>">below.</a>
 </P>
 <P>
@@ -3756,7 +3757,7 @@ Cambridge, England.
 </P>
 <br><a name="SEC42" href="#TOC1">REVISION</a><br>
 <P>
-Last updated: 17 October 2018
+Last updated: 19 October 2018
 <br>
 Copyright &copy; 1997-2018 University of Cambridge.
 <br>

View File

@@ -1301,12 +1301,13 @@ COMPILING A PATTERN
 NOTE: When one of the matching functions is called, pointers to the
 compiled pattern and the subject string are set in the match data block
-so that they can be referenced by the substring extraction functions.
-After running a match, you must not free a compiled pattern or a sub-
-ject string until after all operations on the match data block have
-taken place, unless, in the case of the subject string, you have used
-the PCRE2_COPY_MATCHED_SUBJECT option, which is described in the sec-
-tion entitled "Option bits for pcre2_match()" below.
+so that they can be referenced by the substring extraction functions
+after a successful match. After running a match, you must not free a
+compiled pattern or a subject string until after all operations on the
+match data block have taken place, unless, in the case of the subject
+string, you have used the PCRE2_COPY_MATCHED_SUBJECT option, which is
+described in the section entitled "Option bits for pcre2_match()"
+below.
 The options argument for pcre2_compile() contains various bit settings
 that affect the compilation. It should be zero if no options are
@@ -2387,12 +2388,13 @@ THE MATCH DATA BLOCK
 When one of the matching functions is called, pointers to the compiled
 pattern and the subject string are set in the match data block so that
-they can be referenced by the extraction functions. After running a
-match, you must not free a compiled pattern or a subject string until
-after all operations on the match data block (for that match) have
-taken place, unless, in the case of the subject string, you have used
-the PCRE2_COPY_MATCHED_SUBJECT option, which is described in the sec-
-tion entitled "Option bits for pcre2_match()" below.
+they can be referenced by the extraction functions after a successful
+match. After running a match, you must not free a compiled pattern or a
+subject string until after all operations on the match data block (for
+that match) have taken place, unless, in the case of the subject
+string, you have used the PCRE2_COPY_MATCHED_SUBJECT option, which is
+described in the section entitled "Option bits for pcre2_match()"
+below.
 When a match data block itself is no longer needed, it should be freed
 by calling pcre2_match_data_free(). If this function is called with a
@@ -3631,7 +3633,7 @@ AUTHOR
 REVISION
-Last updated: 17 October 2018
+Last updated: 19 October 2018
 Copyright (c) 1997-2018 University of Cambridge.
 ------------------------------------------------------------------------------

View File

@@ -1,4 +1,4 @@
-.TH PCRE2API 3 "17 October 2018" "PCRE2 10.33"
+.TH PCRE2API 3 "19 October 2018" "PCRE2 10.33"
 .SH NAME
 PCRE2 - Perl-compatible regular expressions (revised API)
 .sp
@@ -1236,9 +1236,9 @@ NULL.
 .P
 NOTE: When one of the matching functions is called, pointers to the compiled
 pattern and the subject string are set in the match data block so that they can
-be referenced by the substring extraction functions. After running a match, you
-must not free a compiled pattern or a subject string until after all
-operations on the
+be referenced by the substring extraction functions after a successful match.
+After running a match, you must not free a compiled pattern or a subject string
+until after all operations on the
 .\" HTML <a href="#matchdatablock">
 .\" </a>
 match data block
@@ -2394,11 +2394,12 @@ on the error, and is detailed below.
 .P
 When one of the matching functions is called, pointers to the compiled pattern
 and the subject string are set in the match data block so that they can be
-referenced by the extraction functions. After running a match, you must not
-free a compiled pattern or a subject string until after all operations on the
-match data block (for that match) have taken place, unless, in the case of the
-subject string, you have used the PCRE2_COPY_MATCHED_SUBJECT option, which is
-described in the section entitled "Option bits for \fBpcre2_match()\fP"
+referenced by the extraction functions after a successful match. After running
+a match, you must not free a compiled pattern or a subject string until after
+all operations on the match data block (for that match) have taken place,
+unless, in the case of the subject string, you have used the
+PCRE2_COPY_MATCHED_SUBJECT option, which is described in the section entitled
+"Option bits for \fBpcre2_match()\fP"
 .\" HTML <a href="#matchoptions>">
 .\" </a>
 below.
@@ -3767,6 +3768,6 @@ Cambridge, England.
 .rs
 .sp
 .nf
-Last updated: 17 October 2018
+Last updated: 19 October 2018
 Copyright (c) 1997-2018 University of Cambridge.
 .fi

View File

@@ -3540,8 +3540,7 @@ if ((match_data->flags & PCRE2_MD_COPIED_SUBJECT) != 0)
 /* Fill in fields that are always returned in the match data. */
 match_data->code = re;
-match_data->subject = subject;
-match_data->flags = 0;
+match_data->subject = NULL; /* Default for no match */
 match_data->mark = NULL;
 match_data->matchedby = PCRE2_MATCHEDBY_DFA_INTERPRETER;
@@ -3846,7 +3845,10 @@ for (;;)
 memcpy((void *)match_data->subject, subject, length);
 match_data->flags |= PCRE2_MD_COPIED_SUBJECT;
 }
+else
+{
+if (rc >= 0 || rc == PCRE2_ERROR_PARTIAL) match_data->subject = subject;
+}
 goto EXIT;
 }

View File

@@ -173,8 +173,7 @@ else
 if (rc > (int)oveccount)
 rc = 0;
 match_data->code = re;
-match_data->subject = subject;
-match_data->flags = 0;
+match_data->subject = (rc >= 0 || rc == PCRE2_ERROR_PARTIAL)? subject : NULL;
 match_data->rc = rc;
 match_data->startchar = arguments.startchar_ptr - subject;
 match_data->leftchar = 0;
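
The rule this hunk applies (keep the subject pointer only for a match or a partial match, otherwise store NULL) can be tried in isolation. The following is a minimal sketch, not PCRE2 source: `model_match_data`, `model_record_result` and the `MODEL_*` constants are hypothetical stand-ins, with the two error values chosen to mirror PCRE2_ERROR_NOMATCH and PCRE2_ERROR_PARTIAL.

```c
#include <stddef.h>

/* Hypothetical stand-ins for illustration only; not the PCRE2 definitions. */
#define MODEL_ERROR_NOMATCH (-1)
#define MODEL_ERROR_PARTIAL (-2)

typedef struct {
  const char *subject;  /* NULL unless the last match succeeded or was partial */
  int rc;               /* result code of the last match attempt */
} model_match_data;

/* Record the outcome of a match attempt in the model block. The subject
   pointer is kept only when rc reports a match (rc >= 0) or a partial match;
   every other result leaves NULL behind. */
static void model_record_result(model_match_data *md, const char *subject,
                                int rc)
{
  md->subject = (rc >= 0 || rc == MODEL_ERROR_PARTIAL) ? subject : NULL;
  md->rc = rc;
}
```

With this model, a caller that inspects `md.subject` after a failed attempt sees NULL rather than a pointer to a string it may already have freed.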

View File

@@ -6174,7 +6174,7 @@ if (mcontext != NULL && mcontext->offset_limit != PCRE2_UNSET &&
 return PCRE2_ERROR_BADOFFSETLIMIT;
 /* If the match data block was previously used with PCRE2_COPY_MATCHED_SUBJECT,
-free the memory that was obtained. */
+free the memory that was obtained. Set the field to NULL for no match cases. */
 if ((match_data->flags & PCRE2_MD_COPIED_SUBJECT) != 0)
 {
@@ -6182,6 +6182,7 @@ if ((match_data->flags & PCRE2_MD_COPIED_SUBJECT) != 0)
 match_data->memctl.memory_data);
 match_data->flags &= ~PCRE2_MD_COPIED_SUBJECT;
 }
+match_data->subject = NULL;
 /* If the pattern was successfully studied with JIT support, run the JIT
 executable instead of the rest of this function. Most options must be set at
@@ -6846,8 +6847,6 @@ if (mb->match_frames != mb->stack_frames)
 /* Fill in fields that are always returned in the match data. */
 match_data->code = re;
-match_data->subject = subject;
-match_data->flags = 0;
 match_data->mark = mb->mark;
 match_data->matchedby = PCRE2_MATCHEDBY_INTERPRETER;
@@ -6864,7 +6863,6 @@ if (rc == MATCH_MATCH)
 match_data->leftchar = mb->start_used_ptr - subject;
 match_data->rightchar = ((mb->last_used_ptr > mb->end_match_ptr)?
 mb->last_used_ptr : mb->end_match_ptr) - subject;
 if ((options & PCRE2_COPY_MATCHED_SUBJECT) != 0)
 {
 length = CU2BYTES(length + was_zero_terminated);
@@ -6874,7 +6872,7 @@ if (rc == MATCH_MATCH)
 memcpy((void *)match_data->subject, subject, length);
 match_data->flags |= PCRE2_MD_COPIED_SUBJECT;
 }
+else match_data->subject = subject;
 return match_data->rc;
 }
@@ -6892,6 +6890,7 @@ if (rc != MATCH_NOMATCH && rc != PCRE2_ERROR_PARTIAL) match_data->rc = rc;
 else if (match_partial != NULL)
 {
+match_data->subject = subject;
 match_data->ovector[0] = match_partial - subject;
 match_data->ovector[1] = end_subject - subject;
 startchar = match_partial - subject;
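
Read together, the hunks for this file give the subject field a small lifecycle: release any copy left by an earlier call and reset the field to NULL on entry, then set it again only for a match (copied or borrowed, depending on the option) or a partial match. The sketch below models that lifecycle under stated assumptions. It is not the PCRE2 implementation: `model_md`, `model_finish` and every `MODEL_*` constant are hypothetical, plain `malloc`/`free` stand in for the library's memory-control callbacks, and the subject is treated as a zero-terminated `char` string.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins for illustration only; not the PCRE2 definitions. */
#define MODEL_COPY_MATCHED_SUBJECT 0x1u  /* option bit passed by the caller */
#define MODEL_MD_COPIED_SUBJECT    0x1u  /* flag: the block owns its subject */
#define MODEL_ERROR_NOMATCH  (-1)
#define MODEL_ERROR_PARTIAL  (-2)
#define MODEL_ERROR_NOMEMORY (-48)

typedef struct {
  const char *subject;  /* NULL after any failed match */
  unsigned flags;
  int rc;
} model_md;

/* Apply the subject-field rules for one match attempt whose raw result is rc. */
static int model_finish(model_md *md, const char *subject, int rc,
                        unsigned options)
{
  /* On entry: free a copy made by a previous call, then default to NULL. */
  if ((md->flags & MODEL_MD_COPIED_SUBJECT) != 0) {
    free((void *)md->subject);
    md->flags &= ~MODEL_MD_COPIED_SUBJECT;
  }
  md->subject = NULL;
  md->rc = rc;

  if (rc >= 0) {
    if ((options & MODEL_COPY_MATCHED_SUBJECT) != 0) {
      /* Successful match with the copy option: the block owns a copy. */
      size_t length = strlen(subject) + 1;
      char *copy = malloc(length);
      if (copy == NULL) {
        md->rc = MODEL_ERROR_NOMEMORY;
        return md->rc;
      }
      memcpy(copy, subject, length);
      md->subject = copy;
      md->flags |= MODEL_MD_COPIED_SUBJECT;
    } else {
      /* Successful match without the option: borrow the caller's string. */
      md->subject = subject;
    }
  } else if (rc == MODEL_ERROR_PARTIAL) {
    /* Partial match: the offsets refer to the caller's string. */
    md->subject = subject;
  }
  /* Any other negative rc leaves subject as NULL. */
  return md->rc;
}
```

A block initialised as `model_md md = { NULL, 0u, 0 };` can be passed to `model_finish` repeatedly; each call first drops whatever the previous call left behind, so a failed attempt never leaves a stale or owned pointer in the block.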