Memchr() speed-up for unanchored pattern in 8-bit mode.

Philip.Hazel 2015-07-06 16:05:41 +00:00
parent f01184a3ab
commit be646cb567
3 changed files with 15 additions and 4 deletions


@@ -6,6 +6,9 @@ Version 10.21 xx-xxx-xxxx
1. Improve matching speed of patterns starting with + or * in JIT.
+2. Use memchr() to find the first character in an unanchored match in 8-bit
+mode in the interpreter. This gives a significant speed improvement.
Version 10.20 30-June-2015
--------------------------


@@ -9,9 +9,9 @@ dnl The PCRE2_PRERELEASE feature is for identifying release candidates. It might
dnl be defined as -RC2, for example. For real releases, it should be empty.
m4_define(pcre2_major, [10])
-m4_define(pcre2_minor, [20])
-m4_define(pcre2_prerelease, [])
-m4_define(pcre2_date, [2015-06-30])
+m4_define(pcre2_minor, [21])
+m4_define(pcre2_prerelease, [-RC1])
+m4_define(pcre2_date, [2015-07-06])
# NOTE: The CMakeLists.txt file searches for the above variables in the first
# 50 lines of this file. Please update that if the variables above are moved.


@@ -6783,7 +6783,8 @@ for(;;)
end_subject = t;
}
-/* Advance to a unique first code unit if there is one. */
+/* Advance to a unique first code unit if there is one. In 8-bit mode, the
+use of memchr() gives a big speed up. */
if (has_first_cu)
{
@@ -6793,8 +6794,15 @@ for(;;)
(smc = UCHAR21TEST(start_match)) != first_cu && smc != first_cu2)
start_match++;
else
+{
+#if PCRE2_CODE_UNIT_WIDTH != 8
while (start_match < end_subject && UCHAR21TEST(start_match) != first_cu)
start_match++;
+#else
+start_match = memchr(start_match, first_cu, end_subject - start_match);
+if (start_match == NULL) start_match = end_subject;
+#endif
+}
}
/* Or to just after a linebreak for a multiline match */
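
For reference, the two scanning strategies in the last hunk can be sketched in isolation. This is a minimal illustration, not code from PCRE2: the function names scan_loop and scan_memchr are hypothetical, and the subject is assumed to be a plain array of 8-bit code units. Both functions return a pointer to the first occurrence of first_cu in the range from start up to (but not including) end, or end itself when there is none, which mirrors the "if (start_match == NULL) start_match = end_subject;" fallback above.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not a PCRE2 function): advance one code unit at a
   time, as the loop retained for 16-bit and 32-bit builds does. */
const unsigned char *
scan_loop(const unsigned char *start, const unsigned char *end,
          unsigned char first_cu)
{
  while (start < end && *start != first_cu) start++;
  return start;
}

/* Hypothetical helper (not a PCRE2 function): same contract, but the search
   is delegated to memchr(). memchr() returns NULL when the byte is absent,
   so that case is mapped back to the end of the subject. */
const unsigned char *
scan_memchr(const unsigned char *start, const unsigned char *end,
            unsigned char first_cu)
{
  const unsigned char *p = memchr(start, first_cu, (size_t)(end - start));
  return (p == NULL) ? end : p;
}
```

The substitution is only valid when a code unit is exactly one byte, since memchr() compares bytes; that is presumably why the patch guards it with PCRE2_CODE_UNIT_WIDTH and keeps the explicit loop for the wider code unit sizes.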