Documentation update.
parent 16d47a9cb1
commit dea540877b
42
maint/README
@@ -141,8 +141,9 @@ distribution for a new release.
 
 . Run perltest.sh on the test data for tests 1 and 4. The output should match
 the PCRE2 test output, apart from the version identification at the start of
-each test. The other tests are not Perl-compatible (they use various
-PCRE2-specific features or options).
+each test. Sometimes there are other differences in test 4 if PCRE2 and Perl
+are using different Unicode releases. The other tests are not Perl-compatible
+(they use various PCRE2-specific features or options).
 
 . It is possible to test with the emulated memmove() function by undefining
 HAVE_MEMMOVE and HAVE_BCOPY in config.h, though I do not do this often.
@@ -155,8 +156,9 @@ distribution for a new release.
 systems. For example, on Solaris it is helpful to test using Sun's cc
 compiler as a change from gcc. Adding -xarch=v9 to the cc options does a
 64-bit test, but it also needs -S 64 for pcre2test to increase the stack size
-for test 2. Since I retired I can no longer do this, but instead I rely on
-putting out release candidates for folks on the pcre-dev list to test.
+for test 2. Since I retired I can no longer do much of this, but instead I
+rely on putting out release candidates for folks on the pcre-dev list to
+test.
 
 . The buildbots at http://buildfarm.opencsw.org/ do some automated testing
 of PCRE2 and should be checked before putting out a release.
@@ -285,7 +287,7 @@ very sensible; some are rather wacky. Some have been on this list for years.
 to switch this dynamically. It would have to be specified when PCRE2 was
 compiled. PCRE2 would then call a function every time it wanted a character.
 
-. pcre2grep: add -rs for a sorted recurse? Having to store file names and sort
+. pcre2grep: add -rs for a sorted recurse. Having to store file names and sort
 them will of course slow it down.
 
 . Someone suggested --disable-callout to save code space when callouts are
@@ -325,10 +327,10 @@ very sensible; some are rather wacky. Some have been on this list for years.
 . If Perl ever supports the POSIX notation [[.something.]] PCRE2 should try
 to follow.
 
-. Bugzilla #554 requested support for invalid UTF-8 strings.
-
 . A user wanted a way of ignoring all Unicode "mark" characters so that, for
-example "a" followed by an accent would, together, match "a".
+example "a" followed by an accent would, together, match "a". This can only
+be done clumsily at present by using a lookahead such as /(?=a)\X/, which
+works for "combining" characters.
 
 . Perl supports [\N{x}-\N{y}] as a Unicode range, even in EBCDIC. PCRE2
 supports \N{U+dd..} everywhere, but not in EBCDIC.
@@ -345,9 +347,6 @@ very sensible; some are rather wacky. Some have been on this list for years.
 
 . Bugzilla #1694 requests backwards searching.
 
-. A callout from pcre2_substitute() that happens after (before?) each
-substitution (value = 256?).
-
 . Allow a callout to specify a number of characters to skip. This can be done
 compatibly via an extra callout field.
 
@@ -359,15 +358,12 @@ very sensible; some are rather wacky. Some have been on this list for years.
 . A limit on substitutions: a user suggested somehow finding a way of making
 match_limit apply to the whole operation instead of each match separately.
 
-. There was a suggestion that Perl should lock out \K in lookarounds. If it
-does, PCRE2 should follow.
-
 . Redesign handling of class/nclass/xclass because the compile code logic is
 currently very contorted and obscure.
 
 . Some #defines could be replaced with enums to improve robustness.
 
-. There was a request for and option for pcre2_match() to return the longest
+. There was a request for an option for pcre2_match() to return the longest
 match. This would mean searching for all possible matches, of course.
 
 . Perl's /a modifier sets Unicode, but restricts \d etc to ASCII characters,
@@ -417,7 +413,8 @@ very sensible; some are rather wacky. Some have been on this list for years.
 to define a bit in the match data, but all three matchers would need work.
 
 . Would inlining "simple" recursions provide a useful performance boost for the
-interpreters? JIT already does some of this.
+interpreters? JIT already does some of this, but it may not be worth it for
+the interpreters.
 
 . There was a request for a way of re-defining \w (and therefore \W, \b, and
 \B). An in-pattern sequence such as (?w=[...]) was suggested. Easiest way
@@ -426,7 +423,18 @@ very sensible; some are rather wacky. Some have been on this list for years.
 all previous settings; maybe a fixed amount of stack would do - how deep
 would anyone want to nest these things? Bugzilla #2301.
 
+. Recognize the short script names. They are already listed in maint/
+Multistage2.py because they are needed for scanning the script extensions
+file.
+
+. Use script extensions for \p?
+
+. A user suggested something like --with-build-info to set a build information
+string that could be retrieved by pcre2_config(). However, there's no
+facility for a length limit in pcre2_config(), and what would be the
+encoding?
+
 Philip Hazel
 Email local part: ph10
 Email domain: cam.ac.uk
-Last updated: 07 October 2018
+Last updated: 03 June 2019
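
The hunk at -325 describes /(?=a)\X/ as a clumsy way to make "a" plus a following accent match as one unit: the lookahead pins the base character, and \X then consumes the whole cluster. A minimal sketch of that behaviour follows, as an approximation only. Python's standard `re` module has no \X, so the cluster is emulated by skipping characters whose Unicode general category starts with "M" (marks). The helper name `match_base_with_marks` is hypothetical and not part of PCRE2, and real \X follows the full extended grapheme cluster rules rather than this simplification.

```python
import unicodedata


def match_base_with_marks(text, base="a", pos=0):
    # Rough stand-in for /(?=a)\X/ anchored at `pos`:
    #  - the startswith() check plays the role of the lookahead (?=a)
    #  - the loop plays the role of \X, but only for mark characters
    # Returns the end offset of the match, or None if `base` is not at `pos`.
    if not text.startswith(base, pos):
        return None
    end = pos + len(base)
    # Consume every following mark (general category Mn, Mc or Me).
    while end < len(text) and unicodedata.category(text[end]).startswith("M"):
        end += 1
    return end


# "a" followed by U+0301 COMBINING ACUTE ACCENT, then "bc".
decomposed = "a\u0301bc"
end = match_base_with_marks(decomposed)
print(repr(decomposed[:end]))
```

With the decomposed input, the returned offset covers both the "a" and its accent, which is the "together, match" effect the README item asks for. A precomposed character such as U+00E1 would not match, since the sketch (like the lookahead) tests the literal base character.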