More documentation
parent
56805a1246
commit
69530d5b36
@@ -15,9 +15,11 @@ PCRE2 - Perl-compatible regular expressions (revised API)
.sp
After a successful call of \fBpcre2_match()\fP that was passed the match block
that is this function's argument, this function returns the code unit offset of
the character at which the successful match started. This can be different to
the value of \fIovector[0]\fP if the pattern contains the \eK escape sequence.
Note, however, that \eK has no effect for a partial match.
the character at which the successful match started. For a non-partial match,
this can be different to the value of \fIovector[0]\fP if the pattern contains
the \eK escape sequence. After a partial match, however, this value is always
the same as \fIovector[0]\fP because \eK does not affect the result of a
partial match.
.P
There is a complete description of the PCRE2 native API in the
.\" HREF
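Note on the hunk above: the start position is documented as a code unit offset, which is not the same as a character offset once the subject contains multi-unit characters. The following is a minimal sketch of that distinction, not PCRE2 code. The helper names are hypothetical, and Python's codecs stand in for the 8-bit, 16-bit and 32-bit libraries.

```python
# Hypothetical helpers, not part of the PCRE2 API: they only illustrate the
# difference between character offsets and code unit offsets.

# Size in bytes of one code unit for each encoding used here.
_UNIT_SIZE = {"utf-8": 1, "utf-16-le": 2, "utf-32-le": 4}


def char_to_code_unit_offset(subject: str, char_offset: int,
                             encoding: str = "utf-8") -> int:
    """Return the code unit offset of the character at char_offset."""
    prefix = subject[:char_offset].encode(encoding)
    return len(prefix) // _UNIT_SIZE[encoding]


def code_unit_to_char_offset(subject: str, unit_offset: int,
                             encoding: str = "utf-8") -> int:
    """Return the character offset that corresponds to a code unit offset.

    Raises UnicodeDecodeError if unit_offset falls inside a character.
    """
    raw = subject.encode(encoding)
    prefix = raw[:unit_offset * _UNIT_SIZE[encoding]]
    return len(prefix.decode(encoding))


if __name__ == "__main__":
    subject = "na\u00efve caf\u00e9"
    # 'c' is character 6, but U+00EF occupies two UTF-8 code units.
    print(char_to_code_unit_offset(subject, 6))               # 7
    print(char_to_code_unit_offset(subject, 6, "utf-16-le"))  # 6
    print(code_unit_to_char_offset(subject, 7))               # 6
```

A caller that wants character positions would have to apply a conversion of this kind to the returned offset, and to the ovector values, itself.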
@@ -1,4 +1,4 @@
.TH PCRE2API 3 "16 October 2014" "PCRE2 10.00"
.TH PCRE2API 3 "25 October 2014" "PCRE2 10.00"
.SH NAME
PCRE2 - Perl-compatible regular expressions (revised API)
.sp
@@ -2069,10 +2069,11 @@ pointer to the zero-terminated name, which is within the compiled pattern.
Otherwise NULL is returned. A (*MARK) name may be available after a failed
match or a partial match, as well as after a successful one.
.P
The offset of the character at which the successful match started is
returned by \fBpcre2_get_startchar()\fP. This can be different to the value of
\fIovector[0]\fP if the pattern contains the \eK escape sequence. Note,
however, that \eK has no effect for a partial match.
The code unit offset of the character at which a successful match started is
returned by \fBpcre2_get_startchar()\fP. For a non-partial match, this can be
different to the value of \fIovector[0]\fP if the pattern contains the \eK
escape sequence. After a partial match, however, this value is always the same
as \fIovector[0]\fP because \eK does not affect the result of a partial match.
.
.
.\" HTML <a name="errorlist"></a>
@@ -2629,6 +2630,6 @@ Cambridge CB2 3QH, England.
.rs
.sp
.nf
Last updated: 16 October 2014
Last updated: 25 October 2014
Copyright (c) 1997-2014 University of Cambridge.
.fi
62
maint/README
@@ -112,10 +112,10 @@ distribution for a new release.
different configurations, and it also runs some of them with valgrind, all of
which can take quite some time.

. Run perltest.pl on the test data for tests 1, 4, and 6. The output
should match the PCRE2 test output, apart from the version identification at
the start of each test. The other tests are not Perl-compatible (they use
various PCRE2-specific features or options).
. Run perltest.sh on the test data for tests 1 and 4. The output should match
the PCRE2 test output, apart from the version identification at the start of
each test. The other tests are not Perl-compatible (they use various
PCRE2-specific features or options).

. It is possible to test with the emulated memmove() function by undefining
HAVE_MEMMOVE and HAVE_BCOPY in config.h, though I do not do this often. You
@@ -134,6 +134,9 @@ distribution for a new release.
longer do this, but instead I rely on putting out release candidates for
folks on the pcre-dev list to test.

. The buildbots at http://buildfarm.opencsw.org/ do some automated testing
of PCRE2 and should be checked before putting out a release.


Updating version info for libtool
=================================
@@ -179,8 +182,8 @@ changes in a shared library:
new version. Increment current, set revision and age to 0.


Making a PCRE release
=====================
Making a PCRE2 release
======================

Run PrepareRelease and commit the files that it changes (by removing trailing
spaces). The first thing this script does is to run CheckMan on the man pages;
@@ -193,9 +196,9 @@ copy:
svn copy svn://vcs.exim.org/pcre2/code/trunk \
svn://vcs.exim.org/pcre2/code/tags/pcre-8.xx

Don't forget to update Freecode (fka Freshmeat) when the new release is out,
and to tell webmaster@pcre.org and the mailing list. Also, update the list of
version numbers in Bugzilla (edit products).
When the new release is out, don't forget to tell webmaster@pcre.org and the
mailing list. Also, update the list of version numbers in Bugzilla (edit
products).


Future ideas (wish list)
@@ -220,8 +223,8 @@ others are relatively new.
to have little effect, and maybe makes things worse.

* "Ends with literal string" - note that a single character doesn't gain much
over the existing "required byte" (reqbyte) feature that just remembers one
data unit.
over the existing "required code unit" feature that just remembers one code
unit.

* Remember an initial string rather than just 1 code unit?

@@ -245,13 +248,11 @@ others are relatively new.

. Perl 6 will be a revolution. Is it a revolution too far for PCRE?

. Allow errorptr and erroroffset to be NULL. I don't like this idea.

. Line endings:

* Option to use NUL as a line terminator in subject strings. This could now
be done relatively easily since the extension to support LF, CR, and CRLF.
If it is done, a suitable option for pcregrep is also required.
If it is done, a suitable option for pcre2grep is also required.

. Catch SIGSEGV for stack overflows?

@@ -259,32 +260,26 @@ others are relatively new.

. Option to convert results into character offsets and character lengths.

. Option for pcregrep to scan only the start of a file. I am not keen - this is
the job of "head".
. Option for pcre2grep to scan only the start of a file. I am not keen - this
is the job of "head".

. A (non-Unix) user wanted pcregrep options to (a) list a file name just once,
preceded by a blank line, instead of adding it to every matched line, and (b)
support --outputfile=name.

. Consider making UTF and UCP the default for PCRE n.0 for some n > 8.

. Define a union for the results from pcre2_pattern_info().

. Provide a "random access to the subject" facility so that the way in which it
is stored is independent of PCRE. For efficiency, it probably isn't possible
to switch this dynamically. It would have to be specified when PCRE was
compiled. PCRE would then call a function every time it wanted a character.
is stored is independent of PCRE2. For efficiency, it probably isn't possible
to switch this dynamically. It would have to be specified when PCRE2 was
compiled. PCRE2 would then call a function every time it wanted a character.

. Wild thought: the ability to compile from PCRE's internal byte code to a real
. Wild thought: the ability to compile from PCRE2's internal code to a real
FSM and a very fast (third) matcher to process the result. There would be
even more restrictions than for pcre_dfa_exec(), however. This is not easy.
even more restrictions than for pcre2_dfa_exec(), however. This is not easy.
This is probably obsolete now that we have the JIT support.

. Should pcretest have some private locale data, to avoid relying on the
available locales for the test data, since different OS have different ideas?
This won't be as thorough a test, but perhaps that doesn't really matter.

. pcregrep: add -rs for a sorted recurse? Having to store file names and sort
. pcre2grep: add -rs for a sorted recurse? Having to store file names and sort
them will of course slow it down.

. Someone suggested --disable-callout to save code space when callouts are
@@ -293,13 +288,14 @@ others are relatively new.
. A user suggested a parameter to limit the length of string matched, for
example if the parameter is N, the current match should fail if the matched
substring exceeds N. This could apply to both match functions. The value
could be a new field in the extra block.
could be a new field in the match context.

. Callouts with arguments: (?Cn:ARG) for instance.

. Write a function that generates random matching strings for a compiled regex.
. Write a function that generates random matching strings for a compiled
pattern.

. Pcregrep: an option to specify the output line separator, either as a string
. Pcre2grep: an option to specify the output line separator, either as a string
or select from a fixed list. This is not dead easy, because at the moment it
outputs whatever is in the input file.

@@ -309,7 +305,7 @@ others are relatively new.
implementation that I tried made things worse in many simple cases, so this
is not an obviously good thing.

. PCRE cannot at present distinguish between subpatterns with different names,
. PCRE2 cannot at present distinguish between subpatterns with different names,
but the same number (created by the use of ?|). In order to do so, a way of
remembering *which* subpattern numbered n matched is needed. Bugzilla #760.
Now that (*MARK) has been implemented, it can perhaps be used as a way round
@@ -321,4 +317,4 @@ others are relatively new.
Philip Hazel
Email local part: ph10
Email domain: cam.ac.uk
Last updated: 13 May 2014
Last updated: 25 October 2014