Document that ~(PCRE2_SIZE)0 is a reserved value.

Philip.Hazel 2014-11-25 17:50:28 +00:00
parent 312375057b
commit d1f5dd5bf2
2 changed files with 29 additions and 13 deletions

@@ -370,6 +370,18 @@ pattern (\fBpcre2_pattern_info()\fP) and about the configuration with which
PCRE2 was built (\fBpcre2_config()\fP).
.
.
.SH "STRING LENGTHS AND OFFSETS"
.rs
.sp
The PCRE2 API uses string lengths and offsets into strings of code units in
several places. These values are always of type PCRE2_SIZE, which is an
unsigned integer type, currently always defined as \fIsize_t\fP. The largest
value that can be stored in such a type (that is ~(PCRE2_SIZE)0) is reserved
as a special indicator for zero-terminated strings and unset offsets.
Therefore, the longest string that can be handled is one less than this
maximum.
.
.
.\" HTML <a name="newlines"></a>
.SH NEWLINES
.rs
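
Reviewer note: the reserved value described in the new "STRING LENGTHS AND OFFSETS" section can be illustrated with a small C sketch. It assumes only what the text states, namely that the size type is an unsigned integer type defined as size_t. The names sketch_size, SKETCH_RESERVED, sketch_max_length and sketch_offset_is_unset are hypothetical stand-ins chosen so that the example compiles without pcre2.h; they are not part of the PCRE2 API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins (assumptions, not PCRE2 names). */
typedef size_t sketch_size;                  /* plays the role of PCRE2_SIZE */
#define SKETCH_RESERVED (~(sketch_size)0)    /* all bits set: the reserved value */

/* Longest string length that does not collide with the reserved value. */
static sketch_size sketch_max_length(void)
{
    return SKETCH_RESERVED - 1;
}

/* Returns nonzero when an offset carries the reserved "unset" marker. */
static int sketch_offset_is_unset(sketch_size offset)
{
    return offset == SKETCH_RESERVED;
}
```

Because the type is unsigned, complementing zero yields its largest representable value, which is why the longest usable length is one less than that maximum.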

@@ -1,4 +1,4 @@
.TH PCRE2LIMITS 3 "29 September 2014" "PCRE2 10.00"
.TH PCRE2LIMITS 3 "25 November 2014" "PCRE2 10.00"
.SH NAME
PCRE2 - Perl-compatible regular expressions (revised API)
.SH "SIZE AND OTHER LIMITATIONS"
@@ -20,6 +20,21 @@ documentation for details. In these cases the limit is substantially larger.
However, the speed of execution is slower. In the 32-bit library, the internal
linkage size is always 4.
.P
The maximum length (in code units) of a subject string is one less than the
largest number a PCRE2_SIZE variable can hold. PCRE2_SIZE is an unsigned
integer type, usually defined as size_t. Its maximum value (that is
~(PCRE2_SIZE)0) is reserved as a special indicator for zero-terminated strings
and unset offsets.
.P
Note that when using the traditional matching function, PCRE2 uses recursion to
handle subpatterns and indefinite repetition. This means that the available
stack space may limit the size of a subject string that can be processed by
certain patterns. For a discussion of stack issues, see the
.\" HREF
\fBpcre2stack\fP
.\"
documentation.
.P
All values in repeating quantifiers must be less than 65536.
.P
There is no limit to the number of parenthesized subpatterns, but there can be
@@ -38,17 +53,6 @@ maximum number of named subpatterns is 10000.
.P
The maximum length of a name in a (*MARK), (*PRUNE), (*SKIP), or (*THEN) verb
is 255 for the 8-bit library and 65535 for the 16-bit and 32-bit libraries.
.P
The maximum length of a subject string is the largest number a PCRE2_SIZE
variable can hold. PCRE2_SIZE is an unsigned integer type, usually defined as
size_t. However, when using the traditional matching function, PCRE2 uses
recursion to handle subpatterns and indefinite repetition. This means that the
available stack space may limit the size of a subject string that can be
processed by certain patterns. For a discussion of stack issues, see the
.\" HREF
\fBpcre2stack\fP
.\"
documentation.
.
.
.SH AUTHOR
@@ -65,6 +69,6 @@ Cambridge, England.
.rs
.sp
.nf
Last updated: 29 September 2014
Last updated: 25 November 2014
Copyright (c) 1997-2014 University of Cambridge.
.fi
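
Reviewer note: the revised limits text says the maximum value of the size type doubles as an indicator for zero-terminated strings. A minimal sketch of how a caller-supplied length might be resolved under that convention follows. It assumes 8-bit code units and uses hypothetical names (demo_size, DEMO_ZERO_TERMINATED, demo_effective_length) rather than anything from pcre2.h; it does not show how PCRE2 itself implements this.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins (assumptions, not PCRE2 names). */
typedef size_t demo_size;                        /* plays the role of PCRE2_SIZE */
#define DEMO_ZERO_TERMINATED (~(demo_size)0)     /* reserved maximum value */

/*
 * Resolve a length argument: the reserved value means "measure the string
 * up to its terminating zero"; any other value is taken as an explicit
 * length in code units, so embedded zeros are allowed in that case.
 */
static demo_size demo_effective_length(const char *subject, demo_size length)
{
    assert(subject != NULL);
    return (length == DEMO_ZERO_TERMINATED) ? strlen(subject) : length;
}
```

With this convention, demo_effective_length("abc", DEMO_ZERO_TERMINATED) measures the string, while an explicit length is passed through unchanged. This is also why a real subject can never have the reserved length: that value is always read as the indicator.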