pcre2/doc/pcre2limits.3

82 lines
3.0 KiB
Groff
Raw Permalink Normal View History

.TH PCRE2LIMITS 3 "26 October 2016" "PCRE2 10.23"
2014-09-29 18:45:37 +02:00
.SH NAME
PCRE2 - Perl-compatible regular expressions (revised API)
.SH "SIZE AND OTHER LIMITATIONS"
.rs
.sp
There are some size limitations in PCRE2 but it is hoped that they will never
in practice be relevant.
.P
The maximum size of a compiled pattern is approximately 64K code units for the
8-bit and 16-bit libraries if PCRE2 is compiled with the default internal
linkage size, which is 2 bytes for these libraries. If you want to process
regular expressions that are truly enormous, you can compile PCRE2 with an
internal linkage size of 3 or 4 (when building the 16-bit library, 3 is rounded
up to 4). See the \fBREADME\fP file in the source distribution and the
.\" HREF
\fBpcre2build\fP
.\"
documentation for details. In these cases the limit is substantially larger.
However, the speed of execution is slower. In the 32-bit library, the internal
linkage size is always 4.
.P
The maximum length of a source pattern string is essentially unlimited; it is
the largest number a PCRE2_SIZE variable can hold. However, the program that
calls \fBpcre2_compile()\fP can specify a smaller limit.
.P
The maximum length (in code units) of a subject string is one less than the
largest number a PCRE2_SIZE variable can hold. PCRE2_SIZE is an unsigned
integer type, usually defined as size_t. Its maximum value (that is
~(PCRE2_SIZE)0) is reserved as a special indicator for zero-terminated strings
and unset offsets.
.P
Note that when using the traditional matching function, PCRE2 uses recursion to
handle subpatterns and indefinite repetition. This means that the available
stack space may limit the size of a subject string that can be processed by
certain patterns. For a discussion of stack issues, see the
.\" HREF
\fBpcre2stack\fP
.\"
documentation.
.P
2014-09-29 18:45:37 +02:00
All values in repeating quantifiers must be less than 65536.
.P
The maximum length of a lookbehind assertion is 65535 characters.
.P
2014-09-29 18:45:37 +02:00
There is no limit to the number of parenthesized subpatterns, but there can be
no more than 65535 capturing subpatterns. There is, however, a limit to the
depth of nesting of parenthesized subpatterns of all kinds. This is imposed in
order to limit the amount of system stack used at compile time. The default
2017-01-16 18:40:47 +01:00
limit can be specified when PCRE2 is built; the default default is 250. An
application can change this limit by calling pcre2_set_parens_nest_limit() to
set the limit in a compile context.
2014-09-29 18:45:37 +02:00
.P
The maximum length of name for a named subpattern is 32 code units, and the
maximum number of named subpatterns is 10000.
.P
The maximum length of a name in a (*MARK), (*PRUNE), (*SKIP), or (*THEN) verb
is 255 code units for the 8-bit library and 65535 code units for the 16-bit and
32-bit libraries.
.P
2017-01-16 18:40:47 +01:00
The maximum length of a string argument to a callout is the largest number a
32-bit unsigned integer can hold.
2014-09-29 18:45:37 +02:00
.
.
.SH AUTHOR
.rs
.sp
.nf
Philip Hazel
University Computing Service
2014-11-17 17:59:02 +01:00
Cambridge, England.
2014-09-29 18:45:37 +02:00
.fi
.
.
.SH REVISION
.rs
.sp
.nf
Last updated: 26 October 2016
Copyright (c) 1997-2016 University of Cambridge.
2014-09-29 18:45:37 +02:00
.fi