Documentation update to clarify what PCRE2 serialization is.

Philip.Hazel 2018-06-27 17:20:58 +00:00
parent 374770c2e3
commit 7e921fda05
16 changed files with 684 additions and 610 deletions

@ -28,7 +28,10 @@ DESCRIPTION
</b><br>
<P>
This function decodes a serialized set of compiled patterns back into a list of
individual patterns. Its arguments are:
individual patterns. This is possible only on a host that is running the same
version of PCRE2, with the same code unit width, and the host must also have
the same endianness, pointer width and PCRE2_SIZE type. The arguments for
<b>pcre2_serialize_decode()</b> are:
<pre>
<i>codes</i> pointer to a vector in which to build the list
<i>number_of_codes</i> number of slots in the vector
@ -54,8 +57,8 @@ on a system with different endianness.
<P>
There is a complete description of the PCRE2 native API in the
<a href="pcre2api.html"><b>pcre2api</b></a>
page and a description of the POSIX API in the
<a href="pcre2posix.html"><b>pcre2posix</b></a>
page and a description of the serialization functions in the
<a href="pcre2serialize.html"><b>pcre2serialize</b></a>
page.
<p>
Return to the <a href="index.html">PCRE2 index page</a>.
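
The endianness and width restrictions added to the decode description above follow from the stream being a raw dump rather than an abstract format. The sketch below is only an illustration of that point: the 32-bit field is a hypothetical stand-in and does not reflect the real PCRE2 stream layout.

```python
import struct

# A hypothetical 32-bit field, standing in for any multi-byte value inside a
# raw memory dump. This is NOT the real PCRE2 serialized layout.
value = 0x00000120

little = struct.pack("<I", value)  # bytes as a little-endian host would write them
big = struct.pack(">I", value)     # bytes as a big-endian host would write them

# Reading the little-endian bytes as if they were big-endian yields a
# different number, which is why a raw dump cannot cross endianness.
misread = struct.unpack(">I", little)[0]

# Field widths matter too: a 4-byte and an 8-byte size field occupy different
# amounts of space, so every later offset in the dump would shift.
width32 = struct.calcsize("<I")
width64 = struct.calcsize("<Q")

print(little.hex(), big.hex(), hex(misread), width32, width64)
```

An abstract format would fix the byte order and field sizes in its specification. A bytecode dump inherits them from the host that wrote it.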

@ -28,7 +28,12 @@ DESCRIPTION
</b><br>
<P>
This function encodes a list of compiled patterns into a byte stream that can
be saved on disc or elsewhere. Its arguments are:
be saved on disc or elsewhere. Note that this is not an abstract format like
Java or .NET. Conversion of the byte stream back into usable compiled patterns
can only happen on a host that is running the same version of PCRE2, with the
same code unit width, and the host must also have the same endianness, pointer
width and PCRE2_SIZE type. The arguments for <b>pcre2_serialize_encode()</b>
are:
<pre>
<i>codes</i> pointer to a vector containing the list
<i>number_of_codes</i> number of slots in the vector
@ -53,8 +58,8 @@ that a slot in the vector does not point to a compiled pattern.
<P>
There is a complete description of the PCRE2 native API in the
<a href="pcre2api.html"><b>pcre2api</b></a>
page and a description of the POSIX API in the
<a href="pcre2posix.html"><b>pcre2posix</b></a>
page and a description of the serialization functions in the
<a href="pcre2serialize.html"><b>pcre2serialize</b></a>
page.
<p>
Return to the <a href="index.html">PCRE2 index page</a>.

@ -32,8 +32,8 @@ must point to such a byte stream.
<P>
There is a complete description of the PCRE2 native API in the
<a href="pcre2api.html"><b>pcre2api</b></a>
page and a description of the POSIX API in the
<a href="pcre2posix.html"><b>pcre2posix</b></a>
page and a description of the serialization functions in the
<a href="pcre2serialize.html"><b>pcre2serialize</b></a>
page.
<p>
Return to the <a href="index.html">PCRE2 index page</a>.

@ -41,8 +41,8 @@ on a system with different endianness.
<P>
There is a complete description of the PCRE2 native API in the
<a href="pcre2api.html"><b>pcre2api</b></a>
page and a description of the POSIX API in the
<a href="pcre2posix.html"><b>pcre2posix</b></a>
page and a description of the serialization functions in the
<a href="pcre2serialize.html"><b>pcre2serialize</b></a>
page.
<p>
Return to the <a href="index.html">PCRE2 index page</a>.

@ -2283,11 +2283,16 @@ documentation, which also gives further details about callouts.
<br><a name="SEC25" href="#TOC1">SERIALIZATION AND PRECOMPILING</a><br>
<P>
It is possible to save compiled patterns on disc or elsewhere, and reload them
later, subject to a number of restrictions. The functions whose names begin
with <b>pcre2_serialize_</b> are used for this purpose. They are described in
the
later, subject to a number of restrictions. The host on which the patterns are
reloaded must be running the same version of PCRE2, with the same code unit
width, and must also have the same endianness, pointer width, and PCRE2_SIZE
type. Before compiled patterns can be saved, they must be converted to a
"serialized" form, which in the case of PCRE2 is really just a bytecode dump.
The functions whose names begin with <b>pcre2_serialize_</b> are used for
converting to and from the serialized form. They are described in the
<a href="pcre2serialize.html"><b>pcre2serialize</b></a>
documentation.
documentation. Note that PCRE2 serialization does not convert compiled patterns
to an abstract format like Java or .NET serialization.
<a name="matchdatablock"></a></P>
<br><a name="SEC26" href="#TOC1">THE MATCH DATA BLOCK</a><br>
<P>

@ -49,6 +49,15 @@ and PCRE2_SIZE type. For example, patterns compiled on a 32-bit system using
PCRE2's 16-bit library cannot be reloaded on a 64-bit system, nor can they be
reloaded using the 8-bit library.
</P>
<P>
Note that "serialization" in PCRE2 does not convert compiled patterns to an
abstract format like Java or .NET serialization. The serialized output is
really just a bytecode dump, which is why it can only be reloaded in the same
environment as the one that created it. Hence the restrictions mentioned above.
Applications that are not statically linked with a fixed version of PCRE2 must
be prepared to recompile patterns from their sources, in order to be immune to
PCRE2 upgrades.
</P>
<br><a name="SEC2" href="#TOC1">SECURITY CONCERNS</a><br>
<P>
The facility for saving and restoring compiled patterns is intended for use
@ -62,11 +71,11 @@ the byte stream that is passed to it.
</P>
<br><a name="SEC3" href="#TOC1">SAVING COMPILED PATTERNS</a><br>
<P>
Before compiled patterns can be saved they must be serialized, that is,
converted to a stream of bytes. A single byte stream may contain any number of
compiled patterns, but they must all use the same character tables. A single
copy of the tables is included in the byte stream (its size is 1088 bytes). For
more details of character tables, see the
Before compiled patterns can be saved they must be serialized, which in PCRE2
means converting the pattern to a stream of bytes. A single byte stream may
contain any number of compiled patterns, but they must all use the same
character tables. A single copy of the tables is included in the byte stream
(its size is 1088 bytes). For more details of character tables, see the
<a href="pcre2api.html#localesupport">section on locale support</a>
in the
<a href="pcre2api.html"><b>pcre2api</b></a>
@ -193,9 +202,9 @@ Cambridge, England.
</P>
<br><a name="SEC6" href="#TOC1">REVISION</a><br>
<P>
Last updated: 21 March 2017
Last updated: 27 June 2018
<br>
Copyright &copy; 1997-2017 University of Cambridge.
Copyright &copy; 1997-2018 University of Cambridge.
<br>
<p>
Return to the <a href="index.html">PCRE2 index page</a>.
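
The advice in the pcre2serialize changes above, that applications must be prepared to recompile patterns from their sources, can be sketched as a cache check. Everything here is an assumption made for illustration: `environment_tag`, `load_or_recompile`, the dictionary layout and the stand-in `recompile` callback are hypothetical names, not part of the PCRE2 API, and `size_t` is used as a stand-in for PCRE2_SIZE.

```python
import struct
import sys

def environment_tag(library_version, code_unit_width):
    """Collect the properties a serialized stream depends on (hypothetical helper)."""
    return (
        library_version,           # e.g. the PCRE2 version string in use
        code_unit_width,           # 8, 16 or 32
        sys.byteorder,             # "little" or "big"
        struct.calcsize("P") * 8,  # pointer width in bits
        struct.calcsize("N") * 8,  # size_t width in bits, standing in for PCRE2_SIZE
    )

def load_or_recompile(saved, current_tag, sources, recompile):
    """Reuse the saved blob only if its tag matches; otherwise recompile from source."""
    if saved is not None and saved["tag"] == current_tag:
        return saved["blob"], False
    return recompile(sources), True

# Stand-in for real compilation; a real application would call its regex library here.
def fake_recompile(sources):
    return b"fresh"

tag = environment_tag("10.32", 8)
saved = {"tag": tag, "blob": b"stand-in"}

# Same environment: the saved blob is reused.
blob, recompiled = load_or_recompile(saved, tag, ["a+b"], fake_recompile)

# After a library upgrade the tag differs, so the sources are recompiled.
upgraded = environment_tag("10.33", 8)
blob2, recompiled2 = load_or_recompile(saved, upgraded, ["a+b"], fake_recompile)
```

Keeping the pattern sources next to the saved byte stream is what makes the fallback possible. Without them, a version mismatch leaves nothing to recompile.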

@ -1927,15 +1927,21 @@ documentation. In this section we describe the features of <b>pcre2test</b> that
can be used to test these functions.
</P>
<P>
When a pattern with <b>push</b> modifier is successfully compiled, it is pushed
onto a stack of compiled patterns, and <b>pcre2test</b> expects the next line to
contain a new pattern (or command) instead of a subject line. By contrast,
the <b>pushcopy</b> modifier causes a copy of the compiled pattern to be
stacked, leaving the original available for immediate matching. By using
<b>push</b> and/or <b>pushcopy</b>, a number of patterns can be compiled and
retained. These modifiers are incompatible with <b>posix</b>, and control
modifiers that act at match time are ignored (with a message) for the stacked
patterns. The <b>jitverify</b> modifier applies only at compile time.
Note that "serialization" in PCRE2 does not convert compiled patterns to an
abstract format like Java or .NET. It just makes a reloadable byte code stream.
Hence the restrictions on reloading mentioned above.
</P>
<P>
In <b>pcre2test</b>, when a pattern with <b>push</b> modifier is successfully
compiled, it is pushed onto a stack of compiled patterns, and <b>pcre2test</b>
expects the next line to contain a new pattern (or command) instead of a
subject line. By contrast, the <b>pushcopy</b> modifier causes a copy of the
compiled pattern to be stacked, leaving the original available for immediate
matching. By using <b>push</b> and/or <b>pushcopy</b>, a number of patterns can
be compiled and retained. These modifiers are incompatible with <b>posix</b>,
and control modifiers that act at match time are ignored (with a message) for
the stacked patterns. The <b>jitverify</b> modifier applies only at compile
time.
</P>
<P>
The command
@ -1996,7 +2002,7 @@ Cambridge, England.
</P>
<br><a name="SEC21" href="#TOC1">REVISION</a><br>
<P>
Last updated: 25 April 2018
Last updated: 27 June 2018
<br>
Copyright &copy; 1997-2018 University of Cambridge.
<br>

File diff suppressed because it is too large.

@ -1,4 +1,4 @@
.TH PCRE2_SERIALIZE_DECODE 3 "02 September 2015" "PCRE2 10.21"
.TH PCRE2_SERIALIZE_DECODE 3 "27 June 2018" "PCRE2 10.32"
.SH NAME
PCRE2 - Perl-compatible regular expressions (revised API)
.SH SYNOPSIS
@ -16,7 +16,10 @@ PCRE2 - Perl-compatible regular expressions (revised API)
.rs
.sp
This function decodes a serialized set of compiled patterns back into a list of
individual patterns. Its arguments are:
individual patterns. This is possible only on a host that is running the same
version of PCRE2, with the same code unit width, and the host must also have
the same endianness, pointer width and PCRE2_SIZE type. The arguments for
\fBpcre2_serialize_decode()\fP are:
.sp
\fIcodes\fP pointer to a vector in which to build the list
\fInumber_of_codes\fP number of slots in the vector
@ -43,8 +46,8 @@ There is a complete description of the PCRE2 native API in the
.\" HREF
\fBpcre2api\fP
.\"
page and a description of the POSIX API in the
page and a description of the serialization functions in the
.\" HREF
\fBpcre2posix\fP
.\"
\fBpcre2serialize\fP
.\"
page.

@ -1,4 +1,4 @@
.TH PCRE2_SERIALIZE_ENCODE 3 "02 September 2015" "PCRE2 10.21"
.TH PCRE2_SERIALIZE_ENCODE 3 "27 June 2018" "PCRE2 10.32"
.SH NAME
PCRE2 - Perl-compatible regular expressions (revised API)
.SH SYNOPSIS
@ -16,7 +16,12 @@ PCRE2 - Perl-compatible regular expressions (revised API)
.rs
.sp
This function encodes a list of compiled patterns into a byte stream that can
be saved on disc or elsewhere. Its arguments are:
be saved on disc or elsewhere. Note that this is not an abstract format like
Java or .NET. Conversion of the byte stream back into usable compiled patterns
can only happen on a host that is running the same version of PCRE2, with the
same code unit width, and the host must also have the same endianness, pointer
width and PCRE2_SIZE type. The arguments for \fBpcre2_serialize_encode()\fP
are:
.sp
\fIcodes\fP pointer to a vector containing the list
\fInumber_of_codes\fP number of slots in the vector
@ -42,8 +47,8 @@ There is a complete description of the PCRE2 native API in the
.\" HREF
\fBpcre2api\fP
.\"
page and a description of the POSIX API in the
page and a description of the serialization functions in the
.\" HREF
\fBpcre2posix\fP
.\"
\fBpcre2serialize\fP
.\"
page.

@ -1,4 +1,4 @@
.TH PCRE2_SERIALIZE_FREE 3 "19 January 2015" "PCRE2 10.10"
.TH PCRE2_SERIALIZE_FREE 3 "27 June 2018" "PCRE2 10.32"
.SH NAME
PCRE2 - Perl-compatible regular expressions (revised API)
.SH SYNOPSIS
@ -21,8 +21,8 @@ There is a complete description of the PCRE2 native API in the
.\" HREF
\fBpcre2api\fP
.\"
page and a description of the POSIX API in the
page and a description of the serialization functions in the
.\" HREF
\fBpcre2posix\fP
.\"
\fBpcre2serialize\fP
.\"
page.

@ -1,4 +1,4 @@
.TH PCRE2_SERIALIZE_GET_NUMBER_OF_CODES 3 "19 January 2015" "PCRE2 10.10"
.TH PCRE2_SERIALIZE_GET_NUMBER_OF_CODES 3 "27 June 2018" "PCRE2 10.32"
.SH NAME
PCRE2 - Perl-compatible regular expressions (revised API)
.SH SYNOPSIS
@ -30,8 +30,8 @@ There is a complete description of the PCRE2 native API in the
.\" HREF
\fBpcre2api\fP
.\"
page and a description of the POSIX API in the
page and a description of the serialization functions in the
.\" HREF
\fBpcre2posix\fP
.\"
\fBpcre2serialize\fP
.\"
page.

@ -2250,13 +2250,18 @@ documentation, which also gives further details about callouts.
.rs
.sp
It is possible to save compiled patterns on disc or elsewhere, and reload them
later, subject to a number of restrictions. The functions whose names begin
with \fBpcre2_serialize_\fP are used for this purpose. They are described in
the
later, subject to a number of restrictions. The host on which the patterns are
reloaded must be running the same version of PCRE2, with the same code unit
width, and must also have the same endianness, pointer width, and PCRE2_SIZE
type. Before compiled patterns can be saved, they must be converted to a
"serialized" form, which in the case of PCRE2 is really just a bytecode dump.
The functions whose names begin with \fBpcre2_serialize_\fP are used for
converting to and from the serialized form. They are described in the
.\" HREF
\fBpcre2serialize\fP
.\"
documentation.
documentation. Note that PCRE2 serialization does not convert compiled patterns
to an abstract format like Java or .NET serialization.
.
.
.\" HTML <a name="matchdatablock"></a>

@ -1,4 +1,4 @@
.TH PCRE2SERIALIZE 3 "21 March 2017" "PCRE2 10.30"
.TH PCRE2SERIALIZE 3 "27 June 2018" "PCRE2 10.32"
.SH NAME
PCRE2 - Perl-compatible regular expressions (revised API)
.SH "SAVING AND RE-USING PRECOMPILED PCRE2 PATTERNS"
@ -28,6 +28,14 @@ the same code unit width, and must also have the same endianness, pointer width
and PCRE2_SIZE type. For example, patterns compiled on a 32-bit system using
PCRE2's 16-bit library cannot be reloaded on a 64-bit system, nor can they be
reloaded using the 8-bit library.
.P
Note that "serialization" in PCRE2 does not convert compiled patterns to an
abstract format like Java or .NET serialization. The serialized output is
really just a bytecode dump, which is why it can only be reloaded in the same
environment as the one that created it. Hence the restrictions mentioned above.
Applications that are not statically linked with a fixed version of PCRE2 must
be prepared to recompile patterns from their sources, in order to be immune to
PCRE2 upgrades.
.
.
.SH "SECURITY CONCERNS"
@ -46,11 +54,11 @@ the byte stream that is passed to it.
.SH "SAVING COMPILED PATTERNS"
.rs
.sp
Before compiled patterns can be saved they must be serialized, that is,
converted to a stream of bytes. A single byte stream may contain any number of
compiled patterns, but they must all use the same character tables. A single
copy of the tables is included in the byte stream (its size is 1088 bytes). For
more details of character tables, see the
Before compiled patterns can be saved they must be serialized, which in PCRE2
means converting the pattern to a stream of bytes. A single byte stream may
contain any number of compiled patterns, but they must all use the same
character tables. A single copy of the tables is included in the byte stream
(its size is 1088 bytes). For more details of character tables, see the
.\" HTML <a href="pcre2api.html#localesupport">
.\" </a>
section on locale support
@ -184,6 +192,6 @@ Cambridge, England.
.rs
.sp
.nf
Last updated: 21 March 2017
Copyright (c) 1997-2017 University of Cambridge.
Last updated: 27 June 2018
Copyright (c) 1997-2018 University of Cambridge.
.fi

@ -1,4 +1,4 @@
.TH PCRE2TEST 1 "25 April 2018" "PCRE 10.32"
.TH PCRE2TEST 1 "27 June 2018" "PCRE 10.32"
.SH NAME
pcre2test - a program for testing Perl-compatible regular expressions.
.SH SYNOPSIS
@ -1895,15 +1895,20 @@ for serializing and de-serializing. They are described in the
documentation. In this section we describe the features of \fBpcre2test\fP that
can be used to test these functions.
.P
When a pattern with \fBpush\fP modifier is successfully compiled, it is pushed
onto a stack of compiled patterns, and \fBpcre2test\fP expects the next line to
contain a new pattern (or command) instead of a subject line. By contrast,
the \fBpushcopy\fP modifier causes a copy of the compiled pattern to be
stacked, leaving the original available for immediate matching. By using
\fBpush\fP and/or \fBpushcopy\fP, a number of patterns can be compiled and
retained. These modifiers are incompatible with \fBposix\fP, and control
modifiers that act at match time are ignored (with a message) for the stacked
patterns. The \fBjitverify\fP modifier applies only at compile time.
Note that "serialization" in PCRE2 does not convert compiled patterns to an
abstract format like Java or .NET. It just makes a reloadable byte code stream.
Hence the restrictions on reloading mentioned above.
.P
In \fBpcre2test\fP, when a pattern with \fBpush\fP modifier is successfully
compiled, it is pushed onto a stack of compiled patterns, and \fBpcre2test\fP
expects the next line to contain a new pattern (or command) instead of a
subject line. By contrast, the \fBpushcopy\fP modifier causes a copy of the
compiled pattern to be stacked, leaving the original available for immediate
matching. By using \fBpush\fP and/or \fBpushcopy\fP, a number of patterns can
be compiled and retained. These modifiers are incompatible with \fBposix\fP,
and control modifiers that act at match time are ignored (with a message) for
the stacked patterns. The \fBjitverify\fP modifier applies only at compile
time.
.P
The command
.sp
@ -1975,6 +1980,6 @@ Cambridge, England.
.rs
.sp
.nf
Last updated: 25 April 2018
Last updated: 27 June 2018
Copyright (c) 1997-2018 University of Cambridge.
.fi

@ -1747,16 +1747,20 @@ SAVING AND RESTORING COMPILED PATTERNS
ize documentation. In this section we describe the features of
pcre2test that can be used to test these functions.
When a pattern with push modifier is successfully compiled, it is
pushed onto a stack of compiled patterns, and pcre2test expects the
next line to contain a new pattern (or command) instead of a subject
line. By contrast, the pushcopy modifier causes a copy of the compiled
pattern to be stacked, leaving the original available for immediate
matching. By using push and/or pushcopy, a number of patterns can be
compiled and retained. These modifiers are incompatible with posix, and
control modifiers that act at match time are ignored (with a message)
for the stacked patterns. The jitverify modifier applies only at com-
pile time.
Note that "serialization" in PCRE2 does not convert compiled patterns
to an abstract format like Java or .NET. It just makes a reloadable
byte code stream. Hence the restrictions on reloading mentioned above.
In pcre2test, when a pattern with push modifier is successfully com-
piled, it is pushed onto a stack of compiled patterns, and pcre2test
expects the next line to contain a new pattern (or command) instead of
a subject line. By contrast, the pushcopy modifier causes a copy of the
compiled pattern to be stacked, leaving the original available for
immediate matching. By using push and/or pushcopy, a number of patterns
can be compiled and retained. These modifiers are incompatible with
posix, and control modifiers that act at match time are ignored (with a
message) for the stacked patterns. The jitverify modifier applies only
at compile time.
The command
@ -1813,5 +1817,5 @@ AUTHOR
REVISION
Last updated: 25 April 2018
Last updated: 27 June 2018
Copyright (c) 1997-2018 University of Cambridge.