Documentation update to clarify what PCRE2 serialization is.

Philip.Hazel 2018-06-27 17:20:58 +00:00
parent 374770c2e3
commit 7e921fda05
16 changed files with 684 additions and 610 deletions

@ -28,7 +28,10 @@ DESCRIPTION
</b><br> </b><br>
<P> <P>
This function decodes a serialized set of compiled patterns back into a list of This function decodes a serialized set of compiled patterns back into a list of
individual patterns. Its arguments are: individual patterns. This is possible only on a host that is running the same
version of PCRE2, with the same code unit width, and the host must also have
the same endianness, pointer width and PCRE2_SIZE type. The arguments for
<b>pcre2_serialize_decode()</b> are:
<pre> <pre>
<i>codes</i> pointer to a vector in which to build the list <i>codes</i> pointer to a vector in which to build the list
<i>number_of_codes</i> number of slots in the vector <i>number_of_codes</i> number of slots in the vector
@ -54,8 +57,8 @@ on a system with different endianness.
<P> <P>
There is a complete description of the PCRE2 native API in the There is a complete description of the PCRE2 native API in the
<a href="pcre2api.html"><b>pcre2api</b></a> <a href="pcre2api.html"><b>pcre2api</b></a>
page and a description of the POSIX API in the page and a description of the serialization functions in the
<a href="pcre2posix.html"><b>pcre2posix</b></a> <a href="pcre2serialize.html"><b>pcre2serialize</b></a>
page. page.
<p> <p>
Return to the <a href="index.html">PCRE2 index page</a>. Return to the <a href="index.html">PCRE2 index page</a>.

@ -28,7 +28,12 @@ DESCRIPTION
</b><br> </b><br>
<P> <P>
This function encodes a list of compiled patterns into a byte stream that can This function encodes a list of compiled patterns into a byte stream that can
be saved on disc or elsewhere. Its arguments are: be saved on disc or elsewhere. Note that this is not an abstract format like
Java or .NET. Conversion of the byte stream back into usable compiled patterns
can only happen on a host that is running the same version of PCRE2, with the
same code unit width, and the host must also have the same endianness, pointer
width and PCRE2_SIZE type. The arguments for <b>pcre2_serialize_encode()</b>
are:
<pre> <pre>
<i>codes</i> pointer to a vector containing the list <i>codes</i> pointer to a vector containing the list
<i>number_of_codes</i> number of slots in the vector <i>number_of_codes</i> number of slots in the vector
@ -53,8 +58,8 @@ that a slot in the vector does not point to a compiled pattern.
<P> <P>
There is a complete description of the PCRE2 native API in the There is a complete description of the PCRE2 native API in the
<a href="pcre2api.html"><b>pcre2api</b></a> <a href="pcre2api.html"><b>pcre2api</b></a>
page and a description of the POSIX API in the page and a description of the serialization functions in the
<a href="pcre2posix.html"><b>pcre2posix</b></a> <a href="pcre2serialize.html"><b>pcre2serialize</b></a>
page. page.
<p> <p>
Return to the <a href="index.html">PCRE2 index page</a>. Return to the <a href="index.html">PCRE2 index page</a>.

@ -32,8 +32,8 @@ must point to such a byte stream.
<P> <P>
There is a complete description of the PCRE2 native API in the There is a complete description of the PCRE2 native API in the
<a href="pcre2api.html"><b>pcre2api</b></a> <a href="pcre2api.html"><b>pcre2api</b></a>
page and a description of the POSIX API in the page and a description of the serialization functions in the
<a href="pcre2posix.html"><b>pcre2posix</b></a> <a href="pcre2serialize.html"><b>pcre2serialize</b></a>
page. page.
<p> <p>
Return to the <a href="index.html">PCRE2 index page</a>. Return to the <a href="index.html">PCRE2 index page</a>.

@ -41,8 +41,8 @@ on a system with different endianness.
<P> <P>
There is a complete description of the PCRE2 native API in the There is a complete description of the PCRE2 native API in the
<a href="pcre2api.html"><b>pcre2api</b></a> <a href="pcre2api.html"><b>pcre2api</b></a>
page and a description of the POSIX API in the page and a description of the serialization functions in the
<a href="pcre2posix.html"><b>pcre2posix</b></a> <a href="pcre2serialize.html"><b>pcre2serialize</b></a>
page. page.
<p> <p>
Return to the <a href="index.html">PCRE2 index page</a>. Return to the <a href="index.html">PCRE2 index page</a>.

@ -2283,11 +2283,16 @@ documentation, which also gives further details about callouts.
<br><a name="SEC25" href="#TOC1">SERIALIZATION AND PRECOMPILING</a><br> <br><a name="SEC25" href="#TOC1">SERIALIZATION AND PRECOMPILING</a><br>
<P> <P>
It is possible to save compiled patterns on disc or elsewhere, and reload them It is possible to save compiled patterns on disc or elsewhere, and reload them
later, subject to a number of restrictions. The functions whose names begin later, subject to a number of restrictions. The host on which the patterns are
with <b>pcre2_serialize_</b> are used for this purpose. They are described in reloaded must be running the same version of PCRE2, with the same code unit
the width, and must also have the same endianness, pointer width, and PCRE2_SIZE
type. Before compiled patterns can be saved, they must be converted to a
"serialized" form, which in the case of PCRE2 is really just a bytecode dump.
The functions whose names begin with <b>pcre2_serialize_</b> are used for
converting to and from the serialized form. They are described in the
<a href="pcre2serialize.html"><b>pcre2serialize</b></a> <a href="pcre2serialize.html"><b>pcre2serialize</b></a>
documentation. documentation. Note that PCRE2 serialization does not convert compiled patterns
to an abstract format like Java or .NET serialization.
<a name="matchdatablock"></a></P> <a name="matchdatablock"></a></P>
<br><a name="SEC26" href="#TOC1">THE MATCH DATA BLOCK</a><br> <br><a name="SEC26" href="#TOC1">THE MATCH DATA BLOCK</a><br>
<P> <P>

@ -49,6 +49,15 @@ and PCRE2_SIZE type. For example, patterns compiled on a 32-bit system using
PCRE2's 16-bit library cannot be reloaded on a 64-bit system, nor can they be PCRE2's 16-bit library cannot be reloaded on a 64-bit system, nor can they be
reloaded using the 8-bit library. reloaded using the 8-bit library.
</P> </P>
<P>
Note that "serialization" in PCRE2 does not convert compiled patterns to an
abstract format like Java or .NET serialization. The serialized output is
really just a bytecode dump, which is why it can only be reloaded in the same
environment as the one that created it. Hence the restrictions mentioned above.
Applications that are not statically linked with a fixed version of PCRE2 must
be prepared to recompile patterns from their sources, in order to be immune to
PCRE2 upgrades.
</P>
<br><a name="SEC2" href="#TOC1">SECURITY CONCERNS</a><br> <br><a name="SEC2" href="#TOC1">SECURITY CONCERNS</a><br>
<P> <P>
The facility for saving and restoring compiled patterns is intended for use The facility for saving and restoring compiled patterns is intended for use
@ -62,11 +71,11 @@ the byte stream that is passed to it.
</P> </P>
<br><a name="SEC3" href="#TOC1">SAVING COMPILED PATTERNS</a><br> <br><a name="SEC3" href="#TOC1">SAVING COMPILED PATTERNS</a><br>
<P> <P>
Before compiled patterns can be saved they must be serialized, that is, Before compiled patterns can be saved they must be serialized, which in PCRE2
converted to a stream of bytes. A single byte stream may contain any number of means converting the pattern to a stream of bytes. A single byte stream may
compiled patterns, but they must all use the same character tables. A single contain any number of compiled patterns, but they must all use the same
copy of the tables is included in the byte stream (its size is 1088 bytes). For character tables. A single copy of the tables is included in the byte stream
more details of character tables, see the (its size is 1088 bytes). For more details of character tables, see the
<a href="pcre2api.html#localesupport">section on locale support</a> <a href="pcre2api.html#localesupport">section on locale support</a>
in the in the
<a href="pcre2api.html"><b>pcre2api</b></a> <a href="pcre2api.html"><b>pcre2api</b></a>
@ -193,9 +202,9 @@ Cambridge, England.
</P> </P>
<br><a name="SEC6" href="#TOC1">REVISION</a><br> <br><a name="SEC6" href="#TOC1">REVISION</a><br>
<P> <P>
Last updated: 21 March 2017 Last updated: 27 June 2018
<br> <br>
Copyright &copy; 1997-2017 University of Cambridge. Copyright &copy; 1997-2018 University of Cambridge.
<br> <br>
<p> <p>
Return to the <a href="index.html">PCRE2 index page</a>. Return to the <a href="index.html">PCRE2 index page</a>.

@ -1927,15 +1927,21 @@ documentation. In this section we describe the features of <b>pcre2test</b> that
can be used to test these functions. can be used to test these functions.
</P> </P>
<P> <P>
When a pattern with <b>push</b> modifier is successfully compiled, it is pushed Note that "serialization" in PCRE2 does not convert compiled patterns to an
onto a stack of compiled patterns, and <b>pcre2test</b> expects the next line to abstract format like Java or .NET. It just makes a reloadable byte code stream.
contain a new pattern (or command) instead of a subject line. By contrast, Hence the restrictions on reloading mentioned above.
the <b>pushcopy</b> modifier causes a copy of the compiled pattern to be </P>
stacked, leaving the original available for immediate matching. By using <P>
<b>push</b> and/or <b>pushcopy</b>, a number of patterns can be compiled and In <b>pcre2test</b>, when a pattern with <b>push</b> modifier is successfully
retained. These modifiers are incompatible with <b>posix</b>, and control compiled, it is pushed onto a stack of compiled patterns, and <b>pcre2test</b>
modifiers that act at match time are ignored (with a message) for the stacked expects the next line to contain a new pattern (or command) instead of a
patterns. The <b>jitverify</b> modifier applies only at compile time. subject line. By contrast, the <b>pushcopy</b> modifier causes a copy of the
compiled pattern to be stacked, leaving the original available for immediate
matching. By using <b>push</b> and/or <b>pushcopy</b>, a number of patterns can
be compiled and retained. These modifiers are incompatible with <b>posix</b>,
and control modifiers that act at match time are ignored (with a message) for
the stacked patterns. The <b>jitverify</b> modifier applies only at compile
time.
</P> </P>
<P> <P>
The command The command
@ -1996,7 +2002,7 @@ Cambridge, England.
</P> </P>
<br><a name="SEC21" href="#TOC1">REVISION</a><br> <br><a name="SEC21" href="#TOC1">REVISION</a><br>
<P> <P>
Last updated: 25 April 2018 Last updated: 27 June 2018
<br> <br>
Copyright &copy; 1997-2018 University of Cambridge. Copyright &copy; 1997-2018 University of Cambridge.
<br> <br>

File diff suppressed because it is too large.

@ -1,4 +1,4 @@
.TH PCRE2_SERIALIZE_DECODE 3 "02 September 2015" "PCRE2 10.21" .TH PCRE2_SERIALIZE_DECODE 3 "27 June 2018" "PCRE2 10.32"
.SH NAME .SH NAME
PCRE2 - Perl-compatible regular expressions (revised API) PCRE2 - Perl-compatible regular expressions (revised API)
.SH SYNOPSIS .SH SYNOPSIS
@ -16,7 +16,10 @@ PCRE2 - Perl-compatible regular expressions (revised API)
.rs .rs
.sp .sp
This function decodes a serialized set of compiled patterns back into a list of This function decodes a serialized set of compiled patterns back into a list of
individual patterns. Its arguments are: individual patterns. This is possible only on a host that is running the same
version of PCRE2, with the same code unit width, and the host must also have
the same endianness, pointer width and PCRE2_SIZE type. The arguments for
\fBpcre2_serialize_decode()\fP are:
.sp .sp
\fIcodes\fP pointer to a vector in which to build the list \fIcodes\fP pointer to a vector in which to build the list
\fInumber_of_codes\fP number of slots in the vector \fInumber_of_codes\fP number of slots in the vector
@ -43,8 +46,8 @@ There is a complete description of the PCRE2 native API in the
.\" HREF .\" HREF
\fBpcre2api\fP \fBpcre2api\fP
.\" .\"
page and a description of the POSIX API in the page and a description of the serialization functions in the
.\" HREF .\" HREF
\fBpcre2posix\fP \fBpcre2serialize\fP
.\" .\"
page. page.

@ -1,4 +1,4 @@
.TH PCRE2_SERIALIZE_ENCODE 3 "02 September 2015" "PCRE2 10.21" .TH PCRE2_SERIALIZE_ENCODE 3 "27 June 2018" "PCRE2 10.32"
.SH NAME .SH NAME
PCRE2 - Perl-compatible regular expressions (revised API) PCRE2 - Perl-compatible regular expressions (revised API)
.SH SYNOPSIS .SH SYNOPSIS
@ -16,7 +16,12 @@ PCRE2 - Perl-compatible regular expressions (revised API)
.rs .rs
.sp .sp
This function encodes a list of compiled patterns into a byte stream that can This function encodes a list of compiled patterns into a byte stream that can
be saved on disc or elsewhere. Its arguments are: be saved on disc or elsewhere. Note that this is not an abstract format like
Java or .NET. Conversion of the byte stream back into usable compiled patterns
can only happen on a host that is running the same version of PCRE2, with the
same code unit width, and the host must also have the same endianness, pointer
width and PCRE2_SIZE type. The arguments for \fBpcre2_serialize_encode()\fP
are:
.sp .sp
\fIcodes\fP pointer to a vector containing the list \fIcodes\fP pointer to a vector containing the list
\fInumber_of_codes\fP number of slots in the vector \fInumber_of_codes\fP number of slots in the vector
@ -42,8 +47,8 @@ There is a complete description of the PCRE2 native API in the
.\" HREF .\" HREF
\fBpcre2api\fP \fBpcre2api\fP
.\" .\"
page and a description of the POSIX API in the page and a description of the serialization functions in the
.\" HREF .\" HREF
\fBpcre2posix\fP \fBpcre2serialize\fP
.\" .\"
page. page.

@ -1,4 +1,4 @@
.TH PCRE2_SERIALIZE_FREE 3 "19 January 2015" "PCRE2 10.10" .TH PCRE2_SERIALIZE_FREE 3 "27 June 2018" "PCRE2 10.32"
.SH NAME .SH NAME
PCRE2 - Perl-compatible regular expressions (revised API) PCRE2 - Perl-compatible regular expressions (revised API)
.SH SYNOPSIS .SH SYNOPSIS
@ -21,8 +21,8 @@ There is a complete description of the PCRE2 native API in the
.\" HREF .\" HREF
\fBpcre2api\fP \fBpcre2api\fP
.\" .\"
page and a description of the POSIX API in the page and a description of the serialization functions in the
.\" HREF .\" HREF
\fBpcre2posix\fP \fBpcre2serialize\fP
.\" .\"
page. page.

@ -1,4 +1,4 @@
.TH PCRE2_SERIALIZE_GET_NUMBER_OF_CODES 3 "19 January 2015" "PCRE2 10.10" .TH PCRE2_SERIALIZE_GET_NUMBER_OF_CODES 3 "27 June 2018" "PCRE2 10.32"
.SH NAME .SH NAME
PCRE2 - Perl-compatible regular expressions (revised API) PCRE2 - Perl-compatible regular expressions (revised API)
.SH SYNOPSIS .SH SYNOPSIS
@ -30,8 +30,8 @@ There is a complete description of the PCRE2 native API in the
.\" HREF .\" HREF
\fBpcre2api\fP \fBpcre2api\fP
.\" .\"
page and a description of the POSIX API in the page and a description of the serialization functions in the
.\" HREF .\" HREF
\fBpcre2posix\fP \fBpcre2serialize\fP
.\" .\"
page. page.

@ -2250,13 +2250,18 @@ documentation, which also gives further details about callouts.
.rs .rs
.sp .sp
It is possible to save compiled patterns on disc or elsewhere, and reload them It is possible to save compiled patterns on disc or elsewhere, and reload them
later, subject to a number of restrictions. The functions whose names begin later, subject to a number of restrictions. The host on which the patterns are
with \fBpcre2_serialize_\fP are used for this purpose. They are described in reloaded must be running the same version of PCRE2, with the same code unit
the width, and must also have the same endianness, pointer width, and PCRE2_SIZE
type. Before compiled patterns can be saved, they must be converted to a
"serialized" form, which in the case of PCRE2 is really just a bytecode dump.
The functions whose names begin with \fBpcre2_serialize_\fP are used for
converting to and from the serialized form. They are described in the
.\" HREF .\" HREF
\fBpcre2serialize\fP \fBpcre2serialize\fP
.\" .\"
documentation. documentation. Note that PCRE2 serialization does not convert compiled patterns
to an abstract format like Java or .NET serialization.
. .
. .
.\" HTML <a name="matchdatablock"></a> .\" HTML <a name="matchdatablock"></a>

@ -1,4 +1,4 @@
.TH PCRE2SERIALIZE 3 "21 March 2017" "PCRE2 10.30" .TH PCRE2SERIALIZE 3 "27 June 2018" "PCRE2 10.32"
.SH NAME .SH NAME
PCRE2 - Perl-compatible regular expressions (revised API) PCRE2 - Perl-compatible regular expressions (revised API)
.SH "SAVING AND RE-USING PRECOMPILED PCRE2 PATTERNS" .SH "SAVING AND RE-USING PRECOMPILED PCRE2 PATTERNS"
@ -28,6 +28,14 @@ the same code unit width, and must also have the same endianness, pointer width
and PCRE2_SIZE type. For example, patterns compiled on a 32-bit system using and PCRE2_SIZE type. For example, patterns compiled on a 32-bit system using
PCRE2's 16-bit library cannot be reloaded on a 64-bit system, nor can they be PCRE2's 16-bit library cannot be reloaded on a 64-bit system, nor can they be
reloaded using the 8-bit library. reloaded using the 8-bit library.
.P
Note that "serialization" in PCRE2 does not convert compiled patterns to an
abstract format like Java or .NET serialization. The serialized output is
really just a bytecode dump, which is why it can only be reloaded in the same
environment as the one that created it. Hence the restrictions mentioned above.
Applications that are not statically linked with a fixed version of PCRE2 must
be prepared to recompile patterns from their sources, in order to be immune to
PCRE2 upgrades.
. .
. .
.SH "SECURITY CONCERNS" .SH "SECURITY CONCERNS"
@ -46,11 +54,11 @@ the byte stream that is passed to it.
.SH "SAVING COMPILED PATTERNS" .SH "SAVING COMPILED PATTERNS"
.rs .rs
.sp .sp
Before compiled patterns can be saved they must be serialized, that is, Before compiled patterns can be saved they must be serialized, which in PCRE2
converted to a stream of bytes. A single byte stream may contain any number of means converting the pattern to a stream of bytes. A single byte stream may
compiled patterns, but they must all use the same character tables. A single contain any number of compiled patterns, but they must all use the same
copy of the tables is included in the byte stream (its size is 1088 bytes). For character tables. A single copy of the tables is included in the byte stream
more details of character tables, see the (its size is 1088 bytes). For more details of character tables, see the
.\" HTML <a href="pcre2api.html#localesupport"> .\" HTML <a href="pcre2api.html#localesupport">
.\" </a> .\" </a>
section on locale support section on locale support
@ -184,6 +192,6 @@ Cambridge, England.
.rs .rs
.sp .sp
.nf .nf
Last updated: 21 March 2017 Last updated: 27 June 2018
Copyright (c) 1997-2017 University of Cambridge. Copyright (c) 1997-2018 University of Cambridge.
.fi .fi

@ -1,4 +1,4 @@
.TH PCRE2TEST 1 "25 April 2018" "PCRE 10.32" .TH PCRE2TEST 1 "27 June 2018" "PCRE 10.32"
.SH NAME .SH NAME
pcre2test - a program for testing Perl-compatible regular expressions. pcre2test - a program for testing Perl-compatible regular expressions.
.SH SYNOPSIS .SH SYNOPSIS
@ -1895,15 +1895,20 @@ for serializing and de-serializing. They are described in the
documentation. In this section we describe the features of \fBpcre2test\fP that documentation. In this section we describe the features of \fBpcre2test\fP that
can be used to test these functions. can be used to test these functions.
.P .P
When a pattern with \fBpush\fP modifier is successfully compiled, it is pushed Note that "serialization" in PCRE2 does not convert compiled patterns to an
onto a stack of compiled patterns, and \fBpcre2test\fP expects the next line to abstract format like Java or .NET. It just makes a reloadable byte code stream.
contain a new pattern (or command) instead of a subject line. By contrast, Hence the restrictions on reloading mentioned above.
the \fBpushcopy\fP modifier causes a copy of the compiled pattern to be .P
stacked, leaving the original available for immediate matching. By using In \fBpcre2test\fP, when a pattern with \fBpush\fP modifier is successfully
\fBpush\fP and/or \fBpushcopy\fP, a number of patterns can be compiled and compiled, it is pushed onto a stack of compiled patterns, and \fBpcre2test\fP
retained. These modifiers are incompatible with \fBposix\fP, and control expects the next line to contain a new pattern (or command) instead of a
modifiers that act at match time are ignored (with a message) for the stacked subject line. By contrast, the \fBpushcopy\fP modifier causes a copy of the
patterns. The \fBjitverify\fP modifier applies only at compile time. compiled pattern to be stacked, leaving the original available for immediate
matching. By using \fBpush\fP and/or \fBpushcopy\fP, a number of patterns can
be compiled and retained. These modifiers are incompatible with \fBposix\fP,
and control modifiers that act at match time are ignored (with a message) for
the stacked patterns. The \fBjitverify\fP modifier applies only at compile
time.
.P .P
The command The command
.sp .sp
@ -1975,6 +1980,6 @@ Cambridge, England.
.rs .rs
.sp .sp
.nf .nf
Last updated: 25 April 2018 Last updated: 27 June 2018
Copyright (c) 1997-2018 University of Cambridge. Copyright (c) 1997-2018 University of Cambridge.
.fi .fi

@ -1747,16 +1747,20 @@ SAVING AND RESTORING COMPILED PATTERNS
ize documentation. In this section we describe the features of ize documentation. In this section we describe the features of
pcre2test that can be used to test these functions. pcre2test that can be used to test these functions.
When a pattern with push modifier is successfully compiled, it is Note that "serialization" in PCRE2 does not convert compiled patterns
pushed onto a stack of compiled patterns, and pcre2test expects the to an abstract format like Java or .NET. It just makes a reloadable
next line to contain a new pattern (or command) instead of a subject byte code stream. Hence the restrictions on reloading mentioned above.
line. By contrast, the pushcopy modifier causes a copy of the compiled
pattern to be stacked, leaving the original available for immediate In pcre2test, when a pattern with push modifier is successfully com-
matching. By using push and/or pushcopy, a number of patterns can be piled, it is pushed onto a stack of compiled patterns, and pcre2test
compiled and retained. These modifiers are incompatible with posix, and expects the next line to contain a new pattern (or command) instead of
control modifiers that act at match time are ignored (with a message) a subject line. By contrast, the pushcopy modifier causes a copy of the
for the stacked patterns. The jitverify modifier applies only at com- compiled pattern to be stacked, leaving the original available for
pile time. immediate matching. By using push and/or pushcopy, a number of patterns
can be compiled and retained. These modifiers are incompatible with
posix, and control modifiers that act at match time are ignored (with a
message) for the stacked patterns. The jitverify modifier applies only
at compile time.
The command The command
@ -1813,5 +1817,5 @@ AUTHOR
REVISION REVISION
Last updated: 25 April 2018 Last updated: 27 June 2018
Copyright (c) 1997-2018 University of Cambridge. Copyright (c) 1997-2018 University of Cambridge.
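
The reload restrictions that this commit documents (same PCRE2 version, same code unit width, same endianness, pointer width and PCRE2_SIZE type) suggest one way an application might decide whether to reload a saved byte stream or recompile from the pattern sources. The sketch below is illustrative only and is not part of PCRE2: `HostFingerprint`, `current_fingerprint` and `can_reload` are hypothetical names, the version string and code unit width are assumed to be supplied by the application, and PCRE2_SIZE is assumed to have the width of `size_t`. Python is used only for brevity.

```python
import struct
import sys
from typing import NamedTuple


class HostFingerprint(NamedTuple):
    """Hypothetical record of the properties the documentation lists."""
    pcre2_version: str    # e.g. "10.32"; supplied by the application
    code_unit_width: int  # 8, 16 or 32
    byteorder: str        # "little" or "big"
    pointer_bytes: int    # size of a native pointer (void *)
    size_t_bytes: int     # size of size_t, assumed here to match PCRE2_SIZE


def current_fingerprint(pcre2_version: str, code_unit_width: int) -> HostFingerprint:
    """Describe the host this process is running on."""
    return HostFingerprint(
        pcre2_version,
        code_unit_width,
        sys.byteorder,
        struct.calcsize("P"),  # native pointer size
        struct.calcsize("N"),  # native size_t size
    )


def can_reload(saved: HostFingerprint, current: HostFingerprint) -> bool:
    """Reloading is only worth attempting when every field matches."""
    return saved == current


if __name__ == "__main__":
    saved = current_fingerprint("10.32", 16)   # stored next to the byte stream
    here = current_fingerprint("10.32", 8)     # the process doing the reload
    print(can_reload(saved, saved))  # True: identical environment
    print(can_reload(saved, here))   # False: code unit width differs
```

An application that stores such a record next to the saved byte stream could fall back to recompiling from the pattern sources whenever `can_reload` returns false, which is the fallback the new pcre2serialize paragraph asks dynamically linked applications to be prepared for.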