Documentation update
parent 85fc061dcf
commit a5d81d06f4
@@ -40,7 +40,11 @@ GENERIC INSTRUCTIONS FOR THE PCRE2 C LIBRARY
 
 The following are generic instructions for building the PCRE2 C library "by
 hand". If you are going to use CMake, this section does not apply to you; you
-can skip ahead to the CMake section.
+can skip ahead to the CMake section. Note that the settings concerned with
+8-bit, 16-bit, and 32-bit code units relate to the type of data string that
+PCRE2 processes. They are NOT referring to the underlying operating system bit
+width. You do not have to do anything special to compile in a 64-bit
+environment, for example.
 
 (1) Copy or rename the file src/config.h.generic as src/config.h, and edit the
 macro settings that it contains to whatever is appropriate for your
@@ -86,11 +90,11 @@ can skip ahead to the CMake section.
 The tables in src/pcre2_chartables.c are defaults. The caller of PCRE2 can
 specify alternative tables at run time.
 
-(4) For an 8-bit library, compile the following source files from the src
-directory, setting -DPCRE2_CODE_UNIT_WIDTH=8 as a compiler option. Also
-set -DHAVE_CONFIG_H if you have set up src/config.h with your
-configuration, or else use other -D settings to change the configuration
-as required.
+(4) For a library that supports 8-bit code units in the character strings that
+it processes, compile the following source files from the src directory,
+setting -DPCRE2_CODE_UNIT_WIDTH=8 as a compiler option. Also set
+-DHAVE_CONFIG_H if you have set up src/config.h with your configuration,
+or else use other -D settings to change the configuration as required.
 
 pcre2_auto_possess.c
 pcre2_chartables.c
@@ -142,9 +146,9 @@ can skip ahead to the CMake section.
 If your system has static and shared libraries, you may have to do this
 once for each type.
 
-(6) If you want to build a 16-bit library or 32-bit library (as well as, or
-instead of the 8-bit library) just supply 16 or 32 as the value of
--DPCRE2_CODE_UNIT_WIDTH when you are compiling.
+(6) If you want to build a library that supports 16-bit or 32-bit code units,
+(as well as, or instead of the 8-bit library) just supply 16 or 32 as the
+value of -DPCRE2_CODE_UNIT_WIDTH when you are compiling.
 
 (7) If you want to build the POSIX wrapper functions (which apply only to the
 8-bit library), ensure that you have the src/pcre2posix.h file and then
@@ -401,6 +405,6 @@ Everything in that location, source and executable, is in EBCDIC and native
 z/OS file formats. The port provides an API for LE languages such as COBOL and
 for the z/OS and z/VM versions of the Rexx languages.
 
-==============================
-Last Updated: 14 November 2018
-==============================
+===========================
+Last Updated: 28 April 2021
+===========================
@@ -38,8 +38,14 @@ Oniguruma syntax items, and there are options for requesting some minor changes
 that give better ECMAScript (aka JavaScript) compatibility.
 </P>
 <P>
-The source code for PCRE2 can be compiled to support 8-bit, 16-bit, or 32-bit
-code units, which means that up to three separate libraries may be installed.
+The source code for PCRE2 can be compiled to support strings of 8-bit, 16-bit,
+or 32-bit code units, which means that up to three separate libraries may be
+installed, one for each code unit size. The size of code unit is not related to
+the bit size of the underlying hardware. In a 64-bit environment that also
+supports 32-bit applications, versions of PCRE2 that are compiled in both
+64-bit and 32-bit modes may be needed.
+</P>
+<P>
 The original work to extend PCRE to 16-bit and 32-bit code units was done by
 Zoltan Herczeg and Christian Persch, respectively. In all three cases, strings
 can be interpreted either as one character per code unit, or as UTF-encoded
@@ -198,9 +204,9 @@ use my two initials, followed by the two digits 10, at the domain cam.ac.uk.
 </P>
 <br><a name="SEC5" href="#TOC1">REVISION</a><br>
 <P>
-Last updated: 17 September 2018
+Last updated: 28 April 2021
 <br>
-Copyright © 1997-2018 University of Cambridge.
+Copyright © 1997-2021 University of Cambridge.
 <br>
 <p>
 Return to the <a href="index.html">PCRE2 index page</a>.
@@ -1213,7 +1213,7 @@ Setting match controls
 The following modifiers affect the matching process or request additional
 information. Some of them may also be specified on a pattern line (see above),
 in which case they apply to every subject line that is matched against that
-pattern.
+pattern, but can be overridden by modifiers on the subject.
 <pre>
 aftertext show text after match
 allaftertext show text after captures
@@ -1421,6 +1421,11 @@ replacement strings cannot contain commas, because a comma signifies the end of
 a modifier. This is not thought to be an issue in a test program.
 </P>
 <P>
+Specifying a completely empty replacement string disables this modifier.
+However, it is possible to specify an empty replacement by providing a buffer
+length, as described below, for an otherwise empty replacement.
+</P>
+<P>
 Unlike subject strings, <b>pcre2test</b> does not process replacement strings
 for escape sequences. In UTF mode, a replacement string is checked to see if it
 is a valid UTF-8 string. If so, it is correctly converted to a UTF string of
@@ -2119,9 +2124,9 @@ Cambridge, England.
 </P>
 <br><a name="SEC21" href="#TOC1">REVISION</a><br>
 <P>
-Last updated: 14 September 2020
+Last updated: 28 April 2021
 <br>
-Copyright © 1997-2020 University of Cambridge.
+Copyright © 1997-2021 University of Cambridge.
 <br>
 <p>
 Return to the <a href="index.html">PCRE2 index page</a>.
@@ -34,16 +34,21 @@ INTRODUCTION
 requesting some minor changes that give better ECMAScript (aka Java-
 Script) compatibility.
 
-The source code for PCRE2 can be compiled to support 8-bit, 16-bit, or
-32-bit code units, which means that up to three separate libraries may
-be installed. The original work to extend PCRE to 16-bit and 32-bit
-code units was done by Zoltan Herczeg and Christian Persch, respec-
-tively. In all three cases, strings can be interpreted either as one
-character per code unit, or as UTF-encoded Unicode, with support for
-Unicode general category properties. Unicode support is optional at
-build time (but is the default). However, processing strings as UTF
-code units must be enabled explicitly at run time. The version of Uni-
-code in use can be discovered by running
+The source code for PCRE2 can be compiled to support strings of 8-bit,
+16-bit, or 32-bit code units, which means that up to three separate li-
+braries may be installed, one for each code unit size. The size of code
+unit is not related to the bit size of the underlying hardware. In a
+64-bit environment that also supports 32-bit applications, versions of
+PCRE2 that are compiled in both 64-bit and 32-bit modes may be needed.
+
+The original work to extend PCRE to 16-bit and 32-bit code units was
+done by Zoltan Herczeg and Christian Persch, respectively. In all three
+cases, strings can be interpreted either as one character per code
+unit, or as UTF-encoded Unicode, with support for Unicode general cate-
+gory properties. Unicode support is optional at build time (but is the
+default). However, processing strings as UTF code units must be enabled
+explicitly at run time. The version of Unicode in use can be discovered
+by running
 
 pcre2test -C
 
@@ -177,8 +182,8 @@ AUTHOR
 
 REVISION
 
-Last updated: 17 September 2018
-Copyright (c) 1997-2018 University of Cambridge.
+Last updated: 28 April 2021
+Copyright (c) 1997-2021 University of Cambridge.
 ------------------------------------------------------------------------------
 
 
@@ -1084,7 +1084,8 @@ SUBJECT MODIFIERS
 The following modifiers affect the matching process or request addi-
 tional information. Some of them may also be specified on a pattern
 line (see above), in which case they apply to every subject line that
-is matched against that pattern.
+is matched against that pattern, but can be overridden by modifiers on
+the subject.
 
 aftertext show text after match
 allaftertext show text after captures
@@ -1276,6 +1277,11 @@ SUBJECT MODIFIERS
 end of a modifier. This is not thought to be an issue in a test pro-
 gram.
 
+Specifying a completely empty replacement string disables this modi-
+fier. However, it is possible to specify an empty replacement by pro-
+viding a buffer length, as described below, for an otherwise empty re-
+placement.
+
 Unlike subject strings, pcre2test does not process replacement strings
 for escape sequences. In UTF mode, a replacement string is checked to
 see if it is a valid UTF-8 string. If so, it is correctly converted to
@@ -1929,5 +1935,5 @@ AUTHOR
 
 REVISION
 
-Last updated: 14 September 2020
-Copyright (c) 1997-2020 University of Cambridge.
+Last updated: 28 April 2021
+Copyright (c) 1997-2021 University of Cambridge.