Documentation update
parent 85fc061dcf
commit a5d81d06f4

@@ -40,7 +40,11 @@ GENERIC INSTRUCTIONS FOR THE PCRE2 C LIBRARY
 
 The following are generic instructions for building the PCRE2 C library "by
 hand". If you are going to use CMake, this section does not apply to you; you
-can skip ahead to the CMake section.
+can skip ahead to the CMake section. Note that the settings concerned with
+8-bit, 16-bit, and 32-bit code units relate to the type of data string that
+PCRE2 processes. They are NOT referring to the underlying operating system bit
+width. You do not have to do anything special to compile in a 64-bit
+environment, for example.
 
 (1) Copy or rename the file src/config.h.generic as src/config.h, and edit the
 macro settings that it contains to whatever is appropriate for your
@@ -86,11 +90,11 @@ can skip ahead to the CMake section.
 The tables in src/pcre2_chartables.c are defaults. The caller of PCRE2 can
 specify alternative tables at run time.
 
-(4) For an 8-bit library, compile the following source files from the src
-directory, setting -DPCRE2_CODE_UNIT_WIDTH=8 as a compiler option. Also
-set -DHAVE_CONFIG_H if you have set up src/config.h with your
-configuration, or else use other -D settings to change the configuration
-as required.
+(4) For a library that supports 8-bit code units in the character strings that
+it processes, compile the following source files from the src directory,
+setting -DPCRE2_CODE_UNIT_WIDTH=8 as a compiler option. Also set
+-DHAVE_CONFIG_H if you have set up src/config.h with your configuration,
+or else use other -D settings to change the configuration as required.
 
 pcre2_auto_possess.c
 pcre2_chartables.c
@@ -142,9 +146,9 @@ can skip ahead to the CMake section.
 If your system has static and shared libraries, you may have to do this
 once for each type.
 
-(6) If you want to build a 16-bit library or 32-bit library (as well as, or
-instead of the 8-bit library) just supply 16 or 32 as the value of
--DPCRE2_CODE_UNIT_WIDTH when you are compiling.
+(6) If you want to build a library that supports 16-bit or 32-bit code units,
+(as well as, or instead of the 8-bit library) just supply 16 or 32 as the
+value of -DPCRE2_CODE_UNIT_WIDTH when you are compiling.
 
 (7) If you want to build the POSIX wrapper functions (which apply only to the
 8-bit library), ensure that you have the src/pcre2posix.h file and then
@@ -401,6 +405,6 @@ Everything in that location, source and executable, is in EBCDIC and native
 z/OS file formats. The port provides an API for LE languages such as COBOL and
 for the z/OS and z/VM versions of the Rexx languages.
 
-==============================
-Last Updated: 14 November 2018
-==============================
+===========================
+Last Updated: 28 April 2021
+===========================

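The point made by the added text, that "8-bit", "16-bit", and "32-bit" describe the code units of the strings being processed and not the platform's word size, can be illustrated with a small sketch. This is not PCRE2 code: the helper name `code_unit_count` is hypothetical, and Python's built-in UTF-8, UTF-16, and UTF-32 codecs are assumed here as stand-ins for the three string types.

```python
import struct


def code_unit_count(text: str, width: int) -> int:
    """Count the code units needed to hold `text` at the given code unit width.

    Hypothetical helper for illustration only; `width` must be 8, 16 or 32.
    """
    codecs = {8: "utf-8", 16: "utf-16-le", 32: "utf-32-le"}
    encoded = text.encode(codecs[width])
    # Each code unit occupies width // 8 bytes in the encoded form.
    return len(encoded) // (width // 8)


# Four characters that need 1, 2, 3 and 4 bytes respectively in UTF-8.
sample = "a\u00e9\u20ac\U0001F600"

for width in (8, 16, 32):
    print(f"{width}-bit code units: {code_unit_count(sample, width)}")

# For contrast: the pointer width of the interpreter running this script.
# It has no influence on the counts printed above.
print(f"platform pointer width: {struct.calcsize('P') * 8} bits")
```

The same four-character string needs a different number of code units at each width, whatever pointer width the last line reports.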
@@ -38,8 +38,14 @@ Oniguruma syntax items, and there are options for requesting some minor changes
 that give better ECMAScript (aka JavaScript) compatibility.
 </P>
 <P>
-The source code for PCRE2 can be compiled to support 8-bit, 16-bit, or 32-bit
-code units, which means that up to three separate libraries may be installed.
+The source code for PCRE2 can be compiled to support strings of 8-bit, 16-bit,
+or 32-bit code units, which means that up to three separate libraries may be
+installed, one for each code unit size. The size of code unit is not related to
+the bit size of the underlying hardware. In a 64-bit environment that also
+supports 32-bit applications, versions of PCRE2 that are compiled in both
+64-bit and 32-bit modes may be needed.
+</P>
+<P>
 The original work to extend PCRE to 16-bit and 32-bit code units was done by
 Zoltan Herczeg and Christian Persch, respectively. In all three cases, strings
 can be interpreted either as one character per code unit, or as UTF-encoded
@@ -198,9 +204,9 @@ use my two initials, followed by the two digits 10, at the domain cam.ac.uk.
 </P>
 <br><a name="SEC5" href="#TOC1">REVISION</a><br>
 <P>
-Last updated: 17 September 2018
+Last updated: 28 April 2021
 <br>
-Copyright © 1997-2018 University of Cambridge.
+Copyright © 1997-2021 University of Cambridge.
 <br>
 <p>
 Return to the <a href="index.html">PCRE2 index page</a>.

@@ -1213,7 +1213,7 @@ Setting match controls
 The following modifiers affect the matching process or request additional
 information. Some of them may also be specified on a pattern line (see above),
 in which case they apply to every subject line that is matched against that
-pattern.
+pattern, but can be overridden by modifiers on the subject.
 <pre>
 aftertext show text after match
 allaftertext show text after captures
@@ -1421,6 +1421,11 @@ replacement strings cannot contain commas, because a comma signifies the end of
 a modifier. This is not thought to be an issue in a test program.
 </P>
 <P>
+Specifying a completely empty replacement string disables this modifier.
+However, it is possible to specify an empty replacement by providing a buffer
+length, as described below, for an otherwise empty replacement.
+</P>
+<P>
 Unlike subject strings, <b>pcre2test</b> does not process replacement strings
 for escape sequences. In UTF mode, a replacement string is checked to see if it
 is a valid UTF-8 string. If so, it is correctly converted to a UTF string of
@@ -2119,9 +2124,9 @@ Cambridge, England.
 </P>
 <br><a name="SEC21" href="#TOC1">REVISION</a><br>
 <P>
-Last updated: 14 September 2020
+Last updated: 28 April 2021
 <br>
-Copyright © 1997-2020 University of Cambridge.
+Copyright © 1997-2021 University of Cambridge.
 <br>
 <p>
 Return to the <a href="index.html">PCRE2 index page</a>.

@@ -34,16 +34,21 @@ INTRODUCTION
 requesting some minor changes that give better ECMAScript (aka Java-
 Script) compatibility.
 
-The source code for PCRE2 can be compiled to support 8-bit, 16-bit, or
-32-bit code units, which means that up to three separate libraries may
-be installed. The original work to extend PCRE to 16-bit and 32-bit
-code units was done by Zoltan Herczeg and Christian Persch, respec-
-tively. In all three cases, strings can be interpreted either as one
-character per code unit, or as UTF-encoded Unicode, with support for
-Unicode general category properties. Unicode support is optional at
-build time (but is the default). However, processing strings as UTF
-code units must be enabled explicitly at run time. The version of Uni-
-code in use can be discovered by running
+The source code for PCRE2 can be compiled to support strings of 8-bit,
+16-bit, or 32-bit code units, which means that up to three separate li-
+braries may be installed, one for each code unit size. The size of code
+unit is not related to the bit size of the underlying hardware. In a
+64-bit environment that also supports 32-bit applications, versions of
+PCRE2 that are compiled in both 64-bit and 32-bit modes may be needed.
+
+The original work to extend PCRE to 16-bit and 32-bit code units was
+done by Zoltan Herczeg and Christian Persch, respectively. In all three
+cases, strings can be interpreted either as one character per code
+unit, or as UTF-encoded Unicode, with support for Unicode general cate-
+gory properties. Unicode support is optional at build time (but is the
+default). However, processing strings as UTF code units must be enabled
+explicitly at run time. The version of Unicode in use can be discovered
+by running
 
 pcre2test -C
 
@@ -177,8 +182,8 @@ AUTHOR
 
 REVISION
 
-Last updated: 17 September 2018
-Copyright (c) 1997-2018 University of Cambridge.
+Last updated: 28 April 2021
+Copyright (c) 1997-2021 University of Cambridge.
 ------------------------------------------------------------------------------
 
 

@@ -1084,7 +1084,8 @@ SUBJECT MODIFIERS
 The following modifiers affect the matching process or request addi-
 tional information. Some of them may also be specified on a pattern
 line (see above), in which case they apply to every subject line that
-is matched against that pattern.
+is matched against that pattern, but can be overridden by modifiers on
+the subject.
 
 aftertext show text after match
 allaftertext show text after captures
@@ -1276,6 +1277,11 @@ SUBJECT MODIFIERS
 end of a modifier. This is not thought to be an issue in a test pro-
 gram.
 
+Specifying a completely empty replacement string disables this modi-
+fier. However, it is possible to specify an empty replacement by pro-
+viding a buffer length, as described below, for an otherwise empty re-
+placement.
+
 Unlike subject strings, pcre2test does not process replacement strings
 for escape sequences. In UTF mode, a replacement string is checked to
 see if it is a valid UTF-8 string. If so, it is correctly converted to
@@ -1929,5 +1935,5 @@ AUTHOR
 
 REVISION
 
-Last updated: 14 September 2020
-Copyright (c) 1997-2020 University of Cambridge.
+Last updated: 28 April 2021
+Copyright (c) 1997-2021 University of Cambridge.