Update HTML and derived documentation.

Philip.Hazel 2018-02-25 18:00:56 +00:00
parent e5b34b3555
commit 3236d6868c
5 changed files with 191 additions and 150 deletions


@ -171,10 +171,12 @@ library. They are also documented in the pcre2build man page.
give large performance improvements on certain platforms, add --enable-jit to give large performance improvements on certain platforms, add --enable-jit to
the "configure" command. This support is available only for certain hardware the "configure" command. This support is available only for certain hardware
architectures. If you try to enable it on an unsupported architecture, there architectures. If you try to enable it on an unsupported architecture, there
will be a compile time error. If you are running under SELinux you may also will be a compile time error. If in doubt, use --enable-jit=auto, which
want to add --enable-jit-sealloc, which enables the use of an execmem enables JIT only if the current hardware is supported.
allocator in JIT that is compatible with SELinux. This has no effect if JIT
is not enabled. . If you are enabling JIT under SELinux you may also want to add
--enable-jit-sealloc, which enables the use of an execmem allocator in JIT
that is compatible with SELinux. This has no effect if JIT is not enabled.
. If you do not want to make use of the default support for UTF-8 Unicode . If you do not want to make use of the default support for UTF-8 Unicode
character strings in the 8-bit library, UTF-16 Unicode character strings in character strings in the 8-bit library, UTF-16 Unicode character strings in
@ -883,4 +885,4 @@ The distribution should contain the files listed below.
Philip Hazel Philip Hazel
Email local part: ph10 Email local part: ph10
Email domain: cam.ac.uk Email domain: cam.ac.uk
Last updated: 12 September 2017 Last updated: 25 February 2018


@ -82,7 +82,8 @@ The following sections include descriptions of "on/off" options whose names
begin with --enable or --disable. Because of the way that <b>configure</b> begin with --enable or --disable. Because of the way that <b>configure</b>
works, --enable and --disable always come in pairs, so the complementary option works, --enable and --disable always come in pairs, so the complementary option
always exists as well, but as it specifies the default, it is not described. always exists as well, but as it specifies the default, it is not described.
Options that specify values have names that start with --with. Options that specify values have names that start with --with. At the end of a
<b>configure</b> run, a summary of the configuration is output.
</P> </P>
<br><a name="SEC3" href="#TOC1">BUILDING 8-BIT, 16-BIT AND 32-BIT LIBRARIES</a><br> <br><a name="SEC3" href="#TOC1">BUILDING 8-BIT, 16-BIT AND 32-BIT LIBRARIES</a><br>
<P> <P>
@ -170,8 +171,15 @@ Just-in-time (JIT) compiler support is included in the build by specifying
--enable-jit --enable-jit
</pre> </pre>
This support is available only for certain hardware architectures. If this This support is available only for certain hardware architectures. If this
option is set for an unsupported architecture, a building error occurs. If you option is set for an unsupported architecture, a building error occurs.
are running under SELinux you may also want to add If in doubt, use
<pre>
--enable-jit=auto
</pre>
which enables JIT only if the current hardware is supported. You can check
if JIT is enabled in the configuration summary that is output at the end of a
<b>configure</b> run. If you are enabling JIT under SELinux you may also want to
add
<pre> <pre>
--enable-jit-sealloc --enable-jit-sealloc
</pre> </pre>
@ -565,9 +573,9 @@ Cambridge, England.
</P> </P>
<br><a name="SEC25" href="#TOC1">REVISION</a><br> <br><a name="SEC25" href="#TOC1">REVISION</a><br>
<P> <P>
Last updated: 18 July 2017 Last updated: 25 February 2018
<br> <br>
Copyright &copy; 1997-2017 University of Cambridge. Copyright &copy; 1997-2018 University of Cambridge.
<br> <br>
<p> <p>
Return to the <a href="index.html">PCRE2 index page</a>. Return to the <a href="index.html">PCRE2 index page</a>.
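
The JIT options described in the hunks above combine in a small number of ways. The following is a minimal Python sketch of those combinations. The helper name `configure_command` and its parameters are hypothetical and exist only for illustration; the only things taken from the documentation are the three flags --enable-jit, --enable-jit=auto and --enable-jit-sealloc, and the statement that the SELinux allocator has no effect when JIT is not enabled.

```python
from typing import List, Optional


def configure_command(jit: Optional[str] = None, selinux: bool = False) -> List[str]:
    """Build a "configure" argument list for the JIT options (illustrative only).

    jit:     None leaves JIT at its default (off), "yes" forces it on,
             "auto" enables it only if the hardware is supported.
    selinux: request the SELinux-compatible execmem allocator.
    """
    cmd = ["./configure"]
    if jit == "yes":
        cmd.append("--enable-jit")
    elif jit == "auto":
        cmd.append("--enable-jit=auto")
    elif jit is not None:
        raise ValueError("jit must be None, 'yes' or 'auto'")

    # The documentation says the allocator option has no effect without JIT,
    # so this sketch omits it in that case.
    if selinux and jit is not None:
        cmd.append("--enable-jit-sealloc")
    return cmd


print(" ".join(configure_command("auto", selinux=True)))
# prints: ./configure --enable-jit=auto --enable-jit-sealloc
```

Whichever form is used, the configuration summary printed at the end of the configure run is the place to confirm whether JIT was actually enabled.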


@ -17,17 +17,18 @@ please consult the man page, in case the conversion went wrong.
<li><a name="TOC2" href="#SEC2">DESCRIPTION</a> <li><a name="TOC2" href="#SEC2">DESCRIPTION</a>
<li><a name="TOC3" href="#SEC3">SUPPORT FOR COMPRESSED FILES</a> <li><a name="TOC3" href="#SEC3">SUPPORT FOR COMPRESSED FILES</a>
<li><a name="TOC4" href="#SEC4">BINARY FILES</a> <li><a name="TOC4" href="#SEC4">BINARY FILES</a>
<li><a name="TOC5" href="#SEC5">OPTIONS</a> <li><a name="TOC5" href="#SEC5">BINARY ZEROS IN PATTERNS</a>
<li><a name="TOC6" href="#SEC6">ENVIRONMENT VARIABLES</a> <li><a name="TOC6" href="#SEC6">OPTIONS</a>
<li><a name="TOC7" href="#SEC7">NEWLINES</a> <li><a name="TOC7" href="#SEC7">ENVIRONMENT VARIABLES</a>
<li><a name="TOC8" href="#SEC8">OPTIONS COMPATIBILITY</a> <li><a name="TOC8" href="#SEC8">NEWLINES</a>
<li><a name="TOC9" href="#SEC9">OPTIONS WITH DATA</a> <li><a name="TOC9" href="#SEC9">OPTIONS COMPATIBILITY</a>
<li><a name="TOC10" href="#SEC10">USING PCRE2'S CALLOUT FACILITY</a> <li><a name="TOC10" href="#SEC10">OPTIONS WITH DATA</a>
<li><a name="TOC11" href="#SEC11">MATCHING ERRORS</a> <li><a name="TOC11" href="#SEC11">USING PCRE2'S CALLOUT FACILITY</a>
<li><a name="TOC12" href="#SEC12">DIAGNOSTICS</a> <li><a name="TOC12" href="#SEC12">MATCHING ERRORS</a>
<li><a name="TOC13" href="#SEC13">SEE ALSO</a> <li><a name="TOC13" href="#SEC13">DIAGNOSTICS</a>
<li><a name="TOC14" href="#SEC14">AUTHOR</a> <li><a name="TOC14" href="#SEC14">SEE ALSO</a>
<li><a name="TOC15" href="#SEC15">REVISION</a> <li><a name="TOC15" href="#SEC15">AUTHOR</a>
<li><a name="TOC16" href="#SEC16">REVISION</a>
</ul> </ul>
<br><a name="SEC1" href="#TOC1">SYNOPSIS</a><br> <br><a name="SEC1" href="#TOC1">SYNOPSIS</a><br>
<P> <P>
@ -150,7 +151,13 @@ specified as "nul", that is, the line terminator is a binary zero, the test for
a binary file is not applied. See the <b>--binary-files</b> option for a means a binary file is not applied. See the <b>--binary-files</b> option for a means
of changing the way binary files are handled. of changing the way binary files are handled.
</P> </P>
<br><a name="SEC5" href="#TOC1">OPTIONS</a><br> <br><a name="SEC5" href="#TOC1">BINARY ZEROS IN PATTERNS</a><br>
<P>
Patterns passed from the command line are strings that are terminated by a
binary zero, so cannot contain internal zeros. However, patterns that are read
from a file via the <b>-f</b> option may contain binary zeros.
</P>
<br><a name="SEC6" href="#TOC1">OPTIONS</a><br>
<P> <P>
The order in which some of the options appear can affect the output. For The order in which some of the options appear can affect the output. For
example, both the <b>-H</b> and <b>-l</b> options affect the printing of file example, both the <b>-H</b> and <b>-l</b> options affect the printing of file
@ -355,12 +362,15 @@ files; it does not apply to patterns specified by any of the <b>--include</b> or
<P> <P>
<b>-f</b> <i>filename</i>, <b>--file=</b><i>filename</i> <b>-f</b> <i>filename</i>, <b>--file=</b><i>filename</i>
Read patterns from the file, one per line, and match them against each line of Read patterns from the file, one per line, and match them against each line of
input. What constitutes a newline when reading the file is the operating input. As is the case with patterns on the command line, no delimiters should
system's default. The <b>--newline</b> option has no effect on this option. be used. What constitutes a newline when reading the file is the operating
Trailing white space is removed from each line, and blank lines are ignored. An system's default interpretation of \n. The <b>--newline</b> option has no
empty file contains no patterns and therefore matches nothing. See also the effect on this option. Trailing white space is removed from each line, and
comments about multiple patterns versus a single pattern with alternatives in blank lines are ignored. An empty file contains no patterns and therefore
the description of <b>-e</b> above. matches nothing. Patterns read from a file in this way may contain binary
zeros, which are treated as ordinary data characters. See also the comments
about multiple patterns versus a single pattern with alternatives in the
description of <b>-e</b> above.
<br> <br>
<br> <br>
If this option is given more than once, all the specified files are read. A If this option is given more than once, all the specified files are read. A
@ -373,14 +383,15 @@ command line; all arguments are treated as the names of paths to be searched.
<P> <P>
<b>--file-list</b>=<i>filename</i> <b>--file-list</b>=<i>filename</i>
Read a list of files and/or directories that are to be scanned from the given Read a list of files and/or directories that are to be scanned from the given
file, one per line. Trailing white space is removed from each line, and blank file, one per line. What constitutes a newline when reading the file is the
lines are ignored. These paths are processed before any that are listed on the operating system's default. Trailing white space is removed from each line, and
command line. The file name can be given as "-" to refer to the standard input. blank lines are ignored. These paths are processed before any that are listed
If <b>--file</b> and <b>--file-list</b> are both specified as "-", patterns are on the command line. The file name can be given as "-" to refer to the standard
read first. This is useful only when the standard input is a terminal, from input. If <b>--file</b> and <b>--file-list</b> are both specified as "-",
which further lines (the list of files) can be read after an end-of-file patterns are read first. This is useful only when the standard input is a
indication. If this option is given more than once, all the specified files are terminal, from which further lines (the list of files) can be read after an
read. end-of-file indication. If this option is given more than once, all the
specified files are read.
</P> </P>
<P> <P>
<b>--file-offsets</b> <b>--file-offsets</b>
@ -764,27 +775,28 @@ pattern and ")$" at the end. This option applies only to the patterns that are
matched against the contents of files; it does not apply to patterns specified matched against the contents of files; it does not apply to patterns specified
by any of the <b>--include</b> or <b>--exclude</b> options. by any of the <b>--include</b> or <b>--exclude</b> options.
</P> </P>
<br><a name="SEC6" href="#TOC1">ENVIRONMENT VARIABLES</a><br> <br><a name="SEC7" href="#TOC1">ENVIRONMENT VARIABLES</a><br>
<P> <P>
The environment variables <b>LC_ALL</b> and <b>LC_CTYPE</b> are examined, in that The environment variables <b>LC_ALL</b> and <b>LC_CTYPE</b> are examined, in that
order, for a locale. The first one that is set is used. This can be overridden order, for a locale. The first one that is set is used. This can be overridden
by the <b>--locale</b> option. If no locale is set, the PCRE2 library's default by the <b>--locale</b> option. If no locale is set, the PCRE2 library's default
(usually the "C" locale) is used. (usually the "C" locale) is used.
</P> </P>
<br><a name="SEC7" href="#TOC1">NEWLINES</a><br> <br><a name="SEC8" href="#TOC1">NEWLINES</a><br>
<P> <P>
The <b>-N</b> (<b>--newline</b>) option allows <b>pcre2grep</b> to scan files with The <b>-N</b> (<b>--newline</b>) option allows <b>pcre2grep</b> to scan files with
different newline conventions from the default. Any parts of the input files different newline conventions from the default. Any parts of the input files
that are written to the standard output are copied identically, with whatever that are written to the standard output are copied identically, with whatever
newline sequences they have in the input. However, the setting of this option newline sequences they have in the input. However, the setting of this option
does not affect the interpretation of files specified by the <b>-f</b>, affects only the way scanned files are processed. It does not affect the
<b>--exclude-from</b>, or <b>--include-from</b> options, which are assumed to use interpretation of files specified by the <b>-f</b>, <b>--file-list</b>,
the operating system's standard newline sequence, nor does it affect the way in <b>--exclude-from</b>, or <b>--include-from</b> options, nor does it affect the
which <b>pcre2grep</b> writes informational messages to the standard error and way in which <b>pcre2grep</b> writes informational messages to the standard
output streams. For these it uses the string "\n" to indicate newlines, error and output streams. For these it uses the string "\n" to indicate
relying on the C I/O library to convert this to an appropriate sequence. newlines, relying on the C I/O library to convert this to an appropriate
sequence.
</P> </P>
<br><a name="SEC8" href="#TOC1">OPTIONS COMPATIBILITY</a><br> <br><a name="SEC9" href="#TOC1">OPTIONS COMPATIBILITY</a><br>
<P> <P>
Many of the short and long forms of <b>pcre2grep</b>'s options are the same Many of the short and long forms of <b>pcre2grep</b>'s options are the same
as in the GNU <b>grep</b> program. Any long option of the form as in the GNU <b>grep</b> program. Any long option of the form
@ -804,7 +816,7 @@ for GNU <b>grep</b>, but a regular expression for <b>pcre2grep</b>. If both the
<b>-c</b> and <b>-l</b> options are given, GNU grep lists only file names, <b>-c</b> and <b>-l</b> options are given, GNU grep lists only file names,
without counts, but <b>pcre2grep</b> gives the counts as well. without counts, but <b>pcre2grep</b> gives the counts as well.
</P> </P>
<br><a name="SEC9" href="#TOC1">OPTIONS WITH DATA</a><br> <br><a name="SEC10" href="#TOC1">OPTIONS WITH DATA</a><br>
<P> <P>
There are four different ways in which an option with data can be specified. There are four different ways in which an option with data can be specified.
If a short form option is used, the data may follow immediately, or (with one If a short form option is used, the data may follow immediately, or (with one
@ -836,7 +848,7 @@ The exceptions to the above are the <b>--colour</b> (or <b>--color</b>) and
options does have data, it must be given in the first form, using an equals options does have data, it must be given in the first form, using an equals
character. Otherwise <b>pcre2grep</b> will assume that it has no data. character. Otherwise <b>pcre2grep</b> will assume that it has no data.
</P> </P>
<br><a name="SEC10" href="#TOC1">USING PCRE2'S CALLOUT FACILITY</a><br> <br><a name="SEC11" href="#TOC1">USING PCRE2'S CALLOUT FACILITY</a><br>
<P> <P>
<b>pcre2grep</b> has, by default, support for calling external programs or <b>pcre2grep</b> has, by default, support for calling external programs or
scripts or echoing specific strings during matching by making use of PCRE2's scripts or echoing specific strings during matching by making use of PCRE2's
@ -906,7 +918,7 @@ Matching continues normally after the string is output. If you want to see only
the callout output but not any output from an actual match, you should end the the callout output but not any output from an actual match, you should end the
relevant pattern with (*FAIL). relevant pattern with (*FAIL).
</P> </P>
<br><a name="SEC11" href="#TOC1">MATCHING ERRORS</a><br> <br><a name="SEC12" href="#TOC1">MATCHING ERRORS</a><br>
<P> <P>
It is possible to supply a regular expression that takes a very long time to It is possible to supply a regular expression that takes a very long time to
fail to match certain lines. Such patterns normally involve nested indefinite fail to match certain lines. Such patterns normally involve nested indefinite
@ -922,7 +934,7 @@ overall resource limit. There are also other limits that affect the amount of
memory used during matching; see the discussion of <b>--heap-limit</b> and memory used during matching; see the discussion of <b>--heap-limit</b> and
<b>--depth-limit</b> above. <b>--depth-limit</b> above.
</P> </P>
<br><a name="SEC12" href="#TOC1">DIAGNOSTICS</a><br> <br><a name="SEC13" href="#TOC1">DIAGNOSTICS</a><br>
<P> <P>
Exit status is 0 if any matches were found, 1 if no matches were found, and 2 Exit status is 0 if any matches were found, 1 if no matches were found, and 2
for syntax errors, overlong lines, non-existent or inaccessible files (even if for syntax errors, overlong lines, non-existent or inaccessible files (even if
@ -934,11 +946,11 @@ affect the return code.
When run under VMS, the return code is placed in the symbol PCRE2GREP_RC When run under VMS, the return code is placed in the symbol PCRE2GREP_RC
because VMS does not distinguish between exit(0) and exit(1). because VMS does not distinguish between exit(0) and exit(1).
</P> </P>
<br><a name="SEC13" href="#TOC1">SEE ALSO</a><br> <br><a name="SEC14" href="#TOC1">SEE ALSO</a><br>
<P> <P>
<b>pcre2pattern</b>(3), <b>pcre2syntax</b>(3), <b>pcre2callout</b>(3). <b>pcre2pattern</b>(3), <b>pcre2syntax</b>(3), <b>pcre2callout</b>(3).
</P> </P>
<br><a name="SEC14" href="#TOC1">AUTHOR</a><br> <br><a name="SEC15" href="#TOC1">AUTHOR</a><br>
<P> <P>
Philip Hazel Philip Hazel
<br> <br>
@ -947,11 +959,11 @@ University Computing Service
Cambridge, England. Cambridge, England.
<br> <br>
</P> </P>
<br><a name="SEC15" href="#TOC1">REVISION</a><br> <br><a name="SEC16" href="#TOC1">REVISION</a><br>
<P> <P>
Last updated: 13 November 2017 Last updated: 24 February 2018
<br> <br>
Copyright &copy; 1997-2017 University of Cambridge. Copyright &copy; 1997-2018 University of Cambridge.
<br> <br>
<p> <p>
Return to the <a href="index.html">PCRE2 index page</a>. Return to the <a href="index.html">PCRE2 index page</a>.
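
The new BINARY ZEROS IN PATTERNS section in the diff above rests on how C programs receive their arguments. The Python sketch below illustrates the general mechanism, not pcre2grep's own code: a string handed over as a zero-terminated C string is cut at the first binary zero, whereas bytes read from a file keep their full length. The helper names `as_c_string` and `read_pattern_bytes` and the file name `patterns.txt` are assumptions made for this example.

```python
import ctypes
import os
import tempfile


def as_c_string(data: bytes) -> bytes:
    """Return what a C program sees when data arrives as a zero-terminated string."""
    return ctypes.c_char_p(data).value


def read_pattern_bytes(path: str) -> bytes:
    """Read a file in binary mode, so embedded binary zeros are preserved."""
    with open(path, "rb") as f:
        return f.read()


pattern = b"abc\x00def"

# As a command-line style C string, everything after the zero is lost.
print(as_c_string(pattern))

# Read from a file, the same bytes survive intact.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "patterns.txt")
    with open(path, "wb") as f:
        f.write(pattern)
    print(read_pattern_bytes(path) == pattern)
```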


@ -3526,7 +3526,8 @@ PCRE2 BUILD-TIME OPTIONS
ure works, --enable and --disable always come in pairs, so the comple- ure works, --enable and --disable always come in pairs, so the comple-
mentary option always exists as well, but as it specifies the default, mentary option always exists as well, but as it specifies the default,
it is not described. Options that specify values have names that start it is not described. Options that specify values have names that start
with --with. with --with. At the end of a configure run, a summary of the configura-
tion is output.
BUILDING 8-BIT, 16-BIT AND 32-BIT LIBRARIES BUILDING 8-BIT, 16-BIT AND 32-BIT LIBRARIES
@ -3617,7 +3618,14 @@ JUST-IN-TIME COMPILER SUPPORT
This support is available only for certain hardware architectures. If This support is available only for certain hardware architectures. If
this option is set for an unsupported architecture, a building error this option is set for an unsupported architecture, a building error
occurs. If you are running under SELinux you may also want to add occurs. If in doubt, use
--enable-jit=auto
which enables JIT only if the current hardware is supported. You can
check if JIT is enabled in the configuration summary that is output at
the end of a configure run. If you are enabling JIT under SELinux you
may also want to add
--enable-jit-sealloc --enable-jit-sealloc
@ -4020,8 +4028,8 @@ AUTHOR
REVISION REVISION
Last updated: 18 July 2017 Last updated: 25 February 2018
Copyright (c) 1997-2017 University of Cambridge. Copyright (c) 1997-2018 University of Cambridge.
------------------------------------------------------------------------------ ------------------------------------------------------------------------------


@ -122,6 +122,13 @@ BINARY FILES
handled. handled.
BINARY ZEROS IN PATTERNS
Patterns passed from the command line are strings that are terminated
by a binary zero, so cannot contain internal zeros. However, patterns
that are read from a file via the -f option may contain binary zeros.
OPTIONS OPTIONS
The order in which some of the options appear can affect the output. The order in which some of the options appear can affect the output.
@ -329,14 +336,17 @@ OPTIONS
-f filename, --file=filename -f filename, --file=filename
Read patterns from the file, one per line, and match them Read patterns from the file, one per line, and match them
against each line of input. What constitutes a newline when against each line of input. As is the case with patterns on
reading the file is the operating system's default. The the command line, no delimiters should be used. What consti-
--newline option has no effect on this option. Trailing tutes a newline when reading the file is the operating sys-
white space is removed from each line, and blank lines are tem's default interpretation of \n. The --newline option has
ignored. An empty file contains no patterns and therefore no effect on this option. Trailing white space is removed
matches nothing. See also the comments about multiple pat- from each line, and blank lines are ignored. An empty file
terns versus a single pattern with alternatives in the contains no patterns and therefore matches nothing. Patterns
description of -e above. read from a file in this way may contain binary zeros, which
are treated as ordinary data characters. See also the com-
ments about multiple patterns versus a single pattern with
alternatives in the description of -e above.
If this option is given more than once, all the specified If this option is given more than once, all the specified
files are read. A data line is output if any of the patterns files are read. A data line is output if any of the patterns
@ -349,16 +359,17 @@ OPTIONS
--file-list=filename --file-list=filename
Read a list of files and/or directories that are to be Read a list of files and/or directories that are to be
scanned from the given file, one per line. Trailing white scanned from the given file, one per line. What constitutes a
space is removed from each line, and blank lines are ignored. newline when reading the file is the operating system's
These paths are processed before any that are listed on the default. Trailing white space is removed from each line, and
command line. The file name can be given as "-" to refer to blank lines are ignored. These paths are processed before any
the standard input. If --file and --file-list are both spec- that are listed on the command line. The file name can be
ified as "-", patterns are read first. This is useful only given as "-" to refer to the standard input. If --file and
when the standard input is a terminal, from which further --file-list are both specified as "-", patterns are read
lines (the list of files) can be read after an end-of-file first. This is useful only when the standard input is a ter-
indication. If this option is given more than once, all the minal, from which further lines (the list of files) can be
specified files are read. read after an end-of-file indication. If this option is given
more than once, all the specified files are read.
--file-offsets --file-offsets
Instead of showing lines or parts of lines that match, show Instead of showing lines or parts of lines that match, show
@ -758,13 +769,13 @@ NEWLINES
newline conventions from the default. Any parts of the input files that newline conventions from the default. Any parts of the input files that
are written to the standard output are copied identically, with what- are written to the standard output are copied identically, with what-
ever newline sequences they have in the input. However, the setting of ever newline sequences they have in the input. However, the setting of
this option does not affect the interpretation of files specified by this option affects only the way scanned files are processed. It does
the -f, --exclude-from, or --include-from options, which are assumed to not affect the interpretation of files specified by the -f, --file-
use the operating system's standard newline sequence, nor does it list, --exclude-from, or --include-from options, nor does it affect the
affect the way in which pcre2grep writes informational messages to the way in which pcre2grep writes informational messages to the standard
standard error and output streams. For these it uses the string "\n" to error and output streams. For these it uses the string "\n" to indicate
indicate newlines, relying on the C I/O library to convert this to an newlines, relying on the C I/O library to convert this to an appropri-
appropriate sequence. ate sequence.
OPTIONS COMPATIBILITY OPTIONS COMPATIBILITY
@ -929,5 +940,5 @@ AUTHOR
REVISION REVISION
Last updated: 13 November 2017 Last updated: 24 February 2018
Copyright (c) 1997-2017 University of Cambridge. Copyright (c) 1997-2018 University of Cambridge.
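
The rules that the revised -f description gives for a pattern file (one pattern per line, trailing white space removed, blank lines ignored, binary zeros kept as ordinary data) can be sketched in Python as follows. This is an illustration of the documented behaviour, not pcre2grep's implementation. The function name `read_patterns` is hypothetical, and the sketch assumes that lines end with "\n" and that "white space" means space, tab, carriage return, vertical tab and form feed.

```python
from typing import List


def read_patterns(data: bytes) -> List[bytes]:
    """Split the contents of a pattern file into patterns (illustrative only)."""
    patterns = []
    for line in data.split(b"\n"):
        # Remove trailing white space only; leading white space and any
        # binary zeros are part of the pattern.
        line = line.rstrip(b" \t\r\v\f")
        if line:  # blank lines are ignored
            patterns.append(line)
    return patterns


print(read_patterns(b"foo  \n\n a\x00b\t\nbar"))
# prints: [b'foo', b' a\x00b', b'bar']

print(read_patterns(b""))
# prints: []  (an empty file contains no patterns)
```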