diff --git a/doc/html/README.txt b/doc/html/README.txt
index 52859a9..66b756b 100644
--- a/doc/html/README.txt
+++ b/doc/html/README.txt
@@ -171,10 +171,12 @@ library. They are also documented in the pcre2build man page.
 give large performance improvements on certain platforms, add --enable-jit to
 the "configure" command. This support is available only for certain hardware
 architectures. If you try to enable it on an unsupported architecture, there
- will be a compile time error. If you are running under SELinux you may also
- want to add --enable-jit-sealloc, which enables the use of an execmem
- allocator in JIT that is compatible with SELinux. This has no effect if JIT
- is not enabled.
+ will be a compile time error. If in doubt, use --enable-jit=auto, which
+ enables JIT only if the current hardware is supported.
+
+. If you are enabling JIT under SELinux you may also want to add
+ --enable-jit-sealloc, which enables the use of an execmem allocator in JIT
+ that is compatible with SELinux. This has no effect if JIT is not enabled.

 . If you do not want to make use of the default support for UTF-8 Unicode
 character strings in the 8-bit library, UTF-16 Unicode character strings in
@@ -883,4 +885,4 @@ The distribution should contain the files listed below.
 Philip Hazel
 Email local part: ph10
 Email domain: cam.ac.uk
-Last updated: 12 September 2017
+Last updated: 25 February 2018
diff --git a/doc/html/pcre2build.html b/doc/html/pcre2build.html
index 823e605..edf24e8 100644
--- a/doc/html/pcre2build.html
+++ b/doc/html/pcre2build.html
@@ -82,7 +82,8 @@ The following sections include descriptions of "on/off" options whose names
 begin with --enable or --disable. Because of the way that configure works,
 --enable and --disable always come in pairs, so the complementary option always
 exists as well, but as it specifies the default, it is not described.
-Options that specify values have names that start with --with.
+Options that specify values have names that start with --with. At the end of a
+configure run, a summary of the configuration is output.


BUILDING 8-BIT, 16-BIT AND 32-BIT LIBRARIES

@@ -170,8 +171,15 @@ Just-in-time (JIT) compiler support is included in the build by specifying
   --enable-jit
 This support is available only for certain hardware architectures. If this
-option is set for an unsupported architecture, a building error occurs. If you
-are running under SELinux you may also want to add
+option is set for an unsupported architecture, a building error occurs.
+If in doubt, use
+

+  --enable-jit=auto
+
+which enables JIT only if the current hardware is supported. You can check
+if JIT is enabled in the configuration summary that is output at the end of a
+configure run. If you are enabling JIT under SELinux you may also want to
+add
   --enable-jit-sealloc
 
@@ -565,9 +573,9 @@ Cambridge, England.


REVISION

-Last updated: 18 July 2017
+Last updated: 25 February 2018
-Copyright © 1997-2017 University of Cambridge.
+Copyright © 1997-2018 University of Cambridge.

 Return to the PCRE2 index page.
diff --git a/doc/html/pcre2grep.html b/doc/html/pcre2grep.html
index 625a467..1ef8f42 100644
--- a/doc/html/pcre2grep.html
+++ b/doc/html/pcre2grep.html
@@ -17,17 +17,18 @@ please consult the man page, in case the conversion went wrong.

  • DESCRIPTION
  • SUPPORT FOR COMPRESSED FILES
  • BINARY FILES
- • OPTIONS
- • ENVIRONMENT VARIABLES
- • NEWLINES
- • OPTIONS COMPATIBILITY
- • OPTIONS WITH DATA
- • USING PCRE2'S CALLOUT FACILITY
- • MATCHING ERRORS
- • DIAGNOSTICS
- • SEE ALSO
- • AUTHOR
- • REVISION
+ • BINARY ZEROS IN PATTERNS
+ • OPTIONS
+ • ENVIRONMENT VARIABLES
+ • NEWLINES
+ • OPTIONS COMPATIBILITY
+ • OPTIONS WITH DATA
+ • USING PCRE2'S CALLOUT FACILITY
+ • MATCHING ERRORS
+ • DIAGNOSTICS
+ • SEE ALSO
+ • AUTHOR
+ • REVISION
    SYNOPSIS

    @@ -150,7 +151,13 @@ specified as "nul", that is, the line terminator is a binary zero, the test for a binary file is not applied. See the --binary-files option for a means of changing the way binary files are handled.

-OPTIONS
+BINARY ZEROS IN PATTERNS
+
+Patterns passed from the command line are strings that are terminated by a
+binary zero, so cannot contain internal zeros. However, patterns that are read
+from a file via the -f option may contain binary zeros.
+
+OPTIONS

    The order in which some of the options appear can affect the output. For example, both the -H and -l options affect the printing of file @@ -355,12 +362,15 @@ files; it does not apply to patterns specified by any of the --include or

 -f filename, --file=filename
 Read patterns from the file, one per line, and match them against each line of
-input. What constitutes a newline when reading the file is the operating
-system's default. The --newline option has no effect on this option.
-Trailing white space is removed from each line, and blank lines are ignored. An
-empty file contains no patterns and therefore matches nothing. See also the
-comments about multiple patterns versus a single pattern with alternatives in
-the description of -e above.
+input. As is the case with patterns on the command line, no delimiters should
+be used. What constitutes a newline when reading the file is the operating
+system's default interpretation of \n. The --newline option has no
+effect on this option. Trailing white space is removed from each line, and
+blank lines are ignored. An empty file contains no patterns and therefore
+matches nothing. Patterns read from a file in this way may contain binary
+zeros, which are treated as ordinary data characters. See also the comments
+about multiple patterns versus a single pattern with alternatives in the
+description of -e above.

    If this option is given more than once, all the specified files are read. A @@ -373,14 +383,15 @@ command line; all arguments are treated as the names of paths to be searched.

 --file-list=filename
 Read a list of files and/or directories that are to be scanned from the given
-file, one per line. Trailing white space is removed from each line, and blank
-lines are ignored. These paths are processed before any that are listed on the
-command line. The file name can be given as "-" to refer to the standard input.
-If --file and --file-list are both specified as "-", patterns are
-read first. This is useful only when the standard input is a terminal, from
-which further lines (the list of files) can be read after an end-of-file
-indication. If this option is given more than once, all the specified files are
-read.
+file, one per line. What constitutes a newline when reading the file is the
+operating system's default. Trailing white space is removed from each line, and
+blank lines are ignored. These paths are processed before any that are listed
+on the command line. The file name can be given as "-" to refer to the standard
+input. If --file and --file-list are both specified as "-",
+patterns are read first. This is useful only when the standard input is a
+terminal, from which further lines (the list of files) can be read after an
+end-of-file indication. If this option is given more than once, all the
+specified files are read.

    --file-offsets @@ -764,27 +775,28 @@ pattern and ")$" at the end. This option applies only to the patterns that are matched against the contents of files; it does not apply to patterns specified by any of the --include or --exclude options.

-ENVIRONMENT VARIABLES
+ENVIRONMENT VARIABLES

    The environment variables LC_ALL and LC_CTYPE are examined, in that order, for a locale. The first one that is set is used. This can be overridden by the --locale option. If no locale is set, the PCRE2 library's default (usually the "C" locale) is used.

-NEWLINES
+NEWLINES

 The -N (--newline) option allows pcre2grep to scan files with different
 newline conventions from the default. Any parts of the input files that are
 written to the standard output are copied identically, with whatever newline
 sequences they have in the input. However, the setting of this option
-does not affect the interpretation of files specified by the -f,
---exclude-from, or --include-from options, which are assumed to use
-the operating system's standard newline sequence, nor does it affect the way in
-which pcre2grep writes informational messages to the standard error and
-output streams. For these it uses the string "\n" to indicate newlines,
-relying on the C I/O library to convert this to an appropriate sequence.
+affects only the way scanned files are processed. It does not affect the
+interpretation of files specified by the -f, --file-list,
+--exclude-from, or --include-from options, nor does it affect the
+way in which pcre2grep writes informational messages to the standard
+error and output streams. For these it uses the string "\n" to indicate
+newlines, relying on the C I/O library to convert this to an appropriate
+sequence.

-OPTIONS COMPATIBILITY
+OPTIONS COMPATIBILITY

    Many of the short and long forms of pcre2grep's options are the same as in the GNU grep program. Any long option of the form @@ -804,7 +816,7 @@ for GNU grep, but a regular expression for pcre2grep. If both the -c and -l options are given, GNU grep lists only file names, without counts, but pcre2grep gives the counts as well.

-OPTIONS WITH DATA
+OPTIONS WITH DATA

    There are four different ways in which an option with data can be specified. If a short form option is used, the data may follow immediately, or (with one @@ -836,7 +848,7 @@ The exceptions to the above are the --colour (or --color) and options does have data, it must be given in the first form, using an equals character. Otherwise pcre2grep will assume that it has no data.

-USING PCRE2'S CALLOUT FACILITY
+USING PCRE2'S CALLOUT FACILITY

    pcre2grep has, by default, support for calling external programs or scripts or echoing specific strings during matching by making use of PCRE2's @@ -906,7 +918,7 @@ Matching continues normally after the string is output. If you want to see only the callout output but not any output from an actual match, you should end the relevant pattern with (*FAIL).

-MATCHING ERRORS
+MATCHING ERRORS

    It is possible to supply a regular expression that takes a very long time to fail to match certain lines. Such patterns normally involve nested indefinite @@ -922,7 +934,7 @@ overall resource limit. There are also other limits that affect the amount of memory used during matching; see the discussion of --heap-limit and --depth-limit above.

-DIAGNOSTICS
+DIAGNOSTICS

    Exit status is 0 if any matches were found, 1 if no matches were found, and 2 for syntax errors, overlong lines, non-existent or inaccessible files (even if @@ -934,11 +946,11 @@ affect the return code. When run under VMS, the return code is placed in the symbol PCRE2GREP_RC because VMS does not distinguish between exit(0) and exit(1).

-SEE ALSO
+SEE ALSO

    pcre2pattern(3), pcre2syntax(3), pcre2callout(3).

-AUTHOR
+AUTHOR

    Philip Hazel
    @@ -947,11 +959,11 @@ University Computing Service Cambridge, England.

-REVISION
+REVISION

-Last updated: 13 November 2017
+Last updated: 24 February 2018
-Copyright © 1997-2017 University of Cambridge.
+Copyright © 1997-2018 University of Cambridge.

    Return to the PCRE2 index page. diff --git a/doc/pcre2.txt b/doc/pcre2.txt index 79d94e3..3b5761c 100644 --- a/doc/pcre2.txt +++ b/doc/pcre2.txt @@ -170,8 +170,8 @@ REVISION Last updated: 01 April 2017 Copyright (c) 1997-2017 University of Cambridge. ------------------------------------------------------------------------------ - - + + PCRE2API(3) Library Functions Manual PCRE2API(3) @@ -3477,8 +3477,8 @@ REVISION Last updated: 31 December 2017 Copyright (c) 1997-2017 University of Cambridge. ------------------------------------------------------------------------------ - - + + PCRE2BUILD(3) Library Functions Manual PCRE2BUILD(3) @@ -3526,18 +3526,19 @@ PCRE2 BUILD-TIME OPTIONS ure works, --enable and --disable always come in pairs, so the comple- mentary option always exists as well, but as it specifies the default, it is not described. Options that specify values have names that start - with --with. + with --with. At the end of a configure run, a summary of the configura- + tion is output. BUILDING 8-BIT, 16-BIT AND 32-BIT LIBRARIES - By default, a library called libpcre2-8 is built, containing functions - that take string arguments contained in arrays of bytes, interpreted - either as single-byte characters, or UTF-8 strings. You can also build - two other libraries, called libpcre2-16 and libpcre2-32, which process - strings that are contained in arrays of 16-bit and 32-bit code units, + By default, a library called libpcre2-8 is built, containing functions + that take string arguments contained in arrays of bytes, interpreted + either as single-byte characters, or UTF-8 strings. You can also build + two other libraries, called libpcre2-16 and libpcre2-32, which process + strings that are contained in arrays of 16-bit and 32-bit code units, respectively. These can be interpreted either as single-unit characters - or UTF-16/UTF-32 strings. To build these additional libraries, add one + or UTF-16/UTF-32 strings. 
To build these additional libraries, add one or both of the following to the configure command: --enable-pcre2-16 @@ -3547,16 +3548,16 @@ BUILDING 8-BIT, 16-BIT AND 32-BIT LIBRARIES --disable-pcre2-8 - as well. At least one of the three libraries must be built. Note that - the POSIX wrapper is for the 8-bit library only, and that pcre2grep is - an 8-bit program. Neither of these are built if you select only the + as well. At least one of the three libraries must be built. Note that + the POSIX wrapper is for the 8-bit library only, and that pcre2grep is + an 8-bit program. Neither of these are built if you select only the 16-bit or 32-bit libraries. BUILDING SHARED AND STATIC LIBRARIES - The Autotools PCRE2 building process uses libtool to build both shared - and static libraries by default. You can suppress an unwanted library + The Autotools PCRE2 building process uses libtool to build both shared + and static libraries by default. You can suppress an unwanted library by adding one of --disable-shared @@ -3567,40 +3568,40 @@ BUILDING SHARED AND STATIC LIBRARIES UNICODE AND UTF SUPPORT - By default, PCRE2 is built with support for Unicode and UTF character + By default, PCRE2 is built with support for Unicode and UTF character strings. To build it without Unicode support, add --disable-unicode - to the configure command. This setting applies to all three libraries. - It is not possible to build one library with Unicode support, and + to the configure command. This setting applies to all three libraries. + It is not possible to build one library with Unicode support, and another without, in the same configuration. - Of itself, Unicode support does not make PCRE2 treat strings as UTF-8, + Of itself, Unicode support does not make PCRE2 treat strings as UTF-8, UTF-16 or UTF-32. To do that, applications that use the library can set - the PCRE2_UTF option when they call pcre2_compile() to compile a pat- - tern. 
Alternatively, patterns may be started with (*UTF) unless the + the PCRE2_UTF option when they call pcre2_compile() to compile a pat- + tern. Alternatively, patterns may be started with (*UTF) unless the application has locked this out by setting PCRE2_NEVER_UTF. UTF support allows the libraries to process character code points up to - 0x10ffff in the strings that they handle. Unicode support also gives - access to the Unicode properties of characters, using pattern escapes + 0x10ffff in the strings that they handle. Unicode support also gives + access to the Unicode properties of characters, using pattern escapes such as \P, \p, and \X. Only the general category properties such as Lu - and Nd are supported. Details are given in the pcre2pattern documenta- + and Nd are supported. Details are given in the pcre2pattern documenta- tion. Pattern escapes such as \d and \w do not by default make use of Unicode - properties. The application can request that they do by setting the - PCRE2_UCP option. Unless the application has set PCRE2_NEVER_UCP, a + properties. The application can request that they do by setting the + PCRE2_UCP option. Unless the application has set PCRE2_NEVER_UCP, a pattern may also request this by starting with (*UCP). DISABLING THE USE OF \C The \C escape sequence, which matches a single code unit, even in a UTF - mode, can cause unpredictable behaviour because it may leave the cur- - rent matching point in the middle of a multi-code-unit character. The - application can lock it out by setting the PCRE2_NEVER_BACKSLASH_C + mode, can cause unpredictable behaviour because it may leave the cur- + rent matching point in the middle of a multi-code-unit character. The + application can lock it out by setting the PCRE2_NEVER_BACKSLASH_C option when calling pcre2_compile(). 
There is also a build-time option --enable-never-backslash-C @@ -3610,14 +3611,21 @@ DISABLING THE USE OF \C JUST-IN-TIME COMPILER SUPPORT - Just-in-time (JIT) compiler support is included in the build by speci- + Just-in-time (JIT) compiler support is included in the build by speci- fying --enable-jit - This support is available only for certain hardware architectures. If - this option is set for an unsupported architecture, a building error - occurs. If you are running under SELinux you may also want to add + This support is available only for certain hardware architectures. If + this option is set for an unsupported architecture, a building error + occurs. If in doubt, use + + --enable-jit=auto + + which enables JIT only if the current hardware is supported. You can + check if JIT is enabled in the configuration summary that is output at + the end of a configure run. If you are enabling JIT under SELinux you + may also want to add --enable-jit-sealloc @@ -4020,11 +4028,11 @@ AUTHOR REVISION - Last updated: 18 July 2017 - Copyright (c) 1997-2017 University of Cambridge. + Last updated: 25 February 2018 + Copyright (c) 1997-2018 University of Cambridge. ------------------------------------------------------------------------------ - - + + PCRE2CALLOUT(3) Library Functions Manual PCRE2CALLOUT(3) @@ -4447,8 +4455,8 @@ REVISION Last updated: 22 December 2017 Copyright (c) 1997-2017 University of Cambridge. ------------------------------------------------------------------------------ - - + + PCRE2COMPAT(3) Library Functions Manual PCRE2COMPAT(3) @@ -4645,8 +4653,8 @@ REVISION Last updated: 18 April 2017 Copyright (c) 1997-2017 University of Cambridge. ------------------------------------------------------------------------------ - - + + PCRE2JIT(3) Library Functions Manual PCRE2JIT(3) @@ -5039,8 +5047,8 @@ REVISION Last updated: 31 March 2017 Copyright (c) 1997-2017 University of Cambridge. 
------------------------------------------------------------------------------ - - + + PCRE2LIMITS(3) Library Functions Manual PCRE2LIMITS(3) @@ -5110,8 +5118,8 @@ REVISION Last updated: 30 March 2017 Copyright (c) 1997-2017 University of Cambridge. ------------------------------------------------------------------------------ - - + + PCRE2MATCHING(3) Library Functions Manual PCRE2MATCHING(3) @@ -5329,8 +5337,8 @@ REVISION Last updated: 29 September 2014 Copyright (c) 1997-2014 University of Cambridge. ------------------------------------------------------------------------------ - - + + PCRE2PARTIAL(3) Library Functions Manual PCRE2PARTIAL(3) @@ -5769,8 +5777,8 @@ REVISION Last updated: 22 December 2014 Copyright (c) 1997-2014 University of Cambridge. ------------------------------------------------------------------------------ - - + + PCRE2PATTERN(3) Library Functions Manual PCRE2PATTERN(3) @@ -8880,8 +8888,8 @@ REVISION Last updated: 12 September 2017 Copyright (c) 1997-2017 University of Cambridge. ------------------------------------------------------------------------------ - - + + PCRE2PERFORM(3) Library Functions Manual PCRE2PERFORM(3) @@ -9108,8 +9116,8 @@ REVISION Last updated: 08 April 2017 Copyright (c) 1997-2017 University of Cambridge. ------------------------------------------------------------------------------ - - + + PCRE2POSIX(3) Library Functions Manual PCRE2POSIX(3) @@ -9416,8 +9424,8 @@ REVISION Last updated: 15 June 2017 Copyright (c) 1997-2017 University of Cambridge. ------------------------------------------------------------------------------ - - + + PCRE2SAMPLE(3) Library Functions Manual PCRE2SAMPLE(3) @@ -9685,8 +9693,8 @@ REVISION Last updated: 21 March 2017 Copyright (c) 1997-2017 University of Cambridge. 
------------------------------------------------------------------------------ - - + + PCRE2SYNTAX(3) Library Functions Manual PCRE2SYNTAX(3) @@ -10133,8 +10141,8 @@ REVISION Last updated: 17 June 2017 Copyright (c) 1997-2017 University of Cambridge. ------------------------------------------------------------------------------ - - + + PCRE2UNICODE(3) Library Functions Manual PCRE2UNICODE(3) @@ -10390,5 +10398,5 @@ REVISION Last updated: 17 May 2017 Copyright (c) 1997-2017 University of Cambridge. ------------------------------------------------------------------------------ - - + + diff --git a/doc/pcre2grep.txt b/doc/pcre2grep.txt index 30517b4..6a84095 100644 --- a/doc/pcre2grep.txt +++ b/doc/pcre2grep.txt @@ -122,6 +122,13 @@ BINARY FILES handled. +BINARY ZEROS IN PATTERNS + + Patterns passed from the command line are strings that are terminated + by a binary zero, so cannot contain internal zeros. However, patterns + that are read from a file via the -f option may contain binary zeros. + + OPTIONS The order in which some of the options appear can affect the output. @@ -329,36 +336,40 @@ OPTIONS -f filename, --file=filename Read patterns from the file, one per line, and match them - against each line of input. What constitutes a newline when - reading the file is the operating system's default. The - --newline option has no effect on this option. Trailing - white space is removed from each line, and blank lines are - ignored. An empty file contains no patterns and therefore - matches nothing. See also the comments about multiple pat- - terns versus a single pattern with alternatives in the - description of -e above. + against each line of input. As is the case with patterns on + the command line, no delimiters should be used. What consti- + tutes a newline when reading the file is the operating sys- + tem's default interpretation of \n. The --newline option has + no effect on this option. 
Trailing white space is removed
+ from each line, and blank lines are ignored. An empty file
+ contains no patterns and therefore matches nothing. Patterns
+ read from a file in this way may contain binary zeros, which
+ are treated as ordinary data characters. See also the com-
+ ments about multiple patterns versus a single pattern with
+ alternatives in the description of -e above.

- If this option is given more than once, all the specified
- files are read. A data line is output if any of the patterns
- match it. A file name can be given as "-" to refer to the
- standard input. When -f is used, patterns specified on the
- command line using -e may also be present; they are tested
- before the file's patterns. However, no other pattern is
+ If this option is given more than once, all the specified
+ files are read. A data line is output if any of the patterns
+ match it. A file name can be given as "-" to refer to the
+ standard input. When -f is used, patterns specified on the
+ command line using -e may also be present; they are tested
+ before the file's patterns. However, no other pattern is
  taken from the command line; all arguments are treated as the
  names of paths to be searched.

  --file-list=filename
- Read a list of files and/or directories that are to be
- scanned from the given file, one per line. Trailing white
- space is removed from each line, and blank lines are ignored.
- These paths are processed before any that are listed on the
- command line. The file name can be given as "-" to refer to
- the standard input. If --file and --file-list are both spec-
- ified as "-", patterns are read first. This is useful only
- when the standard input is a terminal, from which further
- lines (the list of files) can be read after an end-of-file
- indication. If this option is given more than once, all the
- specified files are read.
+ Read a list of files and/or directories that are to be
+ scanned from the given file, one per line. What constitutes a
+ newline when reading the file is the operating system's
+ default. Trailing white space is removed from each line, and
+ blank lines are ignored. These paths are processed before any
+ that are listed on the command line. The file name can be
+ given as "-" to refer to the standard input. If --file and
+ --file-list are both specified as "-", patterns are read
+ first. This is useful only when the standard input is a ter-
+ minal, from which further lines (the list of files) can be
+ read after an end-of-file indication. If this option is given
+ more than once, all the specified files are read.

  --file-offsets
  Instead of showing lines or parts of lines that match, show
@@ -758,13 +769,13 @@ NEWLINES
  newline conventions from the default. Any parts of the input files that
  are written to the standard output are copied identically, with what-
  ever newline sequences they have in the input. However, the setting of
- this option does not affect the interpretation of files specified by
- the -f, --exclude-from, or --include-from options, which are assumed to
- use the operating system's standard newline sequence, nor does it
- affect the way in which pcre2grep writes informational messages to the
- standard error and output streams. For these it uses the string "\n" to
- indicate newlines, relying on the C I/O library to convert this to an
- appropriate sequence.
+ this option affects only the way scanned files are processed. It does
+ not affect the interpretation of files specified by the -f, --file-
+ list, --exclude-from, or --include-from options, nor does it affect the
+ way in which pcre2grep writes informational messages to the standard
+ error and output streams. For these it uses the string "\n" to indicate
+ newlines, relying on the C I/O library to convert this to an appropri-
+ ate sequence.


 OPTIONS COMPATIBILITY
@@ -929,5 +940,5 @@ AUTHOR

 REVISION

- Last updated: 13 November 2017
- Copyright (c) 1997-2017 University of Cambridge.
+ Last updated: 24 February 2018
+ Copyright (c) 1997-2018 University of Cambridge.
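
Reviewer note: the -f rules that this change documents (one pattern per line, trailing white space removed, blank lines ignored, binary zeros kept as ordinary data) can be illustrated with a small sketch. This is not pcre2grep's implementation; the function name read_patterns is hypothetical, the input is assumed to be the raw bytes of a pattern file, and the set of characters treated as white space is an assumption (the real program may decide this differently).

```python
from typing import List


def read_patterns(data: bytes) -> List[bytes]:
    """Illustrative only: split pattern-file bytes using the rules in the -f text."""
    patterns = []
    # Assumption: lines end with "\n"; a "\r" left over from CRLF input is
    # removed below as trailing white space.
    for line in data.split(b"\n"):
        # Remove trailing white space only. A binary zero is not white space,
        # so it stays in the pattern as an ordinary data character.
        line = line.rstrip(b" \t\r\v\f")
        if line:  # blank lines contribute no pattern
            patterns.append(line)
    return patterns


if __name__ == "__main__":
    sample = b"foo\\d+  \n\n   \nbar\x00baz\t\n"
    for pattern in read_patterns(sample):
        print(repr(pattern))
```

A pattern given on the command line cannot carry such an internal zero, because each argument is a zero-terminated string; that is the contrast the new BINARY ZEROS IN PATTERNS section draws.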