More documentation and file tidies.

2014-11-21 16:45:06 +00:00 · 2014-11-21 16:45:06 +00:00 · eb4fffbbf4
parent ba1e2e0cbb
commit eb4fffbbf4
25 changed files with 1002 additions and 987 deletions
--- a/doc/html/pcre2.html
+++ b/doc/html/pcre2.html
@ -25,9 +25,10 @@ PCRE2 is the name used for a revised API for the PCRE library, which is a set
 of functions, written in C, that implement regular expression pattern matching
 using the same syntax and semantics as Perl, with just a few differences. Some
 features that appeared in Python and the original PCRE before they appeared in
-Perl are also available using the Python syntax, there is some support for one
+Perl are also available using the Python syntax. There is also some support for
-or two .NET and Oniguruma syntax items, and there are options for requesting
+one or two .NET and Oniguruma syntax items, and there are options for
-some minor changes that give better ECMAScript (aka JavaScript) compatibility.
+requesting some minor changes that give better ECMAScript (aka JavaScript)
 compatibility.
 </P>
 <P>
 The source code for PCRE2 can be compiled to support 8-bit, 16-bit, or 32-bit
@ -36,7 +37,7 @@ The original work to extend PCRE to 16-bit and 32-bit code units was done by
 Zoltan Herczeg and Christian Persch, respectively. In all three cases, strings
 can be interpreted either as one character per code unit, or as UTF-encoded
 Unicode, with support for Unicode general category properties. Unicode support
-is optional at build time (but is the default); however, processing strings as
+is optional at build time (but is the default). However, processing strings as
 UTF code units must be enabled explicitly at run time. The version of Unicode
 in use can be discovered by running
 <pre>
@ -143,17 +144,17 @@ listing), and the short pages for individual functions, are concatenated in
  pcre2compat        discussion of Perl compatibility
  pcre2demo          a demonstration C program that uses PCRE2
  pcre2grep          description of the <b>pcre2grep</b> command (8-bit only)
-  pcre2jit           discussion of the just-in-time optimization support
+  pcre2jit           discussion of just-in-time optimization support
  pcre2limits        details of size and other limits
  pcre2matching      discussion of the two matching algorithms
  pcre2partial       details of the partial matching facility
-  pcre2pattern       syntax and semantics of supported regular expressions
+  pcre2pattern       syntax and semantics of supported regular expression patterns
  pcre2perform       discussion of performance issues
  pcre2posix         the POSIX-compatible C API for the 8-bit library
  pcre2sample        discussion of the pcre2demo program
  pcre2stack         discussion of stack usage
  pcre2syntax        quick syntax reference
-  pcre2test          description of the <b>pcre2test</b> testing command
+  pcre2test          description of the <b>pcre2test</b> command
  pcre2unicode       discussion of Unicode and UTF support
 </pre>
 In the "man" and HTML formats, there is also a short page for each C library
@ -165,7 +166,7 @@ Philip Hazel
 <br>
 University Computing Service
 <br>
-Cambridge CB2 3QH, England.
+Cambridge, England.
 <br>
 </P>
 <P>
@ -174,7 +175,7 @@ use my two initials, followed by the two digits 10, at the domain cam.ac.uk.
 </P>
 <br><a name="SEC5" href="#TOC1">REVISION</a><br>
 <P>
-Last updated: 03 November 2014
+Last updated: 18 November 2014
 <br>
 Copyright &copy; 1997-2014 University of Cambridge.
 <br>
--- a/doc/html/pcre2api.html
+++ b/doc/html/pcre2api.html
@ -37,16 +37,18 @@ please consult the man page, in case the conversion went wrong.
 <li><a name="TOC22" href="#SEC22">MATCHING A PATTERN: THE TRADITIONAL FUNCTION</a>
 <li><a name="TOC23" href="#SEC23">NEWLINE HANDLING WHEN MATCHING</a>
 <li><a name="TOC24" href="#SEC24">HOW PCRE2_MATCH() RETURNS A STRING AND CAPTURED SUBSTRINGS</a>
-<li><a name="TOC25" href="#SEC25">EXTRACTING CAPTURED SUBSTRINGS BY NUMBER</a>
+<li><a name="TOC25" href="#SEC25">OTHER INFORMATION ABOUT A MATCH</a>
-<li><a name="TOC26" href="#SEC26">EXTRACTING A LIST OF ALL CAPTURED SUBSTRINGS</a>
+<li><a name="TOC26" href="#SEC26">ERROR RETURNS FROM <b>pcre2_match()</b></a>
-<li><a name="TOC27" href="#SEC27">EXTRACTING CAPTURED SUBSTRINGS BY NAME</a>
+<li><a name="TOC27" href="#SEC27">EXTRACTING CAPTURED SUBSTRINGS BY NUMBER</a>
-<li><a name="TOC28" href="#SEC28">CREATING A NEW STRING WITH SUBSTITUTIONS</a>
+<li><a name="TOC28" href="#SEC28">EXTRACTING A LIST OF ALL CAPTURED SUBSTRINGS</a>
-<li><a name="TOC29" href="#SEC29">DUPLICATE SUBPATTERN NAMES</a>
+<li><a name="TOC29" href="#SEC29">EXTRACTING CAPTURED SUBSTRINGS BY NAME</a>
-<li><a name="TOC30" href="#SEC30">FINDING ALL POSSIBLE MATCHES</a>
+<li><a name="TOC30" href="#SEC30">CREATING A NEW STRING WITH SUBSTITUTIONS</a>
-<li><a name="TOC31" href="#SEC31">MATCHING A PATTERN: THE ALTERNATIVE FUNCTION</a>
+<li><a name="TOC31" href="#SEC31">DUPLICATE SUBPATTERN NAMES</a>
-<li><a name="TOC32" href="#SEC32">SEE ALSO</a>
+<li><a name="TOC32" href="#SEC32">FINDING ALL POSSIBLE MATCHES AT ONE POSITION</a>
-<li><a name="TOC33" href="#SEC33">AUTHOR</a>
+<li><a name="TOC33" href="#SEC33">MATCHING A PATTERN: THE ALTERNATIVE FUNCTION</a>
-<li><a name="TOC34" href="#SEC34">REVISION</a>
+<li><a name="TOC34" href="#SEC34">SEE ALSO</a>
 <li><a name="TOC35" href="#SEC35">AUTHOR</a>
 <li><a name="TOC36" href="#SEC36">REVISION</a>
 </ul>
 <P>
 <b>#include &#60;pcre2.h&#62;</b>
@ -436,13 +438,9 @@ U+000C), NEL (next line, U+0085), LS (line separator, U+2028), and PS
 <P>
 Each of the first three conventions is used by at least one operating system as
 its standard newline sequence. When PCRE2 is built, a default can be specified.
-The default default is LF, which is the Unix standard. When PCRE2 is run, the
+The default default is LF, which is the Unix standard. However, the newline
-default can be overridden, either when a pattern is compiled, or when it is
+convention can be changed by an application when calling <b>pcre2_compile()</b>,
-matched.
+or it can be specified by special text at the start of the pattern itself; this
 </P>
 <P>
 The newline convention can be changed when calling <b>pcre2_compile()</b>, or it
 can be specified by special text at the start of the pattern itself; this
 overrides any other settings. See the
 <a href="pcre2pattern.html"><b>pcre2pattern</b></a>
 page for details of the special character sequences.
@ -459,8 +457,8 @@ below.
 </P>
 <P>
 The choice of newline convention does not affect the interpretation of
-the \n or \r escape sequences, nor does it affect what \R matches, which has
+the \n or \r escape sequences, nor does it affect what \R matches; this has
-its own separate control.
+its own separate convention.
 </P>
 <br><a name="SEC13" href="#TOC1">MULTITHREADING</a><br>
 <P>
@ -472,7 +470,7 @@ time ensuring that multithreaded applications can use it.
 </P>
 <P>
 There are several different blocks of data that are used to pass information
-between the application and the PCRE libraries.
+between the application and the PCRE2 libraries.
 </P>
 <P>
 (1) A pointer to the compiled form of a pattern is returned to the user when
@ -572,11 +570,11 @@ The compile context
 A compile context is required if you want to change the default values of any
 of the following compile-time parameters:
 <pre>
-  What \R matches (Unicode newlines or CR, LF, CRLF only);
+  What \R matches (Unicode newlines or CR, LF, CRLF only)
-  PCRE2's character tables;
+  PCRE2's character tables
-  The newline character sequence;
+  The newline character sequence
-  The compile time nested parentheses limit;
+  The compile time nested parentheses limit
-  An external function for stack checking.
+  An external function for stack checking
 </pre>
 A compile context is also required if you are using custom memory management.
 If none of these apply, just pass NULL as the context argument of
@ -604,9 +602,8 @@ PCRE2_ERROR_BADDATA if invalid data is detected.
 <br>
 The value must be PCRE2_BSR_ANYCRLF, to specify that \R matches only CR, LF,
 or CRLF, or PCRE2_BSR_UNICODE, to specify that \R matches any Unicode line
-ending sequence. The value of this parameter does not affect what is compiled;
+ending sequence. The value is used by the JIT compiler and by the two
-it is just saved with the compiled pattern. The value is used by the JIT
+interpreted matching functions, <i>pcre2_match()</i> and
 compiler and by the two interpreted matching functions, <i>pcre2_match()</i> and
 <i>pcre2_dfa_match()</i>.
 <b>int pcre2_set_character_tables(pcre2_compile_context *<i>ccontext</i>,</b>
 <b>  const unsigned char *<i>tables</i>);</b>
@ -709,12 +706,12 @@ in the subject string. This limit is not relevant to <b>pcre2_dfa_match()</b>,
 which ignores it.
 </P>
 <P>
-When <b>pcre2_match()</b> is called with a pattern that was successfully studied
+When <b>pcre2_match()</b> is called with a pattern that was successfully
-with <b>pcre2_jit_compile()</b>, the way that the matching is executed is
+processed by <b>pcre2_jit_compile()</b>, the way in which matching is executed
-entirely different. However, there is still the possibility of runaway matching
+is entirely different. However, there is still the possibility of runaway
-that goes on for a very long time, and so the <i>match_limit</i> value is also
+matching that goes on for a very long time, and so the <i>match_limit</i> value
-used in this case (but in a different way) to limit how long the matching can
+is also used in this case (but in a different way) to limit how long the
-continue.
+matching can continue.
 </P>
 <P>
 The default value for the limit can be set when PCRE2 is built; the default
@ -770,15 +767,17 @@ stack. There is a discussion about PCRE2's stack usage in the
 <a href="pcre2stack.html"><b>pcre2stack</b></a>
 documentation. See the
 <a href="pcre2build.html"><b>pcre2build</b></a>
-documentation for details of how to build PCRE2. Using the heap for recursion
+documentation for details of how to build PCRE2.
-is a non-standard way of building PCRE2, for use in environments that have
+</P>
-limited stacks. Because of the greater use of memory management,
+<P>
-<b>pcre2_match()</b> runs more slowly. Functions that are different to the
+Using the heap for recursion is a non-standard way of building PCRE2, for use
-general custom memory functions are provided so that special-purpose external
+in environments that have limited stacks. Because of the greater use of memory
-code can be used for this case, because the memory blocks are all the same
+management, <b>pcre2_match()</b> runs more slowly. Functions that are different
-size. The blocks are retained by <b>pcre2_match()</b> until it is about to exit
+to the general custom memory functions are provided so that special-purpose
-so that they can be re-used when possible during the match. In the absence of
+external code can be used for this case, because the memory blocks are all the
-these functions, the normal custom memory management functions are used, if
+same size. The blocks are retained by <b>pcre2_match()</b> until it is about to
 exit so that they can be re-used when possible during the match. In the absence
 of these functions, the normal custom memory management functions are used, if
 supplied, otherwise the system functions.
 </P>
 <br><a name="SEC15" href="#TOC1">CHECKING BUILD-TIME OPTIONS</a><br>
@ -809,9 +808,10 @@ available:
  PCRE2_CONFIG_BSR
 </pre>
 The output is an integer whose value indicates what character sequences the \R
-escape sequence matches by default. A value of 0 means that \R matches any
+escape sequence matches by default. A value of PCRE2_BSR_UNICODE means that \R
-Unicode line ending sequence; a value of 1 means that \R matches only CR, LF,
+matches any Unicode line ending sequence; a value of PCRE2_BSR_ANYCRLF means
-or CRLF. The default can be overridden when a pattern is compiled or matched.
+that \R matches only CR, LF, or CRLF. The default can be overridden when a
 pattern is compiled.
 <pre>
  PCRE2_CONFIG_JIT
 </pre>
@ -821,7 +821,7 @@ compiling is available; otherwise it is set to zero.
  PCRE2_CONFIG_JITTARGET
 </pre>
 The <i>where</i> argument should point to a buffer that is at least 48 code
-units long. (The exact length needed can be found by calling
+units long. (The exact length required can be found by calling
 <b>pcre2_config()</b> with <b>where</b> set to NULL.) The buffer is filled with a
 string that contains the name of the architecture for which the JIT compiler is
 configured, for example "x86 32bit (little endian + unaligned)". If JIT support
@ -855,11 +855,11 @@ Further details are given with <b>pcre2_match()</b> below.
 The output is an integer whose value specifies the default character sequence
 that is recognized as meaning "newline". The values are:
 <pre>
-  1  Carriage return (CR)
+  PCRE2_NEWLINE_CR       Carriage return (CR)
-  2  Linefeed (LF)
+  PCRE2_NEWLINE_LF       Linefeed (LF)
-  3  Carriage return, linefeed (CRLF)
+  PCRE2_NEWLINE_CRLF     Carriage return, linefeed (CRLF)
-  4  Any Unicode line ending
+  PCRE2_NEWLINE_ANY      Any Unicode line ending
-  5  Any of CR, LF, or CRLF
+  PCRE2_NEWLINE_ANYCRLF  Any of CR, LF, or CRLF
 </pre>
 The default should normally correspond to the standard sequence for your
 operating system.
@ -891,7 +891,7 @@ heap instead of recursive function calls.
  PCRE2_CONFIG_UNICODE_VERSION
 </pre>
 The <i>where</i> argument should point to a buffer that is at least 24 code
-units long. (The exact length needed can be found by calling
+units long. (The exact length required can be found by calling
 <b>pcre2_config()</b> with <b>where</b> set to NULL.) If PCRE2 has been compiled
 without Unicode support, the buffer is filled with the text "Unicode not
 supported". Otherwise, the Unicode version string (for example, "7.0.0") is
@ -906,7 +906,7 @@ otherwise it is set to zero. Unicode support implies UTF support.
  PCRE2_CONFIG_VERSION
 </pre>
 The <i>where</i> argument should point to a buffer that is at least 12 code
-units long. (The exact length needed can be found by calling
+units long. (The exact length required can be found by calling
 <b>pcre2_config()</b> with <b>where</b> set to NULL.) The buffer is filled with
 the PCRE2 version string, zero-terminated. The number of code units used is
 returned. This is the length of the string plus one unit for the terminating
@ -922,17 +922,17 @@ zero.
 <b>pcre2_code_free(pcre2_code *<i>code</i>);</b>
 </P>
 <P>
-This function compiles a pattern, defined by a pointer to a string of code
+The <b>pcre2_compile()</b> function compiles a pattern into an internal form.
-units and a length, into an internal form. If the pattern is zero-terminated,
+The pattern is defined by a pointer to a string of code units and a length. If
-the length should be specified as PCRE2_ZERO_TERMINATED. The function returns a
+the pattern is zero-terminated, the length can be specified as
-pointer to a block of memory that contains the compiled pattern and related
+PCRE2_ZERO_TERMINATED. The function returns a pointer to a block of memory that
-data. The caller must free the memory by calling <b>pcre2_code_free()</b> when
+contains the compiled pattern and related data. The caller must free the memory
-it is no longer needed.
+by calling <b>pcre2_code_free()</b> when it is no longer needed.
 </P>
 <P>
-If the compile context argument <i>ccontext</i> is NULL, the memory is obtained
+If the compile context argument <i>ccontext</i> is NULL, memory for the compiled
-by calling <b>malloc()</b>. Otherwise, it is obtained from the same memory
+pattern is obtained by calling <b>malloc()</b>. Otherwise, it is obtained from
-function that was used for the compile context.
+the same memory function that was used for the compile context.
 </P>
 <P>
 The <i>options</i> argument contains various bit settings that affect the
@ -1247,7 +1247,7 @@ classify characters. More details are given in the section on
 in the
 <a href="pcre2pattern.html"><b>pcre2pattern</b></a>
 page. If you set PCRE2_UCP, matching one of the items it affects takes much
-longer. The option is available only if PCRE2 has been compiled with UTF
+longer. The option is available only if PCRE2 has been compiled with Unicode
 support.
 <pre>
  PCRE2_UNGREEDY
@ -1260,9 +1260,10 @@ with Perl. It can also be set by a (?U) option setting within the pattern.
 </pre>
 This option causes PCRE2 to regard both the pattern and the subject strings
 that are subsequently processed as strings of UTF characters instead of
-single-code-unit strings. However, it is available only when PCRE2 is built to
+single-code-unit strings. It is available when PCRE2 is built to include
-include UTF support. If not, the use of this option provokes an error. Details
+Unicode support (which is the default). If Unicode support is not available,
-of how this option changes the behaviour of PCRE2 are given in the
+the use of this option provokes an error. Details of how this option changes
 the behaviour of PCRE2 are given in the
 <a href="pcre2unicode.html"><b>pcre2unicode</b></a>
 page.
 </P>
@ -1318,13 +1319,12 @@ Most, but not all patterns can be optimized by the JIT compiler.
 <P>
 PCRE2 handles caseless matching, and determines whether characters are letters,
 digits, or whatever, by reference to a set of tables, indexed by character code
-point. When running in UTF-8 mode, or using the 16-bit or 32-bit libraries,
+point. This applies only to characters whose code points are less than 256. By
-this applies only to characters with code points less than 256. By default,
+default, higher-valued code points never match escapes such as \w or \d.
-higher-valued code points never match escapes such as \w or \d. However, if
+However, if PCRE2 is built with UTF support, all characters can be tested with
-PCRE2 is built with UTF support, all characters can be tested with \p and \P,
+\p and \P, or, alternatively, the PCRE2_UCP option can be set when a pattern
-or, alternatively, the PCRE2_UCP option can be set when a pattern is compiled;
+is compiled; this causes \w and friends to use Unicode property support
-this causes \w and friends to use Unicode property support instead of the
+instead of the built-in tables.
 built-in tables.
 </P>
 <P>
 The use of locales with Unicode is discouraged. If you are handling characters
@ -1437,9 +1437,9 @@ are no back references.
  PCRE2_INFO_BSR
 </pre>
 The output is a uint32_t whose value indicates what character sequences the \R
-escape sequence matches by default. A value of 0 means that \R matches any
+escape sequence matches. A value of PCRE2_BSR_UNICODE means that \R matches
-Unicode line ending sequence; a value of 1 means that \R matches only CR, LF,
+any Unicode line ending sequence; a value of PCRE2_BSR_ANYCRLF means that \R
-or CRLF. The default can be overridden when a pattern is matched.
+matches only CR, LF, or CRLF.
 <pre>
  PCRE2_INFO_CAPTURECOUNT
 </pre>
@ -1581,15 +1581,18 @@ values.
 <P>
 The map consists of a number of fixed-size entries. PCRE2_INFO_NAMECOUNT gives
 the number of entries, and PCRE2_INFO_NAMEENTRYSIZE gives the size of each
-entry; both of these return a <b>uint32_t</b> value. The entry size depends on
+entry in code units; both of these return a <b>uint32_t</b> value. The entry
-the length of the longest name. PCRE2_INFO_NAMETABLE returns a pointer to the
+size depends on the length of the longest name.
-first entry of the table. This is a PCRE2_SPTR pointer to a block of code
+</P>
-units. In the 8-bit library, the first two bytes of each entry are the number
+<P>
-of the capturing parenthesis, most significant byte first. In the 16-bit
+PCRE2_INFO_NAMETABLE returns a pointer to the first entry of the table. This is
-library, the pointer points to 16-bit data units, the first of which contains
+a PCRE2_SPTR pointer to a block of code units. In the 8-bit library, the first
-the parenthesis number. In the 32-bit library, the pointer points to 32-bit
+two bytes of each entry are the number of the capturing parenthesis, most
-data units, the first of which contains the parenthesis number. The rest of the
+significant byte first. In the 16-bit library, the pointer points to 16-bit
-entry is the corresponding name, zero terminated.
+code units, the first of which contains the parenthesis number. In the 32-bit
 library, the pointer points to 32-bit code units, the first of which contains
 the parenthesis number. The rest of the entry is the corresponding name, zero
 terminated.
 </P>
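The 8-bit entry layout described above can be sketched in plain C. This is only an illustration under stated assumptions: the helper names (entry_group_number, entry_name, find_group) are hypothetical and not part of PCRE2, and the table is assumed to have been obtained already, for example via PCRE2_INFO_NAMETABLE, PCRE2_INFO_NAMECOUNT, and PCRE2_INFO_NAMEENTRYSIZE.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not a PCRE2 function): the first two bytes of an
   8-bit name-table entry hold the group number, most significant byte first. */
static unsigned int entry_group_number(const uint8_t *entry)
{
    return ((unsigned int)entry[0] << 8) | (unsigned int)entry[1];
}

/* Hypothetical helper: the rest of the entry is the zero-terminated name. */
static const char *entry_name(const uint8_t *entry)
{
    return (const char *)(entry + 2);
}

/* Walk `count` fixed-size entries of `entry_size` code units each and return
   the group number recorded for `name`, or 0 if the name is not present. */
static unsigned int find_group(const uint8_t *table, uint32_t count,
                               uint32_t entry_size, const char *name)
{
    /* Two bytes of number plus at least a terminating zero. */
    assert(entry_size >= 3);
    for (uint32_t i = 0; i < count; i++) {
        const uint8_t *entry = table + (size_t)i * entry_size;
        if (strcmp(entry_name(entry), name) == 0)
            return entry_group_number(entry);
    }
    return 0;
}
```

A linear scan keeps the sketch short. Because the entries are stored in alphabetical order, a binary search would also be possible for large tables.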
 <P>
 The names are in alphabetical order. If (?| is used to create multiple groups
@ -1629,17 +1632,16 @@ different for each compiled pattern.
 <pre>
  PCRE2_INFO_NEWLINE
 </pre>
-The output is a <b>uint32_t</b> whose value specifies the default character
+The output is a <b>uint32_t</b> with one of the following values:
 sequence that will be recognized as meaning "newline" while matching. The
 values are:
 <pre>
-  1  Carriage return (CR)
+  PCRE2_NEWLINE_CR       Carriage return (CR)
-  2  Linefeed (LF)
+  PCRE2_NEWLINE_LF       Linefeed (LF)
-  3  Carriage return, linefeed (CRLF)
+  PCRE2_NEWLINE_CRLF     Carriage return, linefeed (CRLF)
-  4  Any Unicode line ending
+  PCRE2_NEWLINE_ANY      Any Unicode line ending
-  5  Any of CR, LF, or CRLF
+  PCRE2_NEWLINE_ANYCRLF  Any of CR, LF, or CRLF
 </pre>
-The default can be overridden when a pattern is matched.
+This specifies the default character sequence that will be recognized as
 meaning "newline" while matching.
 <pre>
  PCRE2_INFO_RECURSIONLIMIT
 </pre>
@ -1675,18 +1677,19 @@ Information about successful and unsuccessful matches is placed in a match
 data block, which is an opaque structure that is accessed by function calls. In
 particular, the match data block contains a vector of offsets into the subject
 string that define the matched part of the subject and any substrings that were
-capured. This is know as the <i>ovector</i>.
+captured. This is known as the <i>ovector</i>.
 </P>
 <P>
-Before calling <b>pcre2_match()</b> or <b>pcre2_dfa_match()</b> you must create a
+Before calling <b>pcre2_match()</b>, <b>pcre2_dfa_match()</b>, or
-match data block by calling one of the creation functions above. For
+<b>pcre2_jit_match()</b> you must create a match data block by calling one of
-<b>pcre2_match_data_create()</b>, the first argument is the number of pairs of
+the creation functions above. For <b>pcre2_match_data_create()</b>, the first
-offsets in the <i>ovector</i>. One pair of offsets is required to identify the
+argument is the number of pairs of offsets in the <i>ovector</i>. One pair of
-string that matched the whole pattern, with another pair for each captured
+offsets is required to identify the string that matched the whole pattern, with
-substring. For example, a value of 4 creates enough space to record the matched
+another pair for each captured substring. For example, a value of 4 creates
-portion of the subject plus three captured substrings. A minimum of at least 1
+enough space to record the matched portion of the subject plus three captured
-pair is imposed by <b>pcre2_match_data_create()</b>, so it is always possible to
+substrings. A minimum of 1 pair is imposed by
-return the overall matched string.
+<b>pcre2_match_data_create()</b>, so it is always possible to return the overall
 matched string.
 </P>
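The pairing of offsets can be modelled without the library. The sketch below is an assumption-laden illustration, not PCRE2 code: size_t stands in for PCRE2_SIZE, SIZE_MAX stands in for the value that marks an unset group, and effective_pairs and copy_substring are hypothetical helper names. It assumes pair n occupies elements 2n and 2n+1, with the second offset pointing just past the end of the substring.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: a request for fewer than one pair is raised to one,
   so the overall match can always be recorded. */
static size_t effective_pairs(size_t requested)
{
    return requested < 1 ? 1 : requested;
}

/* Hypothetical helper: copy substring `n` (0 is the whole match) into `out`.
   Returns the substring length, or -1 if `n` is out of range, the group is
   unset (modelled here as SIZE_MAX), or `out` is too small. */
static long copy_substring(const char *subject, const size_t *ovector,
                           size_t pair_count, size_t n,
                           char *out, size_t out_size)
{
    assert(subject != NULL && ovector != NULL && out != NULL);
    if (n >= pair_count)
        return -1;
    size_t start = ovector[2 * n];
    size_t end = ovector[2 * n + 1];
    if (start == SIZE_MAX || end == SIZE_MAX || end < start)
        return -1;
    size_t length = end - start;
    if (length + 1 > out_size)
        return -1;
    memcpy(out, subject + start, length);
    out[length] = '\0';
    return (long)length;
}
```

With four pairs, as in the example above, the vector holds eight offsets: one pair for the whole match and three pairs for captured substrings.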
 <P>
 For <b>pcre2_match_data_create_from_pattern()</b>, the first argument is a
@ -1694,15 +1697,16 @@ pointer to a compiled pattern. In this case the ovector is created to be
 exactly the right size to hold all the substrings a pattern might capture.
 </P>
 <P>
-The second argument of both these functions ia a pointer to a general context,
+The second argument of both these functions is a pointer to a general context,
 which can specify custom memory management for obtaining the memory for the
 match data block. If you are not using custom memory management, pass NULL.
 </P>
 <P>
 A match data block can be used many times, with the same or different compiled
 patterns. When it is no longer needed, it should be freed by calling
-<b>pcre2_match_data_free()</b>. How to extract information from a match data
+<b>pcre2_match_data_free()</b>. You can extract information from a match data
-block after a match operation is described in the sections on
+block after a match operation has finished, using functions that are described
 in the sections on
 <a href="#matchedstrings">matched strings</a>
 and
 <a href="#matchotherdata">other match data</a>
@ -1816,12 +1820,10 @@ PCRE2_NOTEOL, PCRE2_NOTEMPTY, PCRE2_NOTEMPTY_ATSTART, PCRE2_NO_UTF_CHECK,
 PCRE2_PARTIAL_HARD, and PCRE2_PARTIAL_SOFT. Their action is described below.
 </P>
 <P>
-If the pattern was successfully processed by the just-in-time (JIT) compiler,
+Setting PCRE2_ANCHORED at match time is not supported by the just-in-time (JIT)
-the only supported options for matching using the JIT code are PCRE2_NOTBOL,
+compiler. If it is set, JIT matching is disabled and the normal interpretive
-PCRE2_NOTEOL, PCRE2_NOTEMPTY, PCRE2_NOTEMPTY_ATSTART, PCRE2_NO_UTF_CHECK,
+code in <b>pcre2_match()</b> is run. The remaining options are supported for JIT
-PCRE2_PARTIAL_HARD, and PCRE2_PARTIAL_SOFT. If an unsupported option is used,
+matching.
 JIT matching is disabled and the normal interpretive code in
 <b>pcre2_match()</b> is run.
 <pre>
  PCRE2_ANCHORED
 </pre>
@ -1835,17 +1837,18 @@ matching.
 </pre>
 This option specifies that the first character of the subject string is not the
 beginning of a line, so the circumflex metacharacter should not match before
-it. Setting this without PCRE2_MULTILINE (at compile time) causes circumflex
+it. Setting this without having set PCRE2_MULTILINE at compile time causes
-never to match. This option affects only the behaviour of the circumflex
+circumflex never to match. This option affects only the behaviour of the
-metacharacter. It does not affect \A.
+circumflex metacharacter. It does not affect \A.
 <pre>
  PCRE2_NOTEOL
 </pre>
 This option specifies that the end of the subject string is not the end of a
 line, so the dollar metacharacter should not match it nor (except in multiline
-mode) a newline immediately before it. Setting this without PCRE2_MULTILINE (at
+mode) a newline immediately before it. Setting this without having set
-compile time) causes dollar never to match. This option affects only the
+PCRE2_MULTILINE at compile time causes dollar never to match. This option
-behaviour of the dollar metacharacter. It does not affect \Z or \z.
+affects only the behaviour of the dollar metacharacter. It does not affect \Z
 or \z.
 <pre>
  PCRE2_NOTEMPTY
 </pre>
@ -1857,13 +1860,16 @@ match the empty string, the entire match fails. For example, if the pattern
 </pre>
 is applied to a string not beginning with "a" or "b", it matches an empty
 string at the start of the subject. With PCRE2_NOTEMPTY set, this match is not
-valid, so PCRE2 searches further into the string for occurrences of "a" or "b".
+valid, so <b>pcre2_match()</b> searches further into the string for occurrences
 of "a" or "b".
 <pre>
  PCRE2_NOTEMPTY_ATSTART
 </pre>
-This is like PCRE2_NOTEMPTY, except that an empty string match that is not at
+This is like PCRE2_NOTEMPTY, except that it locks out an empty string match
-the start of the subject is permitted. If the pattern is anchored, such a match
+only at the first matching position, that is, at the start of the subject plus
-can occur only if the pattern contains \K.
+the starting offset. An empty string match later in the subject is permitted.
 If the pattern is anchored, such a match can occur only if the pattern contains
 \K.
 <pre>
  PCRE2_NO_UTF_CHECK
 </pre>
@ -1904,8 +1910,8 @@ subject characters to complete the match. If this happens when
 PCRE2_PARTIAL_SOFT (but not PCRE2_PARTIAL_HARD) is set, matching continues by
 testing any remaining alternatives. Only if no complete match can be found is
 PCRE2_ERROR_PARTIAL returned instead of PCRE2_ERROR_NOMATCH. In other words,
-PCRE2_PARTIAL_SOFT says that the caller is prepared to handle a partial match,
+PCRE2_PARTIAL_SOFT specifies that the caller is prepared to handle a partial
-but only if no complete match can be found.
+match, but only if no complete match can be found.
 </P>
 <P>
 If PCRE2_PARTIAL_HARD is set, it overrides PCRE2_PARTIAL_SOFT. In this case, if
@ -1928,14 +1934,14 @@ a
 <a href="#compilecontext">compile context.</a>
 During matching, the newline choice affects the behaviour of the dot,
 circumflex, and dollar metacharacters. It may also alter the way the match
-position is advanced after a match failure for an unanchored pattern.
+starting position is advanced after a match failure for an unanchored pattern.
 </P>
 <P>
-When PCRE2_NEWLINE_CRLF, PCRE2_NEWLINE_ANYCRLF, or PCRE2_NEWLINE_ANY is set,
+When PCRE2_NEWLINE_CRLF, PCRE2_NEWLINE_ANYCRLF, or PCRE2_NEWLINE_ANY is set as
-and a match attempt for an unanchored pattern fails when the current position
+the newline convention, and a match attempt for an unanchored pattern fails
-is at a CRLF sequence, and the pattern contains no explicit matches for CR or
+when the current starting position is at a CRLF sequence, and the pattern
-LF characters, the match position is advanced by two characters instead of one,
+contains no explicit matches for CR or LF characters, the match position is
-in other words, to after the CRLF.
+advanced by two characters instead of one, in other words, to after the CRLF.
 </P>
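The advance rule in the paragraph above can be expressed as a small decision function. This is a simplified model rather than PCRE2's implementation: next_start is a hypothetical name, the two flags are assumed to have been worked out by the caller, and each character is assumed to occupy one code unit (no UTF handling).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model: return the next starting position after a failed
   match attempt at `pos` for an unanchored pattern.
   crlf_is_valid_newline: the convention is CRLF, ANYCRLF, or ANY.
   pattern_has_explicit_cr_or_lf: the pattern contains a literal CR or LF,
   or a \r or \n escape. */
static size_t next_start(const char *subject, size_t length, size_t pos,
                         bool crlf_is_valid_newline,
                         bool pattern_has_explicit_cr_or_lf)
{
    assert(subject != NULL && pos < length);
    if (crlf_is_valid_newline && !pattern_has_explicit_cr_or_lf &&
        pos + 1 < length &&
        subject[pos] == '\r' && subject[pos + 1] == '\n')
        return pos + 2;   /* skip to just after the CRLF */
    return pos + 1;       /* ordinary one-character advance */
}
```

For the subject "a\r\nb", a failure at offset 1 moves the next attempt to offset 3 when CRLF is a valid newline and the pattern has no explicit CR or LF; otherwise the next attempt is at offset 2.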
 <P>
 The above rule is a compromise that makes the most common cases work as
@ -1948,8 +1954,8 @@ reference, and so advances only by one character after the first failure.
 <P>
 An explicit match for CR or LF is either a literal appearance of one of those
 characters in the pattern, or one of the \r or \n escape sequences. Implicit
-matches such as [^X] do not count, nor does \s (which includes CR and LF in
+matches such as [^X] do not count, nor does \s, even though it includes CR and
-the characters that it matches).
+LF in the characters that it matches.
 </P>
 <P>
 Notwithstanding the above, anomalous effects may still occur when CRLF is a
@ -1967,16 +1973,16 @@ In general, a pattern matches a certain portion of the subject, and in
 addition, further substrings from the subject may be picked out by
 parenthesized parts of the pattern. Following the usage in Jeffrey Friedl's
 book, this is called "capturing" in what follows, and the phrase "capturing
-subpattern" is used for a fragment of a pattern that picks out a substring.
+subpattern" or "capturing group" is used for a fragment of a pattern that picks
-PCRE2 supports several other kinds of parenthesized subpattern that do not
+out a substring. PCRE2 supports several other kinds of parenthesized subpattern
-cause substrings to be captured. The <b>pcre2_pattern_info()</b> function can be
+that do not cause substrings to be captured. The <b>pcre2_pattern_info()</b>
-used to find out how many capturing subpatterns there are in a compiled
+function can be used to find out how many capturing subpatterns there are in a
-pattern.
+compiled pattern.
 </P>
 <P>
 The overall matched string and any captured substrings are returned to the
-caller via a vector of PCRE2_SIZE values, called the <b>ovector</b>. This is
+caller via a vector of PCRE2_SIZE values. This is called the <b>ovector</b>, and
-contained within the
+is contained within the
 <a href="#matchdatablock">match data block.</a>
 You can obtain direct access to the ovector by calling
 <b>pcre2_get_ovector_pointer()</b> to find its address, and
@ -2045,9 +2051,7 @@ parentheses, no more than <i>ovector[0]</i> to <i>ovector[2n+1]</i> are set by
 <b>pcre2_match()</b>. The other elements retain whatever values they previously
 had.
 <a name="matchotherdata"></a></P>
-<br><b>
+<br><a name="SEC25" href="#TOC1">OTHER INFORMATION ABOUT A MATCH</a><br>
 Other information about the match
 </b><br>
 <P>
 <b>PCRE2_SPTR pcre2_get_mark(pcre2_match_data *<i>match_data</i>);</b>
 <br>
@ -2055,7 +2059,7 @@ Other information about the match
 <b>PCRE2_SIZE pcre2_get_startchar(pcre2_match_data *<i>match_data</i>);</b>
 </P>
 <P>
-In addition to the offsets in the ovector, other information about a match is
+As well as the offsets in the ovector, other information about a match is
 retained in the match data block and can be retrieved by the above functions.
 </P>
 <P>
@ -2071,9 +2075,7 @@ different to the value of <i>ovector[0]</i> if the pattern contains the \K
 escape sequence. After a partial match, however, this value is always the same
 as <i>ovector[0]</i> because \K does not affect the result of a partial match.
 <a name="errorlist"></a></P>
-<br><b>
+<br><a name="SEC26" href="#TOC1">ERROR RETURNS FROM <b>pcre2_match()</b></a><br>
 Error return values from <b>pcre2_match()</b>
 </b><br>
 <P>
 If <b>pcre2_match()</b> fails, it returns a negative number. This can be
 converted to a text string by calling <b>pcre2_get_error_message()</b>. Negative
@ -2108,7 +2110,7 @@ passed to a 16-bit or 32-bit library function, or vice versa.
 <pre>
  PCRE2_ERROR_BADOFFSET
 </pre>
-The value of <i>startoffset</i> greater than the length of the subject.
+The value of <i>startoffset</i> was greater than the length of the subject.
 <pre>
  PCRE2_ERROR_BADOPTION
 </pre>
@ -2175,14 +2177,14 @@ the pattern. Specifically, it means that either the whole pattern or a
 subpattern has been called recursively for the second time at the same position
 in the subject string. Some simple patterns that might do this are detected and
 faulted at compile time, but more complicated cases, in particular mutual
-recursions between two different subpatterns, cannot be detected until run
+recursions between two different subpatterns, cannot be detected until matching
-time.
+is attempted.
 <pre>
  PCRE2_ERROR_RECURSIONLIMIT
 </pre>
 The internal recursion limit was reached.
 <a name="extractbynumber"></a></P>
-<br><a name="SEC25" href="#TOC1">EXTRACTING CAPTURED SUBSTRINGS BY NUMBER</a><br>
+<br><a name="SEC27" href="#TOC1">EXTRACTING CAPTURED SUBSTRINGS BY NUMBER</a><br>
 <P>
 <b>int pcre2_substring_length_bynumber(pcre2_match_data *<i>match_data</i>,</b>
 <b>  unsigned int <i>number</i>, PCRE2_SIZE *<i>length</i>);</b>
@ -2228,8 +2230,8 @@ extract the captured substrings.
 <P>
 The final arguments of <b>pcre2_substring_copy_bynumber()</b> are a pointer to
 the buffer and a pointer to a variable that contains its length in code units.
-This is updated to contain the actual number of code units used, excluding the
+This is updated to contain the actual number of code units used for the
-terminating zero.
+extracted substring, excluding the terminating zero.
 </P>
 <P>
 For <b>pcre2_substring_get_bynumber()</b> the third and fourth arguments point
@ -2254,7 +2256,7 @@ no capturing group of that number in the pattern, or because the group with
 that number did not participate in the match, or because the ovector was too
 small to capture that group.
 </P>
-<br><a name="SEC26" href="#TOC1">EXTRACTING A LIST OF ALL CAPTURED SUBSTRINGS</a><br>
+<br><a name="SEC28" href="#TOC1">EXTRACTING A LIST OF ALL CAPTURED SUBSTRINGS</a><br>
 <P>
 <b>int pcre2_substring_list_get(pcre2_match_data *<i>match_data</i>,</b>
 <b>  PCRE2_UCHAR ***<i>listptr</i>, PCRE2_SIZE **<i>lengthsptr</i>);</b>
@ -2264,10 +2266,11 @@ small to capture that group.
 </P>
 <P>
 The <b>pcre2_substring_list_get()</b> function extracts all available substrings
-and builds a list of pointers to them, and a second list that contains their
+and builds a list of pointers to them. It also (optionally) builds a second
-lengths (in code units), excluding a terminating zero that is added to each of
+list that contains their lengths (in code units), excluding a terminating zero
-them. All this is done in a single block of memory that is obtained using the
+that is added to each of them. All this is done in a single block of memory
-same memory allocation function that was used to get the match data block.
+that is obtained using the same memory allocation function that was used to get
 the match data block.
 </P>
 <P>
 The address of the memory block is returned via <i>listptr</i>, which is also
@ -2285,10 +2288,10 @@ If this function encounters a substring that is unset, which can happen when
 capturing subpattern number <i>n+1</i> matches some part of the subject, but
 subpattern <i>n</i> has not been used at all, it returns an empty string. This
 can be distinguished from a genuine zero-length substring by inspecting the
-appropriate offset in the ovector, which contains PCRE2_UNSET for unset
+appropriate offset in the ovector, which contains PCRE2_UNSET for unset
 substrings.
 <a name="extractbyname"></a></P>
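The distinction between an unset substring and a genuine zero-length one can be sketched in plain C. This is only an illustration under stated assumptions: it does not call PCRE2, `SKETCH_UNSET` is a stand-in for PCRE2_UNSET (taken here to be the all-ones size value), and `classify_capture()` is a hypothetical helper, not a library function.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for PCRE2_UNSET; assumed to be the all-ones value of the size type. */
#define SKETCH_UNSET (~(size_t)0)

enum capture_state { CAPTURE_UNSET, CAPTURE_EMPTY, CAPTURE_NONEMPTY };

/* Classify capture group 'number' from an ovector laid out as pairs of
   start/end offsets: ovector[2n] and ovector[2n+1]. Hypothetical helper. */
enum capture_state classify_capture(const size_t *ovector, unsigned int number)
{
    size_t start = ovector[2 * number];
    size_t end = ovector[2 * number + 1];

    if (start == SKETCH_UNSET) return CAPTURE_UNSET;   /* group took no part */
    return (start == end) ? CAPTURE_EMPTY : CAPTURE_NONEMPTY;
}
```

With a hand-built ovector of `{0, 1, SKETCH_UNSET, SKETCH_UNSET, 0, 0}`, group 1 is reported as unset and group 2 as an empty string, although a list of extracted strings would show an empty string for both.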
-<br><a name="SEC27" href="#TOC1">EXTRACTING CAPTURED SUBSTRINGS BY NAME</a><br>
+<br><a name="SEC29" href="#TOC1">EXTRACTING CAPTURED SUBSTRINGS BY NAME</a><br>
 <P>
 <b>int pcre2_substring_number_from_name(const pcre2_code *<i>code</i>,</b>
 <b>  PCRE2_SPTR <i>name</i>);</b>
@ -2324,11 +2327,10 @@ that name.
 </P>
 <P>
 Given the number, you can extract the substring directly, or use one of the
-functions described in the previous section. For convenience, there are also
+functions described above. For convenience, there are also "byname" functions
-"byname" functions that correspond to the "bynumber" functions, the only
+that correspond to the "bynumber" functions, the only difference being that the
-difference being that the second argument is a name instead of a number.
+second argument is a name instead of a number. However, if PCRE2_DUPNAMES is
-However, if PCRE2_DUPNAMES is set and there are duplicate names,
+set and there are duplicate names, the behaviour may not be what you want.
 the behaviour may not be what you want (see the next section).
 </P>
 <P>
 <b>Warning:</b> If the pattern uses the (?| feature to set up multiple
@ -2341,7 +2343,7 @@ names are not included in the compiled code. The matching process uses only
 numbers. For this reason, the use of different names for subpatterns of the
 same number causes an error at compile time.
 </P>
-<br><a name="SEC28" href="#TOC1">CREATING A NEW STRING WITH SUBSTITUTIONS</a><br>
+<br><a name="SEC30" href="#TOC1">CREATING A NEW STRING WITH SUBSTITUTIONS</a><br>
 <P>
 <b>int pcre2_substitute(const pcre2_code *<i>code</i>, PCRE2_SPTR <i>subject</i>,</b>
 <b>  PCRE2_SIZE <i>length</i>, PCRE2_SIZE <i>startoffset</i>,</b>
@ -2368,8 +2370,8 @@ recognized:
 Either a group number or a group name can be given for &#60;n&#62;. Curly brackets are
 required only if the following character would be interpreted as part of the
 number or name. The number may be zero to include the entire matched string.
-For example, if the pattern a(b)c is matched with "[abc]" and the replacement
+For example, if the pattern a(b)c is matched with "=abc=" and the replacement
-string "+$1$0$1+", the result is "[+babcb+]". Group insertion is done by
+string "+$1$0$1+", the result is "=+babcb+=". Group insertion is done by
 calling <b>pcre2_substring_copy_byname()</b> or <b>pcre2_substring_copy_bynumber()</b> as
 appropriate.
 </P>
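The group insertion described above can be sketched in plain C. This is a minimal illustration, not the library's implementation: it does not call PCRE2, it handles only numeric references ($n and ${n}), it treats an unset group as an empty string, and the names `expand_replacement()` and `substitute_once()` are hypothetical. The ovector is assumed to hold pairs of start/end offsets, with pair 0 describing the whole match.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Expand $n and ${n} in 'replacement' using offsets from 'ovector'.
   Returns the number of characters written (excluding the terminating
   zero), or -1 for a bad reference or a buffer that is too small. */
int expand_replacement(const char *subject, const size_t *ovector,
                       size_t pair_count, const char *replacement,
                       char *out, size_t out_size)
{
    size_t used = 0;
    const char *p = replacement;

    if (out_size == 0) return -1;
    while (*p != '\0') {
        if (*p != '$') {                      /* literal character */
            if (used + 1 >= out_size) return -1;
            out[used++] = *p++;
            continue;
        }
        p++;                                  /* skip the dollar sign */
        int braced = (*p == '{');
        if (braced) p++;
        if (!isdigit((unsigned char)*p)) return -1;
        size_t group = 0;
        while (isdigit((unsigned char)*p)) {
            group = group * 10 + (size_t)(*p - '0');
            if (group > 65535) return -1;     /* guard against overflow */
            p++;
        }
        if (braced) {
            if (*p != '}') return -1;
            p++;
        }
        if (group >= pair_count) return -1;
        size_t start = ovector[2 * group];
        size_t len = ovector[2 * group + 1] - start;   /* 0 if unset */
        if (used + len >= out_size) return -1;
        memcpy(out + used, subject + start, len);
        used += len;
    }
    out[used] = '\0';
    return (int)used;
}

/* Copy 'subject' to 'out' with the single match described by ovector[0]
   and ovector[1] replaced by the expanded replacement string. */
int substitute_once(const char *subject, const size_t *ovector,
                    size_t pair_count, const char *replacement,
                    char *out, size_t out_size)
{
    size_t match_start = ovector[0], match_end = ovector[1];

    if (match_start >= out_size) return -1;
    memcpy(out, subject, match_start);        /* text before the match */
    int n = expand_replacement(subject, ovector, pair_count, replacement,
                               out + match_start, out_size - match_start);
    if (n < 0) return -1;
    size_t used = match_start + (size_t)n;
    size_t tail = strlen(subject) - match_end;
    if (used + tail >= out_size) return -1;
    memcpy(out + used, subject + match_end, tail + 1);  /* includes the zero */
    return (int)(used + tail);
}
```

For the example in the text, matching a(b)c against "=abc=" gives the offsets {1, 4, 2, 3}. Passing those with the replacement "+$1$0$1+" to `substitute_once()` yields "=+babcb+=".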
@ -2402,7 +2404,7 @@ straight back. PCRE2_ERROR_BADREPLACEMENT is returned for an invalid
 replacement string (unrecognized sequence following a dollar sign), and
 PCRE2_ERROR_NOMEMORY is returned if the output buffer is not big enough.
 </P>
-<br><a name="SEC29" href="#TOC1">DUPLICATE SUBPATTERN NAMES</a><br>
+<br><a name="SEC31" href="#TOC1">DUPLICATE SUBPATTERN NAMES</a><br>
 <P>
 <b>int pcre2_substring_nametable_scan(const pcre2_code *<i>code</i>,</b>
 <b>  PCRE2_SPTR <i>name</i>, PCRE2_SPTR *<i>first</i>, PCRE2_SPTR *<i>last</i>);</b>
@ -2423,19 +2425,21 @@ documentation.
 When duplicates are present, <b>pcre2_substring_copy_byname()</b> and
 <b>pcre2_substring_get_byname()</b> return the first substring corresponding to
 the given name that is set. If none are set, PCRE2_ERROR_NOSUBSTRING is
-returned. The <b>pcre2_substring_number_from_name()</b> function returns one of
+returned. The <b>pcre2_substring_number_from_name()</b> function returns
-the numbers that are associated with the name, but it is not defined which it
+the error PCRE2_ERROR_NOUNIQUESUBSTRING.
 is.
 </P>
 <P>
 If you want to get full details of all captured substrings for a given name,
 you must use the <b>pcre2_substring_nametable_scan()</b> function. The first
 argument is the compiled pattern, and the second is the name. If the third and
-fourth arguments are NULL, the function returns a group number (it is not
+fourth arguments are NULL, the function returns a group number for a unique
-defined which). Otherwise, the third and fourth arguments must be pointers to
+name, or PCRE2_ERROR_NOUNIQUESUBSTRING otherwise.
 </P>
 <P>
 When the third and fourth arguments are not NULL, they must be pointers to
 variables that are updated by the function. After it has run, they point to the
 first and last entries in the name-to-number table for the given name, and the
-function returns the length of each entry. In both cases,
+function returns the length of each entry in code units. In both cases,
 PCRE2_ERROR_NOSUBSTRING is returned if there are no entries for the given name.
 </P>
 <P>
@ -2445,14 +2449,14 @@ The format of the name table is described above in the section entitled
 Given all the relevant entries for the name, you can extract each of their
 numbers, and hence the captured data.
 </P>
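Walking the entries between the two returned pointers might look like the following sketch. It is an illustration under stated assumptions, not library code: it assumes the 8-bit library's name table layout, in which each fixed-size entry starts with the group number in two bytes, most significant byte first, followed by the zero-terminated name. The helper `collect_group_numbers()` is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Collect the group numbers of all name table entries from 'first' to
   'last' inclusive. 'entry_size' is the length of each entry in code
   units. Assumes the 8-bit layout described in the lead-in.
   Returns the number of group numbers stored in 'numbers'. */
size_t collect_group_numbers(const unsigned char *first,
                             const unsigned char *last,
                             size_t entry_size,
                             unsigned int *numbers, size_t max_numbers)
{
    size_t count = 0;

    assert(entry_size >= 3);   /* two number bytes plus at least a zero */
    for (const unsigned char *entry = first; entry <= last;
         entry += entry_size) {
        if (count == max_numbers) break;
        /* Group number: most significant byte first. */
        numbers[count++] = ((unsigned int)entry[0] << 8) | entry[1];
    }
    return count;
}
```

Each number obtained this way can then be passed to one of the "bynumber" functions, or used to index the ovector directly, to find which of the duplicate groups actually captured something.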
-<br><a name="SEC30" href="#TOC1">FINDING ALL POSSIBLE MATCHES</a><br>
+<br><a name="SEC32" href="#TOC1">FINDING ALL POSSIBLE MATCHES AT ONE POSITION</a><br>
 <P>
 The traditional matching function uses a similar algorithm to Perl, which stops
-when it finds the first match, starting at a given point in the subject. If you
+when it finds the first match at a given point in the subject. If you want to
-want to find all possible matches, or the longest possible match at a given
+find all possible matches, or the longest possible match at a given position,
-position, consider using the alternative matching function (see below) instead.
+consider using the alternative matching function (see below) instead. If you
-If you cannot use the alternative function, you can kludge it up by making use
+cannot use the alternative function, you can kludge it up by making use of the
-of the callout facility, which is described in the
+callout facility, which is described in the
 <a href="pcre2callout.html"><b>pcre2callout</b></a>
 documentation.
 </P>
@ -2463,7 +2467,7 @@ substring. Then return 1, which forces <b>pcre2_match()</b> to backtrack and try
 other alternatives. Ultimately, when it runs out of matches,
 <b>pcre2_match()</b> will yield PCRE2_ERROR_NOMATCH.
 <a name="dfamatch"></a></P>
-<br><a name="SEC31" href="#TOC1">MATCHING A PATTERN: THE ALTERNATIVE FUNCTION</a><br>
+<br><a name="SEC33" href="#TOC1">MATCHING A PATTERN: THE ALTERNATIVE FUNCTION</a><br>
 <P>
 <b>int pcre2_dfa_match(const pcre2_code *<i>code</i>, PCRE2_SPTR <i>subject</i>,</b>
 <b>  PCRE2_SIZE <i>length</i>, PCRE2_SIZE <i>startoffset</i>,</b>
@ -2591,11 +2595,10 @@ the longest matches.
 <P>
 NOTE: PCRE2's "auto-possessification" optimization usually applies to character
 repeats at the end of a pattern (as well as internally). For example, the
-pattern "a\d+" is compiled as if it were "a\d++" because there is no point in
+pattern "a\d+" is compiled as if it were "a\d++". For DFA matching, this
-backtracking into the repeated digits. For DFA matching, this means that only
+means that only one possible match is found. If you really do want multiple
-one possible match is found. If you really do want multiple matches in such
+matches in such cases, either use an ungreedy repeat such as "a\d+?" or set
-cases, either use an ungreedy repeat ("a\d+?") or set the
+the PCRE2_NO_AUTO_POSSESS option when compiling.
 PCRE2_NO_AUTO_POSSESS option when compiling.
 </P>
 <br><b>
 Error returns from <b>pcre2_dfa_match()</b>
@ -2633,29 +2636,29 @@ extremely rare, as a vector of size 1000 is used.
 <pre>
  PCRE2_ERROR_DFA_BADRESTART
 </pre>
-When <b>pcre2_dfa_match()</b> is called with the <b>pcre2_dfa_RESTART</b> option,
+When <b>pcre2_dfa_match()</b> is called with the <b>PCRE2_DFA_RESTART</b> option,
 some plausibility checks are made on the contents of the workspace, which
 should contain data about the previous partial match. If any of these checks
 fail, this error is given.
 </P>
-<br><a name="SEC32" href="#TOC1">SEE ALSO</a><br>
+<br><a name="SEC34" href="#TOC1">SEE ALSO</a><br>
 <P>
-<b>pcre2build</b>(3), <b>pcre2libs</b>(3), <b>pcre2callout</b>(3),
+<b>pcre2build</b>(3), <b>pcre2callout</b>(3), <b>pcre2demo(3)</b>,
 <b>pcre2matching</b>(3), <b>pcre2partial</b>(3), <b>pcre2posix</b>(3),
-<b>pcre2demo(3)</b>, <b>pcre2sample</b>(3), <b>pcre2stack</b>(3).
+<b>pcre2sample</b>(3), <b>pcre2stack</b>(3), <b>pcre2unicode</b>(3).
 </P>
-<br><a name="SEC33" href="#TOC1">AUTHOR</a><br>
+<br><a name="SEC35" href="#TOC1">AUTHOR</a><br>
 <P>
 Philip Hazel
 <br>
 University Computing Service
 <br>
-Cambridge CB2 3QH, England.
+Cambridge, England.
 <br>
 </P>
-<br><a name="SEC34" href="#TOC1">REVISION</a><br>
+<br><a name="SEC36" href="#TOC1">REVISION</a><br>
 <P>
-Last updated: 11 November 2014
+Last updated: 21 November 2014
 <br>
 Copyright &copy; 1997-2014 University of Cambridge.
 <br>
--- a/doc/html/pcre2build.html
+++ b/doc/html/pcre2build.html
@ -461,7 +461,7 @@ Philip Hazel
 <br>
 University Computing Service
 <br>
-Cambridge CB2 3QH, England.
+Cambridge, England.
 <br>
 </P>
 <br><a name="SEC21" href="#TOC1">REVISION</a><br>
--- a/doc/html/pcre2callout.html
+++ b/doc/html/pcre2callout.html
@ -256,7 +256,7 @@ Philip Hazel
 <br>
 University Computing Service
 <br>
-Cambridge CB2 3QH, England.
+Cambridge, England.
 <br>
 </P>
 <br><a name="SEC7" href="#TOC1">REVISION</a><br>
--- a/doc/html/pcre2compat.html
+++ b/doc/html/pcre2compat.html
@ -207,7 +207,7 @@ Philip Hazel
 <br>
 University Computing Service
 <br>
-Cambridge CB2 3QH, England.
+Cambridge, England.
 <br>
 </P>
 <br><b>
--- a/doc/html/pcre2grep.html
+++ b/doc/html/pcre2grep.html
@ -745,7 +745,7 @@ Philip Hazel
 <br>
 University Computing Service
 <br>
-Cambridge CB2 3QH, England.
+Cambridge, England.
 <br>
 </P>
 <br><a name="SEC14" href="#TOC1">REVISION</a><br>
--- a/doc/html/pcre2jit.html
+++ b/doc/html/pcre2jit.html
@ -413,7 +413,7 @@ Philip Hazel (FAQ by Zoltan Herczeg)
 <br>
 University Computing Service
 <br>
-Cambridge CB2 3QH, England.
+Cambridge, England.
 <br>
 </P>
 <br><a name="SEC13" href="#TOC1">REVISION</a><br>
--- a/doc/html/pcre2limits.html
+++ b/doc/html/pcre2limits.html
@ -73,7 +73,7 @@ Philip Hazel
 <br>
 University Computing Service
 <br>
-Cambridge CB2 3QH, England.
+Cambridge, England.
 <br>
 </P>
 <br><b>
--- a/doc/html/pcre2matching.html
+++ b/doc/html/pcre2matching.html
@ -227,7 +227,7 @@ Philip Hazel
 <br>
 University Computing Service
 <br>
-Cambridge CB2 3QH, England.
+Cambridge, England.
 <br>
 </P>
 <br><a name="SEC8" href="#TOC1">REVISION</a><br>
--- a/doc/html/pcre2partial.html
+++ b/doc/html/pcre2partial.html
@ -450,7 +450,7 @@ Philip Hazel
 <br>
 University Computing Service
 <br>
-Cambridge CB2 3QH, England.
+Cambridge, England.
 <br>
 </P>
 <br><a name="SEC10" href="#TOC1">REVISION</a><br>
--- a/doc/html/pcre2pattern.html
+++ b/doc/html/pcre2pattern.html
@ -3231,7 +3231,7 @@ Philip Hazel
 <br>
 University Computing Service
 <br>
-Cambridge CB2 3QH, England.
+Cambridge, England.
 <br>
 </P>
 <br><a name="SEC30" href="#TOC1">REVISION</a><br>
--- a/doc/html/pcre2perform.html
+++ b/doc/html/pcre2perform.html
@ -180,7 +180,7 @@ Philip Hazel
 <br>
 University Computing Service
 <br>
-Cambridge CB2 3QH, England.
+Cambridge, England.
 <br>
 </P>
 <br><b>
--- a/doc/html/pcre2posix.html
+++ b/doc/html/pcre2posix.html
@ -278,7 +278,7 @@ Philip Hazel
 <br>
 University Computing Service
 <br>
-Cambridge CB2 3QH, England.
+Cambridge, England.
 <br>
 </P>
 <br><a name="SEC9" href="#TOC1">REVISION</a><br>
--- a/doc/html/pcre2sample.html
+++ b/doc/html/pcre2sample.html
@ -90,7 +90,7 @@ Philip Hazel
 <br>
 University Computing Service
 <br>
-Cambridge CB2 3QH, England.
+Cambridge, England.
 <br>
 </P>
 <br><b>
--- a/doc/html/pcre2stack.html
+++ b/doc/html/pcre2stack.html
@ -33,6 +33,13 @@ the recursive call would immediately be passed back as the result of the
 current call (a "tail recursion"), the function is just restarted instead.
 </P>
 <P>
 Each time the internal <b>match()</b> function is called recursively, it uses
 memory from the process stack. For certain kinds of pattern and data, very
 large amounts of stack may be needed, despite the recognition of "tail
 recursion". Note that if PCRE2 is compiled with the -fsanitize=address option
 of the GCC compiler, the stack requirements are greatly increased.
 </P>
 <P>
 The above comments apply when <b>pcre2_match()</b> is run in its normal
 interpretive manner. If the compiled pattern was processed by
 <b>pcre2_jit_compile()</b>, and just-in-time compiling was successful, and the
@ -61,10 +68,7 @@ relevant only for <b>pcre2_match()</b> without the JIT optimization.
 Reducing <b>pcre2_match()</b>'s stack usage
 </b><br>
 <P>
-Each time that the internal <b>match()</b> function is called recursively, it
+You can often reduce the amount of recursion, and therefore the
 uses memory from the process stack. For certain kinds of pattern and data, very
 large amounts of stack may be needed, despite the recognition of "tail
 recursion". You can often reduce the amount of recursion, and therefore the
 amount of stack used, by modifying the pattern that is being matched. Consider,
 for example, this pattern:
 <pre>
@ -187,14 +191,14 @@ Philip Hazel
 <br>
 University Computing Service
 <br>
-Cambridge CB2 3QH, England.
+Cambridge, England.
 <br>
 </P>
 <br><b>
 REVISION
 </b><br>
 <P>
-Last updated: 20 October 2014
+Last updated: 21 November 2014
 <br>
 Copyright &copy; 1997-2014 University of Cambridge.
 <br>
--- a/doc/html/pcre2syntax.html
+++ b/doc/html/pcre2syntax.html
@ -548,7 +548,7 @@ Philip Hazel
 <br>
 University Computing Service
 <br>
-Cambridge CB2 3QH, England.
+Cambridge, England.
 <br>
 </P>
 <br><a name="SEC27" href="#TOC1">REVISION</a><br>
--- a/doc/html/pcre2test.html
+++ b/doc/html/pcre2test.html
@ -1301,7 +1301,7 @@ Philip Hazel
 <br>
 University Computing Service
 <br>
-Cambridge CB2 3QH, England.
+Cambridge, England.
 <br>
 </P>
 <br><a name="SEC20" href="#TOC1">REVISION</a><br>
--- a/doc/html/pcre2unicode.html
+++ b/doc/html/pcre2unicode.html
@ -254,7 +254,7 @@ Philip Hazel
 <br>
 University Computing Service
 <br>
-Cambridge CB2 3QH, England.
+Cambridge, England.
 <br>
 </P>
 <br><b>
--- a/doc/pcre2.txt
+++ b/doc/pcre2.txt
@ -22,9 +22,9 @@ INTRODUCTION
       pattern matching using the same syntax and semantics as Perl, with just
       a few differences. Some features that appeared in Python and the origi-
       nal  PCRE  before  they  appeared  in Perl are also available using the
-       Python syntax, there is some support for one or two .NET and  Oniguruma
+       Python syntax. There is also some support for one or two .NET and Onig-
-       syntax  items,  and there are options for requesting some minor changes
+       uruma  syntax  items,  and  there are options for requesting some minor
-       that give better ECMAScript (aka JavaScript) compatibility.
+       changes that give better ECMAScript (aka JavaScript) compatibility.
       The source code for PCRE2 can be compiled to support 8-bit, 16-bit,  or
       32-bit  code units, which means that up to three separate libraries may
@ -33,7 +33,7 @@ INTRODUCTION
       tively. In all three cases, strings can be interpreted  either  as  one
       character  per  code  unit, or as UTF-encoded Unicode, with support for
       Unicode general category properties. Unicode  support  is  optional  at
-       build  time  (but  is  the default); however, processing strings as UTF
+       build  time  (but  is  the default). However, processing strings as UTF
       code units must be enabled explicitly at run time. The version of  Uni-
       code in use can be discovered by running
@ -124,19 +124,18 @@ USER DOCUMENTATION
         pcre2compat        discussion of Perl compatibility
         pcre2demo          a demonstration C program that uses PCRE2
         pcre2grep          description of the pcre2grep command (8-bit only)
-         pcre2jit           discussion of the just-in-time  optimization  sup-
+         pcre2jit           discussion of just-in-time optimization support
       port
         pcre2limits        details of size and other limits
         pcre2matching      discussion of the two matching algorithms
         pcre2partial       details of the partial matching facility
-         pcre2pattern       syntax and semantics of supported
+         pcre2pattern       syntax and semantics of supported regular
-                             regular expressions
+                              expression patterns
         pcre2perform       discussion of performance issues
         pcre2posix         the POSIX-compatible C API for the 8-bit library
         pcre2sample        discussion of the pcre2demo program
         pcre2stack         discussion of stack usage
         pcre2syntax        quick syntax reference
-         pcre2test          description of the pcre2test testing command
+         pcre2test          description of the pcre2test command
         pcre2unicode       discussion of Unicode and UTF support
       In the "man" and HTML formats, there is also a short page  for  each  C
@ -147,7 +146,7 @@ AUTHOR
       Philip Hazel
       University Computing Service
-       Cambridge CB2 3QH, England.
+       Cambridge, England.
       Putting  an  actual email address here is a spam magnet. If you want to
       email me, use my two initials, followed by the two digits  10,  at  the
@ -156,7 +155,7 @@ AUTHOR
 REVISION
-       Last updated: 03 November 2014
+       Last updated: 18 November 2014
       Copyright (c) 1997-2014 University of Cambridge.
 ------------------------------------------------------------------------------
@ -506,13 +505,10 @@ NEWLINES
       Each  of  the first three conventions is used by at least one operating
       system as its standard newline sequence. When PCRE2 is built, a default
       can  be  specified.  The default default is LF, which is the Unix stan-
-       dard. When PCRE2 is run, the default can be overridden, either  when  a
+       dard. However, the newline convention can be changed by an  application
-       pattern is compiled, or when it is matched.
+       when calling pcre2_compile(), or it can be specified by special text at
-
+       the start of the pattern itself; this overrides any other settings. See
-       The  newline convention can be changed when calling pcre2_compile(), or
+       the pcre2pattern page for details of the special character sequences.
       it can be specified by special text at the start of the pattern itself;
       this  overrides  any  other  settings.  See  the  pcre2pattern page for
       details of the special character sequences.
       In  the  PCRE2  documentation  the  word "newline" is used to mean "the
       character or pair of characters that indicate a line break". The choice
@ -523,8 +519,8 @@ NEWLINES
       section on pcre2_match() options below.
       The  choice of newline convention does not affect the interpretation of
-       the  \n  or  \r  escape  sequences, nor does it affect what \R matches,
+       the \n or \r escape sequences, nor does it affect what \R matches; this
-       which has its own separate control.
+       has its own separate convention.
 MULTITHREADING
@ -537,7 +533,7 @@ MULTITHREADING
       cations can use it.
       There are several different blocks of data that are used to pass infor-
-       mation between the application and the PCRE libraries.
+       mation between the application and the PCRE2 libraries.
       (1) A pointer to the compiled form of a pattern is returned to the user
       when pcre2_compile() is successful. The data in the compiled pattern is
@ -634,11 +630,11 @@ PCRE2 CONTEXTS
       A  compile context is required if you want to change the default values
       of any of the following compile-time parameters:
-         What \R matches (Unicode newlines or CR, LF, CRLF only);
+         What \R matches (Unicode newlines or CR, LF, CRLF only)
-         PCRE2's character tables;
+         PCRE2's character tables
-         The newline character sequence;
+         The newline character sequence
-         The compile time nested parentheses limit;
+         The compile time nested parentheses limit
-         An external function for stack checking.
+         An external function for stack checking
       A compile context is also required if you are using custom memory  man-
       agement.   If  none of these apply, just pass NULL as the context argu-
@ -664,10 +660,9 @@ PCRE2 CONTEXTS
       The  value  must  be PCRE2_BSR_ANYCRLF, to specify that \R matches only
       CR, LF, or CRLF, or PCRE2_BSR_UNICODE, to specify that \R  matches  any
-       Unicode line ending sequence. The value  of  this  parameter  does  not
+       Unicode line ending sequence. The value is used by the JIT compiler and
-       affect  what  is  compiled; it is just saved with the compiled pattern.
+       by  the  two  interpreted   matching   functions,   pcre2_match()   and
-       The value is used by the JIT compiler and by the two interpreted match-
+       pcre2_dfa_match().
       ing functions, pcre2_match() and pcre2_dfa_match().
       int pcre2_set_character_tables(pcre2_compile_context *ccontext,
         const unsigned char *tables);
@ -763,8 +758,8 @@ PCRE2 CONTEXTS
       from  zero  for  each position in the subject string. This limit is not
       relevant to pcre2_dfa_match(), which ignores it.
-       When pcre2_match() is called with a pattern that was successfully stud-
+       When pcre2_match() is called with a pattern that was successfully  pro-
-       ied  with pcre2_jit_compile(), the way that the matching is executed is
+       cessed by pcre2_jit_compile(), the way in which matching is executed is
       entirely different. However, there is still the possibility of  runaway
       matching  that  goes  on  for  a very long time, and so the match_limit
       value is also used in this case (but in a different way) to  limit  how
@ -819,17 +814,18 @@ PCRE2 CONTEXTS
       remembering backtracking data, instead of recursive function calls that
       use the system stack. There is a discussion about PCRE2's  stack  usage
       in  the  pcre2stack documentation. See the pcre2build documentation for
-       details of how to build PCRE2. Using the heap for recursion is  a  non-
+       details of how to build PCRE2.
-       standard  way of building PCRE2, for use in environments that have lim-
+
-       ited  stacks.  Because  of  the  greater  use  of  memory   management,
+       Using the heap for recursion is a non-standard way of  building  PCRE2,
-       pcre2_match()  runs  more  slowly.  Functions that are different to the
+       for  use  in  environments  that  have  limited  stacks. Because of the
-       general custom memory functions are provided  so  that  special-purpose
+       greater use of memory management, pcre2_match() runs more slowly. Func-
-       external  code can be used for this case, because the memory blocks are
+       tions  that  are  different  to the general custom memory functions are
-       all the same size. The blocks are retained by pcre2_match() until it is
+       provided so that special-purpose external code can  be  used  for  this
-       about  to  exit  so  that  they can be re-used when possible during the
+       case,  because  the memory blocks are all the same size. The blocks are
-       match. In the absence of these functions, the normal custom memory man-
+       retained by pcre2_match() until it is about to exit so that they can be
-       agement  functions  are  used,  if supplied, otherwise the system func-
+       re-used  when  possible during the match. In the absence of these func-
-       tions.
+       tions, the normal custom memory management functions are used, if  sup-
       plied, otherwise the system functions.
 CHECKING BUILD-TIME OPTIONS
@ -858,10 +854,10 @@ CHECKING BUILD-TIME OPTIONS
         PCRE2_CONFIG_BSR
       The output is an integer whose value indicates what character sequences
-       the \R escape sequence matches by default. A value of 0 means  that  \R
+       the \R escape sequence matches by default. A value of PCRE2_BSR_UNICODE
-       matches  any  Unicode  line ending sequence; a value of 1 means that \R
+       means that \R matches any Unicode line  ending  sequence;  a  value  of
-       matches only CR, LF, or CRLF. The default can be overridden when a pat-
+       PCRE2_BSR_ANYCRLF  means  that  \R  matches  only  CR, LF, or CRLF. The
-       tern is compiled or matched.
+       default can be overridden when a pattern is compiled.
         PCRE2_CONFIG_JIT
@ -871,13 +867,13 @@ CHECKING BUILD-TIME OPTIONS
         PCRE2_CONFIG_JITTARGET
       The  where  argument  should point to a buffer that is at least 48 code
-       units long. (The exact length needed can be found by calling pcre2_con-
+       units long.  (The  exact  length  required  can  be  found  by  calling
-       fig() with where set to NULL.) The buffer is filled with a string  that
+       pcre2_config()  with  where  set  to NULL.) The buffer is filled with a
-       contains  the  name  of  the architecture for which the JIT compiler is
+       string that contains the name of the architecture  for  which  the  JIT
-       configured, for example "x86 32bit (little endian + unaligned)". If JIT
+       compiler  is  configured,  for  example  "x86  32bit  (little  endian +
-       support  is not available, PCRE2_ERROR_BADOPTION is returned, otherwise
+       unaligned)". If JIT support is not available, PCRE2_ERROR_BADOPTION  is
-       the number of code units used is returned. This is the  length  of  the
+       returned,  otherwise the number of code units used is returned. This is
-       string, plus one unit for the terminating zero.
+       the length of the string, plus one unit for the terminating zero.
         PCRE2_CONFIG_LINKSIZE
@ -906,11 +902,11 @@ CHECKING BUILD-TIME OPTIONS
       The output is an integer whose value specifies  the  default  character
       sequence that is recognized as meaning "newline". The values are:
-         1  Carriage return (CR)
+         PCRE2_NEWLINE_CR       Carriage return (CR)
-         2  Linefeed (LF)
+         PCRE2_NEWLINE_LF       Linefeed (LF)
-         3  Carriage return, linefeed (CRLF)
+         PCRE2_NEWLINE_CRLF     Carriage return, linefeed (CRLF)
-         4  Any Unicode line ending
+         PCRE2_NEWLINE_ANY      Any Unicode line ending
-         5  Any of CR, LF, or CRLF
+         PCRE2_NEWLINE_ANYCRLF  Any of CR, LF, or CRLF
       The  default  should  normally  correspond to the standard sequence for
       your operating system.
@ -943,12 +939,12 @@ CHECKING BUILD-TIME OPTIONS
         PCRE2_CONFIG_UNICODE_VERSION
       The  where  argument  should point to a buffer that is at least 24 code
-       units long. (The exact length needed can be found by calling pcre2_con-
+       units long.  (The  exact  length  required  can  be  found  by  calling
-       fig() with where set to NULL.) If PCRE2 has been compiled without  Uni-
+       pcre2_config()  with  where  set  to  NULL.) If PCRE2 has been compiled
-       code  support,  the  buffer  is  filled with the text "Unicode not sup-
+       without Unicode support, the buffer is filled with  the  text  "Unicode
-       ported". Otherwise, the Unicode version string (for  example,  "7.0.0")
+       not  supported".  Otherwise,  the  Unicode version string (for example,
-       is  inserted.  The  number  of code units used is returned. This is the
+       "7.0.0") is inserted. The number of code units used is  returned.  This
-       length of the string plus one unit for the terminating zero.
+       is the length of the string plus one unit for the terminating zero.
         PCRE2_CONFIG_UNICODE
@ -959,9 +955,9 @@ CHECKING BUILD-TIME OPTIONS
         PCRE2_CONFIG_VERSION
       The  where  argument  should point to a buffer that is at least 12 code
-       units long. (The exact length needed can be found by calling pcre2_con-
+       units long.  (The  exact  length  required  can  be  found  by  calling
-       fig() with where set to NULL.) The buffer is filled with the PCRE2 ver-
+       pcre2_config()  with  where set to NULL.) The buffer is filled with the
-       sion  string,  zero-terminated.  The  number  of  code  units  used  is
+       PCRE2 version string, zero-terminated. The number of code units used is
       returned. This is the length of the string plus one unit for the termi-
       nating zero.
@ -974,16 +970,18 @@ COMPILING A PATTERN
       pcre2_code_free(pcre2_code *code);
-       This  function  compiles a pattern, defined by a pointer to a string of
+       The pcre2_compile() function compiles a pattern into an internal  form.
-       code units and a length, into an internal form. If the pattern is zero-
+       The  pattern  is  defined  by a pointer to a string of code units and a
-       terminated,  the  length  should be specified as PCRE2_ZERO_TERMINATED.
+       length. If the pattern is zero-terminated, the length can be  specified
-       The function returns a pointer to a block of memory that  contains  the
+       as  PCRE2_ZERO_TERMINATED. The function returns a pointer to a block of
-       compiled  pattern  and related data. The caller must free the memory by
+       memory that contains the compiled pattern and related data. The  caller
-       calling pcre2_code_free() when it is no longer needed.
+       must  free the memory by calling pcre2_code_free() when it is no longer
       needed.
-       If the compile  context  argument  ccontext  is  NULL,  the  memory  is
+       If the compile context argument ccontext is NULL, memory for  the  com-
-       obtained  by  calling malloc(). Otherwise, it is obtained from the same
+       piled  pattern  is  obtained  by  calling  malloc().  Otherwise,  it is
-       memory function that was used for the compile context.
+       obtained from the same memory function that was used  for  the  compile
       context.
       The options argument contains various bit settings that affect the com-
       pilation. It should be zero if no options are required.  The  available
@ -1280,7 +1278,8 @@ COMPILING A PATTERN
       are used instead to classify characters. More details are given in  the
       section on generic character types in the pcre2pattern page. If you set
       PCRE2_UCP, matching one of the items it affects takes much longer.  The
-       option is available only if PCRE2 has been compiled with UTF support.
+       option  is  available only if PCRE2 has been compiled with Unicode sup-
       port.
         PCRE2_UNGREEDY
@ -1293,10 +1292,11 @@ COMPILING A PATTERN
       This  option  causes  PCRE2  to regard both the pattern and the subject
       strings that are subsequently processed as strings  of  UTF  characters
-       instead of single-code-unit strings. However, it is available only when
+       instead  of  single-code-unit  strings.  It  is available when PCRE2 is
-       PCRE2 is built to include UTF support. If not, the use of  this  option
+       built to include Unicode support (which is  the  default).  If  Unicode
-       provokes  an error. Details of how this option changes the behaviour of
+       support  is  not  available,  the use of this option provokes an error.
-       PCRE2 are given in the pcre2unicode page.
+       Details of how this option changes the behaviour of PCRE2 are given  in
       the pcre2unicode page.
 COMPILATION ERROR CODES
@ -1345,14 +1345,13 @@ LOCALE SUPPORT
       PCRE2 handles caseless matching, and determines whether characters  are
       letters,  digits, or whatever, by reference to a set of tables, indexed
-       by  character  code  point.  When  running  in UTF-8 mode, or using the
+       by character code point. This applies only  to  characters  whose  code
-       16-bit or 32-bit libraries, this applies only to characters  with  code
+       points  are  less than 256. By default, higher-valued code points never
-       points less than 256. By default, higher-valued code points never match
+       match escapes such as \w or \d.  However, if PCRE2 is  built  with  UTF
-       escapes such as \w or \d. However, if PCRE2 is built with UTF  support,
+       support,  all  characters  can  be  tested with \p and \P, or, alterna-
-       all  characters  can  be  tested with \p and \P, or, alternatively, the
+       tively, the PCRE2_UCP option can be set when  a  pattern  is  compiled;
-       PCRE2_UCP option can be set when a pattern is compiled; this causes  \w
+       this  causes  \w and friends to use Unicode property support instead of
-       and  friends  to  use  Unicode property support instead of the built-in
+       the built-in tables.
       tables.
       The use of locales with Unicode is discouraged.  If  you  are  handling
       characters  with  code  points  greater than 128, you should either use
@ -1463,10 +1462,9 @@ INFORMATION ABOUT A COMPILED PATTERN
         PCRE2_INFO_BSR
       The output is a uint32_t whose value indicates what character sequences
-       the  \R  escape sequence matches by default. A value of 0 means that \R
+       the \R escape sequence matches. A value of PCRE2_BSR_UNICODE means that
-       matches any Unicode line ending sequence; a value of 1  means  that  \R
+       \R matches any Unicode line ending sequence; a value of  PCRE2_BSR_ANY-
-       matches only CR, LF, or CRLF. The default can be overridden when a pat-
+       CRLF means that \R matches only CR, LF, or CRLF.
       tern is matched.
         PCRE2_INFO_CAPTURECOUNT
@ -1607,15 +1605,16 @@ INFORMATION ABOUT A COMPILED PATTERN
       The map consists of a number of  fixed-size  entries.  PCRE2_INFO_NAME-
       COUNT  gives  the number of entries, and PCRE2_INFO_NAMEENTRYSIZE gives
-       the  size  of  each  entry;  both of these return a uint32_t value. The
+       the size of each entry in code units; both of these return  a  uint32_t
-       entry   size   depends   on   the   length   of   the   longest   name.
+       value. The entry size depends on the length of the longest name.
       PCRE2_INFO_NAMETABLE returns a pointer to the first entry of the table.
       This is a PCRE2_SPTR pointer to a block of code  units.  In  the  8-bit
       library,  the  first two bytes of each entry are the number of the cap-
       turing parenthesis, most significant byte first. In the 16-bit library,
-       the  pointer  points  to 16-bit data units, the first of which contains
+       the  pointer  points  to 16-bit code units, the first of which contains
       the parenthesis number. In the 32-bit library, the  pointer  points  to
-       32-bit  data units, the first of which contains the parenthesis number.
+       32-bit  code units, the first of which contains the parenthesis number.
       The rest of the entry is the corresponding name, zero terminated.
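
The entry layout described above for the 8-bit library can be sketched as follows. This is a minimal illustration, not code from PCRE2: the helper names and the sample table are hypothetical, and the sample assumes an entry size of 7 code units (two number bytes, a longest name of four characters, and a terminating zero).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sample table with two 7-unit entries, in alphabetical
   order: "day" is group 2 and "year" is group 1. */
static const uint8_t sample_table[] = {
    0, 2, 'd', 'a', 'y', 0, 0,
    0, 1, 'y', 'e', 'a', 'r', 0
};

/* Return a pointer to entry number `index` in a table whose entries are
   each `entry_size` code units long. */
static const uint8_t *table_entry(const uint8_t *table, uint32_t entry_size,
                                  uint32_t index)
{
    assert(table != NULL);
    return table + (size_t)entry_size * index;
}

/* Decode the group number held in the first two bytes of an entry,
   most significant byte first. */
static uint32_t entry_group_number(const uint8_t *entry)
{
    assert(entry != NULL);
    return ((uint32_t)entry[0] << 8) | (uint32_t)entry[1];
}

/* The zero-terminated name follows the two number bytes. */
static const char *entry_name(const uint8_t *entry)
{
    assert(entry != NULL);
    return (const char *)(entry + 2);
}
```

In a real program the table pointer, entry count, and entry size would come from PCRE2_INFO_NAMETABLE, PCRE2_INFO_NAMECOUNT, and PCRE2_INFO_NAMEENTRYSIZE rather than from a hand-built array.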
       The names are in alphabetical order. If (?| is used to create  multiple
@ -1653,17 +1652,16 @@ INFORMATION ABOUT A COMPILED PATTERN
         PCRE2_INFO_NEWLINE
-       The  output  is  a uint32_t whose value specifies the default character
+       The output is a uint32_t with one of the following values:
       sequence that will be recognized as meaning "newline"  while  matching.
       The values are:
-         1  Carriage return (CR)
+         PCRE2_NEWLINE_CR       Carriage return (CR)
-         2  Linefeed (LF)
+         PCRE2_NEWLINE_LF       Linefeed (LF)
-         3  Carriage return, linefeed (CRLF)
+         PCRE2_NEWLINE_CRLF     Carriage return, linefeed (CRLF)
-         4  Any Unicode line ending
+         PCRE2_NEWLINE_ANY      Any Unicode line ending
-         5  Any of CR, LF, or CRLF
+         PCRE2_NEWLINE_ANYCRLF  Any of CR, LF, or CRLF
-       The default can be overridden when a pattern is matched.
+       This  specifies  the default character sequence that will be recognized
       as meaning "newline" while matching.
         PCRE2_INFO_RECURSIONLIMIT
@ -1699,17 +1697,17 @@ THE MATCH DATA BLOCK
       match data block, which is an opaque  structure  that  is  accessed  by
       function  calls.  In particular, the match data block contains a vector
       of offsets into the subject string that define the matched part of  the
-       subject and any substrings that were capured. This is know as the ovec-
+       subject  and  any  substrings  that were captured. This is known as the
-       tor.
+       ovector.
-       Before  calling  pcre2_match()  or  pcre2_dfa_match() you must create a
+       Before calling pcre2_match(), pcre2_dfa_match(),  or  pcre2_jit_match()
-       match data block by calling one of the creation  functions  above.  For
+       you must create a match data block by calling one of the creation func-
-       pcre2_match_data_create(), the first argument is the number of pairs of
+       tions above. For pcre2_match_data_create(), the first argument  is  the
-       offsets in the ovector. One pair of offsets is required to identify the
+       number  of  pairs  of  offsets  in  the ovector. One pair of offsets is
-       string  that matched the whole pattern, with another pair for each cap-
+       required to identify the string that matched the  whole  pattern,  with
-       tured substring. For example, a value of  4  creates  enough  space  to
+       another  pair  for  each  captured substring. For example, a value of 4
-       record  the  matched  portion  of  the subject plus three captured sub-
+       creates enough space to record the matched portion of the subject  plus
-       strings.   A   minimum   of   at   least   1   pair   is   imposed   by
+       three  captured  substrings.  A  minimum  of  1  pair  is  imposed   by
       pcre2_match_data_create(), so it is always possible to return the over-
       all matched string.
@ -1718,16 +1716,17 @@ THE MATCH DATA BLOCK
       be  exactly  the  right size to hold all the substrings a pattern might
       capture.
-       The  second  argument of both these functions ia a pointer to a general
+       The second argument of both these functions is a pointer to  a  general
       context,  which  can specify custom memory management for obtaining the
       memory for the match data block. If you are  not  using  custom  memory
       management, pass NULL.
       A  match  data block can be used many times, with the same or different
       compiled patterns. When it is no longer needed, it should be  freed  by
-       calling pcre2_match_data_free(). How  to  extract  information  from  a
+       calling  pcre2_match_data_free().  You  can  extract information from a
-       match  data  block after a match operation is described in the sections
+       match data block after a match operation has finished, using  functions
-       on matched strings and other match data below.
+       that  are  described in the sections on matched strings and other match
       data below.
 MATCHING A PATTERN: THE TRADITIONAL FUNCTION
@ -1826,12 +1825,10 @@ MATCHING A PATTERN: THE TRADITIONAL FUNCTION
       PCRE2_NO_UTF_CHECK, PCRE2_PARTIAL_HARD, and  PCRE2_PARTIAL_SOFT.  Their
       action is described below.
-       If  the  pattern  was  successfully processed by the just-in-time (JIT)
+       Setting  PCRE2_ANCHORED  at match time is not supported by the just-in-
-       compiler, the only supported options for matching using  the  JIT  code
+       time (JIT) compiler. If it is set, JIT matching  is  disabled  and  the
-       are PCRE2_NOTBOL, PCRE2_NOTEOL, PCRE2_NOTEMPTY, PCRE2_NOTEMPTY_ATSTART,
+       normal interpretive code in pcre2_match() is run. The remaining options
-       PCRE2_NO_UTF_CHECK, PCRE2_PARTIAL_HARD, and PCRE2_PARTIAL_SOFT.  If  an
+       are supported for JIT matching.
       unsupported  option  is  used,  JIT matching is disabled and the normal
       interpretive code in pcre2_match() is run.
         PCRE2_ANCHORED
@ -1845,18 +1842,18 @@ MATCHING A PATTERN: THE TRADITIONAL FUNCTION
       This option specifies that the first character of the subject string is not
       the  beginning  of  a  line, so the circumflex metacharacter should not
-       match before it. Setting this without PCRE2_MULTILINE (at compile time)
+       match before it. Setting this without  having  set  PCRE2_MULTILINE  at
-       causes  circumflex  never to match. This option affects only the behav-
+       compile time causes circumflex never to match. This option affects only
-       iour of the circumflex metacharacter. It does not affect \A.
+       the behaviour of the circumflex metacharacter. It does not affect \A.
         PCRE2_NOTEOL
       This option specifies that the end of the subject string is not the end
       of  a line, so the dollar metacharacter should not match it nor (except
       in multiline mode) a newline immediately before it. Setting this  with-
-       out  PCRE2_MULTILINE  (at  compile  time) causes dollar never to match.
+       out  having  set PCRE2_MULTILINE at compile time causes dollar never to
-       This option affects only the behaviour of the dollar metacharacter.  It
+       match. This option affects only the behaviour of the dollar metacharac-
-       does not affect \Z or \z.
+       ter. It does not affect \Z or \z.
         PCRE2_NOTEMPTY
@ -1869,14 +1866,16 @@ MATCHING A PATTERN: THE TRADITIONAL FUNCTION
       is applied to a string not beginning with "a" or  "b",  it  matches  an
       empty string at the start of the subject. With PCRE2_NOTEMPTY set, this
-       match is not valid, so PCRE2  searches  further  into  the  string  for
+       match is not valid, so pcre2_match() searches further into  the  string
-       occurrences of "a" or "b".
+       for occurrences of "a" or "b".
         PCRE2_NOTEMPTY_ATSTART
-       This  is like PCRE2_NOTEMPTY, except that an empty string match that is
+       This  is  like PCRE2_NOTEMPTY, except that it locks out an empty string
-       not at the start of  the  subject  is  permitted.  If  the  pattern  is
+       match only at the first matching position, that is, at the start of the
-       anchored, such a match can occur only if the pattern contains \K.
+       subject  plus  the  starting offset. An empty string match later in the
       subject is permitted.  If the pattern is anchored,  such  a  match  can
       occur only if the pattern contains \K.
         PCRE2_NO_UTF_CHECK
@ -1910,9 +1909,9 @@ MATCHING A PATTERN: THE TRADITIONAL FUNCTION
       happens  when  PCRE2_PARTIAL_SOFT  (but not PCRE2_PARTIAL_HARD) is set,
       matching continues by testing any remaining alternatives.  Only  if  no
       complete  match can be found is PCRE2_ERROR_PARTIAL returned instead of
-       PCRE2_ERROR_NOMATCH. In other words, PCRE2_PARTIAL_SOFT says  that  the
+       PCRE2_ERROR_NOMATCH. In other words, PCRE2_PARTIAL_SOFT specifies  that
-       caller  is  prepared to handle a partial match, but only if no complete
+       the  caller  is prepared to handle a partial match, but only if no com-
-       match can be found.
+       plete match can be found.
       If PCRE2_PARTIAL_HARD is set, it overrides PCRE2_PARTIAL_SOFT. In  this
       case,  if  a  partial match is found, pcre2_match() immediately returns
@ -1930,15 +1929,15 @@ NEWLINE HANDLING WHEN MATCHING
       ally the standard convention for the operating system. The default  can
       be  overridden  in  a  compile  context.   During matching, the newline
       choice affects  the  behaviour  of  the  dot,  circumflex,  and  dollar
-       metacharacters.  It  may  also  alter  the  way  the  match position is
+       metacharacters.  It  may also alter the way the match starting position
-       advanced after a match failure for an unanchored pattern.
+       is advanced after a match failure for an unanchored pattern.
       When PCRE2_NEWLINE_CRLF, PCRE2_NEWLINE_ANYCRLF, or PCRE2_NEWLINE_ANY is
-       set,  and a match attempt for an unanchored pattern fails when the cur-
+       set  as  the  newline convention, and a match attempt for an unanchored
-       rent position is at a  CRLF  sequence,  and  the  pattern  contains  no
+       pattern fails when the current starting position is at a CRLF sequence,
-       explicit  matches  for  CR  or  LF  characters,  the  match position is
+       and  the  pattern contains no explicit matches for CR or LF characters,
-       advanced by two characters instead of one, in other words, to after the
+       the match position is advanced by two characters  instead  of  one,  in
-       CRLF.
+       other words, to after the CRLF.
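
The advance rule described above can be sketched as a small decision function. This is an illustration under stated assumptions, not the library's implementation: the enum and function names are hypothetical, the subject is treated as one character per code unit, and whether the pattern contains an explicit CR or LF is passed in as a flag rather than derived from a compiled pattern.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the newline conventions named in the text. */
enum sketch_newline { NL_CR, NL_LF, NL_CRLF, NL_ANY, NL_ANYCRLF };

/* Return the next starting position after a failed match attempt at `pos`
   for an unanchored pattern. */
static size_t next_start(const char *subject, size_t length, size_t pos,
                         enum sketch_newline nl,
                         bool pattern_has_explicit_cr_or_lf)
{
    assert(subject != NULL && pos < length);

    /* Only these three conventions recognize CRLF as a newline. */
    bool crlf_is_newline =
        (nl == NL_CRLF || nl == NL_ANY || nl == NL_ANYCRLF);

    /* Is the current starting position at a CR immediately followed by LF? */
    bool at_crlf = pos + 1 < length &&
                   subject[pos] == '\r' && subject[pos + 1] == '\n';

    if (crlf_is_newline && at_crlf && !pattern_has_explicit_cr_or_lf)
        return pos + 2;   /* step over the whole CRLF */
    return pos + 1;       /* ordinary one-character advance */
}
```

For the subject "a\r\nb" with a failed attempt at offset 1, this sketch moves to offset 3 under the CRLF convention, and to offset 2 under the LF convention or when the pattern contains an explicit \r or \n.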
       The above rule is a compromise that makes the most common cases work as
       expected. For example, if the pattern  is  .+A  (and  the  PCRE2_DOTALL
@ -1950,8 +1949,8 @@ NEWLINE HANDLING WHEN MATCHING
       An explicit match for CR or LF is either a literal appearance of one of
       those characters in the  pattern,  or  one  of  the  \r  or  \n  escape
-       sequences.  Implicit  matches  such  as  [^X] do not count, nor does \s
+       sequences.  Implicit  matches  such  as [^X] do not count, nor does \s,
-       (which includes CR and LF in the characters that it matches).
+       even though it includes CR and LF in the characters that it matches.
       Notwithstanding the above, anomalous effects may still occur when  CRLF
       is a valid newline sequence and explicit \r or \n escapes appear in the
@ -1968,19 +1967,20 @@ HOW PCRE2_MATCH() RETURNS A STRING AND CAPTURED SUBSTRINGS
       addition,  further  substrings  from  the  subject may be picked out by
       parenthesized parts of the pattern.  Following  the  usage  in  Jeffrey
       Friedl's  book,  this  is  called  "capturing" in what follows, and the
-       phrase "capturing subpattern" is used for a fragment of a pattern  that
+       phrase "capturing subpattern" or "capturing group" is used for a  frag-
-       picks out a substring.  PCRE2 supports several other kinds of parenthe-
+       ment  of  a  pattern that picks out a substring. PCRE2 supports several
-       sized subpattern that do not  cause  substrings  to  be  captured.  The
+       other kinds of parenthesized subpattern that do not cause substrings to
-       pcre2_pattern_info()  function can be used to find out how many captur-
+       be  captured. The pcre2_pattern_info() function can be used to find out
-       ing subpatterns there are in a compiled pattern.
+       how many capturing subpatterns there are in a compiled pattern.
       The overall matched string and any captured substrings are returned  to
-       the  caller via a vector of PCRE2_SIZE values, called the ovector. This
+       the  caller via a vector of PCRE2_SIZE values. This is called the ovec-
-       is contained within the match data block.  You can obtain direct access
+       tor, and is contained within the match  data  block.   You  can  obtain
-       to  the  ovector  by  calling  pcre2_get_ovector_pointer()  to find its
+       direct  access to the ovector by calling pcre2_get_ovector_pointer() to
-       address, and pcre2_get_ovector_count() to find the number of  pairs  of
+       find its address, and pcre2_get_ovector_count() to find the  number  of
-       values  it contains. Alternatively, you can use the auxiliary functions
+       pairs  of  values it contains. Alternatively, you can use the auxiliary
-       for accessing captured substrings by number or by name (see below).
+       functions for accessing captured substrings by number or by  name  (see
       below).
       Within the ovector, the first in each pair of values is set to the off-
       set of the first code unit of a substring, and the second is set to the
@ -2033,15 +2033,16 @@ HOW PCRE2_MATCH() RETURNS A STRING AND CAPTURED SUBSTRINGS
       pcre2_match().  The  other  elements retain whatever values they previ-
       ously had.
-   Other information about the match
+
 OTHER INFORMATION ABOUT A MATCH
       PCRE2_SPTR pcre2_get_mark(pcre2_match_data *match_data);
       PCRE2_SIZE pcre2_get_startchar(pcre2_match_data *match_data);
-       In  addition  to  the offsets in the ovector, other information about a
+       As well as the offsets in the ovector, other information about a  match
-       match is retained in the match data block and can be retrieved  by  the
+       is  retained  in the match data block and can be retrieved by the above
-       above functions.
+       functions.
       When a (*MARK) name is to be passed back,  pcre2_get_mark()  returns  a
       pointer  to the zero-terminated name, which is within the compiled pat-
@ -2056,7 +2057,8 @@ HOW PCRE2_MATCH() RETURNS A STRING AND CAPTURED SUBSTRINGS
       value is always the same as ovector[0] because \K does not  affect  the
       result of a partial match.
-   Error return values from pcre2_match()
+
 ERROR RETURNS FROM pcre2_match()
       If  pcre2_match() fails, it returns a negative number. This can be con-
       verted to a text string by calling pcre2_get_error_message().  Negative
@ -2090,7 +2092,7 @@ HOW PCRE2_MATCH() RETURNS A STRING AND CAPTURED SUBSTRINGS
         PCRE2_ERROR_BADOFFSET
-       The value of startoffset greater than the length of the subject.
+       The value of startoffset was greater than the length of the subject.
         PCRE2_ERROR_BADOPTION
@ -2154,7 +2156,7 @@ HOW PCRE2_MATCH() RETURNS A STRING AND CAPTURED SUBSTRINGS
       the same position in the subject  string.  Some  simple  patterns  that
       might  do  this are detected and faulted at compile time, but more com-
       plicated cases, in particular mutual recursions between  two  different
-       subpatterns, cannot be detected until run time.
+       subpatterns, cannot be detected until matching is attempted.
         PCRE2_ERROR_RECURSIONLIMIT
@ -2201,8 +2203,8 @@ EXTRACTING CAPTURED SUBSTRINGS BY NUMBER
       The final arguments of pcre2_substring_copy_bynumber() are a pointer to
       the buffer and a pointer to a variable that contains its length in code
-       units.   This  is  updated  to  contain the actual number of code units
+       units.  This is updated to contain the actual number of code units used
-       used, excluding the terminating zero.
+       for the extracted substring, excluding the terminating zero.
       For pcre2_substring_get_bynumber() the third and fourth arguments point
       to variables that are updated with a pointer to the new memory and  the
@ -2234,11 +2236,11 @@ EXTRACTING A LIST OF ALL CAPTURED SUBSTRINGS
       void pcre2_substring_list_free(PCRE2_SPTR *list);
       The pcre2_substring_list_get() function  extracts  all  available  sub-
-       strings and builds a list of pointers to them, and a second  list  that
+       strings  and  builds  a  list of pointers to them. It also (optionally)
-       contains  their  lengths  (in code units), excluding a terminating zero
+       builds a second list that  contains  their  lengths  (in  code  units),
-       that is added to each of them. All this is done in a  single  block  of
+       excluding a terminating zero that is added to each of them. All this is
-       memory  that is obtained using the same memory allocation function that
+       done in a single block of memory that is obtained using the same memory
-       was used to get the match data block.
+       allocation function that was used to get the match data block.
       The  address of the memory block is returned via listptr, which is also
       the start of the list of string pointers. The end of the list is marked
@ -2254,7 +2256,7 @@ EXTRACTING A LIST OF ALL CAPTURED SUBSTRINGS
       when capturing subpattern number n+1 matches some part of the  subject,
       but  subpattern n has not been used at all, it returns an empty string.
       This can be distinguished  from  a  genuine  zero-length  substring  by
-       inspecting the  appropriate  offset  in  the  ovector,  which  contains
+       inspecting  the  appropriate  offset  in  the  ovector,  which contains
       PCRE2_UNSET for unset substrings.
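
The distinction between an unset group and a genuinely empty substring can be sketched as follows. This is a minimal illustration: SKETCH_UNSET is a hypothetical stand-in for PCRE2_UNSET (assumed here to be the all-ones size value), size_t stands in for PCRE2_SIZE, and the helper names are not part of the PCRE2 API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for PCRE2_UNSET; the all-ones value is an assumption. */
#define SKETCH_UNSET (~(size_t)0)

/* True if group n took no part in the match. */
static bool group_is_unset(const size_t *ovector, size_t n)
{
    assert(ovector != NULL);
    return ovector[2 * n] == SKETCH_UNSET;
}

/* Length of group n in code units; an unset group is reported as 0, so
   callers that care about the difference must also call group_is_unset(). */
static size_t group_length(const size_t *ovector, size_t n)
{
    assert(ovector != NULL);
    if (group_is_unset(ovector, n))
        return 0;
    return ovector[2 * n + 1] - ovector[2 * n];
}
```

With an ovector of {0, 3, unset, unset, 1, 1}, group 1 is unset while group 2 is a real zero-length match at offset 1; both have length 0, and only the offset check tells them apart.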
@ -2288,12 +2290,11 @@ EXTRACTING CAPTURED SUBSTRINGS BY NAME
       there is more than one subpattern of that name.
       Given the number, you can extract the substring directly, or use one of
-       the functions described in the previous section. For convenience, there
+       the functions described above. For convenience, there are also "byname"
-       are also "byname" functions that correspond  to  the  "bynumber"  func-
+       functions that correspond to the "bynumber" functions, the only differ-
-       tions,  the  only  difference  being that the second argument is a name
+       ence being that the second argument is a name instead of a number. How-
-       instead of a number.  However, if PCRE2_DUPNAMES is set and  there  are
+       ever,  if  PCRE2_DUPNAMES is set and there are duplicate names, the be-
-       duplicate  names,  the behaviour may not be what you want (see the next
+       haviour may not be what you want.
       section).
       Warning: If the pattern uses the (?| feature to set up multiple subpat-
       terns  with  the  same number, as described in the section on duplicate
@ -2331,8 +2332,8 @@ CREATING A NEW STRING WITH SUBSTITUTIONS
       brackets are required only if the following character would  be  inter-
       preted as part of the number or name. The number may be zero to include
       the entire matched string.   For  example,  if  the  pattern  a(b)c  is
-       matched  with "[abc]" and the replacement string "+$1$0$1+", the result
+       matched  with "=abc=" and the replacement string "+$1$0$1+", the result
-       is "[+babcb+]". Group insertion is done by calling  pcre2_copy_byname()
+       is "=+babcb+=". Group insertion is done by calling  pcre2_copy_byname()
       or pcre2_copy_bynumber() as appropriate.
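
The "=abc=" example above can be traced with a small stand-alone sketch. It is not how pcre2_substitute() is implemented: the function names are hypothetical, only single-digit $n references are handled (no braces, names, or unset groups), a single match is replaced, and the offsets that a match of a(b)c against "=abc=" would produce are supplied by hand.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_FAIL ((size_t)-1)

/* Expand `replacement` into `out`, replacing each $n (one digit) with the
   text of group n. Returns the length written, excluding the terminating
   zero, or SKETCH_FAIL if the buffer is too small or n is out of range. */
static size_t expand_replacement(const char *subject, const size_t *ovector,
                                 size_t pair_count, const char *replacement,
                                 char *out, size_t out_size)
{
    size_t used = 0;
    assert(subject && ovector && replacement && out);
    for (const char *p = replacement; *p != '\0'; p++) {
        const char *piece = p;      /* default: copy one literal character */
        size_t piece_len = 1;
        if (*p == '$' && p[1] >= '0' && p[1] <= '9') {
            size_t n = (size_t)(p[1] - '0');
            if (n >= pair_count) return SKETCH_FAIL;
            piece = subject + ovector[2 * n];
            piece_len = ovector[2 * n + 1] - ovector[2 * n];
            p++;                    /* skip the digit */
        }
        if (used + piece_len >= out_size) return SKETCH_FAIL;
        memcpy(out + used, piece, piece_len);
        used += piece_len;
    }
    if (used >= out_size) return SKETCH_FAIL;
    out[used] = '\0';
    return used;
}

/* Build: text before the match + expanded replacement + text after it. */
static size_t substitute_once(const char *subject, const size_t *ovector,
                              size_t pair_count, const char *replacement,
                              char *out, size_t out_size)
{
    size_t prefix_len = ovector[0];
    size_t suffix_len = strlen(subject) - ovector[1];
    if (prefix_len >= out_size) return SKETCH_FAIL;
    memcpy(out, subject, prefix_len);
    size_t mid = expand_replacement(subject, ovector, pair_count, replacement,
                                    out + prefix_len, out_size - prefix_len);
    if (mid == SKETCH_FAIL) return SKETCH_FAIL;
    size_t used = prefix_len + mid;
    if (used + suffix_len >= out_size) return SKETCH_FAIL;
    memcpy(out + used, subject + ovector[1], suffix_len + 1); /* with zero */
    return used + suffix_len;
}
```

For the subject "=abc=", the whole match occupies offsets 1 to 4 and group 1 occupies offsets 2 to 3, so the ovector is {1, 4, 2, 3}. Calling substitute_once() with the replacement "+$1$0$1+" then yields the string "=+babcb+=" given in the text.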
       The  first  seven  arguments  of pcre2_substitute() are the same as for
@ -2382,19 +2383,20 @@ DUPLICATE SUBPATTERN NAMES
       pcre2_substring_get_byname()  return  the first substring corresponding
       to the given name that is set. If none are set, PCRE2_ERROR_NOSUBSTRING
       is  returned.  The  pcre2_substring_number_from_name() function returns
-       one of the numbers that are associated with the name,  but  it  is  not
+       the error PCRE2_ERROR_NOUNIQUESUBSTRING.
       defined which it is.
       If you want to get full details of all captured substrings for a  given
       name,  you  must use the pcre2_substring_nametable_scan() function. The
       first argument is the compiled pattern, and the second is the name.  If
       the  third  and fourth arguments are NULL, the function returns a group
-       number (it is not defined which). Otherwise, the third and fourth argu-
+       number for a unique name, or PCRE2_ERROR_NOUNIQUESUBSTRING otherwise.
-       ments must be pointers to variables that are updated by  the  function.
+
-       After it has run, they point to the first and last entries in the name-
+       When the third and fourth arguments are not NULL, they must be pointers
-       to-number table for the given name, and the function returns the length
+       to  variables  that are updated by the function. After it has run, they
-       of  each  entry.  In both cases, PCRE2_ERROR_NOSUBSTRING is returned if
+       point to the first and last entries in the name-to-number table for the
-       there are no entries for the given name.
+       given  name,  and the function returns the length of each entry in code
       units. In both cases, PCRE2_ERROR_NOSUBSTRING is returned if there  are
       no entries for the given name.
       The format of the name table is described above in the section entitled
       Information about a pattern. Given all the relevant entries for
@ -2402,15 +2404,15 @@ DUPLICATE SUBPATTERN NAMES
       data.
-FINDING ALL POSSIBLE MATCHES
+FINDING ALL POSSIBLE MATCHES AT ONE POSITION
       The traditional matching function uses a  similar  algorithm  to  Perl,
-       which stops when it finds the first match, starting at a given point in
+       which  stops when it finds the first match at a given point in the sub-
-       the  subject.  If you want to find all possible matches, or the longest
+       ject. If you want to find all possible matches, or the longest possible
-       possible match at a given  position,  consider  using  the  alternative
+       match  at  a  given  position,  consider using the alternative matching
-       matching  function (see below) instead.  If you cannot use the alterna-
+       function (see below) instead. If you cannot use the  alternative  func-
-       tive function, you can kludge it up by making use of the callout facil-
+       tion, you can kludge it up by making use of the callout facility, which
-       ity, which is described in the pcre2callout documentation.
+       is described in the pcre2callout documentation.
       What you have to do is to insert a callout right at the end of the pat-
       tern.   When your callout function is called, extract and save the cur-
@ -2538,12 +2540,11 @@ MATCHING A PATTERN: THE ALTERNATIVE FUNCTION
       NOTE:  PCRE2's  "auto-possessification" optimization usually applies to
       character repeats at the end of a pattern (as well as internally).  For
-       example, the pattern "a\d+" is compiled as if it were  "a\d++"  because
+       example,  the pattern "a\d+" is compiled as if it were "a\d++". For DFA
       there  is  no  point  in backtracking into the repeated digits. For DFA
       matching, this means that only one possible  match  is  found.  If  you
       really  do  want multiple matches in such cases, either use an ungreedy
-       repeat ("a\d+?") or set the PCRE2_NO_AUTO_POSSESS option  when  compil-
+       repeat such as "a\d+?" or set  the  PCRE2_NO_AUTO_POSSESS  option  when
-       ing.
+       compiling.
   Error returns from pcre2_dfa_match()
@ -2578,7 +2579,7 @@ MATCHING A PATTERN: THE ALTERNATIVE FUNCTION
         PCRE2_ERROR_DFA_BADRESTART
-       When pcre2_dfa_match() is called  with  the  pcre2_dfa_RESTART  option,
+       When pcre2_dfa_match() is called  with  the  PCRE2_DFA_RESTART  option,
       some  plausibility  checks  are  made on the contents of the workspace,
       which should contain data about the previous partial match. If  any  of
       these checks fail, this error is given.
@ -2586,21 +2587,21 @@ MATCHING A PATTERN: THE ALTERNATIVE FUNCTION
 SEE ALSO
-       pcre2build(3),    pcre2libs(3),    pcre2callout(3),   pcre2matching(3),
+       pcre2build(3),    pcre2callout(3),    pcre2demo(3),   pcre2matching(3),
-       pcre2partial(3),    pcre2posix(3),    pcre2demo(3),     pcre2sample(3),
+       pcre2partial(3),    pcre2posix(3),    pcre2sample(3),    pcre2stack(3),
-       pcre2stack(3).
+       pcre2unicode(3).
 AUTHOR
       Philip Hazel
       University Computing Service
-       Cambridge CB2 3QH, England.
+       Cambridge, England.
 REVISION
-       Last updated: 11 November 2014
+       Last updated: 21 November 2014
       Copyright (c) 1997-2014 University of Cambridge.
 ------------------------------------------------------------------------------
@ -3043,7 +3044,7 @@ AUTHOR
       Philip Hazel
       University Computing Service
-       Cambridge CB2 3QH, England.
+       Cambridge, England.
 REVISION
@ -3279,7 +3280,7 @@ AUTHOR
       Philip Hazel
       University Computing Service
-       Cambridge CB2 3QH, England.
+       Cambridge, England.
 REVISION
@ -3465,7 +3466,7 @@ AUTHOR
       Philip Hazel
       University Computing Service
-       Cambridge CB2 3QH, England.
+       Cambridge, England.
 REVISION
@ -3849,7 +3850,7 @@ AUTHOR
       Philip Hazel (FAQ by Zoltan Herczeg)
       University Computing Service
-       Cambridge CB2 3QH, England.
+       Cambridge, England.
 REVISION
@ -3917,7 +3918,7 @@ AUTHOR
       Philip Hazel
       University Computing Service
-       Cambridge CB2 3QH, England.
+       Cambridge, England.
 REVISION
@ -4136,7 +4137,7 @@ AUTHOR
       Philip Hazel
       University Computing Service
-       Cambridge CB2 3QH, England.
+       Cambridge, England.
 REVISION
@ -4576,7 +4577,7 @@ AUTHOR
       Philip Hazel
       University Computing Service
-       Cambridge CB2 3QH, England.
+       Cambridge, England.
 REVISION
@ -4801,7 +4802,7 @@ AUTHOR
       Philip Hazel
       University Computing Service
-       Cambridge CB2 3QH, England.
+       Cambridge, England.
 REVISION
--- a/doc/pcre2api.3
+++ b/doc/pcre2api.3
@ -1,4 +1,4 @@
-.TH PCRE2API 3 "18 November 2014" "PCRE2 10.00"
+.TH PCRE2API 3 "21 November 2014" "PCRE2 10.00"
 .SH NAME
 PCRE2 - Perl-compatible regular expressions (revised API)
 .sp
@ -1569,15 +1569,17 @@ values.
 .P
 The map consists of a number of fixed-size entries. PCRE2_INFO_NAMECOUNT gives
 the number of entries, and PCRE2_INFO_NAMEENTRYSIZE gives the size of each
-entry; both of these return a \fBuint32_t\fP value. The entry size depends on
+entry in code units; both of these return a \fBuint32_t\fP value. The entry
-the length of the longest name. PCRE2_INFO_NAMETABLE returns a pointer to the
+size depends on the length of the longest name.
-first entry of the table. This is a PCRE2_SPTR pointer to a block of code
+.P
-units. In the 8-bit library, the first two bytes of each entry are the number
+PCRE2_INFO_NAMETABLE returns a pointer to the first entry of the table. This is
-of the capturing parenthesis, most significant byte first. In the 16-bit
+a PCRE2_SPTR pointer to a block of code units. In the 8-bit library, the first
-library, the pointer points to 16-bit data units, the first of which contains
+two bytes of each entry are the number of the capturing parenthesis, most
-the parenthesis number. In the 32-bit library, the pointer points to 32-bit
+significant byte first. In the 16-bit library, the pointer points to 16-bit
-data units, the first of which contains the parenthesis number. The rest of the
+code units, the first of which contains the parenthesis number. In the 32-bit
-entry is the corresponding name, zero terminated.
+library, the pointer points to 32-bit code units, the first of which contains
 the parenthesis number. The rest of the entry is the corresponding name, zero
 terminated.
 .P
 The names are in alphabetical order. If (?| is used to create multiple groups
 with the same number, as described in the
@ -1835,17 +1837,18 @@ matching.
 .sp
 This option specifies that the first character of the subject string is not the
 beginning of a line, so the circumflex metacharacter should not match before
-it. Setting this without PCRE2_MULTILINE (at compile time) causes circumflex
+it. Setting this without having set PCRE2_MULTILINE at compile time causes
-never to match. This option affects only the behaviour of the circumflex
+circumflex never to match. This option affects only the behaviour of the
-metacharacter. It does not affect \eA.
+circumflex metacharacter. It does not affect \eA.
 .sp
  PCRE2_NOTEOL
 .sp
 This option specifies that the end of the subject string is not the end of a
 line, so the dollar metacharacter should not match it nor (except in multiline
-mode) a newline immediately before it. Setting this without PCRE2_MULTILINE (at
+mode) a newline immediately before it. Setting this without having set
-compile time) causes dollar never to match. This option affects only the
+PCRE2_MULTILINE at compile time causes dollar never to match. This option
-behaviour of the dollar metacharacter. It does not affect \eZ or \ez.
+affects only the behaviour of the dollar metacharacter. It does not affect \eZ
 or \ez.
 .sp
  PCRE2_NOTEMPTY
 .sp
@ -1857,13 +1860,16 @@ match the empty string, the entire match fails. For example, if the pattern
 .sp
 is applied to a string not beginning with "a" or "b", it matches an empty
 string at the start of the subject. With PCRE2_NOTEMPTY set, this match is not
-valid, so PCRE2 searches further into the string for occurrences of "a" or "b".
+valid, so \fBpcre2_match()\fP searches further into the string for occurrences
 of "a" or "b".
 .sp
  PCRE2_NOTEMPTY_ATSTART
 .sp
-This is like PCRE2_NOTEMPTY, except that an empty string match that is not at
+This is like PCRE2_NOTEMPTY, except that it locks out an empty string match
-the start of the subject is permitted. If the pattern is anchored, such a match
+only at the first matching position, that is, at the start of the subject plus
-can occur only if the pattern contains \eK.
+the starting offset. An empty string match later in the subject is permitted.
 If the pattern is anchored, such a match can occur only if the pattern contains
 \eK.
 .sp
  PCRE2_NO_UTF_CHECK
 .sp
@ -1913,8 +1919,8 @@ subject characters to complete the match. If this happens when
 PCRE2_PARTIAL_SOFT (but not PCRE2_PARTIAL_HARD) is set, matching continues by
 testing any remaining alternatives. Only if no complete match can be found is
 PCRE2_ERROR_PARTIAL returned instead of PCRE2_ERROR_NOMATCH. In other words,
-PCRE2_PARTIAL_SOFT says that the caller is prepared to handle a partial match,
+PCRE2_PARTIAL_SOFT specifies that the caller is prepared to handle a partial
-but only if no complete match can be found.
+match, but only if no complete match can be found.
 .P
 If PCRE2_PARTIAL_HARD is set, it overrides PCRE2_PARTIAL_SOFT. In this case, if
 a partial match is found, \fBpcre2_match()\fP immediately returns
@ -1943,13 +1949,13 @@ compile context.
 .\"
 During matching, the newline choice affects the behaviour of the dot,
 circumflex, and dollar metacharacters. It may also alter the way the match
-position is advanced after a match failure for an unanchored pattern.
+starting position is advanced after a match failure for an unanchored pattern.
 .P
-When PCRE2_NEWLINE_CRLF, PCRE2_NEWLINE_ANYCRLF, or PCRE2_NEWLINE_ANY is set,
+When PCRE2_NEWLINE_CRLF, PCRE2_NEWLINE_ANYCRLF, or PCRE2_NEWLINE_ANY is set as
-and a match attempt for an unanchored pattern fails when the current position
+the newline convention, and a match attempt for an unanchored pattern fails
-is at a CRLF sequence, and the pattern contains no explicit matches for CR or
+when the current starting position is at a CRLF sequence, and the pattern
-LF characters, the match position is advanced by two characters instead of one,
+contains no explicit matches for CR or LF characters, the match position is
-in other words, to after the CRLF.
+advanced by two characters instead of one, in other words, to after the CRLF.
 .P
 The above rule is a compromise that makes the most common cases work as
 expected. For example, if the pattern is .+A (and the PCRE2_DOTALL option is
@ -1960,8 +1966,8 @@ reference, and so advances only by one character after the first failure.
 .P
 An explicit match for CR or LF is either a literal appearance of one of those
 characters in the pattern, or one of the \er or \en escape sequences. Implicit
-matches such as [^X] do not count, nor does \es (which includes CR and LF in
+matches such as [^X] do not count, nor does \es, even though it includes CR and
-the characters that it matches).
+LF in the characters that it matches.
 .P
 Notwithstanding the above, anomalous effects may still occur when CRLF is a
 valid newline sequence and explicit \er or \en escapes appear in the pattern.
@ -1981,15 +1987,15 @@ In general, a pattern matches a certain portion of the subject, and in
 addition, further substrings from the subject may be picked out by
 parenthesized parts of the pattern. Following the usage in Jeffrey Friedl's
 book, this is called "capturing" in what follows, and the phrase "capturing
-subpattern" is used for a fragment of a pattern that picks out a substring.
+subpattern" or "capturing group" is used for a fragment of a pattern that picks
-PCRE2 supports several other kinds of parenthesized subpattern that do not
+out a substring. PCRE2 supports several other kinds of parenthesized subpattern
-cause substrings to be captured. The \fBpcre2_pattern_info()\fP function can be
+that do not cause substrings to be captured. The \fBpcre2_pattern_info()\fP
-used to find out how many capturing subpatterns there are in a compiled
+function can be used to find out how many capturing subpatterns there are in a
-pattern.
+compiled pattern.
 .P
 The overall matched string and any captured substrings are returned to the
-caller via a vector of PCRE2_SIZE values, called the \fBovector\fP. This is
+caller via a vector of PCRE2_SIZE values. This is called the \fBovector\fP, and
-contained within the
+is contained within the
 .\" HTML <a href="#matchdatablock">
 .\" </a>
 match data block.
@ -2062,7 +2068,7 @@ had.
 .
 .
 .\" HTML <a name="matchotherdata"></a>
-.SS "Other information about the match"
+.SH "OTHER INFORMATION ABOUT A MATCH"
 .rs
 .sp
 .nf
@ -2071,7 +2077,7 @@ had.
 .B PCRE2_SIZE pcre2_get_startchar(pcre2_match_data *\fImatch_data\fP);
 .fi
 .P
-In addition to the offsets in the ovector, other information about a match is
+As well as the offsets in the ovector, other information about a match is
 retained in the match data block and can be retrieved by the above functions.
 .P
 When a (*MARK) name is to be passed back, \fBpcre2_get_mark()\fP returns a
@ -2087,7 +2093,7 @@ as \fIovector[0]\fP because \eK does not affect the result of a partial match.
 .
 .
 .\" HTML <a name="errorlist"></a>
-.SS "Error return values from \fBpcre2_match()\fP"
+.SH "ERROR RETURNS FROM \fBpcre2_match()\fP"
 .rs
 .sp
 If \fBpcre2_match()\fP fails, it returns a negative number. This can be
@ -2127,7 +2133,7 @@ passed to a 16-bit or 32-bit library function, or vice versa.
 .sp
  PCRE2_ERROR_BADOFFSET
 .sp
-The value of \fIstartoffset\fP greater than the length of the subject.
+The value of \fIstartoffset\fP was greater than the length of the subject.
 .sp
  PCRE2_ERROR_BADOPTION
 .sp
@ -2200,8 +2206,8 @@ the pattern. Specifically, it means that either the whole pattern or a
 subpattern has been called recursively for the second time at the same position
 in the subject string. Some simple patterns that might do this are detected and
 faulted at compile time, but more complicated cases, in particular mutual
-recursions between two different subpatterns, cannot be detected until run
+recursions between two different subpatterns, cannot be detected until matching
-time.
+is attempted.
 .sp
  PCRE2_ERROR_RECURSIONLIMIT
 .sp
@ -2254,8 +2260,8 @@ extract the captured substrings.
 .P
 The final arguments of \fBpcre2_substring_copy_bynumber()\fP are a pointer to
 the buffer and a pointer to a variable that contains its length in code units.
-This is updated to contain the actual number of code units used, excluding the
+This is updated to contain the actual number of code units used for the
-terminating zero.
+extracted substring, excluding the terminating zero.
 .P
 For \fBpcre2_substring_get_bynumber()\fP the third and fourth arguments point
 to variables that are updated with a pointer to the new memory and the number
@ -2290,10 +2296,11 @@ small to capture that group.
 .fi
 .P
 The \fBpcre2_substring_list_get()\fP function extracts all available substrings
-and builds a list of pointers to them, and a second list that contains their
+and builds a list of pointers to them. It also (optionally) builds a second
-lengths (in code units), excluding a terminating zero that is added to each of
+list that contains their lengths (in code units), excluding a terminating zero
-them. All this is done in a single block of memory that is obtained using the
+that is added to each of them. All this is done in a single block of memory
-same memory allocation function that was used to get the match data block.
+that is obtained using the same memory allocation function that was used to get
 the match data block.
 .P
 The address of the memory block is returned via \fIlistptr\fP, which is also
 the start of the list of string pointers. The end of the list is marked by a
@ -2309,7 +2316,7 @@ If this function encounters a substring that is unset, which can happen when
 capturing subpattern number \fIn+1\fP matches some part of the subject, but
 subpattern \fIn\fP has not been used at all, it returns an empty string. This
 can be distinguished from a genuine zero-length substring by inspecting the
-appropriate offset in the ovector, which contains PCRE2_UNSET for unset
+appropriate offset in the ovector, which contain PCRE2_UNSET for unset
 substrings.
 .
 .
@ -2347,11 +2354,10 @@ name, or PCRE2_ERROR_NOUNIQUESUBSTRING if there is more than one subpattern of
 that name.
 .P
 Given the number, you can extract the substring directly, or use one of the
-functions described in the previous section. For convenience, there are also
+functions described above. For convenience, there are also "byname" functions
-"byname" functions that correspond to the "bynumber" functions, the only
+that correspond to the "bynumber" functions, the only difference being that the
-difference being that the second argument is a name instead of a number.
+second argument is a name instead of a number. However, if PCRE2_DUPNAMES is
-However, if PCRE2_DUPNAMES is set and there are duplicate names,
+set and there are duplicate names, the behaviour may not be what you want.
-the behaviour may not be what you want (see the next section).
 .P
 \fBWarning:\fP If the pattern uses the (?| feature to set up multiple
 subpatterns with the same number, as described in the
@ -2398,8 +2404,8 @@ recognized:
 Either a group number or a group name can be given for <n>. Curly brackets are
 required only if the following character would be interpreted as part of the
 number or name. The number may be zero to include the entire matched string.
-For example, if the pattern a(b)c is matched with "[abc]" and the replacement
+For example, if the pattern a(b)c is matched with "=abc=" and the replacement
-string "+$1$0$1+", the result is "[+babcb+]". Group insertion is done by
+string "+$1$0$1+", the result is "=+babcb+=". Group insertion is done by
 calling \fBpcre2_substring_copy_byname()\fP or \fBpcre2_substring_copy_bynumber()\fP as
 appropriate.
 .P
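The group-insertion rule in the hunk above can be sketched in plain C. This is a minimal illustration, not PCRE2 code: expand_replacement() is a hypothetical helper that handles only numeric $n and ${n} references (not names), and the caller is assumed to supply the captured strings, with groups[0] holding the entire matched string.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of PCRE2. Expands $n and ${n} in `repl`
   using groups[0..ngroups-1], where groups[0] is the whole match.
   Returns the output length, or (size_t)-1 for an unknown group number,
   a missing closing bracket, or an output buffer that is too small. */
static size_t expand_replacement(const char *repl,
                                 const char *const *groups, size_t ngroups,
                                 char *out, size_t outsize)
{
    size_t used = 0;

    assert(outsize > 0);
    while (*repl != '\0') {
        const char *piece = repl;     /* default: copy one literal character */
        size_t len = 1;
        const char *next = repl + 1;

        if (*repl == '$') {
            const char *p = repl + 1;
            int braced = (*p == '{');
            if (braced) p++;
            if (isdigit((unsigned char)*p)) {
                size_t n = 0;
                while (isdigit((unsigned char)*p))
                    n = n * 10 + (size_t)(*p++ - '0');
                if (braced) {
                    if (*p != '}') return (size_t)-1;
                    p++;
                }
                if (n >= ngroups) return (size_t)-1;
                piece = groups[n];    /* insert the captured substring */
                len = strlen(piece);
                next = p;
            }
            /* Anything else after '$' is copied literally in this sketch. */
        }
        if (used + len >= outsize) return (size_t)-1;  /* keep room for NUL */
        memcpy(out + used, piece, len);
        used += len;
        repl = next;
    }
    out[used] = '\0';
    return used;
}
```

With groups {"abc", "b"}, as captured by a(b)c, the replacement "+$1$0$1+" expands to "+babcb+". Placed back into the subject "=abc=", that gives the "=+babcb+=" quoted in the text.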
@ -2452,18 +2458,19 @@ documentation.
 When duplicates are present, \fBpcre2_substring_copy_byname()\fP and
 \fBpcre2_substring_get_byname()\fP return the first substring corresponding to
 the given name that is set. If none are set, PCRE2_ERROR_NOSUBSTRING is
-returned. The \fBpcre2_substring_number_from_name()\fP function returns one of
+returned. The \fBpcre2_substring_number_from_name()\fP function returns
-the numbers that are associated with the name, but it is not defined which it
+the error PCRE2_ERROR_NOUNIQUESUBSTRING.
-is.
 .P
 If you want to get full details of all captured substrings for a given name,
 you must use the \fBpcre2_substring_nametable_scan()\fP function. The first
 argument is the compiled pattern, and the second is the name. If the third and
-fourth arguments are NULL, the function returns a group number (it is not
+fourth arguments are NULL, the function returns a group number for a unique
-defined which). Otherwise, the third and fourth arguments must be pointers to
+name, or PCRE2_ERROR_NOUNIQUESUBSTRING otherwise.
 .P
 When the third and fourth arguments are not NULL, they must be pointers to
 variables that are updated by the function. After it has run, they point to the
 first and last entries in the name-to-number table for the given name, and the
-function returns the length of each entry. In both cases,
+function returns the length of each entry in code units. In both cases,
 PCRE2_ERROR_NOSUBSTRING is returned if there are no entries for the given name.
 .P
 The format of the name table is described above in the section entitled
@ -2476,15 +2483,15 @@ Given all the relevant entries for the name, you can extract each of their
 numbers, and hence the captured data.
 .
 .
-.SH "FINDING ALL POSSIBLE MATCHES"
+.SH "FINDING ALL POSSIBLE MATCHES AT ONE POSITION"
 .rs
 .sp
 The traditional matching function uses a similar algorithm to Perl, which stops
-when it finds the first match, starting at a given point in the subject. If you
+when it finds the first match at a given point in the subject. If you want to
-want to find all possible matches, or the longest possible match at a given
+find all possible matches, or the longest possible match at a given position,
-position, consider using the alternative matching function (see below) instead.
+consider using the alternative matching function (see below) instead. If you
-If you cannot use the alternative function, you can kludge it up by making use
+cannot use the alternative function, you can kludge it up by making use of the
-of the callout facility, which is described in the
+callout facility, which is described in the
 .\" HREF
 \fBpcre2callout\fP
 .\"
@ -2628,11 +2635,10 @@ the longest matches.
 .P
 NOTE: PCRE2's "auto-possessification" optimization usually applies to character
 repeats at the end of a pattern (as well as internally). For example, the
-pattern "a\ed+" is compiled as if it were "a\ed++" because there is no point in
+pattern "a\ed+" is compiled as if it were "a\ed++". For DFA matching, this
-backtracking into the repeated digits. For DFA matching, this means that only
+means that only one possible match is found. If you really do want multiple
-one possible match is found. If you really do want multiple matches in such
+matches in such cases, either use an ungreedy repeat such as "a\ed+?" or set
-cases, either use an ungreedy repeat ("a\ed+?") or set the
+the PCRE2_NO_AUTO_POSSESS option when compiling.
-PCRE2_NO_AUTO_POSSESS option when compiling.
 .
 .
 .SS "Error returns from \fBpcre2_dfa_match()\fP"
@ -2673,7 +2679,7 @@ extremely rare, as a vector of size 1000 is used.
 .sp
  PCRE2_ERROR_DFA_BADRESTART
 .sp
-When \fBpcre2_dfa_match()\fP is called with the \fBpcre2_dfa_RESTART\fP option,
+When \fBpcre2_dfa_match()\fP is called with the \fBPCRE2_DFA_RESTART\fP option,
 some plausibility checks are made on the contents of the workspace, which
 should contain data about the previous partial match. If any of these checks
 fail, this error is given.
@ -2682,9 +2688,9 @@ fail, this error is given.
 .SH "SEE ALSO"
 .rs
 .sp
-\fBpcre2build\fP(3), \fBpcre2libs\fP(3), \fBpcre2callout\fP(3),
+\fBpcre2build\fP(3), \fBpcre2callout\fP(3), \fBpcre2demo\fP(3),
 \fBpcre2matching\fP(3), \fBpcre2partial\fP(3), \fBpcre2posix\fP(3),
-\fBpcre2demo(3)\fP, \fBpcre2sample\fP(3), \fBpcre2stack\fP(3).
+\fBpcre2sample\fP(3), \fBpcre2stack\fP(3), \fBpcre2unicode\fP(3).
 .
 .
 .SH AUTHOR
@ -2701,6 +2707,6 @@ Cambridge, England.
 .rs
 .sp
 .nf
-Last updated: 18 November 2014
+Last updated: 21 November 2014
 Copyright (c) 1997-2014 University of Cambridge.
 .fi
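The 8-bit name table layout described in the PCRE2_INFO_NAMETABLE hunk of this diff (fixed-size entries, two bytes of group number with the most significant byte first, then a zero-terminated name) can be decoded as in the sketch below. The function names are hypothetical and the code does not call PCRE2. It assumes the table pointer, entry count and entry size have already been obtained, for example from pcre2_pattern_info() with PCRE2_INFO_NAMETABLE, PCRE2_INFO_NAMECOUNT and PCRE2_INFO_NAMEENTRYSIZE.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Group number: first two bytes of an 8-bit entry, most significant first. */
static uint32_t nametable_entry_number(const uint8_t *entry)
{
    return ((uint32_t)entry[0] << 8) | entry[1];
}

/* The zero-terminated name follows the two number bytes. */
static const char *nametable_entry_name(const uint8_t *entry)
{
    return (const char *)(entry + 2);
}

/* Hypothetical lookup: linear scan over `count` entries of `entry_size`
   code units each. Returns the group number of the first entry whose name
   matches, or 0 if there is none (capturing groups are numbered from 1). */
static uint32_t nametable_find(const uint8_t *table, uint32_t count,
                               uint32_t entry_size, const char *name)
{
    assert(entry_size >= 3);   /* two number bytes plus at least a NUL */
    for (uint32_t i = 0; i < count; i++) {
        const uint8_t *entry = table + (size_t)i * entry_size;
        if (strcmp(nametable_entry_name(entry), name) == 0)
            return nametable_entry_number(entry);
    }
    return 0;
}
```

A linear scan keeps the sketch short. Because the names are stored in alphabetical order, a binary search would also work when names are unique.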