From 5b6c797a4d05c79bf1fb46a52e1920d9b5482d28 Mon Sep 17 00:00:00 2001
From: "Philip.Hazel"
-Last updated: 17 June 2016
+Last updated: 06 July 2016
Copyright © 1997-2016 University of Cambridge.
diff --git a/doc/html/pcre2unicode.html b/doc/html/pcre2unicode.html
index 7af55c3..6ca367f 100644
--- a/doc/html/pcre2unicode.html
+++ b/doc/html/pcre2unicode.html
@@ -67,16 +67,20 @@ In UTF modes, the dot metacharacter matches one UTF character instead of a
single code unit.
-The escape sequence \C can be used to match a single code unit, in a UTF mode,
+The escape sequence \C can be used to match a single code unit in a UTF mode,
but its use can lead to some strange effects because it breaks up multi-unit characters (see the description of \C in the pcre2pattern
-documentation). The use of \C is not supported by the alternative matching
-function pcre2_dfa_match() when in UTF mode. Its use provokes a
-match-time error. The JIT optimization also does not support \C in UTF mode.
-If JIT optimization is requested for a UTF pattern that contains \C, it will
-not succeed, and so the matching will be carried out by the normal interpretive
-function.
+documentation).
+
+
+The use of \C is not supported by the alternative matching function
+pcre2_dfa_match() when in UTF-8 or UTF-16 mode, that is, when a character
+may consist of more than one code unit. The use of \C in these modes provokes
+a match-time error. Also, the JIT optimization does not support \C in these
+modes. If JIT optimization is requested for a UTF-8 or UTF-16 pattern that
+contains \C, it will not succeed, and so when pcre2_match() is called,
+the matching will be carried out by the normal interpretive function.
The character escapes \b, \B, \d, \D, \s, \S, \w, and \W correctly test
@@ -244,9 +248,9 @@ Errors in UTF-16 strings
The following negative error codes are given for invalid UTF-16 strings:
-  PCRE_UTF16_ERR1  Missing low surrogate at end of string
-  PCRE_UTF16_ERR2  Invalid low surrogate follows high surrogate
-  PCRE_UTF16_ERR3  Isolated low surrogate
+  PCRE2_ERROR_UTF16_ERR1  Missing low surrogate at end of string
+  PCRE2_ERROR_UTF16_ERR2  Invalid low surrogate follows high surrogate
+  PCRE2_ERROR_UTF16_ERR3  Isolated low surrogate
@@ -256,8 +260,8 @@ Errors in UTF-32 strings
The following negative error codes are given for invalid UTF-32 strings:
-  PCRE_UTF32_ERR1  Surrogate character (range from 0xd800 to 0xdfff)
-  PCRE_UTF32_ERR2  Code point is greater than 0x10ffff
+  PCRE2_ERROR_UTF32_ERR1  Surrogate character (0xd800 to 0xdfff)
+  PCRE2_ERROR_UTF32_ERR2  Code point is greater than 0x10ffff
@@ -276,9 +280,9 @@ Cambridge, England.
REVISION
-Last updated: 16 October 2015
+Last updated: 03 July 2016
-Copyright © 1997-2015 University of Cambridge.
+Copyright © 1997-2016 University of Cambridge.
Return to the PCRE2 index page.
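
Note on the UTF-16 hunk above: the three renamed error codes correspond to three distinct surrogate failures. The following is a minimal sketch of how those conditions can be told apart, not PCRE2's own validity check. The function name check_utf16() and the enum values are hypothetical and chosen only for illustration; they are not the numeric values of the PCRE2_ERROR_UTF16_ERRn codes.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical result codes for this sketch only. They mirror the three
   conditions in the documentation but are NOT PCRE2's numeric values. */
enum {
  UTF16_OK = 0,
  UTF16_MISSING_LOW = 1,   /* cf. ERR1: missing low surrogate at end of string */
  UTF16_INVALID_LOW = 2,   /* cf. ERR2: invalid low surrogate follows high surrogate */
  UTF16_ISOLATED_LOW = 3   /* cf. ERR3: isolated low surrogate */
};

/* Scan len 16-bit code units and report the first surrogate problem found. */
int check_utf16(const uint16_t *s, size_t len)
{
  for (size_t i = 0; i < len; i++) {
    uint16_t c = s[i];
    if ((c & 0xf800) != 0xd800) continue;          /* not a surrogate at all */
    if ((c & 0x0400) != 0) return UTF16_ISOLATED_LOW; /* low surrogate with no high before it */
    /* c is a high surrogate (0xd800..0xdbff): a low one must follow */
    if (i + 1 >= len) return UTF16_MISSING_LOW;
    if ((s[i + 1] & 0xfc00) != 0xdc00) return UTF16_INVALID_LOW;
    i++;                                           /* skip the low surrogate of a valid pair */
  }
  return UTF16_OK;
}
```

A caller would pass the subject as an array of code units together with its length in code units, in the same spirit as the length argument used in 16-bit mode.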
diff --git a/doc/pcre2.txt b/doc/pcre2.txt
index fe66fb4..8f4e8a1 100644
--- a/doc/pcre2.txt
+++ b/doc/pcre2.txt
@@ -9740,15 +9740,19 @@ WIDE CHARACTERS AND UTF MODES
In UTF modes, the dot metacharacter matches one UTF character instead
of a single code unit.
- The escape sequence \C can be used to match a single code unit, in a
- UTF mode, but its use can lead to some strange effects because it
- breaks up multi-unit characters (see the description of \C in the
- pcre2pattern documentation). The use of \C is not supported by the
- alternative matching function pcre2_dfa_match() when in UTF mode. Its
- use provokes a match-time error. The JIT optimization also does not
- support \C in UTF mode. If JIT optimization is requested for a UTF
- pattern that contains \C, it will not succeed, and so the matching will
- be carried out by the normal interpretive function.
+ The escape sequence \C can be used to match a single code unit in a UTF
+ mode, but its use can lead to some strange effects because it breaks up
+ multi-unit characters (see the description of \C in the pcre2pattern
+ documentation).
+
+ The use of \C is not supported by the alternative matching function
+ pcre2_dfa_match() when in UTF-8 or UTF-16 mode, that is, when a charac-
+ ter may consist of more than one code unit. The use of \C in these
+ modes provokes a match-time error. Also, the JIT optimization does not
+ support \C in these modes. If JIT optimization is requested for a UTF-8
+ or UTF-16 pattern that contains \C, it will not succeed, and so when
+ pcre2_match() is called, the matching will be carried out by the normal
+ interpretive function.
The character escapes \b, \B, \d, \D, \s, \S, \w, and \W correctly test
characters of any code value, but, by default, the characters that
@@ -9900,9 +9904,9 @@ VALIDITY OF UTF STRINGS
The following negative error codes are given for invalid UTF-16
strings:
- PCRE_UTF16_ERR1 Missing low surrogate at end of string
- PCRE_UTF16_ERR2 Invalid low surrogate follows high surrogate
- PCRE_UTF16_ERR3 Isolated low surrogate
+ PCRE2_ERROR_UTF16_ERR1 Missing low surrogate at end of string
+ PCRE2_ERROR_UTF16_ERR2 Invalid low surrogate follows high surrogate
+ PCRE2_ERROR_UTF16_ERR3 Isolated low surrogate
Errors in UTF-32 strings
@@ -9910,8 +9914,8 @@ VALIDITY OF UTF STRINGS
The following negative error codes are given for invalid UTF-32
strings:
- PCRE_UTF32_ERR1 Surrogate character (range from 0xd800 to 0xdfff)
- PCRE_UTF32_ERR2 Code point is greater than 0x10ffff
+ PCRE2_ERROR_UTF32_ERR1 Surrogate character (0xd800 to 0xdfff)
+ PCRE2_ERROR_UTF32_ERR2 Code point is greater than 0x10ffff
AUTHOR
@@ -9923,8 +9927,8 @@ AUTHOR
REVISION
- Last updated: 16 October 2015
- Copyright (c) 1997-2015 University of Cambridge.
+ Last updated: 03 July 2016
+ Copyright (c) 1997-2016 University of Cambridge.
------------------------------------------------------------------------------
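
Note on the UTF-32 hunk above: the two renamed error codes describe simple range tests on each 32-bit code unit. The following is a minimal sketch of those two tests, under the assumption that each array element holds one code point. The function name check_utf32() and the enum values are hypothetical; they are not PCRE2's implementation or its numeric error values.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical result codes for this sketch only; NOT PCRE2's numeric values. */
enum {
  UTF32_OK = 0,
  UTF32_SURROGATE = 1,   /* cf. ERR1: surrogate character (0xd800 to 0xdfff) */
  UTF32_TOO_LARGE = 2    /* cf. ERR2: code point is greater than 0x10ffff */
};

/* Scan len 32-bit code units and report the first invalid one found. */
int check_utf32(const uint32_t *s, size_t len)
{
  for (size_t i = 0; i < len; i++) {
    if (s[i] >= 0xd800 && s[i] <= 0xdfff) return UTF32_SURROGATE;
    if (s[i] > 0x10ffff) return UTF32_TOO_LARGE;
  }
  return UTF32_OK;
}
```

Because every UTF-32 character is exactly one code unit, no pairing logic is needed here, which is consistent with the patch restricting the \C limitation to UTF-8 and UTF-16 modes.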
diff --git a/doc/pcre2test.1 b/doc/pcre2test.1
index f639ec2..2fbf794 100644
--- a/doc/pcre2test.1
+++ b/doc/pcre2test.1
@@ -1055,7 +1055,7 @@ The \fBallcaptures\fP modifier requests that the values of all potential
captured parentheses be output after a match. By default, only those up to the
highest one actually used in the match are output (corresponding to the return
code from \fBpcre2_match()\fP). Groups that did not take part in the match
-are output as "