Documentation correction.

2015-02-20 09:38:54 +00:00 · 8fe95cf804
parent 52ba34a73c
commit 8fe95cf804
1 changed file with 99 additions and 90 deletions
--- a/39
+++ b/39
@@ -263,10 +263,13 @@ of repeat make use of these opcodes:
  OP_POSUPTO      OP_POSUPTOI
  OP_EXACT        OP_EXACTI
-Each of these is followed by a count and then the repeated character. OP_UPTO
-matches from 0 to the given number. A repeat with a non-zero minimum and a
-fixed maximum is coded as an OP_EXACT followed by an OP_UPTO (or OP_MINUPTO or
-OPT_POSUPTO).
+Each of these is followed by a count and then the repeated character. The count
+is two bytes long in 8-bit mode (most significant byte first), or one code unit
+in 16-bit and 32-bit modes.
+
 OP_UPTO matches from 0 to the given number. A repeat with a non-zero minimum
 and a fixed maximum is coded as an OP_EXACT followed by an OP_UPTO (or
 OP_MINUPTO or OPT_POSUPTO).
 Another set of matching repeating opcodes (called OP_NOTSTAR, OP_NOTSTARI,
 etc.) are used for repeated, negated, single-character classes such as [^a]*.
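
The count layout described in this hunk can be illustrated with a short C sketch. It reads the count that follows an OP_UPTO-style opcode and shows how a bounded repeat such as x{2,5} would split into an OP_EXACT part and an OP_UPTO part. All function and type names here are hypothetical and are not taken from the PCRE sources. The split assumes the OP_UPTO count is the maximum minus the minimum, which follows from OP_UPTO matching from 0.

```c
#include <stdint.h>

/* Hypothetical helper: read the two-byte repeat count that follows the
   opcode in 8-bit mode. The most significant byte comes first. */
static unsigned int read_count_8bit(const uint8_t *p)
{
    return ((unsigned int)p[0] << 8) | (unsigned int)p[1];
}

/* Hypothetical helper: in 16-bit mode the count occupies one code unit.
   The 32-bit case would look the same with a uint32_t pointer. */
static unsigned int read_count_16bit(const uint16_t *p)
{
    return p[0];
}

/* Hypothetical result type: the counts for the OP_EXACT item and the
   OP_UPTO item that together encode x{min,max}. */
struct repeat_split {
    unsigned int exact;  /* count stored after OP_EXACT */
    unsigned int upto;   /* count stored after OP_UPTO  */
};

/* Assumes min > 0 and max >= min. */
static struct repeat_split split_repeat(unsigned int min, unsigned int max)
{
    struct repeat_split s;
    s.exact = min;
    s.upto = max - min;
    return s;
}
```

Under these assumptions, x{2,5} would be coded as OP_EXACT with a count of 2 followed by OP_UPTO with a count of 3.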
@@ -330,19 +333,21 @@ negative one. In either case, the opcode is followed by a 32-byte (16-short,
 bits are counted from the least significant end of each unit. In caseless mode,
 bits for both cases are set.
-The reason for having both OP_CLASS and OP_NCLASS is so that, in UTF-8/16/32
-mode, subject characters with values greater than 255 can be handled correctly.
-For OP_CLASS they do not match, whereas for OP_NCLASS they do.
+The reason for having both OP_CLASS and OP_NCLASS is so that, in UTF-8 and
+16-bit and 32-bit modes, subject characters with values greater than 255 can be
+handled correctly. For OP_CLASS they do not match, whereas for OP_NCLASS they
 do.
 For classes containing characters with values greater than 255 or that contain
-\p or \P, OP_XCLASS is used. It optionally uses a bit map if any code points
-are less than 256, followed by a list of pairs (for a range) and single
-characters. In caseless mode, both cases are explicitly listed.
+\p or \P, OP_XCLASS is used. It optionally uses a bit map if any acceptable
+code points are less than 256, followed by a list of pairs (for a range) and
+single characters. In caseless mode, both cases are explicitly listed.
-OP_XCLASS is followed by a code unit containing flag bits: XCL_NOT indicates
-that this is a negative class, and XCL_MAP indicates that a bit map is present.
-There follows the bit map, if XCL_MAP is set, and then a sequence of items
-coded as follows:
+OP_XCLASS is followed by a LINK_SIZE item containing the total length of the
+opcode and its data. This is followed by a code unit containing flag bits:
+XCL_NOT indicates that this is a negative class, and XCL_MAP indicates that a
+bit map is present. There follows the bit map, if XCL_MAP is set, and then a
 sequence of items coded as follows:
  XCL_END      marks the end of the list
  XCL_SINGLE   one character follows
@@ -354,6 +359,10 @@ If a range starts with a code point less than 256 and ends with one greater
 than 256, it is split into two ranges, with characters less than 256 being
 indicated in the bit map, and the rest with XCL_RANGE.
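
The range split described above can be sketched in C as follows. This is only an illustration under stated assumptions: the names add_range and high_range are hypothetical, the bit map is taken to be the 32-byte form used in 8-bit mode, and bits are counted from the least significant end of each byte as the text states. It is not the code PCRE itself uses.

```c
#include <stdint.h>

/* Hypothetical result type: the part of a range at or above code point 256,
   which would be emitted as an XCL_RANGE item. */
struct high_range {
    int present;   /* non-zero if any code point >= 256 remains */
    uint32_t lo;
    uint32_t hi;
};

/* Add the range [lo, hi] to a class. Code points below 256 are recorded in
   the 32-byte bit map; whatever lies at or above 256 is returned so the
   caller can encode it as a range item. Assumes lo <= hi. */
static struct high_range add_range(uint8_t map[32], uint32_t lo, uint32_t hi)
{
    struct high_range rest = { 0, 0, 0 };
    uint32_t c;

    for (c = lo; c <= hi && c < 256; c++)
        map[c >> 3] |= (uint8_t)(1u << (c & 7));

    if (hi >= 256) {
        rest.present = 1;
        rest.lo = (lo < 256) ? 256 : lo;
        rest.hi = hi;
    }
    return rest;
}
```

For example, the range 0xF0 to 0x120 would set the bits for 0xF0 to 0xFF in the bit map and leave 0x100 to 0x120 to be listed as a range.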
 When XCL_NOT is set, the bit map, if present, contains bits for characters that
 are allowed (exactly as for OP_NCLASS), but the list of items that follow it
 specifies characters and properties that are not allowed.
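
The difference between OP_CLASS and OP_NCLASS for subject characters above 255 can be shown with a minimal C sketch. The enum values and the function name class_matches are hypothetical and exist only for this illustration; they are not PCRE's real opcode numbers or internal functions. The sketch assumes the 32-byte bit map of 8-bit mode, with bits counted from the least significant end of each byte and set for the characters that are allowed.

```c
#include <stdint.h>

/* Hypothetical tags for this sketch only; not PCRE's real opcode values. */
enum class_kind { KIND_CLASS, KIND_NCLASS };

/* Return non-zero if character c matches a bit-map class.
   Characters above 255 have no bit in the map: they never match a
   positive class (OP_CLASS) and always match a negative one (OP_NCLASS). */
static int class_matches(enum class_kind kind, const uint8_t map[32],
                         uint32_t c)
{
    if (c > 255)
        return kind == KIND_NCLASS;
    return (map[c >> 3] & (1u << (c & 7))) != 0;
}
```

With a map that has only the bit for 'a' set, 'a' matches and 'b' does not under either kind, while a character such as 0x100 matches only when the kind is KIND_NCLASS.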
 Back references
 ---------------
@@ -545,4 +554,4 @@ not a real opcode, but is used to check that tables indexed by opcode are the
 correct length, in order to catch updating errors.
 Philip Hazel
-August 2014
+February 2015