doc: formatting/typo fixes to documentation (#47)

* doc: fix incorrect use of JOIN and typo

Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>

* doc: reformat of pcre2_substitute to align options

includes some rewording to fit better in an 80 char wide troff output.

Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>

* doc: update names to pcre2
Carlo Marcelo Arenas Belón 2021-11-27 08:27:49 -08:00 committed by GitHub
parent c8d31f1605
commit 587b94277b
GPG Key ID: 4AEE18F83AFDEB23
15 changed files with 64 additions and 55 deletions

@ -30,8 +30,8 @@ This function sets additional option bits for <b>pcre2_compile()</b> that are
housed in a compile context. It completely replaces all the bits. The extra housed in a compile context. It completely replaces all the bits. The extra
options are: options are:
<pre> <pre>
PCRE2_EXTRA_ALLOW_LOOKAROUND_BSK Allow \K in lookarounds PCRE2_EXTRA_ALLOW_SURROGATE_ESCAPES Allow \x{df800} to \x{dfff} PCRE2_EXTRA_ALLOW_LOOKAROUND_BSK Allow \K in lookarounds
in UTF-8 and UTF-32 modes PCRE2_EXTRA_ALLOW_SURROGATE_ESCAPES Allow \x{d800} to \x{dfff} in UTF-8 and UTF-32 modes
PCRE2_EXTRA_ALT_BSUX Extended alternate \u, \U, and \x handling PCRE2_EXTRA_ALT_BSUX Extended alternate \u, \U, and \x handling
PCRE2_EXTRA_BAD_ESCAPE_IS_LITERAL Treat all invalid escapes as a literal following character PCRE2_EXTRA_BAD_ESCAPE_IS_LITERAL Treat all invalid escapes as a literal following character
PCRE2_EXTRA_ESCAPED_CR_IS_LF Interpret \r as \n PCRE2_EXTRA_ESCAPED_CR_IS_LF Interpret \r as \n
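
The range named by PCRE2_EXTRA_ALLOW_SURROGATE_ESCAPES, \x{d800} to \x{dfff}, is the Unicode surrogate block. A minimal sketch of that range check follows; is_surrogate() and count_surrogates() are hypothetical helpers written for illustration and are not part of the PCRE2 API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of PCRE2: true when cp lies in the
   Unicode surrogate range U+D800..U+DFFF (inclusive). */
static bool is_surrogate(uint32_t cp)
{
  return cp >= 0xD800u && cp <= 0xDFFFu;
}

/* Hypothetical helper: counts how many of the n code points in cps
   fall inside the surrogate range. */
static size_t count_surrogates(const uint32_t *cps, size_t n)
{
  size_t count = 0;
  size_t i;

  assert(cps != NULL || n == 0);
  for (i = 0; i < n; i++)
    {
    if (is_surrogate(cps[i])) count++;
    }
  return count;
}
```

Only code points from 0xD800 through 0xDFFF satisfy the check; 0xD7FF and 0xE000 lie just outside it.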

@ -69,18 +69,18 @@ The subject and replacement lengths can be given as PCRE2_ZERO_TERMINATED for
zero-terminated strings. The options are: zero-terminated strings. The options are:
<pre> <pre>
PCRE2_ANCHORED Match only at the first position PCRE2_ANCHORED Match only at the first position
PCRE2_ENDANCHORED Pattern can match only at end of subject PCRE2_ENDANCHORED Match only at end of subject
PCRE2_NOTBOL Subject is not the beginning of a line PCRE2_NOTBOL Subject is not the beginning of a line
PCRE2_NOTEOL Subject is not the end of a line PCRE2_NOTEOL Subject is not the end of a line
PCRE2_NOTEMPTY An empty string is not a valid match PCRE2_NOTEMPTY An empty string is not a valid match
PCRE2_NOTEMPTY_ATSTART An empty string at the start of the subject is not a valid match PCRE2_NOTEMPTY_ATSTART An empty string at the start of the subject is not a valid match
PCRE2_NO_JIT Do not use JIT matching PCRE2_NO_JIT Do not use JIT matching
PCRE2_NO_UTF_CHECK Do not check the subject or replacement for UTF validity (only relevant if PCRE2_NO_UTF_CHECK Do not check for UTF validity in the subject or replacement
PCRE2_UTF was set at compile time) (only relevant if PCRE2_UTF was set at compile time)
PCRE2_SUBSTITUTE_EXTENDED Do extended replacement processing PCRE2_SUBSTITUTE_EXTENDED Do extended replacement processing
PCRE2_SUBSTITUTE_GLOBAL Replace all occurrences in the subject PCRE2_SUBSTITUTE_GLOBAL Replace all occurrences in the subject
PCRE2_SUBSTITUTE_LITERAL The replacement string is literal PCRE2_SUBSTITUTE_LITERAL The replacement string is literal
PCRE2_SUBSTITUTE_MATCHED Use pre-existing match data for 1st match PCRE2_SUBSTITUTE_MATCHED Use pre-existing match data for first match
PCRE2_SUBSTITUTE_OVERFLOW_LENGTH If overflow, compute needed length PCRE2_SUBSTITUTE_OVERFLOW_LENGTH If overflow, compute needed length
PCRE2_SUBSTITUTE_REPLACEMENT_ONLY Return only replacement string(s) PCRE2_SUBSTITUTE_REPLACEMENT_ONLY Return only replacement string(s)
PCRE2_SUBSTITUTE_UNKNOWN_UNSET Treat unknown group as unset PCRE2_SUBSTITUTE_UNKNOWN_UNSET Treat unknown group as unset
@ -90,7 +90,7 @@ If PCRE2_SUBSTITUTE_LITERAL is set, PCRE2_SUBSTITUTE_EXTENDED,
PCRE2_SUBSTITUTE_UNKNOWN_UNSET, and PCRE2_SUBSTITUTE_UNSET_EMPTY are ignored. PCRE2_SUBSTITUTE_UNKNOWN_UNSET, and PCRE2_SUBSTITUTE_UNSET_EMPTY are ignored.
</P> </P>
<P> <P>
If PCRE2_SUBSTITUTE_MATCHED is set, <i>match_data</i> must be non-zero; its If PCRE2_SUBSTITUTE_MATCHED is set, <i>match_data</i> must be non-NULL; its
contents must be the result of a call to <b>pcre2_match()</b> using the same contents must be the result of a call to <b>pcre2_match()</b> using the same
pattern and subject. pattern and subject.
</P> </P>

@ -534,7 +534,7 @@ for themselves. For example, outside a character class:
\0113 is a tab followed by the character "3" \0113 is a tab followed by the character "3"
\113 might be a backreference, otherwise the character with octal code 113 \113 might be a backreference, otherwise the character with octal code 113
\377 might be a backreference, otherwise the value 255 (decimal) \377 might be a backreference, otherwise the value 255 (decimal)
\81 is always a backreference .sp \81 is always a backreference
</pre> </pre>
Note that octal values of 100 or greater that are specified using this syntax Note that octal values of 100 or greater that are specified using this syntax
must not be introduced by a leading zero, because no more than three octal must not be introduced by a leading zero, because no more than three octal
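
The three-digit limit behind the examples above can be sketched in C. This is an illustration only: read_octal_escape() is a hypothetical helper, not PCRE2 code, and it covers just the digit-reading step. Deciding whether a sequence such as \113 is a backreference is left out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of PCRE2: reads the octal digits that
   follow a backslash, stopping after at most three digits. Returns the
   accumulated value and stores the number of digits consumed in *ndigits. */
static unsigned int read_octal_escape(const char *digits, size_t *ndigits)
{
  unsigned int value = 0;
  size_t n = 0;

  assert(digits != NULL && ndigits != NULL);
  while (n < 3 && digits[n] >= '0' && digits[n] <= '7')
    {
    value = value * 8 + (unsigned int)(digits[n] - '0');
    n++;
    }
  *ndigits = n;
  return value;
}
```

Given "0113", the helper consumes "011" (value 9, a tab) and leaves the "3" as a separate character, which is why a leading zero changes the meaning of values of 100 octal or greater. Given "81", it consumes nothing, because 8 is not an octal digit.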

@ -18,9 +18,9 @@ This function sets additional option bits for \fBpcre2_compile()\fP that are
housed in a compile context. It completely replaces all the bits. The extra housed in a compile context. It completely replaces all the bits. The extra
options are: options are:
.sp .sp
.\" JOIN
PCRE2_EXTRA_ALLOW_LOOKAROUND_BSK Allow \eK in lookarounds PCRE2_EXTRA_ALLOW_LOOKAROUND_BSK Allow \eK in lookarounds
PCRE2_EXTRA_ALLOW_SURROGATE_ESCAPES Allow \ex{df800} to \ex{dfff} .\" JOIN
PCRE2_EXTRA_ALLOW_SURROGATE_ESCAPES Allow \ex{d800} to \ex{dfff}
in UTF-8 and UTF-32 modes in UTF-8 and UTF-32 modes
.\" JOIN .\" JOIN
PCRE2_EXTRA_ALT_BSUX Extended alternate \eu, \eU, and PCRE2_EXTRA_ALT_BSUX Extended alternate \eu, \eU, and

@ -56,22 +56,32 @@ The subject and replacement lengths can be given as PCRE2_ZERO_TERMINATED for
zero-terminated strings. The options are: zero-terminated strings. The options are:
.sp .sp
PCRE2_ANCHORED Match only at the first position PCRE2_ANCHORED Match only at the first position
PCRE2_ENDANCHORED Pattern can match only at end of subject PCRE2_ENDANCHORED Match only at end of subject
PCRE2_NOTBOL Subject is not the beginning of a line
PCRE2_NOTEOL Subject is not the end of a line
PCRE2_NOTEMPTY An empty string is not a valid match
.\" JOIN .\" JOIN
PCRE2_NOTEMPTY_ATSTART An empty string at the start of the PCRE2_NOTBOL Subject is not the beginning of a
subject is not a valid match line
PCRE2_NOTEOL Subject is not the end of a line
.\" JOIN
PCRE2_NOTEMPTY An empty string is not a
valid match
.\" JOIN
PCRE2_NOTEMPTY_ATSTART An empty string at the start of
the subject is not a valid match
PCRE2_NO_JIT Do not use JIT matching PCRE2_NO_JIT Do not use JIT matching
.\" JOIN .\" JOIN
PCRE2_NO_UTF_CHECK Do not check the subject or replacement PCRE2_NO_UTF_CHECK Do not check for UTF validity in
for UTF validity (only relevant if the subject or replacement
PCRE2_UTF was set at compile time) .\" JOIN
(only relevant if PCRE2_UTF was
set at compile time)
PCRE2_SUBSTITUTE_EXTENDED Do extended replacement processing PCRE2_SUBSTITUTE_EXTENDED Do extended replacement processing
PCRE2_SUBSTITUTE_GLOBAL Replace all occurrences in the subject .\" JOIN
PCRE2_SUBSTITUTE_GLOBAL Replace all occurrences in the
subject
PCRE2_SUBSTITUTE_LITERAL The replacement string is literal PCRE2_SUBSTITUTE_LITERAL The replacement string is literal
PCRE2_SUBSTITUTE_MATCHED Use pre-existing match data for 1st match .\" JOIN
PCRE2_SUBSTITUTE_MATCHED Use pre-existing match data for
first match
PCRE2_SUBSTITUTE_OVERFLOW_LENGTH If overflow, compute needed length PCRE2_SUBSTITUTE_OVERFLOW_LENGTH If overflow, compute needed length
PCRE2_SUBSTITUTE_REPLACEMENT_ONLY Return only replacement string(s) PCRE2_SUBSTITUTE_REPLACEMENT_ONLY Return only replacement string(s)
PCRE2_SUBSTITUTE_UNKNOWN_UNSET Treat unknown group as unset PCRE2_SUBSTITUTE_UNKNOWN_UNSET Treat unknown group as unset
@ -80,7 +90,7 @@ zero-terminated strings. The options are:
If PCRE2_SUBSTITUTE_LITERAL is set, PCRE2_SUBSTITUTE_EXTENDED, If PCRE2_SUBSTITUTE_LITERAL is set, PCRE2_SUBSTITUTE_EXTENDED,
PCRE2_SUBSTITUTE_UNKNOWN_UNSET, and PCRE2_SUBSTITUTE_UNSET_EMPTY are ignored. PCRE2_SUBSTITUTE_UNKNOWN_UNSET, and PCRE2_SUBSTITUTE_UNSET_EMPTY are ignored.
.P .P
If PCRE2_SUBSTITUTE_MATCHED is set, \fImatch_data\fP must be non-zero; its If PCRE2_SUBSTITUTE_MATCHED is set, \fImatch_data\fP must be non-NULL; its
contents must be the result of a call to \fBpcre2_match()\fP using the same contents must be the result of a call to \fBpcre2_match()\fP using the same
pattern and subject. pattern and subject.
.P .P

@ -1794,7 +1794,7 @@ it is set, the effect of passing an invalid UTF string as a pattern is
undefined. It may cause your program to crash or loop. undefined. It may cause your program to crash or loop.
.P .P
Note that this option can also be passed to \fBpcre2_match()\fP and Note that this option can also be passed to \fBpcre2_match()\fP and
\fBpcre_dfa_match()\fP, to suppress UTF validity checking of the subject \fBpcre2_dfa_match()\fP, to suppress UTF validity checking of the subject
string. string.
.P .P
Note also that setting PCRE2_NO_UTF_CHECK at compile time does not disable the Note also that setting PCRE2_NO_UTF_CHECK at compile time does not disable the
@ -3848,7 +3848,7 @@ Here is an example of a simple call to \fBpcre2_dfa_match()\fP:
wspace, /* working space vector */ wspace, /* working space vector */
20); /* number of elements (NOT size in bytes) */ 20); /* number of elements (NOT size in bytes) */
. .
.SS "Option bits for \fBpcre_dfa_match()\fP" .SS "Option bits for \fBpcre2_dfa_match()\fP"
.rs .rs
.sp .sp
The unused bits of the \fIoptions\fP argument for \fBpcre2_dfa_match()\fP must The unused bits of the \fIoptions\fP argument for \fBpcre2_dfa_match()\fP must

@ -509,7 +509,6 @@ for themselves. For example, outside a character class:
.\" JOIN .\" JOIN
\e377 might be a backreference, otherwise \e377 might be a backreference, otherwise
the value 255 (decimal) the value 255 (decimal)
.\" JOIN
\e81 is always a backreference \e81 is always a backreference
.sp .sp
Note that octal values of 100 or greater that are specified using this syntax Note that octal values of 100 or greater that are specified using this syntax

@ -47,7 +47,7 @@ format before being passed to the library functions. Results are converted back
to 8-bit code units for output. to 8-bit code units for output.
.P .P
In the rest of this document, the names of library functions and structures In the rest of this document, the names of library functions and structures
are given in generic form, for example, \fBpcre_compile()\fP. The actual are given in generic form, for example, \fBpcre2_compile()\fP. The actual
names used in the libraries have a suffix _8, _16, or _32, as appropriate. names used in the libraries have a suffix _8, _16, or _32, as appropriate.
. .
. .

@ -519,7 +519,7 @@ it is. This is called only in UTF-32 mode - we don't put a test within the
macro because almost all calls are already within a block of UTF-32 only macro because almost all calls are already within a block of UTF-32 only
code. code.
These are all no-ops since all UTF-32 characters fit into one pcre_uchar. */ These are all no-ops since all UTF-32 characters fit into one PCRE2_UCHAR. */
#define BACKCHAR(eptr) do { } while (0) #define BACKCHAR(eptr) do { } while (0)
@ -764,7 +764,7 @@ typedef struct pcre2_real_jit_stack {
} pcre2_real_jit_stack; } pcre2_real_jit_stack;
/* Structure for items in a linked list that represents an explicit recursive /* Structure for items in a linked list that represents an explicit recursive
call within the pattern when running pcre_dfa_match(). */ call within the pattern when running pcre2_dfa_match(). */
typedef struct dfa_recursion_info { typedef struct dfa_recursion_info {
struct dfa_recursion_info *prevrec; struct dfa_recursion_info *prevrec;

@ -13692,7 +13692,7 @@ if (!compiler)
} }
common->compiler = compiler; common->compiler = compiler;
/* Main pcre_jit_exec entry. */ /* Main pcre2_jit_exec entry. */
sljit_emit_enter(compiler, 0, SLJIT_ARG1(SW), 5, 5, 0, 0, private_data_size); sljit_emit_enter(compiler, 0, SLJIT_ARG1(SW), 5, 5, 0, 0, private_data_size);
/* Register init. */ /* Register init. */

@ -120,7 +120,7 @@ else if ((options & PCRE2_PARTIAL_SOFT) != 0)
if (functions == NULL || functions->executable_funcs[index] == NULL) if (functions == NULL || functions->executable_funcs[index] == NULL)
return PCRE2_ERROR_JIT_BADOPTION; return PCRE2_ERROR_JIT_BADOPTION;
/* Sanity checks should be handled by pcre_exec. */ /* Sanity checks should be handled by pcre2_match. */
arguments.str = subject + start_offset; arguments.str = subject + start_offset;
arguments.begin = subject; arguments.begin = subject;
arguments.end = subject + length; arguments.end = subject + length;

@ -108,7 +108,7 @@ int main(void)
pcre2_config_32(PCRE2_CONFIG_JIT, &jit); pcre2_config_32(PCRE2_CONFIG_JIT, &jit);
#endif #endif
if (!jit) { if (!jit) {
printf("JIT must be enabled to run pcre_jit_test\n"); printf("JIT must be enabled to run pcre2_jit_test\n");
return 1; return 1;
} }
return regression_tests() return regression_tests()
@ -1200,8 +1200,8 @@ static int regression_tests(void)
#endif #endif
/* This test compares the behaviour of interpreter and JIT. Although disabling /* This test compares the behaviour of interpreter and JIT. Although disabling
utf or ucp may make tests fail, if the pcre_exec result is the SAME, it is utf or ucp may make tests fail, if the pcre2_match result is the SAME, it is
still considered successful from pcre_jit_test point of view. */ still considered successful from pcre2_jit_test point of view. */
#if defined SUPPORT_PCRE2_8 #if defined SUPPORT_PCRE2_8
pcre2_config_8(PCRE2_CONFIG_JITTARGET, &cpu_info); pcre2_config_8(PCRE2_CONFIG_JITTARGET, &cpu_info);

@ -1816,7 +1816,7 @@ to find all possible matches.
Arguments: Arguments:
matchptr the start of the subject matchptr the start of the subject
length the length of the subject to match length the length of the subject to match
options options for pcre_exec options options for pcre2_match
startoffset where to start matching startoffset where to start matching
mrc address of where to put the result of pcre2_match() mrc address of where to put the result of pcre2_match()

testdata/testinput7

@ -1,5 +1,5 @@
# This set of tests checks UTF and Unicode property support with the DFA # This set of tests checks UTF and Unicode property support with the DFA
# matching functionality of pcre_dfa_match(). A default subject modifier is # matching functionality of pcre2_dfa_match(). A default subject modifier is
# used to force DFA matching for all tests. # used to force DFA matching for all tests.
#subject dfa #subject dfa

@ -1,5 +1,5 @@
# This set of tests checks UTF and Unicode property support with the DFA # This set of tests checks UTF and Unicode property support with the DFA
# matching functionality of pcre_dfa_match(). A default subject modifier is # matching functionality of pcre2_dfa_match(). A default subject modifier is
# used to force DFA matching for all tests. # used to force DFA matching for all tests.
#subject dfa #subject dfa