doc: formatting/typo fixes to documentation (#47)

* doc: fix incorrect use of JOIN and typo

Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>

* doc: reformat of pcre2_substitute to align options

includes some rewording to fit better in an 80 char wide troff output.

Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>

* doc: update names to pcre2
Carlo Marcelo Arenas Belón 2021-11-27 08:27:49 -08:00 committed by GitHub
parent c8d31f1605
commit 587b94277b
No known key found for this signature in database
GPG Key ID: 4AEE18F83AFDEB23
15 changed files with 64 additions and 55 deletions


@ -30,8 +30,8 @@ This function sets additional option bits for <b>pcre2_compile()</b> that are
housed in a compile context. It completely replaces all the bits. The extra
options are:
<pre>
PCRE2_EXTRA_ALLOW_LOOKAROUND_BSK Allow \K in lookarounds PCRE2_EXTRA_ALLOW_SURROGATE_ESCAPES Allow \x{df800} to \x{dfff}
in UTF-8 and UTF-32 modes
PCRE2_EXTRA_ALLOW_LOOKAROUND_BSK Allow \K in lookarounds
PCRE2_EXTRA_ALLOW_SURROGATE_ESCAPES Allow \x{d800} to \x{dfff} in UTF-8 and UTF-32 modes
PCRE2_EXTRA_ALT_BSUX Extended alternate \u, \U, and \x handling
PCRE2_EXTRA_BAD_ESCAPE_IS_LITERAL Treat all invalid escapes as a literal following character
PCRE2_EXTRA_ESCAPED_CR_IS_LF Interpret \r as \n
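The range correction above (\x{df800} to \x{d800}) can be checked against the UTF-16 surrogate block, which the Unicode standard fixes at U+D800 to U+DFFF. The following is a minimal sketch only; the `DEMO_` macros and `demo_is_surrogate()` are hypothetical names for illustration and are not part of the PCRE2 API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bounds of the UTF-16 surrogate block (U+D800 to U+DFFF). */
#define DEMO_SURROGATE_FIRST 0xd800u
#define DEMO_SURROGATE_LAST  0xdfffu

/* The value in the old text lies above the block, so it cannot be the
   lower bound of the surrogate range. Checked at compile time. */
static_assert(0xdf800u > DEMO_SURROGATE_LAST, "0xdf800 is not a surrogate");

/* Hypothetical helper (not a PCRE2 function): returns true when the code
   point falls inside the surrogate block. */
bool demo_is_surrogate(uint32_t code_point)
{
    return code_point >= DEMO_SURROGATE_FIRST &&
           code_point <= DEMO_SURROGATE_LAST;
}
```

With this helper, 0xd800 and 0xdfff are reported as surrogates, while 0xd7ff, 0xe000, and the mistyped 0xdf800 are not.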


@ -68,29 +68,29 @@ automatically added.
The subject and replacement lengths can be given as PCRE2_ZERO_TERMINATED for
zero-terminated strings. The options are:
<pre>
PCRE2_ANCHORED Match only at the first position
PCRE2_ENDANCHORED Pattern can match only at end of subject
PCRE2_NOTBOL Subject is not the beginning of a line
PCRE2_NOTEOL Subject is not the end of a line
PCRE2_NOTEMPTY An empty string is not a valid match
PCRE2_NOTEMPTY_ATSTART An empty string at the start of the subject is not a valid match
PCRE2_NO_JIT Do not use JIT matching
PCRE2_NO_UTF_CHECK Do not check the subject or replacement for UTF validity (only relevant if
PCRE2_UTF was set at compile time)
PCRE2_SUBSTITUTE_EXTENDED Do extended replacement processing
PCRE2_SUBSTITUTE_GLOBAL Replace all occurrences in the subject
PCRE2_SUBSTITUTE_LITERAL The replacement string is literal
PCRE2_SUBSTITUTE_MATCHED Use pre-existing match data for 1st match
PCRE2_SUBSTITUTE_OVERFLOW_LENGTH If overflow, compute needed length
PCRE2_ANCHORED Match only at the first position
PCRE2_ENDANCHORED Match only at end of subject
PCRE2_NOTBOL Subject is not the beginning of a line
PCRE2_NOTEOL Subject is not the end of a line
PCRE2_NOTEMPTY An empty string is not a valid match
PCRE2_NOTEMPTY_ATSTART An empty string at the start of the subject is not a valid match
PCRE2_NO_JIT Do not use JIT matching
PCRE2_NO_UTF_CHECK Do not check for UTF validity in the subject or replacement
(only relevant if PCRE2_UTF was set at compile time)
PCRE2_SUBSTITUTE_EXTENDED Do extended replacement processing
PCRE2_SUBSTITUTE_GLOBAL Replace all occurrences in the subject
PCRE2_SUBSTITUTE_LITERAL The replacement string is literal
PCRE2_SUBSTITUTE_MATCHED Use pre-existing match data for first match
PCRE2_SUBSTITUTE_OVERFLOW_LENGTH If overflow, compute needed length
PCRE2_SUBSTITUTE_REPLACEMENT_ONLY Return only replacement string(s)
PCRE2_SUBSTITUTE_UNKNOWN_UNSET Treat unknown group as unset
PCRE2_SUBSTITUTE_UNSET_EMPTY Simple unset insert = empty string
PCRE2_SUBSTITUTE_UNKNOWN_UNSET Treat unknown group as unset
PCRE2_SUBSTITUTE_UNSET_EMPTY Simple unset insert = empty string
</pre>
If PCRE2_SUBSTITUTE_LITERAL is set, PCRE2_SUBSTITUTE_EXTENDED,
PCRE2_SUBSTITUTE_UNKNOWN_UNSET, and PCRE2_SUBSTITUTE_UNSET_EMPTY are ignored.
</P>
<P>
If PCRE2_SUBSTITUTE_MATCHED is set, <i>match_data</i> must be non-zero; its
If PCRE2_SUBSTITUTE_MATCHED is set, <i>match_data</i> must be non-NULL; its
contents must be the result of a call to <b>pcre2_match()</b> using the same
pattern and subject.
</P>


@ -534,7 +534,7 @@ for themselves. For example, outside a character class:
\0113 is a tab followed by the character "3"
\113 might be a backreference, otherwise the character with octal code 113
\377 might be a backreference, otherwise the value 255 (decimal)
\81 is always a backreference .sp
\81 is always a backreference
</pre>
Note that octal values of 100 or greater that are specified using this syntax
must not be introduced by a leading zero, because no more than three octal
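The "\0113" row in the table above follows from reading at most three octal digits after the backslash. The sketch below models only that digit-counting step; `demo_parse_octal_escape()` is a hypothetical helper, not a PCRE2 function, and it deliberately ignores the backreference decision, which depends on pattern context that is not modeled here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not part of PCRE2). 's' points at the first
   character after the backslash. Reads at most three octal digits,
   stores how many were read in '*consumed', and returns their value. */
int demo_parse_octal_escape(const char *s, size_t *consumed)
{
    int value = 0;
    size_t n = 0;

    assert(s != NULL && consumed != NULL);

    /* Stop at the first non-octal character or after three digits. */
    while (n < 3 && s[n] >= '0' && s[n] <= '7') {
        value = value * 8 + (s[n] - '0');
        n++;
    }

    *consumed = n;
    return value;
}
```

Called on "0113", the helper consumes "011" and leaves the trailing "3" as a literal character. Called on "81", it consumes nothing, because 8 is not an octal digit.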


@ -18,9 +18,9 @@ This function sets additional option bits for \fBpcre2_compile()\fP that are
housed in a compile context. It completely replaces all the bits. The extra
options are:
.sp
.\" JOIN
PCRE2_EXTRA_ALLOW_LOOKAROUND_BSK Allow \eK in lookarounds
PCRE2_EXTRA_ALLOW_SURROGATE_ESCAPES Allow \ex{df800} to \ex{dfff}
.\" JOIN
PCRE2_EXTRA_ALLOW_SURROGATE_ESCAPES Allow \ex{d800} to \ex{dfff}
in UTF-8 and UTF-32 modes
.\" JOIN
PCRE2_EXTRA_ALT_BSUX Extended alternate \eu, \eU, and


@ -55,32 +55,42 @@ automatically added.
The subject and replacement lengths can be given as PCRE2_ZERO_TERMINATED for
zero-terminated strings. The options are:
.sp
PCRE2_ANCHORED Match only at the first position
PCRE2_ENDANCHORED Pattern can match only at end of subject
PCRE2_NOTBOL Subject is not the beginning of a line
PCRE2_NOTEOL Subject is not the end of a line
PCRE2_NOTEMPTY An empty string is not a valid match
PCRE2_ANCHORED Match only at the first position
PCRE2_ENDANCHORED Match only at end of subject
.\" JOIN
PCRE2_NOTEMPTY_ATSTART An empty string at the start of the
subject is not a valid match
PCRE2_NO_JIT Do not use JIT matching
PCRE2_NOTBOL Subject is not the beginning of a
line
PCRE2_NOTEOL Subject is not the end of a line
.\" JOIN
PCRE2_NO_UTF_CHECK Do not check the subject or replacement
for UTF validity (only relevant if
PCRE2_UTF was set at compile time)
PCRE2_SUBSTITUTE_EXTENDED Do extended replacement processing
PCRE2_SUBSTITUTE_GLOBAL Replace all occurrences in the subject
PCRE2_SUBSTITUTE_LITERAL The replacement string is literal
PCRE2_SUBSTITUTE_MATCHED Use pre-existing match data for 1st match
PCRE2_SUBSTITUTE_OVERFLOW_LENGTH If overflow, compute needed length
PCRE2_NOTEMPTY An empty string is not a
valid match
.\" JOIN
PCRE2_NOTEMPTY_ATSTART An empty string at the start of
the subject is not a valid match
PCRE2_NO_JIT Do not use JIT matching
.\" JOIN
PCRE2_NO_UTF_CHECK Do not check for UTF validity in
the subject or replacement
.\" JOIN
(only relevant if PCRE2_UTF was
set at compile time)
PCRE2_SUBSTITUTE_EXTENDED Do extended replacement processing
.\" JOIN
PCRE2_SUBSTITUTE_GLOBAL Replace all occurrences in the
subject
PCRE2_SUBSTITUTE_LITERAL The replacement string is literal
.\" JOIN
PCRE2_SUBSTITUTE_MATCHED Use pre-existing match data for
first match
PCRE2_SUBSTITUTE_OVERFLOW_LENGTH If overflow, compute needed length
PCRE2_SUBSTITUTE_REPLACEMENT_ONLY Return only replacement string(s)
PCRE2_SUBSTITUTE_UNKNOWN_UNSET Treat unknown group as unset
PCRE2_SUBSTITUTE_UNSET_EMPTY Simple unset insert = empty string
PCRE2_SUBSTITUTE_UNKNOWN_UNSET Treat unknown group as unset
PCRE2_SUBSTITUTE_UNSET_EMPTY Simple unset insert = empty string
.sp
If PCRE2_SUBSTITUTE_LITERAL is set, PCRE2_SUBSTITUTE_EXTENDED,
PCRE2_SUBSTITUTE_UNKNOWN_UNSET, and PCRE2_SUBSTITUTE_UNSET_EMPTY are ignored.
.P
If PCRE2_SUBSTITUTE_MATCHED is set, \fImatch_data\fP must be non-zero; its
If PCRE2_SUBSTITUTE_MATCHED is set, \fImatch_data\fP must be non-NULL; its
contents must be the result of a call to \fBpcre2_match()\fP using the same
pattern and subject.
.P


@ -1794,7 +1794,7 @@ it is set, the effect of passing an invalid UTF string as a pattern is
undefined. It may cause your program to crash or loop.
.P
Note that this option can also be passed to \fBpcre2_match()\fP and
\fBpcre_dfa_match()\fP, to suppress UTF validity checking of the subject
\fBpcre2_dfa_match()\fP, to suppress UTF validity checking of the subject
string.
.P
Note also that setting PCRE2_NO_UTF_CHECK at compile time does not disable the
@ -3848,7 +3848,7 @@ Here is an example of a simple call to \fBpcre2_dfa_match()\fP:
wspace, /* working space vector */
20); /* number of elements (NOT size in bytes) */
.
.SS "Option bits for \fBpcre_dfa_match()\fP"
.SS "Option bits for \fBpcre2_dfa_match()\fP"
.rs
.sp
The unused bits of the \fIoptions\fP argument for \fBpcre2_dfa_match()\fP must


@ -509,7 +509,6 @@ for themselves. For example, outside a character class:
.\" JOIN
\e377 might be a backreference, otherwise
the value 255 (decimal)
.\" JOIN
\e81 is always a backreference
.sp
Note that octal values of 100 or greater that are specified using this syntax


@ -47,7 +47,7 @@ format before being passed to the library functions. Results are converted back
to 8-bit code units for output.
.P
In the rest of this document, the names of library functions and structures
are given in generic form, for example, \fBpcre_compile()\fP. The actual
are given in generic form, for example, \fBpcre2_compile()\fP. The actual
names used in the libraries have a suffix _8, _16, or _32, as appropriate.
.
.
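The generic-name convention described in this hunk can be illustrated with token pasting. This is a sketch under stated assumptions: every `DEMO_`/`demo_` identifier is hypothetical, the stub functions simply return their code unit width, and the real library headers may arrange their macros differently.

```c
#include <assert.h>

/* Hypothetical width selector; 8, 16, or 32 would be the valid choices. */
#define DEMO_CODE_UNIT_WIDTH 8

static_assert(DEMO_CODE_UNIT_WIDTH == 8 || DEMO_CODE_UNIT_WIDTH == 16 ||
              DEMO_CODE_UNIT_WIDTH == 32, "unsupported code unit width");

/* Two-level pasting so the width macro is expanded before '##' is applied. */
#define DEMO_JOIN(a, b) a ## b
#define DEMO_GLUE(a, b) DEMO_JOIN(a, b)
#define DEMO_SUFFIX(a)  DEMO_GLUE(a, DEMO_CODE_UNIT_WIDTH)

/* The generic name expands to the suffixed name for the chosen width. */
#define demo_compile DEMO_SUFFIX(demo_compile_)

/* Stand-ins for the width-specific functions; each returns its width. */
int demo_compile_8(void)  { return 8; }
int demo_compile_16(void) { return 16; }
int demo_compile_32(void) { return 32; }

/* Code written against the generic name calls the suffixed function. */
int demo_selected_width(void)
{
    return demo_compile();
}
```

With the width set to 8, `demo_compile()` resolves to `demo_compile_8()`; changing the selector to 16 or 32 would redirect the same source line to the other variants.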


@ -519,7 +519,7 @@ it is. This is called only in UTF-32 mode - we don't put a test within the
macro because almost all calls are already within a block of UTF-32 only
code.
These are all no-ops since all UTF-32 characters fit into one pcre_uchar. */
These are all no-ops since all UTF-32 characters fit into one PCRE2_UCHAR. */
#define BACKCHAR(eptr) do { } while (0)
@ -764,7 +764,7 @@ typedef struct pcre2_real_jit_stack {
} pcre2_real_jit_stack;
/* Structure for items in a linked list that represents an explicit recursive
call within the pattern when running pcre_dfa_match(). */
call within the pattern when running pcre2_dfa_match(). */
typedef struct dfa_recursion_info {
struct dfa_recursion_info *prevrec;


@ -13692,7 +13692,7 @@ if (!compiler)
}
common->compiler = compiler;
/* Main pcre_jit_exec entry. */
/* Main pcre2_jit_exec entry. */
sljit_emit_enter(compiler, 0, SLJIT_ARG1(SW), 5, 5, 0, 0, private_data_size);
/* Register init. */


@ -120,7 +120,7 @@ else if ((options & PCRE2_PARTIAL_SOFT) != 0)
if (functions == NULL || functions->executable_funcs[index] == NULL)
return PCRE2_ERROR_JIT_BADOPTION;
/* Sanity checks should be handled by pcre_exec. */
/* Sanity checks should be handled by pcre2_match. */
arguments.str = subject + start_offset;
arguments.begin = subject;
arguments.end = subject + length;


@ -108,7 +108,7 @@ int main(void)
pcre2_config_32(PCRE2_CONFIG_JIT, &jit);
#endif
if (!jit) {
printf("JIT must be enabled to run pcre_jit_test\n");
printf("JIT must be enabled to run pcre2_jit_test\n");
return 1;
}
return regression_tests()
@ -1200,8 +1200,8 @@ static int regression_tests(void)
#endif
/* This test compares the behaviour of interpreter and JIT. Although disabling
utf or ucp may make tests fail, if the pcre_exec result is the SAME, it is
still considered successful from pcre_jit_test point of view. */
utf or ucp may make tests fail, if the pcre2_match result is the SAME, it is
still considered successful from pcre2_jit_test point of view. */
#if defined SUPPORT_PCRE2_8
pcre2_config_8(PCRE2_CONFIG_JITTARGET, &cpu_info);


@ -1816,7 +1816,7 @@ to find all possible matches.
Arguments:
matchptr the start of the subject
length the length of the subject to match
options options for pcre_exec
options options for pcre2_match
startoffset where to start matching
mrc address of where to put the result of pcre2_match()

2
testdata/testinput7 vendored

@ -1,5 +1,5 @@
# This set of tests checks UTF and Unicode property support with the DFA
# matching functionality of pcre_dfa_match(). A default subject modifier is
# matching functionality of pcre2_dfa_match(). A default subject modifier is
# used to force DFA matching for all tests.
#subject dfa


@ -1,5 +1,5 @@
# This set of tests checks UTF and Unicode property support with the DFA
# matching functionality of pcre_dfa_match(). A default subject modifier is
# matching functionality of pcre2_dfa_match(). A default subject modifier is
# used to force DFA matching for all tests.
#subject dfa