doc: formatting/typo fixes to documentation (#47)
* doc: fix incorrect use of JOIN and typo

  Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>

* doc: reformat of pcre2_substitute to align options

  includes some rewording to fit better in an 80 char wide troff output.

  Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>

* doc: update names to pcre2
parent c8d31f1605
commit 587b94277b
|
@ -30,8 +30,8 @@ This function sets additional option bits for <b>pcre2_compile()</b> that are
|
||||||
housed in a compile context. It completely replaces all the bits. The extra
|
housed in a compile context. It completely replaces all the bits. The extra
|
||||||
options are:
|
options are:
|
||||||
<pre>
|
<pre>
|
||||||
PCRE2_EXTRA_ALLOW_LOOKAROUND_BSK Allow \K in lookarounds
PCRE2_EXTRA_ALLOW_SURROGATE_ESCAPES Allow \x{df800} to \x{dfff}
|
PCRE2_EXTRA_ALLOW_LOOKAROUND_BSK Allow \K in lookarounds
|
||||||
in UTF-8 and UTF-32 modes
|
PCRE2_EXTRA_ALLOW_SURROGATE_ESCAPES Allow \x{d800} to \x{dfff} in UTF-8 and UTF-32 modes
|
||||||
PCRE2_EXTRA_ALT_BSUX Extended alternate \u, \U, and \x handling
|
PCRE2_EXTRA_ALT_BSUX Extended alternate \u, \U, and \x handling
|
||||||
PCRE2_EXTRA_BAD_ESCAPE_IS_LITERAL Treat all invalid escapes as a literal following character
|
PCRE2_EXTRA_BAD_ESCAPE_IS_LITERAL Treat all invalid escapes as a literal following character
|
||||||
PCRE2_EXTRA_ESCAPED_CR_IS_LF Interpret \r as \n
|
PCRE2_EXTRA_ESCAPED_CR_IS_LF Interpret \r as \n
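The typo fixed in this hunk changes which code points the description names. As a minimal sketch, the check below tests whether a value lies in the UTF-16 surrogate range U+D800 to U+DFFF, which is what the corrected text describes. The function name `is_surrogate` is hypothetical and is not part of the PCRE2 API.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper, not a PCRE2 function: returns true when cp lies in
   the surrogate range 0xD800..0xDFFF named by the corrected documentation. */
bool is_surrogate(uint32_t cp)
{
    return cp >= 0xD800u && cp <= 0xDFFFu;
}
```

With this check, `is_surrogate(0xD800)` is true, while `is_surrogate(0xDF800)` is false, so the old text `\x{df800}` pointed at a code point outside the range the option is about.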
|
||||||
|
|
|
@ -68,29 +68,29 @@ automatically added.
|
||||||
The subject and replacement lengths can be given as PCRE2_ZERO_TERMINATED for
|
The subject and replacement lengths can be given as PCRE2_ZERO_TERMINATED for
|
||||||
zero-terminated strings. The options are:
|
zero-terminated strings. The options are:
|
||||||
<pre>
|
<pre>
|
||||||
PCRE2_ANCHORED Match only at the first position
|
PCRE2_ANCHORED Match only at the first position
|
||||||
PCRE2_ENDANCHORED Pattern can match only at end of subject
|
PCRE2_ENDANCHORED Match only at end of subject
|
||||||
PCRE2_NOTBOL Subject is not the beginning of a line
|
PCRE2_NOTBOL Subject is not the beginning of a line
|
||||||
PCRE2_NOTEOL Subject is not the end of a line
|
PCRE2_NOTEOL Subject is not the end of a line
|
||||||
PCRE2_NOTEMPTY An empty string is not a valid match
|
PCRE2_NOTEMPTY An empty string is not a valid match
|
||||||
PCRE2_NOTEMPTY_ATSTART An empty string at the start of the subject is not a valid match
|
PCRE2_NOTEMPTY_ATSTART An empty string at the start of the subject is not a valid match
|
||||||
PCRE2_NO_JIT Do not use JIT matching
|
PCRE2_NO_JIT Do not use JIT matching
|
||||||
PCRE2_NO_UTF_CHECK Do not check the subject or replacement for UTF validity (only relevant if
|
PCRE2_NO_UTF_CHECK Do not check for UTF validity in the subject or replacement
|
||||||
PCRE2_UTF was set at compile time)
|
(only relevant if PCRE2_UTF was set at compile time)
|
||||||
PCRE2_SUBSTITUTE_EXTENDED Do extended replacement processing
|
PCRE2_SUBSTITUTE_EXTENDED Do extended replacement processing
|
||||||
PCRE2_SUBSTITUTE_GLOBAL Replace all occurrences in the subject
|
PCRE2_SUBSTITUTE_GLOBAL Replace all occurrences in the subject
|
||||||
PCRE2_SUBSTITUTE_LITERAL The replacement string is literal
|
PCRE2_SUBSTITUTE_LITERAL The replacement string is literal
|
||||||
PCRE2_SUBSTITUTE_MATCHED Use pre-existing match data for 1st match
|
PCRE2_SUBSTITUTE_MATCHED Use pre-existing match data for first match
|
||||||
PCRE2_SUBSTITUTE_OVERFLOW_LENGTH If overflow, compute needed length
|
PCRE2_SUBSTITUTE_OVERFLOW_LENGTH If overflow, compute needed length
|
||||||
PCRE2_SUBSTITUTE_REPLACEMENT_ONLY Return only replacement string(s)
|
PCRE2_SUBSTITUTE_REPLACEMENT_ONLY Return only replacement string(s)
|
||||||
PCRE2_SUBSTITUTE_UNKNOWN_UNSET Treat unknown group as unset
|
PCRE2_SUBSTITUTE_UNKNOWN_UNSET Treat unknown group as unset
|
||||||
PCRE2_SUBSTITUTE_UNSET_EMPTY Simple unset insert = empty string
|
PCRE2_SUBSTITUTE_UNSET_EMPTY Simple unset insert = empty string
|
||||||
</pre>
|
</pre>
|
||||||
If PCRE2_SUBSTITUTE_LITERAL is set, PCRE2_SUBSTITUTE_EXTENDED,
|
If PCRE2_SUBSTITUTE_LITERAL is set, PCRE2_SUBSTITUTE_EXTENDED,
|
||||||
PCRE2_SUBSTITUTE_UNKNOWN_UNSET, and PCRE2_SUBSTITUTE_UNSET_EMPTY are ignored.
|
PCRE2_SUBSTITUTE_UNKNOWN_UNSET, and PCRE2_SUBSTITUTE_UNSET_EMPTY are ignored.
|
||||||
</P>
|
</P>
|
||||||
<P>
|
<P>
|
||||||
If PCRE2_SUBSTITUTE_MATCHED is set, <i>match_data</i> must be non-zero; its
|
If PCRE2_SUBSTITUTE_MATCHED is set, <i>match_data</i> must be non-NULL; its
|
||||||
contents must be the result of a call to <b>pcre2_match()</b> using the same
|
contents must be the result of a call to <b>pcre2_match()</b> using the same
|
||||||
pattern and subject.
|
pattern and subject.
|
||||||
</P>
|
</P>
|
||||||
|
|
|
@ -534,7 +534,7 @@ for themselves. For example, outside a character class:
|
||||||
\0113 is a tab followed by the character "3"
|
\0113 is a tab followed by the character "3"
|
||||||
\113 might be a backreference, otherwise the character with octal code 113
|
\113 might be a backreference, otherwise the character with octal code 113
|
||||||
\377 might be a backreference, otherwise the value 255 (decimal)
|
\377 might be a backreference, otherwise the value 255 (decimal)
|
||||||
\81 is always a backreference .sp
|
\81 is always a backreference
|
||||||
</pre>
|
</pre>
|
||||||
Note that octal values of 100 or greater that are specified using this syntax
|
Note that octal values of 100 or greater that are specified using this syntax
|
||||||
must not be introduced by a leading zero, because no more than three octal
|
must not be introduced by a leading zero, because no more than three octal
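To illustrate the leading-zero rule in this hunk, the C sketch below decodes a `\0` escape by reading at most two further octal digits after the zero, which is how the PCRE2 pattern documentation describes `\0` handling. The function name `decode_backslash_zero` is hypothetical; it only mirrors the documented rule and is not PCRE2's own parser.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical sketch, not PCRE2 code. s points at a backslash in pattern
   text. If the text starts with "\0", read up to two more octal digits,
   store the resulting value in *value, and return the number of characters
   consumed. Return 0 if the text does not start with "\0". */
size_t decode_backslash_zero(const char *s, unsigned *value)
{
    assert(s != NULL && value != NULL);

    if (s[0] != '\\' || s[1] != '0')
        return 0;

    size_t i = 2;      /* index of the first digit after "\0" */
    unsigned v = 0;

    /* Stop after two further digits: three octal digits in total. */
    while (i < 4 && s[i] >= '0' && s[i] <= '7') {
        v = v * 8 + (unsigned)(s[i] - '0');
        i++;
    }

    *value = v;
    return i;
}
```

For the pattern text `\0113` the sketch consumes four characters and yields the value 9 (a tab), leaving the `3` to be treated as a literal character, which matches the first example line in the hunk.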
|
||||||
|
|
|
@ -18,9 +18,9 @@ This function sets additional option bits for \fBpcre2_compile()\fP that are
|
||||||
housed in a compile context. It completely replaces all the bits. The extra
|
housed in a compile context. It completely replaces all the bits. The extra
|
||||||
options are:
|
options are:
|
||||||
.sp
|
.sp
|
||||||
.\" JOIN
|
|
||||||
PCRE2_EXTRA_ALLOW_LOOKAROUND_BSK Allow \eK in lookarounds
|
PCRE2_EXTRA_ALLOW_LOOKAROUND_BSK Allow \eK in lookarounds
|
||||||
PCRE2_EXTRA_ALLOW_SURROGATE_ESCAPES Allow \ex{df800} to \ex{dfff}
|
.\" JOIN
|
||||||
|
PCRE2_EXTRA_ALLOW_SURROGATE_ESCAPES Allow \ex{d800} to \ex{dfff}
|
||||||
in UTF-8 and UTF-32 modes
|
in UTF-8 and UTF-32 modes
|
||||||
.\" JOIN
|
.\" JOIN
|
||||||
PCRE2_EXTRA_ALT_BSUX Extended alternate \eu, \eU, and
|
PCRE2_EXTRA_ALT_BSUX Extended alternate \eu, \eU, and
|
||||||
|
|
|
@ -55,32 +55,42 @@ automatically added.
|
||||||
The subject and replacement lengths can be given as PCRE2_ZERO_TERMINATED for
|
The subject and replacement lengths can be given as PCRE2_ZERO_TERMINATED for
|
||||||
zero-terminated strings. The options are:
|
zero-terminated strings. The options are:
|
||||||
.sp
|
.sp
|
||||||
PCRE2_ANCHORED Match only at the first position
|
PCRE2_ANCHORED Match only at the first position
|
||||||
PCRE2_ENDANCHORED Pattern can match only at end of subject
|
PCRE2_ENDANCHORED Match only at end of subject
|
||||||
PCRE2_NOTBOL Subject is not the beginning of a line
|
|
||||||
PCRE2_NOTEOL Subject is not the end of a line
|
|
||||||
PCRE2_NOTEMPTY An empty string is not a valid match
|
|
||||||
.\" JOIN
|
.\" JOIN
|
||||||
PCRE2_NOTEMPTY_ATSTART An empty string at the start of the
|
PCRE2_NOTBOL Subject is not the beginning of a
|
||||||
subject is not a valid match
|
line
|
||||||
PCRE2_NO_JIT Do not use JIT matching
|
PCRE2_NOTEOL Subject is not the end of a line
|
||||||
.\" JOIN
|
.\" JOIN
|
||||||
PCRE2_NO_UTF_CHECK Do not check the subject or replacement
|
PCRE2_NOTEMPTY An empty string is not a
|
||||||
for UTF validity (only relevant if
|
valid match
|
||||||
PCRE2_UTF was set at compile time)
|
.\" JOIN
|
||||||
PCRE2_SUBSTITUTE_EXTENDED Do extended replacement processing
|
PCRE2_NOTEMPTY_ATSTART An empty string at the start of
|
||||||
PCRE2_SUBSTITUTE_GLOBAL Replace all occurrences in the subject
|
the subject is not a valid match
|
||||||
PCRE2_SUBSTITUTE_LITERAL The replacement string is literal
|
PCRE2_NO_JIT Do not use JIT matching
|
||||||
PCRE2_SUBSTITUTE_MATCHED Use pre-existing match data for 1st match
|
.\" JOIN
|
||||||
PCRE2_SUBSTITUTE_OVERFLOW_LENGTH If overflow, compute needed length
|
PCRE2_NO_UTF_CHECK Do not check for UTF validity in
|
||||||
|
the subject or replacement
|
||||||
|
.\" JOIN
|
||||||
|
(only relevant if PCRE2_UTF was
|
||||||
|
set at compile time)
|
||||||
|
PCRE2_SUBSTITUTE_EXTENDED Do extended replacement processing
|
||||||
|
.\" JOIN
|
||||||
|
PCRE2_SUBSTITUTE_GLOBAL Replace all occurrences in the
|
||||||
|
subject
|
||||||
|
PCRE2_SUBSTITUTE_LITERAL The replacement string is literal
|
||||||
|
.\" JOIN
|
||||||
|
PCRE2_SUBSTITUTE_MATCHED Use pre-existing match data for
|
||||||
|
first match
|
||||||
|
PCRE2_SUBSTITUTE_OVERFLOW_LENGTH If overflow, compute needed length
|
||||||
PCRE2_SUBSTITUTE_REPLACEMENT_ONLY Return only replacement string(s)
|
PCRE2_SUBSTITUTE_REPLACEMENT_ONLY Return only replacement string(s)
|
||||||
PCRE2_SUBSTITUTE_UNKNOWN_UNSET Treat unknown group as unset
|
PCRE2_SUBSTITUTE_UNKNOWN_UNSET Treat unknown group as unset
|
||||||
PCRE2_SUBSTITUTE_UNSET_EMPTY Simple unset insert = empty string
|
PCRE2_SUBSTITUTE_UNSET_EMPTY Simple unset insert = empty string
|
||||||
.sp
|
.sp
|
||||||
If PCRE2_SUBSTITUTE_LITERAL is set, PCRE2_SUBSTITUTE_EXTENDED,
|
If PCRE2_SUBSTITUTE_LITERAL is set, PCRE2_SUBSTITUTE_EXTENDED,
|
||||||
PCRE2_SUBSTITUTE_UNKNOWN_UNSET, and PCRE2_SUBSTITUTE_UNSET_EMPTY are ignored.
|
PCRE2_SUBSTITUTE_UNKNOWN_UNSET, and PCRE2_SUBSTITUTE_UNSET_EMPTY are ignored.
|
||||||
.P
|
.P
|
||||||
If PCRE2_SUBSTITUTE_MATCHED is set, \fImatch_data\fP must be non-zero; its
|
If PCRE2_SUBSTITUTE_MATCHED is set, \fImatch_data\fP must be non-NULL; its
|
||||||
contents must be the result of a call to \fBpcre2_match()\fP using the same
|
contents must be the result of a call to \fBpcre2_match()\fP using the same
|
||||||
pattern and subject.
|
pattern and subject.
|
||||||
.P
|
.P
|
||||||
|
|
|
@ -1794,7 +1794,7 @@ it is set, the effect of passing an invalid UTF string as a pattern is
|
||||||
undefined. It may cause your program to crash or loop.
|
undefined. It may cause your program to crash or loop.
|
||||||
.P
|
.P
|
||||||
Note that this option can also be passed to \fBpcre2_match()\fP and
|
Note that this option can also be passed to \fBpcre2_match()\fP and
|
||||||
\fBpcre_dfa_match()\fP, to suppress UTF validity checking of the subject
|
\fBpcre2_dfa_match()\fP, to suppress UTF validity checking of the subject
|
||||||
string.
|
string.
|
||||||
.P
|
.P
|
||||||
Note also that setting PCRE2_NO_UTF_CHECK at compile time does not disable the
|
Note also that setting PCRE2_NO_UTF_CHECK at compile time does not disable the
|
||||||
|
@ -3848,7 +3848,7 @@ Here is an example of a simple call to \fBpcre2_dfa_match()\fP:
|
||||||
wspace, /* working space vector */
|
wspace, /* working space vector */
|
||||||
20); /* number of elements (NOT size in bytes) */
|
20); /* number of elements (NOT size in bytes) */
|
||||||
.
|
.
|
||||||
.SS "Option bits for \fBpcre_dfa_match()\fP"
|
.SS "Option bits for \fBpcre2_dfa_match()\fP"
|
||||||
.rs
|
.rs
|
||||||
.sp
|
.sp
|
||||||
The unused bits of the \fIoptions\fP argument for \fBpcre2_dfa_match()\fP must
|
The unused bits of the \fIoptions\fP argument for \fBpcre2_dfa_match()\fP must
|
||||||
|
|
|
@ -509,7 +509,6 @@ for themselves. For example, outside a character class:
|
||||||
.\" JOIN
|
.\" JOIN
|
||||||
\e377 might be a backreference, otherwise
|
\e377 might be a backreference, otherwise
|
||||||
the value 255 (decimal)
|
the value 255 (decimal)
|
||||||
.\" JOIN
|
|
||||||
\e81 is always a backreference
|
\e81 is always a backreference
|
||||||
.sp
|
.sp
|
||||||
Note that octal values of 100 or greater that are specified using this syntax
|
Note that octal values of 100 or greater that are specified using this syntax
|
||||||
|
|
|
@ -47,7 +47,7 @@ format before being passed to the library functions. Results are converted back
|
||||||
to 8-bit code units for output.
|
to 8-bit code units for output.
|
||||||
.P
|
.P
|
||||||
In the rest of this document, the names of library functions and structures
|
In the rest of this document, the names of library functions and structures
|
||||||
are given in generic form, for example, \fBpcre_compile()\fP. The actual
|
are given in generic form, for example, \fBpcre2_compile()\fP. The actual
|
||||||
names used in the libraries have a suffix _8, _16, or _32, as appropriate.
|
names used in the libraries have a suffix _8, _16, or _32, as appropriate.
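The relationship between a generic name and its suffixed variants can be sketched with token pasting. The example below is an assumption-laden illustration: every `demo_` and `DEMO_` name is hypothetical, and the macros only imitate the general idea of selecting a `_8`, `_16`, or `_32` function from a width setting. They are not copied from the PCRE2 headers.

```c
/* Hypothetical width setting; stands in for a real code unit width macro. */
#define DEMO_WIDTH 8

/* Two-level paste so that DEMO_WIDTH is expanded before ## is applied. */
#define DEMO_JOIN(a, b) a##b
#define DEMO_GLUE(a, b) DEMO_JOIN(a, b)
#define DEMO_SUFFIX(a) DEMO_GLUE(a, DEMO_WIDTH)

/* Stand-ins for the width-specific library functions. */
int demo_compile_8(void)  { return 8; }
int demo_compile_16(void) { return 16; }
int demo_compile_32(void) { return 32; }

/* The generic name expands to demo_compile_8 when DEMO_WIDTH is 8. */
#define demo_compile DEMO_SUFFIX(demo_compile_)
```

With `DEMO_WIDTH` set to 8, a call written as `demo_compile()` is compiled as a call to `demo_compile_8()`. Changing the width to 16 or 32 selects the other variants without touching the call sites.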
|
||||||
.
|
.
|
||||||
.
|
.
|
||||||
|
|
|
@ -519,7 +519,7 @@ it is. This is called only in UTF-32 mode - we don't put a test within the
|
||||||
macro because almost all calls are already within a block of UTF-32 only
|
macro because almost all calls are already within a block of UTF-32 only
|
||||||
code.
|
code.
|
||||||
|
|
||||||
These are all no-ops since all UTF-32 characters fit into one pcre_uchar. */
|
These are all no-ops since all UTF-32 characters fit into one PCRE2_UCHAR. */
|
||||||
|
|
||||||
#define BACKCHAR(eptr) do { } while (0)
|
#define BACKCHAR(eptr) do { } while (0)
|
||||||
|
|
||||||
|
@ -764,7 +764,7 @@ typedef struct pcre2_real_jit_stack {
|
||||||
} pcre2_real_jit_stack;
|
} pcre2_real_jit_stack;
|
||||||
|
|
||||||
/* Structure for items in a linked list that represents an explicit recursive
|
/* Structure for items in a linked list that represents an explicit recursive
|
||||||
call within the pattern when running pcre_dfa_match(). */
|
call within the pattern when running pcre2_dfa_match(). */
|
||||||
|
|
||||||
typedef struct dfa_recursion_info {
|
typedef struct dfa_recursion_info {
|
||||||
struct dfa_recursion_info *prevrec;
|
struct dfa_recursion_info *prevrec;
|
||||||
|
|
|
@ -13692,7 +13692,7 @@ if (!compiler)
|
||||||
}
|
}
|
||||||
common->compiler = compiler;
|
common->compiler = compiler;
|
||||||
|
|
||||||
/* Main pcre_jit_exec entry. */
|
/* Main pcre2_jit_exec entry. */
|
||||||
sljit_emit_enter(compiler, 0, SLJIT_ARG1(SW), 5, 5, 0, 0, private_data_size);
|
sljit_emit_enter(compiler, 0, SLJIT_ARG1(SW), 5, 5, 0, 0, private_data_size);
|
||||||
|
|
||||||
/* Register init. */
|
/* Register init. */
|
||||||
|
|
|
@ -120,7 +120,7 @@ else if ((options & PCRE2_PARTIAL_SOFT) != 0)
|
||||||
if (functions == NULL || functions->executable_funcs[index] == NULL)
|
if (functions == NULL || functions->executable_funcs[index] == NULL)
|
||||||
return PCRE2_ERROR_JIT_BADOPTION;
|
return PCRE2_ERROR_JIT_BADOPTION;
|
||||||
|
|
||||||
/* Sanity checks should be handled by pcre_exec. */
|
/* Sanity checks should be handled by pcre2_match. */
|
||||||
arguments.str = subject + start_offset;
|
arguments.str = subject + start_offset;
|
||||||
arguments.begin = subject;
|
arguments.begin = subject;
|
||||||
arguments.end = subject + length;
|
arguments.end = subject + length;
|
||||||
|
|
|
@ -108,7 +108,7 @@ int main(void)
|
||||||
pcre2_config_32(PCRE2_CONFIG_JIT, &jit);
|
pcre2_config_32(PCRE2_CONFIG_JIT, &jit);
|
||||||
#endif
|
#endif
|
||||||
if (!jit) {
|
if (!jit) {
|
||||||
printf("JIT must be enabled to run pcre_jit_test\n");
|
printf("JIT must be enabled to run pcre2_jit_test\n");
|
||||||
return 1;
|
return 1;
|
||||||
}
|
}
|
||||||
return regression_tests()
|
return regression_tests()
|
||||||
|
@ -1200,8 +1200,8 @@ static int regression_tests(void)
|
||||||
#endif
|
#endif
|
||||||
|
|
||||||
/* This test compares the behaviour of interpreter and JIT. Although disabling
|
/* This test compares the behaviour of interpreter and JIT. Although disabling
|
||||||
utf or ucp may make tests fail, if the pcre_exec result is the SAME, it is
|
utf or ucp may make tests fail, if the pcre2_match result is the SAME, it is
|
||||||
still considered successful from pcre_jit_test point of view. */
|
still considered successful from pcre2_jit_test point of view. */
|
||||||
|
|
||||||
#if defined SUPPORT_PCRE2_8
|
#if defined SUPPORT_PCRE2_8
|
||||||
pcre2_config_8(PCRE2_CONFIG_JITTARGET, &cpu_info);
|
pcre2_config_8(PCRE2_CONFIG_JITTARGET, &cpu_info);
|
||||||
|
|
|
@ -1816,7 +1816,7 @@ to find all possible matches.
|
||||||
Arguments:
|
Arguments:
|
||||||
matchptr the start of the subject
|
matchptr the start of the subject
|
||||||
length the length of the subject to match
|
length the length of the subject to match
|
||||||
options options for pcre_exec
|
options options for pcre2_match
|
||||||
startoffset where to start matching
|
startoffset where to start matching
|
||||||
mrc address of where to put the result of pcre2_match()
|
mrc address of where to put the result of pcre2_match()
|
||||||
|
|
||||||
|
|
|
@ -1,5 +1,5 @@
|
||||||
# This set of tests checks UTF and Unicode property support with the DFA
|
# This set of tests checks UTF and Unicode property support with the DFA
|
||||||
# matching functionality of pcre_dfa_match(). A default subject modifier is
|
# matching functionality of pcre2_dfa_match(). A default subject modifier is
|
||||||
# used to force DFA matching for all tests.
|
# used to force DFA matching for all tests.
|
||||||
|
|
||||||
#subject dfa
|
#subject dfa
|
||||||
|
|
|
@ -1,5 +1,5 @@
|
||||||
# This set of tests checks UTF and Unicode property support with the DFA
|
# This set of tests checks UTF and Unicode property support with the DFA
|
||||||
# matching functionality of pcre_dfa_match(). A default subject modifier is
|
# matching functionality of pcre2_dfa_match(). A default subject modifier is
|
||||||
# used to force DFA matching for all tests.
|
# used to force DFA matching for all tests.
|
||||||
|
|
||||||
#subject dfa
|
#subject dfa
|
||||||
|
|