Documentation update

Philip.Hazel 2020-11-04 17:01:13 +00:00
parent fb54d81528
commit dc426be88e
3 changed files with 1364 additions and 1275 deletions

@@ -626,14 +626,15 @@ documentation for more details.
 <P>
 In a more complicated situation, where patterns are compiled only when they are
 first needed, but are still shared between threads, pointers to compiled
-patterns must be protected from simultaneous writing by multiple threads, at
-least until a pattern has been compiled. The logic can be something like this:
+patterns must be protected from simultaneous writing by multiple threads. This
+is somewhat tricky to do correctly. If you know that writing to a pointer is
+atomic in your environment, you can use logic like this:
 <pre>
   Get a read-only (shared) lock (mutex) for pointer
   if (pointer == NULL)
     {
     Get a write (unique) lock for pointer
-    pointer = pcre2_compile(...
+    if (pointer == NULL) pointer = pcre2_compile(...
     }
   Release the lock
   Use pointer in pcre2_match()
@@ -641,10 +642,39 @@ least until a pattern has been compiled. The logic can be something like this:
 Of course, testing for compilation errors should also be included in the code.
 </P>
 <P>
-If JIT is being used, but the JIT compilation is not being done immediately,
-(perhaps waiting to see if the pattern is used often enough) similar logic is
-required. JIT compilation updates a pointer within the compiled code block, so
-a thread must gain unique write access to the pointer before calling
+The reason for checking the pointer a second time is as follows: Several
+threads may have acquired the shared lock and tested the pointer for being
+NULL, but only one of them will be given the write lock, with the rest kept
+waiting. The winning thread will compile the pattern and store the result.
+After this thread releases the write lock, another thread will get it, and if
+it does not retest pointer for being NULL, will recompile the pattern and
+overwrite the pointer, creating a memory leak and possibly causing other
+issues.
+</P>
+<P>
+In an environment where writing to a pointer may not be atomic, the above logic
+is not sufficient. The thread that is doing the compiling may be descheduled
+after writing only part of the pointer, which could cause other threads to use
+an invalid value. Instead of checking the pointer itself, a separate "pointer
+is valid" flag (that can be updated atomically) must be used:
+<pre>
+  Get a read-only (shared) lock (mutex) for pointer
+  if (!pointer_is_valid)
+    {
+    Get a write (unique) lock for pointer
+    if (!pointer_is_valid)
+      {
+      pointer = pcre2_compile(...
+      pointer_is_valid = TRUE
+      }
+    }
+  Release the lock
+  Use pointer in pcre2_match()
+</pre>
+If JIT is being used, but the JIT compilation is not being done immediately
+(perhaps waiting to see if the pattern is used often enough), similar logic is
+required. JIT compilation updates a value within the compiled code block, so a
+thread must gain unique write access to the pointer before calling
 <b>pcre2_jit_compile()</b>. Alternatively, <b>pcre2_code_copy()</b> or
 <b>pcre2_code_copy_with_tables()</b> can be used to obtain a private copy of the
 compiled code before calling the JIT compiler.
@@ -3959,7 +3989,7 @@ Cambridge, England.
 </P>
 <br><a name="SEC42" href="#TOC1">REVISION</a><br>
 <P>
-Last updated: 05 October 2020
+Last updated: 04 November 2020
 <br>
 Copyright &copy; 1997-2020 University of Cambridge.
 <br>

@@ -683,14 +683,15 @@ MULTITHREADING
 In a more complicated situation, where patterns are compiled only when
 they are first needed, but are still shared between threads, pointers
 to compiled patterns must be protected from simultaneous writing by
-multiple threads, at least until a pattern has been compiled. The logic
-can be something like this:
+multiple threads. This is somewhat tricky to do correctly. If you know
+that writing to a pointer is atomic in your environment, you can use
+logic like this:
   Get a read-only (shared) lock (mutex) for pointer
   if (pointer == NULL)
     {
     Get a write (unique) lock for pointer
-    pointer = pcre2_compile(...
+    if (pointer == NULL) pointer = pcre2_compile(...
     }
   Release the lock
   Use pointer in pcre2_match()
@@ -698,9 +699,38 @@ MULTITHREADING
 Of course, testing for compilation errors should also be included in
 the code.
+The reason for checking the pointer a second time is as follows: Sev-
+eral threads may have acquired the shared lock and tested the pointer
+for being NULL, but only one of them will be given the write lock, with
+the rest kept waiting. The winning thread will compile the pattern and
+store the result. After this thread releases the write lock, another
+thread will get it, and if it does not retest pointer for being NULL,
+will recompile the pattern and overwrite the pointer, creating a memory
+leak and possibly causing other issues.
+In an environment where writing to a pointer may not be atomic, the
+above logic is not sufficient. The thread that is doing the compiling
+may be descheduled after writing only part of the pointer, which could
+cause other threads to use an invalid value. Instead of checking the
+pointer itself, a separate "pointer is valid" flag (that can be updated
+atomically) must be used:
+  Get a read-only (shared) lock (mutex) for pointer
+  if (!pointer_is_valid)
+    {
+    Get a write (unique) lock for pointer
+    if (!pointer_is_valid)
+      {
+      pointer = pcre2_compile(...
+      pointer_is_valid = TRUE
+      }
+    }
+  Release the lock
+  Use pointer in pcre2_match()
 If JIT is being used, but the JIT compilation is not being done immedi-
-ately, (perhaps waiting to see if the pattern is used often enough)
-similar logic is required. JIT compilation updates a pointer within the
+ately (perhaps waiting to see if the pattern is used often enough),
+similar logic is required. JIT compilation updates a value within the
 compiled code block, so a thread must gain unique write access to the
 pointer before calling pcre2_jit_compile(). Alternatively,
 pcre2_code_copy() or pcre2_code_copy_with_tables() can be used to ob-
@@ -3796,7 +3826,7 @@ AUTHOR
 REVISION
-Last updated: 05 October 2020
+Last updated: 04 November 2020
 Copyright (c) 1997-2020 University of Cambridge.
 ------------------------------------------------------------------------------
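
Reviewer note: the "pointer is valid" flag variant added by this commit could look roughly like the C sketch below. It is an illustration under stated assumptions, not code from PCRE2. It assumes C11 atomics and a POSIX mutex, and it swaps the shared lock of the pseudocode for an atomic acquire load on the fast path, so it is a variation on the documented logic rather than a transcription of it. `compiled_pattern`, `compile_stub()`, `get_flagged_pattern()` and `flagged_compile_calls` are hypothetical names, and `compile_stub()` stands in for `pcre2_compile()`.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for a compiled pattern (pcre2_code in real code). */
typedef struct { int dummy; } compiled_pattern;

static pthread_mutex_t compile_lock = PTHREAD_MUTEX_INITIALIZER;
static compiled_pattern *flagged_pattern = NULL; /* read only after flag is set */
static atomic_bool pattern_is_valid = false;     /* the "pointer is valid" flag */
static int flagged_compile_calls = 0;            /* for illustration only */

/* Hypothetical stand-in for pcre2_compile(); returns NULL on failure. */
static compiled_pattern *compile_stub(void)
{
  flagged_compile_calls++;
  return calloc(1, sizeof(compiled_pattern));
}

/* Returns the shared pattern, compiling it on first use.
   Returns NULL if compilation failed. */
compiled_pattern *get_flagged_pattern(void)
{
  compiled_pattern *p = NULL;

  /* First test: the acquire load pairs with the release store below, so a
     thread that sees the flag set also sees the fully written pointer. */
  if (atomic_load_explicit(&pattern_is_valid, memory_order_acquire))
    return flagged_pattern;

  pthread_mutex_lock(&compile_lock);
  /* Second test, now with exclusive access. */
  if (!atomic_load_explicit(&pattern_is_valid, memory_order_relaxed))
    {
    p = compile_stub();
    if (p != NULL)
      {
      flagged_pattern = p;   /* write the pointer first ... */
      atomic_store_explicit(&pattern_is_valid, true, memory_order_release);
      }                      /* ... then publish it by setting the flag */
    }
  else p = flagged_pattern;
  pthread_mutex_unlock(&compile_lock);
  return p;
}
```

The pointer is never tested directly; readers look only at the flag, so a partially written pointer cannot be observed. If compilation fails the flag stays clear and a later call tries again, which is one possible policy among several.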

@@ -1,4 +1,4 @@
-.TH PCRE2API 3 "05 October 2020" "PCRE2 10.36"
+.TH PCRE2API 3 "04 November 2020" "PCRE2 10.36"
 .SH NAME
 PCRE2 - Perl-compatible regular expressions (revised API)
 .sp
@@ -564,24 +564,53 @@ documentation for more details.
 .P
 In a more complicated situation, where patterns are compiled only when they are
 first needed, but are still shared between threads, pointers to compiled
-patterns must be protected from simultaneous writing by multiple threads, at
-least until a pattern has been compiled. The logic can be something like this:
+patterns must be protected from simultaneous writing by multiple threads. This
+is somewhat tricky to do correctly. If you know that writing to a pointer is
+atomic in your environment, you can use logic like this:
 .sp
   Get a read-only (shared) lock (mutex) for pointer
   if (pointer == NULL)
     {
     Get a write (unique) lock for pointer
-    pointer = pcre2_compile(...
+    if (pointer == NULL) pointer = pcre2_compile(...
     }
   Release the lock
   Use pointer in pcre2_match()
 .sp
 Of course, testing for compilation errors should also be included in the code.
 .P
-If JIT is being used, but the JIT compilation is not being done immediately,
-(perhaps waiting to see if the pattern is used often enough) similar logic is
-required. JIT compilation updates a pointer within the compiled code block, so
-a thread must gain unique write access to the pointer before calling
+The reason for checking the pointer a second time is as follows: Several
+threads may have acquired the shared lock and tested the pointer for being
+NULL, but only one of them will be given the write lock, with the rest kept
+waiting. The winning thread will compile the pattern and store the result.
+After this thread releases the write lock, another thread will get it, and if
+it does not retest pointer for being NULL, will recompile the pattern and
+overwrite the pointer, creating a memory leak and possibly causing other
+issues.
+.P
+In an environment where writing to a pointer may not be atomic, the above logic
+is not sufficient. The thread that is doing the compiling may be descheduled
+after writing only part of the pointer, which could cause other threads to use
+an invalid value. Instead of checking the pointer itself, a separate "pointer
+is valid" flag (that can be updated atomically) must be used:
+.sp
+  Get a read-only (shared) lock (mutex) for pointer
+  if (!pointer_is_valid)
+    {
+    Get a write (unique) lock for pointer
+    if (!pointer_is_valid)
+      {
+      pointer = pcre2_compile(...
+      pointer_is_valid = TRUE
+      }
+    }
+  Release the lock
+  Use pointer in pcre2_match()
+.sp
+If JIT is being used, but the JIT compilation is not being done immediately
+(perhaps waiting to see if the pattern is used often enough), similar logic is
+required. JIT compilation updates a value within the compiled code block, so a
+thread must gain unique write access to the pointer before calling
 \fBpcre2_jit_compile()\fP. Alternatively, \fBpcre2_code_copy()\fP or
 \fBpcre2_code_copy_with_tables()\fP can be used to obtain a private copy of the
 compiled code before calling the JIT compiler.
@@ -3971,6 +4000,6 @@ Cambridge, England.
 .rs
 .sp
 .nf
-Last updated: 05 October 2020
+Last updated: 04 November 2020
 Copyright (c) 1997-2020 University of Cambridge.
 .fi