Compare commits

35 Commits

Author SHA1 Message Date
George Sokianos 4a45482c9c Cleanup of Makefile.os4, added release rule and a README file for this release 2022-07-31 20:34:33 +01:00
Philip Hazel 8b133fa0ba Implement -Z in pcre2grep and update documentation 2022-07-30 17:41:49 +01:00
Philip Hazel cc5e121c8e Added some special heap tests 2022-07-28 17:58:19 +01:00
Philip Hazel 1343bdff8f Fix overlooked comment edit 2022-07-27 18:00:40 +01:00
Philip Hazel d90fb23878 Refactor match_data() to always use the heap instead of having an initial frames vector on the stack; some consequential adjustments needed. 2022-07-27 17:44:55 +01:00
Ezekiel Warren e47fc51584 bazel support (#136) 2022-07-15 17:18:11 +01:00
Zoltan Herczeg b67d568201 JIT compiler update 2022-07-14 03:41:42 +00:00
Zoltan Herczeg 4851890ede Fixed an issue in the backtracking optimization of character repeats in JIT (#135) 2022-07-14 05:25:39 +02:00
Amin Yahyaabadi 3e52db5209 doc: fix various typos (#132) 2022-07-08 10:01:46 +01:00
Philip Hazel 4804b00e8f Add an #ifdef to avoid the need even to link with pcre2_jit_compile.o when JIT is not supported 2022-06-30 17:37:51 +01:00
Philip Hazel 7549fdca74 Change length variables in pcre2grep from int to size_t 2022-06-30 17:06:32 +01:00
Philip Hazel 5271b533c4 Fix compiler warning in pcre2test 2022-06-08 17:05:24 +01:00
larinsv 45af1203bd Fixed race condition that occurs when initializing the executable_allocator_is_working variable in the pcre2_jit_compile function (#91) 2022-05-18 12:16:00 +02:00
Rémi Verschelde 187b7ba050
Add `pcre2_ucptables.c` to non-autotools build docs (#120)
This seems needed following 4514ddd2a2.
2022-05-18 08:56:59 +01:00
William A Rowe Jr 06f34ba374
Include specific .pdb files only for chosen char size libs when shared (#116)
Signed-off-by: William A Rowe Jr <wrowe@vmware.com>
2022-05-07 09:09:19 +01:00
GregThain a334ea2a34
Add target_include_directories to CMakefile (#113)
To tell clients where to find the public include directory,
and attach it to the various library targets.
2022-05-03 16:29:28 +01:00
Carlo Marcelo Arenas Belón 15a82c3efd
doc: mostly wording issues, but more importantly a fixed group link (#114)
Not sure when the previous link broke, but this one seems to work
2022-04-30 09:46:50 +01:00
Philip Hazel 51a5fcdc1f Remove unused variables in ucptest.c and update test data for added properties 2022-04-25 15:19:09 +01:00
Philip Hazel 104fe2fead Update maintenance documentation 2022-04-25 15:07:14 +01:00
Philip Hazel f65df06305 Remove unused enum; add comments re unity builds 2022-04-24 16:44:33 +01:00
pkeir a13d7d4340 Added support for (CMake) Unity Builds. (#94) 2022-04-24 16:37:37 +01:00
Lucas Trzesniewski c630e868ca Fix integer promotion causing a warning in MSVC (#111) 2022-04-24 16:16:49 +01:00
Joe Zhang 77ce1ff528
Add OpenSSF Scorecards to improve the security posture (#93)
* add openssf scorecards

* Create codeql.yml
2022-04-23 17:48:09 +01:00
Philip Hazel ff5402a378 Add some casts and other tidies to pcre2test formatting of size_t values 2022-04-23 17:34:35 +01:00
Philip Hazel b52d055d1b Update HTML docs 2022-04-22 18:02:14 +01:00
Carlo Marcelo Arenas Belón a4ac97fea8
doc: avoid nonexistent PCRE2_ERROR_MEMORY error (#107)
5438fc8a (Add serialization functions and tests with updated pcre2test.
Fix PCRE2_INFO_SIZE issues., 2015-01-23) introduced the typo.

Reported-by: @sjshuck
Fixes: #106
2022-04-22 17:59:44 +01:00
Philip Hazel fedf4d9d40 Fix recent documentation error 2022-04-22 17:51:31 +01:00
Philip Hazel 8ebf9efe7b Add PR#110 comment to ChangeLog 2022-04-22 17:33:07 +01:00
Carlo Marcelo Arenas Belón 4edcf6ada5
cmake: add pthread dependency (#110)
Fixes: #103
2022-04-22 17:31:07 +01:00
Philip Hazel d0c7544e78 Documentation update 2022-04-22 10:38:37 +01:00
Carlo Marcelo Arenas Belón f28e82602d
ci: windows support (#105)
Still barebones and only to serve as a starting point and guideline for
how to integrate mostly non autotools environments.

Selects Intel 32-bit specifically as it is the one that has been tested
the most and also has the less number of warnings.

Test should be improved further so it is at least equivalent to what is
done in Linux, but that is orthogonal to having it integrated, and the
tests that were disabled would work locally (albeit in a newer version),
so this at least does the minimum to prevent regressions by validating
both the interpreter and JIT.

Co-authored-by: PhilipHazel <Philip.Hazel@gmail.com>
2022-04-22 10:07:12 +01:00
Philip Hazel 1bb2b97b29 Update build workflow to add test in an Alpine container 2022-04-22 09:31:05 +01:00
Lucas Trzesniewski 3fec24a26f Add a GitHub Actions build workflow (#19) 2022-04-20 08:43:44 +01:00
Philip Hazel 66b3cb34df More GitHub URL updates 2022-04-19 17:44:47 +01:00
Philip Hazel 29a43aa11d Update README to new GitHub organization URL 2022-04-19 17:39:59 +01:00
92 changed files with 8871 additions and 6322 deletions

3
.bazelrc Normal file
@ -0,0 +1,3 @@
common --experimental_enable_bzlmod
build --incompatible_enable_cc_toolchain_resolution
build --incompatible_strict_action_env

77
.github/workflows/build.yml vendored Normal file
@ -0,0 +1,77 @@
name: Build
on: [push, pull_request]
jobs:
linux:
name: Linux
runs-on: ubuntu-latest
steps:
- name: Checkout
uses: actions/checkout@v2
- name: Autogen
run: ./autogen.sh
- name: Configure
run: ./configure --enable-jit --enable-pcre2-8 --enable-pcre2-16 --enable-pcre2-32
- name: Build
run: make
- name: Test (main test script)
run: ./RunTest
- name: Test (JIT test program)
run: ./pcre2_jit_test
- name: Test (pcre2grep test script)
run: ./RunGrepTest
alpine:
name: alpine
runs-on: ubuntu-latest
container: alpine
steps:
- name: Checkout
uses: actions/checkout@v2
- name: Autotools
run: apk add --no-cache automake autoconf gcc libtool make musl-dev
- name: Autogen
run: ./autogen.sh
- name: Configure
run: ./configure --enable-jit --enable-pcre2-8 --enable-pcre2-16 --enable-pcre2-32
- name: Build
run: make
- name: Test (main test script)
run: ./RunTest
- name: Test (JIT test program)
run: ./pcre2_jit_test
- name: Test (pcre2grep test script)
run: ./RunGrepTest
windows:
name: 32bit Windows
runs-on: windows-latest
steps:
- name: Checkout
uses: actions/checkout@v2
- name: Configure
run: cmake -DPCRE2_SUPPORT_JIT=ON -DPCRE2_BUILD_PCRE2_16=ON -DPCRE2_BUILD_PCRE2_32=ON -B build -A Win32
- name: Build
run: cmake --build build
- name: Test
run: |
cd build\Debug
..\..\RunTest.bat

73
.github/workflows/codeql.yml vendored Normal file
@ -0,0 +1,73 @@
# For most projects, this workflow file will not need changing; you simply need
# to commit it to your repository.
#
# You may wish to alter this file to override the set of languages analyzed,
# or to provide custom queries or build logic.
#
# ******** NOTE ********
# We have attempted to detect the languages in your repository. Please check
# the `language` matrix defined below to confirm you have the correct set of
# supported CodeQL languages.
#
name: "CodeQL"
on:
push:
branches: [ master ]
pull_request:
# The branches below must be a subset of the branches above
branches: [ master ]
schedule:
- cron: '27 6 * * 4'
# Declare default permissions as read only.
permissions: read-all
jobs:
analyze:
name: Analyze
runs-on: ubuntu-latest
permissions:
actions: read
contents: read
security-events: write
strategy:
fail-fast: false
matrix:
language: [ 'cpp', 'python' ]
# CodeQL supports [ 'cpp', 'csharp', 'go', 'java', 'javascript', 'python', 'ruby' ]
# Learn more about CodeQL language support at https://git.io/codeql-language-support
steps:
- name: Checkout repository
uses: actions/checkout@v2
# Initializes the CodeQL tools for scanning.
- name: Initialize CodeQL
uses: github/codeql-action/init@v1
with:
languages: ${{ matrix.language }}
# If you wish to specify custom queries, you can do so here or in a config file.
# By default, queries listed here will override any specified in a config file.
# Prefix the list here with "+" to use these queries and those in the config file.
# queries: ./path/to/local/query, your-org/your-repo/queries@main
# Autobuild attempts to build any compiled languages (C/C++, C#, or Java).
# If this step fails, then you should remove it and run the build manually (see below)
- name: Autobuild
uses: github/codeql-action/autobuild@v1
# Command-line programs to run using the OS shell.
# 📚 https://git.io/JvXDl
# ✏️ If the Autobuild fails above, remove it and uncomment the following three lines
# and modify them (or add more) to build your code if your project
# uses a compiled language
#- run: |
# make bootstrap
# make release
- name: Perform CodeQL Analysis
uses: github/codeql-action/analyze@v1

55
.github/workflows/scorecards.yml vendored Normal file
@ -0,0 +1,55 @@
name: Scorecards supply-chain security
on:
# Only the default branch is supported.
branch_protection_rule:
schedule:
- cron: '23 17 * * 1'
push:
branches: [ master ]
# Declare default permissions as read only.
permissions: read-all
jobs:
analysis:
name: Scorecards analysis
runs-on: ubuntu-latest
permissions:
# Needed to upload the results to code-scanning dashboard.
security-events: write
actions: read
contents: read
steps:
- name: "Checkout code"
uses: actions/checkout@ec3a7ce113134d7a93b817d10a8272cb61118579 # v2.4.0
with:
persist-credentials: false
- name: "Run analysis"
uses: ossf/scorecard-action@c1aec4ac820532bab364f02a81873c555a0ba3a1 # v1.0.4
with:
results_file: results.sarif
results_format: sarif
# Read-only PAT token. To create it,
# follow the steps in https://github.com/ossf/scorecard-action#pat-token-creation.
repo_token: ${{ secrets.SCORECARD_READ_TOKEN }}
# Publish the results to enable scorecard badges. For more details, see
# https://github.com/ossf/scorecard-action#publishing-results.
# For private repositories, `publish_results` will automatically be set to `false`,
# regardless of the value entered here.
publish_results: true
# Upload the results as artifacts (optional).
- name: "Upload artifact"
uses: actions/upload-artifact@82c141cc518b40d92cc801eee768e7aafc9c2fa2 # v2.3.1
with:
name: SARIF file
path: results.sarif
retention-days: 5
# Upload the results to GitHub's code scanning dashboard.
- name: "Upload to code-scanning"
uses: github/codeql-action/upload-sarif@5f532563584d71fdef14ee64d17bafb34f751ce5 # v1.0.26
with:
sarif_file: results.sarif

4
.gitignore vendored
@ -6,6 +6,7 @@
*.pc
*.o
*~
*.lha
__pycache__
.deps
@ -75,4 +76,7 @@ src/pcre2.h
src/pcre2_chartables.c
src/stamp-h1
/bazel-*
# End

72
BUILD.bazel Normal file
@ -0,0 +1,72 @@
load("@rules_cc//cc:defs.bzl", "cc_binary", "cc_library")
load("@bazel_skylib//rules:copy_file.bzl", "copy_file")
copy_file(
name = "config_h_generic",
src = "src/config.h.generic",
out = "src/config.h",
)
copy_file(
name = "pcre2_h_generic",
src = "src/pcre2.h.generic",
out = "src/pcre2.h",
)
copy_file(
name = "pcre2_chartables_c",
src = "src/pcre2_chartables.c.dist",
out = "src/pcre2_chartables.c",
)
cc_library(
name = "pcre2",
srcs = [
"src/pcre2_auto_possess.c",
"src/pcre2_compile.c",
"src/pcre2_config.c",
"src/pcre2_context.c",
"src/pcre2_convert.c",
"src/pcre2_dfa_match.c",
"src/pcre2_error.c",
"src/pcre2_extuni.c",
"src/pcre2_find_bracket.c",
"src/pcre2_maketables.c",
"src/pcre2_match.c",
"src/pcre2_match_data.c",
"src/pcre2_newline.c",
"src/pcre2_ord2utf.c",
"src/pcre2_pattern_info.c",
"src/pcre2_script_run.c",
"src/pcre2_serialize.c",
"src/pcre2_string_utils.c",
"src/pcre2_study.c",
"src/pcre2_substitute.c",
"src/pcre2_substring.c",
"src/pcre2_tables.c",
"src/pcre2_ucd.c",
"src/pcre2_ucptables.c",
"src/pcre2_valid_utf.c",
"src/pcre2_xclass.c",
":pcre2_chartables_c",
],
hdrs = glob(["src/*.h"]) + [
":config_h_generic",
":pcre2_h_generic",
],
defines = [
"HAVE_CONFIG_H",
"PCRE2_CODE_UNIT_WIDTH=8",
"PCRE2_STATIC",
],
includes = ["src"],
strip_include_prefix = "src",
visibility = ["//visibility:public"],
)
cc_binary(
name = "pcre2demo",
srcs = ["src/pcre2demo.c"],
visibility = ["//visibility:public"],
deps = [":pcre2"],
)

View File

@ -103,8 +103,8 @@
PROJECT(PCRE2 C)
# Increased minimum to 2.8.5 to support GNUInstallDirs.
# Increased minimum to 3.0.0 because older than 2.8.12 is deprecated.
CMAKE_MINIMUM_REQUIRED(VERSION 3.0.0)
# Increased minimum to 3.1 to support imported targets.
CMAKE_MINIMUM_REQUIRED(VERSION 3.1)
# Set policy CMP0026 to avoid warnings for the use of LOCATION in
# GET_TARGET_PROPERTY. This should no longer be required.
@ -382,7 +382,13 @@ IF(PCRE2_SUPPORT_UNICODE)
ENDIF(PCRE2_SUPPORT_UNICODE)
IF(PCRE2_SUPPORT_JIT)
SET(SUPPORT_JIT 1)
SET(SUPPORT_JIT 1)
IF(UNIX)
FIND_PACKAGE(Threads REQUIRED)
IF(CMAKE_USE_PTHREADS_INIT)
SET(REQUIRE_PTHREAD 1)
ENDIF(CMAKE_USE_PTHREADS_INIT)
ENDIF(UNIX)
ENDIF(PCRE2_SUPPORT_JIT)
IF(PCRE2_SUPPORT_JIT_SEALLOC)
@ -652,6 +658,8 @@ IF(MINGW AND BUILD_SHARED_LIBS)
ENDIF(MINGW AND BUILD_SHARED_LIBS)
IF(MSVC AND BUILD_SHARED_LIBS)
SET(dll_pdb_files ${PROJECT_BINARY_DIR}/pcre2-posix.pdb ${dll_pdb_files})
SET(dll_pdb_debug_files ${PROJECT_BINARY_DIR}/pcre2-posixd.pdb ${dll_pdb_debug_files})
IF (EXISTS ${PROJECT_SOURCE_DIR}/pcre2.rc)
SET(PCRE2_SOURCES ${PCRE2_SOURCES} pcre2.rc)
ENDIF(EXISTS ${PROJECT_SOURCE_DIR}/pcre2.rc)
@ -697,6 +705,10 @@ IF(PCRE2_BUILD_PCRE2_8)
VERSION ${LIBPCRE2_8_VERSION}
SOVERSION ${LIBPCRE2_8_SOVERSION})
TARGET_COMPILE_DEFINITIONS(pcre2-8-static PUBLIC PCRE2_STATIC)
TARGET_INCLUDE_DIRECTORIES(pcre2-8-static PUBLIC ${PROJECT_BINARY_DIR})
IF(REQUIRE_PTHREAD)
TARGET_LINK_LIBRARIES(pcre2-8-static Threads::Threads)
ENDIF(REQUIRE_PTHREAD)
SET(targets ${targets} pcre2-8-static)
ADD_LIBRARY(pcre2-posix-static STATIC ${PCRE2POSIX_HEADERS} ${PCRE2POSIX_SOURCES})
SET_TARGET_PROPERTIES(pcre2-posix-static PROPERTIES
@ -707,6 +719,7 @@ IF(PCRE2_BUILD_PCRE2_8)
SOVERSION ${LIBPCRE2_POSIX_SOVERSION})
TARGET_LINK_LIBRARIES(pcre2-posix-static pcre2-8-static)
TARGET_COMPILE_DEFINITIONS(pcre2-posix-static PUBLIC PCRE2_STATIC)
TARGET_INCLUDE_DIRECTORIES(pcre2-posix-static PUBLIC ${PROJECT_BINARY_DIR})
SET(targets ${targets} pcre2-posix-static)
IF(MSVC)
@ -723,6 +736,7 @@ IF(PCRE2_BUILD_PCRE2_8)
IF(BUILD_SHARED_LIBS)
ADD_LIBRARY(pcre2-8-shared SHARED ${PCRE2_HEADERS} ${PCRE2_SOURCES} ${PROJECT_BINARY_DIR}/config.h)
TARGET_INCLUDE_DIRECTORIES(pcre2-8-shared PUBLIC ${PROJECT_BINARY_DIR})
SET_TARGET_PROPERTIES(pcre2-8-shared PROPERTIES
COMPILE_DEFINITIONS PCRE2_CODE_UNIT_WIDTH=8
MACHO_COMPATIBILITY_VERSION "${LIBPCRE2_8_MACHO_COMPATIBILITY_VERSION}"
@ -730,8 +744,12 @@ IF(PCRE2_BUILD_PCRE2_8)
VERSION ${LIBPCRE2_8_VERSION}
SOVERSION ${LIBPCRE2_8_SOVERSION}
OUTPUT_NAME pcre2-8)
IF(REQUIRE_PTHREAD)
TARGET_LINK_LIBRARIES(pcre2-8-shared Threads::Threads)
ENDIF(REQUIRE_PTHREAD)
SET(targets ${targets} pcre2-8-shared)
ADD_LIBRARY(pcre2-posix-shared SHARED ${PCRE2POSIX_HEADERS} ${PCRE2POSIX_SOURCES})
TARGET_INCLUDE_DIRECTORIES(pcre2-posix-shared PUBLIC ${PROJECT_BINARY_DIR})
SET_TARGET_PROPERTIES(pcre2-posix-shared PROPERTIES
COMPILE_DEFINITIONS PCRE2_CODE_UNIT_WIDTH=8
MACHO_COMPATIBILITY_VERSION "${LIBPCRE2_POSIX_MACHO_COMPATIBILITY_VERSION}"
@ -741,6 +759,8 @@ IF(PCRE2_BUILD_PCRE2_8)
OUTPUT_NAME pcre2-posix)
TARGET_LINK_LIBRARIES(pcre2-posix-shared pcre2-8-shared)
SET(targets ${targets} pcre2-posix-shared)
SET(dll_pdb_files ${PROJECT_BINARY_DIR}/pcre2-8.pdb ${dll_pdb_files})
SET(dll_pdb_debug_files ${PROJECT_BINARY_DIR}/pcre2-8d.pdb ${dll_pdb_debug_files})
IF(MINGW)
IF(NON_STANDARD_LIB_PREFIX)
@ -766,6 +786,7 @@ ENDIF(PCRE2_BUILD_PCRE2_8)
IF(PCRE2_BUILD_PCRE2_16)
IF(BUILD_STATIC_LIBS)
ADD_LIBRARY(pcre2-16-static STATIC ${PCRE2_HEADERS} ${PCRE2_SOURCES} ${PROJECT_BINARY_DIR}/config.h)
TARGET_INCLUDE_DIRECTORIES(pcre2-16-static PUBLIC ${PROJECT_BINARY_DIR})
SET_TARGET_PROPERTIES(pcre2-16-static PROPERTIES
COMPILE_DEFINITIONS PCRE2_CODE_UNIT_WIDTH=16
MACHO_COMPATIBILITY_VERSION "${LIBPCRE2_32_MACHO_COMPATIBILITY_VERSION}"
@ -773,6 +794,9 @@ IF(PCRE2_BUILD_PCRE2_16)
VERSION ${LIBPCRE2_16_VERSION}
SOVERSION ${LIBPCRE2_16_SOVERSION})
TARGET_COMPILE_DEFINITIONS(pcre2-16-static PUBLIC PCRE2_STATIC)
IF(REQUIRE_PTHREAD)
TARGET_LINK_LIBRARIES(pcre2-16-static Threads::Threads)
ENDIF(REQUIRE_PTHREAD)
SET(targets ${targets} pcre2-16-static)
IF(MSVC)
@ -787,6 +811,7 @@ IF(PCRE2_BUILD_PCRE2_16)
IF(BUILD_SHARED_LIBS)
ADD_LIBRARY(pcre2-16-shared SHARED ${PCRE2_HEADERS} ${PCRE2_SOURCES} ${PROJECT_BINARY_DIR}/config.h)
TARGET_INCLUDE_DIRECTORIES(pcre2-16-shared PUBLIC ${PROJECT_BINARY_DIR})
SET_TARGET_PROPERTIES(pcre2-16-shared PROPERTIES
COMPILE_DEFINITIONS PCRE2_CODE_UNIT_WIDTH=16
MACHO_COMPATIBILITY_VERSION "${LIBPCRE2_32_MACHO_COMPATIBILITY_VERSION}"
@ -794,7 +819,12 @@ IF(PCRE2_BUILD_PCRE2_16)
VERSION ${LIBPCRE2_16_VERSION}
SOVERSION ${LIBPCRE2_16_SOVERSION}
OUTPUT_NAME pcre2-16)
IF(REQUIRE_PTHREAD)
TARGET_LINK_LIBRARIES(pcre2-16-shared Threads::Threads)
ENDIF(REQUIRE_PTHREAD)
SET(targets ${targets} pcre2-16-shared)
SET(dll_pdb_files ${PROJECT_BINARY_DIR}/pcre2-16.pdb ${dll_pdb_files})
SET(dll_pdb_debug_files ${PROJECT_BINARY_DIR}/pcre2-16d.pdb ${dll_pdb_debug_files})
IF(MINGW)
IF(NON_STANDARD_LIB_PREFIX)
@ -818,6 +848,7 @@ ENDIF(PCRE2_BUILD_PCRE2_16)
IF(PCRE2_BUILD_PCRE2_32)
IF(BUILD_STATIC_LIBS)
ADD_LIBRARY(pcre2-32-static STATIC ${PCRE2_HEADERS} ${PCRE2_SOURCES} ${PROJECT_BINARY_DIR}/config.h)
TARGET_INCLUDE_DIRECTORIES(pcre2-32-static PUBLIC ${PROJECT_BINARY_DIR})
SET_TARGET_PROPERTIES(pcre2-32-static PROPERTIES
COMPILE_DEFINITIONS PCRE2_CODE_UNIT_WIDTH=32
MACHO_COMPATIBILITY_VERSION "${LIBPCRE2_32_MACHO_COMPATIBILITY_VERSION}"
@ -825,6 +856,9 @@ IF(PCRE2_BUILD_PCRE2_32)
VERSION ${LIBPCRE2_32_VERSION}
SOVERSION ${LIBPCRE2_32_SOVERSION})
TARGET_COMPILE_DEFINITIONS(pcre2-32-static PUBLIC PCRE2_STATIC)
IF(REQUIRE_PTHREAD)
TARGET_LINK_LIBRARIES(pcre2-32-static Threads::Threads)
ENDIF(REQUIRE_PTHREAD)
SET(targets ${targets} pcre2-32-static)
IF(MSVC)
@ -839,6 +873,7 @@ IF(PCRE2_BUILD_PCRE2_32)
IF(BUILD_SHARED_LIBS)
ADD_LIBRARY(pcre2-32-shared SHARED ${PCRE2_HEADERS} ${PCRE2_SOURCES} ${PROJECT_BINARY_DIR}/config.h)
TARGET_INCLUDE_DIRECTORIES(pcre2-32-shared PUBLIC ${PROJECT_BINARY_DIR})
SET_TARGET_PROPERTIES(pcre2-32-shared PROPERTIES
COMPILE_DEFINITIONS PCRE2_CODE_UNIT_WIDTH=32
MACHO_COMPATIBILITY_VERSION "${LIBPCRE2_32_MACHO_COMPATIBILITY_VERSION}"
@ -846,7 +881,12 @@ IF(PCRE2_BUILD_PCRE2_32)
VERSION ${LIBPCRE2_32_VERSION}
SOVERSION ${LIBPCRE2_32_SOVERSION}
OUTPUT_NAME pcre2-32)
IF(REQUIRE_PTHREAD)
TARGET_LINK_LIBRARIES(pcre2-32-shared Threads::Threads)
ENDIF(REQUIRE_PTHREAD)
SET(targets ${targets} pcre2-32-shared)
SET(dll_pdb_files ${PROJECT_BINARY_DIR}/pcre2-32.pdb ${dll_pdb_files})
SET(dll_pdb_debug_files ${PROJECT_BINARY_DIR}/pcre2-32d.pdb ${dll_pdb_debug_files})
IF(MINGW)
IF(NON_STANDARD_LIB_PREFIX)
@ -1053,18 +1093,8 @@ INSTALL(FILES ${man3} DESTINATION man/man3)
INSTALL(FILES ${html} DESTINATION share/doc/pcre2/html)
IF(MSVC AND INSTALL_MSVC_PDB)
INSTALL(FILES ${PROJECT_BINARY_DIR}/pcre2-8.pdb
${PROJECT_BINARY_DIR}/pcre2-16.pdb
${PROJECT_BINARY_DIR}/pcre2-32.pdb
${PROJECT_BINARY_DIR}/pcre2-posix.pdb
DESTINATION bin
CONFIGURATIONS RelWithDebInfo)
INSTALL(FILES ${PROJECT_BINARY_DIR}/pcre2-8d.pdb
${PROJECT_BINARY_DIR}/pcre2-16d.pdb
${PROJECT_BINARY_DIR}/pcre2-32d.pdb
${PROJECT_BINARY_DIR}/pcre2-posixd.pdb
DESTINATION bin
CONFIGURATIONS Debug)
INSTALL(FILES ${dll_pdb_files} DESTINATION bin CONFIGURATIONS RelWithDebInfo)
INSTALL(FILES ${dll_pdb_debug_files} DESTINATION bin CONFIGURATIONS Debug)
ENDIF(MSVC AND INSTALL_MSVC_PDB)
# Help, only for nice output

View File

@ -1,5 +1,55 @@
Change Log for PCRE2
--------------------
Change Log for PCRE2 - see also the Git log
-------------------------------------------
Version 10.41 xx-xxx-2022
-------------------------
1. Add fflush() before and after a fork callout in pcre2grep to get its output
to be the same on all systems. (There were previously ordering differences in
Alpine Linux).
2. Merged patch from @carenas (GitHub #110) for pthreads support in CMake.
3. SSF scorecards grumbled about possible overflow in an expression in
pcre2test. It never would have overflowed in practice, but some casts have been
added and at the same time there's been some tidying of fprintf() calls that output
size_t values.
4. PR #94 showed up an unused enum in pcre2_convert.c, which is now removed.
5. Minor code re-arrangement to remove gcc warning about realloc() in
pcre2test.
6. Change a number of int variables that hold buffer and line lengths in
pcre2grep to PCRE2_SIZE (aka size_t).
7. Added an #ifdef to cut out a call to PRIV(jit_free) when JIT is not
supported (even though that function would do nothing in that case) at the
request of a user who doesn't even want to link with pcre2_jit_compile.o. Also
tidied up an untidy #ifdef arrangement in pcre2test.
8. Fixed an issue in the backtracking optimization of character repeats in
JIT. Furthermore optimize star repetitions, not just plus repetitions.
9. Removed the use of an initial backtracking frames vector on the system stack
in pcre2_match() so that it now always uses the heap. (In a multi-thread
environment with very small stacks there had been an issue.) This also is
tidier for JIT matching, which didn't need that vector. The heap vector is now
remembered in the match data block and re-used if that block itself is re-used.
It is freed with the match data block.
10. Adjusted the find_limits code in pcre2test to work with change 9 above.
11. Added find_limits_noheap to pcre2test, because the heap limits are now
different in different environments and so cannot be included in the standard
tests.
12. Created a test for pcre2_match() heap processing that is not part of the
tests run by 'make check', but can be run manually. The current output is from
a 64-bit system.
13. Implemented -Z aka --null in pcre2grep.
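Change 9 above can be pictured with a small standalone sketch. This is not the PCRE2 source: the type `demo_match_data` and the functions `demo_match_data_create`, `demo_get_frames`, and `demo_match_data_free` are hypothetical names invented for illustration. The sketch only shows the idea the entry describes: a heap vector that is remembered in the match data block, re-used when the block is re-used, and freed together with the block.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a match data block (not the real pcre2_match_data). */
typedef struct demo_match_data {
  void  *heapframes;       /* remembered heap vector, NULL until first needed */
  size_t heapframes_size;  /* current size of that vector in bytes */
} demo_match_data;

/* Create an empty block; no frames vector is allocated yet. */
static demo_match_data *demo_match_data_create(void)
{
  return calloc(1, sizeof(demo_match_data));
}

/* Return a frames vector of at least `needed` bytes. The remembered vector is
   re-used when it is already big enough; otherwise it is grown. Returns NULL
   if the allocation fails, leaving the old vector in place. */
static void *demo_get_frames(demo_match_data *md, size_t needed)
{
  if (md->heapframes_size < needed)
    {
    void *p = realloc(md->heapframes, needed);
    if (p == NULL) return NULL;
    md->heapframes = p;
    md->heapframes_size = needed;
    }
  return md->heapframes;
}

/* Free the block and the frames vector it remembers. */
static void demo_match_data_free(demo_match_data *md)
{
  if (md == NULL) return;
  free(md->heapframes);
  free(md);
}
```

In this sketch a second call to `demo_get_frames()` with a smaller size returns the same pointer as the first, which is the re-use behaviour the ChangeLog entry refers to.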
Version 10.40 15-April-2022
@ -92,7 +142,7 @@ pattern, the optimizing "must be present for a match" character check was not
being flagged as caseless, causing some matches that should have succeeded to
fail.
23. Fixed a unicode properrty matching issue in JIT. The character was not
23. Fixed a unicode property matching issue in JIT. The character was not
fully read in caseless matching.
24. Fixed an issue affecting recursions in JIT caused by duplicated data
@ -119,10 +169,10 @@ Version 10.39 29-October-2021
honoured if chosen.
prtdiff_t is signed, so use a signed type instead, and make sure
that an appropiate width is chosen if pointers are 64bit wide and
that an appropriate width is chosen if pointers are 64bit wide and
long is not (ex: Windows 64bit).
IMHO removing the cast (and therefore the positibilty of truncation)
IMHO removing the cast (and therefore the possibility of truncation)
make the code cleaner and the fallback is likely portable enough
with all 64-bit POSIX systems doing LP64 except for Windows.
@ -173,7 +223,7 @@ Version 10.38 01-October-2021
-----------------------------
1. Fix invalid single character repetition issues in JIT when the repetition
is inside a capturing bracket and the bracket is preceeded by character
is inside a capturing bracket and the bracket is preceded by character
literals.
2. Installed revised CMake configuration files provided by Jan-Willem Blokland.
@ -413,7 +463,7 @@ now correctly backtracked, so this unnecessary restriction has been removed.
7. Added PCRE2_SUBSTITUTE_MATCHED.
8. Added (?* and (?<* as synonms for (*napla: and (*naplb: to match another
8. Added (?* and (?<* as synonyms for (*napla: and (*naplb: to match another
regex engine. The Perl regex folks are aware of this usage and have made a note
about it.
@ -844,7 +894,7 @@ Patch by Guillem Jover.
warnings were reported.
38. Using the clang compiler with sanitizing options causes runtime complaints
about truncation for statments such as x = ~x when x is an 8-bit value; it
about truncation for statements such as x = ~x when x is an 8-bit value; it
seems to compute ~x as a 32-bit value. Changing such statements to x = 255 ^ x
gets rid of the warnings. There were also two missing casts in pcre2test.
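The effect described in item 38 can be reproduced in isolation. The following is a minimal sketch, not code from PCRE2; the function names are hypothetical. In C the operand of `~` undergoes integer promotion, so for an 8-bit `x` the expression `~x` is an `int` with bits set above bit 7, and storing it back into the 8-bit variable narrows the value. The expression `255 ^ x` always stays in the range 0 to 255.

```c
#include <assert.h>
#include <stdint.h>

/* ~x is evaluated after x has been promoted to int, so the result is a
   negative int (for example ~0x0F is -16), not an 8-bit value. */
static int promoted_complement(uint8_t x)
{
  return ~x;
}

/* 255 ^ x flips the low eight bits only; the result already fits in 8 bits,
   so the cast here does not discard anything. */
static uint8_t flip_bits(uint8_t x)
{
  return (uint8_t)(255 ^ x);
}
```

Both forms give the same 8-bit result once the complement is narrowed; the difference is only that the second never produces a value that has to be truncated.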

65
HACKING
@ -8,8 +8,8 @@ library is referred to as PCRE1 below. For information about testing PCRE2, see
the pcre2test documentation and the comment at the head of the RunTest file.
PCRE1 releases were up to 8.3x when PCRE2 was developed, and later bug fix
releases remain in the 8.xx series. PCRE2 releases started at 10.00 to avoid
confusion with PCRE1.
releases carried on the 8.xx series, up to the final 8.45 release. PCRE2
releases started at 10.00 to avoid confusion with PCRE1.
Historical note 1
@ -38,8 +38,8 @@ Historical note 2
By contrast, the code originally written by Henry Spencer (which was
subsequently heavily modified for Perl) compiles the expression twice: once in
a dummy mode in order to find out how much store will be needed, and then for
real. (The Perl version probably doesn't do this any more; I'm talking about
the original library.) The execution function operates by backtracking and
real. (The Perl version may or may not still do this; I'm talking about the
original library.) The execution function operates by backtracking and
maximizing (or, optionally, minimizing, in Perl) the amount of the subject that
matches individual wild portions of the pattern. This is an "NFA algorithm" in
Friedl's terminology.
@ -151,8 +151,8 @@ of code units in the item itself. The exception is the aforementioned large
advance to check for such values. When auto-callouts are enabled, the generous
assumption is made that there will be a callout for each pattern code unit
(which of course is only actually true if all code units are literals) plus one
at the end. There is a default parsed pattern vector on the system stack, but
if this is not big enough, heap memory is used.
at the end. A default parsed pattern vector is defined on the system stack, to
minimize memory handling, but if this is not big enough, heap memory is used.
As before, the actual compiling function is run twice, the first time to
determine the amount of memory needed for the final compiled pattern. It
@ -187,7 +187,7 @@ META_CLASS_EMPTY [] empty class - only with PCRE2_ALLOW_EMPTY_CLASS
META_CLASS_EMPTY_NOT [^] negative empty class - ditto
META_CLASS_END ] end of non-empty class
META_CLASS_NOT [^ start non-empty negative class
META_COMMIT (*COMMIT)
META_COMMIT (*COMMIT) - no argument (see below for with argument)
META_COND_ASSERT (?(?assertion)
META_DOLLAR $ metacharacter
META_DOT . metacharacter
@ -201,18 +201,18 @@ META_NOCAPTURE (?: no capture parens
META_PLUS +
META_PLUS_PLUS ++
META_PLUS_QUERY +?
META_PRUNE (*PRUNE) - no argument
META_PRUNE (*PRUNE) - no argument (see below for with argument)
META_QUERY ?
META_QUERY_PLUS ?+
META_QUERY_QUERY ??
META_RANGE_ESCAPED hyphen in class range with at least one escape
META_RANGE_LITERAL hyphen in class range defined literally
META_SKIP (*SKIP) - no argument
META_THEN (*THEN) - no argument
META_SKIP (*SKIP) - no argument (see below for with argument)
META_THEN (*THEN) - no argument (see below for with argument)
The two RANGE values occur only in character classes. They are positioned
between two literals that define the start and end of the range. In an EBCDIC
evironment it is necessary to know whether either of the range values was
environment it is necessary to know whether either of the range values was
specified as an escape. In an ASCII/Unicode environment the distinction is not
relevant.
@ -229,17 +229,16 @@ If the data for META_ALT is non-zero, it is inside a lookbehind, and the data
is the length of its branch, for which OP_REVERSE must be generated.
META_BACKREF, META_CAPTURE, and META_RECURSE have the capture group number as
their data in the lower 16 bits of the element.
their data in the lower 16 bits of the element. META_RECURSE is followed by an
offset, for use in error messages.
META_BACKREF is followed by an offset if the back reference group number is 10
or more. The offsets of the first ocurrences of references to groups whose
or more. The offsets of the first occurrences of references to groups whose
numbers are less than 10 are put in cb->small_ref_offset[] (only the first
occurrence is useful). On 64-bit systems this avoids using more than two parsed
pattern elements for items such as \3. The offset is used when an error occurs
because the reference is to a non-existent group.
META_RECURSE is always followed by an offset, for use in error messages.
META_ESCAPE has an ESC_xxx value as its data. For ESC_P and ESC_p, the next
element contains the 16-bit type and data property values, packed together.
ESC_g and ESC_k are used only for named references - numerical ones are turned
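The "data in the lower 16 bits" layout mentioned in this hunk can be sketched as follows. This is an illustration under stated assumptions, not the definitions from pcre2_compile.c: the value `DEMO_META_CAPTURE` and the helper names are hypothetical, and the sketch assumes the META code occupies the upper 16 bits of a 32-bit element.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical META code value, chosen only for this example. */
#define DEMO_META_CAPTURE 0x80080000u

/* Combine a META code (assumed to sit in the upper 16 bits) with a capture
   group number in the lower 16 bits. */
static uint32_t demo_meta_make(uint32_t code, uint32_t group)
{
  return code | (group & 0x0000ffffu);
}

/* Extract the code part of a parsed pattern element. */
static uint32_t demo_meta_code(uint32_t element)
{
  return element & 0xffff0000u;
}

/* Extract the data part (here, the capture group number). */
static uint32_t demo_meta_data(uint32_t element)
{
  return element & 0x0000ffffu;
}
```

With this layout a single 32-bit element carries both the item type and its group number, so only items that need more information, such as an offset, require a following element.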
@ -291,9 +290,9 @@ META_LOOKBEHIND (?<= start of lookbehind
META_LOOKBEHIND_NA (*naplb: start of non-atomic lookbehind
META_LOOKBEHINDNOT (?<! start of negative lookbehind
The following are followed by two elements, the minimum and maximum. Repeat
values are limited to 65535 (MAX_REPEAT). A maximum value of "unlimited" is
represented by UNLIMITED_REPEAT, which is bigger than MAX_REPEAT:
The following are followed by two elements, the minimum and maximum. The
maximum value is limited to 65535 (MAX_REPEAT). A maximum value of "unlimited"
is represented by UNLIMITED_REPEAT, which is bigger than MAX_REPEAT:
META_MINMAX {n,m} repeat
META_MINMAX_PLUS {n,m}+ repeat
@ -347,11 +346,11 @@ support is not available for this kind of matching.
Changeable options
------------------
The /i, /m, or /s options (PCRE2_CASELESS, PCRE2_MULTILINE, PCRE2_DOTALL, and
others) may be changed in the middle of patterns by items such as (?i). Their
processing is handled entirely at compile time by generating different opcodes
for the different settings. The runtime functions do not need to keep track of
an option's state.
The /i, /m, or /s options (PCRE2_CASELESS, PCRE2_MULTILINE, PCRE2_DOTALL) and
some others may be changed in the middle of patterns by items such as (?i).
Their processing is handled entirely at compile time by generating different
opcodes for the different settings. The runtime functions do not need to keep
track of an option's state.
PCRE2_DUPNAMES, PCRE2_EXTENDED, PCRE2_EXTENDED_MORE, and PCRE2_NO_AUTO_CAPTURE
are tracked and processed during the parsing pre-pass. The others are handled
@ -437,7 +436,7 @@ Backtracking control verbs
--------------------------
Verbs with no arguments generate opcodes with no following data (as listed
in the section above).
in the section above).
(*MARK:NAME) generates OP_MARK followed by the mark name, preceded by a
length in one code unit, and followed by a binary zero. The name length is
@ -468,8 +467,8 @@ Caseless matching (positive or negative) of characters that have more than two
case-equivalent code points (which is possible only in UTF mode) is handled by
compiling a Unicode property item (see below), with the pseudo-property
PT_CLIST. The value of this property is an offset in a vector called
"ucd_caseless_sets" which identifies the start of a short list of equivalent
characters, terminated by the value NOTACHAR (0xffffffff).
"ucd_caseless_sets" which identifies the start of a short list of case
equivalent characters, terminated by the value NOTACHAR (0xffffffff).
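A list terminated by NOTACHAR, addressed by an offset into one vector, can be walked as in the sketch below. The table contents and the names `demo_caseless_sets` and `demo_in_caseless_set` are hypothetical and are not taken from the PCRE2 sources; only the terminator value 0xffffffff comes from the text above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_NOTACHAR 0xffffffffu

/* Hypothetical vector holding two short lists, each ended by DEMO_NOTACHAR.
   List at offset 0: K, k, KELVIN SIGN. List at offset 4: S, s, LONG S. */
static const uint32_t demo_caseless_sets[] = {
  0x004b, 0x006b, 0x212a, DEMO_NOTACHAR,
  0x0053, 0x0073, 0x017f, DEMO_NOTACHAR
};

/* Return 1 if character c is in the list that starts at `offset`, else 0.
   The offset plays the role of the property value described above. */
static int demo_in_caseless_set(size_t offset, uint32_t c)
{
  const uint32_t *p = demo_caseless_sets + offset;
  while (*p != DEMO_NOTACHAR)
    {
    if (*p == c) return 1;
    p++;
    }
  return 0;
}
```

Because each list carries its own terminator, no separate length needs to be stored; the offset alone identifies the set.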
Repeating single characters
@ -546,9 +545,9 @@ Each is followed by two code units that encode the desired property as a type
and a value. The types are a set of #defines of the form PT_xxx, and the values
are enumerations of the form ucp_xx, defined in the pcre2_ucp.h source file.
The value is relevant only for PT_GC (General Category), PT_PC (Particular
Category), PT_SC (Script), PT_BIDICL (Bidi Class), and the pseudo-property
PT_CLIST, which is used to identify a list of case-equivalent characters when
there are three or more.
Category), PT_SC (Script), PT_BIDICL (Bidi Class), PT_BOOL (Boolean property),
and the pseudo-property PT_CLIST, which is used to identify a list of
case-equivalent characters when there are three or more (see above).
Repeats of these items use the OP_TYPESTAR etc. set of opcodes, followed by
three code units: OP_PROP or OP_NOTPROP, and then the desired property type and
@ -666,9 +665,9 @@ a count that immediately follows the offset.
There are several opcodes that mark the end of a subpattern group. OP_KET is
used for subpatterns that do not repeat indefinitely, OP_KETRMIN and
OP_KETRMAX are used for indefinite repetitions, minimally or maximally
respectively, and OP_KETRPOS for possessive repetitions (see below for more
respectively, and OP_KETRPOS for possessive repetitions (see below for more
details). All four are followed by a LINK_SIZE value giving (as a positive
number) the offset back to the matching bracket opcode.
number) the offset back to the matching opening bracket opcode.
If a subpattern is quantified such that it is permitted to match zero times, it
is preceded by one of OP_BRAZERO, OP_BRAMINZERO, or OP_SKIPZERO. These are
@ -719,7 +718,7 @@ Assertions
Forward assertions are also just like other subpatterns, but starting with one
of the opcodes OP_ASSERT, OP_ASSERT_NA (non-atomic assertion), or
OP_ASSERT_NOT. Backward assertions use the opcodes OP_ASSERTBACK,
OP_ASSERT_NOT. Backward assertions use the opcodes OP_ASSERTBACK,
OP_ASSERTBACK_NA, and OP_ASSERTBACK_NOT, and the first opcode inside the
assertion is OP_REVERSE, followed by a count of the number of characters to
move back the pointer in the subject string. In ASCII or UTF-32 mode, the count
@ -828,4 +827,4 @@ not a real opcode, but is used to check at compile time that tables indexed by
opcode are the correct length, in order to catch updating errors.
Philip Hazel
December 2021
April 2022

8
MODULE.bazel Normal file

@ -0,0 +1,8 @@
module(
name = "pcre2",
version = "10.40",
compatibility_level = 1,
)
bazel_dep(name = "rules_cc", version = "0.0.1")
bazel_dep(name = "bazel_skylib", version = "1.2.1")


@ -452,9 +452,10 @@ EXTRA_DIST += \
src/sljit/sljitNativePPC_32.c \
src/sljit/sljitNativePPC_64.c \
src/sljit/sljitNativePPC_common.c \
src/sljit/sljitNativeRISCV_32.c \
src/sljit/sljitNativeRISCV_64.c \
src/sljit/sljitNativeRISCV_common.c \
src/sljit/sljitNativeS390X.c \
src/sljit/sljitNativeSPARC_32.c \
src/sljit/sljitNativeSPARC_common.c \
src/sljit/sljitNativeX86_32.c \
src/sljit/sljitNativeX86_64.c \
src/sljit/sljitNativeX86_common.c \

271
Makefile.os4 Normal file

@ -0,0 +1,271 @@
#
# Project: pcre2
#
# Created on: 10-01-2022 22:01:46
#
# commands to use:
# make -f Makefile.os4 libpcre2.a
# make -f Makefile.os4 libpcre2-posix.a
# make -f Makefile.os4 pcre2test
# sh RunTest
# make -f Makefile.os4 clean
#
###################################################################
##
##//// Objects
##
###################################################################
libpcre2_OBJ := \
src/pcre2_chartables.o src/pcre2_auto_possess.o src/pcre2_compile.o \
src/pcre2_config.o src/pcre2_context.o src/pcre2_convert.o \
src/pcre2_dfa_match.o src/pcre2_error.o src/pcre2_extuni.o \
src/pcre2_find_bracket.o src/pcre2_jit_compile.o src/pcre2_maketables.o \
src/pcre2_match.o src/pcre2_match_data.o src/pcre2_newline.o \
src/pcre2_ord2utf.o src/pcre2_pattern_info.o src/pcre2_script_run.o \
src/pcre2_serialize.o src/pcre2_string_utils.o src/pcre2_study.o \
src/pcre2_substitute.o src/pcre2_substring.o src/pcre2_tables.o \
src/pcre2_ucd.o src/pcre2_valid_utf.o src/pcre2_xclass.o
pcre2posix_OBJ := \
src/pcre2posix.o
pcre2test_OBJ := \
src/pcre2test.o
pcre2grep_OBJ := \
src/pcre2grep.o
###################################################################
##
##//// Variables and Environment
##
###################################################################
MCRT := -mcrt=newlib
ifeq ($(USE_CLIB2), yes)
MCRT := -mcrt=clib2
endif
CC := gcc:bin/gcc
INCPATH := -I. -Isrc
# for pcre2test
CFLAGS := $(MCRT) $(INCPATH) -O2 -DHAVE_CONFIG_H -DPCRE2_CODE_UNIT_WIDTH=8
###################################################################
##
##//// General rules
##
###################################################################
.PHONY: all all-before all-after clean clean-custom realclean
all: all-before libpcre2.a libpcre2-posix.a all-after
all-before:
# You can add rules here to execute before the project is built
all-after:
# You can add rules here to execute after the project is built
tests: pcre2test pcre2grep
clean: clean-custom
@echo "Cleaning compiler objects..."
@rm -f $(libpcre2_OBJ) $(pcre2posix_OBJ) $(pcre2test_OBJ) $(pcre2grep_OBJ)
cleanall: clean
@echo "Cleaning compiler targets..."
@rm -f libpcre2.a libpcre2-posix.a pcre2test pcre2grep
###################################################################
##
##//// Targets
##
###################################################################
libpcre2.a: $(libpcre2_OBJ)
ar -rcs libpcre2.a $(libpcre2_OBJ)
ranlib libpcre2.a
libpcre2-posix.a: $(pcre2posix_OBJ)
ar -rcs libpcre2-posix.a $(pcre2posix_OBJ)
ranlib libpcre2-posix.a
pcre2test: libpcre2.a libpcre2-posix.a $(pcre2test_OBJ)
@echo "Linking pcre2test"
@gcc:bin/gcc $(MCRT) -o pcre2test $(pcre2test_OBJ) -L. -lauto -lpcre2 -lpcre2-posix
@echo "Removing stale debug target: pcre2test"
@rm -f pcre2test.debug
pcre2grep: libpcre2.a $(pcre2grep_OBJ)
@echo "Linking pcre2grep"
@gcc:bin/gcc $(MCRT) -o pcre2grep $(pcre2grep_OBJ) -L . -lauto -lpcre2
@echo "Removing stale debug target: pcre2grep"
@rm -f pcre2grep.debug
###################################################################
##
##//// Standard rules
##
###################################################################
# A default rule to make all the objects listed below
# because we are hiding compiler commands from the output
.c.o:
@echo "Compiling $<"
@$(CC) -c $< -o $*.o $(CFLAGS)
src/pcre2_chartables.o: src/pcre2_chartables.c src/config.h src/pcre2_internal.h \
src/pcre2.h src/pcre2_ucp.h
src/pcre2_auto_possess.o: src/pcre2_auto_possess.c src/config.h src/pcre2_internal.h \
src/pcre2.h src/pcre2_ucp.h
src/pcre2_compile.o: src/pcre2_compile.c src/config.h src/pcre2_internal.h \
src/pcre2.h src/pcre2_ucp.h src/pcre2_intmodedep.h
src/pcre2_config.o: src/pcre2_config.c src/config.h src/pcre2_internal.h \
src/pcre2.h src/pcre2_ucp.h
src/pcre2_context.o: src/pcre2_context.c src/config.h src/pcre2_internal.h \
src/pcre2.h src/pcre2_ucp.h
src/pcre2_convert.o: src/pcre2_convert.c src/config.h src/pcre2_internal.h \
src/pcre2.h src/pcre2_ucp.h
src/pcre2_dfa_match.o: src/pcre2_dfa_match.c src/config.h src/pcre2_internal.h \
src/pcre2.h src/pcre2_ucp.h
src/pcre2_error.o: src/pcre2_error.c src/config.h src/pcre2_internal.h \
src/pcre2.h src/pcre2_ucp.h
src/pcre2_extuni.o: src/pcre2_extuni.c src/config.h src/pcre2_internal.h \
src/pcre2.h src/pcre2_ucp.h
src/pcre2_find_bracket.o: src/pcre2_find_bracket.c src/config.h src/pcre2_internal.h \
src/pcre2.h src/pcre2_ucp.h
src/pcre2_jit_compile.o: src/pcre2_jit_compile.c src/config.h src/pcre2_internal.h \
src/pcre2.h src/pcre2_ucp.h src/pcre2_intmodedep.h \
src/sljit/sljitLir.c src/sljit/sljitLir.h src/sljit/sljitConfig.h \
src/sljit/sljitConfigInternal.h src/sljit/sljitUtils.c src/sljit/sljitProtExecAllocator.c \
src/sljit/sljitWXExecAllocator.c src/sljit/sljitExecAllocator.c src/pcre2_jit_simd_inc.h \
src/pcre2_jit_neon_inc.h src/pcre2_jit_match.c
src/pcre2_maketables.o: src/pcre2_maketables.c
src/pcre2_match.o: src/pcre2_match.c src/config.h src/pcre2_internal.h \
src/pcre2.h src/pcre2_ucp.h
src/pcre2_match_data.o: src/pcre2_match_data.c src/config.h src/pcre2_internal.h \
src/pcre2.h src/pcre2_ucp.h
src/pcre2_newline.o: src/pcre2_newline.c src/config.h src/pcre2_internal.h \
src/pcre2.h src/pcre2_ucp.h
src/pcre2_ord2utf.o: src/pcre2_ord2utf.c src/config.h src/pcre2_internal.h \
src/pcre2.h src/pcre2_ucp.h
src/pcre2_pattern_info.o: src/pcre2_pattern_info.c src/config.h src/pcre2_internal.h \
src/pcre2.h src/pcre2_ucp.h
src/pcre2_script_run.o: src/pcre2_script_run.c src/config.h src/pcre2_internal.h \
src/pcre2.h src/pcre2_ucp.h
src/pcre2_serialize.o: src/pcre2_serialize.c src/config.h src/pcre2_internal.h \
src/pcre2.h src/pcre2_ucp.h
src/pcre2test.o: src/pcre2test.c src/config.h src/pcre2.h \
src/pcre2posix.h src/pcre2_internal.h src/pcre2_ucp.h \
src/pcre2_intmodedep.h src/pcre2_tables.c src/pcre2_ucptables.c \
src/pcre2_ucd.c src/pcre2_printint.c
src/pcre2_string_utils.o: src/pcre2_string_utils.c src/config.h src/pcre2_internal.h \
src/pcre2.h src/pcre2_ucp.h
src/pcre2_study.o: src/pcre2_study.c src/config.h src/pcre2_internal.h \
src/pcre2.h src/pcre2_ucp.h
src/pcre2_substitute.o: src/pcre2_substitute.c src/config.h src/pcre2_internal.h \
src/pcre2.h src/pcre2_ucp.h
src/pcre2_substring.o: src/pcre2_substring.c src/config.h src/pcre2_internal.h \
src/pcre2.h src/pcre2_ucp.h
src/pcre2posix.o: src/pcre2posix.c src/config.h src/pcre2.h
src/pcre2_tables.o: src/pcre2_tables.c src/config.h src/pcre2_internal.h \
src/pcre2.h src/pcre2_ucp.h src/pcre2_intmodedep.h
src/pcre2_ucd.o: src/pcre2_ucd.c src/config.h src/pcre2_internal.h \
src/pcre2.h src/pcre2_ucp.h
src/pcre2_valid_utf.o: src/pcre2_valid_utf.c src/config.h src/pcre2_internal.h \
src/pcre2.h src/pcre2_ucp.h
src/pcre2_xclass.o: src/pcre2_xclass.c src/config.h src/pcre2_internal.h \
src/pcre2.h src/pcre2_ucp.h
src/pcre2grep.o: src/pcre2grep.c src/config.h
###################################################################
##
##//// Custom rules
##
###################################################################
runtests: libpcre2.a libpcre2-posix.a tests
sh RunTest
sh RunGrepTest
release:
@echo "Create release folders..."
@mkdir -p release/local/newlib/lib release/local/clib2/lib release/local/Documentation/pcre2 release/local/common/include
@echo "Building newlib based libraries..."
@make -f Makefile.os4 all
@cp libpcre2.a release/local/newlib/lib/
@cp libpcre2-posix.a release/local/newlib/lib/
@echo "Clean build and libraries files..."
@make -f Makefile.os4 cleanall
@echo "Building clib2 based libraries..."
@make -f Makefile.os4 all USE_CLIB2=yes
@cp libpcre2.a release/local/clib2/lib/
@cp libpcre2-posix.a release/local/clib2/lib/
@echo "Copy the necessary files..."
@cp src/pcre2.h release/local/common/include/
@cp src/pcre2posix.h release/local/common/include/
@cp COPYING release/local/Documentation/pcre2/
@cp HACKING release/local/Documentation/pcre2/
@cp LICENCE release/local/Documentation/pcre2/
@cp README release/local/Documentation/pcre2/
@cp README-OS4.md release/local/Documentation/pcre2/
@echo "Clean build and libraries files..."
@make -f Makefile.os4 cleanall
@echo "Creating the lha release file..."
@rm -f pcre2.lha
@lha -aeqr3 a pcre2.lha release/
@rm -rf release
###################################################################


@ -121,6 +121,7 @@ environment, for example.
pcre2_substring.c
pcre2_tables.c
pcre2_ucd.c
pcre2_ucptables.c
pcre2_valid_utf.c
pcre2_xclass.c
@ -306,7 +307,7 @@ cache can be deleted by selecting "File > Delete Cache".
3. Create a new, empty build directory, preferably a subdirectory of the
source dir. For example, C:\pcre2\pcre2-xx\build.
4. Run cmake-gui from the Shell envirornment of your build tool, for example,
4. Run cmake-gui from the Shell environment of your build tool, for example,
Msys for Msys/MinGW or Visual Studio Command Prompt for VC/VC++. Do not try
to start Cmake from the Windows Start menu, as this can lead to errors.
@ -373,7 +374,7 @@ Otherwise:
1. Copy RunTest.bat into the directory where pcre2test.exe and pcre2grep.exe
have been created.
2. Edit RunTest.bat to indentify the full or relative location of
2. Edit RunTest.bat to identify the full or relative location of
the pcre2 source (wherein which the testdata folder resides), e.g.:
set srcdir=C:\pcre2\pcre2-10.00

17
README

@ -8,7 +8,7 @@ features, and the internals have been improved. The original PCRE1 library is
now obsolete and no longer maintained. The latest release of PCRE2 is available
in .tar.gz, tar.bz2, or .zip form from this GitHub repository:
https://github.com/PhilipHazel/pcre2/releases
https://github.com/PCRE2Project/pcre2/releases
There is a mailing list for discussion about the development of PCRE2 at
pcre2-dev@googlegroups.com. You can subscribe by sending an email to
@ -17,7 +17,7 @@ pcre2-dev+subscribe@googlegroups.com.
You can access the archives and also subscribe or manage your subscription
here:
https://groups.google.com/pcre2-dev
https://groups.google.com/g/pcre2-dev
Please read the NEWS file if you are upgrading from a previous release. The
contents of this README file are:
@ -375,7 +375,8 @@ library. They are also documented in the pcre2build man page.
necessary to specify something like LIBS="-lncurses" as well. This is
because, to quote the readline INSTALL, "Readline uses the termcap functions,
but does not link with the termcap or curses library itself, allowing
applications which link with readline the to choose an appropriate library."
applications which link with readline the option to choose an appropriate
library."
If you get error messages about missing functions tgetstr, tgetent, tputs,
tgetflag, or tgoto, this is the problem, and linking with the ncurses library
should fix it.
@ -400,10 +401,10 @@ library. They are also documented in the pcre2build man page.
Setting --enable-fuzz-support also causes a binary called pcre2fuzzcheck to
be created. This is normally run under valgrind or used when PCRE2 is
compiled with address sanitizing enabled. It calls the fuzzing function and
outputs information about it is doing. The input strings are specified by
arguments: if an argument starts with "=" the rest of it is a literal input
string. Otherwise, it is assumed to be a file name, and the contents of the
file are the test string.
outputs information about what it is doing. The input strings are specified
by arguments: if an argument starts with "=" the rest of it is a literal
input string. Otherwise, it is assumed to be a file name, and the contents
of the file are the test string.
. Releases before 10.30 could be compiled with --disable-stack-for-recursion,
which caused pcre2_match() to use individual blocks on the heap for
@ -695,7 +696,7 @@ Test 14 contains some special UTF and UCP tests that give different output for
different code unit widths.
Test 15 contains a number of tests that must not be run with JIT. They check,
among other non-JIT things, the match-limiting features of the intepretive
among other non-JIT things, the match-limiting features of the interpretive
matcher.
Test 16 is run only when JIT support is not available. It checks that an

39
README-OS4.md Normal file

@ -0,0 +1,39 @@
PCRE2 (Perl-compatible regular expression library)
---------------------------------------------------------------------------
This is a port of PCRE2 10.40 by Philip Hazel for AmigaOS 4, as found at the
GitHub repository https://github.com/PCRE2Project/pcre2
More information about PCRE can be found at its official website,
https://www.pcre.org, and in the documentation that comes with this
package.
The archive includes both newlib and clib2 libraries. It has been
tested with various applications, but if you find any issues, please
contact me.
To install it into your AmigaOS 4 SDK installation, just extract all the
files into the SDK: path.
Compile
--------------------------
The source and the changes I made can be found in my personal repository
https://git.walkero.gr/walkero/pcre2
You can compile it using the Makefile.os4 file, and produce the libraries
yourself.
* with newlib run:
```bash
make -f Makefile.os4 all
```
* with clib2 run:
```bash
make -f Makefile.os4 all USE_CLIB2=yes
```
Changelog
--------------------------
v10.40r1 - 2022-07-31
* First release


@ -14,14 +14,14 @@ flexible API, the code of PCRE2 has been much improved since the fork.
## Download
As well as downloading from the
[GitHub site](https://github.com/PhilipHazel/pcre2), you can download PCRE2
[GitHub site](https://github.com/PCRE2Project/pcre2), you can download PCRE2
or the older, unmaintained PCRE1 library from an
[*unofficial* mirror](https://sourceforge.net/projects/pcre/files/) at SourceForge.
You can check out the PCRE2 source code via Git or Subversion:
git clone https://github.com/PhilipHazel/pcre2.git
svn co https://github.com/PhilipHazel/pcre2.git
git clone https://github.com/PCRE2Project/pcre2.git
svn co https://github.com/PCRE2Project/pcre2.git
## Contributed Ports
@ -36,7 +36,7 @@ default character encoding, can be found at
## Documentation
You can read the PCRE2 documentation
[here](https://philiphazel.github.io/pcre2/doc/html/index.html).
[here](https://PCRE2Project.github.io/pcre2/doc/html/index.html).
Comparisons to Perl's regular expression semantics can be found in the
community authored Wikipedia entry for PCRE.


@ -68,6 +68,22 @@ diff -b /dev/null /dev/null 2>/dev/null && cf="diff -b"
diff -u /dev/null /dev/null 2>/dev/null && cf="diff -u"
diff -ub /dev/null /dev/null 2>/dev/null && cf="diff -ub"
# Some tests involve NUL characters. It seems impossible to handle them easily
# in many operating systems. An earlier version of this script used sed to
# translate NUL into the string ZERO, but this didn't work on Solaris (aka
# SunOS), where the version of sed explicitly doesn't like them, and also MacOS
# (Darwin), OpenBSD, FreeBSD, NetBSD, and some Linux distributions like Alpine,
# even when using GNU sed. A user suggested using tr instead, which
# necessitates translating to a single character. However, on (some versions
# of?) Solaris, the normal "tr" cannot handle binary zeros, but if
# /usr/xpg4/bin/tr is available, it can do so, so test for that.
if [ -x /usr/xpg4/bin/tr ] ; then
tr=/usr/xpg4/bin/tr
else
tr=tr
fi
# If this test is being run from "make check", $srcdir will be set. If not, set
# it to the current or parent directory, whichever one contains the test data.
# Subsequently, we run most of the pcre2grep tests in the source directory so
@ -685,6 +701,16 @@ echo "---------------------------- Test 134 -----------------------------" >>tes
(cd $srcdir; $valgrind $vjs $pcre2grep -m1 -O '=$x{41}$x423$o{103}$o1045=' 'fox') <$srcdir/testdata/grepinputv >>testtrygrep 2>&1
echo "RC=$?" >>testtrygrep
echo "---------------------------- Test 135 -----------------------------" >>testtrygrep
(cd $srcdir; $valgrind $vjs $pcre2grep -HZ 'word' ./testdata/grepinputv) | $tr '\000' '@' >>testtrygrep
echo "RC=$?" >>testtrygrep
(cd $srcdir; $valgrind $vjs $pcre2grep -lZ 'word' ./testdata/grepinputv ./testdata/grepinputv) | $tr '\000' '@' >>testtrygrep
echo "RC=$?" >>testtrygrep
(cd $srcdir; $valgrind $vjs $pcre2grep -A 1 -B 1 -HZ 'word' ./testdata/grepinputv) | $tr '\000' '@' >>testtrygrep
echo "RC=$?" >>testtrygrep
(cd $srcdir; $valgrind $vjs $pcre2grep -MHZn 'start[\s]+end' testdata/grepinputM) >>testtrygrep
echo "RC=$?" >>testtrygrep
# Now compare the results.
$cf $srcdir/testdata/grepoutput testtrygrep
@ -759,22 +785,6 @@ $valgrind $vjs $pcre2grep -n --newline=any "^(abc|def|ghi|jkl)" testNinputgrep >
printf '%c--------------------------- Test N6 ------------------------------\r\n' - >>testtrygrep
$valgrind $vjs $pcre2grep -n --newline=anycrlf "^(abc|def|ghi|jkl)" testNinputgrep >>testtrygrep
# This next test involves NUL characters. It seems impossible to handle them
# easily in many operating systems. An earlier version of this script used sed
# to translate NUL into the string ZERO, but this didn't work on Solaris (aka
# SunOS), where the version of sed explicitly doesn't like them, and also MacOS
# (Darwin), OpenBSD, FreeBSD, NetBSD, and some Linux distributions like Alpine,
# even when using GNU sed. A user suggested using tr instead, which
# necessitates translating to a single character (@). However, on (some
# versions of?) Solaris, the normal "tr" cannot handle binary zeros, but if
# /usr/xpg4/bin/tr is available, it can do so, so test for that.
if [ -x /usr/xpg4/bin/tr ] ; then
tr=/usr/xpg4/bin/tr
else
tr=tr
fi
printf '%c--------------------------- Test N7 ------------------------------\r\n' - >>testtrygrep
printf 'abc\0def' >testNinputgrep
$valgrind $vjs $pcre2grep -na --newline=nul "^(abc|def)" testNinputgrep | $tr '\000' '@' >>testtrygrep
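
Reviewer note on the NUL handling in this script: the `tr` selection that was moved to the top of the file, and the `$tr '\000' '@'` translation applied in Tests 135 and N7, can be tried in isolation. The following is a minimal sketch, assuming a POSIX shell; the function name `show_nul` is made up for illustration and does not appear in RunGrepTest.

```sh
#!/bin/sh
# Prefer the XPG4 tr where it exists (some Solaris versions), because the
# default tr there may not handle binary zeros. Otherwise use plain tr.
if [ -x /usr/xpg4/bin/tr ]; then
  tr=/usr/xpg4/bin/tr
else
  tr=tr
fi

# Hypothetical helper: make NUL bytes visible by translating each one to '@',
# so that output containing them can be compared as ordinary text.
show_nul() {
  $tr '\000' '@'
}

# A string with an embedded NUL, similar to the input used by Test N7.
printf 'abc\000def\n' | show_nul    # prints: abc@def
```

Translating to a single visible character keeps the expected-output file free of binary zeros, which is why the script pipes through `$tr` rather than comparing raw output.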

35
RunTest

@ -17,8 +17,16 @@
# individual test numbers, ranges of tests such as 3-6 or 3- (meaning 3 to the
# end), or a number preceded by ~ to exclude a test. For example, "3-15 ~10"
# runs tests 3 to 15, excluding test 10, and just "~10" runs all the tests
# except test 10. Whatever order the arguments are in, the tests are always run
# in numerical order.
# except test 10. Whatever order the arguments are in, these tests are always
# run in numerical order.
#
# If no specific tests are selected (which is the case when this script is run
# via 'make check') the default is to run all the numbered tests.
#
# There may also be named (as well as numbered) tests for special purposes. At
# present there is just one, called "heap". This test's output contains the
# sizes of heap frames and frame vectors, which depend on the environment. It
# is therefore not run unless explicitly requested.
#
# Inappropriate tests are automatically skipped (with a comment to say so). For
# example, if JIT support is not compiled, test 16 is skipped, whereas if JIT
@ -82,6 +90,7 @@ title24="Test 24: Non-UTF pattern conversion tests"
title25="Test 25: UTF pattern conversion tests"
title26="Test 26: Auto-generated unicode property tests"
maxtest=26
titleheap="Test 'heap': Environment-specific heap tests"
if [ $# -eq 1 -a "$1" = "list" ]; then
echo $title0
@ -111,6 +120,11 @@ if [ $# -eq 1 -a "$1" = "list" ]; then
echo $title24
echo $title25
echo $title26
echo ""
echo $titleheap
echo ""
echo "Numbered tests are automatically run if nothing selected."
echo "Named tests must be explicitly selected."
exit 0
fi
@ -241,6 +255,7 @@ do23=no
do24=no
do25=no
do26=no
doheap=no
while [ $# -gt 0 ] ; do
case $1 in
@ -271,6 +286,7 @@ while [ $# -gt 0 ] ; do
24) do24=yes;;
25) do25=yes;;
26) do26=yes;;
heap) doheap=yes;;
-8) arg8=yes;;
-16) arg16=yes;;
-32) arg32=yes;;
@ -412,8 +428,8 @@ if [ $jit -ne 0 -a "$nojit" != "yes" ] ; then
fi
fi
# If no specific tests were requested, select all. Those that are not
# relevant will be automatically skipped.
# If no specific tests were requested, select all the numbered tests. Those
# that are not relevant will be automatically skipped.
if [ $do0 = no -a $do1 = no -a $do2 = no -a $do3 = no -a \
$do4 = no -a $do5 = no -a $do6 = no -a $do7 = no -a \
@ -421,7 +437,7 @@ if [ $do0 = no -a $do1 = no -a $do2 = no -a $do3 = no -a \
$do12 = no -a $do13 = no -a $do14 = no -a $do15 = no -a \
$do16 = no -a $do17 = no -a $do18 = no -a $do19 = no -a \
$do20 = no -a $do21 = no -a $do22 = no -a $do23 = no -a \
$do24 = no -a $do25 = no -a $do26 = no \
$do24 = no -a $do25 = no -a $do26 = no -a $doheap = no \
]; then
do0=yes
do1=yes
@ -882,6 +898,15 @@ for bmode in "$test8" "$test16" "$test32"; do
fi
fi
# Manually selected heap tests - output may vary in different environments,
# which is why they are not automatically run.
if [ $doheap = yes ] ; then
echo $titleheap
$sim $valgrind ./pcre2test -q $setstack $bmode $testdata/testinputheap testtry
checkresult $? heap-$bits ""
fi
# End of loop for 8/16/32-bit tests
done
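
Reviewer note on the selection logic changed above: numbered tests are the default when nothing is selected, while the named "heap" test runs only on request. The following is a reduced sketch of that behaviour, assuming just two numbered tests; the function `select_tests` and its output format are hypothetical and are not part of RunTest.

```sh
#!/bin/sh
# Hypothetical reduction of the RunTest argument handling to two numbered
# tests plus the named "heap" test.
select_tests() {
  do1=no
  do2=no
  doheap=no
  for arg in "$@"; do
    case $arg in
      1) do1=yes;;
      2) do2=yes;;
      heap) doheap=yes;;
    esac
  done
  # If nothing at all was selected, default to the numbered tests only.
  # Including doheap in this check means "heap" alone does not trigger
  # the default, so it runs without the numbered tests.
  if [ "$do1" = no ] && [ "$do2" = no ] && [ "$doheap" = no ]; then
    do1=yes
    do2=yes
  fi
  echo "1=$do1 2=$do2 heap=$doheap"
}

select_tests          # prints: 1=yes 2=yes heap=no
select_tests heap     # prints: 1=no 2=no heap=yes
select_tests 2 heap   # prints: 1=no 2=yes heap=yes
```

Adding `$doheap = no` to the "nothing selected" condition is what keeps `RunTest heap` from also running every numbered test.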


@ -135,9 +135,9 @@ if "%all%" == "yes" (
set do7=yes
set do8=yes
set do9=yes
set do10=yes
set do10=no
set do11=yes
set do12=yes
set do12=no
set do13=yes
set do14=yes
set do15=yes

1
WORKSPACE.bazel Normal file

@ -0,0 +1 @@
# See MODULE.bazel


@ -9,9 +9,9 @@ dnl The PCRE2_PRERELEASE feature is for identifying release candidates. It might
dnl be defined as -RC2, for example. For real releases, it should be empty.
m4_define(pcre2_major, [10])
m4_define(pcre2_minor, [40])
m4_define(pcre2_minor, [41])
m4_define(pcre2_prerelease, [])
m4_define(pcre2_date, [2022-04-14])
m4_define(pcre2_date, [2022-xx-xx])
# Libtool shared library interface versions (current:revision:age)
m4_define(libpcre2_8_version, [11:0:11])


@ -121,6 +121,7 @@ environment, for example.
pcre2_substring.c
pcre2_tables.c
pcre2_ucd.c
pcre2_ucptables.c
pcre2_valid_utf.c
pcre2_xclass.c
@ -306,7 +307,7 @@ cache can be deleted by selecting "File > Delete Cache".
3. Create a new, empty build directory, preferably a subdirectory of the
source dir. For example, C:\pcre2\pcre2-xx\build.
4. Run cmake-gui from the Shell envirornment of your build tool, for example,
4. Run cmake-gui from the Shell environment of your build tool, for example,
Msys for Msys/MinGW or Visual Studio Command Prompt for VC/VC++. Do not try
to start Cmake from the Windows Start menu, as this can lead to errors.
@ -373,7 +374,7 @@ Otherwise:
1. Copy RunTest.bat into the directory where pcre2test.exe and pcre2grep.exe
have been created.
2. Edit RunTest.bat to indentify the full or relative location of
2. Edit RunTest.bat to identify the full or relative location of
the pcre2 source (wherein which the testdata folder resides), e.g.:
set srcdir=C:\pcre2\pcre2-10.00


@ -8,7 +8,7 @@ features, and the internals have been improved. The original PCRE1 library is
now obsolete and no longer maintained. The latest release of PCRE2 is available
in .tar.gz, tar.bz2, or .zip form from this GitHub repository:
https://github.com/PhilipHazel/pcre2/releases
https://github.com/PCRE2Project/pcre2/releases
There is a mailing list for discussion about the development of PCRE2 at
pcre2-dev@googlegroups.com. You can subscribe by sending an email to
@ -17,7 +17,7 @@ pcre2-dev+subscribe@googlegroups.com.
You can access the archives and also subscribe or manage your subscription
here:
https://groups.google.com/pcre2-dev
https://groups.google.com/g/pcre2-dev
Please read the NEWS file if you are upgrading from a previous release. The
contents of this README file are:
@ -375,7 +375,8 @@ library. They are also documented in the pcre2build man page.
necessary to specify something like LIBS="-lncurses" as well. This is
because, to quote the readline INSTALL, "Readline uses the termcap functions,
but does not link with the termcap or curses library itself, allowing
applications which link with readline the to choose an appropriate library."
applications which link with readline the option to choose an appropriate
library."
If you get error messages about missing functions tgetstr, tgetent, tputs,
tgetflag, or tgoto, this is the problem, and linking with the ncurses library
should fix it.
@ -400,10 +401,10 @@ library. They are also documented in the pcre2build man page.
Setting --enable-fuzz-support also causes a binary called pcre2fuzzcheck to
be created. This is normally run under valgrind or used when PCRE2 is
compiled with address sanitizing enabled. It calls the fuzzing function and
outputs information about it is doing. The input strings are specified by
arguments: if an argument starts with "=" the rest of it is a literal input
string. Otherwise, it is assumed to be a file name, and the contents of the
file are the test string.
outputs information about what it is doing. The input strings are specified
by arguments: if an argument starts with "=" the rest of it is a literal
input string. Otherwise, it is assumed to be a file name, and the contents
of the file are the test string.
. Releases before 10.30 could be compiled with --disable-stack-for-recursion,
which caused pcre2_match() to use individual blocks on the heap for
@ -695,7 +696,7 @@ Test 14 contains some special UTF and UCP tests that give different output for
different code unit widths.
Test 15 contains a number of tests that must not be run with JIT. They check,
among other non-JIT things, the match-limiting features of the intepretive
among other non-JIT things, the match-limiting features of the interpretive
matcher.
Test 16 is run only when JIT support is not available. It checks that an


@ -92,8 +92,18 @@ Additional options may be set in the compile context via the
function.
</P>
<P>
The yield of this function is a pointer to a private data structure that
contains the compiled pattern, or NULL if an error was detected.
If either of <i>errorcode</i> or <i>erroroffset</i> is NULL, the function returns
NULL immediately. Otherwise, the yield of this function is a pointer to a
private data structure that contains the compiled pattern, or NULL if an error
was detected. In the error case, a text error message can be obtained by
passing the value returned via the <i>errorcode</i> argument to the
<b>pcre2_get_error_message()</b> function. The offset (in code units) where the
error was encountered is returned via the <i>erroroffset</i> argument.
</P>
<P>
If there is no error, the value passed via <i>errorcode</i> returns the message
"no error" if passed to <b>pcre2_get_error_message()</b>, and the value passed
via <i>erroroffset</i> is zero.
</P>
<P>
There is a complete description of the PCRE2 native API, with more detail on


@ -48,7 +48,7 @@ the following negative error codes:
PCRE2_ERROR_BADDATA <i>number_of_codes</i> is zero or less
PCRE2_ERROR_BADMAGIC mismatch of id bytes in <i>bytes</i>
PCRE2_ERROR_BADMODE mismatch of variable unit size or PCRE version
PCRE2_ERROR_MEMORY memory allocation failed
PCRE2_ERROR_NOMEMORY memory allocation failed
PCRE2_ERROR_NULL <i>codes</i> or <i>bytes</i> is NULL
</pre>
PCRE2_ERROR_BADMAGIC may mean that the data is corrupt, or that it was compiled


@ -1017,7 +1017,7 @@ has its own memory control arrangements (see the
documentation for more details). If the limit is reached, the negative error
code PCRE2_ERROR_HEAPLIMIT is returned. The default limit can be set when PCRE2
is built; if it is not, the default is set very large and is essentially
"unlimited".
unlimited.
</P>
<P>
A value for the heap limit may also be supplied by an item at the start of a
@ -1030,19 +1030,17 @@ less than the limit set by the caller of <b>pcre2_match()</b> or, if no such
limit is set, less than the default.
</P>
<P>
The <b>pcre2_match()</b> function starts out using a 20KiB vector on the system
stack for recording backtracking points. The more nested backtracking points
there are (that is, the deeper the search tree), the more memory is needed.
Heap memory is used only if the initial vector is too small. If the heap limit
is set to a value less than 21 (in particular, zero) no heap memory will be
used. In this case, only patterns that do not have a lot of nested backtracking
can be successfully processed.
The <b>pcre2_match()</b> function always needs some heap memory, so setting a
value of zero guarantees a "heap limit exceeded" error. Details of how
<b>pcre2_match()</b> uses the heap are given in the
<a href="pcre2perform.html"><b>pcre2perform</b></a>
documentation.
</P>
<P>
Similarly, for <b>pcre2_dfa_match()</b>, a vector on the system stack is used
when processing pattern recursions, lookarounds, or atomic groups, and only if
this is not big enough is heap memory used. In this case, too, setting a value
of zero disables the use of the heap.
For <b>pcre2_dfa_match()</b>, a vector on the system stack is used when
processing pattern recursions, lookarounds, or atomic groups, and only if this
is not big enough is heap memory used. In this case, setting a value of zero
disables the use of the heap.
<br>
<br>
<b>int pcre2_set_match_limit(pcre2_match_context *<i>mcontext</i>,</b>
@ -1089,10 +1087,10 @@ less than the limit set by the caller of <b>pcre2_match()</b> or
<br>
<br>
This parameter limits the depth of nested backtracking in <b>pcre2_match()</b>.
Each time a nested backtracking point is passed, a new memory "frame" is used
Each time a nested backtracking point is passed, a new memory frame is used
to remember the state of matching at that point. Thus, this parameter
indirectly limits the amount of memory that is used in a match. However,
because the size of each memory "frame" depends on the number of capturing
because the size of each memory frame depends on the number of capturing
parentheses, the actual memory limit varies from pattern to pattern. This limit
was more useful in versions before 10.30, where function recursion was used for
backtracking.
@ -1383,8 +1381,7 @@ If <i>errorcode</i> or <i>erroroffset</i> is NULL, <b>pcre2_compile()</b> return
NULL immediately. Otherwise, the variables to which these point are set to an
error code and an offset (number of code units) within the pattern,
respectively, when <b>pcre2_compile()</b> returns NULL because a compilation
error has occurred. The values are not defined when compilation is successful
and <b>pcre2_compile()</b> returns a non-NULL value.
error has occurred.
</P>
<P>
There are nearly 100 positive error codes that <b>pcre2_compile()</b> may return
@ -1399,15 +1396,18 @@ because the textual error messages that are obtained by calling the
message"
<a href="#geterrormessage">below)</a>
should be self-explanatory. Macro names starting with PCRE2_ERROR_ are defined
for both positive and negative error codes in <b>pcre2.h</b>.
for both positive and negative error codes in <b>pcre2.h</b>. When compilation
is successful, <i>errorcode</i> is set to a value that yields the message "no
error" when passed to <b>pcre2_get_error_message()</b>.
</P>
<P>
The value returned in <i>erroroffset</i> is an indication of where in the
pattern the error occurred. It is not necessarily the furthest point in the
pattern that was read. For example, after the error "lookbehind assertion is
not fixed length", the error offset points to the start of the failing
assertion. For an invalid UTF-8 or UTF-16 string, the offset is that of the
first code unit of the failing character.
pattern an error occurred. When there is no error, zero is returned. A non-zero
value is not necessarily the furthest point in the pattern that was read. For
example, after the error "lookbehind assertion is not fixed length", the error
offset points to the start of the failing assertion. For an invalid UTF-8 or
UTF-16 string, the offset is that of the first code unit of the failing
character.
</P>
<P>
Some errors are not detected until the whole pattern has been scanned; in these
@ -3146,11 +3146,11 @@ The backtracking match limit was reached.
<pre>
PCRE2_ERROR_NOMEMORY
</pre>
If a pattern contains many nested backtracking points, heap memory is used to
remember them. This error is given when the memory allocation function (default
or custom) fails. Note that a different error, PCRE2_ERROR_HEAPLIMIT, is given
if the amount of memory needed exceeds the heap limit. PCRE2_ERROR_NOMEMORY is
also returned if PCRE2_COPY_MATCHED_SUBJECT is set and memory allocation fails.
Heap memory is used to remember backtracking points. This error is given when
the memory allocation function (default or custom) fails. Note that a different
error, PCRE2_ERROR_HEAPLIMIT, is given if the amount of memory needed exceeds
the heap limit. PCRE2_ERROR_NOMEMORY is also returned if
PCRE2_COPY_MATCHED_SUBJECT is set and memory allocation fails.
<pre>
PCRE2_ERROR_NULL
</pre>
@ -4018,9 +4018,9 @@ Cambridge, England.
</P>
<br><a name="SEC42" href="#TOC1">REVISION</a><br>
<P>
Last updated: 14 December 2021
Last updated: 27 July 2022
<br>
Copyright &copy; 1997-2021 University of Cambridge.
Copyright &copy; 1997-2022 University of Cambridge.
<br>
<p>
Return to the <a href="index.html">PCRE2 index page</a>.


@ -284,12 +284,11 @@ to the <b>configure</b> command. This setting also applies to the
counting is done differently).
</P>
<P>
The <b>pcre2_match()</b> function starts out using a 20KiB vector on the system
stack to record backtracking points. The more nested backtracking points there
are (that is, the deeper the search tree), the more memory is needed. If the
initial vector is not large enough, heap memory is used, up to a certain limit,
which is specified in kibibytes (units of 1024 bytes). The limit can be changed
at run time, as described in the
The <b>pcre2_match()</b> function uses heap memory to record backtracking
points. The more nested backtracking points there are (that is, the deeper the
search tree), the more memory is needed. There is an upper limit, specified in
kibibytes (units of 1024 bytes). This limit can be changed at run time, as
described in the
<a href="pcre2api.html"><b>pcre2api</b></a>
documentation. The default limit (in effect unlimited) is 20 million. You can
change this by a setting such as
@ -609,16 +608,16 @@ give a warning.
<P>
Philip Hazel
<br>
University Computing Service
Retired from University Computing Service
<br>
Cambridge, England.
<br>
</P>
<br><a name="SEC26" href="#TOC1">REVISION</a><br>
<P>
Last updated: 08 December 2021
Last updated: 27 July 2022
<br>
Copyright &copy; 1997-2021 University of Cambridge.
Copyright &copy; 1997-2022 University of Cambridge.
<br>
<p>
Return to the <a href="index.html">PCRE2 index page</a>.


@ -71,13 +71,15 @@ For example:
<pre>
pcre2grep some-pattern file1 - file3
</pre>
Input files are searched line by line. By default, each line that matches a
By default, input files are searched line by line. Each line that matches a
pattern is copied to the standard output, and if there is more than one file,
the file name is output at the start of each line, followed by a colon.
However, there are options that can change how <b>pcre2grep</b> behaves. In
particular, the <b>-M</b> option makes it possible to search for strings that
span line boundaries. What defines a line boundary is controlled by the
<b>-N</b> (<b>--newline</b>) option.
However, there are options that can change how <b>pcre2grep</b> behaves. For
example, the <b>-M</b> option makes it possible to search for strings that span
line boundaries. What defines a line boundary is controlled by the <b>-N</b>
(<b>--newline</b>) option. The <b>-h</b> and <b>-H</b> options control whether or
not file names are shown, and the <b>-Z</b> option changes the file name
terminator to a zero byte.
</P>
<P>
The amount of memory used for buffering files that are being scanned is
@ -178,9 +180,11 @@ Output up to <i>number</i> lines of context after each matching line. Fewer
lines are output if the next match or the end of the file is reached, or if the
processing buffer size has been set too small. If file names and/or line
numbers are being output, a hyphen separator is used instead of a colon for the
context lines. A line containing "--" is output between each group of lines,
unless they are in fact contiguous in the input file. The value of <i>number</i>
is expected to be relatively small. When <b>-c</b> is used, <b>-A</b> is ignored.
context lines (the <b>-Z</b> option can be used to change the file name
terminator to a zero byte). A line containing "--" is output between each group
of lines, unless they are in fact contiguous in the input file. The value of
<i>number</i> is expected to be relatively small. When <b>-c</b> is used,
<b>-A</b> is ignored.
</P>
<P>
<b>-a</b>, <b>--text</b>
@ -199,9 +203,10 @@ Output up to <i>number</i> lines of context before each matching line. Fewer
lines are output if the previous match or the start of the file is within
<i>number</i> lines, or if the processing buffer size has been set too small. If
file names and/or line numbers are being output, a hyphen separator is used
instead of a colon for the context lines. A line containing "--" is output
between each group of lines, unless they are in fact contiguous in the input
file. The value of <i>number</i> is expected to be relatively small. When
instead of a colon for the context lines (the <b>-Z</b> option can be used to
change the file name terminator to a zero byte). A line containing "--" is
output between each group of lines, unless they are in fact contiguous in the
input file. The value of <i>number</i> is expected to be relatively small. When
<b>-c</b> is used, <b>-B</b> is ignored.
</P>
<P>
@ -411,20 +416,22 @@ shown separately. This option is mutually exclusive with <b>--output</b>,
<P>
<b>-H</b>, <b>--with-filename</b>
Force the inclusion of the file name at the start of output lines when
searching a single file. By default, the file name is not shown in this case.
For matching lines, the file name is followed by a colon; for context lines, a
hyphen separator is used. If a line number is also being output, it follows the
file name. When the <b>-M</b> option causes a pattern to match more than one
line, only the first is preceded by the file name. This option overrides any
previous <b>-h</b>, <b>-l</b>, or <b>-L</b> options.
searching a single file. The file name is not normally shown in this case.
By default, for matching lines, the file name is followed by a colon; for
context lines, a hyphen separator is used. The <b>-Z</b> option can be used to
change the terminator to a zero byte. If a line number is also being output,
it follows the file name. When the <b>-M</b> option causes a pattern to match
more than one line, only the first is preceded by the file name. This option
overrides any previous <b>-h</b>, <b>-l</b>, or <b>-L</b> options.
</P>
<P>
<b>-h</b>, <b>--no-filename</b>
Suppress the output file names when searching multiple files. By default,
file names are shown when multiple files are searched. For matching lines, the
file name is followed by a colon; for context lines, a hyphen separator is used.
If a line number is also being output, it follows the file name. This option
overrides any previous <b>-H</b>, <b>-L</b>, or <b>-l</b> options.
Suppress the output of file names when searching multiple files. File names are
normally shown when multiple files are searched. By default, for matching
lines, the file name is followed by a colon; for context lines, a hyphen
separator is used. The <b>-Z</b> option can be used to change the terminator to
a zero byte. If a line number is also being output, it follows the file name.
This option overrides any previous <b>-H</b>, <b>-L</b>, or <b>-l</b> options.
</P>
<P>
<b>--heap-limit</b>=<i>number</i>
@ -481,18 +488,20 @@ given any number of times. If a directory matches both <b>--include-dir</b> and
<b>-L</b>, <b>--files-without-match</b>
Instead of outputting lines from the files, just output the names of the files
that do not contain any lines that would have been output. Each file name is
output once, on a separate line. This option overrides any previous <b>-H</b>,
<b>-h</b>, or <b>-l</b> options.
output once, on a separate line by default, but if the <b>-Z</b> option is set,
they are separated by zero bytes instead of newlines. This option overrides any
previous <b>-H</b>, <b>-h</b>, or <b>-l</b> options.
</P>
<P>
<b>-l</b>, <b>--files-with-matches</b>
Instead of outputting lines from the files, just output the names of the files
containing lines that would have been output. Each file name is output once, on
a separate line. Searching normally stops as soon as a matching line is found
in a file. However, if the <b>-c</b> (count) option is also used, matching
continues in order to obtain the correct count, and those files that have at
least one match are listed along with their counts. Using this option with
<b>-c</b> is a way of suppressing the listing of files with no matches that
a separate line, but if the <b>-Z</b> option is set, they are separated by zero
bytes instead of newlines. Searching normally stops as soon as a matching line
is found in a file. However, if the <b>-c</b> (count) option is also used,
matching continues in order to obtain the correct count, and those files that
have at least one match are listed along with their counts. Using this option
with <b>-c</b> is a way of suppressing the listing of files with no matches that
occurs with <b>-c</b> on its own. This option overrides any previous <b>-H</b>,
<b>-h</b>, or <b>-L</b> options.
</P>
@ -592,10 +601,7 @@ value set by <b>--match-limit</b> is reached, an error occurs.
<br>
<br>
The <b>--heap-limit</b> option specifies, as a number of kibibytes (units of
1024 bytes), the amount of heap memory that may be used for matching. Heap
memory is needed only if matching the pattern requires a significant number of
nested backtracking points to be remembered. This parameter can be set to zero
to forbid the use of heap memory altogether.
1024 bytes), the maximum amount of heap memory that may be used for matching.
<br>
<br>
The <b>--depth-limit</b> option limits the depth of nested backtracking points,
@ -839,6 +845,13 @@ pattern and ")$" at the end. This option applies only to the patterns that are
matched against the contents of files; it does not apply to patterns specified
by any of the <b>--include</b> or <b>--exclude</b> options.
</P>
<P>
<b>-Z</b>, <b>--null</b>
Terminate file names in the regular output with a zero byte (the NUL
character) instead of what would normally appear. This is useful when file
names contain unusual characters such as colons, hyphens, or even newlines. The
option does not apply to file names in error messages.
</P>
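<P>
As a minimal sketch of how a consumer might read such output, the C function
below splits a buffer of zero-terminated file names, assuming each name is
followed by a zero byte as described above. The function name
split_nul_terminated and its parameters are hypothetical and are not part of
<b>pcre2grep</b> or the PCRE2 library.
</P>

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of PCRE2: scan buf[0..len) and record a
   pointer to the start of each zero-terminated name. At most max_names
   pointers are stored, but the total number of terminated names is returned.
   Trailing bytes that are not followed by a zero byte are ignored. */
size_t split_nul_terminated(const char *buf, size_t len,
                            const char **names, size_t max_names)
{
    size_t count = 0;
    size_t start = 0;

    assert(buf != NULL && names != NULL);

    for (size_t i = 0; i < len; i++) {
        if (buf[i] == '\0') {
            if (count < max_names)
                names[count] = buf + start;  /* name ends at this zero byte */
            count++;
            start = i + 1;                   /* next name begins after it */
        }
    }
    return count;
}
```

Because only the zero byte is treated as a terminator, names that contain
colons, hyphens, or newlines come through unchanged.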
<br><a name="SEC7" href="#TOC1">ENVIRONMENT VARIABLES</a><br>
<P>
The environment variables <b>LC_ALL</b> and <b>LC_CTYPE</b> are examined, in that
@ -1053,9 +1066,9 @@ Cambridge, England.
</P>
<br><a name="SEC16" href="#TOC1">REVISION</a><br>
<P>
Last updated: 31 August 2021
Last updated: 30 July 2022
<br>
Copyright &copy; 1997-2021 University of Cambridge.
Copyright &copy; 1997-2022 University of Cambridge.
<br>
<p>
Return to the <a href="index.html">PCRE2 index page</a>.


@ -71,13 +71,18 @@ is 255 code units for the 8-bit library and 65535 code units for the 16-bit and
The maximum length of a string argument to a callout is the largest number a
32-bit unsigned integer can hold.
</P>
<P>
The maximum amount of heap memory used for matching is controlled by the heap
limit, which can be set in a pattern or in a match context. The default is a
very large number, effectively unlimited.
</P>
<br><b>
AUTHOR
</b><br>
<P>
Philip Hazel
<br>
University Computing Service
Retired from University Computing Service
<br>
Cambridge, England.
<br>
@ -86,9 +91,9 @@ Cambridge, England.
REVISION
</b><br>
<P>
Last updated: 02 February 2019
Last updated: 26 July 2022
<br>
Copyright &copy; 1997-2019 University of Cambridge.
Copyright &copy; 1997-2022 University of Cambridge.
<br>
<p>
Return to the <a href="index.html">PCRE2 index page</a>.


@ -83,12 +83,31 @@ From release 10.30, the interpretive (non-JIT) version of <b>pcre2_match()</b>
uses very little system stack at run time. In earlier releases recursive
function calls could use a great deal of stack, and this could cause problems,
but this usage has been eliminated. Backtracking positions are now explicitly
remembered in memory frames controlled by the code. An initial 20KiB vector of
frames is allocated on the system stack (enough for about 100 frames for small
patterns), but if this is insufficient, heap memory is used. The amount of heap
memory can be limited; if the limit is set to zero, only the initial stack
vector is used. Rewriting patterns to be time-efficient, as described below,
may also reduce the memory requirements.
remembered in memory frames controlled by the code.
</P>
<P>
The size of each frame depends on the size of pointer variables and the number
of capturing parenthesized groups in the pattern being matched. On a 64-bit
system the frame size for a pattern with no captures is 128 bytes. For each
capturing group the size increases by 16 bytes.
</P>
<P>
Until release 10.41, an initial 20KiB frames vector was allocated on the system
stack, but this still caused some issues for multithreaded applications where
each thread has a very small stack. From release 10.41 backtracking memory
frames are always held in heap memory. An initial heap allocation is obtained
the first time any match data block is passed to <b>pcre2_match()</b>. This is
remembered with the match data block and re-used if that block is used for
another match. It is freed when the match data block itself is freed.
</P>
<P>
The size of the initial block is the larger of 20KiB or ten times the pattern's
frame size, unless the heap limit is less than this, in which case the heap
limit is used. If the initial block proves to be too small during matching, it
is replaced by a larger block, subject to the heap limit. The heap limit is
checked only when a new block is to be allocated. Reducing the heap limit
between calls to <b>pcre2_match()</b> with the same match data block does not
affect the saved block.
</P>
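<P>
The sizing rules above can be sketched in C as follows. This is only an
illustration of the documented arithmetic, assuming a 64-bit system; the
names frame_size_64bit, initial_block_size, and INITIAL_BLOCK_MIN are
hypothetical and do not appear in the PCRE2 source, which may round or adjust
sizes differently.
</P>

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical constant: the documented 20KiB minimum for the initial block. */
#define INITIAL_BLOCK_MIN ((uint64_t)20 * 1024)

/* Documented frame size on a 64-bit system: 128 bytes with no captures,
   plus 16 bytes for each capturing group. */
uint64_t frame_size_64bit(uint64_t capture_groups)
{
    return 128 + 16 * capture_groups;
}

/* Documented initial block size: the larger of 20KiB or ten times the frame
   size, unless the heap limit (given in kibibytes) is smaller than that, in
   which case the heap limit is used. */
uint64_t initial_block_size(uint64_t frame_size, uint64_t heap_limit_kib)
{
    assert(frame_size > 0);

    uint64_t wanted = 10 * frame_size;
    if (wanted < INITIAL_BLOCK_MIN)
        wanted = INITIAL_BLOCK_MIN;

    uint64_t limit_bytes = heap_limit_kib * 1024;  /* kibibytes to bytes */
    return (limit_bytes < wanted) ? limit_bytes : wanted;
}
```

For example, a pattern with three capturing groups has a 176-byte frame, so
with the default heap limit of 20 million kibibytes the initial block under
this rule is 20KiB; with a heap limit of 10 it would be 10KiB.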
<P>
In contrast to <b>pcre2_match()</b>, <b>pcre2_dfa_match()</b> does use recursive
@ -245,16 +264,16 @@ pattern to match. This is done by repeatedly matching with different limits.
<P>
Philip Hazel
<br>
University Computing Service
Retired from University Computing Service
<br>
Cambridge, England.
<br>
</P>
<br><a name="SEC6" href="#TOC1">REVISION</a><br>
<P>
Last updated: 03 February 2019
Last updated: 27 July 2022
<br>
Copyright &copy; 1997-2019 University of Cambridge.
Copyright &copy; 1997-2022 University of Cambridge.
<br>
<p>
Return to the <a href="index.html">PCRE2 index page</a>.


@ -94,7 +94,7 @@ of serialized patterns, or one of the following negative error codes:
<pre>
PCRE2_ERROR_BADDATA the number of patterns is zero or less
PCRE2_ERROR_BADMAGIC mismatch of id bytes in one of the patterns
PCRE2_ERROR_MEMORY memory allocation failed
PCRE2_ERROR_NOMEMORY memory allocation failed
PCRE2_ERROR_MIXEDTABLES the patterns do not all use the same tables
PCRE2_ERROR_NULL the 1st, 3rd, or 4th argument is NULL
</pre>


@ -1241,7 +1241,8 @@ pattern, but can be overridden by modifiers on the subject.
copy=&#60;number or name&#62; copy captured substring
depth_limit=&#60;n&#62; set a depth limit
dfa use <b>pcre2_dfa_match()</b>
find_limits find match and depth limits
find_limits find heap, match and depth limits
find_limits_noheap find match and depth limits
get=&#60;number or name&#62; extract captured substring
getall extract all captured substrings
/g global global matching
@ -1564,7 +1565,7 @@ Setting heap, match, and depth limits
<P>
The <b>heap_limit</b>, <b>match_limit</b>, and <b>depth_limit</b> modifiers set
the appropriate limits in the match context. These values are ignored when the
<b>find_limits</b> modifier is specified.
<b>find_limits</b> or <b>find_limits_noheap</b> modifier is specified.
</P>
<br><b>
Finding minimum limits
@ -1574,8 +1575,12 @@ If the <b>find_limits</b> modifier is present on a subject line, <b>pcre2test</b
calls the relevant matching function several times, setting different values in
the match context via <b>pcre2_set_heap_limit()</b>,
<b>pcre2_set_match_limit()</b>, or <b>pcre2_set_depth_limit()</b> until it finds
the minimum values for each parameter that allows the match to complete without
error. If JIT is being used, only the match limit is relevant.
the smallest value for each parameter that allows the match to complete without
a "limit exceeded" error. The match itself may succeed or fail. An alternative
modifier, <b>find_limits_noheap</b>, omits the heap limit. This is used in the
standard tests, because the minimum heap limit varies between systems. If JIT
is being used, only the match limit is relevant, and the other two are
automatically omitted.
</P>
<P>
When using this modifier, the pattern should not contain any limit settings
@ -1603,9 +1608,7 @@ overall amount of computing resource that is used.
</P>
<P>
For both kinds of matching, the <i>heap_limit</i> number, which is in kibibytes
(units of 1024 bytes), limits the amount of heap memory used for matching. A
value of zero disables the use of any heap memory; many simple pattern matches
can be done without using the heap, so zero is not an unreasonable setting.
(units of 1024 bytes), limits the amount of heap memory used for matching.
</P>
<br><b>
Showing MARK names
@ -1623,12 +1626,10 @@ Showing memory usage
<P>
The <b>memory</b> modifier causes <b>pcre2test</b> to log the sizes of all heap
memory allocation and freeing calls that occur during a call to
<b>pcre2_match()</b> or <b>pcre2_dfa_match()</b>. These occur only when a match
requires a bigger vector than the default for remembering backtracking points
(<b>pcre2_match()</b>) or for internal workspace (<b>pcre2_dfa_match()</b>). In
many cases there will be no heap memory used and therefore no additional
output. No heap memory is allocated during matching with JIT, so in that case
the <b>memory</b> modifier never has any effect. For this modifier to work, the
<b>pcre2_match()</b> or <b>pcre2_dfa_match()</b>. In the latter case, heap memory
is used only when a match requires more internal workspace than the default
allocation on the stack, so in many cases there will be no output. No heap
memory is allocated during matching with JIT. For this modifier to work, the
<b>null_context</b> modifier must not be set on both the pattern and the
subject, though it can be set on one or the other.
</P>
@ -1690,7 +1691,8 @@ Normally, <b>pcre2test</b> passes a context block to <b>pcre2_match()</b>,
If the <b>null_context</b> modifier is set, however, NULL is passed. This is for
testing that the matching and substitution functions behave correctly in this
case (they use default values). This modifier cannot be used with the
<b>find_limits</b> or <b>substitute_callout</b> modifiers.
<b>find_limits</b>, <b>find_limits_noheap</b>, or <b>substitute_callout</b>
modifiers.
</P>
<P>
Similarly, for testing purposes, if the <b>null_subject</b> or
@ -2141,7 +2143,7 @@ Cambridge, England.
</P>
<br><a name="SEC21" href="#TOC1">REVISION</a><br>
<P>
Last updated: 12 January 2022
Last updated: 27 July 2022
<br>
Copyright &copy; 1997-2022 University of Cambridge.
<br>


@ -185,8 +185,8 @@ REVISION
Last updated: 27 August 2021
Copyright (c) 1997-2021 University of Cambridge.
------------------------------------------------------------------------------
PCRE2API(3) Library Functions Manual PCRE2API(3)
@ -1028,7 +1028,7 @@ PCRE2 CONTEXTS
pcre2jit documentation for more details). If the limit is reached, the
negative error code PCRE2_ERROR_HEAPLIMIT is returned. The default
limit can be set when PCRE2 is built; if it is not, the default is set
very large and is essentially "unlimited".
very large and is essentially unlimited.
A value for the heap limit may also be supplied by an item at the start
of a pattern of the form
@ -1039,19 +1039,15 @@ PCRE2 CONTEXTS
less ddd is less than the limit set by the caller of pcre2_match() or,
if no such limit is set, less than the default.
The pcre2_match() function starts out using a 20KiB vector on the sys-
tem stack for recording backtracking points. The more nested backtrack-
ing points there are (that is, the deeper the search tree), the more
memory is needed. Heap memory is used only if the initial vector is
too small. If the heap limit is set to a value less than 21 (in partic-
ular, zero) no heap memory will be used. In this case, only patterns
that do not have a lot of nested backtracking can be successfully pro-
cessed.
The pcre2_match() function always needs some heap memory, so setting a
value of zero guarantees a "heap limit exceeded" error. Details of how
pcre2_match() uses the heap are given in the pcre2perform documenta-
tion.
Similarly, for pcre2_dfa_match(), a vector on the system stack is used
when processing pattern recursions, lookarounds, or atomic groups, and
only if this is not big enough is heap memory used. In this case, too,
setting a value of zero disables the use of the heap.
For pcre2_dfa_match(), a vector on the system stack is used when pro-
cessing pattern recursions, lookarounds, or atomic groups, and only if
this is not big enough is heap memory used. In this case, setting a
value of zero disables the use of the heap.
int pcre2_set_match_limit(pcre2_match_context *mcontext,
uint32_t value);
@ -1093,12 +1089,12 @@ PCRE2 CONTEXTS
This parameter limits the depth of nested backtracking in
pcre2_match(). Each time a nested backtracking point is passed, a new
memory "frame" is used to remember the state of matching at that point.
memory frame is used to remember the state of matching at that point.
Thus, this parameter indirectly limits the amount of memory that is
used in a match. However, because the size of each memory "frame" de-
pends on the number of capturing parentheses, the actual memory limit
varies from pattern to pattern. This limit was more useful in versions
before 10.30, where function recursion was used for backtracking.
used in a match. However, because the size of each memory frame depends
on the number of capturing parentheses, the actual memory limit varies
from pattern to pattern. This limit was more useful in versions before
10.30, where function recursion was used for backtracking.
The depth limit is not relevant, and is ignored, when matching is done
using JIT compiled code. However, it is supported by pcre2_dfa_match(),
@ -1372,27 +1368,29 @@ COMPILING A PATTERN
diately. Otherwise, the variables to which these point are set to an
error code and an offset (number of code units) within the pattern, re-
spectively, when pcre2_compile() returns NULL because a compilation er-
ror has occurred. The values are not defined when compilation is suc-
cessful and pcre2_compile() returns a non-NULL value.
ror has occurred.
There are nearly 100 positive error codes that pcre2_compile() may re-
turn if it finds an error in the pattern. There are also some negative
error codes that are used for invalid UTF strings when validity check-
ing is in force. These are the same as given by pcre2_match() and
There are nearly 100 positive error codes that pcre2_compile() may re-
turn if it finds an error in the pattern. There are also some negative
error codes that are used for invalid UTF strings when validity check-
ing is in force. These are the same as given by pcre2_match() and
pcre2_dfa_match(), and are described in the pcre2unicode documentation.
There is no separate documentation for the positive error codes, be-
cause the textual error messages that are obtained by calling the
There is no separate documentation for the positive error codes, be-
cause the textual error messages that are obtained by calling the
pcre2_get_error_message() function (see "Obtaining a textual error mes-
sage" below) should be self-explanatory. Macro names starting with
PCRE2_ERROR_ are defined for both positive and negative error codes in
pcre2.h.
sage" below) should be self-explanatory. Macro names starting with
PCRE2_ERROR_ are defined for both positive and negative error codes in
pcre2.h. When compilation is successful, errorcode is set to a value
that yields the message "no error" when passed to pcre2_get_error_mes-
sage().
The value returned in erroroffset is an indication of where in the pat-
tern the error occurred. It is not necessarily the furthest point in
the pattern that was read. For example, after the error "lookbehind as-
sertion is not fixed length", the error offset points to the start of
the failing assertion. For an invalid UTF-8 or UTF-16 string, the off-
set is that of the first code unit of the failing character.
tern an error occurred. When there is no error, zero is returned. A
non-zero value is not necessarily the furthest point in the pattern
that was read. For example, after the error "lookbehind assertion is
not fixed length", the error offset points to the start of the failing
assertion. For an invalid UTF-8 or UTF-16 string, the offset is that of
the first code unit of the failing character.
Some errors are not detected until the whole pattern has been scanned;
in these cases, the offset passed back is the length of the pattern.
@ -3049,12 +3047,12 @@ ERROR RETURNS FROM pcre2_match()
PCRE2_ERROR_NOMEMORY
If a pattern contains many nested backtracking points, heap memory is
used to remember them. This error is given when the memory allocation
function (default or custom) fails. Note that a different error,
PCRE2_ERROR_HEAPLIMIT, is given if the amount of memory needed exceeds
the heap limit. PCRE2_ERROR_NOMEMORY is also returned if
PCRE2_COPY_MATCHED_SUBJECT is set and memory allocation fails.
Heap memory is used to remember backtracking points. This error is
given when the memory allocation function (default or custom) fails.
Note that a different error, PCRE2_ERROR_HEAPLIMIT, is given if the
amount of memory needed exceeds the heap limit. PCRE2_ERROR_NOMEMORY is
also returned if PCRE2_COPY_MATCHED_SUBJECT is set and memory alloca-
tion fails.
PCRE2_ERROR_NULL
@ -3858,11 +3856,11 @@ AUTHOR
REVISION
Last updated: 14 December 2021
Copyright (c) 1997-2021 University of Cambridge.
Last updated: 27 July 2022
Copyright (c) 1997-2022 University of Cambridge.
------------------------------------------------------------------------------
PCRE2BUILD(3) Library Functions Manual PCRE2BUILD(3)
@ -4116,41 +4114,40 @@ LIMITING PCRE2 RESOURCE USAGE
pcre2_dfa_match() matching function, and to JIT matching (though the
counting is done differently).
The pcre2_match() function starts out using a 20KiB vector on the sys-
tem stack to record backtracking points. The more nested backtracking
points there are (that is, the deeper the search tree), the more memory
is needed. If the initial vector is not large enough, heap memory is
used, up to a certain limit, which is specified in kibibytes (units of
1024 bytes). The limit can be changed at run time, as described in the
pcre2api documentation. The default limit (in effect unlimited) is 20
million. You can change this by a setting such as
The pcre2_match() function uses heap memory to record backtracking
points. The more nested backtracking points there are (that is, the
deeper the search tree), the more memory is needed. There is an upper
limit, specified in kibibytes (units of 1024 bytes). This limit can be
changed at run time, as described in the pcre2api documentation. The
default limit (in effect unlimited) is 20 million. You can change this
by a setting such as
--with-heap-limit=500
which limits the amount of heap to 500 KiB. This limit applies only to
which limits the amount of heap to 500 KiB. This limit applies only to
interpretive matching in pcre2_match() and pcre2_dfa_match(), which may
also use the heap for internal workspace when processing complicated
patterns. This limit does not apply when JIT (which has its own memory
also use the heap for internal workspace when processing complicated
patterns. This limit does not apply when JIT (which has its own memory
arrangements) is used.
You can also explicitly limit the depth of nested backtracking in the
You can also explicitly limit the depth of nested backtracking in the
pcre2_match() interpreter. This limit defaults to the value that is set
for --with-match-limit. You can set a lower default limit by adding,
for --with-match-limit. You can set a lower default limit by adding,
for example,
--with-match-limit-depth=10000
to the configure command. This value can be overridden at run time.
This depth limit indirectly limits the amount of heap memory that is
used, but because the size of each backtracking "frame" depends on the
number of capturing parentheses in a pattern, the amount of heap that
is used before the limit is reached varies from pattern to pattern.
to the configure command. This value can be overridden at run time.
This depth limit indirectly limits the amount of heap memory that is
used, but because the size of each backtracking "frame" depends on the
number of capturing parentheses in a pattern, the amount of heap that
is used before the limit is reached varies from pattern to pattern.
This limit was more useful in versions before 10.30, where function re-
cursion was used for backtracking.
As well as applying to pcre2_match(), the depth limit also controls the
depth of recursive function calls in pcre2_dfa_match(). These are used
for lookaround assertions, atomic groups, and recursion within pat-
depth of recursive function calls in pcre2_dfa_match(). These are used
for lookaround assertions, atomic groups, and recursion within pat-
terns. The limit does not apply to JIT matching.
@ -4158,67 +4155,67 @@ CREATING CHARACTER TABLES AT BUILD TIME
PCRE2 uses fixed tables for processing characters whose code points are
less than 256. By default, PCRE2 is built with a set of tables that are
distributed in the file src/pcre2_chartables.c.dist. These tables are
distributed in the file src/pcre2_chartables.c.dist. These tables are
for ASCII codes only. If you add
--enable-rebuild-chartables
to the configure command, the distributed tables are no longer used.
to the configure command, the distributed tables are no longer used.
Instead, a program called pcre2_dftables is compiled and run. This out-
puts the source for a new set of tables, created in the default locale of
your C run-time system. This method of replacing the tables does not
your C run-time system. This method of replacing the tables does not
work if you are cross compiling, because pcre2_dftables needs to be run
on the local host and therefore not compiled with the cross compiler.
If you need to create alternative tables when cross compiling, you will
have to do so "by hand". There may also be other reasons for creating
tables manually. To cause pcre2_dftables to be built on the local
have to do so "by hand". There may also be other reasons for creating
tables manually. To cause pcre2_dftables to be built on the local
host, run a normal compiling command, and then run the program with the
output file as its argument, for example:
cc src/pcre2_dftables.c -o pcre2_dftables
./pcre2_dftables src/pcre2_chartables.c
This builds the tables in the default locale of the local host. If you
This builds the tables in the default locale of the local host. If you
want to specify a locale, you must use the -L option:
LC_ALL=fr_FR ./pcre2_dftables -L src/pcre2_chartables.c
You can also specify -b (with or without -L). This causes the tables to
be written in binary instead of as source code. A set of binary tables
can be loaded into memory by an application and passed to pcre2_com-
be written in binary instead of as source code. A set of binary tables
can be loaded into memory by an application and passed to pcre2_com-
pile() in the same way as tables created by calling pcre2_maketables().
The tables are just a string of bytes, independent of hardware charac-
teristics such as endianness. This means they can be bundled with an
application that runs in different environments, to ensure consistent
The tables are just a string of bytes, independent of hardware charac-
teristics such as endianness. This means they can be bundled with an
application that runs in different environments, to ensure consistent
behaviour.
USING EBCDIC CODE
PCRE2 assumes by default that it will run in an environment where the
character code is ASCII or Unicode, which is a superset of ASCII. This
PCRE2 assumes by default that it will run in an environment where the
character code is ASCII or Unicode, which is a superset of ASCII. This
is the case for most computer operating systems. PCRE2 can, however, be
compiled to run in an 8-bit EBCDIC environment by adding
--enable-ebcdic --disable-unicode
to the configure command. This setting implies --enable-rebuild-charta-
bles. You should only use it if you know that you are in an EBCDIC en-
bles. You should only use it if you know that you are in an EBCDIC en-
vironment (for example, an IBM mainframe operating system).
It is not possible to support both EBCDIC and UTF-8 codes in the same
version of the library. Consequently, --enable-unicode and --enable-
It is not possible to support both EBCDIC and UTF-8 codes in the same
version of the library. Consequently, --enable-unicode and --enable-
ebcdic are mutually exclusive.
The EBCDIC character that corresponds to an ASCII LF is assumed to have
the value 0x15 by default. However, in some EBCDIC environments, 0x25
the value 0x15 by default. However, in some EBCDIC environments, 0x25
is used. In such an environment you should use
--enable-ebcdic-nl25
as well as, or instead of, --enable-ebcdic. The EBCDIC character for CR
has the same value as in ASCII, namely, 0x0d. Whichever of 0x15 and
has the same value as in ASCII, namely, 0x0d. Whichever of 0x15 and
0x25 is not chosen as LF is made to correspond to the Unicode NEL char-
acter (which, in Unicode, is 0x85).
@ -4230,47 +4227,47 @@ USING EBCDIC CODE
PCRE2GREP SUPPORT FOR EXTERNAL SCRIPTS
By default pcre2grep supports the use of callouts with string arguments
within the patterns it is matching. There are two kinds: one that gen-
within the patterns it is matching. There are two kinds: one that gen-
erates output using local code, and another that calls an external pro-
gram or script. If --disable-pcre2grep-callout-fork is added to the
configure command, only the first kind of callout is supported; if
--disable-pcre2grep-callout is used, all callouts are completely ig-
nored. For more details of pcre2grep callouts, see the pcre2grep docu-
gram or script. If --disable-pcre2grep-callout-fork is added to the
configure command, only the first kind of callout is supported; if
--disable-pcre2grep-callout is used, all callouts are completely ig-
nored. For more details of pcre2grep callouts, see the pcre2grep docu-
mentation.
PCRE2GREP OPTIONS FOR COMPRESSED FILE SUPPORT
By default, pcre2grep reads all files as plain text. You can build it
so that it recognizes files whose names end in .gz or .bz2, and reads
By default, pcre2grep reads all files as plain text. You can build it
so that it recognizes files whose names end in .gz or .bz2, and reads
them with libz or libbz2, respectively, by adding one or both of
--enable-pcre2grep-libz
--enable-pcre2grep-libbz2
to the configure command. These options naturally require that the rel-
evant libraries are installed on your system. Configuration will fail
evant libraries are installed on your system. Configuration will fail
if they are not.
PCRE2GREP BUFFER SIZE
pcre2grep uses an internal buffer to hold a "window" on the file it is
pcre2grep uses an internal buffer to hold a "window" on the file it is
scanning, in order to be able to output "before" and "after" lines when
it finds a match. The default starting size of the buffer is 20KiB. The
buffer itself is three times this size, but because of the way it is
buffer itself is three times this size, but because of the way it is
used for holding "before" lines, the longest line that is guaranteed to
be processable is the notional buffer size. If a longer line is encoun-
tered, pcre2grep automatically expands the buffer, up to a specified
maximum size, whose default is 1MiB or the starting size, whichever is
the larger. You can change the default parameter values by adding, for
tered, pcre2grep automatically expands the buffer, up to a specified
maximum size, whose default is 1MiB or the starting size, whichever is
the larger. You can change the default parameter values by adding, for
example,
--with-pcre2grep-bufsize=51200
--with-pcre2grep-max-bufsize=2097152
to the configure command. The caller of pcre2grep can override these
values by using --buffer-size and --max-buffer-size on the command
to the configure command. The caller of pcre2grep can override these
values by using --buffer-size and --max-buffer-size on the command
line.
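The buffer arithmetic described above can be sketched in C as follows. This is only an illustration of the stated rules; the macro and function names are hypothetical and are not taken from the pcre2grep source.

```c
#include <stddef.h>

/* Default starting (notional) buffer size quoted in the text: 20KiB. */
#define EX_DEFAULT_BUFSIZE ((size_t)20 * 1024)
/* Default upper bound quoted in the text: 1MiB. */
#define EX_DEFAULT_MAX_BUFSIZE ((size_t)1024 * 1024)

/* The buffer itself is three times the notional size. */
size_t actual_buffer_bytes(size_t notional)
{
    return 3 * notional;
}

/* The default maximum is 1MiB or the starting size, whichever is larger. */
size_t default_max_bufsize(size_t start)
{
    return start > EX_DEFAULT_MAX_BUFSIZE ? start : EX_DEFAULT_MAX_BUFSIZE;
}
```

With the default starting size this gives a 60KiB allocation, while the longest line that is guaranteed to be processable is still 20KiB.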
@ -4281,26 +4278,26 @@ PCRE2TEST OPTION FOR LIBREADLINE SUPPORT
--enable-pcre2test-libreadline
--enable-pcre2test-libedit
to the configure command, pcre2test is linked with the libreadline or
libedit library, respectively, and when its input is from a terminal,
it reads it using the readline() function. This provides line-editing
and history facilities. Note that libreadline is GPL-licensed, so if
you distribute a binary of pcre2test linked in this way, there may be
to the configure command, pcre2test is linked with the libreadline or
libedit library, respectively, and when its input is from a terminal,
it reads it using the readline() function. This provides line-editing
and history facilities. Note that libreadline is GPL-licensed, so if
you distribute a binary of pcre2test linked in this way, there may be
licensing issues. These can be avoided by linking instead with libedit,
which has a BSD licence.
Setting --enable-pcre2test-libreadline causes the -lreadline option to
be added to the pcre2test build. In many operating environments with a
system-installed readline library this is sufficient. However, in some
Setting --enable-pcre2test-libreadline causes the -lreadline option to
be added to the pcre2test build. In many operating environments with a
system-installed readline library this is sufficient. However, in some
environments (e.g. if an unmodified distribution version of readline is
in use), some extra configuration may be necessary. The INSTALL file
in use), some extra configuration may be necessary. The INSTALL file
for libreadline says this:
"Readline uses the termcap functions, but does not link with
the termcap or curses library itself, allowing applications
which link with readline the to choose an appropriate library."
If your environment has not been set up so that an appropriate library
If your environment has not been set up so that an appropriate library
is automatically included, you may need to add something like
LIBS="-lncurses"
@ -4314,7 +4311,7 @@ INCLUDING DEBUGGING CODE
--enable-debug
to the configure command, additional debugging code is included in the
to the configure command, additional debugging code is included in the
build. This feature is intended for use by the PCRE2 maintainers.
@ -4324,14 +4321,14 @@ DEBUGGING WITH VALGRIND SUPPORT
--enable-valgrind
to the configure command, PCRE2 will use valgrind annotations to mark
certain memory regions as unaddressable. This allows it to detect in-
to the configure command, PCRE2 will use valgrind annotations to mark
certain memory regions as unaddressable. This allows it to detect in-
valid memory accesses, and is mostly useful for debugging PCRE2 itself.
CODE COVERAGE REPORTING
If your C compiler is gcc, you can build a version of PCRE2 that can
If your C compiler is gcc, you can build a version of PCRE2 that can
generate a code coverage report for its test suite. To enable this, you
must install lcov version 1.6 or above. Then specify
@ -4340,20 +4337,20 @@ CODE COVERAGE REPORTING
to the configure command and build PCRE2 in the usual way.
Note that using ccache (a caching C compiler) is incompatible with code
coverage reporting. If you have configured ccache to run automatically
coverage reporting. If you have configured ccache to run automatically
on your system, you must set the environment variable
CCACHE_DISABLE=1
before running make to build PCRE2, so that ccache is not used.
When --enable-coverage is used, the following additional targets are
When --enable-coverage is used, the following additional targets are
added to the Makefile:
make coverage
This creates a fresh coverage report for the PCRE2 test suite. It is
equivalent to running "make coverage-reset", "make coverage-baseline",
This creates a fresh coverage report for the PCRE2 test suite. It is
equivalent to running "make coverage-reset", "make coverage-baseline",
"make check", and then "make coverage-report".
make coverage-reset
@ -4370,73 +4367,73 @@ CODE COVERAGE REPORTING
make coverage-clean-report
This removes the generated coverage report without cleaning the cover-
This removes the generated coverage report without cleaning the cover-
age data itself.
make coverage-clean-data
This removes the captured coverage data without removing the coverage
This removes the captured coverage data without removing the coverage
files created at compile time (*.gcno).
make coverage-clean
This cleans all coverage data including the generated coverage report.
For more information about code coverage, see the gcov and lcov docu-
This cleans all coverage data including the generated coverage report.
For more information about code coverage, see the gcov and lcov docu-
mentation.
DISABLING THE Z AND T FORMATTING MODIFIERS
The C99 standard defines formatting modifiers z and t for size_t and
ptrdiff_t values, respectively. By default, PCRE2 uses these modifiers
The C99 standard defines formatting modifiers z and t for size_t and
ptrdiff_t values, respectively. By default, PCRE2 uses these modifiers
in environments other than old versions of Microsoft Visual Studio when
__STDC_VERSION__ is defined and has a value greater than or equal to
199901L (indicating support for C99). However, there is at least one
__STDC_VERSION__ is defined and has a value greater than or equal to
199901L (indicating support for C99). However, there is at least one
environment that claims to be C99 but does not support these modifiers.
If
--disable-percent-zt
is specified, no use is made of the z or t modifiers. Instead of %td or
%zu, a suitable format is used depending on the size of long for the
%zu, a suitable format is used depending on the size of long for the
platform.
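A minimal sketch of this kind of format selection is shown below. The macro and function names are hypothetical, and the fallback simply casts to unsigned long, which is a simplification of choosing a format according to the size of long; it is not the code that PCRE2 itself uses.

```c
#include <stddef.h>
#include <stdio.h>

/* Use the C99 z modifier when the compiler claims C99 support;
   otherwise fall back to an unsigned long format with a cast. */
#if defined(__STDC_VERSION__) && __STDC_VERSION__ >= 199901L
#define EX_SIZE_FMT "%zu"
#define EX_SIZE_ARG(x) (x)
#else
#define EX_SIZE_FMT "%lu"
#define EX_SIZE_ARG(x) ((unsigned long)(x))
#endif

/* Writes "length=<n>" into buf, which must hold at least 32 bytes.
   Returns the number of characters written, excluding the final NUL. */
int format_size(char *buf, size_t n)
{
    return sprintf(buf, "length=" EX_SIZE_FMT, EX_SIZE_ARG(n));
}
```

A build option such as --disable-percent-zt would correspond to forcing the second branch even when __STDC_VERSION__ indicates C99.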
SUPPORT FOR FUZZERS
There is a special option for use by people who want to run fuzzing
There is a special option for use by people who want to run fuzzing
tests on PCRE2:
--enable-fuzz-support
At present this applies only to the 8-bit library. If set, it causes an
extra library called libpcre2-fuzzsupport.a to be built, but not in-
stalled. This contains a single function called LLVMFuzzerTestOneIn-
put() whose arguments are a pointer to a string and the length of the
string. When called, this function tries to compile the string as a
pattern, and if that succeeds, to match it. This is done both with no
options and with some random option bits that are generated from the
extra library called libpcre2-fuzzsupport.a to be built, but not in-
stalled. This contains a single function called LLVMFuzzerTestOneIn-
put() whose arguments are a pointer to a string and the length of the
string. When called, this function tries to compile the string as a
pattern, and if that succeeds, to match it. This is done both with no
options and with some random option bits that are generated from the
string.
Setting --enable-fuzz-support also causes a binary called pcre2fuz-
zcheck to be created. This is normally run under valgrind or used when
Setting --enable-fuzz-support also causes a binary called pcre2fuz-
zcheck to be created. This is normally run under valgrind or used when
PCRE2 is compiled with address sanitizing enabled. It calls the fuzzing
function and outputs information about what it is doing. The input
strings are specified by arguments: if an argument starts with "=" the
rest of it is a literal input string. Otherwise, it is assumed to be a
function and outputs information about what it is doing. The input
strings are specified by arguments: if an argument starts with "=" the
rest of it is a literal input string. Otherwise, it is assumed to be a
file name, and the contents of the file are the test string.
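The argument convention just described can be illustrated with a small C helper. The function name is hypothetical; this sketch shows only the "=" rule and is not taken from the pcre2fuzzcheck source.

```c
#include <stddef.h>

/* If arg starts with '=', return a pointer to the literal input string
   that follows it. Otherwise return NULL, meaning that arg should be
   treated as the name of a file whose contents are the test string. */
const char *literal_input(const char *arg)
{
    return (arg != NULL && arg[0] == '=') ? arg + 1 : NULL;
}
```

Under this rule the argument "=a(b)c" supplies the literal string "a(b)c", whereas "tests/input1" would be opened as a file.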
OBSOLETE OPTION
In versions of PCRE2 prior to 10.30, there were two ways of handling
backtracking in the pcre2_match() function. The default was to use the
In versions of PCRE2 prior to 10.30, there were two ways of handling
backtracking in the pcre2_match() function. The default was to use the
system stack, but if
--disable-stack-for-recursion
was set, memory on the heap was used. From release 10.30 onwards this
has changed (the stack is no longer used) and this option now does
was set, memory on the heap was used. From release 10.30 onwards this
has changed (the stack is no longer used) and this option now does
nothing except give a warning.
@ -4448,17 +4445,17 @@ SEE ALSO
AUTHOR
Philip Hazel
University Computing Service
Retired from University Computing Service
Cambridge, England.
REVISION
Last updated: 08 December 2021
Copyright (c) 1997-2021 University of Cambridge.
Last updated: 27 July 2022
Copyright (c) 1997-2022 University of Cambridge.
------------------------------------------------------------------------------
PCRE2CALLOUT(3) Library Functions Manual PCRE2CALLOUT(3)
@ -4887,8 +4884,8 @@ REVISION
Last updated: 03 February 2019
Copyright (c) 1997-2019 University of Cambridge.
------------------------------------------------------------------------------
PCRE2COMPAT(3) Library Functions Manual PCRE2COMPAT(3)
@ -5110,8 +5107,8 @@ REVISION
Last updated: 08 December 2021
Copyright (c) 1997-2021 University of Cambridge.
------------------------------------------------------------------------------
PCRE2JIT(3) Library Functions Manual PCRE2JIT(3)
@ -5537,8 +5534,8 @@ REVISION
Last updated: 30 November 2021
Copyright (c) 1997-2021 University of Cambridge.
------------------------------------------------------------------------------
PCRE2LIMITS(3) Library Functions Manual PCRE2LIMITS(3)
@ -5594,21 +5591,25 @@ SIZE AND OTHER LIMITATIONS
The maximum length of a string argument to a callout is the largest
number a 32-bit unsigned integer can hold.
The maximum amount of heap memory used for matching is controlled by
the heap limit, which can be set in a pattern or in a match context.
The default is a very large number, effectively unlimited.
AUTHOR
Philip Hazel
University Computing Service
Retired from University Computing Service
Cambridge, England.
REVISION
Last updated: 02 February 2019
Copyright (c) 1997-2019 University of Cambridge.
Last updated: 26 July 2022
Copyright (c) 1997-2022 University of Cambridge.
------------------------------------------------------------------------------
PCRE2MATCHING(3) Library Functions Manual PCRE2MATCHING(3)
@ -5832,8 +5833,8 @@ REVISION
Last updated: 28 August 2021
Copyright (c) 1997-2021 University of Cambridge.
------------------------------------------------------------------------------
PCRE2PARTIAL(3) Library Functions Manual PCRE2PARTIAL(3)
@ -6212,8 +6213,8 @@ REVISION
Last updated: 04 September 2019
Copyright (c) 1997-2019 University of Cambridge.
------------------------------------------------------------------------------
PCRE2PATTERN(3) Library Functions Manual PCRE2PATTERN(3)
@ -9698,8 +9699,8 @@ REVISION
Last updated: 12 January 2022
Copyright (c) 1997-2022 University of Cambridge.
------------------------------------------------------------------------------
PCRE2PERFORM(3) Library Functions Manual PCRE2PERFORM(3)
@ -9771,152 +9772,169 @@ STACK AND HEAP USAGE AT RUN TIME
sive function calls could use a great deal of stack, and this could
cause problems, but this usage has been eliminated. Backtracking posi-
tions are now explicitly remembered in memory frames controlled by the
code. An initial 20KiB vector of frames is allocated on the system
stack (enough for about 100 frames for small patterns), but if this is
insufficient, heap memory is used. The amount of heap memory can be
limited; if the limit is set to zero, only the initial stack vector is
used. Rewriting patterns to be time-efficient, as described below, may
also reduce the memory requirements.
code.
In contrast to pcre2_match(), pcre2_dfa_match() does use recursive
function calls, but only for processing atomic groups, lookaround as-
The size of each frame depends on the size of pointer variables and the
number of capturing parenthesized groups in the pattern being matched.
On a 64-bit system the frame size for a pattern with no captures is 128
bytes. For each capturing group the size increases by 16 bytes.
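The frame size figures quoted above amount to the following arithmetic. The function name is hypothetical, and the constants apply only to the 64-bit case described in the text.

```c
#include <stddef.h>

/* Frame size in bytes on a 64-bit system: 128 bytes with no captures,
   plus 16 bytes for each capturing group. */
size_t frame_size_64bit(size_t capture_groups)
{
    return 128 + 16 * capture_groups;
}
```

A pattern with three capturing groups would therefore use 176 bytes for each backtracking frame.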
Until release 10.41, an initial 20KiB frames vector was allocated on
the system stack, but this still caused some issues for multi-thread
applications where each thread has a very small stack. From release
10.41 backtracking memory frames are always held in heap memory. An
initial heap allocation is obtained the first time any match data block
is passed to pcre2_match(). This is remembered with the match data
block and re-used if that block is used for another match. It is freed
when the match data block itself is freed.
The size of the initial block is the larger of 20KiB or ten times the
pattern's frame size, unless the heap limit is less than this, in which
case the heap limit is used. If the initial block proves to be too
small during matching, it is replaced by a larger block, subject to the
heap limit. The heap limit is checked only when a new block is to be
allocated. Reducing the heap limit between calls to pcre2_match() with
the same match data block does not affect the saved block.
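The sizing rule for the initial block can be sketched as below. The function name is hypothetical, and the heap limit is taken here as a byte count purely for illustration; the units used by the real interfaces are described in the pcre2api documentation.

```c
#include <stddef.h>
#include <stdint.h>

/* Initial heap block: the larger of 20KiB or ten frames, unless the
   heap limit is smaller than that, in which case the limit is used. */
size_t initial_block_size(size_t frame_size, size_t heap_limit_bytes)
{
    size_t want = 10 * frame_size;
    if (want < (size_t)20 * 1024) want = (size_t)20 * 1024;
    if (heap_limit_bytes < want) want = heap_limit_bytes;
    return want;
}
```

For a 128-byte frame and no effective heap limit (for example SIZE_MAX) the result is 20480 bytes. Only patterns whose frames exceed 2048 bytes push the initial block above 20KiB.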
In contrast to pcre2_match(), pcre2_dfa_match() does use recursive
function calls, but only for processing atomic groups, lookaround as-
sertions, and recursion within the pattern. The original version of the
code used to allocate quite large internal workspace vectors on the
stack, which caused some problems for some patterns in environments
with small stacks. From release 10.32 the code for pcre2_dfa_match()
has been re-factored to use heap memory when necessary for internal
workspace when recursing, though recursive function calls are still
code used to allocate quite large internal workspace vectors on the
stack, which caused some problems for some patterns in environments
with small stacks. From release 10.32 the code for pcre2_dfa_match()
has been re-factored to use heap memory when necessary for internal
workspace when recursing, though recursive function calls are still
used.
The "match depth" parameter can be used to limit the depth of function
recursion, and the "match heap" parameter to limit heap memory in
The "match depth" parameter can be used to limit the depth of function
recursion, and the "match heap" parameter to limit heap memory in
pcre2_dfa_match().
PROCESSING TIME
Certain items in regular expression patterns are processed more effi-
Certain items in regular expression patterns are processed more effi-
ciently than others. It is more efficient to use a character class like
[aeiou] than a set of single-character alternatives such as
(a|e|i|o|u). In general, the simplest construction that provides the
[aeiou] than a set of single-character alternatives such as
(a|e|i|o|u). In general, the simplest construction that provides the
required behaviour is usually the most efficient. Jeffrey Friedl's book
contains a lot of useful general discussion about optimizing regular
contains a lot of useful general discussion about optimizing regular
expressions for efficient performance. This document contains a few ob-
servations about PCRE2.
Using Unicode character properties (the \p, \P, and \X escapes) is
slow, because PCRE2 has to use a multi-stage table lookup whenever it
needs a character's property. If you can find an alternative pattern
Using Unicode character properties (the \p, \P, and \X escapes) is
slow, because PCRE2 has to use a multi-stage table lookup whenever it
needs a character's property. If you can find an alternative pattern
that does not use character properties, it will probably be faster.
By default, the escape sequences \b, \d, \s, and \w, and the POSIX
character classes such as [:alpha:] do not use Unicode properties,
By default, the escape sequences \b, \d, \s, and \w, and the POSIX
character classes such as [:alpha:] do not use Unicode properties,
partly for backwards compatibility, and partly for performance reasons.
However, you can set the PCRE2_UCP option or start the pattern with
(*UCP) if you want Unicode character properties to be used. This can
double the matching time for items such as \d, when matched with
pcre2_match(); the performance loss is less with a DFA matching func-
However, you can set the PCRE2_UCP option or start the pattern with
(*UCP) if you want Unicode character properties to be used. This can
double the matching time for items such as \d, when matched with
pcre2_match(); the performance loss is less with a DFA matching func-
tion, and in both cases there is not much difference for \b.
When a pattern begins with .* not in atomic parentheses, nor in paren-
theses that are the subject of a backreference, and the PCRE2_DOTALL
option is set, the pattern is implicitly anchored by PCRE2, since it
can match only at the start of a subject string. If the pattern has
When a pattern begins with .* not in atomic parentheses, nor in paren-
theses that are the subject of a backreference, and the PCRE2_DOTALL
option is set, the pattern is implicitly anchored by PCRE2, since it
can match only at the start of a subject string. If the pattern has
multiple top-level branches, they must all be anchorable. The optimiza-
tion can be disabled by the PCRE2_NO_DOTSTAR_ANCHOR option, and is au-
tion can be disabled by the PCRE2_NO_DOTSTAR_ANCHOR option, and is au-
tomatically disabled if the pattern contains (*PRUNE) or (*SKIP).
If PCRE2_DOTALL is not set, PCRE2 cannot make this optimization, be-
cause the dot metacharacter does not then match a newline, and if the
subject string contains newlines, the pattern may match from the char-
If PCRE2_DOTALL is not set, PCRE2 cannot make this optimization, be-
cause the dot metacharacter does not then match a newline, and if the
subject string contains newlines, the pattern may match from the char-
acter immediately following one of them instead of from the very start.
For example, the pattern
.*second
matches the subject "first\nand second" (where \n stands for a newline
character), with the match starting at the seventh character. In order
to do this, PCRE2 has to retry the match starting after every newline
matches the subject "first\nand second" (where \n stands for a newline
character), with the match starting at the seventh character. In order
to do this, PCRE2 has to retry the match starting after every newline
in the subject.
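The effect of these retries can be modelled without a regex engine. The following C sketch uses a hypothetical helper name and models the behaviour of a pattern of the form .*literal without PCRE2_DOTALL: it returns the offset at which the first match would start, which is the start of the first line containing the literal.

```c
#include <stddef.h>
#include <string.h>

/* Returns the offset of the start of the first line of subject that
   contains needle, or -1 if no line contains it. */
long first_line_containing(const char *subject, const char *needle)
{
    const char *line = subject;
    size_t nlen = strlen(needle);
    for (;;) {
        const char *end = strchr(line, '\n');
        size_t len = end ? (size_t)(end - line) : strlen(line);
        size_t i;
        /* Look for needle wholly inside the current line. */
        for (i = 0; i + nlen <= len; i++)
            if (memcmp(line + i, needle, nlen) == 0)
                return (long)(line - subject);
        if (end == NULL) return -1;
        line = end + 1;   /* retry after the newline */
    }
}
```

For the subject "first\nand second" and the literal "second" the result is offset 6, that is, the seventh character.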
If you are using such a pattern with subject strings that do not con-
tain newlines, the best performance is obtained by setting
PCRE2_DOTALL, or starting the pattern with ^.* or ^.*? to indicate ex-
plicit anchoring. That saves PCRE2 from having to scan along the sub-
If you are using such a pattern with subject strings that do not con-
tain newlines, the best performance is obtained by setting
PCRE2_DOTALL, or starting the pattern with ^.* or ^.*? to indicate ex-
plicit anchoring. That saves PCRE2 from having to scan along the sub-
ject looking for a newline to restart at.
Beware of patterns that contain nested indefinite repeats. These can
take a long time to run when applied to a string that does not match.
Beware of patterns that contain nested indefinite repeats. These can
take a long time to run when applied to a string that does not match.
Consider the pattern fragment
^(a+)*
This can match "aaaa" in 16 different ways, and this number increases
very rapidly as the string gets longer. (The * repeat can match 0, 1,
2, 3, or 4 times, and for each of those cases other than 0 or 4, the +
repeats can match different numbers of times.) When the remainder of
the pattern is such that the entire match is going to fail, PCRE2 has
in principle to try every possible variation, and this can take an ex-
This can match "aaaa" in 16 different ways, and this number increases
very rapidly as the string gets longer. (The * repeat can match 0, 1,
2, 3, or 4 times, and for each of those cases other than 0 or 4, the +
repeats can match different numbers of times.) When the remainder of
the pattern is such that the entire match is going to fail, PCRE2 has
in principle to try every possible variation, and this can take an ex-
tremely long time, even for relatively short strings.
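The figure of 16 can be checked by counting. In the sketch below (a hypothetical function, not part of PCRE2), the * repeat may stop at any point, or the + may take any non-zero number of the remaining "a" characters before the * continues.

```c
/* Number of distinct ways (a+)* can match at the start of a run of
   `remaining` "a" characters. The recursion is exponential, so this is
   intended only for small values. */
unsigned long ways(unsigned remaining)
{
    unsigned long total = 1;          /* the * repeat stops here */
    unsigned k;
    for (k = 1; k <= remaining; k++)  /* the + takes k characters */
        total += ways(remaining - k);
    return total;
}
```

The count doubles with each additional character, so a run of n characters can be matched in 2 to the power n ways, which is why failure can take so long to establish.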
An optimization catches some of the simpler cases such as
(a+)*b
where a literal character follows. Before embarking on the standard
matching procedure, PCRE2 checks that there is a "b" later in the sub-
ject string, and if there is not, it fails the match immediately. How-
ever, when there is no following literal this optimization cannot be
where a literal character follows. Before embarking on the standard
matching procedure, PCRE2 checks that there is a "b" later in the sub-
ject string, and if there is not, it fails the match immediately. How-
ever, when there is no following literal this optimization cannot be
used. You can see the difference by comparing the behaviour of
(a+)*\d
with the pattern above. (a+)*b gives a failure almost instantly
when applied to a whole line of "a" characters, whereas (a+)*\d
with the pattern above. (a+)*b gives a failure almost instantly
when applied to a whole line of "a" characters, whereas (a+)*\d
takes an appreciable time with strings longer than about 20 characters.
In many cases, the solution to this kind of performance issue is to use
an atomic group or a possessive quantifier. This can often reduce mem-
an atomic group or a possessive quantifier. This can often reduce mem-
ory requirements as well. As another example, consider this pattern:
([^<]|<(?!inet))+
It matches from wherever it starts until it encounters "<inet" or the
end of the data, and is the kind of pattern that might be used when
It matches from wherever it starts until it encounters "<inet" or the
end of the data, and is the kind of pattern that might be used when
processing an XML file. Each iteration of the outer parentheses matches
either one character that is not "<" or a "<" that is not followed by
"inet". However, each time a parenthesis is processed, a backtracking
position is passed, so this formulation uses a memory frame for each
either one character that is not "<" or a "<" that is not followed by
"inet". However, each time a parenthesis is processed, a backtracking
position is passed, so this formulation uses a memory frame for each
matched character. For a long string, a lot of memory is required. Con-
sider now this rewritten pattern, which matches exactly the same
sider now this rewritten pattern, which matches exactly the same
strings:
([^<]++|<(?!inet))+
This runs much faster, because sequences of characters that do not con-
tain "<" are "swallowed" in one item inside the parentheses, and a pos-
sessive quantifier is used to stop any backtracking into the runs of
non-"<" characters. This version also uses a lot less memory because
entry to a new set of parentheses happens only when a "<" character
that is not followed by "inet" is encountered (and we assume this is
sessive quantifier is used to stop any backtracking into the runs of
non-"<" characters. This version also uses a lot less memory because
entry to a new set of parentheses happens only when a "<" character
that is not followed by "inet" is encountered (and we assume this is
relatively rare).
This example shows that one way of optimizing performance when matching
long subject strings is to write repeated parenthesized subpatterns to
long subject strings is to write repeated parenthesized subpatterns to
match more than one character whenever possible.
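The difference between the two formulations can be modelled by counting how many times the outer group is entered. The C sketch below is a simplified model under the assumption that each entry costs one memory frame; the function names are hypothetical and this is not how PCRE2 accounts for frames internally.

```c
#include <stddef.h>
#include <string.h>

/* Model of ([^<]|<(?!inet))+ : one group iteration per character. */
size_t iterations_per_char(const char *s)
{
    size_t n = 0;
    while (*s != '\0') {
        if (*s == '<' && strncmp(s + 1, "inet", 4) == 0) break;
        s++;
        n++;
    }
    return n;
}

/* Model of ([^<]++|<(?!inet))+ : one group iteration per run of
   non-"<" characters, and one per "<" not followed by "inet". */
size_t iterations_per_run(const char *s)
{
    size_t n = 0;
    while (*s != '\0') {
        if (*s == '<') {
            if (strncmp(s + 1, "inet", 4) == 0) break;
            s++;
        } else {
            while (*s != '\0' && *s != '<') s++;
        }
        n++;
    }
    return n;
}
```

For the subject "ab<x>cd<inet" the first model counts 7 iterations and the second counts 3. The gap widens as the runs of non-"<" characters get longer.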
SETTING RESOURCE LIMITS
You can set limits on the amount of processing that takes place when
matching, and on the amount of heap memory that is used. The default
You can set limits on the amount of processing that takes place when
matching, and on the amount of heap memory that is used. The default
values of the limits are very large, and unlikely ever to operate. They
can be changed when PCRE2 is built, and they can also be set when
pcre2_match() or pcre2_dfa_match() is called. For details of these in-
terfaces, see the pcre2build documentation and the section entitled
can be changed when PCRE2 is built, and they can also be set when
pcre2_match() or pcre2_dfa_match() is called. For details of these in-
terfaces, see the pcre2build documentation and the section entitled
"The match context" in the pcre2api documentation.
The pcre2test test program has a modifier called "find_limits" which,
if applied to a subject line, causes it to find the smallest limits
The pcre2test test program has a modifier called "find_limits" which,
if applied to a subject line, causes it to find the smallest limits
that allow a pattern to match. This is done by repeatedly matching with
different limits.
@ -9924,17 +9942,17 @@ PROCESSING TIME
AUTHOR
Philip Hazel
University Computing Service
Retired from University Computing Service
Cambridge, England.
REVISION
Last updated: 03 February 2019
Copyright (c) 1997-2019 University of Cambridge.
Last updated: 27 July 2022
Copyright (c) 1997-2022 University of Cambridge.
------------------------------------------------------------------------------
PCRE2POSIX(3) Library Functions Manual PCRE2POSIX(3)
@ -10267,8 +10285,8 @@ REVISION
Last updated: 26 April 2021
Copyright (c) 1997-2021 University of Cambridge.
------------------------------------------------------------------------------
PCRE2SAMPLE(3) Library Functions Manual PCRE2SAMPLE(3)
@ -10434,7 +10452,7 @@ SAVING COMPILED PATTERNS
PCRE2_ERROR_BADDATA the number of patterns is zero or less
PCRE2_ERROR_BADMAGIC mismatch of id bytes in one of the patterns
PCRE2_ERROR_MEMORY memory allocation failed
PCRE2_ERROR_NOMEMORY memory allocation failed
PCRE2_ERROR_MIXEDTABLES the patterns do not all use the same tables
PCRE2_ERROR_NULL the 1st, 3rd, or 4th argument is NULL
@ -10545,8 +10563,8 @@ REVISION
Last updated: 27 June 2018
Copyright (c) 1997-2018 University of Cambridge.
------------------------------------------------------------------------------
PCRE2SYNTAX(3) Library Functions Manual PCRE2SYNTAX(3)
@ -11093,8 +11111,8 @@ REVISION
Last updated: 12 January 2022
Copyright (c) 1997-2022 University of Cambridge.
------------------------------------------------------------------------------
PCRE2UNICODE(3) Library Functions Manual PCRE2UNICODE(3)
@ -11530,5 +11548,5 @@ REVISION
Last updated: 22 December 2021
Copyright (c) 1997-2021 University of Cambridge.
------------------------------------------------------------------------------

View File

@ -1,4 +1,4 @@
.TH PCRE2_COMPILE 3 "23 May 2019" "PCRE2 10.34"
.TH PCRE2_COMPILE 3 "22 April 2022" "PCRE2 10.41"
.SH NAME
PCRE2 - Perl-compatible regular expressions (revised API)
.SH SYNOPSIS
@ -80,8 +80,17 @@ Additional options may be set in the compile context via the
.\"
function.
.P
The yield of this function is a pointer to a private data structure that
contains the compiled pattern, or NULL if an error was detected.
If either of \fIerrorcode\fP or \fIerroroffset\fP is NULL, the function returns
NULL immediately. Otherwise, the yield of this function is a pointer to a
private data structure that contains the compiled pattern, or NULL if an error
was detected. In the error case, a text error message can be obtained by
passing the value returned via the \fIerrorcode\fP argument to the
\fBpcre2_get_error_message()\fP function. The offset (in code units) where the
error was encountered is returned via the \fIerroroffset\fP argument.
.P
If there is no error, the value passed via \fIerrorcode\fP returns the message
"no error" if passed to \fBpcre2_get_error_message()\fP, and the value passed
via \fIerroroffset\fP is zero.
.P
There is a complete description of the PCRE2 native API, with more detail on
each option, in the

View File

@ -36,7 +36,7 @@ the following negative error codes:
PCRE2_ERROR_BADDATA \fInumber_of_codes\fP is zero or less
PCRE2_ERROR_BADMAGIC mismatch of id bytes in \fIbytes\fP
PCRE2_ERROR_BADMODE mismatch of variable unit size or PCRE version
PCRE2_ERROR_MEMORY memory allocation failed
PCRE2_ERROR_NOMEMORY memory allocation failed
PCRE2_ERROR_NULL \fIcodes\fP or \fIbytes\fP is NULL
.sp
PCRE2_ERROR_BADMAGIC may mean that the data is corrupt, or that it was compiled

View File

@ -1,4 +1,4 @@
.TH PCRE2API 3 "14 December 2021" "PCRE2 10.40"
.TH PCRE2API 3 "27 July 2022" "PCRE2 10.41"
.SH NAME
PCRE2 - Perl-compatible regular expressions (revised API)
.sp
@ -953,7 +953,7 @@ has its own memory control arrangements (see the
documentation for more details). If the limit is reached, the negative error
code PCRE2_ERROR_HEAPLIMIT is returned. The default limit can be set when PCRE2
is built; if it is not, the default is set very large and is essentially
"unlimited".
unlimited.
.P
A value for the heap limit may also be supplied by an item at the start of a
pattern of the form
@ -964,18 +964,18 @@ where ddd is a decimal number. However, such a setting is ignored unless ddd is
less than the limit set by the caller of \fBpcre2_match()\fP or, if no such
limit is set, less than the default.
.P
The \fBpcre2_match()\fP function starts out using a 20KiB vector on the system
stack for recording backtracking points. The more nested backtracking points
there are (that is, the deeper the search tree), the more memory is needed.
Heap memory is used only if the initial vector is too small. If the heap limit
is set to a value less than 21 (in particular, zero) no heap memory will be
used. In this case, only patterns that do not have a lot of nested backtracking
can be successfully processed.
The \fBpcre2_match()\fP function always needs some heap memory, so setting a
value of zero guarantees a "heap limit exceeded" error. Details of how
\fBpcre2_match()\fP uses the heap are given in the
.\" HREF
\fBpcre2perform\fP
.\"
documentation.
.P
Similarly, for \fBpcre2_dfa_match()\fP, a vector on the system stack is used
when processing pattern recursions, lookarounds, or atomic groups, and only if
this is not big enough is heap memory used. In this case, too, setting a value
of zero disables the use of the heap.
For \fBpcre2_dfa_match()\fP, a vector on the system stack is used when
processing pattern recursions, lookarounds, or atomic groups, and only if this
is not big enough is heap memory used. In this case, setting a value of zero
disables the use of the heap.
.sp
.nf
.B int pcre2_set_match_limit(pcre2_match_context *\fImcontext\fP,
@ -1019,10 +1019,10 @@ less than the limit set by the caller of \fBpcre2_match()\fP or
.fi
.sp
This parameter limits the depth of nested backtracking in \fBpcre2_match()\fP.
Each time a nested backtracking point is passed, a new memory "frame" is used
Each time a nested backtracking point is passed, a new memory frame is used
to remember the state of matching at that point. Thus, this parameter
indirectly limits the amount of memory that is used in a match. However,
because the size of each memory "frame" depends on the number of capturing
because the size of each memory frame depends on the number of capturing
parentheses, the actual memory limit varies from pattern to pattern. This limit
was more useful in versions before 10.30, where function recursion was used for
backtracking.
@ -1323,8 +1323,7 @@ If \fIerrorcode\fP or \fIerroroffset\fP is NULL, \fBpcre2_compile()\fP returns
NULL immediately. Otherwise, the variables to which these point are set to an
error code and an offset (number of code units) within the pattern,
respectively, when \fBpcre2_compile()\fP returns NULL because a compilation
error has occurred. The values are not defined when compilation is successful
and \fBpcre2_compile()\fP returns a non-NULL value.
error has occurred.
.P
There are nearly 100 positive error codes that \fBpcre2_compile()\fP may return
if it finds an error in the pattern. There are also some negative error codes
@ -1343,14 +1342,17 @@ message"
below)
.\"
should be self-explanatory. Macro names starting with PCRE2_ERROR_ are defined
for both positive and negative error codes in \fBpcre2.h\fP.
for both positive and negative error codes in \fBpcre2.h\fP. When compilation
is successful \fIerrorcode\fP is set to a value that returns the message "no
error" if passed to \fBpcre2_get_error_message()\fP.
.P
The value returned in \fIerroroffset\fP is an indication of where in the
pattern the error occurred. It is not necessarily the furthest point in the
pattern that was read. For example, after the error "lookbehind assertion is
not fixed length", the error offset points to the start of the failing
assertion. For an invalid UTF-8 or UTF-16 string, the offset is that of the
first code unit of the failing character.
pattern an error occurred. When there is no error, zero is returned. A non-zero
value is not necessarily the furthest point in the pattern that was read. For
example, after the error "lookbehind assertion is not fixed length", the error
offset points to the start of the failing assertion. For an invalid UTF-8 or
UTF-16 string, the offset is that of the first code unit of the failing
character.
.P
Some errors are not detected until the whole pattern has been scanned; in these
cases, the offset passed back is the length of the pattern. Note that the
@ -3160,11 +3162,11 @@ The backtracking match limit was reached.
.sp
PCRE2_ERROR_NOMEMORY
.sp
If a pattern contains many nested backtracking points, heap memory is used to
remember them. This error is given when the memory allocation function (default
or custom) fails. Note that a different error, PCRE2_ERROR_HEAPLIMIT, is given
if the amount of memory needed exceeds the heap limit. PCRE2_ERROR_NOMEMORY is
also returned if PCRE2_COPY_MATCHED_SUBJECT is set and memory allocation fails.
Heap memory is used to remember backtracking points. This error is given when
the memory allocation function (default or custom) fails. Note that a different
error, PCRE2_ERROR_HEAPLIMIT, is given if the amount of memory needed exceeds
the heap limit. PCRE2_ERROR_NOMEMORY is also returned if
PCRE2_COPY_MATCHED_SUBJECT is set and memory allocation fails.
.sp
PCRE2_ERROR_NULL
.sp
@ -4025,6 +4027,6 @@ Cambridge, England.
.rs
.sp
.nf
Last updated: 14 December 2021
Copyright (c) 1997-2021 University of Cambridge.
Last updated: 27 July 2022
Copyright (c) 1997-2022 University of Cambridge.
.fi

View File

@ -1,4 +1,4 @@
.TH PCRE2BUILD 3 "08 December 2021" "PCRE2 10.40"
.TH PCRE2BUILD 3 "27 July 2022" "PCRE2 10.41"
.SH NAME
PCRE2 - Perl-compatible regular expressions (revised API)
.
@ -278,12 +278,11 @@ to the \fBconfigure\fP command. This setting also applies to the
\fBpcre2_dfa_match()\fP matching function, and to JIT matching (though the
counting is done differently).
.P
The \fBpcre2_match()\fP function starts out using a 20KiB vector on the system
stack to record backtracking points. The more nested backtracking points there
are (that is, the deeper the search tree), the more memory is needed. If the
initial vector is not large enough, heap memory is used, up to a certain limit,
which is specified in kibibytes (units of 1024 bytes). The limit can be changed
at run time, as described in the
The \fBpcre2_match()\fP function uses heap memory to record backtracking
points. The more nested backtracking points there are (that is, the deeper the
search tree), the more memory is needed. There is an upper limit, specified in
kibibytes (units of 1024 bytes). This limit can be changed at run time, as
described in the
.\" HREF
\fBpcre2api\fP
.\"
@ -625,7 +624,7 @@ give a warning.
.sp
.nf
Philip Hazel
University Computing Service
Retired from University Computing Service
Cambridge, England.
.fi
.
@ -634,6 +633,6 @@ Cambridge, England.
.rs
.sp
.nf
Last updated: 08 December 2021
Copyright (c) 1997-2021 University of Cambridge.
Last updated: 27 July 2022
Copyright (c) 1997-2022 University of Cambridge.
.fi

View File

@ -1,4 +1,4 @@
.TH PCRE2GREP 1 "31 August 2021" "PCRE2 10.38"
.TH PCRE2GREP 1 "30 July 2022" "PCRE2 10.41"
.SH NAME
pcre2grep - a grep with Perl-compatible regular expressions.
.SH SYNOPSIS
@ -43,13 +43,15 @@ For example:
.sp
pcre2grep some-pattern file1 - file3
.sp
Input files are searched line by line. By default, each line that matches a
By default, input files are searched line by line. Each line that matches a
pattern is copied to the standard output, and if there is more than one file,
the file name is output at the start of each line, followed by a colon.
However, there are options that can change how \fBpcre2grep\fP behaves. In
particular, the \fB-M\fP option makes it possible to search for strings that
span line boundaries. What defines a line boundary is controlled by the
\fB-N\fP (\fB--newline\fP) option.
However, there are options that can change how \fBpcre2grep\fP behaves. For
example, the \fB-M\fP option makes it possible to search for strings that span
line boundaries. What defines a line boundary is controlled by the \fB-N\fP
(\fB--newline\fP) option. The \fB-h\fP and \fB-H\fP options control whether or
not file names are shown, and the \fB-Z\fP option changes the file name
terminator to a zero byte.
.P
The amount of memory used for buffering files that are being scanned is
controlled by parameters that can be set by the \fB--buffer-size\fP and
@ -149,9 +151,11 @@ Output up to \fInumber\fP lines of context after each matching line. Fewer
lines are output if the next match or the end of the file is reached, or if the
processing buffer size has been set too small. If file names and/or line
numbers are being output, a hyphen separator is used instead of a colon for the
context lines. A line containing "--" is output between each group of lines,
unless they are in fact contiguous in the input file. The value of \fInumber\fP
is expected to be relatively small. When \fB-c\fP is used, \fB-A\fP is ignored.
context lines (the \fB-Z\fP option can be used to change the file name
terminator to a zero byte). A line containing "--" is output between each group
of lines, unless they are in fact contiguous in the input file. The value of
\fInumber\fP is expected to be relatively small. When \fB-c\fP is used,
\fB-A\fP is ignored.
.TP
\fB-a\fP, \fB--text\fP
Treat binary files as text. This is equivalent to
@ -167,9 +171,10 @@ Output up to \fInumber\fP lines of context before each matching line. Fewer
lines are output if the previous match or the start of the file is within
\fInumber\fP lines, or if the processing buffer size has been set too small. If
file names and/or line numbers are being output, a hyphen separator is used
instead of a colon for the context lines. A line containing "--" is output
between each group of lines, unless they are in fact contiguous in the input
file. The value of \fInumber\fP is expected to be relatively small. When
instead of a colon for the context lines (the \fB-Z\fP option can be used to
change the file name terminator to a zero byte). A line containing "--" is
output between each group of lines, unless they are in fact contiguous in the
input file. The value of \fInumber\fP is expected to be relatively small. When
\fB-c\fP is used, \fB-B\fP is ignored.
.TP
\fB--binary-files=\fP\fIword\fP
@ -356,19 +361,21 @@ shown separately. This option is mutually exclusive with \fB--output\fP,
.TP
\fB-H\fP, \fB--with-filename\fP
Force the inclusion of the file name at the start of output lines when
searching a single file. By default, the file name is not shown in this case.
For matching lines, the file name is followed by a colon; for context lines, a
hyphen separator is used. If a line number is also being output, it follows the
file name. When the \fB-M\fP option causes a pattern to match more than one
line, only the first is preceded by the file name. This option overrides any
previous \fB-h\fP, \fB-l\fP, or \fB-L\fP options.
searching a single file. The file name is not normally shown in this case.
By default, for matching lines, the file name is followed by a colon; for
context lines, a hyphen separator is used. The \fB-Z\fP option can be used to
change the terminator to a zero byte. If a line number is also being output,
it follows the file name. When the \fB-M\fP option causes a pattern to match
more than one line, only the first is preceded by the file name. This option
overrides any previous \fB-h\fP, \fB-l\fP, or \fB-L\fP options.
.TP
\fB-h\fP, \fB--no-filename\fP
Suppress the output file names when searching multiple files. By default,
file names are shown when multiple files are searched. For matching lines, the
file name is followed by a colon; for context lines, a hyphen separator is used.
If a line number is also being output, it follows the file name. This option
overrides any previous \fB-H\fP, \fB-L\fP, or \fB-l\fP options.
Suppress the output file names when searching multiple files. File names are
normally shown when multiple files are searched. By default, for matching
lines, the file name is followed by a colon; for context lines, a hyphen
separator is used. The \fB-Z\fP option can be used to change the terminator to
a zero byte. If a line number is also being output, it follows the file name.
This option overrides any previous \fB-H\fP, \fB-L\fP, or \fB-l\fP options.
.TP
\fB--heap-limit\fP=\fInumber\fP
See \fB--match-limit\fP below.
@ -417,17 +424,19 @@ given any number of times. If a directory matches both \fB--include-dir\fP and
\fB-L\fP, \fB--files-without-match\fP
Instead of outputting lines from the files, just output the names of the files
that do not contain any lines that would have been output. Each file name is
output once, on a separate line. This option overrides any previous \fB-H\fP,
\fB-h\fP, or \fB-l\fP options.
output once, on a separate line by default, but if the \fB-Z\fP option is set,
they are separated by zero bytes instead of newlines. This option overrides any
previous \fB-H\fP, \fB-h\fP, or \fB-l\fP options.
.TP
\fB-l\fP, \fB--files-with-matches\fP
Instead of outputting lines from the files, just output the names of the files
containing lines that would have been output. Each file name is output once, on
a separate line. Searching normally stops as soon as a matching line is found
in a file. However, if the \fB-c\fP (count) option is also used, matching
continues in order to obtain the correct count, and those files that have at
least one match are listed along with their counts. Using this option with
\fB-c\fP is a way of suppressing the listing of files with no matches that
a separate line, but if the \fB-Z\fP option is set, they are separated by zero
bytes instead of newlines. Searching normally stops as soon as a matching line
is found in a file. However, if the \fB-c\fP (count) option is also used,
matching continues in order to obtain the correct count, and those files that
have at least one match are listed along with their counts. Using this option
with \fB-c\fP is a way of suppressing the listing of files with no matches that
occurs with \fB-c\fP on its own. This option overrides any previous \fB-H\fP,
\fB-h\fP, or \fB-L\fP options.
.TP
@ -516,10 +525,7 @@ counter that is incremented each time around its main processing loop. If the
value set by \fB--match-limit\fP is reached, an error occurs.
.sp
The \fB--heap-limit\fP option specifies, as a number of kibibytes (units of
1024 bytes), the amount of heap memory that may be used for matching. Heap
memory is needed only if matching the pattern requires a significant number of
nested backtracking points to be remembered. This parameter can be set to zero
to forbid the use of heap memory altogether.
1024 bytes), the maximum amount of heap memory that may be used for matching.
.sp
The \fB--depth-limit\fP option limits the depth of nested backtracking points,
which indirectly limits the amount of memory that is used. The amount of memory
@ -732,6 +738,12 @@ be more than one line. This is equivalent to having "^(?:" at the start of each
pattern and ")$" at the end. This option applies only to the patterns that are
matched against the contents of files; it does not apply to patterns specified
by any of the \fB--include\fP or \fB--exclude\fP options.
.TP
\fB-Z\fP, \fB--null\fP
Terminate file names in the regular output with a zero byte (the NUL
character) instead of what would normally appear. This is useful when file
names contain unusual characters such as colons, hyphens, or even newlines. The
option does not apply to file names in error messages.
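
A consumer of \fB-Z\fP output has to split on zero bytes rather than on newlines or colons. The following is a minimal sketch of that, assuming the output of something like "pcre2grep -lZ" has already been read into a buffer and that each file name is followed by one zero byte. The helper name split_nul_names is hypothetical and is not part of pcre2grep or PCRE2.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of PCRE2): walk a buffer of zero-terminated
   file names and store a pointer to the start of each complete name in
   names[], up to max entries. Returns the number of names stored. */
static size_t split_nul_names(const char *buf, size_t len,
                              const char **names, size_t max)
{
    size_t count = 0;
    size_t start = 0;

    assert(buf != NULL && names != NULL);

    while (start < len && count < max) {
        /* Find the zero byte that terminates the current name. */
        const char *end = memchr(buf + start, '\0', len - start);
        if (end == NULL)
            break;  /* trailing bytes with no terminator: not a complete name */
        names[count++] = buf + start;
        start = (size_t)(end - buf) + 1;
    }
    return count;
}
```

Because each stored pointer refers to a zero-terminated string inside the buffer, names that contain colons, hyphens, or newlines come through unchanged.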
.
.
.SH "ENVIRONMENT VARIABLES"
@ -960,6 +972,6 @@ Cambridge, England.
.rs
.sp
.nf
Last updated: 31 August 2021
Copyright (c) 1997-2021 University of Cambridge.
Last updated: 30 July 2022
Copyright (c) 1997-2022 University of Cambridge.
.fi

File diff suppressed because it is too large

View File

@ -1,4 +1,4 @@
.TH PCRE2LIMITS 3 "03 February 2019" "PCRE2 10.33"
.TH PCRE2LIMITS 3 "26 July 2022" "PCRE2 10.41"
.SH NAME
PCRE2 - Perl-compatible regular expressions (revised API)
.SH "SIZE AND OTHER LIMITATIONS"
@ -51,6 +51,10 @@ is 255 code units for the 8-bit library and 65535 code units for the 16-bit and
.P
The maximum length of a string argument to a callout is the largest number a
32-bit unsigned integer can hold.
.P
The maximum amount of heap memory used for matching is controlled by the heap
limit, which can be set in a pattern or in a match context. The default is a
very large number, effectively unlimited.
.
.
.SH AUTHOR
@ -58,7 +62,7 @@ The maximum length of a string argument to a callout is the largest number a
.sp
.nf
Philip Hazel
University Computing Service
Retired from University Computing Service
Cambridge, England.
.fi
.
@ -67,6 +71,6 @@ Cambridge, England.
.rs
.sp
.nf
Last updated: 02 February 2019
Copyright (c) 1997-2019 University of Cambridge.
Last updated: 26 July 2022
Copyright (c) 1997-2022 University of Cambridge.
.fi

View File

@ -1,4 +1,4 @@
.TH PCRE2PERFORM 3 "03 February 2019" "PCRE2 10.33"
.TH PCRE2PERFORM 3 "27 July 2022" "PCRE2 10.41"
.SH NAME
PCRE2 - Perl-compatible regular expressions (revised API)
.SH "PCRE2 PERFORMANCE"
@ -69,12 +69,28 @@ From release 10.30, the interpretive (non-JIT) version of \fBpcre2_match()\fP
uses very little system stack at run time. In earlier releases recursive
function calls could use a great deal of stack, and this could cause problems,
but this usage has been eliminated. Backtracking positions are now explicitly
remembered in memory frames controlled by the code. An initial 20KiB vector of
frames is allocated on the system stack (enough for about 100 frames for small
patterns), but if this is insufficient, heap memory is used. The amount of heap
memory can be limited; if the limit is set to zero, only the initial stack
vector is used. Rewriting patterns to be time-efficient, as described below,
may also reduce the memory requirements.
remembered in memory frames controlled by the code.
.P
The size of each frame depends on the size of pointer variables and the number
of capturing parenthesized groups in the pattern being matched. On a 64-bit
system the frame size for a pattern with no captures is 128 bytes. For each
capturing group the size increases by 16 bytes.
.P
Until release 10.41, an initial 20KiB frames vector was allocated on the system
stack, but this still caused some issues for multi-thread applications where
each thread has a very small stack. From release 10.41 backtracking memory
frames are always held in heap memory. An initial heap allocation is obtained
the first time any match data block is passed to \fBpcre2_match()\fP. This is
remembered with the match data block and re-used if that block is used for
another match. It is freed when the match data block itself is freed.
.P
The size of the initial block is the larger of 20KiB or ten times the pattern's
frame size, unless the heap limit is less than this, in which case the heap
limit is used. If the initial block proves to be too small during matching, it
is replaced by a larger block, subject to the heap limit. The heap limit is
checked only when a new block is to be allocated. Reducing the heap limit
between calls to \fBpcre2_match()\fP with the same match data block does not
affect the saved block.
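
The sizing rules above can be checked with a little arithmetic. The following is a sketch only, using the 64-bit figures quoted in the text (128 bytes per frame plus 16 bytes per capturing group); the helper names frame_size and initial_block_size are hypothetical and do not exist in PCRE2, and the real frame size depends on the platform.

```c
#include <assert.h>
#include <stdint.h>

#define BASE_FRAME_SIZE    128u            /* bytes, pattern with no captures */
#define PER_GROUP_SIZE      16u            /* extra bytes per capturing group */
#define MIN_INITIAL_BLOCK  (20u * 1024u)   /* 20KiB */

/* Hypothetical helper: frame size in bytes for a pattern with the given
   number of capturing groups, using the 64-bit figures from the text. */
static uint64_t frame_size(uint32_t groups)
{
    assert(groups <= 65535);  /* PCRE2's maximum number of capturing groups */
    return BASE_FRAME_SIZE + (uint64_t)PER_GROUP_SIZE * groups;
}

/* Hypothetical helper: size in bytes of the initial frames block, given the
   heap limit in kibibytes. It is the larger of 20KiB and ten frames, capped
   by the heap limit. */
static uint64_t initial_block_size(uint32_t groups, uint32_t heap_limit_kib)
{
    uint64_t wanted = 10 * frame_size(groups);
    uint64_t limit = (uint64_t)heap_limit_kib * 1024;

    if (wanted < MIN_INITIAL_BLOCK)
        wanted = MIN_INITIAL_BLOCK;
    return (limit < wanted) ? limit : wanted;
}
```

Under these assumptions a pattern with no captures gets a 20480-byte initial block, a pattern with 200 capturing groups (3328-byte frames) gets 33280 bytes, and a heap limit of 4 reduces either to 4096 bytes.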
.P
In contrast to \fBpcre2_match()\fP, \fBpcre2_dfa_match()\fP does use recursive
function calls, but only for processing atomic groups, lookaround assertions,
@ -230,7 +246,7 @@ pattern to match. This is done by repeatedly matching with different limits.
.sp
.nf
Philip Hazel
University Computing Service
Retired from University Computing Service
Cambridge, England.
.fi
.
@ -239,6 +255,6 @@ Cambridge, England.
.rs
.sp
.nf
Last updated: 03 February 2019
Copyright (c) 1997-2019 University of Cambridge.
Last updated: 27 July 2022
Copyright (c) 1997-2022 University of Cambridge.
.fi

View File

@ -81,7 +81,7 @@ of serialized patterns, or one of the following negative error codes:
.sp
PCRE2_ERROR_BADDATA the number of patterns is zero or less
PCRE2_ERROR_BADMAGIC mismatch of id bytes in one of the patterns
PCRE2_ERROR_MEMORY memory allocation failed
PCRE2_ERROR_NOMEMORY memory allocation failed
PCRE2_ERROR_MIXEDTABLES the patterns do not all use the same tables
PCRE2_ERROR_NULL the 1st, 3rd, or 4th argument is NULL
.sp

View File

@ -1,4 +1,4 @@
.TH PCRE2TEST 1 "12 January 2022" "PCRE 10.40"
.TH PCRE2TEST 1 "27 July 2022" "PCRE 10.41"
.SH NAME
pcre2test - a program for testing Perl-compatible regular expressions.
.SH SYNOPSIS
@ -1206,7 +1206,8 @@ pattern, but can be overridden by modifiers on the subject.
copy=<number or name> copy captured substring
depth_limit=<n> set a depth limit
dfa use \fBpcre2_dfa_match()\fP
find_limits find match and depth limits
find_limits find heap, match and depth limits
find_limits_noheap find match and depth limits
get=<number or name> extract captured substring
getall extract all captured substrings
/g global global matching
@ -1528,7 +1529,7 @@ value that was set on the pattern.
.sp
The \fBheap_limit\fP, \fBmatch_limit\fP, and \fBdepth_limit\fP modifiers set
the appropriate limits in the match context. These values are ignored when the
\fBfind_limits\fP modifier is specified.
\fBfind_limits\fP or \fBfind_limits_noheap\fP modifier is specified.
.
.
.SS "Finding minimum limits"
@ -1538,8 +1539,12 @@ If the \fBfind_limits\fP modifier is present on a subject line, \fBpcre2test\fP
calls the relevant matching function several times, setting different values in
the match context via \fBpcre2_set_heap_limit()\fP,
\fBpcre2_set_match_limit()\fP, or \fBpcre2_set_depth_limit()\fP until it finds
the minimum values for each parameter that allows the match to complete without
error. If JIT is being used, only the match limit is relevant.
the smallest value for each parameter that allows the match to complete without
a "limit exceeded" error. The match itself may succeed or fail. An alternative
modifier, \fBfind_limits_noheap\fP, omits the heap limit. This is used in the
standard tests, because the minimum heap limit varies between systems. If JIT
is being used, only the match limit is relevant, and the other two are
automatically omitted.
.P
When using this modifier, the pattern should not contain any limit settings
such as (*LIMIT_MATCH=...) within it. If such a setting is present and is
@ -1563,9 +1568,7 @@ and non-recursive, to the internal matching function, thus controlling the
overall amount of computing resource that is used.
.P
For both kinds of matching, the \fIheap_limit\fP number, which is in kibibytes
(units of 1024 bytes), limits the amount of heap memory used for matching. A
value of zero disables the use of any heap memory; many simple pattern matches
can be done without using the heap, so zero is not an unreasonable setting.
(units of 1024 bytes), limits the amount of heap memory used for matching.
.
.
.SS "Showing MARK names"
@ -1584,12 +1587,10 @@ is added to the non-match message.
.sp
The \fBmemory\fP modifier causes \fBpcre2test\fP to log the sizes of all heap
memory allocation and freeing calls that occur during a call to
\fBpcre2_match()\fP or \fBpcre2_dfa_match()\fP. These occur only when a match
requires a bigger vector than the default for remembering backtracking points
(\fBpcre2_match()\fP) or for internal workspace (\fBpcre2_dfa_match()\fP). In
many cases there will be no heap memory used and therefore no additional
output. No heap memory is allocated during matching with JIT, so in that case
the \fBmemory\fP modifier never has any effect. For this modifier to work, the
\fBpcre2_match()\fP or \fBpcre2_dfa_match()\fP. In the latter case, heap memory
is used only when a match requires more internal workspace than the default
allocation on the stack, so in many cases there will be no output. No heap
memory is allocated during matching with JIT. For this modifier to work, the
\fBnull_context\fP modifier must not be set on both the pattern and the
subject, though it can be set on one or the other.
.
@ -1649,7 +1650,8 @@ Normally, \fBpcre2test\fP passes a context block to \fBpcre2_match()\fP,
If the \fBnull_context\fP modifier is set, however, NULL is passed. This is for
testing that the matching and substitution functions behave correctly in this
case (they use default values). This modifier cannot be used with the
\fBfind_limits\fP or \fBsubstitute_callout\fP modifiers.
\fBfind_limits\fP, \fBfind_limits_noheap\fP, or \fBsubstitute_callout\fP
modifiers.
.P
Similarly, for testing purposes, if the \fBnull_subject\fP or
\fBnull_replacement\fP modifier is set, the subject or replacement string
@ -2119,6 +2121,6 @@ Cambridge, England.
.rs
.sp
.nf
Last updated: 12 January 2022
Last updated: 27 July 2022
Copyright (c) 1997-2022 University of Cambridge.
.fi

View File

@ -1111,7 +1111,8 @@ SUBJECT MODIFIERS
copy=<number or name> copy captured substring
depth_limit=<n> set a depth limit
dfa use pcre2_dfa_match()
find_limits find match and depth limits
find_limits find heap, match and depth limits
find_limits_noheap find match and depth limits
get=<number or name> extract captured substring
getall extract all captured substrings
/g global global matching
@ -1411,7 +1412,7 @@ SUBJECT MODIFIERS
The heap_limit, match_limit, and depth_limit modifiers set the appro-
priate limits in the match context. These values are ignored when the
find_limits modifier is specified.
find_limits or find_limits_noheap modifier is specified.
Finding minimum limits
@ -1419,8 +1420,12 @@ SUBJECT MODIFIERS
calls the relevant matching function several times, setting different
values in the match context via pcre2_set_heap_limit(),
pcre2_set_match_limit(), or pcre2_set_depth_limit() until it finds the
minimum values for each parameter that allows the match to complete
without error. If JIT is being used, only the match limit is relevant.
smallest value for each parameter that allows the match to complete
without a "limit exceeded" error. The match itself may succeed or fail.
An alternative modifier, find_limits_noheap, omits the heap limit. This
is used in the standard tests, because the minimum heap limit varies
between systems. If JIT is being used, only the match limit is rele-
vant, and the other two are automatically omitted.
When using this modifier, the pattern should not contain any limit set-
tings such as (*LIMIT_MATCH=...) within it. If such a setting is
@ -1446,9 +1451,7 @@ SUBJECT MODIFIERS
For both kinds of matching, the heap_limit number, which is in
kibibytes (units of 1024 bytes), limits the amount of heap memory used
for matching. A value of zero disables the use of any heap memory; many
simple pattern matches can be done without using the heap, so zero is
not an unreasonable setting.
for matching.
Showing MARK names
@ -1463,13 +1466,11 @@ SUBJECT MODIFIERS
The memory modifier causes pcre2test to log the sizes of all heap mem-
ory allocation and freeing calls that occur during a call to
pcre2_match() or pcre2_dfa_match(). These occur only when a match re-
quires a bigger vector than the default for remembering backtracking
points (pcre2_match()) or for internal workspace (pcre2_dfa_match()).
In many cases there will be no heap memory used and therefore no addi-
tional output. No heap memory is allocated during matching with JIT, so
in that case the memory modifier never has any effect. For this modi-
fier to work, the null_context modifier must not be set on both the
pcre2_match() or pcre2_dfa_match(). In the latter case, heap memory is
used only when a match requires more internal workspace than the
default allocation on the stack, so in many cases there will be no
output. No heap memory is allocated during matching with JIT. For this
modifier to work, the null_context modifier must not be set on both the
pattern and the subject, though it can be set on one or the other.
Setting a starting offset
@ -1518,45 +1519,46 @@ SUBJECT MODIFIERS
null_context modifier is set, however, NULL is passed. This is for
testing that the matching and substitution functions behave correctly
in this case (they use default values). This modifier cannot be used
with the find_limits or substitute_callout modifiers.
with the find_limits, find_limits_noheap, or substitute_callout modi-
fiers.
Similarly, for testing purposes, if the null_subject or null_replace-
ment modifier is set, the subject or replacement string pointers are
passed as NULL, respectively, to the relevant functions.
THE ALTERNATIVE MATCHING FUNCTION
By default, pcre2test uses the standard PCRE2 matching function,
By default, pcre2test uses the standard PCRE2 matching function,
pcre2_match() to match each subject line. PCRE2 also supports an alter-
native matching function, pcre2_dfa_match(), which operates in a dif-
ferent way, and has some restrictions. The differences between the two
native matching function, pcre2_dfa_match(), which operates in a dif-
ferent way, and has some restrictions. The differences between the two
functions are described in the pcre2matching documentation.
If the dfa modifier is set, the alternative matching function is used.
This function finds all possible matches at a given point in the sub-
ject. If, however, the dfa_shortest modifier is set, processing stops
after the first match is found. This is always the shortest possible
If the dfa modifier is set, the alternative matching function is used.
This function finds all possible matches at a given point in the sub-
ject. If, however, the dfa_shortest modifier is set, processing stops
after the first match is found. This is always the shortest possible
match.
DEFAULT OUTPUT FROM pcre2test
This section describes the output when the normal matching function,
This section describes the output when the normal matching function,
pcre2_match(), is being used.
When a match succeeds, pcre2test outputs the list of captured sub-
strings, starting with number 0 for the string that matched the whole
When a match succeeds, pcre2test outputs the list of captured sub-
strings, starting with number 0 for the string that matched the whole
pattern. Otherwise, it outputs "No match" when the return is PCRE2_ER-
ROR_NOMATCH, or "Partial match:" followed by the partially matching
substring when the return is PCRE2_ERROR_PARTIAL. (Note that this is
the entire substring that was inspected during the partial match; it
may include characters before the actual match start if a lookbehind
ROR_NOMATCH, or "Partial match:" followed by the partially matching
substring when the return is PCRE2_ERROR_PARTIAL. (Note that this is
the entire substring that was inspected during the partial match; it
may include characters before the actual match start if a lookbehind
assertion, \K, \b, or \B was involved.)
For any other return, pcre2test outputs the PCRE2 negative error number
and a short descriptive phrase. If the error is a failed UTF string
check, the code unit offset of the start of the failing character is
and a short descriptive phrase. If the error is a failed UTF string
check, the code unit offset of the start of the failing character is
also output. Here is an example of an interactive pcre2test run.
$ pcre2test
@ -1572,8 +1574,8 @@ DEFAULT OUTPUT FROM pcre2test
Unset capturing substrings that are not followed by one that is set are
not shown by pcre2test unless the allcaptures modifier is specified. In
the following example, there are two capturing substrings, but when the
first data line is matched, the second, unset substring is not shown.
An "internal" unset substring is shown as "<unset>", as for the second
first data line is matched, the second, unset substring is not shown.
An "internal" unset substring is shown as "<unset>", as for the second
data line.
re> /(a)|(b)/
@ -1585,11 +1587,11 @@ DEFAULT OUTPUT FROM pcre2test
1: <unset>
2: b
If the strings contain any non-printing characters, they are output as
\xhh escapes if the value is less than 256 and UTF mode is not set.
If the strings contain any non-printing characters, they are output as
\xhh escapes if the value is less than 256 and UTF mode is not set.
Otherwise they are output as \x{hh...} escapes. See below for the defi-
nition of non-printing characters. If the aftertext modifier is set,
the output for substring 0 is followed by the rest of the subject
nition of non-printing characters. If the aftertext modifier is set,
the output for substring 0 is followed by the rest of the subject
string, identified by "0+" like this:
re> /cat/aftertext
@ -1609,8 +1611,8 @@ DEFAULT OUTPUT FROM pcre2test
0: ipp
1: pp
"No match" is output only if the first match attempt fails. Here is an
example of a failure message (the offset 4 that is specified by the
"No match" is output only if the first match attempt fails. Here is an
example of a failure message (the offset 4 that is specified by the
offset modifier is past the end of the subject string):
re> /xyz/
@ -1618,7 +1620,7 @@ DEFAULT OUTPUT FROM pcre2test
Error -24 (bad offset value)
Note that whereas patterns can be continued over several lines (a plain
">" prompt is used for continuations), subject lines may not. However
">" prompt is used for continuations), subject lines may not. However
newlines can be included in a subject by means of the \n escape (or \r,
\r\n, etc., depending on the newline sequence setting).
@ -1626,7 +1628,7 @@ DEFAULT OUTPUT FROM pcre2test
OUTPUT FROM THE ALTERNATIVE MATCHING FUNCTION
When the alternative matching function, pcre2_dfa_match(), is used, the
output consists of a list of all the matches that start at the first
output consists of a list of all the matches that start at the first
point in the subject where there is at least one match. For example:
re> /(tang|tangerine|tan)/
@ -1635,11 +1637,11 @@ OUTPUT FROM THE ALTERNATIVE MATCHING FUNCTION
1: tang
2: tan
Using the normal matching function on this data finds only "tang". The
longest matching string is always given first (and numbered zero). Af-
ter a PCRE2_ERROR_PARTIAL return, the output is "Partial match:", fol-
Using the normal matching function on this data finds only "tang". The
longest matching string is always given first (and numbered zero). Af-
ter a PCRE2_ERROR_PARTIAL return, the output is "Partial match:", fol-
lowed by the partially matching substring. Note that this is the entire
substring that was inspected during the partial match; it may include
substring that was inspected during the partial match; it may include
characters before the actual match start if a lookbehind assertion, \b,
or \B was involved. (\K is not supported for DFA matching.)
@ -1655,16 +1657,16 @@ OUTPUT FROM THE ALTERNATIVE MATCHING FUNCTION
1: tan
0: tan
The alternative matching function does not support substring capture,
so the modifiers that are concerned with captured substrings are not
The alternative matching function does not support substring capture,
so the modifiers that are concerned with captured substrings are not
relevant.
RESTARTING AFTER A PARTIAL MATCH
When the alternative matching function has given the PCRE2_ERROR_PAR-
When the alternative matching function has given the PCRE2_ERROR_PAR-
TIAL return, indicating that the subject partially matched the pattern,
you can restart the match with additional subject data by means of the
you can restart the match with additional subject data by means of the
dfa_restart modifier. For example:
re> /^\d?\d(jan|feb|mar|apr|may|jun|jul|aug|sep|oct|nov|dec)\d\d$/
@ -1673,37 +1675,37 @@ RESTARTING AFTER A PARTIAL MATCH
data> n05\=dfa,dfa_restart
0: n05
For further information about partial matching, see the pcre2partial
For further information about partial matching, see the pcre2partial
documentation.
CALLOUTS
If the pattern contains any callout requests, pcre2test's callout func-
tion is called during matching unless callout_none is specified. This
tion is called during matching unless callout_none is specified. This
works with both matching functions, and with JIT, though there are some
differences in behaviour. The output for callouts with numerical argu-
differences in behaviour. The output for callouts with numerical argu-
ments and those with string arguments is slightly different.
Callouts with numerical arguments
By default, the callout function displays the callout number, the start
and current positions in the subject text at the callout time, and the
and current positions in the subject text at the callout time, and the
next pattern item to be tested. For example:
--->pqrabcdef
0 ^ ^ \d
This output indicates that callout number 0 occurred for a match at-
tempt starting at the fourth character of the subject string, when the
pointer was at the seventh character, and when the next pattern item
was \d. Just one circumflex is output if the start and current posi-
This output indicates that callout number 0 occurred for a match at-
tempt starting at the fourth character of the subject string, when the
pointer was at the seventh character, and when the next pattern item
was \d. Just one circumflex is output if the start and current posi-
tions are the same, or if the current position precedes the start posi-
tion, which can happen if the callout is in a lookbehind assertion.
Callouts numbered 255 are assumed to be automatic callouts, inserted as
a result of the auto_callout pattern modifier. In this case, instead of
showing the callout number, the offset in the pattern, preceded by a
showing the callout number, the offset in the pattern, preceded by a
plus, is output. For example:
re> /\d?[A-E]\*/auto_callout
@ -1730,17 +1732,17 @@ CALLOUTS
+12 ^ ^
0: abc
The mark changes between matching "a" and "b", but stays the same for
the rest of the match, so nothing more is output. If, as a result of
backtracking, the mark reverts to being unset, the text "<unset>" is
The mark changes between matching "a" and "b", but stays the same for
the rest of the match, so nothing more is output. If, as a result of
backtracking, the mark reverts to being unset, the text "<unset>" is
output.
Callouts with string arguments
The output for a callout with a string argument is similar, except that
instead of outputting a callout number before the position indicators,
the callout string and its offset in the pattern string are output be-
fore the reflection of the subject string, and the subject string is
instead of outputting a callout number before the position indicators,
the callout string and its offset in the pattern string are output be-
fore the reflection of the subject string, and the subject string is
reflected for each callout. For example:
re> /^ab(?C'first')cd(?C"second")ef/
@ -1756,26 +1758,26 @@ CALLOUTS
Callout modifiers
The callout function in pcre2test returns zero (carry on matching) by
default, but you can use a callout_fail modifier in a subject line to
The callout function in pcre2test returns zero (carry on matching) by
default, but you can use a callout_fail modifier in a subject line to
change this and other parameters of the callout (see below).
If the callout_capture modifier is set, the current captured groups are
output when a callout occurs. This is useful only for non-DFA matching,
as pcre2_dfa_match() does not support capturing, so no captures are
as pcre2_dfa_match() does not support capturing, so no captures are
ever shown.
The normal callout output, showing the callout number or pattern offset
(as described above) is suppressed if the callout_no_where modifier is
(as described above) is suppressed if the callout_no_where modifier is
set.
When using the interpretive matching function pcre2_match() without
JIT, setting the callout_extra modifier causes additional output from
pcre2test's callout function to be generated. For the first callout in
a match attempt at a new starting position in the subject, "New match
attempt" is output. If there has been a backtrack since the last call-
When using the interpretive matching function pcre2_match() without
JIT, setting the callout_extra modifier causes additional output from
pcre2test's callout function to be generated. For the first callout in
a match attempt at a new starting position in the subject, "New match
attempt" is output. If there has been a backtrack since the last call-
out (or start of matching if this is the first callout), "Backtrack" is
output, followed by "No other matching paths" if the backtrack ended
output, followed by "No other matching paths" if the backtrack ended
the previous match attempt. For example:
re> /(a+)b/auto_callout,no_start_optimize,no_auto_possess
@ -1812,86 +1814,86 @@ CALLOUTS
+1 ^ a+
No match
Notice that various optimizations must be turned off if you want all
possible matching paths to be scanned. If no_start_optimize is not
used, there is an immediate "no match", without any callouts, because
the starting optimization fails to find "b" in the subject, which it
knows must be present for any match. If no_auto_possess is not used,
the "a+" item is turned into "a++", which reduces the number of back-
Notice that various optimizations must be turned off if you want all
possible matching paths to be scanned. If no_start_optimize is not
used, there is an immediate "no match", without any callouts, because
the starting optimization fails to find "b" in the subject, which it
knows must be present for any match. If no_auto_possess is not used,
the "a+" item is turned into "a++", which reduces the number of back-
tracks.
The callout_extra modifier has no effect if used with the DFA matching
The callout_extra modifier has no effect if used with the DFA matching
function, or with JIT.
Return values from callouts
The default return from the callout function is zero, which allows
The default return from the callout function is zero, which allows
matching to continue. The callout_fail modifier can be given one or two
numbers. If there is only one number, 1 is returned instead of 0 (caus-
ing matching to backtrack) when a callout of that number is reached. If
two numbers (<n>:<m>) are given, 1 is returned when callout <n> is
reached and there have been at least <m> callouts. The callout_error
two numbers (<n>:<m>) are given, 1 is returned when callout <n> is
reached and there have been at least <m> callouts. The callout_error
modifier is similar, except that PCRE2_ERROR_CALLOUT is returned, caus-
ing the entire matching process to be aborted. If both these modifiers
are set for the same callout number, callout_error takes precedence.
Note that callouts with string arguments are always given the number
ing the entire matching process to be aborted. If both these modifiers
are set for the same callout number, callout_error takes precedence.
Note that callouts with string arguments are always given the number
zero.
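The precedence between callout_fail and callout_error described above can be sketched in C. This is only an illustration, not the code in pcre2test: the names callout_trigger, callout_settings, callout_return, and SKETCH_ERROR_CALLOUT are hypothetical, the constant merely stands in for PCRE2_ERROR_CALLOUT, and it is assumed that the callout count includes the callout currently being handled. A setting with a single number is modelled as <m> equal to zero.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for PCRE2_ERROR_CALLOUT; the real value is in pcre2.h. */
#define SKETCH_ERROR_CALLOUT (-1000)

/* One <n>[:<m>] setting, as given to callout_fail or callout_error. */
typedef struct {
  bool set;             /* the modifier is present on the subject line */
  unsigned int number;  /* <n>: the callout number to act on */
  unsigned int count;   /* <m>: minimum callouts so far; 0 if only <n> was given */
} callout_trigger;

typedef struct {
  callout_trigger fail;   /* callout_fail */
  callout_trigger error;  /* callout_error */
} callout_settings;

/* True when this callout number has been reached often enough. */
static bool triggered(const callout_trigger *t, unsigned int number,
                      unsigned int seen)
{
  return t->set && t->number == number && seen >= t->count;
}

/* Returns 0 to carry on matching, 1 to force a backtrack, or the error
   stand-in to abort the whole match. The error setting is checked first,
   so it takes precedence when both apply to the same callout number. */
int callout_return(const callout_settings *s, unsigned int number,
                   unsigned int seen)
{
  assert(s != NULL);
  if (triggered(&s->error, number, seen)) return SKETCH_ERROR_CALLOUT;
  if (triggered(&s->fail, number, seen)) return 1;
  return 0;
}
```

With a fail setting of 2:3 and no error setting, callout_return() gives 0 for the first two times callout 2 is reached and 1 from the third time onwards.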
The callout_data modifier can be given an unsigned or a negative num-
ber. This is set as the "user data" that is passed to the matching
function, and passed back when the callout function is invoked. Any
value other than zero is used as a return from pcre2test's callout
The callout_data modifier can be given an unsigned or a negative num-
ber. This is set as the "user data" that is passed to the matching
function, and passed back when the callout function is invoked. Any
value other than zero is used as a return from pcre2test's callout
function.
Inserting callouts can be helpful when using pcre2test to check compli-
cated regular expressions. For further information about callouts, see
cated regular expressions. For further information about callouts, see
the pcre2callout documentation.
NON-PRINTING CHARACTERS
When pcre2test is outputting text in the compiled version of a pattern,
bytes other than 32-126 are always treated as non-printing characters
bytes other than 32-126 are always treated as non-printing characters
and are therefore shown as hex escapes.
When pcre2test is outputting text that is a matched part of a subject
string, it behaves in the same way, unless a different locale has been
set for the pattern (using the locale modifier). In this case, the is-
When pcre2test is outputting text that is a matched part of a subject
string, it behaves in the same way, unless a different locale has been
set for the pattern (using the locale modifier). In this case, the is-
print() function is used to distinguish printing and non-printing char-
acters.
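As a rough illustration of the default rule, the following C sketch formats one character value in the way this document describes: values 32-126 are shown as themselves, other values below 256 are shown as \xhh when UTF mode is not set, and everything else is shown in the braced \x{hh...} form. The function name show_char and its parameters are hypothetical, and the locale case that uses isprint() is deliberately not modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Write the display form of one character value into buf and return the
   length that snprintf reports. This is a sketch for the default locale. */
int show_char(unsigned int c, bool utf, char *buf, size_t size)
{
  assert(buf != NULL && size > 0);
  if (c >= 32 && c <= 126)                     /* printing range */
    return snprintf(buf, size, "%c", (int)c);
  if (c < 256 && !utf)                         /* short form: \xhh */
    return snprintf(buf, size, "\\x%02x", c);
  return snprintf(buf, size, "\\x{%02x}", c);  /* braced form: \x{hh...} */
}
```

Under these assumptions, a linefeed is rendered as \x0a without UTF mode and as \x{0a} with it, while the value 0x100 is always rendered as \x{100}.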
SAVING AND RESTORING COMPILED PATTERNS
It is possible to save compiled patterns on disc or elsewhere, and
It is possible to save compiled patterns on disc or elsewhere, and
reload them later, subject to a number of restrictions. JIT data cannot
be saved. The host on which the patterns are reloaded must be running
be saved. The host on which the patterns are reloaded must be running
the same version of PCRE2, with the same code unit width, and must also
have the same endianness, pointer width and PCRE2_SIZE type. Before
compiled patterns can be saved they must be serialized, that is, con-
verted to a stream of bytes. A single byte stream may contain any num-
ber of compiled patterns, but they must all use the same character ta-
bles. A single copy of the tables is included in the byte stream (its
have the same endianness, pointer width and PCRE2_SIZE type. Before
compiled patterns can be saved they must be serialized, that is, con-
verted to a stream of bytes. A single byte stream may contain any num-
ber of compiled patterns, but they must all use the same character ta-
bles. A single copy of the tables is included in the byte stream (its
size is 1088 bytes).
The functions whose names begin with pcre2_serialize_ are used for se-
rializing and de-serializing. They are described in the pcre2serialize
documentation. In this section we describe the features of pcre2test
The functions whose names begin with pcre2_serialize_ are used for se-
rializing and de-serializing. They are described in the pcre2serialize
documentation. In this section we describe the features of pcre2test
that can be used to test these functions.
Note that "serialization" in PCRE2 does not convert compiled patterns
to an abstract format like Java or .NET. It just makes a reloadable
Note that "serialization" in PCRE2 does not convert compiled patterns
to an abstract format like Java or .NET. It just makes a reloadable
byte code stream. Hence the restrictions on reloading mentioned above.
In pcre2test, when a pattern with push modifier is successfully com-
piled, it is pushed onto a stack of compiled patterns, and pcre2test
expects the next line to contain a new pattern (or command) instead of
In pcre2test, when a pattern with push modifier is successfully com-
piled, it is pushed onto a stack of compiled patterns, and pcre2test
expects the next line to contain a new pattern (or command) instead of
a subject line. By contrast, the pushcopy modifier causes a copy of the
compiled pattern to be stacked, leaving the original available for im-
mediate matching. By using push and/or pushcopy, a number of patterns
can be compiled and retained. These modifiers are incompatible with
compiled pattern to be stacked, leaving the original available for im-
mediate matching. By using push and/or pushcopy, a number of patterns
can be compiled and retained. These modifiers are incompatible with
posix, and control modifiers that act at match time are ignored (with a
message) for the stacked patterns. The jitverify modifier applies only
message) for the stacked patterns. The jitverify modifier applies only
at compile time.
The command
@ -1899,21 +1901,21 @@ SAVING AND RESTORING COMPILED PATTERNS
#save <filename>
causes all the stacked patterns to be serialized and the result written
to the named file. Afterwards, all the stacked patterns are freed. The
to the named file. Afterwards, all the stacked patterns are freed. The
command
#load <filename>
reads the data in the file, and then arranges for it to be de-serial-
ized, with the resulting compiled patterns added to the pattern stack.
The pattern on the top of the stack can be retrieved by the #pop com-
mand, which must be followed by lines of subjects that are to be
matched with the pattern, terminated as usual by an empty line or end
of file. This command may be followed by a modifier list containing
only control modifiers that act after a pattern has been compiled. In
particular, hex, posix, posix_nosub, push, and pushcopy are not al-
lowed, nor are any option-setting modifiers. The JIT modifiers are,
however permitted. Here is an example that saves and reloads two pat-
reads the data in the file, and then arranges for it to be de-serial-
ized, with the resulting compiled patterns added to the pattern stack.
The pattern on the top of the stack can be retrieved by the #pop com-
mand, which must be followed by lines of subjects that are to be
matched with the pattern, terminated as usual by an empty line or end
of file. This command may be followed by a modifier list containing
only control modifiers that act after a pattern has been compiled. In
particular, hex, posix, posix_nosub, push, and pushcopy are not al-
lowed, nor are any option-setting modifiers. The JIT modifiers are,
however, permitted. Here is an example that saves and reloads two
patterns.
/abc/push
@ -1926,10 +1928,10 @@ SAVING AND RESTORING COMPILED PATTERNS
#pop jit,bincode
abc
If jitverify is used with #pop, it does not automatically imply jit,
If jitverify is used with #pop, it does not automatically imply jit,
which is different behaviour from when it is used on a pattern.
The #popcopy command is analogous to the pushcopy modifier in that it
The #popcopy command is analogous to the pushcopy modifier in that it
makes current a copy of the topmost stack pattern, leaving the original
still on the stack.
@ -1949,5 +1951,5 @@ AUTHOR
REVISION
Last updated: 12 January 2022
Last updated: 27 July 2022
Copyright (c) 1997-2022 University of Cambridge.

@ -14,14 +14,14 @@ flexible API, the code of PCRE2 has been much improved since the fork.
## Download
As well as downloading from the
[GitHub site](https://github.com/PhilipHazel/pcre2), you can download PCRE2
[GitHub site](https://github.com/PCRE2Project/pcre2), you can download PCRE2
or the older, unmaintained PCRE1 library from an
[*unofficial* mirror](https://sourceforge.net/projects/pcre/files/) at SourceForge.
You can check out the PCRE2 source code via Git or Subversion:
git clone https://github.com/PhilipHazel/pcre2.git
svn co https://github.com/PhilipHazel/pcre2.git
git clone https://github.com/PCRE2Project/pcre2.git
svn co https://github.com/PCRE2Project/pcre2.git
## Contributed Ports
@ -36,7 +36,7 @@ default character encoding, can be found at
## Documentation
You can read the PCRE2 documentation
[here](https://philiphazel.github.io/pcre2/doc/html/index.html).
[here](https://PCRE2Project.github.io/pcre2/doc/html/index.html).
Comparisons to Perl's regular expression semantics can be found in the
community authored Wikipedia entry for PCRE.

@ -78,9 +78,9 @@ utf8.c
A short, freestanding C program for converting a Unicode code point into a
sequence of bytes in the UTF-8 encoding, and vice versa. If its argument is a
hex number such as 0x1234, it outputs a list of the equivalent UTF-8 bytes.
If its argument is a sequence of concatenated UTF-8 bytes (e.g. e188b4) it
treats them as a UTF-8 character and outputs the equivalent code point in
hex. See comments at its head for details.
If its argument is a sequence of concatenated UTF-8 bytes (e.g. 12e188b4) it
treats them as a UTF-8 string and outputs the equivalent code points in hex.
See comments at its head for details.
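
The code point to UTF-8 direction can be sketched as follows. This is an independent illustration of the standard encoding, not the contents of utf8.c: the function name utf8_encode is hypothetical, and the choice to reject surrogates and values above U+10FFFF is an assumption of this sketch rather than a statement about what utf8.c does.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Encode one code point as UTF-8 into out, which must hold at least 4 bytes.
   Returns the number of bytes written, or 0 for a surrogate or a value
   above U+10FFFF. */
size_t utf8_encode(uint32_t cp, unsigned char *out)
{
  assert(out != NULL);
  if (cp < 0x80) {                          /* 1 byte: 0xxxxxxx */
    out[0] = (unsigned char)cp;
    return 1;
  }
  if (cp < 0x800) {                         /* 2 bytes: 110xxxxx 10xxxxxx */
    out[0] = (unsigned char)(0xC0 | (cp >> 6));
    out[1] = (unsigned char)(0x80 | (cp & 0x3F));
    return 2;
  }
  if (cp < 0x10000) {                       /* 3 bytes: 1110xxxx + 2 trailers */
    if (cp >= 0xD800 && cp <= 0xDFFF) return 0;
    out[0] = (unsigned char)(0xE0 | (cp >> 12));
    out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
    out[2] = (unsigned char)(0x80 | (cp & 0x3F));
    return 3;
  }
  if (cp <= 0x10FFFF) {                     /* 4 bytes: 11110xxx + 3 trailers */
    out[0] = (unsigned char)(0xF0 | (cp >> 18));
    out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
    out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
    out[3] = (unsigned char)(0x80 | (cp & 0x3F));
    return 4;
  }
  return 0;
}
```

For the code point 0x1234 this produces the three bytes e1 88 b4, and for 0x12 the single byte 12, which agrees with the byte sequences quoted above.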
Updating to a new Unicode release
@ -94,8 +94,9 @@ directory.
Note: Previously, it was necessary to update lists of scripts and their
abbreviations by hand before running the Python scripts. This is no longer
necessary because the scripts have been upgraded to extract this information
themselves. Also, there used to be explicit lists of script in two of the man
pages. This is no longer the case.
themselves. Also, there used to be explicit lists of scripts in two of the man
pages. This is no longer the case; the pcre2test program can now output a list
of supported scripts.
You can give an output file name as an argument to the following scripts, but
by default:
@ -129,8 +130,8 @@ files should eventually be installed in the main testdata directory.
Preparing for a PCRE2 release
=============================
This section contains a checklist of things that I consult before building a
distribution for a new release.
This section contains a checklist of things that I do before building a new
release.
. Ensure that the version number and version date are correct in configure.ac.
@ -139,17 +140,16 @@ distribution for a new release.
. If new build options or new source files have been added, ensure that they
are added to the CMake files as well as to the autoconf files. The relevant
files are CMakeLists.txt and config-cmake.h.in. After making a release
tarball, test it out with CMake if there have been changes here.
files are CMakeLists.txt and config-cmake.h.in. After making a release, test
it out with CMake if there have been changes here.
. Run ./autogen.sh to ensure everything is up-to-date.
. Compile and test with many different config options, and combinations of
options. Also, test with valgrind by running "RunTest valgrind" and
"RunGrepTest valgrind" (which takes quite a long time). The script
maint/ManyConfigTests now encapsulates this testing. It runs tests with
different configurations, and it also runs some of them with valgrind, all of
which can take quite some time.
"RunGrepTest valgrind". The script maint/ManyConfigTests now encapsulates
this testing. It runs tests with different configurations, and it also runs
some of them with valgrind, all of which can take quite some time.
. Run tests in both 32-bit and 64-bit environments if possible. I can no longer
run 32-bit tests.
@ -164,7 +164,8 @@ distribution for a new release.
-fsanitize=signed-integer-overflow
. Do a test build using CMake. Remove src/config.h first, lest it override the
version that CMake creates. Do NOT use parallel make.
version that CMake creates. Also do a CMake unity build to check that it
still works: [c]cmake -DCMAKE_UNITY_BUILD=ON sets up a unity build.
. Run perltest.sh on the test data for tests 1 and 4. The output should match
the PCRE2 test output, apart from the version identification at the start of
@ -183,11 +184,12 @@ distribution for a new release.
systems. For example, on Solaris it is helpful to test using Sun's cc
compiler as a change from gcc. Adding -xarch=v9 to the cc options does a
64-bit test, but it also needs -S 64 for pcre2test to increase the stack size
for test 2. Since I retired I can no longer do much of this, but instead I
rely on putting out release candidates for testing by the community.
for test 2. Since I retired I can no longer do much of this. There are
automated tests under Ubuntu, Alpine, and Windows that are now set up as
GitHub actions. Check that they are running clean.
. The buildbots at http://buildfarm.opencsw.org/ do some automated testing
of PCRE2 and should be checked before putting out a release.
of PCRE2 and should also be checked before putting out a release.
Updating version info for libtool
@ -243,10 +245,11 @@ it reports them and then aborts. Otherwise it removes trailing spaces from
sources and refreshes the HTML documentation. Update the GitHub repository with
"git push".
Once PrepareRelease has run clean, run "make distcheck" to create the tarball
Once PrepareRelease has run clean, run "make distcheck" to create the tarballs
and the zipball. I then sign these files. Double-check with "git status" that
the repository is fully up-to-date, then create a new tag on GitHub. Upload the
tarball, zipball, and the signatures as "assets" of the GitHub release.
the repository is fully up-to-date, then create a new tag and a release on
GitHub. Upload the tarballs, zipball, and the signatures as "assets" of the
GitHub release.
When the new release is out, don't forget to tell webmaster@pcre.org and the
mailing list.
@ -365,8 +368,6 @@ years.
See Unicode TR 29. The last two are very much aimed at natural language.
. (?[...]) extended classes: big project.
. Allow a callout to specify a number of characters to skip. This can be done
compatibly via an extra callout field.
@ -436,13 +437,8 @@ years.
with lookarounds for \b and \B. Ideally the setting should last till the end
of the group, which means remembering all previous settings; maybe a fixed
amount of stack would do - how deep would anyone want to nest these things?
See GitHub issue #13 for a compendium of character class issues.
. Recognize the short script names. They are already listed in maint/
Multistage2.py because they are needed for scanning the script extensions
file.
. Use script extensions for \p?
See GitHub issue #13 for a compendium of character class issues, including
(?[...]) extended classes.
. A user suggested something like --with-build-info to set a build information
string that could be retrieved by pcre2_config(). However, there's no
@ -461,4 +457,4 @@ years.
Philip Hazel
Email local part: Philip.Hazel
Email domain: gmail.com
Last updated: 10 January 2022
Last updated: 25 April 2022

@ -546,7 +546,6 @@ int script = -1;
int type = -1;
int gbreak = -1;
int bidiclass = -1;
BOOL bidicontrol = FALSE;
BOOL script_not = FALSE;
BOOL type_not = FALSE;
BOOL gbreak_not = FALSE;
@ -559,12 +558,10 @@ while (*s != 0)
{
unsigned int offset = 0;
BOOL scriptx_not = FALSE;
char *value_start;
for (t = name; *s != 0 && !isspace(*s); s++) *t++ = *s;
*t = 0;
while (isspace(*s)) s++;
value_start = s;
for (t = value; *s != 0 && !isspace(*s); s++)
{

@ -1,139 +1,139 @@
findprop 00 01 02 03 04 05 06 07 08 09 0a 0b 0c 0d 0e 0f
U+0000 BN Control: Control, common, Control, [ascii]
U+0001 BN Control: Control, common, Control, [ascii]
U+0002 BN Control: Control, common, Control, [ascii]
U+0003 BN Control: Control, common, Control, [ascii]
U+0004 BN Control: Control, common, Control, [ascii]
U+0005 BN Control: Control, common, Control, [ascii]
U+0006 BN Control: Control, common, Control, [ascii]
U+0007 BN Control: Control, common, Control, [ascii]
U+0008 BN Control: Control, common, Control, [ascii]
U+0009 S Control: Control, common, Control, [ascii, patternwhitespace, whitespace]
U+000A B Control: Control, common, LF, [ascii, patternwhitespace, whitespace]
U+000B S Control: Control, common, Control, [ascii, patternwhitespace, whitespace]
U+000C WS Control: Control, common, Control, [ascii, patternwhitespace, whitespace]
U+000D B Control: Control, common, CR, [ascii, patternwhitespace, whitespace]
U+000E BN Control: Control, common, Control, [ascii]
U+000F BN Control: Control, common, Control, [ascii]
U+0000 BN Control: Control, common, Control, [ascii, patternwhitespace, whitespace]
U+0001 BN Control: Control, common, Control, [ascii, patternwhitespace, whitespace]
U+0002 BN Control: Control, common, Control, [ascii, patternwhitespace, whitespace]
U+0003 BN Control: Control, common, Control, [ascii, patternwhitespace, whitespace]
U+0004 BN Control: Control, common, Control, [ascii, patternwhitespace, whitespace]
U+0005 BN Control: Control, common, Control, [ascii, patternwhitespace, whitespace]
U+0006 BN Control: Control, common, Control, [ascii, patternwhitespace, whitespace]
U+0007 BN Control: Control, common, Control, [ascii, patternwhitespace, whitespace]
U+0008 BN Control: Control, common, Control, [ascii, patternwhitespace, whitespace]
U+0009 S Control: Control, common, Control, [ascii, graphemebase, patternsyntax, sentenceterminal, terminalpunctuation]
U+000A B Control: Control, common, LF, [ascii, graphemebase, patternsyntax, sentenceterminal, terminalpunctuation]
U+000B S Control: Control, common, Control, [ascii, graphemebase, patternsyntax, sentenceterminal, terminalpunctuation]
U+000C WS Control: Control, common, Control, [ascii, graphemebase, patternsyntax, sentenceterminal, terminalpunctuation]
U+000D B Control: Control, common, CR, [ascii, graphemebase, patternsyntax, sentenceterminal, terminalpunctuation]
U+000E BN Control: Control, common, Control, [ascii, patternwhitespace, whitespace]
U+000F BN Control: Control, common, Control, [ascii, patternwhitespace, whitespace]
findprop 10 11 12 13 14 15 16 17 18 19 1a 1b 1c 1d 1e 1f
U+0010 BN Control: Control, common, Control, [ascii]
U+0011 BN Control: Control, common, Control, [ascii]
U+0012 BN Control: Control, common, Control, [ascii]
U+0013 BN Control: Control, common, Control, [ascii]
U+0014 BN Control: Control, common, Control, [ascii]
U+0015 BN Control: Control, common, Control, [ascii]
U+0016 BN Control: Control, common, Control, [ascii]
U+0017 BN Control: Control, common, Control, [ascii]
U+0018 BN Control: Control, common, Control, [ascii]
U+0019 BN Control: Control, common, Control, [ascii]
U+001A BN Control: Control, common, Control, [ascii]
U+001B BN Control: Control, common, Control, [ascii]
U+001C B Control: Control, common, Control, [ascii]
U+001D B Control: Control, common, Control, [ascii]
U+001E B Control: Control, common, Control, [ascii]
U+001F S Control: Control, common, Control, [ascii]
U+0010 BN Control: Control, common, Control, [ascii, patternwhitespace, whitespace]
U+0011 BN Control: Control, common, Control, [ascii, patternwhitespace, whitespace]
U+0012 BN Control: Control, common, Control, [ascii, patternwhitespace, whitespace]
U+0013 BN Control: Control, common, Control, [ascii, patternwhitespace, whitespace]
U+0014 BN Control: Control, common, Control, [ascii, patternwhitespace, whitespace]
U+0015 BN Control: Control, common, Control, [ascii, patternwhitespace, whitespace]
U+0016 BN Control: Control, common, Control, [ascii, patternwhitespace, whitespace]
U+0017 BN Control: Control, common, Control, [ascii, patternwhitespace, whitespace]
U+0018 BN Control: Control, common, Control, [ascii, patternwhitespace, whitespace]
U+0019 BN Control: Control, common, Control, [ascii, patternwhitespace, whitespace]
U+001A BN Control: Control, common, Control, [ascii, patternwhitespace, whitespace]
U+001B BN Control: Control, common, Control, [ascii, patternwhitespace, whitespace]
U+001C B Control: Control, common, Control, [ascii, patternwhitespace, whitespace]
U+001D B Control: Control, common, Control, [ascii, patternwhitespace, whitespace]
U+001E B Control: Control, common, Control, [ascii, patternwhitespace, whitespace]
U+001F S Control: Control, common, Control, [ascii, patternwhitespace, whitespace]
findprop 20 21 22 23 24 25 26 27 28 29 2a 2b 2c 2d 2e 2f
U+0020 WS Separator: Space separator, common, Other, [ascii, graphemebase, patternwhitespace, whitespace]
U+0021 ON Punctuation: Other punctuation, common, Other, [ascii, graphemebase, patternsyntax, sentenceterminal, terminalpunctuation]
U+0022 ON Punctuation: Other punctuation, common, Other, [ascii, graphemebase, patternsyntax, quotationmark]
U+0023 ET Punctuation: Other punctuation, common, Other, [ascii, emoji, emojicomponent, graphemebase, patternsyntax]
U+0024 ET Symbol: Currency symbol, common, Other, [ascii, graphemebase, patternsyntax]
U+0025 ET Punctuation: Other punctuation, common, Other, [ascii, graphemebase, patternsyntax]
U+0026 ON Punctuation: Other punctuation, common, Other, [ascii, graphemebase, patternsyntax]
U+0027 ON Punctuation: Other punctuation, common, Other, [ascii, caseignorable, graphemebase, patternsyntax, quotationmark]
U+0028 ON Punctuation: Open punctuation, common, Other, [ascii, bidimirrored, graphemebase, patternsyntax]
U+0029 ON Punctuation: Close punctuation, common, Other, [ascii, bidimirrored, graphemebase, patternsyntax]
U+002A ON Punctuation: Other punctuation, common, Other, [ascii, emoji, emojicomponent, graphemebase, patternsyntax]
U+002B ES Symbol: Mathematical symbol, common, Other, [ascii, graphemebase, math, patternsyntax]
U+002C CS Punctuation: Other punctuation, common, Other, [ascii, graphemebase, patternsyntax, terminalpunctuation]
U+002D ES Punctuation: Dash punctuation, common, Other, [ascii, dash, graphemebase, patternsyntax]
U+002E CS Punctuation: Other punctuation, common, Other, [ascii, caseignorable, graphemebase, patternsyntax, sentenceterminal, terminalpunctuation]
U+002F CS Punctuation: Other punctuation, common, Other, [ascii, graphemebase, patternsyntax]
U+0020 WS Separator: Space separator, common, Other, [ascii, emoji, emojicomponent, graphemebase, patternsyntax]
U+0021 ON Punctuation: Other punctuation, common, Other, [ascii, caseignorable, graphemebase, patternsyntax, quotationmark]
U+0022 ON Punctuation: Other punctuation, common, Other, [ascii, graphemebase, math, patternsyntax]
U+0023 ET Punctuation: Other punctuation, common, Other, [ascii, dash, graphemebase, patternsyntax]
U+0024 ET Symbol: Currency symbol, common, Other, [ascii, asciihexdigit, emoji, emojicomponent, graphemebase, hexdigit, idcontinue, xidcontinue]
U+0025 ET Punctuation: Other punctuation, common, Other, [ascii, asciihexdigit, emoji, emojicomponent, graphemebase, hexdigit, idcontinue, xidcontinue]
U+0026 ON Punctuation: Other punctuation, common, Other, [ascii, asciihexdigit, emoji, emojicomponent, graphemebase, hexdigit, idcontinue, xidcontinue]
U+0027 ON Punctuation: Other punctuation, common, Other, [ascii, bidimirrored, graphemebase, math, patternsyntax]
U+0028 ON Punctuation: Open punctuation, common, Other, [ascii, alphabetic, cased, changeswhencasefolded, changeswhencasemapped, changeswhenlowercased, graphemebase, idcontinue, idstart, uppercase, xidcontinue, xidstart]
U+0029 ON Punctuation: Close punctuation, common, Other, [ascii, alphabetic, cased, changeswhencasefolded, changeswhencasemapped, changeswhenlowercased, graphemebase, idcontinue, idstart, uppercase, xidcontinue, xidstart]
U+002A ON Punctuation: Other punctuation, common, Other, [ascii, dash, graphemebase, patternsyntax]
U+002B ES Symbol: Mathematical symbol, common, Other, [ascii, graphemebase, idcontinue, xidcontinue]
U+002C CS Punctuation: Other punctuation, common, Other, [ascii, asciihexdigit, alphabetic, cased, changeswhencasemapped, changeswhentitlecased, changeswhenuppercased, graphemebase, hexdigit, idcontinue, idstart, lowercase, xidcontinue, xidstart]
U+002D ES Punctuation: Dash punctuation, common, Other, [ascii, alphabetic, cased, changeswhencasemapped, changeswhentitlecased, changeswhenuppercased, graphemebase, idcontinue, idstart, lowercase, softdotted, xidcontinue, xidstart]
U+002E CS Punctuation: Other punctuation, common, Other, [graphemebase, whitespace]
U+002F CS Punctuation: Other punctuation, common, Other, [ascii, asciihexdigit, emoji, emojicomponent, graphemebase, hexdigit, idcontinue, xidcontinue]
findprop 30 31 32 33 34 35 36 37 38 39 3a 3b 3c 3d 3e 3f
U+0030 EN Number: Decimal number, common, Other, [ascii, asciihexdigit, emoji, emojicomponent, graphemebase, hexdigit, idcontinue, xidcontinue]
U+0031 EN Number: Decimal number, common, Other, [ascii, asciihexdigit, emoji, emojicomponent, graphemebase, hexdigit, idcontinue, xidcontinue]
U+0032 EN Number: Decimal number, common, Other, [ascii, asciihexdigit, emoji, emojicomponent, graphemebase, hexdigit, idcontinue, xidcontinue]
U+0033 EN Number: Decimal number, common, Other, [ascii, asciihexdigit, emoji, emojicomponent, graphemebase, hexdigit, idcontinue, xidcontinue]
U+0034 EN Number: Decimal number, common, Other, [ascii, asciihexdigit, emoji, emojicomponent, graphemebase, hexdigit, idcontinue, xidcontinue]
U+0035 EN Number: Decimal number, common, Other, [ascii, asciihexdigit, emoji, emojicomponent, graphemebase, hexdigit, idcontinue, xidcontinue]
U+0036 EN Number: Decimal number, common, Other, [ascii, asciihexdigit, emoji, emojicomponent, graphemebase, hexdigit, idcontinue, xidcontinue]
U+0037 EN Number: Decimal number, common, Other, [ascii, asciihexdigit, emoji, emojicomponent, graphemebase, hexdigit, idcontinue, xidcontinue]
U+0038 EN Number: Decimal number, common, Other, [ascii, asciihexdigit, emoji, emojicomponent, graphemebase, hexdigit, idcontinue, xidcontinue]
U+0039 EN Number: Decimal number, common, Other, [ascii, asciihexdigit, emoji, emojicomponent, graphemebase, hexdigit, idcontinue, xidcontinue]
U+003A CS Punctuation: Other punctuation, common, Other, [ascii, caseignorable, graphemebase, patternsyntax, terminalpunctuation]
U+003B ON Punctuation: Other punctuation, common, Other, [ascii, graphemebase, patternsyntax, terminalpunctuation]
U+003C ON Symbol: Mathematical symbol, common, Other, [ascii, bidimirrored, graphemebase, math, patternsyntax]
U+003D ON Symbol: Mathematical symbol, common, Other, [ascii, graphemebase, math, patternsyntax]
U+003E ON Symbol: Mathematical symbol, common, Other, [ascii, bidimirrored, graphemebase, math, patternsyntax]
U+003F ON Punctuation: Other punctuation, common, Other, [ascii, graphemebase, patternsyntax, sentenceterminal, terminalpunctuation]
U+0030 EN Number: Decimal number, common, Other, [caseignorable, diacritic, graphemebase]
U+0031 EN Number: Decimal number, common, Other, [caseignorable, diacritic, graphemebase]
U+0032 EN Number: Decimal number, common, Other, [caseignorable, diacritic, graphemebase]
U+0033 EN Number: Decimal number, common, Other, [caseignorable, diacritic, graphemebase]
U+0034 EN Number: Decimal number, common, Other, [caseignorable, diacritic, graphemebase]
U+0035 EN Number: Decimal number, common, Other, [caseignorable, diacritic, graphemebase]
U+0036 EN Number: Decimal number, common, Other, [caseignorable, diacritic, graphemebase]
U+0037 EN Number: Decimal number, common, Other, [caseignorable, diacritic, graphemebase]
U+0038 EN Number: Decimal number, common, Other, [caseignorable, diacritic, graphemebase]
U+0039 EN Number: Decimal number, common, Other, [caseignorable, diacritic, graphemebase]
U+003A CS Punctuation: Other punctuation, common, Other, [alphabetic, cased, graphemebase, idcontinue, idstart, lowercase, xidcontinue, xidstart]
U+003B ON Punctuation: Other punctuation, common, Other, [ascii, asciihexdigit, alphabetic, cased, changeswhencasemapped, changeswhentitlecased, changeswhenuppercased, graphemebase, hexdigit, idcontinue, idstart, lowercase, xidcontinue, xidstart]
U+003C ON Symbol: Mathematical symbol, common, Other, [graphemebase, math, patternsyntax]
U+003D ON Symbol: Mathematical symbol, common, Other, [ascii, graphemebase, idcontinue, xidcontinue]
U+003E ON Symbol: Mathematical symbol, common, Other, [graphemebase, math, patternsyntax]
U+003F ON Punctuation: Other punctuation, common, Other, [ascii, caseignorable, graphemebase, patternsyntax, quotationmark]
findprop 40 41 42 43 44 45 46 47 48 49 4a 4b 4c 4d 4e 4f
U+0040 ON Punctuation: Other punctuation, common, Other, [ascii, graphemebase, patternsyntax]
U+0041 L Letter: Upper case letter, latin, Other, U+0061, [ascii, asciihexdigit, alphabetic, cased, changeswhencasefolded, changeswhencasemapped, changeswhenlowercased, graphemebase, hexdigit, idcontinue, idstart, uppercase, xidcontinue, xidstart]
U+0042 L Letter: Upper case letter, latin, Other, U+0062, [ascii, asciihexdigit, alphabetic, cased, changeswhencasefolded, changeswhencasemapped, changeswhenlowercased, graphemebase, hexdigit, idcontinue, idstart, uppercase, xidcontinue, xidstart]
U+0043 L Letter: Upper case letter, latin, Other, U+0063, [ascii, asciihexdigit, alphabetic, cased, changeswhencasefolded, changeswhencasemapped, changeswhenlowercased, graphemebase, hexdigit, idcontinue, idstart, uppercase, xidcontinue, xidstart]
U+0044 L Letter: Upper case letter, latin, Other, U+0064, [ascii, asciihexdigit, alphabetic, cased, changeswhencasefolded, changeswhencasemapped, changeswhenlowercased, graphemebase, hexdigit, idcontinue, idstart, uppercase, xidcontinue, xidstart]
U+0045 L Letter: Upper case letter, latin, Other, U+0065, [ascii, asciihexdigit, alphabetic, cased, changeswhencasefolded, changeswhencasemapped, changeswhenlowercased, graphemebase, hexdigit, idcontinue, idstart, uppercase, xidcontinue, xidstart]
U+0046 L Letter: Upper case letter, latin, Other, U+0066, [ascii, asciihexdigit, alphabetic, cased, changeswhencasefolded, changeswhencasemapped, changeswhenlowercased, graphemebase, hexdigit, idcontinue, idstart, uppercase, xidcontinue, xidstart]
U+0047 L Letter: Upper case letter, latin, Other, U+0067, [ascii, alphabetic, cased, changeswhencasefolded, changeswhencasemapped, changeswhenlowercased, graphemebase, idcontinue, idstart, uppercase, xidcontinue, xidstart]
U+0048 L Letter: Upper case letter, latin, Other, U+0068, [ascii, alphabetic, cased, changeswhencasefolded, changeswhencasemapped, changeswhenlowercased, graphemebase, idcontinue, idstart, uppercase, xidcontinue, xidstart]
U+0049 L Letter: Upper case letter, latin, Other, U+0069, [ascii, alphabetic, cased, changeswhencasefolded, changeswhencasemapped, changeswhenlowercased, graphemebase, idcontinue, idstart, uppercase, xidcontinue, xidstart]
U+004A L Letter: Upper case letter, latin, Other, U+006A, [ascii, alphabetic, cased, changeswhencasefolded, changeswhencasemapped, changeswhenlowercased, graphemebase, idcontinue, idstart, uppercase, xidcontinue, xidstart]
U+004B L Letter: Upper case letter, latin, Other, U+006B, U+212A, [ascii, alphabetic, cased, changeswhencasefolded, changeswhencasemapped, changeswhenlowercased, graphemebase, idcontinue, idstart, uppercase, xidcontinue, xidstart]
U+004C L Letter: Upper case letter, latin, Other, U+006C, [ascii, alphabetic, cased, changeswhencasefolded, changeswhencasemapped, changeswhenlowercased, graphemebase, idcontinue, idstart, uppercase, xidcontinue, xidstart]
U+004D L Letter: Upper case letter, latin, Other, U+006D, [ascii, alphabetic, cased, changeswhencasefolded, changeswhencasemapped, changeswhenlowercased, graphemebase, idcontinue, idstart, uppercase, xidcontinue, xidstart]
U+004E L Letter: Upper case letter, latin, Other, U+006E, [ascii, alphabetic, cased, changeswhencasefolded, changeswhencasemapped, changeswhenlowercased, graphemebase, idcontinue, idstart, uppercase, xidcontinue, xidstart]
U+004F L Letter: Upper case letter, latin, Other, U+006F, [ascii, alphabetic, cased, changeswhencasefolded, changeswhencasemapped, changeswhenlowercased, graphemebase, idcontinue, idstart, uppercase, xidcontinue, xidstart]
U+0040 ON Punctuation: Other punctuation, common, Other, [ascii, asciihexdigit, emoji, emojicomponent, graphemebase, hexdigit, idcontinue, xidcontinue]
U+0041 L Letter: Upper case letter, latin, Other, U+0061, [graphemebase]
U+0042 L Letter: Upper case letter, latin, Other, U+0062, [graphemebase]
U+0043 L Letter: Upper case letter, latin, Other, U+0063, [graphemebase]
U+0044 L Letter: Upper case letter, latin, Other, U+0064, [graphemebase]
U+0045 L Letter: Upper case letter, latin, Other, U+0065, [graphemebase]
U+0046 L Letter: Upper case letter, latin, Other, U+0066, [graphemebase]
U+0047 L Letter: Upper case letter, latin, Other, U+0067, [caseignorable, diacritic, extender, graphemebase, idcontinue, xidcontinue]
U+0048 L Letter: Upper case letter, latin, Other, U+0068, [caseignorable, diacritic, extender, graphemebase, idcontinue, xidcontinue]
U+0049 L Letter: Upper case letter, latin, Other, U+0069, [caseignorable, diacritic, extender, graphemebase, idcontinue, xidcontinue]
U+004A L Letter: Upper case letter, latin, Other, U+006A, [caseignorable, diacritic, extender, graphemebase, idcontinue, xidcontinue]
U+004B L Letter: Upper case letter, latin, Other, U+006B, U+212A, [caseignorable, diacritic, extender, graphemebase, idcontinue, xidcontinue]
U+004C L Letter: Upper case letter, latin, Other, U+006C, [caseignorable, diacritic, extender, graphemebase, idcontinue, xidcontinue]
U+004D L Letter: Upper case letter, latin, Other, U+006D, [caseignorable, diacritic, extender, graphemebase, idcontinue, xidcontinue]
U+004E L Letter: Upper case letter, latin, Other, U+006E, [caseignorable, diacritic, extender, graphemebase, idcontinue, xidcontinue]
U+004F L Letter: Upper case letter, latin, Other, U+006F, [caseignorable, diacritic, extender, graphemebase, idcontinue, xidcontinue]
findprop 50 51 52 53 54 55 56 57 58 59 5a 5b 5c 5d 5e 5f
U+0050 L Letter: Upper case letter, latin, Other, U+0070, [ascii, alphabetic, cased, changeswhencasefolded, changeswhencasemapped, changeswhenlowercased, graphemebase, idcontinue, idstart, uppercase, xidcontinue, xidstart]
U+0051 L Letter: Upper case letter, latin, Other, U+0071, [ascii, alphabetic, cased, changeswhencasefolded, changeswhencasemapped, changeswhenlowercased, graphemebase, idcontinue, idstart, uppercase, xidcontinue, xidstart]
U+0052 L Letter: Upper case letter, latin, Other, U+0072, [ascii, alphabetic, cased, changeswhencasefolded, changeswhencasemapped, changeswhenlowercased, graphemebase, idcontinue, idstart, uppercase, xidcontinue, xidstart]
U+0053 L Letter: Upper case letter, latin, Other, U+0073, U+017F, [ascii, alphabetic, cased, changeswhencasefolded, changeswhencasemapped, changeswhenlowercased, graphemebase, idcontinue, idstart, uppercase, xidcontinue, xidstart]
U+0054 L Letter: Upper case letter, latin, Other, U+0074, [ascii, alphabetic, cased, changeswhencasefolded, changeswhencasemapped, changeswhenlowercased, graphemebase, idcontinue, idstart, uppercase, xidcontinue, xidstart]
U+0055 L Letter: Upper case letter, latin, Other, U+0075, [ascii, alphabetic, cased, changeswhencasefolded, changeswhencasemapped, changeswhenlowercased, graphemebase, idcontinue, idstart, uppercase, xidcontinue, xidstart]
U+0056 L Letter: Upper case letter, latin, Other, U+0076, [ascii, alphabetic, cased, changeswhencasefolded, changeswhencasemapped, changeswhenlowercased, graphemebase, idcontinue, idstart, uppercase, xidcontinue, xidstart]
U+0057 L Letter: Upper case letter, latin, Other, U+0077, [ascii, alphabetic, cased, changeswhencasefolded, changeswhencasemapped, changeswhenlowercased, graphemebase, idcontinue, idstart, uppercase, xidcontinue, xidstart]
U+0058 L Letter: Upper case letter, latin, Other, U+0078, [ascii, alphabetic, cased, changeswhencasefolded, changeswhencasemapped, changeswhenlowercased, graphemebase, idcontinue, idstart, uppercase, xidcontinue, xidstart]
U+0059 L Letter: Upper case letter, latin, Other, U+0079, [ascii, alphabetic, cased, changeswhencasefolded, changeswhencasemapped, changeswhenlowercased, graphemebase, idcontinue, idstart, uppercase, xidcontinue, xidstart]
U+005A L Letter: Upper case letter, latin, Other, U+007A, [ascii, alphabetic, cased, changeswhencasefolded, changeswhencasemapped, changeswhenlowercased, graphemebase, idcontinue, idstart, uppercase, xidcontinue, xidstart]
U+005B ON Punctuation: Open punctuation, common, Other, [ascii, bidimirrored, graphemebase, patternsyntax]
U+005C ON Punctuation: Other punctuation, common, Other, [ascii, graphemebase, patternsyntax]
U+005D ON Punctuation: Close punctuation, common, Other, [ascii, bidimirrored, graphemebase, patternsyntax]
U+005E ON Symbol: Modifier symbol, common, Other, [ascii, caseignorable, diacritic, graphemebase, math, patternsyntax]
U+005F ON Punctuation: Connector punctuation, common, Other, [ascii, graphemebase, idcontinue, xidcontinue]
U+0050 L Letter: Upper case letter, latin, Other, U+0070, [caseignorable, diacritic, extender, graphemebase, idcontinue, xidcontinue]
U+0051 L Letter: Upper case letter, latin, Other, U+0071, [caseignorable, diacritic, extender, graphemebase, idcontinue, xidcontinue]
U+0052 L Letter: Upper case letter, latin, Other, U+0072, [caseignorable, diacritic, extender, graphemebase, idcontinue, xidcontinue]
U+0053 L Letter: Upper case letter, latin, Other, U+0073, U+017F, [caseignorable, diacritic, extender, graphemebase, idcontinue, xidcontinue]
U+0054 L Letter: Upper case letter, latin, Other, U+0074, [caseignorable, diacritic, extender, graphemebase, idcontinue, xidcontinue]
U+0055 L Letter: Upper case letter, latin, Other, U+0075, [caseignorable, diacritic, extender, graphemebase, idcontinue, xidcontinue]
U+0056 L Letter: Upper case letter, latin, Other, U+0076, [caseignorable, diacritic, extender, graphemebase, idcontinue, xidcontinue]
U+0057 L Letter: Upper case letter, latin, Other, U+0077, [caseignorable, diacritic, extender, graphemebase, idcontinue, xidcontinue]
U+0058 L Letter: Upper case letter, latin, Other, U+0078, [caseignorable, diacritic, extender, graphemebase, idcontinue, xidcontinue]
U+0059 L Letter: Upper case letter, latin, Other, U+0079, [caseignorable, diacritic, extender, graphemebase, idcontinue, xidcontinue]
U+005A L Letter: Upper case letter, latin, Other, U+007A, [caseignorable, diacritic, extender, graphemebase, idcontinue, xidcontinue]
U+005B ON Punctuation: Open punctuation, common, Other, [ascii, alphabetic, cased, changeswhencasefolded, changeswhencasemapped, changeswhenlowercased, graphemebase, idcontinue, idstart, uppercase, xidcontinue, xidstart]
U+005C ON Punctuation: Other punctuation, common, Other, [ascii, asciihexdigit, emoji, emojicomponent, graphemebase, hexdigit, idcontinue, xidcontinue]
U+005D ON Punctuation: Close punctuation, common, Other, [ascii, alphabetic, cased, changeswhencasefolded, changeswhencasemapped, changeswhenlowercased, graphemebase, idcontinue, idstart, uppercase, xidcontinue, xidstart]
U+005E ON Symbol: Modifier symbol, common, Other, [alphabetic, cased, changeswhencasemapped, changeswhentitlecased, changeswhenuppercased, graphemebase, idcontinue, idstart, lowercase, xidcontinue, xidstart]
U+005F ON Punctuation: Connector punctuation, common, Other, [alphabetic, cased, changeswhencasefolded, changeswhencasemapped, changeswhentitlecased, changeswhenuppercased, deprecated, graphemebase, idcontinue, idstart, lowercase, xidcontinue, xidstart]
findprop 60 61 62 63 64 65 66 67 68 69 6a 6b 6c 6d 6e 6f
U+0060 ON Symbol: Modifier symbol, common, Other, [ascii, caseignorable, diacritic, graphemebase, patternsyntax]
U+0061 L Letter: Lower case letter, latin, Other, U+0041, [ascii, asciihexdigit, alphabetic, cased, changeswhencasemapped, changeswhentitlecased, changeswhenuppercased, graphemebase, hexdigit, idcontinue, idstart, lowercase, xidcontinue, xidstart]
U+0062 L Letter: Lower case letter, latin, Other, U+0042, [ascii, asciihexdigit, alphabetic, cased, changeswhencasemapped, changeswhentitlecased, changeswhenuppercased, graphemebase, hexdigit, idcontinue, idstart, lowercase, xidcontinue, xidstart]
U+0063 L Letter: Lower case letter, latin, Other, U+0043, [ascii, asciihexdigit, alphabetic, cased, changeswhencasemapped, changeswhentitlecased, changeswhenuppercased, graphemebase, hexdigit, idcontinue, idstart, lowercase, xidcontinue, xidstart]
U+0064 L Letter: Lower case letter, latin, Other, U+0044, [ascii, asciihexdigit, alphabetic, cased, changeswhencasemapped, changeswhentitlecased, changeswhenuppercased, graphemebase, hexdigit, idcontinue, idstart, lowercase, xidcontinue, xidstart]
U+0065 L Letter: Lower case letter, latin, Other, U+0045, [ascii, asciihexdigit, alphabetic, cased, changeswhencasemapped, changeswhentitlecased, changeswhenuppercased, graphemebase, hexdigit, idcontinue, idstart, lowercase, xidcontinue, xidstart]
U+0066 L Letter: Lower case letter, latin, Other, U+0046, [ascii, asciihexdigit, alphabetic, cased, changeswhencasemapped, changeswhentitlecased, changeswhenuppercased, graphemebase, hexdigit, idcontinue, idstart, lowercase, xidcontinue, xidstart]
U+0067 L Letter: Lower case letter, latin, Other, U+0047, [ascii, alphabetic, cased, changeswhencasemapped, changeswhentitlecased, changeswhenuppercased, graphemebase, idcontinue, idstart, lowercase, xidcontinue, xidstart]
U+0068 L Letter: Lower case letter, latin, Other, U+0048, [ascii, alphabetic, cased, changeswhencasemapped, changeswhentitlecased, changeswhenuppercased, graphemebase, idcontinue, idstart, lowercase, xidcontinue, xidstart]
U+0069 L Letter: Lower case letter, latin, Other, U+0049, [ascii, alphabetic, cased, changeswhencasemapped, changeswhentitlecased, changeswhenuppercased, graphemebase, idcontinue, idstart, lowercase, softdotted, xidcontinue, xidstart]
U+006A L Letter: Lower case letter, latin, Other, U+004A, [ascii, alphabetic, cased, changeswhencasemapped, changeswhentitlecased, changeswhenuppercased, graphemebase, idcontinue, idstart, lowercase, softdotted, xidcontinue, xidstart]
U+006B L Letter: Lower case letter, latin, Other, U+004B, U+212A, [ascii, alphabetic, cased, changeswhencasemapped, changeswhentitlecased, changeswhenuppercased, graphemebase, idcontinue, idstart, lowercase, xidcontinue, xidstart]
U+006C L Letter: Lower case letter, latin, Other, U+004C, [ascii, alphabetic, cased, changeswhencasemapped, changeswhentitlecased, changeswhenuppercased, graphemebase, idcontinue, idstart, lowercase, xidcontinue, xidstart]
U+006D L Letter: Lower case letter, latin, Other, U+004D, [ascii, alphabetic, cased, changeswhencasemapped, changeswhentitlecased, changeswhenuppercased, graphemebase, idcontinue, idstart, lowercase, xidcontinue, xidstart]
U+006E L Letter: Lower case letter, latin, Other, U+004E, [ascii, alphabetic, cased, changeswhencasemapped, changeswhentitlecased, changeswhenuppercased, graphemebase, idcontinue, idstart, lowercase, xidcontinue, xidstart]
U+006F L Letter: Lower case letter, latin, Other, U+004F, [ascii, alphabetic, cased, changeswhencasemapped, changeswhentitlecased, changeswhenuppercased, graphemebase, idcontinue, idstart, lowercase, xidcontinue, xidstart]
U+0060 ON Symbol: Modifier symbol, common, Other, [alphabetic, cased, changeswhencasefolded, changeswhencasemapped, changeswhenlowercased, changeswhentitlecased, graphemebase, idcontinue, idstart, uppercase, xidcontinue, xidstart]
U+0061 L Letter: Lower case letter, latin, Other, U+0041, [alphabetic, caseignorable, cased, diacritic, graphemebase, idcontinue, idstart, lowercase, xidcontinue, xidstart]
U+0062 L Letter: Lower case letter, latin, Other, U+0042, [alphabetic, caseignorable, cased, diacritic, graphemebase, idcontinue, idstart, lowercase, xidcontinue, xidstart]
U+0063 L Letter: Lower case letter, latin, Other, U+0043, [alphabetic, caseignorable, cased, diacritic, graphemebase, idcontinue, idstart, lowercase, xidcontinue, xidstart]
U+0064 L Letter: Lower case letter, latin, Other, U+0044, [alphabetic, caseignorable, cased, diacritic, graphemebase, idcontinue, idstart, lowercase, xidcontinue, xidstart]
U+0065 L Letter: Lower case letter, latin, Other, U+0045, [alphabetic, caseignorable, cased, diacritic, graphemebase, idcontinue, idstart, lowercase, xidcontinue, xidstart]
U+0066 L Letter: Lower case letter, latin, Other, U+0046, [alphabetic, caseignorable, cased, diacritic, graphemebase, idcontinue, idstart, lowercase, xidcontinue, xidstart]
U+0067 L Letter: Lower case letter, latin, Other, U+0047, [alphabetic, caseignorable, diacritic, graphemebase, idcontinue, idstart, xidcontinue, xidstart]
U+0068 L Letter: Lower case letter, latin, Other, U+0048, [alphabetic, caseignorable, diacritic, graphemebase, idcontinue, idstart, xidcontinue, xidstart]
U+0069 L Letter: Lower case letter, latin, Other, U+0049, [caseignorable, diacritic, graphemeextend, idcontinue, xidcontinue]
U+006A L Letter: Lower case letter, latin, Other, U+004A, [caseignorable, diacritic, graphemeextend, idcontinue, xidcontinue]
U+006B L Letter: Lower case letter, latin, Other, U+004B, U+212A, [alphabetic, caseignorable, diacritic, graphemebase, idcontinue, idstart, xidcontinue, xidstart]
U+006C L Letter: Lower case letter, latin, Other, U+004C, [alphabetic, caseignorable, diacritic, graphemebase, idcontinue, idstart, xidcontinue, xidstart]
U+006D L Letter: Lower case letter, latin, Other, U+004D, [alphabetic, caseignorable, diacritic, graphemebase, idcontinue, idstart, xidcontinue, xidstart]
U+006E L Letter: Lower case letter, latin, Other, U+004E, [alphabetic, caseignorable, diacritic, graphemebase, idcontinue, idstart, xidcontinue, xidstart]
U+006F L Letter: Lower case letter, latin, Other, U+004F, [alphabetic, caseignorable, diacritic, graphemebase, idcontinue, idstart, xidcontinue, xidstart]
findprop 70 71 72 73 74 75 76 77 78 79 7a 7b 7c 7d 7e 7f
U+0070 L Letter: Lower case letter, latin, Other, U+0050, [ascii, alphabetic, cased, changeswhencasemapped, changeswhentitlecased, changeswhenuppercased, graphemebase, idcontinue, idstart, lowercase, xidcontinue, xidstart]
U+0071 L Letter: Lower case letter, latin, Other, U+0051, [ascii, alphabetic, cased, changeswhencasemapped, changeswhentitlecased, changeswhenuppercased, graphemebase, idcontinue, idstart, lowercase, xidcontinue, xidstart]
U+0072 L Letter: Lower case letter, latin, Other, U+0052, [ascii, alphabetic, cased, changeswhencasemapped, changeswhentitlecased, changeswhenuppercased, graphemebase, idcontinue, idstart, lowercase, xidcontinue, xidstart]
U+0073 L Letter: Lower case letter, latin, Other, U+0053, U+017F, [ascii, alphabetic, cased, changeswhencasemapped, changeswhentitlecased, changeswhenuppercased, graphemebase, idcontinue, idstart, lowercase, xidcontinue, xidstart]
U+0074 L Letter: Lower case letter, latin, Other, U+0054, [ascii, alphabetic, cased, changeswhencasemapped, changeswhentitlecased, changeswhenuppercased, graphemebase, idcontinue, idstart, lowercase, xidcontinue, xidstart]
U+0075 L Letter: Lower case letter, latin, Other, U+0055, [ascii, alphabetic, cased, changeswhencasemapped, changeswhentitlecased, changeswhenuppercased, graphemebase, idcontinue, idstart, lowercase, xidcontinue, xidstart]
U+0076 L Letter: Lower case letter, latin, Other, U+0056, [ascii, alphabetic, cased, changeswhencasemapped, changeswhentitlecased, changeswhenuppercased, graphemebase, idcontinue, idstart, lowercase, xidcontinue, xidstart]
U+0077 L Letter: Lower case letter, latin, Other, U+0057, [ascii, alphabetic, cased, changeswhencasemapped, changeswhentitlecased, changeswhenuppercased, graphemebase, idcontinue, idstart, lowercase, xidcontinue, xidstart]
U+0078 L Letter: Lower case letter, latin, Other, U+0058, [ascii, alphabetic, cased, changeswhencasemapped, changeswhentitlecased, changeswhenuppercased, graphemebase, idcontinue, idstart, lowercase, xidcontinue, xidstart]
U+0079 L Letter: Lower case letter, latin, Other, U+0059, [ascii, alphabetic, cased, changeswhencasemapped, changeswhentitlecased, changeswhenuppercased, graphemebase, idcontinue, idstart, lowercase, xidcontinue, xidstart]
U+007A L Letter: Lower case letter, latin, Other, U+005A, [ascii, alphabetic, cased, changeswhencasemapped, changeswhentitlecased, changeswhenuppercased, graphemebase, idcontinue, idstart, lowercase, xidcontinue, xidstart]
U+007B ON Punctuation: Open punctuation, common, Other, [ascii, bidimirrored, graphemebase, patternsyntax]
U+007C ON Symbol: Mathematical symbol, common, Other, [ascii, graphemebase, math, patternsyntax]
U+007D ON Punctuation: Close punctuation, common, Other, [ascii, bidimirrored, graphemebase, patternsyntax]
U+007E ON Symbol: Mathematical symbol, common, Other, [ascii, graphemebase, math, patternsyntax]
U+007F BN Control: Control, common, Control, [ascii]
U+0070 L Letter: Lower case letter, latin, Other, U+0050, [alphabetic, caseignorable, diacritic, graphemebase, idcontinue, idstart, xidcontinue, xidstart]
U+0071 L Letter: Lower case letter, latin, Other, U+0051, [alphabetic, caseignorable, diacritic, graphemebase, idcontinue, idstart, xidcontinue, xidstart]
U+0072 L Letter: Lower case letter, latin, Other, U+0052, [alphabetic, caseignorable, diacritic, graphemebase, idcontinue, idstart, xidcontinue, xidstart]
U+0073 L Letter: Lower case letter, latin, Other, U+0053, U+017F, [alphabetic, caseignorable, diacritic, graphemebase, idcontinue, idstart, xidcontinue, xidstart]
U+0074 L Letter: Lower case letter, latin, Other, U+0054, [alphabetic, caseignorable, diacritic, graphemebase, idcontinue, idstart, xidcontinue, xidstart]
U+0075 L Letter: Lower case letter, latin, Other, U+0055, [alphabetic, caseignorable, diacritic, graphemebase, idcontinue, idstart, xidcontinue, xidstart]
U+0076 L Letter: Lower case letter, latin, Other, U+0056, [alphabetic, caseignorable, diacritic, graphemebase, idcontinue, idstart, xidcontinue, xidstart]
U+0077 L Letter: Lower case letter, latin, Other, U+0057, [alphabetic, caseignorable, diacritic, graphemebase, idcontinue, idstart, xidcontinue, xidstart]
U+0078 L Letter: Lower case letter, latin, Other, U+0058, [alphabetic, caseignorable, diacritic, graphemebase, idcontinue, idstart, xidcontinue, xidstart]
U+0079 L Letter: Lower case letter, latin, Other, U+0059, [alphabetic, caseignorable, diacritic, graphemebase, idcontinue, idstart, xidcontinue, xidstart]
U+007A L Letter: Lower case letter, latin, Other, U+005A, [alphabetic, caseignorable, diacritic, graphemebase, idcontinue, idstart, xidcontinue, xidstart]
U+007B ON Punctuation: Open punctuation, common, Other, [ascii, alphabetic, cased, changeswhencasefolded, changeswhencasemapped, changeswhenlowercased, graphemebase, idcontinue, idstart, uppercase, xidcontinue, xidstart]
U+007C ON Symbol: Mathematical symbol, common, Other, [ascii, graphemebase, idcontinue, xidcontinue]
U+007D ON Punctuation: Close punctuation, common, Other, [ascii, alphabetic, cased, changeswhencasefolded, changeswhencasemapped, changeswhenlowercased, graphemebase, idcontinue, idstart, uppercase, xidcontinue, xidstart]
U+007E ON Symbol: Mathematical symbol, common, Other, [ascii, graphemebase, idcontinue, xidcontinue]
U+007F BN Control: Control, common, Control, [ascii, patternwhitespace, whitespace]
findprop 80 81 82 83 84 85 86 87 88 89 8a 8b 8c 8d 8e 8f
U+0080 BN Control: Control, common, Control
@@ -141,7 +141,7 @@ U+0081 BN Control: Control, common, Control
U+0082 BN Control: Control, common, Control
U+0083 BN Control: Control, common, Control
U+0084 BN Control: Control, common, Control
U+0085 B Control: Control, common, Control, [patternwhitespace, whitespace]
U+0085 B Control: Control, common, Control, [caseignorable, defaultignorablecodepoint, graphemeextend, idcontinue, xidcontinue]
U+0086 BN Control: Control, common, Control
U+0087 BN Control: Control, common, Control
U+0088 BN Control: Control, common, Control
@@ -170,240 +170,240 @@ U+009D BN Control: Control, common, Control
U+009E BN Control: Control, common, Control
U+009F BN Control: Control, common, Control
findprop a0 a1 a2 a3 a4 a5 a6 a7 a8 a9 aa ab ac ad ae af
U+00A0 CS Separator: Space separator, common, Other, [graphemebase, whitespace]
U+00A1 ON Punctuation: Other punctuation, common, Other, [graphemebase, patternsyntax]
U+00A2 ET Symbol: Currency symbol, common, Other, [graphemebase, patternsyntax]
U+00A3 ET Symbol: Currency symbol, common, Other, [graphemebase, patternsyntax]
U+00A4 ET Symbol: Currency symbol, common, Other, [graphemebase, patternsyntax]
U+00A5 ET Symbol: Currency symbol, common, Other, [graphemebase, patternsyntax]
U+00A6 ON Symbol: Other symbol, common, Other, [graphemebase, patternsyntax]
U+00A7 ON Punctuation: Other punctuation, common, Other, [graphemebase, patternsyntax]
U+00A8 ON Symbol: Modifier symbol, common, Other, [caseignorable, diacritic, graphemebase]
U+00A9 ON Symbol: Other symbol, common, Extended Pictographic, [emoji, extendedpictographic, graphemebase, patternsyntax]
U+00AA L Letter: Other letter, latin, Other, [alphabetic, cased, graphemebase, idcontinue, idstart, lowercase, xidcontinue, xidstart]
U+00AB ON Punctuation: Initial punctuation, common, Other, [bidimirrored, graphemebase, patternsyntax, quotationmark]
U+00AC ON Symbol: Mathematical symbol, common, Other, [graphemebase, math, patternsyntax]
U+00AD BN Control: Format, common, Control, [caseignorable, defaultignorablecodepoint]
U+00AE ON Symbol: Other symbol, common, Extended Pictographic, [emoji, extendedpictographic, graphemebase, patternsyntax]
U+00AF ON Symbol: Modifier symbol, common, Other, [caseignorable, diacritic, graphemebase]
U+00A0 CS Separator: Space separator, common, Other, [alphabetic, caseignorable, cased, diacritic, graphemebase, idcontinue, idstart, lowercase]
U+00A1 ON Punctuation: Other punctuation, common, Other, [caseignorable, graphemebase, idcontinue, terminalpunctuation, xidcontinue]
U+00A2 ET Symbol: Currency symbol, common, Other, [caseignorable, graphemebase, idcontinue, terminalpunctuation, xidcontinue]
U+00A3 ET Symbol: Currency symbol, common, Other, [caseignorable, graphemebase, idcontinue, terminalpunctuation, xidcontinue]
U+00A4 ET Symbol: Currency symbol, common, Other, [caseignorable, graphemebase, idcontinue, terminalpunctuation, xidcontinue]
U+00A5 ET Symbol: Currency symbol, common, Other, [caseignorable, graphemebase, idcontinue, terminalpunctuation, xidcontinue]
U+00A6 ON Symbol: Other symbol, common, Other, [caseignorable, graphemebase, idcontinue, terminalpunctuation, xidcontinue]
U+00A7 ON Punctuation: Other punctuation, common, Other, [caseignorable, graphemebase, idcontinue, terminalpunctuation, xidcontinue]
U+00A8 ON Symbol: Modifier symbol, common, Other, [alphabetic, cased, graphemebase, idcontinue, idstart, math, uppercase, xidcontinue, xidstart]
U+00A9 ON Symbol: Other symbol, common, Extended Pictographic, [alphabetic, cased, changeswhencasefolded, changeswhencasemapped, changeswhenlowercased, graphemebase, idcontinue, idstart, math, uppercase, xidcontinue, xidstart]
U+00AA L Letter: Other letter, latin, Other, [caseignorable, graphemeextend]
U+00AB ON Punctuation: Initial punctuation, common, Other, [graphemebase, sentenceterminal, terminalpunctuation]
U+00AC ON Symbol: Mathematical symbol, common, Other, [alphabetic, caseignorable, diacritic, graphemeextend, idcontinue, xidcontinue]
U+00AD BN Control: Format, common, Control, [caseignorable, prependedconcatenationmark]
U+00AE ON Symbol: Other symbol, common, Extended Pictographic, [alphabetic, cased, changeswhencasefolded, changeswhencasemapped, changeswhenlowercased, graphemebase, idcontinue, idstart, math, uppercase, xidcontinue, xidstart]
U+00AF ON Symbol: Modifier symbol, common, Other, [alphabetic, cased, graphemebase, idcontinue, idstart, math, uppercase, xidcontinue, xidstart]
findprop b0 b1 b2 b3 b4 b5 b6 b7 b8 b9 ba bb bc bd be bf
U+00B0 ET Symbol: Other symbol, common, Other, [graphemebase, patternsyntax]
U+00B1 ET Symbol: Mathematical symbol, common, Other, [graphemebase, math, patternsyntax]
U+00B2 EN Number: Other number, common, Other, [graphemebase]
U+00B3 EN Number: Other number, common, Other, [graphemebase]
U+00B4 ON Symbol: Modifier symbol, common, Other, [caseignorable, diacritic, graphemebase]
U+00B5 L Letter: Lower case letter, common, Other, U+03BC, U+039C, [alphabetic, cased, changeswhencasefolded, changeswhencasemapped, changeswhentitlecased, changeswhenuppercased, graphemebase, idcontinue, idstart, lowercase, xidcontinue, xidstart]
U+00B6 ON Punctuation: Other punctuation, common, Other, [graphemebase, patternsyntax]
U+00B7 ON Punctuation: Other punctuation, common, Other, [caseignorable, diacritic, extender, graphemebase, idcontinue, xidcontinue]
U+00B8 ON Symbol: Modifier symbol, common, Other, [caseignorable, diacritic, graphemebase]
U+00B9 EN Number: Other number, common, Other, [graphemebase]
U+00BA L Letter: Other letter, latin, Other, [alphabetic, cased, graphemebase, idcontinue, idstart, lowercase, xidcontinue, xidstart]
U+00BB ON Punctuation: Final punctuation, common, Other, [bidimirrored, graphemebase, patternsyntax, quotationmark]
U+00BC ON Number: Other number, common, Other, [graphemebase]
U+00BD ON Number: Other number, common, Other, [graphemebase]
U+00BE ON Number: Other number, common, Other, [graphemebase]
U+00BF ON Punctuation: Other punctuation, common, Other, [graphemebase, patternsyntax]
U+00B0 ET Symbol: Other symbol, common, Other, [caseignorable, graphemebase, idcontinue, terminalpunctuation, xidcontinue]
U+00B1 ET Symbol: Mathematical symbol, common, Other, [alphabetic, caseignorable, diacritic, graphemeextend, idcontinue, xidcontinue]
U+00B2 EN Number: Other number, common, Other, [alphabetic, caseignorable, extender, graphemebase, idcontinue, idstart, xidcontinue, xidstart]
U+00B3 EN Number: Other number, common, Other, [alphabetic, caseignorable, extender, graphemebase, idcontinue, idstart, xidcontinue, xidstart]
U+00B4 ON Symbol: Modifier symbol, common, Other, [alphabetic, cased, graphemebase, idcontinue, idstart, math, uppercase, xidcontinue, xidstart]
U+00B5 L Letter: Lower case letter, common, Other, U+03BC, U+039C, [alphabetic, deprecated, graphemebase, idcontinue, idstart, xidcontinue, xidstart]
U+00B6 ON Punctuation: Other punctuation, common, Other, [caseignorable, graphemebase, idcontinue, terminalpunctuation, xidcontinue]
U+00B7 ON Punctuation: Other punctuation, common, Other, [alphabetic, graphemebase, idcontinue, xidcontinue]
U+00B8 ON Symbol: Modifier symbol, common, Other, [alphabetic, cased, graphemebase, idcontinue, idstart, math, uppercase, xidcontinue, xidstart]
U+00B9 EN Number: Other number, common, Other, [alphabetic, caseignorable, extender, graphemebase, idcontinue, idstart, xidcontinue, xidstart]
U+00BA L Letter: Other letter, latin, Other, [caseignorable, graphemeextend]
U+00BB ON Punctuation: Final punctuation, common, Other, [graphemebase, sentenceterminal, terminalpunctuation]
U+00BC ON Number: Other number, common, Other, [alphabetic, caseignorable, extender, graphemebase, idcontinue, idstart, xidcontinue, xidstart]
U+00BD ON Number: Other number, common, Other, [alphabetic, caseignorable, extender, graphemebase, idcontinue, idstart, xidcontinue, xidstart]
U+00BE ON Number: Other number, common, Other, [alphabetic, caseignorable, extender, graphemebase, idcontinue, idstart, xidcontinue, xidstart]
U+00BF ON Punctuation: Other punctuation, common, Other, [caseignorable, graphemebase, idcontinue, terminalpunctuation, xidcontinue]
findprop c0 c1 c2 c3 c4 c5 c6 c7 c8 c9 ca cb cc cd ce cf
U+00C0 L Letter: Upper case letter, latin, Other, U+00E0, [alphabetic, cased, changeswhencasefolded, changeswhencasemapped, changeswhenlowercased, graphemebase, idcontinue, idstart, uppercase, xidcontinue, xidstart]
U+00C1 L Letter: Upper case letter, latin, Other, U+00E1, [alphabetic, cased, changeswhencasefolded, changeswhencasemapped, changeswhenlowercased, graphemebase, idcontinue, idstart, uppercase, xidcontinue, xidstart]
U+00C2 L Letter: Upper case letter, latin, Other, U+00E2, [alphabetic, cased, changeswhencasefolded, changeswhencasemapped, changeswhenlowercased, graphemebase, idcontinue, idstart, uppercase, xidcontinue, xidstart]
U+00C3 L Letter: Upper case letter, latin, Other, U+00E3, [alphabetic, cased, changeswhencasefolded, changeswhencasemapped, changeswhenlowercased, graphemebase, idcontinue, idstart, uppercase, xidcontinue, xidstart]
U+00C4 L Letter: Upper case letter, latin, Other, U+00E4, [alphabetic, cased, changeswhencasefolded, changeswhencasemapped, changeswhenlowercased, graphemebase, idcontinue, idstart, uppercase, xidcontinue, xidstart]
U+00C5 L Letter: Upper case letter, latin, Other, U+00E5, U+212B, [alphabetic, cased, changeswhencasefolded, changeswhencasemapped, changeswhenlowercased, graphemebase, idcontinue, idstart, uppercase, xidcontinue, xidstart]
U+00C6 L Letter: Upper case letter, latin, Other, U+00E6, [alphabetic, cased, changeswhencasefolded, changeswhencasemapped, changeswhenlowercased, graphemebase, idcontinue, idstart, uppercase, xidcontinue, xidstart]
U+00C7 L Letter: Upper case letter, latin, Other, U+00E7, [alphabetic, cased, changeswhencasefolded, changeswhencasemapped, changeswhenlowercased, graphemebase, idcontinue, idstart, uppercase, xidcontinue, xidstart]
U+00C8 L Letter: Upper case letter, latin, Other, U+00E8, [alphabetic, cased, changeswhencasefolded, changeswhencasemapped, changeswhenlowercased, graphemebase, idcontinue, idstart, uppercase, xidcontinue, xidstart]
U+00C9 L Letter: Upper case letter, latin, Other, U+00E9, [alphabetic, cased, changeswhencasefolded, changeswhencasemapped, changeswhenlowercased, graphemebase, idcontinue, idstart, uppercase, xidcontinue, xidstart]
U+00CA L Letter: Upper case letter, latin, Other, U+00EA, [alphabetic, cased, changeswhencasefolded, changeswhencasemapped, changeswhenlowercased, graphemebase, idcontinue, idstart, uppercase, xidcontinue, xidstart]
U+00CB L Letter: Upper case letter, latin, Other, U+00EB, [alphabetic, cased, changeswhencasefolded, changeswhencasemapped, changeswhenlowercased, graphemebase, idcontinue, idstart, uppercase, xidcontinue, xidstart]
U+00CC L Letter: Upper case letter, latin, Other, U+00EC, [alphabetic, cased, changeswhencasefolded, changeswhencasemapped, changeswhenlowercased, graphemebase, idcontinue, idstart, uppercase, xidcontinue, xidstart]
U+00CD L Letter: Upper case letter, latin, Other, U+00ED, [alphabetic, cased, changeswhencasefolded, changeswhencasemapped, changeswhenlowercased, graphemebase, idcontinue, idstart, uppercase, xidcontinue, xidstart]
U+00CE L Letter: Upper case letter, latin, Other, U+00EE, [alphabetic, cased, changeswhencasefolded, changeswhencasemapped, changeswhenlowercased, graphemebase, idcontinue, idstart, uppercase, xidcontinue, xidstart]
U+00CF L Letter: Upper case letter, latin, Other, U+00EF, [alphabetic, cased, changeswhencasefolded, changeswhencasemapped, changeswhenlowercased, graphemebase, idcontinue, idstart, uppercase, xidcontinue, xidstart]
U+00C0 L Letter: Upper case letter, latin, Other, U+00E0, [alphabetic, graphemeextend, idcontinue, xidcontinue]
U+00C1 L Letter: Upper case letter, latin, Other, U+00E1, [alphabetic, graphemeextend, idcontinue, xidcontinue]
U+00C2 L Letter: Upper case letter, latin, Other, U+00E2, [alphabetic, graphemeextend, idcontinue, xidcontinue]
U+00C3 L Letter: Upper case letter, latin, Other, U+00E3, [alphabetic, graphemeextend, idcontinue, xidcontinue]
U+00C4 L Letter: Upper case letter, latin, Other, U+00E4, [alphabetic, graphemeextend, idcontinue, xidcontinue]
U+00C5 L Letter: Upper case letter, latin, Other, U+00E5, U+212B, [alphabetic, graphemeextend, idcontinue, xidcontinue]
U+00C6 L Letter: Upper case letter, latin, Other, U+00E6, [alphabetic, graphemeextend, idcontinue, xidcontinue]
U+00C7 L Letter: Upper case letter, latin, Other, U+00E7, [alphabetic, graphemeextend, idcontinue, xidcontinue]
U+00C8 L Letter: Upper case letter, latin, Other, U+00E8, [alphabetic, graphemeextend, idcontinue, xidcontinue]
U+00C9 L Letter: Upper case letter, latin, Other, U+00E9, [alphabetic, graphemeextend, idcontinue, xidcontinue]
U+00CA L Letter: Upper case letter, latin, Other, U+00EA, [alphabetic, graphemeextend, idcontinue, xidcontinue]
U+00CB L Letter: Upper case letter, latin, Other, U+00EB, [alphabetic, graphemeextend, idcontinue, xidcontinue]
U+00CC L Letter: Upper case letter, latin, Other, U+00EC, [alphabetic, graphemeextend, idcontinue, xidcontinue]
U+00CD L Letter: Upper case letter, latin, Other, U+00ED, [alphabetic, graphemeextend, idcontinue, xidcontinue]
U+00CE L Letter: Upper case letter, latin, Other, U+00EE, [alphabetic, graphemeextend, idcontinue, xidcontinue]
U+00CF L Letter: Upper case letter, latin, Other, U+00EF, [alphabetic, graphemeextend, idcontinue, xidcontinue]
findprop d0 d1 d2 d3 d4 d5 d6 d7 d8 d9 da db dc dd de df
U+00D0 L Letter: Upper case letter, latin, Other, U+00F0, [alphabetic, cased, changeswhencasefolded, changeswhencasemapped, changeswhenlowercased, graphemebase, idcontinue, idstart, uppercase, xidcontinue, xidstart]
U+00D1 L Letter: Upper case letter, latin, Other, U+00F1, [alphabetic, cased, changeswhencasefolded, changeswhencasemapped, changeswhenlowercased, graphemebase, idcontinue, idstart, uppercase, xidcontinue, xidstart]
U+00D2 L Letter: Upper case letter, latin, Other, U+00F2, [alphabetic, cased, changeswhencasefolded, changeswhencasemapped, changeswhenlowercased, graphemebase, idcontinue, idstart, uppercase, xidcontinue, xidstart]
U+00D3 L Letter: Upper case letter, latin, Other, U+00F3, [alphabetic, cased, changeswhencasefolded, changeswhencasemapped, changeswhenlowercased, graphemebase, idcontinue, idstart, uppercase, xidcontinue, xidstart]
U+00D4 L Letter: Upper case letter, latin, Other, U+00F4, [alphabetic, cased, changeswhencasefolded, changeswhencasemapped, changeswhenlowercased, graphemebase, idcontinue, idstart, uppercase, xidcontinue, xidstart]
U+00D5 L Letter: Upper case letter, latin, Other, U+00F5, [alphabetic, cased, changeswhencasefolded, changeswhencasemapped, changeswhenlowercased, graphemebase, idcontinue, idstart, uppercase, xidcontinue, xidstart]
U+00D6 L Letter: Upper case letter, latin, Other, U+00F6, [alphabetic, cased, changeswhencasefolded, changeswhencasemapped, changeswhenlowercased, graphemebase, idcontinue, idstart, uppercase, xidcontinue, xidstart]
U+00D7 ON Symbol: Mathematical symbol, common, Other, [graphemebase, math, patternsyntax]
U+00D8 L Letter: Upper case letter, latin, Other, U+00F8, [alphabetic, cased, changeswhencasefolded, changeswhencasemapped, changeswhenlowercased, graphemebase, idcontinue, idstart, uppercase, xidcontinue, xidstart]
U+00D9 L Letter: Upper case letter, latin, Other, U+00F9, [alphabetic, cased, changeswhencasefolded, changeswhencasemapped, changeswhenlowercased, graphemebase, idcontinue, idstart, uppercase, xidcontinue, xidstart]
U+00DA L Letter: Upper case letter, latin, Other, U+00FA, [alphabetic, cased, changeswhencasefolded, changeswhencasemapped, changeswhenlowercased, graphemebase, idcontinue, idstart, uppercase, xidcontinue, xidstart]
U+00DB L Letter: Upper case letter, latin, Other, U+00FB, [alphabetic, cased, changeswhencasefolded, changeswhencasemapped, changeswhenlowercased, graphemebase, idcontinue, idstart, uppercase, xidcontinue, xidstart]
U+00DC L Letter: Upper case letter, latin, Other, U+00FC, [alphabetic, cased, changeswhencasefolded, changeswhencasemapped, changeswhenlowercased, graphemebase, idcontinue, idstart, uppercase, xidcontinue, xidstart]
U+00DD L Letter: Upper case letter, latin, Other, U+00FD, [alphabetic, cased, changeswhencasefolded, changeswhencasemapped, changeswhenlowercased, graphemebase, idcontinue, idstart, uppercase, xidcontinue, xidstart]
U+00DE L Letter: Upper case letter, latin, Other, U+00FE, [alphabetic, cased, changeswhencasefolded, changeswhencasemapped, changeswhenlowercased, graphemebase, idcontinue, idstart, uppercase, xidcontinue, xidstart]
U+00DF L Letter: Lower case letter, latin, Other, U+1E9E, [alphabetic, cased, changeswhencasefolded, changeswhencasemapped, changeswhentitlecased, changeswhenuppercased, graphemebase, idcontinue, idstart, lowercase, xidcontinue, xidstart]
U+00D0 L Letter: Upper case letter, latin, Other, U+00F0, [alphabetic, graphemeextend, idcontinue, xidcontinue]
U+00D1 L Letter: Upper case letter, latin, Other, U+00F1, [alphabetic, graphemeextend, idcontinue, xidcontinue]
U+00D2 L Letter: Upper case letter, latin, Other, U+00F2, [alphabetic, graphemeextend, idcontinue, xidcontinue]
U+00D3 L Letter: Upper case letter, latin, Other, U+00F3, [alphabetic, graphemeextend, idcontinue, xidcontinue]
U+00D4 L Letter: Upper case letter, latin, Other, U+00F4, [alphabetic, graphemeextend, idcontinue, xidcontinue]
U+00D5 L Letter: Upper case letter, latin, Other, U+00F5, [alphabetic, graphemeextend, idcontinue, xidcontinue]
U+00D6 L Letter: Upper case letter, latin, Other, U+00F6, [alphabetic, graphemeextend, idcontinue, xidcontinue]
U+00D7 ON Symbol: Mathematical symbol, common, Other, [alphabetic, caseignorable, diacritic, graphemeextend, idcontinue, xidcontinue]
U+00D8 L Letter: Upper case letter, latin, Other, U+00F8, [alphabetic, graphemeextend, idcontinue, xidcontinue]
U+00D9 L Letter: Upper case letter, latin, Other, U+00F9, [alphabetic, graphemeextend, idcontinue, xidcontinue]
U+00DA L Letter: Upper case letter, latin, Other, U+00FA, [alphabetic, graphemeextend, idcontinue, xidcontinue]
U+00DB L Letter: Upper case letter, latin, Other, U+00FB, [alphabetic, graphemeextend, idcontinue, xidcontinue]
U+00DC L Letter: Upper case letter, latin, Other, U+00FC, [alphabetic, graphemeextend, idcontinue, xidcontinue]
U+00DD L Letter: Upper case letter, latin, Other, U+00FD, [alphabetic, graphemeextend, idcontinue, xidcontinue]
U+00DE L Letter: Upper case letter, latin, Other, U+00FE, [alphabetic, graphemeextend, idcontinue, xidcontinue]
U+00DF L Letter: Lower case letter, latin, Other, U+1E9E, [alphabetic, deprecated, graphemebase, idcontinue, idstart, xidcontinue, xidstart]
findprop e0 e1 e2 e3 e4 e5 e6 e7 e8 e9 ea eb ec ed ee ef
U+00E0 L Letter: Lower case letter, latin, Other, U+00C0, [alphabetic, cased, changeswhencasemapped, changeswhentitlecased, changeswhenuppercased, graphemebase, idcontinue, idstart, lowercase, xidcontinue, xidstart]
U+00E1 L Letter: Lower case letter, latin, Other, U+00C1, [alphabetic, cased, changeswhencasemapped, changeswhentitlecased, changeswhenuppercased, graphemebase, idcontinue, idstart, lowercase, xidcontinue, xidstart]
U+00E2 L Letter: Lower case letter, latin, Other, U+00C2, [alphabetic, cased, changeswhencasemapped, changeswhentitlecased, changeswhenuppercased, graphemebase, idcontinue, idstart, lowercase, xidcontinue, xidstart]
U+00E3 L Letter: Lower case letter, latin, Other, U+00C3, [alphabetic, cased, changeswhencasemapped, changeswhentitlecased, changeswhenuppercased, graphemebase, idcontinue, idstart, lowercase, xidcontinue, xidstart]
U+00E4 L Letter: Lower case letter, latin, Other, U+00C4, [alphabetic, cased, changeswhencasemapped, changeswhentitlecased, changeswhenuppercased, graphemebase, idcontinue, idstart, lowercase, xidcontinue, xidstart]
U+00E5 L Letter: Lower case letter, latin, Other, U+00C5, U+212B, [alphabetic, cased, changeswhencasemapped, changeswhentitlecased, changeswhenuppercased, graphemebase, idcontinue, idstart, lowercase, xidcontinue, xidstart]
U+00E6 L Letter: Lower case letter, latin, Other, U+00C6, [alphabetic, cased, changeswhencasemapped, changeswhentitlecased, changeswhenuppercased, graphemebase, idcontinue, idstart, lowercase, xidcontinue, xidstart]
U+00E7 L Letter: Lower case letter, latin, Other, U+00C7, [alphabetic, cased, changeswhencasemapped, changeswhentitlecased, changeswhenuppercased, graphemebase, idcontinue, idstart, lowercase, xidcontinue, xidstart]
U+00E8 L Letter: Lower case letter, latin, Other, U+00C8, [alphabetic, cased, changeswhencasemapped, changeswhentitlecased, changeswhenuppercased, graphemebase, idcontinue, idstart, lowercase, xidcontinue, xidstart]
U+00E9 L Letter: Lower case letter, latin, Other, U+00C9, [alphabetic, cased, changeswhencasemapped, changeswhentitlecased, changeswhenuppercased, graphemebase, idcontinue, idstart, lowercase, xidcontinue, xidstart]
U+00EA L Letter: Lower case letter, latin, Other, U+00CA, [alphabetic, cased, changeswhencasemapped, changeswhentitlecased, changeswhenuppercased, graphemebase, idcontinue, idstart, lowercase, xidcontinue, xidstart]
U+00EB L Letter: Lower case letter, latin, Other, U+00CB, [alphabetic, cased, changeswhencasemapped, changeswhentitlecased, changeswhenuppercased, graphemebase, idcontinue, idstart, lowercase, xidcontinue, xidstart]
U+00EC L Letter: Lower case letter, latin, Other, U+00CC, [alphabetic, cased, changeswhencasemapped, changeswhentitlecased, changeswhenuppercased, graphemebase, idcontinue, idstart, lowercase, xidcontinue, xidstart]
U+00ED L Letter: Lower case letter, latin, Other, U+00CD, [alphabetic, cased, changeswhencasemapped, changeswhentitlecased, changeswhenuppercased, graphemebase, idcontinue, idstart, lowercase, xidcontinue, xidstart]
U+00EE L Letter: Lower case letter, latin, Other, U+00CE, [alphabetic, cased, changeswhencasemapped, changeswhentitlecased, changeswhenuppercased, graphemebase, idcontinue, idstart, lowercase, xidcontinue, xidstart]
U+00EF L Letter: Lower case letter, latin, Other, U+00CF, [alphabetic, cased, changeswhencasemapped, changeswhentitlecased, changeswhenuppercased, graphemebase, idcontinue, idstart, lowercase, xidcontinue, xidstart]
U+00E0 L Letter: Lower case letter, latin, Other, U+00C0, [alphabetic, graphemebase, idcontinue, idstart, xidcontinue]
U+00E1 L Letter: Lower case letter, latin, Other, U+00C1, [alphabetic, graphemebase, idcontinue, idstart, xidcontinue]
U+00E2 L Letter: Lower case letter, latin, Other, U+00C2, [alphabetic, graphemebase, idcontinue, idstart, xidcontinue]
U+00E3 L Letter: Lower case letter, latin, Other, U+00C3, [alphabetic, graphemebase, idcontinue, idstart, xidcontinue]
U+00E4 L Letter: Lower case letter, latin, Other, U+00C4, [alphabetic, graphemebase, idcontinue, idstart, xidcontinue]
U+00E5 L Letter: Lower case letter, latin, Other, U+00C5, U+212B, [alphabetic, graphemebase, idcontinue, idstart, xidcontinue]
U+00E6 L Letter: Lower case letter, latin, Other, U+00C6, [alphabetic, graphemebase, idcontinue, idstart, xidcontinue]
U+00E7 L Letter: Lower case letter, latin, Other, U+00C7, [alphabetic, graphemebase, idcontinue, idstart, xidcontinue]
U+00E8 L Letter: Lower case letter, latin, Other, U+00C8, [alphabetic, graphemebase, idcontinue, idstart, xidcontinue]
U+00E9 L Letter: Lower case letter, latin, Other, U+00C9, [alphabetic, graphemebase, idcontinue, idstart, xidcontinue]
U+00EA L Letter: Lower case letter, latin, Other, U+00CA, [alphabetic, graphemebase, idcontinue, idstart, xidcontinue]
U+00EB L Letter: Lower case letter, latin, Other, U+00CB, [alphabetic, graphemebase, idcontinue, idstart, xidcontinue]
U+00EC L Letter: Lower case letter, latin, Other, U+00CC, [alphabetic, graphemebase, idcontinue, idstart, xidcontinue]
U+00ED L Letter: Lower case letter, latin, Other, U+00CD, [alphabetic, graphemebase, idcontinue, idstart, xidcontinue]
U+00EE L Letter: Lower case letter, latin, Other, U+00CE, [alphabetic, graphemebase, idcontinue, idstart, xidcontinue]
U+00EF L Letter: Lower case letter, latin, Other, U+00CF, [alphabetic, graphemebase, idcontinue, idstart, xidcontinue]
findprop f0 f1 f2 f3 f4 f5 f6 f7 f8 f9 fa fb fc fd fe ff
U+00F0 L Letter: Lower case letter, latin, Other, U+00D0, [alphabetic, cased, changeswhencasemapped, changeswhentitlecased, changeswhenuppercased, graphemebase, idcontinue, idstart, lowercase, xidcontinue, xidstart]
U+00F1 L Letter: Lower case letter, latin, Other, U+00D1, [alphabetic, cased, changeswhencasemapped, changeswhentitlecased, changeswhenuppercased, graphemebase, idcontinue, idstart, lowercase, xidcontinue, xidstart]
U+00F2 L Letter: Lower case letter, latin, Other, U+00D2, [alphabetic, cased, changeswhencasemapped, changeswhentitlecased, changeswhenuppercased, graphemebase, idcontinue, idstart, lowercase, xidcontinue, xidstart]
U+00F3 L Letter: Lower case letter, latin, Other, U+00D3, [alphabetic, cased, changeswhencasemapped, changeswhentitlecased, changeswhenuppercased, graphemebase, idcontinue, idstart, lowercase, xidcontinue, xidstart]
U+00F4 L Letter: Lower case letter, latin, Other, U+00D4, [alphabetic, cased, changeswhencasemapped, changeswhentitlecased, changeswhenuppercased, graphemebase, idcontinue, idstart, lowercase, xidcontinue, xidstart]
U+00F5 L Letter: Lower case letter, latin, Other, U+00D5, [alphabetic, cased, changeswhencasemapped, changeswhentitlecased, changeswhenuppercased, graphemebase, idcontinue, idstart, lowercase, xidcontinue, xidstart]
U+00F6 L Letter: Lower case letter, latin, Other, U+00D6, [alphabetic, cased, changeswhencasemapped, changeswhentitlecased, changeswhenuppercased, graphemebase, idcontinue, idstart, lowercase, xidcontinue, xidstart]
U+00F7 ON Symbol: Mathematical symbol, common, Other, [graphemebase, math, patternsyntax]
U+00F8 L Letter: Lower case letter, latin, Other, U+00D8, [alphabetic, cased, changeswhencasemapped, changeswhentitlecased, changeswhenuppercased, graphemebase, idcontinue, idstart, lowercase, xidcontinue, xidstart]
U+00F9 L Letter: Lower case letter, latin, Other, U+00D9, [alphabetic, cased, changeswhencasemapped, changeswhentitlecased, changeswhenuppercased, graphemebase, idcontinue, idstart, lowercase, xidcontinue, xidstart]
U+00FA L Letter: Lower case letter, latin, Other, U+00DA, [alphabetic, cased, changeswhencasemapped, changeswhentitlecased, changeswhenuppercased, graphemebase, idcontinue, idstart, lowercase, xidcontinue, xidstart]
U+00FB L Letter: Lower case letter, latin, Other, U+00DB, [alphabetic, cased, changeswhencasemapped, changeswhentitlecased, changeswhenuppercased, graphemebase, idcontinue, idstart, lowercase, xidcontinue, xidstart]
U+00FC L Letter: Lower case letter, latin, Other, U+00DC, [alphabetic, cased, changeswhencasemapped, changeswhentitlecased, changeswhenuppercased, graphemebase, idcontinue, idstart, lowercase, xidcontinue, xidstart]
U+00FD L Letter: Lower case letter, latin, Other, U+00DD, [alphabetic, cased, changeswhencasemapped, changeswhentitlecased, changeswhenuppercased, graphemebase, idcontinue, idstart, lowercase, xidcontinue, xidstart]
U+00FE L Letter: Lower case letter, latin, Other, U+00DE, [alphabetic, cased, changeswhencasemapped, changeswhentitlecased, changeswhenuppercased, graphemebase, idcontinue, idstart, lowercase, xidcontinue, xidstart]
U+00FF L Letter: Lower case letter, latin, Other, U+0178, [alphabetic, cased, changeswhencasemapped, changeswhentitlecased, changeswhenuppercased, graphemebase, idcontinue, idstart, lowercase, xidcontinue, xidstart]
U+00F0 L Letter: Lower case letter, latin, Other, U+00D0, [alphabetic, graphemebase, idcontinue, idstart, xidcontinue]
U+00F1 L Letter: Lower case letter, latin, Other, U+00D1, [alphabetic, graphemebase, idcontinue, idstart, xidcontinue]
U+00F2 L Letter: Lower case letter, latin, Other, U+00D2, [alphabetic, graphemebase, idcontinue, idstart, xidcontinue]
U+00F3 L Letter: Lower case letter, latin, Other, U+00D3, [alphabetic, graphemebase, idcontinue, idstart, xidcontinue]
U+00F4 L Letter: Lower case letter, latin, Other, U+00D4, [alphabetic, graphemebase, idcontinue, idstart, xidcontinue]
U+00F5 L Letter: Lower case letter, latin, Other, U+00D5, [alphabetic, graphemebase, idcontinue, idstart, xidcontinue]
U+00F6 L Letter: Lower case letter, latin, Other, U+00D6, [alphabetic, graphemebase, idcontinue, idstart, xidcontinue]
U+00F7 ON Symbol: Mathematical symbol, common, Other, [alphabetic, caseignorable, diacritic, graphemeextend, idcontinue, xidcontinue]
U+00F8 L Letter: Lower case letter, latin, Other, U+00D8, [alphabetic, graphemebase, idcontinue, idstart, xidcontinue]
U+00F9 L Letter: Lower case letter, latin, Other, U+00D9, [alphabetic, graphemebase, idcontinue, idstart, xidcontinue]
U+00FA L Letter: Lower case letter, latin, Other, U+00DA, [alphabetic, graphemebase, idcontinue, idstart, xidcontinue]
U+00FB L Letter: Lower case letter, latin, Other, U+00DB, [alphabetic, graphemebase, idcontinue, idstart, xidcontinue]
U+00FC L Letter: Lower case letter, latin, Other, U+00DC, [alphabetic, graphemebase, idcontinue, idstart, xidcontinue]
U+00FD L Letter: Lower case letter, latin, Other, U+00DD, [alphabetic, graphemebase, idcontinue, idstart, xidcontinue]
U+00FE L Letter: Lower case letter, latin, Other, U+00DE, [alphabetic, graphemebase, idcontinue, idstart, xidcontinue]
U+00FF L Letter: Lower case letter, latin, Other, U+0178, [alphabetic, graphemebase, idcontinue, idstart, xidcontinue]
findprop 0100 0101 0102 0103 0104 0105 0106
U+0100 L Letter: Upper case letter, latin, Other, U+0101, [alphabetic, cased, changeswhencasefolded, changeswhencasemapped, changeswhenlowercased, graphemebase, idcontinue, idstart, uppercase, xidcontinue, xidstart]
U+0101 L Letter: Lower case letter, latin, Other, U+0100, [alphabetic, cased, changeswhencasemapped, changeswhentitlecased, changeswhenuppercased, graphemebase, idcontinue, idstart, lowercase, xidcontinue, xidstart]
U+0102 L Letter: Upper case letter, latin, Other, U+0103, [alphabetic, cased, changeswhencasefolded, changeswhencasemapped, changeswhenlowercased, graphemebase, idcontinue, idstart, uppercase, xidcontinue, xidstart]
U+0103 L Letter: Lower case letter, latin, Other, U+0102, [alphabetic, cased, changeswhencasemapped, changeswhentitlecased, changeswhenuppercased, graphemebase, idcontinue, idstart, lowercase, xidcontinue, xidstart]
U+0104 L Letter: Upper case letter, latin, Other, U+0105, [alphabetic, cased, changeswhencasefolded, changeswhencasemapped, changeswhenlowercased, graphemebase, idcontinue, idstart, uppercase, xidcontinue, xidstart]
U+0105 L Letter: Lower case letter, latin, Other, U+0104, [alphabetic, cased, changeswhencasemapped, changeswhentitlecased, changeswhenuppercased, graphemebase, idcontinue, idstart, lowercase, xidcontinue, xidstart]
U+0106 L Letter: Upper case letter, latin, Other, U+0107, [alphabetic, cased, changeswhencasefolded, changeswhencasemapped, changeswhenlowercased, graphemebase, idcontinue, idstart, uppercase, xidcontinue, xidstart]
U+0100 L Letter: Upper case letter, latin, Other, U+0101, [alphabetic, graphemeextend, idcontinue, xidcontinue]
U+0101 L Letter: Lower case letter, latin, Other, U+0100, [alphabetic, graphemebase, idcontinue, idstart, xidcontinue]
U+0102 L Letter: Upper case letter, latin, Other, U+0103, [alphabetic, graphemeextend, idcontinue, xidcontinue]
U+0103 L Letter: Lower case letter, latin, Other, U+0102, [alphabetic, graphemebase, idcontinue, idstart, xidcontinue]
U+0104 L Letter: Upper case letter, latin, Other, U+0105, [alphabetic, graphemeextend, idcontinue, xidcontinue]
U+0105 L Letter: Lower case letter, latin, Other, U+0104, [alphabetic, graphemebase, idcontinue, idstart, xidcontinue]
U+0106 L Letter: Upper case letter, latin, Other, U+0107, [alphabetic, graphemeextend, idcontinue, xidcontinue]
findprop ffe0 ffe1 ffe2 ffe3 ffe4 ffe5 ffe6 ffe7
U+FFE0 ET Symbol: Currency symbol, common, Other, [graphemebase]
U+FFE1 ET Symbol: Currency symbol, common, Other, [graphemebase]
U+FFE2 ON Symbol: Mathematical symbol, common, Other, [graphemebase, math]
U+FFE3 ON Symbol: Modifier symbol, common, Other, [caseignorable, diacritic, graphemebase]
U+FFE4 ON Symbol: Other symbol, common, Other, [graphemebase]
U+FFE5 ET Symbol: Currency symbol, common, Other, [graphemebase]
U+FFE6 ET Symbol: Currency symbol, common, Other, [graphemebase]
U+FFE0 ET Symbol: Currency symbol, common, Other, [alphabetic, caseignorable, extender, graphemebase, idcontinue, idstart, xidcontinue, xidstart]
U+FFE1 ET Symbol: Currency symbol, common, Other, [alphabetic, caseignorable, extender, graphemebase, idcontinue, idstart, xidcontinue, xidstart]
U+FFE2 ON Symbol: Mathematical symbol, common, Other, [emoji, extendedpictographic, graphemebase]
U+FFE3 ON Symbol: Modifier symbol, common, Other, [alphabetic, cased, graphemebase, idcontinue, idstart, math, uppercase, xidcontinue, xidstart]
U+FFE4 ON Symbol: Other symbol, common, Other, [alphabetic, caseignorable, extender, graphemebase, idcontinue, idstart, xidcontinue, xidstart]
U+FFE5 ET Symbol: Currency symbol, common, Other, [alphabetic, caseignorable, extender, graphemebase, idcontinue, idstart, xidcontinue, xidstart]
U+FFE6 ET Symbol: Currency symbol, common, Other, [alphabetic, caseignorable, extender, graphemebase, idcontinue, idstart, xidcontinue, xidstart]
U+FFE7 L Control: Unassigned, unknown, Other
findprop ffe8 ffe9 ffea ffeb ffec ffed ffee ffef
U+FFE8 ON Symbol: Other symbol, common, Other, [graphemebase]
U+FFE9 ON Symbol: Mathematical symbol, common, Other, [graphemebase, math]
U+FFEA ON Symbol: Mathematical symbol, common, Other, [graphemebase, math]
U+FFEB ON Symbol: Mathematical symbol, common, Other, [graphemebase, math]
U+FFEC ON Symbol: Mathematical symbol, common, Other, [graphemebase, math]
U+FFED ON Symbol: Other symbol, common, Other, [graphemebase]
U+FFEE ON Symbol: Other symbol, common, Other, [graphemebase]
U+FFE8 ON Symbol: Other symbol, common, Other, [alphabetic, caseignorable, extender, graphemebase, idcontinue, idstart, xidcontinue, xidstart]
U+FFE9 ON Symbol: Mathematical symbol, common, Other, [emoji, extendedpictographic, graphemebase]
U+FFEA ON Symbol: Mathematical symbol, common, Other, [emoji, extendedpictographic, graphemebase]
U+FFEB ON Symbol: Mathematical symbol, common, Other, [emoji, extendedpictographic, graphemebase]
U+FFEC ON Symbol: Mathematical symbol, common, Other, [emoji, extendedpictographic, graphemebase]
U+FFED ON Symbol: Other symbol, common, Other, [alphabetic, caseignorable, extender, graphemebase, idcontinue, idstart, xidcontinue, xidstart]
U+FFEE ON Symbol: Other symbol, common, Other, [alphabetic, caseignorable, extender, graphemebase, idcontinue, idstart, xidcontinue, xidstart]
U+FFEF L Control: Unassigned, unknown, Other
findprop fff8 fff9 fffa fffb fffc fffd fffe ffff
U+FFF8 BN Control: Unassigned, unknown, Control, [defaultignorablecodepoint]
U+FFF9 ON Control: Format, common, Control, [caseignorable]
U+FFFA ON Control: Format, common, Control, [caseignorable]
U+FFFB ON Control: Format, common, Control, [caseignorable]
U+FFFC ON Symbol: Other symbol, common, Other, [graphemebase]
U+FFFD ON Symbol: Other symbol, common, Other, [graphemebase]
U+FFFE BN Control: Unassigned, unknown, Other, [noncharactercodepoint]
U+FFFF BN Control: Unassigned, unknown, Other, [noncharactercodepoint]
U+FFF8 BN Control: Unassigned, unknown, Control, [dash, defaultignorablecodepoint, deprecated, extendedpictographic, joincontrol, lowercase, patternwhitespace, quotationmark, sentenceterminal, softdotted, xidcontinue, xidstart]
U+FFF9 ON Control: Format, common, Control, [changeswhenuppercased, deprecated, emojimodifier, emojipresentation, extender, sentenceterminal, xidcontinue, xidstart]
U+FFFA ON Control: Format, common, Control, [changeswhenuppercased, deprecated, emojimodifier, emojipresentation, extender, sentenceterminal, xidcontinue, xidstart]
U+FFFB ON Control: Format, common, Control, [changeswhenuppercased, deprecated, emojimodifier, emojipresentation, extender, sentenceterminal, xidcontinue, xidstart]
U+FFFC ON Symbol: Other symbol, common, Other, [alphabetic, caseignorable, extender, graphemebase, idcontinue, idstart, xidcontinue, xidstart]
U+FFFD ON Symbol: Other symbol, common, Other, [alphabetic, caseignorable, extender, graphemebase, idcontinue, idstart, xidcontinue, xidstart]
U+FFFE BN Control: Unassigned, unknown, Other, [changeswhenuppercased, deprecated, emojicomponent, emojimodifierbase, extender, quotationmark, sentenceterminal, xidcontinue, xidstart]
U+FFFF BN Control: Unassigned, unknown, Other, [changeswhenuppercased, deprecated, emojicomponent, emojimodifierbase, extender, quotationmark, sentenceterminal, xidcontinue, xidstart]
findprop 10000 10001 e01ef f0000 100000
U+10000 L Letter: Other letter, linearb, Other, [alphabetic, graphemebase, idcontinue, idstart, xidcontinue, xidstart]
U+10001 L Letter: Other letter, linearb, Other, [alphabetic, graphemebase, idcontinue, idstart, xidcontinue, xidstart]
U+E01EF NSM Mark: Non-spacing mark, inherited, Extend, [caseignorable, defaultignorablecodepoint, graphemeextend, idcontinue, variationselector, xidcontinue]
U+10000 L Letter: Other letter, linearb, Other, [alphabetic, diacritic, graphemebase, idcontinue, xidcontinue]
U+10001 L Letter: Other letter, linearb, Other, [alphabetic, diacritic, graphemebase, idcontinue, xidcontinue]
U+E01EF NSM Mark: Non-spacing mark, inherited, Extend, []
U+F0000 L Control: Private use, unknown, Other
U+100000 L Control: Private use, unknown, Other
findprop 1b00 12000 7c0 a840 10900
U+1B00 NSM Mark: Non-spacing mark, balinese, Extend, [alphabetic, caseignorable, graphemeextend, idcontinue, xidcontinue]
U+12000 L Letter: Other letter, cuneiform, Other, [alphabetic, graphemebase, idcontinue, idstart, xidcontinue, xidstart]
U+07C0 R Number: Decimal number, nko, Other, [graphemebase, idcontinue, xidcontinue]
U+A840 L Letter: Other letter, phagspa, Other, [alphabetic, graphemebase, idcontinue, idstart, xidcontinue, xidstart]
U+10900 R Letter: Other letter, phoenician, Other, [alphabetic, graphemebase, idcontinue, idstart, xidcontinue, xidstart]
U+1B00 NSM Mark: Non-spacing mark, balinese, Extend, [alphabetic, cased, changeswhencasefolded, changeswhencasemapped, changeswhenlowercased, graphemebase, uppercase]
U+12000 L Letter: Other letter, cuneiform, Other, [alphabetic, diacritic, graphemebase, idcontinue, xidcontinue]
U+07C0 R Number: Decimal number, nko, Other, [graphemebase, patternsyntax, terminalpunctuation]
U+A840 L Letter: Other letter, phagspa, Other, [alphabetic, diacritic, graphemebase, idcontinue, xidcontinue]
U+10900 R Letter: Other letter, phoenician, Other, [alphabetic, diacritic, graphemebase, idcontinue, xidcontinue]
findprop 1d79 a77d
U+1D79 L Letter: Lower case letter, latin, Other, U+A77D, [alphabetic, cased, changeswhencasemapped, changeswhentitlecased, changeswhenuppercased, graphemebase, idcontinue, idstart, lowercase, xidcontinue, xidstart]
U+A77D L Letter: Upper case letter, latin, Other, U+1D79, [alphabetic, cased, changeswhencasefolded, changeswhencasemapped, changeswhenlowercased, graphemebase, idcontinue, idstart, uppercase, xidcontinue, xidstart]
U+1D79 L Letter: Lower case letter, latin, Other, U+A77D, [alphabetic, graphemebase, idcontinue, idstart, xidcontinue]
U+A77D L Letter: Upper case letter, latin, Other, U+1D79, [alphabetic, graphemeextend, idcontinue, xidcontinue]
findprop 0800 083e a4d0 a4f7 aa80 aadf
U+0800 R Letter: Other letter, samaritan, Other, [alphabetic, graphemebase, idcontinue, idstart, xidcontinue, xidstart]
U+083E R Punctuation: Other punctuation, samaritan, Other, [graphemebase, sentenceterminal, terminalpunctuation]
U+A4D0 L Letter: Other letter, lisu, Other, [alphabetic, graphemebase, idcontinue, idstart, xidcontinue, xidstart]
U+A4F7 L Letter: Other letter, lisu, Other, [alphabetic, graphemebase, idcontinue, idstart, xidcontinue, xidstart]
U+AA80 L Letter: Other letter, taiviet, Other, [alphabetic, graphemebase, idcontinue, idstart, xidcontinue, xidstart]
U+AADF L Punctuation: Other punctuation, taiviet, Other, [graphemebase, terminalpunctuation]
U+0800 R Letter: Other letter, samaritan, Other, [alphabetic, diacritic, graphemebase, idcontinue, xidcontinue]
U+083E R Punctuation: Other punctuation, samaritan, Other, [bidimirrored, graphemebase, math, patternsyntax]
U+A4D0 L Letter: Other letter, lisu, Other, [alphabetic, diacritic, graphemebase, idcontinue, xidcontinue]
U+A4F7 L Letter: Other letter, lisu, Other, [alphabetic, diacritic, graphemebase, idcontinue, xidcontinue]
U+AA80 L Letter: Other letter, taiviet, Other, [alphabetic, diacritic, graphemebase, idcontinue, xidcontinue]
U+AADF L Punctuation: Other punctuation, taiviet, Other, [graphemebase, patternsyntax, sentenceterminal, terminalpunctuation]
findprop 10b00 10b35 13000 1342e 10840 10855
U+10B00 R Letter: Other letter, avestan, Other, [alphabetic, graphemebase, idcontinue, idstart, xidcontinue, xidstart]
U+10B35 R Letter: Other letter, avestan, Other, [alphabetic, graphemebase, idcontinue, idstart, xidcontinue, xidstart]
U+13000 L Letter: Other letter, egyptianhieroglyphs, Other, [alphabetic, graphemebase, idcontinue, idstart, xidcontinue, xidstart]
U+1342E L Letter: Other letter, egyptianhieroglyphs, Other, [alphabetic, graphemebase, idcontinue, idstart, xidcontinue, xidstart]
U+10840 R Letter: Other letter, imperialaramaic, Other, [alphabetic, graphemebase, idcontinue, idstart, xidcontinue, xidstart]
U+10855 R Letter: Other letter, imperialaramaic, Other, [alphabetic, graphemebase, idcontinue, idstart, xidcontinue, xidstart]
U+10B00 R Letter: Other letter, avestan, Other, [alphabetic, diacritic, graphemebase, idcontinue, xidcontinue]
U+10B35 R Letter: Other letter, avestan, Other, [alphabetic, diacritic, graphemebase, idcontinue, xidcontinue]
U+13000 L Letter: Other letter, egyptianhieroglyphs, Other, [alphabetic, diacritic, graphemebase, idcontinue, xidcontinue]
U+1342E L Letter: Other letter, egyptianhieroglyphs, Other, [alphabetic, diacritic, graphemebase, idcontinue, xidcontinue]
U+10840 R Letter: Other letter, imperialaramaic, Other, [alphabetic, diacritic, graphemebase, idcontinue, xidcontinue]
U+10855 R Letter: Other letter, imperialaramaic, Other, [alphabetic, diacritic, graphemebase, idcontinue, xidcontinue]
findprop 11100 1113c 11680 116c0
U+11100 NSM Mark: Non-spacing mark, chakma, Extend, [alphabetic, caseignorable, graphemeextend, idcontinue, xidcontinue]
U+1113C L Number: Decimal number, chakma, Other, [graphemebase, idcontinue, xidcontinue]
U+11680 L Letter: Other letter, takri, Other, [alphabetic, graphemebase, idcontinue, idstart, xidcontinue, xidstart]
U+116C0 L Number: Decimal number, takri, Other, [graphemebase, idcontinue, xidcontinue]
U+11100 NSM Mark: Non-spacing mark, chakma, Extend, [alphabetic, cased, changeswhencasefolded, changeswhencasemapped, changeswhenlowercased, graphemebase, uppercase]
U+1113C L Number: Decimal number, chakma, Other, [graphemebase, patternsyntax, terminalpunctuation]
U+11680 L Letter: Other letter, takri, Other, [alphabetic, diacritic, graphemebase, idcontinue, xidcontinue]
U+116C0 L Number: Decimal number, takri, Other, [graphemebase, patternsyntax, terminalpunctuation]
findprop 0d 0a 0e 0711 1b04 1111 1169 11fe ae4c ad89
U+000D B Control: Control, common, CR, [ascii, patternwhitespace, whitespace]
U+000A B Control: Control, common, LF, [ascii, patternwhitespace, whitespace]
U+000E BN Control: Control, common, Control, [ascii]
U+0711 NSM Mark: Non-spacing mark, syriac, Extend, [alphabetic, caseignorable, graphemeextend, idcontinue, xidcontinue]
U+1B04 L Mark: Spacing mark, balinese, SpacingMark, [alphabetic, graphemebase, idcontinue, xidcontinue]
U+1111 L Letter: Other letter, hangul, Hangul syllable type L, [alphabetic, graphemebase, idcontinue, idstart, xidcontinue, xidstart]
U+1169 L Letter: Other letter, hangul, Hangul syllable type V, [alphabetic, graphemebase, idcontinue, idstart, xidcontinue, xidstart]
U+11FE L Letter: Other letter, hangul, Hangul syllable type T, [alphabetic, graphemebase, idcontinue, idstart, xidcontinue, xidstart]
U+AE4C L Letter: Other letter, hangul, Hangul syllable type LV, [alphabetic, graphemebase, idcontinue, idstart, xidcontinue, xidstart]
U+AD89 L Letter: Other letter, hangul, Hangul syllable type LVT, [alphabetic, graphemebase, idcontinue, idstart, xidcontinue, xidstart]
U+000D B Control: Control, common, CR, [ascii, graphemebase, patternsyntax, sentenceterminal, terminalpunctuation]
U+000A B Control: Control, common, LF, [ascii, graphemebase, patternsyntax, sentenceterminal, terminalpunctuation]
U+000E BN Control: Control, common, Control, [ascii, patternwhitespace, whitespace]
U+0711 NSM Mark: Non-spacing mark, syriac, Extend, [alphabetic, cased, changeswhencasefolded, changeswhencasemapped, changeswhenlowercased, graphemebase, uppercase]
U+1B04 L Mark: Spacing mark, balinese, SpacingMark, [dash, emoji, extendedpictographic, graphemebase, patternsyntax]
U+1111 L Letter: Other letter, hangul, Hangul syllable type L, [alphabetic, diacritic, graphemebase, idcontinue, xidcontinue]
U+1169 L Letter: Other letter, hangul, Hangul syllable type V, [alphabetic, diacritic, graphemebase, idcontinue, xidcontinue]
U+11FE L Letter: Other letter, hangul, Hangul syllable type T, [alphabetic, diacritic, graphemebase, idcontinue, xidcontinue]
U+AE4C L Letter: Other letter, hangul, Hangul syllable type LV, [alphabetic, diacritic, graphemebase, idcontinue, xidcontinue]
U+AD89 L Letter: Other letter, hangul, Hangul syllable type LVT, [alphabetic, diacritic, graphemebase, idcontinue, xidcontinue]
findprop 118a0 11ac7 16ad0
U+118A0 L Letter: Upper case letter, warangciti, Other, U+118C0, [alphabetic, cased, changeswhencasefolded, changeswhencasemapped, changeswhenlowercased, graphemebase, idcontinue, idstart, uppercase, xidcontinue, xidstart]
U+11AC7 L Letter: Other letter, paucinhau, Other, [alphabetic, graphemebase, idcontinue, idstart, xidcontinue, xidstart]
U+16AD0 L Letter: Other letter, bassavah, Other, [alphabetic, graphemebase, idcontinue, idstart, xidcontinue, xidstart]
U+118A0 L Letter: Upper case letter, warangciti, Other, U+118C0, [alphabetic, graphemeextend, idcontinue, xidcontinue]
U+11AC7 L Letter: Other letter, paucinhau, Other, [alphabetic, diacritic, graphemebase, idcontinue, xidcontinue]
U+16AD0 L Letter: Other letter, bassavah, Other, [alphabetic, diacritic, graphemebase, idcontinue, xidcontinue]
findprop 11700 14400 108e0 11280 1d800
U+11700 L Letter: Other letter, ahom, Other, [alphabetic, graphemebase, idcontinue, idstart, xidcontinue, xidstart]
U+14400 L Letter: Other letter, anatolianhieroglyphs, Other, [alphabetic, graphemebase, idcontinue, idstart, xidcontinue, xidstart]
U+108E0 R Letter: Other letter, hatran, Other, [alphabetic, graphemebase, idcontinue, idstart, xidcontinue, xidstart]
U+11280 L Letter: Other letter, multani, Other, [alphabetic, graphemebase, idcontinue, idstart, xidcontinue, xidstart]
U+1D800 L Symbol: Other symbol, signwriting, Other, [graphemebase]
U+11700 L Letter: Other letter, ahom, Other, [alphabetic, diacritic, graphemebase, idcontinue, xidcontinue]
U+14400 L Letter: Other letter, anatolianhieroglyphs, Other, [alphabetic, diacritic, graphemebase, idcontinue, xidcontinue]
U+108E0 R Letter: Other letter, hatran, Other, [alphabetic, diacritic, graphemebase, idcontinue, xidcontinue]
U+11280 L Letter: Other letter, multani, Other, [alphabetic, diacritic, graphemebase, idcontinue, xidcontinue]
U+1D800 L Symbol: Other symbol, signwriting, Other, [alphabetic, caseignorable, extender, graphemebase, idcontinue, idstart, xidcontinue, xidstart]
findprop 11800 1e903 11da9 10d27 11ee0 16e48 10f27 10f30
U+11800 L Letter: Other letter, dogra, Other, [alphabetic, graphemebase, idcontinue, idstart, xidcontinue, xidstart]
U+1E903 R Letter: Upper case letter, adlam, Other, U+1E925, [alphabetic, cased, changeswhencasefolded, changeswhencasemapped, changeswhenlowercased, graphemebase, idcontinue, idstart, uppercase, xidcontinue, xidstart]
U+11DA9 L Number: Decimal number, gunjalagondi, Other, [graphemebase, idcontinue, xidcontinue]
U+10D27 NSM Mark: Non-spacing mark, hanifirohingya, Extend, [alphabetic, caseignorable, diacritic, graphemeextend, idcontinue, xidcontinue]
U+11EE0 L Letter: Other letter, makasar, Other, [alphabetic, graphemebase, idcontinue, idstart, xidcontinue, xidstart]
U+16E48 L Letter: Upper case letter, medefaidrin, Other, U+16E68, [alphabetic, cased, changeswhencasefolded, changeswhencasemapped, changeswhenlowercased, graphemebase, idcontinue, idstart, uppercase, xidcontinue, xidstart]
U+10F27 R Letter: Other letter, oldsogdian, Other, [alphabetic, graphemebase, idcontinue, idstart, xidcontinue, xidstart]
U+10F30 AL Letter: Other letter, sogdian, Other, [alphabetic, graphemebase, idcontinue, idstart, xidcontinue, xidstart]
U+11800 L Letter: Other letter, dogra, Other, [alphabetic, diacritic, graphemebase, idcontinue, xidcontinue]
U+1E903 R Letter: Upper case letter, adlam, Other, U+1E925, [alphabetic, graphemeextend, idcontinue, xidcontinue]
U+11DA9 L Number: Decimal number, gunjalagondi, Other, [graphemebase, patternsyntax, terminalpunctuation]
U+10D27 NSM Mark: Non-spacing mark, hanifirohingya, Extend, [extendedpictographic, graphemebase, patternsyntax]
U+11EE0 L Letter: Other letter, makasar, Other, [alphabetic, diacritic, graphemebase, idcontinue, xidcontinue]
U+16E48 L Letter: Upper case letter, medefaidrin, Other, U+16E68, [alphabetic, graphemeextend, idcontinue, xidcontinue]
U+10F27 R Letter: Other letter, oldsogdian, Other, [alphabetic, diacritic, graphemebase, idcontinue, xidcontinue]
U+10F30 AL Letter: Other letter, sogdian, Other, [alphabetic, diacritic, graphemebase, idcontinue, xidcontinue]
findprop a836 a833 1cf4 20f0 1cd0
U+A836 L Symbol: Other symbol, common, Other, [devanagari, gurmukhi, gujarati, kaithi, takri, khojki, mahajani, modi, khudawadi, tirhuta, dogra], [graphemebase]
U+A833 L Number: Other number, common, Other, [devanagari, gurmukhi, gujarati, kannada, kaithi, takri, khojki, mahajani, modi, khudawadi, tirhuta, dogra, nandinagari], [graphemebase]
U+1CF4 NSM Mark: Non-spacing mark, inherited, Extend, [devanagari, kannada, grantha], [caseignorable, diacritic, graphemeextend, idcontinue, xidcontinue]
U+20F0 NSM Mark: Non-spacing mark, inherited, Extend, [latin, devanagari, grantha], [caseignorable, graphemeextend, idcontinue, xidcontinue]
U+1CD0 NSM Mark: Non-spacing mark, inherited, Extend, [devanagari, bengali, kannada, grantha], [caseignorable, diacritic, graphemeextend, idcontinue, xidcontinue]
U+A836 L Symbol: Other symbol, common, Other, [devanagari, gurmukhi, gujarati, kaithi, takri, khojki, mahajani, modi, khudawadi, tirhuta, dogra], [alphabetic, caseignorable, extender, graphemebase, idcontinue, idstart, xidcontinue, xidstart]
U+A833 L Number: Other number, common, Other, [devanagari, gurmukhi, gujarati, kannada, kaithi, takri, khojki, mahajani, modi, khudawadi, tirhuta, dogra, nandinagari], [alphabetic, caseignorable, extender, graphemebase, idcontinue, idstart, xidcontinue, xidstart]
U+1CF4 NSM Mark: Non-spacing mark, inherited, Extend, [devanagari, kannada, grantha], [alphabetic, cased, graphemebase, idcontinue, idstart, lowercase, softdotted, xidcontinue, xidstart]
U+20F0 NSM Mark: Non-spacing mark, inherited, Extend, [latin, devanagari, grantha], [caseignorable, graphemebase, patternsyntax, quotationmark]
U+1CD0 NSM Mark: Non-spacing mark, inherited, Extend, [devanagari, bengali, kannada, grantha], [alphabetic, cased, graphemebase, idcontinue, idstart, lowercase, softdotted, xidcontinue, xidstart]
findprop 32ff
U+32FF L Symbol: Other symbol, common, Other, [han], [graphemebase]
U+32FF L Symbol: Other symbol, common, Other, [han], [alphabetic, caseignorable, extender, graphemebase, idcontinue, idstart, xidcontinue, xidstart]
findprop 1f16d
U+1F16D ON Symbol: Other symbol, common, Extended Pictographic, [extendedpictographic, graphemebase]
U+1F16D ON Symbol: Other symbol, common, Extended Pictographic, [ascii, sentenceterminal, unifiedideograph, whitespace, xidcontinue]
findprop U+10e93 U+10eaa
U+10E93 R Letter: Other letter, yezidi, Other, [alphabetic, graphemebase, idcontinue, idstart, xidcontinue, xidstart]
U+10E93 R Letter: Other letter, yezidi, Other, [alphabetic, diacritic, graphemebase, idcontinue, xidcontinue]
U+10EAA R Control: Unassigned, unknown, Other
findprop 0602 202a 202b 202c 2068 2069 202d 202e 2067
U+0602 AN Control: Format, arabic, Prepend, [caseignorable, prependedconcatenationmark]
U+202A LRE Control: Format, common, Control, [bidicontrol, caseignorable, defaultignorablecodepoint]
U+202B RLE Control: Format, common, Control, [bidicontrol, caseignorable, defaultignorablecodepoint]
U+202C PDF Control: Format, common, Control, [bidicontrol, caseignorable, defaultignorablecodepoint]
U+2068 FSI Control: Format, common, Control, [bidicontrol, caseignorable, defaultignorablecodepoint]
U+2069 PDI Control: Format, common, Control, [bidicontrol, caseignorable, defaultignorablecodepoint]
U+202D LRO Control: Format, common, Control, [bidicontrol, caseignorable, defaultignorablecodepoint]
U+202E RLO Control: Format, common, Control, [bidicontrol, caseignorable, defaultignorablecodepoint]
U+2067 RLI Control: Format, common, Control, [bidicontrol, caseignorable, defaultignorablecodepoint]
U+0602 AN Control: Format, arabic, Prepend, [alphabetic, cased, changeswhencasemapped, changeswhentitlecased, changeswhenuppercased, graphemebase, lowercase]
U+202A LRE Control: Format, common, Control, [extendedpictographic, graphemebase, math, patternsyntax]
U+202B RLE Control: Format, common, Control, [extendedpictographic, graphemebase, math, patternsyntax]
U+202C PDF Control: Format, common, Control, [extendedpictographic, graphemebase, math, patternsyntax]
U+2068 FSI Control: Format, common, Control, [extendedpictographic, graphemebase, math, patternsyntax]
U+2069 PDI Control: Format, common, Control, [extendedpictographic, graphemebase, math, patternsyntax]
U+202D LRO Control: Format, common, Control, [extendedpictographic, graphemebase, math, patternsyntax]
U+202E RLO Control: Format, common, Control, [extendedpictographic, graphemebase, math, patternsyntax]
U+2067 RLI Control: Format, common, Control, [extendedpictographic, graphemebase, math, patternsyntax]

@ -1,278 +1,298 @@
find script Han
U+2E80..U+2E99 ON Symbol: Other symbol, han, Other, [graphemebase, radical]
U+2E9B..U+2EF3 ON Symbol: Other symbol, han, Other, [graphemebase, radical]
U+2F00..U+2FD5 ON Symbol: Other symbol, han, Other, [graphemebase, radical]
U+3005 L Letter: Modifier letter, han, Other, [alphabetic, caseignorable, extender, graphemebase, idcontinue, idstart, xidcontinue, xidstart]
U+3007 L Number: Letter number, han, Other, [alphabetic, graphemebase, idcontinue, idstart, ideographic, xidcontinue, xidstart]
U+3021..U+3029 L Number: Letter number, han, Other, [alphabetic, graphemebase, idcontinue, idstart, ideographic, xidcontinue, xidstart]
U+3038..U+303A L Number: Letter number, han, Other, [alphabetic, graphemebase, idcontinue, idstart, ideographic, xidcontinue, xidstart]
U+303B L Letter: Modifier letter, han, Other, [alphabetic, caseignorable, graphemebase, idcontinue, idstart, xidcontinue, xidstart]
U+3400..U+4DBF L Letter: Other letter, han, Other, [alphabetic, graphemebase, idcontinue, idstart, ideographic, unifiedideograph, xidcontinue, xidstart]
U+4E00..U+9FFF L Letter: Other letter, han, Other, [alphabetic, graphemebase, idcontinue, idstart, ideographic, unifiedideograph, xidcontinue, xidstart]
U+F900..U+FA0D L Letter: Other letter, han, Other, [alphabetic, graphemebase, idcontinue, idstart, ideographic, xidcontinue, xidstart]
U+FA0E..U+FA0F L Letter: Other letter, han, Other, [alphabetic, graphemebase, idcontinue, idstart, ideographic, unifiedideograph, xidcontinue, xidstart]
U+FA10 L Letter: Other letter, han, Other, [alphabetic, graphemebase, idcontinue, idstart, ideographic, xidcontinue, xidstart]
U+FA11 L Letter: Other letter, han, Other, [alphabetic, graphemebase, idcontinue, idstart, ideographic, unifiedideograph, xidcontinue, xidstart]
U+FA12 L Letter: Other letter, han, Other, [alphabetic, graphemebase, idcontinue, idstart, ideographic, xidcontinue, xidstart]
U+FA13..U+FA14 L Letter: Other letter, han, Other, [alphabetic, graphemebase, idcontinue, idstart, ideographic, unifiedideograph, xidcontinue, xidstart]
U+FA15..U+FA1E L Letter: Other letter, han, Other, [alphabetic, graphemebase, idcontinue, idstart, ideographic, xidcontinue, xidstart]
U+FA1F L Letter: Other letter, han, Other, [alphabetic, graphemebase, idcontinue, idstart, ideographic, unifiedideograph, xidcontinue, xidstart]
U+FA20 L Letter: Other letter, han, Other, [alphabetic, graphemebase, idcontinue, idstart, ideographic, xidcontinue, xidstart]
U+FA21 L Letter: Other letter, han, Other, [alphabetic, graphemebase, idcontinue, idstart, ideographic, unifiedideograph, xidcontinue, xidstart]
U+FA22 L Letter: Other letter, han, Other, [alphabetic, graphemebase, idcontinue, idstart, ideographic, xidcontinue, xidstart]
U+FA23..U+FA24 L Letter: Other letter, han, Other, [alphabetic, graphemebase, idcontinue, idstart, ideographic, unifiedideograph, xidcontinue, xidstart]
U+FA25..U+FA26 L Letter: Other letter, han, Other, [alphabetic, graphemebase, idcontinue, idstart, ideographic, xidcontinue, xidstart]
U+FA27..U+FA29 L Letter: Other letter, han, Other, [alphabetic, graphemebase, idcontinue, idstart, ideographic, unifiedideograph, xidcontinue, xidstart]
U+FA2A..U+FA6D L Letter: Other letter, han, Other, [alphabetic, graphemebase, idcontinue, idstart, ideographic, xidcontinue, xidstart]
U+FA70..U+FAD9 L Letter: Other letter, han, Other, [alphabetic, graphemebase, idcontinue, idstart, ideographic, xidcontinue, xidstart]
U+16FE2 ON Punctuation: Other punctuation, han, Other, [graphemebase]
U+16FE3 L Letter: Modifier letter, han, Other, [alphabetic, caseignorable, extender, graphemebase, idcontinue, idstart, xidcontinue, xidstart]
U+16FF0..U+16FF1 L Mark: Spacing mark, han, SpacingMark, [alphabetic, diacritic, graphemebase, idcontinue, xidcontinue]
U+20000..U+2A6DF L Letter: Other letter, han, Other, [alphabetic, graphemebase, idcontinue, idstart, ideographic, unifiedideograph, xidcontinue, xidstart]
U+2A700..U+2B738 L Letter: Other letter, han, Other, [alphabetic, graphemebase, idcontinue, idstart, ideographic, unifiedideograph, xidcontinue, xidstart]
U+2B740..U+2B81D L Letter: Other letter, han, Other, [alphabetic, graphemebase, idcontinue, idstart, ideographic, unifiedideograph, xidcontinue, xidstart]
U+2B820..U+2CEA1 L Letter: Other letter, han, Other, [alphabetic, graphemebase, idcontinue, idstart, ideographic, unifiedideograph, xidcontinue, xidstart]
U+2CEB0..U+2EBE0 L Letter: Other letter, han, Other, [alphabetic, graphemebase, idcontinue, idstart, ideographic, unifiedideograph, xidcontinue, xidstart]
U+2F800..U+2FA1D L Letter: Other letter, han, Other, [alphabetic, graphemebase, idcontinue, idstart, ideographic, xidcontinue, xidstart]
U+30000..U+3134A L Letter: Other letter, han, Other, [alphabetic, graphemebase, idcontinue, idstart, ideographic, unifiedideograph, xidcontinue, xidstart]
U+2E80..U+2E99 ON Symbol: Other symbol, han, Other, [ascii, sentenceterminal, unifiedideograph, whitespace, xidstart]
U+2E9B..U+2EF3 ON Symbol: Other symbol, han, Other, [ascii, sentenceterminal, unifiedideograph, whitespace, xidstart]
U+2F00..U+2FD5 ON Symbol: Other symbol, han, Other, [ascii, sentenceterminal, unifiedideograph, whitespace, xidstart]
U+3005 L Letter: Modifier letter, han, Other, [emoji, emojimodifierbase, emojipresentation, extendedpictographic, graphemebase, patternsyntax]
U+3007 L Number: Letter number, han, Other, [sentenceterminal, unifiedideograph, xidcontinue, xidstart]
U+3021..U+3029 L Number: Letter number, han, Other, [sentenceterminal, unifiedideograph, xidcontinue, xidstart]
U+3038..U+303A L Number: Letter number, han, Other, [sentenceterminal, unifiedideograph, xidcontinue, xidstart]
U+303B L Letter: Modifier letter, han, Other, [alphabetic, graphemebase, idcontinue, idstart, ideographic, xidcontinue, xidstart]
U+3400..U+4DBF L Letter: Other letter, han, Other, [changeswhenuppercased, deprecated, emojimodifier, emojimodifierbase, extender, quotationmark, sentenceterminal, xidcontinue, xidstart]
U+4E00..U+9FFF L Letter: Other letter, han, Other, [changeswhenuppercased, deprecated, emojimodifier, emojimodifierbase, extender, quotationmark, sentenceterminal, xidcontinue, xidstart]
U+F900..U+FA0D L Letter: Other letter, han, Other, [sentenceterminal, unifiedideograph, xidcontinue, xidstart]
U+FA0E..U+FA0F L Letter: Other letter, han, Other, [changeswhenuppercased, deprecated, emojimodifier, emojimodifierbase, extender, quotationmark, sentenceterminal, xidcontinue, xidstart]
U+FA10 L Letter: Other letter, han, Other, [sentenceterminal, unifiedideograph, xidcontinue, xidstart]
U+FA11 L Letter: Other letter, han, Other, [changeswhenuppercased, deprecated, emojimodifier, emojimodifierbase, extender, quotationmark, sentenceterminal, xidcontinue, xidstart]
U+FA12 L Letter: Other letter, han, Other, [sentenceterminal, unifiedideograph, xidcontinue, xidstart]
U+FA13..U+FA14 L Letter: Other letter, han, Other, [changeswhenuppercased, deprecated, emojimodifier, emojimodifierbase, extender, quotationmark, sentenceterminal, xidcontinue, xidstart]
U+FA15..U+FA1E L Letter: Other letter, han, Other, [sentenceterminal, unifiedideograph, xidcontinue, xidstart]
U+FA1F L Letter: Other letter, han, Other, [changeswhenuppercased, deprecated, emojimodifier, emojimodifierbase, extender, quotationmark, sentenceterminal, xidcontinue, xidstart]
U+FA20 L Letter: Other letter, han, Other, [sentenceterminal, unifiedideograph, xidcontinue, xidstart]
U+FA21 L Letter: Other letter, han, Other, [changeswhenuppercased, deprecated, emojimodifier, emojimodifierbase, extender, quotationmark, sentenceterminal, xidcontinue, xidstart]
U+FA22 L Letter: Other letter, han, Other, [sentenceterminal, unifiedideograph, xidcontinue, xidstart]
U+FA23..U+FA24 L Letter: Other letter, han, Other, [changeswhenuppercased, deprecated, emojimodifier, emojimodifierbase, extender, quotationmark, sentenceterminal, xidcontinue, xidstart]
U+FA25..U+FA26 L Letter: Other letter, han, Other, [sentenceterminal, unifiedideograph, xidcontinue, xidstart]
U+FA27..U+FA29 L Letter: Other letter, han, Other, [changeswhenuppercased, deprecated, emojimodifier, emojimodifierbase, extender, quotationmark, sentenceterminal, xidcontinue, xidstart]
U+FA2A..U+FA6D L Letter: Other letter, han, Other, [sentenceterminal, unifiedideograph, xidcontinue, xidstart]
U+FA70..U+FAD9 L Letter: Other letter, han, Other, [sentenceterminal, unifiedideograph, xidcontinue, xidstart]
U+16FE2 ON Punctuation: Other punctuation, han, Other, [alphabetic, caseignorable, extender, graphemebase, idcontinue, idstart, xidcontinue, xidstart]
U+16FE3 L Letter: Modifier letter, han, Other, [emoji, emojimodifierbase, emojipresentation, extendedpictographic, graphemebase, patternsyntax]
U+16FF0..U+16FF1 L Mark: Spacing mark, han, SpacingMark, [caseignorable, graphemeextend, idcontinue, ideographic, xidcontinue]
U+20000..U+2A6DF L Letter: Other letter, han, Other, [changeswhenuppercased, deprecated, emojimodifier, emojimodifierbase, extender, quotationmark, sentenceterminal, xidcontinue, xidstart]
U+2A700..U+2B738 L Letter: Other letter, han, Other, [changeswhenuppercased, deprecated, emojimodifier, emojimodifierbase, extender, quotationmark, sentenceterminal, xidcontinue, xidstart]
U+2B740..U+2B81D L Letter: Other letter, han, Other, [changeswhenuppercased, deprecated, emojimodifier, emojimodifierbase, extender, quotationmark, sentenceterminal, xidcontinue, xidstart]
U+2B820..U+2CEA1 L Letter: Other letter, han, Other, [changeswhenuppercased, deprecated, emojimodifier, emojimodifierbase, extender, quotationmark, sentenceterminal, xidcontinue, xidstart]
U+2CEB0..U+2EBE0 L Letter: Other letter, han, Other, [changeswhenuppercased, deprecated, emojimodifier, emojimodifierbase, extender, quotationmark, sentenceterminal, xidcontinue, xidstart]
U+2F800..U+2FA1D L Letter: Other letter, han, Other, [sentenceterminal, unifiedideograph, xidcontinue, xidstart]
U+30000..U+3134A L Letter: Other letter, han, Other, [changeswhenuppercased, deprecated, emojimodifier, emojimodifierbase, extender, quotationmark, sentenceterminal, xidcontinue, xidstart]
find type Pe script Common scriptx Hangul
U+3009 ON Punctuation: Close punctuation, common, Other, [hangul, hiragana, katakana, bopomofo, han, yiii], [bidimirrored, graphemebase, patternsyntax]
U+300B ON Punctuation: Close punctuation, common, Other, [hangul, hiragana, katakana, bopomofo, han, yiii], [bidimirrored, graphemebase, patternsyntax]
U+300D ON Punctuation: Close punctuation, common, Other, [hangul, hiragana, katakana, bopomofo, han, yiii], [bidimirrored, graphemebase, patternsyntax, quotationmark]
U+300F ON Punctuation: Close punctuation, common, Other, [hangul, hiragana, katakana, bopomofo, han, yiii], [bidimirrored, graphemebase, patternsyntax, quotationmark]
U+3011 ON Punctuation: Close punctuation, common, Other, [hangul, hiragana, katakana, bopomofo, han, yiii], [bidimirrored, graphemebase, patternsyntax]
U+3015 ON Punctuation: Close punctuation, common, Other, [hangul, hiragana, katakana, bopomofo, han, yiii], [bidimirrored, graphemebase, patternsyntax]
U+3017 ON Punctuation: Close punctuation, common, Other, [hangul, hiragana, katakana, bopomofo, han, yiii], [bidimirrored, graphemebase, patternsyntax]
U+3019 ON Punctuation: Close punctuation, common, Other, [hangul, hiragana, katakana, bopomofo, han, yiii], [bidimirrored, graphemebase, patternsyntax]
U+301B ON Punctuation: Close punctuation, common, Other, [hangul, hiragana, katakana, bopomofo, han, yiii], [bidimirrored, graphemebase, patternsyntax]
U+301E..U+301F ON Punctuation: Close punctuation, common, Other, [hangul, hiragana, katakana, bopomofo, han], [graphemebase, patternsyntax, quotationmark]
U+FF63 ON Punctuation: Close punctuation, common, Other, [hangul, hiragana, katakana, bopomofo, han, yiii], [bidimirrored, graphemebase, quotationmark]
U+3009 ON Punctuation: Close punctuation, common, Other, [hangul, hiragana, katakana, bopomofo, han, yiii], [ascii, alphabetic, cased, changeswhencasemapped, changeswhentitlecased, dash, emojimodifier, emojimodifierbase]
U+300B ON Punctuation: Close punctuation, common, Other, [hangul, hiragana, katakana, bopomofo, han, yiii], [ascii, alphabetic, cased, changeswhencasemapped, changeswhentitlecased, dash, emojimodifier, emojimodifierbase]
U+300D ON Punctuation: Close punctuation, common, Other, [hangul, hiragana, katakana, bopomofo, han, yiii], [graphemebase, sentenceterminal, terminalpunctuation]
U+300F ON Punctuation: Close punctuation, common, Other, [hangul, hiragana, katakana, bopomofo, han, yiii], [graphemebase, sentenceterminal, terminalpunctuation]
U+3011 ON Punctuation: Close punctuation, common, Other, [hangul, hiragana, katakana, bopomofo, han, yiii], [ascii, alphabetic, cased, changeswhencasemapped, changeswhentitlecased, dash, emojimodifier, emojimodifierbase]
U+3015 ON Punctuation: Close punctuation, common, Other, [hangul, hiragana, katakana, bopomofo, han, yiii], [ascii, alphabetic, cased, changeswhencasemapped, changeswhentitlecased, dash, emojimodifier, emojimodifierbase]
U+3017 ON Punctuation: Close punctuation, common, Other, [hangul, hiragana, katakana, bopomofo, han, yiii], [ascii, alphabetic, cased, changeswhencasemapped, changeswhentitlecased, dash, emojimodifier, emojimodifierbase]
U+3019 ON Punctuation: Close punctuation, common, Other, [hangul, hiragana, katakana, bopomofo, han, yiii], [ascii, alphabetic, cased, changeswhencasemapped, changeswhentitlecased, dash, emojimodifier, emojimodifierbase]
U+301B ON Punctuation: Close punctuation, common, Other, [hangul, hiragana, katakana, bopomofo, han, yiii], [ascii, alphabetic, cased, changeswhencasemapped, changeswhentitlecased, dash, emojimodifier, emojimodifierbase]
U+301E..U+301F ON Punctuation: Close punctuation, common, Other, [hangul, hiragana, katakana, bopomofo, han], [softdotted, terminalpunctuation, unifiedideograph, xidcontinue, xidstart]
U+FF63 ON Punctuation: Close punctuation, common, Other, [hangul, hiragana, katakana, bopomofo, han, yiii], [changeswhencasemapped, changeswhenlowercased, changeswhentitlecased, emojimodifier, emojimodifierbase]
find type Sk
U+005E ON Symbol: Modifier symbol, common, Other, [ascii, caseignorable, diacritic, graphemebase, math, patternsyntax]
U+0060 ON Symbol: Modifier symbol, common, Other, [ascii, caseignorable, diacritic, graphemebase, patternsyntax]
U+00A8 ON Symbol: Modifier symbol, common, Other, [caseignorable, diacritic, graphemebase]
U+00AF ON Symbol: Modifier symbol, common, Other, [caseignorable, diacritic, graphemebase]
U+00B4 ON Symbol: Modifier symbol, common, Other, [caseignorable, diacritic, graphemebase]
U+00B8 ON Symbol: Modifier symbol, common, Other, [caseignorable, diacritic, graphemebase]
U+02C2..U+02C5 ON Symbol: Modifier symbol, common, Other, [caseignorable, diacritic, graphemebase]
U+02D2..U+02DF ON Symbol: Modifier symbol, common, Other, [caseignorable, diacritic, graphemebase]
U+02E5..U+02E9 ON Symbol: Modifier symbol, common, Other, [caseignorable, diacritic, graphemebase]
U+02EA..U+02EB ON Symbol: Modifier symbol, bopomofo, Other, [caseignorable, diacritic, graphemebase]
U+02ED ON Symbol: Modifier symbol, common, Other, [caseignorable, diacritic, graphemebase]
U+02EF..U+02FF ON Symbol: Modifier symbol, common, Other, [caseignorable, diacritic, graphemebase]
U+0375 ON Symbol: Modifier symbol, greek, Other, [caseignorable, diacritic, graphemebase]
U+0384 ON Symbol: Modifier symbol, greek, Other, [caseignorable, diacritic, graphemebase]
U+0385 ON Symbol: Modifier symbol, common, Other, [caseignorable, diacritic, graphemebase]
U+0888 AL Symbol: Modifier symbol, arabic, Other, [caseignorable, graphemebase]
U+1FBD ON Symbol: Modifier symbol, greek, Other, [caseignorable, diacritic, graphemebase]
U+1FBF..U+1FC1 ON Symbol: Modifier symbol, greek, Other, [caseignorable, diacritic, graphemebase]
U+1FCD..U+1FCF ON Symbol: Modifier symbol, greek, Other, [caseignorable, diacritic, graphemebase]
U+1FDD..U+1FDF ON Symbol: Modifier symbol, greek, Other, [caseignorable, diacritic, graphemebase]
U+1FED..U+1FEF ON Symbol: Modifier symbol, greek, Other, [caseignorable, diacritic, graphemebase]
U+1FFD..U+1FFE ON Symbol: Modifier symbol, greek, Other, [caseignorable, diacritic, graphemebase]
U+309B..U+309C ON Symbol: Modifier symbol, common, Other, [hiragana, katakana], [caseignorable, diacritic, graphemebase, idcontinue, idstart]
U+A700..U+A707 ON Symbol: Modifier symbol, common, Other, [latin, han], [caseignorable, diacritic, graphemebase]
U+A708..U+A716 ON Symbol: Modifier symbol, common, Other, [caseignorable, diacritic, graphemebase]
U+A720..U+A721 ON Symbol: Modifier symbol, common, Other, [caseignorable, diacritic, graphemebase]
U+A789..U+A78A L Symbol: Modifier symbol, common, Other, [caseignorable, diacritic, graphemebase]
U+AB5B L Symbol: Modifier symbol, common, Other, [caseignorable, diacritic, graphemebase]
U+AB6A..U+AB6B ON Symbol: Modifier symbol, common, Other, [caseignorable, diacritic, graphemebase]
U+FBB2..U+FBC2 AL Symbol: Modifier symbol, arabic, Other, [caseignorable, graphemebase]
U+FF3E ON Symbol: Modifier symbol, common, Other, [caseignorable, diacritic, graphemebase, math]
U+FF40 ON Symbol: Modifier symbol, common, Other, [caseignorable, diacritic, graphemebase]
U+FFE3 ON Symbol: Modifier symbol, common, Other, [caseignorable, diacritic, graphemebase]
U+1F3FB..U+1F3FF ON Symbol: Modifier symbol, common, Extend, [caseignorable, emoji, emojicomponent, emojimodifier, emojipresentation, graphemebase]
U+005E ON Symbol: Modifier symbol, common, Other, [alphabetic, cased, changeswhencasemapped, changeswhentitlecased, changeswhenuppercased, graphemebase, idcontinue, idstart, lowercase, xidcontinue, xidstart]
U+0060 ON Symbol: Modifier symbol, common, Other, [alphabetic, cased, changeswhencasefolded, changeswhencasemapped, changeswhenlowercased, changeswhentitlecased, graphemebase, idcontinue, idstart, uppercase, xidcontinue, xidstart]
U+00A8 ON Symbol: Modifier symbol, common, Other, [alphabetic, cased, graphemebase, idcontinue, idstart, math, uppercase, xidcontinue, xidstart]
U+00AF ON Symbol: Modifier symbol, common, Other, [alphabetic, cased, graphemebase, idcontinue, idstart, math, uppercase, xidcontinue, xidstart]
U+00B4 ON Symbol: Modifier symbol, common, Other, [alphabetic, cased, graphemebase, idcontinue, idstart, math, uppercase, xidcontinue, xidstart]
U+00B8 ON Symbol: Modifier symbol, common, Other, [alphabetic, cased, graphemebase, idcontinue, idstart, math, uppercase, xidcontinue, xidstart]
U+02C2..U+02C5 ON Symbol: Modifier symbol, common, Other, [alphabetic, cased, graphemebase, idcontinue, idstart, math, uppercase, xidcontinue, xidstart]
U+02D2..U+02DF ON Symbol: Modifier symbol, common, Other, [alphabetic, cased, graphemebase, idcontinue, idstart, math, uppercase, xidcontinue, xidstart]
U+02E5..U+02E9 ON Symbol: Modifier symbol, common, Other, [alphabetic, cased, graphemebase, idcontinue, idstart, math, uppercase, xidcontinue, xidstart]
U+02EA..U+02EB ON Symbol: Modifier symbol, bopomofo, Other, [alphabetic, cased, graphemebase, idcontinue, idstart, math, uppercase, xidcontinue, xidstart]
U+02ED ON Symbol: Modifier symbol, common, Other, [alphabetic, cased, graphemebase, idcontinue, idstart, math, uppercase, xidcontinue, xidstart]
U+02EF..U+02FF ON Symbol: Modifier symbol, common, Other, [alphabetic, cased, graphemebase, idcontinue, idstart, math, uppercase, xidcontinue, xidstart]
U+0375 ON Symbol: Modifier symbol, greek, Other, [alphabetic, cased, graphemebase, idcontinue, idstart, math, uppercase, xidcontinue, xidstart]
U+0384 ON Symbol: Modifier symbol, greek, Other, [alphabetic, cased, graphemebase, idcontinue, idstart, math, uppercase, xidcontinue, xidstart]
U+0385 ON Symbol: Modifier symbol, common, Other, [alphabetic, cased, graphemebase, idcontinue, idstart, math, uppercase, xidcontinue, xidstart]
U+0888 AL Symbol: Modifier symbol, arabic, Other, [alphabetic, cased, graphemebase, idcontinue, idstart, lowercase, math, softdotted, xidcontinue, xidstart]
U+1FBD ON Symbol: Modifier symbol, greek, Other, [alphabetic, cased, graphemebase, idcontinue, idstart, math, uppercase, xidcontinue, xidstart]
U+1FBF..U+1FC1 ON Symbol: Modifier symbol, greek, Other, [alphabetic, cased, graphemebase, idcontinue, idstart, math, uppercase, xidcontinue, xidstart]
U+1FCD..U+1FCF ON Symbol: Modifier symbol, greek, Other, [alphabetic, cased, graphemebase, idcontinue, idstart, math, uppercase, xidcontinue, xidstart]
U+1FDD..U+1FDF ON Symbol: Modifier symbol, greek, Other, [alphabetic, cased, graphemebase, idcontinue, idstart, math, uppercase, xidcontinue, xidstart]
U+1FED..U+1FEF ON Symbol: Modifier symbol, greek, Other, [alphabetic, cased, graphemebase, idcontinue, idstart, math, uppercase, xidcontinue, xidstart]
U+1FFD..U+1FFE ON Symbol: Modifier symbol, greek, Other, [alphabetic, cased, graphemebase, idcontinue, idstart, math, uppercase, xidcontinue, xidstart]
U+309B..U+309C ON Symbol: Modifier symbol, common, Other, [hiragana, katakana], [alphabetic, bidimirrored, caseignorable, cased, changeswhencasefolded, changeswhenlowercased, changeswhentitlecased, changeswhenuppercased, dash, defaultignorablecodepoint, deprecated, diacritic, emoji, emojicomponent, emojimodifier, emojimodifierbase, emojipresentation, extendedpictographic, extender, graphemebase, graphemeextend, graphemelink, hexdigit, idsbinaryoperator, idstrinaryoperator, idcontinue, idstart, ideographic, sentenceterminal, unifiedideograph, whitespace, xidcontinue]
U+A700..U+A707 ON Symbol: Modifier symbol, common, Other, [latin, han], [alphabetic, cased, graphemebase, idcontinue, idstart, math, uppercase, xidcontinue, xidstart]
U+A708..U+A716 ON Symbol: Modifier symbol, common, Other, [alphabetic, cased, graphemebase, idcontinue, idstart, math, uppercase, xidcontinue, xidstart]
U+A720..U+A721 ON Symbol: Modifier symbol, common, Other, [alphabetic, cased, graphemebase, idcontinue, idstart, math, uppercase, xidcontinue, xidstart]
U+A789..U+A78A L Symbol: Modifier symbol, common, Other, [alphabetic, cased, graphemebase, idcontinue, idstart, math, uppercase, xidcontinue, xidstart]
U+AB5B L Symbol: Modifier symbol, common, Other, [alphabetic, cased, graphemebase, idcontinue, idstart, math, uppercase, xidcontinue, xidstart]
U+AB6A..U+AB6B ON Symbol: Modifier symbol, common, Other, [alphabetic, cased, graphemebase, idcontinue, idstart, math, uppercase, xidcontinue, xidstart]
U+FBB2..U+FBC2 AL Symbol: Modifier symbol, arabic, Other, [alphabetic, cased, graphemebase, idcontinue, idstart, lowercase, math, softdotted, xidcontinue, xidstart]
U+FF3E ON Symbol: Modifier symbol, common, Other, [asciihexdigit, bidicontrol, bidimirrored, cased, changeswhencasefolded, sentenceterminal, unifiedideograph, whitespace, xidstart]
U+FF40 ON Symbol: Modifier symbol, common, Other, [alphabetic, cased, graphemebase, idcontinue, idstart, math, uppercase, xidcontinue, xidstart]
U+FFE3 ON Symbol: Modifier symbol, common, Other, [alphabetic, cased, graphemebase, idcontinue, idstart, math, uppercase, xidcontinue, xidstart]
U+1F3FB..U+1F3FF ON Symbol: Modifier symbol, common, Extend, [changeswhencasemapped, changeswhenuppercased, emojimodifier, emojimodifierbase, math, patternsyntax, radical, sentenceterminal, terminalpunctuation]
find type Pd
U+002D ES Punctuation: Dash punctuation, common, Other, [ascii, dash, graphemebase, patternsyntax]
U+058A ON Punctuation: Dash punctuation, armenian, Other, [dash, graphemebase]
U+05BE R Punctuation: Dash punctuation, hebrew, Other, [dash, graphemebase]
U+1400 ON Punctuation: Dash punctuation, canadianaboriginal, Other, [dash, graphemebase]
U+1806 ON Punctuation: Dash punctuation, mongolian, Other, [dash, graphemebase]
U+2010..U+2015 ON Punctuation: Dash punctuation, common, Other, [dash, graphemebase, patternsyntax]
U+2E17 ON Punctuation: Dash punctuation, common, Other, [dash, graphemebase, patternsyntax]
U+2E1A ON Punctuation: Dash punctuation, common, Other, [dash, graphemebase, patternsyntax]
U+2E3A..U+2E3B ON Punctuation: Dash punctuation, common, Other, [dash, graphemebase, patternsyntax]
U+2E40 ON Punctuation: Dash punctuation, common, Other, [dash, graphemebase, patternsyntax]
U+2E5D ON Punctuation: Dash punctuation, common, Other, [dash, graphemebase, patternsyntax]
U+301C ON Punctuation: Dash punctuation, common, Other, [hangul, hiragana, katakana, bopomofo, han], [dash, graphemebase, patternsyntax]
U+3030 ON Punctuation: Dash punctuation, common, Extended Pictographic, [hangul, hiragana, katakana, bopomofo, han], [dash, emoji, extendedpictographic, graphemebase, patternsyntax]
U+30A0 ON Punctuation: Dash punctuation, common, Other, [hiragana, katakana], [dash, graphemebase]
U+FE31..U+FE32 ON Punctuation: Dash punctuation, common, Other, [dash, graphemebase]
U+FE58 ON Punctuation: Dash punctuation, common, Other, [dash, graphemebase]
U+FE63 ES Punctuation: Dash punctuation, common, Other, [dash, graphemebase, math]
U+FF0D ES Punctuation: Dash punctuation, common, Other, [dash, graphemebase]
U+10EAD R Punctuation: Dash punctuation, yezidi, Other, [dash, graphemebase]
U+002D ES Punctuation: Dash punctuation, common, Other, [ascii, alphabetic, cased, changeswhencasemapped, changeswhentitlecased, changeswhenuppercased, graphemebase, idcontinue, idstart, lowercase, softdotted, xidcontinue, xidstart]
U+058A ON Punctuation: Dash punctuation, armenian, Other, [emoji, emojipresentation, extendedpictographic, graphemebase, patternsyntax]
U+05BE R Punctuation: Dash punctuation, hebrew, Other, [emoji, emojipresentation, extendedpictographic, graphemebase, patternsyntax]
U+1400 ON Punctuation: Dash punctuation, canadianaboriginal, Other, [emoji, emojipresentation, extendedpictographic, graphemebase, patternsyntax]
U+1806 ON Punctuation: Dash punctuation, mongolian, Other, [emoji, emojipresentation, extendedpictographic, graphemebase, patternsyntax]
U+2010..U+2015 ON Punctuation: Dash punctuation, common, Other, [dash, defaultignorablecodepoint, deprecated, emojipresentation, joincontrol, lowercase, patternwhitespace, radical, regionalindicator, softdotted, xidcontinue, xidstart]
U+2E17 ON Punctuation: Dash punctuation, common, Other, [dash, defaultignorablecodepoint, deprecated, emojipresentation, joincontrol, lowercase, patternwhitespace, radical, regionalindicator, softdotted, xidcontinue, xidstart]
U+2E1A ON Punctuation: Dash punctuation, common, Other, [dash, defaultignorablecodepoint, deprecated, emojipresentation, joincontrol, lowercase, patternwhitespace, radical, regionalindicator, softdotted, xidcontinue, xidstart]
U+2E3A..U+2E3B ON Punctuation: Dash punctuation, common, Other, [dash, defaultignorablecodepoint, deprecated, emojipresentation, joincontrol, lowercase, patternwhitespace, radical, regionalindicator, softdotted, xidcontinue, xidstart]
U+2E40 ON Punctuation: Dash punctuation, common, Other, [dash, defaultignorablecodepoint, deprecated, emojipresentation, joincontrol, lowercase, patternwhitespace, radical, regionalindicator, softdotted, xidcontinue, xidstart]
U+2E5D ON Punctuation: Dash punctuation, common, Other, [dash, defaultignorablecodepoint, deprecated, emojipresentation, joincontrol, lowercase, patternwhitespace, radical, regionalindicator, softdotted, xidcontinue, xidstart]
U+301C ON Punctuation: Dash punctuation, common, Other, [hangul, hiragana, katakana, bopomofo, han], [dash, defaultignorablecodepoint, deprecated, emojipresentation, joincontrol, lowercase, patternwhitespace, radical, regionalindicator, softdotted, xidcontinue, xidstart]
U+3030 ON Punctuation: Dash punctuation, common, Extended Pictographic, [hangul, hiragana, katakana, bopomofo, han], [changeswhencasemapped, changeswhenuppercased, emojimodifier, emojimodifierbase, joincontrol, logicalorderexception, lowercase, prependedconcatenationmark, quotationmark, radical, regionalindicator, sentenceterminal, softdotted, terminalpunctuation, unifiedideograph, uppercase, variationselector, whitespace, xidcontinue, xidstart]
U+30A0 ON Punctuation: Dash punctuation, common, Other, [hiragana, katakana], [emoji, emojipresentation, extendedpictographic, graphemebase, patternsyntax]
U+FE31..U+FE32 ON Punctuation: Dash punctuation, common, Other, [emoji, emojipresentation, extendedpictographic, graphemebase, patternsyntax]
U+FE58 ON Punctuation: Dash punctuation, common, Other, [emoji, emojipresentation, extendedpictographic, graphemebase, patternsyntax]
U+FE63 ES Punctuation: Dash punctuation, common, Other, [caseignorable, sentenceterminal, unifiedideograph, xidcontinue]
U+FF0D ES Punctuation: Dash punctuation, common, Other, [emoji, emojipresentation, extendedpictographic, graphemebase, patternsyntax]
U+10EAD R Punctuation: Dash punctuation, yezidi, Other, [emoji, emojipresentation, extendedpictographic, graphemebase, patternsyntax]
find gbreak LVT
U+AC01..U+AC1B L Letter: Other letter, hangul, Hangul syllable type LVT, [alphabetic, graphemebase, idcontinue, idstart, xidcontinue, xidstart]
U+AC1D..U+AC37 L Letter: Other letter, hangul, Hangul syllable type LVT, [alphabetic, graphemebase, idcontinue, idstart, xidcontinue, xidstart]
U+AC39..U+AC53 L Letter: Other letter, hangul, Hangul syllable type LVT, [alphabetic, graphemebase, idcontinue, idstart, xidcontinue, xidstart]
U+AC55..U+AC6F L Letter: Other letter, hangul, Hangul syllable type LVT, [alphabetic, graphemebase, idcontinue, idstart, xidcontinue, xidstart]
U+AC71..U+AC8B L Letter: Other letter, hangul, Hangul syllable type LVT, [alphabetic, graphemebase, idcontinue, idstart, xidcontinue, xidstart]
U+AC8D..U+ACA7 L Letter: Other letter, hangul, Hangul syllable type LVT, [alphabetic, graphemebase, idcontinue, idstart, xidcontinue, xidstart]
U+ACA9..U+ACC3 L Letter: Other letter, hangul, Hangul syllable type LVT, [alphabetic, graphemebase, idcontinue, idstart, xidcontinue, xidstart]
U+ACC5..U+ACDF L Letter: Other letter, hangul, Hangul syllable type LVT, [alphabetic, graphemebase, idcontinue, idstart, xidcontinue, xidstart]
U+ACE1..U+ACFB L Letter: Other letter, hangul, Hangul syllable type LVT, [alphabetic, graphemebase, idcontinue, idstart, xidcontinue, xidstart]
U+ACFD..U+AD17 L Letter: Other letter, hangul, Hangul syllable type LVT, [alphabetic, graphemebase, idcontinue, idstart, xidcontinue, xidstart]
U+AD19..U+AD33 L Letter: Other letter, hangul, Hangul syllable type LVT, [alphabetic, graphemebase, idcontinue, idstart, xidcontinue, xidstart]
U+AD35..U+AD4F L Letter: Other letter, hangul, Hangul syllable type LVT, [alphabetic, graphemebase, idcontinue, idstart, xidcontinue, xidstart]
U+AD51..U+AD6B L Letter: Other letter, hangul, Hangul syllable type LVT, [alphabetic, graphemebase, idcontinue, idstart, xidcontinue, xidstart]
U+AD6D..U+AD87 L Letter: Other letter, hangul, Hangul syllable type LVT, [alphabetic, graphemebase, idcontinue, idstart, xidcontinue, xidstart]
U+AD89..U+ADA3 L Letter: Other letter, hangul, Hangul syllable type LVT, [alphabetic, graphemebase, idcontinue, idstart, xidcontinue, xidstart]
U+ADA5..U+ADBF L Letter: Other letter, hangul, Hangul syllable type LVT, [alphabetic, graphemebase, idcontinue, idstart, xidcontinue, xidstart]
U+ADC1..U+ADDB L Letter: Other letter, hangul, Hangul syllable type LVT, [alphabetic, graphemebase, idcontinue, idstart, xidcontinue, xidstart]
U+ADDD..U+ADF7 L Letter: Other letter, hangul, Hangul syllable type LVT, [alphabetic, graphemebase, idcontinue, idstart, xidcontinue, xidstart]
U+ADF9..U+AE13 L Letter: Other letter, hangul, Hangul syllable type LVT, [alphabetic, graphemebase, idcontinue, idstart, xidcontinue, xidstart]
U+AE15..U+AE2F L Letter: Other letter, hangul, Hangul syllable type LVT, [alphabetic, graphemebase, idcontinue, idstart, xidcontinue, xidstart]
U+AE31..U+AE4B L Letter: Other letter, hangul, Hangul syllable type LVT, [alphabetic, graphemebase, idcontinue, idstart, xidcontinue, xidstart]
U+AE4D..U+AE67 L Letter: Other letter, hangul, Hangul syllable type LVT, [alphabetic, graphemebase, idcontinue, idstart, xidcontinue, xidstart]
U+AE69..U+AE83 L Letter: Other letter, hangul, Hangul syllable type LVT, [alphabetic, graphemebase, idcontinue, idstart, xidcontinue, xidstart]
U+AE85..U+AE9F L Letter: Other letter, hangul, Hangul syllable type LVT, [alphabetic, graphemebase, idcontinue, idstart, xidcontinue, xidstart]
U+AEA1..U+AEBB L Letter: Other letter, hangul, Hangul syllable type LVT, [alphabetic, graphemebase, idcontinue, idstart, xidcontinue, xidstart]
U+AEBD..U+AED7 L Letter: Other letter, hangul, Hangul syllable type LVT, [alphabetic, graphemebase, idcontinue, idstart, xidcontinue, xidstart]
U+AED9..U+AEF3 L Letter: Other letter, hangul, Hangul syllable type LVT, [alphabetic, graphemebase, idcontinue, idstart, xidcontinue, xidstart]
U+AEF5..U+AF0F L Letter: Other letter, hangul, Hangul syllable type LVT, [alphabetic, graphemebase, idcontinue, idstart, xidcontinue, xidstart]
U+AF11..U+AF2B L Letter: Other letter, hangul, Hangul syllable type LVT, [alphabetic, graphemebase, idcontinue, idstart, xidcontinue, xidstart]
U+AF2D..U+AF47 L Letter: Other letter, hangul, Hangul syllable type LVT, [alphabetic, graphemebase, idcontinue, idstart, xidcontinue, xidstart]
U+AF49..U+AF63 L Letter: Other letter, hangul, Hangul syllable type LVT, [alphabetic, graphemebase, idcontinue, idstart, xidcontinue, xidstart]
U+AF65..U+AF7F L Letter: Other letter, hangul, Hangul syllable type LVT, [alphabetic, graphemebase, idcontinue, idstart, xidcontinue, xidstart]
U+AF81..U+AF9B L Letter: Other letter, hangul, Hangul syllable type LVT, [alphabetic, graphemebase, idcontinue, idstart, xidcontinue, xidstart]
U+AF9D..U+AFB7 L Letter: Other letter, hangul, Hangul syllable type LVT, [alphabetic, graphemebase, idcontinue, idstart, xidcontinue, xidstart]
U+AFB9..U+AFD3 L Letter: Other letter, hangul, Hangul syllable type LVT, [alphabetic, graphemebase, idcontinue, idstart, xidcontinue, xidstart]
U+AFD5..U+AFEF L Letter: Other letter, hangul, Hangul syllable type LVT, [alphabetic, graphemebase, idcontinue, idstart, xidcontinue, xidstart]
U+AFF1..U+B00B L Letter: Other letter, hangul, Hangul syllable type LVT, [alphabetic, graphemebase, idcontinue, idstart, xidcontinue, xidstart]
U+B00D..U+B027 L Letter: Other letter, hangul, Hangul syllable type LVT, [alphabetic, graphemebase, idcontinue, idstart, xidcontinue, xidstart]
U+B029..U+B043 L Letter: Other letter, hangul, Hangul syllable type LVT, [alphabetic, graphemebase, idcontinue, idstart, xidcontinue, xidstart]
U+B045..U+B05F L Letter: Other letter, hangul, Hangul syllable type LVT, [alphabetic, graphemebase, idcontinue, idstart, xidcontinue, xidstart]
U+B061..U+B07B L Letter: Other letter, hangul, Hangul syllable type LVT, [alphabetic, graphemebase, idcontinue, idstart, xidcontinue, xidstart]
U+B07D..U+B097 L Letter: Other letter, hangul, Hangul syllable type LVT, [alphabetic, graphemebase, idcontinue, idstart, xidcontinue, xidstart]
U+B099..U+B0B3 L Letter: Other letter, hangul, Hangul syllable type LVT, [alphabetic, graphemebase, idcontinue, idstart, xidcontinue, xidstart]
U+B0B5..U+B0CF L Letter: Other letter, hangul, Hangul syllable type LVT, [alphabetic, graphemebase, idcontinue, idstart, xidcontinue, xidstart]
U+B0D1..U+B0EB L Letter: Other letter, hangul, Hangul syllable type LVT, [alphabetic, graphemebase, idcontinue, idstart, xidcontinue, xidstart]
U+B0ED..U+B107 L Letter: Other letter, hangul, Hangul syllable type LVT, [alphabetic, graphemebase, idcontinue, idstart, xidcontinue, xidstart]
U+B109..U+B123 L Letter: Other letter, hangul, Hangul syllable type LVT, [alphabetic, graphemebase, idcontinue, idstart, xidcontinue, xidstart]
U+B125..U+B13F L Letter: Other letter, hangul, Hangul syllable type LVT, [alphabetic, graphemebase, idcontinue, idstart, xidcontinue, xidstart]
U+B141..U+B15B L Letter: Other letter, hangul, Hangul syllable type LVT, [alphabetic, graphemebase, idcontinue, idstart, xidcontinue, xidstart]
U+B15D..U+B177 L Letter: Other letter, hangul, Hangul syllable type LVT, [alphabetic, graphemebase, idcontinue, idstart, xidcontinue, xidstart]
U+B179..U+B193 L Letter: Other letter, hangul, Hangul syllable type LVT, [alphabetic, graphemebase, idcontinue, idstart, xidcontinue, xidstart]
U+B195..U+B1AF L Letter: Other letter, hangul, Hangul syllable type LVT, [alphabetic, graphemebase, idcontinue, idstart, xidcontinue, xidstart]
U+B1B1..U+B1CB L Letter: Other letter, hangul, Hangul syllable type LVT, [alphabetic, graphemebase, idcontinue, idstart, xidcontinue, xidstart]
U+B1CD..U+B1E7 L Letter: Other letter, hangul, Hangul syllable type LVT, [alphabetic, graphemebase, idcontinue, idstart, xidcontinue, xidstart]
U+B1E9..U+B203 L Letter: Other letter, hangul, Hangul syllable type LVT, [alphabetic, graphemebase, idcontinue, idstart, xidcontinue, xidstart]
U+B205..U+B21F L Letter: Other letter, hangul, Hangul syllable type LVT, [alphabetic, graphemebase, idcontinue, idstart, xidcontinue, xidstart]
U+B221..U+B23B L Letter: Other letter, hangul, Hangul syllable type LVT, [alphabetic, graphemebase, idcontinue, idstart, xidcontinue, xidstart]
U+B23D..U+B257 L Letter: Other letter, hangul, Hangul syllable type LVT, [alphabetic, graphemebase, idcontinue, idstart, xidcontinue, xidstart]
U+B259..U+B273 L Letter: Other letter, hangul, Hangul syllable type LVT, [alphabetic, graphemebase, idcontinue, idstart, xidcontinue, xidstart]
U+B275..U+B28F L Letter: Other letter, hangul, Hangul syllable type LVT, [alphabetic, graphemebase, idcontinue, idstart, xidcontinue, xidstart]
U+B291..U+B2AB L Letter: Other letter, hangul, Hangul syllable type LVT, [alphabetic, graphemebase, idcontinue, idstart, xidcontinue, xidstart]
U+B2AD..U+B2C7 L Letter: Other letter, hangul, Hangul syllable type LVT, [alphabetic, graphemebase, idcontinue, idstart, xidcontinue, xidstart]
U+B2C9..U+B2E3 L Letter: Other letter, hangul, Hangul syllable type LVT, [alphabetic, graphemebase, idcontinue, idstart, xidcontinue, xidstart]
U+B2E5..U+B2FF L Letter: Other letter, hangul, Hangul syllable type LVT, [alphabetic, graphemebase, idcontinue, idstart, xidcontinue, xidstart]
U+B301..U+B31B L Letter: Other letter, hangul, Hangul syllable type LVT, [alphabetic, graphemebase, idcontinue, idstart, xidcontinue, xidstart]
U+B31D..U+B337 L Letter: Other letter, hangul, Hangul syllable type LVT, [alphabetic, graphemebase, idcontinue, idstart, xidcontinue, xidstart]
U+B339..U+B353 L Letter: Other letter, hangul, Hangul syllable type LVT, [alphabetic, graphemebase, idcontinue, idstart, xidcontinue, xidstart]
U+B355..U+B36F L Letter: Other letter, hangul, Hangul syllable type LVT, [alphabetic, graphemebase, idcontinue, idstart, xidcontinue, xidstart]
U+B371..U+B38B L Letter: Other letter, hangul, Hangul syllable type LVT, [alphabetic, graphemebase, idcontinue, idstart, xidcontinue, xidstart]
U+B38D..U+B3A7 L Letter: Other letter, hangul, Hangul syllable type LVT, [alphabetic, graphemebase, idcontinue, idstart, xidcontinue, xidstart]
U+B3A9..U+B3C3 L Letter: Other letter, hangul, Hangul syllable type LVT, [alphabetic, graphemebase, idcontinue, idstart, xidcontinue, xidstart]
U+B3C5..U+B3DF L Letter: Other letter, hangul, Hangul syllable type LVT, [alphabetic, graphemebase, idcontinue, idstart, xidcontinue, xidstart]
U+B3E1..U+B3FB L Letter: Other letter, hangul, Hangul syllable type LVT, [alphabetic, graphemebase, idcontinue, idstart, xidcontinue, xidstart]
U+B3FD..U+B417 L Letter: Other letter, hangul, Hangul syllable type LVT, [alphabetic, graphemebase, idcontinue, idstart, xidcontinue, xidstart]
U+B419..U+B433 L Letter: Other letter, hangul, Hangul syllable type LVT, [alphabetic, graphemebase, idcontinue, idstart, xidcontinue, xidstart]
U+B435..U+B44F L Letter: Other letter, hangul, Hangul syllable type LVT, [alphabetic, graphemebase, idcontinue, idstart, xidcontinue, xidstart]
U+B451..U+B46B L Letter: Other letter, hangul, Hangul syllable type LVT, [alphabetic, graphemebase, idcontinue, idstart, xidcontinue, xidstart]
U+B46D..U+B487 L Letter: Other letter, hangul, Hangul syllable type LVT, [alphabetic, graphemebase, idcontinue, idstart, xidcontinue, xidstart]
U+B489..U+B4A3 L Letter: Other letter, hangul, Hangul syllable type LVT, [alphabetic, graphemebase, idcontinue, idstart, xidcontinue, xidstart]
U+B4A5..U+B4BF L Letter: Other letter, hangul, Hangul syllable type LVT, [alphabetic, graphemebase, idcontinue, idstart, xidcontinue, xidstart]
U+B4C1..U+B4DB L Letter: Other letter, hangul, Hangul syllable type LVT, [alphabetic, graphemebase, idcontinue, idstart, xidcontinue, xidstart]
U+B4DD..U+B4F7 L Letter: Other letter, hangul, Hangul syllable type LVT, [alphabetic, graphemebase, idcontinue, idstart, xidcontinue, xidstart]
U+B4F9..U+B513 L Letter: Other letter, hangul, Hangul syllable type LVT, [alphabetic, graphemebase, idcontinue, idstart, xidcontinue, xidstart]
U+B515..U+B52F L Letter: Other letter, hangul, Hangul syllable type LVT, [alphabetic, graphemebase, idcontinue, idstart, xidcontinue, xidstart]
U+B531..U+B54B L Letter: Other letter, hangul, Hangul syllable type LVT, [alphabetic, graphemebase, idcontinue, idstart, xidcontinue, xidstart]
U+B54D..U+B567 L Letter: Other letter, hangul, Hangul syllable type LVT, [alphabetic, graphemebase, idcontinue, idstart, xidcontinue, xidstart]
U+B569..U+B583 L Letter: Other letter, hangul, Hangul syllable type LVT, [alphabetic, graphemebase, idcontinue, idstart, xidcontinue, xidstart]
U+B585..U+B59F L Letter: Other letter, hangul, Hangul syllable type LVT, [alphabetic, graphemebase, idcontinue, idstart, xidcontinue, xidstart]
U+B5A1..U+B5BB L Letter: Other letter, hangul, Hangul syllable type LVT, [alphabetic, graphemebase, idcontinue, idstart, xidcontinue, xidstart]
U+B5BD..U+B5D7 L Letter: Other letter, hangul, Hangul syllable type LVT, [alphabetic, graphemebase, idcontinue, idstart, xidcontinue, xidstart]
U+B5D9..U+B5F3 L Letter: Other letter, hangul, Hangul syllable type LVT, [alphabetic, graphemebase, idcontinue, idstart, xidcontinue, xidstart]
U+B5F5..U+B60F L Letter: Other letter, hangul, Hangul syllable type LVT, [alphabetic, graphemebase, idcontinue, idstart, xidcontinue, xidstart]
U+B611..U+B62B L Letter: Other letter, hangul, Hangul syllable type LVT, [alphabetic, graphemebase, idcontinue, idstart, xidcontinue, xidstart]
U+B62D..U+B647 L Letter: Other letter, hangul, Hangul syllable type LVT, [alphabetic, graphemebase, idcontinue, idstart, xidcontinue, xidstart]
U+B649..U+B663 L Letter: Other letter, hangul, Hangul syllable type LVT, [alphabetic, graphemebase, idcontinue, idstart, xidcontinue, xidstart]
U+B665..U+B67F L Letter: Other letter, hangul, Hangul syllable type LVT, [alphabetic, graphemebase, idcontinue, idstart, xidcontinue, xidstart]
U+B681..U+B69B L Letter: Other letter, hangul, Hangul syllable type LVT, [alphabetic, graphemebase, idcontinue, idstart, xidcontinue, xidstart]
U+B69D..U+B6B7 L Letter: Other letter, hangul, Hangul syllable type LVT, [alphabetic, graphemebase, idcontinue, idstart, xidcontinue, xidstart]
U+B6B9..U+B6D3 L Letter: Other letter, hangul, Hangul syllable type LVT, [alphabetic, graphemebase, idcontinue, idstart, xidcontinue, xidstart]
U+B6D5..U+B6EF L Letter: Other letter, hangul, Hangul syllable type LVT, [alphabetic, graphemebase, idcontinue, idstart, xidcontinue, xidstart]
U+AC01..U+AC1B L Letter: Other letter, hangul, Hangul syllable type LVT, [alphabetic, diacritic, graphemebase, idcontinue, xidcontinue]
U+AC1D..U+AC37 L Letter: Other letter, hangul, Hangul syllable type LVT, [alphabetic, diacritic, graphemebase, idcontinue, xidcontinue]
U+AC39..U+AC53 L Letter: Other letter, hangul, Hangul syllable type LVT, [alphabetic, diacritic, graphemebase, idcontinue, xidcontinue]
U+AC55..U+AC6F L Letter: Other letter, hangul, Hangul syllable type LVT, [alphabetic, diacritic, graphemebase, idcontinue, xidcontinue]
U+AC71..U+AC8B L Letter: Other letter, hangul, Hangul syllable type LVT, [alphabetic, diacritic, graphemebase, idcontinue, xidcontinue]
U+AC8D..U+ACA7 L Letter: Other letter, hangul, Hangul syllable type LVT, [alphabetic, diacritic, graphemebase, idcontinue, xidcontinue]
U+ACA9..U+ACC3 L Letter: Other letter, hangul, Hangul syllable type LVT, [alphabetic, diacritic, graphemebase, idcontinue, xidcontinue]
U+ACC5..U+ACDF L Letter: Other letter, hangul, Hangul syllable type LVT, [alphabetic, diacritic, graphemebase, idcontinue, xidcontinue]
U+ACE1..U+ACFB L Letter: Other letter, hangul, Hangul syllable type LVT, [alphabetic, diacritic, graphemebase, idcontinue, xidcontinue]
U+ACFD..U+AD17 L Letter: Other letter, hangul, Hangul syllable type LVT, [alphabetic, diacritic, graphemebase, idcontinue, xidcontinue]
U+AD19..U+AD33 L Letter: Other letter, hangul, Hangul syllable type LVT, [alphabetic, diacritic, graphemebase, idcontinue, xidcontinue]
U+AD35..U+AD4F L Letter: Other letter, hangul, Hangul syllable type LVT, [alphabetic, diacritic, graphemebase, idcontinue, xidcontinue]
U+AD51..U+AD6B L Letter: Other letter, hangul, Hangul syllable type LVT, [alphabetic, diacritic, graphemebase, idcontinue, xidcontinue]
U+AD6D..U+AD87 L Letter: Other letter, hangul, Hangul syllable type LVT, [alphabetic, diacritic, graphemebase, idcontinue, xidcontinue]
U+AD89..U+ADA3 L Letter: Other letter, hangul, Hangul syllable type LVT, [alphabetic, diacritic, graphemebase, idcontinue, xidcontinue]
U+ADA5..U+ADBF L Letter: Other letter, hangul, Hangul syllable type LVT, [alphabetic, diacritic, graphemebase, idcontinue, xidcontinue]
U+ADC1..U+ADDB L Letter: Other letter, hangul, Hangul syllable type LVT, [alphabetic, diacritic, graphemebase, idcontinue, xidcontinue]
U+ADDD..U+ADF7 L Letter: Other letter, hangul, Hangul syllable type LVT, [alphabetic, diacritic, graphemebase, idcontinue, xidcontinue]
U+ADF9..U+AE13 L Letter: Other letter, hangul, Hangul syllable type LVT, [alphabetic, diacritic, graphemebase, idcontinue, xidcontinue]
U+AE15..U+AE2F L Letter: Other letter, hangul, Hangul syllable type LVT, [alphabetic, diacritic, graphemebase, idcontinue, xidcontinue]
U+AE31..U+AE4B L Letter: Other letter, hangul, Hangul syllable type LVT, [alphabetic, diacritic, graphemebase, idcontinue, xidcontinue]
U+AE4D..U+AE67 L Letter: Other letter, hangul, Hangul syllable type LVT, [alphabetic, diacritic, graphemebase, idcontinue, xidcontinue]
U+AE69..U+AE83 L Letter: Other letter, hangul, Hangul syllable type LVT, [alphabetic, diacritic, graphemebase, idcontinue, xidcontinue]
U+AE85..U+AE9F L Letter: Other letter, hangul, Hangul syllable type LVT, [alphabetic, diacritic, graphemebase, idcontinue, xidcontinue]
U+AEA1..U+AEBB L Letter: Other letter, hangul, Hangul syllable type LVT, [alphabetic, diacritic, graphemebase, idcontinue, xidcontinue]
U+AEBD..U+AED7 L Letter: Other letter, hangul, Hangul syllable type LVT, [alphabetic, diacritic, graphemebase, idcontinue, xidcontinue]
U+AED9..U+AEF3 L Letter: Other letter, hangul, Hangul syllable type LVT, [alphabetic, diacritic, graphemebase, idcontinue, xidcontinue]
U+AEF5..U+AF0F L Letter: Other letter, hangul, Hangul syllable type LVT, [alphabetic, diacritic, graphemebase, idcontinue, xidcontinue]
U+AF11..U+AF2B L Letter: Other letter, hangul, Hangul syllable type LVT, [alphabetic, diacritic, graphemebase, idcontinue, xidcontinue]
U+AF2D..U+AF47 L Letter: Other letter, hangul, Hangul syllable type LVT, [alphabetic, diacritic, graphemebase, idcontinue, xidcontinue]
U+AF49..U+AF63 L Letter: Other letter, hangul, Hangul syllable type LVT, [alphabetic, diacritic, graphemebase, idcontinue, xidcontinue]
U+AF65..U+AF7F L Letter: Other letter, hangul, Hangul syllable type LVT, [alphabetic, diacritic, graphemebase, idcontinue, xidcontinue]
U+AF81..U+AF9B L Letter: Other letter, hangul, Hangul syllable type LVT, [alphabetic, diacritic, graphemebase, idcontinue, xidcontinue]
U+AF9D..U+AFB7 L Letter: Other letter, hangul, Hangul syllable type LVT, [alphabetic, diacritic, graphemebase, idcontinue, xidcontinue]
U+AFB9..U+AFD3 L Letter: Other letter, hangul, Hangul syllable type LVT, [alphabetic, diacritic, graphemebase, idcontinue, xidcontinue]
U+AFD5..U+AFEF L Letter: Other letter, hangul, Hangul syllable type LVT, [alphabetic, diacritic, graphemebase, idcontinue, xidcontinue]
U+AFF1..U+B00B L Letter: Other letter, hangul, Hangul syllable type LVT, [alphabetic, diacritic, graphemebase, idcontinue, xidcontinue]
U+B00D..U+B027 L Letter: Other letter, hangul, Hangul syllable type LVT, [alphabetic, diacritic, graphemebase, idcontinue, xidcontinue]
U+B029..U+B043 L Letter: Other letter, hangul, Hangul syllable type LVT, [alphabetic, diacritic, graphemebase, idcontinue, xidcontinue]
U+B045..U+B05F L Letter: Other letter, hangul, Hangul syllable type LVT, [alphabetic, diacritic, graphemebase, idcontinue, xidcontinue]
U+B061..U+B07B L Letter: Other letter, hangul, Hangul syllable type LVT, [alphabetic, diacritic, graphemebase, idcontinue, xidcontinue]
U+B07D..U+B097 L Letter: Other letter, hangul, Hangul syllable type LVT, [alphabetic, diacritic, graphemebase, idcontinue, xidcontinue]
U+B099..U+B0B3 L Letter: Other letter, hangul, Hangul syllable type LVT, [alphabetic, diacritic, graphemebase, idcontinue, xidcontinue]
U+B0B5..U+B0CF L Letter: Other letter, hangul, Hangul syllable type LVT, [alphabetic, diacritic, graphemebase, idcontinue, xidcontinue]
U+B0D1..U+B0EB L Letter: Other letter, hangul, Hangul syllable type LVT, [alphabetic, diacritic, graphemebase, idcontinue, xidcontinue]
U+B0ED..U+B107 L Letter: Other letter, hangul, Hangul syllable type LVT, [alphabetic, diacritic, graphemebase, idcontinue, xidcontinue]
U+B109..U+B123 L Letter: Other letter, hangul, Hangul syllable type LVT, [alphabetic, diacritic, graphemebase, idcontinue, xidcontinue]
U+B125..U+B13F L Letter: Other letter, hangul, Hangul syllable type LVT, [alphabetic, diacritic, graphemebase, idcontinue, xidcontinue]
U+B141..U+B15B L Letter: Other letter, hangul, Hangul syllable type LVT, [alphabetic, diacritic, graphemebase, idcontinue, xidcontinue]
U+B15D..U+B177 L Letter: Other letter, hangul, Hangul syllable type LVT, [alphabetic, diacritic, graphemebase, idcontinue, xidcontinue]
U+B179..U+B193 L Letter: Other letter, hangul, Hangul syllable type LVT, [alphabetic, diacritic, graphemebase, idcontinue, xidcontinue]
U+B195..U+B1AF L Letter: Other letter, hangul, Hangul syllable type LVT, [alphabetic, diacritic, graphemebase, idcontinue, xidcontinue]
U+B1B1..U+B1CB L Letter: Other letter, hangul, Hangul syllable type LVT, [alphabetic, diacritic, graphemebase, idcontinue, xidcontinue]
U+B1CD..U+B1E7 L Letter: Other letter, hangul, Hangul syllable type LVT, [alphabetic, diacritic, graphemebase, idcontinue, xidcontinue]
U+B1E9..U+B203 L Letter: Other letter, hangul, Hangul syllable type LVT, [alphabetic, diacritic, graphemebase, idcontinue, xidcontinue]
U+B205..U+B21F L Letter: Other letter, hangul, Hangul syllable type LVT, [alphabetic, diacritic, graphemebase, idcontinue, xidcontinue]
U+B221..U+B23B L Letter: Other letter, hangul, Hangul syllable type LVT, [alphabetic, diacritic, graphemebase, idcontinue, xidcontinue]
U+B23D..U+B257 L Letter: Other letter, hangul, Hangul syllable type LVT, [alphabetic, diacritic, graphemebase, idcontinue, xidcontinue]
U+B259..U+B273 L Letter: Other letter, hangul, Hangul syllable type LVT, [alphabetic, diacritic, graphemebase, idcontinue, xidcontinue]
U+B275..U+B28F L Letter: Other letter, hangul, Hangul syllable type LVT, [alphabetic, diacritic, graphemebase, idcontinue, xidcontinue]
U+B291..U+B2AB L Letter: Other letter, hangul, Hangul syllable type LVT, [alphabetic, diacritic, graphemebase, idcontinue, xidcontinue]
U+B2AD..U+B2C7 L Letter: Other letter, hangul, Hangul syllable type LVT, [alphabetic, diacritic, graphemebase, idcontinue, xidcontinue]
U+B2C9..U+B2E3 L Letter: Other letter, hangul, Hangul syllable type LVT, [alphabetic, diacritic, graphemebase, idcontinue, xidcontinue]
U+B2E5..U+B2FF L Letter: Other letter, hangul, Hangul syllable type LVT, [alphabetic, diacritic, graphemebase, idcontinue, xidcontinue]
U+B301..U+B31B L Letter: Other letter, hangul, Hangul syllable type LVT, [alphabetic, diacritic, graphemebase, idcontinue, xidcontinue]
U+B31D..U+B337 L Letter: Other letter, hangul, Hangul syllable type LVT, [alphabetic, diacritic, graphemebase, idcontinue, xidcontinue]
U+B339..U+B353 L Letter: Other letter, hangul, Hangul syllable type LVT, [alphabetic, diacritic, graphemebase, idcontinue, xidcontinue]
U+B355..U+B36F L Letter: Other letter, hangul, Hangul syllable type LVT, [alphabetic, diacritic, graphemebase, idcontinue, xidcontinue]
U+B371..U+B38B L Letter: Other letter, hangul, Hangul syllable type LVT, [alphabetic, diacritic, graphemebase, idcontinue, xidcontinue]
U+B38D..U+B3A7 L Letter: Other letter, hangul, Hangul syllable type LVT, [alphabetic, diacritic, graphemebase, idcontinue, xidcontinue]
U+B3A9..U+B3C3 L Letter: Other letter, hangul, Hangul syllable type LVT, [alphabetic, diacritic, graphemebase, idcontinue, xidcontinue]
U+B3C5..U+B3DF L Letter: Other letter, hangul, Hangul syllable type LVT, [alphabetic, diacritic, graphemebase, idcontinue, xidcontinue]
U+B3E1..U+B3FB L Letter: Other letter, hangul, Hangul syllable type LVT, [alphabetic, diacritic, graphemebase, idcontinue, xidcontinue]
U+B3FD..U+B417 L Letter: Other letter, hangul, Hangul syllable type LVT, [alphabetic, diacritic, graphemebase, idcontinue, xidcontinue]
U+B419..U+B433 L Letter: Other letter, hangul, Hangul syllable type LVT, [alphabetic, diacritic, graphemebase, idcontinue, xidcontinue]
U+B435..U+B44F L Letter: Other letter, hangul, Hangul syllable type LVT, [alphabetic, diacritic, graphemebase, idcontinue, xidcontinue]
U+B451..U+B46B L Letter: Other letter, hangul, Hangul syllable type LVT, [alphabetic, diacritic, graphemebase, idcontinue, xidcontinue]
U+B46D..U+B487 L Letter: Other letter, hangul, Hangul syllable type LVT, [alphabetic, diacritic, graphemebase, idcontinue, xidcontinue]
U+B489..U+B4A3 L Letter: Other letter, hangul, Hangul syllable type LVT, [alphabetic, diacritic, graphemebase, idcontinue, xidcontinue]
U+B4A5..U+B4BF L Letter: Other letter, hangul, Hangul syllable type LVT, [alphabetic, diacritic, graphemebase, idcontinue, xidcontinue]
U+B4C1..U+B4DB L Letter: Other letter, hangul, Hangul syllable type LVT, [alphabetic, diacritic, graphemebase, idcontinue, xidcontinue]
U+B4DD..U+B4F7 L Letter: Other letter, hangul, Hangul syllable type LVT, [alphabetic, diacritic, graphemebase, idcontinue, xidcontinue]
U+B4F9..U+B513 L Letter: Other letter, hangul, Hangul syllable type LVT, [alphabetic, diacritic, graphemebase, idcontinue, xidcontinue]
U+B515..U+B52F L Letter: Other letter, hangul, Hangul syllable type LVT, [alphabetic, diacritic, graphemebase, idcontinue, xidcontinue]
U+B531..U+B54B L Letter: Other letter, hangul, Hangul syllable type LVT, [alphabetic, diacritic, graphemebase, idcontinue, xidcontinue]
U+B54D..U+B567 L Letter: Other letter, hangul, Hangul syllable type LVT, [alphabetic, diacritic, graphemebase, idcontinue, xidcontinue]
U+B569..U+B583 L Letter: Other letter, hangul, Hangul syllable type LVT, [alphabetic, diacritic, graphemebase, idcontinue, xidcontinue]
U+B585..U+B59F L Letter: Other letter, hangul, Hangul syllable type LVT, [alphabetic, diacritic, graphemebase, idcontinue, xidcontinue]
U+B5A1..U+B5BB L Letter: Other letter, hangul, Hangul syllable type LVT, [alphabetic, diacritic, graphemebase, idcontinue, xidcontinue]
U+B5BD..U+B5D7 L Letter: Other letter, hangul, Hangul syllable type LVT, [alphabetic, diacritic, graphemebase, idcontinue, xidcontinue]
U+B5D9..U+B5F3 L Letter: Other letter, hangul, Hangul syllable type LVT, [alphabetic, diacritic, graphemebase, idcontinue, xidcontinue]
U+B5F5..U+B60F L Letter: Other letter, hangul, Hangul syllable type LVT, [alphabetic, diacritic, graphemebase, idcontinue, xidcontinue]
U+B611..U+B62B L Letter: Other letter, hangul, Hangul syllable type LVT, [alphabetic, diacritic, graphemebase, idcontinue, xidcontinue]
U+B62D..U+B647 L Letter: Other letter, hangul, Hangul syllable type LVT, [alphabetic, diacritic, graphemebase, idcontinue, xidcontinue]
U+B649..U+B663 L Letter: Other letter, hangul, Hangul syllable type LVT, [alphabetic, diacritic, graphemebase, idcontinue, xidcontinue]
U+B665..U+B67F L Letter: Other letter, hangul, Hangul syllable type LVT, [alphabetic, diacritic, graphemebase, idcontinue, xidcontinue]
U+B681..U+B69B L Letter: Other letter, hangul, Hangul syllable type LVT, [alphabetic, diacritic, graphemebase, idcontinue, xidcontinue]
U+B69D..U+B6B7 L Letter: Other letter, hangul, Hangul syllable type LVT, [alphabetic, diacritic, graphemebase, idcontinue, xidcontinue]
U+B6B9..U+B6D3 L Letter: Other letter, hangul, Hangul syllable type LVT, [alphabetic, diacritic, graphemebase, idcontinue, xidcontinue]
U+B6D5..U+B6EF L Letter: Other letter, hangul, Hangul syllable type LVT, [alphabetic, diacritic, graphemebase, idcontinue, xidcontinue]
...
find script Old_Uyghur
U+10F70..U+10F81 R Letter: Other letter, olduyghur, Other, [alphabetic, graphemebase, idcontinue, idstart, xidcontinue, xidstart]
U+10F82..U+10F85 NSM Mark: Non-spacing mark, olduyghur, Extend, [caseignorable, diacritic, graphemeextend, idcontinue, xidcontinue]
U+10F86..U+10F89 R Punctuation: Other punctuation, olduyghur, Other, [graphemebase, sentenceterminal, terminalpunctuation]
U+10F70..U+10F81 R Letter: Other letter, olduyghur, Other, [alphabetic, diacritic, graphemebase, idcontinue, xidcontinue]
U+10F82..U+10F85 NSM Mark: Non-spacing mark, olduyghur, Extend, [alphabetic, cased, graphemebase, idcontinue, idstart, lowercase, softdotted, xidcontinue, xidstart]
U+10F86..U+10F89 R Punctuation: Other punctuation, olduyghur, Other, [bidimirrored, graphemebase, math, patternsyntax]
find bidi PDF
U+202C PDF Control: Format, common, Control, [bidicontrol, caseignorable, defaultignorablecodepoint]
U+202C PDF Control: Format, common, Control, [extendedpictographic, graphemebase, math, patternsyntax]
find bidi CS
U+002C CS Punctuation: Other punctuation, common, Other, [ascii, graphemebase, patternsyntax, terminalpunctuation]
U+002E CS Punctuation: Other punctuation, common, Other, [ascii, caseignorable, graphemebase, patternsyntax, sentenceterminal, terminalpunctuation]
U+002F CS Punctuation: Other punctuation, common, Other, [ascii, graphemebase, patternsyntax]
U+003A CS Punctuation: Other punctuation, common, Other, [ascii, caseignorable, graphemebase, patternsyntax, terminalpunctuation]
U+00A0 CS Separator: Space separator, common, Other, [graphemebase, whitespace]
U+060C CS Punctuation: Other punctuation, common, Other, [arabic, syriac, thaana, nko, hanifirohingya, yezidi], [graphemebase, terminalpunctuation]
U+202F CS Separator: Space separator, common, Other, [latin, mongolian], [graphemebase, whitespace]
U+2044 CS Symbol: Mathematical symbol, common, Other, [graphemebase, math, patternsyntax]
U+FE50 CS Punctuation: Other punctuation, common, Other, [graphemebase, terminalpunctuation]
U+FE52 CS Punctuation: Other punctuation, common, Other, [caseignorable, graphemebase, sentenceterminal, terminalpunctuation]
U+FE55 CS Punctuation: Other punctuation, common, Other, [caseignorable, graphemebase, terminalpunctuation]
U+FF0C CS Punctuation: Other punctuation, common, Other, [graphemebase, terminalpunctuation]
U+FF0E CS Punctuation: Other punctuation, common, Other, [caseignorable, graphemebase, sentenceterminal, terminalpunctuation]
U+FF0F CS Punctuation: Other punctuation, common, Other, [graphemebase]
U+FF1A CS Punctuation: Other punctuation, common, Other, [caseignorable, graphemebase, terminalpunctuation]
U+002C CS Punctuation: Other punctuation, common, Other, [ascii, asciihexdigit, alphabetic, cased, changeswhencasemapped, changeswhentitlecased, changeswhenuppercased, graphemebase, hexdigit, idcontinue, idstart, lowercase, xidcontinue, xidstart]
U+002E CS Punctuation: Other punctuation, common, Other, [graphemebase, whitespace]
U+002F CS Punctuation: Other punctuation, common, Other, [ascii, asciihexdigit, emoji, emojicomponent, graphemebase, hexdigit, idcontinue, xidcontinue]
U+003A CS Punctuation: Other punctuation, common, Other, [alphabetic, cased, graphemebase, idcontinue, idstart, lowercase, xidcontinue, xidstart]
U+00A0 CS Separator: Space separator, common, Other, [alphabetic, caseignorable, cased, diacritic, graphemebase, idcontinue, idstart, lowercase]
U+060C CS Punctuation: Other punctuation, common, Other, [arabic, syriac, thaana, nko, hanifirohingya, yezidi], [graphemebase, patternsyntax, sentenceterminal, terminalpunctuation]
U+202F CS Separator: Space separator, common, Other, [latin, mongolian], [alphabetic, caseignorable, cased, diacritic, graphemebase, idcontinue, idstart, lowercase]
U+2044 CS Symbol: Mathematical symbol, common, Other, [alphabetic, caseignorable, diacritic, graphemeextend, idcontinue, xidcontinue]
U+FE50 CS Punctuation: Other punctuation, common, Other, [graphemebase, patternsyntax, sentenceterminal, terminalpunctuation]
U+FE52 CS Punctuation: Other punctuation, common, Other, [changeswhenuppercased, deprecated, emojimodifier, emojimodifierbase, extender, quotationmark, sentenceterminal, xidcontinue, xidstart]
U+FE55 CS Punctuation: Other punctuation, common, Other, [changeswhencasemapped, changeswhenuppercased, emojimodifier, emojimodifierbase, joincontrol, noncharactercodepoint, patternwhitespace, prependedconcatenationmark]
U+FF0C CS Punctuation: Other punctuation, common, Other, [graphemebase, patternsyntax, sentenceterminal, terminalpunctuation]
U+FF0E CS Punctuation: Other punctuation, common, Other, [changeswhenuppercased, deprecated, emojimodifier, emojimodifierbase, extender, quotationmark, sentenceterminal, xidcontinue, xidstart]
U+FF0F CS Punctuation: Other punctuation, common, Other, [alphabetic, caseignorable, extender, graphemebase, idcontinue, idstart, xidcontinue, xidstart]
U+FF1A CS Punctuation: Other punctuation, common, Other, [changeswhencasemapped, changeswhenuppercased, emojimodifier, emojimodifierbase, joincontrol, noncharactercodepoint, patternwhitespace, prependedconcatenationmark]
find bidi CS type Sm
U+2044 CS Symbol: Mathematical symbol, common, Other, [graphemebase, math, patternsyntax]
U+2044 CS Symbol: Mathematical symbol, common, Other, [alphabetic, caseignorable, diacritic, graphemeextend, idcontinue, xidcontinue]
find bidi B
U+000A B Control: Control, common, LF, [ascii, patternwhitespace, whitespace]
U+000D B Control: Control, common, CR, [ascii, patternwhitespace, whitespace]
U+001C..U+001E B Control: Control, common, Control, [ascii]
U+0085 B Control: Control, common, Control, [patternwhitespace, whitespace]
U+2029 B Separator: Paragraph separator, common, Control, [patternwhitespace, whitespace]
U+000A B Control: Control, common, LF, [ascii, graphemebase, patternsyntax, sentenceterminal, terminalpunctuation]
U+000D B Control: Control, common, CR, [ascii, graphemebase, patternsyntax, sentenceterminal, terminalpunctuation]
U+001C..U+001E B Control: Control, common, Control, [ascii, patternwhitespace, whitespace]
U+0085 B Control: Control, common, Control, [caseignorable, defaultignorablecodepoint, graphemeextend, idcontinue, xidcontinue]
U+2029 B Separator: Paragraph separator, common, Control, [caseignorable, defaultignorablecodepoint, graphemeextend, idcontinue, xidcontinue]
find bidi FSI
U+2068 FSI Control: Format, common, Control, [bidicontrol, caseignorable, defaultignorablecodepoint]
U+2068 FSI Control: Format, common, Control, [extendedpictographic, graphemebase, math, patternsyntax]
find bidi PDI
U+2069 PDI Control: Format, common, Control, [bidicontrol, caseignorable, defaultignorablecodepoint]
U+2069 PDI Control: Format, common, Control, [extendedpictographic, graphemebase, math, patternsyntax]
find bidi RLI
U+2067 RLI Control: Format, common, Control, [bidicontrol, caseignorable, defaultignorablecodepoint]
U+2067 RLI Control: Format, common, Control, [extendedpictographic, graphemebase, math, patternsyntax]
find bidi RLO
U+202E RLO Control: Format, common, Control, [bidicontrol, caseignorable, defaultignorablecodepoint]
U+202E RLO Control: Format, common, Control, [extendedpictographic, graphemebase, math, patternsyntax]
find bidi S
U+0009 S Control: Control, common, Control, [ascii, patternwhitespace, whitespace]
U+000B S Control: Control, common, Control, [ascii, patternwhitespace, whitespace]
U+001F S Control: Control, common, Control, [ascii]
U+0009 S Control: Control, common, Control, [ascii, graphemebase, patternsyntax, sentenceterminal, terminalpunctuation]
U+000B S Control: Control, common, Control, [ascii, graphemebase, patternsyntax, sentenceterminal, terminalpunctuation]
U+001F S Control: Control, common, Control, [ascii, patternwhitespace, whitespace]
find bidi WS
U+000C WS Control: Control, common, Control, [ascii, patternwhitespace, whitespace]
U+0020 WS Separator: Space separator, common, Other, [ascii, graphemebase, patternwhitespace, whitespace]
U+1680 WS Separator: Space separator, ogham, Other, [graphemebase, whitespace]
U+2000..U+200A WS Separator: Space separator, common, Other, [graphemebase, whitespace]
U+2028 WS Separator: Line separator, common, Control, [patternwhitespace, whitespace]
U+205F WS Separator: Space separator, common, Other, [graphemebase, whitespace]
U+3000 WS Separator: Space separator, common, Other, [graphemebase, whitespace]
U+000C WS Control: Control, common, Control, [ascii, graphemebase, patternsyntax, sentenceterminal, terminalpunctuation]
U+0020 WS Separator: Space separator, common, Other, [ascii, emoji, emojicomponent, graphemebase, patternsyntax]
U+1680 WS Separator: Space separator, ogham, Other, [alphabetic, caseignorable, cased, diacritic, graphemebase, idcontinue, idstart, lowercase]
U+2000..U+200A WS Separator: Space separator, common, Other, [alphabetic, caseignorable, cased, diacritic, graphemebase, idcontinue, idstart, lowercase]
U+2028 WS Separator: Line separator, common, Control, [caseignorable, defaultignorablecodepoint, graphemeextend, idcontinue, xidcontinue]
U+205F WS Separator: Space separator, common, Other, [alphabetic, caseignorable, cased, diacritic, graphemebase, idcontinue, idstart, lowercase]
U+3000 WS Separator: Space separator, common, Other, [alphabetic, caseignorable, cased, diacritic, graphemebase, idcontinue, idstart, lowercase]
find script bopo
U+02EA..U+02EB ON Symbol: Modifier symbol, bopomofo, Other, [caseignorable, diacritic, graphemebase]
U+3105..U+312F L Letter: Other letter, bopomofo, Other, [alphabetic, graphemebase, idcontinue, idstart, xidcontinue, xidstart]
U+31A0..U+31BF L Letter: Other letter, bopomofo, Other, [alphabetic, graphemebase, idcontinue, idstart, xidcontinue, xidstart]
U+02EA..U+02EB ON Symbol: Modifier symbol, bopomofo, Other, [alphabetic, cased, graphemebase, idcontinue, idstart, math, uppercase, xidcontinue, xidstart]
U+3105..U+312F L Letter: Other letter, bopomofo, Other, [alphabetic, diacritic, graphemebase, idcontinue, xidcontinue]
U+31A0..U+31BF L Letter: Other letter, bopomofo, Other, [alphabetic, diacritic, graphemebase, idcontinue, xidcontinue]
find bool prependedconcatenationmark
U+0600..U+0604 AN Control: Format, arabic, Prepend, [caseignorable, prependedconcatenationmark]
U+0605 AN Control: Format, common, Prepend, [caseignorable, prependedconcatenationmark]
U+06DD AN Control: Format, common, Prepend, [caseignorable, prependedconcatenationmark]
U+070F AL Control: Format, syriac, Prepend, [caseignorable, prependedconcatenationmark]
U+0890..U+0891 AN Control: Format, arabic, Prepend, [caseignorable, prependedconcatenationmark]
U+08E2 AN Control: Format, common, Prepend, [caseignorable, prependedconcatenationmark]
U+110BD L Control: Format, kaithi, Prepend, [caseignorable, prependedconcatenationmark]
U+110CD L Control: Format, kaithi, Prepend, [caseignorable, prependedconcatenationmark]
U+00AD BN Control: Format, common, Control, [caseignorable, prependedconcatenationmark]
U+180E BN Control: Format, mongolian, Control, [caseignorable, prependedconcatenationmark]
U+200B BN Control: Format, common, Control, [caseignorable, prependedconcatenationmark]
U+2060 BN Control: Format, common, Control, [caseignorable, prependedconcatenationmark]
U+2118 ON Symbol: Mathematical symbol, common, Other, [changeswhencasemapped, changeswhentitlecased, emojimodifier, emojimodifierbase, patternsyntax, patternwhitespace, prependedconcatenationmark, quotationmark, radical, regionalindicator, sentenceterminal, softdotted, terminalpunctuation, unifiedideograph, uppercase, variationselector, whitespace, xidcontinue, xidstart]
U+3030 ON Punctuation: Dash punctuation, common, Extended Pictographic, [hangul, hiragana, katakana, bopomofo, han], [changeswhencasemapped, changeswhenuppercased, emojimodifier, emojimodifierbase, joincontrol, logicalorderexception, lowercase, prependedconcatenationmark, quotationmark, radical, regionalindicator, sentenceterminal, softdotted, terminalpunctuation, unifiedideograph, uppercase, variationselector, whitespace, xidcontinue, xidstart]
U+AAC0 L Letter: Other letter, taiviet, Other, [changeswhencasemapped, changeswhenuppercased, emojimodifier, emojimodifierbase, logicalorderexception, lowercase, math, patternwhitespace, prependedconcatenationmark]
U+AAC2 L Letter: Other letter, taiviet, Other, [changeswhencasemapped, changeswhenuppercased, emojimodifier, emojimodifierbase, logicalorderexception, lowercase, math, patternwhitespace, prependedconcatenationmark]
U+FE0F NSM Mark: Non-spacing mark, inherited, Extend, [changeswhencasemapped, changeswhenuppercased, emojimodifier, emojimodifierbase, joincontrol, logicalorderexception, math, patternwhitespace, prependedconcatenationmark]
U+FE55 CS Punctuation: Other punctuation, common, Other, [changeswhencasemapped, changeswhenuppercased, emojimodifier, emojimodifierbase, joincontrol, noncharactercodepoint, patternwhitespace, prependedconcatenationmark]
U+FEFF BN Control: Format, common, Control, [caseignorable, prependedconcatenationmark]
U+FF1A CS Punctuation: Other punctuation, common, Other, [changeswhencasemapped, changeswhenuppercased, emojimodifier, emojimodifierbase, joincontrol, noncharactercodepoint, patternwhitespace, prependedconcatenationmark]
U+FF21..U+FF26 L Letter: Upper case letter, latin, Other, [changeswhencasemapped, changeswhenuppercased, emojimodifier, emojimodifierbase, logicalorderexception, lowercase, noncharactercodepoint, patternwhitespace, prependedconcatenationmark]
U+10D22..U+10D23 AL Letter: Other letter, hanifirohingya, Other, [changeswhencasemapped, changeswhenuppercased, emojimodifier, emojimodifierbase, logicalorderexception, lowercase, math, patternwhitespace, prependedconcatenationmark]
U+1135D L Letter: Other letter, grantha, Other, [changeswhencasemapped, changeswhentitlecased, emojimodifier, emojimodifierbase, graphemeextend, hexdigit, logicalorderexception, lowercase, math, noncharactercodepoint, patternsyntax, patternwhitespace, prependedconcatenationmark, quotationmark, radical, regionalindicator, sentenceterminal, softdotted, terminalpunctuation, unifiedideograph, uppercase, variationselector, whitespace, xidcontinue, xidstart]
U+1BCA0..U+1BCA3 BN Control: Format, common, Control, [duployan], [caseignorable, prependedconcatenationmark]
U+1D173..U+1D17A BN Control: Format, common, Control, [caseignorable, prependedconcatenationmark]
U+1F1E6..U+1F1FF L Symbol: Other symbol, common, Regional Indicator, [changeswhencasemapped, changeswhenuppercased, emojimodifier, emojimodifierbase, math, patternwhitespace, prependedconcatenationmark, quotationmark, radical, regionalindicator, sentenceterminal, softdotted, terminalpunctuation, unifiedideograph, uppercase, variationselector, whitespace, xidcontinue, xidstart]
find bool pcm
U+0600..U+0604 AN Control: Format, arabic, Prepend, [caseignorable, prependedconcatenationmark]
U+0605 AN Control: Format, common, Prepend, [caseignorable, prependedconcatenationmark]
U+06DD AN Control: Format, common, Prepend, [caseignorable, prependedconcatenationmark]
U+070F AL Control: Format, syriac, Prepend, [caseignorable, prependedconcatenationmark]
U+0890..U+0891 AN Control: Format, arabic, Prepend, [caseignorable, prependedconcatenationmark]
U+08E2 AN Control: Format, common, Prepend, [caseignorable, prependedconcatenationmark]
U+110BD L Control: Format, kaithi, Prepend, [caseignorable, prependedconcatenationmark]
U+110CD L Control: Format, kaithi, Prepend, [caseignorable, prependedconcatenationmark]
U+00AD BN Control: Format, common, Control, [caseignorable, prependedconcatenationmark]
U+180E BN Control: Format, mongolian, Control, [caseignorable, prependedconcatenationmark]
U+200B BN Control: Format, common, Control, [caseignorable, prependedconcatenationmark]
U+2060 BN Control: Format, common, Control, [caseignorable, prependedconcatenationmark]
U+2118 ON Symbol: Mathematical symbol, common, Other, [changeswhencasemapped, changeswhentitlecased, emojimodifier, emojimodifierbase, patternsyntax, patternwhitespace, prependedconcatenationmark, quotationmark, radical, regionalindicator, sentenceterminal, softdotted, terminalpunctuation, unifiedideograph, uppercase, variationselector, whitespace, xidcontinue, xidstart]
U+3030 ON Punctuation: Dash punctuation, common, Extended Pictographic, [hangul, hiragana, katakana, bopomofo, han], [changeswhencasemapped, changeswhenuppercased, emojimodifier, emojimodifierbase, joincontrol, logicalorderexception, lowercase, prependedconcatenationmark, quotationmark, radical, regionalindicator, sentenceterminal, softdotted, terminalpunctuation, unifiedideograph, uppercase, variationselector, whitespace, xidcontinue, xidstart]
U+AAC0 L Letter: Other letter, taiviet, Other, [changeswhencasemapped, changeswhenuppercased, emojimodifier, emojimodifierbase, logicalorderexception, lowercase, math, patternwhitespace, prependedconcatenationmark]
U+AAC2 L Letter: Other letter, taiviet, Other, [changeswhencasemapped, changeswhenuppercased, emojimodifier, emojimodifierbase, logicalorderexception, lowercase, math, patternwhitespace, prependedconcatenationmark]
U+FE0F NSM Mark: Non-spacing mark, inherited, Extend, [changeswhencasemapped, changeswhenuppercased, emojimodifier, emojimodifierbase, joincontrol, logicalorderexception, math, patternwhitespace, prependedconcatenationmark]
U+FE55 CS Punctuation: Other punctuation, common, Other, [changeswhencasemapped, changeswhenuppercased, emojimodifier, emojimodifierbase, joincontrol, noncharactercodepoint, patternwhitespace, prependedconcatenationmark]
U+FEFF BN Control: Format, common, Control, [caseignorable, prependedconcatenationmark]
U+FF1A CS Punctuation: Other punctuation, common, Other, [changeswhencasemapped, changeswhenuppercased, emojimodifier, emojimodifierbase, joincontrol, noncharactercodepoint, patternwhitespace, prependedconcatenationmark]
U+FF21..U+FF26 L Letter: Upper case letter, latin, Other, [changeswhencasemapped, changeswhenuppercased, emojimodifier, emojimodifierbase, logicalorderexception, lowercase, noncharactercodepoint, patternwhitespace, prependedconcatenationmark]
U+10D22..U+10D23 AL Letter: Other letter, hanifirohingya, Other, [changeswhencasemapped, changeswhenuppercased, emojimodifier, emojimodifierbase, logicalorderexception, lowercase, math, patternwhitespace, prependedconcatenationmark]
U+1135D L Letter: Other letter, grantha, Other, [changeswhencasemapped, changeswhentitlecased, emojimodifier, emojimodifierbase, graphemeextend, hexdigit, logicalorderexception, lowercase, math, noncharactercodepoint, patternsyntax, patternwhitespace, prependedconcatenationmark, quotationmark, radical, regionalindicator, sentenceterminal, softdotted, terminalpunctuation, unifiedideograph, uppercase, variationselector, whitespace, xidcontinue, xidstart]
U+1BCA0..U+1BCA3 BN Control: Format, common, Control, [duployan], [caseignorable, prependedconcatenationmark]
U+1D173..U+1D17A BN Control: Format, common, Control, [caseignorable, prependedconcatenationmark]
U+1F1E6..U+1F1FF L Symbol: Other symbol, common, Regional Indicator, [changeswhencasemapped, changeswhenuppercased, emojimodifier, emojimodifierbase, math, patternwhitespace, prependedconcatenationmark, quotationmark, radical, regionalindicator, sentenceterminal, softdotted, terminalpunctuation, unifiedideograph, uppercase, variationselector, whitespace, xidcontinue, xidstart]


@ -1266,8 +1266,10 @@ PCRE2_SIZE* ref_count;
if (code != NULL)
{
#ifdef SUPPORT_JIT
if (code->executable_jit != NULL)
PRIV(jit_free)(code->executable_jit, &code->memctl);
#endif
if ((code->flags & PCRE2_DEREF_TABLES) != 0)
{
@ -10620,4 +10622,10 @@ re = NULL;
goto EXIT;
}
/* These #undefs are here to enable unity builds with CMake. */
#undef NLBLOCK /* Block containing newline information */
#undef PSSTART /* Field containing processed string start */
#undef PSEND /* Field containing processed string end */
/* End of pcre2_compile.c */


@ -7,7 +7,7 @@ and semantics are as close as possible to those of the Perl 5 language.
Written by Philip Hazel
Original API code Copyright (c) 1997-2012 University of Cambridge
New API code Copyright (c) 2016-2018 University of Cambridge
New API code Copyright (c) 2016-2022 University of Cambridge
-----------------------------------------------------------------------------
Redistribution and use in source and binary forms, with or without
@ -65,9 +65,8 @@ POSSIBILITY OF SUCH DAMAGE.
#define STR_QUERY_s STR_LEFT_PARENTHESIS STR_QUESTION_MARK STR_s STR_RIGHT_PARENTHESIS
#define STR_STAR_NUL STR_LEFT_PARENTHESIS STR_ASTERISK STR_N STR_U STR_L STR_RIGHT_PARENTHESIS
/* States for range and POSIX processing */
/* States for POSIX processing */
enum { RANGE_NOT_STARTED, RANGE_STARTING, RANGE_STARTED };
enum { POSIX_START_REGEX, POSIX_ANCHORED, POSIX_NOT_BRACKET,
POSIX_CLASS_NOT_STARTED, POSIX_CLASS_STARTING, POSIX_CLASS_STARTED };


@ -350,7 +350,7 @@ Returns: the return from the callout
*/
static int
do_callout(PCRE2_SPTR code, PCRE2_SIZE *offsets, PCRE2_SPTR current_subject,
do_callout_dfa(PCRE2_SPTR code, PCRE2_SIZE *offsets, PCRE2_SPTR current_subject,
PCRE2_SPTR ptr, dfa_match_block *mb, PCRE2_SIZE extracode,
PCRE2_SIZE *lengthptr)
{
@ -2799,7 +2799,7 @@ for (;;)
|| code[LINK_SIZE + 1] == OP_CALLOUT_STR)
{
PCRE2_SIZE callout_length;
rrc = do_callout(code, offsets, current_subject, ptr, mb,
rrc = do_callout_dfa(code, offsets, current_subject, ptr, mb,
1 + LINK_SIZE, &callout_length);
if (rrc < 0) return rrc; /* Abandon */
if (rrc > 0) break; /* Fail this thread */
@ -3196,7 +3196,7 @@ for (;;)
case OP_CALLOUT_STR:
{
PCRE2_SIZE callout_length;
rrc = do_callout(code, offsets, current_subject, ptr, mb, 0,
rrc = do_callout_dfa(code, offsets, current_subject, ptr, mb, 0,
&callout_length);
if (rrc < 0) return rrc; /* Abandon */
if (rrc == 0)
@ -4057,4 +4057,10 @@ while (rws->next != NULL)
return rc;
}
/* These #undefs are here to enable unity builds with CMake. */
#undef NLBLOCK /* Block containing newline information */
#undef PSSTART /* Field containing processed string start */
#undef PSEND /* Field containing processed string end */
/* End of pcre2_dfa_match.c */


@ -220,18 +220,17 @@ not rely on this. */
#define COMPILE_ERROR_BASE 100
/* The initial frames vector for remembering backtracking points in
pcre2_match() is allocated on the system stack, of this size (bytes). The size
must be a multiple of sizeof(PCRE2_SPTR) in all environments, so making it a
multiple of 8 is best. Typical frame sizes are a few hundred bytes (it depends
on the number of capturing parentheses) so 20KiB handles quite a few frames. A
larger vector on the heap is obtained for patterns that need more frames. The
maximum size of this can be limited. */
/* The initial frames vector for remembering pcre2_match() backtracking points
is allocated on the heap, of this size (bytes) or ten times the frame size if
larger, unless the heap limit is smaller. Typical frame sizes are a few hundred
bytes (it depends on the number of capturing parentheses) so 20KiB handles
quite a few frames. A larger vector on the heap is obtained for matches that
need more frames, subject to the heap limit. */
#define START_FRAMES_SIZE 20480
/* Similarly, for DFA matching, an initial internal workspace vector is
allocated on the stack. */
/* For DFA matching, an initial internal workspace vector is allocated on the
stack. The heap is used only if this turns out to be too small. */
#define DFA_START_RWS_SIZE 30720
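
The sizing rule in the START_FRAMES_SIZE comment above can be sketched as a small stand-alone function. This is only an illustration under stated assumptions: `initial_frames_size` is a hypothetical helper name that does not appear in PCRE2, the heap limit is assumed to be already converted to bytes, and a return value of 0 stands in for the PCRE2_ERROR_HEAPLIMIT return of the real code. Rounding to a whole number of frames is left out here because, in this diff, it happens later inside match().

```c
#include <assert.h>
#include <stddef.h>

#define START_FRAMES_SIZE 20480   /* Value taken from the diff above */

/* Hypothetical helper (not part of PCRE2). Returns the initial size in bytes
of the frames vector, or 0 when even a single frame would exceed the heap
limit. The heap_limit argument is assumed to be in bytes. */

size_t
initial_frames_size(size_t frame_size, size_t heap_limit)
{
size_t size;
assert(frame_size > 0);

/* Room for at least ten frames, but never less than START_FRAMES_SIZE. */

size = frame_size * 10;
if (size < START_FRAMES_SIZE) size = START_FRAMES_SIZE;

/* The heap limit wins if it is smaller, provided one frame still fits. */

if (size > heap_limit)
  {
  if (frame_size > heap_limit) return 0;
  size = heap_limit;
  }

return size;
}
```

With a typical frame of a few hundred bytes the START_FRAMES_SIZE minimum applies; only patterns with very many capturing parentheses push the initial size above it.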


@ -7,7 +7,7 @@ and semantics are as close as possible to those of the Perl 5 language.
Written by Philip Hazel
Original API code Copyright (c) 1997-2012 University of Cambridge
New API code Copyright (c) 2016-2018 University of Cambridge
New API code Copyright (c) 2016-2022 University of Cambridge
-----------------------------------------------------------------------------
Redistribution and use in source and binary forms, with or without
@ -649,19 +649,23 @@ the size varies from call to call. As the maximum number of capturing
subpatterns is 65535 we must allow for 65536 strings to include the overall
match. (See also the heapframe structure below.) */
struct heapframe; /* Forward reference */
typedef struct pcre2_real_match_data {
pcre2_memctl memctl;
const pcre2_real_code *code; /* The pattern used for the match */
PCRE2_SPTR subject; /* The subject that was matched */
PCRE2_SPTR mark; /* Pointer to last mark */
PCRE2_SIZE leftchar; /* Offset to leftmost code unit */
PCRE2_SIZE rightchar; /* Offset to rightmost code unit */
PCRE2_SIZE startchar; /* Offset to starting code unit */
uint8_t matchedby; /* Type of match (normal, JIT, DFA) */
uint8_t flags; /* Various flags */
uint16_t oveccount; /* Number of pairs */
int rc; /* The return code from the match */
PCRE2_SIZE ovector[131072]; /* Must be last in the structure */
pcre2_memctl memctl; /* Memory control fields */
const pcre2_real_code *code; /* The pattern used for the match */
PCRE2_SPTR subject; /* The subject that was matched */
PCRE2_SPTR mark; /* Pointer to last mark */
struct heapframe *heapframes; /* Backtracking frames heap memory */
PCRE2_SIZE heapframes_size; /* Malloc-ed size */
PCRE2_SIZE leftchar; /* Offset to leftmost code unit */
PCRE2_SIZE rightchar; /* Offset to rightmost code unit */
PCRE2_SIZE startchar; /* Offset to starting code unit */
uint8_t matchedby; /* Type of match (normal, JIT, DFA) */
uint8_t flags; /* Various flags */
uint16_t oveccount; /* Number of pairs */
int rc; /* The return code from the match */
PCRE2_SIZE ovector[131072]; /* Must be last in the structure */
} pcre2_real_match_data;
@ -854,10 +858,6 @@ doing traditional NFA matching (pcre2_match() and friends). */
typedef struct match_block {
pcre2_memctl memctl; /* For general use */
PCRE2_SIZE frame_vector_size; /* Size of a backtracking frame */
heapframe *match_frames; /* Points to vector of frames */
heapframe *match_frames_top; /* Points after the end of the vector */
heapframe *stack_frames; /* The original vector on the stack */
PCRE2_SIZE heap_limit; /* As it says */
uint32_t match_limit; /* As it says */
uint32_t match_limit_depth; /* As it says */


@ -5886,7 +5886,7 @@ static BOOL check_fast_forward_char_pair_simd(compiler_common *common, fast_forw
while (j < i)
{
b_pri = chars[j].last_count;
if (b_pri > 2 && a_pri + b_pri >= max_pri)
if (b_pri > 2 && (sljit_u32)a_pri + (sljit_u32)b_pri >= max_pri)
{
b1 = chars[j].chars[0];
b2 = chars[j].chars[1];
@ -9689,7 +9689,7 @@ BACKTRACK_AS(recurse_backtrack)->matchingpath = LABEL();
return cc + 1 + LINK_SIZE;
}
static sljit_s32 SLJIT_FUNC do_callout(struct jit_arguments *arguments, pcre2_callout_block *callout_block, PCRE2_SPTR *jit_ovector)
static sljit_s32 SLJIT_FUNC do_callout_jit(struct jit_arguments *arguments, pcre2_callout_block *callout_block, PCRE2_SPTR *jit_ovector)
{
PCRE2_SPTR begin;
PCRE2_SIZE *ovector;
@ -9806,7 +9806,7 @@ OP1(SLJIT_MOV, SLJIT_MEM1(SLJIT_SP), LOCALS0, STR_PTR, 0);
/* SLJIT_R0 = arguments */
OP1(SLJIT_MOV, SLJIT_R1, 0, STACK_TOP, 0);
GET_LOCAL_BASE(SLJIT_R2, 0, OVECTOR_START);
sljit_emit_icall(compiler, SLJIT_CALL, SLJIT_ARGS3(32, W, W, W), SLJIT_IMM, SLJIT_FUNC_ADDR(do_callout));
sljit_emit_icall(compiler, SLJIT_CALL, SLJIT_ARGS3(32, W, W, W), SLJIT_IMM, SLJIT_FUNC_ADDR(do_callout_jit));
OP1(SLJIT_MOV, STR_PTR, 0, SLJIT_MEM1(SLJIT_SP), LOCALS0);
free_stack(common, callout_arg_size);
@ -11517,19 +11517,19 @@ if (exact > 1)
}
}
else if (exact == 1)
{
compile_char1_matchingpath(common, type, cc, &backtrack->topbacktracks, TRUE);
if (early_fail_type == type_fail_range)
{
OP1(SLJIT_MOV, TMP1, 0, SLJIT_MEM1(SLJIT_SP), early_fail_ptr);
OP1(SLJIT_MOV, TMP2, 0, SLJIT_MEM1(SLJIT_SP), early_fail_ptr + (int)sizeof(sljit_sw));
OP2(SLJIT_SUB, TMP1, 0, TMP1, 0, TMP2, 0);
OP2(SLJIT_SUB, TMP2, 0, STR_PTR, 0, TMP2, 0);
add_jump(compiler, &backtrack->topbacktracks, CMP(SLJIT_LESS_EQUAL, TMP2, 0, TMP1, 0));
if (early_fail_type == type_fail_range)
{
/* Range end first, followed by range start. */
OP1(SLJIT_MOV, TMP1, 0, SLJIT_MEM1(SLJIT_SP), early_fail_ptr);
OP1(SLJIT_MOV, TMP2, 0, SLJIT_MEM1(SLJIT_SP), early_fail_ptr + (int)sizeof(sljit_sw));
OP2(SLJIT_SUB, TMP1, 0, TMP1, 0, TMP2, 0);
OP2(SLJIT_SUB, TMP2, 0, STR_PTR, 0, TMP2, 0);
add_jump(compiler, &backtrack->topbacktracks, CMP(SLJIT_LESS_EQUAL, TMP2, 0, TMP1, 0));
OP1(SLJIT_MOV, SLJIT_MEM1(SLJIT_SP), early_fail_ptr + (int)sizeof(sljit_sw), STR_PTR, 0);
}
OP1(SLJIT_MOV, SLJIT_MEM1(SLJIT_SP), early_fail_ptr, STR_PTR, 0);
OP1(SLJIT_MOV, SLJIT_MEM1(SLJIT_SP), early_fail_ptr + (int)sizeof(sljit_sw), STR_PTR, 0);
}
switch(opcode)
@ -14384,7 +14384,7 @@ pcre2_jit_compile(pcre2_code *code, uint32_t options)
pcre2_real_code *re = (pcre2_real_code *)code;
#ifdef SUPPORT_JIT
executable_functions *functions;
static int executable_allocator_is_working = 0;
static int executable_allocator_is_working = -1;
#endif
if (code == NULL)
@ -14447,23 +14447,21 @@ return PCRE2_ERROR_JIT_BADOPTION;
if ((re->flags & PCRE2_NOJIT) != 0) return 0;
if (executable_allocator_is_working == 0)
if (executable_allocator_is_working == -1)
{
/* Checks whether the executable allocator is working. This check
might run multiple times in multi-threaded environments, but the
result should not be affected by it. */
void *ptr = SLJIT_MALLOC_EXEC(32, NULL);
executable_allocator_is_working = -1;
if (ptr != NULL)
{
SLJIT_FREE_EXEC(((sljit_u8*)(ptr)) + SLJIT_EXEC_OFFSET(ptr), NULL);
executable_allocator_is_working = 1;
}
else executable_allocator_is_working = 0;
}
if (executable_allocator_is_working < 0)
if (!executable_allocator_is_working)
return PCRE2_ERROR_NOMEMORY;
if ((re->overall_options & PCRE2_MATCH_INVALID_UTF) != 0)


@ -354,6 +354,7 @@ static struct regression_test_case regression_test_cases[] = {
{ MU, A, 0, 0, "_[ab]+_*a", "_aa" },
{ MU, A, 0, 0, "#(A+)#\\d+", "#A#A#0" },
{ MU, A, 0, 0, "(?P<size>\\d+)m|M", "4M" },
{ M, PCRE2_NEWLINE_CRLF, 0, 0, "\\n?.+#", "\n,\n,#" },
/* Bracket repeats with limit. */
{ MU, A, 0, 0, "(?:(ab){2}){5}M", "abababababababababababM" },


@ -204,6 +204,7 @@ Arguments:
P a previous frame of interest
frame_size the frame size
mb points to the match block
match_data points to the match data block
s identification text
Returns: nothing
@ -211,7 +212,7 @@ Returns: nothing
static void
display_frames(FILE *f, heapframe *F, heapframe *P, PCRE2_SIZE frame_size,
match_block *mb, const char *s, ...)
match_block *mb, pcre2_match_data *match_data, const char *s, ...)
{
uint32_t i;
heapframe *Q;
@ -223,10 +224,10 @@ vfprintf(f, s, ap);
va_end(ap);
if (P != NULL) fprintf(f, " P=%lu",
((char *)P - (char *)(mb->match_frames))/frame_size);
((char *)P - (char *)(match_data->heapframes))/frame_size);
fprintf(f, "\n");
for (i = 0, Q = mb->match_frames;
for (i = 0, Q = match_data->heapframes;
Q <= F;
i++, Q = (heapframe *)((char *)Q + frame_size))
{
@ -490,10 +491,16 @@ A version did exist that used individual frames on the heap instead of calling
match() recursively, but this ran substantially slower. The current version is
a refactoring that uses a vector of frames to remember backtracking points.
This runs no slower, and possibly even a bit faster than the original recursive
implementation. An initial vector of size START_FRAMES_SIZE (enough for maybe
50 frames) is allocated on the system stack. If this is not big enough, the
heap is used for a larger vector.
implementation.
At first, an initial vector of size START_FRAMES_SIZE (enough for maybe 50
frames) was allocated on the system stack. If this was not big enough, the heap
was used for a larger vector. However, it turns out that there are environments
where taking as little as 20KiB from the system stack is an embarrassment.
After another refactoring, the heap is used exclusively, but a pointer to the
frames vector and its size are cached in the match_data block, so that there is
no new memory allocation if the same match_data block is used for multiple
matches (unless the frames vector has to be extended).
*******************************************************************************
******************************************************************************/
@ -566,10 +573,9 @@ made performance worse.
Arguments:
start_eptr starting character in subject
start_ecode starting position in compiled code
ovector pointer to the final output vector
oveccount number of pairs in ovector
top_bracket number of capturing parentheses in the pattern
frame_size size of each backtracking frame
match_data pointer to the match_data block
mb pointer to "static" variables block
Returns: MATCH_MATCH if matched ) these values are >= 0
@ -580,17 +586,19 @@ Returns: MATCH_MATCH if matched ) these values are >= 0
*/
static int
match(PCRE2_SPTR start_eptr, PCRE2_SPTR start_ecode, PCRE2_SIZE *ovector,
uint16_t oveccount, uint16_t top_bracket, PCRE2_SIZE frame_size,
match_block *mb)
match(PCRE2_SPTR start_eptr, PCRE2_SPTR start_ecode, uint16_t top_bracket,
PCRE2_SIZE frame_size, pcre2_match_data *match_data, match_block *mb)
{
/* Frame-handling variables */
heapframe *F; /* Current frame pointer */
heapframe *N = NULL; /* Temporary frame pointers */
heapframe *P = NULL;
heapframe *frames_top; /* End of frames vector */
heapframe *assert_accept_frame = NULL; /* For passing back a frame with captures */
PCRE2_SIZE frame_copy_size; /* Amount to copy when creating a new frame */
PCRE2_SIZE heapframes_size; /* Usable size of frames vector */
PCRE2_SIZE frame_copy_size; /* Amount to copy when creating a new frame */
/* Local variables that do not need to be preserved over calls to RRMATCH(). */
@ -627,10 +635,14 @@ copied when a new frame is created. */
frame_copy_size = frame_size - offsetof(heapframe, eptr);
/* Set up the first current frame at the start of the vector, and initialize
fields that are not reset for new frames. */
/* Set up the first frame and the end of the frames vector. We set the local
heapframes_size to the usable amount of the vector, that is, a whole number of
frames. */
F = match_data->heapframes;
heapframes_size = (match_data->heapframes_size / frame_size) * frame_size;
frames_top = (heapframe *)((char *)F + heapframes_size);
F = mb->match_frames;
Frdepth = 0; /* "Recursion" depth */
Fcapture_last = 0; /* Number of most recent capture */
Fcurrent_recurse = RECURSE_UNSET; /* Not pattern recursing. */
@ -646,34 +658,35 @@ backtracking point. */
MATCH_RECURSE:
/* Set up a new backtracking frame. If the vector is full, get a new one
on the heap, doubling the size, but constrained by the heap limit. */
/* Set up a new backtracking frame. If the vector is full, get a new one,
doubling the size, but constrained by the heap limit (which is in KiB). */
N = (heapframe *)((char *)F + frame_size);
if (N >= mb->match_frames_top)
if (N >= frames_top)
{
PCRE2_SIZE newsize = mb->frame_vector_size * 2;
heapframe *new;
PCRE2_SIZE newsize = match_data->heapframes_size * 2;
if ((newsize / 1024) > mb->heap_limit)
if (newsize > mb->heap_limit)
{
PCRE2_SIZE maxsize = ((mb->heap_limit * 1024)/frame_size) * frame_size;
if (mb->frame_vector_size >= maxsize) return PCRE2_ERROR_HEAPLIMIT;
PCRE2_SIZE maxsize = (mb->heap_limit/frame_size) * frame_size;
if (match_data->heapframes_size >= maxsize) return PCRE2_ERROR_HEAPLIMIT;
newsize = maxsize;
}
new = mb->memctl.malloc(newsize, mb->memctl.memory_data);
new = match_data->memctl.malloc(newsize, match_data->memctl.memory_data);
if (new == NULL) return PCRE2_ERROR_NOMEMORY;
memcpy(new, mb->match_frames, mb->frame_vector_size);
memcpy(new, match_data->heapframes, heapframes_size);
F = (heapframe *)((char *)new + ((char *)F - (char *)mb->match_frames));
F = (heapframe *)((char *)new + ((char *)F - (char *)match_data->heapframes));
N = (heapframe *)((char *)F + frame_size);
if (mb->match_frames != mb->stack_frames)
mb->memctl.free(mb->match_frames, mb->memctl.memory_data);
mb->match_frames = new;
mb->match_frames_top = (heapframe *)((char *)mb->match_frames + newsize);
mb->frame_vector_size = newsize;
match_data->memctl.free(match_data->heapframes, match_data->memctl.memory_data);
match_data->heapframes = new;
match_data->heapframes_size = newsize;
heapframes_size = (newsize / frame_size) * frame_size;
frames_top = (heapframe *)((char *)new + heapframes_size);
}
#ifdef DEBUG_SHOW_RMATCH
@ -731,7 +744,7 @@ recursion value. */
if (group_frame_type != 0)
{
Flast_group_offset = (char *)F - (char *)mb->match_frames;
Flast_group_offset = (char *)F - (char *)match_data->heapframes;
if (GF_IDMASK(group_frame_type) == GF_RECURSE)
Fcurrent_recurse = GF_DATAMASK(group_frame_type);
group_frame_type = 0;
@ -773,7 +786,7 @@ fprintf(stderr, "++ op=%d\n", *Fecode);
for(;;)
{
if (offset == PCRE2_UNSET) return PCRE2_ERROR_INTERNAL;
N = (heapframe *)((char *)mb->match_frames + offset);
N = (heapframe *)((char *)match_data->heapframes + offset);
P = (heapframe *)((char *)N - frame_size);
if (N->group_frame_type == (GF_CAPTURE | number)) break;
offset = P->last_group_offset;
@ -811,7 +824,7 @@ fprintf(stderr, "++ op=%d\n", *Fecode);
for(;;)
{
if (offset == PCRE2_UNSET) return PCRE2_ERROR_INTERNAL;
N = (heapframe *)((char *)mb->match_frames + offset);
N = (heapframe *)((char *)match_data->heapframes + offset);
P = (heapframe *)((char *)N - frame_size);
if (GF_IDMASK(N->group_frame_type) == GF_RECURSE) break;
offset = P->last_group_offset;
@ -864,14 +877,15 @@ fprintf(stderr, "++ op=%d\n", *Fecode);
mb->mark = Fmark; /* and the last success mark */
if (Feptr > mb->last_used_ptr) mb->last_used_ptr = Feptr;
ovector[0] = Fstart_match - mb->start_subject;
ovector[1] = Feptr - mb->start_subject;
match_data->ovector[0] = Fstart_match - mb->start_subject;
match_data->ovector[1] = Feptr - mb->start_subject;
/* Set i to the smaller of the sizes of the external and frame ovectors. */
i = 2 * ((top_bracket + 1 > oveccount)? oveccount : top_bracket + 1);
memcpy(ovector + 2, Fovector, (i - 2) * sizeof(PCRE2_SIZE));
while (--i >= Foffset_top + 2) ovector[i] = PCRE2_UNSET;
i = 2 * ((top_bracket + 1 > match_data->oveccount)?
match_data->oveccount : top_bracket + 1);
memcpy(match_data->ovector + 2, Fovector, (i - 2) * sizeof(PCRE2_SIZE));
while (--i >= Foffset_top + 2) match_data->ovector[i] = PCRE2_UNSET;
return MATCH_MATCH; /* Note: NOT RRETURN */
@ -5328,7 +5342,7 @@ fprintf(stderr, "++ op=%d\n", *Fecode);
offset = Flast_group_offset;
while (offset != PCRE2_UNSET)
{
N = (heapframe *)((char *)mb->match_frames + offset);
N = (heapframe *)((char *)match_data->heapframes + offset);
P = (heapframe *)((char *)N - frame_size);
if (N->group_frame_type == (GF_RECURSE | number))
{
@ -5729,7 +5743,7 @@ fprintf(stderr, "++ op=%d\n", *Fecode);
if (*bracode != OP_BRA && *bracode != OP_COND)
{
N = (heapframe *)((char *)mb->match_frames + Flast_group_offset);
N = (heapframe *)((char *)match_data->heapframes + Flast_group_offset);
P = (heapframe *)((char *)N - frame_size);
Flast_group_offset = P->last_group_offset;
@ -6346,6 +6360,7 @@ BOOL jit_checked_utf = FALSE;
#endif /* SUPPORT_UNICODE */
PCRE2_SIZE frame_size;
PCRE2_SIZE heapframes_size;
/* We need to have mb as a pointer to a match block, because the IS_NEWLINE
macro is used below, and it expects NLBLOCK to be defined as a pointer. */
@ -6354,15 +6369,6 @@ pcre2_callout_block cb;
match_block actual_match_block;
match_block *mb = &actual_match_block;
/* Allocate an initial vector of backtracking frames on the stack. If this
proves to be too small, it is replaced by a larger one on the heap. To get a
vector of the size required that is aligned for pointers, allocate it as a
vector of pointers. */
PCRE2_SPTR stack_frames_vector[START_FRAMES_SIZE/sizeof(PCRE2_SPTR)]
PCRE2_KEEP_UNINITIALIZED;
mb->stack_frames = (heapframe *)stack_frames_vector;
/* Recognize NULL, length 0 as an empty string. */
if (subject == NULL && length == 0) subject = (PCRE2_SPTR)"";
@ -6793,15 +6799,11 @@ switch(re->newline_convention)
vector at the end, whose size depends on the number of capturing parentheses in
the pattern. It is not used at all if there are no capturing parentheses.
frame_size is the total size of each frame
mb->frame_vector_size is the total usable size of the vector (rounded down
to a whole number of frames)
frame_size is the total size of each frame
match_data->heapframes is the pointer to the frames vector
match_data->heapframes_size is the total size of the vector
The last of these is changed within the match() function if the frame vector
has to be expanded. We therefore put it into the match block so that it is
correct when calling match() more than once for non-anchored patterns.
We must also pad frame_size for alignment to ensure subsequent frames are as
We must pad the frame_size for alignment to ensure subsequent frames are as
aligned as heapframe. Whilst ovector is word-aligned due to being a PCRE2_SIZE
array, that does not guarantee it is suitably aligned for pointers, as some
architectures have pointers that are larger than a size_t. */
@ -6813,8 +6815,8 @@ frame_size = (offsetof(heapframe, ovector) +
/* Limits set in the pattern override the match context only if they are
smaller. */
mb->heap_limit = (mcontext->heap_limit < re->limit_heap)?
mcontext->heap_limit : re->limit_heap;
mb->heap_limit = ((mcontext->heap_limit < re->limit_heap)?
mcontext->heap_limit : re->limit_heap) * 1024;
mb->match_limit = (mcontext->match_limit < re->limit_match)?
mcontext->match_limit : re->limit_match;
@ -6823,35 +6825,40 @@ mb->match_limit_depth = (mcontext->depth_limit < re->limit_depth)?
mcontext->depth_limit : re->limit_depth;
/* If a pattern has very many capturing parentheses, the frame size may be very
large. Ensure that there are at least 10 available frames by getting an initial
vector on the heap if necessary, except when the heap limit prevents this. Get
fewer if possible. (The heap limit is in kibibytes.) */
large. Set the initial frame vector size to ensure that there are at least 10
available frames, but enforce a minimum of START_FRAMES_SIZE. If this is
greater than the heap limit, get as large a vector as possible. Always round
the size to a multiple of the frame size. */
if (frame_size <= START_FRAMES_SIZE/10)
heapframes_size = frame_size * 10;
if (heapframes_size < START_FRAMES_SIZE) heapframes_size = START_FRAMES_SIZE;
if (heapframes_size > mb->heap_limit)
{
mb->match_frames = mb->stack_frames; /* Initial frame vector on the stack */
mb->frame_vector_size = ((START_FRAMES_SIZE/frame_size) * frame_size);
if (frame_size > mb->heap_limit ) return PCRE2_ERROR_HEAPLIMIT;
heapframes_size = mb->heap_limit;
}
else
/* If an existing frame vector in the match_data block is large enough, we can
use it. Otherwise, free any pre-existing vector and get a new one. */
if (match_data->heapframes_size < heapframes_size)
{
mb->frame_vector_size = frame_size * 10;
if ((mb->frame_vector_size / 1024) > mb->heap_limit)
match_data->memctl.free(match_data->heapframes,
match_data->memctl.memory_data);
match_data->heapframes = match_data->memctl.malloc(heapframes_size,
match_data->memctl.memory_data);
if (match_data->heapframes == NULL)
{
if (frame_size > mb->heap_limit * 1024) return PCRE2_ERROR_HEAPLIMIT;
mb->frame_vector_size = ((mb->heap_limit * 1024)/frame_size) * frame_size;
}
mb->match_frames = mb->memctl.malloc(mb->frame_vector_size,
mb->memctl.memory_data);
if (mb->match_frames == NULL) return PCRE2_ERROR_NOMEMORY;
match_data->heapframes_size = 0;
return PCRE2_ERROR_NOMEMORY;
}
match_data->heapframes_size = heapframes_size;
}
mb->match_frames_top =
(heapframe *)((char *)mb->match_frames + mb->frame_vector_size);
/* Write to the ovector within the first frame to mark every capture unset and
to avoid uninitialized memory read errors when it is copied to a new frame. */
memset((char *)(mb->match_frames) + offsetof(heapframe, ovector), 0xff,
memset((char *)(match_data->heapframes) + offsetof(heapframe, ovector), 0xff,
frame_size - offsetof(heapframe, ovector));
/* Pointers to the individual character tables */
@ -7279,8 +7286,8 @@ for(;;)
mb->end_offset_top = 0;
mb->skip_arg_count = 0;
rc = match(start_match, mb->start_code, match_data->ovector,
match_data->oveccount, re->top_bracket, frame_size, mb);
rc = match(start_match, mb->start_code, re->top_bracket, frame_size,
match_data, mb);
if (mb->hitend && start_partial == NULL)
{
@ -7463,11 +7470,6 @@ if (utf && end_subject != true_end_subject &&
}
#endif /* SUPPORT_UNICODE */
/* Release an enlarged frame vector that is on the heap. */
if (mb->match_frames != mb->stack_frames)
mb->memctl.free(mb->match_frames, mb->memctl.memory_data);
/* Fill in fields that are always returned in the match data. */
match_data->code = re;
@ -7533,4 +7535,10 @@ else match_data->rc = PCRE2_ERROR_NOMATCH;
return match_data->rc;
}
/* These #undefs are here to enable unity builds with CMake. */
#undef NLBLOCK /* Block containing newline information */
#undef PSSTART /* Field containing processed string start */
#undef PSEND /* Field containing processed string end */
/* End of pcre2_match.c */
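
The growth policy used in match() above (double the vector, but never beyond the heap limit, and count only whole frames as usable) can be sketched in isolation. The helper names `next_frames_size` and `usable_frames_size` are hypothetical and do not exist in PCRE2; the heap limit is assumed to be in bytes, and a return value of 0 stands in for PCRE2_ERROR_HEAPLIMIT.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not part of PCRE2). Given the currently allocated size
of the frames vector, returns the size to allocate next, or 0 if the vector
cannot grow because the heap limit has been reached. */

size_t
next_frames_size(size_t current_size, size_t frame_size, size_t heap_limit)
{
size_t newsize;
assert(frame_size > 0);

newsize = current_size * 2;
if (newsize > heap_limit)
  {
  /* Largest whole number of frames that fits under the limit. */
  size_t maxsize = (heap_limit / frame_size) * frame_size;
  if (current_size >= maxsize) return 0;
  newsize = maxsize;
  }
return newsize;
}

/* The part of an allocation that can hold complete frames. */

size_t
usable_frames_size(size_t allocated, size_t frame_size)
{
assert(frame_size > 0);
return (allocated / frame_size) * frame_size;
}
```

Keeping the allocated size and the usable size separate mirrors the diff: match_data->heapframes_size records what was allocated, while the local heapframes_size in match() is rounded down to a whole number of frames.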


@ -7,7 +7,7 @@ and semantics are as close as possible to those of the Perl 5 language.
Written by Philip Hazel
Original API code Copyright (c) 1997-2012 University of Cambridge
New API code Copyright (c) 2016-2019 University of Cambridge
New API code Copyright (c) 2016-2022 University of Cambridge
-----------------------------------------------------------------------------
Redistribution and use in source and binary forms, with or without
@ -64,6 +64,8 @@ yield = PRIV(memctl_malloc)(
if (yield == NULL) return NULL;
yield->oveccount = oveccount;
yield->flags = 0;
yield->heapframes = NULL;
yield->heapframes_size = 0;
return yield;
}
@ -95,6 +97,9 @@ pcre2_match_data_free(pcre2_match_data *match_data)
{
if (match_data != NULL)
{
if (match_data->heapframes != NULL)
match_data->memctl.free(match_data->heapframes,
match_data->memctl.memory_data);
if ((match_data->flags & PCRE2_MD_COPIED_SUBJECT) != 0)
match_data->memctl.free((void *)match_data->subject,
match_data->memctl.memory_data);
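
The lifecycle of the cached frames vector across the hunks above (set to NULL/0 on creation, reused when large enough, replaced when too small, released on free) can be sketched with a small stand-alone structure. The type `frames_cache` and the functions `cache_init`, `cache_ensure` and `cache_free` are hypothetical names for illustration, and plain malloc()/free() stand in for the memctl functions of the real code.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the two new fields in pcre2_real_match_data. */

typedef struct frames_cache {
  void *frames;    /* Cached frames vector, or NULL */
  size_t size;     /* Allocated size in bytes */
} frames_cache;

void
cache_init(frames_cache *c)
{
c->frames = NULL;
c->size = 0;
}

/* Makes sure the cached vector holds at least "needed" bytes. An existing
vector that is large enough is reused, so repeated matches with the same block
cause no new allocation. Returns 0 on success, -1 if allocation fails. */

int
cache_ensure(frames_cache *c, size_t needed)
{
assert(c != NULL);
if (c->size < needed)
  {
  free(c->frames);               /* free(NULL) is a no-op */
  c->frames = malloc(needed);
  if (c->frames == NULL)
    {
    c->size = 0;
    return -1;
    }
  c->size = needed;
  }
return 0;
}

void
cache_free(frames_cache *c)
{
free(c->frames);
c->frames = NULL;
c->size = 0;
}
```

The old contents are not copied when the vector is replaced at the start of a match, because frames from a previous match are never needed again; copying is only required when the vector grows in the middle of a match.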


@ -7,7 +7,7 @@ and semantics are as close as possible to those of the Perl 5 language.
Written by Philip Hazel
Original API code Copyright (c) 1997-2012 University of Cambridge
New API code Copyright (c) 2016-2021 University of Cambridge
New API code Copyright (c) 2016-2022 University of Cambridge
-----------------------------------------------------------------------------
Redistribution and use in source and binary forms, with or without
@ -259,16 +259,16 @@ PCRE2_UNSET, so as not to imply an offset in the replacement. */
if ((options & (PCRE2_PARTIAL_HARD|PCRE2_PARTIAL_SOFT)) != 0)
return PCRE2_ERROR_BADOPTION;
/* Validate length and find the end of the replacement. A NULL replacement of
/* Validate length and find the end of the replacement. A NULL replacement of
zero length is interpreted as an empty string. */
if (replacement == NULL)
if (replacement == NULL)
{
if (rlength != 0) return PCRE2_ERROR_NULL;
replacement = (PCRE2_SPTR)"";
}
replacement = (PCRE2_SPTR)"";
}
if (rlength == PCRE2_ZERO_TERMINATED) rlength = PRIV(strlen)(replacement);
repend = replacement + rlength;
@ -282,8 +282,9 @@ replacement_only = ((options & PCRE2_SUBSTITUTE_REPLACEMENT_ONLY) != 0);
match data block. We create an internal match_data block in two cases: (a) an
external one is not supplied (and we are not starting from an existing match);
(b) an existing match is to be used for the first substitution. In the latter
case, we copy the existing match into the internal block. This ensures that no
changes are made to the existing match data block. */
case, we copy the existing match into the internal block, except for any cached
heap frame size and pointer. This ensures that no changes are made to the
external match data block. */
if (match_data == NULL)
{
@ -309,6 +310,8 @@ else if (use_existing_match)
if (internal_match_data == NULL) return PCRE2_ERROR_NOMEMORY;
memcpy(internal_match_data, match_data, offsetof(pcre2_match_data, ovector)
+ 2*pairs*sizeof(PCRE2_SIZE));
internal_match_data->heapframes = NULL;
internal_match_data->heapframes_size = 0;
match_data = internal_match_data;
}
@ -328,9 +331,9 @@ scb.ovector = ovector;
if (subject == NULL)
{
if (length != 0) return PCRE2_ERROR_NULL;
if (length != 0) return PCRE2_ERROR_NULL;
subject = (PCRE2_SPTR)"";
}
}
/* Find length of zero-terminated subject */
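The hunk above copies the caller's match data into an internal block and then clears the cached heap-frame pointer and size in the copy. A minimal sketch of that pattern follows. It uses a hypothetical `demo_match_data` struct, not the real `pcre2_match_data` layout, and assumes the cached allocation is owned by whichever block holds the pointer:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a match data block: a header, a cached heap
   allocation owned by the block, and a trailing vector of offsets. */
typedef struct demo_match_data {
  void   *heapframes;       /* cached working memory, owned by this block */
  size_t  heapframes_size;  /* size of that cache in bytes */
  size_t  oveccount;        /* number of offset pairs */
  size_t  ovector[];        /* 2*oveccount offsets follow the header */
} demo_match_data;

/* Copy the header and offsets, but do not share the cached allocation:
   the copy starts with no cache, so freeing or regrowing the copy's cache
   can never touch memory that the original still points at. */
static demo_match_data *demo_copy(const demo_match_data *src)
{
  size_t bytes = offsetof(demo_match_data, ovector) +
                 2 * src->oveccount * sizeof(size_t);
  demo_match_data *copy = malloc(bytes);
  if (copy == NULL) return NULL;
  memcpy(copy, src, bytes);
  copy->heapframes = NULL;
  copy->heapframes_size = 0;
  return copy;
}
```

Without the two resets, both blocks would hold the same pointer, and freeing or reallocating through one of them would leave the other dangling.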


@ -13,7 +13,7 @@ distribution because other apparatus is needed to compile pcre2grep for z/OS.
The header can be found in the special z/OS distribution, which is available
from www.zaconsultants.net or from www.cbttape.org.
Copyright (c) 1997-2020 University of Cambridge
Copyright (c) 1997-2022 University of Cambridge
-----------------------------------------------------------------------------
Redistribution and use in source and binary forms, with or without
@ -205,9 +205,6 @@ point. */
* Global variables *
*************************************************/
/* Jeffrey Friedl has some debugging requirements that are not part of the
regular code. */
static const char *colour_string = "1;31";
static const char *colour_option = NULL;
static const char *dee_option = NULL;
@ -220,19 +217,24 @@ static const char *output_text = NULL;
static char *main_buffer = NULL;
static const char *printname_nl = STDOUT_NL; /* Changed to NULL for -Z */
static int printname_colon = ':'; /* Changed to 0 for -Z */
static int printname_hyphen = '-'; /* Changed to 0 for -Z */
static int after_context = 0;
static int before_context = 0;
static int binary_files = BIN_BINARY;
static int both_context = 0;
static int bufthird = PCRE2GREP_BUFSIZE;
static int max_bufthird = PCRE2GREP_MAX_BUFSIZE;
static int bufsize = 3*PCRE2GREP_BUFSIZE;
static int endlinetype;
static int count_limit = -1; /* Not long, so that it works with OP_NUMBER */
static unsigned long int counts_printed = 0;
static unsigned long int total_count = 0;
static PCRE2_SIZE bufthird = PCRE2GREP_BUFSIZE;
static PCRE2_SIZE max_bufthird = PCRE2GREP_MAX_BUFSIZE;
static PCRE2_SIZE bufsize = 3*PCRE2GREP_BUFSIZE;
#ifdef WIN32
static int dee_action = dee_SKIP;
#else
@ -425,8 +427,8 @@ static option_item optionlist[] = {
{ OP_NODATA, 'a', NULL, "text", "treat binary files as text" },
{ OP_NUMBER, 'B', &before_context, "before-context=number", "set number of prior context lines" },
{ OP_BINFILES, N_BINARY_FILES, NULL, "binary-files=word", "set treatment of binary files" },
{ OP_NUMBER, N_BUFSIZE,&bufthird, "buffer-size=number", "set processing buffer starting size" },
{ OP_NUMBER, N_MAX_BUFSIZE,&max_bufthird, "max-buffer-size=number", "set processing buffer maximum size" },
{ OP_SIZE, N_BUFSIZE,&bufthird, "buffer-size=number", "set processing buffer starting size" },
{ OP_SIZE, N_MAX_BUFSIZE,&max_bufthird, "max-buffer-size=number", "set processing buffer maximum size" },
{ OP_OP_STRING, N_COLOUR, &colour_option, "color=option", "matched text color option" },
{ OP_OP_STRING, N_COLOUR, &colour_option, "colour=option", "matched text colour option" },
{ OP_NUMBER, 'C', &both_context, "context=number", "set number of context lines, before & after" },
@ -482,6 +484,7 @@ static option_item optionlist[] = {
{ OP_NODATA, 'w', NULL, "word-regex(p)", "force patterns to match only as words" },
{ OP_NODATA, 'x', NULL, "line-regex(p)", "force patterns to match only whole lines" },
{ OP_NODATA, N_ALLABSK, NULL, "allow-lookaround-bsk", "allow \\K in lookarounds" },
{ OP_NODATA, 'Z', NULL, "null", "output 0 byte after file names" },
{ OP_NODATA, 0, NULL, NULL, NULL }
};
@ -1408,10 +1411,10 @@ Returns: the number of characters read, zero at end of file
*/
static PCRE2_SIZE
read_one_line(char *buffer, int length, FILE *f)
read_one_line(char *buffer, PCRE2_SIZE length, FILE *f)
{
int c;
int yield = 0;
PCRE2_SIZE yield = 0;
while ((c = fgetc(f)) != EOF)
{
buffer[yield++] = c;
@ -1772,7 +1775,7 @@ if (after_context > 0 && lastmatchnumber > 0)
{
char *pp = end_of_line(lastmatchrestart, endptr, &ellength);
if (ellength == 0 && pp == main_buffer + bufsize) break;
if (printname != NULL) fprintf(stdout, "%s-", printname);
if (printname != NULL) fprintf(stdout, "%s%c", printname, printname_hyphen);
if (number) fprintf(stdout, "%lu-", lastmatchnumber++);
FWRITE_IGNORE(lastmatchrestart, 1, pp - lastmatchrestart, stdout);
lastmatchrestart = pp;
@ -2437,7 +2440,11 @@ if (pid == 0)
exit(1);
}
else if (pid > 0)
{
(void)fflush(stdout);
(void)waitpid(pid, &result, 0);
(void)fflush(stdout);
}
#endif /* End Windows/VMS/other handling */
free(args);
@ -2457,8 +2464,8 @@ return result != 0;
* Read a portion of the file into buffer *
*************************************************/
static int
fill_buffer(void *handle, int frtype, char *buffer, int length,
static PCRE2_SIZE
fill_buffer(void *handle, int frtype, char *buffer, PCRE2_SIZE length,
BOOL input_line_buffered)
{
(void)frtype; /* Avoid warning when not used */
@ -2620,7 +2627,7 @@ while (ptr < endptr)
if (bufthird < max_bufthird)
{
char *new_buffer;
int new_bufthird = 2*bufthird;
PCRE2_SIZE new_bufthird = 2*bufthird;
if (new_bufthird > max_bufthird) new_bufthird = max_bufthird;
new_buffer = (char *)malloc(3*new_bufthird);
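The growth step in this hunk doubles `bufthird` and clamps the result to `max_bufthird`, now in `PCRE2_SIZE` rather than `int`. A small sketch of that rule, using plain `size_t` and a hypothetical helper name, is below. The overflow guard is an addition for illustration and is not part of the diff:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: return the next buffer third, doubling the current
   one but never exceeding the configured maximum. */
static size_t next_bufthird(size_t bufthird, size_t max_bufthird)
{
  size_t doubled;
  if (bufthird >= max_bufthird) return bufthird;   /* already at the cap */
  /* Guard against wrap-around before doubling (illustrative addition). */
  doubled = (bufthird > SIZE_MAX / 2) ? SIZE_MAX : 2 * bufthird;
  return (doubled > max_bufthird) ? max_bufthird : doubled;
}
```

The caller would then allocate three times the returned value, as the `malloc(3*new_bufthird)` line does.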
@ -2629,7 +2636,8 @@ while (ptr < endptr)
{
fprintf(stderr,
"pcre2grep: line %lu%s%s is too long for the internal buffer\n"
"pcre2grep: not enough memory to increase the buffer size to %d\n",
"pcre2grep: not enough memory to increase the buffer size to %"
SIZ_FORM "\n",
linenumber,
(filename == NULL)? "" : " of file ",
(filename == NULL)? "" : filename,
@ -2659,7 +2667,7 @@ while (ptr < endptr)
{
fprintf(stderr,
"pcre2grep: line %lu%s%s is too long for the internal buffer\n"
"pcre2grep: the maximum buffer size is %d\n"
"pcre2grep: the maximum buffer size is %" SIZ_FORM "\n"
"pcre2grep: use the --max-buffer-size option to change it\n",
linenumber,
(filename == NULL)? "" : " of file ",
@ -2724,7 +2732,9 @@ while (ptr < endptr)
else if (filenames == FN_MATCH_ONLY)
{
fprintf(stdout, "%s" STDOUT_NL, printname);
fprintf(stdout, "%s", printname);
if (printname_nl == NULL) fprintf(stdout, "%c", 0);
else fprintf(stdout, "%s", printname_nl);
return 0;
}
@ -2743,7 +2753,8 @@ while (ptr < endptr)
{
PCRE2_SIZE oldstartoffset;
if (printname != NULL) fprintf(stdout, "%s:", printname);
if (printname != NULL) fprintf(stdout, "%s%c", printname,
printname_colon);
if (number) fprintf(stdout, "%lu:", linenumber);
/* Handle --line-offsets */
@ -2865,7 +2876,8 @@ while (ptr < endptr)
while (lastmatchrestart < p)
{
char *pp = lastmatchrestart;
if (printname != NULL) fprintf(stdout, "%s-", printname);
if (printname != NULL) fprintf(stdout, "%s%c", printname,
printname_hyphen);
if (number) fprintf(stdout, "%lu-", lastmatchnumber++);
pp = end_of_line(pp, endptr, &ellength);
FWRITE_IGNORE(lastmatchrestart, 1, pp - lastmatchrestart, stdout);
@ -2906,7 +2918,8 @@ while (ptr < endptr)
{
int ellength;
char *pp = p;
if (printname != NULL) fprintf(stdout, "%s-", printname);
if (printname != NULL) fprintf(stdout, "%s%c", printname,
printname_hyphen);
if (number) fprintf(stdout, "%lu-", linenumber - linecount--);
pp = end_of_line(pp, endptr, &ellength);
FWRITE_IGNORE(p, 1, pp - p, stdout);
@ -2920,7 +2933,8 @@ while (ptr < endptr)
if (after_context > 0 || before_context > 0)
endhyphenpending = TRUE;
if (printname != NULL) fprintf(stdout, "%s:", printname);
if (printname != NULL) fprintf(stdout, "%s%c", printname,
printname_colon);
if (number) fprintf(stdout, "%lu:", linenumber);
/* In multiline mode, or if colouring, we have to split the line(s) up
@ -3076,7 +3090,7 @@ while (ptr < endptr)
if (input_line_buffered && bufflength < (PCRE2_SIZE)bufsize)
{
int add = read_one_line(ptr, bufsize - (int)(ptr - main_buffer), in);
PCRE2_SIZE add = read_one_line(ptr, bufsize - (ptr - main_buffer), in);
bufflength += add;
endptr += add;
}
@ -3125,7 +3139,9 @@ were none. If we found a match, we won't have got this far. */
if (filenames == FN_NOMATCH_ONLY)
{
fprintf(stdout, "%s" STDOUT_NL, printname);
fprintf(stdout, "%s", printname);
if (printname_nl == NULL) fprintf(stdout, "%c", 0);
else fprintf(stdout, "%s", printname_nl);
return 0;
}
@ -3136,7 +3152,7 @@ if (count_only && !quiet)
if (count > 0 || !omit_zero_count)
{
if (printname != NULL && filenames != FN_NONE)
fprintf(stdout, "%s:", printname);
fprintf(stdout, "%s%c", printname, printname_colon);
fprintf(stdout, "%lu" STDOUT_NL, count);
counts_printed++;
}
@ -3522,8 +3538,6 @@ switch(letter)
case 'u': options |= PCRE2_UTF; utf = TRUE; break;
case 'U': options |= PCRE2_UTF|PCRE2_MATCH_INVALID_UTF; utf = TRUE; break;
case 'v': invert = TRUE; break;
case 'w': extra_options |= PCRE2_EXTRA_MATCH_WORD; break;
case 'x': extra_options |= PCRE2_EXTRA_MATCH_LINE; break;
case 'V':
{
@ -3534,6 +3548,10 @@ switch(letter)
pcre2grep_exit(0);
break;
case 'w': extra_options |= PCRE2_EXTRA_MATCH_WORD; break;
case 'x': extra_options |= PCRE2_EXTRA_MATCH_LINE; break;
case 'Z': printname_colon = printname_hyphen = 0; printname_nl = NULL; break;
default:
fprintf(stderr, "pcre2grep: Unknown option -%c\n", letter);
pcre2grep_exit(usage(2));
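With `-Z`, the separator variables are set to 0 and the file name is followed by a zero byte instead of `:`, `-` or a newline. The sketch below shows why the `"%s%c"` format works for both cases: `%c` with a value of 0 emits a real NUL byte. The helper names are hypothetical and the 256-byte buffer is an arbitrary choice for the example:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: format a file name followed by one separator byte.
   Returns the number of bytes produced, which includes the separator even
   when it is a NUL. */
static int format_name(char *buf, size_t size, const char *name, int sep)
{
  return snprintf(buf, size, "%s%c", name, sep);
}

/* Write the formatted name with fwrite(), which is length-based and so
   does not stop at an embedded NUL. Returns the number of bytes written. */
static size_t put_name(FILE *out, const char *name, int sep)
{
  char buf[256];
  int n = format_name(buf, sizeof buf, name, sep);
  if (n < 0 || (size_t)n >= sizeof buf) return 0;
  return fwrite(buf, 1, (size_t)n, out);
}
```

A consumer such as `xargs -0` can then split the output on the zero bytes, which is the usual reason for offering this kind of option.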
@ -4253,8 +4271,6 @@ if (DEE_option != NULL)
(void)pcre2_set_compile_extra_options(compile_context, extra_options);
/* Check the values for Jeffrey Friedl's debugging options. */
/* If use_jit is set, check whether JIT is available. If not, do not try
to use JIT. */


@ -479,7 +479,7 @@ so many of them that they are split into two fields. */
#define CTL_DFA 0x00000200u
#define CTL_EXPAND 0x00000400u
#define CTL_FINDLIMITS 0x00000800u
#define CTL_FRAMESIZE 0x00001000u
#define CTL_FINDLIMITS_NOHEAP 0x00001000u
#define CTL_FULLBINCODE 0x00002000u
#define CTL_GETALL 0x00004000u
#define CTL_GLOBAL 0x00008000u
@ -522,6 +522,7 @@ so many of them that they are split into two fields. */
#define CTL2_ALLVECTOR 0x00000800u
#define CTL2_NULL_SUBJECT 0x00001000u
#define CTL2_NULL_REPLACEMENT 0x00002000u
#define CTL2_FRAMESIZE 0x00004000u
#define CTL2_NL_SET 0x40000000u /* Informational */
#define CTL2_BSR_SET 0x80000000u /* Informational */
@ -673,8 +674,9 @@ static modstruct modlist[] = {
{ "extended_more", MOD_PATP, MOD_OPT, PCRE2_EXTENDED_MORE, PO(options) },
{ "extra_alt_bsux", MOD_CTC, MOD_OPT, PCRE2_EXTRA_ALT_BSUX, CO(extra_options) },
{ "find_limits", MOD_DAT, MOD_CTL, CTL_FINDLIMITS, DO(control) },
{ "find_limits_noheap", MOD_DAT, MOD_CTL, CTL_FINDLIMITS_NOHEAP, DO(control) },
{ "firstline", MOD_PAT, MOD_OPT, PCRE2_FIRSTLINE, PO(options) },
{ "framesize", MOD_PAT, MOD_CTL, CTL_FRAMESIZE, PO(control) },
{ "framesize", MOD_PAT, MOD_CTL, CTL2_FRAMESIZE, PO(control2) },
{ "fullbincode", MOD_PAT, MOD_CTL, CTL_FULLBINCODE, PO(control) },
{ "get", MOD_DAT, MOD_NN, DO(get_numbers), DO(get_names) },
{ "getall", MOD_DAT, MOD_CTL, CTL_GETALL, DO(control) },
@ -781,10 +783,11 @@ static modstruct modlist[] = {
#define PUSH_SUPPORTED_COMPILE_CONTROLS ( \
CTL_BINCODE|CTL_CALLOUT_INFO|CTL_FULLBINCODE|CTL_HEXPAT|CTL_INFO| \
CTL_JITVERIFY|CTL_MEMORY|CTL_FRAMESIZE|CTL_PUSH|CTL_PUSHCOPY| \
CTL_JITVERIFY|CTL_MEMORY|CTL_PUSH|CTL_PUSHCOPY| \
CTL_PUSHTABLESCOPY|CTL_USE_LENGTH)
#define PUSH_SUPPORTED_COMPILE_CONTROLS2 (CTL2_BSR_SET|CTL2_NL_SET)
#define PUSH_SUPPORTED_COMPILE_CONTROLS2 (CTL2_BSR_SET|CTL2_FRAMESIZE| \
CTL2_NL_SET)
/* Controls that apply only at compile time with 'push'. */
@ -813,8 +816,9 @@ static uint32_t exclusive_pat_controls[] = {
first control word. */
static uint32_t exclusive_dat_controls[] = {
CTL_ALLUSEDTEXT | CTL_STARTCHAR,
CTL_FINDLIMITS | CTL_NULLCONTEXT };
CTL_ALLUSEDTEXT | CTL_STARTCHAR,
CTL_FINDLIMITS | CTL_NULLCONTEXT,
CTL_FINDLIMITS_NOHEAP | CTL_NULLCONTEXT };
/* Table of single-character abbreviated modifiers. The index field is
initialized to -1, but the first time the modifier is encountered, it is filled
@ -927,7 +931,6 @@ static BOOL jit_was_used;
static BOOL restrict_for_perl_test = FALSE;
static BOOL show_memory = FALSE;
static int code_unit_size; /* Bytes */
static int jitrc; /* Return from JIT compile */
static int test_mode = DEFAULT_TEST_MODE;
static int timeit = 0;
@ -937,6 +940,7 @@ clock_t total_compile_time = 0;
clock_t total_jit_compile_time = 0;
clock_t total_match_time = 0;
static uint32_t code_unit_size; /* Bytes */
static uint32_t dfa_matched;
static uint32_t forbid_utf = 0;
static uint32_t maxlookbehind;
@ -1242,19 +1246,19 @@ are supported. */
#define PCRE2_MATCH_DATA_CREATE(a,b,c) \
if (test_mode == PCRE8_MODE) \
G(a,8) = pcre2_match_data_create_8(b,c); \
G(a,8) = pcre2_match_data_create_8(b,G(c,8)); \
else if (test_mode == PCRE16_MODE) \
G(a,16) = pcre2_match_data_create_16(b,c); \
G(a,16) = pcre2_match_data_create_16(b,G(c,16)); \
else \
G(a,32) = pcre2_match_data_create_32(b,c)
G(a,32) = pcre2_match_data_create_32(b,G(c,32))
#define PCRE2_MATCH_DATA_CREATE_FROM_PATTERN(a,b,c) \
if (test_mode == PCRE8_MODE) \
G(a,8) = pcre2_match_data_create_from_pattern_8(G(b,8),c); \
G(a,8) = pcre2_match_data_create_from_pattern_8(G(b,8),G(c,8)); \
else if (test_mode == PCRE16_MODE) \
G(a,16) = pcre2_match_data_create_from_pattern_16(G(b,16),c); \
G(a,16) = pcre2_match_data_create_from_pattern_16(G(b,16),G(c,16)); \
else \
G(a,32) = pcre2_match_data_create_from_pattern_32(G(b,32),c)
G(a,32) = pcre2_match_data_create_from_pattern_32(G(b,32),G(c,32))
#define PCRE2_MATCH_DATA_FREE(a) \
if (test_mode == PCRE8_MODE) \
@ -1762,15 +1766,15 @@ the three different cases. */
#define PCRE2_MATCH_DATA_CREATE(a,b,c) \
if (test_mode == G(G(PCRE,BITONE),_MODE)) \
G(a,BITONE) = G(pcre2_match_data_create_,BITONE)(b,c); \
G(a,BITONE) = G(pcre2_match_data_create_,BITONE)(b,G(c,BITONE)); \
else \
G(a,BITTWO) = G(pcre2_match_data_create_,BITTWO)(b,c)
G(a,BITTWO) = G(pcre2_match_data_create_,BITTWO)(b,G(c,BITTWO))
#define PCRE2_MATCH_DATA_CREATE_FROM_PATTERN(a,b,c) \
if (test_mode == G(G(PCRE,BITONE),_MODE)) \
G(a,BITONE) = G(pcre2_match_data_create_from_pattern_,BITONE)(G(b,BITONE),c); \
G(a,BITONE) = G(pcre2_match_data_create_from_pattern_,BITONE)(G(b,BITONE),G(c,BITONE)); \
else \
G(a,BITTWO) = G(pcre2_match_data_create_from_pattern_,BITTWO)(G(b,BITTWO),c)
G(a,BITTWO) = G(pcre2_match_data_create_from_pattern_,BITTWO)(G(b,BITTWO),G(c,BITTWO))
#define PCRE2_MATCH_DATA_FREE(a) \
if (test_mode == G(G(PCRE,BITONE),_MODE)) \
@ -2070,9 +2074,9 @@ the three different cases. */
#define PCRE2_MAKETABLES(a) a = pcre2_maketables_8(NULL)
#define PCRE2_MATCH(a,b,c,d,e,f,g,h) \
a = pcre2_match_8(G(b,8),(PCRE2_SPTR8)c,d,e,f,G(g,8),h)
#define PCRE2_MATCH_DATA_CREATE(a,b,c) G(a,8) = pcre2_match_data_create_8(b,c)
#define PCRE2_MATCH_DATA_CREATE(a,b,c) G(a,8) = pcre2_match_data_create_8(b,G(c,8))
#define PCRE2_MATCH_DATA_CREATE_FROM_PATTERN(a,b,c) \
G(a,8) = pcre2_match_data_create_from_pattern_8(G(b,8),c)
G(a,8) = pcre2_match_data_create_from_pattern_8(G(b,8),G(c,8))
#define PCRE2_MATCH_DATA_FREE(a) pcre2_match_data_free_8(G(a,8))
#define PCRE2_PATTERN_CONVERT(a,b,c,d,e,f,g) a = pcre2_pattern_convert_8(G(b,8),c,d,(PCRE2_UCHAR8 **)e,f,G(g,8))
#define PCRE2_PATTERN_INFO(a,b,c,d) a = pcre2_pattern_info_8(G(b,8),c,d)
@ -2177,9 +2181,9 @@ the three different cases. */
#define PCRE2_MAKETABLES(a) a = pcre2_maketables_16(NULL)
#define PCRE2_MATCH(a,b,c,d,e,f,g,h) \
a = pcre2_match_16(G(b,16),(PCRE2_SPTR16)c,d,e,f,G(g,16),h)
#define PCRE2_MATCH_DATA_CREATE(a,b,c) G(a,16) = pcre2_match_data_create_16(b,c)
#define PCRE2_MATCH_DATA_CREATE(a,b,c) G(a,16) = pcre2_match_data_create_16(b,G(c,16))
#define PCRE2_MATCH_DATA_CREATE_FROM_PATTERN(a,b,c) \
G(a,16) = pcre2_match_data_create_from_pattern_16(G(b,16),c)
G(a,16) = pcre2_match_data_create_from_pattern_16(G(b,16),G(c,16))
#define PCRE2_MATCH_DATA_FREE(a) pcre2_match_data_free_16(G(a,16))
#define PCRE2_PATTERN_CONVERT(a,b,c,d,e,f,g) a = pcre2_pattern_convert_16(G(b,16),c,d,(PCRE2_UCHAR16 **)e,f,G(g,16))
#define PCRE2_PATTERN_INFO(a,b,c,d) a = pcre2_pattern_info_16(G(b,16),c,d)
@ -2284,9 +2288,9 @@ the three different cases. */
#define PCRE2_MAKETABLES(a) a = pcre2_maketables_32(NULL)
#define PCRE2_MATCH(a,b,c,d,e,f,g,h) \
a = pcre2_match_32(G(b,32),(PCRE2_SPTR32)c,d,e,f,G(g,32),h)
#define PCRE2_MATCH_DATA_CREATE(a,b,c) G(a,32) = pcre2_match_data_create_32(b,c)
#define PCRE2_MATCH_DATA_CREATE(a,b,c) G(a,32) = pcre2_match_data_create_32(b,G(c,32))
#define PCRE2_MATCH_DATA_CREATE_FROM_PATTERN(a,b,c) \
G(a,32) = pcre2_match_data_create_from_pattern_32(G(b,32),c)
G(a,32) = pcre2_match_data_create_from_pattern_32(G(b,32),G(c,32))
#define PCRE2_MATCH_DATA_FREE(a) pcre2_match_data_free_32(G(a,32))
#define PCRE2_PATTERN_CONVERT(a,b,c,d,e,f,g) a = pcre2_pattern_convert_32(G(b,32),c,d,(PCRE2_UCHAR32 **)e,f,G(g,32))
#define PCRE2_PATTERN_INFO(a,b,c,d) a = pcre2_pattern_info_32(G(b,32),c,d)
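The macro changes above stop passing the third argument through unchanged and instead paste the code unit width onto it, so a caller can write `general_context` and get the width-specific variable. The sketch below assumes `G` is a two-level token-pasting macro, and it uses plain `int` variables as stand-ins for the real context pointers:

```c
#include <assert.h>

/* Two-level paste: the outer macro expands its arguments first, so a width
   given as a macro (DEMO_WIDTH) is replaced by its value before pasting. */
#define glue(a,b) a##b
#define G(a,b) glue(a,b)

/* Hypothetical stand-ins for the per-width general contexts. */
static int general_context8  = 8;
static int general_context16 = 16;
static int general_context32 = 32;

#define DEMO_WIDTH 16

static int pick8(void)    { return G(general_context, 8); }
static int pick32(void)   { return G(general_context, 32); }
static int pick_demo(void) { return G(general_context, DEMO_WIDTH); }
```

A consequence of this design is that a literal such as `NULL` can no longer be passed as the third argument, because `G(NULL,8)` would paste into a meaningless identifier. That matches the later hunk where the call sites switch from `NULL` to `general_context`.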
@ -2780,7 +2784,7 @@ return block;
static void my_free(void *block, void *data)
{
(void)data;
if (show_memory)
if (show_memory && block != NULL)
{
uint32_t i, j;
BOOL found = FALSE;
@ -4112,7 +4116,7 @@ Returns: nothing
static void
show_controls(uint32_t controls, uint32_t controls2, const char *before)
{
fprintf(outfile, "%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s",
fprintf(outfile, "%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s",
before,
((controls & CTL_AFTERTEXT) != 0)? " aftertext" : "",
((controls & CTL_ALLAFTERTEXT) != 0)? " allaftertext" : "",
@ -4130,7 +4134,8 @@ fprintf(outfile, "%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s
((controls & CTL_DFA) != 0)? " dfa" : "",
((controls & CTL_EXPAND) != 0)? " expand" : "",
((controls & CTL_FINDLIMITS) != 0)? " find_limits" : "",
((controls & CTL_FRAMESIZE) != 0)? " framesize" : "",
((controls & CTL_FINDLIMITS_NOHEAP) != 0)? " find_limits_noheap" : "",
((controls2 & CTL2_FRAMESIZE) != 0)? " framesize" : "",
((controls & CTL_FULLBINCODE) != 0)? " fullbincode" : "",
((controls & CTL_GETALL) != 0)? " getall" : "",
((controls & CTL_GLOBAL) != 0)? " global" : "",
@ -4307,12 +4312,18 @@ if (test_mode == PCRE32_MODE) cblock_size = sizeof(pcre2_real_code_32);
(void)pattern_info(PCRE2_INFO_SIZE, &size, FALSE);
(void)pattern_info(PCRE2_INFO_NAMECOUNT, &name_count, FALSE);
(void)pattern_info(PCRE2_INFO_NAMEENTRYSIZE, &name_entry_size, FALSE);
fprintf(outfile, "Memory allocation (code space): %d\n",
(int)(size - name_count*name_entry_size*code_unit_size - cblock_size));
/* The uint32_t variables are cast before multiplying to stop code analyzers
grumbling about potential overflow. */
fprintf(outfile, "Memory allocation (code space): %" SIZ_FORM "\n", size -
(size_t)name_count * (size_t)name_entry_size * (size_t)code_unit_size -
cblock_size);
if (pat_patctl.jit != 0)
{
(void)pattern_info(PCRE2_INFO_JITSIZE, &size, FALSE);
fprintf(outfile, "Memory allocation (JIT code): %d\n", (int)size);
fprintf(outfile, "Memory allocation (JIT code): %" SIZ_FORM "\n", size);
}
}
@ -4327,7 +4338,7 @@ show_framesize(void)
{
size_t frame_size;
(void)pattern_info(PCRE2_INFO_FRAMESIZE, &frame_size, FALSE);
fprintf(outfile, "Frame size for pcre2_match(): %d\n", (int)frame_size);
fprintf(outfile, "Frame size for pcre2_match(): %" SIZ_FORM "\n", frame_size);
}
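Two things change in the memory reporting above: the `uint32_t` factors are cast to `size_t` before multiplying, and the results are printed with a `size_t` format instead of being cast to `int`. A sketch of both follows. `DEMO_SIZ_FORM` is a stand-in defined here as `"zu"`; the real `SIZ_FORM` definition is not shown in this diff and may differ by platform:

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Assumption: a C99 library where %zu prints a size_t. */
#define DEMO_SIZ_FORM "zu"

/* Widen each 32-bit factor before multiplying so the product is computed
   in size_t rather than in 32-bit arithmetic. */
static size_t code_space(size_t total, uint32_t name_count,
  uint32_t name_entry_size, uint32_t code_unit_size, size_t cblock_size)
{
  return total -
    (size_t)name_count * (size_t)name_entry_size * (size_t)code_unit_size -
    cblock_size;
}

/* Format the value without narrowing it to int first. */
static int format_size(char *buf, size_t buflen, size_t value)
{
  return snprintf(buf, buflen,
    "Memory allocation (code space): %" DEMO_SIZ_FORM "\n", value);
}
```

Splitting the format string around the macro relies on adjacent string literals being concatenated by the compiler, the same way the diff writes `"%" SIZ_FORM "\n"`.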
@ -4749,19 +4760,19 @@ if ((pat_patctl.control & CTL_INFO) != 0)
if (pat_patctl.jit != 0 && (pat_patctl.control & CTL_JITVERIFY) != 0)
{
#ifdef SUPPORT_JIT
if (FLD(compiled_code, executable_jit) != NULL)
fprintf(outfile, "JIT compilation was successful\n");
else
{
#ifdef SUPPORT_JIT
fprintf(outfile, "JIT compilation was not successful");
if (jitrc != 0 && !print_error_message(jitrc, " (", ")"))
return PR_ABEND;
fprintf(outfile, "\n");
}
#else
fprintf(outfile, "JIT support is not available in this version of PCRE2\n");
#endif
}
}
}
@ -4980,7 +4991,7 @@ switch(cmd)
PCRE2_JIT_COMPILE(jitrc, compiled_code, pat_patctl.jit);
}
if ((pat_patctl.control & CTL_MEMORY) != 0) show_memory_info();
if ((pat_patctl.control & CTL_FRAMESIZE) != 0) show_framesize();
if ((pat_patctl.control2 & CTL2_FRAMESIZE) != 0) show_framesize();
if ((pat_patctl.control & CTL_ANYINFO) != 0)
{
rc = show_pattern_info();
@ -5942,7 +5953,7 @@ if ((pat_patctl.control2 & CTL2_NL_SET) != 0)
/* Output code size and other information if requested. */
if ((pat_patctl.control & CTL_MEMORY) != 0) show_memory_info();
if ((pat_patctl.control & CTL_FRAMESIZE) != 0) show_framesize();
if ((pat_patctl.control2 & CTL2_FRAMESIZE) != 0) show_framesize();
if ((pat_patctl.control & CTL_ANYINFO) != 0)
{
int rc = show_pattern_info();
@ -6021,10 +6032,46 @@ for (;;)
{
uint32_t stack_start = 0;
/* If we are checking the heap limit, free any frames vector that is cached
in the match_data so we always start without one. */
if (errnumber == PCRE2_ERROR_HEAPLIMIT)
{
PCRE2_SET_HEAP_LIMIT(dat_context, mid);
#ifdef SUPPORT_PCRE2_8
if (code_unit_size == 1)
{
match_data8->memctl.free(match_data8->heapframes,
match_data8->memctl.memory_data);
match_data8->heapframes = NULL;
match_data8->heapframes_size = 0;
}
#endif
#ifdef SUPPORT_PCRE2_16
if (code_unit_size == 2)
{
match_data16->memctl.free(match_data16->heapframes,
match_data16->memctl.memory_data);
match_data16->heapframes = NULL;
match_data16->heapframes_size = 0;
}
#endif
#ifdef SUPPORT_PCRE2_32
if (code_unit_size == 4)
{
match_data32->memctl.free(match_data32->heapframes,
match_data32->memctl.memory_data);
match_data32->heapframes = NULL;
match_data32->heapframes_size = 0;
}
#endif
}
/* No need to mess with the frames vector for match or depth limits. */
else if (errnumber == PCRE2_ERROR_MATCHLIMIT)
{
PCRE2_SET_MATCH_LIMIT(dat_context, mid);
@ -6034,6 +6081,8 @@ for (;;)
PCRE2_SET_DEPTH_LIMIT(dat_context, mid);
}
/* Do the appropriate match */
if ((dat_datctl.control & CTL_DFA) != 0)
{
stack_start = DFA_START_RWS_SIZE/1024;
@ -6052,7 +6101,6 @@ for (;;)
else
{
stack_start = START_FRAMES_SIZE/1024;
PCRE2_MATCH(capcount, compiled_code, pp, ulen, dat_datctl.offset,
dat_datctl.options, match_data, PTR(dat_context));
}
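The new comment in this hunk says the cached frames vector is freed before each heap-limit probe so that every probe starts without one. The sketch below illustrates why that matters for a limit search. It is not the search used by `check_match_limit()`: the matcher, its cache, and the plain binary search are all hypothetical, chosen only to show the effect of a stale cache:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical state: memory (in KiB) kept from an earlier run. */
typedef struct { uint32_t cached_kib; } demo_state;

/* Hypothetical matcher: needs `need` KiB. It succeeds if the cache left by
   a previous run is already large enough, or if the limit allows the
   allocation, in which case the allocation stays cached. */
static int demo_match(demo_state *st, uint32_t need, uint32_t limit)
{
  if (st->cached_kib >= need) return 1;
  if (limit < need) return 0;
  st->cached_kib = need;
  return 1;
}

/* Binary search for the smallest limit at which the match succeeds,
   assuming it succeeds at UINT32_MAX. With reset_cache != 0 the cache is
   dropped before every probe. */
static uint32_t find_min_limit(demo_state *st, uint32_t need, int reset_cache)
{
  uint32_t lo = 0, hi = UINT32_MAX;
  while (lo < hi)
  {
    uint32_t mid = lo + (hi - lo) / 2;
    if (reset_cache) st->cached_kib = 0;
    if (demo_match(st, need, mid)) hi = mid; else lo = mid + 1;
  }
  return lo;
}
```

In this model, a search without the reset reports a minimum of 0 as soon as one large probe has populated the cache, while the search with the reset reports the true requirement.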
@ -6757,8 +6805,6 @@ while ((c = *p++) != 0)
{
long li;
char *endptr;
size_t qoffset = CAST8VAR(q) - dbuffer;
size_t rep_offset = start_rep - dbuffer;
if (*p++ != '{')
{
@ -6792,6 +6838,8 @@ while ((c = *p++) != 0)
if (needlen >= dbuffer_size)
{
size_t qoffset = CAST8VAR(q) - dbuffer;
size_t rep_offset = start_rep - dbuffer;
while (needlen >= dbuffer_size) dbuffer_size *= 2;
dbuffer = (uint8_t *)realloc(dbuffer, dbuffer_size);
if (dbuffer == NULL)
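In this hunk the offsets of `q` and `start_rep` are now computed inside the branch that reallocates `dbuffer`, which is the only place they are needed. The general pattern is to save interior pointers as offsets before `realloc()` and rebuild them afterwards. A sketch with a hypothetical `demo_buf` type, assuming the starting size is non-zero and ignoring size overflow:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable buffer. */
typedef struct { unsigned char *base; size_t size; } demo_buf;

/* Ensure the buffer is larger than needlen, doubling as required, and
   re-base *cursor, which points somewhere inside the buffer. Returns 0 on
   success and -1 if realloc() fails (the old buffer is then still valid). */
static int demo_grow(demo_buf *b, size_t needlen, unsigned char **cursor)
{
  if (needlen >= b->size)
  {
    /* Take the offset while the old base pointer is still valid. */
    size_t offset = (size_t)(*cursor - b->base);
    size_t new_size = b->size;
    unsigned char *p;
    while (needlen >= new_size) new_size *= 2;
    p = realloc(b->base, new_size);
    if (p == NULL) return -1;
    b->base = p;
    b->size = new_size;
    *cursor = p + offset;   /* the block may have moved */
  }
  return 0;
}
```

Computing the offset only when a reallocation is about to happen keeps the pointer arithmetic next to the code that depends on it.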
@ -7270,7 +7318,8 @@ causes a new match data block to be obtained that exactly fits the pattern. */
if (dat_datctl.oveccount == 0)
{
PCRE2_MATCH_DATA_FREE(match_data);
PCRE2_MATCH_DATA_CREATE_FROM_PATTERN(match_data, compiled_code, NULL);
PCRE2_MATCH_DATA_CREATE_FROM_PATTERN(match_data, compiled_code,
general_context);
PCRE2_GET_OVECTOR_COUNT(max_oveccount, match_data);
}
else if (dat_datctl.oveccount <= max_oveccount)
@ -7281,7 +7330,7 @@ else
{
max_oveccount = dat_datctl.oveccount;
PCRE2_MATCH_DATA_FREE(match_data);
PCRE2_MATCH_DATA_CREATE(match_data, max_oveccount, NULL);
PCRE2_MATCH_DATA_CREATE(match_data, max_oveccount, general_context);
}
if (CASTVAR(void *, match_data) == NULL)
@ -7578,12 +7627,13 @@ for (gmatched = 0;; gmatched++)
limits are not relevant for JIT. The return from check_match_limit() is the
return from the final call to pcre2_match() or pcre2_dfa_match(). */
if ((dat_datctl.control & CTL_FINDLIMITS) != 0)
if ((dat_datctl.control & (CTL_FINDLIMITS|CTL_FINDLIMITS_NOHEAP)) != 0)
{
capcount = 0; /* This stops compiler warnings */
if (FLD(compiled_code, executable_jit) == NULL ||
(dat_datctl.options & PCRE2_NO_JIT) != 0)
if ((dat_datctl.control & CTL_FINDLIMITS_NOHEAP) == 0 &&
(FLD(compiled_code, executable_jit) == NULL ||
(dat_datctl.options & PCRE2_NO_JIT) != 0))
{
(void)check_match_limit(pp, arg_ulen, PCRE2_ERROR_HEAPLIMIT, "heap");
}
@ -8917,7 +8967,7 @@ while (argc > 1 && argv[op][0] == '-' && argv[op][1] != 0)
else if (strcmp(arg, "-S") == 0 && argc > 2 &&
((uli = strtoul(argv[op+1], &endptr, 10)), *endptr == 0))
{
#if defined(_WIN32) || defined(WIN32) || defined(__HAIKU__) || defined(NATIVE_ZOS) || defined(__VMS)
#if defined(_WIN32) || defined(WIN32) || defined(__HAIKU__) || defined(NATIVE_ZOS) || defined(__VMS) || defined(__amigaos4__)
fprintf(stderr, "pcre2test: -S is not supported on this OS\n");
exit(1);
#else
@ -9449,3 +9499,4 @@ return yield;
}
/* End of pcre2test.c */


@ -53,7 +53,8 @@ extern "C" {
/* #define SLJIT_CONFIG_PPC_64 1 */
/* #define SLJIT_CONFIG_MIPS_32 1 */
/* #define SLJIT_CONFIG_MIPS_64 1 */
/* #define SLJIT_CONFIG_SPARC_32 1 */
/* #define SLJIT_CONFIG_RISCV_32 1 */
/* #define SLJIT_CONFIG_RISCV_64 1 */
/* #define SLJIT_CONFIG_S390X 1 */
/* #define SLJIT_CONFIG_AUTO 1 */
@ -127,17 +128,6 @@ extern "C" {
#endif /* !SLJIT_EXECUTABLE_ALLOCATOR */
/* Force cdecl calling convention even if a better calling
convention (e.g. fastcall) is supported by the C compiler.
If this option is disabled (this is the default), functions
called from JIT should be defined with SLJIT_FUNC attribute.
Standard C functions can still be called by using the
SLJIT_CALL_CDECL jump type. */
#ifndef SLJIT_USE_CDECL_CALLING_CONVENTION
/* Disabled by default */
#define SLJIT_USE_CDECL_CALLING_CONVENTION 0
#endif
/* Return with error when an invalid argument is passed. */
#ifndef SLJIT_ARGUMENT_CHECKS
/* Disabled by default */


@ -59,7 +59,8 @@ extern "C" {
SLJIT_64BIT_ARCHITECTURE : 64 bit architecture
SLJIT_LITTLE_ENDIAN : little endian architecture
SLJIT_BIG_ENDIAN : big endian architecture
SLJIT_UNALIGNED : allows unaligned memory accesses for non-fpu operations (only!)
SLJIT_UNALIGNED : unaligned memory accesses for non-fpu operations are supported
SLJIT_FPU_UNALIGNED : unaligned memory accesses for fpu operations are supported
SLJIT_INDIRECT_CALL : see SLJIT_FUNC_ADDR() for more information
Constants:
@ -98,7 +99,8 @@ extern "C" {
+ (defined SLJIT_CONFIG_PPC_64 && SLJIT_CONFIG_PPC_64) \
+ (defined SLJIT_CONFIG_MIPS_32 && SLJIT_CONFIG_MIPS_32) \
+ (defined SLJIT_CONFIG_MIPS_64 && SLJIT_CONFIG_MIPS_64) \
+ (defined SLJIT_CONFIG_SPARC_32 && SLJIT_CONFIG_SPARC_32) \
+ (defined SLJIT_CONFIG_RISCV_32 && SLJIT_CONFIG_RISCV_32) \
+ (defined SLJIT_CONFIG_RISCV_64 && SLJIT_CONFIG_RISCV_64) \
+ (defined SLJIT_CONFIG_S390X && SLJIT_CONFIG_S390X) \
+ (defined SLJIT_CONFIG_AUTO && SLJIT_CONFIG_AUTO) \
+ (defined SLJIT_CONFIG_UNSUPPORTED && SLJIT_CONFIG_UNSUPPORTED) >= 2
@ -115,7 +117,8 @@ extern "C" {
&& !(defined SLJIT_CONFIG_PPC_64 && SLJIT_CONFIG_PPC_64) \
&& !(defined SLJIT_CONFIG_MIPS_32 && SLJIT_CONFIG_MIPS_32) \
&& !(defined SLJIT_CONFIG_MIPS_64 && SLJIT_CONFIG_MIPS_64) \
&& !(defined SLJIT_CONFIG_SPARC_32 && SLJIT_CONFIG_SPARC_32) \
&& !(defined SLJIT_CONFIG_RISCV_32 && SLJIT_CONFIG_RISCV_32) \
&& !(defined SLJIT_CONFIG_RISCV_64 && SLJIT_CONFIG_RISCV_64) \
&& !(defined SLJIT_CONFIG_S390X && SLJIT_CONFIG_S390X) \
&& !(defined SLJIT_CONFIG_UNSUPPORTED && SLJIT_CONFIG_UNSUPPORTED) \
&& !(defined SLJIT_CONFIG_AUTO && SLJIT_CONFIG_AUTO)
@ -156,8 +159,10 @@ extern "C" {
#define SLJIT_CONFIG_MIPS_32 1
#elif defined(__mips64)
#define SLJIT_CONFIG_MIPS_64 1
#elif (defined(__sparc__) || defined(__sparc)) && !defined(_LP64)
#define SLJIT_CONFIG_SPARC_32 1
#elif defined (__riscv_xlen) && (__riscv_xlen == 32)
#define SLJIT_CONFIG_RISCV_32 1
#elif defined (__riscv_xlen) && (__riscv_xlen == 64)
#define SLJIT_CONFIG_RISCV_64 1
#elif defined(__s390x__)
#define SLJIT_CONFIG_S390X 1
#else
@ -205,8 +210,8 @@ extern "C" {
#define SLJIT_CONFIG_PPC 1
#elif (defined SLJIT_CONFIG_MIPS_32 && SLJIT_CONFIG_MIPS_32) || (defined SLJIT_CONFIG_MIPS_64 && SLJIT_CONFIG_MIPS_64)
#define SLJIT_CONFIG_MIPS 1
#elif (defined SLJIT_CONFIG_SPARC_32 && SLJIT_CONFIG_SPARC_32) || (defined SLJIT_CONFIG_SPARC_64 && SLJIT_CONFIG_SPARC_64)
#define SLJIT_CONFIG_SPARC 1
#elif (defined SLJIT_CONFIG_RISCV_32 && SLJIT_CONFIG_RISCV_32) || (defined SLJIT_CONFIG_RISCV_64 && SLJIT_CONFIG_RISCV_64)
#define SLJIT_CONFIG_RISCV 1
#endif
/***********************************************************/
@ -330,8 +335,14 @@ extern "C" {
* older versions are known to abort in some targets
* https://github.com/PhilipHazel/pcre2/issues/92
*
* beware APPLE is known to have removed the code in iOS so
* it will need to be excempted or result in broken builds
* beware some vendors (ex: Microsoft, Apple) are known to have
* removed the code to support this builtin even if the call for
* __has_builtin reports it is available.
*
* make sure linking doesn't fail because __clear_cache() is
* missing before changing it or add an exception so that the
* system provided method that should be defined below is used
* instead.
*/
#if (!defined SLJIT_CACHE_FLUSH && defined __has_builtin)
#if __has_builtin(__builtin___clear_cache) && !defined(__clang__)
@ -339,9 +350,9 @@ extern "C" {
/*
* https://gcc.gnu.org/bugzilla//show_bug.cgi?id=91248
* https://gcc.gnu.org/bugzilla//show_bug.cgi?id=93811
* gcc's clear_cache builtin for power and sparc are broken
* gcc's clear_cache builtin for power is broken
*/
#if !defined(SLJIT_CONFIG_PPC) && !defined(SLJIT_CONFIG_SPARC_32)
#if !defined(SLJIT_CONFIG_PPC)
#define SLJIT_CACHE_FLUSH(from, to) \
__builtin___clear_cache((char*)(from), (char*)(to))
#endif
@ -373,12 +384,10 @@ extern "C" {
ppc_cache_flush((from), (to))
#define SLJIT_CACHE_FLUSH_OWN_IMPL 1
#elif (defined SLJIT_CONFIG_SPARC_32 && SLJIT_CONFIG_SPARC_32)
#elif defined(_WIN32)
/* The __clear_cache() implementation of GCC is a dummy function on Sparc. */
#define SLJIT_CACHE_FLUSH(from, to) \
sparc_cache_flush((from), (to))
#define SLJIT_CACHE_FLUSH_OWN_IMPL 1
FlushInstructionCache(GetCurrentProcess(), (void*)(from), (char*)(to) - (char*)(from))
#elif (defined(__GNUC__) && (__GNUC__ >= 5 || (__GNUC__ == 4 && __GNUC_MINOR__ >= 3))) || defined(__clang__)
@ -392,11 +401,6 @@ extern "C" {
#define SLJIT_CACHE_FLUSH(from, to) \
cacheflush((long)(from), (long)(to), 0)
#elif defined _WIN32
#define SLJIT_CACHE_FLUSH(from, to) \
FlushInstructionCache(GetCurrentProcess(), (void*)(from), (char*)(to) - (char*)(from))
#else
/* Call __ARM_NR_cacheflush on ARM-Linux or the corresponding MIPS syscall. */
@ -435,6 +439,7 @@ typedef long int sljit_sw;
&& !(defined SLJIT_CONFIG_ARM_64 && SLJIT_CONFIG_ARM_64) \
&& !(defined SLJIT_CONFIG_PPC_64 && SLJIT_CONFIG_PPC_64) \
&& !(defined SLJIT_CONFIG_MIPS_64 && SLJIT_CONFIG_MIPS_64) \
&& !(defined SLJIT_CONFIG_RISCV_64 && SLJIT_CONFIG_RISCV_64) \
&& !(defined SLJIT_CONFIG_S390X && SLJIT_CONFIG_S390X)
#define SLJIT_32BIT_ARCHITECTURE 1
#define SLJIT_WORD_SHIFT 2
@ -495,8 +500,7 @@ typedef double sljit_f64;
#if !defined(SLJIT_BIG_ENDIAN) && !defined(SLJIT_LITTLE_ENDIAN)
/* These macros are mostly useful for the applications. */
#if (defined SLJIT_CONFIG_PPC_32 && SLJIT_CONFIG_PPC_32) \
|| (defined SLJIT_CONFIG_PPC_64 && SLJIT_CONFIG_PPC_64)
#if (defined SLJIT_CONFIG_PPC && SLJIT_CONFIG_PPC)
#ifdef __LITTLE_ENDIAN__
#define SLJIT_LITTLE_ENDIAN 1
@ -504,8 +508,7 @@ typedef double sljit_f64;
#define SLJIT_BIG_ENDIAN 1
#endif
#elif (defined SLJIT_CONFIG_MIPS_32 && SLJIT_CONFIG_MIPS_32) \
|| (defined SLJIT_CONFIG_MIPS_64 && SLJIT_CONFIG_MIPS_64)
#elif (defined SLJIT_CONFIG_MIPS && SLJIT_CONFIG_MIPS)
#ifdef __MIPSEL__
#define SLJIT_LITTLE_ENDIAN 1
@ -532,8 +535,7 @@ typedef double sljit_f64;
#endif /* !SLJIT_MIPS_REV */
#elif (defined SLJIT_CONFIG_SPARC_32 && SLJIT_CONFIG_SPARC_32) \
|| (defined SLJIT_CONFIG_S390X && SLJIT_CONFIG_S390X)
#elif (defined SLJIT_CONFIG_S390X && SLJIT_CONFIG_S390X)
#define SLJIT_BIG_ENDIAN 1
@ -554,19 +556,30 @@ typedef double sljit_f64;
#ifndef SLJIT_UNALIGNED
#if (defined SLJIT_CONFIG_X86_32 && SLJIT_CONFIG_X86_32) \
|| (defined SLJIT_CONFIG_X86_64 && SLJIT_CONFIG_X86_64) \
#if (defined SLJIT_CONFIG_X86 && SLJIT_CONFIG_X86) \
|| (defined SLJIT_CONFIG_ARM_V7 && SLJIT_CONFIG_ARM_V7) \
|| (defined SLJIT_CONFIG_ARM_THUMB2 && SLJIT_CONFIG_ARM_THUMB2) \
|| (defined SLJIT_CONFIG_ARM_64 && SLJIT_CONFIG_ARM_64) \
|| (defined SLJIT_CONFIG_PPC_32 && SLJIT_CONFIG_PPC_32) \
|| (defined SLJIT_CONFIG_PPC_64 && SLJIT_CONFIG_PPC_64) \
|| (defined SLJIT_CONFIG_PPC && SLJIT_CONFIG_PPC) \
|| (defined SLJIT_CONFIG_RISCV && SLJIT_CONFIG_RISCV) \
|| (defined SLJIT_CONFIG_S390X && SLJIT_CONFIG_S390X)
#define SLJIT_UNALIGNED 1
#endif
#endif /* !SLJIT_UNALIGNED */
#ifndef SLJIT_FPU_UNALIGNED
#if (defined SLJIT_CONFIG_X86 && SLJIT_CONFIG_X86) \
|| (defined SLJIT_CONFIG_ARM_64 && SLJIT_CONFIG_ARM_64) \
|| (defined SLJIT_CONFIG_PPC && SLJIT_CONFIG_PPC) \
|| (defined SLJIT_CONFIG_RISCV && SLJIT_CONFIG_RISCV) \
|| (defined SLJIT_CONFIG_S390X && SLJIT_CONFIG_S390X)
#define SLJIT_FPU_UNALIGNED 1
#endif
#endif /* !SLJIT_FPU_UNALIGNED */
#if (defined SLJIT_CONFIG_X86_32 && SLJIT_CONFIG_X86_32)
/* Auto detect SSE2 support using CPUID.
On 64 bit x86 cpus, sse2 must be present. */
@ -578,38 +591,7 @@ typedef double sljit_f64;
/*****************************************************************************************/
#ifndef SLJIT_FUNC
#if (defined SLJIT_USE_CDECL_CALLING_CONVENTION && SLJIT_USE_CDECL_CALLING_CONVENTION) \
|| !(defined SLJIT_CONFIG_X86_32 && SLJIT_CONFIG_X86_32)
#define SLJIT_FUNC
#elif defined(__GNUC__) && !defined(__APPLE__)
#if __GNUC__ > 3 || (__GNUC__ == 3 && __GNUC_MINOR__ >= 4)
#define SLJIT_FUNC __attribute__ ((fastcall))
#define SLJIT_X86_32_FASTCALL 1
#else
#define SLJIT_FUNC
#endif /* gcc >= 3.4 */
#elif defined(_MSC_VER)
#define SLJIT_FUNC __fastcall
#define SLJIT_X86_32_FASTCALL 1
#elif defined(__BORLANDC__)
#define SLJIT_FUNC __msfastcall
#define SLJIT_X86_32_FASTCALL 1
#else /* Unknown compiler. */
/* The cdecl calling convention is usually the x86 default. */
#define SLJIT_FUNC
#endif /* SLJIT_USE_CDECL_CALLING_CONVENTION */
#endif /* !SLJIT_FUNC */
#ifndef SLJIT_INDIRECT_CALL
@ -624,11 +606,7 @@ typedef double sljit_f64;
/* The offset which needs to be subtracted from the return address to
determine the next executed instruction after return. */
#ifndef SLJIT_RETURN_ADDRESS_OFFSET
#if (defined SLJIT_CONFIG_SPARC_32 && SLJIT_CONFIG_SPARC_32)
#define SLJIT_RETURN_ADDRESS_OFFSET 8
#else
#define SLJIT_RETURN_ADDRESS_OFFSET 0
#endif
#endif /* SLJIT_RETURN_ADDRESS_OFFSET */
/***************************************************/
@ -740,17 +718,13 @@ SLJIT_API_FUNC_ATTRIBUTE sljit_sw sljit_exec_offset(void* ptr);
#define SLJIT_NUMBER_OF_SAVED_FLOAT_REGISTERS 8
#endif
#elif (defined SLJIT_CONFIG_SPARC && SLJIT_CONFIG_SPARC)
#elif (defined SLJIT_CONFIG_RISCV && SLJIT_CONFIG_RISCV)
#define SLJIT_NUMBER_OF_REGISTERS 18
#define SLJIT_NUMBER_OF_SAVED_REGISTERS 14
#define SLJIT_NUMBER_OF_FLOAT_REGISTERS 14
#define SLJIT_NUMBER_OF_SAVED_FLOAT_REGISTERS 0
#if (defined SLJIT_CONFIG_SPARC_32 && SLJIT_CONFIG_SPARC_32)
/* saved registers (16), return struct pointer (1), space for 6 argument words (1),
4th double arg (2), double alignment (1). */
#define SLJIT_LOCALS_OFFSET_BASE ((16 + 1 + 6 + 2 + 1) * (sljit_s32)sizeof(sljit_sw))
#endif
#define SLJIT_NUMBER_OF_REGISTERS 23
#define SLJIT_NUMBER_OF_SAVED_REGISTERS 12
#define SLJIT_LOCALS_OFFSET_BASE 0
#define SLJIT_NUMBER_OF_FLOAT_REGISTERS 30
#define SLJIT_NUMBER_OF_SAVED_FLOAT_REGISTERS 12
#elif (defined SLJIT_CONFIG_S390X && SLJIT_CONFIG_S390X)
@ -806,7 +780,7 @@ SLJIT_API_FUNC_ATTRIBUTE sljit_sw sljit_exec_offset(void* ptr);
#if (defined SLJIT_CONFIG_ARM && SLJIT_CONFIG_ARM) \
|| (defined SLJIT_CONFIG_PPC && SLJIT_CONFIG_PPC) \
|| (defined SLJIT_CONFIG_MIPS && SLJIT_CONFIG_MIPS) \
|| (defined SLJIT_CONFIG_SPARC && SLJIT_CONFIG_SPARC) \
|| (defined SLJIT_CONFIG_RISCV && SLJIT_CONFIG_RISCV) \
|| (defined SLJIT_CONFIG_S390X && SLJIT_CONFIG_S390X)
#define SLJIT_HAS_STATUS_FLAGS_STATE 1
#endif


@ -133,6 +133,9 @@
#define SLJIT_ARG_MASK 0x7
#define SLJIT_ARG_FULL_MASK (SLJIT_ARG_MASK | SLJIT_ARG_TYPE_SCRATCH_REG)
/* Mask for sljit_emit_enter. */
#define SLJIT_KEPT_SAVEDS_COUNT(options) ((options) & 0x3)
/* Jump flags. */
#define JUMP_LABEL 0x1
#define JUMP_ADDR 0x2
@ -145,16 +148,16 @@
# define PATCH_MD 0x10
#endif
# define TYPE_SHIFT 13
#endif
#endif /* SLJIT_CONFIG_X86 */
#if (defined SLJIT_CONFIG_ARM_V5 && SLJIT_CONFIG_ARM_V5) || (defined SLJIT_CONFIG_ARM_V7 && SLJIT_CONFIG_ARM_V7)
# define IS_BL 0x4
# define PATCH_B 0x8
#endif
#endif /* SLJIT_CONFIG_ARM_V5 || SLJIT_CONFIG_ARM_V7 */
#if (defined SLJIT_CONFIG_ARM_V5 && SLJIT_CONFIG_ARM_V5)
# define CPOOL_SIZE 512
#endif
#endif /* SLJIT_CONFIG_ARM_V5 */
#if (defined SLJIT_CONFIG_ARM_THUMB2 && SLJIT_CONFIG_ARM_THUMB2)
# define IS_COND 0x04
@ -172,7 +175,7 @@
/* BL + imm24 */
# define PATCH_BL 0x60
/* 0xf00 cc code for branches */
#endif
#endif /* SLJIT_CONFIG_ARM_THUMB2 */
#if (defined SLJIT_CONFIG_ARM_64 && SLJIT_CONFIG_ARM_64)
# define IS_COND 0x004
@ -182,7 +185,7 @@
# define PATCH_COND 0x040
# define PATCH_ABS48 0x080
# define PATCH_ABS64 0x100
#endif
#endif /* SLJIT_CONFIG_ARM_64 */
#if (defined SLJIT_CONFIG_PPC && SLJIT_CONFIG_PPC)
# define IS_COND 0x004
@ -192,9 +195,9 @@
#if (defined SLJIT_CONFIG_PPC_64 && SLJIT_CONFIG_PPC_64)
# define PATCH_ABS32 0x040
# define PATCH_ABS48 0x080
#endif
#endif /* SLJIT_CONFIG_PPC_64 */
# define REMOVE_COND 0x100
#endif
#endif /* SLJIT_CONFIG_PPC */
#if (defined SLJIT_CONFIG_MIPS && SLJIT_CONFIG_MIPS)
# define IS_MOVABLE 0x004
@ -212,7 +215,7 @@
#if (defined SLJIT_CONFIG_MIPS_64 && SLJIT_CONFIG_MIPS_64)
# define PATCH_ABS32 0x400
# define PATCH_ABS48 0x800
#endif
#endif /* SLJIT_CONFIG_MIPS_64 */
/* instruction types */
# define MOVABLE_INS 0
@ -221,28 +224,24 @@
# define UNMOVABLE_INS 32
/* FPU status register */
# define FCSR_FCC 33
#endif
#endif /* SLJIT_CONFIG_MIPS */
#if (defined SLJIT_CONFIG_SPARC_32 && SLJIT_CONFIG_SPARC_32)
# define IS_MOVABLE 0x04
# define IS_COND 0x08
# define IS_CALL 0x10
#if (defined SLJIT_CONFIG_RISCV && SLJIT_CONFIG_RISCV)
# define IS_COND 0x004
# define IS_CALL 0x008
# define PATCH_B 0x20
# define PATCH_CALL 0x40
# define PATCH_B 0x010
# define PATCH_J 0x020
/* instruction types */
# define MOVABLE_INS 0
/* 1 - 31 last destination register */
/* no destination (i.e: store) */
# define UNMOVABLE_INS 32
# define DST_INS_MASK 0xff
/* ICC_SET is the same as SET_FLAGS. */
# define ICC_IS_SET (1 << 23)
# define FCC_IS_SET (1 << 24)
#endif
#if (defined SLJIT_CONFIG_RISCV_64 && SLJIT_CONFIG_RISCV_64)
# define PATCH_REL32 0x040
# define PATCH_ABS32 0x080
# define PATCH_ABS44 0x100
# define PATCH_ABS52 0x200
#else /* !SLJIT_CONFIG_RISCV_64 */
# define PATCH_REL32 0x0
#endif /* SLJIT_CONFIG_RISCV_64 */
#endif /* SLJIT_CONFIG_RISCV */
/* Stack management. */
@ -385,7 +384,7 @@ SLJIT_API_FUNC_ATTRIBUTE struct sljit_compiler* sljit_create_compiler(void *allo
invalid_integer_types);
SLJIT_COMPILE_ASSERT(SLJIT_REWRITABLE_JUMP != SLJIT_32,
rewritable_jump_and_single_op_must_not_be_the_same);
SLJIT_COMPILE_ASSERT(!(SLJIT_EQUAL & 0x1) && !(SLJIT_LESS & 0x1) && !(SLJIT_EQUAL_F64 & 0x1) && !(SLJIT_JUMP & 0x1),
SLJIT_COMPILE_ASSERT(!(SLJIT_EQUAL & 0x1) && !(SLJIT_LESS & 0x1) && !(SLJIT_F_EQUAL & 0x1) && !(SLJIT_JUMP & 0x1),
conditional_flags_must_be_even_numbers);
/* Only the non-zero members must be set. */
@ -437,10 +436,6 @@ SLJIT_API_FUNC_ATTRIBUTE struct sljit_compiler* sljit_create_compiler(void *allo
compiler->delay_slot = UNMOVABLE_INS;
#endif
#if (defined SLJIT_CONFIG_SPARC_32 && SLJIT_CONFIG_SPARC_32)
compiler->delay_slot = UNMOVABLE_INS;
#endif
#if (defined SLJIT_ARGUMENT_CHECKS && SLJIT_ARGUMENT_CHECKS) \
|| (defined SLJIT_DEBUG && SLJIT_DEBUG)
compiler->last_flags = 0;
@ -822,6 +817,9 @@ static sljit_s32 function_check_src_mem(struct sljit_compiler *compiler, sljit_s
if (!(p & SLJIT_MEM))
return 0;
if (p == SLJIT_MEM1(SLJIT_SP))
return (i >= 0 && i < compiler->logical_local_size);
if (!(!(p & REG_MASK) || FUNCTION_CHECK_IS_REG(p & REG_MASK)))
return 0;
@ -859,9 +857,6 @@ static sljit_s32 function_check_src(struct sljit_compiler *compiler, sljit_s32 p
if (p == SLJIT_IMM)
return 1;
if (p == SLJIT_MEM1(SLJIT_SP))
return (i >= 0 && i < compiler->logical_local_size);
return function_check_src_mem(compiler, p, i);
}
@ -876,9 +871,6 @@ static sljit_s32 function_check_dst(struct sljit_compiler *compiler, sljit_s32 p
if (FUNCTION_CHECK_IS_REG(p))
return (i == 0);
if (p == SLJIT_MEM1(SLJIT_SP))
return (i >= 0 && i < compiler->logical_local_size);
return function_check_src_mem(compiler, p, i);
}
@ -893,9 +885,6 @@ static sljit_s32 function_fcheck(struct sljit_compiler *compiler, sljit_s32 p, s
if (FUNCTION_CHECK_IS_FREG(p))
return (i == 0);
if (p == SLJIT_MEM1(SLJIT_SP))
return (i >= 0 && i < compiler->logical_local_size);
return function_check_src_mem(compiler, p, i);
}
@ -913,7 +902,11 @@ SLJIT_API_FUNC_ATTRIBUTE void sljit_compiler_verbose(struct sljit_compiler *comp
#if (defined SLJIT_64BIT_ARCHITECTURE && SLJIT_64BIT_ARCHITECTURE)
#ifdef _WIN64
#ifdef __GNUC__
# define SLJIT_PRINT_D "ll"
#else
# define SLJIT_PRINT_D "I64"
#endif
#else
# define SLJIT_PRINT_D "l"
#endif
@ -1020,10 +1013,6 @@ static const char* fop2_names[] = {
"add", "sub", "mul", "div"
};
#define JUMP_POSTFIX(type) \
((type & 0xff) <= SLJIT_NOT_OVERFLOW ? ((type & SLJIT_32) ? "32" : "") \
: ((type & 0xff) <= SLJIT_ORDERED_F64 ? ((type & SLJIT_32) ? ".f32" : ".f64") : ""))
static const char* jump_names[] = {
"equal", "not_equal",
"less", "greater_equal",
@ -1032,12 +1021,18 @@ static const char* jump_names[] = {
"sig_greater", "sig_less_equal",
"overflow", "not_overflow",
"carry", "",
"equal", "not_equal",
"less", "greater_equal",
"greater", "less_equal",
"f_equal", "f_not_equal",
"f_less", "f_greater_equal",
"f_greater", "f_less_equal",
"unordered", "ordered",
"ordered_equal", "unordered_or_not_equal",
"ordered_less", "unordered_or_greater_equal",
"ordered_greater", "unordered_or_less_equal",
"unordered_or_equal", "ordered_not_equal",
"unordered_or_less", "ordered_greater_equal",
"unordered_or_greater", "ordered_less_equal",
"jump", "fast_call",
"call", "call.cdecl"
"call", "call_reg_arg"
};
static const char* call_arg_names[] = {
@ -1053,6 +1048,8 @@ static const char* call_arg_names[] = {
#if (defined SLJIT_ARGUMENT_CHECKS && SLJIT_ARGUMENT_CHECKS) \
|| (defined SLJIT_VERBOSE && SLJIT_VERBOSE)
#define SLJIT_SKIP_CHECKS(compiler) (compiler)->skip_checks = 1
static SLJIT_INLINE CHECK_RETURN_TYPE check_sljit_generate_code(struct sljit_compiler *compiler)
{
#if (defined SLJIT_ARGUMENT_CHECKS && SLJIT_ARGUMENT_CHECKS)
@ -1080,7 +1077,12 @@ static SLJIT_INLINE CHECK_RETURN_TYPE check_sljit_emit_enter(struct sljit_compil
SLJIT_UNUSED_ARG(compiler);
#if (defined SLJIT_ARGUMENT_CHECKS && SLJIT_ARGUMENT_CHECKS)
CHECK_ARGUMENT(!(options & ~SLJIT_ENTER_CDECL));
if (options & SLJIT_ENTER_REG_ARG) {
CHECK_ARGUMENT(!(options & ~(0x3 | SLJIT_ENTER_REG_ARG)));
} else {
CHECK_ARGUMENT(options == 0);
}
CHECK_ARGUMENT(SLJIT_KEPT_SAVEDS_COUNT(options) <= 3 && SLJIT_KEPT_SAVEDS_COUNT(options) <= saveds);
CHECK_ARGUMENT(scratches >= 0 && scratches <= SLJIT_NUMBER_OF_REGISTERS);
CHECK_ARGUMENT(saveds >= 0 && saveds <= SLJIT_NUMBER_OF_SAVED_REGISTERS);
CHECK_ARGUMENT(scratches + saveds <= SLJIT_NUMBER_OF_REGISTERS);
@ -1089,7 +1091,7 @@ static SLJIT_INLINE CHECK_RETURN_TYPE check_sljit_emit_enter(struct sljit_compil
CHECK_ARGUMENT(fscratches + fsaveds <= SLJIT_NUMBER_OF_FLOAT_REGISTERS);
CHECK_ARGUMENT(local_size >= 0 && local_size <= SLJIT_MAX_LOCAL_SIZE);
CHECK_ARGUMENT((arg_types & SLJIT_ARG_FULL_MASK) < SLJIT_ARG_TYPE_F64);
CHECK_ARGUMENT(function_check_arguments(arg_types, scratches, saveds, fscratches));
CHECK_ARGUMENT(function_check_arguments(arg_types, scratches, (options & SLJIT_ENTER_REG_ARG) ? 0 : saveds, fscratches));
compiler->last_flags = 0;
#endif
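Review note: the new option checks in `check_sljit_emit_enter` can be read as a small predicate. The sketch below is a standalone illustration, not sljit code. `ENTER_REG_ARG` is given a hypothetical value here (the real `SLJIT_ENTER_REG_ARG` is defined in sljitLir.h and may differ), and `enter_options_valid` is a made-up helper name. The `<= 3` test from the diff is left out because the `0x3` mask already guarantees it.

```c
#include <stdint.h>

/* Hypothetical flag value, for illustration only; the real
   SLJIT_ENTER_REG_ARG is defined in sljitLir.h and may differ. */
#define ENTER_REG_ARG 0x4

/* Mirrors SLJIT_KEPT_SAVEDS_COUNT from the diff: the low two bits. */
#define KEPT_SAVEDS_COUNT(options) ((options) & 0x3)

/* Returns 1 when the options word passes the checks shown in the diff:
   a kept-saveds count is only allowed together with ENTER_REG_ARG, no
   other bits may be set, and the count may not exceed saveds. */
int enter_options_valid(int32_t options, int32_t saveds)
{
    if (options & ENTER_REG_ARG) {
        if (options & ~(0x3 | ENTER_REG_ARG))
            return 0;
    } else if (options != 0) {
        return 0;
    }
    return KEPT_SAVEDS_COUNT(options) <= saveds;
}
```

Under these assumptions, `enter_options_valid(ENTER_REG_ARG | 2, 3)` is accepted, while a bare kept count such as `enter_options_valid(2, 3)` is rejected.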
@ -1109,8 +1111,16 @@ static SLJIT_INLINE CHECK_RETURN_TYPE check_sljit_emit_enter(struct sljit_compil
} while (arg_types);
}
fprintf(compiler->verbose, "],%s scratches:%d, saveds:%d, fscratches:%d, fsaveds:%d, local_size:%d\n",
(options & SLJIT_ENTER_CDECL) ? " enter:cdecl," : "",
fprintf(compiler->verbose, "],");
if (options & SLJIT_ENTER_REG_ARG) {
fprintf(compiler->verbose, " enter:reg_arg,");
if (SLJIT_KEPT_SAVEDS_COUNT(options) > 0)
fprintf(compiler->verbose, " keep:%d,", SLJIT_KEPT_SAVEDS_COUNT(options));
}
fprintf(compiler->verbose, "scratches:%d, saveds:%d, fscratches:%d, fsaveds:%d, local_size:%d\n",
scratches, saveds, fscratches, fsaveds, local_size);
}
#endif
@ -1124,7 +1134,12 @@ static SLJIT_INLINE CHECK_RETURN_TYPE check_sljit_set_context(struct sljit_compi
SLJIT_UNUSED_ARG(compiler);
#if (defined SLJIT_ARGUMENT_CHECKS && SLJIT_ARGUMENT_CHECKS)
CHECK_ARGUMENT(!(options & ~SLJIT_ENTER_CDECL));
if (options & SLJIT_ENTER_REG_ARG) {
CHECK_ARGUMENT(!(options & ~(0x3 | SLJIT_ENTER_REG_ARG)));
} else {
CHECK_ARGUMENT(options == 0);
}
CHECK_ARGUMENT(SLJIT_KEPT_SAVEDS_COUNT(options) <= 3 && SLJIT_KEPT_SAVEDS_COUNT(options) <= saveds);
CHECK_ARGUMENT(scratches >= 0 && scratches <= SLJIT_NUMBER_OF_REGISTERS);
CHECK_ARGUMENT(saveds >= 0 && saveds <= SLJIT_NUMBER_OF_SAVED_REGISTERS);
CHECK_ARGUMENT(scratches + saveds <= SLJIT_NUMBER_OF_REGISTERS);
@ -1133,7 +1148,7 @@ static SLJIT_INLINE CHECK_RETURN_TYPE check_sljit_set_context(struct sljit_compi
CHECK_ARGUMENT(fscratches + fsaveds <= SLJIT_NUMBER_OF_FLOAT_REGISTERS);
CHECK_ARGUMENT(local_size >= 0 && local_size <= SLJIT_MAX_LOCAL_SIZE);
CHECK_ARGUMENT((arg_types & SLJIT_ARG_FULL_MASK) < SLJIT_ARG_TYPE_F64);
CHECK_ARGUMENT(function_check_arguments(arg_types, scratches, saveds, fscratches));
CHECK_ARGUMENT(function_check_arguments(arg_types, scratches, (options & SLJIT_ENTER_REG_ARG) ? 0 : saveds, fscratches));
compiler->last_flags = 0;
#endif
@ -1153,8 +1168,16 @@ static SLJIT_INLINE CHECK_RETURN_TYPE check_sljit_set_context(struct sljit_compi
} while (arg_types);
}
fprintf(compiler->verbose, "],%s scratches:%d, saveds:%d, fscratches:%d, fsaveds:%d, local_size:%d\n",
(options & SLJIT_ENTER_CDECL) ? " enter:cdecl," : "",
fprintf(compiler->verbose, "],");
if (options & SLJIT_ENTER_REG_ARG) {
fprintf(compiler->verbose, " enter:reg_arg,");
if (SLJIT_KEPT_SAVEDS_COUNT(options) > 0)
fprintf(compiler->verbose, " keep:%d,", SLJIT_KEPT_SAVEDS_COUNT(options));
}
fprintf(compiler->verbose, " scratches:%d, saveds:%d, fscratches:%d, fsaveds:%d, local_size:%d\n",
scratches, saveds, fscratches, fsaveds, local_size);
}
#endif
@ -1510,7 +1533,7 @@ static SLJIT_INLINE CHECK_RETURN_TYPE check_sljit_emit_fop1_cmp(struct sljit_com
sljit_s32 src2, sljit_sw src2w)
{
#if (defined SLJIT_ARGUMENT_CHECKS && SLJIT_ARGUMENT_CHECKS)
compiler->last_flags = GET_FLAG_TYPE(op) | (op & (SLJIT_32 | SLJIT_SET_Z));
compiler->last_flags = GET_FLAG_TYPE(op) | (op & SLJIT_32);
#endif
if (SLJIT_UNLIKELY(compiler->skip_checks)) {
@ -1523,7 +1546,7 @@ static SLJIT_INLINE CHECK_RETURN_TYPE check_sljit_emit_fop1_cmp(struct sljit_com
CHECK_ARGUMENT(GET_OPCODE(op) == SLJIT_CMP_F64);
CHECK_ARGUMENT(!(op & SLJIT_SET_Z));
CHECK_ARGUMENT((op & VARIABLE_FLAG_MASK)
|| (GET_FLAG_TYPE(op) >= SLJIT_EQUAL_F64 && GET_FLAG_TYPE(op) <= SLJIT_ORDERED_F64));
|| (GET_FLAG_TYPE(op) >= SLJIT_F_EQUAL && GET_FLAG_TYPE(op) <= SLJIT_ORDERED_LESS_EQUAL));
FUNCTION_FCHECK(src1, src1w);
FUNCTION_FCHECK(src2, src2w);
#endif
@ -1531,7 +1554,7 @@ static SLJIT_INLINE CHECK_RETURN_TYPE check_sljit_emit_fop1_cmp(struct sljit_com
if (SLJIT_UNLIKELY(!!compiler->verbose)) {
fprintf(compiler->verbose, " %s%s", fop1_names[SLJIT_CMP_F64 - SLJIT_FOP1_BASE], (op & SLJIT_32) ? ".f32" : ".f64");
if (op & VARIABLE_FLAG_MASK) {
fprintf(compiler->verbose, ".%s_f", jump_names[GET_FLAG_TYPE(op)]);
fprintf(compiler->verbose, ".%s", jump_names[GET_FLAG_TYPE(op)]);
}
fprintf(compiler->verbose, " ");
sljit_verbose_fparam(compiler, src1, src1w);
@ -1650,6 +1673,17 @@ static SLJIT_INLINE CHECK_RETURN_TYPE check_sljit_emit_label(struct sljit_compil
CHECK_RETURN_OK;
}
#if (defined SLJIT_ARGUMENT_CHECKS && SLJIT_ARGUMENT_CHECKS)
#if (defined SLJIT_CONFIG_X86 && SLJIT_CONFIG_X86) \
|| (defined SLJIT_CONFIG_ARM && SLJIT_CONFIG_ARM)
#define CHECK_UNORDERED(type, last_flags) \
((((type) & 0xff) == SLJIT_UNORDERED || ((type) & 0xff) == SLJIT_ORDERED) && \
((last_flags) & 0xff) >= SLJIT_UNORDERED && ((last_flags) & 0xff) <= SLJIT_ORDERED_LESS_EQUAL)
#else
#define CHECK_UNORDERED(type, last_flags) 0
#endif
#endif /* SLJIT_ARGUMENT_CHECKS */
static SLJIT_INLINE CHECK_RETURN_TYPE check_sljit_emit_jump(struct sljit_compiler *compiler, sljit_s32 type)
{
if (SLJIT_UNLIKELY(compiler->skip_checks)) {
@ -1658,9 +1692,8 @@ static SLJIT_INLINE CHECK_RETURN_TYPE check_sljit_emit_jump(struct sljit_compile
}
#if (defined SLJIT_ARGUMENT_CHECKS && SLJIT_ARGUMENT_CHECKS)
CHECK_ARGUMENT(!(type & ~(0xff | SLJIT_REWRITABLE_JUMP | SLJIT_32)));
CHECK_ARGUMENT(!(type & ~(0xff | SLJIT_REWRITABLE_JUMP)));
CHECK_ARGUMENT((type & 0xff) >= SLJIT_EQUAL && (type & 0xff) <= SLJIT_FAST_CALL);
CHECK_ARGUMENT((type & 0xff) < SLJIT_JUMP || !(type & SLJIT_32));
if ((type & 0xff) < SLJIT_JUMP) {
if ((type & 0xff) <= SLJIT_NOT_ZERO)
@ -1670,13 +1703,14 @@ static SLJIT_INLINE CHECK_RETURN_TYPE check_sljit_emit_jump(struct sljit_compile
compiler->last_flags = 0;
} else
CHECK_ARGUMENT((type & 0xff) == (compiler->last_flags & 0xff)
|| ((type & 0xff) == SLJIT_NOT_OVERFLOW && (compiler->last_flags & 0xff) == SLJIT_OVERFLOW));
|| ((type & 0xff) == SLJIT_NOT_OVERFLOW && (compiler->last_flags & 0xff) == SLJIT_OVERFLOW)
|| CHECK_UNORDERED(type, compiler->last_flags));
}
#endif
#if (defined SLJIT_VERBOSE && SLJIT_VERBOSE)
if (SLJIT_UNLIKELY(!!compiler->verbose))
fprintf(compiler->verbose, " jump%s %s%s\n", !(type & SLJIT_REWRITABLE_JUMP) ? "" : ".r",
jump_names[type & 0xff], JUMP_POSTFIX(type));
fprintf(compiler->verbose, " jump%s %s\n", !(type & SLJIT_REWRITABLE_JUMP) ? "" : ".r",
jump_names[type & 0xff]);
#endif
CHECK_RETURN_OK;
}
@ -1686,11 +1720,17 @@ static SLJIT_INLINE CHECK_RETURN_TYPE check_sljit_emit_call(struct sljit_compile
{
#if (defined SLJIT_ARGUMENT_CHECKS && SLJIT_ARGUMENT_CHECKS)
CHECK_ARGUMENT(!(type & ~(0xff | SLJIT_REWRITABLE_JUMP | SLJIT_CALL_RETURN)));
CHECK_ARGUMENT((type & 0xff) == SLJIT_CALL || (type & 0xff) == SLJIT_CALL_CDECL);
CHECK_ARGUMENT((type & 0xff) >= SLJIT_CALL && (type & 0xff) <= SLJIT_CALL_REG_ARG);
CHECK_ARGUMENT(function_check_arguments(arg_types, compiler->scratches, -1, compiler->fscratches));
if (type & SLJIT_CALL_RETURN) {
CHECK_ARGUMENT((arg_types & SLJIT_ARG_MASK) == compiler->last_return);
if (compiler->options & SLJIT_ENTER_REG_ARG) {
CHECK_ARGUMENT((type & 0xff) == SLJIT_CALL_REG_ARG);
} else {
CHECK_ARGUMENT((type & 0xff) != SLJIT_CALL_REG_ARG);
}
}
#endif
#if (defined SLJIT_VERBOSE && SLJIT_VERBOSE)
@ -1729,8 +1769,8 @@ static SLJIT_INLINE CHECK_RETURN_TYPE check_sljit_emit_cmp(struct sljit_compiler
#endif
#if (defined SLJIT_VERBOSE && SLJIT_VERBOSE)
if (SLJIT_UNLIKELY(!!compiler->verbose)) {
fprintf(compiler->verbose, " cmp%s %s%s, ", !(type & SLJIT_REWRITABLE_JUMP) ? "" : ".r",
jump_names[type & 0xff], (type & SLJIT_32) ? "32" : "");
fprintf(compiler->verbose, " cmp%s%s %s, ", (type & SLJIT_32) ? "32" : "",
!(type & SLJIT_REWRITABLE_JUMP) ? "" : ".r", jump_names[type & 0xff]);
sljit_verbose_param(compiler, src1, src1w);
fprintf(compiler->verbose, ", ");
sljit_verbose_param(compiler, src2, src2w);
@ -1747,15 +1787,16 @@ static SLJIT_INLINE CHECK_RETURN_TYPE check_sljit_emit_fcmp(struct sljit_compile
#if (defined SLJIT_ARGUMENT_CHECKS && SLJIT_ARGUMENT_CHECKS)
CHECK_ARGUMENT(sljit_has_cpu_feature(SLJIT_HAS_FPU));
CHECK_ARGUMENT(!(type & ~(0xff | SLJIT_REWRITABLE_JUMP | SLJIT_32)));
CHECK_ARGUMENT((type & 0xff) >= SLJIT_EQUAL_F64 && (type & 0xff) <= SLJIT_ORDERED_F64);
CHECK_ARGUMENT((type & 0xff) >= SLJIT_F_EQUAL && (type & 0xff) <= SLJIT_ORDERED_LESS_EQUAL
&& ((type & 0xff) <= SLJIT_ORDERED || sljit_cmp_info(type & 0xff)));
FUNCTION_FCHECK(src1, src1w);
FUNCTION_FCHECK(src2, src2w);
compiler->last_flags = 0;
#endif
#if (defined SLJIT_VERBOSE && SLJIT_VERBOSE)
if (SLJIT_UNLIKELY(!!compiler->verbose)) {
fprintf(compiler->verbose, " fcmp%s %s%s, ", !(type & SLJIT_REWRITABLE_JUMP) ? "" : ".r",
jump_names[type & 0xff], (type & SLJIT_32) ? ".f32" : ".f64");
fprintf(compiler->verbose, " fcmp%s%s %s, ", (type & SLJIT_32) ? ".f32" : ".f64",
!(type & SLJIT_REWRITABLE_JUMP) ? "" : ".r", jump_names[type & 0xff]);
sljit_verbose_fparam(compiler, src1, src1w);
fprintf(compiler->verbose, ", ");
sljit_verbose_fparam(compiler, src2, src2w);
@ -1793,12 +1834,18 @@ static SLJIT_INLINE CHECK_RETURN_TYPE check_sljit_emit_icall(struct sljit_compil
{
#if (defined SLJIT_ARGUMENT_CHECKS && SLJIT_ARGUMENT_CHECKS)
CHECK_ARGUMENT(!(type & ~(0xff | SLJIT_CALL_RETURN)));
CHECK_ARGUMENT((type & 0xff) == SLJIT_CALL || (type & 0xff) == SLJIT_CALL_CDECL);
CHECK_ARGUMENT((type & 0xff) >= SLJIT_CALL && (type & 0xff) <= SLJIT_CALL_REG_ARG);
CHECK_ARGUMENT(function_check_arguments(arg_types, compiler->scratches, -1, compiler->fscratches));
FUNCTION_CHECK_SRC(src, srcw);
if (type & SLJIT_CALL_RETURN) {
CHECK_ARGUMENT((arg_types & SLJIT_ARG_MASK) == compiler->last_return);
if (compiler->options & SLJIT_ENTER_REG_ARG) {
CHECK_ARGUMENT((type & 0xff) == SLJIT_CALL_REG_ARG);
} else {
CHECK_ARGUMENT((type & 0xff) != SLJIT_CALL_REG_ARG);
}
}
#endif
#if (defined SLJIT_VERBOSE && SLJIT_VERBOSE)
@ -1830,18 +1877,18 @@ static SLJIT_INLINE CHECK_RETURN_TYPE check_sljit_emit_op_flags(struct sljit_com
sljit_s32 type)
{
#if (defined SLJIT_ARGUMENT_CHECKS && SLJIT_ARGUMENT_CHECKS)
CHECK_ARGUMENT(!(type & ~(0xff | SLJIT_32)));
CHECK_ARGUMENT((type & 0xff) >= SLJIT_EQUAL && (type & 0xff) <= SLJIT_ORDERED_F64);
CHECK_ARGUMENT(type >= SLJIT_EQUAL && type <= SLJIT_ORDERED_LESS_EQUAL);
CHECK_ARGUMENT(op == SLJIT_MOV || op == SLJIT_MOV32
|| (GET_OPCODE(op) >= SLJIT_AND && GET_OPCODE(op) <= SLJIT_XOR));
CHECK_ARGUMENT(!(op & VARIABLE_FLAG_MASK));
if ((type & 0xff) <= SLJIT_NOT_ZERO)
if (type <= SLJIT_NOT_ZERO)
CHECK_ARGUMENT(compiler->last_flags & SLJIT_SET_Z);
else
CHECK_ARGUMENT((type & 0xff) == (compiler->last_flags & 0xff)
|| ((type & 0xff) == SLJIT_NOT_CARRY && (compiler->last_flags & 0xff) == SLJIT_CARRY)
|| ((type & 0xff) == SLJIT_NOT_OVERFLOW && (compiler->last_flags & 0xff) == SLJIT_OVERFLOW));
CHECK_ARGUMENT(type == (compiler->last_flags & 0xff)
|| (type == SLJIT_NOT_CARRY && (compiler->last_flags & 0xff) == SLJIT_CARRY)
|| (type == SLJIT_NOT_OVERFLOW && (compiler->last_flags & 0xff) == SLJIT_OVERFLOW)
|| CHECK_UNORDERED(type, compiler->last_flags));
FUNCTION_CHECK_DST(dst, dstw);
@ -1850,12 +1897,12 @@ static SLJIT_INLINE CHECK_RETURN_TYPE check_sljit_emit_op_flags(struct sljit_com
#endif
#if (defined SLJIT_VERBOSE && SLJIT_VERBOSE)
if (SLJIT_UNLIKELY(!!compiler->verbose)) {
fprintf(compiler->verbose, " flags%s %s%s, ",
!(op & SLJIT_SET_Z) ? "" : ".z",
fprintf(compiler->verbose, " flags.%s%s%s ",
GET_OPCODE(op) < SLJIT_OP2_BASE ? "mov" : op2_names[GET_OPCODE(op) - SLJIT_OP2_BASE],
GET_OPCODE(op) < SLJIT_OP2_BASE ? op1_names[GET_OPCODE(op) - SLJIT_OP1_BASE] : ((op & SLJIT_32) ? "32" : ""));
GET_OPCODE(op) < SLJIT_OP2_BASE ? op1_names[GET_OPCODE(op) - SLJIT_OP1_BASE] : ((op & SLJIT_32) ? "32" : ""),
!(op & SLJIT_SET_Z) ? "" : ".z");
sljit_verbose_param(compiler, dst, dstw);
fprintf(compiler->verbose, ", %s%s\n", jump_names[type & 0xff], JUMP_POSTFIX(type));
fprintf(compiler->verbose, ", %s\n", jump_names[type]);
}
#endif
CHECK_RETURN_OK;
@ -1866,8 +1913,7 @@ static SLJIT_INLINE CHECK_RETURN_TYPE check_sljit_emit_cmov(struct sljit_compile
sljit_s32 src, sljit_sw srcw)
{
#if (defined SLJIT_ARGUMENT_CHECKS && SLJIT_ARGUMENT_CHECKS)
CHECK_ARGUMENT(!(type & ~(0xff | SLJIT_32)));
CHECK_ARGUMENT((type & 0xff) >= SLJIT_EQUAL && (type & 0xff) <= SLJIT_ORDERED_F64);
CHECK_ARGUMENT(type >= SLJIT_EQUAL && type <= SLJIT_ORDERED_LESS_EQUAL);
CHECK_ARGUMENT(compiler->scratches != -1 && compiler->saveds != -1);
CHECK_ARGUMENT(FUNCTION_CHECK_IS_REG(dst_reg & ~SLJIT_32));
@ -1876,17 +1922,19 @@ static SLJIT_INLINE CHECK_RETURN_TYPE check_sljit_emit_cmov(struct sljit_compile
CHECK_ARGUMENT(srcw == 0);
}
if ((type & 0xff) <= SLJIT_NOT_ZERO)
if (type <= SLJIT_NOT_ZERO)
CHECK_ARGUMENT(compiler->last_flags & SLJIT_SET_Z);
else
CHECK_ARGUMENT((type & 0xff) == (compiler->last_flags & 0xff)
|| ((type & 0xff) == SLJIT_NOT_OVERFLOW && (compiler->last_flags & 0xff) == SLJIT_OVERFLOW));
CHECK_ARGUMENT(type == (compiler->last_flags & 0xff)
|| (type == SLJIT_NOT_CARRY && (compiler->last_flags & 0xff) == SLJIT_CARRY)
|| (type == SLJIT_NOT_OVERFLOW && (compiler->last_flags & 0xff) == SLJIT_OVERFLOW)
|| CHECK_UNORDERED(type, compiler->last_flags));
#endif
#if (defined SLJIT_VERBOSE && SLJIT_VERBOSE)
if (SLJIT_UNLIKELY(!!compiler->verbose)) {
fprintf(compiler->verbose, " cmov%s %s%s, ",
fprintf(compiler->verbose, " cmov%s %s, ",
!(dst_reg & SLJIT_32) ? "" : "32",
jump_names[type & 0xff], JUMP_POSTFIX(type));
jump_names[type]);
sljit_verbose_reg(compiler, dst_reg & ~SLJIT_32);
fprintf(compiler->verbose, ", ");
sljit_verbose_param(compiler, src, srcw);
@ -1901,27 +1949,63 @@ static SLJIT_INLINE CHECK_RETURN_TYPE check_sljit_emit_mem(struct sljit_compiler
sljit_s32 mem, sljit_sw memw)
{
#if (defined SLJIT_ARGUMENT_CHECKS && SLJIT_ARGUMENT_CHECKS)
sljit_s32 allowed_flags;
CHECK_ARGUMENT((type & 0xff) >= SLJIT_MOV && (type & 0xff) <= SLJIT_MOV_P);
CHECK_ARGUMENT(!(type & SLJIT_32) || ((type & 0xff) != SLJIT_MOV && (type & 0xff) != SLJIT_MOV_U32 && (type & 0xff) != SLJIT_MOV_P));
CHECK_ARGUMENT((type & SLJIT_MEM_PRE) || (type & SLJIT_MEM_POST));
CHECK_ARGUMENT((type & (SLJIT_MEM_PRE | SLJIT_MEM_POST)) != (SLJIT_MEM_PRE | SLJIT_MEM_POST));
CHECK_ARGUMENT((type & ~(0xff | SLJIT_32 | SLJIT_MEM_STORE | SLJIT_MEM_SUPP | SLJIT_MEM_PRE | SLJIT_MEM_POST)) == 0);
CHECK_ARGUMENT(!(type & SLJIT_32) || ((type & 0xff) >= SLJIT_MOV_U8 && (type & 0xff) <= SLJIT_MOV_S16));
if (type & SLJIT_MEM_UNALIGNED) {
allowed_flags = SLJIT_MEM_ALIGNED_16 | SLJIT_MEM_ALIGNED_32;
switch (type & 0xff) {
case SLJIT_MOV_U8:
case SLJIT_MOV_S8:
case SLJIT_MOV_U16:
case SLJIT_MOV_S16:
allowed_flags = 0;
break;
case SLJIT_MOV_U32:
case SLJIT_MOV_S32:
case SLJIT_MOV32:
allowed_flags = SLJIT_MEM_ALIGNED_16;
break;
}
CHECK_ARGUMENT((type & ~(0xff | SLJIT_32 | SLJIT_MEM_STORE | SLJIT_MEM_UNALIGNED | allowed_flags)) == 0);
CHECK_ARGUMENT((type & (SLJIT_MEM_ALIGNED_16 | SLJIT_MEM_ALIGNED_32)) != (SLJIT_MEM_ALIGNED_16 | SLJIT_MEM_ALIGNED_32));
} else {
CHECK_ARGUMENT((type & SLJIT_MEM_PRE) || (type & SLJIT_MEM_POST));
CHECK_ARGUMENT((type & (SLJIT_MEM_PRE | SLJIT_MEM_POST)) != (SLJIT_MEM_PRE | SLJIT_MEM_POST));
CHECK_ARGUMENT((type & ~(0xff | SLJIT_32 | SLJIT_MEM_STORE | SLJIT_MEM_SUPP | SLJIT_MEM_PRE | SLJIT_MEM_POST)) == 0);
CHECK_ARGUMENT((mem & REG_MASK) != 0 && (mem & REG_MASK) != reg);
}
FUNCTION_CHECK_SRC_MEM(mem, memw);
CHECK_ARGUMENT(FUNCTION_CHECK_IS_REG(reg));
CHECK_ARGUMENT((mem & REG_MASK) != 0 && (mem & REG_MASK) != reg);
#endif
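Review note: the `allowed_flags` switch added to `check_sljit_emit_mem` limits which alignment hints may accompany an unaligned access, depending on the access width. A minimal standalone sketch of that rule follows. The flag values, the `enum width` type, and both function names are hypothetical and chosen only for illustration; the real `SLJIT_MEM_*` constants live in sljitLir.h.

```c
/* Hypothetical flag values, for illustration only. */
#define MEM_UNALIGNED  0x100
#define MEM_ALIGNED_16 0x200
#define MEM_ALIGNED_32 0x400

/* Hypothetical access widths standing in for the SLJIT_MOV_* opcodes. */
enum width { W8, W16, W32, W64 };

/* Alignment hints that make sense for a given access width:
   none for 8/16-bit, 16-bit only for 32-bit, both for wider accesses. */
int allowed_alignment_hints(enum width w)
{
    switch (w) {
    case W8:
    case W16:
        return 0;
    case W32:
        return MEM_ALIGNED_16;
    default:
        return MEM_ALIGNED_16 | MEM_ALIGNED_32;
    }
}

/* Returns 1 when the flags describe a valid unaligned access. */
int unaligned_flags_valid(enum width w, int flags)
{
    int hints = flags & (MEM_ALIGNED_16 | MEM_ALIGNED_32);

    if (!(flags & MEM_UNALIGNED))
        return 0;
    if (hints & ~allowed_alignment_hints(w))
        return 0;
    /* The two hints are mutually exclusive. */
    return hints != (MEM_ALIGNED_16 | MEM_ALIGNED_32);
}
```

With these definitions a 64-bit access may carry `MEM_ALIGNED_32`, but a 32-bit access may not, which matches the narrowing done in the switch.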
#if (defined SLJIT_VERBOSE && SLJIT_VERBOSE)
if (!(type & SLJIT_MEM_SUPP) && SLJIT_UNLIKELY(!!compiler->verbose)) {
if (sljit_emit_mem(compiler, type | SLJIT_MEM_SUPP, reg, mem, memw) == SLJIT_ERR_UNSUPPORTED)
fprintf(compiler->verbose, " //");
if (SLJIT_UNLIKELY(!!compiler->verbose)) {
if (type & (SLJIT_MEM_PRE | SLJIT_MEM_POST)) {
if (type & SLJIT_MEM_SUPP)
CHECK_RETURN_OK;
if (sljit_emit_mem(compiler, type | SLJIT_MEM_SUPP, reg, mem, memw) == SLJIT_ERR_UNSUPPORTED) {
fprintf(compiler->verbose, " // mem: unsupported form, no instructions are emitted");
CHECK_RETURN_OK;
}
}
fprintf(compiler->verbose, " mem%s.%s%s%s ",
!(type & SLJIT_32) ? "" : "32",
(type & SLJIT_MEM_STORE) ? "st" : "ld",
op1_names[(type & 0xff) - SLJIT_OP1_BASE],
(type & SLJIT_MEM_PRE) ? ".pre" : ".post");
if ((type & 0xff) == SLJIT_MOV32)
fprintf(compiler->verbose, " mem32.%s",
(type & SLJIT_MEM_STORE) ? "st" : "ld");
else
fprintf(compiler->verbose, " mem%s.%s%s",
!(type & SLJIT_32) ? "" : "32",
(type & SLJIT_MEM_STORE) ? "st" : "ld",
op1_names[(type & 0xff) - SLJIT_OP1_BASE]);
if (type & SLJIT_MEM_UNALIGNED) {
fprintf(compiler->verbose, ".un%s%s ", (type & SLJIT_MEM_ALIGNED_16) ? ".16" : "", (type & SLJIT_MEM_ALIGNED_32) ? ".32" : "");
} else
fprintf(compiler->verbose, "%s", (type & SLJIT_MEM_PRE) ? ".pre " : ".post ");
sljit_verbose_reg(compiler, reg);
fprintf(compiler->verbose, ", ");
sljit_verbose_param(compiler, mem, memw);
@ -1937,22 +2021,37 @@ static SLJIT_INLINE CHECK_RETURN_TYPE check_sljit_emit_fmem(struct sljit_compile
{
#if (defined SLJIT_ARGUMENT_CHECKS && SLJIT_ARGUMENT_CHECKS)
CHECK_ARGUMENT((type & 0xff) == SLJIT_MOV_F64);
CHECK_ARGUMENT((type & SLJIT_MEM_PRE) || (type & SLJIT_MEM_POST));
CHECK_ARGUMENT((type & (SLJIT_MEM_PRE | SLJIT_MEM_POST)) != (SLJIT_MEM_PRE | SLJIT_MEM_POST));
CHECK_ARGUMENT((type & ~(0xff | SLJIT_32 | SLJIT_MEM_STORE | SLJIT_MEM_SUPP | SLJIT_MEM_PRE | SLJIT_MEM_POST)) == 0);
if (type & SLJIT_MEM_UNALIGNED) {
CHECK_ARGUMENT((type & ~(0xff | SLJIT_32 | SLJIT_MEM_STORE | SLJIT_MEM_UNALIGNED | SLJIT_MEM_ALIGNED_16 | (type & SLJIT_32 ? 0 : SLJIT_MEM_ALIGNED_32))) == 0);
CHECK_ARGUMENT((type & (SLJIT_MEM_ALIGNED_16 | SLJIT_MEM_ALIGNED_32)) != (SLJIT_MEM_ALIGNED_16 | SLJIT_MEM_ALIGNED_32));
} else {
CHECK_ARGUMENT((type & SLJIT_MEM_PRE) || (type & SLJIT_MEM_POST));
CHECK_ARGUMENT((type & (SLJIT_MEM_PRE | SLJIT_MEM_POST)) != (SLJIT_MEM_PRE | SLJIT_MEM_POST));
CHECK_ARGUMENT((type & ~(0xff | SLJIT_32 | SLJIT_MEM_STORE | SLJIT_MEM_SUPP | SLJIT_MEM_PRE | SLJIT_MEM_POST)) == 0);
}
FUNCTION_CHECK_SRC_MEM(mem, memw);
CHECK_ARGUMENT(FUNCTION_CHECK_IS_FREG(freg));
#endif
#if (defined SLJIT_VERBOSE && SLJIT_VERBOSE)
if (!(type & SLJIT_MEM_SUPP) && SLJIT_UNLIKELY(!!compiler->verbose)) {
if (sljit_emit_fmem(compiler, type | SLJIT_MEM_SUPP, freg, mem, memw) == SLJIT_ERR_UNSUPPORTED)
fprintf(compiler->verbose, " //");
if (SLJIT_UNLIKELY(!!compiler->verbose)) {
if (type & (SLJIT_MEM_PRE | SLJIT_MEM_POST)) {
if (type & SLJIT_MEM_SUPP)
CHECK_RETURN_OK;
if (sljit_emit_fmem(compiler, type | SLJIT_MEM_SUPP, freg, mem, memw) == SLJIT_ERR_UNSUPPORTED) {
fprintf(compiler->verbose, " // fmem: unsupported form, no instructions are emitted");
CHECK_RETURN_OK;
}
}
fprintf(compiler->verbose, " fmem.%s%s%s ",
fprintf(compiler->verbose, " fmem.%s%s",
(type & SLJIT_MEM_STORE) ? "st" : "ld",
!(type & SLJIT_32) ? ".f64" : ".f32",
(type & SLJIT_MEM_PRE) ? ".pre" : ".post");
!(type & SLJIT_32) ? ".f64" : ".f32");
if (type & SLJIT_MEM_UNALIGNED) {
fprintf(compiler->verbose, ".un%s%s ", (type & SLJIT_MEM_ALIGNED_16) ? ".16" : "", (type & SLJIT_MEM_ALIGNED_32) ? ".32" : "");
} else
fprintf(compiler->verbose, "%s", (type & SLJIT_MEM_PRE) ? ".pre " : ".post ");
sljit_verbose_freg(compiler, freg);
fprintf(compiler->verbose, ", ");
sljit_verbose_param(compiler, mem, memw);
@ -2012,6 +2111,10 @@ static SLJIT_INLINE CHECK_RETURN_TYPE check_sljit_emit_put_label(struct sljit_co
CHECK_RETURN_OK;
}
#else /* !SLJIT_ARGUMENT_CHECKS && !SLJIT_VERBOSE */
#define SLJIT_SKIP_CHECKS(compiler)
#endif /* SLJIT_ARGUMENT_CHECKS || SLJIT_VERBOSE */
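Review note: `SLJIT_SKIP_CHECKS` replaces the repeated `#if ... compiler->skip_checks = 1; #endif` blocks with a macro that expands to nothing when checks are compiled out. The pattern can be sketched in isolation as below. All names here (`ENABLE_CHECKS`, `struct compiler`, `emit_op`, `emit_wrapper`) are hypothetical stand-ins, and the no-op variant uses `((void)(c))` rather than the empty expansion in the diff, purely to keep the sketch warning-free.

```c
#ifndef ENABLE_CHECKS
#define ENABLE_CHECKS 1
#endif

/* Hypothetical stand-in for struct sljit_compiler. */
struct compiler {
    int skip_checks; /* one-shot flag: next emitter skips validation */
    int checks_run;  /* counts how many validations actually ran */
};

#if ENABLE_CHECKS
#define SKIP_CHECKS(c) ((c)->skip_checks = 1)
#else
#define SKIP_CHECKS(c) ((void)(c))
#endif

/* Low-level emitter: validates its argument unless told to skip. */
int emit_op(struct compiler *c, int arg)
{
#if ENABLE_CHECKS
    if (c->skip_checks) {
        c->skip_checks = 0; /* the flag is consumed by one call */
    } else {
        c->checks_run++;
        if (arg < 0)
            return -1;
    }
#else
    (void)c;
    (void)arg;
#endif
    return 0;
}

/* Wrapper that validates once, then calls the emitter unchecked. */
int emit_wrapper(struct compiler *c, int arg)
{
#if ENABLE_CHECKS
    c->checks_run++;
    if (arg < 0)
        return -1;
#endif
    SKIP_CHECKS(c);
    return emit_op(c, arg);
}
```

The call sites stay a single line regardless of the build configuration, which is the same cleanup the diff applies to `emit_mov_before_return`, `sljit_emit_return`, and `sljit_emit_cmov_generic`.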
#define SELECT_FOP1_OPERATION_WITH_CHECKS(compiler, op, dst, dstw, src, srcw) \
@ -2050,15 +2153,10 @@ static SLJIT_INLINE sljit_s32 emit_mov_before_return(struct sljit_compiler *comp
return SLJIT_SUCCESS;
#endif
#if (defined SLJIT_ARGUMENT_CHECKS && SLJIT_ARGUMENT_CHECKS) \
|| (defined SLJIT_VERBOSE && SLJIT_VERBOSE)
compiler->skip_checks = 1;
#endif
SLJIT_SKIP_CHECKS(compiler);
return sljit_emit_op1(compiler, op, SLJIT_RETURN_REG, 0, src, srcw);
}
#if !(defined SLJIT_CONFIG_SPARC && SLJIT_CONFIG_SPARC)
SLJIT_API_FUNC_ATTRIBUTE sljit_s32 sljit_emit_return(struct sljit_compiler *compiler, sljit_s32 op, sljit_s32 src, sljit_sw srcw)
{
CHECK_ERROR();
@ -2066,19 +2164,14 @@ SLJIT_API_FUNC_ATTRIBUTE sljit_s32 sljit_emit_return(struct sljit_compiler *comp
FAIL_IF(emit_mov_before_return(compiler, op, src, srcw));
#if (defined SLJIT_ARGUMENT_CHECKS && SLJIT_ARGUMENT_CHECKS) \
|| (defined SLJIT_VERBOSE && SLJIT_VERBOSE)
compiler->skip_checks = 1;
#endif
SLJIT_SKIP_CHECKS(compiler);
return sljit_emit_return_void(compiler);
}
#endif
#if (defined SLJIT_CONFIG_X86 && SLJIT_CONFIG_X86) \
|| (defined SLJIT_CONFIG_PPC && SLJIT_CONFIG_PPC) \
|| (defined SLJIT_CONFIG_SPARC_32 && SLJIT_CONFIG_SPARC_32) \
|| ((defined SLJIT_CONFIG_MIPS && SLJIT_CONFIG_MIPS) && !(defined SLJIT_MIPS_REV && SLJIT_MIPS_REV >= 1 && SLJIT_MIPS_REV < 6))
|| ((defined SLJIT_CONFIG_MIPS && SLJIT_CONFIG_MIPS) && !(defined SLJIT_MIPS_REV && SLJIT_MIPS_REV >= 1 && SLJIT_MIPS_REV < 6)) \
|| (defined SLJIT_CONFIG_RISCV && SLJIT_CONFIG_RISCV)
static SLJIT_INLINE sljit_s32 sljit_emit_cmov_generic(struct sljit_compiler *compiler, sljit_s32 type,
sljit_s32 dst_reg,
@ -2088,31 +2181,55 @@ static SLJIT_INLINE sljit_s32 sljit_emit_cmov_generic(struct sljit_compiler *com
struct sljit_jump *jump;
sljit_s32 op = (dst_reg & SLJIT_32) ? SLJIT_MOV32 : SLJIT_MOV;
#if (defined SLJIT_VERBOSE && SLJIT_VERBOSE) \
|| (defined SLJIT_ARGUMENT_CHECKS && SLJIT_ARGUMENT_CHECKS)
compiler->skip_checks = 1;
#endif
SLJIT_SKIP_CHECKS(compiler);
jump = sljit_emit_jump(compiler, type ^ 0x1);
FAIL_IF(!jump);
#if (defined SLJIT_VERBOSE && SLJIT_VERBOSE) \
|| (defined SLJIT_ARGUMENT_CHECKS && SLJIT_ARGUMENT_CHECKS)
compiler->skip_checks = 1;
#endif
SLJIT_SKIP_CHECKS(compiler);
FAIL_IF(sljit_emit_op1(compiler, op, dst_reg & ~SLJIT_32, 0, src, srcw));
#if (defined SLJIT_VERBOSE && SLJIT_VERBOSE) \
|| (defined SLJIT_ARGUMENT_CHECKS && SLJIT_ARGUMENT_CHECKS)
compiler->skip_checks = 1;
#endif
SLJIT_SKIP_CHECKS(compiler);
label = sljit_emit_label(compiler);
FAIL_IF(!label);
sljit_set_label(jump, label);
return SLJIT_SUCCESS;
}
#endif
#if (!(defined SLJIT_CONFIG_MIPS && SLJIT_CONFIG_MIPS) || (defined SLJIT_MIPS_REV && SLJIT_MIPS_REV >= 6)) \
&& !(defined SLJIT_CONFIG_ARM_V5 && SLJIT_CONFIG_ARM_V5)
static sljit_s32 sljit_emit_mem_unaligned(struct sljit_compiler *compiler, sljit_s32 type,
sljit_s32 reg,
sljit_s32 mem, sljit_sw memw)
{
SLJIT_SKIP_CHECKS(compiler);
if (type & SLJIT_MEM_STORE)
return sljit_emit_op1(compiler, type & (0xff | SLJIT_32), mem, memw, reg, 0);
return sljit_emit_op1(compiler, type & (0xff | SLJIT_32), reg, 0, mem, memw);
}
#endif /* (!SLJIT_CONFIG_MIPS || SLJIT_MIPS_REV >= 6) && !SLJIT_CONFIG_ARM_V5 */
#if (!(defined SLJIT_CONFIG_MIPS && SLJIT_CONFIG_MIPS) || (defined SLJIT_MIPS_REV && SLJIT_MIPS_REV >= 6)) \
&& !(defined SLJIT_CONFIG_ARM_32 && SLJIT_CONFIG_ARM_32)
static sljit_s32 sljit_emit_fmem_unaligned(struct sljit_compiler *compiler, sljit_s32 type,
sljit_s32 freg,
sljit_s32 mem, sljit_sw memw)
{
SLJIT_SKIP_CHECKS(compiler);
if (type & SLJIT_MEM_STORE)
return sljit_emit_fop1(compiler, type & (0xff | SLJIT_32), mem, memw, freg, 0);
return sljit_emit_fop1(compiler, type & (0xff | SLJIT_32), freg, 0, mem, memw);
}
#endif /* (!SLJIT_CONFIG_MIPS || SLJIT_MIPS_REV >= 6) && !SLJIT_CONFIG_ARM_32 */
/* CPU description section */
#if (defined SLJIT_32BIT_ARCHITECTURE && SLJIT_32BIT_ARCHITECTURE)
@ -2153,13 +2270,14 @@ static SLJIT_INLINE sljit_s32 sljit_emit_cmov_generic(struct sljit_compiler *com
# include "sljitNativePPC_common.c"
#elif (defined SLJIT_CONFIG_MIPS && SLJIT_CONFIG_MIPS)
# include "sljitNativeMIPS_common.c"
#elif (defined SLJIT_CONFIG_SPARC && SLJIT_CONFIG_SPARC)
# include "sljitNativeSPARC_common.c"
#elif (defined SLJIT_CONFIG_RISCV && SLJIT_CONFIG_RISCV)
# include "sljitNativeRISCV_common.c"
#elif (defined SLJIT_CONFIG_S390X && SLJIT_CONFIG_S390X)
# include "sljitNativeS390X.c"
#endif
#if !(defined SLJIT_CONFIG_MIPS && SLJIT_CONFIG_MIPS)
#if !(defined SLJIT_CONFIG_MIPS && SLJIT_CONFIG_MIPS) \
&& !(defined SLJIT_CONFIG_RISCV && SLJIT_CONFIG_RISCV)
SLJIT_API_FUNC_ATTRIBUTE struct sljit_jump* sljit_emit_cmp(struct sljit_compiler *compiler, sljit_s32 type,
sljit_s32 src1, sljit_sw src1w,
@ -2229,20 +2347,33 @@ SLJIT_API_FUNC_ATTRIBUTE struct sljit_jump* sljit_emit_cmp(struct sljit_compiler
else
flags = condition << VARIABLE_FLAG_SHIFT;
#if (defined SLJIT_VERBOSE && SLJIT_VERBOSE) \
|| (defined SLJIT_ARGUMENT_CHECKS && SLJIT_ARGUMENT_CHECKS)
compiler->skip_checks = 1;
#endif
SLJIT_SKIP_CHECKS(compiler);
PTR_FAIL_IF(sljit_emit_op2u(compiler,
SLJIT_SUB | flags | (type & SLJIT_32), src1, src1w, src2, src2w));
#if (defined SLJIT_VERBOSE && SLJIT_VERBOSE) \
|| (defined SLJIT_ARGUMENT_CHECKS && SLJIT_ARGUMENT_CHECKS)
compiler->skip_checks = 1;
#endif
SLJIT_SKIP_CHECKS(compiler);
return sljit_emit_jump(compiler, condition | (type & (SLJIT_REWRITABLE_JUMP | SLJIT_32)));
}
#endif
#endif /* !SLJIT_CONFIG_MIPS */
#if (defined SLJIT_CONFIG_ARM && SLJIT_CONFIG_ARM)
SLJIT_API_FUNC_ATTRIBUTE sljit_s32 sljit_cmp_info(sljit_s32 type)
{
if (type < SLJIT_UNORDERED || type > SLJIT_ORDERED_LESS_EQUAL)
return 0;
switch (type) {
case SLJIT_UNORDERED_OR_EQUAL:
case SLJIT_ORDERED_NOT_EQUAL:
return 0;
}
return 1;
}
#endif /* SLJIT_CONFIG_ARM */
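The ARM variant of sljit_cmp_info above can be exercised in isolation. The following is a minimal sketch, not the library code: the numeric values are copied from the comparison-type defines later in this diff, and every `model_` name is hypothetical.

```c
#include <assert.h>

/* Values copied from the comparison-type defines in this diff. */
#define SLJIT_UNORDERED 20
#define SLJIT_ORDERED_EQUAL 22
#define SLJIT_UNORDERED_OR_EQUAL 28
#define SLJIT_ORDERED_NOT_EQUAL 29
#define SLJIT_ORDERED_LESS_EQUAL 33

/* Standalone copy of the ARM logic shown above (hypothetical name). */
int model_arm_cmp_info(int type)
{
    if (type < SLJIT_UNORDERED || type > SLJIT_ORDERED_LESS_EQUAL)
        return 0;

    switch (type) {
    case SLJIT_UNORDERED_OR_EQUAL:
    case SLJIT_ORDERED_NOT_EQUAL:
        return 0;
    }

    return 1;
}

/* Counts the ordered / unordered types (22..33) the model reports as usable. */
int model_count_supported(void)
{
    int type, count = 0;

    for (type = SLJIT_ORDERED_EQUAL; type <= SLJIT_ORDERED_LESS_EQUAL; type++)
        count += model_arm_cmp_info(type);
    return count;
}

/* Small self-check showing the intended use. */
void model_cmp_info_check(void)
{
    assert(model_arm_cmp_info(SLJIT_UNORDERED) == 1);
    assert(model_arm_cmp_info(SLJIT_UNORDERED_OR_EQUAL) == 0);
}
```

In this model only SLJIT_UNORDERED_OR_EQUAL and SLJIT_ORDERED_NOT_EQUAL are rejected inside the floating point range, which matches the "Not supported" remarks in the ARM get_cc hunk.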
SLJIT_API_FUNC_ATTRIBUTE struct sljit_jump* sljit_emit_fcmp(struct sljit_compiler *compiler, sljit_s32 type,
sljit_s32 src1, sljit_sw src1w,
@ -2251,58 +2382,47 @@ SLJIT_API_FUNC_ATTRIBUTE struct sljit_jump* sljit_emit_fcmp(struct sljit_compile
CHECK_ERROR_PTR();
CHECK_PTR(check_sljit_emit_fcmp(compiler, type, src1, src1w, src2, src2w));
#if (defined SLJIT_VERBOSE && SLJIT_VERBOSE) \
|| (defined SLJIT_ARGUMENT_CHECKS && SLJIT_ARGUMENT_CHECKS)
compiler->skip_checks = 1;
#endif
SLJIT_SKIP_CHECKS(compiler);
sljit_emit_fop1(compiler, SLJIT_CMP_F64 | ((type & 0xff) << VARIABLE_FLAG_SHIFT) | (type & SLJIT_32), src1, src1w, src2, src2w);
#if (defined SLJIT_VERBOSE && SLJIT_VERBOSE) \
|| (defined SLJIT_ARGUMENT_CHECKS && SLJIT_ARGUMENT_CHECKS)
compiler->skip_checks = 1;
#endif
SLJIT_SKIP_CHECKS(compiler);
return sljit_emit_jump(compiler, type);
}
#if !(defined SLJIT_CONFIG_ARM_32 && SLJIT_CONFIG_ARM_32) \
&& !(defined SLJIT_CONFIG_ARM_64 && SLJIT_CONFIG_ARM_64) \
#if !(defined SLJIT_CONFIG_ARM && SLJIT_CONFIG_ARM) \
&& !(defined SLJIT_CONFIG_MIPS && SLJIT_CONFIG_MIPS) \
&& !(defined SLJIT_CONFIG_PPC && SLJIT_CONFIG_PPC)
SLJIT_API_FUNC_ATTRIBUTE sljit_s32 sljit_emit_mem(struct sljit_compiler *compiler, sljit_s32 type,
sljit_s32 reg,
sljit_s32 mem, sljit_sw memw)
{
SLJIT_UNUSED_ARG(compiler);
SLJIT_UNUSED_ARG(type);
SLJIT_UNUSED_ARG(reg);
SLJIT_UNUSED_ARG(mem);
SLJIT_UNUSED_ARG(memw);
CHECK_ERROR();
CHECK(check_sljit_emit_mem(compiler, type, reg, mem, memw));
return SLJIT_ERR_UNSUPPORTED;
if (type & (SLJIT_MEM_PRE | SLJIT_MEM_POST))
return SLJIT_ERR_UNSUPPORTED;
return sljit_emit_mem_unaligned(compiler, type, reg, mem, memw);
}
#endif
#if !(defined SLJIT_CONFIG_ARM_64 && SLJIT_CONFIG_ARM_64) \
#if !(defined SLJIT_CONFIG_ARM && SLJIT_CONFIG_ARM) \
&& !(defined SLJIT_CONFIG_MIPS && SLJIT_CONFIG_MIPS) \
&& !(defined SLJIT_CONFIG_PPC && SLJIT_CONFIG_PPC)
SLJIT_API_FUNC_ATTRIBUTE sljit_s32 sljit_emit_fmem(struct sljit_compiler *compiler, sljit_s32 type,
sljit_s32 freg,
sljit_s32 mem, sljit_sw memw)
{
SLJIT_UNUSED_ARG(compiler);
SLJIT_UNUSED_ARG(type);
SLJIT_UNUSED_ARG(freg);
SLJIT_UNUSED_ARG(mem);
SLJIT_UNUSED_ARG(memw);
CHECK_ERROR();
CHECK(check_sljit_emit_fmem(compiler, type, freg, mem, memw));
return SLJIT_ERR_UNSUPPORTED;
if (type & (SLJIT_MEM_PRE | SLJIT_MEM_POST))
return SLJIT_ERR_UNSUPPORTED;
return sljit_emit_fmem_unaligned(compiler, type, freg, mem, memw);
}
#endif
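The new generic sljit_emit_mem / sljit_emit_fmem fallbacks above reject only the update forms and forward everything else to the unaligned helpers. A minimal sketch of that dispatch follows; it assumes the six-digit flag values from the header hunk in this diff, and the `model_` names and result enum are hypothetical stand-ins for the real return codes and emitters.

```c
#include <assert.h>

/* Flag values as they appear in the six-digit defines of the header hunk. */
#define SLJIT_MEM_STORE 0x000200
#define SLJIT_MEM_PRE   0x000800
#define SLJIT_MEM_POST  0x001000

/* Hypothetical result codes; the real code returns sljit error codes. */
enum model_result { MODEL_LOAD, MODEL_STORE, MODEL_UNSUPPORTED };

/* Mirrors the control flow of the generic fallback:
   update forms are refused, other requests become a plain load or store. */
enum model_result model_emit_mem(int type)
{
    if (type & (SLJIT_MEM_PRE | SLJIT_MEM_POST))
        return MODEL_UNSUPPORTED;

    return (type & SLJIT_MEM_STORE) ? MODEL_STORE : MODEL_LOAD;
}

/* Small self-check showing the intended use. */
void model_emit_mem_check(void)
{
    assert(model_emit_mem(SLJIT_MEM_STORE) == MODEL_STORE);
    assert(model_emit_mem(SLJIT_MEM_PRE) == MODEL_UNSUPPORTED);
}
```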
@ -2316,10 +2436,9 @@ SLJIT_API_FUNC_ATTRIBUTE sljit_s32 sljit_get_local_base(struct sljit_compiler *c
CHECK(check_sljit_get_local_base(compiler, dst, dstw, offset));
ADJUST_LOCAL_OFFSET(SLJIT_MEM1(SLJIT_SP), offset);
#if (defined SLJIT_VERBOSE && SLJIT_VERBOSE) \
|| (defined SLJIT_ARGUMENT_CHECKS && SLJIT_ARGUMENT_CHECKS)
compiler->skip_checks = 1;
#endif
SLJIT_SKIP_CHECKS(compiler);
if (offset != 0)
return sljit_emit_op2(compiler, SLJIT_ADD, dst, dstw, SLJIT_SP, 0, SLJIT_IMM, offset);
return sljit_emit_op1(compiler, SLJIT_MOV, dst, dstw, SLJIT_SP, 0);
@ -2387,6 +2506,13 @@ SLJIT_API_FUNC_ATTRIBUTE sljit_s32 sljit_has_cpu_feature(sljit_s32 feature_type)
return 0;
}
SLJIT_API_FUNC_ATTRIBUTE sljit_s32 sljit_cmp_info(sljit_s32 type)
{
SLJIT_UNUSED_ARG(type);
SLJIT_UNREACHABLE();
return 0;
}
SLJIT_API_FUNC_ATTRIBUTE void sljit_free_code(void* code, void *exec_allocator_data)
{
SLJIT_UNUSED_ARG(code);


@ -488,8 +488,7 @@ struct sljit_compiler {
sljit_uw args_size;
#endif
#if (defined SLJIT_CONFIG_SPARC_32 && SLJIT_CONFIG_SPARC_32)
sljit_s32 delay_slot;
#if (defined SLJIT_CONFIG_RISCV && SLJIT_CONFIG_RISCV)
sljit_s32 cache_arg;
sljit_sw cache_argw;
#endif
@ -634,6 +633,20 @@ static SLJIT_INLINE sljit_uw sljit_get_generated_code_size(struct sljit_compiler
SLJIT_API_FUNC_ATTRIBUTE sljit_s32 sljit_has_cpu_feature(sljit_s32 feature_type);
/* If type is between SLJIT_ORDERED_EQUAL and SLJIT_ORDERED_LESS_EQUAL,
sljit_cmp_info returns one, if the cpu supports the passed floating
point comparison type.
If type is SLJIT_UNORDERED or SLJIT_ORDERED, sljit_cmp_info returns
one, if the cpu supports checking the unordered comparison result
regardless of the comparison type passed to the comparison instruction.
The returned value is always one, if there is at least one type between
SLJIT_ORDERED_EQUAL and SLJIT_ORDERED_LESS_EQUAL where sljit_cmp_info
returns with a zero value.
Otherwise it returns zero. */
SLJIT_API_FUNC_ATTRIBUTE sljit_s32 sljit_cmp_info(sljit_s32 type);
/* Instruction generation. Returns with any error code. If there is no
error, they return with SLJIT_SUCCESS. */
@ -683,9 +696,21 @@ SLJIT_API_FUNC_ATTRIBUTE sljit_s32 sljit_has_cpu_feature(sljit_s32 feature_type)
overwrites the previous context.
*/
/* The compiled function uses cdecl calling
* convention instead of SLJIT_FUNC. */
#define SLJIT_ENTER_CDECL 0x00000001
/* Saved registers between SLJIT_S0 and SLJIT_S(n - 1) (inclusive)
are not saved / restored on function enter / return. Instead,
these registers can be used to pass / return data (such as
global / local context pointers) across function calls. The
value of n must be between 1 and 3. Furthermore, this option
is only supported by register argument calling convention, so
SLJIT_ENTER_REG_ARG (see below) must be specified as well. */
#define SLJIT_ENTER_KEEP(n) (n)
/* The compiled function uses an sljit specific register argument
* calling convention. This is a lightweight function call type where
* both the caller and called function must be compiled with sljit.
* The jump type of the function call must be SLJIT_CALL_REG_ARG
* and the called function must store all arguments in registers. */
#define SLJIT_ENTER_REG_ARG 0x00000004
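How the two options combine can be sketched as follows. This is an illustration under a stated assumption: the definition of SLJIT_KEPT_SAVEDS_COUNT is not part of this diff, so the sketch assumes it extracts the low two bits of options, which is consistent with SLJIT_ENTER_KEEP(n) being n (1 to 3) and SLJIT_ENTER_REG_ARG being 0x4. The `MODEL_` / `model_` names are hypothetical.

```c
#include <assert.h>

#define SLJIT_ENTER_KEEP(n) (n)
#define SLJIT_ENTER_REG_ARG 0x00000004

/* ASSUMPTION: the real SLJIT_KEPT_SAVEDS_COUNT is not shown in this diff;
   the low-two-bits mask is a guess consistent with the defines above. */
#define MODEL_KEPT_SAVEDS_COUNT(options) ((options) & 0x3)

/* Number of saved registers that still need to be pushed on function entry
   once the kept registers are excluded, as in "saveds - saved_arg_count". */
int model_saveds_to_push(int options, int saveds)
{
    return saveds - MODEL_KEPT_SAVEDS_COUNT(options);
}

/* Small self-check showing the intended use. */
void model_enter_keep_check(void)
{
    int options = SLJIT_ENTER_REG_ARG | SLJIT_ENTER_KEEP(2);

    assert(MODEL_KEPT_SAVEDS_COUNT(options) == 2);
    assert(model_saveds_to_push(options, 5) == 3);
}
```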
/* The local_size must be >= 0 and <= SLJIT_MAX_LOCAL_SIZE. */
#define SLJIT_MAX_LOCAL_SIZE 65536
@ -792,8 +817,9 @@ SLJIT_API_FUNC_ATTRIBUTE sljit_s32 sljit_emit_fast_enter(struct sljit_compiler *
Write-back is supported except for one instruction: 32 bit signed
load with [reg+imm] addressing mode on 64 bit.
mips: [reg+imm], -65536 <= imm <= 65535
sparc: [reg+imm], -4096 <= imm <= 4095
[reg+reg] is supported
Write-back is not supported
riscv: [reg+imm], -2048 <= imm <= 2047
Write-back is not supported
s390x: [reg+imm], -2^19 <= imm < 2^19
[reg+reg] is supported
Write-back is not supported
@ -1207,41 +1233,70 @@ SLJIT_API_FUNC_ATTRIBUTE struct sljit_label* sljit_emit_label(struct sljit_compi
#define SLJIT_SET_CARRY SLJIT_SET(SLJIT_CARRY)
#define SLJIT_NOT_CARRY 13
/* Floating point comparison types. */
#define SLJIT_EQUAL_F64 14
#define SLJIT_EQUAL_F32 (SLJIT_EQUAL_F64 | SLJIT_32)
#define SLJIT_SET_EQUAL_F SLJIT_SET(SLJIT_EQUAL_F64)
#define SLJIT_NOT_EQUAL_F64 15
#define SLJIT_NOT_EQUAL_F32 (SLJIT_NOT_EQUAL_F64 | SLJIT_32)
#define SLJIT_SET_NOT_EQUAL_F SLJIT_SET(SLJIT_NOT_EQUAL_F64)
#define SLJIT_LESS_F64 16
#define SLJIT_LESS_F32 (SLJIT_LESS_F64 | SLJIT_32)
#define SLJIT_SET_LESS_F SLJIT_SET(SLJIT_LESS_F64)
#define SLJIT_GREATER_EQUAL_F64 17
#define SLJIT_GREATER_EQUAL_F32 (SLJIT_GREATER_EQUAL_F64 | SLJIT_32)
#define SLJIT_SET_GREATER_EQUAL_F SLJIT_SET(SLJIT_GREATER_EQUAL_F64)
#define SLJIT_GREATER_F64 18
#define SLJIT_GREATER_F32 (SLJIT_GREATER_F64 | SLJIT_32)
#define SLJIT_SET_GREATER_F SLJIT_SET(SLJIT_GREATER_F64)
#define SLJIT_LESS_EQUAL_F64 19
#define SLJIT_LESS_EQUAL_F32 (SLJIT_LESS_EQUAL_F64 | SLJIT_32)
#define SLJIT_SET_LESS_EQUAL_F SLJIT_SET(SLJIT_LESS_EQUAL_F64)
#define SLJIT_UNORDERED_F64 20
#define SLJIT_UNORDERED_F32 (SLJIT_UNORDERED_F64 | SLJIT_32)
#define SLJIT_SET_UNORDERED_F SLJIT_SET(SLJIT_UNORDERED_F64)
#define SLJIT_ORDERED_F64 21
#define SLJIT_ORDERED_F32 (SLJIT_ORDERED_F64 | SLJIT_32)
#define SLJIT_SET_ORDERED_F SLJIT_SET(SLJIT_ORDERED_F64)
/* Basic floating point comparison types.
Note: when the comparison result is unordered, their behaviour is unspecified. */
#define SLJIT_F_EQUAL 14
#define SLJIT_SET_F_EQUAL SLJIT_SET(SLJIT_F_EQUAL)
#define SLJIT_F_NOT_EQUAL 15
#define SLJIT_SET_F_NOT_EQUAL SLJIT_SET(SLJIT_F_NOT_EQUAL)
#define SLJIT_F_LESS 16
#define SLJIT_SET_F_LESS SLJIT_SET(SLJIT_F_LESS)
#define SLJIT_F_GREATER_EQUAL 17
#define SLJIT_SET_F_GREATER_EQUAL SLJIT_SET(SLJIT_F_GREATER_EQUAL)
#define SLJIT_F_GREATER 18
#define SLJIT_SET_F_GREATER SLJIT_SET(SLJIT_F_GREATER)
#define SLJIT_F_LESS_EQUAL 19
#define SLJIT_SET_F_LESS_EQUAL SLJIT_SET(SLJIT_F_LESS_EQUAL)
/* Jumps when either argument contains a NaN value. */
#define SLJIT_UNORDERED 20
#define SLJIT_SET_UNORDERED SLJIT_SET(SLJIT_UNORDERED)
/* Jumps when neither argument contains a NaN value. */
#define SLJIT_ORDERED 21
#define SLJIT_SET_ORDERED SLJIT_SET(SLJIT_ORDERED)
/* Ordered / unordered floating point comparison types.
Note: each comparison type has an ordered and unordered form. Some
architectures support only one of the two (see: sljit_cmp_info). */
#define SLJIT_ORDERED_EQUAL 22
#define SLJIT_SET_ORDERED_EQUAL SLJIT_SET(SLJIT_ORDERED_EQUAL)
#define SLJIT_UNORDERED_OR_NOT_EQUAL 23
#define SLJIT_SET_UNORDERED_OR_NOT_EQUAL SLJIT_SET(SLJIT_UNORDERED_OR_NOT_EQUAL)
#define SLJIT_ORDERED_LESS 24
#define SLJIT_SET_ORDERED_LESS SLJIT_SET(SLJIT_ORDERED_LESS)
#define SLJIT_UNORDERED_OR_GREATER_EQUAL 25
#define SLJIT_SET_UNORDERED_OR_GREATER_EQUAL SLJIT_SET(SLJIT_UNORDERED_OR_GREATER_EQUAL)
#define SLJIT_ORDERED_GREATER 26
#define SLJIT_SET_ORDERED_GREATER SLJIT_SET(SLJIT_ORDERED_GREATER)
#define SLJIT_UNORDERED_OR_LESS_EQUAL 27
#define SLJIT_SET_UNORDERED_OR_LESS_EQUAL SLJIT_SET(SLJIT_UNORDERED_OR_LESS_EQUAL)
#define SLJIT_UNORDERED_OR_EQUAL 28
#define SLJIT_SET_UNORDERED_OR_EQUAL SLJIT_SET(SLJIT_UNORDERED_OR_EQUAL)
#define SLJIT_ORDERED_NOT_EQUAL 29
#define SLJIT_SET_ORDERED_NOT_EQUAL SLJIT_SET(SLJIT_ORDERED_NOT_EQUAL)
#define SLJIT_UNORDERED_OR_LESS 30
#define SLJIT_SET_UNORDERED_OR_LESS SLJIT_SET(SLJIT_UNORDERED_OR_LESS)
#define SLJIT_ORDERED_GREATER_EQUAL 31
#define SLJIT_SET_ORDERED_GREATER_EQUAL SLJIT_SET(SLJIT_ORDERED_GREATER_EQUAL)
#define SLJIT_UNORDERED_OR_GREATER 32
#define SLJIT_SET_UNORDERED_OR_GREATER SLJIT_SET(SLJIT_UNORDERED_OR_GREATER)
#define SLJIT_ORDERED_LESS_EQUAL 33
#define SLJIT_SET_ORDERED_LESS_EQUAL SLJIT_SET(SLJIT_ORDERED_LESS_EQUAL)
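The numbering above appears to place each comparison next to its logical negation, which is what lets helpers such as sljit_emit_cmov_generic invert a condition with `type ^ 0x1`. The sketch below checks that reading for types 20 to 33; it is a standalone model with a hypothetical evaluator (`model_fcmp_holds`), not code from the library.

```c
#include <assert.h>
#include <math.h>

/* Evaluates a comparison type (numeric values from the defines above)
   on two doubles. Returns -1 for types outside 20..33. */
int model_fcmp_holds(int type, double a, double b)
{
    int un = (a != a) || (b != b); /* true when either operand is NaN */

    switch (type) {
    case 20: return un;              /* SLJIT_UNORDERED */
    case 21: return !un;             /* SLJIT_ORDERED */
    case 22: return !un && a == b;   /* SLJIT_ORDERED_EQUAL */
    case 23: return un || a != b;    /* SLJIT_UNORDERED_OR_NOT_EQUAL */
    case 24: return !un && a < b;    /* SLJIT_ORDERED_LESS */
    case 25: return un || a >= b;    /* SLJIT_UNORDERED_OR_GREATER_EQUAL */
    case 26: return !un && a > b;    /* SLJIT_ORDERED_GREATER */
    case 27: return un || a <= b;    /* SLJIT_UNORDERED_OR_LESS_EQUAL */
    case 28: return un || a == b;    /* SLJIT_UNORDERED_OR_EQUAL */
    case 29: return !un && a != b;   /* SLJIT_ORDERED_NOT_EQUAL */
    case 30: return un || a < b;     /* SLJIT_UNORDERED_OR_LESS */
    case 31: return !un && a >= b;   /* SLJIT_ORDERED_GREATER_EQUAL */
    case 32: return un || a > b;     /* SLJIT_UNORDERED_OR_GREATER */
    case 33: return !un && a <= b;   /* SLJIT_ORDERED_LESS_EQUAL */
    }
    return -1;
}

/* Returns 1 if flipping the lowest bit negates every type on all samples. */
int model_pairs_are_complementary(void)
{
    const double samples[5][2] = {
        { 1.0, 2.0 }, { 2.0, 1.0 }, { 1.0, 1.0 }, { NAN, 1.0 }, { 1.0, NAN }
    };
    int type, i;

    for (type = 20; type <= 33; type++)
        for (i = 0; i < 5; i++)
            if (model_fcmp_holds(type ^ 0x1, samples[i][0], samples[i][1])
                    != !model_fcmp_holds(type, samples[i][0], samples[i][1]))
                return 0;
    return 1;
}

/* Small self-check showing the intended use. */
void model_fcmp_check(void)
{
    assert(model_pairs_are_complementary() == 1);
}
```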
/* Unconditional jump types. */
#define SLJIT_JUMP 22
#define SLJIT_JUMP 34
/* Fast calling method. See sljit_emit_fast_enter / SLJIT_FAST_RETURN. */
#define SLJIT_FAST_CALL 23
/* Called function must be declared with the SLJIT_FUNC attribute. */
#define SLJIT_CALL 24
/* Called function must be declared with cdecl attribute.
This is the default attribute for C functions. */
#define SLJIT_CALL_CDECL 25
#define SLJIT_FAST_CALL 35
/* Default C calling convention. */
#define SLJIT_CALL 36
/* Called function must be an sljit compiled function.
See SLJIT_ENTER_REG_ARG option. */
#define SLJIT_CALL_REG_ARG 37
/* The target can be changed during runtime (see: sljit_set_jump_addr). */
#define SLJIT_REWRITABLE_JUMP 0x1000
@ -1249,10 +1304,7 @@ SLJIT_API_FUNC_ATTRIBUTE struct sljit_label* sljit_emit_label(struct sljit_compi
the called function returns to the caller of the current function. The
stack usage is reduced before the call, but it is not necessarily reduced
to zero. In the latter case the compiler needs to allocate space for some
arguments and the return register must be kept as well.
This feature is highly experimental and not supported on SPARC platform
at the moment. */
arguments and the return address must be stored on the stack as well. */
#define SLJIT_CALL_RETURN 0x2000
/* Emit a jump instruction. The destination is not set, only the type of the jump.
@ -1287,7 +1339,7 @@ SLJIT_API_FUNC_ATTRIBUTE struct sljit_jump* sljit_emit_cmp(struct sljit_compiler
sljit_emit_jump. However, some architectures (e.g. MIPS) may employ
special optimizations here. It is suggested to use this comparison form
when appropriate.
type must be between SLJIT_EQUAL_F64 and SLJIT_ORDERED_F32
type must be between SLJIT_F_EQUAL and SLJIT_ORDERED_LESS_EQUAL
type can be combined (or'ed) with SLJIT_REWRITABLE_JUMP
Flags: destroy flags.
Note: if either operand is NaN, the behaviour is undefined for
@ -1320,7 +1372,7 @@ SLJIT_API_FUNC_ATTRIBUTE sljit_s32 sljit_emit_ijump(struct sljit_compiler *compi
SLJIT_API_FUNC_ATTRIBUTE sljit_s32 sljit_emit_icall(struct sljit_compiler *compiler, sljit_s32 type, sljit_s32 arg_types, sljit_s32 src, sljit_sw srcw);
/* Perform the operation using the conditional flags as the second argument.
Type must always be between SLJIT_EQUAL and SLJIT_ORDERED_F64. The value
Type must always be between SLJIT_EQUAL and SLJIT_ORDERED_LESS_EQUAL. The value
represented by the type is 1, if the condition represented by the type
is fulfilled, and 0 otherwise.
@ -1339,7 +1391,7 @@ SLJIT_API_FUNC_ATTRIBUTE sljit_s32 sljit_emit_op_flags(struct sljit_compiler *co
if the condition is satisfied. Unlike other arithmetic operations this
instruction does not support memory access.
type must be between SLJIT_EQUAL and SLJIT_ORDERED_F64
type must be between SLJIT_EQUAL and SLJIT_ORDERED_LESS_EQUAL
dst_reg must be a valid register and it can be combined
with SLJIT_32 to perform a 32 bit arithmetic operation
src must be register or immediate (SLJIT_IMM)
@ -1351,32 +1403,58 @@ SLJIT_API_FUNC_ATTRIBUTE sljit_s32 sljit_emit_cmov(struct sljit_compiler *compil
/* The following flags are used by sljit_emit_mem() and sljit_emit_fmem(). */
/* Memory load operation. This is the default. */
#define SLJIT_MEM_LOAD 0x000000
/* Memory store operation. */
#define SLJIT_MEM_STORE 0x000200
/* Load or store data from an unaligned address. */
#define SLJIT_MEM_UNALIGNED 0x000400
/* Load or store data and update the base address with a single operation. */
/* Base register is updated before the memory access. */
#define SLJIT_MEM_PRE 0x000800
/* Base register is updated after the memory access. */
#define SLJIT_MEM_POST 0x001000
/* The following flags are supported when SLJIT_MEM_UNALIGNED is specified: */
/* Defines 16 bit alignment for unaligned accesses. */
#define SLJIT_MEM_ALIGNED_16 0x010000
/* Defines 32 bit alignment for unaligned accesses. */
#define SLJIT_MEM_ALIGNED_32 0x020000
/* The following flags are supported when SLJIT_MEM_PRE or
SLJIT_MEM_POST is specified: */
/* When SLJIT_MEM_SUPP is passed, no instructions are emitted.
Instead the function returns with SLJIT_SUCCESS if the instruction
form is supported and SLJIT_ERR_UNSUPPORTED otherwise. This flag
allows runtime checking of available instruction forms. */
#define SLJIT_MEM_SUPP 0x0200
/* Memory load operation. This is the default. */
#define SLJIT_MEM_LOAD 0x0000
/* Memory store operation. */
#define SLJIT_MEM_STORE 0x0400
/* Base register is updated before the memory access. */
#define SLJIT_MEM_PRE 0x0800
/* Base register is updated after the memory access. */
#define SLJIT_MEM_POST 0x1000
#define SLJIT_MEM_SUPP 0x010000
/* Emit a single memory load or store with update instruction. When the
requested instruction form is not supported by the CPU, it returns
with SLJIT_ERR_UNSUPPORTED instead of emulating the instruction. This
allows specializing tight loops based on the supported instruction
forms (see SLJIT_MEM_SUPP flag).
/* The sljit_emit_mem emits instructions for various memory operations:
When SLJIT_MEM_UNALIGNED is set in type argument:
Emit instructions for unaligned memory loads or stores. When
SLJIT_UNALIGNED is not defined, the only way to access unaligned
memory data is using sljit_emit_mem. Otherwise all operations (e.g.
sljit_emit_op1/2, or sljit_emit_fop1/2) support unaligned access.
In general, unaligned memory accesses are often slower than
aligned ones and should be avoided.
When SLJIT_MEM_PRE or SLJIT_MEM_POST is set in type argument:
Emit a single memory load or store with update instruction.
When the requested instruction form is not supported by the CPU,
it returns with SLJIT_ERR_UNSUPPORTED instead of emulating the
instruction. This allows specializing tight loops based on
the supported instruction forms (see SLJIT_MEM_SUPP flag).
type must be between SLJIT_MOV and SLJIT_MOV_P and can be
combined with SLJIT_MEM_* flags. Either SLJIT_MEM_PRE
or SLJIT_MEM_POST must be specified.
combined with SLJIT_MEM_* flags.
reg is the source or destination register, and must be
different from the base register of the mem operand
mem must be a SLJIT_MEM1() or SLJIT_MEM2() operand
when SLJIT_MEM_PRE or SLJIT_MEM_POST is passed
mem must be a memory operand
Flags: - (does not modify flags) */
SLJIT_API_FUNC_ATTRIBUTE sljit_s32 sljit_emit_mem(struct sljit_compiler *compiler, sljit_s32 type,
@ -1386,9 +1464,11 @@ SLJIT_API_FUNC_ATTRIBUTE sljit_s32 sljit_emit_mem(struct sljit_compiler *compile
/* Same as sljit_emit_mem except for the following:
type must be SLJIT_MOV_F64 or SLJIT_MOV_F32 and can be
combined with SLJIT_MEM_* flags. Either SLJIT_MEM_PRE
or SLJIT_MEM_POST must be specified.
freg is the source or destination floating point register */
combined with SLJIT_MEM_* flags.
freg is the source or destination floating point register
mem must be a memory operand
Flags: - (does not modify flags) */
SLJIT_API_FUNC_ATTRIBUTE sljit_s32 sljit_emit_fmem(struct sljit_compiler *compiler, sljit_s32 type,
sljit_s32 freg,
@ -1547,7 +1627,7 @@ SLJIT_API_FUNC_ATTRIBUTE sljit_s32 sljit_get_register_index(sljit_s32 reg);
/* The following function is a helper function for sljit_emit_op_custom.
It returns with the real machine register index of any SLJIT_FLOAT register.
Note: the index is always an even number on ARM (except ARM-64), MIPS, and SPARC. */
Note: the index is always an even number on ARM-32, MIPS. */
SLJIT_API_FUNC_ATTRIBUTE sljit_s32 sljit_get_float_register_index(sljit_s32 reg);


@ -100,6 +100,7 @@ static const sljit_u8 freg_map[SLJIT_NUMBER_OF_FLOAT_REGISTERS + 3] = {
#define CMP 0xe1400000
#define BKPT 0xe1200070
#define EOR 0xe0200000
#define LDR 0xe5100000
#define MOV 0xe1a00000
#define MUL 0xe0000090
#define MVN 0xe1e00000
@ -111,6 +112,7 @@ static const sljit_u8 freg_map[SLJIT_NUMBER_OF_FLOAT_REGISTERS + 3] = {
#define RSC 0xe0e00000
#define SBC 0xe0c00000
#define SMULL 0xe0c00090
#define STR 0xe5000000
#define SUB 0xe0400000
#define TST 0xe1000000
#define UMULL 0xe0800090
@ -1049,7 +1051,8 @@ SLJIT_API_FUNC_ATTRIBUTE sljit_s32 sljit_emit_enter(struct sljit_compiler *compi
sljit_s32 fscratches, sljit_s32 fsaveds, sljit_s32 local_size)
{
sljit_uw imm, offset;
sljit_s32 i, tmp, size, word_arg_count, saved_arg_count;
sljit_s32 i, tmp, size, word_arg_count;
sljit_s32 saved_arg_count = SLJIT_KEPT_SAVEDS_COUNT(options);
#ifdef __SOFTFP__
sljit_u32 float_arg_count;
#else
@ -1065,7 +1068,7 @@ SLJIT_API_FUNC_ATTRIBUTE sljit_s32 sljit_emit_enter(struct sljit_compiler *compi
imm = 0;
tmp = SLJIT_S0 - saveds;
for (i = SLJIT_S0; i > tmp; i--)
for (i = SLJIT_S0 - saved_arg_count; i > tmp; i--)
imm |= (sljit_uw)1 << reg_map[i];
for (i = scratches; i >= SLJIT_FIRST_SAVED_REG; i--)
@ -1082,7 +1085,7 @@ SLJIT_API_FUNC_ATTRIBUTE sljit_s32 sljit_emit_enter(struct sljit_compiler *compi
FAIL_IF(push_inst(compiler, 0xe52d0004 | RD(TMP_REG2)));
/* Stack must be aligned to 8 bytes: */
size = GET_SAVED_REGISTERS_SIZE(scratches, saveds, 1);
size = GET_SAVED_REGISTERS_SIZE(scratches, saveds - saved_arg_count, 1);
if (fsaveds > 0 || fscratches >= SLJIT_FIRST_SAVED_FLOAT_REG) {
if ((size & SSIZE_OF(sw)) != 0) {
@ -1103,6 +1106,9 @@ SLJIT_API_FUNC_ATTRIBUTE sljit_s32 sljit_emit_enter(struct sljit_compiler *compi
local_size = ((size + local_size + 0x7) & ~0x7) - size;
compiler->local_size = local_size;
if (options & SLJIT_ENTER_REG_ARG)
arg_types = 0;
arg_types >>= SLJIT_ARG_SHIFT;
word_arg_count = 0;
saved_arg_count = 0;
@ -1148,8 +1154,7 @@ SLJIT_API_FUNC_ATTRIBUTE sljit_s32 sljit_emit_enter(struct sljit_compiler *compi
if (offset < 4 * sizeof(sljit_sw))
FAIL_IF(push_inst(compiler, MOV | RD(tmp) | (offset >> 2)));
else
FAIL_IF(push_inst(compiler, data_transfer_insts[WORD_SIZE | LOAD_DATA] | 0x800000
| RN(SLJIT_SP) | RD(tmp) | (offset + (sljit_uw)size - 4 * sizeof(sljit_sw))));
FAIL_IF(push_inst(compiler, LDR | 0x800000 | RN(SLJIT_SP) | RD(tmp) | (offset + (sljit_uw)size - 4 * sizeof(sljit_sw))));
break;
}
@ -1217,7 +1222,7 @@ SLJIT_API_FUNC_ATTRIBUTE sljit_s32 sljit_set_context(struct sljit_compiler *comp
CHECK(check_sljit_set_context(compiler, options, arg_types, scratches, saveds, fscratches, fsaveds, local_size));
set_set_context(compiler, options, arg_types, scratches, saveds, fscratches, fsaveds, local_size);
size = GET_SAVED_REGISTERS_SIZE(scratches, saveds, 1);
size = GET_SAVED_REGISTERS_SIZE(scratches, saveds - SLJIT_KEPT_SAVEDS_COUNT(options), 1);
if ((size & SSIZE_OF(sw)) != 0 && (fsaveds > 0 || fscratches >= SLJIT_FIRST_SAVED_FLOAT_REG))
size += SSIZE_OF(sw);
@ -1241,6 +1246,7 @@ static sljit_s32 emit_add_sp(struct sljit_compiler *compiler, sljit_uw imm)
static sljit_s32 emit_stack_frame_release(struct sljit_compiler *compiler, sljit_s32 frame_size)
{
sljit_s32 local_size, fscratches, fsaveds, i, tmp;
sljit_s32 saveds_restore_start = SLJIT_S0 - SLJIT_KEPT_SAVEDS_COUNT(compiler->options);
sljit_s32 lr_dst = TMP_PC;
sljit_uw reg_list;
@ -1277,8 +1283,11 @@ static sljit_s32 emit_stack_frame_release(struct sljit_compiler *compiler, sljit
reg_list |= (sljit_uw)1 << reg_map[lr_dst];
tmp = SLJIT_S0 - compiler->saveds;
for (i = SLJIT_S0; i > tmp; i--)
reg_list |= (sljit_uw)1 << reg_map[i];
if (saveds_restore_start != tmp) {
for (i = saveds_restore_start; i > tmp; i--)
reg_list |= (sljit_uw)1 << reg_map[i];
} else
saveds_restore_start = 0;
for (i = compiler->scratches; i >= SLJIT_FIRST_SAVED_REG; i--)
reg_list |= (sljit_uw)1 << reg_map[i];
@ -1298,16 +1307,15 @@ static sljit_s32 emit_stack_frame_release(struct sljit_compiler *compiler, sljit
if (reg_list == 0)
return SLJIT_SUCCESS;
if (compiler->saveds > 0) {
SLJIT_ASSERT(reg_list == ((sljit_uw)1 << reg_map[SLJIT_S0]));
lr_dst = SLJIT_S0;
if (saveds_restore_start != 0) {
SLJIT_ASSERT(reg_list == ((sljit_uw)1 << reg_map[saveds_restore_start]));
lr_dst = saveds_restore_start;
} else {
SLJIT_ASSERT(reg_list == ((sljit_uw)1 << reg_map[SLJIT_FIRST_SAVED_REG]));
lr_dst = SLJIT_FIRST_SAVED_REG;
}
return push_inst(compiler, data_transfer_insts[WORD_SIZE | LOAD_DATA] | 0x800000
| RN(SLJIT_SP) | RD(lr_dst) | (sljit_uw)(frame_size - 2 * SSIZE_OF(sw)));
return push_inst(compiler, LDR | 0x800000 | RN(SLJIT_SP) | RD(lr_dst) | (sljit_uw)(frame_size - 2 * SSIZE_OF(sw)));
}
if (local_size > 0)
@ -1674,23 +1682,17 @@ static SLJIT_INLINE sljit_s32 emit_op_mem(struct sljit_compiler *compiler, sljit
sljit_s32 arg, sljit_sw argw, sljit_s32 tmp_reg)
{
sljit_uw imm, offset_reg;
sljit_uw is_type1_transfer = IS_TYPE1_TRANSFER(flags);
sljit_sw mask = IS_TYPE1_TRANSFER(flags) ? 0xfff : 0xff;
SLJIT_ASSERT (arg & SLJIT_MEM);
SLJIT_ASSERT((arg & REG_MASK) != tmp_reg);
SLJIT_ASSERT((arg & REG_MASK) != tmp_reg || (arg == SLJIT_MEM1(tmp_reg) && argw >= -mask && argw <= mask));
if (!(arg & REG_MASK)) {
if (is_type1_transfer) {
FAIL_IF(load_immediate(compiler, tmp_reg, (sljit_uw)argw & ~(sljit_uw)0xfff));
argw &= 0xfff;
}
else {
FAIL_IF(load_immediate(compiler, tmp_reg, (sljit_uw)argw & ~(sljit_uw)0xff));
argw &= 0xff;
}
if (SLJIT_UNLIKELY(!(arg & REG_MASK))) {
FAIL_IF(load_immediate(compiler, tmp_reg, (sljit_uw)(argw & ~mask)));
argw &= mask;
return push_inst(compiler, EMIT_DATA_TRANSFER(flags, 1, reg, tmp_reg,
is_type1_transfer ? argw : TYPE2_TRANSFER_IMM(argw)));
(mask == 0xff) ? TYPE2_TRANSFER_IMM(argw) : argw));
}
if (arg & OFFS_REG_MASK) {
@ -1698,72 +1700,53 @@ static SLJIT_INLINE sljit_s32 emit_op_mem(struct sljit_compiler *compiler, sljit
arg &= REG_MASK;
argw &= 0x3;
if (argw != 0 && !is_type1_transfer) {
if (argw != 0 && (mask == 0xff)) {
FAIL_IF(push_inst(compiler, ADD | RD(tmp_reg) | RN(arg) | RM(offset_reg) | ((sljit_uw)argw << 7)));
return push_inst(compiler, EMIT_DATA_TRANSFER(flags, 1, reg, tmp_reg, TYPE2_TRANSFER_IMM(0)));
}
/* Bit 25: RM is offset. */
return push_inst(compiler, EMIT_DATA_TRANSFER(flags, 1, reg, arg,
RM(offset_reg) | (is_type1_transfer ? (1 << 25) : 0) | ((sljit_uw)argw << 7)));
RM(offset_reg) | (mask == 0xff ? 0 : (1 << 25)) | ((sljit_uw)argw << 7)));
}
arg &= REG_MASK;
if (is_type1_transfer) {
if (argw > 0xfff) {
imm = get_imm((sljit_uw)argw & ~(sljit_uw)0xfff);
if (imm) {
FAIL_IF(push_inst(compiler, ADD | RD(tmp_reg) | RN(arg) | imm));
argw = argw & 0xfff;
arg = tmp_reg;
}
if (argw > mask) {
imm = get_imm((sljit_uw)(argw & ~mask));
if (imm) {
FAIL_IF(push_inst(compiler, ADD | RD(tmp_reg) | RN(arg) | imm));
argw = argw & mask;
arg = tmp_reg;
}
else if (argw < -0xfff) {
imm = get_imm((sljit_uw)-argw & ~(sljit_uw)0xfff);
if (imm) {
FAIL_IF(push_inst(compiler, SUB | RD(tmp_reg) | RN(arg) | imm));
argw = -(-argw & 0xfff);
arg = tmp_reg;
}
}
if (argw >= 0 && argw <= 0xfff)
return push_inst(compiler, EMIT_DATA_TRANSFER(flags, 1, reg, arg, argw));
if (argw < 0 && argw >= -0xfff)
return push_inst(compiler, EMIT_DATA_TRANSFER(flags, 0, reg, arg, -argw));
}
else {
if (argw > 0xff) {
imm = get_imm((sljit_uw)argw & ~(sljit_uw)0xff);
if (imm) {
FAIL_IF(push_inst(compiler, ADD | RD(tmp_reg) | RN(arg) | imm));
argw = argw & 0xff;
arg = tmp_reg;
}
else if (argw < -mask) {
imm = get_imm((sljit_uw)(-argw & ~mask));
if (imm) {
FAIL_IF(push_inst(compiler, SUB | RD(tmp_reg) | RN(arg) | imm));
argw = -(-argw & mask);
arg = tmp_reg;
}
else if (argw < -0xff) {
imm = get_imm((sljit_uw)-argw & ~(sljit_uw)0xff);
if (imm) {
FAIL_IF(push_inst(compiler, SUB | RD(tmp_reg) | RN(arg) | imm));
argw = -(-argw & 0xff);
arg = tmp_reg;
}
}
if (argw <= mask && argw >= -mask) {
if (argw >= 0) {
if (mask == 0xff)
argw = TYPE2_TRANSFER_IMM(argw);
return push_inst(compiler, EMIT_DATA_TRANSFER(flags, 1, reg, arg, argw));
}
if (argw >= 0 && argw <= 0xff)
return push_inst(compiler, EMIT_DATA_TRANSFER(flags, 1, reg, arg, TYPE2_TRANSFER_IMM(argw)));
argw = -argw;
if (argw < 0 && argw >= -0xff) {
argw = -argw;
return push_inst(compiler, EMIT_DATA_TRANSFER(flags, 0, reg, arg, TYPE2_TRANSFER_IMM(argw)));
}
if (mask == 0xff)
argw = TYPE2_TRANSFER_IMM(argw);
return push_inst(compiler, EMIT_DATA_TRANSFER(flags, 0, reg, arg, argw));
}
FAIL_IF(load_immediate(compiler, tmp_reg, (sljit_uw)argw));
return push_inst(compiler, EMIT_DATA_TRANSFER(flags, 1, reg, arg,
RM(tmp_reg) | (is_type1_transfer ? (1 << 25) : 0)));
RM(tmp_reg) | (mask == 0xff ? 0 : (1 << 25))));
}
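The refactored emit_op_mem replaces the two duplicated branches with a single `mask` (0xfff for type 1 transfers, 0xff otherwise). The offset arithmetic can be sketched on its own as below. This is a simplified model: it omits the `get_imm` encodability check that guards the split in the real function, and the struct and function names are hypothetical.

```c
#include <assert.h>

/* Result of splitting an offset: base_adjust is what an ADD / SUB applies
   to the base register, residual is what stays in the load / store. */
struct model_split {
    long base_adjust;
    long residual;
};

/* Splits argw so that the residual fits in [-mask, mask].
   NOTE: the real code only splits when get_imm() can encode the high part;
   that check is left out of this sketch. */
struct model_split model_split_offset(long argw, long mask)
{
    struct model_split s = { 0, argw };

    if (argw > mask) {
        s.base_adjust = argw & ~mask;
        s.residual = argw & mask;
    } else if (argw < -mask) {
        s.base_adjust = -(-argw & ~mask);
        s.residual = -(-argw & mask);
    }
    return s;
}

/* Small self-check showing the intended use. */
void model_split_check(void)
{
    struct model_split s = model_split_offset(0x1234, 0xfff);

    assert(s.base_adjust + s.residual == 0x1234);
}
```

The two parts always sum to the original offset, so the same code path serves both transfer types and only the mask differs.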
static sljit_s32 emit_op(struct sljit_compiler *compiler, sljit_s32 op, sljit_s32 inp_flags,
@ -1961,15 +1944,15 @@ SLJIT_API_FUNC_ATTRIBUTE sljit_s32 sljit_emit_op0(struct sljit_compiler *compile
saved_reg_list[saved_reg_count++] = 1;
if (saved_reg_count > 0) {
FAIL_IF(push_inst(compiler, 0xe52d0000 | (saved_reg_count >= 3 ? 16 : 8)
FAIL_IF(push_inst(compiler, STR | 0x2d0000 | (saved_reg_count >= 3 ? 16 : 8)
| (saved_reg_list[0] << 12) /* str rX, [sp, #-8/-16]! */));
if (saved_reg_count >= 2) {
SLJIT_ASSERT(saved_reg_list[1] < 8);
FAIL_IF(push_inst(compiler, 0xe58d0004 | (saved_reg_list[1] << 12) /* str rX, [sp, #4] */));
FAIL_IF(push_inst(compiler, STR | 0x8d0004 | (saved_reg_list[1] << 12) /* str rX, [sp, #4] */));
}
if (saved_reg_count >= 3) {
SLJIT_ASSERT(saved_reg_list[2] < 8);
FAIL_IF(push_inst(compiler, 0xe58d0008 | (saved_reg_list[2] << 12) /* str rX, [sp, #8] */));
FAIL_IF(push_inst(compiler, STR | 0x8d0008 | (saved_reg_list[2] << 12) /* str rX, [sp, #8] */));
}
}
@ -1983,13 +1966,13 @@ SLJIT_API_FUNC_ATTRIBUTE sljit_s32 sljit_emit_op0(struct sljit_compiler *compile
if (saved_reg_count > 0) {
if (saved_reg_count >= 3) {
SLJIT_ASSERT(saved_reg_list[2] < 8);
FAIL_IF(push_inst(compiler, 0xe59d0008 | (saved_reg_list[2] << 12) /* ldr rX, [sp, #8] */));
FAIL_IF(push_inst(compiler, LDR | 0x8d0008 | (saved_reg_list[2] << 12) /* ldr rX, [sp, #8] */));
}
if (saved_reg_count >= 2) {
SLJIT_ASSERT(saved_reg_list[1] < 8);
FAIL_IF(push_inst(compiler, 0xe59d0004 | (saved_reg_list[1] << 12) /* ldr rX, [sp, #4] */));
FAIL_IF(push_inst(compiler, LDR | 0x8d0004 | (saved_reg_list[1] << 12) /* ldr rX, [sp, #4] */));
}
return push_inst(compiler, 0xe49d0000 | (sljit_uw)(saved_reg_count >= 3 ? 16 : 8)
return push_inst(compiler, (LDR ^ (1 << 24)) | 0x8d0000 | (sljit_uw)(saved_reg_count >= 3 ? 16 : 8)
| (saved_reg_list[0] << 12) /* ldr rX, [sp], #8/16 */);
}
return SLJIT_SUCCESS;
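The hunks above swap raw opcodes for the new LDR / STR constants combined with addressing bits. A quick way to see that the emitted words are unchanged is to compare both spellings, as in this sketch. The two constants are copied from the defines added in this file; the function names are hypothetical.

```c
#include <assert.h>

/* Opcode bases added to this file by the diff. */
#define LDR 0xe5100000u
#define STR 0xe5000000u

/* Returns 1 if every rewritten expression yields the old literal. */
int model_encodings_match(void)
{
    return (STR | 0x2d0000u) == 0xe52d0000u   /* str rX, [sp, #-imm]! */
        && (STR | 0x8d0004u) == 0xe58d0004u   /* str rX, [sp, #4] */
        && (STR | 0x8d0008u) == 0xe58d0008u   /* str rX, [sp, #8] */
        && (LDR | 0x8d0008u) == 0xe59d0008u   /* ldr rX, [sp, #8] */
        && (LDR | 0x8d0004u) == 0xe59d0004u   /* ldr rX, [sp, #4] */
        /* Clearing bit 24 (pre-index) selects the post-indexed form. */
        && ((LDR ^ (1u << 24)) | 0x8d0000u) == 0xe49d0000u; /* ldr rX, [sp], #imm */
}

/* Small self-check showing the intended use. */
void model_encoding_check(void)
{
    assert(model_encodings_match() == 1);
}
```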
@ -2091,10 +2074,7 @@ SLJIT_API_FUNC_ATTRIBUTE sljit_s32 sljit_emit_op2u(struct sljit_compiler *compil
CHECK_ERROR();
CHECK(check_sljit_emit_op2(compiler, op, 1, 0, 0, src1, src1w, src2, src2w));
#if (defined SLJIT_VERBOSE && SLJIT_VERBOSE) \
|| (defined SLJIT_ARGUMENT_CHECKS && SLJIT_ARGUMENT_CHECKS)
compiler->skip_checks = 1;
#endif
SLJIT_SKIP_CHECKS(compiler);
return sljit_emit_op2(compiler, op, TMP_REG2, 0, src1, src1w, src2, src2w);
}
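The hunk above folds a four-line preprocessor block into a single `SLJIT_SKIP_CHECKS(compiler)` call. The macro's definition is not part of this diff, so the following is only a toy sketch of how such a wrapper could look. Every `toy_`/`TOY_` name is hypothetical, and the idea that the inner function clears the flag after consuming it is an assumption made for the example.

```c
#include <assert.h>

/* Hypothetical stand-in for the compiler state; only the field used here. */
struct toy_compiler {
	int skip_checks;
};

/* Pretend argument checks are enabled in this build. */
#define TOY_ARGUMENT_CHECKS 1

#if (defined TOY_VERBOSE && TOY_VERBOSE) \
	|| (defined TOY_ARGUMENT_CHECKS && TOY_ARGUMENT_CHECKS)
#define TOY_SKIP_CHECKS(compiler) ((compiler)->skip_checks = 1)
#else
#define TOY_SKIP_CHECKS(compiler) ((void)(compiler))
#endif

/* Returns 1 if the (imaginary) argument checks ran, 0 if they were skipped. */
static int toy_inner(struct toy_compiler *c)
{
	if (c->skip_checks) {
		c->skip_checks = 0; /* the flag is one-shot in this sketch */
		return 0;
	}
	return 1;
}

/* The outer emitter already validated its arguments, so it skips the inner checks. */
static int toy_outer(struct toy_compiler *c)
{
	TOY_SKIP_CHECKS(c);
	return toy_inner(c);
}
```

Hiding the condition in one macro keeps each call site to a single line and lets builds without checks compile it away.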
@ -2370,7 +2350,6 @@ SLJIT_API_FUNC_ATTRIBUTE sljit_s32 sljit_emit_fop2(struct sljit_compiler *compil
return SLJIT_SUCCESS;
}
#undef FPU_LOAD
#undef EMIT_FPU_DATA_TRANSFER
/* --------------------------------------------------------------------- */
@ -2400,11 +2379,15 @@ static sljit_uw get_cc(struct sljit_compiler *compiler, sljit_s32 type)
{
switch (type) {
case SLJIT_EQUAL:
case SLJIT_EQUAL_F64:
case SLJIT_F_EQUAL:
case SLJIT_ORDERED_EQUAL:
case SLJIT_UNORDERED_OR_EQUAL: /* Not supported. */
return 0x00000000;
case SLJIT_NOT_EQUAL:
case SLJIT_NOT_EQUAL_F64:
case SLJIT_F_NOT_EQUAL:
case SLJIT_UNORDERED_OR_NOT_EQUAL:
case SLJIT_ORDERED_NOT_EQUAL: /* Not supported. */
return 0x10000000;
case SLJIT_CARRY:
@ -2413,7 +2396,6 @@ static sljit_uw get_cc(struct sljit_compiler *compiler, sljit_s32 type)
/* fallthrough */
case SLJIT_LESS:
case SLJIT_LESS_F64:
return 0x30000000;
case SLJIT_NOT_CARRY:
@ -2422,27 +2404,33 @@ static sljit_uw get_cc(struct sljit_compiler *compiler, sljit_s32 type)
/* fallthrough */
case SLJIT_GREATER_EQUAL:
case SLJIT_GREATER_EQUAL_F64:
return 0x20000000;
case SLJIT_GREATER:
case SLJIT_GREATER_F64:
case SLJIT_UNORDERED_OR_GREATER:
return 0x80000000;
case SLJIT_LESS_EQUAL:
case SLJIT_LESS_EQUAL_F64:
case SLJIT_F_LESS_EQUAL:
case SLJIT_ORDERED_LESS_EQUAL:
return 0x90000000;
case SLJIT_SIG_LESS:
case SLJIT_UNORDERED_OR_LESS:
return 0xb0000000;
case SLJIT_SIG_GREATER_EQUAL:
case SLJIT_F_GREATER_EQUAL:
case SLJIT_ORDERED_GREATER_EQUAL:
return 0xa0000000;
case SLJIT_SIG_GREATER:
case SLJIT_F_GREATER:
case SLJIT_ORDERED_GREATER:
return 0xc0000000;
case SLJIT_SIG_LESS_EQUAL:
case SLJIT_UNORDERED_OR_LESS_EQUAL:
return 0xd0000000;
case SLJIT_OVERFLOW:
@ -2450,7 +2438,7 @@ static sljit_uw get_cc(struct sljit_compiler *compiler, sljit_s32 type)
return 0x10000000;
/* fallthrough */
case SLJIT_UNORDERED_F64:
case SLJIT_UNORDERED:
return 0x60000000;
case SLJIT_NOT_OVERFLOW:
@ -2458,11 +2446,18 @@ static sljit_uw get_cc(struct sljit_compiler *compiler, sljit_s32 type)
return 0x00000000;
/* fallthrough */
case SLJIT_ORDERED_F64:
case SLJIT_ORDERED:
return 0x70000000;
case SLJIT_F_LESS:
case SLJIT_ORDERED_LESS:
return 0x40000000;
case SLJIT_UNORDERED_OR_GREATER_EQUAL:
return 0x50000000;
default:
SLJIT_ASSERT(type >= SLJIT_JUMP && type <= SLJIT_CALL_CDECL);
SLJIT_ASSERT(type >= SLJIT_JUMP && type <= SLJIT_CALL_REG_ARG);
return 0xe0000000;
}
}
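The constants returned by `get_cc` above sit in bits 31:28 of an A32 instruction, which is the architecture's condition field (`0x0` is EQ, `0x1` is NE, `0xe` is AL). A minimal sketch of how such a value replaces the condition of an already encoded instruction, in the style of the `& ~COND_MASK) | cc` expression used by `sljit_emit_cmov`, is shown below. The `COND_MASK` value and the helper name are assumptions for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed: the condition field is the top four bits of the instruction. */
#define COND_MASK 0xf0000000u

/* Replace the condition field of an encoded A32 instruction. */
static uint32_t apply_cond(uint32_t inst, uint32_t cc)
{
	assert((cc & ~COND_MASK) == 0); /* cc must only use the condition bits */
	return (inst & ~COND_MASK) | cc;
}
```

For example, `mov r0, r1` encodes as `0xe1a00001`; applying the value returned for `SLJIT_EQUAL` (`0x00000000`) gives `0x01a00001`, which is `moveq r0, r1`.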
@ -2639,7 +2634,7 @@ static sljit_s32 softfloat_call_with_args(struct sljit_compiler *compiler, sljit
}
FAIL_IF(push_inst(compiler, MOV | (offset << 10) | (word_arg_offset >> 2)));
} else
FAIL_IF(push_inst(compiler, data_transfer_insts[WORD_SIZE] | 0x800000 | RN(SLJIT_SP) | (word_arg_offset << 10) | (offset - 4 * sizeof(sljit_sw))));
FAIL_IF(push_inst(compiler, STR | 0x800000 | RN(SLJIT_SP) | (word_arg_offset << 10) | (offset - 4 * sizeof(sljit_sw))));
}
break;
}
@ -2718,51 +2713,48 @@ SLJIT_API_FUNC_ATTRIBUTE struct sljit_jump* sljit_emit_call(struct sljit_compile
CHECK_PTR(check_sljit_emit_call(compiler, type, arg_types));
#ifdef __SOFTFP__
PTR_FAIL_IF(softfloat_call_with_args(compiler, arg_types, NULL, &extra_space));
SLJIT_ASSERT((extra_space & 0x7) == 0);
if ((type & 0xff) != SLJIT_CALL_REG_ARG) {
PTR_FAIL_IF(softfloat_call_with_args(compiler, arg_types, NULL, &extra_space));
SLJIT_ASSERT((extra_space & 0x7) == 0);
if ((type & SLJIT_CALL_RETURN) && extra_space == 0)
type = SLJIT_JUMP | (type & SLJIT_REWRITABLE_JUMP);
if ((type & SLJIT_CALL_RETURN) && extra_space == 0)
type = SLJIT_JUMP | (type & SLJIT_REWRITABLE_JUMP);
#if (defined SLJIT_VERBOSE && SLJIT_VERBOSE) \
|| (defined SLJIT_ARGUMENT_CHECKS && SLJIT_ARGUMENT_CHECKS)
compiler->skip_checks = 1;
#endif
SLJIT_SKIP_CHECKS(compiler);
jump = sljit_emit_jump(compiler, type);
PTR_FAIL_IF(jump == NULL);
jump = sljit_emit_jump(compiler, type);
PTR_FAIL_IF(jump == NULL);
if (extra_space > 0) {
if (type & SLJIT_CALL_RETURN)
PTR_FAIL_IF(push_inst(compiler, EMIT_DATA_TRANSFER(WORD_SIZE | LOAD_DATA, 1,
TMP_REG2, SLJIT_SP, extra_space - sizeof(sljit_sw))));
if (extra_space > 0) {
if (type & SLJIT_CALL_RETURN)
PTR_FAIL_IF(push_inst(compiler, EMIT_DATA_TRANSFER(WORD_SIZE | LOAD_DATA, 1,
TMP_REG2, SLJIT_SP, extra_space - sizeof(sljit_sw))));
PTR_FAIL_IF(push_inst(compiler, ADD | RD(SLJIT_SP) | RN(SLJIT_SP) | SRC2_IMM | extra_space));
PTR_FAIL_IF(push_inst(compiler, ADD | RD(SLJIT_SP) | RN(SLJIT_SP) | SRC2_IMM | extra_space));
if (type & SLJIT_CALL_RETURN) {
PTR_FAIL_IF(push_inst(compiler, BX | RM(TMP_REG2)));
return jump;
if (type & SLJIT_CALL_RETURN) {
PTR_FAIL_IF(push_inst(compiler, BX | RM(TMP_REG2)));
return jump;
}
}
}
SLJIT_ASSERT(!(type & SLJIT_CALL_RETURN));
PTR_FAIL_IF(softfloat_post_call_with_args(compiler, arg_types));
return jump;
#else /* !__SOFTFP__ */
SLJIT_ASSERT(!(type & SLJIT_CALL_RETURN));
PTR_FAIL_IF(softfloat_post_call_with_args(compiler, arg_types));
return jump;
}
#endif /* __SOFTFP__ */
if (type & SLJIT_CALL_RETURN) {
PTR_FAIL_IF(emit_stack_frame_release(compiler, -1));
type = SLJIT_JUMP | (type & SLJIT_REWRITABLE_JUMP);
}
PTR_FAIL_IF(hardfloat_call_with_args(compiler, arg_types));
#if (defined SLJIT_VERBOSE && SLJIT_VERBOSE) \
|| (defined SLJIT_ARGUMENT_CHECKS && SLJIT_ARGUMENT_CHECKS)
compiler->skip_checks = 1;
#endif
#ifndef __SOFTFP__
if ((type & 0xff) != SLJIT_CALL_REG_ARG)
PTR_FAIL_IF(hardfloat_call_with_args(compiler, arg_types));
#endif /* !__SOFTFP__ */
SLJIT_SKIP_CHECKS(compiler);
return sljit_emit_jump(compiler, type);
#endif /* __SOFTFP__ */
}
SLJIT_API_FUNC_ATTRIBUTE sljit_s32 sljit_emit_ijump(struct sljit_compiler *compiler, sljit_s32 type, sljit_s32 src, sljit_sw srcw)
@ -2828,47 +2820,44 @@ SLJIT_API_FUNC_ATTRIBUTE sljit_s32 sljit_emit_icall(struct sljit_compiler *compi
}
#ifdef __SOFTFP__
FAIL_IF(softfloat_call_with_args(compiler, arg_types, &src, &extra_space));
SLJIT_ASSERT((extra_space & 0x7) == 0);
if ((type & 0xff) != SLJIT_CALL_REG_ARG) {
FAIL_IF(softfloat_call_with_args(compiler, arg_types, &src, &extra_space));
SLJIT_ASSERT((extra_space & 0x7) == 0);
if ((type & SLJIT_CALL_RETURN) && extra_space == 0)
type = SLJIT_JUMP;
if ((type & SLJIT_CALL_RETURN) && extra_space == 0)
type = SLJIT_JUMP;
#if (defined SLJIT_VERBOSE && SLJIT_VERBOSE) \
|| (defined SLJIT_ARGUMENT_CHECKS && SLJIT_ARGUMENT_CHECKS)
compiler->skip_checks = 1;
#endif
SLJIT_SKIP_CHECKS(compiler);
FAIL_IF(sljit_emit_ijump(compiler, type, src, srcw));
FAIL_IF(sljit_emit_ijump(compiler, type, src, srcw));
if (extra_space > 0) {
if (type & SLJIT_CALL_RETURN)
FAIL_IF(push_inst(compiler, EMIT_DATA_TRANSFER(WORD_SIZE | LOAD_DATA, 1,
TMP_REG2, SLJIT_SP, extra_space - sizeof(sljit_sw))));
if (extra_space > 0) {
if (type & SLJIT_CALL_RETURN)
FAIL_IF(push_inst(compiler, EMIT_DATA_TRANSFER(WORD_SIZE | LOAD_DATA, 1,
TMP_REG2, SLJIT_SP, extra_space - sizeof(sljit_sw))));
FAIL_IF(push_inst(compiler, ADD | RD(SLJIT_SP) | RN(SLJIT_SP) | SRC2_IMM | extra_space));
FAIL_IF(push_inst(compiler, ADD | RD(SLJIT_SP) | RN(SLJIT_SP) | SRC2_IMM | extra_space));
if (type & SLJIT_CALL_RETURN)
return push_inst(compiler, BX | RM(TMP_REG2));
}
if (type & SLJIT_CALL_RETURN)
return push_inst(compiler, BX | RM(TMP_REG2));
SLJIT_ASSERT(!(type & SLJIT_CALL_RETURN));
return softfloat_post_call_with_args(compiler, arg_types);
}
#endif /* __SOFTFP__ */
SLJIT_ASSERT(!(type & SLJIT_CALL_RETURN));
return softfloat_post_call_with_args(compiler, arg_types);
#else /* !__SOFTFP__ */
if (type & SLJIT_CALL_RETURN) {
FAIL_IF(emit_stack_frame_release(compiler, -1));
type = SLJIT_JUMP;
}
FAIL_IF(hardfloat_call_with_args(compiler, arg_types));
#if (defined SLJIT_VERBOSE && SLJIT_VERBOSE) \
|| (defined SLJIT_ARGUMENT_CHECKS && SLJIT_ARGUMENT_CHECKS)
compiler->skip_checks = 1;
#endif
#ifndef __SOFTFP__
if ((type & 0xff) != SLJIT_CALL_REG_ARG)
FAIL_IF(hardfloat_call_with_args(compiler, arg_types));
#endif /* !__SOFTFP__ */
SLJIT_SKIP_CHECKS(compiler);
return sljit_emit_ijump(compiler, type, src, srcw);
#endif /* __SOFTFP__ */
}
SLJIT_API_FUNC_ATTRIBUTE sljit_s32 sljit_emit_op_flags(struct sljit_compiler *compiler, sljit_s32 op,
@ -2883,7 +2872,7 @@ SLJIT_API_FUNC_ATTRIBUTE sljit_s32 sljit_emit_op_flags(struct sljit_compiler *co
ADJUST_LOCAL_OFFSET(dst, dstw);
op = GET_OPCODE(op);
cc = get_cc(compiler, type & 0xff);
cc = get_cc(compiler, type);
dst_reg = FAST_IS_REG(dst) ? dst : TMP_REG1;
if (op < SLJIT_ADD) {
@ -2923,7 +2912,7 @@ SLJIT_API_FUNC_ATTRIBUTE sljit_s32 sljit_emit_cmov(struct sljit_compiler *compil
dst_reg &= ~SLJIT_32;
cc = get_cc(compiler, type & 0xff);
cc = get_cc(compiler, type);
if (SLJIT_UNLIKELY(src & SLJIT_IMM)) {
tmp = get_imm((sljit_uw)srcw);
@ -2949,6 +2938,231 @@ SLJIT_API_FUNC_ATTRIBUTE sljit_s32 sljit_emit_cmov(struct sljit_compiler *compil
return push_inst(compiler, ((MOV | RD(dst_reg) | RM(src)) & ~COND_MASK) | cc);
}
static sljit_s32 update_mem_addr(struct sljit_compiler *compiler, sljit_s32 *mem, sljit_sw *memw, sljit_s32 max_offset)
{
sljit_s32 arg = *mem;
sljit_sw argw = *memw;
sljit_uw imm;
#if (defined SLJIT_CONFIG_ARM_V5 && SLJIT_CONFIG_ARM_V5)
sljit_sw mask = max_offset >= 0x100 ? 0xfff : 0xff;
#else /* !SLJIT_CONFIG_ARM_V5 */
sljit_sw mask = 0xfff;
SLJIT_ASSERT(max_offset >= 0x100);
#endif /* SLJIT_CONFIG_ARM_V5 */
*mem = TMP_REG1;
if (SLJIT_UNLIKELY(arg & OFFS_REG_MASK)) {
*memw = 0;
return push_inst(compiler, ADD | RD(TMP_REG1) | RN(arg & REG_MASK) | RM(OFFS_REG(arg)) | ((sljit_uw)(argw & 0x3) << 7));
}
arg &= REG_MASK;
if (arg) {
if (argw <= max_offset && argw >= -mask) {
*mem = arg;
return SLJIT_SUCCESS;
}
if (argw < 0) {
imm = get_imm((sljit_uw)(-argw & ~mask));
if (imm) {
*memw = -(-argw & mask);
return push_inst(compiler, SUB | RD(TMP_REG1) | RN(arg) | imm);
}
} else if ((argw & mask) <= max_offset) {
imm = get_imm((sljit_uw)(argw & ~mask));
if (imm) {
*memw = argw & mask;
return push_inst(compiler, ADD | RD(TMP_REG1) | RN(arg) | imm);
}
} else {
imm = get_imm((sljit_uw)((argw | mask) + 1));
if (imm) {
*memw = (argw & mask) - (mask + 1);
return push_inst(compiler, ADD | RD(TMP_REG1) | RN(arg) | imm);
}
}
}
imm = (sljit_uw)(argw & ~mask);
if ((argw & mask) > max_offset) {
imm += (sljit_uw)(mask + 1);
*memw = (argw & mask) - (mask + 1);
} else
*memw = argw & mask;
FAIL_IF(load_immediate(compiler, TMP_REG1, imm));
if (arg == 0)
return SLJIT_SUCCESS;
return push_inst(compiler, ADD | RD(TMP_REG1) | RN(TMP_REG1) | RM(arg));
}
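The arithmetic in `update_mem_addr` splits a large offset into a part added to the base register and a small remainder that fits the load/store immediate field. The sketch below reproduces only that split as plain C, ignoring whether the base part is encodable by `get_imm`. The `struct split` type and the `split_offset` name are hypothetical.

```c
#include <assert.h>

/* Hypothetical result type: base adjustment plus the remaining offset. */
struct split {
	long base; /* amount to add to the base register */
	long rest; /* offset left for the load/store instruction */
};

/* Split argw so that rest lies in [-mask, max_offset] and base + rest == argw. */
static struct split split_offset(long argw, long mask, long max_offset)
{
	struct split s;

	assert(mask == 0xff || mask == 0xfff);

	if (argw <= max_offset && argw >= -mask) {
		/* Already in range: no base adjustment needed. */
		s.base = 0;
		s.rest = argw;
	} else if (argw < 0) {
		/* Negative: subtract the high part, keep a negative low part. */
		s.base = -(-argw & ~mask);
		s.rest = -(-argw & mask);
	} else if ((argw & mask) <= max_offset) {
		/* Positive and the low part fits: add the high part. */
		s.base = argw & ~mask;
		s.rest = argw & mask;
	} else {
		/* Low part too large: round the base up and use a negative rest. */
		s.base = (argw | mask) + 1;
		s.rest = (argw & mask) - (mask + 1);
	}
	return s;
}
```

The `max_offset` argument is smaller than `mask` when several consecutive bytes or words are accessed, which leaves room for the `+1`, `+2`, or `+4` steps applied to the remainder later.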
#if (defined SLJIT_CONFIG_ARM_V5 && SLJIT_CONFIG_ARM_V5)
static sljit_s32 sljit_emit_mem_unaligned(struct sljit_compiler *compiler, sljit_s32 type,
sljit_s32 reg,
sljit_s32 mem, sljit_sw memw)
{
sljit_s32 flags;
sljit_s32 steps;
sljit_uw add, shift;
switch (type & 0xff) {
case SLJIT_MOV_U8:
case SLJIT_MOV_S8:
flags = BYTE_SIZE;
if (!(type & SLJIT_MEM_STORE))
flags |= LOAD_DATA;
if ((type & 0xff) == SLJIT_MOV_S8)
flags |= SIGNED;
return emit_op_mem(compiler, flags, reg, mem, memw, TMP_REG1);
case SLJIT_MOV_U16:
FAIL_IF(update_mem_addr(compiler, &mem, &memw, 0xfff - 1));
flags = BYTE_SIZE;
steps = 1;
break;
case SLJIT_MOV_S16:
FAIL_IF(update_mem_addr(compiler, &mem, &memw, 0xff - 1));
flags = BYTE_SIZE | SIGNED;
steps = 1;
break;
default:
if (type & SLJIT_MEM_ALIGNED_32) {
flags = WORD_SIZE;
if (!(type & SLJIT_MEM_STORE))
flags |= LOAD_DATA;
return emit_op_mem(compiler, flags, reg, mem, memw, TMP_REG1);
}
if (!(type & SLJIT_MEM_ALIGNED_16)) {
FAIL_IF(update_mem_addr(compiler, &mem, &memw, 0xfff - 3));
flags = BYTE_SIZE;
steps = 3;
break;
}
FAIL_IF(update_mem_addr(compiler, &mem, &memw, 0xff - 2));
add = 1;
if (memw < 0) {
add = 0;
memw = -memw;
}
if (type & SLJIT_MEM_STORE) {
FAIL_IF(push_inst(compiler, EMIT_DATA_TRANSFER(HALF_SIZE, add, reg, mem, TYPE2_TRANSFER_IMM(memw))));
FAIL_IF(push_inst(compiler, MOV | RD(TMP_REG2) | RM(reg) | (16 << 7) | (2 << 4)));
if (!add) {
memw -= 2;
if (memw <= 0) {
memw = -memw;
add = 1;
}
} else
memw += 2;
return push_inst(compiler, EMIT_DATA_TRANSFER(HALF_SIZE, add, TMP_REG2, mem, TYPE2_TRANSFER_IMM(memw)));
}
if (reg == mem) {
FAIL_IF(push_inst(compiler, MOV | RD(TMP_REG1) | RM(mem)));
mem = TMP_REG1;
}
FAIL_IF(push_inst(compiler, EMIT_DATA_TRANSFER(HALF_SIZE | LOAD_DATA, add, reg, mem, TYPE2_TRANSFER_IMM(memw))));
if (!add) {
memw -= 2;
if (memw <= 0) {
memw = -memw;
add = 1;
}
} else
memw += 2;
FAIL_IF(push_inst(compiler, EMIT_DATA_TRANSFER(HALF_SIZE | LOAD_DATA, add, TMP_REG2, mem, TYPE2_TRANSFER_IMM(memw))));
return push_inst(compiler, ORR | RD(reg) | RN(reg) | RM(TMP_REG2) | (16 << 7));
}
SLJIT_ASSERT(steps > 0);
add = 1;
if (memw < 0) {
add = 0;
memw = -memw;
}
if (type & SLJIT_MEM_STORE) {
FAIL_IF(push_inst(compiler, EMIT_DATA_TRANSFER(BYTE_SIZE, add, reg, mem, memw)));
FAIL_IF(push_inst(compiler, MOV | RD(TMP_REG2) | RM(reg) | (8 << 7) | (2 << 4)));
while (1) {
if (!add) {
memw -= 1;
if (memw == 0)
add = 1;
} else
memw += 1;
FAIL_IF(push_inst(compiler, EMIT_DATA_TRANSFER(BYTE_SIZE, add, TMP_REG2, mem, memw)));
if (--steps == 0)
return SLJIT_SUCCESS;
FAIL_IF(push_inst(compiler, MOV | RD(TMP_REG2) | RM(TMP_REG2) | (8 << 7) | (2 << 4)));
}
}
if (reg == mem) {
FAIL_IF(push_inst(compiler, MOV | RD(TMP_REG1) | RM(mem)));
mem = TMP_REG1;
}
shift = 8;
FAIL_IF(push_inst(compiler, EMIT_DATA_TRANSFER(BYTE_SIZE | LOAD_DATA, add, reg, mem, memw)));
do {
if (!add) {
memw -= 1;
if (memw == 0)
add = 1;
} else
memw += 1;
if (steps > 1) {
FAIL_IF(push_inst(compiler, EMIT_DATA_TRANSFER(BYTE_SIZE | LOAD_DATA, add, TMP_REG2, mem, memw)));
FAIL_IF(push_inst(compiler, ORR | RD(reg) | RN(reg) | RM(TMP_REG2) | (shift << 7)));
shift += 8;
}
} while (--steps != 0);
flags |= LOAD_DATA;
if (flags & SIGNED)
FAIL_IF(push_inst(compiler, EMIT_DATA_TRANSFER(flags, add, TMP_REG2, mem, TYPE2_TRANSFER_IMM(memw))));
else
FAIL_IF(push_inst(compiler, EMIT_DATA_TRANSFER(flags, add, TMP_REG2, mem, memw)));
return push_inst(compiler, ORR | RD(reg) | RN(reg) | RM(TMP_REG2) | (shift << 7));
}
#endif /* SLJIT_CONFIG_ARM_V5 */
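The byte-wise path of `sljit_emit_mem_unaligned` loads one byte at a time and ORs each byte into the result at shifts of 8, 16, and 24; the store path writes the low byte and shifts the value right by 8 between stores. The sketch below shows the same idea in portable C, assuming little-endian byte order. The function names are hypothetical and no machine code is generated.

```c
#include <assert.h>
#include <stdint.h>

/* Assemble a 32-bit little-endian value from four single-byte loads. */
static uint32_t load_u32_bytes(const uint8_t *p)
{
	uint32_t value = p[0];
	uint32_t shift = 8;
	int steps = 3;

	do {
		p++;
		value |= (uint32_t)p[0] << shift; /* mirrors ORR reg, reg, tmp, LSL #shift */
		shift += 8;
	} while (--steps != 0);
	return value;
}

/* Store a 32-bit value as four single bytes, lowest byte first. */
static void store_u32_bytes(uint8_t *p, uint32_t value)
{
	int steps = 3;

	p[0] = (uint8_t)value;
	do {
		value >>= 8; /* mirrors MOV tmp, tmp, LSR #8 */
		*++p = (uint8_t)value;
	} while (--steps != 0);
}
```

Because every access is a single byte, the address never needs to be aligned, at the cost of four memory operations per word.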
SLJIT_API_FUNC_ATTRIBUTE sljit_s32 sljit_emit_mem(struct sljit_compiler *compiler, sljit_s32 type,
sljit_s32 reg,
sljit_s32 mem, sljit_sw memw)
@ -2959,6 +3173,9 @@ SLJIT_API_FUNC_ATTRIBUTE sljit_s32 sljit_emit_mem(struct sljit_compiler *compile
CHECK_ERROR();
CHECK(check_sljit_emit_mem(compiler, type, reg, mem, memw));
if (type & SLJIT_MEM_UNALIGNED)
return sljit_emit_mem_unaligned(compiler, type, reg, mem, memw);
is_type1_transfer = 1;
switch (type & 0xff) {
@ -3054,6 +3271,106 @@ SLJIT_API_FUNC_ATTRIBUTE sljit_s32 sljit_emit_mem(struct sljit_compiler *compile
return push_inst(compiler, inst | TYPE2_TRANSFER_IMM((sljit_uw)memw));
}
SLJIT_API_FUNC_ATTRIBUTE sljit_s32 sljit_emit_fmem(struct sljit_compiler *compiler, sljit_s32 type,
sljit_s32 freg,
sljit_s32 mem, sljit_sw memw)
{
#if (defined SLJIT_CONFIG_ARM_V5 && SLJIT_CONFIG_ARM_V5)
sljit_s32 max_offset;
sljit_s32 dst;
#endif /* SLJIT_CONFIG_ARM_V5 */
CHECK_ERROR();
CHECK(check_sljit_emit_fmem(compiler, type, freg, mem, memw));
if (type & (SLJIT_MEM_PRE | SLJIT_MEM_POST))
return SLJIT_ERR_UNSUPPORTED;
if (type & SLJIT_MEM_ALIGNED_32)
return emit_fop_mem(compiler, ((type ^ SLJIT_32) & SLJIT_32) | ((type & SLJIT_MEM_STORE) ? 0 : FPU_LOAD), freg, mem, memw);
#if (defined SLJIT_CONFIG_ARM_V5 && SLJIT_CONFIG_ARM_V5)
if (type & SLJIT_MEM_STORE) {
FAIL_IF(push_inst(compiler, VMOV | (1 << 20) | VN(freg) | RD(TMP_REG2)));
if (type & SLJIT_32)
return sljit_emit_mem_unaligned(compiler, SLJIT_MOV | SLJIT_MEM_STORE | (type & SLJIT_MEM_ALIGNED_16), TMP_REG2, mem, memw);
max_offset = 0xfff - 7;
if (type & SLJIT_MEM_ALIGNED_16)
max_offset++;
FAIL_IF(update_mem_addr(compiler, &mem, &memw, max_offset));
mem |= SLJIT_MEM;
FAIL_IF(sljit_emit_mem_unaligned(compiler, SLJIT_MOV | SLJIT_MEM_STORE | (type & SLJIT_MEM_ALIGNED_16), TMP_REG2, mem, memw));
FAIL_IF(push_inst(compiler, VMOV | (1 << 20) | VN(freg) | 0x80 | RD(TMP_REG2)));
return sljit_emit_mem_unaligned(compiler, SLJIT_MOV | SLJIT_MEM_STORE | (type & SLJIT_MEM_ALIGNED_16), TMP_REG2, mem, memw + 4);
}
max_offset = (type & SLJIT_32) ? 0xfff - 3 : 0xfff - 7;
if (type & SLJIT_MEM_ALIGNED_16)
max_offset++;
FAIL_IF(update_mem_addr(compiler, &mem, &memw, max_offset));
dst = TMP_REG1;
/* Stack offset adjustment is not needed because dst
is not stored on the stack when mem is SLJIT_SP. */
if (mem == TMP_REG1) {
dst = SLJIT_R3;
if (compiler->scratches >= 4)
FAIL_IF(push_inst(compiler, STR | (1 << 21) | RN(SLJIT_SP) | RD(SLJIT_R3) | 8));
}
mem |= SLJIT_MEM;
FAIL_IF(sljit_emit_mem_unaligned(compiler, SLJIT_MOV | (type & SLJIT_MEM_ALIGNED_16), dst, mem, memw));
FAIL_IF(push_inst(compiler, VMOV | VN(freg) | RD(dst)));
if (!(type & SLJIT_32)) {
FAIL_IF(sljit_emit_mem_unaligned(compiler, SLJIT_MOV | (type & SLJIT_MEM_ALIGNED_16), dst, mem, memw + 4));
FAIL_IF(push_inst(compiler, VMOV | VN(freg) | 0x80 | RD(dst)));
}
if (dst == SLJIT_R3 && compiler->scratches >= 4)
FAIL_IF(push_inst(compiler, (LDR ^ (0x1 << 24)) | (0x1 << 23) | RN(SLJIT_SP) | RD(SLJIT_R3) | 8));
return SLJIT_SUCCESS;
#else /* !SLJIT_CONFIG_ARM_V5 */
if (type & SLJIT_MEM_STORE) {
FAIL_IF(push_inst(compiler, VMOV | (1 << 20) | VN(freg) | RD(TMP_REG2)));
if (type & SLJIT_32)
return emit_op_mem(compiler, WORD_SIZE, TMP_REG2, mem, memw, TMP_REG1);
FAIL_IF(update_mem_addr(compiler, &mem, &memw, 0xfff - 4));
mem |= SLJIT_MEM;
FAIL_IF(emit_op_mem(compiler, WORD_SIZE, TMP_REG2, mem, memw, TMP_REG1));
FAIL_IF(push_inst(compiler, VMOV | (1 << 20) | VN(freg) | 0x80 | RD(TMP_REG2)));
return emit_op_mem(compiler, WORD_SIZE, TMP_REG2, mem, memw + 4, TMP_REG1);
}
if (type & SLJIT_32) {
FAIL_IF(emit_op_mem(compiler, WORD_SIZE | LOAD_DATA, TMP_REG2, mem, memw, TMP_REG1));
return push_inst(compiler, VMOV | VN(freg) | RD(TMP_REG2));
}
FAIL_IF(update_mem_addr(compiler, &mem, &memw, 0xfff - 4));
mem |= SLJIT_MEM;
FAIL_IF(emit_op_mem(compiler, WORD_SIZE | LOAD_DATA, TMP_REG2, mem, memw, TMP_REG1));
FAIL_IF(emit_op_mem(compiler, WORD_SIZE | LOAD_DATA, TMP_REG1, mem, memw + 4, TMP_REG1));
return push_inst(compiler, VMOV2 | VM(freg) | RD(TMP_REG2) | RN(TMP_REG1));
#endif /* SLJIT_CONFIG_ARM_V5 */
}
#undef FPU_LOAD
SLJIT_API_FUNC_ATTRIBUTE struct sljit_const* sljit_emit_const(struct sljit_compiler *compiler, sljit_s32 dst, sljit_sw dstw, sljit_sw init_value)
{
struct sljit_const *const_;


@ -137,8 +137,6 @@ static const sljit_u8 freg_map[SLJIT_NUMBER_OF_FLOAT_REGISTERS + 3] = {
#define UDIV 0x9ac00800
#define UMULH 0x9bc03c00
/* dest_reg is the absolute name of the register
Useful for reordering instructions in the delay slot. */
static sljit_s32 push_inst(struct sljit_compiler *compiler, sljit_ins ins)
{
sljit_ins *ptr = (sljit_ins*)ensure_buf(compiler, sizeof(sljit_ins));
@ -296,8 +294,8 @@ SLJIT_API_FUNC_ATTRIBUTE void* sljit_generate_code(struct sljit_compiler *compil
}
next_addr = compute_next_addr(label, jump, const_, put_label);
}
code_ptr ++;
word_count ++;
code_ptr++;
word_count++;
} while (buf_ptr < buf_end);
buf = buf->next;
@ -924,14 +922,14 @@ SLJIT_API_FUNC_ATTRIBUTE sljit_s32 sljit_emit_enter(struct sljit_compiler *compi
sljit_s32 fscratches, sljit_s32 fsaveds, sljit_s32 local_size)
{
sljit_s32 prev, fprev, saved_regs_size, i, tmp;
sljit_s32 word_arg_count = 0;
sljit_s32 saved_arg_count = SLJIT_KEPT_SAVEDS_COUNT(options);
sljit_ins offs;
CHECK_ERROR();
CHECK(check_sljit_emit_enter(compiler, options, arg_types, scratches, saveds, fscratches, fsaveds, local_size));
set_emit_enter(compiler, options, arg_types, scratches, saveds, fscratches, fsaveds, local_size);
saved_regs_size = GET_SAVED_REGISTERS_SIZE(scratches, saveds, 2);
saved_regs_size = GET_SAVED_REGISTERS_SIZE(scratches, saveds - saved_arg_count, 2);
saved_regs_size += GET_SAVED_FLOAT_REGISTERS_SIZE(fscratches, fsaveds, SSIZE_OF(f64));
local_size = (local_size + saved_regs_size + 0xf) & ~0xf;
@ -954,7 +952,7 @@ SLJIT_API_FUNC_ATTRIBUTE sljit_s32 sljit_emit_enter(struct sljit_compiler *compi
prev = -1;
tmp = SLJIT_S0 - saveds;
for (i = SLJIT_S0; i > tmp; i--) {
for (i = SLJIT_S0 - saved_arg_count; i > tmp; i--) {
if (prev == -1) {
prev = i;
continue;
@ -1003,23 +1001,27 @@ SLJIT_API_FUNC_ATTRIBUTE sljit_s32 sljit_emit_enter(struct sljit_compiler *compi
if (prev != -1)
FAIL_IF(push_inst(compiler, STRI | RT(prev) | RN(SLJIT_SP) | (offs >> 5) | ((fprev == -1) ? (1 << 10) : 0)));
arg_types >>= SLJIT_ARG_SHIFT;
#ifdef _WIN32
if (local_size > 4096)
FAIL_IF(push_inst(compiler, SUBI | RD(SLJIT_SP) | RN(SLJIT_SP) | (1 << 10) | (1 << 22)));
#endif /* _WIN32 */
tmp = 0;
while (arg_types > 0) {
if ((arg_types & SLJIT_ARG_MASK) < SLJIT_ARG_TYPE_F64) {
if (!(arg_types & SLJIT_ARG_TYPE_SCRATCH_REG)) {
FAIL_IF(push_inst(compiler, ORR | RD(SLJIT_S0 - tmp) | RN(TMP_ZERO) | RM(SLJIT_R0 + word_arg_count)));
if (!(options & SLJIT_ENTER_REG_ARG)) {
arg_types >>= SLJIT_ARG_SHIFT;
saved_arg_count = 0;
tmp = SLJIT_R0;
while (arg_types) {
if ((arg_types & SLJIT_ARG_MASK) < SLJIT_ARG_TYPE_F64) {
if (!(arg_types & SLJIT_ARG_TYPE_SCRATCH_REG)) {
FAIL_IF(push_inst(compiler, ORR | RD(SLJIT_S0 - saved_arg_count) | RN(TMP_ZERO) | RM(tmp)));
saved_arg_count++;
}
tmp++;
}
word_arg_count++;
arg_types >>= SLJIT_ARG_SHIFT;
}
arg_types >>= SLJIT_ARG_SHIFT;
}
#ifdef _WIN32
@ -1100,7 +1102,7 @@ SLJIT_API_FUNC_ATTRIBUTE sljit_s32 sljit_set_context(struct sljit_compiler *comp
CHECK(check_sljit_set_context(compiler, options, arg_types, scratches, saveds, fscratches, fsaveds, local_size));
set_set_context(compiler, options, arg_types, scratches, saveds, fscratches, fsaveds, local_size);
saved_regs_size = GET_SAVED_REGISTERS_SIZE(scratches, saveds, 2);
saved_regs_size = GET_SAVED_REGISTERS_SIZE(scratches, saveds - SLJIT_KEPT_SAVEDS_COUNT(options), 2);
saved_regs_size += GET_SAVED_FLOAT_REGISTERS_SIZE(fscratches, fsaveds, SSIZE_OF(f64));
compiler->local_size = (local_size + saved_regs_size + 0xf) & ~0xf;
@ -1137,7 +1139,7 @@ static sljit_s32 emit_stack_frame_release(struct sljit_compiler *compiler)
prev = -1;
tmp = SLJIT_S0 - compiler->saveds;
for (i = SLJIT_S0; i > tmp; i--) {
for (i = SLJIT_S0 - SLJIT_KEPT_SAVEDS_COUNT(compiler->options); i > tmp; i--) {
if (prev == -1) {
prev = i;
continue;
@ -1392,10 +1394,7 @@ SLJIT_API_FUNC_ATTRIBUTE sljit_s32 sljit_emit_op2u(struct sljit_compiler *compil
CHECK_ERROR();
CHECK(check_sljit_emit_op2(compiler, op, 1, 0, 0, src1, src1w, src2, src2w));
#if (defined SLJIT_VERBOSE && SLJIT_VERBOSE) \
|| (defined SLJIT_ARGUMENT_CHECKS && SLJIT_ARGUMENT_CHECKS)
compiler->skip_checks = 1;
#endif
SLJIT_SKIP_CHECKS(compiler);
return sljit_emit_op2(compiler, op, TMP_REG1, 0, src1, src1w, src2, src2w);
}
@ -1550,10 +1549,9 @@ static SLJIT_INLINE sljit_s32 sljit_emit_fop1_conv_f64_from_sw(struct sljit_comp
emit_op_mem(compiler, ((GET_OPCODE(op) == SLJIT_CONV_F64_FROM_S32) ? INT_SIZE : WORD_SIZE), TMP_REG1, src, srcw, TMP_REG1);
src = TMP_REG1;
} else if (src & SLJIT_IMM) {
#if (defined SLJIT_CONFIG_X86_64 && SLJIT_CONFIG_X86_64)
if (GET_OPCODE(op) == SLJIT_CONV_F64_FROM_S32)
srcw = (sljit_s32)srcw;
#endif
FAIL_IF(load_immediate(compiler, TMP_REG1, srcw));
src = TMP_REG1;
}
@ -1699,11 +1697,15 @@ static sljit_ins get_cc(struct sljit_compiler *compiler, sljit_s32 type)
{
switch (type) {
case SLJIT_EQUAL:
case SLJIT_EQUAL_F64:
case SLJIT_F_EQUAL:
case SLJIT_ORDERED_EQUAL:
case SLJIT_UNORDERED_OR_EQUAL: /* Not supported. */
return 0x1;
case SLJIT_NOT_EQUAL:
case SLJIT_NOT_EQUAL_F64:
case SLJIT_F_NOT_EQUAL:
case SLJIT_UNORDERED_OR_NOT_EQUAL:
case SLJIT_ORDERED_NOT_EQUAL: /* Not supported. */
return 0x0;
case SLJIT_CARRY:
@ -1712,7 +1714,6 @@ static sljit_ins get_cc(struct sljit_compiler *compiler, sljit_s32 type)
/* fallthrough */
case SLJIT_LESS:
case SLJIT_LESS_F64:
return 0x2;
case SLJIT_NOT_CARRY:
@ -1721,27 +1722,33 @@ static sljit_ins get_cc(struct sljit_compiler *compiler, sljit_s32 type)
/* fallthrough */
case SLJIT_GREATER_EQUAL:
case SLJIT_GREATER_EQUAL_F64:
return 0x3;
case SLJIT_GREATER:
case SLJIT_GREATER_F64:
case SLJIT_UNORDERED_OR_GREATER:
return 0x9;
case SLJIT_LESS_EQUAL:
case SLJIT_LESS_EQUAL_F64:
case SLJIT_F_LESS_EQUAL:
case SLJIT_ORDERED_LESS_EQUAL:
return 0x8;
case SLJIT_SIG_LESS:
case SLJIT_UNORDERED_OR_LESS:
return 0xa;
case SLJIT_SIG_GREATER_EQUAL:
case SLJIT_F_GREATER_EQUAL:
case SLJIT_ORDERED_GREATER_EQUAL:
return 0xb;
case SLJIT_SIG_GREATER:
case SLJIT_F_GREATER:
case SLJIT_ORDERED_GREATER:
return 0xd;
case SLJIT_SIG_LESS_EQUAL:
case SLJIT_UNORDERED_OR_LESS_EQUAL:
return 0xc;
case SLJIT_OVERFLOW:
@ -1749,7 +1756,7 @@ static sljit_ins get_cc(struct sljit_compiler *compiler, sljit_s32 type)
return 0x0;
/* fallthrough */
case SLJIT_UNORDERED_F64:
case SLJIT_UNORDERED:
return 0x7;
case SLJIT_NOT_OVERFLOW:
@ -1757,9 +1764,16 @@ static sljit_ins get_cc(struct sljit_compiler *compiler, sljit_s32 type)
return 0x1;
/* fallthrough */
case SLJIT_ORDERED_F64:
case SLJIT_ORDERED:
return 0x6;
case SLJIT_F_LESS:
case SLJIT_ORDERED_LESS:
return 0x5;
case SLJIT_UNORDERED_OR_GREATER_EQUAL:
return 0x4;
default:
SLJIT_UNREACHABLE();
return 0xe;
@ -1820,11 +1834,7 @@ SLJIT_API_FUNC_ATTRIBUTE struct sljit_jump* sljit_emit_call(struct sljit_compile
type = SLJIT_JUMP | (type & SLJIT_REWRITABLE_JUMP);
}
#if (defined SLJIT_VERBOSE && SLJIT_VERBOSE) \
|| (defined SLJIT_ARGUMENT_CHECKS && SLJIT_ARGUMENT_CHECKS)
compiler->skip_checks = 1;
#endif
SLJIT_SKIP_CHECKS(compiler);
return sljit_emit_jump(compiler, type);
}
@ -1914,11 +1924,7 @@ SLJIT_API_FUNC_ATTRIBUTE sljit_s32 sljit_emit_icall(struct sljit_compiler *compi
type = SLJIT_JUMP;
}
#if (defined SLJIT_VERBOSE && SLJIT_VERBOSE) \
|| (defined SLJIT_ARGUMENT_CHECKS && SLJIT_ARGUMENT_CHECKS)
compiler->skip_checks = 1;
#endif
SLJIT_SKIP_CHECKS(compiler);
return sljit_emit_ijump(compiler, type, src, srcw);
}
@ -1933,7 +1939,7 @@ SLJIT_API_FUNC_ATTRIBUTE sljit_s32 sljit_emit_op_flags(struct sljit_compiler *co
CHECK(check_sljit_emit_op_flags(compiler, op, dst, dstw, type));
ADJUST_LOCAL_OFFSET(dst, dstw);
cc = get_cc(compiler, type & 0xff);
cc = get_cc(compiler, type);
dst_r = FAST_IS_REG(dst) ? dst : TMP_REG1;
if (GET_OPCODE(op) < SLJIT_ADD) {
@ -1988,7 +1994,7 @@ SLJIT_API_FUNC_ATTRIBUTE sljit_s32 sljit_emit_cmov(struct sljit_compiler *compil
srcw = 0;
}
cc = get_cc(compiler, type & 0xff);
cc = get_cc(compiler, type);
dst_reg &= ~SLJIT_32;
return push_inst(compiler, (CSEL ^ inv_bits) | (cc << 12) | RD(dst_reg) | RN(dst_reg) | RM(src));
@ -2003,6 +2009,9 @@ SLJIT_API_FUNC_ATTRIBUTE sljit_s32 sljit_emit_mem(struct sljit_compiler *compile
CHECK_ERROR();
CHECK(check_sljit_emit_mem(compiler, type, reg, mem, memw));
if (type & SLJIT_MEM_UNALIGNED)
return sljit_emit_mem_unaligned(compiler, type, reg, mem, memw);
if ((mem & OFFS_REG_MASK) || (memw > 255 || memw < -256))
return SLJIT_ERR_UNSUPPORTED;
@ -2057,6 +2066,9 @@ SLJIT_API_FUNC_ATTRIBUTE sljit_s32 sljit_emit_fmem(struct sljit_compiler *compil
CHECK_ERROR();
CHECK(check_sljit_emit_fmem(compiler, type, freg, mem, memw));
if (type & SLJIT_MEM_UNALIGNED)
return sljit_emit_fmem_unaligned(compiler, type, freg, mem, memw);
if ((mem & OFFS_REG_MASK) || (memw > 255 || memw < -256))
return SLJIT_ERR_UNSUPPORTED;


@ -434,8 +434,8 @@ SLJIT_API_FUNC_ATTRIBUTE void* sljit_generate_code(struct sljit_compiler *compil
}
next_addr = compute_next_addr(label, jump, const_, put_label);
}
code_ptr ++;
half_count ++;
code_ptr++;
half_count++;
} while (buf_ptr < buf_end);
buf = buf->next;
@ -890,8 +890,8 @@ static sljit_s32 emit_op_imm(struct sljit_compiler *compiler, sljit_s32 flags, s
#define HALF_SIZE 0x08
#define PRELOAD 0x0c
#define IS_WORD_SIZE(flags) (!(flags & (BYTE_SIZE | HALF_SIZE)))
#define OFFSET_CHECK(imm, shift) (!(argw & ~(imm << shift)))
#define IS_WORD_SIZE(flags) (!((flags) & (BYTE_SIZE | HALF_SIZE)))
#define ALIGN_CHECK(argw, imm, shift) (!((argw) & ~((imm) << (shift))))
/*
1st letter:
@ -993,8 +993,7 @@ static SLJIT_INLINE sljit_s32 emit_op_mem(struct sljit_compiler *compiler, sljit
sljit_uw tmp;
SLJIT_ASSERT(arg & SLJIT_MEM);
SLJIT_ASSERT((arg & REG_MASK) != tmp_reg);
arg &= ~SLJIT_MEM;
SLJIT_ASSERT((arg & REG_MASK) != tmp_reg || (arg == SLJIT_MEM1(tmp_reg) && argw >= -0xff && argw <= 0xfff));
if (SLJIT_UNLIKELY(!(arg & REG_MASK))) {
tmp = get_imm((sljit_uw)argw & ~(sljit_uw)0xfff);
@ -1012,15 +1011,17 @@ static SLJIT_INLINE sljit_s32 emit_op_mem(struct sljit_compiler *compiler, sljit
if (SLJIT_UNLIKELY(arg & OFFS_REG_MASK)) {
argw &= 0x3;
other_r = OFFS_REG(arg);
arg &= 0xf;
arg &= REG_MASK;
if (!argw && IS_3_LO_REGS(reg, arg, other_r))
return push_inst16(compiler, sljit_mem16[flags] | RD3(reg) | RN3(arg) | RM3(other_r));
return push_inst32(compiler, sljit_mem32[flags] | RT4(reg) | RN4(arg) | RM4(other_r) | ((sljit_ins)argw << 4));
}
arg &= REG_MASK;
if (argw > 0xfff) {
tmp = get_imm((sljit_uw)argw & ~(sljit_uw)0xfff);
tmp = get_imm((sljit_uw)(argw & ~0xfff));
if (tmp != INVALID_IMM) {
push_inst32(compiler, ADD_WI | RD4(tmp_reg) | RN4(arg) | tmp);
arg = tmp_reg;
@ -1028,7 +1029,7 @@ static SLJIT_INLINE sljit_s32 emit_op_mem(struct sljit_compiler *compiler, sljit
}
}
else if (argw < -0xff) {
tmp = get_imm((sljit_uw)-argw & ~(sljit_uw)0xff);
tmp = get_imm((sljit_uw)(-argw & ~0xff));
if (tmp != INVALID_IMM) {
push_inst32(compiler, SUB_WI | RD4(tmp_reg) | RN4(arg) | tmp);
arg = tmp_reg;
@ -1036,27 +1037,28 @@ static SLJIT_INLINE sljit_s32 emit_op_mem(struct sljit_compiler *compiler, sljit
}
}
/* 16 bit instruction forms. */
if (IS_2_LO_REGS(reg, arg) && sljit_mem16_imm5[flags]) {
tmp = 3;
if (IS_WORD_SIZE(flags)) {
if (OFFSET_CHECK(0x1f, 2))
if (ALIGN_CHECK(argw, 0x1f, 2))
tmp = 2;
}
else if (flags & BYTE_SIZE)
{
if (OFFSET_CHECK(0x1f, 0))
if (ALIGN_CHECK(argw, 0x1f, 0))
tmp = 0;
}
else {
SLJIT_ASSERT(flags & HALF_SIZE);
if (OFFSET_CHECK(0x1f, 1))
if (ALIGN_CHECK(argw, 0x1f, 1))
tmp = 1;
}
if (tmp < 3)
return push_inst16(compiler, sljit_mem16_imm5[flags] | RD3(reg) | RN3(arg) | ((sljit_ins)argw << (6 - tmp)));
}
else if (SLJIT_UNLIKELY(arg == SLJIT_SP) && IS_WORD_SIZE(flags) && OFFSET_CHECK(0xff, 2) && reg_map[reg] <= 7) {
else if (SLJIT_UNLIKELY(arg == SLJIT_SP) && IS_WORD_SIZE(flags) && ALIGN_CHECK(argw, 0xff, 2) && reg_map[reg] <= 7) {
/* SP based immediate. */
return push_inst16(compiler, STR_SP | (sljit_ins)((flags & STORE) ? 0 : 0x800) | RDN3(reg) | ((sljit_ins)argw >> 2));
}
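The renamed `ALIGN_CHECK(argw, imm, shift)` macro used in this hunk accepts an offset only if it is non-negative, a multiple of `1 << shift`, and no larger than `imm << shift`. Taking `argw` as an explicit, parenthesised argument also makes the macro safe to use with compound expressions. A minimal sketch follows; the two macros are copied from the new lines of the diff, while the `STORE`, `WORD_SIZE`, and `BYTE_SIZE` values and the helper function are assumptions for illustration.

```c
#include <assert.h>

/* Flag values: HALF_SIZE appears in the diff, the others are assumed. */
#define STORE     0x01
#define WORD_SIZE 0x00
#define BYTE_SIZE 0x04
#define HALF_SIZE 0x08

/* New macro forms from the diff: every argument is parenthesised. */
#define IS_WORD_SIZE(flags) (!((flags) & (BYTE_SIZE | HALF_SIZE)))
#define ALIGN_CHECK(argw, imm, shift) (!((argw) & ~((imm) << (shift))))

/* Hypothetical helper: can a word access use the 16-bit imm5 form? */
static int fits_word_imm5(long argw)
{
	/* Offsets 0, 4, ..., 124: five bits scaled by four. */
	return ALIGN_CHECK(argw, 0x1f, 2);
}
```

An offset such as 126 is rejected because it is not word aligned, 128 because it exceeds the five-bit field, and any negative offset because its high bits fall outside the mask.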
@ -1074,6 +1076,9 @@ static SLJIT_INLINE sljit_s32 emit_op_mem(struct sljit_compiler *compiler, sljit
return push_inst32(compiler, sljit_mem32[flags] | RT4(reg) | RN4(arg) | RM4(tmp_reg));
}
#undef ALIGN_CHECK
#undef IS_WORD_SIZE
/* --------------------------------------------------------------------- */
/* Entry, exit */
/* --------------------------------------------------------------------- */
@ -1082,7 +1087,8 @@ SLJIT_API_FUNC_ATTRIBUTE sljit_s32 sljit_emit_enter(struct sljit_compiler *compi
sljit_s32 options, sljit_s32 arg_types, sljit_s32 scratches, sljit_s32 saveds,
sljit_s32 fscratches, sljit_s32 fsaveds, sljit_s32 local_size)
{
sljit_s32 size, i, tmp, word_arg_count, saved_arg_count;
sljit_s32 size, i, tmp, word_arg_count;
sljit_s32 saved_arg_count = SLJIT_KEPT_SAVEDS_COUNT(options);
sljit_uw offset;
sljit_uw imm = 0;
#ifdef __SOFTFP__
@ -1098,7 +1104,7 @@ SLJIT_API_FUNC_ATTRIBUTE sljit_s32 sljit_emit_enter(struct sljit_compiler *compi
set_emit_enter(compiler, options, arg_types, scratches, saveds, fscratches, fsaveds, local_size);
tmp = SLJIT_S0 - saveds;
for (i = SLJIT_S0; i > tmp; i--)
for (i = SLJIT_S0 - saved_arg_count; i > tmp; i--)
imm |= (sljit_uw)1 << reg_map[i];
for (i = scratches; i >= SLJIT_FIRST_SAVED_REG; i--)
@ -1110,7 +1116,7 @@ SLJIT_API_FUNC_ATTRIBUTE sljit_s32 sljit_emit_enter(struct sljit_compiler *compi
: push_inst16(compiler, PUSH | (1 << 8) | imm));
/* Stack must be aligned to 8 bytes: (LR, R4) */
size = GET_SAVED_REGISTERS_SIZE(scratches, saveds, 1);
size = GET_SAVED_REGISTERS_SIZE(scratches, saveds - saved_arg_count, 1);
if (fsaveds > 0 || fscratches >= SLJIT_FIRST_SAVED_FLOAT_REG) {
if ((size & SSIZE_OF(sw)) != 0) {
@ -1131,6 +1137,9 @@ SLJIT_API_FUNC_ATTRIBUTE sljit_s32 sljit_emit_enter(struct sljit_compiler *compi
local_size = ((size + local_size + 0x7) & ~0x7) - size;
compiler->local_size = local_size;
if (options & SLJIT_ENTER_REG_ARG)
arg_types = 0;
arg_types >>= SLJIT_ARG_SHIFT;
word_arg_count = 0;
saved_arg_count = 0;
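Note on the `local_size` computation in the hunk above: it rounds the whole frame (saved register area plus locals) up to a multiple of 8 bytes, then subtracts the register area again so that only the locals absorb the padding. A minimal standalone sketch of that arithmetic follows. The helper name `aligned_local_size` is hypothetical and is not part of sljit.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not an sljit function: returns the local area size
   that makes (saved_regs_size + result) a multiple of 8 bytes. */
static int32_t aligned_local_size(int32_t saved_regs_size, int32_t local_size)
{
	assert(saved_regs_size >= 0 && local_size >= 0);
	/* Same expression as in the diff: round the sum up, then remove the
	   register area so the padding ends up in the locals. */
	return ((saved_regs_size + local_size + 0x7) & ~0x7) - saved_regs_size;
}
```

With a 12 byte register area and no locals, the sketch yields 4 bytes of padding, so the frame totals 16 bytes.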
@ -1173,13 +1182,14 @@ SLJIT_API_FUNC_ATTRIBUTE sljit_s32 sljit_emit_enter(struct sljit_compiler *compi
else
break;
SLJIT_ASSERT(reg_map[tmp] <= 7);
if (offset < 4 * sizeof(sljit_sw))
FAIL_IF(push_inst16(compiler, MOV | RD3(tmp) | (offset << 1)));
else
FAIL_IF(push_inst16(compiler, MOV | ((sljit_ins)reg_map[tmp] & 0x7) | (((sljit_ins)reg_map[tmp] & 0x8) << 4) | (offset << 1)));
else if (reg_map[tmp] <= 7)
FAIL_IF(push_inst16(compiler, LDR_SP | RDN3(tmp)
| ((offset + (sljit_uw)size - 4 * sizeof(sljit_sw)) >> 2)));
else
FAIL_IF(push_inst32(compiler, LDR | RT4(tmp) | RN4(SLJIT_SP)
| ((offset + (sljit_uw)size - 4 * sizeof(sljit_sw)))));
break;
}
@ -1293,7 +1303,7 @@ SLJIT_API_FUNC_ATTRIBUTE sljit_s32 sljit_set_context(struct sljit_compiler *comp
CHECK(check_sljit_set_context(compiler, options, arg_types, scratches, saveds, fscratches, fsaveds, local_size));
set_set_context(compiler, options, arg_types, scratches, saveds, fscratches, fsaveds, local_size);
size = GET_SAVED_REGISTERS_SIZE(scratches, saveds, 1);
size = GET_SAVED_REGISTERS_SIZE(scratches, saveds - SLJIT_KEPT_SAVEDS_COUNT(options), 1);
if ((size & SSIZE_OF(sw)) != 0 && (fsaveds > 0 || fscratches >= SLJIT_FIRST_SAVED_FLOAT_REG))
size += SSIZE_OF(sw);
@ -1325,6 +1335,7 @@ static sljit_s32 emit_add_sp(struct sljit_compiler *compiler, sljit_uw imm)
static sljit_s32 emit_stack_frame_release(struct sljit_compiler *compiler, sljit_s32 frame_size)
{
sljit_s32 local_size, fscratches, fsaveds, i, tmp;
sljit_s32 saveds_restore_start = SLJIT_S0 - SLJIT_KEPT_SAVEDS_COUNT(compiler->options);
sljit_s32 lr_dst = TMP_PC;
sljit_uw reg_list;
@ -1358,8 +1369,11 @@ static sljit_s32 emit_stack_frame_release(struct sljit_compiler *compiler, sljit
reg_list = 0;
tmp = SLJIT_S0 - compiler->saveds;
for (i = SLJIT_S0; i > tmp; i--)
reg_list |= (sljit_uw)1 << reg_map[i];
if (saveds_restore_start != tmp) {
for (i = saveds_restore_start; i > tmp; i--)
reg_list |= (sljit_uw)1 << reg_map[i];
} else
saveds_restore_start = 0;
for (i = compiler->scratches; i >= SLJIT_FIRST_SAVED_REG; i--)
reg_list |= (sljit_uw)1 << reg_map[i];
@ -1379,9 +1393,9 @@ static sljit_s32 emit_stack_frame_release(struct sljit_compiler *compiler, sljit
if (reg_list == 0)
return SLJIT_SUCCESS;
if (compiler->saveds > 0) {
SLJIT_ASSERT(reg_list == ((sljit_uw)1 << reg_map[SLJIT_S0]));
lr_dst = SLJIT_S0;
if (saveds_restore_start != 0) {
SLJIT_ASSERT(reg_list == ((sljit_uw)1 << reg_map[saveds_restore_start]));
lr_dst = saveds_restore_start;
} else {
SLJIT_ASSERT(reg_list == ((sljit_uw)1 << reg_map[SLJIT_FIRST_SAVED_REG]));
lr_dst = SLJIT_FIRST_SAVED_REG;
@ -1685,10 +1699,7 @@ SLJIT_API_FUNC_ATTRIBUTE sljit_s32 sljit_emit_op2u(struct sljit_compiler *compil
CHECK_ERROR();
CHECK(check_sljit_emit_op2(compiler, op, 1, 0, 0, src1, src1w, src2, src2w));
#if (defined SLJIT_VERBOSE && SLJIT_VERBOSE) \
|| (defined SLJIT_ARGUMENT_CHECKS && SLJIT_ARGUMENT_CHECKS)
compiler->skip_checks = 1;
#endif
SLJIT_SKIP_CHECKS(compiler);
return sljit_emit_op2(compiler, op, TMP_REG1, 0, src1, src1w, src2, src2w);
}
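Note on the `SLJIT_SKIP_CHECKS(compiler)` replacement above: the diff folds the repeated `#if ... compiler->skip_checks = 1; #endif` block into a single macro call. The macro definition itself is not shown in this section, so the following is only a sketch of how such a macro could be shaped. All `DEMO_` and `demo_` names are hypothetical stand-ins, not sljit identifiers.

```c
#include <assert.h>

/* Hypothetical stand-in for the build configuration (assumption). */
#define DEMO_ARGUMENT_CHECKS 1

/* Hypothetical stand-in for the compiler state (assumption). */
struct demo_compiler {
	int skip_checks;
};

/* The conditional lives in one place instead of at every call site. */
#if (defined DEMO_VERBOSE && DEMO_VERBOSE) \
	|| (defined DEMO_ARGUMENT_CHECKS && DEMO_ARGUMENT_CHECKS)
#define DEMO_SKIP_CHECKS(compiler) ((compiler)->skip_checks = 1)
#else
#define DEMO_SKIP_CHECKS(compiler) ((void)(compiler))
#endif

/* A wrapper that marks the next nested emit call as already checked. */
static int demo_emit_wrapper(struct demo_compiler *compiler)
{
	assert(compiler != 0);
	DEMO_SKIP_CHECKS(compiler);
	return compiler->skip_checks;
}
```

When neither configuration macro is set, the sketch expands to a no-op that still references its argument, which avoids unused-parameter warnings.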
@ -1955,8 +1966,6 @@ SLJIT_API_FUNC_ATTRIBUTE sljit_s32 sljit_emit_fop2(struct sljit_compiler *compil
return emit_fop_mem(compiler, (op & SLJIT_32), TMP_FREG1, dst, dstw);
}
#undef FPU_LOAD
/* --------------------------------------------------------------------- */
/* Other instructions */
/* --------------------------------------------------------------------- */
@ -1984,11 +1993,15 @@ static sljit_uw get_cc(struct sljit_compiler *compiler, sljit_s32 type)
{
switch (type) {
case SLJIT_EQUAL:
case SLJIT_EQUAL_F64:
case SLJIT_F_EQUAL:
case SLJIT_ORDERED_EQUAL:
case SLJIT_UNORDERED_OR_EQUAL: /* Not supported. */
return 0x0;
case SLJIT_NOT_EQUAL:
case SLJIT_NOT_EQUAL_F64:
case SLJIT_F_NOT_EQUAL:
case SLJIT_UNORDERED_OR_NOT_EQUAL:
case SLJIT_ORDERED_NOT_EQUAL: /* Not supported. */
return 0x1;
case SLJIT_CARRY:
@ -1997,7 +2010,6 @@ static sljit_uw get_cc(struct sljit_compiler *compiler, sljit_s32 type)
/* fallthrough */
case SLJIT_LESS:
case SLJIT_LESS_F64:
return 0x3;
case SLJIT_NOT_CARRY:
@ -2006,27 +2018,33 @@ static sljit_uw get_cc(struct sljit_compiler *compiler, sljit_s32 type)
/* fallthrough */
case SLJIT_GREATER_EQUAL:
case SLJIT_GREATER_EQUAL_F64:
return 0x2;
case SLJIT_GREATER:
case SLJIT_GREATER_F64:
case SLJIT_UNORDERED_OR_GREATER:
return 0x8;
case SLJIT_LESS_EQUAL:
case SLJIT_LESS_EQUAL_F64:
case SLJIT_F_LESS_EQUAL:
case SLJIT_ORDERED_LESS_EQUAL:
return 0x9;
case SLJIT_SIG_LESS:
case SLJIT_UNORDERED_OR_LESS:
return 0xb;
case SLJIT_SIG_GREATER_EQUAL:
case SLJIT_F_GREATER_EQUAL:
case SLJIT_ORDERED_GREATER_EQUAL:
return 0xa;
case SLJIT_SIG_GREATER:
case SLJIT_F_GREATER:
case SLJIT_ORDERED_GREATER:
return 0xc;
case SLJIT_SIG_LESS_EQUAL:
case SLJIT_UNORDERED_OR_LESS_EQUAL:
return 0xd;
case SLJIT_OVERFLOW:
@ -2034,7 +2052,7 @@ static sljit_uw get_cc(struct sljit_compiler *compiler, sljit_s32 type)
return 0x1;
/* fallthrough */
case SLJIT_UNORDERED_F64:
case SLJIT_UNORDERED:
return 0x6;
case SLJIT_NOT_OVERFLOW:
@ -2042,9 +2060,16 @@ static sljit_uw get_cc(struct sljit_compiler *compiler, sljit_s32 type)
return 0x0;
/* fallthrough */
case SLJIT_ORDERED_F64:
case SLJIT_ORDERED:
return 0x7;
case SLJIT_F_LESS:
case SLJIT_ORDERED_LESS:
return 0x4;
case SLJIT_UNORDERED_OR_GREATER_EQUAL:
return 0x5;
default: /* SLJIT_JUMP */
SLJIT_UNREACHABLE();
return 0xe;
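Note on the new floating point cases in `get_cc` above: the returned values are ARM condition codes, and the ordered/unordered variants rely on how a VFP compare sets the NZCV flags (less: N; equal: Z and C; greater: C; unordered: C and V). The sketch below models those flags and evaluates a condition code against them. It assumes the flags have been transferred to the APSR after the compare, and every name in it is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* NZCV flags packed into four bits. */
enum { FLAG_N = 8, FLAG_Z = 4, FLAG_C = 2, FLAG_V = 1 };

/* Outcome of a floating point comparison. */
enum fcmp { FCMP_LESS, FCMP_EQUAL, FCMP_GREATER, FCMP_UNORDERED };

/* Flags produced by a VFP compare once moved to the APSR. */
static unsigned vcmp_flags(enum fcmp result)
{
	switch (result) {
	case FCMP_LESS:    return FLAG_N;
	case FCMP_EQUAL:   return FLAG_Z | FLAG_C;
	case FCMP_GREATER: return FLAG_C;
	default:           return FLAG_C | FLAG_V; /* unordered */
	}
}

/* Evaluates an ARM condition code (0x0..0xd) against NZCV flags. */
static bool cond_holds(unsigned cc, unsigned flags)
{
	bool n = (flags & FLAG_N) != 0, z = (flags & FLAG_Z) != 0;
	bool c = (flags & FLAG_C) != 0, v = (flags & FLAG_V) != 0;
	bool r;

	assert(cc <= 0xd);
	switch (cc >> 1) {
	case 0: r = z; break;               /* EQ / NE */
	case 1: r = c; break;               /* CS / CC */
	case 2: r = n; break;               /* MI / PL */
	case 3: r = v; break;               /* VS / VC */
	case 4: r = c && !z; break;         /* HI / LS */
	case 5: r = (n == v); break;        /* GE / LT */
	default: r = !z && (n == v); break; /* GT / LE */
	}
	/* Odd codes are the inverse of the preceding even code. */
	return (cc & 1) ? !r : r;
}
```

Under this model `0x4` (MI) holds only for an ordered "less" result, while `0x5` (PL) also holds for unordered operands, which matches the `SLJIT_ORDERED_LESS` and `SLJIT_UNORDERED_OR_GREATER_EQUAL` cases. It also shows why `SLJIT_UNORDERED_OR_EQUAL` is marked as not supported: `0x0` (EQ) is false for unordered operands.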
@ -2289,52 +2314,49 @@ SLJIT_API_FUNC_ATTRIBUTE struct sljit_jump* sljit_emit_call(struct sljit_compile
CHECK_PTR(check_sljit_emit_call(compiler, type, arg_types));
#ifdef __SOFTFP__
PTR_FAIL_IF(softfloat_call_with_args(compiler, arg_types, NULL, &extra_space));
SLJIT_ASSERT((extra_space & 0x7) == 0);
if ((type & 0xff) != SLJIT_CALL_REG_ARG) {
PTR_FAIL_IF(softfloat_call_with_args(compiler, arg_types, NULL, &extra_space));
SLJIT_ASSERT((extra_space & 0x7) == 0);
if ((type & SLJIT_CALL_RETURN) && extra_space == 0)
type = SLJIT_JUMP | (type & SLJIT_REWRITABLE_JUMP);
if ((type & SLJIT_CALL_RETURN) && extra_space == 0)
type = SLJIT_JUMP | (type & SLJIT_REWRITABLE_JUMP);
#if (defined SLJIT_VERBOSE && SLJIT_VERBOSE) \
|| (defined SLJIT_ARGUMENT_CHECKS && SLJIT_ARGUMENT_CHECKS)
compiler->skip_checks = 1;
#endif
SLJIT_SKIP_CHECKS(compiler);
jump = sljit_emit_jump(compiler, type);
PTR_FAIL_IF(jump == NULL);
jump = sljit_emit_jump(compiler, type);
PTR_FAIL_IF(jump == NULL);
if (extra_space > 0) {
if (type & SLJIT_CALL_RETURN)
PTR_FAIL_IF(push_inst32(compiler, LDR | RT4(TMP_REG2)
| RN4(SLJIT_SP) | (extra_space - sizeof(sljit_sw))));
if (extra_space > 0) {
if (type & SLJIT_CALL_RETURN)
PTR_FAIL_IF(push_inst32(compiler, LDR | RT4(TMP_REG2)
| RN4(SLJIT_SP) | (extra_space - sizeof(sljit_sw))));
PTR_FAIL_IF(push_inst16(compiler, ADD_SP_I | (extra_space >> 2)));
PTR_FAIL_IF(push_inst16(compiler, ADD_SP_I | (extra_space >> 2)));
if (type & SLJIT_CALL_RETURN) {
PTR_FAIL_IF(push_inst16(compiler, BX | RN3(TMP_REG2)));
return jump;
if (type & SLJIT_CALL_RETURN) {
PTR_FAIL_IF(push_inst16(compiler, BX | RN3(TMP_REG2)));
return jump;
}
}
}
SLJIT_ASSERT(!(type & SLJIT_CALL_RETURN));
PTR_FAIL_IF(softfloat_post_call_with_args(compiler, arg_types));
return jump;
#else
SLJIT_ASSERT(!(type & SLJIT_CALL_RETURN));
PTR_FAIL_IF(softfloat_post_call_with_args(compiler, arg_types));
return jump;
}
#endif /* __SOFTFP__ */
if (type & SLJIT_CALL_RETURN) {
/* ldmia sp!, {..., lr} */
PTR_FAIL_IF(emit_stack_frame_release(compiler, -1));
type = SLJIT_JUMP | (type & SLJIT_REWRITABLE_JUMP);
}
PTR_FAIL_IF(hardfloat_call_with_args(compiler, arg_types));
#if (defined SLJIT_VERBOSE && SLJIT_VERBOSE) \
|| (defined SLJIT_ARGUMENT_CHECKS && SLJIT_ARGUMENT_CHECKS)
compiler->skip_checks = 1;
#endif
#ifndef __SOFTFP__
if ((type & 0xff) != SLJIT_CALL_REG_ARG)
PTR_FAIL_IF(hardfloat_call_with_args(compiler, arg_types));
#endif /* !__SOFTFP__ */
SLJIT_SKIP_CHECKS(compiler);
return sljit_emit_jump(compiler, type);
#endif
}
SLJIT_API_FUNC_ATTRIBUTE sljit_s32 sljit_emit_ijump(struct sljit_compiler *compiler, sljit_s32 type, sljit_s32 src, sljit_sw srcw)
@ -2391,48 +2413,45 @@ SLJIT_API_FUNC_ATTRIBUTE sljit_s32 sljit_emit_icall(struct sljit_compiler *compi
}
#ifdef __SOFTFP__
FAIL_IF(softfloat_call_with_args(compiler, arg_types, &src, &extra_space));
SLJIT_ASSERT((extra_space & 0x7) == 0);
if ((type & 0xff) != SLJIT_CALL_REG_ARG) {
FAIL_IF(softfloat_call_with_args(compiler, arg_types, &src, &extra_space));
SLJIT_ASSERT((extra_space & 0x7) == 0);
if ((type & SLJIT_CALL_RETURN) && extra_space == 0)
type = SLJIT_JUMP;
if ((type & SLJIT_CALL_RETURN) && extra_space == 0)
type = SLJIT_JUMP;
#if (defined SLJIT_VERBOSE && SLJIT_VERBOSE) \
|| (defined SLJIT_ARGUMENT_CHECKS && SLJIT_ARGUMENT_CHECKS)
compiler->skip_checks = 1;
#endif
SLJIT_SKIP_CHECKS(compiler);
FAIL_IF(sljit_emit_ijump(compiler, type, src, srcw));
FAIL_IF(sljit_emit_ijump(compiler, type, src, srcw));
if (extra_space > 0) {
if (type & SLJIT_CALL_RETURN)
FAIL_IF(push_inst32(compiler, LDR | RT4(TMP_REG2)
| RN4(SLJIT_SP) | (extra_space - sizeof(sljit_sw))));
if (extra_space > 0) {
if (type & SLJIT_CALL_RETURN)
FAIL_IF(push_inst32(compiler, LDR | RT4(TMP_REG2)
| RN4(SLJIT_SP) | (extra_space - sizeof(sljit_sw))));
FAIL_IF(push_inst16(compiler, ADD_SP_I | (extra_space >> 2)));
FAIL_IF(push_inst16(compiler, ADD_SP_I | (extra_space >> 2)));
if (type & SLJIT_CALL_RETURN)
return push_inst16(compiler, BX | RN3(TMP_REG2));
}
if (type & SLJIT_CALL_RETURN)
return push_inst16(compiler, BX | RN3(TMP_REG2));
SLJIT_ASSERT(!(type & SLJIT_CALL_RETURN));
return softfloat_post_call_with_args(compiler, arg_types);
}
#endif /* __SOFTFP__ */
SLJIT_ASSERT(!(type & SLJIT_CALL_RETURN));
return softfloat_post_call_with_args(compiler, arg_types);
#else /* !__SOFTFP__ */
if (type & SLJIT_CALL_RETURN) {
/* ldmia sp!, {..., lr} */
FAIL_IF(emit_stack_frame_release(compiler, -1));
type = SLJIT_JUMP;
}
FAIL_IF(hardfloat_call_with_args(compiler, arg_types));
#if (defined SLJIT_VERBOSE && SLJIT_VERBOSE) \
|| (defined SLJIT_ARGUMENT_CHECKS && SLJIT_ARGUMENT_CHECKS)
compiler->skip_checks = 1;
#endif
#ifndef __SOFTFP__
if ((type & 0xff) != SLJIT_CALL_REG_ARG)
FAIL_IF(hardfloat_call_with_args(compiler, arg_types));
#endif /* !__SOFTFP__ */
SLJIT_SKIP_CHECKS(compiler);
return sljit_emit_ijump(compiler, type, src, srcw);
#endif /* __SOFTFP__ */
}
SLJIT_API_FUNC_ATTRIBUTE sljit_s32 sljit_emit_op_flags(struct sljit_compiler *compiler, sljit_s32 op,
@ -2447,7 +2466,7 @@ SLJIT_API_FUNC_ATTRIBUTE sljit_s32 sljit_emit_op_flags(struct sljit_compiler *co
ADJUST_LOCAL_OFFSET(dst, dstw);
op = GET_OPCODE(op);
cc = get_cc(compiler, type & 0xff);
cc = get_cc(compiler, type);
dst_r = FAST_IS_REG(dst) ? dst : TMP_REG1;
if (op < SLJIT_ADD) {
@ -2499,7 +2518,7 @@ SLJIT_API_FUNC_ATTRIBUTE sljit_s32 sljit_emit_cmov(struct sljit_compiler *compil
dst_reg &= ~SLJIT_32;
cc = get_cc(compiler, type & 0xff);
cc = get_cc(compiler, type);
if (!(src & SLJIT_IMM)) {
FAIL_IF(push_inst16(compiler, IT | (cc << 4) | 0x8));
@ -2546,6 +2565,9 @@ SLJIT_API_FUNC_ATTRIBUTE sljit_s32 sljit_emit_mem(struct sljit_compiler *compile
CHECK_ERROR();
CHECK(check_sljit_emit_mem(compiler, type, reg, mem, memw));
if (type & SLJIT_MEM_UNALIGNED)
return sljit_emit_mem_unaligned(compiler, type, reg, mem, memw);
if ((mem & OFFS_REG_MASK) || (memw > 255 || memw < -255))
return SLJIT_ERR_UNSUPPORTED;
@ -2594,6 +2616,109 @@ SLJIT_API_FUNC_ATTRIBUTE sljit_s32 sljit_emit_mem(struct sljit_compiler *compile
return push_inst32(compiler, inst | RT4(reg) | RN4(mem & REG_MASK) | (sljit_ins)memw);
}
static sljit_s32 update_mem_addr(struct sljit_compiler *compiler, sljit_s32 *mem, sljit_sw *memw, sljit_s32 max_offset)
{
sljit_s32 arg = *mem;
sljit_sw argw = *memw;
sljit_uw imm;
*mem = TMP_REG1;
if (SLJIT_UNLIKELY(arg & OFFS_REG_MASK)) {
*memw = 0;
return push_inst32(compiler, ADD_W | RD4(TMP_REG1) | RN4(arg & REG_MASK) | RM4(OFFS_REG(arg)) | ((sljit_uw)(argw & 0x3) << 6));
}
arg &= REG_MASK;
if (arg) {
if (argw <= max_offset && argw >= -0xff) {
*mem = arg;
return SLJIT_SUCCESS;
}
if (argw < 0) {
imm = get_imm((sljit_uw)(-argw & ~0xff));
if (imm) {
*memw = -(-argw & 0xff);
return push_inst32(compiler, SUB_WI | RD4(TMP_REG1) | RN4(arg) | imm);
}
} else if ((argw & 0xfff) <= max_offset) {
imm = get_imm((sljit_uw)(argw & ~0xfff));
if (imm) {
*memw = argw & 0xfff;
return push_inst32(compiler, ADD_WI | RD4(TMP_REG1) | RN4(arg) | imm);
}
} else {
imm = get_imm((sljit_uw)((argw | 0xfff) + 1));
if (imm) {
*memw = (argw & 0xfff) - 0x1000;
return push_inst32(compiler, ADD_WI | RD4(TMP_REG1) | RN4(arg) | imm);
}
}
}
imm = (sljit_uw)(argw & ~0xfff);
if ((argw & 0xfff) > max_offset) {
imm += 0x1000;
*memw = (argw & 0xfff) - 0x1000;
} else
*memw = argw & 0xfff;
FAIL_IF(load_immediate(compiler, TMP_REG1, imm));
if (arg == 0)
return SLJIT_SUCCESS;
return push_inst16(compiler, ADD | SET_REGS44(TMP_REG1, arg));
}
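Note on the fallback path at the end of `update_mem_addr` above: it splits the offset into a part that is loaded into `TMP_REG1` and a small remainder that stays in the load/store instruction. If the low 12 bits exceed `max_offset`, it borrows `0x1000` from the base part and uses a small negative remainder instead. The following is a minimal sketch of just that split, with hypothetical names (`split_addr`, `split_offset`) and fixed 32-bit types in place of `sljit_sw`/`sljit_uw`.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical result type, not part of sljit. */
struct split_addr {
	uint32_t base_add; /* value loaded into the temporary register */
	int32_t offset;    /* remainder encoded in the load/store itself */
};

/* Mirrors the final fallback: keep the low 12 bits as the offset unless
   they exceed max_offset, in which case borrow 0x1000 from the base part. */
static struct split_addr split_offset(int32_t argw, int32_t max_offset)
{
	struct split_addr r;

	assert(max_offset >= 0 && max_offset <= 0xfff);
	r.base_add = (uint32_t)argw & ~(uint32_t)0xfff;
	if ((argw & 0xfff) > max_offset) {
		r.base_add += 0x1000;
		r.offset = (argw & 0xfff) - 0x1000;
	} else
		r.offset = argw & 0xfff;
	return r;
}
```

In both branches `base_add + offset` equals the original offset modulo 2^32. With `max_offset` set to `0xfff - 4`, as in the `sljit_emit_fmem` callers, the remainder is never above `0xffb`, so `offset + 4` for the second word does not exceed `0xfff`.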
SLJIT_API_FUNC_ATTRIBUTE sljit_s32 sljit_emit_fmem(struct sljit_compiler *compiler, sljit_s32 type,
sljit_s32 freg,
sljit_s32 mem, sljit_sw memw)
{
CHECK_ERROR();
CHECK(check_sljit_emit_fmem(compiler, type, freg, mem, memw));
if (type & (SLJIT_MEM_PRE | SLJIT_MEM_POST))
return SLJIT_ERR_UNSUPPORTED;
if (type & SLJIT_MEM_ALIGNED_32)
return emit_fop_mem(compiler, ((type ^ SLJIT_32) & SLJIT_32) | ((type & SLJIT_MEM_STORE) ? 0 : FPU_LOAD), freg, mem, memw);
if (type & SLJIT_MEM_STORE) {
FAIL_IF(push_inst32(compiler, VMOV | (1 << 20) | DN4(freg) | RT4(TMP_REG2)));
if (type & SLJIT_32)
return emit_op_mem(compiler, WORD_SIZE | STORE, TMP_REG2, mem, memw, TMP_REG1);
FAIL_IF(update_mem_addr(compiler, &mem, &memw, 0xfff - 4));
mem |= SLJIT_MEM;
FAIL_IF(emit_op_mem(compiler, WORD_SIZE | STORE, TMP_REG2, mem, memw, TMP_REG1));
FAIL_IF(push_inst32(compiler, VMOV | (1 << 20) | DN4(freg) | 0x80 | RT4(TMP_REG2)));
return emit_op_mem(compiler, WORD_SIZE | STORE, TMP_REG2, mem, memw + 4, TMP_REG1);
}
if (type & SLJIT_32) {
FAIL_IF(emit_op_mem(compiler, WORD_SIZE, TMP_REG2, mem, memw, TMP_REG1));
return push_inst32(compiler, VMOV | DN4(freg) | RT4(TMP_REG2));
}
FAIL_IF(update_mem_addr(compiler, &mem, &memw, 0xfff - 4));
mem |= SLJIT_MEM;
FAIL_IF(emit_op_mem(compiler, WORD_SIZE, TMP_REG2, mem, memw, TMP_REG1));
FAIL_IF(emit_op_mem(compiler, WORD_SIZE, TMP_REG1, mem, memw + 4, TMP_REG1));
return push_inst32(compiler, VMOV2 | DM4(freg) | RT4(TMP_REG2) | RN4(TMP_REG1));
}
#undef FPU_LOAD
SLJIT_API_FUNC_ATTRIBUTE struct sljit_const* sljit_emit_const(struct sljit_compiler *compiler, sljit_s32 dst, sljit_sw dstw, sljit_sw init_value)
{
struct sljit_const *const_;

@ -38,383 +38,6 @@ static sljit_s32 load_immediate(struct sljit_compiler *compiler, sljit_s32 dst_a
return (imm & 0xffff) ? push_inst(compiler, ORI | SA(dst_ar) | TA(dst_ar) | IMM(imm), dst_ar) : SLJIT_SUCCESS;
}
#define EMIT_LOGICAL(op_imm, op_norm) \
if (flags & SRC2_IMM) { \
if (op & SLJIT_SET_Z) \
FAIL_IF(push_inst(compiler, op_imm | S(src1) | TA(EQUAL_FLAG) | IMM(src2), EQUAL_FLAG)); \
if (!(flags & UNUSED_DEST)) \
FAIL_IF(push_inst(compiler, op_imm | S(src1) | T(dst) | IMM(src2), DR(dst))); \
} \
else { \
if (op & SLJIT_SET_Z) \
FAIL_IF(push_inst(compiler, op_norm | S(src1) | T(src2) | DA(EQUAL_FLAG), EQUAL_FLAG)); \
if (!(flags & UNUSED_DEST)) \
FAIL_IF(push_inst(compiler, op_norm | S(src1) | T(src2) | D(dst), DR(dst))); \
}
#define EMIT_SHIFT(op_imm, op_v) \
if (flags & SRC2_IMM) { \
if (op & SLJIT_SET_Z) \
FAIL_IF(push_inst(compiler, op_imm | T(src1) | DA(EQUAL_FLAG) | SH_IMM(src2), EQUAL_FLAG)); \
if (!(flags & UNUSED_DEST)) \
FAIL_IF(push_inst(compiler, op_imm | T(src1) | D(dst) | SH_IMM(src2), DR(dst))); \
} \
else { \
if (op & SLJIT_SET_Z) \
FAIL_IF(push_inst(compiler, op_v | S(src2) | T(src1) | DA(EQUAL_FLAG), EQUAL_FLAG)); \
if (!(flags & UNUSED_DEST)) \
FAIL_IF(push_inst(compiler, op_v | S(src2) | T(src1) | D(dst), DR(dst))); \
}
static SLJIT_INLINE sljit_s32 emit_single_op(struct sljit_compiler *compiler, sljit_s32 op, sljit_s32 flags,
sljit_s32 dst, sljit_s32 src1, sljit_sw src2)
{
sljit_s32 is_overflow, is_carry, is_handled;
switch (GET_OPCODE(op)) {
case SLJIT_MOV:
SLJIT_ASSERT(src1 == TMP_REG1 && !(flags & SRC2_IMM));
if (dst != src2)
return push_inst(compiler, ADDU | S(src2) | TA(0) | D(dst), DR(dst));
return SLJIT_SUCCESS;
case SLJIT_MOV_U8:
SLJIT_ASSERT(src1 == TMP_REG1 && !(flags & SRC2_IMM));
if ((flags & (REG_DEST | REG2_SOURCE)) == (REG_DEST | REG2_SOURCE))
return push_inst(compiler, ANDI | S(src2) | T(dst) | IMM(0xff), DR(dst));
SLJIT_ASSERT(dst == src2);
return SLJIT_SUCCESS;
case SLJIT_MOV_S8:
SLJIT_ASSERT(src1 == TMP_REG1 && !(flags & SRC2_IMM));
if ((flags & (REG_DEST | REG2_SOURCE)) == (REG_DEST | REG2_SOURCE)) {
#if (defined SLJIT_MIPS_REV && SLJIT_MIPS_REV >= 1)
return push_inst(compiler, SEB | T(src2) | D(dst), DR(dst));
#else /* SLJIT_MIPS_REV < 1 */
FAIL_IF(push_inst(compiler, SLL | T(src2) | D(dst) | SH_IMM(24), DR(dst)));
return push_inst(compiler, SRA | T(dst) | D(dst) | SH_IMM(24), DR(dst));
#endif /* SLJIT_MIPS_REV >= 1 */
}
SLJIT_ASSERT(dst == src2);
return SLJIT_SUCCESS;
case SLJIT_MOV_U16:
SLJIT_ASSERT(src1 == TMP_REG1 && !(flags & SRC2_IMM));
if ((flags & (REG_DEST | REG2_SOURCE)) == (REG_DEST | REG2_SOURCE))
return push_inst(compiler, ANDI | S(src2) | T(dst) | IMM(0xffff), DR(dst));
SLJIT_ASSERT(dst == src2);
return SLJIT_SUCCESS;
case SLJIT_MOV_S16:
SLJIT_ASSERT(src1 == TMP_REG1 && !(flags & SRC2_IMM));
if ((flags & (REG_DEST | REG2_SOURCE)) == (REG_DEST | REG2_SOURCE)) {
#if (defined SLJIT_MIPS_REV && SLJIT_MIPS_REV >= 1)
return push_inst(compiler, SEH | T(src2) | D(dst), DR(dst));
#else /* SLJIT_MIPS_REV < 1 */
FAIL_IF(push_inst(compiler, SLL | T(src2) | D(dst) | SH_IMM(16), DR(dst)));
return push_inst(compiler, SRA | T(dst) | D(dst) | SH_IMM(16), DR(dst));
#endif /* SLJIT_MIPS_REV >= 1 */
}
SLJIT_ASSERT(dst == src2);
return SLJIT_SUCCESS;
case SLJIT_NOT:
SLJIT_ASSERT(src1 == TMP_REG1 && !(flags & SRC2_IMM));
if (op & SLJIT_SET_Z)
FAIL_IF(push_inst(compiler, NOR | S(src2) | T(src2) | DA(EQUAL_FLAG), EQUAL_FLAG));
if (!(flags & UNUSED_DEST))
FAIL_IF(push_inst(compiler, NOR | S(src2) | T(src2) | D(dst), DR(dst)));
return SLJIT_SUCCESS;
case SLJIT_CLZ:
SLJIT_ASSERT(src1 == TMP_REG1 && !(flags & SRC2_IMM));
#if (defined SLJIT_MIPS_REV && SLJIT_MIPS_REV >= 1)
if (op & SLJIT_SET_Z)
FAIL_IF(push_inst(compiler, CLZ | S(src2) | TA(EQUAL_FLAG) | DA(EQUAL_FLAG), EQUAL_FLAG));
if (!(flags & UNUSED_DEST))
FAIL_IF(push_inst(compiler, CLZ | S(src2) | T(dst) | D(dst), DR(dst)));
#else /* SLJIT_MIPS_REV < 1 */
if (SLJIT_UNLIKELY(flags & UNUSED_DEST)) {
FAIL_IF(push_inst(compiler, SRL | T(src2) | DA(EQUAL_FLAG) | SH_IMM(31), EQUAL_FLAG));
return push_inst(compiler, XORI | SA(EQUAL_FLAG) | TA(EQUAL_FLAG) | IMM(1), EQUAL_FLAG);
}
/* Nearly all instructions are unmovable in the following sequence. */
FAIL_IF(push_inst(compiler, ADDU | S(src2) | TA(0) | D(TMP_REG1), DR(TMP_REG1)));
/* Check zero. */
FAIL_IF(push_inst(compiler, BEQ | S(TMP_REG1) | TA(0) | IMM(5), UNMOVABLE_INS));
FAIL_IF(push_inst(compiler, ORI | SA(0) | T(dst) | IMM(32), UNMOVABLE_INS));
FAIL_IF(push_inst(compiler, ADDIU | SA(0) | T(dst) | IMM(-1), DR(dst)));
/* Loop for searching the highest bit. */
FAIL_IF(push_inst(compiler, ADDIU | S(dst) | T(dst) | IMM(1), DR(dst)));
FAIL_IF(push_inst(compiler, BGEZ | S(TMP_REG1) | IMM(-2), UNMOVABLE_INS));
FAIL_IF(push_inst(compiler, SLL | T(TMP_REG1) | D(TMP_REG1) | SH_IMM(1), UNMOVABLE_INS));
#endif /* SLJIT_MIPS_REV >= 1 */
return SLJIT_SUCCESS;
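Note on the `SLJIT_CLZ` fallback for `SLJIT_MIPS_REV < 1` above: the sequence is hard to read because two instructions sit in branch delay slots and therefore execute whether or not the branch is taken. The following C model is a sketch of what the sequence appears to compute. The function name `clz32_model` is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* C model of the fallback instruction sequence; hypothetical name. */
static int32_t clz32_model(uint32_t value)
{
	int32_t count;
	int top_bit_clear;

	if (value == 0)
		return 32; /* BEQ taken; the ORI in its delay slot stored 32 */

	count = -1; /* ADDIU dst, zero, -1 */
	do {
		count++;                                    /* ADDIU dst, dst, 1 */
		top_bit_clear = (value & 0x80000000u) == 0; /* BGEZ tests the sign bit */
		value <<= 1;                                /* SLL in the delay slot */
	} while (top_bit_clear);

	assert(count >= 0 && count < 32);
	return count;
}
```

The shift runs on every iteration, including the last, which is harmless because the temporary register is not read again afterwards.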
case SLJIT_ADD:
is_overflow = GET_FLAG_TYPE(op) == SLJIT_OVERFLOW;
is_carry = GET_FLAG_TYPE(op) == GET_FLAG_TYPE(SLJIT_SET_CARRY);
if (flags & SRC2_IMM) {
if (is_overflow) {
if (src2 >= 0)
FAIL_IF(push_inst(compiler, OR | S(src1) | T(src1) | DA(EQUAL_FLAG), EQUAL_FLAG));
else
FAIL_IF(push_inst(compiler, NOR | S(src1) | T(src1) | DA(EQUAL_FLAG), EQUAL_FLAG));
}
else if (op & SLJIT_SET_Z)
FAIL_IF(push_inst(compiler, ADDIU | S(src1) | TA(EQUAL_FLAG) | IMM(src2), EQUAL_FLAG));
if (is_overflow || is_carry) {
if (src2 >= 0)
FAIL_IF(push_inst(compiler, ORI | S(src1) | TA(OTHER_FLAG) | IMM(src2), OTHER_FLAG));
else {
FAIL_IF(push_inst(compiler, ADDIU | SA(0) | TA(OTHER_FLAG) | IMM(src2), OTHER_FLAG));
FAIL_IF(push_inst(compiler, OR | S(src1) | TA(OTHER_FLAG) | DA(OTHER_FLAG), OTHER_FLAG));
}
}
/* dst may be the same as src1 or src2. */
if (!(flags & UNUSED_DEST) || (op & VARIABLE_FLAG_MASK))
FAIL_IF(push_inst(compiler, ADDIU | S(src1) | T(dst) | IMM(src2), DR(dst)));
}
else {
if (is_overflow)
FAIL_IF(push_inst(compiler, XOR | S(src1) | T(src2) | DA(EQUAL_FLAG), EQUAL_FLAG));
else if (op & SLJIT_SET_Z)
FAIL_IF(push_inst(compiler, ADDU | S(src1) | T(src2) | DA(EQUAL_FLAG), EQUAL_FLAG));
if (is_overflow || is_carry)
FAIL_IF(push_inst(compiler, OR | S(src1) | T(src2) | DA(OTHER_FLAG), OTHER_FLAG));
/* dst may be the same as src1 or src2. */
if (!(flags & UNUSED_DEST) || (op & VARIABLE_FLAG_MASK))
FAIL_IF(push_inst(compiler, ADDU | S(src1) | T(src2) | D(dst), DR(dst)));
}
/* a + b >= a | b (otherwise, the carry should be set to 1). */
if (is_overflow || is_carry)
FAIL_IF(push_inst(compiler, SLTU | S(dst) | TA(OTHER_FLAG) | DA(OTHER_FLAG), OTHER_FLAG));
if (!is_overflow)
return SLJIT_SUCCESS;
FAIL_IF(push_inst(compiler, SLL | TA(OTHER_FLAG) | D(TMP_REG1) | SH_IMM(31), DR(TMP_REG1)));
FAIL_IF(push_inst(compiler, XOR | S(TMP_REG1) | TA(EQUAL_FLAG) | DA(EQUAL_FLAG), EQUAL_FLAG));
FAIL_IF(push_inst(compiler, XOR | S(dst) | TA(EQUAL_FLAG) | DA(OTHER_FLAG), OTHER_FLAG));
if (op & SLJIT_SET_Z)
FAIL_IF(push_inst(compiler, ADDU | S(dst) | TA(0) | DA(EQUAL_FLAG), EQUAL_FLAG));
return push_inst(compiler, SRL | TA(OTHER_FLAG) | DA(OTHER_FLAG) | SH_IMM(31), OTHER_FLAG);
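Note on the carry computation in the `SLJIT_ADD` case above: MIPS has no carry flag, so the code keeps `src1 | src2` in `OTHER_FLAG` and then compares the sum against it with `SLTU`. This works because `a + b == (a | b) + (a & b)`, so without wraparound the sum can never be smaller than `a | b`. The following is a minimal sketch of the idea with a 64-bit cross-check. The name `add_carries` is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Returns 1 when a + b wraps around in 32-bit unsigned arithmetic.
   Hypothetical helper used only for this sketch. */
static uint32_t add_carries(uint32_t a, uint32_t b)
{
	uint32_t sum = a + b;          /* ADDU: wraps modulo 2^32 */
	uint32_t either = a | b;       /* OR into the flag register */
	uint32_t carry = sum < either; /* SLTU sum, flag */

	/* Cross-check against a widened addition. */
	assert(carry == (uint32_t)(((uint64_t)a + b) >> 32));
	return carry;
}
```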
case SLJIT_ADDC:
is_carry = GET_FLAG_TYPE(op) == GET_FLAG_TYPE(SLJIT_SET_CARRY);
if (flags & SRC2_IMM) {
if (is_carry) {
if (src2 >= 0)
FAIL_IF(push_inst(compiler, ORI | S(src1) | TA(EQUAL_FLAG) | IMM(src2), EQUAL_FLAG));
else {
FAIL_IF(push_inst(compiler, ADDIU | SA(0) | TA(EQUAL_FLAG) | IMM(src2), EQUAL_FLAG));
FAIL_IF(push_inst(compiler, OR | S(src1) | TA(EQUAL_FLAG) | DA(EQUAL_FLAG), EQUAL_FLAG));
}
}
FAIL_IF(push_inst(compiler, ADDIU | S(src1) | T(dst) | IMM(src2), DR(dst)));
} else {
if (is_carry)
FAIL_IF(push_inst(compiler, OR | S(src1) | T(src2) | DA(EQUAL_FLAG), EQUAL_FLAG));
/* dst may be the same as src1 or src2. */
FAIL_IF(push_inst(compiler, ADDU | S(src1) | T(src2) | D(dst), DR(dst)));
}
if (is_carry)
FAIL_IF(push_inst(compiler, SLTU | S(dst) | TA(EQUAL_FLAG) | DA(EQUAL_FLAG), EQUAL_FLAG));
FAIL_IF(push_inst(compiler, ADDU | S(dst) | TA(OTHER_FLAG) | D(dst), DR(dst)));
if (!is_carry)
return SLJIT_SUCCESS;
/* Set ULESS_FLAG (dst == 0) && (OTHER_FLAG == 1). */
FAIL_IF(push_inst(compiler, SLTU | S(dst) | TA(OTHER_FLAG) | DA(OTHER_FLAG), OTHER_FLAG));
/* Set carry flag. */
return push_inst(compiler, OR | SA(OTHER_FLAG) | TA(EQUAL_FLAG) | DA(OTHER_FLAG), OTHER_FLAG);
case SLJIT_SUB:
if ((flags & SRC2_IMM) && src2 == SIMM_MIN) {
FAIL_IF(push_inst(compiler, ADDIU | SA(0) | T(TMP_REG2) | IMM(src2), DR(TMP_REG2)));
src2 = TMP_REG2;
flags &= ~SRC2_IMM;
}
is_handled = 0;
if (flags & SRC2_IMM) {
if (GET_FLAG_TYPE(op) == SLJIT_LESS || GET_FLAG_TYPE(op) == SLJIT_GREATER_EQUAL) {
FAIL_IF(push_inst(compiler, SLTIU | S(src1) | TA(OTHER_FLAG) | IMM(src2), OTHER_FLAG));
is_handled = 1;
}
else if (GET_FLAG_TYPE(op) == SLJIT_SIG_LESS || GET_FLAG_TYPE(op) == SLJIT_SIG_GREATER_EQUAL) {
FAIL_IF(push_inst(compiler, SLTI | S(src1) | TA(OTHER_FLAG) | IMM(src2), OTHER_FLAG));
is_handled = 1;
}
}
if (!is_handled && GET_FLAG_TYPE(op) >= SLJIT_LESS && GET_FLAG_TYPE(op) <= SLJIT_SIG_LESS_EQUAL) {
is_handled = 1;
if (flags & SRC2_IMM) {
FAIL_IF(push_inst(compiler, ADDIU | SA(0) | T(TMP_REG2) | IMM(src2), DR(TMP_REG2)));
src2 = TMP_REG2;
flags &= ~SRC2_IMM;
}
if (GET_FLAG_TYPE(op) == SLJIT_LESS || GET_FLAG_TYPE(op) == SLJIT_GREATER_EQUAL) {
FAIL_IF(push_inst(compiler, SLTU | S(src1) | T(src2) | DA(OTHER_FLAG), OTHER_FLAG));
}
else if (GET_FLAG_TYPE(op) == SLJIT_GREATER || GET_FLAG_TYPE(op) == SLJIT_LESS_EQUAL)
{
FAIL_IF(push_inst(compiler, SLTU | S(src2) | T(src1) | DA(OTHER_FLAG), OTHER_FLAG));
}
else if (GET_FLAG_TYPE(op) == SLJIT_SIG_LESS || GET_FLAG_TYPE(op) == SLJIT_SIG_GREATER_EQUAL) {
FAIL_IF(push_inst(compiler, SLT | S(src1) | T(src2) | DA(OTHER_FLAG), OTHER_FLAG));
}
else if (GET_FLAG_TYPE(op) == SLJIT_SIG_GREATER || GET_FLAG_TYPE(op) == SLJIT_SIG_LESS_EQUAL)
{
FAIL_IF(push_inst(compiler, SLT | S(src2) | T(src1) | DA(OTHER_FLAG), OTHER_FLAG));
}
}
if (is_handled) {
if (flags & SRC2_IMM) {
if (op & SLJIT_SET_Z)
FAIL_IF(push_inst(compiler, ADDIU | S(src1) | TA(EQUAL_FLAG) | IMM(-src2), EQUAL_FLAG));
if (!(flags & UNUSED_DEST))
return push_inst(compiler, ADDIU | S(src1) | T(dst) | IMM(-src2), DR(dst));
}
else {
if (op & SLJIT_SET_Z)
FAIL_IF(push_inst(compiler, SUBU | S(src1) | T(src2) | DA(EQUAL_FLAG), EQUAL_FLAG));
if (!(flags & UNUSED_DEST))
return push_inst(compiler, SUBU | S(src1) | T(src2) | D(dst), DR(dst));
}
return SLJIT_SUCCESS;
}
is_overflow = GET_FLAG_TYPE(op) == SLJIT_OVERFLOW;
is_carry = GET_FLAG_TYPE(op) == GET_FLAG_TYPE(SLJIT_SET_CARRY);
if (flags & SRC2_IMM) {
if (is_overflow) {
if (src2 >= 0)
FAIL_IF(push_inst(compiler, OR | S(src1) | T(src1) | DA(EQUAL_FLAG), EQUAL_FLAG));
else
FAIL_IF(push_inst(compiler, NOR | S(src1) | T(src1) | DA(EQUAL_FLAG), EQUAL_FLAG));
}
else if (op & SLJIT_SET_Z)
FAIL_IF(push_inst(compiler, ADDIU | S(src1) | TA(EQUAL_FLAG) | IMM(-src2), EQUAL_FLAG));
if (is_overflow || is_carry)
FAIL_IF(push_inst(compiler, SLTIU | S(src1) | TA(OTHER_FLAG) | IMM(src2), OTHER_FLAG));
/* dst may be the same as src1 or src2. */
if (!(flags & UNUSED_DEST) || (op & VARIABLE_FLAG_MASK))
FAIL_IF(push_inst(compiler, ADDIU | S(src1) | T(dst) | IMM(-src2), DR(dst)));
}
else {
if (is_overflow)
FAIL_IF(push_inst(compiler, XOR | S(src1) | T(src2) | DA(EQUAL_FLAG), EQUAL_FLAG));
else if (op & SLJIT_SET_Z)
FAIL_IF(push_inst(compiler, SUBU | S(src1) | T(src2) | DA(EQUAL_FLAG), EQUAL_FLAG));
if (is_overflow || is_carry)
FAIL_IF(push_inst(compiler, SLTU | S(src1) | T(src2) | DA(OTHER_FLAG), OTHER_FLAG));
/* dst may be the same as src1 or src2. */
if (!(flags & UNUSED_DEST) || (op & VARIABLE_FLAG_MASK))
FAIL_IF(push_inst(compiler, SUBU | S(src1) | T(src2) | D(dst), DR(dst)));
}
if (!is_overflow)
return SLJIT_SUCCESS;
FAIL_IF(push_inst(compiler, SLL | TA(OTHER_FLAG) | D(TMP_REG1) | SH_IMM(31), DR(TMP_REG1)));
FAIL_IF(push_inst(compiler, XOR | S(TMP_REG1) | TA(EQUAL_FLAG) | DA(EQUAL_FLAG), EQUAL_FLAG));
FAIL_IF(push_inst(compiler, XOR | S(dst) | TA(EQUAL_FLAG) | DA(OTHER_FLAG), OTHER_FLAG));
if (op & SLJIT_SET_Z)
FAIL_IF(push_inst(compiler, ADDU | S(dst) | TA(0) | DA(EQUAL_FLAG), EQUAL_FLAG));
return push_inst(compiler, SRL | TA(OTHER_FLAG) | DA(OTHER_FLAG) | SH_IMM(31), OTHER_FLAG);
case SLJIT_SUBC:
if ((flags & SRC2_IMM) && src2 == SIMM_MIN) {
FAIL_IF(push_inst(compiler, ADDIU | SA(0) | T(TMP_REG2) | IMM(src2), DR(TMP_REG2)));
src2 = TMP_REG2;
flags &= ~SRC2_IMM;
}
is_carry = GET_FLAG_TYPE(op) == GET_FLAG_TYPE(SLJIT_SET_CARRY);
if (flags & SRC2_IMM) {
if (is_carry)
FAIL_IF(push_inst(compiler, SLTIU | S(src1) | TA(EQUAL_FLAG) | IMM(src2), EQUAL_FLAG));
/* dst may be the same as src1 or src2. */
FAIL_IF(push_inst(compiler, ADDIU | S(src1) | T(dst) | IMM(-src2), DR(dst)));
}
else {
if (is_carry)
FAIL_IF(push_inst(compiler, SLTU | S(src1) | T(src2) | DA(EQUAL_FLAG), EQUAL_FLAG));
/* dst may be the same as src1 or src2. */
FAIL_IF(push_inst(compiler, SUBU | S(src1) | T(src2) | D(dst), DR(dst)));
}
if (is_carry)
FAIL_IF(push_inst(compiler, SLTU | S(dst) | TA(OTHER_FLAG) | D(TMP_REG1), DR(TMP_REG1)));
FAIL_IF(push_inst(compiler, SUBU | S(dst) | TA(OTHER_FLAG) | D(dst), DR(dst)));
return (is_carry) ? push_inst(compiler, OR | SA(EQUAL_FLAG) | T(TMP_REG1) | DA(OTHER_FLAG), OTHER_FLAG) : SLJIT_SUCCESS;
case SLJIT_MUL:
SLJIT_ASSERT(!(flags & SRC2_IMM));
if (GET_FLAG_TYPE(op) != SLJIT_OVERFLOW) {
#if (defined SLJIT_MIPS_REV && SLJIT_MIPS_REV >= 1)
return push_inst(compiler, MUL | S(src1) | T(src2) | D(dst), DR(dst));
#else /* SLJIT_MIPS_REV < 1 */
FAIL_IF(push_inst(compiler, MULT | S(src1) | T(src2), MOVABLE_INS));
return push_inst(compiler, MFLO | D(dst), DR(dst));
#endif /* SLJIT_MIPS_REV >= 1 */
}
#if (defined SLJIT_MIPS_REV && SLJIT_MIPS_REV >= 6)
FAIL_IF(push_inst(compiler, MUL | S(src1) | T(src2) | D(dst), DR(dst)));
FAIL_IF(push_inst(compiler, MUH | S(src1) | T(src2) | DA(EQUAL_FLAG), EQUAL_FLAG));
#else /* SLJIT_MIPS_REV < 6 */
FAIL_IF(push_inst(compiler, MULT | S(src1) | T(src2), MOVABLE_INS));
FAIL_IF(push_inst(compiler, MFHI | DA(EQUAL_FLAG), EQUAL_FLAG));
FAIL_IF(push_inst(compiler, MFLO | D(dst), DR(dst)));
#endif /* SLJIT_MIPS_REV >= 6 */
FAIL_IF(push_inst(compiler, SRA | T(dst) | DA(OTHER_FLAG) | SH_IMM(31), OTHER_FLAG));
return push_inst(compiler, SUBU | SA(EQUAL_FLAG) | TA(OTHER_FLAG) | DA(OTHER_FLAG), OTHER_FLAG);
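Note on the overflow path of the `SLJIT_MUL` case above: a signed 32-bit multiplication fits exactly when the high word of the 64-bit product equals the sign extension of the low word. The code takes the high word (`MFHI` or `MUH`), computes `dst >> 31` with `SRA`, and subtracts the two, so a nonzero `OTHER_FLAG` signals overflow. The following is a minimal sketch with a hypothetical function name.

```c
#include <assert.h>
#include <stdint.h>

/* Returns 1 when a * b does not fit in a signed 32-bit integer.
   Hypothetical helper used only for this sketch. */
static int mul_overflows(int32_t a, int32_t b)
{
	int64_t product = (int64_t)a * (int64_t)b; /* MULT: full HI:LO result */
	uint64_t bits = (uint64_t)product;
	uint32_t lo = (uint32_t)bits;              /* MFLO */
	uint32_t hi = (uint32_t)(bits >> 32);      /* MFHI */
	/* SRA lo, 31: all ones when lo is negative, zero otherwise. */
	uint32_t sign_ext = (lo & 0x80000000u) ? 0xffffffffu : 0u;
	int overflow = (hi != sign_ext);           /* SUBU result is nonzero */

	/* Cross-check against a direct range test. */
	assert(overflow == (product < INT32_MIN || product > INT32_MAX));
	return overflow;
}
```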
case SLJIT_AND:
EMIT_LOGICAL(ANDI, AND);
return SLJIT_SUCCESS;
case SLJIT_OR:
EMIT_LOGICAL(ORI, OR);
return SLJIT_SUCCESS;
case SLJIT_XOR:
EMIT_LOGICAL(XORI, XOR);
return SLJIT_SUCCESS;
case SLJIT_SHL:
EMIT_SHIFT(SLL, SLLV);
return SLJIT_SUCCESS;
case SLJIT_LSHR:
EMIT_SHIFT(SRL, SRLV);
return SLJIT_SUCCESS;
case SLJIT_ASHR:
EMIT_SHIFT(SRA, SRAV);
return SLJIT_SUCCESS;
}
SLJIT_UNREACHABLE();
return SLJIT_SUCCESS;
}
static SLJIT_INLINE sljit_s32 emit_const(struct sljit_compiler *compiler, sljit_s32 dst, sljit_sw init_value)
{
FAIL_IF(push_inst(compiler, LUI | T(dst) | IMM(init_value >> 16), DR(dst)));
@ -573,8 +196,8 @@ SLJIT_API_FUNC_ATTRIBUTE struct sljit_jump* sljit_emit_call(struct sljit_compile
sljit_s32 arg_types)
{
struct sljit_jump *jump;
sljit_u32 extra_space = (sljit_u32)type;
sljit_ins ins;
sljit_u32 extra_space = 0;
sljit_ins ins = NOP;
CHECK_ERROR_PTR();
CHECK_PTR(check_sljit_emit_call(compiler, type, arg_types));
@ -583,14 +206,23 @@ SLJIT_API_FUNC_ATTRIBUTE struct sljit_jump* sljit_emit_call(struct sljit_compile
PTR_FAIL_IF(!jump);
set_jump(jump, compiler, type & SLJIT_REWRITABLE_JUMP);
PTR_FAIL_IF(call_with_args(compiler, arg_types, &ins, &extra_space));
if ((type & 0xff) != SLJIT_CALL_REG_ARG) {
extra_space = (sljit_u32)type;
PTR_FAIL_IF(call_with_args(compiler, arg_types, &ins, &extra_space));
} else if (type & SLJIT_CALL_RETURN)
PTR_FAIL_IF(emit_stack_frame_release(compiler, 0, &ins));
SLJIT_ASSERT(DR(PIC_ADDR_REG) == 25 && PIC_ADDR_REG == TMP_REG2);
PTR_FAIL_IF(emit_const(compiler, PIC_ADDR_REG, 0));
if (ins == NOP && compiler->delay_slot != UNMOVABLE_INS)
jump->flags |= IS_MOVABLE;
if (!(type & SLJIT_CALL_RETURN) || extra_space > 0) {
jump->flags |= IS_JAL | IS_CALL;
jump->flags |= IS_JAL;
if ((type & 0xff) != SLJIT_CALL_REG_ARG)
jump->flags |= IS_CALL;
PTR_FAIL_IF(push_inst(compiler, JALR | S(PIC_ADDR_REG) | DA(RETURN_ADDR_REG), UNMOVABLE_INS));
} else
PTR_FAIL_IF(push_inst(compiler, JR | S(PIC_ADDR_REG), UNMOVABLE_INS));
@ -598,6 +230,9 @@ SLJIT_API_FUNC_ATTRIBUTE struct sljit_jump* sljit_emit_call(struct sljit_compile
jump->addr = compiler->size;
PTR_FAIL_IF(push_inst(compiler, ins, UNMOVABLE_INS));
/* Maximum number of instructions required for generating a constant. */
compiler->size += 2;
if (extra_space == 0)
return jump;
@ -623,16 +258,37 @@ SLJIT_API_FUNC_ATTRIBUTE sljit_s32 sljit_emit_icall(struct sljit_compiler *compi
CHECK_ERROR();
CHECK(check_sljit_emit_icall(compiler, type, arg_types, src, srcw));
if (src & SLJIT_MEM) {
ADJUST_LOCAL_OFFSET(src, srcw);
FAIL_IF(emit_op_mem(compiler, WORD_DATA | LOAD_DATA, DR(PIC_ADDR_REG), src, srcw));
src = PIC_ADDR_REG;
srcw = 0;
}
if ((type & 0xff) == SLJIT_CALL_REG_ARG) {
if (type & SLJIT_CALL_RETURN) {
if (src >= SLJIT_FIRST_SAVED_REG && src <= SLJIT_S0) {
FAIL_IF(push_inst(compiler, ADDU | S(src) | TA(0) | D(PIC_ADDR_REG), DR(PIC_ADDR_REG)));
src = PIC_ADDR_REG;
srcw = 0;
}
FAIL_IF(emit_stack_frame_release(compiler, 0, &ins));
if (ins != NOP)
FAIL_IF(push_inst(compiler, ins, MOVABLE_INS));
}
SLJIT_SKIP_CHECKS(compiler);
return sljit_emit_ijump(compiler, type, src, srcw);
}
SLJIT_ASSERT(DR(PIC_ADDR_REG) == 25 && PIC_ADDR_REG == TMP_REG2);
if (src & SLJIT_IMM)
FAIL_IF(load_immediate(compiler, DR(PIC_ADDR_REG), srcw));
else if (FAST_IS_REG(src))
else if (src != PIC_ADDR_REG)
FAIL_IF(push_inst(compiler, ADDU | S(src) | TA(0) | D(PIC_ADDR_REG), DR(PIC_ADDR_REG)));
else if (src & SLJIT_MEM) {
ADJUST_LOCAL_OFFSET(src, srcw);
FAIL_IF(emit_op_mem(compiler, WORD_DATA | LOAD_DATA, DR(PIC_ADDR_REG), src, srcw));
}
FAIL_IF(call_with_args(compiler, arg_types, &ins, &extra_space));

@ -118,421 +118,6 @@ static sljit_s32 load_immediate(struct sljit_compiler *compiler, sljit_s32 dst_a
return !(imm & 0xffff) ? SLJIT_SUCCESS : push_inst(compiler, ORI | SA(dst_ar) | TA(dst_ar) | IMM(imm), dst_ar);
}
#define SELECT_OP(a, b) \
(!(op & SLJIT_32) ? a : b)
#define EMIT_LOGICAL(op_imm, op_norm) \
if (flags & SRC2_IMM) { \
if (op & SLJIT_SET_Z) \
FAIL_IF(push_inst(compiler, op_imm | S(src1) | TA(EQUAL_FLAG) | IMM(src2), EQUAL_FLAG)); \
if (!(flags & UNUSED_DEST)) \
FAIL_IF(push_inst(compiler, op_imm | S(src1) | T(dst) | IMM(src2), DR(dst))); \
} \
else { \
if (op & SLJIT_SET_Z) \
FAIL_IF(push_inst(compiler, op_norm | S(src1) | T(src2) | DA(EQUAL_FLAG), EQUAL_FLAG)); \
if (!(flags & UNUSED_DEST)) \
FAIL_IF(push_inst(compiler, op_norm | S(src1) | T(src2) | D(dst), DR(dst))); \
}
#define EMIT_SHIFT(op_dimm, op_dimm32, op_imm, op_dv, op_v) \
if (flags & SRC2_IMM) { \
if (src2 >= 32) { \
SLJIT_ASSERT(!(op & SLJIT_32)); \
ins = op_dimm32; \
src2 -= 32; \
} \
else \
ins = (op & SLJIT_32) ? op_imm : op_dimm; \
if (op & SLJIT_SET_Z) \
FAIL_IF(push_inst(compiler, ins | T(src1) | DA(EQUAL_FLAG) | SH_IMM(src2), EQUAL_FLAG)); \
if (!(flags & UNUSED_DEST)) \
FAIL_IF(push_inst(compiler, ins | T(src1) | D(dst) | SH_IMM(src2), DR(dst))); \
} \
else { \
ins = (op & SLJIT_32) ? op_v : op_dv; \
if (op & SLJIT_SET_Z) \
FAIL_IF(push_inst(compiler, ins | S(src2) | T(src1) | DA(EQUAL_FLAG), EQUAL_FLAG)); \
if (!(flags & UNUSED_DEST)) \
FAIL_IF(push_inst(compiler, ins | S(src2) | T(src1) | D(dst), DR(dst))); \
}
static SLJIT_INLINE sljit_s32 emit_single_op(struct sljit_compiler *compiler, sljit_s32 op, sljit_s32 flags,
sljit_s32 dst, sljit_s32 src1, sljit_sw src2)
{
sljit_ins ins;
sljit_s32 is_overflow, is_carry, is_handled;
switch (GET_OPCODE(op)) {
case SLJIT_MOV:
SLJIT_ASSERT(src1 == TMP_REG1 && !(flags & SRC2_IMM));
if (dst != src2)
return push_inst(compiler, SELECT_OP(DADDU, ADDU) | S(src2) | TA(0) | D(dst), DR(dst));
return SLJIT_SUCCESS;
case SLJIT_MOV_U8:
SLJIT_ASSERT(src1 == TMP_REG1 && !(flags & SRC2_IMM));
if ((flags & (REG_DEST | REG2_SOURCE)) == (REG_DEST | REG2_SOURCE))
return push_inst(compiler, ANDI | S(src2) | T(dst) | IMM(0xff), DR(dst));
SLJIT_ASSERT(dst == src2);
return SLJIT_SUCCESS;
case SLJIT_MOV_S8:
SLJIT_ASSERT(src1 == TMP_REG1 && !(flags & SRC2_IMM));
if ((flags & (REG_DEST | REG2_SOURCE)) == (REG_DEST | REG2_SOURCE)) {
#if (defined SLJIT_MIPS_REV && SLJIT_MIPS_REV >= 1)
if (op & SLJIT_32)
return push_inst(compiler, SEB | T(src2) | D(dst), DR(dst));
#endif /* SLJIT_MIPS_REV >= 1 */
FAIL_IF(push_inst(compiler, DSLL32 | T(src2) | D(dst) | SH_IMM(24), DR(dst)));
return push_inst(compiler, DSRA32 | T(dst) | D(dst) | SH_IMM(24), DR(dst));
}
SLJIT_ASSERT(dst == src2);
return SLJIT_SUCCESS;
case SLJIT_MOV_U16:
SLJIT_ASSERT(src1 == TMP_REG1 && !(flags & SRC2_IMM));
if ((flags & (REG_DEST | REG2_SOURCE)) == (REG_DEST | REG2_SOURCE))
return push_inst(compiler, ANDI | S(src2) | T(dst) | IMM(0xffff), DR(dst));
SLJIT_ASSERT(dst == src2);
return SLJIT_SUCCESS;
case SLJIT_MOV_S16:
SLJIT_ASSERT(src1 == TMP_REG1 && !(flags & SRC2_IMM));
if ((flags & (REG_DEST | REG2_SOURCE)) == (REG_DEST | REG2_SOURCE)) {
#if (defined SLJIT_MIPS_REV && SLJIT_MIPS_REV >= 1)
if (op & SLJIT_32)
return push_inst(compiler, SEH | T(src2) | D(dst), DR(dst));
#endif /* SLJIT_MIPS_REV >= 1 */
FAIL_IF(push_inst(compiler, DSLL32 | T(src2) | D(dst) | SH_IMM(16), DR(dst)));
return push_inst(compiler, DSRA32 | T(dst) | D(dst) | SH_IMM(16), DR(dst));
}
SLJIT_ASSERT(dst == src2);
return SLJIT_SUCCESS;
case SLJIT_MOV_U32:
SLJIT_ASSERT(src1 == TMP_REG1 && !(flags & SRC2_IMM) && !(op & SLJIT_32));
if ((flags & (REG_DEST | REG2_SOURCE)) == (REG_DEST | REG2_SOURCE)) {
#if (defined SLJIT_MIPS_REV && SLJIT_MIPS_REV >= 2)
if (dst == src2)
return push_inst(compiler, DINSU | T(src2) | SA(0) | (31 << 11) | (0 << 11), DR(dst));
#endif /* SLJIT_MIPS_REV >= 2 */
FAIL_IF(push_inst(compiler, DSLL32 | T(src2) | D(dst) | SH_IMM(0), DR(dst)));
return push_inst(compiler, DSRL32 | T(dst) | D(dst) | SH_IMM(0), DR(dst));
}
SLJIT_ASSERT(dst == src2);
return SLJIT_SUCCESS;
case SLJIT_MOV_S32:
SLJIT_ASSERT(src1 == TMP_REG1 && !(flags & SRC2_IMM) && !(op & SLJIT_32));
if ((flags & (REG_DEST | REG2_SOURCE)) == (REG_DEST | REG2_SOURCE)) {
return push_inst(compiler, SLL | T(src2) | D(dst) | SH_IMM(0), DR(dst));
}
SLJIT_ASSERT(dst == src2);
return SLJIT_SUCCESS;
case SLJIT_NOT:
SLJIT_ASSERT(src1 == TMP_REG1 && !(flags & SRC2_IMM));
if (op & SLJIT_SET_Z)
FAIL_IF(push_inst(compiler, NOR | S(src2) | T(src2) | DA(EQUAL_FLAG), EQUAL_FLAG));
if (!(flags & UNUSED_DEST))
FAIL_IF(push_inst(compiler, NOR | S(src2) | T(src2) | D(dst), DR(dst)));
return SLJIT_SUCCESS;
case SLJIT_CLZ:
SLJIT_ASSERT(src1 == TMP_REG1 && !(flags & SRC2_IMM));
#if (defined SLJIT_MIPS_REV && SLJIT_MIPS_REV >= 1)
if (op & SLJIT_SET_Z)
FAIL_IF(push_inst(compiler, SELECT_OP(DCLZ, CLZ) | S(src2) | TA(EQUAL_FLAG) | DA(EQUAL_FLAG), EQUAL_FLAG));
if (!(flags & UNUSED_DEST))
FAIL_IF(push_inst(compiler, SELECT_OP(DCLZ, CLZ) | S(src2) | T(dst) | D(dst), DR(dst)));
#else /* SLJIT_MIPS_REV < 1 */
if (SLJIT_UNLIKELY(flags & UNUSED_DEST)) {
FAIL_IF(push_inst(compiler, SELECT_OP(DSRL32, SRL) | T(src2) | DA(EQUAL_FLAG) | SH_IMM(31), EQUAL_FLAG));
return push_inst(compiler, XORI | SA(EQUAL_FLAG) | TA(EQUAL_FLAG) | IMM(1), EQUAL_FLAG);
}
/* Nearly all instructions are unmovable in the following sequence. */
FAIL_IF(push_inst(compiler, SELECT_OP(DADDU, ADDU) | S(src2) | TA(0) | D(TMP_REG1), DR(TMP_REG1)));
/* Check zero. */
FAIL_IF(push_inst(compiler, BEQ | S(TMP_REG1) | TA(0) | IMM(5), UNMOVABLE_INS));
FAIL_IF(push_inst(compiler, ORI | SA(0) | T(dst) | IMM((op & SLJIT_32) ? 32 : 64), UNMOVABLE_INS));
FAIL_IF(push_inst(compiler, SELECT_OP(DADDIU, ADDIU) | SA(0) | T(dst) | IMM(-1), DR(dst)));
/* Loop for searching the highest bit. */
FAIL_IF(push_inst(compiler, SELECT_OP(DADDIU, ADDIU) | S(dst) | T(dst) | IMM(1), DR(dst)));
FAIL_IF(push_inst(compiler, BGEZ | S(TMP_REG1) | IMM(-2), UNMOVABLE_INS));
FAIL_IF(push_inst(compiler, SELECT_OP(DSLL, SLL) | T(TMP_REG1) | D(TMP_REG1) | SH_IMM(1), UNMOVABLE_INS));
#endif /* SLJIT_MIPS_REV >= 1 */
return SLJIT_SUCCESS;
case SLJIT_ADD:
is_overflow = GET_FLAG_TYPE(op) == SLJIT_OVERFLOW;
is_carry = GET_FLAG_TYPE(op) == GET_FLAG_TYPE(SLJIT_SET_CARRY);
if (flags & SRC2_IMM) {
if (is_overflow) {
if (src2 >= 0)
FAIL_IF(push_inst(compiler, OR | S(src1) | T(src1) | DA(EQUAL_FLAG), EQUAL_FLAG));
else
FAIL_IF(push_inst(compiler, NOR | S(src1) | T(src1) | DA(EQUAL_FLAG), EQUAL_FLAG));
}
else if (op & SLJIT_SET_Z)
FAIL_IF(push_inst(compiler, SELECT_OP(DADDIU, ADDIU) | S(src1) | TA(EQUAL_FLAG) | IMM(src2), EQUAL_FLAG));
if (is_overflow || is_carry) {
if (src2 >= 0)
FAIL_IF(push_inst(compiler, ORI | S(src1) | TA(OTHER_FLAG) | IMM(src2), OTHER_FLAG));
else {
FAIL_IF(push_inst(compiler, SELECT_OP(DADDIU, ADDIU) | SA(0) | TA(OTHER_FLAG) | IMM(src2), OTHER_FLAG));
FAIL_IF(push_inst(compiler, OR | S(src1) | TA(OTHER_FLAG) | DA(OTHER_FLAG), OTHER_FLAG));
}
}
/* dst may be the same as src1 or src2. */
if (!(flags & UNUSED_DEST) || (op & VARIABLE_FLAG_MASK))
FAIL_IF(push_inst(compiler, SELECT_OP(DADDIU, ADDIU) | S(src1) | T(dst) | IMM(src2), DR(dst)));
}
else {
if (is_overflow)
FAIL_IF(push_inst(compiler, XOR | S(src1) | T(src2) | DA(EQUAL_FLAG), EQUAL_FLAG));
else if (op & SLJIT_SET_Z)
FAIL_IF(push_inst(compiler, SELECT_OP(DADDU, ADDU) | S(src1) | T(src2) | DA(EQUAL_FLAG), EQUAL_FLAG));
if (is_overflow || is_carry)
FAIL_IF(push_inst(compiler, OR | S(src1) | T(src2) | DA(OTHER_FLAG), OTHER_FLAG));
/* dst may be the same as src1 or src2. */
if (!(flags & UNUSED_DEST) || (op & VARIABLE_FLAG_MASK))
FAIL_IF(push_inst(compiler, SELECT_OP(DADDU, ADDU) | S(src1) | T(src2) | D(dst), DR(dst)));
}
/* a + b >= a | b (otherwise, the carry should be set to 1). */
if (is_overflow || is_carry)
FAIL_IF(push_inst(compiler, SLTU | S(dst) | TA(OTHER_FLAG) | DA(OTHER_FLAG), OTHER_FLAG));
if (!is_overflow)
return SLJIT_SUCCESS;
FAIL_IF(push_inst(compiler, SELECT_OP(DSLL32, SLL) | TA(OTHER_FLAG) | D(TMP_REG1) | SH_IMM(31), DR(TMP_REG1)));
FAIL_IF(push_inst(compiler, XOR | S(TMP_REG1) | TA(EQUAL_FLAG) | DA(EQUAL_FLAG), EQUAL_FLAG));
FAIL_IF(push_inst(compiler, XOR | S(dst) | TA(EQUAL_FLAG) | DA(OTHER_FLAG), OTHER_FLAG));
if (op & SLJIT_SET_Z)
FAIL_IF(push_inst(compiler, SELECT_OP(DADDU, ADDU) | S(dst) | TA(0) | DA(EQUAL_FLAG), EQUAL_FLAG));
return push_inst(compiler, SELECT_OP(DSRL32, SRL) | TA(OTHER_FLAG) | DA(OTHER_FLAG) | SH_IMM(31), OTHER_FLAG);
case SLJIT_ADDC:
is_carry = GET_FLAG_TYPE(op) == GET_FLAG_TYPE(SLJIT_SET_CARRY);
if (flags & SRC2_IMM) {
if (is_carry) {
if (src2 >= 0)
FAIL_IF(push_inst(compiler, ORI | S(src1) | TA(EQUAL_FLAG) | IMM(src2), EQUAL_FLAG));
else {
FAIL_IF(push_inst(compiler, SELECT_OP(DADDIU, ADDIU) | SA(0) | TA(EQUAL_FLAG) | IMM(src2), EQUAL_FLAG));
FAIL_IF(push_inst(compiler, OR | S(src1) | TA(EQUAL_FLAG) | DA(EQUAL_FLAG), EQUAL_FLAG));
}
}
FAIL_IF(push_inst(compiler, SELECT_OP(DADDIU, ADDIU) | S(src1) | T(dst) | IMM(src2), DR(dst)));
} else {
if (is_carry)
FAIL_IF(push_inst(compiler, OR | S(src1) | T(src2) | DA(EQUAL_FLAG), EQUAL_FLAG));
/* dst may be the same as src1 or src2. */
FAIL_IF(push_inst(compiler, SELECT_OP(DADDU, ADDU) | S(src1) | T(src2) | D(dst), DR(dst)));
}
if (is_carry)
FAIL_IF(push_inst(compiler, SLTU | S(dst) | TA(EQUAL_FLAG) | DA(EQUAL_FLAG), EQUAL_FLAG));
FAIL_IF(push_inst(compiler, SELECT_OP(DADDU, ADDU) | S(dst) | TA(OTHER_FLAG) | D(dst), DR(dst)));
if (!is_carry)
return SLJIT_SUCCESS;
/* Set ULESS_FLAG (dst == 0) && (OTHER_FLAG == 1). */
FAIL_IF(push_inst(compiler, SLTU | S(dst) | TA(OTHER_FLAG) | DA(OTHER_FLAG), OTHER_FLAG));
/* Set carry flag. */
return push_inst(compiler, OR | SA(OTHER_FLAG) | TA(EQUAL_FLAG) | DA(OTHER_FLAG), OTHER_FLAG);
case SLJIT_SUB:
if ((flags & SRC2_IMM) && src2 == SIMM_MIN) {
FAIL_IF(push_inst(compiler, ADDIU | SA(0) | T(TMP_REG2) | IMM(src2), DR(TMP_REG2)));
src2 = TMP_REG2;
flags &= ~SRC2_IMM;
}
is_handled = 0;
if (flags & SRC2_IMM) {
if (GET_FLAG_TYPE(op) == SLJIT_LESS || GET_FLAG_TYPE(op) == SLJIT_GREATER_EQUAL) {
FAIL_IF(push_inst(compiler, SLTIU | S(src1) | TA(OTHER_FLAG) | IMM(src2), OTHER_FLAG));
is_handled = 1;
}
else if (GET_FLAG_TYPE(op) == SLJIT_SIG_LESS || GET_FLAG_TYPE(op) == SLJIT_SIG_GREATER_EQUAL) {
FAIL_IF(push_inst(compiler, SLTI | S(src1) | TA(OTHER_FLAG) | IMM(src2), OTHER_FLAG));
is_handled = 1;
}
}
if (!is_handled && GET_FLAG_TYPE(op) >= SLJIT_LESS && GET_FLAG_TYPE(op) <= SLJIT_SIG_LESS_EQUAL) {
is_handled = 1;
if (flags & SRC2_IMM) {
FAIL_IF(push_inst(compiler, ADDIU | SA(0) | T(TMP_REG2) | IMM(src2), DR(TMP_REG2)));
src2 = TMP_REG2;
flags &= ~SRC2_IMM;
}
if (GET_FLAG_TYPE(op) == SLJIT_LESS || GET_FLAG_TYPE(op) == SLJIT_GREATER_EQUAL) {
FAIL_IF(push_inst(compiler, SLTU | S(src1) | T(src2) | DA(OTHER_FLAG), OTHER_FLAG));
}
else if (GET_FLAG_TYPE(op) == SLJIT_GREATER || GET_FLAG_TYPE(op) == SLJIT_LESS_EQUAL)
{
FAIL_IF(push_inst(compiler, SLTU | S(src2) | T(src1) | DA(OTHER_FLAG), OTHER_FLAG));
}
else if (GET_FLAG_TYPE(op) == SLJIT_SIG_LESS || GET_FLAG_TYPE(op) == SLJIT_SIG_GREATER_EQUAL) {
FAIL_IF(push_inst(compiler, SLT | S(src1) | T(src2) | DA(OTHER_FLAG), OTHER_FLAG));
}
else if (GET_FLAG_TYPE(op) == SLJIT_SIG_GREATER || GET_FLAG_TYPE(op) == SLJIT_SIG_LESS_EQUAL)
{
FAIL_IF(push_inst(compiler, SLT | S(src2) | T(src1) | DA(OTHER_FLAG), OTHER_FLAG));
}
}
if (is_handled) {
if (flags & SRC2_IMM) {
if (op & SLJIT_SET_Z)
FAIL_IF(push_inst(compiler, SELECT_OP(DADDIU, ADDIU) | S(src1) | TA(EQUAL_FLAG) | IMM(-src2), EQUAL_FLAG));
if (!(flags & UNUSED_DEST))
return push_inst(compiler, SELECT_OP(DADDIU, ADDIU) | S(src1) | T(dst) | IMM(-src2), DR(dst));
}
else {
if (op & SLJIT_SET_Z)
FAIL_IF(push_inst(compiler, SELECT_OP(DSUBU, SUBU) | S(src1) | T(src2) | DA(EQUAL_FLAG), EQUAL_FLAG));
if (!(flags & UNUSED_DEST))
return push_inst(compiler, SELECT_OP(DSUBU, SUBU) | S(src1) | T(src2) | D(dst), DR(dst));
}
return SLJIT_SUCCESS;
}
is_overflow = GET_FLAG_TYPE(op) == SLJIT_OVERFLOW;
is_carry = GET_FLAG_TYPE(op) == GET_FLAG_TYPE(SLJIT_SET_CARRY);
if (flags & SRC2_IMM) {
if (is_overflow) {
if (src2 >= 0)
FAIL_IF(push_inst(compiler, OR | S(src1) | T(src1) | DA(EQUAL_FLAG), EQUAL_FLAG));
else
FAIL_IF(push_inst(compiler, NOR | S(src1) | T(src1) | DA(EQUAL_FLAG), EQUAL_FLAG));
}
else if (op & SLJIT_SET_Z)
FAIL_IF(push_inst(compiler, SELECT_OP(DADDIU, ADDIU) | S(src1) | TA(EQUAL_FLAG) | IMM(-src2), EQUAL_FLAG));
if (is_overflow || is_carry)
FAIL_IF(push_inst(compiler, SLTIU | S(src1) | TA(OTHER_FLAG) | IMM(src2), OTHER_FLAG));
/* dst may be the same as src1 or src2. */
if (!(flags & UNUSED_DEST) || (op & VARIABLE_FLAG_MASK))
FAIL_IF(push_inst(compiler, SELECT_OP(DADDIU, ADDIU) | S(src1) | T(dst) | IMM(-src2), DR(dst)));
}
else {
if (is_overflow)
FAIL_IF(push_inst(compiler, XOR | S(src1) | T(src2) | DA(EQUAL_FLAG), EQUAL_FLAG));
else if (op & SLJIT_SET_Z)
FAIL_IF(push_inst(compiler, SELECT_OP(DSUBU, SUBU) | S(src1) | T(src2) | DA(EQUAL_FLAG), EQUAL_FLAG));
if (is_overflow || is_carry)
FAIL_IF(push_inst(compiler, SLTU | S(src1) | T(src2) | DA(OTHER_FLAG), OTHER_FLAG));
/* dst may be the same as src1 or src2. */
if (!(flags & UNUSED_DEST) || (op & VARIABLE_FLAG_MASK))
FAIL_IF(push_inst(compiler, SELECT_OP(DSUBU, SUBU) | S(src1) | T(src2) | D(dst), DR(dst)));
}
if (!is_overflow)
return SLJIT_SUCCESS;
FAIL_IF(push_inst(compiler, SELECT_OP(DSLL32, SLL) | TA(OTHER_FLAG) | D(TMP_REG1) | SH_IMM(31), DR(TMP_REG1)));
FAIL_IF(push_inst(compiler, XOR | S(TMP_REG1) | TA(EQUAL_FLAG) | DA(EQUAL_FLAG), EQUAL_FLAG));
FAIL_IF(push_inst(compiler, XOR | S(dst) | TA(EQUAL_FLAG) | DA(OTHER_FLAG), OTHER_FLAG));
if (op & SLJIT_SET_Z)
FAIL_IF(push_inst(compiler, SELECT_OP(DADDU, ADDU) | S(dst) | TA(0) | DA(EQUAL_FLAG), EQUAL_FLAG));
return push_inst(compiler, SELECT_OP(DSRL32, SRL) | TA(OTHER_FLAG) | DA(OTHER_FLAG) | SH_IMM(31), OTHER_FLAG);
case SLJIT_SUBC:
if ((flags & SRC2_IMM) && src2 == SIMM_MIN) {
FAIL_IF(push_inst(compiler, ADDIU | SA(0) | T(TMP_REG2) | IMM(src2), DR(TMP_REG2)));
src2 = TMP_REG2;
flags &= ~SRC2_IMM;
}
is_carry = GET_FLAG_TYPE(op) == GET_FLAG_TYPE(SLJIT_SET_CARRY);
if (flags & SRC2_IMM) {
if (is_carry)
FAIL_IF(push_inst(compiler, SLTIU | S(src1) | TA(EQUAL_FLAG) | IMM(src2), EQUAL_FLAG));
/* dst may be the same as src1 or src2. */
FAIL_IF(push_inst(compiler, SELECT_OP(DADDIU, ADDIU) | S(src1) | T(dst) | IMM(-src2), DR(dst)));
}
else {
if (is_carry)
FAIL_IF(push_inst(compiler, SLTU | S(src1) | T(src2) | DA(EQUAL_FLAG), EQUAL_FLAG));
/* dst may be the same as src1 or src2. */
FAIL_IF(push_inst(compiler, SELECT_OP(DSUBU, SUBU) | S(src1) | T(src2) | D(dst), DR(dst)));
}
if (is_carry)
FAIL_IF(push_inst(compiler, SLTU | S(dst) | TA(OTHER_FLAG) | D(TMP_REG1), DR(TMP_REG1)));
FAIL_IF(push_inst(compiler, SELECT_OP(DSUBU, SUBU) | S(dst) | TA(OTHER_FLAG) | D(dst), DR(dst)));
return (is_carry) ? push_inst(compiler, OR | SA(EQUAL_FLAG) | T(TMP_REG1) | DA(OTHER_FLAG), OTHER_FLAG) : SLJIT_SUCCESS;
case SLJIT_MUL:
SLJIT_ASSERT(!(flags & SRC2_IMM));
if (GET_FLAG_TYPE(op) != SLJIT_OVERFLOW) {
#if (defined SLJIT_MIPS_REV && SLJIT_MIPS_REV >= 6)
return push_inst(compiler, SELECT_OP(DMUL, MUL) | S(src1) | T(src2) | D(dst), DR(dst));
#elif (defined SLJIT_MIPS_REV && SLJIT_MIPS_REV >= 1)
if (op & SLJIT_32)
return push_inst(compiler, MUL | S(src1) | T(src2) | D(dst), DR(dst));
FAIL_IF(push_inst(compiler, DMULT | S(src1) | T(src2), MOVABLE_INS));
return push_inst(compiler, MFLO | D(dst), DR(dst));
#else /* SLJIT_MIPS_REV < 1 */
FAIL_IF(push_inst(compiler, SELECT_OP(DMULT, MULT) | S(src1) | T(src2), MOVABLE_INS));
return push_inst(compiler, MFLO | D(dst), DR(dst));
#endif /* SLJIT_MIPS_REV >= 6 */
}
#if (defined SLJIT_MIPS_REV && SLJIT_MIPS_REV >= 6)
FAIL_IF(push_inst(compiler, SELECT_OP(DMUL, MUL) | S(src1) | T(src2) | D(dst), DR(dst)));
FAIL_IF(push_inst(compiler, SELECT_OP(DMUH, MUH) | S(src1) | T(src2) | DA(EQUAL_FLAG), EQUAL_FLAG));
#else /* SLJIT_MIPS_REV < 6 */
FAIL_IF(push_inst(compiler, SELECT_OP(DMULT, MULT) | S(src1) | T(src2), MOVABLE_INS));
FAIL_IF(push_inst(compiler, MFHI | DA(EQUAL_FLAG), EQUAL_FLAG));
FAIL_IF(push_inst(compiler, MFLO | D(dst), DR(dst)));
#endif /* SLJIT_MIPS_REV >= 6 */
FAIL_IF(push_inst(compiler, SELECT_OP(DSRA32, SRA) | T(dst) | DA(OTHER_FLAG) | SH_IMM(31), OTHER_FLAG));
return push_inst(compiler, SELECT_OP(DSUBU, SUBU) | SA(EQUAL_FLAG) | TA(OTHER_FLAG) | DA(OTHER_FLAG), OTHER_FLAG);
case SLJIT_AND:
EMIT_LOGICAL(ANDI, AND);
return SLJIT_SUCCESS;
case SLJIT_OR:
EMIT_LOGICAL(ORI, OR);
return SLJIT_SUCCESS;
case SLJIT_XOR:
EMIT_LOGICAL(XORI, XOR);
return SLJIT_SUCCESS;
case SLJIT_SHL:
EMIT_SHIFT(DSLL, DSLL32, SLL, DSLLV, SLLV);
return SLJIT_SUCCESS;
case SLJIT_LSHR:
EMIT_SHIFT(DSRL, DSRL32, SRL, DSRLV, SRLV);
return SLJIT_SUCCESS;
case SLJIT_ASHR:
EMIT_SHIFT(DSRA, DSRA32, SRA, DSRAV, SRAV);
return SLJIT_SUCCESS;
}
SLJIT_UNREACHABLE();
return SLJIT_SUCCESS;
}
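
The comment "a + b >= a | b" in the SLJIT_ADD case above rests on the identity a + b = (a | b) + (a & b): an unsigned sum that does not wrap is never smaller than the OR of its operands, so one SLTU of the result against the OR yields the carry. A minimal sketch of that check in plain C follows; the helper name `add_carry_out` is hypothetical and not part of sljit.

```c
#include <stdint.h>

/* Hypothetical helper, not part of sljit: returns 1 when a + b wraps
   around in 64-bit unsigned arithmetic, 0 otherwise. */
static int add_carry_out(uint64_t a, uint64_t b)
{
	uint64_t or_ab = a | b; /* the value the OR / ORI step keeps in a flag register */
	uint64_t sum = a + b;   /* wraps modulo 2^64, like DADDU */

	/* a + b == (a | b) + (a & b), so without a wrap sum >= (a | b).
	   With a wrap, sum is smaller than both operands, hence < (a | b). */
	return sum < or_ab;     /* same comparison as the SLTU step */
}
```

Because only the OR of the operands is needed, the check still works when dst is the same register as src1 or src2, which is the situation the "dst may be the same as src1 or src2" comments describe.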
static SLJIT_INLINE sljit_s32 emit_const(struct sljit_compiler *compiler, sljit_s32 dst, sljit_sw init_value)
{
FAIL_IF(push_inst(compiler, LUI | T(dst) | IMM(init_value >> 48), DR(dst)));
@ -653,14 +238,20 @@ SLJIT_API_FUNC_ATTRIBUTE struct sljit_jump* sljit_emit_call(struct sljit_compile
if (type & SLJIT_CALL_RETURN)
PTR_FAIL_IF(emit_stack_frame_release(compiler, 0, &ins));
PTR_FAIL_IF(call_with_args(compiler, arg_types, &ins));
if ((type & 0xff) != SLJIT_CALL_REG_ARG)
PTR_FAIL_IF(call_with_args(compiler, arg_types, &ins));
SLJIT_ASSERT(DR(PIC_ADDR_REG) == 25 && PIC_ADDR_REG == TMP_REG2);
PTR_FAIL_IF(emit_const(compiler, PIC_ADDR_REG, 0));
if (ins == NOP && compiler->delay_slot != UNMOVABLE_INS)
jump->flags |= IS_MOVABLE;
if (!(type & SLJIT_CALL_RETURN)) {
jump->flags |= IS_JAL | IS_CALL;
jump->flags |= IS_JAL;
if ((type & 0xff) != SLJIT_CALL_REG_ARG)
jump->flags |= IS_CALL;
PTR_FAIL_IF(push_inst(compiler, JALR | S(PIC_ADDR_REG) | DA(RETURN_ADDR_REG), UNMOVABLE_INS));
} else
PTR_FAIL_IF(push_inst(compiler, JR | S(PIC_ADDR_REG), UNMOVABLE_INS));
@ -668,6 +259,8 @@ SLJIT_API_FUNC_ATTRIBUTE struct sljit_jump* sljit_emit_call(struct sljit_compile
jump->addr = compiler->size;
PTR_FAIL_IF(push_inst(compiler, ins, UNMOVABLE_INS));
/* Maximum number of instructions required for generating a constant. */
compiler->size += 6;
return jump;
}
@ -680,16 +273,37 @@ SLJIT_API_FUNC_ATTRIBUTE sljit_s32 sljit_emit_icall(struct sljit_compiler *compi
CHECK_ERROR();
CHECK(check_sljit_emit_icall(compiler, type, arg_types, src, srcw));
if (src & SLJIT_MEM) {
ADJUST_LOCAL_OFFSET(src, srcw);
FAIL_IF(emit_op_mem(compiler, WORD_DATA | LOAD_DATA, DR(PIC_ADDR_REG), src, srcw));
src = PIC_ADDR_REG;
srcw = 0;
}
if ((type & 0xff) == SLJIT_CALL_REG_ARG) {
if (type & SLJIT_CALL_RETURN) {
if (src >= SLJIT_FIRST_SAVED_REG && src <= SLJIT_S0) {
FAIL_IF(push_inst(compiler, DADDU | S(src) | TA(0) | D(PIC_ADDR_REG), DR(PIC_ADDR_REG)));
src = PIC_ADDR_REG;
srcw = 0;
}
FAIL_IF(emit_stack_frame_release(compiler, 0, &ins));
if (ins != NOP)
FAIL_IF(push_inst(compiler, ins, MOVABLE_INS));
}
SLJIT_SKIP_CHECKS(compiler);
return sljit_emit_ijump(compiler, type, src, srcw);
}
SLJIT_ASSERT(DR(PIC_ADDR_REG) == 25 && PIC_ADDR_REG == TMP_REG2);
if (src & SLJIT_IMM)
FAIL_IF(load_immediate(compiler, DR(PIC_ADDR_REG), srcw));
else if (FAST_IS_REG(src))
else if (src != PIC_ADDR_REG)
FAIL_IF(push_inst(compiler, DADDU | S(src) | TA(0) | D(PIC_ADDR_REG), DR(PIC_ADDR_REG)));
else if (src & SLJIT_MEM) {
ADJUST_LOCAL_OFFSET(src, srcw);
FAIL_IF(emit_op_mem(compiler, WORD_DATA | LOAD_DATA, DR(PIC_ADDR_REG), src, srcw));
}
if (type & SLJIT_CALL_RETURN)
FAIL_IF(emit_stack_frame_release(compiler, 0, &ins));

File diff suppressed because it is too large

@ -277,8 +277,3 @@ SLJIT_API_FUNC_ATTRIBUTE void sljit_set_jump_addr(sljit_uw addr, sljit_uw new_ta
inst = (sljit_ins *)SLJIT_ADD_EXEC_OFFSET(inst, executable_offset);
SLJIT_CACHE_FLUSH(inst, inst + 2);
}
SLJIT_API_FUNC_ATTRIBUTE void sljit_set_const(sljit_uw addr, sljit_sw new_constant, sljit_sw executable_offset)
{
sljit_set_jump_addr(addr, (sljit_uw)new_constant, executable_offset);
}

@ -502,8 +502,3 @@ SLJIT_API_FUNC_ATTRIBUTE void sljit_set_jump_addr(sljit_uw addr, sljit_uw new_ta
inst = (sljit_ins *)SLJIT_ADD_EXEC_OFFSET(inst, executable_offset);
SLJIT_CACHE_FLUSH(inst, inst + 5);
}
SLJIT_API_FUNC_ATTRIBUTE void sljit_set_const(sljit_uw addr, sljit_sw new_constant, sljit_sw executable_offset)
{
sljit_set_jump_addr(addr, (sljit_uw)new_constant, executable_offset);
}

@ -368,7 +368,7 @@ static SLJIT_INLINE void put_label_set(struct sljit_put_label *put_label)
else {
inst[0] = ORIS | S(TMP_ZERO) | A(reg) | IMM(addr >> 48);
inst[1] = ORI | S(reg) | A(reg) | IMM((addr >> 32) & 0xffff);
inst ++;
inst++;
}
inst[1] = RLDI(reg, reg, 32, 31, 1);
@ -497,8 +497,8 @@ SLJIT_API_FUNC_ATTRIBUTE void* sljit_generate_code(struct sljit_compiler *compil
}
next_addr = compute_next_addr(label, jump, const_, put_label);
}
code_ptr ++;
word_count ++;
code_ptr++;
word_count++;
} while (buf_ptr < buf_end);
buf = buf->next;
@ -649,6 +649,11 @@ SLJIT_API_FUNC_ATTRIBUTE sljit_s32 sljit_has_cpu_feature(sljit_s32 feature_type)
}
}
SLJIT_API_FUNC_ATTRIBUTE sljit_s32 sljit_cmp_info(sljit_s32 type)
{
return (type >= SLJIT_UNORDERED && type <= SLJIT_ORDERED_LESS_EQUAL);
}
/* --------------------------------------------------------------------- */
/* Entry, exit */
/* --------------------------------------------------------------------- */
@ -721,7 +726,7 @@ SLJIT_API_FUNC_ATTRIBUTE sljit_s32 sljit_emit_enter(struct sljit_compiler *compi
{
sljit_s32 i, tmp, base, offset;
sljit_s32 word_arg_count = 0;
sljit_s32 saved_arg_count = 0;
sljit_s32 saved_arg_count = SLJIT_KEPT_SAVEDS_COUNT(options);
#if (defined SLJIT_CONFIG_PPC_64 && SLJIT_CONFIG_PPC_64)
sljit_s32 arg_count = 0;
#endif
@ -730,8 +735,12 @@ SLJIT_API_FUNC_ATTRIBUTE sljit_s32 sljit_emit_enter(struct sljit_compiler *compi
CHECK(check_sljit_emit_enter(compiler, options, arg_types, scratches, saveds, fscratches, fsaveds, local_size));
set_emit_enter(compiler, options, arg_types, scratches, saveds, fscratches, fsaveds, local_size);
local_size += GET_SAVED_REGISTERS_SIZE(scratches, saveds, 1)
local_size += GET_SAVED_REGISTERS_SIZE(scratches, saveds - saved_arg_count, 0)
+ GET_SAVED_FLOAT_REGISTERS_SIZE(fscratches, fsaveds, sizeof(sljit_f64));
if (!(options & SLJIT_ENTER_REG_ARG))
local_size += SSIZE_OF(sw);
local_size = (local_size + SLJIT_LOCALS_OFFSET + 15) & ~0xf;
compiler->local_size = local_size;
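
The expression `(local_size + SLJIT_LOCALS_OFFSET + 15) & ~0xf` above rounds the frame size up to the next multiple of 16. A minimal sketch of the rounding step alone, assuming a non-negative size; the helper name `align_up_16` is hypothetical, and the value of SLJIT_LOCALS_OFFSET is deliberately left out because it is not shown in this diff.

```c
#include <stdint.h>

/* Hypothetical helper, not part of sljit: round a non-negative size
   up to the next multiple of 16. */
static int32_t align_up_16(int32_t size)
{
	/* Adding 15 pushes any value that is not already a multiple of 16
	   past the next boundary; clearing the low four bits drops it back
	   onto that boundary. */
	return (size + 15) & ~0xf;
}
```

Sizes that are already multiples of 16 pass through unchanged, so the extra word added for the non-SLJIT_ENTER_REG_ARG case only enlarges the frame when it crosses a 16-byte boundary.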
@ -770,11 +779,13 @@ SLJIT_API_FUNC_ATTRIBUTE sljit_s32 sljit_emit_enter(struct sljit_compiler *compi
FAIL_IF(push_inst(compiler, STFD | FS(i) | A(base) | IMM(offset)));
}
offset -= SSIZE_OF(sw);
FAIL_IF(push_inst(compiler, STACK_STORE | S(TMP_ZERO) | A(base) | IMM(offset)));
if (!(options & SLJIT_ENTER_REG_ARG)) {
offset -= SSIZE_OF(sw);
FAIL_IF(push_inst(compiler, STACK_STORE | S(TMP_ZERO) | A(base) | IMM(offset)));
}
tmp = SLJIT_S0 - saveds;
for (i = SLJIT_S0; i > tmp; i--) {
for (i = SLJIT_S0 - saved_arg_count; i > tmp; i--) {
offset -= SSIZE_OF(sw);
FAIL_IF(push_inst(compiler, STACK_STORE | S(i) | A(base) | IMM(offset)));
}
@ -785,9 +796,14 @@ SLJIT_API_FUNC_ATTRIBUTE sljit_s32 sljit_emit_enter(struct sljit_compiler *compi
}
FAIL_IF(push_inst(compiler, STACK_STORE | S(0) | A(base) | IMM(local_size + LR_SAVE_OFFSET)));
if (options & SLJIT_ENTER_REG_ARG)
return SLJIT_SUCCESS;
FAIL_IF(push_inst(compiler, ADDI | D(TMP_ZERO) | A(0) | 0));
arg_types >>= SLJIT_ARG_SHIFT;
saved_arg_count = 0;
while (arg_types > 0) {
if ((arg_types & SLJIT_ARG_MASK) < SLJIT_ARG_TYPE_F64) {
@ -829,13 +845,16 @@ SLJIT_API_FUNC_ATTRIBUTE sljit_s32 sljit_set_context(struct sljit_compiler *comp
CHECK(check_sljit_set_context(compiler, options, arg_types, scratches, saveds, fscratches, fsaveds, local_size));
set_set_context(compiler, options, arg_types, scratches, saveds, fscratches, fsaveds, local_size);
local_size += GET_SAVED_REGISTERS_SIZE(scratches, saveds, 1)
local_size += GET_SAVED_REGISTERS_SIZE(scratches, saveds - SLJIT_KEPT_SAVEDS_COUNT(options), 0)
+ GET_SAVED_FLOAT_REGISTERS_SIZE(fscratches, fsaveds, sizeof(sljit_f64));
if (!(options & SLJIT_ENTER_REG_ARG))
local_size += SSIZE_OF(sw);
compiler->local_size = (local_size + SLJIT_LOCALS_OFFSET + 15) & ~0xf;
return SLJIT_SUCCESS;
}
static sljit_s32 emit_stack_frame_release(struct sljit_compiler *compiler)
{
sljit_s32 i, tmp, base, offset;
@ -867,11 +886,13 @@ static sljit_s32 emit_stack_frame_release(struct sljit_compiler *compiler)
FAIL_IF(push_inst(compiler, LFD | FS(i) | A(base) | IMM(offset)));
}
offset -= SSIZE_OF(sw);
FAIL_IF(push_inst(compiler, STACK_LOAD | S(TMP_ZERO) | A(base) | IMM(offset)));
if (!(compiler->options & SLJIT_ENTER_REG_ARG)) {
offset -= SSIZE_OF(sw);
FAIL_IF(push_inst(compiler, STACK_LOAD | S(TMP_ZERO) | A(base) | IMM(offset)));
}
tmp = SLJIT_S0 - compiler->saveds;
for (i = SLJIT_S0; i > tmp; i--) {
for (i = SLJIT_S0 - SLJIT_KEPT_SAVEDS_COUNT(compiler->options); i > tmp; i--) {
offset -= SSIZE_OF(sw);
FAIL_IF(push_inst(compiler, STACK_LOAD | S(i) | A(base) | IMM(offset)));
}
@ -1626,7 +1647,7 @@ SLJIT_API_FUNC_ATTRIBUTE sljit_s32 sljit_emit_op2(struct sljit_compiler *compile
return emit_op(compiler, GET_OPCODE(op), flags | ALT_FORM2, dst, dstw, src2, src2w, TMP_REG2, 0);
}
}
if (GET_OPCODE(op) != SLJIT_AND) {
if (!HAS_FLAGS(op) && GET_OPCODE(op) != SLJIT_AND) {
/* Unlike or and xor, the and resets unwanted bits as well. */
if (TEST_UI_IMM(src2, src2w)) {
compiler->imm = (sljit_ins)src2w;
@ -1663,10 +1684,7 @@ SLJIT_API_FUNC_ATTRIBUTE sljit_s32 sljit_emit_op2u(struct sljit_compiler *compil
CHECK_ERROR();
CHECK(check_sljit_emit_op2(compiler, op, 1, 0, 0, src1, src1w, src2, src2w));
#if (defined SLJIT_VERBOSE && SLJIT_VERBOSE) \
|| (defined SLJIT_ARGUMENT_CHECKS && SLJIT_ARGUMENT_CHECKS)
compiler->skip_checks = 1;
#endif
SLJIT_SKIP_CHECKS(compiler);
return sljit_emit_op2(compiler, op, TMP_REG2, 0, src1, src1w, src2, src2w);
}
@ -1818,6 +1836,7 @@ static SLJIT_INLINE sljit_s32 sljit_emit_fop1_conv_f64_from_sw(struct sljit_comp
if (src & SLJIT_IMM) {
if (GET_OPCODE(op) == SLJIT_CONV_F64_FROM_S32)
srcw = (sljit_s32)srcw;
FAIL_IF(load_immediate(compiler, TMP_REG1, srcw));
src = TMP_REG1;
}
@ -1899,7 +1918,21 @@ static SLJIT_INLINE sljit_s32 sljit_emit_fop1_cmp(struct sljit_compiler *compile
src2 = TMP_FREG2;
}
return push_inst(compiler, FCMPU | CRD(4) | FA(src1) | FB(src2));
FAIL_IF(push_inst(compiler, FCMPU | CRD(4) | FA(src1) | FB(src2)));
switch (GET_FLAG_TYPE(op)) {
case SLJIT_UNORDERED_OR_EQUAL:
case SLJIT_ORDERED_NOT_EQUAL:
return push_inst(compiler, CROR | ((4 + 2) << 21) | ((4 + 2) << 16) | ((4 + 3) << 11));
case SLJIT_UNORDERED_OR_LESS:
case SLJIT_ORDERED_GREATER_EQUAL:
return push_inst(compiler, CROR | ((4 + 0) << 21) | ((4 + 0) << 16) | ((4 + 3) << 11));
case SLJIT_UNORDERED_OR_GREATER:
case SLJIT_ORDERED_LESS_EQUAL:
return push_inst(compiler, CROR | ((4 + 1) << 21) | ((4 + 1) << 16) | ((4 + 3) << 11));
}
return SLJIT_SUCCESS;
}
SLJIT_API_FUNC_ATTRIBUTE sljit_s32 sljit_emit_fop1(struct sljit_compiler *compiler, sljit_s32 op,
@ -2076,38 +2109,50 @@ static sljit_ins get_bo_bi_flags(struct sljit_compiler *compiler, sljit_s32 type
case SLJIT_SIG_LESS_EQUAL:
return (4 << 21) | (1 << 16);
case SLJIT_LESS_F64:
return (12 << 21) | ((4 + 0) << 16);
case SLJIT_GREATER_EQUAL_F64:
return (4 << 21) | ((4 + 0) << 16);
case SLJIT_GREATER_F64:
return (12 << 21) | ((4 + 1) << 16);
case SLJIT_LESS_EQUAL_F64:
return (4 << 21) | ((4 + 1) << 16);
case SLJIT_OVERFLOW:
return (12 << 21) | (3 << 16);
case SLJIT_NOT_OVERFLOW:
return (4 << 21) | (3 << 16);
case SLJIT_EQUAL_F64:
case SLJIT_F_LESS:
case SLJIT_ORDERED_LESS:
case SLJIT_UNORDERED_OR_LESS:
return (12 << 21) | ((4 + 0) << 16);
case SLJIT_F_GREATER_EQUAL:
case SLJIT_ORDERED_GREATER_EQUAL:
case SLJIT_UNORDERED_OR_GREATER_EQUAL:
return (4 << 21) | ((4 + 0) << 16);
case SLJIT_F_GREATER:
case SLJIT_ORDERED_GREATER:
case SLJIT_UNORDERED_OR_GREATER:
return (12 << 21) | ((4 + 1) << 16);
case SLJIT_F_LESS_EQUAL:
case SLJIT_ORDERED_LESS_EQUAL:
case SLJIT_UNORDERED_OR_LESS_EQUAL:
return (4 << 21) | ((4 + 1) << 16);
case SLJIT_F_EQUAL:
case SLJIT_ORDERED_EQUAL:
case SLJIT_UNORDERED_OR_EQUAL:
return (12 << 21) | ((4 + 2) << 16);
case SLJIT_NOT_EQUAL_F64:
case SLJIT_F_NOT_EQUAL:
case SLJIT_ORDERED_NOT_EQUAL:
case SLJIT_UNORDERED_OR_NOT_EQUAL:
return (4 << 21) | ((4 + 2) << 16);
case SLJIT_UNORDERED_F64:
case SLJIT_UNORDERED:
return (12 << 21) | ((4 + 3) << 16);
case SLJIT_ORDERED_F64:
case SLJIT_ORDERED:
return (4 << 21) | ((4 + 3) << 16);
default:
SLJIT_ASSERT(type >= SLJIT_JUMP && type <= SLJIT_CALL_CDECL);
SLJIT_ASSERT(type >= SLJIT_JUMP && type <= SLJIT_CALL_REG_ARG);
return (20 << 21);
}
}
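
The constants returned by get_bo_bi_flags above pack two fields of a PowerPC conditional branch: BO shifted to bit 21 and BI shifted to bit 16. Going by the values in this diff, BO 12 branches when the selected condition-register bit is set, BO 4 when it is clear, and BO 20 always; `4 + n` selects bit n of the condition-register field that the FCMPU with CRD(4) writes. The sketch below only restates that packing; the enum and function names are hypothetical and not part of sljit.

```c
#include <stdint.h>

/* Hypothetical names, not part of sljit. */
enum { BO_IF_CLEAR = 4, BO_IF_SET = 12, BO_ALWAYS = 20 };

/* Bits of the condition-register field used for floating point compares:
   less, greater, equal, unordered. */
enum { CR1_LT = 4 + 0, CR1_GT = 4 + 1, CR1_EQ = 4 + 2, CR1_UN = 4 + 3 };

/* Pack the BO and BI fields the same way the case labels above do. */
static uint32_t bo_bi(uint32_t bo, uint32_t bi)
{
	return (bo << 21) | (bi << 16);
}
```

For example, `bo_bi(BO_IF_SET, CR1_EQ)` gives the same value as `(12 << 21) | ((4 + 2) << 16)`. The CROR instructions added to sljit_emit_fop1_cmp appear to OR the unordered bit (4 + 3) into the bit tested afterwards, which would explain why the ordered and unordered variants can share the same BO/BI values here.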
@ -2154,7 +2199,8 @@ SLJIT_API_FUNC_ATTRIBUTE struct sljit_jump* sljit_emit_call(struct sljit_compile
CHECK_PTR(check_sljit_emit_call(compiler, type, arg_types));
#if (defined SLJIT_CONFIG_PPC_64 && SLJIT_CONFIG_PPC_64)
PTR_FAIL_IF(call_with_args(compiler, arg_types, NULL));
if ((type & 0xff) != SLJIT_CALL_REG_ARG)
PTR_FAIL_IF(call_with_args(compiler, arg_types, NULL));
#endif
if (type & SLJIT_CALL_RETURN) {
@ -2162,11 +2208,7 @@ SLJIT_API_FUNC_ATTRIBUTE struct sljit_jump* sljit_emit_call(struct sljit_compile
type = SLJIT_JUMP | (type & SLJIT_REWRITABLE_JUMP);
}
#if (defined SLJIT_VERBOSE && SLJIT_VERBOSE) \
|| (defined SLJIT_ARGUMENT_CHECKS && SLJIT_ARGUMENT_CHECKS)
compiler->skip_checks = 1;
#endif
SLJIT_SKIP_CHECKS(compiler);
return sljit_emit_jump(compiler, type);
}
@ -2240,14 +2282,11 @@ SLJIT_API_FUNC_ATTRIBUTE sljit_s32 sljit_emit_icall(struct sljit_compiler *compi
}
#if (defined SLJIT_CONFIG_PPC_64 && SLJIT_CONFIG_PPC_64)
FAIL_IF(call_with_args(compiler, arg_types, &src));
#endif
#if (defined SLJIT_VERBOSE && SLJIT_VERBOSE) \
|| (defined SLJIT_ARGUMENT_CHECKS && SLJIT_ARGUMENT_CHECKS)
compiler->skip_checks = 1;
if ((type & 0xff) != SLJIT_CALL_REG_ARG)
FAIL_IF(call_with_args(compiler, arg_types, &src));
#endif
SLJIT_SKIP_CHECKS(compiler);
return sljit_emit_ijump(compiler, type, src, srcw);
}
@ -2279,7 +2318,7 @@ SLJIT_API_FUNC_ATTRIBUTE sljit_s32 sljit_emit_op_flags(struct sljit_compiler *co
bit = 0;
from_xer = 0;
switch (type & 0xff) {
switch (type) {
case SLJIT_LESS:
case SLJIT_SIG_LESS:
break;
@ -2332,38 +2371,50 @@ SLJIT_API_FUNC_ATTRIBUTE sljit_s32 sljit_emit_op_flags(struct sljit_compiler *co
invert = (compiler->status_flags_state & SLJIT_CURRENT_FLAGS_ADD) != 0;
break;
case SLJIT_LESS_F64:
case SLJIT_F_LESS:
case SLJIT_ORDERED_LESS:
case SLJIT_UNORDERED_OR_LESS:
bit = 4 + 0;
break;
case SLJIT_GREATER_EQUAL_F64:
case SLJIT_F_GREATER_EQUAL:
case SLJIT_ORDERED_GREATER_EQUAL:
case SLJIT_UNORDERED_OR_GREATER_EQUAL:
bit = 4 + 0;
invert = 1;
break;
case SLJIT_GREATER_F64:
case SLJIT_F_GREATER:
case SLJIT_ORDERED_GREATER:
case SLJIT_UNORDERED_OR_GREATER:
bit = 4 + 1;
break;
case SLJIT_LESS_EQUAL_F64:
case SLJIT_F_LESS_EQUAL:
case SLJIT_ORDERED_LESS_EQUAL:
case SLJIT_UNORDERED_OR_LESS_EQUAL:
bit = 4 + 1;
invert = 1;
break;
case SLJIT_EQUAL_F64:
case SLJIT_F_EQUAL:
case SLJIT_ORDERED_EQUAL:
case SLJIT_UNORDERED_OR_EQUAL:
bit = 4 + 2;
break;
case SLJIT_NOT_EQUAL_F64:
case SLJIT_F_NOT_EQUAL:
case SLJIT_ORDERED_NOT_EQUAL:
case SLJIT_UNORDERED_OR_NOT_EQUAL:
bit = 4 + 2;
invert = 1;
break;
case SLJIT_UNORDERED_F64:
case SLJIT_UNORDERED:
bit = 4 + 3;
break;
case SLJIT_ORDERED_F64:
case SLJIT_ORDERED:
bit = 4 + 3;
invert = 1;
break;
@ -2385,10 +2436,8 @@ SLJIT_API_FUNC_ATTRIBUTE sljit_s32 sljit_emit_op_flags(struct sljit_compiler *co
return emit_op_mem(compiler, input_flags, reg, dst, dstw, TMP_REG1);
}
#if (defined SLJIT_VERBOSE && SLJIT_VERBOSE) \
|| (defined SLJIT_ARGUMENT_CHECKS && SLJIT_ARGUMENT_CHECKS)
compiler->skip_checks = 1;
#endif
SLJIT_SKIP_CHECKS(compiler);
if (dst & SLJIT_MEM)
return sljit_emit_op2(compiler, saved_op, dst, saved_dstw, TMP_REG1, 0, TMP_REG2, 0);
return sljit_emit_op2(compiler, saved_op, dst, 0, dst, 0, TMP_REG2, 0);
@ -2414,6 +2463,9 @@ SLJIT_API_FUNC_ATTRIBUTE sljit_s32 sljit_emit_mem(struct sljit_compiler *compile
CHECK_ERROR();
CHECK(check_sljit_emit_mem(compiler, type, reg, mem, memw));
if (type & SLJIT_MEM_UNALIGNED)
return sljit_emit_mem_unaligned(compiler, type, reg, mem, memw);
if (type & SLJIT_MEM_POST)
return SLJIT_ERR_UNSUPPORTED;
@ -2510,6 +2562,9 @@ SLJIT_API_FUNC_ATTRIBUTE sljit_s32 sljit_emit_fmem(struct sljit_compiler *compil
CHECK_ERROR();
CHECK(check_sljit_emit_fmem(compiler, type, freg, mem, memw));
if (type & SLJIT_MEM_UNALIGNED)
return sljit_emit_fmem_unaligned(compiler, type, freg, mem, memw);
if (type & SLJIT_MEM_POST)
return SLJIT_ERR_UNSUPPORTED;
@ -2587,3 +2642,8 @@ SLJIT_API_FUNC_ATTRIBUTE struct sljit_put_label* sljit_emit_put_label(struct slj
return put_label;
}
SLJIT_API_FUNC_ATTRIBUTE void sljit_set_const(sljit_uw addr, sljit_sw new_constant, sljit_sw executable_offset)
{
sljit_set_jump_addr(addr, (sljit_uw)new_constant, executable_offset);
}


@ -0,0 +1,72 @@
/*
* Stack-less Just-In-Time compiler
*
* Copyright Zoltan Herczeg (hzmester@freemail.hu). All rights reserved.
*
* Redistribution and use in source and binary forms, with or without modification, are
* permitted provided that the following conditions are met:
*
* 1. Redistributions of source code must retain the above copyright notice, this list of
* conditions and the following disclaimer.
*
* 2. Redistributions in binary form must reproduce the above copyright notice, this list
* of conditions and the following disclaimer in the documentation and/or other materials
* provided with the distribution.
*
* THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDER(S) AND CONTRIBUTORS ``AS IS'' AND ANY
* EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
* OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT
* SHALL THE COPYRIGHT HOLDER(S) OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT,
* INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED
* TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR
* BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
* CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN
* ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
*/
static sljit_s32 load_immediate(struct sljit_compiler *compiler, sljit_s32 dst_r, sljit_sw imm, sljit_s32 tmp_r)
{
SLJIT_UNUSED_ARG(tmp_r);
if (imm <= SIMM_MAX && imm >= SIMM_MIN)
return push_inst(compiler, ADDI | RD(dst_r) | RS1(TMP_ZERO) | IMM_I(imm));
if (imm & 0x800)
imm += 0x1000;
FAIL_IF(push_inst(compiler, LUI | RD(dst_r) | (sljit_ins)(imm & ~0xfff)));
if ((imm & 0xfff) == 0)
return SLJIT_SUCCESS;
return push_inst(compiler, ADDI | RD(dst_r) | RS1(dst_r) | IMM_I(imm));
}
static SLJIT_INLINE sljit_s32 emit_const(struct sljit_compiler *compiler, sljit_s32 dst, sljit_sw init_value, sljit_ins last_ins)
{
if ((init_value & 0x800) != 0)
init_value += 0x1000;
FAIL_IF(push_inst(compiler, LUI | RD(dst) | (sljit_ins)(init_value & ~0xfff)));
return push_inst(compiler, last_ins | RS1(dst) | IMM_I(init_value));
}
SLJIT_API_FUNC_ATTRIBUTE void sljit_set_jump_addr(sljit_uw addr, sljit_uw new_target, sljit_sw executable_offset)
{
sljit_ins *inst = (sljit_ins*)addr;
SLJIT_UNUSED_ARG(executable_offset);
if ((new_target & 0x800) != 0)
new_target += 0x1000;
SLJIT_UPDATE_WX_FLAGS(inst, inst + 5, 0);
SLJIT_ASSERT((inst[0] & 0x7f) == LUI);
inst[0] = (inst[0] & 0xfff) | (sljit_ins)((sljit_sw)new_target & ~0xfff);
SLJIT_ASSERT((inst[1] & 0x707f) == ADDI || (inst[1] & 0x707f) == JALR);
inst[1] = (inst[1] & 0xfffff) | IMM_I(new_target);
SLJIT_UPDATE_WX_FLAGS(inst, inst + 5, 1);
inst = (sljit_ins *)SLJIT_ADD_EXEC_OFFSET(inst, executable_offset);
SLJIT_CACHE_FLUSH(inst, inst + 5);
}
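Note on the `0x800` adjustment used by `load_immediate`, `emit_const` and `sljit_set_jump_addr` in this new file: ADDI sign-extends its 12-bit immediate, so when bit 11 of the value is set the LUI part has to be rounded up by `0x1000` to compensate. The sketch below models the two instructions with plain 32-bit arithmetic; the helper names `sext12`, `lui_part` and `lui_addi` are made up for illustration and do not appear in sljit.

```c
#include <assert.h>
#include <stdint.h>

/* Sign-extend the low 12 bits of v, as ADDI does with its immediate. */
static int32_t sext12(uint32_t v)
{
	return (int32_t)((v & 0xfffu) ^ 0x800u) - 0x800;
}

/* Upper part for LUI: round up when bit 11 is set, then clear the low 12 bits. */
static uint32_t lui_part(uint32_t imm)
{
	if (imm & 0x800u)
		imm += 0x1000u;
	return imm & ~0xfffu;
}

/* Value a 32-bit register would hold after LUI followed by ADDI. */
static uint32_t lui_addi(uint32_t imm)
{
	return lui_part(imm) + (uint32_t)sext12(imm);
}
```

For `0x12345fff` the LUI part becomes `0x12346000` and the ADDI immediate is -1, which adds back to the original value.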


@ -0,0 +1,181 @@
/*
* Stack-less Just-In-Time compiler
*
* Copyright Zoltan Herczeg (hzmester@freemail.hu). All rights reserved.
*
* Redistribution and use in source and binary forms, with or without modification, are
* permitted provided that the following conditions are met:
*
* 1. Redistributions of source code must retain the above copyright notice, this list of
* conditions and the following disclaimer.
*
* 2. Redistributions in binary form must reproduce the above copyright notice, this list
* of conditions and the following disclaimer in the documentation and/or other materials
* provided with the distribution.
*
* THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDER(S) AND CONTRIBUTORS ``AS IS'' AND ANY
* EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
* OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT
* SHALL THE COPYRIGHT HOLDER(S) OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT,
* INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED
* TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR
* BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
* CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN
* ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
*/
static sljit_s32 load_immediate(struct sljit_compiler *compiler, sljit_s32 dst_r, sljit_sw imm, sljit_s32 tmp_r)
{
sljit_sw high;
if (imm <= SIMM_MAX && imm >= SIMM_MIN)
return push_inst(compiler, ADDI | RD(dst_r) | RS1(TMP_ZERO) | IMM_I(imm));
if (imm <= 0x7fffffffl && imm >= S32_MIN) {
if (imm > S32_MAX) {
SLJIT_ASSERT((imm & 0x800) != 0);
FAIL_IF(push_inst(compiler, LUI | RD(dst_r) | (sljit_ins)0x80000000u));
return push_inst(compiler, XORI | RD(dst_r) | RS1(dst_r) | IMM_I(imm));
}
if ((imm & 0x800) != 0)
imm += 0x1000;
FAIL_IF(push_inst(compiler, LUI | RD(dst_r) | (sljit_ins)(imm & ~0xfff)));
if ((imm & 0xfff) == 0)
return SLJIT_SUCCESS;
return push_inst(compiler, ADDI | RD(dst_r) | RS1(dst_r) | IMM_I(imm));
}
/* Trailing zeroes could be used to produce shifted immediates. */
if (imm <= 0x7ffffffffffl && imm >= -0x80000000000l) {
high = imm >> 12;
if (imm & 0x800)
high = ~high;
if (high > S32_MAX) {
SLJIT_ASSERT((high & 0x800) != 0);
FAIL_IF(push_inst(compiler, LUI | RD(dst_r) | (sljit_ins)0x80000000u));
FAIL_IF(push_inst(compiler, XORI | RD(dst_r) | RS1(dst_r) | IMM_I(high)));
} else {
if ((high & 0x800) != 0)
high += 0x1000;
FAIL_IF(push_inst(compiler, LUI | RD(dst_r) | (sljit_ins)(high & ~0xfff)));
if ((high & 0xfff) != 0)
FAIL_IF(push_inst(compiler, ADDI | RD(dst_r) | RS1(dst_r) | IMM_I(high)));
}
FAIL_IF(push_inst(compiler, SLLI | RD(dst_r) | RS1(dst_r) | IMM_I(12)));
if ((imm & 0xfff) != 0)
return push_inst(compiler, XORI | RD(dst_r) | RS1(dst_r) | IMM_I(imm));
return SLJIT_SUCCESS;
}
high = imm >> 32;
imm = (sljit_s32)imm;
if ((imm & 0x80000000l) != 0)
high = ~high;
if (high <= 0x7ffff && high >= -0x80000) {
FAIL_IF(push_inst(compiler, LUI | RD(tmp_r) | (sljit_ins)(high << 12)));
high = 0x1000;
} else {
if ((high & 0x800) != 0)
high += 0x1000;
FAIL_IF(push_inst(compiler, LUI | RD(tmp_r) | (sljit_ins)(high & ~0xfff)));
high &= 0xfff;
}
if (imm <= SIMM_MAX && imm >= SIMM_MIN) {
FAIL_IF(push_inst(compiler, ADDI | RD(dst_r) | RS1(TMP_ZERO) | IMM_I(imm)));
imm = 0;
} else if (imm > S32_MAX) {
SLJIT_ASSERT((imm & 0x800) != 0);
FAIL_IF(push_inst(compiler, LUI | RD(dst_r) | (sljit_ins)0x80000000u));
imm = 0x1000 | (imm & 0xfff);
} else {
if ((imm & 0x800) != 0)
imm += 0x1000;
FAIL_IF(push_inst(compiler, LUI | RD(dst_r) | (sljit_ins)(imm & ~0xfff)));
imm &= 0xfff;
}
if ((high & 0xfff) != 0)
FAIL_IF(push_inst(compiler, ADDI | RD(tmp_r) | RS1(tmp_r) | IMM_I(high)));
if (imm & 0x1000)
FAIL_IF(push_inst(compiler, XORI | RD(dst_r) | RS1(dst_r) | IMM_I(imm)));
else if (imm != 0)
FAIL_IF(push_inst(compiler, ADDI | RD(dst_r) | RS1(dst_r) | IMM_I(imm)));
FAIL_IF(push_inst(compiler, SLLI | RD(tmp_r) | RS1(tmp_r) | IMM_I((high & 0x1000) ? 20 : 32)));
return push_inst(compiler, XOR | RD(dst_r) | RS1(dst_r) | RS2(tmp_r));
}
static SLJIT_INLINE sljit_s32 emit_const(struct sljit_compiler *compiler, sljit_s32 dst, sljit_sw init_value, sljit_ins last_ins)
{
sljit_sw high;
if ((init_value & 0x800) != 0)
init_value += 0x1000;
high = init_value >> 32;
if ((init_value & 0x80000000l) != 0)
high = ~high;
if ((high & 0x800) != 0)
high += 0x1000;
FAIL_IF(push_inst(compiler, LUI | RD(TMP_REG3) | (sljit_ins)(high & ~0xfff)));
FAIL_IF(push_inst(compiler, ADDI | RD(TMP_REG3) | RS1(TMP_REG3) | IMM_I(high)));
FAIL_IF(push_inst(compiler, LUI | RD(dst) | (sljit_ins)(init_value & ~0xfff)));
FAIL_IF(push_inst(compiler, SLLI | RD(TMP_REG3) | RS1(TMP_REG3) | IMM_I(32)));
FAIL_IF(push_inst(compiler, XOR | RD(dst) | RS1(dst) | RS2(TMP_REG3)));
return push_inst(compiler, last_ins | RS1(dst) | IMM_I(init_value));
}
SLJIT_API_FUNC_ATTRIBUTE void sljit_set_jump_addr(sljit_uw addr, sljit_uw new_target, sljit_sw executable_offset)
{
sljit_ins *inst = (sljit_ins*)addr;
sljit_sw high;
SLJIT_UNUSED_ARG(executable_offset);
if ((new_target & 0x800) != 0)
new_target += 0x1000;
high = (sljit_sw)new_target >> 32;
if ((new_target & 0x80000000l) != 0)
high = ~high;
if ((high & 0x800) != 0)
high += 0x1000;
SLJIT_UPDATE_WX_FLAGS(inst, inst + 5, 0);
SLJIT_ASSERT((inst[0] & 0x7f) == LUI);
inst[0] = (inst[0] & 0xfff) | (sljit_ins)(high & ~0xfff);
SLJIT_ASSERT((inst[1] & 0x707f) == ADDI);
inst[1] = (inst[1] & 0xfffff) | IMM_I(high);
SLJIT_ASSERT((inst[2] & 0x7f) == LUI);
inst[2] = (inst[2] & 0xfff) | (sljit_ins)((sljit_sw)new_target & ~0xfff);
SLJIT_ASSERT((inst[5] & 0x707f) == ADDI || (inst[5] & 0x707f) == JALR);
inst[5] = (inst[5] & 0xfffff) | IMM_I(new_target);
SLJIT_UPDATE_WX_FLAGS(inst, inst + 5, 1);
inst = (sljit_ins *)SLJIT_ADD_EXEC_OFFSET(inst, executable_offset);
SLJIT_CACHE_FLUSH(inst, inst + 5);
}
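Note on the 64-bit `emit_const` above: it builds the constant from a fixed sequence (LUI and ADDI into a temporary for the upper half, LUI into the destination for the lower half, SLLI by 32, XOR, then a final ADDI or JALR carrying the low 12 bits). The upper half is inverted beforehand when bit 31 of the lower half is set, because LUI on a 64-bit target sign-extends and the XOR then cancels the extension. The sketch below simulates that sequence with unsigned arithmetic to check that it reproduces the value. It assumes the final instruction is ADDI, uses a logical shift where the source uses an arithmetic one (only the low 32 bits of `high` survive the later shift), and all function names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Sign-extend the low 12 bits (I-type immediate), with wrap-around arithmetic. */
static uint64_t sext12(uint64_t v)
{
	return ((v & 0xfffu) ^ 0x800u) - 0x800u;
}

/* LUI on a 64-bit target: keep bits 31..12 and sign-extend from bit 31. */
static uint64_t lui64(uint64_t v)
{
	return ((v & 0xfffff000u) ^ 0x80000000u) - 0x80000000u;
}

/* Register value left in dst after the six-instruction sequence. */
static uint64_t simulate_emit_const(uint64_t value)
{
	uint64_t init_value = value;
	uint64_t high, tmp, dst;

	if (init_value & 0x800u)
		init_value += 0x1000u;

	high = init_value >> 32;
	if (init_value & 0x80000000u)
		high = ~high;
	if (high & 0x800u)
		high += 0x1000u;

	tmp = lui64(high);               /* LUI  tmp, high[31:12] */
	tmp += sext12(high);             /* ADDI tmp, tmp, high[11:0] */
	dst = lui64(init_value);         /* LUI  dst, value[31:12] */
	tmp <<= 32;                      /* SLLI tmp, tmp, 32 */
	dst ^= tmp;                      /* XOR  dst, dst, tmp */
	return dst + sext12(init_value); /* ADDI dst, dst, value[11:0] */
}
```

Because the sequence has a fixed length and layout, `sljit_set_jump_addr` can patch `inst[0]`, `inst[1]`, `inst[2]` and `inst[5]` in place without re-emitting code.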

File diff suppressed because it is too large


@ -220,7 +220,8 @@ static SLJIT_INLINE sljit_u8 get_cc(struct sljit_compiler *compiler, sljit_s32 t
}
/* fallthrough */
case SLJIT_EQUAL_F64:
case SLJIT_F_EQUAL:
case SLJIT_ORDERED_EQUAL:
return cc0;
case SLJIT_NOT_EQUAL:
@ -234,13 +235,14 @@ static SLJIT_INLINE sljit_u8 get_cc(struct sljit_compiler *compiler, sljit_s32 t
}
/* fallthrough */
case SLJIT_NOT_EQUAL_F64:
case SLJIT_UNORDERED_OR_NOT_EQUAL:
return (cc1 | cc2 | cc3);
case SLJIT_LESS:
return cc1;
case SLJIT_GREATER_EQUAL:
case SLJIT_UNORDERED_OR_GREATER_EQUAL:
return (cc0 | cc2 | cc3);
case SLJIT_GREATER:
@ -254,7 +256,8 @@ static SLJIT_INLINE sljit_u8 get_cc(struct sljit_compiler *compiler, sljit_s32 t
return (cc0 | cc1 | cc2);
case SLJIT_SIG_LESS:
case SLJIT_LESS_F64:
case SLJIT_F_LESS:
case SLJIT_ORDERED_LESS:
return cc1;
case SLJIT_NOT_CARRY:
@ -263,7 +266,8 @@ static SLJIT_INLINE sljit_u8 get_cc(struct sljit_compiler *compiler, sljit_s32 t
/* fallthrough */
case SLJIT_SIG_LESS_EQUAL:
case SLJIT_LESS_EQUAL_F64:
case SLJIT_F_LESS_EQUAL:
case SLJIT_ORDERED_LESS_EQUAL:
return (cc0 | cc1);
case SLJIT_CARRY:
@ -272,6 +276,7 @@ static SLJIT_INLINE sljit_u8 get_cc(struct sljit_compiler *compiler, sljit_s32 t
/* fallthrough */
case SLJIT_SIG_GREATER:
case SLJIT_UNORDERED_OR_GREATER:
/* Overflow is considered greater, see SLJIT_SUB. */
return cc2 | cc3;
@ -283,7 +288,7 @@ static SLJIT_INLINE sljit_u8 get_cc(struct sljit_compiler *compiler, sljit_s32 t
return (cc2 | cc3);
/* fallthrough */
case SLJIT_UNORDERED_F64:
case SLJIT_UNORDERED:
return cc3;
case SLJIT_NOT_OVERFLOW:
@ -291,14 +296,29 @@ static SLJIT_INLINE sljit_u8 get_cc(struct sljit_compiler *compiler, sljit_s32 t
return (cc0 | cc1);
/* fallthrough */
case SLJIT_ORDERED_F64:
case SLJIT_ORDERED:
return (cc0 | cc1 | cc2);
case SLJIT_GREATER_F64:
case SLJIT_F_NOT_EQUAL:
case SLJIT_ORDERED_NOT_EQUAL:
return (cc1 | cc2);
case SLJIT_F_GREATER:
case SLJIT_ORDERED_GREATER:
return cc2;
case SLJIT_GREATER_EQUAL_F64:
case SLJIT_F_GREATER_EQUAL:
case SLJIT_ORDERED_GREATER_EQUAL:
return (cc0 | cc2);
case SLJIT_UNORDERED_OR_LESS_EQUAL:
return (cc0 | cc1 | cc3);
case SLJIT_UNORDERED_OR_EQUAL:
return (cc0 | cc3);
case SLJIT_UNORDERED_OR_LESS:
return (cc1 | cc3);
}
SLJIT_UNREACHABLE();
@ -1628,6 +1648,11 @@ SLJIT_API_FUNC_ATTRIBUTE sljit_s32 sljit_has_cpu_feature(sljit_s32 feature_type)
return 0;
}
SLJIT_API_FUNC_ATTRIBUTE sljit_s32 sljit_cmp_info(sljit_s32 type)
{
return (type >= SLJIT_UNORDERED && type <= SLJIT_ORDERED_LESS_EQUAL);
}
/* --------------------------------------------------------------------- */
/* Entry, exit */
/* --------------------------------------------------------------------- */
@ -1636,7 +1661,7 @@ SLJIT_API_FUNC_ATTRIBUTE sljit_s32 sljit_emit_enter(struct sljit_compiler *compi
sljit_s32 options, sljit_s32 arg_types, sljit_s32 scratches, sljit_s32 saveds,
sljit_s32 fscratches, sljit_s32 fsaveds, sljit_s32 local_size)
{
sljit_s32 word_arg_count = 0;
sljit_s32 saved_arg_count = SLJIT_KEPT_SAVEDS_COUNT(options);
sljit_s32 offset, i, tmp;
CHECK_ERROR();
@ -1648,8 +1673,13 @@ SLJIT_API_FUNC_ATTRIBUTE sljit_s32 sljit_emit_enter(struct sljit_compiler *compi
offset = 2 * SSIZE_OF(sw);
if (saveds + scratches >= SLJIT_NUMBER_OF_REGISTERS) {
FAIL_IF(push_inst(compiler, stmg(r6, r14, offset, r15))); /* save registers TODO(MGM): optimize */
offset += 9 * SSIZE_OF(sw);
if (saved_arg_count == 0) {
FAIL_IF(push_inst(compiler, stmg(r6, r14, offset, r15)));
offset += 9 * SSIZE_OF(sw);
} else {
FAIL_IF(push_inst(compiler, stmg(r6, r13 - (sljit_gpr)saved_arg_count, offset, r15)));
offset += (8 - saved_arg_count) * SSIZE_OF(sw);
}
} else {
if (scratches == SLJIT_FIRST_SAVED_REG) {
FAIL_IF(push_inst(compiler, stg(r6, offset, 0, r15)));
@ -1659,15 +1689,30 @@ SLJIT_API_FUNC_ATTRIBUTE sljit_s32 sljit_emit_enter(struct sljit_compiler *compi
offset += (scratches - (SLJIT_FIRST_SAVED_REG - 1)) * SSIZE_OF(sw);
}
if (saveds == 0) {
FAIL_IF(push_inst(compiler, stg(r14, offset, 0, r15)));
offset += SSIZE_OF(sw);
} else {
FAIL_IF(push_inst(compiler, stmg(r14 - (sljit_gpr)saveds, r14, offset, r15)));
offset += (saveds + 1) * SSIZE_OF(sw);
if (saved_arg_count == 0) {
if (saveds == 0) {
FAIL_IF(push_inst(compiler, stg(r14, offset, 0, r15)));
offset += SSIZE_OF(sw);
} else {
FAIL_IF(push_inst(compiler, stmg(r14 - (sljit_gpr)saveds, r14, offset, r15)));
offset += (saveds + 1) * SSIZE_OF(sw);
}
} else if (saveds > saved_arg_count) {
if (saveds == saved_arg_count + 1) {
FAIL_IF(push_inst(compiler, stg(r14 - (sljit_gpr)saveds, offset, 0, r15)));
offset += SSIZE_OF(sw);
} else {
FAIL_IF(push_inst(compiler, stmg(r14 - (sljit_gpr)saveds, r13 - (sljit_gpr)saved_arg_count, offset, r15)));
offset += (saveds - saved_arg_count) * SSIZE_OF(sw);
}
}
}
if (saved_arg_count > 0) {
FAIL_IF(push_inst(compiler, stg(r14, offset, 0, r15)));
offset += SSIZE_OF(sw);
}
tmp = SLJIT_FS0 - fsaveds;
for (i = SLJIT_FS0; i > tmp; i--) {
FAIL_IF(push_inst(compiler, 0x60000000 /* std */ | F20(i) | R12A(r15) | (sljit_ins)offset));
@ -1684,15 +1729,19 @@ SLJIT_API_FUNC_ATTRIBUTE sljit_s32 sljit_emit_enter(struct sljit_compiler *compi
FAIL_IF(push_inst(compiler, 0xe30000000071 /* lay */ | R36A(r15) | R28A(r15) | disp_s20(-local_size)));
if (options & SLJIT_ENTER_REG_ARG)
return SLJIT_SUCCESS;
arg_types >>= SLJIT_ARG_SHIFT;
saved_arg_count = 0;
tmp = 0;
while (arg_types > 0) {
if ((arg_types & SLJIT_ARG_MASK) < SLJIT_ARG_TYPE_F64) {
if (!(arg_types & SLJIT_ARG_TYPE_SCRATCH_REG)) {
FAIL_IF(push_inst(compiler, lgr(gpr(SLJIT_S0 - tmp), gpr(SLJIT_R0 + word_arg_count))));
tmp++;
FAIL_IF(push_inst(compiler, lgr(gpr(SLJIT_S0 - saved_arg_count), gpr(SLJIT_R0 + tmp))));
saved_arg_count++;
}
word_arg_count++;
tmp++;
}
arg_types >>= SLJIT_ARG_SHIFT;
@ -1719,6 +1768,7 @@ static sljit_s32 emit_stack_frame_release(struct sljit_compiler *compiler)
sljit_s32 local_size = compiler->local_size;
sljit_s32 saveds = compiler->saveds;
sljit_s32 scratches = compiler->scratches;
sljit_s32 kept_saveds_count = SLJIT_KEPT_SAVEDS_COUNT(compiler->options);
if (is_u12(local_size))
FAIL_IF(push_inst(compiler, 0x41000000 /* ly */ | R20A(r15) | R12A(r15) | (sljit_ins)local_size));
@ -1727,8 +1777,13 @@ static sljit_s32 emit_stack_frame_release(struct sljit_compiler *compiler)
offset = 2 * SSIZE_OF(sw);
if (saveds + scratches >= SLJIT_NUMBER_OF_REGISTERS) {
FAIL_IF(push_inst(compiler, lmg(r6, r14, offset, r15))); /* save registers TODO(MGM): optimize */
offset += 9 * SSIZE_OF(sw);
if (kept_saveds_count == 0) {
FAIL_IF(push_inst(compiler, lmg(r6, r14, offset, r15)));
offset += 9 * SSIZE_OF(sw);
} else {
FAIL_IF(push_inst(compiler, lmg(r6, r13 - (sljit_gpr)kept_saveds_count, offset, r15)));
offset += (8 - kept_saveds_count) * SSIZE_OF(sw);
}
} else {
if (scratches == SLJIT_FIRST_SAVED_REG) {
FAIL_IF(push_inst(compiler, lg(r6, offset, 0, r15)));
@ -1738,15 +1793,30 @@ static sljit_s32 emit_stack_frame_release(struct sljit_compiler *compiler)
offset += (scratches - (SLJIT_FIRST_SAVED_REG - 1)) * SSIZE_OF(sw);
}
if (saveds == 0) {
FAIL_IF(push_inst(compiler, lg(r14, offset, 0, r15)));
offset += SSIZE_OF(sw);
} else {
FAIL_IF(push_inst(compiler, lmg(r14 - (sljit_gpr)saveds, r14, offset, r15)));
offset += (saveds + 1) * SSIZE_OF(sw);
if (kept_saveds_count == 0) {
if (saveds == 0) {
FAIL_IF(push_inst(compiler, lg(r14, offset, 0, r15)));
offset += SSIZE_OF(sw);
} else {
FAIL_IF(push_inst(compiler, lmg(r14 - (sljit_gpr)saveds, r14, offset, r15)));
offset += (saveds + 1) * SSIZE_OF(sw);
}
} else if (saveds > kept_saveds_count) {
if (saveds == kept_saveds_count + 1) {
FAIL_IF(push_inst(compiler, lg(r14 - (sljit_gpr)saveds, offset, 0, r15)));
offset += SSIZE_OF(sw);
} else {
FAIL_IF(push_inst(compiler, lmg(r14 - (sljit_gpr)saveds, r13 - (sljit_gpr)kept_saveds_count, offset, r15)));
offset += (saveds - kept_saveds_count) * SSIZE_OF(sw);
}
}
}
if (kept_saveds_count > 0) {
FAIL_IF(push_inst(compiler, lg(r14, offset, 0, r15)));
offset += SSIZE_OF(sw);
}
tmp = SLJIT_FS0 - compiler->fsaveds;
for (i = SLJIT_FS0; i > tmp; i--) {
FAIL_IF(push_inst(compiler, 0x68000000 /* ld */ | F20(i) | R12A(r15) | (sljit_ins)offset));
@ -2734,10 +2804,7 @@ SLJIT_API_FUNC_ATTRIBUTE sljit_s32 sljit_emit_op2u(struct sljit_compiler *compil
CHECK_ERROR();
CHECK(check_sljit_emit_op2(compiler, op, 1, 0, 0, src1, src1w, src2, src2w));
#if (defined SLJIT_VERBOSE && SLJIT_VERBOSE) \
|| (defined SLJIT_ARGUMENT_CHECKS && SLJIT_ARGUMENT_CHECKS)
compiler->skip_checks = 1;
#endif
SLJIT_SKIP_CHECKS(compiler);
return sljit_emit_op2(compiler, op, (sljit_s32)tmp0, 0, src1, src1w, src2, src2w);
}
@ -3117,6 +3184,7 @@ SLJIT_API_FUNC_ATTRIBUTE struct sljit_jump* sljit_emit_jump(struct sljit_compile
SLJIT_API_FUNC_ATTRIBUTE struct sljit_jump* sljit_emit_call(struct sljit_compiler *compiler, sljit_s32 type,
sljit_s32 arg_types)
{
SLJIT_UNUSED_ARG(arg_types);
CHECK_ERROR_PTR();
CHECK_PTR(check_sljit_emit_call(compiler, type, arg_types));
@ -3125,11 +3193,7 @@ SLJIT_API_FUNC_ATTRIBUTE struct sljit_jump* sljit_emit_call(struct sljit_compile
type = SLJIT_JUMP | (type & SLJIT_REWRITABLE_JUMP);
}
#if (defined SLJIT_VERBOSE && SLJIT_VERBOSE) \
|| (defined SLJIT_ARGUMENT_CHECKS && SLJIT_ARGUMENT_CHECKS)
compiler->skip_checks = 1;
#endif
SLJIT_SKIP_CHECKS(compiler);
return sljit_emit_jump(compiler, type);
}
@ -3181,11 +3245,7 @@ SLJIT_API_FUNC_ATTRIBUTE sljit_s32 sljit_emit_icall(struct sljit_compiler *compi
type = SLJIT_JUMP;
}
#if (defined SLJIT_VERBOSE && SLJIT_VERBOSE) \
|| (defined SLJIT_ARGUMENT_CHECKS && SLJIT_ARGUMENT_CHECKS)
compiler->skip_checks = 1;
#endif
SLJIT_SKIP_CHECKS(compiler);
return sljit_emit_ijump(compiler, type, src, srcw);
}
@ -3193,7 +3253,7 @@ SLJIT_API_FUNC_ATTRIBUTE sljit_s32 sljit_emit_op_flags(struct sljit_compiler *co
sljit_s32 dst, sljit_sw dstw,
sljit_s32 type)
{
sljit_u8 mask = get_cc(compiler, type & 0xff);
sljit_u8 mask = get_cc(compiler, type);
CHECK_ERROR();
CHECK(check_sljit_emit_op_flags(compiler, op, dst, dstw, type));
@ -3263,7 +3323,7 @@ SLJIT_API_FUNC_ATTRIBUTE sljit_s32 sljit_emit_cmov(struct sljit_compiler *compil
sljit_s32 dst_reg,
sljit_s32 src, sljit_sw srcw)
{
sljit_u8 mask = get_cc(compiler, type & 0xff);
sljit_u8 mask = get_cc(compiler, type);
sljit_gpr dst_r = gpr(dst_reg & ~SLJIT_32);
sljit_gpr src_r = FAST_IS_REG(src) ? gpr(src) : tmp0;
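Note on the `get_cc` hunks in this file: the new ordered and unordered-or comparison types map to 4-bit condition-code masks, and each "unordered or X" mask is the bitwise complement of the mask for the ordered opposite of X. The sketch below copies the masks from the `case` labels in the diff and checks that relationship. The numeric values of `cc0` to `cc3` are not shown in this excerpt, so the values used here are an assumption, and the enum names drop the `SLJIT_` prefix to mark them as local stand-ins.

```c
#include <assert.h>

/* Assumed bit values for the 4-bit mask; the diff only shows the names. */
enum { cc0 = 1 << 3, cc1 = 1 << 2, cc2 = 1 << 1, cc3 = 1 << 0 };

/* Masks as returned by the get_cc() cases in the diff. */
enum {
	ORDERED_EQUAL = cc0,
	ORDERED_NOT_EQUAL = cc1 | cc2,
	ORDERED_LESS = cc1,
	ORDERED_GREATER_EQUAL = cc0 | cc2,
	ORDERED_GREATER = cc2,
	ORDERED_LESS_EQUAL = cc0 | cc1,
	UNORDERED_OR_EQUAL = cc0 | cc3,
	UNORDERED_OR_NOT_EQUAL = cc1 | cc2 | cc3,
	UNORDERED_OR_LESS = cc1 | cc3,
	UNORDERED_OR_GREATER_EQUAL = cc0 | cc2 | cc3,
	UNORDERED_OR_GREATER = cc2 | cc3,
	UNORDERED_OR_LESS_EQUAL = cc0 | cc1 | cc3
};

/* Negating a condition flips every bit of its 4-bit mask. */
static int negate_mask(int mask)
{
	return ~mask & 0xf;
}
```

For example, negating `ORDERED_LESS` (`cc1`) gives `cc0 | cc2 | cc3`, the mask listed for `SLJIT_UNORDERED_OR_GREATER_EQUAL`. The complement property holds for any assignment of four distinct bits, so it does not depend on the assumed values.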


@ -1,283 +0,0 @@
/*
* Stack-less Just-In-Time compiler
*
* Copyright Zoltan Herczeg (hzmester@freemail.hu). All rights reserved.
*
* Redistribution and use in source and binary forms, with or without modification, are
* permitted provided that the following conditions are met:
*
* 1. Redistributions of source code must retain the above copyright notice, this list of
* conditions and the following disclaimer.
*
* 2. Redistributions in binary form must reproduce the above copyright notice, this list
* of conditions and the following disclaimer in the documentation and/or other materials
* provided with the distribution.
*
* THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDER(S) AND CONTRIBUTORS ``AS IS'' AND ANY
* EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
* OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT
* SHALL THE COPYRIGHT HOLDER(S) OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT,
* INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED
* TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR
* BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
* CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN
* ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
*/
static sljit_s32 load_immediate(struct sljit_compiler *compiler, sljit_s32 dst, sljit_sw imm)
{
if (imm <= SIMM_MAX && imm >= SIMM_MIN)
return push_inst(compiler, OR | D(dst) | S1(0) | IMM(imm), DR(dst));
FAIL_IF(push_inst(compiler, SETHI | D(dst) | ((imm >> 10) & 0x3fffff), DR(dst)));
return (imm & 0x3ff) ? push_inst(compiler, OR | D(dst) | S1(dst) | IMM_ARG | (imm & 0x3ff), DR(dst)) : SLJIT_SUCCESS;
}
#define ARG2(flags, src2) ((flags & SRC2_IMM) ? IMM(src2) : S2(src2))
static SLJIT_INLINE sljit_s32 emit_single_op(struct sljit_compiler *compiler, sljit_s32 op, sljit_u32 flags,
sljit_s32 dst, sljit_s32 src1, sljit_sw src2)
{
SLJIT_COMPILE_ASSERT(ICC_IS_SET == SET_FLAGS, icc_is_set_and_set_flags_must_be_the_same);
switch (op) {
case SLJIT_MOV:
SLJIT_ASSERT(src1 == TMP_REG1 && !(flags & SRC2_IMM));
if (dst != src2)
return push_inst(compiler, OR | D(dst) | S1(0) | S2(src2), DR(dst));
return SLJIT_SUCCESS;
case SLJIT_MOV_U8:
case SLJIT_MOV_S8:
SLJIT_ASSERT(src1 == TMP_REG1 && !(flags & SRC2_IMM));
if ((flags & (REG_DEST | REG2_SOURCE)) == (REG_DEST | REG2_SOURCE)) {
if (op == SLJIT_MOV_U8)
return push_inst(compiler, AND | D(dst) | S1(src2) | IMM(0xff), DR(dst));
FAIL_IF(push_inst(compiler, SLL | D(dst) | S1(src2) | IMM(24), DR(dst)));
return push_inst(compiler, SRA | D(dst) | S1(dst) | IMM(24), DR(dst));
}
SLJIT_ASSERT(dst == src2);
return SLJIT_SUCCESS;
case SLJIT_MOV_U16:
case SLJIT_MOV_S16:
SLJIT_ASSERT(src1 == TMP_REG1 && !(flags & SRC2_IMM));
if ((flags & (REG_DEST | REG2_SOURCE)) == (REG_DEST | REG2_SOURCE)) {
FAIL_IF(push_inst(compiler, SLL | D(dst) | S1(src2) | IMM(16), DR(dst)));
return push_inst(compiler, (op == SLJIT_MOV_S16 ? SRA : SRL) | D(dst) | S1(dst) | IMM(16), DR(dst));
}
SLJIT_ASSERT(dst == src2);
return SLJIT_SUCCESS;
case SLJIT_NOT:
SLJIT_ASSERT(src1 == TMP_REG1 && !(flags & SRC2_IMM));
return push_inst(compiler, XNOR | (flags & SET_FLAGS) | D(dst) | S1(0) | S2(src2), DRF(dst, flags));
case SLJIT_CLZ:
SLJIT_ASSERT(src1 == TMP_REG1 && !(flags & SRC2_IMM));
FAIL_IF(push_inst(compiler, SUB | SET_FLAGS | D(0) | S1(src2) | S2(0), SET_FLAGS));
FAIL_IF(push_inst(compiler, OR | D(TMP_REG1) | S1(0) | S2(src2), DR(TMP_REG1)));
FAIL_IF(push_inst(compiler, BICC | DA(0x1) | (7 & DISP_MASK), UNMOVABLE_INS));
FAIL_IF(push_inst(compiler, OR | D(dst) | S1(0) | IMM(32), UNMOVABLE_INS));
FAIL_IF(push_inst(compiler, OR | D(dst) | S1(0) | IMM(-1), DR(dst)));
/* Loop. */
FAIL_IF(push_inst(compiler, SUB | SET_FLAGS | D(0) | S1(TMP_REG1) | S2(0), SET_FLAGS));
FAIL_IF(push_inst(compiler, SLL | D(TMP_REG1) | S1(TMP_REG1) | IMM(1), DR(TMP_REG1)));
FAIL_IF(push_inst(compiler, BICC | DA(0xe) | ((sljit_ins)-2 & DISP_MASK), UNMOVABLE_INS));
return push_inst(compiler, ADD | D(dst) | S1(dst) | IMM(1), UNMOVABLE_INS);
case SLJIT_ADD:
compiler->status_flags_state = SLJIT_CURRENT_FLAGS_ADD;
return push_inst(compiler, ADD | (flags & SET_FLAGS) | D(dst) | S1(src1) | ARG2(flags, src2), DRF(dst, flags));
case SLJIT_ADDC:
compiler->status_flags_state = SLJIT_CURRENT_FLAGS_ADD;
return push_inst(compiler, ADDC | (flags & SET_FLAGS) | D(dst) | S1(src1) | ARG2(flags, src2), DRF(dst, flags));
case SLJIT_SUB:
compiler->status_flags_state = SLJIT_CURRENT_FLAGS_SUB;
return push_inst(compiler, SUB | (flags & SET_FLAGS) | D(dst) | S1(src1) | ARG2(flags, src2), DRF(dst, flags));
case SLJIT_SUBC:
compiler->status_flags_state = SLJIT_CURRENT_FLAGS_SUB;
return push_inst(compiler, SUBC | (flags & SET_FLAGS) | D(dst) | S1(src1) | ARG2(flags, src2), DRF(dst, flags));
case SLJIT_MUL:
compiler->status_flags_state = 0;
FAIL_IF(push_inst(compiler, SMUL | D(dst) | S1(src1) | ARG2(flags, src2), DR(dst)));
if (!(flags & SET_FLAGS))
return SLJIT_SUCCESS;
FAIL_IF(push_inst(compiler, SRA | D(TMP_REG1) | S1(dst) | IMM(31), DR(TMP_REG1)));
FAIL_IF(push_inst(compiler, RDY | D(TMP_LINK), DR(TMP_LINK)));
return push_inst(compiler, SUB | SET_FLAGS | D(0) | S1(TMP_REG1) | S2(TMP_LINK), MOVABLE_INS | SET_FLAGS);
case SLJIT_AND:
return push_inst(compiler, AND | (flags & SET_FLAGS) | D(dst) | S1(src1) | ARG2(flags, src2), DRF(dst, flags));
case SLJIT_OR:
return push_inst(compiler, OR | (flags & SET_FLAGS) | D(dst) | S1(src1) | ARG2(flags, src2), DRF(dst, flags));
case SLJIT_XOR:
return push_inst(compiler, XOR | (flags & SET_FLAGS) | D(dst) | S1(src1) | ARG2(flags, src2), DRF(dst, flags));
case SLJIT_SHL:
FAIL_IF(push_inst(compiler, SLL | D(dst) | S1(src1) | ARG2(flags, src2), DR(dst)));
return !(flags & SET_FLAGS) ? SLJIT_SUCCESS : push_inst(compiler, SUB | SET_FLAGS | D(0) | S1(dst) | S2(0), SET_FLAGS);
case SLJIT_LSHR:
FAIL_IF(push_inst(compiler, SRL | D(dst) | S1(src1) | ARG2(flags, src2), DR(dst)));
return !(flags & SET_FLAGS) ? SLJIT_SUCCESS : push_inst(compiler, SUB | SET_FLAGS | D(0) | S1(dst) | S2(0), SET_FLAGS);
case SLJIT_ASHR:
FAIL_IF(push_inst(compiler, SRA | D(dst) | S1(src1) | ARG2(flags, src2), DR(dst)));
return !(flags & SET_FLAGS) ? SLJIT_SUCCESS : push_inst(compiler, SUB | SET_FLAGS | D(0) | S1(dst) | S2(0), SET_FLAGS);
}
SLJIT_UNREACHABLE();
return SLJIT_SUCCESS;
}
static sljit_s32 call_with_args(struct sljit_compiler *compiler, sljit_s32 arg_types, sljit_s32 *src)
{
sljit_s32 reg_index = 8;
sljit_s32 word_reg_index = 8;
sljit_s32 float_arg_index = 1;
sljit_s32 double_arg_count = 0;
sljit_u32 float_offset = (16 + 6) * sizeof(sljit_sw);
sljit_s32 types = 0;
sljit_s32 reg = 0;
sljit_s32 move_to_tmp2 = 0;
if (src)
reg = reg_map[*src & REG_MASK];
arg_types >>= SLJIT_ARG_SHIFT;
while (arg_types) {
types = (types << SLJIT_ARG_SHIFT) | (arg_types & SLJIT_ARG_MASK);
switch (arg_types & SLJIT_ARG_MASK) {
case SLJIT_ARG_TYPE_F64:
float_arg_index++;
double_arg_count++;
if (reg_index == reg || reg_index + 1 == reg)
move_to_tmp2 = 1;
reg_index += 2;
break;
case SLJIT_ARG_TYPE_F32:
float_arg_index++;
if (reg_index == reg)
move_to_tmp2 = 1;
reg_index++;
break;
default:
if (reg_index != word_reg_index && reg_index == reg)
move_to_tmp2 = 1;
reg_index++;
word_reg_index++;
break;
}
arg_types >>= SLJIT_ARG_SHIFT;
}
if (move_to_tmp2) {
if (reg < 14)
FAIL_IF(push_inst(compiler, OR | D(TMP_REG1) | S1(0) | S2A(reg), DR(TMP_REG1)));
*src = TMP_REG1;
}
arg_types = types;
while (arg_types) {
switch (arg_types & SLJIT_ARG_MASK) {
case SLJIT_ARG_TYPE_F64:
float_arg_index--;
if (float_arg_index == 4 && double_arg_count == 4) {
/* The address is not doubleword aligned, so two instructions are required to store the double. */
FAIL_IF(push_inst(compiler, STF | FD(float_arg_index) | S1(SLJIT_SP) | IMM((16 + 7) * sizeof(sljit_sw)), MOVABLE_INS));
FAIL_IF(push_inst(compiler, STF | FD(float_arg_index) | (1 << 25) | S1(SLJIT_SP) | IMM((16 + 8) * sizeof(sljit_sw)), MOVABLE_INS));
}
else
FAIL_IF(push_inst(compiler, STDF | FD(float_arg_index) | S1(SLJIT_SP) | IMM(float_offset), MOVABLE_INS));
float_offset -= sizeof(sljit_f64);
break;
case SLJIT_ARG_TYPE_F32:
float_arg_index--;
FAIL_IF(push_inst(compiler, STF | FD(float_arg_index) | S1(SLJIT_SP) | IMM(float_offset), MOVABLE_INS));
float_offset -= sizeof(sljit_f64);
break;
default:
break;
}
arg_types >>= SLJIT_ARG_SHIFT;
}
float_offset = (16 + 6) * sizeof(sljit_sw);
while (types) {
switch (types & SLJIT_ARG_MASK) {
case SLJIT_ARG_TYPE_F64:
reg_index -= 2;
if (reg_index < 14) {
if ((reg_index & 0x1) != 0) {
FAIL_IF(push_inst(compiler, LDUW | DA(reg_index) | S1(SLJIT_SP) | IMM(float_offset), reg_index));
if (reg_index < 8 + 6 - 1)
FAIL_IF(push_inst(compiler, LDUW | DA(reg_index + 1) | S1(SLJIT_SP) | IMM(float_offset + sizeof(sljit_sw)), reg_index + 1));
}
else
FAIL_IF(push_inst(compiler, LDD | DA(reg_index) | S1(SLJIT_SP) | IMM(float_offset), reg_index));
}
float_offset -= sizeof(sljit_f64);
break;
case SLJIT_ARG_TYPE_F32:
reg_index--;
if (reg_index < 8 + 6)
FAIL_IF(push_inst(compiler, LDUW | DA(reg_index) | S1(SLJIT_SP) | IMM(float_offset), reg_index));
float_offset -= sizeof(sljit_f64);
break;
default:
reg_index--;
word_reg_index--;
if (reg_index != word_reg_index) {
if (reg_index < 14)
FAIL_IF(push_inst(compiler, OR | DA(reg_index) | S1(0) | S2A(word_reg_index), reg_index));
else
FAIL_IF(push_inst(compiler, STW | DA(word_reg_index) | S1(SLJIT_SP) | IMM(92), word_reg_index));
}
break;
}
types >>= SLJIT_ARG_SHIFT;
}
return SLJIT_SUCCESS;
}
static SLJIT_INLINE sljit_s32 emit_const(struct sljit_compiler *compiler, sljit_s32 dst, sljit_sw init_value)
{
FAIL_IF(push_inst(compiler, SETHI | D(dst) | ((init_value >> 10) & 0x3fffff), DR(dst)));
return push_inst(compiler, OR | D(dst) | S1(dst) | IMM_ARG | (init_value & 0x3ff), DR(dst));
}
SLJIT_API_FUNC_ATTRIBUTE void sljit_set_jump_addr(sljit_uw addr, sljit_uw new_target, sljit_sw executable_offset)
{
sljit_ins *inst = (sljit_ins *)addr;
SLJIT_UNUSED_ARG(executable_offset);
SLJIT_UPDATE_WX_FLAGS(inst, inst + 2, 0);
SLJIT_ASSERT(((inst[0] & 0xc1c00000) == 0x01000000) && ((inst[1] & 0xc1f82000) == 0x80102000));
inst[0] = (inst[0] & 0xffc00000) | ((new_target >> 10) & 0x3fffff);
inst[1] = (inst[1] & 0xfffffc00) | (new_target & 0x3ff);
SLJIT_UPDATE_WX_FLAGS(inst, inst + 2, 1);
inst = (sljit_ins *)SLJIT_ADD_EXEC_OFFSET(inst, executable_offset);
SLJIT_CACHE_FLUSH(inst, inst + 2);
}
SLJIT_API_FUNC_ATTRIBUTE void sljit_set_const(sljit_uw addr, sljit_sw new_constant, sljit_sw executable_offset)
{
sljit_set_jump_addr(addr, (sljit_uw)new_constant, executable_offset);
}
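Note on the removed SPARC helpers above: `load_immediate`, `emit_const` and `sljit_set_jump_addr` all split a 32-bit value into a 22-bit upper part for SETHI and a 10-bit lower part for OR. The sketch below shows only that arithmetic; the function names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Upper 22 bits, as placed in the SETHI immediate field. */
static uint32_t sethi_part(uint32_t value)
{
	return (value >> 10) & 0x3fffff;
}

/* Lower 10 bits, as placed in the OR immediate field. */
static uint32_t or_part(uint32_t value)
{
	return value & 0x3ff;
}

/* Value the register holds after SETHI followed by OR. */
static uint32_t recombine(uint32_t hi22, uint32_t lo10)
{
	return (hi22 << 10) | lo10;
}
```

Unlike the RISC-V LUI/ADDI pair, no rounding step is needed here, because the low part is merged with OR and never sign-extends into the upper bits.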

File diff suppressed because it is too large

File diff suppressed because it is too large


@ -101,34 +101,38 @@ static sljit_u8* emit_x86_instruction(struct sljit_compiler *compiler, sljit_uw
/* Calculate size of b. */
inst_size += 1; /* mod r/m byte. */
if (b & SLJIT_MEM) {
if (!(b & OFFS_REG_MASK)) {
if (NOT_HALFWORD(immb)) {
PTR_FAIL_IF(emit_load_imm64(compiler, TMP_REG2, immb));
immb = 0;
if (b & REG_MASK)
b |= TO_OFFS_REG(TMP_REG2);
else
b |= TMP_REG2;
}
else if (reg_lmap[b & REG_MASK] == 4)
b |= TO_OFFS_REG(SLJIT_SP);
if (!(b & OFFS_REG_MASK) && NOT_HALFWORD(immb)) {
PTR_FAIL_IF(emit_load_imm64(compiler, TMP_REG2, immb));
immb = 0;
if (b & REG_MASK)
b |= TO_OFFS_REG(TMP_REG2);
else
b |= TMP_REG2;
}
if (!(b & REG_MASK))
inst_size += 1 + sizeof(sljit_s32); /* SIB byte required to avoid RIP based addressing. */
else {
if (reg_map[b & REG_MASK] >= 8)
rex |= REX_B;
if (immb != 0 && (!(b & OFFS_REG_MASK) || (b & OFFS_REG_MASK) == TO_OFFS_REG(SLJIT_SP))) {
if (immb != 0 && !(b & OFFS_REG_MASK)) {
/* Immediate operand. */
if (immb <= 127 && immb >= -128)
inst_size += sizeof(sljit_s8);
else
inst_size += sizeof(sljit_s32);
}
else if (reg_lmap[b & REG_MASK] == 5)
inst_size += sizeof(sljit_s8);
else if (reg_lmap[b & REG_MASK] == 5) {
/* Swap registers if possible. */
if ((b & OFFS_REG_MASK) && (immb & 0x3) == 0 && reg_lmap[OFFS_REG(b)] != 5)
b = SLJIT_MEM | OFFS_REG(b) | TO_OFFS_REG(b & REG_MASK);
else
inst_size += sizeof(sljit_s8);
}
if (reg_map[b & REG_MASK] >= 8)
rex |= REX_B;
if (reg_lmap[b & REG_MASK] == 4 && !(b & OFFS_REG_MASK))
b |= TO_OFFS_REG(SLJIT_SP);
if (b & OFFS_REG_MASK) {
inst_size += 1; /* SIB byte. */
@ -155,7 +159,7 @@ static sljit_u8* emit_x86_instruction(struct sljit_compiler *compiler, sljit_uw
else if (flags & EX86_SHIFT_INS) {
imma &= compiler->mode32 ? 0x1f : 0x3f;
if (imma != 1) {
inst_size ++;
inst_size++;
flags |= EX86_BYTE_ARG;
}
} else if (flags & EX86_BYTE_ARG)
@ -223,7 +227,7 @@ static sljit_u8* emit_x86_instruction(struct sljit_compiler *compiler, sljit_uw
} else if (b & REG_MASK) {
reg_lmap_b = reg_lmap[b & REG_MASK];
if (!(b & OFFS_REG_MASK) || (b & OFFS_REG_MASK) == TO_OFFS_REG(SLJIT_SP) || reg_lmap_b == 5) {
if (!(b & OFFS_REG_MASK) || (b & OFFS_REG_MASK) == TO_OFFS_REG(SLJIT_SP)) {
if (immb != 0 || reg_lmap_b == 5) {
if (immb <= 127 && immb >= -128)
*buf_ptr |= 0x40;
@ -248,8 +252,14 @@ static sljit_u8* emit_x86_instruction(struct sljit_compiler *compiler, sljit_uw
}
}
else {
if (reg_lmap_b == 5)
*buf_ptr |= 0x40;
*buf_ptr++ |= 0x04;
*buf_ptr++ = U8(reg_lmap_b | (reg_lmap[OFFS_REG(b)] << 3) | (immb << 6));
if (reg_lmap_b == 5)
*buf_ptr++ = 0;
}
}
else {
@ -366,7 +376,7 @@ SLJIT_API_FUNC_ATTRIBUTE sljit_s32 sljit_emit_enter(struct sljit_compiler *compi
{
sljit_uw size;
sljit_s32 word_arg_count = 0;
sljit_s32 saved_arg_count = 0;
sljit_s32 saved_arg_count = SLJIT_KEPT_SAVEDS_COUNT(options);
sljit_s32 saved_regs_size, tmp, i;
#ifdef _WIN64
sljit_s32 saved_float_regs_size;
@ -379,16 +389,19 @@ SLJIT_API_FUNC_ATTRIBUTE sljit_s32 sljit_emit_enter(struct sljit_compiler *compi
CHECK(check_sljit_emit_enter(compiler, options, arg_types, scratches, saveds, fscratches, fsaveds, local_size));
set_emit_enter(compiler, options, arg_types, scratches, saveds, fscratches, fsaveds, local_size);
if (options & SLJIT_ENTER_REG_ARG)
arg_types = 0;
/* Emit ENDBR64 at function entry if needed. */
FAIL_IF(emit_endbranch(compiler));
compiler->mode32 = 0;
/* Including the return address saved by the call instruction. */
saved_regs_size = GET_SAVED_REGISTERS_SIZE(scratches, saveds, 1);
saved_regs_size = GET_SAVED_REGISTERS_SIZE(scratches, saveds - saved_arg_count, 1);
tmp = SLJIT_S0 - saveds;
for (i = SLJIT_S0; i > tmp; i--) {
for (i = SLJIT_S0 - saved_arg_count; i > tmp; i--) {
size = reg_map[i] >= 8 ? 2 : 1;
inst = (sljit_u8*)ensure_buf(compiler, 1 + size);
FAIL_IF(!inst);
@ -561,7 +574,7 @@ SLJIT_API_FUNC_ATTRIBUTE sljit_s32 sljit_set_context(struct sljit_compiler *comp
#endif /* _WIN64 */
/* Including the return address saved by the call instruction. */
saved_regs_size = GET_SAVED_REGISTERS_SIZE(scratches, saveds, 1);
saved_regs_size = GET_SAVED_REGISTERS_SIZE(scratches, saveds - SLJIT_KEPT_SAVEDS_COUNT(options), 1);
compiler->local_size = ((local_size + saved_regs_size + 0xf) & ~0xf) - saved_regs_size;
return SLJIT_SUCCESS;
}
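
The `compiler->local_size` line in the hunk above rounds the whole frame (locals plus pushed registers plus return address) up to a multiple of 16 bytes and then subtracts the part that is already on the stack. Kept saved registers are not pushed, so they lower `saved_regs_size` and can change the padding. A minimal sketch of the arithmetic follows. `aligned_local_size` is a hypothetical name, and `saved_regs_size` is passed in directly because the definition of `GET_SAVED_REGISTERS_SIZE` is not shown in this diff.

```c
#include <assert.h>

/* Hypothetical helper: same arithmetic as the compiler->local_size line.
   Returns how many bytes to reserve for locals so that
   local bytes + saved_regs_size is a multiple of 16. */
static long aligned_local_size(long local_size, long saved_regs_size)
{
    long total;

    assert(local_size >= 0 && saved_regs_size >= 0);
    total = (local_size + saved_regs_size + 0xf) & ~0xfL;
    return total - saved_regs_size;
}
```

With 20 bytes of locals and 24 bytes of saved state the result is 24, so the frame totals 48 bytes. If one 8-byte register is kept instead of pushed (16 bytes of saved state), the result becomes 32 and the total is still 48.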
@ -633,8 +646,8 @@ static sljit_s32 emit_stack_frame_release(struct sljit_compiler *compiler)
POP_REG(reg_lmap[i]);
}
tmp = compiler->saveds < SLJIT_NUMBER_OF_SAVED_REGISTERS ? (SLJIT_S0 + 1 - compiler->saveds) : SLJIT_FIRST_SAVED_REG;
for (i = tmp; i <= SLJIT_S0; i++) {
tmp = SLJIT_S0 - SLJIT_KEPT_SAVEDS_COUNT(compiler->options);
for (i = SLJIT_S0 + 1 - compiler->saveds; i <= tmp; i++) {
size = reg_map[i] >= 8 ? 2 : 1;
inst = (sljit_u8*)ensure_buf(compiler, 1 + size);
FAIL_IF(!inst);
@ -786,17 +799,15 @@ SLJIT_API_FUNC_ATTRIBUTE struct sljit_jump* sljit_emit_call(struct sljit_compile
compiler->mode32 = 0;
PTR_FAIL_IF(call_with_args(compiler, arg_types, NULL));
if ((type & 0xff) != SLJIT_CALL_REG_ARG)
PTR_FAIL_IF(call_with_args(compiler, arg_types, NULL));
if (type & SLJIT_CALL_RETURN) {
PTR_FAIL_IF(emit_stack_frame_release(compiler));
type = SLJIT_JUMP | (type & SLJIT_REWRITABLE_JUMP);
}
#if (defined SLJIT_VERBOSE && SLJIT_VERBOSE) \
|| (defined SLJIT_ARGUMENT_CHECKS && SLJIT_ARGUMENT_CHECKS)
compiler->skip_checks = 1;
#endif
SLJIT_SKIP_CHECKS(compiler);
return sljit_emit_jump(compiler, type);
}
@ -822,16 +833,15 @@ SLJIT_API_FUNC_ATTRIBUTE sljit_s32 sljit_emit_icall(struct sljit_compiler *compi
}
FAIL_IF(emit_stack_frame_release(compiler));
type = SLJIT_JUMP;
}
FAIL_IF(call_with_args(compiler, arg_types, &src));
if ((type & 0xff) != SLJIT_CALL_REG_ARG)
FAIL_IF(call_with_args(compiler, arg_types, &src));
#if (defined SLJIT_VERBOSE && SLJIT_VERBOSE) \
|| (defined SLJIT_ARGUMENT_CHECKS && SLJIT_ARGUMENT_CHECKS)
compiler->skip_checks = 1;
#endif
if (type & SLJIT_CALL_RETURN)
type = SLJIT_JUMP;
SLJIT_SKIP_CHECKS(compiler);
return sljit_emit_ijump(compiler, type, src, srcw);
}
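
The `reg_lmap[...] == 5` special cases in `emit_x86_instruction` above come from an x86 encoding rule: in a SIB byte, a base field of 101b combined with `mod = 00b` means "no base, 32-bit displacement follows", so a base register whose low three bits are 5 (`rbp`, `r13`) needs `mod = 01b` and an explicit zero `disp8`. The sketch below encodes only the ModRM, SIB and optional displacement bytes for `[base + (index << scale)]`. `encode_base_index` is a hypothetical helper rather than an SLJIT function, and REX prefix handling is omitted.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: writes ModRM + SIB (+ disp8) for [base + index << scale].
   reg, base and index are the low three bits of the register numbers.
   Returns the number of bytes written (2 or 3). */
static size_t encode_base_index(uint8_t *buf, unsigned reg, unsigned base,
                                unsigned index, unsigned scale)
{
    size_t n = 0;
    uint8_t modrm;

    assert(reg < 8 && base < 8 && index < 8 && scale < 4);
    assert(index != 4); /* index 100b means "no index" in a SIB byte */

    modrm = (uint8_t)((reg << 3) | 0x04); /* rm = 100b: a SIB byte follows */
    if (base == 5)
        modrm |= 0x40;                    /* mod = 01b: a disp8 follows */
    buf[n++] = modrm;
    buf[n++] = (uint8_t)(base | (index << 3) | (scale << 6));
    if (base == 5)
        buf[n++] = 0;                     /* the zero displacement */
    return n;
}
```

This is why the size calculation in the diff first tries to swap base and index when the scale is zero and the index register's low bits are not 5: after the swap the extra displacement byte is not needed.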


@ -26,11 +26,7 @@
SLJIT_API_FUNC_ATTRIBUTE const char* sljit_get_platform_name(void)
{
#if (defined SLJIT_X86_32_FASTCALL && SLJIT_X86_32_FASTCALL)
return "x86" SLJIT_CPUINFO " ABI:fastcall";
#else
return "x86" SLJIT_CPUINFO;
#endif
}
/*
@ -379,29 +375,41 @@ static sljit_u8 get_jump_code(sljit_uw type)
{
switch (type) {
case SLJIT_EQUAL:
case SLJIT_EQUAL_F64:
case SLJIT_F_EQUAL:
case SLJIT_UNORDERED_OR_EQUAL:
case SLJIT_ORDERED_EQUAL: /* Not supported. */
return 0x84 /* je */;
case SLJIT_NOT_EQUAL:
case SLJIT_NOT_EQUAL_F64:
case SLJIT_F_NOT_EQUAL:
case SLJIT_ORDERED_NOT_EQUAL:
case SLJIT_UNORDERED_OR_NOT_EQUAL: /* Not supported. */
return 0x85 /* jne */;
case SLJIT_LESS:
case SLJIT_CARRY:
case SLJIT_LESS_F64:
case SLJIT_F_LESS:
case SLJIT_UNORDERED_OR_LESS:
case SLJIT_UNORDERED_OR_GREATER:
return 0x82 /* jc */;
case SLJIT_GREATER_EQUAL:
case SLJIT_NOT_CARRY:
case SLJIT_GREATER_EQUAL_F64:
case SLJIT_F_GREATER_EQUAL:
case SLJIT_ORDERED_GREATER_EQUAL:
case SLJIT_ORDERED_LESS_EQUAL:
return 0x83 /* jae */;
case SLJIT_GREATER:
case SLJIT_GREATER_F64:
case SLJIT_F_GREATER:
case SLJIT_ORDERED_LESS:
case SLJIT_ORDERED_GREATER:
return 0x87 /* jnbe */;
case SLJIT_LESS_EQUAL:
case SLJIT_LESS_EQUAL_F64:
case SLJIT_F_LESS_EQUAL:
case SLJIT_UNORDERED_OR_GREATER_EQUAL:
case SLJIT_UNORDERED_OR_LESS_EQUAL:
return 0x86 /* jbe */;
case SLJIT_SIG_LESS:
@ -422,10 +430,10 @@ static sljit_u8 get_jump_code(sljit_uw type)
case SLJIT_NOT_OVERFLOW:
return 0x81 /* jno */;
case SLJIT_UNORDERED_F64:
case SLJIT_UNORDERED:
return 0x8a /* jp */;
case SLJIT_ORDERED_F64:
case SLJIT_ORDERED:
return 0x8b /* jpo */;
}
return 0;
@ -682,6 +690,20 @@ SLJIT_API_FUNC_ATTRIBUTE sljit_s32 sljit_has_cpu_feature(sljit_s32 feature_type)
}
}
SLJIT_API_FUNC_ATTRIBUTE sljit_s32 sljit_cmp_info(sljit_s32 type)
{
if (type < SLJIT_UNORDERED || type > SLJIT_ORDERED_LESS_EQUAL)
return 0;
switch (type) {
case SLJIT_ORDERED_EQUAL:
case SLJIT_UNORDERED_OR_NOT_EQUAL:
return 0;
}
return 1;
}
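
`sljit_cmp_info` above is a range check followed by an exclusion list: every floating point comparison type in the range is reported as natively supported except the two that `get_jump_code` marks "Not supported". A likely reason is that after `ucomisd` an unordered result also sets ZF, so a single `je`/`jne` cannot separate "equal" from "unordered". The sketch below reproduces the shape of the check with a hypothetical enum. The numbering is invented for illustration, and the real `SLJIT_*` values come from the SLJIT headers, which are not part of this diff.

```c
#include <assert.h>

/* Hypothetical numbering: only the relative order matters for the sketch. */
enum cmp_type {
    CMP_EQUAL,                  /* integer compare, outside the range */
    CMP_UNORDERED,              /* first value of the checked range */
    CMP_ORDERED,
    CMP_ORDERED_EQUAL,          /* excluded: needs more than one jump */
    CMP_UNORDERED_OR_NOT_EQUAL, /* excluded: needs more than one jump */
    CMP_ORDERED_LESS,
    CMP_ORDERED_LESS_EQUAL      /* last value of the checked range */
};

/* Returns 1 when the comparison maps to a single condition code. */
static int cmp_info(int type)
{
    if (type < CMP_UNORDERED || type > CMP_ORDERED_LESS_EQUAL)
        return 0;

    switch (type) {
    case CMP_ORDERED_EQUAL:
    case CMP_UNORDERED_OR_NOT_EQUAL:
        return 0;
    }

    return 1;
}
```

A caller could use such a query to decide whether to emit one conditional jump or to fall back to a two-step sequence for the excluded types.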
/* --------------------------------------------------------------------- */
/* Operators */
/* --------------------------------------------------------------------- */
@ -2312,10 +2334,7 @@ SLJIT_API_FUNC_ATTRIBUTE sljit_s32 sljit_emit_op2u(struct sljit_compiler *compil
CHECK(check_sljit_emit_op2(compiler, op, 1, 0, 0, src1, src1w, src2, src2w));
if (opcode != SLJIT_SUB && opcode != SLJIT_AND) {
#if (defined SLJIT_VERBOSE && SLJIT_VERBOSE) \
|| (defined SLJIT_ARGUMENT_CHECKS && SLJIT_ARGUMENT_CHECKS)
compiler->skip_checks = 1;
#endif
SLJIT_SKIP_CHECKS(compiler);
return sljit_emit_op2(compiler, op, TMP_REG1, 0, src1, src1w, src2, src2w);
}
@ -2516,6 +2535,19 @@ static SLJIT_INLINE sljit_s32 sljit_emit_fop1_cmp(struct sljit_compiler *compile
sljit_s32 src1, sljit_sw src1w,
sljit_s32 src2, sljit_sw src2w)
{
switch (GET_FLAG_TYPE(op)) {
case SLJIT_ORDERED_LESS:
case SLJIT_UNORDERED_OR_GREATER_EQUAL:
case SLJIT_UNORDERED_OR_GREATER:
case SLJIT_ORDERED_LESS_EQUAL:
if (!FAST_IS_REG(src2)) {
FAIL_IF(emit_sse2_load(compiler, op & SLJIT_32, TMP_FREG, src2, src2w));
src2 = TMP_FREG;
}
return emit_sse2_logic(compiler, UCOMISD_x_xm, !(op & SLJIT_32), src2, src1, src1w);
}
if (!FAST_IS_REG(src1)) {
FAIL_IF(emit_sse2_load(compiler, op & SLJIT_32, TMP_FREG, src1, src1w));
src1 = TMP_FREG;
@ -2769,7 +2801,6 @@ SLJIT_API_FUNC_ATTRIBUTE sljit_s32 sljit_emit_op_flags(struct sljit_compiler *co
ADJUST_LOCAL_OFFSET(dst, dstw);
CHECK_EXTRA_REGS(dst, dstw, (void)0);
type &= 0xff;
/* setcc = jcc + 0x10. */
cond_set = U8(get_jump_code((sljit_uw)type) + 0x10);
@ -2813,10 +2844,7 @@ SLJIT_API_FUNC_ATTRIBUTE sljit_s32 sljit_emit_op_flags(struct sljit_compiler *co
return emit_mov(compiler, dst, dstw, TMP_REG1, 0);
}
#if (defined SLJIT_VERBOSE && SLJIT_VERBOSE) \
|| (defined SLJIT_ARGUMENT_CHECKS && SLJIT_ARGUMENT_CHECKS)
compiler->skip_checks = 1;
#endif
SLJIT_SKIP_CHECKS(compiler);
return sljit_emit_op2(compiler, op, dst_save, dstw_save, dst_save, dstw_save, TMP_REG1, 0);
#else
@ -2927,10 +2955,7 @@ SLJIT_API_FUNC_ATTRIBUTE sljit_s32 sljit_emit_op_flags(struct sljit_compiler *co
if (GET_OPCODE(op) < SLJIT_ADD)
return emit_mov(compiler, dst, dstw, TMP_REG1, 0);
#if (defined SLJIT_VERBOSE && SLJIT_VERBOSE) \
|| (defined SLJIT_ARGUMENT_CHECKS && SLJIT_ARGUMENT_CHECKS)
compiler->skip_checks = 1;
#endif
SLJIT_SKIP_CHECKS(compiler);
return sljit_emit_op2(compiler, op, dst_save, dstw_save, dst_save, dstw_save, TMP_REG1, 0);
#endif /* SLJIT_CONFIG_X86_64 */
}
@ -2971,7 +2996,7 @@ SLJIT_API_FUNC_ATTRIBUTE sljit_s32 sljit_emit_cmov(struct sljit_compiler *compil
inst = emit_x86_instruction(compiler, 2, dst_reg, 0, src, srcw);
FAIL_IF(!inst);
*inst++ = GROUP_0F;
*inst = U8(get_jump_code(type & 0xff) - 0x40);
*inst = U8(get_jump_code((sljit_uw)type) - 0x40);
return SLJIT_SUCCESS;
}
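
Both `sljit_emit_op_flags` ("setcc = jcc + 0x10") and `sljit_emit_cmov` (`get_jump_code(...) - 0x40`) above derive their opcode from the byte that `get_jump_code` returns. This works because the two-byte x86 opcodes share the condition in their low four bits: `0F 8x` is the near conditional jump, `0F 9x` is `setcc` and `0F 4x` is `cmovcc`. A small sketch of that arithmetic, with hypothetical helper names:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: second opcode byte of setcc for a given jcc byte. */
static uint8_t setcc_from_jcc(uint8_t jcc)
{
    assert(jcc >= 0x80 && jcc <= 0x8f);
    return (uint8_t)(jcc + 0x10);
}

/* Hypothetical helper: second opcode byte of cmovcc for a given jcc byte. */
static uint8_t cmovcc_from_jcc(uint8_t jcc)
{
    assert(jcc >= 0x80 && jcc <= 0x8f);
    return (uint8_t)(jcc - 0x40);
}
```

For `je` (`0x84`) this gives `0x94` (`sete`) and `0x44` (`cmove`). In both cases the resulting byte follows the `0F` prefix, which the diff writes as `GROUP_0F`.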


@ -59,38 +59,15 @@
#include <sys/mman.h>
#ifdef __NetBSD__
#if defined(PROT_MPROTECT)
#define check_se_protected(ptr, size) (0)
#define SLJIT_PROT_WX PROT_MPROTECT(PROT_EXEC)
#else /* !PROT_MPROTECT */
#ifdef _NETBSD_SOURCE
#include <sys/param.h>
#else /* !_NETBSD_SOURCE */
typedef unsigned int u_int;
#define devmajor_t sljit_s32
#endif /* _NETBSD_SOURCE */
#include <sys/sysctl.h>
#include <unistd.h>
#define check_se_protected(ptr, size) netbsd_se_protected()
static SLJIT_INLINE int netbsd_se_protected(void)
{
int mib[3];
int paxflags;
size_t len = sizeof(paxflags);
mib[0] = CTL_PROC;
mib[1] = getpid();
mib[2] = PROC_PID_PAXFLAGS;
if (SLJIT_UNLIKELY(sysctl(mib, 3, &paxflags, &len, NULL, 0) < 0))
return -1;
return (paxflags & CTL_PROC_PAXFLAGS_MPROTECT) ? -1 : 0;
}
#endif /* PROT_MPROTECT */
#define check_se_protected(ptr, size) (0)
#else /* POSIX */
#if !(defined SLJIT_SINGLE_THREADED && SLJIT_SINGLE_THREADED)
#include <pthread.h>
#define SLJIT_SE_LOCK() pthread_mutex_lock(&se_lock)
#define SLJIT_SE_UNLOCK() pthread_mutex_unlock(&se_lock)
#endif /* !SLJIT_SINGLE_THREADED */
#define check_se_protected(ptr, size) generic_se_protected(ptr, size)
static SLJIT_INLINE int generic_se_protected(void *ptr, sljit_uw size)
@ -102,22 +79,20 @@ static SLJIT_INLINE int generic_se_protected(void *ptr, sljit_uw size)
}
#endif /* NetBSD */
#if defined SLJIT_SINGLE_THREADED && SLJIT_SINGLE_THREADED
#ifndef SLJIT_SE_LOCK
#define SLJIT_SE_LOCK()
#endif
#ifndef SLJIT_SE_UNLOCK
#define SLJIT_SE_UNLOCK()
#else /* !SLJIT_SINGLE_THREADED */
#include <pthread.h>
#define SLJIT_SE_LOCK() pthread_mutex_lock(&se_lock)
#define SLJIT_SE_UNLOCK() pthread_mutex_unlock(&se_lock)
#endif /* SLJIT_SINGLE_THREADED */
#endif
#ifndef SLJIT_PROT_WX
#define SLJIT_PROT_WX 0
#endif /* !SLJIT_PROT_WX */
#endif
SLJIT_API_FUNC_ATTRIBUTE void* sljit_malloc_exec(sljit_uw size)
{
#if !(defined SLJIT_SINGLE_THREADED && SLJIT_SINGLE_THREADED)
#if !(defined SLJIT_SINGLE_THREADED && SLJIT_SINGLE_THREADED) \
&& !defined(__NetBSD__)
static pthread_mutex_t se_lock = PTHREAD_MUTEX_INITIALIZER;
#endif
static int se_protected = !SLJIT_PROT_WX;

19
testdata/grepoutput vendored

@ -991,3 +991,22 @@ RC=0
---------------------------- Test 134 -----------------------------
=AB3CD5=
RC=0
---------------------------- Test 135 -----------------------------
./testdata/grepinputv@The word is cat in this line
RC=0
./testdata/grepinputv@./testdata/grepinputv@RC=0
./testdata/grepinputv@This line contains \E and (regex) *meta* [characters].
./testdata/grepinputv@The word is cat in this line
./testdata/grepinputv@The caterpillar sat on the mat
RC=0
testdata/grepinputM3:start end in between start
end and following
testdata/grepinputM7:start end in between start
end and following start
end other stuff
testdata/grepinputM11:start end in between start
end
testdata/grepinputM16:start end in between start
end
RC=0

20
testdata/grepoutputC vendored

@ -1,26 +1,26 @@
Arg1: [T] [he ] [ ] Arg2: |T| () () (0)
Arg1: [T] [his] [s] Arg2: |T| () () (0)
Arg1: [T] [his] [s] Arg2: |T| () () (0)
Arg1: [T] [he ] [ ] Arg2: |T| () () (0)
Arg1: [T] [he ] [ ] Arg2: |T| () () (0)
Arg1: [T] [he ] [ ] Arg2: |T| () () (0)
The quick brown
Arg1: [T] [his] [s] Arg2: |T| () () (0)
This time it jumps and jumps and jumps.
Arg1: [T] [his] [s] Arg2: |T| () () (0)
This line contains \E and (regex) *meta* [characters].
Arg1: [T] [he ] [ ] Arg2: |T| () () (0)
The word is cat in this line
Arg1: [T] [he ] [ ] Arg2: |T| () () (0)
The caterpillar sat on the mat
Arg1: [T] [he ] [ ] Arg2: |T| () () (0)
The snowcat is not an animal
Arg1: [qu] [qu]
Arg1: [ t] [ t]
Arg1: [ l] [ l]
Arg1: [wo] [wo]
Arg1: [ca] [ca]
Arg1: [sn] [sn]
The quick brown
Arg1: [ t] [ t]
This time it jumps and jumps and jumps.
Arg1: [ l] [ l]
This line contains \E and (regex) *meta* [characters].
Arg1: [wo] [wo]
The word is cat in this line
Arg1: [ca] [ca]
The caterpillar sat on the mat
Arg1: [sn] [sn]
The snowcat is not an animal
0:T
The quick brown

35
testdata/testinput15 vendored

@ -6,38 +6,44 @@
# (2) Other tests that must not be run with JIT.
# This test is first so that it doesn't inherit a large enough heap frame
# vector from a previous test.
/(*LIMIT_HEAP=21)\[(a)]{60}/expand
\[a]{60}
/(a+)*zz/I
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaazzbbbbbb\=find_limits
aaaaaaaaaaaaaz\=find_limits
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaazzbbbbbb\=find_limits_noheap
aaaaaaaaaaaaaz\=find_limits_noheap
!((?:\s|//.*\\n|/[*](?:\\n|.)*?[*]/)*)!I
/* this is a C style comment */\=find_limits
/* this is a C style comment */\=find_limits_noheap
/^(?>a)++/
aa\=find_limits
aaaaaaaaa\=find_limits
aa\=find_limits_noheap
aaaaaaaaa\=find_limits_noheap
/(a)(?1)++/
aa\=find_limits
aaaaaaaaa\=find_limits
aa\=find_limits_noheap
aaaaaaaaa\=find_limits_noheap
/a(?:.)*?a/ims
abbbbbbbbbbbbbbbbbbbbba\=find_limits
abbbbbbbbbbbbbbbbbbbbba\=find_limits_noheap
/a(?:.(*THEN))*?a/ims
abbbbbbbbbbbbbbbbbbbbba\=find_limits
abbbbbbbbbbbbbbbbbbbbba\=find_limits_noheap
/a(?:.(*THEN:ABC))*?a/ims
abbbbbbbbbbbbbbbbbbbbba\=find_limits
abbbbbbbbbbbbbbbbbbbbba\=find_limits_noheap
/^(?>a+)(?>b+)(?>c+)(?>d+)(?>e+)/
aabbccddee\=find_limits
aabbccddee\=find_limits_noheap
/^(?>(a+))(?>(b+))(?>(c+))(?>(d+))(?>(e+))/
aabbccddee\=find_limits
aabbccddee\=find_limits_noheap
/^(?>(a+))(?>b+)(?>(c+))(?>d+)(?>(e+))/
aabbccddee\=find_limits
aabbccddee\=find_limits_noheap
/(*LIMIT_MATCH=12bc)abc/
@ -228,9 +234,6 @@
/(|]+){2,2452}/
(|]+){2,2452}
/(*LIMIT_HEAP=21)\[(a)]{60}/expand
\[a]{60}
/b(?<!ax)(?!cx)/allusedtext
abc
abcz

13
testdata/testinputheap vendored Normal file

@ -0,0 +1,13 @@
#pattern framesize, memory
/abcd/
abcd\=memory
abcd\=find_limits
/(((((((((((((((((((((((((((((( (^abc|xyz){1,20}$ ))))))))))))))))))))))))))))))/x
abcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcX\=memory
abcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcX\=find_limits
/ab(cd)/
abcd\=memory
abcd\=memory,ovector=0

50
testdata/testoutput15 vendored

@ -6,19 +6,24 @@
# (2) Other tests that must not be run with JIT.
# This test is first so that it doesn't inherit a large enough heap frame
# vector from a previous test.
/(*LIMIT_HEAP=21)\[(a)]{60}/expand
\[a]{60}
Failed: error -63: heap limit exceeded
/(a+)*zz/I
Capture group count = 1
Starting code units: a z
Last code unit = 'z'
Subject length lower bound = 2
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaazzbbbbbb\=find_limits
Minimum heap limit = 0
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaazzbbbbbb\=find_limits_noheap
Minimum match limit = 7
Minimum depth limit = 7
0: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaazz
1: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaz\=find_limits
Minimum heap limit = 0
aaaaaaaaaaaaaz\=find_limits_noheap
Minimum match limit = 20481
Minimum depth limit = 30
No match
@ -27,70 +32,60 @@ No match
Capture group count = 1
May match empty string
Subject length lower bound = 0
/* this is a C style comment */\=find_limits
Minimum heap limit = 0
/* this is a C style comment */\=find_limits_noheap
Minimum match limit = 64
Minimum depth limit = 7
0: /* this is a C style comment */
1: /* this is a C style comment */
/^(?>a)++/
aa\=find_limits
Minimum heap limit = 0
aa\=find_limits_noheap
Minimum match limit = 5
Minimum depth limit = 3
0: aa
aaaaaaaaa\=find_limits
Minimum heap limit = 0
aaaaaaaaa\=find_limits_noheap
Minimum match limit = 12
Minimum depth limit = 3
0: aaaaaaaaa
/(a)(?1)++/
aa\=find_limits
Minimum heap limit = 0
aa\=find_limits_noheap
Minimum match limit = 7
Minimum depth limit = 5
0: aa
1: a
aaaaaaaaa\=find_limits
Minimum heap limit = 0
aaaaaaaaa\=find_limits_noheap
Minimum match limit = 21
Minimum depth limit = 5
0: aaaaaaaaa
1: a
/a(?:.)*?a/ims
abbbbbbbbbbbbbbbbbbbbba\=find_limits
Minimum heap limit = 0
abbbbbbbbbbbbbbbbbbbbba\=find_limits_noheap
Minimum match limit = 24
Minimum depth limit = 3
0: abbbbbbbbbbbbbbbbbbbbba
/a(?:.(*THEN))*?a/ims
abbbbbbbbbbbbbbbbbbbbba\=find_limits
Minimum heap limit = 0
abbbbbbbbbbbbbbbbbbbbba\=find_limits_noheap
Minimum match limit = 66
Minimum depth limit = 45
0: abbbbbbbbbbbbbbbbbbbbba
/a(?:.(*THEN:ABC))*?a/ims
abbbbbbbbbbbbbbbbbbbbba\=find_limits
Minimum heap limit = 0
abbbbbbbbbbbbbbbbbbbbba\=find_limits_noheap
Minimum match limit = 66
Minimum depth limit = 45
0: abbbbbbbbbbbbbbbbbbbbba
/^(?>a+)(?>b+)(?>c+)(?>d+)(?>e+)/
aabbccddee\=find_limits
Minimum heap limit = 0
aabbccddee\=find_limits_noheap
Minimum match limit = 7
Minimum depth limit = 7
0: aabbccddee
/^(?>(a+))(?>(b+))(?>(c+))(?>(d+))(?>(e+))/
aabbccddee\=find_limits
Minimum heap limit = 0
aabbccddee\=find_limits_noheap
Minimum match limit = 12
Minimum depth limit = 12
0: aabbccddee
@ -101,8 +96,7 @@ Minimum depth limit = 12
5: ee
/^(?>(a+))(?>b+)(?>(c+))(?>d+)(?>(e+))/
aabbccddee\=find_limits
Minimum heap limit = 0
aabbccddee\=find_limits_noheap
Minimum match limit = 10
Minimum depth limit = 10
0: aabbccddee
@ -521,10 +515,6 @@ No match
0:
1:
/(*LIMIT_HEAP=21)\[(a)]{60}/expand
\[a]{60}
Failed: error -63: heap limit exceeded
/b(?<!ax)(?!cx)/allusedtext
abc
0: abc

40
testdata/testoutputheap-16 vendored Normal file

@ -0,0 +1,40 @@
#pattern framesize, memory
/abcd/
Memory allocation (code space): 26
Frame size for pcre2_match(): 128
abcd\=memory
malloc 20480
0: abcd
abcd\=find_limits
Minimum heap limit = 1
Minimum match limit = 2
Minimum depth limit = 2
0: abcd
/(((((((((((((((((((((((((((((( (^abc|xyz){1,20}$ ))))))))))))))))))))))))))))))/x
Memory allocation (code space): 1294
Frame size for pcre2_match(): 624
abcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcX\=memory
malloc 40960
free unremembered block
No match
abcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcX\=find_limits
Minimum heap limit = 22
Minimum match limit = 37
Minimum depth limit = 35
No match
/ab(cd)/
Memory allocation (code space): 36
Frame size for pcre2_match(): 144
abcd\=memory
0: abcd
1: cd
abcd\=memory,ovector=0
free 40960
free unremembered block
malloc 128
malloc 20480
0: abcd
1: cd

40
testdata/testoutputheap-32 vendored Normal file

@ -0,0 +1,40 @@
#pattern framesize, memory
/abcd/
Memory allocation (code space): 52
Frame size for pcre2_match(): 128
abcd\=memory
malloc 20480
0: abcd
abcd\=find_limits
Minimum heap limit = 1
Minimum match limit = 2
Minimum depth limit = 2
0: abcd
/(((((((((((((((((((((((((((((( (^abc|xyz){1,20}$ ))))))))))))))))))))))))))))))/x
Memory allocation (code space): 2588
Frame size for pcre2_match(): 624
abcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcX\=memory
malloc 40960
free unremembered block
No match
abcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcX\=find_limits
Minimum heap limit = 22
Minimum match limit = 37
Minimum depth limit = 35
No match
/ab(cd)/
Memory allocation (code space): 72
Frame size for pcre2_match(): 144
abcd\=memory
0: abcd
1: cd
abcd\=memory,ovector=0
free 40960
free unremembered block
malloc 128
malloc 20480
0: abcd
1: cd

40
testdata/testoutputheap-8 vendored Normal file

@ -0,0 +1,40 @@
#pattern framesize, memory
/abcd/
Memory allocation (code space): 15
Frame size for pcre2_match(): 128
abcd\=memory
malloc 20480
0: abcd
abcd\=find_limits
Minimum heap limit = 1
Minimum match limit = 2
Minimum depth limit = 2
0: abcd
/(((((((((((((((((((((((((((((( (^abc|xyz){1,20}$ ))))))))))))))))))))))))))))))/x
Memory allocation (code space): 855
Frame size for pcre2_match(): 624
abcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcX\=memory
malloc 40960
free unremembered block
No match
abcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcX\=find_limits
Minimum heap limit = 22
Minimum match limit = 37
Minimum depth limit = 35
No match
/ab(cd)/
Memory allocation (code space): 23
Frame size for pcre2_match(): 144
abcd\=memory
0: abcd
1: cd
abcd\=memory,ovector=0
free 40960
free unremembered block
malloc 128
malloc 20480
0: abcd
1: cd