Building PCRE2 without using autotools
--------------------------------------

This document contains the following sections:

  General
  Generic instructions for the PCRE2 C library
  Stack size in Windows environments
  Linking programs in Windows environments
  Calling conventions in Windows environments
  Comments about Win32 builds
  Building PCRE2 on Windows with CMake
  Building PCRE2 on Windows with Visual Studio
  Testing with RunTest.bat
  Building PCRE2 on native z/OS and z/VM


GENERAL

The basic PCRE2 library consists entirely of code written in Standard C, and so
should compile successfully on any system that has a Standard C compiler and
library.

The PCRE2 distribution includes a "configure" file for use by the
configure/make (autotools) build system, as found in many Unix-like
environments. The README file contains information about the options for
"configure".

There is also support for CMake, which some users prefer, especially in Windows
environments, though it can also be run in Unix-like environments. See the
section entitled "Building PCRE2 on Windows with CMake" below.

Versions of src/config.h and src/pcre2.h are distributed in the PCRE2 tarballs
under the names src/config.h.generic and src/pcre2.h.generic. These are
provided for those who build PCRE2 without using "configure" or CMake. If you
use "configure" or CMake, the .generic versions are not used.


GENERIC INSTRUCTIONS FOR THE PCRE2 C LIBRARY

The following are generic instructions for building the PCRE2 C library "by
hand". If you are going to use CMake, this section does not apply to you; you
can skip ahead to the CMake section. Note that the settings concerned with
8-bit, 16-bit, and 32-bit code units relate to the type of data string that
PCRE2 processes. They are NOT referring to the underlying operating system bit
width. You do not have to do anything special to compile in a 64-bit
environment, for example.

 (1) Copy or rename the file src/config.h.generic as src/config.h, and edit the
     macro settings that it contains to whatever is appropriate for your
     environment. In particular, you can alter the definition of the NEWLINE
     macro to specify what character(s) you want to be interpreted as line
     terminators by default.

     When you subsequently compile any of the PCRE2 modules, you must specify
     -DHAVE_CONFIG_H to your compiler so that src/config.h is included in the
     sources.

     An alternative approach is not to edit src/config.h, but to use -D on the
     compiler command line to make any changes that you need to the
     configuration options. In this case -DHAVE_CONFIG_H must not be set.

     NOTE: There have been occasions when the way in which certain parameters
     in src/config.h are used has changed between releases. (In the
     configure/make world, this is handled automatically.) When upgrading to a
     new release, you are strongly advised to review src/config.h.generic
     before re-using what you had previously.

     Note also that the src/config.h.generic file is created from a config.h
     that was generated by Autotools, which automatically includes settings of
     a number of macros that are not actually used by PCRE2 (for example,
     HAVE_MEMORY_H).

 (2) Copy or rename the file src/pcre2.h.generic as src/pcre2.h.

 (3) EITHER:
       Copy or rename file src/pcre2_chartables.c.dist as
       src/pcre2_chartables.c.

     OR:
       Compile src/pcre2_dftables.c as a stand-alone program (using
       -DHAVE_CONFIG_H if you have set up src/config.h), and then run it with
       the single argument "src/pcre2_chartables.c". This generates a set of
       standard character tables and writes them to that file. The tables are
       generated using the default C locale for your system. If you want to use
       a locale that is specified by LC_xxx environment variables, add the -L
       option to the pcre2_dftables command. You must use this method if you
       are building on a system that uses EBCDIC code.

     The tables in src/pcre2_chartables.c are defaults. The caller of PCRE2 can
     specify alternative tables at run time.

 (4) For a library that supports 8-bit code units in the character strings that
     it processes, compile the following source files from the src directory,
     setting -DPCRE2_CODE_UNIT_WIDTH=8 as a compiler option. Also set
     -DHAVE_CONFIG_H if you have set up src/config.h with your configuration,
     or else use other -D settings to change the configuration as required.

       pcre2_auto_possess.c
       pcre2_chartables.c
       pcre2_compile.c
       pcre2_config.c
       pcre2_context.c
       pcre2_convert.c
       pcre2_dfa_match.c
       pcre2_error.c
       pcre2_extuni.c
       pcre2_find_bracket.c
       pcre2_jit_compile.c
       pcre2_maketables.c
       pcre2_match.c
       pcre2_match_data.c
       pcre2_newline.c
       pcre2_ord2utf.c
       pcre2_pattern_info.c
       pcre2_script_run.c
       pcre2_serialize.c
       pcre2_string_utils.c
       pcre2_study.c
       pcre2_substitute.c
       pcre2_substring.c
       pcre2_tables.c
       pcre2_ucd.c
       pcre2_valid_utf.c
       pcre2_xclass.c

     Make sure that you include -I. in the compiler command (or equivalent for
     an unusual compiler) so that all included PCRE2 header files are first
     sought in the src directory under the current directory. Otherwise you run
     the risk of picking up a previously-installed file from somewhere else.

     Note that you must compile pcre2_jit_compile.c, even if you have not
     defined SUPPORT_JIT in src/config.h, because when JIT support is not
     configured, dummy functions are compiled. When JIT support IS configured,
     pcre2_jit_compile.c #includes other files from the sljit subdirectory,
     all of whose names begin with "sljit". It also #includes
     src/pcre2_jit_match.c and src/pcre2_jit_misc.c, so you should not compile
     these yourself.

     Note also that the pcre2_fuzzsupport.c file contains special code that is
     useful to those who want to run fuzzing tests on the PCRE2 library. Unless
     you are doing that, you can ignore it.

 (5) Now link all the compiled code into an object library in whichever form
     your system keeps such libraries. This is the basic PCRE2 C 8-bit library.
     If your system has static and shared libraries, you may have to do this
     once for each type.

 (6) If you want to build a library that supports 16-bit or 32-bit code units
     (as well as, or instead of, the 8-bit library), just supply 16 or 32 as
     the value of -DPCRE2_CODE_UNIT_WIDTH when you are compiling.

 (7) If you want to build the POSIX wrapper functions (which apply only to the
     8-bit library), ensure that you have the src/pcre2posix.h file and then
     compile src/pcre2posix.c. Link the result (on its own) as the pcre2posix
     library.

 (8) The pcre2test program can be linked with any combination of the 8-bit,
     16-bit and 32-bit libraries (depending on what you selected in
     src/config.h). Compile src/pcre2test.c; don't forget -DHAVE_CONFIG_H if
     necessary, but do NOT define PCRE2_CODE_UNIT_WIDTH. Then link with the
     appropriate library/ies. If you compiled an 8-bit library, pcre2test also
     needs the pcre2posix wrapper library.

 (9) Run pcre2test on the testinput files in the testdata directory, and check
     that the output matches the corresponding testoutput files. There are
     comments about what each test does in the section entitled "Testing PCRE2"
     in the README file. If you compiled more than one of the 8-bit, 16-bit and
     32-bit libraries, you need to run pcre2test with the -16 option to do
     16-bit tests and with the -32 option to do 32-bit tests.

     Some tests are relevant only when certain build-time options are selected.
     For example, test 4 is for Unicode support, and will not run if you have
     built PCRE2 without it. See the comments at the start of each testinput
     file. If you have a suitable Unix-like shell, the RunTest script will run
     the appropriate tests for you. The command "RunTest list" will output a
     list of all the tests.

     Note that the supplied files are in Unix format, with just LF characters
     as line terminators. You may need to edit them to change this if your
     system uses a different convention.

(10) If you have built PCRE2 with SUPPORT_JIT, the JIT features can be tested
     by running pcre2test with the -jit option. This is done automatically by
     the RunTest script. You might also like to build and run the freestanding
     JIT test program, src/pcre2_jit_test.c.

(11) If you want to use the pcre2grep command, compile and link
     src/pcre2grep.c; it uses only the basic 8-bit PCRE2 library (it does not
     need the pcre2posix library). If you have built the PCRE2 library with JIT
     support by defining SUPPORT_JIT in src/config.h, you can also define
     SUPPORT_PCRE2GREP_JIT, which causes pcre2grep to make use of JIT (unless
     it is run with --no-jit). If you define SUPPORT_PCRE2GREP_JIT without
     defining SUPPORT_JIT, pcre2grep does not try to make use of JIT.
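
As an illustration only, steps (1) to (5) might be scripted along the following
lines for a Unix-like shell. The compiler name (cc), the archiver (ar), the
object and library file names, the run and build_pcre2 helpers, and the use of
-Isrc from the top-level directory are assumptions made for this sketch; they
are not part of the PCRE2 distribution. By default the commands are only
printed; set DRY_RUN=0 to execute them.

```shell
#!/bin/sh
# Sketch of steps (1) to (5), to be run from the top-level PCRE2 directory.
# Assumed for illustration: "cc", "ar", the .o and .a file names, and -Isrc.

# Print the command, or execute it when DRY_RUN=0.
run() {
    if [ "${DRY_RUN:-1}" = 0 ]; then
        "$@"
    else
        printf '%s\n' "$*"
    fi
}

# The source files listed in step (4), without the .c suffix.
PCRE2_SOURCES="pcre2_auto_possess pcre2_chartables pcre2_compile
pcre2_config pcre2_context pcre2_convert pcre2_dfa_match pcre2_error
pcre2_extuni pcre2_find_bracket pcre2_jit_compile pcre2_maketables
pcre2_match pcre2_match_data pcre2_newline pcre2_ord2utf
pcre2_pattern_info pcre2_script_run pcre2_serialize pcre2_string_utils
pcre2_study pcre2_substitute pcre2_substring pcre2_tables pcre2_ucd
pcre2_valid_utf pcre2_xclass"

build_pcre2() {
    width=${1:-8}
    objects=""

    # Steps (1) to (3): install the generic headers and the default tables,
    # leaving any file that already exists (for example an edited config.h).
    [ -f src/config.h ] || run cp src/config.h.generic src/config.h
    [ -f src/pcre2.h ] || run cp src/pcre2.h.generic src/pcre2.h
    [ -f src/pcre2_chartables.c ] ||
        run cp src/pcre2_chartables.c.dist src/pcre2_chartables.c

    # Step (4): compile every source file for the chosen code unit width.
    for name in $PCRE2_SOURCES; do
        run cc -c -DHAVE_CONFIG_H -DPCRE2_CODE_UNIT_WIDTH="$width" \
            -Isrc "src/$name.c" -o "$name-$width.o" || return 1
        objects="$objects $name-$width.o"
    done

    # Step (5): collect the objects into a static library.
    # $objects is deliberately unquoted so that it splits into file names.
    run ar rcs "libpcre2-$width.a" $objects
}

build_pcre2 8
```

Calling build_pcre2 16 or build_pcre2 32 in the same way corresponds to step
(6). Because an existing src/config.h is left alone, you can copy and edit it
by hand before running the sketch.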


STACK SIZE IN WINDOWS ENVIRONMENTS

Prior to release 10.30 the default system stack size of 1MiB in some Windows
environments caused issues with some tests. This should no longer be the case
for 10.30 and later releases.


LINKING PROGRAMS IN WINDOWS ENVIRONMENTS

If you want to statically link a program against a PCRE2 library in the form of
a non-dll .a file, you must define PCRE2_STATIC before including src/pcre2.h.
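
For example, the macro can be defined at the top of the calling source file.
The sketch below writes such a file from a shell; the names write_demo, demo.c,
gcc and libpcre2-8.a are hypothetical and serve only to illustrate the ordering
of the definitions.

```shell
#!/bin/sh
# Sketch: write a small C program in which PCRE2_STATIC is defined before
# pcre2.h is included. The function name is an assumption for illustration.
write_demo() {
    cat > "$1" <<'EOF'
#define PCRE2_STATIC            /* must come before pcre2.h */
#define PCRE2_CODE_UNIT_WIDTH 8
#include <stdio.h>
#include "pcre2.h"

int main(void)
{
    char version[64];
    /* Ask the library for its version string. */
    pcre2_config(PCRE2_CONFIG_VERSION, version);
    printf("PCRE2 %s\n", version);
    return 0;
}
EOF
}

# Usage (paths and names are assumptions):
#   write_demo demo.c
#   gcc -Isrc demo.c libpcre2-8.a -o demo.exe
```

Passing -DPCRE2_STATIC on the compiler command line instead of using the
#define in the source has the same effect, because macros given on the command
line are defined before any header is read.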


CALLING CONVENTIONS IN WINDOWS ENVIRONMENTS

It is possible to compile programs to use different calling conventions using
MSVC. Search the web for "calling conventions" for more information. To make it
easier to change the calling convention for the exported functions in the
PCRE2 library, the macro PCRE2_CALL_CONVENTION is present in all the external
definitions. It can be set externally when compiling (e.g. in CFLAGS). If it is
not set, it defaults to empty; the default calling convention is then used
(which is what is wanted most of the time).
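
As a minimal sketch, the macro could be added to CFLAGS as follows. The value
__stdcall is used purely as an example, and the helper name
add_call_convention is hypothetical.

```shell
#!/bin/sh
# Sketch: append a PCRE2_CALL_CONVENTION definition to CFLAGS, keeping any
# flags that are already set. The helper name is an assumption.
add_call_convention() {
    CFLAGS="${CFLAGS:+$CFLAGS }-DPCRE2_CALL_CONVENTION=$1"
    export CFLAGS
}

add_call_convention __stdcall
echo "CFLAGS=$CFLAGS"
```

Since the macro appears in the external definitions, programs that call the
library should presumably be compiled with the same setting, so that callers
and the library agree on the convention.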


COMMENTS ABOUT WIN32 BUILDS (see also "BUILDING PCRE2 ON WINDOWS WITH CMAKE")

There are two ways of building PCRE2 using the "configure, make, make install"
paradigm on Windows systems: using MinGW or using Cygwin. These are entirely
different environments. There is also support for building using CMake, which
some users find a more straightforward way of building PCRE2 under Windows.

The MinGW home page (http://www.mingw.org/) says this:

  MinGW: A collection of freely available and freely distributable Windows
  specific header files and import libraries combined with GNU toolsets that
  allow one to produce native Windows programs that do not rely on any
  3rd-party C runtime DLLs.

The Cygwin home page (http://www.cygwin.com/) says this:

  Cygwin is a Linux-like environment for Windows. It consists of two parts:

  . A DLL (cygwin1.dll) which acts as a Linux API emulation layer providing
    substantial Linux API functionality

  . A collection of tools which provide Linux look and feel.

On both MinGW and Cygwin, PCRE2 should build correctly using:

  ./configure && make && make install

This should create two libraries called libpcre2-8 and libpcre2-posix. These
are independent libraries: when you link with libpcre2-posix you must also link
with libpcre2-8, which contains the basic functions.

Using Cygwin's compiler generates libraries and executables that depend on
cygwin1.dll. If a library that is generated this way is distributed,
cygwin1.dll has to be distributed as well. Since cygwin1.dll is under the GPL
licence, this forces not only PCRE2 to be under the GPL, but also the entire
application. A distributor who wants to keep their own code proprietary must
purchase an appropriate Cygwin licence.

MinGW has no such restrictions. The MinGW compiler generates a library or
executable that can run standalone on Windows without any third party dll or
licensing issues.

There is a further complication:

If a Cygwin user uses the -mno-cygwin Cygwin gcc flag, what that really does is
to tell Cygwin's gcc to use the MinGW gcc. Cygwin's gcc is only acting as a
front end to MinGW's gcc (if you install Cygwin's gcc, you get both Cygwin's
gcc and MinGW's gcc). So, a user can:

. Build native binaries by using MinGW or by getting Cygwin and using
  -mno-cygwin.

. Build binaries that depend on cygwin1.dll by using Cygwin with the normal
  compiler flags.

The test files that are supplied with PCRE2 are in UNIX format, with LF
characters as line terminators. Unless your PCRE2 library uses a default
newline option that includes LF as a valid newline, it may be necessary to
change the line terminators in the test files to get some of the tests to work.
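
If a conversion is needed, one possible approach from a Unix-like shell is
sketched below. It assumes that the target convention is CRLF and that the
input file contains only LF terminators; the helper name lf_to_crlf is
hypothetical. Some awk implementations may not handle unusual bytes (such as
NUL) in the test data, so check the result before relying on it.

```shell
#!/bin/sh
# Sketch: copy a file, replacing each LF line terminator with CRLF.
# Assumes the input uses LF only; a CRLF input would end up with CR CR LF.
lf_to_crlf() {
    # LC_ALL=C makes awk treat the data as bytes rather than as
    # locale-dependent multibyte characters.
    LC_ALL=C awk '{ printf "%s\r\n", $0 }' "$1" > "$2"
}

# Usage (file names are assumptions):
#   lf_to_crlf testdata/testinput1 testinput1.crlf
```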


BUILDING PCRE2 ON WINDOWS WITH CMAKE

CMake is an alternative configuration facility that can be used instead of
"configure". CMake creates project files (make files, solution files, etc.)
tailored to numerous development environments, including Visual Studio,
Borland, Msys, MinGW, NMake, and Unix. If possible, use short paths with no
spaces in the names for your CMake installation and your PCRE2 source and build
directories.

The following instructions were contributed by a PCRE1 user, but they should
also work for PCRE2. If they are not followed exactly, errors may occur. In the
event that errors do occur, it is recommended that you delete the CMake cache
before attempting to repeat the CMake build process. In the CMake GUI, the
cache can be deleted by selecting "File > Delete Cache".

1.  Install the latest CMake version available from http://www.cmake.org/, and
    ensure that cmake\bin is on your path.

2.  Unzip (retaining folder structure) the PCRE2 source tree into a source
    directory such as C:\pcre2. You should ensure your local date and time
    is not earlier than the file dates in your source dir if the release is
    very new.

3.  Create a new, empty build directory, preferably a subdirectory of the
    source dir. For example, C:\pcre2\pcre2-xx\build.

4.  Run cmake-gui from the shell environment of your build tool, for example,
    Msys for Msys/MinGW or Visual Studio Command Prompt for VC/VC++. Do not try
    to start CMake from the Windows Start menu, as this can lead to errors.

5.  Enter C:\pcre2\pcre2-xx and C:\pcre2\pcre2-xx\build for the source and
    build directories, respectively.

6.  Hit the "Configure" button.

7.  Select the particular IDE / build tool that you are using (Visual
    Studio, MSYS makefiles, MinGW makefiles, etc.)

8.  The GUI will then list several configuration options. This is where
    you can disable Unicode support or select other PCRE2 optional features.

9.  Hit "Configure" again. The adjacent "Generate" button should now be
    active.

10. Hit "Generate".

11. The build directory should now contain a usable build system, be it a
    solution file for Visual Studio, makefiles for MinGW, etc. Exit from
    cmake-gui and use the generated build system with your compiler or IDE.
    E.g., for MinGW you can run "make", or for Visual Studio, open the PCRE2
    solution, select the desired configuration (Debug, or Release, etc.) and
    build the ALL_BUILD project.

12. If during configuration with cmake-gui you've elected to build the test
    programs, you can execute them by building the test project. E.g., for
    MinGW: "make test"; for Visual Studio build the RUN_TESTS project. The
    most recent build configuration is targeted by the tests. A summary of
    test results is presented. Complete test output is subsequently
    available for review in Testing\Temporary under your build dir.
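
The same sequence can usually be driven from the command line instead of
cmake-gui. The following is a minimal sketch, assuming CMake 3.20 or later (for
the -S, -B and --test-dir options) and assuming that PCRE2_SUPPORT_UNICODE and
PCRE2_BUILD_TESTS are the names of the relevant options in your release; check
the option list shown by the GUI or in CMakeLists.txt. The run helper and the
function name cmake_build_pcre2 are hypothetical. By default the commands are
only printed; set DRY_RUN=0 to execute them.

```shell
#!/bin/sh
# Sketch of a command-line equivalent of steps 5 to 12.

# Print the command, or execute it when DRY_RUN=0. The printed form drops
# quoting, so it is meant for reading rather than for pasting back.
run() {
    if [ "${DRY_RUN:-1}" = 0 ]; then
        "$@"
    else
        printf '%s\n' "$*"
    fi
}

cmake_build_pcre2() {
    src=$1
    build=$2
    generator=$3

    # Steps 5 to 10: choose directories, generator and options, then
    # configure and generate in one invocation.
    run cmake -S "$src" -B "$build" -G "$generator" \
        -DPCRE2_SUPPORT_UNICODE=ON -DPCRE2_BUILD_TESTS=ON || return 1

    # Step 11: build with whatever build system was generated.
    run cmake --build "$build" --config Release || return 1

    # Step 12: run the tests for the same configuration.
    run ctest --test-dir "$build" -C Release
}

cmake_build_pcre2 C:/pcre2/pcre2-xx C:/pcre2/pcre2-xx/build "MinGW Makefiles"
```

As with the GUI, run this from the shell environment of your build tool, and
delete the build directory's CMake cache before retrying after an error.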


BUILDING PCRE2 ON WINDOWS WITH VISUAL STUDIO

The code currently cannot be compiled without a stdint.h header, which is
available only in relatively recent versions of Visual Studio. However, this
portable and permissively-licensed implementation of the header worked without
issue:

  http://www.azillionmonkeys.com/qed/pstdint.h

Rename it to stdint.h and drop it into the top level of the build tree.


TESTING WITH RUNTEST.BAT

If configured with CMake, building the test project ("make test" or building
RUN_TESTS in Visual Studio) creates (and runs) pcre2_test.bat (and, depending
on your configuration options, possibly other test programs) in the build
directory. The pcre2_test.bat script runs RunTest.bat with the correct source
and exe paths.

For manual testing with RunTest.bat, provided the build dir is a subdirectory
of the source directory: open a command shell window, chdir to the location
of your pcre2test.exe and pcre2grep.exe programs, and call RunTest.bat with
"..\RunTest.bat" or "..\..\RunTest.bat" as appropriate.

To run only a particular test with RunTest.bat, provide a test number
argument, for example "..\RunTest.bat 2".

Otherwise:

1. Copy RunTest.bat into the directory where pcre2test.exe and pcre2grep.exe
   have been created.

2. Edit RunTest.bat to identify the full or relative location of
   the pcre2 source (in which the testdata folder resides), e.g.:

   set srcdir=C:\pcre2\pcre2-10.00

3. In a Windows command environment, chdir to the location of your bat and
   exe programs.

4. Run RunTest.bat. Test outputs will automatically be compared to expected
   results, and discrepancies will be identified in the console output.

To independently test the just-in-time compiler, run pcre2_jit_test.exe.


BUILDING PCRE2 ON NATIVE Z/OS AND Z/VM

z/OS and z/VM are operating systems for mainframe computers, produced by IBM.
The character code used is EBCDIC, not ASCII or Unicode. In z/OS, UNIX APIs and
applications can be supported through UNIX System Services, and in such an
environment it should be possible to build PCRE2 in the same way as in other
systems, with the EBCDIC related configuration settings, but it is not known if
anybody has tried this.

In native z/OS (without UNIX System Services) and in z/VM, special ports are
required. For details, please see file 939 on this web site:

  http://www.cbttape.org

Everything in that location, source and executable, is in EBCDIC and native
z/OS file formats. The port provides an API for LE languages such as COBOL and
for the z/OS and z/VM versions of the Rexx language.

===========================
Last Updated: 28 April 2021
===========================