Commit Graph

202 Commits

Author SHA1 Message Date
Tim Rühsen 3ac807d987 Add --encoding to psl-make-dafsa man page 2016-11-05 10:37:01 +01:00
Tim Rühsen 4b42762cbf Skip punycode conversion for _psl_is_public_suffix() if data contains UTF-8 rules 2016-11-05 10:37:01 +01:00
Olle Liljenzin 86034ac7c9 Added function to the parser for reading DAFSA encoding mode. 2016-11-05 10:37:01 +01:00
Olle Liljenzin 8c2bcd5a24 Added version info into generated DAFSA.
psl-make-dafsa got a mode switch so that the old version can be
generated for testing.
2016-11-05 10:01:54 +01:00
Olle Liljenzin e03953e27a Updated DAFSA generator and parser to support UTF-8 encoding 2016-11-05 10:01:54 +01:00
Tim Rühsen 598a78b2de Add better test code coverage 2016-09-26 15:15:34 +02:00
Tim Rühsen 5ebc24f0e0 Code cleanup in libidn2 branch of _psl_idna_toASCII()
Reported-by: https://github.com/daurnimator
2016-09-26 10:13:43 +02:00
Tim Rühsen 7eb8592035 Let u8_tolower() allocate the result buffer.
Reported-by: https://github.com/daurnimator
2016-09-25 19:44:33 +02:00
Tim Rühsen 32543dd5a5 Avoid unneeded memory allocations in psl_str_to_utf8lower()
Reported-by: https://github.com/daurnimator
2016-09-25 12:49:56 +02:00
Tim Rühsen 1baaacccd5 Fix libidn/libidn2 code path of psl_str_to_utf8lower()
* fixing memory leaks
* proper handling of unterminated results of u8_tolower()
* second call to iconv() ensures flush of internal memory
* check more code paths of psl_str_to_utf8lower() via
  tests/test-registrable-domain.c
2016-09-23 12:35:08 +02:00
Tim Rühsen e2812e8c4c Check return value for strdup and strndup
Fixes #60
Reported-by: https://github.com/daurnimator
2016-09-22 15:53:31 +02:00
Tim Rühsen 351b3fb912 Remove redundant define of countof() 2016-09-22 11:37:23 +02:00
Tim Rühsen 9e1ca81be4 Remove memory allocations from _utf8_to_utf32()
Reported-by: https://github.com/daurnimator
2016-09-22 11:19:52 +02:00
Tim Rühsen 6cfb33e530 Amend API docs to be more precise about invalid input.
Fixes #59
Reported-by: https://github.com/daurnimator
2016-09-21 12:03:00 +02:00
Tim Rühsen 10f7b5fe7c Fallback to malloc from alloca for larger memory chunks
Fixes #58
Reported-by: https://github.com/daurnimator
2016-09-21 11:54:39 +02:00
Tim Rühsen 1ab7be5641 Check malloc/realloc results in src/psl.c
Fixes #57
Reported-by: https://github.com/daurnimator
2016-09-21 11:15:43 +02:00
Dagobert Michelsen 7983f86820 Use proper library path and libs for ICU 2016-09-17 14:46:06 +02:00
Tim Rühsen 126d2dca9c Package and install psl.1 and psl-make-dafsa.1
Fixes #53
Reported-by: https://github.com/yselkowitz
2016-09-17 14:46:00 +02:00
Jeremy Ehrhardt 003dec4203 Change src/psl-make-dafsa shebang so it'll run on OS X 2016-09-16 18:42:54 -07:00
Daniel Kahn Gillmor dc7bf5bbae rename src/make_dafsa.py to src/psl-make-dafsa, add documentation
I've talked to the good people on #debian-bootstrap who would be most
affected by the possible build-dep cycle, and i think the simplest
approach is actually to split out make_dafsa.py into its own
architecture-independent package.

I'm thinking i'll call the package psl-make-dafsa, and in the course of
shipping it, i'll place src/make_dafsa.py as /usr/bin/psl-make-dafsa.

This is because:

 * debian discourages scripts on the $PATH from having language-specific
   suffixes like .py:

    https://lintian.debian.org/tags/script-with-language-extension.html

 * "-" appears to be a more common delimiter in command names than "_":

    0 dkg@alice:~$ for x in - _; do printf "%s: %d " "$x" $(ls -1 ${PATH//:/ } | grep -c "$x"); done; echo
    -: 1235 _: 368
    0 dkg@alice:~$

 * i'd prefer to prefix the command with "psl-" since it really is
   producing and interpreting PSL-specific data structures.

Accepting this patch would mean i'd have fewer changes to make in the
debian packaging, and would allow other distributors to take a similar
approach if they want to.
2016-07-14 11:55:04 +02:00
Tim Rühsen 8dba092c73 Add magic header to DAFSA binary files 2016-07-13 11:14:18 +02:00
Tim Rühsen 852931571f Fixed invocation of make_dafsa.py in psl2c.c 2016-07-13 11:13:04 +02:00
Daniel Kahn Gillmor dc9cc02982 s/publix/public/ 2016-07-06 15:32:51 +02:00
Daniel Kahn Gillmor 248327e4aa use https where possible 2016-07-06 15:32:51 +02:00
Tim Rühsen 2914afa8c7 New linter/ dir with pslint.py selftest 2016-02-18 16:40:06 +01:00
Tim Rühsen 811513f17e Print message and exit when no suffixes are found 2016-02-12 12:27:25 +01:00
Tim Rühsen d19c46c003 Make a few enhancements to pslint 2016-02-08 14:11:52 +01:00
Tim Rühsen 36609787d5 Fix python3 UTF-8 runtime error and section detection 2016-02-08 09:40:43 +01:00
Tim Rühsen 568394438d Add disabled code for 'Group Order' checking
The check has been disabled since it turned out that those
'groupings' of PSL entries are not really ordered in the way
(# of labels, TLD, sublabel#1, sublabel#2, ...)

This commit also fixes section detection / verification
2016-02-05 12:16:50 +01:00
Tim Rühsen aa028e606b Adjust text in doublette comment in src/pslint.py 2016-02-02 22:49:02 +01:00
Tim Rühsen a46af675b4 Fix indentation multi-line comment in src/pslint.py 2016-02-02 22:41:18 +01:00
Tim Rühsen bd70c79c18 Indent src/pslint.py with tabs 2016-02-02 22:20:58 +01:00
Tim Rühsen 98aed19c3a Convert copyright line to UTF-8 in pslint.py 2016-02-02 19:59:45 +01:00
Tim Rühsen 3ba8903915 Add PSL linter written in Python 2016-02-02 16:43:03 +01:00
Tim Rühsen 8c39291f55 Slightly shorter DAFSA array when sorting input 2016-01-05 10:57:07 +01:00
Tim Rühsen 1bd9347af9 Fix for commit fd928da46e 2016-01-04 22:15:43 +01:00
Tim Rühsen fd928da46e Fix python3 incompatibilities in make_dafsa.py 2016-01-04 20:22:13 +01:00
Tim Rühsen 95a5152e56 Update copyright year to 2016 2016-01-02 13:36:49 +01:00
Tim Rühsen 96e0848d81 Release unused memory after loading DAFSA data 2016-01-02 13:31:53 +01:00
Tim Rühsen 748e3ae9cc Load DAFSA precompiled files (auto-detection) 2016-01-01 22:38:21 +01:00
Tim Rühsen 1604cb3dca Fix make_dafsa.py to generate 4 bit return values 2016-01-01 22:32:11 +01:00
Tim Rühsen 23345f5f37 Convert lookup_string_in_fixed_set.c into UTF-8 2016-01-01 22:31:01 +01:00
Tim Rühsen c9d76e4898 Remove unused variable source_date_epoch 2016-01-01 17:20:30 +01:00
Tim Rühsen cde5e53ea6 Remove psl_builtin_compile_time() for reproducible builds 2016-01-01 15:44:24 +01:00
Tim Rühsen c699e3c441 Add --input-format and --output-format to make_dafsa.py 2015-12-30 17:52:48 +01:00
Tim Rühsen 355edc152f Fix for previous commit 2015-12-29 17:20:28 +01:00
Tim Rühsen 82e9445493 Add psl2c --binary to create DAFSA binary file from PSL 2015-12-29 16:53:47 +01:00
Tim Rühsen 5363290cbe Remove debugging printf 2015-12-26 14:29:10 +01:00
Tim Rühsen 093d5eac3d Fix ./configure --disable-runtime
Added runtime punycode generation code from
  http://www.nicemice.net/idn/punycode-spec.gz
2015-12-26 14:15:08 +01:00
Tim Rühsen e252af877f Fix ./configure --disable-builtin 2015-12-15 20:46:25 +01:00
Daniel Kahn Gillmor 01a3751524 re-fix psl_builtin_outdated() 2015-12-11 22:59:15 -05:00
Tim Rühsen 0ca3741df6 Use DAWG/DAFSA format for builtin data
This data representation reduces the size of the PSL data
drastically and still allows fast lookups.
2015-12-09 09:35:04 +01:00
Tim Rühsen 36139b601d Merge branch 'develop' into dafsa 2015-12-07 10:33:44 +01:00
Tim Rühsen 9d2e93f0b8 New function psl_is_public_suffix2()
The current PSL has two sections, ICANN and PRIVATE.
This new function allows limiting the check to one or both
of these sections.
2015-12-06 21:55:56 +01:00
Tim Rühsen 883e67f008 Create src/suffixes_dafsa.c with DAFSA C array 2015-12-04 21:26:30 +01:00
Tim Rühsen aa0593460c Remove .travis.yml from branch 2015-12-04 17:15:03 +01:00
Tim Rühsen b53273d406 Use absolute PSL path to make psl_builtin_outdated() work reliably 2015-11-19 11:18:17 +01:00
Tim Rühsen dbefdb6767 Remove include of bits/stat.h 2015-11-19 10:06:04 +01:00
Tim Rühsen 643e523f09 Fix psl_builtin_outdated() 2015-09-27 19:14:13 +02:00
Tim Rühsen 53c2fe31a8 Update copyright years 2015-09-23 14:50:01 +02:00
Tim Rühsen 00b9cfb119 Add function psl_check_version_number() 2015-09-23 14:04:17 +02:00
Tim Rühsen 6a8f33ee39 Add new function psl_builtin_outdated() 2015-09-19 14:00:49 +02:00
Tim Rühsen 34289fa59b Add function psl_suffix_wildcard_count() 2015-09-19 10:55:09 +02:00
Tim Rühsen e443d21b61 Code cleanup, faster lookups 2015-09-19 10:50:00 +02:00
Tim Rühsen 597709cb11 Support combination of foo.bar and *.foo.bar 2015-09-15 14:49:53 +02:00
Tim Rühsen f6a3b96f91 Check PSL entries before generating built-in data 2015-09-15 11:46:21 +02:00
Daniel Kahn Gillmor ac8ba5a828 Documentation cleanup 2015-08-12 10:06:49 +02:00
Tim Rühsen 3f5e208967 src/psl.c: Fix C99 comment to C89 2015-08-06 12:31:21 +02:00
Tim Rühsen 71835fcd44 Add https://github.com/publicsuffix as git submodule 2015-07-14 13:25:42 +02:00
Daniel Kahn Gillmor f9a1bdcf80 Embed _psl_compile_time derived from $SOURCE_DATE_EPOCH if set
Making packages build byte-for-byte reproducibly from a given
toolchain+source makes it much easier to corroborate builds by testing
against other build infrastructure.

By default, libpsl currently embeds the current unix timestamp in
_psl_compile_time, which makes it bytewise incompatible if it is
rebuilt even on the same machine one second later.

See https://wiki.debian.org/ReproducibleBuilds/TimestampsProposal for
more information about $SOURCE_DATE_EPOCH.
2015-07-12 22:55:35 +02:00
Tim Rühsen 998b5515d7 Work around a libidn<=1.30 vulnerability 2015-07-06 13:03:50 +02:00
Giuseppe Scrivano 7a07205f1b psl.c: fix strndup replacement
Do not copy more bytes than the src string length.
2015-02-28 18:52:47 +01:00
Giuseppe Scrivano 225c557e23 psl.c: Do not define _GNU_SOURCE 2015-02-28 18:37:14 +01:00
Tim Rühsen 067f6aee9c Don't use locale dependent isspace()
Fixes an issue on Solaris
Reported-by: Dagobert Michelsen <dam@opencsw.org>
2015-01-26 11:05:32 +01:00
Tim Rühsen 896f7f6ae4 Fix ASCII check in src/psl2c.c 2015-01-26 11:04:22 +01:00
Tim Rühsen 58a4f6c028 add iconv Solaris compatibility 2015-01-23 16:13:19 +01:00
Tim Rühsen 910c4b37b6 add strndup() compatibility code 2015-01-23 15:05:02 +01:00
Tim Rühsen 16d751c7d3 mark API as stable 2015-01-21 15:38:18 +01:00
Tim Rühsen 6f899ae32b fixed gcc warning about comparison being always true 2015-01-21 12:26:44 +01:00
Tim Rühsen d5254ac816 removed C99 style comments 2015-01-21 12:21:32 +01:00
Tim Rühsen c8a9d2d6ff revoke ec63726165 2014-11-14 17:18:41 +01:00
Tim Rühsen ec63726165 fixed compiler warning in src/psl.c 2014-11-14 15:52:37 +01:00
Tim Rühsen bbed26b303 check for alloca.h before including 2014-10-28 15:41:35 +01:00
Tim Rühsen 4a33c2f65c removed qsort_r() which seems unavailable on Cygwin 2014-08-22 17:44:48 +02:00
Tim Rühsen 8c6179e798 added support for IP addresses in psl_is_cookie_domain_acceptable() 2014-08-19 17:46:36 +02:00
Tim Rühsen c5f61d745b whitespace correction 2014-08-14 11:05:47 +02:00
Jakub Čajka c599471282 Fixed ascii string detection on architectures with unsigned char 2014-08-01 09:16:44 +02:00
Tim Rühsen 5c5ee3aad7 added code for all of runtime and builtin options 2014-06-30 13:21:16 +02:00
Tim Ruehsen 373bcb912c more work on support for libidn, libidn2, libicu 2014-06-29 22:56:33 +02:00
Tim Rühsen 74f715bd9c started with libidn2 integration 2014-06-27 17:13:30 +02:00
Tim Ruehsen c9fd29a977 small doc format change 2014-06-23 12:56:13 +02:00
Tim Ruehsen f7f1408088 removed possible C89 compilation issue 2014-06-20 17:04:22 +02:00
Tim Ruehsen 1c20931896 introduced defines for error codes 2014-06-20 12:36:51 +02:00
Tim Ruehsen 9f5d6b1e9d added idn2 punycode generation as fallback for missing libicu 2014-06-19 13:15:31 +02:00
Tim Ruehsen 1d13ab1d18 removed redundant code from psl2c.c 2014-06-19 12:06:54 +02:00
Tim Ruehsen a1a5b5e5d7 fixed c89 compatibility 2014-06-18 16:27:29 +02:00
Tim Ruehsen 4ae0fecc64 some libicu cleanups 2014-06-18 15:21:22 +02:00
Tim Ruehsen e6e0f7759f added lowercase conversion to ASCII strings 2014-06-18 12:39:55 +02:00
Tim Ruehsen 935b44b3ea updated docs, removed printing to stderr 2014-06-18 12:26:45 +02:00
Tim Ruehsen 57394eb1f8 added psl_str_to_utf8lower() 2014-06-17 17:14:02 +02:00