Daniel Kahn Gillmor
2c66c15a99
fix spelling errors
2018-03-23 11:33:55 +00:00
Tim Rühsen
f7e0d9441a
Fix --disable-builtin configure option
2018-03-05 11:25:06 +01:00
Tim Rühsen
e0c1ed0e7a
Improve docs for PSL_TYPE_NO_STAR_RULE
2018-02-26 11:45:57 +01:00
Tim Rühsen
8fd480584e
Fix PSL_TYPE_NO_STAR_RULE and improve test suite
...
Reported-by: Daniel Kahn Gillmor
2018-02-23 12:09:07 +01:00
Tim Rühsen
43ec750b40
Update copyrights
2018-02-22 10:04:00 +01:00
Tim Rühsen
aaacdae977
Add TLDs to (DAFSA) data to allow skipping the star rule
2018-02-21 20:49:26 +01:00
Claudio Saavedra
9e9341f5b9
psl_is_public_suffix2(): allow checking for suffixes not in the list
...
Add a PSL_TYPE_NO_STAR_RULE type to check for suffixes without the '*'
rule. This allows checking for suffixes that are not in the PSL.
2018-02-21 17:11:01 +02:00
Tim Rühsen
819486edd1
Remove C99 loop construct
2018-02-21 15:56:58 +01:00
Tim Rühsen
179ca703b2
Limit CPU wasting on large inputs
...
Large inputs on psl_registrable_domain() and psl_unregistrable_domain()
suffer from O(N^2) behavior. This change limits N to avoid excessive
CPU usage.
At the same time we limit the fuzz corpora size to 64k which is far more
than we expect any real life domain to be.
Reported-by: OSS-Fuzz
2018-02-13 15:42:17 +01:00
Tim Rühsen
1c44781718
Fix unsigned integer overflow in _mem_is_ascii()
...
Found by OSS-Fuzz. It has no impact.
2017-11-03 12:10:05 +01:00
Tim Rühsen
4e51142022
psl_*_count() return -1 if information is not available
2017-09-15 17:14:32 +02:00
Tim Rühsen
c7a48a0bf8
Amend start of comments in lookup_string_in_fixed_set.c
2017-09-15 17:14:32 +02:00
Tim Rühsen
a12bd1d2a6
Fix input encoding for python3
2017-09-14 20:25:59 +02:00
Darshit Shah
4d5982ed98
Add new function psl_free_string()
...
When writing a wrapper around LibPSL in a different language it is
important that libpsl provide functions to free any memory that it
allocates. Without this, it is impossible to correctly free the memory
allocated by the psl_str_to_utf8lower() function since in other languages
one may not have access to the same free() call from libc.
2017-08-30 11:07:04 +02:00
Tim Rühsen
659ee4391e
Remove compiler warnings
2017-07-20 11:36:13 +02:00
Tim Rühsen
a6e4703318
Fix oss-fuzz issue #2600 (buffer overflow in libicu code)
...
Added a reproducer corpus and fixed the broken libicu code.
The buffer overflow could be triggered by psl_load(), psl_load_fp(),
psl_is_public_suffix(), psl_is_public_suffix2(), psl_unregistrable_domain(),
and psl_registrable_domain().
2017-07-13 15:40:58 +02:00
Tim Rühsen
926cc34ade
Fix uninitialised value created by stack allocation
...
Testing the fuzz corpora with valgrind revealed a missing
check in _add_punycode_if_needed() which led to a
"Uninitialised value was created by a stack allocation".
Thanks to OSS-Fuzz for the corpora, and thanks to valgrind for finding this
issue (asan and ubsan didn't find it).
2017-07-09 20:21:55 +02:00
Tim Rühsen
492c884d7d
Fix memory overflow in LIBICU code of psl_str_to_utf8lower()
...
Immediately discovered with the new --enable-asan / --enable-ubsan
configure options, thanks to the fuzz corpora.
2017-06-20 16:30:29 +02:00
Tim Rühsen
d686c1fff7
Fix memleak in _psl_is_public_suffix() (found by fuzzing)
2017-06-13 22:24:18 +02:00
Tim Rühsen
e584007f42
* src/psl.c (psl_str_to_utf8lower): Fix docs
2017-06-12 17:00:53 +02:00
Tim Rühsen
045bf63031
Fix double free in psl_load_fp(), found by fuzzing
2017-06-09 22:53:19 +02:00
Tim Rühsen
a33feb8ff4
Fix typos found by ka7/misspell_fixer
2017-04-19 11:46:27 +02:00
Tim Rühsen
448f6e4564
Fix order of files in psl_latest()
...
If 'dist_filename' and 'filename' are given and both have the same
age, we want 'dist_filename' (expected DAFSA) to be loaded.
2017-02-21 12:18:29 +01:00
Tim Rühsen
eda8276b5f
Use NON-TRANSITIONAL toASCII() with libicu
2017-01-16 10:47:21 +01:00
Tim Rühsen
26d0856d0a
Fix typo
2017-01-16 10:26:12 +01:00
Tim Rühsen
526768cc5d
Use TR46 non-transitional with libidn2 >= 0.14
...
I changed my mind after talking with the cURL
maintainer Daniel Stenberg.
See https://github.com/curl/curl/pull/1207
2017-01-14 15:47:44 +01:00
Tim Rühsen
2c17d56234
Use TR46 transitional with libidn2 >= 0.14
2017-01-03 12:30:43 +01:00
Tim Rühsen
ff29f13d8f
Add functions psl_latest() and psl_dist_filename()
...
Also add a new ./configure option to set a distribution-wide
PSL file used by psl_latest(): --with-psl-distfile
If possible, that filename should point to a DAFSA PSL file that
is updated regularly.
2016-12-06 20:16:12 +01:00
Tim Rühsen
deabd4a546
Replace psl2c by psl-make-dafsa
...
Removed --input-format from psl-make-dafsa.
Added --output-format=cxx+ to psl-make-dafsa.
Removed psl2c.
2016-12-06 15:22:18 +01:00
Tim Rühsen
6490b8214b
Don't taint out variable on error in psl_str_to_utf8lower()
...
Fixes #71
2016-12-05 16:28:47 +01:00
Tim Rühsen
b9e04d6958
Update copyright year
2016-12-05 15:03:27 +01:00
Tim Rühsen
2a3a743643
Fix typo Publix -> Public
2016-12-05 15:01:27 +01:00
Frederic Cambus
9f0b09e830
Missing includes for in6_addr / AF_INET*, fixes compilation on OpenBSD
2016-12-02 19:11:18 +01:00
Tim Rühsen
65e785e1ca
Replace NFCK -> NFKC in the docs
2016-11-29 14:49:35 +01:00
Tim Rühsen
5d32b80077
Make API docs more detailed
2016-11-14 12:08:20 +01:00
Olle Liljenzin
3f276c7d1e
Fix psl-make-dafsa to work with python3
2016-11-12 21:21:28 +01:00
Tim Rühsen
761d938d2a
Fix name of Olle Liljenzin in src/psl-make-dafsa.1
2016-11-06 22:47:33 +01:00
Olle Liljenzin
3a4dff8805
Fixed documentation and error message to match the actual code.
2016-11-06 16:26:44 +01:00
Tim Rühsen
2c871b1306
Skip conversion in _psl_is_public_suffix() for builtin psl context
2016-11-06 11:59:36 +01:00
Tim Rühsen
44e6bd4eb8
src/psl2c.c: Also include UTF-8 into DAFSA output
2016-11-06 11:30:20 +01:00
Tim Rühsen
3211a66f00
Put punycode + UTF-8 rules into DAFSA in utf-8 mode
2016-11-06 11:30:20 +01:00
Tim Rühsen
3ac807d987
Add --encoding to psl-make-dafsa man page
2016-11-05 10:37:01 +01:00
Tim Rühsen
4b42762cbf
Skip punycode conversion for _psl_is_public_suffix() if data contains UTF-8 rules
2016-11-05 10:37:01 +01:00
Olle Liljenzin
86034ac7c9
Added function to the parser for reading DAFSA encoding mode.
2016-11-05 10:37:01 +01:00
Olle Liljenzin
8c2bcd5a24
Added version info into generated DAFSA.
...
psl-make-dafsa got a mode switch so that the old version can be
generated for testing.
2016-11-05 10:01:54 +01:00
Olle Liljenzin
e03953e27a
Updated DAFSA generator and parser to support UTF-8 encoding
2016-11-05 10:01:54 +01:00
Tim Rühsen
598a78b2de
Add better test code coverage
2016-09-26 15:15:34 +02:00
Tim Rühsen
5ebc24f0e0
Code cleanup in libidn2 branch of _psl_idna_toASCII()
...
Reported-by: https://github.com/daurnimator
2016-09-26 10:13:43 +02:00
Tim Rühsen
7eb8592035
Let u8_tolower() allocate the result buffer.
...
Reported-by: https://github.com/daurnimator
2016-09-25 19:44:33 +02:00
Tim Rühsen
32543dd5a5
Avoid unneeded memory allocations in psl_str_to_utf8lower()
...
Reported-by: https://github.com/daurnimator
2016-09-25 12:49:56 +02:00
Tim Rühsen
1baaacccd5
Fix libidn/libidn2 code path of psl_str_to_utf8lower()
...
* fixing memory leaks
* proper handling of unterminated results of u8_tolower()
* second call to iconv() ensures flush of internal memory
* check more code paths of psl_str_to_utf8lower() via
tests/test-registrable-domain.c
2016-09-23 12:35:08 +02:00
Tim Rühsen
e2812e8c4c
Check return value for strdup and strndup
...
Fixes #60
Reported-by: https://github.com/daurnimator
2016-09-22 15:53:31 +02:00
Tim Rühsen
351b3fb912
Remove redundant define of countof()
2016-09-22 11:37:23 +02:00
Tim Rühsen
9e1ca81be4
Remove memory allocations from _utf8_to_utf32()
...
Reported-by: https://github.com/daurnimator
2016-09-22 11:19:52 +02:00
Tim Rühsen
6cfb33e530
Amend API docs to be more precise about invalid input.
...
Fixes #59
Reported-by: https://github.com/daurnimator
2016-09-21 12:03:00 +02:00
Tim Rühsen
10f7b5fe7c
Fallback to malloc from alloca for larger memory chunks
...
Fixes #58
Reported-by: https://github.com/daurnimator
2016-09-21 11:54:39 +02:00
Tim Rühsen
1ab7be5641
Check malloc/realloc results in src/psl.c
...
Fixes #57
Reported-by: https://github.com/daurnimator
2016-09-21 11:15:43 +02:00
Dagobert Michelsen
7983f86820
Use proper library path and libs for ICU
2016-09-17 14:46:06 +02:00
Tim Rühsen
126d2dca9c
Package and install psl.1 and psl-make-dafsa.1
...
Fixes #53
Reported-by: https://github.com/yselkowitz
2016-09-17 14:46:00 +02:00
Jeremy Ehrhardt
003dec4203
Change src/psl-make-dafsa shebang so it'll run on OS X
2016-09-16 18:42:54 -07:00
Daniel Kahn Gillmor
dc7bf5bbae
rename src/make_dafsa.py to src/psl-make-dafsa, add documentation
...
I've talked to the good people on #debian-bootstrap who would be most
affected by the possible build-dep cycle, and i think the simplest
approach is actually to split out make_dafsa.py into its own
architecture-independent package.
I'm thinking i'll call the package psl-make-dafsa, and in the course of
shipping it, i'll place src/make_dafsa.py as /usr/bin/psl-make-dafsa.
This is because:
* debian discourages scripts on the $PATH from having language-specific
suffixes like .py:
https://lintian.debian.org/tags/script-with-language-extension.html
* "-" appears to be a more common delimiter in command names than "_":
0 dkg@alice:~$ for x in - _; do printf "%s: %d " "$x" $(ls -1 ${PATH//:/ } | grep -c "$x"); done; echo
-: 1235 _: 368
0 dkg@alice:~$
* i'd prefer to prefix the command with "psl-" since it really is
producing and interpreting PSL-specific data structures.
Accepting this patch would mean i'd have fewer changes to make in the
debian packaging, and would allow other distributors to take a similar
approach if they want to.
2016-07-14 11:55:04 +02:00
Tim Rühsen
8dba092c73
Add magic header to DAFSA binary files
2016-07-13 11:14:18 +02:00
Tim Rühsen
852931571f
Fixed invocation of make_dafsa.py in psl2c.c
2016-07-13 11:13:04 +02:00
Daniel Kahn Gillmor
dc9cc02982
s/publix/public/
2016-07-06 15:32:51 +02:00
Daniel Kahn Gillmor
248327e4aa
use https where possible
2016-07-06 15:32:51 +02:00
Tim Rühsen
2914afa8c7
New linter/ dir with pslint.py selftest
2016-02-18 16:40:06 +01:00
Tim Rühsen
811513f17e
Print message and exit when no suffixes are found
2016-02-12 12:27:25 +01:00
Tim Rühsen
d19c46c003
Make a few enhancements to pslint
2016-02-08 14:11:52 +01:00
Tim Rühsen
36609787d5
Fix python3 UTF-8 runtime error and section detection
2016-02-08 09:40:43 +01:00
Tim Rühsen
568394438d
Add disabled code for 'Group Order' checking
...
The check has been disabled since it turned out that those
'groupings' of PSL entries are not really ordered in the way
(# of labels, TLD, sublabel#1, sublabel#2, ...)
This commit also fixes section detection / verification
2016-02-05 12:16:50 +01:00
Tim Rühsen
aa028e606b
Adjust text in doublette comment in src/pslint.py
2016-02-02 22:49:02 +01:00
Tim Rühsen
a46af675b4
Fix indentation of multi-line comment in src/pslint.py
2016-02-02 22:41:18 +01:00
Tim Rühsen
bd70c79c18
Indent src/pslint.py with tabs
2016-02-02 22:20:58 +01:00
Tim Rühsen
98aed19c3a
Convert copyright line to UTF-8 in pslint.py
2016-02-02 19:59:45 +01:00
Tim Rühsen
3ba8903915
Add PSL linter written in Python
2016-02-02 16:43:03 +01:00
Tim Rühsen
8c39291f55
Slightly shorter DAFSA array when sorting input
2016-01-05 10:57:07 +01:00
Tim Rühsen
1bd9347af9
Fix for commit fd928da46e
2016-01-04 22:15:43 +01:00
Tim Rühsen
fd928da46e
Fix python3 incompatibilities in make_dafsa.py
2016-01-04 20:22:13 +01:00
Tim Rühsen
95a5152e56
Update copyright year to 2016
2016-01-02 13:36:49 +01:00
Tim Rühsen
96e0848d81
Release unused memory after loading DAFSA data
2016-01-02 13:31:53 +01:00
Tim Rühsen
748e3ae9cc
Load DAFSA precompiled files (auto-detection)
2016-01-01 22:38:21 +01:00
Tim Rühsen
1604cb3dca
Fix make_dafsa.py to generate 4 bit return values
2016-01-01 22:32:11 +01:00
Tim Rühsen
23345f5f37
Convert lookup_string_in_fixed_set.c into UTF-8
2016-01-01 22:31:01 +01:00
Tim Rühsen
c9d76e4898
Remove unused variable source_date_epoch
2016-01-01 17:20:30 +01:00
Tim Rühsen
cde5e53ea6
Remove psl_builtin_compile_time() for reproducible builds
2016-01-01 15:44:24 +01:00
Tim Rühsen
c699e3c441
Add --input-format and --output-format to make_dafsa.py
2015-12-30 17:52:48 +01:00
Tim Rühsen
355edc152f
Fix for previous commit
2015-12-29 17:20:28 +01:00
Tim Rühsen
82e9445493
Add psl2c --binary to create DAFSA binary file from PSL
2015-12-29 16:53:47 +01:00
Tim Rühsen
5363290cbe
Remove debugging printf
2015-12-26 14:29:10 +01:00
Tim Rühsen
093d5eac3d
Fix ./configure --disable-runtime
...
Added runtime punycode generation code from
http://www.nicemice.net/idn/punycode-spec.gz
2015-12-26 14:15:08 +01:00
Tim Rühsen
e252af877f
Fix ./configure --disable-builtin
2015-12-15 20:46:25 +01:00
Daniel Kahn Gillmor
01a3751524
re-fix psl_builtin_outdated()
2015-12-11 22:59:15 -05:00
Tim Rühsen
0ca3741df6
Use DAWG/DAFSA format for builtin data
...
This data representation reduces the size of the PSL data
drastically and still allows fast lookups.
2015-12-09 09:35:04 +01:00
Tim Rühsen
36139b601d
Merge branch 'develop' into dafsa
2015-12-07 10:33:44 +01:00
Tim Rühsen
9d2e93f0b8
New function psl_is_public_suffix2()
...
The current PSL has two sections, ICANN and PRIVATE.
This new function allows limiting the check to one or both
of these sections.
2015-12-06 21:55:56 +01:00
Tim Rühsen
883e67f008
Create src/suffixes_dafsa.c with DAFSA C array
2015-12-04 21:26:30 +01:00
Tim Rühsen
aa0593460c
Remove .travis.yml from branch
2015-12-04 17:15:03 +01:00
Tim Rühsen
b53273d406
Use absolute PSL path to make psl_builtin_outdated() work reliably
2015-11-19 11:18:17 +01:00
Tim Rühsen
dbefdb6767
Remove include of bits/stat.h
2015-11-19 10:06:04 +01:00
Tim Rühsen
643e523f09
Fix psl_builtin_outdated()
2015-09-27 19:14:13 +02:00