Commit Graph

4 Commits

Author SHA1 Message Date
Olle Liljenzin e03953e27a Updated DAFSA generator and parser to support UTF-8 encoding 2016-11-05 10:01:54 +01:00
Daniel Kahn Gillmor dc7bf5bbae rename src/make_dafsa.py to src/psl-make-dafsa, add documentation
I've talked to the good people on #debian-bootstrap who would be most
affected by the possible build-dep cycle, and i think the simplest
approach is actually to split out make_dafsa.py into its own
architecture-independent package.

I'm thinking i'll call the package psl-make-dafsa, and in the course of
shipping it, i'll place src/make_dafsa.py as /usr/bin/psl-make-dafsa.

This is because:

 * debian discourages scripts on the $PATH from having language-specific
   suffixes like .py:

    https://lintian.debian.org/tags/script-with-language-extension.html

 * "-" appears to be a more common delimiter in command names than "_":

    0 dkg@alice:~$ for x in - _; do printf "%s: %d " "$x" $(ls -1 ${PATH//:/ } | grep -c "$x"); done; echo
    -: 1235 _: 368
    0 dkg@alice:~$

 * i'd prefer to prefix the command with "psl-" since it really is
   producing and interpreting PSL-specific data structures.

Accepting this patch would mean i'd have fewer changes to make in the
debian packaging, and would allow other distributors to take a similar
approach if they want to.
2016-07-14 11:55:04 +02:00
Tim Rühsen 23345f5f37 Convert lookup_string_in_fixed_set.c into UTF-8 2016-01-01 22:31:01 +01:00
Tim Rühsen 0ca3741df6 Use DAWG/DAFSA format for builtin data
This data representation reduces the size of the PSL data
drastically and still allows fast lookups.
2015-12-09 09:35:04 +01:00