2010-07-17 Caolán McNamara:
	* strip down csutil.* to the bits that are used

2010-03-04 Caolán McNamara:
	* hun#1724558 tidy substrings.c a little
	* hun#1999737 add some malloc checks
	* hun#2005643 tidy string functions

2010-02-23 László Németh:
	* hyphen.c: fix lefthyphenmin calculation for UTF-8 encoded input
	* hyphen.c: add Unicode ligature support for hyphenmin calculation
	  (see also LONG_LIGATURE macro in hyphen.c for conditional compiling)
	* csutil.c: static struct for encds[] (from OpenOffice.org patch),
	  (unsigned char)s, wordmin + 5 limit (see hyphen-2.4.patch of OOo)
	* Makefile.am, ooopatch.awk: add OpenOffice.org patch to the en_US
	  hyphenation dictionary to fix apostrophe handling, see lhmin.test
	* Makefile.am, lig.awk: add Unicode ligature support to en_US
	  hyphenation dictionary.
	  NOTE: hyphenation within ligatures is not supported yet because
	  of an implementation problem of OpenOffice.org: see OOo issue 71608.
	* tests: new tests: lig.* for ligature hyphenation, lhmin.* for
	  lefthyphenmin calculation for UTF-8 encoded text with diacritics.

2008-05-01 László Németh:
	* hyphen.c, hyphen.h: compound word hyphenation support by recursive
	  pattern matching based on two hyphenation pattern sets, see
	  README.compound. Especially useful for languages with arbitrary
	  number of compounds (Danish, Dutch, Finnish, German, Hungarian,
	  Icelandic, Norwegian, Swedish etc.).
	  - dictionary options for compound word hyphenation:
	    COMPOUNDLEFTHYPHENMIN: minimal hyphenation distance from the
	    left compound word boundary
	    COMPOUNDRIGHTHYPHENMIN: minimal hyphenation distance from the
	    right compound word boundary
	* README.compound: documentation
	* tests/compound.*: test data for compound word hyphenation and
	  COMPOUNDLEFTHYPHENMIN and COMPOUNDRIGHTHYPHENMIN.
	* tests/test.sh:
	  - add Valgrind debugger support, usage:
	    make check
	    VALGRIND=memcheck make check
	  - fix false return when an error occurred
	  - fix make distcheck target
	* tests/*.pat, test.sh: using static pattern files processed by
	  substrings.pl instead of run-time processed patterns.
	* hyphen.c: add default hyphenmin support to the dictionaries:
	  LEFTHYPHENMIN: minimal hyphenation distance from the left end
	  of the word
	  RIGHTHYPHENMIN: minimal hyphenation distance from the right end
	  of the word.
	  Problems with the LEFTHYPHENMIN and RIGHTHYPHENMIN and a possible
	  solution reported by Joan Montané in SF.net Bug ID 1777894.
	* tests/settings*.*: test data of LEFTHYPHENMIN and RIGHTHYPHENMIN.
	  First test (settings.*) is based on the test data of Joan Montané
	  (SF.net Bug ID 1777894).
	* example.c: changed options:
	  - old -d (non-standard hyph.) mode is the default now
	  - -dd (listing possible hyphenations) -> -d
	  - -o: old (without non-standard hyphenation support) mode
	* Makefile.am:
	  - remove unused csutil from the shared library (-20 kB and solve
	    a csutil conflict with Hunspell reported by Rene Engelhard in
	    SF.net Bug 1939988).
	* substrings.pl: add lefthyphenmin and righthyphenmin options:
	  substrings.pl infile outfile [encoding [lefthyphenmin [righthyphenmin]]]
	* hyph_en_US.dic, Makefile.am: set right default values for American
	  English, based on the original TeX settings and American English
	  orthography:
	  LEFTHYPHENMIN 2
	  RIGHTHYPHENMIN 3
	* README_hyph_en_US.dic: add README for en_US hyphenation patterns
	* tbhyphext.tex: TugBoat hyphenation exception log with thousand
	  word fixes, source:
	  http://www.ctan.org/tex-archive/info/digests/tugboat/tb0hyf.tex,
	  processed by the hyphenex.sh script (see in the same folder).
	* tbhyphext.sh: conversion script for tbhyphext.pat.

2008-02-19 László Németh:
	* hyphen.c: fix unconditional jump in the obsolete
	  hnj_hyphen_hyphenate() (it was already fixed in the preferred
	  hnj_hyphen_hyphenate2()).
	  Possible fix for the problem reported by Rene Engelhard in
	  SourceForge Bug ID 1896207.
	* Makefile.am: add missing $(srcdir)s for make dist
	* NEWS: add NEWS for autoreconf

2007-11-22 László Németh:
	* hyphen.c: fix a bad condition that was introduced in the previous
	  version. Problem reported by Joan Montané under SourceForge
	  Bug ID 1772381.
	* Makefile.am: rename the library to "hyphen".
	* hyphen.tex: use the last official version and its time stamp.
	  Source: http://tug.ctan.org/text-archive/macros/plain/base/hyphen.tex
	* tests/*: add make check support
	* doc/tb87nemeth.pdf: TugBoat article about non-standard hyphenation
	  and its implementation.

2007-11-12 Caolán McNamara:
	* autoconf/automake/libtoolize it
	  Which as a side effect makes it fit into the existing
	  --with-system-altlinuxhyph configure support in OOo to use a
	  system pre-installed library for OOo hyphenation.
	* make a shared library libhnj.so from it
	* install the hyphen.h header
	* hyphen.patch: document by a make target how to go from the
	  original hyphen.tex file to the interim hyphen.us to the final
	  hyph_en_US.dic that OOo uses.
	  (For example, converting \hyphenate section of hyphen.tex.)

2007-05-14 László Németh:
	* README: add information about substrings.pl conversion and HyFo
	  Java hyphenation module.
	* README.hyphen: add the following references about hyphenation:
	  Franklin M. Liang: Word Hy-phen-a-tion by Com-put-er.
	  Stanford University, 1983. http://www.tug.org/docs/liang.
	  László Németh: Automatic non-standard hyphenation in
	  OpenOffice.org, TUGboat (27), 2006. No. 2.,
	  http://hunspell.sourceforge.net/tb87nemeth.pdf
	* README.nonstandard: add information about narrow subpatterns,
	  and a problem reported by Peter B. West.

2006-11-27 László Németh:
	* substrings.pl: restore previous version to fix rare non-standard
	  hyphenation problems reported by Peter B.
	  West, HyFo (Java XSLT formatter) developer

2006-08-03 László Németh:
	* hyphen.c: fix bad Unicode non-standard hyphenation (reset deleted
	  break in UTF-8 length conversion code in hnj_hyphen_load())
	* tests/unicode*, Makefile: test for this fix
	* hyphen.c: fix bad hyphen duplication in hyphword output in
	  hnj_hyphenate2()
	* example.c: fix empty input fault in single_hyphenations()
	  (unsigned return value of strlen() couldn't be negative in the
	  condition)
	* substrings.pl: shorter version with Nanning Buitenhuis's
	  substrings.pl fix.

2006-07-28 Nanning Buitenhuis:
	* substrings.c: faster C version of substrings.pl
	  - It also fixed a minor bug in combine(): if a sub-pattern is
	    found twice (or more) in the main pattern, then all occurrences
	    were changed instead of (the correct) last occurrence.
	    Only example in hyphen.us is 'tanta3'

2006-01-27 László Németh:
	* *.{c,h}: add non-standard hyphenation and Unicode support
	* README.discretionary: documentation
	- add tests/ (see make check)

2005-10-13 Daniel Naber:
	* example.c: fixed the call to hnj_hyphen_hyphenate() in example.c
	  so that patterns ending in a dot should now work (Daniel Naber)

Libhnj was written by Raph Levien
Adapted to OpenOffice.org by Peter Novodvorsky