17 Commits

Author SHA1 Message Date
Bartek Fabiszewski 73dd2967c8 Discard too long lines in dictionary file (#14)
* Discard too long lines in dictionary file

* Discard too long lines in dictionary file: add warning and test
2017-11-27 12:19:26 +01:00
Bartek Fabiszewski a8d50da0cc hnj_hyphen_hyphword: fix buffer overflow (#13)
* Fix buffer overflow

* hnj_hyphen_hyphword: rewrite, improve overflow checking

* hnj_hyphen_hyphword: add test to detect overflows
2017-05-02 18:00:42 +02:00
Dimitrij Mijoski a725591330 Fixes hunspell/hunspell#184 2016-09-14 15:54:33 +02:00
Dimitrij Mijoski 803bb92274 Fixes hunspell/hunspell#17 2016-09-14 15:28:56 +02:00
Anne-Edgar WILKE 6df43f8b17 Fix COMPOUNDHYPHENMIN=1 compound hyphenation
    FIRST BUG
    ---------

  Problem

In a compound word, the word parts of two characters are never
hyphenated.

  Example

To reproduce the bug, go to the hyphen-2.8.8 directory and run the
following:

echo "\
UTF-8
LEFTHYPHENMIN 1
RIGHTHYPHENMIN 1
COMPOUNDLEFTHYPHENMIN 1
COMPOUNDRIGHTHYPHENMIN 1
.post1
NEXTLEVEL
e1
a1
" > hyphen.pat

./example hyphen.pat <(echo postea)

The output is post=ea, but it should be post=e=a.

If you replace postea with posteaque in the command above, you get
post=e=a=que, which is correct. Indeed, the component "eaque" is now
five characters long, so it is hyphenated.

If you replace postea with ea, you get e=a, which is also correct,
because ea is not a compound word.

  Solution

In the file hyphen.c, line 966, "if (i - begin > 1)" must be replaced
with "if (i - begin > 0)".
The word part spans begin to i inclusive, so its length is
i - begin + 1. To hyphenate word parts of length 2 and above, the
check must therefore be i - begin + 1 >= 2, i.e. i - begin > 0.
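
The length argument above can be checked with a small sketch. The function names and the index values used below are hypothetical and do not come from hyphen.c; only the two conditions are quoted from the commit message.

```c
#include <assert.h>

/* Length of a word part that spans indices begin..i inclusive. */
int part_length(int begin, int i)
{
    assert(i >= begin);
    return i - begin + 1;
}

/* Condition before the fix: true only for parts of 3+ characters. */
int old_check(int begin, int i)
{
    return i - begin > 1;
}

/* Condition after the fix: true for parts of 2+ characters. */
int new_check(int begin, int i)
{
    return i - begin > 0;
}
```

Assuming the part "ea" of "postea" occupies indices 4 to 5, its length is 2: the old condition rejects it and the new one accepts it. The five-character part "eaque" (indices 4 to 8) passes both, which matches the posteaque observation.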

    SECOND BUG
    ----------

  Problem

In a compound word, the word parts are never hyphenated between their
second-to-last and their last character.

  Example

To reproduce the bug, run the following:

echo "\
UTF-8
LEFTHYPHENMIN 1
RIGHTHYPHENMIN 1
COMPOUNDLEFTHYPHENMIN 1
COMPOUNDRIGHTHYPHENMIN 1
1que.
NEXTLEVEL
e1
" > hyphen.pat

./example hyphen.pat <(echo meaque)

The output is mea=que, but it should be me=a=que.

Again, if you replace meaque with mea, you get me=a, which is correct,
because mea is not a compound word.

If you replace meaque with eamque, you get e=am=que, as expected. This
shows that there is no similar bug between the first and the second
character of word parts.

  Solution

In the file hyphen.c, line 983, "for (j = 0; j < i - begin - 1; j++)"
must be replaced with "for (j = 0; j < i - begin; j++)".
The word part has length i - begin + 1, so there are i - begin
possible places for a hyphen. Thus j must take i - begin different
values, i.e. run from 0 to i - begin - 1.
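
The loop-bound argument can be illustrated the same way. This is only a sketch: the function names and indices are hypothetical, and the loops merely count iterations rather than reproduce what hyphen.c does inside them.

```c
/* Count the hyphen positions visited with the bound used before the fix. */
int positions_old(int begin, int i)
{
    int j, n = 0;
    for (j = 0; j < i - begin - 1; j++)
        n++;
    return n;
}

/* Count the hyphen positions visited with the corrected bound. */
int positions_new(int begin, int i)
{
    int j, n = 0;
    for (j = 0; j < i - begin; j++)
        n++;
    return n;
}
```

Assuming the part "mea" of "meaque" occupies indices 0 to 2, it has two possible hyphen positions. The old bound visits only one of them, which is consistent with the missing hyphen in me=a, while the corrected bound visits both.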
2015-08-28 00:53:59 +02:00
Caolán McNamara 25e74becb5 sf#247 comparison between signed and unsigned 2014-10-17 12:22:19 +00:00
Caolán McNamara 81b1d7eaf4 clang scan-build warnings 2014-06-26 13:45:51 +00:00
Caolán McNamara 9db38a4ba7 #54 hyphen compile as C++, missing casts and variable names 2013-06-13 19:43:14 +00:00
László Németh 3bea3aed15 hnj_hyphen_load_file patch for sandboxing by Pawel Hajdan 2013-03-18 10:49:03 +00:00
László Németh d2dc7334af fdo#43931 (hard hyphen hyphenation) + fdo#54843 (rhmin fix) 2012-09-13 07:50:50 +00:00
Caolán McNamara 943b612a4e fix bug found by Miklos 2012-07-12 15:29:56 +00:00
Caolán McNamara 5a60cb75a9 fix coverity warnings 2012-06-29 10:02:24 +00:00
Caolán McNamara 39bf406090 sync 2.8.3 into CVS 2012-06-29 07:10:58 +00:00
László Németh 830eccebc3 NOHYPHEN fix 2010-12-01 01:30:20 +00:00
László Németh f86ce87baa NOHYPHEN feature, see README.compound 2010-11-27 02:20:33 +00:00
Caolán McNamara 5020b64b0e check return value of fgets 2010-03-04 12:27:39 +00:00
Caolán McNamara 21127cc849 Initial import 2010-03-04 12:13:53 +00:00