From 84186a64004e5dcd2ce98b564d0e0a09aa5d68b2 Mon Sep 17 00:00:00 2001
From: Behdad Esfahbod
Date: Wed, 1 Aug 2012 13:32:39 -0400
Subject: [PATCH] Add commentary on the compatibility decomposition in the normalizer

---
 src/hb-ot-shape-normalize.cc | 18 +++++++++++++++---
 1 file changed, 15 insertions(+), 3 deletions(-)

diff --git a/src/hb-ot-shape-normalize.cc b/src/hb-ot-shape-normalize.cc
index 46c89ec17..17a9ac86f 100644
--- a/src/hb-ot-shape-normalize.cc
+++ b/src/hb-ot-shape-normalize.cc
@@ -63,10 +63,22 @@
  *
  * - When a font does not support a character but supports its decomposition,
  *   well, use the decomposition (preferring the canonical decomposition, but
- *   falling back to the compatibility decomposition if necessary).
+ *   falling back to the compatibility decomposition if necessary). The
+ *   compatibility decomposition is really nice to have, for characters like
+ *   ellipsis, or various-sized space characters.
  *
- * - The Indic shaper requests decomposed output. This will handle splitting
- *   matra for the Indic shaper.
+ * - The complex shapers can customize the compose and decompose functions to
+ *   offload some of their requirements to the normalizer. For example, the
+ *   Indic shaper may want to disallow recomposing of two matras.
+ *
+ * - We try compatibility decomposition if decomposing through canonical
+ *   decomposition alone failed to find a sequence that the font supports.
+ *   We don't try compatibility decomposition recursively during the canonical
+ *   decomposition phase. This has minimal impact. There are only a handful
+ *   of Greek letter that have canonical decompositions that include characters
+ *   with compatibility decomposition. Those can be found using this command:
+ *
+ *   egrep "`echo -n ';('; grep ';<' UnicodeData.txt | cut -d';' -f1 | tr '\n' '|'; echo ') '`" UnicodeData.txt
  */
 
 static void
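
The fallback order described in the new comment can be illustrated with a small standalone sketch. This is not the code in hb-ot-shape-normalize.cc: `ToyUnicodeData`, `ToyFont`, `decompose_canonical` and `choose_sequence` are hypothetical names invented here, the tables are hand-filled rather than derived from UnicodeData.txt, and the font check is reduced to set membership. It only shows the ordering: keep the character if the font has it, otherwise try the recursive canonical decomposition, and only if that fails try a single, non-recursive compatibility decomposition.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <vector>

using Codepoint = char32_t;
using Sequence = std::vector<Codepoint>;

// Hypothetical stand-in for Unicode decomposition data (assumption: the
// caller fills these maps by hand; real data comes from UnicodeData.txt).
struct ToyUnicodeData {
  std::map<Codepoint, Sequence> canonical;
  std::map<Codepoint, Sequence> compatibility;
};

// Hypothetical stand-in for a font: the set of codepoints it has glyphs for.
using ToyFont = std::set<Codepoint>;

static bool font_supports_all(const ToyFont &font, const Sequence &seq) {
  for (Codepoint c : seq)
    if (font.count(c) == 0) return false;
  return true;
}

// Recursively expand canonical mappings only; compatibility mappings are
// deliberately not consulted during this phase.
static Sequence decompose_canonical(const ToyUnicodeData &ucd, Codepoint c) {
  auto it = ucd.canonical.find(c);
  if (it == ucd.canonical.end()) return Sequence{c};
  assert(!it->second.empty());  // toy table entries must be non-empty
  Sequence out;
  for (Codepoint part : it->second) {
    Sequence sub = decompose_canonical(ucd, part);
    out.insert(out.end(), sub.begin(), sub.end());
  }
  return out;
}

// Pick the sequence to shape for one character, in the order the comment
// describes: character itself, canonical decomposition, then compatibility.
static Sequence choose_sequence(const ToyUnicodeData &ucd, const ToyFont &font,
                                Codepoint c) {
  if (font.count(c) != 0) return Sequence{c};

  Sequence canon = decompose_canonical(ucd, c);
  if (canon != Sequence{c} && font_supports_all(font, canon)) return canon;

  // Fallback: one level of compatibility decomposition, tried only after
  // the canonical route failed to produce a supported sequence.
  auto it = ucd.compatibility.find(c);
  if (it != ucd.compatibility.end() && font_supports_all(font, it->second))
    return it->second;

  return Sequence{c};  // nothing supported; leave the character as is
}
```

With a toy table that maps U+2026 HORIZONTAL ELLIPSIS to three U+002E by compatibility and U+00E9 to U+0065 U+0301 canonically, a font that lacks both precomposed characters but has the pieces would get the decomposed sequences back, while a font that has U+00E9 would keep it untouched.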