266 Commits

Author SHA1 Message Date
Behdad Esfahbod 072ae7a982 Add hb_buffer_serialize_list_formats() 2012-11-15 13:14:12 -08:00
Behdad Esfahbod f9edf16725 Add buffer serialization / deserialization API
Two output formats for now: TEXT, and JSON.  For example:

  hb-shape --output-format=json

Deserialization API is added, but not implemented yet.
2012-11-15 13:10:07 -08:00
Behdad Esfahbod 66ac2ff32e API change: Remove "mask" from hb_buffer_add()
I don't expect anybody is using hb_buffer_add(), so this shouldn't break
anyone's code.
2012-11-13 16:26:32 -08:00
Behdad Esfahbod 0c7df22228 Add buffer flags
New API:

	hb_buffer_flags_t

	HB_BUFFER_FLAGS_DEFAULT
	HB_BUFFER_FLAG_BOT
	HB_BUFFER_FLAG_EOT
	HB_BUFFER_FLAG_PRESERVE_DEFAULT_IGNORABLES

	hb_buffer_set_flags()
	hb_buffer_get_flags()

We use the BOT flag to decide whether to insert dottedcircle if the
first char in the buffer is a combining mark.

The PRESERVE_DEFAULT_IGNORABLES flag prevents removal of characters like
ZWNJ/ZWJ/...
2012-11-13 14:42:35 -08:00
Behdad Esfahbod 82ecaff736 Add hb_buffer_clear()
Which is like _reset(), but does NOT clear unicode-funcs.
2012-11-13 14:10:00 -08:00
Behdad Esfahbod da70111ab2 Don't clear buffer pre-context if no new context is being provided
Patch from Jonathan Kew.

Part of fixing:

Mozilla Bug 801410 - avoid inserting dotted-circle for run-initial
Unicode combining characters in "simple" scripts such as Latin

https://bugzilla.mozilla.org/show_bug.cgi?id=801410
2012-10-31 13:45:30 -07:00
Behdad Esfahbod 0bc7a38463 [OT] Fix ReverseChainingSubst
We should make it clear that we don't want output buffer in this case,
otherwise buffer->backtrack_len() would be wrong.
2012-10-29 22:02:45 -07:00
Behdad Esfahbod 38b015e57f Fix hb_buffer_set_length(buffer, 0)
Was causing invalid realloc()s.
2012-10-28 20:11:47 -07:00
Behdad Esfahbod 05207a79e0 [buffer] Save pre/post textual context
To be used for a variety of purposes.  We save up to five characters
in each direction.  No public API changes, everything is taken care
of already.  All clients need to do is to call hb_buffer_add_utf* with
the full text + segment info (or at least some context) instead of
just passing in the segment.

Various operations (hb_buffer_reset, hb_buffer_set_length,
hb_buffer_add*) automatically reset the relevant contexts.
2012-09-25 21:32:21 -04:00
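As an illustration of the context saving described in 05207a79e0, the sketch below records up to five characters on each side of a segment. It is a minimal sketch only: `text_context_t`, `save_context`, and the "nearest first" storage order are hypothetical choices made for this example, and the input is assumed to be already decoded to UTF-32. It is not HarfBuzz's internal code.

```c
#include <stddef.h>
#include <stdint.h>

#define CONTEXT_MAX 5  /* "up to five characters in each direction" */

/* Hypothetical container; not HarfBuzz's internal layout. */
typedef struct {
  uint32_t pre[CONTEXT_MAX];   /* characters before the segment, nearest first */
  size_t   pre_len;
  uint32_t post[CONTEXT_MAX];  /* characters after the segment, nearest first */
  size_t   post_len;
} text_context_t;

/* Record the context around the segment text[offset .. offset + length).
 * Assumes offset + length <= text_len. */
static text_context_t
save_context (const uint32_t *text, size_t text_len,
              size_t offset, size_t length)
{
  text_context_t ctx = { {0}, 0, {0}, 0 };
  size_t end = offset + length;

  /* Walk backwards from the segment start. */
  for (size_t i = offset; i > 0 && ctx.pre_len < CONTEXT_MAX; i--)
    ctx.pre[ctx.pre_len++] = text[i - 1];

  /* Walk forwards from the segment end. */
  for (size_t i = end; i < text_len && ctx.post_len < CONTEXT_MAX; i++)
    ctx.post[ctx.post_len++] = text[i];

  return ctx;
}
```

A caller that passes only the segment itself (offset 0, length equal to the text length) ends up with empty context on both sides, which is why the message asks clients to pass the full text plus segment info.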
Behdad Esfahbod 1f66c3c1a0 Add hb_utf_strlen()
Speeds up UTF-8 parsing by calling strlen().
2012-09-25 11:42:16 -04:00
Behdad Esfahbod 7f19ae7b9f [buffer] Templatize UTF handling
Also move UTF routines into a separate file, to be reused from shapers
that need it.
2012-09-25 11:23:55 -04:00
Behdad Esfahbod 0e0a4da9b7 [buffer] Towards template'izing different UTF adders 2012-09-25 11:09:04 -04:00
Behdad Esfahbod 7d37280600 Minor 2012-09-25 11:04:41 -04:00
Behdad Esfahbod 96fdc04e5c Add hb_buffer_[sg]et_content_type
And hb_buffer_content_type_t and enum values.
2012-09-06 22:30:53 -04:00
Behdad Esfahbod b85800f9de [Indic] Implement dotted-circle insertion for broken clusters
No panic, we really insert dotted circle when it's absolutely broken.

Fixes most of the dotted-circle cases against Uniscribe. (for Devanagari
fixes 80% of them, for Khmer 70%; the rest look like Uniscribe being
really bogus...)

I had to make a decision.  Apparently Uniscribe adds one dotted circle
to each broken character.  I tried that, but that goes wrong easily with
split matras.  So I made it add only one dotted circle to an entire
broken syllable tail.  As in: "if there was a dotted circle here, this
would have formed a correct cluster."  That works better for split
stuff, and I like it more.
2012-08-31 19:18:20 -04:00
Behdad Esfahbod 1be368e96f Minor 2012-08-31 16:29:17 -04:00
Behdad Esfahbod 965c280de0 Add HB_BUFFER_ASSERT_VAR
To be used in places we access buffer vars...
2012-08-29 14:02:37 -04:00
Behdad Esfahbod d5045a5f40 [ICU] Use new normalizer2 compose/decompose API
It's considerably faster than the fallback implementation we had
previously!
2012-08-11 21:27:15 -04:00
Behdad Esfahbod 208f70f055 Inline Unicode callbacks internally 2012-08-01 17:13:10 -04:00
Behdad Esfahbod 69cc492dc1 [buffer] Minor 2012-07-31 14:51:36 -04:00
Behdad Esfahbod ea278d3895 Partially switch ot shaper to shape_plan 2012-07-27 02:12:28 -04:00
Behdad Esfahbod 47ef931f13 [buffer] Make sure out_info = info during GPOS 2012-07-19 20:52:44 -04:00
Behdad Esfahbod 39b17837b4 Add hb_buffer_normalize_glyphs() and hb-shape --normalize-glyphs
This reorders glyphs within the cluster to a nominal order.  This should
have no visible effect on the output, but helps with testing, for
getting the same hb-shape output for visually-equal glyphs for each
cluster.
2012-07-17 17:09:29 -04:00
Behdad Esfahbod e085fcf7ca Remove unused buffer->replace_glyphs_be16 2012-06-08 21:45:00 -04:00
Behdad Esfahbod fe3dabc08d Minor 2012-06-08 20:56:05 -04:00
Behdad Esfahbod e88e14421a Use merge_clusters instead of open-coding 2012-06-08 20:55:21 -04:00
Behdad Esfahbod e51d2b6ed1 Extend into main buffer if extension hit end of out-buffer merging clusters 2012-06-08 20:36:33 -04:00
Behdad Esfahbod 5ced012d9f Extend end when merging clusters in out-buffer 2012-06-08 20:31:32 -04:00
Behdad Esfahbod 72c0a18783 Extend clusters backward in out-buffer 2012-06-08 20:30:03 -04:00
Behdad Esfahbod cd5891493d Extend clusters backwards, into the out-buffer too 2012-06-08 20:28:59 -04:00
Behdad Esfahbod cafa6f3727 When merging clusters, extend the end 2012-06-08 20:17:10 -04:00
Behdad Esfahbod 2a3d911fe0 Fix alignment-requirement mismatch
Detected by clang and lots of cmdline options.
2012-06-07 17:31:46 -04:00
Behdad Esfahbod 0594a24484 Cleanup TRUE/FALSE vs true/false 2012-06-05 20:35:40 -04:00
Behdad Esfahbod e1ac38f8dd Fix inert buffer set_length() with zero
Oops!
2012-06-05 20:31:49 -04:00
Behdad Esfahbod be4560a3b5 Undo default unicode-funcs to avoid static initializer again 2012-06-05 18:43:57 -04:00
Behdad Esfahbod f06ab8a426 Better hide nil objects and make them const 2012-06-05 14:49:14 -04:00
Behdad Esfahbod 8e3715f8a1 Minor 2012-04-23 22:18:54 -04:00
Behdad Esfahbod 3b26f96ebe Add Thai shaper that does SARA AM decomposition / reordering
That's not in the OpenType spec, but it's what MS and Adobe do.
2012-04-10 10:52:07 -04:00
Behdad Esfahbod d4cc44716c Move code around, in prep for Thai/Lao shaper 2012-04-07 21:52:28 -04:00
Behdad Esfahbod c521e793bd Fix OOB in replace_glyph()
Patch from Kenichi Ishibashi.
2012-01-18 21:51:05 -05:00
Behdad Esfahbod 9ebe8c0286 Add buffer->replace_glyphs() 2011-08-26 09:29:42 +02:00
Behdad Esfahbod e6c09cdf43 Remove the pre_allocate argument from hb_buffer_create()
For two reasons:

1. User can always call hb_buffer_pre_allocate() themselves, and

2. Now we do a pre_alloc in add_utfX anyway, so the total number of
reallocs is limited to a small number (~3) anyway.  This just makes the
API cleaner.
2011-08-19 19:20:26 +02:00
Behdad Esfahbod 4e9ff1dd6e Pre-allocate buffers when adding string
We do a conservative estimate of the number of characters, but still,
this limits the number of buffer reallocs to a small constant.
2011-08-15 16:21:22 +02:00
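One possible reading of the "conservative estimate" in 4e9ff1dd6e is a lower bound on the number of characters: since no Unicode code point needs more than four bytes in UTF-8, UTF-16, or UTF-32, an input of `n_units` code units of `unit_size` bytes holds at least `n_units * unit_size / 4` code points. The helper name `min_code_points` is hypothetical, and the sketch assumes the multiplication does not overflow.

```c
#include <stddef.h>

/* Lower bound on the number of code points encoded by n_units code units
 * of unit_size bytes each (1 for UTF-8, 2 for UTF-16, 4 for UTF-32).
 * Assumes n_units * unit_size does not overflow size_t. */
static size_t
min_code_points (size_t n_units, size_t unit_size)
{
  return n_units * unit_size / 4;
}
```

Reserving this many slots up front never over-allocates, and because the true count is at most four times the estimate, a buffer that grows geometrically needs only a small, bounded number of further reallocations.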
Behdad Esfahbod 33ccc77902 [API] Make set_user_data() functions take a replace parameter
We need this to set data on objects safely without worrying that some
other thread unsets it by setting it at the same time.
2011-08-09 00:43:24 +02:00
Behdad Esfahbod 944b2ba1ce [buffer] Make API take signed int length
Since we already switched to accepting -1 as 'zero-terminated'.
2011-08-09 00:23:58 +02:00
Behdad Esfahbod 144cd49a0e [buffer] Accept -1 for text_length and item_length
A -1 text_length means: zero-terminated string.
A -1 item_length means: to the end of string.
2011-08-07 00:51:50 -04:00
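The two -1 conventions in 144cd49a0e can be sketched as a small resolution step. `resolved_span_t` and `resolve_span` are hypothetical names used only for this illustration; the sketch assumes a zero-terminated UTF-8 string whenever `text_length` is -1 and an `item_offset` that does not exceed the resolved text length.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical result type: all lengths resolved to concrete values. */
typedef struct {
  size_t text_length;
  size_t item_offset;
  size_t item_length;
} resolved_span_t;

/* Resolve the -1 sentinels:
 *   text_length == -1  ->  zero-terminated string, use strlen()
 *   item_length == -1  ->  from item_offset to the end of the string
 * Assumes item_offset <= resolved text length. */
static resolved_span_t
resolve_span (const char *text, int text_length,
              unsigned int item_offset, int item_length)
{
  resolved_span_t r;
  r.text_length = text_length == -1 ? strlen (text) : (size_t) text_length;
  r.item_offset = item_offset;
  r.item_length = item_length == -1 ? r.text_length - item_offset
                                    : (size_t) item_length;
  return r;
}
```

In HarfBuzz itself this is what makes a call such as `hb_buffer_add_utf8 (buffer, text, -1, 0, -1)` add a whole zero-terminated string.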
Behdad Esfahbod 02aeca985b [API] Changes to main shape API
hb_shape() now accepts a shaper_options and a shaper_list argument.
Both can be set to NULL to emulate previous API.  And in most situations
they are expected to be set to NULL.

hb_shape() also returns a boolean for now.  If shaper_list is NULL, the
return value can be ignored.

shaper_options is ignored for now, but otherwise it should be a
NULL-terminated list of strings.

shaper_list is a NULL-terminated list of strings.  Currently recognized
strings are "ot" for native OpenType Layout implementation, "uniscribe"
for the Uniscribe backend, and "fallback" for the non-complex backend
(that will be implemented shortly).  The fallback backend never fails.

The env var HB_SHAPER_LIST is also parsed and honored.  It's a
colon-separated list of shaper names.  The fallback shaper is invoked if
none of the env-listed shapers succeed.

New API hb_buffer_guess_properties() added.
2011-08-04 22:38:09 -04:00
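The colon-separated HB_SHAPER_LIST format described in 02aeca985b could be split along the following lines. This is a sketch under assumptions, not HarfBuzz's parser: `split_shaper_list` is a hypothetical helper, it modifies its input in place, and skipping empty entries is a choice made for this example.

```c
#include <stddef.h>
#include <string.h>

/* Split a colon-separated list such as "ot:uniscribe:fallback" in place.
 * Stores pointers to at most max_names non-empty names and returns how
 * many were stored.  The input string is modified (colons become NULs). */
static size_t
split_shaper_list (char *list, const char **names, size_t max_names)
{
  size_t n = 0;
  char *p = list;

  while (p && *p && n < max_names) {
    char *colon = strchr (p, ':');
    if (colon)
      *colon = '\0';
    if (*p)                      /* skip empty entries such as "a::b" */
      names[n++] = p;
    p = colon ? colon + 1 : NULL;
  }
  return n;
}
```

A caller would typically copy the result of `getenv ("HB_SHAPER_LIST")` into a writable buffer before splitting it, then try each named shaper in order and fall back to the "fallback" shaper if none succeeds.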
Behdad Esfahbod c605bbbb6d Remove C++ guards from source files
Were causing issues for people with MSVC.
2011-08-04 20:00:53 -04:00
Behdad Esfahbod e62df43649 Add internal hb_buffer_t::get_scratch_buffer() 2011-08-03 17:38:54 -04:00
Behdad Esfahbod b65c06025d Formalize buffer var allocations 2011-07-28 16:49:29 -04:00
Behdad Esfahbod a9ad3d3460 Move more code around
Buffer var allocation coming into shape
2011-07-28 15:42:18 -04:00
Behdad Esfahbod 3a81b1db89 Minor, fix leak from my previous refactorings 2011-07-25 16:30:32 -04:00
Behdad Esfahbod f4a579bc42 Add internal API for buffer var allocation 2011-07-25 16:26:05 -04:00
Behdad Esfahbod 468e9cb25c Move buffer methods into the object 2011-07-22 14:49:14 -04:00
Behdad Esfahbod 9111b21ef9 Add _hb_buffer_output_glyph() and _hb_buffer_skip_glyph() 2011-07-21 00:59:15 -04:00
Behdad Esfahbod dd89d958c1 Fix cluster calculation for non-LTR text 2011-07-21 00:28:57 -04:00
Behdad Esfahbod 2e18c6dbdf Fix reverse_range() position loop
Mozilla Bug 669175 - Slow rendering of text sometimes in this case,
using direction: rtl
2011-07-06 16:05:45 -04:00
Behdad Esfahbod 80a6833b03 [API] Add hb_*_get_empty() for all objects 2011-05-11 18:21:58 -04:00
Behdad Esfahbod 3935af1c0d [buffer] Remove wrong optimization
While the cluster fields of the glyph string are usually sorted, they
wouldn't be in special cases (for example for non-native direction).
Blindly using bsearch is plain wrong.  If we want to reintroduce this
optimization we have to make sure we know the buffer clusters are
monotonic and in which direction.  Not sure it's worth it though.
2011-05-05 16:09:45 -04:00
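To illustrate the point made in 3935af1c0d, here is a lookup that makes no assumption about cluster ordering. The message does not say exactly what the removed code searched for, so `find_cluster_linear` and its half-open range arguments are hypothetical; the only claim is that a plain scan stays correct whether cluster values ascend, descend, or neither, whereas a binary search requires a known sort order.

```c
#include <stddef.h>
#include <stdint.h>

/* Return the index of the first glyph whose cluster value lies in
 * [cluster_start, cluster_end), or count if there is none.
 * Works for any ordering of the clusters array. */
static size_t
find_cluster_linear (const uint32_t *clusters, size_t count,
                     uint32_t cluster_start, uint32_t cluster_end)
{
  for (size_t i = 0; i < count; i++)
    if (cluster_start <= clusters[i] && clusters[i] < cluster_end)
      return i;
  return count;
}
```

For right-to-left text the cluster values typically run in descending order, which is exactly the kind of input an ascending binary search would mishandle.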
Behdad Esfahbod e87867cb88 [buffer] Fail in _create() if we cannot pre-allocate the requested size 2011-05-02 19:35:05 -04:00
Behdad Esfahbod 243673d601 [test/buffer] Add more extensive UTF-8 test data from glib 2011-04-28 19:37:51 -04:00
Behdad Esfahbod 080a0eb7d8 Add _hb_unsigned_int_mul_overflows 2011-04-28 16:01:01 -04:00
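A check of the kind named in 080a0eb7d8 is commonly written as a division against the type's maximum. The sketch below shows that general technique; `unsigned_mul_overflows` is a hypothetical stand-in, not a copy of the HarfBuzz helper.

```c
#include <limits.h>
#include <stdbool.h>

/* True if count * size would not fit in an unsigned int.
 * Uses a division so that the check itself cannot overflow. */
static bool
unsigned_mul_overflows (unsigned int count, unsigned int size)
{
  return size > 0 && count > UINT_MAX / size;
}
```

Running such a check before computing an allocation size lets the caller fail cleanly instead of allocating a wrapped-around, too-small block.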
Behdad Esfahbod 3264042873 [test/buffer] Test pre_allocate() and allocation_successful() 2011-04-28 14:24:16 -04:00
Behdad Esfahbod e0db4b868f [buffer] More error handling
Should be all set now.
2011-04-28 12:56:49 -04:00
Behdad Esfahbod 5fa849b77d [API] Add _set/get_user_data() for all objects 2011-04-27 21:46:01 -04:00
Behdad Esfahbod 47e71d9661 [object] Remove unnecessary use of macros 2011-04-27 16:41:08 -04:00
Behdad Esfahbod 65e0063eae Make buffer size growth start from 32 instead of 8 2011-04-27 09:38:23 -04:00
Behdad Esfahbod d4bee9f813 [API] Add hb_unicode_funcs_get_default() 2011-04-27 09:38:19 -04:00
Behdad Esfahbod fca368c468 Add hb_object_header_t which is the common part of all objects
Makes way for adding arbitrary user_data support.
2011-04-21 18:24:02 -04:00
Behdad Esfahbod 2409d5f8d7 Update Copyright headers 2011-04-21 17:14:28 -04:00
Behdad Esfahbod af02933739 [API] Remove hb_*_get_reference_count()
This was a bizarre piece of API that I inherited from cairo.  It was
wrong to add them to cairo in the first place.  Remove them before
someone uses them!
2011-04-20 15:49:31 -04:00
Behdad Esfahbod f85faee9b3 [API] Rename hb_buffer_add_glyph() to hb_buffer_add() 2011-04-19 00:38:01 -04:00
Behdad Esfahbod aab0de50e2 [API] Add hb_buffer_allocation_successful()
Returns the error status of the buffer.
2011-04-19 00:32:19 -04:00
Ryan Lortie 02a534b23f [API] Rename hb_buffer_ensure() to hb_buffer_pre_allocate()
The new name is self-documenting.
2011-04-19 00:05:43 -04:00
Ryan Lortie 70566befc5 [API] hb_buffer_get_glyph_{infos,positions}: Add length out parameter
Return the length, whenever we return an array.  Makes it easier on the
language bindings.
2011-04-19 00:03:44 -04:00
Behdad Esfahbod c0af193c8e Change buffer default properties to invalid
This includes HB_DIRECTION_INVALID and HB_SCRIPT_INVALID.

The INVALID will cause a "guess whatever from the text" in hb_shape().
While it's not ideal, it works better than the previous defaults at
least (HB_DIRECTION_LTR and HB_SCRIPT_COMMON).
2011-04-15 19:26:24 -04:00
Behdad Esfahbod 8f0d7e0c3f Remove hb_buffer_clear_positions(), add hb_ot_layout_position_start() 2011-04-15 19:08:43 -04:00
Behdad Esfahbod 2fc56edff6 [API] Remove hb_buffer_clear()
One should use hb_buffer_reset() really.
2011-04-15 19:08:38 -04:00
Behdad Esfahbod c910bec863 Add hb_buffer_reset() and hb_buffer_set_length() 2011-04-13 15:49:06 -04:00
Behdad Esfahbod 69ea23cb5d Minor 2011-04-13 15:02:40 -04:00
Behdad Esfahbod b5dd44e246 Fix possible overflow 2011-02-28 10:13:52 -08:00
Behdad Esfahbod cc1a8a938b Fix ChainContext backtrack matching with GPOS
Reported on mailing list by Keith Stribley and Khaled Hosny.
2011-01-06 14:58:52 -05:00
Behdad Esfahbod 1c3183027f Remove unused realloc
We always allocate and grow str and pos together.
2011-01-06 14:44:14 -05:00
Behdad Esfahbod 98370e89d1 WIP removing external synthesized GDEF support and implementing it internally 2010-11-02 19:12:58 -04:00
Behdad Esfahbod 870e2d6eac Remove unused function 2010-11-02 19:12:58 -04:00
Behdad Esfahbod dbf56b1d94 More lig-id cleanup 2010-11-02 19:12:58 -04:00
Behdad Esfahbod f6a23a0b91 More removal of lig-id code from buffer 2010-11-02 19:12:58 -04:00
Behdad Esfahbod dd2ffd282c Minor renaming 2010-11-02 19:12:58 -04:00
Behdad Esfahbod fe263272a2 Move setting lig_id/component out of buffer and to the gsub code 2010-11-02 19:12:58 -04:00
Behdad Esfahbod 37ab877149 Remove comment 2010-11-02 19:12:58 -04:00
Behdad Esfahbod 88474c6fda Get rid of the OpenType-specific internal buffer representation
Add variant integers to buffer item types.  More cleanup coming.
2010-11-02 19:12:58 -04:00
Behdad Esfahbod bd7378b2ef Massage mask setting a bit more
Still finding the exact correct way the masks should be set.
2010-10-13 18:33:16 -04:00
Behdad Esfahbod 961f9baa7b Oops, actually set global mask 2010-10-13 17:17:00 -04:00
Behdad Esfahbod 3506b2e78d Return early if mask is 0 2010-10-13 15:38:52 -04:00
Behdad Esfahbod 5c1c8c9c50 Make sure feature values don't leak out of their mask 2010-10-13 15:36:38 -04:00
Behdad Esfahbod 57ac0ecb78 Merge clearing masks and setting global masks 2010-10-12 17:07:02 -04:00
Behdad Esfahbod 34db6f031d Add XXX note 2010-10-07 01:21:19 -04:00
Behdad Esfahbod 4e4ef24e46 Towards separating bit allocation from shaping 2010-07-23 17:22:11 -04:00
Behdad Esfahbod acdba3f90b Prefer C linkage 2010-07-23 15:39:27 -04:00
Behdad Esfahbod 81c5e8724b Allow disabling default features
Patch from Jonathan Kew
2010-05-28 18:31:16 -04:00
Behdad Esfahbod 2163afbf35 Add note about UTF-8 decoder 2010-05-27 14:04:15 -04:00
Behdad Esfahbod 1ce7b87c4d Cleanup bitmask allocation 2010-05-21 17:31:45 +01:00
Behdad Esfahbod 009aad5678 Invert the mask logic
Before, the mask in the buffer was inverted.  That is, a 0 bit meant
feature should be applied and 1 meant not applied, whereas in the
lookups, the logic was positive.

Now both are in sync.  When calling hb_buffer_add_glyph() manually,
the mask should be 1 instead of 0.
2010-05-20 14:00:57 +01:00
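The two mask conventions contrasted in 009aad5678 can be sketched as a pair of predicates. Both function names are hypothetical and the code only illustrates the bit logic described in the message; it is not taken from the lookup code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Positive logic (after the change): a set bit in the glyph's mask means
 * the feature guarded by lookup_mask applies to that glyph. */
static bool
should_apply (uint32_t glyph_mask, uint32_t lookup_mask)
{
  return (glyph_mask & lookup_mask) != 0;
}

/* Inverted logic (before the change): a 0 bit in the glyph's mask meant
 * "apply", so the mask has to be complemented before testing. */
static bool
should_apply_inverted (uint32_t glyph_mask, uint32_t lookup_mask)
{
  return (~glyph_mask & lookup_mask) != 0;
}
```

Under the positive convention a glyph added with mask 1 is matched by a lookup whose mask has bit 0 set, which is why the message says manual callers should now pass 1 instead of 0.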
Behdad Esfahbod 3567b87cce Add an inline version of hb_buffer_ensure() 2010-05-14 23:28:44 -04:00
Behdad Esfahbod a6a79df5fe Handle malloc failure in the buffer 2010-05-14 23:20:16 -04:00
Behdad Esfahbod 910a33fe84 Update buffer docs 2010-05-14 22:13:38 -04:00
Behdad Esfahbod 36b73c80df Shortening buffer accessors: rename buffer->in_pos to buffer->i 2010-05-14 22:10:39 -04:00
Behdad Esfahbod 29427c5c51 Shortening buffer accessors: rename buffer->out_length to buffer->out_len 2010-05-14 22:08:22 -04:00
Behdad Esfahbod 6960350be9 Shortening buffer accessors: rename buffer->in_length to buffer->len 2010-05-14 22:07:46 -04:00
Behdad Esfahbod 1b621823f3 Shortening buffer accessors: rename buffer->positions to buffer->pos 2010-05-14 22:05:53 -04:00
Behdad Esfahbod 9d5e26df08 Shortening buffer accessors: rename buffer->out_string to buffer->out_info 2010-05-14 22:03:11 -04:00
Behdad Esfahbod 7e7007a1c9 Shortening buffer accessors: rename buffer->in_string to buffer->info 2010-05-14 22:02:37 -04:00
Behdad Esfahbod 8e6b6bb293 Merge buffer->out_pos and buffer->out_length 2010-05-14 21:58:22 -04:00
Behdad Esfahbod 1d5e780136 Add a few other buffer methods 2010-05-12 23:43:00 -04:00
Behdad Esfahbod 8951fc2c82 Add buffer->allocate_lig_id() 2010-05-12 23:13:39 -04:00
Behdad Esfahbod 22da7fd94d Rename a few files to be C++ sources
In anticipation for buffer revamp coming.
2010-05-12 18:23:21 -04:00