Commit Graph

151 Commits

Author SHA1 Message Date
Behdad Esfahbod 5f13bbd9d4 When removing default-ignorables, merge clusters
Fixes test-shape, and:
https://code.google.com/p/chromium/issues/detail?id=497578
2015-06-19 13:31:49 -07:00
Sascha Brawer 01c3a88543 Fix "Since:" tags
Based on data from http://upstream-tracker.org/versions/harfbuzz.html
Resolves #103
2015-06-01 13:25:27 +02:00
Behdad Esfahbod 81bedda58c New API: hb_buffer_reverse_range() 2015-04-30 13:04:16 -04:00
Khaled Hosny 22524a514f [bindings] Fix hb_buffer_get_segment_properties
Annotate the output parameter.
2015-04-10 22:57:38 +02:00
Khaled Hosny 04f89e8f7d [bindings] Fix ownership of returned hb_language_t
It should not be freed by the caller.
2015-04-10 18:17:02 +02:00
Behdad Esfahbod 9e401f6890 Fix reverse_range() for empty range
Fixes coretext notdef loop consisting of all default_ignorable glyphs

https://code.google.com/p/chromium/issues/detail?id=464755
2015-03-20 16:08:38 -04:00
Behdad Esfahbod 8ac345e5c0 Fix reverse_range() to only reverse alt array if positions are used
In hb-coretext, when we were using scratch buffer for book-keeping,
a reverse_range() caused by the notdef-insertion loop could mess up
our log_clusters.  Ouch!
2015-03-02 16:06:55 -08:00
Behdad Esfahbod 61820bc4ca [API] Add hb_buffer_add_latin1()
This is by no ways to promote non-Unicode encodings.  This is an entry
point that takes Unicode codepoints that happen to all be the first
256 characters and hence fit in 8bit strings.  This is useful eg in Chrome
where strings that can fit in 8bit are implemented that way, and this
avoids copying into UTF-8 or UTF-16.

Perhaps we should rename this to hb_buffer_add_codepoints8().  I'm also
curious if anyone would be really interested in hb_buffer_add_codepoints16().

Please discuss!
2015-01-26 14:25:52 -08:00
Behdad Esfahbod 78c6e86c04 Fix hb_buffer_add_codepoints to actually NOT validate 2015-01-26 14:08:36 -08:00
Behdad Esfahbod b632e7997d Fix up gobject-introspection a bit
Minimal shaping works now!
2015-01-06 14:05:26 -08:00
Behdad Esfahbod b5fbc3b8f5 API: Do not clear buffer-flags in hb_buffer_clear_contents()
After 763e5466c0, one doesn't
need to set flags for different pieces of text.  The flags now
are something the client sets up once, depending on how it
actually uses the buffer.  As such, don't clear it in
clear_contents().

Tests updated.
2014-08-11 18:40:01 -04:00
Behdad Esfahbod 976c8f4552 New API: hb_buffer_[sg]et_replacement_codepoint()
With this change, we now by default replace broken UTF-8/16/32 bits
with U+FFFD.  This can be changed by calling new API on the buffer.
Previously the replacement value used to be (hb_codepoint_t)-1.

Note that hb_buffer_clear_contents() does NOT reset the replacement
character.

See discussion here:

6f13b6d62d

New API:

  hb_buffer_set_replacement_codepoint()
  hb_buffer_get_replacement_codepoint()
2014-07-16 15:34:20 -04:00
Behdad Esfahbod bcba8b4502 New API hb_buffer_add_codepoints()
Like hb_buffer_add_utf32, but doesn't do any Unicode validation.
This is like what hb_buffer_add_utf32 used to be until a couple
commits ago.
2014-07-16 14:59:04 -04:00
Behdad Esfahbod 625dbf141a [buffer] Templatize UTF-* functions 2014-07-16 14:52:59 -04:00
Behdad Esfahbod 66c6a48b6c Add HB_NO_MERGE_CLUSTERS
Disables any cluster-merging.  Added for testing purposes while
we investigate what kind of API to add for this.
2014-04-14 15:55:42 -07:00
Behdad Esfahbod 02c6c8cd6e Set buffer content type to INVALID in hb_buffer_set_length(0)
Previously we were only setting this in hb_buffer_clear_contents(),
but set_length(0) is a valid way to reinitialize buffer to use with
new text.
2013-11-15 13:07:03 -05:00
Behdad Esfahbod 061cb46493 Use long alignment for scratch buffer
Fixes last of scratch alignment warnings in hb-coretext.
2013-11-13 14:50:25 -05:00
Behdad Esfahbod 68c372ed2e More scratch-buffer cleanup 2013-11-13 14:45:43 -05:00
Behdad Esfahbod 16f175cb2e Fix scratch-buffer alignment warnings 2013-11-12 17:22:49 -05:00
Behdad Esfahbod da72042c52 [otlayout] Fix up recent Context matching change
Commit 6b65a76b40.  "end" was becoming
negative.  Was trigerred by Lohit-Kannada 2.5.3 and the sequence:
U+0CB0,U+200D,U+0CBE,U+0CB7,U+0CCD,U+0C9F,U+0CCD,U+0CB0,U+0C97,U+0CB3
Two glyphs were being duplicated.
2013-10-17 12:02:34 +02:00
Behdad Esfahbod 6b65a76b40 [otlayout] Fix (Chain)Context recursion!
Previously we only supported recursive sublookups with
ascending indices.  We were also not correctly handling
non-1-to-1 recursed lookups.

Fix all that!

Fixes the three tests in test/shaping/tests/context-matching.tests,
which were derived from NotoSansBengali and NotoSansDevanagari
among others.
2013-10-14 18:54:51 +02:00
Behdad Esfahbod 085d4291a9 [introspection] Disable constructors for now
Since our types are not associated with their methods, marking
constructors makes them inaccessible from bindings.  Undo for now.
2013-09-12 17:14:33 -04:00
Behdad Esfahbod 288f289997 [docs/introspection] More annotations 2013-09-06 17:30:54 -04:00
Behdad Esfahbod c44b81833d Whitespace 2013-09-06 15:13:16 -04:00
Behdad Esfahbod 5f512017ba [docs] Document a few symbols 2013-09-05 16:40:32 -04:00
Behdad Esfahbod 4dc798de19 Add hb-deprecated.h, and rename a couple enum values
Add deprecated alias for old name.
2013-08-27 11:46:08 -04:00
Behdad Esfahbod 6c15ddfe2b Renamed DEBUG to something else
Some infrastructures use DEBUG as a generic symbol.
2013-04-30 11:34:00 -04:00
Behdad Esfahbod 847794e929 [buffer] Implement buffer deserialization for format=text
Using a ragel machine.
2013-02-27 18:49:18 -05:00
Behdad Esfahbod d3e14aafff [buffer] Move buffer serialization code to a new file 2013-02-27 18:49:05 -05:00
Behdad Esfahbod 1172dc7362 Rename hb_buffer_clear() to hb_buffer_clear_contents()
The previous name was clashing with harfbuzz.old.  There are systems
that need to link both...

Clash-free now again.
2013-01-07 16:46:37 -06:00
Behdad Esfahbod f4abcbfc62 Minor 2012-12-21 16:48:51 -05:00
Behdad Esfahbod 8465a05a89 Fix hb_buffer_guess_segment_properties() for empty buffer
Was causing assertion failure in shape_plan().
2012-11-30 08:46:43 +02:00
Behdad Esfahbod 3f82f8ff07 Rename hb_buffer_guess_properties() to hb_buffer_guess_segment_properties() 2012-11-15 18:48:10 -08:00
Behdad Esfahbod f30641038b Bunch of independent changes (ouch)
API additions:

	hb_segment_properties_t
	HB_SEGMENT_PROPERTIES_DEFAULT
	hb_segment_properties_equal()
	hb_segment_properties_hash()

	hb_buffer_set_segment_properties()
	hb_buffer_get_segment_properties()

	hb_ot_layout_glyph_class_t

	hb_shape_plan_t
	hb_shape_plan_create()
	hb_shape_plan_create_cached()
	hb_shape_plan_get_empty()
	hb_shape_plan_reference()
	hb_shape_plan_destroy()
	hb_shape_plan_set_user_data()
	hb_shape_plan_get_user_data()
	hb_shape_plan_execute()

	hb_ot_shape_plan_collect_lookups()

API changes:

	Rename hb_ot_layout_feature_get_lookup_indexes() to
	hb_ot_layout_feature_get_lookups().

New header file:

	hb-shape-plan.h

And a bunch of prototyped but not implemented stuff.  Coming soon.
(Tests fail because of the prototypes right now.)
2012-11-15 18:48:10 -08:00
Behdad Esfahbod c54599ad26 Minor 2012-11-15 16:14:39 -08:00
Behdad Esfahbod 072ae7a982 Add hb_buffer_serialize_list_formats() 2012-11-15 13:14:12 -08:00
Behdad Esfahbod f9edf16725 Add buffer serialization / deserialization API
Two output formats for now: TEXT, and JSON.  For example:

  hb-shape --output-format=json

Deserialization API is added, but not implemented yet.
2012-11-15 13:10:07 -08:00
Behdad Esfahbod 66ac2ff32e API change: Remove "mask" from hb_buffer_add()
I don't expect anybody using hb_buffer_add(), so this shouldn't break
anyone's code.
2012-11-13 16:26:32 -08:00
Behdad Esfahbod 0c7df22228 Add buffer flags
New API:

	hb_buffer_flags_t

	HB_BUFFER_FLAGS_DEFAULT
	HB_BUFFER_FLAG_BOT
	HB_BUFFER_FLAG_EOT
	HB_BUFFER_FLAG_PRESERVE_DEFAULT_IGNORABLES

	hb_buffer_set_flags()
	hb_buffer_get_flags()

We use the BOT flag to decide whether to insert dottedcircle if the
first char in the buffer is a combining mark.

The PRESERVE_DEFAULT_IGNORABLES flag prevents removal of characters like
ZWNJ/ZWJ/...
2012-11-13 14:42:35 -08:00
Behdad Esfahbod 82ecaff736 Add hb_buffer_clear()
Which is like _reset(), but does NOT clear unicode-funcs.
2012-11-13 14:10:00 -08:00
Behdad Esfahbod da70111ab2 Don't clear buffer pre-context if no new context is being provided
Patch from Jonathan Kew.

Part of fixing:

Mozilla Bug 801410 - avoid inserting dotted-circle for run-initial
Unicode combining characters in "simple" scripts such as Latin

https://bugzilla.mozilla.org/show_bug.cgi?id=801410
2012-10-31 13:45:30 -07:00
Behdad Esfahbod 0bc7a38463 [OT] Fix ReverseChainingSubst
We should make it clear that we don't want output buffer in this case,
otherwise buffer->backtrack_len() would be wrong.
2012-10-29 22:02:45 -07:00
Behdad Esfahbod 38b015e57f Fix hb_buffer_set_length(buffer, 0)
Was causing invalid realloc()s.
2012-10-28 20:11:47 -07:00
Behdad Esfahbod 05207a79e0 [buffer] Save pre/post textual context
To be used for a variety of purposes.  We save up to five characters
in each direction.  No public API changes, everything is taken care
of already.  All clients need to do is to call hb_buffer_add_utf* with
the full text + segment info (or at least some context) instead of
just passing in the segment.

Various operations (hb_buffer_reset, hb_buffer_set_length,
hb_buffer_add*) automatically reset the relevant contexts.
2012-09-25 21:32:21 -04:00
Behdad Esfahbod 1f66c3c1a0 Add hb_utf_strlen()
Speeds up UTF-8 parsing by calling strlen().
2012-09-25 11:42:16 -04:00
Behdad Esfahbod 7f19ae7b9f [buffer] Templatize UTF handling
Also move UTF routines into a separate file, to be reused from shapers
that need it.
2012-09-25 11:23:55 -04:00
Behdad Esfahbod 0e0a4da9b7 [buffer] Towards template'izing different UTF adders 2012-09-25 11:09:04 -04:00
Behdad Esfahbod 7d37280600 Minor 2012-09-25 11:04:41 -04:00
Behdad Esfahbod 96fdc04e5c Add hb_buffer_[sg]et_content_type
And hb_buffer_content_type_t and enum values.
2012-09-06 22:30:53 -04:00
Behdad Esfahbod b85800f9de [Indic] Implement dotted-circle insertion for broken clusters
No panic, we reeally insert dotted circle when it's absolutely broken.

Fixes most of the dotted-circle cases against Uniscribe. (for Devanagari
fixes 80% of them, for Khmer 70%; the rest look like Uniscribe being
really bogus...)

I had to make a decision.  Apparently Uniscribe adds one dotted circle
to each broken character.  I tried that, but that goes wrong easily with
split matras.  So I made it add only one dotted circle to an entire
broken syllable tail.  As in: "if there was a dotted circle here, this
would have formed a correct cluster."  That works better for split
stuff, and I like it more.
2012-08-31 19:18:20 -04:00