diff --git a/docs/harfbuzz-docs.xml b/docs/harfbuzz-docs.xml
index 6c03f39a1..2c43c4687 100644
--- a/docs/harfbuzz-docs.xml
+++ b/docs/harfbuzz-docs.xml
@@ -45,6 +45,7 @@
+
diff --git a/docs/usermanual-clusters.xml b/docs/usermanual-clusters.xml
new file mode 100644
index 000000000..8b64bdee3
--- /dev/null
+++ b/docs/usermanual-clusters.xml
@@ -0,0 +1,304 @@
+
+
+ Clusters
+
+ In shaping text, a cluster is a sequence of
+ code points that needs to be treated as a single, indivisible unit.
+
+
+ When you add text to a HB buffer, each character is associated with
+ a cluster value. This is an arbitrary number as
+ far as HB is concerned.
+
+
+ Most clients will use UTF-8, UTF-16, or UTF-32 indices, but the
+ actual number does not matter. Cluster values are also not required
+ to be monotonically increasing, although nearly all of HB's tests
+ use monotonically increasing cluster numbers; the code itself makes
+ no such assumption. With that in mind, let's examine what happens
+ to cluster values during shaping under each cluster level.
+
+
+ HarfBuzz provides three levels of clustering
+ support. Level 0 is the default behavior and reproduces the behavior
+ of the old HarfBuzz library. Level 1 tweaks this behavior slightly
+ to produce better results, so level 1 clustering is recommended for
+ code that does not need to remain backward compatible with the old
+ HarfBuzz.
+
+
+ Level 2 differs significantly in how it treats cluster values.
+ Levels 0 and 1 both process ligatures and glyph decomposition by
+ merging clusters; level 2 does not.
+
+
+ The conceptual model for what the cluster values mean, in levels 0
+ and 1, is this:
+
+
+
+
+ the sequence of cluster values will always remain monotone
+
+
+
+
+ each value represents a single cluster
+
+
+
+
+ each cluster contains one or more glyphs and one or more
+ characters
+
+
+
+
+ Assuming that initial cluster numbers were monotonically increasing
+ and distinct, then all adjacent glyphs having the same cluster
+ number belong to the same cluster, and all characters belong to the
+ cluster that has the highest number not larger than their initial
+ cluster number. This will become clearer with an example.
+
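+ As a minimal sketch of the last rule (not HarfBuzz code; the
+ function name and the sample values are hypothetical), the
+ character-to-cluster lookup can be written as a search for the
+ highest final cluster value that does not exceed the character's
+ initial value. It assumes the initial values were distinct and
+ monotonically increasing:
+
```python
from bisect import bisect_right


def cluster_of_each_char(char_clusters, glyph_clusters):
    """Map each character's initial cluster value to its final cluster.

    Assumption: the initial values were distinct and monotonically
    increasing, so the distinct values in glyph_clusters identify clusters.
    """
    final = sorted(set(glyph_clusters))
    result = []
    for value in char_clusters:
        # Highest final cluster value that is not larger than `value`.
        index = bisect_right(final, value) - 1
        if index < 0:
            raise ValueError("no cluster at or below %d" % value)
        result.append(final[index])
    return result


chars = [0, 1, 2, 3, 4]      # initial cluster values for A,B,C,D,E
glyphs = [0, 1, 1, 1, 1, 4]  # hypothetical cluster values after shaping
print(cluster_of_each_char(chars, glyphs))
```
+
+ With these sample values, the characters that started at 1, 2, and
+ 3 all resolve to cluster 1.
+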
+
+
+ A clustering example for levels 0 and 1
+
+ Let's say we start with the following character sequence and cluster
+ values:
+
+
+ A,B,C,D,E
+ 0,1,2,3,4
+
+
+ We then map the characters to glyphs. For simplicity, let's assume
+ that each character maps to the corresponding, identical-looking
+ glyph:
+
+
+ A,B,C,D,E
+ 0,1,2,3,4
+
+
+ Now if, for example, B and C
+ ligate, then the clusters to which they belong "merge".
+ This merged cluster takes for its cluster number the minimum of all
+ the cluster numbers of the clusters that went in. In this case, we
+ get:
+
+
+ A,BC,D,E
+ 0,1 ,3,4
+
+
+ Now let's assume that the BC glyph decomposes
+ into three components, and D also decomposes into
+ two. The components each inherit the cluster value of their parent:
+
+
+ A,BC0,BC1,BC2,D0,D1,E
+ 0,1  ,1  ,1  ,3 ,3 ,4
+
+
+ Now if BC2 and D0 ligate, then
+ their clusters (numbers 1 and 3) merge into
+ min(1,3) = 1:
+
+
+ A,BC0,BC1,BC2D0,D1,E
+ 0,1  ,1  ,1    ,1 ,4
+
+
+ At this point, cluster 1 means: the character sequence
+ BCD is represented by glyphs
+ BC0,BC1,BC2D0,D1 and cannot be broken down any
+ further.
+
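+ The steps of this example can be replayed with a small simulation.
+ This is only a sketch under the assumptions stated above (distinct,
+ monotonically increasing initial values); the helper names
+ ligate and decompose are
+ hypothetical and are not part of the HarfBuzz API:
+
```python
def ligate(glyphs, start, end, name):
    """Replace glyphs[start:end] with one glyph and merge their clusters.

    The merged cluster takes the minimum value, and every glyph that
    belonged to one of the merged clusters is renumbered to it.
    """
    merged = {cluster for _, cluster in glyphs[start:end]}
    lowest = min(merged)
    out = glyphs[:start] + [(name, lowest)] + glyphs[end:]
    return [(g, lowest if c in merged else c) for g, c in out]


def decompose(glyphs, index, parts):
    """Replace one glyph by several; each part inherits the parent's cluster."""
    _, cluster = glyphs[index]
    return glyphs[:index] + [(p, cluster) for p in parts] + glyphs[index + 1:]


glyphs = [("A", 0), ("B", 1), ("C", 2), ("D", 3), ("E", 4)]
glyphs = ligate(glyphs, 1, 3, "BC")                    # B and C ligate
glyphs = decompose(glyphs, 1, ["BC0", "BC1", "BC2"])   # BC decomposes
glyphs = decompose(glyphs, 4, ["D0", "D1"])            # D decomposes
glyphs = ligate(glyphs, 3, 5, "BC2D0")                 # BC2 and D0 ligate
print(glyphs)
```
+
+ Note that D1 is renumbered as well, because the
+ whole of cluster 3 merges into cluster 1, not only the glyph that
+ took part in the ligature.
+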
+
+
+ Reordering in levels 0 and 1
+
+ Another common operation in the more complex shapers is glyph
+ reordering. In those cases, to maintain monotone clusters, HB merges
+ the clusters of everything in the reordering sequence. For example,
+ let's again start with the character sequence:
+
+
+ A,B,C,D,E
+ 0,1,2,3,4
+
+
+ If D is reordered before B,
+ then the B, C, and
+ D clusters merge, and we get:
+
+
+ A,D,B,C,E
+ 0,1,1,1,4
+
+
+ This is clearly not ideal, but it is the only sensible way to
+ maintain monotone indices and retain the true relationship between
+ glyphs and characters.
+
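+ A rough sketch of this merge follows, assuming a single glyph moves
+ backward and that everything between its old and new position counts
+ as the reordering sequence. The reorder helper is
+ hypothetical, not a HarfBuzz function:
+
```python
def reorder(glyphs, src, dst):
    """Move glyphs[src] to position dst (dst < src), merging the span.

    Every glyph between the old and the new position, inclusive, ends up
    in one cluster that carries the minimum cluster value of the span.
    """
    span = glyphs[dst:src + 1]
    lowest = min(cluster for _, cluster in span)
    moved = [span[-1]] + span[:-1]
    merged = [(g, lowest) for g, _ in moved]
    return glyphs[:dst] + merged + glyphs[src + 1:]


glyphs = [("A", 0), ("B", 1), ("C", 2), ("D", 3), ("E", 4)]
print(reorder(glyphs, 3, 1))  # D moves before B
```
+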
+
+
+ The distinction between levels 0 and 1
+
+ The above describes what cluster levels 0 and 1 both do. The
+ only difference between the two is this: in level 0, at the very
+ beginning of the shaping process, we also merge clusters between
+ base characters and all Unicode marks (combining or not) following
+ them. E.g.:
+
+
+ A,acute,B
+ 0,1    ,2
+
+
+ will become:
+
+
+ A,acute,B
+ 0,0    ,2
+
+
+ This is the default behavior. We do it because Windows did it and
+ old HarfBuzz did it, so this remained the default. But this behavior
+ makes it impossible to color diacritic marks differently from their
+ base characters. That's why in level 1 we do not perform this
+ initial merging step.
+
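+ The initial merging step of level 0 can be approximated as follows.
+ This sketch assumes that "mark" means any code point whose Unicode
+ general category starts with M, and that the
+ initial values are monotonically increasing; the function name is
+ hypothetical and the real implementation may differ in detail:
+
```python
import unicodedata


def merge_marks_into_bases(text, clusters):
    """Give every mark the cluster value of the character before it."""
    out = list(clusters)
    for i in range(1, len(text)):
        # General categories Mn, Mc and Me are the Unicode marks.
        if unicodedata.category(text[i]).startswith("M"):
            out[i] = out[i - 1]
    return out


text = "A\u0301B"  # A, combining acute accent, B
print(merge_marks_into_bases(text, [0, 1, 2]))
```
+
+ Under level 1 this step is simply skipped, so the acute accent keeps
+ its own cluster value.
+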
+
+ For clients, level 0 is more convenient if they rely on HarfBuzz
+ clusters for cursor positioning. But that's wrong anyway: cursor
+ positions should be determined based on Unicode grapheme boundaries,
+ NOT shaping clusters. As such, level 1 clusters are preferred.
+
+
+ One last note about levels 0 and 1. We currently don't allow a
+ MultipleSubst lookup to replace a glyph with zero
+ glyphs (i.e., to delete a glyph). But in some other situations,
+ glyphs can be deleted. In those cases, if the glyph being deleted is
+ the last glyph of its cluster, we make sure to merge the cluster
+ with a neighboring cluster.
+
+
+ This is done primarily to guarantee that the first cluster of a run
+ always carries the cluster value that points to the start of the
+ text for that run; more than one client currently relies on this
+ guarantee.
+
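+ The effect of this rule can be sketched as below. This is a
+ simplification rather than the library's actual logic: it assumes
+ left-to-right, monotonically increasing cluster values, and it
+ assumes the previous neighbor is preferred when there is one. The
+ delete_glyph helper here is hypothetical:
+
```python
def delete_glyph(glyphs, index):
    """Delete glyphs[index] without losing its cluster value.

    If the deleted glyph was the last one in its cluster, that cluster is
    merged with a neighboring cluster, which takes the minimum value.
    """
    _, cluster = glyphs[index]
    rest = glyphs[:index] + glyphs[index + 1:]
    if not rest or any(c == cluster for _, c in rest):
        return rest  # the cluster survives (or nothing is left)
    # Assumption: merge with the previous cluster if any, else the next.
    neighbor = rest[index - 1][1] if index > 0 else rest[0][1]
    merged = min(cluster, neighbor)
    return [(g, merged if c == neighbor else c) for g, c in rest]


glyphs = [("A", 0), ("B", 1), ("C", 2)]
print(delete_glyph(glyphs, 0))  # B inherits cluster 0
```
+
+ Deleting the first glyph leaves B with cluster
+ value 0, so the first cluster still points to the start of the text.
+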
+
+ Incidentally, Apple's CoreText does something else to maintain the
+ same promise: it inserts a glyph with id 65535 at the beginning of
+ the glyph string if the glyph corresponding to the first character
+ in the run was deleted. HarfBuzz might do something similar in the
+ future.
+
+
+
+ Level 2
+
+ Level 2 is a different beast from levels 0 and 1. It is simple to
+ describe, but hard to make sense of. It simply doesn't do any
+ cluster merging whatsoever. When glyphs ligate, or when multiple
+ glyphs otherwise turn into one, the cluster value of the first
+ glyph is retained.
+
+
+ Here are a few examples of why processing cluster values produced at
+ this level might be tricky:
+
+
+ Ligatures with combining marks
+
+ Imagine capital letters are bases and lower case letters are
+ combining marks. With an input sequence like this:
+
+
+ A,a,B,b,C,c
+ 0,1,2,3,4,5
+
+
+ if A,B,C ligate, then here are the cluster
+ values one would get under the various levels:
+
+
+ level 0:
+
+
+ ABC,a,b,c
+ 0  ,0,0,0
+
+
+ level 1:
+
+
+ ABC,a,b,c
+ 0  ,0,0,5
+
+
+ level 2:
+
+
+ ABC,a,b,c
+ 0  ,1,3,5
+
+
+ Making sense of the level 2 result is the hardest for a client,
+ because there is nothing in the cluster values to suggest that
+ B and C ligated with
+ A.
+
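+ The three results can be reproduced with a toy model of this one
+ input. It is a sketch only: the shape function
+ is hypothetical, the ligature is hard-coded, and the merging rules
+ are the simplified ones described in this chapter rather than
+ HarfBuzz's actual code:
+
```python
def shape(level):
    """Cluster values for A,a,B,b,C,c when A, B and C ligate."""
    glyphs = ["A", "a", "B", "b", "C", "c"]
    clusters = [0, 1, 2, 3, 4, 5]
    marks = {"a", "b", "c"}

    if level == 0:
        # Level 0 only: merge each mark into the preceding base first.
        for i in range(1, len(glyphs)):
            if glyphs[i] in marks:
                clusters[i] = clusters[i - 1]

    if level in (0, 1):
        # Merge every cluster touched by the span from A through C.
        affected = set(clusters[0:5])
        lowest = min(affected)
        clusters = [lowest if c in affected else c for c in clusters]

    # The ligature keeps the first component's value; the marks follow it.
    return [("ABC", clusters[0])] + [(glyphs[i], clusters[i]) for i in (1, 3, 5)]


for level in (0, 1, 2):
    print(level, shape(level))
```
+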
+
+
+ Reordering
+
+ Another tricky case is reordering. Under level 2, we start with:
+
+
+ A,B,C,D,E
+ 0,1,2,3,4
+
+
+ Now imagine D moves before
+ B:
+
+
+ A,D,B,C,E
+ 0,3,1,2,4
+
+
+ Now, if D ligates with B, we
+ get:
+
+
+ A,DB,C,E
+ 0,3 ,2,4
+
+
+ In a different scenario, A and
+ B could have ligated
+ before D reordered; that
+ would have resulted in:
+
+
+ AB,D,C,E
+ 0 ,3,2,4
+
+
+ There's no way to differentiate between these two scenarios based
+ on the cluster numbers alone.
+
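+ To make the ambiguity concrete, here is a small sketch that replays
+ both scenarios with level 2 rules (no merging; a ligature keeps the
+ cluster value of its first glyph). The helpers
+ move and ligate_level2 are
+ hypothetical names used only for illustration:
+
```python
def move(glyphs, src, dst):
    """Move one glyph to a new position; cluster values are untouched."""
    out = list(glyphs)
    out.insert(dst, out.pop(src))
    return out


def ligate_level2(glyphs, start, end):
    """Ligate glyphs[start:end]; keep the first glyph's cluster value."""
    name = "".join(g for g, _ in glyphs[start:end])
    return glyphs[:start] + [(name, glyphs[start][1])] + glyphs[end:]


start = [("A", 0), ("B", 1), ("C", 2), ("D", 3), ("E", 4)]

# Scenario 1: D moves before B, then D and B ligate.
first = ligate_level2(move(start, 3, 1), 1, 3)
# Scenario 2: A and B ligate, then D moves before C.
second = move(ligate_level2(start, 0, 2), 2, 1)

print([c for _, c in first])
print([c for _, c in second])
```
+
+ Both runs yield the same sequence of cluster values even though the
+ glyphs differ, which is why a client cannot tell them apart from
+ the cluster numbers.
+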
+
+ Another problem happens with ligatures under level 2 if the
+ direction of the text is forced to the opposite of its natural
+ direction (e.g. left-to-right Arabic). But that's too much of a
+ corner case to worry about.
+
+
+
+