summaryrefslogtreecommitdiff
path: root/doc/context/sources/general/manuals/evenmore/evenmore-normalization.tex
blob: 36d4390aaff1697f11db0c56451f12bfce8c6bf5 (plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
% language=us

% \enabletrackers[nodes.directions]

\environment evenmore-style

\startcomponent evenmore-normalization

\startchapter[title=Normalization]

What I describe here was long due. I delayed it because when enabled it had best
also be used and I need to (check and) adapt some code to it in order to profit
from it. So, if used at all, it will take some time to have an effect on the
\CONTEXT\ code base. But first some background information.

When \TEX\ builds a paragraph it splits the current text stream (that makes up
the paragraph) into lines where each line becomes an horizontal box. In \LUATEX,
this process is split into distinctive steps, contrary to regular \TEX\ where the
splitting is combined with hyphenation, ligature construction and font kerning.
But what all engines have in common is that after the decision is made about what
a line is, the result gets packages into the horizontal box.

The decision making is influenced by quite some factors, like:

\startitemize[packed]
\startitem
    The indentation of the first line, driven by the presence of a box of
    with a certain width and no height and depth (its always there, also when
    the indentation is zero).
\stopitem
\startitem
    Hanging indentation, which can happen at each corner of the paragraph, or
    alternatively a specific parshape.
\stopitem
\startitem
    Left and|/|or right margins, aka left skip and right skip. A right skip is
    always present, even when zero.
\stopitem
\startitem
    The way the last line gets aligns, aka parfill skip.
\stopitem
\startitem
    Directional changes that need to be carries over to the next line.
\stopitem
\startitem
    Optional protrusion of characters to the left of right of the line, something
    that is sensitive for directional changes.
\stopitem
\startitem
    Expansion of characters in order to get better inter|-|word spacing and|/|or
    to prevent lines being too bad. There can be stretch as well as shrink but
    on a per line basis. Inter|-|character kerns can also get that treatment.
\stopitem
\startitem
    The penalties associated to hyphenation: the pre|-|last line, the last two
    lines, a list of penalties (\ETEX), specific penalties bound to hyphenation
    pints (\LUATEX).
\stopitem
\startitem
    The wish to have more or less lines than optimal, aka looseness. I have to
    admit that I never use that feature.
\stopitem
\stopitemize

In traditional \TEX\ it doesn't really matter how the resulting boxes look like,
as long as the following steps can handle them, and those steps don't look into
those boxes. In fact, unless you unpack a box, only the backend deals with the
content. But in \LUATEX\ we have callbacks that hook into several stages and {\em
can} look into the constructed boxes. In \LUATEX\ these boxes also have embedded
directional information (needed by the backend) and (although that is seldom
used) left or right boxed material, a features inherited from \ALEPH|/|\OMEGA.
And when messing around with the content of boxes one has to know what can be
seen there. In principle the code can be reorganized a it but adding additional
functionality is not that trivial because we want to stay close to the original
implementation, even if it has been messed up a bit by successive additions.
Eventually I might give it a try to integrate all these features a bit better,
but on the other hand: it works.

\starttexdefinition Sample #1#2
    \startluacode
        document.normalizestate = nodes.getnormalizeline()
        nodes.setnormalizeline(#1)
    \stopluacode
    \startsubsubject[title={normalization #1, #2}]
        \typebuffer[#2]
        \startlinecorrection
            \forgetall
            \start
                \setupalign[verytolerant,stretch]
                \showmakeup[line,hbox,vbox,glue]
                \vbox{\getbuffer[#2]\samplefile{sapolsky}}
            \stop
            \par
        \stoplinecorrection
    \stopsubject
    \startluacode
        nodes.setnormalizeline(document.normalizestate)
    \stopluacode
\stoptexdefinition

\startbuffer[sample-1]
    \parindent  = 20pt
    \leftskip   = 40pt
    \rightskip  = 50pt
    \hangindent =  0pt
    \hangafter  =  0
\stopbuffer

\startbuffer[sample-2]
    \parindent  =  0pt
    \leftskip   =  0pt
    \rightskip  =  0pt
    \hangindent =-20pt
    \hangafter  = -3
\stopbuffer

\startbuffer[sample-3]
    \parindent  =  0pt
    \leftskip   =  0pt
    \rightskip  =  0pt
    \hangindent = 20pt
    \hangafter  =  3
\stopbuffer

\startbuffer[sample-4]
    \parindent  =  0pt
    \leftskip   = 10pt
    \rightskip  = 30pt
    \hangindent = 20pt
    \hangafter  =  3
\stopbuffer

In the next examples we show how the result of typesetting a paragraph looks
like. We use the Sapolsky quote from the distribution. The cyan glue nodes are
the left and right skip nodes, and the gray one at the end of the last line
represents the parfill skip. The magenta ones at the edge are baseline skips. An
indentation is shown in gray too. As experiment we have four normalization levels
but in the end only the highest level makes sense, simply because normalization
makes no sense unless one consistently normalizes all. We just keep the
granularity because it makes it possible to explain what gets done.

\texdefinition{Sample}{0}{sample-1}
\texdefinition{Sample}{0}{sample-2}
\texdefinition{Sample}{0}{sample-3}
\texdefinition{Sample}{0}{sample-4}

You might have noticed that the right skip is always there but the left skip is
absent when it is zero. As said, as long as the result is okay, it does not
really matter. But \unknown\ in \LUATEX\ (and therefore \CONTEXT) it can have
consequences because there we can kick in a callback that does something with
lines. Such a callback often has to deal with these specific glues and them being
optional makes for more testing. The more predictable the order is, the better.
Although we can easily normalize lines (in a callback) to always have a left skip
too it is also an option in the engine.

\texdefinition{Sample}{1}{sample-1}
\texdefinition{Sample}{1}{sample-2}
\texdefinition{Sample}{1}{sample-3}
\texdefinition{Sample}{1}{sample-4}

In the previous examples there are always left skips as well as right skips. It
makes no sense to have an option to omit both zero left and right skips, because
that again is unpredictable. But we can go further.

\texdefinition{Sample}{2}{sample-1}
\texdefinition{Sample}{2}{sample-2}
\texdefinition{Sample}{2}{sample-3}
\texdefinition{Sample}{2}{sample-4}

In these examples the indentation has been turned into a glue as well (actually
it is more a kern but using a glue makes more sense). The hanging indentation
however is not seen here: it is not represented by glue but instead sort of
hidden in the width of the box and a shift of its content.

\texdefinition{Sample}{3}{sample-1}
\texdefinition{Sample}{3}{sample-2}
\texdefinition{Sample}{3}{sample-3}
\texdefinition{Sample}{3}{sample-4}

In the previous examples the hanging indentation is turned into left and right
hang skips. These cannot be set at the \TEX\ end, but are injected when we
instruct the normalizer to do so.

\texdefinition{Sample}{4}{sample-1}
\texdefinition{Sample}{4}{sample-2}
\texdefinition{Sample}{4}{sample-3}
\texdefinition{Sample}{4}{sample-4}

The previous examples differ from the previous set in that they push these hang
related glue nodes before the left and after the right skip. As I couldn't make
up my mind yet, I let \LUAMETATEX\ just provide both variants.

The option to keep hang related information explicitly in the line has some
consequences. First of all, we now have glue and not some shift|/|width
combination. Second, we have introduced an incompatibility: the lines now always
have the proper width. You might have noticed that but we can show it more
explicitly. We use two parameter sets

\startbuffer[sample-5]
    \hangindent = 20pt
    \hangafter  =  0
\stopbuffer

\startbuffer[sample-6]
    \hangindent =-20pt
    \hangafter  =  0
\stopbuffer

\Sample{0}{sample-5}
\Sample{4}{sample-5}

\Sample{0}{sample-6}
\Sample{4}{sample-6}

A not yet mention part of the normalization is that, because they are no longer
of relevance, the special local par nodes have been removed. The one that starts
a paragraph is turned into a normal directional node if needed, so that we get
properly balanced pairs of directional nodes. It must been said that the code
that does all this is a bit of a mess. We want to stay close to the original
code, but we also need to deal with all these extensions, like directions,
protrusion, extra boxes, etc.

Not shown here is that there is a fifth mode of operation. When we enable that
level an overfull box will get a correction skip so that the right skip etc are
properly aligned. How useful this is: we'll see.

Now, when I decide to keep this feature, which can be set at the \LUA\ end to do
the previously mentioned tasks, depending on a feature level ranging from zero to
four, I also need to check the impact on existing \CONTEXT\ code, which
(currently) is complicated by the fact that most is shared between \MKIV\ and
\LMTX, and only \LUAMETATEX\ has this normalization feature. I will probably
enable it for a while locally in order to see if there are side effects. Then,
when the code base gets adapted, we have to assume that normalization happens, so
there is no way back.

\stopchapter

\stopcomponent