summaryrefslogtreecommitdiff
path: root/doc/context/sources/general/manuals/evenmore/evenmore-normalization.tex
diff options
context:
space:
mode:
Diffstat (limited to 'doc/context/sources/general/manuals/evenmore/evenmore-normalization.tex')
-rw-r--r--doc/context/sources/general/manuals/evenmore/evenmore-normalization.tex239
1 files changed, 239 insertions, 0 deletions
diff --git a/doc/context/sources/general/manuals/evenmore/evenmore-normalization.tex b/doc/context/sources/general/manuals/evenmore/evenmore-normalization.tex
new file mode 100644
index 000000000..36d4390aa
--- /dev/null
+++ b/doc/context/sources/general/manuals/evenmore/evenmore-normalization.tex
@@ -0,0 +1,239 @@
+% language=us
+
+% \enabletrackers[nodes.directions]
+
+\environment evenmore-style
+
+\startcomponent evenmore-normalization
+
+\startchapter[title=Normalization]
+
+What I describe here was long due. I delayed it because when enabled it had best
+also be used and I need to (check and) adapt some code to it in order to profit
+from it. So, if used at all, it will take some time to have an effect on the
+\CONTEXT\ code base. But first some background information.
+
+When \TEX\ builds a paragraph it splits the current text stream (that makes up
+the paragraph) into lines where each line becomes an horizontal box. In \LUATEX,
+this process is split into distinctive steps, contrary to regular \TEX\ where the
+splitting is combined with hyphenation, ligature construction and font kerning.
+But what all engines have in common is that after the decision is made about what
+a line is, the result gets packages into the horizontal box.
+
+The decision making is influenced by quite some factors, like:
+
+\startitemize[packed]
+\startitem
+ The indentation of the first line, driven by the presence of a box of
+ with a certain width and no height and depth (its always there, also when
+ the indentation is zero).
+\stopitem
+\startitem
+ Hanging indentation, which can happen at each corner of the paragraph, or
+ alternatively a specific parshape.
+\stopitem
+\startitem
+ Left and|/|or right margins, aka left skip and right skip. A right skip is
+ always present, even when zero.
+\stopitem
+\startitem
+ The way the last line gets aligns, aka parfill skip.
+\stopitem
+\startitem
+ Directional changes that need to be carries over to the next line.
+\stopitem
+\startitem
+ Optional protrusion of characters to the left of right of the line, something
+ that is sensitive for directional changes.
+\stopitem
+\startitem
+ Expansion of characters in order to get better inter|-|word spacing and|/|or
+ to prevent lines being too bad. There can be stretch as well as shrink but
+ on a per line basis. Inter|-|character kerns can also get that treatment.
+\stopitem
+\startitem
+ The penalties associated to hyphenation: the pre|-|last line, the last two
+ lines, a list of penalties (\ETEX), specific penalties bound to hyphenation
+ pints (\LUATEX).
+\stopitem
+\startitem
+ The wish to have more or less lines than optimal, aka looseness. I have to
+ admit that I never use that feature.
+\stopitem
+\stopitemize
+
+In traditional \TEX\ it doesn't really matter how the resulting boxes look like,
+as long as the following steps can handle them, and those steps don't look into
+those boxes. In fact, unless you unpack a box, only the backend deals with the
+content. But in \LUATEX\ we have callbacks that hook into several stages and {\em
+can} look into the constructed boxes. In \LUATEX\ these boxes also have embedded
+directional information (needed by the backend) and (although that is seldom
+used) left or right boxed material, a features inherited from \ALEPH|/|\OMEGA.
+And when messing around with the content of boxes one has to know what can be
+seen there. In principle the code can be reorganized a it but adding additional
+functionality is not that trivial because we want to stay close to the original
+implementation, even if it has been messed up a bit by successive additions.
+Eventually I might give it a try to integrate all these features a bit better,
+but on the other hand: it works.
+
+\starttexdefinition Sample #1#2
+ \startluacode
+ document.normalizestate = nodes.getnormalizeline()
+ nodes.setnormalizeline(#1)
+ \stopluacode
+ \startsubsubject[title={normalization #1, #2}]
+ \typebuffer[#2]
+ \startlinecorrection
+ \forgetall
+ \start
+ \setupalign[verytolerant,stretch]
+ \showmakeup[line,hbox,vbox,glue]
+ \vbox{\getbuffer[#2]\samplefile{sapolsky}}
+ \stop
+ \par
+ \stoplinecorrection
+ \stopsubject
+ \startluacode
+ nodes.setnormalizeline(document.normalizestate)
+ \stopluacode
+\stoptexdefinition
+
+\startbuffer[sample-1]
+ \parindent = 20pt
+ \leftskip = 40pt
+ \rightskip = 50pt
+ \hangindent = 0pt
+ \hangafter = 0
+\stopbuffer
+
+\startbuffer[sample-2]
+ \parindent = 0pt
+ \leftskip = 0pt
+ \rightskip = 0pt
+ \hangindent =-20pt
+ \hangafter = -3
+\stopbuffer
+
+\startbuffer[sample-3]
+ \parindent = 0pt
+ \leftskip = 0pt
+ \rightskip = 0pt
+ \hangindent = 20pt
+ \hangafter = 3
+\stopbuffer
+
+\startbuffer[sample-4]
+ \parindent = 0pt
+ \leftskip = 10pt
+ \rightskip = 30pt
+ \hangindent = 20pt
+ \hangafter = 3
+\stopbuffer
+
+In the next examples we show how the result of typesetting a paragraph looks
+like. We use the Sapolsky quote from the distribution. The cyan glue nodes are
+the left and right skip nodes, and the gray one at the end of the last line
+represents the parfill skip. The magenta ones at the edge are baseline skips. An
+indentation is shown in gray too. As experiment we have four normalization levels
+but in the end only the highest level makes sense, simply because normalization
+makes no sense unless one consistently normalizes all. We just keep the
+granularity because it makes it possible to explain what gets done.
+
+\texdefinition{Sample}{0}{sample-1}
+\texdefinition{Sample}{0}{sample-2}
+\texdefinition{Sample}{0}{sample-3}
+\texdefinition{Sample}{0}{sample-4}
+
+You might have noticed that the right skip is always there but the left skip is
+absent when it is zero. As said, as long as the result is okay, it does not
+really matter. But \unknown\ in \LUATEX\ (and therefore \CONTEXT) it can have
+consequences because there we can kick in a callback that does something with
+lines. Such a callback often has to deal with these specific glues and them being
+optional makes for more testing. The more predictable the order is, the better.
+Although we can easily normalize lines (in a callback) to always have a left skip
+too it is also an option in the engine.
+
+\texdefinition{Sample}{1}{sample-1}
+\texdefinition{Sample}{1}{sample-2}
+\texdefinition{Sample}{1}{sample-3}
+\texdefinition{Sample}{1}{sample-4}
+
+In the previous examples there are always left skips as well as right skips. It
+makes no sense to have an option to omit both zero left and right skips, because
+that again is unpredictable. But we can go further.
+
+\texdefinition{Sample}{2}{sample-1}
+\texdefinition{Sample}{2}{sample-2}
+\texdefinition{Sample}{2}{sample-3}
+\texdefinition{Sample}{2}{sample-4}
+
+In these examples the indentation has been turned into a glue as well (actually
+it is more a kern but using a glue makes more sense). The hanging indentation
+however is not seen here: it is not represented by glue but instead sort of
+hidden in the width of the box and a shift of its content.
+
+\texdefinition{Sample}{3}{sample-1}
+\texdefinition{Sample}{3}{sample-2}
+\texdefinition{Sample}{3}{sample-3}
+\texdefinition{Sample}{3}{sample-4}
+
+In the previous examples the hanging indentation is turned into left and right
+hang skips. These cannot be set at the \TEX\ end, but are injected when we
+instruct the normalizer to do so.
+
+\texdefinition{Sample}{4}{sample-1}
+\texdefinition{Sample}{4}{sample-2}
+\texdefinition{Sample}{4}{sample-3}
+\texdefinition{Sample}{4}{sample-4}
+
+The previous examples differ from the previous set in that they push these hang
+related glue nodes before the left and after the right skip. As I couldn't make
+up my mind yet, I let \LUAMETATEX\ just provide both variants.
+
+The option to keep hang related information explicitly in the line has some
+consequences. First of all, we now have glue and not some shift|/|width
+combination. Second, we have introduced an incompatibility: the lines now always
+have the proper width. You might have noticed that but we can show it more
+explicitly. We use two parameter sets
+
+\startbuffer[sample-5]
+ \hangindent = 20pt
+ \hangafter = 0
+\stopbuffer
+
+\startbuffer[sample-6]
+ \hangindent =-20pt
+ \hangafter = 0
+\stopbuffer
+
+\Sample{0}{sample-5}
+\Sample{4}{sample-5}
+
+\Sample{0}{sample-6}
+\Sample{4}{sample-6}
+
+A not yet mention part of the normalization is that, because they are no longer
+of relevance, the special local par nodes have been removed. The one that starts
+a paragraph is turned into a normal directional node if needed, so that we get
+properly balanced pairs of directional nodes. It must been said that the code
+that does all this is a bit of a mess. We want to stay close to the original
+code, but we also need to deal with all these extensions, like directions,
+protrusion, extra boxes, etc.
+
+Not shown here is that there is a fifth mode of operation. When we enable that
+level an overfull box will get a correction skip so that the right skip etc are
+properly aligned. How useful this is: we'll see.
+
+Now, when I decide to keep this feature, which can be set at the \LUA\ end to do
+the previously mentioned tasks, depending on a feature level ranging from zero to
+four, I also need to check the impact on existing \CONTEXT\ code, which
+(currently) is complicated by the fact that most is shared between \MKIV\ and
+\LMTX, and only \LUAMETATEX\ has this normalization feature. I will probably
+enable it for a while locally in order to see if there are side effects. Then,
+when the code base gets adapted, we have to assume that normalization happens, so
+there is no way back.
+
+\stopchapter
+
+\stopcomponent
+