author    Hans Hagen <pragma@wxs.nl>  2018-06-11 13:44:10 +0200
committer Context Git Mirror Bot <phg42.2a@gmail.com>  2018-06-11 13:44:10 +0200
commit    7686a24f79edfef2a9d013882c822c76a12e23dc (patch)
tree      64cfb02179b872eca3f6d9f02c317b0b1823f046 /doc/context/sources
parent    bd8f4d00a5ba1af56451821cd1db1c12c22f5419 (diff)
download  context-7686a24f79edfef2a9d013882c822c76a12e23dc.tar.gz
2018-06-11 12:12:00
Diffstat (limited to 'doc/context/sources')
-rw-r--r--  doc/context/sources/general/manuals/onandon/onandon-fences.tex    499
-rw-r--r--  doc/context/sources/general/manuals/onandon/onandon-runtoks.tex   531
-rw-r--r--  doc/context/sources/general/manuals/onandon/onandon.tex             1
3 files changed, 1031 insertions, 0 deletions
diff --git a/doc/context/sources/general/manuals/onandon/onandon-fences.tex b/doc/context/sources/general/manuals/onandon/onandon-fences.tex
new file mode 100644
index 000000000..133b9bfeb
--- /dev/null
+++ b/doc/context/sources/general/manuals/onandon/onandon-fences.tex
@@ -0,0 +1,499 @@
+% language=uk
+
+\startcomponent onandon-fences
+
+\environment onandon-environment
+
+% avoid context defaults:
+%
+% \mathitalicsmode \plusone % default in context
+% \mathdelimitersmode\plusseven % optional in context
+
+\def\UseMode#1{\appendtoks\mathdelimitersmode#1\to\everymathematics}
+
+\startchapter[title={Tricky fences}]
+
+Occasionally one of my colleagues notices some suboptimal rendering and asks me
+to have a look at it. Now, one can argue about \quotation {what is right} and
+indeed there is not always a best answer to it. Such questions can even be a
+nuisance; let's think of the following scenario. You have a project where \TEX\
+is practically the only solution. Let it be an \XML\ rendering project, which
+means that there are some boundary conditions. Speaking in 2017 we find that in
+most cases a project starts out with the assumption that everything is possible.
+
+Often such a project starts with a folio in mind and therefore with decent
+tagging to match the educational and esthetic design. When rendering is mostly
+automatic and involves too many variants to check them all, some safeguards are used
+(an example will be given below). Then different authors, editors and designers
+come into play and their expectations, also about what is best, often conflict.
+Add to that rendering for the web, and devices and additional limitations show
+up: features get dropped and even more cases need to be compensated (the quality
+rules for paper are often much higher). But, all that defeats the earlier
+attempts to do well because suddenly it has to match the lesser format. This in
+turn makes investing in improving rendering very inefficient (read: a bottomless
+pit because it never gets paid and there is no way to gain back the investment).
+Quite often it is spacing that triggers discussions and questions about what
+rendering is best. And inconsistency dominates these questions.
+
+So, in case you wonder why I bother with subtle aspects of rendering as discussed
+below, the answer is that it is not so much professional demand but users (like
+my colleagues or those on the mailing lists) that make me look into it and often
+something that looks trivial takes days to sort out (even for someone who knows
+his way around the macro language, fonts and the inner working of the engine).
+And one can be sure that more cases will pop up.
+
+All this being said, let's move on to a recent example. In \CONTEXT\ we support
+\MATHML\ although in practice we're forced to a mix of that standard and
+\ASCIIMATH. When we're lucky, we even get a mix with good old \TEX-encoded math.
+One problem with an automated flow and processing (other than raw \TEX) is that
+one can get anything and therefore we need to play safe. This means for instance
+that you can get input like this:
+
+\starttyping
+f(x) + f(1/x)
+\stoptyping
+
+or in more structured \TEX\ speak:
+
+\startbuffer
+$f(x) + f(\frac{1}{x})$
+\stopbuffer
+
+\typebuffer
+
+Using \TEX\ Gyre Pagella, this renders as: {\UseMode\zerocount\inlinebuffer}, and
+when seeing this a \TEX\ user will revert to:
+
+\startbuffer
+$f(x) + f\left(\frac{1}{x}\right)$
+\stopbuffer
+
+\typebuffer
+
+which gives: {\UseMode\zerocount \inlinebuffer}. So, in order to be robust we can
+always use the \type {\left} and \type {\right} commands, can't we?
+
+\startbuffer
+$f(x) + f\left(x\right)$
+\stopbuffer
+
+\typebuffer
+
+which gives {\UseMode\zerocount \inlinebuffer}, but let's blow up this result a
+bit showing some additional tracing from left to right, now in Latin Modern:
+
+\startbuffer[blownup]
+\startcombination[nx=3,ny=2,after=\vskip3mm]
+ {\scale[scale=4000]{\hbox{$f(x)$}}}
+ {just characters}
+ {\scale[scale=4000]{\ruledhbox{\showglyphs \showfontkerns \showfontitalics$f(x)$}}}
+ {just characters}
+ {\scale[scale=4000]{\ruledhbox{\showglyphs \showfontkerns \showfontitalics \showmakeup$f(x)$}}}
+ {just characters}
+ {\scale[scale=4000]{\hbox{$f\left(x\right)$}}}
+ {using delimiters}
+ {\scale[scale=4000]{\ruledhbox{\showglyphs \showfontkerns \showfontitalics$f\left(x\right)$}}}
+ {using delimiters}
+ {\scale[scale=4000]{\ruledhbox{\showglyphs \showfontkerns \showfontitalics \showmakeup$f\left(x\right)$}}}
+ {using delimiters}
+\stopcombination
+\stopbuffer
+
+\startlinecorrection
+\UseMode\zerocount
+\switchtobodyfont[modern]\getbuffer[blownup]
+\stoplinecorrection
+
+When we visualize the glyphs and kerns we see that there's a space instead of a
+kern when we use delimiters. This is because the delimited sequence is processed
+as a subformula and injected as a so|-|called inner object, and as such gets
+spaced according to the ordinal|-|inner spacing rules: the $f$ is an ordinal,
+the \quotation {fenced} $x$ an inner. Such a difference will normally go
+unnoticed, but with authors, editors and designers involved, there's a good
+chance that at some point one will magnify a \PDF\ preview and suddenly notice
+that the difference between the $f$ and $($ is a bit on the large side for simple
+unstacked cases, something that in print is likely to go unnoticed. So, even when
+we don't know how to solve this, we do need to have an answer ready.
+
+When I was confronted by this example of rendering I started wondering if there
+was a way out. It makes no sense to hard code a negative space before a fenced
+subformula because sometimes you don't want that, especially not when there's
+nothing before it. So, after some messing around I decided to have a look at the
+engine instead. I wondered if we could just give the non|-|scaled fence case the
+same treatment as the character sequence.
+
+Unfortunately here we run into the somewhat complex way the rendering takes
+place. Keep in mind that it is quite natural from the perspective of \TEX\
+because normally a user will explicitly use \type {\left} and \type {\right} as
+needed, while in our case the fact that we automate and therefore want a generic
+solution interferes (as usual in such cases).
+
+Once read in the sequence \type {f(x)} can be represented as a list:
+
+\starttyping
+list = {
+ {
+ id = "noad", subtype = "ord", nucleus = {
+ {
+ id = "mathchar", fam = 0, char = "U+00066",
+ },
+ },
+ },
+ {
+ id = "noad", subtype = "open", nucleus = {
+ {
+ id = "mathchar", fam = 0, char = "U+00028",
+ },
+ },
+ },
+ {
+ id = "noad", subtype = "ord", nucleus = {
+ {
+ id = "mathchar", fam = 0, char = "U+00078",
+ },
+ },
+ },
+ {
+ id = "noad", subtype = "close", nucleus = {
+ {
+ id = "mathchar", fam = 0, char = "U+00029",
+ },
+ },
+ },
+}
+\stoptyping
+
+The sequence \type {f \left( x \right)} is also a list but now it is a tree (we
+leave out some unset keys):
+
+\starttyping
+list = {
+ {
+ id = "noad", subtype = "ord", nucleus = {
+ {
+ id = "mathchar", fam = 0, char = "U+00066",
+ },
+ },
+ },
+ {
+ id = "noad", subtype = "inner", nucleus = {
+ {
+ id = "submlist", head = {
+ {
+ id = "fence", subtype = "left", delim = {
+ {
+ id = "delim", small_fam = 0, small_char = "U+00028",
+ },
+ },
+ },
+ {
+ id = "noad", subtype = "ord", nucleus = {
+ {
+ id = "mathchar", fam = 0, char = "U+00078",
+ },
+ },
+ },
+ {
+ id = "fence", subtype = "right", delim = {
+ {
+ id = "delim", small_fam = 0, small_char = "U+00029",
+ },
+ },
+ },
+ },
+ },
+ },
+ },
+}
+\stoptyping
+
+So, the formula \type {f(x)} is just four characters and stays that way, but with
+some inter|-|character spacing applied according to the rules of \TEX\ math. The
+sequence \typ {f \left( x \right)} however becomes two components: the \type {f}
+is an ordinal noad,\footnote {Noads are the mathematical building blocks.
+Eventually they become nodes, the building blocks of paragraphs and boxed
+material.} and \typ {\left( x \right)} becomes an inner noad with a list as a
+nucleus, which gets processed independently. The way the code is written this is
+what (roughly) happens:
+
+\startitemize
+\startitem
+ A formula starts; normally this is triggered by one or two dollar signs.
+\stopitem
+\startitem
+ The \type {f} becomes an ordinal noad and \TEX\ goes~on.
+\stopitem
+\startitem
+ A fence is seen with a left delimiter and an inner noad is injected.
+\stopitem
+\startitem
+ That noad has a sub|-|math list that takes the left delimiter up to a
+ matching right one.
+\stopitem
+\startitem
+ When all is scanned a routine is called that turns a list of math noads into
+ a list of nodes.
+\stopitem
+\startitem
+ So, we start at the beginning, the ordinal \type {f}.
+\stopitem
+\startitem
+ Before moving on, a check happens whether this character needs to be kerned
+ with another (but here we have an ordinal|-|inner combination).
+\stopitem
+\startitem
+ Then we encounter the subformula (including fences) which triggers a nested
+ call to the math typesetter.
+\stopitem
+\startitem
+ The result eventually gets packaged into a hlist and we're back one level up
+ (here after the ordinal \type {f}).
+\stopitem
+\startitem
+ Processing a list happens in two passes and, to cut it short, it's the second
+ pass that deals with choosing fences and spacing.
+\stopitem
+\startitem
+ Each time a (sub)list is processed a second pass over that list
+ happens.
+\stopitem
+\startitem
+ So, now \TEX\ will inject the right spaces between pairs of noads.
+\stopitem
+\startitem
+ In our case that is between an ordinal and an inner noad, which is quite
+ different from a sequence of ordinals.
+\stopitem
+\stopitemize
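+
+The steps in this list can be summarized in a sketch. What follows is
+deliberately simplified pseudocode in \LUA\ style, with invented names, and not
+the actual engine code:
+
+\starttyping
+-- pass one: typeset nuclei and recurse into subformulas; fences
+-- are chosen here, once height and depth of the sublist are known
+for noad in traverse(mlist) do
+    if noad.subtype == "inner" and is_submlist(noad.nucleus) then
+        noad.nucleus = mlist_to_hlist(noad.nucleus.head) -- nested call
+    end
+end
+-- pass two: inject spacing (and optional kerns) between pairs
+for left, right in pairs_of(mlist) do
+    inject_space(spacing[left.subtype][right.subtype])
+end
+\stoptyping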
+
+It's these fences that demand a two-pass approach because we need to know the
+height and depth of the subformula. Anyway, do you see the complication? In our
+inner formula the fences are not scaled, but this is not communicated back in the
+sense that the inner noad can become an ordinal one, as in the simple \type {f(}
+pair. The information is not only lost, it is not even considered useful and the
+only way to somehow bubble it up in the processing so that it can be used in the
+spacing requires an extension. And even then we have a problem: the kerning that
+we see between \type {f(} is also lost. It must be noted that this kerning is
+optional and triggered by setting \type {\mathitalicsmode=1}. One reason for this
+is that fonts approach italic correction differently, and cheat with the
+combination of natural width and italic correction.
+
+Now, because such a workaround is definitely conflicting with the inner workings
+of \TEX, our experimenting demands another variable be created: \type
+{\mathdelimitersmode}. It might be a prelude to more manipulations but for now we
+stick to this one case. How messy it really is can be demonstrated when we render
+our example with Cambria.
+
+\startlinecorrection
+\UseMode\zerocount
+\switchtobodyfont[cambria]\getbuffer[blownup]
+\stoplinecorrection
+
+If you look closely you will notice that the parentheses are moved up a bit. Also
+notice the more accurate bounding boxes. Just to be sure we also show Pagella:
+
+\startlinecorrection
+\UseMode\zerocount
+\switchtobodyfont[pagella]\getbuffer[blownup]
+\stoplinecorrection
+
+When we really want the unscaled variant to be somewhat compatible with the
+fenced one we now need to take into account:
+
+\startitemize[packed]
+\startitem
+ the optional axis|-|and|-|height|/|depth related shift of the fence (bit 1)
+\stopitem
+\startitem
+ the optional kern between characters (bit 2)
+\stopitem
+\startitem
+ the optional space between math objects (bit 4)
+\stopitem
+\stopitemize
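+
+These options are bit flags, so the values simply add up. For instance:
+
+\starttyping
+\mathdelimitersmode=1 % only the fence shift
+\mathdelimitersmode=3 % shift (1) plus kern (2)
+\mathdelimitersmode=7 % shift (1) plus kern (2) plus space (4)
+\stoptyping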
+
+Each option can be set (which is handy for testing) but here we will set them
+all, so, when \type {\mathdelimitersmode=7}, we want Cambria to come out as
+follows:
+
+\startlinecorrection
+\UseMode\plusseven
+\switchtobodyfont[cambria]\getbuffer[blownup]
+\stoplinecorrection
+
+When this mode is set the following happens:
+
+\startitemize
+\startitem
+ We keep track of the scaling and when we use the normal size this is
+ registered in the noad (we had space in the data structure for that).
+\stopitem
+\startitem
+ This information is picked up by the caller of the routine that does the
+ subformula and stored in the (parent) inner noad (again, we had space for
+ that).
+\stopitem
+\startitem
+ Kerns between a character (ordinal) and a subformula (inner) are kept,
+ which can be bad for other cases but probably less bad than the problem
+ we try to solve here.
+\stopitem
+\startitem
+ When the fences are unscaled the inner property temporarily becomes
+ an ordinal one when we apply the inter|-|noad spacing.
+\stopitem
+\stopitemize
+
+Hopefully this is good enough but anything more fancy would demand drastic
+changes in one of the most sensitive mechanisms of \TEX. It might not always work
+out right, so for now I consider it an experiment, which means that it can be
+kept around, rejected or improved.
+
+In case one wonders if such an extension is truly needed, one should also take
+into account that automated typesetting (also of math) is probably one of the
+areas where \TEX\ can shine for a while. And while we can deal with much by using
+\LUA, this is one of the cases where the interwoven and integrated parsing,
+converting and rendering of the math machinery makes it hard. It also fits into a
+further opening up of the inner workings by modes.
+
+\startbuffer[simple]
+\dontleavehmode
+\scale
+ [scale=3000]
+ {\ruledhbox
+ {\showglyphs
+ \showfontkerns
+ \showfontitalics
+ $f(x)$}}
+\stopbuffer
+
+\startbuffer[fenced]
+\dontleavehmode
+\scale
+ [scale=3000]
+ {\ruledhbox
+ {\showglyphs
+ \showfontkerns
+ \showfontitalics
+ $f\left(x\right)$}}
+\stopbuffer
+
+\def\TestMe#1%
+ {\bTR
+ \bTD[width=35mm,align=middle,toffset=3mm] \switchtobodyfont[#1]\UseMode\zerocount\getbuffer[simple] \eTD
+ \bTD[width=35mm,align=middle,toffset=3mm] \switchtobodyfont[#1]\UseMode\zerocount\getbuffer[fenced] \eTD
+ \bTD[width=35mm,align=middle,toffset=3mm] \switchtobodyfont[#1]\UseMode\plusseven\getbuffer[simple] \eTD
+ \bTD[width=35mm,align=middle,toffset=3mm] \switchtobodyfont[#1]\UseMode\plusseven\getbuffer[fenced] \eTD
+ \eTR
+ \bTR
+ \bTD[align=middle,nx=2] \type{\mathdelimitersmode=0} \eTD
+ \bTD[align=middle,nx=2] \type{\mathdelimitersmode=7} \eTD
+ \eTR
+ \bTR
+ \bTD[align=middle,nx=4] \switchtobodyfont[#1]\bf #1 \eTD
+ \eTR}
+
+\startbuffer
+\bTABLE[frame=off]
+ \TestMe{modern}
+ \TestMe{cambria}
+ \TestMe{pagella}
+\eTABLE
+\stopbuffer
+
+Another objection to such a solution can be that we should not alter the engine
+too much. However, fences already are an exception and treated specially (tests
+and jumps in the program) so adding this fits reasonably well into that part of
+the design.
+
+In the following examples we demonstrate the results for Latin Modern, Cambria
+and Pagella when \type {\mathdelimitersmode} is set to zero or seven. First we
+show the case where \type {\mathitalicsmode} is disabled:
+
+\startlinecorrection
+ \mathitalicsmode\zerocount\getbuffer
+\stoplinecorrection
+
+When we enable \type {\mathitalicsmode} we get:
+
+\startlinecorrection
+ \mathitalicsmode\plusone \getbuffer
+\stoplinecorrection
+
+So is this all worth the effort? I don't know, but at least I got the picture and
+hopefully now you do too. It might also lead to some more modes in future
+versions of \LUATEX.
+
+\startbuffer[simple]
+\dontleavehmode
+\scale
+ [scale=2000]
+ {\ruledhbox
+ {\showglyphs
+ \showfontkerns
+ \showfontitalics
+ $f(x)$}}
+\stopbuffer
+
+\startbuffer[fenced]
+\dontleavehmode
+\scale
+ [scale=2000]
+ {\ruledhbox
+ {\showglyphs
+ \showfontkerns
+ \showfontitalics
+ $f\left(x\right)$}}
+\stopbuffer
+
+\def\TestMe#1%
+ {\bTR
+ \dostepwiserecurse{0}{7}{1}{
+ \bTD[align=middle,toffset=3mm] \switchtobodyfont[#1]\UseMode##1\getbuffer[simple] \eTD
+ }
+ \eTR
+ \bTR
+ \dostepwiserecurse{0}{7}{1}{
+ \bTD[align=middle,toffset=3mm] \switchtobodyfont[#1]\UseMode##1\getbuffer[fenced] \eTD
+ }
+ \eTR
+ \bTR
+ \dostepwiserecurse{0}{7}{1}{
+ \bTD[align=middle]
+ \tttf
+ \ifcase##1\relax
+ \or ns % 1
+ \or it % 2
+ \or ns it % 3
+ \or or % 4
+ \or ns or % 5
+ \or it or % 6
+ \or ns it or % 7
+ \fi
+ \eTD
+ }
+ \eTR
+ \bTR
+ \bTD[align=middle,nx=8] \switchtobodyfont[#1]\bf #1 \eTD
+ \eTR}
+
+\startbuffer
+\bTABLE[frame=off,distance=2mm]
+ \TestMe{modern}
+ \TestMe{cambria}
+ \TestMe{pagella}
+\eTABLE
+\stopbuffer
+
+\startlinecorrection
+\getbuffer
+\stoplinecorrection
+
+In \CONTEXT, a regular document can specify \type {\setupmathfences
+[method=auto]}, but in \MATHML\ or \ASCIIMATH\ this feature is enabled by default
+(so that we can test it).
+
+We end with a summary of all the modes (assuming italics mode is enabled) in the
+table above.
+
+\stopcomponent
diff --git a/doc/context/sources/general/manuals/onandon/onandon-runtoks.tex b/doc/context/sources/general/manuals/onandon/onandon-runtoks.tex
new file mode 100644
index 000000000..b3adeb4a5
--- /dev/null
+++ b/doc/context/sources/general/manuals/onandon/onandon-runtoks.tex
@@ -0,0 +1,531 @@
+% language=uk
+
+\startcomponent onandon-amputating
+
+\environment onandon-environment
+
+\startchapter[title={Amputating code}]
+
+\startsection[title={Introduction}]
+
+Because \CONTEXT\ is already rather old in terms of software life and because it
+evolves over time, code can get replaced by better code. Reasons for this can be:
+
+\startitemize[packed]
+\startitem a better understanding of the way \TEX\ and \METAPOST\ work \stopitem
+\startitem demand for more advanced options \stopitem
+\startitem a brainwave resulting in a better solution \stopitem
+\startitem new functionality provided in the \TEX\ engine used \stopitem
+\startitem the necessity to speed up a core process \stopitem
+\stopitemize
+
+Replacing code that in itself does a good job but is no longer the best to be
+used comes with sentiments. It can be rather satisfying to cook up a
+(conceptually as well as codewise) good solution and therefore removing code from
+a file can result in a somewhat bad feeling and even a feeling of losing
+something. Hence the title of this chapter.
+
+Here I will discuss one of the more complex subsystems: the one dealing with
+typeset text in \METAPOST\ graphics. I will stick to the principles and not
+present (much) code as that can be found in archives. This is not a tutorial,
+but more a sort of wrap|-|up for myself. It anyhow shows the thinking behind
+this mechanism. I'll also introduce a new \LUATEX\ feature here: subruns.
+
+\stopsection
+
+\startsection[title={The problem}]
+
+\METAPOST\ is meant for drawing graphics and adding text to them is not really
+part of the concept. It's a bit like how \TEX\ sees images: the dimensions matter,
+the content doesn't. This means that in \METAPOST\ a blob of text is an
+abstraction. The native way to create a typeset text picture is:
+
+\starttyping
+picture p ; p := btex some text etex ;
+\stoptyping
+
+In traditional \METAPOST\ this will create a temporary \TEX\ file with the words
+\type {some text} wrapped in a box that when typeset is just shipped out. The
+result is a \DVI\ file that with an auxiliary program will be transformed into a
+\METAPOST\ picture. That picture itself is made from multiple pictures, because
+each sequence of characters becomes a picture and kerns become shifts.
+
+There is also a primitive \type {infont} that takes a text and just converts it
+into a low level text object, but no typesetting is done: so no ligatures and
+no kerns are found there. In \CONTEXT\ this operator is redefined to do the
+right thing.
+
+In both cases, what ends up in the \POSTSCRIPT\ file is references to fonts and
+characters and the original idea is that \DVIPS\ understands what
+fonts to embed. Details are communicated via specials (comments) that \DVIPS\ is
+supposed to intercept and understand. This all happens in an 8~bit (font) universe.
+
+When we moved on to \PDF, a converter from \METAPOST's rather predictable and
+simple \POSTSCRIPT\ code to \PDF\ was written in \TEX. The graphic operators
+became \PDF\ operators and the text was retypeset using the font information and
+snippets of strings and injected at the right spot. The only complication was
+that a non|-|circular pen actually produced two paths of which one had to be
+transformed.
+
+At that moment it already had become clear that a more tight integration in
+\CONTEXT\ would happen and not only would that demand a more sophisticated
+handling of text, but it would also require more features not present in
+\METAPOST, like dealing with \CMYK\ colors, special color spaces, transparency,
+images, shading, and more. All this was implemented. In the next sections we will
+only discuss texts.
+
+\stopsection
+
+\startsection[title={Using the traditional method}]
+
+The \type {btex} approach was not that flexible because what happens is that
+\type {btex} triggers the parser to just grab everything up to the \type
+{etex} and pass that to an external program. It's a special scanner mode and
+because of that using macros for typesetting texts is a pain. So, instead of
+using this method in \CONTEXT\ we used \type {textext}. Before a run the
+\METAPOST\ file was scanned and for each \type {textext} the argument was copied
+to a file. The \type {btex} calls were also scanned and replaced by \type
+{textext} calls.
+
+For each processed snippet the dimensions were stored in order to be loaded at
+the start of the \METAPOST\ run. In fact, each text was just a rectangle with
+certain dimensions. The \PDF\ converter would use the real snippet (by
+typesetting it).
+
+Of course there had to be some housekeeping in order to make sure that the right
+snippets were used, because the order of definition (as picture) can differ
+from the order of use. This mechanism evolved into reasonably robust text handling
+but of course was limited by the fact that the file was scanned for snippets. So,
+the string had to be a literal string and not an assembled one. This disadvantage was
+compensated by the fact that we could communicate relevant bits of the
+environment and apply all the usual context trickery in texts in a way that was
+consistent with the rest of the document.
+
+A later implementation could communicate the text via specials which is more
+flexible. Although we talk of this method in the past tense, it is still used in
+\MKII.
+
+\stopsection
+
+\startsection[title={Using the library}]
+
+When the \MPLIB\ library showed up in \LUATEX, the same approach was used but
+soon we moved on to a different approach. We already used specials to communicate
+extensions to the backend, using special colors and fake objects as signals. But
+at that time paths got pre- and postscript fields and those could be used to
+really carry information with objects because unlike specials, they were bound to
+that object. So, all extensions using specials as well as texts were rewritten to
+use these scripts.
+
+The \type {textext} macro changed its behaviour a bit too. Remember that a
+text effectively was just a rectangle with some transformation applied. However
+this time the postscript field carried the text and the prescript field some
+specifics, like the fact that we are dealing with text. Using the script made
+it possible to carry some more information around, like special color demands.
+
+\starttyping
+draw textext("foo") ;
+\stoptyping
+
+Among the prescripts are \typ {tx_index=trial} and \typ {tx_state=trial}
+(multiple prescripts are prepended) and the postscript is \type {foo}. In a
+second run the prescript is \type {tx_index=trial} and \typ {tx_state=final}.
+After the first run we analyze all objects, collect the texts (those with \type
+{tx_} variables set) and typeset them. As part of the second run we pass the
+dimensions of each indexed text snippet. Internally before the first run we
+\quote {reset} states, then after the first run we \quote {analyze}, and after
+the second run we \quote {process} as part of the conversion of output to \PDF.
+
+\stopsection
+
+\startsection[title={Using \type {runscript}}]
+
+When the \type {runscript} feature was introduced in the library we no longer
+needed to pass the dimensions via subscripted variables. Instead we could just
+run a \LUA\ snippet and ask for the dimensions of a text with some index. This
+is conceptually not much different but it saves us creating \METAPOST\ code that
+stored the dimensions, at the cost of potentially a bit more runtime due to the
+\type {runscript} calls. But the code definitely looks a bit cleaner this way. Of
+course we had to keep the dimensions at the \LUA\ end but we already did that
+because we stored the preprocessed snippets for final usage.
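+
+Conceptually the exchange looks like this; the macro and function names below
+are made up for illustration:
+
+\starttyping
+% METAPOST end: ask LUA for the dimensions of snippet n; the string
+% returned at the LUA end is scanned as METAPOST code
+vardef mfun_tx_dimensions (expr n) =
+    runscript("return metapost.getdimensions(" & decimal n & ")")
+enddef ;
+\stoptyping
+
+where \type {metapost.getdimensions} would look up the stored box of snippet
+\type {n} and return its width, height and depth as a \METAPOST\ expression.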
+
+\stopsection
+
+\startsection[title={Using a sub \TEX\ run}]
+
+We now come to the current (post \LUATEX\ 1.08) solution. For reasons I will
+mention later a two pass approach is not optimal, but we can live with that,
+especially because \CONTEXT\ with \METAFUN\ (which is what we're talking about
+here) is quite efficient. More important is that it's kind of ugly to do all the
+not that special work twice. In addition to text we also have outlines, graphics
+and more mechanisms that needed two passes and all these became one pass
+features.
+
+A \TEX\ run is special in many ways. At some point after starting up \TEX\
+enters the main loop and begins reading text and expanding macros. Normally you
+start with a file but soon a macro is seen, and a next level of input is entered,
+because as part of the expansion more text can be met, files can be opened,
+other macros can be expanded. When a macro expands a token register, another level is
+entered and the same happens when a \LUA\ call is triggered. Such a call can
+print back something to \TEX\ and that has to be scanned as if it came from a
+file.
+
+When token lists (and macros) get expanded, some commands result in direct
+actions, others result in expansion only and processing later, as one or more
+tokens can end up in the input stack. The internals of the engine operate in
+miraculous ways. All commands trigger a function call, but some have their own
+while others share one with a switch statement (in \CCODE\ speak) because they
+belong to a category of similar actions. Some are expanded directly, some get
+delayed.
+
+Does it sound complicated? Well, it is. It's even more so when you consider that
+\TEX\ uses nesting, which means pushing and popping local assignments, knows
+modes, like horizontal, vertical and math mode, keeps track of interrupts and at
+the same time triggers typesetting, par building, page construction and flushing
+to the output file.
+
+It is for this reason plus the fact that users can and will do a lot to influence
+that behaviour that there is just one main loop and in many aspects global state.
+There are some exceptions, for instance when the output routine is called, which
+creates a sort of closure: it interrupts the process and for that reason gets
+grouping enforced so that it doesn't influence the main run. But even then the
+main loop does the job.
+
+Starting with version 1.10 \LUATEX\ provides a way to do a local run. There are
+two ways provided: expanding a token register and calling a \LUA\ function. It
+took a bit of experimenting to reach an implementation that works out reasonably
+and many variants were tried. In the appendix we give an example of usage.
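+
+A minimal, schematic example of the token register variant (see the \LUATEX\
+manual for the details; a \LUA\ function can be passed instead of a register
+number):
+
+\starttyping
+\toks0={\setbox0\hbox{some text}}
+
+\directlua{tex.runtoks(0)} % expand the register in a local run
+
+box 0 is now set: \the\wd0
+\stoptyping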
+
+The current variant is reasonably robust and does the job but care is needed.
+First of all, as soon as you start piping something to \TEX\ that gets typeset
+you'd better be in a valid mode. If not, then for instance glyphs can end up in a
+vertical list and \LUATEX\ will abort. In case you wonder why we don't intercept
+this: we can't because we don't know the user's intentions. We cannot enforce a
+mode for instance as this can have side effects, think of expanding \type
+{\everypar} or injecting an indentation box. Also, as soon as you start juggling
+nodes there is no way that \TEX\ can foresee what needs to be copied or
+discarded. Normally it works out okay but because in \LUATEX\ you can cheat in
+numerous ways with \LUA, you can get into trouble.
+
+So, what has this to do with \METAPOST ? Well, first of all we could now use a
+one pass approach. The \type {textext} macro calls \LUA, which then lets \TEX\ do
+some typesetting, and then gives back the dimensions to \METAPOST. The \quote
+{analyze} phase is now integrated in the run. For a regular text this works quite
+well because we just box some text and that's it. However, in the next section we
+will see where things get complicated.
+
+Let's summarize the one pass approach: the \type {textext} macro creates a
+rectangle with the right dimensions and for doing so passes the string to \LUA\
+using \type {runscript}. We store the argument of \type {textext} in a variable,
+then call \type {runtoks}, which expands the given token list, where we typeset a
+box with the stored text (that we fetch with a \LUA\ call), and the \type
+{runscript} passes back the three dimensions as a fake \RGB\ color to \METAPOST\
+which applies a \type {scantokens} to the result. So, in principle there is no
+real conceptual difference except that we now analyze in|-|place instead of
+between runs. I will not show the code here because in \CONTEXT\ we use a wrapper
+around \type {runscript} so low level examples won't run well.
+
+\stopsection
+
+\startsection[title={Some aspects}]
+
+An important aspect of the text handling is that the whole text can be
+transformed. Normally this is only some scaling but rotation is also quite valid.
+In the first approach, the original \METAPOST\ one, we have pictures constructed
+of snippets and pictures transform well as long as the backend is not too
+confused, something that can happen when for instance very small or large font
+scales are used. There were some limitations with respect to the number of fonts
+and efficient inclusion when for instance randomization was used (I remember
+cases with thousands of font instances). The \PDF\ backend could handle most
+cases well, by just using one size and scaling at the \PDF\ level. All the \type
+{textext} approaches use rectangles as stubs which is very efficient and permits
+all transforms.
+
+How about color? Think of this situation:
+
+\starttyping
+\startMPcode
+ draw textext("some \color[red]{text}")
+ withcolor green ;
+\stopMPcode
+\stoptyping
+
+And what about the document color? We suffice by saying that this is all well
+supported. Of course using transparency, spot colors etc.\ also needs extensions.
+These are however not directly related to texts although we need to take it into
+account when dealing with the inclusion.
+
+\starttyping
+\startMPcode
+ draw textext("some \color[red]{text}")
+ withcolor "blue"
+ withtransparency (1,0.5) ;
+\stopMPcode
+\stoptyping
+
+What if you have a graphic with many small snippets of which many have the same
+content? These are by default shared, but if needed you can disable it. This makes
+sense if you have a case like this:
+
+\starttyping
+\useMPlibrary[dum]
+
+\startMPcode
+ draw textext("\externalfigure[unknown]") notcached ;
+ draw textext("\externalfigure[unknown]") notcached ;
+\stopMPcode
+\stoptyping
+
+Normally each unknown image gets a nice placeholder with some random properties.
+So, do we want these two to have the same properties or not? At least you can
+control it.
+
+When I said that things can get complicated with the one pass approach, the
+previous code snippet is a good example. The dummy figure is generated by
+\METAPOST. So, as we have one pass and temporarily jump back to \TEX, we have
+two problems: we reenter the \MPLIB\ instance in the middle of a run, and we
+might pipe something back to and|/|or from \TEX\ in a nested way.
+
+The first problem could be solved by starting a new \MPLIB\ session. This is
+normally not a problem as both runs are independent of each other. In
+\CONTEXT\ we can have \METAPOST\ runs in many places: some produce a more or
+less stand|-|alone graphic in the text while other calls produce \PDF\ code in
+the backend that is used in a different way (for instance in a font). In the
+first case the result gets nicely wrapped in a box, while in the second case it
+might directly end up in the page stream. And, as \TEX\ has no knowledge of what
+is needed, it's here that we can get the complications that can lead to aborting
+a run when you are careless. In any case, if you abort, you can be sure you're
+doing the wrong thing. So, the second problem can only be solved by careful
+programming.
+
+When I ran the test suite on the new code, some older modules had to be fixed.
+They were doing the right thing from the perspective of intermediate runs and
+therefore independent box handling, putting a text in a box and collecting
+dimensions, but interwoven they demanded a bit more defensive programming. For
+instance, the multi|-|pass approach always made copies of snippets while the one
+pass approach does that only when needed. And that confused some old code in a
+module, which incidentally is never used today because we have better
+functionality built|-|in (the \METAFUN\ \type {followtext} mechanism).
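+For the record, a minimal \type {followtext} call looks like this (the path
+and the text are of course arbitrary examples):
+
+\starttyping
+\startMPcode
+    draw followtext(fullcircle scaled 3cm, "some text along a circle") ;
+\stopMPcode
+\stoptyping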
+
+The two pass approach has special code for cases where a text is not used.
+Imagine this:
+
+\starttyping
+picture p ; p := textext("foo") ;
+
+draw boundingbox p;
+\stoptyping
+
+Here the \quote {analyze} stage will never see the text because we don't flush
+\type {p}. However, because \type {textext} is called, it can still make sure we
+know the dimensions. In the next case we do use the text but in two different
+ways. These subtle aspects are dealt with properly and could be made a bit
+simpler in the single pass approach.
+
+\starttyping
+picture p ; p := textext("foo") ;
+
+draw p rotated 90 withcolor red ;
+draw p withcolor green ;
+\stoptyping
+
+\stopsection
+
+\startsection[title=One or two runs]
+
+So are we better off now? One problem with two passes is that if you use the
+equation solver you need to make sure that you don't run into the redundant
+equation issue, so you need to manage your variables well. In fact you need to
+do that anyway, because you can call out to \METAPOST\ many times in a run, so
+old variables can interfere regardless. So yes, we're better off here.
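+So a bit of discipline is needed anyway. For instance (a made|-|up sketch),
+one can explicitly forget a variable before assigning it a new value:
+
+\starttyping
+\startMPcode
+    numeric n ; n = 10 ;
+    % n = 20 ;        % would trigger an inconsistent equation error
+    save n ; n = 20 ; % saving makes n unknown again, so this is fine
+    draw textext(decimal n) ;
+\stopMPcode
+\stoptyping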
+
+Are we worse off now? The two runs with the text processing in between are very
+robust. There is no interference of nested runs and no interference of nested
+local \TEX\ calls. So, maybe we're also a bit worse off. You anyhow need to keep
+this in mind when you write your own low level \TEX|-|\METAPOST\ interaction
+trickery, but fortunately not many users do that. And if you did write your own
+plugins, you now need to make them single pass.
+
+The new code is conceptually cleaner but still not trivial, due to the
+mentioned complications. It's definitely less code, but somehow amputating the
+old code does hurt a bit. Maybe I should keep it around as a reference of how
+text handling evolved over a few decades.
+
+\stopsection
+
+\startsection[title=Appendix]
+
+Because the single pass approach made me finally look into a (albeit somewhat
+limited) local \TEX\ run, I will show a simple example. For the sake of
+generality I will use \type {\directlua}. Say that you need the dimensions of a
+box while in \LUA:
+
+\startbuffer
+\directlua {
+ tex.sprint("result 1: <")
+
+ tex.sprint("\\setbox0\\hbox{one}")
+ tex.sprint("\\number\\wd0")
+
+ tex.sprint("\\setbox0\\hbox{\\directlua{tex.print{'first'}}}")
+ tex.sprint(",")
+ tex.sprint("\\number\\wd0")
+
+ tex.sprint(">")
+}
+\stopbuffer
+
+\typebuffer \getbuffer
+
+This looks ok, but only because all printed text is collected and pushed into a
+new input level once the \LUA\ call is done. So take this then:
+
+\startbuffer
+\directlua {
+ tex.sprint("result 2: <")
+
+ tex.sprint("\\setbox0\\hbox{one}")
+ tex.sprint(tex.getbox(0).width)
+
+ tex.sprint("\\setbox0\\hbox{\\directlua{tex.print{'first'}}}")
+ tex.sprint(",")
+ tex.sprint(tex.getbox(0).width)
+
+ tex.sprint(">")
+}
+\stopbuffer
+
+\typebuffer \getbuffer
+
+This time we get the width of the box as known at the moment that we are in
+\LUA, but we haven't typeset the content yet, so we get the wrong dimensions.
+This however will work okay:
+
+\startbuffer
+\toks0{\setbox0\hbox{one}}
+\toks2{\setbox0\hbox{first}}
+\directlua {
+ tex.forcehmode(true)
+
+ tex.sprint("<")
+
+ tex.runtoks(0)
+ tex.sprint(tex.getbox(0).width)
+
+ tex.runtoks(2)
+ tex.sprint(",")
+ tex.sprint(tex.getbox(0).width)
+
+ tex.sprint(">")
+}
+\stopbuffer
+
+\typebuffer \getbuffer
+
+as does this:
+
+\startbuffer
+\toks0{\setbox0\hbox{\directlua{tex.sprint(MyGlobalText)}}}
+\directlua {
+ tex.forcehmode(true)
+
+ tex.sprint("result 3: <")
+
+ MyGlobalText = "one"
+ tex.runtoks(0)
+ tex.sprint(tex.getbox(0).width)
+
+ MyGlobalText = "first"
+ tex.runtoks(0)
+ tex.sprint(",")
+ tex.sprint(tex.getbox(0).width)
+
+ tex.sprint(">")
+}
+\stopbuffer
+
+\typebuffer \getbuffer
+
+Here is a variant that uses functions:
+
+\startbuffer
+\directlua {
+ tex.forcehmode(true)
+
+ tex.sprint("result 4: <")
+
+ tex.runtoks(function()
+ tex.sprint("\\setbox0\\hbox{one}")
+ end)
+ tex.sprint(tex.getbox(0).width)
+
+ tex.runtoks(function()
+ tex.sprint("\\setbox0\\hbox{\\directlua{tex.print{'first'}}}")
+ end)
+ tex.sprint(",")
+ tex.sprint(tex.getbox(0).width)
+
+ tex.sprint(">")
+}
+\stopbuffer
+
+\typebuffer \getbuffer
+
+The \type {forcehmode} is needed when you do this in vertical mode; otherwise
+the run aborts. Of course you can also force horizontal mode before the call.
+I'm sure that users will be surprised by side effects when they really use this
+feature, but that is to be expected: you really need to be aware of the subtle
+interference of input levels and the mix of input media (files, token lists,
+macros or \LUA), as well as the fact that \TEX\ often looks one token ahead
+and, when forced to typeset something, can also trigger builders. You're warned.
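+As a sketch of that alternative, forcing horizontal mode before the call
+instead of inside it can look like this in \CONTEXT\ (a variation on the
+earlier examples, using \type {\dontleavehmode}):
+
+\starttyping
+\dontleavehmode
+\directlua {
+    tex.sprint("<")
+    tex.runtoks(function()
+        tex.sprint("\\setbox0\\hbox{one}")
+    end)
+    tex.sprint(tex.getbox(0).width)
+    tex.sprint(">")
+}
+\stoptyping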
+
+\stopsection
+
+\stopchapter
+
+\stopcomponent
+
+% \starttext
+
+% \toks0{\hbox{test}} [\ctxlua{tex.runtoks(0)}]\par
+
+% \toks0{\relax\relax\hbox{test}\relax\relax}[\ctxlua{tex.runtoks(0)}]\par
+
+% \toks0{xxxxxxx} [\ctxlua{tex.runtoks(0)}]\par
+
+% \toks0{\hbox{(\ctxlua{context("test")})}} [\ctxlua{tex.runtoks(0)}]\par
+
+% \toks0{\global\setbox1\hbox{(\ctxlua{context("test")})}} [\ctxlua{tex.runtoks(0)}\box1]\par
+
+% \startluacode
+% local s = "[\\ctxlua{tex.runtoks(0)}\\box1]"
+% context("<")
+% context( function() context(s) end)
+% context( function() context(s) end)
+% context(">")
+% \stopluacode\par
+
+% \toks10000{\hbox{\red test1}}
+% \toks10002{\green\hbox{test2}}
+% \toks10004{\hbox{\global\setbox1\hbox to 1000sp{\directlua{context("!4!")}}}}
+% \toks10006{\hbox{\global\setbox3\hbox to 2000sp{\directlua{context("?6?")}}}}
+% \hbox{x\startluacode
+% local s0 = "(\\hbox{\\ctxlua{tex.runtoks(10000)}})"
+% local s2 = "[\\hbox{\\ctxlua{tex.runtoks(10002)}}]"
+% context("<!")
+% -- context( function() context(s0) end)
+% -- context( function() context(s0) end)
+% -- context( function() context(s2) end)
+% context(s0)
+% context(s0)
+% context(s2)
+% context("<")
+% tex.runtoks(10004)
+% context("X")
+% tex.runtoks(10006)
+% context(tex.box[1].width)
+% context("/")
+% context(tex.box[3].width)
+% context("!>")
+% \stopluacode x}\par
+
+
diff --git a/doc/context/sources/general/manuals/onandon/onandon.tex b/doc/context/sources/general/manuals/onandon/onandon.tex
index 65a7f5712..2352907fc 100644
--- a/doc/context/sources/general/manuals/onandon/onandon.tex
+++ b/doc/context/sources/general/manuals/onandon/onandon.tex
@@ -69,6 +69,7 @@
% \component onandon-expansion
\component onandon-110
+ \component onandon-runtoks
\stopbodymatter
\stopproduct