Diffstat (limited to 'doc/context/sources/general/manuals/still/still-tokens.tex')
-rw-r--r--  doc/context/sources/general/manuals/still/still-tokens.tex  903
1 file changed, 903 insertions(+), 0 deletions(-)
diff --git a/doc/context/sources/general/manuals/still/still-tokens.tex b/doc/context/sources/general/manuals/still/still-tokens.tex
new file mode 100644
index 000000000..34784cdf3
--- /dev/null
+++ b/doc/context/sources/general/manuals/still/still-tokens.tex
@@ -0,0 +1,903 @@
+% language=uk
+
+\environment still-environment
+
+\starttext
+
+\startchapter[title=Scanning input]
+
+\startsection[title=Introduction]
+
+Tokens are the building blocks of the input for \TEX\ and they drive the process
+of expansion which in turn results in typesetting. If you want to manipulate the
+input, intercepting tokens is one approach. Other solutions are preprocessing or
+writing macros that do something with their picked|-|up arguments. In \CONTEXT\
+\MKIV\ we often forget about manipulating the input but manipulate the
+intermediate typesetting results instead. The advantage is that only at that
+moment do you know what you're truly dealing with, but a disadvantage is that
+parsing the so-called node lists is not always efficient and it can even be
+rather complex, for instance in math. It remains a fact that until \LUATEX\
+version 0.80 \CONTEXT\ hardly used the token interface.
+
+In version 0.80 a new scanner interface was introduced, demonstrated by Taco
+Hoekwater at the \CONTEXT\ conference 2014. Luigi Scarso and I integrated that
+code and I added a few more functions. Eventually the team will kick out the old
+token library and overhaul the input|-|related code in \LUATEX, because no
+callback is needed any more (and also because the current code still has traces
+of multiple \LUA\ instances). This will happen stepwise to give users who use the
+old mechanism an opportunity to adapt.
+
+Here I will show a bit of the new token scanners and explain how they can be used
+in \CONTEXT. Some of the additional scanners written on top of the built|-|in ones
+will probably end up in the generic \LUATEX\ code that ships with \CONTEXT.
+
+\stopsection
+
+\startsection[title=The \TEX\ scanner]
+
+The new token scanner library of \LUATEX\ provides a way to hook \LUA\ into \TEX\
+in a rather natural way. I have to admit that I never had any real demand for
+such a feature but now that we have it, it is worth exploring.
+
+The \TEX\ scanner roughly provides the following sub-scanners that are used to
+implement primitives: keyword, token, token list, dimension, glue and integer.
+Deep down there are specific variants for scanning, for instance, font dimensions
+and special numbers.
+
+A token is a unit of input, and one or more characters are turned into a token.
+How a character is interpreted is determined by its current catcode. For instance
+a backslash is normally tagged as `escape character' which means that it starts a
+control sequence: a macro name or primitive. This means that once it is scanned a
+macro name travels as one token through the system. Take this:
+
+\starttyping
+\def\foo#1{\scratchcounter=123#1\relax}
+\stoptyping
+
+Here \TEX\ scans \type {\def} and turns it into a token. This particular token
+triggers a specific branch in the scanner. First a name is scanned, optionally
+followed by an argument specification. Then the body is scanned and the macro
+is stored in memory. The body holds \type {\scratchcounter}, the four
+character tokens \type {=}, \type {1}, \type {2} and \type {3}, the parameter
+reference \type {#1}, and \type {\relax}, so it counts 7~tokens.
+
+When the macro \type {\foo} is referenced the body gets expanded which here means
+that the scanner will scan for an argument first and uses that in the
+replacement. So, the scanner switches between different states. Sometimes tokens
+are just collected and stored, in other cases they get expanded immediately into
+some action.
+
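+For instance, once the argument has been picked up we get:
+
+\starttyping
+\foo{4} % expands into \scratchcounter=1234\relax
+\stoptyping
+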
+\stopsection
+
+\startsection[title=Scanning from \LUA]
+
+The basic building blocks of the scanner are available at the \LUA\ end, for
+instance:
+
+\starttyping
+\directlua{print(token.scan_int())} 123
+\stoptyping
+
+This will print \type {123} to the console. Or, you can store the number and
+use it later:
+
+\starttyping
+\directlua{SavedNumber = token.scan_int()} 123
+
+We saved: \directlua{tex.print(SavedNumber)}
+\stoptyping
+
+The number of scanner functions is (on purpose) limited but you can use them to
+write additional ones as you can just grab tokens, interpret them and act
+accordingly.
+
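+For example, a boolean scanner can be built on top of the keyword scanner; a
+minimal sketch:
+
+\starttyping
+function scan_boolean()
+    -- accept the keywords "true" and "false"
+    if token.scan_keyword("true") then
+        return true
+    elseif token.scan_keyword("false") then
+        return false
+    end
+    -- nil means: no boolean seen
+end
+\stoptyping
+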
+The \type {scan_int} function picks up a number. This can also be a counter, a
+named (math) character or a numeric expression. In \TEX, numbers are integers;
+floating|-|point is not supported naturally. With \type {scan_dimen} a dimension
+is grabbed, where a dimen is either a number (float) followed by a unit, a dimen
+register or a dimen expression (internally, all become integers). Of course
+internal quantities are also okay. There are two optional boolean arguments:
+the first indicates that an infinite (fil) unit is accepted, the second that
+math units are expected. When an integer or dimension is scanned, tokens are
+expanded
+till the input is a valid number or dimension. The \type {scan_glue} function
+takes one optional argument: a boolean indicating if the units are math.
+
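+For example, the following prints the dimension in scaled points, the integer
+form that \TEX\ uses internally:
+
+\starttyping
+\directlua{print(token.scan_dimen())} 10pt
+\stoptyping
+
+The console then shows \type {655360}, as one point equals 65536 scaled
+points.
+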
+The \type {scan_toks} function picks up a (normally) brace|-|delimited sequence of
+tokens and (as of \LUATEX\ 0.80) returns them as a table of tokens. The function \type
+{get_token} returns one (unexpanded) token while \type {scan_token} returns
+an expanded one.
+
+Because strings are natural to \LUA\ we also have \type {scan_string}. This one
+converts a following brace|-|delimited sequence of tokens into a proper string.
+
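+For example:
+
+\starttyping
+\directlua{print(token.scan_string())} {some text}
+\stoptyping
+
+prints \type {some text} to the console.
+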
+The function \type {scan_keyword} looks for the given keyword and when found skips
+over it and returns \type {true}. Here is an example of usage: \footnote {In
+\LUATEX\ 0.80 you should use \type {newtoken} instead of \type {token}.}
+
+\starttyping
+function ScanPair()
+ local one = 0
+ local two = ""
+ while true do
+ if token.scan_keyword("one") then
+ one = token.scan_int()
+ elseif token.scan_keyword("two") then
+ two = token.scan_string()
+ else
+ break
+ end
+ end
+ tex.print("one: ",one,"\\par")
+ tex.print("two: ",two,"\\par")
+end
+\stoptyping
+
+This can be used as:
+
+\starttyping
+\directlua{ScanPair()} one 123 two {some text}
+\stoptyping
+
+You can scan for an explicit character (class) with \type {scan_code}. This
+function takes a positive number as argument and returns a character or \type
+{nil}.
+
+\starttabulate[|r|r|l|]
+\NC \cldcontext{tokens.bits.escape } \NC 0 \NC \type{escape} \NC \NR
+\NC \cldcontext{tokens.bits.begingroup } \NC 1 \NC \type{begingroup} \NC \NR
+\NC \cldcontext{tokens.bits.endgroup } \NC 2 \NC \type{endgroup} \NC \NR
+\NC \cldcontext{tokens.bits.mathshift } \NC 3 \NC \type{mathshift} \NC \NR
+\NC \cldcontext{tokens.bits.alignment } \NC 4 \NC \type{alignment} \NC \NR
+\NC \cldcontext{tokens.bits.endofline } \NC 5 \NC \type{endofline} \NC \NR
+\NC \cldcontext{tokens.bits.parameter } \NC 6 \NC \type{parameter} \NC \NR
+\NC \cldcontext{tokens.bits.superscript} \NC 7 \NC \type{superscript} \NC \NR
+\NC \cldcontext{tokens.bits.subscript } \NC 8 \NC \type{subscript} \NC \NR
+\NC \cldcontext{tokens.bits.ignore } \NC 9 \NC \type{ignore} \NC \NR
+\NC \cldcontext{tokens.bits.space } \NC 10 \NC \type{space} \NC \NR
+\NC \cldcontext{tokens.bits.letter } \NC 11 \NC \type{letter} \NC \NR
+\NC \cldcontext{tokens.bits.other } \NC 12 \NC \type{other} \NC \NR
+\NC \cldcontext{tokens.bits.active } \NC 13 \NC \type{active} \NC \NR
+\NC \cldcontext{tokens.bits.comment } \NC 14 \NC \type{comment} \NC \NR
+\NC \cldcontext{tokens.bits.invalid } \NC 15 \NC \type{invalid} \NC \NR
+\stoptabulate
+
+So, if you want to grab the character you can say:
+
+\starttyping
+local c = token.scan_code(2^10 + 2^11 + 2^12)
+\stoptyping
+
+In \CONTEXT\ you can say:
+
+\starttyping
+local c = tokens.scanners.code(
+ tokens.bits.space +
+ tokens.bits.letter +
+ tokens.bits.other
+)
+\stoptyping
+
+When no argument is given, the next character with catcode letter or other is
+returned (if found).
+
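+So grabbing the next letter or other character is simply:
+
+\starttyping
+local c = tokens.scanners.code() -- catcode letter or other
+\stoptyping
+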
+In \CONTEXT\ we use the \type {tokens} namespace which has additional scanners
+available. That way we can remain compatible. I can add more scanners when
+needed, although it is not expected that users will use this mechanism directly.
+
+\starttabulate[||||]
+\NC \type {(new)token} \NC \type {tokens} \NC arguments \NC \NR
+\HL
+\NC \NC \type {scanners.boolean} \NC \NC \NR
+\NC \type {scan_code} \NC \type {scanners.code} \NC \type {(bits)} \NC \NR
+\NC \type {scan_dimen} \NC \type {scanners.dimension} \NC \type {(fill,math)} \NC \NR
+\NC \type {scan_glue} \NC \type {scanners.glue} \NC \type {(math)} \NC \NR
+\NC \type {scan_int} \NC \type {scanners.integer} \NC \NC \NR
+\NC \type {scan_keyword} \NC \type {scanners.keyword} \NC \NC \NR
+\NC \NC \type {scanners.number} \NC \NC \NR
+\NC \type {scan_token} \NC \type {scanners.token} \NC \NC \NR
+\NC \type {scan_tokens} \NC \type {scanners.tokens} \NC \NC \NR
+\NC \type {scan_string} \NC \type {scanners.string} \NC \NC \NR
+\NC \type {scan_word} \NC \type {scanners.word} \NC \NC \NR
+\NC \type {get_token} \NC \type {getters.token} \NC \NC \NR
+\NC \type {set_macro} \NC \type {setters.macro} \NC \type {(catcodes,cs,str,global)} \NC \NR
+\stoptabulate
+
+All except \type {get_token} (or its alias \type {getters.token}) expand tokens
+in order to satisfy the demands.
+
+Here are some examples of how we can use the scanners. If we were to call
+\type {Foo} with regular arguments we would do this:
+
+\starttyping
+\def\foo#1{%
+ \directlua {
+ Foo("whatever","#1",{n = 1})
+ }
+}
+\stoptyping
+
+but when \type {Foo} uses the scanners it becomes:
+
+\starttyping
+\def\foo#1{%
+ \directlua{Foo()} {whatever} {#1} n {1}\relax
+}
+\stoptyping
+
+In the first case we have a function \type {Foo} like this:
+
+\starttyping
+function Foo(what,str,n)
+ --
+ -- do something with these three parameters
+ --
+end
+\stoptyping
+
+and in the second variant we have (using the \type {tokens} namespace):
+
+\starttyping
+function Foo()
+ local what = tokens.scanners.string()
+ local str = tokens.scanners.string()
+ local n = tokens.scanners.keyword("n") and
+ tokens.scanners.integer() or 0
+ --
+ -- do something with these three parameters
+ --
+end
+\stoptyping
+
+The scanned string is kind of special as the result depends on what is seen.
+Given the following definition:
+
+\startbuffer
+ \def\bar {bar}
+\unexpanded\def\ubar {ubar} % \protected in plain etc
+ \def\foo {foo-\bar-\ubar}
+ \def\wrap {{foo-\bar}}
+ \def\uwrap{{foo-\ubar}}
+\stopbuffer
+
+\typebuffer
+
+\getbuffer
+
+We get:
+
+\def\TokTest{\ctxlua{
+ local s = tokens.scanners.string()
+ context("\\bgroup\\red\\tt")
+ context.verbatim(s)
+ context("\\egroup")
+}}
+
+\starttabulate[|l|Tl|]
+\NC \type{{foo}} \NC \TokTest {foo} \NC \NR
+\NC \type{{foo-\bar}} \NC \TokTest {foo-\bar} \NC \NR
+\NC \type{{foo-\ubar}} \NC \TokTest {foo-\ubar} \NC \NR
+\NC \type{foo-\bar} \NC \TokTest foo-\bar \NC \NR
+\NC \type{foo-\ubar} \NC \TokTest foo-\ubar \NC \NR
+\NC \type{foo$bar$} \NC \TokTest foo$bar$ \NC \NR
+\NC \type{\foo} \NC \TokTest \foo \NC \NR
+\NC \type{\wrap} \NC \TokTest \wrap \NC \NR
+\NC \type{\uwrap} \NC \TokTest \uwrap \NC \NR
+\stoptabulate
+
+Because scanners look ahead the following happens: when an open brace is seen (or
+any character marked as left brace) the scanner picks up tokens and expands them
+unless they are protected; so, effectively, it scans as if the body of an \type
+{\edef} is scanned. However, when the next token is a control sequence it will be
+expanded first to see if there is a left brace, so there we get the full
+expansion. In practice this is convenient behaviour because the braced variant
+permits us to pick up meanings honouring protection. Of course this is all a side
+effect of how \TEX\ scans.\footnote {This lookahead expansion can sometimes give
+unexpected side effects because often \TEX\ pushes back a token when a condition
+is not met. For instance when it scans a number, scanning stops when no digits
+are seen but the scanner has to look at the next (expanded) token in order to
+come to that conclusion. In the process it will, for instance, expand
+conditionals. This means that intermediate catcode changes will not be effective
+(or applied) to already-seen tokens that were pushed back into the input. This
+also happens with, for instance, \cs {futurelet}.}
+
+With the braced variant one can of course use primitives like \type {\detokenize}
+and \type {\unexpanded} (in \CONTEXT: \type {\normalunexpanded}, as we already
+had this mechanism before it was added to the engine).
+
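+For example, assuming the definitions above, the following is a way to check
+this on the console:
+
+\starttyping
+\directlua{print(token.scan_string())} {\detokenize{\foo}}
+\stoptyping
+
+Here \type {\detokenize} turns the macro into plain characters before the
+string scanner sees them, so the console shows the name \type {\foo} rather
+than its expanded meaning.
+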
+\stopsection
+
+\startsection[title=Considerations]
+
+Performance|-|wise there is not much difference between these methods. With some
+effort you can make the second approach faster than the first but in practice you
+will not notice much gain. So, the main motivation for using the scanner is that
+it provides a more \TEX|-|ified interface. When playing with the initial version
+of the scanners I did some tests with performance|-|sensitive \CONTEXT\ calls and
+the difference was measurable (positive) but deciding if and when to use the
+scanner approach was not easy. Sometimes embedded \LUA\ code looks better, and
+sometimes \TEX\ code. Eventually we will end up with a mix. Here are some
+considerations:
+
+\startitemize
+\startitem
+ In both cases there is the overhead of a \LUA\ call.
+\stopitem
+\startitem
+ In the pure \LUA\ case the whole argument is tokenized by \TEX\ and then
+ converted to a string that gets compiled by \LUA\ and executed.
+\stopitem
+\startitem
+ When the scan happens in \LUA\ there are extra calls to functions but
+ scanning still happens in \TEX; some token to string conversion is avoided
+ and compilation can be more efficient.
+\stopitem
+\startitem
+ When data comes from external files, parsing with \LUA\ is in most cases more
+ efficient than parsing by \TEX .
+\stopitem
+\startitem
+ A macro package like \CONTEXT\ wraps functionality in macros and is
+ controlled by key|/|value specifications. There is often no benefit in terms
+ of performance when delegating to the mentioned scanners.
+\stopitem
+\stopitemize
+
+Another consideration is that when using macros, parameters are often passed
+between \type {{}}:
+
+\starttyping
+\def\foo#1#2#3%
+ {...}
+\foo {a}{123}{b}
+\stoptyping
+
+and suddenly changing that to
+
+\starttyping
+\def\foo{\directlua{Foo()}}
+\stoptyping
+
+and using that as:
+
+\starttyping
+\foo {a} {b} n 123
+\stoptyping
+
+means that \type {{123}} will fail, as the integer scanner does not expect a
+brace. So, eventually you will end up with something like:
+
+\starttyping
+\def\myfakeprimitive{\directlua{Foo()}}
+\def\foo#1#2#3{\myfakeprimitive {#1} {#2} n #3 }
+\stoptyping
+
+and:
+
+\starttyping
+\foo {a} {b} {123}
+\stoptyping
+
+So in the end you don't gain much here apart from the fact that the fake
+primitive can be made more clever and accept optional arguments. But such new
+features are often hidden from the user, who uses more high|-|level wrappers.
+
+When you code in pure \TEX\ and want to grab a number directly you need to test
+for the braced case; when you use the \LUA\ scanner method you still need to test
+for braces. The scanners are consistent with the way \TEX\ works. Of course you
+can write helpers that do some checking for braces in \LUA, so there are no real
+limitations, but it adds some overhead (and maybe also confusion).
+
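+Such a brace|-|tolerant helper could look as follows; this is just a sketch,
+assuming that a failed \type {code} scan leaves the input untouched:
+
+\starttyping
+local scancode    = tokens.scanners.code
+local scaninteger = tokens.scanners.integer
+local bits        = tokens.bits
+
+local function scanbracedinteger()
+    if scancode(bits.begingroup) then
+        -- the braced case: {123}
+        local n = scaninteger()
+        scancode(bits.endgroup) -- gobble the right brace
+        return n
+    else
+        -- the bare case: 123
+        return scaninteger()
+    end
+end
+\stoptyping
+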
+One way to speed up the call is to use the \type {\luafunction} primitive in
+combination with predefined functions, and although both mechanisms can benefit
+from this, the scanner approach gets more out of that as this method cannot be
+used with regular function calls that get arguments. In (rather low level) \LUA\
+it looks like this:
+
+\starttyping
+luafunctions[1] = function()
+    local a = token.scan_string()
+    local n = token.scan_int()
+    local b = token.scan_string()
+ -- whatever --
+end
+\stoptyping
+
+And in \TEX:
+
+\starttyping
+\luafunction1 {a} 123 {b}
+\stoptyping
+
+This can of course be wrapped as:
+
+\starttyping
+\def\myprimitive{\luafunction1 }
+\stoptyping
+
+\stopsection
+
+\startsection[title=Applications]
+
+The question now pops up: where can this be used? Can you really make new
+primitives? The answer is yes. You can write code that exclusively stays on the
+\LUA\ side but you can also do some magic and then print back something to \TEX.
+Here we use the basic token interface, not \CONTEXT:
+
+\startbuffer
+\directlua {
+local token = newtoken or token
+function ColoredRule()
+ local w, h, d, c, t
+ while true do
+ if token.scan_keyword("width") then
+ w = token.scan_dimen()
+ elseif token.scan_keyword("height") then
+ h = token.scan_dimen()
+ elseif token.scan_keyword("depth") then
+ d = token.scan_dimen()
+ elseif token.scan_keyword("color") then
+ c = token.scan_string()
+ elseif token.scan_keyword("type") then
+ t = token.scan_string()
+ else
+ break
+ end
+ end
+ if c then
+ tex.sprint("\\color[",c,"]{")
+ end
+ if t == "vertical" then
+ tex.sprint("\\vrule")
+ else
+ tex.sprint("\\hrule")
+ end
+ if w then
+ tex.sprint("width ",w,"sp")
+ end
+ if h then
+ tex.sprint("height ",h,"sp")
+ end
+ if d then
+ tex.sprint("depth ",d,"sp")
+ end
+ if c then
+ tex.sprint("\\relax}")
+ end
+end
+}
+\stopbuffer
+
+\typebuffer \getbuffer
+
+This can be given a \TEX\ interface like:
+
+\startbuffer
+\def\myhrule{\directlua{ColoredRule()} type {horizontal} }
+\def\myvrule{\directlua{ColoredRule()} type {vertical} }
+\stopbuffer
+
+\typebuffer \getbuffer
+
+And used as:
+
+\startbuffer
+\myhrule width \hsize height 1cm color {darkred}
+\stopbuffer
+
+\typebuffer
+
+giving:
+
+% when no newtokens:
+%
+% \startbuffer
+% \blackrule[width=\hsize,height=1cm,color=darkred]
+% \stopbuffer
+
+\startlinecorrection \getbuffer \stoplinecorrection
+
+Of course \CONTEXT\ users can achieve the same with the \type {\blackrule}
+command:
+
+\startbuffer
+\blackrule[width=\hsize,height=1cm,color=darkgreen]
+\stopbuffer
+
+\typebuffer \startlinecorrection \getbuffer \stoplinecorrection
+
+The official \CONTEXT\ way to define such a new command is the following. The
+conversion back to verbose dimensions is needed because we pass back to \TEX.
+
+\startbuffer
+\startluacode
+local myrule = tokens.compile {
+ {
+ { "width", "dimension", "todimen" },
+ { "height", "dimension", "todimen" },
+ { "depth", "dimension", "todimen" },
+ { "color", "string" },
+ { "type", "string" },
+ }
+}
+
+interfaces.scanners.ColoredRule = function()
+ local t = myrule()
+ context.blackrule {
+ color = t.color,
+ width = t.width,
+ height = t.height,
+ depth = t.depth,
+ }
+end
+\stopluacode
+\stopbuffer
+
+\typebuffer \getbuffer
+
+With:
+
+\startbuffer
+\unprotect \let\myrule\clf_ColoredRule \protect
+\stopbuffer
+
+\typebuffer \getbuffer
+
+and
+
+\startbuffer
+\myrule width \textwidth height 1cm color {maincolor} \relax
+\stopbuffer
+
+\typebuffer
+
+we get:
+
+% when no newtokens:
+%
+% \startbuffer
+% \blackrule[width=\hsize,height=1cm,color=maincolor]
+% \stopbuffer
+
+\startlinecorrection \getbuffer \stoplinecorrection
+
+There are many ways to use the scanners and each has its charm. We will look at
+some alternatives from the perspective of performance. The timings are more meant
+as relative measures than absolute ones. After all it depends on the hardware. We
+assume the following shortcuts:
+
+\starttyping
+local scannumber = tokens.scanners.number
+local scankeyword = tokens.scanners.keyword
+local scanword = tokens.scanners.word
+\stoptyping
+
+We will scan for four different keys and values. The number is scanned using a
+helper \type {scannumber} that scans for a number that is acceptable for \LUA.
+Thus, \type {1.23} is valid, as are \type {0x1234} and \type {12.12E4}.
+
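+Such a helper can for instance be built on top of the word scanner; a minimal
+sketch (the actual \CONTEXT\ helper \type {tokens.scanners.number} may be more
+elaborate):
+
+\starttyping
+local function scannumber()
+    -- let Lua interpret whatever the word scanner picks up
+    return tonumber(scanword())
+end
+\stoptyping
+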
+% interfaces.scanners.test_scaling_a
+
+\starttyping
+function getmatrix()
+ local sx, sy = 1, 1
+ local rx, ry = 0, 0
+ while true do
+ if scankeyword("sx") then
+ sx = scannumber()
+ elseif scankeyword("sy") then
+ sy = scannumber()
+ elseif scankeyword("rx") then
+ rx = scannumber()
+ elseif scankeyword("ry") then
+ ry = scannumber()
+ else
+ break
+ end
+ end
+ -- action --
+end
+\stoptyping
+
+Scanning the following specification 100000 times takes 1.00 seconds:
+
+\starttyping
+sx 1.23 sy 4.5 rx 1.23 ry 4.5
+\stoptyping
+
+The \quote {tight} case takes 0.94 seconds:
+
+\starttyping
+sx1.23 sy4.5 rx1.23 ry4.5
+\stoptyping
+
+% interfaces.scanners.test_scaling_b
+
+We can compare this to scanning without keywords. In that case there have to be
+exactly four arguments. These have to be given in the right order which is no big
+deal as often such helpers are encapsulated in a user|-|friendly macro.
+
+\starttyping
+function getmatrix()
+ local sx, sy = scannumber(), scannumber()
+ local rx, ry = scannumber(), scannumber()
+ -- action --
+end
+\stoptyping
+
+As expected, this is more efficient than the previous examples. It takes 0.80
+seconds to scan this 100000 times:
+
+\starttyping
+1.23 4.5 1.23 4.5
+\stoptyping
+
+A third alternative is the following:
+
+\starttyping
+function getmatrix()
+ local sx, sy = 1, 1
+ local rx, ry = 0, 0
+ while true do
+ local kw = scanword()
+ if kw == "sx" then
+ sx = scannumber()
+ elseif kw == "sy" then
+ sy = scannumber()
+ elseif kw == "rx" then
+ rx = scannumber()
+ elseif kw == "ry" then
+ ry = scannumber()
+ else
+ break
+ end
+ end
+ -- action --
+end
+\stoptyping
+
+Here we scan for a word and assign a number to the matching variable. This
+single word scan happens to be less efficient than calling \type {scan_keyword}
+up to 10 times ($4+3+2+1$) as in the explicit variant. This run takes 1.11
+seconds for the next line.
+The spaces are really needed as words can be anything that has no space.
+\footnote {Hard|-|coding the word scan in a \CCODE\ helper makes little sense, as
+different macro packages can have different assumptions about what a word is. And
+we don't extend \LUATEX\ for specific macro packages.}
+
+\starttyping
+sx 1.23 sy 4.5 rx 1.23 ry 4.5
+\stoptyping
+
+Of course these numbers need to be compared to a baseline of no scanning (i.e.\
+the overhead of a \LUA\ call), which here amounts to 0.10 seconds. Subtracting
+that overhead brings us to the following table.
+
+\starttabulate[|l|l|]
+\NC keyword checks \NC 0.9 sec\NC \NR
+\NC no keywords \NC 0.7 sec\NC \NR
+\NC word checks \NC 1.0 sec\NC \NR
+\stoptabulate
+
+The differences are not that impressive given the number of calls. Even in a
+complex document the overhead of scanning can be negligible compared to the
+actions involved in typesetting the document. In fact, there will always be some
+kind of scanning for such macros so we're talking about even less impact. So you
+can just use the method you like most. In practice, the extra overhead of using
+keywords in combination with explicit checks (the first case) is rather
+convenient.
+
+If you don't want to have many tests you can do something like this:
+
+\starttyping
+local keys = {
+ sx = scannumber, sy = scannumber,
+ rx = scannumber, ry = scannumber,
+}
+
+function getmatrix()
+    local values = { }
+    while true do
+        local found = false
+        for key, scan in next, keys do
+            if scankeyword(key) then
+                values[key] = scan()
+                found = true
+                break -- rescan all keys, as they can come in any order
+            end
+        end
+        if not found then
+            break -- no key matched, so we are done
+        end
+    end
+    -- action --
+end
+\stoptyping
+
+This is still quite fast although one now has to access the values in a table.
+Working with specifications like this is clean anyway so in \CONTEXT\ we have a
+way to abstract the previous definition.
+
+\starttyping
+local specification = tokens.compile {
+ {
+ { "sx", "number" }, { "sy", "number" },
+ { "rx", "number" }, { "ry", "number" },
+ },
+}
+
+function getmatrix()
+ local values = specification()
+ -- action using values.sx etc --
+end
+\stoptyping
+
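+At the \TEX\ end such a scanner can then be hooked in and used like this (a
+sketch, with a hypothetical wrapper \type {\scanmatrix}):
+
+\starttyping
+\def\scanmatrix{\directlua{getmatrix()}}
+
+\scanmatrix sx 1.2 sy 3.4 \relax
+\stoptyping
+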
+Although one can make complex definitions this way, the question remains if it
+is a better approach than passing \LUA\ tables. The standard \CONTEXT\ way for
+controlling features is:
+
+\starttyping
+\getmatrix[sx=1.2,sy=3.4]
+\stoptyping
+
+So it doesn't matter much if deep down we see:
+
+\starttyping
+\def\getmatrix[#1]%
+ {\getparameters[@@matrix][sx=1,sy=1,rx=1,ry=1,#1]%
+ \domatrix
+ \@@matrixsx
+ \@@matrixsy
+ \@@matrixrx
+ \@@matrixry
+ \relax}
+\stoptyping
+
+or:
+
+\starttyping
+\def\getmatrix[#1]%
+ {\getparameters[@@matrix][sx=1,sy=1,rx=1,ry=1,#1]%
+ \domatrix
+ sx \@@matrixsx
+ sy \@@matrixsy
+ rx \@@matrixrx
+ ry \@@matrixry
+ \relax}
+\stoptyping
+
+In the second variant (with keywords) \type {\domatrix} can be a scanner like
+the one we defined before:
+
+\starttyping
+\def\domatrix{\directlua{getmatrix()}}
+\stoptyping
+
+while in the first variant (without keywords) it can grab its four arguments
+and pass them along:
+
+\starttyping
+\def\domatrix#1#2#3#4%
+ {\directlua{getmatrix(#1,#2,#3,#4)}}
+\stoptyping
+
+given:
+
+\starttyping
+function getmatrix(sx,sy,rx,ry)
+ -- action using sx etc --
+end
+\stoptyping
+
+or maybe nicer:
+
+\starttyping
+\def\domatrix#1#2#3#4%
+  {\directlua{getmatrix{
+ sx = #1,
+ sy = #2,
+ rx = #3,
+ ry = #4
+ }}}
+\stoptyping
+
+assuming:
+
+\starttyping
+function getmatrix(values)
+ -- action using values.sx etc --
+end
+\stoptyping
+
+If you go for speed the scanner variant without keywords is the most efficient
+one. For readability the scanner variant with keywords or the last shown example
+where a table is passed is better. For flexibility the table variant is best as
+it makes no assumptions about the scanner \emdash\ the token scanner can quit on
+unknown keys, unless that is intercepted of course. But as mentioned before, even
+the advantage of the fast one should not be overestimated. When you trace usage
+it can be that the (in this case matrix) macro is called only a few thousand
+times and that doesn't really add up. Of course many different sped-up calls can
+make a difference but then one really needs to optimize consistently the whole
+code base and that can conflict with readability. The token library presents us
+with a nice chicken||egg problem but nevertheless is fun to play with.
+
+\stopsection
+
+\startsection[title=Assigning meanings]
+
+The token library also provides a way to create tokens and access properties but
+that interface can change with upcoming versions when the old library is replaced
+by the new one and the input handling is cleaned up. One experimental function is
+worth mentioning:
+
+\starttyping
+token.set_macro("foo","the meaning of bar")
+\stoptyping
+
+This will turn the given string into tokens that get assigned to \type {\foo}.
+Here are some alternative calls:
+
+\starttabulate
+\NC \type {set_macro("foo")} \NC \type { \def \foo {}} \NC \NR
+\NC \type {set_macro("foo","meaning")} \NC \type { \def \foo {meaning}} \NC \NR
+\NC \type {set_macro("foo","meaning","global")} \NC \type {\gdef \foo {meaning}} \NC \NR
+\stoptabulate
+
+The conversion to tokens happens under the current catcode regime. You can
+enforce a different regime by passing the number of an allocated catcode table
+as the first argument, just as with \type {tex.print}. As we mentioned
+performance before: setting a macro at the \LUA\ end like this:
+
+\starttyping
+token.set_macro("foo","meaning")
+\stoptyping
+
+is about two times as fast as:
+
+\starttyping
+tex.sprint("\\def\\foo{meaning}")
+\stoptyping
+
+or (with slightly more overhead) in \CONTEXT\ terms:
+
+\starttyping
+context("\\def\\foo{meaning}")
+\stoptyping
+
+The next variant is actually slower (even when we alias \type {setvalue}):
+
+\starttyping
+context.setvalue("foo","meaning")
+\stoptyping
+
+but although 0.4 versus 0.8 seconds looks like a lot, on a \TEX\ run I need a
+million calls to see such a difference, and a million macro definitions during a
+run is a lot. The different assignments involved in, for instance, 3000 entries
+in a bibliography (with an average of 5 assignments per entry) can hardly be
+measured as we're talking about milliseconds. So again, it's mostly a matter of
+convenience when using this function, not a necessity.
+
+\stopsection
+
+\startsection[title=Conclusion]
+
+For sure we will see usage of the new scanner code in \CONTEXT, but to what
+extent remains to be seen. The performance gain is not impressive enough to
+justify many changes to the code but as the low|-|level interfacing can sometimes
+become a bit cleaner it will be used in specific places, even if we sacrifice
+some speed (which then probably will be compensated for by a little gain
+elsewhere).
+
+The scanners will probably never be used by users directly simply because there
+are no such low level interfaces in \CONTEXT\ and because manipulating input is
+easier in \LUA. Even deep down in the internals of \CONTEXT\ we will use wrappers
+and additional helpers around the scanner code. Of course there is the fun-factor
+and playing with these scanners is fun indeed. The macro setters have as their
+main benefit that using them can be nicer in the \LUA\ source, and of course
+setting a macro this way is also conceptually cleaner (just like we can set
+registers).
+
+Of course there are some challenges left, like determining if we are scanning
+input or already converted tokens (for instance in a macro body or token\-list
+expansion). Once we can properly feed back tokens we can also look ahead like
+\type {\futurelet} does. But for that to happen we will first clean up the
+\LUATEX\ input scanner code and error handler.
+
+\stopsection
+
+\stopchapter
+
+\stoptext
+