% language=us

% This feature was done mid May 2020 with Alien Chatter (I really need to find the
% original cd's) in the background.

\environment evenmore-style

\startcomponent evenmore-parameters

\startchapter[title=Parameters]

When \TEX\ reads input it either does something directly, like setting a
register, loading a font, turning a character into a glyph node or packaging a
box, or it collects tokens and stores them: in a macro (definition), in a token
register, or someplace temporary in order to inject them into the input later.
Here we'll be discussing macros, which have a special token list containing the
preamble defining the arguments and a body doing the real work. For instance when
you say:

\starttyping[option=TEX]
\def\foo#1#2{#1 + #2 + #1 + #2}
\stoptyping

the macro \type {\foo} is stored in such a way that it knows how to pick up the
two arguments and when expanding the body, it will inject the collected arguments
each time a reference like \type {#1} or \type {#2} is seen. In fact, quite
often, \TEX\ pushes a list of tokens (like an argument) into the input stream and
then takes a detour by fetching tokens from that list. Because \TEX\ does all its
memory management itself the price of all that copying is not that high, although
during a long and more complex run the individual tokens that make up the forward
linked list get scattered in token memory, and memory access is still the
bottleneck in processing.

A somewhat simplified view of how a macro like this gets stored is the following:

\starttyping
hash entry "foo" with property "macro call" =>

    match (# property stored)
    match (# property stored)
    end of match

    match reference 1
    other character +
    match reference 2
    other character +
    match reference 1
    other character +
    match reference 2
\stoptyping
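
A quick way to see both parts from the user end, assuming nothing beyond the
definition above, is to ask for the meaning:

\starttyping[option=TEX]
\def\foo#1#2{#1 + #2 + #1 + #2}
\meaning\foo
\stoptyping

This reports \type {macro:#1#2->#1 + #2 + #1 + #2}, where the part before the
arrow is the serialized preamble and the part after it is the body with its
references.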

When a macro gets expanded, the scanner first collects all the passed arguments
and then pushes those (in this case two) token lists on the parameter stack. Keep
in mind that due to nesting many kinds of stacks play a role. When the body gets
expanded and a reference is seen, the argument that it refers to gets injected
into the input, so imagine that we have this definition:

\starttyping[option=TEX]
\def\foo#1#2{\ifdim\dimen0=0pt #1\else #2\fi}
\stoptyping

and we say:

\starttyping[option=TEX]
\foo{yes}{no}
\stoptyping

then it's as if we had typed:

\starttyping[option=TEX]
\ifdim\dimen0=0pt yes\else no\fi
\stoptyping

So, you'd better not have something in the arguments that messes up the condition
parser! From the perspective of an expansion machine it all makes sense. But it
also means that when arguments are not used, they still get parsed and stored.
Imagine using this one:

\starttyping[option=TEX]
\def\foo#1{\iffalse#1\oof#1\oof#1\oof#1\oof#1\fi}
\stoptyping

When \TEX\ sees that the condition is false it will enter a fast scanning mode
where it only looks at condition related tokens, so even if \type {\oof} is not
defined this will work ok:

\starttyping[option=TEX]
\foo{!}
\stoptyping

But when we say this:

\starttyping[option=TEX]
\foo{\else}
\stoptyping

It will bark! This is because each \type {#1} reference will be resolved, so we
effectively have

\starttyping[option=TEX]
\def\foo#1{\iffalse\else\oof\else\oof\else\oof\else\oof\else\fi}
\stoptyping

which is not good. On the other hand, since no expansion takes place in quick
parsing mode, this will work:

\starttyping[option=TEX]
\def\oof{\else}
\foo\oof
\stoptyping

which actually is:

\starttyping[option=TEX]
\def\foo#1{\iffalse\oof\oof\oof\oof\oof\oof\oof\oof\oof\fi}
\stoptyping

So, a reference to an argument effectively is just a replacement. As long as you
keep that in mind, and realize that while \TEX\ is skipping \quote {if} branches
nothing gets expanded, you're okay.
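
If an argument may contain condition related tokens, a common way out is to keep
the arguments outside the branches and only let the condition select a helper.
The following is a sketch of that idea, not something taken from the \CONTEXT\
core: it redefines the \type {\foo} from the earlier \type {\ifdim} example and
relies on the \type {\firstoftwoarguments} and \type {\secondoftwoarguments}
helpers that show up later in this chapter.

\starttyping[option=TEX]
\def\foo#1#2%
  {\ifdim\dimen0=0pt
     \expandafter\firstoftwoarguments  % true : \else (and all up to \fi) goes first
   \else
     \expandafter\secondoftwoarguments % false: \fi is expanded before we select
   \fi{#1}{#2}}                        % both arguments come after the \fi

\dimen0=0pt
\foo{yes}{\else} % gives: yes
\stoptyping

Because both arguments follow the \type {\fi}, the skipped branch never contains
user input, and the argument that is not selected is simply dropped, whatever it
contains.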

Most users will associate the \type {#} character with macro arguments or
preambles in low level alignments, but since most macro packages provide a higher
level set of table macros the latter is less well known. But, as often with
characters in \TEX, you can do magic things:

\starttyping[option=TEX]
\catcode`?=\catcode`#

\def\foo #1#2?3{?1?2?3} \meaning\foo\space=>\foo{1}{2}{3}\par
\def\foo ?1#2?3{?1?2?3} \meaning\foo\space=>\foo{1}{2}{3}\par
\def\foo ?1?2#3{?1?2?3} \meaning\foo\space=>\foo{1}{2}{3}\par
\stoptyping

Here the question mark also indicates a macro argument. However, when expanded
we see this as result:

\starttyping
macro:#1#2?3->?1?2?3 =>123
macro:?1#2?3->?1?2?3 =>123
macro:?1?2#3->#1#2#3 =>123
\stoptyping

The last used argument signal character (officially called a match character,
here we have two that fit that category, \type {#} and \type {?}) is used in the
serialization! Now, there is an interesting aspect here. When \TEX\ stores the
preamble, as in our first example:

\starttyping
    match (# property stored)
    match (# property stored)
    end of match
\stoptyping

the property is stored, so in the later example we get:

\starttyping
    match (# property stored)
    match (# property stored)
    match (? property stored)
    end of match
\stoptyping

But in the macro body the number is stored instead, because we need it as
reference to the parameter, so when that bit gets serialized \TEX\ (or more
accurately: \LUATEX, which is what we're using here) doesn't know what specific
signal was used. When the preamble is serialized it does keep track of the last
so|-|called match character. This is why we see this inconsistency in rendering.

A simple solution would be to store the used signal for the match argument, which
probably only takes a few lines of extra code (using a nine integer array instead
of a single integer), and use that instead. I'm willing to see that as a bug in
\LUATEX\ but when I ran into it I was playing with something else: adding the
ability to prevent storing unused arguments. But the resulting confusion can make
one wonder why we do not always serialize the match character as \type {#}.

It was then that I noticed that the preamble stored the match tokens and not the
number and that \TEX\ in fact assumes that no mixture is used. And, after
prototyping that in itself trivial change I decided that in order to properly
serialize this new feature it also made sense to always serialize the match token
as \type {#}. I simply prefer consistency over confusion and so I caught two
flies in one stroke. The new feature is indicated with a \type {#0} parameter:

\startbuffer

\bgroup
\catcode`?=\catcode`#

\def\foo ?1?0?3{?1?2?3} \meaning\foo\space=>\foo{1}{2}{3}\crlf
\def\foo ?1#0?3{?1?2?3} \meaning\foo\space=>\foo{1}{2}{3}\crlf
\def\foo #1#2?3{?1?2?3} \meaning\foo\space=>\foo{1}{2}{3}\crlf
\def\foo ?1#2?3{?1?2?3} \meaning\foo\space=>\foo{1}{2}{3}\crlf
\def\foo ?1?2#3{?1?2?3} \meaning\foo\space=>\foo{1}{2}{3}\crlf
\egroup
\stopbuffer

\typebuffer[option=TEX]

\start
\getbuffer
\stop

So, what is the rationale behind this new \type{#0} variant? Quite often you
don't want to do something with an argument at all. This happens when a macro
acts upon for instance a first argument and then expands another macro that
follows up but only deals with one of many arguments and discards the rest. Then
it makes no sense to store unused arguments. Keep in mind that in order to use it
more than once an argument does need to be stored, because the parser only looks
forward. In principle there could be some optimization in case the tokens come
from macros but we leave that for now. So, when we don't need an argument, we can
avoid storing it and just skip over it. Consider the following:

\startbuffer
\def\foo     #1{\ifnum#1=1 \expandafter\fooone\else\expandafter\footwo\fi}
\def\fooone#1#0{#1}
\def\footwo#0#2{#2}
\foo{1}{yes}{no}
\foo{0}{yes}{no}
\stopbuffer

\typebuffer[option=TEX]

We get:

\getbuffer

Just for the record, tracing of a macro shows that indeed there is no argument
stored:

\starttyping[option=TEX]
\def\foo#1#0#3{....}
\foo{11}{22}{33}
\foo #1#0#3->....
#1<-11
#2<-
#3<-33
\stoptyping

Now, you can argue, what is the benefit of not storing tokens? As mentioned
above, the \TEX\ engines do their own memory management. \footnote {An added
benefit is that dumping and undumping is relatively efficient too.} This has
large benefits in performance especially when one keeps in mind that tokens get
allocated and are recycled constantly (take only lookahead and push back).

However, even if this means that storing a couple of unused arguments doesn't put
much of a dent in performance, it does mean that a token sits somewhere in memory
and that this bit of memory needs to get accessed. Again, this is no big deal on
a computer where a \TEX\ job can take one core and basically is the only process
fighting for \CPU\ cache usage. But less memory access might be more relevant in
a scenario of multiple virtual machines running on the same hardware or multiple
\TEX\ processes on one machine. I didn't carefully measure that so I might be
wrong here. Anyway, it's always good to avoid moving around data when there is no
need for it.

Just to temper expectations with respect to performance, here are some examples:

\starttyping[option=TEX]
\catcode`!=9 % ignore this character
\firstoftwoarguments
  {!!!!!!!!!!!!!!!!!!!}{!!!!!!!!!!!!!!!!!!!}
\secondoftwoarguments
  {!!!!!!!!!!!!!!!!!!!}{!!!!!!!!!!!!!!!!!!!}
\secondoffourarguments
  {!!!!!!!!!!!!!!!!!!!}{!!!!!!!!!!!!!!!!!!!}
  {!!!!!!!!!!!!!!!!!!!}{!!!!!!!!!!!!!!!!!!!}
\stoptyping

In \CONTEXT\ we define these macros as follows:

\starttyping[option=TEX]
\def\firstoftwoarguments      #1#2{#1}
\def\secondoftwoarguments     #1#2{#2}
\def\secondoffourarguments#1#2#3#4{#2}
\stoptyping

The performance of 2 million expansions is the following (probably half or less
on a more modern machine):

\starttabulate[||||]
\BC macro                          \BC total (s) \BC step (s) \NC \NR
\NC \type {\firstoftwoarguments}   \NC 0.245 \NC 0.000000123 \NC \NR
\NC \type {\secondoftwoarguments}  \NC 0.251 \NC 0.000000126 \NC \NR
\NC \type {\secondoffourarguments} \NC 0.390 \NC 0.000000195 \NC \NR
\stoptabulate

But we could use this instead:

\starttyping[option=TEX]
\def\firstoftwoarguments      #1#0{#1}
\def\secondoftwoarguments     #0#2{#2}
\def\secondoffourarguments#0#2#0#0{#2}
\stoptyping

which gives:

\starttabulate[||||]
\BC macro                          \BC total (s) \BC step (s) \NC \NR
\NC \type {\firstoftwoarguments}   \NC 0.229 \NC 0.000000115 \NC \NR
\NC \type {\secondoftwoarguments}  \NC 0.236 \NC 0.000000118 \NC \NR
\NC \type {\secondoffourarguments} \NC 0.323 \NC 0.000000162 \NC \NR
\stoptabulate

So, no impressive differences (about 6 percent for the two argument macros and
about 17 percent for the four argument one), especially when one considers that
when that many
expansions happen in a run, getting the document itself rendered plus expanding
real arguments (not something defined to be ignored) will take way more time
compared to this. I always test an extension like this on the test suite
\footnote {Currently some 1600 files that take 24 minutes plus or minus 30
seconds to process on a high end 2013 laptop. The 260 page manual with lots of
tables, verbatim and \METAPOST\ images takes around 11 seconds. A few
milliseconds more or less don't really show here. I only time these runs because
I want to make sure that there are no dramatic consequences.} as well as the
\LUAMETATEX\ manual (which takes about 11 seconds) and although one can notice a
little gain, it makes more sense not to play music on the same machine as we run
the \TEX\ job, if gaining milliseconds is that important. But, as said, it's more
about unnecessary memory access than about \CPU\ cycles.
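
For those who want to repeat such a comparison, a sketch could look as follows.
It assumes the \CONTEXT\ helpers \type {\testfeatureonce} (repeat some code a
given number of times) and \type {\elapsedtime} (report the seconds that took).
The two \type {\Test...} macros and \type {\Result} are made up for the occasion,
and the \type {\edef} only serves to expand without typesetting.

\starttyping[option=TEX]
\def\TestStored #1#2{#1} % hypothetical: both arguments are stored
\def\TestSkipped#1#0{#1} % hypothetical, LuaMetaTeX only: second one is skipped

\testfeatureonce{2000000}{\edef\Result{\TestStored {yes}{no}}} \elapsedtime \par
\testfeatureonce{2000000}{\edef\Result{\TestSkipped{yes}{no}}} \elapsedtime \par
\stoptyping

The absolute numbers depend on the machine and on whatever else is running, so
only the difference between the two lines is of interest.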

This extension is downward compatible and its overhead can be neglected. Okay,
the serialization now always uses \type {#} but it was inconsistent before, so
I'm willing to sacrifice that (and I'm pretty sure no \CONTEXT\ user cares or
will even notice). Also, it's only in \LUAMETATEX\ (for now) so that other macro
packages don't suffer from this patch. The few cases where \CONTEXT\ can benefit
from it are easy to isolate for \MKIV\ and \LMTX\ so we can support \LUATEX\ and
\LUAMETATEX.

I mentioned \LUATEX\ and how it serializes, but for the record, let's see how
\PDFTEX, which is very close to original \TEX\ in terms of source code, does it.
If we have this input:

\starttyping[option=TEX]
\catcode`D=\catcode`#
\catcode`O=\catcode`#
\catcode`N=\catcode`#
\catcode`-=\catcode`#
\catcode`K=\catcode`#
\catcode`N=\catcode`#
\catcode`U=\catcode`#
\catcode`T=\catcode`#
\catcode`H=\catcode`#

\def\dek D1O2N3-4K5N6U7T8H9{#1#2#3 #4#6#7#8#9}

{\meaning\dek \tracingall \dek don{}knuth}
\stoptyping

The meaning gets typeset as:

\starttyping
macro:D1O2N3-4K5N6U7T8H9->H1H2H3 H4H6H7H8H9don nuth
\stoptyping

while the tracing reports:

\starttyping
\dek D1O2N3-4K5N6U7T8H9->H1H2H3 H4H6H7H8H9
D1<-d
O2<-o
N3<-n
-4<-
K5<-k
N6<-n
U7<-u
T8<-t
H9<-h
\stoptyping

The reason for the difference, as mentioned, is that the tracing uses the
template and therefore uses the stored match token, while the meaning uses the
reference match tokens that carry the number and at that time has no access to
the original match token. Keeping track of that for the sake of tracing would not
make sense anyway. So, traditional \TEX, which is what \PDFTEX\ is very close to,
uses the last used match token, the \type {H}. Maybe this example can convince
you that dropping that bit of log related compatibility is not that much of a
problem. I just tell myself that I turned an unwanted side effect into a new
feature.

\subject{A few side notes}

The fact that characters can be given a special meaning is one of the charming
properties of \TEX. Take these two cases:

\starttyping[option=TEX]
\bgroup\catcode`\&=4 &\egroup
\bgroup\catcode`\!=4 !\egroup
\stoptyping

In both lines there is now an alignment character used outside an alignment. And,
in both cases the error message is similar:

\starttyping
! Misplaced alignment tab character &
! Misplaced alignment tab character !
\stoptyping

So, indeed the right character is shown in the message. But, as soon as you ask
for help, there is a difference: in the first case the help is specific for a tab
character, but in the second case a more generic explanation is given. Just try
it.

The reason is an explicit check for the ampersand being used as tab character.
Such is the charm of \TEX. I'll probably opt for a trivial change to be
consistent here, although in \CONTEXT\ the ampersand is just an ampersand so no
user will notice.

There are a few more places where, although in principle any character can serve
any purpose, there are hard coded assumptions, like \type {$} being used for
math, so a missing dollar is reported, even if math started with another
character being used to enter math mode. This makes sense because there is no
urgent need to keep track of what specific character was used for entering math
mode. An even stronger argument could be that \TEX ies expect dollars to be used
for that purpose. Of course this works fine:

\starttyping[option=TEX]
\catcode`€=\catcode`$
€ \sqrt{x^3} €
\stoptyping

But when we forget an \type {€} we get messages like:

\starttyping
! Missing $ inserted
\stoptyping

or more generic:

\starttyping
! Extra }, or forgotten $
\stoptyping

which is definitely a confirmation of \quotation {America first}. Of course we
can compromise in display math because this is quite okay:

\starttyping[option=TEX]
\catcode`€=\catcode`$
$€ \sqrt{x^3} €$
\stoptyping

unless of course we forget the last dollar in which case we are told that

\starttyping
! Display math should end with $$
\stoptyping

so no matter what, the dollar wins. Given how ugly the Euro sign looks I can live
with this, although I always wonder what character would have been taken if \TEX\
had been developed in another country.

\stopchapter

\stopcomponent