doc/context/sources/general/manuals/onandon/onandon-execute.tex


1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420

% language=uk

\startcomponent onandon-execute

\environment onandon-environment

\startchapter[title={Executing \TEX}]

Much of the \LUA\ code in \CONTEXT\ originates from experiments. What survives in
the source code is probably either used, waiting to be used, or kept for
educational purposes. The functionality that we describe here has already been
present for a while in \CONTEXT, but has been improved a little starting with
\LUATEX\ 1.08 due to an extra helper. The code shown here is generic and is not
used in \CONTEXT\ as such.

Say that we have this code:

\startbuffer
for i=1,10000 do
    tex.sprint("1")
    tex.sprint("2")
    for i=1,3 do
        tex.sprint("3")
        tex.sprint("4")
        tex.sprint("5")
    end
    tex.sprint("\\space")
end
\stopbuffer

\typebuffer

% \ctxluabuffer

When we call \type {\directlua} with this snippet we get some 30 pages of \type
{12345345345}. The printed text is saved until the end of the \LUA\ call so
basically we pipe some 170\,000 characters to \TEX\ that get interpreted as one
paragraph.

Now imagine this:

\startbuffer
\setbox0\hbox{xxxxxxxxxxx} \number\wd0
\stopbuffer

\typebuffer

which gives \getbuffer (the width of the \type {box0} register). If we check the
box in \LUA, with:

\startbuffer
tex.sprint(tex.box[0].width)
tex.sprint("\\enspace")
tex.sprint("\\setbox0\\hbox{!}")
tex.sprint(tex.box[0].width)
\stopbuffer

\typebuffer

the result is {\tttf \ctxluabuffer} i.e. the same number repeated, which is not
what you would expect at first sight. However, if you consider that we just pipe
to a \TEX\ buffer that gets parsed \italic {after} the \LUA\ call, it will be
clear that the reported width is each time the width that we started with. Our
code will work all right if we use:

\startbuffer
tex.sprint(tex.box[0].width)
tex.sprint("\\enspace")
tex.sprint("\\setbox0\\hbox{!}")
tex.sprint("\\directlua{tex.sprint(tex.box[0].width)}")
\stopbuffer

\typebuffer

and now we get: {\tttf\ctxluabuffer}, but this use is a bit awkward.

It's not that complex to write some support code that is convenient and this can
work out quite well but there is a drawback. If we add references to the status
of the input pointer:

\startbuffer
print(status.input_ptr)
tex.sprint(tex.box[0].width)
tex.sprint("\\enspace")
tex.sprint("\\setbox0\\hbox{!}")
tex.sprint("\\directlua{print(status.input_ptr)\
    tex.sprint(tex.box[0].width)}")
\stopbuffer

\typebuffer

we then get \type {6} and \type {7} reported. You can imagine that when a lot of
nested \type {\directlua} calls happen, this can lead to an overflow of the input
level or (depending on what we do) the input stack size. Ideally we want to do a
\LUA\ call, temporarily go to \TEX, return to \LUA, etc.\ without needing to
worry about nesting and possible crashes due to \LUA\ itself running into
problems. One charming solution is to use so|-|called coroutines: independent
\LUA\ threads that one can switch between --- you jump out from the current
routine to another and from there back to the current one. However, when we use
\type {\directlua} for that, we still have this nesting issue and what is worse,
we keep nesting function calls too. This can be compared to:

\starttyping
\def\whatever{\ifdone\whatever\fi}
\stoptyping

where at some point \type {\ifdone} would be false so we quit, but we keep
nesting when the condition is met and eventually we will end up with some nesting
related overflow. The following:

\starttyping
\def\whatever{\ifdone\expandafter\whatever\fi}
\stoptyping

is less likely to overflow because there we have tail recursion which basically
boils down to not nesting but continuing. Do we have something similar in
\LUATEX\ for \LUA ? Yes, we do. We can register a function, for instance:

\starttyping
lua.get_functions_table()[1] = function() print("Hi there!") end
\stoptyping

and call that one with:

\starttyping
\luafunction 1
\stoptyping

This is a bit faster than calling a function such as:

\starttyping
\directlua{HiThere()}
\stoptyping

which can also be achieved by

\starttyping
\directlua{print("Hi there!")}
\stoptyping

and is sometimes more convenient. Don't overestimate the gain in speed because
\type {directlua} is quite efficient too (and on an average run a user doesn't
call it that often, millions of times that is). Anyway, a function call is what
we can use for our purpose as it doesn't involve interpretation and effectively
behaves like a tail call. The following snippet shows what we have in mind:

\startbuffer[demo]
tex.routine(function()
    tex.sprint(tex.box[0].width)
    tex.sprint("\\enspace")
    tex.sprint("\\setbox0\\hbox{!}")
    tex.yield()
    tex.sprint(tex.box[0].width)
end)
\stopbuffer

\typebuffer[demo]

\startbuffer[code]
local stepper = nil
local stack   = { }
local fid     = 2 -- make sure to take a free slot
local goback  = "\\luafunction" .. fid .. "\\relax"

function tex.resume()
    if coroutine.status(stepper) == "dead" then
        stepper = table.remove(stack)
    end
    if stepper then
        coroutine.resume(stepper)
    end
end

lua.get_functions_table()[fid] = tex.resume

function tex.yield()
    tex.sprint(goback)
    coroutine.yield()
    texio.closeinput()
end

function tex.routine(f)
    table.insert(stack,stepper)
    stepper = coroutine.create(f)
    tex.sprint(goback)
end

-- Because we protect against abuse and overload of functions, in ConTeXt we
-- need to do the following:

if context then
    fid    = context.functions.register(tex.resume)
    goback = "\\luafunction" .. fid .. "\\relax"
end
\stopbuffer

We start a routine, jump out to \TEX\ in the middle, come back when we're done
and continue. This gives us: \ctxluabuffer [code,demo], which is what we expect.

% What does this accomplish (or is it left over)?
%
% \setbox0\hbox{xxxxxxxxxxx}
%
% \ctxluabuffer[demo]

This mechanism permits efficient (nested) loops like:

\startbuffer[demo]
tex.routine(function()
    for i=1,10000 do
        tex.sprint("1")
        tex.yield()
        tex.sprint("2")
        tex.routine(function()
            for i=1,3 do
                tex.sprint("3")
                tex.yield()
                tex.sprint("4")
                tex.yield()
                tex.sprint("5")
            end
        end)
        tex.sprint("\\space")
        tex.yield()
    end
end)
\stopbuffer

\typebuffer[demo]

We do create coroutines, go back and forwards between \LUA\ and \TEX, but avoid
memory being filled up with printed content. If we flush paragraphs (instead of
e.g.\ the space) then the main difference is that instead of a small delay due to
the loop unfolding in a large set of prints and accumulated content, we now get a
steady flushing and processing.

However, even using this scheme we can still have an overflow of input buffers
because we still nest them: the limitation at the \TEX\ end has moved to a
limitation at the \LUA\ end. How come? Here is the code that we use defining the
function \type {tex.yield()}:

\typebuffer[code]

The \type {routine} creates a coroutine, and \type {yield} gives control to \TEX.
The \type {resume} is done at the \TEX\ end when we're finished there. In
practice this works fine and when you permit enough nesting and levels in \TEX\
then you will not easily overflow.

When I picked up this side project and wondered how to get around it, it suddenly
struck me that if we could just quit the current input level then nesting would
not be a problem. Adding a simple helper to the engine made that possible (of
course figuring this out took a while):

\startbuffer[code]
local stepper = nil
local stack   = { }
local fid     = 3 -- make sure to take a frees slot
local goback  = "\\luafunction" .. fid .. "\\relax"

function tex.resume()
    if coroutine.status(stepper) == "dead" then
        stepper = table.remove(stack)
    end
    if stepper then
        coroutine.resume(stepper)
    end
end

lua.get_functions_table()[fid] = tex.resume

if texio.closeinput then
    function tex.yield()
        tex.sprint(goback)
        coroutine.yield()
        texio.closeinput()
    end
else
    function tex.yield()
        tex.sprint(goback)
        coroutine.yield()
    end
end

function tex.routine(f)
    table.insert(stack,stepper)
    stepper = coroutine.create(f)
    tex.sprint(goback)
end

-- Again we need to do it as follows in ConTeXt:

if context then
    fid     = context.functions.register(tex.resume)
    goback  = "\\luafunction" .. fid .. "\\relax"
end
\stopbuffer

\ctxluabuffer[code]

\typebuffer[code]

The trick is in \type {texio.closeinput}, a recent helper to the engine and one
that should be used with care. We assume that the user knows what she or he is
doing. On an older laptop with a i7-3840 processor running \WINDOWS\ 10 the
following snippet takes less than 0.35 seconds with \LUATEX\ and 0.26 seconds
with \LUAJITTEX.

\startbuffer[code]
tex.routine(function()
    for i=1,10000 do
        tex.sprint("\\setbox0\\hpack{x}")
        tex.yield()
        tex.sprint(tex.box[0].width)
        tex.routine(function()
            for i=1,3 do
                tex.sprint("\\setbox0\\hpack{xx}")
                tex.yield()
                tex.sprint(tex.box[0].width)
            end
        end)
    end
end)
\stopbuffer

\typebuffer[code]

% \testfeatureonce {1} {\setbox0\hpack{\ctxluabuffer[code]}} \elapsedtime

Say that we were to run the bad snippet:

\startbuffer[code]
for i=1,10000 do
    tex.sprint("\\setbox0\\hpack{x}")
    tex.sprint(tex.box[0].width)
    for i=1,3 do
        tex.sprint("\\setbox0\\hpack{xx}")
        tex.sprint(tex.box[0].width)
    end
end
\stopbuffer

\typebuffer[code]

% \testfeatureonce {1} {\setbox0\hpack{\ctxluabuffer[code]}} \elapsedtime

This executes in only 0.12 seconds in both engines. So what if we run this:

\startbuffer[code]
\dorecurse{10000}{%
    \setbox0\hpack{x}
    \number\wd0
    \dorecurse{3}{%
        \setbox0\hpack{xx}
        \number\wd0
    }%
}
\stopbuffer

\typebuffer[code]

% \testfeatureonce {1} {\setbox0\hpack{\getbuffer[code]}} \elapsedtime

Pure \TEX\ needs 0.30 seconds for both engines but there we lose 0.13 seconds on
the loop code. In the \LUA\ example where we yield, the loop code takes hardly
any time. As we need only 0.05 seconds more it demonstrates that when we use the
power of \LUA, the performance hit of the switch is quite small: we yield 40.000
times! In general, such differences are far exceeded by the overhead: the time
needed to typeset the content (which \type {\hpack} doesn't do), breaking
paragraphs into lines, constructing pages and other overhead involved in the run.
In \CONTEXT\ we use a slightly different variant which has 0.30 seconds more
overhead, but that is probably true for all \LUA\ usage in \CONTEXT, but again,
it disappears in other runtime.

Here is another example:

\startbuffer[code]
\def\TestWord#1%
  {\directlua{
     tex.routine(function()
       tex.sprint("\\setbox0\\hbox{\\tttf #1}")
       tex.yield()
       tex.sprint(math.round(100 * tex.box[0].width/tex.hsize))
       tex.sprint(" percent of the hsize: ")
       tex.sprint("\\box0")
     end)
  }}
\stopbuffer

\typebuffer[code] \getbuffer[code]

\startbuffer
The width of next word is \TestWord {inline}!
\stopbuffer

\typebuffer \getbuffer

Now, in order to stay realistic, this macro can also be defined as:

\startbuffer[code]
\def\TestWord#1%
  {\setbox0\hbox{\tttf #1}%
   \directlua{
      tex.sprint(math.round(100 * tex.box[0].width/tex.hsize))
   } %
   percent of the hsize: \box0\relax}
\stopbuffer

\typebuffer[code]

We get the same result: \quotation {\getbuffer}.

We have been using a \LUA|-|\TEX\ mix for over a decade now in \CONTEXT\ and have
never really needed this mixed model. There are a few places where we could
(have) benefited from it and now we might use it in a few places, but so far we
have done fine without it. In fact, in most cases typesetting can be done fine at
the \TEX\ end. It's all a matter of imagination.

\stopchapter

\stopcomponent