% language=uk

\environment luatex-style
\environment luatex-logos

\startcomponent luatex-enhancements

\startchapter[reference=enhancements,title={Basic \TEX\ enhancements}]

\section{Introduction}

From day one, \LUATEX\ has offered extra features compared to the superset of
\PDFTEX\ and \ALEPH. This has not been limited to the possibility to execute
\LUA\ code via \type {\directlua}, but \LUATEX\ also adds functionality via new
\TEX|-|side primitives or extensions to existing ones.

When \LUATEX\ starts up in \quote {iniluatex} mode (\type {luatex -ini}), it
defines only the primitive commands known by \TEX82 and the one extra command
\type {\directlua}. As is fitting, a \LUA\ function has to be called to add the
extra primitives to the user environment. The simplest method to get access to
all of the new primitive commands is by adding this line to the format generation
file:

\starttyping
\directlua { tex.enableprimitives('',tex.extraprimitives()) }
\stoptyping

But be aware that the curly braces may not have the proper \type {\catcode}
assigned to them at this early time (giving a \quote {Missing number} error), so
it may be needed to put these assignments before the above line:

\starttyping
\catcode `\{=1
\catcode `\}=2
\stoptyping

More fine|-|grained primitives control is possible and you can look up the
details in \in {section} [luaprimitives]. For simplicity's sake, this manual
assumes that you have executed the \type {\directlua} command as given above.

The startup behaviour documented above is considered stable in the sense that
there will not be backward|-|incompatible changes any more. We have promoted some
rather generic \PDFTEX\ primitives to core \LUATEX\ ones, and the ones inherited
from \ALEPH\ (\OMEGA) are also promoted. Effectively this means that we now only
have the \type {tex}, \type {etex} and \type {luatex} sets left.

In \in {Chapter} [modifications] we discuss several primitives that are derived
from \PDFTEX\ and \ALEPH\ (\OMEGA). Here we stick to real new ones. In the
chapters on fonts and math we discuss a few more new ones.

\section{Version information}

\subsection {\type {\luatexbanner}, \type {\luatexversion} and \type {\luatexrevision}}

There are three new primitives to test the version of \LUATEX:

\starttabulate[|l|pl|pl|]
\NC \bf primitive           \NC \bf explanation                           \NC \bf value          \NC \NR
\NC \type {\luatexbanner}   \NC the banner reported on the command line   \NC \luatexbanner      \NC \NR
\NC \type {\luatexversion}  \NC a combination of major and minor number   \NC \the\luatexversion \NC \NR
\NC \type {\luatexrevision} \NC the revision number, the current value is \NC \luatexrevision    \NC \NR
\stoptabulate

The official \LUATEX\ version is defined as follows:

\startitemize
\startitem
    The major version is the integer result of \type {\luatexversion} divided by
    100. The primitive is an \quote {internal variable}, so you may need to prefix
    its use with \type {\the} depending on the context.
\stopitem
\startitem
    The minor version is the two-digit result of \type {\luatexversion} modulo 100.
\stopitem
\startitem
    The revision is given by \type {\luatexrevision}. This primitive expands to
    a positive integer.
\stopitem
\startitem
    The full version number consists of the major version, minor version and
    revision, separated by dots.
\stopitem
\stopitemize
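
As a rough illustration of the rules above, the full version string can be
assembled with plain \TEX\ arithmetic. This is only a sketch: the scratch
registers \type {\count100} and \type {\count101} are an arbitrary choice made
for this example and may clash with registers allocated by a macro package.

\starttyping
\count100=\luatexversion
\divide\count100 by 100                 % major: integer part of version/100
\count101=\count100
\multiply\count101 by -100
\advance\count101 by \luatexversion     % minor: version modulo 100
\message{LuaTeX \the\count100.%
  \ifnum\count101<10 0\fi\the\count101.% pad the minor version to two digits
  \luatexrevision}
\stoptyping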

\subsection{\type {\formatname}}

The \type {\formatname} syntax is identical to \type {\jobname}. In \INITEX, the
expansion is empty. Otherwise, the expansion is the value that \type {\jobname} had
during the \INITEX\ run that dumped the currently loaded format. You can use this
token list to provide your own version info.

\section{\UNICODE\ text support}

\subsection {Extended ranges}

Text input and output is now considered to be \UNICODE\ text, so input characters
can use the full range of \UNICODE\ ($2^{20}+2^{16}-1 = \hbox{0x10FFFF}$). Later
chapters will talk of characters and glyphs. Although these are not
interchangeable, they are closely related. During typesetting, a character is
always converted to a suitable graphic representation of that character in a
specific font. However, while processing a list of to|-|be|-|typeset nodes, its
contents may still be seen as a character. Inside \LUATEX\ there is no clear
separation between the two concepts. Because the subtype of a glyph node can be
changed in \LUA\ it is up to the user: subtypes larger than 255 indicate that
font processing has happened.

A few primitives are affected by this, all in a similar fashion: each of them has
to accommodate for a larger range of acceptable numbers. For instance, \type
{\char} now accepts values between~0 and $1{,}114{,}111$. This should not be a
problem for well|-|behaved input files, but it could create incompatibilities for
input that would have generated an error when processed by older \TEX|-|based
engines. The affected commands with an altered initial (left of the equals sign)
or secondary (right of the equals sign) value are: \type {\char}, \type
{\lccode}, \type {\uccode}, \type {\catcode}, \type {\sfcode}, \type {\efcode},
\type {\lpcode}, \type {\rpcode}, \type {\chardef}.

As far as the core engine is concerned, all input and output to text files is
\UTF-8 encoded. Input files can be pre|-|processed using the \type {reader}
callback. This will be explained in a later chapter.

Output in byte|-|sized chunks can be achieved by using characters just outside of
the valid \UNICODE\ range, starting at the value $1{,}114{,}112$ (0x110000). When
the time comes to print a character $c \geq 1{,}114{,}112$, \LUATEX\ will actually
print the single byte corresponding to $c - 1{,}114{,}112$.

Output to the terminal uses \type {^^} notation for the lower control range
($c<32$), with the exception of \type {^^I}, \type {^^J} and \type {^^M}. These
are considered \quote {safe} and therefore printed as|-|is. You can disable
escaping with \type {texio.setescape(false)} in which case you get the normal
characters on the console.

Normalization of the \UNICODE\ input can be handled by a macro package during
callback processing (this will be explained in \in {section} [iocallback]).

\subsection{\type {\Uchar}}

The expandable command \type {\Uchar} reads a number between~0 and $1{,}114{,}111$
and expands to the associated \UNICODE\ character.
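
Because \type {\Uchar} is expandable, it can be used inside an \type {\edef}.
The following sketch stores the euro sign (U+20AC) in a macro; the macro name
\type {\eurosign} is hypothetical and only serves this example.

\starttyping
\edef\eurosign{\Uchar"20AC}   % \eurosign now holds the single character U+20AC
\message{\meaning\eurosign}   % inspect the result in the log
\stoptyping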

\section{Extended tables}

All traditional \TEX\ and \ETEX\ registers can now be addressed with 16-bit numbers. The affected
commands are:

\startfourcolumns
\starttyping
\count
\dimen
\skip
\muskip
\marks
\toks
\countdef
\dimendef
\skipdef
\muskipdef
\toksdef
\insert
\box
\unhbox
\unvbox
\copy
\unhcopy
\unvcopy
\wd
\ht
\dp
\setbox
\vsplit
\stoptyping
\stopfourcolumns

Because font memory management has been rewritten, character properties in fonts
are no longer shared among font instances that originate from the same metric
file.

\section{Attributes}

\subsection{Attribute registers}

Attributes are a completely new concept in \LUATEX. Syntactically, they behave a
lot like counters: attributes obey \TEX's nesting stack and can be used after
\type {\the} etc.\ just like the normal \type {\count} registers.

\startsyntax
\attribute <16-bit number> <optional equals> <32-bit number>!crlf
\attributedef <csname> <optional equals> <16-bit number>
\stopsyntax

Conceptually, an attribute is either \quote {set} or \quote {unset}. Unset
attributes have a special negative value to indicate that they are unset; that
value is the lowest legal value: \type {-"7FFFFFFF} in hexadecimal, a.k.a.
$-2147483647$ in decimal. It follows that the value \type {-"7FFFFFFF} cannot be
used as a legal attribute value, but you {\it can\/} assign \type {-"7FFFFFFF} to
\quote {unset} an attribute. All attributes start out in this \quote {unset}
state in \INITEX.
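
A minimal sketch of setting and unsetting an attribute could look as follows.
The name \type {\myattribute} and the attribute number~10 are assumptions made
for this example only.

\starttyping
\attributedef\myattribute=10   % \myattribute now refers to attribute 10
\myattribute=5                 % set: the attribute has the value 5 from here on
\message{\the\myattribute}     % reports 5
\myattribute=-"7FFFFFFF        % back to the unset state
\stoptyping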

Attributes can be used as extra counter values, but their usefulness comes mostly
from the fact that the numbers and values of all \quote {set} attributes are
attached to all nodes created in their scope. These can then be queried from any
\LUA\ code that deals with node processing. Further information about how to use
attributes for node list processing from \LUA\ is given in~\in {chapter}[nodes].

Attributes are stored in sorted (sparse) linked lists that are shared when
possible. This permits efficient testing and updating.

\subsection{Box attributes}

Nodes typically receive the list of attributes that is in effect when they are
created. This moment can be quite asynchronous. For example: in paragraph
building, the individual line boxes are created after the \type {\par} command has
been processed, so they will receive the list of attributes that is in effect
then, not the attributes that were in effect in, say, the first or third line of
the paragraph.

Similar situations happen in \LUATEX\ regularly. A few of the more obvious
problematic cases are dealt with: the attributes for nodes that are created
during hyphenation, kerning and ligaturing borrow their attributes from their
surrounding glyphs, and it is possible to influence box attributes directly.

When you assemble a box in a register, the attributes of the nodes contained in
the box are unchanged when such a box is placed, unboxed, or copied. In this
respect attributes act the same as characters that have been converted to
references to glyphs in fonts. For instance, when you use attributes to implement
color support, each node carries information about its eventual color. In that
case, unless you implement mechanisms that deal with it, applying a color to
already boxed material will have no effect. Keep in mind that this
incompatibility is mostly due to the fact that separate specials and literals are
a more unnatural approach to colors than attributes.

It is possible to fine-tune the list of attributes that are applied to a \type
{\hbox}, \type {\vbox} or \type {\vtop} by the use of the keyword \type {attr}. An
example:

\starttyping
\attribute2=5
\setbox0=\hbox {Hello}
\setbox2=\hbox attr1=12 attr2=-"7FFFFFFF{Hello}
\stoptyping

This will set the attribute list of box~2 to $1=12$, and the attributes of box~0
will be $2=5$. As you can see, assigning the maximum negative value causes an
attribute to be ignored.

The \type {attr} keyword(s) should come before a \type {to} or \type {spread}, if
that is also specified.
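
For instance, combining the attribute specification from the earlier example
with a fixed width might be written like this (the box number and the width are
arbitrary choices for the sketch):

\starttyping
\setbox4=\hbox attr1=12 to 5cm{Hello\hss}
\stoptyping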

\section{\LUA\ related primitives}

\subsection{\type {\directlua}}

In order to merge \LUA\ code with \TEX\ input, a few new primitives are needed.
The primitive \type {\directlua} is used to execute \LUA\ code immediately. The
syntax is

\startsyntax
\directlua <general text>!crlf
\directlua <16-bit number> <general text>
\stopsyntax

The \syntax {<general text>} is expanded fully, and then fed into the \LUA\
interpreter. After reading and expansion has been applied to the \syntax
{<general text>}, the resulting token list is converted to a string as if it was
displayed using \type {\the\toks}. On the \LUA\ side, each \type {\directlua}
block is treated as a separate chunk. In such a chunk you can use the \type
{local} directive to keep your variables from interfering with those used by the
macro package.

The conversion to and from a token list means that you normally cannot use \LUA\
line comments (starting with \type {--}) within the argument. As there typically
will be only one \quote {line} the first line comment will run on until the end
of the input. You will either need to use \TEX|-|style line comments (starting
with \%), or change the \TEX\ category codes locally. Another possibility is to
say:

\starttyping
\begingroup
\endlinechar=10
\directlua ...
\endgroup
\stoptyping

Then \LUA\ line comments can be used, since \TEX\ does not replace line endings
with spaces.
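
The first alternative mentioned above, \TEX|-|style comments, can be sketched as
follows. It assumes that the percent sign still has its usual comment category
code at the point where the code is read.

\starttyping
\directlua {
    local n = 2 + 3   % dropped by TeX, so Lua never sees this remark
    tex.print(n)
}
\stoptyping

Here \LUA\ receives both statements on what is effectively a single line, which
is why a \LUA\ line comment would have swallowed the call to \type {tex.print}.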

Likewise, the \syntax {<16-bit number>} designates a name of a \LUA\ chunk and is
taken from the \type {lua.name} array (see the documentation of the \type {lua}
table further in this manual). When a chunk name starts with a \type {@} it will
be displayed as a file name. This is a side effect of the way \LUA\ implements
error handling.
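
As a hedged sketch, a chunk name could be registered and used like this. The
slot number~1 and the name \type {@mysetups} are made up for the example; the
name mainly shows up in error messages.

\starttyping
\directlua { lua.name[1] = "@mysetups" }   % register a name for slot 1
\directlua 1 { local answer = 6 * 7 }      % this chunk is reported under that name
\stoptyping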

The \type {\directlua} command is expandable. Since it passes \LUA\ code to the
\LUA\ interpreter its expansion from the \TEX\ viewpoint is usually empty.
However, there are some \LUA\ functions that produce material to be read by \TEX,
the so|-|called print functions. The simplest use of these is \type
{tex.print(<string> s)}. The characters of the string \type {s} will be placed on
the \TEX\ input buffer, that is, \quote {before \TEX's eyes} to be read by \TEX\
immediately. For example:

\startbuffer
\count10=20
a\directlua{tex.print(tex.count[10]+5)}b
\stopbuffer

\typebuffer

expands to

\getbuffer

Here is another example:

\startbuffer
$\pi = \directlua{tex.print(math.pi)}$
\stopbuffer

\typebuffer

will result in

\getbuffer

Note that the expansion of \type {\directlua} is a sequence of characters, not of
tokens, contrary to all \TEX\ commands. So formally speaking its expansion is
null, but it places material on a pseudo-file to be immediately read by \TEX, like
\ETEX's \type {\scantokens}. For a description of print functions look at \in
{section} [sec:luaprint].

Because the \syntax {<general text>} is a chunk, the normal \LUA\ error handling
is triggered if there is a problem in the included code. The \LUA\ error messages
should be clear enough, but the contextual information is still pretty bad.
Often, you will only see the line number of the right brace at the end of the
code.

While on the subject of errors: some of the things you can do inside \LUA\ code
can break \LUATEX\ pretty badly. If you are not careful while working with the
node list interface, you may even end up with assertion errors from within the
\TEX\ portion of the executable.

The behaviour documented in the above subsection is considered stable in the sense
that there will not be backward-incompatible changes any more.

\subsection{\type {\latelua}}

Contrary to \type {\directlua}, \type {\latelua} stores \LUA\ code in a whatsit
that will be processed at the time of shipping out. Its intended use is a cross
between \PDF\ literals (often available as \type {\pdfliteral}) and the
traditional \TEX\ extension \type {\write}. Within the \LUA\ code you can print
\PDF\ statements directly to the \PDF\ file via \type {pdf.print}, or you can
write to other output streams via \type {texio.write} or simply using \LUA\ \IO\
routines.

\startsyntax
\latelua <general text>!crlf
\latelua <16-bit number> <general text>
\stopsyntax

Expansion of macros in the final \type {<general text>} is delayed until just
before the whatsit is executed (like in \type {\write}). With regard to the \PDF\
output stream, \type {\latelua} behaves like \PDF\ page literals. The \syntax
{name <general text>} and \syntax {<16-bit number>} behave in the same way as
they do for \type {\directlua}.
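
A small sketch of the delayed execution: the following line writes a message to
the log at the moment the page that contains the whatsit is shipped out. It
assumes that \type {\count0} holds the page number, as it does in plain \TEX.

\starttyping
\latelua { texio.write_nl("log", "shipping out page " .. tex.count[0]) }
\stoptyping

Because the code only runs at shipout time, the value of \type {tex.count[0]}
is the one in effect then, not the one at the time \type {\latelua} was read.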

\subsection{\type {\luaescapestring}}

This primitive converts a \TEX\ token sequence so that it can be safely used as
the contents of a \LUA\ string: embedded backslashes, double and single quotes,
and newlines and carriage returns are escaped. This is done by prepending an
extra token consisting of a backslash with category code~12, and for the line
endings, converting them to \type {n} and \type {r} respectively. The token
sequence is fully expanded.

\startsyntax
\luaescapestring <general text>
\stopsyntax
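
As an illustration, the sketch below passes a macro that contains double quotes
to \LUA\ as a string. The macro name \type {\quoted} is hypothetical.

\starttyping
\def\quoted{She said "hi" and left}
\directlua {
    local s = "\luaescapestring{\quoted}"
    texio.write_nl(s)
}
\stoptyping

Without the escaping, the embedded quotes would end the \LUA\ string too early
and the chunk would fail to compile.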

Most often, this command is not actually the best way to deal with the
differences between \TEX\ and \LUA. In very short bits of \LUA\
code it is often not needed, and for longer stretches of \LUA\ code it
is easier to keep the code in a separate file and load it using \LUA's
\type {dofile}:

\starttyping
\directlua { dofile('mysetups.lua') }
\stoptyping

\subsection{\type {\luafunction}}

The \type {\directlua} command involves tokenization of its argument (after
picking up an optional name or number specification). The token list is then
converted into a string and given to \LUA\ to turn into a function that is
called. The overhead is rather small but when you use this primitive hundreds of
thousands of times, it can become noticeable. For this reason there is a variant
call available: \type {\luafunction}. This command is used as follows:

\starttyping
\directlua {
    local t = lua.get_functions_table()
    t[1] = function() tex.print("!") end
    t[2] = function() tex.print("?") end
}

\luafunction1
\luafunction2
\stoptyping

Of course the functions can also be defined in a separate file. There is no limit
on the number of functions apart from normal \LUA\ limitations. The functions
cannot take arguments from the \TEX\ side, as that would involve parsing and
thereby give no gain. When called, a function does in fact get one argument,
its index, so in the following example the number \type {8} gets typeset.

\starttyping
\directlua {
    local t = lua.get_functions_table()
    t[8] = function(slot) tex.print(slot) end
}

\luafunction8
\stoptyping

\section {Alignments}

\subsection{\tex {alignmark}}

This primitive duplicates the functionality of \type {#} inside alignment
preambles.

\subsection{\tex {aligntab}}

This primitive duplicates the functionality of \type {&} inside alignments and
preambles.
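
A short sketch of an alignment that uses both primitives instead of the usual
\type {#} and \type {&} characters:

\starttyping
\halign{\hfil\alignmark\hfil \aligntab \alignmark\hfil \cr
  one \aligntab two   \cr
  1   \aligntab 2     \cr}
\stoptyping

This can be convenient when the two characters do not have their usual category
codes at the point where the alignment is defined.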

\section{Catcode tables}

Catcode tables are a new feature that allows you to switch to a predefined
catcode regime in a single statement. You can have a practically unlimited number
of different tables. This subsystem is backward compatible: if you never use the
following commands, your document will not notice any difference in behaviour
compared to traditional \TEX. The contents of each catcode table are independent
of those of any other catcode table, and they are stored in and retrieved from
the format file.

\subsection{\type {\catcodetable}}

\startsyntax
\catcodetable <15-bit number>
\stopsyntax

The primitive \type {\catcodetable} switches to a different catcode table. Such a
table has to be previously created using one of the two primitives below, or it
has to be zero. Table zero is initialized by \INITEX.

\subsection{\type {\initcatcodetable}}

\startsyntax
\initcatcodetable <15-bit number>
\stopsyntax

The primitive \type {\initcatcodetable} creates a new table with catcodes identical
to those defined by \INITEX:

\starttabulate[|r|l|l|l|]
\NC  0 \NC \tttf \letterbackslash       \NC         \NC \type {escape}       \NC\NR
\NC  5 \NC \tttf \letterhat\letterhat M \NC return  \NC \type {car_ret}      \NC\NR
\NC  9 \NC \tttf \letterhat\letterhat @ \NC null    \NC \type {ignore}       \NC\NR
\NC 10 \NC \tttf <space>                \NC space   \NC \type {spacer}       \NC\NR
\NC 11 \NC {\tttf a} \endash\ {\tttf z} \NC         \NC \type {letter}       \NC\NR
\NC 11 \NC {\tttf A} \endash\ {\tttf Z} \NC         \NC \type {letter}       \NC\NR
\NC 12 \NC everything else              \NC         \NC \type {other}        \NC\NR
\NC 14 \NC \tttf \letterpercent         \NC         \NC \type {comment}      \NC\NR
\NC 15 \NC \tttf \letterhat\letterhat ? \NC delete  \NC \type {invalid_char} \NC\NR
\stoptabulate

The new catcode table is allocated globally: it will not go away after the
current group has ended. If the supplied number is identical to the currently
active table, an error is raised.

\subsection{\type {\savecatcodetable}}

\startsyntax
\savecatcodetable <15-bit number>
\stopsyntax

\type {\savecatcodetable} copies the current set of catcodes to a new table with
the requested number. The definitions in this new table are all treated as if
they were made in the outermost level.

The new table is allocated globally: it will not go away after the current group
has ended. If the supplied number is the currently active table, an error is
raised.
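
Taken together, the primitives could be used as in the following sketch. The
table number~5 and the macro name are arbitrary choices for this example, and it
assumes that table~5 is neither in use nor currently active. The switch is made
inside a group here, so that it should be undone at the \type {\endgroup}.

\starttyping
\begingroup
  \catcode`\@=11            % make @ a letter, locally
  \savecatcodetable 5       % snapshot of the current regime; table 5 is global
\endgroup

\begingroup
  \catcodetable 5           % @ is a letter again, in one statement
  \gdef\my@helper{works}
\endgroup
\stoptyping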

\section{Suppressing errors}

\subsection{\type {\suppressfontnotfounderror}}

\startsyntax
\suppressfontnotfounderror = 1
\stopsyntax

If this integer parameter is non|-|zero, then \LUATEX\ will not complain about
font metrics that are not found. Instead it will silently skip the font
assignment, making the requested csname for the font \type {\ifx} equal to \type
{\nullfont}, so that it can be tested against that without bothering the user.
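
A sketch of such a test is given below. The font name is deliberately one that
is assumed not to exist, and the behaviour may differ when a macro package
installs its own font loader.

\starttyping
\suppressfontnotfounderror=1
\font\testfont=no-such-font-metrics
\ifx\testfont\nullfont
  \message{Font not found, falling back.}
\fi
\stoptyping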

\subsection{\type {\suppresslongerror}}

\startsyntax
\suppresslongerror = 1
\stopsyntax

If this integer parameter is non|-|zero, then \LUATEX\ will not complain about
\type {\par} commands encountered in contexts where that is normally prohibited
(most prominently in the arguments of non-long macros).

\subsection{\type {\suppressifcsnameerror}}

\startsyntax
\suppressifcsnameerror = 1
\stopsyntax

If this integer parameter is non|-|zero, then \LUATEX\ will not complain about
non-expandable commands appearing in the middle of a \type {\ifcsname} expansion.
Instead, it will keep getting expanded tokens from the input until it encounters
an \type {\endcsname} command. If the input expansion is unbalanced with respect
to \type {\csname} \ldots \type {\endcsname} pairs, the \LUATEX\ process may hang
indefinitely.

\subsection{\type {\suppressoutererror}}

\startsyntax
\suppressoutererror = 1
\stopsyntax

If this new integer parameter is non|-|zero, then \LUATEX\ will not complain
about \type {\outer} commands encountered in contexts where that is normally
prohibited.

\subsection{\type {\suppressmathparerror}}

The following setting will permit \type {\par} tokens in a math formula:

\startsyntax
\suppressmathparerror = 1
\stopsyntax

So, the next code is valid then:

\starttyping
$ x + 1 =

a $
\stoptyping

\section {Math}

\subsection{Extensions}

We will cover math in its own chapter because not only the font subsystem and
spacing model have been enhanced (thereby introducing many new primitives) but
also because some more control has been added to existing functionality.

\subsection{\type {\matheqnogapstep}}

By default \TEX\ will add one quad between the equation and the number. This is
hard coded. A new primitive can control this:

\startsyntax
\matheqnogapstep = 1000
\stopsyntax

Because a math quad from the math text font is used instead of a dimension, we
use a step to control the size. A value of zero will suppress the gap. The step
is divided by 1000 which is the usual way to mimic floating point factors in
\TEX.
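
For example, a step of 500 corresponds to a factor of $500/1000 = 0.5$, so the
gap becomes half a math quad. The display below is only a sketch that uses the
plain \TEX\ \type {\eqno} primitive:

\starttyping
\matheqnogapstep=500   % 500/1000: half a math quad before the number
$$ a^2 + b^2 = c^2 \eqno (1) $$
\stoptyping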

\section{Fonts}

\subsection{Font syntax}

\LUATEX\ will accept a braced argument as a font name:

\starttyping
\font\myfont = {cmr10}
\stoptyping

This allows for embedded spaces, without the need for double quotes. Macro
expansion takes place inside the argument.
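
The expansion can be illustrated with a small sketch, in which the macro name
\type {\myfamily} is hypothetical:

\starttyping
\def\myfamily{cmr}
\font\myfont = {\myfamily 10}   % same as \font\myfont = {cmr10}
\stoptyping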

\subsection{\type {\fontid}}

\startsyntax
\fontid\font
\stopsyntax

This primitive expands into a number. It is not a register so there is no need to
prefix with \type {\number} (and using \type {\the} gives an error). The currently
used font id is \fontid\font. Here are some more:

\starttabulate[|l|c|]
\NC \type {\bf} \NC \bf \fontid\font \NC \NR
\NC \type {\it} \NC \it \fontid\font \NC \NR
\NC \type {\bi} \NC \bi \fontid\font \NC \NR
\stoptabulate

These numbers depend on the macro package used because each one has its own way
of dealing with fonts. They can also differ per run, as they can depend on the
order of loading fonts. For instance, when in \CONTEXT\ virtual math \UNICODE\
fonts are used, we can easily get over a hundred ids in use. Not all ids have to
be bound to a real font; after all, an id is just a number.

\subsection{\type {\setfontid}}

The primitive \type {\setfontid} can be used to enable a font with the given id
(which of course needs to be a valid one).
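
Combined with \type {\fontid}, this makes it possible to return to a font
without knowing its name. In the sketch below, \type {\bodyfontid} is a
hypothetical macro name and \type {\bf} stands for whatever font switch the
macro package provides.

\starttyping
\edef\bodyfontid{\fontid\font}   % remember the id of the current font
\bf Bold. \setfontid\bodyfontid\relax Back to the first font.
\stoptyping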

\subsection{\type {\noligs} and \type {\nokerns}}

These primitives prohibit ligature and kerning insertion at the time when the
initial node list is built by \LUATEX's main control loop. You can enable these
primitives when you want to do node list processing of \quote {characters}, where
\TEX's normal processing would get in the way.

\startsyntax
\noligs <integer>!crlf
\nokerns <integer>
\stopsyntax

These primitives can also be implemented by overloading the ligature building and
kerning functions, i.e.\ by assigning dummy functions to their associated
callbacks. Keep in mind that when you define a font (using \LUA) you can also
omit the kern and ligature tables, which has the same effect as the above.

\subsection{\type{\nospaces}}

This new primitive can be used to overrule the usual \type {\spaceskip}
related heuristics when a space character is seen in a text flow. The
value~\type{1} triggers no injection while \type{2} results in injection of
a zero skip. Below we see the results for four characters separated by a
space.

\startlinecorrection
\startcombination[3*2]
    {\ruledhbox to 5cm{\vtop{\hsize 10mm\nospaces=0\relax x x x x \par}\hss}} {\type {0 / hsize 10mm}}
    {\ruledhbox to 5cm{\vtop{\hsize 10mm\nospaces=1\relax x x x x \par}\hss}} {\type {1 / hsize 10mm}}
    {\ruledhbox to 5cm{\vtop{\hsize 10mm\nospaces=2\relax x x x x \par}\hss}} {\type {2 / hsize 10mm}}
    {\ruledhbox to 5cm{\vtop{\hsize  1mm\nospaces=0\relax x x x x \par}\hss}} {\type {0 / hsize 1mm}}
    {\ruledhbox to 5cm{\vtop{\hsize  1mm\nospaces=1\relax x x x x \par}\hss}} {\type {1 / hsize 1mm}}
    {\ruledhbox to 5cm{\vtop{\hsize  1mm\nospaces=2\relax x x x x \par}\hss}} {\type {2 / hsize 1mm}}
\stopcombination
\stoplinecorrection

\section{Tokens, commands and strings}

\subsection{\type {\scantextokens}}

The syntax of \type {\scantextokens} is identical to \type {\scantokens}. This
primitive is a slightly adapted version of \ETEX's \type {\scantokens}. The
differences are:

\startitemize
\startitem
    The last (and usually only) line does not have a \type {\endlinechar}
    appended.
\stopitem
\startitem
    \type {\scantextokens} never raises an EOF error, and it does not execute
    \type {\everyeof} tokens.
\stopitem
\startitem
    There are no \quote {\unknown\ while end of file \unknown} error tests
    executed. This allows the expansion to end on a different grouping level or
    while a conditional is still incomplete.
\stopitem
\stopitemize

\subsection{\type {\toksapp}, \type {\tokspre}, \type {\etoksapp} and \type {\etokspre}}

Instead of:

\starttyping
\toks0\expandafter{\the\toks0 foo}
\stoptyping

you can use:

\starttyping
\etoksapp0{foo}
\stoptyping

The \type {pre} variants prepend instead of append, and the \type {e} variants
expand the passed general text.
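
The four variants can be compared in a small sketch; the comments show the
contents of the register after each step, and \type {\tail} is a hypothetical
macro.

\starttyping
\toks0{b}
\toksapp0{c}        % \toks0 now contains: bc
\tokspre0{a}        % \toks0 now contains: abc
\def\tail{d}
\etoksapp0{\tail}   % \tail is expanded first, so \toks0 contains: abcd
\stoptyping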

\subsection{\type {\csstring}, \type {\begincsname} and \type {\lastnamedcs}}

These are somewhat special. The \type {\csstring} primitive is like
\type {\string} but it omits the leading escape character. This can be
somewhat more efficient than stripping it off afterwards.
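
The difference can be seen in this sketch, where both macro names are made up
for the example:

\starttyping
\edef\withescape{\string\foo}        % expands to: \foo
\edef\withoutescape{\csstring\foo}   % expands to: foo
\stoptyping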

The \type {\begincsname} primitive is like \type {\csname} but doesn't create
a relaxed equivalent when there is no such name. It is equivalent to

\starttyping
\ifcsname foo\endcsname
  \csname foo\endcsname
\fi
\stoptyping

The advantage is that it saves a lookup (don't expect much speedup) but more
important is that it avoids using the \type {\if}.

The \type {\lastnamedcs} is one that should be used with care. The above
example could be written as:

\starttyping
\ifcsname foo\endcsname
  \lastnamedcs
\fi
\stoptyping

This is slightly more efficient than constructing the string twice (deep down in
\LUATEX\ this also involves some \UTF8 juggling), but probably more relevant is
that it saves a few tokens and can make code a bit more readable.

\subsection{\type {\clearmarks}}

This primitive complements the \ETEX\ mark primitives and clears a mark class
completely, resetting all three connected mark texts to empty. It is an
immediate command.

\startsyntax
\clearmarks <16-bit number>
\stopsyntax

\subsection{\type{\letcharcode}}

This primitive is still experimental but can be used to assign a meaning to an active
character, as in:

\starttyping
\def\foo{bar} \letcharcode123=\foo
\stoptyping

This can be a bit nicer than using the uppercase tricks (using the property of
\type {\uppercase} that it treats active characters specially).

\section{Boxes, rules and leaders}

\subsection{\type {\outputbox}}

\startsyntax
\outputbox = 65535
\stopsyntax

This new integer parameter allows you to alter the number of the box that will be
used to store the page sent to the output routine. Its default value is 255, and
the acceptable range is from 0 to 65535.
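
A minimal sketch, assuming a custom output routine: any routine that refers to
box~255 explicitly (as the one in plain \TEX\ does) has to be adapted when the
parameter is changed.

\starttyping
\outputbox=200
\output={\shipout\box\outputbox}
\stoptyping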

\subsection{\type {\vpack}, \type {\hpack} and \type {\tpack}}

These three primitives are like \type {\vbox}, \type {\hbox} and \type {\vtop}
but don't apply the related callbacks.
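
They accept the same specification as their traditional counterparts, so a
sketch of their usage (the box numbers and dimensions are arbitrary) could be:

\starttyping
\setbox0 \hpack to 20mm {\hss text\hss} % like \hbox, but no callbacks are applied
\setbox2 \vpack {\box0}                 % like \vbox, but no callbacks are applied
\stoptyping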

\subsection{\type {\vsplit}}

The \type {\vsplit} primitive has to be followed by a specification of the
required height. As an alternative to the \type {to} keyword you can use \type
{upto} to get a split of the given size, but the result then has its natural
dimensions.
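
The following sketch shows both keywords. It assumes that box~0 is a vertical
box with enough content. The box numbers and the size are arbitrary.

\starttyping
\setbox2 \vsplit 0 to   50mm % box 2 gets a height of 50mm
\setbox4 \vsplit 0 upto 50mm % split at 50mm, but box 4 keeps its natural height
\stoptyping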

\subsection{Images and Forms}

These two concepts are now core concepts and no longer whatsits. They are in fact
now implemented as rules with special properties. Normal rules have subtype~0,
saved boxes have subtype~1 and images have subtype~2. This has the positive side
effect that whenever we need to take content with dimensions into account, when we
look at rule nodes, we automatically also deal with these two types.

The syntax of the \type {\save...resource} is the same as in \PDFTEX\ but you
should consider them to be backend specific. This means that a macro package
should treat them as such and check for the current output mode if applicable.
Here are the equivalents:

\starttabulate[|l|l|]
\NC \type {\saveboxresource}             \EQ \type {\pdfxform}           \NC \NR
\NC \type {\saveimageresource}           \EQ \type {\pdfximage}          \NC \NR
\NC \type {\useboxresource}              \EQ \type {\pdfrefxform}        \NC \NR
\NC \type {\useimageresource}            \EQ \type {\pdfrefximage}       \NC \NR
\NC \type {\lastsavedboxresourceindex}   \EQ \type {\pdflastxform}       \NC \NR
\NC \type {\lastsavedimageresourceindex} \EQ \type {\pdflastximage}      \NC \NR
\NC \type {\lastsavedimageresourcepages} \EQ \type {\pdflastximagepages} \NC \NR
\stoptabulate
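
Because the syntax follows \PDFTEX, a typical sequence could look like the
sketch below. The box number and the file name \type {cow.pdf} are placeholders
for this example.

\starttyping
\setbox0 \hbox {Some content}
\saveboxresource 0
\useboxresource \lastsavedboxresourceindex

\saveimageresource {cow.pdf}
\useimageresource \lastsavedimageresourceindex
\stoptyping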

\LUATEX\ accepts optional dimension parameters for \type {\use...resource} in the
same format as for rules. With images, these dimensions are then used instead of
the ones given to \type {\saveimageresource} but the original dimensions are not
overwritten, so that a \type {\useimageresource} without dimensions still
provides the image with dimensions defined by \type {\saveimageresource}. These
optional parameters are not implemented for \type {\saveboxresource}.

\starttyping
\useimageresource width 20mm height 10mm depth 5mm \lastsavedimageresourceindex
\useboxresource   width 20mm height 10mm depth 5mm \lastsavedboxresourceindex
\stoptyping

The box resources are of course implemented in the backend and therefore we do
support the \type {attr} and \type {resources} keys that accept a token list. New
is the \type {type} key. When set to non|-|zero the \type {/Type} entry is
omitted. A value of 1 or 3 still writes a \type {/BBox}, while 2 or 3 will write
a \type {/Matrix}.

\subsection{\type {\nohrule} and \type {\novrule}}

Because introducing a new keyword can cause incompatibilities, two new primitives
were introduced: \type {\nohrule} and \type {\novrule}. These can be used to
reserve space. This is often more efficient than creating an empty box with fake
dimensions.
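
Assuming that they accept the same dimension specification as regular rules, a
sketch of their usage (with arbitrary dimensions) is:

\starttyping
\novrule width 20mm height 10mm depth 5mm % in horizontal mode, nothing is drawn
\nohrule width 20mm height 10mm depth 5mm % in vertical mode, nothing is drawn
\stoptyping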

\subsection{\type {\gleaders}}

This type of leaders is anchored to the origin of the box to be shipped out. So
they are like normal \type {\leaders} in that they align nicely, except that the
alignment is based on the {\it largest\/} enclosing box instead of the {\it
smallest\/}. The \type {g} stresses this global nature.
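
The syntax is the same as for the other leaders, so a sketch of a dotted fill
(the 1em wide box is just an example) could be:

\starttyping
\hbox to \hsize {A \gleaders \hbox to 1em {\hss.\hss} \hfil B}
\stoptyping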

\section {Languages}

\subsection{\type {\hyphenationmin}}

This primitive can be used to set the minimal word length, so setting it to a value
of~$5$ means that only words of 6 characters and more will be hyphenated, of course
within the constraints of the \type {\lefthyphenmin} and \type {\righthyphenmin}
values (as stored in the glyph node). This primitive accepts a number and stores
the value with the language.
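
A sketch of the setting mentioned above. The value is stored with the language
that is current at that moment.

\starttyping
\hyphenationmin = 5 % only words of 6 characters and more are hyphenated
\stoptyping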

\subsection{\type {\boundary}, \type {\noboundary}, \type {\protrusionboundary} and \type
{\wordboundary}}

The \type {\noboundary} command used to inject a whatsit node but now injects a normal
node with type \type {boundary} and subtype~0. In addition you can say:

\starttyping
x\boundary 123\relax y
\stoptyping

This has the same effect but the subtype is now~1 and the value~123 is stored.
The traditional ligature builder still sees this as a cancel boundary directive
but at the \LUA\ end you can implement different behaviour. The added benefit of
passing this value is a side effect of the generalization. The subtypes~2 and~3
are used to control protrusion and word boundaries in hyphenation.

\section{Control and debugging}

\subsection {Tracing}

If \type {\tracingonline} is larger than~2, the node list display will also print
the node number of the nodes.
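
A sketch of how one could look at these numbers. The box content and the values
of the two \type {\showbox...} parameters are arbitrary choices for this
example.

\starttyping
\setbox0 \hbox {test}
\tracingonline 3
\showboxbreadth 100
\showboxdepth 100
\showbox 0
\stoptyping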

\subsection{\type {\outputmode} and \type {\draftmode}}

The \type {\outputmode} variable tells \LUATEX\ what it has to produce:

\starttabulate[|l|l|]
\NC \type {0} \NC \DVI\ code \NC \NR
\NC \type {1} \NC \PDF\ code \NC \NR
\stoptabulate

The value of the \type {\draftmode} counter signals the backend whether it should
output less. The \PDF\ backend accepts a value of~$1$, while the \DVI\ backend
ignores the value.
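
A macro package can set and test the mode as in the sketch below. It is probably
safest to set the mode before anything is shipped out. The branch is only a
placeholder.

\starttyping
\outputmode 1 % ask for PDF output

\ifnum \outputmode > 0
  % PDF specific setup goes here
\fi
\stoptyping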

\section {Files}

\subsection{File syntax}

\LUATEX\ will accept a braced argument as a file name:

\starttyping
\input {plain}
\openin 0 {plain}
\stoptyping

This allows for embedded spaces, without the need for double quotes. Macro
expansion takes place inside the argument.
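
The following sketch shows both aspects. The file name \type {my file} and the
macro \type {\name} are made up for this example.

\starttyping
\input {my file} % a name with an embedded space

\def\name{plain}
\input {\name}   % the macro is expanded, so this loads plain
\stoptyping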

\subsection{Writing to file}

You can now open up to 127 files with \type {\openout}. When no file is open
writes will go to the console and log. As a consequence a system command is
no longer possible, but one can use \type {os.execute} to do the same.
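
A minimal sketch of such a call from the \TEX\ end. The command \type {ls} is
only an example, and whether it actually runs depends on the shell escape
settings of the installation.

\starttyping
\directlua { os.execute("ls") }
\stoptyping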

\stopchapter

\stopcomponent