\environment luatex-style
\environment luatex-logos

\startcomponent luatex-modifications

\startchapter[reference=modifications,title={Modifications}]

\startsection[title=The merged engines]

\startsubsection[title=The need for change]

The first version of \LUATEX\ only had a few extra primitives and it was largely
the same as \PDFTEX. Then we merged substantial parts of \ALEPH\ into the code
and got more primitives. Once things became more stable, the decision was made to
clean up the rather hybrid nature of the program. This means that some primitives
have been promoted to core primitives, often with a different name, and that
others were removed. This made it possible to start cleaning up the code base. We
will describe most of these changes in the following paragraphs.

Besides the expected changes caused by new functionality, there are a number of
not|-|so|-|expected changes. These are sometimes a side|-|effect of a new
(conflicting) feature, or, more often than not, a change necessary to clean up
the internal interfaces. These will also be mentioned.

\stopsubsection

\startsubsection[title=Changes from \TEX\ 3.1415926]

Of course it all starts with traditional \TEX. Even if we started with \PDFTEX,
most still comes from the original. But we deviate a bit.

\startitemize

\startitem
    The current code base is written in \CCODE, not \PASCAL. We use \CWEB\ when
    possible.
\stopitem

\startitem
    See \in {chapter} [languages] for many small changes related to paragraph
    building, language handling and hyphenation. The most important change is
    that adding a brace group in the middle of a word (like in \type {of{}fice})
    does not prevent ligature creation.
\stopitem

\startitem
    There is no pool file, all strings are embedded during compilation.
\stopitem

\startitem
    The specifier \type {plus 1 fillll} does not generate an error. The extra
    \quote{l} is simply typeset.
\stopitem

\startitem
    The upper limit to \type {\endlinechar} and \type {\newlinechar} is 127.
\stopitem

\startitem
    The hz optimization code has been partially redone so that we no longer need
    to create extra font instances. The front- and backend have been decoupled
    and more efficient (\PDF) code is generated.
\stopitem

\stopitemize

\stopsubsection

\startsubsection[title=Changes from \ETEX\ 2.2]

Because it is the de facto standard extension, we of course provide the \ETEX\
functionality, but with a few small adaptations.

\startitemize

\startitem
    The \ETEX\ functionality is always present and enabled so the prepended
    asterisk or \type {-etex} switch for \INITEX\ is not needed.
\stopitem

\startitem
    The \TEXXET\ extension is not present, so the primitives \type
    {\TeXXeTstate}, \type {\beginR}, \type {\beginL}, \type {\endR} and \type
    {\endL} are missing.
\stopitem

\startitem
    Some of the tracing information that is output by \ETEX's \type
    {\tracingassigns} and \type {\tracingrestores} is not there.
\stopitem

\startitem
    Register management in \LUATEX\ uses the \ALEPH\ model, so the maximum value
    is 65535 and the implementation uses a flat array instead of the mixed
    flat|\&|sparse model from \ETEX.
\stopitem

\startitem
    The \type {\savinghyphcodes} command is a no|-|op. \in {Chapter} [languages]
    explains why.
\stopitem

\startitem
    When kpathsea is used to find files, \LUATEX\ uses the \type {ofm} file
    format to search for font metrics. In turn, this means that \LUATEX\ looks at
    the \type {OFMFONTS} configuration variable (like \OMEGA\ and \ALEPH) instead
    of \type {TFMFONTS} (like \TEX\ and \PDFTEX). Likewise for virtual fonts
    (\LUATEX\ uses the variable \type {OVFFONTS} instead of \type {VFFONTS}).
\stopitem

\stopitemize

\stopsubsection

\startsubsection[title=Changes from \PDFTEX\ 1.40]

Because we want to produce \PDF\ the most natural starting point was the popular
\PDFTEX\ program. We inherited the stable features, dropped most of the
experimental code and promoted some functionality to core \LUATEX\ functionality,
which in turn triggered renaming primitives.

\startitemize

\startitem
    The (experimental) support for snap nodes has been removed, because it is
    much more natural to build this functionality on top of node processing and
    attributes. The associated primitives that are now gone are: \type
    {\pdfsnaprefpoint}, \type {\pdfsnapy}, and \type {\pdfsnapycomp}.
\stopitem

\startitem
    The (experimental) support for specialized spacing around nodes has also been
    removed. The associated primitives that are now gone are: \type
    {\pdfadjustinterwordglue}, \type {\pdfprependkern}, and \type
    {\pdfappendkern}, as well as the five supporting primitives \type
    {\knbscode}, \type {\stbscode}, \type {\shbscode}, \type {\knbccode}, and
    \type {\knaccode}.
\stopitem

\startitem
    A number of \quote {pdftex primitives} have been removed as they can be
    implemented using \LUA:

    \start \raggedright
    \type {\pdfelapsedtime}, \type {\pdfescapehex}, \type {\pdfescapename},
    \type {\pdfescapestring}, \type {\pdffiledump}, \type {\pdffilemoddate},
    \type {\pdffilesize}, \type {\pdfforcepagebox}, \type {\pdflastmatch},
    \type {\pdfmatch}, \type {\pdfmdfivesum}, \type {\pdfmovechars},
    \type {\pdfoptionalwaysusepdfpagebox},
    \type {\pdfoptionpdfinclusionerrorlevel}, \type {\pdfresettimer},
    \type {\pdfshellescape}, \type {\pdfstrcmp} and \type {\pdfunescapehex}
    \par \stop
\stopitem

\startitem
    The version related primitives \type {\pdftexbanner}, \type {\pdftexversion}
    and \type {\pdftexrevision} are no longer present as there is no longer a
    strict relationship with \PDFTEX\ development.
\stopitem

\startitem
    The experimental snapper mechanism has been removed and therefore also the
    primitives:

    \start \raggedright
    \type {\pdfignoreddimen}, \type {\pdffirstlineheight},
    \type {\pdfeachlineheight}, \type {\pdfeachlinedepth} and
    \type {\pdflastlinedepth}
    \par \stop
\stopitem

\startitem
    The experimental primitives \type {\primitive}, \type {\ifprimitive}, \type
    {\ifabsnum} and \type {\ifabsdim} are promoted to core primitives.
    The \type {\pdf*} prefixed originals are not available.
\stopitem

\startitem
    The \PNG\ transparency fix from 1.40.6 is not applied as high|-|level support
    is pending.
\stopitem

\startitem
    Two extra token lists are provided, \type {\pdfxformresources} and \type
    {\pdfxformattr}, as an alternative to \type {\pdfxform} keywords.
\stopitem

\startitem
    The current version of \LUATEX\ no longer replaces and|/|or merges fonts in
    embedded \PDF\ files with fonts of the enveloping \PDF\ document. This
    regression may be temporary, depending on what the rewritten font backend
    will look like.
\stopitem

\startitem
    The primitives \type {\pdfpagewidth} and \type {\pdfpageheight} have been
    removed because \type {\pagewidth} and \type {\pageheight} have that purpose.
\stopitem

\startitem
    The primitives \type {\pdfnormaldeviate}, \type {\pdfuniformdeviate}, \type
    {\pdfsetrandomseed} and \type {\pdfrandomseed} have been promoted to core
    primitives without \type {pdf} prefix so the original commands are no longer
    recognized.
\stopitem

\startitem
    The primitives \type {\ifincsname}, \type {\expanded} and \type {\quitvmode}
    are now core primitives.
\stopitem

\startitem
    As the hz and protrusion mechanisms are part of the core, the related
    primitives \type {\lpcode}, \type {\rpcode}, \type {\efcode}, \type
    {\leftmarginkern}, \type {\rightmarginkern} are promoted to core primitives.
    The two commands \type {\protrudechars} and \type {\adjustspacing} replace
    their \type {\pdf} prefixed originals.
\stopitem

\startitem
    The \type {\tagcode} primitive is promoted to core primitive.
\stopitem

\startitem
    The \type {\letterspacefont} feature is now part of the core but will not be
    changed (improved). We just provide it for legacy use.
\stopitem

\startitem
    The \type {\pdfnoligatures} primitive is now \type {\ignoreligaturesinfont}.
\stopitem

\startitem
    The \type {\pdffontexpand} primitive is now \type {\expandglyphsinfont}.
\stopitem

\startitem
    Because position tracking is also available in \DVI\ mode the \type
    {\savepos}, \type {\lastxpos} and \type {\lastypos} commands now replace
    their \type {pdf} prefixed originals.
\stopitem

\startitem
    Candidates for removal are \type {\pdfcolorstackinit} and \type
    {\pdfcolorstack}.
\stopitem

\startitem
    Candidates for replacement are \type {\pdfoutput} (\type {\outputmode}) and
    \type {\pdfmatrix} (something with a normal syntax).
\stopitem

\startitem
    The introspective primitives \type {\pdflastximagecolordepth} and \type
    {\pdfximagebbox} have been removed. One can use external applications to
    determine these properties or use the built|-|in \type {img} library.
\stopitem

\stopitemize

One change involves the so called xforms and ximages. In \PDFTEX\ these are
implemented as so called whatsits. But contrary to other whatsits they have
dimensions that need to be taken into account when for instance calculating
optimal linebreaks. In \LUATEX\ these are now promoted to normal nodes, which
simplifies code that needs those dimensions.

Another reason for promotion is that these are useful concepts. Backends can
provide the ability to use content that has been rendered in several places, and
images are also common. For that reason we also changed the names:

\starttabulate[|l|l|]
\NC \bf new name                            \NC \bf old name                 \NC \NR
\NC \type {\saveboxresource}                \NC \type {\pdfxform}            \NC \NR
\NC \type {\saveimageresource}              \NC \type {\pdfximage}           \NC \NR
\NC \type {\useboxresource}                 \NC \type {\pdfrefxform}         \NC \NR
\NC \type {\useimageresource}               \NC \type {\pdfrefximage}        \NC \NR
\NC \type {\lastsavedboxresourceindex}      \NC \type {\pdflastxform}        \NC \NR
\NC \type {\lastsavedimageresourceindex}    \NC \type {\pdflastximage}       \NC \NR
\NC \type {\lastsavedimageresourcepages}    \NC \type {\pdflastximagepages}  \NC \NR
\stoptabulate

There are a few \type {\pdf...} primitives that relate to this but these are
typical backend specific ones.
The index that gets returned is to be considered as \quote {just a number} and
although it still has the same meaning (object related) as before, you should not
depend on that.

\stopsubsection

\startsubsection[title=Changes from \ALEPH\ RC4]

Because we wanted proper directional typesetting the \ALEPH\ mechanisms looked
most attractive. These are rather close to the ones provided by \OMEGA, so what
we say next applies to both these programs.

\startitemize

\startitem
    The extended 16-bit math primitives (\type {\omathcode} etc.) have been
    removed.
\stopitem

\startitem
    The \OCP\ processing is no longer supported at all. As a consequence, the
    following primitives have been removed:

    \start \raggedright
    \type {\ocp}, \type {\externalocp}, \type {\ocplist}, \type {\pushocplist},
    \type {\popocplist}, \type {\clearocplists}, \type {\addbeforeocplist},
    \type {\addafterocplist}, \type {\removebeforeocplist},
    \type {\removeafterocplist} and \type {\ocptracelevel}
    \par \stop
\stopitem

\startitem
    \LUATEX\ only understands 4~of the 16~direction specifiers of \ALEPH: \type
    {TLT} (latin), \type {TRT} (arabic), \type {RTT} (cjk), \type {LTL}
    (mongolian). All other direction specifiers generate an error.
\stopitem

\startitem
    The input translations from \ALEPH\ are not implemented, so the related
    primitives are not available:

    \start \raggedright
    \type {\DefaultInputMode}, \type {\noDefaultInputMode}, \type {\noInputMode},
    \type {\InputMode}, \type {\DefaultOutputMode}, \type {\noDefaultOutputMode},
    \type {\noOutputMode}, \type {\OutputMode}, \type {\DefaultInputTranslation},
    \type {\noDefaultInputTranslation}, \type {\noInputTranslation},
    \type {\InputTranslation}, \type {\DefaultOutputTranslation},
    \type {\noDefaultOutputTranslation}, \type {\noOutputTranslation} and
    \type {\OutputTranslation}
    \par \stop
\stopitem

\startitem
    Several bugs have been fixed. The \type {\hoffset} bug when \type {\pagedir
    TRT} is gone, removing the need for an explicit fix to \type {\hoffset}.
    Also, the bug causing \type {\fam} to fail for family numbers above 15 is
    fixed. A fair number of other minor bugs are fixed as well, most of these
    related to \type {\tracingcommands} output.
\stopitem

\startitem
    The scanner for direction specifications now allows an optional space after
    the direction is completely parsed.
\stopitem

\startitem
    The \type {^^} notation can come in five and six item repetitions also, to
    insert characters that do not fit in the BMP.
\stopitem

\startitem
    Glues {\it immediately after} direction change commands are not legal
    breakpoints.
\stopitem

\startitem
    Several mechanisms that need to be right|-|to|-|left aware have been
    improved, for instance the placement of formula numbers.
\stopitem

\startitem
    The page dimension related primitives \type {\pagewidth} and \type
    {\pageheight} have been promoted to core primitives.
\stopitem

\startitem
    The primitives \type {\charwd}, \type {\charht}, \type {\chardp} and \type
    {\charit} have been removed as we have the \ETEX\ variants \type
    {\fontchar*}.
\stopitem

\startitem
    The two dimension registers \type {\pagerightoffset} and \type
    {\pagebottomoffset} are now core primitives.
\stopitem

\startitem
    The direction related primitives \type {\pagedir}, \type {\bodydir}, \type
    {\pardir}, \type {\textdir}, \type {\mathdir} and \type {\boxdir} are now
    core primitives.
\stopitem

\startitem
    The promotion of primitives to core primitives as well as the removal of all
    others means that the initialization namespace \type {aleph} is gone.
\stopitem

\stopitemize

\stopsubsection

\startsubsection[title=Changes from standard \WEBC]

The compilation framework is \WEBC\ and we keep using that but without the
\PASCAL\ to \CCODE\ step. This framework also provides some common features that
deal with reading bytes from files and locating files in \TDS. This is what we do
differently:

\startitemize

\startitem
    There is no mltex support.
\stopitem

\startitem
    There is no enctex support.
\stopitem

\startitem
    The following commandline switches are silently ignored, even in
    non|-|\LUA\ mode: \type {-8bit}, \type {-translate-file}, \type {-mltex},
    \type {-enc} and \type {-etex}.
\stopitem

\startitem
    The \type {\openout} whatsits are not written to the log file.
\stopitem

\startitem
    Some of the so|-|called web2c extensions are hard to set up in non|-|\KPSE\
    mode because \type {texmf.cnf} is not read: \type {shell-escape} is off (but
    that is not a problem because of \LUA's \type {os.execute}), and the paranoia
    checks on \type {openin} and \type {openout} do not happen (however, it is
    easy for a \LUA\ script to do this itself by overloading \type {io.open}).
\stopitem

\startitem
    The \quote{E} option does not do anything useful.
\stopitem

\stopitemize

\stopsubsection

\stopsection

\startsection[title=Implementation notes]

\startsubsection[title=Memory allocation]

The single internal memory heap that traditional \TEX\ used for tokens and nodes
is split into two separate arrays. Each of these will grow dynamically when
needed.

The \type {texmf.cnf} settings related to main memory are no longer used (these
are: \type {main_memory}, \type {mem_bot}, \type {extra_mem_top} and \type
{extra_mem_bot}). \quote {Out of main memory} errors can still occur, but the
limiting factor is now the amount of RAM in your system, not a predefined limit.

Also, the memory (de)allocation routines for nodes are completely rewritten. The
relevant code now lives in the C file \type {texnode.c}, and basically uses a
dozen or so \quote {avail} lists instead of a doubly|-|linked model. An extra
function layer is added so that the code can ask for nodes by type instead of
directly requisitioning a certain amount of memory words.

Because of the split into two arrays and the resulting differences in the data
structures, some of the macros have been duplicated. For instance, there are now
\type {vlink} and \type {vinfo} as well as \type {token_link} and \type
{token_info}.
All access to the variable memory array is now hidden behind a macro called \type
{vmem}.

The implementation of the growth of two arrays (via reallocation) introduces a
potential pitfall: the memory arrays should never be used as the left hand side
of a statement that can modify the array in question.

The input line buffer and pool size are now also reallocated when needed, and the
\type {texmf.cnf} settings \type {buf_size} and \type {pool_size} are silently
ignored.

\stopsubsection

\startsubsection[title=Sparse arrays]

The \type {\mathcode}, \type {\delcode}, \type {\catcode}, \type {\sfcode}, \type
{\lccode} and \type {\uccode} tables are now sparse arrays that are implemented
in~\CCODE. They are no longer part of the \TEX\ \quote {equivalence table} and
because each had 1.1 million entries with a few memory words each, this makes a
major difference in memory usage.

The \type {\catcode}, \type {\sfcode}, \type {\lccode} and \type {\uccode}
assignments do not yet show up when using the \ETEX\ tracing routines \type
{\tracingassigns} and \type {\tracingrestores} (code simply not written yet).

A side|-|effect of the current implementation is that \type {\global} is now more
expensive in terms of processing than non|-|global assignments.

See \type {mathcodes.c} and \type {textcodes.c} if you are interested in the
details.

Also, the glyph ids within a font are now managed by means of a sparse array and
glyph ids can go up to index $2^{21}-1$.

\stopsubsection

\startsubsection[title=Simple single-character csnames]

Single|-|character commands are no longer treated specially in the internals,
they are stored in the hash just like the multiletter csnames.

The code that displays control sequences explicitly checks if the length is one
when it has to decide whether or not to add a trailing space.

Active characters are internally implemented as a special type of multi|-|letter
control sequences that uses a prefix that is otherwise impossible to obtain.
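
The trailing|-|space rule can be observed from the \TEX\ end. The following is a
minimal sketch, not taken from the \LUATEX\ sources. The macro name \type {\test}
and the (undefined) control sequences stored in it are only illustrations, and
the snippet assumes the usual catcodes, so that \type {a}, \type {b} and \type
{c} are letters and the comma is not:

\starttyping
% \test only stores the tokens, so \a and \abc need not be defined
\def\test{\a\abc\,\relax}

% show how the stored control sequences are displayed
\message{\meaning\test}
\stoptyping

Under those assumptions the displayed meaning has a space after \type {\a},
\type {\abc} and \type {\relax} but none after \type {\,}, because a
one|-|character name only gets a trailing space when that character is a letter.
This matches what traditional \TEX\ shows, so the change in storage should not
be visible in log output.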

\stopsubsection

\startsubsection[title=Compressed format]

The format is passed through zlib, allowing it to shrink to roughly half of the
size it would have had in uncompressed form. This takes a bit more \CPU\ cycles
but much less disk \IO, so it should still be faster.

\stopsubsection

\startsubsection[title=Binary file reading]

All of the internal code is changed in such a way that if one of the \type
{read_xxx_file} callbacks is not set, then the file is read by a C function using
basically the same convention as the callback: a single read into a buffer big
enough to hold the entire file contents. While this uses more memory than the
previous code (that mostly used \type {getc} calls), it can be quite a bit faster
(depending on your I/O subsystem).

\stopsubsection

\stopsection

\stopchapter

\stopcomponent