summaryrefslogtreecommitdiff
path: root/source/luametatex/source/luametatex.h
diff options
context:
space:
mode:
Diffstat (limited to 'source/luametatex/source/luametatex.h')
-rw-r--r--source/luametatex/source/luametatex.h345
1 files changed, 345 insertions, 0 deletions
diff --git a/source/luametatex/source/luametatex.h b/source/luametatex/source/luametatex.h
new file mode 100644
index 000000000..c2536f461
--- /dev/null
+++ b/source/luametatex/source/luametatex.h
@@ -0,0 +1,345 @@
+/*
+ See license.txt in the root of this project.
+*/
+
+# ifndef LMT_LUAMETATEX_H
+# define LMT_LUAMETATEX_H
+
+/*tex
+
+ The \LUATEX\ project started in 2005 with an experiments by Hartmut and me: adding the \LUA\
+ Scripting language (that I knew from the \SCITE\ editor) to \PDFTEX. When we came to the
+ conclusion that a more tight integration made sense Taco did the impressive conversion from
+ \PASCAL\ |WEB\ to \CWEB. This happened in the perspective of the Oriental \TEX\ project, that
+ has as objective high quality Arabic typesetting. The way to achieve that was opening up the
+ font machinery and access to the paragraph building. It was an intense development period,
+ with Taco doing the coding, Hans exploring possibilities and extending \CONTEXT, and Idris
+ making fonts and testing. Taco and I discussed, compiled, accepted and rejected ideas. These
+ were interesting times! Over the years that we had used \TEX\ we could finally explore what we
+ had been talking about for years (long trips to user group meetings are good for that). We
+ ame to the first version(s) of \LUATEX\ with \CONTEXT\ \MKIV\ providing a testbed and as we
+ progressed we ended up with something we liked a lot.
+
+ After half a decade, where in the meantime Taco also had turned MetaPost into a library, we
+ had a version that had proved itself well. The following years, with Taco having less time
+ available, I started loking at the code. Some more got added to the Lua interfaces. Math got
+ split code paths and some new primitives were introduced. Luigi started taking care of managing
+ the code base so that I could cross compile for \MSWINDOWS. He also deals with the libraries
+ that were used and integration in \TEXLIVE\ and maintains the (by now stable) \METAPOST\ code
+ base.
+
+ After a while it became clear that users other than \CONTEXT\ wanted the program to stay as it
+ was and not introduce features or improve interfaces in ways that demanded a change in used
+ \LUA\ code. So, after a decade of development the official stable release took place. We already
+ had a split between stable (normally the \TEXLIVE\ release) and experimental (that we used for
+ development). However, in practice experimental versions were seen as real releases and we got
+ complaints that something could be broken (which actually is natural for an experimental
+ version). So, this split model didn't work out well in practice: you cannot explore and
+ experiment when you cannot play with yet unfinished code.
+
+ So at some point I decided that the best approach to a follow up, one not interfering with
+ usage of a stable \LUATEX, would be a more drastic split: the idea of \LUAMETATEX\ took shape.
+ This code base is the result of that. For whatever bad was introduced in \LUAMETATEX, and maybe
+ already before that in \LUATEX), you can blame me (Hans) and not Taco: Luigi consistently added
+ (hh) to the \LUATEX\ svn entries when that was feasible, so one can check where I messed up.
+ In the end all this work can be considered a co-product and the \CONTEXT\ (dev) community was
+ instrumental in this as well.
+
+ There are some fundamental changes: there is no backend but maybe I'll introduce a framework
+ for that at some point because the impact on performance has been quite noticeable (although
+ it has been compensated in the meantime). There is no support for \LUAJIT, because it doesn't
+ keep up with \LUA. Also, there is no support for \FFI, because that project is orphaned, but
+ there are other ways. Some more is delegated to \LUA, but also some more has been added to \TEX.
+
+ Over the 15 years that it took to go from the first version of \LUATEX\ in 2005 to the first
+ release of \LUAMETATEX\ in 2020 (although intermediate versions have always been good enough
+ to be used in production with \CONTEXT) I've written numerous articles in user group journals
+ as well as several presentations each year on progress and features. There are also wrapups
+ available in the \CONTEXT\ distribution that shed some light on how the developments
+ progress(ed). In the end it's all a work of many. There are no commercial interrests and
+ everything is done out of love for TeX and in free time, so take that into account when you
+ bark about code or documentation.
+
+ The \LUAMETATEX\ code base is maintained by Hans Hagen and Wolfgang Schuster (code, programming,
+ etc) with help from Mojca Miklavec (distribution, compile farm, etc) and Alan Braslau (testing,
+ feedback, etc). Of course with get help from all those \CONTEXT\ users who are always very
+ willing to test.
+
+ We start with the version numbers. While \LUATEX\ operates in the 100 range, the \LUAMETATEX\
+ engine takes the 200 range. Revisions range from 00 upto 99 and the dates \unknown\ depend on
+ the mood. The |2.05.00| version with the development id |20200229| was more or less the first
+ official version, in the sense that most of the things on my initial todo list were done. It's
+ a kind of virtual date as it happens to be a leapyear. As with LuaTeX the .10 version will be
+ the first 'stable' one, released somewhere around the ConTeXt 2021 meeting.
+
+ 2.08.18 : around TeXLive 2021 code freeze (so a bit of a reference version)
+ 2.09.35 : near the end of 2021 (so close to the 2.10 release date)
+ 2.09.55 : in July 2022 (the official release of the new math engine)
+ 2.10.00 : a few days before the ctx 2022 meeting (starting September 19)
+
+ At some point the \CONTEXT\ group will be responsible for guaranteeing that the official version
+ is what comes with \CONTEXT\ and that long term support and stabilty is guaranteed and that no
+ feature creep or messing up happens. We'll see.
+
+*/
+
+# include "tex/textypes.h"
+
+# define luametatex_version 210
+# define luametatex_revision 00
+# define luametatex_version_string "2.10.00"
+# define luametatex_development_id 20220918
+
+# define luametatex_name_camelcase "LuaMetaTeX"
+# define luametatex_name_lowercase "luametatex"
+# define luametatex_copyright_holder "Taco Hoekwater, Hans Hagen & Wolfgang Schuster"
+# define luametatex_bug_address "dev-context@ntg.nl"
+# define luametatex_support_address "context@ntg.nl"
+
+/*tex
+
+ One difference with \LUATEX\ is that we keep global variables that kind of belong together in
+ structures. This also has the advantage that we have more specific access (via a namespace) and
+ don't use that many macros (that can conflict later on).
+
+*/
+
+typedef struct version_state_info {
+ int version;
+ int revision;
+ const char *verbose;
+ const char *banner;
+ const char *compiler;
+ // const char *libc;
+ int developmentid;
+ int formatid;
+ const char *copyright;
+} version_state_info;
+
+extern version_state_info lmt_version_state;
+
+/*tex
+
+ This is actually the main headere file. Of course we could split it up and be more explicit in
+ other files but this is simple and just works. There is of course some overhead in loading
+ headers that are not used, but because compilation is simple and fast I don't care.
+
+*/
+
+# include <stdarg.h>
+# include <string.h>
+# include <math.h>
+# include <stdlib.h>
+# include <errno.h>
+# include <float.h>
+# include <locale.h>
+# include <ctype.h>
+# include <stdint.h>
+# include <stdio.h>
+# include <time.h>
+# include <signal.h>
+# include <sys/stat.h>
+
+# ifdef _WIN32
+ # include <windows.h>
+ # include <winerror.h>
+ # include <fcntl.h>
+ # include <io.h>
+# else
+ # include <unistd.h>
+ # include <sys/time.h>
+# endif
+
+/*tex
+
+ We use stock \LUA\ where we only adapt the bytecode format flag so that we can use intermediate
+ \LUA\ versions without crashes due to different bytecode. Here are some constants that have to
+ be set:
+
+ \starttyping
+ # define LUAI_HASHLIMIT 6
+ # define LUA_USE_JUMPTABLE 0
+ # define LUA_BUILD_AS_DLL 0
+ # define LUA_CORE 0
+ \stoptyping
+
+ Earlier versions of \LUA\ an definitely \LUAJIT\ needed the |LUAI_HASHLIMIT| setting to be
+ adapted in order not to loose performance. This flag is no longer in \LUA\ version 5.4+.
+
+*/
+
+# include "lua.h"
+# include "lauxlib.h"
+
+# define LUA_VERSION_STRING ("Lua " LUA_VERSION_MAJOR "." LUA_VERSION_MINOR "." LUA_VERSION_RELEASE)
+
+/*tex
+
+ The code in \LUAMETATEX\ is a follow up on \LUATEX\ which is itself a follow up on \PDFTEX\
+ (and parts of \ALEPH). The original \PASCAL\ code has been converted \CCODE. Substantial amounts
+ of code were added over a decade. Stepwise artifacts have been removed (for instance originating
+ in the transations from \PASCAL, or from integration in the infrastructure), parts of code has
+ been rewritten. As much as possible we keep the old naming intact (so that most of the \TEX\
+ documentation applies. However, as we now assume \CCODE, some things have changed. Among the
+ changes are handling datatypes and certain checks. For instance, when |null| is used this is
+ now always assumed to be |0|, so a zero test is also valid. Old side effects of zero nodes for
+ zero gluespecs are gone because these have been reimplemented. Of course we keep |NULL| as
+ abstraction for unset pointers. This way it's clear when we have a \CCODE\ pointer or a \TEX\
+ managed one (where |null| or |0| means no node or token).
+
+ As with all \TEX\ engines, \LUATEX\ started out with the \PASCAL\ version of \TEX\ and as
+ mentioned we started with \PDFTEX. The first thing that was done (by Taco) was to create a
+ permanent \CCODE\ base instead of \PASCAL. In the process, some macros and library interfacing
+ wrappers were moved to the \LUATEX\ code base. Sometimes \PASCAL\ and \CCODE\ don't map well
+ end intermediate functions were used for that. Over time some artifacts that resulted from
+ automatic conversions from one to the other has been removed.
+
+ In the next stage of \LUATEX\ development, we went a but further and tried to get rid of more
+ dependencies. Among the rationales for this is that we depend on \LUA, and whatever works for
+ the \LUA\ codebase (which is quite portable) should also work for \LUATEX. But there are always
+ some overloads because (especially in \LUATEX\ where one can use \KPSE) the integration in a
+ \TEX\ ecosystem expects some behaviour with respect to files and running subprocesses and such.
+ In \LUAMETATEX\ there is less of that because \CONTEXT\ does more of that itself.
+
+ So, one of the biggest complications was the dependency on the \WEBC\ helpers and file system
+ interface. However, because that was already kind of isolated, it could be removed. If needed
+ we can always bring back \KPSE\ as an external library. In the process there can be some side
+ effects but in the end it gives a cleaner codebase and less depedencies. We suddenly don't need
+ all kind of tweaks to get the program compiled.
+
+ The \TEX\ memory model is based on packing data in memory words, but that concept is somewhat
+ fluid as in the past we had 16 byte processors too. However, we now mostly think in 32 bit and
+ internally \LUATEX\ will pack most of its node data in a multiples of 64 bits (called words). On
+ the one hand there is more memory involved but on the other hand it suits the architectures
+ well. In \LUAMETATEX\ we target 64 bit machines, but still provide binaries for 32 bit
+ architectures. The endianness related code has been dropped, simply because already for decades,
+ format files are not shared between platforms either.
+
+ Because \TEX\ efficiently implements its own memory management of nodes, the address of a node
+ is actually a number. Numbers like are sometimes indicates as |pointer|, but can also be called
+ |halfword|. Dimensions also fit into half a word and are called |scaled| but again we see them
+ being called |halfword|. What term is used depends a bit on the location and also on the
+ original code. For now we keep this mix but maybe some day we will normalize this. I did look
+ into more dynamic loading (only using the main memory numeric address pointers because that is
+ fast and efficient) but it makes the code more complex and probably hit performance badly. But
+ I keep an eye on it.
+
+ When we have halfwords representing pointers (into the main memory array) we indicate an unset
+ pointer as |null| (lowercase). But, because the usage of |null| and |0| was kind of mixed and
+ inconstent the |null| is only used to indicate zeroing a halfword encoded pointer. It will
+ always remain |0|.
+
+ We could reshuffle a lot more and normalize defines and enums but for now we stick to the way
+ it's done in order to divert not too much from the ancestors. However, in due time it can
+ evolve. Some constants used in \TEX\ the program now have a prefix |namespace_| or suffix
+ |_code| or |_cmd| in order not to clash with other usage. Some of these are in files like
+ |texcommands.h| and |texequivalents.h| but others end up in other |.h| files. This might change
+ but in the end it's not that important. Consider the spread a side effect of the still present
+ ideas of literate programming.
+
+ Some of the modules put data into the structures that could have been kept private but for now
+ I decided to be a bit consistent. However, of course there are still quite some private
+ variables left.
+
+*/
+
+/*tex This is not used (yet) as I don't expect much from it, but \LUA\ has some of it. */
+
+# if defined(__GNUC__)
+# define lmt_likely(x) (__builtin_expect(((x) != 0), 1))
+# define lmt_unlikely(x) (__builtin_expect(((x) != 0), 0))
+# else
+# define lmt_likely(x) (x)
+# define lmt_unlikely(x) (x)
+# endif
+
+# include "utilities/auxarithmetic.h"
+# include "utilities/auxmemory.h"
+# include "utilities/auxzlib.h"
+
+# include "tex/texmainbody.h"
+
+# include "lua/lmtinterface.h"
+# include "lua/lmtlibrary.h"
+# include "lua/lmttexiolib.h"
+
+# include "utilities/auxsystem.h"
+# include "utilities/auxsparsearray.h"
+# include "utilities/auxunistring.h"
+# include "utilities/auxfile.h"
+
+# include "libraries/hnj/hnjhyphen.h"
+
+# include "tex/texexpand.h"
+# include "tex/texmarks.h"
+# include "tex/texconditional.h"
+# include "tex/textextcodes.h"
+# include "tex/texmathcodes.h"
+# include "tex/texalign.h"
+# include "tex/texrules.h"
+/* "tex/texdirections.h" */
+# include "tex/texerrors.h"
+# include "tex/texinputstack.h"
+# include "tex/texstringpool.h"
+# include "tex/textoken.h"
+# include "tex/texprinting.h"
+# include "tex/texfileio.h"
+# include "tex/texarithmetic.h"
+# include "tex/texnesting.h"
+# include "tex/texadjust.h"
+# include "tex/texinserts.h"
+# include "tex/texlocalboxes.h"
+# include "tex/texpackaging.h"
+# include "tex/texscanning.h"
+# include "tex/texbuildpage.h"
+# include "tex/texmaincontrol.h"
+# include "tex/texdumpdata.h"
+# include "tex/texmainbody.h"
+# include "tex/texnodes.h"
+# include "tex/texdirections.h"
+# include "tex/texlinebreak.h"
+# include "tex/texmath.h"
+# include "tex/texmlist.h"
+# include "tex/texcommands.h"
+# include "tex/texprimitive.h"
+# include "tex/texequivalents.h"
+# include "tex/texfont.h"
+# include "tex/texlanguage.h"
+
+# include "lua/lmtcallbacklib.h"
+# include "lua/lmttokenlib.h"
+# include "lua/lmtnodelib.h"
+# include "lua/lmtlanguagelib.h"
+# include "lua/lmtfontlib.h"
+# include "lua/lmtlualib.h"
+# include "lua/lmttexlib.h"
+# include "lua/lmtenginelib.h"
+
+/*tex
+
+ We use proper warnings, error messages, and confusion reporting instead of:
+
+ \starttyping
+ # ifdef HAVE_ASSERT_H
+ # include <assert.h>
+ # else
+ # define assert(expr)
+ # endif
+ \stoptyping
+
+ In fact, we don't use assert at all in \LUAMETATEX\ because if we need it we should do a decent
+ test and report an issue. In the \TEXLIVE\ eco system there can be assignments and function
+ calls in asserts which can disappear in case of e.g. compiling with msvc, so the above define
+ is even wrong!
+
+*/
+
+// # ifndef _WIN32
+//
+// /* We don't want these use |foo_s| instead of |foo| messages. This will move. */
+//
+// # define _CRT_SECURE_NO_WARNINGS
+//
+// # endif
+
+# endif