summaryrefslogtreecommitdiff
path: root/doc/context/sources/general/manuals/onandon/onandon-53.tex
diff options
context:
space:
mode:
Diffstat (limited to 'doc/context/sources/general/manuals/onandon/onandon-53.tex')
-rw-r--r--doc/context/sources/general/manuals/onandon/onandon-53.tex283
1 files changed, 155 insertions, 128 deletions
diff --git a/doc/context/sources/general/manuals/onandon/onandon-53.tex b/doc/context/sources/general/manuals/onandon/onandon-53.tex
index 46eac3510..c724bf810 100644
--- a/doc/context/sources/general/manuals/onandon/onandon-53.tex
+++ b/doc/context/sources/general/manuals/onandon/onandon-53.tex
@@ -2,49 +2,55 @@
\startcomponent onandon-53
+% copy-edited by Alan Braslau
+
\environment onandon-environment
\startchapter[title={From \LUA\ 5.2 to 5.3}]
-When we started with \LUATEX\ we used \LUA\ 5.1 and moved to 5.2 when that became
-available. We didn't run into issues then because there were no fundamental
-changes that could not be dealt with. However, when \LUA\ 5.3 was announced in
-2015 we were not sure if we should make the move. The main reason was that we'd
-chosen \LUA\ because of its clean design which meant that we had only one number
-type: double. In 5.3 on the other hand, deep down a number can be either an
-integer or a floating point quantity.
-
-Internally \TEX\ is mostly (up to) 32-bit integers and when we go from \LUA\ to
-\TEX\ we round numbers. Nonetheless one can expect some benefits in using
-integers. Performance|-|wise we didn't expect much, and memory consumption would
-be the same too. So, the main question then was: can we get the same output and
-not run into trouble due to possible differences in serializing numbers; after
-all \TEX\ is about stability. The serialization aspect is for instance important
-when we compare quantities and|/|or use numbers in hashes.
-
-Apart from this change in number model, which comes with a few extra helpers,
-another extension in 5.3 was that bit|-|wise operations are now part of the
-language. The lpeg library is still not part of stock \LUA. There is some minimal
-\UTF8 support, but less than we provide in \LUATEX\ already. So, looking at these
-changes, we were not in a hurry to update. Also, it made sense to wait till this
-important number|-|related change was stable.
-
-But, a few years later, we still had it on our agenda to test, and after the
-\CONTEXT\ 2017 meeting we decided to give it a try; here are some observations. A
-quick test was just dropping in the new \LUA\ code and seeing if we could make a
-\CONTEXT\ format. Indeed that was no big deal but a test run failed because at
-some point a (for instance) \type {1} became a \type {1.0}. It turned out that
-serializing has some side effects. And with some ad hoc prints for tracing (in
-the \LUATEX\ source) I could figure out what went on. How numbers are seen can
-(to some extent) be deduced from the \type {string.format} function, which is in
-\LUA\ a combination of parsing, splitting and concatenation combined with piping
-to the \CCODE\ \type {sprintf} function. \footnote {Actually, at some point I
-decided to write my own formatter on top of \type {format} and I ended up with
-splitting as well. It's only now that I realize why this is working out so well
-(in terms of performance): simple format (single items) are passed more or less
-directly to \type {sprintf} and as \LUA\ itself is fast, due to some caching, the
-overhead is small compared to the built|-|in splitter method. And the \CONTEXT\
-formatter has many more options and is extensible.}
+When we started with \LUATEX\ we used \LUA\ 5.1 and then moved seamlessly to 5.2
+when that became available. We didn't run into issues with this language version
+change because there were no fundamental differences that could not be easily
+dealt with. However, when \LUA\ 5.3 was announced in 2015 we were not sure if we
+should make the move. The main reason was that we'd chosen \LUA\ because of its
+clean design part of which meant that we had only one number type: double. In 5.3
+on the other hand, deep down a number can be either an integer or a floating
+point quantity.
+
+Internally \TEX\ is mostly (up to) 32-bit integers so when we go from \LUA\ to
+\TEX\ we are forced to round numbers. Nonetheless, or perhaps because of this,
+one can expect some benefits in using integers in \LUA. Performance|-|wise we
+didn't expect much, and memory consumption would be the same too. So the main
+question then was: can we get the same output and not run into trouble due to
+possible differences in serializing numbers? After all \TEX\ is about stability.
+The serialization aspect is for instance important when we compare quantities
+and|/|or use numbers in hashes, so one must be careful.
+
+Apart from this change in the number model (which comes with a few extra
+helpers), another interesting extension in 5.3 was that bit|-|wise operations are
+now part of the language. However, the lpeg library is still not part of stock
+\LUA. There is also added some minimal \UTF8 support, but less than we provide in
+\LUATEX\ already. So, considering these changes, we were not in a big hurry to
+update. Also, it made sense to wait until this important number|-|related change
+became stable.
+
+But, a few years later, we still had it on our agenda to test the new version of
+\LUA, and after the \CONTEXT\ 2017 meeting we decided to give it a try; here are
+some observations. A quick test involved just dropping in the new \LUA\ code and
+seeing if with this we could still compile a \CONTEXT\ format. Indeed that was no
+big deal but the test run failed because at some point a (for instance) \type {1}
+became a \type {1.0}. It turned out that serializing has some side effects, and
+with some ad hoc prints for tracing (in the \LUATEX\ source) I could figure out
+what was going on. How numbers are seen can (to some extent) be deduced from the
+\type {string.format} function, which is in \LUA\ a combination of parsing,
+splitting and concatenation combined with piping to the \CCODE\ \type {sprintf}
+function: \footnote {Actually, at some point I decided to write my own formatter
+on top of \type {format} and I ended up with splitting as well. It's only now
+that I realize why this is working out so well (in terms of performance): simple
+format (single items) are passed more or less directly to \type {sprintf} and as
+\LUA\ itself is fast, due to some caching, the overhead is small compared to the
+built|-|in splitter method. An advantage is that the \CONTEXT\ formatter has many
+more options and is also extensible.}
\starttyping
local a = 2 * (1/2) print(string.format("%s", a),math.type(x))
@@ -76,10 +82,10 @@ This gives the following results:
\BC k \NC -2 \NC .0f \NC 2 \NC integer \NC \NR
\stoptabulate
-This demonstrates that we have to be careful when we need these numbers
-represented as strings. In \CONTEXT\ the number of places where we had to check
-for that was not that large; in fact, only some hashing related to font sizes had
-to be done using explicit rounding.
+This demonstrates that we have to be careful when we need numbers represented as
+strings. In \CONTEXT\ the places where we had to check for this was not that
+many: in fact, only some hashing related to font sizes had to be done using
+explicit rounding.
Another surprising side effect is the following. Instead of:
@@ -103,22 +109,23 @@ because we don't want this to be serialized to \type {64.0} which is due to the
fact that a power results in a float. One can wonder if this makes sense when we
apply it to an integer.
-At any rate, once we could process a file, two documents were chosen for a
-performance test. Some experiments with loops and casts had demonstrated that we
-could expect a small performance hit and indeed, this was the case. Processing
-the \LUATEX\ manual takes 10.7 seconds with 5.2 on my 5-year-old laptop and 11.6
-seconds with 5.3. If we consider that \CONTEXT\ spends 50\% of its time in \LUA,
-then we see a 20\% performance penalty. Processing the \METAFUN\ manual (which
-has lots of \METAPOST\ images) went from less than 20 seconds (\LUAJITTEX\ does
-it in 16 seconds) up to more than 27 seconds. So there we lose more than 50\% on
-the \LUA\ end. When we observed these kinds of differences, Luigi and I
-immediately got into debugging mode, partly out of curiosity, but also because
-consistent performance is important to~us.
-
-Because these numbers made no sense, we traced different sub-mechanisms and
-eventually it became clear that the reason for the speed penalty was that the
+At any rate, once we were able to process a file, two standard documents were
+chosen for a performance test. Some experiments with loops and casts had
+demonstrated that we could expect a small performance hit and indeed, this was
+the case. Processing the \LUATEX\ manual takes 10.7 seconds with 5.2 on my
+5-year-old laptop and 11.6 seconds with 5.3. If we consider that \CONTEXT\ spends
+about 50\% of its time in \LUA, then we find here a 20\% performance penalty
+using the later version of \LUA. Processing the \METAFUN\ manual (which has lots
+of \METAPOST\ images) went from less than 20 seconds (and \LUAJITTEX\ does it in
+16 seconds) to up to more than 27 seconds. So there we lose more than 50\% on the
+\LUA\ end. When we observed these kinds of differences, Luigi and I immediately
+got into debugging mode, partly out of curiosity but also because consistent
+performance is always important to us.
+
+As these results made no sense, we traced different sub-mechanisms and eventually
+it became clear that the reason behind the speed penalty was in fact that the
core \typ {string.format} function was behaving quite badly in the \type {mingw}
-cross-compiled binary, as seen by this test:
+cross|-|compiled binary, as can be seen by this test:
\starttyping
local t = os.clock()
@@ -137,13 +144,13 @@ print(os.clock()-t)
\BC c \NC 0.26 \NC 0.68 \NC 3.67 (0.29) \NC 0.66 \NC \NR
\stoptabulate
-The 5.2 binaries perform the same but the 5.3 Lua binary greatly outperforms
-\LUATEX, and so we had to figure out why. After all, all this integer
-optimization could bring some gain! It took us a while to figure this out. The
-numbers in parentheses are the results after fixing this.
+Both 5.2 binaries perform the same but the 5.3 \LUA\ binary greatly outperforms
+the \LUATEX binary so we had to figure out why. After all, the integer
+optimization should bring some gain! It took us a while to figure out what was
+going wrong, and the numbers in parentheses are the results after fixing \LUATEX.
Because font internals are specified in integers one would expect a gain
-in running:
+in running the command:
\starttyping
mtxrun --script font --reload force
@@ -151,8 +158,10 @@ mtxrun --script font --reload force
and indeed that is the case. On my machine a scan results in 2561 registered
fonts from 4906 read files and with 5.2 that takes 9.1 seconds while 5.3 needs a
-bit less: 8.6 seconds (with the bad format performance) and even less once that
-was fixed. For a test:
+bit less: 8.6 seconds (with the bad cross|-|compiled format performance) and even
+less once that was fixed.
+
+For a test:
\starttyping
\setupbodyfont[modern] \tf \bf \it \bs
@@ -163,21 +172,23 @@ was fixed. For a test:
\starttext \stoptext
\stoptyping
-This code needs 30\% more runtime so the question is: how often do we call \type
-{string.format} there? A first run (when we wipe the font cache) needs some
-715,000 calls while successive runs need 115,000 calls so that slow down
-definitely comes from the bad handling of \type {string.format}. When we drop in
-a \LUA\ update or whatever other dependency we don't want this kind of impact. In
-fact, when one uses external libraries that are or can be compiled under the
-\TEX\ Live infrastructure and the impact would be such, it's bad advertising,
-especially when one considers the occasional complaint about \LUATEX\ being
-slower than other engines.
+This code needs 30\% more runtime using the newer version of \LUA\ so the
+question is: how often do we call \type {string.format} there? A first run (when
+we wipe the font cache) needs some 715\,000 calls while successive runs need
+115\,000 calls so the slow down definitely comes from the bad handling of \type
+{string.format}.
+
+When we drop in a \LUA\ or whatever other dependency update we don't want this
+kind of impact. In fact, when one uses external libraries that are or can be
+compiled under the \TEX\ Live infrastructure and the impact would be so dramatic,
+this would be very bad advertising, especially when one considers the occasional
+complaint about \LUATEX\ being slower than other engines.
The good news is that eventually Luigi was able to nail down this issue and we
got a binary that performed well. It looks like \LUA\ 5.3.4 (cross|)|compiles
-badly with \GCC\ 5.3.0 and 6.3.0.
+badly under both \GCC\ 5.3.0 and 6.3.0.
-So in the end caching the fonts takes:
+So in the end loading the fonts takes:
\starttabulate[||c|c|]
\BC \BC caching \BC running \NC \NR
@@ -186,18 +197,20 @@ So in the end caching the fonts takes:
\BC 5.3 fixed \NC 6.3 \NC 1.0 \NC \NR
\stoptabulate
-So indeed it looks like 5.3 is able to speed up \LUATEX\ a bit, given that one
-integrates it in the right way! Using a recent compiler is needed too, although
-one can wonder when a bad case will show up again. One can also wonder why such a
-slow down can mostly go unnoticed, because for sure \LUATEX\ is not the only
-compiled program.
+So indeed after an initial scare it looks like 5.3 is able to speed up \LUATEX\ a
+bit, given that one integrates it in the right way! The use of a recent compiler
+is needed here, although one can wonder when another bad case will show up again.
+One can also wonder why such a slow down can mostly go unnoticed, because for
+sure \LUATEX\ is not the only compiled program integrating the \LUA\ language.
+\footnote{We can only speculate that others do not pay such close attention to
+performance.}
The next examples are some edge cases that show you need to be aware
that
\startitemize[n,text,nostopper]
\startitem an integer has its limits, \stopitem
- \startitem that hexadecimal numbers are integers and \stopitem
- \startitem that \LUA\ and \LUAJIT\ can be different in details. \stopitem
+ \startitem that hexadecimal numbers are integers, and \stopitem
+ \startitem that \LUA\ 5.2 and \LUAJIT\ can differ in small details: \stopitem
\stopitemize
\starttabulate[||T|T|]
@@ -208,47 +221,53 @@ that
\BC lua 53 \NC -1 \NC 9223372036854775807 \NC \NR
\stoptabulate
-So, to summarize the process. A quick test was relatively easy: move 5.3 into the
-code base, adapt a little bit of internals (there were some \LUATEX\ interfacing
-bits where explicit rounding was needed), run tests and eventually fix some
-issues related to the Makefile (compatibility) and \CCODE\ obscurities (the slow
-\type {sprintf}). Adapting \CONTEXT\ was also not much work, and the test suite
-uncovered some nasty side effects. For instance, the valid 5.2 solution:
+We see here that \LUA\ 5.3 clearly represents some progress.
+
+So, to summarize the migration, a quick test was relatively easy: move 5.3 into
+the code base, make slight adaptations to the internals (there were a few
+\LUATEX\ interfacing bits where explicit rounding was needed), run tests, and
+eventually fix some issues related to the Makefile (compatibility) and \CCODE\
+obscurities (the very slow \type {sprintf}). \footnote{This demonstrates the
+importance of compilers, or rather how one writes code with respect to each
+compiler.}
+
+Adapting \CONTEXT\ was also not much work, but the test suite uncovered some
+nasty side effects. For instance, the valid 5.2 solution:
\starttyping
local s = string.format("02X",u/1024)
local s = string.char (u/1024)
\stoptyping
-now has to become (both 5.2 and 5.3):
+now has to become (works with both 5.2 and 5.3):
\starttyping
local s = string.format("02X",math.floor(u/1024))
local s = string.char (math.floor(u/1024))
\stoptyping
-or (both 5.2 and (emulated or real) 5.3):
+or (with 5.2 and emulated or real 5.3):
\starttyping
local s = string.format("02X",bit32.rshift(u,10))
local s = string.char (bit32.rshift(u,10))
\stoptyping
-or (only 5.3):
+or (5.3 only):
\starttyping
local s = string.format("02X",u >> 10))
local s = string.char (u >> 10)
\stoptyping
-or (only 5.3):
+or (5.3 only):
\starttyping
local s = string.format("02X",u//1024)
local s = string.char (u//1024)
\stoptyping
-A conditional section like:
+Unfortunately, adapting a conditional section like:
\starttyping
if LUAVERSION >= 5.3 then
@@ -260,53 +279,61 @@ else
end
\stoptyping
-will fail because (of course) the 5.2 parser doesn't like that. In \CONTEXT\ we
-have some experimental solutions for that but that is beyond this summary.
+will fail because (of course) the 5.2 parser doesn't like the 5.3 syntax
+elements. In \CONTEXT\ we have some experimental solutions for this but it is
+beyond the scope of this summary.
-In the process a few \UTF\ helpers were added to the string library so that we
-have a common set for \LUAJIT\ and \LUA\ (the \type {utf8} library that was added
-to 5.3 is not that important for \LUATEX). For now we keep the \type {bit32}
-library on board. Of course we'll not mention all the details here.
+In the process of this update a few \UTF\ helpers were added to the string
+library so that we have a common set for both \LUAJIT\ and \LUA\ (the \type
+{utf8} library that was added to 5.3 is not very useful for \LUATEX). For now we
+also keep the \type {bit32} library on board, of course, we'll not mention all
+the details here.
-When we consider a gain in speed of 5-10\% with 5.3 that also means that the gain
-of \LUAJITTEX\ compared to 5.2 becomes less. For instance in font processing both
-engines now perform closer to the same.
+When we consider a gain in speed of 5–10\% with 5.3 that also means that the gain
+obtained using \LUAJITTEX\ compared to \LUA\ 5.2 becomes less important. For
+instance in font processing both engines (\LUA\ 5.3 and \LUAJIT) now perform
+roughly to the same.
As I write this, we've just entered 2018 and after a few months of testing
\LUATEX\ with \LUA\ 5.3 we're confident that we can move the code to the
experimental branch. This means that we will use this version in the \CONTEXT\
-distribution and likely will ship this version as 1.10 in 2019, where it becomes
+distribution and likely will ship this as 1.10 in 2019 where \LUA\ 5.3 becomes
the default. The 2018 version of \TEX~Live will have 1.07 with \LUA\ 5.2 while
intermediate versions of the \LUA\ 5.3 binary will end up on the \CONTEXT\
garden, probably with number 1.08 and 1.09 (who knows what else we will add or
change in the meantime).
-\subsubject{addendum}
-
-Around the 2018 meeting I started with what is to become the next major upgrade
-of \CONTEXT, this time using \LUAMETATEX. When working on that I decided to try
-\LUA\ 5.4 and see what consequences that would have for us. There are no real
-conceptual changes, as with the number model in 5.3, so the tests didn't reveal
-any issues. But as an additional step towards a bit cleaner distinction between
-strings and numbers, I disabled the casting so that mixing them in expression for
-instance is no longer permitted. If I remember right only in one place I had to
-adapt the source (and in the meantime we're talking of a pretty large code base).
-
-There is a new mechanism for freezing constants but I'm not yet sure if it makes
-much sense to use it. It goes along with some other restrictions, like the
-possibility to adapt loop counters inside the loop. Inside the body of a loop one
-could always adapt such a variable, which (I can imagine) can come in handy. I
-didn't check yet the source code for that, but probably I don't do that.
-
-Another new features is an alternative garbage collector which seems to perform
-better when there are many variables with s short live span. For now I decided
-to default to this variant in future releases.
+\subsubject{Addendum}
+
+Around the 2018 meeting I also started what is to become the next major upgrade
+of \CONTEXT, this time using a new engine \LUAMETATEX. In working on that I
+decided to try \LUA\ 5.4 to see what consequences this new version would have for
+us. There are no real conceptual changes as were found with the number model in
+5.3, so the tests didn't reveal any real issues. But as an additional step
+towards a bit cleaner distinction between strings and numbers, I decided to
+disable the automatic casting so that mixing strings and numbers in expression
+for instance is no longer permitted. If I remember correctly, there was only in
+one place I had to adapt the source (and we're talking about a pretty large \LUA\
+code base).
+
+There is a new mechanism in \LUA\ for freezing constants but I'm not yet sure if
+it makes much sense to use it, although one of the intentions is to produce more
+efficient bytecode. \footnote {Mid July 2019 some quick tests indeed show a
+performance boost with the experimental code base, but if we want to benefit from
+using constants, the \CONTEXT\ codebase has to be adapted, which means that those
+parts no longer will work with stock \LUATEX.} It's use goes along with some
+other restrictions, like the possibility to adapt loop counters inside the loop.
+Inside the body of a loop one could always adapt such a variable, which (I can
+imagine) can come in handy. I haven't checked the source code for that, but
+probably I don't do this anywhere.
+
+Another new feature is an alternative garbage collector which seems to perform
+better when there are many variables with a short life spans. At least for now I
+have decided to default to this variant in future releases.
Overall the performance of \LUA\ 5.4 is better than its predecessors which means
-that the gape between \LUATEX\ and \LUAJITTEX\ is closing. This is good because
-in \LUAMETATEX\ I will not support that variant.
-
-\stopchapter
+that the gap between \LUATEX\ and \LUAJITTEX\ is closed or is closing. This is
+good because I have chosen not to support \LUAJIT\ in \LUAMETATEX.
\stopcomponent