Diffstat (limited to 'doc/context/sources/general/manuals/onandon/onandon-performance.tex')
-rw-r--r-- doc/context/sources/general/manuals/onandon/onandon-performance.tex | 523
1 files changed, 262 insertions, 261 deletions
diff --git a/doc/context/sources/general/manuals/onandon/onandon-performance.tex b/doc/context/sources/general/manuals/onandon/onandon-performance.tex
index 279383a8c..b1b34443d 100644
--- a/doc/context/sources/general/manuals/onandon/onandon-performance.tex
+++ b/doc/context/sources/general/manuals/onandon/onandon-performance.tex
@@ -28,8 +28,8 @@ So what exactly does performance refer to? If you use \CONTEXT\ there are
probably only two things that matter:
\startitemize[packed]
-\startitem How long does one run take. \stopitem
-\startitem How many runs do I need. \stopitem
+\startitem How long does one run take? \stopitem
+\startitem How many runs do I need? \stopitem
\stopitemize
Processing speed is reported at the end of a run in terms of seconds spent on the
@@ -50,72 +50,74 @@ i7-3840QM as reference. A simple
\stoptext
\stoptyping
-document reports 0.4 seconds but as we wrap the run in an \type {mtxrun}
-management run we have an additional 0.3 overhead (auxiliary file handling, \PDF\
-viewer management, etc). This includes loading the Latin Modern font. With
-\LUAJITTEX\ these times are below 0.3 and 0.2 seconds. It might look like much
-overhead but in an edit|-|preview runs it feels snappy. One can try this:
+document reports 0.4 seconds but, as we wrap the run in an \type {mtxrun}
+management run, we have an additional 0.3 seconds of overhead (auxiliary file
+handling, \PDF\ viewer management, etc.). This includes loading the Latin Modern
+font. With \LUAJITTEX, these times are below 0.3 and 0.2 seconds. It might look
+like a lot of overhead, but in edit|-|preview runs it feels snappy. One can try this:
\starttyping
\stoptext
\stoptyping
-which bring down the time to about 0.2 seconds for both engines but as it doesn't
-do anything useful that is is no practice.
+which brings down the time to about 0.2 seconds for both engines, but it doesn't
+do anything useful in practice.
-Finishing a document is not that demanding because most gets flushed as we go.
-The more (large) fonts we use, the longer it takes to finish a document but on
+Finishing a document is not that demanding, because most gets flushed as we go.
+The more (large) fonts we use, the longer it takes to finish a document, but, on
the average that time is not worth noticing. The main runtime contribution comes
from processing the pages.
Okay, this is not always true. For instance, if we process a 400 page book from
2500 small \XML\ files with multiple graphics per page, there is a little
-overhead in loading the files and constructing the \XML\ tree as well as in
-inserting the graphics but in such cases one expects a few seconds more runtime. The
-\METAFUN\ manual has some 450 pages with over 2500 runtime generated \METAPOST\
-graphics. It has color, uses quite some fonts, has lots of font switches
-(verbatim too) but still one run takes only 18 seconds in stock \LUATEX\ and less
-that 15 seconds with \LUAJITTEX. Keep these numbers in mind if a non|-|\CONTEXT\
-users barks against the performance tree that his few page mediocre document
-takes 10 seconds to compile: the content, styling, quality of macros and whatever
-one can come up with all plays a role. Personally I find any rate between 10 and
-30 pages per second acceptable, and if I get the lower rate then I normally know
-pretty well that the job is demanding in all kind of aspects.
-
-Over time the \CONTEXT||\LUATEX\ combination, in spite of the fact that more
+overhead in loading the files and in constructing the \XML\ tree as well as in
+inserting the graphics, but in such cases one expects a few seconds longer
+runtime. The \METAFUN\ manual has some 450 pages with over 2500 runtime|-|generated
+\METAPOST\ graphics. It has color, uses quite some fonts, has lots of font
+switches (verbatim, too), but, still, one run takes only 18 seconds in stock
+\LUATEX\ and less than 15 seconds with \LUAJITTEX. Keep these numbers in
+mind if a non|-|\CONTEXT\ user barks against the performance tree that his
+few|-|page mediocre document takes 10 seconds to compile: the content, styling, quality
+of macros and whatever one can come up with all play a role. Personally, I find
+any rate between 10 and 30 pages per second acceptable, and, if I get the lower
+rate, then I normally know pretty well that the job is demanding in all kinds of
+aspects.
+
+Over time, the \CONTEXT||\LUATEX\ combination, in spite of the fact that more
functionality has been added, has not become slower. In fact, some subsystems
-have been sped up. For instance font handling is very sensitive for adding
+have been sped up. For instance, font handling is very sensitive to adding
functionality. However, each version so far performed a bit better. Whenever some
neat new trickery was added, at the same time improvements were made thanks to
-more insight in the matter. In practice we're not talking of changes in speed by
+more insight in the matter. In practice, we're not talking of changes in speed by
large factors but more by small percentages. I'm pretty sure that most \CONTEXT\
-users never noticed. Recently a 15\endash30\% speed up (in font handling) was
-realized (for more complex fonts) but only when you use such complex fonts and
-pages full of text you will see a positive impact on the whole run.
+users never noticed. Recently, a 15\endash30\% speed up (in font handling) was
+realized (for more complex fonts), but only when you use such complex fonts and
+pages full of text will you see a positive impact on the whole run.
There is one important factor I didn't mention yet: the efficiency of the
console. You can best check that by making a format (\typ {context --make en}).
When that is done by piping the messages to a file, it takes 3.2 seconds on my
laptop and about the same when done from the editor (\SCITE), maybe because the
\LUATEX\ run and the log pane run on a different thread. When I use the standard
-console it takes 3.8 seconds in Windows 10 Creative update (in older versions it
+console, it takes 3.8 seconds in Windows 10 Creative update (in older versions it
took 4.3 and slightly less when using a console wrapper). The powershell takes
-3.2 seconds which is the same as piping to a file. Interesting is that in Bash on
-Windows it takes 2.8 seconds and 2.6 seconds when piped to a file. Normal runs
-are somewhat slower, but it looks like the 64 bit Linux binary is somewhat faster
-than the 64 bit mingw version. \footnote {Long ago we found that \LUATEX\ is very
-sensitive to for instance the \CPU\ cache so maybe there are some differences due
-to optimization flags and|/|or the fact that bash runs in one thread and all file
-\IO\ in the main windows instance. Who knows.} Anyway, it demonstrates that when
-someone yells a number you need to ask what the conditions where.
-
-At a \CONTEXT\ meeting there has been a presentation about possible speed|-|up of
-a run for instance by using a separate syntax checker to prevent a useless run.
-However, the use case concerned a document that took a minute on the machine
-used, while the same document took a few seconds on mine. At the same meeting we
-also did a comparison of speed for a \LATEX\ run using \PDFTEX\ and the same
-document migrated to \CONTEXT\ \MKIV\ using \LUATEX\ (Harald K\"onigs \XML\
-torture and compatibility test). Contrary to what one might expect, the
+3.2 seconds, which is the same as piping to a file. Interesting is that in Bash
+on Windows, it takes 2.8 seconds and 2.6 seconds when piped to a file. Normal
+runs are somewhat slower, but it looks like the 64 bit Linux binary is somewhat
+faster than the 64 bit mingw version. \footnote {Long ago, we found that \LUATEX\
+is very sensitive to, for instance, the \CPU\ cache, so maybe there are some
+differences due to optimization flags and|/|or the fact that bash runs in one
+thread, and all file \IO\ takes place in the main Windows instance. Who knows.}
+Anyway, it demonstrates that when someone yells a number you need to ask what the
+conditions were.
+
+At a \CONTEXT\ meeting, there was a presentation about a possible speed|-|up
+of a run by using, for instance, a separate syntax checker to prevent a
+useless run. However, the use case concerned a document that took a minute on the
+machine used, while the same document took a few seconds on mine. At the same
+meeting, we also did a comparison of speed for a \LATEX\ run using \PDFTEX\ and
+the same document migrated to \CONTEXT\ \MKIV\ using \LUATEX\ (Harald K\"onig's
+\XML\ torture and compatibility test). Contrary to what one might expect, the
\CONTEXT\ run was significantly faster; the resulting document was a few
gigabytes in size.
@@ -126,76 +128,77 @@ gigabytes in size.
I will discuss a few potential bottlenecks next. A complex integrated system like
\CONTEXT\ has lots of components and some can be quite demanding. However, when
something is not used, it has no (or hardly any) impact on performance. Even when
-we spend a lot of time in \LUA\ that is not the reason for a slow|-|down.
+we spend a lot of time in \LUA, that is not the reason for a slow|-|down.
Sometimes using \LUA\ results in a speedup, sometimes it doesn't matter. Complex
-mechanisms like natural tables for instance will not suddenly become less
+mechanisms like natural tables, for instance, will not suddenly become less
complex. So, let's focus on the \quotation {aspects} that come up in those
complaints: fonts and \LUA. Because I only use \CONTEXT\ and occasionally test
with the plain \TEX\ version that we provide, I will not explore the potential
-impact of using truckloads of packages, styles and such, which I'm sure of plays
-a role, but one neglected in the discussion.
+impact of using truckloads of packages, styles, and such, which I'm sure plays
+a role, but one neglected in my discussion.
\startsubsubject[title=Fonts]
-According to the principles of \LUATEX\ we process (\OPENTYPE) fonts using \LUA.
-That way we have complete control over any aspect of font handling, and can, as
+According to the principles of \LUATEX, we process (\OPENTYPE) fonts using \LUA.
+That way, we have complete control over any aspect of font handling, and can, as
to be expected in \TEX\ systems, provide users what they need, now and in the
-future. In fact, if we didn't had that freedom in \CONTEXT\ I'd probably already
+future. In fact, if we didn't have that freedom in \CONTEXT, I'd probably already
quit using \TEX\ a decade ago and found myself some other (programming) niche.
-After a font is loaded, part of the data gets passed to the \TEX\ engine so that
-it can do its work. For instance, in order to be able to typeset a paragraph,
-\TEX\ needs to know the dimensions of glyphs. Once a font has been loaded
-(that is, the binary blob) the next time it's fetched from a cache. Initial
-loading (and preparation) takes some time, depending on the complexity or size of
-the font. Loading from cache is close to instantaneous. After loading the
-dimensions are passed to \TEX\ but all data remains accessible for any desired
-usage. The \OPENTYPE\ feature processor for instance uses that data and \CONTEXT\
-for sure needs that data (fast accessible) for different purposes too.
-
-When a font is used in so called base mode, we let \TEX\ do the ligaturing and
+After a font has been loaded, part of the data gets passed to the \TEX\ engine,
+so that it can do its work. For instance, in order to be able to typeset a
+paragraph, \TEX\ needs to know the dimensions of glyphs. Once a font has been
+loaded (that is, the binary blob) it's fetched from a cache the next time.
+Initial loading (and preparation) takes some time, depending on the complexity
+and the size of the font. Loading from cache is close to instantaneous. After
+loading, the dimensions are passed to \TEX\ but all data remains accessible for
+any desired usage. The \OPENTYPE\ feature processor, for instance, uses that data
+and \CONTEXT, for sure, needs that data (quickly accessible) for different
+purposes, too.
+
+When a font is used in so|-|called base mode, we let \TEX\ do the ligaturing and
kerning. This is possible with simple fonts and features. If you have a critical
-workflow you might enable base mode, which can be done per font instance.
-Processing in node mode takes some time but how much depends on the font and
-script. Normally there is no difference between \CONTEXT\ and generic usage. In
-\CONTEXT\ we also have dynamic features, and the impact on performance depends on
-usage. In addition to base and node we also have plug mode but that is only used
+workflow, you might enable base mode, which can be done per font instance.
+Processing in node mode takes some time, but how much depends on the font and
+script. Normally, there is no difference between \CONTEXT\ and generic usage. In
+\CONTEXT, we also have dynamic features, and the impact on performance depends on
+usage. In addition to base and node, we also have plug mode, but that is only used
for testing and therefore not advertised.
Every \type {\hbox} and every paragraph goes through the font handler. Because
we support mixed modes, some analysis takes place, and because we do more in
-\CONTEXT, the generic analyzer is more light weight, which again can mean that a
+\CONTEXT, the generic analyzer is more lightweight, which again can mean that a
generic run is not slower than a similar \CONTEXT\ one.
Interesting is that added functionality for variable and|/|or color fonts had no
-impact on performance. Runtime added user features can have some impact but when
-defined well it can be neglected. I bet that when you add additional node list
-handling yourself, its impact on performance is larger. But in the end what
-counts is that the job gets done and the more you demand the higher the price you
-pay.
+impact on performance. Runtime|-|added user features can have some impact, but,
+when defined well, it can be neglected. I bet that when you add additional node
+list handling yourself, its impact on performance will be larger. But in the end
+what counts is that the job gets done and the more you demand the higher the
+price you pay.
\stopsubsubject
\startsubsubject[title=\LUA]
The second possible bottleneck when using \LUATEX\ can be in using \LUA\ code.
-However, using that as argument for slow runs is laughable. For instance
-\CONTEXT\ \MKIV\ can easily spend half its time in \LUA\ and that is not making
+However, using that is laughable as an argument for slow runs. For instance,
+\CONTEXT\ \MKIV\ can easily spend half its time in \LUA, and that is not making
it any slower than \MKII\ using \PDFTEX\ doing equally complex things. For
-instance the embedded \METAPOST\ library makes \MKIV\ way faster than \MKII, and
+instance, the embedded \METAPOST\ library makes \MKIV\ way faster than \MKII, and
the built|-|in \XML\ processing capabilities in \MKIV\ can easily beat \MKII\
\XML\ handling, apart from the fact that it can do more, like filtering by path
and expression. In fact, files that take, say, half a minute in \MKIV, could as
well have taken 15 minutes or more in \MKII\ (and imagine multiple runs then).
-So, for \CONTEXT\ using \LUA\ to achieve its objectives is mandate. The
+So, for \CONTEXT, using \LUA\ to achieve its objectives is mandatory. The
combination of \TEX, \METAPOST\ and \LUA\ is pretty powerful! Each of these
components is really fast. If \TEX\ is your bottleneck, review your macros! When
\LUA\ seems to be the bad, go over your code and make it better. Much of the
-\LUA\ code I see flying around doesn't look that efficient, which is okay because
+\LUA\ code I see flying around doesn't look that efficient, which is okay, because
the interpreter is really fast, but don't blame \LUA\ beforehand, blame your
coding (style) first. When \METAPOST\ is the bottleneck, well, sometimes not much
-can be done about it, but when you know that language well enough you can often
+can be done about it, but when you know that language well enough, you can often
make it perform better.
For the record: every additional mechanism that kicks in, like character spacing
@@ -210,26 +213,26 @@ gets pretty well obscured by other things happening, just that you know.
\startsection[title=Some timing]
-Next I will show some timings related to fonts. For this I use stock \LUATEX\
-(second column) as well as \LUAJITTEX\ (last column) which of course performs
-much better. The timings are given in 3 decimals but often (within a set of runs)
-and as the system load is normally consistent in a set of test runs the last two
-decimals only matter in relative comparison. So, for comparing runs over time
-round to the first decimal. Let's start with loading a bodyfont. This happens
-once per document and normally one has only one bodyfont active. Loading involves
-definitions as well as setting up math so a couple of fonts are actually loaded,
-even if they're not used later on. A setup normally involves a serif, sans, mono,
-and math setup (in \CONTEXT). \footnote {The timing for Latin Modern is so low
+Next, I will show some timings related to fonts. For this, I use stock \LUATEX\
+(second column) as well as \LUAJITTEX\ (last column), which, of course, performs
+much better. The timings are rounded to three decimal places, but, as the system
+load is usually only consistent in a set of test runs, the last two decimals only
+matter in relative comparison. So, for comparing runs over time, round to the
+first decimal. Let's start with loading a bodyfont. This happens once per
+document, and one usually only has one bodyfont active. Loading involves
+definitions as well as setting up math, so a couple of fonts are actually loaded
+even if they're not used later on. A setup normally involves a serif, sans, mono
+and math setup (in \CONTEXT). \footnote {The timing for Latin Modern is so low,
because that font is loaded already.}
\environment onandon-speed-000
\ShowSample{onandon-speed-000} % bodyfont
-There is a bit difference between the font sets but a safe average is 150 milli
-seconds and this is rather constant over runs.
+There is a bit of a difference between the font sets, but a safe average is 150
+milliseconds, and this is rather constant over runs.
-An actual font switch can result in loading a font but this is a one time overhead.
+An actual font switch can result in loading a font, but this is a one|-|time overhead.
Loading four variants (regular, bold, italic and bold italic) roughly takes the
following time:
@@ -239,34 +242,32 @@ Using them again later on takes no time:
\ShowSample{onandon-speed-002} % four variants
-Before we start timing the font handler, first a few baseline benchmarks are
-shown. When no font is applied and nothing else is done with the node list we
-get:
+Before we start timing the font handler, a few baseline benchmarks are shown.
+When no font is applied and nothing else is done with the node list, we get:
\ShowSample{onandon-speed-009}
-A simple monospaced, no features applied, run takes a bit more:
+A simple monospaced, no|-|features|-|applied, run takes a bit more:
\ShowSample{onandon-speed-010}
-Now we show a one font typesetting run. As the two benchmarks before, we just
-typeset a text in a \type {\hbox}, so no par builder interference happens. We use
-the \type {sapolsky} sample text and typeset it 100 times 4 (either of not with
-font switches).
+Now, we show a one|-|font typesetting run. As with the two benchmarks before, we
+just typeset a text in a \type {\hbox}, so no par builder interference happens.
+We use the \type {sapolsky} sample text and typeset it 100 times 4, first without
+font switches.
\ShowSample{onandon-speed-003}
-Much more runtime is needed when we typeset with four font switches. The garamond
-is most demanding. Actually we're not doing 4 fonts there because it has no bold,
-so the numbers are a bit lower than expected for this example. One reason for it
-being demanding is that it has lots of (contextual) lookups. The only comment I
-can make about that is that it also depends on the strategies of the font
-designer. Combining lookups saves space and time so complexity of a font is not
-always a good predictor for performance hits.
+Much more runtime is needed when we typeset with four font switches. Ebgaramond
+is the most demanding. Actually, we're not doing 4 fonts there because ebgaramond
+has no bold, so the numbers are a bit lower than expected for this example. One
+reason for it being demanding is that it has lots of (contextual) lookups.
+Combining lookups saves space and time, so complexity of a font is not always a
+good predictor for performance hits.
% \ShowSample{onandon-speed-004}
-If we typeset paragraphs we get this:
+If we typeset paragraphs, we get the following:
\ShowSample{onandon-speed-005}
@@ -274,11 +275,11 @@ We're talking of some 275 pages here.
\ShowSample{onandon-speed-006}
-There is of course overhead in handling paragraphs and pages:
+There is, of course, overhead in handling paragraphs and pages:
\ShowSample{onandon-speed-011}
-Before I discuss these numbers in more details two more benchmarks are
+Before I discuss these numbers in more detail, two more benchmarks are
shown. The next table concerns a paragraph with only a few (bold) words.
\ShowSample{onandon-speed-007}
@@ -290,11 +291,11 @@ typeset using \type{\type}.
When a node list (hbox or paragraph) is processed, each glyph is looked at. One
important property of \LUATEX\ (compared to \PDFTEX) is that it hyphenates the
-whole text, not only the most feasible spots. For the \type {sapolsky} snippet
-this results in 200 potential breakpoints, registered in an equal number of
-discretionary nodes. The snippet has 688 characters grouped into 125 words and
-because it's an English quote we're not hampered with composed characters or
-complex script handling. And, when we mention 100 runs then we actually mean
+whole text, not only the most feasible spots. For the \type {sapolsky} snippet,
+this results in 200 potential breakpoints registered in an equal number of
+discretionary nodes. The snippet has 688 characters grouped into 125 words and,
+because it's an English quote, we're not hampered with composed characters or
+complex script handling. And, when we mention 100 runs, then we actually mean
400 ones when font switching and bodyfonts are compared
\startnarrower
@@ -302,7 +303,7 @@ complex script handling. And, when we mention 100 runs then we actually mean
\input sapolsky \wordright{Robert M. Sapolsky}
\stopnarrower
-In order to get substitutions and positioning right we need not only to consult
+In order to get substitutions and positioning right, we need not only to consult
streams of glyphs but also combinations with preceding pre or replace, or
trailing post and replace texts. When a font has a bit more complex substitutions,
as ebgaramond has, multiple (sometimes hundreds of) passes over the list are made.
@@ -312,15 +313,15 @@ Another factor, one you could easily deduce from the benchmarks, is intermediate
font switches. Even a few such switches (in the last benchmarks) already result
in a runtime penalty. The four switch benchmarks show an impressive increase of
runtime, but it's good to know that such a situation seldom happens. It's also
-important not to confuse for instance a verbatim snippet with a bold one. The
+important not to confuse, for instance, a verbatim snippet with a bold one. The
bold one is indeed leading to a pass over the list, but verbatim is normally
-skipped because it uses a font that needs no processing. That verbatim or bold
+skipped, because it uses a font that needs no processing. That verbatim or bold
have the same penalty is mainly due to the fact that verbatim itself is costly:
the text is picked up using a different catcode regime and travels through \TEX\
and \LUA\ before it finally gets typeset. This relates to special treatments of
-spacing and syntax highlighting and such.
+spacing, syntax highlighting, and such.
-Also keep in mind that the page examples are quite unreal. We use a layout with
+Also, keep in mind that the page examples are quite unreal. We use a layout with
no margins, just text from edge to edge.
\placefigure
@@ -343,19 +344,19 @@ no margins, just text from edge to edge.
{\SampleTitle{onandon-speed-011}}
{\externalfigure[onandon-speed-011][frame=on,orientation=90,width=.45\textheight]}
-So what is a realistic example? That is hard to say. Unfortunately no one ever
-asked us to typeset novels. They are rather brain dead products for a machinery
-so they process fast. On the mentioned laptop 350 word pages in Dejavu fonts can
-be processed at a rate of 75 pages per second with \LUATEX\ and over 100 pages
-per second with \LUAJITTEX . On a more modern laptop or professional server
-performance is of course better. And for automated flows batch mode is your
-friend. The rate is not much worse for a document in a language with a bit more
-complex character handling, take accents or ligatures. Of course \PDFTEX\ is
-faster on such a dumb document but kick in some more functionality and the
-advantage quickly disappears. So, if someone complains that \LUATEX\ needs 10 or
-more seconds for a simple few page document \unknown\ you can bet that when the
-fonts are seen as reason, that the setup is pretty bad. Personally I'd not waste
-time on such a complaint.
+So, what is a realistic example? That is hard to say. Unfortunately, no one has
+ever asked us to typeset novels. They are rather brain|-|dead products for the
+machinery, so they process fast. On the mentioned laptop, 350|-|word pages in
+Dejavu fonts can be processed at a rate of 75 pages per second with \LUATEX\ and
+over 100 pages per second with \LUAJITTEX. On a more modern laptop or a
+professional server, the performance is, of course, better. And, for automated
+flows, batch mode is your friend. The rate is not much worse for a document in a
+language with a bit more complex character handling, take accents or ligatures.
+Of course, \PDFTEX\ is faster on such a dumb document, but kick in some more
+functionality, and the advantage quickly disappears. So, if someone complains
+that \LUATEX\ needs 10 or more seconds for a simple few|-|page document \unknown\
+you can bet that, when the fonts are seen as the reason, the setup is pretty bad.
+Personally, I would not waste time on such a complaint.
\stopsection
@@ -366,74 +367,75 @@ about the slowness of \LUATEX:
\startsubsubject[title={What engines do you compare?}]
-If you come from \PDFTEX\ you come from an 8~bit world: input and font handling
-are based on bytes and hyphenation is integrated into the par builder. If you use
-\UTF-8\ in \PDFTEX, the input is decoded by \TEX\ macros which carries a speed
+If you come from \PDFTEX, you come from an 8-bit world: input and font handling
+are based on bytes, and hyphenation is integrated into the par builder. If you use
+\UTF-8\ in \PDFTEX, the input is decoded by \TEX\ macros, which carries a speed
penalty. Because in the wide engines macro names can also be \UTF\ sequences,
construction of macro names is less efficient too.
-When you try to use wide fonts, again there is a penalty. Now, if you use \XETEX\
-or \LUATEX\ your input is \UTF-8 which becomes something 32 bit internally. Fonts
-are wide so more resources are needed, apart from these fonts being larger and in
-need of more processing due to feature handling. Where \XETEX\ uses a library,
-\LUATEX\ uses its own handler. Does that have a consequence for performance? Yes
-and no. First of all it depends on how much time is spent on fonts at all, but
-even then the difference is not that large. Sometimes \XETEX\ wins, sometimes
-\LUATEX. One thing is clear: \LUATEX\ is more flexible as we can roll out our own
-solutions and therefore do more advanced font magic. For \CONTEXT\ it doesn't
-matter as we use \LUATEX\ exclusively and rely on the flexible font handler, also
-for future extensions. If really needed you can kick in a library based handler
-but it's (currently) not distributed as we loose other functionality which in
-turn would result in complaints about that fact (apart from conflicting with the
-strive for independence).
-
-There is no doubt that \PDFTEX\ is faster but for \CONTEXT\ it's an obsolete
-engine. The hard coded solutions engine \XETEX\ is also not feasible for
-\CONTEXT\ either. So, in practice \CONTEXT\ users have no choice: \LUATEX\ is
-used, but users of other macro packages can use the alternatives if they are not
-satisfied with performance. The fact that \CONTEXT\ users don't complain about
-speed is a clear signal that this is no issue. And, if you want more speed you
-can use \LUAJITTEX. \footnote {In plug mode we can actually test a library and
-experiments have shown that performance on the average is much worse but it can
+When you try to use wide fonts, there is, again, a penalty. Now, if you use
+\XETEX\ or \LUATEX, your input is \UTF-8, which becomes something 32-bit
+internally. Fonts are wide, so more resources are needed, apart from these fonts
+being larger and in need of more processing due to feature handling. Where
+\XETEX\ uses a library, \LUATEX\ uses its own handler. Does that have a
+consequence for performance? Yes and no. First of all, it depends on how much
+time is spent on fonts at all, but even then, the difference is not that large.
+Sometimes \XETEX\ wins, sometimes it's \LUATEX. One thing is clear: \LUATEX\ is
+more flexible as we can roll out our own solutions and therefore do more advanced
+font magic. For \CONTEXT, it doesn't matter as we use \LUATEX\ exclusively, and
+we rely on the flexible font handler, also for future extensions. If really
+needed, you can kick in a library-based handler but it's (currently) not
+distributed as we lose other functionality, which would, in turn, result in
+complaints about that fact (apart from conflicting with the strive for
+independence).
+
+There is no doubt that \PDFTEX\ is faster, but, for \CONTEXT, it's an obsolete
+engine. The hard-coded-solutions engine \XETEX\ is not feasible for \CONTEXT\
+either. So, in practice, \CONTEXT\ users have no choice: \LUATEX\ is used, but
+users of other macro packages can use the alternatives if they are not satisfied
+with performance. The fact that \CONTEXT\ users don't complain about speed is a
+clear signal that this is a non|-|issue. And, if you want more speed, you can always
+use \LUAJITTEX. \footnote {In plug mode, we can actually test a library and
+experiments have shown that performance on the average is much worse, but it can
be a bit better for complex scripts, although a gain gets unnoticed in normal
documents. So, one can decide to use a library but at the cost of much other
-functionality that \CONTEXT\ offers, so we don't support it.} In the last section
+functionality that \CONTEXT\ offers, so we don't support it.} In the last section,
the different engines will be compared in more detail.
-Just that you know, when we do the four switches example in plain \TEX\ on my
-laptop I get a rate of 40 pages per second, and for one font 180 pages per
-second. There is of course a bit more going on in \CONTEXT\ in page building and
-so, but the difference between plain and \CONTEXT\ is not that large.
+Just so you know, when we do the four|-|switches example in plain \TEX\ on my
+laptop, I get a rate of 40 pages per second, and, for one font, 180 pages per
+second. There is, of course, a bit more going on in \CONTEXT\ in page building
+and so on, but the difference between plain and \CONTEXT\ is not that large.
\stopsubsubject
\startsubsubject[title={What macro package is used?}]
-If the answer is that when plain \TEX\ is used, a follow up question is: what
-variant? The \CONTEXT\ distribution ships with \type {luatex-plain} and that is
-our benchmark. If there really is a bottleneck it is worth exploring. But keep in
-mind that in order to be plain, not that much can be done. The \LUATEX\ part is
-just an example of an implementation. We already discussed \CONTEXT, and for
-\LATEX\ I don't want to speculate where performance hits might come from. When
-we're talking fonts, \CONTEXT\ can actually a bit slower than the generic (or
-\LATEX) variant because we can kick in more functionality. Also, when you compare
-macro packages, keep in mind that when node list processing code is added in that
-package the impact depends on interaction with other functionality and depends on
-the efficiency of the code. You can't compare mechanisms or draw general
-conclusions when you don't know what else is done!
+When plain \TEX\ is used, a follow|-|up question is: what variant? The \CONTEXT\
+distribution ships with \type {luatex-plain}, and that is our benchmark. If there
+really is a bottleneck, it is worth exploring, but keep in mind that, in order to
+be plain, not that much can be done. The \LUATEX\ part is just an example of an
+implementation. We already discussed \CONTEXT, and for \LATEX, I don't want to
+speculate where performance hits might come from. When we're talking fonts,
+\CONTEXT\ can actually be a bit slower than the generic (or \LATEX) variant, because
+we can kick in more functionality. Also, when you compare macro packages, keep in
+mind that, when node list processing code is added in that package, the impact
+depends on interaction with other functionality and depends on the efficiency of
+the code. You can't compare mechanisms or draw general conclusions when you don't
+know what else is done!
\stopsubsubject
\startsubsubject[title={What do you load?}]
-Most \CONTEXT\ modules are small and load fast. Of course there can be exceptions
-when we rely on third party code; for instance loading tikz takes a a bit of
-time. It makes no sense to look for ways to speed that system up because it is
-maintained elsewhere. There can probably be gained a bit but again, no user
-complained so far.
+Most \CONTEXT\ modules are small and load fast. Of course, there can be exceptions
+when we rely on third party code; for instance, loading tikz takes a bit of
+time. It makes no sense to look for ways to speed that system up, because it is
+maintained elsewhere. A bit can probably be gained, but, again, no user
+has complained so far.
-If \CONTEXT\ is not used, one probably also uses a large \TEX\ installations.
-File lookup in \CONTEXT\ is done differently and can can be faster. Even loading
+If \CONTEXT\ is not used, one probably also uses a large \TEX\ installation.
+File lookup in \CONTEXT\ is done differently, and can be faster. Even loading
can be more efficient in \CONTEXT, but it's hard to generalize that conclusion.
If one complains about loading fonts being an issue, just try to measure how much
time is spent on loading other code.
@@ -443,36 +445,36 @@ time is spent on loading other code.
\startsubsubject[title={Did you patch macros?}]
Not everyone is a \TEX pert. So, coming up with macros that are expanded many
-times and|/|or have inefficient user interfacing can have some impact. If someone
-complains about one subsystem being slow, then honestly demands to complain about
+times and|/|or have inefficient user interfacing can have some impact. If someone
+complains about one subsystem being slow, then honesty demands complaining about
other subsystems as well. You get what you ask for.
\stopsubsubject
\startsubsubject[title={How efficient is the code that you use?}]
-Writing super efficient code only makes sense when it's used frequently. In
-\CONTEXT\ most code is reasonable efficient. It can be that in one document fonts
-are responsible for most runtime, but in another document table construction can
+Writing super|-|efficient code only makes sense when it's used frequently. In
+\CONTEXT, most code is reasonably efficient. It can be that in one document fonts
+are responsible for most runtime, but in another document, table construction can
be more demanding while yet another document puts some stress on interactive
-features. When hz or protrusion is enabled then you run substantially slower
-anyway so when you are willing to sacrifice 10\% or more runtime don't complain
-about other components. The same is true for enabling \SYNCTEX: if you are
-willing to add more than 10\% runtime for that, don't wither about the same
-amount for font handling. \footnote {In \CONTEXT\ we use a \SYNCTEX\ alternative
-that is somewhat faster but it remains a fact that enabling more and more
-functionality will make the penalty of for instance font processing relatively
-small.}
+features. When hz or protrusion is enabled, then you run substantially slower
+anyway, so when you are willing to sacrifice 10\% or more of runtime, don't
+complain about other components. The same is true for enabling \SYNCTEX: if you
+are willing to add more than 10\% of runtime for that, don't whine about the
+same amount for font handling. \footnote {In \CONTEXT, we use a \SYNCTEX\
+alternative that is somewhat faster, but it remains a fact that enabling more and
+more functionality will make the penalty of, for instance, font processing
+relatively small.}
\stopsubsubject
\startsubsubject[title={How efficient is the styling that you use?}]
-Probably the most easily overseen optimization is in switching fonts and color.
-Although in \CONTEXT\ font switching is fast, I have no clue about it in other
-macro packages. But in a style you can decide to use inefficient (massive) font
+Probably the most easily overlooked optimization is in switching fonts and colors.
+Although in \CONTEXT, font switching is fast, I have no clue about it in other
+macro packages. But in a style, you can decide to use inefficient (massive) font
switches. The effects can easily be tested by commenting bit and pieces. For
-instance sometimes you need to do a full bodyfont switch when changing a style,
+instance, sometimes you need to do a full bodyfont switch when changing a style,
like assigning \type {\small\bf} to the \type {style} key in \type {\setuphead},
but often using e.g.\ \type {\tfd} is much more efficient and works quite as
well. Just try it.
@@ -481,24 +483,24 @@ well. Just try it.
\startsubsubject[title={Are fonts really the bottleneck?}]
-We already mentioned that one can look in the wrong direction. Maybe once someone
+We already mentioned that one can look in the wrong direction. Maybe, once someone
is convinced that fonts are the culprit, it gets hard to look at the real issue.
-If a similar job in different macro packages has a significant different runtime
+If a similar job in different macro packages has a significantly different runtime,
one can wonder what happens indeed.
It is good to keep in mind that the amount of text is often not as large as you
-think. It's easy to do a test with hundreds of paragraphs of text but in practice
+think. It's easy to do a test with hundreds of paragraphs of text, but, in practice,
we have whitespace, section titles, half empty pages, floats, itemize and similar
-constructs, etc. Often we don't mix many fonts in the running text either. So, in
-the end a real document is the best test.
+constructs, etc. Often, we don't mix many fonts in the running text either. So,
+in the end, a real document is your best test.
\stopsubsubject
\startsubsubject[title={If you use \LUA, is that code any good?}]
You can gain from the faster virtual machine of \LUAJITTEX. Don't expect wonders
-from the jitting as that only pays of for long runs with the same code used over
-and over again. If the gain is high you can even wonder how well written your
+from the jitting as that only pays off in long runs with the same code used over
+and over again. If the gain is high, you can even wonder how well-written your
\LUA\ code is anyway.
\stopsubsubject
@@ -506,18 +508,18 @@ and over again. If the gain is high you can even wonder how well written your
\startsubsubject[title={What if they don't believe you?}]
So, say that someone finds \LUATEX\ slow, what can be done about it? Just advice
-him or her to stick to tool used previously. Then, if arguments come that one
+them to stick to their previously|-|used tool. Then, if arguments come that one
also wants to use \UTF-8, \OPENTYPE\ fonts, a bit of \METAPOST, and is looking
forward to using \LUA\ runtime, the only answer is: take it or leave it. You pay
-a price for progress, but if you do your job well, the price is not that large.
-Tell them to spend time on learning and maybe adapting and bark against their own
+a price for progress, but, if you do your job well, the price is not that high.
+Tell them to spend time on learning and maybe adapting and to bark against their own
tree before barking against those who took that step a decade ago. Most \CONTEXT\
users took that step and someone still using \LUATEX\ after a decade can't be
that stupid. It's always best to first wonder what one actually asks from \LUATEX,
and if the benefit of having \LUA\ on board has an advantage. If not, one can
just use another engine.
-Also think of this. When a job is slow, for me it's no problem to identify where
+Also think of this: when a job is slow, for me it's no problem to identify where
the problem is. The question then is: can something be done about it? Well, I
happily keep the answer for myself. After all, some people always need room to
complain, maybe if only to hide their ignorance or incompetence. Who knows.
@@ -529,13 +531,13 @@ complain, maybe if only to hide their ignorance or incompetence. Who knows.
\startsection[title={Comparing engines}]
The next comparison is to be taken with a grain of salt and concerns the state of
-affairs mid 2017. First of all, you cannot really compare \MKII\ with \MKIV: the
-later has more functionality (or a more advanced implementation of
-functionality). And as mentioned you can also not really compare \PDFTEX\ and the
-wide engines. Anyway, here are some (useless) tests. First a bunch of loads. Keep
+affairs mid-2017. First of all, you cannot really compare \MKII\ with \MKIV: the
+latter has more functionality (or a more advanced implementation of
+functionality). And, as mentioned, you can also not really compare \PDFTEX\ and the
+wide engines. Anyway, here are some (useless) tests. First, a bunch of loads. Keep
in mind that different engines also deal differently with reading files. For
-instance \MKIV\ uses \LUATEX\ callbacks to normalize the input and has its own
-readers. There is a bit more overhead in starting up a \LUATEX\ run and some
+instance, \MKIV\ uses \LUATEX\ callbacks to normalize the input and has its own
+readers. There is a bit more overhead in starting up a \LUATEX\ run, and some
functionality is enabled that is not present in \MKII. The format is also larger,
if only because we preload a lot of useful font, character and script related
data.
@@ -549,7 +551,7 @@ data.
\stoptext
\stoptyping
-When looking at the numbers one should realize that the times include startup and
+When looking at the numbers, one should realize that the times include startup and
job management by the runner scripts. We also run in batchmode to avoid logging
to influence runtime. The average is calculated from 5 runs.
@@ -593,8 +595,7 @@ The second example does a few switches in a paragraph:
\HL
\stoptabulate
-The third examples does a few more, resulting in multiple subranges
-per style:
+The third example does more, resulting in multiple subranges per style:
\starttyping
\starttext
@@ -623,7 +624,7 @@ per style:
The last example adds some color. Enabling more functionality can have an impact
on performance. In fact, as \MKIV\ uses a lot of \LUA\ and is also more advanced
-that \MKII, one can expect a performance hit but in practice the opposite
+than \MKII, one can expect a performance hit, but, in practice, the opposite
happens, which can also be due to some fundamental differences deep down at the
macro level.
@@ -654,20 +655,20 @@ macro level.
\HL
\stoptabulate
-In these measurements the accuracy is a few decimals but a pattern is visible. As
-expected \PDFTEX\ wins on simple documents but starts loosing when things get
-more complex. For these tests I used 64 bit binaries. A 32 bit \XETEX\ with
-\MKII\ performs the same as \LUAJITTEX\ with \MKIV, but a 64 bit \XETEX\ is
-actually quite a bit slower. In that case the mingw cross compiled \LUATEX\
-version does pretty well. A 64 bit \PDFTEX\ is also slower (it looks) that a 32
-bit version. So in the end, there are more factors that play a role. Choosing
-between \LUATEX\ and \LUAJITTEX\ depends on how well the memory limited
+In these measurements, the accuracy is a few decimals, but a pattern is visible.
+As expected, \PDFTEX\ wins on simple documents but starts losing when things get
+more complex. For these tests, I used 64-bit binaries. A 32-bit \XETEX\ with
+\MKII\ performs the same as \LUAJITTEX\ with \MKIV, but a 64-bit \XETEX\ is
+actually quite a bit slower. In that case, the mingw cross|-|compiled \LUATEX\
+version does pretty well. A 64-bit \PDFTEX\ is also slower (it looks) than a
+32-bit version. So, in the end, there are more factors that play a role. Choosing
+between \LUATEX\ and \LUAJITTEX\ depends on how well the memory|-|limited
\LUAJITTEX\ variant can handle your documents and fonts.
Because in most of our recent styles we use \OPENTYPE\ fonts and (structural)
-features as well as recent \METAFUN\ extensions only present in \MKIV\ we cannot
+features as well as recent \METAFUN\ extensions only present in \MKIV, we cannot
compare engines using such documents. The mentioned performance of \LUATEX\ (or
-\LUAJITTEX) and \MKIV\ on the \METAFUN\ manual illustrate that in most cases this
+\LUAJITTEX) and \MKIV\ on the \METAFUN\ manual illustrates that, in most cases, this
combination is a clear winner.
\starttyping
@@ -703,8 +704,8 @@ That leaves the zero run:
\stoptext
\stoptyping
-This gives the following numbers. In longer runs the difference in overhead is
-neglectable.
+This gives the following numbers. In longer runs, the difference in overhead is
+negligible.
% sample 6, number of runs: 5
@@ -719,31 +720,31 @@ neglectable.
\HL
\stoptabulate
-It will be clear that when we use different fonts the numbers will also be
-different. And if you use a lot of runtime \METAPOST\ graphics (for instance for
-backgrounds), the \MKIV\ runs end up at the top. And when we process \XML\ it
+It will be clear that when we use different fonts, the numbers will also be
+different. And, if you use a lot of runtime \METAPOST\ graphics (for instance for
+backgrounds), the \MKIV\ runs end up at the top. And, when we process \XML, it
will be clear that going back to \MKII\ is no longer a realistic option. It must
-be noted that I occasionally manage to improve performance but we've now reached
+be noted that I occasionally manage to improve performance, but we've now reached
a state where there is not that much to gain. Some functionality is hard to
-compare. For instance in \CONTEXT\ we don't use much of the \PDF\ backend
-features because we implement them all in \LUA. In fact, even in \MKII\ already a
-done in \TEX, so in the end the speed difference there is not large and often in
+compare. For instance, in \CONTEXT, we don't use much of the \PDF\ backend
+features because we implement them all in \LUA. In fact, even in \MKII, a lot is already
+done in \TEX, so in the end, the speed difference there is not large and often in
favour of \MKIV.
-For the record I mention that shipping out the about 1250 pages has some overhead
-too: about 2 seconds. Here \LUAJITTEX\ is 20\% more efficient which is an
+For the record, I mention that shipping out the roughly 1250 pages has some overhead
+too: about 2 seconds. Here, \LUAJITTEX\ is 20\% more efficient, which is an
indication of quite some \LUA\ involvement. Loading the input files has an
-overhead of about half a second. Starting up \LUATEX\ takes more time that
+overhead of about half a second. Starting up \LUATEX\ takes more time than
\PDFTEX\ and \XETEX, but that disadvantage disappears with more pages. So, in the
-end there are quite some factors that blur the measurements. In practice what
-matters is convenience: does the runtime feel reasonable and in most cases it
+end, there are quite some factors that blur the measurements. In practice, what
+matters is convenience: does the runtime feel reasonable? In most cases, it
does.
-If I would replace my laptop with a reasonable comparable alternative that one
+If I were to replace my laptop with a reasonably comparable alternative, that one
would be some 35\% faster (single threads on processors don't gain much per year).
-I guess that this is about the same increase in performance that \CONTEXT\
-\MKIV\ got in that period. I don't expect such a gain in the coming years so
-at some point we're stuck with what we have.
+I guess that this is about the same increase in performance that \CONTEXT\
+\MKIV\ got in that period. I don't expect such a gain in the upcoming years, so,
+at some point, we're stuck with what we have.
\stopsection
@@ -754,29 +755,29 @@ go back in time to when the first wide engines showed up, \OMEGA\ was considered
to be slow, although I never tested that myself. Then, when \XETEX\ showed up,
there was not much talk about speed, just about the fact that we could use
\OPENTYPE\ fonts and native \UTF\ input. If you look at the numbers, for sure you
-can say that it was much slower than \PDFTEX. So how come that some people
+can say that it was much slower than \PDFTEX. So, how come that some people
complain about \LUATEX\ being so slow, especially when we take into account that
-it's not that much slower than \XETEX, and that \LUAJITTEX\ is often faster that
-\XETEX. Also, computers have become faster. With the wide engines you get more
+it's not that much slower than \XETEX, and that \LUAJITTEX\ is often faster than
+\XETEX. Also, computers have become faster. With the wide engines, you get more
functionality and that comes at a price. This was accepted for \XETEX\ and is
also acceptable for \LUATEX. But the price is nto that high if you take into
account that hardware performs better: you just need to compare \LUATEX\ (and
\XETEX) runtime with \PDFTEX\ runtime 15 years ago.
As a comparison, look at games and video. Resolution became much higher as did
-color depth. Higher frame rates were in demand. Therefore the hardware had to
-become faster and it did, and as a result the user experience kept up. No user
+color depth. Higher frame rates were in demand. Therefore, the hardware had to
+become faster, and it did, and, as a result, the user experience kept up. No user
will say that a modern game is slower than an old one, because the old one does
500 frames per second compared to some 50 for the new game on the modern
hardware. In a similar fashion, the demands for typesetting became higher:
\UNICODE, \OPENTYPE, graphics, \XML, advanced \PDF, more complex (niche)
typesetting, etc. This happened more or less in parallel with computers becoming
more powerful. So, as with games, the user experience didn't degrade with
-demands. Comparing \LUATEX\ with \PDFTEX\ is like comparing a low res, low frame
-rate, low color game with a modern one. You need to have up to date hardware and
-even then, the writer of such programs need to make sure it runs efficient,
-simply because hardware no longer scales like it did decades ago. You need to
-look at the larger picture.
+demands. Comparing \LUATEX\ with \PDFTEX\ is like comparing a low|-|res,
+low|-|framerate, low|-|color game with a modern one. You need to have
+up|-|to|-|date hardware and even then, the writer of such programs needs to make
+sure that they run efficiently, simply because hardware no longer scales like it
+did decades ago. You need to look at the bigger picture.
\stopsection