Diffstat (limited to 'doc/context/sources/general/manuals/mk/mk-last.tex')
-rw-r--r-- doc/context/sources/general/manuals/mk/mk-last.tex | 404
1 files changed, 404 insertions, 0 deletions
diff --git a/doc/context/sources/general/manuals/mk/mk-last.tex b/doc/context/sources/general/manuals/mk/mk-last.tex
new file mode 100644
index 000000000..b2d3dc519
--- /dev/null
+++ b/doc/context/sources/general/manuals/mk/mk-last.tex
@@ -0,0 +1,404 @@
+% language=uk
+
+\startcomponent mk-arabic
+
+\environment mk-environment
+
+\chapter{Where do we stand}
+
+In the previous chapter we discussed the state of \LUATEX\ in the
+beginning of 2009, the prelude to version 0.50. We consider the
+release of the 0.50 version to be really important, both for
+\LUATEX\ and for \MKIV, so here I will reflect on the state
+around this release. I will do this from the perspective of
+processing documents because usability is an important measure.
+
+There are several reasons why \LUATEX\ 0.50 is an important release,
+both for \LUATEX\ and for \MKIV. Let's start with \LUATEX.
+
+\startitemize
+
+\startitem Apart from a couple of bug fixes, the current version
+is pretty usable and stable. Details of what we've reached so far
+have been presented previously. \stopitem
+
+\startitem The code base has been converted from \PASCAL\ to
+\CCODE, and as a result the source tree has become simpler (being
+\CWEB\ compliant happens around 0.60). This transition also opens
+up the possibility to start looking into some of the more tricky
+internals, like page building. \stopitem
+
+\startitem Most of the front end has been opened up and the new
+backend code is getting into shape. As the backend was partly already done in
+\CCODE\ the moment has come to do a real cleanup. Keep in mind that
+we started with \PDFTEX\ and that much of its extra functionality is
+rather interwoven with traditional \TEX\ code. \stopitem
+
+\stopitemize
+
+If we look at \CONTEXT, we've also reached a crucial point in the
+upgrade.
+
+\startitemize
+
+\startitem The code base is now divided into \MKII\ and \MKIV. This
+permits us not only to reimplement bits and pieces (something that
+was already in progress) but also to clean up the code (only
+\MKIV). \stopitem
+
+\startitem If you kept up with the development you already know
+the kind of tasks we can (and do) delegate to \LUA. Just to
+mention a few: file handling, font loading and \OPENTYPE\
+processing, casing and some spacing issues, everything related to
+graphics and \METAPOST, language support, color and other
+attributes, input regimes, \XML, multi|-|pass data, etc. \stopitem
+
+\startitem Recently all backend related code was moved to
+\LUA\ and the code dealing with hyperlinks, widgets and the like is
+now mostly moved away from \TEX. The related cleanup was possible
+because we no longer have to deal with a mix of \DVI\ drivers too.
+\stopitem
+
+\startitem Everything related to structure (which includes
+numbering and multi-pass data like tables of contents and
+registers) is now delegated to \LUA. We move around way more
+information and will extend these mechanisms in the near future.
+\stopitem
+
+\stopitemize
+
+Tracing on Taco's machine has shown that when processing the
+\LUATEX\ reference manual the engine spends about 10\%
+of the time on getting tokens, 15\% on macro expansion, and some
+50\% on \LUA\ (callback interfacing included). Especially the time
+spent by \LUA\ differs per document and garbage collection seems
+to be a bottleneck here. So, let's wrap up how \LUATEX\ performs
+around the time of 0.50.
+
+We use three documents for testing (intermediate) \LUATEX\
+binaries: the reference manual, the history document \quote{mk},
+and the revised metafun manual. The reference manual has a
+\METAPOST\ graphic on each page which is positioned using the
+\CONTEXT\ background layering mechanism. This mechanism is active
+only when backgrounds are defined and has some performance
+consequences for the page builder. However, most time is spent on
+constructing the tables (tabulate) and because these can contain
+paragraphs that can run over multiple pages, constructing a table
+takes a few analysis passes per table plus some so-called
+vsplitting. We load some fonts (including narrow variants) but for
+the rest this document is not that complex. Of course colors are
+used as well as hyperlinks.
+
+The report at the end of the runs looks as follows:
+
+\start \switchtobodyfont[small]
+\starttyping
+input load time - 0.109 seconds
+stored bytecode data - 184 modules, 45 tables, 229 chunks
+node list callback tasks - 4 unique tasks, 4 created, 20980 calls
+cleaned up reserved nodes - 29 nodes, 10 lists of 1427
+node memory usage - 19 glue_spec, 2 dir
+h-node processing time - 0.312 seconds including kernel
+attribute processing time - 1.154 seconds
+used backend - pdf (backend for directly generating pdf output)
+loaded patterns - en:us:pat:exc:2
+jobdata time - 0.078 seconds saving, 0.047 seconds loading
+callbacks - direct: 86692, indirect: 13364, total: 100056
+interactive elements - 178 references, 356 destinations
+v-node processing time - 0.062 seconds
+loaded fonts - 43 files: ....
+fonts load time - 1.030 seconds
+metapost processing time - 0.281 seconds, loading: 0.016 seconds,
+ execution: 0.156 seconds, n: 161
+result saved in file - luatexref-t.pdf
+luatex banner - this is luatex, version beta-0.42.0
+control sequences - 31880 of 147189
+current memory usage - 106 MB (ctx: 108 MB)
+runtime - 12.433 seconds, 164 processed pages,
+ 164 shipped pages, 13.191 pages/second
+\stoptyping
+\stop
+
+The runtime is influenced by the fact that some startup time and
+font loading takes place. The more pages your document has, the
+less the runtime is influenced by this.
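+
+The pages per second figure in the last line of such a report is
+consistent with the number of shipped pages divided by the total
+runtime, so startup and font loading are included in it. For the
+reference manual:
+
+\startformula
+164 / 12.433 \approx 13.19
+\stopformula
+
+As a rough illustration only, and assuming that the reported font
+load time can simply be subtracted from the runtime, leaving out
+the 1.030 seconds of font loading would give:
+
+\startformula
+164 / (12.433 - 1.030) \approx 14.38
+\stopformula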
+
+More demanding is the \quote {mk} document (figure~\ref{fig.mk}). Here
+we have many fonts, including some really huge \CJK\ and Arabic ones (and these are
+loaded at several sizes and with different features). The reported
+font load time is large but this is partly due to the fact that on
+my machine for some reason passing the tables to \TEX\ involved a
+lot of pagefaults (we think that the cpu cache is the culprit).
+Older versions of \LUATEX\ didn't have that performance penalty,
+so probably half of the reported font loading time is kind of
+wasted.
+
+The hnode processing time refers mostly to \OPENTYPE\ font
+processing and attribute processing time has to do with backend
+issues (like injecting color directives). The more features you
+enable, the larger these numbers get. The \METAPOST\ font loading
+refers to the punk font instances.
+
+\start \switchtobodyfont[small]
+\starttyping
+input load time - 0.125 seconds
+stored bytecode data - 184 modules, 45 tables, 229 chunks
+node list callback tasks - 4 unique tasks, 4 created, 24295 calls
+cleaned up reserved nodes - 116 nodes, 29 lists of 1411
+node memory usage - 21 attribute, 23 glue_spec, 7 attribute_list,
+ 7 local_par, 2 dir
+h-node processing time - 1.763 seconds including kernel
+attribute processing time - 2.231 seconds
+used backend - pdf (backend for directly generating pdf output)
+loaded patterns - en:us:pat:exc:2 en-gb:gb:pat:exc:3 nl:nl:pat:exc:4
+language load time - 0.094 seconds, n=4
+jobdata time - 0.062 seconds saving, 0.031 seconds loading
+callbacks - direct: 98199, indirect: 20257, total: 118456
+xml load time - 0.000 seconds, lpath calls: 46, cached calls: 31
+v-node processing time - 0.234 seconds
+loaded fonts - 69 files: ....
+fonts load time - 28.205 seconds
+metapost processing time - 0.421 seconds, loading: 0.016 seconds,
+ execution: 0.203 seconds, n: 65
+graphics processing time - 0.125 seconds including tex, n=7
+result saved in file - mk.pdf
+metapost font generation - 0 glyphs, 0.000 seconds runtime
+metapost font loading - 0.187 seconds, 40 instances,
+ 213.904 instances/second
+luatex banner - this is luatex, version beta-0.42.0
+control sequences - 34449 of 147189
+current memory usage - 454 MB (ctx: 465 MB)
+runtime - 50.326 seconds, 316 processed pages,
+ 316 shipped pages, 6.279 pages/second
+\stoptyping
+\stop
+
+Looking at the Metafun manual one might expect that one needs
+even more time per page but this is not true. We use \OPENTYPE\
+fonts in base mode as we don't use fancy font features (base mode
+uses traditional \TEX\ methods). Most interesting here is the time
+involved in processing \METAPOST\ graphics. There are a lot of
+them (1772) and in addition we have 7 calls to independent
+\CONTEXT\ runs that take one third of the total runtime. About
+half of the runtime involves graphics.
+
+\start \switchtobodyfont[small]
+\starttyping
+input load time - 0.109 seconds
+stored bytecode data - 184 modules, 45 tables, 229 chunks
+node list callback tasks - 4 unique tasks, 4 created, 33510 calls
+cleaned up reserved nodes - 39 nodes, 93 lists of 1432
+node memory usage - 249 attribute, 19 glue_spec, 82 attribute_list,
+ 85 local_par, 2 dir
+h-node processing time - 0.562 seconds including kernel
+attribute processing time - 2.512 seconds
+used backend - pdf (backend for directly generating pdf output)
+loaded patterns - en:us:pat:exc:2
+jobdata time - 0.094 seconds saving, 0.031 seconds loading
+callbacks - direct: 143950, indirect: 28492, total: 172442
+interactive elements - 214 references, 371 destinations
+v-node processing time - 0.250 seconds
+loaded fonts - 45 files: l.....
+fonts load time - 1.794 seconds
+metapost processing time - 5.585 seconds, loading: 0.047 seconds,
+ execution: 2.371 seconds, n: 1772,
+ external: 15.475 seconds (7 calls)
+mps conversion time - 0.000 seconds, 1 conversions
+graphics processing time - 0.499 seconds including tex, n=74
+result saved in file - metafun.pdf
+luatex banner - this is luatex, version beta-0.42.0
+control sequences - 32587 of 147189
+current memory usage - 113 MB (ctx: 115 MB)
+runtime - 43.368 seconds, 362 processed pages,
+ 362 shipped pages, 8.347 pages/second
+\stoptyping
+\stop
+
+By now it will be clear that processing a document takes a bit of
+time. However, keep in mind that these documents are a bit
+atypical. Although \unknown\ the average \CONTEXT\ document
+probably uses color (including color spaces that involve resource
+management), and has multiple layers, which involves some testing of
+the roughly 30 areas that make up the page. And there is the
+user interface that comes with a price.
+
+It might be good to say a bit more about fonts. In \CONTEXT\ we
+use symbolic names and often a chain of them, so the abstract
+\type {SerifBold} resolves to \type {MyNiceFontSerif-Bold} which
+in turn resolves to \type {mnfs-bold.otf}. As \XETEX\ introduced
+lookup by internal (or system) fontname instead of filename,
+\MKII\ also provides that method but \MKIV\ adds some heuristics
+to it. Users can specify font sizes in traditional \TEX\ units but
+also relative to the body font. All this involves a bit of
+expansion (resolving the chain) and parsing (of the
+specification). At each of the levels of name abstraction we can
+have associated parameters, like features, fallbacks and more.
+Although these mechanisms are quite optimized this still comes at a
+performance price.
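+
+A minimal sketch of such a chain could look as follows. The font
+names are the made|-|up ones used above; the \type {default}
+feature set and the instance names \type {MyBold} and \type
+{MyBoldBig} are assumptions for the sake of the example, not
+definitions taken from the distribution.
+
+\starttyping
+% symbolic name -> font name -> file, with features at the last level
+\definefontsynonym [SerifBold] [MyNiceFontSerif-Bold]
+\definefontsynonym [MyNiceFontSerif-Bold] [file:mnfs-bold.otf] [features=default]
+
+% a size in traditional units and a size relative to the body font
+\definefont [MyBold] [SerifBold at 12pt]
+\definefont [MyBoldBig] [SerifBold sa 1.2]
+\stoptyping
+
+Each use of \type {SerifBold} then involves resolving both
+synonyms and parsing the size specification.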
+
+Also, in the default \MKIV\ font setup we use a couple more
+font variants (as they are available in Latin Modern). We've kept
+definitions sort of dynamic so you can change them and combine
+them in many ways. Definitions are collected in typescripts which
+are filtered. We support multiple mixed font sets which takes a bit
+of time to define but switching is generally fast. Compared to \MKII\
+the model lacks the (font) encoding and case handling code (here
+we gain speed) but it now offers fallback fonts (replaced ranges
+within fonts) and dynamic \OPENTYPE\ font feature switching. When
+used we might lose a bit of processing speed although fewer
+definitions are needed which gets us some back. The font subsystem
+is anyway a factor in the performance, if only because more
+complex scripts or font features demand extensive node list
+parsing.
+
+Processing the \TEX book on Taco's machine takes some
+3.5 seconds in \PDFTEX\ and 5.5 seconds in \LUATEX. This is
+because \LUATEX\ internally is \UNICODE\ and has a larger memory
+space. The few seconds more runtime are consistent with this. One
+of the reasons that the \TEX book processes fast is that the font
+system is not that complex and has hardly any overhead, and an
+efficient output routine is used. The format file is small and the
+macro set is optimal for the task. The coding is rather low level
+so to say (no layers of interfacing). Anyway, 100 pages per second
+is not bad at all and we don't come close with \CONTEXT\ and the
+kind of documents that we produce there.
+
+This made me curious as to how fast really dumb documents could be
+processed. It does not make sense to compare plain \TEX\ and
+\CONTEXT\ because they do different things. Instead I decided to
+look at differences in engines and compare runs with different
+numbers of pages. That way we get an idea of how startup time
+influences overall performance. We look at \PDFTEX, which is
+basically an 8-bit system, \XETEX, which uses external libraries and is
+\UNICODE, and \LUATEX, which is also \UNICODE\ and stays closer to
+traditional \TEX\ but has to check for callbacks.
+
+In our measurement we use a really simple test document as we only
+want to see how the baseline performs. As not much content is
+processed, we focus on loading (startup), the output routine and
+page building, and some basic \PDF\ generation. After all, it's
+often a quick and dirty test that gives users their first
+impression. When looking at the times you need to keep in mind
+that \XETEX\ pipes to \DVIPDFMX\ and can benefit from multiple
+cpu cores. All systems have different memory management and garbage
+collection might influence performance (as demonstrated in an
+earlier chapter of the \quote{mk} document we can trace in detail
+how the runtime is distributed). As terminal output is a significant
+slowdown for \TEX\ we run in batchmode. The test is as follows:
+
+\starttyping
+\starttext
+ \dorecurse{2000}{test\page}
+\stoptext
+\stoptyping
+
+On my laptop (Dell M90 with 2.3GHz T7600 Core 2 and 4GB memory
+running Vista) I get the following results. The test script ran
+each test set 5~times and we show the fastest run so we kind of
+avoid interference with other processes that take time. In
+practice runtime differs quite a bit for similar runs, depending
+on the system load. The time is in seconds and between parentheses
+the number of pages per second is given.
+
+% \starttabulate[||||||]
+% \NC \bf engine \NC 30 \NC 300 \NC 2000 \NC 10000 \NC \NR
+% \HL
+% \NC \bf xetex \NC 1.84 (16) 1.81 (16) \NC 2.51 (119) 2.45 (122) \NC 7.38 (270) 6.97 (286) \NC 38.53 (259) 29.20 (342) \NC \NR
+% \NC \bf pdftex \NC 1.32 (22) 1.28 (23) \NC 2.16 (138) 2.07 (144) \NC 7.34 (272) 6.96 (287) \NC 43.73 (228) 30.94 (323) \NC \NR
+% \NC \bf luatex \NC 1.53 (19) 1.48 (20) \NC 2.41 (124) 2.36 (127) \NC 8.16 (245) 7.85 (254) \NC 44.67 (223) 34.34 (291) \NC \NR
+% \stoptabulate
+
+\starttabulate[||||||]
+\NC \bf engine \NC 30 \NC 300 \NC 2000 \NC 10000 \NC \NR
+\HL
+\NC \bf xetex \NC 1.81 (16) \NC 2.45 (122) \NC 6.97 (286) \NC 29.20 (342) \NC \NR
+\NC \bf pdftex \NC 1.28 (23) \NC 2.07 (144) \NC 6.96 (287) \NC 30.94 (323) \NC \NR
+\NC \bf luatex \NC 1.48 (20) \NC 2.36 (127) \NC 7.85 (254) \NC 34.34 (291) \NC \NR
+\stoptabulate
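+
+To get a feeling for how much of this is startup, here is a rough
+estimate for the \LUATEX\ row. It assumes, and this is only an
+assumption, that runtime grows linearly with the number of pages
+between the 30 and 300 page runs. The first number is the time per
+page in seconds, the second the remaining startup time:
+
+\startformula
+\frac{2.36 - 1.48}{300 - 30} \approx 0.0033
+\qquad
+1.48 - 30 \times 0.0033 \approx 1.38
+\stopformula
+
+Under that assumption nearly all of the 30 page run is startup,
+while for the 10000 page run the same 1.38 seconds is only about
+4\% of the 34.34 seconds measured.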
+
+The next table shows the same test but this time on a 2.5GHz E5420
+quad core server with 16GB memory running Linux, but with 6
+virtual machines idling in the background. All binaries are 64 bit.
+
+% \starttabulate[||||||]
+% \NC \bf engine \NC 30 \NC 300 \NC 2000 \NC 10000 \NC \NR
+% \HL
+% \NC \bf xetex \NC 0.94 (31) 0.92 (32) \NC 2.00 (150) 1.89 (158) \NC 9.02 (221) 8.74 (228) \NC 42.41 (235) 42.19 (237) \NC \NR
+% \NC \bf pdftex \NC 0.51 (58) 0.49 (61) \NC 1.19 (251) 1.14 (262) \NC 5.34 (374) 5.23 (382) \NC 25.16 (397) 24.66 (405) \NC \NR
+% \NC \bf luatex \NC 1.09 (27) 1.07 (27) \NC 2.06 (145) 1.99 (150) \NC 8.72 (229) 8.32 (240) \NC 40.10 (249) 38.22 (261) \NC \NR
+% \stoptabulate
+
+\starttabulate[||||||]
+\NC \bf engine \NC 30 \NC 300 \NC 2000 \NC 10000 \NC \NR
+\HL
+\NC \bf xetex \NC 0.92 (32) \NC 1.89 (158) \NC 8.74 (228) \NC 42.19 (237) \NC \NR
+\NC \bf pdftex \NC 0.49 (61) \NC 1.14 (262) \NC 5.23 (382) \NC 24.66 (405) \NC \NR
+\NC \bf luatex \NC 1.07 (27) \NC 1.99 (150) \NC 8.32 (240) \NC 38.22 (261) \NC \NR
+\stoptabulate
+
+A test demonstrated that for \LUATEX\ the 30 and 300 page runs
+take 70\% more runtime with 32 bit binaries (recent binaries for
+these engines are available on the \CONTEXT\ wiki \type
+{contextgarden.net}).
+
+When you compare both tables it will be clear that it is
+non|-|trivial to come to conclusions about performance. But one thing
+is clear: \LUATEX\ with \CONTEXT\ \MKIV\ is not performing that
+badly compared to its cousins. The \UNICODE\ engines perform about
+the same and \PDFTEX\ beats them significantly. Okay, I have to
+admit that in the meantime some cleanup of code in \MKIV\ has
+happened and the \LUATEX\ runs benefit from this, but on the other
+hand, the other engines are not hindered by callbacks. As I expect
+to use \MKII\ less frequently optimizing the older code makes no
+sense.
+
+There is not much chance of \LUATEX\ itself becoming faster,
+although a few days before writing this Taco managed to speed up
+font inclusion in the backend code significantly (we're talking
+about half a second to a second for the three documents used
+here). On the contrary, when we open up more mechanisms and have
+upgraded backend code it might actually be a bit slower. On the
+other hand, I expect to be able to clean up some more \CONTEXT\
+code, although we already got rid of some subsystems (like the
+rather flexible (mixed) font encoding, where each language could
+have multiple hyphenation patterns, etc.). Also, although initial
+loading of math fonts might take a bit more time (as long as we
+use virtual Latin Modern math), font switching is more efficient
+now due to fewer families. But speedups in the \CONTEXT\ code might
+be compensated for by more advanced mechanisms that call out to \LUA.
+You will be surprised by how much speed can be improved by proper
+document encoding and proper styles. I can try to gain a couple
+more pages per second by more efficient code, but a user's style
+that does an inefficient massive font switch for some 10 words per
+page easily compensates for that.
+
+When processing this 10 page chapter in an editor (Scite) it takes
+some 2.7 seconds between hitting the processing key and the result
+showing up in Acrobat. I can live with that, especially when I
+keep in mind that my next computer will be faster.
+
+This is where we stand now. The three reports shown before give
+you an impression of the impact of \LUATEX\ on \CONTEXT. To what
+extent is this reflected in the code base? We end this chapter
+by showing four tables. The first table shows the number of
+files that make up the core of \CONTEXT\ (modules are excluded).
+The second table shows the accumulated size of these files
+(comments and spacing stripped). The third and fourth tables show
+the same information in a different way, just to give you a better
+impression of the relative number of files and sizes. The four
+character tags represent the file groups, so the files have
+names like \type {node-ini.mkiv}, \type {font-otf.lua} and
+\type {supp-box.tex}.
+
+Eventually most \MKII\ files (with the \type {mkii} suffix) and
+\MKIV\ files (with suffix \type {mkiv}) will differ and the number
+of files with the \type {tex} suffix will be fewer. Because they
+are and will be mostly downward compatible, styles and modules
+will be shared as much as possible.
+
+\placefigure[none,90,page]{}{\externalfigure[mk-last-state.pdf][page=1,width=\the\textheight]}
+\placefigure[none,90,page]{}{\externalfigure[mk-last-state.pdf][page=2,width=\the\textheight]}
+\placefigure[none,90,page]{}{\externalfigure[mk-last-state.pdf][page=3,width=\the\textheight]}
+\placefigure[none,90,page]{}{\externalfigure[mk-last-state.pdf][page=4,width=\the\textheight]}
+
+\stopcomponent