summaryrefslogtreecommitdiff
diff options
context:
space:
mode:
authorPhilipp Gesang <phg@phi-gamma.net>2016-04-13 00:51:30 +0200
committerPhilipp Gesang <phg@phi-gamma.net>2016-04-13 00:51:30 +0200
commitf0deb1226705f86d849e40a83c7a9b027f8a2957 (patch)
treef2dfb89d14df77f8a6df28a79e0e3a94e4f4372d
parentadbde7d798a5b39d6524fe39396fe1cc8de71d09 (diff)
downloadluaotfload-f0deb1226705f86d849e40a83c7a9b027f8a2957.tar.gz
[db] add some stats regarding the new loader
-rw-r--r--src/luaotfload-database.lua124
1 files changed, 99 insertions, 25 deletions
diff --git a/src/luaotfload-database.lua b/src/luaotfload-database.lua
index 24c2745..0557c1a 100644
--- a/src/luaotfload-database.lua
+++ b/src/luaotfload-database.lua
@@ -8,31 +8,105 @@ if not modules then modules = { } end modules ['luaotfload-database'] = {
--[[doc--
- Some statistics:
-
- a) TL 2012, mkluatexfontdb --force
- b) v2.4, luaotfload-tool --update --force
- c) v2.4, luaotfload-tool --update --force --formats=+afm,pfa,pfb
- d) Context, mtxrun --script fonts --reload --force
-
- (Keep in mind that Context does index fewer fonts since it
- considers only the contents of the minimals tree, not the
- tex live one!)
-
- time (m:s) peak VmSize (kB)
- a 1:19 386 018
- b 0:37 715 797
- c 2:27 1 017 674
- d 0:44 1 082 313
-
- Most of the increase in memory consumption from version 1.x to 2.2+
- can be attributed to the move from single-pass to a multi-pass
- approach to building the index: Information is first gathered from
- all reachable fonts and only afterwards processed, classified and
- discarded. Also, there is a good deal of additional stuff kept in
- the database now: two extra tables for file names and font families
- have been added, making font lookups more efficient while improving
- maintainability of the code.
+ With version 2.7 we killed of the Fontforge libraries in favor of
+ the Lua implementation of the OT format reader. There were many
+ reasons to do this on top of the fact that FF won’t be around at
+ version 1.0 anymore: In addition to maintainability, memory safety
+ and general code hygiene, the new reader shows an amazing
+ performance: Scanning of the 3200 font files on my system takes
+ around 23 s now, as opposed to 74 s with the Fontforge libs. Memory
+ usage has improved drastically as well, as illustrated by these
+ profiles:
+
+ GB
+ 1.324^ #
+ | #
+ | ::#
+ | : #::
+ | @: #:
+ | @: #:
+ | @@@: #:
+ | @@@ @: #:
+ | @@ @ @ @: #:
+ | @ @@:@ @ @: #: :
+ | @@ : @ :@ @ @: #: :
+ | @@: ::@@ :@@ ::@ :@ @ @: #: : :::
+ | @@ : :: @@ :@ : @ :@ @ @: #: : :: :
+ | @@ : ::: @@ :@ ::: @ :@ @ @: #: :: :: :
+ | @@@@ :: :::: @@ :@ : : @ :@ @ @: #: :: :::: ::
+ | :@ @@ :: :::: @@ :@ : :: : @ :@ @ @: #: :: :: :: ::
+ | @:@ @@ :: ::::: @@ :@ : :::: : @ :@ @ @: #: :::: :::: :: ::
+ | @@:@ @@ :: ::::::: @@ :@ : : :: : @ :@ @ @: #: ::: @: :: :: ::@
+ | @@@:@ @@ :: :: ::::: @@ :@ ::: :: : @ :@ @ @: #: ::: @: :: :: ::@
+ | @@@@@:@ @@ ::::::: ::::: @@ :@ ::: :: : @ :@ @ @: #: ::: ::@: :: :: ::@
+ 0 +----------------------------------------------------------------------->GB
+ 0 16.29
+
+ This is the memory usage during a complete database rebuild with
+ the Fontforge libraries. The same action using the new
+ ``getinfo()`` method gives a different picture:
+
+ MB
+ 43.37^ #
+ | @ @ @#
+ | @@ @ @ @# :
+ | @@@ : @: @ @ @# :
+ | @ @@@ : : @: @ @: :@# :
+ | @ @ @@@ : : @: @ @: :@# :
+ | @ : : :@ @@@:::: @::@ @: :@#:: :
+ | :: : @ : @ : :::@ @ :@@@:::::@::@ @:::@#::::
+ | : @ : :: : :@:: :@: :::::@ @ :@@@:::::@::@:@:::@#::::
+ | :: :@ : @ ::@:@:::@:: :@: :::::@: :@ :@@@:::::@::@:@:::@#::::
+ | :: :@::: :@ ::@:@: :@::::@::::::::@:::@::@@@:::::@::@:@:::@#::::
+ | :@::::@::: :@:::@:@: :@::::@::::::::@:::@::@@@:::::@::@:@:::@#::::
+ | :::::@::::@::: :@:::@:@: :@::::@::::::::@:::@::@@@:::::@::@:@:::@#::::
+ | ::: :@::::@::: :@:::@:@: :@::::@::::::::@:::@::@@@:::::@::@:@:::@#::::
+ | :::: :@::::@::: :@:::@:@: :@::::@::::::::@:::@::@@@:::::@::@:@:::@#::::
+ | :::: :@::::@::: :@:::@:@: :@::::@::::::::@:::@::@@@:::::@::@:@:::@#::::
+ | :::: :@::::@::: :@:::@:@: :@::::@::::::::@:::@::@@@:::::@::@:@:::@#::::
+ | :::: :@::::@::: :@:::@:@: :@::::@::::::::@:::@::@@@:::::@::@:@:::@#::::
+ | :::: :@::::@::: :@:::@:@: :@::::@::::::::@:::@::@@@:::::@::@:@:::@#::::
+ | :::: :@::::@::: :@:::@:@: :@::::@::::::::@:::@::@@@:::::@::@:@:::@#::::
+ 0 +----------------------------------------------------------------------->GB
+ 0 3.231
+
+ FF peaks at around 1.4 GB after 12.5 GB worth of allocations,
+ whereas the Lua implementation arrives at around 45 MB after 3.2 GB
+ total:
+
+ impl time(B) total(B) useful-heap(B) extra-heap(B)
+ fontforge 12,496,407,184 1,421,150,144 1,327,888,638 93,261,506
+ lua 3,263,698,960 45,478,640 37,231,892 8,246,748
+
+ Much of the inefficiency of Fontforge is a direct consequence of
+ having to parse the entire font to extract what essentially boils
+ down to a couple hundred bytes of metadata per font. Since some
+ information like design sizes (oh, Adobe!) is stuffed away in
+ Opentype tables, the vastly more efficient approach of
+ fontloader.info() proves insufficient for indexing. Thus, we ended
+ up using fontloader.open() which causes even the character tables
+ to be parsed, which incidentally are responsible for most of the
+ allocations during that peak of 1.4 GB measured above, along with
+ the encodings:
+
+ 20.67% (293,781,048B) 0x6A8F72: SplineCharCreate (splineutil.c:3878)
+ 09.82% (139,570,318B) 0x618ACD: _FontViewBaseCreate (fontviewbase.c:84)
+ 08.77% (124,634,384B) 0x6A8FB3: SplineCharCreate (splineutil.c:3885)
+ 04.53% (64,436,904B) in 80 places, all below massif's threshold (1.00%)
+ 02.68% (38,071,520B) 0x64E14E: addKernPair (parsettfatt.c:493)
+ 01.04% (14,735,320B) 0x64DE7D: addPairPos (parsettfatt.c:452)
+ 39.26% (557,942,484B) 0x64A4E0: PsuedoEncodeUnencoded (parsettf.c:5706)
+
+ What gives? For 2.7 we expect a rougher transition than a year back
+ due to the complete revamp of the OT loading code. Breakage of
+ fragile aspects like font and style names has been anticipated and
+ addressed prior to the 2016 pretest release. In contrast to the
+ earlier approach of letting FF do a complete dump and then harvest
+ identifiers from the output we now have to coordinate with upstream
+ as to which fields are actually needed in order to provide a
+ similarly acceptable name → file lookup. On the bright side, these
+ things are a lot simpler to fix than the rather tedious work of
+ having users update their Luatex binary =)
--doc]]--