path: root/doc/context/sources/general/manuals/cld/cld-abitoflua.tex
author Hans Hagen <> 2018-03-15 16:04:31 +0100
committer Context Git Mirror Bot <> 2018-03-15 16:04:31 +0100
commit a4e07f30e880ab27c2918f81f136e257475b7729 (patch)
tree 02db002d3001a49777a049f9a98fdc872a5e1ad1 /doc/context/sources/general/manuals/cld/cld-abitoflua.tex
parent cbc37c39432e0ebe38e0922fc6d14c2955ab3ba2 (diff)
2018-03-15 15:36:00
Diffstat (limited to 'doc/context/sources/general/manuals/cld/cld-abitoflua.tex')
1 files changed, 869 insertions, 0 deletions
diff --git a/doc/context/sources/general/manuals/cld/cld-abitoflua.tex b/doc/context/sources/general/manuals/cld/cld-abitoflua.tex
new file mode 100644
index 000000000..e61507929
--- /dev/null
+++ b/doc/context/sources/general/manuals/cld/cld-abitoflua.tex
@@ -0,0 +1,869 @@
+% language=uk
+\startcomponent cld-abitoflua
+\environment cld-environment
+\startchapter[title=A bit of Lua]
+\startsection[title=The language]
+Small is beautiful and this is definitely true for the programming language \LUA\
+(moon in Portuguese). We had good reasons for using this language in \LUATEX:
+simplicity, speed, syntax and size to mention a few. Of course personal taste
+also played a role and after using a couple of scripting languages extensively
+the switch to \LUA\ was rather pleasant.
+As the \LUA\ reference manual is an excellent book there is no reason to discuss
+the language in great detail: just buy \quote {Programming in \LUA} by the \LUA\
+team. Nevertheless I will give a short summary of the important concepts but
+consult the book if you want more details.
+\startsection[title=Data types]
+The most basic data type is \type {nil}. When we define a variable, we don't need
+to give it a value:
+local v
+Here the variable \type {v} can get any value but till that
+happens it equals \type {nil}. There are simple data types like
+\type {numbers}, \type {booleans} and \type {strings}. Here are
+some numbers:
+local n = 1 + 2 * 3
+local x = 2.3
+Numbers are always floats \footnote {This is true for all versions up to 5.2;
+from version 5.3 on \LUA\ has a hybrid model with an integer subtype.} and you can use the normal
+arithmetic operators on them as well as functions defined in the math library.
+Inside \TEX\ we have only integers, although for instance dimensions can be
+specified in points using floats but that's more syntactic sugar. One reason for
+using integers in \TEX\ has been that this was the only way to guarantee
+portability across platforms. However, we're 30 years along the road and in \LUA\
+the floats are implemented identically across platforms, so we don't need to worry
+about compatibility.
+Strings in \LUA\ can be given between quotes or can be so-called long strings
+delimited by double square brackets.
+local s = "Whatever"
+local t = s .. ' you want'
+local u = t .. [[ to know]] .. [[--[ about Lua!]--]]
+The two periods indicate a concatenation. Strings are hashed, so when you say:
+local s = "Whatever"
+local t = "Whatever"
+local u = t
+only one instance of \type {Whatever} is present in memory and this fact makes
+\LUA\ very efficient with respect to strings. Strings are constants and therefore
+when you change variable \type {s}, variable \type {t} keeps its value. When you
+compare strings, in fact you compare pointers, a method that is really fast. This
+compensates for the time spent on hashing pretty well.
+Booleans are normally used to keep a state or the result from an expression.
+local b = false
+local c = n > 10 and s == "whatever"
+The other value is \type {true}. There is something that you need
+to keep in mind when you do testing on variables that are not yet set:
+local b = false
+local n
+The following applies when \type {b} and \type {n} are defined this way:
+\starttabulate[|lT|lT|]
+\NC b == false \NC true \NC \NR
+\NC n == false \NC false \NC \NR
+\NC n == nil \NC true \NC \NR
+\NC b == nil \NC false \NC \NR
+\NC b == n \NC false \NC \NR
+\stoptabulate
+Often a test looks like:
+if somevar then
+    ...
+else
+    ...
+end
+In this case we enter the else branch when \type {somevar} is either \type {nil}
+or \type {false}. It also means that by looking at the code we cannot beforehand
+conclude that \type {somevar} equals \type {true} or something else. If you want
+to really distinguish between the two cases you can be more explicit:
+if somevar == nil then
+    ...
+elseif somevar == false then
+    ...
+else
+    ...
+end
+or:
+if somevar == true then
+    ...
+else
+    ...
+end
+but such an explicit test is seldom needed.
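+The difference between an unset variable and one that is \type {false} can be
+checked with a small sketch in plain \LUA. The helper \type {classify} is a
+hypothetical name used only for this illustration. Note that \type {0} and the
+empty string are neither \type {nil} nor \type {false}, so they pass a plain test:

```lua
-- classify is a hypothetical helper, not part of Lua or ConTeXt
local function classify(v)
    if v == nil then
        return "nil"
    elseif v == false then
        return "false"
    else
        return "something else"
    end
end

local b = false
local n

print(classify(b), classify(n))   -- false   nil
print(b == n)                     -- false: nil and false are different values
print(classify(0), classify(""))  -- both give "something else"
```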
+There are a few more data types: tables and functions. Tables are very important
+and you can recognize them by the same curly braces that make \TEX\ famous:
+local t = { 1, 2, 3 }
+local u = { a = 4, b = 9, c = 16 }
+local v = { [1] = "a", [3] = "2", [4] = false }
+local w = { 1, 2, 3, a = 4, b = 9, c = 16 }
+The \type {t} is an indexed table and \type {u} a hashed table. Because the
+second slot is empty, table \type {v} is partially indexed (slot 1) and partially
+hashed (the others). There is a gray area there, for instance, what happens when
+you nil a slot in an indexed table? In practice you will not run into problems as
+you will either use a hashed table, or an indexed table (with no holes), or a
+combination of both: a mixed table like \type {w} is not uncommon.
+We mentioned that strings are in fact shared (hashed) but that an assignment of a
+string to a variable makes that variable behave like a constant. Contrary to
+that, when you assign a table, and then copy that variable, both variables can be
+used to change the table. Take this:
+local t = { 1, 2, 3 }
+local u = t
+We can change the content of the table as follows:
+t[1], t[3] = t[3], t[1]
+Here we swap two cells. This is an example of a parallel assignment. However, the
+following does the same:
+t[1], t[3] = u[3], u[1]
+After this, both \type {t} and \type {u} still share the same table. This kind of
+behaviour is quite natural. Keep in mind that expressions are evaluated first, so
+t[#t+1], t[#t+1] = 23, 45
+makes no sense, as the values end up in the same slot. There is no gain in speed
+so using parallel assignments is mostly a convenience feature.
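+The sharing of a table and the evaluation order of a parallel assignment can be
+verified with a short sketch in plain \LUA. The variable \type {list} is only
+introduced for this illustration:

```lua
local t = { 1, 2, 3 }
local u = t                -- u refers to the same table, no copy is made

t[1], t[3] = u[3], u[1]    -- the right-hand side is evaluated before assigning
print(t[1], t[3], u == t)  -- 3   1   true

local list = { }
list[#list+1], list[#list+1] = 23, 45  -- both targets resolve to slot 1
print(#list)               -- 1
```

+Which of the two values survives in slot 1 is not something to rely on, which
+is one more reason to avoid such a construct.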
+There are a few specialized data types in \LUA, like \type {coroutines} (built
+in, with type \type {thread}), \type {file} (when opened), \type {lpeg} (only
+when this library is linked in or loaded). The last two are so-called \quote
+{userdata} objects and in \LUATEX\ we have
+more userdata objects as we will see in later chapters. Of them nodes are the
+most noticeable: they are the core data type of the \TEX\ machinery. Other
+libraries, like \type {math} and \type {bit32} are just collections of functions
+operating on numbers.
+Functions look like this:
+function sum(a,b)
+    print(a, b, a + b)
+end
+or this:
+function sum(a,b)
+    return a + b
+end
+There can be many arguments of all kinds of types and there can be multiple return
+values. A function is a real type, so you can say:
+local f = function(s) print("the value is: " .. s) end
+In all these examples we defined variables as \type {local}. This is a good
+practice and avoids clashes. Now watch the following:
+local n = 1
+function sum(a,b)
+    n = n + 1
+    return a + b
+end
+function report()
+    print("number of summations: " .. n)
+end
+Here the variable \type {n} is visible after its definition and accessible for
+the two global functions. Actually the variable is visible to all the code
+following, unless of course we define a new variable with the same name. We can
+hide \type {n} as follows:
+do
+    local n = 1
+    sum = function(a,b)
+        n = n + 1
+        return a + b
+    end
+    report = function()
+        print("number of summations: " .. n)
+    end
+end
+This example also shows another way of defining the function: by assignment.
+The \typ {do ... end} creates a block with its own scope, here loosely called a
+closure. There are many places where such blocks are created, for instance in
+function bodies or branches like \typ {if ... then ... else}. This means that in
+the following snippet, variable \type {b} is not seen after the end:
+if a > 10 then
+    local b = a + 10
+    print(b*b)
+end
+When you process a blob of \LUA\ code in \TEX\ (using \type {\directlua} or \type
+{\latelua}) it happens in a closure with an implied \typ {do ... end}. So, \type
+{local} defined variables are really local.
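+A minimal sketch of this scoping in plain \LUA; the names \type {counter} and
+\type {bump} are made up for the example:

```lua
local bump
do
    local counter = 0          -- only visible inside this do ... end block
    bump = function()          -- the function keeps access to counter
        counter = counter + 1
        return counter
    end
end

print(bump(), bump())          -- 1   2
print(counter)                 -- nil: out here counter is an unset global
```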
+\startsection[title=\TEX's data types]
+We mentioned \type {numbers}. At the \TEX\ end we have counters as well as
+dimensions. Both are numbers but dimensions are specified differently:
+local n = tex.count[0]
+local m = tex.dimen.lineheight
+local o = tex.sp("10.3pt") -- sp or 'scaled point' is the smallest unit
+The unit of dimension is \quote {scaled point} and this is a pretty small unit:
+10 points equal 655360 such units.
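+The arithmetic behind this can be mimicked in plain \LUA\ as a rough sketch.
+The helper \type {pt_to_sp} is a made-up name, not part of the \type {tex}
+library, and the rounding done by \type {tex.sp} may differ in corner cases:

```lua
local sp_per_pt = 65536            -- 1pt = 2^16 scaled points

-- pt_to_sp is a hypothetical helper that rounds to the nearest scaled point
local function pt_to_sp(pt)
    return math.floor(pt * sp_per_pt + 0.5)
end

print(pt_to_sp(10))                -- 655360
print(pt_to_sp(10.3))              -- 675021
```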
+Another accessible data type is tokens. They are automatically converted to
+strings and vice versa.
+tex.toks[0] = "message"
+Be aware of the fact that the tokens are letters so the following will come out
+as text and not issue a message:
+tex.toks[0] = "\message{just text}"
+\startsection[title=Control structures]
+Loops are not much different from other languages: we have \typ {for ... do},
+\typ {while ... do} and \typ {repeat ... until}. We start with the simplest case:
+for index=1,10 do
+    print(index)
+end
+You can specify a step and go downward as well:
+for index=22,2,-2 do
+    print(index)
+end
+Indexed tables can be traversed this way:
+for index=1,#list do
+    print(index, list[index])
+end
+Hashed tables on the other hand are dealt with as follows:
+for key, value in next, list do
+    print(key, value)
+end
+Here \type {next} is a built in function. There is more to say about this
+mechanism but the average user will use only this variant. Slightly less
+efficient is the following, more readable variant:
+for key, value in pairs(list) do
+    print(key, value)
+end
+and for an indexed table:
+for index, value in ipairs(list) do
+    print(index, value)
+end
+The function call to \type {pairs(list)} returns \typ {next, list} so there is an
+(often negligible) extra overhead of one function call.
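+The differences between these iterators show up as soon as a table has a hole or
+a hashed part. The following sketch assumes plain \LUA\ and a table without
+metatable; the variable names are only illustrative:

```lua
local list = { "a", "b", nil, "d", x = 1 }

local seen = 0
for index, value in ipairs(list) do   -- stops at the first nil
    seen = seen + 1
end
print(seen)                           -- 2

local count = 0
for key, value in next, list do       -- visits every entry, in no fixed order
    count = count + 1
end
print(count)                          -- 4

local f, t, k = pairs(list)           -- pairs just hands back next and the table
print(f == next, t == list, k)        -- true   true   nil
```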
+The other two loop variants, \type {while} and \type {repeat}, are similar.
+i = 0
+while i < 10 do
+    i = i + 1
+    print(i)
+end
+This can also be written as:
+i = 0
+repeat
+    i = i + 1
+    print(i)
+until i == 10
+or:
+i = 0
+while true do
+    i = i + 1
+    print(i)
+    if i == 10 then
+        break
+    end
+end
+Of course you can use more complex expressions in such constructs.
+Conditions have the following form:
+if a == b or c > d or e then
+    ...
+elseif f == g then
+    ...
+else
+    ...
+end
+Watch the double \type {==}. The complement of this is \type {~=}. Precedence is
+similar to other languages. In practice, as strings are hashed, tests like
+if key == "first" then
+    ...
+end
+and
+if n == 1 then
+    ...
+end
+are equally efficient. There is really no need to use numbers to identify states
+instead of more verbose strings.
+Functionality can be grouped in libraries. There are a few default libraries,
+like \type {string}, \type {table}, \type {lpeg}, \type {math}, \type {io} and
+\type {os} and \LUATEX\ adds some more, like \type {node} and \type {tex}.
+A library is in fact nothing more than a bunch of functionality organized using a
+table, where the table provides a namespace as well as a place to store public
+variables. Of course there can be local (hidden) variables used in defining
+the functions:
+ mylib = { }
+ local n = 1
+ function mylib.sum(a,b)
+ n = n + 1
+ return a + b
+ end
+ function
+ print("number of summations: " .. n)
+ end
+The defined function can be called like:
+You can also create a shortcut. This speeds up the process because there are fewer
+lookups then. In the following code multiple calls take place:
+local sum = mylib.sum
+for i=1,10 do
+    for j=1,10 do
+        print(i, j, sum(i,j))
+    end
+end
+As \LUA\ is pretty fast you should not overestimate the speedup, especially not
+when a function is called seldom. There is an important side effect here: in the
+case of:
+ print(i, j, sum(i,j))
+the meaning of \type {sum} is frozen. But in the case of
+ print(i, j, mylib.sum(i,j))
+the current meaning is taken, that is: each time the interpreter will access
+\type {mylib} and get the current meaning of \type {sum}. And there can be a good
+reason for this, for instance when the meaning is adapted to different situations.
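+The difference can be made visible with a small sketch in plain \LUA. The local
+\type {mylib} below is a stand-in for a library like the one above, and the
+redefinition of \type {sum} is purely illustrative:

```lua
local mylib = { }

function mylib.sum(a,b)
    return a + b
end

local sum = mylib.sum            -- shortcut: captures the current function

function mylib.sum(a,b)          -- the library function is redefined later on
    return 2 * (a + b)
end

print(sum(1,2))                  -- 3: the shortcut still calls the old version
print(mylib.sum(1,2))            -- 6: the lookup finds the new version
```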
+In \CONTEXT\ we have quite some code organized this way. Although much is exposed
+(if only because it is used all over the place) you should be careful in using
+functions (and data) that are still experimental. There are a couple of general
+libraries and some extend the core \LUA\ libraries. You might want to take a look
+at the files in the distribution that start with \type {l-}, like \type
+{l-table.lua}. These files are preloaded.\footnote {In fact, if you write scripts
+that need their functionality, you can use \type {mtxrun} to process the script,
+as \type {mtxrun} has the core libraries preloaded as well.} For instance, if you
+want to inspect a table, you can say:
+local t = { "aap", "noot", "mies" }
+You can get an overview of what is implemented by running the following command:
+context s-tra-02 --mode=tablet
+{\em todo: add nice synonym for this module and also add helpinfo at the top so
+that we can do \type {context --styles}}
+You can add comments to your \LUA\ code. There are basically two methods: one
+liners and multi line comments.
+local option = "test" -- use this option with care
+local method = "unknown" --[[comments can be very long and when entered
+ this way they can span multiple lines]]
+The so called long comments look like long strings preceded by \type {--} and
+there can be more complex boundary sequences.
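+As a small sketch of such boundary sequences: adding the same number of equal
+signs to the opening and closing brackets lets a long string or a long comment
+contain \type {]]} itself. The variable name \type {s} is just an illustration:

```lua
local s = [==[a long string that may contain ]] inside]==]

--[==[ a long comment
       that may contain ]] as well ]==]

print(s)   -- a long string that may contain ]] inside
```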
+Sometimes \type {nil} can bite you, especially in tables, as they have a dual nature:
+indexed as well as hashed.
+local n1 = # { nil, 1, 2, nil } -- 3
+local n2 = # { nil, nil, 1, 2, nil } -- 0
+context("n1 = %s and n2 = %s",n1,n2)
+results in: \getbuffer
+So, you cannot really depend on the length operator here. On the other hand, with:
+local function check(...)
+    return select("#",...)
+end
+local n1 = check ( nil, 1, 2, nil ) -- 4
+local n2 = check ( nil, nil, 1, 2, nil ) -- 5
+context("n1 = %s and n2 = %s",n1,n2)
+we get: \getbuffer, so the \type {select} is quite usable. However, that function also
+has its specialities. The following example needs some close reading:
+local function filter(n,...)
+    return select(n,...)
+end
+local v1 = { filter ( 1, 1, 2, 3 ) }
+local v2 = { filter ( 2, 1, 2, 3 ) }
+local v3 = { filter ( 3, 1, 2, 3 ) }
+context("v1 = %+t and v2 = %+t and v3 = %+t",v1,v2,v3)
+We collect the result in a table and show the concatenation:
+So, what you effectively get is the whole list starting with the given offset.
+local function filter(n,...)
+    return (select(n,...))
+end
+local v1 = { filter ( 1, 1, 2, 3 ) }
+local v2 = { filter ( 2, 1, 2, 3 ) }
+local v3 = { filter ( 3, 1, 2, 3 ) }
+context("v1 = %+t and v2 = %+t and v3 = %+t",v1,v2,v3)
+Now we get: \getbuffer. The extra \type {()} around the result makes sure that
+we only get one return value.
+Of course the same effect can be achieved as follows:
+local function filter(n,...)
+    return select(n,...)
+end
+local v1 = filter ( 1, 1, 2, 3 )
+local v2 = filter ( 2, 1, 2, 3 )
+local v3 = filter ( 3, 1, 2, 3 )
+context("v1 = %s and v2 = %s and v3 = %s",v1,v2,v3)
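+For experiments outside \TEX\ the same behaviour can be reproduced without the
+\type {context} command. This is a minimal sketch in plain \LUA, with made-up
+function names \type {count}, \type {tail} and \type {single}:

```lua
local function count(...)
    return select("#", ...)          -- number of arguments, nils included
end

local function tail(n, ...)
    return select(n, ...)            -- all arguments from position n on
end

local function single(n, ...)
    return (select(n, ...))          -- parentheses keep only the first value
end

print(count(nil, 1, 2, nil))         -- 4
print(tail(2, "a", "b", "c"))        -- b   c
print(single(2, "a", "b", "c"))      -- b
```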
+\startsection[title={A few suggestions}]
+You can wrap all kinds of functionality in functions but sometimes it makes no
+sense to add the overhead of a call as the same can be done with hardly any code.
+If you want a slice of a table, you can copy the range needed to a new table. A
+simple version with no bounds checking is:
+local new = { } for i=a,b do new[#new+1] = old[i] end
+Another, much faster, variant is the following.
+local new = { unpack(old,a,b) }
+You can use this variant for slices that are not extremely large. The function
+\type {table.sub} is an equivalent:
+local new = table.sub(old,a,b)
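+The first two variants can be compared in a standalone sketch. It assumes plain
+\LUA, where \type {unpack} is a global in version 5.1 and lives in \type
+{table.unpack} from 5.2 on; \type {table.sub} is left out here because it is a
+\CONTEXT\ extension. The names \type {loop} and \type {quick} are illustrative:

```lua
local unpack = table.unpack or unpack    -- works in Lua 5.1 as well as 5.2+

local old = { "a", "b", "c", "d", "e" }
local a, b = 2, 4

local loop = { }                         -- slice built entry by entry
for i=a,b do
    loop[#loop+1] = old[i]
end

local quick = { unpack(old,a,b) }        -- slice built in one go

print(table.concat(loop,  " "))          -- b c d
print(table.concat(quick, " "))          -- b c d
```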
+An indexed table is empty when its size equals zero:
+if #indexed == 0 then ... else ... end
+Sometimes this is better:
+if indexed and #indexed == 0 then ... else ... end
+So how do we test if a hashed table is empty? We can use the
+\type {next} function as in:
+if hashed and next(hashed) then ... else ... end
+Say that we have the following table:
+local t = { a=1, b=2, c=3 }
+The call \type {next(t)} returns the first key and value:
+local k, v = next(t) -- for instance "a", 1
+The second argument to \type {next} can be a key in which case the
+following key and value in the hash table is returned. The result
+is not predictable as a hash is unordered. The generic for loop
+uses this to loop over a hashed table:
+for k, v in next, t do
+    ...
+end
+Anyway, when \type {next(t)} returns \type {nil} you can be sure that the table is
+empty. This is how you can test for exactly one entry:
+if t and next(t) and not next(t,next(t)) then ... else ... end
+Here it starts making sense to wrap it into a function.
+function table.has_one_entry(t)
+    return t and next(t) and not next(t,next(t))
+end
+On the other hand, this is not that useful, unless you can spend the runtime on it.
+function table.is_empty(t)
+    return not t or not next(t)
+end
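+These tests can be tried out in plain \LUA\ with local variants of the helpers.
+The sketch below is an assumption about how one might write them defensively,
+not the \CONTEXT\ implementation: it compares with \type {nil} explicitly, so
+that a table with \type {false} as its only key is not mistaken for an empty one:

```lua
-- hypothetical local variants, written for illustration only
local function is_empty(t)
    return not t or next(t) == nil
end

local function has_one_entry(t)
    if not t then
        return false
    end
    local k = next(t)                     -- first key, nil when t is empty
    return k ~= nil and next(t,k) == nil  -- a key exists and nothing follows it
end

print(is_empty({ }), is_empty({ a = 1 }))    -- true   false
print(has_one_entry({ }))                    -- false
print(has_one_entry({ a = 1 }))              -- true
print(has_one_entry({ a = 1, b = 2 }))       -- false
```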
+We have already seen that you can embed \LUA\ code using commands like:
+\startluacode
+    print("this works")
+\stopluacode
+This command should not be confused with:
+\startlua
+    print("this works")
+\stoplua
+The first variant has its own catcode regime which means that tokens between the start
+and stop command are treated as \LUA\ tokens, with the exception of \TEX\ commands. The
+second variant operates under the regular \TEX\ catcode regime.
+Their short variants are \type {\ctxluacode} and \type {\ctxlua} as in:
+\ctxluacode{print("this works")}
+\ctxlua{print("this works")}
+In practice you will probably use \type {\startluacode} when using or defining % \stopluacode
+a blob of \LUA\ and \type {\ctxlua} for inline code. Keep in mind that the
+longer versions need more initialization and have more overhead.
+There are some more commands. For instance \type {\ctxcommand} can be used as
+an efficient way to access functions in the \type {commands} namespace. The
+following two calls are equivalent:
+\ctxlua {commands.thisorthat("...")}
+\ctxcommand {thisorthat("...")}
+There are a few shortcuts to the \type {context} namespace. Their use can best be
+seen from their meaning:
+\cldloadfile #1{\directlua{context.loadfile("#1")}}
+\cldcontext #1{\directlua{context(#1)}}
+\cldcommand #1{\directlua{context.#1}}
+The \type {\directlua{}} command can also be implemented using the token parser
+and \LUA\ itself. A variant is therefore \type {\luascript{}} which can be
+considered an alias but with a bit different error reporting. A variant on this
+is the \type {\luathread {name} {code}} command. Here is an example of their use:
+\luascript { context("foo 1:") context(i) } \par
+\luathread {test} { i = 10 context("bar 1:") context(i) } \par
+\luathread {test} { context("bar 2:") context(i) } \par
+\luathread {test} {} % resets
+\luathread {test} { context("bar 3:") context(i) } \par
+\luascript { context("foo 2:") context(i) } \par
+These commands result in:
+\startpacked \getbuffer \stoppacked
+% \testfeatureonce{100000}{\directlua {local a = 10 local a = 10 local a = 10}} % 0.53s
+% \testfeatureonce{100000}{\luascript {local a = 10 local a = 10 local a = 10}} % 0.62s
+% \testfeatureonce{100000}{\luathread {test} {local a = 10 local a = 10 local a = 10}} % 0.79s
+The variable \type {i} is local to the thread (which is not really a thread in
+\LUA\ but more a named piece of code that provides an environment which is shared
+over the calls with the same name). You will probably never need these.
+Each time a call out to \LUA\ happens the argument eventually gets parsed, converted
+into tokens, then back into a string, compiled to bytecode and executed. The next
+example code shows a mechanism that avoids this:
+\startctxfunction MyFunctionA
+    context(" A1 ")
+\stopctxfunction
+\startctxfunctiondefinition MyFunctionB
+    context(" B2 ")
+\stopctxfunctiondefinition
+The first command associates a name with some \LUA\ code and that code can be
+executed using:
+The second definition creates a command, so there we do:
+There are some more helpers but for use in document sources they make less sense. You
+can always browse the source code for examples.