% language=uk

\startcomponent cld-backendcode

\environment cld-environment

% derived from hybrid

\startchapter[title={Backend code}]

\startsection [title={Introduction}]

In \CONTEXT\ we've always separated the backend code in so-called driver files.
This means that in the code related to typesetting only calls to the \API\ take
place, and no backend-specific code is to be used. Currently a \PDF\ backend is
supported as well as an \XML\ export. \footnote {This chapter is derived from an
article on these matters. You can find more information in \type {hybrid.pdf}.}

Some \CONTEXT\ users like to add their own \PDF\ specific code to their styles or
modules. However, such extensions can interfere with existing code, especially
when resources are involved. Therefore the construction of \PDF\ data structures
and resources is rather controlled and has to be done via the official helper
macros.

\stopsection

\startsection [title={Structure}]

A \PDF\ file is a tree of indirect objects. Each object has a number and the file
contains a table (or multiple tables) that relates these numbers to positions in
the file (or positions in a compressed object stream). That way a file can be
viewed without reading all data: a viewer only loads what is needed.

\starttyping
1 0 obj <<
    /Name (test) /Address 2 0 R
>>
2 0 obj [
   (Main Street) (24) (postal code) (MyPlace)
]
\stoptyping

For the sake of the discussion we consider strings like \type {(test)} also to be
objects. In the next section we list what we can encounter in a \PDF\ file.
Objects can be indirect, in which case a reference is used (\type {2 0 R}), or
direct.
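
The relation between object numbers and file positions can be illustrated with a
few lines of plain \LUA. This is only a sketch of the idea and not how the
backend writes files: the helper \type {writeobjects} is made up for this example
and the result is a simplified offset table, not a valid cross reference section.

\starttyping
-- A simplified illustration: serialize objects and remember where each
-- one starts, which is what a cross reference table records.
local function writeobjects(objects)
    local buffer, offsets, position = { }, { }, 0
    for number=1,#objects do
        local data = string.format("%d 0 obj %s endobj\n",number,objects[number])
        offsets[number] = position -- byte offset of this object
        position = position + #data
        buffer[number] = data
    end
    return table.concat(buffer), offsets
end

local file, offsets = writeobjects {
    "<< /Name (test) /Address 2 0 R >>",
    "[ (Main Street) (24) (postal code) (MyPlace) ]",
}

-- a viewer can jump straight to object 2 without scanning object 1
local start = offsets[2] + 1
print(file:sub(start,start+6)) -- 2 0 obj
\stoptyping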

It all starts in the document's root object. From there we access the page tree
and resources. Each page carries its own resource information which makes random
access easier. A page has a page stream and there we find the content to be
rendered as a mixture of (\UNICODE) strings and special drawing and rendering
operators. Here we will not discuss them as they are mostly generated by the
engine itself or dedicated subsystems like the \METAPOST\ converter. There we use
literal or \type {\latelua} whatsits to inject code into the current stream.

\stopsection

\startsection [title={Data types}]

There are several datatypes in \PDF\ and we support all of them one way or the
other.

\starttabulate[|l|l|p|]
\FL
\NC \bf type \NC \bf form \NC \bf meaning \NC \NR
\TL
\NC constant   \NC \type{/...} \NC A symbol (prescribed string). \NC \NR
\NC string     \NC \type{(...)} \NC A sequence of characters in pdfdoc
                   encoding \NC \NR
\NC unicode    \NC \type{<...>} \NC A sequence of characters in utf16
                   encoding \NC \NR
\NC number     \NC \type{3.1415} \NC A number constant. \NC \NR
\NC boolean    \NC \type{true/false} \NC A boolean constant. \NC \NR
\NC reference  \NC \type{N 0 R} \NC A reference to an object \NC \NR
\NC dictionary \NC \type{<< ... >>} \NC A collection of key value pairs
                   where the value itself is an (indirect) object.
                   \NC \NR
\NC array      \NC \type{[ ... ]} \NC A list of objects or references to
                   objects. \NC \NR
\NC stream     \NC \NC A sequence of bytes, optionally packaged with
                   a dictionary that contains descriptive data. \NC \NR
\NC xform      \NC \NC A special kind of object containing a reusable
                   blob of data, for example an image. \NC \NR
\LL
\stoptabulate
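
In a literal string the backslash and unbalanced parentheses have a special
meaning, so they have to be escaped when they are part of the text (balanced
parentheses may be left alone but escaping them is always valid). The following
plain \LUA\ function only sketches that rule; it is not the serializer used by
the backend and the name \type {literalstring} is made up for this example.

\starttyping
-- Wrap text as a PDF literal string, escaping \ ( and ).
local function literalstring(str)
    -- each special character is prefixed by a backslash
    local escaped = str:gsub("[\\()]","\\%0")
    return "(" .. escaped .. ")"
end

print(literalstring("a (b) c")) -- (a \(b\) c)
\stoptyping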

While writing additional backend code, we mostly create dictionaries.

\starttyping
<< /Name (test) /Address 2 0 R >>
\stoptyping

In this case the indirect object can look like:

\starttyping
[ (Main Street) (24) (postal code) (MyPlace) ]
\stoptyping

The \LUATEX\ manual mentions primitives like \type {\pdfobj}, \type {\pdfannot},
\type {\pdfcatalog}, etc. However, in \MKIV\ no such primitives are used. You can
still use many of them but those that push data into document or page related
resources are overloaded to do nothing at all.

In the \LUA\ backend code you will find function calls like:

\starttyping
local d = lpdf.dictionary {
    Name    = lpdf.string("test"),
    Address = lpdf.array {
        "Main Street", "24", "postal code", "MyPlace",
    }
}
\stoptyping

Equally valid is:

\starttyping
local d = lpdf.dictionary()
d.Name = "test"
\stoptyping

Eventually the object will end up in the file using calls like:

\starttyping
local r = lpdf.immediateobject(tostring(d))
\stoptyping

or using the wrapper (which permits tracing):

\starttyping
local r = lpdf.flushobject(d)
\stoptyping

The object content will be serialized according to the formal specification so
the proper \type {<< >>} etc.\ are added. If you want the content instead you can
use a function call:

\starttyping
local dict = d()
\stoptyping
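
That one value can be used as a table, be converted to a string and be called as
a function is possible thanks to \LUA\ metatables. The following stand-alone
sketch mimics this behaviour for a dictionary. It is an illustration with a made
up name (\type {newdictionary}) and not the actual \type {lpdf} implementation;
keys are sorted only to get predictable output.

\starttyping
-- serialize the key/value pairs without the surrounding << >>
local function content(t)
    local keys = { }
    for k in pairs(t) do
        keys[#keys+1] = k
    end
    table.sort(keys)
    local result = { }
    for i=1,#keys do
        local k = keys[i]
        result[i] = "/" .. k .. " " .. tostring(t[k])
    end
    return table.concat(result," ")
end

local mt = {
    __tostring = function(t) return "<< " .. content(t) .. " >>" end,
    __call     = function(t) return content(t) end,
}

local function newdictionary(t)
    return setmetatable(t or { },mt)
end

local d = newdictionary { Type = "/Example" }
d.Count = 2

print(tostring(d)) -- << /Count 2 /Type /Example >>
print(d())         -- /Count 2 /Type /Example
\stoptyping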

An example of using references is:

\starttyping
local a = lpdf.array {
    "Main Street", "24", "postal code", "MyPlace",
}
local d = lpdf.dictionary {
    Name    = lpdf.string("test"),
    -- flush the array first: a reference needs an object number
    Address = lpdf.reference(lpdf.flushobject(a)),
}
local r = lpdf.flushobject(d)
\stoptyping

We have the following creators. Their arguments are optional.

\starttabulate[|l|p|]
\FL
\NC \bf function \NC \bf optional parameter \NC \NR
\TL
\NC \type{lpdf.null}        \NC \NC \NR
\NC \type{lpdf.number}      \NC number \NC \NR
\NC \type{lpdf.constant}    \NC string \NC \NR
\NC \type{lpdf.string}      \NC string \NC \NR
\NC \type{lpdf.unicode}     \NC string \NC \NR
\NC \type{lpdf.boolean}     \NC boolean \NC \NR
\NC \type{lpdf.array}       \NC indexed table of objects \NC \NR
\NC \type{lpdf.dictionary}  \NC hash with key/values \NC \NR
%NC \type{lpdf.stream}      \NC indexed table of operators \NC \NR
\NC \type{lpdf.reference}   \NC string \NC \NR
\NC \type{lpdf.verbose}     \NC indexed table of strings \NC \NR
\LL
\stoptabulate

\ShowLuaExampleString{tostring(lpdf.null())}
\ShowLuaExampleString{tostring(lpdf.number(123))}
\ShowLuaExampleString{tostring(lpdf.constant("whatever"))}
\ShowLuaExampleString{tostring(lpdf.string("just a string"))}
\ShowLuaExampleString{tostring(lpdf.unicode("just a string"))}
\ShowLuaExampleString{tostring(lpdf.boolean(true))}
\ShowLuaExampleString{tostring(lpdf.array { 1, lpdf.constant("c"), true, "str" })}
\ShowLuaExampleString{tostring(lpdf.dictionary { a=1, b=lpdf.constant("c"), d=true, e="str" })}
%ShowLuaExampleString{tostring(lpdf.stream("whatever"))}
\ShowLuaExampleString{tostring(lpdf.reference(123))}
\ShowLuaExampleString{tostring(lpdf.verbose("whatever"))}

\stopsection

\startsection[title={Managing objects}]

Flushing objects is done with:

\starttyping
lpdf.flushobject(obj)
\stoptyping

Reserving an object is of course possible and done with:

\starttyping
local r = lpdf.reserveobject()
\stoptyping

Such an object is flushed with:

\starttyping
lpdf.flushobject(r,obj)
\stoptyping
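
Reserving is useful when an object has to be referenced before its content is
known, for instance when two objects point at each other. The next lines are a
sketch that only uses calls shown in this chapter; the keys \type {Parent} and
\type {Child} are arbitrary and chosen for the example. The code is wrapped in a
function that gets the \type {lpdf} table passed, so it only does something when
run inside \CONTEXT.

\starttyping
local function linkedobjects(lpdf)
    -- we need the number of the parent before its content is complete
    local parent = lpdf.reserveobject()
    local child  = lpdf.flushobject(lpdf.dictionary {
        Parent = lpdf.reference(parent),
    })
    -- now the child is known, so we can finish the reserved object
    lpdf.flushobject(parent,lpdf.dictionary {
        Child = lpdf.reference(child),
    })
    return parent, child
end

if lpdf then
    linkedobjects(lpdf)
end
\stoptyping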

We also support named objects:

\starttyping
lpdf.reserveobject("myobject")

lpdf.flushobject("myobject",obj)
\stoptyping

A delayed object is created with:

\starttyping
local ref = lpdf.delayedobject(data)
\stoptyping

The data will be flushed later using the object number that is returned (\type
{ref}). When you expect that many objects with the same content are used, you can
use:

\starttyping
local obj = lpdf.shareobject(data)
local ref = lpdf.shareobjectreference(data)
\stoptyping

This one flushes the object and returns the object number. Already defined
objects are reused. In addition to this code driven optimization, some other
optimization and reuse takes place but all that happens without user
intervention. Only use this when it's really needed as it might consume more
memory and needs more processing time.
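
The idea behind sharing can be sketched in plain \LUA: the serialized content
serves as a key in a cache, so identical content maps onto one object number.
This is an illustration and not the code used by the backend; \type {newsharer}
and the \type {flush} callback are made up names.

\starttyping
local function newsharer(flush)
    local cache = { }
    return function(content)
        local key    = tostring(content)
        local number = cache[key]
        if not number then
            number = flush(key) -- only new content is written
            cache[key] = number
        end
        return number
    end
end

-- a dummy flusher that just hands out increasing object numbers
local last  = 0
local share = newsharer(function(data) last = last + 1 return last end)

local a = share("<< /Type /Example >>")
local b = share("<< /Type /Example >>") -- reused, nothing is flushed
local c = share("<< /Type /Other >>")

print(a,b,c) -- 1 1 2
\stoptyping

Such a cache has to remember all content that passed by, which is where the extra
memory and processing time come from.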

\stopsection

\startsection [title={Resources}]

While \LUATEX\ itself will embed all resources related to regular typesetting,
\MKIV\ has to take care of embedding those related to special tricks, like
annotations, spot colors, layers, shades, transparencies, metadata, etc. Because
third party modules (like tikz) also can add resources we provide some macros
that make sure that no interference takes place:

\starttyping
\pdfbackendsetcatalog       {key}{string}
\pdfbackendsetinfo          {key}{string}
\pdfbackendsetname          {key}{string}

\pdfbackendsetpageattribute {key}{string}
\pdfbackendsetpagesattribute{key}{string}
\pdfbackendsetpageresource  {key}{string}

\pdfbackendsetextgstate     {key}{pdfdata}
\pdfbackendsetcolorspace    {key}{pdfdata}
\pdfbackendsetpattern       {key}{pdfdata}
\pdfbackendsetshade         {key}{pdfdata}
\stoptyping

One is free to use the \LUA\ interface instead, as there one has more
possibilities but when code is shared with other macro packages the macro
interface makes more sense. The names of the \LUA\ functions are similar, like:

\starttyping
lpdf.addtoinfo(key,anything_valid_pdf)
\stoptyping

Currently we expose a bit more of the backend code than we like and
future versions will have a more restricted access. The following
functions will stay public:

\starttyping
lpdf.addtopageresources  (key,value)
lpdf.addtopageattributes (key,value)
lpdf.addtopagesattributes(key,value)

lpdf.adddocumentextgstate (key,value)
lpdf.adddocumentcolorspace(key,value)
lpdf.adddocumentpattern   (key,value)
lpdf.adddocumentshade     (key,value)

lpdf.addtocatalog        (key,value)
lpdf.addtoinfo           (key,value)
lpdf.addtonames          (key,value)
\stoptyping
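
As an example of the \LUA\ interface, the following sketch adds an entry to the
document information dictionary. The key \type {MyCustomKey} and its value are
hypothetical, and it is assumed that a value made by one of the creators
described earlier qualifies as valid \PDF. The code is a function that expects
the \type {lpdf} table, so it is inert outside \CONTEXT.

\starttyping
local function addcustominfo(lpdf)
    -- hypothetical key and value, only meant as illustration
    lpdf.addtoinfo("MyCustomKey",lpdf.unicode("some value"))
end

if lpdf then
    addcustominfo(lpdf)
end
\stoptyping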

\stopsection

\startsection [title={Annotations}]

You can use the \LUA\ functions that relate to annotations etc.\ but normally you
will use the regular \CONTEXT\ user interface. You can look into some of the
\type {lpdf-*} modules to see how special annotations can be dealt with.

\stopsection

\startsection [title={Tracing}]

There are several tracing options built in and some more will be added in due
time:

\starttyping
\enabletrackers
  [backend.finalizers,
   backend.resources,
   backend.objects,
   backend.detail]
\stoptyping

As with all trackers you can also pass them on the command line, for example:

\starttyping
context --trackers=backend.* yourfile
\stoptyping

The reference related backend mechanisms have their own trackers. When you write
code that generates \PDF, it also helps to look in the \PDF\ file to see if
things are done right. In that case you need to disable compression:

\starttyping
\nopdfcompression
\stoptyping

\stopsection

\startsection[title={Analyzing}]

The \type {epdf} library that comes with \LUATEX\ offers a userdata interface to
\PDF\ files. On top of that \CONTEXT\ provides a more \LUA-ish access, using
tables. You can open a \PDF\ file with:

\starttyping
local mypdf = lpdf.epdf.load(filename)
\stoptyping

When opening is successful, you have access to a couple of tables:

\starttabulate[|l|l|]
\NC \type{pages}         \NC indexed \NC \NR
\NC \type{destinations}  \NC hashed  \NC \NR
\NC \type{javascripts}   \NC hashed  \NC \NR
\NC \type{widgets}       \NC hashed  \NC \NR
\NC \type{embeddedfiles} \NC hashed  \NC \NR
\NC \type{layers}        \NC indexed \NC \NR
\stoptabulate

These provide efficient access to some data that otherwise would take a bit of
code to deal with. Another top-level table is \type {Catalog}, which is
characteristic for \PDF. Watch the capitalization: as with other native \PDF\
data structures, keys are case sensitive and match the standard.

Here is an example of usage:

\starttyping
local MyDocument = lpdf.epdf.load("somefile.pdf")

context.starttext()

  local pages    = MyDocument.pages
  local nofpages = pages.n

  context.starttabulate { "|c|c|c|" }

    context.NC() context("page")
    context.NC() context("width")
    context.NC() context("height") context.NR()

    for i=1, nofpages do
      local page = pages[i]
      local bbox = page.CropBox or page.MediaBox
      context.NC() context(i)
      context.NC() context(bbox[3]-bbox[1])              -- width
      context.NC() context(bbox[4]-bbox[2]) context.NR() -- height
    end

  context.stoptabulate()

context.stoptext()
\stoptyping
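
A bounding box is an array with four numbers: the coordinates of the lower left
and upper right corners. The arithmetic can be isolated in a small helper; the
name \type {boxsize} is made up for this example and the function works on any
indexed table, so it can be tried without a \PDF\ file.

\starttyping
-- bbox = { llx, lly, urx, ury } in PDF units (1/72 inch by default)
local function boxsize(bbox)
    local width  = bbox[3] - bbox[1]
    local height = bbox[4] - bbox[2]
    return width, height
end

-- an A4 page as commonly found in a MediaBox (rounded values)
local width, height = boxsize { 0, 0, 595, 842 }

print(width,height) -- 595 842
\stoptyping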

\stopsection

\stopchapter

\stopcomponent