Diffstat (limited to 'doc/context/sources/general/manuals/hybrid/hybrid-backend.tex')
-rw-r--r-- doc/context/sources/general/manuals/hybrid/hybrid-backend.tex 389
1 files changed, 389 insertions, 0 deletions
diff --git a/doc/context/sources/general/manuals/hybrid/hybrid-backend.tex b/doc/context/sources/general/manuals/hybrid/hybrid-backend.tex
new file mode 100644
index 000000000..4b6055151
--- /dev/null
+++ b/doc/context/sources/general/manuals/hybrid/hybrid-backend.tex
@@ -0,0 +1,389 @@
+% language=uk
+
+\startcomponent hybrid-backends
+
+\environment hybrid-environment
+
+\startchapter[title={Backend code}]
+
+\startsection [title={Introduction}]
+
+In \CONTEXT\ we've always separated the backend code in so called driver files.
+This means that in the code related to typesetting only calls to the \API\ take
+place, and no backend specific code is to be used. That way we can support
+backends like dvipsone (and dviwindo), dvips, acrobat, pdftex and dvipdfmx with
+one interface. A similar model is used in \MKIV\ although at the moment we only
+have one backend: \PDF. \footnote {At this moment we only support the native
+\PDF\ backend but future versions might support \XML\ (\HTML) output as well.}
+
+Some \CONTEXT\ users like to add their own \PDF\ specific code to their styles or
+modules. However, such extensions can interfere with existing code, especially
+when resources are involved. Therefore such additions have to be made via the
+official helper macros.
+
+In the next sections an overview will be given of the current approach. There are
+still quite some rough edges but these will be polished as soon as the backend
+code is more isolated in \LUATEX\ itself.
+
+\stopsection
+
+\startsection [title={Structure}]
+
+A \PDF\ file is a tree of indirect objects. Each object has a number and the file
+contains a table (or multiple tables) that relates these numbers to positions in
+a file (or position in a compressed object stream). That way a file can be viewed
+without reading all data: a viewer only loads what is needed.
+
+\starttyping
+1 0 obj <<
+ /Name (test) /Address 2 0 R
+>>
+2 0 obj [
+ (Main Street) (24) (postal code) (MyPlace)
+]
+\stoptyping
+
+For the sake of the discussion we consider strings like \type {(test)} also to be
+objects. In the next table we list what we can encounter in a \PDF\ file. Objects
+can be indirect, in which case a reference is used (\type{2 0 R}), or direct.
+
+\starttabulate[|l|l|p|]
+\FL
+\NC \bf type \NC \bf form \NC \bf meaning \NC \NR
+\TL
+\NC constant \NC \type{/...} \NC A symbol (prescribed string). \NC \NR
+\NC string \NC \type{(...)} \NC A sequence of characters in pdfdoc encoding. \NC \NR
+\NC unicode \NC \type{<...>} \NC A sequence of characters in utf16 encoding. \NC \NR
+\NC number \NC \type{3.1415} \NC A number constant. \NC \NR
+\NC boolean \NC \type{true/false} \NC A boolean constant. \NC \NR
+\NC reference \NC \type{N 0 R} \NC A reference to an object. \NC \NR
+\NC dictionary \NC \type{<< ... >>} \NC A collection of key value pairs where the
+ value itself is an (indirect) object. \NC \NR
+\NC array \NC \type{[ ... ]} \NC A list of objects or references to objects. \NC \NR
+\NC stream \NC \NC A sequence of bytes, optionally packaged with a dictionary
+ that contains descriptive data. \NC \NR
+\NC xform \NC \NC A special kind of object containing a reusable blob of data,
+ for example an image. \NC \NR
+\LL
+\stoptabulate
+
+While writing additional backend code, we mostly create dictionaries.
+
+\starttyping
+<< /Name (test) /Address 2 0 R >>
+\stoptyping
+
+In this case the indirect object can look like:
+
+\starttyping
+[ (Main Street) (24) (postal code) (MyPlace) ]
+\stoptyping
+
+It all starts in the document's root object. From there we access the page tree
+and resources. Each page carries its own resource information which makes random
+access easier. A page has a page stream and there we find the content to be
+rendered as a mixture of (\UNICODE) strings and special drawing and rendering
+operators. Here we will not discuss them as they are mostly generated by the
+engine itself or dedicated subsystems like the \METAPOST\ converter. There we use
+literal or \type {\latelua} whatsits to inject code into the current stream.
+
+In the \CONTEXT\ \MKII\ backend driver code you will see objects in their
+verbose form. The content is passed on using special primitives, like \type
+{\pdfobj}, \type{\pdfannot}, \type {\pdfcatalog}, etc. In \MKIV\ no such
+primitives are used. In fact, some of them are overloaded to do nothing at all.
+In the \LUA\ backend code you will find function calls like:
+
+\starttyping
+local d = lpdf.dictionary {
+    Name    = lpdf.string("test"),
+    Address = lpdf.array {
+        "Main Street", "24", "postal code", "MyPlace",
+    }
+}
+\stoptyping
+
+Equally valid is:
+
+\starttyping
+local d = lpdf.dictionary()
+d.Name = "test"
+\stoptyping
+
+Eventually the object will end up in the file using calls like:
+
+\starttyping
+local r = pdf.immediateobj(tostring(d))
+\stoptyping
+
+or using the wrapper (which permits tracing):
+
+\starttyping
+local r = lpdf.flushobject(d)
+\stoptyping
+
+The object content will be serialized according to the formal specification so
+the proper \type {<< >>} etc.\ are added. If you want the content instead you can
+use a function call:
+
+\starttyping
+local dict = d()
+\stoptyping
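+
+A minimal sketch of how the two forms can be compared, assuming that the code
+runs inside a \CONTEXT\ \MKIV\ document (so that the \type {lpdf} table is
+available) and that writing to the log with \type {texio.write_nl} is good
+enough for a quick test:
+
+\starttyping
+\startluacode
+    local d = lpdf.dictionary {
+        Name = lpdf.string("test"),
+    }
+    -- the serialized form, including the surrounding << >>
+    texio.write_nl("serialized: " .. tostring(d))
+    -- the content, obtained by calling the object
+    texio.write_nl("content   : " .. tostring(d()))
+\stopluacode
+\stoptyping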
+
+An example of using references is:
+
+\starttyping
+local a = lpdf.array {
+    "Main Street", "24", "postal code", "MyPlace",
+}
+-- flush the array first so that it becomes an indirect object with a number
+local d = lpdf.dictionary {
+    Name    = lpdf.string("test"),
+    Address = lpdf.reference(lpdf.flushobject(a)),
+}
+local r = lpdf.flushobject(d)
+\stoptyping
+
+\stopsection
+
+We have the following creators. Their arguments are optional.
+
+\starttabulate[|l|p|]
+\FL
+\NC \bf function \NC \bf optional parameter \NC \NR
+\TL
+%NC \type{lpdf.stream} \NC indexed table of operators \NC \NR
+\NC \type{lpdf.dictionary} \NC hash with key/values \NC \NR
+\NC \type{lpdf.array} \NC indexed table of objects \NC \NR
+\NC \type{lpdf.unicode} \NC string \NC \NR
+\NC \type{lpdf.string} \NC string \NC \NR
+\NC \type{lpdf.number} \NC number \NC \NR
+\NC \type{lpdf.constant} \NC string \NC \NR
+\NC \type{lpdf.null} \NC \NC \NR
+\NC \type{lpdf.boolean} \NC boolean \NC \NR
+%NC \type{lpdf.true} \NC \NC \NR
+%NC \type{lpdf.false} \NC \NC \NR
+\NC \type{lpdf.reference} \NC string \NC \NR
+\NC \type{lpdf.verbose} \NC indexed table of strings \NC \NR
+\LL
+\stoptabulate
+
+Flushing objects is done with:
+
+\starttyping
+lpdf.flushobject(obj)
+\stoptyping
+
+Reserving an object is of course possible and is done with:
+
+\starttyping
+local r = lpdf.reserveobject()
+\stoptyping
+
+Such an object is flushed with:
+
+\starttyping
+lpdf.flushobject(r,obj)
+\stoptyping
+
+We also support named objects:
+
+\starttyping
+lpdf.reserveobject("myobject")
+
+lpdf.flushobject("myobject",obj)
+\stoptyping
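+
+Reserving is what makes forward references possible: the number is known before
+the content is. The following is only a sketch that combines the calls shown
+above. The keys \type {Kind} and \type {Details} are made up for the example and
+it is assumed that \type {lpdf.reference} accepts the number returned by \type
+{lpdf.reserveobject}:
+
+\starttyping
+-- reserve a number for an object whose content is not yet known
+local r = lpdf.reserveobject()
+
+-- refer to the reserved object from another object and flush that one
+local parent = lpdf.dictionary {
+    Kind    = lpdf.constant("Example"),
+    Details = lpdf.reference(r),
+}
+lpdf.flushobject(parent)
+
+-- later on, when the content is available, flush the reserved object
+local details = lpdf.array { "Main Street", "24" }
+lpdf.flushobject(r,details)
+\stoptyping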
+
+\startsection [title={Resources}]
+
+While \LUATEX\ itself will embed all resources related to regular typesetting,
+\MKIV\ has to take care of embedding those related to special tricks, like
+annotations, spot colors, layers, shades, transparencies, metadata, etc. If you
+ever took a look in the \MKII\ \type {spec-*} files you might have gotten the
+impression that it quickly becomes messy. The code there is actually rather old
+and evolved in sync with the \PDF\ format as well as \PDFTEX\ and \DVIPDFMX\
+maturing to their current state. As a result we have a dedicated object
+referencing model that sometimes results in multiple passes due to forward
+references. We could have gotten away from that with the latest versions of
+\PDFTEX\ as it provides means to reserve object numbers but it makes not much
+sense to do that now that \MKII\ is frozen.
+
+Because third party modules (like tikz) can also add resources, we provide, as
+in \MKII, an \API\ that makes sure that no interference takes place. Think of
+macros like:
+
+\starttyping
+\pdfbackendsetcatalog       {key}{string}
+\pdfbackendsetinfo          {key}{string}
+\pdfbackendsetname          {key}{string}
+
+\pdfbackendsetpageattribute {key}{string}
+\pdfbackendsetpagesattribute{key}{string}
+\pdfbackendsetpageresource  {key}{string}
+
+\pdfbackendsetextgstate     {key}{pdfdata}
+\pdfbackendsetcolorspace    {key}{pdfdata}
+\pdfbackendsetpattern       {key}{pdfdata}
+\pdfbackendsetshade         {key}{pdfdata}
+\stoptyping
+
+One is free to use the \LUA\ interface instead, as there one has more
+possibilities. The names are similar, like:
+
+\starttyping
+lpdf.addtoinfo(key,anything_valid_pdf)
+\stoptyping
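+
+As an illustration, a custom entry can be added to the info dictionary either
+at the \TEX\ end or at the \LUA\ end. The key \type {Department} and its value
+are hypothetical, and wrapping the value in \type {lpdf.string} is an
+assumption based on the creators listed before:
+
+\starttyping
+% at the TeX end
+\pdfbackendsetinfo{Department}{Documentation}
+
+% or, alternatively, at the Lua end
+\startluacode
+    -- the same entry, but now with an explicitly typed value
+    lpdf.addtoinfo("Department",lpdf.string("Documentation"))
+\stopluacode
+\stoptyping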
+
+At the time of this writing (\LUATEX\ .50) there are still places where \TEX\ and
+\LUA\ code is interwoven in a non optimal way, but that will change once the
+backend is completely separated and we can do more \TEX\ trickery at the \LUA\
+end.
+
+Also, currently we expose more of the backend code than we like and future
+versions will have more restricted access. The following functions will stay
+public:
+
+\starttyping
+lpdf.addtopageresources   (key,value)
+lpdf.addtopageattributes  (key,value)
+lpdf.addtopagesattributes (key,value)
+
+lpdf.adddocumentextgstate (key,value)
+lpdf.adddocumentcolorspace(key,value)
+lpdf.adddocumentpattern   (key,value)
+lpdf.adddocumentshade     (key,value)
+
+lpdf.addtocatalog         (key,value)
+lpdf.addtoinfo            (key,value)
+lpdf.addtonames           (key,value)
+\stoptyping
+
+There are several tracing options built in and some more will be added in due
+time:
+
+\starttyping
+\enabletrackers
+ [backend.finalizers,
+ backend.resources,
+ backend.objects,
+ backend.detail]
+\stoptyping
+
+As with all trackers you can also pass them on the command line, for example:
+
+\starttyping
+context --trackers=backend.* yourfile
+\stoptyping
+
+The reference related backend mechanisms have their own trackers.
+
+\stopsection
+
+\startsection [title={Transformations}]
+
+At the time of this writing there is still some backend related code at the
+\TEX\ end that needs a cleanup. Most noticeable is the code that deals with
+transformations (like scaling). At some point a primitive was introduced in
+\PDFTEX\ but it did not cover the complete transform matrix so we never used it.
+In \LUATEX\ we will come up with a better mechanism. Until then we stick to the
+\MKII\ method.
+
+\stopsection
+
+\startsection [title={Annotations}]
+
+The \LUA\ based backend of \MKIV\ is not so much less code, but definitely
+cleaner. There is still quite some code because in \CONTEXT\ we also
+handle annotations and destinations in \LUA. In other words: \TEX\ is not
+bothered by the backend any more. We could make that split without too much
+impact as we never depended on \PDFTEX\ hyperlink related features and used
+generic annotations instead. It's for that reason that \CONTEXT\ has always been
+able to nest hyperlinks and have annotations with a chain of actions.
+
+Another reason for doing it all at the \LUA\ end is that, as in \MKII, we have
+to deal with the rather hybrid cross reference mechanism, which uses a sort of
+language of its own. Parsing this is easier at the \LUA\ end. Think of:
+
+\starttyping
+\definereference[somesound][StartSound(attention)]
+
+\at {just some page} [someplace,somesound,StartMovie(somemovie)]
+\stoptyping
+
+We parse the specification expanding shortcuts when needed, create an action
+chain, make sure that the movie related resources are taken care of (normally the
+movie itself will be a figure), and turn the three words into hyperlinks. As this
+all happens in \LUA\ we have less \TEX\ code. Contrary to what you might expect,
+the \LUA\ code is not that much faster as the \MKII\ \TEX\ code is rather
+optimized.
+
+Special features like \JAVASCRIPT\ as well as widgets (and forms) are also
+reimplemented. Support for \JAVASCRIPT\ is not that complex at all, but as in
+\CONTEXT\ we can organize scripts in collections and have automatic inclusion of
+used functions, still some code is needed. As we now do this in \LUA\ we use less
+\TEX\ memory. Reimplementing widgets took a bit more work as I used the
+opportunity to remove hacks for older viewers. As support for widgets is somewhat
+unstable in viewers quite some testing was needed, especially because we keep
+supporting cloned and copied fields (resulting in widget trees).
+
+An interesting complication with widgets is that each instance can have a lot of
+properties and as we want to be able to use thousands of them in one document,
+each with different properties, we have efficient storage in \MKII\ and want to
+do the same in \LUA. Most code at the \TEX\ end is related to passing all those
+options.
+
+You could use the \LUA\ functions that relate to annotations etc.\ but normally
+you will use the regular \CONTEXT\ user interface. For practical reasons, the
+backend code is grouped in several tables:
+
+The \type{backends} table has subtables for each backend and currently there is
+only one: \type {pdf}. Each backend provides tables itself. In the
+\type{codeinjections} namespace we collect functions that don't interfere with
+the typesetting or typeset result, like inserting all kinds of resources (movies,
+attachments, etc.), widget related functionality, and in fact everything that does
+not fit into the other categories. In \type {nodeinjections} we organize
+functions that inject literal \PDF\ code in the nodelist which then ends up in
+the \PDF\ stream: color, layers, etc. The \type {registrations} table is reserved
+for functions related to resources that result from node injections: spot colors,
+transparencies, etc. Once the backend code is finished we might come up with
+another organization. No matter what we end up with, the way the \type {backends}
+table is supposed to be organized determines the \API\ and those who have seen
+the \MKII\ backend code will recognize some of it.
+
+\stopsection
+
+\startsection [title={Metadata}]
+
+We always had the opportunity to set the information fields in a \PDF\ but
+standardization forces us to add these large verbose metadata blobs. As this blob
+is coded in \XML\ we use the built in \XML\ parser to fill a template. Thanks to
+extensive testing and research by Peter Rolf we now have rather complete
+support for \PDF/x related demands. This will definitely evolve with the advance
+of the \PDF\ specification. You can replace the information with your own but we
+suggest that you stay away from this metadata mess as far as possible.
+
+\stopsection
+
+\startsection [title={Helpers}]
+
+If you look into the \type {lpdf-*.lua} files you will find more
+functions. Some are public helpers, like:
+
+\starttabulate
+\NC \type {lpdf.toeight(str)} \NC returns \type {(string)} \NC \NR
+%NC \type {lpdf.cleaned(str)} \NC returns \type {escaped string} \NC \NR
+\NC \type {lpdf.tosixteen(str)} \NC returns \type {<utf16 sequence>} \NC \NR
+\stoptabulate
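+
+A quick way to see what these helpers produce is to write their results to the
+log. This is only a sketch: it assumes that both functions return a string, as
+the table suggests, and it uses \type {texio.write_nl} for reporting:
+
+\starttyping
+texio.write_nl(lpdf.toeight("test"))   -- a (...) string
+texio.write_nl(lpdf.tosixteen("test")) -- a <...> utf16 sequence
+\stoptyping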
+
+An example of another public function is:
+
+\starttyping
+lpdf.sharedobj(content)
+\stoptyping
+
+This one flushes the object and returns the object number. Already defined
+objects are reused. In addition to this code driven optimization, some other
+optimization and reuse takes place but all that happens without user
+intervention.
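+
+A small sketch of that reuse, assuming that \type {content} is the serialized
+object (a string) and that identical content is what triggers sharing:
+
+\starttyping
+local content = tostring(lpdf.array { "Main Street", "24" })
+
+local first  = lpdf.sharedobj(content) -- flushes the object
+local second = lpdf.sharedobj(content) -- same content, so no new object
+
+-- according to the description above both numbers should be the same
+texio.write_nl(first == second and "object reused" or "object not reused")
+\stoptyping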
+
+\stopsection
+
+\stopchapter
+
+\stopcomponent