author | Hans Hagen <pragma@wxs.nl> | 2018-07-10 16:30:53 +0200 |
---|---|---|
committer | Context Git Mirror Bot <phg42.2a@gmail.com> | 2018-07-10 16:30:53 +0200 |
commit | ff693671b6540fa81d2ad7aecdbe786a4df97335 (patch) | |
tree | 979066b446d6d47fcec40fa7da9978c31a2bf802 /doc/context/sources/general/manuals | |
parent | f58860178fcd1497d52acaa3cb2ceda7531e46ac (diff) | |
2018-07-10 16:00:00
Diffstat (limited to 'doc/context/sources/general/manuals')
14 files changed, 4653 insertions, 4372 deletions
diff --git a/doc/context/sources/general/manuals/luatex/luatex-modifications.tex b/doc/context/sources/general/manuals/luatex/luatex-modifications.tex index b1b803d48..50d23fb1b 100644 --- a/doc/context/sources/general/manuals/luatex/luatex-modifications.tex +++ b/doc/context/sources/general/manuals/luatex/luatex-modifications.tex @@ -572,7 +572,7 @@ something (as it comes from the backend it's normally a sequence of tokens). \stopsubsection -\startsubsection[title={\lpr{pdfextension}, \lpr {pdfvariable} and \lpr {pdffeedback}}] +\startsubsection[title={\lpr{pdfextension}, \lpr {pdfvariable} and \lpr {pdffeedback}},reference=sec:pdfextensions] In order for \LUATEX\ to be more than just \TEX\ you need to enable primitives. That has already been the case right from the start. If you want the traditional \PDFTEX\ @@ -643,6 +643,7 @@ The configuration related registers have become: \edef\pdfignoreunknownimages {\pdfvariable ignoreunknownimages} \edef\pdfgentounicode {\pdfvariable gentounicode} \edef\pdfomitcidset {\pdfvariable omitcidset} +\edef\pdfomitcharset {\pdfvariable omitcharset} \edef\pdfpagebox {\pdfvariable pagebox} \edef\pdfminorversion {\pdfvariable minorversion} \edef\pdfuniqueresname {\pdfvariable uniqueresname} @@ -891,6 +892,7 @@ The engine sets the following defaults. \pdfignoreunknownimages 0 \pdfgentounicode 0 \pdfomitcidset 0 +\pdfomitcharset 0 \pdfpagebox 0 \pdfminorversion 4 \pdfuniqueresname 0 diff --git a/doc/context/sources/general/manuals/xml/xml-mkiv-commands.tex b/doc/context/sources/general/manuals/xml/xml-mkiv-commands.tex new file mode 100644 index 000000000..ab9989895 --- /dev/null +++ b/doc/context/sources/general/manuals/xml/xml-mkiv-commands.tex @@ -0,0 +1,893 @@ +\environment xml-mkiv-style + +\startcomponent xml-mkiv-commands + +\startchapter[title={Commands}] + +\startsection[title={nodes and lpaths}] + +The amount of commands available for manipulating the \XML\ file is rather large. 
+Many of the commands cooperate with the already discussed setups, a fancy name +for a collection of macro calls either or not mixed with text. + +Most of the commands are just shortcuts to \LUA\ calls, which means that the real +work is done by \LUA. In fact, what happens is that we have a continuous transfer +of control from \TEX\ to \LUA, where \LUA\ prints back either data (like element +content or attribute values) or just invokes a setup whereby it passes a +reference to the node resolved conform the path expression. The invoked setup +itself might return control to \LUA\ again, etc. + +This sounds complicated but examples will show what we mean here. First we +present the whole repertoire of commands. Because users can read the source code, +they might uncover more commands, but only the ones discussed here are official. +The commands are grouped in categories. + +In the following sections \cmdinternal {cd:node} means a reference to a node: +this can be the identifier of the root (the loaded xml tree) or a reference to a +node in that tree (often the result of some lookup. A \cmdinternal {cd:lpath} is +a fancy name for a path expression (as with \XSLT) but resolved by \LUA. + +\stopsection + +\startsection[title={commands}] + +There are a lot of commands available but you probably can ignore most of them. +We try to be complete which means that there is for instance \type {\xmlfirst} as +well as \type {\xmllast} but you probably never need the last one. There are also +commands that were used when testing this interface and we see no reason to +remove them. Some obscure ones are used in modules and after a while even I often +forget that they exist. 
To give you an idea of what commands are important we +show their use in generating the \CONTEXT\ command definitions (\type +{x-set-11.mkiv}) per January 2016: + +\startcolumns[n=2,balance=yes] +\starttabulate[|l|r|] +\NC \type {\xmlall} \NC 1 \NC \NR +\NC \type {\xmlatt} \NC 23 \NC \NR +\NC \type {\xmlattribute} \NC 1 \NC \NR +\NC \type {\xmlcount} \NC 1 \NC \NR +\NC \type {\xmldoif} \NC 2 \NC \NR +\NC \type {\xmldoifelse} \NC 1 \NC \NR +\NC \type {\xmlfilterlist} \NC 4 \NC \NR +\NC \type {\xmlflush} \NC 5 \NC \NR +\NC \type {\xmlinclude} \NC 1 \NC \NR +\NC \type {\xmlloadonly} \NC 1 \NC \NR +\NC \type {\xmlregisterdocumentsetup} \NC 1 \NC \NR +\NC \type {\xmlsetsetup} \NC 1 \NC \NR +\NC \type {\xmlsetup} \NC 4 \NC \NR +\stoptabulate +\stopcolumns + +As you can see filtering, flushing and accessing attributes score high. Below we show +the statistics of a quite complex rendering (5 variants of schoolbooks: basic book, +answers, teachers guide, worksheets, full blown version with extensive tracing). 
+ +\startcolumns[n=2,balance=yes] +\starttabulate[|l|r|] +\NC \type {\xmladdindex} \NC 3 \NC \NR +\NC \type {\xmlall} \NC 5 \NC \NR +\NC \type {\xmlappendsetup} \NC 1 \NC \NR +\NC \type {\xmlapplyselectors} \NC 1 \NC \NR +\NC \type {\xmlatt} \NC 40 \NC \NR +\NC \type {\xmlattdef} \NC 9 \NC \NR +\NC \type {\xmlattribute} \NC 10 \NC \NR +\NC \type {\xmlbadinclusions} \NC 3 \NC \NR +\NC \type {\xmlconcat} \NC 3 \NC \NR +\NC \type {\xmlcount} \NC 1 \NC \NR +\NC \type {\xmldelete} \NC 11 \NC \NR +\NC \type {\xmldoif} \NC 39 \NC \NR +\NC \type {\xmldoifelse} \NC 28 \NC \NR +\NC \type {\xmldoifelsetext} \NC 13 \NC \NR +\NC \type {\xmldoifnot} \NC 2 \NC \NR +\NC \type {\xmldoifnotselfempty} \NC 1 \NC \NR +\NC \type {\xmlfilter} \NC 100 \NC \NR +\NC \type {\xmlfirst} \NC 51 \NC \NR +\NC \type {\xmlflush} \NC 69 \NC \NR +\NC \type {\xmlflushcontext} \NC 2 \NC \NR +\NC \type {\xmlinclude} \NC 1 \NC \NR +\NC \type {\xmlincludeoptions} \NC 5 \NC \NR +\NC \type {\xmlinclusion} \NC 16 \NC \NR +\NC \type {\xmlinjector} \NC 1 \NC \NR +\NC \type {\xmlloaddirectives} \NC 1 \NC \NR +\NC \type {\xmlmapvalue} \NC 4 \NC \NR +\NC \type {\xmlmatch} \NC 1 \NC \NR +\NC \type {\xmlprependsetup} \NC 5 \NC \NR +\NC \type {\xmlregisterdocumentsetup} \NC 2 \NC \NR +\NC \type {\xmlregistersetup} \NC 1 \NC \NR +\NC \type {\xmlremapnamespace} \NC 1 \NC \NR +\NC \type {\xmlsetfunction} \NC 2 \NC \NR +\NC \type {\xmlsetinjectors} \NC 2 \NC \NR +\NC \type {\xmlsetsetup} \NC 11 \NC \NR +\NC \type {\xmlsetup} \NC 76 \NC \NR +\NC \type {\xmlstrip} \NC 1 \NC \NR +\NC \type {\xmlstripanywhere} \NC 1 \NC \NR +\NC \type {\xmltag} \NC 1 \NC \NR +\NC \type {\xmltext} \NC 53 \NC \NR +\NC \type {\xmlvalue} \NC 2 \NC \NR +\stoptabulate +\stopcolumns + +Here many more are used but this is an exceptional case. The top is again +dominated by filtering, flushing and attribute consulting. The list can actually +be smaller. 
For instance, the \type {\xmlcount} can just as well be \type +{\xmlfilter} with a \type {count} finalizer. There are also some special ones, +like the injectors, that are needed for finetuning the final result. + +\stopsection + +\startsection[title={loading}] + +\startxmlcmd {\cmdbasicsetup{xmlloadfile}} + loads the file \cmdinternal {cd:file} and registers it under \cmdinternal + {cd:name} and applies either given or standard \cmdinternal + {cd:xmlsetup} (alias: \type {\xmlload}) +\stopxmlcmd + +\startxmlcmd {\cmdbasicsetup{xmlloadbuffer}} + loads the buffer \cmdinternal {cd:buffer} and registers it under + \cmdinternal {cd:name} and applies either given or standard + \cmdinternal {cd:xmlsetup} +\stopxmlcmd + +\startxmlcmd {\cmdbasicsetup{xmlloaddata}} + loads \cmdinternal {cd:text} and registers it under \cmdinternal + {cd:name} and applies either given or standard \cmdinternal + {cd:xmlsetup} +\stopxmlcmd + +\startxmlcmd {\cmdbasicsetup{xmlloadonly}} + loads \cmdinternal {cd:text} and registers it under \cmdinternal + {cd:name} and applies either given or standard \cmdinternal + {cd:xmlsetup} but doesn't flush the content +\stopxmlcmd + +\startxmlcmd {\cmdbasicsetup{xmlinclude}} + includes the file specified by attribute \cmdinternal {cd:name} of the + element located by \cmdinternal {cd:lpath} at node \cmdinternal {cd:node} +\stopxmlcmd + +\startxmlcmd {\cmdbasicsetup{xmlprocessfile}} + registers file \cmdinternal {cd:file} as \cmdinternal {cd:name} and + process the tree starting with \cmdinternal {cd:xmlsetup} (alias: + \type {\xmlprocess}) +\stopxmlcmd + +\startxmlcmd {\cmdbasicsetup{xmlprocessbuffer}} + registers buffer \cmdinternal {cd:name} as \cmdinternal {cd:name} and process + the tree starting with \cmdinternal {cd:xmlsetup} +\stopxmlcmd + +\startxmlcmd {\cmdbasicsetup{xmlprocessdata}} + registers \cmdinternal {cd:text} as \cmdinternal {cd:name} and process + the tree starting with \cmdinternal {cd:xmlsetup} +\stopxmlcmd + +The initial setup 
defaults to \type {xml:process} that is defined +as follows: + +\starttyping +\startsetups xml:process + \xmlregistereddocumentsetups\xmldocument + \xmlmain\xmldocument +\stopsetups +\stoptyping + +First we apply the setups associated with the document (including common setups) +and then we flush the whole document. The macro \type {\xmldocument} expands to +the current document id. There is also \type {\xmlself} which expands to the +current node number (\type {#1} in setups). + +\startxmlcmd {\cmdbasicsetup{xmlmain}} + returns the whole document +\stopxmlcmd + +Normally such a flush will trigger a chain reaction of setups associated with the +child elements. + +\stopsection + +\startsection[title={saving}] + +\startxmlcmd {\cmdbasicsetup{xmlsave}} + saves the given node \cmdinternal {cd:node} in the file \cmdinternal {cd:file} +\stopxmlcmd + +\startxmlcmd {\cmdbasicsetup{xmltofile}} + saves the match of \cmdinternal {cd:lpath} in the file \cmdinternal {cd:file} +\stopxmlcmd + +\startxmlcmd {\cmdbasicsetup{xmltobuffer}} + saves the match of \cmdinternal {cd:lpath} in the buffer \cmdinternal {cd:buffer} +\stopxmlcmd + +\startxmlcmd {\cmdbasicsetup{xmltobufferverbose}} + saves the match of \cmdinternal {cd:lpath} verbatim in the buffer \cmdinternal + {cd:buffer} +\stopxmlcmd + +% \startxmlcmd {\cmdbasicsetup{xmltoparameters}} +% converts the match of \cmdinternal {cd:lpath} to key|/|values (for tracing) +% \stopxmlcmd + +The next command is only needed when you have messed with the tree using +\LUA\ code. + +\startxmlcmd {\cmdbasicsetup{xmladdindex}} + (re)indexes a tree +\stopxmlcmd + +The following macros are only used in special situations and are not really meant +for users. 
+ +\startxmlcmd {\cmdbasicsetup{xmlraw}} + flush the content if \cmdinternal {cd:node} with original entities +\stopxmlcmd + +\startxmlcmd {\cmdbasicsetup{startxmlraw}} + flush the wrapped content with original entities +\stopxmlcmd + +\stopsection + +\startsection[title={flushing data}] + +When we flush an element, the associated \XML\ setups are expanded. The most +straightforward way to flush an element is the following. Keep in mind that the +returned values itself can trigger setups and therefore flushes. + +\startxmlcmd {\cmdbasicsetup{xmlflush}} + returns all nodes under \cmdinternal {cd:node} +\stopxmlcmd + +You can restrict flushing by using commands that accept a specification. + +\startxmlcmd {\cmdbasicsetup{xmltext}} + returns the text of the matching \cmdinternal {cd:lpath} under \cmdinternal + {cd:node} +\stopxmlcmd + +\startxmlcmd {\cmdbasicsetup{xmlpure}} + returns the text of the matching \cmdinternal {cd:lpath} under \cmdinternal + {cd:node} without \type {\Ux} escaped special \TEX\ characters +\stopxmlcmd + +\startxmlcmd {\cmdbasicsetup{xmlflushtext}} + returns the text of the \cmdinternal {cd:node} +\stopxmlcmd + +\startxmlcmd {\cmdbasicsetup{xmlflushpure}} + returns the text of the \cmdinternal {cd:node} without \type {\Ux} escaped + special \TEX\ characters +\stopxmlcmd + +\startxmlcmd {\cmdbasicsetup{xmlnonspace}} + returns the text of the matching \cmdinternal {cd:lpath} under \cmdinternal + {cd:node} without embedded spaces +\stopxmlcmd + +\startxmlcmd {\cmdbasicsetup{xmlall}} + returns all nodes under \cmdinternal {cd:node} that matches \cmdinternal + {cd:lpath} +\stopxmlcmd + +\startxmlcmd {\cmdbasicsetup{xmllastmatch}} + returns all nodes found in the last match +\stopxmlcmd + +\startxmlcmd {\cmdbasicsetup{xmlfirst}} + returns the first node under \cmdinternal {cd:node} that matches \cmdinternal + {cd:lpath} +\stopxmlcmd + +\startxmlcmd {\cmdbasicsetup{xmllast}} + returns the last node under \cmdinternal {cd:node} that matches 
\cmdinternal + {cd:lpath} +\stopxmlcmd + +\startxmlcmd {\cmdbasicsetup{xmlfilter}} + at a match of \cmdinternal {cd:lpath} a given filter \type {filter} is applied + and the result is returned +\stopxmlcmd + +\startxmlcmd {\cmdbasicsetup{xmlsnippet}} + returns the \cmdinternal {cd:number}\high{th} element under \cmdinternal + {cd:node} +\stopxmlcmd + +\startxmlcmd {\cmdbasicsetup{xmlposition}} + returns the \cmdinternal {cd:number}\high{th} match of \cmdinternal + {cd:lpath} at node \cmdinternal {cd:node}; a negative number starts at the + end (alias: \type {\xmlindex}) +\stopxmlcmd + +\startxmlcmd {\cmdbasicsetup{xmlelement}} + returns the \cmdinternal {cd:number}\high{th} child of node \cmdinternal {cd:node}; + a negative number starts at the end +\stopxmlcmd + +\startxmlcmd {\cmdbasicsetup{xmlpos}} + returns the index (position) in the parent node of \cmdinternal {cd:node} +\stopxmlcmd + +\startxmlcmd {\cmdbasicsetup{xmlconcat}} + returns the sequence of nodes that match \cmdinternal {cd:lpath} at + \cmdinternal {cd:node} whereby \cmdinternal {cd:text} is put between each + match +\stopxmlcmd + +\startxmlcmd {\cmdbasicsetup{xmlconcatrange}} + returns the \cmdinternal {cd:first}\high {th} upto \cmdinternal + {cd:last}\high {th} of nodes that match \cmdinternal {cd:lpath} at + \cmdinternal {cd:node} whereby \cmdinternal {cd:text} is put between each + match +\stopxmlcmd + +\startxmlcmd {\cmdbasicsetup{xmlcommand}} + apply the given \cmdinternal {cd:xmlsetup} to each match of \cmdinternal + {cd:lpath} at node \cmdinternal {cd:node} +\stopxmlcmd + +\startxmlcmd {\cmdbasicsetup{xmlstrip}} + remove leading and trailing spaces from nodes under \cmdinternal {cd:node} + that match \cmdinternal {cd:lpath} +\stopxmlcmd + +\startxmlcmd {\cmdbasicsetup{xmlstripped}} + remove leading and trailing spaces from nodes under \cmdinternal {cd:node} + that match \cmdinternal {cd:lpath} and return the content afterwards +\stopxmlcmd + +\startxmlcmd {\cmdbasicsetup{xmlstripnolines}} + 
remove leading and trailing spaces as well as collapse embedded spaces + from nodes under \cmdinternal {cd:node} that match \cmdinternal {cd:lpath} +\stopxmlcmd + +\startxmlcmd {\cmdbasicsetup{xmlstrippednolines}} + remove leading and trailing spaces as well as collapse embedded spaces from + nodes under \cmdinternal {cd:node} that match \cmdinternal {cd:lpath} and + return the content afterwards +\stopxmlcmd + +\startxmlcmd {\cmdbasicsetup{xmlverbatim}} + flushes the content verbatim code (without any wrapping, i.e. no fonts + are selected and such) +\stopxmlcmd + +\startxmlcmd {\cmdbasicsetup{xmlinlineverbatim}} + return the content of the node as inline verbatim code; no further + interpretation (expansion) takes place and spaces are honoured; it uses the + following wrapper +\stopxmlcmd + +\startxmlcmd {\cmdbasicsetup{startxmlinlineverbatim}} + wraps inline verbatim mode using the environment specified (a prefix \type + {xml:} is added to the environment name) +\stopxmlcmd + +\startxmlcmd {\cmdbasicsetup{xmldisplayverbatim}} + return the content of the node as display verbatim code; no further + interpretation (expansion) takes place and leading and trailing spaces and + newlines are treated special; it uses the following wrapper +\stopxmlcmd + +\startxmlcmd {\cmdbasicsetup{startxmldisplayverbatim}} + wraps the content in display verbatim using the environment specified (a prefix + \type {xml:} is added to the environment name) +\stopxmlcmd + +\startxmlcmd {\cmdbasicsetup{xmlprettyprint}} + pretty print (with colors) the node \cmdinternal {cd:node}; use the \CONTEXT\ + \SCITE\ lexers when available (\type {\usemodule [scite]}) +\stopxmlcmd + +\startxmlcmd {\cmdbasicsetup{xmlflushspacewise}} + flush node \cmdinternal {cd:node} obeying spaces and newlines +\stopxmlcmd + +\startxmlcmd {\cmdbasicsetup{xmlflushlinewise}} + flush node \cmdinternal {cd:node} obeying newlines +\stopxmlcmd + +\stopsection + +\startsection[title={information}] + +The following commands 
return strings. Normally these are used in tests. + +\startxmlcmd {\cmdbasicsetup{xmlname}} + returns the complete name (including namespace prefix) of the + given \cmdinternal {cd:node} +\stopxmlcmd + +\startxmlcmd {\cmdbasicsetup{xmlnamespace}} + returns the namespace of the given \cmdinternal {cd:node} +\stopxmlcmd + +\startxmlcmd {\cmdbasicsetup{xmltag}} + returns the tag of the element, without namespace prefix +\stopxmlcmd + +\startxmlcmd {\cmdbasicsetup{xmlcount}} + returns the number of matches of \cmdinternal {cd:lpath} at node \cmdinternal + {cd:node} +\stopxmlcmd + +\startxmlcmd {\cmdbasicsetup{xmlatt}} + returns the value of attribute \cmdinternal {cd:name} or empty if no such + attribute exists +\stopxmlcmd + +\startxmlcmd {\cmdbasicsetup{xmlattdef}} + returns the value of attribute \cmdinternal {cd:name} or \cmdinternal + {cd:string} if no such attribute exists +\stopxmlcmd + +\startxmlcmd {\cmdbasicsetup{xmlrefatt}} + returns the value of attribute \cmdinternal {cd:name} or empty if no such + attribute exists; a leading \type {#} is removed (nicer for tex) +\stopxmlcmd + +\startxmlcmd {\cmdbasicsetup{xmlchainatt}} + returns the value of attribute \cmdinternal {cd:name} or empty if no such + attribute exists; backtracks till a match is found +\stopxmlcmd + +\startxmlcmd {\cmdbasicsetup{xmlchainattdef}} + returns the value of attribute \cmdinternal {cd:name} or \cmdinternal + {cd:string} if no such attribute exists; backtracks till a match is found +\stopxmlcmd + +\startxmlcmd {\cmdbasicsetup{xmlattribute}} + finds a first match for \cmdinternal {cd:lpath} at \cmdinternal {cd:node} and + returns the value of attribute \cmdinternal {cd:name} or empty if no such + attribute exists +\stopxmlcmd + +\startxmlcmd {\cmdbasicsetup{xmlattributedef}} + finds a first match for \cmdinternal {cd:lpath} at \cmdinternal {cd:node} and + returns the value of attribute \cmdinternal {cd:name} or \cmdinternal + {cd:text} if no such attribute exists +\stopxmlcmd + 
+\startxmlcmd {\cmdbasicsetup{xmllastatt}} + returns the last attribute found (this avoids a lookup) +\stopxmlcmd + +\startxmlcmd {\cmdbasicsetup{xmlsetatt}} + set the value of attribute \cmdinternal {cd:name} +\stopxmlcmd + +\startxmlcmd {\cmdbasicsetup{xmlsetattribute}} + set the value of attribute \cmdinternal {cd:name} for each match of \cmdinternal + {cd:lpath} +\stopxmlcmd + +\stopsection + +\startsection[title={manipulation}] + +You can use \LUA\ code to manipulate the tree and it makes no sense to duplicate +this in \TEX. In the future we might provide an interface to some of this +functionality. Keep in mind that manipuating the tree might have side effects as +we maintain several indices into the tree that also needs to be updated then. + +\stopsection + +\startsection[title={integration}] + +If you write a module that deals with \XML, for instance processing cals tables, +then you need ways to control specific behaviour. For instance, you might want to +add a background to the table. Such directives are collected in \XML\ files and +can be loaded on demand. + +\startxmlcmd {\cmdbasicsetup{xmlloaddirectives}} + loads \CONTEXT\ directives from \cmdinternal {cd:file} that will get + interpreted when processing documents +\stopxmlcmd + +A directives definition file looks as follows: + +\starttyping +<?xml version="1.0" standalone="yes"?> + +<directives> + <directive attribute='id' value="100" + setup="cdx:100"/> + <directive attribute='id' value="101" + setup="cdx:101"/> + <directive attribute='cdx' value="colors" element="cals:table" + setup="cdx:cals:table:colors"/> + <directive attribute='cdx' value="vertical" element="cals:table" + setup="cdx:cals:table:vertical"/> + <directive attribute='cdx' value="noframe" element="cals:table" + setup="cdx:cals:table:noframe"/> + <directive attribute='cdx' value="*" element="cals:table" + setup="cdx:cals:table:*"/> +</directives> +\stoptyping + +Examples of usage can be found in \type {x-cals.mkiv}. 
The directive is triggered +by an attribute. Instead of a setup you can specify a setup to be applied before +and after the node gets flushed. + +\startxmlcmd {\cmdbasicsetup{xmldirectives}} + apply the setups directive associated with the node +\stopxmlcmd + +\startxmlcmd {\cmdbasicsetup{xmldirectivesbefore}} + apply the before directives associated with the node +\stopxmlcmd + +\startxmlcmd {\cmdbasicsetup{xmldirectivesafter}} + apply the after directives associated with the node +\stopxmlcmd + +\startxmlcmd {\cmdbasicsetup{xmlinstalldirective}} + defines a directive that hooks into a handler +\stopxmlcmd + +Normally a directive will be put in the \XML\ file, for instance as: + +\starttyping +<?context-mathml-directive minus reduction yes ?> +\stoptyping + +Here the \type {mathml} is the general class of directives and \type {minus} a +subclass, in our case a specific element. + +\stopsection + +\startsection[title={setups}] + +The basic building blocks of \XML\ processing are setups. These are just +collections of macros that are expanded. These setups get one argument passed +(\type {#1}): + +\starttyping +\startxmlsetups somedoc:somesetup + \xmlflush{#1} +\stopxmlsetups +\stoptyping + +This argument is normally a number that internally refers to a specific node in +the \XML\ tree. The user should see it as an abstract reference and not depend on +its numeric property. Just think of it as \quote {the current node}. You can (and +probably will) call such setups using: + +\startxmlcmd {\cmdbasicsetup{xmlsetup}} + expands setup \cmdinternal {cd:setup} and pass \cmdinternal {cd:node} as + argument +\stopxmlcmd + +However, in most cases the setups are associated to specific elements, +something that users of \XSLT\ might recognize as templates. 
+ +\startxmlcmd {\cmdbasicsetup{xmlsetfunction}} + associates function \cmdinternal {cd:luafunction} to the elements in + namespace \cmdinternal {cd:name} that match \cmdinternal {cd:lpath} +\stopxmlcmd + +\startxmlcmd {\cmdbasicsetup{xmlsetsetup}} + associates setups \cmdinternal {cd:setup} (\TEX\ code) with the matching + nodes of \cmdinternal {cd:lpath} or root \cmdinternal {cd:node} +\stopxmlcmd + +\startxmlcmd {\cmdbasicsetup{xmlprependsetup}} + pushes \cmdinternal {cd:setup} to the front of global list of setups +\stopxmlcmd + +\startxmlcmd {\cmdbasicsetup{xmlappendsetup}} + adds \cmdinternal {cd:setup} to the global list of setups to be applied + (alias: \type{\xmlregistersetup}) +\stopxmlcmd + +\startxmlcmd {\cmdbasicsetup{xmlbeforesetup}} + pushes \cmdinternal {cd:setup} into the global list of setups; the + last setup is the position +\stopxmlcmd + +\startxmlcmd {\cmdbasicsetup{xmlaftersetup}} + adds \cmdinternal {cd:setup} to the global list of setups; the last setup + is the position +\stopxmlcmd + +\startxmlcmd {\cmdbasicsetup{xmlremovesetup}} + removes \cmdinternal {cd:setup} from the global list of setups +\stopxmlcmd + +\startxmlcmd {\cmdbasicsetup{xmlprependdocumentsetup}} + pushes \cmdinternal {cd:setup} to the front of list of setups to be applied + to \cmdinternal {cd:name} +\stopxmlcmd + +\startxmlcmd {\cmdbasicsetup{xmlappenddocumentsetup}} + adds \cmdinternal {cd:setup} to the list of setups to be applied to + \cmdinternal {cd:name} (you can also use the alias: \type + {\xmlregisterdocumentsetup}) +\stopxmlcmd + +\startxmlcmd {\cmdbasicsetup{xmlbeforedocumentsetup}} + pushes \cmdinternal {cd:setup} into the setups to be applied to \cmdinternal + {cd:name}; the last setup is the position +\stopxmlcmd + +\startxmlcmd {\cmdbasicsetup{xmlafterdocumentsetup}} + adds \cmdinternal {cd:setup} to the setups to be applied to \cmdinternal + {cd:name}; the last setup is the position +\stopxmlcmd + +\startxmlcmd {\cmdbasicsetup{xmlremovedocumentsetup}} + 
removes \cmdinternal {cd:setup} from the global list of setups to be applied + to \cmdinternal {cd:name} +\stopxmlcmd + +\startxmlcmd {\cmdbasicsetup{xmlresetsetups}} + removes all global setups +\stopxmlcmd + +\startxmlcmd {\cmdbasicsetup{xmlresetdocumentsetups}} + removes all setups from the \cmdinternal {cd:name} specific list of setups to + be applied +\stopxmlcmd + +\startxmlcmd {\cmdbasicsetup{xmlflushdocumentsetups}{setup}} + applies \cmdinternal {cd:setup} (can be a list) to \cmdinternal {cd:name} +\stopxmlcmd + +\startxmlcmd {\cmdbasicsetup{xmlregisteredsetups}} + applies all global setups to the current document +\stopxmlcmd + +\startxmlcmd {\cmdbasicsetup{xmlregistereddocumentsetups}} + applies all document specific \cmdinternal {cd:setup} to document + \cmdinternal {cd:name} +\stopxmlcmd + +\stopsection + +\startsection[title={testing}] + +The following test macros all take a \cmdinternal {cd:node} as first argument +and an \cmdinternal {cd:lpath} as second: + +\startxmlcmd {\cmdbasicsetup{xmldoif}} + expands to \cmdinternal {cd:true} when \cmdinternal {cd:lpath} matches at + node \cmdinternal {cd:node} +\stopxmlcmd + +\startxmlcmd {\cmdbasicsetup{xmldoifnot}} + expands to \cmdinternal {cd:true} when \cmdinternal {cd:lpath} does not match + at node \cmdinternal {cd:node} +\stopxmlcmd + +\startxmlcmd {\cmdbasicsetup{xmldoifelse}} + expands to \cmdinternal {cd:true} when \cmdinternal {cd:lpath} matches at + node \cmdinternal {cd:node} and to \cmdinternal {cd:false} otherwise +\stopxmlcmd + +\startxmlcmd {\cmdbasicsetup{xmldoiftext}} + expands to \cmdinternal {cd:true} when the node matching \cmdinternal + {cd:lpath} at node \cmdinternal {cd:node} has some content +\stopxmlcmd + +\startxmlcmd {\cmdbasicsetup{xmldoifnottext}} + expands to \cmdinternal {cd:true} when the node matching \cmdinternal + {cd:lpath} at node \cmdinternal {cd:node} has no content +\stopxmlcmd + +\startxmlcmd {\cmdbasicsetup{xmldoifelsetext}} + expands to \cmdinternal {cd:true} when 
the node matching \cmdinternal + {cd:lpath} at node \cmdinternal {cd:node} has content and to \cmdinternal + {cd:false} otherwise +\stopxmlcmd + +\startxmlcmd {\cmdbasicsetup{xmldoifatt}} + expands to \cmdinternal {cd:true} when the attribute matching \cmdinternal + {cd:node} and the name given as second argument matches the third argument +\stopxmlcmd + +\startxmlcmd {\cmdbasicsetup{xmldoifnotatt}} + expands to \cmdinternal {cd:true} when the attribute matching \cmdinternal + {cd:node} and the name given as second argument differs from the third + argument +\stopxmlcmd + +\startxmlcmd {\cmdbasicsetup{xmldoifelseatt}} + expands to \cmdinternal {cd:true} when the attribute matching \cmdinternal + {cd:node} and the name given as second argument matches the third argument + and to \cmdinternal {cd:false} otherwise +\stopxmlcmd + +\startxmlcmd {\cmdbasicsetup{xmldoifelseempty}} + expands to \cmdinternal {cd:true} when the node matching \cmdinternal + {cd:lpath} at node \cmdinternal {cd:node} is empty and to \cmdinternal + {cd:false} otherwise +\stopxmlcmd + +\startxmlcmd {\cmdbasicsetup{xmldoifelseselfempty}} + expands to \cmdinternal {cd:true} when the node is empty and to \cmdinternal + {cd:false} otherwise +\stopxmlcmd + +\startxmlcmd {\cmdbasicsetup{xmldoifselfempty}} + expands to \cmdinternal {cd:true} when \cmdinternal {cd:node} is empty +\stopxmlcmd + +\startxmlcmd {\cmdbasicsetup{xmldoifnotselfempty}} + expands to \cmdinternal {cd:true} when \cmdinternal {cd:node} is not empty +\stopxmlcmd + +\stopsection + +\startsection[title={initialization}] + +The general setup command (not to be confused with setups) that deals with the +\MKIV\ tree handler is \type {\setupxml}. There are currently only a few options. + +\cmdfullsetup{setupxml} + +When you set \type {default} to \cmdinternal {cd:text} elements with no setup +assigned will end up as text. When set to \type {hidden} such elements will be +hidden. 
You can apply the default yourself using: + +\startxmlcmd {\cmdbasicsetup{xmldefaulttotext}} + presets the tree with root \cmdinternal {cd:node} to the handlers set up with + \type {\setupxml} option \cmdinternal{default} +\stopxmlcmd + +You can set \type {compress} to \type {yes} in which case comment is stripped +from the tree when the file is read. + +\startxmlcmd {\cmdbasicsetup{xmlregisterns}} + associates an internal namespace (like \type {mml}) with one given in the + document as \URL\ (like mathml) +\stopxmlcmd + +\startxmlcmd {\cmdbasicsetup{xmlremapname}} + changes the namespace and tag of the matching elements +\stopxmlcmd + +\startxmlcmd {\cmdbasicsetup{xmlremapnamespace}} + replaces all references to the given namespace to a new one (applied + recursively) +\stopxmlcmd + +\startxmlcmd {\cmdbasicsetup{xmlchecknamespace}} + sets the namespace of the matching elements unless a namespace is already set +\stopxmlcmd + +\stopsection + +\startsection[title={helpers}] + +Often an attribute will determine the rendering and this may result in many +tests. Especially when we have multiple attributes that control the output such +tests can become rather extensive and redundant because one gets $n\times m$ or +more such tests. + +Therefore we have a convenient way to map attributes onto for instance strings or +commands. + +\startxmlcmd {\cmdbasicsetup{xmlmapvalue}} + associate a \cmdinternal {cd:text} with a \cmdinternal {cd:category} and + \cmdinternal {cd:name} (alias: \type{\xmlmapval}) +\stopxmlcmd + +\startxmlcmd {\cmdbasicsetup{xmlvalue}} + expand the value associated with a \cmdinternal {cd:category} and + \cmdinternal {cd:name} and if not resolved, expand to the \cmdinternal + {cd:text} (alias: \type{\xmlval}) +\stopxmlcmd + +\startxmlcmd {\cmdbasicsetup{xmldoifelsevalue}} + associate a \cmdinternal {cd:text} with a \cmdinternal {cd:category} and + \cmdinternal {cd:name} +\stopxmlcmd + +This is used as follows. 
We define a couple of mappings in the same category: + +\starttyping +\xmlmapvalue{emph}{bold} {\bf} +\xmlmapvalue{emph}{italic}{\it} +\stoptyping + +Assuming that we have associated the following setup with the \type {emph} +element, we can say (with \type {#1} being the current element): + +\starttyping +\startxmlsetups demo:emph + \begingroup + \xmlvalue{emph}{\xmlatt{#1}{type}}{} + \endgroup +\stopxmlsetups +\stoptyping + +In this case we have no default. The \type {type} attribute triggers the actions, +as in: + +\starttyping +normal <emph type='bold'>bold</emph> normal +\stoptyping + +This mechanism is not really bound to elements and attributes so you can use this +mechanism for other purposes as well. + +\stopsection + +\startsection[title={Parameters}] + +\startbuffer[test] +<something whatever="alpha"> + <what> + beta + </what> +</something> +\stopbuffer + +\startbuffer +\startxmlsetups xml:mysetups + \xmlsetsetup{\xmldocument}{*}{xml:*} +\stopxmlsetups + +\xmlregistersetup{xml:mysetups} + +\startxmlsetups xml:something + parameter : \xmlpar {#1}{whatever}\par + attribute : \xmlatt {#1}{whatever}\par + text : \xmlfirst {#1}{what} \par + \xmlsetpar{#1}{whatever}{gamma} + parameter : \xmlpar {#1}{whatever}\par + \xmlflush{#1} +\stopxmlsetups + +\startxmlsetups xml:what + what: \xmlflush{#1}\par + parameter : \xmlparam{#1}{..}{whatever}\par +\stopxmlsetups + +\xmlprocessbuffer{main}{test}{} +\stopbuffer + +Say that we have this \XML\ blob: + +\typebuffer[test] + +With: + +\typebuffer + +we get: + +\getbuffer + +Parameters are stored with a node. 
+ +\startxmlcmd {\cmdbasicsetup{xmlpar}} + returns the value of parameter \cmdinternal {cd:name} or empty if no such + parameter exists +\stopxmlcmd + +\startxmlcmd {\cmdbasicsetup{xmlparam}} + finds a first match for \cmdinternal {cd:lpath} at \cmdinternal {cd:node} and + returns the value of parameter \cmdinternal {cd:name} or empty if no such + parameter exists +\stopxmlcmd + +\startxmlcmd {\cmdbasicsetup{xmllastpar}} + returns the last parameter found (this avoids a lookup) +\stopxmlcmd + +\startxmlcmd {\cmdbasicsetup{xmlsetpar}} + set the value of parameter \cmdinternal {cd:name} +\stopxmlcmd + +\startxmlcmd {\cmdbasicsetup{xmlsetparam}} + set the value of parameter \cmdinternal {cd:name} for each match of \cmdinternal + {cd:lpath} +\stopxmlcmd + +\stopsection + +\stopchapter + +\stopcomponent diff --git a/doc/context/sources/general/manuals/xml/xml-mkiv-contents.tex b/doc/context/sources/general/manuals/xml/xml-mkiv-contents.tex new file mode 100644 index 000000000..e0787ec5f --- /dev/null +++ b/doc/context/sources/general/manuals/xml/xml-mkiv-contents.tex @@ -0,0 +1,12 @@ +\environment xml-mkiv-style + +\startcomponent xml-mkiv-contents + +\starttitle[title=Contents] + +\placelist + [chapter,section] + +\stoptitle + +\stopcomponent diff --git a/doc/context/sources/general/manuals/xml/xml-mkiv-converter.tex b/doc/context/sources/general/manuals/xml/xml-mkiv-converter.tex new file mode 100644 index 000000000..a457f962b --- /dev/null +++ b/doc/context/sources/general/manuals/xml/xml-mkiv-converter.tex @@ -0,0 +1,300 @@ +\environment xml-mkiv-style + +\startcomponent xml-mkiv-converter + +\startchapter[title={Setting up a converter}] + +\startsection[title={from structure to setup}] + +We use a very simple document structure for demonstrating how a converter is +defined. 
In practice a mapping will be more complex, especially when we have a +style with complex chapter openings using data coming from all kinds of places, +different styling of sections with the same name, selectively (out of order) +flushed content, special formatting, etc. + +\typefile{manual-demo-1.xml} + +Say that this document is stored in the file \type {demo.xml}, then the following +code can be used as starting point: + +\starttyping +\startxmlsetups xml:demo:base + \xmlsetsetup{#1}{document|section|p}{xml:demo:*} +\stopxmlsetups + +\xmlregisterdocumentsetup{demo}{xml:demo:base} + +\startxmlsetups xml:demo:document + \starttitle[title={Contents}] + \placelist[chapter] + \stoptitle + \xmlflush{#1} +\stopxmlsetups + +\startxmlsetups xml:demo:section + \startchapter[title=\xmlfirst{#1}{/title}] + \xmlfirst{#1}{/content} + \stopchapter +\stopxmlsetups + +\startxmlsetups xml:demo:p + \xmlflush{#1}\endgraf +\stopxmlsetups + +\xmlprocessfile{demo}{demo.xml}{} +\stoptyping + +Watch out! These are not just setups, but specific \XML\ setups which get an +argument passed (the \type {#1}). If for some reason your \XML\ processing fails, +it might be that you mistakenly have used a normal setup definition. The argument +\type {#1} represents the current node (element) and is a unique identifier. For +instance a \type {<p>..</p>} can have an identifier \type {demo::5}. So, we can get +something: + +\starttyping +\xmlflush{demo::5}\endgraf +\stoptyping + +but as well: + +\starttyping +\xmlflush{demo::6}\endgraf +\stoptyping + +Keep in mind that the references to the actual nodes (elements) are +abstractions: you never see those \type {<id>::<number>}'s, because we will use +either the abstract \type {#1} (any node) or an explicit reference like \type +{demo}.
The previous setup, when issued, will look like: + +\starttyping +\startchapter[title=\xmlfirst{demo::3}{/title}] + \xmlfirst{demo::3}{/content} +\stopchapter +\stoptyping + +Here the \type {title} is used to typeset the chapter title but also for an entry +in the table of contents. At the moment the title is typeset the \XML\ node gets +looked up and expanded into real text. However, for the list it gets stored for +later use. One can argue that this is not needed for \XML, because one can just +filter all the titles and use page references, but then one also loses the +control one normally has over such titles. For instance it can be that some +titles are rendered differently and for that we need to keep track of usage. +Doing that with transformations or filtering is often more complex than leaving +that to \TEX. As soon as the list gets typeset, the reference (\type {demo::3}) +is used for the lookup. This is because by default the title is stored as given. +So, as long as we make sure the \XML\ source is loaded before the table of +contents is typeset we're ok. Later we will look into this in more detail, for +now it's enough to know that in most cases the abstract \type {#1} reference will +work out ok. + +Contrary to the style definitions this interface looks rather low level (with no +optional arguments) and the main reason for this is that we want processing to be +fast. So, the basic framework is: + +\starttyping +\startxmlsetups xml:demo:base + % associate setups with elements +\stopxmlsetups + +\xmlregisterdocumentsetup{demo}{xml:demo:base} + +% define setups for matches + +\xmlprocessfile{demo}{demo.xml}{} +\stoptyping + +In this example we mostly just flush the content of an element and in the case of +a section we flush explicit child elements. The \type {#1} in the example code +represents the current element. The line: + +\starttyping +\xmlsetsetup{demo}{*}{-} +\stoptyping + +sets the default for each element to \quote {just ignore it}.
A \type {+} would +make the default to always flush the content. This means that at this point we +only handle: + +\starttyping +<section> + <title>Some title</title> + <content> + <p>a paragraph of text</p> + </content> +</section> +\stoptyping + +In the next section we will deal with the slightly more complex itemize and +figure placement. At first sight all these setups may look overkill but keep in +mind that normally the number of elements is rather limited. The complexity is +often in the style and having access to each snippet of content is actually +quite handy for that. + +\stopsection + +\startsection[title={alternative solutions}] + +Dealing with an itemize is rather simple (as long as we forget about +attributes that control the behaviour): + +\starttyping +<itemize> + <item>first</item> + <item>second</item> +</itemize> +\stoptyping + +First we need to add \type {itemize} to the setup assignment (unless we've used +the wildcard \type {*}): + +\starttyping +\xmlsetsetup{demo}{document|section|p|itemize}{xml:demo:*} +\stoptyping + +The setup can look like: + +\starttyping +\startxmlsetups xml:demo:itemize + \startitemize + \xmlfilter{#1}{/item/command(xml:demo:itemize:item)} + \stopitemize +\stopxmlsetups + +\startxmlsetups xml:demo:itemize:item + \startitem + \xmlflush{#1} + \stopitem +\stopxmlsetups +\stoptyping + +An alternative is to map item directly: + +\starttyping +\xmlsetsetup{demo}{document|section|p|itemize|item}{xml:demo:*} +\stoptyping + +and use: + +\starttyping +\startxmlsetups xml:demo:itemize + \startitemize + \xmlflush{#1} + \stopitemize +\stopxmlsetups + +\startxmlsetups xml:demo:item + \startitem + \xmlflush{#1} + \stopitem +\stopxmlsetups +\stoptyping + +Sometimes, a more local solution using filters and \type {/command(...)} makes more +sense, especially when the \type {item} tag is used for other purposes as well. + +Explicit flushing with \type {command} is definitely the way to go when you have +complex products. 
In one of our projects we compose math school books from many +thousands of small \XML\ files, and from one source set several products are +typeset. Within a book sections get done differently, content gets used, ignored +or interpreted differently depending on the kind of content, so there is a +constant checking of attributes that drive the rendering. In that a generic setup +for a title element makes less sense than explicit ones for each case. (We're +talking of huge amounts of files here, including multiple images on each rendered +page.) + +When using \type {command} you can pass two arguments, the first is the setup for +the match, the second one for the miss, as in: + +\starttyping +\xmlfilter{#1}{/element/command(xml:true,xml:false)} +\stoptyping + +Back to the example, this leaves us with dealing with the resources, like +figures: + +\starttyping +<resource type='figure'> + <caption>A picture of a cow.</caption> + <content><external file="cow.pdf"/></content> +</resource> +\stoptyping + +Here we can use a more restricted match: + +\starttyping +\xmlsetsetup{demo}{resource[@type='figure']}{xml:demo:figure} +\xmlsetsetup{demo}{external}{xml:demo:*} +\stoptyping + +and the definitions: + +\starttyping +\startxmlsetups xml:demo:figure + \placefigure + {\xmlfirst{#1}{/caption}} + {\xmlfirst{#1}{/content}} +\stopxmlsetups + +\startxmlsetups xml:demo:external + \externalfigure[\xmlatt{#1}{file}] +\stopxmlsetups +\stoptyping + +At this point it is good to notice that \type {\xmlatt{#1}{file}} is passed as it +is: a macro call. This means that when a macro like \type {\externalfigure} uses +the first argument frequently without first storing its value, the lookup is done +several times. 
A solution for this is: + +\starttyping +\startxmlsetups xml:demo:external + \expanded{\externalfigure[\xmlatt{#1}{file}]} +\stopxmlsetups +\stoptyping + +Because the lookup is rather fast, normally there is no need to bother about this +too much because internally \CONTEXT\ already makes sure such expansion happens +only once. + +An alternative definition for placement is the following: + +\starttyping +\xmlsetsetup{demo}{resource}{xml:demo:resource} +\stoptyping + +with: + +\starttyping +\startxmlsetups xml:demo:resource + \placefloat + [\xmlatt{#1}{type}] + {\xmlfirst{#1}{/caption}} + {\xmlfirst{#1}{/content}} +\stopxmlsetups +\stoptyping + +This way you can specify \type {table} as type too. Because you can define your +own float types, more complex variants are also possible. In that case it makes +sense to provide some default behaviour too: + +\starttyping +\definefloat[figure-here][figure][default=here] +\definefloat[figure-left][figure][default=left] +\definefloat[table-here] [table] [default=here] +\definefloat[table-left] [table] [default=left] + +\startxmlsetups xml:demo:resource + \placefloat + [\xmlattdef{#1}{type}{figure}-\xmlattdef{#1}{location}{here}] + {\xmlfirst{#1}{/caption}} + {\xmlfirst{#1}{/content}} +\stopxmlsetups +\stoptyping + +In this example we support two types and two locations. We default to a figure +placed (when possible) at the current location. 
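+
+To see how the last setup resolves, consider the following sketch. The \type
+{type} and \type {location} attributes are the ones read by the setup above; the
+caption text and the file name \type {hypothetical-table.pdf} are made up for
+illustration only:
+
+\starttyping
+<resource type='table' location='left'>
+  <caption>A table placed at the left.</caption>
+  <content><external file="hypothetical-table.pdf"/></content>
+</resource>
+\stoptyping
+
+Here \type {\xmlattdef} yields \type {table} and \type {left}, so the float
+instance \type {table-left} is used. A resource that has neither attribute
+falls back to \type {figure-here}. A combination that has not been defined with
+\type {\definefloat} is not covered by this sketch.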
+ +\stopsection + +\stopchapter + +\stopcomponent diff --git a/doc/context/sources/general/manuals/xml/xml-mkiv-examples.tex b/doc/context/sources/general/manuals/xml/xml-mkiv-examples.tex new file mode 100644 index 000000000..064510d6d --- /dev/null +++ b/doc/context/sources/general/manuals/xml/xml-mkiv-examples.tex @@ -0,0 +1,948 @@ +\environment xml-mkiv-style + +\startcomponent xml-mkiv-examples + +\startchapter[title=Examples] + +\startsection[title=attribute chains] + +In \CSS, when an attribute is not present, the parent element is checked, and when +not found again, the lookup follows the chain till a match is found or the root is +reached. The following example demonstrates how such a chain lookup works. + +\startbuffer[test] +<something mine="1" test="one" more="alpha"> + <whatever mine="2" test="two"> + <whocares mine="3"> + <!-- this is a test --> + </whocares> + </whatever> +</something> +\stopbuffer + +\typebuffer[test] + +We apply the following setups to this tree: + +\startbuffer[setups] +\startxmlsetups xml:common + [ + \xmlchainatt{#1}{mine}, + \xmlchainatt{#1}{test}, + \xmlchainatt{#1}{more}, + \xmlchainatt{#1}{none} + ]\par +\stopxmlsetups + +\startxmlsetups xml:something + something: \xmlsetup{#1}{xml:common} + \xmlflush{#1} +\stopxmlsetups + +\startxmlsetups xml:whatever + whatever: \xmlsetup{#1}{xml:common} + \xmlflush{#1} +\stopxmlsetups + +\startxmlsetups xml:whocares + whocares: \xmlsetup{#1}{xml:common} + \xmlflush{#1} +\stopxmlsetups + +\startxmlsetups xml:mysetups + \xmlsetsetup{#1}{something|whatever|whocares}{xml:*} +\stopxmlsetups + +\xmlregisterdocumentsetup{example-1}{xml:mysetups} + +\xmlprocessbuffer{example-1}{test}{} +\stopbuffer + +\typebuffer[setups] + +This gives: + +\start + \getbuffer[setups] +\stop + +\stopsection + +\startsection[title=conditional setups] + +Say that we have this code: + +\starttyping +\xmldoifelse {#1} {/what[@a='1']} { + \xmlfilter {#1} {/what/command('xml:yes')} +} { + \xmlfilter {#1} 
{/what/command('xml:nop')} +} +\stoptyping + +Here we first determine if there is a child \type {what} with attribute \type {a} +set to \type {1}. Depending on the outcome, we again check the child nodes for +being named \type {what}. A faster solution which also takes less code is this: + +\starttyping +\xmlfilter {#1} {/what[@a='1']/command('xml:yes','xml:nop')} +\stoptyping + +\stopsection + +\startsection[title=manipulating] + +Assume that we have the following \XML\ data: + +\startbuffer[test] +<A> + <B>right</B> + <B>wrong</B> +</A> +\stopbuffer + +\typebuffer[test] + +But, instead of \type {right} we want to see \type {okay}. We can do that with a +finalizer: + +\startbuffer +\startluacode +local rehash = { + ["right"] = "okay", +} + +function xml.finalizers.tex.Okayed(collected,what) + for i=1,#collected do + local str = xml.text(collected[i]) + if what == "all" then + context(rehash[str] or str) + else + context(str) + end + end +end +\stopluacode +\stopbuffer + +\typebuffer \getbuffer + +\startbuffer +\startxmlsetups xml:A + \xmlflush{#1} +\stopxmlsetups + +\startxmlsetups xml:B + (It's \xmlfilter{#1}{./Okayed("all")}) +\stopxmlsetups + +\startxmlsetups xml:testsetups + \xmlsetsetup{#1}{A|B}{xml:*} +\stopxmlsetups + +\xmlregisterdocumentsetup{example-2}{xml:testsetups} +\xmlprocessbuffer{example-2}{test}{} +\stopbuffer + +\typebuffer + +The result is: \start \inlinebuffer \stop + +\stopsection + +\startsection[title=cross referencing] + +A rather common way to add cross references to \XML\ files is to borrow the +asymmetrical id's from \HTML. This means that one cannot simply use a value +of (say) \type {href} to locate an \type {id}. The next example came up on +the \CONTEXT\ mailing list.
+ +\startbuffer[test] +<doc> + <p>Text + <a href="#fn1" class="footnoteref" id="fnref1"><sup>1</sup></a> and + <a href="#fn2" class="footnoteref" id="fnref2"><sup>2</sup></a> + </p> + <div class="footnotes"> + <hr /> + <ol> + <li id="fn1"><p>A footnote.<a href="#fnref1">↩</a></p></li> + <li id="fn2"><p>A second footnote.<a href="#fnref2">↩</a></p></li> + </ol> + </div> +</doc> +\stopbuffer + +\typebuffer[test] + +We give two variants for dealing with such references. The first solution does +lookups and, depending on the size of the file, can be somewhat inefficient. + +\startbuffer +\startxmlsetups xml:doc + \blank + \xmlflush{#1} + \blank +\stopxmlsetups + +\startxmlsetups xml:p + \xmlflush{#1} +\stopxmlsetups + +\startxmlsetups xml:footnote + (variant 1)\footnote + {\xmlfirst + {example-3-1} + {div[@class='footnotes']/ol/li[@id='\xmlrefatt{#1}{href}']}} +\stopxmlsetups + +\startxmlsetups xml:initialize + \xmlsetsetup{#1}{p|doc}{xml:*} + \xmlsetsetup{#1}{a[@class='footnoteref']}{xml:footnote} + \xmlsetsetup{#1}{div[@class='footnotes']}{xml:nothing} +\stopxmlsetups + +\xmlresetdocumentsetups{*} +\xmlregisterdocumentsetup{example-3-1}{xml:initialize} + +\xmlprocessbuffer{example-3-1}{test}{} +\stopbuffer + +\typebuffer + +This will typeset two footnotes. + +\getbuffer + +The second variant collects the references so that the time spent on lookups is +less.
+ +\startbuffer +\startxmlsetups xml:doc + \blank + \xmlflush{#1} + \blank +\stopxmlsetups + +\startxmlsetups xml:p + \xmlflush{#1} +\stopxmlsetups + +\startluacode + userdata.notes = {} +\stopluacode + +\startxmlsetups xml:collectnotes + \ctxlua{userdata.notes['\xmlrefatt{#1}{id}'] = '#1'} +\stopxmlsetups + +\startxmlsetups xml:footnote + (variant 2)\footnote + {\xmlflush + {\cldcontext{userdata.notes['\xmlrefatt{#1}{href}']}}} +\stopxmlsetups + +\startxmlsetups xml:initialize + \xmlsetsetup{#1}{p|doc}{xml:*} + \xmlsetsetup{#1}{a[@class='footnoteref']}{xml:footnote} + \xmlfilter{#1}{div[@class='footnotes']/ol/li/command(xml:collectnotes)} + \xmlsetsetup{#1}{div[@class='footnotes']}{} +\stopxmlsetups + +\xmlregisterdocumentsetup{example-3-2}{xml:initialize} + +\xmlprocessbuffer{example-3-2}{test}{} +\stopbuffer + +\typebuffer + +This will again typeset two footnotes: + +\getbuffer + +\stopsection + +\startsection[title=mapping values] + +One way to process options \type {frame} in the example below is to map the +values to values known by \CONTEXT. 
+ +\startbuffer[test] +<a> + <nattable frame="on"> + <tr><td>#1</td><td>#2</td><td>#3</td><td>#4</td></tr> + <tr><td>#5</td><td>#6</td><td>#7</td><td>#8</td></tr> + </nattable> + <nattable frame="off"> + <tr><td>#1</td><td>#2</td><td>#3</td><td>#4</td></tr> + <tr><td>#5</td><td>#6</td><td>#7</td><td>#8</td></tr> + </nattable> + <nattable frame="no"> + <tr><td>#1</td><td>#2</td><td>#3</td><td>#4</td></tr> + <tr><td>#5</td><td>#6</td><td>#7</td><td>#8</td></tr> + </nattable> +</a> +\stopbuffer + +\typebuffer[test] + +\startbuffer +\startxmlsetups xml:a + \xmlflush{#1} +\stopxmlsetups + +\xmlmapvalue {nattable:frame} {on} {on} +\xmlmapvalue {nattable:frame} {yes} {on} +\xmlmapvalue {nattable:frame} {off} {off} +\xmlmapvalue {nattable:frame} {no} {off} + +\startxmlsetups xml:nattable + \startplacetable[title=#1] + \setupTABLE[frame=\xmlval{nattable:frame}{\xmlatt{#1}{frame}}{on}]% + \bTABLE + \xmlflush{#1} + \eTABLE + \stopplacetable +\stopxmlsetups + +\startxmlsetups xml:tr + \bTR + \xmlflush{#1} + \eTR +\stopxmlsetups + +\startxmlsetups xml:td + \bTD + \xmlflush{#1} + \eTD +\stopxmlsetups + +\startxmlsetups xml:testsetups + \xmlsetsetup{example-4}{a|nattable|tr|td|}{xml:*} +\stopxmlsetups + +\xmlregisterdocumentsetup{example-4}{xml:testsetups} + +\xmlprocessbuffer{example-4}{test}{} +\stopbuffer + +The \type {\xmlmapvalue} mechanism is rather efficient and involves a minimum +of testing. + +\typebuffer + +We get: + +\getbuffer + +\stopsection + +\startsection[title=using \LUA] + +In this example we demonstrate how you can delegate rendering to \LUA. We +will construct a so called extreme table. 
The input is: + +\startbuffer[demo] +<?xml version="1.0" encoding="utf-8"?> + +<a> + <b> <c>1</c> <d>Text</d> </b> + <b> <c>2</c> <d>More text</d> </b> + <b> <c>2</c> <d>Even more text</d> </b> + <b> <c>2</c> <d>And more</d> </b> + <b> <c>3</c> <d>And even more</d> </b> + <b> <c>2</c> <d>The last text</d> </b> +</a> +\stopbuffer + +\typebuffer[demo] + +The processor code is: + +\startbuffer[process] +\startxmlsetups xml:test_setups + \xmlsetsetup{#1}{a|b|c|d}{xml:*} +\stopxmlsetups + +\xmlregisterdocumentsetup{example-5}{xml:test_setups} + +\xmlprocessbuffer{example-5}{demo}{} +\stopbuffer + +\typebuffer[process] + +We color a sequence of the same titles (numbers here) differently. The first +solution remembers the last title: + +\startbuffer +\startxmlsetups xml:a + \startembeddedxtable + \xmlflush{#1} + \stopembeddedxtable +\stopxmlsetups + +\startxmlsetups xml:b + \xmlfunction{#1}{test_ba} +\stopxmlsetups + +\startluacode +local lasttitle = nil + +function xml.functions.test_ba(t) + local title = xml.text(t, "/c") + local content = xml.text(t, "/d") + context.startxrow() + context.startxcell { + background = "color", + backgroundcolor = lasttitle == title and "colorone" or "colortwo", + foregroundstyle = "bold", + foregroundcolor = "white", + } + context(title) + lasttitle = title + context.stopxcell() + context.startxcell() + context(content) + context.stopxcell() + context.stopxrow() +end +\stopluacode +\stopbuffer + +\typebuffer \getbuffer + +The \type {embeddedxtable} environment is needed because the table is picked up +as argument. + +\startlinecorrection \getbuffer[process] \stoplinecorrection + +The second implementation remembers which titles are already processed so here we +can color the last one too.
+ +\startbuffer +\startxmlsetups xml:a + \ctxlua{xml.functions.reset_bb()} + \startembeddedxtable + \xmlflush{#1} + \stopembeddedxtable +\stopxmlsetups + +\startxmlsetups xml:b + \xmlfunction{#1}{test_bb} +\stopxmlsetups + +\startluacode +local titles + +function xml.functions.reset_bb(t) + titles = { } +end + +function xml.functions.test_bb(t) + local title = xml.text(t, "/c") + local content = xml.text(t, "/d") + context.startxrow() + context.startxcell { + background = "color", + backgroundcolor = titles[title] and "colorone" or "colortwo", + foregroundstyle = "bold", + foregroundcolor = "white", + } + context(title) + titles[title] = true + context.stopxcell() + context.startxcell() + context(content) + context.stopxcell() + context.stopxrow() +end +\stopluacode +\stopbuffer + +\typebuffer \getbuffer + +\startlinecorrection \getbuffer[process] \stoplinecorrection + +A solution without any state variable is given below. + +\startbuffer +\startxmlsetups xml:a + \startembeddedxtable + \xmlflush{#1} + \stopembeddedxtable +\stopxmlsetups + +\startxmlsetups xml:b + \xmlfunction{#1}{test_bc} +\stopxmlsetups + +\startluacode +function xml.functions.test_bc(t) + local title = xml.text(t, "/c") + local content = xml.text(t, "/d") + context.startxrow() + local okay = xml.text(t,"./preceding-sibling::/[-1]") == title + context.startxcell { + background = "color", + backgroundcolor = okay and "colorone" or "colortwo", + foregroundstyle = "bold", + foregroundcolor = "white", + } + context(title) + context.stopxcell() + context.startxcell() + context(content) + context.stopxcell() + context.stopxrow() +end +\stopluacode +\stopbuffer + +\typebuffer \getbuffer + +\startlinecorrection \getbuffer[process] \stoplinecorrection + +Here is a solution that delegates even more to \LUA. The previous variants were +actually not that safe with respect to special characters and didn't handle +nested elements either but the next one does.
+ +\startbuffer[demo] +<?xml version="1.0" encoding="utf-8"?> + +<a> + <b> <c>#1</c> <d>Text</d> </b> + <b> <c>#2</c> <d>More text</d> </b> + <b> <c>#2</c> <d>Even more text</d> </b> + <b> <c>#2</c> <d>And more</d> </b> + <b> <c>#3</c> <d>And even more</d> </b> + <b> <c>#2</c> <d>Something <i>nested</i> </d> </b> +</a> +\stopbuffer + +\typebuffer[demo] + +We also need to map the \type {i} element. + +\startbuffer +\startxmlsetups xml:a + \starttexcode + \xmlfunction{#1}{test_a} + \stoptexcode +\stopxmlsetups + +\startxmlsetups xml:c + \xmlflush{#1} +\stopxmlsetups + +\startxmlsetups xml:d + \xmlflush{#1} +\stopxmlsetups + +\startxmlsetups xml:i + {\em\xmlflush{#1}} +\stopxmlsetups + +\startluacode +function xml.functions.test_a(t) + context.startxtable() + local previous = false + for b in xml.collected(lxml.getid(t),"/b") do + context.startxrow() + local current = xml.text(b,"/c") + context.startxcell { + background = "color", + backgroundcolor = (previous == current) and "colorone" or "colortwo", + foregroundstyle = "bold", + foregroundcolor = "white", + } + lxml.first(b,"/c") + context.stopxcell() + context.startxcell() + lxml.first(b,"/d") + context.stopxcell() + previous = current + context.stopxrow() + end + context.stopxtable() +end +\stopluacode + +\startxmlsetups xml:test_setups + \xmlsetsetup{#1}{a|b|c|d|i}{xml:*} +\stopxmlsetups + +\xmlregisterdocumentsetup{example-5}{xml:test_setups} + +\xmlprocessbuffer{example-5}{demo}{} +\stopbuffer + +\typebuffer + +\startlinecorrection \getbuffer \stoplinecorrection + +The question is, do we really need \LUA ? Often we don't, apart maybe from an +occasional special finalizer. 
A pure \TEX\ solution is given next: + +\startbuffer +\startxmlsetups xml:a + \glet\MyPreviousTitle\empty + \glet\MyCurrentTitle \empty + \startembeddedxtable + \xmlflush{#1} + \stopembeddedxtable +\stopxmlsetups + +\startxmlsetups xml:b + \startxrow + \xmlflush{#1} + \stopxrow +\stopxmlsetups + +\startxmlsetups xml:c + \xdef\MyCurrentTitle{\xmltext{#1}{.}} + \doifelse {\MyPreviousTitle} {\MyCurrentTitle} { + \startxcell + [background=color, + backgroundcolor=colorone, + foregroundstyle=bold, + foregroundcolor=white] + } { + \glet\MyPreviousTitle\MyCurrentTitle + \startxcell + [background=color, + backgroundcolor=colortwo, + foregroundstyle=bold, + foregroundcolor=white] + } + \xmlflush{#1} + \stopxcell +\stopxmlsetups + +\startxmlsetups xml:d + \startxcell + \xmlflush{#1} + \stopxcell +\stopxmlsetups + +\startxmlsetups xml:i + {\em\xmlflush{#1}} +\stopxmlsetups + +\startxmlsetups xml:test_setups + \xmlsetsetup{#1}{*}{xml:*} +\stopxmlsetups + +\xmlregisterdocumentsetup{example-5}{xml:test_setups} + +\xmlprocessbuffer{example-5}{demo}{} +\stopbuffer + +\typebuffer + +\startlinecorrection \getbuffer \stoplinecorrection + +You can even save a few lines of code: + +\starttyping +\startxmlsetups xml:c + \xdef\MyCurrentTitle{\xmltext{#1}{.}} + \startxcell + [background=color, + backgroundcolor=color\ifx\MyPreviousTitle\MyCurrentTitle one\else two\fi, + foregroundstyle=bold, + foregroundcolor=white] + \xmlflush{#1} + \stopxcell + \glet\MyPreviousTitle\MyCurrentTitle +\stopxmlsetups +\stoptyping + +Or if you prefer: + +\starttyping +\startxmlsetups xml:c + \xdef\MyCurrentTitle{\xmltext{#1}{.}} + \doifelse {\MyPreviousTitle} {\MyCurrentTitle} { + \xmlsetup{#1}{xml:c:one} + } { + \xmlsetup{#1}{xml:c:two} + } +\stopxmlsetups + +\startxmlsetups xml:c:one + \startxcell + [background=color, + backgroundcolor=colorone, + foregroundstyle=bold, + foregroundcolor=white] + \xmlflush{#1} + \stopxcell +\stopxmlsetups + +\startxmlsetups xml:c:two + \startxcell + [background=color, + 
backgroundcolor=colortwo, + foregroundstyle=bold, + foregroundcolor=white] + \xmlflush{#1} + \stopxcell + \global\let\MyPreviousTitle\MyCurrentTitle +\stopxmlsetups +\stoptyping + +These examples demonstrate that it doesn't hurt to know a little bit of \TEX\ +programming: defining macros and basic comparisons can come in handy. There are +examples in the test suite, you can peek in the source code, you can consult +the wiki or you can just ask on the list. + +\stopsection + +\startsection[title=last match] + +For the next example we use the following \XML\ input: + +\startbuffer[demo] +<?xml version="1.0"?> +<document> + <section id="1"> + <content> + <p>first</p> + <p>second</p> + </content> + </section> + <section id="2"> + <content> + <p>third</p> + <p>fourth</p> + </content> + </section> +</document> +\stopbuffer + +\typebuffer[demo] + +If you check if some element is present and then act accordingly, you can +end up doing the same lookup twice. Although it might sound inefficient, +in practice it's often not measurable.
+ +\startbuffer +\startxmlsetups xml:demo:document + \type{\xmlall{#1}{/section[@id='2']/content/p}}\par + \xmldoif{#1}{/section[@id='2']/content/p} { + \xmlall{#1}{/section[@id='2']/content/p} + } + \type{\xmllastmatch}\par + \xmldoif{#1}{/section[@id='2']/content/p} { + \xmllastmatch + } + \type{\xmlall{#1}{last-match::}}\par + \xmldoif{#1}{/section[@id='2']/content/p} { + \xmlall{#1}{last-match::} + } + \type{\xmlfilter{#1}{last-match::/command(xml:demo:p)}}\par + \xmldoif{#1}{/section[@id='2']/content/p} { + \xmlfilter{#1}{last-match::/command(xml:demo:p)} + } +\stopxmlsetups + +\startxmlsetups xml:demo:p + \quad\xmlflush{#1}\endgraf +\stopxmlsetups + +\startxmlsetups xml:demo:base + \xmlsetsetup{#1}{document|p}{xml:demo:*} +\stopxmlsetups + +\xmlregisterdocumentsetup{example-6}{xml:demo:base} + +\xmlprocessbuffer{example-6}{demo}{} +\stopbuffer + +\typebuffer + +In the second check we just flush the last match, so effectively we do an \type +{\xmlall} here. The third and fourth alternatives demonstrate how we can use +\type {last-match} as an axis. The gain is 10\% or more on the lookup but of course +typesetting often takes relatively more time than the lookup. + +\startpacked +\getbuffer +\stoppacked + +\stopsection + +\startsection[title=Finalizers] + +The \XML\ parser is also available outside \TEX. Here is an example of its usage. +We pipe the result to \TEX\ but you can do with \type {t} whatever you like.
+ +\startbuffer +local x = xml.load("manual-demo-1.xml") +local t = { } + +for c in xml.collected(x,"//*") do + if not c.special and not t[c.tg] then + t[c.tg] = true + end +end + +context.tocontext(table.sortedkeys(t)) +\stopbuffer + +% \typebuffer + +This returns: + +\ctxluabuffer + +We can wrap this in a finalizer: + +\startbuffer +xml.finalizers.taglist = function(collected) + local t = { } + for i=1,#collected do + local c = collected[i] + if not c.special then + local tg = c.tg + if tg and not t[tg] then + t[tg] = true + end + end + end + return table.sortedkeys(t) +end +\stopbuffer + +\typebuffer + +Or in a more extensive one: + +\startbuffer +xml.finalizers.taglist = function(collected,parenttoo) + local t = { } + for i=1,#collected do + local c = collected[i] + if not c.special then + local tg = c.tg + if tg and not t[tg] then + t[tg] = true + end + if parenttoo then + local p = c.__p__ + if p and not p.special then + local tg = p.tg .. ":" .. tg + if tg and not t[tg] then + t[tg] = true + end + end + end + end + end + return table.sortedkeys(t) +end +\stopbuffer + +\typebuffer \ctxluabuffer + +Usage is as follows: + +\startbuffer +local x = xml.load("manual-demo-1.xml") +local t = xml.applylpath(x,"//*/taglist()") + +context.tocontext(t) +\stopbuffer + +\typebuffer + +And indeed we get: + +\ctxluabuffer + +But we can also say: + +\startbuffer +local x = xml.load("manual-demo-1.xml") +local t = xml.applylpath(x,"//*/taglist(true)") + +context.tocontext(t) +\stopbuffer + +\typebuffer + +Now we get: + +\ctxluabuffer + +\stopsection + +\startsection[title=Pure xml] + +One might wonder how a \TEX\ macro package would look like when backslashes, +dollars and percent signs would have no special meaning. In fact, it would be +rather useless as interpreting commands are triggered by such characters. Any +formatting or coding system needs such characters. Take \XML: angle brackets and +ampersands are really special. 
So, no matter what system we use, we do have to +deal with the (common) case where these characters need to be seen as they are. +Normally escaping is the solution. + +The \CONTEXT\ interface for \XML\ suffers from this as well. You really don't +want to know how many tricks are used for dealing with special characters and +entities: there are several ways these travel through the system and it is +possible to adapt and cheat. Especially roundtripped data (via the tuc file) puts +some demands on the system because \XML\ can become \TEX\ and vice versa. +The next example (derived from a mail on the list) demonstrates this: + +\starttyping +\startbuffer[demo] +<doc> + <pre><code>\ConTeXt\ is great</code></pre> + + <pre><code>but you need to know some tricks</code></pre> +</doc> +\stopbuffer + +\startxmlsetups xml:initialize + \xmlsetsetup{#1}{doc|p|code}{xml:*} + \xmlsetsetup{#1}{pre/code}{xml:pre:code} +\stopxmlsetups + +\xmlregistersetup{xml:initialize} + +\startxmlsetups xml:doc + \xmlflush{#1} +\stopxmlsetups + +\startxmlsetups xml:pre:code + no solution + \comment[symbol=Key, location=inmargin,color=yellow]{\xmlflush{#1}} + \par + solution one \begingroup + \expandUx + \comment[symbol=Key, location=inmargin,color=yellow]{\xmlflush{#1}} + \endgroup + \par + solution two + \comment[symbol=Key, location=inmargin,color=yellow]{\xmlpure{#1}} + \par + \xmlprettyprint{#1}{tex} +\stopxmlsetups + +\xmlprocessbuffer{main}{demo}{} +\stoptyping + +The first comment (an interactive feature of \PDF) comes out as: + +\starttyping +\Ux {5C}ConTeXt\Ux {5C} is great +\stoptyping + +The second and third comment are okay. It's one of the reasons why we have \type +{\xmlpure}.
+ +\stopsection + +\stopchapter + +\stopcomponent diff --git a/doc/context/sources/general/manuals/xml/xml-mkiv-expressions.tex b/doc/context/sources/general/manuals/xml/xml-mkiv-expressions.tex new file mode 100644 index 000000000..0c126f2f8 --- /dev/null +++ b/doc/context/sources/general/manuals/xml/xml-mkiv-expressions.tex @@ -0,0 +1,645 @@ +\environment xml-mkiv-style + +\startcomponent xml-mkiv-expressions + +\startchapter[title={Expressions and filters}] + +\startsection[title={path expressions}] + +In the previous chapters we used \cmdinternal {cd:lpath} expressions, which are a variant +on \type {xpath} expressions as in \XSLT\ but in this case more geared towards +usage in \TEX. This mechanism will be extended when the need arises. + +A path is a sequence of matches. A simple path expression is: + +\starttyping +a/b/c/d +\stoptyping + +Here each \type {/} goes one level deeper. We can go backwards in a lookup with +\type {..}: + +\starttyping +a/b/../d +\stoptyping + +We can also combine lookups, as in: + +\starttyping +a/(b|c)/d +\stoptyping + +A negated lookup is preceded by a \type {!}: + +\starttyping +a/(b|c)/!d +\stoptyping + +A wildcard is specified with a \type {*}: + +\starttyping +a/(b|c)/!d/e/*/f +\stoptyping + +In addition to these tag based lookups we can use attributes: + +\starttyping +a/(b|c)/!d/e/*/f[@type=whatever] +\stoptyping + +An \type {@} as first character means that we are dealing with an attribute.
+Within the square brackets there can be boolean expressions: + +\starttyping +a/(b|c)/!d/e/*/f[@type=whatever and @id>100] +\stoptyping + +You can use functions as in: + +\starttyping +a/(b|c)/!d/e/*/f[something(text()) == "oeps"] +\stoptyping + +There are a couple of predefined functions: + +\starttabulate[|l|l|p|] +\NC \type{rootposition} \type{order} \NC number \NC the index of the matched root element (kind of special) \NC \NR +\NC \type{position} \NC number \NC the current index of the matched element in the match list \NC \NR +\NC \type{match} \NC number \NC the current index of the matched element sub list with the same parent \NC \NR +\NC \type{first} \NC number \NC \NC \NR +\NC \type{last} \NC number \NC \NC \NR +\NC \type{index} \NC number \NC the current index of the matched element in its parent list \NC \NR +\NC \type{firstindex} \NC number \NC \NC \NR +\NC \type{lastindex} \NC number \NC \NC \NR +\NC \type{element} \NC number \NC the element's index \NC \NR +\NC \type{firstelement} \NC number \NC \NC \NR +\NC \type{lastelement} \NC number \NC \NC \NR +\NC \type{text} \NC string \NC the textual representation of the matched element \NC \NR +\NC \type{content} \NC table \NC the node of the matched element \NC \NR +\NC \type{name} \NC string \NC the full name of the matched element: namespace and tag \NC \NR +\NC \type{namespace} \type{ns} \NC string \NC the namespace of the matched element \NC \NR +\NC \type{tag} \NC string \NC the tag of the matched element \NC \NR +\NC \type{attribute} \NC string \NC the value of the attribute with the given name of the matched element \NC \NR +\stoptabulate + +There are fundamental differences between \type {position}, \type {match} and +\type {index}. Each step results in a new list of matches. The \type {position} +is the index in this new (possibly intermediate) list. The \type {match} is also +an index in this list but related to the specific match of element names. 
The +\type {index} refers to the location in the parent element. + +Say that we have: + +\starttyping +<collection> + <resources> + <manual> + <screen>.1.</screen> + <paper>.1.</paper> + </manual> + <manual> + <paper>.2.</paper> + <screen>.2.</screen> + </manual> + </resources> + <resources> + <manual> + <screen>.3.</screen> + <paper>.3.</paper> + </manual> + </resources> +</collection> +\stoptyping + +The following then applies: + +\starttabulate[|l|l|] +\NC \type {collection/resources/manual[position()==1]/paper} \NC \type{.1.} \NC \NR +\NC \type {collection/resources/manual[match()==1]/paper} \NC \type{.1.} \type{.3.} \NC \NR +\NC \type {collection/resources/manual/paper[index()==1]} \NC \type{.2.} \NC \NR +\stoptabulate + +In most cases the \type {position} test is more restrictive than the \type +{match} test. + +You can pass your own functions too. Such functions are defined in the \type +{xml.expressions} namespace. We have defined a few shortcuts: + +\starttabulate[|l|l|] +\NC \type {find(str,pattern)} \NC \type{string.find} \NC \NR +\NC \type {contains(str)} \NC \type{string.find} \NC \NR +\NC \type {oneof(str,...)} \NC is \type{str} in list \NC \NR +\NC \type {upper(str)} \NC \type{characters.upper} \NC \NR +\NC \type {lower(str)} \NC \type{characters.lower} \NC \NR +\NC \type {number(str)} \NC \type{tonumber} \NC \NR +\NC \type {boolean(str)} \NC \type{toboolean} \NC \NR +\NC \type {idstring(str)} \NC removes leading hash \NC \NR +\NC \type {name(index)} \NC full tag name \NC \NR +\NC \type {tag(index)} \NC tag name \NC \NR +\NC \type {namespace(index)} \NC namespace of tag \NC \NR +\NC \type {text(index)} \NC content \NC \NR +\NC \type {error(str)} \NC quit and show error \NC \NR +\NC \type {quit()} \NC quit \NC \NR +\NC \type {print()} \NC print message \NC \NR +\NC \type {count(pattern)} \NC number of matches \NC \NR +\NC \type {child(pattern)} \NC take child that matches \NC \NR +\stoptabulate + + +You can also use normal \LUA\ functions as long as you 
make sure that you pass +the right arguments. There are a few predefined variables available inside such +functions. + +\starttabulate[|Tl|l|p|] +\NC \type{list} \NC table \NC the list of matches \NC \NR +\NC \type{l} \NC number \NC the current index in the list of matches \NC \NR +\NC \type{ll} \NC element \NC the current element that matched \NC \NR +\NC \type{order} \NC number \NC the position of the root of the path \NC \NR +\stoptabulate + +The given expression between \type {[]} is converted to a \LUA\ expression so you +can use the usual operators: + +\starttyping +== ~= <= >= < > not and or () +\stoptyping + +In addition, \type {=} equals \type {==} and \type {!=} is the same as \type +{~=}. If you mess up the expression, you quite likely get a \LUA\ error message. + +\stopsection + +\startsection[title={css selectors}] + +\startbuffer[selector-001] +<?xml version="1.0" ?> + +<a> + <b class="one">b.one</b> + <b class="two">b.two</b> + <b class="one two">b.one.two</b> + <b class="three">b.three</b> + <b id="first">b#first</b> + <c>c</c> + <d>d e</d> + <e>d e</e> + <e>d e e</e> + <d>d f</d> + <f foo="bar">@foo = bar</f> + <f bar="foo">@bar = foo</f> + <f bar="foo1">@bar = foo1</f> + <f bar="foo2">@bar = foo2</f> + <f bar="foo3">@bar = foo3</f> + <f bar="foo+4">@bar = foo+4</f> + <g>g</g> + <g><gg><d>g gg d</d></gg></g> + <g><gg><f>g gg f</f></gg></g> + <g><gg><f class="one">g gg f.one</f></gg></g> + <g>g</g> + <g><gg><f class="two">g gg f.two</f></gg></g> + <g><gg><f class="three">g gg f.three</f></gg></g> + <g><f class="one">g f.one</f></g> + <g><f class="three">g f.three</f></g> + <h whatever="four five six">@whatever = four five six</h> +</a> +\stopbuffer + +\xmlloadbuffer{selector-001}{selector-001} + +\startxmlsetups xml:selector:demo + \advance\scratchcounter\plusone + \inleftmargin{\the\scratchcounter}\ignorespaces\xmlverbatim{#1}\par +\stopxmlsetups + +\unexpanded\def\showCSSdemo#1#2% + {\blank + \textrule{\tttf#2} + \startlines + \dontcomplain + 
\tttf \obeyspaces + \scratchcounter\zerocount + \xmlcommand{#1}{#2}{xml:selector:demo} + \stoplines + \blank} + +The \CSS\ approach to filtering is a bit different from the path based one and is +supported too. In fact, you can combine both methods. Depending on what you +select, the \CSS\ one can be a little bit faster too. It has the advantage that +one can select more in one go but at the same time looks a bit less attractive. +This method was added just to show that it can be done but might be useful too. A +selector is given between curly braces (after all \CSS\ uses them and they have no +function yet in the parser). + +\starttyping +\xmlall{#1}{{foo bar .whatever, bar foo .whatever}} +\stoptyping + +The following methods are supported: + +\starttabulate[|T||] +\NC element \NC all tags element \NC \NR +\NC element-1 > element-2 \NC all tags element-2 with parent tag element-1 \NC \NR +\NC element-1 + element-2 \NC all tags element-2 preceded by tag element-1 \NC \NR +\NC element-1 ~ element-2 \NC all tags element-2 preceded by tag element-1 \NC \NR +\NC element-1 element-2 \NC all tags element-2 inside tag element-1 \NC \NR +\NC [attribute] \NC has attribute \NC \NR +\NC [attribute=value] \NC attribute equals value\NC \NR +\NC [attribute\lettertilde =value] \NC attribute contains value (space is separator) \NC \NR +\NC [attribute\letterhat ="value"] \NC attribute starts with value \NC \NR +\NC [attribute\letterdollar="value"] \NC attribute ends with value \NC \NR +\NC [attribute*="value"] \NC attribute contains value \NC \NR +\NC .class \NC has class \NC \NR +\NC \letterhash id \NC has id \NC \NR +\NC :nth-child(n) \NC the child at index n \NC \NR +\NC :nth-last-child(n) \NC the child at index n from the end \NC \NR +\NC :first-child \NC the first child \NC \NR +\NC :last-child \NC the last child \NC \NR +\NC :nth-of-type(n) \NC the match at index n \NC \NR +\NC :nth-last-of-type(n) \NC the match at index n from the end \NC \NR +\NC :first-of-type \NC the 
first match \NC \NR +\NC :last-of-type \NC the last match \NC \NR +\NC :only-of-type \NC the only match or nothing \NC \NR +\NC :only-child \NC the only child or nothing \NC \NR +\NC :empty \NC only when empty \NC \NR +\NC :root \NC the whole tree \NC \NR +\stoptabulate + +The next pages show some examples. For that we use the demo file: + +\typebuffer[selector-001] + +The class and id selectors often only make sense in \HTML\ like documents but they +are supported nevertheless. They are after all just shortcuts for filtering by +attribute. The class filtering is special in the sense that it checks for a class +in a list of classes given in an attribute. + +\showCSSdemo{selector-001}{{.one}} +\showCSSdemo{selector-001}{{.one, .two}} +\showCSSdemo{selector-001}{{.one, .two, \letterhash first}} + +Attributes can be filtered by presence, value, partial value and such. Quotes are +optional but we advise using them. + +\showCSSdemo{selector-001}{{[foo], [bar=foo]}} +\showCSSdemo{selector-001}{{[bar\lettertilde=foo]}} +\showCSSdemo{selector-001}{{[bar\letterhat="foo"]}} +\showCSSdemo{selector-001}{{[whatever\lettertilde="five"]}} + +You can of course combine the methods as in: + +\showCSSdemo{selector-001}{{g f .one, g f .three}} +\showCSSdemo{selector-001}{{g > f .one, g > f .three}} +\showCSSdemo{selector-001}{{d + e}} +\showCSSdemo{selector-001}{{d ~ e}} +\showCSSdemo{selector-001}{{d ~ e, g f .one, g f .three}} + +You can also negate the result by using \type {:not} on a simple expression: + +\showCSSdemo{selector-001}{{:not([whatever\lettertilde="five"])}} +\showCSSdemo{selector-001}{{:not(d)}} + +The child and match selectors are also supported: + +\showCSSdemo{selector-001}{{a:nth-child(3)}} +\showCSSdemo{selector-001}{{a:nth-last-child(3)}} +\showCSSdemo{selector-001}{{g:nth-of-type(3)}} +\showCSSdemo{selector-001}{{g:nth-last-of-type(3)}} +\showCSSdemo{selector-001}{{a:first-child}} +\showCSSdemo{selector-001}{{a:last-child}} 
+\showCSSdemo{selector-001}{{e:first-of-type}} +\showCSSdemo{selector-001}{{gg d:only-of-type}} + +Instead of numbers you can also give the \type {an} and \type {an+b} formulas +as well as the \type {odd} and \type {even} keywords: + +\showCSSdemo{selector-001}{{a:nth-child(even)}} +\showCSSdemo{selector-001}{{a:nth-child(odd)}} +\showCSSdemo{selector-001}{{a:nth-child(3n+1)}} +\showCSSdemo{selector-001}{{a:nth-child(2n+3)}} + +There are a few special cases: + +\showCSSdemo{selector-001}{{g:empty}} +\showCSSdemo{selector-001}{{g:root}} +\showCSSdemo{selector-001}{{*}} + +Combining the \CSS\ methods with the regular ones is possible: + +\showCSSdemo{selector-001}{{g gg f .one}} +\showCSSdemo{selector-001}{g/gg/f[@class='one']} +\showCSSdemo{selector-001}{g/{gg f .one}} + +\startbuffer[selector-002] +<?xml version="1.0" ?> + +<document> + <title class="one" >title 1</title> + <title class="two" >title 2</title> + <title class="one" >title 3</title> + <title class="three">title 4</title> +</document> +\stopbuffer + +In the next examples we use this file: + +\typebuffer[selector-002] + +\xmlloadbuffer{selector-002}{selector-002} + +When we filter from this (not too well structured) tree we can use both +methods to achieve the same result: + +\showCSSdemo{selector-002}{{document title .one, document title .three}} + +\showCSSdemo{selector-002}{/document/title[(@class='one') or (@class='three')]} + +However, imagine this file: + +\startbuffer[selector-003] +<?xml version="1.0" ?> + +<document> + <title class="one">title 1</title> + <subtitle class="sub">title 1.1</subtitle> + <title class="two">title 2</title> + <subtitle class="sub">title 2.1</subtitle> + <title class="one">title 3</title> + <subtitle class="sub">title 3.1</subtitle> + <title class="two">title 4</title> + <subtitle class="sub">title 4.1</subtitle> +</document> +\stopbuffer + +\typebuffer[selector-003] + +\xmlloadbuffer{selector-003}{selector-003} + +The next filter is easier with the \CSS\ selector methods 
because these accumulate +independent (simple) expressions: + +\showCSSdemo{selector-003}{{document title .one + subtitle, document title .two + subtitle}} + +Watch how we get an output in the document order. Because we render a sequential document +a combined filter will trigger a sorting pass. + +\stopsection + +\startsection[title={functions as filters}] + +At the \LUA\ end a whole \cmdinternal {cd:lpath} expression results in a (set of) node(s) +with its environment, but that is hardly usable in \TEX. Think of code like: + +\starttyping +for e in xml.collected(xml.load('text.xml'),"title") do + -- e = the element that matched +end +\stoptyping + +The older variant is still supported but you can best use the previous variant. + +\starttyping +for r, d, k in xml.elements(xml.load('text.xml'),"title") do + -- r = root of the title element + -- d = data table + -- k = index in data table +end +\stoptyping + +Here \type {d[k]} points to the \type {title} element and in this case all titles +in the tree pass by. In practice this kind of code is encapsulated in function +calls, like those returning elements one by one, or returning the first or last +match. The result is then fed back into \TEX, possibly after being altered by an +associated setup. We've seen the wrappers to such functions already in a previous +chapter. + +In addition to the previously discussed expressions, one can add so called +filters to the expression, for instance: + +\starttyping +a/(b|c)/!d/e/text() +\stoptyping + +In a filter, the last part of the \cmdinternal {cd:lpath} expression is a +function call. The previous example returns the text of each element \type {e} +that results from matching the expression. When running \TEX\ the following +functions are available. Some are also available when using pure \LUA. In \TEX\ +you can often use one of the macros like \type {\xmlfirst} instead of a \type +{\xmlfilter} with finalizer \type {first()}. 
The filter can be somewhat faster +but that is hardly noticeable. + +\starttabulate[|l|l|p|] +\NC \type {context()} \NC string \NC the serialized text with \TEX\ catcode regime \NC \NR +%NC \type {ctxtext()} \NC string \NC \NC \NR +\NC \type {function()} \NC string \NC depends on the function \NC \NR +% +\NC \type {name()} \NC string \NC the (remapped) namespace \NC \NR +\NC \type {tag()} \NC string \NC the name of the element \NC \NR +\NC \type {tags()} \NC list \NC the names of the element \NC \NR +% +\NC \type {text()} \NC string \NC the serialized text \NC \NR +\NC \type {upper()} \NC string \NC the serialized text uppercased \NC \NR +\NC \type {lower()} \NC string \NC the serialized text lowercased \NC \NR +\NC \type {stripped()} \NC string \NC the serialized text stripped \NC \NR +\NC \type {lettered()} \NC string \NC the serialized text only letters (cf. \UNICODE) \NC \NR +% +\NC \type {count()} \NC number \NC the number of matches \NC \NR +\NC \type {index()} \NC number \NC the matched index in the current path \NC \NR +\NC \type {match()} \NC number \NC the matched index in the preceding path \NC \NR +% +%NC \type {lowerall()} \NC string \NC \NC \NR +%NC \type {upperall()} \NC string \NC \NC \NR +% +\NC \type {attribute(name)} \NC content \NC returns the attribute with the given name \NC \NR +\NC \type {chainattribute(name)} \NC content \NC idem, but backtracks till one is found \NC \NR +\NC \type {command(name)} \NC content \NC expands the setup with the given name for each found element \NC \NR +\NC \type {position(n)} \NC content \NC processes the \type {n}\high{th} instance of the found element \NC \NR +\NC \type {all()} \NC content \NC processes all instances of the found element \NC \NR +%NC \type {default} \NC content \NC all \NC \NR +\NC \type {reverse()} \NC content \NC idem in reverse order \NC \NR +\NC \type {first()} \NC content \NC processes the first instance of the found element \NC \NR +\NC \type {last()} \NC content \NC processes the last 
instance of the found element \NC \NR +\NC \type {concat(...)} \NC content \NC concatenates the match \NC \NC \NR +\NC \type {concatrange(from,to,...)} \NC content \NC concatenates a range of matches \NC \NC \NR +\stoptabulate + +The extra arguments of the concatenators are: \type {separator} (string), \type +{lastseparator} (string) and \type {textonly} (a boolean). + +These filters are in fact \LUA\ functions which means that if needed more of them +can be added. Indeed this happens in some of the \XML\ related \MKIV\ modules, +for instance in the \MATHML\ processor. + +\stopsection + +\startsection[title={example}] + +The number of commands is rather large and if you want to avoid them this is +often possible. Take for instance: + +\starttyping +\xmlall{#1}{/a/b[position()>3]} +\stoptyping + +Alternatively you can use: + +\starttyping +\xmlfilter{#1}{/a/b[position()>3]/all()} +\stoptyping + +and actually this is also faster as internally it avoids a function call. Of +course in practice this is hardly measurable. + +In previous examples we've already seen quite a few expressions, and it might be +good to point out that the syntax is modelled after \XSLT\ but is not quite the +same. The reason is that we started with a rather minimal system and already have +styles in use that depend on compatibility. + +\starttyping +namespace:// axis node(set) [expr 1]..[expr n] / ... / filter +\stoptyping + +When we are inside a \CONTEXT\ run, the namespace is \type {tex}. However, if +you do not want to print back to \TEX\ you need to be more explicit. 
Say that we +typeset exams and have a (not that logical) structure like: + +\starttyping +<question> + <text>...</text> + <answer> + <item>one</item> + <item>two</item> + <item>three</item> + </answer> + <alternative> + <condition>true</condition> + <score>1</score> + </alternative> + <alternative> + <condition>false</condition> + <score>0</score> + </alternative> + <alternative> + <condition>true</condition> + <score>2</score> + </alternative> +</question> +\stoptyping + +Say that we typeset the questions with: + +\starttyping +\startxmlsetups question + \blank + score: \xmlfunction{#1}{totalscore} + \blank + \xmlfirst{#1}{text} + \startitemize + \xmlfilter{#1}{/answer/item/command(answer:item)} + \stopitemize + \endgraf + \blank +\stopxmlsetups +\stoptyping + +Each item in the answer results in a call to: + +\starttyping +\startxmlsetups answer:item + \startitem + \xmlflush{#1} + \endgraf + \xmlfilter{#1}{../../alternative[position()=rootposition()]/ + condition/command(answer:condition)} + \stopitem +\stopxmlsetups +\stoptyping + +\starttyping +\startxmlsetups answer:condition + \endgraf + condition: \xmlflush{#1} + \endgraf +\stopxmlsetups +\stoptyping + +Now, there are two rather special filters here. The first one involves +calculating the total score. As we look forward we use a function to deal with +this. + +\starttyping +\startluacode +function xml.functions.totalscore(root) + local score = 0 + for e in xml.collected(root,"/alternative") do + -- the parentheses matter: + binds tighter than or + score = score + (xml.filter(e,"xml:///score/number()") or 0) + end + tex.write(score) +end +\stopluacode +\stoptyping + +Watch how we use the namespace to keep the results at the \LUA\ end. + +The second special trick shown here is to limit a match using the current +position of the root (\type {#}) match. + +As you can see, a path expression can be more than just filtering a few nodes. At +the end of this manual you will find a bunch of examples. 
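As a rough illustration of what the total score calculation boils down to, the following plain \LUA\ sketch sums the scores of the three alternatives in a hand-built table. It runs outside \CONTEXT\ and calls no \CONTEXT\ function: the helpers \type {element}, \type {child} and \type {totalscore} are made up for this sketch, and the \type {tg} and \type {dt} fields only mimic the internal element tables.

```lua
-- Plain Lua sketch, not ConTeXt code: all helper names are hypothetical and
-- the tg/dt layout only mimics the internal element tables.
local function element(tag, ...)
    return { tg = tag, dt = { ... } }
end

local question = element("question",
    element("alternative", element("condition", "true"),  element("score", "1")),
    element("alternative", element("condition", "false"), element("score", "0")),
    element("alternative", element("condition", "true"),  element("score", "2"))
)

-- return the first child element with the given tag (or nil)
local function child(e, tag)
    for _, c in ipairs(e.dt) do
        if type(c) == "table" and c.tg == tag then
            return c
        end
    end
end

-- add up the numeric content of the score child of each alternative
local function totalscore(root)
    local score = 0
    for _, e in ipairs(root.dt) do
        if type(e) == "table" and e.tg == "alternative" then
            local s = child(e, "score")
            score = score + (s and tonumber(s.dt[1]) or 0)
        end
    end
    return score
end

print(totalscore(question))
```

In a real style the path expression does the looping and the lookup, so none of this bookkeeping is needed there.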
+ +\stopsection + +\startsection[title={tables}] + +If you want to know how the internal \XML\ tables look you can print such a +table: + +\starttyping +print(table.serialize(e)) +\stoptyping + +This produces for instance: + +% s = xml.convert("<document><demo label='whatever'>some text</demo></document>") +% print(table.serialize(xml.filter(s,"demo")[1])) + +\starttyping +t={ + ["at"]={ + ["label"]="whatever", + }, + ["dt"]={ "some text" }, + ["ns"]="", + ["rn"]="", + ["tg"]="demo", +} +\stoptyping + +The \type {rn} entry is the renamed namespace (when renaming is applied). If you +see tags like \type {@pi@} this means that we don't have an element, but (in this +case) a processing instruction. + +\starttabulate[|l|p|] +\NC \type {@rt@} \NC the root element \NC \NR +\NC \type {@dd@} \NC document definition \NC \NR +\NC \type {@cm@} \NC comment, like \type {<!-- whatever -->} \NC \NR +\NC \type {@cd@} \NC so called \type {CDATA} \NC \NR +\NC \type {@pi@} \NC processing instruction, like \type {<?whatever we want ?>} \NC \NR +\stoptabulate + +There are many ways to deal with the content, but in the perspective of \TEX\ +only a few matter. + +\starttabulate[|l|p|] +\NC \type {xml.sprint(e)} \NC print the content to \TEX\ and apply setups if needed \NC \NR +\NC \type {xml.tprint(e)} \NC print the content to \TEX\ (serialize elements verbose) \NC \NR +\NC \type {xml.cprint(e)} \NC print the content to \TEX\ (used for special content) \NC \NR +\stoptabulate + +Keep in mind that anything low level that you uncover is not part of the official +interface unless mentioned in this manual. 
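The table layout shown above can be walked with a few lines of plain \LUA. The next sketch collects the text of an element and skips comments and processing instructions. It is only an illustration: \type {collecttext} is a made up helper, and that a comment is stored as a child table with tag \type {@cm@} is an assumption here, so check the serialized table of your own document.

```lua
-- Illustration only: collecttext is a hypothetical helper, not part of the
-- xml namespace. We assume that special nodes are child tables with a tag
-- like @cm@ or @pi@ and that their content should not end up in the text.
local skipped = { ["@cm@"] = true, ["@pi@"] = true, ["@dd@"] = true }

local function collecttext(e)
    if type(e) == "string" then
        return e
    end
    if skipped[e.tg] then
        return ""
    end
    local parts = { }
    for _, c in ipairs(e.dt or { }) do
        parts[#parts + 1] = collecttext(c)
    end
    return table.concat(parts)
end

local demo = {
    at = { label = "whatever" },
    dt = { "some ", { tg = "@cm@", dt = { " a comment " } }, "text" },
    ns = "",
    rn = "",
    tg = "demo",
}

print(demo.tg, demo.at.label, collecttext(demo))
```

Inside \CONTEXT\ you would normally use the provided printers and filters instead of walking the table yourself.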
+ +\stopsection + +\stopchapter + +\stopcomponent diff --git a/doc/context/sources/general/manuals/xml/xml-mkiv-filtering.tex b/doc/context/sources/general/manuals/xml/xml-mkiv-filtering.tex new file mode 100644 index 000000000..5bb5a35de --- /dev/null +++ b/doc/context/sources/general/manuals/xml/xml-mkiv-filtering.tex @@ -0,0 +1,262 @@ +\environment xml-mkiv-style + +\startcomponent xml-mkiv-filtering + +\startchapter[title={Filtering content}] + +\startsection[title={\TEX\ versus \LUA}] + +It will not come as a surprise that we can access \XML\ files from \TEX\ as well +as from \LUA. In fact there are two methods to deal with \XML\ in \LUA. First +there are the low level \XML\ functions in the \type {xml} namespace. On top of +those functions there is a set of functions in the \type {lxml} namespace that +deals with \XML\ in a more \TEX ie way. Most of these have similar commands at +the \TEX\ end. + +\startbuffer +\startxmlsetups first:demo:one + \xmlfilter {#1} {artist/name[text()='Randy Newman']/.. + /albums/album[position()=3]/command(first:demo:two)} +\stopxmlsetups + +\startxmlsetups first:demo:two + \blank \start \tt + \xmldisplayverbatim{#1} + \stop \blank +\stopxmlsetups + +\xmlprocessfile{demo}{music-collection.xml}{first:demo:one} +\stopbuffer + +\typebuffer + +This gives the following snippet of verbatim \XML\ code. The indentation +conforms to the indentation in the whole \XML\ file. \footnote {The (probably +outdated) \XML\ file contains the collection stored on my slimserver instance. 
+You can use the \type {mtxrun --script flac} to generate such files.} + +\doifmodeelse {atpragma} { + \getbuffer +} { + \typefile{xml-mkiv-01.xml} +} + +An alternative written in \LUA\ looks as follows: + +\startbuffer +\blank \start \tt \startluacode + local m = lxml.load("mine","music-collection.xml") -- m == lxml.id("mine") + local p = "artist/name[text()='Randy Newman']/../albums/album[position()=4]" + local l = lxml.filter(m,p) -- returns a list (with one entry) + lxml.displayverbatim(l[1]) +\stopluacode \stop \blank +\stopbuffer + +\typebuffer + +This produces: + +\doifmodeelse {atpragma} { + \getbuffer +} { + \typefile{xml-mkiv-02.xml} +} + +You can use both methods mixed but in practice we will use the \TEX\ commands in +regular styles and the mixture in modules, for instance in those dealing with +\MATHML\ and cals tables. For complex matters you can write your own finalizers +(the last action to be taken in a match) in \LUA\ and use them at the \TEX\ end. + +\stopsection + +\startsection[title={a few details}] + +In \CONTEXT\ setups are a rather common variant on macros (\TEX\ commands) but +with their own namespace. An example of a setup is: + +\starttyping +\startsetup doc:print + \setuppapersize[A4][A4] +\stopsetup + +\startsetup doc:screen + \setuppapersize[S6][S4] +\stopsetup +\stoptyping + +Given the previous definitions, later on we can say something like: + +\starttyping +\doifmodeelse {paper} { + \setup[doc:print] +} { + \setup[doc:screen] +} +\stoptyping + +Another example is: + +\starttyping +\startsetup[doc:header] + \marking[chapter] + \space + -- + \space + \pagenumber +\stopsetup +\stoptyping + +in combination with: + +\starttyping +\setupheadertexts[\setup{doc:header}] +\stoptyping + +Here the advantage is that instead of ending up with an unreadable header +definitions, we use a nicely formatted setup. 
An important property of setups and +the reason why they were introduced long ago is that spaces and newlines are +ignored in the definition. This means that we don't have to worry about so called +spurious spaces but it also means that when we do want a space, we have to use +the \type {\space} command. + +The only difference between setups and \XML\ setups is that the latter +get an argument (\type {#1}) that reflects the current node in the \XML\ tree. + +\stopsection + +\startsection[title={CDATA}] + +What to do with \type {CDATA}? There are a few methods at the \LUA\ end for +dealing with it but here we just mention how you can influence the rendering. +There are four macros that play a role here: + +\starttyping +\unexpanded\def\xmlcdataobeyedline {\obeyedline} +\unexpanded\def\xmlcdataobeyedspace{\strut\obeyedspace} +\unexpanded\def\xmlcdatabefore {\begingroup\tt} +\unexpanded\def\xmlcdataafter {\endgroup} +\stoptyping + +Technically you can overload them but beware of side effects. Normally you won't +see much \type {CDATA} and whenever you do, it involves special data that needs +very special treatment anyway. + +\stopsection + +\startsection[title={Entities}] + +As usual with any way of encoding documents you need escapes in order to encode +the characters that are used in tagging the content, embedding comments, escaping +special characters in strings (in programming languages), etc. In \XML\ this +means that in order to encode characters like \type {<} you need an escape like \type +{&lt;} and in order then to encode an \type {&} you need \type {&amp;}. + +In a typesetting workflow using a programming language like \TEX, another problem +shows up. There we have different special characters, like \type {$ $} for triggering +math, but also the backslash, braces etc. Even one such special character is already +enough to have yet another escaping mechanism at work. 
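The \XML\ side of this escaping is easy to sketch in plain \LUA. The helper \type {xmlescape} below is made up for this illustration and is not a \CONTEXT\ function; it only covers the three characters that matter in element content.

```lua
-- Illustration only: xmlescape is a hypothetical helper, not part of ConTeXt.
-- Each special character is looked up in the table and replaced by its entity.
local escapes = {
    ["&"] = "&amp;",
    ["<"] = "&lt;",
    [">"] = "&gt;",
}

local function xmlescape(s)
    return (s:gsub("[&<>]", escapes))
end

print(xmlescape("a < b & c"))
```

Because every character is replaced in one pass, an ampersand that is part of a freshly inserted entity is not escaped a second time.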
+ +Ideally a user should not worry about these issues but it helps in figuring out problems +when you know what happens under the hood. Also it is good to know that in the +code there are several ways to deal with these issues. Take the following document: + +\starttyping +<text> + Here we have a bit of a &lt;&mess&gt;: + + # # + % % + \ \ + { { + | | + } } + ~ ~ +</text> +\stoptyping + +When the file is read the \type {&lt;} entity will be replaced by \type {<} and +the \type {&gt;} by \type {>}. The numeric entities will be replaced by the +characters they refer to. The \type {&mess} is kind of special. We do preload +a huge list of more or less standardized entities but \type {mess} is not in +there. However, it is possible to have it defined in the document preamble, like: + +\starttyping +<!DOCTYPE dummy SYSTEM "dummy.dtd" [ + <!ENTITY mess "what a mess" > +]> +\stoptyping + +or even this: + +\starttyping +<!DOCTYPE dummy SYSTEM "dummy.dtd" [ + <!ENTITY mess "<p>what a mess</p>" > +]> +\stoptyping + +You can also define it in your document style using one of: + +\startxmlcmd {\cmdbasicsetup{xmlsetentity}} + replaces entity with name \cmdinternal {cd:name} by \cmdinternal {cd:text} +\stopxmlcmd + +\startxmlcmd {\cmdbasicsetup{xmltexentity}} + replaces entity with name \cmdinternal {cd:name} by \cmdinternal {cd:text} + typeset under a \TEX\ regime +\stopxmlcmd + +Such a definition will always have a higher priority than the one defined +in the document. Anyway, when the document is read in, all entities are +resolved and those that need a special treatment because they map to some +text are stored in such a way that we can roundtrip them. As a consequence, +as soon as the content gets pushed into \TEX, we need not only to intercept +special characters but also have to make sure that the following works: + +\starttyping +\xmltexentity {tex} {\TEX} +\stoptyping + +Here the backslash starts a control sequence while in regular content a +backslash is just that: a backslash. 
+ +Special characters are really special when we have to move text around +in a \TEX\ ecosystem. + +\starttyping +<text> + <title>About #3</title> +</text> +\stoptyping + +If we map and define title as follows: + +\starttyping +\startxmlsetup xml:title + \title{\xmlflush{#1}} +\stopxmlsetup +\stoptyping + +normally something like \type {\xmlflush {id::123}} will be written to the +auxiliary file and in most cases that is quite okay, but if we have this: + +\starttyping +\setuphead[title][expansion=yes] +\stoptyping + +then we don't want the \type {#} to end up as a hash because later on \TEX\ +can get very confused when it sees what looks like an argument in a +probably unexpected place. This is solved by escaping the hash like this: + +\starttyping +About \Ux{23}3 +\stoptyping + +The \type {\Ux} command will convert its hexadecimal argument into a +character. Of course one then needs to typeset such a text under a \TEX\ +character regime but that is normally the case anyway. + +\stopsection + +\stopchapter + +\stopcomponent diff --git a/doc/context/sources/general/manuals/xml/xml-mkiv-introduction.tex b/doc/context/sources/general/manuals/xml/xml-mkiv-introduction.tex new file mode 100644 index 000000000..e7f0124da --- /dev/null +++ b/doc/context/sources/general/manuals/xml/xml-mkiv-introduction.tex @@ -0,0 +1,42 @@ +\environment xml-mkiv-style + +\startcomponent xml-mkiv-introduction + +\startchapter[title={Introduction}] + +This manual presents the \MKIV\ way of dealing with \XML. Although the +traditional \MKII\ streaming parser has a charming simplicity in its control, for +complex documents the tree based \MKIV\ method is more convenient. It is for this +reason that the old method has been removed from \MKIV. If you are familiar with +\XML\ processing in \MKII, then you will have noticed that the \MKII\ commands +have \type {XML} in their name. The \MKIV\ commands have a lowercase \type {xml} +in their names. That way there is no danger of confusion or a mixup. 
+ +You may wonder why we do these manipulations in \TEX\ and not use \XSLT\ (or +other transformation methods) instead. The advantage of an integrated approach is +that it simplifies usage. Think of not only processing the document, but also +using \XML\ for managing resources in the same run. An \XSLT\ approach is just as +verbose (after all, you still need to produce \TEX\ code) and probably less +readable. In the case of \MKIV\ the integrated approach is also faster and gives +us the option to manipulate content at runtime using \LUA. It has the additional +advantage that to some extent we can handle a mix of \TEX\ and \XML\ because we +know when we're doing one or the other. + +This manual is dedicated to Taco Hoekwater, one of the first \CONTEXT\ users, and +also the first to use it for processing \XML. Who could have thought at that time +that we would have a more convenient way of dealing with those angle brackets. +The second version of this manual is dedicated to Thomas Schmitz, a power user +who occasionally became a victim of the evolving mechanisms. + +\blank + +\startlines +Hans Hagen +\PRAGMA +Hasselt NL +2008\endash2016 +\stoplines + +\stopchapter + +\stopcomponent diff --git a/doc/context/sources/general/manuals/xml/xml-mkiv-lookups.tex b/doc/context/sources/general/manuals/xml/xml-mkiv-lookups.tex new file mode 100644 index 000000000..e6afaa948 --- /dev/null +++ b/doc/context/sources/general/manuals/xml/xml-mkiv-lookups.tex @@ -0,0 +1,314 @@ +\environment xml-mkiv-style + +\startcomponent xml-mkiv-lookups + +\startchapter[title={Lookups using lpaths}] + +\startsection[title={introduction}] + +There is not that much system in the following examples. They resulted from tests +with different documents. The current implementation evolved out of the +experimental code. For instance, I decided to add the multiple expressions in a row +handling after a few email exchanges with Jean|-|Michel Huffen. 
+ +One of the main differences between the way \XSLT\ resolves a path and our way is +the anchor. Take: + +\starttyping +/something +something +\stoptyping + +The first one anchors in the current (!) element so it will only consider direct +children. The second one does a deep lookup and looks at the descendants as well. +Furthermore we have a few extra shortcuts like \type {**} in \type {a/**/b} which +represents all descendants. + +The expressions (between square brackets) have to be valid \LUA\ and some +preprocessing is done to resolve the built in functions. So, you might use code +like: + +\starttyping +my_lpeg_expression:match(text()) == "whatever" +\stoptyping + +given that \type {my_lpeg_expression} is known. In the examples below we use the +visualizer to show the steps. Some are shown more than once as part of a set. + +\stopsection + +\startsection[title={special cases}] + +\xmllshow{} +\xmllshow{*} +\xmllshow{.} +\xmllshow{/} + +\stopsection + +\startsection[title={wildcards}] + +\xmllshow{*} +\xmllshow{*:*} +\xmllshow{/*} +\xmllshow{/*:*} +\xmllshow{*/*} +\xmllshow{*:*/*:*} + +\xmllshow{a/*} +\xmllshow{a/*:*} +\xmllshow{/a/*} +\xmllshow{/a/*:*} + +\xmllshow{/*} +\xmllshow{/**} +\xmllshow{/***} + +\stopsection + +\startsection[title={multiple steps}] + +\xmllshow{answer} +\xmllshow{answer/test/*} +\xmllshow{answer/test/child::} +\xmllshow{answer/*} +\xmllshow{answer/*[tag()='p' and position()=1 and text()!='']} + +\stopsection + +\startsection[title={pitfalls}] + +\xmllshow{[oneof(lower(@encoding),'tex','context','ctx')]} +\xmllshow{.[oneof(lower(@encoding),'tex','context','ctx')]} + +\stopsection + +\startsection[title={more special cases}] + +\xmllshow{**} +\xmllshow{*} +\xmllshow{..} +\xmllshow{.} +\xmllshow{//} +\xmllshow{/} + +\xmllshow{**/} +\xmllshow{**/*} +\xmllshow{**/.} +\xmllshow{**//} + +\xmllshow{*/} +\xmllshow{*/*} +\xmllshow{*/.} +\xmllshow{*//} + +\xmllshow{/**/} +\xmllshow{/**/*} +\xmllshow{/**/.} +\xmllshow{/**//} + +\xmllshow{/*/} 
+\xmllshow{/*/*} +\xmllshow{/*/.} +\xmllshow{/*//} + +\xmllshow{./} +\xmllshow{./*} +\xmllshow{./.} +\xmllshow{.//} + +\xmllshow{../} +\xmllshow{../*} +\xmllshow{../.} +\xmllshow{..//} + +\stopsection + +\startsection[title={more wildcards}] + +\xmllshow{one//two} +\xmllshow{one/*/two} +\xmllshow{one/**/two} +\xmllshow{one/***/two} +\xmllshow{one/x//two} +\xmllshow{one//x/two} +\xmllshow{//x/two} + +\stopsection + +\startsection[title={special axis}] + +\xmllshow{descendant::whocares/ancestor::whoknows} +\xmllshow{descendant::whocares/ancestor::whoknows/parent::} +\xmllshow{descendant::whocares/ancestor::} +\xmllshow{child::something/child::whatever/child::whocares} +\xmllshow{child::something/child::whatever/child::whocares|whoknows} +\xmllshow{child::something/child::whatever/child::(whocares|whoknows)} +\xmllshow{child::something/child::whatever/child::!(whocares|whoknows)} +\xmllshow{child::something/child::whatever/child::(whocares)} +\xmllshow{child::something/child::whatever/child::(whocares)[position()>2]} +\xmllshow{child::something/child::whatever[position()>2][position()=1]} +\xmllshow{child::something/child::whatever[whocares][whocaresnot]} +\xmllshow{child::something/child::whatever[whocares][not(whocaresnot)]} +\xmllshow{child::something/child::whatever/self::whatever} + +There is also \type {last-match::} that starts with the last found set of nodes. +This can save some run time when you do lots of tests combined with a same check +afterwards. There is however one pitfall: you never know what is done with that +last match in the setup that gets called nested. 
Take the following example: + +\starttyping +\startbuffer[test] +<something> + <crap> <crapa> <crapb> <crapc> <crapd> + <crape> + done 1 + </crape> + </crapd> </crapc> </crapb> </crapa> </crap> + <crap> <crapa> <crapb> <crapc> <crapd> + <crape> + done 2 + </crape> + </crapd> </crapc> </crapb> </crapa> </crap> + <crap> <crapa> <crapb> <crapc> <crapd> + <crape> + done 3 + </crape> + </crapd> </crapc> </crapb> </crapa> </crap> +</something> +\stopbuffer +\stoptyping + +One way to filter the content is this: + +\starttyping +\xmldoif {#1} {/crap/crapa/crapb/crapc/crapd/crape} { + some action +} +\stoptyping + +It is not unlikely that you will do something like this: + +\starttyping +\xmldoif {#1} {/crap/crapa/crapb/crapc/crapd/crape} { + \xmlfirst{#1}{/crap/crapa/crapb/crapc/crapd/crape} +} +\stoptyping + +This means that the path is resolved twice but that can be avoided as +follows: + +\starttyping +\xmldoif{#1}{/crap/crapa/crapb/crapc/crapd/crape}{ + \xmlfirst{#1}{last-match::} +} +\stoptyping + +But the next is not guaranteed to work: + +\starttyping +\xmldoif{#1}{/crap/crapa/crapb/crapc/crapd/crape}{ + \xmlfirst{#1}{last-match::} + \xmllast{#1}{last-match::} +} +\stoptyping + +Because the first call may itself trigger a lookup, the last match can be replaced +and the second call can then give unexpected results. You can overcome this with: + +\starttyping +\xmldoif{#1}{/crap/crapa/crapb/crapc/crapd/crape}{ + \xmlpushmatch + \xmlfirst{#1}{last-match::} + \xmlpopmatch +} +\stoptyping + +Does it pay off? 
Here are some timings of a 10,000 times test and lookup +like the previous (on a decent January 2016 laptop): + +\starttabulate[|r|l|] +\NC 0.239 \NC \type {\xmldoif {...} {...}} \NC \NR +\NC 0.292 \NC \type {\xmlfirst {...} {...}} \NC \NR +\NC 0.538 \NC \type {\xmldoif {...} {...} + \xmlfirst {...} {...}} \NC \NR +\NC 0.338 \NC \type {\xmldoif {...} {...} + \xmlfirst {...} {last-match::}} \NC \NR +\NC 0.349 \NC \type {+ \xmldoif {...} {...} + \xmlfirst {...} {last-match::}-} \NC \NR +\stoptabulate + +So, pushing and popping (the last row) is a bit slower than not doing that but it +is still much faster than not using \type {last-match::} at all. As a shortcut +you can use \type {=}, as in: + +\starttyping +\xmlfirst{#1}{=} +\stoptyping + +You can even do this: + +\starttyping +\xmlall{#1}{last-match::/text()} +\stoptyping + +or + +\starttyping +\xmlall{#1}{=/text()} +\stoptyping + + +\stopsection + +\startsection[title={some more examples}] + +\xmllshow{/something/whatever} +\xmllshow{something/whatever} +\xmllshow{/**/whocares} +\xmllshow{whoknows/whocares} +\xmllshow{whoknows} +\xmllshow{whocares[contains(text(),'f') or contains(text(),'g')]} +\xmllshow{whocares/first()} +\xmllshow{whocares/last()} +\xmllshow{whatever/all()} +\xmllshow{whocares/position(2)} +\xmllshow{whocares/position(-2)} +\xmllshow{whocares[1]} +\xmllshow{whocares[-1]} +\xmllshow{whocares[2]} +\xmllshow{whocares[-2]} +\xmllshow{whatever[3]/attribute(id)} +\xmllshow{whatever[2]/attribute('id')} +\xmllshow{whatever[3]/text()} +\xmllshow{/whocares/first()} +\xmllshow{/whocares/last()} + +\xmllshow{xml://whatever/all()} +\xmllshow{whatever/all()} +\xmllshow{//whocares} +\xmllshow{..[2]} +\xmllshow{../*[2]} + +\xmllshow{/(whocares|whocaresnot)} +\xmllshow{/!(whocares|whocaresnot)} +\xmllshow{/!whocares} + +\xmllshow{/interface/command/command(xml:setups:register)} +\xmllshow{/interface/command[@name='xxx']/command(xml:setups:typeset)} +\xmllshow{/arguments/*} +\xmllshow{/sequence/first()} 
+\xmllshow{/arguments/text()} +\xmllshow{/sequence/variable/first()} +\xmllshow{/interface/define[@name='xxx']/first()} +\xmllshow{/parameter/command(xml:setups:parameter:measure)} + +\xmllshow{/(*:library|figurelibrary)/*:figure/*:label} +\xmllshow{/(*:library|figurelibrary)/figure/*:label} +\xmllshow{/(*:library|figurelibrary)/figure/label} +\xmllshow{/(*:library|figurelibrary)/figure:*/label} + +\xmlshow {whatever//br[tag(1)='br']} + +\stopsection + +\stopchapter + +\stopcomponent diff --git a/doc/context/sources/general/manuals/xml/xml-mkiv-lpath.tex b/doc/context/sources/general/manuals/xml/xml-mkiv-lpath.tex new file mode 100644 index 000000000..9c8b853c8 --- /dev/null +++ b/doc/context/sources/general/manuals/xml/xml-mkiv-lpath.tex @@ -0,0 +1,207 @@ +\input lxml-ctx.mkiv + +\ctxlua{dofile("t:/sources/lxml-lpt.lua")} + +\startbuffer[xmltest] +<?xml version='1.0'?> + +<!-- this is a test file --> + +<something id='1'> + <x:whatever id='1.1'> + <whocares id='1.1.1'> + test a + </whocares> + <whocaresnot id='1.1.2'> + test b + </whocaresnot> + </x:whatever> + <whatever id='2'> + <whocares id='2.1'> + test c + </whocares> + <whocaresnot id='2.2'> + test d + </whocaresnot> + </whatever> + <whatever id='3'> + test e + </whatever> + <whatever id='4' test="xxx"> + <whocares id='4.1'> + test f + </whocares> + <whocares id='4.2'> + test g + </whocares> + </whatever> + <whatever id='5' test="xxx"> + <whoknows id='5.1'> + <whocares id='5.1.1'> + test h + </whocares> + </whoknows> + <whoknows id='5.2'> + <whocaresnot id='5.2.1'> + test i + </whocaresnot> + </whoknows> + <whoknows id='5.3'> + <whocares id='5.3.1'> + test j + </whocares> + </whoknows> + </whatever> +</something> +\stopbuffer + +% \enabletrackers[xml.lparse] + +\setuplayout[width=middle,height=middle,header=1cm,footer=1cm,topspace=2cm,backspace=2cm] +\setupbodyfont[10pt] + +\setfalse\xmllshowbuffer + +\starttext + +\xmllshow{/(*:library|figurelibrary)/*:figure/*:label} 
+\xmllshow{/(*:library|figurelibrary)/figure/*:label} +\xmllshow{/(*:library|figurelibrary)/figure/label} +\xmllshow{/(*:library|figurelibrary)/figure:*/label} + +% \xmllshow{collection[@version='all']/resources/manual[match()==1]/paper/command(xml:overview)} +% \xmllshow{collection/resources/manual[match()=1]/paper/command(xml:overview)} + +% \xmllshow{answer//oeps} +% \xmllshow{answer/*/oeps} +% \xmllshow{answer/**/oeps} +% \xmllshow{answer/***/oeps} +% \xmllshow{answer/x//oeps} +% \xmllshow{answer//x/oeps} +% \xmllshow{//x/oeps} +% \xmllshow{answer/test/*} +% \xmllshow{answer/test/child::} +% \xmllshow{answer/*} +% \xmllshow{ oeps / answer / .. / * [tag()='p' and position()=1 and text()!=''] / oeps()} + +% \xmllshow{ artist / name [text()='Randy Newman'] / .. / albums / album [position()=3] / command(first:demo:two)} +% \xmllshow{/exa:selectors/exa:selector/exa:list/component[count()>1]} + +\stoptext + +\xmllshow{/*} +\xmllshow{child::} +\xmllshow{child::test} +\xmllshow{/test/test} +\xmllshow{../theory/sections/section/exercises} +\xmllshow{../training/practicalassignments} +\xmllshow{../../Outcome[position()=rootposition()]/Condition/command(xml:answer:mc:condition)} + +% \stoptext + +% \typebuffer[xmltest] \page + +\xmllshowbuffer{xmltest}{**}{id} +\xmllshowbuffer{xmltest}{*}{id} +\xmllshowbuffer{xmltest}{..}{id} +\xmllshowbuffer{xmltest}{.}{id} +\xmllshowbuffer{xmltest}{//}{id} +\xmllshowbuffer{xmltest}{/}{id} + +\xmllshowbuffer{xmltest}{**/}{id} +\xmllshowbuffer{xmltest}{**/*}{id} +\xmllshowbuffer{xmltest}{**/.}{id} +\xmllshowbuffer{xmltest}{**//}{id} + +\xmllshowbuffer{xmltest}{*/}{id} +\xmllshowbuffer{xmltest}{*/*}{id} +\xmllshowbuffer{xmltest}{*/.}{id} +\xmllshowbuffer{xmltest}{*//}{id} + +\xmllshowbuffer{xmltest}{/**/}{id} +\xmllshowbuffer{xmltest}{/**/*}{id} +\xmllshowbuffer{xmltest}{/**/.}{id} +\xmllshowbuffer{xmltest}{/**//}{id} + +\xmllshowbuffer{xmltest}{/*/}{id} +\xmllshowbuffer{xmltest}{/*/*}{id} +\xmllshowbuffer{xmltest}{/*/.}{id} 
+\xmllshowbuffer{xmltest}{/*//}{id} + +\xmllshowbuffer{xmltest}{./}{id} +\xmllshowbuffer{xmltest}{./*}{id} +\xmllshowbuffer{xmltest}{./.}{id} +\xmllshowbuffer{xmltest}{.//}{id} + +\xmllshowbuffer{xmltest}{../}{id} +\xmllshowbuffer{xmltest}{../*}{id} +\xmllshowbuffer{xmltest}{../.}{id} +\xmllshowbuffer{xmltest}{..//}{id} + +\xmllshowbuffer{xmltest}{descendant::whocares/ancestor::whoknows}{id} +\xmllshowbuffer{xmltest}{descendant::whocares/ancestor::whoknows/parent::}{id} +\xmllshowbuffer{xmltest}{descendant::whocares/ancestor::}{id} +\xmllshowbuffer{xmltest}{child::something/child::whatever/child::whocares}{id} +\xmllshowbuffer{xmltest}{child::something/child::whatever/child::whocares|whoknows}{id} +\xmllshowbuffer{xmltest}{child::something/child::whatever/child::(whocares|whoknows)}{id} +\xmllshowbuffer{xmltest}{child::something/child::whatever/child::!(whocares|whoknows)}{id} +\xmllshowbuffer{xmltest}{child::something/child::whatever/child::(whocares)}{id} +\xmllshowbuffer{xmltest}{child::something/child::whatever/child::(whocares)[position()>2]}{id} +\xmllshowbuffer{xmltest}{child::something/child::whatever[position()>2][position()=1]}{id} +\xmllshowbuffer{xmltest}{child::something/child::whatever[whocares][whocaresnot]}{id} +\xmllshowbuffer{xmltest}{child::something/child::whatever[whocares][not(whocaresnot)]}{id} +\xmllshowbuffer{xmltest}{child::something/child::whatever/self::whatever}{id} +\xmllshowbuffer{xmltest}{/something/whatever}{id} +\xmllshowbuffer{xmltest}{something/whatever}{id} +\xmllshowbuffer{xmltest}{/**/whocares}{id} +\xmllshowbuffer{xmltest}{whoknows/whocares}{id} +\xmllshowbuffer{xmltest}{whoknows}{id} +\xmllshowbuffer{xmltest}{whocares[contains(text(),'f') or contains(text(),'g')]}{id} +\xmllshowbuffer{xmltest}{whocares/first()}{id} +\xmllshowbuffer{xmltest}{whocares/last()}{id} +\xmllshowbuffer{xmltest}{whatever/all()}{id} +\xmllshowbuffer{xmltest}{whocares/position(2)}{id} +\xmllshowbuffer{xmltest}{whocares/position(-2)}{id} 
+\xmllshowbuffer{xmltest}{whocares[1]}{id} +\xmllshowbuffer{xmltest}{whocares[-1]}{id} +\xmllshowbuffer{xmltest}{whocares[2]}{id} +\xmllshowbuffer{xmltest}{whocares[-2]}{id} +\xmllshowbuffer{xmltest}{whatever[3]/attribute(id)}{id} +\xmllshowbuffer{xmltest}{whatever[2]/attribute('id')}{id} +\xmllshowbuffer{xmltest}{whatever[3]/text()}{id} +\xmllshowbuffer{xmltest}{/whocares/first()}{id} +\xmllshowbuffer{xmltest}{/whocares/last()}{id} + +\xmllshowbuffer{xmltest}{xml://whatever/all()}{id} +\xmllshowbuffer{xmltest}{whatever/all()}{id} +\xmllshowbuffer{xmltest}{//whocares}{id} +\xmllshowbuffer{xmltest}{..[2]}{id} +\xmllshowbuffer{xmltest}{../*[2]}{id} + +\xmllshowbuffer{xmltest}{/(whocares|whocaresnot)}{id} +\xmllshowbuffer{xmltest}{/!(whocares|whocaresnot)}{id} +\xmllshowbuffer{xmltest}{/!whocares}{id} + +% \page + +% \xmllshow{/interface/command/command(xml:setups:register)} +% \xmllshow{/interface/command[@name='xxx']/command(xml:setups:typeset)} +% \xmllshow{/arguments/*} +% \xmllshow{/sequence/first()} +% \xmllshow{/arguments/text()} +% \xmllshow{/sequence/variable/first()} +% \xmllshow{/interface/define[@name='xxx']/first()} +% \xmllshow{/parameter/command(xml:setups:parameter:measure)} + +% \page + +% \xmllshow{interface/command/command(xml:setups:register)} +% \xmllshow{interface/command[@name='xxx']/command(xml:setups:typeset)} +% \xmllshow{arguments/*} +% \xmllshow{sequence/first()} +% \xmllshow{arguments/text()} +% \xmllshow{sequence/variable/first()} +% \xmllshow{interface/define[@name='xxx']/first()} +% \xmllshow{parameter/command(xml:setups:parameter:measure)} + +\stoptext diff --git a/doc/context/sources/general/manuals/xml/xml-mkiv-style.tex b/doc/context/sources/general/manuals/xml/xml-mkiv-style.tex new file mode 100644 index 000000000..8bcd74086 --- /dev/null +++ b/doc/context/sources/general/manuals/xml/xml-mkiv-style.tex @@ -0,0 +1,155 @@ +\startenvironment xml-mkiv-style + +\input lxml-ctx.mkiv + +\settrue \xmllshowtitle +\setfalse\xmllshowwarning 
+ +\usemodule[set-11] + +\loadsetups[i-context] + +% \definehspace[squad][1em plus .25em minus .25em] + +\usemodule[abr-02] + +\setuplayout + [location=middle, + marking=on, + backspace=20mm, + cutspace=20mm, + topspace=15mm, + header=15mm, + footer=15mm, + height=middle, + width=middle] + +\setuppagenumbering + [alternative=doublesided, + location=] + +\setupfootertexts + [][pagenumber] + +\setupheadertexts + [][chapter] + +\setupheader + [color=colortwo, + style=bold] + +\setupfooter + [color=colortwo, + style=bold] + +\setuphead + [chapter] + [page={yes,header,right}, + header=empty, + style=\bfc] + +\setupsectionblock + [page={yes,header,right}] + +\starttexdefinition unexpanded section:chapter:number #1 + \doifmode{*sectionnumber} { + \bf + \llap{<\enspace}#1\enspace> + } +\stoptexdefinition + +\starttexdefinition unexpanded section:section:number #1 + \doifmode{*sectionnumber} { + \bf + \llap{<<\enspace}#1\enspace>> + } +\stoptexdefinition + +\starttexdefinition unexpanded section:subsection:number #1 + \doifmode{*sectionnumber} { + \bf + \llap{<<<\enspace}#1\enspace>>> + } +\stoptexdefinition + +\setuphead[chapter] [numbercolor=black,numbercommand=\texdefinition{section:chapter:number}] +\setuphead[section] [numbercolor=black,numbercommand=\texdefinition{section:section:number}] +\setuphead[subsection] [numbercolor=black,numbercommand=\texdefinition{section:subsection:number}] +\setuphead[subsubsection][numbercolor=,numbercommand=,before=\blank,after=\blank] + +\setuphead + [section] + [style=\bfa] + +\setuplist + [chapter] + [style=bold] + +\setupinteractionscreen + [option=doublesided] + +\setupalign + [tolerant,stretch] + +\setupwhitespace + [big] + +\setuptolerance + [tolerant] + +\doifelsemode {atpragma} { + \setupbodyfont[lucidaot,10pt] +} { + \setupbodyfont[dejavu,10pt] +} + +\definecolor[colorone] [b=.5] +\definecolor[colortwo] [s=.3] +\definecolor[colorthree][y=.5] + +\setuptype + [color=colorone] + +\setuptyping + [color=colorone] + +\setuphead + 
[lshowtitle] + [style=\tt, + color=colorone] + +\setuphead + [chapter,section] + [numbercolor=colortwo, + color=colorone] + +\definedescription + [xmlcmd] + [alternative=hanging, + width=line, + distance=1em, + margin=2em, + headstyle=monobold, + headcolor=colorone] + +\setupframedtext + [setuptext] + [framecolor=colorone, + rulethickness=1pt, + corner=round] + +\usemodule[punk] + +\usetypescript[punk] + +\definelayer + [page] + [width=\paperwidth, + height=\paperheight] + +\definestartstop + [smallexample] + [before={\blank\bgroup\ss\small\setupwhitespace[medium]\setupblank[medium]}, + after={\par\egroup\blank}] + +\stopenvironment diff --git a/doc/context/sources/general/manuals/xml/xml-mkiv-titlepage.tex b/doc/context/sources/general/manuals/xml/xml-mkiv-titlepage.tex new file mode 100644 index 000000000..427557214 --- /dev/null +++ b/doc/context/sources/general/manuals/xml/xml-mkiv-titlepage.tex @@ -0,0 +1,47 @@ +\environment xml-mkiv-style + +\startcomponent xml-mkiv-titlepage + +\setuplayout[page] + +\startstandardmakeup + \startfontclass[none] % nil the current fontclass since it may append its features + \EnableRandomPunk + \setlayerframed + [page] + [width=\paperwidth,height=\paperheight, + background=color,backgroundcolor=colorone,backgroundoffset=1ex,frame=off] + {} + \definedfont[demo@punk at 18pt] + \setbox\scratchbox\vbox { + \hsize\dimexpr\paperwidth+2ex\relax + \setupinterlinespace + \baselineskip 1\baselineskip plus 1pt minus 1pt + \raggedcenter + \color[colortwo]{\dorecurse{1000}{XML }} + } + \setlayer + [page] + [preset=middle] + {\vsplit\scratchbox to \dimexpr\paperheight+2ex\relax} + \definedfont[demo@punk at 90pt] + \setstrut + \setlayerframed + [page] + [preset=rightbottom,offset=10mm] + [foregroundcolor=colorthree,align=flushright,offset=overlay,frame=off] + {Dealing\\with XML in\\Con\TeX t MkIV} + \definedfont[demo@punk at 18pt] + \setstrut + \setlayerframed + [page] + [preset=righttop,offset=10mm,x=3mm,rotation=90] + 
[foregroundcolor=colorthree,align=flushright,offset=overlay,frame=off] + {Hans Hagen, Pragma ADE, \currentdate} + \tightlayer[page] + \stopfontclass +\stopstandardmakeup + +\setuplayout + +\stopcomponent diff --git a/doc/context/sources/general/manuals/xml/xml-mkiv-tricks.tex b/doc/context/sources/general/manuals/xml/xml-mkiv-tricks.tex new file mode 100644 index 000000000..f8c65ecc9 --- /dev/null +++ b/doc/context/sources/general/manuals/xml/xml-mkiv-tricks.tex @@ -0,0 +1,814 @@ +\environment xml-mkiv-style + +\startcomponent xml-mkiv-tricks + +\startchapter[title={Tips and tricks}] + +\startsection[title={tracing}] + +It can be hard to debug code as much happens kind of behind the scenes. +Therefore we have a couple of tracing options. Of course you can typeset some +status information, using for instance: + +\startxmlcmd {\cmdbasicsetup{xmlshow}} + typeset the tree given by \cmdinternal {cd:node} +\stopxmlcmd + +\startxmlcmd {\cmdbasicsetup{xmlinfo}} + typeset the name in the element given by \cmdinternal {cd:node} +\stopxmlcmd + +\startxmlcmd {\cmdbasicsetup{xmlpath}} + returns the complete path (including namespace prefix and index) of the + given \cmdinternal {cd:node} +\stopxmlcmd + +\startbuffer[demo] +<?xml version="1.0"?> +<document> + <section> + <content> + <p>first</p> + <p><b>second</b></p> + </content> + </section> + <section> + <content> + <p><b>third</b></p> + <p>fourth</p> + </content> + </section> +</document> +\stopbuffer + +Say that we have the following \XML: + +\typebuffer[demo] + +and the next definitions: + +\startbuffer +\startxmlsetups xml:demo:base + \xmlsetsetup{#1}{p|b}{xml:demo:*} +\stopxmlsetups + +\startxmlsetups xml:demo:p + \xmlflush{#1} + \par +\stopxmlsetups + +\startxmlsetups xml:demo:b + \par + \xmlpath{#1} : \xmlflush{#1} + \par +\stopxmlsetups + +\xmlregisterdocumentsetup{example-10}{xml:demo:base} + +\xmlprocessbuffer{example-10}{demo}{} +\stopbuffer + +\typebuffer + +This will give us: + +\blank \startpacked \getbuffer 
\stoppacked \blank + +If you use \type {\xmlshow} you will get a complete subtree which can +be handy for tracing but can also lead to large documents. + +We also have a bunch of trackers that can be enabled, like: + +\starttyping +\enabletrackers[xml.show,xml.parse] +\stoptyping + +The full list (currently) is: + +\starttabulate[|lT|p|] +\NC xml.entities \NC show what entities are seen and replaced \NC \NR +\NC xml.path \NC show the result of parsing an lpath expression \NC \NR +\NC xml.parse \NC show stepwise resolving of expressions \NC \NR +\NC xml.profile \NC report all parsed lpath expressions (in the log) \NC \NR +\NC xml.remap \NC show what namespaces are remapped \NC \NR +\NC lxml.access \NC report errors with respect to resolving (symbolic) nodes \NC \NR +\NC lxml.comments \NC show the comments that are encountered (if at all) \NC \NR +\NC lxml.loading \NC show what files are loaded and converted \NC \NR +\NC lxml.setups \NC show what setups are being associated to elements \NC \NR +\stoptabulate + +In one of our workflows we produce books from \XML\ where the (educational) +content is organized in many small files. Each book has about 5~chapters and each +chapter is made of sections that contain text, exercises, resources, etc.\ and so +the document is assembled from thousands of files (don't worry, runtime inclusion +is pretty fast). In order to see where in the sources content resides we can +trace the filename. + +\startxmlcmd {\cmdbasicsetup{xmlinclusion}} + returns the file where the node comes from +\stopxmlcmd + +\startxmlcmd {\cmdbasicsetup{xmlinclusions}} + returns the list of files where the node comes from +\stopxmlcmd + +\startxmlcmd {\cmdbasicsetup{xmlbadinclusions}} + returns a list of files that were not included due to some problem +\stopxmlcmd + +Of course you have to make sure that these names end up somewhere visible, for +instance in the margin. 
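+
+A minimal sketch of how that could look, reusing the \type {xml:demo:p} setup
+from the example above. It assumes that the standard \type {\inmargin} command
+suits the layout; the font switches are only an illustration and not part of
+the \XML\ interface:
+
+\starttyping
+\startxmlsetups xml:demo:p
+    % put the name of the file that this node comes from in the margin
+    \inmargin{\tt\tfx\xmlinclusion{#1}}
+    \xmlflush{#1}
+    \par
+\stopxmlsetups
+\stoptyping
+
+File names that contain characters that are special to \TEX\ may need extra
+care in such a setup.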
+ +\stopsection + +\startsection[title={expansion}] + +For novice users the concept of expansion might sound frightening and to some +extent it is. However, it is important enough to spend some words on it here. + +It is good to realize that most setups are sort of immediate. When one setup is +issued, it can call another one and so on. Normally you won't notice that but +there are cases where that can be a problem. In \TEX\ you can define a macro, +take for instance: + +\starttyping +\startxmlsetups xml:foo + \def\foobar{\xmlfirst{#1}{/bar}} +\stopxmlsetups +\stoptyping + +Here you store a reference to the node \type {bar} in \type {\foobar}, maybe for later use. In +this case the content is not yet fetched, it will be done when \type {\foobar} is +called. + +\starttyping +\startxmlsetups xml:foo + \edef\foobar{\xmlfirst{#1}{/bar}} +\stopxmlsetups +\stoptyping + +Here the content of \type {bar} becomes the body of the macro. But what if +\type {bar} itself contains elements that also contain elements? When there +is a setup for \type {bar} it will be triggered and so on. + +Say that setup looks like: + +\starttyping +\startxmlsetups xml:bar + \def\barfoo{\xmlflush{#1}} +\stopxmlsetups +\stoptyping + +Then we get something like: + +\starttyping +\foobar => {\def\barfoo{...}} +\stoptyping + +When \type {\barfoo} is not defined we get an error and when it is known and expands +to something weird we might also get an error. + +Especially when you don't know what content can show up, this can result in errors +when an expansion fails, for example because some macro being used is not defined. 
+To prevent this we can define a macro: + +\starttyping +\starttexdefinition unexpanded xml:bar:macro #1 + \def\barfoo{\xmlflush{#1}} +\stoptexdefinition + +\startxmlsetups xml:bar + \texdefinition{xml:bar:macro}{#1} +\stopxmlsetups +\stoptyping + +The setup \type {xml:bar} will still expand but the replacement text now is just the +call to the macro, think of: + +\starttyping +\foobar => {\texdefinition{xml:bar:macro}{#1}} +\stoptyping + +But this is often not needed, most \CONTEXT\ commands can handle the expansions +quite well but it's good to know that there is a way out. So, now to some +examples. Imagine that we have an \XML\ file that looks as follows: + +\starttyping +<?xml version='1.0' ?> +<demo> + <chapter> + <title>Some <em>short</em> title</title> + <content> + zeta + <index> + <key>zeta</key> + <content>zeta again</content> + </index> + alpha + <index> + <key>alpha</key> + <content>alpha <em>again</em></content> + </index> + gamma + <index> + <key>gamma</key> + <content>gamma</content> + </index> + beta + <index> + <key>beta</key> + <content>beta</content> + </index> + delta + <index> + <key>delta</key> + <content>delta</content> + </index> + done! + </content> + </chapter> +</demo> +\stoptyping + +There are a few structure related elements here: a chapter (with its list entry) +and some index entries. Both are multipass related and therefore travel around. +This means that when we let data end up in the auxiliary file, we need to make +sure that we end up with either expanded data (i.e.\ no references to the \XML\ +tree) or with robust forward and backward references to elements in the tree. + +Here we discuss three approaches (and more may show up later): pushing \XML\ into +the auxiliary file and using references to elements either or not with an +associated setup. We control the variants with a switch. 
+ +\starttyping +\newcount\TestMode + +\TestMode=0 % expansion=xml +\TestMode=1 % expansion=yes, index, setup +\TestMode=2 % expansion=yes +\stoptyping + +We apply a couple of setups: + +\starttyping +\startxmlsetups xml:mysetups + \xmlsetsetup{\xmldocument}{demo|index|content|chapter|title|em}{xml:*} +\stopxmlsetups + +\xmlregistersetup{xml:mysetups} +\stoptyping + +The main document is processed with: + +\starttyping +\startxmlsetups xml:demo + \xmlflush{#1} + \subject{contents} + \placelist[chapter][criterium=all] + \subject{index} + \placeregister[index][criterium=all] + \page % else buffer is forgotten when placing header +\stopxmlsetups +\stoptyping + +First we show three alternative ways to deal with the chapter. The first case +expands the \XML\ reference so that we have an \XML\ stream in the auxiliary +file. This stream is processed as a small independent subfile when needed. The +second case registers a reference to the current element (\type {#1}). This means +that we have access to all data of this element, like attributes, title and +content. What happens depends on the given setup. The third variant does the same +but here the setup is part of the reference. + +\starttyping +\startxmlsetups xml:chapter + \ifcase \TestMode + % xml code travels around + \setuphead[chapter][expansion=xml] + \startchapter[title=eh: \xmltext{#1}{title}] + \xmlfirst{#1}{content} + \stopchapter + \or + % index is used for access via setup + \setuphead[chapter][expansion=yes,xmlsetup=xml:title:flush] + \startchapter[title=\xmlgetindex{#1}] + \xmlfirst{#1}{content} + \stopchapter + \or + % tex call to xml using index is used + \setuphead[chapter][expansion=yes] + \startchapter[title=hm: \xmlreference{#1}{xml:title:flush}] + \xmlfirst{#1}{content} + \stopchapter + \fi +\stopxmlsetups + +\startxmlsetups xml:title:flush + \xmltext{#1}{title} +\stopxmlsetups +\stoptyping + +We need to deal with emphasis and the content of the chapter. 
+ +\starttyping +\startxmlsetups xml:em + \begingroup\em\xmlflush{#1}\endgroup +\stopxmlsetups + +\startxmlsetups xml:content + \xmlflush{#1} +\stopxmlsetups +\stoptyping + +A similar approach is followed with the index entries. Watch how we use the +numbered entries variant (in this case we could also have used just \type +{entries} and \type {keys}). + +\starttyping +\startxmlsetups xml:index + \ifcase \TestMode + \setupregister[index][expansion=xml,xmlsetup=] + \setstructurepageregister + [index] + [entries:1=\xmlfirst{#1}{content}, + keys:1=\xmltext{#1}{key}] + \or + \setupregister[index][expansion=yes,xmlsetup=xml:index:flush] + \setstructurepageregister + [index] + [entries:1=\xmlgetindex{#1}, + keys:1=\xmltext{#1}{key}] + \or + \setupregister[index][expansion=yes,xmlsetup=] + \setstructurepageregister + [index] + [entries:1=\xmlreference{#1}{xml:index:flush}, + keys:1=\xmltext{#1}{key}] + \fi +\stopxmlsetups + +\startxmlsetups xml:index:flush + \xmlfirst{#1}{content} +\stopxmlsetups +\stoptyping + +Instead of this flush, you can use the predefined setup \type {xml:flush} +unless it is overloaded by you. + +The file is processed by: + +\starttyping +\starttext + \xmlprocessfile{main}{test.xml}{} +\stoptext +\stoptyping + +We don't show the result here. If you're curious what the output is, you can test +it yourself. In that case it also makes sense to peek into the \type {test.tuc} +file to see how the information travels around. The \type {metadata} fields carry +information about how to process the data. + +The first case, the \XML\ expansion one, is somewhat special in the sense that +internally we use small pseudo files. 
You can control the rendering by tweaking +the following setups: + +\starttyping +\startxmlsetups xml:ctx:sectionentry + \xmlflush{#1} +\stopxmlsetups + +\startxmlsetups xml:ctx:registerentry + \xmlflush{#1} +\stopxmlsetups +\stoptyping + +{\em When these methods work out okay the other structural elements will be +dealt with in a similar way.} + +\stopsection + +\startsection[title={special cases}] + +Normally the content will be flushed under a special (so called) catcode regime. +This means that characters that have a special meaning in \TEX\ will have no such +meaning in an \XML\ file. If you want content to be treated as \TEX\ code, you can +use one of the following: + +\startxmlcmd {\cmdbasicsetup{xmlflushcontext}} + flush the given \cmdinternal {cd:node} using the \TEX\ character + interpretation scheme +\stopxmlcmd + +\startxmlcmd {\cmdbasicsetup{xmlcontext}} + flush the match of \cmdinternal {cd:lpath} for the given \cmdinternal + {cd:node} using the \TEX\ character interpretation scheme +\stopxmlcmd + +We use this in cases like: + +\starttyping +.... + \xmlsetsetup {#1} { + tm|texformula| + } {xml:*} +.... 
+ +\startxmlsetups xml:tm + \mathematics{\xmlflushcontext{#1}} +\stopxmlsetups + +\startxmlsetups xml:texformula + \placeformula\startformula\xmlflushcontext{#1}\stopformula +\stopxmlsetups +\stoptyping + +\stopsection + +\startsection[title={collecting}] + +Say that your document has: + +\starttyping +<table> + <tr> + <td>foo</td> + <td>bar</td> + </tr> +</table> +\stoptyping + +And that you need to convert that to \TEX\ speak like: + +\starttyping +\bTABLE + \bTR + \bTD foo \eTD + \bTD bar \eTD + \eTR +\eTABLE +\stoptyping + +A simple mapping is: + +\starttyping +\startxmlsetups xml:table + \bTABLE \xmlflush{#1} \eTABLE +\stopxmlsetups +\startxmlsetups xml:tr + \bTR \xmlflush{#1} \eTR +\stopxmlsetups +\startxmlsetups xml:td + \bTD \xmlflush{#1} \eTD +\stopxmlsetups +\stoptyping + +The \type {\bTD} command is a so-called delimited command which means that it +picks up its argument by looking for an \type {\eTD}. For the simple case here +this works quite well because the flush is inside the pair. This is not the case +in the following variant: + +\starttyping +\startxmlsetups xml:td:start + \bTD +\stopxmlsetups +\startxmlsetups xml:td:stop + \eTD +\stopxmlsetups +\startxmlsetups xml:td + \xmlsetup{#1}{xml:td:start} + \xmlflush{#1} + \xmlsetup{#1}{xml:td:stop} +\stopxmlsetups +\stoptyping + +When for some reason \TEX\ gets confused you can resort to a mechanism that +collects content. 
+ +\starttyping +\startxmlsetups xml:td:start + \startcollect + \bTD + \stopcollect +\stopxmlsetups +\startxmlsetups xml:td:stop + \startcollect + \eTD + \stopcollect +\stopxmlsetups +\startxmlsetups xml:td + \startcollecting + \xmlsetup{#1}{xml:td:start} + \xmlflush{#1} + \xmlsetup{#1}{xml:td:stop} + \stopcollecting +\stopxmlsetups +\stoptyping + +You can even implement solutions that effectively do this: + +\starttyping +\startcollecting + \startcollect \bTABLE \stopcollect + \startcollect \bTR \stopcollect + \startcollect \bTD \stopcollect + \startcollect foo\stopcollect + \startcollect \eTD \stopcollect + \startcollect \bTD \stopcollect + \startcollect bar\stopcollect + \startcollect \eTD \stopcollect + \startcollect \eTR \stopcollect + \startcollect \eTABLE \stopcollect +\stopcollecting +\stoptyping + +Of course you only need to go to such lengths when the situation demands it. Here is +another weird one: + +\starttyping +\startcollecting + \startcollect \setupsomething[\stopcollect + \startcollect foo=\stopcollect + \startcollect FOO,\stopcollect + \startcollect bar=\stopcollect + \startcollect BAR,\stopcollect + \startcollect ]\stopcollect +\stopcollecting +\stoptyping + +\stopsection + +\startsection[title={selectors and injectors}] + +This section describes a somewhat special feature, one that we needed for a project +where we could not touch the original content but could add specific sections for +our own purposes. Hopefully the example demonstrates its usefulness. 
+ +\enabletrackers[lxml.selectors] + +\startbuffer[foo] +<?xml version="1.0" encoding="UTF-8"?> + +<?context-directive message info 1: this is a demo file ?> +<?context-message-directive info 2: this is a demo file ?> + +<one> + <two> + <?context-select begin t1 t2 t3 ?> + <three> + t1 t2 t3 + <?context-directive injector crlf t1 ?> + t1 t2 t3 + </three> + <?context-select end ?> + <?context-select begin t4 ?> + <four> + t4 + </four> + <?context-select end ?> + <?context-select begin t8 ?> + <four> + t8.0 + t8.0 + </four> + <?context-select end ?> + <?context-include begin t4 ?> + <!-- + <three> + t4.t3 + <?context-directive injector crlf t1 ?> + t4.t3 + </three> + --> + <three> + t3 + <?context-directive injector crlf t1 ?> + t3 + </three> + <?context-include end ?> + <?context-select begin t8 ?> + <four> + t8.1 + t8.1 + </four> + <?context-select end ?> + <?context-select begin t8 ?> + <four> + t8.2 + t8.2 + </four> + <?context-select end ?> + <?context-select begin t4 ?> + <four> + t4 + t4 + </four> + <?context-select end ?> + <?context-directive injector page t7 t8 ?> + foo + <?context-directive injector blank t1 ?> + bar + <?context-directive injector page t7 t8 ?> + bar + </two> +</one> +\stopbuffer + +\typebuffer[foo] + +First we show how to plug in a directive. Processing instructions like the +following are normally ignored by an \XML\ processor, unless they make sense +to it. + +\starttyping +<?context-directive message info 1: this is a demo file ?> +<?context-message-directive info 2: this is a demo file ?> +\stoptyping + +We can define a message handler as follows: + +\startbuffer +\def\MyMessage#1#2#3{\writestatus{#1}{#2 #3}} + +\xmlinstalldirective{message}{MyMessage} +\stopbuffer + +\typebuffer \getbuffer + +When this file is processed you will see this on the console: + +\starttyping +info > 1: this is a demo file +info > 2: this is a demo file +\stoptyping + +The file has some sections that can be used or ignored. 
The recipe for +obeying \type {t1} and \type {t4} is the following: + +\startbuffer +\xmlsetinjectors[t1] +\xmlsetinjectors[t4] + +\startxmlsetups xml:initialize + \xmlapplyselectors{#1} + \xmlsetsetup {#1} { + one|two|three|four + } {xml:*} +\stopxmlsetups + +\xmlregistersetup{xml:initialize} + +\startxmlsetups xml:one + [ONE \xmlflush{#1} ONE] +\stopxmlsetups + +\startxmlsetups xml:two + [TWO \xmlflush{#1} TWO] +\stopxmlsetups + +\startxmlsetups xml:three + [THREE \xmlflush{#1} THREE] +\stopxmlsetups + +\startxmlsetups xml:four + [FOUR \xmlflush{#1} FOUR] +\stopxmlsetups +\stopbuffer + +\typebuffer \getbuffer + +This typesets: + +\startnarrower +\xmlprocessbuffer{main}{foo}{} +\stopnarrower + +The include coding is kind of special: it permits adding content (in a comment) +and ignoring the rest so that we indeed can add something without interfering +with the original. Of course in a normal workflow such messy solutions are +not needed, but alas, often workflows are not that clean, especially when one +has no real control over the source. 
+ +\startxmlcmd {\cmdbasicsetup{xmlsetinjectors}} + enables a list of injectors that will be used +\stopxmlcmd + +\startxmlcmd {\cmdbasicsetup{xmlresetinjectors}} + resets the list of injectors +\stopxmlcmd + +\startxmlcmd {\cmdbasicsetup{xmlinjector}} + expands an injection (command); normally this one is only used + in some setup or for testing +\stopxmlcmd + +\startxmlcmd {\cmdbasicsetup{xmlapplyselectors}} + analyzes the tree \cmdinternal {cd:node} for marked sections that + will be injected +\stopxmlcmd + +We have some injections predefined: + +\starttyping +\startsetups xml:directive:injector:page + \page +\stopsetups + +\startsetups xml:directive:injector:column + \column +\stopsetups + +\startsetups xml:directive:injector:blank + \blank +\stopsetups +\stoptyping + +In the example we see: + +\starttyping +<?context-directive injector page t7 t8 ?> +\stoptyping + +When we set \type {\xmlsetinjectors[t7]} a page break will be injected in that spot. +Tags like \type {t7}, \type {t8} etc.\ can represent versions. 
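+
+The naming pattern of the predefined injections can presumably be followed for
+injectors of your own. The following is only a sketch under two assumptions that
+are not confirmed here: that a directive name like \type {crlf} (as used in the
+demo file) is resolved as \type {xml:directive:injector:crlf}, and that \type
+{\xmlsetinjectors} accepts a comma separated list of tags.
+
+\starttyping
+% assumption: user injectors follow the predefined naming scheme
+\startsetups xml:directive:injector:crlf
+    \crlf
+\stopsetups
+
+% assumption: several tags can be enabled in one call
+\xmlsetinjectors[t1,t7]
+\stoptyping
+
+With such a definition, a directive tagged \type {t1} or \type {t7} would trigger
+its injection while directives with other tags stay inactive.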
+ +\stopsection + +\startsection[title=preprocessing] + +% local match = lpeg.match +% local replacer = lpeg.replacer("BAD TITLE:","<bold>BAD TITLE:</bold>") +% +% function lxml.preprocessor(data,settings) +% return match(replacer,data) +% end + +\startbuffer[pre-code] +\startluacode + function lxml.preprocessor(data,settings) + return string.find(data,"BAD TITLE:") + and string.gsub(data,"BAD TITLE:","<bold>BAD TITLE:</bold>") + or data + end +\stopluacode +\stopbuffer + +\startbuffer[pre-xml] +\startxmlsetups pre:demo:initialize + \xmlsetsetup{#1}{*}{pre:demo:*} +\stopxmlsetups + +\xmlregisterdocumentsetup{pre:demo}{pre:demo:initialize} + +\startxmlsetups pre:demo:root + \xmlflush{#1} +\stopxmlsetups + +\startxmlsetups pre:demo:bold + \begingroup\bf\xmlflush{#1}\endgroup +\stopxmlsetups + +\starttext + \xmlprocessbuffer{pre:demo}{demo}{} +\stoptext +\stopbuffer + +Say that you have the following \XML\ setup: + +\typebuffer[pre-xml] + +and that (such things happen) the input looks like this: + +\startbuffer[demo] +<root> +BAD TITLE: crap crap crap ... + +BAD TITLE: crap crap crap ... +</root> +\stopbuffer + +\typebuffer[demo] + +You can then clean up these \type {BAD TITLE}'s as follows: + +\typebuffer[pre-code] + +and get as result: + +\start \getbuffer[pre-code,pre-xml] \stop + +The preprocessor function gets the current settings as its second argument, and +the field \type {currentresource} can be used to limit the actions to +specific resources; in our case it's \type {buffer: demo}. Afterwards you can +reset the preprocessor with: + +\startluacode +lxml.preprocessor = nil +\stopluacode + +Future versions might give some more control over preprocessors. For now consider +it to be a quick hack. 
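+
+Limiting the preprocessor to one resource could look as follows. This is a minimal
+sketch, assuming that the \type {currentresource} field holds exactly the string
+\type {buffer: demo} mentioned above; check the actual value in your own run
+before relying on it.
+
+\starttyping
+\startluacode
+    function lxml.preprocessor(data,settings)
+        -- assumption: currentresource identifies the input exactly like this
+        if settings.currentresource == "buffer: demo" then
+            -- the parentheses drop the match count that gsub also returns
+            return (string.gsub(data,"BAD TITLE:","<bold>BAD TITLE:</bold>"))
+        end
+        -- all other resources are passed on untouched
+        return data
+    end
+\stopluacode
+\stoptyping
+
+Testing the resource first keeps the replacement away from files and buffers that
+happen to contain the same text for legitimate reasons.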
+ +\stopsection + +\stopchapter + +\stopcomponent diff --git a/doc/context/sources/general/manuals/xml/xml-mkiv.tex b/doc/context/sources/general/manuals/xml/xml-mkiv.tex index 37f321646..77054e79c 100644 --- a/doc/context/sources/general/manuals/xml/xml-mkiv.tex +++ b/doc/context/sources/general/manuals/xml/xml-mkiv.tex @@ -31,4385 +31,25 @@ % % \xmldirect -\input lxml-ctx.mkiv - -\settrue \xmllshowtitle -\setfalse\xmllshowwarning - -\usemodule[set-11] - -\loadsetups[i-context] - -% \definehspace[squad][1em plus .25em minus .25em] - -\usemodule[abr-02] - -\setuplayout - [location=middle, - marking=on, - backspace=20mm, - cutspace=20mm, - topspace=15mm, - header=15mm, - footer=15mm, - height=middle, - width=middle] - -\setuppagenumbering - [alternative=doublesided, - location=] - -\setupfootertexts - [][pagenumber] - -\setupheadertexts - [][chapter] - -\setupheader - [color=colortwo, - style=bold] - -\setupfooter - [color=colortwo, - style=bold] - -\setuphead - [chapter] - [page={yes,header,right}, - header=empty, - style=\bfc] - -\setupsectionblock - [page={yes,header,right}] - -\starttexdefinition unexpanded section:chapter:number #1 - \doifmode{*sectionnumber} { - \bf - \llap{<\enspace}#1\enspace> - } -\stoptexdefinition - -\starttexdefinition unexpanded section:section:number #1 - \doifmode{*sectionnumber} { - \bf - \llap{<<\enspace}#1\enspace>> - } -\stoptexdefinition - -\starttexdefinition unexpanded section:subsection:number #1 - \doifmode{*sectionnumber} { - \bf - \llap{<<<\enspace}#1\enspace>>> - } -\stoptexdefinition - -\setuphead[chapter] [numbercolor=black,numbercommand=\texdefinition{section:chapter:number}] -\setuphead[section] [numbercolor=black,numbercommand=\texdefinition{section:section:number}] -\setuphead[subsection][numbercolor=black,numbercommand=\texdefinition{section:subsection:number}] - -\setuphead - [section] - [style=\bfa] - -\setuplist - [chapter] - [style=bold] - -\setupinteractionscreen - [option=doublesided] - -\setupalign - 
[tolerant,stretch] - -\setupwhitespace - [big] - -\setuptolerance - [tolerant] - -\doifelsemode {atpragma} { - \setupbodyfont[lucidaot,10pt] -} { - \setupbodyfont[dejavu,10pt] -} - -\definecolor[colorone] [b=.5] -\definecolor[colortwo] [s=.3] -\definecolor[colorthree][y=.5] - -\setuptype - [color=colorone] - -\setuptyping - [color=colorone] - -\setuphead - [lshowtitle] - [style=\tt, - color=colorone] - -\setuphead - [chapter,section] - [numbercolor=colortwo, - color=colorone] - -\definedescription - [xmlcmd] - [alternative=hanging, - width=line, - distance=1em, - margin=2em, - headstyle=monobold, - headcolor=colorone] - -\setupframedtext - [setuptext] - [framecolor=colorone, - rulethickness=1pt, - corner=round] - -\usemodule[punk] - -\usetypescript[punk] - -\definelayer - [page] - [width=\paperwidth, - height=\paperheight] +\environment xml-mkiv-style \starttext -\setuplayout[page] - -\startstandardmakeup - \startfontclass[none] % nil the current fontclass since it may append its features - \EnableRandomPunk - \setlayerframed - [page] - [width=\paperwidth,height=\paperheight, - background=color,backgroundcolor=colorone,backgroundoffset=1ex,frame=off] - {} - \definedfont[demo@punk at 18pt] - \setbox\scratchbox\vbox { - \hsize\dimexpr\paperwidth+2ex\relax - \setupinterlinespace - \baselineskip 1\baselineskip plus 1pt minus 1pt - \raggedcenter - \color[colortwo]{\dorecurse{1000}{XML }} - } - \setlayer - [page] - [preset=middle] - {\vsplit\scratchbox to \dimexpr\paperheight+2ex\relax} - \definedfont[demo@punk at 90pt] - \setstrut - \setlayerframed - [page] - [preset=rightbottom,offset=10mm] - [foregroundcolor=colorthree,align=flushright,offset=overlay,frame=off] - {Dealing\\with XML in\\Con\TeX t MkIV} - \definedfont[demo@punk at 18pt] - \setstrut - \setlayerframed - [page] - [preset=righttop,offset=10mm,x=3mm,rotation=90] - [foregroundcolor=colorthree,align=flushright,offset=overlay,frame=off] - {Hans Hagen, Pragma ADE, \currentdate} - \tightlayer[page] - 
\stopfontclass -\stopstandardmakeup - -\setuplayout +\component xml-mkiv-titlepage \startfrontmatter - -\starttitle[title=Contents] - -\placelist - [chapter,section] - -\stoptitle - -\startchapter[title={Introduction}] - -This manual presents the \MKIV\ way of dealing with \XML. Although the -traditional \MKII\ streaming parser has a charming simplicity in its control, for -complex documents the tree based \MKIV\ method is more convenient. It is for this -reason that the old method has been removed from \MKIV. If you are familiar with -\XML\ processing in \MKII, then you will have noticed that the \MKII\ commands -have \type {XML} in their name. The \MKIV\ commands have a lowercase \type {xml} -in their names. That way there is no danger for confusion or a mixup. - -You may wonder why we do these manipulations in \TEX\ and not use \XSLT\ (or -other transformation methods) instead. The advantage of an integrated approach is -that it simplifies usage. Think of not only processing the document, but also -using \XML\ for managing resources in the same run. An \XSLT\ approach is just as -verbose (after all, you still need to produce \TEX\ code) and probably less -readable. In the case of \MKIV\ the integrated approach is also faster and gives -us the option to manipulate content at runtime using \LUA. It has the additional -advantage that to some extend we can handle a mix of \TEX\ and \XML\ because we -know when we're doing one or the other. - -This manual is dedicated to Taco Hoekwater, one of the first \CONTEXT\ users, and -also the first to use it for processing \XML. Who could have thought at that time -that we would have a more convenient way of dealing with those angle brackets. -The second version for this manual is dedicated to Thomas Schmitz, a power user -who occasionally became victim of the evolving mechanisms. 
- -\blank - -\startlines -Hans Hagen -\PRAGMA -Hasselt NL -2008\endash2016 -\stoplines - -\stopchapter - + \component xml-mkiv-contents + \component xml-mkiv-introduction \stopfrontmatter \startbodymatter - -\startchapter[title={Setting up a converter}] - -\startsection[title={from structure to setup}] - -We use a very simple document structure for demonstrating how a converter is -defined. In practice a mapping will be more complex, especially when we have a -style with complex chapter openings using data coming from all kind of places, -different styling of sections with the same name, selectively (out of order) -flushed content, special formatting, etc. - -\typefile{manual-demo-1.xml} - -Say that this document is stored in the file \type {demo.xml}, then the following -code can be used as starting point: - -\starttyping -\startxmlsetups xml:demo:base - \xmlsetsetup{#1}{document|section|p}{xml:demo:*} -\stopxmlsetups - -\xmlregisterdocumentsetup{demo}{xml:demo:base} - -\startxmlsetups xml:demo:document - \starttitle[title={Contents}] - \placelist[chapter] - \stoptitle - \xmlflush{#1} -\stopxmlsetups - -\startxmlsetups xml:demo:section - \startchapter[title=\xmlfirst{#1}{/title}] - \xmlfirst{#1}{/content} - \stopchapter -\stopxmlsetups - -\startxmlsetups xml:demo:p - \xmlflush{#1}\endgraf -\stopxmlsetups - -\xmlprocessfile{demo}{demo.xml}{} -\stoptyping - -Watch out! These are not just setups, but specific \XML\ setups which get an -argument passed (the \type {#1}). If for some reason your \XML\ processing fails, -it might be that you mistakenly have used a normal setup definition. The argument -\type {#1} represents the current node (element) and is a unique identifier. For -instance a \type {<p>..</p>} can have an identifier {demo::5}. 
So, we can get -something: - -\starttyping -\xmlflush{demo::5}\endgraf -\stoptyping - -but as well: - -\starttyping -\xmlflush{demo::6}\endgraf -\stoptyping - -Keep in mind that the references to the actual nodes (elements) are -abstractions, you never see those \type {<id>::<number>}'s, because we will use -either the abstract \type {#1} (any node) or an explicit reference like \type -{demo}. The previous setup when issued will be like: - -\starttyping -\startchapter[title=\xmlfirst{demo::3}{/title}] - \xmlfirst{demo::4}{/content} -\stopchapter -\stoptyping - -Here the \type {title} is used to typeset the chapter title but also for an entry -in the table of contents. At the moment the title is typeset the \XML\ node gets -looked up and expanded into real text. However, for the list it gets stored for -later use. One can argue that this is not needed for \XML, because one can just -filter all the titles and use page references, but then one also loses the -control one normally has over such titles. For instance it can be that some -titles are rendered differently and for that we need to keep track of usage. -Doing that with transformations or filtering is often more complex than leaving -that to \TEX. As soon as the list gets typeset, the reference (\type {demo::#3}) -is used for the lookup. This is because by default the title is stored as given. -So, as long as we make sure the \XML\ source is loaded before the table of -contents is typeset we're ok. Later we will look into this in more detail, for -now it's enough to know that in most cases the abstract \type {#1} reference will -work out ok. - -Contrary to the style definitions this interface looks rather low level (with no -optional arguments) and the main reason for this is that we want processing to be -fast. 
So, the basic framework is: - -\starttyping -\startxmlsetups xml:demo:base - % associate setups with elements -\stopxmlsetups - -\xmlregisterdocumentsetup{demo}{xml:demo:base} - -% define setups for matches - -\xmlprocessfile{demo}{demo.xml}{} -\stoptyping - -In this example we mostly just flush the content of an element and in the case of -a section we flush explicit child elements. The \type {#1} in the example code -represents the current element. The line: - -\starttyping -\xmlsetsetup{demo}{*}{-} -\stoptyping - -sets the default for each element to \quote {just ignore it}. A \type {+} would -make the default to always flush the content. This means that at this point we -only handle: - -\starttyping -<section> - <title>Some title</title> - <content> - <p>a paragraph of text</p> - </content> -</section> -\stoptyping - -In the next section we will deal with the slightly more complex itemize and -figure placement. At first sight all these setups may look overkill but keep in -mind that normally the number of elements is rather limited. The complexity is -often in the style and having access to each snippet of content is actually -quite handy for that. 
- -\stopsection - -\startsection[title={alternative solutions}] - -Dealing with an itemize is rather simple (as long as we forget about -attributes that control the behaviour): - -\starttyping -<itemize> - <item>first</item> - <item>second</item> -</itemize> -\stoptyping - -First we need to add \type {itemize} to the setup assignment (unless we've used -the wildcard \type {*}): - -\starttyping -\xmlsetsetup{demo}{document|section|p|itemize}{xml:demo:*} -\stoptyping - -The setup can look like: - -\starttyping -\startxmlsetups xml:demo:itemize - \startitemize - \xmlfilter{#1}{/item/command(xml:demo:itemize:item)} - \stopitemize -\stopxmlsetups - -\startxmlsetups xml:demo:itemize:item - \startitem - \xmlflush{#1} - \stopitem -\stopxmlsetups -\stoptyping - -An alternative is to map item directly: - -\starttyping -\xmlsetsetup{demo}{document|section|p|itemize|item}{xml:demo:*} -\stoptyping - -and use: - -\starttyping -\startxmlsetups xml:demo:itemize - \startitemize - \xmlflush{#1} - \stopitemize -\stopxmlsetups - -\startxmlsetups xml:demo:item - \startitem - \xmlflush{#1} - \stopitem -\stopxmlsetups -\stoptyping - -Sometimes, a more local solution using filters and \type {/command(...)} makes more -sense, especially when the \type {item} tag is used for other purposes as well. - -Explicit flushing with \type {command} is definitely the way to go when you have -complex products. In one of our projects we compose math school books from many -thousands of small \XML\ files, and from one source set several products are -typeset. Within a book sections get done differently, content gets used, ignored -or interpreted differently depending on the kind of content, so there is a -constant checking of attributes that drive the rendering. In that a generic setup -for a title element makes less sense than explicit ones for each case. (We're -talking of huge amounts of files here, including multiple images on each rendered -page.) 
- -When using \type {command} you can pass two arguments, the first is the setup for -the match, the second one for the miss, as in: - -\starttyping -\xmlfilter{#1}{/element/command(xml:true,xml:false)} -\stoptyping - -Back to the example, this leaves us with dealing with the resources, like -figures: - -\starttyping -<resource type='figure'> - <caption>A picture of a cow.</caption> - <content><external file="cow.pdf"/></content> -</resource> -\stoptyping - -Here we can use a more restricted match: - -\starttyping -\xmlsetsetup{demo}{resource[@type='figure']}{xml:demo:figure} -\xmlsetsetup{demo}{external}{xml:demo:*} -\stoptyping - -and the definitions: - -\starttyping -\startxmlsetups xml:demo:figure - \placefigure - {\xmlfirst{#1}{/caption}} - {\xmlfirst{#1}{/content}} -\stopxmlsetups - -\startxmlsetups xml:demo:external - \externalfigure[\xmlatt{#1}{file}] -\stopxmlsetups -\stoptyping - -At this point it is good to notice that \type {\xmlatt{#1}{file}} is passed as it -is: a macro call. This means that when a macro like \type {\externalfigure} uses -the first argument frequently without first storing its value, the lookup is done -several times. A solution for this is: - -\starttyping -\startxmlsetups xml:demo:external - \expanded{\externalfigure[\xmlatt{#1}{file}]} -\stopxmlsetups -\stoptyping - -Because the lookup is rather fast, normally there is no need to bother about this -too much because internally \CONTEXT\ already makes sure such expansion happens -only once. - -An alternative definition for placement is the following: - -\starttyping -\xmlsetsetup{demo}{resource}{xml:demo:resource} -\stoptyping - -with: - -\starttyping -\startxmlsetups xml:demo:resource - \placefloat - [\xmlatt{#1}{type}] - {\xmlfirst{#1}{/caption}} - {\xmlfirst{#1}{/content}} -\stopxmlsetups -\stoptyping - -This way you can specify \type {table} as type too. Because you can define your -own float types, more complex variants are also possible. 
In that case it makes -sense to provide some default behaviour too: - -\starttyping -\definefloat[figure-here][figure][default=here] -\definefloat[figure-left][figure][default=left] -\definefloat[table-here] [table] [default=here] -\definefloat[table-left] [table] [default=left] - -\startxmlsetups xml:demo:resource - \placefloat - [\xmlattdef{#1}{type}{figure}-\xmlattdef{#1}{location}{here}] - {\xmlfirst{#1}{/caption}} - {\xmlfirst{#1}{/content}} -\stopxmlsetups -\stoptyping - -In this example we support two types and two locations. We default to a figure -placed (when possible) at the current location. - -\stopsection - -\stopchapter - -\startchapter[title={Filtering content}] - -\startsection[title={\TEX\ versus \LUA}] - -It will not come as a surprise that we can access \XML\ files from \TEX\ as well -as from \LUA. In fact there are two methods to deal with \XML\ in \LUA. First -there are the low level \XML\ functions in the \type {xml} namespace. On top of -those functions there is a set of functions in the \type {lxml} namespace that -deals with \XML\ in a more \TEX ie way. Most of these have similar commands at -the \TEX\ end. - -\startbuffer -\startxmlsetups first:demo:one - \xmlfilter {#1} {artist/name[text()='Randy Newman']/.. - /albums/album[position()=3]/command(first:demo:two)} -\stopxmlsetups - -\startxmlsetups first:demo:two - \blank \start \tt - \xmldisplayverbatim{#1} - \stop \blank -\stopxmlsetups - -\xmlprocessfile{demo}{music-collection.xml}{first:demo:one} -\stopbuffer - -\typebuffer - -This gives the following snippet of verbatim \XML\ code. The indentation is -conform the indentation in the whole \XML\ file. \footnote {The (probably -outdated) \XML\ file contains the collection stores on my slimserver instance. 
-You can use the \type {mtxrun --script flac} to generate such files.} - -\doifmodeelse {atpragma} { - \getbuffer -} { - \typefile{xml-mkiv-01.xml} -} - -An alternative written in \LUA\ looks as follows: - -\startbuffer -\blank \start \tt \startluacode - local m = lxml.load("mine","music-collection.xml") -- m == lxml.id("mine") - local p = "artist/name[text()='Randy Newman']/../albums/album[position()=4]" - local l = lxml.filter(m,p) -- returns a list (with one entry) - lxml.displayverbatim(l[1]) -\stopluacode \stop \blank -\stopbuffer - -\typebuffer - -This produces: - -\doifmodeelse {atpragma} { - \getbuffer -} { - \typefile{xml-mkiv-02.xml} -} - -You can use both methods mixed but in practice we will use the \TEX\ commands in -regular styles and the mixture in modules, for instance in those dealing with -\MATHML\ and cals tables. For complex matters you can write your own finalizers -(the last action to be taken in a match) in \LUA\ and use them at the \TEX\ end. - -\stopsection - -\startsection[title={a few details}] - -In \CONTEXT\ setups are a rather common variant on macros (\TEX\ commands) but -with their own namespace. An example of a setup is: - -\starttyping -\startsetup doc:print - \setuppapersize[A4][A4] -\stopsetup - -\startsetup doc:screen - \setuppapersize[S6][S4] -\stopsetup -\stoptyping - -Given the previous definitions, later on we can say something like: - -\starttyping -\doifmodeelse {paper} { - \setup[doc:print] -} { - \setup[doc:screen] -} -\stoptyping - -Another example is: - -\starttyping -\startsetup[doc:header] - \marking[chapter] - \space - -- - \space - \pagenumber -\stopsetup -\stoptyping - -in combination with: - -\starttyping -\setupheadertexts[\setup{doc:header}] -\stoptyping - -Here the advantage is that instead of ending up with an unreadable header -definitions, we use a nicely formatted setup. 
An important property of setups and -the reason why they were introduced long ago is that spaces and newlines are -ignored in the definition. This means that we don't have to worry about so-called -spurious spaces but it also means that when we do want a space, we have to use -the \type {\space} command. - -The only difference between setups and \XML\ setups is that the latter -get an argument (\type {#1}) that reflects the current node in the \XML\ tree. - -\stopsection - -\startsection[title={CDATA}] - -What to do with \type {CDATA}? There are a few methods at the \LUA\ end for -dealing with it but here we just mention how you can influence the rendering. -There are four macros that play a role here: - -\starttyping -\unexpanded\def\xmlcdataobeyedline {\obeyedline} -\unexpanded\def\xmlcdataobeyedspace{\strut\obeyedspace} -\unexpanded\def\xmlcdatabefore {\begingroup\tt} -\unexpanded\def\xmlcdataafter {\endgroup} -\stoptyping - -Technically you can overload them but beware of side effects. Normally you won't -see much \type {CDATA} and whenever we do, it involves special data that needs -very special treatment anyway. - -\stopsection - -\startsection[title={Entities}] - -As usual with any way of encoding documents you need escapes in order to encode -the characters that are used in tagging the content, embedding comments, escaping -special characters in strings (in programming languages), etc. In \XML\ this -means that in order to encode characters like \type {<} you need an escape like \type -{&lt;} and in order then to encode an \type {&} you need \type {&amp;}. - -In a typesetting workflow using a programming language like \TEX, another problem -shows up. There we have different special characters, like \type {$ $} for triggering -math, but also the backslash, braces etc. Even one such special character is already -enough to have yet another escaping mechanism at work. 
- -Ideally a user should not worry about these issues but it helps in tracking down problems -when you know what happens under the hood. Also it is good to know that in the -code there are several ways to deal with these issues. Take the following document: - -\starttyping -<text> - Here we have a bit of a &lt;&mess&gt;: - - &#35; &#x23; - &#37; &#x25; - &#92; &#x5C; - &#123; &#x7B; - &#124; &#x7C; - &#125; &#x7D; - &#126; &#x7E; -</text> -\stoptyping - -When the file is read the \type {&lt;} entity will be replaced by \type {<} and -the \type {&gt;} by \type {>}. The numeric entities will be replaced by the -characters they refer to. The \type {&mess} is kind of special. We do preload -a huge list of more or less standardized entities but \type {mess} is not in -there. However, it is possible to have it defined in the document preamble, like: - -\starttyping -<!DOCTYPE dummy SYSTEM "dummy.dtd" [ - <!ENTITY mess "what a mess" > -]> -\stoptyping - -or even this: - -\starttyping -<!DOCTYPE dummy SYSTEM "dummy.dtd" [ - <!ENTITY mess "<p>what a mess</p>" > -]> -\stoptyping - -You can also define it in your document style using one of: - -\startxmlcmd {\cmdbasicsetup{xmlsetentity}} - replaces entity with name \cmdinternal {cd:name} by \cmdinternal {cd:text} -\stopxmlcmd - -\startxmlcmd {\cmdbasicsetup{xmltexentity}} - replaces entity with name \cmdinternal {cd:name} by \cmdinternal {cd:text} - typeset under a \TEX\ regime -\stopxmlcmd - -Such a definition will always have a higher priority than the one defined -in the document. Anyway, when the document is read in, all entities are -resolved and those that need a special treatment because they map to some -text are stored in such a way that we can roundtrip them. As a consequence, -as soon as the content gets pushed into \TEX, we need not only to intercept -special characters but also have to make sure that the following works: - -\starttyping -\xmltexentity {tex} {\TEX} -\stoptyping - -Here the backslash starts a control sequence while in regular content a -backslash is just that: a backslash. 
- -Special characters are really special when we have to move text around -in a \TEX\ ecosystem. - -\starttyping -<text> - <title>About #3</title> -</text> -\stoptyping - -If we map and define title as follows: - -\starttyping -\startxmlsetup xml:title - \title{\xmlflush{#1}} -\stopxmlsetup -\stoptyping - -normally something \type {\xmlflush {id::123}} will be written to the -auxiliary file and in most cases that is quite okay, but if we have this: - -\starttyping -\setuphead[title][expansion=yes] -\stoptyping - -then we don't want the \type {#} to end up as hash because later on \TEX\ -can get very confused about it because it sees some argument then in a -probably unexpected way. This is solved by escaping the hash like this: - -\starttyping -About \Ux{23}3 -\stoptyping - -The \type {\Ux} command will convert its hexadecimal argument into a -character. Of course one then needs to typeset such a text under a \TEX\ -character regime but that is normally the case anyway. - -\stopsection - -\stopchapter - -\startchapter[title={Commands}] - -\startsection[title={nodes and lpaths}] - -The amount of commands available for manipulating the \XML\ file is rather large. -Many of the commands cooperate with the already discussed setups, a fancy name -for a collection of macro calls either or not mixed with text. - -Most of the commands are just shortcuts to \LUA\ calls, which means that the real -work is done by \LUA. In fact, what happens is that we have a continuous transfer -of control from \TEX\ to \LUA, where \LUA\ prints back either data (like element -content or attribute values) or just invokes a setup whereby it passes a -reference to the node resolved conform the path expression. The invoked setup -itself might return control to \LUA\ again, etc. - -This sounds complicated but examples will show what we mean here. First we -present the whole repertoire of commands. 
Because users can read the source code, -they might uncover more commands, but only the ones discussed here are official. -The commands are grouped in categories. - -In the following sections \cmdinternal {cd:node} means a reference to a node: -this can be the identifier of the root (the loaded xml tree) or a reference to a -node in that tree (often the result of some lookup. A \cmdinternal {cd:lpath} is -a fancy name for a path expression (as with \XSLT) but resolved by \LUA. - -\stopsection - -\startsection[title={commands}] - -There are a lot of commands available but you probably can ignore most of them. -We try to be complete which means that there is for instance \type {\xmlfirst} as -well as \type {\xmllast} but you probably never need the last one. There are also -commands that were used when testing this interface and we see no reason to -remove them. Some obscure ones are used in modules and after a while even I often -forget that they exist. To give you an idea of what commands are important we -show their use in generating the \CONTEXT\ command definitions (\type -{x-set-11.mkiv}) per January 2016: - -\startcolumns[n=2,balance=yes] -\starttabulate[|l|r|] -\NC \type {\xmlall} \NC 1 \NC \NR -\NC \type {\xmlatt} \NC 23 \NC \NR -\NC \type {\xmlattribute} \NC 1 \NC \NR -\NC \type {\xmlcount} \NC 1 \NC \NR -\NC \type {\xmldoif} \NC 2 \NC \NR -\NC \type {\xmldoifelse} \NC 1 \NC \NR -\NC \type {\xmlfilterlist} \NC 4 \NC \NR -\NC \type {\xmlflush} \NC 5 \NC \NR -\NC \type {\xmlinclude} \NC 1 \NC \NR -\NC \type {\xmlloadonly} \NC 1 \NC \NR -\NC \type {\xmlregisterdocumentsetup} \NC 1 \NC \NR -\NC \type {\xmlsetsetup} \NC 1 \NC \NR -\NC \type {\xmlsetup} \NC 4 \NC \NR -\stoptabulate -\stopcolumns - -As you can see filtering, flushing and accessing attributes score high. Below we show -the statistics of a quite complex rendering (5 variants of schoolbooks: basic book, -answers, teachers guide, worksheets, full blown version with extensive tracing). 
- -\startcolumns[n=2,balance=yes] -\starttabulate[|l|r|] -\NC \type {\xmladdindex} \NC 3 \NC \NR -\NC \type {\xmlall} \NC 5 \NC \NR -\NC \type {\xmlappendsetup} \NC 1 \NC \NR -\NC \type {\xmlapplyselectors} \NC 1 \NC \NR -\NC \type {\xmlatt} \NC 40 \NC \NR -\NC \type {\xmlattdef} \NC 9 \NC \NR -\NC \type {\xmlattribute} \NC 10 \NC \NR -\NC \type {\xmlbadinclusions} \NC 3 \NC \NR -\NC \type {\xmlconcat} \NC 3 \NC \NR -\NC \type {\xmlcount} \NC 1 \NC \NR -\NC \type {\xmldelete} \NC 11 \NC \NR -\NC \type {\xmldoif} \NC 39 \NC \NR -\NC \type {\xmldoifelse} \NC 28 \NC \NR -\NC \type {\xmldoifelsetext} \NC 13 \NC \NR -\NC \type {\xmldoifnot} \NC 2 \NC \NR -\NC \type {\xmldoifnotselfempty} \NC 1 \NC \NR -\NC \type {\xmlfilter} \NC 100 \NC \NR -\NC \type {\xmlfirst} \NC 51 \NC \NR -\NC \type {\xmlflush} \NC 69 \NC \NR -\NC \type {\xmlflushcontext} \NC 2 \NC \NR -\NC \type {\xmlinclude} \NC 1 \NC \NR -\NC \type {\xmlincludeoptions} \NC 5 \NC \NR -\NC \type {\xmlinclusion} \NC 16 \NC \NR -\NC \type {\xmlinjector} \NC 1 \NC \NR -\NC \type {\xmlloaddirectives} \NC 1 \NC \NR -\NC \type {\xmlmapvalue} \NC 4 \NC \NR -\NC \type {\xmlmatch} \NC 1 \NC \NR -\NC \type {\xmlprependsetup} \NC 5 \NC \NR -\NC \type {\xmlregisterdocumentsetup} \NC 2 \NC \NR -\NC \type {\xmlregistersetup} \NC 1 \NC \NR -\NC \type {\xmlremapnamespace} \NC 1 \NC \NR -\NC \type {\xmlsetfunction} \NC 2 \NC \NR -\NC \type {\xmlsetinjectors} \NC 2 \NC \NR -\NC \type {\xmlsetsetup} \NC 11 \NC \NR -\NC \type {\xmlsetup} \NC 76 \NC \NR -\NC \type {\xmlstrip} \NC 1 \NC \NR -\NC \type {\xmlstripanywhere} \NC 1 \NC \NR -\NC \type {\xmltag} \NC 1 \NC \NR -\NC \type {\xmltext} \NC 53 \NC \NR -\NC \type {\xmlvalue} \NC 2 \NC \NR -\stoptabulate -\stopcolumns - -Here many more are used but this is an exceptional case. The top is again -dominated by filtering, flushing and attribute consulting. The list can actually -be smaller. 
For instance, the \type {\xmlcount} can just as well be \type -{\xmlfilter} with a \type {count} finalizer. There are also some special ones, -like the injectors, that are needed for fine-tuning the final result. - -\stopsection - -\startsection[title={loading}] - -\startxmlcmd {\cmdbasicsetup{xmlloadfile}} - loads the file \cmdinternal {cd:file} and registers it under \cmdinternal - {cd:name} and applies either the given or the standard \cmdinternal - {cd:xmlsetup} (alias: \type {\xmlload}) -\stopxmlcmd - -\startxmlcmd {\cmdbasicsetup{xmlloadbuffer}} - loads the buffer \cmdinternal {cd:buffer} and registers it under - \cmdinternal {cd:name} and applies either the given or the standard - \cmdinternal {cd:xmlsetup} -\stopxmlcmd - -\startxmlcmd {\cmdbasicsetup{xmlloaddata}} - loads \cmdinternal {cd:text} and registers it under \cmdinternal - {cd:name} and applies either the given or the standard \cmdinternal - {cd:xmlsetup} -\stopxmlcmd - -\startxmlcmd {\cmdbasicsetup{xmlloadonly}} - loads \cmdinternal {cd:text} and registers it under \cmdinternal - {cd:name} and applies either the given or the standard \cmdinternal - {cd:xmlsetup} but doesn't flush the content -\stopxmlcmd - -\startxmlcmd {\cmdbasicsetup{xmlinclude}} - includes the file specified by attribute \cmdinternal {cd:name} of the - element located by \cmdinternal {cd:lpath} at node \cmdinternal {cd:node} -\stopxmlcmd - -\startxmlcmd {\cmdbasicsetup{xmlprocessfile}} - registers file \cmdinternal {cd:file} as \cmdinternal {cd:name} and - processes the tree starting with \cmdinternal {cd:xmlsetup} (alias: - \type {\xmlprocess}) -\stopxmlcmd - -\startxmlcmd {\cmdbasicsetup{xmlprocessbuffer}} - registers buffer \cmdinternal {cd:buffer} as \cmdinternal {cd:name} and processes - the tree starting with \cmdinternal {cd:xmlsetup} -\stopxmlcmd - -\startxmlcmd {\cmdbasicsetup{xmlprocessdata}} - registers \cmdinternal {cd:text} as \cmdinternal {cd:name} and processes - the tree starting with \cmdinternal {cd:xmlsetup} -\stopxmlcmd - -The initial setup 
defaults to \type {xml:process} that is defined -as follows: - -\starttyping -\startsetups xml:process - \xmlregistereddocumentsetups\xmldocument - \xmlmain\xmldocument -\stopsetups -\stoptyping - -First we apply the setups associated with the document (including common setups) -and then we flush the whole document. The macro \type {\xmldocument} expands to -the current document id. There is also \type {\xmlself} which expands to the -current node number (\type {#1} in setups). - -\startxmlcmd {\cmdbasicsetup{xmlmain}} - returns the whole document -\stopxmlcmd - -Normally such a flush will trigger a chain reaction of setups associated with the -child elements. - -\stopsection - -\startsection[title={saving}] - -\startxmlcmd {\cmdbasicsetup{xmlsave}} - saves the given node \cmdinternal {cd:node} in the file \cmdinternal {cd:file} -\stopxmlcmd - -\startxmlcmd {\cmdbasicsetup{xmltofile}} - saves the match of \cmdinternal {cd:lpath} in the file \cmdinternal {cd:file} -\stopxmlcmd - -\startxmlcmd {\cmdbasicsetup{xmltobuffer}} - saves the match of \cmdinternal {cd:lpath} in the buffer \cmdinternal {cd:buffer} -\stopxmlcmd - -\startxmlcmd {\cmdbasicsetup{xmltobufferverbose}} - saves the match of \cmdinternal {cd:lpath} verbatim in the buffer \cmdinternal - {cd:buffer} -\stopxmlcmd - -% \startxmlcmd {\cmdbasicsetup{xmltoparameters}} -% converts the match of \cmdinternal {cd:lpath} to key|/|values (for tracing) -% \stopxmlcmd - -The next command is only needed when you have messed with the tree using -\LUA\ code. - -\startxmlcmd {\cmdbasicsetup{xmladdindex}} - (re)indexes a tree -\stopxmlcmd - -The following macros are only used in special situations and are not really meant -for users. 
- -\startxmlcmd {\cmdbasicsetup{xmlraw}} - flushes the content of \cmdinternal {cd:node} with original entities -\stopxmlcmd - -\startxmlcmd {\cmdbasicsetup{startxmlraw}} - flushes the wrapped content with original entities -\stopxmlcmd - -\stopsection - -\startsection[title={flushing data}] - -When we flush an element, the associated \XML\ setups are expanded. The most -straightforward way to flush an element is the following. Keep in mind that the -returned values themselves can trigger setups and therefore flushes. - -\startxmlcmd {\cmdbasicsetup{xmlflush}} - returns all nodes under \cmdinternal {cd:node} -\stopxmlcmd - -You can restrict flushing by using commands that accept a specification. - -\startxmlcmd {\cmdbasicsetup{xmltext}} - returns the text of the matching \cmdinternal {cd:lpath} under \cmdinternal - {cd:node} -\stopxmlcmd - -\startxmlcmd {\cmdbasicsetup{xmlpure}} - returns the text of the matching \cmdinternal {cd:lpath} under \cmdinternal - {cd:node} without \type {\Ux} escaped special \TEX\ characters -\stopxmlcmd - -\startxmlcmd {\cmdbasicsetup{xmlflushtext}} - returns the text of the \cmdinternal {cd:node} -\stopxmlcmd - -\startxmlcmd {\cmdbasicsetup{xmlflushpure}} - returns the text of the \cmdinternal {cd:node} without \type {\Ux} escaped - special \TEX\ characters -\stopxmlcmd - -\startxmlcmd {\cmdbasicsetup{xmlnonspace}} - returns the text of the matching \cmdinternal {cd:lpath} under \cmdinternal - {cd:node} without embedded spaces -\stopxmlcmd - -\startxmlcmd {\cmdbasicsetup{xmlall}} - returns all nodes under \cmdinternal {cd:node} that match \cmdinternal - {cd:lpath} -\stopxmlcmd - -\startxmlcmd {\cmdbasicsetup{xmllastmatch}} - returns all nodes found in the last match -\stopxmlcmd - -\startxmlcmd {\cmdbasicsetup{xmlfirst}} - returns the first node under \cmdinternal {cd:node} that matches \cmdinternal - {cd:lpath} -\stopxmlcmd - -\startxmlcmd {\cmdbasicsetup{xmllast}} - returns the last node under \cmdinternal {cd:node} that matches 
\cmdinternal - {cd:lpath} -\stopxmlcmd - -\startxmlcmd {\cmdbasicsetup{xmlfilter}} - at a match of \cmdinternal {cd:lpath} a given filter \type {filter} is applied - and the result is returned -\stopxmlcmd - -\startxmlcmd {\cmdbasicsetup{xmlsnippet}} - returns the \cmdinternal {cd:number}\high{th} element under \cmdinternal - {cd:node} -\stopxmlcmd - -\startxmlcmd {\cmdbasicsetup{xmlposition}} - returns the \cmdinternal {cd:number}\high{th} match of \cmdinternal - {cd:lpath} at node \cmdinternal {cd:node}; a negative number starts at the - end (alias: \type {\xmlindex}) -\stopxmlcmd - -\startxmlcmd {\cmdbasicsetup{xmlelement}} - returns the \cmdinternal {cd:number}\high{th} child of node \cmdinternal {cd:node}; - a negative number starts at the end -\stopxmlcmd - -\startxmlcmd {\cmdbasicsetup{xmlpos}} - returns the index (position) in the parent node of \cmdinternal {cd:node} -\stopxmlcmd - -\startxmlcmd {\cmdbasicsetup{xmlconcat}} - returns the sequence of nodes that match \cmdinternal {cd:lpath} at - \cmdinternal {cd:node} whereby \cmdinternal {cd:text} is put between each - match -\stopxmlcmd - -\startxmlcmd {\cmdbasicsetup{xmlconcatrange}} - returns the \cmdinternal {cd:first}\high {th} upto \cmdinternal - {cd:last}\high {th} of nodes that match \cmdinternal {cd:lpath} at - \cmdinternal {cd:node} whereby \cmdinternal {cd:text} is put between each - match -\stopxmlcmd - -\startxmlcmd {\cmdbasicsetup{xmlcommand}} - apply the given \cmdinternal {cd:xmlsetup} to each match of \cmdinternal - {cd:lpath} at node \cmdinternal {cd:node} -\stopxmlcmd - -\startxmlcmd {\cmdbasicsetup{xmlstrip}} - remove leading and trailing spaces from nodes under \cmdinternal {cd:node} - that match \cmdinternal {cd:lpath} -\stopxmlcmd - -\startxmlcmd {\cmdbasicsetup{xmlstripped}} - remove leading and trailing spaces from nodes under \cmdinternal {cd:node} - that match \cmdinternal {cd:lpath} and return the content afterwards -\stopxmlcmd - -\startxmlcmd {\cmdbasicsetup{xmlstripnolines}} - 
remove leading and trailing spaces as well as collapse embedded spaces - from nodes under \cmdinternal {cd:node} that match \cmdinternal {cd:lpath} -\stopxmlcmd - -\startxmlcmd {\cmdbasicsetup{xmlstrippednolines}} - remove leading and trailing spaces as well as collapse embedded spaces from - nodes under \cmdinternal {cd:node} that match \cmdinternal {cd:lpath} and - return the content afterwards -\stopxmlcmd - -\startxmlcmd {\cmdbasicsetup{xmlverbatim}} - flushes the content verbatim code (without any wrapping, i.e. no fonts - are selected and such) -\stopxmlcmd - -\startxmlcmd {\cmdbasicsetup{xmlinlineverbatim}} - return the content of the node as inline verbatim code; no further - interpretation (expansion) takes place and spaces are honoured; it uses the - following wrapper -\stopxmlcmd - -\startxmlcmd {\cmdbasicsetup{startxmlinlineverbatim}} - wraps inline verbatim mode using the environment specified (a prefix \type - {xml:} is added to the environment name) -\stopxmlcmd - -\startxmlcmd {\cmdbasicsetup{xmldisplayverbatim}} - return the content of the node as display verbatim code; no further - interpretation (expansion) takes place and leading and trailing spaces and - newlines are treated special; it uses the following wrapper -\stopxmlcmd - -\startxmlcmd {\cmdbasicsetup{startxmldisplayverbatim}} - wraps the content in display verbatim using the environment specified (a prefix - \type {xml:} is added to the environment name) -\stopxmlcmd - -\startxmlcmd {\cmdbasicsetup{xmlprettyprint}} - pretty print (with colors) the node \cmdinternal {cd:node}; use the \CONTEXT\ - \SCITE\ lexers when available (\type {\usemodule [scite]}) -\stopxmlcmd - -\startxmlcmd {\cmdbasicsetup{xmlflushspacewise}} - flush node \cmdinternal {cd:node} obeying spaces and newlines -\stopxmlcmd - -\startxmlcmd {\cmdbasicsetup{xmlflushlinewise}} - flush node \cmdinternal {cd:node} obeying newlines -\stopxmlcmd - -\stopsection - -\startsection[title={information}] - -The following commands 
return strings. Normally these are used in tests. - -\startxmlcmd {\cmdbasicsetup{xmlname}} - returns the complete name (including namespace prefix) of the - given \cmdinternal {cd:node} -\stopxmlcmd - -\startxmlcmd {\cmdbasicsetup{xmlnamespace}} - returns the namespace of the given \cmdinternal {cd:node} -\stopxmlcmd - -\startxmlcmd {\cmdbasicsetup{xmltag}} - returns the tag of the element, without namespace prefix -\stopxmlcmd - -\startxmlcmd {\cmdbasicsetup{xmlcount}} - returns the number of matches of \cmdinternal {cd:lpath} at node \cmdinternal - {cd:node} -\stopxmlcmd - -\startxmlcmd {\cmdbasicsetup{xmlatt}} - returns the value of attribute \cmdinternal {cd:name} or empty if no such - attribute exists -\stopxmlcmd - -\startxmlcmd {\cmdbasicsetup{xmlattdef}} - returns the value of attribute \cmdinternal {cd:name} or \cmdinternal - {cd:string} if no such attribute exists -\stopxmlcmd - -\startxmlcmd {\cmdbasicsetup{xmlrefatt}} - returns the value of attribute \cmdinternal {cd:name} or empty if no such - attribute exists; a leading \type {#} is removed (nicer for tex) -\stopxmlcmd - -\startxmlcmd {\cmdbasicsetup{xmlchainatt}} - returns the value of attribute \cmdinternal {cd:name} or empty if no such - attribute exists; backtracks till a match is found -\stopxmlcmd - -\startxmlcmd {\cmdbasicsetup{xmlchainattdef}} - returns the value of attribute \cmdinternal {cd:name} or \cmdinternal - {cd:string} if no such attribute exists; backtracks till a match is found -\stopxmlcmd - -\startxmlcmd {\cmdbasicsetup{xmlattribute}} - finds a first match for \cmdinternal {cd:lpath} at \cmdinternal {cd:node} and - returns the value of attribute \cmdinternal {cd:name} or empty if no such - attribute exists -\stopxmlcmd - -\startxmlcmd {\cmdbasicsetup{xmlattributedef}} - finds a first match for \cmdinternal {cd:lpath} at \cmdinternal {cd:node} and - returns the value of attribute \cmdinternal {cd:name} or \cmdinternal - {cd:text} if no such attribute exists -\stopxmlcmd - 
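As a minimal sketch of the attribute accessors described above, assume a hypothetical \type {figure} element with optional \type {label} and \type {width} attributes and a \type {caption} child. The setup name, the element names and the attribute names are made up for illustration, and the argument order is the one suggested by the descriptions above:

\starttyping
\startxmlsetups demo:figure
    % attribute of the current node, empty when absent
    label : \xmlatt{#1}{label}\par
    % the same kind of lookup, with a fallback when the attribute is missing
    width : \xmlattdef{#1}{width}{10cm}\par
    % attribute of the first match of the path, seen from the current node
    language : \xmlattribute{#1}{caption}{language}\par
\stopxmlsetups
\stoptyping

The variant with a default is probably the most convenient one when the result is passed on as a dimension or a key, because an empty expansion then never reaches the receiving macro.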
-\startxmlcmd {\cmdbasicsetup{xmllastatt}} - returns the last attribute found (this avoids a lookup) -\stopxmlcmd - -\startxmlcmd {\cmdbasicsetup{xmlsetatt}} - sets the value of attribute \cmdinternal {cd:name} -\stopxmlcmd - -\startxmlcmd {\cmdbasicsetup{xmlsetattribute}} - sets the value of attribute \cmdinternal {cd:name} for each match of \cmdinternal - {cd:lpath} -\stopxmlcmd - -\stopsection - -\startsection[title={manipulation}] - -You can use \LUA\ code to manipulate the tree and it makes no sense to duplicate -this in \TEX. In the future we might provide an interface to some of this -functionality. Keep in mind that manipulating the tree might have side effects as -we maintain several indices into the tree that then also need to be updated. - -\stopsection - -\startsection[title={integration}] - -If you write a module that deals with \XML, for instance processing cals tables, -then you need ways to control specific behaviour. For instance, you might want to -add a background to the table. Such directives are collected in \XML\ files and -can be loaded on demand. - -\startxmlcmd {\cmdbasicsetup{xmlloaddirectives}} - loads \CONTEXT\ directives from \cmdinternal {cd:file} that will get - interpreted when processing documents -\stopxmlcmd - -A directives definition file looks as follows: - -\starttyping -<?xml version="1.0" standalone="yes"?> - -<directives> - <directive attribute='id' value="100" - setup="cdx:100"/> - <directive attribute='id' value="101" - setup="cdx:101"/> - <directive attribute='cdx' value="colors" element="cals:table" - setup="cdx:cals:table:colors"/> - <directive attribute='cdx' value="vertical" element="cals:table" - setup="cdx:cals:table:vertical"/> - <directive attribute='cdx' value="noframe" element="cals:table" - setup="cdx:cals:table:noframe"/> - <directive attribute='cdx' value="*" element="cals:table" - setup="cdx:cals:table:*"/> -</directives> -\stoptyping - -Examples of usage can be found in \type {x-cals.mkiv}. 
The directive is triggered -by an attribute. Instead of a setup you can specify a setup to be applied before -and after the node gets flushed. - -\startxmlcmd {\cmdbasicsetup{xmldirectives}} - apply the setups directive associated with the node -\stopxmlcmd - -\startxmlcmd {\cmdbasicsetup{xmldirectivesbefore}} - apply the before directives associated with the node -\stopxmlcmd - -\startxmlcmd {\cmdbasicsetup{xmldirectivesafter}} - apply the after directives associated with the node -\stopxmlcmd - -\startxmlcmd {\cmdbasicsetup{xmlinstalldirective}} - defines a directive that hooks into a handler -\stopxmlcmd - -Normally a directive will be put in the \XML\ file, for instance as: - -\starttyping -<?context-mathml-directive minus reduction yes ?> -\stoptyping - -Here the \type {mathml} is the general class of directives and \type {minus} a -subclass, in our case a specific element. - -\stopsection - -\startsection[title={setups}] - -The basic building blocks of \XML\ processing are setups. These are just -collections of macros that are expanded. These setups get one argument passed -(\type {#1}): - -\starttyping -\startxmlsetups somedoc:somesetup - \xmlflush{#1} -\stopxmlsetups -\stoptyping - -This argument is normally a number that internally refers to a specific node in -the \XML\ tree. The user should see it as an abstract reference and not depend on -its numeric property. Just think of it as \quote {the current node}. You can (and -probably will) call such setups using: - -\startxmlcmd {\cmdbasicsetup{xmlsetup}} - expands setup \cmdinternal {cd:setup} and pass \cmdinternal {cd:node} as - argument -\stopxmlcmd - -However, in most cases the setups are associated to specific elements, -something that users of \XSLT\ might recognize as templates. 
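Such an association can be sketched as follows, in the same style as the other examples in this manual. The \type {title} and \type {para} elements and the \type {xml:demo:} setup names are hypothetical; only the commands themselves are taken from this chapter:

\starttyping
\startxmlsetups xml:demo:setups
    % map every title and para element onto a setup named after its tag
    \xmlsetsetup{\xmldocument}{title|para}{xml:demo:*}
\stopxmlsetups

\xmlregistersetup{xml:demo:setups}

\startxmlsetups xml:demo:title
    \title{\xmlflush{#1}}
\stopxmlsetups

\startxmlsetups xml:demo:para
    \xmlflush{#1}\par
\stopxmlsetups
\stoptyping

Registering the mapping as a setup of its own, instead of calling \type {\xmlsetsetup} directly, means that it is applied to each document that gets processed afterwards.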
- -\startxmlcmd {\cmdbasicsetup{xmlsetfunction}} - associates function \cmdinternal {cd:luafunction} to the elements in - namespace \cmdinternal {cd:name} that match \cmdinternal {cd:lpath} -\stopxmlcmd - -\startxmlcmd {\cmdbasicsetup{xmlsetsetup}} - associates setups \cmdinternal {cd:setup} (\TEX\ code) with the matching - nodes of \cmdinternal {cd:lpath} or root \cmdinternal {cd:node} -\stopxmlcmd - -\startxmlcmd {\cmdbasicsetup{xmlprependsetup}} - pushes \cmdinternal {cd:setup} to the front of global list of setups -\stopxmlcmd - -\startxmlcmd {\cmdbasicsetup{xmlappendsetup}} - adds \cmdinternal {cd:setup} to the global list of setups to be applied - (alias: \type{\xmlregistersetup}) -\stopxmlcmd - -\startxmlcmd {\cmdbasicsetup{xmlbeforesetup}} - pushes \cmdinternal {cd:setup} into the global list of setups; the - last setup is the position -\stopxmlcmd - -\startxmlcmd {\cmdbasicsetup{xmlaftersetup}} - adds \cmdinternal {cd:setup} to the global list of setups; the last setup - is the position -\stopxmlcmd - -\startxmlcmd {\cmdbasicsetup{xmlremovesetup}} - removes \cmdinternal {cd:setup} from the global list of setups -\stopxmlcmd - -\startxmlcmd {\cmdbasicsetup{xmlprependdocumentsetup}} - pushes \cmdinternal {cd:setup} to the front of list of setups to be applied - to \cmdinternal {cd:name} -\stopxmlcmd - -\startxmlcmd {\cmdbasicsetup{xmlappenddocumentsetup}} - adds \cmdinternal {cd:setup} to the list of setups to be applied to - \cmdinternal {cd:name} (you can also use the alias: \type - {\xmlregisterdocumentsetup}) -\stopxmlcmd - -\startxmlcmd {\cmdbasicsetup{xmlbeforedocumentsetup}} - pushes \cmdinternal {cd:setup} into the setups to be applied to \cmdinternal - {cd:name}; the last setup is the position -\stopxmlcmd - -\startxmlcmd {\cmdbasicsetup{xmlafterdocumentsetup}} - adds \cmdinternal {cd:setup} to the setups to be applied to \cmdinternal - {cd:name}; the last setup is the position -\stopxmlcmd - -\startxmlcmd {\cmdbasicsetup{xmlremovedocumentsetup}} - 
removes \cmdinternal {cd:setup} from the global list of setups to be applied - to \cmdinternal {cd:name} -\stopxmlcmd - -\startxmlcmd {\cmdbasicsetup{xmlresetsetups}} - removes all global setups -\stopxmlcmd - -\startxmlcmd {\cmdbasicsetup{xmlresetdocumentsetups}} - removes all setups from the \cmdinternal {cd:name} specific list of setups to - be applied -\stopxmlcmd - -\startxmlcmd {\cmdbasicsetup{xmlflushdocumentsetups}{setup}} - applies \cmdinternal {cd:setup} (can be a list) to \cmdinternal {cd:name} -\stopxmlcmd - -\startxmlcmd {\cmdbasicsetup{xmlregisteredsetups}} - applies all global setups to the current document -\stopxmlcmd - -\startxmlcmd {\cmdbasicsetup{xmlregistereddocumentsetups}} - applies all document specific \cmdinternal {cd:setup} to document - \cmdinternal {cd:name} -\stopxmlcmd - -\stopsection - -\startsection[title={testing}] - -The following test macros all take a \cmdinternal {cd:node} as first argument -and an \cmdinternal {cd:lpath} as second: - -\startxmlcmd {\cmdbasicsetup{xmldoif}} - expands to \cmdinternal {cd:true} when \cmdinternal {cd:lpath} matches at - node \cmdinternal {cd:node} -\stopxmlcmd - -\startxmlcmd {\cmdbasicsetup{xmldoifnot}} - expands to \cmdinternal {cd:true} when \cmdinternal {cd:lpath} does not match - at node \cmdinternal {cd:node} -\stopxmlcmd - -\startxmlcmd {\cmdbasicsetup{xmldoifelse}} - expands to \cmdinternal {cd:true} when \cmdinternal {cd:lpath} matches at - node \cmdinternal {cd:node} and to \cmdinternal {cd:false} otherwise -\stopxmlcmd - -\startxmlcmd {\cmdbasicsetup{xmldoiftext}} - expands to \cmdinternal {cd:true} when the node matching \cmdinternal - {cd:lpath} at node \cmdinternal {cd:node} has some content -\stopxmlcmd - -\startxmlcmd {\cmdbasicsetup{xmldoifnottext}} - expands to \cmdinternal {cd:true} when the node matching \cmdinternal - {cd:lpath} at node \cmdinternal {cd:node} has no content -\stopxmlcmd - -\startxmlcmd {\cmdbasicsetup{xmldoifelsetext}} - expands to \cmdinternal {cd:true} when 
the node matching \cmdinternal - {cd:lpath} at node \cmdinternal {cd:node} has content and to \cmdinternal - {cd:false} otherwise -\stopxmlcmd - -\startxmlcmd {\cmdbasicsetup{xmldoifatt}} - expands to \cmdinternal {cd:true} when the attribute matching \cmdinternal - {cd:node} and the name given as second argument matches the third argument -\stopxmlcmd - -\startxmlcmd {\cmdbasicsetup{xmldoifnotatt}} - expands to \cmdinternal {cd:true} when the attribute matching \cmdinternal - {cd:node} and the name given as second argument differs from the third - argument -\stopxmlcmd - -\startxmlcmd {\cmdbasicsetup{xmldoifelseatt}} - expands to \cmdinternal {cd:true} when the attribute matching \cmdinternal - {cd:node} and the name given as second argument matches the third argument - and to \cmdinternal {cd:false} otherwise -\stopxmlcmd - -\startxmlcmd {\cmdbasicsetup{xmldoifelseempty}} - expands to \cmdinternal {cd:true} when the node matching \cmdinternal - {cd:lpath} at node \cmdinternal {cd:node} is empty and to \cmdinternal - {cd:false} otherwise -\stopxmlcmd - -\startxmlcmd {\cmdbasicsetup{xmldoifelseselfempty}} - expands to \cmdinternal {cd:true} when the node is empty and to \cmdinternal - {cd:false} otherwise -\stopxmlcmd - -\startxmlcmd {\cmdbasicsetup{xmldoifselfempty}} - expands to \cmdinternal {cd:true} when \cmdinternal {cd:node} is empty -\stopxmlcmd - -\startxmlcmd {\cmdbasicsetup{xmldoifnotselfempty}} - expands to \cmdinternal {cd:true} when \cmdinternal {cd:node} is not empty -\stopxmlcmd - -\stopsection - -\startsection[title={initialization}] - -The general setup command (not to be confused with setups) that deals with the -\MKIV\ tree handler is \type {\setupxml}. There are currently only a few options. - -\cmdfullsetup{setupxml} - -When you set \type {default} to \cmdinternal {cd:text} elements with no setup -assigned will end up as text. When set to \type {hidden} such elements will be -hidden. 
You can apply the default yourself using: - -\startxmlcmd {\cmdbasicsetup{xmldefaulttotext}} - presets the tree with root \cmdinternal {cd:node} to the handlers set up with - \type {\setupxml} option \cmdinternal{default} -\stopxmlcmd - -You can set \type {compress} to \type {yes} in which case comment is stripped -from the tree when the file is read. - -\startxmlcmd {\cmdbasicsetup{xmlregisterns}} - associates an internal namespace (like \type {mml}) with one given in the - document as \URL\ (like mathml) -\stopxmlcmd - -\startxmlcmd {\cmdbasicsetup{xmlremapname}} - changes the namespace and tag of the matching elements -\stopxmlcmd - -\startxmlcmd {\cmdbasicsetup{xmlremapnamespace}} - replaces all references to the given namespace to a new one (applied - recursively) -\stopxmlcmd - -\startxmlcmd {\cmdbasicsetup{xmlchecknamespace}} - sets the namespace of the matching elements unless a namespace is already set -\stopxmlcmd - -\stopsection - -\startsection[title={helpers}] - -Often an attribute will determine the rendering and this may result in many -tests. Especially when we have multiple attributes that control the output such -tests can become rather extensive and redundant because one gets $n\times m$ or -more such tests. - -Therefore we have a convenient way to map attributes onto for instance strings or -commands. - -\startxmlcmd {\cmdbasicsetup{xmlmapvalue}} - associate a \cmdinternal {cd:text} with a \cmdinternal {cd:category} and - \cmdinternal {cd:name} (alias: \type{\xmlmapval}) -\stopxmlcmd - -\startxmlcmd {\cmdbasicsetup{xmlvalue}} - expand the value associated with a \cmdinternal {cd:category} and - \cmdinternal {cd:name} and if not resolved, expand to the \cmdinternal - {cd:text} (alias: \type{\xmlval}) -\stopxmlcmd - -\startxmlcmd {\cmdbasicsetup{xmldoifelsevalue}} - associate a \cmdinternal {cd:text} with a \cmdinternal {cd:category} and - \cmdinternal {cd:name} -\stopxmlcmd - -This is used as follows. 
We define a couple of mappings in the same category: - -\starttyping -\xmlmapvalue{emph}{bold} {\bf} -\xmlmapvalue{emph}{italic}{\it} -\stoptyping - -Assuming that we have associated the following setup with the \type {emph} -element, we can say (with \type {#1} being the current element): - -\starttyping -\startxmlsetups demo:emph - \begingroup - \xmlvalue{emph}{\xmlatt{#1}{type}}{} - \endgroup -\stopxmlsetups -\stoptyping - -In this case we have no default. The \type {type} attribute triggers the actions, -as in: - -\starttyping -normal <emph type='bold'>bold</emph> normal -\stoptyping - -This mechanism is not really bound to elements and attributes so you can use this -mechanism for other purposes as well. - -\stopsection - -\startsection[title={Parameters}] - -\startbuffer[test] -<something whatever="alpha"> - <what> - beta - </what> -</something> -\stopbuffer - -\startbuffer -\startxmlsetups xml:mysetups - \xmlsetsetup{\xmldocument}{*}{xml:*} -\stopxmlsetups - -\xmlregistersetup{xml:mysetups} - -\startxmlsetups xml:something - parameter : \xmlpar {#1}{whatever}\par - attribute : \xmlatt {#1}{whatever}\par - text : \xmlfirst {#1}{what} \par - \xmlsetpar{#1}{whatever}{gamma} - parameter : \xmlpar {#1}{whatever}\par - \xmlflush{#1} -\stopxmlsetups - -\startxmlsetups xml:what - what: \xmlflush{#1}\par - parameter : \xmlparam{#1}{..}{whatever}\par -\stopxmlsetups - -\xmlprocessbuffer{main}{test}{} -\stopbuffer - -Say that we have this \XML\ blob: - -\typebuffer[test] - -With: - -\typebuffer - -we get: - -\getbuffer - -Parameters are stored with a node. 
- -\startxmlcmd {\cmdbasicsetup{xmlpar}} - returns the value of parameter \cmdinternal {cd:name} or empty if no such - parameter exists -\stopxmlcmd - -\startxmlcmd {\cmdbasicsetup{xmlparam}} - finds a first match for \cmdinternal {cd:lpath} at \cmdinternal {cd:node} and - returns the value of parameter \cmdinternal {cd:name} or empty if no such - parameter exists -\stopxmlcmd - -\startxmlcmd {\cmdbasicsetup{xmllastpar}} - returns the last parameter found (this avoids a lookup) -\stopxmlcmd - -\startxmlcmd {\cmdbasicsetup{xmlsetpar}} - sets the value of parameter \cmdinternal {cd:name} -\stopxmlcmd - -\startxmlcmd {\cmdbasicsetup{xmlsetparam}} - sets the value of parameter \cmdinternal {cd:name} for each match of \cmdinternal - {cd:lpath} -\stopxmlcmd - -\stopsection - -\stopchapter - -\startchapter[title={Expressions and filters}] - -\startsection[title={path expressions}] - -In the previous chapters we used \cmdinternal {cd:lpath} expressions, which are a variant -on \type {xpath} expressions as in \XSLT\ but in this case more geared towards -usage in \TEX. This mechanism will be extended when the need arises. - -A path is a sequence of matches. A simple path expression is: - -\starttyping -a/b/c/d -\stoptyping - -Here each \type {/} goes one level deeper. We can go backwards in a lookup with -\type {..}: - -\starttyping -a/b/../d -\stoptyping - -We can also combine lookups, as in: - -\starttyping -a/(b|c)/d -\stoptyping - -A negated lookup is preceded by a \type {!}: - -\starttyping -a/(b|c)/!d -\stoptyping - -A wildcard is specified with a \type {*}: - -\starttyping -a/(b|c)/!d/e/*/f -\stoptyping - -In addition to these tag based lookups we can use attributes: - -\starttyping -a/(b|c)/!d/e/*/f[@type=whatever] -\stoptyping - -An \type {@} as first character means that we are dealing with an attribute. 
-Within the square brackets there can be boolean expressions: - -\starttyping -a/(b|c)/!d/e/*/f[@type=whatever and @id>100] -\stoptyping - -You can use functions as in: - -\starttyping -a/(b|c)/!d/e/*/f[something(text()) == "oeps"] -\stoptyping - -There are a couple of predefined functions: - -\starttabulate[|l|l|p|] -\NC \type{rootposition} \type{order} \NC number \NC the index of the matched root element (kind of special) \NC \NR -\NC \type{position} \NC number \NC the current index of the matched element in the match list \NC \NR -\NC \type{match} \NC number \NC the current index of the matched element sub list with the same parent \NC \NR -\NC \type{first} \NC number \NC \NC \NR -\NC \type{last} \NC number \NC \NC \NR -\NC \type{index} \NC number \NC the current index of the matched element in its parent list \NC \NR -\NC \type{firstindex} \NC number \NC \NC \NR -\NC \type{lastindex} \NC number \NC \NC \NR -\NC \type{element} \NC number \NC the element's index \NC \NR -\NC \type{firstelement} \NC number \NC \NC \NR -\NC \type{lastelement} \NC number \NC \NC \NR -\NC \type{text} \NC string \NC the textual representation of the matched element \NC \NR -\NC \type{content} \NC table \NC the node of the matched element \NC \NR -\NC \type{name} \NC string \NC the full name of the matched element: namespace and tag \NC \NR -\NC \type{namespace} \type{ns} \NC string \NC the namespace of the matched element \NC \NR -\NC \type{tag} \NC string \NC the tag of the matched element \NC \NR -\NC \type{attribute} \NC string \NC the value of the attribute with the given name of the matched element \NC \NR -\stoptabulate - -There are fundamental differences between \type {position}, \type {match} and -\type {index}. Each step results in a new list of matches. The \type {position} -is the index in this new (possibly intermediate) list. The \type {match} is also -an index in this list but related to the specific match of element names. 
The -\type {index} refers to the location in the parent element. - -Say that we have: - -\starttyping -<collection> - <resources> - <manual> - <screen>.1.</screen> - <paper>.1.</paper> - </manual> - <manual> - <paper>.2.</paper> - <screen>.2.</screen> - </manual> - </resources> - <resources> - <manual> - <screen>.3.</screen> - <paper>.3.</paper> - </manual> - </resources> -</collection> -\stoptyping - -The following then applies: - -\starttabulate[|l|l|] -\NC \type {collection/resources/manual[position()==1]/paper} \NC \type{.1.} \NC \NR -\NC \type {collection/resources/manual[match()==1]/paper} \NC \type{.1.} \type{.3.} \NC \NR -\NC \type {collection/resources/manual/paper[index()==1]} \NC \type{.2.} \NC \NR -\stoptabulate - -In most cases the \type {position} test is more restrictive than the \type -{match} test. - -You can pass your own functions too. Such functions are defined in the \type -{xml.expressions} namespace. We have defined a few shortcuts: - -\starttabulate[|l|l|] -\NC \type {find(str,pattern)} \NC \type{string.find} \NC \NR -\NC \type {contains(str)} \NC \type{string.find} \NC \NR -\NC \type {oneof(str,...)} \NC is \type{str} in list \NC \NR -\NC \type {upper(str)} \NC \type{characters.upper} \NC \NR -\NC \type {lower(str)} \NC \type{characters.lower} \NC \NR -\NC \type {number(str)} \NC \type{tonumber} \NC \NR -\NC \type {boolean(str)} \NC \type{toboolean} \NC \NR -\NC \type {idstring(str)} \NC removes leading hash \NC \NR -\NC \type {name(index)} \NC full tag name \NC \NR -\NC \type {tag(index)} \NC tag name \NC \NR -\NC \type {namespace(index)} \NC namespace of tag \NC \NR -\NC \type {text(index)} \NC content \NC \NR -\NC \type {error(str)} \NC quit and show error \NC \NR -\NC \type {quit()} \NC quit \NC \NR -\NC \type {print()} \NC print message \NC \NR -\NC \type {count(pattern)} \NC number of matches \NC \NR -\NC \type {child(pattern)} \NC take child that matches \NC \NR -\stoptabulate - - -You can also use normal \LUA\ functions as long as you 
make sure that you pass -the right arguments. There are a few predefined variables available inside such -functions. - -\starttabulate[|Tl|l|p|] -\NC \type{list} \NC table \NC the list of matches \NC \NR -\NC \type{l} \NC number \NC the current index in the list of matches \NC \NR -\NC \type{ll} \NC element \NC the current element that matched \NC \NR -\NC \type{order} \NC number \NC the position of the root of the path \NC \NR -\stoptabulate - -The given expression between \type {[]} is converted to a \LUA\ expression so you -can use the usual operators: - -\starttyping -== ~= <= >= < > not and or () -\stoptyping - -In addition, \type {=} equals \type {==} and \type {!=} is the same as \type -{~=}. If you mess up the expression, you quite likely get a \LUA\ error message. - -\stopsection - -\startsection[title={css selectors}] - -\startbuffer[selector-001] -<?xml version="1.0" ?> - -<a> - <b class="one">b.one</b> - <b class="two">b.two</b> - <b class="one two">b.one.two</b> - <b class="three">b.three</b> - <b id="first">b#first</b> - <c>c</c> - <d>d e</d> - <e>d e</e> - <e>d e e</e> - <d>d f</d> - <f foo="bar">@foo = bar</f> - <f bar="foo">@bar = foo</f> - <f bar="foo1">@bar = foo1</f> - <f bar="foo2">@bar = foo2</f> - <f bar="foo3">@bar = foo3</f> - <f bar="foo+4">@bar = foo+4</f> - <g>g</g> - <g><gg><d>g gg d</d></gg></g> - <g><gg><f>g gg f</f></gg></g> - <g><gg><f class="one">g gg f.one</f></gg></g> - <g>g</g> - <g><gg><f class="two">g gg f.two</f></gg></g> - <g><gg><f class="three">g gg f.three</f></gg></g> - <g><f class="one">g f.one</f></g> - <g><f class="three">g f.three</f></g> - <h whatever="four five six">@whatever = four five six</h> -</a> -\stopbuffer - -\xmlloadbuffer{selector-001}{selector-001} - -\startxmlsetups xml:selector:demo - \advance\scratchcounter\plusone - \inleftmargin{\the\scratchcounter}\ignorespaces\xmlverbatim{#1}\par -\stopxmlsetups - -\unexpanded\def\showCSSdemo#1#2% - {\blank - \textrule{\tttf#2} - \startlines - \dontcomplain - 
\tttf \obeyspaces - \scratchcounter\zerocount - \xmlcommand{#1}{#2}{xml:selector:demo} - \stoplines - \blank} - -The \CSS\ approach to filtering is a bit different from the path based one and is -supported too. In fact, you can combine both methods. Depending on what you -select, the \CSS\ one can be a little bit faster too. It has the advantage that -one can select more in one go but at the same time looks a bit less attractive. -This method was added just to show that it can be done but might be useful too. A -selector is given between curly braces (after all \CSS\ uses them and they have no -function yet in the parser). - -\starttyping -\xmlall{#1}{{foo bar .whatever, bar foo .whatever}} -\stoptyping - -The following methods are supported: - -\starttabulate[|T||] -\NC element \NC all tags element \NC \NR -\NC element-1 > element-2 \NC all tags element-2 with parent tag element-1 \NC \NR -\NC element-1 + element-2 \NC all tags element-2 preceded by tag element-1 \NC \NR -\NC element-1 ~ element-2 \NC all tags element-2 preceded by tag element-1 \NC \NR -\NC element-1 element-2 \NC all tags element-2 inside tag element-1 \NC \NR -\NC [attribute] \NC has attribute \NC \NR -\NC [attribute=value] \NC attribute equals value\NC \NR -\NC [attribute\lettertilde =value] \NC attribute contains value (space is separator) \NC \NR -\NC [attribute\letterhat ="value"] \NC attribute starts with value \NC \NR -\NC [attribute\letterdollar="value"] \NC attribute ends with value \NC \NR -\NC [attribute*="value"] \NC attribute contains value \NC \NR -\NC .class \NC has class \NC \NR -\NC \letterhash id \NC has id \NC \NR -\NC :nth-child(n) \NC the child at index n \NC \NR -\NC :nth-last-child(n) \NC the child at index n from the end \NC \NR -\NC :first-child \NC the first child \NC \NR -\NC :last-child \NC the last child \NC \NR -\NC :nth-of-type(n) \NC the match at index n \NC \NR -\NC :nth-last-of-type(n) \NC the match at index n from the end \NC \NR -\NC :first-of-type \NC the 
first match \NC \NR -\NC :last-of-type \NC the last match \NC \NR -\NC :only-of-type \NC the only match or nothing \NC \NR -\NC :only-child \NC the only child or nothing \NC \NR -\NC :empty \NC only when empty \NC \NR -\NC :root \NC the whole tree \NC \NR -\stoptabulate - -The next pages show some examples. For that we use the demo file: - -\typebuffer[selector-001] - -The class and id selectors often only make sense in \HTML\ like documents but they -are supported nevertheless. They are after all just shortcuts for filtering by -attribute. The class filtering is special in the sense that it checks for a class -in a list of classes given in an attribute. - -\showCSSdemo{selector-001}{{.one}} -\showCSSdemo{selector-001}{{.one, .two}} -\showCSSdemo{selector-001}{{.one, .two, \letterhash first}} - -Attributes can be filtered by presence, value, partial value and such. Quotes are -optional but we advise using them. - -\showCSSdemo{selector-001}{{[foo], [bar=foo]}} -\showCSSdemo{selector-001}{{[bar\lettertilde=foo]}} -\showCSSdemo{selector-001}{{[bar\letterhat="foo"]}} -\showCSSdemo{selector-001}{{[whatever\lettertilde="five"]}} - -You can of course combine the methods as in: - -\showCSSdemo{selector-001}{{g f .one, g f .three}} -\showCSSdemo{selector-001}{{g > f .one, g > f .three}} -\showCSSdemo{selector-001}{{d + e}} -\showCSSdemo{selector-001}{{d ~ e}} -\showCSSdemo{selector-001}{{d ~ e, g f .one, g f .three}} - -You can also negate the result by using \type {:not} on a simple expression: - -\showCSSdemo{selector-001}{{:not([whatever\lettertilde="five"])}} -\showCSSdemo{selector-001}{{:not(d)}} - -The child and match selectors are also supported: - -\showCSSdemo{selector-001}{{a:nth-child(3)}} -\showCSSdemo{selector-001}{{a:nth-last-child(3)}} -\showCSSdemo{selector-001}{{g:nth-of-type(3)}} -\showCSSdemo{selector-001}{{g:nth-last-of-type(3)}} -\showCSSdemo{selector-001}{{a:first-child}} -\showCSSdemo{selector-001}{{a:last-child}} 
-\showCSSdemo{selector-001}{{e:first-of-type}} -\showCSSdemo{selector-001}{{gg d:only-of-type}} - -Instead of numbers you can also give the \type {an} and \type {an+b} formulas -as well as the \type {odd} and \type {even} keywords: - -\showCSSdemo{selector-001}{{a:nth-child(even)}} -\showCSSdemo{selector-001}{{a:nth-child(odd)}} -\showCSSdemo{selector-001}{{a:nth-child(3n+1)}} -\showCSSdemo{selector-001}{{a:nth-child(2n+3)}} - -There are a few special cases: - -\showCSSdemo{selector-001}{{g:empty}} -\showCSSdemo{selector-001}{{g:root}} -\showCSSdemo{selector-001}{{*}} - -Combining the \CSS\ methods with the regular ones is possible: - -\showCSSdemo{selector-001}{{g gg f .one}} -\showCSSdemo{selector-001}{g/gg/f[@class='one']} -\showCSSdemo{selector-001}{g/{gg f .one}} - -\startbuffer[selector-002] -<?xml version="1.0" ?> - -<document> - <title class="one" >title 1</title> - <title class="two" >title 2</title> - <title class="one" >title 3</title> - <title class="three">title 4</title> -</document> -\stopbuffer - -In the next examples we use this file: - -\typebuffer[selector-002] - -\xmlloadbuffer{selector-002}{selector-002} - -When we filter from this (not too well structured) tree we can use both -methods to achieve the same: - -\showCSSdemo{selector-002}{{document title .one, document title .three}} - -\showCSSdemo{selector-002}{/document/title[(@class='one') or (@class='three')]} - -However, imagine this file: - -\startbuffer[selector-003] -<?xml version="1.0" ?> - -<document> - <title class="one">title 1</title> - <subtitle class="sub">title 1.1</subtitle> - <title class="two">title 2</title> - <subtitle class="sub">title 2.1</subtitle> - <title class="one">title 3</title> - <subtitle class="sub">title 3.1</subtitle> - <title class="two">title 4</title> - <subtitle class="sub">title 4.1</subtitle> -</document> -\stopbuffer - -\typebuffer[selector-003] - -\xmlloadbuffer{selector-003}{selector-003} - -The next filter is easier with the \CSS\ selector methods 
because these accumulate -independent (simple) expressions: - -\showCSSdemo{selector-003}{{document title .one + subtitle, document title .two + subtitle}} - -Watch how we get an output in the document order. Because we render a sequential document -a combined filter will trigger a sorting pass. - -\stopsection - -\startsection[title={functions as filters}] - -At the \LUA\ end a whole \cmdinternal {cd:lpath} expression results in a (set of) node(s) -with its environment, but that is hardly usable in \TEX. Think of code like: - -\starttyping -for e in xml.collected(xml.load('text.xml'),"title") do - -- e = the element that matched -end -\stoptyping - -The older variant is still supported but you can best use the previous variant. - -\starttyping -for r, d, k in xml.elements(xml.load('text.xml'),"title") do - -- r = root of the title element - -- d = data table - -- k = index in data table -end -\stoptyping - -Here \type {d[k]} points to the \type {title} element and in this case all titles -in the tree pass by. In practice this kind of code is encapsulated in function -calls, like those returning elements one by one, or returning the first or last -match. The result is then fed back into \TEX, possibly after being altered by an -associated setup. We've seen the wrappers to such functions already in a previous -chapter. - -In addition to the previously discussed expressions, one can add so called -filters to the expression, for instance: - -\starttyping -a/(b|c)/!d/e/text() -\stoptyping - -In a filter, the last part of the \cmdinternal {cd:lpath} expression is a -function call. The previous example returns the text of each element \type {e} -that results from matching the expression. When running \TEX\ the following -functions are available. Some are also available when using pure \LUA. In \TEX\ -you can often use one of the macros like \type {\xmlfirst} instead of a \type -{\xmlfilter} with finalizer \type {first()}. 
The filter can be somewhat faster -but that is hardly noticeable. - -\starttabulate[|l|l|p|] -\NC \type {context()} \NC string \NC the serialized text with \TEX\ catcode regime \NC \NR -%NC \type {ctxtext()} \NC string \NC \NC \NR -\NC \type {function()} \NC string \NC depends on the function \NC \NR -% -\NC \type {name()} \NC string \NC the (remapped) namespace \NC \NR -\NC \type {tag()} \NC string \NC the name of the element \NC \NR -\NC \type {tags()} \NC list \NC the names of the element \NC \NR -% -\NC \type {text()} \NC string \NC the serialized text \NC \NR -\NC \type {upper()} \NC string \NC the serialized text uppercased \NC \NR -\NC \type {lower()} \NC string \NC the serialized text lowercased \NC \NR -\NC \type {stripped()} \NC string \NC the serialized text stripped \NC \NR -\NC \type {lettered()} \NC string \NC the serialized text only letters (cf. \UNICODE) \NC \NR -% -\NC \type {count()} \NC number \NC the number of matches \NC \NR -\NC \type {index()} \NC number \NC the matched index in the current path \NC \NR -\NC \type {match()} \NC number \NC the matched index in the preceding path \NC \NR -% -%NC \type {lowerall()} \NC string \NC \NC \NR -%NC \type {upperall()} \NC string \NC \NC \NR -% -\NC \type {attribute(name)} \NC content \NC returns the attribute with the given name \NC \NR -\NC \type {chainattribute(name)} \NC content \NC idem, but backtracks till one is found \NC \NR -\NC \type {command(name)} \NC content \NC expands the setup with the given name for each found element \NC \NR -\NC \type {position(n)} \NC content \NC processes the \type {n}\high{th} instance of the found element \NC \NR -\NC \type {all()} \NC content \NC processes all instances of the found element \NC \NR -%NC \type {default} \NC content \NC all \NC \NR -\NC \type {reverse()} \NC content \NC idem in reverse order \NC \NR -\NC \type {first()} \NC content \NC processes the first instance of the found element \NC \NR -\NC \type {last()} \NC content \NC processes the last 
instance of the found element \NC \NR -\NC \type {concat(...)} \NC content \NC concatenates the match \NC \NR -\NC \type {concatrange(from,to,...)} \NC content \NC concatenates a range of matches \NC \NR -\stoptabulate - -The extra arguments of the concatenators are: \type {separator} (string), \type -{lastseparator} (string) and \type {textonly} (a boolean). - -These filters are in fact \LUA\ functions which means that if needed more of them -can be added. Indeed this happens in some of the \XML\ related \MKIV\ modules, -for instance in the \MATHML\ processor. - -\stopsection - -\startsection[title={example}] - -The number of commands is rather large and if you want to avoid them this is -often possible. Take for instance: - -\starttyping -\xmlall{#1}{/a/b[position()>3]} -\stoptyping - -Alternatively you can use: - -\starttyping -\xmlfilter{#1}{/a/b[position()>3]/all()} -\stoptyping - -and actually this is also faster as internally it avoids a function call. Of -course in practice this is hardly measurable. - -In previous examples we've already seen quite some expressions, and it might be -good to point out that the syntax is modelled after \XSLT\ but is not quite the -same. The reason is that we started with a rather minimal system and already have -styles in use that depend on compatibility. - -\starttyping -namespace:// axis node(set) [expr 1]..[expr n] / ... / filter -\stoptyping - -When we are inside a \CONTEXT\ run, the namespace is \type {tex}. However, if -you do not want to print back to \TEX\ you need to be more explicit. 
Say that we -typeset exams and have a (not that logical) structure like: - -\starttyping -<question> - <text>...</text> - <answer> - <item>one</item> - <item>two</item> - <item>three</item> - </answer> - <alternative> - <condition>true</condition> - <score>1</score> - </alternative> - <alternative> - <condition>false</condition> - <score>0</score> - </alternative> - <alternative> - <condition>true</condition> - <score>2</score> - </alternative> -</question> -\stoptyping - -Say that we typeset the questions with: - -\starttyping -\startxmlsetups question - \blank - score: \xmlfunction{#1}{totalscore} - \blank - \xmlfirst{#1}{text} - \startitemize - \xmlfilter{#1}{/answer/item/command(answer:item)} - \stopitemize - \endgraf - \blank -\stopxmlsetups -\stoptyping - -Each item in the answer results in a call to: - -\starttyping -\startxmlsetups answer:item - \startitem - \xmlflush{#1} - \endgraf - \xmlfilter{#1}{../../alternative[position()=rootposition()]/ - condition/command(answer:condition)} - \stopitem -\stopxmlsetups -\stoptyping - -\starttyping -\startxmlsetups answer:condition - \endgraf - condition: \xmlflush{#1} - \endgraf -\stopxmlsetups -\stoptyping - -Now, there are two rather special filters here. The first one involves -calculating the total score. As we look forward we use a function to deal with -this. - -\starttyping -\startluacode -function xml.functions.totalscore(root) - local score = 0 - for e in xml.collected(root,"/alternative") do - score = score + (xml.filter(e,"xml:///score/number()") or 0) - end - tex.write(score) -end -\stopluacode -\stoptyping - -Watch how we use the namespace to keep the results at the \LUA\ end. - -The second special trick shown here is to limit a match using the current -position of the root (\type {#}) match. - -As you can see, a path expression can be more than just filtering a few nodes. At -the end of this manual you will find a bunch of examples. 
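
The equivalence between a dedicated macro and a filter with a finalizer can be
sketched for the question setup above. This is an illustration only: it assumes,
as stated earlier for \type {\xmlfirst}, that the \type {first()} finalizer
gives the same result.

\starttyping
% both lines are meant to flush the first text element of the question
\xmlfirst {#1}{text}
\xmlfilter{#1}{text/first()}
\stoptyping

Which variant to use is mostly a matter of taste; the macro reads a bit easier,
while the filter keeps the whole selection in one expression.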
- -\stopsection - -\startsection[title={tables}] - -If you want to know how the internal \XML\ tables look you can print such a -table: - -\starttyping -print(table.serialize(e)) -\stoptyping - -This produces for instance: - -% s = xml.convert("<document><demo label='whatever'>some text</demo></document>") -% print(table.serialize(xml.filter(s,"demo")[1])) - -\starttyping -t={ - ["at"]={ - ["label"]="whatever", - }, - ["dt"]={ "some text" }, - ["ns"]="", - ["rn"]="", - ["tg"]="demo", -} -\stoptyping - -The \type {rn} entry is the renamed namespace (when renaming is applied). If you -see tags like \type {@pi@} this means that we don't have an element, but (in this -case) a processing instruction. - -\starttabulate[|l|p|] -\NC \type {@rt@} \NC the root element \NC \NR -\NC \type {@dd@} \NC document definition \NC \NR -\NC \type {@cm@} \NC comment, like \type {<!-- whatever -->} \NC \NR -\NC \type {@cd@} \NC so called \type {CDATA} \NC \NR -\NC \type {@pi@} \NC processing instruction, like \type {<?whatever we want ?>} \NC \NR -\stoptabulate - -There are many ways to deal with the content, but in the perspective of \TEX\ -only a few matter. - -\starttabulate[|l|p|] -\NC \type {xml.sprint(e)} \NC print the content to \TEX\ and apply setups if needed \NC \NR -\NC \type {xml.tprint(e)} \NC print the content to \TEX\ (serialize elements verbose) \NC \NR -\NC \type {xml.cprint(e)} \NC print the content to \TEX\ (used for special content) \NC \NR -\stoptabulate - -Keep in mind that anything low level that you uncover is not part of the official -interface unless mentioned in this manual. - -\stopsection - -\stopchapter - -\startchapter[title={Tips and tricks}] - -\startsection[title={tracing}] - -It can be hard to debug code as much happens kind of behind the scenes. -Therefore we have a couple of tracing options. 
Of course you can typeset some -status information, using for instance: - -\startxmlcmd {\cmdbasicsetup{xmlshow}} - typeset the tree given by \cmdinternal {cd:node} -\stopxmlcmd - -\startxmlcmd {\cmdbasicsetup{xmlinfo}} - typeset the name in the element given by \cmdinternal {cd:node} -\stopxmlcmd - -\startxmlcmd {\cmdbasicsetup{xmlpath}} - returns the complete path (including namespace prefix and index) of the - given \cmdinternal {cd:node} -\stopxmlcmd - -\startbuffer[demo] -<?xml version="1.0"?> -<document> - <section> - <content> - <p>first</p> - <p><b>second</b></p> - </content> - </section> - <section> - <content> - <p><b>third</b></p> - <p>fourth</p> - </content> - </section> -</document> -\stopbuffer - -Say that we have the following \XML: - -\typebuffer[demo] - -and the next definitions: - -\startbuffer -\startxmlsetups xml:demo:base - \xmlsetsetup{#1}{p|b}{xml:demo:*} -\stopxmlsetups - -\startxmlsetups xml:demo:p - \xmlflush{#1} - \par -\stopxmlsetups - -\startxmlsetups xml:demo:b - \par - \xmlpath{#1} : \xmlflush{#1} - \par -\stopxmlsetups - -\xmlregisterdocumentsetup{example-10}{xml:demo:base} - -\xmlprocessbuffer{example-10}{demo}{} -\stopbuffer - -\typebuffer - -This will give us: - -\blank \startpacked \getbuffer \stoppacked \blank - -If you use \type {\xmlshow} you will get a complete subtree which can -be handy for tracing but can also lead to large documents. 
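
As a minimal sketch (not part of the original example), the setup for \type {b}
could typeset the subtree instead of the flushed content. The setup name \type
{xml:demo:b} is reused from the definitions above; the layout is an assumption:

\starttyping
\startxmlsetups xml:demo:b
    \par
    \xmlpath{#1} :
    % show the subtree of this element instead of flushing its content
    \xmlshow{#1}
    \par
\stopxmlsetups
\stoptyping

Because every matched \type {b} element then prints its whole subtree, this is
best limited to a few elements while debugging.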
- -We also have a bunch of trackers that can be enabled, like: - -\starttyping -\enabletrackers[xml.show,xml.parse] -\stoptyping - -The full list (currently) is: - -\starttabulate[|lT|p|] -\NC xml.entities \NC show what entities are seen and replaced \NC \NR -\NC xml.path \NC show the result of parsing an lpath expression \NC \NR -\NC xml.parse \NC show stepwise resolving of expressions \NC \NR -\NC xml.profile \NC report all parsed lpath expressions (in the log) \NC \NR -\NC xml.remap \NC show what namespaces are remapped \NC \NR -\NC lxml.access \NC report errors with respect to resolving (symbolic) nodes \NC \NR -\NC lxml.comments \NC show the comments that are encountered (if at all) \NC \NR -\NC lxml.loading \NC show what files are loaded and converted \NC \NR -\NC lxml.setups \NC show what setups are being associated to elements \NC \NR -\stoptabulate - -In one of our workflows we produce books from \XML\ where the (educational) -content is organized in many small files. Each book has about 5~chapters and each -chapter is made of sections that contain text, exercises, resources, etc.\ and so -the document is assembled from thousands of files (don't worry, runtime inclusion -is pretty fast). In order to see where in the sources content resides we can -trace the filename. - -\startxmlcmd {\cmdbasicsetup{xmlinclusion}} - returns the file where the node comes from -\stopxmlcmd - -\startxmlcmd {\cmdbasicsetup{xmlinclusions}} - returns the list of files where the node comes from -\stopxmlcmd - -\startxmlcmd {\cmdbasicsetup{xmlbadinclusions}} - returns a list of files that were not included due to some problem -\stopxmlcmd - -Of course you have to make sure that these names end up somewhere visible, for -instance in the margin. - -\stopsection - -\startsection[title={expansion}] - -For novice users the concept of expansion might sound frightening and to some -extent it is. However, it is important enough to spend some words on it here. 
- -It is good to realize that most setups are sort of immediate. When one setup is -issued, it can call another one and so on. Normally you won't notice that but -there are cases where that can be a problem. In \TEX\ you can define a macro, -take for instance: - -\starttyping -\startxmlsetups xml:foo - \def\foobar{\xmlfirst{#1}{/bar}} -\stopxmlsetups -\stoptyping - -Here you store the reference to node \type {bar} in \type {\foobar}, maybe for later use. In -this case the content is not yet fetched; that happens when \type {\foobar} is -called. - -\starttyping -\startxmlsetups xml:foo - \edef\foobar{\xmlfirst{#1}{/bar}} -\stopxmlsetups -\stoptyping - -Here the content of \type {bar} becomes the body of the macro. But what if -\type {bar} itself contains elements that also contain elements? When there -is a setup for \type {bar} it will be triggered and so on. - -Say that this setup looks like: - -\starttyping -\startxmlsetups xml:bar - \def\barfoo{\xmlflush{#1}} -\stopxmlsetups -\stoptyping - -Here we get something like: - -\starttyping -\foobar => {\def\barfoo{...}} -\stoptyping - -When \type {\barfoo} is not defined we get an error and when it is known and expands -to something weird we might also get an error. - -Especially when you don't know what content can show up, this can result in errors -when an expansion fails, for example because some macro being used is not defined. -To prevent this we can define a macro: - -\starttyping -\starttexdefinition unexpanded xml:bar:macro #1 - \def\barfoo{\xmlflush{#1}} -\stoptexdefinition - -\startxmlsetups xml:bar - \texdefinition{xml:bar:macro}{#1} -\stopxmlsetups -\stoptyping - -The setup \type {xml:bar} will still expand but the replacement text now is just the -call to the macro, think of: - -\starttyping -\foobar => {\texdefinition{xml:bar:macro}{#1}} -\stoptyping - -But this is often not needed: most \CONTEXT\ commands can handle the expansions -quite well, but it's good to know that there is a way out. 
So, now to some -examples. Imagine that we have an \XML\ file that looks as follows: - -\starttyping -<?xml version='1.0' ?> -<demo> - <chapter> - <title>Some <em>short</em> title</title> - <content> - zeta - <index> - <key>zeta</key> - <content>zeta again</content> - </index> - alpha - <index> - <key>alpha</key> - <content>alpha <em>again</em></content> - </index> - gamma - <index> - <key>gamma</key> - <content>gamma</content> - </index> - beta - <index> - <key>beta</key> - <content>beta</content> - </index> - delta - <index> - <key>delta</key> - <content>delta</content> - </index> - done! - </content> - </chapter> -</demo> -\stoptyping - -There are a few structure related elements here: a chapter (with its list entry) -and some index entries. Both are multipass related and therefore travel around. -This means that when we let data end up in the auxiliary file, we need to make -sure that we end up with either expanded data (i.e.\ no references to the \XML\ -tree) or with robust forward and backward references to elements in the tree. - -Here we discuss three approaches (and more may show up later): pushing \XML\ into -the auxiliary file and using references to elements either or not with an -associated setup. We control the variants with a switch. - -\starttyping -\newcount\TestMode - -\TestMode=0 % expansion=xml -\TestMode=1 % expansion=yes, index, setup -\TestMode=2 % expansion=yes -\stoptyping - -We apply a couple of setups: - -\starttyping -\startxmlsetups xml:mysetups - \xmlsetsetup{\xmldocument}{demo|index|content|chapter|title|em}{xml:*} -\stopxmlsetups - -\xmlregistersetup{xml:mysetups} -\stoptyping - -The main document is processed with: - -\starttyping -\startxmlsetups xml:demo - \xmlflush{#1} - \subject{contents} - \placelist[chapter][criterium=all] - \subject{index} - \placeregister[index][criterium=all] - \page % else buffer is forgotten when placing header -\stopxmlsetups -\stoptyping - -First we show three alternative ways to deal with the chapter. 
The first case -expands the \XML\ reference so that we have an \XML\ stream in the auxiliary -file. This stream is processed as a small independent subfile when needed. The -second case registers a reference to the current element (\type {#1}). This means -that we have access to all data of this element, like attributes, title and -content. What happens depends on the given setup. The third variant does the same -but here the setup is part of the reference. - -\starttyping -\startxmlsetups xml:chapter - \ifcase \TestMode - % xml code travels around - \setuphead[chapter][expansion=xml] - \startchapter[title=eh: \xmltext{#1}{title}] - \xmlfirst{#1}{content} - \stopchapter - \or - % index is used for access via setup - \setuphead[chapter][expansion=yes,xmlsetup=xml:title:flush] - \startchapter[title=\xmlgetindex{#1}] - \xmlfirst{#1}{content} - \stopchapter - \or - % tex call to xml using index is used - \setuphead[chapter][expansion=yes] - \startchapter[title=hm: \xmlreference{#1}{xml:title:flush}] - \xmlfirst{#1}{content} - \stopchapter - \fi -\stopxmlsetups - -\startxmlsetups xml:title:flush - \xmltext{#1}{title} -\stopxmlsetups -\stoptyping - -We need to deal with emphasis and the content of the chapter. - -\starttyping -\startxmlsetups xml:em - \begingroup\em\xmlflush{#1}\endgroup -\stopxmlsetups - -\startxmlsetups xml:content - \xmlflush{#1} -\stopxmlsetups -\stoptyping - -A similar approach is followed with the index entries. Watch how we use the -numbered entries variant (in this case we could also have used just \type -{entries} and \type {keys}). 
- -\starttyping -\startxmlsetups xml:index - \ifcase \TestMode - \setupregister[index][expansion=xml,xmlsetup=] - \setstructurepageregister - [index] - [entries:1=\xmlfirst{#1}{content}, - keys:1=\xmltext{#1}{key}] - \or - \setupregister[index][expansion=yes,xmlsetup=xml:index:flush] - \setstructurepageregister - [index] - [entries:1=\xmlgetindex{#1}, - keys:1=\xmltext{#1}{key}] - \or - \setupregister[index][expansion=yes,xmlsetup=] - \setstructurepageregister - [index] - [entries:1=\xmlreference{#1}{xml:index:flush}, - keys:1=\xmltext{#1}{key}] - \fi -\stopxmlsetups - -\startxmlsetups xml:index:flush - \xmlfirst{#1}{content} -\stopxmlsetups -\stoptyping - -Instead of this flush, you can use the predefined setup \type {xml:flush} -unless it is overloaded by you. - -The file is processed by: - -\starttyping -\starttext - \xmlprocessfile{main}{test.xml}{} -\stoptext -\stoptyping - -We don't show the result here. If you're curious what the output is, you can test -it yourself. In that case it also makes sense to peek into the \type {test.tuc} -file to see how the information travels around. The \type {metadata} fields carry -information about how to process the data. - -The first case, the \XML\ expansion one, is somewhat special in the sense that -internally we use small pseudo files. You can control the rendering by tweaking -the following setups: - -\starttyping -\startxmlsetups xml:ctx:sectionentry - \xmlflush{#1} -\stopxmlsetups - -\startxmlsetups xml:ctx:registerentry - \xmlflush{#1} -\stopxmlsetups -\stoptyping - -{\em When these methods work out okay the other structural elements will be -dealt with in a similar way.} - -\stopsection - -\startsection[title={special cases}] - -Normally the content will be flushed under a special (so called) catcode regime. -This means that characters that have a special meaning in \TEX\ will have no such -meaning in an \XML\ file. 
If you want content to be treated as \TEX\ code, you can -use one of the following: - -\startxmlcmd {\cmdbasicsetup{xmlflushcontext}} - flush the given \cmdinternal {cd:node} using the \TEX\ character - interpretation scheme -\stopxmlcmd - -\startxmlcmd {\cmdbasicsetup{xmlcontext}} - flush the match of \cmdinternal {cd:lpath} for the given \cmdinternal - {cd:node} using the \TEX\ character interpretation scheme -\stopxmlcmd - -We use this in cases like: - -\starttyping -.... - \xmlsetsetup {#1} { - tm|texformula| - } {xml:*} -.... - -\startxmlsetups xml:tm - \mathematics{\xmlflushcontext{#1}} -\stopxmlsetups - -\startxmlsetups xml:texformula - \placeformula\startformula\xmlflushcontext{#1}\stopformula -\stopxmlsetups -\stoptyping - -\stopsection - -\startsection[title={collecting}] - -Say that your document has - -\starttyping -<table> - <tr> - <td>foo</td> - <td>bar</td> - </tr> -</table> -\stoptyping - -And that you need to convert that to \TEX\ speak like: - -\starttyping -\bTABLE - \bTR - \bTD foo \eTD - \bTD bar \eTD - \eTR -\eTABLE -\stoptyping - -A simple mapping is: - -\starttyping -\startxmlsetups xml:table - \bTABLE \xmlflush{#1} \eTABLE -\stopxmlsetups -\startxmlsetups xml:tr - \bTR \xmlflush{#1} \eTR -\stopxmlsetups -\startxmlsetups xml:td - \bTD \xmlflush{#1} \eTD -\stopxmlsetups -\stoptyping - -The \type {\bTD} command is a so called delimited command which means that it -picks up its argument by looking for an \type {\eTD}. For the simple case here -this works quite well because the flush is inside the pair. This is not the case -in the following variant: - -\starttyping -\startxmlsetups xml:td:start - \bTD -\stopxmlsetups -\startxmlsetups xml:td:stop - \eTD -\stopxmlsetups -\startxmlsetups xml:td - \xmlsetup{#1}{xml:td:start} - \xmlflush{#1} - \xmlsetup{#1}{xml:td:stop} -\stopxmlsetups -\stoptyping - -When for some reason \TEX\ gets confused you can revert to a mechanism that -collects content. 
- -\starttyping -\startxmlsetups xml:td:start - \startcollect - \bTD - \stopcollect -\stopxmlsetups -\startxmlsetups xml:td:stop - \startcollect - \eTD - \stopcollect -\stopxmlsetups -\startxmlsetups xml:td - \startcollecting - \xmlsetup{#1}{xml:td:start} - \xmlflush{#1} - \xmlsetup{#1}{xml:td:stop} - \stopcollecting -\stopxmlsetups -\stoptyping - -You can even implement solutions that effectively do this: - -\starttyping -\startcollecting - \startcollect \bTABLE \stopcollect - \startcollect \bTR \stopcollect - \startcollect \bTD \stopcollect - \startcollect foo\stopcollect - \startcollect \eTD \stopcollect - \startcollect \bTD \stopcollect - \startcollect bar\stopcollect - \startcollect \eTD \stopcollect - \startcollect \eTR \stopcollect - \startcollect \eTABLE \stopcollect -\stopcollecting -\stoptyping - -Of course you only need to go that complex when the situation demands it. Here is -another weird one: - -\starttyping -\startcollecting - \startcollect \setupsomething[\stopcollect - \startcollect foo=\stopcollect - \startcollect FOO,\stopcollect - \startcollect bar=\stopcollect - \startcollect BAR,\stopcollect - \startcollect ]\stopcollect -\stopcollecting -\stoptyping - -\stopsection - -\startsection[title={selectors and injectors}] - -This section describes a somewhat special feature, one that we needed for a project -where we could not touch the original content but could add specific sections for -our own purpose. Hopefully the example demonstrates its usability. 
- -\enabletrackers[lxml.selectors] - -\startbuffer[foo] -<?xml version="1.0" encoding="UTF-8"?> - -<?context-directive message info 1: this is a demo file ?> -<?context-message-directive info 2: this is a demo file ?> - -<one> - <two> - <?context-select begin t1 t2 t3 ?> - <three> - t1 t2 t3 - <?context-directive injector crlf t1 ?> - t1 t2 t3 - </three> - <?context-select end ?> - <?context-select begin t4 ?> - <four> - t4 - </four> - <?context-select end ?> - <?context-select begin t8 ?> - <four> - t8.0 - t8.0 - </four> - <?context-select end ?> - <?context-include begin t4 ?> - <!-- - <three> - t4.t3 - <?context-directive injector crlf t1 ?> - t4.t3 - </three> - --> - <three> - t3 - <?context-directive injector crlf t1 ?> - t3 - </three> - <?context-include end ?> - <?context-select begin t8 ?> - <four> - t8.1 - t8.1 - </four> - <?context-select end ?> - <?context-select begin t8 ?> - <four> - t8.2 - t8.2 - </four> - <?context-select end ?> - <?context-select begin t4 ?> - <four> - t4 - t4 - </four> - <?context-select end ?> - <?context-directive injector page t7 t8 ?> - foo - <?context-directive injector blank t1 ?> - bar - <?context-directive injector page t7 t8 ?> - bar - </two> -</one> -\stopbuffer - -\typebuffer[foo] - -First we show how to plug in a directive. Processing instructions like the -following are normally ignored by an \XML\ processor, unless they make sense -to it. - -\starttyping -<?context-directive message info 1: this is a demo file ?> -<?context-message-directive info 2: this is a demo file ?> -\stoptyping - -We can define a message handler as follows: - -\startbuffer -\def\MyMessage#1#2#3{\writestatus{#1}{#2 #3}} - -\xmlinstalldirective{message}{MyMessage} -\stopbuffer - -\typebuffer \getbuffer - -When this file is processed you will see this on the console: - -\starttyping -info > 1: this is a demo file -info > 2: this is a demo file -\stoptyping - -The file has some sections that can be used or ignored. 
The recipe for
obeying \type {t1} and \type {t4} is the following:

\startbuffer
\xmlsetinjectors[t1]
\xmlsetinjectors[t4]

\startxmlsetups xml:initialize
    \xmlapplyselectors{#1}
    \xmlsetsetup {#1} {
        one|two|three|four
    } {xml:*}
\stopxmlsetups

\xmlregistersetup{xml:initialize}

\startxmlsetups xml:one
    [ONE \xmlflush{#1} ONE]
\stopxmlsetups

\startxmlsetups xml:two
    [TWO \xmlflush{#1} TWO]
\stopxmlsetups

\startxmlsetups xml:three
    [THREE \xmlflush{#1} THREE]
\stopxmlsetups

\startxmlsetups xml:four
    [FOUR \xmlflush{#1} FOUR]
\stopxmlsetups
\stopbuffer

\typebuffer \getbuffer

This typesets:

\startnarrower
\xmlprocessbuffer{main}{foo}{}
\stopnarrower

The include coding is kind of special: it permits adding content (in a comment)
and ignoring the rest so that we indeed can add something without interfering
with the original. Of course in a normal workflow such messy solutions are
not needed, but alas, often workflows are not that clean, especially when one
has no real control over the source.
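
Reduced to its essence, and assuming that the include mechanism behaves as
described above, the construct can be sketched as follows. The names \type
{mini}, \type {note} and \type {v2} are made up for this illustration and the
outcome has not been verified here.

\starttyping
\startbuffer[mini]
<doc>
    <?context-include begin v2 ?>
    <!--
        <note>text added for our own purpose</note>
    -->
    <note>original text that we could not touch</note>
    <?context-include end ?>
</doc>
\stopbuffer

\xmlsetinjectors[v2] % v2 is a hypothetical version tag

\startxmlsetups xml:mini:initialize
    \xmlapplyselectors{#1} % look for the marked sections first
    \xmlsetsetup{#1}{doc|note}{xml:mini:*}
\stopxmlsetups

\xmlregisterdocumentsetup{mini}{xml:mini:initialize}

\startxmlsetups xml:mini:doc
    \xmlflush{#1}
\stopxmlsetups

\startxmlsetups xml:mini:note
    \xmlflush{#1}\par
\stopxmlsetups

\xmlprocessbuffer{mini}{mini}{}
\stoptyping

With \type {v2} enabled one would expect the \type {note} in the comment to be
used instead of the original one.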

\startxmlcmd {\cmdbasicsetup{xmlsetinjectors}}
    enables a list of injectors that will be used
\stopxmlcmd

\startxmlcmd {\cmdbasicsetup{xmlresetinjectors}}
    resets the list of injectors
\stopxmlcmd

\startxmlcmd {\cmdbasicsetup{xmlinjector}}
    expands an injection (command); normally this one is only used
    (in some setup) or for testing
\stopxmlcmd

\startxmlcmd {\cmdbasicsetup{xmlapplyselectors}}
    analyze the tree \cmdinternal {cd:node} for marked sections that
    will be injected
\stopxmlcmd

We have some injections predefined:

\starttyping
\startsetups xml:directive:injector:page
    \page
\stopsetups

\startsetups xml:directive:injector:column
    \column
\stopsetups

\startsetups xml:directive:injector:blank
    \blank
\stopsetups
\stoptyping

In the example we see:

\starttyping
<?context-directive injector page t7 t8 ?>
\stoptyping

When we set \type {\xmlsetinjectors[t7]} a pagebreak will be injected in that spot.
Tags like \type {t7}, \type {t8} etc.\ can represent versions.
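
Following the naming pattern of the predefined setups, a project specific
injector can presumably be added in the same way. The next sketch assumes that
a directive like \type {injector crlf t1}, as used in the demo file, looks for
a setup named \type {xml:directive:injector:crlf}; whether such a setup is
already predefined is not stated here.

\starttyping
% assumption: the name after "injector" selects the setup
\startsetups xml:directive:injector:crlf
    \crlf
\stopsetups

\xmlsetinjectors[t1] % obey injectors tagged t1

% ... process the document here ...

\xmlresetinjectors   % forget the enabled tags again
\stoptyping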

\stopsection

\startsection[title=preprocessing]

% local match = lpeg.match
% local replacer = lpeg.replacer("BAD TITLE:","<bold>BAD TITLE:</bold>")
%
% function lxml.preprocessor(data,settings)
%     return match(replacer,data)
% end

\startbuffer[pre-code]
\startluacode
    function lxml.preprocessor(data,settings)
        return string.find(data,"BAD TITLE:")
           and string.gsub(data,"BAD TITLE:","<bold>BAD TITLE:</bold>")
            or data
    end
\stopluacode
\stopbuffer

\startbuffer[pre-xml]
\startxmlsetups pre:demo:initialize
    \xmlsetsetup{#1}{*}{pre:demo:*}
\stopxmlsetups

\xmlregisterdocumentsetup{pre:demo}{pre:demo:initialize}

\startxmlsetups pre:demo:root
    \xmlflush{#1}
\stopxmlsetups

\startxmlsetups pre:demo:bold
    \begingroup\bf\xmlflush{#1}\endgroup
\stopxmlsetups

\starttext
    \xmlprocessbuffer{pre:demo}{demo}{}
\stoptext
\stopbuffer

Say that you have the following \XML\ setup:

\typebuffer[pre-xml]

and that (such things happen) the input looks like this:

\startbuffer[demo]
<root>
BAD TITLE: crap crap crap ...

BAD TITLE: crap crap crap ...
</root>
\stopbuffer

\typebuffer[demo]

You can then clean up these \type {BAD TITLE}'s as follows:

\typebuffer[pre-code]

and get as result:

\start \getbuffer[pre-code,pre-xml] \stop

The preprocessor function gets as second argument the current settings, and
the field \type {currentresource} can be used to limit the actions to
specific resources, in our case it's \type {buffer: demo}. Afterwards you can
reset the preprocessor with:

\startluacode
lxml.preprocessor = nil
\stopluacode

Future versions might give some more control over preprocessors. For now consider
it to be a quick hack.

\stopsection

\stopchapter

\startchapter[title={Lookups using lpaths}]

\startsection[title={introduction}]

There is not that much system in the following examples. They resulted from tests
with different documents.
The current implementation evolved out of the
experimental code. For instance, I decided to add the handling of multiple
expressions in a row after a few email exchanges with Jean|-|Michel Huffen.

One of the main differences between the way \XSLT\ resolves a path and our way is
the anchor. Take:

\starttyping
/something
something
\stoptyping

The first one anchors in the current (!) element so it will only consider direct
children. The second one does a deep lookup and looks at the descendants as well.
Furthermore we have a few extra shortcuts like \type {**} in \type {a/**/b} which
represents all descendants.

The expressions (between square brackets) have to be valid \LUA\ and some
preprocessing is done to resolve the built-in functions. So, you might use code
like:

\starttyping
my_lpeg_expression:match(text()) == "whatever"
\stoptyping

given that \type {my_lpeg_expression} is known. In the examples below we use the
visualizer to show the steps. Some are shown more than once as part of a set.
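
To make the difference in anchoring concrete, here is a small sketch. The
setup name and the element names \type {a} and \type {b} are hypothetical and
the comments merely restate the rules given above.

\starttyping
\startxmlsetups xml:demo:anchors
    % anchored: only direct children of #1 that are named b
    \xmlfirst{#1}{/b}
    % deep lookup: descendants of #1 that are named b are considered too
    \xmlfirst{#1}{b}
    % every b that is a descendant of an a
    \xmlall{#1}{a/**/b}
    % an expression between square brackets, using a built-in function
    \xmlall{#1}{b[position()=1]}
\stopxmlsetups
\stoptyping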

\stopsection

\startsection[title={special cases}]

\xmllshow{}
\xmllshow{*}
\xmllshow{.}
\xmllshow{/}

\stopsection

\startsection[title={wildcards}]

\xmllshow{*}
\xmllshow{*:*}
\xmllshow{/*}
\xmllshow{/*:*}
\xmllshow{*/*}
\xmllshow{*:*/*:*}

\xmllshow{a/*}
\xmllshow{a/*:*}
\xmllshow{/a/*}
\xmllshow{/a/*:*}

\xmllshow{/*}
\xmllshow{/**}
\xmllshow{/***}

\stopsection

\startsection[title={multiple steps}]

\xmllshow{answer}
\xmllshow{answer/test/*}
\xmllshow{answer/test/child::}
\xmllshow{answer/*}
\xmllshow{answer/*[tag()='p' and position()=1 and text()!='']}

\stopsection

\startsection[title={pitfalls}]

\xmllshow{[oneof(lower(@encoding),'tex','context','ctx')]}
\xmllshow{.[oneof(lower(@encoding),'tex','context','ctx')]}

\stopsection

\startsection[title={more special cases}]

\xmllshow{**}
\xmllshow{*}
\xmllshow{..}
\xmllshow{.}
\xmllshow{//}
\xmllshow{/}

\xmllshow{**/}
\xmllshow{**/*}
\xmllshow{**/.}
\xmllshow{**//}

\xmllshow{*/}
\xmllshow{*/*}
\xmllshow{*/.}
\xmllshow{*//}

\xmllshow{/**/}
\xmllshow{/**/*}
\xmllshow{/**/.}
\xmllshow{/**//}

\xmllshow{/*/}
\xmllshow{/*/*}
\xmllshow{/*/.}
\xmllshow{/*//}

\xmllshow{./}
\xmllshow{./*}
\xmllshow{./.}
\xmllshow{.//}

\xmllshow{../}
\xmllshow{../*}
\xmllshow{../.}
\xmllshow{..//}

\stopsection

\startsection[title={more wildcards}]

\xmllshow{one//two}
\xmllshow{one/*/two}
\xmllshow{one/**/two}
\xmllshow{one/***/two}
\xmllshow{one/x//two}
\xmllshow{one//x/two}
\xmllshow{//x/two}

\stopsection

\startsection[title={special axis}]

\xmllshow{descendant::whocares/ancestor::whoknows}
\xmllshow{descendant::whocares/ancestor::whoknows/parent::}
\xmllshow{descendant::whocares/ancestor::}
\xmllshow{child::something/child::whatever/child::whocares}
\xmllshow{child::something/child::whatever/child::whocares|whoknows}
\xmllshow{child::something/child::whatever/child::(whocares|whoknows)}
\xmllshow{child::something/child::whatever/child::!(whocares|whoknows)}
\xmllshow{child::something/child::whatever/child::(whocares)}
\xmllshow{child::something/child::whatever/child::(whocares)[position()>2]}
\xmllshow{child::something/child::whatever[position()>2][position()=1]}
\xmllshow{child::something/child::whatever[whocares][whocaresnot]}
\xmllshow{child::something/child::whatever[whocares][not(whocaresnot)]}
\xmllshow{child::something/child::whatever/self::whatever}

There is also \type {last-match::} that starts with the last found set of nodes.
This can save some run time when you do lots of tests combined with the same check
afterwards. There is however one pitfall: you never know what is done with that
last match in the setup that gets called nested. Take the following example:

\starttyping
\startbuffer[test]
<something>
    <crap> <crapa> <crapb> <crapc> <crapd>
        <crape>
            done 1
        </crape>
    </crapd> </crapc> </crapb> </crapa> </crap>
    <crap> <crapa> <crapb> <crapc> <crapd>
        <crape>
            done 2
        </crape>
    </crapd> </crapc> </crapb> </crapa> </crap>
    <crap> <crapa> <crapb> <crapc> <crapd>
        <crape>
            done 3
        </crape>
    </crapd> </crapc> </crapb> </crapa> </crap>
</something>
\stopbuffer
\stoptyping

One way to filter the content is this:

\starttyping
\xmldoif {#1} {/crap/crapa/crapb/crapc/crapd/crape} {
    some action
}
\stoptyping

It is not unlikely that you will do something like this:

\starttyping
\xmldoif {#1} {/crap/crapa/crapb/crapc/crapd/crape} {
    \xmlfirst{#1}{/crap/crapa/crapb/crapc/crapd/crape}
}
\stoptyping

This means that the path is resolved twice but that can be avoided as
follows:

\starttyping
\xmldoif{#1}{/crap/crapa/crapb/crapc/crapd/crape}{
    \xmlfirst{#1}{last-match::}
}
\stoptyping

But the next is not guaranteed to work:

\starttyping
\xmldoif{#1}{/crap/crapa/crapb/crapc/crapd/crape}{
    \xmlfirst{#1}{last-match::}
    \xmllast{#1}{last-match::}
}
\stoptyping

Because the first one can have done some
lookup the last match can be replaced
and the second call will give unexpected results. You can overcome this with:

\starttyping
\xmldoif{#1}{/crap/crapa/crapb/crapc/crapd/crape}{
    \xmlpushmatch
    \xmlfirst{#1}{last-match::}
    \xmlpopmatch
}
\stoptyping

Does it pay off? Here are some timings of a 10.000 times test and lookup
like the previous one (on a decent January 2016 laptop):

\starttabulate[|r|l|]
\NC 0.239 \NC \type {\xmldoif {...} {...}} \NC \NR
\NC 0.292 \NC \type {\xmlfirst {...} {...}} \NC \NR
\NC 0.538 \NC \type {\xmldoif {...} {...} + \xmlfirst {...} {...}} \NC \NR
\NC 0.338 \NC \type {\xmldoif {...} {...} + \xmlfirst {...} {last-match::}} \NC \NR
\NC 0.349 \NC \type {+ \xmldoif {...} {...} + \xmlfirst {...} {last-match::}-} \NC \NR
\stoptabulate

So, pushing and popping (the last row) is a bit slower than not doing that but it
is still much faster than not using \type {last-match::} at all. As a shortcut
you can use \type {=}, as in:

\starttyping
\xmlfirst{#1}{=}
\stoptyping

You can even do this:

\starttyping
\xmlall{#1}{last-match::/text()}
\stoptyping

or

\starttyping
\xmlall{#1}{=/text()}
\stoptyping


\stopsection

\startsection[title={some more examples}]

\xmllshow{/something/whatever}
\xmllshow{something/whatever}
\xmllshow{/**/whocares}
\xmllshow{whoknows/whocares}
\xmllshow{whoknows}
\xmllshow{whocares[contains(text(),'f') or contains(text(),'g')]}
\xmllshow{whocares/first()}
\xmllshow{whocares/last()}
\xmllshow{whatever/all()}
\xmllshow{whocares/position(2)}
\xmllshow{whocares/position(-2)}
\xmllshow{whocares[1]}
\xmllshow{whocares[-1]}
\xmllshow{whocares[2]}
\xmllshow{whocares[-2]}
\xmllshow{whatever[3]/attribute(id)}
\xmllshow{whatever[2]/attribute('id')}
\xmllshow{whatever[3]/text()}
\xmllshow{/whocares/first()}
\xmllshow{/whocares/last()}

\xmllshow{xml://whatever/all()}
\xmllshow{whatever/all()}
\xmllshow{//whocares}
\xmllshow{..[2]}
\xmllshow{../*[2]}

\xmllshow{/(whocares|whocaresnot)}
\xmllshow{/!(whocares|whocaresnot)}
\xmllshow{/!whocares}

\xmllshow{/interface/command/command(xml:setups:register)}
\xmllshow{/interface/command[@name='xxx']/command(xml:setups:typeset)}
\xmllshow{/arguments/*}
\xmllshow{/sequence/first()}
\xmllshow{/arguments/text()}
\xmllshow{/sequence/variable/first()}
\xmllshow{/interface/define[@name='xxx']/first()}
\xmllshow{/parameter/command(xml:setups:parameter:measure)}

\xmllshow{/(*:library|figurelibrary)/*:figure/*:label}
\xmllshow{/(*:library|figurelibrary)/figure/*:label}
\xmllshow{/(*:library|figurelibrary)/figure/label}
\xmllshow{/(*:library|figurelibrary)/figure:*/label}

\xmlshow {whatever//br[tag(1)='br']}

\stopsection

\stopchapter

\startchapter[title=Examples]

\startsection[title=attribute chains]

In \CSS, when an attribute is not present, the parent element is checked, and when
not found again, the lookup follows the chain till a match is found or the root is
reached. The following example demonstrates how such a chain lookup works.

\startbuffer[test]
<something mine="1" test="one" more="alpha">
    <whatever mine="2" test="two">
        <whocares mine="3">
            <!-- this is a test -->
        </whocares>
    </whatever>
</something>
\stopbuffer

\typebuffer[test]

We apply the following setups to this tree:

\startbuffer[setups]
\startxmlsetups xml:common
    [
        \xmlchainatt{#1}{mine},
        \xmlchainatt{#1}{test},
        \xmlchainatt{#1}{more},
        \xmlchainatt{#1}{none}
    ]\par
\stopxmlsetups

\startxmlsetups xml:something
    something: \xmlsetup{#1}{xml:common}
    \xmlflush{#1}
\stopxmlsetups

\startxmlsetups xml:whatever
    whatever: \xmlsetup{#1}{xml:common}
    \xmlflush{#1}
\stopxmlsetups

\startxmlsetups xml:whocares
    whocares: \xmlsetup{#1}{xml:common}
    \xmlflush{#1}
\stopxmlsetups

\startxmlsetups xml:mysetups
    \xmlsetsetup{#1}{something|whatever|whocares}{xml:*}
\stopxmlsetups

\xmlregisterdocumentsetup{example-1}{xml:mysetups}

\xmlprocessbuffer{example-1}{test}{}
\stopbuffer

\typebuffer[setups]

This gives:

\start
    \getbuffer[setups]
\stop

\stopsection

\startsection[title=conditional setups]

Say that we have this code:

\starttyping
\xmldoifelse {#1} {/what[@a='1']} {
    \xmlfilter {#1} {/what/command('xml:yes')}
} {
    \xmlfilter {#1} {/what/command('xml:nop')}
}
\stoptyping

Here we first determine if there is a child \type {what} with attribute \type {a}
set to \type {1}. Depending on the outcome again we check the child nodes for
being named \type {what}. A faster solution which also takes less code is this:

\starttyping
\xmlfilter {#1} {/what[@a='1']/command('xml:yes','xml:nop')}
\stoptyping

\stopsection

\startsection[title=manipulating]

Assume that we have the following \XML\ data:

\startbuffer[test]
<A>
    <B>right</B>
    <B>wrong</B>
</A>
\stopbuffer

\typebuffer[test]

But, instead of \type {right} we want to see \type {okay}.
We can do that with a
finalizer:

\startbuffer
\startluacode
local rehash = {
    ["right"] = "okay",
}

function xml.finalizers.tex.Okayed(collected,what)
    for i=1,#collected do
        -- fetch the text first so that both branches can use it
        local str = xml.text(collected[i])
        if what == "all" then
            context(rehash[str] or str)
        else
            context(str)
        end
    end
end
\stopluacode
\stopbuffer

\typebuffer \getbuffer

\startbuffer
\startxmlsetups xml:A
    \xmlflush{#1}
\stopxmlsetups

\startxmlsetups xml:B
    (It's \xmlfilter{#1}{./Okayed("all")})
\stopxmlsetups

\startxmlsetups xml:testsetups
    \xmlsetsetup{#1}{A|B}{xml:*}
\stopxmlsetups

\xmlregisterdocumentsetup{example-2}{xml:testsetups}
\xmlprocessbuffer{example-2}{test}{}
\stopbuffer

\typebuffer

The result is: \start \inlinebuffer \stop

\stopsection

\startsection[title=cross referencing]

A rather common way to add cross references to \XML\ files is to borrow the
asymmetrical id's from \HTML. This means that one cannot simply use a value
of (say) \type {href} to locate an \type {id}. The next example came up on
the \CONTEXT\ mailing list.

\startbuffer[test]
<doc>
    <p>Text
        <a href="#fn1" class="footnoteref" id="fnref1"><sup>1</sup></a> and
        <a href="#fn2" class="footnoteref" id="fnref2"><sup>2</sup></a>
    </p>
    <div class="footnotes">
        <hr />
        <ol>
            <li id="fn1"><p>A footnote.<a href="#fnref1">↩</a></p></li>
            <li id="fn2"><p>A second footnote.<a href="#fnref2">↩</a></p></li>
        </ol>
    </div>
</doc>
\stopbuffer

\typebuffer[test]

We give two variants for dealing with such references. The first solution does
lookups and depending on the size of the file can be somewhat inefficient.

\startbuffer
\startxmlsetups xml:doc
    \blank
    \xmlflush{#1}
    \blank
\stopxmlsetups

\startxmlsetups xml:p
    \xmlflush{#1}
\stopxmlsetups

\startxmlsetups xml:footnote
    (variant 1)\footnote
        {\xmlfirst
            {example-3-1}
            {div[@class='footnotes']/ol/li[@id='\xmlrefatt{#1}{href}']}}
\stopxmlsetups

\startxmlsetups xml:initialize
    \xmlsetsetup{#1}{p|doc}{xml:*}
    \xmlsetsetup{#1}{a[@class='footnoteref']}{xml:footnote}
    \xmlsetsetup{#1}{div[@class='footnotes']}{xml:nothing}
\stopxmlsetups

\xmlresetdocumentsetups{*}
\xmlregisterdocumentsetup{example-3-1}{xml:initialize}

\xmlprocessbuffer{example-3-1}{test}{}
\stopbuffer

\typebuffer

This will typeset two footnotes.

\getbuffer

The second variant collects the references so that the time spent on lookups is
less.

\startbuffer
\startxmlsetups xml:doc
    \blank
    \xmlflush{#1}
    \blank
\stopxmlsetups

\startxmlsetups xml:p
    \xmlflush{#1}
\stopxmlsetups

\startluacode
    userdata.notes = {}
\stopluacode

\startxmlsetups xml:collectnotes
    \ctxlua{userdata.notes['\xmlrefatt{#1}{id}'] = '#1'}
\stopxmlsetups

\startxmlsetups xml:footnote
    (variant 2)\footnote
        {\xmlflush
            {\cldcontext{userdata.notes['\xmlrefatt{#1}{href}']}}}
\stopxmlsetups

\startxmlsetups xml:initialize
    \xmlsetsetup{#1}{p|doc}{xml:*}
    \xmlsetsetup{#1}{a[@class='footnoteref']}{xml:footnote}
    \xmlfilter{#1}{div[@class='footnotes']/ol/li/command(xml:collectnotes)}
    \xmlsetsetup{#1}{div[@class='footnotes']}{}
\stopxmlsetups

\xmlregisterdocumentsetup{example-3-2}{xml:initialize}

\xmlprocessbuffer{example-3-2}{test}{}
\stopbuffer

\typebuffer

This will again typeset two footnotes:

\getbuffer

\stopsection

\startsection[title=mapping values]

One way to process options \type {frame} in the example below is to map the
values to values known by \CONTEXT.

\startbuffer[test]
<a>
    <nattable frame="on">
        <tr><td>#1</td><td>#2</td><td>#3</td><td>#4</td></tr>
        <tr><td>#5</td><td>#6</td><td>#7</td><td>#8</td></tr>
    </nattable>
    <nattable frame="off">
        <tr><td>#1</td><td>#2</td><td>#3</td><td>#4</td></tr>
        <tr><td>#5</td><td>#6</td><td>#7</td><td>#8</td></tr>
    </nattable>
    <nattable frame="no">
        <tr><td>#1</td><td>#2</td><td>#3</td><td>#4</td></tr>
        <tr><td>#5</td><td>#6</td><td>#7</td><td>#8</td></tr>
    </nattable>
</a>
\stopbuffer

\typebuffer[test]

\startbuffer
\startxmlsetups xml:a
    \xmlflush{#1}
\stopxmlsetups

\xmlmapvalue {nattable:frame} {on}  {on}
\xmlmapvalue {nattable:frame} {yes} {on}
\xmlmapvalue {nattable:frame} {off} {off}
\xmlmapvalue {nattable:frame} {no}  {off}

\startxmlsetups xml:nattable
    \startplacetable[title=#1]
        \setupTABLE[frame=\xmlval{nattable:frame}{\xmlatt{#1}{frame}}{on}]%
        \bTABLE
            \xmlflush{#1}
        \eTABLE
    \stopplacetable
\stopxmlsetups

\startxmlsetups xml:tr
    \bTR
        \xmlflush{#1}
    \eTR
\stopxmlsetups

\startxmlsetups xml:td
    \bTD
        \xmlflush{#1}
    \eTD
\stopxmlsetups

\startxmlsetups xml:testsetups
    \xmlsetsetup{example-4}{a|nattable|tr|td|}{xml:*}
\stopxmlsetups

\xmlregisterdocumentsetup{example-4}{xml:testsetups}

\xmlprocessbuffer{example-4}{test}{}
\stopbuffer

The \type {\xmlmapvalue} mechanism is rather efficient and involves a minimum
of testing.

\typebuffer

We get:

\getbuffer

\stopsection

\startsection[title=using \LUA]

In this example we demonstrate how you can delegate rendering to \LUA. We
will construct a so-called extreme table.
The input is:

\startbuffer[demo]
<?xml version="1.0" encoding="utf-8"?>

<a>
    <b> <c>1</c> <d>Text</d>           </b>
    <b> <c>2</c> <d>More text</d>      </b>
    <b> <c>2</c> <d>Even more text</d> </b>
    <b> <c>2</c> <d>And more</d>       </b>
    <b> <c>3</c> <d>And even more</d>  </b>
    <b> <c>2</c> <d>The last text</d>  </b>
</a>
\stopbuffer

\typebuffer[demo]

The processor code is:

\startbuffer[process]
\startxmlsetups xml:test_setups
    \xmlsetsetup{#1}{a|b|c|d}{xml:*}
\stopxmlsetups

\xmlregisterdocumentsetup{example-5}{xml:test_setups}

\xmlprocessbuffer{example-5}{demo}{}
\stopbuffer

\typebuffer[process]

We color a sequence of the same titles (numbers here) differently. The first
solution remembers the last title:

\startbuffer
\startxmlsetups xml:a
    \startembeddedxtable
        \xmlflush{#1}
    \stopembeddedxtable
\stopxmlsetups

\startxmlsetups xml:b
    \xmlfunction{#1}{test_ba}
\stopxmlsetups

\startluacode
local lasttitle = nil

function xml.functions.test_ba(t)
    local title   = xml.text(t, "/c")
    local content = xml.text(t, "/d")
    context.startxrow()
    context.startxcell {
        background      = "color",
        backgroundcolor = lasttitle == title and "colorone" or "colortwo",
        foregroundstyle = "bold",
        foregroundcolor = "white",
    }
    context(title)
    lasttitle = title
    context.stopxcell()
    context.startxcell()
    context(content)
    context.stopxcell()
    context.stopxrow()
end
\stopluacode
\stopbuffer

\typebuffer \getbuffer

The \type {embeddedxtable} environment is needed because the table is picked up
as argument.

\startlinecorrection \getbuffer[process] \stoplinecorrection

The second implementation remembers what titles are already processed so here we
can color the last one too.

\startbuffer
\startxmlsetups xml:a
    \ctxlua{xml.functions.reset_bb()}
    \startembeddedxtable
        \xmlflush{#1}
    \stopembeddedxtable
\stopxmlsetups

\startxmlsetups xml:b
    \xmlfunction{#1}{test_bb}
\stopxmlsetups

\startluacode
local titles

function xml.functions.reset_bb(t)
    titles = { }
end

function xml.functions.test_bb(t)
    local title   = xml.text(t, "/c")
    local content = xml.text(t, "/d")
    context.startxrow()
    context.startxcell {
        background      = "color",
        backgroundcolor = titles[title] and "colorone" or "colortwo",
        foregroundstyle = "bold",
        foregroundcolor = "white",
    }
    context(title)
    titles[title] = true
    context.stopxcell()
    context.startxcell()
    context(content)
    context.stopxcell()
    context.stopxrow()
end
\stopluacode
\stopbuffer

\typebuffer \getbuffer

\startlinecorrection \getbuffer[process] \stoplinecorrection

A solution without any state variable is given below.

\startbuffer
\startxmlsetups xml:a
    \startembeddedxtable
        \xmlflush{#1}
    \stopembeddedxtable
\stopxmlsetups

\startxmlsetups xml:b
    \xmlfunction{#1}{test_bc}
\stopxmlsetups

\startluacode
function xml.functions.test_bc(t)
    local title   = xml.text(t, "/c")
    local content = xml.text(t, "/d")
    context.startxrow()
    local okay = xml.text(t,"./preceding-sibling::/[-1]") == title
    context.startxcell {
        background      = "color",
        backgroundcolor = okay and "colorone" or "colortwo",
        foregroundstyle = "bold",
        foregroundcolor = "white",
    }
    context(title)
    context.stopxcell()
    context.startxcell()
    context(content)
    context.stopxcell()
    context.stopxrow()
end
\stopluacode
\stopbuffer

\typebuffer \getbuffer

\startlinecorrection \getbuffer[process] \stoplinecorrection

Here is a solution that delegates even more to \LUA. The previous variants were
actually not that safe with respect to special characters and didn't handle
nested elements either but the next one does.

\startbuffer[demo]
<?xml version="1.0" encoding="utf-8"?>

<a>
    <b> <c>#1</c> <d>Text</d>                      </b>
    <b> <c>#2</c> <d>More text</d>                 </b>
    <b> <c>#2</c> <d>Even more text</d>            </b>
    <b> <c>#2</c> <d>And more</d>                  </b>
    <b> <c>#3</c> <d>And even more</d>             </b>
    <b> <c>#2</c> <d>Something <i>nested</i> </d>  </b>
</a>
\stopbuffer

\typebuffer[demo]

We also need to map the \type {i} element.

\startbuffer
\startxmlsetups xml:a
    \starttexcode
        \xmlfunction{#1}{test_a}
    \stoptexcode
\stopxmlsetups

\startxmlsetups xml:c
    \xmlflush{#1}
\stopxmlsetups

\startxmlsetups xml:d
    \xmlflush{#1}
\stopxmlsetups

\startxmlsetups xml:i
    {\em\xmlflush{#1}}
\stopxmlsetups

\startluacode
function xml.functions.test_a(t)
    context.startxtable()
    local previous = false
    for b in xml.collected(lxml.getid(t),"/b") do
        context.startxrow()
        local current = xml.text(b,"/c")
        context.startxcell {
            background      = "color",
            backgroundcolor = (previous == current) and "colorone" or "colortwo",
            foregroundstyle = "bold",
            foregroundcolor = "white",
        }
        lxml.first(b,"/c")
        context.stopxcell()
        context.startxcell()
        lxml.first(b,"/d")
        context.stopxcell()
        previous = current
        context.stopxrow()
    end
    context.stopxtable()
end
\stopluacode

\startxmlsetups xml:test_setups
    \xmlsetsetup{#1}{a|b|c|d|i}{xml:*}
\stopxmlsetups

\xmlregisterdocumentsetup{example-5}{xml:test_setups}

\xmlprocessbuffer{example-5}{demo}{}
\stopbuffer

\typebuffer

\startlinecorrection \getbuffer \stoplinecorrection

The question is, do we really need \LUA ? Often we don't, apart maybe from an
occasional special finalizer.
A pure \TEX\ solution is given next:

\startbuffer
\startxmlsetups xml:a
    \glet\MyPreviousTitle\empty
    \glet\MyCurrentTitle \empty
    \startembeddedxtable
        \xmlflush{#1}
    \stopembeddedxtable
\stopxmlsetups

\startxmlsetups xml:b
    \startxrow
        \xmlflush{#1}
    \stopxrow
\stopxmlsetups

\startxmlsetups xml:c
    \xdef\MyCurrentTitle{\xmltext{#1}{.}}
    \doifelse {\MyPreviousTitle} {\MyCurrentTitle} {
        \startxcell
          [background=color,
           backgroundcolor=colorone,
           foregroundstyle=bold,
           foregroundcolor=white]
    } {
        \glet\MyPreviousTitle\MyCurrentTitle
        \startxcell
          [background=color,
           backgroundcolor=colortwo,
           foregroundstyle=bold,
           foregroundcolor=white]
    }
    \xmlflush{#1}
    \stopxcell
\stopxmlsetups

\startxmlsetups xml:d
    \startxcell
        \xmlflush{#1}
    \stopxcell
\stopxmlsetups

\startxmlsetups xml:i
    {\em\xmlflush{#1}}
\stopxmlsetups

\startxmlsetups xml:test_setups
    \xmlsetsetup{#1}{*}{xml:*}
\stopxmlsetups

\xmlregisterdocumentsetup{example-5}{xml:test_setups}

\xmlprocessbuffer{example-5}{demo}{}
\stopbuffer

\typebuffer

\startlinecorrection \getbuffer \stoplinecorrection

You can even save a few lines of code:

\starttyping
\startxmlsetups xml:c
    \xdef\MyCurrentTitle{\xmltext{#1}{.}}
    \startxcell
      [background=color,
       backgroundcolor=color\ifx\MyPreviousTitle\MyCurrentTitle one\else two\fi,
       foregroundstyle=bold,
       foregroundcolor=white]
    \xmlflush{#1}
    \stopxcell
    \glet\MyPreviousTitle\MyCurrentTitle
\stopxmlsetups
\stoptyping

Or if you prefer:

\starttyping
\startxmlsetups xml:c
    \xdef\MyCurrentTitle{\xmltext{#1}{.}}
    \doifelse {\MyPreviousTitle} {\MyCurrentTitle} {
        \xmlsetup{#1}{xml:c:one}
    } {
        \xmlsetup{#1}{xml:c:two}
    }
\stopxmlsetups

\startxmlsetups xml:c:one
    \startxcell
      [background=color,
       backgroundcolor=colorone,
       foregroundstyle=bold,
       foregroundcolor=white]
    \xmlflush{#1}
    \stopxcell
\stopxmlsetups

\startxmlsetups xml:c:two
    \startxcell
      [background=color,
       backgroundcolor=colortwo,
       foregroundstyle=bold,
       foregroundcolor=white]
    \xmlflush{#1}
    \stopxcell
    \global\let\MyPreviousTitle\MyCurrentTitle
\stopxmlsetups
\stoptyping

These examples demonstrate that it doesn't hurt to know a little bit of \TEX\
programming: defining macros and basic comparisons can come in handy. There are
examples in the test suite, you can peek in the source code, you can consult
the wiki or you can just ask on the list.

\stopsection

\startsection[title=last match]

For the next example we use the following \XML\ input:

\startbuffer[demo]
<?xml version="1.0"?>
<document>
    <section id="1">
        <content>
            <p>first</p>
            <p>second</p>
        </content>
    </section>
    <section id="2">
        <content>
            <p>third</p>
            <p>fourth</p>
        </content>
    </section>
</document>
\stopbuffer

\typebuffer[demo]

If you check if some element is present and then act accordingly, you can
end up with doing the same lookup twice. Although it might sound inefficient,
in practice it's often not measurable.

\startbuffer
\startxmlsetups xml:demo:document
    \type{\xmlall{#1}{/section[@id='2']/content/p}}\par
    \xmldoif{#1}{/section[@id='2']/content/p} {
        \xmlall{#1}{/section[@id='2']/content/p}
    }
    \type{\xmllastmatch}\par
    \xmldoif{#1}{/section[@id='2']/content/p} {
        \xmllastmatch
    }
    \type{\xmlall{#1}{last-match::}}\par
    \xmldoif{#1}{/section[@id='2']/content/p} {
        \xmlall{#1}{last-match::}
    }
    \type{\xmlfilter{#1}{last-match::/command(xml:demo:p)}}\par
    \xmldoif{#1}{/section[@id='2']/content/p} {
        \xmlfilter{#1}{last-match::/command(xml:demo:p)}
    }
\stopxmlsetups

\startxmlsetups xml:demo:p
    \quad\xmlflush{#1}\endgraf
\stopxmlsetups

\startxmlsetups xml:demo:base
    \xmlsetsetup{#1}{document|p}{xml:demo:*}
\stopxmlsetups

\xmlregisterdocumentsetup{example-6}{xml:demo:base}

\xmlprocessbuffer{example-6}{demo}{}
\stopbuffer

\typebuffer

In the second check we just flush the last match, so effectively we do an \type
{\xmlall} here. The third and fourth alternatives demonstrate how we can use
\type {last-match} as axis. The gain is 10\% or more on the lookup but of course
typesetting often takes relatively more time than the lookup.

\startpacked
\getbuffer
\stoppacked

\stopsection

\startsection[title=Finalizers]

The \XML\ parser is also available outside \TEX. Here is an example of its usage.
We pipe the result to \TEX\ but you can do with \type {t} whatever you like.

\startbuffer
local x = xml.load("manual-demo-1.xml")
local t = { }

for c in xml.collected(x,"//*") do
    if not c.special and not t[c.tg] then
        t[c.tg] = true
    end
end

context.tocontext(table.sortedkeys(t))
\stopbuffer

\typebuffer

This returns:

\ctxluabuffer

We can wrap this in a finalizer:

\startbuffer
xml.finalizers.taglist = function(collected)
    local t = { }
    for i=1,#collected do
        local c = collected[i]
        if not c.special then
            local tg = c.tg
            if tg and not t[tg] then
                t[tg] = true
            end
        end
    end
    return table.sortedkeys(t)
end
\stopbuffer

\typebuffer

Or in a more extensive one:

\startbuffer
xml.finalizers.taglist = function(collected,parenttoo)
    local t = { }
    for i=1,#collected do
        local c = collected[i]
        if not c.special then
            local tg = c.tg
            if tg and not t[tg] then
                t[tg] = true
            end
            if parenttoo then
                local p = c.__p__
                if p and not p.special then
                    local tg = p.tg .. ":" .. tg
                    if tg and not t[tg] then
                        t[tg] = true
                    end
                end
            end
        end
    end
    return table.sortedkeys(t)
end
\stopbuffer

\typebuffer \ctxluabuffer

Usage is as follows:

\startbuffer
local x = xml.load("manual-demo-1.xml")
local t = xml.applylpath(x,"//*/taglist()")

context.tocontext(t)
\stopbuffer

\typebuffer

And indeed we get:

\ctxluabuffer

But we can also say:

\startbuffer
local x = xml.load("manual-demo-1.xml")
local t = xml.applylpath(x,"//*/taglist(true)")

context.tocontext(t)
\stopbuffer

\typebuffer

Now we get:

\ctxluabuffer

\stopsection

\startsection[title=Pure xml]

One might wonder what a \TEX\ macro package would look like if backslashes,
dollars and percent signs had no special meaning. In fact, it would be
rather useless, as the interpretation of commands is triggered by such characters.
Any formatting or coding system needs such characters. Take \XML: angle brackets and
ampersands are really special.
So, no matter what system we use, we do have to
deal with the (common) case where these characters need to be seen as they are.
Normally escaping is the solution.

The \CONTEXT\ interface for \XML\ suffers from this as well. You really don't
want to know how many tricks are used for dealing with special characters and
entities: there are several ways these travel through the system and it is
possible to adapt and cheat. Especially roundtripped data (via tuc file) puts
some demands on the system because then \XML\ can become \TEX\ and vice versa.
The next example (derived from a mail on the list) demonstrates this:

\starttyping
\startbuffer[demo]
<doc>
    <pre><code>\ConTeXt\ is great</code></pre>

    <pre><code>but you need to know some tricks</code></pre>
</doc>
\stopbuffer

\startxmlsetups xml:initialize
    \xmlsetsetup{#1}{doc|p|code}{xml:*}
    \xmlsetsetup{#1}{pre/code}{xml:pre:code}
\stopxmlsetups

\xmlregistersetup{xml:initialize}

\startxmlsetups xml:doc
    \xmlflush{#1}
\stopxmlsetups

\startxmlsetups xml:pre:code
    no solution
    \comment[symbol=Key, location=inmargin,color=yellow]{\xmlflush{#1}}
    \par
    solution one \begingroup
    \expandUx
    \comment[symbol=Key, location=inmargin,color=yellow]{\xmlflush{#1}}
    \endgroup
    \par
    solution two
    \comment[symbol=Key, location=inmargin,color=yellow]{\xmlpure{#1}}
    \par
    \xmlprettyprint{#1}{tex}
\stopxmlsetups

\xmlprocessbuffer{main}{demo}{}
\stoptyping

The first comment (an interactive feature of \PDF) comes out as:

\starttyping
\Ux {5C}ConTeXt\Ux {5C} is great
\stoptyping

The second and third comment are okay. It's one of the reasons why we have \type
{\xmlpure}.

\stopsection

\stopchapter

+ \component xml-mkiv-converter
+ \component xml-mkiv-filtering
+ \component xml-mkiv-commands
+ \component xml-mkiv-expressions
+ \component xml-mkiv-tricks
+ \component xml-mkiv-lookups
+ \component xml-mkiv-examples
\stopbodymatter
\stoptext