% language=uk
% author : Hans Hagen
% copyright : PRAGMA ADE & ConTeXt Development Team
% license : Creative Commons Attribution ShareAlike 4.0 International
% reference : pragma-ade.nl | contextgarden.net | texlive (related) distributions
% origin : the ConTeXt distribution
%
% comment : Because this manual is distributed with TeX distributions it comes with a rather
% liberal license. We try to adapt these documents to upgrades in the (sub)systems
% that they describe. Using parts of the content otherwise can therefore conflict
% with existing functionality and we cannot be held responsible for that. Many of
% the manuals contain characteristic graphics and personal notes or examples that
% make no sense when used out-of-context.
%
% comment : Some chapters might have been published in TugBoat, the NTG Maps, the ConTeXt
% Group journal or otherwise. Thanks to the editors for corrections. Also thanks
% to users for testing, feedback and corrections.
% to be checked:
%
% \Ux in index
%
% undocumented:
%
% \processXMLbuffer
% \processxmlbuffer
% \processxmlfile
%
% kind of special ... tricky explanation needed:
%
% \xmldirect
\input lxml-ctx.mkiv
\settrue \xmllshowtitle
\setfalse\xmllshowwarning
\usemodule[set-11]
\loadsetups[i-context]
% \definehspace[squad][1em plus .25em minus .25em]
\usemodule[abr-02]
\setuplayout
[location=middle,
marking=on,
backspace=20mm,
cutspace=20mm,
topspace=15mm,
header=15mm,
footer=15mm,
height=middle,
width=middle]
\setuppagenumbering
[alternative=doublesided,
location=]
\setupfootertexts
[][pagenumber]
\setupheadertexts
[][chapter]
\setupheader
[color=colortwo,
style=bold]
\setupfooter
[color=colortwo,
style=bold]
\setuphead
[chapter]
[page={yes,header,right},
header=empty,
style=\bfc]
\setupsectionblock
[page={yes,header,right}]
\starttexdefinition unexpanded section:chapter:number #1
\doifmode{*sectionnumber} {
\bf
\llap{<\enspace}#1\enspace>
}
\stoptexdefinition
\starttexdefinition unexpanded section:section:number #1
\doifmode{*sectionnumber} {
\bf
\llap{<<\enspace}#1\enspace>>
}
\stoptexdefinition
\starttexdefinition unexpanded section:subsection:number #1
\doifmode{*sectionnumber} {
\bf
\llap{<<<\enspace}#1\enspace>>>
}
\stoptexdefinition
\setuphead[chapter] [numbercolor=black,numbercommand=\texdefinition{section:chapter:number}]
\setuphead[section] [numbercolor=black,numbercommand=\texdefinition{section:section:number}]
\setuphead[subsection][numbercolor=black,numbercommand=\texdefinition{section:subsection:number}]
\setuphead
[section]
[style=\bfa]
\setuplist
[chapter]
[style=bold]
\setupinteractionscreen
[option=doublesided]
\setupalign
[tolerant,stretch]
\setupwhitespace
[big]
\setuptolerance
[tolerant]
\doifelsemode {atpragma} {
\setupbodyfont[lucidaot,10pt]
} {
\setupbodyfont[dejavu,10pt]
}
\definecolor[colorone] [b=.5]
\definecolor[colortwo] [s=.3]
\definecolor[colorthree][y=.5]
\setuptype
[color=colorone]
\setuptyping
[color=colorone]
\setuphead
[lshowtitle]
[style=\tt,
color=colorone]
\setuphead
[chapter,section]
[numbercolor=colortwo,
color=colorone]
\definedescription
[xmlcmd]
[alternative=hanging,
width=line,
distance=1em,
margin=2em,
headstyle=monobold,
headcolor=colorone]
\setupframedtext
[setuptext]
[framecolor=colorone,
rulethickness=1pt,
corner=round]
\usemodule[punk]
\usetypescript[punk]
\definelayer
[page]
[width=\paperwidth,
height=\paperheight]
\starttext
\setuplayout[page]
\startstandardmakeup
\startfontclass[none] % nil the current fontclass since it may append its features
\EnableRandomPunk
\setlayerframed
[page]
[width=\paperwidth,height=\paperheight,
background=color,backgroundcolor=colorone,backgroundoffset=1ex,frame=off]
{}
\definedfont[demo@punk at 18pt]
\setbox\scratchbox\vbox {
\hsize\dimexpr\paperwidth+2ex\relax
\setupinterlinespace
\baselineskip 1\baselineskip plus 1pt minus 1pt
\raggedcenter
\color[colortwo]{\dorecurse{1000}{XML }}
}
\setlayer
[page]
[preset=middle]
{\vsplit\scratchbox to \dimexpr\paperheight+2ex\relax}
\definedfont[demo@punk at 90pt]
\setstrut
\setlayerframed
[page]
[preset=rightbottom,offset=10mm]
[foregroundcolor=colorthree,align=flushright,offset=overlay,frame=off]
{Dealing\\with XML in\\Con\TeX t MkIV}
\definedfont[demo@punk at 18pt]
\setstrut
\setlayerframed
[page]
[preset=righttop,offset=10mm,x=3mm,rotation=90]
[foregroundcolor=colorthree,align=flushright,offset=overlay,frame=off]
{Hans Hagen, Pragma ADE, \currentdate}
\tightlayer[page]
\stopfontclass
\stopstandardmakeup
\setuplayout
\startfrontmatter
\starttitle[title=Contents]
\placelist
[chapter,section]
\stoptitle
\startchapter[title={Introduction}]
This manual presents the \MKIV\ way of dealing with \XML. Although the
traditional \MKII\ streaming parser has a charming simplicity in its control, for
complex documents the tree based \MKIV\ method is more convenient. It is for this
reason that the old method has been removed from \MKIV. If you are familiar with
\XML\ processing in \MKII, then you will have noticed that the \MKII\ commands
have \type {XML} in their name. The \MKIV\ commands have a lowercase \type {xml}
in their names. That way there is no danger of confusion or a mixup.
You may wonder why we do these manipulations in \TEX\ and not use \XSLT\ (or
other transformation methods) instead. The advantage of an integrated approach is
that it simplifies usage. Think of not only processing the document, but also
using \XML\ for managing resources in the same run. An \XSLT\ approach is just as
verbose (after all, you still need to produce \TEX\ code) and probably less
readable. In the case of \MKIV\ the integrated approach is also faster and gives
us the option to manipulate content at runtime using \LUA. It has the additional
advantage that to some extent we can handle a mix of \TEX\ and \XML\ because we
know when we're doing one or the other.
This manual is dedicated to Taco Hoekwater, one of the first \CONTEXT\ users, and
also the first to use it for processing \XML. Who could have thought at that time
that we would have a more convenient way of dealing with those angle brackets.
The second version of this manual is dedicated to Thomas Schmitz, a power user
who occasionally became a victim of the evolving mechanisms.
\blank
\startlines
Hans Hagen
\PRAGMA
Hasselt NL
2008\endash2016
\stoplines
\stopchapter
\stopfrontmatter
\startbodymatter
\startchapter[title={Setting up a converter}]
\startsection[title={from structure to setup}]
We use a very simple document structure for demonstrating how a converter is
defined. In practice a mapping will be more complex, especially when we have a
style with complex chapter openings using data coming from all kinds of places,
different styling of sections with the same name, selectively (out of order)
flushed content, special formatting, etc.
\typefile{manual-demo-1.xml}
Say that this document is stored in the file \type {demo.xml}, then the following
code can be used as starting point:
\starttyping
\startxmlsetups xml:demo:base
\xmlsetsetup{#1}{document|section|p}{xml:demo:*}
\stopxmlsetups
\xmlregisterdocumentsetup{demo}{xml:demo:base}
\startxmlsetups xml:demo:document
\starttitle[title={Contents}]
\placelist[chapter]
\stoptitle
\xmlflush{#1}
\stopxmlsetups
\startxmlsetups xml:demo:section
\startchapter[title=\xmlfirst{#1}{/title}]
\xmlfirst{#1}{/content}
\stopchapter
\stopxmlsetups
\startxmlsetups xml:demo:p
\xmlflush{#1}\endgraf
\stopxmlsetups
\xmlprocessfile{demo}{demo.xml}{}
\stoptyping
Watch out! These are not just setups, but specific \XML\ setups which get an
argument passed (the \type {#1}). If for some reason your \XML\ processing fails,
it might be that you mistakenly have used a normal setup definition. The argument
\type {#1} represents the current node (element) and is a unique identifier. For
instance a \type {<p>..</p>} can have an identifier \type {demo::5}. So, we can get
something like:
\starttyping
\xmlflush{demo::5}\endgraf
\stoptyping
but as well:
\starttyping
\xmlflush{demo::6}\endgraf
\stoptyping
Keep in mind that the references to the actual nodes (elements) are
abstractions, you never see those \type {::}'s, because we will use
either the abstract \type {#1} (any node) or an explicit reference like \type
{demo}. The previous setup when issued will be like:
\starttyping
\startchapter[title=\xmlfirst{demo::3}{/title}]
\xmlfirst{demo::4}{/content}
\stopchapter
\stoptyping
Here the \type {title} is used to typeset the chapter title but also for an entry
in the table of contents. At the moment the title is typeset the \XML\ node gets
looked up and expanded in real text. However, for the list it gets stored for
later use. One can argue that this is not needed for \XML, because one can just
filter all the titles and use page references, but then one also loses the
control one normally has over such titles. For instance it can be that some
titles are rendered differently and for that we need to keep track of usage.
Doing that with transformations or filtering is often more complex than leaving
that to \TEX. As soon as the list gets typeset, the reference (\type {demo::3})
is used for the lookup. This is because by default the title is stored as given.
So, as long as we make sure the \XML\ source is loaded before the table of
contents is typeset we're ok. Later we will look into this in more detail, for
now it's enough to know that in most cases the abstract \type {#1} reference will
work out ok.
Contrary to the style definitions this interface looks rather low level (with no
optional arguments) and the main reason for this is that we want processing to be
fast. So, the basic framework is:
\starttyping
\startxmlsetups xml:demo:base
% associate setups with elements
\stopxmlsetups
\xmlregisterdocumentsetup{demo}{xml:demo:base}
% define setups for matches
\xmlprocessfile{demo}{demo.xml}{}
\stoptyping
In this example we mostly just flush the content of an element and in the case of
a section we flush explicit child elements. The \type {#1} in the example code
represents the current element. The line:
\starttyping
\xmlsetsetup{demo}{*}{-}
\stoptyping
sets the default for each element to \quote {just ignore it}. A \type {+} would
make the default to always flush the content. This means that at this point we
only handle:
\starttyping
<section>
  <title>Some title</title>
  <content>
    <p>a paragraph of text</p>
  </content>
</section>
\stoptyping
In the next section we will deal with the slightly more complex itemize and
figure placement. At first sight all these setups may look overkill but keep in
mind that normally the number of elements is rather limited. The complexity is
often in the style and having access to each snippet of content is actually
quite handy for that.
\stopsection
\startsection[title={alternative solutions}]
Dealing with an itemize is rather simple (as long as we forget about
attributes that control the behaviour):
\starttyping
<itemize>
  <item>first</item>
  <item>second</item>
</itemize>
\stoptyping
First we need to add \type {itemize} to the setup assignment (unless we've used
the wildcard \type {*}):
\starttyping
\xmlsetsetup{demo}{document|section|p|itemize}{xml:demo:*}
\stoptyping
The setup can look like:
\starttyping
\startxmlsetups xml:demo:itemize
\startitemize
\xmlfilter{#1}{/item/command(xml:demo:itemize:item)}
\stopitemize
\stopxmlsetups
\startxmlsetups xml:demo:itemize:item
\startitem
\xmlflush{#1}
\stopitem
\stopxmlsetups
\stoptyping
An alternative is to map item directly:
\starttyping
\xmlsetsetup{demo}{document|section|p|itemize|item}{xml:demo:*}
\stoptyping
and use:
\starttyping
\startxmlsetups xml:demo:itemize
\startitemize
\xmlflush{#1}
\stopitemize
\stopxmlsetups
\startxmlsetups xml:demo:item
\startitem
\xmlflush{#1}
\stopitem
\stopxmlsetups
\stoptyping
Sometimes, a more local solution using filters and \type {/command(...)} makes more
sense, especially when the \type {item} tag is used for other purposes as well.
Explicit flushing with \type {command} is definitely the way to go when you have
complex products. In one of our projects we compose math school books from many
thousands of small \XML\ files, and from one source set several products are
typeset. Within a book sections get done differently, content gets used, ignored
or interpreted differently depending on the kind of content, so there is a
constant checking of attributes that drive the rendering. In that a generic setup
for a title element makes less sense than explicit ones for each case. (We're
talking of huge amounts of files here, including multiple images on each rendered
page.)
When using \type {command} you can pass two arguments, the first is the setup for
the match, the second one for the miss, as in:
\starttyping
\xmlfilter{#1}{/element/command(xml:true,xml:false)}
\stoptyping
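As an illustration of the two-argument form, the following sketch flushes a
caption when there is one and typesets a fallback text otherwise. The setup
names are hypothetical, and we assume that the second setup is only called when
nothing matches and that it does not need access to \type {#1}.

\starttyping
\startxmlsetups xml:demo:caption:found
  \xmlflush{#1}
\stopxmlsetups

\startxmlsetups xml:demo:caption:missing
  (no caption)
\stopxmlsetups

\startxmlsetups xml:demo:caption
  \xmlfilter{#1}{/caption/command(xml:demo:caption:found,xml:demo:caption:missing)}
\stopxmlsetups
\stoptyping

Keeping the fallback in its own setup means that the calling setup stays free
of explicit tests.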
Back to the example, this leaves us with dealing with the resources, like
figures:
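\starttyping
<resource type='figure'>
  <caption>A picture of a cow.</caption>
  <content><external file="cow.pdf"/></content>
</resource>
\stoptyping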
Here we can use a more restricted match:
\starttyping
\xmlsetsetup{demo}{resource[@type='figure']}{xml:demo:figure}
\xmlsetsetup{demo}{external}{xml:demo:*}
\stoptyping
and the definitions:
\starttyping
\startxmlsetups xml:demo:figure
\placefigure
{\xmlfirst{#1}{/caption}}
{\xmlfirst{#1}{/content}}
\stopxmlsetups
\startxmlsetups xml:demo:external
\externalfigure[\xmlatt{#1}{file}]
\stopxmlsetups
\stoptyping
At this point it is good to notice that \type {\xmlatt{#1}{file}} is passed as it
is: a macro call. This means that when a macro like \type {\externalfigure} uses
the first argument frequently without first storing its value, the lookup is done
several times. A solution for this is:
\starttyping
\startxmlsetups xml:demo:external
\expanded{\externalfigure[\xmlatt{#1}{file}]}
\stopxmlsetups
\stoptyping
Because the lookup is rather fast, normally there is no need to bother about this
too much because internally \CONTEXT\ already makes sure such expansion happens
only once.
An alternative definition for placement is the following:
\starttyping
\xmlsetsetup{demo}{resource}{xml:demo:resource}
\stoptyping
with:
\starttyping
\startxmlsetups xml:demo:resource
\placefloat
[\xmlatt{#1}{type}]
{\xmlfirst{#1}{/caption}}
{\xmlfirst{#1}{/content}}
\stopxmlsetups
\stoptyping
This way you can specify \type {table} as type too. Because you can define your
own float types, more complex variants are also possible. In that case it makes
sense to provide some default behaviour too:
\starttyping
\definefloat[figure-here][figure][default=here]
\definefloat[figure-left][figure][default=left]
\definefloat[table-here] [table] [default=here]
\definefloat[table-left] [table] [default=left]
\startxmlsetups xml:demo:resource
\placefloat
[\xmlattdef{#1}{type}{figure}-\xmlattdef{#1}{location}{here}]
{\xmlfirst{#1}{/caption}}
{\xmlfirst{#1}{/content}}
\stopxmlsetups
\stoptyping
In this example we support two types and two locations. We default to a figure
placed (when possible) at the current location.
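To make the fallback explicit, the lines below list which float the composed
name resolves to for a few attribute combinations. The attribute values are
only illustrative; the resulting names follow from the two \type {\xmlattdef}
defaults in the setup.

\starttyping
% type='table'  location='left' : \placefloat[table-left]
% type='table'  (no location)   : \placefloat[table-here]
% (no type)     location='left' : \placefloat[figure-left]
% (no type)     (no location)   : \placefloat[figure-here]
\stoptyping

Any other value, for instance a location \type {right}, asks for a float that
has not been defined here, so such values would need their own \type
{\definefloat} line.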
\stopsection
\stopchapter
\startchapter[title={Filtering content}]
\startsection[title={\TEX\ versus \LUA}]
It will not come as a surprise that we can access \XML\ files from \TEX\ as well
as from \LUA. In fact there are two methods to deal with \XML\ in \LUA. First
there are the low level \XML\ functions in the \type {xml} namespace. On top of
those functions there is a set of functions in the \type {lxml} namespace that
deals with \XML\ in a more \TEX ie way. Most of these have similar commands at
the \TEX\ end.
\startbuffer
\startxmlsetups first:demo:one
\xmlfilter {#1} {artist/name[text()='Randy Newman']/..
/albums/album[position()=3]/command(first:demo:two)}
\stopxmlsetups
\startxmlsetups first:demo:two
\blank \start \tt
\xmldisplayverbatim{#1}
\stop \blank
\stopxmlsetups
\xmlprocessfile{demo}{music-collection.xml}{first:demo:one}
\stopbuffer
\typebuffer
This gives the following snippet of verbatim \XML\ code. The indentation
matches that of the whole \XML\ file. \footnote {The (probably
outdated) \XML\ file contains the collection stored on my slimserver instance.
You can use the \type {mtxrun --script flac} to generate such files.}
\doifmodeelse {atpragma} {
\getbuffer
} {
\typefile{xml-mkiv-01.xml}
}
An alternative written in \LUA\ looks as follows:
\startbuffer
\blank \start \tt \startluacode
local m = lxml.load("mine","music-collection.xml") -- m == lxml.id("mine")
local p = "artist/name[text()='Randy Newman']/../albums/album[position()=4]"
local l = lxml.filter(m,p) -- returns a list (with one entry)
lxml.displayverbatim(l[1])
\stopluacode \stop \blank
\stopbuffer
\typebuffer
This produces:
\doifmodeelse {atpragma} {
\getbuffer
} {
\typefile{xml-mkiv-02.xml}
}
You can use both methods mixed but in practice we will use the \TEX\ commands in
regular styles and the mixture in modules, for instance in those dealing with
\MATHML\ and cals tables. For complex matters you can write your own finalizers
(the last action to be taken in a match) in \LUA\ and use them at the \TEX\ end.
\stopsection
\startsection[title={a few details}]
In \CONTEXT\ setups are a rather common variant on macros (\TEX\ commands) but
with their own namespace. An example of a setup is:
\starttyping
\startsetups doc:print
  \setuppapersize[A4][A4]
\stopsetups
\startsetups doc:screen
  \setuppapersize[S6][S4]
\stopsetups
\stoptyping
Given the previous definitions, later on we can say something like:
\starttyping
\doifmodeelse {paper} {
\setup[doc:print]
} {
\setup[doc:screen]
}
\stoptyping
Another example is:
\starttyping
\startsetups[doc:header]
  \marking[chapter]
  \space
  --
  \space
  \pagenumber
\stopsetups
\stoptyping
in combination with:
\starttyping
\setupheadertexts[\setup{doc:header}]
\stoptyping
Here the advantage is that instead of ending up with an unreadable header
definition, we use a nicely formatted setup. An important property of setups and
the reason why they were introduced long ago is that spaces and newlines are
ignored in the definition. This means that we don't have to worry about so called
spurious spaces but it also means that when we do want a space, we have to use
the \type {\space} command.
The only difference between setups and \XML\ setups is that the latter
get an argument (\type {#1}) that reflects the current node in the \XML\ tree.
\stopsection
\startsection[title={CDATA}]
What to do with \type {CDATA}? There are a few methods at the \LUA\ end for
dealing with it but here we just mention how you can influence the rendering.
There are four macros that play a role here:
\starttyping
\unexpanded\def\xmlcdataobeyedline {\obeyedline}
\unexpanded\def\xmlcdataobeyedspace{\strut\obeyedspace}
\unexpanded\def\xmlcdatabefore {\begingroup\tt}
\unexpanded\def\xmlcdataafter {\endgroup}
\stoptyping
Technically you can overload them but beware of side effects. Normally you won't
see much \type {CDATA} and whenever we do, it involves special data that needs
very special treatment anyway.
\stopsection
\startsection[title={Entities}]
As usual with any way of encoding documents you need escapes in order to encode
the characters that are used in tagging the content, embedding comments, escaping
special characters in strings (in programming languages), etc. In \XML\ this
means that in order to encode characters like \type {<} you need an escape like \type
{&lt;} and in order then to encode an \type {&} you need \type {&amp;}.
In a typesetting workflow using a programming language like \TEX, another problem
shows up. There we have different special characters, like \type {$ $} for triggering
math, but also the backslash, braces etc. Even one such special character is already
enough to have yet another escaping mechanism at work.
Ideally a user should not worry about these issues but it helps figuring out issues
when you know what happens under the hood. Also it is good to know that in the
code there are several ways to deal with these issues. Take the following document:
\starttyping
Here we have a bit of a &lt;&mess;&gt;:
# &#35;
% &#37;
\ &#92;
{ &#123;
| &#124;
} &#125;
~ &#126;
\stoptyping
When the file is read the \type {&lt;} entity will be replaced by \type {<} and
the \type {&gt;} by \type {>}. The numeric entities will be replaced by the
characters they refer to. The \type {&mess;} is kind of special. We do preload
a huge list of more or less standardized entities but \type {mess} is not in
there. However, it is possible to have it defined in the document preamble, like:
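\starttyping
<!DOCTYPE dummy SYSTEM "dummy.dtd" [
  <!ENTITY mess "what a mess" >
]>
\stoptyping
or even this:
\starttyping
<!DOCTYPE dummy SYSTEM "dummy.dtd" [
  <!ENTITY mess "<p>what a mess</p>" >
]>
\stoptyping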
You can also define it in your document style using one of:
\startxmlcmd {\cmdbasicsetup{xmlsetentity}}
replaces entity with name \cmdinternal {cd:name} by \cmdinternal {cd:text}
\stopxmlcmd
\startxmlcmd {\cmdbasicsetup{xmltexentity}}
replaces entity with name \cmdinternal {cd:name} by \cmdinternal {cd:text}
typeset under a \TEX\ regime
\stopxmlcmd
Such a definition will always have a higher priority than the one defined
in the document. Anyway, when the document is read in all entities are
resolved and those that need a special treatment because they map to some
text are stored in such a way that we can roundtrip them. As a consequence,
as soon as the content gets pushed into \TEX, we need not only to intercept
special characters but also have to make sure that the following works:
\starttyping
\xmltexentity {tex} {\TEX}
\stoptyping
Here the backslash starts a control sequence while in regular content a
backslash is just that: a backslash.
Special characters are really special when we have to move text around
in a \TEX\ ecosystem.
\starttyping
<title>About #3</title>
\stoptyping
If we map and define title as follows:
\starttyping
\startxmlsetups xml:title
  \title{\xmlflush{#1}}
\stopxmlsetups
\stoptyping
normally something like \type {\xmlflush {id::123}} will be written to the
auxiliary file and in most cases that is quite okay, but if we have this:
\starttyping
\setuphead[title][expansion=yes]
\stoptyping
then we don't want the \type {#} to end up as a hash, because later on \TEX\
would see it as a (probably unexpected) macro argument and get very confused.
This is solved by escaping the hash like this:
\starttyping
About \Ux{23}3
\stoptyping
The \type {\Ux} command will convert its hexadecimal argument into a
character. Of course one then needs to typeset such a text under a \TEX\
character regime but that is normally the case anyway.
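As a quick check at the \TEX\ end, the following lines call \type {\Ux} by hand
for the special characters shown earlier. This is only a sketch: it assumes
that \type {\Ux} can be used directly in running text, and the hexadecimal
values are simply the character codes of these characters.

\starttyping
% 23 = #   25 = %   5C = \   7B = {   7C = |   7D = }   7E = ~
\starttext
  \Ux{23} \Ux{25} \Ux{5C} \Ux{7B} \Ux{7C} \Ux{7D} \Ux{7E}
\stoptext
\stoptyping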
\stopsection
\stopchapter
\startchapter[title={Commands}]
\startsection[title={nodes and lpaths}]
The number of commands available for manipulating the \XML\ file is rather large.
Many of the commands cooperate with the already discussed setups, a fancy name
for a collection of macro calls, whether or not mixed with text.
Most of the commands are just shortcuts to \LUA\ calls, which means that the real
work is done by \LUA. In fact, what happens is that we have a continuous transfer
of control from \TEX\ to \LUA, where \LUA\ prints back either data (like element
content or attribute values) or just invokes a setup whereby it passes a
reference to the node resolved according to the path expression. The invoked setup
itself might return control to \LUA\ again, etc.
This sounds complicated but examples will show what we mean here. First we
present the whole repertoire of commands. Because users can read the source code,
they might uncover more commands, but only the ones discussed here are official.
The commands are grouped in categories.
In the following sections \cmdinternal {cd:node} means a reference to a node:
this can be the identifier of the root (the loaded xml tree) or a reference to a
node in that tree (often the result of some lookup). A \cmdinternal {cd:lpath} is
a fancy name for a path expression (as with \XSLT) but resolved by \LUA.
\stopsection
\startsection[title={commands}]
There are a lot of commands available but you probably can ignore most of them.
We try to be complete which means that there is for instance \type {\xmlfirst} as
well as \type {\xmllast} but you probably never need the last one. There are also
commands that were used when testing this interface and we see no reason to
remove them. Some obscure ones are used in modules and after a while even I often
forget that they exist. To give you an idea of what commands are important we
show their use in generating the \CONTEXT\ command definitions (\type
{x-set-11.mkiv}) as of January 2016:
\startcolumns[n=2,balance=yes]
\starttabulate[|l|r|]
\NC \type {\xmlall} \NC 1 \NC \NR
\NC \type {\xmlatt} \NC 23 \NC \NR
\NC \type {\xmlattribute} \NC 1 \NC \NR
\NC \type {\xmlcount} \NC 1 \NC \NR
\NC \type {\xmldoif} \NC 2 \NC \NR
\NC \type {\xmldoifelse} \NC 1 \NC \NR
\NC \type {\xmlfilterlist} \NC 4 \NC \NR
\NC \type {\xmlflush} \NC 5 \NC \NR
\NC \type {\xmlinclude} \NC 1 \NC \NR
\NC \type {\xmlloadonly} \NC 1 \NC \NR
\NC \type {\xmlregisterdocumentsetup} \NC 1 \NC \NR
\NC \type {\xmlsetsetup} \NC 1 \NC \NR
\NC \type {\xmlsetup} \NC 4 \NC \NR
\stoptabulate
\stopcolumns
As you can see filtering, flushing and accessing attributes score high. Below we show
the statistics of a quite complex rendering (5 variants of schoolbooks: basic book,
answers, teachers guide, worksheets, full blown version with extensive tracing).
\startcolumns[n=2,balance=yes]
\starttabulate[|l|r|]
\NC \type {\xmladdindex} \NC 3 \NC \NR
\NC \type {\xmlall} \NC 5 \NC \NR
\NC \type {\xmlappendsetup} \NC 1 \NC \NR
\NC \type {\xmlapplyselectors} \NC 1 \NC \NR
\NC \type {\xmlatt} \NC 40 \NC \NR
\NC \type {\xmlattdef} \NC 9 \NC \NR
\NC \type {\xmlattribute} \NC 10 \NC \NR
\NC \type {\xmlbadinclusions} \NC 3 \NC \NR
\NC \type {\xmlconcat} \NC 3 \NC \NR
\NC \type {\xmlcount} \NC 1 \NC \NR
\NC \type {\xmldelete} \NC 11 \NC \NR
\NC \type {\xmldoif} \NC 39 \NC \NR
\NC \type {\xmldoifelse} \NC 28 \NC \NR
\NC \type {\xmldoifelsetext} \NC 13 \NC \NR
\NC \type {\xmldoifnot} \NC 2 \NC \NR
\NC \type {\xmldoifnotselfempty} \NC 1 \NC \NR
\NC \type {\xmlfilter} \NC 100 \NC \NR
\NC \type {\xmlfirst} \NC 51 \NC \NR
\NC \type {\xmlflush} \NC 69 \NC \NR
\NC \type {\xmlflushcontext} \NC 2 \NC \NR
\NC \type {\xmlinclude} \NC 1 \NC \NR
\NC \type {\xmlincludeoptions} \NC 5 \NC \NR
\NC \type {\xmlinclusion} \NC 16 \NC \NR
\NC \type {\xmlinjector} \NC 1 \NC \NR
\NC \type {\xmlloaddirectives} \NC 1 \NC \NR
\NC \type {\xmlmapvalue} \NC 4 \NC \NR
\NC \type {\xmlmatch} \NC 1 \NC \NR
\NC \type {\xmlprependsetup} \NC 5 \NC \NR
\NC \type {\xmlregisterdocumentsetup} \NC 2 \NC \NR
\NC \type {\xmlregistersetup} \NC 1 \NC \NR
\NC \type {\xmlremapnamespace} \NC 1 \NC \NR
\NC \type {\xmlsetfunction} \NC 2 \NC \NR
\NC \type {\xmlsetinjectors} \NC 2 \NC \NR
\NC \type {\xmlsetsetup} \NC 11 \NC \NR
\NC \type {\xmlsetup} \NC 76 \NC \NR
\NC \type {\xmlstrip} \NC 1 \NC \NR
\NC \type {\xmlstripanywhere} \NC 1 \NC \NR
\NC \type {\xmltag} \NC 1 \NC \NR
\NC \type {\xmltext} \NC 53 \NC \NR
\NC \type {\xmlvalue} \NC 2 \NC \NR
\stoptabulate
\stopcolumns
Here many more are used but this is an exceptional case. The top is again
dominated by filtering, flushing and attribute consulting. The list can actually
be smaller. For instance, the \type {\xmlcount} can just as well be \type
{\xmlfilter} with a \type {count} finalizer. There are also some special ones,
like the injectors, that are needed for finetuning the final result.
\stopsection
\startsection[title={loading}]
\startxmlcmd {\cmdbasicsetup{xmlloadfile}}
loads the file \cmdinternal {cd:file} and registers it under \cmdinternal
{cd:name} and applies either given or standard \cmdinternal
{cd:xmlsetup} (alias: \type {\xmlload})
\stopxmlcmd
\startxmlcmd {\cmdbasicsetup{xmlloadbuffer}}
loads the buffer \cmdinternal {cd:buffer} and registers it under
\cmdinternal {cd:name} and applies either given or standard
\cmdinternal {cd:xmlsetup}
\stopxmlcmd
\startxmlcmd {\cmdbasicsetup{xmlloaddata}}
loads \cmdinternal {cd:text} and registers it under \cmdinternal
{cd:name} and applies either given or standard \cmdinternal
{cd:xmlsetup}
\stopxmlcmd
\startxmlcmd {\cmdbasicsetup{xmlloadonly}}
loads \cmdinternal {cd:text} and registers it under \cmdinternal
{cd:name} and applies either given or standard \cmdinternal
{cd:xmlsetup} but doesn't flush the content
\stopxmlcmd
\startxmlcmd {\cmdbasicsetup{xmlinclude}}
includes the file specified by attribute \cmdinternal {cd:name} of the
element located by \cmdinternal {cd:lpath} at node \cmdinternal {cd:node}
\stopxmlcmd
\startxmlcmd {\cmdbasicsetup{xmlprocessfile}}
registers file \cmdinternal {cd:file} as \cmdinternal {cd:name} and
processes the tree starting with \cmdinternal {cd:xmlsetup} (alias:
\type {\xmlprocess})
\stopxmlcmd
\startxmlcmd {\cmdbasicsetup{xmlprocessbuffer}}
registers buffer \cmdinternal {cd:buffer} as \cmdinternal {cd:name} and processes
the tree starting with \cmdinternal {cd:xmlsetup}
\stopxmlcmd
\startxmlcmd {\cmdbasicsetup{xmlprocessdata}}
registers \cmdinternal {cd:text} as \cmdinternal {cd:name} and processes
the tree starting with \cmdinternal {cd:xmlsetup}
\stopxmlcmd
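For quick experiments a buffer is often more convenient than a file. The
following sketch repeats the converter from the first chapter but reads the
\XML\ from a buffer. The buffer name \type {demo-source} and its content are
assumptions made for this illustration, and we assume that the arguments are
the document name, the buffer name and an (empty) initial setup.

\starttyping
\startbuffer[demo-source]
<document>
  <section>
    <title>Some title</title>
    <content>
      <p>a paragraph of text</p>
    </content>
  </section>
</document>
\stopbuffer

\startxmlsetups xml:demo:base
  \xmlsetsetup{#1}{document|section|p}{xml:demo:*}
\stopxmlsetups

\xmlregisterdocumentsetup{demo}{xml:demo:base}

\startxmlsetups xml:demo:document
  \xmlflush{#1}
\stopxmlsetups

\startxmlsetups xml:demo:section
  \startchapter[title=\xmlfirst{#1}{/title}]
    \xmlfirst{#1}{/content}
  \stopchapter
\stopxmlsetups

\startxmlsetups xml:demo:p
  \xmlflush{#1}\endgraf
\stopxmlsetups

\starttext
  \xmlprocessbuffer{demo}{demo-source}{}
\stoptext
\stoptyping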
The initial setup defaults to \type {xml:process} that is defined
as follows:
\starttyping
\startsetups xml:process
\xmlregistereddocumentsetups\xmldocument
\xmlmain\xmldocument
\stopsetups
\stoptyping
First we apply the setups associated with the document (including common setups)
and then we flush the whole document. The macro \type {\xmldocument} expands to
the current document id. There is also \type {\xmlself} which expands to the
current node number (\type {#1} in setups).
\startxmlcmd {\cmdbasicsetup{xmlmain}}
returns the whole document
\stopxmlcmd
Normally such a flush will trigger a chain reaction of setups associated with the
child elements.
\stopsection
\startsection[title={saving}]
\startxmlcmd {\cmdbasicsetup{xmlsave}}
saves the given node \cmdinternal {cd:node} in the file \cmdinternal {cd:file}
\stopxmlcmd
\startxmlcmd {\cmdbasicsetup{xmltofile}}
saves the match of \cmdinternal {cd:lpath} in the file \cmdinternal {cd:file}
\stopxmlcmd
\startxmlcmd {\cmdbasicsetup{xmltobuffer}}
saves the match of \cmdinternal {cd:lpath} in the buffer \cmdinternal {cd:buffer}
\stopxmlcmd
\startxmlcmd {\cmdbasicsetup{xmltobufferverbose}}
saves the match of \cmdinternal {cd:lpath} verbatim in the buffer \cmdinternal
{cd:buffer}
\stopxmlcmd
% \startxmlcmd {\cmdbasicsetup{xmltoparameters}}
% converts the match of \cmdinternal {cd:lpath} to key|/|values (for tracing)
% \stopxmlcmd
The next command is only needed when you have messed with the tree using
\LUA\ code.
\startxmlcmd {\cmdbasicsetup{xmladdindex}}
(re)indexes a tree
\stopxmlcmd
The following macros are only used in special situations and are not really meant
for users.
\startxmlcmd {\cmdbasicsetup{xmlraw}}
flush the content of \cmdinternal {cd:node} with original entities
\stopxmlcmd
\startxmlcmd {\cmdbasicsetup{startxmlraw}}
flush the wrapped content with original entities
\stopxmlcmd
\stopsection
\startsection[title={flushing data}]
When we flush an element, the associated \XML\ setups are expanded. The most
straightforward way to flush an element is the following. Keep in mind that the
returned values themselves can trigger setups and therefore flushes.
\startxmlcmd {\cmdbasicsetup{xmlflush}}
returns all nodes under \cmdinternal {cd:node}
\stopxmlcmd
You can restrict flushing by using commands that accept a specification.
\startxmlcmd {\cmdbasicsetup{xmltext}}
returns the text of the matching \cmdinternal {cd:lpath} under \cmdinternal
{cd:node}
\stopxmlcmd
\startxmlcmd {\cmdbasicsetup{xmlpure}}
returns the text of the matching \cmdinternal {cd:lpath} under \cmdinternal
{cd:node} without \type {\Ux} escaped special \TEX\ characters
\stopxmlcmd
\startxmlcmd {\cmdbasicsetup{xmlflushtext}}
returns the text of the \cmdinternal {cd:node}
\stopxmlcmd
\startxmlcmd {\cmdbasicsetup{xmlflushpure}}
returns the text of the \cmdinternal {cd:node} without \type {\Ux} escaped
special \TEX\ characters
\stopxmlcmd
\startxmlcmd {\cmdbasicsetup{xmlnonspace}}
returns the text of the matching \cmdinternal {cd:lpath} under \cmdinternal
{cd:node} without embedded spaces
\stopxmlcmd
\startxmlcmd {\cmdbasicsetup{xmlall}}
returns all nodes under \cmdinternal {cd:node} that match \cmdinternal
{cd:lpath}
\stopxmlcmd
\startxmlcmd {\cmdbasicsetup{xmllastmatch}}
returns all nodes found in the last match
\stopxmlcmd
\startxmlcmd {\cmdbasicsetup{xmlfirst}}
returns the first node under \cmdinternal {cd:node} that matches \cmdinternal
{cd:lpath}
\stopxmlcmd
\startxmlcmd {\cmdbasicsetup{xmllast}}
returns the last node under \cmdinternal {cd:node} that matches \cmdinternal
{cd:lpath}
\stopxmlcmd
\startxmlcmd {\cmdbasicsetup{xmlfilter}}
at a match of \cmdinternal {cd:lpath} a given filter \type {filter} is applied
and the result is returned
\stopxmlcmd
\startxmlcmd {\cmdbasicsetup{xmlsnippet}}
returns the \cmdinternal {cd:number}\high{th} element under \cmdinternal
{cd:node}
\stopxmlcmd
\startxmlcmd {\cmdbasicsetup{xmlposition}}
returns the \cmdinternal {cd:number}\high{th} match of \cmdinternal
{cd:lpath} at node \cmdinternal {cd:node}; a negative number starts at the
end (alias: \type {\xmlindex})
\stopxmlcmd
\startxmlcmd {\cmdbasicsetup{xmlelement}}
returns the \cmdinternal {cd:number}\high{th} child of node \cmdinternal {cd:node};
a negative number starts at the end
\stopxmlcmd
\startxmlcmd {\cmdbasicsetup{xmlpos}}
returns the index (position) in the parent node of \cmdinternal {cd:node}
\stopxmlcmd
\startxmlcmd {\cmdbasicsetup{xmlconcat}}
returns the sequence of nodes that match \cmdinternal {cd:lpath} at
\cmdinternal {cd:node} whereby \cmdinternal {cd:text} is put between each
match
\stopxmlcmd
\startxmlcmd {\cmdbasicsetup{xmlconcatrange}}
returns the \cmdinternal {cd:first}\high {th} up to \cmdinternal
{cd:last}\high {th} of nodes that match \cmdinternal {cd:lpath} at
\cmdinternal {cd:node} whereby \cmdinternal {cd:text} is put between each
match
\stopxmlcmd
\startxmlcmd {\cmdbasicsetup{xmlcommand}}
apply the given \cmdinternal {cd:xmlsetup} to each match of \cmdinternal
{cd:lpath} at node \cmdinternal {cd:node}
\stopxmlcmd
\startxmlcmd {\cmdbasicsetup{xmlstrip}}
remove leading and trailing spaces from nodes under \cmdinternal {cd:node}
that match \cmdinternal {cd:lpath}
\stopxmlcmd
\startxmlcmd {\cmdbasicsetup{xmlstripped}}
remove leading and trailing spaces from nodes under \cmdinternal {cd:node}
that match \cmdinternal {cd:lpath} and return the content afterwards
\stopxmlcmd
\startxmlcmd {\cmdbasicsetup{xmlstripnolines}}
remove leading and trailing spaces as well as collapse embedded spaces
from nodes under \cmdinternal {cd:node} that match \cmdinternal {cd:lpath}
\stopxmlcmd
\startxmlcmd {\cmdbasicsetup{xmlstrippednolines}}
remove leading and trailing spaces as well as collapse embedded spaces from
nodes under \cmdinternal {cd:node} that match \cmdinternal {cd:lpath} and
return the content afterwards
\stopxmlcmd
\startxmlcmd {\cmdbasicsetup{xmlverbatim}}
flushes the content as verbatim code (without any wrapping, i.e. no fonts
are selected and such)
\stopxmlcmd
\startxmlcmd {\cmdbasicsetup{xmlinlineverbatim}}
return the content of the node as inline verbatim code; no further
interpretation (expansion) takes place and spaces are honoured; it uses the
following wrapper
\stopxmlcmd
\startxmlcmd {\cmdbasicsetup{startxmlinlineverbatim}}
wraps inline verbatim mode using the environment specified (a prefix \type
{xml:} is added to the environment name)
\stopxmlcmd
\startxmlcmd {\cmdbasicsetup{xmldisplayverbatim}}
return the content of the node as display verbatim code; no further
interpretation (expansion) takes place and leading and trailing spaces and
newlines are treated specially; it uses the following wrapper
\stopxmlcmd
\startxmlcmd {\cmdbasicsetup{startxmldisplayverbatim}}
wraps the content in display verbatim using the environment specified (a prefix
\type {xml:} is added to the environment name)
\stopxmlcmd
\startxmlcmd {\cmdbasicsetup{xmlprettyprint}}
pretty print (with colors) the node \cmdinternal {cd:node}; use the \CONTEXT\
\SCITE\ lexers when available (\type {\usemodule [scite]})
\stopxmlcmd
\startxmlcmd {\cmdbasicsetup{xmlflushspacewise}}
flush node \cmdinternal {cd:node} obeying spaces and newlines
\stopxmlcmd
\startxmlcmd {\cmdbasicsetup{xmlflushlinewise}}
flush node \cmdinternal {cd:node} obeying newlines
\stopxmlcmd
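As a minimal sketch of how these flushing commands combine, assume a hypothetical
\type {article} element that holds a \type {title}, several \type {author}
elements and a number of \type {p} elements. The element names and the \type
{demo:article} setup name are made up for this illustration; only the macros are
taken from the descriptions above.

\starttyping
\startxmlsetups demo:article
    % the text of the title child only
    \startsubject[title={\xmltext{#1}{/title}}]
        % all authors, separated by a comma
        \xmlconcat{#1}{/author}{, }\par
        % only the first paragraph, comparable to \xmlfirst{#1}{/p}
        \xmlfilter{#1}{/p/first()}\par
        % the last paragraph: a negative number counts from the end
        \xmlposition{#1}{/p}{-1}\par
    \stopsubject
\stopxmlsetups
\stoptyping

Each of these calls flushes nodes, so any setups associated with the matched
elements are expanded along the way.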
\stopsection
\startsection[title={information}]
The following commands return strings. Normally these are used in tests.
\startxmlcmd {\cmdbasicsetup{xmlname}}
returns the complete name (including namespace prefix) of the
given \cmdinternal {cd:node}
\stopxmlcmd
\startxmlcmd {\cmdbasicsetup{xmlnamespace}}
returns the namespace of the given \cmdinternal {cd:node}
\stopxmlcmd
\startxmlcmd {\cmdbasicsetup{xmltag}}
returns the tag of the element, without namespace prefix
\stopxmlcmd
\startxmlcmd {\cmdbasicsetup{xmlcount}}
returns the number of matches of \cmdinternal {cd:lpath} at node \cmdinternal
{cd:node}
\stopxmlcmd
\startxmlcmd {\cmdbasicsetup{xmlatt}}
returns the value of attribute \cmdinternal {cd:name} or empty if no such
attribute exists
\stopxmlcmd
\startxmlcmd {\cmdbasicsetup{xmlattdef}}
returns the value of attribute \cmdinternal {cd:name} or \cmdinternal
{cd:string} if no such attribute exists
\stopxmlcmd
\startxmlcmd {\cmdbasicsetup{xmlrefatt}}
returns the value of attribute \cmdinternal {cd:name} or empty if no such
attribute exists; a leading \type {#} is removed (nicer for tex)
\stopxmlcmd
\startxmlcmd {\cmdbasicsetup{xmlchainatt}}
returns the value of attribute \cmdinternal {cd:name} or empty if no such
attribute exists; backtracks till a match is found
\stopxmlcmd
\startxmlcmd {\cmdbasicsetup{xmlchainattdef}}
returns the value of attribute \cmdinternal {cd:name} or \cmdinternal
{cd:string} if no such attribute exists; backtracks till a match is found
\stopxmlcmd
\startxmlcmd {\cmdbasicsetup{xmlattribute}}
finds a first match for \cmdinternal {cd:lpath} at \cmdinternal {cd:node} and
returns the value of attribute \cmdinternal {cd:name} or empty if no such
attribute exists
\stopxmlcmd
\startxmlcmd {\cmdbasicsetup{xmlattributedef}}
finds a first match for \cmdinternal {cd:lpath} at \cmdinternal {cd:node} and
returns the value of attribute \cmdinternal {cd:name} or \cmdinternal
{cd:text} if no such attribute exists
\stopxmlcmd
\startxmlcmd {\cmdbasicsetup{xmllastatt}}
returns the last attribute found (this avoids a lookup)
\stopxmlcmd
\startxmlcmd {\cmdbasicsetup{xmlsetatt}}
set the value of attribute \cmdinternal {cd:name}
\stopxmlcmd
\startxmlcmd {\cmdbasicsetup{xmlsetattribute}}
set the value of attribute \cmdinternal {cd:name} for each match of \cmdinternal
{cd:lpath}
\stopxmlcmd
\stopsection
\startsection[title={manipulation}]
You can use \LUA\ code to manipulate the tree and it makes no sense to duplicate
this in \TEX. In the future we might provide an interface to some of this
functionality. Keep in mind that manipulating the tree might have side effects as
we maintain several indices into the tree that then also need to be updated.
\stopsection
\startsection[title={integration}]
If you write a module that deals with \XML, for instance processing cals tables,
then you need ways to control specific behaviour. For instance, you might want to
add a background to the table. Such directives are collected in \XML\ files and
can be loaded on demand.
\startxmlcmd {\cmdbasicsetup{xmlloaddirectives}}
loads \CONTEXT\ directives from \cmdinternal {cd:file} that will get
interpreted when processing documents
\stopxmlcmd
A directives definition file looks as follows:
\starttyping
\stoptyping
Examples of usage can be found in \type {x-cals.mkiv}. The directive is triggered
by an attribute. Instead of a single setup you can specify setups to be applied
before and after the node gets flushed.
\startxmlcmd {\cmdbasicsetup{xmldirectives}}
apply the setups directive associated with the node
\stopxmlcmd
\startxmlcmd {\cmdbasicsetup{xmldirectivesbefore}}
apply the before directives associated with the node
\stopxmlcmd
\startxmlcmd {\cmdbasicsetup{xmldirectivesafter}}
apply the after directives associated with the node
\stopxmlcmd
\startxmlcmd {\cmdbasicsetup{xmlinstalldirective}}
defines a directive that hooks into a handler
\stopxmlcmd
Normally a directive will be put in the \XML\ file, for instance as:
\starttyping
\stoptyping
Here the \type {mathml} is the general class of directives and \type {minus} a
subclass, in our case a specific element.
\stopsection
\startsection[title={setups}]
The basic building blocks of \XML\ processing are setups. These are just
collections of macros that are expanded. These setups get one argument passed
(\type {#1}):
\starttyping
\startxmlsetups somedoc:somesetup
\xmlflush{#1}
\stopxmlsetups
\stoptyping
This argument is normally a number that internally refers to a specific node in
the \XML\ tree. The user should see it as an abstract reference and not depend on
its numeric property. Just think of it as \quote {the current node}. You can (and
probably will) call such setups using:
\startxmlcmd {\cmdbasicsetup{xmlsetup}}
expands setup \cmdinternal {cd:setup} and pass \cmdinternal {cd:node} as
argument
\stopxmlcmd
However, in most cases the setups are associated with specific elements,
something that users of \XSLT\ might recognize as templates.
\startxmlcmd {\cmdbasicsetup{xmlsetfunction}}
associates function \cmdinternal {cd:luafunction} to the elements in
namespace \cmdinternal {cd:name} that match \cmdinternal {cd:lpath}
\stopxmlcmd
\startxmlcmd {\cmdbasicsetup{xmlsetsetup}}
associates setups \cmdinternal {cd:setup} (\TEX\ code) with the matching
nodes of \cmdinternal {cd:lpath} or root \cmdinternal {cd:node}
\stopxmlcmd
\startxmlcmd {\cmdbasicsetup{xmlprependsetup}}
pushes \cmdinternal {cd:setup} to the front of the global list of setups
\stopxmlcmd
\startxmlcmd {\cmdbasicsetup{xmlappendsetup}}
adds \cmdinternal {cd:setup} to the global list of setups to be applied
(alias: \type{\xmlregistersetup})
\stopxmlcmd
\startxmlcmd {\cmdbasicsetup{xmlbeforesetup}}
pushes \cmdinternal {cd:setup} into the global list of setups; the
last setup is the position
\stopxmlcmd
\startxmlcmd {\cmdbasicsetup{xmlaftersetup}}
adds \cmdinternal {cd:setup} to the global list of setups; the last setup
is the position
\stopxmlcmd
\startxmlcmd {\cmdbasicsetup{xmlremovesetup}}
removes \cmdinternal {cd:setup} from the global list of setups
\stopxmlcmd
\startxmlcmd {\cmdbasicsetup{xmlprependdocumentsetup}}
pushes \cmdinternal {cd:setup} to the front of the list of setups to be applied
to \cmdinternal {cd:name}
\stopxmlcmd
\startxmlcmd {\cmdbasicsetup{xmlappenddocumentsetup}}
adds \cmdinternal {cd:setup} to the list of setups to be applied to
\cmdinternal {cd:name} (you can also use the alias: \type
{\xmlregisterdocumentsetup})
\stopxmlcmd
\startxmlcmd {\cmdbasicsetup{xmlbeforedocumentsetup}}
pushes \cmdinternal {cd:setup} into the setups to be applied to \cmdinternal
{cd:name}; the last setup is the position
\stopxmlcmd
\startxmlcmd {\cmdbasicsetup{xmlafterdocumentsetup}}
adds \cmdinternal {cd:setup} to the setups to be applied to \cmdinternal
{cd:name}; the last setup is the position
\stopxmlcmd
\startxmlcmd {\cmdbasicsetup{xmlremovedocumentsetup}}
removes \cmdinternal {cd:setup} from the global list of setups to be applied
to \cmdinternal {cd:name}
\stopxmlcmd
\startxmlcmd {\cmdbasicsetup{xmlresetsetups}}
removes all global setups
\stopxmlcmd
\startxmlcmd {\cmdbasicsetup{xmlresetdocumentsetups}}
removes all setups from the \cmdinternal {cd:name} specific list of setups to
be applied
\stopxmlcmd
\startxmlcmd {\cmdbasicsetup{xmlflushdocumentsetups}{setup}}
applies \cmdinternal {cd:setup} (can be a list) to \cmdinternal {cd:name}
\stopxmlcmd
\startxmlcmd {\cmdbasicsetup{xmlregisteredsetups}}
applies all global setups to the current document
\stopxmlcmd
\startxmlcmd {\cmdbasicsetup{xmlregistereddocumentsetups}}
applies all document specific \cmdinternal {cd:setup} to document
\cmdinternal {cd:name}
\stopxmlcmd
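A typical arrangement might look as follows. This is only a sketch: the buffer
content, the document name \type {demo} and the \type {xml:demo:} setup names are
assumptions made for the illustration.

\starttyping
\startbuffer[demo]
<document>
    <title>Some title</title>
    <p>Some text.</p>
</document>
\stopbuffer

% map each element onto a setup with the same name
\startxmlsetups xml:demo:base
    \xmlsetsetup{#1}{document|title|p}{xml:demo:*}
\stopxmlsetups

% apply the mapping only to the document named demo
\xmlregisterdocumentsetup{demo}{xml:demo:base}

\startxmlsetups xml:demo:document
    \xmlflush{#1}
\stopxmlsetups

\startxmlsetups xml:demo:title
    \subject{\xmlflush{#1}}
\stopxmlsetups

\startxmlsetups xml:demo:p
    \xmlflush{#1}\par
\stopxmlsetups

\xmlprocessbuffer{demo}{demo}{}
\stoptyping

Registering the base setup per document keeps the mapping from leaking into other
documents that are processed in the same run.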
\stopsection
\startsection[title={testing}]
The following test macros all take a \cmdinternal {cd:node} as first argument
and an \cmdinternal {cd:lpath} as second:
\startxmlcmd {\cmdbasicsetup{xmldoif}}
expands to \cmdinternal {cd:true} when \cmdinternal {cd:lpath} matches at
node \cmdinternal {cd:node}
\stopxmlcmd
\startxmlcmd {\cmdbasicsetup{xmldoifnot}}
expands to \cmdinternal {cd:true} when \cmdinternal {cd:lpath} does not match
at node \cmdinternal {cd:node}
\stopxmlcmd
\startxmlcmd {\cmdbasicsetup{xmldoifelse}}
expands to \cmdinternal {cd:true} when \cmdinternal {cd:lpath} matches at
node \cmdinternal {cd:node} and to \cmdinternal {cd:false} otherwise
\stopxmlcmd
\startxmlcmd {\cmdbasicsetup{xmldoiftext}}
expands to \cmdinternal {cd:true} when the node matching \cmdinternal
{cd:lpath} at node \cmdinternal {cd:node} has some content
\stopxmlcmd
\startxmlcmd {\cmdbasicsetup{xmldoifnottext}}
expands to \cmdinternal {cd:true} when the node matching \cmdinternal
{cd:lpath} at node \cmdinternal {cd:node} has no content
\stopxmlcmd
\startxmlcmd {\cmdbasicsetup{xmldoifelsetext}}
expands to \cmdinternal {cd:true} when the node matching \cmdinternal
{cd:lpath} at node \cmdinternal {cd:node} has content and to \cmdinternal
{cd:false} otherwise
\stopxmlcmd
\startxmlcmd {\cmdbasicsetup{xmldoifelseempty}}
expands to \cmdinternal {cd:true} when the node matching \cmdinternal
{cd:lpath} at node \cmdinternal {cd:node} is empty and to \cmdinternal
{cd:false} otherwise
\stopxmlcmd
\startxmlcmd {\cmdbasicsetup{xmldoifelseselfempty}}
expands to \cmdinternal {cd:true} when the node is empty and to \cmdinternal
{cd:false} otherwise
\stopxmlcmd
\startxmlcmd {\cmdbasicsetup{xmldoifselfempty}}
expands to \cmdinternal {cd:true} when \cmdinternal {cd:node} is empty
\stopxmlcmd
\startxmlcmd {\cmdbasicsetup{xmldoifnotselfempty}}
expands to \cmdinternal {cd:true} when \cmdinternal {cd:node} is not empty
\stopxmlcmd
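As an illustration, the next sketch prints a label only when one is present. The
\type {label} and \type {body} element names and the \type {demo:item} setup name
are hypothetical.

\starttyping
\startxmlsetups demo:item
    \xmldoifelsetext {#1} {/label} {
        % the label child has content
        \xmltext{#1}{/label}
    } {
        % no label, or an empty one
        (no label)
    }
    \quad \xmltext{#1}{/body}\par
\stopxmlsetups
\stoptyping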
\stopsection
\startsection[title={initialization}]
The general setup command (not to be confused with setups) that deals with the
\MKIV\ tree handler is \type {\setupxml}. There are currently only a few options.
\cmdfullsetup{setupxml}
When you set \type {default} to \cmdinternal {cd:text} elements with no setup
assigned will end up as text. When set to \type {hidden} such elements will be
hidden. You can apply the default yourself using:
\startxmlcmd {\cmdbasicsetup{xmldefaulttotext}}
presets the tree with root \cmdinternal {cd:node} to the handlers set up with
\type {\setupxml} option \cmdinternal{default}
\stopxmlcmd
You can set \type {compress} to \type {yes} in which case comments are stripped
from the tree when the file is read.
\startxmlcmd {\cmdbasicsetup{xmlregisterns}}
associates an internal namespace (like \type {mml}) with one given in the
document as \URL\ (like mathml)
\stopxmlcmd
\startxmlcmd {\cmdbasicsetup{xmlremapname}}
changes the namespace and tag of the matching elements
\stopxmlcmd
\startxmlcmd {\cmdbasicsetup{xmlremapnamespace}}
replaces all references to the given namespace by a new one (applied
recursively)
\stopxmlcmd
\startxmlcmd {\cmdbasicsetup{xmlchecknamespace}}
sets the namespace of the matching elements unless a namespace is already set
\stopxmlcmd
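A possible initialization, given here as a sketch that uses the option values and
the namespace pair mentioned above, could be:

\starttyping
% elements without a setup are flushed as text
% and comments are stripped while loading
\setupxml
  [default=text,
   compress=yes]

% use the internal prefix mml for the namespace identified by mathml
\xmlregisterns{mml}{mathml}
\stoptyping

These settings have to be in place before the document is loaded, because the
stripping happens when the file is read.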
\stopsection
\startsection[title={helpers}]
Often an attribute will determine the rendering and this may result in many
tests. Especially when we have multiple attributes that control the output such
tests can become rather extensive and redundant because one gets $n\times m$ or
more such tests.
Therefore we have a convenient way to map attributes onto for instance strings or
commands.
\startxmlcmd {\cmdbasicsetup{xmlmapvalue}}
associate a \cmdinternal {cd:text} with a \cmdinternal {cd:category} and
\cmdinternal {cd:name} (alias: \type{\xmlmapval})
\stopxmlcmd
\startxmlcmd {\cmdbasicsetup{xmlvalue}}
expand the value associated with a \cmdinternal {cd:category} and
\cmdinternal {cd:name} and if not resolved, expand to the \cmdinternal
{cd:text} (alias: \type{\xmlval})
\stopxmlcmd
\startxmlcmd {\cmdbasicsetup{xmldoifelsevalue}}
expands to \cmdinternal {cd:true} when a value is associated with the given
\cmdinternal {cd:category} and \cmdinternal {cd:name} and to \cmdinternal
{cd:false} otherwise
\stopxmlcmd
This is used as follows. We define a couple of mappings in the same category:
\starttyping
\xmlmapvalue{emph}{bold} {\bf}
\xmlmapvalue{emph}{italic}{\it}
\stoptyping
Assuming that we have associated the following setup with the \type {emph}
element, we can say (with \type {#1} being the current element):
\starttyping
\startxmlsetups demo:emph
\begingroup
\xmlvalue{emph}{\xmlatt{#1}{type}}{}
\xmlflush{#1}
\endgroup
\stopxmlsetups
\stoptyping
In this case we have no default. The \type {type} attribute triggers the actions,
as in:
\starttyping
normal <emph type="bold">bold</emph> normal
\stoptyping
This mechanism is not really bound to elements and attributes so you can use this
mechanism for other purposes as well.
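When a fallback is wanted, the third argument of \type {\xmlvalue} can carry it.
The following variant is a sketch that assumes slanted text as the default for
unknown or missing \type {type} values; the choice of \type {\sl} is arbitrary.

\starttyping
\xmlmapvalue{emph}{bold}  {\bf}
\xmlmapvalue{emph}{italic}{\it}

\startxmlsetups demo:emph
    \begingroup
        % use \sl when the type attribute has no mapping
        \xmlvalue{emph}{\xmlatt{#1}{type}}{\sl}
        \xmlflush{#1}
    \endgroup
\stopxmlsetups
\stoptyping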
\stopsection
\startsection[title={Parameters}]
\startbuffer[test]
beta
\stopbuffer
\startbuffer
\startxmlsetups xml:mysetups
\xmlsetsetup{\xmldocument}{*}{xml:*}
\stopxmlsetups
\xmlregistersetup{xml:mysetups}
\startxmlsetups xml:something
parameter : \xmlpar {#1}{whatever}\par
attribute : \xmlatt {#1}{whatever}\par
text : \xmlfirst {#1}{what} \par
\xmlsetpar{#1}{whatever}{gamma}
parameter : \xmlpar {#1}{whatever}\par
\xmlflush{#1}
\stopxmlsetups
\startxmlsetups xml:what
what: \xmlflush{#1}\par
parameter : \xmlparam{#1}{..}{whatever}\par
\stopxmlsetups
\xmlprocessbuffer{main}{test}{}
\stopbuffer
Say that we have this \XML\ blob:
\typebuffer[test]
With:
\typebuffer
we get:
\getbuffer
Parameters are stored with a node.
\startxmlcmd {\cmdbasicsetup{xmlpar}}
returns the value of parameter \cmdinternal {cd:name} or empty if no such
parameter exists
\stopxmlcmd
\startxmlcmd {\cmdbasicsetup{xmlparam}}
finds a first match for \cmdinternal {cd:lpath} at \cmdinternal {cd:node} and
returns the value of parameter \cmdinternal {cd:name} or empty if no such
parameter exists
\stopxmlcmd
\startxmlcmd {\cmdbasicsetup{xmllastpar}}
returns the last parameter found (this avoids a lookup)
\stopxmlcmd
\startxmlcmd {\cmdbasicsetup{xmlsetpar}}
set the value of parameter \cmdinternal {cd:name}
\stopxmlcmd
\startxmlcmd {\cmdbasicsetup{xmlsetparam}}
set the value of parameter \cmdinternal {cd:name} for each match of \cmdinternal
{cd:lpath}
\stopxmlcmd
\stopsection
\stopchapter
\startchapter[title={Expressions and filters}]
\startsection[title={path expressions}]
In the previous chapters we used \cmdinternal {cd:lpath} expressions, which are a variant
on \type {xpath} expressions as in \XSLT\ but in this case more geared towards
usage in \TEX. This mechanism will be extended when there is demand.
A path is a sequence of matches. A simple path expression is:
\starttyping
a/b/c/d
\stoptyping
Here each \type {/} goes one level deeper. We can go backwards in a lookup with
\type {..}:
\starttyping
a/b/../d
\stoptyping
We can also combine lookups, as in:
\starttyping
a/(b|c)/d
\stoptyping
A negated lookup is preceded by a \type {!}:
\starttyping
a/(b|c)/!d
\stoptyping
A wildcard is specified with a \type {*}:
\starttyping
a/(b|c)/!d/e/*/f
\stoptyping
In addition to these tag based lookups we can use attributes:
\starttyping
a/(b|c)/!d/e/*/f[@type=whatever]
\stoptyping
An \type {@} as first character means that we are dealing with an attribute.
Within the square brackets there can be boolean expressions:
\starttyping
a/(b|c)/!d/e/*/f[@type=whatever and @id>100]
\stoptyping
You can use functions as in:
\starttyping
a/(b|c)/!d/e/*/f[something(text()) == "oeps"]
\stoptyping
There are a couple of predefined functions:
\starttabulate[|l|l|p|]
\NC \type{rootposition} \type{order} \NC number \NC the index of the matched root element (kind of special) \NC \NR
\NC \type{position} \NC number \NC the current index of the matched element in the match list \NC \NR
\NC \type{match} \NC number \NC the current index of the matched element sub list with the same parent \NC \NR
\NC \type{first} \NC number \NC \NC \NR
\NC \type{last} \NC number \NC \NC \NR
\NC \type{index} \NC number \NC the current index of the matched element in its parent list \NC \NR
\NC \type{firstindex} \NC number \NC \NC \NR
\NC \type{lastindex} \NC number \NC \NC \NR
\NC \type{element} \NC number \NC the element's index \NC \NR
\NC \type{firstelement} \NC number \NC \NC \NR
\NC \type{lastelement} \NC number \NC \NC \NR
\NC \type{text} \NC string \NC the textual representation of the matched element \NC \NR
\NC \type{content} \NC table \NC the node of the matched element \NC \NR
\NC \type{name} \NC string \NC the full name of the matched element: namespace and tag \NC \NR
\NC \type{namespace} \type{ns} \NC string \NC the namespace of the matched element \NC \NR
\NC \type{tag} \NC string \NC the tag of the matched element \NC \NR
\NC \type{attribute} \NC string \NC the value of the attribute with the given name of the matched element \NC \NR
\stoptabulate
There are fundamental differences between \type {position}, \type {match} and
\type {index}. Each step results in a new list of matches. The \type {position}
is the index in this new (possibly intermediate) list. The \type {match} is also
an index in this list but related to the specific match of element names. The
\type {index} refers to the location in the parent element.
Say that we have:
\starttyping
.1.
.1.
.2.
.2.
.3.
.3.
\stoptyping
The following then applies:
\starttabulate[|l|l|]
\NC \type {collection/resources/manual[position()==1]/paper} \NC \type{.1.} \NC \NR
\NC \type {collection/resources/manual[match()==1]/paper} \NC \type{.1.} \type{.3.} \NC \NR
\NC \type {collection/resources/manual/paper[index()==1]} \NC \type{.2.} \NC \NR
\stoptabulate
In most cases the \type {position} test is more restrictive than the \type
{match} test.
You can pass your own functions too. Such functions are defined in the \type
{xml.expressions} namespace. We have defined a few shortcuts:
\starttabulate[|l|l|]
\NC \type {find(str,pattern)} \NC \type{string.find} \NC \NR
\NC \type {contains(str)} \NC \type{string.find} \NC \NR
\NC \type {oneof(str,...)} \NC is \type{str} in list \NC \NR
\NC \type {upper(str)} \NC \type{characters.upper} \NC \NR
\NC \type {lower(str)} \NC \type{characters.lower} \NC \NR
\NC \type {number(str)} \NC \type{tonumber} \NC \NR
\NC \type {boolean(str)} \NC \type{toboolean} \NC \NR
\NC \type {idstring(str)} \NC removes leading hash \NC \NR
\NC \type {name(index)} \NC full tag name \NC \NR
\NC \type {tag(index)} \NC tag name \NC \NR
\NC \type {namespace(index)} \NC namespace of tag \NC \NR
\NC \type {text(index)} \NC content \NC \NR
\NC \type {error(str)} \NC quit and show error \NC \NR
\NC \type {quit()} \NC quit \NC \NR
\NC \type {print()} \NC print message \NC \NR
\NC \type {count(pattern)} \NC number of matches \NC \NR
\NC \type {child(pattern)} \NC take child that matches \NC \NR
\stoptabulate
You can also use normal \LUA\ functions as long as you make sure that you pass
the right arguments. There are a few predefined variables available inside such
functions.
\starttabulate[|Tl|l|p|]
\NC \type{list} \NC table \NC the list of matches \NC \NR
\NC \type{l} \NC number \NC the current index in the list of matches \NC \NR
\NC \type{ll} \NC element \NC the current element that matched \NC \NR
\NC \type{order} \NC number \NC the position of the root of the path \NC \NR
\stoptabulate
The given expression between \type {[]} is converted to a \LUA\ expression so you
can use the usual operators:
\starttyping
== ~= <= >= < > not and or ()
\stoptyping
In addition, \type {=} equals \type {==} and \type {!=} is the same as \type
{~=}. If you mess up the expression, you quite likely get a \LUA\ error message.
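A user function could be added along the following lines. This is a sketch under
assumptions: the name \type {startswith} is made up, and the type check is there
because the first child of an element is not necessarily a string.

\starttyping
\startluacode
    -- hypothetical helper: true when str begins with prefix
    function xml.expressions.startswith(str,prefix)
        return type(str) == "string"
           and string.sub(str,1,#prefix) == prefix
    end
\stopluacode

\xmlall{#1}{/a/b[startswith(text(),"oeps")]}
\stoptyping

The function has to be defined before the first expression that refers to it is
used.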
\stopsection
\startsection[title={css selectors}]
\startbuffer[selector-001]
b.one
b.two
b.one.two
b.three
b#first
c
d e
d e
d e e
d f
@foo = bar
@bar = foo
@bar = foo1
@bar = foo2
@bar = foo3
@bar = foo+4
g
g gg d
g gg f
g gg f.one
g
g gg f.two
g gg f.three
g f.one
g f.three
@whatever = four five six
\stopbuffer
\xmlloadbuffer{selector-001}{selector-001}
\startxmlsetups xml:selector:demo
\advance\scratchcounter\plusone
\inleftmargin{\the\scratchcounter}\ignorespaces\xmlverbatim{#1}\par
\stopxmlsetups
\unexpanded\def\showCSSdemo#1#2%
{\blank
\textrule{\tttf#2}
\startlines
\dontcomplain
\tttf \obeyspaces
\scratchcounter\zerocount
\xmlcommand{#1}{#2}{xml:selector:demo}
\stoplines
\blank}
The \CSS\ approach to filtering is a bit different from the path based one and is
supported too. In fact, you can combine both methods. Depending on what you
select, the \CSS\ one can be a little bit faster too. It has the advantage that
one can select more in one go but at the same time looks a bit less attractive.
This method was added just to show that it can be done but might be useful too. A
selector is given between curly braces (after all \CSS\ uses them and they have no
function yet in the parser).
\starttyping
\xmlall{#1}{{foo bar .whatever, bar foo .whatever}}
\stoptyping
The following methods are supported:
\starttabulate[|T||]
\NC element \NC all tags element \NC \NR
\NC element-1 > element-2 \NC all tags element-2 with parent tag element-1 \NC \NR
\NC element-1 + element-2 \NC all tags element-2 directly preceded by tag element-1 \NC \NR
\NC element-1 ~ element-2 \NC all tags element-2 preceded by tag element-1 \NC \NR
\NC element-1 element-2 \NC all tags element-2 inside tag element-1 \NC \NR
\NC [attribute] \NC has attribute \NC \NR
\NC [attribute=value] \NC attribute equals value\NC \NR
\NC [attribute\lettertilde =value] \NC attribute contains value (space is separator) \NC \NR
\NC [attribute\letterhat ="value"] \NC attribute starts with value \NC \NR
\NC [attribute\letterdollar="value"] \NC attribute ends with value \NC \NR
\NC [attribute*="value"] \NC attribute contains value \NC \NR
\NC .class \NC has class \NC \NR
\NC \letterhash id \NC has id \NC \NR
\NC :nth-child(n) \NC the child at index n \NC \NR
\NC :nth-last-child(n) \NC the child at index n from the end \NC \NR
\NC :first-child \NC the first child \NC \NR
\NC :last-child \NC the last child \NC \NR
\NC :nth-of-type(n) \NC the match at index n \NC \NR
\NC :nth-last-of-type(n) \NC the match at index n from the end \NC \NR
\NC :first-of-type \NC the first match \NC \NR
\NC :last-of-type \NC the last match \NC \NR
\NC :only-of-type \NC the only match or nothing \NC \NR
\NC :only-child \NC the only child or nothing \NC \NR
\NC :empty \NC only when empty \NC \NR
\NC :root \NC the whole tree \NC \NR
\stoptabulate
The next pages show some examples. For that we use the demo file:
\typebuffer[selector-001]
The class and id selectors often only make sense in \HTML\ like documents but they
are supported nevertheless. They are after all just shortcuts for filtering by
attribute. The class filtering is special in the sense that it checks for a class
in a list of classes given in an attribute.
\showCSSdemo{selector-001}{{.one}}
\showCSSdemo{selector-001}{{.one, .two}}
\showCSSdemo{selector-001}{{.one, .two, \letterhash first}}
Attributes can be filtered by presence, value, partial value and such. Quotes are
optional but we advise using them.
\showCSSdemo{selector-001}{{[foo], [bar=foo]}}
\showCSSdemo{selector-001}{{[bar\lettertilde=foo]}}
\showCSSdemo{selector-001}{{[bar\letterhat="foo"]}}
\showCSSdemo{selector-001}{{[whatever\lettertilde="five"]}}
You can of course combine the methods as in:
\showCSSdemo{selector-001}{{g f .one, g f .three}}
\showCSSdemo{selector-001}{{g > f .one, g > f .three}}
\showCSSdemo{selector-001}{{d + e}}
\showCSSdemo{selector-001}{{d ~ e}}
\showCSSdemo{selector-001}{{d ~ e, g f .one, g f .three}}
You can also negate the result by using \type {:not} on a simple expression:
\showCSSdemo{selector-001}{{:not([whatever\lettertilde="five"])}}
\showCSSdemo{selector-001}{{:not(d)}}
The child and match selectors are also supported:
\showCSSdemo{selector-001}{{a:nth-child(3)}}
\showCSSdemo{selector-001}{{a:nth-last-child(3)}}
\showCSSdemo{selector-001}{{g:nth-of-type(3)}}
\showCSSdemo{selector-001}{{g:nth-last-of-type(3)}}
\showCSSdemo{selector-001}{{a:first-child}}
\showCSSdemo{selector-001}{{a:last-child}}
\showCSSdemo{selector-001}{{e:first-of-type}}
\showCSSdemo{selector-001}{{gg d:only-of-type}}
Instead of numbers you can also give the \type {an} and \type {an+b} formulas
as well as the \type {odd} and \type {even} keywords:
\showCSSdemo{selector-001}{{a:nth-child(even)}}
\showCSSdemo{selector-001}{{a:nth-child(odd)}}
\showCSSdemo{selector-001}{{a:nth-child(3n+1)}}
\showCSSdemo{selector-001}{{a:nth-child(2n+3)}}
There are a few special cases:
\showCSSdemo{selector-001}{{g:empty}}
\showCSSdemo{selector-001}{{g:root}}
\showCSSdemo{selector-001}{{*}}
Combining the \CSS\ methods with the regular ones is possible:
\showCSSdemo{selector-001}{{g gg f .one}}
\showCSSdemo{selector-001}{g/gg/f[@class='one']}
\showCSSdemo{selector-001}{g/{gg f .one}}
\startbuffer[selector-002]
title 1
title 2
title 3
title 4
\stopbuffer
For the next examples we use this file:
\typebuffer[selector-002]
\xmlloadbuffer{selector-002}{selector-002}
When we filter from this (not too well structured) tree we can use both
methods to achieve the same:
\showCSSdemo{selector-002}{{document title .one, document title .three}}
\showCSSdemo{selector-002}{/document/title[(@class='one') or (@class='three')]}
However, imagine this file:
\startbuffer[selector-003]
title 1
title 1.1
title 2
title 2.1
title 3
title 3.1
title 4
title 4.1
\stopbuffer
\typebuffer[selector-003]
\xmlloadbuffer{selector-003}{selector-003}
The next filter is easier with the \CSS\ selector methods because these accumulate
independent (simple) expressions:
\showCSSdemo{selector-003}{{document title .one + subtitle, document title .two + subtitle}}
Watch how we get an output in the document order. Because we render a sequential document
a combined filter will trigger a sorting pass.
\stopsection
\startsection[title={functions as filters}]
At the \LUA\ end a whole \cmdinternal {cd:lpath} expression results in a (set of) node(s)
with its environment, but that is hardly usable in \TEX. Think of code like:
\starttyping
for e in xml.collected(xml.load('text.xml'),"title") do
-- e = the element that matched
end
\stoptyping
The older variant is still supported but the previous variant is preferred.
\starttyping
for r, d, k in xml.elements(xml.load('text.xml'),"title") do
-- r = root of the title element
-- d = data table
-- k = index in data table
end
\stoptyping
Here \type {d[k]} points to the \type {title} element and in this case all titles
in the tree pass by. In practice this kind of code is encapsulated in function
calls, like those returning elements one by one, or returning the first or last
match. The result is then fed back into \TEX, possibly after being altered by an
associated setup. We've seen the wrappers to such functions already in a previous
chapter.
In addition to the previously discussed expressions, one can add so called
filters to the expression, for instance:
\starttyping
a/(b|c)/!d/e/text()
\stoptyping
In a filter, the last part of the \cmdinternal {cd:lpath} expression is a
function call. The previous example returns the text of each element \type {e}
that results from matching the expression. When running \TEX\ the following
functions are available. Some are also available when using pure \LUA. In \TEX\
you can often use one of the macros like \type {\xmlfirst} instead of a \type
{\xmlfilter} with finalizer \type {first()}. The filter can be somewhat faster
but that is hardly noticeable.
\starttabulate[|l|l|p|]
\NC \type {context()} \NC string \NC the serialized text with \TEX\ catcode regime \NC \NR
%NC \type {ctxtext()} \NC string \NC \NC \NR
\NC \type {function()} \NC string \NC depends on the function \NC \NR
%
\NC \type {name()} \NC string \NC the (remapped) namespace \NC \NR
\NC \type {tag()} \NC string \NC the name of the element \NC \NR
\NC \type {tags()} \NC list \NC the names of the element \NC \NR
%
\NC \type {text()} \NC string \NC the serialized text \NC \NR
\NC \type {upper()} \NC string \NC the serialized text uppercased \NC \NR
\NC \type {lower()} \NC string \NC the serialized text lowercased \NC \NR
\NC \type {stripped()} \NC string \NC the serialized text stripped \NC \NR
\NC \type {lettered()} \NC string \NC the serialized text only letters (cf. \UNICODE) \NC \NR
%
\NC \type {count()} \NC number \NC the number of matches \NC \NR
\NC \type {index()} \NC number \NC the matched index in the current path \NC \NR
\NC \type {match()} \NC number \NC the matched index in the preceding path \NC \NR
%
%NC \type {lowerall()} \NC string \NC \NC \NR
%NC \type {upperall()} \NC string \NC \NC \NR
%
\NC \type {attribute(name)} \NC content \NC returns the attribute with the given name \NC \NR
\NC \type {chainattribute(name)} \NC content \NC idem, but backtracks till one is found \NC \NR
\NC \type {command(name)} \NC content \NC expands the setup with the given name for each found element \NC \NR
\NC \type {position(n)} \NC content \NC processes the \type {n}\high{th} instance of the found element \NC \NR
\NC \type {all()} \NC content \NC processes all instances of the found element \NC \NR
%NC \type {default} \NC content \NC all \NC \NR
\NC \type {reverse()} \NC content \NC idem in reverse order \NC \NR
\NC \type {first()} \NC content \NC processes the first instance of the found element \NC \NR
\NC \type {last()} \NC content \NC processes the last instance of the found element \NC \NR
\NC \type {concat(...)} \NC content \NC concatenates the matches \NC \NR
\NC \type {concatrange(from,to,...)} \NC content \NC concatenates a range of matches \NC \NR
\stoptabulate
The extra arguments of the concatenators are: \type {separator} (string), \type
{lastseparator} (string) and \type {textonly} (a boolean).
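
As a small illustration of how the filters from the table are used, the following
lines pass a few of them to \type {\xmlfilter}. The element names \type {a} and
\type {b} and the attribute \type {id} are placeholders, so take this as a sketch
only:

\starttyping
\xmlfilter{#1}{/a/b/count()}
\xmlfilter{#1}{/a/b/first()}
\xmlfilter{#1}{/a/b/position(2)}
\xmlfilter{#1}{/a/b/attribute(id)}
\stoptyping

The first line typesets the number of matches, the second and third process one
specific match, and the last one fetches an \type {id} attribute.
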
These filters are in fact \LUA\ functions which means that if needed more of them
can be added. Indeed this happens in some of the \XML\ related \MKIV\ modules,
for instance in the \MATHML\ processor.
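
A minimal sketch of such an addition could look as follows. The name \type
{Length} is made up for this example, and we assume that \type {xml.text} returns
the text of a match as a string. The function typesets the size (in bytes) of
each match.

\starttyping
\startluacode
-- hypothetical finalizer, only meant as an illustration
function xml.finalizers.tex.Length(collected)
    if collected then
        for i=1,#collected do
            -- assumption: xml.text gives us the text as a string
            context(#xml.text(collected[i]))
        end
    end
end
\stopluacode

\xmlfilter{#1}{/a/b/Length()}
\stoptyping
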
\stopsection
\startsection[title={example}]
The number of commands is rather large and if you want to avoid them this is
often possible. Take for instance:
\starttyping
\xmlall{#1}{/a/b[position()>3]}
\stoptyping
Alternatively you can use:
\starttyping
\xmlfilter{#1}{/a/b[position()>3]/all()}
\stoptyping
and actually this is also faster as internally it avoids a function call. Of
course in practice this is hardly measurable.
In previous examples we've already seen quite some expressions, and it might be
good to point out that the syntax is modelled after \XSLT\ but is not quite the
same. The reason is that we started with a rather minimal system and already have
styles in use that depend on compatibility.
\starttyping
namespace:// axis node(set) [expr 1]..[expr n] / ... / filter
\stoptyping
When we are inside a \CONTEXT\ run, the namespace is \type {tex}. However, if
you do not want to print back to \TEX\ you need to be more explicit. Say that we
typeset exams and have a (not that logical) structure like:
\starttyping
...
- one
- two
- three
true
1
false
0
true
2
\stoptyping
Say that we typeset the questions with:
\starttyping
\startxmlsetups question
\blank
score: \xmlfunction{#1}{totalscore}
\blank
\xmlfirst{#1}{text}
\startitemize
\xmlfilter{#1}{/answer/item/command(answer:item)}
\stopitemize
\endgraf
\blank
\stopxmlsetups
\stoptyping
Each item in the answer results in a call to:
\starttyping
\startxmlsetups answer:item
\startitem
\xmlflush{#1}
\endgraf
\xmlfilter{#1}{../../alternative[position()=rootposition()]/
condition/command(answer:condition)}
\stopitem
\stopxmlsetups
\stoptyping
\starttyping
\startxmlsetups answer:condition
\endgraf
condition: \xmlflush{#1}
\endgraf
\stopxmlsetups
\stoptyping
Now, there are two rather special filters here. The first one involves
calculating the total score. As we look forward we use a function to deal with
this.
\starttyping
\startluacode
function xml.functions.totalscore(root)
local score = 0
for e in xml.collected(root,"/alternative") do
score = score + (xml.filter(e,"xml:///score/number()") or 0)
end
tex.write(score)
end
\stopluacode
\stoptyping
Watch how we use the namespace to keep the results at the \LUA\ end.
The second special trick shown here is to limit a match using the current
position of the root (\type {#}) match.
As you can see, a path expression can be more than just filtering a few nodes. At
the end of this manual you will find a bunch of examples.
\stopsection
\startsection[title={tables}]
If you want to know how the internal \XML\ tables look you can print such a
table:
\starttyping
print(table.serialize(e))
\stoptyping
This produces for instance:
% s = xml.convert("some text")
% print(table.serialize(xml.filter(s,"demo")[1]))
\starttyping
t={
["at"]={
["label"]="whatever",
},
["dt"]={ "some text" },
["ns"]="",
["rn"]="",
["tg"]="demo",
}
\stoptyping
The \type {rn} entry is the renamed namespace (when renaming is applied). If you
see tags like \type {@pi@} this means that we don't have an element, but (in this
case) a processing instruction.
\starttabulate[|l|p|]
\NC \type {@rt@} \NC the root element \NC \NR
\NC \type {@dd@} \NC document definition \NC \NR
\NC \type {@cm@} \NC comment, like \type {} \NC \NR
\NC \type {@cd@} \NC so called \type {CDATA} \NC \NR
\NC \type {@pi@} \NC processing instruction, like \type {} \NC \NR
\stoptabulate
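
To get a feeling for this structure, the following sketch walks such a table and
reports tags, attributes and text on the console. The helper \type {showelement}
is not part of the interface; it is made up for this example and only relies on
the \type {tg}, \type {at} and \type {dt} fields shown above.

\starttyping
\startluacode
-- hypothetical helper, not part of the lxml interface
local function showelement(e,depth)
    depth = depth or 0
    local indent = string.rep("  ",depth)
    if type(e) == "string" then
        -- text nodes are plain strings in the dt table
        print(indent .. "text: " .. e)
    elseif type(e) == "table" then
        print(indent .. "tag: " .. tostring(e.tg))
        for name, value in pairs(e.at or { }) do
            print(indent .. "  @" .. name .. " = " .. tostring(value))
        end
        for _, child in ipairs(e.dt or { }) do
            showelement(child,depth+1)
        end
    end
end

-- the table from the example above
showelement {
    at = { label = "whatever" },
    dt = { "some text" },
    ns = "",
    rn = "",
    tg = "demo",
}
\stopluacode
\stoptyping
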
There are many ways to deal with the content, but in the perspective of \TEX\
only a few matter.
\starttabulate[|l|p|]
\NC \type {xml.sprint(e)} \NC print the content to \TEX\ and apply setups if needed \NC \NR
\NC \type {xml.tprint(e)} \NC print the content to \TEX\ (serialize elements verbose) \NC \NR
\NC \type {xml.cprint(e)} \NC print the content to \TEX\ (used for special content) \NC \NR
\stoptabulate
Keep in mind that anything low level that you uncover is not part of the official
interface unless mentioned in this manual.
\stopsection
\stopchapter
\startchapter[title={Tips and tricks}]
\startsection[title={tracing}]
It can be hard to debug code as much happens kind of behind the scenes.
Therefore we have a couple of tracing options. Of course you can typeset some
status information, using for instance:
\startxmlcmd {\cmdbasicsetup{xmlshow}}
typeset the tree given by \cmdinternal {cd:node}
\stopxmlcmd
\startxmlcmd {\cmdbasicsetup{xmlinfo}}
typeset the name of the element given by \cmdinternal {cd:node}
\stopxmlcmd
\startxmlcmd {\cmdbasicsetup{xmlpath}}
returns the complete path (including namespace prefix and index) of the
given \cmdinternal {cd:node}
\stopxmlcmd
\startbuffer[demo]
\stopbuffer
Say that we have the following \XML:
\typebuffer[demo]
and the next definitions:
\startbuffer
\startxmlsetups xml:demo:base
\xmlsetsetup{#1}{p|b}{xml:demo:*}
\stopxmlsetups
\startxmlsetups xml:demo:p
\xmlflush{#1}
\par
\stopxmlsetups
\startxmlsetups xml:demo:b
\par
\xmlpath{#1} : \xmlflush{#1}
\par
\stopxmlsetups
\xmlregisterdocumentsetup{example-10}{xml:demo:base}
\xmlprocessbuffer{example-10}{demo}{}
\stopbuffer
\typebuffer
This will give us:
\blank \startpacked \getbuffer \stoppacked \blank
If you use \type {\xmlshow} you will get a complete subtree which can
be handy for tracing but can also lead to large documents.
We also have a bunch of trackers that can be enabled, like:
\starttyping
\enabletrackers[xml.show,xml.parse]
\stoptyping
The full list (currently) is:
\starttabulate[|lT|p|]
\NC xml.entities \NC show what entities are seen and replaced \NC \NR
\NC xml.path \NC show the result of parsing an lpath expression \NC \NR
\NC xml.parse \NC show stepwise resolving of expressions \NC \NR
\NC xml.profile \NC report all parsed lpath expressions (in the log) \NC \NR
\NC xml.remap \NC show what namespaces are remapped \NC \NR
\NC lxml.access \NC report errors with respect to resolving (symbolic) nodes \NC \NR
\NC lxml.comments \NC show the comments that are encountered (if at all) \NC \NR
\NC lxml.loading \NC show what files are loaded and converted \NC \NR
\NC lxml.setups \NC show what setups are being associated to elements \NC \NR
\stoptabulate
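
As a sketch of how this can be used: trackers can be enabled for only a part of
the run, or for a whole run from the command line. The file names \type
{test.xml} and \type {test.tex} are just placeholders here.

\starttyping
\enabletrackers [lxml.setups,lxml.loading]
\xmlprocessfile{main}{test.xml}{}
\disabletrackers[lxml.setups,lxml.loading]
\stoptyping

\starttyping
context --trackers=lxml.setups,lxml.loading test.tex
\stoptyping
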
In one of our workflows we produce books from \XML\ where the (educational)
content is organized in many small files. Each book has about 5~chapters and each
chapter is made of sections that contain text, exercises, resources, etc.\ and so
the document is assembled from thousands of files (don't worry, runtime inclusion
is pretty fast). In order to see where in the sources content resides we can
trace the filename.
\startxmlcmd {\cmdbasicsetup{xmlinclusion}}
returns the file where the node comes from
\stopxmlcmd
\startxmlcmd {\cmdbasicsetup{xmlinclusions}}
returns the list of files where the node comes from
\stopxmlcmd
\startxmlcmd {\cmdbasicsetup{xmlbadinclusions}}
returns a list of files that were not included due to some problem
\stopxmlcmd
Of course you have to make sure that these names end up somewhere visible, for
instance in the margin.
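
A possible way to do that is sketched below. The element name \type {section}
and the setup name are assumptions made for this example; any setup that gets
the node as \type {#1} can do the same.

\starttyping
\startxmlsetups xml:section
    \inmargin{\tx\xmlinclusion{#1}}
    \xmlflush{#1}
\stopxmlsetups
\stoptyping
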
\stopsection
\startsection[title={expansion}]
For novice users the concept of expansion might sound frightening and to some
extent it is. However, it is important enough to spend some words on it here.
It is good to realize that most setups are sort of immediate. When one setup is
issued, it can call another one and so on. Normally you won't notice that but
there are cases where that can be a problem. In \TEX\ you can define a macro,
take for instance:
\starttyping
\startxmlsetups xml:foo
\def\foobar{\xmlfirst{#1}{/bar}}
\stopxmlsetups
\stoptyping
you store the reference to the node \type {bar} in \type {\foobar}, maybe for later
use. In this case the content is not yet fetched; that will be done when \type
{\foobar} is called.
\starttyping
\startxmlsetups xml:foo
\edef\foobar{\xmlfirst{#1}{/bar}}
\stopxmlsetups
\stoptyping
Here the content of \type {bar} becomes the body of the macro. But what if
\type {bar} itself contains elements that also contain elements? When there
is a setup for \type {bar} it will be triggered and so on.
When that setup looks like:
\starttyping
\startxmlsetups xml:bar
\def\barfoo{\xmlflush{#1}}
\stopxmlsetups
\stoptyping
Here we get something like:
\starttyping
\foobar => {\def\barfoo{...}}
\stoptyping
When \type {\barfoo} is not defined we get an error and when it is known and expands
to something weird we might also get an error.
Especially when you don't know what content can show up, this can result in errors
when an expansion fails, for example because some macro being used is not defined.
To prevent this we can define a macro:
\starttyping
\starttexdefinition unexpanded xml:bar:macro #1
\def\barfoo{\xmlflush{#1}}
\stoptexdefinition
\startxmlsetups xml:bar
\texdefinition{xml:bar:macro}{#1}
\stopxmlsetups
\stoptyping
The setup \type {xml:bar} will still expand but the replacement text now is just the
call to the macro, think of:
\starttyping
\foobar => {\texdefinition{xml:bar:macro}{#1}}
\stoptyping
But this is often not needed: most \CONTEXT\ commands can handle the expansions
quite well, but it's good to know that there is a way out. So, now to some
examples. Imagine that we have an \XML\ file that looks as follows:
\starttyping
Some short title
zeta
zeta
zeta again
alpha
alpha
alpha again
gamma
gamma
gamma
beta
beta
beta
delta
delta
delta
done!
\stoptyping
There are a few structure related elements here: a chapter (with its list entry)
and some index entries. Both are multipass related and therefore travel around.
This means that when we let data end up in the auxiliary file, we need to make
sure that we end up with either expanded data (i.e.\ no references to the \XML\
tree) or with robust forward and backward references to elements in the tree.
Here we discuss three approaches (and more may show up later): pushing \XML\ into
the auxiliary file and using references to elements, with or without an
associated setup. We control the variants with a switch.
\starttyping
\newcount\TestMode
\TestMode=0 % expansion=xml
\TestMode=1 % expansion=yes, index, setup
\TestMode=2 % expansion=yes
\stoptyping
We apply a couple of setups:
\starttyping
\startxmlsetups xml:mysetups
\xmlsetsetup{\xmldocument}{demo|index|content|chapter|title|em}{xml:*}
\stopxmlsetups
\xmlregistersetup{xml:mysetups}
\stoptyping
The main document is processed with:
\starttyping
\startxmlsetups xml:demo
\xmlflush{#1}
\subject{contents}
\placelist[chapter][criterium=all]
\subject{index}
\placeregister[index][criterium=all]
\page % else buffer is forgotten when placing header
\stopxmlsetups
\stoptyping
First we show three alternative ways to deal with the chapter. The first case
expands the \XML\ reference so that we have an \XML\ stream in the auxiliary
file. This stream is processed as a small independent subfile when needed. The
second case registers a reference to the current element (\type {#1}). This means
that we have access to all data of this element, like attributes, title and
content. What happens depends on the given setup. The third variant does the same
but here the setup is part of the reference.
\starttyping
\startxmlsetups xml:chapter
\ifcase \TestMode
% xml code travels around
\setuphead[chapter][expansion=xml]
\startchapter[title=eh: \xmltext{#1}{title}]
\xmlfirst{#1}{content}
\stopchapter
\or
% index is used for access via setup
\setuphead[chapter][expansion=yes,xmlsetup=xml:title:flush]
\startchapter[title=\xmlgetindex{#1}]
\xmlfirst{#1}{content}
\stopchapter
\or
% tex call to xml using index is used
\setuphead[chapter][expansion=yes]
\startchapter[title=hm: \xmlreference{#1}{xml:title:flush}]
\xmlfirst{#1}{content}
\stopchapter
\fi
\stopxmlsetups
\startxmlsetups xml:title:flush
\xmltext{#1}{title}
\stopxmlsetups
\stoptyping
We need to deal with emphasis and the content of the chapter.
\starttyping
\startxmlsetups xml:em
\begingroup\em\xmlflush{#1}\endgroup
\stopxmlsetups
\startxmlsetups xml:content
\xmlflush{#1}
\stopxmlsetups
\stoptyping
A similar approach is followed with the index entries. Watch how we use the
numbered entries variant (in this case we could also have used just \type
{entries} and \type {keys}).
\starttyping
\startxmlsetups xml:index
\ifcase \TestMode
\setupregister[index][expansion=xml,xmlsetup=]
\setstructurepageregister
[index]
[entries:1=\xmlfirst{#1}{content},
keys:1=\xmltext{#1}{key}]
\or
\setupregister[index][expansion=yes,xmlsetup=xml:index:flush]
\setstructurepageregister
[index]
[entries:1=\xmlgetindex{#1},
keys:1=\xmltext{#1}{key}]
\or
\setupregister[index][expansion=yes,xmlsetup=]
\setstructurepageregister
[index]
[entries:1=\xmlreference{#1}{xml:index:flush},
keys:1=\xmltext{#1}{key}]
\fi
\stopxmlsetups
\startxmlsetups xml:index:flush
\xmlfirst{#1}{content}
\stopxmlsetups
\stoptyping
Instead of this flush, you can use the predefined setup \type {xml:flush}
unless it is overloaded by you.
The file is processed by:
\starttyping
\starttext
\xmlprocessfile{main}{test.xml}{}
\stoptext
\stoptyping
We don't show the result here. If you're curious what the output is, you can test
it yourself. In that case it also makes sense to peek into the \type {test.tuc}
file to see how the information travels around. The \type {metadata} fields carry
information about how to process the data.
The first case, the \XML\ expansion one, is somewhat special in the sense that
internally we use small pseudo files. You can control the rendering by tweaking
the following setups:
\starttyping
\startxmlsetups xml:ctx:sectionentry
\xmlflush{#1}
\stopxmlsetups
\startxmlsetups xml:ctx:registerentry
\xmlflush{#1}
\stopxmlsetups
\stoptyping
{\em When these methods work out okay the other structural elements will be
dealt with in a similar way.}
\stopsection
\startsection[title={special cases}]
Normally the content will be flushed under a special (so called) catcode regime.
This means that characters that have a special meaning in \TEX\ will have no such
meaning in an \XML\ file. If you want content to be treated as \TEX\ code, you can
use one of the following:
\startxmlcmd {\cmdbasicsetup{xmlflushcontext}}
flush the given \cmdinternal {cd:node} using the \TEX\ character
interpretation scheme
\stopxmlcmd
\startxmlcmd {\cmdbasicsetup{xmlcontext}}
flush the match of \cmdinternal {cd:lpath} for the given \cmdinternal
{cd:node} using the \TEX\ character interpretation scheme
\stopxmlcmd
We use this in cases like:
\starttyping
....
\xmlsetsetup {#1} {
tm|texformula|
} {xml:*}
....
\startxmlsetups xml:tm
\mathematics{\xmlflushcontext{#1}}
\stopxmlsetups
\startxmlsetups xml:texformula
\placeformula\startformula\xmlflushcontext{#1}\stopformula
\stopxmlsetups
\stoptyping
\stopsection
\startsection[title={collecting}]
Say that your document has
\starttyping
\stoptyping
And that you need to convert that to \TEX\ speak like:
\starttyping
\bTABLE
\bTR
\bTD foo \eTD
\bTD bar \eTD
\eTR
\eTABLE
\stoptyping
A simple mapping is:
\starttyping
\startxmlsetups xml:table
\bTABLE \xmlflush{#1} \eTABLE
\stopxmlsetups
\startxmlsetups xml:tr
\bTR \xmlflush{#1} \eTR
\stopxmlsetups
\startxmlsetups xml:td
\bTD \xmlflush{#1} \eTD
\stopxmlsetups
\stoptyping
The \type {\bTD} command is a so called delimited command which means that it
picks up its argument by looking for an \type {\eTD}. For the simple case here
this works quite well because the flush is inside the pair. This is not the case
in the following variant:
\starttyping
\startxmlsetups xml:td:start
\bTD
\stopxmlsetups
\startxmlsetups xml:td:stop
\eTD
\stopxmlsetups
\startxmlsetups xml:td
\xmlsetup{#1}{xml:td:start}
\xmlflush{#1}
\xmlsetup{#1}{xml:td:stop}
\stopxmlsetups
\stoptyping
When for some reason \TEX\ gets confused you can revert to a mechanism that
collects content.
\starttyping
\startxmlsetups xml:td:start
\startcollect
\bTD
\stopcollect
\stopxmlsetups
\startxmlsetups xml:td:stop
\startcollect
\eTD
\stopcollect
\stopxmlsetups
\startxmlsetups xml:td
\startcollecting
\xmlsetup{#1}{xml:td:start}
\xmlflush{#1}
\xmlsetup{#1}{xml:td:stop}
\stopcollecting
\stopxmlsetups
\stoptyping
You can even implement solutions that effectively do this:
\starttyping
\startcollecting
\startcollect \bTABLE \stopcollect
\startcollect \bTR \stopcollect
\startcollect \bTD \stopcollect
\startcollect foo\stopcollect
\startcollect \eTD \stopcollect
\startcollect \bTD \stopcollect
\startcollect bar\stopcollect
\startcollect \eTD \stopcollect
\startcollect \eTR \stopcollect
\startcollect \eTABLE \stopcollect
\stopcollecting
\stoptyping
Of course you only need to go that complex when the situation demands it. Here is
another weird one:
\starttyping
\startcollecting
\startcollect \setupsomething[\stopcollect
\startcollect foo=\stopcollect
\startcollect FOO,\stopcollect
\startcollect bar=\stopcollect
\startcollect BAR,\stopcollect
\startcollect ]\stopcollect
\stopcollecting
\stoptyping
\stopsection
\startsection[title={selectors and injectors}]
This section describes a somewhat special feature, one that we needed for a project
where we could not touch the original content but could add specific sections for
our own purpose. Hopefully the example demonstrates its usability.
\enabletrackers[lxml.selectors]
\startbuffer[foo]
t1 t2 t3
t1 t2 t3
t4
t8.0
t8.0
t3
t3
t8.1
t8.1
t8.2
t8.2
t4
t4
foo
bar
bar
\stopbuffer
\typebuffer[foo]
First we show how to plug in a directive. Processing instructions like the
following are normally ignored by an \XML\ processor, unless they make sense
to it.
\starttyping
\stoptyping
We can define a message handler as follows:
\startbuffer
\def\MyMessage#1#2#3{\writestatus{#1}{#2 #3}}
\xmlinstalldirective{message}{MyMessage}
\stopbuffer
\typebuffer \getbuffer
When this file is processed you will see this on the console:
\starttyping
info > 1: this is a demo file
info > 2: this is a demo file
\stoptyping
The file has some sections that can be used or ignored. The recipe for
obeying \type {t1} and \type {t4} is the following:
\startbuffer
\xmlsetinjectors[t1]
\xmlsetinjectors[t4]
\startxmlsetups xml:initialize
\xmlapplyselectors{#1}
\xmlsetsetup {#1} {
one|two|three|four
} {xml:*}
\stopxmlsetups
\xmlregistersetup{xml:initialize}
\startxmlsetups xml:one
[ONE \xmlflush{#1} ONE]
\stopxmlsetups
\startxmlsetups xml:two
[TWO \xmlflush{#1} TWO]
\stopxmlsetups
\startxmlsetups xml:three
[THREE \xmlflush{#1} THREE]
\stopxmlsetups
\startxmlsetups xml:four
[FOUR \xmlflush{#1} FOUR]
\stopxmlsetups
\stopbuffer
\typebuffer \getbuffer
This typesets:
\startnarrower
\xmlprocessbuffer{main}{foo}{}
\stopnarrower
The include coding is kind of special: it permits adding content (in a comment)
and ignoring the rest so that we indeed can add something without interfering
with the original. Of course in a normal workflow such messy solutions are
not needed, but alas, often workflows are not that clean, especially when one
has no real control over the source.
\startxmlcmd {\cmdbasicsetup{xmlsetinjectors}}
enables a list of injectors that will be used
\stopxmlcmd
\startxmlcmd {\cmdbasicsetup{xmlresetinjectors}}
resets the list of injectors
\stopxmlcmd
\startxmlcmd {\cmdbasicsetup{xmlinjector}}
expands an injection (command); normally this one is only used
(in some setup) or for testing
\stopxmlcmd
\startxmlcmd {\cmdbasicsetup{xmlapplyselectors}}
analyze the tree \cmdinternal {cd:node} for marked sections that
will be injected
\stopxmlcmd
We have some injections predefined:
\starttyping
\startsetups xml:directive:injector:page
\page
\stopsetups
\startsetups xml:directive:injector:column
\column
\stopsetups
\startsetups xml:directive:injector:blank
\blank
\stopsetups
\stoptyping
In the example we see:
\starttyping
\stoptyping
When we set \type {\xmlsetinjectors[t7]} a pagebreak will be injected in that spot.
Tags like \type {t7}, \type {t8} etc.\ can represent versions.
\stopsection
\startsection[title=preprocessing]
% local match = lpeg.match
% local replacer = lpeg.replacer("BAD TITLE:","BAD TITLE:")
%
% function lxml.preprocessor(data,settings)
% return match(replacer,data)
% end
\startbuffer[pre-code]
\startluacode
function lxml.preprocessor(data,settings)
return string.find(data,"BAD TITLE:")
and string.gsub(data,"BAD TITLE:","BAD TITLE:")
or data
end
\stopluacode
\stopbuffer
\startbuffer[pre-xml]
\startxmlsetups pre:demo:initialize
\xmlsetsetup{#1}{*}{pre:demo:*}
\stopxmlsetups
\xmlregisterdocumentsetup{pre:demo}{pre:demo:initialize}
\startxmlsetups pre:demo:root
\xmlflush{#1}
\stopxmlsetups
\startxmlsetups pre:demo:bold
\begingroup\bf\xmlflush{#1}\endgroup
\stopxmlsetups
\starttext
\xmlprocessbuffer{pre:demo}{demo}{}
\stoptext
\stopbuffer
Say that you have the following \XML\ setup:
\typebuffer[pre-xml]
and that (such things happen) the input looks like this:
\startbuffer[demo]
BAD TITLE: crap crap crap ...
BAD TITLE: crap crap crap ...
\stopbuffer
\typebuffer[demo]
You can then clean up these \type {BAD TITLE}'s as follows:
\typebuffer[pre-code]
and get as result:
\start \getbuffer[pre-code,pre-xml] \stop
The preprocessor function gets as second argument the current settings, and
the field \type {currentresource} can be used to limit the actions to
specific resources, in our case it's \type {buffer: demo}. Afterwards you can
reset the preprocessor with:
\startluacode
lxml.preprocessor = nil
\stopluacode
Future versions might give some more control over preprocessors. For now consider
it to be a quick hack.
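
A sketch of a preprocessor that only touches one resource is shown below. It
assumes that \type {currentresource} has exactly the value mentioned above, and
the replacement string \type {TITLE:} is made up for this example.

\starttyping
\startluacode
function lxml.preprocessor(data,settings)
    -- only act on the demo buffer, pass everything else untouched
    if settings.currentresource == "buffer: demo" then
        -- the parentheses drop the second return value of gsub
        return (string.gsub(data,"BAD TITLE:","TITLE:"))
    end
    return data
end
\stopluacode
\stoptyping
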
\stopsection
\stopchapter
\startchapter[title={Lookups using lpaths}]
\startsection[title={introduction}]
There is not that much system in the following examples. They resulted from tests
with different documents. The current implementation evolved out of the
experimental code. For instance, I decided to add the handling of multiple
expressions in a row after a few email exchanges with Jean|-|Michel Huffen.
One of the main differences between the way \XSLT\ resolves a path and our way is
the anchor. Take:
\starttyping
/something
something
\stoptyping
The first one anchors in the current (!) element so it will only consider direct
children. The second one does a deep lookup and looks at the descendants as well.
Furthermore we have a few extra shortcuts like \type {**} in \type {a/**/b} which
represents all descendants.
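
The difference can be summarized in a few calls. The element names are
placeholders and this is only a sketch of the cases described above:

\starttyping
\xmlall{#1}{/something}
\xmlall{#1}{something}
\xmlall{#1}{a/**/b}
\stoptyping

The first call only considers direct children of the current element, the second
one also looks at its descendants, and the third one matches each \type {b} that
sits somewhere below a matching \type {a}.
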
The expressions (between square brackets) have to be valid \LUA\ and some
preprocessing is done to resolve the built in functions. So, you might use code
like:
\starttyping
my_lpeg_expression:match(text()) == "whatever"
\stoptyping
given that \type {my_lpeg_expression} is known. In the examples below we use the
visualizer to show the steps. Some are shown more than once as part of a set.
\stopsection
\startsection[title={special cases}]
\xmllshow{}
\xmllshow{*}
\xmllshow{.}
\xmllshow{/}
\stopsection
\startsection[title={wildcards}]
\xmllshow{*}
\xmllshow{*:*}
\xmllshow{/*}
\xmllshow{/*:*}
\xmllshow{*/*}
\xmllshow{*:*/*:*}
\xmllshow{a/*}
\xmllshow{a/*:*}
\xmllshow{/a/*}
\xmllshow{/a/*:*}
\xmllshow{/*}
\xmllshow{/**}
\xmllshow{/***}
\stopsection
\startsection[title={multiple steps}]
\xmllshow{answer}
\xmllshow{answer/test/*}
\xmllshow{answer/test/child::}
\xmllshow{answer/*}
\xmllshow{answer/*[tag()='p' and position()=1 and text()!='']}
\stopsection
\startsection[title={pitfalls}]
\xmllshow{[oneof(lower(@encoding),'tex','context','ctx')]}
\xmllshow{.[oneof(lower(@encoding),'tex','context','ctx')]}
\stopsection
\startsection[title={more special cases}]
\xmllshow{**}
\xmllshow{*}
\xmllshow{..}
\xmllshow{.}
\xmllshow{//}
\xmllshow{/}
\xmllshow{**/}
\xmllshow{**/*}
\xmllshow{**/.}
\xmllshow{**//}
\xmllshow{*/}
\xmllshow{*/*}
\xmllshow{*/.}
\xmllshow{*//}
\xmllshow{/**/}
\xmllshow{/**/*}
\xmllshow{/**/.}
\xmllshow{/**//}
\xmllshow{/*/}
\xmllshow{/*/*}
\xmllshow{/*/.}
\xmllshow{/*//}
\xmllshow{./}
\xmllshow{./*}
\xmllshow{./.}
\xmllshow{.//}
\xmllshow{../}
\xmllshow{../*}
\xmllshow{../.}
\xmllshow{..//}
\stopsection
\startsection[title={more wildcards}]
\xmllshow{one//two}
\xmllshow{one/*/two}
\xmllshow{one/**/two}
\xmllshow{one/***/two}
\xmllshow{one/x//two}
\xmllshow{one//x/two}
\xmllshow{//x/two}
\stopsection
\startsection[title={special axis}]
\xmllshow{descendant::whocares/ancestor::whoknows}
\xmllshow{descendant::whocares/ancestor::whoknows/parent::}
\xmllshow{descendant::whocares/ancestor::}
\xmllshow{child::something/child::whatever/child::whocares}
\xmllshow{child::something/child::whatever/child::whocares|whoknows}
\xmllshow{child::something/child::whatever/child::(whocares|whoknows)}
\xmllshow{child::something/child::whatever/child::!(whocares|whoknows)}
\xmllshow{child::something/child::whatever/child::(whocares)}
\xmllshow{child::something/child::whatever/child::(whocares)[position()>2]}
\xmllshow{child::something/child::whatever[position()>2][position()=1]}
\xmllshow{child::something/child::whatever[whocares][whocaresnot]}
\xmllshow{child::something/child::whatever[whocares][not(whocaresnot)]}
\xmllshow{child::something/child::whatever/self::whatever}
There is also \type {last-match::} that starts with the last found set of nodes.
This can save some run time when you do lots of tests combined with the same check
afterwards. There is however one pitfall: you never know what is done with that
last match in the setup that gets called nested. Take the following example:
\starttyping
\startbuffer[test]
done 1
done 2
done 3
\stopbuffer
\stoptyping
One way to filter the content is this:
\starttyping
\xmldoif {#1} {/crap/crapa/crapb/crapc/crapd/crape} {
some action
}
\stoptyping
It is not unlikely that you will do something like this:
\starttyping
\xmldoif {#1} {/crap/crapa/crapb/crapc/crapd/crape} {
\xmlfirst{#1}{/crap/crapa/crapb/crapc/crapd/crape}
}
\stoptyping
This means that the path is resolved twice but that can be avoided as
follows:
\starttyping
\xmldoif{#1}{/crap/crapa/crapb/crapc/crapd/crape}{
\xmlfirst{#1}{last-match::}
}
\stoptyping
But the next is not guaranteed to work:
\starttyping
\xmldoif{#1}{/crap/crapa/crapb/crapc/crapd/crape}{
\xmlfirst{#1}{last-match::}
\xmllast{#1}{last-match::}
}
\stoptyping
Because the first one can have done some lookup the last match can be replaced
and the second call will give unexpected results. You can overcome this with:
\starttyping
\xmldoif{#1}{/crap/crapa/crapb/crapc/crapd/crape}{
\xmlpushmatch
\xmlfirst{#1}{last-match::}
\xmlpopmatch
}
\stoptyping
Does it pay off? Here are some timings of a 10.000 times test and lookup
like the previous (on a decent January 2016 laptop):
\starttabulate[|r|l|]
\NC 0.239 \NC \type {\xmldoif {...} {...}} \NC \NR
\NC 0.292 \NC \type {\xmlfirst {...} {...}} \NC \NR
\NC 0.538 \NC \type {\xmldoif {...} {...} + \xmlfirst {...} {...}} \NC \NR
\NC 0.338 \NC \type {\xmldoif {...} {...} + \xmlfirst {...} {last-match::}} \NC \NR
\NC 0.349 \NC \type {+ \xmldoif {...} {...} + \xmlfirst {...} {last-match::}-} \NC \NR
\stoptabulate
So, pushing and popping (the last row) is a bit slower than not doing that but it
is still much faster than not using \type {last-match::} at all. As a shortcut
you can use \type {=}, as in:
\starttyping
\xmlfirst{#1}{=}
\stoptyping
You can even do this:
\starttyping
\xmlall{#1}{last-match::/text()}
\stoptyping
or
\starttyping
\xmlall{#1}{=/text()}
\stoptyping
\stopsection
\startsection[title={some more examples}]
\xmllshow{/something/whatever}
\xmllshow{something/whatever}
\xmllshow{/**/whocares}
\xmllshow{whoknows/whocares}
\xmllshow{whoknows}
\xmllshow{whocares[contains(text(),'f') or contains(text(),'g')]}
\xmllshow{whocares/first()}
\xmllshow{whocares/last()}
\xmllshow{whatever/all()}
\xmllshow{whocares/position(2)}
\xmllshow{whocares/position(-2)}
\xmllshow{whocares[1]}
\xmllshow{whocares[-1]}
\xmllshow{whocares[2]}
\xmllshow{whocares[-2]}
\xmllshow{whatever[3]/attribute(id)}
\xmllshow{whatever[2]/attribute('id')}
\xmllshow{whatever[3]/text()}
\xmllshow{/whocares/first()}
\xmllshow{/whocares/last()}
\xmllshow{xml://whatever/all()}
\xmllshow{whatever/all()}
\xmllshow{//whocares}
\xmllshow{..[2]}
\xmllshow{../*[2]}
\xmllshow{/(whocares|whocaresnot)}
\xmllshow{/!(whocares|whocaresnot)}
\xmllshow{/!whocares}
\xmllshow{/interface/command/command(xml:setups:register)}
\xmllshow{/interface/command[@name='xxx']/command(xml:setups:typeset)}
\xmllshow{/arguments/*}
\xmllshow{/sequence/first()}
\xmllshow{/arguments/text()}
\xmllshow{/sequence/variable/first()}
\xmllshow{/interface/define[@name='xxx']/first()}
\xmllshow{/parameter/command(xml:setups:parameter:measure)}
\xmllshow{/(*:library|figurelibrary)/*:figure/*:label}
\xmllshow{/(*:library|figurelibrary)/figure/*:label}
\xmllshow{/(*:library|figurelibrary)/figure/label}
\xmllshow{/(*:library|figurelibrary)/figure:*/label}
\xmllshow{whatever//br[tag(1)='br']}
\stopsection
\stopchapter
\startchapter[title=Examples]
\startsection[title=attribute chains]
In \CSS, when an attribute is not present, the parent element is checked, and when
not found again, the lookup follows the chain till a match is found or the root is
reached. The following example demonstrates how such a chain lookup works.
\startbuffer[test]
\stopbuffer
\typebuffer[test]
We apply the following setups to this tree:
\startbuffer[setups]
\startxmlsetups xml:common
[
\xmlchainatt{#1}{mine},
\xmlchainatt{#1}{test},
\xmlchainatt{#1}{more},
\xmlchainatt{#1}{none}
]\par
\stopxmlsetups
\startxmlsetups xml:something
something: \xmlsetup{#1}{xml:common}
\xmlflush{#1}
\stopxmlsetups
\startxmlsetups xml:whatever
whatever: \xmlsetup{#1}{xml:common}
\xmlflush{#1}
\stopxmlsetups
\startxmlsetups xml:whocares
whocares: \xmlsetup{#1}{xml:common}
\xmlflush{#1}
\stopxmlsetups
\startxmlsetups xml:mysetups
\xmlsetsetup{#1}{something|whatever|whocares}{xml:*}
\stopxmlsetups
\xmlregisterdocumentsetup{example-1}{xml:mysetups}
\xmlprocessbuffer{example-1}{test}{}
\stopbuffer
\typebuffer[setups]
This gives:
\start
\getbuffer[setups]
\stop
\stopsection
\startsection[title=conditional setups]
Say that we have this code:
\starttyping
\xmldoifelse {#1} {/what[@a='1']} {
\xmlfilter {#1} {/what/command('xml:yes')}
} {
\xmlfilter {#1} {/what/command('xml:nop')}
}
\stoptyping
Here we first determine if there is a child \type {what} with attribute \type {a}
set to \type {1}. Depending on the outcome again we check the child nodes for
being named \type {what}. A faster solution which also takes less code is this:
\starttyping
\xmlfilter {#1} {/what[@a='1']/command('xml:yes','xml:nop')}
\stoptyping
\stopsection
\startsection[title=manipulating]
Assume that we have the following \XML\ data:
\startbuffer[test]
right
wrong
\stopbuffer
\typebuffer[test]
But, instead of \type {right} we want to see \type {okay}. We can do that with a
finalizer:
\startbuffer
\startluacode
local rehash = {
["right"] = "okay",
}
function xml.finalizers.tex.Okayed(collected,what)
for i=1,#collected do
local str = xml.text(collected[i])
if what == "all" then
context(rehash[str] or str)
else
context(str)
end
end
end
\stopluacode
\stopbuffer
\typebuffer \getbuffer
\startbuffer
\startxmlsetups xml:A
\xmlflush{#1}
\stopxmlsetups
\startxmlsetups xml:B
(It's \xmlfilter{#1}{./Okayed("all")})
\stopxmlsetups
\startxmlsetups xml:testsetups
\xmlsetsetup{#1}{A|B}{xml:*}
\stopxmlsetups
\xmlregisterdocumentsetup{example-2}{xml:testsetups}
\xmlprocessbuffer{example-2}{test}{}
\stopbuffer
\typebuffer
The result is: \start \inlinebuffer \stop
\stopsection
\startsection[title=cross referencing]
A rather common way to add cross references to \XML\ files is to borrow the
asymmetrical id's from \HTML. This means that one cannot simply use a value
of (say) \type {href} to locate an \type {id}. The next example came up on
the \CONTEXT\ mailing list.
\startbuffer[test]
Text
and
\stopbuffer
\typebuffer[test]
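
To make the asymmetry concrete, here is a hypothetical fragment. The element
names and classes follow the patterns used in the setups of this section, but
the identifier \type {fn1} and the texts are made up for illustration.

\starttyping
<doc>
  <p>Text<a class="footnoteref" href="#fn1">1</a></p>
  <div class="footnotes">
    <ol>
      <li id="fn1">The note.</li>
    </ol>
  </div>
</doc>
\stoptyping

The reference carries a leading hash (\type {#fn1}) while the target does not
(\type {fn1}), so a literal comparison of both attribute values fails. This is
why the setups in this section fetch both values with \type {\xmlrefatt}, which,
as far as we know, strips a leading hash from the attribute value.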
We give two variants for dealing with such references. The first solution does a
lookup for each reference and, depending on the size of the file, can be somewhat
inefficient.
\startbuffer
\startxmlsetups xml:doc
\blank
\xmlflush{#1}
\blank
\stopxmlsetups
\startxmlsetups xml:p
\xmlflush{#1}
\stopxmlsetups
\startxmlsetups xml:footnote
(variant 1)\footnote
{\xmlfirst
{example-3-1}
{div[@class='footnotes']/ol/li[@id='\xmlrefatt{#1}{href}']}}
\stopxmlsetups
\startxmlsetups xml:initialize
\xmlsetsetup{#1}{p|doc}{xml:*}
\xmlsetsetup{#1}{a[@class='footnoteref']}{xml:footnote}
\xmlsetsetup{#1}{div[@class='footnotes']}{xml:nothing}
\stopxmlsetups
\xmlresetdocumentsetups{*}
\xmlregisterdocumentsetup{example-3-1}{xml:initialize}
\xmlprocessbuffer{example-3-1}{test}{}
\stopbuffer
\typebuffer
This will typeset two footnotes.
\getbuffer
The second variant collects the references first, so that less time is spent on
lookups.
\startbuffer
\startxmlsetups xml:doc
\blank
\xmlflush{#1}
\blank
\stopxmlsetups
\startxmlsetups xml:p
\xmlflush{#1}
\stopxmlsetups
\startluacode
userdata.notes = {}
\stopluacode
\startxmlsetups xml:collectnotes
\ctxlua{userdata.notes['\xmlrefatt{#1}{id}'] = '#1'}
\stopxmlsetups
\startxmlsetups xml:footnote
(variant 2)\footnote
{\xmlflush
{\cldcontext{userdata.notes['\xmlrefatt{#1}{href}']}}}
\stopxmlsetups
\startxmlsetups xml:initialize
\xmlsetsetup{#1}{p|doc}{xml:*}
\xmlsetsetup{#1}{a[@class='footnoteref']}{xml:footnote}
\xmlfilter{#1}{div[@class='footnotes']/ol/li/command(xml:collectnotes)}
\xmlsetsetup{#1}{div[@class='footnotes']}{}
\stopxmlsetups
\xmlregisterdocumentsetup{example-3-2}{xml:initialize}
\xmlprocessbuffer{example-3-2}{test}{}
\stopbuffer
\typebuffer
This will again typeset two footnotes:
\getbuffer
\stopsection
\startsection[title=mapping values]
One way to process options like \type {frame} in the example below is to map
their values to values known by \CONTEXT.
\startbuffer[test]
#1 | #2 | #3 | #4 |
#5 | #6 | #7 | #8 |
#1 | #2 | #3 | #4 |
#5 | #6 | #7 | #8 |
#1 | #2 | #3 | #4 |
#5 | #6 | #7 | #8 |
\stopbuffer
\typebuffer[test]
\startbuffer
\startxmlsetups xml:a
\xmlflush{#1}
\stopxmlsetups
\xmlmapvalue {nattable:frame} {on} {on}
\xmlmapvalue {nattable:frame} {yes} {on}
\xmlmapvalue {nattable:frame} {off} {off}
\xmlmapvalue {nattable:frame} {no} {off}
\startxmlsetups xml:nattable
\startplacetable[title=#1]
\setupTABLE[frame=\xmlval{nattable:frame}{\xmlatt{#1}{frame}}{on}]%
\bTABLE
\xmlflush{#1}
\eTABLE
\stopplacetable
\stopxmlsetups
\startxmlsetups xml:tr
\bTR
\xmlflush{#1}
\eTR
\stopxmlsetups
\startxmlsetups xml:td
\bTD
\xmlflush{#1}
\eTD
\stopxmlsetups
\startxmlsetups xml:testsetups
\xmlsetsetup{example-4}{a|nattable|tr|td|}{xml:*}
\stopxmlsetups
\xmlregisterdocumentsetup{example-4}{xml:testsetups}
\xmlprocessbuffer{example-4}{test}{}
\stopbuffer
The \type {\xmlmapvalue} mechanism is rather efficient and involves a minimum
of testing.
\typebuffer
We get:
\getbuffer
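
The mapping can also be looked at in isolation. The next lines are a minimal
sketch: the category \type {demo:align} and its values are hypothetical and only
serve to show the role of the third argument of \type {\xmlval}.

\starttyping
\xmlmapvalue {demo:align} {left}  {flushleft}
\xmlmapvalue {demo:align} {right} {flushright}

% a known value expands to its mapping
\xmlval {demo:align} {left} {middle}

% an unknown value falls back to the third argument
\xmlval {demo:align} {center} {middle}
\stoptyping

In the table example the same fallback makes sure that a missing or unexpected
\type {frame} attribute results in \type {on}.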
\stopsection
\startsection[title=using \LUA]
In this example we demonstrate how you can delegate rendering to \LUA. We
will construct a so-called extreme table. The input is:
\startbuffer[demo]
1 Text
2 More text
2 Even more text
2 And more
3 And even more
2 The last text
\stopbuffer
\typebuffer[demo]
The processor code is:
\startbuffer[process]
\startxmlsetups xml:test_setups
\xmlsetsetup{#1}{a|b|c|d}{xml:*}
\stopxmlsetups
\xmlregisterdocumentsetup{example-5}{xml:test_setups}
\xmlprocessbuffer{example-5}{demo}{}
\stopbuffer
\typebuffer[process]
We color a sequence of the same titles (numbers here) differently. The first
solution remembers the last title:
\startbuffer
\startxmlsetups xml:a
\startembeddedxtable
\xmlflush{#1}
\stopembeddedxtable
\stopxmlsetups
\startxmlsetups xml:b
\xmlfunction{#1}{test_ba}
\stopxmlsetups
\startluacode
local lasttitle = nil
function xml.functions.test_ba(t)
local title = xml.text(t, "/c")
local content = xml.text(t, "/d")
context.startxrow()
context.startxcell {
background = "color",
backgroundcolor = lasttitle == title and "colorone" or "colortwo",
foregroundstyle = "bold",
foregroundcolor = "white",
}
context(title)
lasttitle = title
context.stopxcell()
context.startxcell()
context(content)
context.stopxcell()
context.stopxrow()
end
\stopluacode
\stopbuffer
\typebuffer \getbuffer
The \type {embeddedxtable} environment is needed because the table is picked up
as argument.
\startlinecorrection \getbuffer[process] \stoplinecorrection
The second implementation remembers which titles have already been processed, so
here we can color the last one too.
\startbuffer
\startxmlsetups xml:a
\ctxlua{xml.functions.reset_bb()}
\startembeddedxtable
\xmlflush{#1}
\stopembeddedxtable
\stopxmlsetups
\startxmlsetups xml:b
\xmlfunction{#1}{test_bb}
\stopxmlsetups
\startluacode
local titles
function xml.functions.reset_bb(t)
titles = { }
end
function xml.functions.test_bb(t)
local title = xml.text(t, "/c")
local content = xml.text(t, "/d")
context.startxrow()
context.startxcell {
background = "color",
backgroundcolor = titles[title] and "colorone" or "colortwo",
foregroundstyle = "bold",
foregroundcolor = "white",
}
context(title)
titles[title] = true
context.stopxcell()
context.startxcell()
context(content)
context.stopxcell()
context.stopxrow()
end
\stopluacode
\stopbuffer
\typebuffer \getbuffer
\startlinecorrection \getbuffer[process] \stoplinecorrection
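
The difference between both ways of keeping state can be seen without any
typesetting. The following stand-alone \LUA\ fragment is only a sketch: it
assumes the sequence of titles 1, 2, 2, 2, 3, 2 from the demo input and prints
the color that each approach would choose.

\starttyping
local titles = { "1", "2", "2", "2", "3", "2" }

-- variant one: compare with the previous title only
local last = nil
for i=1,#titles do
    local t = titles[i]
    print("last", t, last == t and "colorone" or "colortwo")
    last = t
end

-- variant two: remember every title seen so far
local seen = { }
for i=1,#titles do
    local t = titles[i]
    print("seen", t, seen[t] and "colorone" or "colortwo")
    seen[t] = true
end
\stoptyping

Both loops agree on the first five entries. Only the final \type {2} differs: it
follows a \type {3}, so the first variant treats it as new, while the second
variant recognizes it as a title that was seen before.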
A solution without any state variable is given below.
\startbuffer
\startxmlsetups xml:a
\startembeddedxtable
\xmlflush{#1}
\stopembeddedxtable
\stopxmlsetups
\startxmlsetups xml:b
\xmlfunction{#1}{test_bc}
\stopxmlsetups
\startluacode
function xml.functions.test_bc(t)
local title = xml.text(t, "/c")
local content = xml.text(t, "/d")
context.startxrow()
local okay = xml.text(t,"./preceding-sibling::/[-1]") == title
context.startxcell {
background = "color",
backgroundcolor = okay and "colorone" or "colortwo",
foregroundstyle = "bold",
foregroundcolor = "white",
}
context(title)
context.stopxcell()
context.startxcell()
context(content)
context.stopxcell()
context.stopxrow()
end
\stopluacode
\stopbuffer
\typebuffer \getbuffer
\startlinecorrection \getbuffer[process] \stoplinecorrection
Here is a solution that delegates even more to \LUA. The previous variants were
actually not that safe with respect to special characters and didn't handle
nested elements either, but the next one does.
\startbuffer[demo]
#1 Text
#2 More text
#2 Even more text
#2 And more
#3 And even more
#2 Something nested
\stopbuffer
\typebuffer[demo]
We also need to map the \type {i} element.
\startbuffer
\startxmlsetups xml:a
\starttexcode
\xmlfunction{#1}{test_a}
\stoptexcode
\stopxmlsetups
\startxmlsetups xml:c
\xmlflush{#1}
\stopxmlsetups
\startxmlsetups xml:d
\xmlflush{#1}
\stopxmlsetups
\startxmlsetups xml:i
{\em\xmlflush{#1}}
\stopxmlsetups
\startluacode
function xml.functions.test_a(t)
context.startxtable()
local previous = false
for b in xml.collected(lxml.getid(t),"/b") do
context.startxrow()
local current = xml.text(b,"/c")
context.startxcell {
background = "color",
backgroundcolor = (previous == current) and "colorone" or "colortwo",
foregroundstyle = "bold",
foregroundcolor = "white",
}
lxml.first(b,"/c")
context.stopxcell()
context.startxcell()
lxml.first(b,"/d")
context.stopxcell()
previous = current
context.stopxrow()
end
context.stopxtable()
end
\stopluacode
\startxmlsetups xml:test_setups
\xmlsetsetup{#1}{a|b|c|d|i}{xml:*}
\stopxmlsetups
\xmlregisterdocumentsetup{example-5}{xml:test_setups}
\xmlprocessbuffer{example-5}{demo}{}
\stopbuffer
\typebuffer
\startlinecorrection \getbuffer \stoplinecorrection
The question is, do we really need \LUA ? Often we don't, apart maybe from an
occasional special finalizer. A pure \TEX\ solution is given next:
\startbuffer
\startxmlsetups xml:a
\glet\MyPreviousTitle\empty
\glet\MyCurrentTitle \empty
\startembeddedxtable
\xmlflush{#1}
\stopembeddedxtable
\stopxmlsetups
\startxmlsetups xml:b
\startxrow
\xmlflush{#1}
\stopxrow
\stopxmlsetups
\startxmlsetups xml:c
\xdef\MyCurrentTitle{\xmltext{#1}{.}}
\doifelse {\MyPreviousTitle} {\MyCurrentTitle} {
\startxcell
[background=color,
backgroundcolor=colorone,
foregroundstyle=bold,
foregroundcolor=white]
} {
\glet\MyPreviousTitle\MyCurrentTitle
\startxcell
[background=color,
backgroundcolor=colortwo,
foregroundstyle=bold,
foregroundcolor=white]
}
\xmlflush{#1}
\stopxcell
\stopxmlsetups
\startxmlsetups xml:d
\startxcell
\xmlflush{#1}
\stopxcell
\stopxmlsetups
\startxmlsetups xml:i
{\em\xmlflush{#1}}
\stopxmlsetups
\startxmlsetups xml:test_setups
\xmlsetsetup{#1}{*}{xml:*}
\stopxmlsetups
\xmlregisterdocumentsetup{example-5}{xml:test_setups}
\xmlprocessbuffer{example-5}{demo}{}
\stopbuffer
\typebuffer
\startlinecorrection \getbuffer \stoplinecorrection
You can even save a few lines of code:
\starttyping
\startxmlsetups xml:c
\xdef\MyCurrentTitle{\xmltext{#1}{.}}
\startxcell
[background=color,
backgroundcolor=color\ifx\MyPreviousTitle\MyCurrentTitle one\else two\fi,
foregroundstyle=bold,
foregroundcolor=white]
\xmlflush{#1}
\stopxcell
\glet\MyPreviousTitle\MyCurrentTitle
\stopxmlsetups
\stoptyping
Or if you prefer:
\starttyping
\startxmlsetups xml:c
\xdef\MyCurrentTitle{\xmltext{#1}{.}}
\doifelse {\MyPreviousTitle} {\MyCurrentTitle} {
\xmlsetup{#1}{xml:c:one}
} {
\xmlsetup{#1}{xml:c:two}
}
\stopxmlsetups
\startxmlsetups xml:c:one
\startxcell
[background=color,
backgroundcolor=colorone,
foregroundstyle=bold,
foregroundcolor=white]
\xmlflush{#1}
\stopxcell
\stopxmlsetups
\startxmlsetups xml:c:two
\startxcell
[background=color,
backgroundcolor=colortwo,
foregroundstyle=bold,
foregroundcolor=white]
\xmlflush{#1}
\stopxcell
\global\let\MyPreviousTitle\MyCurrentTitle
\stopxmlsetups
\stoptyping
These examples demonstrate that it doesn't hurt to know a little bit of \TEX\
programming: defining macros and basic comparisons can come in handy. There are
examples in the test suite; you can also peek in the source code, consult the
wiki, or just ask on the list.
\stopsection
\startsection[title=last match]
For the next example we use the following \XML\ input:
\startbuffer[demo]
\stopbuffer
\typebuffer[demo]
If you check whether some element is present and then act accordingly, you can
end up doing the same lookup twice. Although this might sound inefficient, in
practice the difference is often not measurable.
\startbuffer
\startxmlsetups xml:demo:document
\type{\xmlall{#1}{/section[@id='2']/content/p}}\par
\xmldoif{#1}{/section[@id='2']/content/p} {
\xmlall{#1}{/section[@id='2']/content/p}
}
\type{\xmllastmatch}\par
\xmldoif{#1}{/section[@id='2']/content/p} {
\xmllastmatch
}
\type{\xmlall{#1}{last-match::}}\par
\xmldoif{#1}{/section[@id='2']/content/p} {
\xmlall{#1}{last-match::}
}
\type{\xmlfilter{#1}{last-match::/command(xml:demo:p)}}\par
\xmldoif{#1}{/section[@id='2']/content/p} {
\xmlfilter{#1}{last-match::/command(xml:demo:p)}
}
\stopxmlsetups
\startxmlsetups xml:demo:p
\quad\xmlflush{#1}\endgraf
\stopxmlsetups
\startxmlsetups xml:demo:base
\xmlsetsetup{#1}{document|p}{xml:demo:*}
\stopxmlsetups
\xmlregisterdocumentsetup{example-6}{xml:demo:base}
\xmlprocessbuffer{example-6}{demo}{}
\stopbuffer
\typebuffer
In the second check we just flush the last match, so effectively we do an \type
{\xmlall} here. The third and fourth alternatives demonstrate how we can use
\type {last-match} as axis. The gain is 10\% or more on the lookup, but of course
typesetting often takes relatively more time than the lookup.
\startpacked
\getbuffer
\stoppacked
\stopsection
\startsection[title=Finalizers]
The \XML\ parser is also available outside \TEX. Here is an example of its usage.
We pipe the result to \TEX\ but you can do with \type {t} whatever you like.
\startbuffer
local x = xml.load("manual-demo-1.xml")
local t = { }
for c in xml.collected(x,"//*") do
if not c.special and not t[c.tg] then
t[c.tg] = true
end
end
context.tocontext(table.sortedkeys(t))
\stopbuffer
\typebuffer
This returns:
\ctxluabuffer
We can wrap this in a finalizer:
\startbuffer
xml.finalizers.taglist = function(collected)
local t = { }
for i=1,#collected do
local c = collected[i]
if not c.special then
local tg = c.tg
if tg and not t[tg] then
t[tg] = true
end
end
end
return table.sortedkeys(t)
end
\stopbuffer
\typebuffer
Or in a more extensive one:
\startbuffer
xml.finalizers.taglist = function(collected,parenttoo)
local t = { }
for i=1,#collected do
local c = collected[i]
if not c.special then
local tg = c.tg
if tg and not t[tg] then
t[tg] = true
end
if parenttoo then
local p = c.__p__
if p and not p.special then
local tg = p.tg .. ":" .. tg
if tg and not t[tg] then
t[tg] = true
end
end
end
end
end
return table.sortedkeys(t)
end
\stopbuffer
\typebuffer \ctxluabuffer
Usage is as follows:
\startbuffer
local x = xml.load("manual-demo-1.xml")
local t = xml.applylpath(x,"//*/taglist()")
context.tocontext(t)
\stopbuffer
\typebuffer
And indeed we get:
\ctxluabuffer
But we can also say:
\startbuffer
local x = xml.load("manual-demo-1.xml")
local t = xml.applylpath(x,"//*/taglist(true)")
context.tocontext(t)
\stopbuffer
\typebuffer
Now we get:
\ctxluabuffer
\stopsection
\startsection[title=Pure xml]
One might wonder what a \TEX\ macro package would look like when backslashes,
dollars and percent signs would have no special meaning. In fact, it would be
rather useless, as the interpretation of commands is triggered by such
characters. Any formatting or coding system needs such characters. Take \XML:
angle brackets and ampersands are really special. So, no matter what system we
use, we do have to deal with the (common) case where these characters need to be
seen as they are. Normally escaping is the solution.
The \CONTEXT\ interface for \XML\ suffers from this as well. You really don't
want to know how many tricks are used for dealing with special characters and
entities: there are several ways these travel through the system and it is
possible to adapt and cheat. Especially roundtripped data (via the tuc file) puts
some demands on the system because \XML\ can become \TEX\ and vice versa.
The next example (derived from a mail on the list) demonstrates this:
\starttyping
\startbuffer[demo]
\ConTeXt\ is great
but you need to know some tricks
\stopbuffer
\startxmlsetups xml:initialize
\xmlsetsetup{#1}{doc|p|code}{xml:*}
\xmlsetsetup{#1}{pre/code}{xml:pre:code}
\stopxmlsetups
\xmlregistersetup{xml:initialize}
\startxmlsetups xml:doc
\xmlflush{#1}
\stopxmlsetups
\startxmlsetups xml:pre:code
no solution
\comment[symbol=Key, location=inmargin,color=yellow]{\xmlflush{#1}}
\par
solution one \begingroup
\expandUx
\comment[symbol=Key, location=inmargin,color=yellow]{\xmlflush{#1}}
\endgroup
\par
solution two
\comment[symbol=Key, location=inmargin,color=yellow]{\xmlpure{#1}}
\par
\xmlprettyprint{#1}{tex}
\stopxmlsetups
\xmlprocessbuffer{main}{demo}{}
\stoptyping
The first comment (an interactive feature of \PDF) comes out as:
\starttyping
\Ux {5C}ConTeXt\Ux {5C} is great
\stoptyping
The second and third comment are okay. It's one of the reasons why we have \type
{\xmlpure}.
\stopsection
\stopchapter
\stopbodymatter
\stoptext