diff --git a/doc/context/sources/general/manuals/publications/publications-datasets.tex b/doc/context/sources/general/manuals/publications/publications-datasets.tex
new file mode 100644
index 000000000..d612b72e0
--- /dev/null
+++ b/doc/context/sources/general/manuals/publications/publications-datasets.tex
@@ -0,0 +1,337 @@
+\environment publications-style
+
+\startcomponent publications-datasets
+
+\startchapter[title=Datasets]
+
+Normally in a document you will use only one bibliographic database, whether or
+not its source is distributed over multiple files. Nevertheless, we support
+multiple database formats as well, which is why we talk of datasets instead. The
+use of multiple datasets allows the isolation of different bibliographies (a
+single bibliography can nevertheless be rendered by structure element: section,
+chapter, part, etc. as we shall see later). A good example of the use of multiple
+datasets would be for a proper bibliography itself in addition to a reference
+catalog (of equipment, suppliers, software, patents, legal jurisprudence, music,
+\unknown). Indeed, datasets can be used to hold both bibliographic and
+non|-|bibliographic information.
+
+A dataset is initiated with the \Cindex {definebtxdataset} command.
+
+\cindex {definebtxdataset}
+
+\startTEX
+\definebtxdataset[default]
+\stopTEX
+
+\startaside
+A default database, \TEXcode {default}, is predefined, yet we recommend defining
+it explicitly because in the future we may provide more options.
+\stopaside
+
+Like other commands in \CONTEXT, the dataset options can be set up using the
+command \Cindex {setupbtxdataset}.
+
+\cindex {definebtxdataset}
+\showsetup[definebtxdataset]
+
+\cindex {setupbtxdataset}
+\showsetup[setupbtxdataset]
+
+A dataset is loaded from some source through the use of the
+\Cindex {usebtxdataset} command.
+
+Here are some examples:
+
+\cindex {usebtxdataset}
+\tindex {.bib}
+\tindex {.xml}
+\tindex {.lua}
+\tindex {.bbl}
+
+\startTEX
+\usebtxdataset[tugboat][tugboat.bib]
+\usebtxdataset[default][mtx-bibtex-output.xml]
+\usebtxdataset[default][test-001-btx-standard.lua]
+\usebtxdataset[default][mkii-publications.bbl]
+\usebtxdataset[default][named.buffer]
+\stopTEX
+
+\cindex {usebtxdataset}
+\showsetup[usebtxdataset]
+
+The four suffixes illustrated in the example above are understood by the loader.
+In every call but the first, the dataset has the name \TEXcode {default}, so
+these four sources are merged. The last example shows that a \TEXcode {named}
+\Index {buffer} can also be employed to add dataset entries (in \BIBTEX\ format).
+This may be useful for small additions or examples, but it is generally a better
+idea (for convenience of management of data) to place them in files separate from
+the document source code.
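+
+As a minimal sketch of the buffer approach (the entry and its tag \type
+{knuth1984} are only illustrative and are not part of the example databases
+used in this manual), a small \BIBTEX\ entry could be placed in a buffer called
+\TEXcode {named} and then loaded into the \TEXcode {default} dataset:
+
+\startTEX
+% illustrative entry; the tag knuth1984 is a hypothetical example
+\startbuffer[named]
+@Book{knuth1984,
+    author    = {Donald E. Knuth},
+    title     = {The TeXbook},
+    publisher = {Addison-Wesley},
+    year      = {1984},
+}
+\stopbuffer
+
+\definebtxdataset[default]
+\usebtxdataset[default][named.buffer]
+\stopTEX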
+
+Definitions in the document source (coded in \TEX\ speak) are also added, and
+they are saved for successive runs. This means that if you load and define
+entries, they will be known at a next run beforehand, so that references to them
+are independent of where in the document source loading and definitions take
+place. This makes it convenient to break up the dataset loading calls across the
+relevant sections of the document structure.
+
+In this document we use some example databases, so let's load one of them now:
+\startfootnote This code snippet demonstrates that \TEXcode {\usebtxdataset} will
+implicitly declare an undefined dataset name, although this practice is to be
+discouraged. Similarly, omitting to specify the dataset name \TEXcode {[default]}
+in the examples given earlier would fall back correctly, but this, too, is to
+be discouraged as being potentially error|-|prone. \stopfootnote
+
+\startbuffer
+\usebtxdataset[example][mkiv-publications.bib]
+\stopbuffer
+
+\cindex {definebtxdataset}
+\cindex {usebtxdataset}
+
+\typeTEXbuffer
+
+\getbuffer
+
+The beginning of the file \type {mkiv-publications.bib} is shown below in \in
+{table} [tab:mkiv-publications.bib]. This bibliography database test file
+contains one entry of each standard type or category, with the \Index {tag} set
+to the entry type name. The entry shown here illustrates many features that will
+be explained elsewhere in the text.
+
+\startsection[title=Dataset coverage]
+
+You can load much more data than you actually need. Usually only those entries
+that are referred to explicitly will be shown in lists, and commands used to
+select these dataset entries will be described in \in {chapter} [ch:cite].
+
+A single bibliography list can span groups of datasets; also, multiple datasets
+can be loaded from the same source, for example, one per chapter, in order to
+achieve a complete \Index {isolation} of bibliographies with respect to numbering
+and references.
+
+As this concept is not obvious but can be quite useful, we will repeat this last
+point: multiple datasets can be loaded using the same source file, i.e.\
+containing the same data, to be used in parallel, independently. There is little
+penalty in keeping even very large datasets as multiple copies in memory.
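+
+A possible sketch of this isolation by chapter, assuming two hypothetical
+dataset names \TEXcode {chapterone} and \TEXcode {chaptertwo}, loads the same
+file twice:
+
+\startTEX
+% chapterone and chaptertwo are illustrative names, not predefined datasets
+\definebtxdataset[chapterone]
+\definebtxdataset[chaptertwo]
+
+\usebtxdataset[chapterone][mkiv-publications.bib]
+\usebtxdataset[chaptertwo][mkiv-publications.bib]
+\stopTEX
+
+Numbering and references in the two datasets are then kept apart, as described
+above.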
+
+The currently active dataset, used by default, can be set with:
+
+\startbuffer
+\setupbtx[dataset=example]
+\stopbuffer
+
+\cindex {setupbtx}
+
+\typeTEXbuffer
+
+\getbuffer
+
+However, most publication|-|related commands accept optional arguments that
+denote the dataset, and references to entries can always be prefixed with a
+dataset identifier. More about that later.
+
+\showsetup[setupbtx]
+
+\stopsection
+
+\startsection [title=Specification]
+
+The content of a dataset can really be anything: entries of types (or categories)
+of all sorts, each containing arbitrary fields. The use to be made of this data
+can vary greatly since the system is not limited to the production of
+bibliography lists, in particular. The intended use is reflected through a set of
+specifications, specific to each bibliography (or non|-|bibliography) style.
+These specifications affect the interpretation of dataset categories and fields
+as well as their rendering. They will also affect the rendering of citations or
+the reference or invocation of individual data entries.
+
+The \TEXcode {default} bibliography specification is very simple: only the
+categories \TEXcode {book} and \TEXcode {article} are explicitly defined. These
+were shown along with their default rendering in the quick|-|start example on \at
+{page} [ch:quick]. We purposely limited this \TEXcode {default} specification as
+a minimal example for a bibliography.
+
+The notion of categories and the fields that they might contain and their
+interpretation depend on a particular specification, although the dataset
+\emphasis {content} is independent of all eventual rendering specifications that
+may be applied.
+
+An alternative set of specifications can be selected using, for example:
+
+\startbuffer
+\usebtxdefinitions[apa]
+\stopbuffer
+
+\cindex {usebtxdefinitions}
+\index {style+APA}
+\seeindex {specification}{style}
+
+\typeTEXbuffer
+
+\getbuffer
+
+Alternatively, the set of specifications can be loaded and (later) activated using
+
+\cindex {loadbtxdefinitionfile}
+\cindex {setupbtx}
+\index {style+APA}
+
+\startTEX
+\loadbtxdefinitionfile[apa]
+...
+\setupbtx[specification=apa]
+\stopTEX
+
+but it is safer to use the \TEXcode {\use} rather than \TEXcode {\load} form, in
+particular with specifications that may themselves have several variants. Also,
+it is all too easy to later forget to set the \TEXcode {specification} parameter
+and then wonder why the loaded specification was not applied.
+
+\startaside
+We wish to clarify that each specification defines the categories of entries and
+the interpretation or use of the fields that they contain, but does not alter the
+data itself, only how this data is used. It also defines \emphasis {setups} that
+control the rendering of lists as well as citations (to be described below).
+Additionally, it creates a namespace with settings for particular \emphasis
+{parameters} controlling the formatting of names, for example, punctuation as
+well as other stylistic features. The user can tune or overload these settings as
+needed.
+\stopaside
+
+A specification need not be activated before loading a dataset; indeed the
+contents of a dataset are stored independently of the specification, and multiple
+specifications can be applied to the same dataset (although this will not usually
+be the case). Furthermore, multiple specification files can be loaded
+simultaneously as they reside in separate namespaces, but only one specification
+can be selected at a time. We introduce these commands here in the context of
+datasets as the labeling of categories and of field use can change depending on
+the specification. Indeed, some specifications might ignore certain fields
+present in the dataset that may be used with other specifications. The details of
+how this is programmed will be explained in \in {Chapter} [ch:custom].
+
+So a specification is both a definition of how a dataset is to be interpreted
+and a stylistic tuning of how it is to be rendered.
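+
+Putting the commands introduced so far together, the setup part of a document
+might look like the following sketch, which only combines calls shown earlier in
+this chapter. The specification is selected here after the dataset is loaded,
+which is permitted as noted above:
+
+\startTEX
+\definebtxdataset[example]
+\usebtxdataset[example][mkiv-publications.bib] % data, stored as is
+
+\usebtxdefinitions[apa]                        % interpretation and rendering
+\setupbtx[dataset=example]                     % make it the default dataset
+\stopTEX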
+
+\cindex {loadbtxdefinitionfile}
+\showsetup[loadbtxdefinitionfile]
+
+\cindex {usebtxdefinitions}
+\showsetup[usebtxdefinitions]
+
+\stopsection
+
+\startsection [title=Dataset diagnostics]
+
+You can ask for an overview of entries present in a dataset with:
+
+\startbuffer
+\showbtxdatasetfields[example]
+\stopbuffer
+
+\cindex {showbtxdatasetfields}
+
+\typeTEXbuffer
+
+The listing that this produces is shown in \in {Appendix} [ch:datasetfields].
+
+\cindex {showbtxdatasetfields}
+\showsetup[showbtxdatasetfields]
+\showsetup[showbtxdatasetfields:argument]
+
+Sometimes you might want to check a database, listing all of its entries in
+detail. This can be particularly useful when in doubt concerning the correctness
+or the completeness of the data source, remembering that invalid entries and some
+syntax errors are simply skipped over. One way of examining the loaded dataset in
+detail is the following:
+
+\startbuffer
+\showbtxdatasetcompleteness[example]
+\stopbuffer
+
+\cindex {showbtxdatasetcompleteness}
+
+\typeTEXbuffer
+
+The diagnostic listing (which can be rather long) is shown in \in {Appendix}
+[ch:datasetcompleteness].
+
+\cindex {showbtxdatasetcompleteness}
+\showsetup[showbtxdatasetcompleteness]
+\showsetup[showbtxdatasetcompleteness:argument]
+
+The dataset contains many entries and each entry is assigned to a \Index
+{category}. It must be stressed, so we repeat ourselves here, that these \quote
+{categories} can be of any sort whatsoever, the meaning of which resides in the
+rendering style that is chosen. The entries contain fields, and these too can be
+of any sort; their use also depends on the rendering style and the \Index
+{category} to which they belong. \BIBTEX\ has conventionally defined a number of
+standard categories, each making use of a number of fields considered either
+\index {field+required}required, \index {field+optional}optional or \index
+{field+ignored}ignored. However, different traditional \BIBTEX\ rendering styles
+can make inconsistant use of these standard categories and fields. To make
+matters worse, different \Tindex {.bib} database handling programs might use (and
+impose) differing \quote {standards} as well, as mentioned above. \startfootnote
+For example, \Tindex {jabref}, in addition to discarding all comments contained
+in the database file, will convert all unrecognized, preciously named categories
+to \tindex {@other}\BTXcode {@Other}! Of course, \Tindex {jabref} is flexible
+enough to be configured with new categories and additional fields, so users of
+\Tindex {jabref} with \CONTEXT\ will probably want to use an extended, custom
+configuration. \stopfootnote This situation arises from the complexity of
+handling bibliographic data of all sorts.
+
+You can see all (currently known) \index {category}categories and \index
+{field}fields with:
+
+\cindex {showbtxfields}
+
+\startTEX
+\showbtxfields[rotation=...]
+\stopTEX
+
+The result is shown in \in {table} [tab:fields], below.
+
+\cindex {showbtxfields}
+\showsetup[showbtxfields]
+\showsetup[showbtxfields:argument]
+
+Note that other, possibly non|-|bibliographic use of the present dataset system
+might define entirely different categories and field types, possibly having
+nothing at all to do with the names shown here. An example of such use is given
+in \in {chapter} [ch:duane].
+
+Just as a database can be much larger than needed for a document, the same is
+true for the fields that make up an entry; not all entry fields will
+necessarily be used. This idea will be developed in the next section describing the
+rendering of bibliography lists.
+
+\stopsection
+
+\startplacetable
+ [reference=tab:mkiv-publications.bib,
+ title={mkiv-publications.bib\\
+ This test file was constructed to illustrate various features of the
+ \BIBTEX\ format and contains some fields that might at first glance
+ appear somewhat curious.}]
+ \typeBTXfile
+ [range={@Comment{Start example},@Comment{Stop example}}]
+ {mkiv-publications.bib}
+\stopplacetable
+
+\startplacetable
+ [reference=tab:fields,
+ list={\TEXcode {\showbtxfields[rotation=90]}},
+ title={\cindex {showbtxfields}\TEXcode {\showbtxfields[rotation=90]} The entry
+ \Index {category} and \Index {field} names (and how they are used) are
+ defined by both the rendering style and the contents of the
+ dataset. \index {field+required}\quote {Required} fields are indicated
+ in green. All unmarked fields are normally \index
+ {field+ignored}ignored in the rendering.}]
+ \small
+ \showbtxfields[rotation=90]
+\stopplacetable
+
+\placefloats
+
+\stopchapter
+
+\stopcomponent