path: root/doc/context/sources/general/manuals/languages/languages-basics.tex
authorContext Git Mirror Bot <>2016-07-30 01:22:07 +0200
committerContext Git Mirror Bot <>2016-07-30 01:22:07 +0200
commit5135aef167bec739fe429e1aa987671768b237bc (patch)
treebd9f9696704e57c45f453bb7dc6becd5501cb657 /doc/context/sources/general/manuals/languages/languages-basics.tex
parent9d7c4ba8449bec1da920c01e24a17c41bbf2211d (diff)
2016-07-30 00:31:00
Diffstat (limited to 'doc/context/sources/general/manuals/languages/languages-basics.tex')
1 files changed, 0 insertions, 348 deletions
diff --git a/doc/context/sources/general/manuals/languages/languages-basics.tex b/doc/context/sources/general/manuals/languages/languages-basics.tex
deleted file mode 100644
index 840897096..000000000
--- a/doc/context/sources/general/manuals/languages/languages-basics.tex
+++ /dev/null
@@ -1,348 +0,0 @@
-% language=uk
-\startcomponent languages-basics
-\environment languages-environment
-\startchapter[title=Some basics][color=darkyellow]
-In this chapter we will see how we can toggle between languages. A first
-introduction to patterns will be given. Some details of how to control the
-hyphenation with specific patterns will be given in a later chapter.
-\startsection[title={Available languages}]
-When you use the English version of \CONTEXT\ you will default to US English as
-the main language. This means that hyphenation will be US specific, which by the way
-is different from the rules in GB. All labels that are generated by the system
-are also in English. Languages can often be accessed by names like \type
-{english} or \type {dutch} although it is quite common to use the short tags like
-\type {en} and \type {nl}. Because we want to be as compatible as possible with
-\MKII, there are quite some synonyms. The following table lists the languages
-for which support is built|-|in.\footnote {More languages can be defined. It is
-up to users to provide the information.}
-You can call up such a table with the following commands:
-Instead you can run \type {context --global languages-system.mkiv}.
-As you can see, many languages have hyphenation patterns but for Japanese,
-Korean, Chinese as well as Arabic languages they make no sense. The patterns are
-loaded on demand. The number is the internal number that is used in the engine; a
-user never has to use that number. Numbers $<1$ are used to disable hyphenation.
-The file tag is used to locate and load a specification. Such files have names
-like \type {lang-nl.lua}.
-Some languages share the same hyphenation patterns but can have demands that
-differ, like labels or quotes. The characters shown in the table are those found
-in the pattern files. The number of patterns differs a lot between languages.
-This relates to the system behind them. Some languages use word stems, others
-base their hyphenation on syllables. Some languages have inflections, which adds to
-the complexity while others can combine words in ways that demand special care
-for word boundaries. Of course a low or high number can signal a low quality as
-well, but most pattern collections are assembled over many years and updated when
-for instance spelling rules change. I think that we can safely say that most patterns
-are quite stable and of good quality.
-The document language is set with
-but when you want to apply the proper hyphenation rules to an embedded language
-you can use:
-or just:
-The main language determines what labels show up, how numbering happens, in what
-way dates get formatted, etc. Normally the \typ {\mainlanguage} command comes
-before the \typ {\starttext} command.
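-As a minimal sketch of these switches (the sample text is made up for this
-illustration), a document with English as main language and some embedded
-Dutch could look like this:
-\starttyping
-\mainlanguage[en]
-\starttext
-    This is English, {\language[nl] dit is Nederlands} and {\nl dit ook}.
-\stoptext
-\stoptyping
-The braces keep each switch local, so the text after a group is hyphenated as
-English again.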
-In \LUATEX\ each character that gets typeset not only carries a font id and character
-code, but also a language number. You can switch language whenever you want and
-the change will be carried with the characters. Switching within a word doesn't make
-sense but it is permitted:
-\NC 1 \NC \type{\de incrediblykompliziert} \NC \hyphenatedword{\de incrediblykompliziert} \NC \NR
-\NC 2 \NC \type{\en incrediblykompliziert} \NC \hyphenatedword{\en incrediblykompliziert} \NC \NR
-\NC 3 \NC \type{\en incredibly\de kompliziert} \NC \hyphenatedword{\en incredibly\de kompliziert} \NC \NR
-\NC 4 \NC \type{\en incredibly\de\-kompliziert} \NC \hyphenatedword{\en incredibly\de\-kompliziert} \NC \NR
-\NC 5 \NC \type{\en incredibly\de-kompliziert} \NC \hyphenatedword{\en incredibly\de-kompliziert} \NC \NR
-In line 4 we have a \type {\-} between the two words, and in the last
-line just a \type {-}. If you look closely you will notice that the snippets
-can be quite small. If we typeset a word with a 1mm text width we get this:
-\blank \start \en \hsize 1mm incredibly \par \stop \blank
-If you are familiar with the details of hyphenation, you know that the number of
-characters at the end and beginning of a word is controlled by the two variables
-\typ {\lefthyphenmin} and \typ {\righthyphenmin}. However, these only influence
-the hyphenation process. What bits and pieces eventually end up on a line is
-determined by the par builder and there the \type {\hsize} matters. In practice
-you will not run into these situations, unless you have extremely long words and a
-narrow column.
-Hyphenation normally is limited to regular characters that make up the alphabet of
-a language. It is insensitive to capitalization, as the following text shows:
-\hyphenatedword {This time the musical distraction while developing code came
-from watching youtube performances of Cory Henry (also known from Snarky Puppy,
-a conglomerate of excellent players). Just search the web for his name with \quote
-{Stevie Wonder and Michael Jackson Tribute}. There is no keyboard he can't play.
-Another interesting keyboard player is Sun Rai (a short name for Rai
-Thistlethwayte); just google for \quote {The Beatles, Come Together, Live Piano
-Acoustic with Loop Pedal}, or do a combined search with \quote {Matt
-Chamberlain}. Okay, and talking of keyboards, let's not forget Vika Yermolyeva
-(vkgoeswild) as she's one of a kind too on the web. And then there is Jacob
-Collier, in one word: incredible (or hyphenated the Dutch way {\nl incredible},
-let me repeat that in French {\fr incredible}).} \footnote {Don't get me wrong, there
-are of course many more fantastic musicians.}
-Of course, names are often short and don't need to be hyphenated
-(or the left and right settings prohibit it). Another complication with names is
-that they can come from another language so we either need to switch language
-temporarily or we need to add an exception (more about that later).
-In traditional \TEX\ the language is not a property of a character but is
-triggered by a signal in the (so called) list. Think of:
-<language 1>this is <language 2>nederlands<language 1> mixed with english
-This number is set by the primitive \typ {\language}. Language triggers are
-injected into the list depending on the value of this number. There is also a \typ
-{\setlanguage} primitive that can inject triggers without setting the \typ
-{\language} number. Because in \LUATEX\ the state is kept with the character
-you don't need to worry about the subtle differences here.
-In \CONTEXT\ the \typ {\language} and \typ {\setlanguage} commands are overloaded
-by a more advanced switch macro. You cannot assume that they work as explained in
-general manuals about \TEX. Currently you can still assign a number but that
-might change. Just consider the language to be an abstraction and don't mess with
-this number. Both commands not only change the current language but also do
-specific initializations when needed.
-What characters get involved in hyphenation is historically determined by the so
-called \type {\lccode} values. Each character can have such a value which maps
-an uppercase to a lowercase character. This concept has been extended in \ETEX\
-where it binds to a pattern set (language). However, in \CONTEXT\ the user never
-has to worry about such details.
-% The \type {\patterns} primitive is
-% The \type {\hyphenation} primitive is
-With the traditional hyphenator a word will not be hyphenated if the sum of \typ
-{\lefthyphenmin} and \typ {\righthyphenmin} exceeds 62. This limitation is not
-present in the to be presented \LUA\ variant of this routine as there is no
-good reason for this limitation other than implementation constraints.
-We already mentioned \typ {\lefthyphenmin} and \typ {\righthyphenmin}. These
-two variables control the area in a word that is subjected to hyphenation.
-Setting these values is a matter of taste but making them too small can result in
-bad hyphenation when the patterns are made with the assumption that certain
-minima are used. Using a \typ {\lefthyphenmin} of 2 while the patterns are made
-with a value of 3 in mind is a bad idea.
-context.bTABLE { option = "stretch", align = "middle" }
-    -- header: a corner cell spanning two rows, a title spanning five columns
-    context.bTR()
-        context.bTD { ny = 2, align = "middle,lohi", style = "monobold" }
-            context.verbatim("\\lefthyphenmin")
-        context.eTD()
-        context.bTD { nx = 5, style = "monobold" }
-            context.verbatim("\\righthyphenmin")
-        context.eTD()
-    context.eTR()
-    -- second header row: the \righthyphenmin values 1 to 5
-    context.bTR()
-        for right=1,5 do
-            context.bTD()
-                context.mono(right)
-            context.eTD()
-        end
-    context.eTR()
-    -- body: one row per \lefthyphenmin value, one cell per combination
-    for left=1,5 do
-        context.bTR()
-            context.bTD()
-                context.mono(left)
-            context.eTD()
-            for right=1,5 do
-                context.bTD()
-                    context("\\lefthyphenmin %s \\righthyphenmin %s \\hyphenatedword{interesting}",left,right)
-                context.eTD()
-            end
-        context.eTR()
-    end
-context.eTABLE()
-When \TEX\ breaks a paragraph into lines it will try to do so without hyphenation.
-When that fails (read: when the badness becomes too high) a next effort will take
-hyphenation into account. \footnote {Because in \LUATEX\ we always hyphenate
-there is no real gain in trying not to hyphenate. Because in traditional \TEX\
-hyphenation happens on the fly a pass without hyphenating makes more sense.} When
-the badness is still too high, an optional emergency pass can be made but only
-when the tolerances are set to permit this. In \CONTEXT\ you can try these
-settings when you get too many over- or underfull boxes reported on the console.
-Personally I tend to use the last setting, especially in automated flows. After
-all, \TEX\ will not apply stretch unless it's really needed.
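-Such settings are made with \typ {\setuptolerance}. As a sketch, and assuming
-that the options meant here are the standard \type {verytolerant} and \type
-{stretch} keywords, one could try:
-\starttyping
-\setuptolerance[verytolerant]
-\setuptolerance[verytolerant,stretch]
-\stoptyping
-The second line additionally permits an emergency pass with extra stretch.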
-The two \typ {\*hyphenmin} parameters can be set any time and the current value
-is stored with each character. They can also be set with the language which we
-will see later.
-When \TEX\ hyphenates words it has to decide where a word starts and ends. In
-traditional \TEX\ a word normally starts at a character that falls within the
-scope of the hyphenator. It ends when a box (hlist or vlist) is seen, but also
-at a rule, discretionary, accent (forget about this in \CONTEXT) or math. An
-example will be given in the chapter that discusses the \LUA\ alternative.
- todo
-Languages are one of the mechanisms where you can access the current state. There are
-for instance two (official) macros that contain the current (main) language:
-\NC \bf macro \NC \bf value \NC \NR
-\NC \type {\currentmainlanguage} \NC \currentmainlanguage \NC \NR
-\NC \type {\currentlanguage} \NC \currentlanguage \NC \NR
-When we have set \type {\language[nl]} we get this:
-\start \nl \getbuffer \stop
-If you write a style that needs to adapt to a language you can use modes. There
-are several ways to do this:
- \color[darkred]{main english}
- \color[darkred]{local english}
- \color[darkblue]{main dutch}
- \color[darkblue]{local dutch}
- [*en] {\color[darkgreen]{english set}}
- [*nl] {\color[darkgreen]{dutch set}}
-This typesets:
-\blank \startpacked \setupindenting[no] \getbuffer \stoppacked \blank
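-A sketch of how such tests can be coded, assuming the system modes \type {*en}
-(current language) and \type {**en} (main language); the colored text is just
-a placeholder:
-\starttyping
-\startmode[**en]
-    \color[darkred]{main english}
-\stopmode
-\startmode[*nl]
-    \color[darkblue]{local dutch}
-\stopmode
-\startmodeset
-    [*en] {\color[darkgreen]{english set}}
-    [*nl] {\color[darkgreen]{dutch set}}
-\stopmodeset
-\stoptyping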
-When you use setups you can use the following trick:
-\startsetups language:en
-    \color[darkorange]{something english}
-\stopsetups
-\startsetups language:nl
-    \color[darkorange]{something dutch}
-\stopsetups
-As expected we get:
-\blank \start \setupindenting[no] \getbuffer \stop \blank
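-A sketch of the call that selects such a setup, assuming the setups are named
-after the language tag as in the definitions above:
-\starttyping
-\setups[language:\currentlanguage]
-\stoptyping
-Because \typ {\currentlanguage} expands to the tag of the active language, the
-matching setup is picked without an explicit test.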