author | Context Git Mirror Bot <phg42.2a@gmail.com> | 2016-07-30 01:22:07 +0200 |
---|---|---|
committer | Context Git Mirror Bot <phg42.2a@gmail.com> | 2016-07-30 01:22:07 +0200 |
commit | 5135aef167bec739fe429e1aa987671768b237bc (patch) | |
tree | bd9f9696704e57c45f453bb7dc6becd5501cb657 /doc/context/sources/general/manuals/languages/languages-basics.tex | |
parent | 9d7c4ba8449bec1da920c01e24a17c41bbf2211d (diff) | |
download | context-5135aef167bec739fe429e1aa987671768b237bc.tar.gz |
2016-07-30 00:31:00
Diffstat (limited to 'doc/context/sources/general/manuals/languages/languages-basics.tex')
-rw-r--r-- | doc/context/sources/general/manuals/languages/languages-basics.tex | 348 |
1 files changed, 0 insertions, 348 deletions
diff --git a/doc/context/sources/general/manuals/languages/languages-basics.tex b/doc/context/sources/general/manuals/languages/languages-basics.tex
deleted file mode 100644
index 840897096..000000000
--- a/doc/context/sources/general/manuals/languages/languages-basics.tex
+++ /dev/null
@@ -1,348 +0,0 @@
-% language=uk
-
-\startcomponent languages-basics
-
-\environment languages-environment
-
-\startchapter[title=Some basics][color=darkyellow]
-
-\startsection[title={Introduction}]
-
-In this chapter we will see how we can toggle between languages. A first
-introduction to patterns will be given. Some details of how to control the
-hyphenation with specific patterns will be given in a later chapter.
-
-\stopsection
-
-\startsection[title={Available languages}]
-
-When you use the English version of \CONTEXT\ you will default to US English as
-main language. This means that hyphenation will be US specific, which by the way
-is different from the rules in GB. All labels that are generated by the system
-are also in English. Languages can often be accessed by names like \type
-{english} or \type {dutch} although it is quite common to use the short tags like
-\type {en} and \type {nl}. Because we want to be as compatible as possible with
-\MKII, there are quite a few synonyms. The following table lists the languages
-for which support is built|-|in.\footnote {More languages can be defined. It is
-up to users to provide the information.}
-
-\startbuffer
-\usemodule[languages-system]
-
-\loadinstalledlanguages
-\showinstalledlanguages
-\stopbuffer
-
-\getbuffer
-
-You can call up such a table with the following commands:
-
-\typebuffer
-
-Instead you can run \type {context --global languages-system.mkiv}.
-
-As you can see, many languages have hyphenation patterns but for Japanese,
-Korean, Chinese as well as Arabic languages they make no sense. The patterns are
-loaded on demand. The number is the internal number that is used in the engine; a
-user never has to use that number. Numbers $<1$ are used to disable hyphenation.
-The file tag is used to locate and load a specification. Such files have names
-like \type {lang-nl.lua}.
-
-Some languages share the same hyphenation patterns but can have demands that
-differ, like labels or quotes. The characters shown in the table are those found
-in the pattern files. The number of patterns differs a lot between languages.
-This relates to the system behind them. Some languages use word stems, others
-base their hyphenation on syllables. Some languages have inflections, which adds to
-the complexity while others can combine words in ways that demand special care
-for word boundaries. Of course a low or high number can signal a low quality as
-well, but most pattern collections are assembled over many years and updated when
-for instance spelling rules change. I think that we can safely say that most patterns
-are quite stable and of good quality.
-
-\stopsection
-
-\startsection[title=Switching]
-
-The document language is set with
-
-\starttyping
-\mainlanguage[en]
-\stoptyping
-
-but when you want to apply the proper hyphenation rules to an embedded language
-you can use:
-
-\starttyping
-\language[en]
-\stoptyping
-
-or just:
-
-\starttyping
-\en
-\stoptyping
-
-The main language determines what labels show up, how numbering happens, in what
-way dates get formatted, etc. Normally the \typ {\mainlanguage} command comes
-before the \typ {\starttext} command.
-
-\stopsection
-
-\startsection[title=Hyphenation]
-
-In \LUATEX\ each character that gets typeset not only carries a font id and character
-code, but also a language number. You can switch language whenever you want and
-the change will be carried with the characters. Switching within a word doesn't make
-sense but it is permitted:
-
-\starttabulate[|||T|]
-\NC 1 \NC \type{\de incrediblykompliziert} \NC \hyphenatedword{\de incrediblykompliziert} \NC \NR
-\NC 2 \NC \type{\en incrediblykompliziert} \NC \hyphenatedword{\en incrediblykompliziert} \NC \NR
-\NC 3 \NC \type{\en incredibly\de kompliziert} \NC \hyphenatedword{\en incredibly\de kompliziert} \NC \NR
-\NC 4 \NC \type{\en incredibly\de\-kompliziert} \NC \hyphenatedword{\en incredibly\de\-kompliziert} \NC \NR
-\NC 5 \NC \type{\en incredibly\de-kompliziert} \NC \hyphenatedword{\en incredibly\de-kompliziert} \NC \NR
-\stoptabulate
-
-In line 4 we have a \type {\-} between the two words, and in the last
-line just a \type {-}. If you look closely you will notice that the snippets
-can be quite small. If we typeset a word with a 1mm text width we get this:
-
-\blank \start \en \hsize 1mm incredibly \par \stop \blank
-
-If you are familiar with the details of hyphenation, you know that the number of
-characters at the end and beginning of a word is controlled by the two variables
-\typ {\lefthyphenmin} and \typ {\righthyphenmin}. However, these only influence
-the hyphenation process. What bits and pieces eventually end up on a line is
-determined by the par builder and there the \type {\hsize} matters. In practice
-you will not run into these situations, unless you have extremely long words and a
-narrow column.
-
-Hyphenation normally is limited to regular characters that make up the alphabet of
-a language. It is insensitive to capitalization as the following text shows:
-
-\blank
-
-\startnarrower
-\hyphenatedword {This time the musical distraction while developing code came
-from watching youtube performances of Cory Henry (also known from Snarky Puppy,
-a conglomerate of excellent players). Just search the web for his name with \quote
-{Stevie Wonder and Michael Jackson Tribute}. There is no keyboard he can't play.
-Another interesting keyboard player is Sun Rai (a short name for Rai
-Thistlethwayte, just google for \quote {The Beatles, Come Together, Live Piano
-Acoustic with Loop Pedal}, or do a combined search with \quote {Matt
-Chamberlain}). Okay, and talking of keyboards, let's not forget Vika Yermolyeva
-(vkgoeswild) as she's one of a kind too on the web. And then there is Jacob
-Collier, in one word: incredible (or hyphenated the Dutch way {\nl incredible},
-let me repeat that in French {\fr incredible}).} \footnote {Don't get me wrong, there
-are of course many more fantastic musicians.}
-\stopnarrower
-
-\blank
-
-Of course, names are often short and don't need to be hyphenated
-(or the left and right settings prohibit it). Another complication with names is
-that they can come from another language so we either need to switch language
-temporarily or we need to add an exception (more about that later).
-
-\stopsection
-
-\startsection[title=Primitives]
-
-In traditional \TEX\ the language is not a property of a character but is
-triggered by a signal in the (so called) list. Think of:
-
-\starttyping
-<language 1>this is <language 2>nederlands<language 1> mixed with english
-\stoptyping
-
-This number is set by the primitive \typ {\language}. Language triggers are
-injected into the list depending on the value of this number. There is also a \typ
-{\setlanguage} primitive that can inject triggers without setting the \typ
-{\language} number. Because in \LUATEX\ the state is kept with the character
-you don't need to worry about the subtle differences here.
-
-In \CONTEXT\ the \typ {\language} and \typ {\setlanguage} commands are overloaded
-by a more advanced switch macro. You cannot assume that they work as explained in
-general manuals about \TEX. Currently you can still assign a number but that
-might change. Just consider the language to be an abstraction and don't mess with
-this number. Both commands not only change the current language but also do
-specific initializations when needed.
-
-What characters get involved in hyphenation is historically determined by the so
-called \type {\lccode} values. Each character can have such a value which maps
-an uppercase to a lowercase character. This concept has been extended in \ETEX\
-where it binds to a pattern set (language). However, in \CONTEXT\ the user never
-has to worry about such details.
-
-% The \type {\patterns} primitive is
-% The \type {\hyphenation} primitive is
-
-In traditional hyphenation nothing gets hyphenated if the sum of \typ
-{\lefthyphenmin} and \typ {\righthyphenmin} exceeds 62. This limitation is not
-present in the to be presented \LUA\ variant of this routine as there is no
-good reason for this limitation other than implementation constraints.
-
-\stopsection
-
-\startsection[title=Control]
-
-We already mentioned \typ {\lefthyphenmin} and \typ {\righthyphenmin}. These
-two variables control the area in a word that is subjected to hyphenation.
-Setting these values is a matter of taste but making them too small can result in
-bad hyphenation when the patterns are made with the assumption that certain
-minima are used. Using a \typ {\lefthyphenmin} of 2 while the patterns are made
-with a value of 3 in mind is a bad idea.
-
-\startlinecorrection[blank]
-\startluacode
-context.bTABLE { option = "stretch", align= "middle" }
-    context.bTR()
-        context.bTD { ny = 2, align = "middle,lohi", style = "monobold" }
-            context.verbatim("\\lefthyphenmin")
-        context.eTD()
-        context.bTD { nx = 5, style = "monobold" }
-            context.verbatim("\\righthyphenmin")
-        context.eTD()
-    context.eTR()
-    context.bTR()
-        for right=1,5 do
-            context.bTD()
-                context.mono(right)
-            context.eTD()
-        end
-    context.eTR()
-    for left=1,5 do
-        context.bTR()
-            context.bTD()
-                context.mono(left)
-            context.eTD()
-            for right=1,5 do
-                context.bTD()
-                    context("\\lefthyphenmin %s \\righthyphenmin %s \\hyphenatedword{interesting}",left,right)
-                context.eTD()
-            end
-        context.eTR()
-    end
-context.eTABLE()
-\stopluacode
-\stoplinecorrection
-
-When \TEX\ breaks a paragraph into lines it will try to do so without hyphenation.
-When that fails (read: when the badness becomes too high) a next effort will take
-hyphenation into account. \footnote {Because in \LUATEX\ we always hyphenate
-there is no real gain in trying not to hyphenate. Because in traditional \TEX\
-hyphenation happens on the fly a pass without hyphenating makes more sense.} When
-the badness is still too high, an optional emergency pass can be made but only
-when the tolerances are set to permit this. In \CONTEXT\ you can try these
-settings when you get too many over- or underfull boxes reported on the console.
-
-\starttyping
-\setupalign[tolerant]
-\setupalign[verytolerant]
-\setupalign[verytolerant,stretch]
-\stoptyping
-
-Personally I tend to use the last setting, especially in automated flows. After
-all, \TEX\ will not apply stretch unless it's really needed.
-
-The two \typ {\*hyphenmin} parameters can be set any time and the current value
-is stored with each character. They can also be set with the language which we
-will see later.
-
-When \TEX\ hyphenates words it has to decide where a word starts and ends. In
-traditional \TEX\ a word normally starts at a character that falls within the
-scope of the hyphenator. It ends when a box (hlist or vlist) is seen, but also
-at a rule, discretionary, accent (forget about this in \CONTEXT) or math. An
-example will be given in the chapter that discusses the \LUA\ alternative.
-
-\stopsection
-
-\startsection[title=Installing]
-
-    todo
-
-\stopsection
-
-\startsection[title=Modes]
-
-Languages are one of the mechanisms where you can access the current state. There are
-for instance two (official) macros that contain the current (main) language:
-
-\startbuffer
-\starttabulate[||Tc|]
-\HL
-\NC \bf macro \NC \bf value \NC \NR
-\HL
-\NC \type {\currentmainlanguage} \NC \currentmainlanguage \NC \NR
-\NC \type {\currentlanguage} \NC \currentlanguage \NC \NR
-\HL
-\stoptabulate
-\stopbuffer
-
-\getbuffer
-
-When we have set \type {\language[nl]} we get this:
-
-\start \nl \getbuffer \stop
-
-If you write a style that needs to adapt to a language you can use modes. There
-are several ways to do this:
-
-\startbuffer
-\language[nl]
-
-\startmode[**en]
-    \color[darkred]{main english}
-\stopmode
-
-\startmode[*en]
-    \color[darkred]{local english}
-\stopmode
-
-\startmode[**nl]
-    \color[darkblue]{main dutch}
-\stopmode
-
-\startmode[*nl]
-    \color[darkblue]{local dutch}
-\stopmode
-
-\startmodeset
-    [*en] {\color[darkgreen]{english set}}
-    [*nl] {\color[darkgreen]{dutch set}}
-\stopmodeset
-\stopbuffer
-
-\typebuffer
-
-This typesets:
-
-\blank \startpacked \setupindenting[no] \getbuffer \stoppacked \blank
-
-When you use setups you can use the following trick:
-
-\startbuffer
-\language[nl]
-
-\startsetups language:en
-    \color[darkorange]{something english}
-\stopsetups
-
-\startsetups language:nl
-    \color[darkorange]{something dutch}
-\stopsetups
-
-\setups[language:\currentlanguage]
-\stopbuffer
-
-\typebuffer
-
-As expected we get:
-
-\blank \start \setupindenting[no] \getbuffer \stop \blank
-
-\stopsection
-
-\stopchapter
-
-\stopcomponent
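
The removed chapter describes three mechanisms: switching the language, the two hyphenation minima, and language-dependent setups. They can be tried together in one small test file. What follows is a minimal sketch, not part of the commit, assuming a ConTeXt MkIV installation with English and Dutch patterns available. It uses only commands that appear in the removed text; the sample words and the content of the two setups are arbitrary.

```tex
% Minimal sketch, assuming ConTeXt MkIV; sample words are arbitrary.
\mainlanguage[en] % labels, numbering and date formats follow English

\startsetups language:en
    something english
\stopsetups

\startsetups language:nl
    something dutch
\stopsetups

\starttext

% main language: the English patterns apply
\hyphenatedword{incredibly} \par

% local switch, kept inside a group so it ends at \stop
\start \language[nl]
    \hyphenatedword{ongelooflijk} \par
    % \currentlanguage expands to the active tag, which selects the setup
    \setups[language:\currentlanguage] \par
\stop

% raising the minima restricts the part of the word open to hyphenation
\start \lefthyphenmin 3 \righthyphenmin 3
    \hyphenatedword{interesting} \par
\stop

\stoptext
```

Keeping each switch between \type{\start} and \type{\stop} mirrors the grouping used in the chapter's own examples, so the main language is restored without an explicit switch back.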