Thanks in particular to its math typesetting quality and add-on packages such as LaTeX and AMS-TeX, TeX is the de facto standard system for typesetting scientific and mathematical publications. Because TeX is programmable, it supports standardized layouts and batch processing (e.g. for automatic data conversion), and it offers a high degree of flexibility and scalability.
Author data or: There's TeX and there's TeX
The fact that TeX is so powerful means that authors use self-defined, sometimes unorthodox markup. Added to this is a problem that TeX shares with other word-processing and DTP programs: it allows authors to format manuscripts with purely visual commands (bold, 14 pt, vertical space; superscript 12 followed by an upright C; freely formatted reference) instead of marking up the logical structure (chapter heading; chemical symbol; structured reference). This makes standardized formatting and data conversion more difficult.
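The difference can be illustrated with a minimal LaTeX sketch; the `\isotope` macro here is hypothetical, standing in for a command a house style file might provide:

```latex
% Visual markup -- the structure is invisible to conversion tools:
{\bfseries\fontsize{14pt}{17pt}\selectfont Introduction}\par
$^{12}$C  % superscript 12 followed by an upright C

% Semantic markup -- the structure survives conversion to XML:
\newcommand{\isotope}[2]{${}^{#1}$#2}  % hypothetical house-style macro
\section{Introduction}
\isotope{12}{C}
```

Both variants can print identically, but only the semantic version tells a converter that the first line is a heading and the second a chemical symbol.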
More and more publishers have come to recognize their content as an asset that can be reused in ways that extend beyond the latest printed edition. These include data import into specialized databases, the sale of linked e-books and HTML renditions (including by individual chapter), the publication of purely electronic interim editions, and the long-term archiving and reuse of content.
TeX is better suited to this type of extended content reuse than many other data formats. Even mathematical, scientific, and technical works with heavy math content can be edited directly in TeX by the author of a follow-up edition. (Many authors may prefer to edit the data in Word; in that case a conversion is always required, because typesetting math-heavy works in Word is not recommended.)
To ensure the content is rendered in a standardized way and is reusable, a process of normalization is required.
le-tex uses the following tools for efficient normalization of TeX and other input (e.g. Word or plain text from OCR):
- TeX macro packages for semantic markup (e.g. science.sty),
- XEmacs macro packages for automatic or semi-automatic structuring (references, numerical values and physical units, chemical formulae, etc.),
- Conversion tools from TeX to various XML document types, validating the TeX input during conversion and validating the XML output against a DTD or XML schema,
- Creation of macro packages for publishers to release to authors,
- Author consulting, and
- Not least, many experienced employees, including numerous academics, who keep a close eye on the whole process and content lifecycle.
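As an illustration of the structuring step, normalization might turn visually keyed numerical values and units into semantic markup, for instance with the siunitx package (the actual target macros depend on the publisher's style files):

```latex
% Before normalization: visually keyed value/unit pairs, e.g.
%   3.5mm, 25 °C, 9.81 m/s$^2$

% After normalization: semantic markup via siunitx
\usepackage{siunitx}
% ...
\SI{3.5}{\milli\metre}, \SI{25}{\celsius},
\SI{9.81}{\metre\per\second\squared}
```

The semantic form typesets consistently and can be converted mechanically, e.g. to MathML or database fields.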
TeX can typeset XML data directly, in the accustomed TeX quality. This makes it far superior to all XSL-FO engines. It is therefore suitable as a print backend for loose-leaf publications, full-text XML journals, and every other type of structured or semi-structured content. le-tex typesets numerous journals and books in TeX directly from XML sources and is therefore at the cutting edge when it comes to productive application of xmltex.
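With xmltex, the mapping from XML elements to TeX code is declared in a configuration (.xmt) file using `\XMLelement{element}{attribute-spec}{begin-code}{end-code}`. A minimal, hypothetical sketch (the element names `title` and `para` are assumptions about the input document type):

```latex
% Hypothetical xmltex configuration fragment.
% \xmlgrab in the begin-code captures the element's content,
% which is then available as #1 in the end-code.
\XMLelement{title}
  {}
  {\xmlgrab}
  {\section{#1}}
\XMLelement{para}
  {}
  {}
  {\par}
```

With such mappings in place, the XML file itself is the typesetting source; no intermediate LaTeX file is generated.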
xmltex and LaTeX can handle any Unicode character, provided it is contained in a font. However, if the resulting PDF also needs to contain the characters at their correct Unicode code points (e.g. so that linguistic texts remain correctly searchable), XeTeX with Unicode fonts can be used as an alternative. In this case the documents are typically not typeset from XML with TeX directly; instead, an upstream XSLT processor converts them into UTF-8-encoded LaTeX source as part of a stable, makefile-controlled workflow.
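A minimal XeLaTeX sketch of this setup, using the fontspec package (the font name is an assumption; any installed Unicode OpenType font works):

```latex
% Compile with xelatex. Characters land at their correct
% Unicode code points in the PDF and remain searchable.
\documentclass{article}
\usepackage{fontspec}
\setmainfont{Linux Libertine O}  % assumed to be installed
\begin{document}
Greek (λόγος) and IPA (ˈfəʊniːm) can be entered
directly as UTF-8 in the source.
\end{document}
```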
Books and articles typeset in TeX need not be recognizable as such. Many people associate TeX with the typical look of single-column text in Computer Modern fonts, as produced by the standard LaTeX document classes. Examples such as Informatik Spektrum (IT Spectrum) or the Springer Handbook series show how le-tex creates sophisticated layouts with multiple columns, customized fonts, and spot colors in TeX for numerous journals and books.