Copyright (c) 2006, David Harvey. Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation; with no Invariant Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license, and the LATEX source for this manual, is included in the blahtex source distribution.
This is the manual for blahtex version 0.4.4. The most up-to-date information about blahtex, including a PDF version of this document, is available at http://web.archive.org/web/20070503042839/http://www.blahtex.org/.
Blahtex is a free software tool/library that translates TEX markup into MathML markup. It is also capable of generating PNG format images, using some external tools (LATEX and dvipng).
Blahtex is not designed to process entire TEX documents. Rather, it focuses on the mathematical capabilities of the TEX language, processing only a single equation at a time. It is designed to provide mathematical support to a larger document markup system. Currently, the main target platform is MediaWiki -- the software that powers Wikipedia and many other wikis -- but blahtex has been designed with flexibility of integration in mind.
Blahtex concentrates on matching the appearance of TEX output, as far as this is possible given the fonts available to the MathML renderer. It only outputs Presentation MathML, not Content MathML. Blahtex is aware of at least some of TEX's rules concerning spacing and fonts. For example, it knows about `atom flavours' (like ord, rel, op, etc) and TEX's algorithms for determining the amount of space between them.
Blahtex implements some subset of TEX, LATEX and AMS-LATEX, including almost all of the symbols. A complete list of supported and quasi-supported commands can be found in Section 2.
Blahtex is internally Unicode-based. Non-ASCII characters may be used in text mode (e.g. within \text{...} blocks). These will be handled correctly for MathML output. For PNG output, blahtex can currently handle some extended Latin characters (see Section 2.21), and there is experimental support for Cyrillic and Japanese. More scripts may be added in the future.
Blahtex is open source software. The source code is released under the GNU GPL (General Public License). This means that although the source is copyrighted, you may modify it, use it in your own programs, or even sell it, as long as you adhere to the GPL.
Blahtex is written in C++. It compiles on Linux and Mac OS X systems, but probably is not as portable as it could be (see Section 3.1).
Blahtex obviously owes a lot to texvc, the software presently used by MediaWiki to handle TEX input, written by Tomasz Wegrzanowski.
Blahtex is a work in progress. I hereby solicit your feedback, to help me improve it as much as possible.
(It has not escaped the author's attention that every paragraph of this section either begins or ends with the word `blahtex'.)
In the beginning there was TEX. Later, we also met LATEX, and ConTeXt, teTeX, MiKTeX, blah blah blah...
There are a variety of other TEX-to-MathML converters available. The MathML home page (http://www.w3.org/Math/) has quite a long list. Here are a few that have online demos available:
They have their pros and cons, as does blahtex. I happen to think blahtex is rather good, but of course I am biased :-) Feel free to disagree. Please let me know if you think blahtex is no good, and why it's no good, so that maybe I can fix it. (Also, let me know if you think it's great!)
Thanks to the crew at Wikipedia, for pioneering such a fabulous resource, especially the regulars at WikiProject Mathematics.
Thanks to Jitse Niesen for his ongoing work on integrating blahtex into MediaWiki (currently on show at wiki.blahtex.org), and for generally being very supportive of this project.
Blahtex supports some subset of TEX, LATEX and AMS-LATEX. This section gives a complete list of supported commands, together with some comments where the support is known to be incomplete.
Blahtex supports \newcommand, including arguments (but not optional arguments).
Blahtex protects against a malicious user eliciting exponential time via recursive macros, by imposing a hard limit on the amount of macro processing that can occur.
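The idea of a hard limit can be sketched in a few lines of Python. This is only an illustration of the principle, not blahtex's actual algorithm: the function name, the exception and the budget of 1000 steps are all assumptions, and only argument-less macros are handled.

```python
import re


class MacroBudgetExceeded(Exception):
    """Raised when expansion needs more steps than the budget allows."""


def expand_macros(source, macros, max_steps=1000):
    """Expand argument-less macros in `source`, counting every substitution."""
    command = re.compile(r"\\([A-Za-z]+)")
    steps = 0
    pos = 0
    while True:
        match = command.search(source, pos)
        if match is None:
            return source
        body = macros.get(match.group(1))
        if body is None:
            pos = match.end()      # not a user macro: leave it in place
            continue
        steps += 1
        if steps > max_steps:
            raise MacroBudgetExceeded("more than %d expansions" % max_steps)
        # Splice in the body and rescan from the same position, so macros
        # inside the body are expanded (and counted) as well.
        source = source[:match.start()] + body + source[match.end():]
        pos = match.start()
```

A chain of macros in which each one calls the next one twice needs a number of expansions that doubles with every link; with the counter in place, such input fails quickly instead of tying up the process.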
Note that \newcommand is not local to blocks, as is the case in TEX. For example, {\newcommand{\abc}{xyz}} \abc is legal in blahtex, but not in TEX, because TEX only remembers the definition of \abc within the outermost {...} block.
Clearly \newcommand is not very useful for an individual equation. In a larger document markup system, a good approach might be to provide a facility for specifying a document-wide collection of macros; the software would then automatically prepend the relevant \newcommands to each equation in which a macro needs to be available. It is not clear at this stage whether this model would be technically feasible in MediaWiki.
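A minimal Python sketch of the document-wide macro model might look as follows. Everything here is hypothetical: the function name, the LIBRARY dictionary and the three example macros are invented for illustration, and nothing of the sort exists in blahtex or MediaWiki.

```python
import re

COMMAND = re.compile(r"\\([A-Za-z]+)")


def with_macros(equation, library):
    """Return `equation` prefixed by the library definitions it relies on."""
    needed = []                      # macro names, in order of first use
    pending = COMMAND.findall(equation)
    while pending:
        name = pending.pop(0)
        if name in needed or name not in library:
            continue
        needed.append(name)
        # A definition may itself call other library macros.
        pending.extend(COMMAND.findall(library[name]))
    return "".join(library[name] for name in needed) + equation


# Hypothetical document-wide macro collection.
LIBRARY = {
    "eps": r"\newcommand{\eps}{\varepsilon}",
    "norm": r"\newcommand{\norm}[1]{\lVert #1 \rVert}",
    "unit": r"\newcommand{\unit}[1]{\frac{#1}{\norm{#1}}}",
}
```

Calling with_macros(r"\unit{v}", LIBRARY) yields the definitions of \unit and \norm followed by the equation itself; \eps is left out because nothing refers to it. The resulting string is what would be handed to blahtex.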
\begin{XYZ} ... \end{XYZ}, where XYZ is one of:
matrix pmatrix bmatrix Bmatrix vmatrix Vmatrix cases aligned smallmatrix
\sqrt (including with optional argument) \substack \overset \underset \not
When it encounters \not, blahtex will attempt to find a MathML character that directly corresponds to the negation of any operator appearing after \not. Failing that, it will try to draw an ordinary slash in the right place, using the MathML <mpadded> element to fudge things.
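One way to look for a directly corresponding negated character is to lean on Unicode itself: many negated operators are defined as the base operator followed by U+0338 (combining long solidus overlay), so NFC normalisation composes the pair when a precomposed character exists. The Python sketch below uses that property. It is an assumption-laden illustration, not a description of how blahtex performs the lookup, and the function name is invented.

```python
import unicodedata

COMBINING_LONG_SOLIDUS = "\u0338"


def negated_character(operator):
    """Return the single precomposed negation of `operator`, or None."""
    composed = unicodedata.normalize("NFC", operator + COMBINING_LONG_SOLIDUS)
    # If no precomposed form exists, the two characters stay separate.
    return composed if len(composed) == 1 else None
```

For "=" this returns U+2260 (not equal to) and for U+2208 (element of) it returns U+2209. For "+" it returns None, which is the case where a converter has to fall back on drawing the slash itself.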
Blahtex supports \color{X}, where X is one of the following named colours:
GreenYellow Yellow yellow Goldenrod Dandelion Apricot Peach Melon YellowOrange Orange BurntOrange Bittersweet RedOrange Mahogany Maroon BrickRed Red red OrangeRed RubineRed WildStrawberry Salmon CarnationPink Magenta magenta VioletRed Rhodamine Mulberry RedViolet Fuchsia Lavender Thistle Orchid DarkOrchid Purple Plum Violet RoyalPurple BlueViolet Periwinkle CadetBlue CornflowerBlue MidnightBlue NavyBlue RoyalBlue Blue blue Cerulean Cyan cyan ProcessBlue SkyBlue Turquoise TealBlue Aquamarine BlueGreen Emerald JungleGreen SeaGreen Green green ForestGreen PineGreen LimeGreen YellowGreen SpringGreen OliveGreen RawSienna Sepia Brown Tan Gray Black black White white
At this time there is no support for colour models, so you can't do things like \color[rgb]{0.2,0.3,0.4}.
There are some subtle bugs in the parsing of \color commands. Things like \overset{a}{\color{blue}x} are not legal in LATEX, for reasons I haven't yet fully investigated; blahtex still accepts them.
\text \textit \textbf \textrm \texttt \textsf \emph \hbox \mbox
The command \hbox doesn't really behave like it should, because MathML doesn't really have a notion of `horizontal box'. Blahtex treats \hbox essentially equivalently to \text, with slightly different formatting rules. Things like \hbox to 12pt are not supported.
\frac \cfrac \over \binom \choose \atop
\left \right \big \Big \bigg \Bigg \bigl \Bigl \biggl \Biggl \bigr \Bigr \biggr \Biggr
\mathop \mathrel \mathord \mathbin \mathopen \mathclose \mathpunct \mathinner
\limits \nolimits \displaylimits
\, \! \ \; \> \quad \qquad
\hat \widehat \dot \ddot \bar \overline \underline \overbrace \underbrace \overleftarrow \overrightarrow \overleftrightarrow \check \acute \grave \vec \breve \tilde \widetilde
\mathbf \mathbb \mathrm \mathit \mathcal \mathfrak \mathsf \mathtt \boldsymbol \rm \bf \it \cal \tt \sf \Bbb \bold
\displaystyle \textstyle \scriptstyle \scriptscriptstyle
\operatorname \operatornamewithlimits \lim \sup \inf \limsup \liminf \injlim \projlim \varlimsup \varliminf \varinjlim \varprojlim \min \max \gcd \det \Pr \ker \hom \dim \arg \sin \cos \sec \csc \tan \cot \arcsin \arccos \arctan \sinh \cosh \tanh \coth \log \lg \ln \exp \deg \mod \bmod \pmod
\_ \& \$ \# \% \{ \}
\alpha \beta \gamma \delta \epsilon \varepsilon \zeta \eta \vartheta \theta \iota \kappa \varkappa \lambda \mu \nu \pi \varpi \rho \varrho \sigma \varsigma \tau \upsilon \phi \varphi \chi \psi \omega \xi \digamma \Gamma \Delta \Theta \Lambda \Pi \Sigma \Upsilon \Phi \Psi \Omega \Xi
\ast \implies \neg \ne \ge \le \land \lor \gets \to \vert \lvert \rvert \Vert \lVert \rVert \lfloor \rfloor \lceil \rceil \lbrace \rbrace \langle \rangle \lbrack \rbrack \aleph \beth \gimel \daleth \wp \ell \P \imath \forall \exists \Finv \Game \partial \Re \Im \leftarrow \rightarrow \longleftarrow \longrightarrow \Leftarrow \Rightarrow \Longleftarrow \Longrightarrow \mapsto \longmapsto \leftrightarrow \Leftrightarrow \longleftrightarrow \Longleftrightarrow \uparrow \Uparrow \downarrow \Downarrow \updownarrow \Updownarrow \searrow \nearrow \swarrow \nwarrow \hookrightarrow \hookleftarrow \upharpoonright \upharpoonleft \downharpoonright \downharpoonleft \rightharpoonup \rightharpoondown \leftharpoonup \leftharpoondown \nleftarrow \nrightarrow \supset \subset \supseteq \subseteq \sqsupset \sqsubset \sqsupseteq \sqsubseteq \supsetneq \subsetneq \in \ni \notin \iff \mid \sim \simeq \approx \propto \equiv \cong \neq \ll \gg \geq \leq \triangleleft \triangleright \trianglelefteq \trianglerighteq \models \vdash \Vdash \vDash \lesssim \nless \ngeq \nleq \times \div \wedge \vee \oplus \otimes \cap \cup \sqcap \sqcup \smile \frown \smallsmile \smallfrown \setminus \smallsetminus \And \star \triangle \wr \infty \circ \hbar \lnot \nabla \prime \backslash \pm \mp \emptyset \varnothing \S \angle \colon \Diamond \nmid \square \Box \checkmark \complement \eth \hslash \mho \flat \sharp \natural \bullet \dagger \ddagger \clubsuit \spadesuit \heartsuit \diamondsuit \top \bot \perp \ldots \cdot \cdots \vdots \ddots \dots \dotsb \circledR \yen \maltese \circledS \Bbbk \jmath \ulcorner \urcorner \llcorner \lrcorner \dashrightarrow \dashleftarrow \backprime \vartriangle \blacktriangle \triangledown \blacktriangledown \blacksquare \lozenge \blacklozenge \bigstar \sphericalangle \measuredangle \dotplus \ltimes \rtimes \Cap \leftthreetimes \rightthreetimes \Cup \barwedge \curlywedge \veebar \curlyvee \doublebarwedge \boxminus \circleddash \boxtimes \circledast \boxdot \circledcirc \boxplus 
\centerdot \divideontimes \intercal \leqq \geqq \leqslant \geqslant \eqslantless \eqslantgtr \gtrsim \lessapprox \gtrapprox \approxeq \eqsim \lessdot \gtrdot \lll \ggg \lessgtr \gtrless \lesseqgtr \gtreqless \lesseqqgtr \gtreqqless \doteqdot \eqcirc \risingdotseq \circeq \fallingdotseq \triangleq \backsim \thicksim \backsimeq \thickapprox \subseteqq \supseteqq \Subset \Supset \preccurlyeq \succcurlyeq \curlyeqprec \curlyeqsucc \precsim \succsim \precapprox \succapprox \Vvdash \shortmid \shortparallel \bumpeq \between \Bumpeq \varpropto \backepsilon \blacktriangleleft \blacktriangleright \therefore \because \ngtr \nleqslant \ngeqslant \nleqq \ngeqq \lneqq \gneqq \lvertneqq \gvertneqq \lnsim \gnsim \lnapprox \gnapprox \nprec \nsucc \npreceq \nsucceq \precneqq \succneqq \precnsim \succnsim \precnapprox \succnapprox \nsim \ncong \nshortmid \nshortparallel \nmid \nparallel \nvdash \nvDash \nVdash \nVDash \ntriangleleft \ntriangleright \ntrianglelefteq \ntrianglerighteq \nsubseteq \nsupseteq \nsubseteqq \nsupseteqq \subsetneq \supsetneq \varsubsetneq \varsupsetneq \subsetneqq \supsetneqq \varsubsetneqq \varsupsetneqq \leftleftarrows \rightrightarrows \leftrightarrows \rightleftarrows \Lleftarrow \Rrightarrow \twoheadleftarrow \twoheadrightarrow \leftarrowtail \rightarrowtail \looparrowleft \looparrowright \leftrightharpoons \rightleftharpoons \curvearrowleft \curvearrowright \circlearrowleft \circlearrowright \Lsh \Rsh \upuparrows \downdownarrows \multimap \rightsquigarrow \leftrightsquigarrow \nLeftarrow \nRightarrow \nleftrightarrow \nLeftrightarrow \pitchfork \nexists \lhd \rhd \unlhd \unrhd \leadsto \uplus \diamond \bigtriangleup \bigtriangledown \ominus \oslash \odot \bigcirc \amalg \prec \succ \preceq \succeq \dashv \asymp \doteq \parallel \bowtie \surd \doublecap \restriction \llless \gggtr \Doteq \doublecup \dasharrow \vartriangleleft \vartriangleright \Join
\sum \prod \int \iint \iiint \iiiint \oint \bigcap \bigodot \bigcup \bigotimes \coprod \bigsqcup \bigoplus \bigvee \biguplus \bigwedge
\O \" \' \textbackslash \textvisiblespace \textasciicircum \textasciitilde
If the magic command \strictspacing occurs anywhere in the input, blahtex will switch to `strict spacing mode' for the entire equation. This overrides the command-line -spacing setting.
Blahtex will serenely transcribe any non-ASCII characters for MathML output, as long as they appear in text mode (for example, surrounded by \text{...}). For PNG output, things are more difficult, because LATEX needs special packages and fonts available. At a minimum, the blahtex command line option -use-ucs-package must be used. The following sections describe which characters are permitted for PNG output.
The following characters are handled directly by the LATEX ucs package.
¡ £ § © ¬ ® ° µ ¶ ¿ À Á Â Ã Ä Å Æ Ç È É Ê Ë Ì Í Î Ï Ñ Ò Ó Ô Õ Ö × Ø Ù Ú Û Ü Ý ß à á â ã ä å æ ç è é ê ë ì í î ñ ò ó ô õ ö ÷ ø ù ú û ü ý ÿ Ā ā Ă ă Ć ć Ĉ ĉ Ċ ċ Č č Ď ď Ē ē Ĕ ĕ Ė ė Ě ě Ĝ ĝ Ð ð Ġ ġ Ģ Ĥ ĥ Ĩ ĩ Ī ī Ĭ ĭ Ý ý Ĵ ĵ Ķ ķ Ĺ ĺ Ļ ļ Ľ ľ Ł ł Ń ń Ņ ņ Ň ň Ō ō Ŏ ŏ Ő ő Œ œ Ŕ ŕ Ŗ ŗ Ř ř Ś ś Ŝ ŝ Þ þ Š š Ţ ţ Ť ť Ũ ũ Ū ū Ŭ ŭ Ů ů Ű ű Ŵ ŵ Ŷ ŷ Ÿ Ź ź Ż ż Ž ž Ǎ ǎ Ǐ ǐ Ǒ ǒ Ǔ ǔ Ǣ ǣ Ǧ ǧ Ǩ ǩ ǰ Ǵ ǵ Ǹ ǹ Ǽ ǽ Ǿ ǿ Ș ș Ț ț Ȟ ȟ Ȧ ȧ Ȩ ȩ Ȯ ȯ Ȳ ȳ
Currently blahtex does not recognise TEX's accent commands (like \"o), so it is necessary to enter characters requiring accents directly in UTF-8.
Blahtex experimentally supports Cyrillic characters, by using LATEX's fontenc package with the X2 font encoding. Input must be entered in UTF-8, and surrounded by the (nonstandard) \cyr{...} command. Commands like \CYRSHA are not supported. Only the basic Cyrillic alphabet is supported, which as far as I can tell is sufficient for Russian.
Disclaimer: I don't know anything about Cyrillic, or any languages that use it. If I've messed something up, your advice would be appreciated.
Blahtex experimentally supports Japanese (Kanji, Hiragana, Katakana) by using the LATEX CJK package. Input must be entered in UTF-8, and surrounded by the (nonstandard) \jap{...} command. The command-line option -use-cjk-package must be used. Additionally, the TEX system must have a Japanese font installed, and blahtex needs to be informed via the command-line option -japanese-font.
Disclaimer: I don't know anything about the Japanese language or writing system. If I've messed something up, your advice would be appreciated.
Blahtex supports many TEX/LATEX/AMS-LATEX commands not supported by texvc, especially many of the symbols in AMS-LATEX.
The main feature of texvc that is missing in blahtex is support for HTML output. This may or may not be added in future.
Blahtex has much more robust syntax error reporting than texvc. Rather than a handful of generic error messages, blahtex can generate a wide variety of more detailed error messages to help the user diagnose the problem.
Blahtex generally achieves much higher compatibility with TEX's parsing than texvc does. Texvc is generally more permissive. For example, the following are legal in texvc, but in TEX and blahtex they require additional grouping braces:
The characters $ and % are legal in texvc, but are illegal in blahtex. (Of course \$ and \% are available.)
These parsing differences may cause problems in replacing texvc with blahtex in an existing MediaWiki installation, since some legacy equations may not be compatible with blahtex. Preliminary research suggests that about 0.5% of equations on Wikipedia itself (including the ten largest language Wikipedias) would be affected.
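When surveying legacy equations for this kind of incompatibility, the bare $ and % characters are the easiest case to detect mechanically. The Python sketch below reports their positions; it is a hypothetical helper for such a survey (the function name is invented) and does not cover the grouping differences mentioned above.

```python
def unescaped_specials(tex):
    """Return (index, character) pairs for every bare $ or % in `tex`."""
    found = []
    i = 0
    while i < len(tex):
        ch = tex[i]
        if ch == "\\":
            i += 2          # skip the backslash and the character it escapes
            continue
        if ch in "$%":
            found.append((i, ch))
        i += 1
    return found
```

An equation for which this returns an empty list can still fail for other reasons, so the result is best read as a first filter rather than a compatibility test.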
Blahtex has a command-line option (-texvc-compatible-commands) that enables all of the nonstandard commands in texvc's dialect of TEX; that is, commands which are not present in TEX, LATEX, or AMS-LATEX. It appears that most of these commands were added to texvc to make life easier for people familiar with HTML entities; for example, \isin is a texvc synonym for the standard \in. This option should be useful for backward compatibility with existing equations in databases like Wikipedia. Here is the complete list:
\R \Reals \reals \Z \N \natnums \Complex \cnums \alefsym \alef \larr \rarr \Larr \lArr \Rarr \rArr \uarr \uArr \Uarr \darr \dArr \Darr \lrarr \harr \Lrarr \Harr \lrArr \hArr \sub \supe \sube \infin \lang \rang \real \image \bull \weierp \isin \plusmn \Dagger \exist \sect \clubs \spades \hearts \diamonds \sdot \ang \thetasym \Alpha \Beta \Epsilon \Zeta \Eta \Iota \Kappa \Mu \Nu \Rho \Tau \Chi \arcsec \arccsc \arccot \sgn
Also included are the four commands \empty, \and, \or, \part. These commands are part of TEX/LATEX/AMS-LATEX, but they do not do what texvc thinks they should do! Blahtex emulates texvc's behaviour for these commands (assuming that the -texvc-compatible-commands option is active).
The blahtex source code is available from www.blahtex.org. No binaries will be made available. All official releases should have been signed with a PGP key whose ID is 0x6269E206 and whose fingerprint is 9A51 0B6A B144 6A4D E1E5 0DE6 D604 6405 6269 E206. This key is valid until 2nd August 2007. You can either get it from the blahtex website, or try searching for `blahtex' on a public keyserver.
Besides reading this document, the interested developer is strongly advised to ``use the source''.
Blahtex has been successfully compiled and run on the following configurations:
Some of the source files seem to need a bit of memory to compile. I had trouble with -O3 level optimisation on an older machine with 256MB RAM. It should be fine with 512MB or above.
Other UNIX-based systems might work too. You will probably encounter problems with compilers other than gcc, or with older versions of gcc. (Probably gcc 3.3 is still okay.) I have personally met at least one older Solaris compiler that couldn't stomach the code. Your compiler must support wstring and 32-bit wchar_ts. If you want to compile it on MS Windows... good luck, let me know how it goes.
You will need an installation of the GNU iconv library. On some systems this is preinstalled, so you don't need to do anything. On my Mac I needed to install it (for example via fink).
To generate PNGs, you will need LATEX and the dvipng utility, which is included in many LATEX distributions. Blahtex assumes that the following LATEX packages are available: color, fontenc, inputenc, amsmath, amsfonts, amssymb. All of these packages are included in teTeX, one of the most popular TEX distributions for UNIX systems.
Additionally, to handle non-ASCII characters, the ucs package must be installed, and blahtex must be informed by using the -use-ucs-package command line option. To enable computation of height and depth of the output PNG image, the preview package must be installed, and blahtex must be informed by using the -use-preview-package option.
The version of dvipng running on the blahtex website is a slightly modified version of dvipng 1.7. The modification pertains to the automatic hinting method used with the underlying FreeType 2 library, and was made with the help of the author of dvipng, Jan-Åke Larsson (thanks Jan-Åke!).
It's quite simple: in the source file ft.c, just replace FT_LOAD_NO_HINTING by FT_LOAD_TARGET_LIGHT, and recompile. The author has indicated that this modification will appear in dvipng version 1.8.
To handle Japanese, the LATEX CJK package must be installed, and a Japanese font must be installed.
Warning: Installing TrueType CJK fonts for use by LATEX/dvipng is a dark art. In this section I will describe a sequence of steps that worked for me. I will explain along the way what I believe the purpose of each step to be, and caveats that you should be aware of. However, this should not be construed to imply that I have any idea at all of what I am talking about.
You will need a Japanese TrueType font. For testing, I have been using the Sazanami gothic font: http://sourceforge.jp/projects/efont/files/. Look inside for the TrueType font file sazanami-gothic.ttf.
Warning: I have not read the license document for this font. It is mostly in Japanese. It is quite possible that it is not legal to use this font for certain purposes. Since it is advertised as being targeted at OpenOffice, I expect that all is okay, but I am not a lawyer.
The strategy outlined below is to convert the TrueType font to a bunch of smaller Type 1 fonts, and to provide enough other information to make LATEX and dvipng happy.
You will need FontForge, from http://fontforge.sourceforge.net/. (Note that to install FontForge on Mac OS X, you will need the StuffIt Expander utility to decompress the installation package. StuffIt Expander was included in Mac OS 10.3.x, but is not shipped with Mac OS 10.4.x. I had a copy available from an older OS, but if you have only OS 10.4.x, you will need to download StuffIt Expander from http://www.stuffit.com/mac/expander/. Also on the Mac you need to make sure that you have an X11 server available. On Mac OS 10.4.x it should be pre-installed in /Applications/Utilities/X11.App. On earlier versions you may need to download X11 from Apple's website.)
Create a temporary working directory somewhere, which I will refer to in these instructions as /temp.
You need to select a name for your font. Probably best to keep it very short. I will use the name `saza' throughout the following example; you will need to replace every `saza' with whatever you have chosen.
Boot up X11, and run FontForge. You should get an `Open Font' dialog; open the ttf file from above. Then select `Generate Fonts...' from the File menu. Navigate to your /temp directory; this is where the output from the `generate fonts' process will be saved. On the drop-down list on the left, select `PS Type 1 (Multiple)'. (The point here is to split the font up into many smaller sub-fonts. This is necessary because TEX can only really work with fonts that contain at most 256 symbols, and CJK fonts have many more than that.) The default file name will be something like sazanami-gothic%s.pfb; change this to saza-uni%s.pfb. Now press `Options', and make sure `Output TFM & ENC' is checked. Then hit `Save'. A new `Find Sub Font Definitions' dialog will pop up. You will need to find the file Unicode.sfd on the web somewhere (Google is your friend); save this file somewhere and tell the dialog where it is. Press OK.
FontForge should go away and think for a while. When it's finished, your /temp directory should be filled with lots of .tfm, .pfb, .afm, and .enc files. You can throw away the last two; we only need the .tfm and .pfb files. In your texmf tree, make a new directory called /texmf/fonts/tfm/saza/, and put all the .tfm files there. Similarly, put all the .pfb files into a directory /texmf/fonts/type1/saza/.
(The .tfm files are `TEX font metric' files. Roughly speaking, they tell TEX how much space each character takes up. The corresponding .pfb files are Adobe Type 1 font files; they describe the actual glyphs for each character.)
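The file shuffling described above can also be scripted. The following Python sketch copies the .tfm and .pfb files from the working directory into the two font directories and ignores everything else. The function name is invented, and the location of your texmf tree is something you have to supply yourself.

```python
import shutil
from pathlib import Path


def install_font_files(work_dir, texmf_root, font_name):
    """Copy .tfm and .pfb files into the texmf tree; return the names copied."""
    fonts = Path(texmf_root) / "fonts"
    targets = {
        ".tfm": fonts / "tfm" / font_name,
        ".pfb": fonts / "type1" / font_name,
    }
    copied = []
    for path in sorted(Path(work_dir).iterdir()):
        target = targets.get(path.suffix)
        if target is None or not path.is_file():
            continue                 # .afm and .enc files are not needed
        target.mkdir(parents=True, exist_ok=True)
        shutil.copy2(path, target / path.name)
        copied.append(path.name)
    return copied
```

With the names used in this example, the call would be install_font_files("/temp", "/texmf", "saza"). Writing into the texmf tree may require administrator rights.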
Create a plain text file called C70saza.fd, and fill it with the following text:
\DeclareFontFamily{C70}{saza}{\hyphenchar \font\m@ne}
\DeclareFontShape{C70}{saza}{m}{n}{<-> CJK * saza-uni}{}
\DeclareFontShape{C70}{saza}{bx}{n}{<-> CJKb * saza-uni}{\CJKbold}
Save this file under /texmf/tex/latex/saza/. (I think the idea of this file is to tell LATEX something about the new font you have installed.)
That's all the files you need. Now you need to run mktexlsr (or sudo mktexlsr) to update TEX's filename cache.
When you run blahtex, you will need to use the command line options -use-cjk-package -use-ucs-package -japanese-font saza.
Unpack the source into your favourite directory.
The basic syntax is: blahtex [ options ]; the command-line options are listed below. The TEX input should be supplied on standard input in UTF-8 encoding, which means plain ASCII if you don't care about Unicode. If no input is given, blahtex will print a help screen. If neither the -mathml nor the -png option is selected, then blahtex will still process the input for syntax errors, but will produce no output.
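For programs that prefer to drive the command-line tool rather than link against it, a thin wrapper is enough. The Python sketch below is one possible shape for such a wrapper and makes several assumptions: that the executable is called blahtex and is on the PATH, and that the option names are spelled exactly as they are printed in this manual (check them against the help screen of your build). The function names are invented.

```python
import subprocess


def blahtex_command(mathml=True, png=False, extra_options=()):
    """Build the argument list for one blahtex run."""
    command = ["blahtex"]            # assumes the binary is on the PATH
    if mathml:
        command.append("-mathml")    # option spelling as printed in this manual
    if png:
        command.append("-png")
    command.extend(extra_options)
    return command


def run_blahtex(tex, **kwargs):
    """Send one equation on standard input and return blahtex's output."""
    result = subprocess.run(
        blahtex_command(**kwargs),
        input=tex.encode("utf-8"),   # the manual asks for UTF-8 input
        stdout=subprocess.PIPE,
    )
    return result.stdout.decode("utf-8")
```

Because blahtex handles a single equation per run, a caller would invoke run_blahtex once for each equation in the document.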
Blahtex pays a lot of attention to spacing, because the MathML defaults (via the operator dictionary) are often inadequate. To see the difference, try the simple input a := b on blahtex (with spacing set to moderate or strict) and compare with the output of other translators.
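To see why a := b is a telling test, recall TEX's own rule for the space between adjacent atoms, tabulated in chapter 18 of The TeXbook. The Python sketch below encodes that table; it is a restatement of TEX's rule for illustration, not code from blahtex, and the names are invented. The values 1, 2 and 3 stand for thin, medium and thick space, parenthesised entries vanish in script styles, and * marks combinations that cannot occur.

```python
ATOM_TYPES = ["ord", "op", "bin", "rel", "open", "close", "punct", "inner"]

# Row: left atom. Columns: right atom, in ATOM_TYPES order.
SPACING_TABLE = {
    "ord":   "0 1 (2) (3) 0 0 0 (1)",
    "op":    "1 1 * (3) 0 0 0 (1)",
    "bin":   "(2) (2) * * (2) * * (2)",
    "rel":   "(3) (3) * 0 (3) 0 0 (3)",
    "open":  "0 0 * 0 0 0 0 0",
    "close": "0 1 (2) (3) 0 0 0 (1)",
    "punct": "(1) (1) * (1) (1) (1) (1) (1)",
    "inner": "(1) 1 (2) (3) (1) 0 (1) (1)",
}


def space_between(left, right, script_style=False):
    """Return 0-3 (none, thin, medium, thick) for two adjacent atoms."""
    entry = SPACING_TABLE[left].split()[ATOM_TYPES.index(right)]
    if entry == "*":
        raise ValueError("%s followed by %s cannot occur" % (left, right))
    if entry.startswith("("):
        # Conditional space: dropped in script and scriptscript styles.
        return 0 if script_style else int(entry[1:-1])
    return int(entry)


def spaces_in(atoms):
    """Return the space inserted between each consecutive pair of atoms."""
    return [space_between(a, b) for a, b in zip(atoms, atoms[1:])]
```

In a := b the atoms are ord, rel, rel, ord, so spaces_in gives thick space around the pair and none between the colon and the equals sign. A renderer that spaces each operator independently will instead open a gap inside the := symbol.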
Blahtex's output looks like XML. (Unless a really fatal error occurs :-)) By default, the output is completely ASCII, although there are command-line options which enable UTF-8 output for certain characters. The entire output is surrounded by the tags <blahtex>...</blahtex>. Inside these tags, there are several possibilities:
If the PNG image was generated successfully, then it will be stored in a file called X.png, where X is an md5 hash (32 character lowercase hex string); the <png> block will then contain <md5>X</md5>. (In fact X is the md5 hash of the TEX file that got sent to LATEX to generate the image.) If the option -use-preview-package was used, the <png> block will also contain blocks <height>H</height> and <depth>D</depth> which indicate the height and depth of the image, in pixels. (These are computed by dvipng.) If you want to display the PNG in a web page so that it is aligned with surrounding text, you can use the depth value as follows: <img src="..." style="vertical-align: -Dpx">.
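Turning the <png> block into an <img> tag can be sketched in Python as follows. The element names are the ones described above; the function name and the url_prefix parameter are assumptions, as is the idea that the PNG files are served from a single directory.

```python
import xml.etree.ElementTree as ET


def png_img_tag(output, url_prefix=""):
    """Build an <img> tag from blahtex output, or return None if no PNG."""
    root = ET.fromstring(output)
    png = root.find(".//png")
    if png is None or png.find("md5") is None:
        return None                  # no <png> block, or it holds an <error>
    src = url_prefix + png.findtext("md5") + ".png"
    depth = png.findtext("depth")    # present only with -use-preview-package
    if depth is None:
        return '<img src="%s">' % src
    # Lower the image by its depth so that its baseline meets the text's.
    return '<img src="%s" style="vertical-align: -%spx">' % (src, depth)
```

If the depth is missing, the tag is emitted without a style attribute and the browser falls back to its default alignment.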
If there was an error generating the PNG file, the <png> block will instead contain an <error> block describing the problem. The possible error IDs here are:
The <error> block (mentioned several times above) has the following format. First, it contains an <id>...</id> block, containing an error ID (i.e. one of the CamelCase strings listed above). Next, a sequence of zero or more <arg>...</arg> blocks, representing the `arguments' of the error. Finally there is a <message>...</message> block, containing a translation of the error into English. For example, one possible error block is:
<error>
<id>MismatchedBeginAndEnd</id>
<arg>\begin{matrix}</arg>
<arg>\end{array}</arg>
<message>The commands "\begin{matrix}" and "\end{array}" do not match</message>
</error>
The simplest way to report the error to the user is to extract the <message> block. If you want to implement some localisation of error messages, you should use the <id> and <arg> fields. A complete list of error messages can be found in the source file Messages.cpp, or try the command-line option -print-error-messages. The error IDs may change in future versions of blahtex.
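Both approaches fit in a few lines of Python. In the sketch below, the templates dictionary and its {0}, {1} placeholder syntax are hypothetical; blahtex itself only supplies the <id>, <arg> and <message> fields.

```python
import xml.etree.ElementTree as ET


def describe_error(error_xml, templates=None):
    """Return a localised message if a template exists, else the English one."""
    error = ET.fromstring(error_xml)
    error_id = error.findtext("id")
    args = [arg.text or "" for arg in error.findall("arg")]
    template = (templates or {}).get(error_id)
    if template is not None:
        try:
            return template.format(*args)
        except IndexError:
            pass                     # template expects more arguments than given
    return error.findtext("message")
```

Falling back to the <message> text whenever no template matches means that an error ID added or renamed in a later blahtex version still produces a readable report.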
This section gives a summary of how to link blahtex directly into a C++ application. You will need to write a wrapper if you want to use a different language. (If you do this, please consider sending me the wrapper so I can make it available for others to use.)
The blahtex source code is divided into two parts:
To use the blahtex core in your C++ application, you should follow these steps:
The blahtex core is internally Unicode throughout, and works exclusively with wide strings -- wstring, not string. If your code only deals with ASCII strings, or UTF-8, you will need a way of converting between narrow and wide strings. The blahtex command-line application has a class UnicodeConverter which provides precisely this functionality; it is essentially a C++ wrapper for the iconv library in terms of string (for storing UTF-8 strings) and wstring (for storing UCS-4 strings; endianness depends on the platform). To use this class:
This document was generated using the LaTeX2HTML translator Version 2002-2-1 (1.70)
The translation was initiated by David on 2006-03-25