

April 3, 2005

itex2MML 0.12

Time for another release of itex2MML, the commandline utility that converts a dialect of TeX to MathML. It’s the brains behind plugins for various blogging platforms (MovableType, WordPress, ecto, b2Evolution, …) which allow you to enter TeX formulæ and have them automatically rendered to MathML.

itex2MML acts as a stream filter, converting TeX equations delimited by $...$ (for inline equations) or \[...\] (for display equations) to MathML. This version adds support for a boatload more LaTeX/AMSLaTeX symbols. I was motivated by a recent post in which I needed \gtrsim ( ≳ ). Please let me know if you encounter any bugs, misfeatures or, for that matter, features that ought to be there but aren’t.
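To make the delimiter convention concrete, here is a minimal Python sketch that picks the two kinds of equation out of a piece of text. It is an illustration only: the regular expression and the `find_equations` name are assumptions made for this example, and this is not how itex2MML itself is implemented.

```python
import re

# Hypothetical pattern: $...$ for inline equations, \[...\] for display equations.
EQUATION = re.compile(
    r"\$(?P<inline>[^$]+)\$|\\\[(?P<display>.*?)\\\]",
    re.DOTALL,
)

def find_equations(text):
    """Return (kind, tex_source) pairs for each delimited equation, in order."""
    found = []
    for match in EQUATION.finditer(text):
        if match.group("inline") is not None:
            found.append(("inline", match.group("inline")))
        else:
            found.append(("display", match.group("display")))
    return found
```

A real filter would go on to translate each TeX fragment to MathML and copy everything else through unchanged; the sketch stops at locating the fragments.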

As always, my distribution comes with a precompiled binary for MacOSX, the plugin for MovableType, and the source code to compile itex2MML for other platforms (just type “make”).

Posted by distler at April 3, 2005 1:35 AM

TrackBack URL for this Entry:   https://golem.ph.utexas.edu/cgi-bin/MT-3.0/dxy-tb.fcgi/544

19 Comments & 2 Trackbacks

Re: itex2MML 0.12

Is there any way that you could make items inside [tex]…[/tex] be formulas? I use LateXrender and it would make the conversion to itex2MML much easier.

Posted by: didier on April 3, 2005 4:47 AM | Permalink | Reply to this

Delimiters

You mean

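#!/usr/bin/perl
# Read all of the input, then rewrite [tex]...[/tex] as $...$.

undef $/;   # slurp mode: read the whole input at once
$_ = <>;
s:\[tex\](.*?)\[/tex\]:\$$1\$:gs;   # non-greedy match; /s lets . span newlines
print;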

or similar? I’m not going to change the delimiters used by the main itex2MML engine. That could have unintended side-effects with existing data.

It might be OK to fold some code like the above into the plugin which uses the itex2MML engine. But you’re likely to have other problems doing that. With your current data, the ‘$’ character has no special meaning. With itex2MML, suddenly, it’s a delimiter.

It would be preferable to run a converter on your existing data, which would

  1. Convert all instances of ‘$’ to ‘&#36;’ and all instances of ‘\[’ to ‘&#92;[’.
  2. Convert ‘[tex]...[/tex]’ to ‘$...$’.

That would be safer and — in the long run — more convenient, too.
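The two steps can be sketched as a one-off converter. This is a minimal sketch in Python rather than Perl, and the function name `convert_legacy_markup` is hypothetical. It applies the steps literally, so any ‘$’ or ‘\[’ that already sits inside a [tex] block is escaped as well; existing data that contains those would need extra care.

```python
import re

def convert_legacy_markup(text):
    """Escape itex2MML delimiters, then turn [tex]...[/tex] into $...$."""
    # Step 1: neutralise characters that itex2MML would read as delimiters.
    text = text.replace("$", "&#36;")
    text = text.replace("\\[", "&#92;[")
    # Step 2: convert the old delimiters. A function is used as the
    # replacement so that backslashes in the formula are copied literally.
    return re.sub(
        r"\[tex\](.*?)\[/tex\]",
        lambda m: "$" + m.group(1) + "$",
        text,
        flags=re.DOTALL,
    )
```

The order matters: escaping must happen first, otherwise the ‘$’ delimiters introduced by step 2 would themselves be escaped.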

Posted by: Jacques Distler on April 3, 2005 1:24 PM | Permalink | PGP Sig | Reply to this

Re: itex2MML 0.12

Please let me know if you encounter any bugs

I am recently encountering problems with mathematical fonts in entries in the String Coffee Table.

For instance in this one and in this one all my \mathfr{g}, \mathcal{P} etc. did not appear as 𝔤 and 𝒫 but as question marks.

The same problem does not appear, obviously, in the comment section, here as well as at the SCT.

Could it be that the entries use another parsing engine than the comments? Or am I making a stupid mistake somewhere?

Posted by: Urs Schreiber on April 7, 2005 4:32 AM | Permalink | Reply to this

Re: itex2MML 0.12

Sorry, I am trying to give a good description of the problem that I am encountering, but it is getting a little confusing:

In my previous comments I saw the fonts correctly displayed when previewing my comment, but now that it has appeared it is again just question marks. (I tried Firefox and IE+MP.)

Posted by: Urs Schreiber on April 7, 2005 4:36 AM | Permalink | Reply to this

NCRs

Whenever you see a question mark, that means you do not have the requisite font installed on your system (to display the glyph in question).

If you want to produce a Fraktur letter, like 𝔤, you would type \mathfrak{g}, which itex2MML will turn into the MathML named entity, &gfr; which represents the Unicode character, U+1D524.

On the basis of a suggestion by Henri Sivonen to improve the display of these characters under Mozilla/Mac, we convert (everywhere except for the Comment Preview page, where I forgot) the named entity, &gfr; into a numeric character reference, &#x1D524;.

While these should be precisely equivalent, apparently, they are handled slightly differently on the Mac versus on Windows. If I understand you correctly, what improves the display on the Mac seems to break it on Windows.
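As an illustration of that conversion, the following Python sketch rewrites named entities as hexadecimal numeric character references. It is an assumption-laden stand-in, not the code running on this site: it uses the HTML5 entity table from Python's standard library (which includes the MathML names such as `gfr`), and `entity_to_ncr` is a hypothetical name.

```python
import re
from html.entities import html5

# The five XML predefined entities are left alone; they need no conversion.
XML_PREDEFINED = {"amp", "lt", "gt", "quot", "apos"}

def entity_to_ncr(markup):
    """Replace named entities such as &gfr; with NCRs such as &#x1D524;."""
    def replace(match):
        name = match.group(1)
        chars = html5.get(name + ";")
        if chars is None or name in XML_PREDEFINED:
            return match.group(0)          # unknown or predefined: keep as-is
        # Some entities expand to more than one code point.
        return "".join(f"&#x{ord(c):X};" for c in chars)
    return re.sub(r"&([A-Za-z][A-Za-z0-9]*);", replace, markup)
```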

I will see if I can get a response from Henri.

Posted by: Jacques Distler on April 7, 2005 8:17 AM | Permalink | PGP Sig | Reply to this

Re: NCRs

we convert (everywhere except for the Comment Preview page, where I forgot

Ah, I see. Good that you forgot to do it there! :-)

you would type \mathfrak{g}

Actually, I would have to type \mathfr{g}. I learned this the hard way after having typed it the former way a dozen times before previewing.

Since you asked for general suggestions let me mention the following:

One thing that would be nice is if it were possible to copy-and-paste well-formed LaTeX code from any LaTeX document into the weblog, without having to adjust a couple of ever so slight but still time-consuming differences, like \mathfrak → \mathfr or the non-support of \;, for instance.

(Note that I am not complaining or anything. I know you have already done an enormous job on the MathML support. But since you asked for features that one might want to have, I thought I’d mention it.)

BTW, there is another related problem I have, which maybe I can solve myself:

When I preview comments they appear with all the math and everything. When I preview entries the math is not displayed in a readable way. Only after I actually publish any entry can I check that the math is rendered as desired.

Could that be a problem with my configuration?

Posted by: Urs Schreiber on April 7, 2005 9:43 AM | Permalink | Reply to this

Fraktur

Actually, I would have to type \mathfr{g}. I learned this the hard way after having typed it the former way a dozen times before previewing.

Ack! My bad. Something to add to the TODO list for itex2MML.

One thing that would be nice is if it were possible to copy-and-paste well-formed LaTeX code from any LaTeX document into the weblog…

The goal is to get as close to that as possible. Won’t ever be completely achievable, as there are irreconcilable differences between LaTeX and itex, but I’m trying to get close. \mathfrak is definitely something that should be supported.

Any others?

Posted by: Jacques Distler on April 7, 2005 10:02 AM | Permalink | PGP Sig | Reply to this

Unsupported LaTeX

Any others?

Let’s see. Right now I recall that the following commands are not recognized:

\lbrace

\rbrace

\!

\;

The way these are not recognized differs. The first two appear as text, the third is ignored, the fourth produces an error message.

Of course none of these commands are essential. I can use \{ and \}, for instance. I don’t know how to emulate \!, though.

Posted by: Urs Schreiber on April 7, 2005 11:41 AM | Permalink | Reply to this

Supported now

  1. \mathfrak{} is now implemented as a synonym of \mathfr{}.
  2. \lbrace and \rbrace are synonyms of \{ and \}, respectively.
  3. \: and \; are synonyms for \medspace and \thickspace, respectively. \, and \! were already implemented as synonyms for \smallspace and \negspace. Don’t know why the latter did not work for you.

Unless I hear some other requests, I guess I will package these changes up and distribute itex2MML 0.13. In the meantime, they’re working here on Golem.
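For anyone feeding LaTeX to a copy of itex2MML that lacks these synonyms, the same substitutions can be approximated by rewriting the source beforehand. The Python sketch below is a hypothetical preprocessor, not part of itex2MML; the table covers only the newly added synonyms listed above, and the `normalise` name is an assumption.

```python
import re

# LaTeX command -> itex equivalent, taken from the list above.
SYNONYMS = {
    r"\mathfrak": r"\mathfr",
    r"\lbrace": r"\{",
    r"\rbrace": r"\}",
    r"\:": r"\medspace",
    r"\;": r"\thickspace",
}

# A backslash followed by a run of letters (control word)
# or by any single other character (control symbol).
CONTROL_SEQUENCE = re.compile(r"\\(?:[A-Za-z]+|.)")

def normalise(tex):
    """Rewrite LaTeX commands to the itex names in SYNONYMS."""
    def replace(match):
        token = match.group(0)
        new = SYNONYMS.get(token)
        if new is None:
            return token
        # A control symbol replaced by a control word needs a space after it,
        # so that a following letter is not absorbed into the command name.
        if new[-1].isalpha() and not token[-1].isalpha():
            return new + " "
        return new
    return CONTROL_SEQUENCE.sub(replace, tex)
```

Matching whole control sequences, rather than doing plain string replacement, keeps `\\;` (a line break followed by a semicolon) and longer command names that merely start with `\mathfrak` from being rewritten by mistake.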

Posted by: Jacques Distler on April 9, 2005 1:03 AM | Permalink | PGP Sig | Reply to this

Re: Supported now

Thanks!

Posted by: Urs Schreiber on April 12, 2005 6:45 AM | Permalink | Reply to this

Admin Interface

When I preview comments they appear with all the math and everything. When I preview entries the math is not displayed in a readable way. Only after I actually publish any entry can I check that the math is rendered as desired.

Could that be a problem with my configuration?

Nope. The Admin interface for MovableType is served as text/html instead of application/xhtml+xml. I haven’t checked with the latest version, but with previous versions, certain functions would yield a ‘Yellow Screen of Death’ (ill-formed XHTML) if served with the “correct” MIME type.

So there’s no MathML support in the Admin Interface of MovableType.
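The choice between the two MIME types is often made per request, from the browser's Accept header. A minimal Python sketch of that idea follows. It is an assumption for illustration and is not taken from MovableType or from this site's configuration; `choose_content_type` is a hypothetical name, and q-values are deliberately ignored to keep the example short.

```python
def choose_content_type(accept_header):
    """Serve XHTML as XML only to clients that explicitly list that type."""
    # Keep only the media type of each entry, dropping parameters like ;q=0.9.
    accepted = [part.split(";")[0].strip().lower()
                for part in accept_header.split(",")]
    if "application/xhtml+xml" in accepted:
        return "application/xhtml+xml"   # MathML can be rendered
    return "text/html"                   # safe fallback, but no MathML
```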

Sorry.

Posted by: Jacques Distler on April 7, 2005 10:13 AM | Permalink | PGP Sig | Reply to this

Re: Admin Interface

I see. Thanks.

If you don’t mind, this gives rise to a further potential suggestion:

The obvious workaround for the above is simply to develop new entries using the comment preview and move the whole thing to the admin interface only after it is fully tested there.

This has two slight drawbacks, though:

First, it seems the comment validator (or whatever it is called) is more tolerant than the one applied in the admin interface. For instance in the comment section I can do blockquote without having to include <p> for every quoted paragraph. (Which is in fact very convenient, since commented text is typically obtained by copy-and-pasting it from a source that may have paragraphs but no paragraph tags.)

(That’s not a big problem of course, once one knows how it works.)

The other thing is that it can get a little uncomfortable to edit nontrivial entries in the comment section, simply because the edit window is rather small.

As I said, these are just comments from somebody who is using your great weblog technology a lot.

(And, BTW, sorry for not signing my comments currently. My hard disk has crashed recently and I will need to reinstall the encryption software. Someday.)

Posted by: Urs Schreiber on April 7, 2005 11:28 AM | Permalink | Reply to this

Re: Admin Interface

since commented text is

Sorry, of course I meant to say ‘quoted text’.

Posted by: Urs Schreiber on April 7, 2005 11:43 AM | Permalink | Reply to this

PUA fakes vs. the real astral chars

Sorry about taking so long to react.

In the days of yore when MathML support in Mozilla for Windows and Linux was designed (there was no Mac OS X back then and rbs didn’t have a Mac anyway), it was thought that Mozilla can’t deal with astral characters. I am not sure if this really was the case back then, but it is not the case now. The string API says “UCS2”, but the i18n guys have for quite some time treated the strings as UTF-16. This hasn’t been well communicated to all developers using the API, however.

Anyway, in order to stay on the BMP, which is no longer necessary, the DTD in Mozilla’s catalog fakes the astral math characters by mapping them to PUA characters instead of the astral characters the real DTD would map them to. The Win32 and X11 gfx special case these PUA characters and give them special treatment using legacy (non-Unicode) math fonts. The Mac gfx seems to be broken, as usual. Also, requiring legacy fonts on Mac OS X would be a bad idea, because ATSUI doesn’t know the fonts lie.

With the pure UTF-8 approach using real astral characters, the UTF-8 astral characters are properly converted into UTF-16 surrogate pairs and travel through the application without getting caught in any special MathML-related code. When they reach gfx on the Mac, they go down the code path intended for astral CJK characters. (A comparable code path exists on Windows.) The surrogate pairs make it to the system API intact and get rendered. It seems that there are problems with the baseline. As usual, the Mac gfx is broken in some way even when it is partway right.
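For concreteness, the arithmetic that turns an astral code point into a UTF-16 surrogate pair can be sketched in Python. This is the standard UTF-16 encoding rule, not Mozilla code, and `to_surrogate_pair` is a hypothetical helper name.

```python
def to_surrogate_pair(ch):
    """Return the (high, low) UTF-16 surrogates for one astral character."""
    code_point = ord(ch)
    if code_point < 0x10000:
        raise ValueError("BMP characters are not encoded as surrogate pairs")
    offset = code_point - 0x10000          # a 20-bit value
    high = 0xD800 + (offset >> 10)         # top 10 bits
    low = 0xDC00 + (offset & 0x3FF)        # bottom 10 bits
    return high, low

# U+1D524 (the Fraktur g discussed above) is 4 bytes in UTF-8
# and one surrogate pair in UTF-16.
fraktur_g = "\U0001D524"
utf8_bytes = fraktur_g.encode("utf-8")     # b'\xf0\x9d\x94\xa4'
high, low = to_surrogate_pair(fraktur_g)   # (0xD835, 0xDD24)
```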

In fact, I think the Mac gfx is so broken in so many ways that it should be abandoned and reimplemented using ATSUI and Quartz. The sad part is that the gfx API was designed for GDI, raw X11 and QuickDraw. I think it would really benefit from an overhaul done with Quartz/ATSUI, GDI+/Uniscribe and libart/Pango in mind. However, to make that happen the drivers who get paid for working on Mozilla would need to be committed to such a major cross-platform change. Still, even with the current gfx API, I think the Mac gfx needs a rewrite. The piecemeal fixes just don’t cut it.

I would proceed here by installing a font that is properly encoded to provide glyphs for the astral chars and seeing if that works on Windows. It would certainly be interesting if it didn’t, because CJK astral characters work on both Mac and Windows provided a font has the right glyphs.

I finally got around to filing a bug about this.

Posted by: Henri Sivonen on April 11, 2005 1:42 PM | Permalink | Reply to this

Re: PUA fakes vs. the real astral chars

So, if I understand you correctly, I have screwed my Linux/Windows users.

They have the Computer Modern Fonts (“legacy (non-Unicode) math fonts”) installed, and Mozilla’s PUA hack maps (say) &gfr; to some PUA and thence to the correct glyph in the Computer Modern fonts.

This fails (or is otherwise screwed up) on MacOSX, which, in any case, cannot use Computer Modern (it being ATSUI incompatible).

Instead, I have taken to sending NCRs. &#x1D524; gets mapped to the corresponding Unicode character, U+1D524, which, if you have an appropriate Unicode font installed, will display correctly.

Unfortunately, most Windows/Linux users won’t have such a font installed. At best, they’ll have the Computer Modern Fonts installed.

So they just see a missing glyph.

Is there any resolution for this, short of praying that the stixfonts are delivered sometime this Century?

Posted by: Jacques Distler on April 11, 2005 3:31 PM | Permalink | PGP Sig | Reply to this

Re: PUA fakes vs. the real astral chars

Do I understand correctly that the problem for Win/Linux users could be resolved if they installed some Unicode font? Where can one get this font?

Posted by: Urs on April 28, 2005 7:15 AM | Permalink | Reply to this

Unicode Fonts

Eventually, the Stix Fonts will provide a complete set of Unicode compatible Math fonts. In the meantime, you can try the Code 2001 font which, while rather crude, does cover these “Plane-1” characters.

Let me know if that resolves the problem for you.

P.S.: I’m working on trying to get the MovableType Administrative interface working under application/xhtml+xml (so you can preview posts with MathML). It’s damned difficult, so don’t hold your breath.

Posted by: Jacques Distler on April 28, 2005 8:18 AM | Permalink | PGP Sig | Reply to this

Re: Unicode Fonts

Hm, I have downloaded that Code 2001 font (on WinXP), installed it, rebooted everything to make sure, but I am still getting question marks for \oplus and \mathfrak{g} for instance. Do I have to do anything else?

Posted by: Urs Schreiber on May 4, 2005 9:44 AM | Permalink | Reply to this

Re: itex2MML 0.12

I made a small patch to enable the $$ \alpha $$ syntax in itex2MML, and to enable the “\over” syntax.

Here is the patch:

http://chem.skku.ac.kr/~wkpark/patches/itex.diff

Posted by: wkpark on April 14, 2005 7:39 AM | Permalink | Reply to this
Read the post PSM and Algebroids, Part II
Weblog: The String Coffee Table
Excerpt: How Courant algebroids give rise to L_infinity algebras.
Tracked: April 29, 2005 2:03 PM
Read the post PSM and Algebroids, Part III
Weblog: The String Coffee Table
Excerpt: On the equivalence between algebroids, dg algebras and Lie p-algebras, the definition of algebroid YM theories and the relation to generalized geometry and abelian p-bundles and p-gerbes.
Tracked: June 22, 2005 11:33 PM
