Serving XHTML+RDFa properly isn't really possible

After much messing around with including RDFa in XHTML documents using the W3C's new XHTML+RDFa document type (see here for an introduction to using RDFa in XHTML), I've come to the conclusion that it's pretty close to impossible to do it properly. Here's why:

  • Most of us who use XHTML actually end up serving it to the client as HTML - because the content is sent with a text/html MIME type instead of an application/xhtml+xml MIME type. The major hurdle here is that Internet Explorer can't handle the latter, and so if you're going to serve it properly you need to use some browser-specific content negotiation on your server. But for plain XHTML this doesn't matter too much - it just means that the browser will actually parse the content as HTML and not as XHTML (so you needn't have bothered closing all those tags).
  • For XHTML that includes other XML content, however, the W3C says that you cannot serve the content as text/html. It has to be served as application/xhtml+xml. For a start, this means you can't serve XHTML+RDFa to IE users.
  • But even if we ignore IE users for a moment (as it's always fun to do), there is another problem. The XHTML+RDFa DTD doesn't include definitions for any of the special HTML named entities (&laquo;, &raquo;, &mdash;, etc.), which means that you cannot use these in your code. You either use the numeric character references or insert the characters directly as Unicode. If you write all your own content then that's fine, but try getting this to work with a content management system, which invariably has these entities sprinkled liberally all over the place.
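To make the content negotiation in the first point a bit more concrete, here is a rough Python sketch of one common approach: look at the browser's Accept header and only send application/xhtml+xml to clients that say they can take it. The function name `pick_content_type` is made up for this example, and the parsing is deliberately simplified (it doesn't weigh q-values against text/html, for instance). Note that this only helps with plain XHTML; it does nothing for the XHTML+RDFa case, where text/html isn't an option at all.

```python
def pick_content_type(accept_header):
    """Choose a MIME type for an XHTML page from the request's Accept header.

    Hypothetical helper, simplified: returns application/xhtml+xml only when
    the client lists it explicitly with a non-zero q-value, else text/html.
    """
    xhtml = "application/xhtml+xml"
    for item in (accept_header or "").split(","):
        parts = [p.strip() for p in item.split(";")]
        if parts[0].lower() != xhtml:
            continue
        q = 1.0  # a media range without an explicit q parameter defaults to 1
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat a malformed q-value as "not acceptable"
        if q > 0:
            return xhtml
    # Clients that never mention XHTML (older IE among them) get text/html.
    return "text/html"
```

Whatever this returns would go into the response's Content-Type header. If you vary the response like this, it is also worth sending `Vary: Accept` so that caches don't hand the wrong version to the wrong browser.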

    So it appears that serving proper XHTML+RDFa is possible if you write your own code (or have the time and energy to find and replace all the entities in your WordPress install every time you upgrade) and you don't give a damn about IE users. Even those who manage both will, I think, struggle. The solution? Wait for HTML5, I guess, which will support RDFa as text/html.
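For what it's worth, the find-and-replace job on entities can be scripted. The sketch below is one way to do it in Python, using the standard library's `html.entities.name2codepoint` table to turn named entities into numeric character references. The function name `to_numeric_entities` is invented for this example, and it works on the raw markup with a regular expression, so it assumes there are no entity-like strings inside scripts or CDATA sections that should be left alone.

```python
import re
from html.entities import name2codepoint

# The five entities that XML itself predefines; these are safe to leave as-is.
XML_PREDEFINED = {"amp", "lt", "gt", "quot", "apos"}

# Matches named references such as &mdash; but not numeric ones such as &#8212;
ENTITY_RE = re.compile(r"&([A-Za-z][A-Za-z0-9]*);")


def to_numeric_entities(markup):
    """Replace HTML named entities with numeric character references.

    Hypothetical helper: names that are not in Python's HTML entity table
    are left untouched rather than guessed at.
    """
    def replace(match):
        name = match.group(1)
        if name in XML_PREDEFINED or name not in name2codepoint:
            return match.group(0)
        return "&#%d;" % name2codepoint[name]

    return ENTITY_RE.sub(replace, markup)


print(to_numeric_entities("&laquo;quoted&raquo; &mdash; Tom &amp; Jerry"))
# prints: &#171;quoted&#187; &#8212; Tom &amp; Jerry
```

You would still have to hook this into the CMS output somewhere (an output filter, say), which is exactly the part that tends to break on upgrade.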