New site launch

August the 12th 2010 at 6 o’clock pm

Abstract

With the fresh design and publishing system, what have I actually done? I explain my on-and-off evening experimenting by noting the shortcomings and cool new things I have done with the site.

1. Browser issues
2. Future plans
- 2.1. Short-term, stylistic changes
- 2.2. Very future ideas
3. What’s new then?

I should like to explain and announce the new look of my site. I have been productively creating uptime with my downtime over the last couple of weeks, ironing out the issues and getting everything polished. It is now much better than the previous iteration (Wordpress, of reasoned loathing), with a better layout, an XML-safe publishing flow for MathML and SVG, a portable storage backend (DocBook), and much improved typography.

As this is my first post on the new system, I will have to begin by apologising. The site has been tested reasonably thoroughly, but still has some rough edges, so I regret the little bugs that remain. Secondly, I have never before stored my content in a future-proof way, so I wrote a conversion utility that canonicalises Wordpress markup to save my posts going back to January. All the fascinating things I wrote before that are unfortunately not yet imported, and may never be.

In order to end on a happy note, I will quickly outline the browser bugs I have not yet worked around, then describe some of the future plans and fixes I have, to finish with a description of what is new and great about the system. Please at least skip to the end if you want to find out what I have actually done already.

1. Browser issues

On Firefox 3.6–4.0b3, everything is fine. It is the best browser in the world (once I get my patch in for fixing the feed processor to handle and sanitize feeds properly—incidentally, I have fixed half a dozen bugs in that component, but need my dev computer to make the Mercurial patches, which will follow shortly).

Internet Explorer I expected to be very bad, but it is even worse: IE8 has a supremely awful parser (the bit that takes the page and turns the markup into objects that can be positioned and displayed). Not only do we have to wait for IE9 to get a basic XML parser, the IE8 HTML parser is massively broken and only lexes for tags it recognises, which does not include any HTML5 elements. This can be fixed according to a rather astute observation first made, it seems, by Sjoerd Visscher and picked up by many other people (including the now-standard Google code project). I have enabled this on my site, and some of the parsing errors have gone, but inexplicably some still remain (the IE dev tools window tells me the aside is somehow parsed to be inside the section). That looks like it will be very hard to fix, so for the time being there will be no improvement on IE support.

Chrome/WebKit is great, apart from all the bugs which are much more obtrusive than the Firefox ones. Column support is essentially non-existent and broken, because all floats and positioned elements go wild, so columns on the post pages are only enabled for Firefox. The only show-stopper bug at the moment is the lack of support for loading external SVG using use, which seems like a minor little thing, but entirely prevents my beautiful site logos from appearing (bug 12499). My work-around will probably be to just inline the whole SVG file on each page load, possibly with some browser sniffing to avoid it for Firefox and Opera.
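As a rough illustration of that work-around, here is a minimal server-side sketch in Python. It assumes the decision is made from the User-Agent header, and the file name and fragment id in `logos.svg#logo` are hypothetical, as are the function names; the site's real templates may do this quite differently.

```python
# Hypothetical markup: "logos.svg#logo" is an assumed file name and fragment id.
EXTERNAL_USE = (
    '<svg xmlns="http://www.w3.org/2000/svg" '
    'xmlns:xlink="http://www.w3.org/1999/xlink">'
    '<use xlink:href="logos.svg#logo"/></svg>'
)


def supports_external_use(user_agent: str) -> bool:
    """Crude sniffing: treat only Firefox and Opera as able to load external use."""
    ua = user_agent.lower()
    return "firefox" in ua or "opera" in ua


def logo_markup(user_agent: str, inline_svg: str) -> str:
    """Return the small external reference where it works, else the inlined SVG."""
    if supports_external_use(user_agent):
        return EXTERNAL_USE
    return inline_svg
```

Sniffing like this is fragile, so it would only be worth keeping until the WebKit bug is fixed.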

Opera is untested at the moment. There is bound to be some small problem, but I will fix it when I get round to it.

2. Future plans

2.1. Short-term, stylistic changes

Most prominently, there are various papercuts to fix. Generating and putting in routine text is boring and takes time. What will take more work is filling the drop-down menu on article pages and the search interface. Already implemented is the ability to subscribe to a subset feed of items just from the categories you are interested in. This means my mother can subscribe without having to read techie posts like this one. The articles code is very powerful at the moment (enjoy the category-filtered full text Google search!), as powerful as it will ever need to be with HTML and Atom output for every combination of category and keyword search. On the other hand, I ought to have a better filter or search widget to actually open up access to the posts in a user-friendly way.
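To give a flavour of what a subset feed involves, here is a minimal Python sketch. It assumes the categories appear as standard Atom category elements with a term attribute; the function name and the sample feed are invented for illustration, and the real system does this in XSLT rather than Python.

```python
import xml.etree.ElementTree as ET

ATOM = "http://www.w3.org/2005/Atom"

# Invented sample data, for illustration only.
SAMPLE = """<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Example feed</title>
  <entry><title>Holiday photos</title><category term="family"/></entry>
  <entry><title>XSLT tricks</title><category term="tech"/></entry>
</feed>"""


def filter_feed(feed_xml: str, wanted: set) -> str:
    """Return the feed with only the entries that have a wanted category term."""
    ET.register_namespace("", ATOM)  # serialise Atom as the default namespace
    feed = ET.fromstring(feed_xml)
    # findall returns a list, so removing entries while looping is safe.
    for entry in feed.findall(f"{{{ATOM}}}entry"):
        terms = {c.get("term") for c in entry.findall(f"{{{ATOM}}}category")}
        if not terms & wanted:
            feed.remove(entry)
    return ET.tostring(feed, encoding="unicode")
```

Calling `filter_feed(SAMPLE, {"family"})` keeps the holiday entry and drops the techie one, which is the behaviour my mother wants.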

Aside from browsing issues, there are styling concerns. I want to fix up my fonts situation on the site. At the moment, rather inconsiderately, the site pulls in a huge stack of Latin Modern font files, which I have neither optimized nor customized. They need to be optimized because the otherwise excellent @font-face services like Font Squirrel do not have sensible defaults, lowering quality for the sake of file size; customized because I want to include some extra ornamental glyphs and dingbats. I may or may not stay with LModern, but there are not so many really suitable alternatives. My aesthetic drive also includes cleaning up the markup produced by my DocBook conversion and fleshing out the style sheet as I use more elements.

A couple of codish things like the Atom feed history extension, and more lively caching should land soon as well.

2.2. Very future ideas

2.2.1. Publishing

Longer-term changes to the design and layout of the site include investigating making private posts available to friends and toying about with restricted feeds. Secondly, I would like to re-evaluate the options with using the screen space for horizontal layout. There are some issues, especially when one large image or block of code needs more room than the column can provide. Working out a suitable automated system for presenting content in that context will be interesting, and must include at least a syntax highlighter as well.

2.2.2. Authoring

Thirdly and finally of my longer-term site changes, I still need to look into a decent XML authoring system. The solution I use at the moment is XXE (XMLMind XML Editor), which seems to be all-round the best free XML editor I can find. Unfortunately, while the interface is superb, it is pretty weak on several fronts: firstly, integrating it with a publishing system as the authoring end of a site is extremely expensive and hardly the open sort of system I want to be encouraging others to use; secondly, its MathML support is very good (in the professional edition) but not super-fast. With suitable macros, it can be made reasonably quick, but for lots of small equations has nowhere near the fluency of LaTeX.

I have two plans then: firstly, write an extension for it to enable single-keypress rapid entering of LaTeX equations, which ought not to be too hard given the excellent documentation and libraries available to handle this; secondly, be kind to the world and do something similar but open and in native HTML. Unfortunately, there do not seem to be any existing projects I could realistically join to add this sort of functionality, which is a real nuisance. Neither Bespin nor other projects seem to be implementing the sort of interface I want. That is a whole other project though. On the positive side, an hour’s playing this evening managed to get the basics of the interface working well enough for me to be satisfied it should be possible without too much extra time.

3. What’s new then?

For a start, I implemented all my thoughts on good design for a content publishing system. No more is my data stored in some crazy customised/pseudo-HTML/pseudo-wpautop-Wordpress format, but rather in sensible, plain DocBook markup. The files themselves are really maximally portable, with a RELAX NG grammar specifying the set of available markup I can use (like embedding categories in schemed subjects, and putting the post extracts in abstracts). I have even gone to the trouble of rewriting absolute URLs stored in articles to the base directory of the site, if the publishing system happens to be installed in a subdirectory. All very spiffy.
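The URL rewriting is simple enough to sketch. The Python below is only an illustration under assumptions of my own: the function name is hypothetical, and it assumes "absolute" means a path starting with a single slash, with everything else left alone.

```python
def rebase_url(url: str, base: str = "/") -> str:
    """Prefix a site-absolute path such as "/images/logo.svg" with the install base."""
    # Leave external ("http://..."), protocol-relative ("//host/...") and
    # relative ("images/x.png", "#frag") URLs untouched.
    if not url.startswith("/") or url.startswith("//"):
        return url
    # Join the non-empty parts so that a base of "/" adds nothing.
    parts = [p for p in (base.strip("/"), url.lstrip("/")) if p]
    return "/" + "/".join(parts)
```

With the system installed under a hypothetical `/blog/` directory, `rebase_url("/images/logo.svg", "/blog/")` gives `/blog/images/logo.svg`, while the same call with a base of `/` returns the path unchanged.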

The broad architecture of the system has been outlined before, and I do not even have to write about it because I have implemented the neat facilities of a view source tool for the whole site. The entirety of what is served up (in future minus some password files) is visible for all, complete without documentation. This makes me very happy even if no-one else is pleased by it.

The site pushes the bounds not really of what is possible, but of what is commonly done. Despite good browser support now, how many sites have you seen with display:table used on html? How many SVG icons that get repainted by hover? How many with fancy use of columns and alignment to ensure text lines up across columns? In fact, this last is the sort of five-minute small touch of polish I am really glad to be experimenting with: in Firefox, the only browser that can display the article’s columns, enable the lines using the View → Page styles submenu and marvel at how the text still lines up around items inserted into the flow:

Neat!
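The arithmetic behind this kind of alignment is small. One common approach, which is an assumption here and not necessarily exactly what my style sheet does, is to make every inserted item occupy a whole number of text lines, padding it up to the next multiple of the line height. A minimal Python sketch, with a hypothetical function name:

```python
import math


def grid_padding(item_height: float, line_height: float) -> float:
    """Extra space needed so an item occupies a whole number of text lines."""
    if line_height <= 0:
        raise ValueError("line_height must be positive")
    # Round the item up to the next full line, then pad by the difference.
    lines = math.ceil(item_height / line_height)
    return lines * line_height - item_height
```

For example, an image 100 units tall on a 24-unit line height needs five lines (120 units), so it gets 20 units of padding; an item already 48 units tall needs none.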

Regarding the backend, nothing is particularly advanced or efficient, but I do not need it to scale and the caching used is pretty extensive, so performance is never going to be a problem for me. I still run much faster than Wordpress (5ms vs 190ms, admittedly without any caching extensions). The architecture is arguably totally over-the-top for my requirements, and the XSLT is extremely verbose, but I can still bask in the glow of billion dollar markup and the thought that posts like these will be accessible in the future, safeguarding my ability to work with the more meaningful thoughts I spit out.