
Extending User Comments

Date: Tue, 1st-Jun-2004

Tags: CMS, Programming, Website Development

My comment handler is missing one pretty significant feature: there is no support for any kind of mark-up.

I am going to add this functionality now but, before I do, I need to know what to implement...

The comment handler has some nice touches: neat time-stamping, edit functionality, spam defences, duplicate post prevention, private annotations, subscriptions, and more.

It is also flawed, in that it totally prohibits the inclusion of mark-up in comments.

This was a deliberate architectural decision on my part. I designed it this way to prevent users posting comments containing invalid mark-up. "Broken" mark-up would undermine my efforts to maintain a valid, XHTML standards-compliant website. In order to prevent this, the handler simply strips anything resembling (X)HTML out of all comments.

This has some significant consequences. When Gus Gollings tried to post a snippet of Apache configuration code, the comment handler made a real mess of his post. The result was virtually incomprehensible and I had to introduce a small hack in order to make his valuable contribution presentable. Furthermore, my users noticed the missing functionality and some have asked for its introduction.

I am happy to oblige!

Of course, I am still concerned about validation. Fortunately, Simon Willison has addressed this issue and I will be examining his SafeHtmlChecker.class for enlightenment when I start work on the handler.

About the Implementation

I want the handler to be able to continue operating as it does now for users who are unfamiliar with XHTML. That is, carriage returns are automatically converted to "<br />" and URLs and email addresses are automatically hyperlinked.
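The plain-text behaviour described above could be sketched roughly as follows. This is an illustration in Python, not the handler's actual code: the function name format_plain_comment and both regular expressions are assumptions, and real-world URL and email matching is considerably messier than these patterns suggest.

```python
import html
import re

# Hypothetical patterns for illustration only. The URL pattern refuses to end
# on common trailing punctuation so that "see http://example.com." keeps the
# full stop outside the link.
TOKEN_RE = re.compile(
    r"(?P<url>https?://[^\s<>\"']*[^\s<>\"'.,;:!?)])"
    r"|(?P<email>[\w.+-]+@[\w-]+(?:\.[\w-]+)+)"
)


def format_plain_comment(text: str) -> str:
    """Escape a plain-text comment, hyperlink URLs and emails, add <br />."""
    out = []
    pos = 0
    for match in TOKEN_RE.finditer(text):
        # Escape the ordinary text between links so stray "<" or "&"
        # cannot end up as mark-up in the page.
        out.append(html.escape(text[pos:match.start()]))
        token = html.escape(match.group(0))
        if match.group("url"):
            out.append(f'<a href="{token}">{token}</a>')
        else:
            out.append(f'<a href="mailto:{token}">{token}</a>')
        pos = match.end()
    out.append(html.escape(text[pos:]))

    # Normalise Windows and old Mac line endings, then convert to <br />.
    joined = "".join(out).replace("\r\n", "\n").replace("\r", "\n")
    return joined.replace("\n", "<br />\n")
```

Because everything outside the generated links is escaped, output produced this way should stay well-formed no matter what the user types.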

However, if the parser detects any XHTML it should automatically switch into "professional" mode. In this state, it will allow valid mark-up through and advise the user when the mark-up is invalid.
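One possible way to make that switch, sketched in Python under some loud assumptions: the names looks_like_markup, check_well_formed and choose_mode are invented for this example, "detects any XHTML" is taken to mean "contains something shaped like a tag", and the check only tests XML well-formedness, which is weaker than full XHTML validity.

```python
import re
import xml.etree.ElementTree as ET

# Assumption: anything shaped like <tag ...> or </tag> counts as mark-up.
# "a < b" or "<3" do not match, so they stay in plain mode.
TAG_RE = re.compile(r"</?[A-Za-z][^>]*>")


def looks_like_markup(comment: str) -> bool:
    return TAG_RE.search(comment) is not None


def check_well_formed(comment: str):
    """Return None if the fragment parses as XML, else an error message."""
    try:
        # Wrap the fragment so that several top-level elements are accepted.
        ET.fromstring(f"<div>{comment}</div>")
    except ET.ParseError as exc:
        return str(exc)
    return None


def choose_mode(comment: str):
    """Return (mode, error): mode is "plain" or "professional"."""
    if not looks_like_markup(comment):
        return ("plain", None)
    return ("professional", check_well_formed(comment))
```

The error string returned in "professional" mode is what could be shown to the user as advice. One known gap in this sketch: named entities such as &nbsp; are not defined in plain XML, so they would be reported as errors unless handled separately.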


What tags do I need to support? I am currently planning for: "a", "p", "blockquote", "ul", "ol", "li", "em", "strong", "br" and "code". Are there any others I should consider?
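Whatever the final list turns out to be, enforcing it might look something like the Python sketch below. The tag set is the one proposed above; the function name find_disallowed and the attribute rule (only "href" on "a") are assumptions made for the example, and the comment is expected to be well-formed already.

```python
import xml.etree.ElementTree as ET

ALLOWED_TAGS = {"a", "p", "blockquote", "ul", "ol", "li",
                "em", "strong", "br", "code"}

# Assumption: attributes are rejected everywhere except href on links.
ALLOWED_ATTRS = {"a": {"href"}}


def find_disallowed(comment: str) -> list:
    """List whitelist violations in a well-formed comment fragment.

    Raises xml.etree.ElementTree.ParseError if the fragment is not
    well-formed XML.
    """
    root = ET.fromstring(f"<div>{comment}</div>")
    problems = []
    for element in root.iter():
        if element is root:
            continue  # skip the wrapper <div> added above
        if element.tag not in ALLOWED_TAGS:
            problems.append(f"<{element.tag}> is not allowed")
            continue
        for attr in element.attrib:
            if attr not in ALLOWED_ATTRS.get(element.tag, set()):
                problems.append(
                    f'attribute "{attr}" is not allowed on <{element.tag}>'
                )
    return problems
```

An empty list means the comment passes. The sketch does not inspect the value of href, so a production version would probably also want to restrict link schemes.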

Should I consider a WYSIWYG editor?

Should I allow inline images?
