Extending User Comments
Tags: CMS, Programming, Website Development
My comment handler is missing one pretty significant feature: there is no support for any kind of mark-up.
I am going to add this functionality now but, before I do, I need to know what to implement...
The comment handler has some nice touches: neat time-stamping, edit functionality, spam defences, duplicate post prevention, private annotations, subscriptions, and more.
It is also flawed, in that it totally prohibits the inclusion of mark-up in comments.
This was a deliberate architectural decision on my part. I designed it this way to prevent users posting comments containing invalid mark-up. "Broken" mark-up would undermine my efforts to maintain a valid, XHTML standards-compliant website. In order to prevent this, the handler simply strips anything resembling (X)HTML out of all comments.
This has some significant consequences. When Gus Gollings tried to post a snippet of Apache configuration code, the comment handler made a real mess of his post. The result was virtually incomprehensible and I had to introduce a small hack in order to make his valuable contribution presentable. Furthermore, my users noticed the missing functionality and some have asked for its introduction.
I am happy to oblige!
Of course, I am still concerned about validation. Fortunately, Simon Willison has addressed this issue and I will be examining his SafeHtmlChecker.class for enlightenment when I start work on the handler.
About the Implementation
I want the handler to be able to continue operating as it does now for users who are unfamiliar with XHTML. That is, carriage returns are automatically converted to "<br />" and URLs and email addresses are automatically hyperlinked.
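A minimal sketch of this default mode might look like the following. It is written in Python purely for illustration, and every name in it (`format_plain_comment`, `TOKEN_RE`) is hypothetical rather than taken from the actual handler. The URL and email patterns are deliberately simple assumptions, not complete grammars, and the sketch escapes any stray angle brackets instead of stripping them.

```python
import html
import re

# Hypothetical, deliberately simple patterns (assumptions, not full grammars).
# The URL pattern refuses to end on common trailing punctuation such as "." or ",".
TOKEN_RE = re.compile(
    r"(?P<url>https?://[^\s<>\"]+[^\s<>\".,;:!?)])"
    r"|(?P<email>[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,})"
)


def format_plain_comment(text: str) -> str:
    """Escape a plain-text comment, hyperlink URLs and emails, add <br />."""
    # Normalise Windows and old Mac line endings to "\n" first.
    text = text.replace("\r\n", "\n").replace("\r", "\n")

    parts = []
    pos = 0
    for match in TOKEN_RE.finditer(text):
        # Escape the ordinary text between the previous link and this one.
        parts.append(html.escape(text[pos:match.start()]))
        target = html.escape(match.group(0))
        if match.group("url"):
            parts.append(f'<a href="{target}">{target}</a>')
        else:
            parts.append(f'<a href="mailto:{target}">{target}</a>')
        pos = match.end()
    parts.append(html.escape(text[pos:]))

    # Convert each line break to an XHTML-style <br />.
    return "".join(parts).replace("\n", "<br />\n")
```

Escaping each piece separately, rather than escaping the whole comment and then searching for links, keeps an ampersand inside a URL from being confused with the entities that escaping introduces.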
However, if the parser detects any XHTML it should automatically switch into "professional" mode. In this state, it will allow valid mark-up through and advise the user when the mark-up is invalid.
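One possible way to sketch the mode switch and the validity check is shown below, again in Python for illustration only. It is not how SafeHtmlChecker works; it simply leans on a stock XML parser to test well-formedness. The function names, the `<comment>` wrapper element, and the attribute rule (only `href` on `a`) are assumptions, while the tag list is the one currently planned. Well-formedness plus a whitelist is weaker than full XHTML validation: nesting rules such as "li only inside ul or ol" are not checked, link targets are not inspected, and named entities like `&nbsp;` would be rejected by the parser.

```python
import re
import xml.etree.ElementTree as ET

# Tags currently planned for support.
ALLOWED_TAGS = {"a", "p", "blockquote", "ul", "ol", "li",
                "em", "strong", "br", "code"}
# Assumption: the only attribute let through is href on links.
ALLOWED_ATTRS = {"a": {"href"}}

# Anything shaped like an opening or closing tag triggers "professional" mode.
MARKUP_RE = re.compile(r"</?[A-Za-z][^<>]*>")


def looks_like_markup(comment: str) -> bool:
    """Return True if the comment appears to contain (X)HTML tags."""
    return MARKUP_RE.search(comment) is not None


def check_markup(comment: str) -> list[str]:
    """Return a list of problems to show the user; empty means it passed."""
    try:
        # Wrap the fragment in a hypothetical root so it parses as one document.
        root = ET.fromstring(f"<comment>{comment}</comment>")
    except ET.ParseError as exc:
        return [f"Mark-up is not well-formed: {exc}"]

    problems = []
    for element in root.iter():
        if element is root:
            continue
        if element.tag not in ALLOWED_TAGS:
            problems.append(f"Tag <{element.tag}> is not allowed")
            continue
        extra = set(element.attrib) - ALLOWED_ATTRS.get(element.tag, set())
        for name in sorted(extra):
            problems.append(
                f"Attribute '{name}' is not allowed on <{element.tag}>")
    return problems
```

A handler built this way would call `looks_like_markup` first, fall back to the plain-text behaviour when it returns False, and otherwise show the messages from `check_markup` back to the user instead of silently stripping tags.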
Questions
What tags do I need to support? I am currently planning for: "a", "p", "blockquote", "ul", "ol", "li", "em", "strong", "br" and "code". Are there any others I should consider?
Should I consider a WYSIWYG editor?
Should I allow inline images?