So, I was thinking about pages that rely heavily on JavaScript and how they might not do so well with web crawlers.
For example, if the comments on a blog are loaded by Ajax (which I know many blogs do), then those comments might not be indexed by web crawlers. This is a sad loss for us web developers, because the keywords and content in those comments never make it into the index.
Another example, one that I'm facing myself, is when you have pages with what one could call JavaScript applications: an Ajax-based chat, or a fun game of JavaScript Tetris, if you will. These pages are generated entirely in JavaScript, so when a web crawler tries to index them, it will find absolutely nothing. Or even worse, it will find the infamous "You appear to have turned off JavaScript in your browser, please enable it" message.
So, I thought about ways to deal with this issue. The first possibility I thought of was to have a message in the HTML document, and then remove it with JavaScript as soon as the page loads. If I have a JavaScript-based Tetris game, I could write in the document's body "JavaScript Tetris is a fun application based on the classic addictive game we all know", and have it instantly removed with JavaScript. This way, the web crawler would pick up this description when indexing the page, but visitors would not see it.
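To make the first idea concrete, here is a minimal sketch of the removal step. The element id `tetris-description` and the function name `removeFallback` are made up for illustration; the page is assumed to contain the description inside an element with that id.

```javascript
// Hypothetical id of the element that holds the static description.
const FALLBACK_ID = "tetris-description";

// Removes the static description from the given document, if present.
// Returns true when an element was removed, false otherwise.
function removeFallback(doc) {
  const node = doc.getElementById(FALLBACK_ID);
  if (!node || !node.parentNode) {
    return false;
  }
  node.parentNode.removeChild(node);
  return true;
}

// In a browser, run once the DOM is ready; outside a browser, do nothing.
if (typeof document !== "undefined") {
  document.addEventListener("DOMContentLoaded", () => removeFallback(document));
}
```

A visitor with JavaScript enabled would never see the text, while anything that reads the raw HTML without running scripts would still find it.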
Another idea I had was to use .htaccess to secretly redirect web crawlers to a fake copy of the document that contains nothing but the description. The web crawler would then index the page with that text, but when users (who are not web crawlers) access the page, they are not redirected and thus get the real application instead. This solution scares me a little though, as I don't think most search engines would look kindly upon being tricked about the content of a webpage. Indeed, this trick could be used for malicious purposes, and it wouldn't surprise me if big search engines like Google punished you by not indexing your page if they found out.
JavaScript is invisible to web crawlers but also to vast numbers of humans. Whatever you do with JavaScript, make certain that there is a non-JavaScript alternative. As for those Ajax comments, the script that answers the Ajax request should also know how to send the comments as a whole web page, and you should use the NOSCRIPT element to provide a link to that service. Problem solved.
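As a rough sketch of that double duty, assuming the comments are served by JavaScript on the server: one function returns a bare fragment to Ajax callers and a whole page to everyone else. The function names and the comment shape (`author`, `text`) are hypothetical, and the sketch assumes the Ajax script sets an `X-Requested-With: XMLHttpRequest` header itself, which is a common convention rather than something browsers do automatically.

```javascript
// Escapes text for safe inclusion in HTML.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Renders the comment list as an HTML fragment.
function renderCommentList(comments) {
  const items = comments
    .map((c) => `<li><b>${escapeHtml(c.author)}</b>: ${escapeHtml(c.text)}</li>`)
    .join("");
  return `<ul class="comments">${items}</ul>`;
}

// Returns the fragment for Ajax callers and a whole page for everyone else.
// Assumption: header names arrive in lower case, and the Ajax script sets
// "X-Requested-With: XMLHttpRequest" on its own requests.
function respondWithComments(headers, comments) {
  const fragment = renderCommentList(comments);
  const isAjax = headers["x-requested-with"] === "XMLHttpRequest";
  if (isAjax) {
    return fragment;
  }
  return `<!DOCTYPE html><html><head><title>Comments</title></head><body>${fragment}</body></html>`;
}
```

The link inside the NOSCRIPT element would point at the same URL, so crawlers and visitors without JavaScript get the whole page while the Ajax script keeps receiving the fragment.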
“The power of the Web is in its universality. Access by everyone regardless of disability is an essential aspect.”
—Tim Berners-Lee, W3C Director and inventor of the World Wide Web
Charles, I hope you aren't suggesting I should provide alternatives for my JavaScript applications to users who don't have JavaScript? I agree that requiring JavaScript for simple pages to work is bad practice; any website should display even if you don't have JavaScript or Flash. But to say that JavaScript applications themselves should work without JavaScript is madness.
But to say that JavaScript applications themselves should work without JavaScript is madness.
Not at all. It just takes a tiny bit of thought and typing and your Ajax works double duty.