Years ago, when I was young and naive (since then I've just grown older, I suppose), I made a website that used sessions for navigation. A user would click a top-level menu entry, the page would reload and present the next-level menu, and so on. Each previous menu choice was stored in a PHP session variable so that the correct submenu or sub-submenu would be served up.
Needless to say, this proved a bad idea. Saving deep links to any page was impossible because the navigation history stored in the session could not be bookmarked, and search engine spiders didn't make it past the home page since they didn't grok my weird and wonderful navigation system.
Now, though, I am concerned that I might be repeating my mistake. I'm building a website with multiple language support, and I use a session variable ($_SESSION['lang'] to be precise) to store the user's language preference. If $_SESSION['lang'] is not set I revert to a default language.
Will search engines be able to index all languages supported, or only the default language?
12-17-2009, 08:40 AM
I guess it would depend on how sophisticated the robot is: specifically, whether it accepts the session cookie and sends it back on later requests. It is certainly possible for a robot to do that, and the developers at the major search engines are surely capable of it; however, I do not know whether they actually do.
I suppose that doesn't help much other than to say you're not necessarily out of luck, but it might take further research and/or testing to find out what each search engine you care about actually does.
I personally prefer a URL rewriting approach, e.g. "http://example.com/fr/page.php" gets rewritten to "http://example.com/page.php?lang=fr", and then you can do the language processing in page.php based on the existence and value of $_GET['lang']. Or you can use a sub-domain for each non-default language, have it point to the same directory as the default domain, and then parse the value of $_SERVER['SERVER_NAME'] to determine what language to use.
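To make the two options concrete, here is a rough sketch of how page.php might pick the language from either a rewritten lang parameter or the first label of the host name. The function name lang_from_request, the SUPPORTED_LANGS list and the choice of 'en' as the default are my own assumptions for illustration, not something from the posts above, and the rewrite rule itself is assumed to be configured separately in the web server.

```php
<?php
// Hypothetical configuration: the languages the site offers and its default.
const SUPPORTED_LANGS = ['en', 'fr', 'de'];
const DEFAULT_LANG = 'en';

/**
 * Pick the language from a (rewritten) ?lang= parameter, falling back to the
 * first label of the host name, e.g. "fr" in "fr.example.com".
 */
function lang_from_request(array $get, string $serverName): string
{
    $candidates = [
        $get['lang'] ?? null,                      // example.com/page.php?lang=fr
        explode('.', strtolower($serverName))[0],  // fr.example.com
    ];

    foreach ($candidates as $candidate) {
        // Only accept values from the whitelist; ignore anything else.
        if (is_string($candidate) && in_array($candidate, SUPPORTED_LANGS, true)) {
            return $candidate;
        }
    }

    return DEFAULT_LANG;
}
```

In page.php this could be called as lang_from_request($_GET, $_SERVER['SERVER_NAME'] ?? ''). Checking against a whitelist means a host such as www.example.com simply falls through to the default language.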
12-17-2009, 08:52 AM
You can use sessions to store the language that users have chosen, but the page in an alternate language should always (whether the session variable is set or not) be available to anybody by clicking on a link.
The language-specific URLs mentioned above could be rewritten from something like example.com?lang=en
When a GET variable specifying the language is present, it should take precedence over whatever is in the session variable for language, if one exists. This is how your system should be built. It also makes it simpler for users to change languages. Robots, in this case, will have no problem indexing all the content in all the languages, as long as every language version is reachable through links. No need to worry about what robots do with session variables in this scenario!
12-17-2009, 11:25 PM
I appreciate your replies, guys! I've been thinking about it some more, and I agree with the notion that language settings and other navigational details really should be part of the URL. So I'll find a way to do that. It's not rocket science... the hardest part is that I'll have to part with my wonderful ideas of doing everything session-based. Such an elegant idea... if it only would work! :-)
Thanks for your advice! Cheers!!
12-18-2009, 05:02 AM
You can still do it session based. Actually, I think a hybrid concept is better because that way the language variable won't always have to be part of the URL. This is how I conceptually view the script:
1) Grab the GET variable 'lang' from the URL and use it to generate the page in the specified language. If the GET variable is set, also set the session variable 'lang' to the same value.
2) If the GET variable 'lang' doesn't exist, use the value of the session variable 'lang' to generate the page in that language.
3) If neither a GET variable nor a session variable exists, then:
a) show a cover-up div (like a lightbox) which requires the user to select his/her language before continuing, OR
b) show the page in English
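The steps above could be sketched roughly like this. It is only an illustration: the function name resolve_lang and its parameters are made up for the example, the supported languages are passed in as an assumed whitelist, and step 3 is implemented as option (b), falling back to a default language.

```php
<?php
/**
 * Resolve the page language: GET parameter first, then session, then default.
 * $session is passed by reference so that a language chosen through the URL
 * is remembered for later requests.
 */
function resolve_lang(array $get, array &$session, array $supported, string $default): string
{
    // Step 1: a valid ?lang= value wins and is copied into the session.
    $fromUrl = $get['lang'] ?? null;
    if (is_string($fromUrl) && in_array($fromUrl, $supported, true)) {
        $session['lang'] = $fromUrl;
        return $fromUrl;
    }

    // Step 2: no usable GET value, so use the language stored in the session.
    $fromSession = $session['lang'] ?? null;
    if (is_string($fromSession) && in_array($fromSession, $supported, true)) {
        return $fromSession;
    }

    // Step 3 (option b): nothing set anywhere, so serve the default language.
    return $default;
}
```

After session_start(), a page might call it as resolve_lang($_GET, $_SESSION, ['en', 'fr'], 'en'). Because a robot without cookies always falls into step 1 or step 3, every language stays reachable through its link alone.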
I would also always have a language selector in a prominent place on the page. Each entry in it links to the page the user is currently viewing, with an additional lang parameter in the query string.
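One possible way to build those selector links is sketched below. The function name language_links is hypothetical, and it assumes the current request URI (for example $_SERVER['REQUEST_URI']) and the list of supported languages are passed in.

```php
<?php
/**
 * Build one link per language that points at the current page, keeping the
 * existing query string and setting (or replacing) the lang parameter.
 */
function language_links(string $requestUri, array $supported): array
{
    $path = parse_url($requestUri, PHP_URL_PATH) ?: '/';

    // Turn the current query string into an array (empty if there is none).
    parse_str((string) parse_url($requestUri, PHP_URL_QUERY), $query);

    $links = [];
    foreach ($supported as $lang) {
        // array_replace overrides any existing lang value and keeps other keys.
        $params = array_replace($query, ['lang' => $lang]);
        $links[$lang] = $path . '?' . http_build_query($params);
    }

    return $links;
}
```

For a request to /page.php?id=7&lang=en, the entry for 'fr' would be /page.php?id=7&lang=fr. When printing the links into HTML, they should still be passed through htmlspecialchars().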