Hiding source
Pembar
06-26-2009, 02:54 AM
Has anyone noticed that the source code of Google search pages seems to differ from what is shown on screen?
Here's what you can try.
1. Go to www.google.com
2. Type in any word or phrase. I tried "rugby" or "american football".
3. Wait for the page to load, then pick out certain words or phrases in the search results. Now view the source code and do a find for them. You won't be able to find those words, yet you can see them right in front of you.
Does anyone know how Google does it? Shouldn't the source code be exactly what you see on the screen?
coothead
06-26-2009, 07:03 AM
Hi there Pembar,
the phrases are there in the "Source code", but sometimes they have HTML
markup inside them, which gives the impression that the phrase is not in there...
With international <em>rugby</em> rankings
...is just one example.
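To check this for yourself, a rough sketch along these lines strips the tags out of the source before searching it. The function names stripTags and sourceContainsPhrase are made up for illustration, and a regular expression like this is only good enough for a quick check, not for parsing arbitrary HTML.

```javascript
// Remove anything that looks like an HTML tag (simplified; not a real parser).
function stripTags(html) {
  return html.replace(/<[^>]*>/g, "");
}

// Compare text with tags removed, whitespace collapsed and case ignored,
// since none of those are visible in the rendered page.
function sourceContainsPhrase(source, phrase) {
  const normalize = (s) => s.replace(/\s+/g, " ").toLowerCase();
  return normalize(stripTags(source)).includes(normalize(phrase));
}

const source = "With international <em>rugby</em> rankings";

// A plain find fails because the <em> tags split the phrase.
console.log(source.includes("international rugby rankings")); // false

// After stripping the tags, the phrase is found.
console.log(sourceContainsPhrase(source, "international rugby rankings")); // true
```

A browser's find-in-source does the first kind of search, which is why a phrase with emphasis tags in the middle can look as if it is missing.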
coothead
Pembar
06-26-2009, 07:27 AM
Strange, I've tried it a couple of times and I honestly don't see it (same in both IE and FF).
One thing I noticed: when I "view source", I only get to see part of the source code. The HTML starts off correctly with <HTML>, but the last thing in there is </script>; there is no </body></html> at all.
Charles
06-26-2009, 12:04 PM
Google's HTML is pretty bad, but not that bad. Both the start and end tags of the HTML, HEAD and BODY elements are completely optional. Google is omitting them, as is their right.
I just found this thread while researching this very issue, specifically with regard to the Google search results page.
You are indeed correct that the links, and the associated text, are not in the code for the page.
The reason I'm concerned about this is that I noticed in Firefox that when I click on a link in the Google results page, I'm being redirected through some other link before arriving at my destination.
I don't know if Google is doing this, or if some malware has gotten onto my system.
I noticed that this is only happening to me in Firefox, not Internet Explorer 8 (though I've been having problems loading pages in IE8, which got me using Firefox in the first place).
I ran identical Google searches in both Firefox (v3.52, by the way, and running in Firefox safe mode) and in IE8. The resulting IE8 page code was similar to what you might expect, with all of the links explicit in the page code. The Firefox code was similar to what you have described and posted.
A quick (15-minute) look at the generated page code tells me that the search results are being stored in an array, or possibly externally (there is a reference to a .js file; the code is obscure, no doubt purposely), and the links and much of the text are being written on the fly.
Generating page content with JavaScript on the client side, within the browser, is not a new trick. If a script calls the document's write method (I think it's document.writeln, but I don't have time to look it up right now; I need to finish this up and go), whatever it writes is inserted into the page at that point, as if it had been static text (possibly including HTML) downloaded from the server.
Beware, if you decide to do this: if someone has JavaScript turned off in his/her browser, this technique won't work.
In fact, if you turn off javascript in your browser and retry the example that you've described in this thread, you'll discover that the web page source code will be what you probably expected it to be, with all the HTML code explicitly spelled out.
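Here is a minimal sketch of the general idea, not Google's actual code. The results array, the example.com URLs and the names escapeHtml, writeResults and fakeDocument are all invented for illustration. The only real browser call assumed is document.writeln, which exists alongside document.write and adds a newline after what it writes.

```javascript
// Hypothetical result data; the real page's data format is not known.
const results = [
  { url: "http://example.com/rugby", title: "Rugby rankings" },
  { url: "http://example.com/football", title: "American football" },
];

// Escape text so it is safe to place inside HTML.
function escapeHtml(text) {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Write one link per result through any object that has a writeln method.
// In a browser, pass the global `document` while the page is still loading.
function writeResults(doc, items) {
  for (const item of items) {
    doc.writeln(
      '<p><a href="' + escapeHtml(item.url) + '">' +
      escapeHtml(item.title) + "</a></p>"
    );
  }
}

// Stand-in for `document` so the sketch also runs outside a browser.
const fakeDocument = {
  lines: [],
  writeln(html) {
    this.lines.push(html);
  },
};

writeResults(fakeDocument, results);
console.log(fakeDocument.lines.join("\n"));
```

In a real page, the source sent by the server would contain only the script and the array. The links themselves would exist only after the script has run, which is why "view source" does not show them.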
Now I just need to figure out if it's Google rewriting the code, or if it's something even more sinister (even if it's Google, it's sinister, because I'm being redirected through an intermediate link without being forewarned).
Anyway, good luck, and you're not crazy. The full HTML is being hidden from you.
Tom
webdeveloper.com