HTML Tips and Tricks: The Webmaster's Bakery
What is causing all this talk of subversion, espionage, and theft? Magic cookies, of course!
"They're an invasion of my privacy." "Every time you get one, it reads your whole hard drive." "It's like getting a virus, and it can even read all of your private accounting documents!" What is causing all this talk of subversion, espionage, and theft? Magic cookies: a topic that seems to anger many people and bemuse the rest. What exactly are cookies, and what do they do? Why do some Web pages load cookies for no apparent reason? Are they an invasion of your privacy? What kind of information can cookies retrieve, and can you stop them?
Since there seems to be so much confusion on the Internet about cookies, we're going to pick up where Glenn Fleishman left off in his last "Web Talk for Wireheads" column (Web Developer® January/February 1997). We're going to mix up some cookies and bake a batch, showing you just what they're made of. Grab yourself a big glass of milk and a napkin: it's cookie time!
According to Netscape's document, Persistent Client State HTTP Cookies, cookies "are a general mechanism which server side connections (such as CGI scripts) can use to both store and retrieve information on the client side of the connection." When your browser contacts a Web server, the server may send a piece of data called "state information" that is stored on your machine. This information includes the range of URLs that may access it. If you later request a document from any server within that range, the "state information" object is sent back to the server as part of the request.
This "state object" is known as a "cookie" or a "magic cookie." Why a cookie? Perhaps someone was just fond of cookies...no one really knows why.
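To make the "range of URLs" idea concrete, here is a minimal sketch of how a browser might decide whether a stored cookie goes along with a request. The function name and the shape of the cookie object are assumptions made for illustration, and the matching is a simplified reading of the tail-matching and path-prefix rules, not Netscape's actual code.

```javascript
// Sketch: does a stored cookie apply to a request for host + path?
// The cookie object shape ({ domain, path, secure }) is assumed for illustration.
function cookieApplies(cookie, host, path, isSecure) {
  const h = host.toLowerCase();
  const d = cookie.domain.toLowerCase();
  // Tail match: "www.example.com" matches ".example.com" or "example.com".
  const bare = d.startsWith(".") ? d.slice(1) : d;
  const domainOk = h === bare || h.endsWith("." + bare);
  // Path match: the cookie's path must be a prefix of the requested path.
  const pathOk = path.startsWith(cookie.path);
  // A cookie marked secure is only sent over a secure connection.
  const secureOk = !cookie.secure || isSecure;
  return domainOk && pathOk && secureOk;
}

const cookie = { domain: ".example.com", path: "/shop", secure: false };
console.log(cookieApplies(cookie, "www.example.com", "/shop/cart.html", false)); // true
console.log(cookieApplies(cookie, "www.example.org", "/shop/cart.html", false)); // false
console.log(cookieApplies(cookie, "www.example.com", "/news/", false));          // false
```

A server outside the cookie's domain, or a page outside its path, never sees the cookie at all.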
So what information can a server get from you just by receiving your request? The information below has nothing to do with cookies; it is passed from the client to the server with every request your browser makes, whether that request is for an image, audio clip, CGI file, text file, or Web page:
- Referrer - This is the Web page that referred the client; the page that the client is coming from.
- UserAgent - The name of the browser.
- RemoteAddress - IP address of the requesting client. This isn't necessarily the client's IP, as it may just be the IP of the host they are connected to. Also, those who connect with dynamic IPs will have a different IP address each time they log on to the Internet.
- RemoteHost - This is the fully qualified domain name of the requesting client. If the server cannot determine it, it's set to null.
- RemoteUser - This is the user ID sent by the client. It is available only when the server supports user authentication, and most Web servers are not set up to do so.
- RequestMethod - Just as it suggests, the request method used by the client. It is one of GET, POST, HEAD, PUT, DELETE, LINK, or UNLINK. Nothing sinister here, just the actual method the client uses to retrieve data from the server.
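As a rough illustration of the list above, the sketch below gathers the same details from a request object shaped like the one Node.js hands to an HTTP handler. Node.js is an assumption here (a CGI script would read environment variables instead), and `describeRequest` and the sample request are made up for the example.

```javascript
// Sketch: collect the request details listed above from a Node.js-style
// request object. Node.js is an assumption; a CGI script would read
// environment variables such as HTTP_REFERER and REMOTE_ADDR instead.
function describeRequest(req) {
  return {
    referrer: req.headers["referer"] || null,      // the HTTP header is spelled "Referer"
    userAgent: req.headers["user-agent"] || null,  // the browser's name
    remoteAddress: req.socket.remoteAddress,       // IP address of the requesting client
    requestMethod: req.method,                     // GET, POST, HEAD, ...
  };
}

// A hand-built stand-in for an incoming request, for illustration only.
const fakeRequest = {
  method: "GET",
  headers: {
    referer: "http://www.example.com/links.html",
    "user-agent": "Mozilla/3.0",
  },
  socket: { remoteAddress: "192.0.2.10" },
};
console.log(describeRequest(fakeRequest));
```

Nothing in this object comes from your hard drive or your preferences; it is only what the browser chose to put in the request.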
The server cannot grab your e-mail address or any other information from your browser's user preferences, your hard drive, or anywhere else, unless you actually fill out a form online and send it in. If you did that, the information could then be saved inside a cookie, and the next time you visited the site, the server would be able to "remember" that information about you.
To see the contents of the cookie file, just open it in any text editor. In Windows versions of Navigator, it's called cookies.txt and is stored in the same directory as Netscape; Mac users will find it in the Netscape folder inside the System|Preferences folder. Internet Explorer stores its cookies as separate files (one for each site) in folders named "Temporary Internet Files" or "Cookies" under the main Windows folder. You'll be able to see all the sites you've visited that stored cookies, and much of the information they've stored (although cookies.txt is a text file, some of the stored values may be difficult to decipher). Each cookie entry generally records:
- the domain from which the cookie originated
- if the cookie requires a secure transmission or not
- specific URLs that may access the cookie
- the cookie's expiration date
- the name of the cookie item
- the actual data for the cookie item
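For the curious, the fields above can be pulled out of a cookies.txt line with a few lines of script. This is a sketch that assumes the tab-separated, seven-field layout Navigator writes (domain, all-hosts flag, path, secure flag, expiration in seconds since 1970, name, value); `parseCookieLine` and the sample line are invented for the example.

```javascript
// Sketch: split one tab-separated line of a Netscape-style cookies.txt file.
// Assumed field order: domain, all-hosts flag, path, secure flag,
// expiration (seconds since 1970), name, value.
function parseCookieLine(line) {
  const fields = line.split("\t");
  // Skip comment lines and anything that does not have exactly seven fields.
  if (line.startsWith("#") || fields.length !== 7) return null;
  const [domain, allHosts, path, secure, expires, name, value] = fields;
  return {
    domain,
    allHostsInDomain: allHosts === "TRUE",
    path,
    secure: secure === "TRUE",
    expires: new Date(Number(expires) * 1000), // seconds -> milliseconds
    name,
    value,
  };
}

// An invented sample line, not taken from a real cookie file.
const sample = ".example.com\tTRUE\t/\tFALSE\t946684799\tOurCookie\tisnowset";
console.log(parseCookieLine(sample));
```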
Microsoft's Internet Explorer limits each domain to a single cookie, while Netscape Navigator allows up to 20 per domain and limits the total number of cookies in the cookies.txt file to 300.
If you want to know each time you are being sent a cookie, both the latest Navigator and Internet Explorer can be configured to notify you when a cookie is sent. Once you've done that, you'll see just how many sites are using cookies.
So how do these cookies get set? The code, boss, the code! Cookies are pretty straightforward; the hard part is using the information that you store in them effectively. Netscape's cookie page explains the syntax for setting cookies using CGI, and the January/February 1997 issue's "Web Talk for Wireheads" went into detail on the use of CGI for setting cookies, so we won't spend any time on that here. Instead, we'll set our cookie from JavaScript, which takes a single line:
document.cookie = "OurCookie=isnowset; PATH = /";
This sets the new cookie, "OurCookie," with a value of "isnowset." The PATH is set to the root directory, so any page at or below the root can read the cookie, as long as it comes from the same domain. If no expiration date is set, as in this example, the cookie is kept in memory and isn't written to the cookies.txt file.
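If you do want the cookie to outlive the browser session, it needs an expiration date. The sketch below builds such a cookie string; `buildCookie` and its option names are hypothetical helpers for this illustration, and it uses `encodeURIComponent` to protect the value, which is a choice made here and not something the article prescribes.

```javascript
// Sketch: build a cookie string with an optional expiration date.
// buildCookie and its option names are hypothetical, not from the article.
function buildCookie(name, value, options = {}) {
  let text = name + "=" + encodeURIComponent(value);
  if (options.expires) text += "; expires=" + options.expires.toUTCString();
  if (options.path) text += "; path=" + options.path;
  if (options.domain) text += "; domain=" + options.domain;
  if (options.secure) text += "; secure";
  return text;
}

// Keep the cookie for 30 days so it survives a browser restart.
const thirtyDays = 30 * 24 * 60 * 60 * 1000;
const persistent = buildCookie("OurCookie", "isnowset", {
  expires: new Date(Date.now() + thirtyDays),
  path: "/",
});
console.log(persistent);
// In a browser, assign the string to store it: document.cookie = persistent;
```

Setting the same cookie again with an expiration date in the past is the usual way to delete it.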
So what kind of stuff can you do using a simple batch of cookies? How about letting a visitor specify whether or not they want frames, and using that decision to write the page accordingly, on the fly? I've set this up so that when the visitor first goes to the page, the non-framed page has a checkbox to click if they prefer frames. When the page is reloaded, frames are displayed, and later, when they go back to the site, their preferences are noted, and frames are again loaded. If the visitor changes their mind, the frames version features the same checkbox to turn frames off.
This script starts with some public cookie functions for setting and retrieving cookies that Bill Dortch has been so kind as to freely give to the Web community.
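To give a feel for the retrieval side, here is a minimal sketch of the idea. It is not Bill Dortch's code: `readCookie` and `chooseLayout` are written for this illustration, and the cookie name "frames" with the value "yes" is a made-up convention for recording the visitor's preference.

```javascript
// Sketch: look up one cookie in a document.cookie-style string
// ("name1=value1; name2=value2") and use it to pick a page layout.
function readCookie(cookieText, name) {
  for (const pair of cookieText.split(";")) {
    const eq = pair.indexOf("=");
    if (eq === -1) continue; // not a name=value pair
    if (pair.slice(0, eq).trim() === name) {
      return decodeURIComponent(pair.slice(eq + 1).trim());
    }
  }
  return null; // no such cookie
}

// "frames" and its "yes" value are made-up names for this illustration.
function chooseLayout(cookieText) {
  return readCookie(cookieText, "frames") === "yes" ? "framed" : "plain";
}

console.log(chooseLayout("OurCookie=isnowset; frames=yes")); // framed
console.log(chooseLayout("OurCookie=isnowset"));             // plain
console.log(chooseLayout(""));                               // plain
```

In a browser, you would pass `document.cookie` as the first argument and then write the framed or non-framed page according to the result.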