WebDeveloper.com: Where Web Developers and Designers Learn How to Build Web Sites, Program in Java and JavaScript, and More!

Web Talk for Wireheads: Cookies
Part 2

by Glenn Fleishman

The $2.50 Solution

Setting cookies too often can annoy regular visitors who leave their browser's cookie alert on. Each alert interrupts the flow of their surfing, and may provoke them not to turn the alert off but to avoid the site altogether. One of our clients received a message from a friend asking if we were starting up a Mrs. Fields franchise, given how many cookies we were giving out.

The best course of action, then, is to set cookies rarely: pack as much useful information as possible into each one and give it a long expiration time. Reading a cookie, by contrast, costs the visitor nothing, since it is already part of the HTTP conversation (the browser sends it in the header information it transmits to the server).
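As a rough sketch of what a long expire time looks like in practice, the helper below formats an expiration date in the "Wdy, DD-Mon-YYYY HH:MM:SS GMT" style described in the original Netscape cookie proposal. The name cookie_expires and the one-year lifetime are assumptions made for illustration; neither comes from this column.

```perl
use strict;
use warnings;

# Hypothetical helper: format an expiry date $days days after $now
# (UNIX seconds) in the Netscape cookie date style.
sub cookie_expires {
    my ($now, $days) = @_;
    my @day = qw(Sun Mon Tue Wed Thu Fri Sat);
    my @mon = qw(Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec);
    my ($sec, $min, $hour, $mday, $mon, $year, $wday) =
        gmtime($now + $days * 24 * 60 * 60);
    return sprintf("%s, %02d-%s-%04d %02d:%02d:%02d GMT",
        $day[$wday], $mday, $mon[$mon], $year + 1900, $hour, $min, $sec);
}

# Example: a lastvisit cookie that lasts a year (assumed lifetime).
my $now = time;
print "Set-Cookie: lastvisit=$now; expires=", cookie_expires($now, 365),
      "; path=/\n";
```

With a lifetime this long, the cookie only needs to be sent again when its value actually changes.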

Still, resetting a cookie every time someone enters the site does let you do something as sophisticated as generating a list of pages that have changed since that person's last visit. Scanning the whole site for changed files on every visit would be far too expensive, even for a low-traffic site, so a cron job that builds the list hourly or daily would do the trick.

The script below walks the site's directories recursively, building a short file that lists the pages changed in the last month. You might change this time duration to reflect the frequency of updates on your site. Here's the code that would run from the cron job (it doesn't have to run as root at all; it only needs to read the Web tree and write the list file):

    #!/usr/local/bin/perl
    # Run from cron: list the HTML files changed in the last month.
    $root = "/usr/local/www";
    open (MODLIST, "> $root/cgi-bin/mods/modlist.txt") || die "modlist.txt: $!";
    &dir_recur($root);
    close MODLIST;

    sub dir_recur {
        my ($dir) = @_;
        my ($file, $path, $title, $webpath, @dircontents);
        opendir (CURRDIR, $dir) || return;
        # Skip dot files here; symbolic links are skipped below.
        @dircontents = sort (grep(!/^\./, readdir(CURRDIR)));
        closedir CURRDIR;
        foreach $file (@dircontents) {
            $path = "$dir/$file";
            next if -l $path;
            if ($file =~ /\.html?$/i && (-T $path) && (-M $path) < 32) {
                $title = "";
                open (IN, "< $path") || next;
                while (<IN>) {    # stops at end of file if there is no title
                    if (/<TITLE>(.*)<\/TITLE>/i) { $title = $1; last; }
                }
                close IN;
                ($webpath = $path) =~ s/^\Q$root\E\///;
                # Fields: age in days, title, site-relative path, date changed.
                # The date field is an assumption: the original listing shows
                # only a DATESTAMP placeholder, so the file's modification
                # time is used here.
                print MODLIST join("\t", (-M $path), $title, $webpath,
                    scalar localtime((stat($path))[9])), "\n";
            }
            # Recurse into subdirectories, but not below any directory whose
            # path contains "hyper" (a site-specific exclusion).
            &dir_recur($path) if (-d $path) && $dir !~ /hyper/;
        }
    }

The CGI just has to check against this file to find modified entries. It uses the cookie to determine the time of the last visit. The files "modhead.txt" and "modfoot.txt" should contain the surrounding HTML you want for the resulting page. This could also be called as a server-side include, in which case you can simply remove the header and footer code.

    #!/usr/local/bin/perl
    $modroot = "/usr/local/www/cgi-bin/mods";

    # Days since the last visit, from the lastvisit cookie (UNIX seconds).
    # With no cookie, assume a month.
    if ($ENV{'HTTP_COOKIE'} =~ /lastvisit=(\d+)/) {
        $nettime = (time() - $1) / 60 / 60 / 24;
    } else {
        $nettime = 31;
    }

    open (IN, "< $modroot/modlist.txt") || die "modlist.txt: $!";
    @mods = <IN>;
    close IN;

    print "Content-Type: text/html\n\n<HEAD>\n";
    &dump_file("$modroot/modhead.txt");
    print "<TABLE BORDER=1>\n<TR><TH ALIGN=CENTER VALIGN=TOP>Date Changed</TH>\n",
          "<TH ALIGN=CENTER VALIGN=TOP>Title or File</TH></TR>\n";
    foreach (@mods) {
        chomp;
        @items = split(/\t/);
        next unless $items[0] < $nettime;       # changed since the last visit
        $items[1] = $items[2] if !$items[1];    # no title: show the path
        print "<TR><TD ALIGN=LEFT VALIGN=MIDDLE>$items[3]</TD>\n",
              "<TD ALIGN=LEFT VALIGN=MIDDLE>",
              "<A HREF=\"/$items[2]\">$items[1]</A></TD></TR>\n";
    }
    print "</TABLE>\n";
    &dump_file("$modroot/modfoot.txt");

    # Copy a file to standard output. (Named dump_file rather than dump,
    # which is also a Perl built-in.)
    sub dump_file {
        my ($file) = @_;
        open (DUMP, "< $file") || return;
        while (<DUMP>) { print; }
        close DUMP;
    }
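The elapsed-time step in this CGI is easy to get wrong, because division binds more tightly than subtraction in Perl: the difference in seconds has to be parenthesized before it is converted to days. Here is a minimal sketch of just that step, assuming the cookie holds a UNIX timestamp. The function name days_since_visit and its argument order are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: days elapsed since the lastvisit cookie was set.
# $cookie is the raw Cookie header, $now is the current time in UNIX
# seconds, and $default is returned when no usable cookie is present.
sub days_since_visit {
    my ($cookie, $now, $default) = @_;
    if (defined $cookie && $cookie =~ /(?:^|;\s*)lastvisit=(\d+)/) {
        # Subtract first, then convert seconds to days.
        return ($now - $1) / (60 * 60 * 24);
    }
    return $default;
}

# 216000 seconds (two and a half days) after a visit at time 1000000.
print days_since_visit("id=7; lastvisit=1000000", 1216000, 31), "\n";
# No cookie at all: fall back to the default of 31 days.
print days_since_visit(undef, time, 31), "\n";
```

Passing the current time in as an argument, rather than calling time inside the function, makes the arithmetic easy to check by hand.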

To make this all work, of course, you have to set the cookie. On the home page and other likely entry points, you need some Perl code (or an equivalent) to feed out this HTTP header:

 Set-Cookie: lastvisit=datestamp; domain=.domain.com; path=/ 

Datestamp should be the UNIX datestamp (seconds since 1970); domain.com should be replaced with whatever domain pattern you're using; and path could be set to specific subpaths, if you wanted to keep a separate lastvisit setting for different sections of the site.
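To illustrate the path option, here is a small sketch that builds one lastvisit header per section, so that each section keeps its own record of the last visit. The helper name lastvisit_header, the .example.com domain, and the /news/ and /docs/ paths are placeholders, not values from this column.

```perl
use strict;
use warnings;

# Hypothetical helper: build a Set-Cookie header that records $now
# (UNIX seconds) as the last visit for one section of the site.
sub lastvisit_header {
    my ($now, $domain, $path) = @_;
    return "Set-Cookie: lastvisit=$now; domain=$domain; path=$path";
}

# A script serving pages under /news/ would send the first header, and one
# serving pages under /docs/ the second. The browser then returns each
# value only for requests under the matching path.
my $now = time;
print lastvisit_header($now, ".example.com", "/news/"), "\n";
print lastvisit_header($now, ".example.com", "/docs/"), "\n";
```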

The previous cookie column offers several ideas about how to set this cookie. One I failed to mention is that even if you aren't using an NPH (non-parsed headers) script, just a regular CGI, you can still feed out cookies. Where you would normally feed out only the Content-type header, add whatever other headers you like before the blank line:

 print "Content-type: text/html\nSet-Cookie: lastvisit=" . time . "; domain=.blah.com; path=/\n\n<HEAD>\n"; 

[ < Web Talk for Wireheads: Cookies: Part 1 ]
[ Web Talk for Wireheads: Cookies: Part 3 > ]
