WebDeveloper.com®: Where Web Developers and Designers Learn How to Build Web Sites, Program in Java and JavaScript, and More!

Web Log Analysis: Who's Doing What, When? Part 3

by Glenn Fleishman

BYTECOUNT


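    #!/usr/local/bin/perl
    require 'ctime.pl';

    $root = "/usr/logs/";

    # Take the log file name from the command line, or prompt for it.
    if ($ARGV[0]) {
        $x = $ARGV[0];
    } else {
        print "File name? ";
        $x = <STDIN>;
        chop $x;
    }
    while (!-e "$root/$x") {
        print "Bad file name\nFile name? ";
        $x = <STDIN>;
        chop $x;
    }

    # Read gzipped logs through gunzip; read plain logs directly.
    if ($x =~ /\.gz$/i) {
        open (IN, "gunzip -c $root/$x |");
    } else {
        open (IN, "< $root/$x");
    }

    while (<IN>) {
        # Dummy match clears $1..$4 so a failed match below can't reuse old values.
        /()()()()/;
        # $2 = top-level directory, $3 = completion code, $4 = bytes.
        /\"(GET|POST|HEAD) \/([^\/\s\-A-Z]+)\/[^\"]*\" ([0-9]*) ([0-9]*)/;
        $clients{$2} += $4;
        $i++;
    }
    close IN;

    # Append a dated report listing directories that served over 100,000 bytes.
    open (OUT, ">> bytereport");
    select (OUT);
    print &ctime(time) . "\n";
    foreach (sort keys %clients) {
        if ($clients{$_} > 100000) {
            printf "%-15.15s : %15.15s\n", $_, $clients{$_};
        }
    }
    select (STDOUT);
    close OUT;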
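To see what BYTECOUNT's pattern actually captures, it can help to run a version of it over a few lines held in memory. The following is a minimal sketch, not part of the original listing: the sample hosts and paths are invented, the helper name `bytes_by_directory` is hypothetical, and the pattern is adapted slightly (non-capturing groups, and a line that doesn't match is skipped rather than counted under an empty key).

```perl
use strict;
use warnings;

# Hypothetical sample lines; hosts and paths are invented for illustration.
my @lines = (
    'a.example.org - - [01/Mar/1996:10:00:00 -0800] "GET /film/index.html HTTP/1.0" 200 1500',
    'b.example.org - - [01/Mar/1996:10:00:05 -0800] "GET /film/logos/main.gif HTTP/1.0" 200 278',
    'c.example.org - - [01/Mar/1996:10:00:09 -0800] "GET /books/list.html HTTP/1.0" 200 4000',
    'd.example.org - - [01/Mar/1996:10:00:12 -0800] "GET /index.html HTTP/1.0" 200 900',
);

# Sum bytes per top-level directory with a pattern adapted from BYTECOUNT.
sub bytes_by_directory {
    my %total;
    for my $line (@_) {
        # $1 = top-level directory, $2 = bytes.
        if ($line =~ m{"(?:GET|POST|HEAD) /([^/\s\-A-Z]+)/[^"]*" [0-9]* ([0-9]+)}) {
            $total{$1} += $2;
        }
    }
    return %total;
}

my %total = bytes_by_directory(@lines);
printf "%-15.15s : %15.15s\n", $_, $total{$_} for sort keys %total;
```

With these four lines, film totals 1778 bytes and books 4000. The request for /index.html is left out because the pattern requires a directory name followed by a slash, so files at the top level of the site are never counted.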


QUICKDIRTY


    #!/usr/local/bin/perl

    $root = "/usr/logs/";
    $clientthres = 1000;    # list browsers seen more than this many times
    $refthres = 10;         # list referers seen at least this many times

    # Take the log file name from the command line, or prompt for it.
    if ($ARGV[0]) {
        $x = $ARGV[0];
    } else {
        print "File name? ";
        $x = <STDIN>;
        chop $x;
    }
    while (!-e "$root/$x") {
        print "Bad file name\nFile name? ";
        $x = <STDIN>;
        chop $x;
    }

    # Work out the site's own host name from a log name of the form
    # httpd-log.NAME.something, so internal referers can be skipped.
    /()/;
    $x =~ /httpd\-log\.([^\.]*)\./;
    if ($1) { $nomain = "www.${1}"; } else { $nomain = "niente"; }

    # Open the same path that was tested above.
    open (ENV, "< $root/$x") || die "Can't open file\n";
    while (<ENV>) {
        /()()/;
        chop;
        # The line ends in three quoted fields; the first two are
        # the referer and the browser.
        /\"([^\"]*)\" \"([^\"]*)\" \"[^\"]*\"$/;
        $ref = $1;
        $cli = $2;
        /()()/;
        /\"(GET|POST|HEAD) \/([^\/]*)\//;
        $head = $2;
        if ($ref ne "-" && $ref =~ /http/i && $ref !~ /$nomain/i) {
            $url{$ref}++;
        }
        if ($cli) { $browser{$cli}++; }
    }

    # When the file name was prompted for, write the report to env.temp.
    if (!$ARGV[0]) {
        open (OUT, "> env.temp");
        select (OUT);
    }

    print "\nREFERERS:\n";
    # Group referers by hit count, then blank out counts below the threshold.
    foreach (sort keys %url) { $urlnum{$url{$_}} .= "$_\n"; }
    for $i (0..($refthres - 1)) { $urlnum{$i} = ""; }
    foreach $num (sort numerically keys %urlnum) {
        foreach $val (split('\n', $urlnum{$num})) {
            printf ("%-70.70s : %5.5s\n", $val, $num);
        }
    }

    print "\n\nBROWSERS:\n";
    foreach (sort keys %browser) {
        if ($browser{$_} > $clientthres) {
            printf ("%-60.60s : %10.10s\n", $_, $browser{$_});
        }
    }
    select (STDOUT);

    sub numerically { $a <=> $b; }
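The referer and browser tally at the heart of QUICKDIRTY can be tried on a handful of lines without touching a real log. This is a rough sketch under stated assumptions: the sample lines are invented, the third trailing quoted field (which the listing matches but never uses) is shown as "-", and the helper name `tally` is hypothetical. Unlike the listing, it quotes the host name with \Q...\E so the dots match literally, and it sorts referers by count directly rather than grouping them by count first.

```perl
use strict;
use warnings;

# Hypothetical log lines ending in three quoted fields: referer, browser,
# and a third field that is ignored (shown here as "-").
my @lines = (
    'a.example.org - - [01/Mar/1996:10:00:00 -0800] "GET /film/ HTTP/1.0" 200 1500 "http://search.example.com/q" "Mozilla/2.0" "-"',
    'b.example.org - - [01/Mar/1996:10:00:05 -0800] "GET /film/ HTTP/1.0" 200 1500 "http://search.example.com/q" "Mozilla/1.1N" "-"',
    'c.example.org - - [01/Mar/1996:10:00:09 -0800] "GET /film/a.html HTTP/1.0" 200 300 "http://www.foobar.org/film/" "Mozilla/2.0" "-"',
    'd.example.org - - [01/Mar/1996:10:00:12 -0800] "GET /film/ HTTP/1.0" 200 1500 "-" "Lynx/2.4" "-"',
);

# Count external referers and browsers. $own_host is the site's own host
# name; referers containing it are treated as internal and skipped.
sub tally {
    my ($own_host, @log) = @_;
    my (%url, %browser);
    for my $line (@log) {
        next unless $line =~ /"([^"]*)" "([^"]*)" "[^"]*"$/;
        my ($ref, $cli) = ($1, $2);
        $url{$ref}++
            if $ref ne '-' && $ref =~ /http/i && $ref !~ /\Q$own_host\E/i;
        $browser{$cli}++ if $cli;
    }
    return (\%url, \%browser);
}

my ($url, $browser) = tally('www.foobar.org', @lines);

# Most frequent referer first; ties are broken alphabetically.
for my $ref (sort { $url->{$b} <=> $url->{$a} || $a cmp $b } keys %$url) {
    printf "%-70.70s : %5.5s\n", $ref, $url->{$ref};
}
```

Only the search.example.com referer survives here, with a count of 2: the third line's referer is the site itself, and the fourth has no referer at all.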

Common Log Format

The common log format appears exactly as follows:

 host/ip rfcname logname [DD/MMM/YYYY:HH:MM:SS -0000] "METHOD /PATH HTTP/1.0" code bytes 

host/ip: If reverse DNS works and DNS lookup is enabled, the client's hostname appears; otherwise the IP number displays.
rfcname: If you enable identd (see Web Developer® Spring 1996 issue, p. 23), you can retrieve a name from the remote server for the user. If no value is present, a "-" is substituted.
logname: If you're using local authentication and registration, the user's log name appears; likewise, if no value is present, a "-" is substituted.
datestamp: The format is day, month (three-letter abbreviation), year, hour in 24-hour clock, minute, second, and the offset from Greenwich Mean Time (for example, Pacific Standard Time is -0800).
retrieval: Method is GET, PUT, POST, or HEAD; path is the path and file retrieved; HTTP/1.0 identifies the protocol.
code: HTTP completion code. 200 is successful, 304 means the file is unchanged and the client can reuse its cached copy, 404 is file not found, and so forth.
bytes: The number of bytes in the file retrieved.

Here's an example:

 sniksnak.foobar.org - - [29/Feb/1996:06:03:24 -0800] "GET /film/logos/the.movies.main.gif HTTP/1.0" 200 278 
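Pulling a line in this format apart takes a single regular expression. Here's a minimal sketch, assuming the fields appear exactly as described above; the helper name `parse_clf` is hypothetical, and the sample line is modeled on the example.

```perl
use strict;
use warnings;

# Split one common log format line into its seven fields:
# host, rfcname, logname, datestamp, retrieval, code, bytes.
# Returns an empty list if the line does not match.
sub parse_clf {
    my ($line) = @_;
    # Bytes may be "-" on some servers when nothing was sent.
    my @f = $line =~
        m/^(\S+) (\S+) (\S+) \[([^\]]+)\] "([^"]*)" (\d{3}) (\d+|-)\s*$/;
    return @f;
}

my $sample = 'sniksnak.foobar.org - - [29/Feb/1996:06:03:24 -0800] '
           . '"GET /film/logos/the.movies.main.gif HTTP/1.0" 200 278';

my ($host, $rfcname, $logname, $date, $request, $code, $bytes)
    = parse_clf($sample);

# The retrieval field breaks down further into method, path, and protocol.
my ($method, $path, $protocol) = split ' ', $request;

print "host=$host method=$method path=$path code=$code bytes=$bytes\n";
```

Anchoring the pattern at both ends means a truncated or malformed line yields no fields at all, which is usually easier to deal with than a partial match.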
