WebDeveloper.com ®: Where Web Developers and Designers Learn How to Build Web Sites, Program in Java and JavaScript, and More!   
Web Log Analysis: Who's Doing What, When?
Part 3

by Glenn Fleishman

BYTECOUNT


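#!/usr/local/bin/perl

# bytecount: total the bytes served for each top-level directory in a
# log file and append the totals over 100,000 bytes to "bytereport".

$root = "/usr/logs/";

# Take the log file name from the command line, or prompt for it.
if ($ARGV[0]) { $x = $ARGV[0]; }
else { print "File name? "; $x = <STDIN>; chomp $x; }
while (!-e "$root/$x") {
	print "Bad file name\nFile name? ";
	$x = <STDIN>;
	chomp $x;
}

# Read gzipped logs through gunzip; read anything else directly.
# The list form of open keeps the file name away from the shell.
if ($x =~ /\.gz$/i) {
	open (IN, "-|", "gunzip", "-c", "$root/$x") || die "Can't run gunzip: $!\n";
} else {
	open (IN, "<", "$root/$x") || die "Can't open $root/$x: $!\n";
}

while (<IN>) {
	# $1 = method, $2 = top-level directory, $3 = status code, $4 = bytes.
	# The pattern must sit on one line; testing the match replaces the
	# old /()()()()/ trick for clearing stale captures.
	if (/\"(GET|POST|HEAD) \/([^\/\s\-A-Z]+)\/[^\"]*\" ([0-9]*) ([0-9]*)/) {
		$clients{$2} += $4;
	}
	$i++;
}
close IN;

open (OUT, ">>", "bytereport") || die "Can't open bytereport: $!\n";
select (OUT);
# scalar(localtime) stands in for ctime.pl, which current Perls no longer ship.
print scalar(localtime) . "\n\n";
foreach (sort keys %clients) {
	if ($clients{$_} > 100000) {
		printf "%-15.15s : %15.15s\n", $_, $clients{$_};
	}
}
select (STDOUT);
close OUT;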


QUICKDIRTY


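#!/usr/local/bin/perl

# quickdirty: count referring URLs and browser (client) strings in a log file.

$root = "/usr/logs/";
$clientthres = 1000;	# list browsers seen more than this many times
$refthres = 10;		# list referers seen at least this many times

# Take the log file name from the command line, or prompt for it.
if ($ARGV[0]) { $x = $ARGV[0]; }
else { print "File name? "; $x = <STDIN>; chomp $x; }
while (!-e "$root/$x") {
	print "Bad file name\nFile name? ";
	$x = <STDIN>;
	chomp $x;
}

# Work out the site's own host name from a file name that starts
# "httpd-log.<name>." so that referrals from the site itself are skipped.
if ($x =~ /httpd\-log\.([^\.]*)\./ && $1) { $nomain = "www.${1}"; }
else { $nomain = "niente"; }

# Open the same path that was tested above.
open (ENV, "<", "$root/$x") || die "Can't open file\n";
while (<ENV>) {
	chomp;
	($ref, $cli, $head) = ("", "", "");
	# The line ends with three quoted fields: $1 is the referer,
	# $2 is the client, and the third is ignored.
	if (/\"([^\"]*)\" \"([^\"]*)\" \"[^\"]*\"$/) {
		$ref = $1;
		$cli = $2;
	}
	# Top-level directory of the request (not used in the reports below).
	if (/\"(GET|POST|HEAD) \/([^\/]*)\//) { $head = $2; }
	if ($ref ne "-" && $ref =~ /http/i && $ref !~ /$nomain/i)
		{ $url{$ref}++; }
	if ($cli) { $browser{$cli}++; }
}
close ENV;

# When the file name was typed at the prompt, write the report to env.temp.
if (!$ARGV[0]) {
	open (OUT, ">", "env.temp") || die "Can't open env.temp: $!\n";
	select (OUT);
}
print "\nREFERERS:\n";
# Group referers by hit count: $urlnum{count} is a newline-separated list.
foreach (sort keys %url)
	{ $urlnum{$url{$_}} .= "$_\n"; }

# Blank out every count below the threshold.
for $i (0..($refthres - 1)) { $urlnum{$i} = ""; }

foreach $num (sort numerically keys %urlnum) {
	foreach $val (split(/\n/, $urlnum{$num})) {
		printf ("%-70.70s : %5.5s\n", $val, $num);
	}
}
print "\n\nBROWSERS:\n";
foreach (sort keys %browser) {
	if ($browser{$_} > $clientthres) {
		printf ("%-60.60s : %10.10s\n", $_, $browser{$_});
	}
}
select (STDOUT);

sub numerically { $a <=> $b; }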

Common Log Format

The common log format puts each request on a single line, wrapped here to fit:

host/ip rfcname logname [DD/MMM/YYYY:HH:MM:SS -0000]
"METHOD /PATH HTTP/1.0" code bytes

host/ip: If reverse DNS works and DNS lookup is enabled, the client's hostname is recorded; otherwise its IP address appears.
rfcname: If you enable identd (see Web Developer® Spring 1996 issue, p. 23), you can retrieve a name from the remote server for the user. If no value is present, a "-" is substituted.
logname: If you're using local authentication and registration, the user's log name appears; likewise, if no value is present, a "-" is substituted.
datestamp: The day, month (three-letter abbreviation), year, hour on a 24-hour clock, minute, second, and the offset from Greenwich Mean Time (for example, Pacific Standard Time is -0800).
retrieval: The method is GET, PUT, POST, or HEAD; the path is the path and file retrieved; HTTP/1.0 identifies the protocol.
code: The HTTP status code. 200 is success, 304 means the file has not changed and the client's cached copy is used, 404 is file not found, and so forth.
bytes: The number of bytes in the file retrieved.
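
The datestamp is the one field that needs arithmetic before you can sort or compare entries from servers in different time zones. Here is a minimal sketch of one way to turn it into seconds since the epoch, assuming the core Time::Local module is available. The helper name stamp_to_epoch is made up for this illustration and is not part of the scripts above.

```perl
use strict;
use warnings;
use Time::Local;	# core module that provides timegm()

# Month abbreviations as they appear in the datestamp, mapped to 0..11.
my %MONTH;
@MONTH{qw(Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec)} = (0 .. 11);

# Hypothetical helper: convert "DD/MMM/YYYY:HH:MM:SS -0800" to seconds
# since the epoch. Returns nothing if the stamp is malformed or the
# date is impossible.
sub stamp_to_epoch {
	my ($stamp) = @_;
	return unless $stamp =~
		m{^(\d\d)/([A-Za-z]{3})/(\d{4}):(\d\d):(\d\d):(\d\d) ([+-])(\d\d)(\d\d)$};
	my ($day, $mon, $year, $hour, $min, $sec, $sign, $off_h, $off_m) =
		($1, $2, $3, $4, $5, $6, $7, $8, $9);
	return unless exists $MONTH{$mon};

	# Read the clock fields as if they were GMT; eval traps the error
	# that timegm() raises for out-of-range values.
	my $epoch = eval { timegm($sec, $min, $hour, $day, $MONTH{$mon}, $year) };
	return unless defined $epoch;

	# Local time = GMT + offset, so GMT = local time - offset.
	my $offset = ($off_h * 60 + $off_m) * 60;
	$offset = -$offset if $sign eq '-';
	return $epoch - $offset;
}

my $when = stamp_to_epoch('29/Feb/1996:06:03:24 -0800');
print defined $when ? scalar(gmtime($when)) . " GMT\n" : "unparseable\n";
```

Because timegm() checks the day against the length of the month, a stamp with a date that does not exist comes back undefined rather than as a wrong number.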

Here's an example (one log line, wrapped here to fit):

sniksnak.foobar.org - - [30/Feb/1996:06:03:24 -0800]
"GET /film/logos/the.movies.main.gif HTTP/1.0" 200 278
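
To pull the individual fields out of a line like this one, a single pattern is usually enough. The following is a rough sketch rather than a full parser: the helper name parse_clf and the hash keys are invented for this illustration, and the pattern assumes the fields contain no embedded spaces or quotes.

```perl
use strict;
use warnings;

# Hypothetical helper: split one common log format line into named
# fields. Returns an empty list if the line doesn't fit the pattern.
sub parse_clf {
	my ($line) = @_;
	return unless $line =~
		m{^(\S+) (\S+) (\S+) \[([^\]]+)\] "([A-Z]+) (\S+)(?: (HTTP/[\d.]+))?" (\d{3}) (\d+|-)\s*$};
	return (
		host      => $1,
		rfcname   => $2,
		logname   => $3,
		datestamp => $4,
		method    => $5,
		path      => $6,
		protocol  => defined $7 ? $7 : '',
		code      => $8,
		bytes     => ($9 eq '-' ? 0 : $9),	# "-" means no bytes were logged
	);
}

# The example entry, joined back into the single line the server writes.
my $line = 'sniksnak.foobar.org - - [30/Feb/1996:06:03:24 -0800] '
	. '"GET /film/logos/the.movies.main.gif HTTP/1.0" 200 278';

my %f = parse_clf($line);
for my $k (qw(host rfcname logname datestamp method path protocol code bytes)) {
	print "$k = $f{$k}\n";
}
```

The pattern only checks the shape of each field, so it accepts any datestamp text between the brackets without testing whether the date is real.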




