Recording hits


ken_walker_jr
08-03-2003, 02:28 PM
I need to verify hits to a site. I don't know a lot yet about CGI, I'm reading a book, but I need help quickly, if possible. All I need to do is log an IP address and the date and time a hit occurred. Can you please help me?

goofball
08-04-2003, 04:06 PM
If you have access to your server access logs, this information should already be available to you. Inquire of your web host about the access_log and about running site statistics a.k.a traffic reports.

ken_walker_jr
08-04-2003, 06:04 PM
Correct, the info is available. However, I would like to email it directly to the client when it occurs. It's kinda like a 'I don't believe it' kinda thing. He's using a pay-per click service which says he received 400+ hits last month, the host says it's more like 30. Please advise.

goofball
08-05-2003, 11:04 AM
There's about a million different ways to configure/misconfigure the way a server writes to the access_log, and just as many ways to run traffic reports incorrectly. I never trust what the web host's logs are telling me OR what the client says. I'll give you a simple hit-counter written in Perl.
Just make sure that the click-thru provider is redirecting the client's link to the hit counter script instead of straight to the HTML page.


#!/usr/bin/perl

use CGI ':standard';
use Fcntl ':flock';

## Exclusive lock for writing, then move to end-of-file. ##
sub lock {
    flock($_[0],LOCK_EX);
    seek($_[0], 0, 2);
}
sub unlock {
    flock($_[0],LOCK_UN);
}

$ip_address = $ENV{'REMOTE_ADDR'};
$date = gmtime();

## Append one line per hit: IP address, then date and time (GMT). ##
open(LOG, ">>log_file.txt");
&lock('LOG');
print LOG "$ip_address $date\n";
&unlock('LOG');
close(LOG);

print redirect("http://www.yoursite.com/whatever.html");

## This will email the client every 10 hits. Adjust as needed. ##
## *Check with your host about the path to sendmail.* Here it's ##
## /bin/sendmail The -t flag is not part of the filepath. ##

$notify = 10;

open(LOG, "log_file.txt");
@hits = <LOG>;
close(LOG);

$total_hits = @hits;

if(@hits >= $notify && @hits % $notify == 0) {
    open (MAIL,"| /bin/sendmail -t"); ## *path to sendmail ##
    print MAIL "From: you\@yourdomain.com\n";
    print MAIL "To: client\@hisdomain.com\n";
    print MAIL "Bcc: you\@yourdomain.com\n";
    print MAIL "Subject: Pay-per-click hits\n\n";
    print MAIL "Your pay-per-click total has reached $total_hits hits:\n\n";
    foreach(@hits) {
        $number++;
        print MAIL "$number) $_";
    }
    close(MAIL);
}

exit;



P.S. make sure to change the permissions of the script on the server to make it executable.

Let me know how it goes!

cheers

ken_walker_jr
08-05-2003, 04:03 PM
Thank you very much! I will let you know as soon as I have it working.

ken_walker_jr
08-06-2003, 09:13 PM
I just wanted to let you know, it worked flawlessly. (Except for the fact I forgot to change index.html to index.htm, that threw me for a loop for a bit.)
Anyhow, thanks much. I tried it out and it did just what I wanted.

Since I'm new to PERL/CGI, can you tell me, can I call a CGI program from JavaScript?

ken_walker_jr
08-06-2003, 09:20 PM
I take it back, there is one problem. It mailed after the first 10 hits were reached. After that, though several more have come, it did not mail again. I was wondering if I should kill the file after the info is sent, or put some other type of flag in the file to make it restart the count even though more than 10 entries are there. What do you think?

Jeff Mott
08-06-2003, 10:18 PM
ken_walker_jr:
He's using a pay-per click service which says he received 400+ hits last month, the host says it's more like 30. Please advise.
Some simple counters will just mindlessly count hits. For instance, the same person may visit your home page, click a link, hit back, click another link, hit back again, and so on. Your average counter may register 5 counts for that (or even more depending on how long the user browses). The more sophisticated counters (probably what your host is using) will either track the IP address of the computer or set a cookie on it on its first visit, then ignore any more hits from that computer for a given time period. The latter method provides a much more accurate number in regards to how often the site is visited.

goofball
08-07-2003, 08:07 AM
ken_walker_jr:

No, there shouldn't be any need to kill the file. Let me ask you this: How many hits are in the log? You said there were more than 10, but I'm thinking it's still less than 20. That counter should send an email every 10 hits- the first at 10, the second at 20, then 30, and so on.

You see the line that says:
@hits % $notify == 0 ?
That % sign is a "modulus" operator, which means this line divides the number returned by @hits by the number in $notify (in this case 10) and returns the remainder of that division. So as long as the number of lines in your hit counter file is evenly divisible by 10 (or whatever number you assign to the $notify variable), the modulus will return a zero & the email is sent.

* Also note that the same if(test) checks that @hits is greater than or equal to $notify. This is because zero is also evenly divisible: 0 % $notify returns 0, so without that check an empty log would pass the test. For any count from 1 up to $notify - 1 the modulus returns the count itself, which is not zero, so no email is sent until $notify is reached.
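
A quick way to check what that test does is to run it over a handful of counts. This is only an illustrative sketch: should_notify is a made-up helper name, not part of the script above, and it simply wraps the same two-part condition.

```perl
use strict;
use warnings;

# Hypothetical helper: the same test the hit counter uses before mailing.
sub should_notify {
    my ($hits, $notify) = @_;
    return ($hits >= $notify && $hits % $notify == 0) ? 1 : 0;
}

my $notify = 10;

# Print the remainder and the decision for a few sample hit counts.
for my $hits (0, 1, 9, 10, 11, 20, 25, 30) {
    printf "%2d hits: remainder %d, send mail: %s\n",
        $hits, $hits % $notify,
        should_notify($hits, $notify) ? 'yes' : 'no';
}
```

Only the counts 10, 20 and 30 in that list pass both halves of the test.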

Jeff Mott makes a good point about some hit counters "mindlessly counting hits" when a browser goes back to the first page. But this script will avoid some of that. It doesn't keep track of actual visitors via cookies, but you could always compare the IP addresses in the log to differentiate between hits and actual visitors. The redirect() command assures that you won't have any repeat hits if someone refreshes their browser window or hits the Back button.
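
Comparing the IP addresses in the log could look something like the sketch below. It assumes each log line starts with the IP address followed by a space, which is what the script's print statement writes. The helper name count_by_ip and the sample addresses are invented for the example.

```perl
use strict;
use warnings;

# Hypothetical helper: tally hits per IP address from lines of the
# form "IP date", the format the hit counter appends to its log.
sub count_by_ip {
    my @lines = @_;
    my %hits_for;
    for my $line (@lines) {
        my ($ip) = split ' ', $line;   # first whitespace-separated field
        next unless defined $ip;       # skip blank lines
        $hits_for{$ip}++;
    }
    return %hits_for;
}

# Sample data only; these addresses are made up for the example.
my @log = (
    "192.0.2.1 Tue Aug  5 15:04:10 2003\n",
    "192.0.2.1 Tue Aug  5 15:04:42 2003\n",
    "192.0.2.7 Tue Aug  5 16:20:01 2003\n",
);

my %hits_for = count_by_ip(@log);
printf "%d hits from %d distinct addresses\n",
    scalar @log, scalar keys %hits_for;
print "$_: $hits_for{$_}\n" for sort keys %hits_for;
```

Distinct addresses are only a rough stand-in for visitors, since several people can share one address and one person can show up under several.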

Jeff Mott
08-07-2003, 11:37 AM
...of course it is also possible that the data file generated by goofball's script has become corrupted, since it is not locked when reading from or writing to that file.

ken_walker_jr
08-08-2003, 04:03 PM
Well, whatever the cause, it IS working now. I added another section to email me in addition to emailing them. I hit the site and got it up to 40 and it emailed right away. Anyhow, thanks to both of you for all your help.

In training,

goofball
08-15-2003, 08:11 AM
Jeff Mott:
...of course it is also possible that the data file generated by goofball's script has become corrupted. Since it is not locked when reading from or writing to that file.
I updated the script above to include locking the file for writing. But don't lock the file for reading, as you won't be able to read a locked file. That's half the reason for the lock in the first place. (at least in my test, putting those lock() & unlock() functions around the line: @hits = <LOG>; gives you an empty @hits array).

ken_walker_jr:
... I added another section to email me in addition to emailing them.
I wasn't sure what you meant by "another section", but I figured I would add the "Bcc" field to the email part of the script in case you just want to use that. That's all you need to copy yourself on the email without the other recipient knowing about it. (You could also replace "Bcc" with "Cc" if you want a visible carbon copy).

ken_walker_jr
08-15-2003, 08:15 AM
Hey, thanks for the update. That's great. Actually, at the time I didn't know about the Bcc, so I just did the whole sendmail thing over but to me. I'll update the locking too. I really appreciate your help. Also, how can I call a cgi program from JavaScript?

goofball
08-15-2003, 08:23 AM
JavaScript is generally kept on the client side (browser), so if you want to call the cgi, just use JavaScript to direct the browser toward the cgi's URL. The same way you would put a URL into an href in HTML.

document.location = 'http://www.site.com/cgi-bin/script.cgi';

ken_walker_jr
08-15-2003, 08:47 AM
Roger that, it's almost too simple. Thanks again.

Jeff Mott
08-17-2003, 05:08 PM
goofball:
But don't lock the file for reading, as you won't be able to read a locked file.
It is still very important to lock the file when reading. The only difference is that you use a shared lock rather than an exclusive lock.

goofball:
(at least in my test, putting those lock() & unlock() functions around the line: @hits = <LOG>; gives you an empty @hits array).
That is because your unnecessary customized locking routine seeks to EOF. Obviously there is nothing left to read after you do that.
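
Reading under a shared lock might be sketched like this. The helper name read_hits and the throwaway file name are assumptions made for the example; the point is only that the handle is locked with LOCK_SH and never moved to EOF before the read.

```perl
use strict;
use warnings;
use Fcntl ':flock';

# Hypothetical helper: read every line of a file under a shared lock.
sub read_hits {
    my ($path) = @_;
    open(my $fh, '<', $path) or die "Cannot open $path: $!";
    flock($fh, LOCK_SH)      or die "Cannot lock $path: $!";
    my @hits = <$fh>;        # no seek, so reading starts at the beginning
    close($fh)               or die "Cannot close $path: $!";  # close releases the lock
    return @hits;
}

# Demo with a throwaway file name (an assumption for this example).
my $path = "demo_log_$$.txt";
open(my $out, '>', $path) or die "Cannot create $path: $!";
print {$out} "192.0.2.1 first\n192.0.2.7 second\n";
close($out) or die "Cannot close $path: $!";

my @hits = read_hits($path);
print "Read ", scalar(@hits), " lines\n";
unlink $path;
```

Several readers can hold a shared lock at once, while a writer asking for LOCK_EX waits until they are all done.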

goofball
08-18-2003, 08:33 AM
check
so for reading, I should ditch the seek & just use:
flock(HANDLE,LOCK_SH);
... and the same for requiring?

Remember that other thread where we discussed file locking http://forums.webdeveloper.com/showthread.php?s=&threadid=13897
-esp. about using flock with require, ... I was wondering if your test included trying to require a file locked with LOCK_SH ? Maybe I'm setting up my tests wrong, but I can't confirm that flock is doing what it should at all. I've been using flock for only write operations to my database and haven't had any more file corruption problems, but I still can't get my tests to work. Anyway, that's why I ask about your test.

Also, why is it important to lock for reading? Isn't the system just copying data into memory when it reads?

Jeff Mott
08-18-2003, 06:36 PM
goofball:
so for reading, I should ditch the seek & just use:
flock(HANDLE,LOCK_SH);
Yes. Though the more technically correct is
flock(FILEHANDLE, LOCK_SH) or die $!;
A good rule of thumb (this actually applies to all programming languages): if the function returns a true/false for success/failure then check it.

goofball:
... and the same for requiring?
A require statement will implicitly call flock FILEHANDLE, LOCK_SH|LOCK_NB (Note that this information comes from testing. I could not find any documentation to confirm this.) So reading from a file that is to be required is fine, but attempting to require a file that has an exclusive lock in place will cause the program to error out.

goofball:
Also, why is it important to lock for reading? Isn't the system just copying data into memory when it reads?
Locking a file with flock does not cause open statements to wait their turn; it causes other flock calls to wait their turn. If you read from a file without locking it, it is possible that the file could be written to at the same moment. This will not affect the data in the file, but the data read by that particular process may be garbage.
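
The point that flock only affects other flock calls can be tried out with two processes. The sketch below is an illustration under assumptions: it needs a system where fork and flock are available, and the file name is a throwaway chosen for the example. The parent holds an exclusive lock; the child then tries a plain open and a non-blocking shared lock on the same file and reports back through its exit status.

```perl
use strict;
use warnings;
use Fcntl ':flock';

my $path = "advisory_demo_$$.txt";   # throwaway file name, an assumption

# Parent: create the file and hold an exclusive lock on it.
open(my $holder, '>', $path) or die "open: $!";
flock($holder, LOCK_EX)      or die "flock: $!";

my $pid = fork();
die "fork: $!" unless defined $pid;

if ($pid == 0) {
    # Child: a second process looking at the same, still locked, file.
    my $opened = open(my $fh, '<', $path) ? 1 : 0;   # open ignores the lock
    my $locked = ($opened && flock($fh, LOCK_SH | LOCK_NB)) ? 1 : 0;
    exit($opened * 2 + $locked);   # pack both results into the exit status
}

waitpid($pid, 0);
my $status = $? >> 8;
my ($child_opened, $child_locked) = ($status >> 1, $status & 1);

print "open succeeded while locked: ",
    ($child_opened ? "yes" : "no"), "\n";
print "non-blocking flock succeeded: ",
    ($child_locked ? "yes" : "no"), "\n";

close($holder);
unlink $path;
```

The open should go through even though the lock is held, while the non-blocking flock should be refused. Without LOCK_NB the child would simply wait until the parent lets go.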

goofball
08-19-2003, 04:30 PM
ok, cool. That makes sense about locking the file for reading.
And even better, I got my tests to work - finally. By using the following code, I managed to prevent fatal errors when requiring files:

open(FILE,"file.pl") or die $!;
flock(FILE,LOCK_SH) or die $!;
require "file.pl";
flock(FILE,LOCK_UN);
close(FILE);

So when another process has an exclusive lock in place on file.pl, this request for a shared lock blocks until the other process releases its exclusive lock. Then every-ting be 'airy.

But when I turn the test around and have the requiring process lock the file first, I have a problem with the other process - because it wants to exclusively lock the file for an over-write, like so:

open(FILE,">file.pl") or die $!;
flock(FILE,LOCK_EX) or die $!;
print FILE "whatever ... \n";
flock(FILE,LOCK_UN);
close(FILE);

The error occurs when the requiring process is run first and then calls require, because the other process tries to get that exclusive lock after opening the file with the ">" string - clobbering the file just before "require" is happening in the first process. (Whichever script I call first, I have it sleep for 10 seconds between flock() and require -or- between flock() and print. That way I can test if the second process was kept out of the file by the current lock).

... still with me ...?

So my question is: is there a way I can get a FILEHANDLE for "file.pl" (to be passed to flock) without first opening the file (and sometimes clobbering it)? I am aware of the FileHandle and Symbol modules, but don't know exactly how they work yet. I thought there might be a built-in way already in Perl ... ?

Like if I just assign a filepath to a scalar variable, then use that variable as my filehandle? This doesn't seem to work:

$fh = "file.pl";
flock($fh,LOCK_EX) or die $!; ## 'bad file descriptor' error ##
open($fh); .... etc...

goofball
08-20-2003, 08:38 AM
UPDATE:

I found a temporary solution that seems to be working. On the over-writing process, I call open on the file without a mode string, request an exclusive lock, and with the very next statement I call open on the file with a mode string of '>' and lock the file again. That way I avoid clobbering the file until I already have a lock. Perl automatically closes the first file handle when the new one with the same name is opened. (i think)

open(FILE, "file.pl") or die $!;
flock(FILE,LOCK_EX) or die $!;

open(FILE, ">file.pl") or die $!;
flock(FILE,LOCK_EX) or die $!;
seek(FILE, 0, 2);

Since the second open destroys the first lock, there's probably about a millisecond of a window in between the second open & the second lock where another process might be able to get in and grab away the lock, but I figure the chances of that happening are slim.

Still, if you have a more elegant / more correct solution, please let me know. Thanks!

By the way Jeff- I ran the code in your signature ... nice.

Jeff Mott
08-20-2003, 02:17 PM
goofball:
Still, if you have a more elegant / more correct solution, please let me know
open(FILE, '+<file.pl') or die $!;
flock(FILE, LOCK_EX) or die $!;
truncate(FILE, 0) or die $!;
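
Wrapped up as a complete overwrite, the open, lock, truncate sequence might look like the sketch below. The helper name overwrite_locked and the demo file name are assumptions for the example. Note that the '+<' mode expects the file to exist already.

```perl
use strict;
use warnings;
use Fcntl qw(:flock :seek);

# Hypothetical helper: replace a file's contents without clobbering
# the file before the exclusive lock is held.
sub overwrite_locked {
    my ($path, $text) = @_;
    open(my $fh, '+<', $path) or die "Cannot open $path: $!";  # read/write, no truncation
    flock($fh, LOCK_EX)       or die "Cannot lock $path: $!";  # wait for other lockers
    truncate($fh, 0)          or die "Cannot truncate $path: $!";
    seek($fh, 0, SEEK_SET)    or die "Cannot seek $path: $!";  # only matters if the handle was read first
    print {$fh} $text         or die "Cannot write $path: $!";
    close($fh)                or die "Cannot close $path: $!"; # flushes and unlocks
    return;
}

# Demo with a throwaway file name (an assumption for this example).
my $path = "overwrite_demo_$$.pl";
open(my $setup, '>', $path) or die "Cannot create $path: $!";
print {$setup} "old contents that are longer\n";
close($setup) or die "Cannot close $path: $!";

overwrite_locked($path, "new contents\n");

open(my $in, '<', $path) or die "Cannot reopen $path: $!";
my $after = do { local $/; <$in> };   # slurp the whole file
close($in);
unlink $path;
print $after;
```

Because the truncate happens only after the lock is granted, a process that is still reading under a shared lock gets to finish before the old contents disappear.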

goofball
08-20-2003, 03:02 PM
Ok, thanks.
I had tried the mode string "+<" before, but missed the truncate. Now I know why it wasn't working. :cool:

BUT - as this example: http://www.perldoc.com/perl5.6.1/pod/func/flock.html
uses ">>" to append to the file instead of over-writing --
if I need to, I would like to do a similar seek to EOF in case the file was updated while the LOCK_EX request was blocking. Will I still need to do that -or why not? If so, would the seek go before or after truncate?

Jeff Mott
08-20-2003, 10:30 PM
Why would you need to seek at all if you are truncating the file? Just like the > open mode, truncate in my example above will clobber the file. After that is done there is nowhere to seek to. The beginning of the file and the end of the file are the same with nothing in between.

Even when appending to a file, there is no need to seek to EOF after flocking. This is because when a file is opened in append mode Perl will not start printing anywhere except EOF. e.g.,

use Fcntl qw[:seek];

open FH, '>>test.txt' or die $!;
seek FH, 0, SEEK_SET or die $!;
print FH 'Hello ' or die $!;

open FH, '>>test.txt' or die $!;
seek FH, 0, SEEK_SET or die $!;
print FH 'World' or die $!;

goofball
08-21-2003, 07:41 AM
Ok. I wanted to cover all the bases, as I have no prior experience with some of these filehandle-related functions.

Thanks again! Problem solved.

on a side note:
Even when appending to a file, there is no need to seek to EOF after flocking. This is because ...
But I think the reason that the example I linked to http://www.perldoc.com/perl5.6.1/pod/func/flock.html used seek after flock is because the file was opened in >> mode before the lock was requested. Since flocks are requested on already open filehandles, a locked file can still be opened in any mode by another process. If the other process already has the file locked when the >> open occurs, then the EOF position at that time may be changed by the other process' append operation while flock waits. Thus seek is used to get to the new EOF. At least, that's what I gathered...

Is the example incorrect, or do I misunderstand?

Jeff Mott
08-21-2003, 11:52 AM
If you see my last example, I seek to the beginning of the file. That means that just before the print operation the file position is not at EOF. But it will still start printing at EOF regardless. So even if data had been added to the file while waiting for the lock (meaning the file's position is no longer at EOF), the print operation will still begin printing at EOF.
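
The case of data being added through another handle can be tried directly. This is a small sketch, assuming a POSIX-style system where append mode makes every write land at the current end of the file; the file name is a throwaway chosen for the example. One handle is opened while the file is empty, a second handle appends a line and closes, and then the first handle prints without any seek.

```perl
use strict;
use warnings;

my $path = "append_demo_$$.txt";    # throwaway file name, an assumption

# The first handle is opened while the file is still empty.
open(my $early, '>>', $path) or die "open: $!";

# A second handle appends and closes, so EOF has now moved.
open(my $other, '>>', $path) or die "open: $!";
print {$other} "written by the other handle\n" or die "print: $!";
close($other) or die "close: $!";

# No seek on the first handle; its write still lands after the other line.
print {$early} "written by the early handle\n" or die "print: $!";
close($early) or die "close: $!";

open(my $in, '<', $path) or die "open: $!";
my @lines = <$in>;
close($in);
unlink $path;

print @lines;
```

The line from the early handle ends up second in the file, after the line the other handle wrote, even though the early handle was opened first and never repositioned.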