StuPeas
08-12-2006, 05:56 AM
Hi
I'm learning Perl and have hit problems fairly early on. I have typed a program into Perl's editor that is supposed to determine the frequency of occurrence of each word in a string.
Once I have saved the program and run it, I get a warning about a variable being used only once. If I proceed and type in the string "One Two Two Three" (without the quotes), nothing happens and the program won't exit. The only way to get out of it is to close the MS-DOS window. I have included the code below, and I would be grateful if somebody could tell me what the problem with it is.
It would be really helpful to know why this doesn't work, rather than to receive advice on other programs that would do the same thing.
THANX
#!C:\perl\bin\perl.exe
while ($line = <STDIN> )
{
    while ($line =~ s/(\w+)(.*)/$2/)
    {
        $word = $1;
        $wordHash{$word}++;
    }
}
while ( ($word, $count) = each(%wordHash) )
{
    $wordArray[$i] = "$word\t$count";
    $i++;
    print ("$word\n");
}
Charles
08-12-2006, 08:12 AM
You've got a lot going on there. And Perl has a style that is very different from Java's. Here's a working version for you to study:
#!C:\perl\bin\perl.exe
use strict;
$/ = '';
my %word_hash = ();
$word_hash {$_}++ foreach (split /\b/, <DATA>);
print "$word_hash{$_} $_\n" foreach (sort keys %word_hash);
__DATA__
Lorem ipsum dolor sit amet, consectetuer adipiscing elit. In auctor commodo est. Suspendisse potenti. Vivamus sagittis turpis quis lectus. Nulla ipsum sem, ultrices vel, blandit eget, euismod eget, lectus. Nam imperdiet velit at neque. Nam ut massa at justo consectetuer blandit. Morbi porta feugiat libero. Suspendisse urna nisi, fringilla sit amet, iaculis sed, interdum id, enim. Mauris consectetuer hendrerit purus. Etiam sit amet felis id metus ultricies suscipit. Etiam enim urna, pulvinar ac, pharetra ac, aliquet sit amet, ante. Duis est libero, luctus ut, fermentum sit amet, blandit eget, odio. Etiam consectetuer.
Lorem ipsum dolor sit amet, consectetuer adipiscing elit. Vestibulum odio nibh, auctor vitae, aliquet sit amet, tincidunt ut, ligula. Mauris sed nunc. Aenean molestie facilisis ipsum. Etiam urna. Integer leo. Vivamus ullamcorper lorem sit amet urna. Donec ac nunc. Pellentesque fermentum ornare ligula. Phasellus dapibus felis quis tortor. Nulla sed arcu non diam tempor fermentum. Ut bibendum ultrices erat. Suspendisse libero nisi, sodales vel, porta imperdiet, adipiscing porta, enim.
In id nibh. In iaculis orci id ipsum. Cras adipiscing est sit amet neque. Nam egestas, urna at dapibus ornare, orci erat varius ipsum, non sagittis diam ligula eget justo. Suspendisse congue, odio eget tincidunt blandit, sapien justo aliquam nunc, eget blandit mauris nibh nec sapien. Donec at felis a lorem feugiat sodales. Quisque sodales, magna in faucibus nonummy, purus sapien malesuada magna, sit amet mattis leo libero at massa. Mauris aliquet tempor urna. Vivamus ornare ullamcorper nisl. Fusce nulla. Donec aliquam pretium massa.
Fusce tincidunt, lorem sed ultricies venenatis, urna lorem aliquet orci, eu viverra nibh mi at eros. Vestibulum sit amet libero sed elit dignissim euismod. Donec facilisis ornare neque. Donec elementum. Sed molestie dui id pede. Pellentesque habitant morbi tristique senectus et netus et malesuada fames ac turpis egestas. Donec ultricies. Mauris ultricies lacus sit amet elit. Proin eget sem. Curabitur eu nisl gravida lectus porta placerat. Curabitur non eros non nisi auctor vestibulum. Maecenas pulvinar, purus vel porttitor rhoncus, metus lectus semper leo, quis volutpat urna est in sem. Ut hendrerit ligula id libero. Nulla pede tortor, mattis a, euismod nec, rutrum vitae, quam. Aliquam augue dolor, sollicitudin a, sagittis sed, consectetuer quis, lacus.
Pellentesque justo neque, venenatis id, varius eget, interdum a, neque. Lorem ipsum dolor sit amet, consectetuer adipiscing elit. Suspendisse sollicitudin purus in nisl. Curabitur bibendum. Phasellus diam. Maecenas laoreet eros in diam. Curabitur tempus dictum odio. Suspendisse molestie ipsum sit amet lectus. Duis congue, odio ac ornare aliquet, diam odio ultrices urna, id tristique lectus ligula dapibus ipsum. Sed non ipsum eget massa pretium pulvinar. Aenean molestie lobortis odio. Maecenas a leo. Etiam id nunc vel ante egestas pretium. Praesent suscipit libero id ante. Maecenas non nisi non erat facilisis tempus. Nunc molestie metus a arcu. Suspendisse potenti. Praesent ac urna egestas quam sollicitudin pretium.
You'll have some questions.
StuPeas
08-12-2006, 08:29 AM
Thanx for the response, but as I stated in my post, I don't want to get bogged down in other code. What I really need is to know why this script doesn't work.
The code is from a manual for my CIW course, and this is strictly a learning question.
I do not need to use the script for anything, and I understand exactly what it SHOULD do and how it should do it. It is more important for me to understand why this PARTICULAR CODE does not work.
I understand that to help me out someone may actually have to run the script themselves, and that this is a lot of work, but I would be eternally grateful.
If you cannot help me further, thanx for responding anyway.
Charles
08-12-2006, 08:35 AM
I don't have time to list the things wrong with that piece of script - that's why I gave you a working version to ponder.
StuPeas
08-12-2006, 08:42 AM
I totally understand the time thing, Charles.
Regarding the infinite loop: wouldn't the (\w+) in the regular expression result in the while loop condition evaluating to false once the last word in the string has been read?
Charles
08-12-2006, 04:23 PM
Upon closer inspection that part works, but it's very inefficient, kind of like using 10*10 instead of 100. What it does is match the first run of one or more word characters and then every following character up to, but not including, the newline. The word characters are captured in $1 and the rest of the line in $2. Then the whole match is replaced by $2, and $1 is left for you to play with.
A less inefficient method would be to use $word_hash {$1}++ while ($line =~ s/(\w+)//).
Even less inefficient would be $word_hash {$_}++ for ($line =~ m/\w+/g);.
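To make that description concrete, here is a minimal sketch that runs the substitution loop from the first post on the sample input and records each word it strips off. The sub name trace_words is made up for illustration. It suggests the inner loop does terminate, so the apparent hang is more likely the outer while ($line = <STDIN>) loop still waiting for end-of-file, which a Windows console usually signals with Ctrl+Z followed by Enter.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): runs the original inner
# loop on one line and records every word captured in $1.
sub trace_words {
    my ($line) = @_;
    my @steps;
    # Each pass captures the first word in $1, the rest of the line
    # (up to the newline) in $2, and replaces the match with $2.
    while ($line =~ s/(\w+)(.*)/$2/) {
        push @steps, $1;
    }
    # The loop ends as soon as no word characters remain in $line.
    return @steps;
}

print join(",", trace_words("One Two Two Three\n")), "\n";
# prints "One,Two,Two,Three"
```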
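As a rough check that the three approaches agree, the sketch below wraps each one in a sub and counts the words in the sample string from the first post. The sub names are invented for this example, and the hash is returned by reference only to keep the comparison short.

```perl
use strict;
use warnings;

# Approach from the first post: capture a word, keep the rest.
sub count_original {
    my ($line) = @_;
    my %count;
    while ($line =~ s/(\w+)(.*)/$2/) { $count{$1}++; }
    return \%count;
}

# Delete one word per pass, using the statement-modifier form.
sub count_strip {
    my ($line) = @_;
    my %count;
    $count{$1}++ while ($line =~ s/(\w+)//);
    return \%count;
}

# Collect every word in one global match, then count them.
sub count_match {
    my ($line) = @_;
    my %count;
    $count{$_}++ for ($line =~ m/\w+/g);
    return \%count;
}

# Hypothetical formatter so the three results can be compared as text.
sub summary {
    my ($count) = @_;
    return join(" ", map { "$_=$count->{$_}" } sort keys %$count);
}

for my $counter (\&count_original, \&count_strip, \&count_match) {
    print summary($counter->("One Two Two Three\n")), "\n";
}
# each of the three lines reads "One=1 Three=1 Two=2"
```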
StuPeas
08-13-2006, 08:14 AM
Thanx again.
Your answer has raised some questions in my head about how Perl reads expressions (in what order). Since the code you sent was not an assignment, I am guessing that it is read from left to right. If this is so, then wouldn't the first backreference ($1) contain a leftover value from a previous match, which would then be stored in our hash?
Obviously this value would not be part of our string (I say obviously, but this is pure guesswork on my part).
I may post this as another thread later, as I have to go and visit a friend in hospital right now.
THANX again for your time spent on an inquisitive newbie, Charles. :)
Charles
08-14-2006, 07:00 AM
Sometimes Perl is evaluated left to right, and sometimes right to left.
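For the one-liner discussed earlier in the thread, the relevant rule is that a statement-modifier while tests its condition before running the statement (the documented exception is a do BLOCK, which runs once first). The sketch below deliberately leaves a stale value in $1 and then uses the modifier form; the helper name and the "stale" string are assumptions made up for this illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: collects $1 using the statement-modifier form.
sub words_via_modifier {
    my ($line) = @_;
    my @seen;

    # An unrelated earlier match, so $1 holds "stale" at this point.
    my $earlier = "stale";
    $earlier =~ /(\w+)/ or die "setup match failed";

    # The condition on the right runs first and resets $1, so the
    # stale value is never pushed.
    push @seen, $1 while ($line =~ s/(\w+)//);
    return @seen;
}

print join(" ", words_via_modifier("One Two Two Three")), "\n";
# prints "One Two Two Three"
```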