$/ Not Working


saintpretz59
01-15-2008, 07:32 AM
Hello everybody,
I'm making a script that should parse XML files (already done before, I know), and I'm having a bit of trouble. The sub accepts two strings: the location of the XML file and the name of the element for the "main entries." (For a file with root <phonebook> and many individual <person> entries, you'd pass person as the argument.) The sub should then change that into </person>. The next step is to read off chunks of the XML file. Instead of reading one line at a time, the $/ variable is set to </person>. After every occurrence of </person>, the sub should print END OF ENTRY.
>>Not the most practical, obviously, but it should improve once I solve this problem:
Instead of printing END OF ENTRY at the appropriate places, it only prints it at the very end of the file.... Does anyone know why this would happen?

#!/usr/bin/perl
#form.cgi
use warnings;
use strict;
print "Content-type: text/plain\n\n";
sub xmlTag{
my $file = $_[0];
my $separator = "</" . "$_[1]" . ">";
$/ = "$separator";
print "We will be parsing $file with a separator of $/\.\n\n\n";
open(XMLFILE, "$file");
for (<XMLFILE>){
print "$_";
print "END OF ENTRY";
}
close(XMLFILE);

}

xmlTag("QuestDex.xml", "person");

dragle
01-15-2008, 08:53 AM
WFM, with the following data file:

<phonebook>
<person>Dan</person>
<person>Jane</person>
<person>Harvey</person>
</phonebook>

though you may want to localize the $/ setting when using it like this (so your setting doesn't remain in effect for the rest of the script); i.e., local $/ = "$separator";
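
A minimal sketch of what localizing buys you. The sub name count_chunks is made up for illustration, and it reads from an in-memory filehandle instead of a real file so it can be run as-is:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# count_chunks is a hypothetical helper, not part of the original script.
# It splits $text on $separator by setting $/ only for the duration of the sub.
sub count_chunks {
    my ($text, $separator) = @_;
    local $/ = $separator;    # restored automatically when the sub returns
    open(my $fh, '<', \$text) or die "Cannot open in-memory file: $!";
    my @chunks = <$fh>;       # one element per record ending in $separator
    close($fh);
    return scalar @chunks;
}

my $xml = "<phonebook>\n<person>Dan</person>\n<person>Jane</person>\n</phonebook>\n";
print "chunks: ", count_chunks($xml, '</person>'), "\n";
print "outside the sub, \$/ is a newline again\n" if $/ eq "\n";
```

Without the local, every later <FILEHANDLE> read in the script would still split on </person>.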

Are you sure your target data file has </person> tags (and not, e.g., </PERSON> or </Person>)? Are you sure you're reading the right QuestDex.xml file (try providing it as an absolute pathname to be sure)? The behavior you describe is what you would get if none of your defined separators were found in the file.
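
One way to check both points at once, as a rough sketch: read_entries is a hypothetical helper, and the temporary file with upper-case tags only stands in for a QuestDex.xml whose contents I can't see. It dies if the file can't be opened and shows how a case mismatch leaves everything in one chunk:

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# read_entries is a hypothetical helper: it returns the chunks of $path
# split on the closing tag for $tag, and dies if the file cannot be opened.
sub read_entries {
    my ($path, $tag) = @_;
    local $/ = "</$tag>";
    open(my $fh, '<', $path) or die "Cannot open $path: $!";
    my @chunks = <$fh>;
    close($fh);
    return @chunks;
}

# Stand-in data file whose tags are upper case (an assumption for the demo).
my ($out, $path) = tempfile(UNLINK => 1);
print {$out} "<phonebook>\n<PERSON>Dan</PERSON>\n<PERSON>Jane</PERSON>\n</phonebook>\n";
close($out) or die "Cannot close $path: $!";

my @wrong = read_entries($path, 'person');    # separator never matches
my @right = read_entries($path, 'PERSON');    # separator matches twice
print "person: ", scalar(@wrong), " chunk(s)\n";
print "PERSON: ", scalar(@right), " chunk(s)\n";
```

$/ is compared as a plain string, not a pattern, so the match is case-sensitive and there is no way to ask for "either case" through $/ alone.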

Here's a self-contained version for testing:

#!/usr/bin/perl

use strict;
use warnings;

$/ = '</person>';
while (<DATA>){
print "$_";
print "END OF ENTRY";
}

__DATA__
<phonebook>
<person>Dan</person>
<person>Jane</person>
<person>Harvey</person>
</phonebook>

saintpretz59
01-15-2008, 03:59 PM
Well, it reads the right file.... The file I want is the file that's displayed... the problem is, it reads it all as one chunk. I'm positive that the capitalization is right and everything.... Might there be some CRAZY regexp problem? Hhhh....I hate when there is no "quick fix."
Thanks, Dragle.

Jeff Mott
01-15-2008, 04:45 PM
saintpretz59, I copied your code exactly, using the test XML from dragle's post, and I got this result:

<phonebook>
<person>Dan</person>END OF ENTRY
<person>Jane</person>END OF ENTRY
<person>Harvey</person>END OF ENTRY
</phonebook>END OF ENTRY

That's what you were looking for, right? An END OF ENTRY after every </person>. And the last EOE is there because perl still reads to the end of file whether $/ is there or not. It seems to me to be working exactly as expected. :-\
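
If that trailing END OF ENTRY is unwanted, here is one possible sketch (mark_entries is a made-up name, and this is not what the original script does). It relies on chomp returning the number of characters it removed, which is nonzero only when the chunk really ended with $/:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# mark_entries is a hypothetical helper: it appends END OF ENTRY only after
# chunks that actually end with the separator, not after the leftover at EOF.
sub mark_entries {
    my ($text, $separator) = @_;
    local $/ = $separator;
    open(my $fh, '<', \$text) or die "Cannot open in-memory file: $!";
    my $result = '';
    while (my $chunk = <$fh>) {
        $result .= $chunk;
        # chomp works on a copy so the chunk itself keeps its closing tag.
        $result .= "END OF ENTRY" if chomp(my $copy = $chunk);
    }
    close($fh);
    return $result;
}

my $xml = "<phonebook>\n<person>Dan</person>\n<person>Jane</person>\n"
        . "<person>Harvey</person>\n</phonebook>\n";
print mark_entries($xml, '</person>');
```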

saintpretz59
01-17-2008, 07:36 PM
Well, I saw your reply a few days ago and didn't know what to do. I narrowed it down: setting $/ to more than one letter doesn't work. I'm able to have it read up to every 'r', for example, but it won't work with 2 or more letters.
any ideas?
<edit time="0150 GMT 10.18.08 ">
I'm using version 5.8.6 if that helps.... ?
</edit>

dragle
01-18-2008, 09:29 AM
Beats me. So far as I know multi-character $/ definitions have been supported since at least v5.6.

Does the self-contained version I provided above work? You should get the same output that Jeff described in his post.

dragle
01-18-2008, 10:18 AM
By any chance, is your QuestDex.xml file saved in a Unicode format other than UTF-8? What happens if you force the XMLFILE to be read as, say, UTF-16:
open(XMLFILE, '<:encoding(UTF-16)', $file);
?
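
Why I'm asking, as a sketch under the assumption that the file is UTF-16 (nothing so far confirms that): in UTF-16 every ASCII character is stored as two bytes, so the byte sequence </person> never appears in the raw file, while a single byte such as 'r' still does. The helper name chunks_via_layer and the in-memory data are made up for the demo:

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Encode qw(encode);

# chunks_via_layer is a hypothetical helper: it counts the records read from
# $bytes when the handle is opened with the given PerlIO layer ('' for none).
sub chunks_via_layer {
    my ($bytes, $layer, $separator) = @_;
    local $/ = $separator;
    open(my $fh, "<$layer", \$bytes) or die "Cannot open in-memory file: $!";
    my @chunks = <$fh>;
    close($fh);
    return scalar @chunks;
}

my $xml = "<phonebook>\n<person>Dan</person>\n<person>Jane</person>\n"
        . "<person>Harvey</person>\n</phonebook>\n";
my $utf16 = encode('UTF-16', $xml);    # big-endian bytes with a byte order mark

print "raw bytes, '</person>': ",
    chunks_via_layer($utf16, '', '</person>'), " chunk(s)\n";
print "raw bytes, 'r':         ",
    chunks_via_layer($utf16, '', 'r'), " chunk(s)\n";
print "decoded,   '</person>': ",
    chunks_via_layer($utf16, ':encoding(UTF-16)', '</person>'), " chunk(s)\n";
```

If the real file behaves like the first two lines here (one chunk for </person>, many for 'r'), the encoding layer in the open call above is worth trying.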