CFHutton
09-15-2009, 12:10 PM
Hi all,
I'm a novice with Perl but need to create a script that will pull some data out of table cells for use in an RSS feed. I found some older code that partly works for me, but not completely. The code can be found here:
http://www.perl.com/pub/a/2001/11/15/creatingrss.html
I think I understand what most of that code does. When I modify it for my needs, the file it produces has everything except the items: the XML declaration, the rss tag, the opening channel tag and channel information, the closing channel tag, and the closing rss tag. There isn't a single item in it, though.
I think the problem may be that the while loop isn't working: the original code deals with a div, but the main chunk I'm dealing with is a table. I've added a class to that table and match on the table tag and that class. Here's the relevant part of my Perl file:
while ( $tag = $stream->get_tag('table') ) {
    if ($tag->[1]{class} and $tag->[1]{class} eq 'RSSItems') {
        $tag = $stream->get_tag('td');
        $mypostdate = $stream->get_trimmed_text('/td');
        $tag = $stream->get_tag('td');
        $tag = $stream->get_tag('a');
        $headline = $stream->get_trimmed_text('/a');
        $rss->add_item( title => $headline, date => $mypostdate );
    }
}
Unfortunately, I don't have the HTML file in question on a public server, but I've cut it all the way down to just the one table with the class "RSSItems". It's a two-column table (multiple rows), with the first column containing dates (e.g. 9/6/2009) and the second column containing a linked headline. Eventually I'll need to pull the href out and use it as another item node, but right now I'm just trying to get things going.
This code is old. It uses LWP::Simple, HTML::TokeParser, and XML::RSS. I know there are some better modules (WWW::Mechanize), but the code I found at the link above is the closest I've come to finding anything useful to do what I need.
I wish I could provide more, but if anyone can just tell me why that while loop is producing zero items in my saved file, that would be a HUGE help.
Thanks,
CFH