www.webdeveloper.com

Thread: Alt tag text

  1. #1
    Join Date
    Jun 2008
    Posts
    223

    Alt tag text

    Has anyone ever grabbed the text from inside an HTML alt attribute?
    Without a module?

    For instance
    Code:
    <img src="any.gif" alt="The text inside">
    So it would print to screen: The text inside

    Hope someone can help - thanks

  2. #2
    Join Date
    Oct 2007
    Location
    Vienna, Austria
    Posts
    393
    Looks like a job for HTML::Parser.

    Assuming you have the HTML code you want to search in a $html variable:
    Code:
    use HTML::Parser ();
    my $p = HTML::Parser->new(start_h => [\&start, 'text, attr']);
    sub start {
        my ($text, $attr) = @_;
        if (exists $attr->{alt}) {
            print $attr->{alt}, " (in $text)\n";
        }
    }
    $p->parse($html);
    $p->eof;  # signal end of input so any buffered text is processed
    Or if you think this is overkill, you can simply do
    Code:
    my @alts;
    while ($html =~ /<[^>]*\balt=(["'])(.*?)\1/ig) {
        push @alts, $2;
    }
    print "values of alt tags:\n", join("\n",@alts), "\n";
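    The regex approach above can be tried on its own with a small self-contained script. This is only a sketch: extract_alts is a hypothetical helper name, and the sample HTML is made up for illustration. Note that the pattern only picks up quoted values and matches an alt attribute on any tag, not just img.

```perl
use strict;
use warnings;

# extract_alts is a hypothetical helper name, not part of any module.
sub extract_alts {
    my ($html) = @_;
    my @alts;
    # (["']) captures the opening quote; \1 requires the same quote to close
    # the value, so a double-quoted value may contain single quotes and vice versa.
    while ($html =~ /<[^>]*\balt=(["'])(.*?)\1/ig) {
        push @alts, $2;
    }
    return @alts;
}

# Made-up sample input for illustration.
my $html = q{<img src="any.gif" alt="The text inside"> <img src='b.gif' alt='It is "quoted"'>};

print "$_\n" for extract_alts($html);
```

    For anything beyond simple, well-formed markup, the HTML::Parser version is likely the safer choice.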

  3. #3
    Join Date
    Jun 2008
    Posts
    223
    Hi Sixtease, I tried the one without the module like this:

    File:
    Code:
    <img src="/altest/images/01.gif" border="0" width="160" height="150" alt="One is one and all alone">
    <img src="/altest/images/02.gif" border="0" width="160" height="150" alt="Two for the road">
    Code:
    open(TFL, "file.txt") || die("could not open");
    @MFL = <TFL>;
    close(TFL);
    foreach $line (@MFL) {
      my @alts = $line;
      while ($html =~ /<[^>]*\balt=(["'])(.*?)\1/ig) {
      push @alts, $2;
      }
      print "values of alt tags:\n", join("\n",@alts), "<br>\n";
    }
    The result was:
    values of alt tags: The whole image, as an image
    values of alt tags: The whole image, as an image

    Have I done it wrong?
    ------------------------------------------------
    It's alright I've seen what I did wrong

    Code:
    foreach $line (@MFL) {
      my @alts;
      while ($line =~ /<[^>]*\balt=(["'])(.*?)\1/ig) {
        push @alts, $2;
      }
      print "", join("\n",@alts), "<br>\n";
    }
    It now works fine.

    Thank you very much.
    Last edited by edatz; 12-22-2009 at 07:08 AM.
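    Comparing the two versions, the first attempt matched against $html, which was never set in that script, and seeded @alts with the whole line. A mistake like the undeclared $html is the kind of thing "use strict" reports at compile time. The sketch below shows that, using a string eval purely so the error can be captured and printed; the variable names are illustrative.

```perl
use strict;
use warnings;

# A cut-down version of the first attempt: $line is declared, $html is not.
my $snippet = 'use strict; my $line = "x"; my @alts; '
            . 'while ($html =~ /x/g) { push @alts, $1; }';

# String eval compiles the snippet; under strict the undeclared $html is fatal.
eval $snippet;
my $err = $@;

print "strict caught it: $err" if $err;
```

    Putting "use strict;" and "use warnings;" at the top of the real script would flag the same problem without the eval.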

  4. #4
    Join Date
    Oct 2007
    Location
    Vienna, Austria
    Posts
    393
    You need not do the outer loop, either:
    Code:
    open(TFL, "file.txt") || die("could not open");
    my $html = join('', <TFL>);
    close(TFL);
    
    my @alts;
    while ($html =~ /<[^>]*\balt=(["'])(.*?)\1/isg) {
      push @alts, $2;
    }
    print "values of alt tags:\n", join("\n",@alts), "<br>\n";
    Update: I also added an s modifier to the regexp so that . also matches newlines, which handles alt attributes that span multiple lines.
    Last edited by Sixtease; 12-22-2009 at 07:29 AM.
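    The effect of the s modifier can be seen with a small sketch. The helper name alts_with and the sample input are assumptions for illustration; the two patterns are the ones from this thread, with and without s.

```perl
use strict;
use warnings;

# alts_with is a hypothetical helper: collect capture group 2 for every match.
sub alts_with {
    my ($re, $html) = @_;
    my @alts;
    while ($html =~ /$re/g) {
        push @alts, $2;
    }
    return @alts;
}

# Made-up input whose alt value is split across two lines.
my $html = qq{<img src="02.gif" alt="Two for\nthe road">};

my $plain  = qr/<[^>]*\balt=(["'])(.*?)\1/i;   # . stops at a newline
my $dotall = qr/<[^>]*\balt=(["'])(.*?)\1/is;  # . also matches a newline

my @plain_hits  = alts_with($plain,  $html);
my @dotall_hits = alts_with($dotall, $html);

print "without s: ", scalar(@plain_hits),  " match(es)\n";
print "with s:    ", scalar(@dotall_hits), " match(es)\n";
```

    Without s the value is never matched, because (.*?) cannot cross the line break to reach the closing quote. This only matters once the whole file is in one string; when reading line by line, a value split across lines is missed either way.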

  5. #5
    Join Date
    Jun 2008
    Posts
    223
    Thanks for that, Sixtease. At first the output didn't come out quite right for me, but I tweaked it a little and it now prints the results with proper line breaks.

    Code:
    open(TFL, "file.txt") || die("could not open");
    $MFL = join('', <TFL>);
    close(TFL);
    
    my @alts;
    while ($MFL =~ /<[^>]*\balt=(["'])(.*?)\1/ig) {
      push @alts, $2;
    }
    print join("<br>\n",@alts);
    I then applied it to a file that's a small FFDB of 4 fields. I picked up the file name and used the extract on its image fields and I now have the result I wanted. Works a treat. Thanks again.
