www.webdeveloper.com

Thread: Get website meta tags with Ajax, PHP and jQuery

  1. #1
    Join Date
    Dec 2009
    Posts
    92

    Get website meta tags with Ajax, PHP and jQuery

    Hi,
    I have a script that fetches a website's content as plain HTML. It is written in PHP like this:

    Code:
    <?php
    if(isset($_GET['site'])){
      $f = fopen($_GET['site'], 'r');
      $html = '';
      while (!feof($f)) {
        $html .= fread($f, 24000);
      }
      fclose($f);
      echo $html;
    }
    ?>
    With a bit of jQuery/Ajax I can find the links in the returned data and print each URL like this:

    Code:
    $(function(){

       var site = 'http://www.SomeSite.com/';

       $.get('proxy.php', { site:site }, function(data){

            // Collect every link in the returned HTML
            var headlines = $(data).find('a');

            // jQuery passes (index, element) to the callback
            headlines.each(function(index, elem){

                var href = $(this).prop('href');

                $('#Div').append('' + href + '');

            });

       }, 'html');

    });

    Instead of the URLs, I would really like to fetch the meta data (description, title etc.). So I tried:

    Code:
    ....
    headlines = $(data).find('meta');
    
    headlines.map(function(elem, index){ 
          
    meta = $(this).attr('meta[name='title'');
    
                $('#Div').append('' + meta + '');
    			
            });
    I tried many other variations but I can't seem to get it to work. Help is very much appreciated.
    Last edited by yomoore; 12-28-2012 at 11:52 AM. Reason: updated code

  2. #2
    Join Date
    Nov 2010
    Posts
    1,083
    I'm not sure exactly what you are looking for, but maybe this will help:

    Code:
    headlines = $(data).find('meta');
    
    headlines.map(function(elem, index){ 
    var meta1 = $(this).context.name;
    var meta2 = $(this).context.content;
    			alert(meta1);
    			alert(meta2);
            });

  3. #3
    Join Date
    Dec 2009
    Posts
    92
    No, that doesn't work (I get nothing, no alert)

  4. #4
    Join Date
    Nov 2010
    Posts
    1,083
    what does the error console say?

  5. #5
    Join Date
    Dec 2009
    Posts
    92
    Quote Originally Posted by xelawho View Post
    what does the error console say?
    Hi, I have no debugging skills, unfortunately ):

  6. #6
    Join Date
    Nov 2010
    Posts
    1,083
    not even enough to pull up your own error console?

    I guess you're kind of screwed, then...

  7. #7
    Join Date
    Dec 2009
    Posts
    92
    Hahaha, maybe there are no errors...? I remembered that IE has a console which shows errors, so I opened the whole thing in IE (latest) and the only error it showed concerned my doctype. It told me to change this:

    "<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">"

    Into this:

    "<!DOCTYPE html>"

    After doing this I had no errors, but it still doesn't work: no alerts whatsoever.

  8. #8
    Join Date
    Nov 2010
    Posts
    1,083
    Chrome has a better error console. The Firebug add-on for Firefox is better, too.

    If your page is live, you can post a link

  9. #9
    Join Date
    Dec 2009
    Posts
    92
    You can take a look at it over here http://mayy.in/

    Your code and my previous code should have worked.

  10. #10
    Join Date
    Dec 2009
    Posts
    92
    Quote Originally Posted by xelawho View Post
    Chrome has a better error console. The firebug add-on for Firefox is better, too.

    If your page is live, you can post a link
    Hi again, I have example code over here: http://mayy.in/mayyx.html
    There you can see that the proxy is working just fine, extracting the HTML.

  11. #11
    Join Date
    Nov 2010
    Posts
    1,083
    Yes, you are getting the data OK, but for some reason "headlines" is an empty array for the metadata, so there is nothing to alert. It seems like the PHP is not pulling in the metadata. It could be an XPath thing, but I don't know much about that. You can do this easily enough with Yahoo YQL:

    Code:
    <body>
    <div id="res"></div>
    <script>
    $(document).ready(function(){
        var url = 'http://www.bbc.co.uk/news';
        $.getJSON("http://query.yahooapis.com/v1/public/yql?q=select%20*%20from%20html%20where%20url%3D'" + encodeURIComponent(url) + "'%20and%20xpath%3D'%2F%2Fmeta'&format=json&callback=?",
        function(data){
            var headlines = data.query.results.meta;
            // Reverse the list because each entry is prepended below
            headlines.reverse();
            headlines.forEach(function(elem){
                $("#res").prepend("<br>");
                $.each(elem, function(key, value) {
                    $("#res").prepend(key + ': ' + value + ' ');
                });
            });
        });
    });
    </script>
    </body>
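
A likely explanation for the empty "headlines" set in the original jQuery attempt (an assumption, not confirmed in the thread): when jQuery parses a complete document string, the html and head wrappers are dropped and the meta elements end up at the top level of the set, where .find() does not look because it only searches descendants. As an alternative that avoids both YQL and jQuery's parsing, the raw HTML string returned by the proxy can be scanned directly. The sketch below is a minimal, hedged example; the helper name extractMetaTags is hypothetical, and the regular expressions only cover ordinary quoted or unquoted attributes, not every legal HTML variation.

```javascript
// Hypothetical helper (not from the thread): collect every <meta> tag in an
// HTML string as a plain object of its attributes.
function extractMetaTags(html) {
  var tags = [];
  var tagPattern = /<meta\b([^>]*)>/gi; // one <meta ...> tag per match
  var tagMatch;
  while ((tagMatch = tagPattern.exec(html)) !== null) {
    var attrs = {};
    // Matches name="value", name='value', or name=value
    var attrPattern = /([^\s"'=<>\/]+)\s*=\s*(?:"([^"]*)"|'([^']*)'|([^\s"'>]+))/g;
    var attrMatch;
    while ((attrMatch = attrPattern.exec(tagMatch[1])) !== null) {
      var value = attrMatch[2] !== undefined ? attrMatch[2]
                : attrMatch[3] !== undefined ? attrMatch[3]
                : attrMatch[4];
      // Attribute names are lower-cased so NAME and name are treated alike
      attrs[attrMatch[1].toLowerCase()] = value;
    }
    tags.push(attrs);
  }
  return tags;
}
```

In the browser this could be called from the existing Ajax callback, for example `$.get('proxy.php', { site: site }, function (data) { var metas = extractMetaTags(data); }, 'html');`. The result is an array of attribute objects, roughly the same shape as the `data.query.results.meta` list used in the YQL version.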

  12. #12
    Join Date
    Dec 2009
    Posts
    92
    Quote Originally Posted by xelawho View Post
    Yes, you are getting the data OK, but for some reason "headlines" is an empty array for the metadata, so there is nothing to alert. It seems like the PHP is not pulling in the metadata. It could be an XPath thing, but I don't know much about that. You can do this easily enough with Yahoo YQL:
    Yes, I know, that's how I was doing it before. But the thing with YQL is that you have to wait for it to respond. Sometimes it responds too slowly, especially in the evening and on weekends, and I have also experienced times when YQL did not respond at all. Therefore I cannot rely on YQL. Furthermore, I'm planning to do more with the data, involving other calls to other APIs, and YQL would be one too many. Thanks for trying anyway.

  13. #13
    Join Date
    Oct 2010
    Location
    Versailles, France
    Posts
    1,266
    Try this with a file named prxy.php:
    Code:
    <?php
    	header("content-type: text/html; charset=utf-8");
    	header("expires: mon, 26 jul 1997 05:00:00 gmt");
    	header("cache-control: no-cache, must-revalidate");
    	header("pragma: no-cache");
    
    	$chnMta='';
    	
    	if(isset($_GET['ste'])){ echo htmlspecialchars($_GET['ste']).'<br>'; // escape the echoed URL
    		$html = file_get_contents($_GET['ste']);
    		// echo $html;
    		// We capture only the meta tags
    		if (preg_match_all("@<(meta[^>]+/?)>@",$html,$m,PREG_PATTERN_ORDER)){
    			foreach($m[1] as $k=>$v) $chnMta.='&lt;'.$v.'&gt;<br>';}
    	}
    	echo $chnMta;
    ?>
    I do not use jQuery, but prefer something like this:
    Code:
    <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
    <html>
    <head>
    <meta http-equiv="content-type" content="text/html; charset=utf-8">
    <meta name="generator" content="PSPad editor, www.pspad.com">
    <title></title>
    <style type="text/css">
    </style>
    </head>
    <body>
    <div id="rsp"></div>
    <script type="text/javascript">
    (function(){
    	var xmlObj=false;
    	var xmlFct=[function(){return new XMLHttpRequest()}
    		,function(){return new ActiveXObject("Msxml2.XMLHTTP")}
    		,function(){return new ActiveXObject("Msxml3.XMLHTTP")}
    		,function (){return new ActiveXObject("Microsoft.XMLHTTP")}];
    	for (var i=0;i<xmlFct.length;i++) {try{xmlObj = xmlFct[i]();}catch(e){continue;}break;}
    	
    	return sndRqt=function(url,cllbck,pstDta){
    		var req=xmlObj;
    		if (!req) return;
    		var mth=(pstDta)? "POST":"GET";
    		req.open(mth,url,true);
    		req.setRequestHeader('User-Agent','XMLHTTP/1.0');
    		if (pstDta) req.setRequestHeader('Content-type','application/x-www-form-urlencoded');
    		req.onreadystatechange=function(){
    			if (req.readyState!=4) return;
    			if (req.status!=200 && req.status!=304) return;
    			cllbck(req);}
    		if (req.readyState==4) return;
    		req.send(pstDta);
    	};
    }());
    var url='http://www.bbc.co.uk/news';
    sndRqt('prxy.php?ste='+encodeURIComponent(url),function(r){
    	document.getElementById('rsp').innerHTML=r.responseText});
    </script>
    </body>
    </html>
    It would be possible to capture only the description, contents, keywords... with other regular expressions on the $v values...
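
As a rough illustration of that idea, here is a minimal JavaScript sketch that applies further regular expressions to each captured tag text (the equivalent of the $v values) and keeps only the wanted entries. The function names readNamedMeta and pickMeta are hypothetical, and the patterns assume quoted name and content attributes.

```javascript
// Hypothetical helper: given the inner text of one meta tag (like the $v
// values in the PHP loop), return { name, content } or null.
function readNamedMeta(tagText) {
  var name = /\bname\s*=\s*["']([^"']*)["']/i.exec(tagText);
  var content = /\bcontent\s*=\s*(?:"([^"]*)"|'([^']*)')/i.exec(tagText);
  if (!name || !content) return null; // e.g. http-equiv or charset tags
  return {
    name: name[1].toLowerCase(),
    content: content[1] !== undefined ? content[1] : content[2]
  };
}

// Hypothetical helper: keep only the meta entries whose name is in `wanted`.
function pickMeta(tagTexts, wanted) {
  var found = {};
  tagTexts.forEach(function (tagText) {
    var meta = readNamedMeta(tagText);
    if (meta && wanted.indexOf(meta.name) !== -1) {
      found[meta.name] = meta.content;
    }
  });
  return found;
}
```

For example, `pickMeta(tagTexts, ['description', 'keywords'])` returns an object with at most those two keys, skipping tags such as http-equiv that have no name attribute.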
    Last edited by 007Julien; 12-28-2012 at 08:23 PM. Reason: complements, errors

  14. #14
    Join Date
    Dec 2009
    Posts
    92
    @007Julien Thank you very much, your code is working 100%. Too bad it's not jQuery, and I don't know PHP. But then again, it's a very small piece of code, so I don't expect trouble in the future from my ignorance of PHP.

  15. #15
    Join Date
    Oct 2010
    Location
    Versailles, France
    Posts
    1,266
    It would be prudent to add an i (for ignore case) to the regular expression ("@<(meta[^>]+/?)>@i") to also capture upper-case META tags.
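
The effect of the i flag can be checked quickly with the same pattern in JavaScript. This is only a small sketch for illustration; the function name countMetaTags and the sample string are made up for the example.

```javascript
// Hypothetical helper: count <meta ...> tags with or without the i flag,
// using the same pattern as the PHP regular expression.
function countMetaTags(html, ignoreCase) {
  var pattern = new RegExp('<(meta[^>]+/?)>', ignoreCase ? 'gi' : 'g');
  return (html.match(pattern) || []).length;
}

var sampleHtml = '<META NAME="Description" CONTENT="Upper-case tag">\n' +
                 '<meta name="keywords" content="lower-case tag">';

console.log(countMetaTags(sampleHtml, false)); // 1: only the lower-case tag
console.log(countMetaTags(sampleHtml, true));  // 2: both tags
```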
