Thread: Extract contents using regex.

  1. #1
    Join Date
    Sep 2013
    Posts
    221

    Extract contents using regex.

    I want to extract everything inside each <span class="jcn"> ... </span> element using a regex.

    Below is the markup I want to extract the content from.


    <section class="jbbg">

    <a class='jdr' href='javascript:void(0);' onClick="return openDiv('jrtp');"></a>
    <span class="jcn">
    <a href="http://justdial.com/Mumbai/Goga-Thali-Pure-Veg-Banquet-Hall-&lt;near&gt;-Opposite-Sanghavi-Apartment-Kandivali-West/022PXX22-XX22-130928123504-L6D4_TXVtYmFpIFJlc3RhdXJhbnRz_BZDET" title='Goga Thali Pure Veg & Banquet Hall in Kandivali West, Mumbai' >Goga Thali Pure Veg & Banquet Hall</a>
    </span>

    <section class="jrat">
    <a rel="nofollow" href="http://justdial.com/Mumbai/Goga-Thali-Pure-Veg-Banquet-Hall-&lt;near&gt;-Opposite-Sanghavi-Apartment-Kandivali-West/022PXX22-XX22-130928123504-L6D4_TXVtYmFpIFJlc3RhdXJhbnRz_BZDET#rvw"><span class='s10'></span><span class='s10'></span><span class='s10'></span><span class='s10'></span><span class='s0'></span></a>
    <a class="jrt" href="http://justdial.com/Mumbai/Goga-Thali-Pure-Veg-Banquet-Hall-&lt;near&gt;-Opposite-Sanghavi-Apartment-Kandivali-West/022PXX22-XX22-130928123504-L6D4_TXVtYmFpIFJlc3RhdXJhbnRz_BZDET#rvw">1 rating</a>
    <span class="jrt"> |</span>
    <a class="rate_this" onclick="_ct('ratethis','lspg');" href="http://justdial.com/Mumbai/Goga-Thali-Pure-Veg-Banquet-Hall-&lt;near&gt;-Opposite-Sanghavi-Apartment-Kandivali-West/022PXX22-XX22-130928123504-L6D4_TXVtYmFpIFJlc3RhdXJhbnRz_BZDET/writereview">Rate this</a>
    </section>

    Thanks in advance.
    strad solutions www.stradsolutions.com

  2. #2
    Join Date
    Apr 2010
    Posts
    88
    PHP Code:
    $pattern = '#<span class="jcn">(.+?)</span>#si';
    if (preg_match_all($pattern, $content, $matches))
    {
        print_r($matches);
    }


  3. #3
    Join Date
    Sep 2013
    Posts
    221
    Well, this code only prints the array. I want all the content of each <span class="jcn"> to be displayed.

  4. #4
    Join Date
    Apr 2010
    Posts
    88
    You should loop through the $matches[0] or $matches[1] array and print whatever you need.
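
To illustrate the difference between the two arrays, here is a minimal sketch. The $html string is a made-up sample, not the real page; in practice the input would be whatever you downloaded.

```php
<?php
// Hypothetical sample input; the real page would come from file_get_contents().
$html = '<span class="jcn"><a href="/one">First</a></span>'
      . '<span class="jcn"><a href="/two">Second</a></span>';

preg_match_all('#<span class="jcn">(.+?)</span>#si', $html, $matches);

// $matches[0] holds each full match, including the <span> tags themselves.
// $matches[1] holds only what the first capture group (.+?) caught.
foreach ($matches[1] as $inner) {
    // Escape before printing so the browser shows the markup instead of rendering it.
    echo htmlspecialchars($inner), "\n";
}
```

If you need the surrounding <span> tags as well, loop over $matches[0] instead.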

  5. #5
    Join Date
    Sep 2013
    Posts
    221
    Well, the array values are not displaying the contents.
    Can someone help me extract the correct content of <span class="jcn">?

  6. #6
    Join Date
    Aug 2004
    Location
    Ankh-Morpork
    Posts
    21,433
    Works fine for me:
    PHP Code:
    <?php

    $text = <<<EOD
    <section class="jbbg">

    <a class='jdr' href='javascript:void(0);' onClick="return openDiv('jrtp');"></a>
    <span class="jcn">
    <a href="http://justdial.com/Mumbai/Goga-Thali-Pure-Veg-Banquet-Hall-&lt;near&gt;-Opposite-Sanghavi-Apartment-Kandivali-West/022PXX22-XX22-130928123504-L6D4_TXVtYmFpIFJlc3RhdXJhbnRz_BZDET" title='Goga Thali Pure Veg & Banquet Hall in Kandivali West, Mumbai' >Goga Thali Pure Veg & Banquet Hall</a>
    </span>

    <section class="jrat">
    <a rel="nofollow" href="http://justdial.com/Mumbai/Goga-Thali-Pure-Veg-Banquet-Hall-&lt;near&gt;-Opposite-Sanghavi-Apartment-Kandivali-West/022PXX22-XX22-130928123504-L6D4_TXVtYmFpIFJlc3RhdXJhbnRz_BZDET#rvw"><span class='s10'></span><span class='s10'></span><span class='s10'></span><span class='s10'></span><span class='s0'></span></a>
    <a class="jrt" href="http://justdial.com/Mumbai/Goga-Thali-Pure-Veg-Banquet-Hall-&lt;near&gt;-Opposite-Sanghavi-Apartment-Kandivali-West/022PXX22-XX22-130928123504-L6D4_TXVtYmFpIFJlc3RhdXJhbnRz_BZDET#rvw">1 rating</a>
    <span class="jrt"> |</span>
    <a class="rate_this" onclick="_ct('ratethis','lspg');" href="http://justdial.com/Mumbai/Goga-Thali-Pure-Veg-Banquet-Hall-&lt;near&gt;-Opposite-Sanghavi-Apartment-Kandivali-West/022PXX22-XX22-130928123504-L6D4_TXVtYmFpIFJlc3RhdXJhbnRz_BZDET/writereview">Rate this</a>
    </section>
    EOD;

    preg_match_all('#<span class="jcn">(.+?)</span>#si', $text, $matches);
    foreach ($matches[1] as $match) {
        echo "<pre>".htmlspecialchars($match)."</pre>";
    }
    If that's not what you want, then you need to be more specific about what you need, and you need to show us the actual code you are using.
    "Well done....Consciousness to sarcasm in five seconds!" ~ Terry Pratchett, Night Watch

    How to Ask Questions the Smart Way (not affiliated with this site, but well worth reading)

    My Blog
    cwrBlog: simple, no-database PHP blogging framework

  7. #7
    Join Date
    Sep 2013
    Posts
    221
    Well, I tried the code below. It works and gets the contents of <span class="jcn">.

    <html>
    <body>
    <?php
    $content = file_get_contents('http://justdial.com/Mumbai/Restaurants/ct-304085');
    preg_match_all('#<span class="jcn">(.+?)</span>#si', $content, $matches);
    foreach($matches[1] as $match) {
    echo "<pre>".htmlspecialchars($match)."</pre>";
    }
    ?>
    </body>
    </html>

    But what I want now is to get only the contents of the <a> tags from the output of the above code.
    Can someone help me with this?

  8. #8
    Join Date
    Aug 2004
    Location
    Ankh-Morpork
    Posts
    21,433
    Take the results you get from the above process, and apply the same logic for the <a> tags.

    A more robust solution would probably be to make use of the DOM extension, though.
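
As a rough sketch of the DOM approach (the $html string and its example.com links are made-up stand-ins for the downloaded page, and the XPath expression assumes the class attribute is exactly "jcn"):

```php
<?php
// Hypothetical sample input standing in for the downloaded page.
$html = '<section class="jbbg">'
      . '<span class="jcn"><a href="http://example.com/a" title="A">Goga Thali</a></span>'
      . '<span class="jrt"> |</span>'
      . '<span class="jcn"><a href="http://example.com/b">Second Place</a></span>'
      . '</section>';

// Real-world HTML is rarely valid; collect libxml warnings instead of printing them.
libxml_use_internal_errors(true);
$dom = new DOMDocument();
$dom->loadHTML($html);
libxml_clear_errors();

// Select every <a> that is a direct child of <span class="jcn">.
$xpath = new DOMXPath($dom);
$names = array();
foreach ($xpath->query('//span[@class="jcn"]/a') as $a) {
    $names[] = trim($a->textContent);
}

print_r($names);
```

If the span carries more than one class, an XPath test such as contains(@class, "jcn") would be needed in place of the exact comparison.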
    "Well done....Consciousness to sarcasm in five seconds!" ~ Terry Pratchett, Night Watch

    How to Ask Questions the Smart Way (not affiliated with this site, but well worth reading)

    My Blog
    cwrBlog: simple, no-database PHP blogging framework

  9. #9
    Join Date
    Sep 2013
    Posts
    221
    Thanks for your advice, I am now able to extract the contents. Below is the code:
    <html>
    <body>
    <?php

    set_time_limit(300);

    $content = file_get_contents('http://justdial.com/Mumbai/Restaurants/ct-304085');

    // Collect the inner HTML of every <span class="jcn">.
    preg_match_all('#<span class="jcn">(.+?)</span>#si', $content, $matches);

    // Pull the link text out of the raw span contents. Escaping them first
    // would turn "<a" into "&lt;a" and the pattern would never match.
    $names = array();
    foreach ($matches[1] as $match) {
        if (preg_match('/<a\s+href[^>]+>([^<]+)<\/a>/', $match, $m)) {
            $names[] = trim($m[1]);
        }
    }
    $n = implode(', ', $names);
    echo htmlspecialchars($n);

    // numbers:

    libxml_use_internal_errors(true); // keep warnings about malformed HTML quiet

    $dom = new DOMDocument;
    $dom->loadHTML($content);
    $links = $dom->getElementsByTagName('a');

    $t = '';
    foreach ($links as $link) {
        if ($link->getAttribute('class') == 'ctc f13') {
            $t = trim($link->nodeValue);
            echo htmlspecialchars($t);
        }
    }

    // insert:
    $con = mysqli_connect("localhost", "root", "", "test");
    // Check connection
    if (mysqli_connect_errno()) {
        die("Failed to connect to MySQL: " . mysqli_connect_error());
    }

    // A prepared statement keeps quotes in the scraped text from breaking the query.
    $stmt = mysqli_prepare($con, "INSERT INTO together (names, numbers) VALUES (?, ?)");
    mysqli_stmt_bind_param($stmt, 'ss', $n, $t);
    if (!mysqli_stmt_execute($stmt)) {
        die('Error: ' . mysqli_stmt_error($stmt));
    }
    echo "1 record added";
    mysqli_close($con);
    ?>
    </body>
    </html>
    But now I want to insert these values, i.e. one name and one number together, as one record in my database.
    Can someone please help me out with this one?
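
One possible way to do this, sketched under the assumption that the names and the numbers come out of the page in the same order, is to pair them up by position and run one prepared INSERT per pair. The helper names pairRecords() and insertRecords() and the sample values are hypothetical; the table and column names are the ones from the code above.

```php
<?php
/**
 * Pair the i-th name with the i-th number.
 * Assumes both lists were scraped in the same order; extra items are dropped.
 */
function pairRecords(array $names, array $numbers)
{
    $rows = array();
    $count = min(count($names), count($numbers));
    for ($i = 0; $i < $count; $i++) {
        $rows[] = array($names[$i], $numbers[$i]);
    }
    return $rows;
}

/**
 * Insert each pair as its own row in the `together` table.
 * Returns the number of rows that were inserted.
 */
function insertRecords(mysqli $con, array $rows)
{
    $stmt = $con->prepare('INSERT INTO together (names, numbers) VALUES (?, ?)');
    if ($stmt === false) {
        return 0;
    }
    $name = '';
    $number = '';
    // bind_param() binds by reference, so updating the variables is enough.
    $stmt->bind_param('ss', $name, $number);
    $inserted = 0;
    foreach ($rows as $row) {
        $name = $row[0];
        $number = $row[1];
        if ($stmt->execute()) {
            $inserted++;
        }
    }
    $stmt->close();
    return $inserted;
}

// Made-up sample data standing in for the scraped values.
$rows = pairRecords(
    array('Goga Thali', 'Second Place'),
    array('022-0000-0001', '022-0000-0002')
);
print_r($rows);
```

With a live connection from mysqli_connect(), calling insertRecords($con, $rows) would then add one row per name/number pair.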
