www.webdeveloper.com

Thread: scraping the data from website

  1. #1
    Join Date
    Sep 2008
    Posts
    62

    scraping the data from website

    Hi,

    I am having a problem scraping data from a website. After scraping the data, I can't output it from my PHP script; the page comes out empty.

    Here is the HTML source I want to scrape:

    Code:
    <span id="row3Time" class="zc-ssl-pg-time">11:00 AM</span>
    <a id="rowTitle3" class="zc-ssl-pg-title" href='http://tvlistings.zap2it.com/tv/sportscenter/EP00019917'>SportsCenter</a>
    <ul class="zc-icons">
    <li class="zc-ic zc-ic-span"><span class="zc-ic-live">LIVE</span></li></ul>
    </li>
    <li class="zc-ssl-pg" id="row1-4" style="">
    
    <span id="row4Time" class="zc-ssl-pg-time">12:00 PM</span>
    <a id="rowTitle4" class="zc-ssl-pg-title" href='http://tvlistings.zap2it.com/tv/sportscenter/EP00019917'>SportsCenter</a>
    <ul class="zc-icons">
    <li class="zc-ic zc-ic-span"><span class="zc-ic-live">LIVE</span></li></ul>
    </li>
    <li class="zc-ssl-pg" id="row1-5" style="">
    
    <span id="row5Time" class="zc-ssl-pg-time">1:00 PM</span>
    <a id="rowTitle5" class="zc-ssl-pg-title" href='http://tvlistings.zap2it.com/tv/sportscenter/EP00019917'>SportsCenter</a>
    <ul class="zc-icons">
    <li class="zc-ic zc-ic-span"><span class="zc-ic-live">LIVE</span></li></ul>
    Here is the PHP source:

    PHP Code:

    <?php

    $contents = file_get_contents('http://tvlistings.zap2it.com/tvlistings/ZCSGrid.do?stnNum=10179');
    preg_match('/<a id="rowTitle3" class="zc-ssl-pg-title"[.*]<\/a>/i', $data, $matches);
    $rowtitle = $matches[1];
    echo $rowtitle."<br>\n";
    ?>
    And here is the PHP output:
    PHP Code:
    <br>
    Does anyone know how I can scrape the data from that website, starting at <a id="rowTitle3" and continuing to the end of the page?

    Any advice would be much appreciated.

    Thanks in advance
    Last edited by mark107; 04-15-2013 at 08:53 AM.

  2. #2
    Join Date
    Aug 2004
    Location
    Ankh-Morpork
    Posts
    22,326
    Have you confirmed that $contents actually contains the expected HTML text?

    Also, you may find the DOM extension to be a more robust way to grab data than preg functions.
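
    As a rough illustration of the DOM approach, here is a minimal sketch that runs against two rows of the markup quoted in the first post rather than the live page. The helper name extractListings() is made up for this example, and the XPath expressions assume the time span and the title link stay siblings inside the same list item, as they are in that sample.

```php
<?php
// Two rows of the markup quoted in the first post; a live page would come
// from file_get_contents() after checking that it did not return false.
$html = <<<'HTML'
<ul>
<li class="zc-ssl-pg" id="row1-4" style="">
<span id="row4Time" class="zc-ssl-pg-time">12:00 PM</span>
<a id="rowTitle4" class="zc-ssl-pg-title" href='http://tvlistings.zap2it.com/tv/sportscenter/EP00019917'>SportsCenter</a>
</li>
<li class="zc-ssl-pg" id="row1-5" style="">
<span id="row5Time" class="zc-ssl-pg-time">1:00 PM</span>
<a id="rowTitle5" class="zc-ssl-pg-title" href='http://tvlistings.zap2it.com/tv/sportscenter/EP00019917'>SportsCenter</a>
</li>
</ul>
HTML;

// Hypothetical helper: returns one ['time' => ..., 'title' => ...] entry per title link.
function extractListings(string $html): array
{
    if ($html === '') {
        return []; // nothing to parse, e.g. the download failed
    }

    // Scraped HTML is rarely valid, so collect parser warnings instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $doc = new DOMDocument();
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    $rows = [];
    foreach ($xpath->query('//a[contains(concat(" ", normalize-space(@class), " "), " zc-ssl-pg-title ")]') as $link) {
        // In the sample markup the time span is the nearest preceding sibling of the link.
        $time = $xpath->query('preceding-sibling::span[contains(@class, "zc-ssl-pg-time")][1]', $link)->item(0);
        $rows[] = [
            'time'  => $time ? trim($time->textContent) : null,
            'title' => trim($link->textContent),
        ];
    }
    return $rows;
}

foreach (extractListings($html) as $row) {
    echo $row['time'], ' ', $row['title'], "<br>\n";
}
```

    Unlike a regular expression, this does not depend on attribute order or quoting style, so it should survive small changes in the page markup.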
    "Well done....Consciousness to sarcasm in five seconds!" ~ Terry Pratchett, Night Watch

    How to Ask Questions the Smart Way (not affiliated with this site, but well worth reading)

    My Blog
    cwrBlog: simple, no-database PHP blogging framework

  3. #3
    Join Date
    Sep 2008
    Posts
    62
    Yes, I have confirmed that $contents actually contains the expected HTML text. But I want to match the data before I output it from my PHP script.


    Could you please post the code I need to match the data and then output it from my PHP script?

  4. #4
    Join Date
    Aug 2004
    Location
    Ankh-Morpork
    Posts
    22,326
    PHP Code:
    '/<a id="rowTitle3".*/is' // "s" modifier makes "." include newlines
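
    For comparison, here is a small sketch of both matching styles, run against one row of the markup from the first post instead of the live page. It assumes the link keeps the attribute order shown there. Two things in the original script would leave the output empty regardless of the pattern: it reads into $contents but matches against $data, and its pattern has no capture group, so $matches[1] is never set ("[.*]" is a character class matching a single "." or "*", not "anything").

```php
<?php
// One row of the markup quoted in the first post, used here instead of a live request.
$contents = '<span id="row3Time" class="zc-ssl-pg-time">11:00 AM</span>' . "\n"
    . '<a id="rowTitle3" class="zc-ssl-pg-title" href=\'http://tvlistings.zap2it.com/tv/sportscenter/EP00019917\'>SportsCenter</a>';

// [^>]* skips the remaining attributes; ([^<]+) captures the link text into $matches[1].
$pattern = '/<a id="rowTitle3" class="zc-ssl-pg-title"[^>]*>([^<]+)<\/a>/i';

$rowtitle = '';
if (preg_match($pattern, $contents, $matches) === 1) {
    $rowtitle = trim($matches[1]);
}

// The "s" pattern grabs everything from the link to the end of the input into $rest[0].
preg_match('/<a id="rowTitle3".*/is', $contents, $rest);

echo $rowtitle . "<br>\n";
```

    Checking the return value of preg_match() before reading $matches avoids an undefined-index notice when the page layout changes.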

  5. #5
    Join Date
    Sep 2008
    Posts
    62
    Thank you for your help. I now have a problem with the data that I scrape from the website and output from my PHP script. It does not pick the entry for the correct time. For example, my local time is 10:00pm while the current time on the TV listings is 5:00pm. I can only scrape entries that fall outside the listing's current time, such as 3:00pm.

    Here is the PHP:

    PHP Code:
    <?php

    $data = file_get_contents('http://tvlistings.zap2it.com/tvlistings/ZCSGrid.do?stnNum=10179');

    preg_match_all('/<a id="rowTitle\d+" class="zc-ssl-pg-title"[^>]*>([^<]+)<\/a>/im', $data, $matches);

    $titles = $matches[1];

    echo $titles[19];

    ?>

    Do you know how I can scrape the programme that is on at the current time? For example, my local time is 10:00pm and the TV listings time is 5:00pm.
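
    One possible approach, sketched under assumptions: convert the local clock to the time zone the listings use, then pick the last row whose start time is not later than that moment, instead of a fixed index such as 19. The zone America/New_York is only a guess that fits the five-hour gap described above, and the helper name currentProgramme() and the placeholder titles are invented for the example.

```php
<?php
// Hypothetical helper: returns the title of the programme airing at $now, given
// parallel arrays of "h:mm AM/PM" start times and titles in page order.
function currentProgramme(array $times, array $titles, DateTimeImmutable $now): ?string
{
    $current = null;
    foreach ($times as $i => $label) {
        // Read the label as a time on the same calendar day and in the same zone as $now.
        $start = new DateTimeImmutable($now->format('Y-m-d') . ' ' . $label, $now->getTimezone());
        if ($start <= $now && isset($titles[$i])) {
            $current = $titles[$i]; // keep the latest start time that is not after $now
        }
    }
    return $current;
}

// Assumption: the listings use US Eastern time; replace with the site's real zone.
$listingZone = new DateTimeZone('America/New_York');

// A clock reading 10:00 PM in London is converted to the listing zone.
$local = new DateTimeImmutable('2013-04-15 22:00:00', new DateTimeZone('Europe/London'));
$now = $local->setTimezone($listingZone);

// Placeholder data; real values would come from the scraped page.
$times  = ['3:00 PM', '4:00 PM', '5:00 PM', '6:00 PM'];
$titles = ['Programme A', 'Programme B', 'Programme C', 'Programme D'];

echo $now->format('g:i A'), ' - ', currentProgramme($times, $titles, $now), "\n";
```

    In a real script, $times could be filled by a second preg_match_all() call against the zc-ssl-pg-time spans shown in the first post, and $now could start from new DateTimeImmutable('now', $listingZone) rather than a fixed date.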

  6. #6
    Join Date
    Apr 2013
    Posts
    2
    It is fun and brings a lot of useful things to me. Thank you for sharing.

  7. #7
    Join Date
    Sep 2008
    Posts
    62
    what the hell are you talking about?

    Trolling on my thread?
