www.webdeveloper.com
Thread: [RESOLVED] PHP4 to 5 upgrade broke scripts...

  1. #1
    Join Date
    Jun 2008
    Location
    Europe
    Posts
    1,096

    I have a problem with some scripts I wrote (screen scrapers) that worked great in PHP4 but stopped working the minute I upgraded to PHP5.

    I can give all of my files the .php4 extension, and this solves the problem, but since it affects a number of sites, internal links and hundreds of files, it is not my first-choice solution.

    Here is the scraper. It takes items from the Zazzle results page by category, strips out the formatting and adds my affiliate ID, so that I can present these items on my page.

    PHP Code:
    <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
    <html xmlns="http://www.w3.org/1999/xhtml">
    <head>
    <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" />
    <title>Test of scrape</title>

    <link rel="stylesheet" type="text/css" href="/css/scraper.css" />
    <script type='text/javascript' src='http://www.zazzle.com/js/logging/omniture/s_code.zjs/r-52.78223/site-zazzle.js'></script>

    </head>

    <body>

    <div class="gridCell " id="page_productsGrid_assetCell1">

    <?php
    $page = file_get_contents("http://www.zazzle.com/cool+smiley+gifts");

    //comment out the <span> tags completely
    $page = preg_replace('/<span/', "<!-- <span", $page);
    $page = preg_replace('/<\/span>/', "<\/span> -->", $page);
    $page = preg_replace('/<a /', "<a rel=\"nofollow\" ", $page);


    $rf_id="238219236805025733";
    // Regular expression to parse "&rf=" and the $rf_id into the existing link
    $page=preg_replace("/(.*?)(href\s*=+\s*[\"\'])(.*?)([\"\'])(.*?)/is","$1$2$3?rf=$rf_id$4$5",$page);


    $test = explode('<div style="position:relative" class="clearfix">',$page);
    for($t=1;$t<=count($test)-2;$t++){
     print "<div class=\"gridCellInfo\" id=\"page_products\">";
     print $test[$t];
    }
    ?>
    </body>
    </html>
    This works fine in PHP4, but not at all in PHP5.

    My ideal solution would be an .htaccess file that I could put in any directory under PHP5 to make it default to PHP4.
    I have tried this .htaccess, to no avail:
    Code:
    <IfModule mod_rewrite.c> 
        RewriteEngine On 
        AddType text/html .php4 
        AddHandler php4-script .php .html php5
    </IfModule>
    I have also tried a few alternatives. This appears to be a common problem, but I have scoured the web and found no solution.


    Here are the two pages in PHP4 and PHP5:
    php4: http://www.killersmiley.com/test/cool-smiley.php4
    php5: http://www.killersmiley.com/test/cool-smiley.php
    Last edited by donatello; 02-25-2011 at 05:22 AM. Reason: Added the hyperlinks to the php4 and php5 test pages

  2. #2
    Join Date
    Aug 2004
    Location
    Ankh-Morpork
    Posts
    19,220
    Define "does not work". Are there any error messages in the PHP error log?

    Most upgrade problems I've seen are due to a change in the PHP configuration rather than actual version issues, e.g. dependence on register_globals. In this case, allow_url_fopen being turned off would be a likely candidate.
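
    A quick way to check both suggestions is sketched below. It is only an illustration: the helper name scraper_config_report() is made up for this example, and it assumes PHP 5.2 or later for filter_var().

```php
<?php
// Hypothetical helper (not from the thread): collect the settings most
// likely to affect a scraper after a PHP upgrade.
function scraper_config_report() {
    return array(
        'php_version'      => PHP_VERSION,
        // file_get_contents() can only fetch http:// URLs when this is on.
        'allow_url_fopen'  => filter_var(ini_get('allow_url_fopen'), FILTER_VALIDATE_BOOLEAN),
        // ini_get() returns false for directives that no longer exist,
        // which filter_var() also reports as false.
        'register_globals' => filter_var(ini_get('register_globals'), FILTER_VALIDATE_BOOLEAN),
        // Where PHP writes its error log; empty means the server default.
        'error_log'        => ini_get('error_log'),
    );
}

// While debugging, surface every warning and notice on the page itself.
error_reporting(E_ALL);
ini_set('display_errors', '1');

print_r(scraper_config_report());
```

    If allow_url_fopen shows up as false, file_get_contents() on an http:// URL will fail with a warning, and the rest of the script has nothing to work on.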
    "Please give us a simple answer, so that we don't have to think, because if we think, we might find answers that don't fit the way we want the world to be."
    ~ Terry Pratchett in Nation

    eBookworm.us

  3. #3
    Join Date
    Oct 2008
    Location
    U.S.
    Posts
    726
    The problem is in the regular expression for the href replacements. Adding echo $page; before and after that preg_replace line will show you this. This works for me in PHP 5:
    PHP Code:
    <?php
    $page = file_get_contents("http://www.zazzle.com/cool+smiley+gifts");
    //comment out the <span> tags completely
    $page = preg_replace('/<span/', "<!-- <span", $page);
    $page = preg_replace('/<\/span>/', "<\/span> -->", $page);
    $page = preg_replace('/<a /', "<a rel=\"nofollow\" ", $page);
    $rf_id = "238219236805025733";
    // Regular expression to parse "&rf=" and the $rf_id into the existing link
    $page = preg_replace("/(href\s*=+\s*\"[^\"]*)/is", "$1?rf=$rf_id", $page);
    $test = explode('<div style="position:relative" class="clearfix">',$page);
    for($t=1;$t<=count($test)-2;$t++){
        //append _$t to the id, invalid html to have multiple elements same id
        print "<div class=\"gridCellInfo\" id=\"page_products_$t\">";
        print $test[$t];
    }
    ?>
    The regex did not seem to like the question marks after the asterisks (the lazy .*? groups) so much, and they seemed unnecessary anyhow.
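
    One possible explanation, which this thread does not confirm, is PCRE's backtracking limit: PHP 5.2 added the pcre.backtrack_limit setting, and when a pattern exceeds it, preg_replace() returns NULL instead of the modified string, so the page comes out empty. The sketch below wraps the simplified pattern in a hypothetical helper, add_affiliate_id(), that checks for that case. The helper name and the sample link are made up for illustration.

```php
<?php
// Hypothetical wrapper around the simplified href pattern from the post above.
function add_affiliate_id($html, $rf_id) {
    $result = preg_replace('/(href\s*=+\s*"[^"]*)/is', '$1?rf=' . $rf_id, $html);
    if ($result === null) {
        // preg_replace() returns NULL on failure; preg_last_error() says why,
        // e.g. PREG_BACKTRACK_LIMIT_ERROR when the backtrack limit was hit.
        trigger_error('preg_replace failed, error code ' . preg_last_error(), E_USER_WARNING);
        return $html; // fall back to the unmodified page
    }
    return $result;
}

// Made-up sample link, standing in for the scraped page.
echo add_affiliate_id('<a href="http://www.zazzle.com/mug">Mug</a>', '238219236805025733');
// prints: <a href="http://www.zazzle.com/mug?rf=238219236805025733">Mug</a>
```

    Running the original pattern through the same NULL check on the live page would show whether the limit really was the cause.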
    Last edited by astupidname; 02-26-2011 at 09:18 PM.

  4. #4
    Join Date
    Jun 2008
    Location
    Europe
    Posts
    1,096

    Resolved

    Wow!
    Thanks a million! That fixed it and I am now able to fix the dozens of pages that got whacked when I did the upgrade!

    Thanks a million!!!!!!

