www.webdeveloper.com

Thread: preg_match of string with embedded newlines

  1. #1
    Join Date
    Sep 2005
    Location
    Portland Oregon
    Posts
    153

    preg_match of string with embedded newlines

    A fairly simple problem, but I've been scratching my head over regex patterns for too long.

    Problem: Determine if a message is a forwarded message (Subject: FW.*). If so, strip off everything above (and including) that line and return the rest.

    Constraints: Must work for both Windoz and Unix newlines ('\r\n' and '\n' respectively).

    I Have:
    Code:
    $fwdPat = '/.*subject:[ ]*fw.*[\\r\\n](.*)/msi'; 
    
    if( preg_match($fwdPat, $msgBuf, $results) ) {
    	$msgBuf = $results[1];
    }
    return($msgBuf);
    I get the match, but everything is returned in the full pattern match ($results[0]) and nothing in the substring match ($results[1]).

    I'm sure it's simple, I just can't see it.

    Thanks in advance for any pointers.

    tony

  2. #2
    Join Date
    Jan 2005
    Location
    Alicante (Spain)
    Posts
    7,742
    This is caused by improper control over greediness. For more help, show me the first few lines of the email.
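    To make the greediness point concrete, here is a minimal sketch of what the pattern from post #1 does. The message text is a made-up stand-in (an assumption, not the poster's real mail) that ends in a newline, as most mail bodies do.

```php
<?php
// Hypothetical sample message (not the real mail), ending in a newline.
$msgBuf = "To: Tony\r\nSubject: FW: Order 1\r\n\r\nbody line 1\r\nbody line 2\r\n";

// Pattern from post #1. In a single-quoted string, '\\r' reaches PCRE as \r.
$fwdPat = '/.*subject:[ ]*fw.*[\\r\\n](.*)/msi';

$matched = preg_match($fwdPat, $msgBuf, $results);

// With the 's' modifier, the greedy "fw.*" runs to the end of the buffer and
// backs off only as far as the LAST newline character, so the capture group
// that follows is left with nothing to match.
var_dump($matched);                 // int(1)
var_dump($results[0] === $msgBuf);  // bool(true)
var_dump($results[1] === '');       // bool(true)
```

    That matches the symptom described: the whole buffer ends up in $results[0] and $results[1] is empty.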

    Quote Originally Posted by tbirnseth
    Constraints: Must work for both Windoz and Unix newlines ('\r\n' and '\n' respectively).
    <CRLF> is the standard as per the RFC, irrespective of the platform.
    Last edited by bokeh; 09-29-2006 at 04:20 PM.

  3. #3
    Join Date
    Jan 2005
    Location
    Alicante (Spain)
    Posts
    7,742
    Probably something like this:
    PHP Code:
    $fwdPat = '/^.*subject:\s*fw[^\r\n]*\r?\n(.*)$/msi';
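    A rough way to check that pattern against both newline styles is sketched below. The helper name stripForwardHeader() and the sample messages are assumptions made up for illustration; only the pattern itself comes from this post.

```php
<?php
// Pattern from post #3, with the assignment written out in full.
$fwdPat = '/^.*subject:\s*fw[^\r\n]*\r?\n(.*)$/msi';

// Hypothetical helper (name not from the thread): return everything after the
// forwarded Subject line, or the message unchanged if it is not a forward.
function stripForwardHeader(string $msgBuf, string $fwdPat): string
{
    if (preg_match($fwdPat, $msgBuf, $results)) {
        return $results[1];
    }
    return $msgBuf;
}

// Made-up sample messages: Unix newlines, Windows newlines, and a non-forward.
$unix    = "To: Tony\nSubject: FW: Order 1\n\nFrom: Patty\nSubject: Order 1\n";
$windows = str_replace("\n", "\r\n", $unix);
$plain   = "To: Tony\nSubject: Order 1\n\nbody\n";

$unixRest    = stripForwardHeader($unix, $fwdPat);
$windowsRest = stripForwardHeader($windows, $fwdPat);
$plainRest   = stripForwardHeader($plain, $fwdPat);

var_dump($unixRest === "\nFrom: Patty\nSubject: Order 1\n");          // bool(true)
var_dump($windowsRest === "\r\nFrom: Patty\r\nSubject: Order 1\r\n"); // bool(true)
var_dump($plainRest === $plain);                                      // bool(true)
```

    The [^\r\n]* part keeps the match on the Subject line itself, and \r?\n then consumes exactly one line ending of either style, so the capture group starts right after it.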

  4. #4
    Join Date
    Sep 2005
    Location
    Portland Oregon
    Posts
    153
    Assume the greediness is the use of both 'm' and 's' modifiers....

    Here's part of the message. Note: I can't rely on the ---Original Message--- line. This particular message is from a Windoz environment, hence there are both '\r' and '\n' as EOL characters.

    From: Orders [orders@smallelectrics.com]
    Sent: Thursday, September 28, 2006 9:30 PM
    To: Tony Birnseth
    Subject: FW: Order 22619 from catalog smallelectrics



    -----Original Message-----
    From: Patty Person (through Yahoo! Store Order System) [mailto:pattyperson@junk.net]
    Sent: Thursday, September 28, 2006 2:30 PM
    To: orders@smallelectrics.com
    Subject: Order 22619 from catalog smallelectrics

    Date Thu Sep 28 14:29:31 PDT 2006

  5. #5
    Join Date
    Aug 2004
    Location
    Ankh-Morpork
    Posts
    19,529
    Also note that since you are single-quoting your regex, you don't need to escape the back-slashes before your \r and \n escape sequences. (Bokeh correctly changed them, I just wanted to explicitly point it out in case you didn't catch that.)
    "Please give us a simple answer, so that we don't have to think, because if we think, we might find answers that don't fit the way we want the world to be."
    ~ Terry Pratchett in Nation

    eBookworm.us

  6. #6
    Join Date
    Sep 2005
    Location
    Portland Oregon
    Posts
    153

    The pattern works. I hadn't thought of negating the '\r' and '\n' characters and then explicitly matching them at the end.

    Thanks for the help. Now my head will stop bleeding!

    tony

  7. #7
    Join Date
    Jan 2005
    Location
    Alicante (Spain)
    Posts
    7,742
    The pattern in post #3 seems to work.
    Quote Originally Posted by tbirnseth
    Assume the greediness is the use of both 'm' and 's' modifiers.
    "*" is a greedy quantifier by default whereas "*?" is lazy. This means they match as much (greedy) or as little (lazy) as possible. If there is only one match both methods will find that one match, but if there is more than one match each method will find a different match.

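    The greedy versus lazy difference described above can be seen in a small sketch. The subject strings are invented for illustration.

```php
<?php
// Hypothetical subject line with two colons, so there are two places to stop.
$subject = 'FW: Order 22619: shipped';

preg_match('/^(.*):/', $subject, $greedy); // greedy: gives back only to the LAST colon
preg_match('/^(.*?):/', $subject, $lazy);  // lazy: grows only to the FIRST colon

echo $greedy[1], "\n"; // FW: Order 22619
echo $lazy[1], "\n";   // FW

// With a single colon there is only one possible match, so both agree.
preg_match('/^(.*):/', 'Order 22619: shipped', $greedyOne);
preg_match('/^(.*?):/', 'Order 22619: shipped', $lazyOne);

var_dump($greedyOne[1] === $lazyOne[1]); // bool(true)
```

    Neither the 'm' nor the 's' modifier changes this behaviour; 's' only widens what "." may match, which gives a greedy ".*" more text to swallow.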
    Quote Originally Posted by NogDog
    you don't need to escape the back-slashes before your \r and \n escape sequences.
    That's right, but in this instance the end result is the same:
    PHP Code:
    <?php

    // Single-quoted strings: \\ collapses to one backslash, and \r is left as is.
    echo '\\r\\n'; # \r\n

    echo '<br>';

    echo '\r\n'; # \r\n

    ?>
    Last edited by bokeh; 09-29-2006 at 04:53 PM.
