www.webdeveloper.com

Thread: Just Need A "Little" Regular Expression...

  1. #1
    Join Date
    Sep 2008
    Posts
    260

    Just Need A "Little" Regular Expression...

    Hello...once again......I have a little nagging issue...

    Let's say I have this:

    PHP Code:
    $foo1 = "Some text : some more text";
    If I want to use str_split() to split the text in half at the colon into an array variable, I would use this:

    PHP Code:
    $split_string = str_split($foo1, strpos($foo1, ":") + 1);
    But the problem is that I want to split the text at the very first character that isn't numeric or alpha...instead of being specific to a colon.

    Now I've tried a regular expression but I'm not hitting it for some reason.

    PHP Code:
    $split_string = str_split($foo1, strpos($foo1, "[^A-Za-z0-9]") + 1);
    ...this feels like it should work, but it doesn't. Any suggestions?

  2. #2
    Join Date
    Nov 2008
    Posts
    2,477
    str_split is not the right function for this. For your first example, you would be better off using explode():

    PHP Code:
    $split_string = explode($foo1, ':');
    Your second example could never work. Have another look at the strpos manual page. The second argument is the string you want to find the position of. It neither knows nor cares if your string happens to be a regex pattern - it will just look for that literal string. When you want to use regex, you need to use a regex-specific function. In this case, you want preg_split:

    PHP Code:
    $split_string = preg_split('/[^A-Z0-9]/i', $foo1);
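A minimal sketch of the difference, assuming the goal is to cut only at the first non-alphanumeric character. The third argument to preg_split() (the limit) is an addition for illustration and is not part of the reply above.

```php
<?php
$foo1 = "Some text : some more text";

// strpos() looks for the literal string, so a regex pattern is simply not found.
$pos = strpos($foo1, "[^A-Za-z0-9]");
var_dump($pos); // bool(false)

// preg_split() treats its first argument as a pattern.
// A limit of 2 stops after the first match, so at most two pieces come back.
$parts = preg_split('/[^A-Z0-9]/i', $foo1, 2);
print_r($parts); // "Some" and "text : some more text"
```

Note that a space counts as a non-alphanumeric character, so with this pattern the first cut lands right after "Some".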
    The first rule of Tautology Club is the first rule of Tautology Club.

  3. #3
    Join Date
    Sep 2008
    Posts
    260
    Quote Originally Posted by Mindzai View Post
    str_split is not the right function for this. For your first example, you would be better off using explode():

    PHP Code:
    $split_string = explode($foo1, ':');
    Your second example could never work. Have another look at the strpos manual page. The second argument is the string you want to find the position of. It neither knows nor cares if your string happens to be a regex pattern - it will just look for that literal string. When you want to use regex, you need to use a regex-specific function. In this case, you want preg_split:

    PHP Code:
    $split_string = preg_split('/[^A-Z0-9]/i', $foo1);
    Thanks for the response...however, neither of your suggestions is working so far.

    With the second suggestion, which would be more applicable to my situation, for some reason it splits after the end of the first word in the string..

  4. #4
    Join Date
    Aug 2004
    Location
    Ankh-Morpork
    Posts
    19,246
    PHP Code:
    $parts = preg_split('#\s*[^\sa-z0-9]\s*#', $text);
    ?
    "Please give us a simple answer, so that we don't have to think, because if we think, we might find answers that don't fit the way we want the world to be."
    ~ Terry Pratchett in Nation

    eBookworm.us

  5. #5
    Join Date
    Sep 2008
    Posts
    260
    Quote Originally Posted by NogDog View Post
    PHP Code:
    $parts = preg_split('#\s*[^\sa-z0-9]\s*#', $text);
    ?
    This didn't work either...

    Okay...here's what I'm trying to split

    PHP Code:
    <title>New ATL Music : Rocko Ft. Gucci Mane, Officer Ross &amp; Soulja Boy &#8211; Maybe (Remix)</title>
    This is a line in an xml file.

    I'm trying to split the string right at the "colon" or whatever the very first non alpha, or non numeric character the user enters. Like if they accidentally enter in a comma or semicolon...

    This worked for me, but it doesn't look out for user error (any other non alpha or numeric character)..

    PHP Code:
    $split_string = str_split($title, strpos($title, ":") + 1);
    $item_title = $split_string[0];
    $item_artist = $split_string[1];
    Last edited by ChuckB; 07-26-2010 at 11:05 AM.
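A hedged sketch of why the str_split() approach is fragile: str_split() cuts the whole string into equal-length chunks, so the second element is only the complete remainder when the text after the colon is no longer than the text before it. The sample title below is made up for illustration; explode() with a limit of 2 is shown as one possible alternative.

```php
<?php
// Hypothetical title where the part after the colon is longer than the part before it.
$title = "Artist: A much longer song title here";

// Chunk length is 7, so everything after the colon is chopped into 7-character pieces.
$chunks = str_split($title, strpos($title, ":") + 1);
print_r($chunks); // first two chunks: "Artist:" and " A much"

// explode() with a limit of 2 cuts once, at the first colon, whatever the lengths are.
$parts = array_map('trim', explode(':', $title, 2));
print_r($parts); // "Artist" and "A much longer song title here"
```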

  6. #6
    Join Date
    Aug 2004
    Location
    Ankh-Morpork
    Posts
    19,246
    Well, "<" and ">" are non-alpha, non-numeric characters. Should they also be excluded in the character class? Or will the separator always be surrounded by a space on each side? What if there is a comma or apostrophe in the first part of the text before the separator? Should you be using SimpleXML or DOM to get just the text without the tags first, then parse it? Inquiring minds want to know.
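One way the SimpleXML suggestion could look, as a rough sketch. The one-element document and the colon-only pattern are assumptions for illustration, since the real feed structure is not shown in the thread. It also assumes the SimpleXML extension is available.

```php
<?php
// Hypothetical one-item document standing in for the real XML file.
$xml = simplexml_load_string(
    '<item><title>Some text : more &amp; more text</title></item>'
);

// Casting the element to string yields the text only: no tags, entities decoded.
$text = (string) $xml->title;

// Split once at the colon, swallowing any whitespace around it.
$parts = preg_split('/\s*:\s*/', $text, 2);
print_r($parts); // "Some text" and "more & more text"
```

Working on the extracted text means "<", ">" and the "&" of an entity never reach the regular expression.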

  7. #7
    Join Date
    Sep 2008
    Posts
    260
    Quote Originally Posted by NogDog View Post
    Well, "<" and ">" are non-alpha, non-numeric characters. Should they also be excluded in the character class? Or will the separator always be surrounded by a space on each side? What if there is a comma or apostrophe in the first part of the text before the separator? Should you be using SimpleXML or DOM to get just the text without the tags first, then parse it? Inquiring minds want to know.
    I see what you're saying...there are too many possibilities so the user has to follow some instruction.

    So what I'm looking for now is as follows:

    Split at any one of these characters: "~ (tilde), : (colon) , ; (semicolon) , - (hyphen), | (..not sure what this is so I'll call it an 'or' statement), & (ampersand), ^ (..not sure what this is either)"..

    - split regardless of any space...

    and that's it...

  8. #8
    Join Date
    Sep 2008
    Posts
    260
    Quote Originally Posted by NogDog View Post
    Well, "<" and ">" are non-alpha, non-numeric characters. Should they also be excluded in the character class? Or will the separator always be surrounded by a space on each side? What if there is a comma or apostrophe in the first part of the text before the separator? Should you be using SimpleXML or DOM to get just the text without the tags first, then parse it? Inquiring minds want to know.
    I'm sorry...I'm just trying to keep it rather simple...I'm trying to split at any non-alpha and non-numeric character regardless of spacing...it doesn't matter what the character is...

  9. #9
    Join Date
    Nov 2008
    Posts
    2,477
    Quote Originally Posted by ChuckB View Post
    Thanks for the response...however, neither of your suggestions are working so far.

    With the second suggestion, which would be more applicable to my situation, for some reason it splits after the end of the first word in the string..
    Ah I got the arguments the wrong way round in the first one, I should have double checked the manual.

    The second one is working correctly, because a space (which is what's causing the split) is a non alphanumeric character.
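For reference, a small sketch of the argument order, using the sample string from the first post. explode() takes the delimiter first and the string second; with the arguments swapped, the whole sentence is treated as the delimiter and never occurs inside ":".

```php
<?php
$foo1 = "Some text : some more text";

// Swapped arguments: the delimiter is the whole sentence, the subject is ":".
$wrong = explode($foo1, ':');
print_r($wrong); // a single element, ":"

// Delimiter first, string second.
$right = explode(':', $foo1);
print_r($right); // "Some text " and " some more text"
```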

  10. #10
    Join Date
    Aug 2004
    Location
    Ankh-Morpork
    Posts
    19,246
    Quote Originally Posted by ChuckB View Post
    I see what you're saying...there are too many possibilities so the user has to follow some instruction.

    So what I'm looking for now is as follows:

    Split at any one of these characters: "~ (tilde), : (colon) , ; (semicolon) , - (hyphen), | (..not sure what this is so I'll call it an 'or' statement), & (ampersand), ^ (..not sure what this is either)"..

    - split regardless of any space...

    and that's it...
    Try:
    PHP Code:
    '#\s*[~:;|-]\s*#' 
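A possible way to use the '#\s*[~:;|-]\s*#' pattern, sketched under a few assumptions: the sample title is invented, and the limit of 2 is an addition so that a later hyphen in the song name does not produce a third piece. The variable names $item_title and $item_artist follow the earlier post.

```php
<?php
// Hypothetical title with a second separator character later in the string.
$title = "New ATL Music : Rocko - Maybe (Remix)";

// The hyphen sits last in the character class, so it is a literal hyphen, not a range.
// \s* on both sides removes the spaces around the separator.
$parts = preg_split('#\s*[~:;|-]\s*#', $title, 2);

$item_title  = $parts[0];
$item_artist = isset($parts[1]) ? $parts[1] : ''; // empty when no separator was found

echo $item_title, "\n";  // New ATL Music
echo $item_artist, "\n"; // Rocko - Maybe (Remix)
```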

  11. #11
    Join Date
    Nov 2008
    Posts
    2,477
    Quote Originally Posted by ChuckB View Post
    I'm sorry...I'm just trying to keep it rather simple...I'm trying to split at any non-alpha and non-numeric character regardless of spacing...it doesn't matter what the character is...
    The issue is that spaces are non-alpha and non-numeric characters. If you want to exclude them from the split, you need to include them in the character class:

    PHP Code:
    $split_string = preg_split('/[^A-Z0-9\s]/i', $foo1);
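A short comparison of the two patterns on the sample string from the first post. The trim() step at the end is an assumption about what is wanted, since the spaces around the colon stay attached to the pieces.

```php
<?php
$foo1 = "Some text : some more text";

// Without \s in the negated class, every space is a split point too.
$every = preg_split('/[^A-Z0-9]/i', $foo1);
echo count($every), "\n"; // 7 (two of the pieces are empty strings)

// With \s in the negated class, only the colon splits the string.
$split_string = preg_split('/[^A-Z0-9\s]/i', $foo1);
print_r($split_string); // "Some text " and " some more text"

// Optional clean-up of the leftover spaces.
$trimmed = array_map('trim', $split_string);
print_r($trimmed); // "Some text" and "some more text"
```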

  12. #12
    Join Date
    Aug 2007
    Posts
    3,767
    By the way, | is called a vertical bar, or a pipe from its common use as the pipe command in Unix (similar to how you called it or). ^ is normally called a carot (or circumflex on a letter), and sometimes called hat in Maths.
    Great wit and madness are near allied, and fine a line their bounds divide.

  13. #13
    Join Date
    Aug 2004
    Location
    Ankh-Morpork
    Posts
    19,246
    Quote Originally Posted by Declan1991 View Post
    By the way, | is called a vertical bar, or a pipe from its common use as the pipe command in Unix (similar to how you called it or). ^ is normally called a carot (or circumflex on a letter), and sometimes called hat in Maths.
    <spelling_police> Caret </spelling_police>

  14. #14
    Join Date
    Sep 2008
    Posts
    260
    Quote Originally Posted by Mindzai View Post
    The issue is that spaces are non-alpha and non-numeric characters. If you want to exclude them from the split, you need to include them in the character class:

    PHP Code:
    $split_string = preg_split('/[^A-Z0-9\s]/i', $foo1);
    Thanks for the help guys..sorry for the late response..my internet was down for a couple of days.

    But thanks..everything worked with this line above...thanks Mindzai..

  15. #15
    Join Date
    Aug 2007
    Posts
    3,767
    Quote Originally Posted by NogDog View Post
    <spelling_police> Caret </spelling_police>
    Thanks for that. That goes down as one of my worst typos, I must have been hungry!
