remove words from string

Webskater
08-20-2003, 03:34 PM
Before passing phrases into a Knowledge Base search, I would like to loop through the phrase and remove a set list of words such as "and", "if", etc. Does anyone know a clever way of doing this?
Cheers.

AdamGundry
08-20-2003, 04:00 PM
Something like this (using a RegExp):

    str = 'The sentence goes here.';
    removeWords = 'the|and|but|if';
    re = new RegExp(removeWords, 'gi');
    str = str.replace(re, '');

Adam

Webskater
08-21-2003, 04:58 AM
Thanks for your reply. Trying to add some more words to the list of words to be removed, I got stuck on words like can't and don't. It does not like the apostrophe. How can I get around this, please? I would also like to be able to remove a full stop from the end of any word. Thanks again for your help.

AdamGundry
08-21-2003, 05:16 AM
To include words with apostrophes, you need to escape the apostrophe with a backslash (because the string is single-quoted). You also need to escape the full stop, and since the pattern is built from a string, that backslash has to be doubled:

    removeWords = 'can\'t|don\'t|\\.';

See the RegExp documentation: http://devedge.netscape.com/library/manuals/2000/javascript/1.3/reference/regexp.html
Adam

Charles
08-21-2003, 05:49 AM
Or you can use the other syntax:

    str = str.replace(/the|and|but|if|can't|don't/gi, '');

Webskater
08-21-2003, 07:03 AM
Thanks for your replies. If I try to eliminate words thus:

    |do|dont|  //someone typing don't without apostrophe

the 'do' gets stripped off the front of 'dont', leaving 'nt'. Is there a way of forcing this to examine each word as a separate entity? That is:

the characters between one space and the next form a word
the characters from the beginning of the string to the first space form a word
the characters from the last space to the end of the string form a word

Thanks again

pyro
08-21-2003, 07:13 AM
Use \b to designate a word boundary:

    <script type="text/javascript">
    str = "Dont try this at home.";
    str = str.replace(/\b(the|and|but|if|can't|do|don't)\b/gi, '');
    alert(str);
    </script>

Charles
08-21-2003, 07:23 AM
Try flipping the do and the don't. That is to say, filter out the don't first.

Webskater
08-21-2003, 07:27 AM
Thanks for all your replies - it now works perfectly. This RegExp stuff is a couple of brain cells too far for me. Thanks again.

webdeveloper.com
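The suggestions in this thread (a RegExp built from a word list, \b word boundaries, trying "don't" before "do", and stripping full stops) can be combined into one small function. What follows is only a sketch under stated assumptions, not code from any of the posters: the names escapeRegExp and stripWords and the sample word list are made up for illustration, and collapsing the leftover spaces is an extra step the thread does not ask for.

```javascript
// Hypothetical helper: backslash-escape characters that are special in a RegExp,
// so a word containing "." or "?" is matched literally.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Hypothetical helper: remove every listed word from the phrase.
function stripWords(phrase, words) {
  // Longest first, so "can't" is tried before "can" and "don't" before "do".
  // \b treats the apostrophe as a boundary, so a shorter word could otherwise
  // match the front of a longer one and leave "'t" behind.
  const sorted = words.slice().sort(function (a, b) {
    return b.length - a.length;
  });
  const pattern = '\\b(?:' + sorted.map(escapeRegExp).join('|') + ')\\b';

  return phrase
    .replace(new RegExp(pattern, 'gi'), '') // drop the listed words, any case
    .replace(/\.(?=\s|$)/g, '')             // drop a full stop that ends a word
    .replace(/\s+/g, ' ')                   // collapse the gaps left behind
    .trim();
}

// Sample word list (an assumption, not from the thread's search application).
const stopWords = ['and', 'if', 'do', 'dont', "don't", "can't"];
console.log(stripWords("Dont try this at home and don't do it if you can't.", stopWords));
// prints "try this at home it you"
```

Passing the words as an array rather than a hand-written 'a|b|c' string means the apostrophe and full-stop escaping discussed above is handled in one place instead of per word.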
Copyright Internet.com Inc., All Rights Reserved.