the match and substitute functionality


cgi_js
10-30-2005, 10:28 PM
Hello

I get some feedback from my customers, and when I receive it, some of the sentences look like this: something1.something2

This is not good English, and I want to convert it into something1. something2, which is the right way to write the end of one sentence followed by another (that is, with a space between the period and the first character of the next sentence).

It's not only the period that concerns me, but also the following:

something1,something2 to be changed to something1, something2
something1(something2) to be changed to something1 (something2)

and some other non-word characters as well.

Let's suppose that when I receive the user's feedback I store it in a file, and when I read it back from that file I put it in the following variable: $tmp

Therefore, I do this:


if($tmp =~ /[\S][\W][\S]/){
$tmp =~ s/\./\. /g;
$tmp =~ s/\(/ \(/g;
$tmp =~ s/\)/\) /g;
...
...
... and all my concerned non-word characters as well
}


This is not practical for me because there are a lot of non-word characters out there and I need to take them all into account. Therefore, I tried this:


if($tmp =~ /[\S][\W][\S]/){
$tmp =~ s/[\S][\W][\S]/[\S][\W] [\S]/g;
}



It didn't work and gave me an unwanted result, so how can I get around this?

Nedals
10-31-2005, 08:15 PM
use strict;
use warnings;

while (<DATA>) {
    chomp;                        # drop the newline so print doesn't double it
    s/(\w)([.,])(\w)/$1$2 $3/g;   # chars where a space needs to be added after
    s/(\w)(\()(\w)/$1 $2$3/g;     # chars where a space needs to be added before
    print "$_\n";
}

__DATA__
something1.something2
something1,something2
something1(something2)

There's probably a better way to code this, but here's a first cut.
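
One possible generalisation, offered only as a sketch: the replacement side of s/// is an ordinary string, not a pattern, so character classes such as [\S][\W] written there are inserted as literal text rather than as whatever was matched. To reuse the matched characters they have to be captured with parentheses and referred to as $1, $2 and so on. The helper name fix_spacing and the exact punctuation lists below are assumptions chosen for illustration, not something from the original posts; adjust the character classes to the characters you actually care about.

```perl
use strict;
use warnings;

# Hypothetical helper (name and punctuation lists are assumptions):
# normalise spacing around punctuation in a single string.
sub fix_spacing {
    my ($text) = @_;

    # Punctuation that should be FOLLOWED by a space. The lookahead (?=\w)
    # checks for a following word character without consuming it, so a run
    # like "a.b.c" is fixed in one pass.
    $text =~ s/(\w[.,;:!?)])(?=\w)/$1 /g;

    # Punctuation that should be PRECEDED by a space.
    $text =~ s/(\w)(?=[(\[])/$1 /g;

    return $text;
}

for my $sample ('something1.something2',
                'something1,something2',
                'something1(something2)') {
    print fix_spacing($sample), "\n";
}
```

Because the patterns only fire when a word character sits directly on both sides, text that is already spaced correctly is left alone. The same rule also means decimals such as 3.14 or host names such as example.com would get a space inserted, so some extra exclusions would likely be needed for real feedback text.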