updated bad word filter- explode it?


Mouse77e
10-20-2006, 09:18 AM
I am using a simple but effective “bad word” filter on my site –

Bad_word Filter code:


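// Split the pipe-separated list into an array of words
$bad_words = explode('|', 'badword1|badword2|badword3|etc|etc');
foreach ($bad_words as $naughty)
{
    // Case-insensitive literal replace; eregi_replace() was removed in PHP 7
    $comments = str_ireplace($naughty, "#!@%*#", $comments);
}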



But as always, when you solve one little problem you come up with another: those pesky kids and their rude #!@%*# words.

As I’ve added words to the filter, they have been swapped for variants in which one or more characters are replaced with symbols or digits. For example, the word ‘*****’ [sorry for any offence to Meredith Brooks fans, or anybody else for that matter] can easily be filtered on its own using a list, but the same word spelt ‘8itch’, ‘B1tch’, ‘Bi+ch’ etc. could be equally offensive.

Any ideas? (Short of trying to work out every variation of every word.)

Mouse

NogDog
10-20-2006, 10:25 AM
Here's what I've come up with after thinking about it a bit:

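<?php
// Map each letter (as a regex) to a character class of look-alike substitutes
$altChars = array
(
    '/a/i' => '[a@]',
    '/b/i' => '[b3]',
    '/i/i' => '[i!|]',
    '/s/i' => '[s\$5]',
    '/t/i' => '[t+]',
    '/z/i' => '[z2]'
);
// Harmless stand-in words for the demonstration
$badWords = array
(
    'bismuth',
    'bastion',
    'stable'
);
// Turn a plain word into a case-insensitive regex that also matches its disguised spellings
function makeRegex(&$word, $key, $altChars)
{
    $word = ('/'.preg_replace(array_keys($altChars), $altChars, $word).'/i');
}
// Convert every entry of $badWords in place
array_walk($badWords, 'makeRegex', $altChars);
$comments = 'Bi$mu+h and B@ST!ON are not words often associated with sTa3lEs';
echo preg_replace($badWords, '#####', $comments);
?>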
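
A possible variation on the same idea, in case the word list ever contains regex metacharacters or the substitution table grows: build each pattern one character at a time and run everything through preg_quote(), so no replacement can be re-matched by a later rule. This is only a sketch. The function names buildPattern() and filterComments() and the extra look-alikes ('4', '8', '1') are my own assumptions, not part of the code above.

```php
<?php
// Illustrative map: letter => string of characters accepted in its place.
// The entries are assumptions; extend them to suit your own site.
$altChars = array(
    'a' => 'a@4',
    'b' => 'b38',
    'i' => 'i!|1',
    's' => 's$5',
    't' => 't+',
    'z' => 'z2',
);

// Hypothetical helper: build a case-insensitive regex for one word.
function buildPattern($word, array $altChars)
{
    $pattern = '';
    foreach (str_split(strtolower($word)) as $char) {
        if (isset($altChars[$char])) {
            // preg_quote() keeps every alternative literal inside the class
            $pattern .= '[' . preg_quote($altChars[$char], '/') . ']';
        } else {
            $pattern .= preg_quote($char, '/');
        }
    }
    return '/' . $pattern . '/i';
}

// Hypothetical helper: mask every listed word and its disguised spellings.
function filterComments($comments, array $badWords, array $altChars)
{
    $patterns = array();
    foreach ($badWords as $word) {
        $patterns[] = buildPattern($word, $altChars);
    }
    return preg_replace($patterns, '#####', $comments);
}

$filtered = filterComments(
    'Bi$mu+h and B@ST!ON are not words often associated with sTa3lEs',
    array('bismuth', 'bastion', 'stable'),
    $altChars
);
echo $filtered;
// ##### and ##### are not words often associated with #####s
```

Like the version above, this matches inside longer words (note the trailing "s" left after "sTa3lE"), so innocent words that happen to contain a listed word will be masked too.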