need help for preg_replace: filter out some text


binhaus
07-25-2007, 10:04 AM
Hi all. I need some expert help with PHP's preg functions.
The following code is used in my forum. It uses preg_replace to replace every link in a post with a notice asking the user to log in, whenever the user is not logged in yet.

What I actually need is for preg_replace to skip links on my own domain, links on a few popular domains such as Google, Yahoo and YouTube, and links that contain certain specific text. In other words, if a post contains links like http://xaluan.com or www.google.com, they should stay viewable without logging in, while other links like www.mtvvui.com or webdeveloper.com are hidden by the message.
(Sorry for my English, I hope you understand.)


$ret = "this is the text with some links eg http://www.xaluan.com/ or http://www.mtvvui.com";

if ( !$logedin ){

$replacer = "please log in if you want to see these links";
// matches an "xxxx://yyyy" URL at the start of a line, or after a space.
// xxxx can only be alpha characters.
// yyyy is anything up to the first space, newline, comma, double quote or <
$ret = preg_replace("#(^|[\n ])([\w]+?://[\w\#$%&~/.\-;:=,?@\[\]+]*)#is", $replacer, $ret);

// matches a "www|ftp.xxxx.yyyy[/zzzz]" kinda lazy URL thing
// Must contain at least 2 dots. xxxx contains either alphanum, or "-"
// zzzz is optional.. will contain everything up to the first space, newline,
// comma, double quote or <.
$ret = preg_replace("#(^|[\n ])((www|ftp)\.[\w\#$%&~/.\-;:=,?@\[\]+]*)#is", $replacer, $ret);
}


Thanks for any ideas or help.
ben


binhaus
07-26-2007, 01:20 AM
Can anyone help? It's urgent.

MatMel
07-26-2007, 06:14 AM
OK, if I've understood your problem properly, you want to make exceptions for some defined URLs...
I would solve it like this:

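$exceptions = array("google.de", "yahoo.com");
// PREG_SET_ORDER returns one array per match; index 2 holds the link
// without the leading space or newline.
preg_match_all("#(^|[\n ])((www|ftp)\.[\w\#$%&~/.\-;:=,?@\[\]+]*)#is", $ret, $result, PREG_SET_ORDER);

foreach ($result as $value)
{
    $parsed_url = parse_url($value[2]);
    // Guard against a missing 'host' key so the lookup below cannot raise a notice.
    $host = isset($parsed_url['host']) ? $parsed_url['host'] : '';
    if (!in_array($host, $exceptions))
    {
        $ret = str_replace($value[2], $replacer, $ret);
    }
}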

I didn't check your Regex, but I suppose the script worked before ...

binhaus
07-27-2007, 03:33 AM
Hi, thanks for the nice reply.
In fact I had already tried a similar function without success, and I tested your script but it does not work either.
The problem is that the whitelist entries in $exceptions are domain names or plain text strings, and parse_url does not work with them in two cases: when a link found by preg_match_all has no http:// prefix, and when the domain has a subdomain, because the host is returned in full, e.g. aaa.bbb.com for http://aaa.bbb.com. So if I put bbb.com in the exceptions array, the check still fails (it needs to return true). So using parse_url for the host is not a good solution (I might be wrong).

Here is the full script of the function I am working on.

<?
$ret = ' http://www.mydomain.com/modules.php?name= <br> www.google.com/aaamodules.php?name=
<br> www.192.168.1.1/modules.php?name= <br> lkjl <br> ftp://abc.com:24 <br> ftp.webdeveloper.com/fopost783035
<br> ;jljh kjk lkj kj<br> www.au2.php.net/manual/en/fse-url.php <br>
http://au2.500mb.net/manual/en/fse-url.php <br>
ftp.youa.com/virus.exe sfa<br> dgda.com/virus.exe ';

$retold = $ret;
$u_login = true;
$links_except = array('google.com','youtube','mydomain.com'); // links that stay viewable and clickable without logging in
$links_except_no = array('500mb.com','virus.exe','xxxx'); // links that are never clickable (text still viewable), logged in or not
// all other links require logging in to view and click

function make_clickable($text)
{
global $u_login, $links_except, $links_except_no;
// pad it with a space so we can match things at the start of the 1st line.
$ret = ' ' . $text;
//
// Hide links from unregistered users mod
//
if ( !$u_login ) // not logged in
{
// The thing we replace links with. I like using a quote like box
$replacer = ' Please login to see the link ';
/*
//// The function that processes the $links_except array will go HERE -- still working on it
*/
// matches an "xxxx://yyyy" URL at the start of a line, or after a space.
// xxxx can only be alpha characters.
// yyyy is anything up to the first space, newline, comma, double quote or <
$ret = preg_replace("#(^|[\n ])([\w]+?://[\w\#$%&~/.\-;:=,?@\[\]+]*)#is", $replacer, $ret);
// matches a "www|ftp.xxxx.yyyy[/zzzz]" kinda lazy URL thing
// Must contain at least 2 dots. xxxx contains either alphanum, or "-"
// zzzz is optional.. will contain everything up to the first space, newline,
// comma, double quote or <.
$ret = preg_replace("#(^|[\n ])((www|ftp)\.[\w\#$%&~/.\-;:=,?@\[\]+]*)#is", $replacer, $ret);
// the following regex can be used to replace the two regexes above
// $ret = preg_replace("#(^|[\n ])(((www|ftp)\.|[\w]+?://)[\w\#$%&~/.\-;:=,?@\[\]+]*)#is", $replacer, $ret);
}
else
{
$ret = preg_replace("#(^|[\n ])([\w]+?://[\w\#$%&~/.\-;:=,?@\[\]+]*)#is", "\\1<a href=\"url.php?\\2\" target=\"_blank\">\\2</a>", $ret);
$ret = preg_replace("#(^|[\n ])((www|ftp)\.[\w\#$%&~/.\-;:=,?@\[\]+]*)#is", "\\1<a href=\"url.php?http://\\2\" target=\"_blank\">\\2</a>", $ret);
// $ret = preg_replace("#(^|[\n ])(([\w]+?://|(www|ftp)\.)[\w\#$%&~/.\-;:=,?@\[\]+]*)#is", "\\1<a href=\"url.php?\\2\" target=\"_blank\">\\2</a>", $ret);
}
// Remove our padding..
$ret = substr($ret, 1);
return($ret);
}

echo $retold."<br><hr><br>";
echo make_clickable($ret);
?>


please help !!

binhaus
07-27-2007, 04:25 AM
Sorry, a correction to my last post:
you must switch to $u_login = false; to test the function for the not-logged-in case.
=============
Can anyone help?
Thanks

MatMel
07-27-2007, 04:43 AM
Oh I didn't know that parse_url has problems when the "http" is missing...
Couldn't you just add the "http://", if $parsed_url['host'] is empty and start parse_url again?

And for the subdomains... I just didn't think of them. Couldn't you just cut them with another regex? Because I don't think you want to whitelist only some special subdomains ...
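
A minimal sketch of the two ideas above, assuming PHP 5.3 or later: prepend http:// when the link has no scheme so that parse_url can find a host, then treat the host as whitelisted when it equals a whitelisted domain or ends with a dot plus that domain. The function name link_is_whitelisted and the sample whitelist are made up for illustration; they are not part of the forum script.

```php
<?php
// Hypothetical helper (not from the original script): returns true when the
// link's host is a whitelisted domain or a subdomain of one.
function link_is_whitelisted($url, array $whitelist)
{
    // parse_url() finds no host in "www.google.com/x", so add a scheme first.
    if (!preg_match('#^[a-z][a-z0-9+.\-]*://#i', $url)) {
        $url = 'http://' . $url;
    }
    $host = parse_url($url, PHP_URL_HOST);
    if (!is_string($host) || $host === '') {
        return false;
    }
    $host = strtolower($host);
    foreach ($whitelist as $domain) {
        $domain = strtolower($domain);
        // Exact match, or the host ends with ".domain" (covers subdomains).
        if ($host === $domain || substr($host, -strlen($domain) - 1) === '.' . $domain) {
            return true;
        }
    }
    return false;
}

$whitelist = array('google.com', 'youtube.com');
var_dump(link_is_whitelisted('www.google.com/search?q=php', $whitelist)); // bool(true)
var_dump(link_is_whitelisted('http://aaa.google.com/', $whitelist));      // bool(true)
var_dump(link_is_whitelisted('http://notgoogle.com/', $whitelist));       // bool(false)
```

Comparing the end of the host rather than searching for a substring keeps a look-alike such as notgoogle.com off the whitelist.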

binhaus
07-27-2007, 08:25 AM
Hi, after a few hours of work I have it about 70% complete, but there are still some things to fix.
The problem is that I need the replacement to hit the exact link only, taking into account the blank space " " or newline "\n" at the start and end of the URL.


$ret = ereg_replace("(^|[\n ])(".$urls_filter[$i]."*)", "<a href=\"url.php?".$urls_filter[$i]."\" target=\"_blank\">".$urls_filter[$i]."</a>", $ret);

I need someone expert in regex to help, please...
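
One possible way to replace only the exact link, sketched under the assumption that links are delimited by whitespace or "<" as in the sample text, and that the text is already HTML-escaped: escape the URL with preg_quote() and require a delimiter (or the string boundary) on both sides. The function name link_exact_url is hypothetical. preg_replace_callback is swapped in for ereg_replace here, because ereg_replace treats "?" and "." inside the URL as regex operators and the ereg functions were removed in PHP 7.

```php
<?php
// Hypothetical helper: wrap one exact URL in an anchor tag, leaving longer
// URLs that merely start with the same text untouched.
function link_exact_url($text, $url)
{
    // preg_quote() escapes ".", "?", ":" etc. so the URL is matched literally.
    $quoted = preg_quote($url, '#');
    // (?<![^\s])  : preceded by whitespace or the start of the text
    // (?![^\s<])  : followed by whitespace, "<" or the end of the text
    $pattern = '#(?<![^\s])' . $quoted . '(?![^\s<])#i';
    $anchor = '<a href="url.php?' . $url . '" target="_blank">' . $url . '</a>';
    // A callback avoids "$1" or "\1" inside the URL being read as backreferences.
    return preg_replace_callback($pattern, function ($m) use ($anchor) {
        return $anchor;
    }, $text);
}

$text = "a http://youtube.com b http://youtube.com/watch?v=asds c";
echo link_exact_url($text, 'http://youtube.com');
// Only the first link is wrapped; the longer watch?v=asds link is left as it was.
```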


This is my new function:

<?

$ret = ' http://www.mydomain.com/modules.php?name= <br> www.192.168.1.1/aaamodules.php?name=
<br> www.google.com/modules.php?name= <br> lkjl <br> ftp://abc.com:24 <br> ftp.webdeveloper.com/fopost783035
<br> ;jljh kjk lkj kj<br> www.au2.php.net/manual/en/fse-url.php <br>
http://au2.500mb.net/manual/en/fse-url.php <br>
ftp.youa.com/virus.exe sfa<br> dgda.com/virus.exe ftp://500mb.com/ydflhl/dfd/ http://youtube.com
www.youtube.com
http://youtube.com/watch?v=loiuon
http://youtube.com/watch?v=asds
www.google.com
ftp://google.com
ftp://www.google.com
llkjjk';

$retold = $ret;
$u_login = false;
$links_except = array('youtube.com','google.com','dailymotion.com'); // links that stay viewable and clickable without logging in
$links_except_no = array('500mb.net','virus.exe','xxxx'); // links that are not clickable (text still viewable) when logged in
// all other links require logging in to view and click

function make_clickable($text)
{

global $u_login, $links_except, $links_except_no; // $u_login must be global too, otherwise !$u_login is always true here

$text = preg_replace('#(script|about|applet|activex|chrome):#is', "\\1&#058;", $text);

// pad it with a space so we can match things at the start of the 1st line.
$ret = ' ' . $text;

//
// Hide links from unregistered users mod
//
if ( !$u_login )
{
// The thing we replace links with. I like using a quote like box
$replacer = ' pjpjpjpj';
// make trusted domains clickable, or filter out links that contain a censored word or a bad domain, when not logged in
preg_match_all("#(^|[\n ])(((www|ftp)\.|[\w]+?://)[\w\#$%&~/.\-;:=,?@\[\]+]*)#is", $ret, $result_filter);
$urls_filter = $result_filter[2]; //

if (count($urls_filter) > 0) {
$link_count = count($links_except);
for($i=0;$i<count($urls_filter);$i++)
{
for($ia=0;$ia<$link_count;$ia++) {
if (stristr($urls_filter[$i],$links_except[$ia]) )
{
//$ret = str_replace($urls_filter[$i], "<a href=\"url.php?".$urls_filter[$i]."\" target=\"_blank\">".$urls_filter[$i]."</a>", $ret); // this does not work right when there are several similar URLs
$ret = ereg_replace("(^|[\n ])(".$urls_filter[$i]."*)", "<a href=\"url.php?".$urls_filter[$i]."\" target=\"_blank\">".$urls_filter[$i]."</a>", $ret);
}
}
}
}

/// end filter out
$ret = preg_replace("#(^|[\n ])(((www|ftp)\.|[\w]+?://)[\w\#$%&~/.\-;:=,?@\[\]+]*)#is", $replacer, $ret);

}
else
{

preg_match_all("#(^|[\n ])(((www|ftp)\.|[\w]+?://)[\w\#$%&~/.\-;:=,?@\[\]+]*)#is", $ret, $result_filter);
$urls_filter = $result_filter[2]; //

if (count($urls_filter) > 0) {
$link_count = count($links_except_no);
for($i=0;$i<count($urls_filter);$i++)
{
for($ia=0;$ia<$link_count;$ia++) {
if (stristr($urls_filter[$i],$links_except_no[$ia]) ) {
$ret = str_replace($urls_filter[$i], "<font color=\"#0DB0FF\">url:&nbsp;&nbsp;".$urls_filter[$i]."</font>", $ret);
}
}
}
}
$ret = preg_replace("#(^|[\n ])([\w]+?://[\w\#$%&~/.\-;:=,?@\[\]+]*)#is", "\\1<a href=\"url.php?\\2\" target=\"_blank\">\\2</a>", $ret);

$ret = preg_replace("#(^|[\n ])((www|ftp)\.[\w\#$%&~/.\-;:=,?@\[\]+]*)#is", "\\1<a href=\"url.php?http://\\2\" target=\"_blank\">\\2</a>", $ret);


}
$ret = substr($ret, 1);

return($ret);
}
echo $retold."<br><hr><br>";
echo make_clickable($ret);
?>



Any help will be useful.
Thanks

binhaus
07-30-2007, 09:46 AM
Hi all. I have it working so far, but the solution is not really good: I just add a " " before and after each URL so that I can tell apart links that share a domain but have different URLs.
The problem is that it takes two regex passes, one preg_match_all call and one preg_replace, to handle the " " spaces.
===========
I am thinking about another solution, but when I test it, it does not return what I need.
Full code for testing:

$ret = ' http://www.mydomain.com/modules.php?name= <br> www.192.168.1.1/aaamodules.php?name=
<br> www.google.com/modules.php?name= <br> lkjl <br> ftp://abc.com:24 <br> ftp.webdeveloper.com/fopost783035
<br> ;jljh kjk lkj kj<br> www.au2.php.net/manual/en/fse-url.php <br>
http://au2.500mb.net/manual/en/fse-url.php <br>
ftp.youa.com/virus.exe sfa<br> dgda.com/virus.exe ftp://500mb.com/ydflhl/dfd/ http://youtube.com
www.youtube.com
http://youtube.com/watch?v=loiuon
http://youtube.com/watch?v=asds
www.google.com
ftp://google.com
ftp://www.google.com
llkjjk';

$links_except = array('xaluan.com','google.com','youtube.com','mtvvui.com','192.168.1.1'); // links that stay viewable and clickable without logging in

//need help here
$ret = preg_replace("#(^|[\n ])(([\w]+?://|(www|ftp)\.)[\w\#$%&~/.\-;:=,?@\[\]+]*)#is", "'\\1' . excepviewtlinks('\\2') . ''", $ret);

function excepviewtlinks($url){
global $links_except;
echo $url." -> testing result";
if (($link_count = count($links_except))>0 && $url !=''){
for($ia=0;$ia<$link_count;$ia++) {
if (stristr($url,$links_except[$ia]) )
{
$url = str_replace($url, "<a href=\"url.php?".$url."\" target=\"_blank\">".$url."</a>",$url);
echo $url." after testing result";
}
}
}
return $url;
} /////

echo $ret;



The result is that the original URL is returned and the function is not called as I need.

Any help would be much appreciated.
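
The replacement argument of preg_replace is inserted as a plain string, so the text excepviewtlinks('\\2') is never executed as PHP. preg_replace_callback is the usual way to run a function on every match. A minimal sketch, assuming PHP 5.3 or later for the closure; the names hide_untrusted_links and $notice are invented for illustration, and the whitelist test keeps the simple substring check used above:

```php
<?php
// Hypothetical rewrite of the idea: links whose text contains a whitelisted
// string become clickable, every other link is replaced by a notice.
function hide_untrusted_links($text, array $links_except, $notice)
{
    // Same link pattern as in the post, written in single quotes so PHP
    // passes the backslashes through to the regex engine unchanged.
    $pattern = '#(^|[\n ])(([\w]+?://|(www|ftp)\.)[\w\#$%&~/.\-;:=,?@\[\]+]*)#i';

    return preg_replace_callback($pattern, function ($m) use ($links_except, $notice) {
        $lead = $m[1]; // the space or newline before the link (may be empty)
        $url  = $m[2]; // the link itself
        foreach ($links_except as $needle) {
            if (stripos($url, $needle) !== false) {
                // "www." and "ftp." links have no scheme, so add one for the href.
                $href = strpos($url, '://') === false ? 'http://' . $url : $url;
                return $lead . '<a href="url.php?' . $href . '" target="_blank">' . $url . '</a>';
            }
        }
        return $lead . $notice;
    }, $text);
}

$links_except = array('google.com', 'youtube.com');
$text = ' www.google.com and http://www.mtvvui.com';
echo hide_untrusted_links($text, $links_except, '[please log in to see this link]');
```

Because each link is handled once inside the callback, a single regex pass is enough and no extra padding spaces are needed. The substring check is loose (it would also accept a host like notgoogle.com), so a stricter host comparison may be preferable.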
================================

The other problem I have hit is with a JavaScript regex:
it does not match the value I need it to return.


<script type="text/javascript">
function cleanImagesHack(){
// text = document.selection.createRange().text

var dotext_regx = /\[URL=http:\/\/img\w+.imageshack.us\/my.php\?image=\w+.\w+\]\[IMG\](\w+)\[\/IMG\]\[\/URL\]/gi;
var oldtext = document.forms['post'].message.value;
var newtext = oldtext.replace(dotext_regx, "$1");
document.forms['post'].message.value = newtext;
alert(newtext);