Hi all,

I'm trying to read the contents of an MS Word file using fread or file_get_contents.
Reading the file works fine with both.

My problem is shown in the attached file:
basma fahiem.doc


I want to ignore non-English characters because they are always converted into strange characters.




This is my code:

PHP Code:
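function parseWord($userDoc)
{
    // Read the whole file as raw bytes and convert it to UTF-8.
    $fileHandle = fopen($userDoc, "r");
    $line = mb_convert_encoding(@fread($fileHandle, filesize($userDoc)), "UTF-8");

    // Split the content on carriage returns (0x0D).
    $lines = explode(chr(0x0D), $line);
    $outtext = "";
    foreach ($lines as $thisline) {
        // Skip empty chunks and chunks that contain a NUL byte (0x00).
        $pos = strpos($thisline, chr(0x00));
        if (($pos !== FALSE) || (strlen($thisline) == 0)) {
        } else {
            $outtext .= $thisline . " ";
        }
    }

    // Keep only ASCII letters, digits, whitespace and a few punctuation marks.
    $outtext = preg_replace("/[^a-zA-Z0-9\s\,\.\-\n\r\t@\/\_\(\)]/", "", $outtext);
    return $outtext;
}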

Can someone help?