www.webdeveloper.com
Thread: Trying to read MS word content

  1. #1
    Join Date
    Dec 2011
    Posts
    28

    Trying to read MS word content

    Hi all,

    I'm trying to read the contents of an MS Word document using fread or file_get_contents.
    Both work fine.

    My problem is explained in the attached file:
    basma fahiem.doc

    I want to ignore non-English characters because they are always converted into strange characters.




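    This is my code:

    PHP Code:
    function parseWord($userDoc)
    {
        // Read the whole file and convert it to UTF-8.
        $fileHandle = fopen($userDoc, "r");
        $line = mb_convert_encoding(@fread($fileHandle, filesize($userDoc)), "UTF-8");
        fclose($fileHandle);

        // Split on carriage returns (0x0D).
        $lines = explode(chr(0x0D), $line);
        $outtext = "";
        foreach ($lines as $thisline) {
            // Skip empty chunks and chunks that contain a NUL byte.
            $pos = strpos($thisline, chr(0x00));
            if ($pos === false && strlen($thisline) > 0) {
                $outtext .= $thisline . " ";
            }
        }

        // Strip everything except letters, digits, whitespace and some punctuation.
        $outtext = preg_replace("/[^a-zA-Z0-9\s\,\.\-\n\r\t@\/\_\(\)]/", "", $outtext);
        return $outtext;
    }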

    Can someone help?

  2. #2
    Join Date
    Dec 2011
    Posts
    28
    Any help?
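
One possible way to drop the "strange chars" is sketched below, assuming the goal is simply to discard every byte outside printable ASCII once the text has been read. The helper name keepAsciiText is hypothetical and does not come from the original post. The sketch does not parse the .doc format itself; it only filters a string that has already been extracted.

```php
<?php
// Hypothetical helper (not from the original post): keep only printable
// ASCII plus whitespace, then tidy up the gaps that removal leaves behind.
function keepAsciiText(string $raw): string
{
    // Work on raw bytes (no /u modifier) so invalid UTF-8 cannot make
    // preg_replace fail. Anything that is not printable ASCII, a tab,
    // a carriage return or a line feed is removed.
    $ascii = preg_replace('/[^\x20-\x7E\t\r\n]/', '', $raw);

    // Collapse runs of whitespace into a single space and trim the ends.
    return trim(preg_replace('/\s+/', ' ', $ascii));
}
```

It could be applied to the string built by parseWord, for example keepAsciiText($outtext), in place of the final preg_replace call. Matching on bytes rather than on UTF-8 code points is a deliberate choice here: multi-byte characters are removed whole because each of their bytes lies outside the ASCII range.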
