www.webdeveloper.com

Thread: fopen can't read arabic characters

  1. #1
    Join Date
    Jul 2013
    Posts
    18

    fopen can't read arabic characters

    This function reads Word (.doc) files, but it turns Arabic characters into English characters.

    I want it to read the Arabic characters correctly, or at least strip them out.

    PHP Code:
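    function word($filename) {
        if (($fh = fopen($filename, 'r')) !== false) {
            // Read the first 0xA00 bytes, which this function treats as the file header.
            $headers = fread($fh, 0xA00);

            // Bytes 0x21C-0x21F are combined into the text length, least significant byte first.
            // The constants subtracted in $n1 and $n2 were lost from the post; 1 and 8 are
            // assumed here from the commonly circulated version of this snippet.
            $n1 = (ord($headers[0x21C]) - 1);
            $n2 = ((ord($headers[0x21D]) - 8) * 256);
            $n3 = ((ord($headers[0x21E]) * 256) * 256);
            $n4 = (((ord($headers[0x21F]) * 256) * 256) * 256);
            $textLength = ($n1 + $n2 + $n3 + $n4);

            if ($extracted_plaintext = @fread($fh, $textLength)) {
                // Text was read; continue to the output below.
            } else {
                return docx2text($filename); // Save this contents to file
            }

            // Convert carriage returns to newlines before printing.
            $text = str_replace(chr(13), "\n", $extracted_plaintext);

            echo $text;
        }
    }

    word('filename.doc');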

  2. #2
    Join Date
    Aug 2004
    Location
    Ankh-Morpork
    Posts
    19,337
    Any difference if you read it as binary?
    PHP Code:
    fopen($filename, 'rb');
    Just a stab in the dark, not sure if it should/would make any difference in this case.
    "Please give us a simple answer, so that we don't have to think, because if we think, we might find answers that don't fit the way we want the world to be."
    ~ Terry Pratchett in Nation

    eBookworm.us

  3. #3
    Join Date
    Jul 2013
    Posts
    18
    No, it didn't make any difference.

    An Arabic word is converted into an English word with the same number of characters.
    For example, the word 'علي' in the file is read as '9Dj'.

    Any suggestions?
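
One possible explanation, not confirmed anywhere in the thread: the .doc file may store this text as UTF-16LE, two bytes per character. In that encoding ع is U+0639 and ل is U+0644, whose low bytes 0x39 and 0x44 are the ASCII codes for '9' and 'D', while the high byte 0x06 is a non-printing control character. ي is U+064A, whose low byte 0x4A is an uppercase 'J', so the match with the reported '9Dj' is close but not exact. The sketch below only illustrates that hypothesis; it assumes PHP 7+ with the mbstring extension enabled.

```php
<?php
// Assumption (not confirmed in the thread): the .doc stores this text as UTF-16LE.
$word = "\u{0639}\u{0644}\u{064A}";   // the letters of 'علي' as UTF-8 (PHP 7+ escape syntax)
$utf16 = mb_convert_encoding($word, 'UTF-16LE', 'UTF-8');

// Each letter becomes two bytes: the low byte first, then 0x06.
echo bin2hex($utf16), "\n";           // prints 390644064a06

// 0x06 is a non-printing control character, so strip control bytes to
// approximate what a browser would visibly render.
$visible = preg_replace('/[\x00-\x1F]/', '', $utf16);
echo $visible, "\n";                  // prints 9DJ
```

If the bytes returned by fread() in word() look like the hex string above, the text is probably not being converted at all; it is UTF-16LE being displayed as if it were a single-byte encoding.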

  4. #4
    Join Date
    Jul 2013
    Posts
    18
    I noticed that each Arabic character is converted into a specific English one. Maybe this helps diagnose the problem.

  5. #5
    Join Date
    Mar 2007
    Location
    localhost
    Posts
    2,348
    Perhaps Ali needs to be represented as an escaped character or code.
    Yes, I know I'm about as subtle as being hit by a bus..(\\.\ Aug08)
    Yep... I say it like I see it, even if it is like a baseball bat in the nutz... (\\.\ Aug08)
    I want to leave this world the same way I came into it, Screaming, Incontinent & No memory!
    I laughed that hard I burst my colostomy bag... (\\.\ May03)
    Life for some is like a car accident... Mine is like a motorway pile up...

    Problems with Vista? :: Getting Cryptic wid it. :: The 'C' word! :: Whois?

  6. #6
    Join Date
    Aug 2004
    Location
    Ankh-Morpork
    Posts
    19,337
    What does the docx2text() function do? If by any chance it implements PHPDOCX, perhaps you need to make use of the setEncodeUTF8() method?

  7. #7
    Join Date
    Jul 2013
    Posts
    18
    docx2text() only handles the other format (.docx rather than .doc), and the problem isn't there. With a .docx file the text is read correctly, but with a .doc file the error described above occurs.

    I'm not using PHPDOCX, and it can't be used in this project.

    Is there any function similar to setEncodeUTF8()?

  8. #8
    Join Date
    Jul 2013
    Posts
    18
    I found that each Arabic character is converted to a corresponding ASCII character.

    How can I prevent this, or how can I convert it back to the original Arabic character?
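
A minimal sketch of one way to recover the Arabic text, assuming the bytes read from the .doc are UTF-16LE (an assumption, not something established in the thread). The helper name utf16le_to_utf8() is hypothetical, and the code needs the mbstring extension; iconv('UTF-16LE', 'UTF-8', $raw) would be an alternative.

```php
<?php
// Hypothetical helper, not from the thread: converts raw bytes that are
// assumed to be UTF-16LE into a UTF-8 string.
function utf16le_to_utf8(string $raw): string
{
    // Drop a trailing odd byte so the input is a whole number of 16-bit units.
    if (strlen($raw) % 2 !== 0) {
        $raw = substr($raw, 0, -1);
    }
    return mb_convert_encoding($raw, 'UTF-8', 'UTF-16LE');
}

// Sample bytes: the low byte of each letter followed by 0x06.
$raw = "\x39\x06\x44\x06\x4A\x06";
echo utf16le_to_utf8($raw), "\n"; // prints علي
```

In word(), this would mean converting $extracted_plaintext before the str_replace() call, since replacing single bytes inside two-byte characters could corrupt them. The page also has to be served as UTF-8 for the result to display correctly.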

  9. #9
    Join Date
    Aug 2004
    Location
    Ankh-Morpork
    Posts
    19,337
    Where is this conversion happening? It's not clear to me exactly when the transition occurs. Maybe htmlentities() could handle it, or maybe the mbstring functions, but so far I'm not sure where the problem is.
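
One way to find out where the change happens is to look at the raw bytes right after fread(), before any output or replacement step touches them. The sketch below is only a diagnostic aid; hex_preview() is a hypothetical helper name, not part of PHP or of the code posted above.

```php
<?php
// Hypothetical diagnostic helper: shows the first $limit bytes of a string
// as space-separated hex pairs, so the stored encoding can be inspected.
function hex_preview(string $bytes, int $limit = 32): string
{
    $hex = bin2hex(substr($bytes, 0, $limit));
    return implode(' ', str_split($hex, 2));
}

// Sample input: four bytes in the "low byte, then 0x06" pattern.
echo hex_preview("\x39\x06\x44\x06"), "\n"; // prints 39 06 44 06
```

Calling hex_preview($extracted_plaintext) inside word() would show whether the file already contains two bytes per character. A repeating 06 in every second position would point to UTF-16LE Arabic text rather than a conversion performed by fopen() or fread().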

  10. #10
    Join Date
    Mar 2007
    Location
    localhost
    Posts
    2,348
