Thread: Format WORD and WYSIWYG

  1. #1
    Join Date
    Sep 2005

    Format WORD and WYSIWYG

    As we know, content pasted from Word into a WYSIWYG editor (in a CMS, for example) brings Word's hidden formatting code along with it. Is it possible to avoid this, or is the only option to convert the content to plain TXT first?

  2. #2
    Join Date
    Jul 2009
    I believe that in some cases even converting to TXT doesn't work.

    In the future, if you can help it, avoid using anything from a Word .doc file for your websites. I'm stuck with a logo and content that a client made in Word, and I will probably end up re-making or re-typing the contents of these files. A major PITA.

  3. #3
    Join Date
    Mar 2009
    Word just isn't meant for making websites. Avoid it like the plague.

  4. #4
    Join Date
    Dec 2005
    Lots of RTEs (embedded in CMSs or otherwise) are either built to automatically remove proprietary Word formatting while keeping the text formatting (e.g. underlined text stays underlined), or they have a special button, as in CKEditor (formerly FCKeditor), that opens a paste textarea where you paste your Word content and it gets cleaned and converted.
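
    For anyone wondering what that kind of cleanup amounts to, here is a rough sketch in PHP. It is not how CKEditor or any other editor actually does it; the function name cleanWordHtml and the tag whitelist are my own assumptions, and regex-based cleanup like this will miss edge cases (an attribute value containing ">", for instance).

```php
<?php
// Hypothetical helper: a rough sketch of "paste from Word" cleanup,
// not the implementation used by any particular editor.
function cleanWordHtml(string $html): string
{
    // Remove HTML comments, including Word's conditional comments
    // such as <!--[if gte mso 9]> ... <![endif]-->.
    $html = preg_replace('/<!--.*?-->/s', '', $html);

    // Remove Office-namespaced tags such as <o:p> and </o:p>.
    $html = preg_replace('/<\/?[a-z]+:[^>]*>/i', '', $html);

    // Inside each remaining opening tag, drop class, style and lang attributes,
    // which is where the Mso* classes and mso-* styles live.
    $html = preg_replace_callback('/<[a-z][^>]*>/i', function (array $m): string {
        return preg_replace(
            '/\s+(?:class|style|lang)\s*=\s*(?:"[^"]*"|\'[^\']*\'|[^\s>]+)/i',
            '',
            $m[0]
        );
    }, $html);

    // Keep only a small whitelist of basic formatting tags (assumed list).
    return strip_tags($html, '<p><br><b><strong><i><em><u><ul><ol><li><a>');
}

$pasted = '<p class="MsoNormal" style="mso-margin-top-alt:auto"><u>Hello</u> '
        . '<span lang="EN-US">world</span><o:p></o:p></p>';
echo cleanWordHtml($pasted), "\n";
```

    The underline survives and the Word-specific attributes and tags are dropped, which matches the "keep the text formatting" behaviour described above. For real user input a proper HTML sanitizer is a safer choice than regular expressions.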

    What WYSIWYG or CMS are you using?

  5. #5
    Join Date
    Nov 2008
    Most WYSIWYG editors have a paste-from-Word function that processes the input and converts or removes the characters from Word's character set. You can also do it in PHP if you build a map of Word -> UTF-8 (or whatever) code points:

    PHP Code:
    // Map of Windows-1252 code points (as they show up in UTF-8 text) to the intended UTF-8 characters
    $_cp1252Map = array(
        "\xc2\x80" => "\xe2\x82\xac", /* EURO SIGN */
        "\xc2\x82" => "\xe2\x80\x9a", /* SINGLE LOW-9 QUOTATION MARK */
        "\xc2\x83" => "\xc6\x92",     /* LATIN SMALL LETTER F WITH HOOK */
        "\xc2\x84" => "\xe2\x80\x9e", /* DOUBLE LOW-9 QUOTATION MARK */
        "\xc2\x85" => "\xe2\x80\xa6", /* HORIZONTAL ELLIPSIS */
        "\xc2\x86" => "\xe2\x80\xa0", /* DAGGER */
        "\xc2\x87" => "\xe2\x80\xa1", /* DOUBLE DAGGER */
        "\xc2\x88" => "\xcb\x86",     /* MODIFIER LETTER CIRCUMFLEX ACCENT */
        "\xc2\x89" => "\xe2\x80\xb0", /* PER MILLE SIGN */
        "\xc2\x8a" => "\xc5\xa0",     /* LATIN CAPITAL LETTER S WITH CARON */
        "\xc2\x8b" => "\xe2\x80\xb9", /* SINGLE LEFT-POINTING ANGLE QUOTATION */
        "\xc2\x8c" => "\xc5\x92",     /* LATIN CAPITAL LIGATURE OE */
        "\xc2\x8e" => "\xc5\xbd",     /* LATIN CAPITAL LETTER Z WITH CARON */
        "\xc2\x91" => "\xe2\x80\x98", /* LEFT SINGLE QUOTATION MARK */
        "\xc2\x92" => "\xe2\x80\x99", /* RIGHT SINGLE QUOTATION MARK */
        "\xc2\x93" => "\xe2\x80\x9c", /* LEFT DOUBLE QUOTATION MARK */
        "\xc2\x94" => "\xe2\x80\x9d", /* RIGHT DOUBLE QUOTATION MARK */
        "\xc2\x95" => "\xe2\x80\xa2", /* BULLET */
        "\xc2\x96" => "\xe2\x80\x93", /* EN DASH */
        "\xc2\x97" => "\xe2\x80\x94", /* EM DASH */
        "\xc2\x98" => "\xcb\x9c",     /* SMALL TILDE */
        "\xc2\x99" => "\xe2\x84\xa2", /* TRADE MARK SIGN */
        "\xc2\x9a" => "\xc5\xa1",     /* LATIN SMALL LETTER S WITH CARON */
        "\xc2\x9b" => "\xe2\x80\xba", /* SINGLE RIGHT-POINTING ANGLE QUOTATION */
        "\xc2\x9c" => "\xc5\x93",     /* LATIN SMALL LIGATURE OE */
        "\xc2\x9e" => "\xc5\xbe",     /* LATIN SMALL LETTER Z WITH CARON */
        "\xc2\x9f" => "\xc5\xb8"      /* LATIN CAPITAL LETTER Y WITH DIAERESIS */
    );

    I also tend to map some UTF-8 code points to HTML entities:

    PHP Code:
    $_entMap = array(
        "\xe2\x80\x98" => '&lsquo;',
        "\xe2\x80\x99" => '&rsquo;',
        "\xe2\x80\x9c" => '&ldquo;',
        "\xe2\x80\x9d" => '&rdquo;',
        "\xe2\x82\xac" => '&euro;',
        "\xe2\x80\xa6" => '&hellip;'
    );

    To use:

    PHP Code:
    $string = str_replace(array_keys($_cp1252Map), $_cp1252Map, $string);
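
    If you want both steps in one place, something like the following might do as a minimal sketch. The helper name fixWordChars and its parameters are made up for this example, and the two arrays are shortened copies of the maps above. It uses strtr() instead of str_replace() so that each pass walks the string once and never re-replaces text it has just inserted; with these particular maps either function should give the same result.

```php
<?php
// Shortened copies of the two maps from this post, just for the example.
$_cp1252Map = array(
    "\xc2\x85" => "\xe2\x80\xa6", /* HORIZONTAL ELLIPSIS */
    "\xc2\x91" => "\xe2\x80\x98", /* LEFT SINGLE QUOTATION MARK */
    "\xc2\x92" => "\xe2\x80\x99", /* RIGHT SINGLE QUOTATION MARK */
    "\xc2\x93" => "\xe2\x80\x9c", /* LEFT DOUBLE QUOTATION MARK */
    "\xc2\x94" => "\xe2\x80\x9d"  /* RIGHT DOUBLE QUOTATION MARK */
);
$_entMap = array(
    "\xe2\x80\x98" => '&lsquo;',
    "\xe2\x80\x99" => '&rsquo;',
    "\xe2\x80\x9c" => '&ldquo;',
    "\xe2\x80\x9d" => '&rdquo;',
    "\xe2\x80\xa6" => '&hellip;'
);

// Hypothetical helper name; not part of PHP or any library.
function fixWordChars(string $string, array $cp1252Map, array $entMap, bool $toEntities = false): string
{
    // Pass 1: stray C1 code points -> the punctuation Word meant.
    $string = strtr($string, $cp1252Map);

    // Pass 2 (optional): that punctuation -> named HTML entities.
    return $toEntities ? strtr($string, $entMap) : $string;
}

$dirty = "\xc2\x93Hello\xc2\x94\xc2\x85";
echo fixWordChars($dirty, $_cp1252Map, $_entMap, true), "\n"; // prints &ldquo;Hello&rdquo;&hellip;
```

    Note that the keys in the first map are two-byte UTF-8 sequences, so they only match text that has already ended up as UTF-8 with the Word characters sitting in the 0x80-0x9F range. If your input is raw Windows-1252 bytes instead, converting it first, for example with iconv('Windows-1252', 'UTF-8', $string), is probably the simpler route.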

  6. #6
    Join Date
    Aug 2010
