Accented ASCII characters in HTML


heavenly_blue
08-07-2007, 01:32 PM
I was under the impression that, in HTML, there is no need to encode normal accented characters that are in the standard ASCII set, namely:

Á
É
Í
Ó
Ú
á
é
í
ó
ú
Ñ
ñ
¡
¿


Logically I would think this would extend to all characters with character codes between 33 and 255 excluding whitespace characters, ampersand, quotation mark, less than symbol and greater than symbol.

I understand that it does make sense to use entity references for characters like ©, written as &copy;, since it's fast, descriptive, and characters like that aren't on any keyboards.

I just don't think it makes sense at all to have &eacute; replace each and every é in a source file.

International developers enter these characters directly from their keyboards and never need to encode them, right?

Does anyone have a solid reference or source confirming or denying this? I'd like to be able to prove this.


Also... what are the differences between using these characters in UTF-8, ISO-8859-1, and quirks mode? Will they work in all three?

mactheweb
08-07-2007, 11:56 PM
Whether or not your page displays accented characters without encoding them is a function of your stated character set. For the best results you would want to save the file itself as UTF-8 and use the following (along with a doctype declaration):
<meta http-equiv="content-type" content="text/html; charset=utf-8" />

If you don't specifically state the character set, many browsers will default to Western Latin, charset=iso-8859-1, which covers the Western European accented characters listed above but does not include em dashes, ellipses or most other typographic niceties.
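
If you want to see what the character set declaration actually changes, here is a minimal Python sketch (Python is only my choice for illustration; the variable names and the `fits_latin1` helper are made up for this example). It assumes nothing beyond the standard library, and it shows three things: how the same accented characters are stored as different bytes in ISO-8859-1 and UTF-8, what happens when the declared charset does not match the saved bytes, and that only the markup-significant characters need escaping.

```python
import html

# The characters from the original question
accented = "ÁÉÍÓÚáéíóúÑñ¡¿"

# Each one fits in ISO-8859-1 as a single byte,
# while UTF-8 stores each of them as two bytes.
latin1_bytes = accented.encode("iso-8859-1")
utf8_bytes = accented.encode("utf-8")
print(len(accented), len(latin1_bytes), len(utf8_bytes))

# The same character is a different byte sequence in each encoding
print("é".encode("iso-8859-1"))
print("é".encode("utf-8"))

# If a file is saved as UTF-8 but read as ISO-8859-1,
# each accented character turns into two garbage characters.
mojibake = "é".encode("utf-8").decode("iso-8859-1")
print(mojibake)


def fits_latin1(text):
    """Hypothetical helper: True if text can be stored in ISO-8859-1."""
    try:
        text.encode("iso-8859-1")
        return True
    except UnicodeEncodeError:
        return False


# Em dash (U+2014) and ellipsis (U+2026) are not in ISO-8859-1
print(fits_latin1("\u2014"), fits_latin1("\u2026"))

# Only markup-significant characters need escaping; accents pass through
escaped = html.escape("¿Qué & <b>?")
print(escaped)
```

The practical upshot is that the bytes in the saved file and the declared charset have to agree. When they do, the accented characters can be typed directly in either encoding.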

For a more thorough discussion of the subject see:
http://www.mezzoblue.com/archives/2003/07/29/html_and_for/
and
http://www.alistapart.com/stories/emen/