question on content type
pawky
09-08-2004, 04:11 AM
ok, for content type, what is the differences between them. I know u can have like UTF-8 and ANSI or whatever and u use the meta tag:
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
to define it or something. But what's the difference? thx
spufi
09-08-2004, 07:59 PM
A quick guide to UTF-8 (http://annevankesteren.nl/archives/2004/06/utf-8)
That should hopefully answer your question.
ray326
09-09-2004, 01:23 AM
http://www.freesoft.org/CIE/RFC/1521/index.htm
pawky
09-09-2004, 01:35 AM
omg im so confused :P too much and i dont understand it lol
so am i supposed to use the meta tag content-type for it? or what? im so lost sry ;P
but you actually want to avoid the HTTP-EQUIV attribute always and everywhere:
also, i saw like 20 different character encodings there and as far as i can tell its always safe to go w/ utf-8 but not sure. lol, ANY help would be greatly appreciated, thx
EDIT: ok, just read the site posted by spufi, ill read the one by ray and c how it goes
pawky
09-09-2004, 01:57 AM
ok, that is way over my head i think ;P i dont understand what 90% of the sentences are even trying to say lol. um... is there like an explanation for dummies explaining it? lol
Have a look at these links and his (Anne can be a male name in Dutch) site discussions.
http://annevankesteren.nl/archives/2004/06/utf-8
http://annevankesteren.nl/archives/2004/08/mime-types
Basics:
content: file type
charset: file character encoding
If you don't use the http-equiv the browser will guess, not always correctly.
You normally set these values using a meta tag, but they can also be set using a server-side language or in the .htaccess file. All of these can be overridden by your provider with a server directive.
If you have the "web developer's bar" in Mozilla, you can check how your page is being served up in "Information > View Response Headers"
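To make the "content" and "charset" parts concrete, here is a minimal Python sketch that splits a Content-Type value into its two pieces. It uses the standard library's email.message.Message as a convenient header parser; the helper name split_content_type is made up for this illustration, and the header string is just the one from the meta tag at the top of the thread.

```python
from email.message import Message


def split_content_type(header_value):
    """Return (media_type, charset) parsed from a Content-Type header value.

    Hypothetical helper for illustration; charset is None when not declared.
    """
    msg = Message()
    msg["Content-Type"] = header_value
    return msg.get_content_type(), msg.get_content_charset()


media_type, charset = split_content_type("text/html; charset=UTF-8")
print(media_type)  # text/html
print(charset)     # utf-8 (the parser lowercases the name)

# With no charset parameter there is nothing to report,
# which is the case where a browser has to guess.
no_charset = split_content_type("text/html")
print(no_charset)  # ('text/html', None)
```

The same two pieces appear whether the value comes from a meta tag or from the real HTTP response header sent by the server.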
toicontien
09-09-2004, 02:28 PM
I think the content-types, or character set declarations are like this:
1) UTF-16: An encoding of Unicode that stores each character in one or two 16-bit units. Unicode itself is the character set, with room for over a million characters, which is enough for practically any language on earth. Windows programs often just call UTF-16 "Unicode".
2) UTF-8: Another encoding of the same Unicode character set, not a subset of UTF-16. Each character takes one to four 8-bit bytes: plain English characters take one byte, accented European letters usually two, and Chinese or Japanese characters usually three.
3) ANSI: I think this is what Windows calls its 8-bit code pages, such as Windows-1252, which holds up to 256 characters and covers English and Western European languages. The name comes from the American National Standards Institute, but it isn't really one of their standards.
4) ASCII: American Standard Code for Information Interchange, a 7-bit character set with a total of 128 characters in it, and that's good for the English language. Both Windows-1252 and UTF-8 start with these same 128 characters.
The default character set for XML files (including XHTML files) is UTF-8, according to the W3C. UTF-8 also handles languages like Japanese or Chinese; saving as Unicode (UTF-16) just stores most of those characters in two bytes each instead of three.
Basically, the character sets like UTF-8 and 16 were developed to make the Web more International and multi-cultural, to throw some buzz words at you.
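If you want to check how many bytes each encoding really uses per character, a small Python sketch like the one below will show it. The four sample characters are my own picks (an English letter, an accented letter, a Japanese character and an emoji), and "utf-16-le" is used so that the byte-order mark is not counted.

```python
# Sample characters, written as escapes so the source file's own encoding
# cannot affect the result: A, e-acute, a kanji, and an emoji.
samples = ["A", "\u00e9", "\u65e5", "\U0001F600"]


def byte_lengths(ch):
    """Return (bytes in UTF-8, bytes in UTF-16 without a byte-order mark)."""
    return len(ch.encode("utf-8")), len(ch.encode("utf-16-le"))


sizes = {ch: byte_lengths(ch) for ch in samples}

for ch, (u8, u16) in sizes.items():
    print(f"U+{ord(ch):04X}: UTF-8 = {u8} bytes, UTF-16 = {u16} bytes")

# ASCII only knows the first 128 characters, so anything else fails.
try:
    "\u00e9".encode("ascii")
    ascii_can_encode = True
except UnicodeEncodeError:
    ascii_can_encode = False
print("ASCII can encode e-acute:", ascii_can_encode)
```

Both UTF-8 and UTF-16 manage all four characters; they only differ in how many bytes each one costs.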
pawky
09-09-2004, 11:54 PM
Originally posted by toicontien
I think the content-types, or character set declarations are like this:
...
Awesome! I understood that :D so basically use UTF-16 or utf-8? and use utf-16 only when u need all the other character types? will using utf-16 unnecessarily cause the dl time to take a little longer (even if not entirely noticeable)?
and last question i hope. the best way to declares this is what? meta tag?
thank you so much for the explanation. Greatly appreciated
toicontien
09-10-2004, 03:47 PM
Originally posted by pawky
Awesome! I understood that :D so basically use UTF-16 or utf-8? and use utf-16 only when u need all the other character types?
I'm not sure if character types is the right word for it. You don't need UTF-16 just because a language has more than 256 characters; UTF-8 can encode every Unicode character too. For web pages UTF-8 is usually the one to use.
Originally posted by pawky
will using utf-16 unnecessarily cause the dl time to take a little longer (even if not entirely noticeable)?
Yes, a little perhaps on slower connections. For the charset declaration to be right, the file has to actually be saved in the encoding you declare, whether that's UTF-8, UTF-16 (which Windows editors often list as "Unicode"), or something else. If you save an HTML file as UTF-16, almost every character takes 16 bits to store, or roughly twice as much space as an English-only file saved as ANSI or UTF-8. A good estimate for mostly-English text is, take the file size of what it would be by default, then multiply that by 2.
On Windows in English-speaking countries the default is ANSI (Windows-1252), which is 8 bits per character. A 10KB ANSI text file would be about 20KB in size saved as a Unicode (UTF-16) text file.
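The size estimate is easy to check with a rough Python sketch. The sample markup is made up, and "cp1252" (Windows-1252) is my assumption for what "ANSI" means on an English-language Windows machine.

```python
# Roughly 10KB of English-only HTML (21 characters repeated 500 times).
text = "<p>Hello, world!</p>\n" * 500

ansi_size = len(text.encode("cp1252"))      # assumed stand-in for "ANSI"
utf8_size = len(text.encode("utf-8"))
utf16_size = len(text.encode("utf-16-le"))  # UTF-16 without a byte-order mark

print("ANSI  :", ansi_size, "bytes")
print("UTF-8 :", utf8_size, "bytes")
print("UTF-16:", utf16_size, "bytes")
print("UTF-16 / ANSI ratio:", utf16_size / ansi_size)
```

For English-only text like this, the UTF-8 file comes out the same size as the ANSI one, and only UTF-16 doubles it.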
pawky
09-11-2004, 01:43 AM
Originally posted by toicontien
I'm not sure if character types is the right word for it. You don't need UTF-16 just because a language has more than 256 characters; UTF-8 can encode every Unicode character too. For web pages UTF-8 is usually the one to use.
ok, yea that's what i meant, not character types :P u understood though ;P
Yes, a little perhaps on slower connections. For the charset declaration to be right, the file has to actually be saved in the encoding you declare, whether that's UTF-8, UTF-16 (which Windows editors often list as "Unicode"), or something else. If you save an HTML file as UTF-16, almost every character takes 16 bits to store, or roughly twice as much space as an English-only file saved as ANSI or UTF-8. A good estimate for mostly-English text is, take the file size of what it would be by default, then multiply that by 2.
On Windows in English-speaking countries the default is ANSI (Windows-1252), which is 8 bits per character. A 10KB ANSI text file would be about 20KB in size saved as a Unicode (UTF-16) text file.
ok, thank you :) the clouds are clearing :) last question (maybe), what is the difference in using utf-8 over ansi? thx for all your help
toicontien
09-11-2004, 03:17 PM
The "ANSI" encoding on Windows is an 8-bit code page (Windows-1252 in the U.S. and Western Europe), so it is limited to the 256 or so characters of that one code page, mostly English and Western European letters. Despite the name, it isn't really an American National Standards Institute standard. UTF-8 is an encoding of Unicode, an international standard, so it can represent practically any language. For plain ASCII text the two produce identical bytes; they only differ once you use accented letters or other characters beyond ASCII.
In short:
ANSI = one regional code page
UTF-8 = International
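Here is a small Python sketch of where the two actually differ. Again "cp1252" is my assumption for "ANSI" on a Western Windows system, and the sample word is made up.

```python
word = "caf\u00e9"  # "cafe" with an accented e

ansi_bytes = word.encode("cp1252")  # one byte for the accented letter
utf8_bytes = word.encode("utf-8")   # two bytes for the accented letter
print(ansi_bytes)  # b'caf\xe9'
print(utf8_bytes)  # b'caf\xc3\xa9'

# Declaring the wrong charset: UTF-8 bytes read as if they were ANSI.
misread = utf8_bytes.decode("cp1252")
print(ascii(misread))  # 'caf\xc3\xa9', which displays as two garbage characters

# A Japanese character does not exist in the ANSI code page at all.
try:
    "\u65e5".encode("cp1252")
    fits_in_ansi = True
except UnicodeEncodeError:
    fits_in_ansi = False
print("Kanji fits in ANSI:", fits_in_ansi)
```

This is why the declared charset has to match how the file was really saved: the bytes are the same either way, only the interpretation changes.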
pawky
09-11-2004, 05:14 PM
gotcha, thank you so much for all your help.
also, thx to those that tried to help in the earlier posts :) it was just above my head ;P thx all
pawky
09-11-2004, 06:54 PM
just thought of another question, is there a way to state the char set in the css file? thx :)
ok, i added the meta tag in one page and tried validating it at: http://validator.w3.org/. when using ansi it said it did not recognize it but it would work w/ utf-8. :(
Fang
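Not for the HTML page. A stylesheet can only declare its own encoding, with an @charset rule at the very start of the css file.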
ANSI (http://www.ansi.org/) is an institute.
Maybe you are thinking of "us-ascii".
pawky
09-12-2004, 11:37 AM
Originally posted by Fang
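Not for the HTML page. A stylesheet can only declare its own encoding, with an @charset rule at the very start of the css file.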
ANSI (http://www.ansi.org/) is an institute.
Maybe you are thinking of "us-ascii".
ok, if us-ascii is the american equivalent of the utf-8 or whatever like mentioned above then sure ;P thx :)