If you are interested in the matter, here is some additional information about handling special characters, ASCII codes, foreign alphabets, Unicode and Regular Expressions.
You said something about Asian characters and double-byte characters. In fact there are several ways of encoding them (an encoding and a character set are different animals): 1, 2, 3, or 4 bytes per code point (or combination of code points). Code points are a sort of mapping between characters and numbers. The Unicode encodings have rather "mysterious" names: UTF-32 always uses 4 bytes per code point, while UTF-8 uses 1, 2, 3, or 4 bytes per code point, depending on the character.
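Also take care with the difference between bit and byte: the number in an encoding's name counts bits, not bytes, so UTF-8 works in 8-bit (1-byte) units and UTF-32 in 32-bit (4-byte) units.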
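To make the byte counts concrete, here is a small Python sketch that encodes a few sample characters and counts the resulting bytes. The sample characters and the helper name encoded_lengths are my own choices for illustration; the "-le" codec variants are used only so that Python does not prepend a byte order mark, which would inflate the counts.

```python
# Count how many bytes one character needs in three Unicode encodings.
# The "-le" (little-endian) codecs are chosen so no byte order mark is added.
ENCODINGS = ("utf-8", "utf-16-le", "utf-32-le")

def encoded_lengths(ch):
    """Return the (UTF-8, UTF-16, UTF-32) byte counts for a single character."""
    return tuple(len(ch.encode(enc)) for enc in ENCODINGS)

# Illustrative samples: Latin letter, accented letter, CJK ideograph, emoji.
samples = ["A", "\u00e9", "\u4e2d", "\U0001F600"]

for ch in samples:
    utf8, utf16, utf32 = encoded_lengths(ch)
    print(f"U+{ord(ch):04X}: UTF-8={utf8}  UTF-16={utf16}  UTF-32={utf32} bytes")
```

UTF-32 stays at 4 bytes for every sample, while the UTF-8 length grows from 1 to 4 bytes as the code point gets larger.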
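When it comes to Regular Expressions, a range of characters can be defined not only by their decimal ASCII values, but also by their Unicode code points or hexadecimal values. For example, in many regex flavors the range [a-z] can also be written as [\x61-\x7A]. A detailed explanation/standard: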