The ISO 8859 Alphabet material
By williamxue on May 26, 2007
Here is a paper on ISO 8859. It is a very good introduction to ISO 8859, and it contains many useful and important links.
I have composed the following more detailed documents:
- Combined tables of the upper halves of ISO 8859 character sets. "Upper half" means code positions from 160 to 255 (decimal); the lower halves are all identical (to each other and to ASCII). Warning: That document is viewable only in a browser that has at least minimal support for Unicode, such as IE 4 or Netscape 4. See the notes on browser settings in my document Using national and special characters in HTML.
- Combined tables of the upper halves of ISO Latin alphabets. This is the same as the document mentioned above, except that only the Latin alphabets (as opposed to Latin/Cyrillic, Latin/Arabic, etc.) have been included:
The Latin alphabets of ISO 8859

| standard | name of alphabet | characterization |
| --- | --- | --- |
| ISO 8859-1 | Latin alphabet No. 1 | "Western", "West European" |
| ISO 8859-2 | Latin alphabet No. 2 | "Central European", "East European" |
| ISO 8859-3 | Latin alphabet No. 3 | "South European"; "Maltese & Esperanto" |
| ISO 8859-4 | Latin alphabet No. 4 | "North European" |
| ISO 8859-9 | Latin alphabet No. 5 | "Turkish" |
| ISO 8859-10 | Latin alphabet No. 6 | "Nordic" (Sámi, Inuit, Icelandic) |
| ISO 8859-13 | Latin alphabet No. 7 | Baltic Rim |
| ISO 8859-14 | Latin alphabet No. 8 | Celtic |
| ISO 8859-15 | Latin alphabet No. 9 | "euro" |
| ISO 8859-16 | Latin alphabet No. 10 | Romanian and some other languages |
Notice that only the following characters in the "upper halves" are invariant in the ISO Latin alphabets, in the sense of occurring in all of them in the same code position: no-break space, soft hyphen, and the characters § Ä ä Ö ö Ü ü É é ß.
- Characters in upper halves of ISO Latin alphabets. A table which lists the Unicode characters present in at least one of the ISO Latin alphabets and their code positions in them. Best viewed on a browser which supports Unicode, but characters are specified by their Unicode code numbers too.
- Coverage of European languages by ISO Latin alphabets. You might use it to determine which (if any) of the alphabets are suitable for a document in a given language or combination of languages.
- A combined mapping table from ISO 8859 character sets to Unicode. This is plain text and mostly useful for writing programs such as converters. The structure is hopefully obvious: each row begins with a code position m (in hexadecimal); each column has a heading n; and the table entry on row m, column n specifies the Unicode code value of the character which is in code position m in ISO 8859-n. A value of 0000 indicates that in ISO 8859-n, no character is assigned to code position m.
- The ISO Latin 1 character repertoire - a description with usage notes. A detailed look at the characters in ISO 8859-1.
- ISO 8859-7 vs. windows-1253 (differences between two Greek "character sets")
- ISO Latin 9 as compared with ISO Latin 1. (ISO Latin 9 = ISO 8859-15).
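The row/column structure of the combined mapping table, and the invariance note under the table of Latin alphabets, can be illustrated with a short Python sketch. This is not the program behind the documents listed above: it assumes that Python's built-in `iso8859_n` codecs are an acceptable stand-in for the plain-text table, and the names `code_value`, `mapping_rows`, `format_table` and `invariant_positions` are hypothetical helpers invented for the illustration.

```python
# Sketch: rebuild a combined ISO 8859 -> Unicode table from Python's codecs.
# Assumption: the built-in "iso8859_n" codecs stand in for the plain-text table.

LATIN_PARTS = [1, 2, 3, 4, 9, 10, 13, 14, 15, 16]  # the ISO Latin alphabets
ALL_PARTS = [n for n in range(1, 17) if n != 12]   # there is no ISO 8859-12


def code_value(position, part):
    """Unicode code value at `position` in ISO 8859-`part`, or 0 if unassigned."""
    try:
        return ord(bytes([position]).decode(f"iso8859_{part}"))
    except UnicodeDecodeError:
        return 0


def mapping_rows(parts=ALL_PARTS, start=0xA0, stop=0xFF):
    """Yield (position, [code values]) rows, one column per part."""
    for m in range(start, stop + 1):
        yield m, [code_value(m, n) for n in parts]


def format_table(parts=ALL_PARTS):
    """Render rows as text: hexadecimal position m, then one 4-digit value per n."""
    lines = ["    " + " ".join(f"{n:>4}" for n in parts)]
    for m, values in mapping_rows(parts):
        lines.append(f"{m:02X}  " + " ".join(f"{v:04X}" for v in values))
    return "\n".join(lines)


def invariant_positions(parts=LATIN_PARTS):
    """Upper-half positions holding the same character in every listed part."""
    return [m for m, values in mapping_rows(parts)
            if values[0] != 0 and len(set(values)) == 1]


if __name__ == "__main__":
    print(format_table())
    # The characters found at the invariant positions, read from ISO 8859-1.
    print("".join(chr(code_value(m, 1)) for m in invariant_positions()))
```

A converter would normally use the `(position, values)` rows directly rather than the formatted text. The sketch covers only the upper half by default; pass `start=0x00` to `mapping_rows` to include the lower half as well.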
There is an excellent online character database by Indrek Hein at the Institute of the Estonian Language. You can, for example, search for Unicode characters by name or code position, get lists of differences or conversions between some character sets (such as the ISO 8859 family and many others), and get lists of characters needed for different languages.
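For quick offline checks, two of these lookups can be roughly approximated in Python. The sketch below is not related to that database or its interface. It assumes Python's built-in `iso8859_n` codecs as the data source, and `suitable_alphabets` and `differences` are hypothetical names chosen for the illustration.

```python
# Sketch: two lookups, using Python's "iso8859_n" codecs as the data source.

LATIN_PARTS = [1, 2, 3, 4, 9, 10, 13, 14, 15, 16]  # the ISO Latin alphabets


def suitable_alphabets(text, parts=LATIN_PARTS):
    """Parts of ISO 8859 whose repertoire contains every character of `text`."""
    suitable = []
    for n in parts:
        try:
            text.encode(f"iso8859_{n}")
        except UnicodeEncodeError:
            continue  # at least one character is missing from ISO 8859-n
        suitable.append(n)
    return suitable


def differences(a, b):
    """Upper-half positions where ISO 8859-a and ISO 8859-b differ.

    Returns (position, character in a, character in b) tuples.
    """
    result = []
    for m in range(0xA0, 0x100):
        byte = bytes([m])
        # errors="replace" turns an unassigned position into U+FFFD.
        ca = byte.decode(f"iso8859_{a}", errors="replace")
        cb = byte.decode(f"iso8859_{b}", errors="replace")
        if ca != cb:
            result.append((m, ca, cb))
    return result


if __name__ == "__main__":
    print(suitable_alphabets("€"))
    for m, ca, cb in differences(1, 15):
        print(f"{m:02X}  U+{ord(ca):04X}  U+{ord(cb):04X}")
```

Here `differences(1, 15)` compares ISO Latin 1 with ISO Latin 9 and reports eight positions, among them A4, where the currency sign gives way to the euro sign. Note that `suitable_alphabets` only tests whether the characters are present; whether an alphabet is adequate for a language in practice is a separate question.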