Wow, this’ll be a huge change. ICANN is getting ready to allow internationalized domain names — web-site domain names that use non-Latin characters instead of the regular western alphabet.
Internationalized Domain Names (IDNs) are domain names represented by local language characters. Such domain names could contain letters or characters from non-ASCII scripts (for example, Arabic or Chinese). Many efforts are ongoing in the Internet community to make domain names available in character sets other than ASCII.
These “internationalized domain name” (IDN) efforts were the subject of a 25 September 2000 resolution by the ICANN Board of Directors, which recognized “that it is important that the Internet evolve to be more accessible to those who do not use the ASCII character set,” and also stressed that “the internationalization of the Internet’s domain name system must be accomplished through standards that are open, non-proprietary, and fully compatible with the Internet’s existing end-to-end model and that preserve globally unique naming in a universally resolvable public name space.”
You can read this BBC story for a more easily digested version:
The internet is on the brink of the “biggest change” to its working “since it was invented 40 years ago”, the net regulator Icann has said.
The body said it was finalising plans to introduce web addresses using non-Latin characters.
The proposal — initially approved in 2008 — would allow domain names written in Asian, Arabic or other scripts.
My head is swimming with questions. A lot of them are about the specifics of getting this to work, considering how complex it still is to handle a lot of non-Latin writing systems with current fonts and technologies. With a lot of scripts there’s still a big divide between encoding the characters and representing them visually, and only limited solutions for handling that. With Indic scripts, to pick an example I’ve been rather deeply immersed in lately, typing in the Unicode characters is just the first step: rendering names or words in a way that makes linguistic sense requires a little extra software processing, and some carefully built fonts. So are the address bars in web browsers going to handle OpenType substitutions to make that happen? Or is there going to be a different encoding solution that’s a little more WYSIWYG when it comes to typing in non-Latin addresses? I’m guessing a lot of those issues have come up in the ICANN proceedings, so I suppose it’s time to wade through them and see what the deal is.
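As far as I can tell, the encoding half of this already has a standard answer: under the existing IDNA scheme, the Unicode name is only a presentation form, and what actually goes into the DNS is an ASCII-compatible “xn--” string produced by the Punycode conversion. Here’s a minimal sketch of that round trip in Python; the Devanagari domain is just something I made up for illustration.

```python
# A rough sketch of how an internationalized name travels through the DNS:
# the Unicode form is converted to an ASCII-compatible "xn--" label
# (Punycode, per the IDNA standard) and back. The domain here is made up.
unicode_host = "मराठी.example"

# Python's built-in "idna" codec does the Unicode-to-ASCII conversion,
# label by label.
ascii_host = unicode_host.encode("idna").decode("ascii")
print(ascii_host)            # an ASCII name whose first label starts with "xn--"

# And the reverse: from the wire form back to the presentation form.
print(ascii_host.encode("ascii").decode("idna"))   # मराठी.example
```

That still leaves the rendering question wide open, of course: converting to and from Punycode says nothing about how the Unicode form gets shaped on screen.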
In one sense, of course, this is brilliant. Domain names are part of our online identities and brands these days, and people should be able to use their own languages and writing systems to identify themselves online. It’s only fair, and it shows respect for the huge sectors of the world that don’t use our alphabet every day. Hopefully this will also encourage more technical support — and type design, if we’re starting to think about web fonts, too — for non-Latin scripts. (Trust me, it’s a typographic desert out there in the non-Latin world.)
But there will also be a certain amount of balkanization that’s likely to come of it, on top of just the language barrier. Linking to non-Latin domain names will require extra know-how about how to key in those names. It will require some understanding of encoding versus representation, writing direction, and even sensitivity to the differences between one character and another in an unfamiliar alphabet. Again, these are things that would all be good for people to learn, but we can’t even get people to use nice, clean HTML all the time. It would be a shame if the extra complexity keeps people from bothering to connect to the internationalized portion of the web. I suppose some sort of transliteration layer will spring up, but again…so many questions!
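To make the encoding-versus-representation worry a bit more concrete, here’s a toy example (mine, not anything from the ICANN material): two strings that look identical on screen can be built from different code-point sequences, and the IDNA machinery has to normalize them before producing the ASCII form that actually gets resolved.

```python
# A toy illustration of "encoding versus representation": the same visible
# name built from two different code-point sequences. (Hypothetical domain.)
import unicodedata

composed   = "caf\u00e9.example"    # é as a single precomposed code point
decomposed = "cafe\u0301.example"   # e followed by a combining acute accent

print(composed == decomposed)       # False: the underlying code points differ

# Unicode normalization (NFC here) makes the two sequences compare equal.
print(unicodedata.normalize("NFC", decomposed) == composed)   # True

# IDNA's nameprep step normalizes before the Punycode conversion, so both
# spellings end up as the same ASCII domain name on the wire.
print(composed.encode("idna") == decomposed.encode("idna"))   # True
```

Normalization handles easy cases like this one; characters from different scripts that merely look alike are a separate problem, and exactly the kind of sensitivity I mean above.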