ArabUniC: glyphs <-> Unicode converter

Nicolas Goudard & Stéphane Humbel
Testing and fine-tuning with Prof. Ali Rahmouni and Sylvain Mazas
LucasFonts is gratefully acknowledged for sharing knowledge of Arabic ligatures, useful suggestions and data.

This applet converts any characters to their Unicode equivalents, whether they are Arabic, Chinese, Korean, ASCII, Cyrillic, or anything else. Conversely, it converts any Unicode back to glyphs. Special care was taken with Arabic sentences because of Arabic writing rules, hence the name. ArabUniC can be useful for internationalisation of Java code and for typography. It is believed to be useful for some FontLab users, and is mentioned, for instance, by Chris Hoke on www.ehow.com.
  1. Glyph to Unicode
    • If the box "arabic ligatures" is checked, Arabic ligatures will be handled (this is the default).
    • If the box "arabic ligatures" is not checked, Arabic ligatures will not be handled at all. This should be useful for any other characters (Chinese, Korean, ASCII, Cyrillic, etc.).
    • Note that on some Apple OS X systems copy/paste does not work but drag & drop does, because of Java security restrictions. You have to download the standalone version to use the clipboard.
    • Finally, a virtual Arabic keyboard can be found at Lexilogos.
    • Put some text in the field "String", then use the right arrow button to convert the field "String" to the field "Unicode"
    • Use the menu "Preferences / Check Unicode" to check the converted result
  2. Unicode to Glyph
    • Put some unicode characters in the field "Unicode", then use the left arrow button to convert the field "Unicode" to the field "String"
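
The two directions above can be sketched in a few lines of Java. This is only an illustration, not ArabUniC's own code: it assumes the "Unicode" field holds Java-style `\uXXXX` escapes (one per UTF-16 code unit), which the page does not state explicitly, and the class and method names (`UnicodeEscapeSketch`, `toEscapes`, `fromEscapes`) are hypothetical.

```java
final class UnicodeEscapeSketch {

    /** "String" to "Unicode": writes each UTF-16 code unit as a backslash-u escape. */
    static String toEscapes(String text) {
        StringBuilder out = new StringBuilder(text.length() * 6);
        for (int i = 0; i < text.length(); i++) {
            out.append(String.format("\\u%04x", (int) text.charAt(i)));
        }
        return out.toString();
    }

    /** "Unicode" to "String": decodes well-formed escapes, copies anything else unchanged. */
    static String fromEscapes(String escaped) {
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < escaped.length()) {
            boolean isEscape = escaped.startsWith("\\u", i)
                    && i + 6 <= escaped.length()
                    && isHex(escaped, i + 2);
            if (isEscape) {
                out.append((char) Integer.parseInt(escaped.substring(i + 2, i + 6), 16));
                i += 6;
            } else {
                out.append(escaped.charAt(i));
                i++;
            }
        }
        return out.toString();
    }

    /** True when the four characters starting at 'start' are hexadecimal digits. */
    private static boolean isHex(String s, int start) {
        for (int k = start; k < start + 4; k++) {
            if (Character.digit(s.charAt(k), 16) < 0) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        String original = "\u0633\u0644\u0627\u0645";   // Arabic word in logical order
        String escaped = toEscapes(original);
        System.out.println(escaped);
        System.out.println(fromEscapes(escaped).equals(original));
    }
}
```

Encoding every character, including ASCII, keeps the round trip trivially reversible. A real tool might leave ASCII untouched for readability.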

This is a contribution from iSm2 (Marseille - France).

This web version is permanent. ArabUniC supports 183 Arabic characters in their four forms (isolated, initial, medial, final). The rule for isolating glyphs (such as alif, which does not connect to the letter that follows it) has been implemented for 66 characters. Thanks are given to Sylvain Mazas for this issue. ArabUniC is believed to support various writings such as Pashto, Sindhi, Kurdish, Arabic, Persian, Urdu, Azerbaijani, Gilaki, Mazeruni, Punjabi, Kashmiri, Hausa, Sorani Kurdish, Masri, Turkmen, Uyghur, Kazakh, Kyrgyz, Malay, Tatar, Turkish and Morisco (please write to us if you find errors in this list).
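
As an illustration of the four forms and of the isolating rule, here is a minimal sketch that picks a contextual form for each letter. It is not ArabUniC's implementation: it assumes the target glyphs are the code points of the Unicode block Arabic Presentation Forms-B, it covers only four letters (alef, beh, lam, meem), it ignores diacritics and ligatures, and the names `ContextualFormsSketch` and `shape` are hypothetical.

```java
import java.util.Map;

final class ContextualFormsSketch {

    // Forms per letter, in the order isolated, final, initial, medial.
    // A two-entry array marks a letter (like alef) that never joins the next letter.
    private static final Map<Character, char[]> FORMS = Map.of(
            '\u0627', new char[] {'\uFE8D', '\uFE8E'},                      // alef
            '\u0628', new char[] {'\uFE8F', '\uFE90', '\uFE91', '\uFE92'},  // beh
            '\u0644', new char[] {'\uFEDD', '\uFEDE', '\uFEDF', '\uFEE0'},  // lam
            '\u0645', new char[] {'\uFEE1', '\uFEE2', '\uFEE3', '\uFEE4'}); // meem

    /** True when the letter can connect to the letter that follows it. */
    private static boolean joinsNext(char c) {
        char[] forms = FORMS.get(c);
        return forms != null && forms.length == 4;
    }

    /** Replaces each known letter (text in logical order) by its contextual form. */
    static String shape(String text) {
        StringBuilder out = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            char[] forms = FORMS.get(c);
            if (forms == null) {
                out.append(c);   // unknown characters pass through unchanged
                continue;
            }
            boolean joinedToPrevious = i > 0 && joinsNext(text.charAt(i - 1));
            boolean joinedToNext = forms.length == 4
                    && i + 1 < text.length()
                    && FORMS.containsKey(text.charAt(i + 1));
            int index = joinedToPrevious && joinedToNext ? 3   // medial
                    : joinedToPrevious ? 1                     // final
                    : joinedToNext ? 2                         // initial
                    : 0;                                       // isolated
            out.append(forms[index]);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // beh, alef, beh: the alef breaks the connection, so the last beh stays isolated.
        for (char c : shape("\u0628\u0627\u0628").toCharArray()) {
            System.out.printf("U+%04X%n", (int) c);
        }
    }
}
```

In the beh-alef-beh example the first beh takes its initial form, the alef its final form, and the last beh its isolated form, because alef does not connect forward.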

Download the 2.0 standalone version (runnable Java): ArabUniC (66 KB). Please report any remaining bugs.

Other Unicode converters exist, for instance http://snible.org/java2/ or http://www.mikezilla.com/exp0012.html, but they do not handle the specific ligature scheme of Arabic glyphs.
Version 1.99f had a residual bug at the end of "some" words. Lam-Alif should now be handled properly.
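
To make the Lam-Alif case concrete, the sketch below replaces a lam followed by an alef with a single ligature glyph. It is a hedged illustration rather than the actual ArabUniC code: it assumes the ligature code points U+FEFB (isolated) and U+FEFC (final) from Arabic Presentation Forms-B, uses a simplified list of basic letters that do not connect forward, ignores diacritics and the hamza/madda variants of the ligature, and the names `LamAlefSketch` and `ligate` are hypothetical.

```java
final class LamAlefSketch {

    private static final char LAM = '\u0644';
    private static final char ALEF = '\u0627';

    // Basic letters that never connect to the following letter (simplified subset).
    private static final String NON_FORWARD_JOINING =
            "\u0621\u0622\u0623\u0624\u0625\u0627\u0629\u062F\u0630\u0631\u0632\u0648";

    /** True when a basic Arabic letter connects to the letter that follows it. */
    static boolean joinsForward(char c) {
        boolean basicLetter = c >= '\u0621' && c <= '\u064A';
        return basicLetter && NON_FORWARD_JOINING.indexOf(c) < 0;
    }

    /** Replaces each lam + alef pair (logical order) with one ligature glyph. */
    static String ligate(String text) {
        StringBuilder out = new StringBuilder(text.length());
        int i = 0;
        while (i < text.length()) {
            char c = text.charAt(i);
            boolean isPair = c == LAM
                    && i + 1 < text.length()
                    && text.charAt(i + 1) == ALEF;
            if (isPair) {
                // Final form when the previous letter connects to the lam, else isolated.
                boolean joinedToPrevious = i > 0 && joinsForward(text.charAt(i - 1));
                out.append(joinedToPrevious ? '\uFEFC' : '\uFEFB');
                i += 2;
            } else {
                out.append(c);
                i++;
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // beh, lam, alef: the ligature follows a connecting letter.
        for (char c : ligate("\u0628\u0644\u0627").toCharArray()) {
            System.out.printf("U+%04X%n", (int) c);
        }
    }
}
```

Because the ligature ends in an alef, it never connects to the following letter, which is why only an isolated and a final form are needed.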
Standalone versions carry a time stamp and stop working after it:
- ArabUniC 1.99h: lasts until December 31st, 2011.
- ArabUniC 1.99i: lasts until December 31st, 2014 (diacritic bug corrected, thanks to Khurram for the hints and Asma for help).
- ArabUniC 2.0 (current version): lasts until December 31st, 2014 (new GUI, some bugs corrected).

ArabUniC was initially made to internationalize HuLiS. A (lite) HTML5 version of HuLiS can be found at Huckel for mobile phones.

The glyph correspondence table was first taken from Wikipedia.

2010