Reading Unicode from plain text/file


Text read from a raw string or file may contain Unicode escape sequences, such as \u00e9, stored as literal characters. To use them as real Unicode characters, the escapes must be unescaped first.

The method below does this.

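// Converts backslash-u escapes (backslash, u or U, four hex digits) into the characters they name.
// A backslash before any other character quotes that character, so two backslashes yield one.
String unescape(String s) {
    int i = 0, len = s.length();
    StringBuilder sb = new StringBuilder(len);
    while (i < len) {
        char c = s.charAt(i++);
        if (c == '\\' && i < len) {
            c = s.charAt(i++);
            if (c == 'u' || c == 'U') {
                if (i + 4 > len) {
                    throw new IllegalArgumentException("Truncated \\u escape at index " + (i - 2));
                }
                // parseInt throws NumberFormatException if the four characters are not hex digits
                c = (char) Integer.parseInt(s.substring(i, i + 4), 16);
                i += 4;
            } // add other cases here as desired...
        }
        sb.append(c);
    }
    return sb.toString();
}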
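
To tie this back to reading from a file, here is a minimal sketch of how the method might be applied line by line. The class name UnescapeDemo, the helper readUnescaped, and the temporary file are assumptions made for illustration, and the file is assumed to be UTF-8 encoded. The unescape logic is repeated inside the class only so that it compiles on its own.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class UnescapeDemo {

    // Same logic as the unescape method above, repeated so this class is self-contained.
    static String unescape(String s) {
        int i = 0, len = s.length();
        StringBuilder sb = new StringBuilder(len);
        while (i < len) {
            char c = s.charAt(i++);
            if (c == '\\' && i < len) {
                c = s.charAt(i++);
                if (c == 'u' || c == 'U') {
                    // Throws if fewer than four characters remain or they are not hex digits
                    c = (char) Integer.parseInt(s.substring(i, i + 4), 16);
                    i += 4;
                }
            }
            sb.append(c);
        }
        return sb.toString();
    }

    // Hypothetical helper: reads a UTF-8 text file and unescapes every line.
    static List<String> readUnescaped(Path file) throws IOException {
        List<String> result = new ArrayList<>();
        for (String line : Files.readAllLines(file, StandardCharsets.UTF_8)) {
            result.add(unescape(line));
        }
        return result;
    }

    public static void main(String[] args) throws IOException {
        // Write a small sample file whose lines contain literal escape sequences.
        Path file = Files.createTempFile("escaped", ".txt");
        Files.write(file, Arrays.asList("caf\\u00e9", "\\u20ac 5"), StandardCharsets.UTF_8);

        for (String line : readUnescaped(file)) {
            System.out.println(line);
        }
        Files.delete(file);
    }
}
```

Whether the printed characters display correctly depends on the console encoding. Characters outside the Basic Multilingual Plane are written as two consecutive escapes (a surrogate pair), which this approach handles naturally because each escape is appended as one char.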