2 way Charset

Omar
New member
Username: Kfsmart

Post Number: 22
Registered: 01-2008
Posted on Sunday, September 13, 2009 - 09:47 am:   

Hi,
We receive mostly Arabic and English messages, but sometimes the Arabic messages arrive with unknown characters, like this:
꒤꒤ꐀ 笀   ㄀    紀₤꒤꒤ #مسعود مفتاح محمد الصبيحى اجدابياꐆ䐆䨆⠆䨆
I checked the incoming SMS in the INSMS log:
00A400A400A400A400A40020007B00200020002000310030002000200020007D002000A400A400A4 00A400A4002000230645063306390648062F002006450641062A0627062D00200645062D0645062F 00200627064406350628064A062D064900200627062C062F06270628064A062700A40644064A0628 064A062.

Converting this to UTF-8 should give:
¤¤¤¤¤ { 10 } ¤¤¤¤¤ #مسعود مفتاح محمد الصبيحى اجدابيا¤ليبيا
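
For reference, the conversion described above can be sketched in Python. This is only a minimal illustration, assuming the logged hex is big-endian UCS-2/UTF-16 (which is what the 00A4 and 06xx values suggest). The helper name decode_ucs2_hex is made up for this example, and the incomplete final code unit from the paste is simply dropped.

```python
# Hex dump as pasted above, with the spaces added by the board removed.
# The last code unit is incomplete ("062"), so it is dropped before decoding.
HEX = (
    "00A400A400A400A400A40020007B00200020002000310030002000200020007D002000A400A400A4"
    "00A400A4002000230645063306390648062F002006450641062A0627062D00200645062D0645062F"
    "00200627064406350628064A062D064900200627062C062F06270628064A062700A40644064A0628"
    "064A062"
)


def decode_ucs2_hex(hex_text: str) -> str:
    """Decode big-endian 16-bit hex (hypothetical helper, not part of NowSMS)."""
    cleaned = "".join(hex_text.split())         # strip any embedded whitespace
    usable = len(cleaned) - len(cleaned) % 4    # keep whole 16-bit code units only
    return bytes.fromhex(cleaned[:usable]).decode("utf-16-be")


text = decode_ucs2_hex(HEX)
utf8_bytes = text.encode("utf-8")

# ascii() keeps the output printable on consoles that cannot render Arabic.
print(ascii(text))
print(utf8_bytes[:10])
```

Each ¤ becomes the two bytes C2 A4 once the text is encoded as UTF-8.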

Note: I am using UTF-8 as the charset. What do you recommend I do?
Des - NowSMS Support
Board Administrator
Username: Desosms

Post Number: 1239
Registered: 08-2008
Posted on Sunday, September 13, 2009 - 05:06 pm:   

Hi Omar,

I'm not sure that I understand.

Your decoding looks correct.

Are you saying that NowSMS gives you:

꒤꒤ꐀ 笀   ㄀    紀₤꒤꒤ #مسعود مفتاح محمد الصبيحى اجدابياꐆ䐆䨆⠆䨆

instead of:

¤¤¤¤¤ { 10 } ¤¤¤¤¤ #مسعود مفتاح محمد الصبيحى اجدابيا¤ليبيا



--
Des
NowSMS Support
Omar
New member
Username: Kfsmart

Post Number: 23
Registered: 01-2008
Posted on Monday, September 14, 2009 - 09:26 am:   

Hi Des,

Yes it gives me "꒤꒤ꐀ 笀   ㄀    紀₤꒤꒤ #مسعود مفتاح محمد الصبيحى اجدابياꐆ䐆䨆⠆䨆"

although sms-In.log shows the correct encoding.
Des - NowSMS Support
Board Administrator
Username: Desosms

Post Number: 1241
Registered: 08-2008
Posted on Monday, September 14, 2009 - 04:16 pm:   

Hi Omar,

How are you decoding the UTF-8 in your 2-way command?

I ask, because I just routed your example Unicode message above into a 2-way command in NowSMS, and the UTF-8 passed to the 2-way command seemed to match the correct decoding.

Let me explain what I did to test ...

I manually submitted the Unicode hex string into NowSMS as an inbound message with the following URL format:

http://127.0.0.1:8800/?sender=xxxxx&phonenumber=yyyy&dcs=8&inboundmessage=yes&binary=yes&data=00A400A400A400A400A40020007B00200020002000310030002000200020007D002000A400A400A400A400A4002000230645063306390648062F002006450641062A0627062D00200645062D0645062F00200627064406350628064A062D064900200627062C062F06270628064A062700A40644064A0628064A062

(This discussion board software inserts spaces into long strings, so I had to remove them.)
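
If you want to repeat this test, a URL of that shape can be assembled with a short script. This is just a sketch: the parameter names are the ones in the URL above, the function name build_inbound_test_url is made up, and the host, port, sender and phonenumber values are placeholders to replace with your own.

```python
from urllib.parse import urlencode


def build_inbound_test_url(hex_data: str,
                           sender: str = "xxxxx",
                           phonenumber: str = "yyyy",
                           base: str = "http://127.0.0.1:8800/") -> str:
    """Build a test URL like the one above (hypothetical helper)."""
    params = {
        "sender": sender,
        "phonenumber": phonenumber,
        "dcs": "8",                  # value taken from the URL above
        "inboundmessage": "yes",
        "binary": "yes",
        # Remove any spaces that forum software may have inserted into the hex.
        "data": "".join(hex_data.split()),
    }
    return base + "?" + urlencode(params)


url = build_inbound_test_url("00A4 0020 0645")
print(url)
# The URL could then be requested with urllib.request.urlopen(url)
# against a server that is actually listening on that address.
```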

The message was routed to a 2-way HTTP command, which received the following request:

/?sender=xxxxx&text=%C2%A4%C2%A4%C2%A4%C2%A4%C2%A4%20{%20%20%2010%20%20%20}%20%C2%A4%C2%A4%C2%A4%C2%A4%C2%A4%20%23%D9%85%D8%B3%D8%B9%D9%88%D8%AF%20%D9%85%D9%81%D8%AA%D8%A7%D8%AD%20%D9%85%D8%AD%D9%85%D8%AF%20%D8%A7%D9%84%D8%B5%D8%A8%D9%8A%D8%AD%D9%89%20%D8%A7%D8%AC%D8%AF%D8%A7%D8%A8%D9%8A%D8%A7%C2%A4%D9%84%D9%8A%D8%A8%D9%8Ab

I decoded this UTF-8 string as:

¤¤¤¤¤ { 10 } ¤¤¤¤¤ #مسعود مفتاح محمد الصبيحى اجدابيا¤ليبيb

(The end of the Arabic part may be slightly different, because there is an odd number of characters in your hex string, so I think a character went missing in the cut & paste, or the discussion board software lost it.)

So I think NowSMS is encoding it correctly. Maybe your script is having a problem decoding some UTF-8 characters?

I would look especially at UTF-8 C2 A4, which is Unicode 00A4, or the ¤ character.
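
A quick way to check how a script handles that character is sketched below in Python. The first part only confirms the byte values mentioned above. The second part is an observation, not a confirmed diagnosis: the first garbled characters in your output are U+A4A4, U+A4A4 and U+A400, which is what you get if each 00A4 unit is reduced to a single A4 byte and the result is then read as 16-bit characters. The function name collapse_currency_units is made up for this example.

```python
CURRENCY_SIGN = "\u00a4"  # the currency sign character

# U+00A4 is two bytes in UTF-8 and one 16-bit unit in big-endian UCS-2.
assert CURRENCY_SIGN.encode("utf-8") == b"\xc2\xa4"
assert CURRENCY_SIGN.encode("utf-16-be") == b"\x00\xa4"

# Possible failure 1: reading the UTF-8 bytes as Latin-1 gives two characters.
as_latin1 = b"\xc2\xa4".decode("latin-1")


def collapse_currency_units(raw: bytes) -> bytes:
    """Assumed corruption: every 00 A4 code unit is written as the single byte A4."""
    out = bytearray()
    for i in range(0, len(raw) - 1, 2):
        unit = raw[i:i + 2]
        out += b"\xa4" if unit == b"\x00\xa4" else unit
    return bytes(out)


# Possible failure 2: the shortened byte stream is decoded as 16-bit units.
sample = (CURRENCY_SIGN * 5 + " {").encode("utf-16-be")
damaged = collapse_currency_units(sample)
garbled = damaged[: len(damaged) // 2 * 2].decode("utf-16-be")  # drop odd trailing byte

print(ascii(as_latin1))
print(ascii(garbled))
```

If the second pattern matches what your script produces, the place to look is wherever the ¤ character is converted between a single-byte charset and 16-bit text.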

--
Des
NowSMS Support