MMS Header Encoding

Author Message
Liviu
Unregistered guest
Posted on Monday, March 08, 2004 - 05:29 pm:   

Hello everybody,

I read several posts on the MMS forum on your website that relate to MMS headers in general and header encoding in particular. Although most people on the forum tend to agree that IANA charsets are encoded by OR-ing the assigned number (e.g. 0x6A for UTF-8) with 0x80, I'm still not sure this is the right thing to do. I would really appreciate a quick answer to two questions:

1. What happens with assigned numbers that are coded on two bytes (e.g. 0x07EA = BIG5)? Are we going to OR every byte?

2. How do I know how many bytes I have to read for the encoding when I receive a PDU with an encoded header set? My understanding is that the charset has to be encoded as an Integer-value (WSP 8.4.2.3). In this case, the OR theory seems to fall to pieces ...

Thank you for your time,

Liviu
Bryce Norwood - NowSMS Support
Board Administrator
Username: Bryce

Post Number: 1979
Registered: 10-2002
Posted on Monday, March 08, 2004 - 05:47 pm:   

Hi Liviu,

The OR-ing is what you do with a single byte value. And this is 100% consistent with the integer-value encoding that you reference in WSP 8.4.2.3.

If you were encoding charset=utf-8 as a parameter for a content-type encoding, it would be encoded as 81 EA. Integer-value can be encoded either as a Short-integer or as a Long-integer. Since the character set value (0x6A) is no greater than 127 (0x7F), Short-integer encoding is used, which basically involves just setting the high bit (OR with 0x80).

If you were encoding charset=big5 as a parameter for a content-type encoding, it would be encoded as 81 02 07 EA. In this case, because the character set value (0x7EA) is greater than 127 (0x7F), Long-integer encoding is used: a length octet (02) followed by the value with the most significant octet first.
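
To make the two cases above concrete, here is a minimal Python sketch of Integer-value encoding. The function names are made up for illustration and are not part of NowSMS or any library; the only inputs taken from this thread are the charset parameter token 0x81 and the charset values 0x6A and 0x7EA.

```python
CHARSET_PARAM = 0x81  # charset parameter token, as in the "81 EA" example


def encode_integer_value(value):
    """Encode a non-negative integer as a WSP Integer-value (hypothetical helper)."""
    if value < 0:
        raise ValueError("Integer-value cannot be negative")
    if value <= 0x7F:
        # Short-integer: a single octet with the high bit set.
        return bytes([value | 0x80])
    # Long-integer: a Short-length octet, then the value, big-endian.
    length = (value.bit_length() + 7) // 8
    if length > 30:
        raise ValueError("value too large for a Long-integer")
    return bytes([length]) + value.to_bytes(length, "big")


def encode_charset_param(mib_enum):
    """Encode charset=<IANA MIBenum> as a content-type parameter."""
    return bytes([CHARSET_PARAM]) + encode_integer_value(mib_enum)


print(encode_charset_param(0x6A).hex(" "))    # utf-8
print(encode_charset_param(0x07EA).hex(" "))  # big5
```

Note that the OR with 0x80 is applied only in the single-octet case; the octets of a Long-integer are written unchanged.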

When you decode such a character set value, you can automatically determine which type of encoding is used based upon the value, as described in WSP Section 8.4.1.2. Based upon the first octet of the value, you can determine what the encoding is:

0-30 (0x00-0x1E) - The octet is followed by the indicated number of data octets. (Context dependent, but in this case, a Long-integer value.)

31 (0x1F) - The octet is followed by a uintvar which indicates the number of data octets after it. (Context dependent, but 0x1F is the length-quote value, see the definition of Value-length.)

32-127 (0x20-0x7F) - The value is a text string, terminated by a zero octet (NUL character).

128-255 (0x80-0xFF) - It is an encoded 7-bit value; this header has no more data. (In this case, we know it to be a Short-integer.)
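
The four ranges above can be sketched as a small Python decoder. This is only an illustration under simplifying assumptions: the function names and the returned tuples are invented here, text is decoded as Latin-1, and the leading Quote octet (0x7F) that a Text-string may carry is not stripped.

```python
def read_uintvar(data, pos):
    """Read a uintvar (7 bits per octet, high bit = more octets follow)."""
    value = 0
    while True:
        octet = data[pos]
        pos += 1
        value = (value << 7) | (octet & 0x7F)
        if not octet & 0x80:
            return value, pos


def read_field_value(data, pos=0):
    """Return (kind, value, next_pos) based on the first octet of the value."""
    first = data[pos]
    if first <= 30:
        # Short-length: the octet itself is the number of data octets.
        start = pos + 1
        return "octets", data[start:start + first], start + first
    if first == 31:
        # Length-quote: a uintvar length follows, then the data octets.
        length, start = read_uintvar(data, pos + 1)
        return "octets", data[start:start + length], start + length
    if first <= 127:
        # Text string, terminated by a NUL octet.
        end = data.index(0, pos)
        return "text", data[pos:end].decode("latin-1"), end + 1
    # High bit set: a 7-bit Short-integer, no more data.
    return "short-integer", first & 0x7F, pos + 1


print(read_field_value(bytes.fromhex("ea")))
kind, payload, next_pos = read_field_value(bytes.fromhex("0207ea"))
print(kind, hex(int.from_bytes(payload, "big")), next_pos)
```

For a charset, the "octets" case is interpreted as a Long-integer, which is why the payload is converted with `int.from_bytes(payload, "big")`.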

-bn
Venkat Akella
Unregistered guest
Posted on Wednesday, March 10, 2004 - 05:01 am:   

Hello guys,
I have a quick question related to this topic. When Encoded-String-Value is "Value-length Char-set Text-string", I am thinking that we can figure out the length of the Text-string from Value-length. My question is: do I still have to terminate the Text-string with a zero octet (NUL character) in this case?

Thanks for your time.
Bryce Norwood - NowSMS Support
Board Administrator
Username: Bryce

Post Number: 2083
Registered: 10-2002
Posted on Thursday, March 11, 2004 - 07:13 pm:   

By spec, if you did not include the NUL character, the message would be considered corrupt. (Many clients and MMSCs would probably still try to decode it, and might decode it correctly, but by spec it would be considered corrupt.)

A Text-string must be terminated with the End-of-string token (which is defined as <Octet 0>, our friend the NUL character).
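
As a rough illustration, here is a Python sketch that builds a "Value-length Char-set Text-string" value with the terminating NUL counted inside Value-length. The helper names are hypothetical, and the sketch assumes the charset fits in a Short-integer (as utf-8, 0x6A, does). It also applies the WSP rule that a Text-string whose first character is in the range 128-255 is preceded by a Quote octet (0x7F).

```python
def encode_uintvar(n):
    """Encode n as a uintvar: 7 bits per octet, high bit set on all but the last."""
    out = [n & 0x7F]
    n >>= 7
    while n:
        out.append((n & 0x7F) | 0x80)
        n >>= 7
    return bytes(reversed(out))


def encode_value_length(n):
    """Short-length for 0-30, otherwise Length-quote (0x1F) plus a uintvar."""
    if n <= 30:
        return bytes([n])
    return b"\x1f" + encode_uintvar(n)


def encode_encoded_string(text, mib_enum=0x6A, codec="utf-8"):
    """Build Value-length Char-set Text-string (assumes mib_enum <= 0x7F)."""
    if not 0 <= mib_enum <= 0x7F:
        raise ValueError("this sketch only handles Short-integer charsets")
    raw = text.encode(codec)
    if raw and raw[0] >= 0x80:
        raw = b"\x7f" + raw            # Quote octet before a high first character
    text_string = raw + b"\x00"        # End-of-string (NUL) is mandatory
    body = bytes([mib_enum | 0x80]) + text_string
    return encode_value_length(len(body)) + body


print(encode_encoded_string("Hi").hex(" "))
```

The Value-length here covers the Char-set octet, the text, and the NUL, so a decoder can cross-check the two lengths against each other.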

-bn
Liviu
Unregistered guest
Posted on Friday, March 12, 2004 - 12:36 am:   

Hey Bryce,
Thank you for the clarifications. As soon as I read your post I realized that the missing piece in my theory was that I had overlooked the fact that the Short-integer is encoded with the MS bit set :-)

Liviu
Anonymous
 
Posted on Friday, March 26, 2004 - 09:54 am:   

hi all,
I have a query. I have to encode a .mms file according to the WSP spec (m - retreive conf.). The headers part is encoded perfectly, but we are facing problems while trying to add the contents.
For example, let's say I have 2 parts (text, image),

so the format will be:
<84><a3>
<Number of Parts><HeaderLen><DataLen><83>Data
then again,
<HeaderLen><DataLen><90>ImageData

Am I right? (This example is without SMIL.)

If this is wrong, please let me know.

If it is correct, then I have a doubt regarding the DataLen of the image part. How do I encode this field? Say I have image data of 5 KB: what am I supposed to do, and how do I obtain the data length of the image? Can anyone please explain, and also suggest any links?

Thanks a lot in advance,
Siddarth
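
On the DataLen question: in a WSP multipart body the entry count, HeadersLen and DataLen are uintvar values (the same uintvar mentioned earlier in this thread for Length-quote), and DataLen is simply the number of octets of part data. Below is a minimal Python sketch under stated assumptions: the function names are invented for illustration, each part carries only a one-octet well-known content type and no other headers, and 0x9E (image/jpeg in the WSP well-known content-type table) stands in for the image type, so substitute whatever your image really is.

```python
def encode_uintvar(n):
    """Encode n as a uintvar: 7 bits per octet, high bit set on all but the last."""
    out = [n & 0x7F]
    n >>= 7
    while n:
        out.append((n & 0x7F) | 0x80)
        n >>= 7
    return bytes(reversed(out))


def encode_part(content_type, data, headers=b""):
    """One multipart entry: HeadersLen, DataLen, content type + headers, data."""
    head = content_type + headers      # HeadersLen covers both of these
    return encode_uintvar(len(head)) + encode_uintvar(len(data)) + head + data


def encode_multipart(parts):
    """Entry count as a uintvar, followed by the encoded entries."""
    return encode_uintvar(len(parts)) + b"".join(parts)


text_part = encode_part(b"\x83", b"hello")       # 0x83 = text/plain
image_part = encode_part(b"\x9e", bytes(5120))   # 5 KB of placeholder image data
body = encode_multipart([text_part, image_part])

print(encode_uintvar(5120).hex(" "))  # DataLen octets for a 5120-byte image
print(image_part[:3].hex(" "))        # HeadersLen, then the two DataLen octets
```

In practice the length comes from the image bytes themselves, for example `len(open(path, "rb").read())`, where `path` is whatever file you are attaching.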
Anonymous
 
Posted on Sunday, August 08, 2004 - 09:19 pm:   

you will never do it while you can't spell "retrIEve"... lol
Lee
Unregistered guest
Posted on Monday, August 09, 2004 - 09:17 am:   

Why don't you download the MMS Decomposer from http://www.mmssdk.com and check your MMS file with it?

It also works well with NowSMS.

Lee