Explain MMS binary code

wang_jinwang
Posted on Wednesday, July 16, 2003 - 09:01 am:   

the header of a MMS binary code is
8C 84 //message type = m_retrieve_conf
8D 90 //mms version = 1001 0000????
85 04 3E 2C A8 F0 //date from 1970....
96 42 54 20 49 67 6E 69 74 65 20 4D 4D 53 00 //subject = "BT Ignite MMS"
84 1F 21 B3 //content type = ?
8A 3c 62 74 6D 6D 73 2E 73 6D 69 6c 3E 00
//message-class = ???
89 61 70 70 6c 69 63 61 74 69 6F 6E 2F 73 6D 69 6C 00 //from = applocation smil
04 2C //???
83 6E 61 70 70 6C 69 63 61 74 69 6F 6E 2F 73 6D 69
6c 00 //content-location = napplocation/smil
8E 62 74 //message-size = 29794?
6D 6D 73 2E 73 6D 69 6C 00 //mms.smil
followed by smil presentation and other files

Is my explanation right?
I found some hex bytes that I can't explain.
Can anyone help me?
Thanks
Bryce Norwood - NowSMS Support
Posted on Wednesday, July 16, 2003 - 03:09 pm:   

This particular example is a very simple one ... it doesn't have much in the way of headers.

8C 84 //message type = m_retrieve_conf
8D 90 //mms version = 1001 0000????
85 04 3E 2C A8 F0 //date from 1970....
96 42 54 20 49 67 6E 69 74 65 20 4D 4D 53 00 //subject = "BT Ignite MMS"
84 1F 21 B3 8A 3c 62 74 6D 6D 73 2E 73 6D 69 6c 3E 00 89 61 70 70 6c 69 63 61 74 69 6F 6E 2F 73 6D 69 6C 00 // content-type: application/vnd.wap.multipart.related; start=<btmms.smil>; type=application/smil

After that, you're done with the headers, and into the "application/vnd.wap.multipart.related" content.

-bn
wang_jinwang
Posted on Thursday, July 17, 2003 - 07:02 am:   

thanks :-)
More questions!
I found that in the MIME header
there is no "boundary" parameter,
so how do I separate the attached files that follow?
In some binary MMS files there is also no
"content-type" parameter (not in the MIME header, only in the header of each attached file), so how can I recognize the type of an attached file?
Bryce Norwood - NowSMS Support
Posted on Thursday, July 17, 2003 - 02:56 pm:   

The content is "application/vnd.wap.multipart.related", which is a binary encoding for multipart MIME.

This MIME type is defined in Section 8.5 of the WAP Wireless Session Protocol (WSP) specification.

-bn
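
To make the "no boundary" point concrete: as I understand Section 8.5, the multipart body starts with a uintvar count of entries, and each entry carries its own header length and data length (both uintvars), so the parts are delimited by length instead of by a boundary string. Below is a minimal Python sketch of that layout. The function names read_uintvar and split_multipart are my own, and the sample bytes are made up for illustration; they are not taken from the message above.

```python
def read_uintvar(data, pos):
    """Read a WSP uintvar at data[pos]; return (value, next_pos)."""
    value = 0
    while True:
        byte = data[pos]
        pos += 1
        value = (value << 7) | (byte & 0x7F)
        if not byte & 0x80:          # high bit clear: last byte of the uintvar
            return value, pos


def split_multipart(body):
    """Split a binary multipart body into (headers, data) byte pairs."""
    count, pos = read_uintvar(body, 0)
    parts = []
    for _ in range(count):
        headers_len, pos = read_uintvar(body, pos)
        data_len, pos = read_uintvar(body, pos)
        headers = body[pos:pos + headers_len]   # content type + other headers
        pos += headers_len
        data = body[pos:pos + data_len]
        pos += data_len
        parts.append((headers, data))
    return parts


# Made-up sample: 2 entries, each with a 1-byte header block.
sample = bytes([0x02, 0x01, 0x03, 0x83]) + b"abc" + bytes([0x01, 0x02, 0x9E]) + b"hi"
parts = split_multipart(sample)
print(parts)
```

The content type of each part is the first thing inside its header block, which is how the type of an attached file can be recognized even without a top-level "content-type" for it.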
wang_jinwang
Posted on Monday, July 21, 2003 - 05:05 am:   

According to the WSP specification, a MIME entry in an MMS looks like [headerslen][datalen][contenttype][headers][data]

Here is a MIME header and entry:
04 //4 files
2c //headerlen 44
83 6e //datalen 3212
61 70 70 6c 69 63 61 74 69 6F 6E 2F 73 6D 69 6C 00
//content type application/smil
C0 22 3C 62 74 6D 6D 73 2E 73 2E 73 6D 69 6C 3E 00
8E 62 74 6D 6D 73 2E 73 6D 69 6C 00??
So can you help me explain the last line of binary code? I can't explain the [headers] part of the MIME entry. Maybe it is simple, but I can't find a detailed explanation in the WSP specification.
Bryce Norwood - NowSMS Support
Posted on Wednesday, July 30, 2003 - 09:20 pm:   

Ok. Apologies for the delay in response ... I've had to concentrate on our own product support this past week because of a high volume of questions about our beta release (http://www.nowsms.com/beta5).

Those are WSP headers.

When you have a multipart MIME message, each part of the message has a MIME header followed by the actual body of the content.

In "application/vnd.wap.multipart.related" or "application/vnd.wap.multipart.mixed" encoding, the first field in an individual part header is always the content type, as you have discovered.

What follows is additional MIME headers that are encoded using WSP header encoding. Essentially, section 8.4 ("Header Encoding") of the WSP specification covers this, with Tables 38 and 39 documenting the binary codes associated with well known header field names and parameters.

So taking the example above ...

C0 22 3C 62 74 6D 6D 73 2E 73 2E 73 6D 69 6C 3E 00

Content-ID: <btmms.s.smil>

Content-ID's WSP encoding is 0x40, but as a well-known field, it is or'd with 0x80 to produce 0xC0.

The WAP spec defines the Content-ID value as a quoted string, hence the quote (") character to start the field value.

8E 62 74 6D 6D 73 2E 73 6D 69 6C 00

Content-Location: btmms.smil

Content-Location's WSP encoding is 0x0E, but as a well-known field, it is or'd with 0x80 to produce 0x8E.

As you can see, I believe there is a typo in the message above, adding the extra ".s" in the middle of the Content-ID value. That explains why the "headerlen" shows 0x2C, but the example has 0x2E bytes of header data.

-bn
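
The decoding above can be sketched in Python as a rough check of the arithmetic. This is only an illustration under assumptions: FIELD_NAMES holds just the two codes mentioned in this post (not the full WSP table), decode_part_headers is a hypothetical helper name, and it handles only well-known field names with null-terminated text values.

```python
# Subset of the well-known header field codes, taken from the post above.
FIELD_NAMES = {0x0E: "Content-Location", 0x40: "Content-ID"}


def decode_part_headers(data):
    """Decode well-known headers with text values into (name, value) pairs."""
    headers = []
    pos = 0
    while pos < len(data):
        code = data[pos]
        if not code & 0x80:
            raise ValueError("only well-known field names are handled here")
        name = FIELD_NAMES[code & 0x7F]     # strip the 0x80 marker bit
        end = data.index(0, pos + 1)        # value runs up to the null byte
        raw = data[pos + 1:end]
        if raw[:1] == b'"':                 # leading quote marks a quoted string
            raw = raw[1:]
        headers.append((name, raw.decode("ascii")))
        pos = end + 1
    return headers


# Header bytes with the suspected ".s" typo removed.
content_type = b"application/smil\x00"
headers = bytes.fromhex(
    "C0 22 3C 62 74 6D 6D 73 2E 73 6D 69 6C 3E 00"
    " 8E 62 74 6D 6D 73 2E 73 6D 69 6C 00"
)
decoded = decode_part_headers(headers)
total = len(content_type) + len(headers)
print(decoded)
print(hex(total))
```

With the extra ".s" removed, the content type (17 bytes) plus the two headers (27 bytes) add up to 0x2C, which matches the "headerlen" byte.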
Anonymous
Posted on Thursday, August 07, 2003 - 10:17 am:   

83 6e //datalen 3212

Can anybody explain how this value was derived?

83h = 131
6Eh = 110

..! Confused
Bryce Norwood - NowSMS Support
Posted on Thursday, August 07, 2003 - 05:03 pm:   

See the fourth message in the following thread for a reference to where length encoding is defined:

http://support.nowsms.com/discus/messages/12/522.html
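
For readers who cannot follow the link: these lengths are WSP "uintvar" values (variable length unsigned integers). Each byte contributes its low 7 bits to the value, and a set high bit means that another byte follows. A minimal Python sketch, where the name decode_uintvar is simply my choice for illustration:

```python
def decode_uintvar(data, pos=0):
    """Decode a WSP uintvar starting at data[pos].

    Returns (value, next_pos). Each byte carries 7 bits of the value,
    most significant group first; the high bit flags a continuation.
    """
    value = 0
    while True:
        byte = data[pos]
        pos += 1
        value = (value << 7) | (byte & 0x7F)
        if not byte & 0x80:
            return value, pos


datalen, _ = decode_uintvar(bytes.fromhex("83 6E"))
print(datalen)
```

By this rule, 83 6E works out to (0x03 << 7) + 0x6E = 384 + 110 = 494, so the "3212" in the original annotation does not appear to be what these two bytes encode.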
Jery
Posted on Tuesday, August 12, 2003 - 02:01 am:   

85 04 3E 2C A8 F0 //date from 1970....

Can anyone tell me what the date format is?

Thank you
Bryce Norwood - NowSMS Support
Posted on Tuesday, August 12, 2003 - 06:37 am:   

Jery,

I'm sure that people get tired of reading messages where I refer someone to a particular specification. But I think it is very important that if someone is doing any encoding or decoding of this type of data ... they should have some understanding of the specification. If you rely only on reverse engineering, you're likely to run into problems with different implementations of the same protocol that encode things slightly differently.

Section 7.2.5 of the MMS Encapsulation Protocol specification published by the Open Mobile Alliance specifies that the date field is a long-integer specifying the number of seconds since 1970-01-01 00:00:00 GMT.

The MMS encapsulation header uses WSP style rules for the encoding of header fields. So you'll find the definition of the long-integer format in Section 8.4.2.1 of the WSP specification. Basically, 04 is the number of bytes in the multi-octet integer (most significant octet first in this type of integer encoding), and 3E 2C A8 F0 is the value (# of seconds).

-bn
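
As a small worked example of the long-integer rule just described, here is a Python sketch. The helper name decode_long_integer is an assumption of mine, and the input is the value part of the date header quoted above (the bytes after the 0x85 field code).

```python
from datetime import datetime, timezone


def decode_long_integer(data, pos=0):
    """Decode a WSP long-integer: a length byte, then that many octets,
    most significant octet first. Returns (value, next_pos)."""
    length = data[pos]
    if not 1 <= length <= 30:
        raise ValueError("long-integer length must be 1..30 octets")
    start = pos + 1
    value = int.from_bytes(data[start:start + length], "big")
    return value, start + length


seconds, _ = decode_long_integer(bytes.fromhex("04 3E 2C A8 F0"))
stamp = datetime.fromtimestamp(seconds, tz=timezone.utc)
print(seconds, stamp.isoformat())
```

For this message the value comes out as 1043114224 seconds after 1970-01-01 00:00:00 GMT, which falls in January 2003.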
Jery
Posted on Tuesday, August 12, 2003 - 07:11 am:   

Thank you Bryce
At first I thought a long was only four bytes,
and WAP-209 doesn't describe the encoding format,
so I was in doubt here.

Anonymous
Posted on Tuesday, September 16, 2003 - 04:37 am:   

In the explanation you gave, there are 4 points that still confuse me.

1)0x90 the version is? major /minor?

2)84 1f 21 b3 , b3 is the content type,but what's the prefix 1f 21?

3)8A 3c 62 74 6D 6D 73 2E 73 6D 69 6c 3E 00,8A is msg class you consider it as start?

4)89 61 70 70 6c 69 63 61 74 69 6F 6E 2F 73 6D 69 6C 00, 89 means from, why you refer it as type?


thanks in advance

Anonymous
Posted on Friday, September 26, 2003 - 09:14 am:   

who can help me? thanks lots
Anonymous
Posted on Friday, September 26, 2003 - 09:23 am:   

Am I missing any specs? I wonder which ones I should read before trying to interpret the MMS binary. Thank you all.
Bryce Norwood - NowSMS Support
Posted on Saturday, October 11, 2003 - 09:30 pm:   


quote:

In the explanation you gave, there are 4 points that still confuse me.




It does help to refer to the specifications. For this area, everything is covered in the Open Mobile Alliance's WSP and MMS Encapsulation Specifications.

To provide some pointers on the specific questions ...


quote:

1)0x90 the version is? major /minor?




A good explanation of this encoding is provided in the following thread:

http://support.nowsms.com/discus/messages/12/522.html
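
In short, as I read the MMS Encapsulation specification, the version is a short-integer (high bit set), with the major version in the next three bits and the minor version in the low four bits. A minor value of 15 means that only a major version is present. A hedged Python sketch, with decode_mms_version being a name made up for this example:

```python
def decode_mms_version(octet):
    """Decode an MMS version octet such as 0x90 into a string like '1.0'."""
    if not octet & 0x80:
        raise ValueError("expected a short-integer (high bit set)")
    major = (octet >> 4) & 0x07     # three bits after the marker bit
    minor = octet & 0x0F            # low four bits
    if minor == 0x0F:               # 15 means "no minor version"
        return str(major)
    return f"{major}.{minor}"


version = decode_mms_version(0x90)
print(version)
```

So 0x90, which is 1001 0000 in binary, reads as major 1, minor 0.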


quote:

2)84 1f 21 b3 , b3 is the content type,but what's the prefix 1f 21?




Section 8.4.2.24 of the WSP specification documents the format of the content-type field.

The content type field can either have an encoding of "Constrained-media" (which is a text string or a single byte binary code such as "B3"), or "Content-general-form" for a more complex content type header.

For example, a text version of a more complex content type header might look like this:

Content-type: text/plain; charset="utf-8"

If the content type contains any parameters (such as charset in my simple example here), then you have to use "Content-general-form" encoding so that you do not lose the parameters.

Section 8.4.2.24 of the WSP spec defines "Content-general-form" as "Value-length Media-type".

"Value-length" refers to the length of the "Media-type" data in which the content type is encoded.

Section 8.4.2.2 of the WSP spec defines how "Value-length" is encoded. Basically, if the length is between 0 and 30 bytes, then it is encoded as a "short-length" value where the length is encoded as a single byte. If the length is longer than 30 bytes, then it is encoded as length-quote (31, or 0x1F) followed by a uintvar encoding of the length.

In the example above, 1F 21 is a "value-length" encoding of 0x21 (33 decimal) indicating that the "media-type" encoding that follows is 33 bytes in length.

Refer to the 2nd message on this thread for a translation of the content type header for this particular message.


quote:

3)8A 3c 62 74 6D 6D 73 2E 73 6D 69 6c 3E 00,8A is msg class you consider it as start?

4)89 61 70 70 6c 69 63 61 74 69 6F 6E 2F 73 6D 69 6C 00, 89 means from, why you refer it as type?




Nope. We're still decoding the content-type header. The 1F 21 bytes that start the content-type header tell us that we've got 33 bytes to decode for the content-type header (after the value-length itself).
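
To show how those 33 bytes break down, here is a rough Python sketch of the "Content-general-form" walk for this one header. It rests on several assumptions: MEDIA_TYPES and PARAMETERS contain only the codes that appear in this message (B3, 8A and 89 with the high bit stripped), the function name is hypothetical, and only well-known parameters with null-terminated text values are handled.

```python
# Tiny subsets of the WSP tables, limited to what this message uses.
MEDIA_TYPES = {0x33: "application/vnd.wap.multipart.related"}
PARAMETERS = {0x09: "type", 0x0A: "start"}


def decode_content_type(data):
    """Decode a Content-Type value (the bytes after the 0x84 field code)."""
    pos = 0
    if data[pos] == 0x1F:               # length-quote: uintvar length follows
        pos += 1
        length = 0
        while True:
            byte = data[pos]
            pos += 1
            length = (length << 7) | (byte & 0x7F)
            if not byte & 0x80:
                break
    else:                               # short-length: 0..30 in one byte
        length = data[pos]
        pos += 1
    end = pos + length
    media = MEDIA_TYPES[data[pos] & 0x7F]
    pos += 1
    params = {}
    while pos < end:
        name = PARAMETERS[data[pos] & 0x7F]
        stop = data.index(0, pos + 1)   # text value ends at the null byte
        params[name] = data[pos + 1:stop].decode("ascii")
        pos = stop + 1
    return media, params


value = bytes.fromhex(
    "1F 21 B3"
    " 8A 3C 62 74 6D 6D 73 2E 73 6D 69 6C 3E 00"
    " 89 61 70 70 6C 69 63 61 74 69 6F 6E 2F 73 6D 69 6C 00"
)
media, params = decode_content_type(value)
print(media, params)
```

Read this way, 8A and 89 are the "start" and "type" parameters of the content type, not the message-class and from headers, because they sit inside the 33 bytes announced by 1F 21.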
Anonymous
 
Posted on Tuesday, November 04, 2003 - 03:54 am:   

In multipart MIME, there is the following PDU, and I can't parse it the way you described.
[07 2D 83 7A 19 61 70 70 6C 69 .....]

[headerslen][datalen][contenttype][headers][data]

07 ----Number of Entries, 7 files
2D ---- headerslen =45 bytes
83 7A ---- datalen = 506 bytes
19-----???????? what's the meaning?
61 70 70 6C 69 ...---content type application/smil
.....
am i right ?
Anonymous
 
Posted on Wednesday, November 05, 2003 - 03:27 am:   

Is 19 the length of the Content-type?
Value-length = short-integer, followed by a text string?
Bryce Norwood - NowSMS Support
Posted on Thursday, November 13, 2003 - 09:16 pm:   

Yes, 0x19 is the length of the content-type. If this byte is 0x00 thru 0x1E, then you have a single byte length. If this byte is 0x1F, then that is a length quote, and the length follows as a uintvar. If this byte is 0x20 thru 0x7F, then it is a null terminated text string. If this byte is 0x80 thru 0xFF, then it is a single byte binary encoding of the content type.

When the content type field starts with a length, this is because there are parameters associated with the content type (e.g., charset=, name=, etc.)
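
The four ranges above translate directly into a small dispatch on the first byte. A minimal Python sketch; the function name and the returned labels are my own wording, not terms from the specification:

```python
def classify_content_type_lead(byte):
    """Classify the first byte of a content-type value by its range."""
    if byte <= 0x1E:
        return "short-length"       # single byte length, parameters follow
    if byte == 0x1F:
        return "length-quote"       # uintvar length follows
    if byte <= 0x7F:
        return "text-string"        # null terminated text content type
    return "well-known"             # single byte binary content type


for lead in (0x19, 0x1F, 0x61, 0xB3):
    print(hex(lead), classify_content_type_lead(lead))
```

In the PDU from the question, 0x19 falls in the first range, so the next 25 bytes hold the content type together with its parameters.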
Shubha B
Unregistered guest
Posted on Monday, November 24, 2003 - 07:04 am:   

How do we handle decoding for Encoded-string-value?
WAP209 says Encoded-string-value= value-length charset text string.
If we find some charset like UTF 8, then the text string is a simple text string. If charset = UCS2 or UCS4 or UTF16, what is the format of the following text string? Can you give me an example and explain. Thanks in advance
Bryce Norwood - NowSMS Support
Posted on Wednesday, November 26, 2003 - 03:33 am:   

Shubha,

That is a good question. I cannot claim to know the answer to that question.

However, my interpretation would be that it is not valid to use UCS2 or UCS4 or UTF16 as the character set for a header value.

Let me explain why I have this interpretation.

As you've seen, WAP-209 defines encoded-string-value like this:


quote:

Encoded-string-value = Text-string | Value-length Char-set Text-string




WAP-230 defines Text-string as follows:


quote:

Text-string = [Quote] *TEXT End-of-string
; If the first character in the TEXT is in the range of 128-255, a Quote character must precede it.
; Otherwise the Quote character must be omitted. The Quote is not part of the contents.




And "TEXT" is defined as follows:


quote:

The rules for Token,
TEXT and OCTET have the same definition as per [RFC2616].




RFC2616 defines "TEXT" as:


quote:

TEXT = <any OCTET except CTLs, but including LWS>

OCTET = <any 8-bit sequence of data>


CTL = <any US-ASCII control character (octets 0 - 31) and DEL (127)>





So a Text-string cannot contain octets 0 thru 31 (except certain white space characters/LWS), or 127.

Therefore, it is not possible to represent UCS2 or UCS4 or UTF16 in the Text-String encoding, as those encodings can contain octets that are not allowed within the Text-string encoding.

Of course, this is just my interpretation.
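
The argument can be checked with a few lines of Python. This is a simplified sketch: the helper name fits_text_string is hypothetical, and treating HT, LF and CR as always acceptable is only an approximation of the LWS exception in RFC2616.

```python
CTLS = set(range(0, 32)) | {127}    # US-ASCII control characters and DEL
LWS_OCTETS = {9, 10, 13}            # HT, LF, CR can occur as part of LWS


def fits_text_string(payload):
    """True if no octet in payload is a CTL outside the LWS exception."""
    return not any(b in CTLS and b not in LWS_OCTETS for b in payload)


subject = "BT Ignite MMS"
utf8_ok = fits_text_string(subject.encode("utf-8"))
utf16_ok = fits_text_string(subject.encode("utf-16-be"))
print(utf8_ok, utf16_ok)
```

The UTF-16 form of even this plain ASCII subject contains 0x00 octets, which would also collide with the End-of-string terminator, while the UTF-8 form passes.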