Segments of the same message submitting to different SMSCs
Author | Message | |||
ashot shahbazian Frequent Contributor Username: Animatele Post Number: 59 Registered: 06-2004 |
Bryce, Des:

We've been noticing this occasionally.

Received from the customer on the 1st server and submitted to the 2nd server upstream:

2009-11-12 13:53:42,SAR-+7918554xxxx-6a-2-1.req,127.0.0.1,+7918554xxxx,OK -- SMPP - 00russag:1111,SubmitUser=dixxxx;Sender=+BANK;SMSCMsgId=SAR-+7918554xxxx-6a-2-1;UDH=0500036A0201;Text="OPLATA NA SUMMU
2009-11-12 13:53:43,SAR-+7918554xxxx-6a-2-2.req,127.0.0.1,+7918554xxxx,OK -- SMPP - 00russag:1111,SubmitUser=dixxxx;Sender=+BANK;SMSCMsgId=SAR-+7918554xxxx-6a-2-2;UDH=0500036A0202;Text="ROSTOV-NA-DON

One segment was delivered, the other was not (by the receipts, part 2 was delivered and part 1 was not):

2009-11-12 13:53:54,7EE846B1.req,,+BANK,OK -- LocalUser:dixxxx,SubmitUser=SMPP - 00russag#2:1111;Sender=+7918554xxxx;Text="id:SAR-+7918554xxxx-6a-2-2 sub:001 dlvrd:001 submit date:0911121353 done date:0911121353 stat:DELIVRD err:000"
2009-11-12 14:03:54,7EE86DFC.req,,+BANK,OK -- LocalUser:dixxxx,SubmitUser=SMPP - 00russag#2:1111;Sender=+7918554xxxx;Text="id:SAR-+7918554xxxx-6a-2-1 sub:0 dlvrd:0 submit date:0911121353 done date:0911121403 stat:UNDELIV err:0 Text:"

Submitted upstream to the SMSC on the second server. The first segment went to one SMSC (smsc1:2222):

2009-11-12 13:53:42,SAR-+7918554xxxx-6a-2-1.req,10.0.0.6,+7918554xxxx,OK -- SMPP - smsc1#2:2222,SubmitUser=00russag;Sender=BANK;SMSCMsgId=679299002;UDH=0500036A0201;Text="OPLATA NA SUMMU

The second segment was submitted via another SMSC (smsc2:3333):

2009-11-12 13:53:43,SAR-+7918554xxxx-6a-2-2.req,10.0.0.6,+7918554xxxx,OK -- SMPP - smsc2:3333,SubmitUser=00russag;Sender=BANK;SMSCMsgId=5403168604891832374;UDH=0500036A0202;Text="ROSTOV-NA-DON

Again one segment was delivered and the other failed (part 2 delivered via smsc2, part 1 undelivered via smsc1):

2009-11-12 13:53:52,54158F11.req,,BANK,OK -- LocalUser:00russag,SubmitUser=SMPP - smsc2:3333;Sender=+7918554xxxx;Text="id:SAR-+7918554xxxx-6a-2-2 sub:001 dlvrd:001 submit date:0911121353 done date:0911121353 stat:DELIVRD err:000"
2009-11-12 14:03:52,541596AE.req,,BANK,OK -- LocalUser:00russag,SubmitUser=SMPP - smsc1#4:2222;Sender=+7918554xxxx;Text="id:SAR-+7918554xxxx-6a-2-1 sub:0 dlvrd:0 submit date:0911121353 done date:0911121403 stat:UNDELIV err:0 Text:"

We don’t have tools to track this, as we presume different segments must always submit over the same link. So I can’t say how often it is happening, but we receive one such complaint about every 4-5 weeks. I think most have involved messages with alphanumeric source addresses.

The BANK source address is not in the Sender Address field for either of the uplinks (if it were, it would have been happening to a lot more segmented messages, unless you’ve fixed it). There is no AllowedUser restriction for these uplinks. The +7 is not in the RoutePrefOnly lists. RoutePrefOnly is not set for the uplinks.

However, there are a couple of hundred prefixes on each uplink as RouteX=yyyy (prefixes are different on different uplinks) to force messages for a particular destination route over this uplink only (there are indications that this isn’t always performed accurately), while if no prefixes are set the messages load-share across different uplinks (such is the case with Russia, +7).

The closest matches in the prefix lists to the number in question are:

On smsc2:
Route159=+7705*
Route160=+7777*

On smsc1:
Route114=+70*
Route115=+77*
Route116=+7701*
Route117=+7702*
Route118=+7705*
Route119=+7777*

Not sure why 7705 and 7777 are on both, but that’s Kazakhstan, not Russia.

The NowSMS version on the second server (the last in the chain before the SMSC-s) is 2009.07.09. Perhaps there was a patch for this already and we’ve missed it? I think someone else has been asking about a similar problem.

Kind regards, Ashot | |||
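For readers following the logs: the UDH values shown (for example 0500036A0201) are standard concatenation headers, and decoding them shows which message each segment belongs to. The sketch below is a minimal decoder for the 8-bit and 16-bit concatenation information elements; the names `ConcatInfo` and `parse_concat_udh` are made up for this illustration and are not part of NowSMS. In the logs above, the SAR file names appear to carry the same three fields (reference, total, sequence), but that is an observation from these samples, not a documented guarantee.

```python
from typing import NamedTuple, Optional


class ConcatInfo(NamedTuple):
    reference: int  # message reference shared by all parts
    total: int      # total number of parts in the message
    sequence: int   # 1-based index of this part


def parse_concat_udh(udh_hex: str) -> Optional[ConcatInfo]:
    """Return concatenation info from a hex-encoded UDH, or None if absent."""
    data = bytes.fromhex(udh_hex)
    if not data:
        return None
    udhl = data[0]              # first octet: length of the header that follows
    body = data[1:1 + udhl]
    i = 0
    # Walk the information elements: IEI, length, then that many value octets.
    while i + 2 <= len(body):
        iei, iedl = body[i], body[i + 1]
        value = body[i + 2:i + 2 + iedl]
        if iei == 0x00 and len(value) == 3:   # concatenation, 8-bit reference
            return ConcatInfo(value[0], value[1], value[2])
        if iei == 0x08 and len(value) == 4:   # concatenation, 16-bit reference
            return ConcatInfo((value[0] << 8) | value[1], value[2], value[3])
        i += 2 + iedl
    return None
```

Applied to the two segments above, both headers carry reference 0x6A with a total of 2 parts, so any routing decision keyed on (destination, reference) would see them as the same message.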
Bryce Norwood - NowSMS Support Board Administrator Username: Bryce Post Number: 7869 Registered: 10-2002 |
Hi Ashot, I think this is a different issue. I do recall another issue, but it was resolved prior to 2009.07.09. I'm suspicious regarding the timings. I think the problem is that we've already dispatched part 1 by the time we receive part 2. That presents a problem for the tracking. I was concerned about this possibility when we encountered the earlier problem. But at the time, we didn't run any tests to try to confirm that this was a problem. Clearly, this scenario is a problem. We need to spend some time working on a solution. -bn | |||
ashot shahbazian Frequent Contributor Username: Animatele Post Number: 61 Registered: 06-2004 |
Hi Bryce,

Could this be because of the stray .lck file issue? Perhaps in some cases you are deleting .lck files earlier than needed? Or not creating them quickly enough? This could be happening while we were having intermittent connection problems or delays (slow submit_sm_resp) to the 1st SMSC, smsc1#2:2222.

Here's what's in the SMSWEB log of the 2nd-in-chain server for this message:

2009-11-12 13:53:41,10.0.0.6,00russag,SAR-+7918554xxxx-6a-2-1.req,Binary
2009-11-12 13:53:42,10.0.0.6,00russag,SAR-+7918554xxxx-6a-2-2.req,Binary

But since the SMSOUT log records on receipt of the submit_sm_resp, and both entries are 1 sec. apart from the message receipt on the server, the SMSC delay could not be the reason. Clocks on this and the 1st server are not in sync, obviously.

I know this must be tough to troubleshoot; I’m not even sure how to recreate this scenario. Let me dig for more of these in the logs.

Amazing! This never happens with any segmented messages except those from one particular customer. All fit exactly the same pattern. Here are 3 examples from the SMSOUT and respective SMSWEB logs:

2009-11-21 00:04:56,SAR-+7916133xxxx-99-2-1.req,10.0.0.6,+7916133xxxx,OK -- SMPP - smsc2:3333,SubmitUser=00russag;Sender=Bank;SMSCMsgId=5406294800549939766;DCS=8;UDH=050003990201;Text=""
2009-11-21 00:04:59,SAR-+7916133xxxx-99-2-2.req,10.0.0.6,+7916133xxxx,OK -- SMPP - smsc1#10:2222,SubmitUser=00russag;Sender=Bank;SMSCMsgId=687563595;DCS=8;UDH=050003990202;Text=""
2009-11-21 00:04:56,10.0.0.6,00russag,SAR-+7916133xxxx-99-2-1.req,Binary
2009-11-21 00:04:58,10.0.0.6,00russag,SAR-+7916133xxxx-99-2-2.req,Binary

2009-11-21 00:16:17,SAR-+7911829xxxx-9e-2-1.req,10.0.0.6,+7911829xxxx,OK -- SMPP - smsc2:3333,SubmitUser=00russag;Sender=Bank;SMSCMsgId=5406297725422758454;DCS=8;UDH=0500039E0201;Text=""
2009-11-21 00:16:18,SAR-+7911829xxxx-9e-2-2.req,10.0.0.6,+7911829xxxx,OK -- SMPP - smsc1#5:2222,SubmitUser=00russag;Sender=Bank;SMSCMsgId=687566174;DCS=8;UDH=0500039E0202;Text=""
2009-11-21 00:16:17,10.0.0.6,00russag,SAR-+7911829xxxx-9e-2-1.req,Binary
2009-11-21 00:16:18,10.0.0.6,00russag,SAR-+7911829xxxx-9e-2-2.req,Binary

2009-11-21 00:50:42,SAR-+7962101xxxx-a8-2-1.req,10.0.0.6,+7962101xxxx,OK -- SMPP - smsc2:3333,SubmitUser=00russag;Sender=Bank;SMSCMsgId=5406306590235377462;DCS=8;UDH=050003A80201;Text=""
2009-11-21 00:50:43,SAR-+7962101xxxx-a8-2-2.req,10.0.0.6,+7962101xxxx,OK -- SMPP - smsc1#2:2222,SubmitUser=00russag;Sender=Bank;SMSCMsgId=687571985;DCS=8;UDH=050003A80202;Text=""
2009-11-21 00:50:41,10.0.0.6,00russag,SAR-+7962101xxxx-a8-2-1.req,Binary
2009-11-21 00:50:42,10.0.0.6,00russag,SAR-+7962101xxxx-a8-2-2.req,Binary

1. All are segmented Unicode
2. All have a “Bank” source addr
3. Despite being in Unicode, messages contain Latin letters only
4. The customer submits (to the 1st server) the source address with a wrong TON=1, but this seems to be corrected by the 1st server sending to this one (source addr in the 2nd server’s log does not contain the “+”)

I hope I’ve narrowed it down. This is definitely not happening with other concats, while this thread seems to be simply load-sharing and ignoring the segments.

If you need more info, such as the raw PDU-s from the SMSIN logs or debugs, I can provide it by secure means only (upload via HTTPS or FTPS), as these messages contain bank transaction info. Or I can send it over Skype.

Kind regards, Ashot | |||
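The manual comparison above (same SAR message, different SMSC per segment) can be automated over an SMSOUT log. The following is a rough sketch, not a NowSMS tool: the function name `find_split_messages` is hypothetical, the regular expression is inferred only from the sample lines in this thread, and it assumes that a "#N" suffix on the SMSC name identifies a session of the same SMSC definition and can be ignored.

```python
import re
from collections import defaultdict
from typing import Dict, Iterable, Tuple

# Pattern inferred from the sample SMSOUT lines in this thread (an assumption):
#   <timestamp>,SAR-<dest>-<ref>-<total>-<seq>.req,<ip>,<dest>,OK -- SMPP - <smsc>,...
LINE_RE = re.compile(
    r"SAR-(?P<dest>[^,]+?)-(?P<ref>[0-9A-Fa-f]+)-(?P<total>\d+)-(?P<seq>\d+)\.req,"
    r".*?OK -- SMPP - (?P<smsc>[^,;]+),"
)

MessageKey = Tuple[str, str, int]  # (destination, concat reference, total parts)


def find_split_messages(lines: Iterable[str]) -> Dict[MessageKey, Dict[int, str]]:
    """Return multipart messages whose segments went out via different SMSCs."""
    routes: Dict[MessageKey, Dict[int, str]] = defaultdict(dict)
    for line in lines:
        m = LINE_RE.search(line)
        if not m:
            continue  # SMSWEB lines, receipts and non-SAR lines are skipped
        key = (m["dest"], m["ref"].lower(), int(m["total"]))
        # Assumed: "smsc1#10:2222" is session 10 of "smsc1:2222"; drop the "#10".
        smsc = re.sub(r"#\d+(?=:)", "", m["smsc"])
        routes[key][int(m["seq"])] = smsc
    return {k: parts for k, parts in routes.items()
            if len(set(parts.values())) > 1}
```

Passing an open log file (or any iterable of lines) returns a mapping from message key to the SMSC used for each segment, restricted to messages that were split. Because 8-bit concatenation references repeat, a real scan over long logs would probably also need to bucket entries by time window.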
Bryce Norwood - NowSMS Support Board Administrator Username: Bryce Post Number: 7871 Registered: 10-2002 |
Hi Ashot, I've had consistent success recreating the problem simply by throttling a submitter so that it only submits a message every few seconds. We essentially forget about the routing of a multipart message if there are no parts remaining in our queue. We're experimenting with an adjustment. It looks good so far, but we need to put it under some more load. -bn | |||
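To illustrate the failure mode described here (the route for a multipart message is forgotten once no parts remain in the queue), one common approach is to remember the chosen route for a short time after the last part was seen, rather than only while parts are queued. The sketch below shows that idea only; it is not the actual NowSMS adjustment, and `StickyRouteCache`, its parameters, and the 300-second default are all hypothetical.

```python
import time
from typing import Callable, Dict, Tuple

# Hypothetical key: (destination, sender, concatenation reference)
Key = Tuple[str, str, int]


class StickyRouteCache:
    """Remember which SMSC a multipart message used, for a limited time."""

    def __init__(self, ttl_seconds: float = 300.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        self._routes: Dict[Key, Tuple[str, float]] = {}  # key -> (smsc, expiry)

    def route_for(self, key: Key, choose: Callable[[], str]) -> str:
        """Return the remembered SMSC for key, or pick one via choose()."""
        now = self._clock()
        # Drop entries whose time-to-live has passed.
        self._routes = {k: v for k, v in self._routes.items() if v[1] > now}
        entry = self._routes.get(key)
        smsc = entry[0] if entry else choose()
        # Refresh the expiry on every part, so a slow submitter that sends
        # one part every few seconds still lands on the same SMSC.
        self._routes[key] = (smsc, now + self._ttl)
        return smsc
```

Here `choose` stands in for whatever load-sharing logic would otherwise pick the uplink. The point of the design is that the lookup does not depend on other parts still being in the outbound queue, which is the condition that breaks when parts arrive seconds apart.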
ashot shahbazian Frequent Contributor Username: Animatele Post Number: 62 Registered: 06-2004 |
Bryce,

I'm positive that this is only happening with that "Bank" thread. I've spent hours searching (manually) through the log, which had a hundred times more concat messages with regular numeric source addresses (MSISDN with Source Addr TON=1), and couldn't find a single misrouted one. I'll ask the engineers to grep the logs to confirm it's not happening with regular messages.

There was no throttling, neither artificially set on the uplinks nor by the SMSC-s upstream, which is evident from the lack of latency between the WEB and OUT records.

Both SMSC definitions use verbose host names via the Hosts file. Both host names are 5 letters long and begin with an "a".

The smsc1 definition is:
- 12 TRX sessions with a window of 25 on each
- 2 RX sessions. For some reason they had both "Receive" and "Support any Outbound" checked, but that hasn't made them send anything, as "Send and Receive" was not checked. I’ve corrected it now.
- The Sender Address field of every transceiver session has a few comma-separated numeric (some beginning with +7) and alphanumeric addresses. “BANK” is not among them, but the verbose host name is. For example, the SMSC hostname is smsc1#8:2222, and “smsc1” is also in the list of sender addresses.
- Segmentation method is set to Default (7-bit not checked)
- Dest TON and NPI are set to 1 and 1

The smsc2 definition is very similar. The differences are:
- 1 TRX session only
- No Dest TON/NPI override
- The Sender Address field also contains a few addresses, but none matches those in the smsc1 definitions

As I mentioned, both SMSC-s have relatively long lists in “Preferred SMSC Connection for,” but “Support any outbound traffic” is also checked for both.

The “BANK” thread is specific in that:
- it has an alpha sender
- it is Unicode, despite the characters all being Latin
- it comes with a wrong Source Address TON on the 1st server, which is being corrected by, I think, the second server.

Note that we don’t use the Separate Outbound Message Queues setting. Also, if you recall, this version was made at our request so that it won’t create sub-folders in the \q folders.

The [SMSGW] section looks like:

[SMSGW]
TrackSMPPReceipts=Yes
RetryMaxAttempts=5
QDir=\xxx
LogDirectory=\yyy
WebAuth=Yes
WebMenu=Yes
WebPort=aaaaa
SMPPPort=bbbbb
SMPPPortSSL=ccccc
ReceiveSMS=No
ReceiveMMS=No
SeparateUserQueues=No

Two entries for [SMPP]:

[SMPP]
DefaultDelReceipt=Yes

and

[SMPP]
[Inbound SMS Routing]
testsatuser=xxxxxxxx

As you are already working on this, can you please also check and confirm that concats won’t misroute in these cases:

1. The same Sender Addresses are specified for different SMSC-s (this definitely was a problem in some past release)
2. The same RouteXX= entries are set for different SMSC-s
3. There’s no AllowedUser for smsc1 and there is one for smsc2, but the Sender Address in the message matches that specified for smsc1. Segments should all route through one uplink, according to whichever setting has the higher priority
4. Different segments of the same message came from different users. They should always route through the same uplink, regardless of the AllowedUser setting. Believe it or not, we commonly receive segments from different aggregators/hubs. When that happens, the last to arrive get stuck.
5. Routing by service_type is used. Routing for concats should still work properly.

In other words, proper routing of concatenated messages should take priority over other routing rules.

Kind regards, Ashot | |||
ashot shahbazian Frequent Contributor Username: Animatele Post Number: 63 Registered: 06-2004 |
Hi Bryce, Des, is there an update on this issue? Kind regards, Ashot | |||
Des - NowSMS Support Board Administrator Username: Desosms Post Number: 1558 Registered: 08-2008 |
Hi Ashot, Sorry, I forgot to follow-up on this. The update at http://www.nowsms.com/download/nowsmsupdate.zip includes a fix for the problem that Bryce described. -- Des NowSMS Support | |||
Des - NowSMS Support Board Administrator Username: Desosms Post Number: 1559 Registered: 08-2008 |
Actually I just tested that ZIP. It appears to be corrupt (incomplete upload). I just updated it. | |||
ashot shahbazian Frequent Contributor Username: Animatele Post Number: 64 Registered: 06-2004 |
Thanks Des! Patched it, will check in a couple of days if the trouble's gone. Kind regards, Ashot P.S. Funny you've mentioned the "incomplete file" issue. Often when I try downloading from the NowSMS site or mirrors I get an incomplete file, less than 1MB. The same just happened with this update when I clicked on the link, after you'd updated it. But when I tried "save target as" I got the complete file. Not sure if this has to do with my ancient browser (IE6) or if something is in fact wrong with the file. | |||
Des - NowSMS Support Board Administrator Username: Desosms Post Number: 1572 Registered: 08-2008 |
Hi Ashot, Our hosting provider (Rackspace) does seem to have a strange timeout issue with downloads. For a while, I did have a redirect that sent all downloads to www2.nowsms.com instead of www.nowsms.com to avoid problems with incomplete downloads. But it appears that this redirect was removed a while ago. Anyway ... the hosting provider is a cluster/cloud system with pretty good performance. But what we found was that the www cluster would time out downloads quickly when the browser asked the user what they wanted to do with the download. I tried some tests just now with Chrome (what I use most of the time) and IE8, and I didn't see any problems. But I did notice that both of these browsers keep downloading while they are prompting the user what to do with the file. So it may be the older browsers that we were originally having this problem with. Something else for me to keep an eye on. Right-click and "save target as" is probably a good suggestion for now. -- Des NowSMS Support | |||
ashot shahbazian Frequent Contributor Username: Animatele Post Number: 70 Registered: 06-2004 |
You're right, LOL! It'd time out every time I'd add a date to the name of the file before starting the download! If I don't it'd more often complete normally than not. Kind regards, Ashot | |||
ashot shahbazian Frequent Contributor Username: Animatele Post Number: 71 Registered: 06-2004 |
Checked for misrouting of segments in that "Bank" thread. Not a single case in the 24 hours after updating to v.2009.12.08, even though this was a hard day: one of the SMSC-s servicing this thread kept timing out messages. Great job, thanks much! Kind regards, Ashot |