Close_wait bug?

Gonzalo Escuder Bell
New member
Username: Gescuder

Post Number: 1
Registered: 03-2004
Posted on Saturday, April 24, 2004 - 10:39 pm:   

Hi,

I'm having some trouble with NowSMS: "mmsc.exe" leaves TCP connections in the "close_wait" state. This drives the CPU to 100% usage and causes serious performance problems.

As I understand it, this happens when:

"In a scenario where the client sends a close, the server acknowledges this and sends whatever data is still in its buffers. Server in CLOSE_WAIT state. The server will only close the connection once it has sent a FIN to the client and received an ACK for that.

CLOSE_WAIT state means the other end of the connection has been closed while the local end is still waiting for the app to close."

Is there any patch to fix the close_wait timer or to force mmsc.exe to close sockets that have been in "close_wait" state for too long?

Thank you!

Bryce Norwood - NowSMS Support
Board Administrator
Username: Bryce

Post Number: 2413
Registered: 10-2002
Posted on Wednesday, April 28, 2004 - 08:01 pm:   

Hi,

That doesn't really make any sense. The MMSC never waits more than 2 minutes on a socket before it closes the connection.

So I can't imagine why there would be any CLOSE_WAIT connections.

That said, we did just make some adjustments yesterday in how the MMSC closes sockets.

Try installing the v5.50 beta release (download link from our home page). Then replace MMSC.EXE with the version in http://www.nowsms.com/download/richie.zip. That MMSC.EXE in the ZIP is a newer version than what is in the current v5.50 beta download, and it specifically includes a fix to gracefully close sockets.

I would not expect that to affect a CLOSE_WAIT problem, however. I simply haven't seen that, and don't understand how we could be leaving sockets in that state.

If you're hesitant to install the beta on a production system, v5.50 should be released in the next few days, and we are planning to include the fix that gracefully closes sockets.

-bn
Gonzalo Escuder Bell
New member
Username: Gescuder

Post Number: 2
Registered: 03-2004
Posted on Wednesday, April 28, 2004 - 09:55 pm:   

Thank you very much!
The problem has not occurred for the last 2 days.
We have also changed the WAP gateway, so that could be a factor. I will try the new NowSMS version too.
Bye.
Gonzalo Escuder Bell
New member
Username: Gescuder

Post Number: 3
Registered: 03-2004
Posted on Sunday, May 02, 2004 - 06:04 pm:   

Hi Bryce,

Yesterday the "close_wait" state happened again, pushing the CPU to 100% (in this state, mmsc.exe continues to receive and send messages, but starves the PC of resources). All right, I'll try upgrading to the next version, but in the meantime I've sniffed the connections.
To follow the conclusions below, you need an understanding of the TCP states involved in closing a connection.
The reasoning is as follows:

As you said, in the majority of cases mmsc.exe is the side that initiates the close (for both HTTP and SMTP), so it cannot end up in the "close_wait" state (it would pass through FIN_WAIT_1, FIN_WAIT_2 or TIME_WAIT instead). This is normal and works properly (TIME_WAIT states lasting 2 minutes are normal).
It seems that in some abnormal cases (luckily not often), mmsc.exe receives the TCP FIN from the other side, which is trying to close the connection. When this happens, mmsc.exe sends back an ACK and stays in the "close_wait" state (as expected). To close the connection properly, mmsc.exe SHOULD now send a FIN to the other side and wait for the final ACK. The problem may be happening here: mmsc.exe remains in the "close_wait" state forever.
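
To make the sequence above concrete, here is a minimal Python sketch of a server that handles a close initiated by the other side. It is not NowSMS code; the function names and the loopback setup are assumptions made purely for illustration. The point it shows is that the empty read signals the peer's FIN, and that the application's own close() call is what lets the socket leave CLOSE_WAIT.

```python
import socket


def serve_one_connection(listener):
    """Accept one connection, read until the peer closes, then close our side."""
    conn, _addr = listener.accept()
    received = b""
    try:
        while True:
            chunk = conn.recv(4096)
            if not chunk:
                # Empty read: the peer has sent its FIN. The OS has already
                # ACKed it, so this socket is now in CLOSE_WAIT.
                break
            received += chunk
    finally:
        # close() is what makes the OS send our own FIN. If the application
        # never reaches this call, the socket stays in CLOSE_WAIT.
        conn.close()
    return received


def demo():
    """Hypothetical loopback demo: the client side initiates the close."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(1)
    port = listener.getsockname()[1]

    client = socket.create_connection(("127.0.0.1", port))
    client.sendall(b"hello")
    client.close()  # the remote side sends its FIN first

    data = serve_one_connection(listener)
    listener.close()
    return data
```

If the close() call in the finally block were skipped, netstat would keep showing the accepted socket in CLOSE_WAIT for as long as the process held it open.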

As far as I can tell, this is a problem in mmsc.exe (as I said, in the abnormal case where the other side initiates the close).
I will try to capture this and send it to you. I hope it will be useful for fixing the problem (maybe it's already fixed in version 5.50).

Gonzalo
Bryce Norwood - NowSMS Support
Board Administrator
Username: Bryce

Post Number: 2446
Registered: 10-2002
Posted on Monday, May 03, 2004 - 03:04 pm:   

Gonzalo,

What I meant by my previous statement is that MMSC.EXE should never leave a connection open for more than 2 minutes after it stops receiving data on it.

So, if the other side of the connection initiates a connection close, NowSMS should never wait more than 2 minutes before it terminates its end of the connection. (Generally it is much quicker than that, but 2 minutes is a worst case.)
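
For readers who want to see what such an idle limit can look like in code, here is a minimal Python sketch. It is only an illustration under assumed names (read_until_closed_or_idle, IDLE_TIMEOUT_SECONDS), not how MMSC.EXE is actually written: the socket is closed either when the peer closes or when nothing arrives within the timeout.

```python
import socket

IDLE_TIMEOUT_SECONDS = 120  # the 2-minute worst case described above


def read_until_closed_or_idle(conn, idle_timeout=IDLE_TIMEOUT_SECONDS):
    """Read from conn and always close it, returning (data, reason)."""
    conn.settimeout(idle_timeout)  # applies to each recv() call
    received = b""
    reason = "peer closed"
    try:
        while True:
            chunk = conn.recv(4096)
            if not chunk:
                break  # peer closed its end of the connection
            received += chunk
    except socket.timeout:
        reason = "idle timeout"  # no data arrived within idle_timeout
    finally:
        conn.close()  # our end is closed in both cases
    return received, reason
```

Either way the application closes its end, so a connection handled like this should not linger in CLOSE_WAIT beyond the timeout.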

I've seen applications leave sockets in a CLOSE_WAIT state because the application does not close its socket. I do not believe we have any such problem in MMSC.EXE.

But when you talk about FIN and ACK processing ... that is all handled at the operating system level.

What OS are you using? Do you have all of the latest service packs for that OS installed?

Do you by any chance have a virus scanner installed on the PC? Can you remove that virus scanner? (Many virus scanners do intercept HTTP connections, and can become confused and overwhelmed by server applications like NowSMS. We have seen 100% CPU utilisation problems in the past that were caused by virus scanners.)

I've asked a couple of NowSMS users to send me NETSTAT -a reports to see if they are experiencing CLOSE_WAIT problems. But I haven't seen any.
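
If it helps when checking such a report, the following Python sketch counts connections per TCP state. It assumes the Windows-style column layout (protocol, local address, foreign address, state); the function name and the sample rows are made up for illustration and are not taken from any real system.

```python
from collections import Counter


def count_tcp_states(netstat_output):
    """Count TCP connections per state in the text printed by `netstat -a`."""
    states = Counter()
    for line in netstat_output.splitlines():
        fields = line.split()
        # Assumed row shape: TCP  <local address>  <foreign address>  <state>
        if len(fields) >= 4 and fields[0].upper().startswith("TCP"):
            states[fields[3].upper()] += 1
    return states


# Invented sample rows (documentation-range addresses), for illustration only.
SAMPLE = """\
  Proto  Local Address          Foreign Address        State
  TCP    0.0.0.0:80             0.0.0.0:0              LISTENING
  TCP    192.0.2.10:80          198.51.100.7:1045      CLOSE_WAIT
  TCP    192.0.2.10:80          198.51.100.8:2210      CLOSE_WAIT
  TCP    192.0.2.10:25          198.51.100.9:3391      TIME_WAIT
  UDP    0.0.0.0:161            *:*
"""

if __name__ == "__main__":
    for state, count in count_tcp_states(SAMPLE).most_common():
        print(state, count)
```

A steadily growing CLOSE_WAIT count across several reports would be the thing to look for.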

So I'm not sure what could be happening on your system (other than the virus scanner idea).

I'd suggest enabling the debug log in the MMSC. You can do this by editing MMSC.INI, and under the [MMSC] header, add Debug=Yes.
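
In other words, the relevant part of MMSC.INI would look roughly like this (any other settings already under the [MMSC] header stay as they are):

```ini
[MMSC]
Debug=Yes
```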

Then restart the MMSC service.

An MMSCDEBUG.LOG file will be produced as the gateway runs.

Perhaps there is some sort of unusual transaction occurring, or an unusual configuration issue, that is leading to problems. If you e-mail this MMSCDEBUG.LOG to nowsms@now.co.uk when the system hits the 100% CPU utilisation problem, then I can look over the file to see if anything appears unusual.

-bn