Building Highly Scalable Servers with Java NIO
Subject:   Design advice- Thread distribution for NIO on multiple CPU,
Date:   2005-12-26 15:54:02
From:   nuno.santos
Response to: Design advice- Thread distribution for NIO on multiple CPU,


I stopped working with NIO more than a year ago, but I'll give you my two cents anyway...

300 connections per minute is not much; a modern server can easily handle that using a threaded model. But as a matter of principle, I would not design for the average case, or else the system might collapse during usage peaks. I would assume at least 1,000 connections per minute and consider mechanisms for graceful degradation if the system ever goes over that value.

Having said that, it's also important to know how long it takes to transmit the full data of a client. You mentioned GPRS, which is rather slow and can be very slow if transmission conditions are bad. Under ideal conditions I would suppose a connection takes 10 seconds. Under not-so-good conditions, I would assume perhaps 20 seconds.

This means 1,000 new TCP connections per minute, each lasting 20 seconds, which gives an average of 333 open connections at any time (1,000 / 60 x 20). This is a worst-case scenario. If you want to support this case, I would definitely suggest NIO, as keeping 300 threads alive can seriously hurt the throughput of the server by wasting time on context switches.
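The estimate above is just arrival rate multiplied by connection duration. A minimal sketch of that arithmetic follows, so you can plug in your own numbers; the class and method names are made up for illustration, and the 10 and 20 second durations are my guesses from above, not measurements.

```java
final class ConnectionEstimate {
    /**
     * Average number of connections open at the same time:
     * arrival rate (per second) multiplied by how long each connection lasts.
     */
    static double averageConcurrent(double connectionsPerMinute, double secondsPerConnection) {
        return connectionsPerMinute / 60.0 * secondsPerConnection;
    }

    public static void main(String[] args) {
        // Assumed figures from the discussion: 300 or 1,000 connections per
        // minute, 10 s under ideal GPRS conditions, 20 s under poor ones.
        System.out.printf("average load, ideal link: %.0f%n", averageConcurrent(300, 10));
        System.out.printf("peak load, ideal link:    %.0f%n", averageConcurrent(1000, 10));
        System.out.printf("peak load, poor link:     %.0f%n", averageConcurrent(1000, 20));
    }
}
```

Keep in mind this is an average; bursts inside a single minute can push the instantaneous count higher.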

Concerning the threading model, I became a big fan of doing the work on the same thread that is running the selector. This simplifies programming, as you don't have to worry about handing control of a socket from one thread to another. It eliminates the time wasted on context switches and on synchronization between threads (which is always a waste of CPU cycles). Finally, it provides very good scalability: if the server starts to be overloaded, the selector thread will spend more time processing requests and will call select less often. But whenever it does call select, it gets back a lot of work ready to be done (everything that accumulated since the last call to select). This way, the thread enters a "bulk mode", which actually improves efficiency by making fewer system calls.
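To make the idea concrete, here is a minimal sketch of a loop where the selector thread also does the work. The class name SingleThreadServer is made up, and echoing the bytes back is only a stand-in for your real processing; it assumes small replies, since a real server would register for OP_WRITE instead of looping on write.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.ClosedSelectorException;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.util.Iterator;

final class SingleThreadServer implements Runnable {
    private final Selector selector;
    private final ServerSocketChannel server;

    SingleThreadServer(int port) throws IOException {
        selector = Selector.open();
        server = ServerSocketChannel.open();
        server.configureBlocking(false);
        server.bind(new InetSocketAddress("127.0.0.1", port)); // port 0 = any free port
        server.register(selector, SelectionKey.OP_ACCEPT);
    }

    int port() throws IOException {
        return ((InetSocketAddress) server.getLocalAddress()).getPort();
    }

    @Override
    public void run() {
        try {
            while (selector.isOpen()) {
                // Under load, one select call returns many ready channels at once.
                selector.select();
                Iterator<SelectionKey> it = selector.selectedKeys().iterator();
                while (it.hasNext()) {
                    SelectionKey key = it.next();
                    it.remove();
                    if (!key.isValid()) continue;
                    if (key.isAcceptable()) accept();
                    else if (key.isReadable()) handle(key);
                }
            }
        } catch (IOException | ClosedSelectorException e) {
            // Selector closed or broken: let the thread end.
        }
    }

    private void accept() throws IOException {
        SocketChannel client = server.accept();
        if (client == null) return;
        client.configureBlocking(false);
        // Each connection carries its own buffer as the key attachment.
        client.register(selector, SelectionKey.OP_READ, ByteBuffer.allocate(4096));
    }

    /** Does the work right here on the selector thread; no hand-off, no locks. */
    private void handle(SelectionKey key) {
        SocketChannel client = (SocketChannel) key.channel();
        ByteBuffer buf = (ByteBuffer) key.attachment();
        try {
            if (client.read(buf) < 0) { client.close(); return; }
            buf.flip();
            while (buf.hasRemaining()) client.write(buf); // placeholder work: echo
            buf.clear();
        } catch (IOException e) {
            try { client.close(); } catch (IOException ignored) { }
        }
    }

    void close() throws IOException {
        selector.close();
        server.close();
    }
}
```

You would start it with something like new Thread(new SingleThreadServer(0)).start(). Note that nothing in handle may block, otherwise every other connection on this selector waits too.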

So in your place I would go for an NIO architecture. I'm not sure about the best number of selector threads. If the writes into the database can block the thread (which can be the case if the database is configured to synchronize to the hard drive to ensure that the data is really written), then you would need a larger number of selector threads to hide this latency, probably 10 or 20. If the writes to the DB are not blocking (if they are small and use the write buffers of the hard drive or of the OS), then 4 or 5 should be enough. But some experiments should give you the right answer.
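One way to run several selector threads is sketched below, with the thread count as a constructor parameter so it is easy to experiment with. This is only a sketch under my own assumptions: SelectorPool, Loop, adopt and pick are invented names, connections are assigned round-robin, and process is a placeholder where the parsing and the database write would go.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.ClosedSelectorException;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.SocketChannel;
import java.util.Iterator;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.atomic.AtomicInteger;

final class SelectorPool {
    /** One selector loop. A connection stays on the loop that adopted it. */
    static final class Loop implements Runnable {
        private final Selector selector;
        private final Queue<SocketChannel> pending = new ConcurrentLinkedQueue<>();
        private final ByteBuffer scratch = ByteBuffer.allocate(4096);

        Loop() throws IOException { selector = Selector.open(); }

        /** Called from the accepting thread: queue the channel and wake the loop. */
        void adopt(SocketChannel channel) {
            pending.add(channel);
            selector.wakeup();
        }

        @Override
        public void run() {
            try {
                while (selector.isOpen()) {
                    selector.select();
                    registerPending();
                    Iterator<SelectionKey> it = selector.selectedKeys().iterator();
                    while (it.hasNext()) {
                        SelectionKey key = it.next();
                        it.remove();
                        if (key.isValid() && key.isReadable()) process(key);
                    }
                }
            } catch (IOException | ClosedSelectorException e) {
                // Selector closed or broken: let the thread end.
            }
        }

        /** Registration happens on the loop's own thread, never on the acceptor. */
        private void registerPending() {
            SocketChannel channel;
            while ((channel = pending.poll()) != null) {
                try {
                    channel.configureBlocking(false);
                    channel.register(selector, SelectionKey.OP_READ);
                } catch (IOException e) {
                    // Channel was already closed: skip it.
                }
            }
        }

        /** Placeholder for the real work (parse the data, write to the database). */
        private void process(SelectionKey key) {
            try {
                scratch.clear();
                if (((SocketChannel) key.channel()).read(scratch) < 0) key.channel().close();
            } catch (IOException e) {
                try { key.channel().close(); } catch (IOException ignored) { }
            }
        }
    }

    private final Loop[] loops;
    private final AtomicInteger next = new AtomicInteger();

    SelectorPool(int threads) throws IOException {
        loops = new Loop[threads];
        for (int i = 0; i < threads; i++) {
            loops[i] = new Loop();
            Thread t = new Thread(loops[i], "selector-" + i);
            t.setDaemon(true);
            t.start();
        }
    }

    /** Round-robin choice of the loop that will own a new connection. */
    Loop pick() {
        return loops[Math.floorMod(next.getAndIncrement(), loops.length)];
    }

    void close() throws IOException {
        for (Loop loop : loops) loop.selector.close();
    }
}
```

The accepting thread would call pool.pick().adopt(channel) once per accepted connection. After that the socket never changes threads, so the benefits of the single-thread model hold within each loop, and a blocking database write only stalls the connections owned by that one loop.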

Hope this helps.
Nuno Santos