Hangs after 16384 requests

Jan 21, 2009 at 5:06 PM
Hi,

Is there any particular reason that the HttpListener cannot take more than 16384 requests from the same source (didn't try different sources yet)?
Wanted to run some performance checks firing GET requests on the server. It performs great (less than 3ms per request), but after 16384 +/- 10 requests, it just stops. No error messages, no other strange things. It just doesn't answer any more.

Tried to search for overflows etc., but didn't find anything. Any way to activate keep-alive? Maybe it's too many ports being held open at the same time?

Any hints are more than appreciated :-)

Thanks!

Carsten
Coordinator
Jan 21, 2009 at 5:32 PM
Keep-Alive is supported and will be used if the client requests it (header: "Connection: Keep-Alive").
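
To see what the header changes at the socket level, here is a minimal sketch in Python (not the C# Webserver's code). It starts a throwaway local server, sends the same number of requests once over a single persistent connection and once with a fresh connection per request, and counts how many distinct client ports the server saw. The names `PortRecordingHandler` and `count_client_ports` are made up for this illustration.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class PortRecordingHandler(BaseHTTPRequestHandler):
    """Hypothetical handler that records the client port of every request."""

    protocol_version = "HTTP/1.1"  # allows persistent connections
    client_ports = []

    def do_GET(self):
        PortRecordingHandler.client_ports.append(self.client_address[1])
        body = b"ok"
        self.send_response(200)
        # An explicit length lets the client reuse the connection afterwards.
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo quiet


def count_client_ports(requests=20):
    """Return (distinct ports with keep-alive, distinct ports without)."""
    server = ThreadingHTTPServer(("127.0.0.1", 0), PortRecordingHandler)
    port = server.server_address[1]
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        # Phase 1: one connection, many requests.
        PortRecordingHandler.client_ports.clear()
        conn = http.client.HTTPConnection("127.0.0.1", port)
        for _ in range(requests):
            conn.request("GET", "/", headers={"Connection": "Keep-Alive"})
            conn.getresponse().read()  # drain the body before reusing
        conn.close()
        keep_alive_ports = set(PortRecordingHandler.client_ports)

        # Phase 2: a new connection (and client port) for every request.
        PortRecordingHandler.client_ports.clear()
        for _ in range(requests):
            conn = http.client.HTTPConnection("127.0.0.1", port)
            conn.request("GET", "/", headers={"Connection": "close"})
            conn.getresponse().read()
            conn.close()
        closed_ports = set(PortRecordingHandler.client_ports)
    finally:
        server.shutdown()
        server.server_close()
    return len(keep_alive_ports), len(closed_ports)
```

Calling `count_client_ports()` should report a single client port for the keep-alive phase and many for the other phase. Each of those short-lived connections occupies a port for a while after it is closed, which is where a per-machine port limit can come into play.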

I have no idea why it stops after 16384 requests. Sockets sounds like an idea. I do not pass any sockets to the HttpClientContextImp today, only a Stream (either a NetworkStream or a SslStream containing a NetworkStream). I do not know if closing a NetworkStream will also close a socket. If not, that might be the problem.



Jan 21, 2009 at 5:43 PM
Edited Jan 21, 2009 at 5:45 PM
Checked again running netstat in parallel. Obviously sockets are not the problem (it's only using a single socket for all non-concurrent requests).
Must be something else.

Maybe someone can confirm that? Here's the easiest way to find out:

On a Linux-based machine (or MacOS, or maybe even Windows -- but you will need to install Apache there) run:

$ ab -c 1 -n 40000 http://192.168.160.133:8081/
(obviously you need to replace the URL accordingly)


@jgauffin: Thanks for applying my patch for the NullReferenceException so quickly!


Edit: will try to build a multi-threaded web server using the HttpListener today or tomorrow (not a huge task anyway). Then we'll see whether it's an issue related to the code or the network, or maybe something completely different.

Jan 23, 2009 at 1:44 PM
Tried it again, multi-threaded this time. The bug must be somewhere on the way to (or rather before!) the OnRequest handler. At some point -- around 16384 requests -- it stops receiving requests and the client gets timeouts. However, no exception is raised. Can anyone confirm this oddity?

Jan 23, 2009 at 2:14 PM
Okay, false alarm here. Sorry about that! 
Keep-Alive wasn't working. Now it does and I can kick millions of requests through the ether.
I also did some more tests without keep-alive, and it seems to be a Windows issue. Apparently only 16384 sockets can be open at the same time. After that it takes a while to free resources before the server accepts requests again.

Any Windows experts here? Under Linux (probably we will run the server there at some point) that's all tweakable using /proc and sysctl. What's the situation on Windows? Thanks!
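
One possible explanation, not confirmed anywhere in this thread: 16384 is exactly the size of the default dynamic (ephemeral) TCP port range on Windows Vista and Windows Server 2008, which runs from 49152 to 65535. On those systems the range can be inspected with `netsh int ipv4 show dynamicport tcp`. The short Python sketch below only does the arithmetic. The function names are made up for illustration, and the 120-second hold time is an assumed value, since the real TIME_WAIT duration depends on the OS version and its settings.

```python
def port_count(first_port, last_port):
    """Number of ports in an inclusive range."""
    return last_port - first_port + 1


def sustainable_connections_per_second(ports, hold_seconds):
    """Rough upper bound on new connections per second if every closed
    connection keeps its port unavailable for hold_seconds."""
    return ports / hold_seconds


# Default dynamic port range on Windows Vista / Server 2008.
windows_ports = port_count(49152, 65535)
print(windows_ports)  # 16384

# ASSUMPTION: 120 seconds is only an example hold time, not a measured value.
print(sustainable_connections_per_second(windows_ports, 120))
```

If this is the cause, a benchmark that opens a new connection per request exhausts the range after about 16384 requests and then stalls until ports are released, which would match the behaviour described above.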



Coordinator
Jan 25, 2009 at 3:22 PM
Did keep alive not work in the webserver or in the client that you tested with?

I'll make sure that the socket is closed properly (with the reuse flag set to true) next week.
Jan 26, 2009 at 5:38 PM
It was my mistake. I didn't tell Apache Benchmark to use keep-alive :-(
Anyway, I just threw 500,000 requests (simultaneously from different machines) at my threaded version of the server and stored all of them in a DB (async, buffered). Response time per request: ~1ms (not even a dedicated machine, just a VM).

Now we can go ahead with the logging server. 

Great stuff, "jgauffin"!


Coordinator
Jan 26, 2009 at 8:07 PM
Edited Jan 26, 2009 at 8:08 PM
thanks :)

I would really appreciate it if you could do some benchmarking and write a little about it (your setup etc.), so that I can post it in the wiki. I would of course credit you for it.
Coordinator
Jan 27, 2009 at 2:04 PM
I've created a class that derives from NetworkStream ("ReusableSocketNetworkStream" ;) ) and disconnects the socket with the reuse flag set to true. The listener should now work with more than 16384 requests, even if keep-alive is inactive.
Jan 27, 2009 at 2:35 PM
That sounds interesting. Will check that out later (hopefully today).

As to the benchmarking: Will come. I may create another open source project anyway, using your C# Webserver as the foundation for a multi-threaded high-performance logging server, which does nothing but the following:
  • accepting requests
  • dispatching them to multiple ServerThreads which do pre-processing for LogQueue
  • enqueueing them in LogQueue for writing to database (async!)
  • diverting them to a fallback thread (OverflowSerializer), which buffers to disk (in case DB is down) and feeds them into the DB as soon as it is back up again
  • delivering status code 200 (that's done after pre-processing by the ServerThread classes, unless panic flags have been set by the PanicEventHandler, in which case the ServerThreads will return status code 500)
Planned features are:
  • PanicEventHandler (memory usage in LogQueue grows much faster than the DB can write, OR -- even worse -- DB is down, OR -- even worse -- fallback thread runs out of disk space)
  • Status Listener on another port to gather some runtime data and stats
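
The request flow in the two lists above can be sketched roughly as follows. This is a simplified Python illustration under stated assumptions, not the actual server: `LogPipeline` and its methods are hypothetical names, an in-memory list stands in for the database, another list stands in for the disk buffer of the OverflowSerializer, and a simple flag stands in for the PanicEventHandler.

```python
import queue
import threading


class LogPipeline:
    """Hypothetical sketch: accept, pre-process, enqueue, write asynchronously."""

    def __init__(self, max_queued=1000):
        self.log_queue = queue.Queue(maxsize=max_queued)  # plays the LogQueue role
        self.overflow = []              # stand-in for the on-disk fallback buffer
        self.stored = []                # stand-in for the database
        self.panic = threading.Event()  # stand-in for the panic flags
        writer = threading.Thread(target=self._write_loop, daemon=True)
        writer.start()

    def handle_request(self, raw_entry):
        """What a ServerThread would do for one request; returns a status code."""
        if self.panic.is_set():
            return 500
        record = raw_entry.strip()  # placeholder for real pre-processing
        try:
            self.log_queue.put_nowait(record)
        except queue.Full:
            # Queue is full: divert to the fallback buffer instead of blocking.
            self.overflow.append(record)
        return 200

    def _write_loop(self):
        """Background writer: moves records from the queue to the 'database'."""
        while True:
            record = self.log_queue.get()
            self.stored.append(record)
            self.log_queue.task_done()

    def drain(self):
        """Block until everything queued so far has been written."""
        self.log_queue.join()
```

The point of the sketch is that the caller gets its status code as soon as the record is queued, so response time does not depend on how fast the database write completes. Feeding the fallback buffer back into the database once it is reachable again is left out here.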

The fall-back thing is not implemented yet. But with a concurrency of 5 requests at a time, I got average response times below 1ms in a very poorly performing environment. Memory footprint: 20MB including debug code.

It's not a priority here at the moment, so more details may take a while. :-)