Developer Drain Brain

April 5, 2011

Playing with long polls

Filed under: Development — rcomian @ 12:11 pm

Server Push is a technique to get data down to a client as soon as it becomes available. It’s also got other names such as Long Poll and Comet.

There’s nothing particularly fancy about the technique: it’s essentially a web request that takes a really long time. The request stays open until whatever data the client’s interested in becomes available on the server.

Using it means that clients can get updates almost as soon as they happen on the server, without polling. Polling is generally inefficient and wasteful of both network resources and server processing: it’s up to the client to work out when to poll, and you can never really tell when, or if, a client is going to get the update.

Server push is really easy to deploy – if your client can make a web request to your server, then you can use this technique without installing any fancy software or opening any unexpected ports. It just works.

The trick with Server Push is to have a very efficient server. Traditional web servers would start up a thread for each and every connection. Whilst this works when you’ve got a limited number of simultaneous requests, it doesn’t scale up very well when dealing with lots of requests that, for most of their lives, aren’t doing anything.

Frameworks like Python’s Twisted, Ruby’s EventMachine and JavaScript’s Node.js have shown a way this can be really efficient – don’t spawn a thread per connection, just process requests using events. So we know it’s possible – are there any other ways to do it?

I’ve been looking at the problem using .Net’s HttpListener, which is actually quite efficient. It’s essentially a managed wrapper around http.sys. Web requests come in through a call to GetContext – no extra thread is created, you just get an object representing the request, and you respond to it as you see fit.

It’s quite simple, then, to put the requests that come in straight onto a list and wait for something to happen to your data. When that something happens, simply iterate over the list, sending the update to each client. You don’t have to reply from the same thread that accepted the request – indeed, there’s no need to reply immediately at all.
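A minimal sketch of that idea, assuming a single shared subscriber list guarded by a lock – the prefix URL and the Publish trigger are illustrative, not from any particular implementation:

```csharp
using System.Collections.Generic;
using System.Net;
using System.Text;

class LongPollServer
{
    static readonly object sync = new object();
    static readonly List<HttpListenerContext> subscribers = new List<HttpListenerContext>();

    static void Main()
    {
        var listener = new HttpListener();
        listener.Prefixes.Add("http://localhost:8080/updates/");
        listener.Start();

        // Something else (a timer, a data feed) calls Publish when the data changes.
        while (true)
        {
            // GetContext blocks until a request arrives; no thread is spawned per connection.
            HttpListenerContext ctx = listener.GetContext();
            lock (sync)
            {
                subscribers.Add(ctx); // park the request - we reply whenever data arrives
            }
        }
    }

    static void Publish(string message)
    {
        byte[] payload = Encoding.UTF8.GetBytes(message);
        lock (sync)
        {
            foreach (HttpListenerContext ctx in subscribers)
            {
                ctx.Response.ContentLength64 = payload.Length;
                ctx.Response.OutputStream.Write(payload, 0, payload.Length);
                ctx.Response.Close(); // completes the long poll for this client
            }
            subscribers.Clear();
        }
    }
}
```

Publish can be called from any thread – the parked contexts don’t care who replies to them.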

It’s a beautifully simple principle and works really well. I’ve had 135k active connections on my test machine and it doesn’t even break a sweat. Once the connections are made, the only resource used is RAM, and those 135k connections only used about 800MB. No handles, threads or CPU are consumed on the server, so there’s no fixed upper limit that I can find.

What’s even better, is that clients will appear to wait forever for the results. I’ve had tests running for >12hrs, then receiving the response and carrying on like nothing was amiss.

It’s also worth noting that HttpListener can be a really good server in its own right: by caching the replies sent down to each client, I’ve been able to consistently handle 150k requests a minute on a quad-core server over a 100Mb/s network connection.

There are a few things I’ve found that are worth noting:

Accept connections as quickly as possible

In essence, this means getting to the next call to GetContext as quickly as possible. There are two ways to do this: have a single loop that continuously calls GetContext and farms out the actual work in some way, or use BeginGetContext and, in the callback, call BeginGetContext again immediately, before you process the current request.
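The second approach might look like this – everything except the HttpListener calls themselves (the prefix URL, Handle) is illustrative:

```csharp
using System;
using System.Net;

class Acceptor
{
    static HttpListener listener;

    static void Main()
    {
        listener = new HttpListener();
        listener.Prefixes.Add("http://localhost:8080/updates/");
        listener.Start();
        listener.BeginGetContext(OnContext, null); // arm the first accept
        Console.ReadLine();                        // keep the process alive
    }

    static void OnContext(IAsyncResult ar)
    {
        HttpListenerContext ctx = listener.EndGetContext(ar);
        listener.BeginGetContext(OnContext, null); // re-arm the accept *before* doing any work
        Handle(ctx);                               // then deal with the current request at leisure
    }

    static void Handle(HttpListenerContext ctx)
    {
        // Park the context on the subscriber list, or reply immediately.
    }
}
```

The crucial line is the second BeginGetContext: the listener is never left without a pending accept, so the backlog drains as fast as the threadpool can run callbacks.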
Remember that List is backed by an array
If you’re adding hundreds of thousands of items to it in a time-critical context (such as within a lock block), it gets really inefficient, since the data is copied each time the backing array grows. You can either pre-allocate, if you know roughly how many items will be in it, or just use a LinkedList – which has constant-time adds and removes, and that’s what you’ll be doing most of the time anyway. List is really bad if you start paging, since it’s essentially copying that data at disk speed.
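For example (the sizes here are arbitrary):

```csharp
using System;
using System.Collections.Generic;

class Growth
{
    static void Main()
    {
        // A plain List doubles its backing array as it grows - each doubling
        // copies every element. Pre-allocating skips all of that.
        var preallocated = new List<int>(200000);
        Console.WriteLine(preallocated.Capacity); // 200000: no copies while filling up

        // A LinkedList never copies: AddLast is O(1), and Remove is O(1) given the
        // node - a good fit for a subscriber list where items constantly come and go.
        var subscribers = new LinkedList<string>();
        LinkedListNode<string> node = subscribers.AddLast("client-1");
        subscribers.Remove(node);
        Console.WriteLine(subscribers.Count); // 0
    }
}
```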
Don’t block the list
The list is central to this processing, and every lock on it reduces your ability to service requests. An example is sending out the replies – looping over 100k objects in a list and calling BeginWrite on each one may not sound like much, but it takes a few seconds. It’s much better to take your lock (which isn’t on the list itself), copy the list reference and create a new list in its place. Then you can iterate through your list of subscribers at leisure without hampering the server’s ability to accept new requests. If you’ve got a lot of subscribers, the clients you serviced at the beginning of the list will already be queueing up for the next update by the time you get through it, so this is far from academic.
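The swap itself is tiny – take the lock, steal the reference, replace it. A sketch with strings standing in for parked requests (the field names are mine):

```csharp
using System;
using System.Collections.Generic;

class Swap
{
    static readonly object sync = new object();
    static LinkedList<string> subscribers = new LinkedList<string>();

    static void Main()
    {
        lock (sync)
        {
            subscribers.AddLast("client-1");
            subscribers.AddLast("client-2");
        }

        // Publish: hold the lock only long enough to swap the reference.
        LinkedList<string> snapshot;
        lock (sync)
        {
            snapshot = subscribers;
            subscribers = new LinkedList<string>();
        }

        // New requests land on the fresh list while we iterate the snapshot
        // at leisure - no lock held during the (slow) sends.
        foreach (string client in snapshot)
            Console.WriteLine("sending to " + client);
    }
}
```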
Don’t throw that exception
Time really is of the essence: whether you’re getting an entry onto the list or sending the data out, it needs to be quick. Throwing exceptions is slow. So even though it’s clunkier, use TryParse instead of Parse, and structure your logic so that trivial failures don’t end up in the exception handler for mere coding convenience.
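For instance, parsing the client’s version query string without paying for an exception on garbage input (the -1 sentinel is my convention):

```csharp
using System;

class ParseVersion
{
    static void Main()
    {
        Console.WriteLine(VersionOf("41"));   // 41
        Console.WriteLine(VersionOf("oops")); // -1, and no exception was thrown
        Console.WriteLine(VersionOf(null));   // -1
    }

    // Returns -1 for "no usable version" instead of throwing FormatException.
    static int VersionOf(string query)
    {
        int version;
        return int.TryParse(query, out version) ? version : -1;
    }
}
```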
Cache Everything
This setup can be really efficient if the same data is being sent to everyone. In this case, you can render the data you send to one client into a byte array and keep it, so that every subsequent client receives a copy of that same byte array.
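In practice that just means rendering once per data version and handing the same array to every response – the ReplyFor helper here is illustrative:

```csharp
using System;
using System.Text;

class Cache
{
    static byte[] cachedReply; // rendered once, reset whenever the data changes

    static void Main()
    {
        byte[] a = ReplyFor("{\"price\":42}");
        byte[] b = ReplyFor("{\"price\":42}");
        Console.WriteLine(object.ReferenceEquals(a, b)); // True: same array, rendered once
    }

    static byte[] ReplyFor(string json)
    {
        if (cachedReply == null)
            cachedReply = Encoding.UTF8.GetBytes(json); // render exactly once
        return cachedReply; // every subsequent client gets the same bytes
    }
}
```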
Use a versioned REST style interface
This is just a suggestion, but use the URL to locate the resource you want, and have the client send a query string stating what version they’ve got. If the current version doesn’t match, or the client doesn’t provide a version, just send back the latest version without queueing anything. If the version does match, add the client to the subscriber list for when the next version becomes available. It goes without saying that you need to synchronise this so that no-one gets left behind.
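The decision boils down to one comparison. A sketch of the handler logic (the shape of Decide and the version numbers are mine):

```csharp
using System;

class VersionGate
{
    static int currentVersion = 42;

    static void Main()
    {
        // e.g. GET /feed/prices?version=41
        Console.WriteLine(Decide(null)); // "send latest" - client gave no version
        Console.WriteLine(Decide(41));   // "send latest" - client is behind
        Console.WriteLine(Decide(42));   // "subscribe"   - up to date, park until next version
    }

    static string Decide(int? clientVersion)
    {
        if (clientVersion == null || clientVersion.Value != currentVersion)
            return "send latest"; // reply immediately, no queueing
        return "subscribe";       // add to the subscriber list
    }
}
```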
Use the background threads
I’m a little more cautious about this advice, but it’s worked for me in my testing – using the asynchronous calls for everything does appear to work really nicely. One note of caution: the thread count can go up alarmingly once things start going wrong (actually, I’ve only seen this on the client, not the server). It’s only a temporary situation, however – once the issue is resolved, the threadpool sorts itself out nicely.
Have a plan for dropped subscribers
In my proof of concept I didn’t really care, but in reality subscribers can disconnect. Clearly, you don’t want to keep them in your list any longer than necessary. I haven’t yet found a way of cleanly and reliably detecting a failed connection using the HttpListener classes, although I haven’t looked too hard. It might be that the best you can do is return an empty response and ask the client to reconnect. Whilst this might not sound any better than polling, it’s still fairly lightweight and responsive, and the server can dynamically determine what it needs to do, adjusting these ‘poll periods’ according to load.

I’ve actually found it much harder to write a client that makes 50k requests than a server that accepts 100k requests.

Know your http limits
By default, only two concurrent HTTP connections are allowed per machine to any particular server. Use KB282402 to fix this if you’re using a WinInet-based client, and set ServicePointManager.DefaultConnectionLimit if you’re using .Net’s WebRequest class.
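For a .Net client that’s one line, set before any requests are made (the limit of 100000 is arbitrary – pick something above your target connection count):

```csharp
using System.Net;

class ClientSetup
{
    static void Main()
    {
        // Without this, only two concurrent connections per host are allowed,
        // and any requests beyond that silently queue up inside the client.
        ServicePointManager.DefaultConnectionLimit = 100000;
    }
}
```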
Know your ports
Each request to the server requires an open, bound port on the client. There are only ~65k available per IP address, so if you want >65k connections from one machine, you’ll have to have multiple IP addresses on the network and bind your requests to them explicitly. Also, Windows varies how many ports are available for this use. My developer machine allowed everything >1024, whereas Vista/2008 and above only use ports >49152, which limits you to about 16k outbound connections of any kind. Use KB929851 to configure this. Also keep in mind that ports don’t become available just because the connection closed: a port can stay unavailable for 4 minutes whilst it mops up any late server packets. This can be reduced by the OS if there are a lot of ports in this state, but it can bite you if you’re trying to recycle your 50,000 connections.
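With .Net’s WebRequest, binding outbound connections to a chosen local address can be done via the ServicePoint’s BindIPEndPointDelegate – the URL and local address here are placeholders:

```csharp
using System.Net;

class BoundClient
{
    static void Main()
    {
        var request = (HttpWebRequest)WebRequest.Create("http://localhost:8080/updates/");

        // Force this connection's outbound socket to bind to a specific local IP,
        // spreading connections across several addresses to beat the ~65k port cap.
        request.ServicePoint.BindIPEndPointDelegate =
            (servicePoint, remoteEndPoint, retryCount) =>
                new IPEndPoint(IPAddress.Parse("192.168.0.10"), 0); // port 0 = any free port
    }
}
```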
Know your handles
With .Net’s WebRequest class – and probably with any outbound network connection – a handle is created for each one. Windows XP limited the number of open user handles to 10,000; in Windows 2008 R2 I’ve not had any trouble when running 50,000 connections, but it’s something else to be aware of.
Any server can be flooded, live with it
A server can only accept so many connections a second. The network stack can queue up a backlog of connections, but the size of the backlog is limited (the best documentation I can find says between 5 and 200). Beyond this, the network stack will reject connections without your server code ever knowing they were there. This is the main reason it’s so important to accept connections quickly. But even so, if you’re making 50,000 asynchronous connections to a server from 4 clients at the same time, for a total of 200k connections, that could all take place in a few seconds. No server can handle that: you will get errors, and you must have a way of recognising them, handling them gracefully and retrying.
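On the client, that amounts to wrapping the connect in a retry loop. A sketch with a fake connect that fails twice before succeeding, standing in for a flooded server (the backoff numbers are arbitrary):

```csharp
using System;
using System.Threading;

class Retry
{
    static void Main()
    {
        int attempts = 0;
        Func<bool> connect = () => ++attempts >= 3; // refused twice, then accepted

        Console.WriteLine(ConnectWithRetry(connect, 5)); // True
        Console.WriteLine(attempts);                     // 3
    }

    static bool ConnectWithRetry(Func<bool> connect, int maxTries)
    {
        for (int i = 0; i < maxTries; i++)
        {
            if (connect())
                return true;
            Thread.Sleep(100 * (i + 1)); // back off a little more on each refusal
        }
        return false; // give up and surface the error
    }
}
```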

I wonder how many people can make use of this technique, how many people are using this, and what other issues there are with this. Has anyone found a different solution for the problem?
