Thursday, October 30, 2008

Moving web servers without downtime

At IOLA, we've been moving some of the sites we host, most notably iola.dk and its associated services.

The problem with moving web services is DNS. DNS is a silly protocol: it uses a hierarchy to scale, which perhaps makes sense admin-wise (don't forget that scaling means more than just performance; administration is usually at least as important given the cost of man-hours). But a hierarchy is a disaster when it comes to performance, since the root nodes quickly end up being swamped. Think of 100 million clients all querying .com domains simultaneously. DNS uses caching extensively to alleviate this problem.

And this is where the problem starts. As far as I know, there is no way to clear those caches from the outside. So with a typical DNS entry with a time to live of maybe 12 hours, you have a pretty long period in which some clients see the old IP address and some see the new one.
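To get a feel for how long that period will be for your own domain, you can look at the TTL that dig reports. The following is only a small sketch: it assumes dig is installed, www.example.com is a stand-in for your own host name, and answer_ttls is a helper name made up for this example.

```shell
#!/bin/sh
# Print the TTL in seconds of every A record found on stdin.
# Expected input is the answer section printed by `dig +noall +answer`, e.g.
#   www.example.com.  43200  IN  A  192.0.2.10
answer_ttls() {
    awk '$3 == "IN" && $4 == "A" { print $2 }'
}

# Example (needs network access):
#   dig +noall +answer www.example.com A | answer_ttls
```

If you ask a caching resolver, the number you get back is the time remaining in that resolver's cache, so it counts down between queries; the authoritative name server reports the full value.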

In some cases, it's not a problem to have both web servers running at the same time. But if a site is modifying the database, it's more difficult as you don't want to have two inconsistent databases.

There are several ways to fix this, including staying up late at night until all your visitors have gone to bed, but I'll show a neat trick which is really easy to pull off and which shouldn't cause any sweat.

Start by installing Varnish, a reverse proxy server, on the old server. If you are on Debian, it's aptitude install varnish and you then get a /etc/default/varnish file. In there are a couple of lines like:
DAEMON_OPTS="-a :6081 \
-T localhost:6082 \
-b localhost:8080 \
-u varnish -g varnish \
-s file,/var/lib/varnish/$INSTANCE/varnish_storage.bin,1G"

The first line means that Varnish is listening on port 6081, and the third line means that it is forwarding everything it receives to localhost port 8080. So change that third line to -b 123.456.789.012:80, substituting the IP address of your shiny new server for the placeholder, execute /etc/init.d/varnish restart and test it by accessing your old host on port 6081, e.g. http://www.example.com:6081. It should return the page from your new server; you can verify this by looking in the log on the new server.
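If you prefer to script the change to the -b line rather than edit the file by hand, something along these lines should do. It is a sketch, not a polished tool: set_backend is a name invented for this example, 192.0.2.10 is a documentation address standing in for the new server, and it assumes the -b option sits on a line of its own as in the Debian file above.

```shell
#!/bin/sh
# Point the "-b host:port" line of a Varnish defaults file at a new backend.
#   $1: path to the defaults file   $2: new backend as host:port
set_backend() {
    file=$1
    backend=$2
    tmp=$(mktemp) || return 1
    # Keep any indentation and whatever follows the address (the trailing
    # backslash); replace only the address itself.
    if sed "s|^\([[:space:]]*\)-b [^ ]*|\1-b $backend|" "$file" > "$tmp"; then
        cat "$tmp" > "$file"    # overwrite in place, keeping owner and mode
    fi
    rm -f "$tmp"
}

# Example, run as root on the old server:
#   set_backend /etc/default/varnish 192.0.2.10:80
#   /etc/init.d/varnish restart
#   curl -sI http://www.example.com:6081/
```

The curl -sI call at the end only prints the response headers, which is usually enough to see that something answered on port 6081; the log on the new server remains the real proof.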

What happens here is that Varnish parses the request from your browser, forwards it to the backend (your web server on the new server) and caches and returns the web server response to the browser. It's of course a pretty dumb idea to fetch the content over the web twice, from the new server to the old server and then from the old server to the web browser, but we won't be doing it for long. And Varnish will cache as much as its default configuration allows for.

Then when you think you're ready, change the first line above so that Varnish is listening on port 80 instead:
DAEMON_OPTS="-a :80 \
-T localhost:6082 \
-b 123.456.789.012:80 \
-u varnish -g varnish \
-s file,/var/lib/varnish/$INSTANCE/varnish_storage.bin,1G"

Then stop your web server, e.g. with /etc/init.d/apache2 stop, and restart Varnish with /etc/init.d/varnish restart. This puts Varnish in the place where the old web server was. From now on, there's only one web server seeing the requests: the one on the new server. So you can switch DNS over from the old IP address to the new address, and when the last cached DNS entries have expired, you won't be getting hits on Varnish on the old server anymore.
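To tell when the old server has really gone quiet, you can count the requests Varnish still sees per hour. The sketch below assumes you have varnishncsa writing an Apache-style access log somewhere; the path in the example is an assumption about your setup, and hits_per_hour is a made-up helper name.

```shell
#!/bin/sh
# Count requests per hour from NCSA/combined-format log lines on stdin.
# Output: one "DD/Mon/YYYY:HH count" line per hour, sorted as plain text.
hits_per_hour() {
    # Field 4 looks like "[30/Oct/2008:14:05:01"; keep the date and hour.
    awk '{ hour = substr($4, 2, 14); count[hour]++ }
         END { for (h in count) print h, count[h] }' | sort
}

# Example (log path is an assumption, adjust to where varnishncsa writes):
#   hits_per_hour < /var/log/varnish/varnishncsa.log
```

Once the counts have stayed at zero for a good while longer than the TTL, it should be safe to shut Varnish down on the old server.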

Varnish can process requests based on URL and host, so if you're serving multiple hosts from the same server, it's possible to forward only some requests to the new server. It's pretty easy to do: you just need to enable a configuration file and put the rules in there. In Debian there's an example configuration in /etc/varnish/default.vcl. As long as Varnish is not listening on port 80, it's pretty easy to experiment with it.
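As a rough illustration of such a configuration, here is a shell helper that writes a candidate VCL file sending one host to the new server and everything else to the local web server. Treat it as a sketch under several assumptions: the VCL uses the syntax of Varnish 4 and later (older releases have no "vcl 4.0;" line and use "set req.backend = ...;" instead of req.backend_hint), the backend names, the local port 8080, the host name and the address 192.0.2.10 are placeholders, and write_split_vcl is a name invented here.

```shell
#!/bin/sh
# Write a VCL file that routes one host name to the new server.
#   $1: output file   $2: host name that has moved   $3: IP of the new server
write_split_vcl() {
    out=$1
    # Escape the dots so the host name can be used inside a regular expression.
    host_re=$(printf '%s' "$2" | sed 's/\./\\./g')
    new_ip=$3
    cat > "$out" <<EOF
vcl 4.0;

# The web server still running locally (assumed to listen on port 8080).
backend old_local {
    .host = "127.0.0.1";
    .port = "8080";
}

# The new server.
backend new_server {
    .host = "$new_ip";
    .port = "80";
}

sub vcl_recv {
    # Match the moved host, with or without an explicit port in the header.
    if (req.http.host ~ "^${host_re}(:[0-9]+)?\$") {
        set req.backend_hint = new_server;
    } else {
        set req.backend_hint = old_local;
    }
}
EOF
}

# Example: write to a scratch file and let varnishd check that it compiles.
#   write_split_vcl /tmp/split.vcl www.example.com 192.0.2.10
#   varnishd -C -f /tmp/split.vcl > /dev/null && echo "VCL compiles"
```

To actually use a VCL file, the -b option in DAEMON_OPTS has to be replaced with -f followed by the path to the file, since varnishd does not accept both at once.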

While you're at it you might want to start with putting Varnish on the new server in front of your ordinary web server. If there's something Varnish can cache, pages that look the same for several clients, it can save some serious load.
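Whether Varnish is actually saving you any load shows up in its hit and miss counters. The sketch below turns the output of varnishstat -1 into a percentage; hit_rate is a helper name made up for this example, and it matches the counter names by suffix because the exact prefix differs between Varnish versions.

```shell
#!/bin/sh
# Read `varnishstat -1` output on stdin and print the cache hit rate as a
# whole percentage. Counter names are matched by suffix so that both
# "cache_hit" and "MAIN.cache_hit" style names are accepted.
hit_rate() {
    awk '$1 ~ /cache_hit$/  { hit  = $2 }
         $1 ~ /cache_miss$/ { miss = $2 }
         END {
             total = hit + miss
             if (total > 0) printf "%d\n", 100 * hit / total
             else print "no lookups yet"
         }'
}

# Example, on the server running Varnish:
#   varnishstat -1 | hit_rate
```

A low number is not necessarily a problem; it may just mean that most of your pages are personalised and cannot be shared between clients.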
