Commentor Blog

When Quality Matters

Commentor A/S

Disclaimer

The opinions expressed herein are my own personal opinions and do not represent my employer's view in any way.

© Copyright 2012

Scalable doesn’t mean fast

(Cross-posted from ploeh blog)

Recently I spent a couple of days with Thomas Jespersen who’s working towards a launch of spiir.dk – on Windows Azure. The reason I got to talk to him was to see if I could help with some performance issues he had with Azure Table Storage.

The scenario is really simple: the application needs to load all of a user’s bank transactions into memory to enable pretty advanced sorting and filtering. That sounds like a lot, but it really isn’t more than approximately 200 kB of data retrieved through a single query, so there are no 1+N problems in play here. Even so, it originally took more than two seconds. That’s a bit long to wait before you can even start rendering a web page.

By tweaking his partitioning strategy and using parallel queries, Thomas managed to bring down the data retrieval time to approximately one second. Although stress testing indicated that this duration was very stable, even under load, it is still too slow. So we met to see what could be done.
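To illustrate the general idea of parallel queries over partitions, here is a minimal Python sketch. It is not Thomas' actual code: the partition keys (user plus month), the `FAKE_TABLE` dictionary and the `query_partition` function are hypothetical stand-ins for real round trips to the storage service.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical in-memory stand-in for a partitioned table: partition key -> rows.
FAKE_TABLE = {
    "user42-2010-11": [{"amount": -120.0, "text": "Groceries"}],
    "user42-2010-12": [{"amount": -45.5, "text": "Books"},
                       {"amount": 2500.0, "text": "Salary"}],
    "user42-2011-01": [{"amount": -80.0, "text": "Fuel"}],
}


def query_partition(partition_key):
    """Stand-in for one storage round trip that reads a single partition."""
    return list(FAKE_TABLE.get(partition_key, []))


def load_transactions(partition_keys, max_workers=4):
    """Query every partition concurrently and merge the rows in key order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map() yields results in the same order as partition_keys.
        chunks = pool.map(query_partition, partition_keys)
        return [row for chunk in chunks for row in chunk]


transactions = load_transactions(sorted(FAKE_TABLE))
```

With real storage calls, the total wait is roughly that of the slowest partition query rather than the sum of all of them, which is why this helps but cannot go below the latency of a single round trip.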

Thomas had done a great job tweaking the query, so I couldn’t really suggest some sort of secret API that would make it run significantly faster. Basically, we have to deal with Azure storage being based on REST and that there are a lot of things about run-time behavior we cannot control. Apart from designing a proper partitioning strategy, we can’t add indexes to Azure Table Storage.

It was time to take a different approach.

As far as I can tell, Windows Azure is designed to be very scalable. However, scalability only implies that you can handle an insane amount of work within acceptable time frames. You can’t extrapolate from that to conclude that everything will be lightning fast under a light load. That’s not the case at all.

Scalability means that performance characteristics remain stable from light to heavy load.

Consequently, if performance is adequate under heavy load, it will also be adequate under a light load. Azure Storage is designed first and foremost to be scalable, and only as a second priority to be as fast as possible.

As Thomas discovered, Azure Table Storage isn’t particularly fast.

It may be a masochistic side of me that I’m not otherwise aware of, but I actually appreciate that. It makes us reassess our most basic assumptions.

The data that Thomas needs to read isn’t particularly dynamic, so what if we take a snapshot of it? In short, we loaded all of a user’s data into memory and serialized it to Azure Blob Storage.

Loading the same data from a binary serialized Blob took only one sixth of the time it took to load it from Table Storage.

As it turns out, Thomas doesn’t even need all the columns from the Table to populate the view, so we could even make the serialized Blob smaller yet.
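The snapshot idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: a plain dictionary (`blob_store`) plays the role of Blob Storage, the column names in `VIEW_COLUMNS` are invented, and `pickle` is used merely as a convenient binary format.

```python
import pickle

# Hypothetical stand-in for a blob container: blob name -> bytes.
blob_store = {}

# Assumed subset of columns the view needs; the real column names are unknown.
VIEW_COLUMNS = ("date", "amount", "text")


def save_snapshot(user_id, rows):
    """Keep only the view columns and store them as one binary blob."""
    projected = [{col: row[col] for col in VIEW_COLUMNS} for row in rows]
    blob_store[f"snapshots/{user_id}"] = pickle.dumps(
        projected, protocol=pickle.HIGHEST_PROTOCOL
    )


def load_snapshot(user_id):
    """Return the snapshot rows, or None if no snapshot has been written."""
    payload = blob_store.get(f"snapshots/{user_id}")
    return None if payload is None else pickle.loads(payload)


rows = [{"date": "2011-01-03", "amount": -80.0, "text": "Fuel",
         "bank_ref": "X1", "category_id": 7}]
save_snapshot("user42", rows)
snapshot = load_snapshot("user42")
```

Reading the snapshot is a single fetch of one object followed by deserialization, instead of a query that has to be evaluated by the storage service. Note that `pickle` should only be used to load data you wrote yourself.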

At this point, however, we now have two representations of the same data: The original data in Table Storage, and a persistent cache in Blob Storage. The remaining challenge is to figure out how to keep these in sync.
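One simple way to approach the synchronization problem, and by no means the only one, is to invalidate the snapshot whenever the original data changes and rebuild it on the next read. The sketch below assumes two in-memory dictionaries as stand-ins for Table Storage and Blob Storage; all names are hypothetical.

```python
# Hypothetical stand-ins: `table` is the source of truth, `snapshots` the cache.
table = {}      # user_id -> list of transaction rows
snapshots = {}  # user_id -> cached copy of those rows


def add_transaction(user_id, row):
    """Write to the source of truth, then drop the now-stale snapshot."""
    table.setdefault(user_id, []).append(row)
    snapshots.pop(user_id, None)


def read_transactions(user_id):
    """Serve from the snapshot, rebuilding it from the table if it is missing."""
    if user_id not in snapshots:
        snapshots[user_id] = list(table.get(user_id, []))
    return snapshots[user_id]


add_transaction("user42", {"amount": -80.0, "text": "Fuel"})
first_read = list(read_transactions("user42"))
add_transaction("user42", {"amount": 2500.0, "text": "Salary"})
second_read = list(read_transactions("user42"))
```

The cost is that the first read after a write is slow again. If the data rarely changes, that trade-off may be acceptable; otherwise, rebuilding the snapshot asynchronously in the background is an alternative.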

This may seem like a hack, but it really represents a paradigm shift. Letting go of ACID opens up a lot of new opportunities.

Actually, I spent most of the next day trying to convince Thomas that CQRS would be the best approach, or that we could at least pick up some of the techniques from asynchronous, messaging-based architectures, but that’s another story.

The moral here is that on Azure, things may be slower than you are used to, but storage is (relatively) cheap, so denormalization can save you a lot of execution time.

Categories: Azure
Posted by mark.seemann on Monday, January 24, 2011 7:05 AM

Windows Azure Migration Sanity Check

[Reposted from ploeh blog]

Recently I attended a workshop where we attempted to migrate an existing web application to Windows Azure. It was a worthwhile workshop that left me with a few key takeaways related to migrating applications to Windows Azure.

The first and most important point is so self-evident that I seriously considered not publishing it. However, apparently it wasn’t self-evident to all workshop participants, so someone else might also benefit from this general advice:

Before migrating to Windows Azure, make sure that the application scales to at least two normal servers.

It’s as simple as that – still, lots of web developers never consider this aspect of application architecture.

Why is this important in relation to Azure? The Windows Azure SLA only applies if you deploy two or more instances, which makes sense since the hosting servers occasionally need to be patched etc.

Unless you don’t care about the SLA, your application must be able to ‘scale’ to at least two servers. If it can’t, fix this issue first, before attempting to migrate to Windows Azure. You can test this locally by simply installing your application on two different servers and putting them behind a load balancer (you can use virtual machines if you don’t have the hardware). Only if it works consistently in this configuration should you consider deploying to Azure.

Here are the most common issues that may prevent the application from ‘scaling’:

  • Keeping state in memory. If you must use session state, use one of the out-of-process session state store providers.
  • Using the file system for persistent storage. The file system is local to each server.
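The first issue is easy to reproduce in miniature. The following Python sketch simulates two servers behind a round-robin load balancer; the `Server` class and the shared dictionary are hypothetical stand-ins for a web server and an out-of-process session store.

```python
from itertools import cycle


class Server:
    """Toy web server that stores session state in a dictionary."""

    def __init__(self, shared_store=None):
        # Without a shared store, each server keeps sessions in its own memory.
        self.sessions = shared_store if shared_store is not None else {}

    def login(self, session_id, user):
        self.sessions[session_id] = user

    def whoami(self, session_id):
        return self.sessions.get(session_id)


# Case 1: in-memory state. The second request lands on the other server.
balancer = cycle([Server(), Server()])
next(balancer).login("s1", "alice")
in_memory_result = next(balancer).whoami("s1")

# Case 2: both servers use the same external store (here just a shared dict).
store = {}
balancer = cycle([Server(store), Server(store)])
next(balancer).login("s1", "alice")
shared_result = next(balancer).whoami("s1")
```

In the first case the second server has never heard of the session, so the user appears to be logged out. In the second case it doesn’t matter which server handles the request. The same reasoning applies to files written to a local disk.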

Making sure that the application ‘scales’ to at least two servers is such a simple sanity check that it should go without saying, but apparently it doesn’t.

Please note that I put ‘scaling’ in quotes here. An application that runs on only two servers has yet to prove that it’s truly scalable, but that’s another story.

Also note that this sanity check in no way guarantees that the application will run on Azure. However, if the check fails, it most likely will not.

Categories: Azure
Posted by mark.seemann on Thursday, October 14, 2010 7:23 AM