Cache Works … Sometimes

With the improving economics of flash memory, we see more and more machines incorporating flash wherever the designers can wedge it in to deliver extra value to customers. There is no question that flash will become a more and more prominent feature of the IT industry, especially in servers and storage.

Flash as Cache

While RAM is faster than flash, it has the disadvantage that it is volatile, so a power interruption can cause data loss if the RAM isn’t destaged to a non-volatile element when the lights go out. IBM’s 1110 (XIV, if you prefer Roman numerals) uses massive deep-cycle lead-acid batteries to keep the entire machine running long enough to write blocks from RAM to disk on power failure. It’s an old Soviet-era approach, as I blogged about recently.

The beauty of flash cache is that it retains data over power failures, and hence doesn’t require floundering around with batteries, certainly a good thing. And, since you don’t have to provide power during “an event,” you can have large amounts of flash cache, unconstrained by the time it takes to move the data out of it and onto disk, since flash retains it indefinitely.

So what can go wrong? Well, assuming a great design, only a few things. What I want to concentrate on here is what goes wrong between engineering and marketing: is my machine really as fast as they say it is? The answer is yes and no. Yes, a storage array with flash cache (or battery-backed RAM) will accept random writes much faster than it can write them to disk. No, it isn’t that fast if you continue the random writes long enough to fill up the write cache, because then the storage array has to write blocks to disk to make space for new writes. Once that happens, it doesn’t matter how large your write cache is.
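To make the fall-off concrete, here is a minimal sketch of the effect in Python. Everything in it is invented for illustration (the function name, the cache size, the ingest and destage rates); it is not a model of any particular array. The point it shows is simply that the accepted write rate stays at cache speed only until the cache fills, and then drops to the destage rate.

```python
# Minimal sketch (illustrative numbers only): a write cache absorbs incoming
# writes at full speed until it fills, after which the accepted rate drops
# to whatever the disks can destage.

def simulate_write_cache(cache_gb, ingest_gbps, destage_gbps, seconds):
    """Return the accepted write rate (GB/s) for each simulated second."""
    cached = 0.0          # data currently sitting in the write cache (GB)
    accepted_rates = []
    for _ in range(seconds):
        free = cache_gb - cached
        # With free space we accept the full ingest rate; once the cache is
        # full we can only accept what the disks drain in the same second.
        accepted = min(ingest_gbps, free + destage_gbps)
        cached = min(cache_gb, max(0.0, cached + accepted - destage_gbps))
        accepted_rates.append(accepted)
    return accepted_rates

# Hypothetical workload: 100 GB cache, hosts writing 2 GB/s, disks 0.5 GB/s.
rates = simulate_write_cache(cache_gb=100, ingest_gbps=2.0,
                             destage_gbps=0.5, seconds=120)
print("first 5 s:", rates[:5])   # ~2.0 GB/s while the cache has room
print("last 5 s:", rates[-5:])   # ~0.5 GB/s once the cache is full
```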

The figures below show an analogy: catching water from a pipe in a basin. Catch basins for flood control work exactly like this. If it doesn’t rain too hard for too long, the basin lets us quickly accept lots of water while controlling the rate at which it runs out, keeping Aunt Molly’s house on the creek from being washed away. Long, hard rains cause the basin to fill, and one way or another the outflow will then equal the inflow, i.e. it overflows. Goodbye, Aunt Molly.

For a storage cache, once it fills we just aren’t able to accept the incoming data as fast as the applications would like to write it, and the whole system slows to disk speed. If you are running a hosting service, for example, customers will see this as a huge disappointment because response times drastically slow. Well, maybe not as disappointed as Aunt Molly, whose three-bedroom single-story is now careening down the flooded creek, but dissatisfied nonetheless.

[Figure: pipe and basin, normal operation]

[Figure: pipe and basin, critical (full basin) operation]

So if you have a system that seems to run like a scalded ape, and then all of a sudden is more like the same ape on Ambien, well, this could be because your sustained write load occasionally fills up the cache. The figure below shows exactly what happens to write performance:

[Chart: write performance over time as the write cache fills]

So when a marketing team sees how fast the system can be, they often advertise that the system is that fast. Unfortunately, workload variations that even occasionally fill the write cache will cause huge performance swings, and this is not usually advertised.

There is nothing much new here. This is a deterministic phenomenon and cannot be waved off with a coat hanger and some incense. Flash cache simply takes write cache to a larger scale. If you have 1TB of cache, it is easy to “forget” that, at the same write data rate, it will simply take ten times longer to hit the performance fall-off than it would with 100GB of RAM cache. Unless we test for a long enough period of time, we may convince ourselves that this sucker is a lot faster than it will turn out to be in real life.
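The “ten times longer” point is just arithmetic: the cache fills at the rate by which incoming writes exceed the destage rate. A quick back-of-the-envelope check, again with made-up numbers:

```python
# Time until a write cache fills, assuming ingest exceeds the destage rate.
def seconds_until_full(cache_gb, ingest_gbps, destage_gbps):
    net_fill_rate = ingest_gbps - destage_gbps   # GB/s accumulating in cache
    if net_fill_rate <= 0:
        return float("inf")                      # the cache never fills
    return cache_gb / net_fill_rate

# Same assumed workload as before: 2 GB/s in from hosts, 0.5 GB/s out to disk.
print(seconds_until_full(100, 2.0, 0.5))    # 100 GB RAM cache:   ~67 s
print(seconds_until_full(1000, 2.0, 0.5))   # 1 TB flash cache:  ~667 s
```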

What’s the moral of the story? Cache is a good thing, and more of it is better. But the amount you need depends on the workload, the average sustained throughput required, and the “burstiness” of the write load. As a buddy of mine at IBM used to say, “The average depth of Lake Erie may be such that you think you can walk across it, but try it and you’ll drown.” Averages just don’t always provide enough characterization of a problem.
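To put the Lake Erie remark in cache terms, here is one more invented example: a bursty workload whose average write rate is comfortably below the disk drain rate can still fill a finite cache during each burst. The numbers are purely illustrative.

```python
# Illustration only: alternating burst/idle periods. The *average* inflow,
# (3*60 + 0.1*240) / 300 = 0.68 GB/s, is below the 1 GB/s drain rate, yet
# a 100 GB cache still fills completely during every 60-second burst.

def peak_cache_occupancy(burst_gbps, idle_gbps, burst_s, idle_s,
                         destage_gbps, cache_gb, cycles=10):
    """Track the peak cache occupancy (GB) over repeated burst/idle cycles."""
    cached, peak = 0.0, 0.0
    for _ in range(cycles):
        for rate, duration in ((burst_gbps, burst_s), (idle_gbps, idle_s)):
            for _ in range(duration):   # one-second steps
                cached = min(cache_gb, max(0.0, cached + rate - destage_gbps))
                peak = max(peak, cached)
    return peak

print(peak_cache_occupancy(burst_gbps=3.0, idle_gbps=0.1, burst_s=60,
                           idle_s=240, destage_gbps=1.0, cache_gb=100))
# -> 100.0, i.e. the cache hits its ceiling despite the "safe" average rate.
```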

Irrational exuberance may be fun for a while, but it leads to expectations that fall apart sooner or later, and usually at the wrong time. I think Morty Finklestein was wrong.

Mike Workman is Chairman & CEO of Pillar Data Systems. Mike has spent his career breaking new technical ground in the storage industry. In his 25+ years in the storage business, Mike's appointments have included vice president of worldwide development for IBM's storage technology division, senior vice president and CTO of Conner Peripherals, and vice president of OEM storage subsystems for IBM. He has a PhD and a master's degree from Stanford and a bachelor's degree from Berkeley, and he holds over fifteen technology patents.