So What’s Next, Software-Defined Cloud?


Every now and then some new buzzword catches the public’s fancy and becomes a sensation that everyone (especially the “experts”) talks about with fervid enthusiasm. Each time this happens, I end up feeling like that naïve boy looking at the naked emperor (a very unpleasant sight). Today’s phrase is “Software Defined Storage,” or the esoteric-sounding acronym, SDS. But what is it?

Like everyone else does, I looked it up on Wikipedia. In essence, “… SDS is an evolving concept for computer data storage software to manage policy-based provisioning and management of data storage independent of hardware. […] typically include a form of storage virtualization to separate the storage hardware from the software that manages the storage infrastructure…”, etc.

Hmmm, that sounds very familiar. I vaguely remember some history from the not so distant past…

Once upon a time, there was this great visionary named Reijane Huai. Foreseeing the emerging Gigabit Ethernet as a viable challenger to the fibre channel SAN, he gathered a team of engineers to create a SCSI-over-IP product. The hope was to use the new GbE to provide routable, more ubiquitous, and potentially more cost-effective storage connectivity, and so break the fibre channel monopoly. This was before the days of iSCSI.

The year was 2000. After the concept was tested using Linux by Professor Eric Chen and his students in Taiwan, FalconStor was founded. Our team in New York quickly started to work on a prototype for Windows, which was a much bigger challenge since, unlike with Linux, we did not have the source code. I experimented with some sample code to confirm that an NDIS driver could access the network stack. We then quickly used that code to create the world’s first working SCSI-over-IP driver for Windows, and demonstrated it by playing a movie in Windows from an IP virtual disk backed by a CD drive connected to a Linux server.

To achieve speed, the team implemented an efficient zero-copy data path between the network buffer and the SCSI interface in the kernel, reaching 120 MB/s with our SAN over IP. This was 50% higher than the 80 MB/s maximum set by fibre channel at that time. We thought we had something that could lead the company to success.

Alas, it was not meant to be. Unfortunately the great “Dot-Bomb” of 2000 killed the potential of many new technologies. We realized we needed more than just SCSI over IP, and started to create a complete storage platform by adding novel storage functions such as differential snapshot and micro scan IP replication. Bernie Wu, who had great insights into the industry, urged us to also add fibre channel connectivity.

At that time, EMC’s snapshot was the BCV (Business Continuance Volume). It was a full-volume mirror, which would then be broken off to preserve the data image, so each snapshot consumed a full volume. Our solution was block-level differential, which meant only changed data was stored. After seeing what we did, IBM’s storage experts named our snapshots “space efficient snapshots.” We invented many techniques for these advanced functions, and subsequently received many patents.
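The difference between a full-volume mirror and a block-level differential snapshot can be illustrated with a toy model. This is only a minimal sketch, assuming a copy-on-first-write scheme; the text above says only that changed data was stored, so the mechanism, the class names (`Volume`, `DifferentialSnapshot`), and the helper `full_mirror_blocks` are all hypothetical and do not describe how IPStor or any EMC product was actually built.

```python
class DifferentialSnapshot:
    """Point-in-time view that keeps only blocks overwritten after it was taken."""

    def __init__(self, volume):
        self.volume = volume
        self.saved = {}  # block index -> contents at snapshot time

    def preserve(self, index, old_data):
        # Keep the ORIGINAL contents only; later overwrites of the same
        # block must not replace what the snapshot already saved.
        self.saved.setdefault(index, old_data)

    def read(self, index):
        # Changed blocks come from the snapshot's own store; unchanged
        # blocks are still read straight from the live volume.
        if index in self.saved:
            return self.saved[index]
        return self.volume.read(index)

    def blocks_stored(self):
        return len(self.saved)


class Volume:
    """Toy block device: a fixed-size list of block payloads."""

    def __init__(self, num_blocks, fill=b"\x00"):
        self.blocks = [fill] * num_blocks
        self.snapshots = []

    def snapshot(self):
        snap = DifferentialSnapshot(self)
        self.snapshots.append(snap)
        return snap

    def read(self, index):
        return self.blocks[index]

    def write(self, index, data):
        # Copy-on-first-write: hand the old block to every snapshot
        # before the live volume is modified.
        for snap in self.snapshots:
            snap.preserve(index, self.blocks[index])
        self.blocks[index] = data


def full_mirror_blocks(volume):
    """Blocks a full-volume mirror would consume: one per block in the volume."""
    return len(volume.blocks)


vol = Volume(num_blocks=1000)
snap = vol.snapshot()
vol.write(7, b"new")
vol.write(7, b"newer")   # second write to the same block costs nothing extra
vol.write(42, b"x")

print(snap.read(7))              # original contents of block 7
print(snap.blocks_stored())      # blocks held by the differential snapshot
print(full_mirror_blocks(vol))   # blocks a full mirror would hold
```

In this toy run the differential snapshot holds two blocks while a full mirror of the same volume would hold a thousand, which is the sense in which such snapshots are "space efficient."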

The result was IPStor, the world’s first storage virtualization product that provided the most advanced storage functionalities available, even by today’s standards. It was pure software. Heterogeneous storage devices were virtualized in one environment and managed by a central platform. Storage pools could be created to allow policy-based allocation, even self-allocation from the client hosts.

Wait, How Does Wikipedia Describe SDS Again?

I guess people have either forgotten or are unaware that there was a time (just 15 years ago) when storage had no functions other than acting as the RAID controller. Then we created a product that was “storage virtualization to separate the software to manage policy-based provisioning and management of data storage independent of hardware….” (from Wikipedia).

Soon many storage companies started to add similar functions to their hardware. Of course each company did this in its own way, and soon a jungle was created. The word “virtualization” was “good,” then “bad,” then “good,” then “bad.” I guess now it is “good” once again. Someone evidently wandered into today’s storage jungle and came up with the brilliant idea: “Why don’t we create a virtualized storage software platform to centralize and manage all this hardware? And let’s call it … uh … Software Defined Storage, since there is already Software Defined Networking…”

Today, almost everything is software defined. Storage is no exception. This is especially true for today’s IT environment. Perhaps we should just call it the “Software Defined Age,” since now even servers are software.

A new storage platform for tomorrow must take into consideration not just typical data storage as a SAN volume, a NAS share, or a simple object repository; it should also allow flexible integration into today’s cloud infrastructure. This means the complete storage paradigm should encompass the virtualized, distributed computational and data environment over ubiquitous, high-capacity Internet connectivity. The days of individual and isolated islands of storage being the main challenge of storage administrators are fading fast.

One More Time, So What Is SDS?

From the SDN example, one can best infer that SDS should allow consumers to specify or request particular properties or capabilities of the storage devices in a more flexible manner. This requires storage systems to provide appropriate APIs and, more importantly, the capabilities to satisfy such requests, with functions like thin provisioning, snapshots, deduplication, etc. These functions have been in the industry for a while in various forms, including the earlier example of storage pools with self-allocation based on performance characteristics. The VM computational environment has further pushed adoption by many storage vendors. But one can hardly discern these specific facts from all the chit-chat about SDS.
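To make "request specific properties or capabilities" a little more concrete, here is a small sketch of what a capability-based provisioning call might look like. It is purely illustrative: `StoragePool`, `VolumeRequest`, `provision`, and the feature names are hypothetical, not any vendor's API, and the selection rule (pick the slowest pool that still qualifies) is just one assumed policy among many.

```python
from dataclasses import dataclass


@dataclass
class StoragePool:
    """Hypothetical pool of virtualized capacity with advertised capabilities."""
    name: str
    free_gb: int
    max_iops: int
    features: frozenset = frozenset()


@dataclass(frozen=True)
class VolumeRequest:
    """What the consumer asks for: size, a performance floor, and required features."""
    size_gb: int
    min_iops: int = 0
    features: frozenset = frozenset()


def provision(pools, request):
    """Allocate from a pool that satisfies the request; return the pool's name."""
    candidates = [
        p for p in pools
        if p.free_gb >= request.size_gb
        and p.max_iops >= request.min_iops
        and request.features <= p.features  # every required feature is offered
    ]
    if not candidates:
        raise LookupError("no pool satisfies the request")
    # Assumed policy: use the least capable qualifying pool,
    # so the faster pools stay free for requests that need them.
    chosen = min(candidates, key=lambda p: p.max_iops)
    chosen.free_gb -= request.size_gb
    return chosen.name


pools = [
    StoragePool("ssd", free_gb=500, max_iops=50_000,
                features=frozenset({"snapshot", "thin"})),
    StoragePool("sata", free_gb=4000, max_iops=2_000,
                features=frozenset({"snapshot", "thin", "dedup"})),
]

fast = VolumeRequest(size_gb=100, min_iops=10_000, features=frozenset({"snapshot"}))
bulk = VolumeRequest(size_gb=1000, features=frozenset({"dedup"}))

print(provision(pools, fast))
print(provision(pools, bulk))
```

The point of the sketch is that the consumer describes what it needs and never names a device; whether the platform can honor the request depends entirely on the capabilities behind the API.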

In the recent past, a prodigious number of novel applications have emerged, each the product of a confluence of circumstances at a particular moment in time: cheaper and faster storage, faster connection speeds, and so on. Together, especially where multimedia data is concerned, they have created a feeding frenzy on storage. This brought in very ingenious, specific solutions to meet the new wave of storage challenges. But with each wave of new and innovative architectural frameworks comes an equivalent wave of hype, preached as a panacea for every problem under the sun.

This phenomenon has occurred repeatedly since the information revolution, which, come to think of it, is only a few decades old. We are living in a truly interesting and exciting time. At the end of the day, these 1’s and 0’s (a lot of them) must reside on some physical media somewhere, and that physical media needs to be managed, protected, and available. SDS or not.

Wai Lam

Wai Lam is co-founder and CTO of Cirrus Data Solutions, a developer of Data Migration Server and Data Caching Server for storage area networks (SANs). He was previously CTO and VP of Engineering at FalconStor, a company he co-founded in 2000. There, he was the chief architect, holding 18 of 21 company patents. His inventions and innovations include many industry "firsts" in advanced storage virtualisation, data protection, and disaster recovery. Wai received the prestigious China national "Top 1000 Technological Leaders" award in 2013.