Thursday, October 9, 2014

4 Issues that Ail Swift

While I'm a Swift enthusiast, I also realize it hasn't taken over the world. Other object storage systems (open-source, e.g. Ceph, and proprietary, e.g. Scality, Cleversafe, Amplidata) are finding success in various use cases. In fact, industry analyst Marc Staimer is outright down on Swift. While his scoring is overly harsh on Swift, he does make a number of great points.

So what ails Swift? Or in other words, what issues need to be fixed to make Swift a clear #1? I can think of four key issues.

Friday, July 11, 2014

The Number One Inhibitor to Cloud Storage (Part 2 of 2)!

The number one inhibitor is Access! (Part 2)

I've been feeling bad about delaying this second part of my blog, but in hindsight the delay was good: EMC acquired TwinStrata in the meantime, validating the whole premise of my current blog!

Anyways, a few weeks ago I talked about how access, in my view, is the biggest inhibitor to cloud storage. Specifically the five issues are:

1. How do I get massive amounts of data in-and-out of the cloud?
2. How do I get my application to interface with cloud storage?
3. How do I get cloud storage to fit within my current workflow?
4. How do I figure out what data to move to the cloud?
5. Once the data is moved, how do I know it's in the cloud?
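
Questions 2 and 5 can be made concrete with a small sketch. It assumes a Swift-style HTTP API, where an object lives at `<storage URL>/<container>/<object>` and every request carries an `X-Auth-Token` header. The storage URL, token, container, and object names below are placeholders, and the code only builds the requests without sending them.

```python
import hashlib
import urllib.request
from urllib.parse import quote


def object_url(storage_url, container, name):
    """Join the account storage URL, container, and object name into one URL."""
    return f"{storage_url.rstrip('/')}/{quote(container)}/{quote(name)}"


def build_upload(storage_url, token, container, name, data):
    """Build (but do not send) a PUT request that uploads `data`."""
    headers = {
        "X-Auth-Token": token,
        # Supplying the MD5 as ETag lets the server reject a corrupted upload.
        "ETag": hashlib.md5(data).hexdigest(),
    }
    return urllib.request.Request(
        object_url(storage_url, container, name),
        data=data, headers=headers, method="PUT",
    )


def build_check(storage_url, token, container, name):
    """Build a HEAD request; its response ETag can be compared to the local MD5."""
    return urllib.request.Request(
        object_url(storage_url, container, name),
        headers={"X-Auth-Token": token}, method="HEAD",
    )


# Placeholder values, for illustration only.
STORAGE_URL = "https://swift.example.com/v1/AUTH_demo"
put_req = build_upload(STORAGE_URL, "demo-token", "backups", "2014/report.pdf", b"hello")
head_req = build_check(STORAGE_URL, "demo-token", "backups", "2014/report.pdf")
print(put_req.get_method(), put_req.full_url)
print(head_req.get_method(), head_req.full_url)
```

Passing either request to `urllib.request.urlopen` would actually send it. For a plain (non-segmented) object, the `ETag` returned by the HEAD request should match the local MD5, which is one simple way to confirm the data really is in the cloud.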

Also, the publisher for my OpenStack Swift book is having this contest:
Book Give-away:

Get a chance to win a free copy of Implementing Cloud Storage with OpenStack Swift just by commenting about the book with its link! For the contest, we have 7 e-copies of the book to be given away to 7 lucky winners.

How you can win:

To win your copy of this book, all you need to do is post a comment below highlighting "why you would like to win this book", along with the link to the book, Implementing Cloud Storage with OpenStack Swift.

Note: To win, the winners must also mention the book link in their comments.

Duration of the contest & selection of winners:
The contest is valid for 1 week (i.e. from 7/25/14 - 8/1/14), and is open to everyone. Winners will be selected on the basis of their comment posted.

Tuesday, May 20, 2014

The Number One Inhibitor to Cloud Storage (Part 1 of 2)!

The number one inhibitor is Access!

Before I jump in, a quick comment:
I’ve been missing-in-action on my blogs because I’ve been busy co-authoring a new book, “Implementing Cloud Storage with OpenStack Swift”. As of today, the book is available on the Packt website (and will be available on Amazon and B&N in a couple of days)! Until tomorrow, users can take advantage of this 20% off promo code there: uwQF3UaR (first 300 orders). Now that the book is complete, I hope to get back to my regular blog schedule, i.e. about 1 every 2 months.
OK, now on to the topic-at-hand. To paraphrase a quote from President Bill Clinton’s 1992 campaign, “It’s the access, stupid” (of course, this comment is being addressed to me and not the reader :-). Before the availability of EVault’s Long-Term Storage Service (LTS2), if you had asked me about the key problem(s) with cloud storage, I would have naively gone on-and-on about core storage issues like durability, scalability, management, eventual consistency, etc. However, after having talked to numerous customers, my view is different. The number one problem is: how does a user access cloud storage? In fact, I’ve found that most customers actually accept that the storage piece in cloud storage is taken care of and are much more worried about access. Let me explain the access problem through these five questions:

Tuesday, March 11, 2014

Swift Book

References for Chapter 8 of the "Build Cloud Storage with OpenStack Swift Book"

This section will be updated from time-to-time.

Use cases as discussed at the Hong Kong Summit

DigitalFilm Tree

Insightful presentations on use of OpenStack Swift along with other modules
IBM initiatives on OpenStack charting out new

HR Management

OpenStack Swift use case with billions of images stored

Additional Resources related to OpenStack Swift

Video Title
Workshop: Deploying OpenStack Swift
SwiftStack workshop on deploying and configuring OpenStack Swift. Also covers step-by-step installation from the ground up. Note: includes an idea for a Swift explorer utility/tool.
Object Striping in Swift
In the current implementation of Swift, the entire object is mostly stored as a single file on an object server. The idea behind striping the object is to allow parallel reads/writes of object stripes from multiple object servers.
Case Study: Concur Delivering a SaaS application with Swift
This session explores how Swift is used to support one of the largest enterprise SaaS-based ERP product companies.
An Intimate Look at Running OpenStack Swift at Scale
A look at the scale at which OpenStack Swift runs at Rackspace. Includes a discussion of network graphs.
Ceph: The De Facto Storage Backend for OpenStack
The main goal of the talk is to convince those of you who aren't already using Ceph as a storage backend for OpenStack to do so.
Sheepdog: Yet Another All-In-One Storage for Openstack
Sheepdog is a purely userspace distributed storage system for QEMU. It is essentially an object storage system that manages disks and aggregates their space and performance linearly, at hyper scale, on commodity hardware.
StackTach - Ceilometer Integration
Monitoring tool for Rackspace, similar to Nagios and Zenoss
Is Open Source Good Enough? A Deep Study of Swift and Ceph Performance
This presentation offers a deep study of both Swift and Ceph performance on a commodity x86 platform, including the testing environment, methodology, and a thorough analysis. A tuning guide and optimization BKMs (best known methods) are also shared for reference.
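
The object-striping idea from the "Object Striping in Swift" talk listed above can be illustrated with a toy sketch. This is not how Swift stores data today, and it is not the design proposed in the talk. It simply assumes fixed-size stripes placed round-robin across a hypothetical set of object servers (modeled here as in-memory lists), with one worker per server on the read side.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List


def stripe(data: bytes, stripe_size: int, num_servers: int) -> List[List[bytes]]:
    """Split `data` into fixed-size stripes and place them round-robin."""
    servers: List[List[bytes]] = [[] for _ in range(num_servers)]
    for offset in range(0, len(data), stripe_size):
        index = offset // stripe_size
        servers[index % num_servers].append(data[offset:offset + stripe_size])
    return servers


def read_striped(servers: List[List[bytes]]) -> bytes:
    """Fetch every server's stripes in parallel, then interleave them in order."""
    with ThreadPoolExecutor(max_workers=len(servers)) as pool:
        # list() stands in for a network read from one object server.
        fetched = list(pool.map(list, servers))
    rounds = max(len(stripes) for stripes in fetched)
    ordered = []
    for k in range(rounds):
        for stripes in fetched:
            if k < len(stripes):
                ordered.append(stripes[k])
    return b"".join(ordered)


payload = bytes(range(256)) * 3 + b"tail"
placed = stripe(payload, stripe_size=64, num_servers=3)
assert read_striped(placed) == payload
print([len(s) for s in placed])  # number of stripes held by each server
```

A real implementation would also need to handle replication of each stripe and a manifest recording where the stripes live. The sketch ignores both.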

User Stories from

Video title(private/public)
OpenStack Service Providers Talk
Cloud providers talking about why OpenStack, and its value
OpenStack Summit Fall 2012 Keynotes: Troy Toman, Rackspace
Describes Rackspace's journey with OpenStack
DreamCompute in the Demo Theater
Describes OpenStack networking (Quantum)
Using OpenStack In A Traditional Hosting Environment
Describes the messaging system and the Nova component in OpenStack
Best Buy Keynote at OpenStack Summit April 2013 in Portland
Describes cloud architecture and a continuous delivery cloud
Keynote: Bloomberg User Spotlight
Describes OpenStack features
Keynote: Clouds in High Energy Physics
Using clouds in high energy physics
OpenStack Summit Fall 2012 Keynotes: Reinhardt Quelle, Cisco
Talks about the Nova component in OpenStack
OpenStack Meets Openflow
Describes networking
How Much for an Openstack Cloud Please?
Talks about the cost of an OpenStack cloud
Intel's Innovations with OpenStack
Enhancements for compute, networking, and Swift benchmarking, plus app architectural guidance
Learning to Scale OpenStack: A Case Study in Rackspace's Op
Scaling OpenStack

Useful Links:

Monday, December 16, 2013

3 Reasons Customers Will Store Exabytes of Long-Term Data onto EVault's New Cloud Storage Service

Just about every industry is creating massive amounts of digital data. Examples range from media & entertainment (film, broadcast TV/radio content), healthcare (medical images), oil & gas (seismological data),  surveillance (video content), bioinformatics (genome sequencing), finance (records), legal (records & discovery), pharmaceutical (research data), engineering (design data)... The list goes on.

This data needs to be stored in a reliable manner. Storage time periods are getting stretched out, in some cases to decades, and I estimate that hundreds of exabytes of cumulative data will need to be stored for the long term between now and 2017.

When retrieved, though not often, the data needs to be accessed immediately. For instance a radiologist needing a medical image or a consumer wanting long-tail video content won’t be satisfied waiting more than a few seconds for it. Something has to be done to address these requirements and it has to be done economically.

Such a storage solution didn't exist (until last week).

Thursday, October 24, 2013

Seagate Kinetic – A Game Changer for Cloud Storage Hardware Architectures

Seagate recently announced a new technology platform called the Kinetic Open Storage Platform that is a genuine game changer for cloud storage hardware architectures (and perhaps other storage architectures as well).  My prediction is that in 2-3 years, cloud storage hardware will be unrecognizable as compared to the classic x86 architecture of today.

Friday, October 4, 2013

Swift Durability and the Mystery of 11 9s

This blog builds on my earlier blog on Swift reliability calculated via MTTDL.

A key measure of cloud storage reliability is a metric called durability. This metric was brought into vogue by Amazon, and it is interesting to note that it wasn't popular before S3. Durability is defined as 1 minus the average annual expected loss of objects, expressed as a percentage. For example, 11 9s of durability means that if you store 10,000 objects, you can expect an average loss of a single object every 10,000,000 years. The product of the two, i.e. 10^4 objects and 10^7 years, gives you 10^11, which corresponds to the 11 9s.
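
The arithmetic behind the 11 9s example can be checked with a few lines of Python. This is only a sketch of the definition given above; the function names are mine, and exact fractions are used to avoid floating-point rounding at such small loss rates.

```python
from fractions import Fraction


def annual_loss_rate(nines):
    """Fraction of stored objects expected to be lost per year."""
    return Fraction(1, 10 ** nines)


def years_per_lost_object(nines, num_objects):
    """Average number of years between single-object losses."""
    expected_losses_per_year = annual_loss_rate(nines) * num_objects
    return 1 / expected_losses_per_year


durability = 1 - annual_loss_rate(11)       # 99.999999999%
print(years_per_lost_object(11, 10_000))    # 10000000
```

Storing more objects shortens the interval proportionally: under the same definition, 10^11 objects at 11 9s would see an expected loss of about one object per year.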

The question is, can OpenStack Swift match the durability advertised by major cloud storage providers which is 11 9s?

Wednesday, August 14, 2013

OpenStack API Wars – Is Swift API:S3 API::Android:Apple iOS?

I recently wrote an EVault blog about the recent OpenStack API wars. Although most of the discussion is around Nova, the debate also applies to Swift. In this blog I look at the various API options for a public cloud storage offering, and give my 2 cents on what makes sense. Please check it out! 

Also, there's an OpenStack meetup at EVault tomorrow (sponsored by my employer EVault)  where Randy Bias, CloudScaling, and Boris Renski, Mirantis, will debate the API topic. If you haven't registered, there's still time. I think it's going to be a lot of fun as compared to a regular meetup :-)!

Monday, June 10, 2013

Hollywood and OpenStack

I recently wrote an EVault blog about the recent announcement by the Entertainment Technology Center at USC about their Production in the Cloud project. The announcement further states that the project will utilize OpenStack. I think this is a big win for both sides. The media & entertainment industry will win big by using OpenStack cloud technology to help slash IT costs. OpenStack will win since a key vertical is adopting OpenStack in new and interesting ways. My blog explains my views on this topic in more detail; please check it out!

Wednesday, May 1, 2013

OpenStack Swift Comes of Age with the Grizzly Release

I recently wrote an EVault blog about the recent OpenStack Summit and the Coming of Age of Swift. The blog talks about the dynamics around Swift at the OpenStack Summit rather than about specific features of Grizzly (which have been covered by a number of blogs & articles). For example, I talk about the various unconference sessions, which were of very high quality. Please check out the blog.

Friday, January 11, 2013

Microserver Architectures & Cloud Storage

First, apologies for the long pause between postings. I was in the middle of changing jobs -- I'm now at EVault, a Seagate subsidiary that offers cloud-connected backup & restore. I'm excited about this for multiple reasons: i) I get to work on cloud storage full-time as opposed to as a hobby, and they actually pay me for it ;-) ii) EVault is in San Francisco. Just moving from San Jose to San Francisco means such a sea change in the culture. iii) While I can't talk specifics, the project I'm working on is very ambitious and cutting edge!

Now to the topic of the post - there is tremendous industry buzz around the potential use of microserver CPUs (also called "wimpy" cores and most often associated with ARM SoCs) for datacenter applications as an alternative to traditional "brawny" x86 CPUs. These are a new class of light-weight power-efficient CPUs that promise to reduce power, real-estate, and cost while delivering the same aggregate performance. It may, of course, take multiple microserver CPUs to match the performance of one traditional CPU. But as we will see further in the post, that may not matter in scale-out architectures.

A microserver, therefore, is a new class of extremely low-power dense server. CPUs with wimpy cores have several common elements to them:

  • Very low power for a given unit of performance
  • Fewer and lighter cores compared to a typical Xeon or Opteron CPU. Typically these processors also don’t carry the burden of supporting legacy modes, full-blown virtualization, etc.
  • System-on-a-chip (SoC) integration that eliminates an expensive chip-set
  • Need not be based on x86 architecture – While brawny cores in servers are all based on x86, wimpy cores are mostly based on ARM (with the exception of Intel).

There’s a ton of activity from the vendor side, where companies such as AMD, Marvell, Calxeda, Cavium, NVidia, Samsung, Applied Micro, and TI are working on enterprise-class ARM SoCs, while Intel is working on similar Atom products. These companies either already have products or have announced plans for products in this category. The activity is so frenzied that wimpy cores might become a self-fulfilling prophecy!

There is also a lot of buzz from end-users, e.g. this article demonstrates Facebook’s interest in wimpy cores.

However, these new lighter-weight CPUs are not a good fit for all workloads. If one were to broadly classify workloads as A) virtualized B) scale-up C) HPC and D) scale-out, microservers are best suited for scale-out computing. This is because scale-out workloads are typically simple, independent, homogeneous, but numerous and bursty. Scale-out computing is also based on a lot of open-source code which makes it easier to port to new server architectures e.g. ARM.

All of this combined would indicate that wimpy cores are a good fit for cloud storage systems such as Swift. As a reminder Swift is an open-source cloud storage project that is part of the OpenStack effort. But are microservers really a good fit here? Let’s take a look:

Positives of a microserver  architecture for Swift:
  • OpenStack Swift is open-source and runs on Ubuntu. That’s great for microservers since Canonical has taken a leadership role in porting Ubuntu to ARM.
  • OpenStack Swift is indeed a scale-out architecture. This bodes well for microservers to be used here.
  • There is a lot of flexibility in constructing the right compute : memory : storage ratio for OpenStack Swift. In fact one could argue that rather than sticking 24 drives behind a single or dual socket Xeon/ Opteron processor, it might actually be a lot more efficient to stick 4 drives behind one microserver CPU by providing compute a lot closer to storage. This architecture has the promise to reduce cost and power at the same time improving reliability and performance!

Negatives of microservers for Swift:
  • Most microserver CPUs plan to have SATA interfaces and not SAS. This means the architecture is OK for SATA, but difficult to use for nearline SAS drives.
  • By increasing the number of compute nodes, we are putting more strain on the network. This trade-off would have to be looked at.
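
The trade-off between the two lists above can be put into rough numbers. The sketch below uses the 24-drive and 4-drive figures mentioned in the post; the total drive count is a hypothetical value picked only to make the arithmetic easy to follow.

```python
import math


def nodes_needed(total_drives, drives_per_node):
    """Number of compute nodes required to host all drives."""
    return math.ceil(total_drives / drives_per_node)


TOTAL_DRIVES = 240  # hypothetical cluster size, for illustration only

brawny_nodes = nodes_needed(TOTAL_DRIVES, 24)  # 24 drives behind one Xeon/Opteron server
wimpy_nodes = nodes_needed(TOTAL_DRIVES, 4)    # 4 drives behind one microserver CPU

print(brawny_nodes, wimpy_nodes, wimpy_nodes / brawny_nodes)
```

With these assumptions the microserver design needs six times as many nodes, and therefore six times as many network endpoints, for the same number of drives. That is the network strain the last bullet refers to. In exchange, a single node failure takes far fewer drives offline.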

Hopefully one of the above companies will like this use-case enough to run some real performance tests and settle the question conclusively one way or the other. Superficially, though, cloud storage systems such as Swift seem like a good target for a microserver architecture with wimpy cores.

Monday, August 27, 2012

Cold Storage Using OpenStack Swift vs. AWS Glacier

Amazon announced their latest IaaS service last week called Glacier, which is intended for cold storage of data. It is 10x cheaper than Amazon S3 (ignoring access charges). Amazon S3 is already ridiculously cheap as compared to enterprise storage and Glacier takes it to the next level. With Glacier, retrieval needs to be infrequent and can take hours. This restriction is what makes it "cold" storage. It seems like tape-as-a-service to me even though Amazon doesn't use this word at all. However if it walks like a duck and talks like a duck, it must be a duck. In this case tape.

The cost equation is amazing. OpenStack does not have an answer. Does this mean a problem for OpenStack? I don't think so, rather I think this is an opportunity. A combination of OpenStack Swift and Linear Tape File System (LTFS) can not only match, but leapfrog AWS Glacier.

Monday, August 6, 2012

Is OpenStack Swift Reliable Enough for Enterprise Use? (Corrected)

CORRECTION: I had incorrectly interpreted the non-correctable error number as being the probability of bit rot. This is not the case. I've been told that the probability of a silent bit-rot error is actually quite low: 1 (bit up to sector) in 10^21 or lower. Even with this 1 in 10^21 number, the MTTDL improves significantly! Apologies to the Swift community for misrepresenting Swift.

In this blog, I’d like to tackle the reliability of OpenStack Swift. OpenStack Swift is a very successful open-source object storage project that is suitable for public and private cloud storage. I believe reliability is a really important topic to discuss for enterprise adoption of Swift to progress, even though terms such as mean-time-to-data-loss may put even the most die-hards into a deep slumber :-)!!

Thursday, July 5, 2012

The Significance of Hadoop running on OpenStack Swift

The folks at BigDataCraft are working on integrating Hadoop with OpenStack Swift. This is really exciting! Most readers might ask the obvious question: Hadoop already runs very well on HDFS, so why would running it on top of Swift be of any interest at all?

There are two ways to answer this question. One is from the end-user point of view and the other is from a Swift-enthusiast point of view. Let's explore each one.

Thursday, April 19, 2012

Zmanda Tackles the Hardware Selection Problem for Swift

With the OpenStack conference going on in San Francisco, we’re hearing about a number of very interesting announcements & developments around Swift, especially the Essex release. There is a lot to discuss, but I’d like to focus on Zmanda, a cloud backup startup that is tackling a very interesting problem: how does one select the right hardware for Swift?