Wednesday, January 4, 2012

Can OpenStack Swift Hit Amazon S3 like Cost Points?

OpenStack Swift is open-source object-storage software that can be used to build an Amazon S3-like private or public cloud storage service. Many questions come to mind about Swift, but none is as critical as cost! The key question is whether someone building cloud storage with OpenStack Swift can sell it to internal or external users at a price competitive with Amazon S3.

This table summarizes my conclusion on this topic. The executive summary is that cost is indeed competitive and should not stop you from moving forward.

Is a Swift Storage Cluster Cost Competitive with Amazon S3?

DISCLAIMER: The views expressed here are my own and don't necessarily represent my employer Emulex's positions, strategies or opinions.

There are two main factors that have to be considered before we can look at hard numbers:


1. Private vs. Public
 
Amazon's pricing ranges from $0.055/GB/month to $0.14/GB/month for standard storage (let's not worry about reduced redundancy storage for now). For simplicity assume the average selling price (ASP) is a simple average i.e. $0.0975/GB/month.

For private cloud storage, I am therefore assuming that the corporate IT department will want to hit a $0.10/GB/month cost number so that internal users see price parity between Amazon S3 and their own internal cloud storage. Just as a reference, this is 3-5x lower than the lowest current enterprise online storage tier, and more than 10x lower than other storage tiers!

For public cloud storage, I am assuming that service providers need to have 65% gross margins. That would mean their cost target would be $0.035/GB/month for storage equivalent in durability to Amazon's standard storage.
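The two targets above are simple arithmetic. Here is a minimal Python sketch using the figures quoted in this post; the function and variable names are illustrative, and the public target is derived from the rounded $0.10 price, as in the text.

```python
# Amazon S3 standard-storage price range quoted above, in $/GB/month.
S3_LOW = 0.055
S3_HIGH = 0.14


def simple_average(low: float, high: float) -> float:
    """Midpoint of a price range, used as a stand-in for the ASP."""
    return (low + high) / 2


def cost_target(price: float, gross_margin: float) -> float:
    """Highest cost that still leaves `gross_margin` when selling at `price`."""
    return price * (1 - gross_margin)


asp = simple_average(S3_LOW, S3_HIGH)
private_target = round(asp, 2)  # the text rounds the ASP to $0.10
public_target = cost_target(private_target, 0.65)

print(f"ASP:            ${asp:.4f}/GB/month")
print(f"Private target: ${private_target:.2f}/GB/month")
print(f"Public target:  ${public_target:.3f}/GB/month")
```

Changing the assumed margin or the price range shifts the public target proportionally, so it is worth rerunning this with your own numbers.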

2. Cost-Availability-Performance Trade-off

The second major consideration is the trade-off among cost, availability, and performance. These factors inherently pull against one another. If you use dense servers (a large number of disks per server) with gigabit Ethernet connectivity, the performance will be lower than if you use less dense servers with more memory connected via 10 Gigabit Ethernet (10GE). But so will the cost! Similarly, if you compare a 24x7 dev ops staff with triple replication on Swift against a 9-5 staff with double replication, the latter will cost less but come with lower reliability. The following analysis makes some assumptions about where to sit on the cost-availability-performance spectrum.

The total-cost-of-ownership (TCO) of a Swift cluster is the sum of capital expenditures (CAPEX) amortized over a certain time-period and operational expenditures (OPEX).
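As a rough sketch of that definition, the per-GB monthly TCO can be written as a small Python function. The amortization period and all input values below are placeholders chosen only to show the arithmetic; they are not figures from this analysis.

```python
def tco_per_gb_month(
    capex_total: float,
    amortization_months: int,
    annual_opex: float,
    usable_gb: float,
) -> float:
    """TCO in $/GB/month: amortized CAPEX plus OPEX, spread over usable capacity."""
    capex_monthly = capex_total / amortization_months
    opex_monthly = annual_opex / 12
    return (capex_monthly + opex_monthly) / usable_gb


# Hypothetical inputs for illustration only (NOT from this post):
# $1.0M of hardware amortized over 36 months, $300K/year of OPEX, 2 PB usable.
example = tco_per_gb_month(
    capex_total=1_000_000,
    amortization_months=36,
    annual_opex=300_000,
    usable_gb=2_000_000,
)
print(f"${example:.4f}/GB/month")
```

Note that the divisor is usable capacity, not raw capacity, so replication and spare capacity raise the per-GB figure even though they do not appear explicitly.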


Capital Expenditure Cost (CAPEX)

The easier of the two costs is CAPEX. Assuming a set of "reasonable" availability and performance targets for a public cloud, the numbers work out as follows using a simplistic model. I am assuming an Open Compute implementation to get the lowest cost and power.






If someone wanted to optimize performance in a private cloud storage implementation and populate only, say, 4 disks per server, the numbers are as follows:






Finally, let's say a small private cloud storage implementation wanted to reduce its dev ops commitment by increasing the replication factor to 4. The numbers are as follows:





Obviously, many more factors need to be worked into the model: the amount of memory, the use (or not) of SSDs, the interface and speed of the disk drives (SAS vs. SATA, 10K RPM vs. 7.2K RPM, etc.), the CPU speed, the network speed (GE vs. 10GE), the use of SSL acceleration, the type of load-balancing switch, and so on. There are other considerations too, such as the blended cost of the cluster as it is built up over time: nobody is going to do a full build-out on day one, and future purchases will usually cost less than current ones. A spreadsheet wiz in your company can take this model to the next level.


OK, so the good news is that, using a conservative model, we are at $0.0162/GB/month for standard storage and $0.0202/GB/month for high-reliability storage. So far so good.


Operational Expenditure (OPEX)

The scale of your cloud storage implementation is critical in determining the OPEX since the scale directly impacts the number of dev ops employees required to run the cluster. Power, real-estate, and cooling costs are really not material on a per GB basis so I'm ignoring these for this analysis.

5PB+ Scale
Let's say a user has 5PB+ usable storage (i.e. 22PB+ raw storage with triple replication and 10% spare capacity). For this, you would need 4-5 dev ops folks to provide 24x7 coverage. You might also need around 2 architect-developers to do architecture, integration, code enhancement, debugging, etc. Hopefully any innovation will be contributed back to the community!

Let's say the cost-to-company for these employees is $150K/yr * 7 = $1.05M. At 5PB, this works out to be $0.018/GB/month.
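The staffing arithmetic can be checked with a few lines of Python. This is a sketch that assumes decimal units (1 PB = 1,000,000 GB); the function name is illustrative.

```python
GB_PER_PB = 1_000_000  # decimal units, an assumption of this sketch


def opex_per_gb_month(staff: int, cost_per_head_per_year: float, usable_pb: float) -> float:
    """Staff cost spread over usable capacity, in $/GB/month."""
    annual_cost = staff * cost_per_head_per_year
    return annual_cost / (usable_pb * GB_PER_PB * 12)


# 7 people (dev ops plus architect-developers) at $150K/yr, 5 PB usable.
opex = opex_per_gb_month(staff=7, cost_per_head_per_year=150_000, usable_pb=5)
print(f"${opex:.4f}/GB/month")  # $0.0175, quoted as $0.018 after rounding
```

Because the staff cost is fixed, this per-GB figure falls in direct proportion as usable capacity grows beyond 5PB.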

Note: These numbers assume North American dev ops staff costs. Your costs could be lower if your dev ops team is in a different part of the world.



CAPEX + OPEX = $0.034/GB/month, which meets the lower of the two targets, $0.035/GB/month. Since I'm using conservative numbers all around, in reality we should come in below $0.034/GB/month.

1-5PB Scale
Most installations are not going to be 5PB+, so let's look at this range, which will probably be the sweet spot for public clouds. Obviously we can't afford to hire 7 folks to run a cluster of this size. In this case we won't hire the 2 architect-developers, meaning users in this range can no longer innovate on their own. Nor can they architect, tune, or set up their cluster, train their dev ops staff, or integrate Swift into their environment. So these users have to outsource this function by using consulting services to get up and running, and use code already developed by the community or by specialized product companies. Not a major constraint really... Users in the 1-5PB range probably don't want to innovate much anyway.

Now we need to worry about dev ops. Either the user will need 4-5 dev ops folks to provide 24x7 coverage, or they can go with a higher replication factor such as 4 and have only 9-5 (business day) coverage with 1-2 dev ops folks. We will ultimately have to look at whether 4-way replication with 9-5 support provides the required durability, but for now let's make that leap of faith. (Clearly I'm signing myself up for another blog post here.) For 2.5PB usable storage, we are at $0.041/GB/month with 24x7 dev ops and $0.030/GB/month with 9-5 dev ops and 4-way replication. Since these numbers are conservative, I'd say we'll hit our numbers on both public and private, with the very minor constraint of using consulting services to get going.
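The two 2.5PB figures can be reproduced with a short Python sketch. It assumes the upper end of each staffing range (5 people for 24x7, 2 people for 9-5), $150K per head, decimal units, and the CAPEX figures quoted earlier; the staff counts are my inference from the totals, not numbers stated outright.

```python
GB_PER_PB = 1_000_000      # decimal units, an assumption of this sketch
COST_PER_HEAD = 150_000    # $/year, as in the 5PB+ case

# CAPEX in $/GB/month by replication factor, from the CAPEX section.
CAPEX = {3: 0.0162, 4: 0.0202}


def total_cost(usable_pb: float, staff: int, replication: int) -> float:
    """CAPEX plus staff OPEX, in $/GB/month."""
    opex = staff * COST_PER_HEAD / (usable_pb * GB_PER_PB * 12)
    return CAPEX[replication] + opex


around_the_clock = total_cost(2.5, staff=5, replication=3)
business_hours = total_cost(2.5, staff=2, replication=4)

print(f"24x7, 3-way replication: ${around_the_clock:.3f}/GB/month")  # $0.041
print(f"9-5,  4-way replication: ${business_hours:.3f}/GB/month")    # $0.030
```

The extra replica costs about $0.004/GB/month in CAPEX, while dropping three staff saves about $0.015/GB/month at this scale, which is why the 9-5 option comes out cheaper.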



<1PB Scale
Private clouds are likely to be in this range and perhaps some public clouds. Here we certainly can't have any architect-developers. Even for dev ops, 24x7 is difficult unless they can be shared with some other projects. So let's go with 9-5 dev ops and 4-way replication. In many cases 3-way replication might be just fine, but no harm being conservative.



The total cost is $0.045/GB/month for 500TB which still hits the private cloud number. For private clouds, in fact a cluster as small as 150TB usable storage still hits the target. 
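A small Python sketch shows where these numbers come from and where the break-even point sits. It assumes a single dev ops person at $150K/yr (which reproduces the $0.045 figure), decimal units, and the CAPEX figures from earlier; the function names are illustrative.

```python
GB_PER_TB = 1_000          # decimal units, an assumption of this sketch
COST_PER_HEAD = 150_000    # $/year
PRIVATE_TARGET = 0.10      # $/GB/month


def total_cost(usable_tb: float, staff: int, capex: float) -> float:
    """CAPEX plus staff OPEX, in $/GB/month."""
    return capex + staff * COST_PER_HEAD / (usable_tb * GB_PER_TB * 12)


def break_even_tb(target: float, staff: int, capex: float) -> float:
    """Smallest usable capacity (TB) at which total cost equals `target`."""
    if target <= capex:
        raise ValueError("target must exceed CAPEX per GB-month")
    return staff * COST_PER_HEAD / (12 * (target - capex)) / GB_PER_TB


cost_500tb = total_cost(500, staff=1, capex=0.0202)
four_way = break_even_tb(PRIVATE_TARGET, staff=1, capex=0.0202)
three_way = break_even_tb(PRIVATE_TARGET, staff=1, capex=0.0162)

print(f"500TB, 4-way: ${cost_500tb:.3f}/GB/month")
print(f"Break-even, 4-way: {four_way:.0f} TB usable")
print(f"Break-even, 3-way: {three_way:.0f} TB usable")
```

Under these assumptions the break-even lands at roughly 157TB with 4-way replication and roughly 149TB with 3-way, in line with the "about 150TB" figure.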

Since this is conservative, we should be fine with a public cloud as well. The public cloud provider will probably need to give up the 65% gross margin target and accept a lower gross margin. The nice thing about open-source software is that it continually improves. As automation features keep getting added to Swift, the dev ops burden will keep going down, in turn improving the gross margin.
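To see how much margin is given up at each scale, here is a Python sketch that converts the total costs quoted in this post into gross margins. It assumes the provider sells at the rounded $0.10/GB/month price point; a different selling price changes every line of the output.

```python
PRICE = 0.10  # $/GB/month selling price (assumption: the rounded S3 price point)

# Total cost figures quoted in this post, in $/GB/month.
SCENARIOS = {
    "5PB, 24x7, 3-way": 0.034,
    "2.5PB, 24x7, 3-way": 0.041,
    "2.5PB, 9-5, 4-way": 0.030,
    "500TB, 9-5, 4-way": 0.045,
}


def gross_margin(price: float, cost: float) -> float:
    """Gross margin as a fraction of the selling price."""
    return (price - cost) / price


margins = {name: gross_margin(PRICE, cost) for name, cost in SCENARIOS.items()}

for name, margin in margins.items():
    print(f"{name:<20} {margin:6.1%}")
```

At this assumed price, the 500TB case yields a margin of about 55%, which is the kind of shortfall against the 65% target described above.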


Summary

Swift can definitely hit Amazon S3-like cost targets. Service providers can make their 65% gross margin for public clouds at 1PB+ usable storage, and sacrifice some margin below that. Private cloud providers can hit the cost targets at any capacity above about 150TB usable storage.

In short, cost is not an inhibitor. If you are considering Swift and worried about cost, rest assured and move forward!

Please feel free to leave comments. I'd love a lively discussion on this topic.

3 comments:

  1. Hey Amar,
    [Disc: I work for Scality, Object-storage ISV]

    Very interesting article. I think you're missing a point, though: you're not incorporating any development and integration cost of bringing OpenStack into the existing infrastructure.
    I believe that right now this is one of the big issues OpenStack is facing (when you talk to real users). Deployment and integration can take a lot of time and effort (meaning engineers and senior architects, whose cost to a company is well above $150K a year).

    That's why there's a surge of companies starting to sell services around OpenStack (like Piston or StackOps, et al.). Either incorporate their costs or the cost of in-house engineers into your model and it will have even more value. It will, however, show that it's not that financially wise to deploy and manage an OpenStack-based PB+ scale cloud storage.

    I'd be interested in seeing the numbers.

    -Marc

    Replies
    1. Perhaps an initial CAPEX investment of 150K or more can be easily offset in several ways: by implementing tiered storage architectures within the object store, which drives a more efficient overall purchase of storage; by offering AWS-like self-service and payback models for various use cases, which provide self-management of resources that typically drain in-house IT resources dry; and by offering CAPEX-to-OPEX opportunities to reduce overall CAPEX strain.
