This table summarizes my conclusions on this topic. The executive summary is that cost is indeed competitive and should not stop you from moving forward.
|Is a Swift Storage Cluster Cost Competitive with Amazon S3?|
There are two main factors that have to be considered before we can look at hard numbers:
1. Private vs. Public
Amazon's pricing ranges from $0.055/GB/month to $0.14/GB/month for standard storage (let's not worry about reduced redundancy storage for now). For simplicity, assume the average selling price (ASP) is the simple average of those two endpoints, i.e. $0.0975/GB/month.
For private cloud storage, I am therefore assuming that the corporate IT department will want to hit a $0.10/GB/month cost number so that internal users see price parity between Amazon S3 and their own internal cloud storage. Just as a reference, this is 3-5x lower than the lowest current enterprise online storage tier, and more than 10x lower than other storage tiers!
For public cloud storage, I am assuming that service providers need 65% gross margins. Selling at roughly the same $0.10/GB/month, that leaves 35% of the price for cost, so their cost target would be $0.035/GB/month for storage equivalent in durability to Amazon's standard storage.
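The two targets above can be reproduced with a few lines of arithmetic. This is a minimal sketch; it assumes the 65% margin is applied to the rounded $0.10/GB/month price rather than to the exact ASP, which is what makes the $0.035 figure come out. The function and constant names are mine, not part of any tool.

```python
# Figures quoted in the text, in USD per GB per month.
S3_LOW = 0.055
S3_HIGH = 0.14
PRIVATE_TARGET = 0.10   # internal price-parity target from the text
GROSS_MARGIN = 0.65     # public provider gross margin from the text


def simple_average(low: float, high: float) -> float:
    """Midpoint of the two price endpoints."""
    return (low + high) / 2


def cost_target(price: float, gross_margin: float) -> float:
    """Highest cost that still leaves `gross_margin` at the given price."""
    return price * (1 - gross_margin)


asp = simple_average(S3_LOW, S3_HIGH)
# Assumption: the margin is taken against the rounded $0.10 price.
public_target = cost_target(PRIVATE_TARGET, GROSS_MARGIN)

print(f"ASP:                ${asp:.4f}/GB/month")
print(f"Public cost target: ${public_target:.3f}/GB/month")
```

Swapping in your own price points or margin requirement changes the targets that the rest of the analysis has to hit.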
2. Cost-Availability-Performance Trade-off
The second major consideration is the trade-off between cost, availability, and performance. These factors inherently conflict. If you use dense servers (a large number of disks per server) with gigabit Ethernet connectivity, the performance will be lower than if you use less dense servers with more memory connected via 10 Gigabit Ethernet (10GE). But so will the cost! Similarly, if you have a 24x7 dev ops staff with triple replication on Swift vs. a 9-5 staff with double replication, the latter will cost less but come with lower reliability. The following analysis makes some assumptions along the cost-availability-performance vector.
The total-cost-of-ownership (TCO) of a Swift cluster is the sum of capital expenditures (CAPEX) amortized over a certain time-period and operational expenditures (OPEX).
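As a rough sketch of that definition, the per-GB monthly TCO can be written as amortized CAPEX plus OPEX. Everything in the example call is a placeholder of mine: the hardware cost, the 36-month amortization period, and the annual OPEX are not figures from this post, and I am assuming decimal units (1PB = 1,000,000GB).

```python
GB_PER_PB = 1_000_000  # assumption: decimal units


def capex_per_gb_month(hardware_cost_usd: float, usable_pb: float,
                       amortization_months: int) -> float:
    """Hardware spend spread evenly over usable capacity and the amortization period."""
    return hardware_cost_usd / (usable_pb * GB_PER_PB) / amortization_months


def tco_per_gb_month(hardware_cost_usd: float, usable_pb: float,
                     amortization_months: int, annual_opex_usd: float) -> float:
    """Amortized CAPEX plus OPEX, both in USD per usable GB per month."""
    capex = capex_per_gb_month(hardware_cost_usd, usable_pb, amortization_months)
    opex = annual_opex_usd / (usable_pb * GB_PER_PB) / 12
    return capex + opex


# Placeholder inputs for illustration only, NOT taken from the article.
example = tco_per_gb_month(
    hardware_cost_usd=1_000_000,
    usable_pb=1,
    amortization_months=36,
    annual_opex_usd=300_000,
)
print(f"Example TCO: ${example:.4f}/GB/month")
```

Note that the cost is divided by usable capacity, not raw capacity, so the replication factor shows up as a higher hardware cost for the same usable PB.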
Capital Expenditure Cost (CAPEX)
The easier of the two costs is CAPEX. Assuming a set of "reasonable" availability and performance targets for a public cloud, the numbers work out as follows using a simplistic model. I am assuming an Open Compute implementation to get the lowest cost and power.
If someone, in a private cloud storage implementation, wanted to optimize performance and populate only, say, 4 disks per server, the numbers are as follows:
Finally, if a small private cloud storage implementation wanted to reduce its dev ops commitment by increasing the replication factor to 4, the numbers are as follows:
Obviously, there are a lot more factors that need to be worked into the model: the amount of memory, the use (or not) of SSDs, the interface and speed of the disk drives (SAS vs. SATA, 10K RPM vs. 7.2K RPM etc.), the CPU speed, the network speed (GE vs. 10GE), the use of SSL acceleration, the type of load balancing switch, and so on. There are other considerations too, such as the blended cost of the cluster as it is built up over time: nobody is going to do a full build-out on day one, and future purchases will usually cost less than current ones. A spreadsheet wiz in your company can take this model to the next level.
OK, so the good news is that, using a conservative model, we are at $0.0162/GB/month for standard storage and $0.0202/GB/month for high-reliability storage. So far so good.
Operational Expenditure (OPEX)
The scale of your cloud storage implementation is critical in determining the OPEX since the scale directly impacts the number of dev ops employees required to run the cluster. Power, real-estate, and cooling costs are really not material on a per GB basis so I'm ignoring these for this analysis.
Let's say a user has 5PB+ usable storage (i.e. 22PB+ raw storage with triple replication and 10% spare capacity). For this, you would need 4-5 dev ops folks to provide 24x7 coverage. You might also need around 2 architect-developers to do architecture, integration, code enhancement, debugging etc. Hopefully any innovation will be contributed back to the community!
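To sanity-check the usable-to-raw conversion, here is a minimal sketch. It assumes the simplest possible model, where raw capacity is usable capacity times the replica count, plus the spare fraction on top; the function name is hypothetical.

```python
def raw_capacity_pb(usable_pb: float, replicas: int, spare_fraction: float) -> float:
    """Raw capacity needed for a given usable capacity.

    Assumption: spare capacity is added on top of the replicated data,
    i.e. raw = usable * replicas * (1 + spare_fraction).
    """
    return usable_pb * replicas * (1 + spare_fraction)


for replicas in (3, 4):
    raw = raw_capacity_pb(usable_pb=5, replicas=replicas, spare_fraction=0.10)
    print(f"{replicas} replicas: about {raw:.1f}PB raw for 5PB usable")
```

Under this simple model, three replicas come to about 16.5PB raw and four replicas to about 22PB, so the 22PB figure above lines up with the four-replica case. The original numbers may include overhead that this sketch does not capture.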
Let's say the cost-to-company for these employees is $150K/yr * 7 = $1.05M. At 5PB, this works out to be $0.018/GB/month.
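The staffing arithmetic above, as a small sketch. It assumes decimal units (1PB = 1,000,000GB) and treats staff as the only OPEX item, as the text does; the function name is mine.

```python
GB_PER_PB = 1_000_000  # assumption: decimal units


def staff_opex_per_gb_month(headcount: int, cost_per_head_usd: float,
                            usable_pb: float) -> float:
    """Annual staff cost spread over usable capacity, in USD per GB per month."""
    annual_cost = headcount * cost_per_head_usd
    return annual_cost / (usable_pb * GB_PER_PB) / 12


# 4-5 dev ops plus 2 architect-developers, taken as 7 people at $150K/yr each.
opex = staff_opex_per_gb_month(headcount=7, cost_per_head_usd=150_000, usable_pb=5)
print(f"OPEX: ${opex:.4f}/GB/month")
```

The result is $0.0175/GB/month, which the text rounds to $0.018. Because the staff cost is fixed, this number drops as usable capacity grows beyond 5PB.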
Note: These figures assume North American dev ops staff costs. Your costs could be lower if your dev ops team is in a different part of the world.
CAPEX + OPEX = $0.034/GB/month, which meets the lower of the two targets, $0.035/GB/month. Since I'm using conservative numbers all around, in reality we should come in below the $0.034/GB/month number.
Most installations are not going to be 5PB+, so let's look at the 1-5PB range, which will probably be the sweet spot for public clouds. Obviously we can't afford to hire 7 folks to run a cluster of this size. In this case we won't hire the 2 architect-developers, meaning users in this range can no longer innovate on their own. Nor can they architect, tune, and set up their cluster, train their dev ops staff, or integrate Swift into their environment. So these users have to outsource this function, using consulting services to get up and running and relying on code already developed by the community or by specialized product companies. Not a major constraint really... Users in the 1-5PB range probably don't want to innovate much anyway.
Now we need to worry about dev ops. Either the user will need 4-5 dev ops folks to provide 24x7 coverage, or go with a higher replication factor like 4 and have only 9-5 (business day) coverage with 1-2 dev ops folks. We will ultimately have to look at whether 4-way replication with 9-5 support provides the required durability, but for now let's make that leap of faith. (Clearly I'm signing myself up for another blog post here.) For 2.5PB usable storage, we are at $0.041/GB/month with 24x7 dev ops and $0.030/GB/month with 9-5 dev ops and 4-way replication. Since these numbers are conservative, I'd say we'll hit our numbers on both public and private, with the very minor constraint of using consulting services to get going.
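The totals quoted so far can be reproduced with the sketch below. The headcounts are my reading of the text, not stated explicitly: 7 people at 5PB, 5 for 24x7 coverage at 2.5PB, and 2 for 9-5 coverage (the upper end of each quoted range). I am also assuming the $0.0202 "high reliability" CAPEX figure corresponds to 4-way replication, and decimal units for capacity.

```python
GB_PER_PB = 1_000_000      # assumption: decimal units
COST_PER_HEAD = 150_000    # USD per year, from the text
PUBLIC_TARGET = 0.035      # USD/GB/month, from the text
PRIVATE_TARGET = 0.10      # USD/GB/month, from the text

# (label, usable PB, assumed headcount, CAPEX in USD/GB/month)
SCENARIOS = [
    ("5PB, 24x7 + architects, 3 replicas", 5.0, 7, 0.0162),
    ("2.5PB, 24x7, 3 replicas", 2.5, 5, 0.0162),
    ("2.5PB, 9-5, 4 replicas", 2.5, 2, 0.0202),
]


def total_cost(usable_pb: float, headcount: int, capex: float) -> float:
    """CAPEX plus staff OPEX, in USD per usable GB per month."""
    opex = headcount * COST_PER_HEAD / (usable_pb * GB_PER_PB) / 12
    return capex + opex


totals = {label: total_cost(pb, heads, capex) for label, pb, heads, capex in SCENARIOS}

for label, total in totals.items():
    print(f"{label:<36} ${total:.4f}/GB/month  "
          f"public target met: {total <= PUBLIC_TARGET}  "
          f"private target met: {total <= PRIVATE_TARGET}")
```

With these assumptions the 2.5PB cases come to about $0.041 and $0.030, matching the figures above. The 24x7 case sits slightly over the public target before any allowance for the conservative inputs.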
Private clouds, and perhaps some public clouds, are likely to be in the range below 1PB. Here we certainly can't have any architect-developers. Even for dev ops, 24x7 is difficult unless the staff can be shared with other projects. So let's go with 9-5 dev ops and 4-way replication. In many cases 3-way replication might be just fine, but there's no harm in being conservative.
Since this is conservative, we should be fine with a public cloud as well. The public cloud provider will probably need to give up the 65% gross margin target and accept a lower gross margin. The nice thing about open-source software is that it continually improves. As automation features keep getting added to Swift, the dev ops burden will keep going down, in turn improving the gross margin.
Swift can definitely hit Amazon S3-like cost targets. Service providers can make their 65% gross margin for public clouds at 1PB+ usable storage, and sacrifice some margin below that. Private cloud providers can hit the cost targets at anything above about 150TB of usable storage.
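One way to arrive at a break-even capacity like the 150TB figure is sketched below. It assumes a single dev ops person at $150K/yr, the $0.0202/GB/month CAPEX for 4-way replication, the $0.10/GB/month private target, and decimal units; these are my reading of the earlier numbers, not a stated derivation.

```python
def break_even_usable_tb(annual_staff_cost_usd: float,
                         capex_per_gb_month: float,
                         target_per_gb_month: float) -> float:
    """Smallest usable capacity (TB) at which CAPEX plus staff OPEX meets the target."""
    opex_budget = target_per_gb_month - capex_per_gb_month
    if opex_budget <= 0:
        raise ValueError("CAPEX alone already exceeds the target")
    usable_gb = annual_staff_cost_usd / 12 / opex_budget
    return usable_gb / 1000  # assumption: 1TB = 1,000GB


private_break_even = break_even_usable_tb(
    annual_staff_cost_usd=150_000,   # one dev ops person (assumed headcount)
    capex_per_gb_month=0.0202,       # 4-way replication figure from the text
    target_per_gb_month=0.10,        # private cloud target from the text
)
print(f"Private break-even: about {private_break_even:.0f}TB usable")
```

Under these assumptions the break-even lands a little above 150TB, consistent with the "about 150TB" figure. A second dev ops person would roughly double it.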
In short, cost is not an inhibitor. If you are considering Swift and worried about cost, rest assured and move forward!