Tuesday, February 28, 2012

End-User Feedback on OpenStack Swift: A Deeper Look at UCSD's Implementation

I had previously blogged about UCSD's OpenStack Swift storage cloud. Subsequently, I had the good fortune of chatting with Stephen Meier, manager of the storage group at UCSD's San Diego Supercomputer Center (SDSC), to get more details about their Swift implementation. Here's a synopsis.

DISCLAIMER: The views expressed here are my own and don't necessarily represent my employer Emulex's positions, strategies or opinions. 

OpenStack Swift Installation
UCSD's Swift installation is pretty close to the stock OpenStack release, with swauth authentication. Some minor patches were required to correct auth URLs for Cyberduck and Commvault; otherwise there were no major challenges. [Amar: Learn here how to install OpenStack Swift on a set of EC2 machines].
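For context, here is a minimal sketch of the v1.0 auth handshake that swauth-style clients such as Cyberduck perform, and where the auth URL that UCSD had to patch comes into play. The proxy endpoint and credentials below are placeholders, not UCSD's actual values.

```python
# Sketch of the swauth v1.0 auth exchange a Swift client performs.
# The endpoint and account/user names are hypothetical placeholders.

AUTH_URL = "http://swift-proxy.example.edu:8080/auth/v1.0"  # hypothetical

def auth_headers(user: str, key: str) -> dict:
    """Headers for the v1.0 auth GET; a successful (HTTP 200) response
    carries X-Storage-Url and X-Auth-Token in its response headers."""
    return {"X-Auth-User": user, "X-Auth-Key": key}

# A client issues roughly:
#   GET /auth/v1.0 HTTP/1.1
#   X-Auth-User: myaccount:myuser
#   X-Auth-Key: secret
# and then sends all subsequent object requests to the returned
# X-Storage-Url -- this is the URL that needed patching for
# Cyberduck and Commvault to work.

if __name__ == "__main__":
    print(auth_headers("myaccount:myuser", "secret"))
```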

UCSD used ROCKS, a UCSD-developed clustering tool with support for different node roles, instead of Puppet or Chef for the install. This was mostly due to familiarity with the tool, which enabled an easy launch. Fortunately, ROCKS also comes with monitoring, which is very helpful.

For charge-back/billing, UCSD wrote a set of Perl scripts to sample each account's usage, make the right calculations for billing, and then tie it back into the accounting system.
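The source doesn't describe the scripts themselves, but the general shape of such a charge-back calculation is straightforward: periodically sample each account's bytes-used figure (Swift reports it in the X-Account-Bytes-Used header of an account HEAD request) and time-average the samples into GB-months. A rough Python sketch, with made-up rates and sample data:

```python
# Hypothetical charge-back sketch: time-average periodic samples of an
# account's X-Account-Bytes-Used value into GB-months, then price them.
# The rate and the sample values below are invented for illustration.

GB = 10**9
RATE_PER_GB_MONTH = 0.0325  # hypothetical $/GB-month

def gb_months(samples_bytes: list) -> float:
    """Average evenly spaced byte samples taken over one billing month."""
    if not samples_bytes:
        return 0.0
    return sum(samples_bytes) / len(samples_bytes) / GB

def monthly_charge(samples_bytes: list) -> float:
    """Dollar charge for the month, rounded to cents."""
    return round(gb_months(samples_bytes) * RATE_PER_GB_MONTH, 2)

# Four weekly samples of an account that grew from 2 TB to 3 TB:
samples = [2000 * GB, 2200 * GB, 2800 * GB, 3000 * GB]
print(monthly_charge(samples))  # 2500 GB average -> 81.25
```

The output of such a script would then be fed into the campus accounting system, as described above.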

In the future, UCSD might consider Keystone and hook it into the UC-wide LDAP authentication system, which would make things more seamless for users.

Managing Swift
Managing Swift has gone well so far, with no real issues encountered, though the cluster has not yet been stressed enough to see operation at scale. The Swift team operates during regular working hours, although there is a NOC that could provide 24x7 support; more automation tools and scripts are needed to fully utilize the NOC.

Disk failures are easy to deal with: failed drives are simply replaced. Nodes are a little trickier. Since the nodes are so dense (100TB each), the team can't just replace one at the first sign of failure; they have to ensure that the node is truly dead.

Using Swift
This is an area with room for improvement. Finding good clients that can handle files larger than 5GB (Swift's single-object size limit) has been difficult. UCSD has been enhancing its web browsing client and is also working on a Java client that will support files larger than 5GB. The Swift CLI is great for sophisticated users but is missing a number of management features such as rename, copy, and move.
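The standard workaround for the 5GB ceiling is Swift's large-object support: the client splits the file into segments, uploads each under a common prefix, and then PUTs a zero-byte manifest object whose X-Object-Manifest header points at that prefix. A minimal sketch of the client-side bookkeeping, with illustrative container and object names:

```python
# Sketch of how a client works around Swift's 5 GB per-object limit
# using the dynamic-large-object scheme: segments go into a separate
# container under a common prefix, and a zero-byte manifest object
# stitches them together. Names here are illustrative.

SEGMENT_SIZE = 5 * 10**9  # Swift's per-object ceiling, in bytes

def segment_names(obj: str, total_size: int, seg_size: int = SEGMENT_SIZE):
    """Zero-padded segment names so lexicographic order == byte order."""
    count = max(1, -(-total_size // seg_size))  # ceiling division
    return [f"{obj}/{i:08d}" for i in range(count)]

def manifest_headers(container: str, obj: str) -> dict:
    """Headers for the zero-byte manifest PUT into `container`."""
    return {"X-Object-Manifest": f"{container}_segments/{obj}/"}

# A 12 GB upload needs three segments plus one manifest:
names = segment_names("genome.tar", total_size=12 * 10**9)
print(names)  # ['genome.tar/00000000', 'genome.tar/00000001', 'genome.tar/00000002']
print(manifest_headers("research", "genome.tar"))
```

A GET on the manifest object then streams the concatenated segments back as a single logical file, which is the behavior UCSD's enhanced clients need to implement.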

Another area for improvement is mounting Swift as a file system. SDSC is evaluating FUSE modules, but no ideal solution has emerged yet. Some have performance issues; others, such as S3backer, work well and get around the 5GB limit by storing data as blocks, but then other tools can't read that data.

Interest in Swift in Academic Community
There is a lot of interest from other educational institutions. At least 4-5 institutions have asked UCSD how they can copy its Swift “recipe”.

Value proposition in a nutshell
Tape has long been a massive repository of data at UCSD SDSC; researchers use it to write large volumes of data. Swift offers an online way (with significantly lower access latency) to do the same, and may prove more popular over time for this reason [Amar: see my cheap & deep storage tier characterization of this type of private cloud storage]. With no access or metadata-access fees and attractive pricing, the Swift cluster is very useful for researchers.

1 comment:

  1. In terms of using Swift, I agree this is an area with room for improvement. The easier it is to get data into the object store without reworking existing processes, the better. We (Maldivica) have a tool that provides a NAS interface to an OpenStack object store while preserving data integrity, and that automatically extracts and catalogs metadata for later searching and processing. A shameless plug, of course, but the easier it is for people to get data in, the more likely we'll see broader adoption of object stores in the enterprise.