Ceph Storage - Why Ceph? (part 1)

This is a new series of articles about Ceph. I am not a Ceph expert and my knowledge of it is still very limited, but I find Ceph a very interesting storage option. So what is Ceph? According to the developer website, “Ceph is a distributed object store and file system designed to provide excellent performance, reliability and scalability”.

I have been working with storage in the virtualization and cloud business for the past 10 years and have been dealing with the same problems over and over. Here are just a few examples of the problems with legacy storage solutions:

Running out of storage:

  • Replacing the hard drives in our current boxes with bigger ones? That would be quite expensive. Also, what would we do with the old drives? Not to mention how long it would take to replace several hard drives and resync our RAID. No way.

  • How about buying a new NetApp and clustering it with the one we have? The NetApp is too expensive for our limited budget. Is there any cheaper option? Hmm…

  • How about adding a new Linux box with NFS/CIFS? Do we need to attach new storage to our XenServers? Can we just increase the current SRs somehow to make it easier? Hmm…

Storage performance:

  • Unfortunately, a single storage box is limited by its IOPS or link speed. The only solution would be to move some of your content to another storage box. Not an ideal solution, right?

  • We have too many users or services accessing the box. There is not much we can do unless we buy another box.

  • When something happens to a RAID array, performance is degraded until the issue is fixed.

  • What RAID level should we use for the best performance?

Vendor lock-in:

  • We might have to buy a new storage box to deal with the performance issues, but it will be slightly different from the one we have. We don't want to go with the same vendor again.

  • The vendor we bought our last storage box from is too expensive. Is there any other, cheaper vendor to buy storage from?

We could list many more problems and challenges storage admins have to deal with, but that would take too much time. My experience tells me that over time you either end up with different boxes running different systems, which becomes a nightmare to manage, or you end up stuck with one vendor and pay the price.

Ceph can solve these problems and provides many benefits.

  1. You only need commodity hardware that runs Linux. Yes, that's right. Almost any hardware is suitable today. This also means no vendor lock-in.

  2. You can add any box with any number of hard drives to the cluster. The drives can differ in size, type or vendor too.

  3. You don’t need expensive RAID cards. Ceph manages each drive independently.

  4. The more boxes you add, the more reliable the storage will be and the better it will perform.

  5. Decommissioning old hardware is easy. You just add the new hardware and remove the old.

You can read more about Ceph and its benefits here - http://ceph.com/ceph-storage/

So here is the ultimate question. Why implement Ceph? In summary, Ceph provides flexibility and removes vendor lock-in. You can buy any box with any number of drives and add it to the cluster. You can also replace old boxes with new ones without affecting the cluster, users or services. This means you have a lot more choice.

In my next article, we will look at the concepts behind Ceph and what we need for a Ceph deployment. Stay tuned and don't hesitate to post your storage problems in the comments!
