
Philip Williams
on 10 November 2022

Object storage is a type of storage where data is manipulated as distinct units, called objects. It has accompanied the cloud computing revolution, with S3 (Simple Storage Service) being one of the very first AWS services. Its API later became the de facto industry standard for the majority of object stores.

Object stores have a very simple interface and do not require you to manage complicated SCSI and HBA drivers, multipathing tools, or volume managers embedded in your operating system. Access to storage becomes an application integration: you point your application at an HTTP endpoint and use a simple set of verbs to describe what you want to do with a piece of data. Users and applications can be given access to buckets. A bucket is somewhat analogous to a folder, except that it has no hierarchical behaviour: objects are not nested inside subfolders.
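As a rough illustration of what pointing an application at an HTTP endpoint can look like, the sketch below builds (but never sends) HTTP requests using path-style addressing. The endpoint `https://objects.example.com`, the bucket and key names, and the helper `build_object_request` are all hypothetical. Real S3-compatible services also expect requests to be signed, which is left out here to keep the shape of the verbs visible.

```python
from typing import Optional
from urllib.parse import quote
from urllib.request import Request


def build_object_request(endpoint: str, bucket: str, key: str = "",
                         verb: str = "GET",
                         body: Optional[bytes] = None) -> Request:
    """Build an unsigned, path-style request: <endpoint>/<bucket>/<key>."""
    # The key is one opaque name; a "/" inside it does not create a directory.
    url = f"{endpoint.rstrip('/')}/{quote(bucket)}/{quote(key)}"
    return Request(url, data=body, method=verb)


# Hypothetical endpoint, bucket, and key, for illustration only.
ENDPOINT = "https://objects.example.com"
put_req = build_object_request(ENDPOINT, "photos", "2022/cat.jpg", "PUT", b"image bytes")
get_req = build_object_request(ENDPOINT, "photos", "2022/cat.jpg")
list_req = build_object_request(ENDPOINT, "photos")  # a GET on the bucket itself

# Nothing is sent over the network; we only inspect the requests.
for req in (put_req, get_req, list_req):
    print(req.get_method(), req.full_url)
```

In practice an SDK or client library would normally build and sign these requests for you; the point here is only that the whole interface is a URL plus a verb.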

As an example of how these verbs work, do you want to PUT an object somewhere for safekeeping? Do you want to GET an object so that you can do some work with that piece of data? Or do you want to LIST the contents of your bucket? Perhaps these three verbs are an oversimplification of what is possible with object storage, but this is loosely where cloud object storage began. It was an initiative to make storage more economical by removing proprietary technologies and creating a simple scalable storage solution, without the complexities of legacy technologies.
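To make the three verbs concrete, here is a minimal in-memory sketch. `ToyObjectStore` and its method names are invented for this example and are not part of any real client library. It also shows that a key containing `/` is still a single flat name, so listing by prefix is plain string matching rather than a directory walk.

```python
from typing import Dict, List


class ToyObjectStore:
    """In-memory stand-in for an object store; not a real client library."""

    def __init__(self) -> None:
        # bucket name -> {object key -> object bytes}
        self._buckets: Dict[str, Dict[str, bytes]] = {}

    def put(self, bucket: str, key: str, data: bytes) -> None:
        # PUT stores (or overwrites) a whole object under its key.
        self._buckets.setdefault(bucket, {})[key] = data

    def get(self, bucket: str, key: str) -> bytes:
        # GET returns the whole object; a missing key raises KeyError.
        return self._buckets[bucket][key]

    def list_keys(self, bucket: str, prefix: str = "") -> List[str]:
        # LIST returns key names, optionally filtered by a string prefix.
        keys = self._buckets.get(bucket, {})
        return sorted(k for k in keys if k.startswith(prefix))


store = ToyObjectStore()
store.put("reports", "2022/november.csv", b"a,b\n1,2\n")
store.put("reports", "2022/october.csv", b"a,b\n3,4\n")
store.put("reports", "summary.txt", b"two reports")

print(store.get("reports", "summary.txt"))
print(store.list_keys("reports"))
print(store.list_keys("reports", prefix="2022/"))
```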

Now that we have a basic understanding of object stores, let’s explore some use cases.

Uses of Object Storage

When building a new application, you will need to build it with object storage in mind. Instead of relying on cluster-aware filesystems and quorum devices, the application will need to handle failover and data consistency itself to remain available during infrastructure failures.  

Many off-the-shelf applications now have native deployment models for working with cloud-native infrastructure and, most importantly, with object storage. When your application has finished processing or creating a piece of data, it can write that data to an object store for safekeeping and easily retrieve it as and when needed.

We can even use object storage buckets to trigger events. Imagine a mobile app that uploads photos or videos, which must be processed before publication. Once a photo or video is uploaded to an object store, an event is triggered to let your backend application know that there is a new object to be processed. Once that object has been processed, the output could be written to a bucket that triggers another job to push it to your Content Distribution Network (CDN).
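The upload, process, and publish flow described above can be sketched with a toy in-memory store that calls registered handlers after each PUT. Everything here (`NotifyingStore`, `on_put`, the bucket names, and the two handlers) is hypothetical and only illustrates the pattern. Real object stores typically deliver notifications asynchronously through a queue or messaging service, whereas this sketch calls handlers synchronously for simplicity.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, Dict, List

Handler = Callable[[str, str], None]  # receives (bucket, key)


class NotifyingStore:
    """Toy in-memory store that calls registered handlers after each PUT."""

    def __init__(self) -> None:
        self.objects: DefaultDict[str, Dict[str, bytes]] = defaultdict(dict)
        self._handlers: DefaultDict[str, List[Handler]] = defaultdict(list)

    def on_put(self, bucket: str, handler: Handler) -> None:
        # Register a handler to run whenever an object lands in this bucket.
        self._handlers[bucket].append(handler)

    def put(self, bucket: str, key: str, data: bytes) -> None:
        self.objects[bucket][key] = data
        for handler in self._handlers[bucket]:
            handler(bucket, key)


store = NotifyingStore()
published: List[str] = []


def process_upload(bucket: str, key: str) -> None:
    # Stand-in for real processing (resizing, transcoding, and so on).
    processed = store.objects[bucket][key].upper()
    # Writing the result to a second bucket triggers the next stage.
    store.put("processed", key, processed)


def push_to_cdn(bucket: str, key: str) -> None:
    # Stand-in for a CDN push; here we only record the key.
    published.append(key)


store.on_put("uploads", process_upload)
store.on_put("processed", push_to_cdn)

store.put("uploads", "holiday.jpg", b"raw image bytes")
print(published)
```

Chaining stages through buckets keeps each job independent: the processing step does not need to know that a CDN push exists, only where to write its output.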

Where can I get Object Storage?

There are lots of options available: all the major public clouds have object storage offerings. Some of the most well-known are Azure Blob Storage, Google Cloud Storage, and Amazon S3. Each of these offerings has its own API, but the most commonly used is the S3 API.

The S3 API has been implemented in other storage solutions, such as Ceph and to a certain extent OpenStack Swift. However, Swift’s implementation is not as feature-complete as Ceph’s and is lacking some features around object lifecycle management and notifications.

Major storage vendors, such as Dell EMC and NetApp, also have solutions, which have largely standardised on the S3 API. Yet, when compared with open source solutions, these remain cumbersome and expensive.

Public or private cloud object storage?

The public cloud might not always be the right choice for all workloads, or for storing all of your data. Although the public cloud is instantly accessible, which makes it a great way to get started, it can become rather cost-inefficient over time as your data set grows. Public clouds were created around the notion that you can scale up and down on demand, but storage tends to only scale up. Cloud provider costs include not only the charges for storing data but also charges for retrieving it. Some providers additionally charge for the number of API operations that you request, and for network transfer on top!
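To see how these separate charges add up, consider the following back-of-the-envelope sketch. All rates in `HypotheticalPricing` are placeholder numbers chosen for easy arithmetic, not any provider's real prices, and the workload figures are equally made up. The function simply sums the four kinds of charge mentioned above.

```python
from dataclasses import dataclass


@dataclass
class HypotheticalPricing:
    """Placeholder rates for illustration; NOT any provider's real prices."""
    storage_per_gb_month: float = 0.02
    retrieval_per_gb: float = 0.01
    per_1000_requests: float = 0.005
    egress_per_gb: float = 0.09


def monthly_cost(stored_gb: float, retrieved_gb: float, requests: int,
                 egress_gb: float, p: HypotheticalPricing) -> float:
    """Sum storage, retrieval, API request, and network transfer charges."""
    return (stored_gb * p.storage_per_gb_month
            + retrieved_gb * p.retrieval_per_gb
            + requests / 1000 * p.per_1000_requests
            + egress_gb * p.egress_per_gb)


pricing = HypotheticalPricing()
# Made-up workload: stored data grows every month while access stays constant.
for month, stored_gb in enumerate([100_000, 110_000, 121_000], start=1):
    total = monthly_cost(stored_gb, retrieved_gb=10_000, requests=5_000_000,
                         egress_gb=10_000, p=pricing)
    print(f"month {month}: {stored_gb} GB stored -> {total:,.2f} per month")
```

Because stored capacity rarely shrinks, the storage term keeps growing even when access patterns stay flat, which is the effect the paragraph above describes.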

A privately hosted Ceph solution can provide significant savings when you have predictable capacity requirements. You can also manage your own transit costs more effectively, either into a public cloud via products like AWS Direct Connect or Azure ExpressRoute, or at no cost within your own data centre or colocation facility.

Is S3 on Ceph a solution for you?

A Ceph cluster that is compatible with both the AWS S3 API and the OpenStack Swift API can be a cost-effective way to provide object storage to your applications, by combining open-source software with commodity hardware to meet performance, availability and capacity needs.

Learn more about open source Ceph:

Canonical Charmed Ceph

Blog: Cloud Adjacent Storage

Webinar: Reduce your cloud storage costs with cloud adjacent Ceph
