Philip Williams
on 19 January 2023

Cloud storage pricing – how to optimise TCO

The flexibility of public cloud infrastructure allows for little to no upfront expense, and is great when starting a venture or testing an idea.  But once a dataset grows and becomes predictable, it can become a significant base cost, compounded further by additional costs depending on how you are consuming that data.

Public clouds were initially popularised under the premise that workloads are dynamic, and that you could easily match available compute resources to the peaks and troughs in your consumption, rather than having to maintain mostly idle buffer capacity to meet peak user demands.  In essence, this shifts sunk capital into variable operational expense.

However, it has become apparent that this isn’t necessarily true of public cloud storage.  What is typically observed in a production environment is continual growth of all datasets.  Data that is actively used for decision making or transactional processing in databases tends to age out, but needs to be retained for audit and accountability purposes.  Training data for AI/ML workloads grows, allowing models to become more refined and accurate over time.  Content and media repositories grow daily, and grow faster still as higher-quality recording equipment comes into use.

How is public cloud storage priced?

Typically there are three areas where costs are incurred.

  • Capacity ($/GB): the amount of space you use for storing your data, or the amount of space you allocate/provision for a block volume.
  • Transactions: charges incurred when you interact with the dataset.  In an object storage context, these can be PUT/GET/DEL operations.  In a block storage context, these can be allocated IOPS or throughput (MB/s).
  • Bandwidth (egress): object storage can also incur additional bandwidth charges when you access your data from outside a cloud provider’s infrastructure, or from a VM or container in a different compute region. These charges can even apply when you have deployed your own private network links to a cloud provider!

If in the future you decide to move your data to another public cloud provider, you would incur these costs during migration too!
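
To give a feel for the scale of a one-off migration charge, here is a minimal sketch of the arithmetic. The egress rate is a hypothetical placeholder, not a quote from any provider, and the function name and the decimal GB-per-PB conversion are assumptions made for illustration.

```python
def migration_egress_cost(dataset_pb, egress_price_per_gb):
    """One-off cost in dollars of transferring a dataset out of a provider."""
    dataset_gb = dataset_pb * 1_000_000  # decimal GB per PB (assumption)
    return dataset_gb * egress_price_per_gb


if __name__ == "__main__":
    # Hypothetical rate of $0.05/GB, chosen only to illustrate the arithmetic.
    print(f"${migration_egress_cost(5, 0.05):,.0f}")
```

Substitute your provider's published egress rate, and any tiering it applies, to get a figure that is meaningful for your own dataset.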

Calculating cloud storage TCO

Imagine you have a dataset that’s 5PB and you want to understand its total cost of ownership (TCO) over 5 years.  First we need to make some assumptions about the dataset and how frequently it will be accessed.

Over the lifetime of the dataset we will assume that it will be written to twice, so 10PB of written data.  We will also assume that it will be read 10 times, so 50PB of read data, and that the average object size is 10MB.

In a popular public cloud, object storage capacity starts at $0.023/GB per month, and as usage increases the price decreases to $0.021/GB per month.  You are also charged for the transactions to store and retrieve the data.  These costs sound low, but as you start to scale up and then consider the multi-year cost, they can quickly rise to significant numbers.

For the 5PB example, the TCO over 5 years is over $7,000,000, and that’s before you even consider any charges for compute to interact with the data, or egress charges to access the dataset from outside of the cloud provider’s infrastructure.
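
The arithmetic behind an estimate like this can be sketched in a few lines. The following is a rough model under stated assumptions, not the calculation used for the figure above: it treats the dataset as 5PB from day one, uses decimal units, applies the lower $0.021/GB monthly rate to all capacity, and uses hypothetical request prices (`put_price_per_1000`, `get_price_per_1000`) that do not come from the article.

```python
GB_PER_PB = 1_000_000        # decimal units (assumption)
MB_PER_PB = 1_000_000_000    # decimal units (assumption)


def estimate_object_storage_tco(
    capacity_pb=5.0,
    years=5,
    price_per_gb_month=0.021,    # lower tier from the article, applied flat
    written_pb=10.0,             # dataset written twice
    read_pb=50.0,                # dataset read 10 times
    avg_object_mb=10.0,
    put_price_per_1000=0.005,    # hypothetical request price
    get_price_per_1000=0.0004,   # hypothetical request price
):
    """Return the cost components in dollars, excluding compute and egress."""
    capacity = capacity_pb * GB_PER_PB * price_per_gb_month * 12 * years
    # Number of requests = data volume divided by average object size.
    put_requests = written_pb * MB_PER_PB / avg_object_mb
    get_requests = read_pb * MB_PER_PB / avg_object_mb
    puts = put_requests / 1000 * put_price_per_1000
    gets = get_requests / 1000 * get_price_per_1000
    return {
        "capacity": capacity,
        "put_requests": puts,
        "get_requests": gets,
        "total": capacity + puts + gets,
    }


if __name__ == "__main__":
    for name, dollars in estimate_object_storage_tco().items():
        print(f"{name:>13}: ${dollars:,.0f}")
```

Under these assumptions the capacity term dominates and request charges are comparatively small. The result will not match the figure above exactly, since that depends on the rate tier, the unit convention, and which charges are included.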

Balancing costs with flexibility

Is there another way to tackle these mounting storage costs, yet also retain the flexibility of deploying workloads in the cloud?

IT infrastructure is increasingly flexible.  With some planning, it is possible to operate an open-source storage infrastructure based on Charmed Ceph that is fully managed by experts, located adjacent to a public cloud region, and connected to that cloud via private links to ensure the highest availability and reliability.  Using the same usage assumptions as before, a private storage solution can cut your storage costs by a factor of 2-3 or more over a 3-5 year period.

Having your data stored using open-source Charmed Ceph in a neutral location, yet near to multiple public cloud providers, unlocks a new level of multi-cloud flexibility.  For example, should one provider start offering a specific compute service that is not available elsewhere, you can make your data accessible to that provider without incurring the significant access or migration costs you would face when accessing one provider’s storage from another provider’s compute offering.

Additionally, you can securely expose your storage system to your users via your own internet connectivity, without incurring public cloud bandwidth fees.

Later this quarter we will publish a detailed whitepaper with a breakdown of all the costs of both of these solutions, alongside a blueprint of the hardware and software used.  Make sure to sign up for our newsletter using the form on the right-hand side of this page (cloud and server category) to be notified when it is released.
