
Andreea Munteanu
on 5 March 2024

PostgreSQL for AI applications


If you’re working with AI, you’re working with data. From numerical data to videos or images, regardless of your industry or use case, every AI project depends on data in some form. The question is: how can you efficiently store that data and use it when building your models? One answer is PostgreSQL, a proven and well-loved database that, thanks to recent developments, has become a strong choice to support AI.

Why PostgreSQL?

PostgreSQL is an open-source, highly capable database system that supports features such as foreign keys, subqueries, triggers, and user-defined types and functions. In recent years, PostgreSQL has become one of the most popular databases, and DB-Engines named it database management system (DBMS) of the year for 2023.

PostgreSQL has applications across all industries, such as FinOps and e-commerce. It also fits a variety of workloads, like online transaction processing, analytics and geospatial data. Its widespread adoption has led to the development of new extensions and libraries for many specific use cases, including machine learning.

Watch our webinar about PostgreSQL for AI Applications

PostgreSQL for AI applications

PostgreSQL has more than 1000 extensions. These add-on modules deliver capabilities beyond those found in the core Postgres system. They range from handling geospatial data to turning PostgreSQL into a vector database, and cover areas including analytics and search.

The flexibility and breadth of features that these extensions provide can unlock tremendous potential for your AI projects.

Some of the most relevant extensions for AI:

  • pgvector is an open-source extension for vector similarity search in PostgreSQL. It can also store embeddings, which lets the database work as a vector database, similar to OpenSearch.
  • Hydra is an open-source columnar database. It enables efficient queries over billions of rows without code changes, which helps when ML projects need to process large amounts of data.
  • PostgresML is a complete MLOps platform in a PostgreSQL extension. It enables organisations to build models inside the database.
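
To make the vector search idea more concrete, here is a minimal sketch in Python. It assumes a hypothetical `documents` table with a three-dimensional `embedding` column; the table, the column names and the helper functions are illustrative and not part of any Canonical tooling. The SQL strings use pgvector's `vector` type and its `<->` (Euclidean distance) operator, while the pure-Python `nearest` function mirrors the same ordering so the idea can be tried without a database.

```python
import math

# SQL that a pgvector-enabled PostgreSQL could run.
# The table and column names are hypothetical.
SETUP_SQL = """
CREATE EXTENSION IF NOT EXISTS vector;
CREATE TABLE IF NOT EXISTS documents (
    id bigserial PRIMARY KEY,
    content text,
    embedding vector(3)
);
"""

# `<->` is pgvector's Euclidean (L2) distance operator.
# The %s placeholders follow the style used by drivers such as psycopg.
NEAREST_SQL = (
    "SELECT id, content FROM documents "
    "ORDER BY embedding <-> %s::vector LIMIT %s;"
)


def to_vector_literal(values):
    """Format a sequence of numbers as a pgvector text literal, e.g. [1.0,2.0]."""
    return "[" + ",".join(str(float(v)) for v in values) + "]"


def l2_distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest(rows, query, k):
    """Brute-force reference for the ORDER BY ... <-> ... LIMIT k query.

    `rows` is a list of (content, embedding) pairs.
    """
    return sorted(rows, key=lambda row: l2_distance(row[1], query))[:k]


rows = [
    ("cat photo", [1.0, 0.0, 0.0]),
    ("invoice", [0.0, 1.0, 0.0]),
    ("kitten photo", [0.9, 0.1, 0.0]),
]
closest = [content for content, _ in nearest(rows, [1.0, 0.0, 0.0], 2)]
```

In a real deployment the embeddings would come from a model and typically have hundreds of dimensions, and the query would be sent to the database as `NEAREST_SQL` with `to_vector_literal(query)` as its first parameter, rather than computed in Python.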

Role of PostgreSQL in MLOps

MLOps is DevOps for machine learning. MLOps platforms such as Kubeflow ingest data from different types of databases, including PostgreSQL. Additionally, they use databases to store part of their artefacts, including metadata spanning experiments, jobs, pipeline runs and single scalar metrics. Kubeflow and your database need to be reliable and seamlessly integrated, since their availability influences the ability to run ML projects in production. 
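
As a rough illustration of the kind of metadata described above, the sketch below stores experiments, runs and scalar metrics in three relational tables. The schema is hypothetical and is not Kubeflow's actual schema. It uses Python's built-in `sqlite3` module as a stand-in so it runs anywhere; with PostgreSQL you would connect through a driver instead and adapt the SQLite-specific statements noted in the comments.

```python
import sqlite3

# Hypothetical metadata schema: one experiment has many runs,
# and each run has one value per named scalar metric.
SCHEMA = """
CREATE TABLE experiments (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL UNIQUE
);
CREATE TABLE runs (
    id INTEGER PRIMARY KEY,
    experiment_id INTEGER NOT NULL REFERENCES experiments(id),
    status TEXT NOT NULL
);
CREATE TABLE metrics (
    run_id INTEGER NOT NULL REFERENCES runs(id),
    name TEXT NOT NULL,
    value REAL NOT NULL,
    PRIMARY KEY (run_id, name)
);
"""


def record_run(conn, experiment, status, metrics):
    """Store one run and its scalar metrics; return the new run id."""
    # INSERT OR IGNORE is SQLite syntax; PostgreSQL would use
    # INSERT ... ON CONFLICT (name) DO NOTHING.
    conn.execute(
        "INSERT OR IGNORE INTO experiments (name) VALUES (?)", (experiment,)
    )
    (experiment_id,) = conn.execute(
        "SELECT id FROM experiments WHERE name = ?", (experiment,)
    ).fetchone()
    cursor = conn.execute(
        "INSERT INTO runs (experiment_id, status) VALUES (?, ?)",
        (experiment_id, status),
    )
    run_id = cursor.lastrowid
    conn.executemany(
        "INSERT INTO metrics (run_id, name, value) VALUES (?, ?, ?)",
        [(run_id, name, value) for name, value in metrics.items()],
    )
    return run_id


def best_run(conn, experiment, metric):
    """Return (run_id, value) for the highest value of `metric`."""
    return conn.execute(
        """
        SELECT r.id, m.value
        FROM runs r
        JOIN experiments e ON e.id = r.experiment_id
        JOIN metrics m ON m.run_id = r.id
        WHERE e.name = ? AND m.name = ?
        ORDER BY m.value DESC
        LIMIT 1
        """,
        (experiment, metric),
    ).fetchone()


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
first = record_run(conn, "churn-model", "succeeded", {"accuracy": 0.91})
second = record_run(conn, "churn-model", "succeeded", {"accuracy": 0.94})
```

Calling `best_run(conn, "churn-model", "accuracy")` then returns the second run. Because every query about past experiments goes through tables like these, the availability of the database directly affects what the MLOps platform can do.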

PostgreSQL is a great database to use alongside Kubeflow, but that doesn’t mean it’s the best choice in every scenario. In practice there are also other viable options, for instance MySQL. When choosing which database you’ll use, prioritise the solution that makes the most sense for your organisation:

  • Existing database – If you already use a particular database, for example MySQL, within your MLOps platform, then changing it to PostgreSQL might be unnecessary overhead.
  • Skillset – Choose a database that aligns with the skills and experience of your teams. If they already have experience working with PostgreSQL, it is the preferable choice for this use case.

There are other considerations about MySQL and PostgreSQL that you can read about in this whitepaper.

Charmed PostgreSQL for AI applications

The Charmed PostgreSQL Operator delivers automated operations management from day 0 to day 2 for the PostgreSQL database management system. It is an open source, end-to-end, production-ready data platform on top of Juju. It comes in two flavours: one to deploy and operate PostgreSQL on physical or virtual machines, and one for Kubernetes. Both offer features such as replication, TLS, password rotation, and easy-to-use integration with applications.

The Charmed PostgreSQL Operator meets the need for deploying PostgreSQL in a structured and consistent manner while allowing the user flexibility in configuration. It simplifies deployment, scaling, configuration and management of PostgreSQL in production at scale in a reliable way. PostgreSQL on its own is a great choice for AI projects, and the Charmed Operator takes it to the next level, making it even easier to store your data and build ML models.

Further reading

AI in 2024 – What does the future hold?

PostgreSQL high availability made charmingly easy

MLOps toolkit whitepaper
