Skip to main content

Your submission was sent successfully! Close

Thank you for signing up for our newsletter!
In these regular emails you will find the latest updates from Canonical and upcoming events where you can meet our team.Close

Thank you for contacting us. A member of our team will be in touch shortly. Close

  1. Blog
  2. Article

Andreea Munteanu
on 2 June 2024

Deploy GenAI applications with Canonical’s  Charmed Kubeflow and NVIDIA NIM


It’s been over two years since generative AI (GenAI) took off with the launch of ChatGPT. From that moment on, a variety of applications, models and libraries were launched to address market needs and simplify enterprise activity. As Deloitte observes in its State of Generative AI Q2 2024 report, organisations are now at a stage where they are ready to move beyond pilots and proof of concepts and start creating value – but bringing AI models to production can prove highly complex.

Canonical has collaborated with NVIDIA in the past  to help enable open source AI at scale. In 2023, both Canonical Kubernetes and Charmed Kubeflow were certified as part of the NVIDIA DGX-Ready Software program. Shortly after, NVIDIA NGC containers and NVIDIA Triton Inference Server were integrated with our MLOps platform. This year brought news about Ubuntu on NVIDIA Jetson for AI at the edge and Kubernetes enablement for the NVIDIA AI Enterprise software platform.

Today, our sights are set on GenAI. This blog will explore how together we help organisations with their GenAI applications and simplify the path to production. You can develop your GenAI apps on Canonical’s MLOps platform, Charmed Kubeflow, and deploy them using NVIDIA NIM inference microservices – part of the NVIDIA AI Enterprise software platform for the development and deployment of generative AI – integrated with KServe, Kubeflow’s component for deployment.

Scale enterprise AI with Canonical and NVIDIA NIM

To simplify operations and deliver GenAI at scale, your teams need to be able to focus on building models rather than tooling. The best way to achieve this is with integrated solutions that cover the entire machine learning lifecycle. Professionals need an end-to-end solution that can be used to train models, automate ML workloads and then deploy them to edge devices. This is an iterative process that requires constant updates, enhanced monitoring and the ability to serve models anywhere. These needs are directly met by using Canonical MLOps integrated with NVIDIA NIM.

Fig. Canonical’s MLOps platform, Charmed Kubeflow now supports NVIDIA NIM microservices, part of the NVIDIA AI Enterprise software suite

Canonical MLOps is a solution that covers the complete machine learning lifecycle, integrating leading open-source tooling such as Spark, Kafka or MLflow in a secure, portable and reliable manner. Charmed Kubeflow is the foundation of the solution. It is an MLOps platform that runs on any cloud, including hybrid or multi-cloud scenarios and any CNCF-conformant Kubernetes. KServe is one of the core components of Kubeflow, and it is used to serve models in a serverless manner. It enables different inference engines to be used, including NVIDIA Triton Inference Server and NVIDIA NIM.

NVIDIA NIM, part of NVIDIA AI Enterprise, is a set of microservices designed to reduce the time to market of machine learning models and enable organisations to run their projects in production while maintaining security and control of their GenAI applications. NVIDIA NIM delivers seamless, scalable AI inferencing, on premises or in the cloud, using industry-standard APIs. It simplifies model deployment across any cloud and streamlines the path to enterprise AI at scale, reducing the upfront engineering costs. The microservices bridge the gap between complex deployments and operational needs to maintain models in production. It is a cloud-native solution that integrates with KServe, so you can develop and deploy models using a single set of tools.

“Beyond the work that we do with NVIDIA in Ubuntu and in Canonical Kubernetes for GPU-specific integrations and optimisations, we facilitate the development and deployment of ML models as one integrated solution,” said Aniket Ponkshe, Director of Silicon Alliances at Canonical. “After the work done to certify Charmed Kubeflow and Charmed Kubenertes on NVIDIA DGX, extending it to NVIDIA NIM on the MLOps platform was a natural step for our teams to further simplify the developer journey from development to production.”

“Enterprises often struggle with the complexity of deploying generative AI models into production, facing challenges in scalability, security, and integration,” said Pat Lee, Vice President of Strategic Enterprise Partnerships at NVIDIA. “Charmed Kubeflow with NVIDIA NIM simplifies the process by providing pre-built, cloud-native microservices that streamline deployment, reduce costs, and deliver enterprise-grade performance and security.”

Accelerate AI project delivery 

In its 2024 report, The AI Infrastructure Alliance asked AI/ML technology leads about their greatest concerns around deploying GenAI. The top two concerns were making mistakes due to moving too quickly, and moving too slowly due to a lack of execution ability. This offering from Canonical with NVIDIA NIM addresses both of these problems by enabling enterprises to move at speed with a repeatable, streamlined GenAI delivery path. 

Canonical MLOps is built with secure open source software so that organisations can develop their models in a reliable environment. By taking advantage of Ubuntu Pro and Canonical Kubernetes in addition to the MLOps solutions, enterprises have a one-stop shop for their AI projects, with a secure, trusted operating system and upstream Kubernetes with NVIDIA integrations to accelerate their AI journey from concept to deployment. No matter what requirements and internal skill sets they have, organisations can benefit from enterprise support, managed services and even training from Canonical experts.

Get started with Charmed Kubeflow and NVIDIA NIM

Getting started with the solution is easy. You can deploy Charmed Kubeflow in any environment. Then, you can access NVIDIA NIM microservices from the NVIDIA API catalogue after applying for NIM access. After that, it just takes a few actions at the Kubernetes layer to create a NIM runtime, create a PVC, instantiate KServe’s Inference service and validate the NIM running on KServe. You can read more about it here and follow up the NVIDIA NIM on Charmed Kubeflow tutorial.

Further reading

Related posts


Gokhan Cetinkaya
30 September 2024

How to deploy AI workloads at the edge using open source solutions

AI Article

Running AI workloads at the edge with Canonical and Lenovo AI is driving a new wave of opportunities in all kinds of edge settings—from predictive maintenance in manufacturing, to virtual assistants in healthcare, to telco router optimisation in the most remote locations. But to support these AI workloads running virtually everywhere, com ...


Canonical
11 April 2024

Ventana and Canonical collaborate on enabling enterprise data center, high-performance and AI computing on RISC-V

Silicon Article

This blog is co-authored by Gordan Markuš, Canonical and Kumar Sankaran, Ventana Micro Systems Unlocking the future of semiconductor innovation  RISC-V, an open standard instruction set architecture (ISA), is rapidly shaping the future of high-performance computing, edge computing, and artificial intelligence. The RISC-V customizable and ...


Michelle Anne Tabirao
20 December 2024

Building RAG with enterprise open source AI infrastructure

Data Platform Article

How to create a robust enterprise AI infrastructure for RAG systems using open source tooling?A highlight on how open source can help ...