Your submission was sent successfully! Close

Thank you for signing up for our newsletter!
In these regular emails you will find the latest updates from Canonical and upcoming events where you can meet our team.Close

Thank you for contacting our team. We will be in touch shortly.Close

  1. Blog
  2. Article

Andreea Munteanu
on 2 June 2024

Deploy GenAI applications with Canonical’s  Charmed Kubeflow and NVIDIA NIM

It’s been over two years since generative AI (GenAI) took off with the launch of ChatGPT. From that moment on, a variety of applications, models and libraries were launched to address market needs and simplify enterprise activity. As Deloitte observes in its State of Generative AI Q2 2024 report, organisations are now at a stage where they are ready to move beyond pilots and proof of concepts and start creating value – but bringing AI models to production can prove highly complex.

Canonical has collaborated with NVIDIA in the past  to help enable open source AI at scale. In 2023, both Canonical Kubernetes and Charmed Kubeflow were certified as part of the NVIDIA DGX-Ready Software program. Shortly after, NVIDIA NGC containers and NVIDIA Triton Inference Server were integrated with our MLOps platform. This year brought news about Ubuntu on NVIDIA Jetson for AI at the edge and Kubernetes enablement for the NVIDIA AI Enterprise software platform.

Today, our sights are set on GenAI. This blog will explore how together we help organisations with their GenAI applications and simplify the path to production. You can develop your GenAI apps on Canonical’s MLOps platform, Charmed Kubeflow, and deploy them using NVIDIA NIM inference microservices – part of the NVIDIA AI Enterprise software platform for the development and deployment of generative AI – integrated with KServe, Kubeflow’s component for deployment.

Scale enterprise AI with Canonical and NVIDIA NIM

To simplify operations and deliver GenAI at scale, your teams need to be able to focus on building models rather than tooling. The best way to achieve this is with integrated solutions that cover the entire machine learning lifecycle. Professionals need an end-to-end solution that can be used to train models, automate ML workloads and then deploy them to edge devices. This is an iterative process that requires constant updates, enhanced monitoring and the ability to serve models anywhere. These needs are directly met by using Canonical MLOps integrated with NVIDIA NIM.

Fig. Canonical’s MLOps platform, Charmed Kubeflow now supports NVIDIA NIM microservices, part of the NVIDIA AI Enterprise software suite

Canonical MLOps is a solution that covers the complete machine learning lifecycle, integrating leading open-source tooling such as Spark, Kafka or MLflow in a secure, portable and reliable manner. Charmed Kubeflow is the foundation of the solution. It is an MLOps platform that runs on any cloud, including hybrid or multi-cloud scenarios and any CNCF-conformant Kubernetes. KServe is one of the core components of Kubeflow, and it is used to serve models in a serverless manner. It enables different inference engines to be used, including NVIDIA Triton Inference Server and NVIDIA NIM.

NVIDIA NIM, part of NVIDIA AI Enterprise, is a set of microservices designed to reduce the time to market of machine learning models and enable organisations to run their projects in production while maintaining security and control of their GenAI applications. NVIDIA NIM delivers seamless, scalable AI inferencing, on premises or in the cloud, using industry-standard APIs. It simplifies model deployment across any cloud and streamlines the path to enterprise AI at scale, reducing the upfront engineering costs. The microservices bridge the gap between complex deployments and operational needs to maintain models in production. It is a cloud-native solution that integrates with KServe, so you can develop and deploy models using a single set of tools.

“Beyond the work that we do with NVIDIA in Ubuntu and in Canonical Kubernetes for GPU-specific integrations and optimisations, we facilitate the development and deployment of ML models as one integrated solution,” said Aniket Ponkshe, Director of Silicon Alliances at Canonical. “After the work done to certify Charmed Kubeflow and Charmed Kubenertes on NVIDIA DGX, extending it to NVIDIA NIM on the MLOps platform was a natural step for our teams to further simplify the developer journey from development to production.”

“Enterprises often struggle with the complexity of deploying generative AI models into production, facing challenges in scalability, security, and integration,” said Pat Lee, Vice President of Strategic Enterprise Partnerships at NVIDIA. “Charmed Kubeflow with NVIDIA NIM simplifies the process by providing pre-built, cloud-native microservices that streamline deployment, reduce costs, and deliver enterprise-grade performance and security.”

Accelerate AI project delivery 

In its 2024 report, The AI Infrastructure Alliance asked AI/ML technology leads about their greatest concerns around deploying GenAI. The top two concerns were making mistakes due to moving too quickly, and moving too slowly due to a lack of execution ability. This offering from Canonical with NVIDIA NIM addresses both of these problems by enabling enterprises to move at speed with a repeatable, streamlined GenAI delivery path. 

Canonical MLOps is built with secure open source software so that organisations can develop their models in a reliable environment. By taking advantage of Ubuntu Pro and Canonical Kubernetes in addition to the MLOps solutions, enterprises have a one-stop shop for their AI projects, with a secure, trusted operating system and upstream Kubernetes with NVIDIA integrations to accelerate their AI journey from concept to deployment. No matter what requirements and internal skill sets they have, organisations can benefit from enterprise support, managed services and even training from Canonical experts.

Get started with Charmed Kubeflow and NVIDIA NIM

Getting started with the solution is easy. You can deploy Charmed Kubeflow in any environment. Then, you can access NVIDIA NIM microservices from the NVIDIA API catalogue after applying for NIM access. After that, it just takes a few actions at the Kubernetes layer to create a NIM runtime, create a PVC, instantiate KServe’s Inference service and validate the NIM running on KServe. You can read more about it here and follow up the NVIDIA NIM on Charmed Kubeflow tutorial.

Further reading

Related posts

Karen Horovitz
18 March 2024

Accelerate AI development with Ubuntu and NVIDIA AI Workbench

AI Article

As the preferred OS for data science, AI and ML, Ubuntu plays an integral role in NVIDIA AI Workbench capabilities.  ...

Karen Horovitz
18 March 2024

Canonical accelerates AI Application Development with NVIDIA AI Enterprise

Kubernetes Article

Charmed Kubernetes support comes to NVIDIA AI Enterprise Canonical’s Charmed Kubernetes is now supported on NVIDIA AI Enterprise 5.0. Organisations using Kubernetes deployments on Ubuntu can look forward to a seamless licensing migration to the latest release of the NVIDIA AI Enterprise software platform providing developers the latest AI ...

11 March 2024

Large Language Models (LLMs) Retrieval Augmented Generation (RAG) using Charmed OpenSearch

AI Article

This article guides you on leveraging Charmed OpenSearch to maintain a relevant and up-to-date LLM application. ...