
Canonical
on 23 October 2025

Introducing silicon-optimized inference snaps


Install a well-known model like DeepSeek R1 or Qwen 2.5 VL with a single command, and get the silicon-optimized AI engine automatically.

London, October 23 – Canonical today announced optimized inference snaps, a new way to deploy AI models on Ubuntu devices, with automatic selection of optimized engines, quantizations and architectures based on the specific silicon of the device. Canonical is working with a wide range of silicon providers to deliver their optimizations of well-known LLMs to developers and devices.

A single well-known model like Qwen 2.5 VL or DeepSeek R1 has many different sizes and setup configurations, each of which is optimized for specific silicon. It can be difficult for an end-user to know which model size and runtime to use on their device. Now, a single command gets you the best combination, automatically. Canonical is working with silicon partners to integrate their optimizations. As new partners publish their optimizations, the models will become more efficient on more devices.

This enables developers to integrate well-known AI capabilities seamlessly into their applications and have them run optimally across desktops, servers, and edge devices.

A snap package can dynamically load components. We fetch the recommended build for the host system, simplifying dependency management while improving latency. The public beta includes Intel- and Ampere®-optimized DeepSeek R1 and Qwen 2.5 VL as examples, and open-sources the framework by which these are built.
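For example, once one of the beta snaps is installed, standard snap tooling can be used to inspect what was actually delivered to the host. The commands below are a minimal sketch using generic snap commands; the exact components and interfaces the inference snaps expose are not documented here:

# Install a beta inference snap; it selects an engine build for the host's silicon
sudo snap install qwen-vl --beta

# Show the publisher, tracking channel and installed revision
snap info qwen-vl

# List the interface connections the snap uses (e.g. hardware access)
snap connections qwen-vl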

“We are making silicon-optimized AI models available for everyone. When enabled by the user, they will be deeply integrated down to the silicon level,” said Jon Seager, VP Engineering at Canonical. “I’m excited to work with silicon partners to ensure that their silicon-optimized models ‘just work.’ Developers and end-users no longer need to worry about the complex matrix of engines, builds and quantizations. Instead, they can reliably integrate a local version of the model that is as efficient as possible and continuously improves.”

The silicon ecosystem invests heavily in performance optimizations for AI, but developer environments are complex, and there are few simple tools for assembling all the necessary components into complete runtime environments. On Ubuntu, the community can now distribute their optimized stacks straight to end users. Canonical worked closely with Intel and Ampere to deliver hardware-tuned inference snaps that maximize performance.

“By working with Canonical to package and distribute large language models optimized for Ampere hardware through our AIO software, developers can simply get our recommended builds by default, already tuned for Ampere processors in their servers,” said Jeff Wittich, Chief Product Officer at Ampere. “This brings Ampere’s high performance and efficiency to end users right out of the box. Together, we’re enabling enterprises to rapidly deploy and scale their preferred AI models on Ampere systems with Ubuntu’s AI-ready ecosystem.”

“Intel optimizes for AI workloads from silicon to high-level software libraries. Until now, a developer has needed the skills and knowledge to select which model variants and optimizations may be best for their client system,” said Jim Johnson, Senior VP, GM of Client Computing Group, Intel. “Canonical’s approach to packaging and distributing AI models overcomes this challenge, enabling developers to extract the performance and cost benefits of Intel hardware with ease. One command detects the hardware and uses OpenVINO, our open source toolkit for accelerating AI inference, to deploy a recommended model variant, with recommended parameters, onto the most suitable device.”

Get started today 

Get started and run silicon-optimized models on Ubuntu with the following commands:

sudo snap install qwen-vl --beta

sudo snap install deepseek-r1 --beta

Developers can begin experimenting with the standard, locally served inference endpoints of these models to power AI capabilities in their end-user applications.
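As a rough sketch of what that could look like, assuming the snap serves an OpenAI-compatible chat completions API on localhost (the port and route below are illustrative assumptions, not documented behavior):

# Hypothetical request; adjust the host, port and path to whatever the snap actually exposes
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "deepseek-r1",
    "messages": [{"role": "user", "content": "Hello! What hardware are you running on?"}]
  }'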

Learn more and provide feedback

About Canonical 

Canonical, the publisher of Ubuntu, provides open source security, support and services. Our portfolio covers critical systems, from the smallest devices to the largest clouds, from the kernel to containers, from databases to AI. With customers that include top tech brands, emerging startups, governments and home users, Canonical delivers trusted open source for everyone.

Learn more at https://canonical.com/
