
Canonical
on 23 October 2025

Introducing silicon-optimized inference snaps


Install a well-known model like DeepSeek R1 or Qwen 2.5 VL with a single command, and get the silicon-optimized AI engine automatically.

London, October 23 – Canonical today announced optimized inference snaps, a new way to deploy AI models on Ubuntu devices, with automatic selection of optimized engines, quantizations and architectures based on the specific silicon of the device. Canonical is working with a wide range of silicon providers to deliver their optimizations of well-known LLMs to developers and devices.

A single well-known model like Qwen 2.5 VL or DeepSeek R1 has many different sizes and setup configurations, each of which is optimized for specific silicon. It can be difficult for an end-user to know which model size and runtime to use on their device. Now, a single command gets you the best combination, automatically. Canonical is working with silicon partners to integrate their optimizations. As new partners publish their optimizations, the models will become more efficient on more devices.

This enables developers to integrate well-known AI capabilities seamlessly into their applications and have them run optimally across desktops, servers, and edge devices.

A snap package can dynamically load components. We fetch the recommended build for the host system, simplifying dependency management while improving latency. The public beta includes Intel- and Ampere®-optimized DeepSeek R1 and Qwen 2.5 VL as examples, and open-sources the framework by which these are built.
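For example, once one of the beta snaps is installed, standard snap tooling can be used to inspect what was actually delivered to the host. The commands below are a minimal sketch using generic snap commands; the exact components and interfaces the inference snaps expose are not documented here:

# Install a beta inference snap; it selects an engine build for the host's silicon
sudo snap install qwen-vl --beta

# Show the publisher, tracking channel and installed revision
snap info qwen-vl

# List the interface connections the snap uses (e.g. hardware access)
snap connections qwen-vl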

“We are making silicon-optimized AI models available for everyone. When enabled by the user, they will be deeply integrated down to the silicon level,” said Jon Seager, VP Engineering at Canonical. “I’m excited to work with silicon partners to ensure that their silicon-optimized models ‘just work.’ Developers and end-users no longer need to worry about the complex matrix of engines, builds and quantizations. Instead, they can reliably integrate a local version of the model that is as efficient as possible and continuously improves.”

The silicon ecosystem invests heavily in performance optimizations for AI, but developer environments are complex, and there are few simple tools for assembling all the necessary components into complete runtime environments. On Ubuntu, the community can now distribute their optimized stacks straight to end users. Canonical worked closely with Intel and Ampere to deliver hardware-tuned inference snaps that maximize performance.

“By working with Canonical to package and distribute large language models optimized for Ampere hardware through our AIO software, developers can simply get our recommended builds by default, already tuned for Ampere processors in their servers,” said Jeff Wittich, Chief Product Officer at Ampere. “This brings Ampere’s high performance and efficiency to end users right out of the box. Together, we’re enabling enterprises to rapidly deploy and scale their preferred AI models on Ampere systems with Ubuntu’s AI-ready ecosystem.”

“Intel optimizes for AI workloads from silicon to high-level software libraries. Until now, a developer has needed the skills and knowledge to select which model variants and optimizations may be best for their client system,” said Jim Johnson, Senior VP, GM of Client Computing Group, Intel. “Canonical’s approach to packaging and distributing AI models overcomes this challenge, enabling developers to extract the performance and cost benefits of Intel hardware with ease. One command detects the hardware and uses OpenVINO, our open source toolkit for accelerating AI inference, to deploy a recommended model variant, with recommended parameters, onto the most suitable device.”

Get started today 

Get started and run silicon-optimized models on Ubuntu with the following commands:

sudo snap install qwen-vl --beta

sudo snap install deepseek-r1 --beta

Developers can begin experimenting with the standard, locally served inference endpoints of these models to power AI capabilities in their end-user applications.
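As a rough sketch of what that could look like, assuming the snap serves an OpenAI-compatible chat completions API on localhost (the port and route below are illustrative assumptions, not documented behavior):

# Hypothetical request; adjust the host, port and path to whatever the snap actually exposes
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "deepseek-r1",
    "messages": [{"role": "user", "content": "Hello! What hardware are you running on?"}]
  }'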

Learn more and provide feedback

About Canonical 

Canonical, the publisher of Ubuntu, provides open source security, support and services. Our portfolio covers critical systems, from the smallest devices to the largest clouds, from the kernel to containers, from databases to AI. With customers that include top tech brands, emerging startups, governments and home users, Canonical delivers trusted open source for everyone.

Learn more at https://canonical.com/
