

Eduardo Aguilar Pelaez
on 2 April 2020

Edge AI in a 5G world – part 3: Why ‘smart cell towers’ matter to AI


This is part of a blog series on the impact that 5G and GPUs at the edge will have on the rollout of new AI solutions. You can read the other posts here.

Recap

In part 1 we talked about the industrial applications and benefits that 5G and fast compute at the edge will bring to AI products. In part 2 we went deeper into how you can benefit from this new opportunity. In this part we will focus on the key technical barriers that 5G and Edge compute remove for AI applications. 

Photo by Louis Reed

Model training

With deep neural networks, the rule of thumb is that the more training data (real or from simulations), the better the results will be; you can see this for yourself here. Processing more data costs more money if running on rented servers such as the public cloud.

This is incentivising many companies to set up their own data centres to carry out model training on their own servers. This used to require a lot of manual effort, but it is becoming simpler every day with tools such as MAAS to provision servers quickly.

Data transfer

Transferring large amounts of data from sensors to servers can be costly and complex, so a pattern is emerging whereby companies bring the compute servers physically closer to the data sources.

One example of this is the installation of Edge GPU servers near traffic cameras to process, record and store video of urban traffic.

Implementation of AI/ML operations

When planning to deliver a great AI product, the following system requirements are critical to ensure that reliable, timely and secure ML operations will be possible:

  1. Model size after training, i.e. will the trained model fit on the target device where inference will run? If not, could this computation be offloaded?
  2. Response latency, i.e. will the trained model compute fast enough for the product to work as intended (e.g. fast speech recognition)? Alternatively, in the offloaded compute configuration, will the response time of the offloaded compute be fast enough, accounting for both connectivity and compute latency?
  3. Update delivery mechanisms, i.e. how will the software and AI models on the device or co-located compute be updated and patched to be kept secure?

Although the last few years have seen very interesting work in the field of trained model size minimisation with tools such as TensorFlow Lite, model response latency remains an issue for more data- and compute-heavy applications due to compute limitations on the IoT device or robot.

In such cases, an alternative to running the AI model on the device is to offload this particular task to a nearby server.  
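As a rough sketch, the on-device-versus-offload decision comes down to two checks: does the trained model fit on the device, and does the end-to-end response time stay within the product's latency budget? The function names and all the figures below are illustrative assumptions, not measurements of any real device or network:

```python
# Illustrative sketch: decide whether to run inference on-device or to
# offload it to a nearby edge server. All figures are assumed examples.

def end_to_end_latency_ms(network_rtt_ms: float, compute_ms: float) -> float:
    """Offloaded response time = network round trip + server compute time."""
    return network_rtt_ms + compute_ms

def choose_placement(model_size_mb: float, device_memory_mb: float,
                     device_compute_ms: float, offload_rtt_ms: float,
                     server_compute_ms: float, latency_budget_ms: float) -> str:
    """Return 'on-device', 'offload' or 'infeasible' for a given workload."""
    fits_on_device = model_size_mb <= device_memory_mb
    on_device_ok = fits_on_device and device_compute_ms <= latency_budget_ms
    offload_ok = (end_to_end_latency_ms(offload_rtt_ms, server_compute_ms)
                  <= latency_budget_ms)
    if on_device_ok:
        return "on-device"
    if offload_ok:
        return "offload"
    return "infeasible"

# A large vision model that does not fit on a connected camera, but that
# meets its budget when offloaded over a low-latency 5G link:
placement = choose_placement(model_size_mb=500, device_memory_mb=256,
                             device_compute_ms=80, offload_rtt_ms=10,
                             server_compute_ms=25, latency_budget_ms=50)
print(placement)  # offload
```

Note how a slower network (a higher `offload_rtt_ms`) can flip the same workload from "offload" to "infeasible", which is exactly the constraint that 5G co-location relaxes.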

Co-location and 5G connectivity 

Co-location of sensors, servers and actuators provides the following two key technical benefits that make new solutions possible:

  • High bandwidth connectivity between the data sources and the servers, and
  • Low latency between the IoT device and the co-located compute. 

For example, we could imagine a street camera that triggers a traffic light to go ‘red’ when an accident is detected: sensing, perceiving and acting very quickly.

This could be achieved within a few milliseconds with a video-processing AI model running on a nearby Edge GPU connected to the camera via 5G.
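To make the "few milliseconds" figure concrete, a back-of-the-envelope budget can sum the three stages of the loop: uplink, inference and downlink. The per-stage timings below are illustrative assumptions about a 5G link and an edge GPU, not benchmarks:

```python
# Back-of-the-envelope latency budget for the camera -> edge GPU ->
# traffic light loop. All stage timings are assumed, illustrative values.

stages_ms = {
    "5G uplink (camera -> edge GPU)": 1.5,      # one-way radio + transport
    "frame inference on edge GPU": 5.0,          # video-processing model
    "5G downlink (edge GPU -> traffic light)": 1.5,
}

total_ms = sum(stages_ms.values())
for stage, ms in stages_ms.items():
    print(f"{stage}: {ms:.1f} ms")
print(f"total: {total_ms:.1f} ms")  # total: 8.0 ms
```

With a wired or Wi-Fi backhaul the two network stages would typically dominate the budget; the point of 5G here is that they stay small enough for the inference stage to be the main cost.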

In traditional settings, only wired connections would have been able to realistically provide the benefits listed above. 5G will soon do this reliably everywhere at an affordable price.

Cost and latency minimisation with computation offloading

To date, the computational capability of IoT systems has primarily been limited by the amount of money customers were willing to spend. 

Big-ticket items such as a Tesla electric car can absorb the cost of a powerful compute board; however, cheaper devices for more pervasive applications such as mobile phones, smart speakers or connected cameras have much stricter compute constraints.

High speed and low latency connectivity such as 5G networks can accelerate the adoption of co-located Edge GPU servers as a solution to hardware constrained IoT devices and robots.


