William Hackett

When people talk about adversarial machine learning, they often focus on inputs and outputs. But there is another class of attacks that never touches the model’s visible interface. Instead, they watch the hardware.
Side-channel accelerator attacks do exactly that. By monitoring kernel-level metrics such as cache reads and writes during inference, tools like DeepSniffer can infer which sequence of operators a model is running and reconstruct a close approximation of the underlying architecture.
For organizations that treat their architectures as trade secrets, or that share accelerators in multi-tenant environments, this is a serious concern. The question is how to defend without rewriting every model from scratch.
The “Compilation as a Defense” work explores a promising option. Instead of changing model architectures, the researchers use tensor optimization within the TVM compiler stack to change how those architectures are implemented at the kernel level. By doing so, they reduce the effectiveness of side-channel attacks while also improving runtime performance.
In the DeepSniffer style of attack, an adversary monitors hardware level traces as a model runs on a GPU. Every operator, such as convolution or pooling, generates a characteristic pattern of memory accesses and cache usage. By learning to associate those patterns with specific operators, the attacker can reconstruct the sequence of layers that make up the model.
The attack does not rely on access to training data or weights. It only needs visibility into low-level kernel metrics, which can sometimes be obtained in shared environments or via performance tooling that was not designed with security in mind.
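To make the intuition concrete, here is a deliberately simplified sketch of the profiling-and-classification idea. It is not DeepSniffer itself: the per-kernel counter features and the data are synthetic, and a generic classifier stands in for the attack's real sequence model.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Toy illustration of the attack idea, not DeepSniffer itself.
# Each kernel launch is reduced to a small vector of hardware counters
# (stand-ins for duration, cache reads/writes, DRAM traffic), and a
# classifier maps those counters back to an operator type.
rng = np.random.default_rng(0)
OPERATORS = ["conv2d", "pooling", "dense", "relu"]

def synthetic_trace(op_idx: int, n: int = 1) -> np.ndarray:
    """Fake per-kernel counter vectors for a given operator type."""
    base = np.array([1.0, 10.0, 5.0, 100.0]) * (op_idx + 1)
    return base + rng.normal(scale=0.5, size=(n, 4))

# "Profile" models whose architectures are known to build training data.
X = np.vstack([synthetic_trace(i, n=200) for i in range(len(OPERATORS))])
y = np.repeat(np.arange(len(OPERATORS)), 200)
clf = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, y)

# Against a victim model, classify each observed kernel in order to recover
# an approximate layer sequence without ever touching inputs or weights.
victim_trace = np.vstack([synthetic_trace(i) for i in [0, 3, 1, 0, 3, 2]])
print([OPERATORS[i] for i in clf.predict(victim_trace)])
# e.g. ['conv2d', 'relu', 'pooling', 'conv2d', 'relu', 'dense']
```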
Defenses in the literature often propose larger architectural changes or framework modifications. While useful, these approaches can be costly for engineering teams that have already invested heavily in particular models and pipelines.
Instead of changing the model itself, this research takes aim at the implementation layer. Modern deep learning frameworks rely on shared libraries like cuDNN for core operator implementations. These libraries produce predictable kernel behavior, which is exactly what side-channel attacks exploit.
The researchers take ONNX versions of four popular architectures, spanning standard vision models and the RoBERTa language model.
They then feed these models into TVM, an optimizing compiler for deep learning workloads, and apply AutoTVM’s tensor optimization routines. AutoTVM uses simulated annealing and a learned cost model to generate many candidate schedules for each operator, selecting those that minimize runtime on a target accelerator.
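For readers unfamiliar with that toolchain, the flow looks roughly like the sketch below. It uses standard TVM and AutoTVM APIs, but the model file, input shape, and trial count are placeholders rather than the study's actual configuration.

```python
import onnx
import tvm
from tvm import relay, autotvm
from tvm.autotvm.tuner import XGBTuner

# Import an ONNX model into TVM's Relay IR (file name and shape are placeholders).
onnx_model = onnx.load("resnet18.onnx")
mod, params = relay.frontend.from_onnx(onnx_model, {"input": (1, 3, 224, 224)})
target = tvm.target.cuda()

# Extract the tunable operator tasks (convolutions, dense layers, ...).
tasks = autotvm.task.extract_from_program(mod["main"], target=target, params=params)

# Each candidate schedule is built and benchmarked on the local GPU.
measure_option = autotvm.measure_option(
    builder=autotvm.LocalBuilder(),
    runner=autotvm.LocalRunner(number=10, repeat=1, timeout=10),
)

# The XGBoost-based tuner combines a learned cost model with simulated
# annealing to explore the schedule space for each operator.
for task in tasks:
    tuner = XGBTuner(task)
    tuner.tune(
        n_trial=1000,  # the study varies this trial budget
        measure_option=measure_option,
        callbacks=[autotvm.callback.log_to_file("tuning.log")],
    )

# Rebuild the model with the best schedules found during tuning.
with autotvm.apply_history_best("tuning.log"):
    with tvm.transform.PassContext(opt_level=3):
        lib = relay.build(mod, target=target, params=params)
lib.export_library("compiled_model.so")
```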
From a security standpoint, the key idea is simple. If you change how operators are scheduled, tiled, and fused, you change the observed kernel level behavior. That, in turn, makes it harder for a side-channel classifier trained on standard implementations to correctly infer the architecture.
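A small tensor-expression example illustrates the point. The two schedules below compute the same matrix multiply, but the tiled version lowers to a different loop structure, and therefore a different pattern of memory accesses, than the naive one. This is a toy, not a schedule taken from the study.

```python
import tvm
from tvm import te

# Define a plain matrix multiply as a tensor expression.
n = 1024
A = te.placeholder((n, n), name="A")
B = te.placeholder((n, n), name="B")
k = te.reduce_axis((0, n), name="k")
C = te.compute((n, n), lambda i, j: te.sum(A[i, k] * B[k, j], axis=k), name="C")

# Baseline schedule: a straightforward loop nest.
s_base = te.create_schedule(C.op)

# Tiled schedule: same arithmetic, different loop order and cache behavior.
s_tiled = te.create_schedule(C.op)
i, j = s_tiled[C].op.axis
io, ii = s_tiled[C].split(i, factor=32)
jo, ji = s_tiled[C].split(j, factor=32)
s_tiled[C].reorder(io, jo, ii, ji)

# The lowered IR (and hence the emitted kernel) differs between the two,
# even though they implement the identical operator.
print(tvm.lower(s_base, [A, B, C], simple_mode=True))
print(tvm.lower(s_tiled, [A, B, C], simple_mode=True))
```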
To quantify this, the team compiled each model with an increasing number of AutoTVM optimization trials, ran the DeepSniffer-style attack against the resulting binaries, and measured how faithfully it could reconstruct each architecture.
The results show a clear trend: as the number of optimization trials increases, DeepSniffer’s fidelity drops.
RoBERTa behaves differently, in part because its heterogeneous language model operators were already unfamiliar to DeepSniffer’s classifier. But across the vision models, the trend is consistent. More diverse and optimized schedules make the attack less successful.
There is no free lunch. AutoTVM’s search process is computationally expensive. Generating and benchmarking large numbers of candidate schedules required around 83 GPU hours across all models and trials in the study.
However, this cost is paid once per deployment configuration, not per inference. After the compiler finds an optimized schedule, the resulting model not only becomes more robust against the specific side-channel attack, it also runs faster.
For operators of high-value models, that can be an acceptable trade. You spend GPU time up front to buy both performance gains and additional security margin.
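On the serving side, the exported artifact is simply loaded and run; the tuning cost is never paid again. A minimal sketch, assuming the library path and input name used in the compilation example above:

```python
import numpy as np
import tvm
from tvm.contrib import graph_executor

# Load the library exported after tuning (path and input name are assumptions).
lib = tvm.runtime.load_module("compiled_model.so")
dev = tvm.cuda(0)
module = graph_executor.GraphModule(lib["default"](dev))

# Ordinary inference: no tuning happens at this point.
module.set_input("input", np.random.rand(1, 3, 224, 224).astype("float32"))
module.run()
out = module.get_output(0).numpy()
```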
One of the more interesting ideas in the discussion is the concept of using compilation as part of a moving target defense. Instead of treating optimization as a one-time step, defenders could periodically re-run the tuning and compilation process and rotate freshly built variants of the same model into production.
This would force adversaries to constantly retrain their side-channel classifiers and would shrink the window where any particular model of kernel behavior is valid.
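A rough sketch of what that rotation could look like, reusing the compilation flow from earlier. The file names, the number of variants, and the idea of keeping one tuning log per variant are assumptions for illustration, not details from the study.

```python
import onnx
import tvm
from tvm import relay, autotvm

# Hypothetical moving-target setup: several independent tuning runs (for example
# with different seeds or trial budgets) each leave their own log, and each log
# yields a functionally identical model with a different kernel-level profile.
onnx_model = onnx.load("resnet18.onnx")  # placeholder model file
mod, params = relay.frontend.from_onnx(onnx_model, {"input": (1, 3, 224, 224)})
target = tvm.target.cuda()

def build_variant(log_file: str, out_path: str) -> str:
    """Compile one variant of the model using the schedules recorded in log_file."""
    with autotvm.apply_history_best(log_file):
        with tvm.transform.PassContext(opt_level=3):
            lib = relay.build(mod, target=target, params=params)
    lib.export_library(out_path)
    return out_path

# Rotate these variants across replicas or over time, so no single profile of
# kernel behavior stays valid for long.
variant_paths = [
    build_variant(f"tuning_run_{i}.log", f"variant_{i}.so") for i in range(4)
]
```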
From Mindgard’s perspective, this aligns with a broader principle. Many of the most effective security strategies in traditional computing rely on making systems less predictable over time: address space layout randomization, key rotation, or dynamic sandboxing. Applying similar thinking to AI workloads via compilers is a natural extension.
For teams responsible for securing accelerator-heavy AI workloads, this research offers several practical takeaways: treat kernel-level telemetry and performance counters as sensitive, especially in shared or multi-tenant environments; remember that architecture confidentiality can be undermined without any access to weights or training data; and consider compiler-level optimization as a defensive measure that pays for itself in runtime performance.
Side-channel attacks remind us that AI models are not just mathematical functions. They are software artifacts running on complex hardware stacks, and every layer leaks some information. Work like this shows that defenders can use the same toolchains that power performance to quietly reshape what an attacker sees.