## The gap in AI data protection
Encryption protects data at rest (on disk) and in transit (over networks). But AI workloads must decrypt data to process it — creating a vulnerability window where models, training data, and inference inputs exist unencrypted in memory. This “data in use” gap is the target of confidential computing.
For AI specifically, the stakes are high:
- Model weights represent millions of dollars in training investment and proprietary IP. Unencrypted weights in GPU memory can be extracted by compromised hypervisors, malicious co-tenants, or insider threats.
- Training data often includes sensitive personal information, proprietary business data, or regulated records. Memory residuals from one tenant’s training run can leak to the next user of shared GPU infrastructure.
- Inference inputs may contain classified documents, medical records, financial data, or other material that must never be accessible to infrastructure operators.
Traditional security perimeters — firewalls, access controls, network segmentation — don’t help when the threat is the infrastructure itself.
## TEE architecture for AI workloads
Trusted Execution Environments create hardware-isolated regions of memory where computation occurs under cryptographic protection. The key property: even the host operating system, hypervisor, and hardware management firmware cannot read TEE-protected memory.
### CPU TEEs
Intel TDX (Trust Domain Extensions) creates isolated virtual machines called Trust Domains. Each TD has its own encrypted memory space with hardware-enforced access controls. The CPU’s memory encryption engine encrypts all data leaving the processor boundary, including writes to DRAM.
AMD SEV-SNP (Secure Encrypted Virtualization — Secure Nested Paging) provides similar isolation with per-VM encryption keys and integrity protection. SNP adds strong memory integrity guarantees that prevent replay and remapping attacks that affected earlier SEV versions.
Both architectures support remote attestation — the ability for a remote verifier to cryptographically confirm that a specific, unmodified workload is running on genuine hardware inside a correctly configured TEE.
### GPU TEEs
AI workloads are GPU-intensive, making CPU-only TEEs insufficient. NVIDIA Confidential Computing extends TEE protection to GPU computation:
- Encrypted PCIe bus — Data is encrypted as it moves between CPU and GPU, preventing bus-snooping attacks
- Encrypted GPU memory — All GPU VRAM is encrypted; even physical memory probes cannot extract model weights or intermediate activations
- GPU-to-GPU encryption — Multi-GPU workloads communicate over encrypted NVLink/NVSwitch channels, supporting distributed training and inference
- Composite attestation — Both CPU TEE and GPU TEE are attested together, establishing a chain of trust from CPU silicon through GPU silicon to the application
Supported hardware includes NVIDIA H100, H200, and the latest Blackwell B200 GPUs. Corvex demonstrated that confidential computing on HGX B200 systems achieves near-native throughput.
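The composite attestation idea above can be sketched in a few lines: a verifier accepts a node only when every link in the chain, CPU TEE and GPU TEE, presents an expected measurement. This is a toy model with invented measurement inputs, not NVIDIA's actual evidence format (which is SPDM-based).

```python
# Toy composite-attestation check. The measured byte strings and the
# combine rule are illustrative assumptions, not a real evidence schema.
import hashlib

def measure(component: bytes) -> str:
    """Stand-in for a hardware measurement register (hash of loaded code)."""
    return hashlib.sha256(component).hexdigest()

# Reference values the verifier expects for a correctly configured node.
EXPECTED = {
    "cpu_tee": measure(b"tdx-module||guest-kernel"),
    "gpu_tee": measure(b"gpu-firmware||cc-mode-on"),
}

def composite_verdict(evidence: dict) -> bool:
    # Trust holds only if *every* component in the chain matches.
    return all(evidence.get(name) == ref for name, ref in EXPECTED.items())

good = composite_verdict({"cpu_tee": measure(b"tdx-module||guest-kernel"),
                          "gpu_tee": measure(b"gpu-firmware||cc-mode-on")})
# A node whose GPU is not in confidential-computing mode fails the whole chain:
bad = composite_verdict({"cpu_tee": measure(b"tdx-module||guest-kernel"),
                         "gpu_tee": measure(b"gpu-firmware||cc-mode-off")})
```

The all-or-nothing rule is the point: a trusted CPU TEE attached to an unverified GPU gives no end-to-end guarantee.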
### The bounce buffer architecture
Moving data between CPU TEE and GPU TEE requires careful handling. Intel and NVIDIA developed the bounce buffer approach:
- Data is decrypted only inside the CPU TEE
- It is re-encrypted into an intermediate memory region (the bounce buffer) that stages the PCIe transfer
- The GPU TEE decrypts the buffer contents for computation in encrypted VRAM
- At no point is plaintext data accessible to the hypervisor, device drivers, or host OS
This architecture ensures that even if the host software stack is completely compromised, the AI workload’s data remains protected.
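The flow can be modeled end to end with a toy stream cipher (real hardware uses AES engines in the memory controller and on the PCIe link; the keys and stage names here are invented for illustration):

```python
# Toy model of the bounce-buffer flow. stream_xor is a SHA-256-keystream
# XOR cipher used only to make the hops concrete; it is NOT the real scheme.
import hashlib

def stream_xor(key: bytes, data: bytes) -> bytes:
    """Symmetric toy cipher: XOR data with a hash-derived keystream."""
    out, counter = bytearray(), 0
    while len(out) < len(data):
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(b ^ k for b, k in zip(data, out))

CPU_TEE_KEY, BOUNCE_KEY = b"cpu-tee-key", b"bounce-buffer-key"

def cpu_tee_stage(ciphertext_at_rest: bytes) -> bytes:
    plaintext = stream_xor(CPU_TEE_KEY, ciphertext_at_rest)  # decrypt inside CPU TEE
    return stream_xor(BOUNCE_KEY, plaintext)  # re-encrypt into the bounce buffer

def gpu_tee_stage(bounce_ciphertext: bytes) -> bytes:
    return stream_xor(BOUNCE_KEY, bounce_ciphertext)  # decrypt inside GPU TEE

secret = b"model weights"
at_rest = stream_xor(CPU_TEE_KEY, secret)
in_transit = cpu_tee_stage(at_rest)   # the only view the host/hypervisor gets
recovered = gpu_tee_stage(in_transit)
```

Note that the host-visible value (`in_transit`) is ciphertext at every hop; plaintext exists only inside the two TEE-stage functions.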
## Attestation and verification
Attestation is the mechanism that makes confidential computing trustworthy. Without it, a user has no way to verify that their workload is actually running inside a TEE rather than on a regular, unprotected system.
### How attestation works
- Hardware generates an attestation report — The TEE hardware creates a cryptographically signed report containing measurements of the loaded firmware, kernel, and application code
- Report includes a unique challenge — The verifier provides a nonce to prevent replay attacks
- Independent verification — An attestation service (like Intel Trust Authority) validates the hardware signatures and measurements against expected values
- Key release on success — Only after successful attestation are decryption keys released to the workload
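The four steps above can be sketched as a single handshake. The hardware signing key is modeled with an HMAC secret for brevity; real TEEs sign reports with asymmetric keys rooted in silicon, and the measurement inputs here are placeholders.

```python
# Illustrative attestation handshake: nonce challenge, signed report,
# verification, key release. HMAC stands in for the hardware signature.
import hashlib, hmac, secrets

HW_SIGNING_KEY = secrets.token_bytes(32)  # stand-in for the fused hardware key
EXPECTED_MEASUREMENT = hashlib.sha256(b"firmware||kernel||app").hexdigest()

def generate_report(nonce: bytes) -> dict:
    """Runs 'inside' the TEE: measure loaded code, bind the verifier's nonce."""
    measurement = hashlib.sha256(b"firmware||kernel||app").hexdigest()
    body = nonce.hex() + measurement
    sig = hmac.new(HW_SIGNING_KEY, body.encode(), hashlib.sha256).hexdigest()
    return {"nonce": nonce.hex(), "measurement": measurement, "sig": sig}

def verify_and_release(report: dict, nonce: bytes, workload_key: bytes):
    """Runs at the verifier/key broker: check signature, freshness, measurement."""
    body = report["nonce"] + report["measurement"]
    expected_sig = hmac.new(HW_SIGNING_KEY, body.encode(), hashlib.sha256).hexdigest()
    ok_sig = hmac.compare_digest(report["sig"], expected_sig)
    ok_nonce = report["nonce"] == nonce.hex()   # rejects replayed reports
    ok_meas = report["measurement"] == EXPECTED_MEASUREMENT
    return workload_key if (ok_sig and ok_nonce and ok_meas) else None

nonce = secrets.token_bytes(16)
key = verify_and_release(generate_report(nonce), nonce, b"decryption-key")
# A report bound to an old nonce is rejected even though its signature is valid:
stale = verify_and_release(generate_report(nonce), secrets.token_bytes(16), b"decryption-key")
```

The nonce check is what makes step 2 matter: without it, a captured report from a genuine TEE could be replayed from an unprotected machine.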
### Production attestation services
Intel Trust Authority provides independent, multi-cloud attestation verification. It’s offered as a free service (with optional paid enterprise support) and can attest workloads across different cloud providers and on-premises infrastructure.
NVIDIA Remote Attestation verifies GPU TEE integrity and provides evidence of correct GPU configuration for confidential computing mode.
These services enable a zero-trust model for AI infrastructure: workloads prove their trustworthiness cryptographically rather than relying on perimeter security or contractual guarantees.
## Production deployments (2025–2026)
Confidential AI computing has moved from research to production:
Fortanix Confidential AI — Powered by NVIDIA Confidential Computing, Fortanix enables enterprises to deploy proprietary third-party AI models where model weights remain encrypted and invisible to infrastructure operators. The model vendor never sees the customer’s data; the customer never sees the model weights.
EQTY Lab Verifiable Runtime — Launched at GTC 2026, EQTY Lab’s system provides silicon-based enforcement for autonomous AI agents in enterprise environments. It leverages NVIDIA Confidential Computing on BlueField DPUs to create hardware-attested, cryptographically proven security postures for AI agents, at near-zero performance cost.
Phala Network GPU TEE Cloud — Combines Intel TDX and NVIDIA Confidential Computing to offer GPU TEE services for decentralized AI workloads, with full remote attestation and OpenAI-compatible APIs.
TrustedGenAi — Open-source TEE-based LLM inference using Intel TDX or AMD SEV-SNP, with remote attestation and standard API compatibility.
## Implications for AI agent security
Confidential computing is particularly relevant for AI agent deployments:
Credential protection — AI agents often hold API keys, database credentials, and authentication tokens. Inside a TEE, these credentials exist only in encrypted memory, inaccessible to the host system even if the host is compromised.
Model IP protection — Organizations deploying proprietary models to edge locations or partner infrastructure can use TEEs to prevent model extraction, even by the infrastructure operator.
Regulatory compliance — TEE-based processing provides cryptographic evidence of data isolation that satisfies regulatory requirements for data residency, access control, and processing restrictions.
Multi-party computation — Multiple organizations can collaboratively run AI workloads where each party’s data remains invisible to the others, enabled by TEE isolation and attestation.
## Limitations and considerations
Confidential computing is powerful but not a silver bullet:
- Performance overhead — While overhead is shrinking (Blackwell achieves near-native throughput), memory encryption and PCIe bus encryption still impose measurable cost. Latency-sensitive workloads should be benchmarked before committing
- Hardware requirements — TEE support requires specific CPU and GPU generations. Not all cloud instances offer confidential computing options
- Side-channel risks — While TEEs protect against software-based attacks, some side-channel attacks (cache timing, power analysis) may still leak information. Vendors continuously patch known side-channel vulnerabilities
- TCB size — The Trusted Computing Base (the amount of code that must be trusted) varies by implementation. Smaller TCBs are more auditable and less likely to contain vulnerabilities
For the most critical AI deployments — safety-critical autonomous systems, classified workloads, regulated data processing — confidential computing should be combined with hardware capability architectures such as CHERI, which provide additional memory safety and compartmentalization guarantees.
## Getting started
Organizations evaluating confidential AI computing should:
- Identify sensitive workloads — Which AI models and datasets require “data in use” protection?
- Assess hardware availability — Confirm TEE support in your cloud provider or on-premises hardware
- Evaluate attestation requirements — Determine who needs to verify workload integrity and how
- Test performance impact — Benchmark your specific workloads with TEE enabled
- Plan for key management — Confidential computing requires robust key management for sealed keys and attestation-gated release
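On the last point, the core of attestation-gated key management is small: a broker maps approved workload measurements to wrapped keys and releases nothing else. The `KeyBroker` class and its method names are illustrative, not a specific KMS API.

```python
# Minimal sketch of attestation-gated key release. In production the
# lookup follows a *verified* attestation report (e.g. validated by
# Intel Trust Authority), not a bare string comparison as shown here.
import hashlib

class KeyBroker:
    def __init__(self):
        self._policies = {}  # approved measurement -> key material

    def register(self, measurement: str, key: bytes) -> None:
        """Operator pre-approves a workload image by its measurement."""
        self._policies[measurement] = key

    def release(self, attested_measurement: str):
        """Release the key only for a measurement on the allowlist."""
        return self._policies.get(attested_measurement)

broker = KeyBroker()
trusted = hashlib.sha256(b"approved-inference-image").hexdigest()
broker.register(trusted, b"model-decryption-key")

granted = broker.release(trusted)
denied = broker.release(hashlib.sha256(b"tampered-image").hexdigest())
```

The design choice to test in step 3 is who operates this broker: a cloud-provider KMS, an independent attestation service, or your own infrastructure each implies a different trust boundary.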
The infrastructure for confidential AI is production-ready today. The question is no longer “can we protect AI workloads cryptographically?” but “how quickly should we deploy this protection?”