2 posts tagged with "MIG"

Posts about NVIDIA Multi-Instance GPU

Google Cloud Fractional G4 Uses vGPU, Not MIG. Here Is Why That Matters.

15 min read
Dhayabaran V
Barrack AI

Google Cloud announced fractional G4 VMs at GTC 2026 in March. The pitch is straightforward. The RTX PRO 6000 Blackwell Server Edition is a 96 GB GDDR7 GPU. Most workloads do not need all 96 GB. So Google slices one physical GPU into fractions (1/8, 1/4, 1/2) and sells you only what you need. Pay less, get less. Simple.
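To make the slicing arithmetic concrete, here is a minimal sketch of the nominal memory share behind each fraction. It assumes the 96 GB is divided evenly, which is an assumption for illustration only: the usable framebuffer on a real fractional shape may differ. The names `TOTAL_MEMORY_GB` and `nominal_share_gb` are hypothetical and not part of any Google or NVIDIA API.

```python
from fractions import Fraction

# 96 GB is the figure quoted in the post for the RTX PRO 6000 Blackwell Server Edition.
TOTAL_MEMORY_GB = 96

# The fractional sizes named in the post.
FRACTIONS = [Fraction(1, 8), Fraction(1, 4), Fraction(1, 2)]


def nominal_share_gb(fraction, total_gb=TOTAL_MEMORY_GB):
    """Return the nominal memory share in GB, ASSUMING an even split.

    This is plain arithmetic, not a query against any real service.
    """
    return float(total_gb * fraction)


# Map each fraction (as text, e.g. "1/8") to its nominal share.
shares = {str(f): nominal_share_gb(f) for f in FRACTIONS}

for name, gb in shares.items():
    print(f"{name} of the GPU -> about {gb:g} GB (nominal)")
```

Under that even-split assumption, the fractions work out to roughly 12 GB, 24 GB, and 48 GB.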

What Google does not say in the announcement, but does say in the documentation, is how that slicing works. The fractional G4 shapes use NVIDIA vGPU. Not MIG. That is a specific technical choice with specific security consequences, and the distinction matters if you are putting anything sensitive on a fractional instance.

This post covers what fractional G4 is actually built on, what NVIDIA's own documentation says about vGPU isolation versus MIG isolation, the Virtual GPU Manager's CVE history over the past two years, and what all of this means for workloads that care about tenant separation.

Fractional GPU Security: NVIDIA Says Sharing GPUs Is Not Safe

20 min read
Dhayabaran V
Barrack AI

The fractional GPU pitch goes like this. Full GPUs are expensive. Most workloads do not need a full GPU. So we will slice one GPU into fractions, rent you a fraction, and pass the savings along. Pay for what you use, the marketing says. Efficient, cheap, modern.

The part of the pitch that never gets said out loud is that the fraction you rented sits on the same physical hardware as someone else's fraction, and the isolation between your work and theirs is much weaker than the marketing suggests. NVIDIA's own documentation says so directly. Published research from MICRO, CCS, ISCA, and USENIX Security has been demonstrating it for years. The gap between what NVIDIA recommends and what fractional GPU providers actually ship is the entire problem.

This post is about fractional GPU security and why a fractional GPU is the wrong place for anything you care about. Not just regulated enterprise workloads. Anything that matters to the person running it. Your company's inference traffic. Your startup's model weights. Your research data. Your unpublished thesis. If the work has value to you, a fractional GPU is the wrong answer.