Fractional GPU Security: NVIDIA Says Sharing GPUs Is Not Safe
The fractional GPU pitch goes like this. Full GPUs are expensive. Most workloads do not need a full GPU. So we will slice one GPU into fractions, rent you a fraction, and pass the savings along. Pay for what you use, the marketing says. Efficient, cheap, modern.
The part of the pitch that never gets said out loud is that the fraction you rented sits on the same physical hardware as someone else's fraction, and the isolation between your work and theirs is much weaker than the marketing suggests. NVIDIA's own documentation says so directly. Research published at MICRO, CCS, ISCA, and USENIX Security has demonstrated it for years. The gap between what NVIDIA recommends and what fractional GPU providers actually ship is the entire problem.
This post is about fractional GPU security and why a fractional GPU is the wrong place for anything you care about. Not just regulated enterprise workloads. Anything that matters to the person running it. Your company's inference traffic. Your startup's model weights. Your research data. Your unpublished thesis. If the work has value to you, a fractional GPU is the wrong answer.
