How many GPU cloud providers should I shortlist?

Three to five. Fewer makes the comparison shallow; more dilutes the diligence. The shortlist should mix at least one direct operator and one marketplace so that the trade-offs become visible.

Should I always pick a direct operator over a reseller?

Not necessarily. A direct operator is the right pick when you want a single accountable counterparty for hardware, network and uptime. A reseller can be the right pick when you want hyperscaler reliability with friendlier procurement terms or specialised stacks. The workload picks the model, not the other way round.

How important is the country a provider is in?

For most workloads, the country matters only as far as the latency budget and the data residency constraint make it matter. For regulated workloads the country is non-negotiable, so the country page is the right starting point. For unconstrained workloads, GPU model availability and price usually dominate.

How to choose a GPU cloud provider

What questions should I ask before picking a GPU cloud provider?

Start by writing down the workload shape (training, inference, batch jobs, interactive notebooks) and the hard constraints (region, compliance, contract length). Then evaluate providers along three axes: who actually owns the GPUs, what the real all-in price looks like once storage and egress are included, and how reliably the capacity is available when you need it. The brochure rarely answers any of those three cleanly, so the shortlist is built by asking.

Step 1: describe the workload before you describe the GPU

The biggest mistake in GPU procurement is starting with the accelerator. The accelerator is downstream of the workload and the workload determines which trade-offs are tolerable. A continuous training job that runs for weeks tolerates spot-style interruption poorly, but happily absorbs long contract terms in exchange for committed pricing. An inference fleet behind a public API has the opposite shape, with elastic burst as the priority and contract terms a liability. A batch pipeline that runs overnight can chase the cheapest spot capacity globally, while an interactive research notebook needs predictable latency to a specific region.

Write the workload down in plain prose before opening a single pricing page. Two paragraphs is usually enough. The shape of that description quietly eliminates a third of the market before you have looked at a single SKU.
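Once the prose description exists, its hard constraints can be applied mechanically. The sketch below is one way to do that, not a prescribed method: the `Workload` and `Offer` records, their fields, the provider names and the regions are all hypothetical and stand in for whatever your own description contains.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Workload:
    shape: str                    # "training", "inference", "batch" or "notebook"
    regions: frozenset            # regions that satisfy latency and residency limits
    max_contract_months: int      # longest commitment the workload can absorb
    tolerates_interruption: bool  # whether spot-style reclaim is acceptable


@dataclass(frozen=True)
class Offer:
    provider: str                 # hypothetical provider name
    region: str
    min_contract_months: int
    interruptible: bool           # spot-style capacity that can be reclaimed


def eligible(workload, offers):
    """Keep only the offers that meet every hard constraint of the workload."""
    return [
        o for o in offers
        if o.region in workload.regions
        and o.min_contract_months <= workload.max_contract_months
        and (workload.tolerates_interruption or not o.interruptible)
    ]


# Hypothetical example: a weeks-long training run pinned to a single region.
training = Workload("training", frozenset({"eu-west"}), 12, False)
offers = [
    Offer("provider-a", "eu-west", 12, False),
    Offer("provider-b", "eu-west", 0, True),   # spot-only, fails the interruption test
    Offer("provider-c", "us-east", 1, False),  # outside the allowed regions
]
shortlist = [o.provider for o in eligible(training, offers)]
print(shortlist)
```

In this made-up example only `provider-a` survives, which is the elimination effect described above: the constraints do the filtering before any price is compared.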

Step 2: figure out who actually owns the metal

Three sourcing models dominate the market, and they price and behave differently. A direct operator runs its own GPUs on its own network. A marketplace surfaces capacity from many independent operators and routes you to whichever has availability. A reseller fronts hyperscaler capacity and adds its own packaging on top. None of these is inherently better than the others. The right pick depends on whether you want a single accountable counterparty or are happy to trade accountability for price or breadth.

The brochure will not draw this distinction for you, because the brochure has every reason not to. You can usually infer it from three signals: whether the provider runs its own ASN and announces its own IP prefixes, whether the support contract names a specific physical facility, and whether the published capacity scales smoothly or in obvious hyperscaler-sized blocks. The deeper guide in this series walks through the tells in detail.

Step 3: compute the all-in price, not the headline price

The per-hour GPU number is the most visible price and the least useful one. Workloads always touch storage, almost always touch networking and frequently touch a control plane that bills separately. A six-dollar-per-hour H100 quote can land at well over twice that once persistent storage, model checkpoints, dataset egress and inter-region traffic are added in. The directional ratio matters more than the exact number, because providers cluster around very different fixed-versus-variable shapes.

The fix is not to pick the cheapest headline. The fix is to write down a realistic month of work, including the dataset sizes and the egress pattern, then ask every shortlisted provider to price that month. Two or three of them will quietly drop off the list at that point and the ones that remain will be priceable apples-to-apples.
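The "price the same month everywhere" exercise is simple arithmetic, and a minimal sketch follows. Every rate and volume here is invented for illustration and is not a real quote; the `MonthOfWork` and `Quote` records and their fields are assumptions about which line items a provider bills, so adjust them to match the actual price sheets.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MonthOfWork:
    gpu_hours: float        # total GPU-hours across all instances
    storage_tb: float       # persistent storage held all month (datasets, checkpoints)
    egress_tb: float        # data leaving the provider
    inter_region_tb: float  # traffic between the provider's own regions


@dataclass(frozen=True)
class Quote:
    provider: str               # hypothetical provider name
    gpu_per_hour: float         # headline price
    storage_per_tb_month: float
    egress_per_tb: float
    inter_region_per_tb: float
    fixed_monthly: float = 0.0  # e.g. a separately billed control plane


def all_in_monthly(q, m):
    """Total bill for one realistic month, not just the GPU line."""
    return (
        m.gpu_hours * q.gpu_per_hour
        + m.storage_tb * q.storage_per_tb_month
        + m.egress_tb * q.egress_per_tb
        + m.inter_region_tb * q.inter_region_per_tb
        + q.fixed_monthly
    )


def effective_gpu_hour(q, m):
    """All-in cost expressed per GPU-hour, for comparison with the headline."""
    return all_in_monthly(q, m) / m.gpu_hours


# Illustrative numbers only.
month = MonthOfWork(gpu_hours=1000, storage_tb=50, egress_tb=20, inter_region_tb=10)
quotes = [
    Quote("provider-a", 6.0, 100.0, 90.0, 20.0, fixed_monthly=500.0),
    Quote("provider-b", 7.5, 20.0, 0.0, 0.0),
]
for q in sorted(quotes, key=lambda q: all_in_monthly(q, month)):
    print(f"{q.provider}: headline {q.gpu_per_hour:.2f}/h, "
          f"effective {effective_gpu_hour(q, month):.2f}/h, "
          f"month {all_in_monthly(q, month):.0f}")
```

With these made-up inputs, the provider with the lower headline rate ends up with the higher effective rate, because its storage and egress charges dominate. The ranking depends entirely on the month you write down, which is why the month has to be realistic.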

Step 4: pressure-test availability before signing

Listed availability is not the same as actual availability. A provider can advertise H100 capacity on its homepage while quietly throttling new tenants, queueing requests behind existing committed customers or sourcing the SKU from a partner facility that has its own queue. The diligence question is not whether the model is offered but whether it can be booked at the size and in the region you need, on the day you need it.

A short paid pilot is the cleanest way to confirm this, and most serious providers will accommodate one. If a provider refuses, or routes you to a long sales cycle before demonstrating any capacity, treat that as a signal in itself.
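A pilot is easier to judge if the booking attempts are recorded and summarised the same way for every provider. The sketch below assumes you keep a simple log by hand; the `BookingAttempt` record, the wait budget and the sample entries are hypothetical and do not come from any provider's API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class BookingAttempt:
    requested_gpus: int
    granted_gpus: int
    wait_minutes: Optional[float]  # None if the request was never fulfilled


def fill_rate(attempts, max_wait_minutes):
    """Share of attempts granted in full within the acceptable wait."""
    filled = [
        a for a in attempts
        if a.granted_gpus >= a.requested_gpus
        and a.wait_minutes is not None
        and a.wait_minutes <= max_wait_minutes
    ]
    return len(filled) / len(attempts)


# Hypothetical pilot log: four requests for an 8-GPU node.
log = [
    BookingAttempt(8, 8, 5.0),    # filled promptly
    BookingAttempt(8, 8, 45.0),   # filled, but past the wait budget
    BookingAttempt(8, 4, 10.0),   # only partially granted
    BookingAttempt(8, 0, None),   # never fulfilled
]
rate = fill_rate(log, max_wait_minutes=30)
print(f"filled in full within 30 minutes: {rate:.0%}")
```

Counting only full grants within the wait budget is a deliberately strict choice: it measures whether capacity can be booked at the size you need, when you need it, rather than whether the SKU is merely listed.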

Common questions