Step 1 — describe the workload before you describe the GPU
The biggest mistake in GPU procurement is starting with the accelerator. The accelerator is downstream of the workload, and the workload determines which trade-offs are tolerable. A continuous training job that runs for weeks tolerates spot-style interruption poorly, but happily absorbs long contract terms in exchange for committed pricing. An inference fleet behind a public API has the opposite shape, with elastic burst as the priority and contract terms a liability. A batch pipeline that runs overnight can chase the cheapest spot capacity globally, while an interactive research notebook needs predictable latency to a specific region.
Write the workload down in plain prose before opening a single pricing page. Two paragraphs is usually enough. The shape of that description quietly eliminates roughly a third of the market before you have looked at a single SKU.
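Once the prose exists, its constraints can be turned into a mechanical filter. The sketch below is one hedged way to do that in Python. The field names, the offer attributes, and the rejection rules are all hypothetical, chosen only to mirror the four workload shapes described above. They are not any provider's schema.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional


@dataclass(frozen=True)
class Workload:
    """Constraints distilled from the plain-prose description (hypothetical fields)."""
    tolerates_interruption: bool
    accepts_committed_term: bool
    needs_elastic_burst: bool
    required_region: Optional[str] = None  # None means any region will do


@dataclass(frozen=True)
class Offer:
    """What a provider is selling (hypothetical fields)."""
    name: str
    interruptible: bool      # spot-style capacity that can be reclaimed
    min_term_months: int     # 0 means on-demand, no commitment
    elastic: bool            # can scale up quickly on request
    regions: FrozenSet[str]


def rejection_reasons(w: Workload, o: Offer) -> List[str]:
    """Return why offer o does not fit workload w; an empty list means it fits."""
    reasons = []
    if o.interruptible and not w.tolerates_interruption:
        reasons.append("interruptible capacity")
    if o.min_term_months > 0 and not w.accepts_committed_term:
        reasons.append("requires a committed term")
    if w.needs_elastic_burst and not o.elastic:
        reasons.append("no elastic burst")
    if w.required_region is not None and w.required_region not in o.regions:
        reasons.append("region not offered")
    return reasons


# A weeks-long training job: no interruptions, long terms are acceptable.
training = Workload(
    tolerates_interruption=False,
    accepts_committed_term=True,
    needs_elastic_burst=False,
)
spot = Offer("spot-pool", interruptible=True, min_term_months=0,
             elastic=True, regions=frozenset({"eu", "us"}))
print(rejection_reasons(training, spot))
```

The point of the sketch is the ordering: the workload record is filled in first, and offers are only ever judged against it.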
Step 2 — figure out who actually owns the metal
Three sourcing models dominate the market, and they price and behave differently. A direct operator runs its own GPUs on its own network. A marketplace surfaces capacity from many independent operators and routes you to whichever has availability. A reseller fronts hyperscaler capacity and adds its own packaging on top. None of these is inherently better than the others; the right pick depends on whether you want a single accountable counterparty or are happy to trade accountability for price or breadth.
The brochure will not draw this distinction for you, because the brochure has every reason not to. You can usually infer it from three signals: whether the provider runs its own ASN and announces its own IP prefixes, whether the support contract names a specific physical facility, and whether the published capacity scales smoothly or in obvious hyperscaler-sized blocks. The deeper guide in this series walks through the tells in detail.
Step 3 — compute the all-in price, not the headline price
The per-hour GPU number is the most visible price and the least useful one. Workloads always touch storage, almost always touch networking, and frequently touch a control plane that bills separately. A six-dollar-per-hour H100 quote can land at well over twice that once persistent storage, model checkpoints, dataset egress, and inter-region traffic are added in. The directional ratio matters more than the exact number, because providers cluster around very different fixed-versus-variable shapes.
The fix is not to pick the cheapest headline. The fix is to write down a realistic month of work, including the dataset sizes and the egress pattern, and to ask every shortlisted provider to price that month. Two or three of them will quietly drop off the list at that point, and the ones that remain can be compared like for like.
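The arithmetic behind that month of work is simple enough to script. What follows is a minimal sketch in Python, not a billing model: every rate, quantity, and field name is a hypothetical placeholder, and the line items are limited to the ones named above (GPU hours, persistent storage, egress, inter-region traffic, and a separately billed control plane).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Quote:
    """One provider's rates in USD. All values used below are made up."""
    name: str
    gpu_hourly: float                  # headline price per GPU-hour
    storage_gb_month: float            # persistent storage, per GB-month
    egress_gb: float                   # data leaving the provider, per GB
    inter_region_gb: float             # traffic between regions, per GB
    control_plane_month: float = 0.0   # flat monthly fee, if billed separately


@dataclass(frozen=True)
class MonthOfWork:
    """The realistic month every shortlisted provider is asked to price."""
    gpu_hours: float
    storage_gb: float        # datasets plus checkpoints kept on disk
    egress_gb: float
    inter_region_gb: float


def all_in_cost(q: Quote, m: MonthOfWork) -> float:
    """Total monthly bill in USD for workload m under quote q."""
    return (
        q.gpu_hourly * m.gpu_hours
        + q.storage_gb_month * m.storage_gb
        + q.egress_gb * m.egress_gb
        + q.inter_region_gb * m.inter_region_gb
        + q.control_plane_month
    )


def effective_hourly(q: Quote, m: MonthOfWork) -> float:
    """All-in cost per GPU-hour, directly comparable to the headline rate."""
    return all_in_cost(q, m) / m.gpu_hours


# Hypothetical month and two hypothetical quotes with different cost shapes.
month = MonthOfWork(gpu_hours=100, storage_gb=2_000, egress_gb=1_000, inter_region_gb=800)
quotes = [
    Quote("cheap-headline", 6.0, 0.25, 0.5, 0.125, control_plane_month=100.0),
    Quote("pricier-headline", 9.0, 0.125, 0.0, 0.0),
]
for q in sorted(quotes, key=lambda q: all_in_cost(q, month)):
    print(f"{q.name}: headline {q.gpu_hourly:.2f}/h, "
          f"effective {effective_hourly(q, month):.2f}/h")
```

Sorting by the all-in figure rather than the headline is the whole design choice here. With these invented numbers the cheaper headline ends up as the more expensive month, which is the pattern the step warns about.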
Step 4 — pressure-test availability before signing
Listed availability is not the same as actual availability. A provider can advertise H100 capacity on its homepage while quietly throttling new tenants, queueing requests behind existing committed customers, or sourcing the SKU from a partner facility that has its own queue. The diligence question is not whether the model is offered but whether it can be booked at the size and the region you need, on the day you need it.
A short paid pilot is the cleanest way to confirm this, and most serious providers will accommodate one. If a provider refuses or routes you to a long sales cycle before any capacity is actually demonstrated, treat that as a signal in itself.