GPU Server Buying Guide: NVIDIA A100 vs. H100 on the Used Market
February 10, 2026 · 5 min read · Silicon Value Book
The AI infrastructure boom has created an unprecedented secondary market for GPU servers. As organizations upgrade from A100 to H100 (and beyond to B100/B200), a steady stream of capable A100 systems is entering the used market at increasingly attractive prices.
But buying used GPU servers is fundamentally different from buying standard compute. The stakes are higher, the failure modes are different, and the performance gap between generations is enormous.
A100 vs. H100: The Numbers
Before discussing pricing, let's establish the performance baseline:
| Specification | A100 80GB SXM | H100 80GB SXM |
|--------------|---------------|---------------|
| FP16 Tensor | 312 TFLOPS | 989 TFLOPS |
| FP8 Tensor | N/A | 1,979 TFLOPS |
| Memory | 80GB HBM2e | 80GB HBM3 |
| Memory Bandwidth | 2.0 TB/s | 3.35 TB/s |
| TDP | 400W | 700W |
| NVLink | 600 GB/s | 900 GB/s |
| Interconnect | NVLink 3.0 | NVLink 4.0 |
The H100 delivers roughly 3x the training throughput for large language models and 2x for inference compared to the A100. That performance gap is reflected in pricing, but not linearly.
Current Market Pricing
Used GPU server pricing moves fast, but here are the ranges as of early 2026:
Individual GPUs:
- A100 80GB SXM: $5,000-8,000 (down from $15,000+ in 2023)
- A100 80GB PCIe: $4,000-6,500
- H100 80GB SXM: $18,000-25,000
- H100 80GB PCIe: $15,000-20,000
Complete systems:
- DGX A100 (8x A100 SXM): $60,000-100,000
- DGX H100 (8x H100 SXM): $200,000-300,000
- Dell/HPE/Supermicro 4-GPU A100 systems: $35,000-55,000
The A100's value proposition is clear: at roughly 25-30% of the H100's price, you get 33-50% of its performance. For many workloads — inference serving, fine-tuning, and moderate-scale training — that's an excellent deal.
For inference workloads specifically, the A100 often delivers better performance-per-dollar than the H100. The H100's advantages are most pronounced in large-scale training where the faster interconnect and higher tensor throughput compound across hundreds of GPUs.
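To make the value argument concrete, here is a rough back-of-the-envelope calculation using the midpoints of the price ranges above and the FP16 tensor figures from the spec table. The prices are illustrative midpoints rather than real quotes, and paper TFLOPS overstate what either card delivers on an actual workload.

```bash
# Rough FP16-TFLOPS-per-dollar comparison using midpoint prices quoted in this guide.
# Adjust the numbers to the actual quotes you receive.
awk 'BEGIN {
  a100_tflops = 312;  a100_price = 6500;    # A100 80GB SXM, price-range midpoint
  h100_tflops = 989;  h100_price = 21500;   # H100 80GB SXM, price-range midpoint
  printf "A100: %.1f GFLOPS per dollar\n", a100_tflops * 1000 / a100_price;
  printf "H100: %.1f GFLOPS per dollar\n", h100_tflops * 1000 / h100_price;
}'
# Roughly 48 vs. 46 GFLOPS per dollar on paper; near parity, which is why the A100
# tends to win on inference, where the real-world gap is closer to 2x than 3x.
```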
What to Check When Buying Used GPU Servers
GPU servers require more careful inspection than standard compute. Here's what to verify:
GPU Health
- Run `nvidia-smi` and check for ECC errors, temperature readings, and clock speeds
- Power draw at idle and load — GPUs with degraded VRM components may throttle under load
- Memory health — run a GPU memory test (CUDA memtest) for at least 4 hours
- NVLink status — verify all NVLink bridges are functional with `nvidia-smi nvlink -s`
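Most of these checks can be scripted. A minimal sketch using standard `nvidia-smi` queries is below; the memory burn-in step assumes you have DCGM (`dcgmi`) or the open-source cuda_memtest tool installed, which is an assumption, not something every system ships with.

```bash
# ECC errors, temperature, power draw, and clocks for every GPU in the box
nvidia-smi --query-gpu=index,name,temperature.gpu,power.draw,clocks.sm,ecc.errors.uncorrected.volatile.total --format=csv

# Full ECC detail: aggregate counters persist across reboots and hint at past problems
nvidia-smi -q -d ECC

# NVLink status: every link on every GPU should report as active
nvidia-smi nvlink -s

# Memory burn-in (assumes DCGM is installed; follow with a multi-hour cuda_memtest
# pass if you want the full 4-hour soak described above)
dcgmi diag -r 3
```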
Thermal History
GPUs that ran at sustained high temperatures may have reduced lifespan. While you can't fully verify thermal history, check:
- Fan condition and noise levels (worn bearings indicate heavy use)
- Thermal paste condition (if accessible)
- Any signs of thermal discoloration on the PCB
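You can't read thermal history out of the card, but you can watch how it behaves under a sustained test load, for example your own benchmark or a burn-in tool if you have one installed. A minimal monitoring sketch:

```bash
# Stream per-GPU power, temperature, utilization, clocks, and throttling violations
# once per second while your test workload runs; watch for sagging clocks or runaway temps
nvidia-smi dmon -s pucv

# After the run, check whether the driver recorded thermal or power throttling
nvidia-smi -q -d PERFORMANCE | grep -E -A 12 "Clocks (Event|Throttle) Reasons"
```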
System-Level Verification
- All GPUs visible — verify that the expected number of GPUs appears in both `nvidia-smi` and `lspci`
- PCIe link speed — A100 GPUs should negotiate at Gen4 x16; H100 GPUs at Gen5 x16
- NVSwitch functionality (DGX systems) — verify with `nvidia-smi nvlink -s`
- Cooling system — GPU servers have complex cooling that must be fully functional
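These checks are also scriptable. A sketch using standard `nvidia-smi` query fields and `lspci`; note that PCIe links can downshift at idle to save power, so check link speed while the GPUs are busy.

```bash
# Count GPUs as seen by the driver and by the PCIe bus; the numbers should line up
# (on DGX systems, lspci will also list NVSwitch devices)
nvidia-smi --list-gpus
lspci | grep -i nvidia

# PCIe link generation and width per GPU; measure under load, since idle links downtrain
nvidia-smi --query-gpu=index,name,pcie.link.gen.current,pcie.link.width.current --format=csv

# Topology matrix: shows which GPU pairs talk over NVLink (NV#) versus PCIe only
nvidia-smi topo -m
```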
SXM vs. PCIe: Which to Buy?
SXM modules are designed for NVLink-connected GPU-to-GPU communication. They offer higher TDP (and thus higher sustained performance) and faster interconnect. However, they require specific baseboard designs and aren't interchangeable between server models.
PCIe GPUs are more flexible — they fit standard PCIe slots and can be moved between different server platforms. They run at lower TDP and lack the NVLink interconnect bandwidth of SXM modules.
Buy SXM when: Multi-GPU training is your primary workload and you need maximum GPU-to-GPU bandwidth.
Buy PCIe when: You want flexibility across server platforms, your workloads are primarily single-GPU inference, or you plan to repurpose the GPUs later.
Common Pitfalls
Cooling Requirements
GPU servers generate enormous heat. A DGX A100 draws 6.5kW at full load — that's equivalent to running six space heaters. Ensure your facility can handle:
- Adequate rack cooling (30-40kW per rack for dense GPU deployments)
- Appropriate power circuits (most GPU servers need 2x 30A 208V circuits)
- Sufficient airflow (front-to-back, with hot/cold aisle separation)
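On the power-circuit point, the arithmetic behind the 2x 30A 208V guideline looks roughly like this, assuming the common practice of derating continuous loads to 80% of breaker capacity. Treat it as a sanity check, not electrical advice; your facility team and the server's actual PSU configuration are the authority.

```bash
# Usable continuous power on a 30A 208V circuit with an 80% derating
awk 'BEGIN {
  volts = 208; amps = 30; derate = 0.8;
  per_circuit = volts * amps * derate;                       # about 4,992 W per circuit
  printf "Per circuit: %.0f W, two circuits: %.0f W\n", per_circuit, 2 * per_circuit;
  # A DGX A100 at roughly 6,500 W full load fits on two such circuits with headroom,
  # but leaves no room to share that pair with a second system.
}'
```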
Software Licensing
Some GPU server features require software licenses:
- NVIDIA AI Enterprise — required for certain vGPU and enterprise AI features
- DGX-specific software stack — DGX OS and management tools may require active NVIDIA support
- CUDA compute capability — verify the GPU generation supports your target CUDA version
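On the last point, a quick way to check is below, assuming a reasonably recent driver (the `compute_cap` query field does not exist on very old driver branches). For reference, the A100 is compute capability 8.0 and the H100 is 9.0.

```bash
# Per-GPU compute capability and the driver version installed on the system
nvidia-smi --query-gpu=name,driver_version,compute_cap --format=csv

# The nvidia-smi banner also shows the highest CUDA version the installed driver supports
nvidia-smi | head -n 5
```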
Warranty and Support
NVIDIA DGX systems have proprietary components that can be difficult to source outside of NVIDIA's support channel. Consider:
- Can you self-support, or do you need vendor support?
- Are replacement GPUs available on the secondary market for your specific SKU?
- What's the cost of a single GPU replacement vs. an extended warranty?
Recommendations by Use Case
LLM Inference at Scale: Used A100 systems offer the best value. The inference performance gap is smaller than training, and A100 availability is excellent.
Fine-tuning and RAG: A100 80GB provides sufficient memory for most fine-tuning tasks. The cost savings over H100 are substantial.
Pre-training Large Models: H100 is worth the premium here. NVLink 4.0 and the Transformer Engine provide meaningful speedups that compound over weeks of training.
Research and Experimentation: A100 PCIe cards in flexible server platforms give maximum versatility at the lowest cost.