I’m looking at some quotes from OSCER:
R750xa, dual H100 80 GB with NVLink 600 GB/s: $61,669.80
R750xa, dual A100 80 GB with NVLink 600 GB/s: $39,622.13
I avoided the jump from the A40s to the A100s because I found it yielded only marginal (a few percent) speedups in actual calculations. If I’m interpreting the data correctly, the leap from GDDR6 to HBM2e is mainly useful if you can load your entire dataset into VRAM to begin with. In my case, datasets run to dozens of terabytes, so that’s not going to be an option for a long time. So the only speedup I see in my workflow is in the raw matrix operations themselves.
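To make that concrete, here’s a rough back-of-envelope sketch of why a dataset that can’t fit in VRAM ends up transfer-bound rather than VRAM-speed-bound. The numbers are assumptions on my part (roughly 32 GB/s for a PCIe 4.0 x16 host-to-device link, roughly 2 TB/s on-card HBM2e bandwidth, and a hypothetical 30 TB dataset), not measurements:

```python
# Assumed figures -- not measurements from my system.
dataset_tb = 30          # "dozens of terabytes" -- hypothetical round number
pcie_gen4_gb_s = 32      # approx. PCIe 4.0 x16 host-to-device throughput, GB/s
hbm2e_gb_s = 2000        # approx. A100 80 GB on-card memory bandwidth, GB/s

dataset_gb = dataset_tb * 1000
minutes_per_pass = dataset_gb / pcie_gen4_gb_s / 60
print(f"one full pass over the data via PCIe: ~{minutes_per_pass:.0f} min")
print(f"on-card bandwidth is ~{hbm2e_gb_s / pcie_gen4_gb_s:.0f}x the PCIe link")
```

The point being: the PCIe link sets the floor for every pass over data that lives in host storage, and faster VRAM only pays off once the working set is resident on the card.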
The H100 is a whole new architecture, so I’m guessing there are substantial speedups beyond just the HBM memory. The problem is that there’s very little detail available, and what I can find is mostly for AI workloads where, again, presumably the whole dataset is already in memory.
Obviously the price jump is huge, but the H100s might actually be better value for the money if NVIDIA’s claimed 2.7x speedup is to be believed. Anyone here with H100 experience willing to chime in?
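For reference, the price-performance arithmetic behind that claim, using the quoted prices above (the 2.7x is NVIDIA’s marketing figure and may not hold for my workload):

```python
# Quoted prices from OSCER (dual-GPU R750xa configs).
h100_price = 61_669.80
a100_price = 39_622.13
claimed_speedup = 2.7  # NVIDIA's claimed H100-vs-A100 speedup; unverified for my workload

price_ratio = h100_price / a100_price              # how much more the H100 config costs
value_ratio = claimed_speedup / price_ratio        # perf-per-dollar advantage if the claim holds
print(f"H100 costs {price_ratio:.2f}x as much")    # ~1.56x
print(f"perf/$ advantage: {value_ratio:.2f}x")     # ~1.73x, IF 2.7x is real
```

So if the 2.7x number survives contact with my workload, the H100 is about 1.7x better per dollar; if the real speedup is closer to the few percent I saw going A40 to A100, it’s much worse value.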