I’ve looked at both the ones you’ve mentioned, though Zhao et al. 2022 definitely isn’t paywalled. Something to be aware of is that MDPI has a notoriously lax peer review process and has been on and off predatory publisher lists, and as a result some university systems will not fund publications with them or consider MDPI papers in their promotion and tenure processes. If you do basic due diligence and check Table 1’s citations, one is a data-free Hynix press release and the other’s a Synopsys blog with an HBM2 PHY-specific pJ/bit bar graph that doesn’t include DDR5 or HBM3.
So probably Table 1 is made up. The DDR4 and 5 bandwidths, pin counts, and capacities have obvious errors. The HBM columns I’d bet started by plagiarizing Wikipedia (note the 307 GB/s error) and I think it’s worth asking how HBM can have a pin count when it doesn’t have pins. I wouldn’t be surprised if an LLM managed to hallucinate more accurate pJ/bit totals.
Besides the measured DDR5 pJ/bit, the HBM3 number I mentioned is an average of Micron and Hynix’s datasheets. I’ve had a harder time with HBM2, but Micron’s press release claims a 50+% reduction from HBM2E to HBM3, which implies 8.6+ pJ/bit for HBM2E going by their HBM3 datasheet. That sounds high, so I suspect the press release has a basic math fail (maybe it’s more like 6.5 pJ/bit). I haven’t had luck trying to work back from HBM2E to HBM2.
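To make that arithmetic explicit, here’s a rough Python sketch. The 4.3 pJ/bit HBM3 figure in it isn’t quoted from a datasheet; it’s just what 8.6 pJ/bit at exactly 50% reduction works back to, and the function names are made up for illustration.

```python
def implied_previous(current_pj_per_bit, reduction):
    """pJ/bit of the older part, given the newer part's pJ/bit
    and the claimed fractional reduction (0.50 means 50%)."""
    return current_pj_per_bit / (1.0 - reduction)


def implied_reduction(previous_pj_per_bit, current_pj_per_bit):
    """Fractional reduction going from the older part to the newer one."""
    return 1.0 - current_pj_per_bit / previous_pj_per_bit


# Assumption: HBM3 at 4.3 pJ/bit, back-derived from 8.6 pJ/bit at 50%.
hbm3 = 4.3

# What the press release's "50+%" implies for HBM2E (a lower bound).
hbm2e_at_50 = implied_previous(hbm3, 0.50)

# What the reduction would be if HBM2E were really about 6.5 pJ/bit.
reduction_if_6_5 = implied_reduction(6.5, hbm3)

print(f"HBM2E implied by a 50% reduction: {hbm2e_at_50:.1f} pJ/bit")
print(f"Reduction if HBM2E is 6.5 pJ/bit: {reduction_if_6_5:.0%}")
```

So a 6.5 pJ/bit HBM2E would make the real reduction roughly a third rather than a half, which is the sort of gap a sloppy press release could plausibly produce.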
Assuming Zhao et al. didn’t also make up their results, what they did is look at putting hot Redis data in HBM. Their methods don’t say how the energy metrics in Figures 6, 7, and 9 were calculated. DRAMsim3 seems to have been involved, but it’s not indicated which DDR and HBM versions were compared and, even making the dubious assumption that Zhao et al. set up the sims with accurate values, #53 doesn’t exactly give me a lot of confidence.
Mmm, Drávai and Reguly 2024 have Granite Rapids roughly halving Sapphire Rapids power consumption. I’m not so sure their results are saying HBM is good so much as saying Sapphire Rapids is a mess. Which we knew already, really, just not as specifically on this particular point.
Phoronix recently benched HBv4 (9V33X v-cache) and HBv5 (9V64H HBM3) but unfortunately doesn’t have power in the results. I’ve failed to find anything specific for Nvidia as well. Just handwavy stuff about gaining energy efficiency. It’s tempting to interpret that as an indication the power reduction isn’t particularly remarkable since, if it was, probably it’d get stated more specifically.
Another hint might be Intel not continuing Xeon Max. With Intel’s situation I’m hesitant to read much into it but, if it was consistently a large power reduction, I’d tend to expect at least a few follow-up parts to have happened in the past couple of years.