AMD offers several processors in their EPYC Rome line, ranging from 8 cores to 64. If I understand correctly, the former can have as few as one chiplet (or CCD, containing two CCXs of 4 cores each) plus the I/O die, while the latter has 8 chiplets plus the I/O die. Supposedly, each chiplet is connected directly to the I/O die via Infinity Fabric, and the 8 memory controllers (which sit on the I/O die) then connect to system RAM.
Let’s call the RAM bandwidth for each controller “X”, so the socket has a theoretical total bandwidth of 8*X. Let’s call the bandwidth of the Infinity Fabric link from one chiplet to the I/O die “Y”. A fraction “p” of this bandwidth (on average) is consumed by chiplet-to-chiplet communication and other traffic that doesn’t involve RAM access. So, the bandwidth available for RAM-chiplet transactions is roughly (1-p)*Y.
So, for the case of an 8-core EPYC processor (one chiplet), if (1-p)*Y < 8*X, then there is no need to populate all 8 memory channels, right? Experience has shown that RAM “performance” does not scale linearly with the number of channels, so (and this is my guess) even a 64-core EPYC Rome wouldn’t leverage all 8 channels, e.g. with 8x the “performance” of a single-channel configuration.
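For what it’s worth, here’s the back-of-envelope arithmetic I have in mind. The figures below are placeholder assumptions on my part (DDR4-3200 per-channel bandwidth for X, a guessed per-CCD Infinity Fabric read bandwidth for Y, and a guessed p), not measured or official values:

```python
# All three figures are assumptions for illustration, not official specs:
X = 25.6   # GB/s per memory channel (assuming DDR4-3200 on a 64-bit bus)
Y = 51.2   # GB/s per CCD-to-I/O-die Infinity Fabric link (assumed, read direction)
p = 0.10   # assumed fraction of the link consumed by non-RAM traffic

effective = (1 - p) * Y             # bandwidth left for RAM<->chiplet traffic
channels_saturated = effective / X  # channels one chiplet could keep busy

print(f"Effective per-CCD RAM bandwidth: {effective:.2f} GB/s")
print(f"Channels one CCD can saturate:   {channels_saturated:.1f}")
```

Under those guesses, a single chiplet could only keep roughly two channels busy, which is what motivates the questions below.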
So, my questions are: does anyone have rough figures for X and Y, and a conservative estimate for p? Cost aside, would a 2-channel RAM configuration be enough for an 8-core EPYC? Or would populating all 8 memory channels benefit an 8-core EPYC just as much as a 64-core one?