Big Navi ROCm Support

Found my answer, you were right. Now I need my 6900 XT! Thank you!

Apologies, I should have cited a source but am such a news glutton that often the source gets forgotten.

You scored a 6900XT as in you're holding it in your hand, are you sure it's not merely a trick of the light? If you get ROCm working and have some time to spare I'd appreciate a few quick benchmarks if you fancy it, it'd let me know how important (if at all) it is to try and get one myself: Benchmark Request Thread

My ASRock Radeon RX 6900 XT PHANTOM GAMING D has already shipped from Newegg. Once I set it up I will be more than happy to provide more info. I will try to get a reference 6900 XT and get rid of the ASRock one.

One more thing: I have ROCm set up for my Radeon VII. If you want any benchmarks for that card, it should be easy enough to run them before I change to the 6900 XT.

Thanks for the offer but there's no need for Radeon VII benchmarks, my niche is already saturated with Radeon VIIs as it's by far the most performant affordable card out there. Radeon VIIs are getting hard to come across for a decent price, so we're on the lookout for a good replacement that will eventually have plentiful supply. The RTX 3000 series has already been shown to be unsuitable, so I have high hopes for Big Navi.

Got it. Will keep you posted once I get my card.

Happy to report that it arrived! Will do some testing this weekend. I need to disassemble my loop before I can do anything. Stay tuned for more :smiley:

Christmas has come early. If that's a normal tree that box is massive.

That is a normal tree lol.

Probably most of the volume of my entire desktop build.

ROCm 4.0 is out. Unfortunately the readme makes no mention of Big Navi, with the headline instead being that MI100 is supported in all of its unattainable glory. Still, worth a try. I definitely wouldn't get rid of the card even if it doesn't work just yet; that Navi 1 works unofficially is a good sign that they have been paving the way for Big Navi support. I wouldn't be surprised if Big Navi was their next official new feature, hopefully ready before AMD supposedly sort out the supply issues. Maybe that's being too optimistic.

Well, Big Navi has 40 fewer CUs than the MI100, plus AMD have supposedly diverged the architecture a bit. I'd expect the former to perform 50-60% as well as the latter at best. Hope AMD really brings ROCm 4.0 to Big Navi, pretty interested in it myself.

CDNA and RDNA2 CUs are quite different: CDNA is an evolution of Vega20, while RDNA/2 is streamlined for gaming/rendering. They shed much of the compute potential when creating RDNA in order to better compete with Nvidia in the consumer space, which unfortunately brought over many of the drawbacks consumer Nvidia cards usually have, but luckily not to the same degree where it counts.

Which architecture is better really depends on workload. CDNA is the doomslayer when it comes to workloads that scale by FP64, memory bandwidth and/or memory capacity; I don't know how much catching up, if any, AMD has to do for ML. RDNA2 is built for rendering/gaming, but I am surprised how well a 5700XT does with gpuowl (not the same class as a Radeon VII but not a million miles off). The good memory bandwidth and AMD-standard 1:16 FP64 ratio seem to allow it to punch above its weight in this particular workload. Even if a 6900XT ends up “just” performing close to twice as well as a 5700XT it's worth a closer look, if it is as power efficient as it appears. The Infinity Cache could conceivably give an unnaturally large boost to this workload (low memory capacity but high bandwidth requirements) and tip it into the same class as a Radeon VII, but my optimism is leaking.

Hi Everyone,

Beginning to test the ROCm support for the RX 6900 XT now. If it sucks I will sell the GPU at the price I paid, if anyone is interested; if not, I will eBay it. Probably will get a 3090 after that.

Unfortunately, I have nothing good to report so far. :confused: The GPU isn't detected as an RX 6900 XT (see below), and rocm-smi and rocm-bandwidth-test both give "command not found". I tried to run a tensorflow benchmark but I get an error:

/src/external/hip-on-vdi/rocclr/hip_code_object.cpp:120: guarantee(false && "hipErrorNoBinaryForGpu: Coudn't find binary for current devices!")
Fatal Python error: Aborted

Current thread 0x00007f4878539740 (most recent call first):
  File "/home/rys/vrocm20/lib/python3.8/site-packages/tensorflow/python/client/session.py", line 711 in __init__
  File "/home/rys/vrocm20/lib/python3.8/site-packages/tensorflow/python/client/session.py", line 1596 in __init__
  File "/home/rys/benchmarks/scripts/tf_cnn_benchmarks/benchmark_cnn.py", line 3538 in setup
  File "tf_cnn_benchmarks.py", line 61 in main
  File "/home/rys/vrocm20/lib/python3.8/site-packages/absl/app.py", line 251 in _run_main
  File "/home/rys/vrocm20/lib/python3.8/site-packages/absl/app.py", line 303 in run
  File "tf_cnn_benchmarks.py", line 73 in <module>
Aborted (core dumped)

Agent 2                  
*******                  
  Name:                    gfx1030                            
  Uuid:                    GPU-XX                             
  Marketing Name:          Device 73bf                        
  Vendor Name:             AMD                                
  Feature:                 KERNEL_DISPATCH                    
  Profile:                 BASE_PROFILE                       
  Float Round Mode:        NEAR                               
  Max Queue Number:        128(0x80)                          
  Queue Min Size:          4096(0x1000)                       
  Queue Max Size:          131072(0x20000)                    
  Queue Type:              MULTI                              
  Node:                    1                                  
  Device Type:             GPU                                
  Cache Info:              
    L1:                      16(0x10) KB                        
  Chip ID:                 29631(0x73bf)                      
  Cacheline Size:          64(0x40)                           
  Max Clock Freq. (MHz):   2660                               
  BDFID:                   768                                
  Internal Node ID:        1                                  
  Compute Unit:            80                                 
  SIMDs per CU:            4                                  
  Shader Engines:          8                                  
  Shader Arrs. per Eng.:   2                                  
  WatchPts on Addr. Ranges:4                                  
  Features:                KERNEL_DISPATCH 
  Fast F16 Operation:      FALSE                              
  Wavefront Size:          32(0x20)                           
  Workgroup Max Size:      1024(0x400)                        
  Workgroup Max Size per Dimension:
    x                        1024(0x400)                        
    y                        1024(0x400)                        
    z                        1024(0x400)                        
  Max Waves Per CU:        64(0x40)                           
  Max Work-item Per CU:    2048(0x800)                        
  Grid Max Size:           4294967295(0xffffffff)             
  Grid Max Size per Dimension:
    x                        4294967295(0xffffffff)             
    y                        4294967295(0xffffffff)             
    z                        4294967295(0xffffffff)             
  Max fbarriers/Workgrp:   32                                 
  Pool Info:               
    Pool 1                   
      Segment:                 GLOBAL; FLAGS: COARSE GRAINED      
      Size:                    16760832(0xffc000) KB              
      Allocatable:             TRUE                               
      Alloc Granule:           4KB                                
      Alloc Alignment:         4KB                                
      Accessible by all:       FALSE                              
    Pool 2                   
      Segment:                 GROUP                              
      Size:                    64(0x40) KB                        
      Allocatable:             FALSE                              
      Alloc Granule:           0KB                                
      Alloc Alignment:         0KB                                
      Accessible by all:       FALSE                              
  ISA Info:                
    ISA 1                    
      Name:                    amdgcn-amd-amdhsa--gfx1030         
      Machine Models:          HSA_MACHINE_MODEL_LARGE            
      Profiles:                HSA_PROFILE_BASE                   
      Default Rounding Mode:   NEAR                               
      Default Rounding Mode:   NEAR                               
      Fast f16:                TRUE                               
      Workgroup Max Size:      1024(0x400)                        
      Workgroup Max Size per Dimension:
        x                        1024(0x400)                        
        y                        1024(0x400)                        
        z                        1024(0x400)                        
      Grid Max Size:           4294967295(0xffffffff)             
      Grid Max Size per Dimension:
        x                        4294967295(0xffffffff)             
        y                        4294967295(0xffffffff)             
        z                        4294967295(0xffffffff)             
      FBarrier Max Size:       32                      
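
For anyone hitting the same hipErrorNoBinaryForGpu error: as far as I understand it, the message means the library being loaded contains no compiled code for the ISA that the GPU reports (gfx1030 in the dump above). Here is a minimal sketch in Python of pulling the gfx name out of rocminfo-style text and comparing it with a list of build targets. The function names and the target list are made up for illustration; the list is an assumption, not something taken from any ROCm or tensorflow release notes.

```python
import re

# Hypothetical set of ISA targets a framework build was compiled for.
# Illustrative assumption only; check your own build for the real list.
ASSUMED_BUILD_TARGETS = {"gfx900", "gfx906", "gfx908"}


def gpu_isa_names(rocminfo_text):
    """Return the gfx* agent names found in rocminfo-style text, without duplicates."""
    names = []
    for line in rocminfo_text.splitlines():
        # Matches the agent-level "Name: gfxNNNN" line only; "Marketing Name:"
        # and the "amdgcn-amd-amdhsa--gfxNNNN" ISA line do not match.
        match = re.match(r"\s*Name:\s+(gfx\w+)\s*$", line)
        if match and match.group(1) not in names:
            names.append(match.group(1))
    return names


def missing_targets(rocminfo_text, build_targets=ASSUMED_BUILD_TARGETS):
    """Return detected gfx names that are absent from the given build targets."""
    return [name for name in gpu_isa_names(rocminfo_text) if name not in build_targets]


# A few lines copied from the rocminfo dump above.
SAMPLE = """\
Agent 2
  Name:                    gfx1030
  Marketing Name:          Device 73bf
  ISA Info:
    ISA 1
      Name:                    amdgcn-amd-amdhsa--gfx1030
"""

if __name__ == "__main__":
    print(missing_targets(SAMPLE))
```

To run it against a real machine, the text could come from the rocminfo command instead of SAMPLE, for example via subprocess.run(["rocminfo"], capture_output=True, text=True).stdout. A non-empty result would be consistent with the error above, but it is only a hint, not a diagnosis.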

It feels like AMD will not support this card. I don't want to own this card just for gaming, it is just not what I would do. Ehh… will see what else is out there…

Found some more information here: https://github.com/RadeonOpenCompute/ROCm/issues/1180

Update 1
It seems that I was wrong, gfx1030 is the RX 6900 XT and the post above confirms future RDNA 2 support. Thus I think I will wait and see how everything develops and continue using my Radeon VII for ML tasks.

Stay tuned for more.

@Praetorian did you have any luck with ROCm 4.0? I managed to install it and it seems to work with OpenCL, but I get the same hipErrorNoBinaryForGpu error when trying to run tensorflow.

Hi @Agustin_Aguilar. There was a new ROCm release, 4.0.1, but it doesn't mention the 6000 series. It seems that there is nothing that can be done right now. (I'm not willing to go through the code to modify it.)

I wouldn't lose hope. It appears that people who work on ROCm officially said that support is coming for RDNA 2 and dodged the question about RDNA 1. That means it is coming, but we don't know how long it will take. It would be helpful to compare the release date of the Radeon VII with the date that ROCm support came out for it. That can give us some idea of how long it will take.

My information indicates that Radeon Instinct MI 100 is ~2-3 months away. And since they have the support for it ready they might take the time to start working on RDNA 2 support.

Last thing to consider is that AMD appears to provide support for gaming GPUs when they are released together with a compute GPU. That is why we don't see RDNA 1 support. There is simply no compute card that requires support, and AMD won't waste resources on support for a gaming GPU.

The latest news on the topic:
