Help getting a Radeon V340L working

I’m trying to get a $50 eBay V340L (gfx900) working with Ollama or llama.cpp, and I did… technically? I got it working with Ollama and ROCm, but it was less than 2 tokens per second, so I figured I’d configured it wrong and redid the whole thing from the ground up with llama.cpp and Vulkan, and again it was 2 tokens per second (with a Llama 3.2 3B Q8). I saw 100% utilization on one of the GPUs in rocm-smi, as expected, and an appropriate amount of VRAM in use for that small model. But the power draw was super low and it was not heating up at all (40 C at most, and I’ve got some likely inadequate cooling on it right now for testing).
I’m at a loss of what else to do.
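For reference, this is roughly what I’ve been running to check on it (model path and filename are from my setup, yours will differ):

```shell
# Watch clocks, power, temp, and VRAM% refresh while a prompt is generating;
# a plain rocm-smi call prints the summary table with SCLK/MCLK/Power/VRAM%
watch -n 1 rocm-smi

# llama.cpp Vulkan build, trying to offload all layers to the GPU
# (model path is from my setup)
./llama-cli -m models/Llama-3.2-3B-Instruct-Q8_0.gguf -ngl 99 -p "Hello"
```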

No idea about AI stuff, but how does the card do in regular graphics stuff?

Might just be the wrong kind of silicon for AI workloads?

So it might end up trying to brute force stuff maybe?

You can try the Vulkan backend, which has probably received more development effort, thanks to interoperability.

I am ignorant as to ROCm, or the specifics of your card and its architecture.

It’s designed for virtualization. There might be some kind of BIOS hack you could do to have it show up in Windows differently, ideally as a Vega 56, but that would limit the card to one GPU even if you did, and it’s got no outputs. I tried getting it running on Linux (in a VM under Proxmox), but every time I install the drivers and reboot, it crashes Xorg, so I’m probably not doing something right on passthrough just yet.

Sure, okay, but it is old, and was not designed for AI.

It looks like a good card for lots of other reasons, and it may not be defective/damaged/worn out; this could simply be the wrong task for the part…

That card is effectively two Vega 56s in one, right? A quick search turned up this on Vega 56 performance:

So it can get a good number of tokens per second in some situations and 1-2 tokens in others, like with larger models. Your 3B model seems like it would fit the workload fine, though, and get a good number, and then you have two GPUs besides.
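As a rough sanity check (the figures here are assumptions: Vega 56’s HBM2 is around 410 GB/s, and a 3B Q8 model is roughly 3.2 GB of weights), single-batch token generation is mostly memory-bandwidth-bound, so the ceiling should be nowhere near 2 t/s:

```python
# Rough ceiling for single-batch token generation: each new token has to
# stream all model weights from VRAM once, so tokens/s <= bandwidth / size.
def max_tokens_per_second(bandwidth_gb_s: float, model_size_gb: float) -> float:
    return bandwidth_gb_s / model_size_gb

# Assumed figures: ~410 GB/s HBM2 on one Vega die, ~3.2 GB for a 3B Q8 model
ceiling = max_tokens_per_second(410, 3.2)
print(f"theoretical ceiling: ~{ceiling:.0f} tokens/s")  # prints ~128 tokens/s
```

Real numbers land well under the theoretical ceiling, but 2 t/s is about 60x below it, so something is badly wrong, not just “old card”.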

Okay, so I got it working on Windows with the R.ID software (the cards show up as Vega 64, and there are two of them). But in Superposition it gets around 2 fps in DirectX (or 15 fps in OpenGL) at 1080p Medium :frowning: The card is being hindered massively by something; I don’t think it’s thermal throttling.
Every once in a while I see the FPS jump up by a ton for a single frame.
Could the BIOS on the card be limiting how much performance any one process can get from the card, and how might I disable that? I tried with SR-IOV enabled and disabled in the motherboard BIOS; no change.

I did manage to get around 5 tokens per second (double the SPEED, lol) by installing the Linux Vulkan drivers and not rebooting, thus avoiding the crashing issues with X, so X kept using my old card. I tried to configure X to keep using the iGPU on the 5600G I’m running, but got nowhere with it and didn’t want to bother.
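Next thing I’m planning to try on the Linux side, since the card seems stuck in a low-power state: forcing the performance level up (a sketch; the device index and cardN path are guesses for my system, check /sys/class/drm/ for which card is which):

```shell
# Force the DPM performance level high on device 0 (pick your device index)
sudo rocm-smi -d 0 --setperflevel high

# Equivalent sysfs knob (cardN varies per system)
echo high | sudo tee /sys/class/drm/card0/device/power_dpm_force_performance_level
```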


Two things that come to mind:

  1. Is the compute load on your twin-dGPU card, and not on the iGPU? That’s a mistake I’ve made myself before.

  2. I think power problems, from hardware, connection, or supply, could cause some cards to limp at minimum performance. Wonder whether that applies here.

sounds like the card is not great…
might have lost out on that bargain :frowning:
but for sure, give the bios a go

Performance seems really low in Superposition, and the token rate is low on the AI side, but with no temperature warning it seems like a repaste isn’t going to help.

Did you check HWiNFO64 for power draw and GPU utilization? Maybe you’ll find something interesting, like only 50 W of draw or no utilization on the second GPU.


The card basically refuses to use more than 25 W per GPU, and it’s clocked at 23 MHz, wtf. (The limited power draw is something I saw before when I was on Linux running ROCm, so that’s a big clue.)

What does it say under “GPU thermal limits”? This is starting to look like power problems, for sure.

That image is from while you were running a GPU load? Because it basically says you’re not using your GPU more than a couple percent. Though it could be a reporting problem, since quite a few fields show 0 for some reason, probably due to the specialized nature of the card.

I’m sitting here like someone say it, SOMEONE SAY IT

Do you have Above 4G Decoding and Resizable BAR enabled?

My systems are ancient and all have a switchable BAR, so they just… do it when required. But on newer stuff you can hard lock it on and get the full bus all the time. This is really important for stuff like Looking Glass and VM-swapping GPUs, so I bet it’s just as important here.

In 2020 I tried to build a machine that would let me use GPUs sort of like USB sticks for VMs. It didn’t work out; the 3900X was a huge kack and a half with no RMA option for my use case, so I threw the whole thing out. But the only way the cards even showed up correctly and ran anything at all was with Above 4G Decoding and ReBAR.
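If it helps, you can check from Linux whether the big BAR actually took (the PCI address below is a placeholder; grab yours from plain lspci first):

```shell
# Find the card's PCI address; a headless card may show as a Display controller
lspci | grep -iE "vga|display"

# Inspect its BAR regions; with Above 4G + ReBAR working you should see a
# large prefetchable region covering the VRAM, not just a 256M window
sudo lspci -vvs 03:00.0 | grep -i region
```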

I have a Grid K2 that I actually want to use as a render card in a build someday. The display unit will be some nonsense like a GT730 or an X600, but the actual render engine will be the K2, and maybe even an MI card for ROCm etc. Just going big on the cheap. (If I am lucky, I will be able to watercool all of the dies too.)

Also, just my recommendation: test this all out on a bench with a PSU dedicated to your testing card. Leave it on with a jumper, and see what you can get the card to do with whatever air solution you can build. I, at least, find things easier to test without my compute card pulling on the mains for the mobo. But that’s a personal peeve, mostly. You can touch everything and know what it’s physically doing compared to what it’s lying about on the screen LOL.

What quantization are you using? If you try to run a quantization that is not supported by your card, the model will reside in VRAM but the calculations will be done on the CPU. Can you please check the load of your CPU while you run the model?
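A quick way to check, assuming a llama.cpp build (flags are the standard ones; your model path will differ): look at the layer-offload line it prints at load time, and watch CPU use during generation.

```shell
# llama.cpp reports how many layers went to the GPU at load; with full
# offload you want all of them, e.g. "offloaded 29/29 layers to GPU"
./llama-cli -m model.gguf -ngl 99 -p "test" 2>&1 | grep -i offloaded

# Meanwhile, watch CPU load; several cores pinned near 100% during
# generation would mean the work is falling back to the CPU
top
```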




Also, that sounds convenient for getting these weird cards booting, which does require some BIOS tinkering. Could the board have some bad power delivery or something that’s preventing it from leaving the low-power state? Or do you think it’s significantly more likely that the driver, not having all the information available, is preventing it from going to a higher power state? I’m going to try to flash it later if I can’t get it working as is.


Turn IOV on too.

I think it’s just about the PCIe chips talking to each other in the right modes. IOV in GPUs will kill this problem in the future.

And no code 43 or 30!

Yeah, it works and all, but it’s stuck at a core clock of 23 MHz; the memory clock looks fine, though. I haven’t found anything that has helped yet, not Afterburner or ClockBlocker :frowning:

Sounds like a hardware issue; power sensing is probably broken (the card will put itself in a low clock state to protect itself). Usually it is repairable. If you lack the expertise to repair it, I would probably just chuck it and buy another one if you feel lucky.

I don’t think this card will support current ROCm… see the table here. Your card is GCN 5.0, but it seems the support cutoff is GCN 5.1 (in the Pro cards tab)?