GPU does not wake up?

Hi all! I am making a virtual assistant for our company on llama3.1:8b. This is a proof of concept with my own equipment until I can get funding to build a more capable model/system. I have ollama running on my system, and a docker container managing webui for my test users. Here are the specs:

Gigabyte B450M DS3H Motherboard
Ryzen 3 2200G
GTX 1660 Ti, 6 GiB of VRAM
256 GiB Samsung Evo NVMe drive
32 GiB of 3200 MHz RAM I found.
Running Ubuntu 24.04.1 LTS

I am working up to get a RAG going to give this thing some persistent memory of company policies and procedures, but the main issue I am having is:

After it idles for a few hours (no users running queries), the GPU appears to go to sleep. A query will come in, and ollama/docker falls back to just the CPU. It does not do the usual load-up of the model into VRAM before answering the question.

Is there a power setting somewhere? Or a docker setting that I am missing? Why does it fall back to the CPU after a few hours of inactivity?
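For anyone trying to confirm the same symptom, here is a rough Python sketch that asks Ollama where a loaded model actually lives. It assumes Ollama is listening on its default port (11434) and relies on the `size` and `size_vram` fields returned by the `/api/ps` endpoint; the helper names `gpu_fraction`, `summarize`, `fetch_ps`, and `main` are made up for illustration. Treat it as a diagnostic aid, not a fix.

```python
import json
import urllib.request

# Assumption: Ollama runs locally on its default port.
OLLAMA_URL = "http://localhost:11434"


def gpu_fraction(model: dict) -> float:
    """Fraction of a loaded model's memory that sits in VRAM (0.0 means CPU only)."""
    size = model.get("size", 0)
    return model.get("size_vram", 0) / size if size else 0.0


def summarize(ps_response: dict) -> list[str]:
    """Turn an /api/ps response into one readable line per loaded model."""
    lines = []
    for model in ps_response.get("models", []):
        frac = gpu_fraction(model)
        where = "CPU only" if frac == 0 else f"{frac:.0%} in VRAM"
        lines.append(f"{model.get('name', '?')}: {where}")
    return lines


def fetch_ps(base_url: str = OLLAMA_URL) -> dict:
    """Ask the running Ollama server which models are currently loaded."""
    with urllib.request.urlopen(f"{base_url}/api/ps", timeout=5) as resp:
        return json.load(resp)


def main() -> None:
    """Print where each loaded model lives, or a note if nothing is loaded."""
    for line in summarize(fetch_ps()) or ["no model loaded"]:
        print(line)
```

Calling `main()` right after a slow answer should show whether the model was loaded on the CPU. Running `nvidia-smi` on the host at the same moment tells you whether the driver can still see the card at all.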


Disable sleep and C states in UEFI

With home assistant / mission-critical applications, you'll want the machine always idling, never in any sleep state, as the time to wake up introduces latency and weird issues (as you are experiencing).
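If you want to see which CPU idle states Linux is currently allowed to use before or after changing the firmware setting, something like the Python sketch below should work. It assumes the standard cpuidle sysfs layout under `/sys/devices/system/cpu/cpu0/cpuidle`; the function name `read_cstates` is made up for illustration. It only reports what the kernel sees and does not change anything in UEFI.

```python
from pathlib import Path

# Assumption: standard Linux cpuidle sysfs location for the first CPU core.
CPUIDLE_DIR = Path("/sys/devices/system/cpu/cpu0/cpuidle")


def read_cstates(cpuidle_dir: Path = CPUIDLE_DIR) -> list[dict]:
    """List the idle states the kernel exposes, with exit latency and disable flag."""
    state_dirs = sorted(
        cpuidle_dir.glob("state[0-9]*"),
        key=lambda p: int(p.name[len("state"):]),  # numeric order: state2 before state10
    )
    states = []
    for state_dir in state_dirs:
        states.append({
            "name": (state_dir / "name").read_text().strip(),
            "latency_us": int((state_dir / "latency").read_text()),
            "disabled": (state_dir / "disable").read_text().strip() != "0",
        })
    return states


def main() -> None:
    """Print one line per idle state found for cpu0."""
    for state in read_cstates():
        flag = "disabled" if state["disabled"] else "enabled"
        print(f"{state['name']}: exit latency {state['latency_us']} us, {flag}")
```

Calling `main()` on the host (not inside the container) prints the list. An empty list usually means the kernel is not managing idle states for that core.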


Did you try walking into the computer room and saying "WAKE UP"? lol

As said above, if you do not need the power savings you can just disable them and it should be fine. Let us know.


Thank you gents. lol I tried yelling at it, that didn’t work. I will take a look at the BIOS next time I give it a restart.


There was another menu option tucked in the advanced frequency tab. I let her sit overnight, and it used the GPU to answer the query! That takes care of that for now, thank you for the help!


Glad you got it all figured out

I got so used to working with server equipment, I forgot I built this with a desktop motherboard. You guys rock. Have good days.

Got it running, and stable. Thanks for the help everybody!
