Training an LLM with a single Nvidia A100

Hello everyone,

First of all, greetings to all! I’m new here, just registered to seek some advice and insights. My knowledge of software and hardware is quite basic, but I’ve always been fascinated by it. Now, with the advent of Large Language Models (LLMs), my interest has been piqued more than ever.

That’s why I’m planning to build a homelab to experiment with training my own models and running them locally. I’ve managed to get a good deal on an Nvidia A100 40GB, and I’m curious to know to what extent it’s possible to train LLMs with a single A100 and how much time that would take.

Thank you! Looking forward to your thoughts and suggestions.

Training from scratch? Forget about it… Training Llama 2 7B (the smallest variant) took 184,320 GPU-hours on A100-80GBs, according to the Llama 2 paper.

With an A100 you can work on fine-tuning smaller models, as long as they fit in VRAM.
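For example, here is a minimal parameter-efficient fine-tuning sketch using Hugging Face `transformers` with `peft` (the model name and LoRA hyperparameters are placeholders, not recommendations):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

model_name = "meta-llama/Llama-2-7b-hf"  # placeholder; pick anything that fits in 40 GB

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.float16,  # fp16 weights: ~2 bytes per parameter
    device_map="auto",
)

# LoRA: freeze the base model and train small low-rank adapters instead
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "v_proj"],  # attention projections, typical for LLaMA-style models
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # usually well under 1% of the base model
```

From there you would plug `model` into a normal training loop; the point is that only the adapters carry gradients, so the optimizer state stays small.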


Hey, thanks a lot for your answer!
That is like 21 years! How is that number even possible?
Also, what would I need to train larger models?

They did not use a single A100; that figure is the total GPU-hours across the whole cluster, but you can divide it down to one :smiley:
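Back-of-the-envelope (the cluster size below is made up, just to show the division):

```python
gpu_hours = 184_320             # A100-80GB GPU-hours reported for Llama 2 7B

# On a single GPU, GPU-hours and wall-clock hours are the same thing:
print(gpu_hours / (24 * 365))   # -> ~21 years

# Spread over N GPUs, wall-clock time divides down (ignoring scaling overhead):
n_gpus = 1_000                  # hypothetical cluster size
print(gpu_hours / n_gpus / 24)  # -> ~7.7 days
```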


Yeah, I figured it out. So are you telling me that I would need like 40 A100 80GBs to train a model like that in 6 months?! Oh, I expected much different numbers.

I’m seeing that it’s completely out of reach for an individual to train a decent LLM from scratch. I’m quite sad.

Yes and no, it depends on what you want the machine to learn. A full-fledged AI-to-human conversation model? No, you need a lot of compute.

If you want an object recognition model you can call your own, that could be doable, or a picture generation model, or you could modify an existing one and fork it. Yeah, more plausible :smiley:

Alright, I’m forming a better picture now, thanks a lot for the help.
So, in order to do some fine-tuning on bigger LLMs like Dolphin, do you have any estimate of what it would take?

No, sorry.

But from a quick Google search it seems you could fine-tune LLaMA in the mid model sizes with your VRAM, so my educated guess is that you can dabble in that range :smiley:

I hosted LLaMA on Unraid with a mid-size model and it worked OK-ish, but that was about a year ago.
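For a rough sense of why mid-size models are the ceiling on 40 GB, here is my own back-of-the-envelope VRAM math (rule-of-thumb byte counts, not measurements):

```python
def full_finetune_gb(params_billion):
    # fp16 weights (2 B/param) + fp16 grads (2) + Adam moments (8) + fp32 master copy (4)
    return params_billion * 16

def lora_gb(params_billion):
    # frozen fp16 base weights only; the adapters add comparatively little
    return params_billion * 2

for p in (7, 13, 34, 70):
    print(f"{p}B params: full fine-tune ~{full_finetune_gb(p):.0f} GB, "
          f"LoRA ~{lora_gb(p):.0f} GB + activations")
```

So on a 40 GB card, full fine-tuning is already out of reach at 7B, while LoRA in fp16 is comfortable at 7B and tight around 13B; quantized variants (QLoRA) stretch that a bit further.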

You could start with a notebook, as it should explain enough to help you get started.