Training an LLM with a single Nvidia A100

Hello everyone,

First of all, greetings to all! I’m new here, just registered to seek some advice and insights. My knowledge of software and hardware is quite basic, but I’ve always been fascinated by it. Now, with the advent of Large Language Models (LLMs), my interest has been piqued more than ever.

That’s why I’m planning to build a homelab to experiment with training my own models and running them locally. I’ve managed to get a good deal on an Nvidia A100 40GB, and I’m curious to know to what extent it’s possible to train LLMs with a single A100 and how much time that would take.

Thank you! Looking forward to your thoughts and suggestions.

Training from scratch? Forget about it… Training Llama 2 7B (the smallest variant) took 184,320 GPU-hours on A100-80GBs, according to the Llama 2 paper.

With an A100 you can work on fine-tuning smaller models, as long as they fit in VRAM.
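For example, here is a minimal parameter-efficient fine-tuning sketch using Hugging Face `transformers` with `peft` (the model name and LoRA hyperparameters are placeholders, not recommendations):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

model_name = "meta-llama/Llama-2-7b-hf"  # placeholder; pick anything that fits in 40 GB

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.float16,  # fp16 weights: ~2 bytes per parameter
    device_map="auto",
)

# LoRA: freeze the base model and train small low-rank adapters instead
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "v_proj"],  # attention projections, typical for LLaMA-style models
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # usually well under 1% of the base model
```

From there you would plug `model` into a normal training loop; the point is that only the adapters carry gradients, so the optimizer state stays small.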


Hey, thanks a lot for your answer!
That is like 21 years! How is that number even possible?
Also, what would I need to train larger models?

They did not use a single A100; that figure is the total GPU-hours across the whole cluster, but you can divide it down to one :smiley:
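Back-of-the-envelope (the cluster size below is made up, just to show the division):

```python
gpu_hours = 184_320             # A100-80GB GPU-hours reported for Llama 2 7B

# On a single GPU, GPU-hours and wall-clock hours are the same thing:
print(gpu_hours / (24 * 365))   # -> ~21 years

# Spread over N GPUs, wall-clock time divides down (ignoring scaling overhead):
n_gpus = 1_000                  # hypothetical cluster size
print(gpu_hours / n_gpus / 24)  # -> ~7.7 days
```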


Yeah, I figured it out. So are you telling me that I would need like 40 A100 80GBs to train a model like that in 6 months?! Oh, I expected much different numbers.

I’m seeing that it’s completely out of reach for an individual to train a decent LLM from scratch. I’m quite sad.

Yes and no, it depends on what you want the machine to learn. A full-fledged AI-to-human conversation model? No, you need a lot of compute.

If you want an object recognition model you can call your own, that could be doable, or a picture generation model, or you could modify an existing one and fork it. Yeah, more plausible :smiley:

Alright, I’m forming a better picture now, thanks a lot for the help.
So, in order to do some fine-tuning on bigger LLMs like Dolphin, do you have any estimate of what it would take?

No, sorry.

But from a quick Google search it seems you could fine-tune LLaMA in the mid model sizes with your VRAM, so my educated guess is that you can dabble in that range :smiley:

I hosted LLaMA on Unraid with a mid-size model and it worked OK-ish, but that was about a year ago.
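For a rough sense of why mid-size models are the ceiling on 40 GB, here is my own back-of-the-envelope VRAM math (rule-of-thumb byte counts, not measurements):

```python
def full_finetune_gb(params_billion):
    # fp16 weights (2 B/param) + fp16 grads (2) + Adam moments (8) + fp32 master copy (4)
    return params_billion * 16

def lora_gb(params_billion):
    # frozen fp16 base weights only; the adapters add comparatively little
    return params_billion * 2

for p in (7, 13, 34, 70):
    print(f"{p}B params: full fine-tune ~{full_finetune_gb(p):.0f} GB, "
          f"LoRA ~{lora_gb(p):.0f} GB + activations")
```

So on a 40 GB card, full fine-tuning is already out of reach at 7B, while LoRA in fp16 is comfortable at 7B and tight around 13B; quantized variants (QLoRA) stretch that a bit further.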

You could start with a notebook, as it should explain enough to help you get started.