Turn gaming GPUs into a peer-to-peer ML network

OpenAI’s ChatGPT generates impressive code, letters and summaries.

The secret is a large language model (LLM) with billions of parameters, and running it requires hundreds of gigabytes of GPU memory on very expensive hardware.

Well, rather than the five richest kings, OpenAI and Big Tech being the only ones with this ability, we can now address the problem with Petals: a system that lets you contribute your gaming GPU to a shared network. Just like a cake, you hold only a slice of the model and run the tensor operations for that slice.

As the diagram shows, the idea is simple: a reduced-precision LLM was able to fit across 22 gaming graphics cards, and higher throughput becomes available as more cards are added.
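
Once a swarm is up, the client side looks almost like using a local Hugging Face model: the embeddings run on your machine while the transformer blocks are served by peers. Here is a minimal generation sketch adapted from the Petals README of that era (the API may have changed since):

from transformers import BloomTokenizerFast
from petals import DistributedBloomForCausalLM

MODEL_NAME = "bigscience/bloom-petals"

tokenizer = BloomTokenizerFast.from_pretrained(MODEL_NAME)
# Embeddings live on your device; the transformer blocks are fetched from peers
model = DistributedBloomForCausalLM.from_pretrained(MODEL_NAME)

inputs = tokenizer("A cat sat on", return_tensors="pt")["input_ids"]
outputs = model.generate(inputs, max_new_tokens=8)
print(tokenizer.decode(outputs[0]))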

After using ChatGPT, the value of setting up and running a network for the community is clear: we have a lot of knowledgeable people here, and a shared project is a good way to get up to speed on machine learning.
There is also the benefit of transfer learning: building on the shoulders of this monster 176-billion-parameter model saves a lot of iteration time.
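
As a sketch of what that transfer learning could look like, the Petals README of that era demonstrated prompt tuning ("ptune"), where the 176B weights stay frozen in the swarm and only a small set of local prompt embeddings is trained. The parameter names below come from that example and may have changed:

import torch
from transformers import BloomTokenizerFast
from petals import DistributedBloomForCausalLM

MODEL_NAME = "bigscience/bloom-petals"
tokenizer = BloomTokenizerFast.from_pretrained(MODEL_NAME)

# tuning_mode="ptune" trains only pre_seq_len local prompt vectors;
# the remote model weights are never modified
model = DistributedBloomForCausalLM.from_pretrained(
    MODEL_NAME, tuning_mode="ptune", pre_seq_len=16)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-2)

batch = tokenizer("An example training sentence", return_tensors="pt")["input_ids"]
loss = model(input_ids=batch, labels=batch).loss  # forward pass runs through the swarm
loss.backward()   # gradients flow back only to the local prompt embeddings
optimizer.step()
optimizer.zero_grad()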

If we do this right, I believe it’s a solid way to maintain access and continue in the spirit of open source, especially since recent news of OpenAI’s goal of $1 billion in revenue by 2024 shows it moving toward corporate interests.

Check out the links and give feedback with a post!

Install for Windows via WSL

System Requirements:

  • 12 GB of system RAM
  • NVIDIA graphics card with 8 GB of VRAM (AMD cards need the ROCm build of PyTorch)
  • 25 Mbit/s or faster internet connection
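
If you’re not sure what your machine has, here is a quick self-check sketch. It is runnable inside WSL once PyTorch is installed (next section); the RAM check uses a Linux-only os.sysconf call:

import os
import torch

# Linux-only: total physical RAM in GB
ram_gb = os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES") / 1024**3
print(f"System RAM: {ram_gb:.1f} GB (want >= 12)")

# VRAM of the first CUDA device, if PyTorch can see one
if torch.cuda.is_available():
    props = torch.cuda.get_device_properties(0)
    print(f"GPU: {props.name}, VRAM: {props.total_memory / 1024**3:.1f} GB (want >= 8)")
else:
    print("No CUDA GPU visible to PyTorch")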

Install WSL from the Windows command line (run as administrator).
Note: if you hit errors, check that virtualization is enabled in your BIOS/UEFI (see the WSL troubleshooting docs).

wsl --install
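
Once the install finishes and you’ve rebooted, you can confirm that Ubuntu is present and running under WSL 2 (GPU access needs WSL 2) with:

$ wsl -l -v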

After installing Ubuntu, run the following commands inside it.
Download and run the Miniconda install script:

$ wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
$ bash Miniconda3-latest-Linux-x86_64.sh

Python Libraries and Petals

$ conda install pytorch cudatoolkit=11.3 -c pytorch
$ pip install git+https://github.com/bigscience-workshop/petals

Run the server with:

$ python -m petals.cli.run_server bigscience/bloom-petals
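
If your card can’t hold the default share of the model, the server also accepts a --num_blocks flag (the same flag mentioned in the error message further down) to serve fewer blocks, for example:

$ python -m petals.cli.run_server bigscience/bloom-petals --num_blocks 4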

Alternatively, use the GPU-enabled Docker image:

sudo docker run --net host --ipc host --gpus all --volume petals-cache:/cache --rm learningathome/petals:main \
    python -m petals.cli.run_server bigscience/bloom-petals

Troubleshooting

Verify the PyTorch install (Python code):

import torch
x = torch.rand(5, 3)
print(x)  # should print a 5x3 tensor of random values

Verify CUDA is working (Python code):

import torch
print(torch.cuda.is_available())  # should print True

Fix for the ‘libcuda.so’ error (WSL keeps the CUDA driver libraries under /usr/lib/wsl/lib):

$ export LD_LIBRARY_PATH=/usr/lib/wsl/lib:$LD_LIBRARY_PATH
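
That export only lasts for the current shell session; to make it permanent, you can append it to ~/.bashrc:

$ echo 'export LD_LIBRARY_PATH=/usr/lib/wsl/lib:$LD_LIBRARY_PATH' >> ~/.bashrc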

–Reserved post–

Indeed, ChatGPT is a powerful tool.

I followed this guide: CUDA on WSL User Guide

Still cannot access the GPU in WSL:

  File "/home/any/miniconda3/lib/python3.10/site-packages/petals/server/server.py", line 222, in _choose_num_blocks
    assert self.device.type == "cuda", (
AssertionError: GPU is not available. If you want to run a CPU-only server, please specify --num_blocks. CPU-only servers in the public swarm are discouraged since they are much slower
>>> print(torch.cuda.is_available())
False

You could try installing library versions that match the selector at https://pytorch.org:

$ conda install pytorch torchvision torchaudio pytorch-cuda=11.6 -c pytorch -c nvidia

Then, in Python, you can confirm with:

import torch
print(torch.cuda.is_available())
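
For a little more detail, these standard torch calls also print the build and the device it found:

import torch
print(torch.__version__)   # installed PyTorch version
print(torch.version.cuda)  # CUDA version PyTorch was compiled against
if torch.cuda.is_available():
    print(torch.cuda.get_device_name(0))  # your NVIDIA card's name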

Also check whether you can run ‘nvidia-smi’ inside WSL, as its output (or error) can help with troubleshooting.