Perhaps Dumb AI Questions, I’ll Risk the Humiliation

I’ve been looking at so many posts now, but did Wendel make a how-to for beginners on things like running Llama locally? If so, does anybody have a link?

I’m asking because so many consultants who promote AI also claim that cloud-based solutions run as “private” are safe. I don’t agree, and I want to see how fast and easy it is to build a local version, and whether it’s of any use at all.

Hope I found the right way to post; sorry if not. It’s my first post, so please don’t bite.

1 Like

This: https://ollama.com/
Plus some compatible GUI app.

I don’t know about running Ollama natively on Windows, but you can definitely install it under WSL.
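If you go the Ollama route, the basic flow is roughly this (the model tag is just an example; check the Ollama library page for current names):

```
# pull a small model and chat with it in the terminal
ollama pull llama3.1:8b
ollama run llama3.1:8b
```

The GUI apps then just point at Ollama’s local API, which listens on port 11434 by default.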

2 Likes

NetworkChuck did a video about this topic. You can check it out.

I think it’s this one:

2 Likes

Thanks, super! Any views on the “private” versions in the cloud?

I’ve had KoboldCpp running fairly easily on Windows 11 using their precompiled binary.
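For reference, you can also start it from a terminal instead of the launcher window; roughly like this (flag names from memory, so treat it as a sketch and check `--help`; the GGUF path is a placeholder):

```
koboldcpp.exe --model .\Meta-Llama-3.1-8B-Instruct-Q4_K_M.gguf --gpulayers 33 --contextsize 8192
```

It then serves its chat web UI on localhost (port 5001 by default, if I remember right).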

2 Likes

Welcome! I get where you’re coming from—there’s definitely a lot of discussion around cloud vs. local solutions. I haven’t seen a specific beginner’s guide from Wendel on setting up Llama locally, but it might be worth checking out the general resources or tutorials section of the forum.

If you don’t find what you need, you could also try searching for community-created guides or threads discussing local installations. It can be really eye-opening to build it yourself!

Don’t hesitate to ask more questions as you dive in; we’re all here to help each other out!

Cheers!

Welcome! You have a great question. I run many of the open-weight local LLMs on my desktop at home. Even with a modest GPU you can get some of the smaller 8B models running at home.

  • How much VRAM do you have? 8GB is kind of entry level, 16GB is okay, 24 GB will get you a taste of the bigger models, dual+ GPU is the dream lol
  • How much RAM do you have? If you can’t fit the whole model into your VRAM, many inference engines allow you to do partial offload. The speed will be slower though.

I agree with @MonstrousMicrobe that KoboldCpp is a great way to get started. If you are a software dev and have experience with Python virtual environments, managing dependency hell, and compiling C code, then I’d go straight to llama.cpp, as many of the other projects use it under the hood.

Once you have KoboldCpp, llama.cpp, Ollama, LM Studio, or whatever you choose running on your computer, head over to Hugging Face and type GGUF into the search bar. A good starter model that fits in under 6GB of VRAM is bartowski/Meta-Llama-3.1-8B-Instruct-GGUF; download the Q4_K_M quant.
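If you go the llama.cpp route, the whole flow is roughly this (the exact filename may differ slightly, so check the repo’s file list; `-ngl` is the partial-offload knob from the bullet points above):

```
# grab the quant from Hugging Face (needs `pip install huggingface_hub`)
huggingface-cli download bartowski/Meta-Llama-3.1-8B-Instruct-GGUF \
  Meta-Llama-3.1-8B-Instruct-Q4_K_M.gguf --local-dir .

# chat in the terminal; -ngl sets how many layers get offloaded to the GPU
llama-cli -m Meta-Llama-3.1-8B-Instruct-Q4_K_M.gguf -ngl 99

# or serve a local OpenAI-compatible API plus a simple web UI
llama-server -m Meta-Llama-3.1-8B-Instruct-Q4_K_M.gguf -ngl 99 --port 8080
```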

Start out small and build on your successes to keep motivated. There are a bunch of models, with new ones coming out every month, and certain ones are better at particular use cases, e.g. “creative” writing, math, certain programming languages, etc. I keep an eye on r/localllama for the latest releases and benchmarks.

After a while you might want to play with different quants (quantization is a way to compact the model weights so they fit into RAM/VRAM), different inference engines and formats (there is more than just GGUF lol), or advanced stuff like distributed inference hah…
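To put rough numbers on the quant thing (back-of-the-envelope only; Q4_K_M works out to somewhere around 4.5–5 bits per weight):

```
# ~8B weights at ~4.8 bits/weight (Q4_K_M-ish) vs 16 bits/weight (FP16)
python3 -c "print(8e9 * 4.8 / 8 / 1e9, 'GB')"   # ~4.8 GB
python3 -c "print(8e9 * 16  / 8 / 1e9, 'GB')"   # ~16 GB
```

Plus a bit extra on top for the context/KV cache, which grows with how long your chats get.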

Enjoy the journey; what you learn along the way might be more useful than anything the bots tell you! :wink:

2 Likes

There are several web browser extensions that give you a locally hosted web interface for Ollama.

It’s way more performant to run it natively, though.
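Either way, those frontends are all just talking to Ollama’s local HTTP API on port 11434, e.g. (assumes you’ve already pulled the model):

```
curl http://localhost:11434/api/generate -d '{
  "model": "llama3.1:8b",
  "prompt": "Why is the sky blue?",
  "stream": false
}'
```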

1 Like

ollama in podman plus Alpaca is what I use on Linux

For models, I would stick mostly to Mistral, but if you have more resources you could go with Mistral Nemo or Mixtral.
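For the Podman part, this is roughly what it looks like (mirrors the docker run command from the Ollama docs; you’d still need to add GPU passthrough flags for your particular hardware):

```
# run Ollama in a container, keep models in a named volume
podman run -d --name ollama \
  -v ollama:/root/.ollama \
  -p 11434:11434 \
  docker.io/ollama/ollama

# pull a model inside the container
podman exec -it ollama ollama pull mistral
```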

1 Like

Also recommend checking out LM Studio if you want something more user-friendly.

1 Like

Found this also, might be of value to some. Put /watch?v=DYhC7nFRL5I after youtube.com in your browser. It runs and manages a number of models under Docker on both Linux and Windows.

My GPU is 24GB, so that will work fine. I did find this from “Dave”; I can’t post the full link, but I’m sure most will know what to put before /watch?v=DYhC7nFRL5I. I’m playing with it, and I think the way to go is a local system, for safety and to support firms’ data.

1 Like

Nice, with 24GB locally you can try out many open LLMs at home without any of your data going through a 3rd party off-site system. Do be careful if you play with ComfyUI as some of the modules are very sketchy hah…

Some other folks are discussing this in another thread on here too:

Yes, there are lots of YouTubers and resources available. Follow whatever methods and apps look interesting and motivate you to keep plugging along!

Cheers!

2 Likes

This is a recent one by Dave: https://youtu.be/DYhC7nFRL5I?si=qwtA9azJglmli00g

1 Like

Dave, I like him; he’s direct and seems to do this for fun, with lots of fun projects. He’s a bit light on documentation at times, but that forces you to learn something, so it’s fine.

1 Like