Ollama with the Dolphin-llama3 model; it's faster and better than llama2-uncensored.
Excellent, thanks!
Choose wisely depending on how much RAM you have:
8B needs 8~16 GB of RAM.
70B needs 64 GB or more.
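A quick sketch of pulling a specific size, assuming the dolphin-llama3 repository follows Ollama's usual model:tag convention for parameter counts:

```
# Pull and run the 8B variant explicitly (fits in roughly 8-16 GB of RAM)
ollama run dolphin-llama3:8b

# Pull and run the 70B variant (needs 64 GB or more)
ollama run dolphin-llama3:70b
```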
Is it all CPU-based or does it offload to the GPU? There are a couple of older workstations I could get my hands on with still-decent CPUs that take up to 128 GB of RAM, but sourcing a decent GPU for them would be expensive.
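For what it's worth, Ollama runs on CPU but will offload layers to a supported GPU when it detects one. On a recent build you can check what a loaded model is actually using (a sketch, assuming your version has the `ollama ps` command):

```
# List loaded models; the PROCESSOR column shows the CPU/GPU split
ollama ps

# Illustrative output (not from a real run):
# NAME                 ID           SIZE     PROCESSOR    UNTIL
# dolphin-llama3:8b    abc123def    5.1 GB   100% CPU     4 minutes from now
```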
How do I know which one was installed if I simply ran the "ollama run dolphin-llama3" command and it fetched the Dolphin model?
Am I even using the right vernacular? I need to skill up on all this, LOL.
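If you don't specify a tag, Ollama pulls the repository's `latest` tag, which for dolphin-llama3 is believed to be the 8B build. You can confirm what you actually have (a sketch using the standard Ollama CLI commands):

```
# Show every model pulled locally, with its tag and size on disk
ollama list

# Show details (parameters, quantization, template) for one model
ollama show dolphin-llama3
```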
uncensored
Most of these models (for example, Alpaca, Vicuna, WizardLM, MPT-7B-Chat, Wizard-Vicuna, GPT4-X-Vicuna) have some sort of embedded alignment. For general purposes, this is a good thing. This is what stops the model from doing bad things, like teaching you how to cook meth and make bombs. But what is the nature of this alignment? And, why is it so?
this is a good thing
If cooking meth is your metric for censorship, you're on the wrong path. (That's not aimed at AOU; he's not the developer instituting such things, I think.) That info is as easy to find as using Google. I probably have instructions somewhere.
(That's not aimed at AOU; he's not the developer instituting such things, I think.)
That's correct. That's the name they gave the model because it can supposedly tell you things that may be considered illegal depending on the country or state you live in.
It seems like you've set this up on a machine you have? How easy is it to do? Is it more fun than useful or more useful than fun?
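For reference, the whole setup is only a couple of commands on Linux (a sketch using Ollama's official install script; macOS and Windows have installers on ollama.com):

```
# Install Ollama via the official script
curl -fsSL https://ollama.com/install.sh | sh

# Pull the model and start chatting in one step
ollama run dolphin-llama3
```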