You Can Run Alpaca Locally, Even If You Can't Code
run stanford's language model on your mac in a few relatively simple steps — you don't need to be a developer to do it
GitHub user @cocktailpeanut published a guide to running the language model Alpaca locally. Alpaca is a fine-tuned version of Facebook’s LLM LLaMA that “behaves qualitatively similarly to OpenAI’s text-davinci-003.” The guide is built around a tool called Dalai and consists of a series of commands you run in Terminal using Node.js.
While the guide will be simple for anyone who knows JavaScript, Python, or another programming language, it requires enough technical know-how that the majority of people (myself included) will get lost pretty quickly. Here, using Dalai as a foundation, I’ve put together an absolute beginner’s guide to running Alpaca locally on your Mac.
1. Install Node.js
Node.js is an open-source JavaScript runtime environment that allows you to run JavaScript code outside of a web browser — in this case, Terminal. To install it, go to nodejs.org, download the macOS installer, and follow the installer’s prompts.
Once the installer finishes, Node.js will be installed. If you’re like me, you’ll try to get ahead of the guide and look for a Node.js app to open on your Mac, but that’s not how it works. Node.js is simply a way of executing code on your computer, and you’ll use this capability in Terminal now.
2. Open Terminal
Using Spotlight, open Terminal on your Mac. When it’s open, you’ll see a console window that looks a bit like a bare-bones text editor.
Terminal lets you control your computer using code instead of clicking on icons or buttons. Here, you can type in commands to tell your computer what to do. Broadly, these can do all kinds of things, like finding files, running programs, or making your Mac do tasks automatically.
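If you want to get a feel for it before moving on, you can try a couple of harmless, standard macOS commands (these have nothing to do with Alpaca; they just show you how Terminal works). Type this and hit return to print the folder you’re currently in:
pwd
And this one to list the files in that folder:
ls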
3. Ensure you’ve downloaded and installed Node.js
In Terminal, type the following command and hit return:
node --version
If it returns a Node.js version like in the screenshot below, you’re good.
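If you’re not sure what that looks like, the output will be something like this (the exact version number will differ depending on which release you installed):
v18.15.0
What matters is that you get a version number back rather than an error like “command not found.”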
4. Install the Alpaca model
Now you’re ready to install the Alpaca model to your hard drive. Once it’s installed, you’ll be able to interact with it through a web UI served from your own machine (“localhost”); more on this in Step 5. To install the model, give Terminal this command and hit return:
npx dalai alpaca install 7B
This will take a while (the 7B model is a 4-5 GB download), and your Terminal will look like this:
5. Run the Web UI
Once the model has been installed, give it this command, and hit return —
npx dalai serve
Then open http://localhost:3000 in your browser. This is the web UI, served from your own machine, where you’ll be able to talk to Alpaca. Unlike using ChatGPT, running an LLM on localhost keeps your data on your computer, which is better for privacy and security, and it can respond faster than ChatGPT, which can be super slow.
Here’s what you should be seeing now, with an example response after I prompted it to describe an apple:
Troubleshooting and Tips
If Chrome can’t access http://localhost:3000, return to Terminal and give it this command then hit return:
npx dalai serve
If Alpaca starts returning text in an infinite loop, return to Terminal and press
control+C
to stop it.

Ironically, GPT-4 was a huge help as I was installing Alpaca on my Mac. Any time I got hung up, I asked it what to do. When Terminal returned errors, I just copied and pasted them into GPT-4, and it would ultimately get me on the right track.
There are additional, specific troubleshooting tips on the Dalai page. And as above, if you get hung up on anything, tell ChatGPT (ideally GPT-4) what you’re trying to do, paste in what Terminal is returning, and ask it what you’re doing wrong.
Does running Alpaca locally allow you to get past ‘censorship’?
Stanford’s blog post says the following:
Deploying an interactive demo for Alpaca also poses potential risks, such as more widely disseminating harmful content and lowering the barrier for spam, fraud, or disinformation. We have put into place two risk mitigation strategies. First, we have implemented a content filter using OpenAI’s content moderation API, which filters out harmful content as defined by OpenAI’s usage policies.
Anecdotally, running Alpaca locally seems pretty uncensored. However, it is also highly prone to repetition and hallucination. That said, you can fine-tune it to behave better — and even act similarly to GPT — if you put in enough research and work.
@cocktailpeanut’s user-friendly guide is a preview of what becomes possible when people can run LLMs on their local computers, at scale. Individuals running their own LLMs can change their weights, customize their training data, and essentially use the technology ‘their own way’. We are approaching an inflection point at which LLMs become a commodity, and anyone who wants to can personalize their own language model in total privacy, with no restrictions, to supercharge their productivity — or perhaps their delusions.
Brian Roemmele, who writes at multiplex, tweeted yesterday that he used Dalai, in part, to help him “install and operate a full ChatGPT knowledge set… fully trained on my local computer and it needs no Internet once installed.” He says there is “no censorship,” and that “this model is now in a live connect with all of my other AI systems and the results have been absolutely stunning.”
-Brandon Gorrell
Read more Pirate Wires coverage of LLMs —
GPT-4 Scores Near 90th Percentile For Bar, LSAT, SAT, And AP Exams
GPT-4 Jailbreaks "Still Exist," But Much More "Difficult," Says OpenAI
GPT Prompt Using 'Token Smuggling' Really Does Jailbreak GPT-4

Few things:
- If anyone is nervous about installing Node.js, you don't need to be. Node.js is an *extremely* widely used environment. It isn't going to give you a virus and it isn't spyware.
- The 7B Alpaca model is 4-5 GB, so make sure you have space on your hard drive. The download may take a while because of its size, and because I believe it uses webtorrent under the hood. There is also a 13B Alpaca model, which will be about twice as big, but should give better results.
- You will need ~5GB RAM for the 7B model and ~8GB RAM for the 13B model (there are quick commands for checking free disk space and RAM after this list). If you aren't seeing results when you ask it things, restart it (ctrl-C and run the serve command again) and reissue the request with the "Debug" checkbox checked. If you see a "segmentation fault," that may be because you don't have enough RAM.
- btw, 7B and 13B refer to 7 billion and 13 billion model parameters respectively. These aren't "parameters" like knobs on an audio system that you can freely turn up or down. Each one of these parameters is just a number, a coefficient that the data gets multiplied by as it's processed. When he says you can "fine tune" the model, he means training the model to change these parameters -- that's what training is, searching through the space of possible parameters for ones that give good results.
- Regarding http://localhost:3000/ -- let's break that down. "HTTP" means it's using the HTTP protocol to communicate with the server (running on your machine). It is not HTTPS, which means it is not secure -- if anyone could snoop the traffic, they'd see clear text data, not encrypted data. I'll come back to that in a moment.
"localhost" is a special predefined hostname that basically all operating systems used to refer to themselves. localhost corresponds to 127.0.0.1, the loopback IP address, which also refers to oneself. So, even though you're accessing it through a browser, because it's "localhost", you're not sending packets to network beyond your computer.
":3000" just means that you're connecting to port 3000 on localhost. 3000 is just the default port on which the server runs. Ports are just a way of allowing multiple servers on one host.
- Note that following the instructions given above *does* (try to) open the server to listen for incoming *remote* connections on port 3000 as well.
The advantage is that, if you want, you could run the server on your computer but connect to it from your phone (if your computer's firewall rules allow this -- on Windows it prompts you for permission to open a firewall hole for the application; I don't know about Mac). To do that you would need to know the hostname or IP address of your computer (there's a quick command for finding your Mac's local IP address after this list). Suppose your IP address on your local network is 192.168.1.100 -- then you could access it from your phone (while on the same local network) by going to http://192.168.1.100:3000/ .
The downside to this is that, because it's going over HTTP and not HTTPS, if you're accessing it remotely, your requests and the returned responses can theoretically be snooped on.
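For the disk space and RAM notes above, here are two quick checks you can run in Terminal (standard macOS commands, nothing to do with Dalai itself). To see how much free space is left on your main disk:
df -h /
And to see how much RAM your Mac has in total (the number is reported in bytes, so divide by roughly a billion to get GB):
sysctl hw.memsize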
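And if you want to try connecting from your phone as described in the last note, you'll need your Mac's local IP address. On most Macs this command will print it (en0 is usually the built-in Wi-Fi interface; if it prints nothing, your active connection may use a different interface name). You can also find the same address in your Mac's network settings:
ipconfig getifaddr en0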
I thought I understood English...I was wrong.