DEV Community

Rajesh Natarajan
How to Run a ChatGPT-like LLM on Your Own Machine


So, you want to run a ChatGPT-like chatbot on your own computer? Whether you're looking to learn more about Large Language Models (LLMs) or just want to chat privately without anyone snooping, this guide is for you.

I’ve been experimenting with running LLMs and other generative AI tools locally, and I stumbled upon an incredible web UI by oobabooga for running these models. It’s packed with features, easy to set up, and works like a charm. Today, I’ll walk you through the process of setting it up on your machine.


The Easy Way (Windows + WSL)

If you're on Windows and using WSL (Windows Subsystem for Linux), you can get started with just a few commands.

  1. Clone the Repository: Open your terminal and run:
   git clone https://github.com/oobabooga/text-generation-webui.git
  2. Run the Batch File: Navigate to the cloned directory and run:
   start_wsl.bat

You’ll be prompted to choose your GPU/platform setup. If everything works, you’re all set! Skip ahead to Step 3: Run the Web UI below.

If it fails (which can happen), don’t worry. Follow the manual installation steps below.


Manual Installation (Linux/WSL)

If the easy way doesn’t work, or if you’re on Linux, here’s how to set it up manually.


Step 1: Install Anaconda

  1. Update your system:
   sudo apt-get update
  2. Install wget (if not already installed):
   sudo apt-get install wget
  3. Download the latest Anaconda installer:
   cd /tmp
   wget https://repo.anaconda.com/archive/Anaconda3-2023.09-0-Linux-x86_64.sh
  4. Validate the installer, comparing the output against the SHA-256 hash published on the Anaconda site:
   sha256sum Anaconda3-2023.09-0-Linux-x86_64.sh
  5. Run the installer:
   bash Anaconda3-2023.09-0-Linux-x86_64.sh

Follow the prompts to complete the installation. When asked, choose the default installation location and allow Anaconda to initialize itself.

  6. Restart your terminal or WSL window.

Step 2: Install the Text Generation Web UI

  1. Clone the repository:
   git clone https://github.com/oobabooga/text-generation-webui.git
  2. Create a new Conda environment:
   conda create -n textgen python=3.11
   conda activate textgen

If you see (textgen) in your terminal prompt, the environment is active.

  3. Install PyTorch:

     • For NVIDIA GPUs:

      pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121

     • For CPU-only machines:

      pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cpu
  4. Install additional dependencies: Navigate to the text-generation-webui directory and install the required packages for your hardware (substitute the requirements file from the table below if yours differs):
   cd text-generation-webui
   pip install -r requirements.txt

Refer to the table below to choose the correct requirements file:

| GPU      | CPU     | Requirements File                |
|----------|---------|----------------------------------|
| NVIDIA   | AVX2    | requirements.txt                 |
| NVIDIA   | No AVX2 | requirements_noavx2.txt          |
| AMD      | AVX2    | requirements_amd.txt             |
| AMD      | No AVX2 | requirements_amd_noavx2.txt      |
| CPU only | AVX2    | requirements_cpu_only.txt        |
| CPU only | No AVX2 | requirements_cpu_only_noavx2.txt |
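If you're not sure whether your CPU supports AVX2, you can check its flags programmatically. Here's a small Python sketch that maps GPU vendor and AVX2 support to the table above — the helper name `pick_requirements` is mine, not part of the repo:

```python
def pick_requirements(cpu_flags: str, gpu: str = "nvidia") -> str:
    """Map GPU vendor and CPU AVX2 support to a requirements file
    from the table above. `gpu` is "nvidia", "amd", or "cpu"."""
    suffix = "" if "avx2" in cpu_flags else "_noavx2"
    base = {
        "nvidia": "requirements",
        "amd": "requirements_amd",
        "cpu": "requirements_cpu_only",
    }[gpu]
    return f"{base}{suffix}.txt"

# On Linux, the CPU flags live in /proc/cpuinfo:
# with open("/proc/cpuinfo") as f:
#     print(pick_requirements(f.read(), gpu="nvidia"))
```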


Step 3: Run the Web UI

  1. Start the server:
   python server.py

You should see a message indicating that the server is running.

  2. Open your browser and navigate to:
   http://localhost:7860

If you see the web UI, you’re ready to go!
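Beyond the browser, recent versions of text-generation-webui can also expose an OpenAI-compatible API if you start the server with `python server.py --api` (by default on port 5000 — check the repo's docs if your version differs). A minimal Python sketch under those assumptions:

```python
import json
import urllib.request

# Assumes the server was started with `python server.py --api`; adjust
# the URL if your setup uses a different port or host.
API_URL = "http://127.0.0.1:5000/v1/chat/completions"

def build_payload(user_message: str, temperature: float = 0.7) -> dict:
    """Build an OpenAI-style chat completion request body."""
    return {
        "messages": [{"role": "user", "content": user_message}],
        "temperature": temperature,
        "max_tokens": 200,
    }

def chat(user_message: str) -> str:
    """Send one message to the local server and return the reply text."""
    req = urllib.request.Request(
        API_URL,
        data=json.dumps(build_payload(user_message)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]
```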


Downloading an LLM Model

To use the web UI, you’ll need to download a model. Here’s how:

  1. In the web UI, click on Model in the top menu.
  2. Select Download model or LoRA.
  3. Enter the model’s Hugging Face path (e.g., TheBloke/Nous-Hermes-13B-GPTQ).
  4. Click Download.

Once the model is downloaded, refresh the list, select the model, and click Load.
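If you want to locate a downloaded model on disk, the web UI keeps models under the repo's models/ folder. A tiny sketch of the folder-naming convention — my assumption based on how the repo flattens Hugging Face paths, so verify against your own models/ directory:

```python
def local_model_dir(hf_path: str, models_root: str = "models") -> str:
    """Guess where the web UI stores a downloaded model: the Hugging Face
    `org/name` path is flattened into a single folder name (assumption;
    check your models/ folder to confirm)."""
    return f"{models_root}/{hf_path.replace('/', '_')}"
```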


Having a Chat

Now that your model is loaded, you can start chatting! Here’s an example of how it works:

  1. Type your message in the input box and press Enter.
  2. The model will generate a response in real time.

For example:

  • You: "What’s the capital of France?"
  • Model: "The capital of France is Paris."

You can tweak parameters like temperature, top-p, and max tokens to customize the model’s behavior.
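To build intuition for what those knobs do, here's a toy Python sketch of temperature plus nucleus (top-p) sampling — not the web UI's actual implementation, just the standard technique behind those parameters:

```python
import math
import random

def top_p_sample(logits, temperature=0.7, top_p=0.9, rng=random):
    """Sample a token index using temperature scaling and nucleus (top-p)
    filtering over a list of raw logits."""
    # Temperature scaling: lower values sharpen the distribution,
    # higher values flatten it (more randomness).
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Top-p: keep the smallest set of tokens whose cumulative
    # probability reaches top_p, most likely first.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cum = [], 0.0
    for i in order:
        kept.append(i)
        cum += probs[i]
        if cum >= top_p:
            break
    # Renormalize over the kept tokens and sample one.
    kept_total = sum(probs[i] for i in kept)
    r = rng.random() * kept_total
    for i in kept:
        r -= probs[i]
        if r <= 0:
            return i
    return kept[-1]
```

With a very low temperature the most likely token wins almost every time; raising temperature or top-p widens the pool the sampler draws from.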


Conclusion

Running a ChatGPT-like LLM on your own machine is not only fun but also incredibly useful. Whether you’re exploring AI, building a private chatbot, or just experimenting, this setup gives you full control over your AI experience.

I’ve been using this setup to dive deeper into LLMs and their capabilities. What about you? Are you running LLMs locally? Let’s chat about it in the comments!


Resources

  • text-generation-webui by oobabooga: https://github.com/oobabooga/text-generation-webui


Happy hacking! 🚀

