DEV Community

Rajesh Natarajan
How to Run a ChatGPT-like LLM on Your Own Machine


So, you want to run a ChatGPT-like chatbot on your own computer? Whether you're looking to learn more about Large Language Models (LLMs) or just want to chat privately without anyone snooping, this guide is for you.

I’ve been experimenting with running LLMs and other generative AI tools locally, and I stumbled upon an incredible web UI by oobabooga for running these models. It’s packed with features, easy to set up, and works like a charm. Today, I’ll walk you through the process of setting it up on your machine.


The Easy Way (Windows + WSL)

If you're on Windows and using WSL (Windows Subsystem for Linux), you can get started with just a few commands.

  1. Clone the Repository: Open your terminal and run:
   git clone https://github.com/oobabooga/text-generation-webui.git
  2. Run the Batch File: Navigate to the cloned directory and run:
   start_wsl.bat

You’ll be prompted to choose your GPU/platform setup. If everything works, you’re all set! Skip ahead to Step 3: Run the Web UI below.

If it fails (which can happen), don’t worry. Follow the manual installation steps below.


Manual Installation (Linux/WSL)

If the easy way doesn’t work, or if you’re on Linux, here’s how to set it up manually.


Step 1: Install Anaconda

  1. Update your system:
   sudo apt-get update
  2. Install wget (if not already installed):
   sudo apt-get install wget
  3. Download the latest Anaconda installer:
   cd /tmp
   wget https://repo.anaconda.com/archive/Anaconda3-2023.09-0-Linux-x86_64.sh
  4. Validate the installer, comparing the output against the SHA-256 hash published on the Anaconda site:
   sha256sum Anaconda3-2023.09-0-Linux-x86_64.sh
  5. Run the installer:
   bash Anaconda3-2023.09-0-Linux-x86_64.sh

Follow the prompts to complete the installation. When asked, choose the default installation location and allow Anaconda to initialize itself.

  6. Restart your terminal or WSL window.

Step 2: Install the Text Generation Web UI

  1. Clone the repository:
   git clone https://github.com/oobabooga/text-generation-webui.git
  2. Create a new Conda environment:
   conda create -n textgen python=3.11
   conda activate textgen

If you see (textgen) in your terminal prompt, the environment is active.

  3. Install PyTorch:

     • For NVIDIA GPUs:

      pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121

     • For CPU-only machines:

      pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cpu
  4. Install additional dependencies: Navigate to the text-generation-webui directory and install the required packages for your hardware (substitute the requirements file from the table below if yours differs):
   cd text-generation-webui
   pip install -r requirements.txt

Refer to the table below to choose the correct requirements file:

| GPU      | CPU     | Requirements File                |
|----------|---------|----------------------------------|
| NVIDIA   | AVX2    | requirements.txt                 |
| NVIDIA   | No AVX2 | requirements_noavx2.txt          |
| AMD      | AVX2    | requirements_amd.txt             |
| AMD      | No AVX2 | requirements_amd_noavx2.txt      |
| CPU only | AVX2    | requirements_cpu_only.txt        |
| CPU only | No AVX2 | requirements_cpu_only_noavx2.txt |
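If you're not sure whether your CPU supports AVX2, you can check its flags programmatically. Here's a small Python sketch that maps GPU vendor and AVX2 support to the table above — the helper name `pick_requirements` is mine, not part of the repo:

```python
def pick_requirements(cpu_flags: str, gpu: str = "nvidia") -> str:
    """Map GPU vendor and CPU AVX2 support to a requirements file
    from the table above. `gpu` is "nvidia", "amd", or "cpu"."""
    suffix = "" if "avx2" in cpu_flags else "_noavx2"
    base = {
        "nvidia": "requirements",
        "amd": "requirements_amd",
        "cpu": "requirements_cpu_only",
    }[gpu]
    return f"{base}{suffix}.txt"

# On Linux, the CPU flags live in /proc/cpuinfo:
# with open("/proc/cpuinfo") as f:
#     print(pick_requirements(f.read(), gpu="nvidia"))
```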


Step 3: Run the Web UI

  1. Start the server:
   python server.py

You should see a message indicating that the server is running.

  2. Open your browser and navigate to:
   http://localhost:7860

If you see the web UI, you’re ready to go!
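Beyond the browser, recent versions of text-generation-webui can also expose an OpenAI-compatible API if you start the server with `python server.py --api` (by default on port 5000 — check the repo's docs if your version differs). A minimal Python sketch under those assumptions:

```python
import json
import urllib.request

# Assumes the server was started with `python server.py --api`; adjust
# the URL if your setup uses a different port or host.
API_URL = "http://127.0.0.1:5000/v1/chat/completions"

def build_payload(user_message: str, temperature: float = 0.7) -> dict:
    """Build an OpenAI-style chat completion request body."""
    return {
        "messages": [{"role": "user", "content": user_message}],
        "temperature": temperature,
        "max_tokens": 200,
    }

def chat(user_message: str) -> str:
    """Send one message to the local server and return the reply text."""
    req = urllib.request.Request(
        API_URL,
        data=json.dumps(build_payload(user_message)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]
```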


Downloading an LLM Model

To use the web UI, you’ll need to download a model. Here’s how:

  1. In the web UI, click on Model in the top menu.
  2. Select Download model or LoRA.
  3. Enter the model’s Hugging Face path (e.g., TheBloke/Nous-Hermes-13B-GPTQ).
  4. Click Download.

Once the model is downloaded, refresh the list, select the model, and click Load.
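If you want to locate a downloaded model on disk, the web UI keeps models under the repo's models/ folder. A tiny sketch of the folder-naming convention — my assumption based on how the repo flattens Hugging Face paths, so verify against your own models/ directory:

```python
def local_model_dir(hf_path: str, models_root: str = "models") -> str:
    """Guess where the web UI stores a downloaded model: the Hugging Face
    `org/name` path is flattened into a single folder name (assumption;
    check your models/ folder to confirm)."""
    return f"{models_root}/{hf_path.replace('/', '_')}"
```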


Having a Chat

Now that your model is loaded, you can start chatting! Here’s an example of how it works:

  1. Type your message in the input box and press Enter.
  2. The model will generate a response in real time.

For example:

  • You: "What’s the capital of France?"
  • Model: "The capital of France is Paris."

You can tweak parameters like temperature, top-p, and max tokens to customize the model’s behavior.
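To build intuition for what those knobs do, here's a toy Python sketch of temperature plus nucleus (top-p) sampling — not the web UI's actual implementation, just the standard technique behind those parameters:

```python
import math
import random

def top_p_sample(logits, temperature=0.7, top_p=0.9, rng=random):
    """Sample a token index using temperature scaling and nucleus (top-p)
    filtering over a list of raw logits."""
    # Temperature scaling: lower values sharpen the distribution,
    # higher values flatten it (more randomness).
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Top-p: keep the smallest set of tokens whose cumulative
    # probability reaches top_p, most likely first.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cum = [], 0.0
    for i in order:
        kept.append(i)
        cum += probs[i]
        if cum >= top_p:
            break
    # Renormalize over the kept tokens and sample one.
    kept_total = sum(probs[i] for i in kept)
    r = rng.random() * kept_total
    for i in kept:
        r -= probs[i]
        if r <= 0:
            return i
    return kept[-1]
```

With a very low temperature the most likely token wins almost every time; raising temperature or top-p widens the pool the sampler draws from.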


Conclusion

Running a ChatGPT-like LLM on your own machine is not only fun but also incredibly useful. Whether you’re exploring AI, building a private chatbot, or just experimenting, this setup gives you full control over your AI experience.

I’ve been using this setup to dive deeper into LLMs and their capabilities. What about you? Are you running LLMs locally? Let’s chat about it in the comments!


Resources

  • text-generation-webui by oobabooga: https://github.com/oobabooga/text-generation-webui


Happy hacking! 🚀

