
Running Large Language Models Locally

Categories: Deep learning, Machine learning, Artificial intelligence, Large Language Models
My notes from installing Ollama and using open-source large language models locally.
Author: Josh Gregory

Published: January 18, 2025

Modified: September 3, 2025

These instructions are taken from the Open WebUI tutorials, which can be found here. I used Python and Conda, but other platforms (e.g., Docker, Kubernetes) are also available.

Downloads

Note: These instructions were done on a Windows machine running WSL 2. They should also work for macOS. If you’re running a Windows machine and don’t have WSL installed, you can find more information about that here.

Install with conda

Create a new Conda environment. It can be named anything, but for the sake of following the tutorials, I’ll name it open-webui.

conda create -n open-webui python=3.11

and activate it with

conda activate open-webui
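Before installing anything, it can be worth confirming the new environment is actually active and picked up the right interpreter (a quick sanity check, assuming a standard Conda setup):

```shell
# Both should point at the open-webui environment
python --version     # expect Python 3.11.x
which python         # path should contain "open-webui"
```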

Then install Open WebUI with pip:

pip install open-webui

Start the server:

open-webui serve

Then navigate to http://localhost:8080/ to access the ChatGPT-like UI.
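If the page doesn't load, you can check from the terminal whether the server is answering before debugging further (a minimal check, assuming `curl` is available and the server is on its default port):

```shell
# Print only the HTTP status code; 200 means the server is up
curl -s -o /dev/null -w "%{http_code}\n" http://localhost:8080/
```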

To update the Open WebUI package, run

pip install -U open-webui

Ollama

Get Models via Ollama

Go to the Ollama website and navigate to “Models”, or just click here. Each model page lists the available sizes. For example, if I want to download the 1b variant of llama3.2, I would run

ollama pull llama3.2:1b

which downloads the model. To run the model, I would simply type

ollama run llama3.2:1b
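A couple of housekeeping commands are useful once you have a few models downloaded: `ollama list` shows what is installed locally, and `ollama rm` deletes a model to reclaim disk space (model names here are just examples):

```shell
# Show all locally installed models and their sizes
ollama list

# Remove a model you no longer need; larger variants can take several GB
ollama rm llama3.2:1b
```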

We can now run any of the models we have installed from the command line. For example, if I installed llama3.2 via ollama pull llama3.2, I could run it with

ollama run llama3.2

which opens an interactive prompt:

~$ ollama run llama3.2
>>> Send a message (/? for help)
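You don't have to use the interactive session at all: `ollama run` also accepts a one-shot prompt as an argument and prints the response to stdout, which is handy for scripting (the prompt below is just an example):

```shell
# One-shot prompt: prints the model's answer and returns to the shell
ollama run llama3.2:1b "Explain WSL 2 in one sentence."
```

From inside the interactive prompt, `/?` lists the available slash commands and `/bye` exits the session.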

Citation

BibTeX citation:
@online{gregory2025,
  author = {Gregory, Josh},
  title = {Running {Large} {Language} {Models} {Locally}},
  date = {2025-01-18},
  url = {https://joshgregory42.github.io/posts/2025-09-03-local-llm/},
  langid = {en}
}
For attribution, please cite this work as:
Gregory, Josh. 2025. “Running Large Language Models Locally.” January 18, 2025. https://joshgregory42.github.io/posts/2025-09-03-local-llm/.