OFFLINE AI! Using Ollama At Home Or In The Office
- Jim Clover
- Feb 16
- 4 min read

Since its inception I've been a huge fan of Ollama (www.ollama.com), as it allows you to easily try out different Large Language Models against your use cases. From seeing how an LLM can help you summarise text, to analysing an image for things you want to detect, Ollama is the wrapper to all this magic!
Benefits of Ollama and Local LLMs
If you use ChatGPT, Claude Sonnet and other "Frontier Models", at some point you will want to pay for more features. Which is fine, and I already do that myself, but what if I wanted to be able to chat with an AI offline? Say on the train, or at home without spending money on tokens, or maybe even write a Python script to hand an AI some work I need doing?
This is where Ollama comes in. It sets up an environment which allows you to easily PULL (download) a model, chat to it (on the command line/Terminal), delete models you don't want, and much, much more. All private: nothing goes online or gets stored in the Cloud, and your chats stay on the computer you ran them on (until you delete them).
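To give you a flavour, the day-to-day commands look like this (llama3.2 is just an example model name, use whichever you fancy):
ollama pull llama3.2   (download the model)
ollama run llama3.2    (chat to it in your Terminal - type /bye to exit)
ollama list            (see which models you have downloaded)
ollama rm llama3.2     (delete it again)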
And of course, no cost to you.
Will the offline AI models be as powerful as ChatGPT or similar? No, but you'll be shocked at how much you can get done. I would encourage you to experiment and try different things out to see how good or bad they are. You can always use a blend of both online and offline AIs, the choice is yours :)
You can get Ollama here: www.ollama.com and use the respective installer for your Operating System. It works equally well on Windows, macOS and Linux, as well as on ARM and other architectures.
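A small aside: on Linux you can also install Ollama straight from the Terminal with the one-liner from their site:
curl -fsSL https://ollama.com/install.sh | sh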
So What Can I Use It For?
Examples paint a thousand words, so here are two to get you thinking!
Mistral - This LLM, or model as I'll refer to it, has been around for some time now but continues to do a great job when it comes to working with text. I use it for my news app FocalChat, which downloads news articles; Mistral's job is not only to summarise each one, but to grab a snappy headline too.
ollama pull mistral
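And if, like me, you want to do this from a script rather than the Terminal, here's a rough sketch using the ollama Python package (pip install ollama). It assumes you've already pulled mistral, and the article text is just a placeholder:
import ollama

# Placeholder text - in a real app this would be a freshly downloaded news article
article = "Paste the full text of a news article here."

# One prompt asks Mistral for both the summary and the headline
response = ollama.chat(
    model="mistral",
    messages=[{
        "role": "user",
        "content": "Summarise this article in two sentences, then suggest "
                   "a snappy headline:\n\n" + article,
    }],
)

print(response["message"]["content"])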
llama3.2-vision - Whilst Mistral shines to this day on text, vision models just get better and better. Simply put, you provide an image (a photo, for example) to llama3.2-vision and ask it "What is this?" - it will then do a great job of producing a paragraph on what it saw.
ollama pull llama3.2-vision
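The same trick works from Python too - a rough sketch, again assuming the ollama package is installed, where photo.jpg is just a stand-in for any image on your machine:
import ollama

# photo.jpg is a placeholder - point it at any image file you have handy
response = ollama.chat(
    model="llama3.2-vision",
    messages=[{
        "role": "user",
        "content": "What is this?",
        "images": ["photo.jpg"],  # the image is attached to the prompt for vision models
    }],
)

print(response["message"]["content"])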
What is incredible about Ollama is the range of models - you can literally try out anything you like, BUT there is a catch (isn't there always? :) ). Your laptop or desktop computer needs sufficient hardware to match the model you want to try out.
The Sweet Spot
My AI test rig here uses two Nvidia 4060 Ti GPUs with 16GB of VRAM each, which gives me a combined total of 32GB of VRAM for models to use. Remember, VRAM is very different from your computer's RAM. The ability of GPUs to compute AI workloads is crucial for a smooth experience with these models, so try to fit your Ollama downloads to what you have. With my 32GB of VRAM, I'm able to run deepseek-r1:32b, which is pretty good and totally usable. But if I try to run the deepseek-r1:70b model, which needs more VRAM, it's not a great experience at all! In fact, it takes several minutes just to start up!
If you have a MacBook, Mac Mini or iMac with more than 16GB of RAM (Unified Memory, because everything shares it), you can comfortably run 7b or smaller models, possibly 8b too, but try at your own risk. I know from my own testing that the deepseek-r1:70b model works OK on a MacBook Pro with an M4 Max and 128GB of Unified Memory, but it gets VERY HOT! So beware.
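A handy way to see how a model actually fits is ollama ps, which (while a model is loaded) shows how much memory it's taking and how much of it is running on the GPU versus spilling over to the CPU:
ollama ps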

Use the chart above to help you decide which models suit your hardware. You can always try the smallest models first (often around 1.5b) and then edge up to 7b. If they don't work, at your command line simply type:
ollama rm <model name>   (for example: ollama rm mistral)
And it will be removed immediately, freeing up your disk space.
User Interfaces for Ollama
Not everyone likes the command line/Terminal. Thankfully, there are LOADS of User Interfaces for Ollama! You can check out what's available from the open source community (and more) at https://github.com/ollama/ollama?tab=readme-ov-file#community-integrations, as well as find the other Ollama commands (and the basic ones, of course) by scrolling back up that page.
For non-technical folk who need a more straightforward setup, there are some options.
msty.app is pretty solid; it's not, strictly speaking, open source, but it does take a lot of the pain away from using the Terminal/Command Line.
https://openwebui.com is probably the most popular, BUT it can be a pain to set up for novices. It is my go-to, and I run it in Docker. If you do install Docker, click on the Terminal (bottom right in Docker Desktop) and try installing Open WebUI with:
docker run -d -p 3000:8080 --add-host=host.docker.internal:host-gateway -v open-webui:/app/backend/data --name open-webui --restart always ghcr.io/open-webui/open-webui:main
Copy and paste the above, and make sure the Play button is clicked if it doesn't start. Once it's up, Open WebUI will be waiting in your browser at http://localhost:3000 (that's the port the command above maps it to).
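If you're not sure whether the container came up, a quick check from the same Terminal is:
docker ps --filter name=open-webui
If it's listed there, you're good to go.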
I hope this introduction to local AI and LLMs using Ollama was helpful to you. Happy AI'ing!