How to Run Local LLMs on Your MacBook: A Complete Guide


Complete guide to running LLMs locally on MacBook with Ollama — zero API costs, full privacy.

Marcus Rivera
Mar 26, 2026

Why Run AI Locally?

Running LLMs on your own hardware gives you complete privacy, zero API costs, and offline access. With Apple Silicon, it's surprisingly capable.

Prerequisites

  • MacBook with M1/M2/M3/M4 chip (16GB+ RAM recommended)
  • Homebrew installed
  • ~20GB free disk space

Step 1: Install Ollama

The easiest way to get started:

brew install ollama
ollama serve
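With `ollama serve` running, the server listens on port 11434 by default. A quick way to confirm it is reachable before pulling models is to probe that port; this is a minimal sketch using only the standard library (the host URL and timeout are assumptions you can adjust):

```python
# Quick sanity check that the Ollama server is reachable.
# Assumes the default port 11434; adjust if you changed OLLAMA_HOST.
import urllib.request
import urllib.error

def ollama_up(host: str = "http://localhost:11434", timeout: float = 2.0) -> bool:
    """Return True if the Ollama server responds on its root endpoint."""
    try:
        with urllib.request.urlopen(host, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        return False

print("Ollama running:", ollama_up())
```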

Step 2: Pull a Model

# Fast and capable
ollama pull llama3.2

# For coding tasks
ollama pull deepseek-coder-v2

# Smaller, faster option
ollama pull phi3

Step 3: Start Chatting

ollama run llama3.2
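Besides the interactive CLI, Ollama exposes a local HTTP API, so you can script against your model. This is a small sketch of a non-streaming call to the `/api/generate` endpoint; the model name and prompt are just examples:

```python
# Minimal client for Ollama's local HTTP API (default port 11434).
# Uses the /api/generate endpoint with streaming disabled, so the
# full response arrives as a single JSON object.
import json
import urllib.request

def build_generate_request(model: str, prompt: str) -> dict:
    """Build the JSON payload for a non-streaming /api/generate call."""
    return {"model": model, "prompt": prompt, "stream": False}

def generate(model: str, prompt: str) -> str:
    payload = json.dumps(build_generate_request(model, prompt)).encode()
    req = urllib.request.Request(
        "http://localhost:11434/api/generate",
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]

if __name__ == "__main__":
    print(generate("llama3.2", "Why is the sky blue?"))
```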

Performance Benchmarks

| Model | RAM Usage | Tokens/sec (M3 Pro) |
|---|---|---|
| Llama 3.2 7B | 5.2 GB | 42 t/s |
| DeepSeek Coder | 8.1 GB | 28 t/s |
| Phi-3 Mini | 2.8 GB | 65 t/s |

Pro Tips

  1. Use quantized models (Q4_K_M) for the best speed/quality balance
  2. Close memory-heavy apps before running larger models
  3. Use Open WebUI for a ChatGPT-like interface on localhost
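To see why quantization matters, a back-of-envelope size estimate helps: a model's on-disk footprint is roughly parameters times bits per weight, divided by eight. The bits-per-weight figures below are rough community approximations, not exact values:

```python
# Back-of-envelope estimate of model file size for a given quantization.
# Bits-per-weight figures are approximations: Q4_K_M sits around
# 4.5-4.8 bits per weight, while unquantized FP16 is 16.
def model_size_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate on-disk size in GB: params * bits / 8."""
    bytes_total = params_billion * 1e9 * bits_per_weight / 8
    return round(bytes_total / 1e9, 1)

# A 7B model at ~4.5 bits per weight is roughly 3.9 GB on disk,
# versus 14 GB at FP16 -- a ~3.5x saving for a modest quality cost.
print(model_size_gb(7, 4.5))   # ~3.9
print(model_size_gb(7, 16.0))  # 14.0
```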

The local AI revolution is here, and your MacBook is more than ready for it.
