Marcus Rivera

Full-stack developer and open-source advocate. Writes about developer tools, frameworks, and the craft of software.

54 articles

Ollama: Run Local LLMs Like a Pro in 2026

A hands-on guide to Ollama, the default local-LLM runner in 2026 (v0.30.10). Covers install, pulling and running models, calling them from the OpenAI SDK at localhost:11434, structured JSON outputs, tool calling, and Modelfiles, plus how to size a model to your hardware.

By Marcus Rivera · 6 min · Jun 25, 2026

Reviews

OpenCode: The Open-Source AI Coding Agent at 178K Stars

OpenCode is an open-source (MIT), terminal-native AI coding agent with 178K GitHub stars. It is model-agnostic, connecting to 75+ providers (Anthropic, OpenAI, Google, Ollama) with bring-your-own keys. LSP integration feeds compiler diagnostics back to the model, and it ships built-in build and plan agents plus a general subagent. It runs locally or air-gapped, ships frequently (v1.17.9, 826 releases), and now has a desktop beta. Trade-offs: a terminal learning curve, API bills you pay yourself, and output quality that depends on the model you plug in.

By Marcus Rivera · 5 min · Jun 24, 2026

Reviews

CodeRabbit: The AI Code Reviewer That Reads Your Whole Repo

CodeRabbit is an AI code reviewer that posts line-by-line PR feedback across GitHub, GitLab, Azure DevOps and Bitbucket. Plans run Free, Pro (4/user/mo), Pro Plus (8) and Enterprise, billed only for developers who open PRs. Strengths: context via MCP, one-click autofix, deep static analysis. Watch the 5/10/12 review rate limits.

By Marcus Rivera · 6 min · Jun 23, 2026

Tech Tips

Structured Outputs: Force LLMs to Return Valid JSON

A practical guide to OpenAI Structured Outputs: the difference from JSON mode, function calling vs response_format, strict schema rules, constrained decoding, limits, and cross-provider options.

By Marcus Rivera · 8 min · Jun 22, 2026

Reviews

Google Antigravity 2.0: From Cursor Clone to AI Agent Platform

A hands-on review of Google Antigravity 2.0: its multi-agent orchestration, pricing, and trade-offs.

By Marcus Rivera · 5 min · Jun 20, 2026

Tech Tips

Context Engineering: A Practical Playbook for Reliable AI Agents

Context engineering is the discipline of curating tools, prompts, retrieval, and memory each turn so AI agents stay reliable over long-horizon tasks.

By Marcus Rivera · 7 min · Jun 16, 2026

Tech Tips

Prompt Caching: How to Cut LLM API Costs by Up to 90%

Prompt caching stores the computed KV attention tensors for a repeated prompt prefix so the model skips recomputation, cutting input cost and latency. Anthropic (explicit cache_control, ~90% read discount), OpenAI (automatic, 50% off, 1,024-token minimum), and Google Gemini (implicit plus explicit cache objects, up to 90%) all support it. The one rule that determines hit rate: put all static content at the front of the prompt and all dynamic content at the back.

By Marcus Rivera · 7 min · Jun 12, 2026

Tech Tips

Firecrawl: Turn Any Website Into Agent-Ready Markdown

Firecrawl converts messy, JavaScript-rendered websites into clean, LLM-ready markdown for RAG and AI agents. Install with 'pip install firecrawl' and use the Firecrawl class: scrape for known URLs (1 credit), crawl for discovery (1 credit per page, always set a limit), and schema-based extraction for typed JSON. Watch Enhanced/Stealth Mode, which costs 5 credits per page on Cloudflare-protected sites, and note that credits do not roll over.

By Marcus Rivera · 5 min · Jun 10, 2026

Tech Tips

RAG Grounding: 7 Ways to Stop LLM Hallucinations in Production

A practitioner's guide to grounding retrieval-augmented generation systems. Covers fixing retrieval first, hybrid dense-plus-keyword search, cross-encoder reranking, contextual compression, refusal prompting, verified citations, Chain-of-Verification, confidence-threshold abstention, and measuring faithfulness with RAGAS.

By Marcus Rivera · 6 min · Jun 9, 2026

Tech Tips

MCP Security: A 2026 Hardening Playbook After CVE-2025-6514

A practical 2026 security playbook for Model Context Protocol agents. It explains MCP-specific threats (prompt injection, tool poisoning, rug pulls, confused-deputy), dissects the critical CVE-2025-6514 mcp-remote RCE, and gives concrete hardening steps: patch to 0.1.16, enforce OAuth 2.1 over HTTPS, isolate servers, gate destructive actions, and audit agent activity.

By Marcus Rivera · 7 min · Jun 2, 2026

Tech Tips

AGENTS.md: Configure AI Coding Agents That Actually Obey

AGENTS.md is a Linux Foundation-stewarded open standard, adopted by 60,000+ repositories and read natively by 20+ tools including Codex, Cursor, and Copilot. This guide covers the eight core sections, the phrasing patterns that change agent behavior, monorepo nesting, and how it differs from CLAUDE.md, .cursorrules, MCP, and SKILL.md.

By Marcus Rivera · 9 min · May 31, 2026

Tech Tips

Prompt Injection: A 2026 Defense Playbook for AI Agents

A defense playbook for prompt injection in AI agents. It explains why the attack is unsolvable at the model layer, frames the threat with Simon Willison's lethal trifecta (private data, untrusted content, external communication), and prescribes layered controls: architectural separation, least-privilege tools, input filtering, egress allowlisting, circuit breakers, and hardened models, which can cut attack success from 73.2% to 8.7%.

By Marcus Rivera · 6 min · May 30, 2026

Kanwas: The Open-Source AI Workspace That Hit #1 on Product Hunt

Understand-Anything: The 37K-Star Knowledge Graph for Your Codebase

Tycoon AI Review: One Operator, an AI CEO, and a Full C-Suite

Emdash: The Open-Source IDE Built to Run 22 Coding Agents in Parallel

Pipali: The Open-Source Desktop AI Coworker From Khoj AI's YC Team

mattpocock/skills: The 91.7K-Star Repo Reshaping AI-Assisted Engineering

Raindrop Workshop: The Local AI Agent Debugger That Hit 473 Stars

OpenHuman: The 776-Star Agent That Reads You Before You Type

Kilo Code v7: The Open-Source AI Agent Rebuilt for Parallel Work

Wispr Flow Review: $15 Voice App Eyeing $2B Valuation

Vercel Open Agents: Background Coding Agents You Can Fork

GitHub Spec-Kit: The 90K-Star Antidote to Vibe-Coding With AI Agents

OpenClaw: 371K Stars, Three Rebrands, and a $16M Crypto Scam

FlowMarket: The Live Network Where AI Agents Negotiate B2B Deals

Cursor Bugbot Hits 78% Bug Resolution by Learning From Your PRs

Gemini API Webhooks: Kill the Polling Loop on Long-Running Jobs

Postiz: The 29.6K-Star Open-Source Social Scheduler Killing Buffer

VibeVoice: Microsoft's Open-Source Frontier Voice AI Hits 33K Stars

Windsurf 2.0: Cognition Bakes Devin Right Into the IDE

Nemotron 3 Nano Omni: NVIDIA's 30B Open Model Sees and Hears

Archon OS: The Open-Source Brain That Makes Claude Code Remember

Goose: Block's Open-Source Local-First AI Agent Hits 35K

ElevenCreative Review: ElevenLabs' All-in-One AI Studio

Voicebox: The Local-First Voice Cloning Studio for Mac and Windows

NVIDIA Ising: Open-Source AI Models That Make Quantum Computing Actually Work

GLM-5.1: The Open-Source 754B Model That Works for Eight Hours Straight

Caveman: The Claude Code Skill That Cuts 65% of Output Tokens

Ghost Pepper: 100% Local Speech-to-Text for macOS

Edgee Codex Compressor: The Rust Gateway That Cuts Codex Costs 35.6%

Ray: The Open-Source AI Financial Advisor That Runs on Your Laptop

Hermes Agent: The Open-Source AI Agent That Learns How You Work

Cohere Transcribe: The Open-Source ASR Model That Dethroned Whisper

Baton: The Desktop App for Orchestrating AI Coding Agents

Gemini CLI: Google's Open-Source Terminal Agent Hits 101K GitHub Stars

Google Gemma 4: Four Open Models That Punch Above Their Weight

Moondream 3: The 9B Vision Model That Runs Like a 2B

Voxtral TTS: Mistral's Open-Weight Speech Model Challenges ElevenLabs

5 Best AI Tool Directories in 2026: Find the Right Tool Fast

Mistral Small 4: One Open-Source Model Replaces Three Separate AI Products

LTX 2.3: Lightricks' Open-Source Model Generates 4K Video with Synced Audio

Biome v2.4: The Rust-Powered Toolchain Replacing ESLint and Prettier

OpenClaw: The Self-Hosted AI Agent That Hit 247K GitHub Stars