
Firecrawl: Turn Any Website Into Agent-Ready Markdown

Firecrawl converts messy, JavaScript-rendered websites into clean, LLM-ready markdown for RAG and AI agents. Install with 'pip install firecrawl' and use the Firecrawl class: scrape for known URLs (1 credit), crawl for discovery (1 credit per page, always set a limit), and schema-based extraction for typed JSON. Watch Enhanced/Stealth Mode, which costs 5 credits per page on Cloudflare-protected sites, and note that credits do not roll over.

Marcus Rivera

Jun 10, 2026

Every RAG pipeline and AI agent has the same unglamorous bottleneck: getting clean, structured text out of messy modern websites. Raw HTML is a swamp of nav bars, cookie banners, lazy-loaded JavaScript, and bot walls. Firecrawl exists to drain that swamp — it takes a URL and hands back LLM-ready markdown, no DOM-wrangling required.

This is a practical guide to using Firecrawl well: the setup, the core operations, the cost traps, and the patterns that separate a toy script from something you'd put in production. It assumes you're comfortable with Python and have wired up an API or two before.

Why Not Just Use `requests` and BeautifulSoup?

You can. For a static blog, that stack is fine and free. Firecrawl earns its keep when the sites fight back.

Much of the modern web is client-rendered. A requests.get() on a client-rendered React or Next.js page often returns a near-empty shell; the content materializes only after JavaScript runs. Brittle hand-rolled scrapers are also a classic source of silent data-quality bugs that poison everything downstream, and asking an AI assistant to write the scraper does not make it safer: Veracode's 2025 GenAI Code Security Report found that 45% of AI-generated code samples contained security vulnerabilities.
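A quick way to see the empty-shell problem for yourself is to measure how much visible text a fetched page actually contains. The sketch below is a rough heuristic using only the standard library, not anything Firecrawl does internally; the helper names and the 200-character threshold are assumptions chosen for illustration.

```python
from html.parser import HTMLParser


class _VisibleText(HTMLParser):
    """Collects text that is not inside <script>, <style>, or <noscript>."""

    _SKIP = {"script", "style", "noscript"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self._SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self._SKIP and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that sits outside skipped elements.
        if self._skip_depth == 0 and data.strip():
            self.chunks.append(data.strip())


def visible_text(html: str) -> str:
    """Return the human-visible text of an HTML document as one string."""
    parser = _VisibleText()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


def looks_like_js_shell(html: str, min_chars: int = 200) -> bool:
    """Hypothetical heuristic: very little visible text suggests a JS shell."""
    return len(visible_text(html)) < min_chars
```

Run it on the body of a plain requests.get() response: if it returns True for a page that clearly has content in the browser, the page is probably rendered client-side and needs a JavaScript-capable fetcher.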

Firecrawl renders JavaScript, strips boilerplate, handles pagination and crawling, and returns markdown that an LLM can actually reason over. You trade a per-page cost for not maintaining a fragile parsing layer. For agent workloads, that trade is usually worth it.

Step 1: Install and Authenticate

The current SDK ships as the firecrawl package and exposes a Firecrawl class. (You may still see the older firecrawl-py package and FirecrawlApp class in tutorials — that's the legacy line.)

pip install firecrawl

Grab an API key from your Firecrawl dashboard — keys are prefixed fc-. Store it in an environment variable rather than hardcoding it:

export FIRECRAWL_API_KEY="fc-YOUR_KEY_HERE"

import os

from firecrawl import Firecrawl

# Read the key from the environment instead of hardcoding it.
firecrawl = Firecrawl(api_key=os.environ["FIRECRAWL_API_KEY"])

Step 2: Scrape a Single Page

The atomic operation is scrape. Point it at a URL, declare the formats you want, and you get back structured content.

doc = firecrawl.scrape(
    "https://example.com/article",
    formats=["markdown", "html"],
)

print(doc.markdown)   # clean, LLM-ready text

This is the call you'll reach for most: feeding a known URL into a summarizer, an extraction step, or a vector store. One scrape costs 1 credit.

Step 3: Crawl a Whole Site

When you need an entire docs site or knowledge base, crawl discovers and scrapes linked pages for you.

from firecrawl import ScrapeOptions

job = firecrawl.crawl(
    "https://docs.example.com",
    limit=100,
    scrape_options=ScrapeOptions(formats=["markdown"]),
    poll_interval=30,
)

for page in job.data:
    print(page.metadata["sourceURL"])

Crawling is the expensive operation in disguise: each page is 1 credit, so a limit=100 crawl burns up to 100 credits. Always set a limit. Use exclude_paths to skip noise like blog/* or /tag/, and prefer crawling a sitemap section over an entire domain.
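Before spending credits, it can help to dry-run your exclusions against a URL list you already have, for example one pulled from a sitemap. The sketch below is a local approximation only: filter_urls is a hypothetical helper, and it assumes exclusion patterns are regular expressions matched against the URL path. Whether Firecrawl's own exclude_paths matching behaves identically is an assumption you should verify against its documentation.

```python
import re
from urllib.parse import urlparse


def filter_urls(urls, exclude_patterns, limit):
    """Drop URLs whose path matches any exclude pattern, then cap the count."""
    compiled = [re.compile(pattern) for pattern in exclude_patterns]
    kept = []
    for url in urls:
        if len(kept) == limit:
            break  # the cap plays the role of the crawl's limit
        path = urlparse(url).path
        if any(rx.search(path) for rx in compiled):
            continue  # excluded, so it would cost nothing
        kept.append(url)
    return kept


candidates = [
    "https://docs.example.com/guide/intro",
    "https://docs.example.com/blog/2025-recap",
    "https://docs.example.com/tag/python",
    "https://docs.example.com/guide/auth",
    "https://docs.example.com/guide/billing",
]
to_fetch = filter_urls(candidates, [r"^/blog/", r"^/tag/"], limit=2)
```

The length of the returned list is the number of pages you would pay for, which makes the effect of a pattern or a lower limit visible before any credits are spent.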

For long jobs, kick off the crawl asynchronously with start_crawl and poll for status instead of blocking:

job = firecrawl.start_crawl("https://docs.example.com", limit=100)
status = firecrawl.get_crawl_status(job.id)
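The snippet above checks status once; real polling needs a loop with a delay and a deadline. Here is a minimal sketch of that loop. It takes the status function as a parameter so it stays independent of the SDK, and it assumes the returned object exposes a status attribute whose terminal values are "completed", "failed", or "cancelled". Those names are assumptions; confirm them against the SDK version you have installed.

```python
import time


def wait_for_crawl(get_status, job_id, poll_interval=5.0, timeout=600.0,
                   sleep=time.sleep, clock=time.monotonic):
    """Poll get_status(job_id) until the job reaches a terminal state.

    Raises TimeoutError if the job is still running after `timeout` seconds.
    """
    terminal_states = ("completed", "failed", "cancelled")  # assumed values
    deadline = clock() + timeout
    while True:
        status = get_status(job_id)
        if getattr(status, "status", None) in terminal_states:
            return status
        if clock() >= deadline:
            raise TimeoutError(f"crawl {job_id} still running after {timeout}s")
        sleep(poll_interval)
```

With the client from earlier, the call would look like wait_for_crawl(firecrawl.get_crawl_status, job.id). The injectable sleep and clock parameters exist only to make the loop easy to test without waiting.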

Step 4: Extract Structured Data

Markdown is great for RAG, but agents often need typed fields. Firecrawl's extraction lets you pass a schema and a prompt, and it returns JSON shaped to your model. Define the schema with Pydantic so you get validation for free:

from pydantic import BaseModel

class Product(BaseModel):
    name: str
    price: float
    in_stock: bool

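result = firecrawl.scrape(
    "https://shop.example.com/item/42",
    # v2 SDK: request JSON as a format object (json_options is the legacy form).
    formats=[{
        "type": "json",
        "schema": Product.model_json_schema(),
        "prompt": "Extract the product name, price, and stock status.",
    }],
)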

This is where Firecrawl stops being a scraper and starts being a data extraction layer — the difference between "give me the page" and "give me the three fields I care about."

The Cost Traps Nobody Warns You About

Firecrawl bills in credits, and the defaults can surprise you. The free tier grants 500 one-time API credits with 2 concurrent requests; the Hobby plan moves to 3,000 credits per month with 5 concurrent requests. Beyond that, watch three line items:

Operation	Credit cost	Watch out for
Scrape / Crawl / Map	1 per page	Unbounded crawls with no `limit`
Search	2 per 10 results	Easy to call in a loop
Stealth / Enhanced Mode	5 per page	Auto-triggers on Cloudflare-protected sites

That last row is the big one. If your targets sit behind Cloudflare or similar bot protection, Firecrawl escalates to Enhanced Mode and your effective cost quintuples to 5 credits per page. Credits also do not roll over month to month (with narrow exceptions for auto-recharge balances). Budget against your worst-case page count, not your average.
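To make the worst-case budgeting concrete, here is a small arithmetic sketch. The per-page rates and the 3,000-credit Hobby allowance come from the figures above; the function names are hypothetical, and the model ignores Search calls and any pricing changes.

```python
STANDARD_CREDITS_PER_PAGE = 1  # scrape / crawl, per the table above
ENHANCED_CREDITS_PER_PAGE = 5  # Stealth / Enhanced Mode, per the table above


def crawl_credits(pages: int, enhanced_pages: int = 0) -> int:
    """Credits for a job where `enhanced_pages` of `pages` hit Enhanced Mode."""
    if not 0 <= enhanced_pages <= pages:
        raise ValueError("enhanced_pages must be between 0 and pages")
    standard_pages = pages - enhanced_pages
    return (standard_pages * STANDARD_CREDITS_PER_PAGE
            + enhanced_pages * ENHANCED_CREDITS_PER_PAGE)


def worst_case_credits(pages: int) -> int:
    """Assume every page escalates to Enhanced Mode."""
    return crawl_credits(pages, enhanced_pages=pages)


def fits_budget(pages: int, monthly_credits: int) -> bool:
    """True if the job fits the budget even in the worst case."""
    return worst_case_credits(pages) <= monthly_credits


print(crawl_credits(100))        # 100: no page escalates
print(crawl_credits(100, 30))    # 220: 70 standard + 30 enhanced
print(worst_case_credits(100))   # 500: every page escalates
print(fits_budget(600, 3000))    # True: 600 * 5 = 3000
print(fits_budget(601, 3000))    # False: 601 * 5 = 3005
```

On these numbers, a Hobby plan safely covers about 600 pages a month if every target is bot-protected, not 3,000.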

Production Patterns

A few habits keep Firecrawl reliable at scale:

Cache aggressively. Re-scraping the same URL is wasted money. Hash the URL, store the markdown, and set a sane TTL.
Scrape, don't crawl, when you have the URLs. Crawling is for discovery. If you already know your target pages, scrape them directly and skip the credit overhead of link traversal.
Validate before you store. Pydantic schemas catch malformed extractions before bad data reaches your vector DB.
Respect the source. Just because a tool can bypass access controls doesn't mean you should. Scraping behind login walls or around explicit robots.txt denials is exactly the conduct now drawing lawsuits across the industry.
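The caching habit is the easiest of these to sketch. The example below hashes the URL, stores the markdown on disk as JSON, and treats entries older than a TTL as misses. MarkdownCache and cached_scrape are hypothetical names, the one-day default TTL is an arbitrary choice, and the scrape argument is any callable that takes a URL and returns markdown, so the cache itself has no dependency on the SDK.

```python
import hashlib
import json
import time
from pathlib import Path


class MarkdownCache:
    """File-backed cache keyed by the SHA-256 of the URL."""

    def __init__(self, directory, ttl_seconds=86400, clock=time.time):
        self.directory = Path(directory)
        self.directory.mkdir(parents=True, exist_ok=True)
        self.ttl_seconds = ttl_seconds
        self._clock = clock

    def _path(self, url):
        digest = hashlib.sha256(url.encode("utf-8")).hexdigest()
        return self.directory / f"{digest}.json"

    def get(self, url):
        """Return cached markdown, or None if missing or older than the TTL."""
        path = self._path(url)
        if not path.exists():
            return None
        entry = json.loads(path.read_text(encoding="utf-8"))
        if self._clock() - entry["fetched_at"] > self.ttl_seconds:
            return None
        return entry["markdown"]

    def put(self, url, markdown):
        entry = {"url": url, "fetched_at": self._clock(), "markdown": markdown}
        self._path(url).write_text(json.dumps(entry), encoding="utf-8")


def cached_scrape(url, cache, scrape):
    """Return cached markdown if fresh; otherwise call scrape(url) and store it."""
    hit = cache.get(url)
    if hit is not None:
        return hit
    markdown = scrape(url)
    cache.put(url, markdown)
    return markdown
```

With the client from earlier, the scrape callable could be lambda u: firecrawl.scrape(u, formats=["markdown"]).markdown, so each URL costs a credit at most once per TTL window.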

The Bottom Line

Firecrawl solves a real, boring, expensive problem: turning the live web into text your models can use. For a single static page, requests plus BeautifulSoup is still cheaper. But the moment you're dealing with JavaScript-heavy sites, multi-page crawls, or typed extraction at scale, Firecrawl's per-credit cost buys back hours of brittle parser maintenance.

Use scrape for known URLs, reserve crawl for genuine discovery and always cap it with a limit, lean on schema-based extraction when you need fields instead of prose, and keep a close eye on Enhanced Mode quietly multiplying your bill. Treat it as a metered utility rather than a free firehose, and it becomes one of the most dependable pieces of an AI data pipeline.
