TL;DR

Web Scraping to Database is a data workflow that chains Firecrawl and Supabase: schedule a Firecrawl scrape of any website and store the structured results directly in a Supabase table for analysis. Once configured, it saves ~12 hours/week of manual competitive research, eliminates brittle custom scrapers, and runs through Claude Code, Cursor, Windsurf, or any MCP-compatible AI agent.

Data · Intermediate

Web Scraping to Database

Schedule a Firecrawl scrape of any website and store the structured results directly in a Supabase table for analysis.

15 min setup, continuous data collection · 2 MCPs required · Saves ~12 hours/week of manual competitive research, plus elimination of brittle custom scrapers

How it works

🔥 Firecrawl + 🟢 Supabase · Automated

1. Schedule or trigger Firecrawl job
2. Scrape target URLs with JS rendering
3. Extract structured fields via schema
(+2 more steps)

Hostable: runs in your browser · 2/2 MCPs hosted

Run with MCPizy

Execute this recipe in your browser — no local install, no Claude Code. Streams results live.

Whitelisted MCPs: perplexity, notion, anthropic, openai, tavily, firecrawl, coingecko, stripe, slack, github, gitlab, linear, resend, sendgrid, elevenlabs, shopify, sentry, posthog, supabase-mcp, context7, deepwiki

~4k tokens · ~$0.012 est.

Why this combo?

Firecrawl handles the hard parts of scraping — JS rendering, pagination, rate limiting — and returns clean structured data. Supabase gives you a queryable database to accumulate that data over time. Together they replace a brittle custom scraper + manual CSV import workflow.

Without this workflow

Write a custom scraper that breaks every time the site updates, export CSV, import into a database manually, fix encoding issues.

With MCPizy

Configure Firecrawl once, data flows into Supabase on schedule. Query it with SQL immediately.

Business value

Concrete ROI — not marketing fluff.

Time saved

~12 hours/week of manual competitive research, plus elimination of brittle custom scrapers

Replaces a full-time data engineer maintaining scrapers ($120-180k/year) with a declarative schema
Pricing intelligence updates daily — catch competitor price drops and reprice within hours, not weeks
Zero downtime on site redesigns: Firecrawl's rendering handles JS changes your custom scraper breaks on
Historical data accumulates in SQL — enables trend analysis that one-off scrapes can never provide

Workflow steps

1. Schedule or trigger Firecrawl job
2. Scrape target URLs with JS rendering
3. Extract structured fields via schema
4. Deduplicate against existing records
5. Upsert rows into Supabase table
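
Steps 4 and 5 can be sketched locally before wiring up a real database. The example below is a minimal illustration, not the recipe's actual implementation: it uses Python's built-in sqlite3 as a stand-in for Supabase (Postgres), and the `prices` table, the `sku` and `price` columns, and the example URL are all hypothetical. The upsert key follows the hash(url + primary_key) idea from the agent prompt; SHA-256 is an assumed choice of hash.

```python
import hashlib
import sqlite3


def row_hash(url: str, primary_key: str) -> str:
    """Stable upsert key: SHA-256 over the source URL and the row's primary key."""
    return hashlib.sha256(f"{url}\x1f{primary_key}".encode("utf-8")).hexdigest()


def upsert_rows(conn: sqlite3.Connection, url: str, rows: list) -> dict:
    """Upsert scraped rows and count inserted / updated / unchanged records."""
    counts = {"rows_inserted": 0, "rows_updated": 0, "rows_unchanged": 0}
    seen = set()
    for row in rows:
        key = row_hash(url, row["sku"])
        if key in seen:  # duplicate within the same scrape, skip it
            continue
        seen.add(key)
        existing = conn.execute(
            "SELECT price FROM prices WHERE hash = ?", (key,)
        ).fetchone()
        if existing is None:
            counts["rows_inserted"] += 1
        elif existing[0] != row["price"]:
            counts["rows_updated"] += 1
        else:
            counts["rows_unchanged"] += 1
            continue  # nothing to write
        # Values are bound as parameters; only the statement text is fixed.
        conn.execute(
            "INSERT INTO prices (hash, sku, price) VALUES (?, ?, ?) "
            "ON CONFLICT (hash) DO UPDATE SET price = excluded.price",
            (key, row["sku"], row["price"]),
        )
    conn.commit()
    return counts


# Hypothetical table and data, in-memory only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE prices (hash TEXT PRIMARY KEY, sku TEXT, price REAL)")
url = "https://example.com/pricing"

first = upsert_rows(conn, url, [{"sku": "A", "price": 10.0},
                                {"sku": "B", "price": 20.0}])
second = upsert_rows(conn, url, [{"sku": "A", "price": 10.0},
                                 {"sku": "B", "price": 18.0},
                                 {"sku": "C", "price": 5.0}])
print(first)
print(second)
```

Because the key depends only on the URL and the row's primary key, re-scraping the same page maps each record to the same row, so repeated runs update in place instead of piling up duplicates.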

Use cases

Scrape competitor pricing pages daily into a queryable Supabase table
Monitor job boards and store new listings for talent pipeline tracking
Aggregate product reviews from multiple sites into one database
Track changes in public datasets by scraping and diffing over time

MCPs required

🔥 Firecrawl: Firecrawl MCP Server

🟢 Supabase: Supabase MCP Server

Agent prompt (copy into Claude Code)

This prompt is the workflow. Paste into Claude Code, Cursor, or Windsurf.

You are a scraping-to-database agent. Runs on a schedule defined in scrape-targets.yaml.

For each target (url, schema, supabase_table):
1. Call firecrawl.scrape(url=target.url, formats=["json"], json_schema=target.schema, render_js=true) to get structured rows
2. For each row, compute a stable hash(url + primary_key) to use as upsert key
3. Call supabase.execute_sql with parameterized UPSERT:
   INSERT INTO ${target.supabase_table} (...) VALUES (...) ON CONFLICT (hash) DO UPDATE SET ...
4. Track diff count: rows_inserted, rows_updated, rows_unchanged
5. If rows_updated > 0, call supabase.execute_sql to insert a changelog row in scrape_log

On rate-limit or 5xx from Firecrawl, retry with exponential backoff (3 attempts). Report row counts only.
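
The retry rule in the prompt can be sketched as follows. This is an illustration under stated assumptions: `TransientError` is a hypothetical stand-in for a rate-limit or 5xx response, "3 attempts" is read as three total tries, and the base delay and jitter values are arbitrary choices rather than anything Firecrawl specifies.

```python
import random
import time


class TransientError(Exception):
    """Hypothetical stand-in for a rate-limit (429) or 5xx response."""


def with_backoff(call, attempts=3, base_delay=1.0, sleep=time.sleep):
    """Run call(); on TransientError wait base_delay * 2**attempt (plus jitter) and retry."""
    for attempt in range(attempts):
        try:
            return call()
        except TransientError:
            if attempt == attempts - 1:
                raise  # out of attempts, surface the error
            # Delays grow 1s, 2s, 4s, ...; jitter avoids synchronized retries.
            sleep(base_delay * 2 ** attempt + random.uniform(0, 0.1))


# Demo: a fake scrape that fails twice, then succeeds.
state = {"calls": 0}


def flaky_scrape():
    state["calls"] += 1
    if state["calls"] < 3:
        raise TransientError("simulated 503")
    return "ok"


delays = []  # record delays instead of actually sleeping
result = with_backoff(flaky_scrape, sleep=delays.append)
print(result, state["calls"], [round(d) for d in delays])
```

Passing `sleep` as a parameter keeps the helper testable: the demo records the delays rather than waiting for them.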

Trigger & credentials

How this workflow fires and what env vars you need.

.env.example

Scheduled trigger

0 */6 * * *  # every 6 hours
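
To make the schedule concrete: in standard cron syntax, `0 */6 * * *` fires at minute 0 of hours 0, 6, 12, and 18. The sketch below hard-codes that one expression rather than parsing cron in general, and the start time is an arbitrary example; which timezone applies depends on the scheduler that runs the job.

```python
from datetime import datetime, timedelta


def matches(dt: datetime) -> bool:
    """True when dt satisfies '0 */6 * * *': minute 0 of every 6th hour."""
    return dt.minute == 0 and dt.hour % 6 == 0


def next_runs(start: datetime, count: int) -> list:
    """Return the next `count` fire times strictly after `start`."""
    dt = start.replace(second=0, microsecond=0) + timedelta(minutes=1)
    runs = []
    while len(runs) < count:
        if matches(dt):
            runs.append(dt)
        dt += timedelta(minutes=1)
    return runs


runs = next_runs(datetime(2024, 1, 1, 7, 30), 4)
for run in runs:
    print(run.isoformat(sep=" ", timespec="minutes"))
```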

🔥 Firecrawl (1 var)

FIRECRAWL_API_KEY: Firecrawl API key, e.g. fc-...

🟢 Supabase (2 vars)

SUPABASE_URL: Project URL, e.g. https://abcd.supabase.co

SUPABASE_SERVICE_ROLE_KEY: Service role key (server-side only, bypasses RLS), e.g. eyJhbGci...
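
Since the service role key bypasses RLS, it is worth failing fast when a variable is missing instead of discovering it mid-run. A minimal pre-flight check might look like the sketch below; the helper names are hypothetical, and only the three variable names come from this page.

```python
import os

REQUIRED = ("FIRECRAWL_API_KEY", "SUPABASE_URL", "SUPABASE_SERVICE_ROLE_KEY")


def missing_vars(env) -> list:
    """Return the required variable names that are unset or blank in env."""
    return [name for name in REQUIRED if not env.get(name, "").strip()]


def check_env(env=os.environ) -> None:
    """Raise before any scrape starts if credentials are incomplete."""
    missing = missing_vars(env)
    if missing:
        # Report names only; never echo secret values into logs.
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))


# Example with an explicit mapping instead of the real environment.
example_env = {"FIRECRAWL_API_KEY": "fc-...", "SUPABASE_URL": "https://abcd.supabase.co"}
print(missing_vars(example_env))
```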

One-command deploy

Install everything — MCPs, prompt, env template — in a single call.

$ mcpizy recipe install firecrawl-supabase-scraping

✓ Installs all 2 MCP servers
✓ Writes prompt to ~/.mcpizy/prompts/firecrawl-supabase-scraping.md
✓ Generates .env.example in current directory
✓ Ready to paste into Claude Code

Requires mcpizy CLI v1.1+ — install via npm i -g mcpizy.

Quick install (MCPs only)

15 min setup, continuous data collection

$ mcpizy install firecrawl && mcpizy install supabase

Frequently asked questions

What is this workflow?

Web Scraping to Database is a data automation that uses Firecrawl + Supabase together via the Model Context Protocol. Schedule a Firecrawl scrape of any website and store the structured results directly in a Supabase table for analysis.

How long does setup take?

Setup takes around 15 minutes; after that, data collection runs continuously. You install the required MCP servers with `mcpizy install firecrawl && mcpizy install supabase`, connect your accounts, and the workflow is ready to run.

How much time does this workflow save?

Once running, this workflow saves ~12 hours/week of manual competitive research and eliminates brittle custom scrapers. In concrete terms, it replaces a full-time data engineer maintaining scrapers ($120-180k/year) with a declarative schema, and pricing intelligence updates daily, so you can catch competitor price drops and reprice within hours, not weeks.

Which MCP servers do I need for this?

You need 2 MCP servers: Firecrawl (mcpizy install firecrawl) and Supabase (mcpizy install supabase). Both are installable in one command via the MCPizy CLI and configured in your `.claude.json` or `.cursor/mcp.json`.

Does this work with Claude Code, Cursor, and Windsurf?

Yes. The workflow runs with any MCP-compatible AI agent — Claude Code, Claude Desktop, Cursor, Windsurf, VS Code with Copilot, and custom agents built on the MCP SDK. The MCP servers are identical across clients; only the config file path (`.claude.json` vs `.cursor/mcp.json`) changes.

Start building this workflow

Install the required MCPs from the marketplace and automate this in about 15 minutes.

$ mcpizy install firecrawl && mcpizy install supabase

Free to install. Connect your accounts and this workflow runs itself.
