[PR #12] Add firecrawl-cli skill #14

opened 2026-03-07 13:58:01 +03:00 by kerem · 0 comments

Original Pull Request: https://github.com/badlogic/pi-skills/pull/12

State: open
Merged: No


Add firecrawl-cli skill: web scraping, search, and site mapping

What this adds

A new firecrawl-cli skill that gives pi-coding-agent (and Claude Code, Codex CLI, Amp, Droid) web scraping, search, and site mapping through the Firecrawl CLI.

Files:

  • firecrawl-cli/SKILL.md - Full skill definition and CLI reference
  • firecrawl-cli/rules/install.md - Installation and auth error handling
  • README.md - Updated with symlink instructions, skills table entry, and requirements

Why this fits pi-skills

This repo is built around CLI tools that give coding agents real-world capabilities - gmcli for email, gccli for calendar, gdcli for Drive, and others for web search. Firecrawl follows the same pattern: a single CLI binary that handles all web data extraction.

What Firecrawl adds to the collection:

The existing skills cover web search (finding links) and browser automation (interactive navigation). Firecrawl fills the gap between those two: content extraction. When an agent finds a relevant URL, it needs to actually read the page - and that's where Firecrawl comes in.

  • Full page scraping to clean Markdown - not search snippets, but actual page content optimized for LLM context windows
  • JavaScript-rendered content - handles SPAs and dynamic docs without spinning up a browser
  • Site-wide URL discovery - the map command finds all URLs on a domain
  • Search + scrape in one step - firecrawl search "query" --scrape returns both results and their full content
  • Image and news search - multi-source research with --sources images,news
  • Parallel batch operations - scrape many pages concurrently
  • File-based output - the -o flag writes results directly to a file, keeping large payloads out of the agent's context window

A common agent workflow shows how Firecrawl complements the existing skills:

  1. Search: "Find the Tailwind v4 migration guide" -> gets URLs
  2. Firecrawl: Scrapes those URLs to get the actual content as clean Markdown -> agent can work with it

With Firecrawl, steps 1 and 2 can also be combined: firecrawl search "Tailwind v4 migration" --scrape returns both search results and their scraped content.

Firecrawl isn't trying to replace browser automation either - it's purpose-built for content extraction. When an agent needs to read documentation, research a topic, or extract data from a website, Firecrawl returns clean LLM-optimized Markdown without needing Chrome. It handles JavaScript rendering server-side.

How agents use it

```bash
# Search and get results with scraped content
firecrawl search "error handling best practices rust" --scrape -o .firecrawl/search-rust-errors.json

# Scrape documentation to Markdown
firecrawl scrape "https://docs.rs/tokio/latest/tokio/" -o .firecrawl/tokio-docs.md

# Find all pages in a docs site
firecrawl map https://docs.rs/tokio --search "runtime" -o .firecrawl/tokio-runtime-urls.txt

# Scrape multiple pages in parallel
firecrawl scrape https://page1.com -o .firecrawl/1.md &
firecrawl scrape https://page2.com -o .firecrawl/2.md &
firecrawl scrape https://page3.com -o .firecrawl/3.md &
wait
```
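
The map and parallel-scrape commands above compose naturally: discover URLs once, then fan out the scrapes. A minimal sketch of that pattern is below - the `scrape_all` helper and the `FIRECRAWL` override are hypothetical names for illustration, not part of the skill, and it assumes the URL list file contains one URL per line:

```bash
# Hypothetical helper: scrape every URL listed in $1 (one per line),
# in parallel, writing page-N.md files into directory $2.
# FIRECRAWL lets a test substitute a stub binary; it defaults to `firecrawl`.
scrape_all() {
  urls_file=$1
  out_dir=$2
  n=0
  mkdir -p "$out_dir"
  while IFS= read -r url; do
    n=$((n + 1))
    "${FIRECRAWL:-firecrawl}" scrape "$url" -o "$out_dir/page-$n.md" &
  done < "$urls_file"
  wait
}

# Typical use, pairing `map` with the helper:
#   firecrawl map https://docs.rs/tokio -o urls.txt
#   scrape_all urls.txt .firecrawl
```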

Setup

```bash
npm install -g firecrawl-cli
firecrawl login --browser
```

The skill includes a rules/install.md with full auth error handling - if login fails, it walks the user through browser auth or manual API key entry, following the same pattern as other skills in this repo that need external auth.

Changes to README.md

  • Added firecrawl-cli to both user-level and project-level symlink instructions (Claude Code section)
  • Added entry to the Available Skills table
  • Added requirements entry with install and auth commands

I'm on Firecrawl's developer relations team. Let me know if anything needs adjusting to match your conventions - happy to iterate.
