Documentation Index
Fetch the complete documentation index at: https://firecrawl-docs-rust-improve-parse-intro.mintlify.app/llms.txt
Use this file to discover all available pages before exploring further.
Installation
The official Rust SDK is maintained in the Firecrawl monorepo at apps/rust-sdk.
To install the Firecrawl Rust SDK, add the dependency from crates.io:
[dependencies]
firecrawl = "2"
tokio = { version = "1", features = ["full"] }
serde_json = "1"
Or install via Cargo:
cargo add firecrawl
cargo add tokio --features full
cargo add serde_json
Requires Rust 1.70 or later.
Usage
- Get an API key from firecrawl.dev
- Set the API key as an environment variable named
FIRECRAWL_API_KEY, or pass it directly to Client::new(...)
Scrape a page and print its markdown:
use firecrawl::{Client, ScrapeOptions, Format};
#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
let client = Client::new("fc-YOUR-API-KEY")?;
let doc = client.scrape(
"https://firecrawl.dev",
ScrapeOptions {
formats: Some(vec![Format::Markdown]),
..Default::default()
},
).await?;
println!("{}", doc.markdown.unwrap_or_default());
Ok(())
}
The sections below cover crawling, mapping, searching, and the other SDK methods.
Scraping a URL
To scrape a single URL, use the scrape method.
use firecrawl::{Client, ScrapeOptions, Format};
let doc = client.scrape(
"https://firecrawl.dev",
ScrapeOptions {
formats: Some(vec![Format::Markdown, Format::Html]),
only_main_content: Some(true),
wait_for: Some(5000),
..Default::default()
},
).await?;
println!("{}", doc.markdown.unwrap_or_default());
if let Some(meta) = &doc.metadata {
println!("{:?}", meta.title);
}
Extract structured JSON using scrape_with_schema:
use firecrawl::Client;
use serde_json::json;
let schema = json!({
"type": "object",
"properties": {
"name": { "type": "string" },
"price": { "type": "number" }
}
});
let data = client.scrape_with_schema(
"https://example.com/product",
schema,
Some("Extract the product name and price"),
).await?;
println!("{}", serde_json::to_string_pretty(&data)?);
Or configure JSON extraction via ScrapeOptions directly:
use firecrawl::{Client, ScrapeOptions, Format, JsonOptions};
use serde_json::json;
let doc = client.scrape(
"https://example.com/product",
ScrapeOptions {
formats: Some(vec![Format::Json]),
json_options: Some(JsonOptions {
schema: Some(json!({
"type": "object",
"properties": {
"name": { "type": "string" },
"price": { "type": "number" }
}
})),
prompt: Some("Extract the product name and price".to_string()),
..Default::default()
}),
..Default::default()
},
).await?;
println!("{:?}", doc.json);
Parsing uploaded files
Convert a local document into markdown, HTML, or structured JSON with parse. Supported file types are .html, .htm, .pdf, .docx, .doc, .odt, .rtf, .xlsx, and .xls. The file uploads as multipart form data, and the call returns a Document populated with the formats you requested.
ParseOptions omits scrape-only fields that /v2/parse rejects, such as actions, waitFor, location, mobile, screenshot, branding, and changeTracking.
Build a ParseFile from in-memory bytes or directly from a path:
use firecrawl::{Client, ParseFile, ParseFormat, ParseOptions};
#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
let client = Client::new("fc-YOUR-API-KEY")?;
let file = ParseFile::from_bytes(
"upload.html",
b"<!DOCTYPE html><html><body><h1>Rust Parse</h1></body></html>".to_vec(),
)
.with_content_type("text/html");
let options = ParseOptions {
formats: Some(vec![ParseFormat::Markdown, ParseFormat::Html]),
only_main_content: Some(true),
..Default::default()
};
let doc = client.parse(file, Some(options)).await?;
println!("{}", doc.markdown.unwrap_or_default());
Ok(())
}
Or read the file from disk and omit the options:
use firecrawl::{Client, ParseFile};
let client = Client::new("fc-YOUR-API-KEY")?;
let file = ParseFile::from_path("./report.pdf")?;
let doc = client.parse(file, None).await?;
println!("{}", doc.markdown.unwrap_or_default());
ParseFile
| Constructor | Description |
|---|
ParseFile::from_bytes(filename, bytes) | Build from a filename and in-memory bytes |
ParseFile::from_path(path) | Read bytes from disk and derive the filename |
.with_content_type(content_type) | Attach a MIME type hint (e.g. text/html, application/pdf) |
ParseOptions
Supported fields (all optional, camelCase on the wire):
formats: Vec<ParseFormat> — any of Markdown, Html, RawHtml, Links, Images, Summary, Json, Attributes
only_main_content: bool
include_tags: Vec<String> / exclude_tags: Vec<String>
headers: HashMap<String, String>
timeout: u32 (ms)
parsers: Vec<ParserConfig> (e.g. PDF parser config)
skip_tls_verification: bool
remove_base64_images: bool
fast_mode: bool
block_ads: bool
proxy: ParseProxyType (Basic or Auto)
json_options: JsonOptions
attribute_selectors: Vec<AttributeSelector>
zero_data_retention: bool
integration: String, origin: String, use_mock: String
Crawling a Website
To crawl a website and wait for completion, use crawl.
use firecrawl::{Client, CrawlOptions, ScrapeOptions, Format};
let job = client.crawl(
"https://firecrawl.dev",
CrawlOptions {
limit: Some(50),
max_discovery_depth: Some(3),
scrape_options: Some(ScrapeOptions {
formats: Some(vec![Format::Markdown]),
..Default::default()
}),
..Default::default()
},
).await?;
println!("Status: {:?}", job.status);
println!("Progress: {}/{}", job.completed, job.total);
for page in &job.data {
if let Some(meta) = &page.metadata {
println!("{:?}", meta.source_url);
}
}
Start a Crawl
Start a job without waiting using start_crawl.
use firecrawl::{Client, CrawlOptions};
let start = client.start_crawl(
"https://firecrawl.dev",
CrawlOptions {
limit: Some(100),
..Default::default()
},
).await?;
println!("Job ID: {}", start.id);
Checking Crawl Status
Check crawl progress with get_crawl_status.
let status = client.get_crawl_status(&start.id).await?;
println!("Status: {:?}", status.status);
println!("Progress: {}/{}", status.completed, status.total);
Cancelling a Crawl
Cancel a running crawl with cancel_crawl.
let result = client.cancel_crawl(&start.id).await?;
println!("{:?}", result);
Checking Crawl Errors
Retrieve errors from a crawl job with get_crawl_errors.
let errors = client.get_crawl_errors(&start.id).await?;
println!("{:?}", errors);
Mapping a Website
Discover links on a site using map.
use firecrawl::{Client, MapOptions};
let response = client.map(
"https://firecrawl.dev",
MapOptions {
limit: Some(100),
search: Some("blog".to_string()),
..Default::default()
},
).await?;
for link in &response.links {
println!("{} - {}", link.url, link.title.as_deref().unwrap_or(""));
}
For a simpler result with just URLs, use map_urls:
let urls = client.map_urls("https://firecrawl.dev", None).await?;
for url in &urls {
println!("{}", url);
}
Searching the Web
Search with optional settings using search.
use firecrawl::{Client, SearchOptions};
let results = client.search(
"firecrawl web scraping",
SearchOptions {
limit: Some(10),
..Default::default()
},
).await?;
if let Some(web) = results.data.web {
for item in web {
match item {
firecrawl::SearchResultOrDocument::WebResult(r) => {
println!("{} - {}", r.url, r.title.unwrap_or_default());
}
firecrawl::SearchResultOrDocument::Document(d) => {
println!("{}", d.markdown.unwrap_or_default());
}
}
}
}
For a convenience method that returns scraped documents directly:
let docs = client.search_and_scrape("firecrawl web scraping", 5).await?;
for doc in &docs {
println!("{}", doc.markdown.as_deref().unwrap_or(""));
}
Batch Scraping
Scrape multiple URLs in parallel using batch_scrape.
use firecrawl::{Client, BatchScrapeOptions, ScrapeOptions, Format};
let urls = vec![
"https://firecrawl.dev".to_string(),
"https://firecrawl.dev/blog".to_string(),
];
let job = client.batch_scrape(
urls,
BatchScrapeOptions {
options: Some(ScrapeOptions {
formats: Some(vec![Format::Markdown]),
..Default::default()
}),
..Default::default()
},
).await?;
for doc in &job.data {
println!("{}", doc.markdown.as_deref().unwrap_or(""));
}
Agent
Run an AI-powered agent with agent.
use firecrawl::{Client, AgentOptions};
let result = client.agent(
AgentOptions {
prompt: "Find the pricing plans for Firecrawl and compare them".to_string(),
..Default::default()
},
).await?;
println!("{:?}", result.data);
With a JSON schema for structured output:
use firecrawl::{Client, AgentOptions, AgentModel};
use serde::Deserialize;
use serde_json::json;
#[derive(Debug, Deserialize)]
struct PricingPlan {
name: String,
price: String,
}
#[derive(Debug, Deserialize)]
struct PricingData {
plans: Vec<PricingPlan>,
}
let schema = json!({
"type": "object",
"properties": {
"plans": {
"type": "array",
"items": {
"type": "object",
"properties": {
"name": { "type": "string" },
"price": { "type": "string" }
}
}
}
}
});
let result: Option<PricingData> = client.agent_with_schema(
vec!["https://firecrawl.dev".to_string()],
"Extract pricing plan details",
schema,
).await?;
if let Some(data) = result {
for plan in &data.plans {
println!("{}: {}", plan.name, plan.price);
}
}
Scrape-Bound Interactive Session
Use a scrape job ID to run follow-up browser code in the same context:
interact(...) runs code or prompts in the scrape-bound browser session.
stop_interaction(...) stops the interactive session when you are done.
use firecrawl::{Client, ScrapeExecuteOptions, ScrapeExecuteLanguage};
let scrape_job_id = "550e8400-e29b-41d4-a716-446655440000";
// Run code in the browser session
let run = client.interact(
scrape_job_id,
ScrapeExecuteOptions {
code: Some("console.log(await page.title())".to_string()),
language: Some(ScrapeExecuteLanguage::Node),
timeout: Some(60),
..Default::default()
},
).await?;
println!("{:?}", run.stdout);
// Or use a natural language prompt
let run = client.interact(
scrape_job_id,
ScrapeExecuteOptions {
prompt: Some("Click the pricing tab and summarize the plans".to_string()),
..Default::default()
},
).await?;
// Stop the session when done
client.stop_interaction(scrape_job_id).await?;
Configuration
Client::new(...) and Client::new_selfhosted(...) create the client.
| Option | Description |
|---|
Client::new(api_key) | Create a client for the Firecrawl cloud service (https://api.firecrawl.dev) |
Client::new_selfhosted(api_url, api_key) | Create a client for a self-hosted Firecrawl instance |
use firecrawl::Client;
// Cloud service
let client = Client::new("fc-your-api-key")?;
// Self-hosted
let client = Client::new_selfhosted(
"http://localhost:3002",
Some("fc-your-api-key"),
)?;
// Self-hosted without authentication
let client = Client::new_selfhosted(
"http://localhost:3002",
None::<&str>,
)?;
Environment Variable
Set the FIRECRAWL_API_KEY environment variable instead of passing the key directly:
export FIRECRAWL_API_KEY=fc-YOUR-API-KEY
let api_key = std::env::var("FIRECRAWL_API_KEY")
.expect("FIRECRAWL_API_KEY must be set");
let client = Client::new(api_key)?;
Poll Intervals
Synchronous methods (crawl, batch_scrape, agent) poll until completion. You can customize the poll interval via the options struct:
use firecrawl::CrawlOptions;
let options = CrawlOptions {
limit: Some(50),
poll_interval: Some(3000), // Poll every 3 seconds (default: 2000ms)
..Default::default()
};
Error Handling
The SDK uses the FirecrawlError enum, which implements Error, Debug, and Display. All methods return Result<T, FirecrawlError>.
use firecrawl::{Client, FirecrawlError};
match client.scrape("https://example.com", None).await {
Ok(doc) => println!("{}", doc.markdown.unwrap_or_default()),
Err(FirecrawlError::HttpRequestFailed(action, status, msg)) => {
eprintln!("HTTP {}: {} ({})", status, msg, action);
}
Err(FirecrawlError::APIError(action, api_err)) => {
eprintln!("API error ({}): {}", action, api_err.error);
}
Err(FirecrawlError::JobFailed(msg)) => {
eprintln!("Job failed: {}", msg);
}
Err(FirecrawlError::Misuse(msg)) => {
eprintln!("SDK misuse: {}", msg);
}
Err(e) => eprintln!("Error: {}", e),
}
Are you an AI agent that needs a Firecrawl API key? See firecrawl.dev/agent-onboarding/SKILL.md for automated onboarding instructions.