Introduction
- Firecrawl is an open-source tool that converts websites into LLM-ready markdown or structured data.
- To install Firecrawl locally, you need to clone the repository, set up environment variables, and run the necessary services.
- Firecrawl can also be installed on a Kubernetes cluster for more advanced deployments.
- The installation process involves setting up dependencies like Node.js, pnpm, and Redis.
- Firecrawl offers both Python and Node SDKs for easier integration and usage.
Local Installation [1]
- Clone the repository: `git clone https://github.com/mendableai/firecrawl.git`.
- Navigate to the project directory: `cd firecrawl`.
- Copy the example environment file: `cp ./apps/api/.env.example ./.env`.
- Edit the `.env` file to set `USE_DB_AUTHENTICATION=false`.
- Update the Redis URL in the `.env` file: `REDIS_URL=redis://localhost:6379`.
- Run the local instance: `pnpm install` and then `pnpm run dev`.
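The two `.env` edits above can also be scripted. The following is a minimal sketch, assuming the file uses plain `KEY=value` lines; the helper names `set_env_values` and `update_env_file` are hypothetical and are not part of Firecrawl.

```python
from pathlib import Path

# Values taken from the local installation steps above.
LOCAL_SETTINGS = {
    "USE_DB_AUTHENTICATION": "false",
    "REDIS_URL": "redis://localhost:6379",
}


def set_env_values(env_text: str, updates: dict[str, str]) -> str:
    """Return env_text with every key in `updates` set to KEY=value.

    Existing assignments are replaced in place, commented lines are left
    untouched, and keys that are missing from the file are appended.
    """
    remaining = dict(updates)
    out_lines = []
    for line in env_text.splitlines():
        key = line.split("=", 1)[0].strip()
        is_assignment = "=" in line and not line.lstrip().startswith("#")
        if is_assignment and key in remaining:
            out_lines.append(f"{key}={remaining.pop(key)}")
        else:
            out_lines.append(line)
    # Append any keys that did not already appear in the file.
    out_lines.extend(f"{k}={v}" for k, v in remaining.items())
    return "\n".join(out_lines) + "\n"


def update_env_file(path: str, updates: dict[str, str]) -> None:
    """Hypothetical helper: rewrite the .env file at `path` in place."""
    env_path = Path(path)
    env_path.write_text(set_env_values(env_path.read_text(), updates))
```

From the repository root, `update_env_file(".env", LOCAL_SETTINGS)` would apply both settings after the `cp` step.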
Kubernetes Installation [2]
- Clone the repository: `git clone https://github.com/mendableai/firecrawl.git`.
- Navigate to the project directory: `cd firecrawl`.
- Copy the example environment file: `cp ./apps/api/.env.example ./.env`.
- Edit the `.env` file to set `USE_DB_AUTHENTICATION=false`.
- Update the Redis URL in the `.env` file: `REDIS_URL=redis://redis:6379`.
- Follow the instructions in `examples/kubernetes-cluster-install/README.md` for Kubernetes setup.
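The Redis host differs between the two setups: `localhost` for a local run and `redis` inside the cluster. A quick way to confirm that a given `REDIS_URL` is reachable from where the API will run is a plain TCP check. This is a sketch using only the Python standard library; the function names are illustrative, and a successful connection only shows that the port is open, not that Redis itself is healthy.

```python
import socket
from urllib.parse import urlparse


def redis_endpoint(redis_url: str) -> tuple[str, int]:
    """Split a redis:// URL into (host, port), defaulting to port 6379."""
    parsed = urlparse(redis_url)
    if parsed.scheme not in ("redis", "rediss") or not parsed.hostname:
        raise ValueError(f"not a Redis URL: {redis_url!r}")
    return parsed.hostname, parsed.port or 6379


def can_connect(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, DNS failures, and timeouts.
        return False
```

For example, `can_connect(*redis_endpoint("redis://redis:6379"))` would be expected to succeed only from a pod that can resolve the `redis` service name.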
Setting Up Dependencies [3]
- Install Node.js: Follow the instructions on the Node.js website.
- Install pnpm: Follow the instructions on the pnpm website.
- Install Redis: Follow the instructions on the Redis website.
- Set environment variables in a `.env` file in the `/apps/api/` directory.
- Use the template in `.env.example` to set up your `.env` file.
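Before running the project, it can help to confirm that the three dependencies are actually on the `PATH`. The sketch below assumes the usual executable names `node`, `pnpm`, and `redis-server`; these names are an assumption and may differ on some systems (for example, when Redis runs in a container instead).

```python
import shutil

# Assumed executable names for Node.js, pnpm, and Redis.
REQUIRED_TOOLS = ("node", "pnpm", "redis-server")


def missing_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the subset of `tools` that cannot be found on PATH.

    `which` is injectable so the check can be exercised without touching
    the real environment.
    """
    return [tool for tool in tools if which(tool) is None]
```

Calling `missing_tools()` returns an empty list when all three executables are found.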
Using Python SDK [4]
- Install the Firecrawl Python SDK: `pip install firecrawl-py`.
- Get an API key from firecrawl.dev.
- Set the API key as an environment variable named `FIRECRAWL_API_KEY` or pass it as a parameter to the `FirecrawlApp` class.
- Scrape a URL: `app.scrape_url('https://example.com')`.
- Crawl a website: `app.crawl_url('https://example.com', params={'pageOptions': {'onlyMainContent': True}})`.
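Put together, the Python steps above might look like the following sketch. It uses only the calls named in this section (`FirecrawlApp`, `scrape_url`, `crawl_url`); the helper functions are hypothetical, the `api_key` keyword is assumed from the SDK's documented usage, and the shape of the returned data is not assumed here. The `params` layout shown may differ between SDK versions.

```python
import os


def resolve_api_key(api_key=None):
    """Use the explicit key if given, else fall back to FIRECRAWL_API_KEY."""
    key = api_key or os.environ.get("FIRECRAWL_API_KEY")
    if not key:
        raise RuntimeError("Set FIRECRAWL_API_KEY or pass api_key explicitly.")
    return key


def build_app(api_key=None):
    """Create a FirecrawlApp; requires `pip install firecrawl-py`."""
    # Imported lazily so the rest of the module loads without the SDK.
    from firecrawl import FirecrawlApp

    return FirecrawlApp(api_key=resolve_api_key(api_key))


def scrape_and_crawl(app, url):
    """Scrape a single page, then crawl the site keeping only main content."""
    page = app.scrape_url(url)
    pages = app.crawl_url(url, params={"pageOptions": {"onlyMainContent": True}})
    return page, pages
```

With a valid key in the environment, `scrape_and_crawl(build_app(), "https://example.com")` performs both calls against the hosted API.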
Using Node SDK [5]
- Install the Firecrawl Node SDK: `npm install @mendable/firecrawl-js`.
- Get an API key from firecrawl.dev.
- Set the API key as an environment variable named `FIRECRAWL_API_KEY` or pass it as a parameter to the `FirecrawlApp` class.
- Scrape a URL: `app.scrapeUrl('https://example.com')`.
- Crawl a website: `app.crawlUrl('https://example.com', {crawlerOptions: {excludes: ['blog/'], limit: 1000}, pageOptions: {onlyMainContent: true}}, true, 5)`. JavaScript has no keyword arguments, so the options object, `waitUntilDone` (`true`), and `timeout` (`5`) are passed positionally.