ScrapeLLM enforces per-account rate limits on synchronous requests and manages queue depth for async jobs. This guide covers how to scale efficiently without hitting limits.

Rate limits

All accounts are subject to per-second and per-minute request rate limits. Limits are stored per-user and enforced by the API gateway. When you exceed a limit, you receive HTTP 429:
{
  "detail": "Rate limit exceeded. Retry after a moment."
}
Implement exponential backoff on 429 responses. See Error handling for a code example.

Pattern 1: Async jobs

For large batches, submit async jobs and poll for results. This avoids holding HTTP connections open and lets the ScrapeLLM job queue handle concurrency automatically.
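import time

import requests

API_KEY = "YOUR_API_KEY"

def submit_job(prompt, country="US"):
    # Queue a scrape job and return its ID.
    resp = requests.post(
        "https://api.scrapellm.com/scrapers/chatgpt/jobs",
        headers={"X-API-Key": API_KEY},
        params={"prompt": prompt, "country": country},
    )
    resp.raise_for_status()  # surface a 429 or other error instead of a KeyError
    return resp.json()["job_id"]

def poll_job(job_id, interval=3):
    # Check the job every `interval` seconds until it reaches a terminal status.
    while True:
        resp = requests.get(
            f"https://api.scrapellm.com/jobs/{job_id}",
            headers={"X-API-Key": API_KEY},  # assumption: status checks need the same key
        )
        resp.raise_for_status()
        job = resp.json()
        if job["status"] in ("done", "failed"):
            return job
        time.sleep(interval)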

prompts = [
    "Best CRM for small business?",
    "Top email marketing tools?",
    "Leading project management software?",
]

job_ids = [submit_job(p) for p in prompts]
results = [poll_job(jid) for jid in job_ids]

for result in results:
    if result["status"] == "done":
        print(result["result"]["result"][:200])
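
The example above waits for each job in turn and never gives up on a job that stays pending. A minimal sketch of an alternative, which checks every pending job once per round and stops at a deadline, is shown below. The names `wait_for_jobs`, `fetch_job`, and the `timeout` default are hypothetical and not part of the ScrapeLLM API; only the `"done"` and `"failed"` status values come from the example above.

```python
import time
from typing import Callable, Dict, Iterable

TERMINAL_STATUSES = ("done", "failed")  # status values used in the example above


def wait_for_jobs(
    job_ids: Iterable[str],
    fetch_job: Callable[[str], dict],      # hypothetical: returns the job dict for an ID
    interval: float = 3.0,
    timeout: float = 300.0,                # assumed default, tune for your workload
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, dict]:
    """Poll every pending job once per round until all finish or time runs out."""
    pending = list(job_ids)
    finished: Dict[str, dict] = {}
    deadline = time.monotonic() + timeout

    while pending:
        still_pending = []
        for job_id in pending:
            job = fetch_job(job_id)
            if job.get("status") in TERMINAL_STATUSES:
                finished[job_id] = job
            else:
                still_pending.append(job_id)
        pending = still_pending
        if not pending:
            break
        if time.monotonic() >= deadline:
            raise TimeoutError(f"{len(pending)} job(s) still pending: {pending}")
        sleep(interval)

    return finished
```

Here `fetch_job` could be any function that requests `https://api.scrapellm.com/jobs/{job_id}` and returns the decoded JSON. Passing it in as a parameter keeps the polling logic easy to test without network access.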

Pattern 2: Concurrent workers (sync requests)

For smaller batches where you need immediate results, send sync requests concurrently, but cap the concurrency so that you stay within your rate limits.
import asyncio

import aiohttp

API_KEY = "YOUR_API_KEY"
MAX_CONCURRENT = 5  # Max requests in flight; keep this within your plan's rate limit

async def scrape(session, prompt, semaphore):
    # The semaphore caps how many requests run at the same time.
    async with semaphore:
        params = {"prompt": prompt, "country": "US"}
        async with session.get(
            "https://api.scrapellm.com/scrapers/chatgpt",
            headers={"X-API-Key": API_KEY},
            params=params,
        ) as resp:
            return await resp.json()

async def main(prompts):
    semaphore = asyncio.Semaphore(MAX_CONCURRENT)
    async with aiohttp.ClientSession() as session:
        tasks = [scrape(session, p, semaphore) for p in prompts]
        return await asyncio.gather(*tasks)

prompts = ["Prompt 1", "Prompt 2", "Prompt 3"]
results = asyncio.run(main(prompts))
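
A semaphore limits how many requests are in flight, but not how many start per second, so fast responses can still exceed a per-second limit. One way to cover both, sketched below, is to pace request starts with a small limiter in addition to the semaphore. `RateLimiter`, `run_limited`, and the rate you pass in are hypothetical helpers and values, not part of the ScrapeLLM API; check your plan for the real limits.

```python
import asyncio
import time


class RateLimiter:
    """Space out calls so that at most `rate` of them start per second."""

    def __init__(self, rate: float):
        self._min_gap = 1.0 / rate   # minimum seconds between request starts
        self._next_slot = 0.0        # earliest time the next request may start

    async def wait(self) -> None:
        # No await occurs between reading and updating _next_slot, so this
        # reservation step is atomic within a single event loop.
        now = time.monotonic()
        slot = max(now, self._next_slot)
        self._next_slot = slot + self._min_gap
        if slot > now:
            await asyncio.sleep(slot - now)


async def run_limited(coro_fns, limiter, max_concurrent=5):
    """Run zero-argument coroutine functions under both limits."""
    semaphore = asyncio.Semaphore(max_concurrent)

    async def worker(fn):
        await limiter.wait()      # pace request starts
        async with semaphore:     # cap requests in flight
            return await fn()

    return await asyncio.gather(*(worker(fn) for fn in coro_fns))
```

To combine this with the example above, each entry in `coro_fns` could be a small wrapper that calls `session.get(...)` for one prompt and returns the decoded JSON.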

Quick reference

| Use case | Pattern | When to use |
| --- | --- | --- |
| Large batches | Async jobs | Non-time-sensitive, 10+ prompts |
| Real-time results | Concurrent sync workers | Need immediate responses, smaller batches |

Common questions

Why am I getting 429 errors?

You’ve exceeded your plan’s rate limit. Options:
  • Implement exponential backoff and retry
  • Switch to async job mode — jobs are queued server-side
  • Upgrade your plan for higher limits
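
The first option can be sketched as a small retry wrapper. This is an illustration under assumptions, not the official client: `retry_on_429` is a hypothetical helper, and the retry count and delays are placeholder values. It only relies on the response object exposing `status_code`, as `requests` responses do.

```python
import random
import time
from typing import Any, Callable


def retry_on_429(
    send: Callable[[], Any],               # zero-argument function that performs the request
    max_retries: int = 5,                  # assumed default
    base_delay: float = 1.0,               # assumed default, in seconds
    max_delay: float = 30.0,               # assumed cap, in seconds
    sleep: Callable[[float], None] = time.sleep,
):
    """Call `send()` until the response is not a 429 or retries run out."""
    for attempt in range(max_retries + 1):
        resp = send()
        if resp.status_code != 429 or attempt == max_retries:
            return resp
        # Double the wait on each retry, up to max_delay.
        delay = min(max_delay, base_delay * 2 ** attempt)
        # Jitter keeps parallel clients from retrying in lockstep.
        sleep(delay + random.uniform(0, delay / 2))
```

With `requests`, a call might look like `retry_on_429(lambda: requests.get(url, headers={"X-API-Key": API_KEY}, params=params))`. If every attempt is rate limited, the last 429 response is returned so the caller can decide what to do next.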

What’s the best approach for processing 100+ prompts?

Use async jobs. Submit all jobs first, then poll for results. This decouples submission from processing and lets the queue handle concurrency automatically.

Can I increase my rate limit?

Yes — higher-tier plans include higher rate limits. Contact [email protected] for custom limits on enterprise volumes.