Async jobs

Async mode lets you submit a scrape and get a job_id back immediately - no open connection required. The scrape runs in the background, is automatically retried up to 3 times on failure, and credits are restored if every attempt fails. Use async when:

You’re processing 10+ prompts and don’t need instant results
Your prompts may take a long time and you don’t want to hold a connection open
You’re running a background pipeline or cron job

Use sync when:

You need the result immediately in the same request cycle
You’re processing a single prompt in response to a user action

How it works

POST /scrapers/{scraper}/jobs   →  { job_id, status: "pending" }   (HTTP 202)
GET  /jobs/{job_id}             →  { status: "pending" }            (poll)
GET  /jobs/{job_id}             →  { status: "done", result: {...} } (done)

Step 1 - Submit the job

POST https://api.scrapellm.com/scrapers/{scraper}/jobs - replace {scraper} with any of: chatgpt, perplexity, grok, copilot, gemini, google_ai_mode, amazon_rufus. Pass the same query parameters as the sync endpoint.

curl -X POST "https://api.scrapellm.com/scrapers/chatgpt/jobs" \
  -H "X-API-Key: YOUR_API_KEY" \
  -G \
  --data-urlencode "prompt=What brands do marketers recommend for email automation?" \
  --data-urlencode "country=US"

import requests

resp = requests.post(
    "https://api.scrapellm.com/scrapers/chatgpt/jobs",
    headers={"X-API-Key": "YOUR_API_KEY"},
    params={
        "prompt": "What brands do marketers recommend for email automation?",
        "country": "US",
    }
)

print(resp.status_code)   # 202
print(resp.json())        # { "job_id": "3f7a2b1c-...", "status": "pending" }
job_id = resp.json()["job_id"]

const params = new URLSearchParams({
  prompt: "What brands do marketers recommend for email automation?",
  country: "US",
});

const resp = await fetch(
  `https://api.scrapellm.com/scrapers/chatgpt/jobs?${params}`,
  { method: "POST", headers: { "X-API-Key": "YOUR_API_KEY" } }
);

// HTTP 202
const { job_id } = await resp.json();

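package main

import (
    "fmt"
    "io"
    "log"
    "net/http"
)

func main() {
    req, err := http.NewRequest("POST",
        "https://api.scrapellm.com/scrapers/chatgpt/jobs?prompt=What+brands+do+marketers+recommend%3F&country=US",
        nil,
    )
    if err != nil {
        log.Fatal(err)
    }
    req.Header.Set("X-API-Key", "YOUR_API_KEY")

    resp, err := http.DefaultClient.Do(req)
    if err != nil {
        log.Fatal(err)
    }
    defer resp.Body.Close() // release the connection once the body is read

    body, err := io.ReadAll(resp.Body)
    if err != nil {
        log.Fatal(err)
    }
    fmt.Println(string(body)) // { "job_id": "...", "status": "pending" }
}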
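
The submit examples above read job_id straight from the response. A minimal sketch of a more defensive check follows, assuming only what the flow above shows: a successful submit is HTTP 202 with a JSON body containing job_id. The names `extract_job_id` and `SubmitError` are hypothetical, and the shape of error responses is not documented here, so the body is simply echoed in the message.

```python
class SubmitError(RuntimeError):
    """Raised when a job submission does not look like the documented 202 response."""


def extract_job_id(status_code: int, payload: dict) -> str:
    """Return the job_id from a submit response, or raise SubmitError.

    Assumption: success is HTTP 202 with a "job_id" field, as in "How it works".
    """
    if status_code != 202:
        raise SubmitError(f"Unexpected status {status_code}: {payload!r}")
    job_id = payload.get("job_id")
    if not job_id:
        raise SubmitError(f"Response has no job_id: {payload!r}")
    return job_id
```

With the requests example above, this would be called as `job_id = extract_job_id(resp.status_code, resp.json())`.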

Step 2 - Poll for the result

GET https://api.scrapellm.com/jobs/{job_id} - no authentication required. Poll every few seconds until status is "done" or "failed". Most scrapes complete in 5–30 seconds.

import time, requests

def wait_for_job(job_id, interval=3):
    while True:
        job = requests.get(
            f"https://api.scrapellm.com/jobs/{job_id}"
        ).json()

        if job["status"] == "done":
            return job["result"]
        if job["status"] == "failed":
            raise Exception(f"Job failed: {job.get('error')}")

        time.sleep(interval)

result = wait_for_job(job_id)
print(result["result"])        # plain text response
print(result["search_queries"]) # query fan-out (ChatGPT)

async function waitForJob(jobId, intervalMs = 3000) {
  while (true) {
    const resp = await fetch(`https://api.scrapellm.com/jobs/${jobId}`);
    const job = await resp.json();

    if (job.status === "done") return job.result;
    if (job.status === "failed") throw new Error(job.error);

    await new Promise(r => setTimeout(r, intervalMs));
  }
}

const result = await waitForJob(job_id);
console.log(result.result);

JOB_ID="3f7a2b1c-9e4d-4f8a-b2c1-7d6e5f4a3b2c"

while true; do
  RESPONSE=$(curl -s "https://api.scrapellm.com/jobs/$JOB_ID")
  STATUS=$(echo "$RESPONSE" | python3 -c "import sys,json; print(json.load(sys.stdin)['status'])")
  echo "Status: $STATUS"
  if [ "$STATUS" = "done" ] || [ "$STATUS" = "failed" ]; then
    echo "$RESPONSE"
    break
  fi
  sleep 3
done
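
The polling loops above run until the job reaches a final status. In a pipeline you may prefer to give up after a fixed time. The sketch below adds a deadline; `max_wait` and its 120-second default are assumptions, not API limits, and `fetch` is a hypothetical callable you supply that returns the parsed JSON of GET /jobs/{job_id}.

```python
import time
from typing import Callable


def wait_for_job(
    job_id: str,
    fetch: Callable[[str], dict],
    interval: float = 3.0,
    max_wait: float = 120.0,  # assumed client-side limit, not an API setting
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> dict:
    """Poll until the job is "done" or "failed", or raise TimeoutError.

    The full job payload is returned so the caller can inspect status,
    result and error.
    """
    deadline = clock() + max_wait
    while True:
        job = fetch(job_id)
        if job["status"] in ("done", "failed"):
            return job
        # Stop if the next poll would land after the deadline.
        if clock() + interval > deadline:
            raise TimeoutError(f"Job {job_id} still pending after {max_wait}s")
        sleep(interval)
```

With requests, `fetch` could be `lambda jid: requests.get(f"https://api.scrapellm.com/jobs/{jid}").json()`. Timing out on the client does not stop the job; it keeps running and can be polled again later.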

Job status response

| Field | Type | Description |
| --- | --- | --- |
| `job_id` | string | The unique job UUID |
| `status` | string | `pending` · `done` · `failed` |
| `result` | object | Full scrape response - present when `status` is `"done"` |
| `error` | string | Error message - present when `status` is `"failed"` |
| `created_at` | string | ISO 8601 UTC timestamp when the job was submitted |
| `completed_at` | string | ISO 8601 UTC timestamp when the job finished. `null` while pending |

Jobs are retained for 24 hours. After that, GET /jobs/{job_id} returns 404.
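
If you want typed access to these fields, one option is a small dataclass. This is a sketch only: `JobStatus` and `_parse_ts` are hypothetical names, and it assumes the timestamps are in a form `datetime.fromisoformat` accepts once a trailing "Z" is normalized.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


def _parse_ts(value: Optional[str]) -> Optional[datetime]:
    """Parse an ISO 8601 timestamp; a trailing "Z" is rewritten as +00:00."""
    if value is None:
        return None
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


@dataclass
class JobStatus:
    job_id: str
    status: str
    result: Optional[dict]      # present when status is "done"
    error: Optional[str]        # present when status is "failed"
    created_at: datetime
    completed_at: Optional[datetime]  # None while pending

    @classmethod
    def from_json(cls, payload: dict) -> "JobStatus":
        return cls(
            job_id=payload["job_id"],
            status=payload["status"],
            result=payload.get("result"),
            error=payload.get("error"),
            created_at=_parse_ts(payload["created_at"]),
            completed_at=_parse_ts(payload.get("completed_at")),
        )

    @property
    def duration(self) -> Optional[timedelta]:
        """Time from submission to completion, or None while pending."""
        if self.completed_at is None:
            return None
        return self.completed_at - self.created_at
```

`JobStatus.from_json(resp.json()).duration` then gives the wall-clock time a finished job took.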

Job lifecycle

submit  →  pending  →  done
                    →  failed (after up to 3 automatic retries)

Credits are deducted at submit time. If all retry attempts fail, credits are automatically restored.
Retries are automatic. Failed scrapes are retried up to 3 times before the job is marked failed.
No cancellation. Once submitted, a job runs to completion. Avoid submitting jobs you don’t intend to use.
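
On the client side, the lifecycle above reduces to one non-terminal state and two terminal ones. A minimal sketch of encoding that follows; the names `Status`, `is_terminal` and `check_transition` are hypothetical, and retries are not modeled because they happen server-side while the job still reports pending.

```python
from enum import Enum


class Status(str, Enum):
    PENDING = "pending"
    DONE = "done"
    FAILED = "failed"


# Transitions read off the lifecycle diagram: pending may move to done or
# failed; done and failed have no outgoing transitions.
ALLOWED = {
    Status.PENDING: {Status.DONE, Status.FAILED},
    Status.DONE: set(),
    Status.FAILED: set(),
}


def is_terminal(status: str) -> bool:
    """True once polling can stop."""
    return not ALLOWED[Status(status)]


def check_transition(previous: str, current: str) -> None:
    """Raise ValueError if two consecutive poll results contradict the lifecycle."""
    prev, cur = Status(previous), Status(current)
    if prev != cur and cur not in ALLOWED[prev]:
        raise ValueError(f"Unexpected transition {prev.value} -> {cur.value}")
```

A polling loop can use `is_terminal(job["status"])` as its exit condition instead of comparing against two string literals.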

Batch processing

Submit all jobs first, then poll - don’t submit-and-wait serially.

import time, requests

API_KEY = "YOUR_API_KEY"
SCRAPER = "chatgpt"

prompts = [
    "Best CRM for small business?",
    "Top email marketing platforms?",
    "Leading project management tools?",
    "Best accounting software for startups?",
    "Top HR software for SMBs?",
]

# 1. Submit all jobs
job_ids = []
for prompt in prompts:
    resp = requests.post(
        f"https://api.scrapellm.com/scrapers/{SCRAPER}/jobs",
        headers={"X-API-Key": API_KEY},
        params={"prompt": prompt, "country": "US"},
    )
    job_ids.append(resp.json()["job_id"])
    print(f"Submitted: {job_ids[-1]}")

# 2. Poll all until done
def poll(job_id):
    while True:
        job = requests.get(f"https://api.scrapellm.com/jobs/{job_id}").json()
        if job["status"] in ("done", "failed"):
            return job
        time.sleep(3)

results = [poll(jid) for jid in job_ids]

# 3. Process results
for prompt, result in zip(prompts, results):
    if result["status"] == "done":
        print(f"\n{prompt}")
        print(result["result"]["result"][:300])
    else:
        print(f"\n{prompt} → FAILED: {result.get('error')}")

const API_KEY = "YOUR_API_KEY";
const SCRAPER = "chatgpt";

const prompts = [
  "Best CRM for small business?",
  "Top email marketing platforms?",
  "Leading project management tools?",
];

async function submitJob(prompt) {
  const params = new URLSearchParams({ prompt, country: "US" });
  const resp = await fetch(
    `https://api.scrapellm.com/scrapers/${SCRAPER}/jobs?${params}`,
    { method: "POST", headers: { "X-API-Key": API_KEY } }
  );
  const { job_id } = await resp.json();
  console.log("Submitted:", job_id);
  return job_id;
}

async function pollJob(jobId) {
  while (true) {
    const resp = await fetch(`https://api.scrapellm.com/jobs/${jobId}`);
    const job = await resp.json();
    if (job.status === "done" || job.status === "failed") return job;
    await new Promise(r => setTimeout(r, 3000));
  }
}

// 1. Submit all at once
const jobIds = await Promise.all(prompts.map(submitJob));

// 2. Poll all in parallel
const results = await Promise.all(jobIds.map(pollJob));

// 3. Process
results.forEach((result, i) => {
  if (result.status === "done") {
    console.log(`\n${prompts[i]}`);
    console.log(result.result.result.slice(0, 300));
  } else {
    console.log(`\n${prompts[i]} → FAILED: ${result.error}`);
  }
});

Submit all jobs before polling any of them. This way all scrapes run in parallel on ScrapeLLM’s infrastructure rather than waiting for each one sequentially.
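
The Python batch example polls jobs one at a time, so a job that finished early is only collected after the jobs ahead of it. A round-robin variant is sketched below: each round checks every unfinished job once, then sleeps. `poll_all` and the injected `fetch` callable are hypothetical names, and the loop has no deadline, so add one if jobs might never finish from your point of view.

```python
import time
from typing import Callable, Dict, Iterable


def poll_all(
    job_ids: Iterable[str],
    fetch: Callable[[str], dict],
    interval: float = 3.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, dict]:
    """Poll every unfinished job once per round until all are done or failed.

    fetch(job_id) must return the parsed JSON of GET /jobs/{job_id}.
    Returns a dict mapping job_id to its final job payload.
    """
    pending = list(job_ids)
    finished: Dict[str, dict] = {}
    while pending:
        still_pending = []
        for job_id in pending:
            job = fetch(job_id)
            if job["status"] in ("done", "failed"):
                finished[job_id] = job
            else:
                still_pending.append(job_id)
        pending = still_pending
        if pending:
            sleep(interval)  # only wait when something is left to check
    return finished
```

Results come back keyed by job_id, so keep the prompt-to-job_id mapping from the submit step if you need to pair them up again.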

Cross-scraper batching

Submit the same prompt to multiple scrapers simultaneously to compare AI responses:

import time, requests

API_KEY = "YOUR_API_KEY"
PROMPT = "What CRM do sales teams recommend?"
SCRAPERS = ["chatgpt", "perplexity", "grok", "gemini"]

# Submit to all scrapers at once
job_ids = {
    scraper: requests.post(
        f"https://api.scrapellm.com/scrapers/{scraper}/jobs",
        headers={"X-API-Key": API_KEY},
        params={"prompt": PROMPT, "country": "US"},
    ).json()["job_id"]
    for scraper in SCRAPERS
}

# Poll all
def poll(job_id):
    while True:
        job = requests.get(f"https://api.scrapellm.com/jobs/{job_id}").json()
        if job["status"] in ("done", "failed"):
            return job
        time.sleep(3)

results = {scraper: poll(jid) for scraper, jid in job_ids.items()}

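for scraper, result in results.items():
    print(f"\n── {scraper.upper()} ──")
    if result["status"] == "done":
        print(result["result"]["result"][:300])
    else:
        # Report failures instead of silently skipping the scraper
        print(f"FAILED: {result.get('error')}")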
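
Once all scrapers have responded, one simple way to compare them is to check which responses mention which brand. The helper below is an illustrative sketch, not part of the API: `brand_mentions` is a hypothetical name, the brand list is whatever you choose to look for, and matching is a plain case-insensitive substring test.

```python
from typing import Dict, Iterable, List


def brand_mentions(results: Dict[str, dict], brands: Iterable[str]) -> Dict[str, List[str]]:
    """Map each brand to the scrapers whose response text mentions it.

    `results` maps scraper name to a job payload, as built by the
    cross-scraper example above.
    """
    mentions: Dict[str, List[str]] = {brand: [] for brand in brands}
    for scraper, job in results.items():
        if job.get("status") != "done":
            continue  # failed jobs have no result text to search
        text = job["result"]["result"].lower()
        for brand in mentions:
            if brand.lower() in text:
                mentions[brand].append(scraper)
    return mentions
```

Substring matching will over-count short or ambiguous names, so treat the output as a first pass rather than a measurement.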

Common questions

How long do jobs take?

Most jobs complete in 5–30 seconds. Complex prompts with deep reasoning (e.g. Grok MODEL_MODE_EXPERT) may take up to 60 seconds. If needed, set timeout to as much as 600 seconds on the job submit request.

Do failed jobs consume credits?

No. Credits are deducted at submission but automatically restored if all 3 retry attempts fail.

Can I cancel a submitted job?

No. Once submitted, a job runs to completion. Only submit jobs you intend to use.

What polling interval should I use?

3 seconds is a reasonable default. Most scrapes complete in under 30 seconds, so you’d typically make 5–10 poll requests per job. Polling more often than once per second provides no benefit.
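
As a rough check on the request count, the sketch below estimates polls per job under one assumed model: the first poll fires immediately after submit (as in the loops earlier on this page) and further polls follow at a fixed interval. `estimated_polls` is a hypothetical helper, and the 1-second floor reflects the advice above.

```python
import math


def estimated_polls(duration_s: float, interval_s: float = 3.0) -> int:
    """Estimate GET /jobs/{job_id} calls for a job that finishes after duration_s.

    Counts the immediate first poll plus one poll per elapsed interval.
    """
    interval_s = max(interval_s, 1.0)  # sub-second polling provides no benefit
    return math.ceil(duration_s / interval_s) + 1


print(estimated_polls(12))  # 5
print(estimated_polls(27))  # 10
```

Under this model a 12-second job costs 5 requests and a 27-second job costs 10 at the default interval.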

How long are jobs retained?

24 hours from submission. After that, GET /jobs/{job_id} returns HTTP 404.
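
If you store job IDs for later retrieval, it helps to treat the 404 as a distinct case. The sketch below is one way to do that. `JobExpired`, `expires_at` and `interpret_job_response` are hypothetical names, and treating any 2xx status as a successful poll is an assumption, since only the 404 is documented here.

```python
from datetime import datetime, timedelta
from typing import Optional

RETENTION = timedelta(hours=24)  # retention window, counted from submission


class JobExpired(LookupError):
    """The job is past its retention window, or the job_id is unknown."""


def expires_at(created_at: datetime) -> datetime:
    """Earliest time at which GET /jobs/{job_id} may start returning 404."""
    return created_at + RETENTION


def interpret_job_response(status_code: int, payload: Optional[dict]) -> dict:
    """Return the job payload, or raise for 404 and other non-2xx statuses."""
    if status_code == 404:
        raise JobExpired("Job not found: expired after 24 hours or unknown job_id")
    if not 200 <= status_code < 300 or payload is None:
        raise RuntimeError(f"Unexpected status {status_code}")
    return payload
```

Persist the result as soon as a job is done rather than relying on fetching it again later.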

What if the job returns `failed` after retries?

The scrape encountered an unrecoverable error after 3 attempts. Credits are restored automatically. Resubmit with a more specific prompt, or try bypass_cache=true. For persistent failures on the same prompt, contact support with the job_id.
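
The resubmit step can be sketched as below, assuming bypass_cache is passed as a query parameter alongside prompt and country. `resubmit_if_failed` is a hypothetical helper, and `submit` is a callable you supply that POSTs the given parameters to /scrapers/{scraper}/jobs and returns the new job_id.

```python
from typing import Callable


def resubmit_if_failed(job: dict, params: dict, submit: Callable[[dict], str]) -> str:
    """Return a new job_id if `job` failed, otherwise the original job_id.

    The original params are left untouched; the retry adds bypass_cache=true.
    """
    if job["status"] != "failed":
        return job["job_id"]
    retry_params = {**params, "bypass_cache": "true"}
    return submit(retry_params)
```

Resubmit once rather than in a loop: the job has already been retried automatically, so repeated failures on the same prompt are better reported with the job_id.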
