
Apify


Run and manage Apify Actors via REST API to scrape websites, crawl pages, extract data, and retrieve results from Apify datasets and key-value stores.

Source: ClawHub. View on ClawSkills.

2.6k downloads, 5 favorites, 24 views


About Apify

---
name: apify
description: Run Apify Actors (web scrapers, crawlers, automation tools) and retrieve their results using the Apify REST API with curl. Use when the user wants to scrape a website, extract data from the web, run an Apify Actor, crawl pages, or get results from Apify datasets.
homepage: https://docs.apify.com/api/v2
metadata:
  {
    "openclaw":
      {
        "emoji": "🐝",
        "primaryEnv": "APIFY_TOKEN",
        "requires": { "anyBins": ["curl", "wget"], "env": ["APIFY_TOKEN"] },
      },
  }
---

Apify

Run any of the 17,000+ Actors on Apify Store and retrieve structured results via the REST API.

Full OpenAPI spec: openapi.json

Authentication

All requests need the APIFY_TOKEN env var. Use it as a Bearer token:

-H "Authorization: Bearer $APIFY_TOKEN"

Base URL: https://api.apify.com
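The header and base URL repeat in every request, so a small wrapper can keep them in one place. The following is a minimal sketch, not part of the Apify API: the names `apify_require_token`, `apify_url`, and `apify_get` are hypothetical helpers introduced here for illustration.

```shell
# Hypothetical helpers (names are illustrative, not part of the Apify API).
APIFY_BASE_URL="https://api.apify.com"

# Fail early with a clear message when APIFY_TOKEN is unset or empty.
apify_require_token() {
  if [ -z "${APIFY_TOKEN:-}" ]; then
    echo "APIFY_TOKEN is not set" >&2
    return 1
  fi
}

# Build a full API URL from a path such as /v2/store.
apify_url() {
  printf '%s%s\n' "$APIFY_BASE_URL" "$1"
}

# Authenticated GET; prints the response body on stdout.
apify_get() {
  apify_require_token || return 1
  curl -s "$(apify_url "$1")" -H "Authorization: Bearer $APIFY_TOKEN"
}
```

With these defined, a call such as `apify_get "/v2/store?search=web+scraper&limit=5"` would send the same request as the Store search in step 1.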

Core workflow

1. Find the right Actor

Search the Apify Store by keyword:

curl -s "https://api.apify.com/v2/store?search=web+scraper&limit=5" \
  -H "Authorization: Bearer $APIFY_TOKEN" | jq '.data.items[] | {name: (.username + "/" + .name), title, description}'

Actors are identified by username~name (tilde) in API paths, e.g. apify~web-scraper.
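Because the Store search output above joins the two parts with a slash, it can help to convert that form before building an API path. A minimal sketch, assuming a hypothetical helper named `actor_path`:

```shell
# Hypothetical helper: convert "username/name" to the "username~name" form
# used in API paths. Input already in tilde form is passed through unchanged.
actor_path() {
  case "$1" in
    */*) printf '%s~%s\n' "${1%%/*}" "${1#*/}" ;;
    *)   printf '%s\n' "$1" ;;
  esac
}

actor_path "apify/web-scraper"   # prints apify~web-scraper
```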

2. Get Actor README and input schema

Before running an Actor, fetch its default build to get the README (usage docs) and input schema (expected JSON fields):

curl -s "https://api.apify.com/v2/acts/apify~web-scraper/builds/default" \
  -H "Authorization: Bearer $APIFY_TOKEN" | jq '.data | {readme, inputSchema}'

inputSchema is a JSON-stringified object — parse it to see required/optional fields, types, defaults, and descriptions. Use this to construct valid input for the run.
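Since `inputSchema` arrives as a string, jq's `fromjson` can decode it in the same pipeline. The sketch below assumes the decoded schema has top-level `properties` and `required` keys, as JSON Schema style documents usually do; the helper name `schema_fields` is hypothetical.

```shell
# Hypothetical helper: reads a build response on stdin and prints one line per
# input field as "name<TAB>type<TAB>required|optional". Requires jq.
schema_fields() {
  jq -r '
    .data.inputSchema
    | fromjson
    | (.required // []) as $req
    | (.properties // {})
    | to_entries[]
    | .key as $k
    | (.value.type // "unknown") as $type
    | (if ($req | index($k)) != null then "required" else "optional" end) as $need
    | [$k, $type, $need]
    | @tsv
  '
}

# Assumed usage (requires network access and APIFY_TOKEN):
# curl -s "https://api.apify.com/v2/acts/apify~web-scraper/builds/default" \
#   -H "Authorization: Bearer $APIFY_TOKEN" | schema_fields
```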

You can also get the Actor's per-build OpenAPI spec (no auth required):

curl -s "https://api.apify.com/v2/acts/apify~web-scraper/builds/default/openapi.json"

3. Run an Actor (async — recommended for most cases)

Start the Actor and get the run object back immediately:

curl -s -X POST "https://api.apify.com/v2/acts/apify~web-scraper/runs" \
  -H "Authorization: Bearer $APIFY_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"startUrls":[{"url":"https://example.com"}],"maxPagesPerCrawl":10}'

Response includes data.id (run ID), data.defaultDatasetId, data.status.

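Optional query params: ?timeout=300&memory=4096&maxItems=100&waitForFinish=60

timeout: maximum run time in seconds. memory: memory limit in megabytes. maxItems: cap on the number of items the run returns.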

waitForFinish (0-60): seconds the API waits before returning. Useful to avoid polling for short runs.
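To keep `waitForFinish` inside its documented 0-60 range when the value comes from user input, the URL can be built by a small function. This is only a sketch; `run_url` is a hypothetical name, and clamping (rather than rejecting) out-of-range values is a design choice made here for illustration.

```shell
# Hypothetical helper: print the URL that starts a run of an Actor.
# $1 = Actor in username~name form
# $2 = waitForFinish in seconds (optional, default 0)
run_url() {
  wait_s="${2:-0}"
  # Keep the value inside the 0-60 range described above.
  if [ "$wait_s" -lt 0 ]; then wait_s=0; fi
  if [ "$wait_s" -gt 60 ]; then wait_s=60; fi
  printf 'https://api.apify.com/v2/acts/%s/runs?waitForFinish=%s\n' "$1" "$wait_s"
}

run_url "apify~web-scraper" 120   # waitForFinish is clamped to 60
```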

4. Poll run status

curl -s "https://api.apify.com/v2/actor-runs/RUN_ID?waitForFinish=60" \
  -H "Authorization: Bearer $APIFY_TOKEN" | jq '.data | {status, defaultDatasetId}'

Terminal statuses: SUCCEEDED, FAILED, ABORTED, TIMED-OUT.
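Only SUCCEEDED means the results are worth fetching; the other terminal statuses usually call for reading the run log instead. A minimal sketch of that distinction, using a hypothetical helper named `run_outcome`:

```shell
# Hypothetical helper: classify a run status string.
# Prints "done" for SUCCEEDED, "failed" for the other terminal statuses,
# and "pending" for anything else (for example RUNNING).
run_outcome() {
  case "$1" in
    SUCCEEDED) echo done ;;
    FAILED|ABORTED|TIMED-OUT) echo failed ;;
    *) echo pending ;;
  esac
}
```

A polling script could keep looping while this prints `pending`, fetch the dataset on `done`, and fetch the log on `failed`.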

5. Get results

Dataset items (most common — structured scraped data):

curl -s "https://api.apify.com/v2/datasets/DATASET_ID/items?clean=true&limit=100" \
  -H "Authorization: Bearer $APIFY_TOKEN"

Or directly from the run (shortcut — same parameters):

curl -s "https://api.apify.com/v2/actor-runs/RUN_ID/dataset/items?clean=true&limit=100" \
  -H "Authorization: Bearer $APIFY_TOKEN"
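The `limit=100` in these requests caps a single response. For larger datasets, the items endpoint also accepts an `offset` parameter according to the Apify API documentation (it is not shown in the examples above), so results can be fetched page by page. A sketch under that assumption, with a hypothetical helper named `items_page_url`:

```shell
# Hypothetical helper: print the items URL for one page of a dataset.
# $1 = dataset ID, $2 = page number (0-based), $3 = page size (default 100).
items_page_url() {
  size="${3:-100}"
  offset=$(( $2 * size ))
  printf 'https://api.apify.com/v2/datasets/%s/items?clean=true&offset=%s&limit=%s\n' \
    "$1" "$offset" "$size"
}

# Assumed usage (requires network access, APIFY_TOKEN, and jq):
# page=0
# while :; do
#   items=$(curl -s "$(items_page_url "$DATASET_ID" "$page")" \
#     -H "Authorization: Bearer $APIFY_TOKEN")
#   [ "$(printf '%s' "$items" | jq 'length')" -eq 0 ] && break
#   printf '%s\n' "$items"
#   page=$((page + 1))
# done
```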

Key-value store record (screenshots, HTML, OUTPUT):

curl -s "https://api.apify.com/v2/key-value-stores/STORE_ID/records/OUTPUT" \
  -H "Authorization: Bearer $APIFY_TOKEN"

Run log:

curl -s "https://api.apify.com/v2/logs/RUN_ID" \
  -H "Authorization: Bearer $APIFY_TOKEN"

6. Run Actor synchronously (short-running Actors only)

For Actors that finish within 300 seconds, get dataset items in one call:

curl -s -X POST "https://api.apify.com/v2/acts/apify~web-scraper/run-sync-get-dataset-items?timeout=120" \
  -H "Authorization: Bearer $APIFY_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"startUrls":[{"url":"https://example.com"}],"maxPagesPerCrawl":5}'

Returns the dataset items array directly (not wrapped in data). Returns 408 if the run exceeds 300s.

Alternative: /run-sync returns the KVS OUTPUT record instead of dataset items.
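Because a 408 is an expected outcome for slow Actors, a script may want to see the HTTP status alongside the body. One way, sketched here, is curl's `-w '\n%{http_code}'` option, which appends the status on its own last line; the helper name `handle_sync_response` and its return codes are hypothetical.

```shell
# Hypothetical helper (not part of the Apify API). Takes curl output produced
# with -w '\n%{http_code}', i.e. the response body followed by the HTTP status
# on its own last line, and decides what to do with it.
handle_sync_response() {
  nl='
'
  status="${1##*"$nl"}"   # text after the last newline: the status code
  body="${1%"$nl"*}"      # everything before it: the response body
  case "$status" in
    2??) printf '%s\n' "$body" ;;
    408) echo "sync run exceeded the time limit; use the async workflow" >&2
         return 2 ;;
    *)   echo "request failed with HTTP $status" >&2
         return 1 ;;
  esac
}

# Assumed usage (requires network access and APIFY_TOKEN):
# resp=$(curl -s -w '\n%{http_code}' -X POST \
#   "https://api.apify.com/v2/acts/apify~web-scraper/run-sync-get-dataset-items?timeout=120" \
#   -H "Authorization: Bearer $APIFY_TOKEN" \
#   -H "Content-Type: application/json" \
#   -d '{"startUrls":[{"url":"https://example.com"}],"maxPagesPerCrawl":5}')
# handle_sync_response "$resp" || echo "fall back to POST /runs and polling"
```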

Quick recipes

Scrape a website

curl -s -X POST "https://api.apify.com/v2/acts/apify~web-scraper/run-sync-get-dataset-items?timeout=120" \
  -H "Authorization: Bearer $APIFY_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"startUrls":[{"url":"https://example.com"}],"maxPagesPerCrawl":20}'

Google search

curl -s -X POST "https://api.apify.com/v2/acts/apify~google-search-scraper/run-sync-get-dataset-items?timeout=120" \
  -H "Authorization: Bearer $APIFY_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"queries":"site:example.com openai","maxPagesPerQuery":1}'

Long-running Actor (async with polling)

# 1. Start
RUN=$(curl -s -X POST "https://api.apify.com/v2/acts/apify~web-scraper/runs?waitForFinish=60" \
  -H "Authorization: Bearer $APIFY_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"startUrls":[{"url":"https://example.com"}],"maxPagesPerCrawl":500}')
RUN_ID=$(echo "$RUN" | jq -r '.data.id')

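# 2. Poll until done (each request blocks for up to 60s on the server side)
while true; do
  STATUS=$(curl -s "https://api.apify.com/v2/actor-runs/$RUN_ID?waitForFinish=60" \
    -H "Authorization: Bearer $APIFY_TOKEN" | jq -r '.data.status')
  echo "Status: $STATUS"
  case "$STATUS" in
    SUCCEEDED|FAILED|ABORTED|TIMED-OUT) break ;;
    # An empty or null status means the request itself failed; stop instead of looping forever
    ""|null) echo "Could not read run status" >&2; break ;;
  esac
done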

# 3. Fetch results
curl -s "https://api.apify.com/v2/actor-runs/$RUN_ID/dataset/items?clean=true" \
  -H "Authorization: Bearer $APIFY_TOKEN"

Abort a run

curl -s -X POST "https://api.apify.com/v2/actor-runs/RUN_ID/abort" \
  -H "Authorization: Bearer $APIFY_TOKEN"

Paid / rental Actors

Some Actors require a monthly subscription before they can be run. If the API returns a permissions or payment error for an Actor, ask the user to manually subscribe via the Apify Console:

https://console.apify.com/actors/ACTOR_ID

Replace ACTOR_ID with the Actor's ID (e.g. AhEsMsQyLfHyMLaxz). The user needs to click Start on that page to activate the subscription. Most rental Actors offer a free trial period set by the developer.

You can get the Actor ID from the store search response (data.items[].id) or from GET /v2/acts/username~name (data.id).

Error handling

401: APIFY_TOKEN missing or invalid.
404 Actor not found: check username~name format (tilde, not slash). Browse https://apify.com/store.
400 run-failed: check GET /v2/logs/RUN_ID for details.
402/403 payment required: the Actor likely requires a subscription. See "Paid / rental Actors" above.
408 run-timeout-exceeded: sync endpoints have a 300s limit. Use async workflow instead.
429 rate-limit-exceeded: retry with exponential backoff (start at 500ms, double each time).
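The backoff rule for 429 responses can be sketched as two small functions. Both names (`backoff_ms`, `retry_on_429`) and the limit of five attempts are assumptions made for illustration. The sleep call relies on fractional seconds, which common `sleep` implementations accept but POSIX does not guarantee.

```shell
# Hypothetical helper: delay in milliseconds before retry number $1 (0-based),
# starting at 500 ms and doubling each time.
backoff_ms() {
  delay=500
  i=0
  while [ "$i" -lt "$1" ]; do
    delay=$((delay * 2))
    i=$((i + 1))
  done
  echo "$delay"
}

# Hypothetical helper: run "$@" (a command that prints only an HTTP status
# code) until it prints something other than 429, for at most 5 attempts.
retry_on_429() {
  attempt=0
  while [ "$attempt" -lt 5 ]; do
    code=$("$@")
    if [ "$code" != "429" ]; then
      echo "$code"
      return 0
    fi
    ms=$(backoff_ms "$attempt")
    sleep "$(awk "BEGIN { print $ms / 1000 }")"
    attempt=$((attempt + 1))
  done
  echo "429"
  return 1
}

# Assumed usage; the output file path is illustrative:
# retry_on_429 curl -s -o /tmp/apify_response.json -w '%{http_code}' \
#   "https://api.apify.com/v2/acts/apify~web-scraper" \
#   -H "Authorization: Bearer $APIFY_TOKEN"
```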

Additional resources

API docs (LLM-friendly): https://docs.apify.com/api/v2.md
OpenAPI spec: openapi.json
Apify Store (browse Actors): https://apify.com/store

Prompt examples

After installing Apify, you can say things like the following to the AI to trigger it.

User: Help me get started with Apify

A: Explains what Apify does, walks through the setup, and runs a quick demo based on your current project

User: Use Apify to run and manage Apify Actors via REST API to scrape websites, crawl ...

A: Invokes Apify with the right parameters and returns the result directly in the conversation

User: What can I do with Apify in my developer & devops workflow?

A: Lists the top use cases for Apify, with example commands for each scenario

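FAQ

How do I install Apify?

Place the skill folder in ~/.claude/skills/apify/ (personal level, available to all projects) or in .claude/skills/apify/ (project level). After restarting the AI client, invoke it explicitly with /apify, or let the AI discover and use it automatically based on context.

Which AI platforms does Apify support?

Apify supports Claude, Cursor, and OpenClaw, and integrates with these AI platforms to extend their capabilities.

Is Apify free?

Apify is free to install and use. Check the repository for license information.

What does Apify do?

Run and manage Apify Actors via REST API to scrape websites, crawl pages, extract data, and retrieve results from Apify datasets and key-value stores.

Which category does Apify belong to?

Apify belongs to the "Developer & DevOps" category. Skills in this category help AI agents perform specialized tasks in this domain.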

Use cases

Getting Started with Apify
Automate Developer & DevOps Workflows with Apify
Team Collaboration with Apify

