ClawdCursor

clawd-cursor

AI desktop agent — control any app on Windows/macOS from your OpenClaw agent. Send natural language tasks to the Clawd Cursor API and it handles everything:...

Source: ClawHub. View on ClawSkills.

About ClawdCursor

---
name: clawdcursor
version: 0.6.3
description: >
  AI desktop agent: control any app on Windows/macOS from your OpenClaw agent.
  Send natural language tasks to the Clawd Cursor API and it handles everything:
  opening apps, clicking buttons, typing text, navigating browsers, filling forms.
  If you can click it, your agent can too.
homepage: https://clawdcursor.com
source: https://github.com/AmrDab/clawd-cursor
privacy: >
  Screenshots and UI data stay on the user's machine. If using Ollama (local),
  zero data leaves the host. If using a cloud provider (Anthropic, OpenAI, Kimi),
  screenshots/text are sent to that provider's API only, never to third-party
  servers or skill authors. The user controls which provider is used. The REST
  API binds to 127.0.0.1 only and is not network-accessible.
metadata:
  openclaw:
    # Note: bins check removed. OpenClaw's hasBinary() doesn't handle Windows
    # PATHEXT (.exe/.cmd), causing false "missing" on Windows even when node is
    # installed. Since install requires npm (which implies node), the check is
    # redundant. See: https://github.com/openclaw/openclaw/issues/TBD
    requires: {}
    install:
      - git clone https://github.com/AmrDab/clawd-cursor.git
      - cd clawd-cursor && npm install && npm run setup
      - cd clawd-cursor && clawdcursor doctor
      - cd clawd-cursor && clawdcursor start
    privacy:
      - Screenshots processed by user's own configured AI provider only
      - With Ollama, fully offline, no external API calls
      - With cloud providers, data goes only to user's chosen API endpoint
      - REST API binds to 127.0.0.1 only, not network accessible
    notes:
      - In OpenClaw, this skill inherits the active agent's AI provider + API key.
      - Standalone CLI mode can still use AI_API_KEY / provider-specific env vars or Ollama.
---

Clawd Cursor

If a human can do it on a screen, you can too. No API? No integration? No problem. Every app already has a UI — Clawd Cursor gives you eyes and hands to use all of them.

No task is impossible. If it's on the screen, it's within reach.

> IMPORTANT: READ THIS BEFORE ANYTHING ELSE
>
> You have a desktop AI agent that operates the GUI on the user's machine. This means:
>
> 1. You CAN open apps, click buttons, type text, fill forms
> 2. You CAN see what's on screen (via Clawd Cursor's screenshot pipeline)
> 3. You CAN interact with applications: browser, desktop apps, settings
> 4. You MUST respect safety tiers: 🟢 Auto actions run freely, 🔴 Confirm actions require user approval
> 5. You MUST ask before accessing sensitive apps (email, banking, private messages, password managers)
> 6. You SHOULD prefer direct tools (file read, web_fetch, CLI) over desktop automation when available
>
> Think of Clawd Cursor as your hands and eyes on the user's computer, used with their permission.

---

When to Use This (AGENTS: READ THIS)

You have a desktop UI agent available. Use it when the user asks you to interact with their desktop, or when no other tool can accomplish the task.

Tool vs Skill (OpenClaw terminology)

Tool = direct capability (API call, filesystem, shell, web fetch, browser command).
Skill = packaged workflow/domain logic that may call one or more tools.
This skill (Clawd Cursor) = GUI execution skill. Reach for it only after OpenClaw tools/skills that can complete the same work without a GUI.

Use Clawd Cursor for (examples, not limits):

Clawd Cursor can perform any action that is visible and interactable in the GUI (subject to safety policy).

User-requested desktop tasks — "open Gmail and send an email", "check my calendar"
Read a webpage — when web_fetch or browser tools aren't available
Interact with desktop apps — click buttons, fill forms, read results
Browser tasks — search, navigate, fill forms (when browser tool unavailable)
Visual verification — did the page load? what does the UI show?
Cross-app workflows — copy from one app, paste in another
Settings changes — when the user explicitly asks

⚠️ Sensitive App Policy

Always ask the user before accessing:

Email clients (Gmail, Outlook)
Banking or financial apps
Private messaging (WhatsApp, Signal, Telegram)
Password managers
Admin panels or cloud consoles
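
The policy above could be backed by a small pre-check that runs before a task is sent. The sketch below is illustrative only: `SENSITIVE_KEYWORDS` and `sensitiveCategory` are hypothetical names, not part of Clawd Cursor, and plain keyword matching will both miss apps and produce false positives, so treat a match as a prompt to ask the user rather than as a complete safeguard.

```javascript
// Hypothetical keyword map built from the categories listed above.
// The keywords are examples, not an exhaustive or official list.
const SENSITIVE_KEYWORDS = {
  email: ['gmail', 'outlook', 'email'],
  banking: ['bank', 'banking'],
  messaging: ['whatsapp', 'signal', 'telegram'],
  passwords: ['password manager'],
  admin: ['admin panel', 'cloud console'],
};

// Returns the first matching category name, or null when nothing matches.
function sensitiveCategory(task) {
  const text = task.toLowerCase();
  for (const [category, keywords] of Object.entries(SENSITIVE_KEYWORDS)) {
    if (keywords.some((keyword) => text.includes(keyword))) return category;
  }
  return null;
}

console.log(sensitiveCategory('Open Gmail and send an email')); // email
console.log(sensitiveCategory('Open Notepad and type hello'));  // null
```

A non-null result means the agent should pause and ask the user before sending the task.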

Don't use Clawd Cursor when:

You can do it with a direct API call or CLI command (faster)
The task is purely computational (math, text generation, code writing)
You can already read/write the file directly
The browser tool or web_fetch can handle it

OpenClaw + Clawd Cursor Routing Contract (Avoid Overlap)

Clawd Cursor should be treated as OpenClaw's GUI execution layer, not a competing planner.

Route tasks in this order:

1. OpenClaw native tools first (filesystem, API, shell, provider-native skills)
2. Browser-native automation next (Playwright/CDP direct) for browser-only reads/clicks
3. Clawd Cursor API task (POST /task) only when desktop/UI-level interaction is required

Practical rule

If OpenClaw already has a reliable skill/tool for the domain, use it.
Use Clawd Cursor to bridge gaps where no API/tool exists or when the user explicitly asks for GUI interaction.

This keeps behavior predictable, lowers latency/cost, and avoids duplicated logic between the main OpenClaw agent and this skill.

Universal task pattern

For broad "get it done" requests, split into three phases:

1. Plan in OpenClaw: break work into API/CLI/browser/GUI subtasks.
2. Execute cheap paths first: API + CLI + browser direct.
3. Escalate only residual UI steps to Clawd Cursor.

Think: "OpenClaw decides, Clawd Cursor acts on GUI when needed."
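
The routing order and three-phase pattern above can be sketched as a small classifier. This is a minimal sketch under assumptions: the subtask flags `hasNativeTool` and `browserOnly`, the route labels, and both function names are hypothetical, since OpenClaw's actual planner interface is not described here.

```javascript
// Route labels in the priority order described above (cheapest first).
const ROUTE_ORDER = ['native-tool', 'browser-direct', 'clawd-cursor'];

// Hypothetical subtask shape: { name, hasNativeTool, browserOnly }.
function chooseRoute(subtask) {
  if (subtask.hasNativeTool) return 'native-tool';  // filesystem, API, shell
  if (subtask.browserOnly) return 'browser-direct'; // Playwright/CDP
  return 'clawd-cursor';                            // POST /task
}

// Tags each subtask with a route and sorts cheap routes first.
// Assumes the subtasks are independent; real plans may need dependency ordering.
function planExecution(subtasks) {
  return subtasks
    .map((subtask) => ({ ...subtask, route: chooseRoute(subtask) }))
    .sort((a, b) => ROUTE_ORDER.indexOf(a.route) - ROUTE_ORDER.indexOf(b.route));
}

const plan = planExecution([
  { name: 'click Export in the desktop app' },
  { name: 'read the report file', hasNativeTool: true },
  { name: 'read the dashboard title', browserOnly: true },
]);
console.log(plan.map((step) => `${step.route}: ${step.name}`));
```

Only the subtasks that end up tagged `clawd-cursor` would be sent to the API; everything else stays with OpenClaw's own tools.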

Direct Browser Access (Fast Path)

For quick page reads without a full task, connect to Chrome via Playwright CDP:

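const pw = require('playwright');
(async () => {
  // Attach to a Chrome instance already running with --remote-debugging-port=9222
  const browser = await pw.chromium.connectOverCDP('http://127.0.0.1:9222');
  const pages = browser.contexts()[0].pages();
  const text = await pages[0].innerText('body'); // visible text of the first tab
  console.log(text);
})();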

Use this when you just need page content — faster than sending a task.

| Scenario | Use | Why |
|----------|-----|-----|
| Read page content/text | CDP Direct | Instant, free |
| Fill a web form | API task (POST /task) | Clawd handles multi-step planning |
| Check if a page loaded | CDP Direct | Just read the title/URL |
| Click through a complex UI flow | API task (POST /task) | Clawd handles planning |
| Get a list of elements on page | CDP Direct | Fast DOM query |
| Interact with a desktop app | API task (POST /task) | CDP is browser-only |

---

REST API Reference

Base URL: http://127.0.0.1:3847

> Note: On Windows PowerShell, use curl.exe (with .exe) or Invoke-RestMethod. Bare curl is aliased to Invoke-WebRequest which behaves differently.

Pre-flight Check

Before your first task, verify Clawd Cursor is running:

curl.exe -s http://127.0.0.1:3847/health

Expected (the version value may differ from this example; the front matter above lists 0.6.3): {"status":"ok","version":"0.6.0"}
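
The same pre-flight check can be made from Node. This is a minimal sketch, assuming Node 18+ (for the global `fetch`) and the `/health` response shape shown above. `isClawdCursorUp` is a hypothetical helper name, and `fetchImpl` is injectable only so the function can be exercised without a running server.

```javascript
// Returns true only when /health answers with {"status":"ok", ...}.
// Any network error (for example, connection refused) is treated as "not running".
async function isClawdCursorUp(fetchImpl = fetch, baseUrl = 'http://127.0.0.1:3847') {
  try {
    const response = await fetchImpl(`${baseUrl}/health`);
    if (!response.ok) return false;
    const body = await response.json();
    return body.status === 'ok';
  } catch {
    return false;
  }
}

// Usage: isClawdCursorUp().then((up) => console.log(up ? 'running' : 'not reachable'));
```

A `false` result is the cue to start the server and check again.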

If the connection is refused, start it yourself (don't ask the user):

# Find the skill directory and start the server
Start-Process -FilePath "node" -ArgumentList "dist/index.js","start" -WorkingDirectory "<clawd-cursor-directory>" -WindowStyle Hidden
Start-Sleep 3
# Verify it's running
curl.exe -s http://127.0.0.1:3847/health

The skill directory is the directory that contains SKILL.md (this file). Use that path as the working directory.

Sending a Task (Async — Returns Immediately)

POST /task accepts the task and returns immediately. The task runs in the background. You must poll /status to know when it's done.

curl.exe -s -X POST http://127.0.0.1:3847/task -H "Content-Type: application/json" -d "{\"task\": \"YOUR_TASK_HERE\"}"

PowerShell:

Invoke-RestMethod -Uri http://127.0.0.1:3847/task -Method POST -ContentType "application/json" -Body '{"task": "YOUR_TASK_HERE"}'
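
The submit-then-poll flow described above might look like this from Node. It is a sketch under stated assumptions: Node 18+ for the global `fetch`, `/status` answering a plain GET with JSON, and a caller-supplied `isDone` predicate, because the shape of the `/status` payload is not shown in this section. `submitTask` and `waitForTask` are hypothetical helper names.

```javascript
const DEFAULT_BASE_URL = 'http://127.0.0.1:3847';

// Sends the task; POST /task returns immediately while the task runs in the background.
async function submitTask(task, { fetchImpl = fetch, baseUrl = DEFAULT_BASE_URL } = {}) {
  const response = await fetchImpl(`${baseUrl}/task`, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ task }),
  });
  if (!response.ok) throw new Error(`POST /task failed with HTTP ${response.status}`);
  return response.json();
}

// Polls /status until isDone(status) returns true or the attempts run out.
// isDone is supplied by the caller because the status payload is an assumption here.
async function waitForTask(isDone, {
  fetchImpl = fetch,
  baseUrl = DEFAULT_BASE_URL,
  intervalMs = 1000,
  maxAttempts = 60,
} = {}) {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const response = await fetchImpl(`${baseUrl}/status`);
    const status = await response.json();
    if (isDone(status)) return status;
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
  throw new Error('Timed out waiting for the task to finish');
}
```

Capping the attempts keeps a stuck task from blocking the agent indefinitely; the interval and attempt count are placeholders to tune.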

...

Prompt examples

After installing ClawdCursor, you can say things like these to the AI to trigger it

U

Help me get started with ClawdCursor

A

Explains what ClawdCursor does, walks through the setup, and runs a quick demo based on your current project

U

Use ClawdCursor to control any app on Windows/macOS from your OpenC...

A

Invokes ClawdCursor with the right parameters and returns the result directly in the conversation

U

What can I do with ClawdCursor in my ai agent & automation workflow?

A

Lists the top use cases for ClawdCursor, with example commands for each scenario

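FAQ

How do I install ClawdCursor?

Place the skill folder in ~/.claude/skills/clawd-cursor/ (personal level, available in all projects) or in .claude/skills/clawd-cursor/ (project level). After restarting the AI client, invoke it explicitly with /clawd-cursor, or let the AI discover and use it automatically based on context.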

Which AI platforms does ClawdCursor support?

ClawdCursor supports Claude, Cursor, and OpenClaw, and integrates with these AI platforms to extend their capabilities.

Is ClawdCursor free?

ClawdCursor is free to install and use. Check the repository for license information.

What does ClawdCursor do?

AI desktop agent — control any app on Windows/macOS from your OpenClaw agent. Send natural language tasks to the Clawd Cursor API and it handles everything:...

Which category does ClawdCursor belong to?

ClawdCursor belongs to the "AI Agent & Automation" category. Skills in this category help AI agents perform specialized tasks in this domain.
