Use when building Apache Spark applications, distributed data processing pipelines, or optimizing big data workloads. Invoke for DataFrame API, Spark SQL, RDD operations, performance tuning, streaming analytics.
Source: ClawHub.
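Method 1: Command-line install (recommended)
Recommended (no need to install clawhub first):
npx clawhub@latest --dir ~/.claude/skills install spark-engineer
Or use the clawhub CLI (must be installed first):
clawhub --dir ~/.claude/skills install spark-engineer
Requires Node.js 18+. If you do not have Node, use Method 2 below to download the ZIP directly.
Method 2: Manual download and install (no Node required)
Download the ZIP, unzip it, place the folder at the path below, and restart the Agent:
Install path
~/.claude/skills/spark-engineer/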
---
name: spark-engineer
description: Use when building Apache Spark applications, distributed data processing pipelines, or optimizing big data workloads. Invoke for DataFrame API, Spark SQL, RDD operations, performance tuning, streaming analytics.
triggers:
  - Apache Spark
  - PySpark
  - Spark SQL
  - distributed computing
  - big data
  - DataFrame API
  - RDD
  - Spark Streaming
  - structured streaming
  - data partitioning
  - Spark performance
  - cluster computing
  - data processing pipeline
role: expert
scope: implementation
output-format: code
---
Senior Apache Spark engineer specializing in high-performance distributed data processing, optimizing large-scale ETL pipelines, and building production-grade Spark applications.
You are a senior Apache Spark engineer with deep big data experience. You specialize in building scalable data processing pipelines using DataFrame API, Spark SQL, and RDD operations. You optimize Spark applications for performance through partitioning strategies, caching, and cluster tuning. You build production-grade systems processing petabyte-scale data.
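The partitioning strategies mentioned above rest on a simple idea: a key is mapped to a partition by hashing it and taking the result modulo the partition count. The following is a minimal pure-Python sketch of that idea, not Spark's own implementation. The names `partition_for` and `hash_partition`, the sample records, and the choice of CRC32 as the hash are assumptions made for illustration.

```python
import zlib
from collections import defaultdict


def partition_for(key: str, num_partitions: int) -> int:
    """Map a key to a partition index using a deterministic hash.

    CRC32 is an illustrative choice; Spark uses its own hash functions.
    """
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def hash_partition(records, num_partitions):
    """Group (key, value) records into partitions by hashed key."""
    partitions = defaultdict(list)
    for key, value in records:
        partitions[partition_for(key, num_partitions)].append((key, value))
    return dict(partitions)


# Hypothetical sample data: two records share the key "user_a".
records = [("user_a", 1), ("user_b", 2), ("user_a", 3), ("user_c", 4)]
partitions = hash_partition(records, num_partitions=4)
```

Because every record with the same key lands in the same partition, a per-key aggregation can run within each partition without moving data again.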
Load detailed guidance based on context:
| Topic | Reference | Load When |
|-------|-----------|-----------|
| Spark SQL & DataFrames | references/spark-sql-dataframes.md | DataFrame API, Spark SQL, schemas, joins, aggregations |
| RDD Operations | references/rdd-operations.md | Transformations, actions, pair RDDs, custom partitioners |
| Partitioning & Caching | references/partitioning-caching.md | Data partitioning, persistence levels, broadcast variables |
| Performance Tuning | references/performance-tuning.md | Configuration, memory tuning, shuffle optimization, skew handling |
| Streaming Patterns | references/streaming-patterns.md | Structured Streaming, watermarks, stateful operations, sinks |
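Skew handling, listed under Performance Tuning above, is often approached by salting: a hot key is split into several sub-keys so that its records spread across partitions, and the partial results are merged afterwards. A minimal pure-Python sketch of such a two-stage salted aggregation follows. The function name `salted_sum`, the salt count, and the sample data are hypothetical, and nothing here is taken from the referenced files.

```python
import random
from collections import Counter


def salted_sum(records, num_salts, seed=0):
    """Sum values per key in two stages, using a random salt to split hot keys."""
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible

    # Stage 1: aggregate on (key, salt) so one hot key becomes several sub-keys.
    partial = Counter()
    for key, value in records:
        salt = rng.randrange(num_salts)
        partial[(key, salt)] += value

    # Stage 2: drop the salt and merge the partial sums per original key.
    final = Counter()
    for (key, _salt), subtotal in partial.items():
        final[key] += subtotal
    return partial, final


# Hypothetical skewed input: one key dominates the data set.
records = [("hot", 1)] * 1000 + [("cold", 1)] * 10
partial, final = salted_sum(records, num_salts=8)
```

The final totals match an unsalted aggregation, while no single sub-key in the first stage carries the full weight of the hot key.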
When implementing Spark solutions, provide:
Spark DataFrame API, Spark SQL, RDD transformations/actions, catalyst optimizer, tungsten execution engine, partitioning strategies, broadcast variables, accumulators, structured streaming, watermarks, checkpointing, Spark UI analysis, memory management, shuffle optimization
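Watermarks, named in the list above, bound how late an event may arrive and still be counted. The sketch below models the idea in pure Python: the watermark trails the largest event time seen so far by a fixed delay, and events older than the watermark are dropped. It is a simplification (Structured Streaming advances the watermark between micro-batches rather than per event), and the function name and sample timestamps are hypothetical.

```python
def apply_watermark(event_times, delay):
    """Split events into accepted and dropped using a trailing watermark."""
    max_seen = None
    accepted, dropped = [], []
    for t in event_times:
        # The watermark is the largest event time seen so far minus the delay.
        watermark = None if max_seen is None else max_seen - delay
        if watermark is not None and t < watermark:
            dropped.append(t)  # too late: older than the watermark
        else:
            accepted.append(t)
            max_seen = t if max_seen is None else max(max_seen, t)
    return accepted, dropped


# Hypothetical event times arriving out of order, with an allowed lateness of 5.
accepted, dropped = apply_watermark([10, 12, 11, 20, 9, 15, 25, 14], delay=5)
```

In this run the events with times 9 and 14 are dropped, because each arrives after the watermark has already moved past it.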
After installing Spark Engineer, you can say the following to your AI to trigger it:
Help me get started with Spark Engineer
Explains what Spark Engineer does, walks through the setup, and runs a quick demo based on your current project
Use Spark Engineer to use when building Apache Spark applications, distributed data proce...
Invokes Spark Engineer with the right parameters and returns the result directly in the conversation
What can I do with Spark Engineer in my data & analytics workflow?
Lists the top use cases for Spark Engineer, with example commands for each scenario
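Place the skill folder in ~/.claude/skills/spark-engineer/ (personal level, available to all projects) or in .claude/skills/spark-engineer/ (project level). After restarting the AI client, invoke it explicitly with /spark-engineer, or let the AI discover and use it automatically based on context.
Spark Engineer supports Claude, Cursor, and OpenClaw, and integrates with these AI platforms to extend their capabilities.
Spark Engineer is free to install and use. See the repository for license information.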
Spark Engineer belongs to the "Data & Analytics" category. Skills in this category help AI agents perform specialized tasks in this domain.
Automate my data & analytics tasks using Spark Engineer
Identifies repetitive steps in your workflow and sets up Spark Engineer to handle them automatically