
Docling


Extract and parse content from web pages, PDFs, documents (docx, pptx), and images using the docling CLI with GPU acceleration. Use INSTEAD of web_fetch for extracting content from specific URLs when you need clean, structured text. Use Brave (web_search) for searching/discovering pages. Use docling when you HAVE a URL and need its content parsed.

Data source: ClawHub (view on ClawSkills).

1.3k downloads

0 favorites

4 views


About Docling

---
name: docling
description: Extract and parse content from web pages, PDFs, documents (docx, pptx), and images using the docling CLI with GPU acceleration. Use INSTEAD of web_fetch for extracting content from specific URLs when you need clean, structured text. Use Brave (web_search) for searching/discovering pages. Use docling when you HAVE a URL and need its content parsed.
version: 1.0.2
metadata:
  requires:
    bins: ["docling"]
---

Docling - Document & Web Content Extraction

CLI tool for parsing documents and web pages into clean, structured text. Uses GPU acceleration for OCR and ML models.

Prerequisites

docling CLI must be installed (e.g., via pipx install docling)
For GPU support: NVIDIA GPU with CUDA drivers

When to Use

Extract content from a URL → Use docling (not web_fetch)
Search for information → Use web_search (Brave)
Parse PDFs, DOCX, PPTX → Use docling
OCR on images → Use docling

Quick Commands

Web Page → Markdown (default)

docling "<URL>" --from html --to md

Output: writes a .md file to the current directory unless --output names a different directory.

Web Page → Plain Text

docling "<URL>" --from html --to text --output /tmp/docling_out

PDF with OCR

docling "/path/to/file.pdf" --ocr --device cuda --output /tmp/docling_out

Key Options

| Option | Values | Description |
|--------|--------|-------------|
| --from | html, pdf, docx, pptx, image, md, csv, xlsx | Input format |
| --to | md, text, json, yaml, html | Output format |
| --device | auto, cuda, cpu | Accelerator (default: auto) |
| --output | path | Output directory (recommended: use controlled temp dir) |
| --ocr | flag | Enable OCR for images/scanned PDFs |
| --tables | flag | Extract tables (default: on) |
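To illustrate how a --from value from the table might be picked automatically, the sketch below guesses a format from a file extension or URL scheme. The function name guess_from_format and the extension mapping are assumptions made for illustration only; they are not part of docling, whose own format detection may behave differently.

```shell
# Hypothetical helper: guess a --from value for a source string.
# The extension-to-format mapping is an illustrative assumption.
guess_from_format() {
  local src="$1" lower
  # Lower-case the source so ".PDF" and ".pdf" are treated alike.
  lower="$(printf '%s' "$src" | tr '[:upper:]' '[:lower:]')"
  case "$lower" in
    *.pdf)  echo pdf ;;
    *.docx) echo docx ;;
    *.pptx) echo pptx ;;
    *.xlsx) echo xlsx ;;
    *.csv)  echo csv ;;
    *.md)   echo md ;;
    *.png|*.jpg|*.jpeg|*.tif|*.tiff) echo image ;;
    *.html|*.htm) echo html ;;
    # Any other http(s) URL is assumed to be a web page.
    http://*|https://*) echo html ;;
    # Unknown input: signal failure so the caller can decide.
    *) return 1 ;;
  esac
}
```

A caller could then run something like docling "$src" --from "$(guess_from_format "$src")" --to md, falling back to an explicit --from when the helper returns non-zero.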

Security Notes

⚠️ Avoid these flags unless you trust the source:

--enable-remote-services - can send data to remote endpoints
--allow-external-plugins - loads third-party code
Custom --headers with untrusted values - can redirect requests
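One way to enforce the notes above is a thin wrapper that refuses to start docling when a listed flag is present. This is a minimal sketch: has_risky_flag and safe_docling are hypothetical names, and rejecting every use of --headers (rather than only untrusted values) is a deliberately conservative assumption.

```shell
# Hypothetical guard: succeed (return 0) if any argument is one of the
# flags the security notes advise against.
has_risky_flag() {
  local arg
  for arg in "$@"; do
    case "$arg" in
      --enable-remote-services|--allow-external-plugins|--headers|--headers=*)
        return 0 ;;
    esac
  done
  return 1
}

# Hypothetical wrapper: run docling only when no risky flag was passed.
safe_docling() {
  if has_risky_flag "$@"; then
    echo "refusing to run docling with a risky flag" >&2
    return 2
  fi
  docling "$@"
}
```

With this in place, safe_docling "<URL>" --from html --to md behaves like the plain command, while adding --allow-external-plugins makes the wrapper exit with status 2 before docling is invoked.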

Workflow

For web content extraction: Use docling "<URL>" --from html --to text --output /tmp/docling_out
Read the output file from the specified output directory
Clean up the output directory after reading
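The three workflow steps can be sketched as a single shell function. It assumes docling is on PATH and writes its result files into the directory passed to --output; extract_text is a hypothetical name, and a fresh mktemp directory stands in for /tmp/docling_out so that parallel runs do not collide.

```shell
# Sketch of the workflow: convert, read, clean up.
# Assumes docling writes its result files into the --output directory.
extract_text() {
  local url="$1" out_dir status
  out_dir="$(mktemp -d)" || return 1

  # Step 1: convert the page to plain text inside the temp directory.
  # docling's own stdout goes to stderr so only extracted text is printed.
  if docling "$url" --from html --to text --output "$out_dir" >&2; then
    # Step 2: read every file docling produced.
    find "$out_dir" -type f -exec cat {} +
    status=0
  else
    status=$?
  fi

  # Step 3: remove the output directory, even when conversion failed.
  rm -rf "$out_dir"
  return "$status"
}
```

Calling extract_text "<URL>" prints the extracted text on standard output and returns docling's exit status, so a caller can tell a failed conversion apart from an empty page.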

GPU Support

Docling supports GPU acceleration via CUDA (NVIDIA). Verify CUDA is available:

python -c "import torch; print(torch.cuda.is_available())"
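The same check can feed the --device option. The sketch below is an illustration under one loud assumption: that python resolves to an environment where torch is installed, which may not be the case when docling was installed in isolation via pipx. pick_device is a hypothetical helper name.

```shell
# Hypothetical helper: print "cuda" when PyTorch reports a usable GPU,
# otherwise "cpu" (including when python or torch is unavailable).
pick_device() {
  local available
  available="$(python -c 'import torch; print(torch.cuda.is_available())' 2>/dev/null || true)"
  if [ "$available" = "True" ]; then
    echo cuda
  else
    echo cpu
  fi
}
```

It could be used as docling "/path/to/file.pdf" --ocr --device "$(pick_device)" --output /tmp/docling_out. Since --device already defaults to auto, an explicit choice like this is mainly useful when you want to log or control which accelerator is used.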

Full CLI Reference

See references/cli-reference.md for complete option list.

Prompt Examples

After installing Docling, you can say things like the following to your AI to trigger it

U

Help me get started with Docling

A

Explains what Docling does, walks through the setup, and runs a quick demo based on your current project

U

Use Docling to extract and parse content from web pages, PDFs, documents (docx, pp...

A

Invokes Docling with the right parameters and returns the result directly in the conversation

U

What can I do with Docling in my documents & notes workflow?

A

Lists the top use cases for Docling, with example commands for each scenario

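FAQ

How do I install Docling?

Place the skill folder in ~/.claude/skills/docling/ (personal level, available to all projects) or in .claude/skills/docling/ (project level). After restarting the AI client, invoke it explicitly with /docling, or let the AI discover and use it automatically based on context.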
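As a rough sketch of that manual install, the function below copies a local skill folder into the personal skills directory by default. install_skill is a hypothetical helper, and the source folder is whatever directory holds the downloaded skill files; neither comes from the listing itself.

```shell
# Hypothetical helper: copy a local skill folder into a skills directory.
# Second argument is optional; it defaults to the personal-level path.
install_skill() {
  local src="$1"
  local dest="${2:-$HOME/.claude/skills/docling}"
  [ -d "$src" ] || { echo "no such folder: $src" >&2; return 1; }
  # Copy the folder's contents (including dotfiles) into the destination.
  mkdir -p "$dest" && cp -R "$src"/. "$dest"/
}
```

For a project-level install, pass .claude/skills/docling as the second argument from the project root.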

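Which AI platforms does Docling support?

Docling supports Claude, Cursor, and OpenClaw, and integrates with these AI platforms to extend their capabilities.

Is Docling free?

Docling is free to install and use. Check the repository for license information.

What does Docling do?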

Extract and parse content from web pages, PDFs, documents (docx, pptx), and images using the docling CLI with GPU acceleration. Use INSTEAD of web_fetch for extracting content from specific URLs when you need clean, structured text. Use Brave (web_search) for searching/discovering pages. Use docling when you HAVE a URL and need its content parsed.

Which category does Docling belong to?

Docling belongs to the "Documents & Notes" category, whose skills help AI agents carry out specialized tasks in this domain.

Use Cases

Getting Started with Docling
Automate Documents & Notes Workflows with Docling
Team Collaboration with Docling