Scrape documents from Notion, DocSend, PDFs, and other sources into local PDF files. Use when the user needs to download, archive, or convert web documents to PDF format. Supports authentication flows for protected documents and session persistence via profiles. Returns local file paths to downloaded PDFs.
Data source: ClawHub.
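Method 1: Install from the command line (recommended)
Recommended (no need to install clawhub first):
npx clawhub@latest --dir ~/.claude/skills install links-to-pdfs
Or use the clawhub CLI (must be installed beforehand):
clawhub --dir ~/.claude/skills install links-to-pdfs
⚠️ Requires Node.js 18+. No Node? Use Method 2 below to download the ZIP directly.
Method 2: Manual download (no Node required)
Download the ZIP, unzip it, place the folder at the path below, and restart the agent for it to take effect:
Install path:
~/.claude/skills/links-to-pdfs/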
---
name: scraper
description: Scrape documents from Notion, DocSend, PDFs, and other sources into local PDF files. Use when the user needs to download, archive, or convert web documents to PDF format. Supports authentication flows for protected documents and session persistence via profiles. Returns local file paths to downloaded PDFs.
---
CLI tool that scrapes documents from various sources into local PDF files using browser automation.
npm install -g docs-scraper
Scrape any document URL to PDF:
docs-scraper scrape https://example.com/document
Returns local path: ~/.docs-scraper/output/1706123456-abc123.pdf
Scrape with daemon (recommended, keeps browser warm):
docs-scraper scrape <url>
Scrape with named profile (for authenticated sites):
docs-scraper scrape <url> -p <profile-name>
Scrape with pre-filled data (e.g., email for DocSend):
docs-scraper scrape <url> -D email=user@example.com
Direct mode (single-shot, no daemon):
docs-scraper scrape <url> --no-daemon
When a document requires authentication (login, email verification, passcode):
```bash
docs-scraper scrape https://docsend.com/view/xxx
# Output: Scrape blocked
# Job ID: abc123
```
```bash
docs-scraper update abc123 -D email=user@example.com
# or with password
docs-scraper update abc123 -D email=user@example.com -D password=1234
```
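The blocked-then-retry flow above can be chained in a script. The following is a minimal sketch, assuming the blocked scrape prints a line of the form `Job ID: abc123` as in the example output; the real output format may differ. `extract_job_id` is a hypothetical helper, not part of docs-scraper.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of docs-scraper): read scrape output on stdin
# and print the ID from the first "Job ID: <id>" line, or nothing if absent.
# ASSUMPTION: the output format matches the example shown above.
extract_job_id() {
  sed -n 's/^.*Job ID:[[:space:]]*\([^[:space:]]*\).*$/\1/p' | head -n 1
}

# Possible usage (commented out so that sourcing this file runs nothing):
#   output=$(docs-scraper scrape https://docsend.com/view/xxx)
#   job_id=$(printf '%s\n' "$output" | extract_job_id)
#   [ -n "$job_id" ] && docs-scraper update "$job_id" -D email=user@example.com
```

If the helper prints nothing, the scrape either succeeded or failed for a reason other than authentication, so the retry should be skipped.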
Profiles store session cookies for authenticated sites.
docs-scraper profiles list # List saved profiles
docs-scraper profiles clear # Clear all profiles
docs-scraper scrape <url> -p myprofile # Use a profile
The daemon keeps browser instances warm for faster scraping.
docs-scraper daemon status # Check status
docs-scraper daemon start # Start manually
docs-scraper daemon stop # Stop daemon
Note: Daemon auto-starts when running scrape commands.
PDFs are stored in ~/.docs-scraper/output/. The daemon automatically cleans up files older than 1 hour.
Manual cleanup:
docs-scraper cleanup # Delete all PDFs
docs-scraper cleanup --older-than 1h # Delete PDFs older than 1 hour
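Before deleting anything, it can help to preview which files a one-hour cutoff would affect. The sketch below uses standard `find` rather than docs-scraper itself. It assumes the output directory holds PDFs directly (no subfolders) and that age is judged by modification time, which may not be how the tool's own cleanup decides. `list_stale_pdfs` is a hypothetical name.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print PDFs in a directory whose modification time is
# more than N minutes old. It only lists files; it deletes nothing.
# ASSUMPTION: docs-scraper's own cleanup may use a different notion of "age".
list_stale_pdfs() {
  local dir=${1:-"$HOME/.docs-scraper/output"}
  local minutes=${2:-60}
  find "$dir" -maxdepth 1 -type f -name '*.pdf' -mmin "+$minutes"
}

# Possible usage:
#   list_stale_pdfs                     # default directory, older than 60 min
#   list_stale_pdfs /tmp/my-output 30   # custom directory, older than 30 min
```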
docs-scraper jobs list # List blocked jobs awaiting auth
---
Each scraper accepts specific -D data fields. Use the appropriate fields based on the URL type.
Handles: URLs ending in .pdf
Data fields: None (downloads directly)
Example:
docs-scraper scrape https://example.com/document.pdf
---
Handles: docsend.com/view/, docsend.com/v/, and subdomains (e.g., org-a.docsend.com)
URL patterns:
https://docsend.com/view/{id} or https://docsend.com/v/{id}
https://docsend.com/view/s/{id}
https://{subdomain}.docsend.com/view/{id}
Data fields:
| Field | Type | Description |
|-------|------|-------------|
| email | email | Email address for document access |
| password | password | Passcode/password for protected documents |
| name | text | Your name (required for NDA-gated documents) |
Examples:
# Pre-fill email for DocSend
docs-scraper scrape https://docsend.com/view/abc123 -D email=user@example.com
# With password protection
docs-scraper scrape https://docsend.com/view/abc123 -D email=user@example.com -D password=secret123
# With NDA name requirement
docs-scraper scrape https://docsend.com/view/abc123 -D email=user@example.com -D name="John Doe"
# Retry blocked job
docs-scraper update abc123 -D email=user@example.com -D password=secret123
Notes:
---
Handles: notion.so/*, *.notion.site/*
Data fields:
| Field | Type | Description |
|-------|------|-------------|
| email | email | Notion account email |
| password | password | Notion account password |
Examples:
# Public page (no auth needed)
docs-scraper scrape https://notion.so/Public-Page-abc123
# Private page with login
docs-scraper scrape https://notion.so/Private-Page-abc123 \
  -D email=user@example.com -D password=mypassword
# Custom domain
docs-scraper scrape https://docs.company.notion.site/Page-abc123
Notes:
---
Handles: Any URL not matched by other scrapers (automatic fallback)
Data fields: Dynamic - determined by Claude analyzing the page
The LLM scraper uses Claude to analyze the page HTML and detect:
Common dynamic fields:
| Field | Type | Description |
|-------|------|-------------|
| email | email | Login email (if detected) |
| password | password | Login password (if detected) |
| username | text | Username (if login uses username) |
Examples:
# Generic webpage (no auth)
docs-scraper scrape https://example.com/article
# Webpage requiring login
docs-scraper scrape https://members.example.com/article \
  -D email=user@example.com -D password=secret
# When blocked, check the job for required fields
docs-scraper jobs list
# Then retry with the fields the scraper detected
docs-scraper update abc123 -D username=myuser -D password=secret
Notes:
Requires the ANTHROPIC_API_KEY environment variable.
---
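Taken together, the "Handles" rules of the four scrapers above amount to a small routing decision. The sketch below restates those rules as a shell function for illustration only. It is not the tool's actual matching code, and the precedence between rules (for example, a DocSend URL that also ends in .pdf) is an assumption. `classify_url` is a hypothetical name.

```bash
#!/usr/bin/env bash
# Illustrative restatement of the documented "Handles" rules.
# ASSUMPTION: rule order and exact matching are guesses, not the real logic.
classify_url() {
  local url=$1 rest host path
  rest=${url#*://}               # drop the scheme, if present
  host=${rest%%/*}               # text before the first slash
  case "$rest" in
    */*) path="/${rest#*/}" ;;   # text after the host
    *)   path="/" ;;
  esac
  path=${path%%\?*}              # ignore any query string

  case "$path" in
    *.pdf) echo "DirectPdf"; return ;;
  esac
  case "$host" in
    docsend.com|*.docsend.com)
      case "$path" in
        /view/*|/v/*) echo "DocSend"; return ;;
      esac ;;
    notion.so|*.notion.site)
      echo "Notion"; return ;;
  esac
  echo "LLM Fallback"            # anything not matched above
}
```

Knowing the likely scraper in advance indicates which -D fields are worth supplying on the first attempt.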
| Scraper | email | password | name | Other |
|---------|-------|----------|------|-------|
| DirectPdf | - | - | - | - |
| DocSend | ✓ | ✓ | ✓ | - |
| Notion | ✓ | ✓ | - | - |
| LLM Fallback | ✓ | ✓ | - | Dynamic* |
*Fields detected dynamically from page analysis
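The matrix above can be turned into a pre-check that catches a misplaced -D field before a scrape is attempted. This is a sketch under the assumption that the table is complete. `check_fields` is a hypothetical helper, not a docs-scraper command, and the tool itself may simply ignore unknown fields.

```bash
#!/usr/bin/env bash
# Hypothetical pre-check based on the field matrix; not part of docs-scraper.
# Usage: check_fields <scraper> <field>...
# Returns 1 on the first field the matrix does not list for that scraper,
# 2 for an unknown scraper name, 0 otherwise.
check_fields() {
  local scraper=$1 allowed field
  shift
  case "$scraper" in
    DirectPdf)      allowed="" ;;
    DocSend)        allowed="email password name" ;;
    Notion)         allowed="email password" ;;
    "LLM Fallback") return 0 ;;   # fields are dynamic, nothing to check
    *) echo "unknown scraper: $scraper" >&2; return 2 ;;
  esac
  for field in "$@"; do
    case " $allowed " in
      *" $field "*) ;;            # field is listed for this scraper
      *) echo "$scraper does not accept -D $field" >&2; return 1 ;;
    esac
  done
}

# Possible usage:
#   check_fields DocSend email name && echo "fields look fine"
```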
Only needed for LLM fallback scraper:
export ANTHROPIC_API_KEY=your_key
Optional browser settings:
export BROWSER_HEADLESS=true # Set false for debugging
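Because the key matters only when a URL falls through to the LLM fallback scraper, a script may want to check for it up front rather than discover the problem mid-run. A minimal sketch follows. `require_api_key` is a hypothetical helper, and how docs-scraper itself reports a missing key is not documented here.

```bash
#!/usr/bin/env bash
# Hypothetical pre-flight check; not part of docs-scraper.
# Returns non-zero, with a message on stderr, when the key is unset or empty.
require_api_key() {
  if [ -z "${ANTHROPIC_API_KEY:-}" ]; then
    echo "ANTHROPIC_API_KEY is not set; the LLM fallback scraper needs it" >&2
    return 1
  fi
}

# Possible usage:
#   require_api_key && docs-scraper scrape https://example.com/article
```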
Archive a Notion page:
docs-scraper scrape https://notion.so/My-Page-abc123
Download protected DocSend:
docs-scraper scrape https://docsend.com/view/xxx
# If blocked:
docs-scraper update <job-id> -D email=user@example.com -D password=1234
Batch scraping with profiles:
docs-scraper scrape https://site.com/doc1 -p mysite
docs-scraper scrape https://site.com/doc2 -p mysite
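For more than a handful of URLs, the same pattern can be driven from a file. The sketch below assumes one URL per line, with blank lines and lines starting with # skipped. `scrape_batch` and the `DOCS_SCRAPER` override variable are hypothetical names introduced here for illustration. The override exists only so the loop can be dry-run with `echo`.

```bash
#!/usr/bin/env bash
# Hypothetical wrapper: scrape every URL listed in a file with one profile.
# Usage: scrape_batch <profile-name> <url-file>
# DOCS_SCRAPER (an assumption of this sketch, not a docs-scraper setting)
# lets the command be swapped out, e.g. DOCS_SCRAPER=echo for a dry run.
scrape_batch() {
  local profile=$1 url_file=$2 url
  while IFS= read -r url || [ -n "$url" ]; do
    case "$url" in
      ''|'#'*) continue ;;       # skip blank lines and comments
    esac
    # </dev/null keeps the command from consuming the rest of the URL list
    "${DOCS_SCRAPER:-docs-scraper}" scrape "$url" -p "$profile" </dev/null \
      || echo "failed: $url" >&2
  done < "$url_file"
}

# Possible usage:
#   scrape_batch mysite urls.txt
```

A failed URL is reported on stderr and the loop continues, so one blocked document does not stop the rest of the batch.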
Success: Local file path (e.g., ~/.docs-scraper/output/1706123456-abc123.pdf)
Blocked: Job ID + required credential types
Restart the daemon: docs-scraper daemon stop && docs-scraper daemon start
Use docs-scraper jobs list to check pending jobs
Use docs-scraper cleanup to remove old PDFs
After installing Links to PDFs, you can say the following to the AI to trigger it:
Help me get started with Links to PDFs
Explains what Links to PDFs does, walks through the setup, and runs a quick demo based on your current project
Use Links to PDFs to scrape documents from Notion, DocSend, PDFs, and other sources into...
Invokes Links to PDFs with the right parameters and returns the result directly in the conversation
What can I do with Links to PDFs in my documents & notes workflow?
Lists the top use cases for Links to PDFs, with example commands for each scenario
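Place the skill folder in ~/.claude/skills/links-to-pdfs/ (personal level, available to all projects) or .claude/skills/links-to-pdfs/ (project level). After restarting the AI client, invoke it explicitly with /links-to-pdfs, or let the AI discover and use it automatically based on context.
Links to PDFs supports Claude, Cursor, and OpenClaw, and integrates with these AI platforms to extend their capabilities.
Links to PDFs is free to install and use. Check the repository for license information.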
Links to PDFs belongs to the "Documents & Notes" category. Skills in this category help AI agents perform specialized tasks in this domain.
Automate my documents & notes tasks using Links to PDFs
Identifies repetitive steps in your workflow and sets up Links to PDFs to handle them automatically