playground/article-exporter/SKILL.md
Export any web article to a local Obsidian-ready Markdown directory. Fetches page content via actionbook CLI, downloads images locally, rewrites image references to relative paths, and optionally translates the article using AI. Produces a self-contained folder with README.md, images/, and an index.md navigation file.
npx skillsauth add actionbook/actionbook article-exporter
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
Version: 0.5.0 | Last Updated: 2026-03-13
You are an expert at web content archiving and Obsidian workflow automation.
These rules were extracted from real export failures. Each one prevents a specific class of error:
- fetch returns flat text for Twitter/X because Twitter uses a custom UI without semantic HTML. The AI reformatting step reconstructs headings, lists, and code blocks. See references/twitter-handling.md.
- The --wait-hint parameter was added in actionbook 0.9.1. Without it, dynamic content (SPAs, lazy-loaded pages) returns empty or partial results.
- Use --wait-hint heavy for Twitter, Medium, and other dynamic sites. Without it, the page hasn't finished rendering when content is extracted.

| Task | Command | Success Criteria |
|------|---------|------------------|
| Check deps | actionbook --version | Shows version >= 0.9.1 |
| Fetch article | actionbook browser fetch <url> --wait-hint heavy | Returns plain text (AI reformats to Markdown in Step 1b) |
| Translate | AI session directly | README_CN.md created |
| Open in Obsidian | obsidian-cli open "path/index.md" | File opens in Obsidian |
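The version requirement in the table can be checked from a script. The following is a minimal sketch, not part of the skill itself: version_ge is a hypothetical helper name, and the usage comment assumes that actionbook --version prints a dotted version number somewhere in its output, which this document does not confirm.

```shell
# Succeed (exit 0) when version $1 is greater than or equal to version $2.
# Compares the first three dot-separated numeric fields; missing fields count as 0.
version_ge() {
  awk -v have="$1" -v need="$2" 'BEGIN {
    split(have, h, "."); split(need, n, ".")
    for (i = 1; i <= 3; i++) {
      if ((h[i] + 0) > (n[i] + 0)) exit 0
      if ((h[i] + 0) < (n[i] + 0)) exit 1
    }
    exit 0
  }'
}

# Usage (the grep pattern is an assumption about the --version output format):
#   installed=$(actionbook --version 2>/dev/null | grep -oE '[0-9]+\.[0-9]+\.[0-9]+' | head -n 1)
#   version_ge "$installed" 0.9.1 || echo "actionbook >= 0.9.1 required" >&2
```

A numeric field-by-field comparison is used because a plain string comparison would rank 0.10.0 below 0.9.1.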
Goal: Export web article to Obsidian directory with images and optional translation
Success criteria:
Execution: Direct (Bash)
# Fetch article as readability text (with log cleaning)
actionbook browser fetch "$URL" --wait-hint heavy 2>/dev/null | \
sed '/^[[:space:]]*$/d;/^\x1b\[/d;/^INFO/d' > /tmp/article_raw.txt
Success criteria:
- /tmp/article_raw.txt exists and size > 0 bytes

The fetch command returns readability-extracted plain text (not Markdown). AI reformatting in Step 1b is always needed to produce proper Markdown.
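The log-cleaning filter and the size check can be tried out without calling the CLI. This sketch wraps them in two hypothetical helpers, clean_fetch_output and raw_fetch_ok. It passes a literal ESC byte to sed instead of \x1b, because \x1b inside a sed pattern is a GNU extension and may not work with other sed implementations.

```shell
# Drop blank lines, lines that begin with an ANSI escape sequence, and INFO log lines.
# Reads from stdin and writes the cleaned text to stdout.
clean_fetch_output() {
  esc=$(printf '\033')   # literal ESC byte
  sed "/^[[:space:]]*\$/d;/^${esc}\\[/d;/^INFO/d"
}

# Succeed when the file exists and is larger than 0 bytes.
raw_fetch_ok() {
  [ -s "$1" ]
}

# Usage:
#   actionbook browser fetch "$URL" --wait-hint heavy 2>/dev/null | clean_fetch_output > /tmp/article_raw.txt
#   raw_fetch_ok /tmp/article_raw.txt || echo "fetch returned no content" >&2
```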
Rules:
- --wait-hint heavy for Twitter, Medium, dynamic content
- --wait-hint light for static blogs
- 2>/dev/null suppresses stderr logs
- sed removes ANSI codes, INFO lines, empty lines

Twitter/X Special Handling
Twitter uses non-semantic HTML, so fetch output loses all structure (headings become flat text, code blocks disappear). If the URL contains x.com or twitter.com, pay extra attention to structure reconstruction in Step 1b. See references/twitter-handling.md.
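The choice between heavy and light can be made per URL. The helper below is a sketch under stated assumptions: wait_hint_for_url is a hypothetical name, the host list contains only the sites named in this guide, and falling back to heavy for unknown hosts is an assumption chosen because an unfinished render produces empty or partial output.

```shell
# Print a --wait-hint value for a URL.
# Hosts named in this guide as dynamic always get "heavy"; any other host gets
# the caller's default (second argument), or "heavy" when none is given.
wait_hint_for_url() {
  case "$1" in
    *://x.com/*|*://twitter.com/*|*://medium.com/*|*.medium.com/*) echo heavy ;;
    *) echo "${2:-heavy}" ;;
  esac
}

# Usage:
#   actionbook browser fetch "$URL" --wait-hint "$(wait_hint_for_url "$URL" light)"
```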
Execution: Direct (AI session)
Read /tmp/article_raw.txt and convert the plain text into well-structured Markdown. Save the result to /tmp/article.md.
Reformatting rules:
- Reconstruct headings (#, ##, ###) from the text structure
- references
- Start the document with a # H1 heading

Success criteria:
- /tmp/article.md exists and starts with # <Title>

Execution: Direct (Bash)
# Extract title (first H1 heading from AI-reformatted markdown)
TITLE=$(grep -m 1 "^# " /tmp/article.md | sed 's/^# //')
# Extract image URLs (filter out data: URLs)
IMAGE_URLS=$(grep -o '!\[[^]]*\]([^)]*)' /tmp/article.md | \
sed -E 's/!\[[^]]*\]\(([^)]*)\)/\1/' | \
grep -v '^data:')
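The extraction commands above can be wrapped in functions and run against a sample file before using them on a real article. This is a sketch: extract_title and extract_image_urls are hypothetical names, and the extra sed expression that strips an optional image title such as "Caption" is an addition that the original commands do not have.

```shell
# Print the text of the first H1 heading in a Markdown file.
extract_title() {
  grep -m 1 '^# ' "$1" | sed 's/^# //'
}

# Print one image URL per line, skipping inline data: URLs.
# The second sed expression removes a trailing quoted title, as in ![alt](url "title").
extract_image_urls() {
  grep -o '!\[[^]]*\]([^)]*)' "$1" |
    sed -E -e 's/!\[[^]]*\]\(([^)]*)\)/\1/' -e 's/[[:space:]]+"[^"]*"$//' |
    grep -v '^data:'
}

# Usage:
#   TITLE=$(extract_title /tmp/article.md)
#   IMAGE_URLS=$(extract_image_urls /tmp/article.md)
#   echo "$IMAGE_URLS" | wc -l
```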
Success criteria:
- $TITLE is non-empty
- $IMAGE_URLS count matches expected (use wc -l)

Execution: [human] Human checkpoint: Confirm output location before creating files
Ask user: "Where should I save the exported article?"
Suggested paths:
- ~/Work/Write/Articles (default)
- ~/Documents/Obsidian/Articles
- ~/Notes/Imported
- Custom path ($output_dir argument)

Success criteria: User confirms output directory
Artifacts: $OUTPUT_DIR variable set
Execution: Direct (Bash)
# Use argument if provided, otherwise use confirmed path
OUTPUT_DIR="${output_dir:-$USER_CONFIRMED_PATH}"
# Sanitize title for directory name
SAFE_TITLE=$(echo "$TITLE" | tr -d '/:*?"<>|' | cut -c1-100 | sed 's/^[[:space:]]*//;s/[[:space:]]*$//')
# Create output directory
ARTICLE_DIR="$OUTPUT_DIR/$SAFE_TITLE"
mkdir -p "$ARTICLE_DIR/images"
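The sanitizing step can be isolated and tested on its own. This sketch uses a hypothetical helper name, sanitize_title, and adds one behavior that is not in the skill: a fallback name, untitled-article, for titles that become empty after sanitizing (for example a title made only of removed characters), so that the article directory never collapses into the output directory itself.

```shell
# Remove characters that are invalid in directory names, cap the length at
# 100 characters, and trim leading and trailing whitespace.
sanitize_title() {
  cleaned=$(printf '%s' "$1" | tr -d '/:*?"<>|' | cut -c1-100 |
    sed 's/^[[:space:]]*//;s/[[:space:]]*$//')
  # The fallback name is an assumption of this sketch, not part of the skill.
  printf '%s\n' "${cleaned:-untitled-article}"
}

# Usage:
#   SAFE_TITLE=$(sanitize_title "$TITLE")
#   mkdir -p "$OUTPUT_DIR/$SAFE_TITLE/images"
```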
Success criteria:
- $ARTICLE_DIR exists
- images/ exists

Rules:
- Remove special characters from the title: / : * ? " < > |

Execution: Direct (Bash)
counter=1
for url in $IMAGE_URLS; do
# Take the first matching extension; default to .jpg when the URL has none
ext=$(echo "$url" | grep -oE '\.(jpg|jpeg|png|gif|webp|svg)' | head -n 1)
ext="${ext:-.jpg}"
dest="$ARTICLE_DIR/images/image_${counter}${ext}"
curl -L -s "$url" -o "$dest"
# Check file size (detect 0-byte failures)
if [ ! -s "$dest" ]; then
# Try alternative format (Twitter); keep the same file name so the
# references rewritten in the next step still resolve
curl -L -s "${url}?format=jpg&name=orig" -o "$dest"
fi
counter=$((counter + 1))
done
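Once the loop has finished, the images directory can be checked for missing or empty downloads. The helpers below are a sketch with hypothetical names, count_images and empty_images, and they only inspect the local directory.

```shell
# Print the number of regular files in an images directory.
count_images() {
  find "$1" -type f | wc -l | tr -d ' '
}

# Print the paths of zero-byte files in an images directory, one per line.
empty_images() {
  find "$1" -type f -size 0 | sort
}

# Usage:
#   expected=$(echo "$IMAGE_URLS" | wc -l | tr -d ' ')
#   [ "$(count_images "$ARTICLE_DIR/images")" = "$expected" ] || echo "image count mismatch" >&2
#   [ -z "$(empty_images "$ARTICLE_DIR/images")" ] || echo "some downloads are empty" >&2
```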
Success criteria:
- Downloaded image count matches $IMAGE_URLS count

Rules:
- Use curl -L to follow redirects

Execution: Direct (Bash)
# Replace remote URLs with local paths
counter=1
for url in $IMAGE_URLS; do
# Take the first matching extension; default to .jpg when the URL has none
ext=$(echo "$url" | grep -oE '\.(jpg|jpeg|png|gif|webp|svg)' | head -n 1)
ext="${ext:-.jpg}"
sed -i.bak "s|$url|./images/image_${counter}${ext}|g" /tmp/article.md
counter=$((counter + 1))
done
# Save updated markdown
cp /tmp/article.md "$ARTICLE_DIR/README.md"
# -f: the backup does not exist when the article has no images
rm -f /tmp/article.md.bak
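The sed command in the loop treats each URL as a regular expression, so characters such as . can match more than intended. One alternative, sketched here with a hypothetical helper named replace_literal, swaps sed for awk's index() and substr() functions, which compare plain strings. Note that awk -v interprets backslash escapes in its values, which URLs do not normally contain.

```shell
# Replace every literal occurrence of string $2 with string $3 in file $1.
# No regular expressions are involved, so ".", "?" and "&" have no special meaning.
replace_literal() {
  [ -n "$2" ] || return 1   # an empty search string would never terminate
  awk -v from="$2" -v to="$3" '
    {
      out = ""
      while ((i = index($0, from)) > 0) {
        out = out substr($0, 1, i - 1) to
        $0 = substr($0, i + length(from))
      }
      print out $0
    }' "$1" > "$1.tmp" && mv "$1.tmp" "$1"
}

# Usage inside the loop:
#   replace_literal /tmp/article.md "$url" "./images/image_${counter}${ext}"
```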
Success criteria:
- README.md contains ./images/image_N.* references

Execution: Direct (AI session)
Human checkpoint: Ask user: "Do you want to translate the article? (y/n)"
If yes:
- Read $ARTICLE_DIR/README.md
- Write the translation to $ARTICLE_DIR/README_CN.md (or other language code)

Translation Prompt Template:
Translate the following Markdown article to [LANGUAGE] while preserving:
- All Markdown formatting (headings, lists, code blocks, tables)
- Image references exactly as-is
- Links and URLs unchanged
- Code blocks and technical terms in original language
Only output the translated Markdown content.
---
[Paste README.md content]
Success criteria: Translation file exists and size ≈ original ± 20%
Supported languages: en, zh, es, fr, de, ja, ko
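The size criterion for the translation can be checked mechanically. This is a sketch that reads the criterion as a byte-size comparison using integer arithmetic; size_within_20pct is a hypothetical helper name.

```shell
# Succeed when both files exist, are non-empty, and the byte size of $2
# is within 20% of the byte size of $1.
size_within_20pct() {
  [ -s "$1" ] && [ -s "$2" ] || return 1
  orig=$(wc -c < "$1" | tr -d ' ')
  trans=$(wc -c < "$2" | tr -d ' ')
  diff=$((trans - orig))
  [ "$diff" -lt 0 ] && diff=$((-diff))
  # |trans - orig| / orig <= 0.20, rewritten to avoid fractions
  [ $((diff * 100)) -le $((orig * 20)) ]
}

# Usage:
#   size_within_20pct "$ARTICLE_DIR/README.md" "$ARTICLE_DIR/README_CN.md" ||
#     echo "translation size differs from the original by more than 20%" >&2
```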
Execution: Direct (Bash)
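# Auto-detect source from the URL's host. Matching on the host avoids
# false positives such as netflix.com matching *x.com*
HOST=$(echo "$URL" | sed -E 's|^https?://||' | cut -d/ -f1)
case "$HOST" in
x.com|*.x.com|twitter.com|*.twitter.com) SOURCE="X" ;;
medium.com|*.medium.com) SOURCE="Medium" ;;
dev.to) SOURCE="Dev.to" ;;
openai.com|*.openai.com) SOURCE="OpenAI Blog" ;;
substack.com|*.substack.com) SOURCE="Substack" ;;
github.com|*.github.com) SOURCE="GitHub" ;;
*) SOURCE="$HOST" ;;
esac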
# Create index.md
cat > "$ARTICLE_DIR/index.md" <<EOF
# $TITLE
> **Export Date**: $(date +%Y-%m-%d)
> **Original URL**: $URL
> **Source**: $SOURCE
## 📚 Language Versions
- 🇬🇧 **English**: [[README]]
- 🇨🇳 **Chinese**: [[README_CN]] <!-- if translated -->
## 📊 Metadata
| Property | Value |
|----------|-------|
| **Source** | $SOURCE |
| **Images** | $(ls "$ARTICLE_DIR/images/" 2>/dev/null | wc -l | tr -d ' ') images |
| **Export Tool** | actionbook CLI |
| **Export Date** | $(date +%Y-%m-%d) |
---
**Exported using**: actionbook browser automation + AI assistant
EOF
Success criteria: index.md exists with metadata table
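The template marks the Chinese link with an "if translated" comment but always writes it. One way to make the link conditional is sketched below; language_links is a hypothetical helper, and only the README_CN.md case from this guide is covered.

```shell
# Print the "Language Versions" bullets for an export directory.
# The Chinese entry appears only when README_CN.md is present.
language_links() {
  echo '- 🇬🇧 **English**: [[README]]'
  if [ -f "$1/README_CN.md" ]; then
    echo '- 🇨🇳 **Chinese**: [[README_CN]]'
  fi
}

# Usage inside the unquoted heredoc that writes index.md:
#   ## 📚 Language Versions
#   $(language_links "$ARTICLE_DIR")
```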
Execution: Direct (Bash)
if command -v obsidian-cli &> /dev/null; then
VAULT_ROOT="$OUTPUT_DIR"
REL_PATH=$(echo "$ARTICLE_DIR" | sed "s|$VAULT_ROOT/||")
obsidian-cli open "$REL_PATH/index.md"
echo "✓ Opened in Obsidian: $REL_PATH/index.md"
else
# Fallback: Open in file manager
case "$(uname)" in
Darwin) open "$ARTICLE_DIR" ;;
Linux) xdg-open "$ARTICLE_DIR" ;;
CYGWIN*|MINGW*|MSYS*) start "$ARTICLE_DIR" ;;
esac
echo "⚠️ Install obsidian-cli for automatic opening: npm install -g obsidian-cli"
fi
Success criteria:
Execution: Direct (Output)
echo ""
echo "════════════════════════════════════════════"
echo "✓ Article exported successfully!"
echo ""
echo "📁 Location: $ARTICLE_DIR"
echo "📄 Files:"
echo " - README.md (original)"
[ -f "$ARTICLE_DIR/README_CN.md" ] && echo " - README_CN.md (translation)"
echo " - index.md (navigation)"
echo "🖼️ Images: $(ls "$ARTICLE_DIR/images/" 2>/dev/null | wc -l | tr -d ' ') files"
echo "════════════════════════════════════════════"
| Issue | Cause | Solution |
|-------|-------|----------|
| "actionbook: command not found" | CLI not installed | npm install -g @actionbookdev/cli@latest |
| "unknown flag: --wait-hint" | Version < 0.9.1 | Upgrade: npm install -g @actionbookdev/cli@latest |
| Twitter format broken | fetch loses structure | Use AI reformatting (see references/twitter-handling.md) |
| Images 0 bytes | URL expired | Try ?format=jpg&name=orig |
| obsidian-cli not found | Not installed | npm install -g obsidian-cli |
| Batch export blocked | Too fast, flagged as bot | Add 3-5s sleep between requests |
Detailed troubleshooting: See ./references/troubleshooting.md
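The 3-5 second pause recommended for batch exports can be sketched as a small loop. Both helper names are hypothetical, and the export command is passed in by the caller because this guide does not define a single export entry point; references/batch-export.md may describe a different approach.

```shell
# Print a whole number of seconds between 3 and 5 inclusive.
pick_delay() {
  awk 'BEGIN { srand(); print 3 + int(rand() * 3) }'
}

# Run a command once per URL, pausing between requests.
# $1 is the command to run (for example a script that exports one article);
# the remaining arguments are the URLs.
batch_export() {
  export_cmd=$1
  shift
  while [ "$#" -gt 0 ]; do
    "$export_cmd" "$1" || echo "export failed: $1" >&2
    shift
    # Pause only when more URLs remain
    if [ "$#" -gt 0 ]; then
      sleep "$(pick_delay)"
    fi
  done
}

# Usage (export_one is a hypothetical function or script):
#   batch_export export_one "https://example.com/a" "https://example.com/b"
```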
- Title sanitized for use as a directory name (/ : * ? " < > | removed)
- actionbook --version >= 0.9.1

For detailed documentation, see:
- ./references/twitter-handling.md: Twitter/X special handling (AI reformatting)
- ./references/batch-export.md: Batch export with rate limiting
- ./references/troubleshooting.md: Detailed troubleshooting guide
- ./references/obsidian-setup.md: obsidian-cli setup and configuration
- ./references/supported-websites.md: Complete website compatibility list