Name: article-exporter
Author: actionbook

Article Exporter - Export Articles to Obsidian

Version: 0.5.0 | Last Updated: 2026-03-13

You are an expert at web content archiving and Obsidian workflow automation.

Lessons from Failed Exports

These rules were extracted from real export failures. Each one prevents a specific class of error:

Twitter/X needs AI reformatting — fetch returns flat text because Twitter uses custom UI without semantic HTML. The AI reformatting step reconstructs headings, lists, and code blocks. See references/twitter-handling.md.
Ask for output path first — users have different vault locations. Assuming a default creates files in the wrong place and wastes time moving them.
Check actionbook version >= 0.9.1 — the --wait-hint parameter was added in 0.9.1. Without it, dynamic content (SPAs, lazy-loaded pages) returns empty or partial results.
Wait after navigation — use --wait-hint heavy for Twitter, Medium, and other dynamic sites. Without it, the page hasn't finished rendering when content is extracted.
Rate limit batch exports — 3-5s delay between requests prevents being flagged as a bot (ToS compliance).

Quick Reference

| Task | Command | Success Criteria | |------|---------|------------------| | Check deps | actionbook --version | Shows version >= 0.9.1 | | Fetch article | actionbook browser fetch <url> --wait-hint heavy | Returns plain text (AI reformats to Markdown in Step 1b) | | Translate | AI session directly | README_CN.md created | | Open in Obsidian | obsidian-cli open "path/index.md" | File opens in Obsidian |

Complete Export Workflow

Goal: Export web article to Obsidian directory with images and optional translation

Success criteria:

Article directory created with README.md
All images downloaded to images/
index.md navigation file created
Optional: README_CN.md translation
Opened in Obsidian (if obsidian-cli available)

Step 1: Fetch Article Content

Execution: Direct (Bash)

# Fetch article as readability text (with log cleaning)
actionbook browser fetch "$URL" --wait-hint heavy 2>/dev/null | \
  sed '/^[[:space:]]*$/d;/^\x1b\[/d;/^INFO/d' > /tmp/article_raw.txt

Success criteria:

/tmp/article_raw.txt exists and size > 0 bytes
Content contains the article's main text

The fetch command returns readability-extracted plain text (not Markdown). AI reformatting in Step 1b is always needed to produce proper Markdown.

Rules:

Use --wait-hint heavy for Twitter, Medium, dynamic content
Use --wait-hint light for static blogs
2>/dev/null suppresses stderr logs
sed removes ANSI codes, INFO lines, empty lines

Twitter/X Special Handling

Twitter uses non-semantic HTML, so fetch output loses all structure (headings become flat text, code blocks disappear). If the URL contains x.com or twitter.com, pay extra attention to structure reconstruction in Step 1b. See references/twitter-handling.md.

Step 1b: AI Reformat to Markdown

Execution: Direct (AI session)

Read /tmp/article_raw.txt and convert the plain text into well-structured Markdown. Save the result to /tmp/article.md.

Reformatting rules:

Reconstruct headings (#, ##, ###) from the text structure
Preserve original image URLs as ![alt](url) references
Format code blocks, lists, tables, and blockquotes
Keep the original article title as the first # H1 heading

Success criteria:

/tmp/article.md exists and starts with # <Title>
Image URLs are preserved as Markdown image syntax

Step 2: Extract Metadata

Execution: Direct (Bash)

# Extract title (first H1 heading from AI-reformatted markdown)
TITLE=$(grep -m 1 "^# " /tmp/article.md | sed 's/^# //')

# Extract image URLs (filter out data: URLs)
IMAGE_URLS=$(grep -o '!\[[^]]*\]([^)]*)' /tmp/article.md | \
    sed -E 's/!\[[^]]*\]\(([^)]*)\)/\1/' | \
    grep -v '^data:')

Success criteria:

$TITLE is non-empty
$IMAGE_URLS count matches expected (use wc -l)

Step 3: Ask Output Directory

Execution: [human] Human checkpoint: Confirm output location before creating files

Ask user: "Where should I save the exported article?"

Suggested paths:

~/Work/Write/Articles (default)
~/Documents/Obsidian/Articles
~/Notes/Imported
(or custom path from $output_dir argument)

Success criteria: User confirms output directory

Artifacts: $OUTPUT_DIR variable set

Step 4: Create Directory Structure

Execution: Direct (Bash)

# Use argument if provided, otherwise use confirmed path
OUTPUT_DIR="${output_dir:-$USER_CONFIRMED_PATH}"

# Sanitize title for directory name
SAFE_TITLE=$(echo "$TITLE" | sed 's/[/:*?"<>|]//g' | cut -c1-100 | sed 's/^[[:space:]]*//;s/[[:space:]]*$//')

# Create output directory
ARTICLE_DIR="$OUTPUT_DIR/$SAFE_TITLE"
mkdir -p "$ARTICLE_DIR/images"

Success criteria:

Directory $ARTICLE_DIR exists
Subdirectory images/ exists
Directory is writable

Rules:

Remove special characters: / : * ? " < > |
Limit title length to 100 characters
Trim leading/trailing whitespace

Step 5: Download Images (Parallel if possible)

Execution: Direct (Bash)

counter=1
for url in $IMAGE_URLS; do
    ext=$(echo "$url" | grep -oE '\.(jpg|jpeg|png|gif|webp|svg)' || echo ".jpg")
    curl -L -s "$url" -o "$ARTICLE_DIR/images/image_${counter}${ext}"

    # Check file size (detect 0-byte failures)
    if [ ! -s "$ARTICLE_DIR/images/image_${counter}${ext}" ]; then
        # Try alternative format (Twitter)
        curl -L -s "${url}?format=jpg&name=orig" -o "$ARTICLE_DIR/images/image_${counter}.jpg"
    fi

    counter=$((counter + 1))
done

Success criteria:

All image files exist and size > 0 bytes
File count matches $IMAGE_URLS count

Rules:

Use curl -L to follow redirects
Check file size after download
Try alternative formats for Twitter images

Step 6: Update Image References

Execution: Direct (Bash)

# Replace remote URLs with local paths
counter=1
for url in $IMAGE_URLS; do
    ext=$(echo "$url" | grep -oE '\.(jpg|jpeg|png|gif|webp|svg)' || echo ".jpg")
    sed -i.bak "s|$url|./images/image_${counter}${ext}|g" /tmp/article.md
    counter=$((counter + 1))
done

# Save updated markdown
cp /tmp/article.md "$ARTICLE_DIR/README.md"
rm /tmp/article.md.bak

Success criteria:

README.md contains ./images/image_N.* references
No remote URLs remain in image links

Step 7: AI Translation (Optional)

Execution: Direct (AI session)

Human checkpoint: Ask user: "Do you want to translate the article? (y/n)"

If yes:

Read $ARTICLE_DIR/README.md
Translate using AI capabilities (no external API)
Write to $ARTICLE_DIR/README_CN.md (or other language code)

Translation Prompt Template:

Translate the following Markdown article to [LANGUAGE] while preserving:
- All Markdown formatting (headings, lists, code blocks, tables)
- Image references exactly as-is: ![alt](./images/image_N.*)
- Links and URLs unchanged
- Code blocks and technical terms in original language

Only output the translated Markdown content.

---
[Paste README.md content]

Success criteria: Translation file exists and size ≈ original ± 20%

Supported languages: en, zh, es, fr, de, ja, ko

Step 8: Create Navigation Index

Execution: Direct (Bash)

# Auto-detect source from URL
case "$URL" in
    *x.com*|*twitter.com*) SOURCE="X" ;;
    *medium.com*) SOURCE="Medium" ;;
    *dev.to*) SOURCE="Dev.to" ;;
    *openai.com*) SOURCE="OpenAI Blog" ;;
    *substack.com*) SOURCE="Substack" ;;
    *github.com*) SOURCE="GitHub" ;;
    *) SOURCE=$(echo "$URL" | sed 's|https\?://||' | cut -d/ -f1) ;;
esac

# Create index.md
cat > "$ARTICLE_DIR/index.md" <<EOF
# $TITLE

> **Export Date**: $(date +%Y-%m-%d)
> **Original URL**: $URL
> **Source**: $SOURCE

## 📚 Language Versions

- 🇬🇧 **English**: [[README]]
- 🇨🇳 **Chinese**: [[README_CN]]  <!-- if translated -->

## 📊 Metadata

| Property | Value |
|----------|-------|
| **Source** | $SOURCE |
| **Images** | $(ls images/ 2>/dev/null | wc -l) images |
| **Export Tool** | actionbook CLI |
| **Export Date** | $(date +%Y-%m-%d) |

---

**Exported using**: actionbook browser automation + AI assistant
EOF

Success criteria: index.md exists with metadata table

Step 9: Open in Obsidian

Execution: Direct (Bash)

if command -v obsidian-cli &> /dev/null; then
    VAULT_ROOT="$OUTPUT_DIR"
    REL_PATH=$(echo "$ARTICLE_DIR" | sed "s|$VAULT_ROOT/||")
    obsidian-cli open "$REL_PATH/index.md"
    echo "✓ Opened in Obsidian: $REL_PATH/index.md"
else
    # Fallback: Open in file manager
    case "$(uname)" in
        Darwin)  open "$ARTICLE_DIR" ;;
        Linux)   xdg-open "$ARTICLE_DIR" ;;
        CYGWIN*|MINGW*|MSYS*) start "$ARTICLE_DIR" ;;
    esac
    echo "⚠️  Install obsidian-cli for automatic opening: npm install -g obsidian-cli"
fi

Success criteria:

File opens in Obsidian OR directory opens in file manager
User sees success message

Step 10: Report Success

Execution: Direct (Output)

echo ""
echo "════════════════════════════════════════════"
echo "✓ Article exported successfully!"
echo ""
echo "📁 Location: $ARTICLE_DIR"
echo "📄 Files:"
echo "     - README.md (original)"
[ -f "$ARTICLE_DIR/README_CN.md" ] && echo "     - README_CN.md (translation)"
echo "     - index.md (navigation)"
echo "🖼️  Images: $(ls images/ 2>/dev/null | wc -l) files"
echo "════════════════════════════════════════════"

Common Issues

| Issue | Cause | Solution | |-------|-------|----------| | "actionbook: command not found" | CLI not installed | npm install -g @actionbookdev/cli@latest | | "unknown flag: --wait-hint" | Version < 0.9.1 | Upgrade: npm install -g @actionbookdev/cli@latest | | Twitter format broken | fetch loses structure | Use AI reformatting (see references/twitter-handling.md) | | Images 0 bytes | URL expired | Try ?format=jpg&name=orig | | obsidian-cli not found | Not installed | npm install -g obsidian-cli | | Batch export blocked | Too fast, flagged as bot | Add 3-5s sleep between requests |

Detailed troubleshooting: See ./references/troubleshooting.md

Edge Cases Handled

Long titles → Auto-truncate to 100 chars
Special characters → Sanitized (/ : * ? " < > | removed)
No images → Steps 5-6 skip gracefully
0-byte images → Auto-retry with alternative formats
Data URLs → Filtered out in Step 2

When Using This Skill

Check dependencies first — actionbook --version >= 0.9.1
Test with one article — Verify before batch processing
Twitter/X requires special handling — See references/twitter-handling.md
Respect ToS — Personal use only, rate limit batch exports

References (Progressive Disclosure)

For detailed documentation, see:

./references/twitter-handling.md — Twitter/X special handling (AI reformatting)
./references/batch-export.md — Batch export with rate limiting
./references/troubleshooting.md — Detailed troubleshooting guide
./references/obsidian-setup.md — obsidian-cli setup and configuration
./references/supported-websites.md — Complete website compatibility list

Last Updated: 2026-03-13 | Version: 0.5.0

Article Exporter - Export Articles to Obsidian

Version: 0.5.0 | Last Updated: 2026-03-13

You are an expert at web content archiving and Obsidian workflow automation.

Lessons from Failed Exports

These rules were extracted from real export failures. Each one prevents a specific class of error:

Twitter/X needs AI reformatting — fetch returns flat text because Twitter uses custom UI without semantic HTML. The AI reformatting step reconstructs headings, lists, and code blocks. See references/twitter-handling.md.
Ask for output path first — users have different vault locations. Assuming a default creates files in the wrong place and wastes time moving them.
Check actionbook version >= 0.9.1 — the --wait-hint parameter was added in 0.9.1. Without it, dynamic content (SPAs, lazy-loaded pages) returns empty or partial results.
Wait after navigation — use --wait-hint heavy for Twitter, Medium, and other dynamic sites. Without it, the page hasn't finished rendering when content is extracted.
Rate limit batch exports — 3-5s delay between requests prevents being flagged as a bot (ToS compliance).

Quick Reference

Complete Export Workflow

Goal: Export web article to Obsidian directory with images and optional translation

Success criteria:

Article directory created with README.md
All images downloaded to images/
index.md navigation file created
Optional: README_CN.md translation
Opened in Obsidian (if obsidian-cli available)

Step 1: Fetch Article Content

Execution: Direct (Bash)

# Fetch article as readability text (with log cleaning)
actionbook browser fetch "$URL" --wait-hint heavy 2>/dev/null | \
  sed '/^[[:space:]]*$/d;/^\x1b\[/d;/^INFO/d' > /tmp/article_raw.txt

Success criteria:

/tmp/article_raw.txt exists and size > 0 bytes
Content contains the article's main text

The fetch command returns readability-extracted plain text (not Markdown). AI reformatting in Step 1b is always needed to produce proper Markdown.

Rules:

Use --wait-hint heavy for Twitter, Medium, dynamic content
Use --wait-hint light for static blogs
2>/dev/null suppresses stderr logs
sed removes ANSI codes, INFO lines, empty lines

Twitter/X Special Handling

Step 1b: AI Reformat to Markdown

Execution: Direct (AI session)

Read /tmp/article_raw.txt and convert the plain text into well-structured Markdown. Save the result to /tmp/article.md.

Reformatting rules:

Reconstruct headings (#, ##, ###) from the text structure
Preserve original image URLs as ![alt](url) references
Format code blocks, lists, tables, and blockquotes
Keep the original article title as the first # H1 heading

Success criteria:

/tmp/article.md exists and starts with # <Title>
Image URLs are preserved as Markdown image syntax

Step 2: Extract Metadata

Execution: Direct (Bash)

# Extract title (first H1 heading from AI-reformatted markdown)
TITLE=$(grep -m 1 "^# " /tmp/article.md | sed 's/^# //')

# Extract image URLs (filter out data: URLs)
IMAGE_URLS=$(grep -o '!\[[^]]*\]([^)]*)' /tmp/article.md | \
    sed -E 's/!\[[^]]*\]\(([^)]*)\)/\1/' | \
    grep -v '^data:')

Success criteria:

$TITLE is non-empty
$IMAGE_URLS count matches expected (use wc -l)

Step 3: Ask Output Directory

Execution: [human] Human checkpoint: Confirm output location before creating files

Ask user: "Where should I save the exported article?"

Suggested paths:

~/Work/Write/Articles (default)
~/Documents/Obsidian/Articles
~/Notes/Imported
(or custom path from $output_dir argument)

Success criteria: User confirms output directory

Artifacts: $OUTPUT_DIR variable set

Step 4: Create Directory Structure

Execution: Direct (Bash)

# Use argument if provided, otherwise use confirmed path
OUTPUT_DIR="${output_dir:-$USER_CONFIRMED_PATH}"

# Sanitize title for directory name
SAFE_TITLE=$(echo "$TITLE" | sed 's/[/:*?"<>|]//g' | cut -c1-100 | sed 's/^[[:space:]]*//;s/[[:space:]]*$//')

# Create output directory
ARTICLE_DIR="$OUTPUT_DIR/$SAFE_TITLE"
mkdir -p "$ARTICLE_DIR/images"

Success criteria:

Directory $ARTICLE_DIR exists
Subdirectory images/ exists
Directory is writable

Rules:

Remove special characters: / : * ? " < > |
Limit title length to 100 characters
Trim leading/trailing whitespace

Step 5: Download Images (Parallel if possible)

Execution: Direct (Bash)

counter=1
for url in $IMAGE_URLS; do
    ext=$(echo "$url" | grep -oE '\.(jpg|jpeg|png|gif|webp|svg)' || echo ".jpg")
    curl -L -s "$url" -o "$ARTICLE_DIR/images/image_${counter}${ext}"

    # Check file size (detect 0-byte failures)
    if [ ! -s "$ARTICLE_DIR/images/image_${counter}${ext}" ]; then
        # Try alternative format (Twitter)
        curl -L -s "${url}?format=jpg&name=orig" -o "$ARTICLE_DIR/images/image_${counter}.jpg"
    fi

    counter=$((counter + 1))
done

Success criteria:

All image files exist and size > 0 bytes
File count matches $IMAGE_URLS count

Rules:

Use curl -L to follow redirects
Check file size after download
Try alternative formats for Twitter images

Step 6: Update Image References

Execution: Direct (Bash)

# Replace remote URLs with local paths
counter=1
for url in $IMAGE_URLS; do
    ext=$(echo "$url" | grep -oE '\.(jpg|jpeg|png|gif|webp|svg)' || echo ".jpg")
    sed -i.bak "s|$url|./images/image_${counter}${ext}|g" /tmp/article.md
    counter=$((counter + 1))
done

# Save updated markdown
cp /tmp/article.md "$ARTICLE_DIR/README.md"
rm /tmp/article.md.bak

Success criteria:

README.md contains ./images/image_N.* references
No remote URLs remain in image links

Step 7: AI Translation (Optional)

Execution: Direct (AI session)

Human checkpoint: Ask user: "Do you want to translate the article? (y/n)"

If yes:

Read $ARTICLE_DIR/README.md
Translate using AI capabilities (no external API)
Write to $ARTICLE_DIR/README_CN.md (or other language code)

Translation Prompt Template:

Translate the following Markdown article to [LANGUAGE] while preserving:
- All Markdown formatting (headings, lists, code blocks, tables)
- Image references exactly as-is: ![alt](./images/image_N.*)
- Links and URLs unchanged
- Code blocks and technical terms in original language

Only output the translated Markdown content.

---
[Paste README.md content]

Success criteria: Translation file exists and size ≈ original ± 20%

Supported languages: en, zh, es, fr, de, ja, ko

Step 8: Create Navigation Index

Execution: Direct (Bash)

# Auto-detect source from URL
case "$URL" in
    *x.com*|*twitter.com*) SOURCE="X" ;;
    *medium.com*) SOURCE="Medium" ;;
    *dev.to*) SOURCE="Dev.to" ;;
    *openai.com*) SOURCE="OpenAI Blog" ;;
    *substack.com*) SOURCE="Substack" ;;
    *github.com*) SOURCE="GitHub" ;;
    *) SOURCE=$(echo "$URL" | sed 's|https\?://||' | cut -d/ -f1) ;;
esac

# Create index.md
cat > "$ARTICLE_DIR/index.md" <<EOF
# $TITLE

> **Export Date**: $(date +%Y-%m-%d)
> **Original URL**: $URL
> **Source**: $SOURCE

## 📚 Language Versions

- 🇬🇧 **English**: [[README]]
- 🇨🇳 **Chinese**: [[README_CN]]  <!-- if translated -->

## 📊 Metadata

| Property | Value |
|----------|-------|
| **Source** | $SOURCE |
| **Images** | $(ls images/ 2>/dev/null | wc -l) images |
| **Export Tool** | actionbook CLI |
| **Export Date** | $(date +%Y-%m-%d) |

---

**Exported using**: actionbook browser automation + AI assistant
EOF

Success criteria: index.md exists with metadata table

Step 9: Open in Obsidian

Execution: Direct (Bash)

if command -v obsidian-cli &> /dev/null; then
    VAULT_ROOT="$OUTPUT_DIR"
    REL_PATH=$(echo "$ARTICLE_DIR" | sed "s|$VAULT_ROOT/||")
    obsidian-cli open "$REL_PATH/index.md"
    echo "✓ Opened in Obsidian: $REL_PATH/index.md"
else
    # Fallback: Open in file manager
    case "$(uname)" in
        Darwin)  open "$ARTICLE_DIR" ;;
        Linux)   xdg-open "$ARTICLE_DIR" ;;
        CYGWIN*|MINGW*|MSYS*) start "$ARTICLE_DIR" ;;
    esac
    echo "⚠️  Install obsidian-cli for automatic opening: npm install -g obsidian-cli"
fi

Success criteria:

File opens in Obsidian OR directory opens in file manager
User sees success message

Step 10: Report Success

Execution: Direct (Output)

echo ""
echo "════════════════════════════════════════════"
echo "✓ Article exported successfully!"
echo ""
echo "📁 Location: $ARTICLE_DIR"
echo "📄 Files:"
echo "     - README.md (original)"
[ -f "$ARTICLE_DIR/README_CN.md" ] && echo "     - README_CN.md (translation)"
echo "     - index.md (navigation)"
echo "🖼️  Images: $(ls images/ 2>/dev/null | wc -l) files"
echo "════════════════════════════════════════════"

Common Issues

Detailed troubleshooting: See ./references/troubleshooting.md

Edge Cases Handled

Long titles → Auto-truncate to 100 chars
Special characters → Sanitized (/ : * ? " < > | removed)
No images → Steps 5-6 skip gracefully
0-byte images → Auto-retry with alternative formats
Data URLs → Filtered out in Step 2

When Using This Skill

Check dependencies first — actionbook --version >= 0.9.1
Test with one article — Verify before batch processing
Twitter/X requires special handling — See references/twitter-handling.md
Respect ToS — Personal use only, rate limit batch exports

References (Progressive Disclosure)

For detailed documentation, see:

./references/twitter-handling.md — Twitter/X special handling (AI reformatting)
./references/batch-export.md — Batch export with rate limiting
./references/troubleshooting.md — Detailed troubleshooting guide
./references/obsidian-setup.md — obsidian-cli setup and configuration
./references/supported-websites.md — Complete website compatibility list

Last Updated: 2026-03-13 | Version: 0.5.0

Adoption

actionbook/article-exporter

$ install --global

Security Scan Results

SKILL.md

Article Exporter - Export Articles to Obsidian

Lessons from Failed Exports

Quick Reference

Complete Export Workflow

Step 1: Fetch Article Content

Step 1b: AI Reformat to Markdown

Step 2: Extract Metadata

Step 3: Ask Output Directory

Step 4: Create Directory Structure

Step 5: Download Images (Parallel if possible)

Step 6: Update Image References

Step 7: AI Translation (Optional)

Step 8: Create Navigation Index

Step 9: Open in Obsidian

Step 10: Report Success

Common Issues

Edge Cases Handled

When Using This Skill

References (Progressive Disclosure)

Related Skills

actionbook/actionbook

actionbook/extract

actionbook/active-research

actionbook/rust-learner

actionbook/article-exporter

$ install --global

Security Scan Results

SKILL.md

Article Exporter - Export Articles to Obsidian

Lessons from Failed Exports

Quick Reference

Complete Export Workflow

Step 1: Fetch Article Content

Step 1b: AI Reformat to Markdown

Step 2: Extract Metadata

Step 3: Ask Output Directory

Step 4: Create Directory Structure

Step 5: Download Images (Parallel if possible)

Step 6: Update Image References

Step 7: AI Translation (Optional)

Step 8: Create Navigation Index

Step 9: Open in Obsidian

Step 10: Report Success

Common Issues

Edge Cases Handled

When Using This Skill

References (Progressive Disclosure)

Related Skills

actionbook/actionbook

actionbook/extract

actionbook/active-research

actionbook/rust-learner