mohitagw15856/substack-notes-scraper

Name: substack-notes-scraper
Author: mohitagw15856

skills/substack-notes-scraper/SKILL.md

npx skillsauth add mohitagw15856/pm-claude-skills substack-notes-scraper

Substack Notes Scraper

Substack has no public API for Notes analytics. You can't see likes, comments, and restacks in one place without scrolling through your feed manually. This skill scrapes the rendered Notes page, filters to only your original content, and exports everything to a spreadsheet you can actually analyze.

Credit: Originally created by a Substack newsletter author — adapted and extended for this library.

Required Inputs

| Input | Format | Example |
|---|---|---|
| Notes URL | Full URL to the Notes tab | https://substack.com/@handle/notes |
| Author handle or name | Exact handle or display name | @handle or Jane Smith |
| Date range | Plain English or explicit range | last 30 days or Jan 2026 – Mar 2026 |

Claude will ask for these if not provided upfront.

Output Structure

File

substack-notes-[handle]-[YYYY-MM-DD].xlsx

Sheet: "Notes Data"

| Column | Description |
|---|---|
| Date | Publication date (YYYY-MM-DD) |
| Text Preview | First 200 characters of the note |
| Full Text | Complete note text |
| Likes | Like count at time of scrape |
| Comments | Comment count |
| Restacks | Restack count |
| Total Engagement | Likes + Comments + Restacks |
| Link | Direct URL to the note |
| Note Type | original or restack |

Formatting applied:

Row 1: frozen header row
Auto-filter enabled on all columns
Top 20% by Likes column: highlighted yellow (#FFF2CC)
Column widths: auto-fit to content, min 12, max 60

Sheet: "Summary"

Scrape Date:         [YYYY-MM-DD HH:MM UTC]
Author:              [handle]
Date Range:          [start] – [end]
Total Notes:         [n]
Original Notes:      [n]
Restacks Filtered:   [n]

Avg Likes:           [n.n]
Avg Comments:        [n.n]
Avg Restacks:        [n.n]
Avg Total Eng:       [n.n]

Best Note (Likes):   [date] — [first 80 chars] — [n] likes
Best Note (Eng):     [date] — [first 80 chars] — [n] total engagement
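
As a rough illustration of how the Summary figures could be derived, the sketch below averages engagement over original notes only and counts restacks separately. The `summarize` function and the dictionary keys (`type`, `likes`, `comments`, `restacks`) are assumptions made for this example, not names defined by the skill.

```python
def summarize(notes):
    """Average engagement over original notes only; restacks are just counted."""
    originals = [n for n in notes if n["type"] == "original"]
    count = len(originals)

    def avg(key):
        # Guard against an empty selection to avoid dividing by zero.
        return round(sum(n[key] for n in originals) / count, 1) if count else 0.0

    return {
        "Total Notes": len(notes),
        "Original Notes": count,
        "Restacks Filtered": len(notes) - count,
        "Avg Likes": avg("likes"),
        "Avg Comments": avg("comments"),
        "Avg Restacks": avg("restacks"),
    }
```

The returned dictionary maps directly onto the key-value layout shown above, so it can be written to the Summary sheet one pair per row.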

Instructions for Claude

Step 1: Validate inputs

Confirm the three required inputs are present. If any are missing, ask before proceeding. Parse the date range into a concrete start date and end date (convert relative ranges like "last 30 days" to explicit dates using today's date).
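
A minimal sketch of the relative-range conversion, assuming the only relative form to handle is "last N days" and that the window ends on today's date. The function name `parse_relative_range` is hypothetical, and other phrasings (such as "Jan 2026 – Mar 2026") would need their own handling.

```python
import re
from datetime import date, timedelta


def parse_relative_range(text, today):
    """Convert 'last N days' into an explicit (start, end) pair of dates."""
    match = re.fullmatch(r"last\s+(\d+)\s+days?", text.strip().lower())
    if match is None:
        # Anything else should be clarified with the user rather than guessed.
        raise ValueError(f"unsupported date range: {text!r}")
    days = int(match.group(1))
    return today - timedelta(days=days), today
```

For example, `parse_relative_range("last 30 days", date(2026, 3, 31))` yields a window from 2026-03-01 to 2026-03-31.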

Step 2: Fetch the Notes page

Use WebFetch to load the Notes URL. Substack Notes pages are JavaScript-rendered — request the full rendered HTML. If WebFetch returns a skeleton page without note content, note this in your response and ask the user to paste the page HTML manually or confirm browser access is available.

Step 3: Paginate through all notes in the date window

Substack Notes load incrementally. Repeat fetching or scrolling until either:

A note's date falls outside the target date range (stop loading more), or
No new content loads on the next request.

Rate-limit: wait 2 seconds between each paginated request. Do not hammer the endpoint.
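
The stopping rules and the 2-second delay can be sketched as a loop. This assumes notes arrive newest first and that a hypothetical `fetch_page(cursor)` callable returns a list of note dictionaries plus the next cursor (or `None`). How pages are actually fetched depends on the tool available and is not specified by the skill.

```python
import time


def collect_notes(fetch_page, start, delay=2.0, sleep=time.sleep):
    """Page through notes until one predates `start` or nothing new loads."""
    collected, cursor, first = [], None, True
    while True:
        if not first:
            sleep(delay)  # rate limit between paginated requests
        first = False
        notes, cursor = fetch_page(cursor)
        if not notes:
            break  # no new content loaded
        collected.extend(notes)
        if any(n["date"] < start for n in notes):
            break  # reached notes older than the target window
        if cursor is None:
            break  # no further page to request
    return collected
```

The last page may still contain notes older than the window, so the date filter in Step 5 is still needed afterwards.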

Step 4: Parse each note

For every note element found on the page, extract:

Date: the timestamp on the note (convert to YYYY-MM-DD)
Author: the display name or handle shown on the note
Full text: complete body text, stripping HTML tags
Text preview: first 200 characters of full text
Likes count: the number shown on the like/heart counter
Comments count: the number shown on the comment counter
Restacks count: the number shown on the restack counter
Link: the direct permalink to the note
Note type: original if the author matches the specified author; restack if it belongs to someone else
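
A few of these fields can be sketched with the standard library alone. The helper names below (`strip_tags`, `text_preview`, `classify_note`) are illustrative. The author comparison assumes that ignoring case and a leading "@" is acceptable, which may not hold when the user supplies a display name and the page shows a handle.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects only the text nodes of an HTML fragment."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def strip_tags(html):
    """Return the body text of an HTML fragment with all tags removed."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return "".join(parser.parts).strip()


def text_preview(full_text, limit=200):
    """First `limit` characters of the full text."""
    return full_text[:limit]


def classify_note(note_author, target_author):
    """'original' when the note's author matches the target, else 'restack'."""
    def norm(value):
        return value.strip().lstrip("@").casefold()
    return "original" if norm(note_author) == norm(target_author) else "restack"
```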

Step 5: Filter

Keep ALL rows in the data (restacks included as rows with Note Type = restack). The Summary sheet stats should count only original notes. Mark restacks clearly so the user can filter them out themselves in Excel if preferred.

Apply date filter: exclude any note outside the specified date range.
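
A minimal sketch of the date filter, assuming each note is a dictionary whose `date` value is a `datetime.date` and that both ends of the range are inclusive. The function name is hypothetical.

```python
from datetime import date


def filter_by_date(notes, start, end):
    """Keep notes whose date lies in [start, end], inclusive on both ends."""
    return [n for n in notes if start <= n["date"] <= end]
```

Restacks pass through this filter like any other row; only their Note Type value distinguishes them later.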

Step 6: Calculate Total Engagement

For each row: Total Engagement = Likes + Comments + Restacks

Step 7: Identify top 20% by Likes

Sort original notes by Likes descending. Mark the top 20% (round up) for conditional formatting. These rows will be highlighted yellow in the output file.
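
One way to select the rows to highlight is sketched below. It uses integer arithmetic for the round-up, because a floating-point product such as `15 * 0.2` evaluates to slightly more than 3 and would round up to 4. The function name and the `type` and `likes` keys are assumptions for this example.

```python
def top_liked_originals(notes, percent=20):
    """Return the top `percent` of original notes by likes, rounding the count up."""
    originals = [n for n in notes if n["type"] == "original"]
    ranked = sorted(originals, key=lambda n: n["likes"], reverse=True)
    keep = (len(ranked) * percent + 99) // 100  # integer ceiling of len * percent / 100
    return ranked[:keep]
```

With six original notes, 20% is 1.2, so two rows are marked; restacks are never highlighted regardless of their like count.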

Step 8: Build the .xlsx file

Use Python with openpyxl to generate the file. Structure:

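# Required libraries
from openpyxl import Workbook
from openpyxl.styles import Font, PatternFill
from openpyxl.utils import get_column_letter

def build_workbook(headers, rows, top_rows, summary, path):
    # Illustrative helper: rows match headers; top_rows holds 0-based indexes of top-20% rows
    wb = Workbook()
    # Sheet 1: Notes Data
    ws = wb.active
    ws.title = "Notes Data"
    ws.append(headers)
    for cell in ws[1]:
        cell.font = Font(bold=True)  # bold header row
    ws.freeze_panes = "A2"  # freeze row 1
    for row in rows:
        ws.append(row)
    ws.auto_filter.ref = ws.dimensions  # auto-filter on all columns
    yellow = PatternFill(start_color="FFF2CC", end_color="FFF2CC", fill_type="solid")
    for i in top_rows:  # yellow fill for top-20% rows by likes
        for cell in ws[i + 2]:
            cell.fill = yellow
    for col in ws.columns:  # auto-size columns, clamped to min 12 and max 60
        width = max(len(str(c.value or "")) for c in col)
        ws.column_dimensions[get_column_letter(col[0].column)].width = min(max(width + 2, 12), 60)
    # Sheet 2: Summary
    ss = wb.create_sheet("Summary")
    for key, value in summary.items():  # key-value pairs, no table format
        ss.append([key, value])
    wb.save(path)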

Name the file substack-notes-[handle]-[YYYY-MM-DD].xlsx using today's date.
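
The naming pattern could be produced as follows. Dropping the leading "@" and replacing characters outside letters, digits, "_" and "-" with hyphens are assumptions made here to keep the name filesystem-safe; the skill itself only specifies the pattern.

```python
import re
from datetime import date


def output_filename(handle, today):
    """Build substack-notes-[handle]-[YYYY-MM-DD].xlsx from a handle and a date."""
    safe = re.sub(r"[^A-Za-z0-9_-]+", "-", handle.lstrip("@")).strip("-")
    return f"substack-notes-{safe}-{today.isoformat()}.xlsx"
```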

Step 9: Report back

After generating the file, report:

File path
Total notes found, original vs. restacks
Date range actually covered
Top 3 notes by total engagement (date + preview + stats)
Any notes or warnings (e.g., page didn't fully load, some dates were ambiguous)

Quality Checks

[ ] All three required inputs were confirmed before starting
[ ] Rate limiting honored: 2-second delay between paginated requests
[ ] Author filter applied correctly — restacks are included as rows but flagged, not silently dropped
[ ] Date range filter applied — no notes outside the window appear in the data
[ ] Total Engagement column is Likes + Comments + Restacks (not hardcoded)
[ ] Top 20% highlight is based on the actual data distribution, not a fixed threshold
[ ] Header row is frozen and auto-filter is active
[ ] Summary sheet stats reference only original notes, not restacks
[ ] File is named with the author handle and today's date
[ ] If the page failed to load properly, the user was told — not silently given an empty file

Anti-Patterns

[ ] Do not proceed without a valid Substack handle or profile URL — scraping without a specific target cannot be completed
[ ] Do not ignore rate-limit responses from Substack — implement backoff and reduce request frequency before retrying
[ ] Do not export data without conditional formatting and summary stats — raw data without visualisation is not the expected output
[ ] Do not attempt to access private or subscriber-only notes — this skill is for public Notes content only
[ ] Do not produce output without a clear date range filter — undated exports make trend analysis impossible
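
The backoff mentioned above might look like the sketch below. It assumes a rate-limit response surfaces as HTTP status 429, which the skill does not state, and that a hypothetical `fetch()` callable returns a `(status, body)` pair.

```python
import time


def fetch_with_backoff(fetch, max_attempts=4, base_delay=2.0, sleep=time.sleep):
    """Call fetch() until it returns a non-429 status, doubling the wait each time."""
    delay = base_delay
    for attempt in range(max_attempts):
        status, body = fetch()
        if status != 429:
            return status, body
        if attempt < max_attempts - 1:
            sleep(delay)  # wait before retrying
            delay *= 2    # reduce request frequency on each further retry
    return status, body  # still rate-limited after the final attempt
```

If the final attempt is still rate-limited, the caller receives the 429 result and can report it to the user instead of producing an empty file.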

Example Trigger Phrases

"Scrape my Substack Notes and export to Excel — my handle is @handle, last 60 days"
"Use the substack-notes-scraper skill on https://substack.com/@handle/notes for Q1 2026"
"Pull my notes engagement data into a spreadsheet"
"Export my Substack Notes stats with likes and restacks — author: Jane Smith, Jan–Mar 2026"
"Run the Substack scraper on my notes page and show me which posts performed best"

Scrapes a Substack Notes page and exports engagement data to a formatted .xlsx file. Use when asked to download, analyse, or export Substack Notes performance data including likes, comments, and restacks. Produces a formatted spreadsheet with conditional formatting, summary stats, and per-note engagement metrics.

979 stars

development

Updated Jun 18, 2026

npx skillsauth add mohitagw15856/pm-claude-skills substack-notes-scraper

Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.

Security Scan Results

3 of 9 scanners reported clean

Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.

| Status | Scanner | Description | Percentage |
|---|---|---|---|
| Clean | Trivy | Container and dependency vulnerability scanner | 95% |
| Clean | Semgrep | Static code analysis for vulnerabilities | 95% |
| Clean | mcp-scan (Snyk) | Model Context Protocol security validation | 95% |
| Skipped | Snyk (dep) | Open source security scanning | 50% |
| Skipped | Socket.dev | Supply chain security analysis | 50% |
| Skipped | VirusTotal | Multi-engine malware detection | 50% |
| Skipped | CrowdStrike | Advanced threat intelligence | 50% |
| Skipped | OSV-Scanner | Open Source Vulnerability database check | 50% |
| Skipped | OWASP Dep-Check | | 50% |

Last scanned: Jun 19, 2026, 2:48 AM · 183.2s · 1 file scanned

Install manually

# Clone the repo
git clone https://github.com/mohitagw15856/pm-claude-skills.git

# Copy into Claude Code skills folder (global)
cp -r pm-claude-skills/skills/substack-notes-scraper ~/.claude/skills/

Claude Code Skills — official skills path docs.

Repository

mohitagw15856/pm-claude-skills

979 stars

Compatible with

Claude Code

OpenAI Codex CLI

ChatGPT