skills/05-kthorn-research-superpower/research/answering-research-questions/SKILL.md
<!-- ╔══════════════════════════════════════════════════════════════╗ ║ 本文件为开源 Skill 原始文档,收录仅供学习与研究参考 ║ ║ CoPaper.AI 收集整理 | https://copaper.ai ║ ╚══════════════════════════════════════════════════════════════╝ 来源仓库: https://github.com/kthorn/research-superpower 项目名称: research-superpower 开源协议: MIT License 收录日期: 2026-04-02 声明: 本文件版权归原作者所有。此处收录旨在为社会科学实证研究者 提供 AI Agent Skills 的集中参考。如有侵权,请联系删除。 --> --- name: Answering Research Questions description: Ma
npx skillsauth add brycewang-stanford/Awesome-Agent-Skills-for-Empirical-Research skills/05-kthorn-research-superpower/research/answering-research-questionsInstall this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
3 of 9 scanners reported clean
Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.
Orchestrate the complete research workflow from query to findings.
Core principle: Systematic, trackable, comprehensive. Search → Evaluate → Traverse → Synthesize.
Announce at start: "I'm using the Answering Research Questions skill to find [specific data] about [topic]."
Extract from user's request:
Keywords:
Data types needed:
Constraints:
Ask clarifying questions if needed:
Propose folder name:
research-sessions/YYYY-MM-DD-brief-description/
Example: research-sessions/2025-10-11-btk-inhibitor-selectivity/
Show proposal to user:
📁 Creating research folder: research-sessions/2025-10-11-btk-inhibitor-selectivity/
Proceed? (y/n)
Create folder structure:
mkdir -p "research-sessions/YYYY-MM-DD-description"/{papers,citations}
Initialize files:
Core files (always create these):
papers-reviewed.json:
{}
citations/citation-graph.json:
{}
SUMMARY.md:
# Research Query: [User's question]
**Started:** YYYY-MM-DD HH:MM
**Keywords:** keyword1, keyword2, keyword3
**Data types sought:** IC50 values, selectivity data, synthesis methods
---
## Highly Relevant Papers (Score ≥ 8)
Papers scored using `evaluating-paper-relevance` skill:
- Score 0-10 based on: Keywords (0-3) + Data type (0-4) + Specificity (0-3)
- Score ≥ 8: Highly relevant with significant data
- Score 7: Relevant with useful data
- Score 5-6: Possibly relevant
- Score < 5: Not relevant
(Papers will be added here as found)
Example format:
### [Paper Title](https://doi.org/10.1234/example)
**DOI:** [10.1234/example](https://doi.org/10.1234/example) | **PMID:** [12345678](https://pubmed.ncbi.nlm.nih.gov/12345678/)
---
## Relevant Papers (Score 7)
(Papers will be added here as found)
---
## Possibly Relevant Papers (Score 5-6)
(Noted for potential follow-up)
---
## Search Progress
- Initial PubMed search: X results
- Papers reviewed: Y
- Papers with relevant data: Z
- Citations followed: N
---
## Key Findings
(Synthesized findings will be added as research progresses)
CRITICAL: Always use clickable markdown links for DOIs and PMIDs
Auxiliary files (for large searches >100 papers):
See evaluating-paper-relevance skill for guidance on when to create:
For small searches (<50 papers), stick to core files only. For large searches (>100 papers), auxiliary files add significant organizational value.
Use searching-literature skill:
initial-search-results.jsonUse evaluating-paper-relevance skill:
For each paper:
CRITICAL: Add every paper to papers-reviewed.json regardless of score. This prevents re-review and tracks complete search history.
Report progress for EVERY paper:
📄 [15/100] Screening: "Paper Title"
Abstract score: 8 → Fetching full text...
✓ Found IC50 data for 8 compounds
→ Added to SUMMARY.md
📄 [16/100] Screening: "Another Paper"
Abstract score: 3 → Skipping (not relevant)
📄 [17/100] Screening: "Third Paper"
Abstract score: 7 → Relevant, adding to queue...
Every 10 papers, give summary update
Use traversing-citations skill:
For papers scoring ≥ 7:
Report progress:
🔗 Following citations from highly relevant paper
→ Found 12 relevant references
→ Found 8 relevant citing papers
→ Adding 20 papers to queue
Check after:
Ask user:
⏸️ Checkpoint: Reviewed 50 papers, found 12 relevant
Papers with data: 7
Continue searching? (y/n/summary)
Options:
y - Continue processingn - Stop and finalizesummary - Show current findings, then decideWhen stopping (user says no or queue empty):
Option A: Manual synthesis (small research sessions)
## Key Findings Summary
### IC50 Values for BTK Inhibitors
- Compound A: 12 nM (Smith et al., 2023)
- Compound B: 45 nM (Doe et al., 2024)
- [More compounds...]
### Selectivity Data
- Compound A shows >80-fold selectivity vs other kinases
- Tested against panel of 50 kinases (Jones et al., 2023)
### Synthesis Methods
- Lead compounds synthesized via [method]
- Yields: 30-45%
- Full protocols in [papers]
### Gaps Identified
- No data on selectivity vs [specific kinase]
- Limited in vivo data
- Few papers on resistance mechanisms
Option B: Script-based synthesis (large research sessions >50 papers)
For large research sessions, consider creating a synthesis script:
create generate_summary.py:
evaluated-papers.json from helper scriptsBenefits:
Final report:
✅ Research complete!
📊 Summary:
- Papers reviewed: 127
- Relevant papers: 18
- Highly relevant: 7
- Data extracted: IC50 values for 45 compounds, selectivity data, synthesis methods
📁 All findings in: research-sessions/2025-10-11-btk-inhibitor-selectivity/
- SUMMARY.md (organized findings)
- papers/ (14 PDFs + supplementary data)
- papers-reviewed.json (complete tracking)
CRITICAL: Always consolidate findings at the end
Filter papers-reviewed.json to extract only relevant papers (score ≥ 7):
# Read papers-reviewed.json
with open('papers-reviewed.json') as f:
all_papers = json.load(f)
# Filter for relevant papers (score >= 7)
relevant_papers = {
doi: data for doi, data in all_papers.items()
if data.get('score', 0) >= 7
}
# Save to relevant-papers.json
with open('relevant-papers.json', 'w') as f:
json.dump(relevant_papers, f, indent=2)
Format:
{
"10.1234/example1.2023": {
"pmid": "12345678",
"title": "Paper title",
"status": "highly_relevant",
"score": 9,
"source": "pubmed_search",
"timestamp": "2025-10-11T16:00:00Z",
"found_data": ["IC50 values", "synthesis methods"],
"chembl_id": "CHEMBL1234567"
},
"10.1234/example2.2023": {
"pmid": "23456789",
"title": "Another paper",
"status": "relevant",
"score": 7,
"source": "forward_citation",
"timestamp": "2025-10-11T16:15:00Z",
"found_data": ["MIC data"]
}
}
Add these sections to the TOP of existing SUMMARY.md (before paper listings):
# Research Query: [User's question]
**Date:** 2025-10-11
**Duration:** 2h 15m
**Status:** Complete
---
## Search Strategy
**Keywords:** BTK, Bruton tyrosine kinase, inhibitor, selectivity, off-target, kinase panel, IC50
**Data types sought:** IC50 values, selectivity data, kinase panel screening
**Constraints:** None (open date range)
**PubMed Query:**
("BTK" OR "Bruton tyrosine kinase") AND (inhibitor OR "kinase inhibitor") AND (selectivity OR "off-target")
---
## Screening Methodology
**Rubric:** Abstract scoring (0-10)
- Key terms: +3 pts each (or Keywords 0-3, Data type 0-4, Specificity 0-3 if using old rubric)
- Relevant terms: +1 pt each
- Threshold: ≥7 = relevant
**Sources:**
- Initial PubMed search
- Forward/backward citations via Semantic Scholar
---
## Results Statistics
**Papers Screened:**
- Total reviewed: 127 papers
- Highly relevant (≥8): 12 papers
- Relevant (7): 18 papers
- Possibly relevant (5-6): 23 papers
- Not relevant (<5): 74 papers
**Data Extracted:**
- IC50 values: 45 compounds across 12 papers
- Selectivity data: 8 papers with kinase panel screening
- Full text obtained: 18/30 relevant papers (60%)
**Citation Traversal:**
- Papers with citations followed: 7
- References screened: 45 papers
- Citing papers screened: 38 papers
- Relevant papers found via citations: 8 papers
---
## Key Findings Summary
### IC50 Values for BTK Inhibitors
- Ibrutinib: 0.5 nM (Smith et al., 2023)
- Acalabrutinib: 3 nM (Doe et al., 2024)
- [Additional findings synthesized from papers below]
### Selectivity Patterns
- Most inhibitors show >50-fold selectivity vs other kinases
- Common off-targets: TEC, BMX (other TEC family kinases)
### Gaps Identified
- Limited data on selectivity vs JAK/SYK
- Few papers on resistance mechanisms
- No in vivo selectivity data found
---
## File Inventory
- `SUMMARY.md` - This file (methodology + findings)
- `relevant-papers.json` - 30 relevant papers (score ≥7)
- `papers-reviewed.json` - All 127 papers screened
- `papers/` - 18 PDFs + 5 supplementary files
- `citations/citation-graph.json` - Citation relationships
---
## Reproducibility
**To reproduce:**
1. Use PubMed query above
2. Apply screening rubric (threshold ≥7)
3. Follow citations from highly relevant papers (≥8)
4. Check Unpaywall for paywalled papers
**Software:** Research Superpowers skills v2025-10-11
---
[Existing paper listings follow below...]
## Highly Relevant Papers (Score ≥ 8)
### [Paper Title]...
Report to user:
✅ Research session complete!
📄 Consolidation complete:
1. SUMMARY.md - Enhanced with methodology, statistics, and findings
2. relevant-papers.json - 30 relevant papers (score ≥7) in JSON format
📁 All files in: research-sessions/2025-10-11-btk-inhibitor-selectivity/
- SUMMARY.md (complete: methodology + paper-by-paper findings)
- relevant-papers.json (30 relevant papers for programmatic access)
- papers-reviewed.json (127 total papers screened)
- papers/ (18 PDFs)
🔍 Quick access:
- Open SUMMARY.md for complete findings and methodology
- Use relevant-papers.json for programmatic access
💡 Optional: Clean up intermediate files?
→ Use cleaning-up-research-sessions skill to safely remove temporary files
Use TodoWrite to track these steps:
Skills used:
searching-literature - Initial PubMed searchevaluating-paper-relevance - Score and extract from paperstraversing-citations - Follow citation networksAll skills coordinate through:
papers-reviewed.json (deduplication)SUMMARY.md (findings accumulation)citation-graph.json (relationship tracking)File organization:
No results found:
API rate limiting:
Full text unavailable:
Too many results (>500):
| Phase | Skill | Output | |-------|-------|--------| | Parse | (built-in) | Keywords, data types, constraints | | Initialize | (built-in) | Folder, SUMMARY.md, tracking files | | Search | searching-literature | List of papers with metadata | | Evaluate | evaluating-paper-relevance | Scored papers, extracted findings | | Traverse | traversing-citations | Additional papers from citations | | Synthesize | (built-in) | Enhanced SUMMARY.md with methodology + findings | | Consolidate | (built-in) | relevant-papers.json (filtered to score ≥7) |
Not tracking all papers: Only adding relevant papers to papers-reviewed.json → Add EVERY paper to prevent re-review, track complete history Creating unnecessary auxiliary files for small searches: For <50 papers, stick to core files (papers-reviewed.json, SUMMARY.md, citation-graph.json). For large searches (>100 papers), auxiliary files like README.md and TOP_PRIORITY_PAPERS.md add value. Silent work: User can't see progress → Report EVERY paper, give updates every 10 Non-clickable identifiers: Plain text DOIs/PMIDs → Always use markdown links Jumping to evaluation without good search: Too narrow results → Optimize search first Not tracking papers: Re-reviewing same papers → Always use papers-reviewed.json Following all citations: Exponential explosion → Filter before traversing No checkpoints: User loses context → Report and ask every 50 papers Poor synthesis: Just list papers → Group by data type, extract key findings Batch reporting: Reporting 20 papers at once → Report each one as you go
NEVER work silently! User needs continuous feedback.
Report frequency:
📄 [N/Total] Title... Score: X)Be specific in progress reports:
Ask for clarification when needed:
Report blockers immediately:
Periodic summaries (every 10-15 papers):
📊 Progress update:
- Reviewed: 30/127 papers
- Highly relevant: 3 (scores 8-10)
- Relevant: 5 (score 7)
- Currently: Screening paper 31...
Why: User can course-correct early, knows work is happening, can stop if needed
Research session successful when:
After completing research:
development
Conduct rigorous thematic analysis (TA) of qualitative data following Braun and Clarke's (2006) six-phase framework. Use whenever the user mentions 'thematic analysis', 'TA', 'Braun and Clarke', 'qualitative coding', 'identifying themes', or asks for help analysing interviews, focus groups, open-ended survey responses, or transcripts to identify patterns. Also trigger for questions about inductive vs theoretical coding, semantic vs latent themes, essentialist vs constructionist epistemology, building a thematic map, or writing up a qualitative findings section. Covers all six phases, the four upfront analytic decisions, the 15-point quality checklist, and the five common pitfalls. Produces a Word document write-up and an annotated thematic map. Does NOT cover IPA, grounded theory, discourse analysis, conversation analysis, or narrative analysis — use a different method for those.
development
Guide users through writing a systematic literature review (SLR) following the PRISMA 2020 framework. Use this skill whenever the user mentions 'systematic review', 'systematic literature review', 'SLR', 'PRISMA', 'PRISMA 2020', 'PRISMA flow diagram', 'PRISMA checklist', or asks for help writing, structuring, or auditing a literature review that follows reporting guidelines. Also trigger when the user asks about inclusion/exclusion criteria for a review, search strategies for databases like Scopus/WoS/PubMed, study selection processes, risk of bias assessment, or narrative synthesis for a review paper. This skill covers the full PRISMA 2020 checklist (27 items), produces a Word document manuscript in strict journal article format, generates an annotated PRISMA flow diagram, and enforces APA 7th Edition referencing throughout. It does NOT cover meta-analysis or statistical pooling. By Chuah Kee Man.
testing
Performs placebo-in-time sensitivity analysis with hierarchical null model and optional Bayesian assurance. Use when checking model robustness, verifying lack of pre-intervention effects, or estimating study power.
data-ai
Fit, summarize, plot, and interpret a chosen CausalPy experiment. Use after the causal method has been selected, including when configuring PyMC/sklearn models and scale-aware custom priors.