skills/25-HosungYou-Diverga/skills/b2/SKILL.md
VS-Enhanced Evidence Quality Appraiser: prevents mode collapse through context-adaptive quality assessment. The enhanced VS 3-phase process avoids automatic tool application and delivers research-specific evaluation strategies.
Use when: appraising study quality, assessing risk of bias, grading evidence.
Triggers: quality appraisal, RoB, GRADE, Newcastle-Ottawa, risk of bias, methodological quality.
npx skillsauth add brycewang-stanford/Awesome-Agent-Skills-for-Empirical-Research b2
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
diverga_check_prerequisites("b2") → must return approved: true
If not approved → AskUserQuestion for each missing checkpoint (see .claude/references/checkpoint-templates.md)
diverga_mark_checkpoint("CP_QUALITY_REVIEW", decision, rationale)
Read .research/decision-log.yaml directly to verify prerequisites. Conversation history is a last resort.
- Agent ID: 06
- Category: B - Literature & Evidence
- VS Level: Enhanced (3-Phase)
- Tier: Core
- Icon: 🔬
Systematically evaluates methodological quality and risk of bias in individual studies. Selects and applies appropriate assessment tools based on study design type.
Applies VS-Research methodology to go beyond mechanical tool application, providing differentiated quality evaluation strategies tailored to research context and purpose.
Purpose: Recognize limitations of mechanical tool application
⚠️ **Modal Warning**: The following are the most predictable quality assessment approaches:
| Modal Approach | T-Score | Limitation |
|----------------|---------|------------|
| "RCT → Apply RoB 2.0" | 0.90 | Automatic matching ignoring context |
| "Observational → Apply NOS" | 0.88 | Ignores tool limitations |
| "Report GRADE rating only" | 0.85 | Rating rationale unclear |
➡️ Tool application is baseline. Proceeding with context-adaptive assessment.
Purpose: Present evaluation approaches suited to research purpose and context
**Direction A** (T ≈ 0.7): Standard tool + contextual interpretation
- Standard tool application + domain-specific weighting
- Suitable for: General systematic reviews
**Direction B** (T ≈ 0.4): Multi-tool triangulation
- Simultaneous application of multiple tools + discrepancy analysis
- Additional field-specific quality criteria
- Suitable for: Methodology papers, high-quality reviews
**Direction C** (T < 0.3): Purpose-specific evaluation
- Differentiated criteria by meta-analysis purpose
- Propose new evaluation dimensions (reproducibility, transparency)
- Suitable for: Methodological innovation, guideline development
Based on selected evaluation strategy:
T > 0.8 (Modal - Supplementation Required):
├── Study type → Standard tool automatic matching
├── Yes/No per checklist item
├── Report only total score or rating
└── Judgment rationale unclear
T 0.5-0.8 (Established - Add Interpretation):
├── Specific rationale per domain
├── Interpret meaning in research context
├── Meta-analysis inclusion/exclusion recommendation
└── Sensitivity analysis necessity determination
T 0.3-0.5 (In-depth - Recommended):
├── Multi-tool triangulation
├── Additional field-specific criteria
├── Quality-effect size relationship analysis
└── Rating uncertainty quantification
T < 0.3 (Innovative - For Leading Research):
├── Propose new evaluation dimensions
├── Critical discussion of tool limitations
├── Purpose-specific evaluation framework
└── Quality assessment uncertainty propagation
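The tier boundaries above can be sketched as a small lookup. This is a minimal illustration, not part of the skill itself: the function name `t_score_tier` is hypothetical, T is assumed to lie in [0, 1], and because the listed ranges share their endpoints, the sketch assumes that 0.8 and 0.5 belong to "Established" and 0.3 to "In-depth".

```python
def t_score_tier(t: float) -> str:
    """Map a T-score to the tier names used in the list above.

    Assumptions (not stated in the source): T lies in [0, 1], and
    shared boundary values fall into the higher-T tier that lists them.
    """
    if not 0.0 <= t <= 1.0:
        raise ValueError(f"T-score must be in [0, 1], got {t}")
    if t > 0.8:
        return "Modal"        # supplementation required
    if t >= 0.5:
        return "Established"  # add interpretation
    if t >= 0.3:
        return "In-depth"     # recommended
    return "Innovative"       # for leading research


if __name__ == "__main__":
    for score in (0.90, 0.70, 0.40, 0.20):
        print(score, t_score_tier(score))
```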
Study Type-Specific Tool Selection
Risk of Bias Assessment
GRADE Certainty Rating
Quality Summary Visualization
| Domain | Assessment Content |
|--------|-------------------|
| D1 | Bias arising from randomization process |
| D2 | Bias due to deviations from intended interventions |
| D3 | Bias due to missing outcome data |
| D4 | Bias in measurement of outcome |
| D5 | Bias in selection of reported result |
Judgment: Low risk / Some concerns / High risk
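One simplified way to roll the five domain judgments up into an overall judgment is the worst-domain rule sketched below. The function name `rob2_overall` and the domain keys are hypothetical. The sketch deliberately leaves out one case: an assessor may also rate a study "High risk" when several domains carry "Some concerns" that together substantially lower confidence, and that remains a reviewer judgment rather than something to automate.

```python
from typing import Mapping

LOW, SOME, HIGH = "Low risk", "Some concerns", "High risk"
DOMAINS = ("D1", "D2", "D3", "D4", "D5")


def rob2_overall(judgments: Mapping[str, str]) -> str:
    """Return the worst domain-level judgment as the overall judgment.

    Simplification: escalation to "High risk" based on several
    "Some concerns" domains is left to the human assessor.
    """
    missing = [d for d in DOMAINS if d not in judgments]
    if missing:
        raise ValueError(f"Missing domain judgments: {missing}")
    values = [judgments[d] for d in DOMAINS]
    for value in values:
        if value not in (LOW, SOME, HIGH):
            raise ValueError(f"Unknown judgment: {value!r}")
    if HIGH in values:
        return HIGH
    if SOME in values:
        return SOME
    return LOW


if __name__ == "__main__":
    example = {"D1": LOW, "D2": SOME, "D3": LOW, "D4": LOW, "D5": LOW}
    print(rob2_overall(example))
```

Keeping the per-domain rationale alongside these labels matters more than the roll-up itself, since the skill treats a bare overall rating as the modal, least informative output.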
| Domain | Items | Points |
|--------|-------|--------|
| Selection | Representativeness of exposed cohort | ★ |
| | Selection of non-exposed cohort | ★ |
| | Ascertainment of exposure | ★ |
| | Demonstration outcome not present at start | ★ |
| Comparability | Comparability of cohorts | ★★ |
| Outcome | Assessment of outcome | ★ |
| | Adequate follow-up length | ★ |
| | Adequacy of follow-up | ★ |
Total Score: /9 points
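The star tally can be sketched as follows, with the per-domain maxima taken from the table above (Selection 4, Comparability 2, Outcome 3). The function name `nos_total` and the dictionary layout are assumptions made for illustration.

```python
from typing import Mapping

# Maximum stars per domain, as listed in the table above.
NOS_MAX = {"Selection": 4, "Comparability": 2, "Outcome": 3}


def nos_total(stars: Mapping[str, int]) -> int:
    """Sum awarded stars after checking each domain against its maximum."""
    total = 0
    for domain, maximum in NOS_MAX.items():
        awarded = stars.get(domain, 0)
        if not 0 <= awarded <= maximum:
            raise ValueError(
                f"{domain}: {awarded} stars is outside 0..{maximum}"
            )
        total += awarded
    return total


if __name__ == "__main__":
    study = {"Selection": 3, "Comparability": 2, "Outcome": 2}
    print(f"Total Score: {nos_total(study)}/9 points")
```

Reporting the three domain subtotals next to the total keeps the result from collapsing into the "total score only" pattern that the skill flags as modal.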
Required:
- study_type: "RCT, cohort, case-control, qualitative, etc."
- study_information: "Methods section or full paper"
Optional:
- assessment_tool: "If specific tool preferred"
- assessment_purpose: "Meta-analysis, guideline development, etc."
## Study Quality Assessment Report
### 1. Study Information
- Authors: [Author names]
- Year: [Publication year]
- Study Type: [Design type]
- Applied Tool: [Assessment tool name]
### 2. Risk of Bias Assessment (RCT Example)
| Domain | Judgment | Rationale |
|--------|----------|-----------|
| D1: Randomization process | 🟢/🟡/🔴 | [Specific rationale] |
| D2: Deviations from interventions | 🟢/🟡/🔴 | [Specific rationale] |
| D3: Missing outcome data | 🟢/🟡/🔴 | [Specific rationale] |
| D4: Outcome measurement | 🟢/🟡/🔴 | [Specific rationale] |
| D5: Selection of reported result | 🟢/🟡/🔴 | [Specific rationale] |
**Overall Judgment**: [Low risk / Some concerns / High risk]
### 3. Quality Assessment Summary
**Key Strengths:**
1. [Strength 1]
2. [Strength 2]
**Key Weaknesses:**
1. [Weakness 1]
2. [Weakness 2]
### 4. Evidence Utilization Recommendations
- Meta-analysis inclusion: [Recommended/Caution needed/Exclude recommended]
- Sensitivity analysis: [Needed/Not needed]
- Interpretation caveats: [Specific cautions]
### 5. GRADE Assessment (If Applicable)
| Factor | Assessment | Impact |
|--------|------------|--------|
| Study design | | |
| Risk of bias | | ↓ |
| Inconsistency | | |
| Indirectness | | |
| Imprecision | | |
| Publication bias | | |
**Certainty Rating**: ⊕⊕⊕⊕ High / ⊕⊕⊕◯ Moderate / ⊕⊕◯◯ Low / ⊕◯◯◯ Very Low
You are a research quality assessment expert.
Please evaluate the methodological quality of the following study:
[Study Type]: {study_type}
[Study Information]: {study_info}
Tasks to perform:
[For RCT - Cochrane RoB 2.0]
1. Bias arising from randomization process
2. Bias due to deviations from intended interventions
3. Bias due to missing outcome data
4. Bias in measurement of outcome
5. Bias in selection of reported result
→ Overall judgment: Low / Some concerns / High
[For Observational - Newcastle-Ottawa Scale]
1. Selection - 4 points
2. Comparability - 2 points
3. Outcome/Exposure - 3 points
→ Total: /9
[For Qualitative - CASP]
1. Clear research aim
2. Appropriate qualitative methodology
3. Appropriate research design
... (10 items)
Final output:
- Quality assessment summary table
- Key strengths and weaknesses
- Evidence utilization caveats
| Factor | Criteria | Downgrade |
|--------|----------|-----------|
| Risk of bias | Serious limitations | -1 or -2 |
| Inconsistency | I² > 75%, CI non-overlap | -1 or -2 |
| Indirectness | PICO mismatch | -1 or -2 |
| Imprecision | OIS not met, wide CI | -1 or -2 |
| Publication bias | Funnel plot asymmetry | -1 |
| Factor | Criteria | Upgrade |
|--------|----------|---------|
| Large effect size | RR > 2 or < 0.5 | +1 |
| Dose-response | Clear gradient | +1 |
| Confounding | Acts toward reducing effect | +1 |
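The downgrade and upgrade factors combine into a certainty rating roughly as sketched here. The sketch assumes the usual GRADE starting points (randomized evidence starts at High, observational evidence at Low), which the tables above do not state explicitly. The function name `grade_certainty` and the factor keys are hypothetical, and the arithmetic is a bookkeeping aid, not a substitute for the judgment behind each factor.

```python
from typing import Mapping, Optional

LEVELS = ("Very Low", "Low", "Moderate", "High")

# Maximum levels per factor, following the two tables above.
DOWNGRADE_MAX = {
    "risk_of_bias": 2,
    "inconsistency": 2,
    "indirectness": 2,
    "imprecision": 2,
    "publication_bias": 1,
}
UPGRADE_MAX = {"large_effect": 1, "dose_response": 1, "confounding": 1}


def _total(levels: Mapping[str, int], limits: Mapping[str, int]) -> int:
    """Validate each factor against its limit and sum the levels."""
    total = 0
    for factor, value in levels.items():
        if factor not in limits:
            raise ValueError(f"Unknown factor: {factor!r}")
        if not 0 <= value <= limits[factor]:
            raise ValueError(f"{factor}: {value} is outside 0..{limits[factor]}")
        total += value
    return total


def grade_certainty(
    randomized: bool,
    downgrades: Optional[Mapping[str, int]] = None,
    upgrades: Optional[Mapping[str, int]] = None,
) -> str:
    """Start at High (randomized) or Low (observational), then adjust and clamp."""
    start = 3 if randomized else 1
    score = start - _total(downgrades or {}, DOWNGRADE_MAX)
    score += _total(upgrades or {}, UPGRADE_MAX)
    return LEVELS[max(0, min(3, score))]


if __name__ == "__main__":
    print(grade_certainty(True, {"risk_of_bias": 1}))
    print(grade_certainty(False, upgrades={"large_effect": 1}))
```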
| Check | Rule | Alert |
|-------|------|-------|
| F-to-t consistency | F(1, df) = t^2 | Error if >5% deviation |
| Standardization detection | "standardized" in measure | Critical flag |
| Pre-test as outcome | Pre-test used as ES | REJECT |
| Missing correlation | Gain score needs r_pre_post | Warning |
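The F-to-t consistency rule relies on the identity F(1, df) = t² for a two-group comparison. A minimal sketch of the check follows. The source gives the 5% threshold but not how deviation is measured, so the sketch assumes relative deviation from t². The function names are hypothetical.

```python
def f_to_t_deviation(f_value: float, t_value: float) -> float:
    """Relative deviation of a reported F(1, df) from t squared."""
    expected = t_value ** 2
    if expected == 0:
        raise ValueError("t must be non-zero to compute a relative deviation")
    return abs(f_value - expected) / expected


def check_f_to_t(f_value: float, t_value: float, tolerance: float = 0.05) -> str:
    """Return "Error" when the deviation exceeds the tolerance, else "OK"."""
    if f_to_t_deviation(f_value, t_value) > tolerance:
        return "Error"
    return "OK"


if __name__ == "__main__":
    print(check_f_to_t(4.1, 2.0))  # deviation 2.5%
    print(check_f_to_t(5.0, 2.0))  # deviation 25%
```

Small deviations are expected from rounding in published tables, which is presumably why the rule tolerates 5% rather than demanding exact equality.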
| Rating | Criteria |
|--------|----------|
| HIGH | Reported g with n, verified calculation |
| MEDIUM | Calculated from M/SD, needs verification |
| LOW | Estimated from t/F/p, high uncertainty |
| UNACCEPTABLE | Pre-test as outcome, missing key data |
extraction_quality_checklist:
- item: "Source verification"
check: "ES matches original paper values"
required: true
- item: "Calculation verification"
check: "d-to-g conversion within tolerance"
required: true
- item: "Independence check"
check: "No pre-test as outcome"
required: true
- item: "Design classification"
check: "Between/within/mixed correctly identified"
required: true
- item: "Dependency documentation"
check: "Multiple ES from same study flagged"
required: true
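The "Calculation verification" item can be illustrated with the common small-sample correction for a two-group between-subjects design, g = J · d with J = 1 − 3 / (4·df − 1) and df = n1 + n2 − 2. The checklist does not say which formula or tolerance it uses, so both are assumptions in this sketch: the 0.01 absolute tolerance is hypothetical, as are the function names.

```python
def hedges_g(d: float, n1: int, n2: int) -> float:
    """Convert Cohen's d to Hedges' g with the approximate correction J."""
    df = n1 + n2 - 2
    if df < 1:
        raise ValueError("Need at least three participants in total")
    correction = 1.0 - 3.0 / (4.0 * df - 1.0)
    return correction * d


def g_within_tolerance(
    reported_g: float, d: float, n1: int, n2: int, tolerance: float = 0.01
) -> bool:
    """Check a reported g against the recomputed value.

    The tolerance is an assumed absolute threshold, not a value
    taken from the checklist.
    """
    return abs(reported_g - hedges_g(d, n1, n2)) <= tolerance


if __name__ == "__main__":
    print(round(hedges_g(0.5, 10, 10), 4))
    print(g_within_tolerance(0.48, 0.5, 10, 10))
```

Within-subject and mixed designs need different degrees of freedom, which is one reason the checklist pairs this item with the design classification check.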
| Mechanism | Application Timing | Usage Example |
|-----------|-------------------|---------------|
| Forced Analogy | Phase 2 | Apply quality criteria from other fields by analogy |
| Iterative Loop | Phase 2 | 4-round divergence-convergence for strategy refinement |
| Semantic Distance | Phase 2 | Discover new evaluation dimensions beyond existing tools |
Applied Checkpoints:
- CP-INIT-002: Select creativity level
- CP-VS-001: Select quality assessment direction (multiple)
- CP-VS-003: Final assessment strategy satisfaction confirmation
- CP-SD-001: Concept combination distance threshold
../../research-coordinator/core/vs-engine.md
../../research-coordinator/core/t-score-dynamic.md
../../research-coordinator/creativity/forced-analogy.md
../../research-coordinator/creativity/iterative-loop.md
../../research-coordinator/creativity/semantic-distance.md
../../research-coordinator/interaction/user-checkpoints.md
../../research-coordinator/references/creativity-mechanisms.md
../../research-coordinator/core/project-state.md
../../research-coordinator/core/pipeline-templates.md
../../research-coordinator/core/integration-hub.md
../../research-coordinator/core/guided-wizard.md
../../research-coordinator/core/auto-documentation.md