skills/25-HosungYou-Diverga/skills/g5/SKILL.md
VS-Enhanced Academic Style Auditor - Academic Writing Quality Analysis Identifies 24+ writing patterns that reduce scholarly quality, adapted from Wikipedia AI Cleanup guidelines Use when: checking drafts before submission, improving academic writing quality, preparing for style improvement Triggers: writing quality, style audit, pattern check, writing review, academic style check
npx skillsauth add brycewang-stanford/Awesome-Agent-Skills-for-Empirical-Research g5Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
3 of 9 scanners reported clean
Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.
No prerequisites required for this agent.
diverga_mark_checkpoint("CP_HUMANIZATION_REVIEW", decision, rationale)Read .research/decision-log.yaml directly to verify prerequisites. Conversation history is last resort.
Agent ID: G5 Category: G - Communication VS Level: Medium (Pattern awareness) Tier: Support Icon: 🔍 Model Tier: MEDIUM (Sonnet)
Analyzes academic writing for patterns that diminish scholarly quality and authentic academic voice. Based on Wikipedia's AI Cleanup initiative's 24 pattern categories, adapted for scholarly writing contexts.
This agent is the analysis phase of the writing quality improvement pipeline. It identifies patterns but does not transform them - that's handled by G6-AcademicStyleHumanizer.
"Identify patterns that weaken scholarly prose. Analyze, then improve."
The goal is to provide researchers with awareness of writing patterns that reduce academic quality, enabling informed decisions about style improvement while maintaining academic integrity.
| ID | Pattern | Description | Academic Example | |----|---------|-------------|------------------| | C1 | Significance Inflation | Overstating importance | "This pivotal study" → "This study" | | C2 | Notability Claims | Vague authority appeals | "Widely cited research" → "[cited N times]" | | C3 | Superficial -ing | Empty participial phrases | "highlighting the need" → direct statement | | C4 | Promotional Language | Marketing-style adjectives | "groundbreaking findings" → "novel findings" | | C5 | Vague Attributions | Unspecified sources | "Experts argue" → "[Author] argues" | | C6 | Formulaic Sections | Template structures | "Challenges and Future Prospects" |
| ID | Pattern | Description | Academic Example | |----|---------|-------------|------------------| | L1 | AI Vocabulary Clustering | High-frequency AI words | "landscape", "tapestry", "underscore" | | L2 | Copula Avoidance | Avoiding "is/are" | "serves as" → "is" | | L3 | Negative Parallelism | Overused structures | "not only...but also" overuse | | L4 | Rule of Three | Forced triads | "X, Y, and Z" when 2 or 4 fit better | | L5 | Elegant Variation | Excessive synonym cycling | "study/research/investigation" in 3 sentences | | L6 | False Ranges | Misapplied scales | "from theory to practice" as filler |
| ID | Pattern | Description | Academic Example | |----|---------|-------------|------------------| | S1 | Em Dash Overuse | Excessive — usage | >2 per paragraph flagged | | S2 | Excessive Boldface | Over-emphasis | Mechanical term bolding | | S3 | Inline-Header Lists | Corporate formatting | "Term: Definition" patterns | | S4 | Title Case Overuse | Improper capitalization | Headings should be sentence case | | S5 | Emoji Usage | Decorative symbols | Inappropriate in academic text | | S6 | Curly Quote Artifacts | Typography markers | Inconsistent quotation marks |
| ID | Pattern | Description | Academic Example | |----|---------|-------------|------------------| | M1 | Chatbot Artifacts | Conversational leakage | "I hope this helps", "Let me explain" | | M2 | Knowledge Disclaimers | AI limitation disclosure | "As of my last training" | | M3 | Sycophantic Tone | Excessive agreement | "Excellent point!" in formal writing |
| ID | Pattern | Description | Academic Example | |----|---------|-------------|------------------| | H1 | Verbose Phrases | Unnecessary words | "In order to" → "To" | | H2 | Excessive Hedging | Qualifier stacking | "could potentially possibly" → "may" | | H3 | Generic Conclusions | Template endings | "Future research is needed" without specifics |
| ID | Pattern | Description | Academic Example | |----|---------|-------------|------------------| | A1 | Abstract Template | Rigid IMRAD filling | "This paper aims to..." variations | | A2 | Methods Boilerplate | Generic methodology | "Data were analyzed using..." without detail | | A3 | Discussion Inflation | Overclaiming implications | "These findings revolutionize..." | | A4 | Citation Hedging | Vague reference phrases | "Previous studies have shown" without cite | | A5 | Contribution Listing | Enumerated value claims | "This study contributes to... First,... Second,..." | | A6 | Limitation Disclaimers | Generic limitation statements | "This study has several limitations" |
High-frequency words that cluster in AI-assisted writing (post-2023):
high_alert: # Strong indicators of AI-assisted writing patterns
- "tapestry"
- "delve"
- "intricacies"
- "multifaceted"
- "nuanced"
- "paradigm shift"
- "testament to"
- "indelible mark"
moderate_alert: # Common in AI, check context
- "landscape"
- "underscore"
- "pivotal"
- "crucial"
- "furthermore"
- "notably"
- "interplay"
- "synergy"
context_dependent: # Valid in specific contexts
- "robust" (statistics context OK)
- "significant" (p-value context OK)
- "framework" (theory context OK)
- "implications" (discussion context OK)
Required:
- text: "The text to analyze"
Optional:
- context: "abstract/methods/results/discussion/response_letter"
- sensitivity: "low/medium/high" # Detection threshold
- include_context_words: true/false # Flag context-dependent words
## Academic Writing Quality Report
### Summary
| Metric | Value |
|--------|-------|
| Total Patterns Detected | N |
| High-Priority Patterns | N |
| Medium-Priority Patterns | N |
| Low-Priority Patterns | N |
| Writing Quality Score | X% |
### Quality Assessment
███████████░░░░░░░░░ 55% Pattern Density
Low (0-30%) | Medium (31-60%) | High (61-100%)
---
### Detailed Pattern Report
#### High-Priority Patterns (Immediate Attention)
**[C1] Significance Inflation**
- Location: Paragraph 1, Sentence 2
- Original: "This pivotal study examines..."
- Issue: "pivotal" inflates importance without evidence
- Recommendation: "This study examines..."
**[L1] AI Vocabulary Clustering**
- Location: Throughout
- Flagged words: "landscape" (2x), "underscore" (1x), "multifaceted" (1x)
- Issue: High concentration of AI-typical vocabulary
- Recommendation: Replace with field-specific terminology
---
#### Medium-Priority Patterns
**[L2] Copula Avoidance**
- Location: Paragraph 3, Sentence 1
- Original: "This framework serves as a foundation..."
- Issue: "serves as" instead of direct "is"
- Recommendation: "This framework is a foundation..."
---
#### Low-Priority Patterns
**[H1] Verbose Phrases**
- Location: Multiple
- Examples: "In order to" (3x), "Due to the fact that" (1x)
- Recommendation: Simplify to "To" and "Because"
---
### Pattern Distribution
Content Patterns: ████░░░░░░ 4 Language Patterns: ██████░░░░ 6 Style Patterns: ██░░░░░░░░ 2 Communication: ░░░░░░░░░░ 0 Filler/Hedging: ███░░░░░░░ 3 Academic-Specific: ████░░░░░░ 4 ───────────── Total: 19 patterns
---
### Writing Quality Improvement Recommendation
Based on analysis:
- **Recommended Mode**: Balanced
- **Priority Fixes**: C1, L1, L2 (5 instances)
- **Optional Fixes**: H1, A5 (7 instances)
- **Preserve**: All citations, statistics, methodology details
---
### Section-Level Scores (v3.1)
The G5 report includes per-section writing quality scores and top remaining patterns for each section. This data powers the Rich Checkpoint v2.0 display in the writing quality improvement pipeline.
```yaml
section_scores:
abstract: { quality_score: 72, top_patterns: ["C1 (x2)", "L1 (x1)"] }
introduction: { quality_score: 45, top_patterns: ["H1 (x3)"] }
methods: { quality_score: 25, top_patterns: [] } # (clean)
results: { quality_score: 35, top_patterns: ["S1 (x1)"] }
discussion: { quality_score: 95, top_patterns: ["L1 (x5)", "S1 (x3)", "C1 (x2)"] }
conclusion: { quality_score: 88, top_patterns: ["H3 (x2)", "A6 (x1)"] }
top_patterns:
# Top 3 remaining patterns per section with counts
# Empty array [] displayed as "(clean)" in checkpoint UI
Report format:
### Section-Level Scores
| Section | Quality Score | Top Patterns |
|---------|---------------|-------------|
| Abstract | 72% | C1 (x2), L1 (x1) |
| Introduction | 45% | H1 (x3) |
| Methods | 25% | (clean) |
| Results | 35% | S1 (x1) |
| Discussion | 95% | L1 (x5), S1 (x3), C1 (x2) |
| Conclusion | 88% | H3 (x2), A6 (x1) |
This section-level data is used by:
🟠 CHECKPOINT: CP_HUMANIZATION_REVIEW
Would you like to proceed with writing quality improvement?
[A] Improve (Conservative) - Fix high-priority patterns only [B] Improve (Balanced) - Fix high and medium-priority patterns ⭐ Recommended [C] Improve (Aggressive) - Maximum style transformation [D] View specific pattern details [E] Keep original
## Prompt Template
You are an academic writing quality specialist.
Analyze the following text for writing patterns that weaken scholarly quality:
[Context]: {context} # abstract/methods/discussion/etc. [Sensitivity]: {sensitivity} # low/medium/high
Perform the following analysis:
Pattern Detection Scan for all 24 pattern categories:
Priority Classification For each detected pattern:
Writing Quality Assessment Calculate based on:
Style Improvement Recommendation Based on analysis, recommend:
Output in the specified report format.
## Academic Context Adjustments
Different sections have different acceptable patterns:
| Section | Acceptable | Flag Anyway |
|---------|------------|-------------|
| Abstract | A1 (some template OK) | C1, L1 |
| Methods | A2 (some boilerplate OK) | C4, M1 |
| Results | Statistical terminology | C3, L6 |
| Discussion | A3 (some interpretation OK) | H3 generic conclusions |
| Response Letter | Gratitude phrases | M3 excessive |
## Integration with Pipeline
┌─────────────────────────────────────────────────────────┐ │ Content Generation (G2/G3/Auto-Doc) │ │ │ │ │ ▼ │ │ ┌─────────────────────────────────────────────────┐ │ │ │ G5-AcademicStyleAuditor (THIS AGENT) │ │ │ │ ├─ Pattern Detection │ │ │ │ ├─ Risk Classification │ │ │ │ ├─ AI Probability Score │ │ │ │ └─ Humanization Recommendation │ │ │ └─────────────────────────────────────────────────┘ │ │ │ │ │ ▼ │ │ 🟠 CHECKPOINT: CP_HUMANIZATION_REVIEW │ │ User decides: Humanize? Which mode? │ │ │ │ │ ▼ │ │ G6-AcademicStyleHumanizer (if approved) │ └─────────────────────────────────────────────────────────┘
## Commands
"Check writing quality of my draft" → Full analysis with detailed report
"Quick style check" → Summary only (pattern count + quality score)
"Show flagged vocabulary" → List all overused or formulaic words found
"Analyze my abstract for writing quality" → Context-aware analysis for abstracts
"Compare before/after style improvement" → Re-run analysis on improved text
## Category 7: LaTeX Syntax Patterns (6 patterns)
When scanning manuscripts with inline or display math (`$...$` or `$$...$$`), detect malformed LaTeX
that will cause rendering failures in Word/PDF export.
| ID | Pattern | Description | Example |
|----|---------|-------------|---------|
| X1 | Unclosed Math Delimiter | Unmatched `$` or `$$` | `$R_b` without closing `$` |
| X2 | Missing Braces | `\frac{a}{b` missing `}` | `\frac{a}{b` → `\frac{a}{b}` |
| X3 | Inconsistent Subscripts | Bare vs braced subscripts | `$R_b$` vs `$R_{b}$` in same doc |
| X4 | Unescaped Underscores | `_` in `\text{}` without `\_` | `\text{p_value}` → `\text{p\_value}` |
| X5 | Double Dollar Misuse | `$$` where `$` intended (inline) | `$$x$$` inline → `$x$` |
| X6 | Invalid Commands | Misspelled or unsupported commands | `\fraq` → `\frac` |
### LaTeX Auto-Fix Capability
When LaTeX syntax errors are detected, G5 can suggest auto-corrections:
```yaml
latex_fixes:
X1_unclosed: "Add matching delimiter"
X2_missing_brace: "Insert closing brace at expected position"
X3_inconsistent: "Standardize to braced form: $R_{b}$"
X4_unescaped: "Escape with backslash: \_"
X5_double_dollar: "Replace $$ with $ for inline math"
X6_invalid_cmd: "Suggest closest valid command"
For deeper validation, use the latex2omml package (available in packages/latex2omml/):
from latex2omml.converter import _Tokenizer
def validate_latex(expr: str) -> list[str]:
"""Tokenize LaTeX and return list of issues found."""
issues = []
try:
tok = _Tokenizer(expr)
# Check for unmatched braces
depth = 0
for ttype, tval in tok.tokens:
if ttype == "LBRACE": depth += 1
elif ttype == "RBRACE": depth -= 1
if depth != 0:
issues.append(f"X2: Unmatched braces (depth={depth})")
except Exception as e:
issues.append(f"X6: Parse error: {e}")
return issues
../../research-coordinator/core/vs-engine.md../../research-coordinator/interaction/user-checkpoints.md../../research-coordinator/core/integration-hub.mdtools
Show mcp-stata identity, connected tools, and status. Use when the user asks if mcp-stata is available, asks about access to the toolkit, or asks what Stata tools are connected.
tools
Activate when users mention Stata commands, .do files, regressions, econometrics, stored results, graphs, dataset inspection, replication, or Stata errors. Route the task through mcp-stata tools and the specialized research skills instead of treating it as plain text coding.
development
Build and review paper-ready regression, balance, and summary tables from Stata outputs. Use when the user needs a clean table for a draft, appendix, or coauthor share-out.
tools
Install, configure, update, or verify mcp-stata across Claude Code, Codex, Gemini CLI, Cursor, Windsurf, and VS Code. Activate when users ask to set up the Stata toolkit or troubleshoot the installation.