linzhe001/final-exp

Name: final-exp
Author: linzhe001

.claude/skills/final-exp/SKILL.md

npx skillsauth add linzhe001/Harness-Research final-exp

WF9: Ablation Experiment Plan

<role>
You are a Research Methodology Expert who designs rigorous experiments that meet the standards of top-tier venues such as CVPR, ICCV, and NeurIPS.
</role>

<context>
This is Stage 9 of the 10-stage CV research workflow.

- Input: iteration_log.json from WF8 (best iteration) + Stage_Report.md (if available).
- Output: Final_Experiment_Matrix.md.
- On completion → project concludes, ready for paper writing.

First, read PROJECT_STATE.json to get target_venue and experiment context. For the output format, see templates/experiment-matrix.md. For language behavior, see ../../shared/language-policy.md.
</context>

<instructions>

1. **Read prerequisite materials**

Read Stage_Report.md, extracting:

- Main experiment results
- Method components
- Target venue experiment standards

2. **Design ablation experiments**

Design ON/OFF experiments for each novel component:

| Experiment ID | Component A | Component B | Component C | Expected Result |
|---------------|-------------|-------------|-------------|-----------------|
| Baseline | OFF | OFF | OFF | Baseline performance |
| Exp-1 | ON | OFF | OFF | Validate A's contribution |
| Exp-2 | OFF | ON | OFF | Validate B's contribution |
| Exp-3 | OFF | OFF | ON | Validate C's contribution |
| Exp-AB | ON | ON | OFF | Validate A+B synergy |
| Full | ON | ON | ON | Full method |

Principle: Each single-component experiment (Exp-1 to Exp-3) changes only one variable relative to the Baseline, so individual component contributions can be isolated. Combination rows (Exp-AB, Full) measure how components interact.
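
The ON/OFF layout above can also be generated programmatically, which may help keep experiment IDs and switches consistent as components are added. The following is a minimal Python sketch, not part of the skill itself: the names `build_ablation_matrix`, `to_markdown`, and the `extra_combos` argument are hypothetical, and the component labels are placeholders.

```python
from typing import Dict, Iterable, List, Tuple


def build_ablation_matrix(
    components: List[str],
    extra_combos: Iterable[Tuple[str, ...]] = (),
) -> List[Dict[str, object]]:
    """Return rows for Baseline, one single-component run per component,
    any requested combinations, and the full method (hypothetical helper)."""

    def row(exp_id: str, enabled: Iterable[str]) -> Dict[str, object]:
        on = set(enabled)
        return {"id": exp_id, **{c: (c in on) for c in components}}

    rows = [row("Baseline", [])]
    # Single-component runs: exactly one switch differs from Baseline.
    for i, comp in enumerate(components, start=1):
        rows.append(row(f"Exp-{i}", [comp]))
    # Optional interaction runs, e.g. ("A", "B") becomes "Exp-AB".
    for combo in extra_combos:
        rows.append(row("Exp-" + "".join(combo), combo))
    rows.append(row("Full", components))
    return rows


def to_markdown(rows: List[Dict[str, object]], components: List[str]) -> str:
    """Render the rows as a Markdown table with ON/OFF cells."""
    header = "| Experiment ID | " + " | ".join(f"Component {c}" for c in components) + " |"
    sep = "|" + "---|" * (len(components) + 1)
    lines = [header, sep]
    for r in rows:
        cells = ["ON" if r[c] else "OFF" for c in components]
        lines.append(f"| {r['id']} | " + " | ".join(cells) + " |")
    return "\n".join(lines)


if __name__ == "__main__":
    comps = ["A", "B", "C"]
    print(to_markdown(build_ablation_matrix(comps, [("A", "B")]), comps))
```

Called with three components and the single combination `("A", "B")`, this yields the same six rows as the table, without the Expected Result column, which still has to be written by hand as a hypothesis for each row.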

3. **Hyperparameter search space**

Define hyperparameters to search and their ranges:

```
search_space:
  learning_rate: [1e-4, 5e-4, 1e-3]
  weight_decay: [0, 1e-4, 1e-3]
  # ... other key hyperparameters
```

Recommended search strategy: Grid Search (small space) or Random Search (large space).
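
To make the difference between the two strategies concrete, here is a small Python sketch that enumerates configurations from the example search space. It is an illustration under assumptions: the helper names `grid_search` and `random_search` are hypothetical, and the random variant samples each value independently and uniformly from its candidate list, so repeated configurations are possible.

```python
import itertools
import random
from typing import Dict, Iterator, List

# Values copied from the search_space example; other key
# hyperparameters would be added as further entries.
SEARCH_SPACE: Dict[str, List[float]] = {
    "learning_rate": [1e-4, 5e-4, 1e-3],
    "weight_decay": [0, 1e-4, 1e-3],
}


def grid_search(space: Dict[str, List[float]]) -> Iterator[Dict[str, float]]:
    """Yield every combination; the count is the product of the list lengths."""
    keys = list(space)
    for values in itertools.product(*(space[k] for k in keys)):
        yield dict(zip(keys, values))


def random_search(
    space: Dict[str, List[float]], n_trials: int, seed: int = 0
) -> Iterator[Dict[str, float]]:
    """Yield n_trials configurations sampled with a fixed seed for reproducibility."""
    rng = random.Random(seed)
    for _ in range(n_trials):
        yield {k: rng.choice(v) for k, v in space.items()}


if __name__ == "__main__":
    print(len(list(grid_search(SEARCH_SPACE))), "grid configurations")
    for cfg in random_search(SEARCH_SPACE, n_trials=4):
        print(cfg)
```

With two hyperparameters of three values each, the grid has 3 × 3 = 9 configurations. Each additional hyperparameter multiplies that count, which is why random search with a fixed trial budget tends to be the more practical choice for larger spaces.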

4. **Robustness tests**

Design edge case tests:
- Performance variation across different input resolutions
- Extreme lighting/weather conditions
- Severe occlusion scenarios
- Out-of-distribution (OOD) data
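
One way to report the edge case tests listed above is as a drop relative to the score on clean data. The sketch below is a hedged illustration, not the skill's method: `robustness_report` is a hypothetical helper, and it assumes a single scalar metric where higher is better and the clean score is positive.

```python
from typing import Dict


def robustness_report(
    clean_score: float, scores: Dict[str, float]
) -> Dict[str, Dict[str, float]]:
    """For each perturbed condition, report the absolute and relative
    drop versus the clean score (assumes higher is better)."""
    if clean_score <= 0:
        raise ValueError("clean_score must be positive")
    report = {}
    for condition, score in scores.items():
        drop = clean_score - score
        report[condition] = {
            "score": score,
            "abs_drop": drop,
            "rel_drop_pct": 100.0 * drop / clean_score,
        }
    return report
```

The keys of `scores` would be the condition names chosen for the plan (for example one entry per input resolution or occlusion level), and the values come from actual evaluation runs.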

5. **Cross-dataset evaluation**

List datasets on which generalization should be validated:
- Primary dataset: full evaluation
- Transfer datasets: verify generalization
- Special scenario datasets: verify robustness

6. **Computation budget**

Estimate total GPU hours:

| Experiment Type | Count | Duration Each | GPU Type | Total |
|-----------------|-------|---------------|----------|-------|
| Ablation experiments | N | X h | ... | N × X h |
| Hyperparameter search | M | Y h | ... | M × Y h |
| Robustness tests | K | Z h | ... | K × Z h |
| Cross-dataset | J | W h | ... | J × W h |
| Total | | | | XX h |
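
The budget arithmetic is count × duration per row, summed across rows. The following Python sketch shows one way to compute and render it; `BudgetRow`, `total_gpu_hours`, and `budget_table` are hypothetical names, and the sketch assumes each run occupies a single GPU (for multi-GPU runs the duration would need to be multiplied by the number of GPUs).

```python
from dataclasses import dataclass
from typing import List


@dataclass
class BudgetRow:
    experiment_type: str
    count: int          # number of runs of this type
    hours_each: float   # wall-clock hours per run, one GPU assumed
    gpu_type: str

    @property
    def total_hours(self) -> float:
        return self.count * self.hours_each


def total_gpu_hours(rows: List[BudgetRow]) -> float:
    """Sum of count x duration over all experiment types."""
    return sum(r.total_hours for r in rows)


def budget_table(rows: List[BudgetRow]) -> str:
    """Render the rows plus a Total row as a Markdown table."""
    lines = [
        "| Experiment Type | Count | Duration Each | GPU Type | Total |",
        "|---|---|---|---|---|",
    ]
    for r in rows:
        lines.append(
            f"| {r.experiment_type} | {r.count} | {r.hours_each:g}h "
            f"| {r.gpu_type} | {r.total_hours:g}h |"
        )
    lines.append(f"| Total | | | | {total_gpu_hours(rows):g}h |")
    return "\n".join(lines)
```

Filling the rows in before committing to an experiment list makes it easy to see which experiment type dominates the budget and where the count could be trimmed.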

7. **Output experiment matrix**

Write to docs/Final_Experiment_Matrix.md, including:
- context_summary (≤20 lines)
- ablation_table (ablation experiment table)
- hyperparameter_search (search space and strategy)
- robustness_tests (robustness test list)
- cross_dataset_evaluation (cross-dataset evaluation plan)
- computation_budget (computation budget summary)
- execution_order (recommended execution order and parallelization strategy)

Preserve the template structure, but localize headings and narrative text according to ../../shared/language-policy.md unless a field is explicitly marked English-only.

8. **Update project state**

Update PROJECT_STATE.json:
- current_stage.status → "completed"
- artifacts.experiment_matrix → file path
- Append completion record to history

</instructions>
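
The PROJECT_STATE.json update described above could be scripted roughly as follows. This is a sketch under assumptions: the source names only the `current_stage.status`, `artifacts.experiment_matrix`, and `history` fields, so the shape of the history record (`stage`, `status`, `timestamp`) and the helper name `mark_stage_completed` are hypothetical.

```python
import json
from datetime import datetime, timezone
from pathlib import Path


def mark_stage_completed(state_path, matrix_path, stage_name="WF9"):
    """Apply the three listed updates to PROJECT_STATE.json and return the new state."""
    path = Path(state_path)
    state = json.loads(path.read_text(encoding="utf-8"))

    state.setdefault("current_stage", {})["status"] = "completed"
    state.setdefault("artifacts", {})["experiment_matrix"] = str(matrix_path)
    # Assumed record shape; the skill only says to append a completion record.
    state.setdefault("history", []).append({
        "stage": stage_name,
        "status": "completed",
        "timestamp": datetime.now(timezone.utc).isoformat(),
    })

    # Write to a temporary file, then replace the original, so an
    # interrupted write does not leave a truncated state file behind.
    tmp = path.with_name(path.name + ".tmp")
    tmp.write_text(json.dumps(state, indent=2, ensure_ascii=False), encoding="utf-8")
    tmp.replace(path)
    return state
```

Using `setdefault` keeps any other fields already present in the state file untouched, which matters because earlier workflow stages write to the same file.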

<constraints>
- ALWAYS include at least 3 ablation experiments
- ALWAYS design experiments that isolate individual component contributions
- ALWAYS estimate computation budget before suggesting experiments
- ALWAYS consider what experiments the target venue reviewers would expect
- NEVER design experiments without a clear hypothesis and expected outcome
</constraints>

WF9 ablation experiment plan. Design ablation experiments, hyperparameter searches, robustness tests, and cross-dataset evaluations meeting top-venue standards, estimate computation budget, and output Final_Experiment_Matrix.md. Use when main experiments are complete and ablation studies need to be designed.

1 star

testing

Updated Apr 17, 2026

npx skillsauth add linzhe001/Harness-Research final-exp

Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.

Security Scan Results

3 of 9 scanners reported clean. Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.

| Scanner | Description | Status | Percentage |
|---------|-------------|--------|------------|
| Trivy | Container and dependency vulnerability scanner | Clean | 95% |
| Semgrep | Static code analysis for vulnerabilities | Clean | 95% |
| mcp-scan (Snyk) | Model Context Protocol security validation | Clean | 95% |
| Snyk (dep) | Open source security scanning | Skipped | 50% |
| Socket.dev | Supply chain security analysis | Skipped | 50% |
| VirusTotal | Multi-engine malware detection | Skipped | 50% |
| CrowdStrike | Advanced threat intelligence | Skipped | 50% |
| OSV-Scanner | Open Source Vulnerability database check | Skipped | 50% |
| OWASP Dep-Check | | Skipped | 50% |

Last scanned: Apr 17, 2026, 1:59 AM | 20.0s | 2 files scanned

SKILL.md

name: final-exp
description: WF9 ablation experiment plan. Design ablation experiments, hyperparameter searches, robustness tests, and cross-dataset evaluations meeting top-venue standards, estimate computation budget, and output Final_Experiment_Matrix.md. Use when main experiments are complete and ablation studies need to be designed.
argument-hint: [stage_report_path]
disable-model-invocation: true
allowed-tools: Read, Write, Bash, Glob

Related Skills

linzhe001/validate-run (development)

WF7.5 training pipeline validation. Before entering WF8 iteration, first use Codex to review code for baseline equivalence, then run a 100-step smoke test to verify end-to-end pipeline functionality.

1 | SKILL.md | Updated Apr 17, 2026

linzhe001/survey-idea (business)

WF1 inspiration survey and gap analysis. Takes the user's research idea, performs literature search, gap analysis, competitor analysis, and feasibility scoring, then outputs Feasibility_Report.md. Use when the user has a new CV research idea that needs a feasibility assessment.

1 | SKILL.md | Updated Apr 17, 2026

linzhe001/release (tools)

WF10 submission/release tool. Multi-scene training, result packaging, filename validation, dry-run submission checks. Used after ablation experiments are complete and before competition submission.

1 | SKILL.md | Updated Apr 17, 2026

linzhe001/refine-arch (development)

WF2 architecture refinement and MVP design. Reads the feasibility report, analyzes the base codebase architecture, designs plug-and-play new modules, defines the MVP, provides A/B/C alternative plans, and outputs Technical_Spec.md. Use when a research idea needs to be translated into a concrete technical architecture design.

1 | SKILL.md | Updated Apr 17, 2026

Install manually

# Clone the repo
git clone https://github.com/linzhe001/Harness-Research.git

# Create the global Claude Code skills folder if it does not exist yet
mkdir -p ~/.claude/skills

# Copy the skill into the global skills folder
cp -r Harness-Research/.claude/skills/final-exp ~/.claude/skills/

Repository: linzhe001/Harness-Research (1 star)

Compatible with: Claude Code, OpenAI Codex CLI, ChatGPT