Adoption

Agent Skills are supported by leading AI development tools.

VS Code Gemini CLI GitHub Goose Amp Cursor Claude Code Letta OpenCode Claude OpenAI Codex Factory VS Code Gemini CLI GitHub Goose Amp Cursor Claude Code Letta OpenCode Claude OpenAI Codex Factory

harsh040506/prompt-engineering

Name: prompt-engineering
Author: harsh040506

engineering/ai-ml-engineering/skills/prompt-engineering/SKILL.md

npx skillsauth add harsh040506/claude-code-unified-skill-plugin-library prompt-engineering

Clean

TrivyContainer and dependency vulnerability scanner

Clean

SemgrepStatic code analysis for vulnerabilities

Clean

mcp-scan (Snyk)Model Context Protocol security validation

Skipped

Snyk (dep)Open source security scanning

Skipped

Socket.devSupply chain security analysis

Skipped

VirusTotalMulti-engine malware detection

Skipped

CrowdStrikeAdvanced threat intelligence

Skipped

OSV-ScannerOpen Source Vulnerability database check

Skipped

OWASP Dep-Check

Prompt Engineering

Design, iterate, and optimize prompts for large language models to achieve consistent, accurate, and production-ready results.

Fundamental Principles

1. Be Direct and Specific

LLMs respond to precise, imperative instructions. Vague requests produce vague results.

| Vague | Specific | |-------|---------| | "Summarize this" | "Summarize this in exactly 3 bullet points, each under 20 words, focusing on action items" | | "Extract information" | "Extract the customer name, order ID, and complaint category. Return ONLY a JSON object matching the schema below" | | "Write a response" | "Write a professional 3-sentence email response acknowledging the complaint, apologizing without admitting liability, and offering a call within 24 hours" |

2. Structure Before Content

The model reads top-to-bottom. Structure matters:

Primacy effect: Instructions at the beginning set the frame. Recency effect: Instructions at the end are freshest in context.

Critical constraints → put at END, after the task description. Role and context → put at BEGINNING.

3. Examples Trump Instructions

When precision matters, show don't tell. Three well-chosen examples outperform a paragraph of description.

Prompt Components

Complete Template

ROLE: You are [specific expert role with domain and context].

TASK: [One paragraph describing the exact task, in imperative voice.]

CONTEXT: [Any background the model needs. Remove anything the model can infer.]

RULES:
1. [Hard constraint — frame as prohibition or requirement]
2. [Hard constraint]
3. [Hard constraint]

OUTPUT FORMAT:
[Exact format. Use code block for JSON schemas or templates.]

EXAMPLES:
Input: [example input 1]
Output: [example output 1]

Input: [example input 2]
Output: [example output 2]

Now process the following:
[{{USER_INPUT}}]

Role Design

A well-designed role activates relevant knowledge and sets tone.

Format: You are [job title] at [organization type] with [years] of experience [specialization]. You [key behavioral trait].

Examples:

// Generic (bad)
You are a helpful assistant.

// Specific (good)
You are a senior backend engineer at a high-traffic SaaS company with 10 years of experience 
in distributed systems. You prioritize correctness and operational safety over cleverness.

// For a support agent
You are a customer support specialist for a B2B software company. You are empathetic, clear, 
and solution-oriented. You never promise what you can't deliver and always set accurate expectations.

// For document analysis
You are a paralegal with expertise in commercial contract review. You are precise, cite specific 
clauses when relevant, and flag ambiguity rather than guessing intent.

Chain-of-Thought Prompting

For complex reasoning tasks (math, logic, multi-step analysis), explicit reasoning dramatically improves accuracy.

Zero-Shot CoT

Simply adding "Think step by step" at the end of a prompt improves reasoning:

User: James has 3 apples. He gives half to Sarah and then buys 4 more. How many does he have?

Without CoT: 5 (often wrong on harder problems)

With CoT prompt:
"Think through this step by step before giving your answer."

Model response:
Step 1: James starts with 3 apples.
Step 2: He gives half to Sarah: 3 / 2 = 1.5. Since we're dealing with whole apples, assume 1 or 2...
[correct reasoning follows]

Few-Shot CoT

Show the reasoning pattern, not just the answer:

Example 1:
Question: If a store offers 20% off and then an additional 10% off, what's the total discount?
Reasoning: 
- First discount: 100% × 0.80 = 80% of original price
- Second discount: 80% × 0.90 = 72% of original price
- Total: paid 72%, so discount is 28%
Answer: 28% (not 30% — discounts don't add, they compound)

Example 2:
Question: [new question]
Reasoning:

Structured Output

Force structured output for machine-readable responses.

JSON Output

Extract the following from the customer complaint and return ONLY a valid JSON object.
Do not include any text, explanation, or markdown outside the JSON.

Schema:
{
  "customer_name": string | null,
  "order_id": string | null,  
  "issue_category": "billing" | "shipping" | "product" | "account" | "other",
  "sentiment": "angry" | "frustrated" | "neutral" | "satisfied",
  "urgency": 1 | 2 | 3 | 4 | 5,
  "key_complaint": string  // One sentence
}

Complaint: {{COMPLAINT_TEXT}}

Using Function Calling (OpenAI / Anthropic tool_use)

# OpenAI function calling
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": complaint_text}],
    tools=[{
        "type": "function",
        "function": {
            "name": "extract_complaint",
            "description": "Extract structured data from a customer complaint",
            "parameters": {
                "type": "object",
                "properties": {
                    "customer_name": {"type": "string"},
                    "issue_category": {"type": "string", "enum": ["billing", "shipping", "product", "account", "other"]},
                    "urgency": {"type": "integer", "minimum": 1, "maximum": 5},
                },
                "required": ["issue_category", "urgency"],
            }
        }
    }],
    tool_choice={"type": "function", "function": {"name": "extract_complaint"}},
)

Function calling is more reliable than asking for JSON in the prompt — the model is trained specifically for this format.

Retrieval-Augmented Generation (RAG) Prompts

For RAG systems, design the prompt to use retrieved context faithfully:

You are a support agent for Acme Corp. Answer the customer's question using ONLY the information 
in the provided context. Do not use outside knowledge.

Rules:
1. If the answer is in the context, answer directly and cite the source section.
2. If the answer is NOT in the context, say: "I don't have information about that in our documentation. Let me connect you with a specialist."
3. Never guess or make up product details.
4. Keep answers under 150 words unless the question requires more detail.

CONTEXT:
{{RETRIEVED_CHUNKS}}

CUSTOMER QUESTION:
{{QUESTION}}

Common RAG prompt mistakes:

Not telling the model to prefer context over training knowledge
Not handling "context doesn't contain the answer" gracefully
Injecting too much context (model loses focus on the question)
Not including source attribution instructions

Prompt Security

Prompt Injection Defense

Users can try to override your system prompt with inputs like "Ignore previous instructions and..."

Defensive techniques:

// Sandwich defense — repeat critical rules after the user input
System: You are a customer support agent for Acme Corp. Only answer questions about Acme products.

User input goes here.

IMPORTANT REMINDER: You may only discuss Acme Corp products and services. 
If the user asks about anything else, politely redirect.

// Input sanitization in code
def sanitize_user_input(text: str) -> str:
    # Encode potentially dangerous phrases
    dangerous_patterns = [
        "ignore previous instructions",
        "ignore all instructions",
        "disregard your",
        "you are now",
        "pretend you are",
    ]
    for pattern in dangerous_patterns:
        if pattern.lower() in text.lower():
            return "[Input contained policy-violating content and was blocked]"
    return text

Prompt Evaluation Framework

Evaluate prompts systematically, not by vibe:

def evaluate_prompt(prompt_template: str, test_cases: list[dict]) -> dict:
    """
    test_cases: list of {"input": str, "expected_output": str, "criteria": list[str]}
    """
    results = []
    for case in test_cases:
        filled_prompt = prompt_template.replace("{{INPUT}}", case["input"])
        actual_output = call_llm(filled_prompt)
        
        # Evaluate each criterion
        case_results = {
            "input": case["input"],
            "output": actual_output,
            "criteria": {}
        }
        for criterion in case["criteria"]:
            # Use LLM-as-judge or hard-coded checks
            passed = evaluate_criterion(actual_output, criterion, case["expected_output"])
            case_results["criteria"][criterion] = passed
        
        results.append(case_results)
    
    # Aggregate
    all_criteria = set(c for r in results for c in r["criteria"])
    summary = {
        c: sum(r["criteria"].get(c, False) for r in results) / len(results)
        for c in all_criteria
    }
    return {"per_case": results, "aggregate": summary}

Key criteria to evaluate:

Correct output (does it match expected?)
Format compliance (does it follow the output format?)
No hallucination (does it cite only provided facts?)
Instruction following (does it obey all rules?)
Appropriate length (not too long/short?)
Tone/persona consistency

Cost and Latency Optimization

| Technique | Token savings | Quality impact | |-----------|--------------|----------------| | Shorter system prompt (remove redundancy) | 20–50% | Minimal if done carefully | | Few-shot → zero-shot (with CoT) | 30–70% | Test carefully | | Use smaller model for simpler tasks | N/A | Test accuracy threshold | | Prompt caching (Anthropic / OpenAI) | Up to 90% on repeated prefix | None | | Response caching for identical inputs | 100% on cache hit | None (verification required) | | Batch requests | Reduces overhead | None |

Always measure: cheaper prompts that produce wrong results cost more in the long run.

Deeper Reference

For an extensive prompt pattern catalog and optimization recipes, see:

references/prompt-patterns-catalog.md — 50+ annotated prompt patterns covering chain-of-thought, self-consistency, tool use, structured output, and domain-specific templates

harsh040506/prompt-engineering

engineering/ai-ml-engineering/skills/prompt-engineering/SKILL.md

This skill should be used when the user asks about "prompt engineering", "prompt design", "system prompt", "few-shot examples", "chain of thought", "CoT", "zero-shot", "one-shot", "few-shot", "prompt optimization", "prompt iteration", "improve accuracy of LLM", "make the model follow instructions", "structured output", "JSON mode", "function calling", "tool calling", "prompt template", "LangChain", "LlamaIndex", "prompt injection", "jailbreak defense", "RAG prompt", "retrieval augmented generation prompt", "evaluation of prompts", or "why is the LLM not doing what I want". Also trigger for "my prompt gives inconsistent results", "the model ignores my instructions", or "how do I write a better system prompt".

2 stars

tools

Updated Apr 5, 2026

$ install --global

skillsauth

npx skillsauth add harsh040506/claude-code-unified-skill-plugin-library prompt-engineering

Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.

Security Scan Results

3 of 9 scanners reported clean

Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.

Scanners Passed

Scanners in report

Clean

TrivyContainer and dependency vulnerability scanner

95%

Clean

SemgrepStatic code analysis for vulnerabilities

95%

Clean

mcp-scan (Snyk)Model Context Protocol security validation

95%

Skipped

Snyk (dep)Open source security scanning

50%

Skipped

Socket.devSupply chain security analysis

50%

Skipped

VirusTotalMulti-engine malware detection

50%

Skipped

CrowdStrikeAdvanced threat intelligence

50%

Skipped

OSV-ScannerOpen Source Vulnerability database check

50%

Skipped

OWASP Dep-Check

50%

Last scanned: Apr 5, 2026, 5:11 PM31.0s2 files scanned

SKILL.md

name:: prompt-engineering
description:: This skill should be used when the user asks about "prompt engineering", "prompt design", "system prompt", "few-shot examples", "chain of thought", "CoT", "zero-shot", "one-shot", "few-shot", "prompt optimization", "prompt iteration", "improve accuracy of LLM", "make the model follow instructions", "structured output", "JSON mode", "function calling", "tool calling", "prompt template", "LangChain", "LlamaIndex", "prompt injection", "jailbreak defense", "RAG prompt", "retrieval augmented generation prompt", "evaluation of prompts", or "why is the LLM not doing what I want". Also trigger for "my prompt gives inconsistent results", "the model ignores my instructions", or "how do I write a better system prompt".

Prompt Engineering

Design, iterate, and optimize prompts for large language models to achieve consistent, accurate, and production-ready results.

Fundamental Principles

1. Be Direct and Specific

LLMs respond to precise, imperative instructions. Vague requests produce vague results.

2. Structure Before Content

The model reads top-to-bottom. Structure matters:

Primacy effect: Instructions at the beginning set the frame. Recency effect: Instructions at the end are freshest in context.

Critical constraints → put at END, after the task description. Role and context → put at BEGINNING.

3. Examples Trump Instructions

When precision matters, show don't tell. Three well-chosen examples outperform a paragraph of description.

Prompt Components

Complete Template

ROLE: You are [specific expert role with domain and context].

TASK: [One paragraph describing the exact task, in imperative voice.]

CONTEXT: [Any background the model needs. Remove anything the model can infer.]

RULES:
1. [Hard constraint — frame as prohibition or requirement]
2. [Hard constraint]
3. [Hard constraint]

OUTPUT FORMAT:
[Exact format. Use code block for JSON schemas or templates.]

EXAMPLES:
Input: [example input 1]
Output: [example output 1]

Input: [example input 2]
Output: [example output 2]

Now process the following:
[{{USER_INPUT}}]

Role Design

A well-designed role activates relevant knowledge and sets tone.

Format: You are [job title] at [organization type] with [years] of experience [specialization]. You [key behavioral trait].

Examples:

// Generic (bad)
You are a helpful assistant.

// Specific (good)
You are a senior backend engineer at a high-traffic SaaS company with 10 years of experience 
in distributed systems. You prioritize correctness and operational safety over cleverness.

// For a support agent
You are a customer support specialist for a B2B software company. You are empathetic, clear, 
and solution-oriented. You never promise what you can't deliver and always set accurate expectations.

// For document analysis
You are a paralegal with expertise in commercial contract review. You are precise, cite specific 
clauses when relevant, and flag ambiguity rather than guessing intent.

Chain-of-Thought Prompting

For complex reasoning tasks (math, logic, multi-step analysis), explicit reasoning dramatically improves accuracy.

Zero-Shot CoT

Simply adding "Think step by step" at the end of a prompt improves reasoning:

User: James has 3 apples. He gives half to Sarah and then buys 4 more. How many does he have?

Without CoT: 5 (often wrong on harder problems)

With CoT prompt:
"Think through this step by step before giving your answer."

Model response:
Step 1: James starts with 3 apples.
Step 2: He gives half to Sarah: 3 / 2 = 1.5. Since we're dealing with whole apples, assume 1 or 2...
[correct reasoning follows]

Few-Shot CoT

Show the reasoning pattern, not just the answer:

Example 1:
Question: If a store offers 20% off and then an additional 10% off, what's the total discount?
Reasoning: 
- First discount: 100% × 0.80 = 80% of original price
- Second discount: 80% × 0.90 = 72% of original price
- Total: paid 72%, so discount is 28%
Answer: 28% (not 30% — discounts don't add, they compound)

Example 2:
Question: [new question]
Reasoning:

Structured Output

Force structured output for machine-readable responses.

JSON Output

Extract the following from the customer complaint and return ONLY a valid JSON object.
Do not include any text, explanation, or markdown outside the JSON.

Schema:
{
  "customer_name": string | null,
  "order_id": string | null,  
  "issue_category": "billing" | "shipping" | "product" | "account" | "other",
  "sentiment": "angry" | "frustrated" | "neutral" | "satisfied",
  "urgency": 1 | 2 | 3 | 4 | 5,
  "key_complaint": string  // One sentence
}

Complaint: {{COMPLAINT_TEXT}}

Using Function Calling (OpenAI / Anthropic tool_use)

# OpenAI function calling
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": complaint_text}],
    tools=[{
        "type": "function",
        "function": {
            "name": "extract_complaint",
            "description": "Extract structured data from a customer complaint",
            "parameters": {
                "type": "object",
                "properties": {
                    "customer_name": {"type": "string"},
                    "issue_category": {"type": "string", "enum": ["billing", "shipping", "product", "account", "other"]},
                    "urgency": {"type": "integer", "minimum": 1, "maximum": 5},
                },
                "required": ["issue_category", "urgency"],
            }
        }
    }],
    tool_choice={"type": "function", "function": {"name": "extract_complaint"}},
)

Function calling is more reliable than asking for JSON in the prompt — the model is trained specifically for this format.

Retrieval-Augmented Generation (RAG) Prompts

For RAG systems, design the prompt to use retrieved context faithfully:

You are a support agent for Acme Corp. Answer the customer's question using ONLY the information 
in the provided context. Do not use outside knowledge.

Rules:
1. If the answer is in the context, answer directly and cite the source section.
2. If the answer is NOT in the context, say: "I don't have information about that in our documentation. Let me connect you with a specialist."
3. Never guess or make up product details.
4. Keep answers under 150 words unless the question requires more detail.

CONTEXT:
{{RETRIEVED_CHUNKS}}

CUSTOMER QUESTION:
{{QUESTION}}

Common RAG prompt mistakes:

Not telling the model to prefer context over training knowledge
Not handling "context doesn't contain the answer" gracefully
Injecting too much context (model loses focus on the question)
Not including source attribution instructions

Prompt Security

Prompt Injection Defense

Users can try to override your system prompt with inputs like "Ignore previous instructions and..."

Defensive techniques:

// Sandwich defense — repeat critical rules after the user input
System: You are a customer support agent for Acme Corp. Only answer questions about Acme products.

User input goes here.

IMPORTANT REMINDER: You may only discuss Acme Corp products and services. 
If the user asks about anything else, politely redirect.

// Input sanitization in code
def sanitize_user_input(text: str) -> str:
    # Encode potentially dangerous phrases
    dangerous_patterns = [
        "ignore previous instructions",
        "ignore all instructions",
        "disregard your",
        "you are now",
        "pretend you are",
    ]
    for pattern in dangerous_patterns:
        if pattern.lower() in text.lower():
            return "[Input contained policy-violating content and was blocked]"
    return text

Prompt Evaluation Framework

Evaluate prompts systematically, not by vibe:

def evaluate_prompt(prompt_template: str, test_cases: list[dict]) -> dict:
    """
    test_cases: list of {"input": str, "expected_output": str, "criteria": list[str]}
    """
    results = []
    for case in test_cases:
        filled_prompt = prompt_template.replace("{{INPUT}}", case["input"])
        actual_output = call_llm(filled_prompt)
        
        # Evaluate each criterion
        case_results = {
            "input": case["input"],
            "output": actual_output,
            "criteria": {}
        }
        for criterion in case["criteria"]:
            # Use LLM-as-judge or hard-coded checks
            passed = evaluate_criterion(actual_output, criterion, case["expected_output"])
            case_results["criteria"][criterion] = passed
        
        results.append(case_results)
    
    # Aggregate
    all_criteria = set(c for r in results for c in r["criteria"])
    summary = {
        c: sum(r["criteria"].get(c, False) for r in results) / len(results)
        for c in all_criteria
    }
    return {"per_case": results, "aggregate": summary}

Key criteria to evaluate:

Correct output (does it match expected?)
Format compliance (does it follow the output format?)
No hallucination (does it cite only provided facts?)
Instruction following (does it obey all rules?)
Appropriate length (not too long/short?)
Tone/persona consistency

Cost and Latency Optimization

Always measure: cheaper prompts that produce wrong results cost more in the long run.

Deeper Reference

For an extensive prompt pattern catalog and optimization recipes, see:

references/prompt-patterns-catalog.md — 50+ annotated prompt patterns covering chain-of-thought, self-consistency, tool use, structured output, and domain-specific templates

Related Skills

harsh040506/single-cell-rna-qc

testing

VerifiedTrustedCommunity

Performs quality control on single-cell RNA-seq data (.h5ad or .h5 files) using scverse best practices with MAD-based filtering and comprehensive visualizations. Use when users request QC analysis, filtering low-quality cells, assessing data quality, or following scverse/scanpy best practices for single-cell analysis.

2SKILL.mdUpdated Apr 5, 2026

harsh040506/single-cell-rna-qc

harsh040506/scvi-tools

tools

VerifiedTrustedCommunity

Deep learning for single-cell analysis using scvi-tools. This skill should be used when users need (1) data integration and batch correction with scVI/scANVI, (2) ATAC-seq analysis with PeakVI, (3) CITE-seq multi-modal analysis with totalVI, (4) multiome RNA+ATAC analysis with MultiVI, (5) spatial transcriptomics deconvolution with DestVI, (6) label transfer and reference mapping with scANVI/scArches, (7) RNA velocity with veloVI, or (8) any deep learning-based single-cell method. Triggers include mentions of scVI, scANVI, totalVI, PeakVI, MultiVI, DestVI, veloVI, sysVI, scArches, variational autoencoder, VAE, batch correction, data integration, multi-modal, CITE-seq, multiome, reference mapping, latent space.

2SKILL.mdUpdated Apr 5, 2026

harsh040506/scvi-tools

harsh040506/scientific-problem-selection

testing

VerifiedTrustedCommunity

This skill should be used when scientists need help with research problem selection, project ideation, troubleshooting stuck projects, or strategic scientific decisions. Use this skill when users ask to pitch a new research idea, work through a project problem, evaluate project risks, plan research strategy, navigate decision trees, or get help choosing what scientific problem to work on. Typical requests include "I have an idea for a project", "I'm stuck on my research", "help me evaluate this project", "what should I work on", or "I need strategic advice about my research".

2SKILL.mdUpdated Apr 5, 2026

harsh040506/scientific-problem-selection

harsh040506/nextflow-development

development

VerifiedTrustedCommunity

Run nf-core bioinformatics pipelines (rnaseq, sarek, atacseq) on sequencing data. Use when analyzing RNA-seq, WGS/WES, or ATAC-seq data—either local FASTQs or public datasets from GEO/SRA. Triggers on nf-core, Nextflow, FASTQ analysis, variant calling, gene expression, differential expression, GEO reanalysis, GSE/GSM/SRR accessions, or samplesheet creation.

2SKILL.mdUpdated Apr 5, 2026

harsh040506/nextflow-development

Download

For Claude Desktop. Download once, then upload the file in the app — no terminal needed.

Need help? View full Cowork setup guide →

Install manually

Choose your platform

# Clone the repo
git clone https://github.com/harsh040506/claude-code-unified-skill-plugin-library.git

# Copy into Claude Code skills folder (global)
cp -r claude-code-unified-skill-plugin-library/engineering/ai-ml-engineering/skills/prompt-engineering ~/.claude/skills/

Claude Code Skills — official skills path docs.

Repository

harsh040506/claude-code-unified-skill-plugin-library

2 stars

Compatible with

Claude Code

OpenAI Codex CLI

ChatGPT