skills/inbound/reply-classification/SKILL.md
Classify inbound email replies by intent and route them appropriately. Use when building reply handling for outreach, support, or transactional email - detecting interested, not interested, OOO, bounce, unsubscribe, forwarded, and question replies.
Classify inbound email replies so your system knows what happened and what to do next.
Related skills:
- inbound-processing - receiving and parsing incoming email before classification
- thread-management - maintaining conversation context across reply chains
- bounce-handling - processing hard/soft bounces and retry strategies
- suppression-lists - managing bounces, complaints, and opt-outs after classification
- email-compliance - legal requirements for honoring unsubscribe replies

Start with a taxonomy that maps cleanly to actions. Too many categories create ambiguity; too few miss important signals. Here is a set of 9 categories that covers the vast majority of reply types:
| Intent | Description | Typical action |
|--------|------------|----------------|
| interested | Positive engagement - wants to learn more, book a meeting, see pricing | Notify owner immediately (5-min SLA) |
| not_now | Timing is wrong but not a hard no - "maybe next quarter", "circle back later" | Auto-archive, schedule follow-up |
| objection | Hard no - "not interested", "remove me", "stop emailing" | Auto-archive, suppress from sequence |
| out_of_office | Vacation, leave, or away auto-reply | Auto-archive, note return date if present |
| unsubscribe | Explicit opt-out request | Honor immediately, add to suppression list |
| question | Asks a question without clear positive/negative signal | Route to owner for manual response |
| support | Reports a problem, bug, or needs help | Route to support queue |
| billing | Invoice, payment, refund, or subscription topic | Route for approval (60-min SLA) |
| unclassified | No clear signal - short replies, ambiguous content | Route to owner with low-confidence flag |
Some systems add referral (forwarded to a colleague), meeting_booked (calendar confirmation), or legal/security for sensitive intents. Add these only when you have distinct actions for them - categories without actions just create noise.
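As a rough sketch of how the taxonomy can be pinned to actions in code: the enum below mirrors the nine intents in the table, and `TYPICAL_ACTION` paraphrases the "Typical action" column. The names `Intent`, `TYPICAL_ACTION`, `typical_action`, and the `suppress` action are illustrative assumptions, not part of any particular library.

```python
from enum import Enum


class Intent(str, Enum):
    INTERESTED = "interested"
    NOT_NOW = "not_now"
    OBJECTION = "objection"
    OUT_OF_OFFICE = "out_of_office"
    UNSUBSCRIBE = "unsubscribe"
    QUESTION = "question"
    SUPPORT = "support"
    BILLING = "billing"
    UNCLASSIFIED = "unclassified"


# Assumed action names; "suppress" is a hypothetical label for
# "honor immediately, add to suppression list".
TYPICAL_ACTION = {
    Intent.INTERESTED: "notify_owner",
    Intent.NOT_NOW: "auto_archive",
    Intent.OBJECTION: "auto_archive",
    Intent.OUT_OF_OFFICE: "auto_archive",
    Intent.UNSUBSCRIBE: "suppress",
    Intent.QUESTION: "notify_owner",
    Intent.SUPPORT: "notify_owner",
    Intent.BILLING: "require_approval",
    Intent.UNCLASSIFIED: "notify_owner",
}


def typical_action(label: str) -> str:
    """Map a raw label to an action, treating unknown labels as unclassified."""
    try:
        intent = Intent(label)
    except ValueError:
        intent = Intent.UNCLASSIFIED
    return TYPICAL_ACTION[intent]
```

Keeping the mapping total (every intent has exactly one entry) makes it harder to add a category without deciding what to do with it.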
Not all intents should be handled the same way. Split them into tiers:
- out_of_office, not_now, objection - safe to archive or suppress automatically when confidence is high.
- interested, question, support - route to a human but don't block on approval.
- billing, legal, security - never auto-act. These need human review before any response.
- unsubscribe - must be honored within 10 business days per CAN-SPAM, but best practice is to process within seconds.

Auto-replies are the easiest category to detect with near-perfect accuracy because they follow standardized patterns. Always check headers before content.
Auto-Submitted is defined by RFC 3834; Precedence and X-Auto-Response-Suppress are non-standard but widely used. If any of them match, the message is an auto-reply - skip content analysis entirely.
Auto-Submitted (RFC 3834):
Auto-Submitted: auto-replied
Auto-Submitted: auto-generated
Auto-Submitted: auto-notified
If the header is present with any value other than no, the message is automated. An absent header tells you nothing either way. This is the most reliable signal.
Precedence:
Precedence: bulk
Precedence: auto_reply
Precedence: list
Precedence: junk
Legacy header, but still widely set. Treat bulk, auto_reply, list, and junk as auto-generated.
X-Auto-Response-Suppress (Microsoft Exchange/Outlook):
X-Auto-Response-Suppress: All
X-Auto-Response-Suppress: DR, AutoReply
X-Auto-Response-Suppress: OOF
If this contains DR, AutoReply, OOF, or All, the message is automated. Microsoft products set this consistently.
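A minimal sketch of the three header checks above, using Python's standard `email` package. The function name `is_auto_reply` and the two constant sets are assumptions made for illustration; the header names and values come from the lists above.

```python
from email import message_from_string
from email.message import Message

AUTO_PRECEDENCE = {"bulk", "auto_reply", "list", "junk"}
SUPPRESS_TOKENS = {"dr", "autoreply", "oof", "all"}


def is_auto_reply(msg: Message) -> bool:
    """Return True if any of the three auto-reply headers marks the message as automated."""
    # Auto-Submitted may carry parameters after the keyword, so keep only the keyword.
    auto_submitted = (msg.get("Auto-Submitted") or "").split(";", 1)[0].strip().lower()
    if auto_submitted and auto_submitted != "no":
        return True

    precedence = (msg.get("Precedence") or "").strip().lower()
    if precedence in AUTO_PRECEDENCE:
        return True

    # X-Auto-Response-Suppress is a comma-separated list of tokens.
    suppress = (msg.get("X-Auto-Response-Suppress") or "").lower()
    tokens = {token.strip() for token in suppress.split(",") if token.strip()}
    return bool(tokens & SUPPRESS_TOKENS)


raw = "Auto-Submitted: auto-replied\nSubject: Re: Quick question\n\nI am away."
print(is_auto_reply(message_from_string(raw)))
```

Running this check first means content analysis is never reached for messages that identify themselves as automated.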
Other indicators:
- X-Autoreply: yes - some mail servers set this
- X-Mail-Autoreply: yes - variant
- Return-Path: <> (empty) - DSN/bounce, not a human reply
- From or Reply-To contains noreply@, no-reply@, or no_reply@
- Content-Type includes report-type=delivery-status - this is a DSN (RFC 3464)
- List-Unsubscribe header present - newsletter, not a personal reply

If headers are inconclusive, check the subject:
/^(re:\s*)?(out of office|ooo|away from|on vacation|on leave|automatic reply|auto[\s-]?reply|autoreply)/i
Also check for localized variants:
- Absence du bureau, Réponse automatique
- Abwesenheitsnotiz, Automatische Antwort
- Fuera de la oficina, Respuesta automática
- Fora do escritório, Resposta automática
- 不在外出

When headers and subject are ambiguous, scan the body for patterns:
/I am (currently )?(out of (the )?office|on vacation|on leave|away|unavailable)/i
/I will (be )?(back|return|returning) (on |by )?/i
/limited access to email/i
/I will respond (to your (email|message) )?when I return/i
/If (this is |your matter is )?urgent/i
/please contact .+ in my absence/i
The combination of two or more of these patterns in a single message is a very strong OOO signal.
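The subject and body checks can be combined roughly as follows. This is a sketch that reuses the English patterns listed above; the function name `looks_like_ooo` and the two-hit minimum as a default parameter are assumptions, and localized variants are left out for brevity.

```python
import re

SUBJECT_RE = re.compile(
    r"^(re:\s*)?(out of office|ooo|away from|on vacation|on leave|"
    r"automatic reply|auto[\s-]?reply|autoreply)",
    re.IGNORECASE,
)

BODY_PATTERNS = [
    re.compile(p, re.IGNORECASE)
    for p in (
        r"I am (currently )?(out of (the )?office|on vacation|on leave|away|unavailable)",
        r"I will (be )?(back|return|returning) (on |by )?",
        r"limited access to email",
        r"I will respond (to your (email|message) )?when I return",
        r"If (this is |your matter is )?urgent",
        r"please contact .+ in my absence",
    )
]


def looks_like_ooo(subject: str, body: str, min_body_hits: int = 2) -> bool:
    """Subject match alone is enough; otherwise require several distinct body patterns."""
    if SUBJECT_RE.search(subject.strip()):
        return True
    hits = sum(1 for pattern in BODY_PATTERNS if pattern.search(body))
    return hits >= min_body_hits
```

Requiring two body hits keeps a single stray phrase such as "if this is urgent" in a human reply from being treated as an auto-reply.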
OOO messages often contain a return date. Extract it to schedule follow-ups:
/(?:back|return|returning|available)\s+(?:on\s+)?(\w+ \d{1,2}(?:,?\s+\d{4})?|\d{1,2}[\/\-]\d{1,2}(?:[\/\-]\d{2,4})?)/i
Parse dates carefully - handle both US (MM/DD) and international (DD/MM) formats. When in doubt, default to the interpretation that produces a future date.
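One way to apply the "prefer a future date" rule is sketched below. The regex is the one above; `extract_return_date`, its `today` parameter, and the month table are illustrative assumptions, and only English month names are handled.

```python
import re
from datetime import date
from typing import Optional

RETURN_DATE_RE = re.compile(
    r"(?:back|return|returning|available)\s+(?:on\s+)?"
    r"(\w+ \d{1,2}(?:,?\s+\d{4})?|\d{1,2}[\/\-]\d{1,2}(?:[\/\-]\d{2,4})?)",
    re.IGNORECASE,
)
MONTHS = {name: i for i, name in enumerate(
    ["jan", "feb", "mar", "apr", "may", "jun", "jul", "aug", "sep", "oct", "nov", "dec"], start=1)}


def _safe_date(year: int, month: int, day: int) -> Optional[date]:
    try:
        return date(year, month, day)
    except ValueError:
        return None


def extract_return_date(body: str, today: date) -> Optional[date]:
    match = RETURN_DATE_RE.search(body)
    if not match:
        return None
    raw = match.group(1)

    numeric = re.fullmatch(r"(\d{1,2})[/\-](\d{1,2})(?:[/\-](\d{2,4}))?", raw)
    if numeric:
        a, b = int(numeric.group(1)), int(numeric.group(2))
        year = int(numeric.group(3)) if numeric.group(3) else today.year
        if year < 100:
            year += 2000
        # Try both MM/DD and DD/MM, then prefer whichever lands in the future.
        candidates = [d for d in (_safe_date(year, a, b), _safe_date(year, b, a)) if d]
        future = [d for d in candidates if d >= today]
        pool = future or candidates
        return min(pool) if pool else None

    named = re.fullmatch(r"([A-Za-z]+) (\d{1,2})(?:,?\s+(\d{4}))?", raw)
    if named:
        month = MONTHS.get(named.group(1)[:3].lower())
        if month is None:
            return None
        explicit_year = named.group(3)
        result = _safe_date(int(explicit_year) if explicit_year else today.year, month, int(named.group(2)))
        # Without an explicit year, roll a past date forward to next year.
        if result and not explicit_year and result < today:
            result = _safe_date(today.year + 1, month, int(named.group(2)))
        return result
    return None
```

Passing `today` in explicitly keeps the function deterministic and easy to test; a production parser would likely also use the sender's locale when it is known.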
Bounces arrive as DSNs (Delivery Status Notifications, RFC 3464) or as freeform rejection messages from mail servers.
Content-Type: multipart/report; report-type=delivery-status
If this header is present, parse the message/delivery-status MIME part for the status code:
| Code pattern | Meaning | Classification |
|-------------|---------|----------------|
| 5.1.1 | Mailbox does not exist | Hard bounce |
| 5.1.2 | Domain does not exist | Hard bounce |
| 5.2.1 | Mailbox disabled | Hard bounce |
| 5.2.2 | Mailbox full | Soft bounce |
| 5.4.1 | No answer from host | Soft bounce |
| 5.7.1 | Delivery not authorized | Hard bounce (policy) |
| 4.x.x | Transient failure | Soft bounce |
Many bounces don't use proper DSN format. Detect them via subject:
/^(returned mail|undeliverable|delivery (status )?notification|mail delivery (failed|failure)|failure notice|returned to sender)/i
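The status-code table and the subject fallback could be expressed along these lines. This is a sketch: `classify_status` and `looks_like_bounce` are hypothetical names, and treating every 5.x.x code not listed in the table as a hard bounce is an assumption you may want to tighten.

```python
import re
from typing import Optional

BOUNCE_SUBJECT_RE = re.compile(
    r"^(returned mail|undeliverable|delivery (status )?notification|"
    r"mail delivery (failed|failure)|failure notice|returned to sender)",
    re.IGNORECASE,
)

# 5.x.x codes that the table above treats as soft bounces.
SOFT_5XX = {"5.2.2", "5.4.1"}


def classify_status(code: str) -> Optional[str]:
    """Map an enhanced status code such as '5.1.1' to 'hard' or 'soft'."""
    code = code.strip()
    if not re.fullmatch(r"[245]\.\d{1,3}\.\d{1,3}", code):
        return None
    if code.startswith("4.") or code in SOFT_5XX:
        return "soft"
    if code.startswith("5."):
        return "hard"  # assumption: unlisted permanent failures are hard bounces
    return None  # 2.x.x reports success, not a bounce


def looks_like_bounce(subject: str) -> bool:
    """Subject-line fallback for bounces that are not proper DSNs."""
    return bool(BOUNCE_SUBJECT_RE.search(subject.strip()))
```

The status code itself would come from the Status field of the message/delivery-status part; extracting it is left out here.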
For replies that aren't auto-generated or bounces, classify by intent using keyword matching. This is the workhorse of most classification systems.
A weighted keyword scoring system outperforms simple keyword presence checks. For each intent, maintain a list of keywords with a base weight. Then score:
Interested (weight: 0.85):
interested, tell me more, demo, schedule a call, set up a meeting,
learn more, pricing, sounds great, let's chat, let's connect, free trial,
show me, walk me through, send me info, book a time
Objection (weight: 0.85):
not interested, no thanks, no thank you, unsubscribe, remove me,
stop emailing, do not contact, opt out, take me off, please stop,
wrong person, not relevant
Not now (weight: 0.80):
not right now, not interested right now, maybe later, reach out later,
bad timing, next quarter, next year, circle back, check back,
not a priority, too busy right now
Support (weight: 0.75):
help, support, issue, problem, bug, error, not working, broken,
trouble, can't access, doesn't work, how do I
Billing (weight: 0.85):
invoice, billing, payment, charge, refund, subscription,
cancel subscription, receipt, credit, pricing question
Short replies (under 5 words) are common and tricky. "Thanks" could be positive acknowledgment or dismissal. "OK" could mean interested or just acknowledging receipt.
Rules of thumb:
- ... interested with low confidence.
- ... objection.
- ... unclassified and route to owner.
- ... interested. A "Thanks" after an intro email is ambiguous.

Sometimes a reply matches multiple intents. "I'm interested but the timing is bad - maybe next quarter" scores for both interested and not_now. This is common and you need a strategy for it.
Flag a conflict when:
- The top two intents are close in score.
- The top two intents are an opposing pair (interested + objection, interested + not_now, out_of_office + interested).

Conflicting intents should escalate to human review. Never auto-act on a conflicting classification. The routing action should be require_approval with a short SLA (15 minutes).
Include both intents and their scores in the classification output so the reviewer has context:
{
"intent": "interested",
"confidence": 0.72,
"runnerUpIntent": "not_now",
"runnerUpConfidence": 0.65,
"flags": ["conflicting_intents"]
}
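A sketch of how such an output might be assembled from per-intent scores. The 0.10 closeness margin, the `OPPOSING` set, and the function name `build_result` are assumptions; the output keys match the JSON example above.

```python
from typing import Any, Dict

OPPOSING = {
    frozenset(pair)
    for pair in [
        ("interested", "objection"),
        ("interested", "not_now"),
        ("out_of_office", "interested"),
    ]
}
CLOSE_MARGIN = 0.10  # assumed: top two within this margin count as "close"


def build_result(scores: Dict[str, float]) -> Dict[str, Any]:
    """Pick the winner and runner-up, and flag conflicting intents."""
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    if not ranked:
        return {"intent": "unclassified", "confidence": 0.0, "flags": ["low_confidence"]}

    intent, confidence = ranked[0]
    result: Dict[str, Any] = {"intent": intent, "confidence": confidence, "flags": []}

    if len(ranked) > 1:
        runner_up, runner_confidence = ranked[1]
        result["runnerUpIntent"] = runner_up
        result["runnerUpConfidence"] = runner_confidence
        close = confidence - runner_confidence <= CLOSE_MARGIN
        opposing = frozenset((intent, runner_up)) in OPPOSING
        if close or opposing:
            result["flags"].append("conflicting_intents")
    return result


print(build_result({"interested": 0.72, "not_now": 0.65}))
```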
Raw keyword scores need to be normalized into a confidence value between 0 and 1. Key thresholds:
| Confidence | Meaning | Recommended action |
|-----------|---------|-------------------|
| 0.85+ | High confidence | Safe to auto-act (archive, notify) |
| 0.60-0.85 | Medium confidence | Act but flag for review if wrong |
| Below 0.60 | Low confidence | Do NOT auto-act. Route for human review |
When confidence is below 0.60, override the default routing action:
- If the default action is auto_archive, upgrade to require_approval.
- If the default action is notify_owner, upgrade to require_approval.
- Keep require_approval actions as-is (they're already going to a human).

This prevents false classifications from silently archiving important emails or triggering wrong workflows.
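The override amounts to a small guard in front of the router. A sketch, with `effective_action` and `LOW_CONFIDENCE` as assumed names:

```python
LOW_CONFIDENCE = 0.60


def effective_action(default_action: str, confidence: float) -> str:
    """Upgrade automatic actions to human review when confidence is low."""
    if confidence < LOW_CONFIDENCE and default_action in {"auto_archive", "notify_owner"}:
        return "require_approval"
    return default_action


print(effective_action("auto_archive", 0.55))
```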
Classification without routing is useless. Every intent needs a clear routing action.
| Action | Description | When to use |
|--------|-------------|-------------|
| notify_owner | Send alert to the contact's owner/rep | Interested, support, unclassified |
| auto_archive | Mark as processed, no human action needed | OOO, not_now, objection (high confidence) |
| require_approval | Queue for human review before any action | Billing, legal, security, low-confidence |
| escalate | Flag for senior review | Safety concerns, adversarial content |
| spam | Route to spam queue | Failed safety classification |
Not all intents have the same urgency:
| Intent | SLA | Why |
|--------|-----|-----|
| interested | 5 minutes | Hot lead - speed matters |
| security | 15 minutes | Potential incident |
| legal | 30 minutes | Compliance risk |
| support | 30 minutes | Customer satisfaction |
| billing | 60 minutes | Revenue impact |
| unclassified | 60 minutes | Needs triage |
| out_of_office | None | Auto-archived |
| not_now | None | Auto-archived |
| objection | None | Auto-archived |
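To make the SLAs enforceable, each routed reply can carry a deadline computed from the table. A sketch, with `SLA_MINUTES` and `sla_deadline` as assumed names; intents the table does not list are treated as having no SLA, which is an assumption.

```python
from datetime import datetime, timedelta
from typing import Dict, Optional

SLA_MINUTES: Dict[str, Optional[int]] = {
    "interested": 5,
    "security": 15,
    "legal": 30,
    "support": 30,
    "billing": 60,
    "unclassified": 60,
    "out_of_office": None,
    "not_now": None,
    "objection": None,
}


def sla_deadline(intent: str, received_at: datetime) -> Optional[datetime]:
    """Return when a human must have acted, or None if the intent has no SLA."""
    minutes = SLA_MINUTES.get(intent)
    if minutes is None:
        return None
    return received_at + timedelta(minutes=minutes)


print(sla_deadline("interested", datetime(2025, 1, 1, 9, 0)))
```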
When routing to notify_owner, you need to resolve who the owner is. Typical resolution order:
If no owner can be resolved, treat it as require_approval - someone needs to claim it.
Reply classification and safety classification are complementary but separate concerns. Intent classification asks "what does this person want?" Safety classification asks "is this message dangerous?"
Run safety classification in parallel with intent classification. Safety verdicts override intent-based routing:
| Safety verdict | Override action |
|---------------|----------------|
| clean | No override - use intent-based routing |
| spam | Route to spam queue |
| phishing | Quarantine for human review |
| malware | Reject or quarantine |
| abuse | Quarantine for human review |
| impersonation | Quarantine for human review |
Trusted senders (verified customers, known domains) should bypass safety classification to avoid false positives. Maintain a whitelist at both the email and domain level.
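A sketch of how the safety verdict can take precedence over intent-based routing. The `quarantine` action name, the choice to quarantine malware rather than reject it, and quarantining unknown verdicts are all assumptions; `final_action` is a hypothetical helper.

```python
SAFETY_OVERRIDES = {
    "spam": "spam",
    "phishing": "quarantine",
    "malware": "quarantine",  # the table allows reject or quarantine
    "abuse": "quarantine",
    "impersonation": "quarantine",
}


def final_action(intent_action: str, safety_verdict: str, trusted_sender: bool = False) -> str:
    """Apply the safety override unless the sender is trusted or the verdict is clean."""
    if trusted_sender or safety_verdict == "clean":
        return intent_action
    # Assumption: an unrecognized verdict fails closed into quarantine.
    return SAFETY_OVERRIDES.get(safety_verdict, "quarantine")


print(final_action("notify_owner", "phishing"))
```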
Some replies must always go to a human, regardless of classification confidence:
Include flags in your classification output to explain why escalation happened:
| Flag | Meaning |
|------|---------|
| conflicting_intents | Top two intents are close in score or opposing |
| adversarial_position | Sensitive-intent keywords in unexpected position |
| low_confidence | Top score below confidence threshold |
| injection_risk | Prompt injection patterns detected |
| thread_anomaly | Reply doesn't match thread context |
Classification feeds into a next-best-action engine that considers the full contact context, not just the current reply:
| Context | Recommendation |
|---------|---------------|
| Contact is suppressed | stop - do not send anything |
| Recent objection or negative signal | stop - respect the no |
| Unsafe safety verdict | escalate - human review needed |
| Sensitive intent (legal, billing, security) | escalate - human review |
| Positive intent (interested) | reply - respond promptly |
| Last outbound < 24h with no reply | wait - don't pile on |
| No activity > 7 days | nudge - gentle follow-up |
| No strong signal either way | wait - monitor for changes |
The key insight: a single reply's classification is necessary but not sufficient. You need the full timeline - sends, replies, bounces, journey state - to make a good decision.
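The table can be read as an ordered rule list, evaluated top to bottom. The sketch below assumes that ordering is the precedence; `ContactContext` and its field names are hypothetical and stand in for whatever your timeline store provides.

```python
from dataclasses import dataclass
from typing import Optional

SENSITIVE_INTENTS = {"legal", "billing", "security"}


@dataclass
class ContactContext:
    suppressed: bool = False
    recent_objection: bool = False
    safety_verdict: str = "clean"
    intent: Optional[str] = None
    hours_since_last_outbound: Optional[float] = None
    days_since_last_activity: Optional[float] = None


def next_best_action(ctx: ContactContext) -> str:
    """Walk the rules in table order and return the first recommendation that applies."""
    if ctx.suppressed or ctx.recent_objection:
        return "stop"
    if ctx.safety_verdict != "clean" or ctx.intent in SENSITIVE_INTENTS:
        return "escalate"
    if ctx.intent == "interested":
        return "reply"
    if ctx.hours_since_last_outbound is not None and ctx.hours_since_last_outbound < 24:
        return "wait"
    if ctx.days_since_last_activity is not None and ctx.days_since_last_activity > 7:
        return "nudge"
    return "wait"


print(next_best_action(ContactContext(intent="interested")))
```

Putting the stop rules first means a suppressed contact never receives a reply, even when the latest message looks positive.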
Three approaches, each with trade-offs:
Rule-based:
Pros: Deterministic, fast (sub-millisecond), no training data needed, easy to debug, no external dependencies.
Cons: Misses nuance, can't handle sarcasm or complex phrasing, requires manual keyword list maintenance.
Best for: Auto-reply/OOO detection (near-perfect accuracy), bounce detection, unsubscribe detection - categories with predictable patterns.
ML classifier:
Pros: Handles nuance better, learns from your data, good accuracy with 50-100 labeled examples per category.
Cons: Requires training data, needs retraining as language evolves, still struggles with very short replies.
Best for: Intent classification at scale when you have labeled training data from past campaigns.
LLM:
Pros: Handles nuance and context exceptionally well, works zero-shot (no training data), understands sarcasm and complex phrasing.
Cons: Slow (100-500ms per classification), expensive at scale, non-deterministic, vulnerable to prompt injection in the email body.
Best for: Low-volume high-value classification, fallback for low-confidence rule-based results, classification where context from the thread matters.
Use all three in layers:
This layered approach gives you speed where it matters and accuracy where it's needed, without running every reply through an expensive model.
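One possible shape for the layering, as a sketch only: each layer is a callable that returns an intent and a confidence, or `None` when it has no opinion, and later (more expensive) layers run only if earlier ones are not confident. The ordering, the 0.60 threshold, and the names `Classifier` and `classify_layered` are assumptions.

```python
from typing import Callable, Optional, Tuple

Classification = Tuple[str, float]
Classifier = Callable[[str], Optional[Classification]]


def classify_layered(
    text: str,
    rules: Classifier,
    ml: Classifier,
    llm: Classifier,
    threshold: float = 0.60,
) -> Classification:
    """Try the cheapest layer first; fall through only while confidence stays below the threshold."""
    best: Optional[Classification] = None
    for layer in (rules, ml, llm):
        result = layer(text)
        if result is None:
            continue
        if result[1] >= threshold:
            return result
        if best is None or result[1] > best[1]:
            best = result
    # Nothing was confident: return the strongest weak guess, or unclassified.
    return best or ("unclassified", 0.0)
```

Because the loop returns as soon as a layer is confident, the LLM callable is never invoked for replies the rules already handle, such as auto-replies.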
Treating auto-replies as human responses. OOO and auto-acknowledgment replies should never trigger sequence steps, CRM updates, or rep notifications. Check headers before content - always.
Not honoring unsubscribes from replies. When someone replies "unsubscribe" or "remove me", that's a valid opt-out even if they didn't click your unsubscribe link. CAN-SPAM requires you to honor any reasonable opt-out request. Process it immediately.
Auto-acting on low-confidence classifications. If your classifier is only 55% sure an email is "not interested", don't auto-archive it. That email might be from a prospect saying "I'm not interested in Plan A, but tell me about Plan B." Route low-confidence results to a human.
Ignoring the runner-up intent. When the top two intents are close in score, the classification is ambiguous. Logging only the winner throws away important signal. Always capture the runner-up intent and its score.
Using a single threshold for all intents. A 0.7 confidence for "out_of_office" is very different from 0.7 for "interested". OOO detection is reliable at 0.7; interest detection needs more scrutiny. Adjust thresholds per intent or at least per risk tier.
Classifying without thread context. "Yes" means nothing without knowing what question was asked. If you have thread history, use it. A "yes" reply to "Would you like a demo?" is interested. A "yes" reply to "Should I remove you from the list?" is objection.
Running LLM classification on unsanitized email bodies. Email bodies can contain prompt injection attacks. If you feed raw reply content into an LLM, an attacker can manipulate your classification. Sanitize content and use structured prompts that separate the instruction from the email content.
Treating all bounces the same. Hard bounces (5.1.1 - mailbox doesn't exist) and soft bounces (5.2.2 - mailbox full) require completely different handling. Hard bounces should suppress immediately. Soft bounces should retry with backoff.
Bulk-approving high-risk intents. Legal and security intents should never be bulk-approved. Each one needs individual review. An email that mentions "attorney" or "compliance" could be a serious matter.
Not rate-limiting classification. If you use an external service (ML model, LLM API) for classification, a sudden spike in inbound volume can overwhelm it. Queue classifications and process them at a controlled rate.
Services like molted.email handle the full classification and routing pipeline out of the box - intent classification, safety filtering, routing with SLAs, and human approval workflows - so you can focus on what to do with the results rather than building the classifier.