skills/eliciting-least-to-most-reasoning-phishing/SKILL.md
Detect phishing URLs using Least-to-Most iterative decomposition with answer sensitivity scoring. Triggers: 'analyze this URL for phishing', 'is this URL safe', 'check URL for phishing indicators', 'phishing detection', 'classify this URL as phishing or benign', 'scan URL for suspicious patterns'
npx skillsauth add ndpvt-web/arxiv-claude-skills eliciting-least-to-most-reasoning-phishing
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
This skill enables Claude to classify URLs as phishing or benign using a Least-to-Most prompting framework with answer sensitivity. Rather than making a single snap judgment, Claude decomposes URL analysis into a series of increasingly specific sub-questions (examining domain structure, path components, TLD patterns, and brand impersonation signals), where each answer informs the next. A running phishing-likelihood percentage (0-100%) is updated at each iteration, and the analysis stops once a confidence threshold is crossed. This technique, from Trikilis et al. (2026), achieves F1 scores of 0.90+ across multiple datasets without supervised training, outperforming one-shot classification by ~3% F1.
Least-to-Most prompting decomposes a hard problem into ordered sub-questions, solving each one sequentially so that earlier answers provide context for later, harder questions. For phishing detection, this means examining a URL's domain, TLD, path structure, and impersonation signals one at a time rather than holistically. Each sub-question produces both a textual analysis (80-100 words) and a phishing-likelihood percentage (0% = certainly benign, 100% = certainly phishing).
The answer sensitivity mechanism is what makes this approach iterative and self-correcting. After each sub-question, the running phishing-likelihood score is evaluated against two thresholds: an upper threshold (e.g., 85%) indicating high phishing confidence, and a lower threshold (e.g., 15%) indicating high benign confidence. If neither threshold is crossed, the framework generates a new, deeper sub-question and repeats, up to a maximum of 10 iterations. This prevents premature classification and allows borderline URLs to receive deeper analysis. URLs that exhaust all iterations without crossing a threshold default conservatively to "phishing." Research shows that, in the "outlier" cases that needed extra iterations, the additional analysis corrected 19 of 24 URLs that one-shot methods misclassified.
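The stopping rule described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function name is hypothetical, the per-iteration scores are assumed to be supplied by the caller (in practice they would come from the model's answer to each sub-question), and the strict `>` / `<` comparisons are an assumption. The thresholds and the 10-iteration cap mirror the example defaults in the text.

```python
def classify_with_sensitivity(scores, upper=85.0, lower=15.0, max_iterations=10):
    """Apply the two-threshold stopping rule to a sequence of running scores.

    Returns (label, final_score, iterations_used).
    `scores` is assumed to hold one phishing-likelihood percentage per
    sub-question, in the order the sub-questions were answered.
    """
    final = 50.0      # neutral starting point before any sub-question
    iterations = 0
    for score in scores:
        if iterations == max_iterations:
            break     # iteration budget exhausted
        iterations += 1
        final = score
        if final > upper:
            return "phishing", final, iterations
        if final < lower:
            return "benign", final, iterations
    # No threshold crossed (budget or scores exhausted):
    # fall back to the conservative default described in the text.
    return "phishing", final, iterations
```

For instance, the score sequence `[82, 93]` stops at the second iteration with a phishing label, while a sequence that stays between the thresholds for ten iterations falls through to the conservative default.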
The progressive accumulation of evidence mirrors how a human analyst works: start with the obvious signals (is the domain itself suspicious?), then dig into subtler indicators (does the path suggest a compromised server? does the filename pattern match known phishing kits?). Each iteration's context carries forward, so later sub-questions benefit from all prior reasoning.
1. Extract and normalize the URL. Parse the raw URL into components: scheme, subdomain, domain, TLD, port, path segments, query parameters, and fragment. Handle URL encoding and IDN/punycode domains.
2. Initialize the sensitivity tracker. Set the phishing-likelihood score to 50% (neutral), define the upper threshold (default: 85%) and lower threshold (default: 15%), and set the iteration counter to 0 with a maximum of 10.
3. Generate the first sub-question: Domain analysis. Ask: "What is the primary domain of this URL, and does its structure suggest brand impersonation, typosquatting, or suspicious registration patterns?" Produce an 80-100 word analysis and update the phishing-likelihood percentage.
4. Check threshold crossing. If the score exceeds the upper threshold, classify as phishing and stop. If below the lower threshold, classify as benign and stop. Otherwise, continue.
5. Generate the second sub-question: TLD and subdomain analysis. Ask: "Does the TLD choice or subdomain structure raise suspicion? Are free hosting services, unusual country-code TLDs, or excessive subdomain depth present?" Incorporate the domain analysis from step 3 into the context. Update the score.
6. Generate the third sub-question: Path and filename analysis. Ask: "What do the directory names and filenames in the URL path suggest about website compromise or malicious content hosting? Are there patterns like /install/, /wp-admin/, or files like document.php?" Carry forward all prior context. Update the score.
7. Generate further sub-questions as needed (iterations 4-10). If thresholds remain uncrossed, probe deeper signals: query parameter anomalies, URL length and entropy, presence of IP addresses instead of domains, use of URL shorteners, HTTPS mismatches, and known phishing kit fingerprints. Each iteration inherits all prior answers.
8. Apply the default rule for exhausted iterations. If 10 iterations complete without crossing either threshold, classify the URL conservatively as phishing. Log this as a low-confidence result requiring human review.
9. Compile the reasoning chain. Present the final classification along with the full iteration history: each sub-question, its answer, and the running score trajectory. This provides an auditable explanation for the decision.
10. Report results. Output the classification (phishing/benign), the final confidence score, the number of iterations required, and the key evidence that drove the decision.
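The first step above (extract and normalize the URL) could look roughly like this using only Python's standard library. The function name and the returned dictionary keys are hypothetical. One loud assumption: the domain is taken to be the last two host labels, which is wrong for multi-part suffixes such as `co.uk` and for IP-address hosts; a production version would consult the Public Suffix List. Punycode is only flagged here, not decoded.

```python
from urllib.parse import urlsplit, parse_qs, unquote


def parse_url(raw_url):
    """Split a URL into the components used by the later sub-questions."""
    parts = urlsplit(raw_url.strip())
    host = (parts.hostname or "").lower()
    labels = host.split(".") if host else []

    # ASSUMPTION: naive "last two labels" split, see the note above.
    if len(labels) >= 2:
        domain = ".".join(labels[-2:])
        subdomain = ".".join(labels[:-2])
        tld = labels[-1]
    else:
        domain, subdomain, tld = host, "", ""

    return {
        "scheme": parts.scheme,
        "subdomain": subdomain,
        "domain": domain,
        "tld": tld,
        "port": parts.port,
        # Percent-decode each path segment and drop empty ones.
        "path_segments": [unquote(s) for s in parts.path.split("/") if s],
        # parse_qs percent-decodes keys and values for us.
        "query": parse_qs(parts.query),
        "fragment": parts.fragment,
        # IDN hosts appear as "xn--" labels when punycode-encoded.
        "has_punycode": any(label.startswith("xn--") for label in labels),
    }
```

Applied to the URL from Example 1, this yields `suspicious-domain.xyz` as the domain and `paypa1-secure-login.com` as the subdomain, which is the split the domain and TLD sub-questions reason about.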
Example 1: Obvious phishing URL
User: "Is this URL phishing? http://paypa1-secure-login.com.suspicious-domain.xyz/account/verify.php?id=8372"
Approach:
1. Parse: domain=suspicious-domain.xyz, subdomain=paypa1-secure-login.com, path=/account/verify.php, query=id=8372
2. Iteration 1: The primary domain is suspicious-domain.xyz with subdomain paypa1-secure-login.com. The subdomain impersonates PayPal using the "1" for "l" typosquatting technique. Score: 82%
3. Iteration 2: The .xyz TLD is commonly associated with disposable phishing domains. The subdomain embeds a fake .com to deceive users into thinking paypa1-secure-login.com is the actual domain. Score: 93% (upper threshold crossed, stop).

Output:
Classification: PHISHING
Confidence: 93%
Iterations: 2
Key evidence:
- Typosquatting of "PayPal" (paypa1 with numeral 1)
- Subdomain designed to mimic legitimate domain structure
- .xyz TLD commonly used for disposable phishing infrastructure
- Path pattern (verify.php) matches credential harvesting pages
Sensitivity trajectory: 50% → 82% → 93% (threshold crossed)
Example 2: Benign URL requiring deeper analysis
User: "Check this URL: https://docs.github.com/en/repositories/creating-and-managing-repositories/cloning-a-repository"
Approach:
1. Parse: domain=github.com, subdomain=docs, path=/en/repositories/.../cloning-a-repository
2. Iteration 1: github.com is a well-known, legitimate software development platform. No typosquatting or impersonation. Score: 18%
3. Iteration 2: The docs subdomain is the official documentation subdomain for GitHub. The .com TLD is standard. Score: 10% (lower threshold crossed, stop).

Output:
Classification: BENIGN
Confidence: 10% phishing likelihood (90% benign confidence)
Iterations: 2
Key evidence:
- github.com is a verified, high-reputation domain
- docs.github.com is the official documentation subdomain
- Path structure matches standard GitHub documentation URL patterns
- HTTPS with legitimate certificate expected
Sensitivity trajectory: 50% → 18% → 10% (threshold crossed)
Example 3: Ambiguous URL requiring many iterations
User: "Classify: https://login-microsoftonline.com/common/oauth2/authorize"
Approach:
1. Parse: domain=login-microsoftonline.com, path=/common/oauth2/authorize
2. Iteration 1: login-microsoftonline.com resembles Microsoft's legitimate login.microsoftonline.com but is missing the dot separator; this is a concatenation-based impersonation. Score: 72%
3. Iteration 2: The .com TLD is standard and unremarkable. The lack of subdomain structure differs from Microsoft's actual auth flow. Score: 75%
4. Iteration 3: The path /common/oauth2/authorize exactly mirrors Microsoft's legitimate OAuth endpoint path, suggesting deliberate replication of the authentication flow. Score: 80%
5. Iteration 4: Combining login with microsoftonline (vs. the real login.microsoftonline.com) plus an exact replica of the OAuth path is a strong phishing indicator. Score: 88% (threshold crossed).

Output:
Classification: PHISHING
Confidence: 88%
Iterations: 4
Key evidence:
- Domain concatenates "login" + "microsoftonline" without dot separator
- Real Microsoft domain is login.microsoftonline.com (subdomain structure)
- Path perfectly replicates legitimate OAuth2 authorization endpoint
- Classic credential-harvesting pattern targeting Microsoft accounts
Sensitivity trajectory: 50% → 72% → 75% → 80% → 88% (threshold crossed)
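The output blocks in the three examples above share one shape, which a small formatter can reproduce. This is a sketch only: `format_report` and its parameters are hypothetical names, the trajectory is assumed to start with the neutral 50% score, and the wording used when no threshold was crossed is an assumption rather than something the source specifies.

```python
def format_report(label, trajectory, evidence, crossed=True):
    """Render a classification report in the layout used by the examples.

    trajectory: running scores, beginning with the initial 50% value.
    evidence:   short strings describing the key signals.
    """
    final = trajectory[-1]
    iterations = len(trajectory) - 1  # the first entry is the starting score

    if label == "benign":
        confidence = f"{final}% phishing likelihood ({100 - final}% benign confidence)"
    else:
        confidence = f"{final}%"

    lines = [
        f"Classification: {label.upper()}",
        f"Confidence: {confidence}",
        f"Iterations: {iterations}",
        "Key evidence:",
    ]
    lines.extend(f"- {item}" for item in evidence)

    path = " → ".join(f"{score}%" for score in trajectory)
    # ASSUMPTION: the "no threshold crossed" wording is illustrative only.
    suffix = "threshold crossed" if crossed else "no threshold crossed, default applied"
    lines.append(f"Sensitivity trajectory: {path} ({suffix})")
    return "\n".join(lines)
```

Keeping the full trajectory in the report, rather than only the final score, preserves the auditable reasoning chain that the workflow asks for.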
- If a URL embeds another URL (for example, url=https://... in query params), analyze both the outer URL and the redirect target separately.
- URLs that use the data: scheme are unusual and warrant elevated initial suspicion scores.

Trikilis, H., Marasinghe, P., Rashid, F., & Seneviratne, S. (2026). Eliciting Least-to-Most Reasoning for Phishing URL Detection. arXiv:2601.20270v1. https://arxiv.org/abs/2601.20270v1
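For the embedded-redirect case noted above (a URL carried inside a query parameter), one way to surface the inner targets so each can be analyzed separately is sketched below. The function name is hypothetical, and treating only values that begin with `http://` or `https://` as embedded URLs is a simplifying assumption; encoded or nested redirects would need more handling.

```python
from urllib.parse import urlsplit, parse_qsl


def embedded_urls(raw_url):
    """Return (parameter, value) pairs whose value looks like an http(s) URL."""
    found = []
    # parse_qsl percent-decodes values, so encoded targets are caught too.
    for key, value in parse_qsl(urlsplit(raw_url).query):
        if value.lower().startswith(("http://", "https://")):
            found.append((key, value))
    return found
```

Each returned value can then be run through the same iterative analysis as the outer URL.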
Key takeaway: The answer sensitivity mechanism — iterating with a running phishing-likelihood percentage until confidence thresholds are crossed — is what transforms Least-to-Most prompting from a generic decomposition strategy into a self-correcting classification framework that catches subtle phishing signals missed by one-shot analysis.