GOMS Models for Task Analysis

license: Apache-2.0

Metadata

Name: goms-task-analysis
Source: GOMS Models for Task Analysis by David Kieras
Domain: Human-computer interaction, task analysis, procedural knowledge modeling
Activation Triggers:
- "How long will it take users to learn this?"
- "Why is this interface inconsistent?"
- "Can we predict performance before building?"
- "How do we model procedural knowledge?"
- "What's the cognitive load of this workflow?"
- "How should we decompose this task?"
- Questions about learning time, execution time, error prediction for interfaces

When to Use This Skill

Load this skill when facing:

Pre-prototype evaluation: Need quantitative predictions of learning/execution time before implementation
Interface consistency audits: Determining whether similar tasks use similar procedures
Workflow complexity assessment: Understanding cognitive load and working memory demands
Training cost estimation: Predicting how much procedural knowledge users must acquire
Multi-agent task decomposition: Designing hierarchical coordination structures
Procedural knowledge representation: Distinguishing rote memorization from generalizable understanding
Error-prone step identification: Finding where users will struggle or make mistakes
Design alternative comparison: Choosing between approaches based on cognitive cost

Core insight: GOMS bridges informal task descriptions and executable cognitive models, making human performance predictable through hierarchical-sequential decomposition constrained by cognitive architecture.

Core Mental Models

1. Generative Models Enable Generalization

Concept: A procedural model must be generative—capable of producing correct behavior for any instance within a task class—not just describe specific action sequences.

Why it matters:

Captures what users actually learn vs. what they do once
Distinguishes memorized procedures from understood methods
Enables design evaluation across task variations
Models the transferable knowledge, not the behavioral trace

Application: When modeling a task, ask "Can this procedure handle variations?" If your model only works for one specific example, you're modeling behavior, not knowledge.

2. Hierarchical Decomposition Reveals Consistency

Concept: Breaking tasks into goal-method-operator-selection hierarchies makes consistency (or its absence) structurally visible.

Why it matters:

Similar high-level goals requiring different low-level methods = unnecessary learning burden
Shared submethods across goals = consistency that reduces cognitive load
The hierarchy is the structure of learnable procedural knowledge
Inconsistency appears as duplicated methods with variations

Application: Map two "similar" tasks side-by-side in hierarchical form. Where their trees diverge unnecessarily, you've found a consistency problem users will feel as confusion.

3. Working Memory as Tagged Coordination

Concept: Working memory isn't unlimited storage—it's a coordination mechanism using explicit tags (<filename>, <target>) that represents real cognitive constraints.

Why it matters:

Accessing stored information takes ~1200ms (not zero)
Multiple tags create actual cognitive load
"Forgetting" must be deliberate (Delete operations)
Memory demands become inspectable before user testing

Application: Count the working memory tags active at any point in your procedure. More than 3-4 simultaneously active tags signals likely cognitive overload.

4. The Bypass Heuristic for Intractable Complexity

Concept: When encountering processes too complex to model practically (reading comprehension, creative judgment, problem-solving), bypass them by treating results as "already available on a yellow pad."

Why it matters:

Not avoidance—explicit documentation of where complexity lies
Only model what varies with design alternatives
Separates interface design space from task domain complexity
Makes modeling tractable without losing validity

Application: If modeling a process requires PhD-level domain theory, bypass it. Focus on what the interface contributes to task difficulty, not domain complexity.

5. Errors as Predictable Goal Triggers

Concept: Errors aren't model failures—they're foreseeable events triggering error-recovery methods. Well-designed systems have simple, consistent recovery procedures.

Why it matters:

Error-prone steps are identifiable (complex calculations, delayed feedback, ambiguous states)
Recovery difficulty is predictable through the same framework as normal operation
Error tolerance becomes designable, not accidental
Consistency matters as much for recovery as for primary tasks

Application: For each operator in your model, ask "What if this fails?" If recovery requires a completely different mental model, the design is brittle.

Decision Frameworks

Should I build this model with GOMS?

If you need to compare design alternatives before implementation → Use GOMS to get quantitative learning/execution predictions

If the task is primarily perceptual-motor with little decision-making → Use simpler keystroke-level models (KLM)

If the task involves complex problem-solving or creative work → Use bypass heuristic to model only the interface-mediated portions

If you need to understand why users find a system confusing → Build hierarchical GOMS to reveal procedural inconsistencies

How detailed should my model be?

If comparing macro-level design approaches → Model to method level only, showing goal decomposition

If predicting actual execution time → Model to operator level, including primitive actions

If analyzing working memory load → Include all Retain, Delete, and Recall operations explicitly

If the detail varies greatly across design alternatives → Only that differing detail matters; use bypass heuristic elsewhere

How do I know if procedures are consistent?

If achieving similar high-level goals requires learning different low-level methods → Inconsistent (redesign for shared submethods)

If the same submethod serves multiple parent goals → Consistent (users learn once, apply broadly)

If selection rules differ arbitrarily for similar situations → Inconsistent (standardize decision criteria)

If error recovery uses the same mechanisms as normal operation → Consistent (reduced cognitive load)

Should I model this as one method or multiple?

If conditions at execution time determine which procedure to follow → One method with selection rules

If users think of these as different tasks → Separate methods (even if procedures are similar)

If the procedures share 80%+ structure with small variations → One parameterized method

If the procedures are fundamentally different approaches → Multiple methods (cognitive distinction exists)

Reference Table

| Reference File | When to Load | Key Content | |---------------|--------------|-------------| | hierarchical-decomposition-as-coordination-mechanism.md | Designing multi-agent systems, task distribution, understanding how hierarchy enables reuse | How goal-method hierarchies enable distributed intelligence to coordinate without global communication; structure as shared protocol | | generative-models-versus-behavioral-traces.md | Distinguishing knowledge from behavior, evaluating if a model captures transferable understanding | The fundamental distinction between models that generate behavior for any task instance vs. those describing specific sequences; why generativity matters | | bypassing-complexity-yellow-pad-heuristic.md | Encountering intractable domain complexity, scoping modeling effort, defending modeling boundaries | How to explicitly non-model complex processes; separating interface-mediated work from domain complexity; pragmatic tractability | | working-memory-as-coordination-mechanism.md | Analyzing cognitive load, designing information handoffs, multi-agent state management | Tagged working memory as coordination mechanism; explicit Retain/Recall/Delete operations; memory as constraint not storage | | judgment-calls-and-how-users-view-tasks.md | Making modeling decisions about user mental models, handling observational ambiguity | The irreducible judgment calls in task modeling; how analyst perspective shapes models; representing unobservable user understanding | | failure-modes-in-procedural-systems.md | Predicting errors, designing error recovery, understanding when procedures break down | Where GOMS models fail (novel situations, skill development); error as goal trigger; designing for fault tolerance |

Anti-Patterns

Modeling Behavioral Traces Instead of Generative Knowledge

Symptom: Your model only works for the exact example you documented, not for variations.

Why it fails: You've captured what an expert does in one case, not what users must learn to handle all cases.

GOMS corrective: Build procedures with parameters and selection rules that generate correct behavior across instances.

Ignoring the Hierarchy

Symptom: Flat lists of steps; no sense of goal-subgoal structure.

Why it fails: You can't see consistency patterns; every task looks equally complex; no reuse visible.

GOMS corrective: Decompose to goals and methods; shared submethods become obvious; inconsistency appears as duplicated-but-different structures.

Treating Working Memory as Free Storage

Symptom: Procedure assumes users "just remember" 7+ pieces of information indefinitely.

Why it fails: Real users experience cognitive overload; information requires deliberate retention and recall; access has time cost.

GOMS corrective: Explicitly tag all working memory contents; count active tags; add Retain/Recall/Delete operations; if >4 tags active, redesign to reduce load.

Modeling Everything at Equal Detail

Symptom: Spending days modeling complex domain calculations that are identical across design alternatives.

Why it fails: Wastes effort on areas where interface design has no impact; obscures what actually matters.

GOMS corrective: Use bypass heuristic for complex invariant processes; focus detail where design alternatives differ.

Treating Errors as Model Failures

Symptom: "The model doesn't account for mistakes, so it's unrealistic."

Why it fails: Errors are predictable, not random; error-prone steps have characteristics; recovery is designable.

GOMS corrective: Identify error-prone operators (complex calculations, delayed feedback); model error recovery as goal-triggered methods; design for consistent recovery procedures.

Confusing Task Structure with Interface Structure

Symptom: Organizing model by menu structure or page flow rather than user goals.

Why it fails: Interface organization often doesn't match task logic; forces artificial procedural sequences.

GOMS corrective: Organize by user goals, not interface geography; let goal-method hierarchy be natural; poor fit between task and interface structure becomes visible.

Shibboleths: Distinguishing Deep Understanding from Surface Knowledge

Surface-Level Understanding Says:

"GOMS just breaks tasks into steps"
"It's for predicting how long tasks take"
"Too detailed for modern agile development"
"Only works for simple procedural tasks"
"Replaced by user testing"

Deep Understanding Recognizes:

The generative distinction: "This model memorizes one case; that model generates solutions for the class"
Hierarchy as consistency detector: "These tasks look different but share 80% of their method structure—easy to learn together"
Working memory as coordination: "We need a Recall operation here because the system stored that information 3 steps ago"
Bypass as principled scoping: "We're bypassing the routing algorithm—not because it's unimportant, but because it's invariant across our design alternatives"
Structure enables reuse: "If we pull 'Verify Input' into a separate method, it becomes reusable across all entry forms—users learn once"
Time prediction as side effect: "The execution time estimate is useful, but the real value is discovering that the new design requires learning 3 additional methods"
Error as designed-for event: "This operator has high error risk (ambiguous feedback), so we need a consistent, simple recovery method here"

Tell-Tale Phrases of Integration:

"The hierarchical structure reveals..."
"This needs to be generative across..."
"Working memory load at this point is..."
"We should bypass this complexity because..."
"The selection rule here mirrors..."
"This inconsistency appears as duplicated methods..."
"Error recovery triggers a goal to..."

The Master's Recognition:

Someone who has truly internalized GOMS doesn't just analyze tasks—they see procedural structure in real-time during design discussions. They notice when "similar" features actually require unrelated procedural knowledge. They instinctively count working memory tags. They distinguish fluently between behavioral observations and generative knowledge. They know when to model deeply and when to bypass pragmatically.

The ultimate shibboleth: They can explain why a proposed interface change will increase learning time before anyone builds it, and do so by pointing to specific structural changes in the goal-method hierarchy, not through vague intuition.

GOMS Models for Task Analysis

license: Apache-2.0

Metadata

Name: goms-task-analysis
Source: GOMS Models for Task Analysis by David Kieras
Domain: Human-computer interaction, task analysis, procedural knowledge modeling
Activation Triggers:
- "How long will it take users to learn this?"
- "Why is this interface inconsistent?"
- "Can we predict performance before building?"
- "How do we model procedural knowledge?"
- "What's the cognitive load of this workflow?"
- "How should we decompose this task?"
- Questions about learning time, execution time, error prediction for interfaces

When to Use This Skill

Load this skill when facing:

Pre-prototype evaluation: Need quantitative predictions of learning/execution time before implementation
Interface consistency audits: Determining whether similar tasks use similar procedures
Workflow complexity assessment: Understanding cognitive load and working memory demands
Training cost estimation: Predicting how much procedural knowledge users must acquire
Multi-agent task decomposition: Designing hierarchical coordination structures
Procedural knowledge representation: Distinguishing rote memorization from generalizable understanding
Error-prone step identification: Finding where users will struggle or make mistakes
Design alternative comparison: Choosing between approaches based on cognitive cost

Core Mental Models

1. Generative Models Enable Generalization

Concept: A procedural model must be generative—capable of producing correct behavior for any instance within a task class—not just describe specific action sequences.

Why it matters:

Captures what users actually learn vs. what they do once
Distinguishes memorized procedures from understood methods
Enables design evaluation across task variations
Models the transferable knowledge, not the behavioral trace

Application: When modeling a task, ask "Can this procedure handle variations?" If your model only works for one specific example, you're modeling behavior, not knowledge.

2. Hierarchical Decomposition Reveals Consistency

Concept: Breaking tasks into goal-method-operator-selection hierarchies makes consistency (or its absence) structurally visible.

Why it matters:

Similar high-level goals requiring different low-level methods = unnecessary learning burden
Shared submethods across goals = consistency that reduces cognitive load
The hierarchy is the structure of learnable procedural knowledge
Inconsistency appears as duplicated methods with variations

Application: Map two "similar" tasks side-by-side in hierarchical form. Where their trees diverge unnecessarily, you've found a consistency problem users will feel as confusion.

3. Working Memory as Tagged Coordination

Concept: Working memory isn't unlimited storage—it's a coordination mechanism using explicit tags (<filename>, <target>) that represents real cognitive constraints.

Why it matters:

Accessing stored information takes ~1200ms (not zero)
Multiple tags create actual cognitive load
"Forgetting" must be deliberate (Delete operations)
Memory demands become inspectable before user testing

Application: Count the working memory tags active at any point in your procedure. More than 3-4 simultaneously active tags signals likely cognitive overload.

4. The Bypass Heuristic for Intractable Complexity

Why it matters:

Not avoidance—explicit documentation of where complexity lies
Only model what varies with design alternatives
Separates interface design space from task domain complexity
Makes modeling tractable without losing validity

Application: If modeling a process requires PhD-level domain theory, bypass it. Focus on what the interface contributes to task difficulty, not domain complexity.

5. Errors as Predictable Goal Triggers

Concept: Errors aren't model failures—they're foreseeable events triggering error-recovery methods. Well-designed systems have simple, consistent recovery procedures.

Why it matters:

Error-prone steps are identifiable (complex calculations, delayed feedback, ambiguous states)
Recovery difficulty is predictable through the same framework as normal operation
Error tolerance becomes designable, not accidental
Consistency matters as much for recovery as for primary tasks

Application: For each operator in your model, ask "What if this fails?" If recovery requires a completely different mental model, the design is brittle.

Decision Frameworks

Should I build this model with GOMS?

If you need to compare design alternatives before implementation → Use GOMS to get quantitative learning/execution predictions

If the task is primarily perceptual-motor with little decision-making → Use simpler keystroke-level models (KLM)

If the task involves complex problem-solving or creative work → Use bypass heuristic to model only the interface-mediated portions

If you need to understand why users find a system confusing → Build hierarchical GOMS to reveal procedural inconsistencies

How detailed should my model be?

If comparing macro-level design approaches → Model to method level only, showing goal decomposition

If predicting actual execution time → Model to operator level, including primitive actions

If analyzing working memory load → Include all Retain, Delete, and Recall operations explicitly

If the detail varies greatly across design alternatives → Only that differing detail matters; use bypass heuristic elsewhere

How do I know if procedures are consistent?

If achieving similar high-level goals requires learning different low-level methods → Inconsistent (redesign for shared submethods)

If the same submethod serves multiple parent goals → Consistent (users learn once, apply broadly)

If selection rules differ arbitrarily for similar situations → Inconsistent (standardize decision criteria)

If error recovery uses the same mechanisms as normal operation → Consistent (reduced cognitive load)

Should I model this as one method or multiple?

If conditions at execution time determine which procedure to follow → One method with selection rules

If users think of these as different tasks → Separate methods (even if procedures are similar)

If the procedures share 80%+ structure with small variations → One parameterized method

If the procedures are fundamentally different approaches → Multiple methods (cognitive distinction exists)

Reference Table

Anti-Patterns

Modeling Behavioral Traces Instead of Generative Knowledge

Symptom: Your model only works for the exact example you documented, not for variations.

Why it fails: You've captured what an expert does in one case, not what users must learn to handle all cases.

GOMS corrective: Build procedures with parameters and selection rules that generate correct behavior across instances.

Ignoring the Hierarchy

Symptom: Flat lists of steps; no sense of goal-subgoal structure.

Why it fails: You can't see consistency patterns; every task looks equally complex; no reuse visible.

GOMS corrective: Decompose to goals and methods; shared submethods become obvious; inconsistency appears as duplicated-but-different structures.

Treating Working Memory as Free Storage

Symptom: Procedure assumes users "just remember" 7+ pieces of information indefinitely.

Why it fails: Real users experience cognitive overload; information requires deliberate retention and recall; access has time cost.

GOMS corrective: Explicitly tag all working memory contents; count active tags; add Retain/Recall/Delete operations; if >4 tags active, redesign to reduce load.

Modeling Everything at Equal Detail

Symptom: Spending days modeling complex domain calculations that are identical across design alternatives.

Why it fails: Wastes effort on areas where interface design has no impact; obscures what actually matters.

GOMS corrective: Use bypass heuristic for complex invariant processes; focus detail where design alternatives differ.

Treating Errors as Model Failures

Symptom: "The model doesn't account for mistakes, so it's unrealistic."

Why it fails: Errors are predictable, not random; error-prone steps have characteristics; recovery is designable.

GOMS corrective: Identify error-prone operators (complex calculations, delayed feedback); model error recovery as goal-triggered methods; design for consistent recovery procedures.

Confusing Task Structure with Interface Structure

Symptom: Organizing model by menu structure or page flow rather than user goals.

Why it fails: Interface organization often doesn't match task logic; forces artificial procedural sequences.

GOMS corrective: Organize by user goals, not interface geography; let goal-method hierarchy be natural; poor fit between task and interface structure becomes visible.

Shibboleths: Distinguishing Deep Understanding from Surface Knowledge

Surface-Level Understanding Says:

"GOMS just breaks tasks into steps"
"It's for predicting how long tasks take"
"Too detailed for modern agile development"
"Only works for simple procedural tasks"
"Replaced by user testing"

Deep Understanding Recognizes:

The generative distinction: "This model memorizes one case; that model generates solutions for the class"
Hierarchy as consistency detector: "These tasks look different but share 80% of their method structure—easy to learn together"
Working memory as coordination: "We need a Recall operation here because the system stored that information 3 steps ago"
Bypass as principled scoping: "We're bypassing the routing algorithm—not because it's unimportant, but because it's invariant across our design alternatives"
Structure enables reuse: "If we pull 'Verify Input' into a separate method, it becomes reusable across all entry forms—users learn once"
Time prediction as side effect: "The execution time estimate is useful, but the real value is discovering that the new design requires learning 3 additional methods"
Error as designed-for event: "This operator has high error risk (ambiguous feedback), so we need a consistent, simple recovery method here"

Tell-Tale Phrases of Integration:

"The hierarchical structure reveals..."
"This needs to be generative across..."
"Working memory load at this point is..."
"We should bypass this complexity because..."
"The selection rule here mirrors..."
"This inconsistency appears as duplicated methods..."
"Error recovery triggers a goal to..."

Adoption

curiositech/kieras-goms-for-task-analysis

$ install --global

Security Scan Results

SKILL.md

GOMS Models for Task Analysis

Metadata

When to Use This Skill

Core Mental Models

1. Generative Models Enable Generalization

2. Hierarchical Decomposition Reveals Consistency

3. Working Memory as Tagged Coordination

4. The Bypass Heuristic for Intractable Complexity

5. Errors as Predictable Goal Triggers

Decision Frameworks

Should I build this model with GOMS?

How detailed should my model be?

How do I know if procedures are consistent?

Should I model this as one method or multiple?

Reference Table

Anti-Patterns

Modeling Behavioral Traces Instead of Generative Knowledge

Ignoring the Hierarchy

Treating Working Memory as Free Storage

Modeling Everything at Equal Detail

Treating Errors as Model Failures

Confusing Task Structure with Interface Structure

Shibboleths: Distinguishing Deep Understanding from Surface Knowledge

Surface-Level Understanding Says:

Deep Understanding Recognizes:

Tell-Tale Phrases of Integration:

The Master's Recognition:

Related Skills

curiositech/revisiting-interview-data-analysing-turn

curiositech/redis-patterns-expert

curiositech/react-server-components-boundary

curiositech/rate-limiting-strategy

curiositech/kieras-goms-for-task-analysis

$ install --global

Security Scan Results

SKILL.md

GOMS Models for Task Analysis

Metadata

When to Use This Skill

Core Mental Models

1. Generative Models Enable Generalization

2. Hierarchical Decomposition Reveals Consistency

3. Working Memory as Tagged Coordination

4. The Bypass Heuristic for Intractable Complexity

5. Errors as Predictable Goal Triggers

Decision Frameworks

Should I build this model with GOMS?

How detailed should my model be?

How do I know if procedures are consistent?

Should I model this as one method or multiple?

Reference Table

Anti-Patterns

Modeling Behavioral Traces Instead of Generative Knowledge

Ignoring the Hierarchy

Treating Working Memory as Free Storage

Modeling Everything at Equal Detail

Treating Errors as Model Failures

Confusing Task Structure with Interface Structure

Shibboleths: Distinguishing Deep Understanding from Surface Knowledge

Surface-Level Understanding Says:

Deep Understanding Recognizes:

Tell-Tale Phrases of Integration:

The Master's Recognition:

Related Skills

curiositech/revisiting-interview-data-analysing-turn

curiositech/redis-patterns-expert

curiositech/react-server-components-boundary

curiositech/rate-limiting-strategy