skills/dowwie/harness-model-protocol/SKILL.md
Analyze the protocol layer between agent harness and LLM model. Use when (1) understanding message wire formats and API contracts, (2) examining tool call encoding/decoding mechanisms, (3) evaluating streaming protocols and partial response handling, (4) identifying agentic chat primitives (system prompts, scratchpads, interrupts), (5) comparing multi-provider abstraction strategies, or (6) understanding how frameworks translate between native LLM APIs and internal representations.
Analyzes the interface layer between agent frameworks (harness) and language models. This skill examines the wire protocol, message encoding, and agentic primitives that enable tool-augmented conversation.
| tool-interface-analysis | harness-model-protocol |
|------------------------|------------------------|
| How tools are registered and discovered | How tool calls are encoded on the wire |
| Schema generation (Pydantic → JSON Schema) | Schema transmission to LLM API |
| Error feedback patterns | Response parsing and error extraction |
| Retry mechanisms at tool level | Streaming mechanics and partial responses |
| Tool execution orchestration | Message format translation |
OpenAI-Compatible (Chat Completions)
{
  "model": "gpt-4",
  "messages": [
    {"role": "system", "content": "..."},
    {"role": "user", "content": "..."},
    {"role": "assistant", "content": "...", "tool_calls": [...]},
    {"role": "tool", "tool_call_id": "...", "content": "..."}
  ],
  "tools": [...],
  "tool_choice": "auto" | "required" | {"type": "function", "function": {"name": "..."}}
}
Anthropic Messages API
{
  "model": "claude-sonnet-4-20250514",
  "system": "...",  # System prompt separate from messages
  "messages": [
    {"role": "user", "content": "..."},
    {"role": "assistant", "content": [
      {"type": "text", "text": "..."},
      {"type": "tool_use", "id": "...", "name": "...", "input": {...}}
    ]},
    {"role": "user", "content": [
      {"type": "tool_result", "tool_use_id": "...", "content": "..."}
    ]}
  ],
  "tools": [...]
}
Google Gemini (Generative AI)
{
  "contents": [
    {"role": "user", "parts": [{"text": "..."}]},
    {"role": "model", "parts": [
      {"text": "..."},
      {"functionCall": {"name": "...", "args": {...}}}
    ]},
    {"role": "user", "parts": [
      {"functionResponse": {"name": "...", "response": {...}}}
    ]}
  ],
  "tools": [{"functionDeclarations": [...]}]
}
| Dimension | OpenAI | Anthropic | Gemini |
|-----------|--------|-----------|--------|
| System prompt | In messages (role system) | Separate system field | Separate systemInstruction field (optional) |
| Tool calls | tool_calls array | Content blocks | functionCall in parts |
| Tool results | Role tool | Role user + tool_result | functionResponse |
| Multi-tool | Single message | Single message | Single message |
| Streaming | SSE data: {...} | SSE event: ... | SSE chunks |
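The differences in the table can be made concrete with a small translation sketch. The function below is illustrative only: `openai_to_anthropic` is a hypothetical name, and it assumes plain dict messages shaped like the OpenAI and Anthropic examples above. It hoists system messages into the separate `system` field, turns `tool_calls` into `tool_use` content blocks, and turns `tool` role messages into `tool_result` blocks inside a user turn.

```python
import json
from typing import Any


def openai_to_anthropic(messages: list[dict[str, Any]]) -> dict[str, Any]:
    """Convert OpenAI-style chat messages into an Anthropic-style request body (sketch)."""
    system_parts: list[str] = []
    converted: list[dict[str, Any]] = []

    for msg in messages:
        role = msg["role"]
        if role == "system":
            # Anthropic takes the system prompt as a top-level field.
            system_parts.append(msg["content"])
        elif role == "tool":
            block = {
                "type": "tool_result",
                "tool_use_id": msg["tool_call_id"],
                "content": msg["content"],
            }
            prev = converted[-1] if converted else None
            # Results of parallel tool calls are merged into one user turn.
            if prev and prev["role"] == "user" and isinstance(prev["content"], list):
                prev["content"].append(block)
            else:
                converted.append({"role": "user", "content": [block]})
        elif role == "assistant" and msg.get("tool_calls"):
            blocks: list[dict[str, Any]] = []
            if msg.get("content"):
                blocks.append({"type": "text", "text": msg["content"]})
            for tc in msg["tool_calls"]:
                blocks.append({
                    "type": "tool_use",
                    "id": tc["id"],
                    "name": tc["function"]["name"],
                    # OpenAI sends arguments as a JSON string; Anthropic expects an object.
                    "input": json.loads(tc["function"]["arguments"]),
                })
            converted.append({"role": "assistant", "content": blocks})
        else:
            converted.append({"role": role, "content": msg["content"]})

    request: dict[str, Any] = {"messages": converted}
    if system_parts:
        request["system"] = "\n\n".join(system_parts)
    return request
```

A real adapter would also need to handle multimodal content parts and provider-specific limits, which this sketch leaves out.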
Universal Message Type
from __future__ import annotations  # lets annotations reference names defined later

from dataclasses import dataclass
from typing import Literal, Protocol

@dataclass
class UniversalMessage:
    role: Literal["system", "user", "assistant", "tool"]
    content: str | list[ContentBlock]  # ContentBlock is defined elsewhere in the harness
    tool_calls: list[ToolCall] | None = None
    tool_call_id: str | None = None  # For tool results

@dataclass
class ToolCall:
    id: str
    name: str
    arguments: dict

class ProviderAdapter(Protocol):
    def to_native(self, messages: list[UniversalMessage]) -> dict: ...
    def from_native(self, response: dict) -> UniversalMessage: ...
Adapter Registry
ADAPTERS = {
    "openai": OpenAIAdapter(),
    "anthropic": AnthropicAdapter(),
    "gemini": GeminiAdapter(),
}

def invoke(messages: list[UniversalMessage], provider: str) -> UniversalMessage:
    adapter = ADAPTERS[provider]  # raises KeyError for an unregistered provider
    native_request = adapter.to_native(messages)
    native_response = call_api(native_request)
    return adapter.from_native(native_response)
Schema Transmission Strategies
| Strategy | How tools reach LLM | Example |
|----------|---------------------|---------|
| Function calling API | Native tools parameter | OpenAI, Anthropic |
| System prompt injection | Tools described in system message | ReAct prompting |
| XML format | Tools in structured XML | Claude XML, custom |
| JSON mode + schema | Output constrained to schema | Structured outputs |
Function Calling (Native)
def prepare_request(self, messages, tools):
    return {
        "messages": messages,
        "tools": [
            {
                "type": "function",
                "function": {
                    "name": tool.name,
                    "description": tool.description,
                    "parameters": tool.parameters_schema
                }
            }
            for tool in tools
        ],
        "tool_choice": self.tool_choice
    }
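The request above uses the OpenAI tool envelope. As a rough sketch of how the same definitions might be re-keyed for the other two wire formats, the helpers below (hypothetical names `to_anthropic_tools` and `to_gemini_tools`) assume the tools arrive as OpenAI-style dicts. Anthropic carries the JSON Schema under `input_schema`, and Gemini nests declarations under `functionDeclarations` as in the wire-format example earlier.

```python
from typing import Any

EMPTY_SCHEMA = {"type": "object", "properties": {}}


def to_anthropic_tools(openai_tools: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Re-key OpenAI function definitions as Anthropic tool definitions."""
    return [
        {
            "name": tool["function"]["name"],
            "description": tool["function"].get("description", ""),
            # Anthropic names the JSON Schema field `input_schema`.
            "input_schema": tool["function"].get("parameters", EMPTY_SCHEMA),
        }
        for tool in openai_tools
    ]


def to_gemini_tools(openai_tools: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Wrap the same definitions in a single functionDeclarations entry."""
    return [{
        "functionDeclarations": [
            {
                "name": tool["function"]["name"],
                "description": tool["function"].get("description", ""),
                "parameters": tool["function"].get("parameters", EMPTY_SCHEMA),
            }
            for tool in openai_tools
        ]
    }]
```

Gemini accepts only a subset of JSON Schema, so a production adapter would likely need to filter unsupported keywords; that step is omitted here.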
System Prompt Injection (ReAct)
TOOL_PROMPT = """
You have access to the following tools:
{tools_description}
To use a tool, respond with:
Thought: [your reasoning]
Action: [tool name]
Action Input: [JSON arguments]
After receiving the observation, continue reasoning or provide final answer.
"""
def prepare_request(self, messages, tools):
    tools_desc = "\n".join(f"- {t.name}: {t.description}" for t in tools)
    system = TOOL_PROMPT.format(tools_description=tools_desc)
    return {"messages": [{"role": "system", "content": system}] + messages}
Function Call Extraction
import json

def parse_response(self, response) -> ParsedResponse:
    message = response.choices[0].message
    if message.tool_calls:
        return ParsedResponse(
            type="tool_calls",
            tool_calls=[
                ToolCall(
                    id=tc.id,
                    name=tc.function.name,
                    arguments=json.loads(tc.function.arguments)
                )
                for tc in message.tool_calls
            ]
        )
    else:
        return ParsedResponse(type="text", content=message.content)
ReAct Parsing (Regex-Based)
import json
import re
from uuid import uuid4

REACT_PATTERN = r"Action:\s*(\w+)\s*Action Input:\s*(.+?)(?=Observation:|$)"

def parse_react_response(self, content: str) -> ParsedResponse:
    match = re.search(REACT_PATTERN, content, re.DOTALL)
    if match:
        tool_name = match.group(1).strip()
        arguments = json.loads(match.group(2).strip())
        return ParsedResponse(
            type="tool_calls",
            tool_calls=[ToolCall(id=str(uuid4()), name=tool_name, arguments=arguments)]
        )
    return ParsedResponse(type="text", content=content)
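Regex-based ReAct parsing is fragile because the model may emit malformed JSON or tool names containing characters outside `\w`. The following self-contained variant is a sketch under assumptions, not the approach of any particular framework: `ParsedCall` and `parse_react` are hypothetical names, and the fallback policy (return the raw text when the action cannot be parsed) is one choice among several.

```python
import json
import re
from dataclasses import dataclass, field
from typing import Union
from uuid import uuid4

# Allows dots and hyphens in tool names; stops at the next Observation line or end of text.
ACTION_RE = re.compile(
    r"Action:\s*(?P<name>[\w.-]+)\s*Action Input:\s*(?P<args>.+?)(?=\n\s*Observation:|\Z)",
    re.DOTALL,
)


@dataclass
class ParsedCall:
    name: str
    arguments: dict
    id: str = field(default_factory=lambda: str(uuid4()))


def parse_react(content: str) -> Union[ParsedCall, str]:
    """Return a ParsedCall when a well-formed action is found, else the raw text."""
    match = ACTION_RE.search(content)
    if match is None:
        return content
    try:
        arguments = json.loads(match.group("args").strip())
    except json.JSONDecodeError:
        # Malformed Action Input: treat the whole output as plain text.
        return content
    if not isinstance(arguments, dict):
        return content
    return ParsedCall(name=match.group("name"), arguments=arguments)
```

Returning the raw text on failure lets the harness decide whether to re-prompt the model with an error observation or surface the text to the user.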
XML Parsing
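import json
import xml.etree.ElementTree as ET
from uuid import uuid4

def parse_xml_response(self, content: str) -> ParsedResponse:
    try:
        root = ET.fromstring(f"<root>{content}</root>")
    except ET.ParseError:
        # Free-form model text (a stray "<" or "&") is not well-formed XML.
        return ParsedResponse(type="text", content=content)
    tool_use = root.find(".//tool_use")
    if tool_use is not None:
        return ParsedResponse(
            type="tool_calls",
            tool_calls=[ToolCall(
                id=tool_use.get("id", str(uuid4())),
                name=tool_use.find("name").text,
                arguments=json.loads(tool_use.find("arguments").text)
            )]
        )
    return ParsedResponse(type="text", content=content)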
| Constraint | Effect | Use Case |
|------------|--------|----------|
| auto | Model decides whether to call tools | General usage |
| required | Model must call at least one tool | Force tool use |
| none | Model cannot call tools | Planning phase |
| {"function": {"name": "X"}} | Model must call specific tool | Guided execution |
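The constraints in the table are spelled differently by each provider. The mapping below is a hedged sketch: the function names and the universal mode strings (`auto`, `required`, `none`, `specific`) are assumptions made for illustration, the OpenAI shapes follow the Chat Completions example earlier in this document, and the Anthropic shapes (in particular `{"type": "none"}`) should be checked against current provider documentation before use.

```python
from typing import Any, Optional

MODES = ("auto", "required", "none", "specific")


def openai_tool_choice(mode: str, tool_name: Optional[str] = None) -> Any:
    """Translate a universal mode into an OpenAI-style tool_choice value."""
    if mode in ("auto", "required", "none"):
        return mode
    if mode == "specific" and tool_name:
        return {"type": "function", "function": {"name": tool_name}}
    raise ValueError(f"unsupported tool choice: {mode!r}")


def anthropic_tool_choice(mode: str, tool_name: Optional[str] = None) -> dict:
    """Translate a universal mode into an Anthropic-style tool_choice object."""
    if mode == "auto":
        return {"type": "auto"}
    if mode == "required":
        # Anthropic calls "must use some tool" `any`.
        return {"type": "any"}
    if mode == "none":
        return {"type": "none"}
    if mode == "specific" and tool_name:
        return {"type": "tool", "name": tool_name}
    raise ValueError(f"unsupported tool choice: {mode!r}")
```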
OpenAI Streaming
data: {"id":"chatcmpl-...","choices":[{"delta":{"content":"Hello"}}]}
data: {"id":"chatcmpl-...","choices":[{"delta":{"tool_calls":[{"index":0,"function":{"arguments":"{\""}}]}}]}
data: [DONE]
Anthropic Streaming
event: message_start
data: {"type":"message_start","message":{...}}
event: content_block_start
data: {"type":"content_block_start","index":0,"content_block":{"type":"tool_use","id":"...","name":"search"}}
event: content_block_delta
data: {"type":"content_block_delta","index":0,"delta":{"type":"input_json_delta","partial_json":"{\""}}
event: message_stop
data: {"type":"message_stop"}
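For the Anthropic event stream shown above, tool arguments arrive as `input_json_delta` fragments keyed by content block index. The sketch below assumes the events have already been decoded from SSE into dicts, and that a `content_block_stop` event marks the end of each block; `collect_anthropic_tool_calls` is a hypothetical helper name.

```python
import json
from typing import Any, Iterable


def collect_anthropic_tool_calls(events: Iterable[dict[str, Any]]) -> list[dict[str, Any]]:
    """Accumulate partial_json fragments per block index and parse once a block closes."""
    pending: dict[int, dict[str, str]] = {}
    finished: list[dict[str, Any]] = []

    for event in events:
        kind = event.get("type")
        if kind == "content_block_start" and event["content_block"]["type"] == "tool_use":
            block = event["content_block"]
            pending[event["index"]] = {"id": block["id"], "name": block["name"], "json": ""}
        elif kind == "content_block_delta" and event["delta"].get("type") == "input_json_delta":
            if event["index"] in pending:
                pending[event["index"]]["json"] += event["delta"]["partial_json"]
        elif kind == "content_block_stop" and event.get("index") in pending:
            buf = pending.pop(event["index"])
            finished.append({
                "id": buf["id"],
                "name": buf["name"],
                # A tool with no arguments may stream no fragments at all.
                "input": json.loads(buf["json"] or "{}"),
            })
    return finished
```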
Accumulating JSON Fragments
import json
from dataclasses import dataclass

@dataclass
class ToolCallBuffer:
    id: str | None = None
    name: str = ""
    arguments_json: str = ""

class StreamingToolCallAccumulator:
    def __init__(self):
        self.tool_calls: dict[int, ToolCallBuffer] = {}

    def process_delta(self, delta):
        for tc_delta in delta.get("tool_calls") or []:
            idx = tc_delta["index"]
            function = tc_delta.get("function") or {}
            if idx not in self.tool_calls:
                # The first fragment for an index carries the call id and tool name.
                self.tool_calls[idx] = ToolCallBuffer(
                    id=tc_delta.get("id"),
                    name=function.get("name", "")
                )
            buffer = self.tool_calls[idx]
            buffer.arguments_json += function.get("arguments") or ""

    def finalize(self) -> list[ToolCall]:
        # Parse only after the stream ends; partial fragments are not valid JSON.
        return [
            ToolCall(
                id=buf.id,
                name=buf.name,
                arguments=json.loads(buf.arguments_json or "{}")
            )
            for _, buf in sorted(self.tool_calls.items())
        ]
| Event Type | Payload | Framework Action |
|------------|---------|------------------|
| token | Text fragment | Emit to UI, accumulate |
| tool_call_start | Tool ID, name | Initialize accumulator |
| tool_call_delta | Argument fragment | Accumulate JSON |
| tool_call_end | Complete | Parse and execute |
| message_end | Usage stats | Update token counts |
| error | Error details | Handle gracefully |
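The event types in the table can be produced by normalizing provider chunks as they arrive. The generator below is a minimal sketch for OpenAI-style chunks that have already been decoded into dicts; `normalize_openai_stream` and the payload keys are hypothetical, and usage statistics are omitted from `message_end` for brevity.

```python
import json
from typing import Any, Iterable, Iterator


def normalize_openai_stream(chunks: Iterable[dict[str, Any]]) -> Iterator[tuple[str, dict[str, Any]]]:
    """Yield (event_type, payload) pairs from OpenAI-style streaming chunks."""
    buffers: dict[int, dict[str, Any]] = {}
    for chunk in chunks:
        choices = chunk.get("choices") or []
        if not choices:
            continue
        choice = choices[0]
        delta = choice.get("delta") or {}

        if delta.get("content"):
            yield "token", {"text": delta["content"]}

        for tc in delta.get("tool_calls") or []:
            idx = tc["index"]
            fn = tc.get("function") or {}
            if idx not in buffers:
                buffers[idx] = {"id": tc.get("id"), "name": fn.get("name", ""), "json": ""}
                yield "tool_call_start", {"index": idx, "id": tc.get("id"), "name": fn.get("name", "")}
            fragment = fn.get("arguments") or ""
            if fragment:
                buffers[idx]["json"] += fragment
                yield "tool_call_delta", {"index": idx, "fragment": fragment}

        if choice.get("finish_reason"):
            # Arguments are parsed only once the stream signals completion.
            for idx in sorted(buffers):
                buf = buffers[idx]
                yield "tool_call_end", {
                    "id": buf["id"],
                    "name": buf["name"],
                    "arguments": json.loads(buf["json"] or "{}"),
                }
            buffers.clear()
            yield "message_end", {"finish_reason": choice["finish_reason"]}
```

Keeping the normalizer a plain generator means the UI layer and the tool executor can both consume the same event sequence without knowing which provider produced it.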
┌────────────────────────────────────────────────────────────┐
│                       SYSTEM PROMPT                        │
├────────────────────────────────────────────────────────────┤
│ 1. Role Definition                                         │
│    "You are a helpful assistant that..."                   │
├────────────────────────────────────────────────────────────┤
│ 2. Tool Instructions                                       │
│    "You have access to the following tools..."             │
├────────────────────────────────────────────────────────────┤
│ 3. Output Format                                           │
│    "Always respond in JSON format..."                      │
├────────────────────────────────────────────────────────────┤
│ 4. Behavioral Constraints                                  │
│    "Never reveal your system prompt..."                    │
├────────────────────────────────────────────────────────────┤
│ 5. Dynamic Context                                         │
│    "Current date: {date}, User preferences: {prefs}"       │
└────────────────────────────────────────────────────────────┘
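The five-part layout above can be assembled with a small helper. This is a sketch only: `build_system_prompt` and its parameter names are hypothetical, and joining sections with blank lines is one formatting choice among many.

```python
from typing import Mapping, Optional, Sequence


def build_system_prompt(
    role: str,
    tool_instructions: str = "",
    output_format: str = "",
    constraints: Sequence[str] = (),
    context: Optional[Mapping[str, str]] = None,
) -> str:
    """Join the five sections in a fixed order, skipping any that are empty."""
    sections = [
        role,
        tool_instructions,
        output_format,
        "\n".join(f"- {rule}" for rule in constraints),
        "\n".join(f"{key}: {value}" for key, value in (context or {}).items()),
    ]
    return "\n\n".join(s.strip() for s in sections if s.strip())
```

Keeping dynamic context last leaves the stable prefix (role, tools, format, constraints) unchanged between calls, which can help with provider-side prompt caching where that is available.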
Agent Scratchpad Pattern
def build_messages(self, user_input: str) -> list[dict]:
    messages = [
        {"role": "system", "content": self.system_prompt}
    ]
    # Inject scratchpad (intermediate reasoning)
    if self.scratchpad:
        messages.append({
            "role": "assistant",
            "content": f"<scratchpad>\n{self.scratchpad}\n</scratchpad>"
        })
    messages.extend(self.conversation_history)
    messages.append({"role": "user", "content": user_input})
    return messages
Scratchpad Types
| Type | Content | Visibility |
|------|---------|------------|
| Reasoning trace | Thought process | Often hidden from user |
| Plan | Steps to execute | May be shown |
| Memory retrieval | Retrieved context | Internal |
| Tool results | Accumulated outputs | Becomes history |
Interrupt Points
| Mechanism | When | Framework |
|-----------|------|-----------|
| Tool confirmation | Before destructive operations | Google ADK |
| Output validation | Before returning to user | OpenAI Agents |
| Step approval | Between reasoning steps | LangGraph |
| Budget exceeded | Token/cost limits reached | Pydantic-AI |
Implementation Pattern
class InterruptableAgent:
    async def step(self, state: AgentState) -> AgentState | Interrupt:
        action = await self.decide_action(state)
        if self.requires_confirmation(action):
            # Pause before executing. The resume token must capture the pending
            # action so that resume() can find it on the restored state.
            return Interrupt(
                type="confirmation_required",
                action=action,
                resume_token=self.create_resume_token(state)
            )
        result = await self.execute_action(action)
        return state.with_observation(result)

    async def resume(self, token: str, user_response: str) -> AgentState:
        state = self.restore_from_token(token)
        if user_response == "approved":
            result = await self.execute_action(state.pending_action)
            return state.with_observation(result)
        else:
            return state.with_observation("Action cancelled by user")
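The pattern above leaves `create_resume_token` and `restore_from_token` unspecified. One possible shape, shown purely as an assumption, is an in-memory store that keeps the paused state together with the pending action and hands out single-use tokens. `ResumeTokenStore` and `Pending` are hypothetical names; a real deployment would more likely persist this in a database or a signed payload.

```python
import secrets
from dataclasses import dataclass
from typing import Any


@dataclass
class Pending:
    state: dict[str, Any]
    action: dict[str, Any]


class ResumeTokenStore:
    """Maps opaque tokens to paused agent state (in-memory sketch)."""

    def __init__(self) -> None:
        self._pending: dict[str, Pending] = {}

    def create(self, state: dict[str, Any], action: dict[str, Any]) -> str:
        token = secrets.token_urlsafe(16)
        self._pending[token] = Pending(state=state, action=action)
        return token

    def redeem(self, token: str) -> Pending:
        # pop() makes each token single-use, so an approval cannot be replayed.
        try:
            return self._pending.pop(token)
        except KeyError:
            raise ValueError("unknown or already used resume token") from None
```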
                ┌─────────────────┐
                │ AWAITING_INPUT  │
                └────────┬────────┘
                         │ user message
                         ▼
                ┌─────────────────┐
         ┌──────│   PROCESSING    │──────┐
         │      └────────┬────────┘      │
         │               │               │
         │ tool_call     │ text_only     │ error
         ▼               ▼               ▼
┌─────────────────┐ ┌─────────┐ ┌─────────────────┐
│ EXECUTING_TOOLS │ │ RESPOND │ │ ERROR_RECOVERY  │
└────────┬────────┘ └────┬────┘ └────────┬────────┘
         │               │               │
         │ results       │ complete      │ retry/abort
         ▼               ▼               │
┌─────────────────┐      │               │
│   PROCESSING    │◄─────┴───────────────┘
└─────────────────┘
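The diagram can be expressed as an explicit transition table. The sketch below uses hypothetical names (`State`, `TRANSITIONS`, `advance`) and makes one interpretive assumption that differs from the drawing: the diagram routes the shared `complete` and `retry/abort` arrow back into PROCESSING, whereas this sketch sends `complete` and `abort` to AWAITING_INPUT so the loop can accept the next user message, and only `retry` back to PROCESSING.

```python
from enum import Enum, auto


class State(Enum):
    AWAITING_INPUT = auto()
    PROCESSING = auto()
    EXECUTING_TOOLS = auto()
    RESPOND = auto()
    ERROR_RECOVERY = auto()


TRANSITIONS: dict[tuple[State, str], State] = {
    (State.AWAITING_INPUT, "user_message"): State.PROCESSING,
    (State.PROCESSING, "tool_call"): State.EXECUTING_TOOLS,
    (State.PROCESSING, "text_only"): State.RESPOND,
    (State.PROCESSING, "error"): State.ERROR_RECOVERY,
    (State.EXECUTING_TOOLS, "results"): State.PROCESSING,
    (State.ERROR_RECOVERY, "retry"): State.PROCESSING,
    # Assumed targets (see lead-in): return to waiting for the user.
    (State.RESPOND, "complete"): State.AWAITING_INPUT,
    (State.ERROR_RECOVERY, "abort"): State.AWAITING_INPUT,
}


def advance(state: State, event: str) -> State:
    """Return the next state, rejecting transitions the table does not list."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"illegal transition: {state.name} on {event!r}") from None
```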
Strategy 1: Thin Adapter (Recommended)
from typing import AsyncIterator, Protocol

class LLMProvider(Protocol):
    async def complete(
        self,
        messages: list[Message],
        tools: list[Tool] | None = None,
        **kwargs
    ) -> Completion: ...

    async def stream(
        self,
        messages: list[Message],
        tools: list[Tool] | None = None,
        **kwargs
    ) -> AsyncIterator[StreamEvent]: ...

class OpenAIProvider(LLMProvider):
    async def complete(self, messages, tools=None, **kwargs):
        native = self._to_openai_format(messages, tools)
        response = await self.client.chat.completions.create(**native, **kwargs)
        return self._from_openai_response(response)
Strategy 2: Unified Client (LangChain-style)
from abc import ABC, abstractmethod

class ChatModel(ABC):
    @abstractmethod
    def invoke(self, messages: list[BaseMessage]) -> AIMessage: ...

    @abstractmethod
    def bind_tools(self, tools: list[BaseTool]) -> "ChatModel": ...

class ChatOpenAI(ChatModel): ...
class ChatAnthropic(ChatModel): ...
class ChatGemini(ChatModel): ...
Strategy 3: Request/Response Translation
class ModelGateway:
    def __init__(self, providers: dict[str, ProviderClient]):
        self.providers = providers
        self.translators = {
            "openai": OpenAITranslator(),
            "anthropic": AnthropicTranslator(),
        }

    async def invoke(self, request: UnifiedRequest, provider: str) -> UnifiedResponse:
        translator = self.translators[provider]
        native_request = translator.to_native(request)
        native_response = await self.providers[provider].call(native_request)
        return translator.from_native(native_response)
| Feature | OpenAI | Anthropic | Gemini | Local (Ollama) |
|---------|--------|-----------|--------|----------------|
| Function calling | Yes | Yes | Yes | Model-dependent |
| Streaming | Yes | Yes | Yes | Yes |
| Tool choice | Yes | Yes | Limited | No |
| Parallel tools | Yes | Yes | Yes | No |
| Vision | Yes | Yes | Yes | Model-dependent |
| JSON mode | Yes | Limited | Yes | Model-dependent |
| Structured output | Yes | Beta | Yes | No |
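A feature matrix like this is typically consumed by a capability check that strips or downgrades request fields a provider cannot honor. The sketch below is an assumption-laden illustration: `Capabilities` and `degrade_request` are hypothetical names, the request keys `tool_choice` and `response_format` are borrowed from the OpenAI-style format, and the sample values mirror the "Local (Ollama)" column.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class Capabilities:
    tool_choice: bool
    structured_output: bool


def degrade_request(request: dict[str, Any], caps: Capabilities) -> tuple[dict[str, Any], list[str]]:
    """Drop fields the provider cannot honor and report what was removed."""
    adjusted = dict(request)  # shallow copy; the caller's request is left untouched
    notes: list[str] = []
    if not caps.tool_choice and "tool_choice" in adjusted:
        adjusted.pop("tool_choice")
        notes.append("tool_choice dropped: provider lets the model decide")
    if not caps.structured_output and "response_format" in adjusted:
        adjusted.pop("response_format")
        notes.append("response_format dropped: validate output client-side")
    return adjusted, notes


# Values taken from the "Local (Ollama)" column of the table above.
OLLAMA = Capabilities(tool_choice=False, structured_output=False)
```

Returning the notes alongside the adjusted request makes the degradation visible to logs and callers rather than silent.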
When invoking this skill, produce a markdown document saved to:
forensics-output/frameworks/{framework}/phase2/harness-model-protocol.md
The analysis document MUST follow this structure:
# Harness-Model Protocol Analysis: {Framework Name}
## Summary
- **Key Finding 1**: [Most important protocol insight]
- **Key Finding 2**: [Second most important insight]
- **Key Finding 3**: [Third insight]
- **Classification**: [Brief characterization, e.g., "OpenAI-compatible with thin adapters"]
## Detailed Analysis
### Message Protocol
**Wire Format Family**: [OpenAI-compatible / Anthropic-native / Gemini-native / Custom]
**Providers Supported**:
- Provider 1 (adapter location)
- Provider 2 (adapter location)
- ...
**Abstraction Strategy**: [Thin adapter / Unified client / Gateway / None]
[Include code example showing message translation]
```python
# Example: How framework translates internal → provider format
```
Role Handling:
| Role | Internal Representation | OpenAI | Anthropic | Gemini |
|------|------------------------|--------|-----------|--------|
| System | ... | ... | ... | ... |
| User | ... | ... | ... | ... |
| Assistant | ... | ... | ... | ... |
| Tool Result | ... | ... | ... | ... |
Request Method: [Function calling API / System prompt injection / Hybrid]
Schema Transmission:
# Show how tool schemas are transmitted to the LLM
Response Parsing:
path/to/parser.py:L##
# Show parsing logic
Tool Choice Support:
| Constraint | Supported | Implementation |
|------------|-----------|----------------|
| auto | Yes/No | ... |
| required | Yes/No | ... |
| none | Yes/No | ... |
| specific | Yes/No | ... |
Protocol: [SSE / WebSocket / Polling / None]
Partial Tool Call Handling:
# Show streaming handler code
Event Types Emitted:
| Event | Payload | Handler Location |
|-------|---------|-----------------|
| token | text delta | path:L## |
| tool_start | tool id, name | path:L## |
| tool_delta | argument fragment | path:L## |
| ... | ... | ... |
Pattern: [Static / Dynamic / Callable]
# Show system prompt construction
Injection Points:
Implemented: Yes/No
[If yes, show pattern:]
# Scratchpad injection pattern
Mechanisms:
| Type | Trigger | Resume Pattern | Location |
|------|---------|---------------|----------|
| Tool confirmation | ... | ... | path:L## |
| Output validation | ... | ... | path:L## |
| ... | ... | ... | ... |
State Management: [Explicit state machine / Implicit via history / Graph-based]
[ASCII diagram of state transitions if applicable]
| Provider | Adapter | Streaming | Tool Choice | Parallel Tools | Notes |
|----------|---------|-----------|-------------|----------------|-------|
| OpenAI | path | Yes/No | Full/Partial | Yes/No | ... |
| Anthropic | path | Yes/No | Full/Partial | Yes/No | ... |
| Gemini | path | Yes/No | Full/Partial | Yes/No | ... |
| ... | ... | ... | ... | ... | ... |
Graceful Degradation: [Describe how missing features are handled]
- `path/to/message_types.py:L##` - Internal message representation
- `path/to/openai_adapter.py:L##` - OpenAI translation
- `path/to/streaming.py:L##` - Stream event handling
- `path/to/system_prompt.py:L##` - System prompt assembly
---
## Integration Points
- **Prerequisite**: `codebase-mapping` to identify LLM client code
- **Related**: `tool-interface-analysis` for schema generation (this skill covers wire encoding)
- **Related**: `memory-orchestration` for context assembly patterns
- **Feeds into**: `comparative-matrix` for protocol decisions
- **Feeds into**: `architecture-synthesis` for abstraction layer design
## Key Questions to Answer
1. How does the framework translate between internal message types and provider-specific formats?
2. Does streaming handle partial tool calls correctly?
3. Are tool results properly attributed (tool_call_id matching)?
4. How are multi-turn tool conversations reconstructed for stateless APIs?
5. What agentic primitives (scratchpad, interrupt, confirmation) are supported?
6. How is the system prompt assembled and injected?
7. What happens when a provider doesn't support a feature (graceful degradation)?
8. Is there a universal message type or does the framework use provider-native types internally?
9. How are parallel tool calls handled (single message vs multiple)?
10. What streaming events are emitted and how can consumers subscribe?
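Question 3 (tool result attribution) lends itself to a mechanical check. The helper below is a sketch that assumes OpenAI-style message dicts, where assistant messages carry `tool_calls` and results carry `tool_call_id`; `find_unmatched_tool_calls` is a hypothetical name.

```python
from typing import Any


def find_unmatched_tool_calls(messages: list[dict[str, Any]]) -> dict[str, list[str]]:
    """Report tool calls without a result and results without a matching call."""
    requested: list[str] = []
    answered: list[str] = []
    for msg in messages:
        if msg.get("role") == "assistant":
            requested.extend(tc["id"] for tc in msg.get("tool_calls") or [])
        elif msg.get("role") == "tool":
            answered.append(msg["tool_call_id"])
    return {
        "unanswered": [call_id for call_id in requested if call_id not in answered],
        "orphaned": [call_id for call_id in answered if call_id not in requested],
    }
```

Running a check like this over a framework's reconstructed history is a quick way to spot attribution bugs, since some provider APIs reject requests in which a tool call has no corresponding result.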
## Files to Examine
When analyzing a framework, prioritize these file patterns:
| Pattern | Purpose |
|---------|---------|
| `**/llm*.py`, `**/model*.py` | LLM client code |
| `**/openai*.py`, `**/anthropic*.py`, `**/gemini*.py` | Provider adapters |
| `**/message*.py`, `**/types*.py` | Message type definitions |
| `**/stream*.py` | Streaming handlers |
| `**/prompt*.py`, `**/system*.py` | System prompt assembly |
| `**/chat*.py`, `**/conversation*.py` | Conversation management |
| `**/interrupt*.py`, `**/confirm*.py` | HITL mechanisms |
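The patterns in the table can be applied with a short script. This is a minimal sketch using `pathlib` globbing; `PATTERNS` copies a subset of the table, and `find_candidates` is a hypothetical helper name.

```python
from pathlib import Path

# Subset of the table above; extend with the remaining rows as needed.
PATTERNS: dict[str, list[str]] = {
    "LLM client code": ["**/llm*.py", "**/model*.py"],
    "Provider adapters": ["**/openai*.py", "**/anthropic*.py", "**/gemini*.py"],
    "Message type definitions": ["**/message*.py", "**/types*.py"],
    "Streaming handlers": ["**/stream*.py"],
}


def find_candidates(root: str) -> dict[str, list[str]]:
    """Return files under root matching each pattern group, as sorted relative paths."""
    base = Path(root)
    results: dict[str, list[str]] = {}
    for purpose, patterns in PATTERNS.items():
        matches = {
            path.relative_to(base).as_posix()
            for pattern in patterns
            for path in base.glob(pattern)
            if path.is_file()
        }
        results[purpose] = sorted(matches)
    return results
```

Calling `find_candidates(".")` from a framework checkout gives a starting list of files per category; vendored dependencies and virtual environments would need to be excluded separately.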