skills/report-agent-risk-data/SKILL.md
Use when an agent is already instrumented with Prefactor and you need to populate data_risk fields on its span types to enable compliance tracking and data governance.
Populate data_risk metadata on span types for an agent that is already instrumented with Prefactor.
Core principle: infer, don't guess. Read the code each span type wraps and reason about what data enters and leaves it — then record that as risk metadata.
Apply this skill when the user asks for any of these:
The agent must already be instrumented with Prefactor and emitting spans. If it is not, first apply skills/instrument-existing-agent-with-prefactor-sdk/SKILL.md.
Also ensure you have:
agent_id of the target agent
The workflow covers, in order: action_profile, params_data_categories, result_data_categories, and delivery of data_risk on each span type.
Search the agent source for strings that name span types:
```bash
# Find span type declarations
rg "spanType|schema_name|withSpan|startSpan" --type ts -l
# Find string literals that look like span type names (package:category pattern)
rg '"[a-z][a-z0-9_-]*:[a-z][a-z0-9_:_-]*"' --type ts
```
For custom SDK agents, collect the unique span type strings used in withSpan or startSpan calls across the codebase.
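The collection step can also be scripted. The following is a minimal sketch, not part of the Prefactor SDK: `collectSpanTypeNames` is a hypothetical helper that applies the same package:category heuristic as the rg pattern to source text you have already read into memory. Unlike the rg pattern, it also accepts single-quoted and backtick-quoted literals, which is an assumption about how span names may be written.

```typescript
// Hypothetical helper (not an SDK function): collects unique string literals
// that look like span type names, using the package:category heuristic.
function collectSpanTypeNames(sources: string[]): string[] {
  const found = new Set<string>();
  for (const source of sources) {
    // A fresh regex per source keeps lastIndex from leaking between inputs.
    const pattern = /["'`]([a-z][a-z0-9_-]*:[a-z][a-z0-9_:-]*)["'`]/g;
    let match: RegExpExecArray | null;
    while ((match = pattern.exec(source)) !== null) {
      found.add(match[1]);
    }
  }
  // Sorted output makes the list easy to diff between runs.
  return Array.from(found).sort();
}
```

The result is only a candidate list. Confirm each name against the actual withSpan or startSpan call sites before recording risk data for it.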
To fill in action_profile for each span type, read the code it wraps and answer these six questions. Use allowed when the span explicitly performs the action, disallowed when it explicitly cannot, and unknown when it is unclear.
| Field | Set to allowed when... | Default when no evidence |
|---|---|---|
| create_data | span creates files, db records, or artifacts | unknown |
| read_data | span reads files, db, memory, or config | unknown |
| update_data | span modifies existing records or files | unknown |
| destroy_data | span deletes data | unknown |
| financial_transactions | span calls payment, billing, or financial APIs | disallowed |
| external_communication | span makes HTTP calls, sends email, or calls external APIs | unknown |
financial_transactions is the only field that should default to disallowed — most spans have no payment involvement and that can be stated confidently.
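The defaults in the table above can be captured in a small builder so that only fields backed by evidence are overridden. This is a sketch under assumptions: the `ActionStatus` and `ActionProfile` types are declared locally to mirror the field names in this document rather than imported from the SDK, and `buildActionProfile` is a hypothetical helper.

```typescript
// Local types mirroring the six action_profile fields described above
// (assumed shapes, not imported from @prefactor/core).
type ActionStatus = 'allowed' | 'disallowed' | 'unknown';

interface ActionProfile {
  create_data: ActionStatus;
  read_data: ActionStatus;
  update_data: ActionStatus;
  destroy_data: ActionStatus;
  financial_transactions: ActionStatus;
  external_communication: ActionStatus;
}

// Hypothetical helper: starts from the documented defaults and applies
// only the values you found evidence for in the code.
function buildActionProfile(evidence: Partial<ActionProfile> = {}): ActionProfile {
  return {
    create_data: 'unknown',
    read_data: 'unknown',
    update_data: 'unknown',
    destroy_data: 'unknown',
    financial_transactions: 'disallowed', // the only field that defaults to disallowed
    external_communication: 'unknown',
    ...evidence,
  };
}

// Example: a span that reads input and demonstrably never deletes anything.
const ingestProfile = buildActionProfile({ read_data: 'allowed', destroy_data: 'disallowed' });
```

Keeping the defaults in one place means a field stays unknown unless you consciously pass evidence for it.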
params_data_categories describes data that flows into the span as inputs.
First, set classification — the overall sensitivity level of the input data:
public: only public web content, no user or org data
internal: org-internal system data, no user PII
confidential: user messages, org documents, or business data
restricted: credentials, secrets, or high-sensitivity regulated data
secret: highest sensitivity (rare)
unknown: unclear from code inspection
Then set each of the 17 category fields to included, excluded, or unknown:
included when you can confirm that type of data is present in inputs.
excluded when you can confirm it is absent.
unknown when it is unclear; do not use excluded speculatively.
Category fields:
personal_identifiers — names, IDs, usernames
contact_information — email, phone, address
financial_information — payment details, account numbers
health_and_medical — medical records, prescriptions
criminal_justice — criminal records, legal proceedings
authentication_and_secrets — passwords, API keys, tokens, private keys
organisational_confidential — internal business documents, source code
minors_data — data relating to children
location_and_tracking — GPS, IP address, movement history
behavioural_and_inferred — usage patterns, inferred preferences
gdpr_racial_or_ethnic_origin
gdpr_political_opinions
gdpr_religious_or_philosophical_beliefs
gdpr_trade_union_membership
gdpr_genetic_data
gdpr_biometric_for_identification
gdpr_sex_life_or_sexual_orientation
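Because all 17 fields must be set, it can help to generate the object and override only what the code confirms. The sketch below assumes locally declared types that mirror the field names listed above. `buildDataCategories` is a hypothetical helper, not an SDK function.

```typescript
type Classification = 'public' | 'internal' | 'confidential' | 'restricted' | 'secret' | 'unknown';
type CategoryStatus = 'included' | 'excluded' | 'unknown';

// The 17 category fields listed above.
const CATEGORY_FIELDS = [
  'personal_identifiers', 'contact_information', 'financial_information',
  'health_and_medical', 'criminal_justice', 'authentication_and_secrets',
  'organisational_confidential', 'minors_data', 'location_and_tracking',
  'behavioural_and_inferred', 'gdpr_racial_or_ethnic_origin', 'gdpr_political_opinions',
  'gdpr_religious_or_philosophical_beliefs', 'gdpr_trade_union_membership',
  'gdpr_genetic_data', 'gdpr_biometric_for_identification',
  'gdpr_sex_life_or_sexual_orientation',
] as const;

type CategoryField = (typeof CATEGORY_FIELDS)[number];
type DataCategories = { classification: Classification } & Record<CategoryField, CategoryStatus>;

// Hypothetical helper: every field not explicitly confirmed stays 'unknown',
// so 'excluded' is never applied speculatively.
function buildDataCategories(
  classification: Classification,
  confirmed: Partial<Record<CategoryField, CategoryStatus>> = {},
): DataCategories {
  const categories = {} as Record<CategoryField, CategoryStatus>;
  for (const field of CATEGORY_FIELDS) {
    categories[field] = confirmed[field] ?? 'unknown';
  }
  return { classification, ...categories };
}

// Example: inputs confirmed to carry internal documents and no payment data.
const ingestParams = buildDataCategories('confidential', {
  organisational_confidential: 'included',
  financial_information: 'excluded',
});
```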
result_data_categories describes data that flows out of the span as outputs. Use the same structure as params_data_categories.
Outputs are often narrower than inputs. A span that reads confidential files may return only a byte count or status, making many result categories excluded. Read the return values and result payload of the span before setting these fields.
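One way to catch outputs that were copied from inputs without review is a simple equality check. This is an illustrative sketch only: `resultMirrorsParams` is a hypothetical helper, and a true result does not prove an error, only that the span's return value deserves a second look.

```typescript
// Category objects are treated as plain string maps for this check.
type CategoryMap = Record<string, string>;

// Hypothetical helper: returns true when the result categories are
// field-for-field identical to the params categories.
function resultMirrorsParams(params: CategoryMap, result: CategoryMap): boolean {
  const keys = Object.keys(params);
  if (keys.length !== Object.keys(result).length) {
    return false;
  }
  return keys.every((key) => params[key] === result[key]);
}

// Example: a span that reads confidential files but returns only a status.
const paramsCategories: CategoryMap = { classification: 'confidential', organisational_confidential: 'included' };
const resultCategories: CategoryMap = { classification: 'internal', organisational_confidential: 'excluded' };
const needsReview = resultMirrorsParams(paramsCategories, resultCategories); // false
```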
For agents instrumented directly with @prefactor/core (or provider packages like @prefactor/langchain / @prefactor/ai), pass data_risk directly on each SpanTypeSchema entry when building the agent schema version. The types are available from core:
```ts
import type { DataRisk, SpanTypeSchema, AgentSchemaVersion } from '@prefactor/core';

const spanTypeSchemas: SpanTypeSchema[] = [
  {
    name: 'myapp:ingest',
    params_schema: { type: 'object', properties: { text: { type: 'string' } } },
    data_risk: {
      action_profile: {
        create_data: 'unknown',
        read_data: 'allowed',
        update_data: 'unknown',
        destroy_data: 'disallowed',
        financial_transactions: 'disallowed',
        external_communication: 'unknown',
      },
      params_data_categories: {
        classification: 'confidential',
        personal_identifiers: 'unknown',
        contact_information: 'unknown',
        financial_information: 'excluded',
        health_and_medical: 'unknown',
        criminal_justice: 'excluded',
        authentication_and_secrets: 'excluded',
        organisational_confidential: 'included',
        minors_data: 'unknown',
        location_and_tracking: 'unknown',
        behavioural_and_inferred: 'unknown',
        gdpr_racial_or_ethnic_origin: 'unknown',
        gdpr_political_opinions: 'unknown',
        gdpr_religious_or_philosophical_beliefs: 'unknown',
        gdpr_trade_union_membership: 'unknown',
        gdpr_genetic_data: 'unknown',
        gdpr_biometric_for_identification: 'unknown',
        gdpr_sex_life_or_sexual_orientation: 'unknown',
      },
      result_data_categories: {
        classification: 'internal',
        personal_identifiers: 'excluded',
        contact_information: 'excluded',
        financial_information: 'excluded',
        health_and_medical: 'excluded',
        criminal_justice: 'excluded',
        authentication_and_secrets: 'excluded',
        organisational_confidential: 'excluded',
        minors_data: 'excluded',
        location_and_tracking: 'excluded',
        behavioural_and_inferred: 'excluded',
        gdpr_racial_or_ethnic_origin: 'excluded',
        gdpr_political_opinions: 'excluded',
        gdpr_religious_or_philosophical_beliefs: 'excluded',
        gdpr_trade_union_membership: 'excluded',
        gdpr_genetic_data: 'excluded',
        gdpr_biometric_for_identification: 'excluded',
        gdpr_sex_life_or_sexual_orientation: 'excluded',
      },
    },
  },
];
```
Build a complete AgentSchemaVersion — span_type_schemas must be nested inside it along with external_identifier (and any other required fields) — then pass that object as AgentInstanceRegisterPayload.agent_schema_version when registering the agent instance, or configure the full AgentSchemaVersion via your provider package's schema configuration:
```ts
const agentSchemaVersion: AgentSchemaVersion = {
  external_identifier: '<version-identifier>',
  span_type_schemas: spanTypeSchemas,
};
// Pass to AgentInstanceRegisterPayload.agent_schema_version
```
Create or update a schema version via CLI when you cannot modify the instrumentation source:
```bash
prefactor agent_schema_versions create \
  --agent_id <agent-id> \
  --external_identifier <version-identifier> \
  --span_type_schemas '<json-array>'
```
Each element of the array should follow this shape:
```json
{
  "name": "<span-type-name>",
  "params_schema": { "type": "object", "properties": {} },
  "data_risk": {
    "action_profile": {
      "create_data": "unknown",
      "read_data": "unknown",
      "update_data": "unknown",
      "destroy_data": "unknown",
      "financial_transactions": "disallowed",
      "external_communication": "unknown"
    },
    "params_data_categories": {
      "classification": "unknown",
      "personal_identifiers": "unknown",
      "contact_information": "unknown",
      "financial_information": "unknown",
      "health_and_medical": "unknown",
      "criminal_justice": "unknown",
      "authentication_and_secrets": "unknown",
      "organisational_confidential": "unknown",
      "minors_data": "unknown",
      "location_and_tracking": "unknown",
      "behavioural_and_inferred": "unknown",
      "gdpr_racial_or_ethnic_origin": "unknown",
      "gdpr_political_opinions": "unknown",
      "gdpr_religious_or_philosophical_beliefs": "unknown",
      "gdpr_trade_union_membership": "unknown",
      "gdpr_genetic_data": "unknown",
      "gdpr_biometric_for_identification": "unknown",
      "gdpr_sex_life_or_sexual_orientation": "unknown"
    },
    "result_data_categories": {
      "classification": "unknown",
      "personal_identifiers": "unknown",
      "contact_information": "unknown",
      "financial_information": "unknown",
      "health_and_medical": "unknown",
      "criminal_justice": "unknown",
      "authentication_and_secrets": "unknown",
      "organisational_confidential": "unknown",
      "minors_data": "unknown",
      "location_and_tracking": "unknown",
      "behavioural_and_inferred": "unknown",
      "gdpr_racial_or_ethnic_origin": "unknown",
      "gdpr_political_opinions": "unknown",
      "gdpr_religious_or_philosophical_beliefs": "unknown",
      "gdpr_trade_union_membership": "unknown",
      "gdpr_genetic_data": "unknown",
      "gdpr_biometric_for_identification": "unknown",
      "gdpr_sex_life_or_sexual_orientation": "unknown"
    }
  }
}
```
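Writing the JSON array by hand for many span types is error-prone, so it may be easier to generate the skeleton and then edit individual values. The sketch below is one possible approach. `skeletonSpanTypeSchema` and `toSpanTypeSchemasArg` are hypothetical helpers, and the output is simply the shape shown above repeated once per span type name.

```typescript
const ACTION_FIELDS_DEFAULT_UNKNOWN = [
  'create_data', 'read_data', 'update_data', 'destroy_data', 'external_communication',
];

const CATEGORY_FIELD_NAMES = [
  'personal_identifiers', 'contact_information', 'financial_information',
  'health_and_medical', 'criminal_justice', 'authentication_and_secrets',
  'organisational_confidential', 'minors_data', 'location_and_tracking',
  'behavioural_and_inferred', 'gdpr_racial_or_ethnic_origin', 'gdpr_political_opinions',
  'gdpr_religious_or_philosophical_beliefs', 'gdpr_trade_union_membership',
  'gdpr_genetic_data', 'gdpr_biometric_for_identification',
  'gdpr_sex_life_or_sexual_orientation',
];

// Builds a categories object with classification and all 17 fields set to 'unknown'.
function unknownCategories(): Record<string, string> {
  const categories: Record<string, string> = { classification: 'unknown' };
  for (const field of CATEGORY_FIELD_NAMES) {
    categories[field] = 'unknown';
  }
  return categories;
}

// Hypothetical helper: one skeleton entry per span type, matching the shape above.
function skeletonSpanTypeSchema(name: string) {
  const action_profile: Record<string, string> = { financial_transactions: 'disallowed' };
  for (const field of ACTION_FIELDS_DEFAULT_UNKNOWN) {
    action_profile[field] = 'unknown';
  }
  return {
    name,
    params_schema: { type: 'object', properties: {} },
    data_risk: {
      action_profile,
      params_data_categories: unknownCategories(),
      result_data_categories: unknownCategories(),
    },
  };
}

// Serializes the array for use as the --span_type_schemas value.
function toSpanTypeSchemasArg(names: string[]): string {
  return JSON.stringify(names.map(skeletonSpanTypeSchema));
}
```

The skeleton is a starting point only. Replace the unknown values with what you inferred from the code before creating the schema version.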
Use these as a starting point and adjust based on what you find in the code.
| Span type pattern | action_profile highlights | classification | Notable categories |
|---|---|---|---|
| *:tool:read | read_data: allowed, financial_transactions: disallowed, others unknown | confidential | authentication_and_secrets: included, organisational_confidential: included |
| *:tool:write | create_data: allowed, financial_transactions: disallowed, others unknown | confidential | organisational_confidential: included |
| *:tool:edit | update_data: allowed, financial_transactions: disallowed, others unknown | confidential | authentication_and_secrets: included, organisational_confidential: included |
| *:tool:exec | read_data: allowed, financial_transactions: disallowed, others unknown | restricted | authentication_and_secrets: included, organisational_confidential: included |
| *:tool:web_search / *:tool:web_fetch / *:tool:browser | external_communication: allowed, others disallowed | public | all categories excluded |
| *:user_message / *:user_interaction | all unknown except financial_transactions: disallowed | confidential | all unknown |
| *:agent_run / *:session | all unknown except financial_transactions: disallowed | internal | all unknown |
| *:agent_thinking / *:assistant_response | create_data: allowed, others unknown | confidential | all unknown |
| *:tool (generic fallback) | all unknown | unknown | all unknown |
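The table above can be encoded as a suffix lookup to seed a first draft. This is a sketch under assumptions: the `StartingPoint` shape, `RULES`, and `startingPointFor` are illustrative names, `*` is read as "any prefix", and the generic `*:tool` row is applied to any span type with a `tool` segment that no earlier row matched.

```typescript
interface StartingPoint {
  classification: string;
  actions: Record<string, string>; // action fields the table names explicitly
  otherActions: string;            // value the table gives for the remaining action fields
}

const FIN_DISALLOWED = { financial_transactions: 'disallowed' };

const RULES: Array<{ suffixes: string[]; start: StartingPoint }> = [
  { suffixes: [':tool:read'], start: { classification: 'confidential', actions: { read_data: 'allowed', ...FIN_DISALLOWED }, otherActions: 'unknown' } },
  { suffixes: [':tool:write'], start: { classification: 'confidential', actions: { create_data: 'allowed', ...FIN_DISALLOWED }, otherActions: 'unknown' } },
  { suffixes: [':tool:edit'], start: { classification: 'confidential', actions: { update_data: 'allowed', ...FIN_DISALLOWED }, otherActions: 'unknown' } },
  { suffixes: [':tool:exec'], start: { classification: 'restricted', actions: { read_data: 'allowed', ...FIN_DISALLOWED }, otherActions: 'unknown' } },
  { suffixes: [':tool:web_search', ':tool:web_fetch', ':tool:browser'], start: { classification: 'public', actions: { external_communication: 'allowed' }, otherActions: 'disallowed' } },
  { suffixes: [':user_message', ':user_interaction'], start: { classification: 'confidential', actions: { ...FIN_DISALLOWED }, otherActions: 'unknown' } },
  { suffixes: [':agent_run', ':session'], start: { classification: 'internal', actions: { ...FIN_DISALLOWED }, otherActions: 'unknown' } },
  { suffixes: [':agent_thinking', ':assistant_response'], start: { classification: 'confidential', actions: { create_data: 'allowed' }, otherActions: 'unknown' } },
];

const GENERIC_TOOL: StartingPoint = { classification: 'unknown', actions: {}, otherActions: 'unknown' };

// Returns the table row for a span type, or undefined when no row applies.
function startingPointFor(spanType: string): StartingPoint | undefined {
  for (const rule of RULES) {
    if (rule.suffixes.some((suffix) => spanType.endsWith(suffix))) {
      return rule.start;
    }
  }
  const isTool = spanType.endsWith(':tool') || spanType.includes(':tool:');
  return isTool ? GENERIC_TOOL : undefined;
}
```

A lookup result is a draft, not a finding. The code each span wraps still decides the final values.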
After delivering risk data, retrieve the schema version and confirm data_risk is present:
```bash
# List schema versions for the agent
prefactor agent_schema_versions list --agent_id <agent-id>
# Retrieve and inspect a specific version
prefactor agent_schema_versions retrieve <schema-version-id>
```
Confirm the response includes data_risk on each span type schema entry.
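If you save the retrieved schema version as JSON, the confirmation can be automated. The sketch below rests on an assumption that is not confirmed by this document: that the retrieved object exposes a `span_type_schemas` array shaped like the entries you submitted. `spanTypesMissingDataRisk` is a hypothetical helper, so adjust the field access to match the actual response.

```typescript
// Assumed response shape; verify against the real CLI output before relying on it.
interface RetrievedSpanTypeSchema {
  name: string;
  data_risk?: unknown;
}

interface RetrievedSchemaVersion {
  span_type_schemas: RetrievedSpanTypeSchema[];
}

// Hypothetical helper: lists span types whose entry has no data_risk.
function spanTypesMissingDataRisk(version: RetrievedSchemaVersion): string[] {
  return version.span_type_schemas
    .filter((schema) => schema.data_risk === undefined || schema.data_risk === null)
    .map((schema) => schema.name);
}

// Example with an in-memory object standing in for the parsed response.
const missing = spanTypesMissingDataRisk({
  span_type_schemas: [
    { name: 'myapp:ingest', data_risk: { action_profile: {} } },
    { name: 'myapp:tool:read' },
  ],
});
```

An empty list means every span type schema entry carries data_risk.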
See references/risk-data-checklist.md.
Common mistakes:
Setting every field to unknown without reading the code; this defeats the purpose of the exercise.
Setting financial_transactions: allowed on spans that only incidentally display financial data without transacting.
Setting classification: public on spans where user-controlled text can flow through.
Making result_data_categories identical to params_data_categories without checking whether outputs are actually narrower.
Omitting financial_transactions: disallowed from non-financial spans.