.agents/skills/java-code-reviewer/SKILL.md
Review Java OpenInference instrumentation code for correctness and completeness. Use this skill when reviewing a Java instrumentor package — whether it's a new instrumentor, a PR that modifies one, or when the user asks to audit/review/check an existing instrumentor's code quality. Trigger on phrases like "review the instrumentor", "check the Java code", "audit the package", "is this instrumentor correct", or any request to validate an OpenInference Java instrumentation package against project standards.
npx skillsauth add arize-ai/openinference java-code-reviewer
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
Review a Java OpenInference instrumentation package against the project's established patterns and conventions. Report findings with file paths and line numbers, organized by severity (Critical / High / Medium / Low).
Step 1: Identify the package to review
Locate the package under java/instrumentation/openinference-instrumentation-<name>/ and read its source files, build.gradle, and src/test/ directory.
Step 2: Use the instrumented library source as ground truth
Before flagging any finding, verify it against the actual library code. Do NOT assume how the instrumented library works — read it. Do NOT present findings without having read the library source first.
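- Find the library version in the java/build.gradle ext block.
- Check ~/.gradle/caches/modules-2/files-2.1/ for cached sources.
- If the sources are not cached, fetch them from Maven Central (repo1.maven.org).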
Some libraries are split across multiple artifacts: check the build.gradle dependency declarations and fetch all relevant ones.
Step 3: Run all review sections below
Step 4: Present findings in a severity table, list what's working well, then ask
the user: fix issues, run tests (./gradlew :instrumentation:...:test), or done.
Read the instrumentor's build.gradle and the root java/build.gradle.
- The instrumented library dependency must be compileOnly (not implementation). Severity: High.
- openinference-instrumentation must be an api dependency.
- Versions must come from the ext block, not hardcoded. Severity: Medium.
- The package must be registered in java/settings.gradle. Severity: Critical if missing.
- Formatting must pass: cd java && ./gradlew spotlessCheck (Palantir Java Format).
This is the most important testing pattern. Tests must verify ALL span attributes, not spot-check a few. The remove-and-verify pattern catches both unexpected additions and silent removals:
Map<String, Object> attributes = new HashMap<>();
span.getAttributes().forEach((key, value) -> attributes.put(key.getKey(), value));
assertThat(attributes.remove("openinference.span.kind")).isEqualTo("LLM");
assertThat(attributes.remove("llm.model_name")).isEqualTo("gpt-4");
// ... remove and assert all remaining attributes ...
assertThat(attributes).isEmpty(); // Nothing unexpected left
Missing emptiness check — High.
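The remove-and-verify idea can also be sketched without any test framework. The following is a minimal, stand-alone illustration that uses plain maps instead of OpenTelemetry span data and AssertJ; the class and helper names (`RemoveAndVerifySketch`, `removeAndVerify`, `verifyNoneLeft`) are hypothetical and not part of the OpenInference codebase.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Stand-alone sketch of remove-and-verify (hypothetical helper, not project code).
final class RemoveAndVerifySketch {

    // Removes `key` from `actual` and fails if the value differs from `expected`.
    // A silently removed attribute yields null here and therefore fails too.
    static void removeAndVerify(Map<String, Object> actual, String key, Object expected) {
        Object value = actual.remove(key);
        if (!Objects.equals(value, expected)) {
            throw new AssertionError(
                    "attribute " + key + ": expected " + expected + " but was " + value);
        }
    }

    // Fails if any attribute was never asserted on (an unexpected addition).
    static void verifyNoneLeft(Map<String, Object> actual) {
        if (!actual.isEmpty()) {
            throw new AssertionError("unverified attributes: " + actual.keySet());
        }
    }

    public static void main(String[] args) {
        // Illustrative attributes, as an instrumentor might emit them.
        Map<String, Object> attributes = new HashMap<>();
        attributes.put("openinference.span.kind", "LLM");
        attributes.put("llm.model_name", "gpt-4");
        attributes.put("llm.token_count.total", 42L); // the test below forgets this one

        removeAndVerify(attributes, "openinference.span.kind", "LLM");
        removeAndVerify(attributes, "llm.model_name", "gpt-4");
        try {
            verifyNoneLeft(attributes);
        } catch (AssertionError e) {
            System.out.println(e.getMessage()); // unverified attributes: [llm.token_count.total]
        }
    }
}
```

Without the final emptiness check, the forgotten attribute would pass unnoticed, which is why its absence is rated High.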
- Error-path tests must assert StatusCode.ERROR + recorded exception. Severity: High.
- TraceConfig tests must confirm that hideInputMessages etc. actually suppress attributes.
Read SemanticConventions.java for the full attribute catalog:
java/openinference-semantic-conventions/src/main/java/com/arize/semconv/trace/SemanticConventions.java
Also read the spec files under spec/ (semantic_conventions.md, traces.md,
llm_spans.md, embedding_spans.md, tool_calling.md) for expected behavior.
For the library type being reviewed, verify the instrumentor sets all applicable attributes. Key checks:
- OPENINFERENCE_SPAN_KIND, INPUT_VALUE + INPUT_MIME_TYPE, OUTPUT_VALUE + OUTPUT_MIME_TYPE. Missing MIME type when value is set. Severity: High.
- Attribute keys must come from SemanticConventions, not hardcoded strings. Severity: Medium.
- Read TraceConfig.java for the full list of hide flags; verify each is respected where applicable. Severity: Medium if missing.
- span.end() must ALWAYS be called. Severity: Critical if missing.
- Scope from context.makeCurrent() must be closed (try-with-resources). Severity: High.
- [...] traceId
- Use OITracer (not raw Tracer) to get TraceConfig support.
Organize findings into a table:
| Severity | Section | Finding | Location |
|----------|---------|---------|----------|
| Critical | 4 | span.end() not called in error path | SomeListener.java:142 |
| High | 2 | Tests don't verify all span attributes | SomeTest.java:85 |
| ... | ... | ... | ... |
Then list what's working well — positive findings help the user understand what doesn't need to change.
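For reference when judging the span lifecycle findings (span.end() always called, Scope closed via try-with-resources, error status plus a recorded exception on failure), the expected call order can be sketched as below. This is only an illustration under stated assumptions: `FakeSpan` and `FakeScope` are hypothetical stand-ins that record calls, not the OpenTelemetry `Span` and `Scope` types, and `setErrorStatus()` stands in for setting StatusCode.ERROR.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of the span lifecycle shape a reviewer looks for (stand-in types only).
final class SpanLifecycleSketch {

    // Hypothetical stand-in for a scope: close() throws no checked exception.
    interface FakeScope extends AutoCloseable {
        @Override
        void close();
    }

    // Hypothetical stand-in for a span; it only records the order of calls.
    static final class FakeSpan {
        final List<String> events = new ArrayList<>();

        FakeScope makeCurrent() {
            events.add("scope.open");
            return () -> events.add("scope.close");
        }

        void setErrorStatus() { events.add("status=ERROR"); }

        void recordException(Throwable t) { events.add("recordException"); }

        void end() { events.add("end"); }
    }

    // Runs `work` inside the span: the scope is closed by try-with-resources,
    // failures are recorded and rethrown, and end() runs on every path.
    static void traced(FakeSpan span, Runnable work) {
        try (FakeScope scope = span.makeCurrent()) {
            work.run();
        } catch (RuntimeException e) {
            span.setErrorStatus();
            span.recordException(e);
            throw e;
        } finally {
            span.end();
        }
    }

    public static void main(String[] args) {
        FakeSpan span = new FakeSpan();
        try {
            traced(span, () -> { throw new IllegalStateException("boom"); });
        } catch (IllegalStateException expected) {
            // rethrown by traced(); the span has still been ended
        }
        // prints [scope.open, scope.close, status=ERROR, recordException, end]
        System.out.println(span.events);
    }
}
```

Note that try-with-resources closes the scope before the catch block runs, so the error is recorded after the scope is closed and before the span ends. An error path that skips the finally-style end() is the kind of finding rated Critical above.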