.cognition/skills/debug-exiftool/SKILL.md
Debug and fix Image::ExifTool test failures in PerlOnJava
## ⚠️⚠️⚠️ CRITICAL: NEVER USE `git stash` ⚠️⚠️⚠️

**DANGER: Changes are SILENTLY LOST when using git stash/stash pop!**

- NEVER use `git stash` to temporarily revert changes
- INSTEAD: Commit to a WIP branch or use `git diff > backup.patch`
- This warning exists because completed work was lost during debugging

You are debugging failures in the Image::ExifTool test suite running under PerlOnJava (a Perl-to-JVM compiler/interpreter). Failures typically stem from missing Perl features or subtle behavior differences in PerlOnJava, not bugs in ExifTool itself.
IMPORTANT: Never push directly to master. Always use feature branches and PRs.
IMPORTANT: Always commit or stash changes BEFORE switching branches. If git stash pop has conflicts, uncommitted changes may be lost.
git checkout -b fix/exiftool-issue-name
# ... make changes ...
git push origin fix/exiftool-issue-name
gh pr create --title "Fix: description" --body "Details"
- `src/main/java/org/perlonjava/` (compiler, bytecode interpreter, runtime)
- `Image-ExifTool-13.44/` (unmodified upstream)
- `Image-ExifTool-13.44/t/*.t`
- `Image-ExifTool-13.44/t/TestLib.pm` (exports `check`, `writeCheck`, `writeInfo`, `testCompare`, `binaryCompare`, `testVerbose`, `notOK`, `done`)
- `Image-ExifTool-13.44/t/images/` (reference images)
- `Image-ExifTool-13.44/t/<TestName>_N.out` (expected tag output per sub-test)
- `src/test/resources/unit/*.t` (`make` suite, 154 tests)
- `perl5_t/t/` (Perl 5 compatibility suite, run via `make test-gradle`)
- `target/perlonjava-3.0.0.jar`
- `./jperl` (resolves JAR path, sets `$^X`)

ALWAYS use `make` commands. NEVER use raw mvn/gradlew commands.
| Command | What it does |
|---------|--------------|
| make | Build + run all unit tests (use before committing) |
| make dev | Build only, skip tests (for quick iteration during debugging) |
make # Standard build - compiles and runs tests
make dev # Quick build - compiles only, NO tests
cd Image-ExifTool-13.44
java -jar ../target/perlonjava-3.0.0.jar -Ilib t/Writer.t
# Or using the launcher:
cd Image-ExifTool-13.44
../jperl -Ilib t/Writer.t
cd Image-ExifTool-13.44
timeout 120 java -jar ../target/perlonjava-3.0.0.jar -Ilib t/XMP.t
cd Image-ExifTool-13.44
mkdir -p /tmp/exiftool_results
for t in t/*.t; do
  name=$(basename "$t" .t)
  (
    output=$(timeout 120 java -jar ../target/perlonjava-3.0.0.jar -Ilib "$t" 2>&1)
    ec=$?
    if [ $ec -eq 124 ]; then
      echo "$name TIMEOUT"
    else
      pass=$(echo "$output" | grep -cE '^ok ')
      fail=$(echo "$output" | grep -cE '^not ok ')
      plan=$(echo "$output" | grep -oE '^1\.\.[0-9]+' | head -1)
      planned=${plan#1..}
      echo "$name pass=$pass fail=$fail planned=${planned:-?} exit=$ec"
    fi
  ) > "/tmp/exiftool_results/$name.txt" &
done
wait
echo "=== RESULTS ==="
cat /tmp/exiftool_results/*.txt | sort
echo "=== TOTALS ==="
cat /tmp/exiftool_results/*.txt | awk '{
  for (i = 1; i <= NF; i++) {
    if ($i ~ /^pass=/) p += substr($i, 6)
    if ($i ~ /^fail=/) f += substr($i, 6)
    if ($i ~ /^planned=/) { v = substr($i, 9); if (v != "?") pl += v }
  }
} END { printf "PASS=%d FAIL=%d PLANNED=%d RATE=%d%%\n", p, f, pl, (pl>0?p*100/pl:0) }'
cd perl5_t/t
../../jperl op/lexsub.t
Tests using run_multiple_progs() or fresh_perl_is() spawn jperl as a subprocess. This requires jperl to be in PATH:
# Using the test runner (handles PATH automatically):
perl dev/tools/perl_test_runner.pl perl5_t/t/op/eval.t
# Manual running (must set PATH):
# (the PATH assignment must prefix the jperl command, not cd, or it only applies to cd)
cd perl5_t/t && PATH="/Users/fglock/projects/PerlOnJava2:$PATH" ../../jperl op/eval.t
When debugging, compare PerlOnJava output with native Perl to isolate the difference:
# Run with system Perl
cd Image-ExifTool-13.44
perl -Ilib t/Writer.t 2>&1 | grep -E '^(not )?ok ' > /tmp/perl_results.txt
# Run with PerlOnJava
java -jar ../target/perlonjava-3.0.0.jar -Ilib t/Writer.t 2>&1 | grep -E '^(not )?ok ' > /tmp/jperl_results.txt
# Diff
diff /tmp/perl_results.txt /tmp/jperl_results.txt
For individual Perl constructs:
# System Perl
perl -e 'my @a = (1,2,3); $_ *= 2 foreach @a; print "@a\n"'
# PerlOnJava
java -jar target/perlonjava-3.0.0.jar -e 'my @a = (1,2,3); $_ *= 2 foreach @a; print "@a\n"'
For comparing .failed output files against .out reference files:
cd Image-ExifTool-13.44
diff t/Writer_11.out t/Writer_11.failed
| Variable | Effect |
|----------|--------|
| JPERL_DISABLE_INTERPRETER_FALLBACK=1 | Disable bytecode interpreter fallback for large subs (force JVM compilation only) |
| JPERL_SHOW_FALLBACK=1 | Print a message when a sub falls back to the bytecode interpreter |
| JPERL_EVAL_NO_INTERPRETER=1 | Disable interpreter for eval STRING (force JVM compilation) |
| JPERL_SPILL_SLOTS=N | Set number of JVM spill slots (default 16) |
| Variable | Effect |
|----------|--------|
| JPERL_ASM_DEBUG=1 | Print JVM bytecode disassembly when ASM frame computation crashes |
| JPERL_ASM_DEBUG_CLASS=<name> | Filter ASM debug output to a specific generated class name |
| JPERL_BYTECODE_SIZE_DEBUG=1 | Print bytecode size for each generated method |
| JPERL_EVAL_VERBOSE=1 | Verbose error reporting for eval STRING compilation issues |
| JPERL_EVAL_TRACE=1 | Trace eval STRING execution path (compile, interpret, fallback) |
| JPERL_IO_DEBUG=1 | Trace file handle open/dup/write operations |
| JPERL_STDIO_DEBUG=1 | Trace STDOUT/STDERR flush sequencing |
| JPERL_REQUIRE_DEBUG=1 | Trace require/use module loading |
| JPERL_TRACE_CONTROLFLOW=1 | Trace control flow detection (goto, return, last/next/redo safety) |
| JPERL_DISASSEMBLE=1 | Disassemble generated bytecode (also --disassemble CLI flag) |
| Variable | Effect |
|----------|--------|
| JPERL_UNIMPLEMENTED=warn | Downgrade unimplemented regex features from fatal to warning |
# Pass JVM options via JPERL_OPTS
JPERL_OPTS="-Xmx512m" ./jperl script.pl
# Combine env vars
JPERL_SHOW_FALLBACK=1 JPERL_EVAL_TRACE=1 java -jar target/perlonjava-3.0.0.jar -Ilib t/Writer.t 2>&1
ExifTool .t files follow a common pattern:
BEGIN { $| = 1; print "1..N\n"; require './t/TestLib.pm'; t::TestLib->import(); }
END { print "not ok 1\n" unless $loaded; }
use Image::ExifTool;
$loaded = 1;
# Read test: extract tags and compare against t/<TestName>_N.out
my $exifTool = Image::ExifTool->new;
my $info = $exifTool->ImageInfo('t/images/SomeFile.ext', @tags);
print 'not ' unless check($exifTool, $info, $testname, $testnum);
print "ok $testnum\n";
# Write test: modify tags and verify output
writeInfo($exifTool, 'src.jpg', 'tmp/out.jpg', \@setNewValue_args);
# Binary compare test: verify exact byte-for-byte match
binaryCompare('output.jpg', 't/images/original.jpg');
The check() function compares extracted tags against reference files t/<TestName>_N.out. Failed tests leave t/<TestName>_N.failed files for comparison. The writeInfo() function calls SetNewValue + WriteInfo.
Run the failing test and capture full output (stdout + stderr). Look for:
- `not ok N` lines (which specific sub-tests fail)
- `Can't locate ...` (missing module)
- `Undefined subroutine` / `Can't call method` errors

Identify the failing sub-test number and find it in the .t file. Map it to the ExifTool operation (read vs write, which image format, which tags).
Check the .out vs .failed files to understand the difference:
diff t/Writer_11.out t/Writer_11.failed
Compare with system Perl to confirm it's a PerlOnJava issue, not a test environment issue.
Isolate the Perl construct causing the failure. Write a minimal reproducer:
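# pos() needs a scalar variable, so match against $s rather than a string literal
java -jar target/perlonjava-3.0.0.jar -e 'my $s = "abc"; $s =~ /b/g; print pos($s), "\n"'
perl -e 'my $s = "abc"; $s =~ /b/g; print pos($s), "\n"'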
Trace into PerlOnJava source to find the bug. Use JPERL_SHOW_FALLBACK=1 to check if large subs are hitting the interpreter path.
Fix in PerlOnJava, rebuild (make dev), re-run the ExifTool test.
Verify no regressions: Run make (154 unit tests) and check perl5_t/t/op/lexsub.t (sensitive to block/sub emission changes).
PerlOnJava has two compilation backends:
- The JVM compiler.
- The bytecode interpreter, a fallback for large subs that is also used for `eval STRING` by default.

Key files for the interpreter:

- `BytecodeCompiler.java`: compiles AST to interpreter bytecode
- `BytecodeInterpreter.java`: executes interpreter bytecode
- `CompileAssignment.java`: assignment compilation for interpreter
- `Opcodes.java`: opcode definitions
- `InterpretedCode.java`: runtime representation of interpreter-compiled code

Closure variables are the main challenge for the interpreter fallback path. There are two distinct mechanisms:
Inner named subs within the large sub: These are compiled by SubroutineParser using the JVM compiler (via compilerSupplier). They get full closure support through RETRIEVE_BEGIN_* opcodes and VariableCollectorVisitor.java.
The large sub itself accessing outer-scope my variables: This is handled by detectClosureVariables() in BytecodeCompiler.java. It must:
- Use `getAllVisibleVariables()` (TreeMap, sorted by register index) with the exact same filtering as SubroutineParser (skip `@_`, empty decl, fields, `&` refs) to ensure the capturedVars ordering matches `withCapturedVars()`.
- Register each captured variable with `addVariableWithIndex()` so that ALL variable resolution paths find them, not just `visit(IdentifierNode)`. This is critical because `handleHashElementAccess`, `handleArrayElementAccess`, hash slices, array slices, and assignment targets all have their own variable lookup logic that checks the symbol table.
- Reserve the captured registers (advance `nextRegister`) so local `my` declarations don't collide with captured variable registers.
- Populate `capturedVarIndices` for register recycling protection (prevents `getHighestVariableRegister()` from being too low).

The runtime flow for captured variables in the interpreter path:

1. `compileToInterpreter()` creates BytecodeCompiler and calls `compiler.compile(ast, ctx)`, which runs `detectClosureVariables()`. This sets up `capturedVarIndices` (name→register mapping) used during bytecode generation.
2. `compileToInterpreter()` creates placeholder capturedVars (all RuntimeScalar).
3. `SubroutineParser.withCapturedVars()` replaces the placeholder with actual values from paramList (built from `getAllVisibleVariables()` with the same filtering).
4. `BytecodeInterpreter.execute()` copies `capturedVars[i]` to `registers[3+i]` via `System.arraycopy`.

Key invariant: The ordering of variables in `detectClosureVariables()` MUST match SubroutineParser's paramList ordering, because `capturedVars[i]` is copied to register `3+i` and the bytecode was compiled expecting specific variables at specific registers.
- `return` inside a block refactored by LargeBlockRefactorer into `sub { ... }->(@_)`. The `return` exits the anonymous sub instead of the enclosing function.
- Use `timeout 120` to prevent hangs; `JPERL_SHOW_FALLBACK=1` to see if interpreter fallback is involved.

- `WriteExif.pl` using the `%mandatory` hash.
- Check whether `%mandatory` is accessible (closure variable issue in interpreter fallback).

- `my %hash` / `my @array` not accessible inside large subs compiled by interpreter.
- The bytecode compiler resolves variables through several separate paths (`visit(IdentifierNode)`, `handleHashElementAccess`, `handleArrayElementAccess`, hash/array slices, assignment LHS). If captured variables are only in `capturedVarIndices` but NOT in the compiler's symbol table, most access paths won't find them and fall through to global variable load (which returns an empty hash/array).
- `detectClosureVariables()` must call `symbolTable.addVariableWithIndex()` for each captured variable so all resolution paths find them.
- Add `System.err.println` in `BytecodeInterpreter.execute()` after the `System.arraycopy` for capturedVars to verify the correct values are being passed at runtime. Also check the `handleHashElementAccess` code path to see if it reaches `LOAD_GLOBAL_HASH` (bad) vs `getVariableRegister` (good).

- Non-default language entries (`en`, `de`, `fr`) fail to be created in lang-alt lists.
- `WriteXMP.pl` path tracking using `pos()` after `m//g` regex.
- `pos()` returning a wrong value after a global regex match can cause index tracking bugs in ExifTool's write logic.

- Postfix foreach (`EXPR foreach @list`) must alias `$_` to actual array elements for modification.
- `StatementParser.java` vs `StatementResolver.java`.

- `binmode`, `sysread`, `syswrite`, `pack`, `unpack`, `Encode::decode/encode`.

- `$_` aliased to a constant.

| Test | Pass/Planned | Status |
|------|-------------|--------|
| ExifTool.t | 35/35 | PASS |
| Writer.t | 59/61 | 2 fail (test 10: Pentax date fmt, test 46: XMP Audio data) |
| XMP.t | 44/54 | 10 fail |
| Geotag.t | 3/12 | 9 fail |
| PDF.t | 18/26 | 8 fail |
| QuickTime.t | 17/22 | 5 fail |
| CanonVRD.t | 19/24 | 5 fail |
| Nikon.t | 6/9 | 3 fail |
| CanonRaw.t | 5/9 | 3 fail + crash |
| Pentax.t | 1/4 | 3 fail |
| Panasonic.t | 2/5 | 3 fail |
| (72 other tests) | all pass | PASS |
- Writer test 10 (Pentax date format): date `2008:03:02` becomes `2008:0:0`, time `12:01:23` becomes `12:0:0`. Binary date decoding issue, likely `pack`/`unpack` or BCD decode in `Pentax.pm`. Also has a float rounding diff (13.2 vs 13.3).
- Writer test 46 (XMP Audio data): involves the line `[XMP, XMP-GAudio, Audio] Data - Audio Data: (Binary data 1 bytes)` in output. An XMP Audio binary data tag is not being written/preserved.

The `%mandatory` and `%crossDelete` hashes in `WriteExif.pl` are file-scope `my` variables accessed inside the large WriteExif sub (compiled by interpreter fallback). Fixed by registering captured variables in the compiler's symbol table via `addVariableWithIndex()` in `detectClosureVariables()`. This fixed Writer tests 6,7,11,13,19,25-28,35,38,42,48,53,55.
All geotag tests except module loading and 2 others fail. All use Time::Local for date arithmetic and GPS coordinate interpolation. Likely one root cause in date string parsing or timezone offset calculation. Compare Geotag_2.out vs Geotag_2.failed to see if GPS coordinates are wrong or dates are wrong.
Writing non-default language entries to XMP lang-alt lists fails silently. Only x-default works. The write path in WriteXMP.pl uses pos() after m//g for path tracking. Test with:
# pos() with no argument reads $_, so the match must run against $_
perl -e '$_ = "a/b/c"; m|/|g; print pos(), "\n"' # should print 2
java -jar target/perlonjava-3.0.0.jar -e '$_ = "a/b/c"; m|/|g; print pos(), "\n"'
Values assigned to wrong bag items; empty strings dropped from lists. Also likely pos() related. Test 36 specifically loses an empty string as first list element.
Tests 7-12 are sequential edit/revert operations on a PDF — one failure cascades. Tests 25-26 are AES encryption (require Digest::SHA). Investigate test 7 first as it's the cascade root.
HEIC write failures and VideoKeys/AudioKeys extraction. Lower priority — likely format-specific issues.
Various format-specific write issues. Many may share root causes with P1 (mandatory EXIF tags).
| Area | File |
|------|------|
| Bytecode compiler | backend/bytecode/BytecodeCompiler.java |
| Bytecode interpreter | backend/bytecode/BytecodeInterpreter.java |
| Assignment compilation (interp) | backend/bytecode/CompileAssignment.java |
| Variable collector (closures) | backend/bytecode/VariableCollectorVisitor.java |
| Opcodes | backend/bytecode/Opcodes.java |
| Block emission (JVM) | backend/jvm/EmitBlock.java |
| Subroutine emission (JVM) | backend/jvm/EmitSubroutine.java |
| Foreach emission (JVM) | backend/jvm/EmitForeach.java |
| Eval handling (JVM) | backend/jvm/EmitEval.java |
| Method creator / fallback | backend/jvm/EmitterMethodCreator.java |
| Large block refactoring | backend/jvm/LargeBlockRefactorer.java |
| Control flow safety | frontend/analysis/ControlFlowDetectorVisitor.java |
| Statement parser (block foreach) | frontend/parser/StatementParser.java |
| Statement resolver (postfix foreach) | frontend/parser/StatementResolver.java |
| Subroutine parser | frontend/parser/SubroutineParser.java |
| Runtime scalar | runtime/runtimetypes/RuntimeScalar.java |
| Runtime array | runtime/runtimetypes/RuntimeArray.java |
| Runtime hash | runtime/runtimetypes/RuntimeHash.java |
| Dynamic variables | runtime/runtimetypes/DynamicVariableManager.java |
| IO operations | runtime/runtimetypes/RuntimeIO.java |
| IO operator (open/dup) | runtime/operators/IOOperator.java |
| Control flow (goto/labels) | backend/jvm/EmitControlFlow.java |
| Dereference / slicing | backend/jvm/Dereference.java |
| Variable emission (refs) | backend/jvm/EmitVariable.java |
| String parser (qw, heredoc) | frontend/parser/StringParser.java |
| String operators | runtime/operators/StringOperators.java |
| Pack/Unpack | runtime/operators/PackOperator.java |
| Regex preprocessor | runtime/regex/RegexPreprocessor.java |
| Regex runtime | runtime/regex/RuntimeRegex.java |
| Module loading | runtime/operators/ModuleOperators.java |
All paths relative to src/main/java/org/perlonjava/.
The HEAD code's AST-based detectClosureVariables populated capturedVarIndices with ~321 entries, which inflated getHighestVariableRegister() and prevented aggressive register recycling. A no-op version (removing all capturedVarIndices) dropped Writer.t from 44/61 to 26/61 — not because of closure access, but because register recycling became too aggressive. When modifying detectClosureVariables, always ensure capturedVarIndices has enough entries to keep getHighestVariableRegister() high enough to prevent register corruption.
The bytecode compiler resolves variables in MANY separate code paths:
- `visit(IdentifierNode)`: checks `capturedVarIndices` then symbol table
- `handleHashElementAccess`: checks closure vars, symbol table, then global
- `handleArrayElementAccess`: same pattern
- `handleHashSlice`, `handleArraySlice`, `handleHashKeyValueSlice`: same
- `CompileAssignment.java`: same pattern
- `CompileOperator.java`

If a fix only patches ONE of these paths (e.g., the `capturedVarIndices` check in `visit(IdentifierNode)`), hash/array access will still fall through to globals. The correct fix is to register captured variables in the symbol table so ALL paths find them.
SubroutineParser builds paramList by iterating getAllVisibleVariables() (TreeMap sorted by register index) with specific filters. detectClosureVariables() must use the exact same iteration order and filters. Any mismatch causes captured variable values to be assigned to wrong registers at runtime.
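The ordering requirement can be illustrated with a small stand-alone sketch. It is not PerlOnJava code: the class name, the `orderedCaptures` and `loadRegisters` helpers, and the simplified filter (skip `@_`, empty names, `&` entries) are assumptions made for illustration. Only the TreeMap iteration order and the copy to `registers[3 + i]` come from the description above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

class CapturedOrderSketch {
    // From the text: capturedVars[i] is copied to registers[3 + i].
    static final int FIRST_CAPTURED_REGISTER = 3;

    // Hypothetical, simplified filter. Both the compile-time side and the
    // runtime side must walk the same map with the same filter.
    static List<String> orderedCaptures(TreeMap<Integer, String> visibleByIndex) {
        List<String> names = new ArrayList<>();
        for (String name : visibleByIndex.values()) { // ascending register index
            if (name.isEmpty() || name.equals("@_") || name.startsWith("&")) {
                continue;
            }
            names.add(name);
        }
        return names;
    }

    // Mirrors the runtime copy step: one System.arraycopy into the register file.
    static Object[] loadRegisters(Object[] capturedVars, int registerCount) {
        Object[] registers = new Object[registerCount];
        System.arraycopy(capturedVars, 0, registers, FIRST_CAPTURED_REGISTER, capturedVars.length);
        return registers;
    }

    public static void main(String[] args) {
        TreeMap<Integer, String> visible = new TreeMap<>();
        visible.put(7, "%crossDelete");
        visible.put(2, "@_");
        visible.put(5, "%mandatory");

        List<String> order = orderedCaptures(visible);
        System.out.println(order); // [%mandatory, %crossDelete]

        Object[] registers = loadRegisters(order.toArray(), 8);
        System.out.println(registers[3] + " " + registers[4]); // %mandatory %crossDelete
    }
}
```

If one side skipped `@_` and the other did not, every captured value after it would land one register off, which is the symptom described above.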
EmitControlFlow.handleGotoLabel() resolves labels at compile time within the current JVM scope. When the target label is outside the current scope (e.g., goto inside a map block to a label outside, or goto inside an eval block), the compile-time lookup fails. The fix is to emit a RuntimeControlFlowList marker with ControlFlowType.GOTO at runtime (the same mechanism used by dynamic goto EXPR), allowing the goto signal to propagate up the call stack. This was a blocker for both op/array.t and op/eval.t.
In Dereference.handleArrowArrayDeref(), the check for single-index vs slice path must account for range expressions (.. operator). A range like 0..5 is a single AST node but produces multiple indices. The correct condition is: use single-index path only if there's one element AND it's not a range. Otherwise, use the slice path. The old code had a complex isArrayLiteral check that was too restrictive.
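A minimal sketch of that condition, assuming a hypothetical `IndexExpr` stand-in for the AST node (the real node types and the body of `handleArrowArrayDeref()` are not shown here):

```java
import java.util.List;

class ArrowDerefPathSketch {
    // Hypothetical stand-in for an index expression; only records whether
    // the expression is a range (the `..` operator).
    static final class IndexExpr {
        final boolean isRange;

        IndexExpr(boolean isRange) {
            this.isRange = isRange;
        }
    }

    // Single-index path only for exactly one element that is not a range;
    // everything else takes the slice path.
    static boolean useSingleIndexPath(List<IndexExpr> elements) {
        return elements.size() == 1 && !elements.get(0).isRange;
    }

    public static void main(String[] args) {
        System.out.println(useSingleIndexPath(List.of(new IndexExpr(false)))); // true
        System.out.println(useSingleIndexPath(List.of(new IndexExpr(true))));  // false
        System.out.println(useSingleIndexPath(
                List.of(new IndexExpr(false), new IndexExpr(false))));         // false
    }
}
```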
StringParser.parseWordsString() must apply single-quote backslash rules to each word: \\ → \ and \delimiter → delimiter. Without this, backslashes are doubled in the output. The processing uses the closing delimiter from the qw construct.
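The two rules can be sketched as a per-word helper. The name `unescapeQwWord` and its signature are hypothetical, not the actual `StringParser` API; the sketch only handles the closing delimiter, as described above.

```java
class QwUnescapeSketch {
    // Applies single-quote backslash rules to one qw word:
    //   \\ becomes \, and \<closing delimiter> becomes the delimiter.
    // Any other backslash sequence is left untouched.
    static String unescapeQwWord(String word, char closingDelimiter) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < word.length(); i++) {
            char c = word.charAt(i);
            if (c == '\\' && i + 1 < word.length()) {
                char next = word.charAt(i + 1);
                if (next == '\\' || next == closingDelimiter) {
                    out.append(next);
                    i++; // consume the escaped character
                    continue;
                }
            }
            out.append(c);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Java string literals double each backslash; the comments show the raw text.
        System.out.println(unescapeQwWord("a\\\\b", ')')); // a\\b -> a\b
        System.out.println(unescapeQwWord("a\\)b", ')'));  // a\)b -> a)b
        System.out.println(unescapeQwWord("a\\nb", ')'));  // a\nb stays a\nb
    }
}
```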
### `\(LIST)` must flatten arrays before creating refs

`\(@array)` should create individual scalar refs to each array element (like `map { \$_ } @array`), not a single ref to the array. EmitVariable needs a `flattenElements()` method that detects `@` sigil nodes in the list and flattens them before creating element references.
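As a rough model of the flattening step, the sketch below treats each operand as either a scalar name or an array (a `List` of element names). This representation and the `makeRefs` helper are assumptions for illustration; the real `flattenElements()` would work on AST nodes.

```java
import java.util.ArrayList;
import java.util.List;

class RefListFlattenSketch {
    // Hypothetical model: an operand is a String (scalar) or a List (array elements).
    static List<String> flattenElements(List<Object> operands) {
        List<String> flat = new ArrayList<>();
        for (Object operand : operands) {
            if (operand instanceof List<?>) {
                for (Object element : (List<?>) operand) {
                    flat.add((String) element); // one entry per array element
                }
            } else {
                flat.add((String) operand);
            }
        }
        return flat;
    }

    // One reference per flattened element, written here as a backslash prefix.
    static List<String> makeRefs(List<Object> operands) {
        List<String> refs = new ArrayList<>();
        for (String target : flattenElements(operands)) {
            refs.add("\\" + target);
        }
        return refs;
    }

    public static void main(String[] args) {
        List<Object> operands = List.of("$x", List.of("$a[0]", "$a[1]"));
        System.out.println(makeRefs(operands)); // [\$x, \$a[0], \$a[1]]
    }
}
```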
### `git diff` + `git apply`

When a feature branch has diverged far from master (thousands of commits in common history), both `git rebase` and `git merge --squash` can produce massive conflicts across dozens of files. The clean workaround:
# 1. Generate a patch of ONLY the branch's changes vs master
git diff master..feature-branch > /tmp/branch-diff.patch
# 2. Create a fresh branch from current master
git checkout master && git checkout -b feature-branch-clean
# 3. Apply the patch (no merge history = no conflicts)
git apply /tmp/branch-diff.patch
# 4. Commit as a single squashed commit
git add -A && git commit -m "Squashed: ..."
# 5. Force push to update the PR
git push --force origin feature-branch-clean
This works because git diff master..branch produces the exact file-level delta, bypassing all the intermediate merge history that causes conflicts.
Uncommitted working tree changes are lost when git rebase --abort is run. If you have a fix in progress (e.g., a BitwiseOperators change), commit it first — even as a WIP commit — before attempting any rebase. The rebase abort restores the branch to its pre-rebase state, which does NOT include uncommitted changes.
### `getInt()` vs `(int) getLong()` for 32-bit integer wrapping

`RuntimeScalar.getInt()` clamps DOUBLE values to `Integer.MAX_VALUE` (e.g., `(int) 2147483648.0 == 2147483647`). But `(int) getLong()` wraps correctly via long→int truncation (e.g., `(int) 2147483648L == -2147483648`). For `use integer` operations where Config.pm reports `ivsize=4`, always use `(int) getLong()` to get proper 32-bit wrapping behavior matching Perl's semantics.
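The difference comes from Java's narrowing rules, not from anything specific to PerlOnJava. The sketch below uses plain casts instead of the `RuntimeScalar` methods, and the helper names `viaDouble` and `viaLong` are made up for illustration:

```java
class IntWrapDemo {
    // Narrowing a double to int saturates at the int range.
    static int viaDouble(double value) {
        return (int) value;
    }

    // Narrowing a long to int keeps the low 32 bits, i.e. wraps modulo 2^32.
    static int viaLong(long value) {
        return (int) value;
    }

    public static void main(String[] args) {
        System.out.println(viaDouble(2147483648.0)); // 2147483647
        System.out.println(viaLong(2147483648L));    // -2147483648
    }
}
```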
Perl's scalar gmtime/localtime returns ctime(3) format: "Fri Mar 7 20:13:52 881" — NOT RFC 1123 ("Fri, 7 Mar 0881 20:13:52 GMT"). Use String.format() with explicit field widths, not DateTimeFormatter. Also: wday must use getValue() % 7 (Perl: 0=Sun..6=Sat) not getValue() (Java: 1=Mon..7=Sun). Large years (>9999) must not crash the formatter.
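A minimal sketch of that approach, assuming hypothetical helper names `perlWday` and `ctime` (the actual PerlOnJava implementation may be organized differently). As in ctime(3), the day of month is space-padded to width 2:

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.ZonedDateTime;
import java.util.Locale;

class CtimeFormatSketch {
    private static final String[] DAYS = {"Sun", "Mon", "Tue", "Wed", "Thu", "Fri", "Sat"};
    private static final String[] MONTHS = {
        "Jan", "Feb", "Mar", "Apr", "May", "Jun", "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"
    };

    // Java: 1=Mon..7=Sun. Perl: 0=Sun..6=Sat. The modulo maps 7 (Sun) to 0.
    static int perlWday(ZonedDateTime t) {
        return t.getDayOfWeek().getValue() % 7;
    }

    // Explicit field widths; %d for the year does not limit the digit count,
    // so years above 9999 format without error.
    static String ctime(ZonedDateTime t) {
        return String.format(Locale.ROOT, "%s %s %2d %02d:%02d:%02d %d",
                DAYS[perlWday(t)], MONTHS[t.getMonthValue() - 1], t.getDayOfMonth(),
                t.getHour(), t.getMinute(), t.getSecond(), t.getYear());
    }

    public static void main(String[] args) {
        ZonedDateTime epoch = ZonedDateTime.ofInstant(Instant.EPOCH, ZoneOffset.UTC);
        System.out.println(ctime(epoch)); // Thu Jan  1 00:00:00 1970
    }
}
```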
Before declaring a fix complete, run the same test on both master and the branch to distinguish real regressions from pre-existing failures. Use perl5_t/t/ (not perl5/t/) for running Perl5 core tests — the perl5_t copy has test harness files (test.pl, charset_tools.pl) that PerlOnJava can load.
In ExifTool Perl code (temporary, never commit):
print STDERR "DEBUG: variable=$variable\n";
In PerlOnJava Java code (temporary, never commit):
System.err.println("DEBUG: value=" + value);
To trace which subs hit interpreter fallback:
JPERL_SHOW_FALLBACK=1 java -jar target/perlonjava-3.0.0.jar -Ilib t/Writer.t 2>&1 | grep FALLBACK