dev/prompts/SKILL.md
PerlOnJava Debugging Skills and Architecture Knowledge
This document captures key knowledge about PerlOnJava internals learned during debugging sessions.
1. my variables - Lexical scope
   - Stored in JVM local variable slots during normal execution
   - When captured by closures: stored as closure fields or in GlobalVariable with IDs
   - Symbol table entry: decl = "my", has index (JVM slot number)
2. our variables - Package scope with lexical declaration
   - decl = "our", has index but uses package name
   - GlobalVariable.getGlobalVariable("Package::varname")
3. use vars variables - Package scope without lexical declaration
   - Vars.importVars()
   - GlobalVariable.getGlobalVariable("Package::varname")

When large code blocks are refactored, file-scope my variables can get IDs assigned:
my %allGroups = (); # File-scope lexical
# Large block triggers refactoring
for (1..10000) { ... } # This creates closures
# Now %allGroups has an ID (e.g., 75)
What happens:
1. SubroutineParser.handleNamedSubWithFilter() line 700-709 assigns IDs to variables that need to be captured
2. PerlOnJava::_BEGIN_75::allGroups
3. PersistentVariable.retrieveBeginHash("allGroups", 75)

Key insight: Variables with IDs behave like BEGIN variables even if not in BEGIN blocks.
The GlobalVariable registry is a runtime storage system:
// Storage maps
private static final ConcurrentHashMap<String, RuntimeScalar> globalVariables = new ConcurrentHashMap<>();
private static final ConcurrentHashMap<String, RuntimeArray> globalArrays = new ConcurrentHashMap<>();
private static final ConcurrentHashMap<String, RuntimeHash> globalHashes = new ConcurrentHashMap<>();
// Access
RuntimeHash hash = GlobalVariable.getGlobalHash("Package::varname");
// Creates a new empty hash if it doesn't exist
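The get-or-create behaviour described above can be sketched in isolation. This is a minimal model, not the real GlobalVariable class: HashRegistrySketch and the use of a plain Map in place of RuntimeHash are assumptions made for illustration, and the real methods may differ in signature.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for the registry; NOT the real GlobalVariable class.
final class HashRegistrySketch {
    // A plain Map models RuntimeHash here (assumption for illustration).
    private static final ConcurrentHashMap<String, Map<String, String>> globalHashes =
            new ConcurrentHashMap<>();

    // Returns the existing hash, or atomically creates and stores an empty one.
    static Map<String, String> getGlobalHash(String key) {
        return globalHashes.computeIfAbsent(key, k -> new HashMap<>());
    }

    // Checks existence without creating an entry as a side effect.
    static boolean existsGlobalHash(String key) {
        return globalHashes.containsKey(key);
    }

    // Removes the entry and hands it to the caller; returns null if absent.
    static Map<String, String> removeGlobalHash(String key) {
        return globalHashes.remove(key);
    }

    public static void main(String[] args) {
        String key = "main::config";
        System.out.println(existsGlobalHash(key));       // false
        Map<String, String> h = getGlobalHash(key);      // created on first access
        System.out.println(getGlobalHash(key) == h);     // true
        Map<String, String> owned = removeGlobalHash(key);
        System.out.println(owned == h);                  // true
        System.out.println(existsGlobalHash(key));       // false
    }
}
```

The practical consequence for debugging: a lookup with a misspelled key silently creates a fresh empty hash, so only an exists-style check can tell "empty" apart from "never declared".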
Important methods:
- getGlobalHash() - Returns existing or creates new
- removeGlobalHash() - Removes from map and returns (transfer ownership)
- existsGlobalHash() - Check if exists (for strict vars)

BEGIN blocks execute during parsing, not at runtime:
my $x;
BEGIN { $x = 42 } # Executes immediately when parsed
print $x; # Uses the value set at compile-time
Implementation:
- PersistentVariable.retrieveBegin*()

Storage pattern:
Variable: $x in BEGIN block with ID 42
Storage key: PerlOnJava::_BEGIN_42::x
Retrieved via: PersistentVariable.retrieveBeginScalar("$x", 42)
BEGIN variables use name-based lookup, not JVM slot numbers:
- my variables: ALOAD <slot> (breaks if slot reallocated)
- BEGIN variables: PersistentVariable.retrieveBeginScalar(name, id) (always works)

When a subroutine/closure is created, it captures variables from outer scopes:
my $outer = 1;
my $closure = sub { $outer + 1 }; # Captures $outer
Implementation (SubroutineParser.java lines 680-730):
- decl == "our": Use package name, store as GlobalVariable reference
- decl == "my" or "state":
  - PersistentVariable.retrieveBegin*()

Critical: Closures capture variables by reference, not by value:
my $x = 1;
my $c1 = sub { $x };
$x = 2;
my $c2 = sub { $x };
print $c1->(); # Prints 2 (not 1!)
print $c2->(); # Prints 2
Both closures see the same $x reference. Changes are visible to all closures.
Two phases:
Instantiation (closure constructor):
NEW org/perlonjava/anon42
DUP
ALOAD <captured_var_1> // Variables captured here
ALOAD <captured_var_2>
INVOKESPECIAL org/perlonjava/anon42.<init>
Variables are captured during instantiation, even if uninitialized.
Execution (closure apply):
ALOAD <closure_object>
ALOAD <args>
ILOAD <context>
INVOKEVIRTUAL org/perlonjava/anon42.apply
Variables are accessed/used during execution.
Key insight: If a variable is initialized AFTER closure instantiation but BEFORE execution, the closure sees the initialized value (because it captured by reference).
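The two phases can be modelled with a small class whose constructor captures a variable and whose apply method reads it. This is a sketch under stated assumptions: ScalarCell and ClosureSketch are hypothetical names, and the real generated classes (such as anon42) take arguments and a context that are omitted here.

```java
// Hypothetical mutable cell standing in for a Perl scalar container.
final class ScalarCell {
    Integer value; // null models an uninitialized (undef) scalar
}

// Hypothetical stand-in for a generated closure class such as anon42.
final class ClosureSketch {
    private final ScalarCell captured; // the reference is stored, not the value

    // Instantiation phase: the cell is captured here, even if still uninitialized.
    ClosureSketch(ScalarCell captured) {
        this.captured = captured;
    }

    // Execution phase: the cell's current contents are read only now.
    Integer apply() {
        return captured.value;
    }

    public static void main(String[] args) {
        ScalarCell x = new ScalarCell();         // declared but not yet initialized
        ClosureSketch c = new ClosureSketch(x);  // instantiation: captures the cell
        x.value = 42;                            // initialized AFTER instantiation
        System.out.println(c.apply());           // 42
    }
}
```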
The JVM has a hard limit: 65,535 bytes per method. Large Perl code blocks can exceed this.
LargeBlockRefactorer.trySmartChunking() splits large blocks:
Original: 10,000 statements in one method
↓
Refactored: sub { 4000 statements, sub { 3000 statements, sub { 3000 statements }->() }->() }->()
Two paths:
treatAllElementsAsSafe = true (no labels, no control flow):
treatAllElementsAsSafe = false (has labels or control flow):
Chunking process:
Start with all elements as one safe run:
safeRunLen = 100
safeRunEndExclusive = 100
Iteration 1: Take chunk [60..99]
chunkStart = 60
Create closure with elements [60..99]
Update: safeRunEndExclusive = 60, safeRunLen = 60
Iteration 2: Take chunk [30..59]
chunkStart = 30
Create closure with elements [30..59] + previous closure call
Update: safeRunEndExclusive = 30, safeRunLen = 30
After chunking: Add remaining [0..29] to result
safeRunStart = safeRunEndExclusive - safeRunLen = 0
Add elements [0..29]
Key insight: safeRunStart is RECALCULATED on each iteration. The algorithm is correct.
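The index bookkeeping in the walkthrough can be checked with a standalone sketch. It is not the code in LargeBlockRefactorer: ChunkingSketch is a hypothetical name, and the fixed chunk size is an assumption (the walkthrough above uses chunks of 40 and then 30 elements). Only the variable names and the safeRunStart formula come from the text.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of back-to-front chunking; NOT the real LargeBlockRefactorer.
final class ChunkingSketch {
    // Returns inclusive [start, end] ranges: chunks in the order they are
    // taken from the tail, followed by the remaining prefix.
    static List<int[]> chunk(int total, int chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        List<int[]> ranges = new ArrayList<>();
        int safeRunLen = total;
        int safeRunEndExclusive = total;
        // Take chunks from the end while more than one chunk's worth remains.
        while (safeRunLen > chunkSize) {
            int chunkStart = safeRunEndExclusive - chunkSize;
            ranges.add(new int[] {chunkStart, safeRunEndExclusive - 1});
            safeRunEndExclusive = chunkStart;
            safeRunLen -= chunkSize;
        }
        // Recalculated from the updated values, so it points at the true start.
        int safeRunStart = safeRunEndExclusive - safeRunLen;
        if (safeRunLen > 0) {
            ranges.add(new int[] {safeRunStart, safeRunEndExclusive - 1});
        }
        return ranges;
    }

    public static void main(String[] args) {
        for (int[] r : chunk(100, 40)) {
            System.out.println("[" + r[0] + ".." + r[1] + "]");
        }
        // Prints [60..99], [20..59], [0..19], one per line.
    }
}
```

Because safeRunEndExclusive and safeRunLen shrink together, the difference between them stays at the start of the run, which is why no leading elements are dropped.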
During closure creation in SubroutineParser.java:704:
if (ast.id == 0) {
ast.id = EmitterMethodCreator.classCounter++;
}
This assigns IDs to ANY my variable that gets captured by a refactored closure, not just BEGIN variables.
--debug - Emit debug information during compilation
./jperl --debug script.pl
Shows: use statements, warnings, compilation stages
--disassemble - Show JVM bytecode
./jperl --disassemble script.pl > output.txt
Shows: Java classes, methods, bytecode instructions, LINENUMBER markers
--parse - Show AST structure
./jperl --parse script.pl
Shows: AST nodes, token positions (pos:N)
--tokenize - Show lexer tokens
./jperl --tokenize script.pl
Shows: Each token with type and position
Critical distinction:
LINENUMBER in disassembly output (from ./jperl --disassemble) is TOKEN INDEX, not source line number
LINENUMBER 229 in bytecode = Token 229 in tokenizer output
Use: ./jperl --tokenize file.pl | sed -n '229p'
"line" in Perl error messages (runtime errors) is SOURCE LINE NUMBER
Error: at line 229
Source line 229: %fileTypeLookup = (
This IS the actual source line 229 in the file
Key insight: Don't confuse bytecode LINENUMBER (token index) with error message line numbers (source line).
Pattern for tracking execution:
// In RuntimeHash.java
private static volatile int debugHashId = -1;

public RuntimeArray setFromList(RuntimeList value) {
    if (value.elements.size() == 6 && value.elements.get(1).toString().equals("ExifTool")) {
        long timestamp = System.nanoTime();
        debugHashId = System.identityHashCode(this);
        System.err.println("DEBUG [" + timestamp + "] setFromList CALLED: hash=" +
                System.identityHashCode(this) + " size=" + this.size());
        StackTraceElement[] stack = Thread.currentThread().getStackTrace();
        for (int i = 2; i < Math.min(stack.length, 15); i++) {
            System.err.println("  at " + stack[i]);
        }
    }
    // ... rest of method
}
Use System.identityHashCode() to track object identity (not .equals() which might be overridden).
Use timestamps to track execution order when multiple events happen.
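A small standalone sketch shows why identity matters more than equals() for this kind of tracing. IdentityTraceSketch and its helper names are hypothetical. Note that System.identityHashCode() is not guaranteed to be unique across objects, so treat matching values as a strong hint rather than proof; a direct == comparison is the definitive check.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical tracing helper; names are illustrative only.
final class IdentityTraceSketch {
    // Builds a trace label from a monotonic timestamp and the object's identity hash.
    static String label(String event, Object target) {
        return "DEBUG [" + System.nanoTime() + "] " + event
                + ": id=" + System.identityHashCode(target);
    }

    // Definitive identity check: true only for the very same object.
    static boolean sameObject(Object a, Object b) {
        return a == b;
    }

    public static void main(String[] args) {
        List<String> a = new ArrayList<>(List.of("ExifTool"));
        List<String> b = new ArrayList<>(List.of("ExifTool"));
        System.out.println(a.equals(b));       // true: same contents
        System.out.println(sameObject(a, b));  // false: two distinct objects
        System.err.println(label("setFromList", a));
        System.err.println(label("setFromList", b));
    }
}
```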
# Generate disassembly before fix
./jperl --disassemble file.pl > /tmp/before.txt
# Apply fix, rebuild
make
# Generate disassembly after fix
./jperl --disassemble file.pl > /tmp/after.txt
# Compare
diff /tmp/before.txt /tmp/after.txt | less
# Or search for specific patterns
grep "setFromList" /tmp/before.txt
grep "setFromList" /tmp/after.txt
Pattern: Code appears in source but not in bytecode
Check if refactoring happened:
grep -c "anon.*apply" disassembly.txt
If > 0, code was refactored into closures
Search in all anonymous classes:
grep -B5 -A20 "class org/perlonjava/anon" disassembly.txt | less
Check if code is in a closure that's never called:
grep "LINENUMBER <token>" disassembly.txt
Missing LINENUMBER means code wasn't emitted
WRONG:
# This creates ONE for loop in the AST
for (1..10000) {
$x++;
}
RIGHT:
# This creates 10,000 statements in the AST
$x++;
$x++;
$x++;
# ... repeat 9,997 more times
cat > test.t << 'EOF'
use v5.38;
use Test::More;
my $x = 0;
EOF
# Generate 10,000 actual statements
perl -e 'print "\$x += 1;\n" x 10000' >> test.t
cat >> test.t << 'EOF'
is($x, 10000, "All statements executed");
done_testing();
EOF
To trigger LargeBlockRefactorer:
- treatAllElementsAsSafe path: no labels, no last/next/redo/return
- $x++ for 1..N compiles to a loop, not N statements

What I thought: Lines 383-386 only add [safeRunStart..safeRunEndExclusive-1], so elements [0..safeRunStart-1] are lost.
Reality: safeRunStart = safeRunEndExclusive - safeRunLen is RECALCULATED on each iteration. After chunking completes, this formula correctly identifies the remaining elements.
How I found out:
- safeRunStart changes on each iteration

Lesson: If you can't reproduce a bug with a test, you probably misunderstood the code.
What I thought: \%allGroups where %allGroups has an ID doesn't create a proper reference.
Reality: The backslash operator (handleCreateReference) evaluates the operand (which loads the variable correctly) then calls createReference() on it. Works fine.
How I found out:
- \%hash where hash has ID
- EmitOperator.handleCreateReference() - code is correct

Lesson: Test your hypothesis before claiming bugs.
What I thought: ExifTool failure is caused by refactorer bug.
Reality: Fixed refactorer "bug" (which wasn't a bug), ExifTool still fails with same error.
How I found out: Applied fix, ExifTool still broken.
Lesson: Don't assume cause based on symptoms. Follow the evidence.
- org.perlonjava.astnode
- org.perlonjava.astrefactor
- org.perlonjava.codegen
- org.perlonjava.runtime.runtimetypes

Each layer doesn't need to know about the others' internals.
Each block/subroutine has its own symbol table:
class ScopedSymbolTable {
Map<String, SymbolEntry> variableIndex; // Variables in this scope
ScopedSymbolTable parent; // Outer scope
}
Lookup walks up the parent chain until variable is found.
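The parent-chain lookup can be sketched as follows. ScopeSketch is a hypothetical simplification: it maps names straight to slot indices, whereas the structure above stores SymbolEntry objects, and the real ScopedSymbolTable may organize its scopes differently.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical simplification of a scoped symbol table; NOT the real class.
final class ScopeSketch {
    private final Map<String, Integer> variableIndex = new HashMap<>();
    private final ScopeSketch parent; // null for the outermost scope

    ScopeSketch(ScopeSketch parent) {
        this.parent = parent;
    }

    void declare(String name, int index) {
        variableIndex.put(name, index);
    }

    // Walks from this scope outward; returns null if the name is not declared.
    Integer lookup(String name) {
        for (ScopeSketch scope = this; scope != null; scope = scope.parent) {
            Integer index = scope.variableIndex.get(name);
            if (index != null) {
                return index;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        ScopeSketch fileScope = new ScopeSketch(null);
        fileScope.declare("$x", 1);
        ScopeSketch block = new ScopeSketch(fileScope);
        block.declare("$y", 2);
        System.out.println(block.lookup("$x"));     // 1 (found in parent)
        System.out.println(block.lookup("$y"));     // 2 (found locally)
        System.out.println(fileScope.lookup("$y")); // null (inner names are not visible)
    }
}
```

An inner declaration with the same name shadows the outer one, because the walk stops at the first scope that contains the name.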
The second strategy is slower but more flexible (survives refactoring).
- CLAUDE.md - Project-specific guidance (READ THIS FIRST)
- SKILL.md - This document - debugging skills and architecture knowledge
- src/main/java/org/perlonjava/ - The actual implementation