skills/ai-agent-for-reverseengineering/SKILL.md
Reverse-engineer legacy numerical/scientific Fortran or C code and translate it into modern Python frameworks (Devito, NumPy, SciPy, FEniCS, etc.) using a multi-stage analysis pipeline with knowledge-graph-guided retrieval, structured code synthesis, and iterative validation. Trigger phrases: "convert this Fortran code to Python", "reverse engineer this finite difference code", "translate this legacy numerical solver to Devito", "modernize this scientific computing code", "what does this Fortran stencil do and how do I write it in Python", "migrate this CFD solver from Fortran to a modern framework"
npx skillsauth add ndpvt-web/arxiv-claude-skills ai-agent-for-reverseengineering
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
This skill enables Claude to systematically reverse-engineer legacy finite-difference and numerical simulation code (Fortran, C, or older scientific codebases) and translate it into modern Python-based frameworks such as Devito, NumPy/SciPy, or FEniCS. The approach follows the multi-stage pipeline from Hou & Yang (2026): static analysis to extract computational structure, knowledge-graph-organized retrieval to map legacy patterns onto target framework APIs, Pydantic-style constrained code synthesis to produce correct output, and multi-dimensional validation covering execution correctness, mathematical consistency, and API compliance.
Multi-stage reverse engineering with structured retrieval and constrained synthesis. The core insight from Hou & Yang is that naive LLM-based code translation fails on scientific code because finite-difference stencils encode implicit mathematical relationships (PDEs, boundary conditions, stability constraints) that are not apparent from syntax alone. The solution is a three-level analysis pipeline: (1) function-level static analysis to identify computational kernels and stencil patterns, (2) module-level analysis to capture data flow and organizational structure, and (3) codebase-level dependency mapping across files.
Knowledge-graph-guided retrieval for target framework mapping. Rather than relying on the LLM's
parametric knowledge of the target framework, the approach builds a structured knowledge graph of the
target API (e.g., Devito's Function, TimeFunction, Eq, Operator classes) organized into semantic
communities via Leiden clustering. When translating a specific stencil, the system retrieves the most
relevant API patterns from the correct community (e.g., seismic simulation vs. CFD vs. performance
tuning), then expands the query with related concepts to capture edge cases like boundary handling or
subdomain specifications.
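The retrieval idea can be illustrated with a toy sketch. It assumes hand-assigned communities in place of Leiden clustering, plain token overlap in place of a real ranking model, and a hypothetical expansion table; only the Devito class names come from the text above, and the descriptions are illustrative.

```python
# Toy knowledge map: API entries grouped into hand-assigned "communities".
# In the described approach these groups would come from Leiden clustering.
KNOWLEDGE = {
    "field_declaration": {
        "TimeFunction": "time-varying field on a grid with time_order and space_order",
        "Function": "static field on a grid such as a velocity model",
    },
    "equation_building": {
        "Eq": "symbolic stencil equation relating field values",
        "Operator": "compiled kernel built from one or more equations",
    },
}

# Hypothetical expansion table: related concepts appended to the query.
EXPANSIONS = {"stencil": ["equation", "kernel"], "velocity": ["static", "field"]}


def expand(query):
    """Return the query tokens plus any related concepts."""
    tokens = query.lower().split()
    extra = [term for tok in tokens for term in EXPANSIONS.get(tok, [])]
    return set(tokens + extra)


def retrieve(query, top_k=2):
    """Rank API entries by token overlap with the expanded query."""
    terms = expand(query)
    scored = []
    for community, entries in KNOWLEDGE.items():
        for name, description in entries.items():
            overlap = len(terms & set(description.lower().split()))
            scored.append((overlap, community, name))
    # Highest overlap first; ties broken alphabetically by entry name.
    scored.sort(key=lambda item: (-item[0], item[2]))
    return [(community, name) for overlap, community, name in scored[:top_k] if overlap > 0]


print(retrieve("stencil"))
```

A real system would replace the overlap score with embedding similarity or graph distance; the point here is only that expansion pulls in entries the bare query would miss.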
Iterative validation with feedback-driven refinement. Generated code is validated across four dimensions: execution correctness (does it run?), structural soundness (does it follow framework idioms?), mathematical consistency (does it implement the same PDE discretization?), and API compliance (does it use the target framework correctly?). When validation fails, the specific failure dimension feeds back into retrieval weighting, causing the system to pull more context from the relevant knowledge community on the next iteration. This transforms static translation into an adaptive refinement loop.
Parse the legacy source code statically. Identify all subroutines/functions, their call graph, array declarations with dimensions, loop nests, and index arithmetic. For Fortran, pay special attention to COMMON blocks, IMPLICIT typing, and column-major array ordering. Produce a structured inventory: {function_name, arguments, array_accesses, loop_bounds, stencil_offsets}.
Extract the computational stencil from each kernel. For every nested loop that updates an array, determine the stencil shape by collecting all relative index offsets (e.g., u(i-1,j), u(i+1,j), u(i,j-1), u(i,j+1) indicates a 5-point Laplacian). Record the coefficients multiplying each offset to reconstruct the finite-difference weights.
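As a minimal sketch of offset collection, the snippet below scans the right-hand side of an assignment with a regular expression. It assumes 2D accesses written with loop variables named i and j and literal integer offsets; the helper names are hypothetical, and a real pipeline would work from a parsed syntax tree instead.

```python
import re

# Matches accesses such as u(i+1,j) or u(i, j-1); literal offsets only.
ACCESS = re.compile(r"(\w+)\(\s*i\s*([+-]\s*\d+)?\s*,\s*j\s*([+-]\s*\d+)?\s*\)")


def _to_int(text):
    """Convert an optional '+1' / '- 1' capture to an integer offset."""
    return int(text.replace(" ", "")) if text else 0


def stencil_offsets(expr, array):
    """Return the sorted (di, dj) offsets at which `array` is read in `expr`."""
    offsets = set()
    for name, di, dj in ACCESS.findall(expr):
        if name == array:
            offsets.add((_to_int(di), _to_int(dj)))
    return sorted(offsets)


rhs = "0.25*(phi(i+1,j)+phi(i-1,j)+phi(i,j+1)+phi(i,j-1))"
print(stencil_offsets(rhs, "phi"))  # [(-1, 0), (0, -1), (0, 1), (1, 0)]
```

Fortran identifiers are case-insensitive, so real input would need to be normalized to one case before matching.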
Identify the governing PDE and discretization scheme. From the stencil weights, spatial dimensions, and time-stepping structure, determine which PDE is being solved (wave equation, heat equation, Navier-Stokes, etc.) and the discretization order (second-order central differences, fourth-order, upwind, etc.). Document the CFL condition or stability constraints if present.
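One way to recognize the discretization is to compute the textbook finite-difference weights for a candidate set of offsets and compare them with the extracted coefficients. The sketch below solves the Taylor-expansion conditions directly with NumPy; the function name is hypothetical and unit grid spacing is assumed.

```python
import math

import numpy as np


def fd_weights(offsets, deriv):
    """Weights w with sum(w[k] * f(x + offsets[k])) ~ f^(deriv)(x) for unit spacing."""
    offsets = np.asarray(offsets, dtype=float)
    n = len(offsets)
    # Row p holds offsets**p, enforcing the Taylor condition for power p.
    A = np.vander(offsets, n, increasing=True).T
    b = np.zeros(n)
    b[deriv] = math.factorial(deriv)
    return np.linalg.solve(A, b)


print(fd_weights([-1, 0, 1], 2))   # second-order central second derivative
print(fd_weights([-1, 0], 1))      # first-order backward (upwind for c > 0)
```

If the extracted coefficients, after dividing out the grid-spacing factor, match one of these weight vectors, the derivative order and accuracy of the scheme follow from the offsets used.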
Map boundary conditions and initial conditions. Analyze code outside the main stencil loops for boundary handling: Dirichlet (fixed values), Neumann (gradient conditions), absorbing boundaries (PML, sponge layers), or periodic wrapping. Record these as structured constraints.
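A structured constraint for a boundary can be as small as a record plus a function that applies it. The following is a sketch for a 1D NumPy field; the class and field names are hypothetical, the Neumann case uses a one-sided first-order difference, and the periodic case assumes the edge cells act as ghost cells.

```python
from dataclasses import dataclass

import numpy as np


@dataclass(frozen=True)
class BoundaryCondition:
    side: str           # "left" or "right"
    kind: str           # "dirichlet", "neumann", or "periodic"
    value: float = 0.0  # boundary value (Dirichlet) or du/dx (Neumann)


def apply_bc(u, bc, dx):
    """Apply one recorded boundary condition to a 1D field in place."""
    edge, inner, wrap = (0, 1, -2) if bc.side == "left" else (-1, -2, 1)
    if bc.kind == "dirichlet":
        u[edge] = bc.value
    elif bc.kind == "neumann":
        # Enforce du/dx = value with a one-sided first-order difference.
        sign = 1.0 if bc.side == "left" else -1.0
        u[edge] = u[inner] - sign * bc.value * dx
    elif bc.kind == "periodic":
        # Ghost-cell convention: the edge copies the opposite interior cell.
        u[edge] = u[wrap]
    else:
        raise ValueError(f"unknown boundary kind: {bc.kind}")
    return u


field = np.array([0.0, 1.0, 2.0, 3.0])
apply_bc(field, BoundaryCondition("left", "dirichlet", 5.0), dx=0.1)
apply_bc(field, BoundaryCondition("right", "neumann", 0.0), dx=0.1)
```

Absorbing layers such as PML need extra fields and are not covered by this record; they would be flagged separately.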
Build a target-framework knowledge map. For the target framework (Devito, NumPy, FEniCS, etc.), organize the relevant API into categories: grid/mesh construction, field variable declaration, equation specification, operator compilation, boundary condition application, and time-stepping control. If using Devito, map: Grid for domain, TimeFunction/Function for fields, Eq for stencil equations, Operator for compiled kernels.
Translate each component using constrained synthesis. Generate the target code component by component, enforcing structural constraints: (a) grid dimensions must match the original, (b) stencil order must match extracted weights, (c) boundary conditions must be explicitly applied, (d) time-stepping loop structure must preserve the original update sequence. Use Pydantic-style validation schemas to verify each component before assembly.
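The structural constraints can be checked before assembly with small validating records. This sketch uses standard-library dataclasses as a stand-in for Pydantic models so that it runs without extra dependencies; the class names and the specific rules are illustrative assumptions, not the schema from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GridSpec:
    shape: tuple
    spacing: tuple

    def __post_init__(self):
        if len(self.shape) != len(self.spacing):
            raise ValueError("shape and spacing must have the same number of dimensions")
        if any(n < 3 for n in self.shape):
            raise ValueError("each dimension needs at least 3 points for a centered stencil")
        if any(h <= 0 for h in self.spacing):
            raise ValueError("grid spacing must be positive")


@dataclass(frozen=True)
class StencilSpec:
    space_order: int
    weights: tuple  # extracted 1D weights, e.g. (1, -2, 1)

    def __post_init__(self):
        if len(self.weights) != self.space_order + 1:
            raise ValueError("a centered stencil of order p needs p + 1 weights")
        if abs(sum(self.weights)) > 1e-12:
            raise ValueError("derivative weights must sum to zero")


def same_grid(original, translated):
    """Constraint (a): the translated grid must match the original."""
    return original.shape == translated.shape and original.spacing == translated.spacing


legacy = GridSpec(shape=(64, 64), spacing=(0.1, 0.1))
stencil = StencilSpec(space_order=2, weights=(1, -2, 1))
```

Constructing a record with inconsistent values raises immediately, so a bad component is rejected before it reaches the assembled program.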
Assemble the full translated program. Combine grid setup, field initialization, equation definitions, boundary conditions, and the time-stepping driver into a complete runnable script. Preserve the original code's I/O structure (reading input parameters, writing output snapshots) adapted to Python conventions.
Validate across four dimensions. Run the translated code and check: (a) Execution: does it run without errors? (b) Structure: does it follow target framework idioms (no raw NumPy loops where symbolic operators should be used)? (c) Mathematics: do the stencil coefficients match the original discretization order? (d) API compliance: are framework-specific objects used correctly (e.g., Devito Operator vs. manual loops)?
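For the execution and mathematics checks, one inexpensive test is to compare a literal loop port of the legacy kernel against the vectorized translation on random data. The sketch below does this for a 5-point Laplacian in NumPy; the function names are hypothetical and the kernel is a stand-in for whatever was extracted.

```python
import numpy as np


def laplacian_loops(u, dx, dy):
    """Literal port of a Fortran-style interior loop (reference)."""
    out = np.zeros_like(u)
    nx, ny = u.shape
    for j in range(1, ny - 1):
        for i in range(1, nx - 1):
            out[i, j] = ((u[i + 1, j] - 2 * u[i, j] + u[i - 1, j]) / dx**2
                         + (u[i, j + 1] - 2 * u[i, j] + u[i, j - 1]) / dy**2)
    return out


def laplacian_sliced(u, dx, dy):
    """Vectorized candidate translation of the same stencil."""
    out = np.zeros_like(u)
    out[1:-1, 1:-1] = ((u[2:, 1:-1] - 2 * u[1:-1, 1:-1] + u[:-2, 1:-1]) / dx**2
                       + (u[1:-1, 2:] - 2 * u[1:-1, 1:-1] + u[1:-1, :-2]) / dy**2)
    return out


rng = np.random.default_rng(0)
field = rng.standard_normal((12, 9))
matches = np.allclose(laplacian_loops(field, 0.1, 0.2),
                      laplacian_sliced(field, 0.1, 0.2))
print("translation matches reference:", matches)
```

A second useful check is a manufactured input: for u = x^2 + y^2 the second-order stencil should return 4 at every interior point, which tests the coefficients independently of the reference port.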
Iterate on failures with targeted retrieval. If validation fails on a specific dimension, refine that component. For mathematical failures, re-examine the stencil extraction. For API failures, retrieve more framework documentation for the specific construct that failed. For execution failures, check array shapes, index ordering (column-major vs. row-major), and off-by-one errors in loop bounds.
Document the translation mapping. Produce a summary table mapping each original Fortran subroutine to its Python equivalent, noting any semantic changes (e.g., 1-based to 0-based indexing, column-major to row-major array layout, explicit loops to symbolic operators).
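The two most common semantic changes in such a table, index base and memory layout, can be shown in a few lines of NumPy. This is a sketch with a hypothetical helper name; the flat array stands in for data written by a Fortran program.

```python
import numpy as np

nx, ny = 3, 2
# Values in the order a Fortran program stores u(nx, ny):
# the first index varies fastest (column-major).
flat = np.arange(1, nx * ny + 1, dtype=np.float64)

# order="F" keeps the same logical indices as the Fortran array.
u = flat.reshape((nx, ny), order="F")


def fget(arr, i, j):
    """Read the element Fortran calls arr(i, j), using 1-based indices."""
    return arr[i - 1, j - 1]


print(fget(u, 2, 1))  # 2.0, the second element in memory
print(fget(u, 1, 2))  # 4.0, the first element of the second column
```

Reshaping with the default C order would silently transpose the logical layout, which is a frequent source of execution-stage validation failures.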
Example 1: Fortran 2D acoustic wave equation to Devito
User: "Convert this Fortran code to Devito:"
do it = 1, nt
  do j = 2, ny-1
    do i = 2, nx-1
      u_new(i,j) = 2*u(i,j) - u_old(i,j) + &
        (dt**2 * v(i,j)**2) * ( &
          (u(i+1,j) - 2*u(i,j) + u(i-1,j))/dx**2 + &
          (u(i,j+1) - 2*u(i,j) + u(i,j-1))/dy**2 )
    end do
  end do
  u_old = u
  u = u_new
end do
Approach:
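The stencil reads the center point and the offsets [-1,0], [+1,0], [0,-1], [0,+1], with coefficients [1,-2,1]/dx^2 in x and [1,-2,1]/dy^2 in y.
This is the 2nd-order central difference for the 2D acoustic wave equation d2u/dt2 = v^2 * (d2u/dx2 + d2u/dy2).
The time update is the matching second-order explicit form (2*u - u_old + dt^2 * ...).
Output: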
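from devito import Grid, TimeFunction, Function, Eq, Operator, solve
# Devito derives the spacing as extent / (shape - 1), so this keeps spacing (dx, dy)
grid = Grid(shape=(nx, ny), extent=((nx - 1)*dx, (ny - 1)*dy))
u = TimeFunction(name='u', grid=grid, time_order=2, space_order=2)
v = Function(name='v', grid=grid)
v.data[:] = velocity_model # load from original input
# Wave equation residual: u_tt - v^2 * (u_xx + u_yy) = 0
pde = u.dt2 - v**2 * (u.dx2 + u.dy2)
stencil = Eq(u.forward, solve(pde, u.forward))
# Note: the Fortran loop updates interior points only; check that edge handling matches
op = Operator([stencil])
op.apply(time_M=nt-1, dt=dt)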
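To cross-check the Devito output, it can help to keep a plain NumPy port of the same Fortran loop as a reference. The sketch below is such a port, not Devito code; the function names are hypothetical, and it assumes zero initial velocity (u_old equals u at the start) and that edge values carry over unchanged from step to step.

```python
import numpy as np


def wave_step(u, u_old, v, dt, dx, dy):
    """One explicit second-order update of the interior points."""
    u_new = u.copy()  # assumption: edge rows and columns stay unchanged
    lap = ((u[2:, 1:-1] - 2 * u[1:-1, 1:-1] + u[:-2, 1:-1]) / dx**2
           + (u[1:-1, 2:] - 2 * u[1:-1, 1:-1] + u[1:-1, :-2]) / dy**2)
    u_new[1:-1, 1:-1] = (2 * u[1:-1, 1:-1] - u_old[1:-1, 1:-1]
                         + dt**2 * v[1:-1, 1:-1] ** 2 * lap)
    return u_new


def run(u0, v, dt, dx, dy, nt):
    """Advance nt steps, rotating buffers as the Fortran code does."""
    u_old, u = u0.copy(), u0.copy()  # assumption: zero initial velocity
    for _ in range(nt):
        u, u_old = wave_step(u, u_old, v, dt, dx, dy), u
    return u
```

Comparing interior values of this reference against the Devito field after the same number of steps gives a direct check of mathematical consistency.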
Example 2: Understanding and documenting a legacy Fortran stencil
User: "What does this Fortran code compute?"
do j = 2, ny-1
  do i = 2, nx-1
    phi_new(i,j) = 0.25*(phi(i+1,j)+phi(i-1,j)+phi(i,j+1)+phi(i,j-1))
  end do
end do
Approach:
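The stencil replaces each interior value with the average of its four nearest neighbors, which is the Jacobi update for the discretized 2D Laplace equation d2phi/dx2 + d2phi/dy2 = 0.
Output: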
This is a single Jacobi iteration step for solving the 2D Laplace equation
(nabla^2 phi = 0) on a uniform grid with dx = dy. Each interior point is replaced by the
average of its 4 neighbors. The loop excludes boundaries (indices 2 to n-1
in Fortran 1-based indexing), implying Dirichlet boundary conditions held
fixed at the array edges.
To converge, this must be wrapped in an outer iteration loop checking
a residual tolerance (e.g., max|phi_new - phi| < epsilon).
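The outer convergence loop mentioned above could look like the following NumPy sketch. The function name, tolerance, and iteration cap are assumptions for illustration; the update itself is the same 4-neighbor average as the Fortran kernel.

```python
import numpy as np


def jacobi(phi, tol=1e-6, max_iter=10_000):
    """Repeat the 4-neighbor average until max|phi_new - phi| < tol."""
    phi = phi.copy()
    for iteration in range(1, max_iter + 1):
        phi_new = phi.copy()  # edges keep their Dirichlet values
        phi_new[1:-1, 1:-1] = 0.25 * (phi[2:, 1:-1] + phi[:-2, 1:-1]
                                      + phi[1:-1, 2:] + phi[1:-1, :-2])
        residual = np.max(np.abs(phi_new - phi))
        phi = phi_new
        if residual < tol:
            return phi, iteration
    raise RuntimeError("Jacobi iteration did not converge")
```

Calling jacobi on an array whose edges hold the boundary values and whose interior holds an initial guess returns the converged field together with the number of sweeps used.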
Example 3: Translating a 1D advection scheme with upwind differencing
User: "Translate this C finite-difference code to NumPy:"
for (int n = 0; n < nt; n++) {
    for (int i = 1; i < nx; i++) {
        u_new[i] = u[i] - c * dt / dx * (u[i] - u[i-1]);
    }
    memcpy(u, u_new, nx * sizeof(double));
}
Approach:
The spatial difference is (u[i] - u[i-1])/dx, a first-order upwind scheme (for c > 0) for the advection equation du/dt + c*du/dx = 0.
The loop starts at i=1, so u[0] is a fixed inflow boundary.
Output:
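import numpy as np
u = u_initial.copy()
courant = c * dt / dx # CFL number, must be <= 1 for stability (assumes c > 0)
for n in range(nt):
    # The right-hand side is evaluated in full before assignment,
    # so no separate u_new buffer is needed
    u[1:] = u[1:] - courant * (u[1:] - u[:-1])
    # u[0] remains fixed (inflow Dirichlet BC)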
If the original code enforces a stability limit such as dt <= dx / v_max, the translated code must enforce the same condition.
Prefer the target framework's own constructs over hand-written Python loops (Devito Operator, FEniCS solve) for performance.
When stencil offsets come from computed indices (e.g., u(idx(i)+1, j)) rather than literal constants, trace the index computation to resolve the actual offsets. If unresolvable, flag to the user and ask for clarification.
Precision matters: REAL*4 vs REAL*8 affects numerical stability. Default to float64 in Python and note where the original used single precision.
Heavily preprocessed sources (#ifdef forests, Fortran cpp directives) need macro expansion before stencil extraction is reliable.
Hou, Y. & Yang, Z. (2026). AI Agent for Reverse-Engineering Legacy Finite-Difference Code and Translating to Devito. arXiv:2601.18381v1. Key takeaway: the three-level static analysis (function/module/codebase) combined with knowledge-graph-organized retrieval and four-dimensional validation produces reliable translations where single-pass LLM translation fails.