skills/environment-in-the-loop-rethinking-code-migration/SKILL.md
Perform code migrations (dependency upgrades, API changes, framework transitions) with integrated environment verification. Instead of migrating code then hoping it builds, this skill builds and tests inside a real environment at every step, using feedback loops to fix both code and configuration issues. Use when: 'migrate this project from X to Y', 'upgrade dependency version', 'port this codebase to a new framework', 'fix build after dependency update', 'help me upgrade NumPy/React/Django/Spring', 'automate this library migration'.
This skill enables Claude to perform code migrations — dependency upgrades, API adaptations, framework transitions — by treating the execution environment as a first-class participant rather than an afterthought. Based on the Environment-in-the-Loop (EITL) framework from Li et al. (2026), the core insight is that code and its environment co-evolve: migrating code without simultaneously constructing and verifying the target environment leads to hidden runtime failures, configuration drift, and prolonged rework cycles. This skill implements a tight feedback loop where every code change is immediately validated against a real build/test environment, and diagnostic output from that environment drives the next round of fixes.
Traditional LLM-assisted migration treats code transformation and environment setup as separate, sequential steps: first rewrite the code, then try to build it. This fails because many migration errors are invisible to static analysis. For example, NumPy 2.x changed internal constraints on np.concatenate while keeping the same function signature — code that looks correct fails at runtime with a type error. These version-dependent behavioral changes only surface when you actually execute the migrated code in the target environment.
The EITL framework solves this with a three-agent architecture operating in a continuous feedback loop. The Migration Agent generates candidate code changes and dependency specifications. The Environment Agent (the central hub) autonomously constructs an isolated build environment — installing dependencies, resolving version conflicts, configuring build systems, and executing the project. When builds or tests fail, the Environment Agent captures diagnostic logs and routes them: configuration errors (wrong Python version, missing system library) are self-repaired by the Environment Agent; semantic code errors are sent back to the Migration Agent for correction. The Testsuite Agent generates and runs regression tests within the verified environment, ensuring behavioral equivalence between old and new versions.
The critical innovation is that environment feedback is continuous and structured, not a one-shot pass/fail. Each iteration produces a diagnostic report classifying the failure type (dependency conflict, API incompatibility, configuration drift, runtime behavioral change) and prescribing the correction pathway. This eliminates the "blind retry" pattern where developers repeatedly tweak code without understanding root causes.
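The classify-and-route idea can be sketched in a few lines of Python. This is a minimal illustration, not the paper's implementation: the `FailureType` names mirror the four failure types listed above, while the log patterns, the agent labels, and the assignment of dependency conflicts to the Environment Agent are assumptions made for the example.

```python
import re
from dataclasses import dataclass
from enum import Enum


class FailureType(Enum):
    CONFIGURATION_DRIFT = "configuration drift"
    DEPENDENCY_CONFLICT = "dependency conflict"
    API_INCOMPATIBILITY = "API incompatibility"
    BEHAVIORAL_CHANGE = "runtime behavioral change"


# Hypothetical log patterns, checked in order; the first match wins.
PATTERNS = [
    (re.compile(r"requires a different Python|command not found"),
     FailureType.CONFIGURATION_DRIFT),
    (re.compile(r"ResolutionImpossible|conflicting dependencies|ERESOLVE"),
     FailureType.DEPENDENCY_CONFLICT),
    (re.compile(r"AttributeError|ImportError|cannot find symbol"),
     FailureType.API_INCOMPATIBILITY),
    (re.compile(r"AssertionError|TypeError"),
     FailureType.BEHAVIORAL_CHANGE),
]

# Assumed routing: environment-level problems stay with the Environment Agent,
# code-level problems go back to the Migration Agent.
ROUTES = {
    FailureType.CONFIGURATION_DRIFT: "environment-agent",
    FailureType.DEPENDENCY_CONFLICT: "environment-agent",
    FailureType.API_INCOMPATIBILITY: "migration-agent",
    FailureType.BEHAVIORAL_CHANGE: "migration-agent",
}


@dataclass
class Diagnostic:
    line: str
    failure_type: FailureType
    route: str


def classify(log: str) -> list[Diagnostic]:
    """Turn raw build/test output into a list of routed diagnostics."""
    diagnostics = []
    for line in log.splitlines():
        for pattern, failure_type in PATTERNS:
            if pattern.search(line):
                diagnostics.append(
                    Diagnostic(line.strip(), failure_type, ROUTES[failure_type])
                )
                break
    return diagnostics
```

Lines that match no pattern are ignored here; a real classifier would more likely flag them for manual review than drop them.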
1. Audit the current environment state. Read package.json, requirements.txt, pom.xml, build.gradle, Dockerfile, or equivalent manifests. Identify the current dependency versions, language runtime version, and build system. Run the existing build/test suite to establish a green baseline.
2. Define the migration target explicitly. Clarify exactly what is being upgraded (specific library version, language version, framework). Check the target library's changelog and migration guide for breaking changes. Document known incompatibilities as a checklist.
3. Create an isolated environment for the migration. Use Docker, a virtual environment, or a clean branch. The environment must be reproducible: write a Dockerfile or shell script that provisions it from scratch. Never mutate the user's working environment directly.
4. Update dependency manifests first. Change version numbers in the manifests, then regenerate lock files by running the package manager's dependency resolution (npm install, pip install, mvn dependency:resolve) inside the isolated environment. Capture all output: version conflicts and resolution failures are the first feedback signal.
5. Apply code transformations based on known breaking changes. Using the migration guide and changelog from step 2, transform API calls, update import paths, and replace deprecated patterns. Make changes file by file, grouping related changes.
6. Build the project inside the target environment and capture diagnostics. Run the full build. On failure, parse the error output to classify each issue as a configuration error, dependency conflict, API incompatibility, or runtime behavioral change.
7. Fix the highest-priority issue and rebuild. Address one class of error at a time, starting with environment/configuration issues (they block everything else), then dependency conflicts, then API changes. Rebuild after each fix to get fresh diagnostics.
8. Run regression tests inside the verified environment. Execute existing tests. If tests are sparse, write targeted tests for the migrated code paths, especially around functions whose behavior changed across versions. Compare test results against the baseline from step 1.
9. Iterate the feedback loop until green. Repeat steps 6-8. Each cycle should resolve at least one class of error. If a fix introduces new failures, classify and prioritize them the same way. Track progress explicitly: the error count should decrease monotonically.
10. Produce a migration artifact. Output the final set of code changes, updated manifests, updated Dockerfile/environment script, and a summary of what changed and why. This artifact should allow anyone to reproduce the migration from scratch.
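The build, fix, and rebuild cycle in the workflow above can be sketched as a small driver. This is a minimal sketch under stated assumptions: `build` and `fix` are hypothetical callables supplied by the caller (in practice `build` would run the build inside the isolated environment and return already-classified issues), and the category names in `PRIORITY` are illustrative labels.

```python
from typing import Callable

# Fix order from the workflow above: environment/configuration first,
# then dependency conflicts, then API changes, then behavioral changes.
PRIORITY = ["configuration", "dependency", "api", "behavior"]


def migrate_until_green(
    build: Callable[[], list[str]],
    fix: Callable[[str], None],
    max_cycles: int = 10,
) -> list[int]:
    """Run build -> fix cycles and return the error count seen in each cycle."""
    history: list[int] = []
    for _ in range(max_cycles):
        issues = build()  # one PRIORITY label per outstanding issue
        history.append(len(issues))
        if not issues:
            return history  # green: nothing left to fix
        if len(history) > 1 and history[-1] >= history[-2]:
            # The error count stopped decreasing; stop instead of retrying blindly.
            raise RuntimeError(f"no progress, error counts so far: {history}")
        # Address exactly one class of error per cycle, highest priority first.
        target = min(set(issues), key=PRIORITY.index)
        fix(target)
    raise RuntimeError(f"not green after {max_cycles} cycles: {history}")
```

Returning the per-cycle error counts makes the progress tracking explicit, so the final migration summary can report how the failure count fell across iterations.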
Example 1: Upgrading NumPy from 1.x to 2.x in a data science project
User: "Upgrade this project from NumPy 1.24 to NumPy 2.0"
Approach:
1. Read requirements.txt and setup.py. Identify all NumPy-dependent code by grepping for import numpy and np.
2. Known breaking changes: removed aliases (np.bool, np.int, np.float), changed concatenate behavior, removed np.mat.
3. Create a Dockerfile with numpy==2.0.0 plus all other dependencies at their current versions.
4. Update requirements.txt to pin numpy>=2.0.0,<3.0.0.
5. Run pip install -r requirements.txt and capture any dependency conflicts (e.g., scipy 1.10 requires numpy<1.27). Resolve by upgrading scipy to a NumPy 2.0-compatible version.
6. Replace np.bool with np.bool_, np.int with np.int_, np.float with np.float64 across the codebase.
7. Run the tests: np.concatenate on mixed dtypes fails with a new type error, because NumPy 2.0 is stricter about dtype promotion. Fix by adding an explicit dtype= argument.
Output:
Migration complete: 14 files changed, 37 API calls updated.
- Replaced 22 deprecated type aliases (np.bool → np.bool_, etc.)
- Upgraded scipy 1.10.1 → 1.14.0 for NumPy 2.0 compatibility
- Fixed 3 dtype promotion issues in concatenate/stack calls
- Updated requirements.txt and Dockerfile
- All 148 tests passing in target environment
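The alias replacement step in this example could be automated with a small rewrite helper. The sketch below is illustrative only: the alias map is copied from the example, `rewrite_aliases` is a hypothetical name, and the regex assumes NumPy is imported as np. It also rewrites matches inside strings and comments, so review the resulting diff before committing.

```python
import re

# Alias map taken from the example above; extend it for your own codebase.
ALIASES = {"bool": "np.bool_", "int": "np.int_", "float": "np.float64"}

# The trailing \b leaves np.bool_, np.int64, and np.float32 untouched,
# because "_" and digits count as word characters.
ALIAS_RE = re.compile(r"\bnp\.(bool|int|float)\b")


def rewrite_aliases(source: str) -> tuple[str, int]:
    """Return the rewritten source and the number of replacements made."""
    return ALIAS_RE.subn(lambda m: ALIASES[m.group(1)], source)
```

The replacement count feeds directly into the migration summary (for example, the "deprecated type aliases" line in the output above).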
Example 2: Spring Boot 2 to Spring Boot 3 migration (Java 17)
User: "Migrate our Spring Boot 2.7 app to Spring Boot 3.2"
Approach:
1. Read pom.xml: current Spring Boot 2.7.x, Java 11. Note: Spring Boot 3 requires Java 17 minimum.
2. Known breaking changes: javax.* → jakarta.* namespace, removed WebSecurityConfigurerAdapter, changed auto-configuration class loading.
3. Update pom.xml: Spring Boot parent to 3.2.x, Java source/target to 17, update Spring Cloud and other Spring ecosystem dependencies.
4. Run mvn dependency:resolve and capture conflicts. A transitive dependency pulls in javax.servlet, which conflicts with jakarta.servlet. Exclude it and add the Jakarta equivalent.
5. Use find + sed to replace all javax.persistence → jakarta.persistence, javax.servlet → jakarta.servlet, etc.
6. Replace classes that extend WebSecurityConfigurerAdapter with the SecurityFilterChain bean pattern.
7. Run mvn compile: 4 errors remain around removed spring.factories auto-config. Migrate to META-INF/spring/org.springframework.boot.autoconfigure.AutoConfiguration.imports.
8. Run mvn test: 2 test failures from changed default behavior in MockMvc content-type handling. Fix assertions.
Output:
Migration complete: 89 files changed.
- javax → jakarta namespace: 64 files (automated replacement)
- Security config refactored: 3 files (manual pattern change)
- Auto-configuration migrated to new imports file
- pom.xml: 12 dependency version updates, 3 exclusions added
- Dockerfile updated: JDK 11 → JDK 17
- All 312 tests passing on Spring Boot 3.2.4 / JDK 17
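The namespace replacement in this example can also be sketched in Python as a stand-in for the find + sed step. The package list below is an assumption covering only the two packages the example names, and `to_jakarta` is a hypothetical helper. A blanket javax to jakarta swap would be wrong, because some javax.* packages (javax.sql, for instance) remain part of the JDK.

```python
import re

# Only the packages named in the example above (assumed list; extend as needed).
MIGRATED_PACKAGES = ("persistence", "servlet")

JAVAX_RE = re.compile(r"\bjavax\.(" + "|".join(MIGRATED_PACKAGES) + r")\b")


def to_jakarta(java_source: str) -> str:
    """Rename migrated javax.* packages to jakarta.*, leaving JDK packages alone."""
    return JAVAX_RE.sub(r"jakarta.\1", java_source)
```

Keeping an explicit allow-list means the rewrite can be extended one package at a time, with a rebuild after each addition.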
Example 3: Python 3.8 to 3.12 upgrade with environment repair
User: "Our CI just switched to Python 3.12 and everything is broken"
Approach:
1. Run pip install -r requirements.txt and immediately see distutils import errors (removed from stdlib in 3.12). Install setuptools as a dependency to restore distutils.
2. The imp module import fails (removed in 3.12). Replace import imp with import importlib and update call sites.
3. asyncio.coroutine decorator usage fails (removed). Replace with async def syntax.
4. cffi 1.14 and pycryptodome 3.9 don't support 3.12. Upgrade both to the latest compatible versions.
5. Produce an updated requirements.txt with pinned compatible versions and a Dockerfile that matches the CI environment exactly.
Output:
Migration complete: 8 files changed, 4 dependency upgrades.
- Replaced distutils imports with setuptools equivalents
- Replaced imp module usage with importlib (2 files)
- Modernized 5 asyncio.coroutine decorators to async def
- Upgraded cffi 1.14→1.17, pycryptodome 3.9→3.21
- Dockerfile aligned with CI runner (python:3.12-slim)
- All 94 tests passing on Python 3.12.1
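Before running anything, the removals named in this example can be located with a static scan. This is a minimal sketch: `find_removed_usages` is a hypothetical helper, and it checks only the three items from the example (distutils, imp, asyncio.coroutine). It complements running the code in the target environment and does not replace it.

```python
import ast

# Modules the example above names as removed from the standard library in 3.12.
REMOVED_MODULES = {"distutils", "imp"}


def find_removed_usages(source: str) -> list[tuple[int, str]]:
    """Return (line number, description) for each use of a removed module or decorator."""
    findings = []
    # ast.parse only parses the code, so this also works on an interpreter
    # where the scanned modules no longer exist.
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            for alias in node.names:
                if alias.name.split(".")[0] in REMOVED_MODULES:
                    findings.append((node.lineno, f"import {alias.name}"))
        elif isinstance(node, ast.ImportFrom):
            if (node.module or "").split(".")[0] in REMOVED_MODULES:
                findings.append((node.lineno, f"from {node.module} import ..."))
        elif isinstance(node, ast.Attribute):
            is_asyncio = isinstance(node.value, ast.Name) and node.value.id == "asyncio"
            if node.attr == "coroutine" and is_asyncio:
                findings.append((node.lineno, "asyncio.coroutine"))
    return sorted(findings)
```

The findings give a checklist of call sites to fix before the first build attempt.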
install step before trying to build.

| Error Type | Diagnosis | Resolution |
|---|---|---|
| Dependency resolution failure | Package manager can't find a compatible version set | Relax version constraints, check for alternative packages, or pin transitive dependencies |
| Build failure after manifest update | Compilation errors from removed/changed APIs | Consult migration guide, apply API transformations, rebuild |
| Tests pass locally but fail in container | Environment mismatch (system libraries, locale, timezone) | Compare env, installed packages, and OS-level dependencies between environments |
| Silent behavioral change | Tests pass but output differs from baseline | Add assertion-level regression tests comparing old vs. new behavior on edge cases |
| Circular dependency after upgrade | Two packages each require incompatible versions of a third | Check if either package has a newer release resolving the conflict, or pin the shared dependency and test both consumers |
| C extension compilation failure | Native code incompatible with new runtime | Upgrade the extension package, or if unmaintained, find a pure-Python alternative |
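
For the "tests pass locally but fail in container" row, comparing installed packages is often the quickest first check. The sketch below assumes both environments can produce `pip freeze` output as text; `parse_freeze` and `diff_environments` are hypothetical helpers, and only name==version lines are considered (editable installs and URL requirements are skipped).

```python
def parse_freeze(text: str) -> dict[str, str]:
    """Parse pip-freeze style lines of the form name==version."""
    packages = {}
    for line in text.splitlines():
        line = line.strip()
        if "==" in line and not line.startswith("#"):
            name, version = line.split("==", 1)
            packages[name.lower()] = version
    return packages


def diff_environments(local: str, container: str) -> dict[str, tuple]:
    """Map each differing package to its (local version, container version)."""
    a, b = parse_freeze(local), parse_freeze(container)
    return {
        name: (a.get(name), b.get(name))  # None means "not installed"
        for name in sorted(a.keys() | b.keys())
        if a.get(name) != b.get(name)
    }
```

An empty result points the investigation toward OS-level differences (system libraries, locale, timezone) rather than Python packages.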
Li, X., Fei, Z., Ma, Y., Zhang, J., & Sarro, F. (2026). Environment-in-the-Loop: Rethinking Code Migration with LLM-based Agents. arXiv:2602.09944v1. https://arxiv.org/abs/2602.09944v1
Key takeaway: Section 3 describes the three-agent (Migration, Environment, Testsuite) feedback loop architecture. Figure 3 shows the full workflow. The diagnostic classification scheme (dependency conflict vs. API change vs. configuration drift vs. behavioral change) is the most directly actionable element for implementation.