skills/domains/cs/software-engineering-research/SKILL.md
Guide to software engineering research topics and methodologies
Navigate the landscape of software engineering research, including key subfields, methodologies, datasets, benchmarks, and top venues.
| Subfield | Key Topics | Major Venues |
|----------|-----------|-------------|
| Software Testing | Test generation, fuzzing, mutation testing, flaky tests | ISSTA, ICST, ASE |
| Program Analysis | Static analysis, abstract interpretation, symbolic execution | PLDI, POPL, OOPSLA |
| Software Maintenance | Code refactoring, technical debt, code smells, evolution | ICSME, MSR, SANER |
| SE for AI/ML | ML pipeline testing, data quality, model debugging | ICSE-SEIP, FSE |
| AI for SE | Code generation, bug detection, program repair | ICSE, FSE, ASE |
| Distributed Systems | Consensus, fault tolerance, scalability, microservices | SOSP, OSDI, EuroSys |
| Cybersecurity | Vulnerability detection, malware analysis, privacy | IEEE S&P, CCS, USENIX Security |
| HCI in SE | Developer tools, IDE usability, code comprehension | CHI, CSCW, VL/HCC |
| Empirical SE | Mining repositories, developer surveys, controlled experiments | ESEM, MSR, TOSEM |
Testing a specific hypothesis with treatment and control groups:
Example: Does AI code completion improve developer productivity?
Design:
- Participants: 60 professional developers
- Treatment: IDE with AI code completion enabled
- Control: IDE with AI code completion disabled
- Task: Complete 5 programming tasks of varying difficulty
- Metrics: Task completion time, code correctness, lines of code
- Analysis: Mixed-effects linear model with participant as random effect
Threats to validity:
- Internal: Learning effect (counterbalance task order)
- External: Lab setting may not reflect real development
- Construct: "Productivity" operationalized as speed + correctness
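
The counterbalancing mentioned under the internal threat can be sketched as follows. This is a minimal illustration, assuming a cyclic Latin square over the 5 tasks and an even split of the 60 participants across conditions; the task labels `T1`..`T5` and the helper names are hypothetical. A cyclic square balances the position of each task but not which task precedes which, so it is a starting point rather than a full design.

```python
import random
from collections import Counter


def latin_square_orders(tasks):
    """Return len(tasks) cyclic rotations; each task appears once per position."""
    n = len(tasks)
    return [[tasks[(row + col) % n] for col in range(n)] for row in range(n)]


def assign_participants(n_participants, tasks, seed=0):
    """Assign each participant a condition and a task order.

    Participant IDs are shuffled, then dealt round-robin over every
    (condition, order) combination so the cells stay balanced.
    """
    rng = random.Random(seed)
    ids = list(range(1, n_participants + 1))
    rng.shuffle(ids)
    orders = latin_square_orders(tasks)
    cells = [(cond, idx)
             for cond in ("treatment", "control")
             for idx in range(len(orders))]
    plan = {}
    for i, pid in enumerate(ids):
        cond, idx = cells[i % len(cells)]
        plan[pid] = {"condition": cond, "order": orders[idx]}
    return plan


tasks = ["T1", "T2", "T3", "T4", "T5"]  # hypothetical task labels
plan = assign_participants(60, tasks)

# Participants per condition, and participants per (condition, order) cell
condition_sizes = Counter(p["condition"] for p in plan.values())
cell_sizes = Counter((p["condition"], tuple(p["order"])) for p in plan.values())
print(condition_sizes)
print(sorted(set(cell_sizes.values())))
```

Fixing the seed keeps the assignment reproducible, which helps when the allocation has to be reported in the paper.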
Analyzing data from version control, issue trackers, code review systems:
```python
# Example: Analyze commit patterns using PyDriller
# pip install pydriller pandas
from datetime import datetime

import pandas as pd
from pydriller import Repository

repo_url = "https://github.com/apache/kafka"
commit_data = []

# Traverse every commit made during 2023.
# For a remote URL, PyDriller clones the repository to a temporary directory first.
for commit in Repository(repo_url, since=datetime(2023, 1, 1),
                         to=datetime(2023, 12, 31)).traverse_commits():
    commit_data.append({
        "hash": commit.hash[:8],
        "author": commit.author.name,
        "date": commit.committer_date,
        "files_changed": commit.files,  # number of files modified in the commit
        "insertions": commit.insertions,
        "deletions": commit.deletions,
        "message": commit.msg[:100]
    })

df = pd.DataFrame(commit_data)
print(f"Total commits in 2023: {len(df)}")
print(f"Unique contributors: {df['author'].nunique()}")
print(f"Avg files per commit: {df['files_changed'].mean():.1f}")
```
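
Once commits are collected, a common next step is to aggregate them per period. The sketch below uses only the standard library and a few hypothetical records in the same shape as the commit dictionaries above; the author names and numbers are made up for illustration. It approximates churn as insertions plus deletions, since Git reports a modified line as one insertion and one deletion.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical records shaped like the commit dictionaries collected above.
commits = [
    {"author": "alice", "date": datetime(2023, 1, 5), "insertions": 120, "deletions": 30},
    {"author": "bob", "date": datetime(2023, 1, 20), "insertions": 10, "deletions": 5},
    {"author": "alice", "date": datetime(2023, 2, 2), "insertions": 40, "deletions": 60},
]


def monthly_summary(records):
    """Group commits by calendar month; sum churn and collect distinct authors."""
    summary = defaultdict(lambda: {"commits": 0, "churn": 0, "authors": set()})
    for rec in records:
        month = rec["date"].strftime("%Y-%m")
        bucket = summary[month]
        bucket["commits"] += 1
        bucket["churn"] += rec["insertions"] + rec["deletions"]
        bucket["authors"].add(rec["author"])
    # Sort by month key so the output reads chronologically
    return dict(sorted(summary.items()))


for month, stats in monthly_summary(commits).items():
    print(month, stats["commits"], stats["churn"], len(stats["authors"]))
```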
In-depth investigation of a phenomenon in its real-world context:
Case Study Protocol (based on Yin, 2018):
1. Research questions: How do teams adopt microservices?
2. Unit of analysis: Development teams at 3 companies
3. Data sources:
   - Semi-structured interviews (8-12 per company)
   - Architecture documentation review
   - Commit history and deployment logs
   - Meeting observations
4. Analysis: Thematic analysis with cross-case comparison
5. Validity: Triangulation across data sources, member checking
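
The cross-case comparison in step 4 can be supported by a simple theme-by-company matrix. The sketch below is illustrative only: the companies, interviewee IDs, and themes are invented, and counting distinct interviewees per theme is one possible convention, not something the protocol prescribes.

```python
from collections import defaultdict

# Invented coded segments: (company, interviewee, theme). Purely illustrative.
coded_segments = [
    ("A", "A-01", "team autonomy"),
    ("A", "A-02", "team autonomy"),
    ("A", "A-02", "operational overhead"),
    ("B", "B-01", "operational overhead"),
    ("B", "B-02", "hiring"),
    ("B", "B-03", "legacy coupling"),
    ("C", "C-02", "team autonomy"),
    ("C", "C-02", "team autonomy"),  # same interviewee again: counted once
    ("C", "C-04", "legacy coupling"),
]


def cross_case_matrix(segments):
    """Map theme -> company -> number of distinct interviewees raising it."""
    seen = defaultdict(lambda: defaultdict(set))
    for company, interviewee, theme in segments:
        seen[theme][company].add(interviewee)
    return {
        theme: {company: len(people) for company, people in by_company.items()}
        for theme, by_company in seen.items()
    }


def recurring_themes(matrix, min_cases=2):
    """Themes that appear in at least `min_cases` companies."""
    return sorted(t for t, by_company in matrix.items() if len(by_company) >= min_cases)


matrix = cross_case_matrix(coded_segments)
companies = sorted({company for company, _, _ in coded_segments})
for theme in sorted(matrix):
    print(theme, [matrix[theme].get(c, 0) for c in companies])
print("Recurring:", recurring_themes(matrix))
```

Themes confined to a single company are still findings; the matrix only makes it easier to see which observations generalize across cases.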
| Benchmark | Task | Languages | Size |
|-----------|------|-----------|------|
| HumanEval | Code generation from docstrings | Python | 164 problems |
| MBPP | Code generation from descriptions | Python | 974 problems |
| SWE-bench | Real-world GitHub issue resolution | Python | 2,294 instances |
| CodeXGLUE | Multiple code tasks | 6 languages | Varies by task |
| BigCloneBench | Clone detection | Java | 6M clone pairs |
| Defects4J | Bug localization and repair | Java | 835 real bugs |
| Dataset | Content | Use Cases |
|---------|---------|-----------|
| GHTorrent | GitHub event data (commits, issues, PRs) | MSR studies |
| Software Heritage | Universal source code archive | Code evolution, provenance |
| Stack Overflow Data Dump | Q&A posts, tags, votes | Developer knowledge, NLP |
| CVE Database | Vulnerability records | Security research |
| Chrome/Firefox Bug Trackers | Bug reports, patches | Bug triage, severity prediction |
```python
# Example: Using tree-sitter for AST-level code analysis
from tree_sitter import Language, Parser
import tree_sitter_python as tspython

PYTHON_LANGUAGE = Language(tspython.language())
parser = Parser(PYTHON_LANGUAGE)

source_code = b"""
def fibonacci(n):
    if n <= 1:
        return n
    return fibonacci(n-1) + fibonacci(n-2)
"""

tree = parser.parse(source_code)
root = tree.root_node

def count_nodes(node, node_type):
    """Count AST nodes of a given type."""
    count = 1 if node.type == node_type else 0
    for child in node.children:
        count += count_nodes(child, node_type)
    return count

print(f"Function definitions: {count_nodes(root, 'function_definition')}")
print(f"If statements: {count_nodes(root, 'if_statement')}")
print(f"Return statements: {count_nodes(root, 'return_statement')}")
print(f"Function calls: {count_nodes(root, 'call')}")
```
```python
# Common software metrics
metrics = {
    "Lines of Code (LOC)": "Total lines (including blanks and comments)",
    "Cyclomatic Complexity": "Number of independent paths (McCabe, 1976)",
    "Halstead Volume": "Based on operators and operands count",
    "Maintainability Index": "Composite of LOC, CC, and Halstead",
    "Coupling Between Objects": "Number of other classes referenced",
    "Depth of Inheritance": "Levels in class hierarchy",
    "Code Churn": "Lines added + modified + deleted per period",
    "Comment Density": "Ratio of comment lines to total lines"
}

# Calculate cyclomatic complexity using radon
# pip install radon
import subprocess

# -s shows the complexity score, -j emits JSON
result = subprocess.run(
    ["radon", "cc", "my_module.py", "-s", "-j"],
    capture_output=True, text=True
)
print(result.stdout)
```
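
To make the cyclomatic complexity definition concrete, here is a rough standard-library sketch that counts decision points with Python's `ast` module. It is an approximation under stated assumptions, not radon's algorithm: it counts `if`, `for`, `while`, `except`, conditional expressions, and each extra operand of `and`/`or`, and it folds nested functions into their parent. The `classify` function is a made-up sample.

```python
import ast

# Node types treated as one decision point each in this approximation.
DECISION_NODES = (ast.If, ast.For, ast.While, ast.ExceptHandler, ast.IfExp)


def cyclomatic_complexity(func_node):
    """Approximate McCabe complexity: 1 + number of decision points."""
    complexity = 1
    for node in ast.walk(func_node):
        if isinstance(node, DECISION_NODES):
            complexity += 1
        elif isinstance(node, ast.BoolOp):
            # `a and b and c` adds two extra branches
            complexity += len(node.values) - 1
    return complexity


def complexity_by_function(source):
    """Return {function name: approximate complexity} for a source string."""
    tree = ast.parse(source)
    return {
        node.name: cyclomatic_complexity(node)
        for node in ast.walk(tree)
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))
    }


sample = '''
def fibonacci(n):
    if n <= 1:
        return n
    return fibonacci(n-1) + fibonacci(n-2)

def classify(n):
    if n < 0 and n % 2 == 0:
        return "negative even"
    for d in range(2, n):
        if n % d == 0:
            return "composite"
    return "other"
'''

scores = complexity_by_function(sample)
for name, score in scores.items():
    print(f"{name}: {score}")
```

Dedicated tools handle more constructs (comprehensions, `match` statements, and so on), so scores from this sketch may differ from theirs.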
| Venue | Type | Acceptance Rate | Focus |
|-------|------|-----------------|-------|
| ICSE | Conference | ~22% | Broad SE |
| FSE/ESEC | Conference | ~24% | Broad SE |
| ASE | Conference | ~22% | Automated SE |
| ISSTA | Conference | ~25% | Software testing |
| MSR | Conference | ~30% | Mining repositories |
| TOSEM | Journal | -- | Broad SE (ACM) |
| TSE | Journal | -- | Broad SE (IEEE) |
| EMSE | Journal | -- | Empirical SE (Springer) |
| Venue | Type | Focus |
|-------|------|-------|
| SOSP/OSDI | Conference | Operating systems, distributed systems |
| EuroSys | Conference | Systems (Europe) |
| NSDI | Conference | Networked systems design |
| IEEE S&P (Oakland) | Conference | Security and privacy |
| USENIX Security | Conference | Security |
| CCS | Conference | Computer and communications security |
| NDSS | Conference | Network and distributed systems security |
| Tool | Purpose | URL |
|------|---------|-----|
| PyDriller | Git repository mining (Python) | github.com/ishepard/pydriller |
| Radon | Python code metrics | github.com/rubik/radon |
| SonarQube | Multi-language static analysis | sonarqube.org |
| Understand | Code analysis and metrics | scitools.com |
| Joern | Code analysis platform (CPG) | joern.io |
| CodeQL | Semantic code analysis | codeql.github.com |
| tree-sitter | Incremental parsing library | tree-sitter.github.io |