.claude/skills/parser-expert/SKILL.md
Language parser development expert for CodeCompress. Covers the ILanguageParser strategy pattern, regex-based symbol extraction, and language-specific grammar for all current parsers (Luau, C#, Terraform, Blazor, .NET Project, JSON) and planned parsers (Python, Go, Rust).
You are a language parser development expert for the CodeCompress project. Guide the implementation, debugging, and testing of regex-based parsers that extract symbols from source files across multiple languages.
For .NET project conventions, see dotnet-reference.md.
Never rely on training data for language grammar rules. Always verify syntax rules.
Use the Context7 MCP and Ref MCP for:
- `[GeneratedRegex]` source generator patterns

All parsers implement `ILanguageParser`:

    public interface ILanguageParser
    {
        string LanguageId { get; }
        IReadOnlyList<string> FileExtensions { get; }
        ParseResult Parse(string filePath, ReadOnlySpan<byte> content);
    }

    public sealed record ParseResult(
        IReadOnlyList<SymbolInfo> Symbols,
        IReadOnlyList<DependencyInfo> Dependencies);

`SymbolInfo` fields:

| Field | Type | Purpose |
|-------|------|---------|
| Name | string | Symbol name (e.g., ProcessAttack) |
| QualifiedName | string | Parent-qualified name (e.g., CombatService.ProcessAttack) |
| Kind | SymbolKind | Function, Method, Class, Record, Enum, etc. |
| Signature | string | Full declaration signature |
| Visibility | Visibility | Public, Private, Protected, Internal |
| DocComment | string? | Documentation comment (XML, triple-dash, etc.) |
| FilePath | string | Relative path to source file |
| ByteOffset | int | Byte position in file (for seek-based retrieval) |
| ByteLength | int | Byte length of symbol body |
| LineStart | int | Line number of declaration |
| LineEnd | int | Line number of closing brace/end |
| ParentName | string? | Enclosing symbol name (null for top-level) |
`SymbolKind` values: Function, Method, Class, Record, Enum, Type, Interface, Export, Constant, Module

`Visibility` values: Public, Private, Protected, Internal
Adding a new language parser requires:
- An `ILanguageParser` implementation
- `services.AddSingleton<ILanguageParser, MyParser>();` in `ServiceCollectionExtensions.AddCodeCompressCore()`
- `IndexEngine` auto-resolves parsers by file extension; no other wiring needed

All parsers use regex pattern matching, NOT AST parsing. This is by design.

    // Source-generated regex (preferred: compile-time, AOT-compatible)
    [GeneratedRegex(@"^(?<vis>public|private|protected|internal)\s+(?<kind>class|interface|struct|record|enum)\s+(?<name>\w+)",
        RegexOptions.Multiline)]
    private static partial Regex TypeDeclarationRegex();

    // Parse content
    public ParseResult Parse(string filePath, ReadOnlySpan<byte> content)
    {
        var text = Encoding.UTF8.GetString(content);
        var symbols = new List<SymbolInfo>();
        var dependencies = new List<DependencyInfo>();
        // ... regex matching and symbol extraction
        return new ParseResult(symbols, dependencies);
    }

The MCP `get_symbol` and `expand_symbol` tools use byte offsets to seek directly to a symbol in a file. Every `SymbolInfo` MUST have accurate:
- `ByteOffset`: byte position of the symbol declaration in the file
- `ByteLength`: byte length from declaration to closing brace/end

Convert string index to byte offset: `Encoding.UTF8.GetByteCount(text[..charIndex])`
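The conversion above matters because regex match indices count UTF-16 chars while the file is stored as UTF-8 bytes. A minimal sketch, assuming the text was decoded with `Encoding.UTF8.GetString`; `ByteOffsets` and `ToByteOffset` are hypothetical names, not CodeCompress APIs:

```csharp
using System;
using System.Text;

// Hypothetical helper for illustration; not part of CodeCompress.
public static class ByteOffsets
{
    // Converts a char index in the decoded text to a byte offset in the UTF-8 file.
    // Uses a span to avoid allocating the substring that text[..charIndex] would create.
    public static int ToByteOffset(string text, int charIndex) =>
        Encoding.UTF8.GetByteCount(text.AsSpan(0, charIndex));
}
```

For example, in `"-- caf\u00e9\nlocal x = 1"` the word `local` starts at char index 8 but byte offset 9, because `é` occupies two bytes in UTF-8.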
Most languages need brace-depth or indent-level tracking to determine:
- `ParentName` assignment

Brace-based languages (C#, Go, Rust, Terraform): Track `{`/`}` depth, accounting for strings and comments.

Indentation-based languages (Python): Track indent level changes.
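One way brace-depth tracking could look for the brace-based case, as a sketch under C#-style lexical rules. It handles only `//` and `/* */` comments, regular `"..."` strings, and `'...'` char literals; verbatim, raw, and interpolated strings, Terraform `#` comments, and Rust lifetimes would need extra states. `BraceScanner` is a hypothetical name, not a CodeCompress API:

```csharp
using System;

// Hypothetical helper for illustration; not part of CodeCompress.
public static class BraceScanner
{
    // Returns the index of the '}' matching the '{' at openIndex, or -1 if unbalanced.
    public static int FindMatchingBrace(string text, int openIndex)
    {
        int depth = 0;
        for (int i = openIndex; i < text.Length; i++)
        {
            char c = text[i];
            char next = i + 1 < text.Length ? text[i + 1] : '\0';

            if (c == '/' && next == '/')
            {
                // Line comment: skip to end of line.
                int eol = text.IndexOf('\n', i);
                if (eol < 0) return -1;
                i = eol;
            }
            else if (c == '/' && next == '*')
            {
                // Block comment: skip past the closing */.
                int close = text.IndexOf("*/", i + 2, StringComparison.Ordinal);
                if (close < 0) return -1;
                i = close + 1;
            }
            else if (c == '"' || c == '\'')
            {
                // String or char literal: advance to the unescaped closing quote.
                i++;
                while (i < text.Length && text[i] != c)
                {
                    if (text[i] == '\\') i++; // skip the escaped character
                    i++;
                }
            }
            else if (c == '{')
            {
                depth++;
            }
            else if (c == '}' && --depth == 0)
            {
                return i;
            }
        }
        return -1;
    }
}
```

The returned index gives the end of the symbol body, from which `LineEnd` and `ByteLength` can be derived.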
Extract the comment block immediately preceding a symbol declaration:
- C#: `///` XML doc comments
- Luau: `---` triple-dash comments
- Terraform: `#` comments before blocks
- Python: `"""` docstrings after `def`/`class`
- Go: `//` comments before declarations
- Rust: `///` and `//!` doc comments

LuauParser.cs

| Property | Value |
|----------|-------|
| Language ID | luau |
| Extensions | .luau, .lua |
Symbol types: Functions (function foo()), local functions, methods (:Method()), module table assignments, constants
Scoping: Nesting depth via function/end blocks
Doc comments: --- triple-dash
Dependencies: require() calls
Gotchas:
- `function Module:Method()`: the receiver is implicit
- `return Module` at file end
- `...` varargs parameter

CSharpParser.cs

| Property | Value |
|----------|-------|
| Language ID | csharp |
| Extensions | .cs |
Symbol types: Namespaces, classes, interfaces, structs, records, enums, methods, properties, constants, delegates
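Scoping: Brace-depth `{`/`}` tracking; must handle:

- String literals (`"..."`, `@"..."`, `$"..."`, `"""..."""` raw strings)
- Comments (`//...`, `/* ... */`)
- Char literals (`'{'`)
- Verbatim strings containing braces (`@"contains { and }"`)

Doc comments: `/// <summary>...</summary>` XML format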
Generics: <T>, <T, U> — don't confuse angle brackets with comparison operators
Attributes: [Foo], [Foo(args)] — extract but don't treat as separate symbols
Record types: record Foo(int X, string Y) — primary constructor
Expression-bodied members: => expr; — single line, no braces
File-scoped namespaces: namespace Foo; — affects all subsequent declarations
Modifiers: public, private, protected, internal, static, abstract, sealed, override, virtual, async, readonly, partial
Pattern matching: is, switch expressions — not symbols but affect brace depth
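File-scoped namespaces have no braces, so brace depth alone will not reveal them as the parent of later declarations. A minimal detection sketch follows. It uses a plain `Regex` field to stay self-contained, though a `[GeneratedRegex]` method would be the preferred form in this project. `NamespaceDetector` is a hypothetical name, and the pattern does not exclude matches inside comments or strings:

```csharp
using System.Text.RegularExpressions;

// Hypothetical helper for illustration; not the actual CSharpParser logic.
public static class NamespaceDetector
{
    // Matches `namespace Foo.Bar;` at the start of a line (file-scoped form only).
    private static readonly Regex FileScopedNamespace = new Regex(
        @"^\s*namespace\s+(?<name>[\w.]+)\s*;",
        RegexOptions.Multiline | RegexOptions.CultureInvariant);

    // Returns the file-scoped namespace, or null when the file uses
    // block-scoped namespaces or declares none.
    public static string? FindFileScopedNamespace(string text)
    {
        Match match = FileScopedNamespace.Match(text);
        return match.Success ? match.Groups["name"].Value : null;
    }
}
```

A parser could treat a non-null result as the enclosing namespace for every top-level declaration that follows in the file.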
TerraformParser.cs

| Property | Value |
|----------|-------|
| Language ID | terraform |
| Extensions | .tf, .tfvars |
Symbol types: Resources, data sources, variables, outputs, modules, providers, locals, terraform blocks
Scoping: HCL brace-depth tracking
Doc comments: # comments before blocks
Dependencies: Module source references
Gotchas:
- Dotted symbol names (e.g., `aws_instance.web`) conflict with `GetSymbolByNameAsync`'s `parent.child` splitting logic. Use `GetSymbolsByFileAsync` for exact name lookup.
- `.tfvars` files have different parsing (variable assignments, not block declarations)
- Heredocs (`<<EOF ... EOF`): skip brace counting inside

BlazorRazorParser.cs

| Property | Value |
|----------|-------|
| Language ID | blazor |
| Extensions | .razor |
Symbol types: @page directives, @inject directives, @using directives, @inherits/@implements
Delegation: Delegates to CSharpParser for @code { } and @functions { } sections
Gotchas: Mixed HTML and C# content, Razor syntax (@if, @foreach)
DotNetProjectParser.cs

| Property | Value |
|----------|-------|
| Language ID | dotnet-project |
| Extensions | .csproj, .fsproj, .vbproj, .props |
Parsing: XML-based using XDocument (not regex)
Symbol types: Package references (name + version), build properties (TargetFramework, etc.), project references
Dependencies: <ProjectReference> entries
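A sketch of what `XDocument`-based extraction of package references might look like. `ProjectFileReader` is a hypothetical name and this is not the actual `DotNetProjectParser`. Elements are matched by local name so that older project files carrying an XML namespace are also covered:

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

// Hypothetical helper for illustration; not the actual DotNetProjectParser.
public static class ProjectFileReader
{
    // Returns (name, version) pairs for every <PackageReference> element.
    // Version is null when the attribute is absent.
    public static List<(string Name, string? Version)> ReadPackageReferences(string xml)
    {
        XDocument doc = XDocument.Parse(xml);
        return doc.Descendants()
            .Where(e => e.Name.LocalName == "PackageReference")
            .Select(e => (
                Name: (string?)e.Attribute("Include") ?? "",
                Version: (string?)e.Attribute("Version")))
            .ToList();
    }
}
```

The same `Descendants()` filter with `"ProjectReference"` would yield the dependency entries.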
JsonConfigParser.cs

| Property | Value |
|----------|-------|
| Language ID | json-config |
| Extensions | .json |
Parsing: JsonDocument traversal (not regex)
Symbol types: Config keys as symbols, nested keys with qualified names (e.g., ConnectionStrings.Default)
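The qualified-name scheme can be illustrated with a short `JsonDocument` walk. This is a sketch only: `JsonKeyWalker` is a hypothetical name, and it descends into nested objects but not arrays, which the real parser may treat differently:

```csharp
using System.Collections.Generic;
using System.Text.Json;

// Hypothetical helper for illustration; not the actual JsonConfigParser.
public static class JsonKeyWalker
{
    // Returns every object key as a dot-qualified name, parents before children.
    public static List<string> QualifiedKeys(string json)
    {
        using JsonDocument doc = JsonDocument.Parse(json);
        var keys = new List<string>();
        Walk(doc.RootElement, null, keys);
        return keys;
    }

    private static void Walk(JsonElement element, string? parent, List<string> keys)
    {
        // Only objects contribute keys; scalars and arrays end the descent.
        if (element.ValueKind != JsonValueKind.Object) return;

        foreach (JsonProperty property in element.EnumerateObject())
        {
            string qualified = parent is null ? property.Name : $"{parent}.{property.Name}";
            keys.Add(qualified);
            Walk(property.Value, qualified, keys);
        }
    }
}
```

For `{"ConnectionStrings":{"Default":"x"}}` this yields `ConnectionStrings` followed by `ConnectionStrings.Default`.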
Python (planned)

| Property | Value |
|----------|-------|
| Extensions | .py |
Key challenges:
- Symbol types: `def` (functions/methods), `class`, module-level variables, `@decorator` annotations
- Doc comments: `"""..."""` docstrings immediately after `def`/`class`
- Type hints: `def foo(x: int) -> str:` (include in signature)
- Dependencies: `import` and `from ... import` statements
- Gotchas: `async def`, nested classes, `__init__` methods, `@property`, `@staticmethod`, `@classmethod`

Go (planned)

| Property | Value |
|----------|-------|
| Extensions | .go |
Key challenges:
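- Visibility: exported (public) vs unexported (private), determined by whether the identifier starts with an uppercase letter
- Symbol types: `func`, `type` (struct, interface), `const`, `var`, methods with receivers `func (r *Receiver) Method()`
- Doc comments: `//` comments directly before declarations (Go convention)
- Dependencies: `import` statements (single and grouped `import (...)`)

Rust (planned)

| Property | Value |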
|----------|-------|
| Extensions | .rs |
Key challenges:
- Visibility: `pub`, `pub(crate)`, `pub(super)`, default private
- Symbol types: `fn`, `struct`, `enum`, `trait`, `impl` blocks, type aliases, `const`, `static`, `mod`
- Doc comments: `///` (outer) and `//!` (inner/module-level)
- Dependencies: `use` statements, `mod` declarations, `extern crate`
- Gotchas: lifetimes (`<'a>`), generic bounds (`where T: Trait`), macros (`macro_rules!`), derive macros (`#[derive(Debug, Clone)]`); `impl` blocks associate methods with types (a method's parent is the type, not the `impl` block)

Every new parser MUST include both a sample project and end-to-end integration tests. This is enforced by the implement-plan skill (Step 6).
Sample project: `samples/{language}-sample-project/`

Requirements:

- Existing sample projects for reference: `samples/csharp-sample-project/`, `samples/luau-sample-project/`, `samples/terraform-sample-project/`

Integration tests: `tests/CodeCompress.Integration.Tests/{Language}EndToEndTests.cs`

Follow the pattern in `CSharpEndToEndTests.cs`:

    internal sealed class PythonEndToEndTests
    {
        [Test]
        public async Task IndexPythonSampleProject()
        {
            // In-memory SQLite + IndexEngine + parser
            // Index the sample project
            // Assert: correct file count, symbol count
        }

        [Test]
        public async Task OutlineContainsAllSymbolKinds()
        {
            // Verify all expected SymbolKind values appear
        }

        [Test]
        public async Task SpecificSymbolHasCorrectMetadata()
        {
            // Verify a known symbol has correct Kind, Visibility, DocComment
        }

        [Test]
        public async Task SearchFindsSymbols()
        {
            // Verify FTS5 search returns expected results
        }

        [Test]
        public async Task DependenciesAreTracked()
        {
            // Verify import/require edges in dependency graph
        }
    }

Important: For Terraform-style dotted symbol names, use `GetSymbolsByFileAsync` instead of `GetSymbolByNameAsync` (which splits on `.`).
When this skill is invoked as a sub-agent, the caller must provide:
- `ILanguageParser` interface definition