csv-excel-merger/SKILL.md
Merge multiple CSV/Excel files with intelligent column matching, data deduplication, and conflict resolution. Handles different schemas, formats, and combines data sources. Use when users need to merge spreadsheets, combine data exports, or consolidate multiple files into one.
npx skillsauth add onewave-ai/claude-skills csv-excel-merger
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
Merge multiple CSV or Excel files with automatic column matching, deduplication, and conflict resolution.
references/merge_strategies.md: column matching, conflict resolution, and dedup options
references/output_template.md: the merge-report format
Inspect the inputs. Determine file count, format (CSV / Excel / TSV), and whether the files are attached or read from disk. Read each header; identify column names, data types, and encoding (UTF-8, Latin-1). Note the candidate primary key.
Plan the merge. Match columns across files to one unified schema, choose a conflict-resolution rule, and pick a deduplication strategy. See references/merge_strategies.md for the matching heuristics and the full set of options.
Execute the merge with pandas:
import pandas as pd
df1 = pd.read_csv("file1.csv")
df2 = pd.read_csv("file2.csv")
# Normalize, then map column names onto the unified schema
for df in (df1, df2):
    df.columns = df.columns.str.lower().str.strip()
df2 = df2.rename(columns={"firstname": "first_name", "e_mail": "email"})
merged = pd.concat([df1, df2], ignore_index=True)
merged = merged.drop_duplicates(subset=["email"], keep="last")
merged.to_csv("merged_output.csv", index=False)
Verify the result before reporting — see Verification.
Report using the layout in references/output_template.md, then offer export options: CSV (UTF-8), Excel (.xlsx), JSON, SQL INSERT statements, or Parquet for large datasets.
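The column matching in the planning step can be sketched with the standard library alone. This is an illustrative heuristic, not the contents of references/merge_strategies.md: the helper names `normalize` and `match_columns` and the 0.8 similarity cutoff are assumptions made for this example.

```python
import difflib


def normalize(name: str) -> str:
    """Lowercase and drop separators so 'E_Mail' and 'email' compare equal."""
    return "".join(ch for ch in name.lower() if ch.isalnum())


def match_columns(source_cols, target_cols, cutoff=0.8):
    """Map each source column to its closest unified-schema column, or None.

    Hypothetical helper: the cutoff is an assumed threshold, tune it per dataset.
    """
    targets = {normalize(t): t for t in target_cols}
    mapping = {}
    for col in source_cols:
        hits = difflib.get_close_matches(
            normalize(col), list(targets), n=1, cutoff=cutoff
        )
        mapping[col] = targets[hits[0]] if hits else None
    return mapping


mapping = match_columns(
    ["FirstName", "E_Mail", "Phone"],       # headers found in one input file
    ["first_name", "email", "company"],     # unified schema
)
print(mapping)
```

Columns that map to `None` have no confident match; it is usually safer to show those to the user than to guess, and to pass only the confirmed pairs to `DataFrame.rename`.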
Never hand back a merge without checking it. After merging, assert the row math holds and the key is actually unique:
total_in = len(df1) + len(df2)
assert len(merged) > 0, "merge produced an empty frame"
assert len(merged) <= total_in, "more rows than inputs — check the concat/join"
assert merged["email"].is_unique, "duplicate keys remain after dedup"
print(f"in: {total_in} rows | out: {len(merged)} rows | removed: {total_in - len(merged)}")
print(f"null keys: {merged['email'].isna().sum()} | columns: {list(merged.columns)}")
Report rows in vs. out, duplicates removed, and per-column completeness so the user can sanity-check the numbers against their own expectations.
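Per-column completeness can be computed in one line. A minimal sketch, assuming a small hypothetical `merged` frame that stands in for the real merge result:

```python
import pandas as pd

# Hypothetical merged frame standing in for the real merge result
merged = pd.DataFrame({
    "email": ["a@x.com", "b@x.com", "c@x.com", "d@x.com"],
    "first_name": ["Ann", None, "Cy", "Dee"],
    "company": [None, None, "Acme", "Acme"],
})

# Share of non-null values per column, as a percentage
completeness = (merged.notna().mean() * 100).round(1)
print(completeness.to_string())
```

A column with unexpectedly low completeness often points to a header that was not mapped onto the unified schema, so its values landed in a separate column.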
Composite keys: when no single column identifies a row, dedupe on several, e.g. subset=["email", "company"].
Large files: read in chunks (pd.read_csv(path, chunksize=...)), report progress, and estimate memory before loading everything at once.
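Chunked reading and a composite key can be combined roughly as follows. This is a sketch under stated assumptions: `merge_in_chunks` is a hypothetical helper, the inline CSV text replaces a real file so the example runs as-is, and the tiny chunk size is only there to force more than one chunk.

```python
import io

import pandas as pd

# Inline sample standing in for a large file on disk (illustrative data)
csv_text = (
    "email,company,plan\n"
    "a@x.com,Acme,free\n"
    "b@x.com,Acme,pro\n"
    "a@x.com,Acme,pro\n"
    "a@x.com,Globex,free\n"
)


def merge_in_chunks(source, key, chunksize=2):
    """Read `source` in chunks, keeping only the last row seen per key."""
    kept = None
    rows_read = 0
    for chunk in pd.read_csv(source, chunksize=chunksize):
        rows_read += len(chunk)
        kept = chunk if kept is None else pd.concat([kept, chunk], ignore_index=True)
        # Dedupe after every chunk so memory tracks unique keys, not total rows
        kept = kept.drop_duplicates(subset=key, keep="last")
        print(f"read {rows_read} rows | kept {len(kept)}")
    return kept.reset_index(drop=True)


result = merge_in_chunks(io.StringIO(csv_text), key=["email", "company"])
# Rough in-memory size of the result, in bytes
print(result.memory_usage(deep=True).sum())
```

Deduplicating inside the loop bounds memory by the number of distinct keys, which helps only when the file contains many duplicates; for mostly unique data the full result still has to fit in memory.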