.claude/skills/ts-data-analysis/SKILL.md
Analyze spreadsheet data, generate insights, and create visualizations. Use when a user asks to analyze data, explore a dataset, find trends, generate statistics, create charts from CSV or Excel data, summarize data, or answer questions about tabular data.
npx skillsauth add eliferjunior/Claude data-analysis
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
Analyze tabular data from CSV, Excel, or other structured formats. Generate summary statistics, discover patterns, answer specific questions, and produce visualizations. Uses Python with pandas for data manipulation and matplotlib/seaborn for charts.
When a user asks you to analyze data, follow this process:
Ensure required Python packages are available:
python3 -c "import pandas; import matplotlib; import seaborn; print('All packages available')" 2>/dev/null || \
pip install pandas matplotlib seaborn openpyxl
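The same availability check can also be run from inside Python, which avoids relying on shell behavior. This is a minimal sketch, assuming the four package names from the install command above; `missing_packages` and `REQUIRED` are hypothetical names, not part of the skill.

```python
import importlib.util

# Packages named in the install command above.
REQUIRED = ["pandas", "matplotlib", "seaborn", "openpyxl"]

def missing_packages(names):
    """Return the subset of top-level package `names` that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]

missing = missing_packages(REQUIRED)
if missing:
    print("Install with: pip install " + " ".join(missing))
else:
    print("All packages available")
```

`find_spec` only locates a package without importing it, so the check stays fast even for heavy libraries.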
Read the file and understand its structure:
import pandas as pd
# Load the data (choose the reader that matches the file format)
df = pd.read_csv("data.csv") # For CSV
# df = pd.read_excel("data.xlsx") # For Excel
# df = pd.read_json("data.json") # For JSON
# Initial inspection
print(f"Shape: {df.shape[0]} rows x {df.shape[1]} columns")
print(f"\nColumns: {list(df.columns)}")
print(f"\nData types:\n{df.dtypes}")
print(f"\nFirst 5 rows:\n{df.head()}")
print(f"\nMissing values:\n{df.isnull().sum()}")
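The snippet above selects the reader by hand. If the format should be picked automatically, one option is to dispatch on the file extension. The following is a sketch under that assumption; `load_table` and `READERS` are hypothetical names introduced for illustration.

```python
from pathlib import Path

import pandas as pd

# Map file extensions to pandas readers; extend as needed.
READERS = {
    ".csv": pd.read_csv,
    ".tsv": lambda p: pd.read_csv(p, sep="\t"),
    ".xlsx": pd.read_excel,  # requires openpyxl
    ".json": pd.read_json,
}

def load_table(path):
    """Load a tabular file, choosing the reader from the file extension."""
    suffix = Path(path).suffix.lower()
    if suffix not in READERS:
        raise ValueError(f"Unsupported file type: {suffix or '(none)'}")
    return READERS[suffix](path)
```

A call such as `df = load_table("data.csv")` then replaces the commented-out alternatives. Extension-based dispatch is only a heuristic: a mislabeled file will still fail inside the chosen reader.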
Report the dataset structure (shape, column names, data types, missing values) to the user before proceeding. Then compute summary statistics:
# Numeric summary
print(df.describe())
# Categorical summary
for col in df.select_dtypes(include='object').columns:
    print(f"\n{col} - unique values: {df[col].nunique()}")
    print(df[col].value_counts().head(10))
# Correlations between numeric columns
print(df.select_dtypes(include='number').corr())
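A full correlation matrix is hard to read once there are more than a few columns. One way to surface the interesting pairs is to list only those above a cutoff. This is a sketch; the helper name `strong_correlations` and the 0.7 threshold are assumptions chosen for illustration, not values from the skill.

```python
import pandas as pd

def strong_correlations(df, threshold=0.7):
    """List column pairs whose absolute Pearson correlation meets `threshold`."""
    corr = df.select_dtypes(include="number").corr()
    cols = list(corr.columns)
    pairs = []
    for i, a in enumerate(cols):
        for b in cols[i + 1:]:  # upper triangle only, so no self-pairs or duplicates
            r = corr.loc[a, b]
            if pd.notna(r) and abs(r) >= threshold:
                pairs.append((a, b, round(float(r), 3)))
    # Strongest relationships first.
    return sorted(pairs, key=lambda p: abs(p[2]), reverse=True)

# Hypothetical toy data.
demo = pd.DataFrame({
    "x": [1, 2, 3, 4, 5],
    "y": [2, 4, 6, 8, 10],  # exactly 2 * x
    "z": [5, 1, 4, 2, 3],   # only weakly related to x
})
print(strong_correlations(demo))
```

Only the `x`/`y` pair passes the cutoff here; a strong correlation is a prompt for further investigation, not evidence of causation.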
Based on the user's request, perform targeted analysis:
Filtering and aggregation:
# Group by category and compute stats
summary = df.groupby('category').agg(
    count=('id', 'count'),
    avg_value=('value', 'mean'),
    total=('value', 'sum')
).sort_values('total', ascending=False)
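To make the aggregation concrete, here is the same pattern run on a small made-up table, extended with each category's share of the total. The data and the `share_pct` column are assumptions for illustration; only the column names `id`, `category`, and `value` come from the snippet above.

```python
import pandas as pd

# Hypothetical toy data using the column names from the snippet above.
df = pd.DataFrame({
    "id": [1, 2, 3, 4, 5],
    "category": ["A", "B", "A", "C", "B"],
    "value": [10.0, 20.0, 30.0, 5.0, 35.0],
})

summary = df.groupby("category").agg(
    count=("id", "count"),
    avg_value=("value", "mean"),
    total=("value", "sum"),
).sort_values("total", ascending=False)

# Each category's share of the grand total, in percent.
summary["share_pct"] = (summary["total"] / summary["total"].sum() * 100).round(1)
print(summary)
```

Shares of this kind are what back a statement such as "Electronics is the top category (42% of revenue)" in a written summary.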
Time series analysis:
df['date'] = pd.to_datetime(df['date'])
# 'M' bins by month end; pandas 2.2+ prefers the alias 'ME' and warns on 'M'
monthly = df.resample('M', on='date').agg({'revenue': 'sum', 'orders': 'count'})
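Once revenue is aggregated by month, period-over-period growth follows from `pct_change`. The sketch below uses made-up order data and groups by calendar month with `dt.to_period`, which is a swapped-in alternative to `resample` rather than the skill's own method; the `orders` frame and `growth_pct` column are hypothetical.

```python
import pandas as pd

# Hypothetical order-level data, one row per order.
orders = pd.DataFrame({
    "date": pd.to_datetime(
        ["2024-01-05", "2024-01-20", "2024-02-10", "2024-03-15", "2024-03-28"]
    ),
    "revenue": [100.0, 100.0, 250.0, 150.0, 150.0],
})

# Group by calendar month using monthly periods.
monthly = orders.groupby(orders["date"].dt.to_period("M")).agg(
    revenue=("revenue", "sum"),
    orders=("revenue", "count"),
)

# Month-over-month revenue growth in percent; the first month has no baseline.
monthly["growth_pct"] = (monthly["revenue"].pct_change() * 100).round(1)
print(monthly)
```

The first row's growth is NaN by construction, so a report should state the baseline month explicitly instead of treating it as zero growth.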
Outlier detection:
Q1 = df['value'].quantile(0.25)
Q3 = df['value'].quantile(0.75)
IQR = Q3 - Q1
outliers = df[(df['value'] < Q1 - 1.5*IQR) | (df['value'] > Q3 + 1.5*IQR)]
print(f"Found {len(outliers)} outliers")
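The same IQR rule can be wrapped in a function so it applies to any numeric column. A minimal sketch, assuming the conventional multiplier of 1.5 used above; `iqr_outliers` is a hypothetical helper name and the sample values are made up.

```python
import pandas as pd

def iqr_outliers(series, k=1.5):
    """Return the values of `series` outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, q3 = series.quantile(0.25), series.quantile(0.75)
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return series[(series < lower) | (series > upper)]

# Hypothetical sample with one obvious outlier.
values = pd.Series([10, 12, 11, 13, 12, 11, 95])
print(iqr_outliers(values).tolist())
```

If Q1 equals Q3, the IQR is zero and every value different from that quartile is flagged, so it is worth checking the spread before trusting the count.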
Save charts as image files:
import matplotlib.pyplot as plt
import seaborn as sns
plt.style.use('seaborn-v0_8-darkgrid')
fig, ax = plt.subplots(figsize=(10, 6))
# Bar chart
df.groupby('category')['revenue'].sum().plot(kind='bar', ax=ax)
ax.set_title('Revenue by Category')
ax.set_ylabel('Revenue ($)')
plt.tight_layout()
plt.savefig('chart_revenue_by_category.png', dpi=150)
plt.close()
# Line chart for trends
fig, ax = plt.subplots(figsize=(10, 6))
monthly['revenue'].plot(ax=ax)
ax.set_title('Monthly Revenue Trend')
plt.tight_layout()
plt.savefig('chart_monthly_trend.png', dpi=150)
plt.close()
# Correlation heatmap
fig, ax = plt.subplots(figsize=(8, 6))
sns.heatmap(df.select_dtypes(include='number').corr(), annot=True, cmap='coolwarm', ax=ax)
plt.tight_layout()
plt.savefig('chart_correlation.png', dpi=150)
plt.close()
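When the analysis runs in a terminal or on a server without a display, selecting a non-interactive backend before importing `pyplot` keeps chart saving from failing. The sketch below also folds the repeated create, save, and close steps into one function; `save_bar_chart` is a hypothetical helper and the revenue figures are made up.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend: renders to files, no display needed
import matplotlib.pyplot as plt
import pandas as pd

def save_bar_chart(series, title, path, ylabel=""):
    """Draw `series` as a bar chart, write it to `path`, and return the path."""
    fig, ax = plt.subplots(figsize=(10, 6))
    series.plot(kind="bar", ax=ax)
    ax.set_title(title)
    ax.set_ylabel(ylabel)
    fig.tight_layout()
    fig.savefig(path, dpi=150)
    plt.close(fig)  # release the figure so repeated calls do not accumulate memory
    return path

# Hypothetical data for illustration only.
revenue = pd.Series({"Electronics": 420, "Home": 310, "Toys": 270})
print(save_bar_chart(revenue, "Revenue by Category",
                     "chart_revenue_by_category.png", "Revenue ($)"))
```

Returning the path makes it easy to list the generated files in the final summary.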
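Summarize the analysis with the key findings, the figures that support them, and pointers to the saved charts, as the examples below illustrate: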
User request: "Analyze this sales CSV and tell me what's trending"
Actions:
Load sales.csv and inspect it (columns: date, product, category, quantity, price, region)
Output: "Sales totaled $2.4M over 12 months. Revenue grew 15% quarter-over-quarter. Electronics is the top category (42% of revenue). The Southwest region underperforms at 8% of total sales despite covering 20% of stores. See the attached charts for trends."
User request: "Compare last year's performance to this year"
Actions:
Output: A comparison table showing each metric with last year vs. this year values, percentage change, and a trend indicator.
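A comparison table of this shape could be built as follows. This is a sketch with made-up metrics; `compare_periods`, the column names, and the text trend labels are assumptions, and a real analysis would first aggregate each year's rows into one value per metric.

```python
import pandas as pd

def compare_periods(last_year, this_year):
    """Build a metric-by-metric comparison of two periods."""
    table = pd.DataFrame({"last_year": last_year, "this_year": this_year})
    table["pct_change"] = (
        (table["this_year"] - table["last_year"]) / table["last_year"] * 100
    ).round(1)
    # Simple text indicator; swap in arrows if the output medium supports them.
    table["trend"] = table["pct_change"].apply(
        lambda p: "up" if p > 0 else ("down" if p < 0 else "flat")
    )
    return table

# Hypothetical metrics for illustration only.
last_year = pd.Series({"revenue": 1000.0, "orders": 400.0, "avg_order": 2.5})
this_year = pd.Series({"revenue": 1200.0, "orders": 380.0, "avg_order": 2.5})
print(compare_periods(last_year, this_year))
```

A metric whose baseline is zero yields an infinite or undefined percentage change, so such rows are better reported as absolute differences.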
User request: "Are there any anomalies in this server metrics CSV?"
Actions:
Output: "Found 3 anomaly windows: Jan 15 2-4pm (CPU 95%, errors 12x normal), Feb 3 11am (memory spike to 98%), Mar 20 all day (elevated response times). The Jan 15 event correlates with a deployment at 1:55pm."
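For time-ordered metrics like these, one common approach is a rolling z-score: compare each reading with the mean and standard deviation of the readings just before it. The sketch below is one possible technique, not necessarily the one behind the output above; `rolling_anomalies`, the window of 5, the threshold of 3.0, and the CPU readings are all assumptions for illustration.

```python
import pandas as pd

def rolling_anomalies(series, window=5, threshold=3.0):
    """Flag points more than `threshold` standard deviations away from
    the mean of the preceding `window` observations."""
    # shift(1) keeps the current point out of its own baseline.
    baseline = series.shift(1).rolling(window)
    z = (series - baseline.mean()) / baseline.std()
    return series[z.abs() > threshold]

# Hypothetical CPU readings (percent) at one-minute intervals.
cpu = pd.Series(
    [40, 42, 41, 43, 39, 41, 40, 95, 42, 41],
    index=pd.date_range("2024-01-15 14:00", periods=10, freq="min"),
)
print(rolling_anomalies(cpu))
```

The first `window` points have no baseline and are never flagged, and a spike inflates the baseline for the readings that follow it, so sustained incidents may need a longer window or a robust statistic such as the median.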