skills/famou-data-analysis/SKILL.md
A data analysis skill for understanding datasets, analyzing data, building data processing pipelines, and summarizing analytical results. Use this skill when the user mentions "analyze data", "data processing", "data exploration", "statistical analysis", "data cleaning", "data summarization", "create a data report", "understand this dataset", or "take a look at this CSV/Excel/dataset". Even if the user simply says "help me look at this data" or "analyze this", trigger this skill whenever the context involves a data file or dataset. Also invoke this skill if data analysis is required during Famou problem definition.
Install this skill globally with one command: `npx skillsauth add baidubce/skills famou-data-analysis`. Works with Claude Code, Cursor, and Windsurf.
The core objectives of a data analysis task are:
Always follow these constraints throughout the analysis:
After receiving data, prioritize understanding the context: What is the business scenario? What is the user's analytical goal? This determines which columns matter, what counts as "anomalous", and how to handle missing values.
Don't just output statistics — explain what they mean. "The mean is 3200" is less useful than "The average order value is approximately $3,200, but the median is only $1,800, suggesting a small number of high-value orders are inflating the mean."
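As a minimal sketch of turning statistics into an interpretation, the helper below compares mean and median and words the result as a sentence. The function name `describe_center`, the 1.25 and 0.8 ratio thresholds, and the sample amounts are all assumptions made for illustration, not part of the skill itself.

```python
import statistics

def describe_center(values, label="value"):
    """Return a one-sentence reading of mean vs. median for a numeric list."""
    mean = statistics.mean(values)
    median = statistics.median(values)
    # Hypothetical thresholds: a mean at least 1.25x the median hints at right skew,
    # a mean at most 0.8x the median hints at left skew.
    if median > 0 and mean / median >= 1.25:
        note = "a small number of high values may be inflating the mean"
    elif median > 0 and mean / median <= 0.8:
        note = "a small number of low values may be pulling the mean down"
    else:
        note = "mean and median are close, so the distribution looks roughly balanced"
    return f"Average {label} is {mean:,.0f}, median is {median:,.0f}: {note}."

# Illustrative order amounts (made up for this sketch)
orders = [1200, 1500, 1800, 1800, 2100, 2400, 11600]
print(describe_center(orders, "order value"))
```

The point of the design is that the number and its meaning travel together in one sentence, so the reader never sees a bare statistic.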
Every operation should be answerable with "why did we do this?", for example:
Each insight should be accompanied by specific numbers as evidence; avoid subjective judgments unsupported by data.
- Read input files from `/mnt/user-data/uploads/`; save output files to `/mnt/user-data/outputs/`
- When reading files, try `utf-8` first; fall back to `latin-1` or `cp1252` if needed
- For charts, set `matplotlib.rcParams['font.sans-serif'] = ['DejaVu Sans']` for reliable rendering
- When exporting CSV files, use `encoding='utf-8-sig'` to prevent encoding issues when opening in Excel

Regardless of the analysis path taken, the final summary report should include:
## Data Analysis Report
### Data Overview
(Dataset size, source, time range, description of key fields)
### Data Quality
(Issues found, severity level, actions taken or items pending confirmation)
### Key Findings
(3–5 most important insights, each supported by data)
### Processing Pipeline
(What transformations were applied to the data and why)
### Recommendations
(Suggestions for data improvement, directions for deeper analysis)
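The file-handling conventions mentioned before the report template (try utf-8 first, fall back to other encodings, write CSV with `utf-8-sig`) could be sketched roughly as follows. The helper names `read_text_with_fallback` and `write_csv_for_excel` are hypothetical. One assumption worth flagging: `cp1252` is tried before `latin-1` here, because `latin-1` accepts any byte sequence and would otherwise make the `cp1252` attempt unreachable.

```python
import csv
import os
import tempfile

def read_text_with_fallback(path, encodings=("utf-8", "cp1252", "latin-1")):
    """Return (text, encoding_used), trying each encoding in order."""
    with open(path, "rb") as fh:
        raw = fh.read()
    for enc in encodings:
        try:
            return raw.decode(enc), enc
        except UnicodeDecodeError:
            continue  # try the next candidate encoding
    raise ValueError(f"could not decode {path!r} with any of {encodings}")

def write_csv_for_excel(path, header, rows):
    """Write a CSV with a UTF-8 byte order mark so Excel detects the encoding."""
    with open(path, "w", newline="", encoding="utf-8-sig") as fh:
        writer = csv.writer(fh)
        writer.writerow(header)
        writer.writerows(rows)

# Demo in a throwaway directory (not the real upload/output paths)
with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "legacy.csv")
    with open(src, "wb") as fh:
        fh.write("city\ncaf\u00e9\n".encode("cp1252"))  # not valid UTF-8
    text, used = read_text_with_fallback(src)
    print(used)
    write_csv_for_excel(os.path.join(tmp, "cleaned.csv"), ["city"], [["caf\u00e9"]])
```

Reporting which encoding was actually used makes it easy to mention the fallback in the processing log.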
The following examples show how to apply the goals and constraints above when facing different types of data.
Scenario: The user uploads an order CSV with columns for Order ID, Order Time, Product Category, Amount, User City, and Payment Status, and wants to understand "overall sales performance."
Analysis approach:
Quality issue handling example:
Found 12 rows with negative values in the "Amount" column. In sales data, negatives typically represent refund records. Recommendation: filter them out when analyzing gross revenue; keep them when analyzing net income. Please confirm your analysis goal before we proceed.
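One way to keep both readings available while waiting for the user's answer is to compute gross and net figures side by side. This is only a sketch: `revenue_summary` is a hypothetical helper, the amounts are invented, and treating every negative amount as a refund is exactly the assumption the user is being asked to confirm.

```python
def revenue_summary(amounts):
    """Split amounts into sales and presumed refunds and report both totals."""
    sales = [a for a in amounts if a >= 0]
    refunds = [a for a in amounts if a < 0]  # assumption: negatives are refunds
    return {
        "gross_revenue": sum(sales),                 # refunds filtered out
        "refund_total": sum(refunds),
        "net_income": sum(sales) + sum(refunds),     # refunds kept
        "refund_rows": len(refunds),
    }

# Illustrative amounts (made up for this sketch)
amounts = [120.0, 80.0, -30.0, 200.0, -50.0]
summary = revenue_summary(amounts)
print(summary)
```

Reporting `refund_rows` alongside the totals gives the user the specific number needed to judge whether the negatives matter for their goal.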
Key findings example:
- March revenue grew 34% month-over-month, primarily driven by the "Electronics" category (contributing 61% of the incremental growth)
- New York and Los Angeles together account for 47% of all orders, but average order value is 15% below the average for mid-tier cities
- Payment failure rate is 8.3%, above the typical industry benchmark of 2–5%; recommend investigating the payment flow
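Findings like the month-over-month growth and the category's share of incremental growth come from simple arithmetic that is worth making explicit. The sketch below uses hypothetical helper names and invented monthly totals; it does not reproduce the figures in the example findings.

```python
def mom_growth(prev_total, curr_total):
    """Month-over-month growth as a fraction of the previous month's total."""
    if prev_total == 0:
        raise ValueError("previous total is zero; growth rate is undefined")
    return (curr_total - prev_total) / prev_total

def contribution_to_growth(prev_by_cat, curr_by_cat):
    """Share of the total change in revenue attributable to each category."""
    total_delta = sum(curr_by_cat.values()) - sum(prev_by_cat.values())
    if total_delta == 0:
        raise ValueError("no overall change; contribution shares are undefined")
    categories = set(prev_by_cat) | set(curr_by_cat)
    return {
        c: (curr_by_cat.get(c, 0) - prev_by_cat.get(c, 0)) / total_delta
        for c in categories
    }

# Illustrative revenue by category (made up for this sketch)
feb = {"Electronics": 400.0, "Apparel": 600.0}
mar = {"Electronics": 600.0, "Apparel": 700.0}

growth = mom_growth(sum(feb.values()), sum(mar.values()))
shares = contribution_to_growth(feb, mar)
print(f"MoM growth: {growth:.0%}")
for cat, share in sorted(shares.items()):
    print(f"{cat}: {share:.0%} of incremental revenue")
```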
Scenario: The user uploads an Excel file with multiple sheets, inconsistent column names (some say "Q1", others say "Question 1"), merged cells, and scattered blank rows and columns.
First step: Don't rush into analysis — first surface the structural issues to the user:
This file contains 3 sheets, and there are several structural issues to clarify before we proceed:
- Sheet1 and Sheet2 have different column names — are these different batches of the same survey, or entirely different surveys?
- Rows 5–8 are blank — is it safe to remove them?
- The "Q3_Other" column is 92% empty — is this a low open-response rate, or a data export issue?
Once confirmed, I'll design a cleaning plan.
Constraint in action: Column meanings are not assumed, sheets are not merged arbitrarily — issues are surfaced first and the user is asked to confirm.
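A structural check of this kind can be sketched as a function that only reports issues and changes nothing. The sketch assumes the workbook has already been loaded into a dict mapping sheet names to lists of rows, with the first row as the header; loading the file (for example with a spreadsheet library) is outside its scope. The name `structural_report` and the 90% sparsity threshold are hypothetical.

```python
def is_blank(cell):
    """Treat None and whitespace-only strings as empty cells."""
    return cell is None or str(cell).strip() == ""

def structural_report(sheets, sparse_threshold=0.9):
    """Return a list of human-readable structural issues; never modifies data."""
    issues = []
    headers = {n: [str(c).strip() for c in rows[0]] for n, rows in sheets.items() if rows}
    names = list(headers)
    # Compare header sets pairwise so mismatches are surfaced, not silently merged
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            only_a = sorted(set(headers[a]) - set(headers[b]))
            only_b = sorted(set(headers[b]) - set(headers[a]))
            if only_a or only_b:
                issues.append(
                    f"{a} and {b} have different column names "
                    f"(only in {a}: {only_a}; only in {b}: {only_b})"
                )
    for name, rows in sheets.items():
        if not rows:
            continue
        header, body = rows[0], rows[1:]
        # Row numbers are 1-based and count the header as row 1
        blank = [i for i, row in enumerate(body, start=2) if all(is_blank(c) for c in row)]
        if blank:
            issues.append(f"{name}: blank rows at {blank}")
        data_rows = [row for row in body if not all(is_blank(c) for c in row)]
        for j, col in enumerate(header):
            if not data_rows:
                break
            missing = sum(1 for row in data_rows if j >= len(row) or is_blank(row[j]))
            rate = missing / len(data_rows)
            if rate >= sparse_threshold:
                issues.append(f"{name}: column {col!r} is {rate:.0%} empty")
    return issues

# Illustrative workbook contents (made up for this sketch)
sheets = {
    "Sheet1": [["Q1", "Q3_Other"], ["a", None], [None, None], ["b", ""]],
    "Sheet2": [["Question 1", "Q3_Other"], ["c", "x"]],
}
for issue in structural_report(sheets):
    print("-", issue)
```

Each reported issue maps to a question for the user; the cleaning plan is designed only after those questions are answered.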
Processing log example (provided alongside the output file):
Data Processing Log
Source file: sales_2024.csv (8,412 rows × 15 columns)
Output file: sales_2024_cleaned.csv (8,203 rows × 13 columns)
Changes:
1. Removed duplicate rows: deleted 89 fully duplicate records
2. Dropped "Notes2" column: 96% missing, no useful information
3. Dropped "Internal Code" column: user confirmed it is not needed for analysis
4. Standardized Amount column: converted "¥1,200.00" format to numeric 1200.0 (203 rows affected)
5. Standardized Date column: unified to YYYY-MM-DD format (source had mixed MM/DD/YYYY and written-out date formats)
6. Missing value handling:
   - "City" column: 34 missing values → filled with "Unknown" (user confirmed)
   - "Amount" column: 86 missing values → left blank (user confirmed these are anomalous records that should not be imputed)
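The amount and date standardization steps in the log above might look roughly like this. It is a sketch under stated assumptions: the helper names are hypothetical, the written-out date format is assumed to be of the form "March 5, 2024", and month names are parsed with the default English locale. Blank amounts are returned as `None` rather than imputed, matching the decision recorded in the log.

```python
from datetime import datetime

def parse_amount(text):
    """Convert strings like '¥1,200.00' to a float; return None for blanks."""
    if text is None:
        return None
    cleaned = str(text).strip().replace("¥", "").replace(",", "")
    if not cleaned:
        return None  # left blank on purpose, not imputed
    return float(cleaned)

# Assumed source formats: already ISO, MM/DD/YYYY, and written-out dates
DATE_FORMATS = ("%Y-%m-%d", "%m/%d/%Y", "%B %d, %Y")

def normalize_date(text):
    """Return the date as YYYY-MM-DD, or None if no known format matches."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt).strftime("%Y-%m-%d")
        except ValueError:
            continue
    return None  # unparseable values are flagged for review, not guessed

def standardize_amounts(raw_values):
    """Return (parsed values, number of non-blank entries whose text needed conversion)."""
    parsed = [parse_amount(v) for v in raw_values]
    affected = sum(
        1 for raw, val in zip(raw_values, parsed)
        if val is not None and str(raw).strip() != str(val)
    )
    return parsed, affected

# Illustrative raw values (made up for this sketch)
values, affected = standardize_amounts(["¥1,200.00", "85.5", "", "¥30"])
print(values, f"({affected} rows affected)")
print(normalize_date("03/05/2024"), normalize_date("March 5, 2024"))
```

Counting the affected rows as part of each step is what makes it possible to write log entries with specific numbers.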