skills/analysis/wrangling/streamline-analyst-guide/SKILL.md
End-to-end data analysis AI agent with Streamlit UI
Install this skill globally with one command (works with Claude Code, Cursor, and Windsurf):
npx skillsauth add wentorai/research-plugins streamline-analyst-guide
Streamline Analyst is an end-to-end data analysis AI agent with a Streamlit web interface. Upload a dataset and describe your analysis goal in natural language; the agent handles data cleaning, EDA, feature engineering, model training, evaluation, and report generation. The interactive UI lets you review each step and adjust its parameters before moving on.
git clone https://github.com/Wilson-ZheLin/Streamline-Analyst.git
cd Streamline-Analyst
pip install -r requirements.txt
streamlit run app.py
Upload Dataset (CSV, Excel, Parquet)
↓
Data Profiling
├── Column types and distributions
├── Missing value analysis
├── Correlation matrix
└── Outlier detection
↓
Data Cleaning (interactive)
├── Handle missing values
├── Remove/fix outliers
├── Type conversions
└── Feature encoding
↓
EDA (automated + custom)
├── Univariate analysis
├── Bivariate relationships
├── Statistical tests
└── Custom visualizations
↓
Modeling (if applicable)
├── Train/test split
├── Model selection + training
├── Hyperparameter tuning
└── Evaluation metrics
↓
Report Generation
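To make the Data Profiling stage of the workflow above more concrete, here is a minimal standard-library sketch of column type detection and missing-value counting. It is not Streamline Analyst's own code: `profile_columns`, `_is_number`, and the sample CSV are hypothetical names and data chosen for illustration, and the type inference is deliberately naive (numeric versus categorical only).

```python
import csv
import io


def _is_number(text):
    """Return True if the string parses as a float."""
    try:
        float(text)
        return True
    except ValueError:
        return False


def profile_columns(csv_text):
    """Return {column: {"type": ..., "missing": ...}} for a CSV string.

    Hypothetical helper for illustration; not part of Streamline Analyst.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = list(reader)
    profile = {}
    for column in reader.fieldnames or []:
        values = [row[column] for row in rows]
        # Treat empty strings and absent fields as missing.
        present = [v for v in values if v not in ("", None)]
        is_numeric = bool(present) and all(_is_number(v) for v in present)
        profile[column] = {
            "type": "numeric" if is_numeric else "categorical",
            "missing": len(values) - len(present),
        }
    return profile


# Made-up sample data with one missing value per column.
sample = "age,city,income\n34,Paris,52000\n,Lyon,48000\n29,,\n"
report = profile_columns(sample)
```

A real profiler would also recognise datetime columns and compute distributions and correlations, which this sketch leaves out.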
# Streamline Analyst provides:
# 1. Smart data profiling
# - Auto-detect column types (numeric, categorical, datetime)
# - Distribution analysis per column
# - Missing value patterns (MCAR, MAR, MNAR hints)
# - Correlation analysis with significance
# 2. Interactive cleaning
# - Imputation strategies (mean, median, mode, KNN, model)
# - Outlier handling (IQR, Z-score, isolation forest)
# - Encoding (one-hot, label, target, ordinal)
# - Scaling (standard, minmax, robust)
# 3. Automated EDA
# - Distribution plots (histogram, KDE, box, violin)
# - Relationship plots (scatter, pair, heatmap)
# - Time series decomposition
# - Statistical tests (t-test, ANOVA, chi-square, Mann-Whitney)
# 4. Model pipeline
# - Classification: LR, RF, GBM, SVM, MLP
# - Regression: LR, RF, GBM, SVR, ElasticNet
# - Cross-validation with confidence intervals
# - Feature importance visualization
# - SHAP explanations
# 5. Report
# - HTML report with all plots and findings
# - Downloadable cleaned dataset
# - Model artifacts (pickle)
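Two of the cleaning options listed above, median imputation and IQR-based outlier handling, can be sketched with the standard library alone. This is an illustrative sketch under stated assumptions, not the tool's implementation: `impute_median`, `iqr_bounds`, the multiplier `k=1.5`, and the sample values are all hypothetical.

```python
import statistics


def impute_median(values):
    """Replace None entries with the median of the observed values."""
    observed = [v for v in values if v is not None]
    median = statistics.median(observed)
    return [median if v is None else v for v in values]


def iqr_bounds(values, k=1.5):
    """Return (lower, upper) fences: Q1 - k*IQR and Q3 + k*IQR."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


# Made-up income values (in thousands) with one gap and one extreme value.
incomes = [48, 50, None, 52, 51, 49, 500]

filled = impute_median(incomes)
low, high = iqr_bounds(filled)
outliers = [v for v in filled if v < low or v > high]
cleaned = [v for v in filled if low <= v <= high]
```

The median is used here because, unlike the mean, it is barely affected by the extreme value that the IQR rule then flags.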
### Example Prompts
- "Show me the distribution of all numeric columns"
- "Is there a significant difference in income between genders?"
- "Build a classifier to predict churn using all features"
- "What are the top 5 most important features for prediction?"
- "Clean the data: fill missing values and remove outliers"
- "Generate a summary report of this dataset"
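A prompt such as "Is there a significant difference in income between genders?" comes down to a two-sample comparison. The feature list names the t-test and Mann-Whitney for this; the sketch below swaps in a permutation test on the difference of means, purely because it needs only the standard library. The function name, the sample values, and the parameters `n_permutations` and `seed` are hypothetical, and nothing here reflects how Streamline Analyst computes its results.

```python
import random
import statistics


def permutation_p_value(group_a, group_b, n_permutations=2000, seed=0):
    """Two-sided permutation test on the absolute difference of means."""
    rng = random.Random(seed)  # seeded so the result is reproducible
    observed = abs(statistics.mean(group_a) - statistics.mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    hits = 0
    for _ in range(n_permutations):
        # Reassign group labels at random and recompute the difference.
        rng.shuffle(pooled)
        diff = abs(statistics.mean(pooled[:n_a]) - statistics.mean(pooled[n_a:]))
        if diff >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_permutations + 1)


# Made-up incomes (in thousands) for two groups.
group_a = [52, 55, 58, 60, 61, 63]
group_b = [40, 41, 43, 45, 46, 48]
p_value = permutation_p_value(group_a, group_b)
significant = p_value < 0.05
```

The p-value estimates how often a random relabelling of the rows produces a gap in means at least as large as the observed one, so a small value suggests the difference is unlikely to be due to chance alone.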