.internal-skills/machine-learning-engineer/SKILL.md
Machine Learning Engineer. Use for: developing ML models, MLOps and deployment, feature engineering, experiment tracking, and model serving and inference.
Equipe SBahia - Machine Learning Engineer

Install this skill globally with one command (works with Claude Code, Cursor, and Windsurf):

    npx skillsauth add suportebahia/equipe-devs
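
    import pandas as pd
    from sklearn.preprocessing import StandardScaler, LabelEncoder


    class FeatureEngineer:
        def __init__(self):
            self.scaler = StandardScaler()
            self.encoders = {}

        def transform(self, df):
            # Work on a time-sorted copy so the caller's frame is not mutated
            df = df.sort_values('timestamp').reset_index(drop=True)

            # Temporal features
            df['hour'] = df['timestamp'].dt.hour
            df['day_of_week'] = df['timestamp'].dt.dayofweek
            df['is_weekend'] = df['day_of_week'].isin([5, 6]).astype(int)

            # Aggregate features: orders per customer in a trailing 7-day window.
            # A '7D' window needs a datetime index, so roll over 'timestamp'.
            df['order_count_7d'] = (
                df.set_index('timestamp')
                  .groupby('customer_id')['order_id']
                  .transform(lambda x: x.rolling('7D', min_periods=1).count())
                  .to_numpy()
            )

            # Encoding (fits on the data passed in; at inference time, reuse the
            # fitted encoders with .transform instead of refitting)
            for col in ['category', 'channel']:
                self.encoders[col] = LabelEncoder()
                df[f'{col}_encoded'] = self.encoders[col].fit_transform(df[col])

            # Normalization (same caveat: fit on training data only)
            numeric_cols = ['amount', 'quantity', 'recency']
            df[numeric_cols] = self.scaler.fit_transform(df[numeric_cols])
            return df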
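
The `order_count_7d` feature above depends on a time-based rolling window, which pandas only accepts on a datetime index. The following is a minimal sketch of that one step on a toy frame; the helper name `add_order_count_7d` and the sample data are hypothetical, not part of the skill.

```python
import pandas as pd


def add_order_count_7d(df: pd.DataFrame) -> pd.DataFrame:
    """Count each customer's orders in the trailing 7 days (hypothetical helper)."""
    # Sort by time so every per-customer slice has a monotonic datetime index
    out = df.sort_values('timestamp').reset_index(drop=True)
    indexed = out.set_index('timestamp')
    counts = indexed.groupby('customer_id')['order_id'].transform(
        lambda s: s.rolling('7D', min_periods=1).count()
    )
    # transform keeps the row order of `indexed`, which matches `out`
    out['order_count_7d'] = counts.to_numpy().astype(int)
    return out


# Toy data (made up for illustration)
orders = pd.DataFrame({
    'customer_id': ['a', 'b', 'a', 'a'],
    'order_id': [101, 102, 103, 104],
    'timestamp': pd.to_datetime(
        ['2024-01-01', '2024-01-02', '2024-01-03', '2024-01-12']
    ),
})
result = add_order_count_7d(orders)
```

Customer `a` has two orders within a week of 2024-01-03, so that row counts 2; the 2024-01-12 order falls outside the window of the earlier ones and counts only itself.
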

    import xgboost as xgb
    from sklearn.model_selection import train_test_split, cross_val_score
    from sklearn.metrics import classification_report, roc_auc_score


    class ChurnModel:
        def __init__(self):
            self.model = xgb.XGBClassifier(
                n_estimators=200,
                max_depth=6,
                learning_rate=0.1,
                objective='binary:logistic',
                eval_metric='auc',
                early_stopping_rounds=20,
            )

        def train(self, X, y):
            X_train, X_val, y_train, y_val = train_test_split(
                X, y, test_size=0.2, random_state=42, stratify=y
            )
            self.model.fit(
                X_train, y_train,
                eval_set=[(X_val, y_val)],
                verbose=50
            )
            return self.evaluate(X_val, y_val)

        def evaluate(self, X, y):
            y_pred = self.model.predict(X)
            y_proba = self.model.predict_proba(X)[:, 1]
            return {
                'classification_report': classification_report(y, y_pred),
                'auc_score': roc_auc_score(y, y_proba),
            }
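
The `auc_score` reported by `evaluate` can be read as the probability that a randomly chosen positive (churned) example gets a higher score than a randomly chosen negative one. The sketch below spells out that pairwise definition in plain Python; `pairwise_auc` is a hypothetical name used for illustration, and in practice `roc_auc_score` does this work.

```python
from itertools import product


def pairwise_auc(y_true, y_score):
    """AUC as the share of (positive, negative) pairs ranked correctly."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError('AUC needs at least one positive and one negative label')
    # A correctly ordered pair scores 1, a tie 0.5, a misordered pair 0
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p, n in product(pos, neg)
    )
    return wins / (len(pos) * len(neg))


# Toy labels and scores (made up for illustration)
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
auc = pairwise_auc(labels, scores)  # 3 of the 4 pairs are ordered correctly
```

The pairwise loop is quadratic in the number of examples, so it is only meant to make the metric concrete, not to replace the library call.
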

    import mlflow
    from mlflow.tracking import MlflowClient

    # Note: train_model, X_train, y_train, auc_train, auc_val and
    # feature_importance_plot are assumed to be defined elsewhere.
    mlflow.set_experiment('churn_prediction')

    with mlflow.start_run(run_name='xgboost_v1'):
        # Log parameters
        mlflow.log_params({
            'n_estimators': 200,
            'max_depth': 6,
            'learning_rate': 0.1,
        })

        # Train
        model = train_model(X_train, y_train)

        # Log metrics
        mlflow.log_metrics({
            'auc_train': auc_train,
            'auc_val': auc_val,
        })

        # Log model
        mlflow.sklearn.log_model(
            sk_model=model,
            artifact_path='model',
            registered_model_name='churn-model'
        )

        # Log feature importance
        mlflow.log_figure(feature_importance_plot, 'feature_importance.png')

    from typing import List

    from fastapi import FastAPI
    from pydantic import BaseModel
    import mlflow.sklearn
    import pandas as pd

    app = FastAPI()

    # Load model from registry. The sklearn flavor returns the native estimator,
    # which exposes predict_proba; a generic pyfunc model only exposes predict.
    model = mlflow.sklearn.load_model(
        model_uri='models:/churn-model/production'
    )


    class PredictionRequest(BaseModel):
        customer_id: str
        features: dict


    @app.post('/predict')
    async def predict(request: PredictionRequest):
        df = pd.DataFrame([request.features])
        prediction = model.predict(df)
        probability = model.predict_proba(df)[:, 1]
        return {
            'customer_id': request.customer_id,
            'churn_probability': float(probability[0]),
            'will_churn': bool(prediction[0]),
        }


    @app.post('/batch_predict')
    async def batch_predict(requests: List[PredictionRequest]):
        df = pd.DataFrame([r.features for r in requests])
        predictions = model.predict(df)
        probabilities = model.predict_proba(df)[:, 1]
        return [
            {
                'customer_id': r.customer_id,
                'probability': float(p),
                'will_churn': bool(m)
            }
            for r, p, m in zip(requests, probabilities, predictions)
        ]

    # Evidently AI for drift detection.
    # This uses Evidently's legacy Dashboard API; newer releases replaced it
    # with Report. reference_df, current_df and column_mapping are assumed to
    # be defined elsewhere.
    from evidently.dashboard import Dashboard
    from evidently.tabs import DataDriftTab, DataQualityTab

    # Calculate drift
    drift_report = Dashboard(tabs=[
        DataDriftTab(),
        DataQualityTab(),
    ])
    drift_report.calculate(
        reference_data=reference_df,
        current_data=current_df,
        column_mapping=column_mapping
    )
    drift_report.save('drift_report.html')
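
A drift report compares how each feature is distributed in the reference data versus the current data. As a rough illustration of that idea, the sketch below computes the Population Stability Index (PSI), a common drift statistic. This is a swapped-in technique chosen for clarity, not necessarily what Evidently computes, and the function name, bin count, and `eps` floor are assumptions.

```python
import math


def population_stability_index(reference, current, n_bins=10, eps=1e-6):
    """PSI between two numeric samples, binned on the reference range."""
    lo, hi = min(reference), max(reference)
    if hi == lo:
        raise ValueError('reference sample must contain more than one distinct value')
    width = (hi - lo) / n_bins

    def bin_shares(values):
        counts = [0] * n_bins
        for v in values:
            idx = int((v - lo) / width)
            # Values outside the reference range fall into the edge bins
            idx = min(max(idx, 0), n_bins - 1)
            counts[idx] += 1
        # Floor empty bins at eps so the logarithm stays defined
        return [max(c / len(values), eps) for c in counts]

    ref = bin_shares(reference)
    cur = bin_shares(current)
    return sum((c - r) * math.log(c / r) for r, c in zip(ref, cur))


# Toy samples (made up for illustration)
reference_sample = [i / 100 for i in range(100)]
shifted_sample = [x + 0.3 for x in reference_sample]

psi_same = population_stability_index(reference_sample, reference_sample)
psi_shifted = population_stability_index(reference_sample, shifted_sample)
```

Identical samples give a PSI of zero, and the value grows as the current distribution moves away from the reference. Any alerting threshold is a project-level choice.
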

    # Prometheus metrics for the model
    import time

    from prometheus_client import Counter, Histogram

    PREDICTIONS = Counter('model_predictions_total', 'Total predictions', ['model_version'])
    LATENCY = Histogram('model_inference_seconds', 'Inference latency', ['model_version'])


    @app.middleware('http')
    async def track_metrics(request, call_next):
        # Must be a coroutine: `await` is only valid inside `async def`
        start = time.time()
        response = await call_next(request)
        PREDICTIONS.labels('v1').inc()
        LATENCY.labels('v1').observe(time.time() - start)
        return response

    # Assumes a KFP v2 SDK release that accepts artifact types as plain
    # parameter and return annotations.
    from kfp import dsl
    from kfp.dsl import Dataset, Metrics, Model


    @dsl.component
    def load_data_component() -> Dataset:
        ...


    @dsl.component
    def train_model_component(data: Dataset) -> Model:
        ...


    @dsl.component
    def evaluate_model_component(model: Model, data: Dataset) -> Metrics:
        ...


    @dsl.component
    def deploy_model_component(model: Model):
        ...


    @dsl.pipeline(name='churn_training_pipeline')
    def churn_pipeline():
        data = load_data_component()
        # A component with a single return value exposes it as `.output`
        model = train_model_component(data=data.output)
        metrics = evaluate_model_component(model=model.output, data=data.output)
        deploy_model_component(model=model.output)

| Type | Use | Examples |
|------|-----|----------|
| Classification | Categorization | Churn, fraud, spam |
| Regression | Numeric prediction | Price, demand |
| Recommendations | Personalization | "You may also like" |
| NLP | Text | Chatbot, sentiment |
| Vision | Images | Object detection |
| Time Series | Forecasting | Sales, inventory |

| Layer | Tools |
|-------|-------|
| Training | Python, scikit-learn, XGBoost, PyTorch, TensorFlow |
| Feature Store | Feast, Tecton, Redis |
| Experiment Tracking | MLflow, Weights & Biases, Neptune |
| Model Registry | MLflow Registry, SageMaker |
| Serving | FastAPI, TensorFlow Serving, TorchServe, SageMaker |
| Orchestration | Airflow, Kubeflow, Argo |
| Monitoring | Evidently, Arize, SageMaker |