skills/codex/bigquery-ai/SKILL.md
<!-- AUTO-GENERATED by export-skills.py — DO NOT EDIT -->
---
name: bigquery-ai
description: BigQuery AI and ML patterns for Gemini-powered analytics. Use when building natural language SQL, ML.GENERATE_TEXT pipelines, Cortex Framework data foundations, or grounding agents in warehouse data.
---
> **Platform Note:** This skill was designed for multi-agent execution. In Codex, treat sub-agent instructions as sequential steps to complete thoroughly within a single agent context.
# BigQuery AI fo
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
npx skillsauth add frank-luongt/faos-skills-marketplace skills/codex/bigquery-ai
Platform Note: This skill was designed for multi-agent execution. In Codex, treat sub-agent instructions as sequential steps to complete thoroughly within a single agent context.
Use BigQuery as the data foundation for AI agents with Gemini-powered analytics, natural language SQL, and Cortex Framework enterprise data models.
-- Summarize customer feedback directly in BigQuery
SELECT
  feedback_id,
  customer_id,
  feedback_text,
  ml_generate_text_result['candidates'][0]['content']['parts'][0]['text'] AS summary
FROM
  ML.GENERATE_TEXT(
    MODEL `my_project.my_dataset.gemini_model`,
    (SELECT feedback_id, customer_id, feedback_text,
            CONCAT('Summarize this customer feedback in 2 sentences: ', feedback_text) AS prompt
     FROM `my_project.my_dataset.customer_feedback`
     WHERE DATE(created_at) = CURRENT_DATE()),
    STRUCT(256 AS max_output_tokens, 0.2 AS temperature)
  );
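If the raw `ml_generate_text_result` JSON is read in application code rather than unpacked in SQL, the same path can be walked defensively. The following is a minimal sketch, assuming the response follows the `candidates[0].content.parts[0].text` shape used in the query above; `extract_summary` and the sample payload are hypothetical names introduced for illustration.

```python
import json
from typing import Optional


def extract_summary(raw_result: str) -> Optional[str]:
    """Return the first candidate's text, or None if the shape is unexpected."""
    try:
        payload = json.loads(raw_result)
        return payload["candidates"][0]["content"]["parts"][0]["text"]
    except (json.JSONDecodeError, KeyError, IndexError, TypeError):
        # Malformed JSON, an empty candidate list, or a missing key.
        return None


# Hypothetical payload mirroring the JSON path used in the SQL query.
sample = json.dumps(
    {"candidates": [{"content": {"parts": [{"text": "Customer likes the app."}]}}]}
)
print(extract_summary(sample))
print(extract_summary('{"candidates": []}'))
```

Returning `None` instead of raising lets a batch job skip rows where the model produced no candidate, for example when a response was blocked.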
-- Connect BigQuery to Vertex AI Gemini model
CREATE OR REPLACE MODEL `my_project.my_dataset.gemini_model`
REMOTE WITH CONNECTION `my_project.us.my_connection`
OPTIONS (ENDPOINT = 'gemini-2.0-flash');
-- Create embeddings for knowledge base articles
CREATE OR REPLACE TABLE `my_project.my_dataset.article_embeddings` AS
SELECT
  article_id,
  title,
  content,
  -- With flatten_json_output = TRUE the result column is already an ARRAY<FLOAT64>
  ml_generate_embedding_result AS embedding
FROM
  ML.GENERATE_EMBEDDING(
    MODEL `my_project.my_dataset.embedding_model`,
    (SELECT article_id, title, content FROM `my_project.articles`),
    STRUCT(TRUE AS flatten_json_output)
  );
-- Similarity search
SELECT
  base.article_id,
  base.title,
  distance
FROM
  VECTOR_SEARCH(
    TABLE `my_project.my_dataset.article_embeddings`,
    'embedding',
    -- Embed the question with the same model; flattened output is an ARRAY<FLOAT64>
    (SELECT ml_generate_embedding_result AS embedding
     FROM ML.GENERATE_EMBEDDING(
       MODEL `my_project.my_dataset.embedding_model`,
       (SELECT 'How do I reset my password?' AS content),
       STRUCT(TRUE AS flatten_json_output)
     )),
    top_k => 5,
    distance_type => 'COSINE'
  );
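To make the `distance` column easier to interpret, the ranking can be reproduced by brute force on toy data. This is only an illustrative sketch, assuming the usual definition of cosine distance as 1 minus cosine similarity; `cosine_distance`, `top_k`, and the 2-D vectors are hypothetical and say nothing about how BigQuery executes the search internally.

```python
import math
from typing import Sequence


def cosine_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine distance, defined here as 1 - cosine similarity."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine distance is undefined for zero vectors")
    return 1.0 - dot / (norm_a * norm_b)


def top_k(query, rows, k=5):
    """Rank (id, embedding) rows by ascending distance to the query vector."""
    scored = [(row_id, cosine_distance(query, emb)) for row_id, emb in rows]
    return sorted(scored, key=lambda pair: pair[1])[:k]


# Toy 2-D embeddings; real embeddings have hundreds of dimensions.
articles = [("a1", [1.0, 0.0]), ("a2", [0.0, 1.0]), ("a3", [1.0, 1.0])]
print(top_k([1.0, 0.0], articles, k=2))
```

Smaller distances mean closer matches, so the first row returned is the most similar article.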
-- Cortex Framework pre-built SAP data model
-- After deploying Cortex Framework, query normalized SAP data:
-- Sales performance from SAP S/4HANA
SELECT
  MaterialText_MAKTX AS product,
  SoldToParty_KUNAG AS customer,
  SUM(NetPrice_NETWR) AS revenue,
  COUNT(SalesDocument_VBELN) AS order_count
FROM `cortex_sap.SalesOrders`
WHERE DATE(CreationDate_ERDAT) BETWEEN '2025-01-01' AND '2025-12-31'
GROUP BY 1, 2
ORDER BY revenue DESC
LIMIT 20;
from google.cloud import bigquery

client = bigquery.Client()


def query_warehouse(sql: str) -> list[dict]:
    """Execute a BigQuery query and return results as dicts.

    Used as a tool function for AI agents.
    """
    query_job = client.query(sql)
    results = query_job.result()  # Blocks until the query finishes
    return [dict(row) for row in results]


# Agent tool: natural language -> SQL -> results
def answer_data_question(question: str) -> str:
    """Convert natural language to SQL and execute."""
    from google import genai

    ai_client = genai.Client(vertexai=True, project="my-project", location="us-central1")
    schema_context = get_table_schemas()  # Your schema loader
    response = ai_client.models.generate_content(
        model="gemini-2.0-flash",
        contents=f"""Given this schema:\n{schema_context}\n\n
Generate a BigQuery SQL query to answer: {question}
Return ONLY the SQL, no explanation.""",
    )
    # Remove an optional markdown fence. str.strip("```sql") strips a set of
    # characters (backtick, s, q, l), not a prefix, and can eat into the SQL.
    sql = response.text.strip()
    sql = sql.removeprefix("```sql").removeprefix("```").removesuffix("```").strip()
    results = query_warehouse(sql)
    return str(results[:20])
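Because `answer_data_question` executes whatever SQL the model returns, it may be worth checking the statement before it reaches `query_warehouse`. The sketch below is one possible guard, not part of the original skill: `extract_sql` and `is_read_only` are hypothetical helpers, and the keyword check is deliberately conservative (it also rejects harmless queries whose string literals contain a semicolon or a word such as `delete`).

```python
import re

FENCE = "`" * 3  # Built programmatically so the example nests cleanly in markdown
_FENCED = re.compile(rf"^{FENCE}[a-zA-Z]*\s*\n(.*?)\n?{FENCE}\s*$", re.DOTALL)
_FORBIDDEN = re.compile(
    r"\b(INSERT|UPDATE|DELETE|MERGE|DROP|CREATE|ALTER|TRUNCATE|GRANT|REVOKE|CALL|EXPORT)\b",
    re.IGNORECASE,
)


def extract_sql(model_output: str) -> str:
    """Return the SQL inside a markdown fence, or the trimmed text if unfenced."""
    text = model_output.strip()
    match = _FENCED.match(text)
    return (match.group(1) if match else text).strip()


def is_read_only(sql: str) -> bool:
    """Accept a single SELECT/WITH statement with no write or DDL keywords."""
    statement = sql.strip().rstrip(";").strip()
    if ";" in statement:
        return False  # Reject multi-statement scripts
    if not re.match(r"(?i)^(SELECT|WITH)\b", statement):
        return False
    return _FORBIDDEN.search(statement) is None


generated = FENCE + "sql\nSELECT created_at FROM orders;\n" + FENCE
sql = extract_sql(generated)
print(sql, is_read_only(sql))
```

A check like this complements, rather than replaces, running the agent under a service account that only has read access to the relevant datasets.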