Enhanced SQL Agent - Quick Reference¶

Quick Start¶

Using AgentBuilder (Recommended)¶

from ryoma_ai.services import AgentBuilder, DataSourceService
from ryoma_ai.infrastructure.datasource_repository import StoreBasedDataSourceRepository
from langchain_core.stores import InMemoryStore
from ryoma_data.sql import DataSource

# Initialize datasource
datasource = DataSource(
    "postgres",
    host="localhost",
    port=5432,
    database="mydb",
    user="user",
    password="password"
)

# Set up services
store = InMemoryStore()
repo = StoreBasedDataSourceRepository(store)
datasource_service = DataSourceService(repo)
datasource_service.add_datasource(datasource)

# Create SQL agent using builder
builder = AgentBuilder(datasource_service)
agent = builder.build_sql_agent(model="gpt-4", mode="enhanced")

# Ask a question
result = agent.stream("Show top 10 customers by revenue")

Direct SqlAgent Factory (Alternative)¶

from ryoma_ai.agent.sql import SqlAgent
from ryoma_ai.domain.constants import StoreKeys
from langchain_core.stores import InMemoryStore
from ryoma_data.sql import DataSource

# Initialize datasource
datasource = DataSource(
    "postgres",
    host="localhost",
    database="mydb",
    user="user",
    password="password"
)

# Create store and inject datasource
store = InMemoryStore()
store.mset([(StoreKeys.ACTIVE_DATASOURCE, datasource)])

# Create SQL agent with store
agent = SqlAgent(
    model="gpt-4",
    mode="enhanced",  # or "basic" or "reforce"
    store=store
)

# Ask a question
result = agent.stream("Show top 10 customers by revenue")

ReFoRCE Mode (Advanced)¶

# State-of-the-art SQL agent with ReFoRCE optimizations
agent = builder.build_sql_agent(
    model="gpt-4",
    mode="reforce"
)

Constructor Parameters¶

SqlAgent Factory¶

SqlAgent(
    model: str | BaseChatModel,      # Model ID or LLM instance
    model_parameters: dict = None,    # Optional LLM parameters
    mode: str = "basic",              # "basic", "enhanced", or "reforce"
    safety_config: dict = None,       # Safety configuration (enhanced/reforce only)
    store = None,                     # BaseStore with datasource injected
)

AgentBuilder.build_sql_agent¶

builder.build_sql_agent(
    model: str = "gpt-3.5-turbo",    # Model ID or LLM instance
    mode: str = "basic",              # "basic", "enhanced", or "reforce"
    model_params: dict = None,        # Optional model parameters
)

Agent Modes Comparison¶

Feature	Basic Mode	Enhanced Mode	ReFoRCE Mode
Multi-step reasoning	❌	✅	✅
Schema linking	❌	✅	✅
Safety validation	❌	✅	✅
Error handling	❌	✅	✅
Query planning	❌	✅	✅
Database compression	❌	❌	✅
Format restriction	❌	❌	✅
Column exploration	❌	❌	✅
Self-refinement	❌	❌	✅
Parallel generation	❌	❌	✅
Consensus voting	❌	❌	✅

Available Tools by Mode¶

Basic Mode (3 tools)¶

SqlQueryTool - Execute SQL queries
CreateTableTool - Create new tables
QueryProfileTool - Profile query execution

Enhanced Mode (5 tools)¶

All basic tools plus:

QueryExplanationTool - Explain query execution plans
QueryOptimizationTool - Optimize query performance

ReFoRCE Mode (5 tools)¶

Same as enhanced mode (ReFoRCE is a prompting strategy, not different tools)

Workflow Steps¶

Enhanced SQL Agent (8 Steps)¶

analyze_question - Understand intent and complexity
schema_linking - Find relevant tables
query_planning - Create execution plan
generate_sql - Generate SQL query
validate_safety - Security validation
execute_query - Execute validated query
handle_error - Error recovery (if needed)
format_response - Format final answer

ReFoRCE SQL Agent (7 Steps)¶

compress_database_info - Schema compression
generate_format_restriction - Answer format specification
explore_columns - Column discovery with feedback
parallel_generation - Multiple SQL candidates
self_refinement - Query improvement iterations
consensus_voting - Majority-vote consensus
final_validation - Result validation and formatting

Key Methods¶

Agent Methods¶

# Stream responses (recommended)
result = agent.stream(question, display=True)

# Invoke synchronously
result = agent.invoke(question)

# Async invocation
result = await agent.ainvoke(question)

# Tool execution control
from ryoma_ai.agent.workflow import ToolMode
result = agent.stream(question, tool_mode=ToolMode.ONCE)
result = agent.stream(question, tool_mode=ToolMode.CONTINUOUS)

Enhanced/ReFoRCE Methods¶

# Safety configuration (enhanced/reforce only)
agent.enable_safety_rule("BLOCK_DELETE")
agent.disable_safety_rule("LIMIT_JOINS")
agent.set_safety_config(new_config)

# Schema analysis (enhanced/reforce only)
analysis = agent.analyze_schema(question)

# Query planning (enhanced/reforce only)
plan = agent.create_query_plan(question, context)

Common Use Cases¶

Business Intelligence¶

# Revenue analysis
result = agent.stream(
    "Show monthly revenue trends for the last 12 months by product category"
)

Customer Analytics¶

# Customer segmentation
result = agent.stream(
    "Identify top 20% of customers by lifetime value and their characteristics"
)

Operational Reporting¶

# Performance metrics
result = agent.stream(
    "Calculate average order processing time by region for last quarter"
)

DataSource Configuration¶

Supported Backends¶

from ryoma_data.sql import DataSource

# PostgreSQL
ds = DataSource("postgres", host="localhost", database="mydb", user="user", password="pass")

# MySQL
ds = DataSource("mysql", host="localhost", database="mydb", user="user", password="pass")

# SQLite
ds = DataSource("sqlite", database="path/to/db.sqlite")

# DuckDB
ds = DataSource("duckdb", database=":memory:")

# Snowflake
ds = DataSource("snowflake", account="xxx", database="DB", user="user", password="pass")

# BigQuery
ds = DataSource("bigquery", project_id="project", dataset_id="dataset")

Troubleshooting¶

Common Issues¶

Issue	Cause	Solution
No active datasource	Datasource not added to service	Call `datasource_service.add_datasource(ds)`
Schema loading error	DB connection/permissions	Check datasource configuration
Query timeout	Complex query/large dataset	Increase timeout or optimize query
Safety violation	Dangerous SQL operation	Review safety configuration

Best Practices¶

Production Deployment¶

Use AgentBuilder - Cleaner separation of concerns
Configure Safety Rules - Set appropriate safety policies
Set Resource Limits - Prevent resource exhaustion
Monitor Performance - Track query execution metrics
Access Control - Implement proper database permissions

Performance Optimization¶

Use ReFoRCE Mode - For best performance and accuracy on complex queries
Use Enhanced Mode - For production with safety validation
Use Basic Mode - For simple queries and development

Version History¶

v0.2.0 - Service-based architecture with AgentBuilder
v0.1.x - Direct datasource parameter (deprecated)

Support and Resources¶

Documentation: Enhanced SQL Agent Architecture
Examples: See examples/ directory
Issues: Report bugs and feature requests
Contributing: See contribution guidelines