Enhanced SQL Agent - Quick Reference

Quick Start

Basic Usage

from ryoma_ai.agent.sql import SqlAgent
from ryoma_ai.datasource.sql import SqlDataSource

# Initialize datasource
datasource = SqlDataSource(connection_string="your_db_connection")

# Create enhanced SQL agent
sql_agent = SqlAgent(
    model="gpt-4",
    datasource=datasource,
    use_enhanced_mode=True
)

# Ask a question
result = sql_agent.invoke({
    "messages": [{"role": "user", "content": "Show top 10 customers by revenue"}]
})

ReFoRCE Mode (Advanced)

# State-of-the-art SQL agent with ReFoRCE optimizations
sql_agent = SqlAgent(
    model="gpt-4",
    datasource=datasource,
    use_reforce_mode=True,
    max_parallel_threads=3,
    max_refinement_iterations=5
)

Configuration Options

Safety Configuration

safety_config = {
    "max_result_rows": 1000,           # Limit result rows
    "blocked_functions": [             # Dangerous SQL functions
        "DROP", "DELETE", "UPDATE", 
        "LOAD_FILE", "INTO OUTFILE"
    ],
    "allowed_schemas": [               # Restrict schema access
        "public", "analytics"
    ],
    "query_timeout": 30,               # Query timeout in seconds
    "max_joins": 5,                    # Maximum number of joins
    "safety_level": "STRICT"           # PERMISSIVE, MODERATE, STRICT
}

sql_agent = SqlAgent(
    model="gpt-4",
    datasource=datasource,
    use_enhanced_mode=True,
    safety_config=safety_config
)

ReFoRCE Configuration

reforce_config = {
    "max_parallel_threads": 3,         # Parallel SQL generation
    "max_refinement_iterations": 5,    # Self-refinement iterations
    "compression_threshold": 30000,    # Schema compression threshold (tokens)
}

sql_agent = SqlAgent(
    model="gpt-4",
    datasource=datasource,
    use_reforce_mode=True,
    **reforce_config
)

Agent Modes Comparison

Feature

Basic Mode

Enhanced Mode

ReFoRCE Mode

Multi-step reasoning

Schema linking

Safety validation

Error handling

Query planning

Database compression

Format restriction

Column exploration

Self-refinement

Parallel generation

Consensus voting

Workflow Steps

Enhanced SQL Agent (8 Steps)

  1. analyze_question - Understand intent and complexity

  2. schema_linking - Find relevant tables

  3. query_planning - Create execution plan

  4. generate_sql - Generate SQL query

  5. validate_safety - Security validation

  6. execute_query - Execute validated query

  7. handle_error - Error recovery (if needed)

  8. format_response - Format final answer

ReFoRCE SQL Agent (7 Steps)

  1. compress_database_info - Schema compression

  2. generate_format_restriction - Answer format specification

  3. explore_columns - Column discovery with feedback

  4. parallel_generation - Multiple SQL candidates

  5. self_refinement - Query improvement iterations

  6. consensus_voting - Majority-vote consensus

  7. final_validation - Result validation and formatting

Key Components

Specialized Agents

  • SchemaLinkingAgent - Intelligent table discovery

  • QueryPlannerAgent - Query complexity analysis and planning

  • SqlErrorHandler - Error recovery and correction

  • SqlSafetyValidator - Security and safety validation

Enhanced Tools

  • SqlQueryTool - Query execution with safety

  • SchemaAnalysisTool - Schema relationship analysis

  • QueryValidationTool - SQL syntax/semantic validation

  • TableSelectionTool - Relevant table suggestions

  • QueryOptimizationTool - Performance optimization

  • QueryExplanationTool - Natural language explanations

Safety Features

Security Validation

  • SQL injection prevention

  • Dangerous operation blocking

  • Resource limit enforcement

  • Access control validation

Error Classification

  • Syntax Errors - Invalid SQL syntax

  • Semantic Errors - Logical errors

  • Permission Errors - Access violations

  • Data Errors - Data-related issues

  • Performance Errors - Resource/timeout issues

Performance Features

ReFoRCE Optimizations

  • Database Compression - Reduce schema size for large databases

  • Parallel Processing - Generate multiple SQL candidates simultaneously

  • Self-Refinement - Iteratively improve query quality

  • Consensus Mechanism - Use majority vote for best results

Query Optimization

  • Index usage analysis

  • Join optimization suggestions

  • Performance estimation

  • Resource usage monitoring

Common Use Cases

Business Intelligence

# Revenue analysis
result = sql_agent.invoke({
    "messages": [{"role": "user", "content": 
        "Show monthly revenue trends for the last 12 months by product category"
    }]
})

Customer Analytics

# Customer segmentation
result = sql_agent.invoke({
    "messages": [{"role": "user", "content": 
        "Identify top 20% of customers by lifetime value and their characteristics"
    }]
})

Operational Reporting

# Performance metrics
result = sql_agent.invoke({
    "messages": [{"role": "user", "content": 
        "Calculate average order processing time by region for last quarter"
    }]
})

Troubleshooting

Common Issues

Issue

Cause

Solution

Schema loading error

DB connection/permissions

Check datasource configuration

Query timeout

Complex query/large dataset

Increase timeout or optimize query

Safety violation

Dangerous SQL operation

Review safety configuration

Memory issues

Large schema

Enable schema compression

Debug Mode

sql_agent = SqlAgent(
    model="gpt-4",
    datasource=datasource,
    use_enhanced_mode=True,
    debug=True,
    log_level="DEBUG"
)

Best Practices

Production Deployment

  1. Configure Safety Rules - Set appropriate safety policies

  2. Set Resource Limits - Prevent resource exhaustion

  3. Monitor Performance - Track query execution metrics

  4. Regular Testing - Test with your specific schema

  5. Access Control - Implement proper database permissions

Performance Optimization

  1. Use ReFoRCE Mode - For best performance and accuracy

  2. Enable Compression - For large database schemas

  3. Tune Parallel Threads - Based on your system capacity

  4. Monitor Resource Usage - Optimize based on metrics

Security Considerations

  1. Principle of Least Privilege - Minimal database permissions

  2. Input Validation - Validate all user inputs

  3. Audit Logging - Log all query executions

  4. Regular Updates - Keep system updated

API Reference

SqlAgent Methods

# Initialize agent
sql_agent = SqlAgent(model, datasource, **config)

# Process question
result = sql_agent.invoke(input_data, **kwargs)

# Stream responses
for chunk in sql_agent.stream(input_data, **kwargs):
    print(chunk)

# Safety configuration
sql_agent.enable_safety_rule("BLOCK_DELETE")
sql_agent.disable_safety_rule("LIMIT_JOINS")
sql_agent.set_safety_config(new_config)

# Schema analysis
analysis = sql_agent.analyze_schema(question)

# Query planning
plan = sql_agent.create_query_plan(question, context)

State Management

# Access workflow state
state = sql_agent.get_current_state()

# Check execution status
status = state.get("current_step")
errors = state.get("error_info")
results = state.get("execution_result")

Support and Resources

  • Documentation: Enhanced SQL Agent Architecture

  • Examples: See examples/ directory

  • Issues: Report bugs and feature requests

  • Contributing: See contribution guidelines

Version History

  • v1.0.0 - Initial Enhanced SQL Agent implementation

  • v1.1.0 - Added ReFoRCE optimizations

  • v1.2.0 - Improved safety validation and error handling