Text-to-SQL, which enables natural language interaction with databases, serves as a pivotal method across diverse industries. With new, more powerful large language models (LLMs) emerging every few months, fine-tuning has become incredibly costly, labor-intensive, and error-prone. As an alternative, zero-shot Text-to-SQL, which leverages the growing knowledge and reasoning capabilities encoded in LLMs without task-specific fine-tuning, presents a promising and more challenging direction.
To address this challenge, we propose Alpha-SQL, a novel approach that leverages a Monte Carlo Tree Search (MCTS) framework to iteratively infer SQL construction actions based on partial SQL query states. To enhance the framework's reasoning capabilities, we introduce LLM-as-Action-Model, which dynamically generates SQL construction actions during the MCTS process, steering the search toward more promising SQL queries. Moreover, Alpha-SQL employs a self-supervised reward function to evaluate the quality of candidate SQL queries, ensuring more accurate and efficient query generation.
Experimental results show that Alpha-SQL achieves 69.7% execution accuracy on the BIRD development set, using a 32B open-source LLM without fine-tuning. Alpha-SQL outperforms the best previous zero-shot approach based on GPT-4o by 2.5% on the BIRD development set.
Alpha-SQL implements a novel framework that combines Monte Carlo Tree Search (MCTS) with Large Language Models (LLMs) for zero-shot Text-to-SQL generation. The implementation consists of three key components:
1. MCTS-based Search Framework: The system models SQL generation as a search problem, where each node represents a partial SQL query state and edges represent SQL construction actions. The MCTS process iteratively explores the search space to find optimal SQL queries through selection, expansion, simulation, and backpropagation phases.
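The four phases above can be sketched as a generic MCTS loop. This is a minimal illustration, not Alpha-SQL's actual implementation: the `propose_actions`, `apply_action`, `rollout`, and `is_terminal` callbacks are placeholders for the paper's SQL-specific states and actions, and UCB1 is used as a standard selection rule.

```python
import math
import random

class Node:
    """A search-tree node holding a partial SQL query state."""
    def __init__(self, state, parent=None):
        self.state = state
        self.parent = parent
        self.children = []
        self.visits = 0
        self.value = 0.0

    def ucb1(self, c=1.41):
        # Unvisited nodes are explored first.
        if self.visits == 0:
            return float("inf")
        return self.value / self.visits + c * math.sqrt(
            math.log(self.parent.visits) / self.visits)

def mcts(root, propose_actions, apply_action, rollout, is_terminal, iters=100):
    for _ in range(iters):
        # 1. Selection: descend via UCB1 until reaching a leaf.
        node = root
        while node.children:
            node = max(node.children, key=Node.ucb1)
        # 2. Expansion: grow the tree with proposed construction actions.
        if not is_terminal(node.state):
            for action in propose_actions(node.state):
                node.children.append(Node(apply_action(node.state, action), node))
            if node.children:
                node = random.choice(node.children)
        # 3. Simulation: score the (partial or complete) query state.
        reward = rollout(node.state)
        # 4. Backpropagation: update statistics along the path to the root.
        while node is not None:
            node.visits += 1
            node.value += reward
            node = node.parent
    # Return the most-visited child as the preferred first action.
    return max(root.children, key=lambda n: n.visits) if root.children else root
```

In Alpha-SQL's setting, each edge would correspond to one SQL construction action and each rollout would complete and score a candidate query.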
2. LLM-as-Action-Model: To enhance reasoning capabilities, the LLM dynamically proposes SQL construction actions during search. We define seven distinct reasoning actions for this purpose, detailed in our paper.
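The action-model interface can be sketched as follows. This is a hypothetical illustration of the idea: `call_llm` and the prompt format are assumptions, and the seven concrete reasoning actions are defined in the paper, not here. The LLM is shown the question, schema, and partial reasoning state, and each sampled continuation becomes a candidate edge in the search tree.

```python
def propose_actions(question, schema, partial_state, call_llm, n_candidates=3):
    """Ask the LLM for candidate next SQL-construction steps.

    `call_llm` is an assumed callable (prompt, temperature) -> str;
    plug in any chat-completion client here.
    """
    prompt = (
        f"Database schema:\n{schema}\n"
        f"Question: {question}\n"
        f"Reasoning so far:\n{partial_state}\n"
        "Propose the next SQL-construction step."
    )
    # Sample several continuations at nonzero temperature so the
    # tree can branch over diverse candidate actions.
    return [call_llm(prompt, temperature=0.8) for _ in range(n_candidates)]
```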
3. Self-Supervised Reward Function: The system employs a self-consistency based reward mechanism that evaluates SQL queries by comparing execution results across multiple sampled queries. This approach ensures reliable query generation without requiring annotated data.
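A self-consistency reward of this kind can be sketched as below, assuming a SQLite database for illustration: each candidate query is executed, and a candidate's reward is the fraction of sampled candidates whose execution result agrees with it. Invalid queries receive zero reward. The exact reward design used by Alpha-SQL is described in the paper.

```python
import sqlite3
from collections import Counter

def execution_result(conn, sql):
    """Execute a query and return a canonical, order-insensitive result key."""
    try:
        return tuple(sorted(map(tuple, conn.execute(sql).fetchall())))
    except sqlite3.Error:
        return None  # invalid SQL gets no votes

def self_consistency_rewards(conn, candidates):
    """Reward each candidate by how many peers produce the same result."""
    results = [execution_result(conn, sql) for sql in candidates]
    votes = Counter(r for r in results if r is not None)
    n = len(candidates)
    return [votes[r] / n if r is not None else 0.0 for r in results]
```

Note that agreement is measured on execution results rather than SQL strings, so syntactically different but semantically equivalent queries vote for each other.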
Please refer to our paper for more details.
We conduct extensive experiments to evaluate Alpha-SQL's performance on both BIRD and Spider datasets. Our experiments demonstrate the effectiveness of our approach in zero-shot Text-to-SQL tasks.
Performance on BIRD Dataset: Alpha-SQL achieves 69.7% execution accuracy on the BIRD development set using a 32B open-source LLM without fine-tuning, outperforming the best previous zero-shot approach based on GPT-4o by 2.5%.
Performance on Spider Dataset: Alpha-SQL with Qwen2.5-Coder-14B outperforms existing methods, achieving a 2.1% improvement over SFT Coder-15B, which was specifically fine-tuned for the Spider dataset.
We conduct ablation studies to validate the effectiveness of our reasoning actions. We also evaluate Alpha-SQL with a model of only 7B parameters.
Please refer to our paper for more detailed experimental results and analysis.
@inproceedings{alpha-sql,
author = {Boyan Li and
Jiayi Zhang and
Ju Fan and
Yanwei Xu and
Chong Chen and
Nan Tang and
Yuyu Luo},
title = {Alpha-SQL: Zero-Shot Text-to-SQL using Monte Carlo Tree Search},
booktitle = {Forty-Second International Conference on Machine Learning, {ICML} 2025,
Vancouver, Canada, July 13-19, 2025},
publisher = {OpenReview.net},
year = {2025}
}