MSc Thesis Presentation - Ehsan Soltan Aghai
Name: Ehsan Soltan Aghai
Date: Monday, April 14th, 2025
Time: 10:30 am
Location: Room ICCS 304
Supervisor: Prof. Rachel Pottinger
Title: Workload-Aware SQL Query Recommendation Using Retrieval-Augmented Generation
Abstract:
We propose a retrieval-augmented generation framework for recommending full SQL queries in workload-driven environments. The system is designed to assist non-expert users in composing correct and effective SQL queries by leveraging patterns found in historical query logs. At the core of our method is a dual-encoder model trained to capture both semantic similarity and structural transitions between consecutive queries. The model is further enhanced with transition classification, which allows the system to model how queries evolve during a session. At inference time, the system retrieves contextually relevant query templates based on the user’s current query and generates complete, executable SQL statements. This combination of retrieval and generation enables the system to generalize across query patterns while maintaining logical and syntactic correctness. Compared to traditional collaborative-filtering or content-based recommenders, our approach provides more personalized, context-aware, and interpretable suggestions. We show that it improves the overall usability of SQL and supports more efficient data exploration for users with limited SQL expertise.
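
As a rough illustration of the retrieval step described in the abstract, the sketch below ranks templates of queries that followed similar queries in a historical log. It is not the thesis implementation: a bag-of-tokens cosine similarity stands in for the learned dual encoder, the generation step is omitted, and all names (`to_template`, `embed`, `recommend_templates`, `SESSIONS`) and the sample log are hypothetical.

```python
import math
import re
from collections import Counter


def to_template(sql: str) -> str:
    """Replace string and numeric literals with '?' to form a query template."""
    sql = re.sub(r"'[^']*'", "?", sql)
    sql = re.sub(r"\b\d+(\.\d+)?\b", "?", sql)
    return " ".join(sql.split())


def embed(sql: str) -> Counter:
    """Toy stand-in for a learned encoder: a bag of lowercase word tokens."""
    return Counter(re.findall(r"[a-z_]+", sql.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse token-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def recommend_templates(current_query: str, sessions: list, k: int = 2) -> list:
    """Return up to k templates of queries that followed log queries
    similar to the current one, ordered by similarity of the preceding query."""
    current = embed(current_query)
    best = {}
    for session in sessions:
        # Each consecutive pair (previous, next) is one observed transition.
        for prev_q, next_q in zip(session, session[1:]):
            score = cosine(current, embed(prev_q))
            template = to_template(next_q)
            best[template] = max(score, best.get(template, 0.0))
    ranked = sorted(best.items(), key=lambda kv: kv[1], reverse=True)
    return [template for template, _ in ranked[:k]]


# Hypothetical query log with two sessions (assumed data, for illustration only).
SESSIONS = [
    [
        "SELECT name FROM stars WHERE mag < 5",
        "SELECT name, ra, dec FROM stars WHERE mag < 5 ORDER BY mag",
    ],
    [
        "SELECT id FROM orders WHERE total > 100",
        "SELECT COUNT(*) FROM orders WHERE total > 100",
    ],
]

if __name__ == "__main__":
    print(recommend_templates("SELECT name FROM stars WHERE mag < 7", SESSIONS, k=1))
```

In a full system along the lines of the abstract, the retrieved templates would be passed to a generator that fills in the placeholders to produce an executable statement; that step is left out here.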