Text-to-SQL with Large Language Models: Exploring the Promise and Pitfalls

AUTHORS: Luca Sala, Giovanni Sullutrone, Sonia Bergamaschi

WORK PACKAGE: WP 5 – Digital Maktaba

URL: https://ceur-ws.org/Vol-3741/paper65.pdf

Keywords: Large Language Models, Text-to-SQL, Relational Databases, SQL

Abstract
The emergence of Large Language Models (LLMs) represents a fundamental change in the ever-evolving
field of natural language processing (NLP). Over the past few years, the enhanced capabilities of these
models have led to their widespread use across various fields, in both practical applications and research
contexts. In particular, as data science intersects with LLMs, new research opportunities and insights
emerge, notably in translating text into Structured Query Language (Text-to-SQL). The application of
this technology to such a task poses a unique set of opportunities and related issues that have significant
implications for information retrieval. This discussion paper delves into these intricacies and limitations,
focusing on challenges that jeopardise efficacy and reliability. This research investigates scalability,
accuracy, and the concerning issue of hallucinated responses, questioning the trustworthiness of LLMs.
Furthermore, we point out the limits of the current usage of test datasets created for research purposes
in capturing real-world complexities. Finally, we consider the performance of Text-to-SQL with LLMs
from different perspectives. Our investigation identifies the key challenges faced by LLMs and proposes
viable solutions to facilitate the exploitation of these models to advance data retrieval, bridging the gap
between academic research and real-world application scenarios.