SQL at 50: A Lesson in How to Stay Relevant Around Data

By Dave Stokes

Structured Query Language (SQL) is now 50 years old. The original paper for SQL (then called SEQUEL) was published in May 1974 by Raymond Boyce and Donald Chamberlin, and provided a guide for data manipulation based on a set of simple commands. Today, we take SQL for granted around data – it is still among the most popular languages used by professional programmers, and it can support innovations from geographic data through to JSON and generative AI.

But why is SQL still around? Why does this language, with its idiosyncrasies, quirks, and challenging syntax, still play such a central role in how we work around data today?  

Successful SQL

To start with, we have to look at why SQL succeeded in its mission around data management and manipulation. At the time, data was a precious commodity, but using it effectively was hard. The standardization that SQL provided – and still provides today – just did not exist. SQL’s approach – while it might look clunky to a modern developer audience that is used to low-code and graphical user interfaces – was an effective and understandable way to interact with data. SQL is also grounded in solid mathematics: Codd’s relational model, with its relational algebra and calculus. This is what made it so effective in handling data to start with, and those solid underpinnings have kept it reliable to this very day.

The second reason is that SQL was implemented alongside the development and release of relational databases. Language and technology complement each other extremely well and support businesses in using data as part of their overall processes. For instance, SQL was the first programming language to return multiple rows per single request, allowing users to get data sets that matched specific parameters in one go. This made reporting and analysis easier, as well as batch processing of order requests. Alongside this, companies could segregate their customer records into one table and hold manufacturing data for those customers in another table. This meant that companies could compartmentalize their data for specific tasks into tables, but also bring data together around specific customer orders when they needed to in an easy and repeatable way.
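The pattern described above – one table per concern, joined on demand, with many matching rows returned from a single request – can be sketched in a few lines. This is a minimal illustration using SQLite through Python’s standard library; the customer and order table names and columns are invented for the example, not taken from any particular system:

```python
import sqlite3

# In-memory database: customer records in one table, order data in another.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY,
                         customer_id INTEGER REFERENCES customers(id),
                         item TEXT);
    INSERT INTO customers VALUES (1, 'Acme Ltd'), (2, 'Globex');
    INSERT INTO orders VALUES (10, 1, 'widget'), (11, 1, 'gear'),
                              (12, 2, 'sprocket');
""")

# One declarative request brings the compartmentalized data back
# together and returns every matching row at once.
rows = conn.execute("""
    SELECT c.name, o.item
    FROM customers AS c
    JOIN orders AS o ON o.customer_id = c.id
    WHERE c.name = 'Acme Ltd'
    ORDER BY o.id
""").fetchall()

print(rows)  # [('Acme Ltd', 'widget'), ('Acme Ltd', 'gear')]
```

The caller never spells out how to walk the tables or match the rows; the query states what is wanted, and the engine works out the rest – which is exactly the property the article credits for SQL’s early success.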

The third reason for success is that SQL has not stood still. Over time, SQL has added support for new and necessary technologies that have entered the market. In the past, this included adding GIS (geographic information system) data support, so that users could query and interact with location data in the same ways that they did other forms of data. Similarly, SQL added support for Extensible Markup Language (XML) so that users could query other data defined by XML using SQL rather than having to manage this interaction separately. 

More recently, SQL has added support for data held in JSON documents, as these grew in popularity with developers, and can now support vector data searches too. The most recent release of the SQL standard was published in 2023, adding new features for interaction with property graphs as well as scalar and aggregate functions like GREATEST, LEAST, RPAD, LPAD, RTRIM, LTRIM, and ANY_VALUE. What can we learn from this? It is that, like any technology, SQL has kept up with the times and supported users around what they need now, not just in the past.
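The JSON support mentioned above means document data can be filtered with ordinary WHERE clauses rather than a separate query layer. Here is a small sketch using SQLite’s built-in JSON functions via Python (the `events` table and its fields are illustrative; note that the function names vary by engine – `json_extract` is SQLite’s spelling, while the standard and other databases use forms such as `JSON_VALUE`):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, payload TEXT)")
conn.execute(
    "INSERT INTO events VALUES (1, ?), (2, ?)",
    ('{"user": "ada", "action": "login"}',
     '{"user": "grace", "action": "logout"}'),
)

# json_extract pulls a field out of the stored JSON document, so the
# document data participates in a normal SQL predicate and projection.
user = conn.execute("""
    SELECT json_extract(payload, '$.user')
    FROM events
    WHERE json_extract(payload, '$.action') = 'login'
""").fetchone()[0]

print(user)  # ada
```

The point is not the specific function but the pattern: rather than bolting on a second query language for documents, SQL absorbed JSON the way it earlier absorbed XML and GIS data.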

SQL and the Future

Part of the reason why SQL succeeded at the start was that it standardized data management at a time when that was sorely needed. It made it easier for everyone to consolidate around data and how it could be used. However, SQL remains difficult to learn and write for many individuals. It should come as no surprise that other approaches have been tried. As an example, projects that tried to apply natural language search rather than SQL’s standardized command structure promised to make query design and interaction with data easier. However, those attempts failed, either because they were too cumbersome or because they did not perform to a satisfactory level.

Today, generative AI holds some promise in making it easier to create SQL instructions based on models that are trained using huge volumes of SQL examples. Why write SQL when you can ask ChatGPT to write it for you, based on exposure to more SQL expressions than a single person could ever hope to read and understand? There is potential here to help developers work with SQL more effectively, but we also have to learn the lessons of previous attempts. Rather than focusing on the “what” – more SQL expressions created, more quickly – we have to think about the “why” and the “what happens when things go wrong” too.

SQL is an effective means to manipulate data. However, that data itself will continue to grow, and the reasons why we need to manipulate data will evolve too. Using tools to create SQL for us will help developers be more effective, but they will have to understand why things work in the way they do to get the most out of data. Knowing why a specific SQL request works depends on understanding the business logic that the request should meet, as well as the logic of the request itself. This understanding can help you sort through your SQL drafts and prevent problems due to faults in that logic, rather than relying on the AI-generated instructions alone. 

Understanding SQL will continue to be an essential skill for developers, even if you do not write the code yourself. Without this understanding, you can end up spending more time hunting for why things are not providing the right answers, why things cost so much to run, or why they suddenly stop working. This experience will be invaluable in the future. At the same time, SQL will continue to hold its place as the standard for data manipulation and interaction, even if it moves behind the scenes more than it is today.

So, happy birthday, SQL. Long may you continue to help us all achieve what we want around data – and deliver what our businesses need.