Why you should study SQL first (even if it someday becomes obsolete)
The hidden language behind SQL
Welcome to the latest issue of the data patterns newsletter. If this is your first one, you’ll find all previous issues in the archive.
In this edition we’ll talk about a hidden language behind SQL and why I think you should study SQL first. Then, in the SQL Pattern of the Week (SQLPOW) section, I’ll share some articles I wrote.
Recently one of my tweets about learning SQL first exploded.
People seemed to agree that learning SQL first is a good idea, but why?
What makes SQL such a good first tool for data analysis and transformation? Isn’t it really old? What if it becomes obsolete?
First of all, let’s get this out of the way: SQL is NOT becoming obsolete anytime soon. In fact, quite the opposite.
Many technologies that originally didn’t support SQL (like Hadoop) eventually ended up creating a SQL interface through various query engines.
SQL has been around since the 1970s and has acquired a bit of a Lindy effect: the longer something has been around, the longer it’s expected to stay around.
But there’s a subtler point that few have ever noticed.
Even if SQL were to become obsolete, there’s a set of skills it teaches you that extend beyond the language itself.
I only noticed this when I tried switching tools for a project.
I was trying to use Knime to do some no-code data analysis, and I kept looking for ways to do the same things I could do in SQL (aggregate, filter, pivot, append, etc.).
This made me realize there’s a hidden language of data transformation that’s baked into all the popular tools. I’m not sure if it’s a formal language (like relational algebra) but it feels intuitive to me.
All the modern data tools (SQL, R, Python with Pandas, Julia, etc.) implement this language, so it’s transferable across them.
This is the real skill!
So if you learn how to do something in SQL, there’s very likely a way to do the same thing in R, Python, Alteryx, Knime, etc. Therefore, by learning SQL first, you learn a skill for life, and SQL is the easiest of them to learn (I’m biased, of course).
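To make the idea concrete, here is a minimal sketch of the same "filter, then aggregate" transformation written twice: once as a SQL query (run through Python's built-in sqlite3 module) and once as plain Python. The orders table, its columns, the sample rows and the function names are all hypothetical, made up purely for illustration.

```python
import sqlite3
from collections import defaultdict

# Hypothetical sample data: (customer, region, amount)
orders = [
    ("alice", "east", 120.0),
    ("bob", "west", 80.0),
    ("alice", "east", 30.0),
    ("carol", "west", 200.0),
    ("bob", "west", 15.0),
]


def totals_with_sql(rows, min_amount):
    """Filter then aggregate, expressed in SQL on an in-memory SQLite table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (customer TEXT, region TEXT, amount REAL)")
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)
    result = conn.execute(
        """
        SELECT region, SUM(amount) AS total
        FROM orders
        WHERE amount >= ?
        GROUP BY region
        ORDER BY region
        """,
        (min_amount,),
    ).fetchall()
    conn.close()
    return result


def totals_with_python(rows, min_amount):
    """The same filter and aggregate steps, expressed as a plain Python loop."""
    totals = defaultdict(float)
    for _customer, region, amount in rows:
        if amount >= min_amount:       # filter (WHERE)
            totals[region] += amount   # aggregate (GROUP BY + SUM)
    return sorted(totals.items())      # sort by region (ORDER BY)


sql_result = totals_with_sql(orders, 20)
py_result = totals_with_python(orders, 20)
print(sql_result)
print(py_result)
```

Both versions name the same three steps (filter, group, sum); only the syntax differs. That shared vocabulary is what carries over when you move between tools.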
SQL Pattern of the Week (SQLPOW)
For those of you who are interested in studying advanced SQL patterns, I’ve written two articles on my blog.
How to apply modularity to SQL: In this article I break down some of the patterns you need to learn to write production-ready, modular SQL code that makes your queries easy to read, understand and maintain. It covers three core principles: DRY, SRP and moving logic upstream. Read this before reading the second one.
Refactoring SQL - Level 1: In this article I go through an example of taking a working SQL query and transforming it so the result is the same, but the query is easier to read and maintain, and perhaps a tad more efficient. It applies the three principles from the first article, so read that one first.
I’m working on a new query refactoring article that looks at actual running code (a dbt repo I found on GitHub) and gives ideas on how to improve it.
Make sure you subscribe and stay tuned for that when it comes out.
Until next time.