DEV Community

Cover image for 10 Things I Hate About SQL
Shaker Islam
Shaker Islam

Posted on

10 Things I Hate About SQL

Let’s clear the air quick — SQL is awesome. It’s the most consistently used query language for well over two decades. There’s a ton of love for SQL from thousands of analysts and developers around the world. And honestly, I probably wouldn’t even have a job if it weren’t for SQL simply based on how ubiquitous it is and how much it helps building products.

So before getting into my rant below, I want to say THANK YOU to the people who’ve worked on SQL and SQL systems. You all must have moved mountains to make it all happen, and I deeply appreciate it.

</appreciation>

<rant>

All that said… SQL can sometimes SUCK. Here’s a bunch of reasons why.

1. SQL Syntax is WAY too strict

Can you spot the errors in the following query?

cool_films as (
    select
        title,
        "Description: " + description,
        length,
        release_year,
    from film
    where description like '%cool%'
    where title like '%cool%'
),
select * from cool_films
Enter fullscreen mode Exit fullscreen mode

How many did you find? Because there are five errors!

  • There’s a missing with at the beginning for the common table expression
  • Trailing comma after release_year,
  • … and another trailing comma after the closing parens ),
  • Double quotes don’t refer to strings, they refer to identifiers (usually column identifiers)
  • You can’t use where twice; you have to use and if you want to specify more filtering conditions

I guarantee you most people who write SQL for work run into at least one of these errors once a week or more. These errors are incredibly frustrating and they pile onto what is already a high learning curve for SQL.

These syntax rules also don’t add much. Is the above SQL that difficult to read? Or even difficult to parse before executing? No! Let me run my query!

2. UPPERCASE IS JUST GREAT, IT’S SUPER CLEAR

IF YOU’RE LOOKING AT QUERY LIKE THIS:

SELECT
    something
FROM a_table
WHERE im_not_yelled_at
ORDER BY my_sanity ASC
Enter fullscreen mode Exit fullscreen mode

DO YOU REALLY FIND THIS MORE READABLE THAN:

select
    something
from a_table
where im_not_yelled_at
order by my_sanity asc
Enter fullscreen mode Exit fullscreen mode

WHO WANTS THEIR QUERIES TO YELL AT THEM?
HAVE I MADE MY POINT YET??

3. Aliases don’t work after the Select block

Suppose we have the following query:

select
    title,
    description,
    (publish_year / 100) as century
from books
where century between 16 and 19
Enter fullscreen mode Exit fullscreen mode

This will return an error because the alias century can’t be used in the where clause. Some SQL engines will allow you to use having here instead, but it’s not standard SQL.

The historical reason for this is that everything in the select block is supposed to happen last — meaning the SQL engine will process the line where century like between 16 and 19 first, and it’s going to be like, “huh, I haven’t seen this century before!”

I’m no expert on building SQL engines, but I don’t really understand why this has to be a rule. Is there a logic error or such a big performance hit that it’s impossible for SQL engines to look for aliases in the select clause before executing the where clause? Is doing a quick scan of some text truly that hard to do?

4. Select statements are not dynamic

When I’m iterating and in the middle of working on a query, I often want to do something like this:

with
something_i_copy_pasted (
    select
        ... 50 different columns
    from a_big_table
),

-- probably many other CTEs with complicated joins...

select
    *, except column "foo", and format column "bar" better
from something_i_copy_pasted
Enter fullscreen mode Exit fullscreen mode

In these cases, I end up having to copy/paste the select statements across multiple CTEs and modify them slightly. This ends up being a tedious task when I really want to change something small and see what the results look like.

5. Sharing queries & code is very difficult

The oft recommended way to share SQL query snippets is through Views. I’m sure some teams use Views correctly and don’t run into issues, but I haven’t seen one.

My main issue with Views is that they can’t take input variables. For instance, it’d be amazing to have a View where you can modify a specific where condition or change/remove a join, or be able to slot in a CTE somewhere. These types of adjustments are nearly impossible without some very complicated Stored Procedures, which are yet another common source of issues.

Unlike a programming language where I can often copy/paste things from Stack Overflow or use a package manager and utilize a library, in SQL I can do none of those things. There are no SQL “packages” that just do a cross join unnest(x), or do some gnarly window functions. Above all, there’s no built-in support for a collaborative codebase — most teams have to roll their own version of shared SQL snippets in Github. All this sucks!

6. No variables in ANSI SQL

Yes yes, variables exist in some SQL dialects, but they’re not standard.

Here’s an example where variables could be helpful:

select
    title,
    description,
    (publish_year / 100) as century,
    case
        when century < 0 then (century * -1) || ' B.C.'
        when century > 0 then century || ' A.D.'
    as formatted_century
from books
Enter fullscreen mode Exit fullscreen mode

Alas, aliases can’t be reused! This is another situation where I need CTE, or write some really ugly SQL. In my opinion, both are less than ideal.

7. Bad, cryptic errors

Every time I get an error from my SQL query, I want to put my head through a wall. Often times, the errors should be simple or easy to fix, but SQL error messages give little information on how.

Here’s a forced example:

select
    first_name
from actor a
join actor b on a.actor_id = b.actor_id

#=> Query Error: error: column reference "first_name" is ambiguous
Enter fullscreen mode Exit fullscreen mode

The error is clear on what’s wrong — there are two columns called first_name, and the engine doesn’t know which one you want. Makes sense! But would it be SO HARD to tell me how to fix this simple error?

What I really want is this:

select
    first_name
from actor a
join actor b on a.actor_id = b.actor_id

#=> Query Error: Column reference "first_name" is ambiguous. Available columns are a.first_name, b.first_name.
Enter fullscreen mode Exit fullscreen mode

Easy peasy! What about misspellings?

select
    first_namee
from actor a

#=> Query Error: error: column "first_namee" does not exist

-- nope! this is what i want:

select
    first_namee
from actor a

#=> Query Error: Column "first_namee" does not exist. Did you mean "first_name"?
Enter fullscreen mode Exit fullscreen mode

Did someone say “syntax error”?

select
    first_name
from actor a
order by last_name
where first_name = 'Brad'

#=> Query Error: error: syntax error at or near "where"

-- "where" do i begin? no thanks! just tell me what the problem is:

select
    first_name
from actor a
order by last_name
where first_name = 'Brad'

#=> Query Error: Syntax error at or near "where". "where" should come before "order by"
Enter fullscreen mode Exit fullscreen mode

I could go on and on.

There are many errors to sift through, and I’m sure some have ambiguous or unclear solutions, and I know some of you will shout “use an IDE!”, etc. etc.

(side note: most SQL IDE’s are terrible, but I digress…)

If SQL engines returned with helpful error messages in the first place, no one would need a workaround, or bug their coworkers, or furiously google for fixable issues. Better error messages would be the single biggest UX improvement of SQL that I can think of.

8. SQL is impossible to debug

In many programming languages, there’s built-in support for a dropping debug statements, stopping execution, and being able to poke around and see what your program is doing. This is immensely helpful whether you’re building something new or debugging a problem in your code.

In SQL, there’s no equivalent. Beginners usually don’t know what to do when they’re faced by query that “isn’t working” or — worse yet — a query that silently gives unexpected results and you didn’t know.

Over time, you build an intuition on how to manually “debug” a query, doing things like:

  • Starting with a limit 1 or where id = '123' to limit your results and iterate quickly
  • Doing a left join other on a = other.b where a = null to surface join issues
  • Running CTEs one by one so you know how the query is “building” from one CTE to the next

I’m sure there’s other methods of “debugging” out there. But these feel like workarounds to me. What I really want is a way to “step through” my query and see how the SQL engine is processing it, and what types of results it gets along the way.

Believe me, I get that building a debugger into a SQL engine would be ridiculously difficult. You’re usually not running your query directly on the database server — you’re sending your query over the network to be validated, parsed, and executed, often on huge datasets. You can’t quite “peek” into how a query is getting executed like you could in a programming language.

… unless there’s a genius out there who can figure it out! I don’t know what’s actually possible, but any progress here would be an immense help.

9. Locked into relational algebra in ANSI SQL

I have to give credit where credit’s due — relational algebra is great, and it’s really useful for a query language to describe how to slice and dice data.

Sometimes, though, I just want a for loop. Just let me drop into a lambda and modify some things here and there. In a similar vein to my other complaints, functions/lambdas do exist in some SQL dialects, but they’re not part of ANSI SQL.

As a result, when you’re locked into relational algebra, it can often feel like a square peg in a round hole. Funky joins, compounding window functions, complex sub-selects — all of which are valid solutions that can, at the same time, feel very “advanced” and sometimes even hacky, even for a seasoned SQL user.

In reality, I want SQL to be as easy as Excel, or as smooth as dropping into a REPL to dig around what’s possible. Users aren’t going to get their queries right from step one, and the nature of relational algebra is that you’re going to get your queries wrong A LOT before you get them right. It’s so demotivating to go through it, and in reality it doesn’t have to be this way.

10. SQL implementations are inconsistent

Of course, the most common complaint of all.

With PostgreSQL, MySQL, Spark SQL, Presto SQL, SQL Server, Oracle SQL, and probably many others… we just can’t get a break! Each of these SQL implementations have slight idiosyncrasies that make it very painful to transfer queries from one system to another.

This issue truly rears its head when you want to use Google or Stack Overflow for solutions. You can’t always search for “how do I do X in SQL”, because some of the answers may be in a specific SQL dialect and won’t help you. For someone new to SQL, this is brutal, and it can take a long time to resolve simple issues.

I don’t know what the root problem here is. There is indeed an accepted SQL Standard, but nonetheless there are many deviations from it. Perhaps if there was recognized implementation of the SQL Standard as a standalone SQL Parser library that any application can utilize, then the industry could converge on the standard naturally.

But alas, here we are.


My hope is that at least some of this resonated with you. Maybe if enough people feel strongly about what can improve, then SQL really will improve.

We spend so much time about the UX of web and mobile products, but not enough, IMO, on developer or analyst products. One day, though, one day.

Thanks for listening to my TED talk.

Top comments (15)

Collapse
 
geraldew profile image
geraldew

Ah, nothing quite like a rant to air some heartfelt observations. I approve of the honesty you've expressed here so I'll try to merely give views based on my experience. I'll apologise ahead if I come across as too "teachy".

With a list of ten things there, I'll use a series of response posts wherever I think I have something to say rather than attempting it all in one go.

So, #1 SQL Syntax

I think it's fair to say that the syntax of SQL is a bit of a mess. It's clearly just evolved and meandered over the years and we're probably lucky that it has not become utterly dreadful.

That said:

  • I can't think of any languages that tolerate lists with a dangling comma - with this applying to both your examples;
  • the usage of single and double quotes took some time to evolve into the semi-standard of now. Their usage varies a lot among programming languages so there's no single comparison to make with all those. I still deal daily with two dialects with contrasting approaches (more on this later). In quite a few dialects they were interchangeable. The main need for them comes from two issues:
    • not really having reserved words - that would disallow a column named "from" or "table" - therefore the quotes let you be clear that you didn't mean the reserved word;
    • allowing for column/table names with spaces in them (me, I would have just said No to that, but once any of them allowed it then the others would all need to as well)* ;
  • regarding multiple WHERE clauses I'm guessing you think it's obvious that having more than would be an AND combination. From where I sit, neither AND nor OR are more obvious than the other, so I'm happy we have to be clear about which is being used. As we'll see later, WHERE has a more distinct meaning than just limiting things.

Expressions of angst are one thing, but declaring features you don't (yet) like as "errors" is overreach when you're not showing how you'll avoid inconsistencies. For example you don't indicate which interpretation you think should be made for your dangling commas. I can easily imagine two versions of you with different obvious opinions:

  • In one, the dangling comma should just be ignored as if you didn't mean it;
  • In another, it obviously implies a NULL expression as an extra column.

The funny thing, in teaching many people SQL over the years it is the latter that I've had instructees suggest as obvious - often when the SELECT was just for supplying an INSERT - and who then disliked the bother of doing CAST( NULL AS INTEGER) such as is often required by various parsers or engines.

As for "syntax rules also don’t add much" I would mainly say - they add clarity (to a degree anyway). When you work with millions (and even billions) of rows, that clarity is more than just handy.

p.s. actually there are some gnarlier issues that you may not have encountered yet. I have a habit of putting CASE expressions inside their own brackets () because some dialects can't parse them when nested inside more expressions e.g. 3 + ( CASE WHEN col = 'A' THEN 1 ELSE 0 END )

  • * a curious reality about SQL is that it's not unusual to be using dialect A to deal with tables/columns constructed with a dialect B. While the early waves of relational databases locked their tables within their own engines, soon APIs and drivers enabled engines to access each other's tables. This isn't just a problem about databases, one runs into similar problems with file systems - a reason to use Linux to fix Windows systems is to avoid the latter having some impossible filenames (a whole other topic that).
Collapse
 
shaqq profile image
Shaker Islam

Hi @geraldew - thanks for the detailed responses! I'll try to answer them one by one, but I want to mention that I appreciate the conversation!

Ok here goes

I can't think of any languages that tolerate lists with a dangling comma - with this applying to both your examples

Are you talking about Query languages, specifically? Because this is a feature in many modern programming languages, including Ruby & Python

the usage of single and double quotes took some time to evolve into the semi-standard of now. Their usage varies a lot among programming languages so there's no single comparison to make with all those. I still deal daily with two dialects with contrasting approaches (more on this later). In quite a few dialects they were interchangeable. The main need for them comes from two issues:

I agree there is a need to specify identifiers, and I do believe that backticks would be much more useful than double quotes. Theoretically this type of feature could be behind a feature-flag to preserve backwards compat.

Expressions of angst are one thing, but declaring features you don't (yet) like as "errors" is overreach when you're not showing how you'll avoid inconsistencies. For example you don't indicate which interpretation you think should be made for your dangling commas. I can easily imagine two versions of you with different obvious opinions:

The fix here would be to ignore everything past the dangling comma. This is the expected behavior in many modern programming languages. Expecting a dangling comma to have a null in either a CTE or select block seems very strange and unexpected.

As for "syntax rules also don’t add much" I would mainly say - they add clarity (to a degree anyway). When you work with millions (and even billions) of rows, that clarity is more than just handy.

I do believe there's a tradeoff between clarity and UX. In this case, I believe the tradeoffs for better usability are worth it.

Collapse
 
geraldew profile image
geraldew

On to #3 Aliases don’t work after the Select block

This is very much a matter of specific dialects - therefore of their engines and parsers.
For some, what you have there would be allowed and would do what you expect, but for others not.

In my current workplace:

  • for Teradata SQL that would work
  • for HiveQL it would not.

For HiveQL the problem seems to me to be the parser, as it's quite trivial to wrap the SELECT as a derived table and then do same WHERE clause. When used with the Hive-on-Spark engine, I'm fairly sure that the underlying engine does it in the one pass.

Which is to say that often the rant is about the implementation not the SQL language (but once again, more on this later).

Interestingly, your comment that "Some SQL engines will allow you to use having here instead" - that's not something I've encountered so I'll assume you're correct. I've only ever used HAVING with a GROUP BY.

  • (aside: gosh aren't these uppercase keywords handy when talking _about_SQL ?)

Maybe this is a good point to bring in that modern SQL uses at least three variants of the WHERE clause. There's:

  • WHERE
  • HAVING
  • QUALIFY

where each applies to a separate layer/round of under-the-hood processing. The overview being:

  • WHERE - applies first, filtering rows
  • HAVING - applies next, after the aggregations, typically for the rollups of a GROUP BY
  • QUALIFY - applies last, after the OLAP/analytical/windowing functions have been derived

Note: I'm deliberately leaving out the use of WHERE as a de-facto JOIN controller. That's a topic really about JOINs and has even more issues about dialect, parser and engine variations.

Again, dialects vary: in HiveQL you get the windowing functions but not the QUALIFY clause forcing another round of a table-within-a-table.

  • * I use the old fashioned term of "derived table" for the replacement of a FROM clause with an inner SELECT. The modern trend is to do these using a WITH clause instead. There's much more to it than that, so that's yet another topic.

p.s. predating the use of AS for aliases, some dialects had a (NAMED) mechanism. Strangely these applied at the text level before any real SQL parsing. This would let you specify a name for something inside an expression and re-use it elsewhere in the query - thus only typing it once and no risk of typos in the additional uses. I know dialects where the AS also works like a copy-paste "macro" rather than indicating something should be calculated once at run-time and then re-used rather than independently re-derived. The irony is that by now most engines have optimisers that will spot the repeated derivations regardless of whether they have the same alias or not.

Collapse
 
shaqq profile image
Shaker Islam

Thanks for the detailed explanation, though I believe we agree that this still hearkens back to #10.

Collapse
 
geraldew profile image
geraldew

Number 10 SQL implementations are inconsistent

I've jumped to this one because a couple of my other responses alluded to it.

In a rough sense, you're completely right on this point. The variations are maddening, and you can be years into using a dialect before realising that you've been presuming it worked the same as some other one before finding there was a slight difference.

However, from my position, where I've been watching some of these evolve over the decades, I can say I'm glad things never got set in stone by the "standard". While there have been inevitable dead ends, by and large it has been the variations and vendor experiments that have let SQL somehow stay SQL and yet gather features that we mostly find useful.

The caveat though, is that SQL has always been the QWERTY of relational scripting work. You couldn't call it a well designed "language" because it simply isn't. As you point out elsewhere, its flaws are such that there's no clear path to "fix" it.

It doesn't even really do "relational" all that well - you'll find criticisms of that kind going a long way back if you search for them. As dialects and engines evolve with new features such as table expressions, it's tempting to say they're drifting even further from being "relational". There's always a lot to argue and discuss.

When I'm teaching colleagues new to it, my first words are usually "SQL is crap language". Like the "standard" keyboard, we all know we should have something better but we persevere with what we have. I don't really see any way of defending that (and the matter of alternative relational languages is yet another topic).

Collapse
 
shaqq profile image
Shaker Islam

However, from my position, where I've been watching some of these evolve over the decades, I can say I'm glad things never got set in stone by the "standard". While there have been inevitable dead ends, by and large it has been the variations and vendor experiments that have let SQL somehow stay SQL and yet gather features that we mostly find useful.

With respect, I can't disagree with this more. Drifting SQL dialects is a massive pain point, one also felt by Java on the JVM, and by browsers in the early days. In the grand scheme of things this really hurts the users and creates a sense of vendor lock-in that isn't necessary.

Collapse
 
geraldew profile image
geraldew

Next, #4. Select statements are not dynamic

Certainly some data engines provide features just like this, so I guess your point is that these have not migrated into such standard use as to be in them all, and to then not be handled with a language standard.

For example, Teradata has two of these:

  • Temporary tables, whose existence only lasts with the session. Disconnect and the table vanishes.
  • Global Temporary tables, whose definition survives across sessions but the data in them vanishes with the sessions.

Neither is the exactly the same as what you want. And all present some form of management issue when what you're writing needs to shift from exploratory code to being a production script of data actions in sequence to run again and again.

e.g. To redo a temporary within a session you still have to do a DROP TABLE to get rid of it before recreating it differently.

Collapse
 
shaqq profile image
Shaker Islam

See complaint #10 😄

It doesn't help if some SQL dialect, somewhere, has a feature. I need it for all of them!

Collapse
 
geraldew profile image
geraldew

Next #2 UPPERCASE

Ah, the classic language argument - and one with no winners.

Except of course, this is not a requirement at all with SQL, it's just a common style. Do what you like, your code will work.

Personally I like it, but then I date from programming in Modula-2 where it was mandatory as well as a time before colour displays and syntax highlighting. In the green-on-black screens and crude dot matrix printouts, the UPPERCASE words stood out nicely.

Maybe that colours my view, but it means with your examples, yes, the one with uppercase keywords was easier for me to read.

I suspect what you want/need is a syntax highlighting editor that does CASE display change along with colour.

And we haven't even discussed line and indent formatting of SQL scripts, which is where the real bun fights take place. I'd say the various habits and preferences of that make the uppercase/lowercase thing look like a mild issue.

Collapse
 
shaqq profile image
Shaker Islam

Of course! This is one of many the "petty" complaints I had.

The usage of uppercase and formatting & such is very subjective. In my view, there's at least a few things to note against the use of uppercase:

  1. Uppercase as a format is much harder to type. Often when we're writing queries, we want to quickly iterate, and even small roadblocks can be frustrating when we want to focus on analysis
  2. While it's true that uppercase was much more common for older programming languages, it's not true for popular programming languages in the past 20 years (that I'm aware of). It's simply not a format that the next generation of analysts & engineers are used to, and we should adjust.
Collapse
 
defaultkramer profile image
Ryan Kramer • Edited

I'm always glad to see people who know SQL well call out its flaws. It is amazing to me that SQL is still the best option we have available to talk to our DBs. Even Typescript has gotten wide adoption as a better Javascript, but for SQL I don't expect any alternative (including my own) to gain wide adoption soon.

You might be interested in my library/language Plisqin, although if you've never used a Lisp-family language it might be too inaccessible: docs.racket-lang.org/plisqin/index...

Items 3 and 6 can be easily solved using a local variable for century like so

(from b 'books
      (define century
        (%%scalar b".publish_year / 100"))
      (select century)
      (%%where century" between 16 and 19"))
Enter fullscreen mode Exit fullscreen mode

Item 4 can be solved in a lot of ways, I think defining a list of column names would be easiest.

Item 5 is trivial -- you simply create a normal Racket procedure which takes whatever variables it needs. This procedure builds and returns the query. Just like a view, you can build a larger query with the result by adding where clauses or joins or whatever you want.

Item 10 is unfortunately a huge pain for me.

One of my biggest gripes about SQL is how often I end up repeating joins. For example, imagine every Book has exactly one Author. If I have a Book and want to talk about the AuthorName, I have to manually include the join from Book to Author every single time! [1] Why should I even have to know that a join is involved? I know that AuthorName is a valid property of a Book, I shouldn't even have to care whether it is a direct property of the Book table or a property of some joined table. Plisqin allows me to define a function "AuthorName given an instance of Book" and then I can reuse that function just the same as I would every other property of Book.

[1]: Yes, I could create a "BookWithAuthor" view, but then maybe I need a "BookWithAuthorAndPublisher" view also, and a "BookWithPublisher" view... The combinations become unmanageable very quickly.

Collapse
 
mhd profile image
Michael Dingler

I can live with uppercase, but then again, I've enjoyed programming Modula-2/Oberon in the past, and it's easy enough to automate it either way. But know what really grinds my gears? Excessive quoting of everything that isn't a keyword. Backtick heaven. (Seems more common in some databases than others)

I'm surprised, too, that there isn't more tooling around SQL-as-a-language, as opposed to specifc RDBMS. Compare it with HTML (another bloated beast). There you can have different syntaxes in template systems (like HAML), and alternate convenient ways to enter things (like Emmet). Not aware of anything like that for SQL. (And no, "expressive ORMS" aren't that, they fall into the same unreadable trap like DSLs that generate HTML. Or JSX, in the worst case).

Collapse
 
alexktracermain profile image
AlexKTracerMain

Thanks, just learning SQL and this was therapeutic. I'm not crazy, SQL is just deeply deeply flawed and extremely hard to learn. No reason it can't more elegantly be like real code where you call other tables you create like re-usable tables. (this is possible, but seems hard/annoying)

The relative immaturity is astonishing for such a massively used 'language'.

Collapse
 
rolandbouman profile image
Roland Bouman

While I do think SQL leaves a number of things to desire, I don't see much in the objections you raise here.

Some items have already been adressed by @geraldew , I just wanted to add:

6. No variables in ANSI SQL

There is a good reason for that - variables are an imperative device and in many cases using them would force a definite execution order, which makes it harder (impossible) to optimize the query. This is not unique to SQL, you see this in functional languages as well.

For the same reason, if your SQL dialect lets you use user-defined functions, then they should typically be free of side-effects. If they are not, unexpected (bad) things will happen.

Now, about your example, you wrote:

(publish_year / 100) as century,
case
    when century < 0 then (century * -1) || ' B.C.'
    when century > 0 then century || ' A.D.'
as formatted_century
Enter fullscreen mode Exit fullscreen mode

It now looks like you would need to duplicate the publish_year/100 twice, and this is what you would solve with a variable.

Now, typically you'd want to push this type of formatting to a reporting tool. But if you really want to do it in a query, you'd simply use a built-in date formatting function. And, you'd make your life considerably easier by storing the publish_year as a date.

With that in place, the century expression would become (showing postgres examples):

to_char(publish_date, 'CC')

and

abs(cast(to_char(publish_date, 'CC') as int))||' '||to_char(publish_date, 'B.C.')

Now, you could still nitpick over the duplication of the 'CC' formatting of the publish_date but the point is that usually when people say they need variables in SQL, they really don't, they just need to use built-in functions to do the job for you.
(see: postgresql.org/docs/8.2/functions-...)

Collapse
 
shaqq profile image
Shaker Islam • Edited

Thanks for responding, @rolandbouman

Now, typically you'd want to push this type of formatting to a reporting tool. But if you really want to do it in a query, you'd simply use a built-in date formatting function. And, you'd make your life considerably easier by storing the publish_year as a date.

With respect, this is a classic example of "this isn't the product's problem, it's the user's problem." Yes, of course - users have been dealing with this and using workaround for years now. My question is - can this be better?

variables are an imperative device and in many cases using them would force a definite execution order, which makes it harder (impossible) to optimize the query

You're saying it's impossible to have some sense of variables in SQL, with some sane guardrails, and be able to optimize a query? I doubt that, given that there are SQL dialects that have variables.

If dividing one number by another static number is an execution that must be done before others, I think that's a fine tradeoff for most/all users. There must be a better way.