CASE operator is maybe one of the most unknown by the beginner users of SQL. Often when I see a question how to write an if-else condition in a SQL query, some people advise to write a UDF and use if-else directly inside. As you will see in this post, this solution is a little bit overkill though.
A virtual conference at the intersection of Data and AI. This is not a conference for the hype. Its real users talking about real experiences.
- 40+ speakers with the likes of Hannes from Duck DB, Sol Rashidi, Joe Reis, Sadie St. Lawrence, Ryan Wolf from nvidia, Rebecca from lidl
- 12th September 2024
- Three simultaneous tracks
- Panels, Lighting Talks, Keynotes, Booth crawls, Roundtables and Entertainment.
- Topics include (ingestion, finops for data, data for inference (feature platforms), data for ML observability
- 100% virtual and 100% free
👉 Register here
I will start this post by a short definition and syntax presentation for CASE...WHEN operator. In the next part, I will show different places where it can be used.
Some basics first
CASE..WHEN..THEN is a SQL equivalent of if-else statement known from programming languages. Therefore, you will define some predicates and at the first positive response, the action specified after THEN will be executed. Exactly like if-else statements, CASE..WHEN..THEN is executed as long as the predicates respond negatively.
The syntax for this operator is the following:
CASE WHEN condition1 THEN result1 WHEN condition2 THEN result2 ELSE result3 END;
Of course, you can have as many conditions as you want. You can also nest CASE blocks inside another CASE blocks. But you should use this operator carefully because you risk to overuse it. For instance, you shouldn't use CASE to get the first not null value of 2 fields. You have a more appropriate operator called COALESCE which does the job. Also, if you want to return a NULL if 2 attributes are equal, you can use NULLIF. Finally, you can opt for IFNULL if you want to return a default value for a nullable argument.
Examples
Let's see now where CASE operator can be used. But just before, I will create a new container with PostgreSQL 11 docker run --name wfc_case_when_test -e POSTGRES_PASSWORD=root -e POSTGRES_USER=root -e POSTGRES_DB=test_case_when -d postgres:11.
After accessing the container (docker exec -ti 28a9adf0b852 psql -d test_case_when), we can start to discover CASE operator. Below list presents some of its principal use cases (please comment if you think that something is missing):
- aliasing - in this context CASE operator is simply used as an alias for the value existing in the column:
WITH numbers (nr) AS ( SELECT 1 AS nr UNION ALL SELECT 2 AS nr ) SELECT nr, CASE WHEN MOD(nr, 2) = 0 THEN 'is even' ELSE 'is odd' END FROM numbers;
Above query returns:nr | case ----+--------- 1 | is odd 2 | is even (2 rows)
- where clause - with CASE operator you can also create variable filtering conditions, like the one in the following snippet which adapts to the value of a column:
WITH numbers (nr) AS ( SELECT 1 AS nr UNION ALL SELECT 2 AS nr UNION ALL SELECT 3 AS nr UNION ALL SELECT 4 AS nr UNION ALL SELECT 5 AS nr UNION ALL SELECT 6 AS nr UNION ALL SELECT 7 AS nr UNION ALL SELECT 8 AS nr ) SELECT nr FROM numbers WHERE CASE WHEN mod(nr, 2) = 0 THEN nr > 4 ELSE nr = 1 END;
Above query returns:nr ---- 1 6 8 (3 rows)
- group clause -
WITH football (name, role, team, country) AS ( SELECT 'name1' AS name, 'player' AS role, 'team1' AS team, 'Poland' AS country UNION ALL SELECT 'name2' AS name, 'player' AS role, 'team2' AS team, 'France' AS country UNION ALL SELECT 'name3' AS name, 'coach' AS role, 'team3' AS team, 'Brazil' AS country UNION ALL SELECT 'name4' AS name, 'player' AS role, 'team4' AS team, 'France' AS country UNION ALL SELECT 'name5' AS name, 'coach' AS role, 'team1' AS team, 'France' AS country UNION ALL SELECT 'name6' AS name, 'player' AS role, 'team1' AS team, 'Argentina' AS country ) SELECT COUNT(*) FROM football GROUP BY CASE WHEN role = 'player' THEN team ELSE country END;
Above query returns (I agree, it's meaningless but that's just for the sake of illustration):count ------- 2 1 1 1 1 (5 rows)
- order by -simiarly to the previous point, we can also use CASE in ORDER BY:
WITH football (name, role, team, country) AS ( SELECT 'name1' AS name, 'player' AS role, 'team1' AS team, 'Poland' AS country UNION ALL SELECT 'name2' AS name, 'player' AS role, 'team2' AS team, 'France' AS country UNION ALL SELECT 'name3' AS name, 'coach' AS role, 'team3' AS team, 'Brazil' AS country UNION ALL SELECT 'name4' AS name, 'player' AS role, 'team4' AS team, 'France' AS country UNION ALL SELECT 'name5' AS name, 'coach' AS role, 'team1' AS team, 'France' AS country UNION ALL SELECT 'name6' AS name, 'player' AS role, 'team1' AS team, 'Argentina' AS country ) SELECT * FROM football ORDER BY CASE WHEN role = 'player' THEN team ELSE country END ASC;
The result of the above query is:name | role | team | country -------+--------+-------+----------- name3 | coach | team3 | Brazil name5 | coach | team1 | France name1 | player | team1 | Poland name6 | player | team1 | Argentina name2 | player | team2 | France name4 | player | team4 | France (6 rows)
- aggregates - we're close to the point that amazed me when I was reading about CASE operator. You can imagine that it can be used in aggregates like COUNT, MAX or SUM? Let's start with a simple query where we want to count all players who scored at least one goal. You can write this query with a WHERE clause but for the sake of example, I will present another solution using CASE:
WITH football (name, team, goals) AS ( SELECT 'name1' AS name, 'team1' AS team, 1 AS goals UNION ALL SELECT 'name2' AS name, 'team2' AS team, 5 AS goals UNION ALL SELECT 'name3' AS name, 'team3' AS team, 0 AS goals UNION ALL SELECT 'name4' AS name, 'team4' AS team, 0 AS goals UNION ALL SELECT 'name5' AS name, 'team1' AS team, 1 AS goals UNION ALL SELECT 'name6' AS name, 'team1' AS team, 3 AS goals ) SELECT COUNT(CASE WHEN goals > 0 THEN goals END) FROM football;
Above query returns 4:count ------- 4 (1 row)
How is it possible? The correct result comes from 2 things. COUNT ignores NULL columns and since CASE returns null by default for all not matches in case of missing ELSE statement, the database can apply the filtering logic correctly. The same rule applies to other aggregates like MIN, MAX or AVG.
From that you can do more complicated things like flattened aggregates. In the next query I will compute the sum of goals scored in each round:WITH football (round, goals) AS ( SELECT 1 AS round, 2 AS goals UNION ALL SELECT 1 AS round, 4 AS goals UNION ALL SELECT 2 AS round, 1 AS goals UNION ALL SELECT 2 AS round, 1 AS goals UNION ALL SELECT 3 AS round, 11 AS goals UNION ALL SELECT 3 AS round, 3 AS goals UNION ALL SELECT 4 AS round, 5 AS goals UNION ALL SELECT 4 AS round, 6 AS goals ) SELECT SUM(CASE WHEN round = 1 THEN goals END) AS goals_round_1, SUM(CASE WHEN round = 2 THEN goals END) AS goals_round_2, SUM(CASE WHEN round = 3 THEN goals END) AS goals_round_3, SUM(CASE WHEN round = 4 THEN goals END) AS goals_round_4 FROM football;
The results look like:goals_round_1 | goals_round_2 | goals_round_3 | goals_round_4 ---------------+---------------+---------------+--------------- 6 | 2 | 14 | 11 (1 row)
In this blog post you can discover CASE operator which is the if-else statement in SQL. As you could see in the multiple examples of the second section, you can use it almost everywhere - from a simple aliasing expression to more advanced flattened aggregates.