How to improve UNION performance? - programming tips about SQL on waitingforcode.com

Like what I do? Support me on Ko-fi

Recently I was in trouble. My query based on 3 UNIONs became really slow after some days of adding new data. The problem could be in bad distribution but it wasn't. Instead, it was made by the used UNION operator.

UNION operator guarantees no duplicates so when it's executed, a step of deduplication is added to the query plan. And since there were some TB of data, even a simple aggregation query was taking time. But there was no reason to use a UNION because the combined dataset weren't supposed to contain duplicates. That's why, after switching from UNION to UNION ALL, which doesn't involve the dedupe stage, the query time improved from 2 minutes to 2 seconds. Remember that and always be careful about the type of used UNION type.