Delta Lake tips

I'm the author of Data Engineering Design Patterns (O'Reilly), a Databricks MVP, and a freelance data engineer specializing in Apache Spark and Databricks. I help teams move from working pipelines to resilient architectures.
I'm currently accepting new projects for May 2026. Whether you need a 2-day architectural audit, a hands-on lead for a complex data engineering problem, or a workshop, let's discuss your project here.

How to check data regression with Delta Lake and SQL?

You just changed the value of one column in your processing logic and want to check that the change had no other impact. You can do that by running a query like this: SELECT x.id, COUNT(*) FROM (...

Continue Reading →

How to create a data validation query?

The goal is to write a query that, for each table, compares the content and returns any rows that are missing from one of the compared tables. The query combines FULL OUTER JOIN, time travel, and ro...

Continue Reading →