I'm the author of Data Engineering Design Patterns (O'Reilly), a Databricks MVP, and a freelance data engineer specializing in Apache Spark and Databricks.
I help teams move from working pipelines to resilient architectures.
I'm currently accepting new projects for June 2026. Whether you need a 2-day architectural audit, a hands-on lead for a complex data engineering problem, or a workshop, let's discuss your project here.
The CREATE TABLE command is used to initialize a new table. It works pretty well when we create a new table from scratch. However, if we want to clone the schema of an existing table, rewriti...
Recently I ran into trouble. My query based on 3 UNIONs became really slow after a few days of adding new data. The cause could have been bad data distribution, but it wasn't. Instead, it was caused by the used UN...