A 3-day bug hunt on a 3-person team costs up to €7,200 in lost engineering time. This workshop teaches you to prevent that — unit tests, data tests, and integration tests for PySpark and Databricks Lakeflow, including Spark Declarative Pipelines.
Last November I spoke at Paris.py meetup about integrating Cerberus with PySpark to enhance JSON validation. During the talk, I covered some points that I would like to share with you in this blog post, mostly about error definition and normalized validation.