I'm the author of Data Engineering Design Patterns (O'Reilly),
a Databricks MVP, and
a freelance data engineer specializing in Apache Spark and Databricks.
I help teams move from working pipelines to resilient architectures.
I'm currently accepting new projects for May 2026. Whether you need a 2-day architectural audit, a hands-on lead for a
complex data engineering problem, or a workshop
let's discuss your project here.
Idempotence is something I appreciate, maybe the most, in data engineering. If you write an idempotent logic you don't need to worry when your logic is reprocessed. You don't need to worry that it will generate duplicates or inconsistent results between runs. However, using it is not always easy and I'm actively looking for all related patterns to it. This time I will focus on idempotent consumer implementation in Apache Camel. Even though it may sounds old-school with modern streaming and messaging solutions, it's a good solution to know.
Recently I wrote posts about idempotent consumer pattern analyzing Apache Camel implementation and CDC applied on NoSQL stores. After that I had an idea, what happened if we would mix both of them?
In my previous post I presented an implementation of idempotent consumer pattern with Apache Cassandra CDC. One of drawbacks of that solution was the necessity of producing the messages with slower lightweight transactions. In this post I will show you how to do the same with AWS DynamoDB streams and without that constraint.