I discovered the term in the title while learning about the Azure Synapse and Cosmos DB services. I had heard of NoSQL, and even NewSQL, but never of a solution supporting analytical and transactional workloads at once.
Whenever I prepare the "What's new on the cloud..." series, I'm pretty sure that, for Azure, most of the updates will concern the Azure SQL service. The main idea of the service is simple, but if you analyze it more deeply, you'll find some concepts that might not be the easiest to understand at first.
When I first heard about Durable Functions, my reaction was: "So cool! We can now build fully serverless streaming stateful pipelines!" Indeed, we can, but it's not their only feature!
Almost 2 years ago (already!), I wrote a blog post about data pipeline patterns in Apache Airflow (link in the "Read also" section). Since then, I have worked with other data orchestrators. That's why I would like to repeat the same exercise, this time for Azure Data Factory.
How do you orchestrate your data pipelines on the cloud? Often, you can use managed open source tools like Cloud Composer on GCP or Amazon Managed Workflows for Apache Airflow on AWS. Sometimes, you will need to use the cloud provider's own service, as with Azure and its Data Factory orchestrator. Is it complicated to create Data Factory pipelines if you already know Apache Airflow? We'll see that in this blog post.
I'm happy to have completed my quest for data engineering certifications on the 3 major cloud providers. Last year I became AWS Big Data certified, in January I became a GCP Data Engineer, and more recently I passed DP-200 and DP-201 and became an Azure Data Engineer Associate. Although DP-203 will soon replace these 2 exams, I hope this article will help you prepare for it!