This is one of my favorite blog posts, the yearly retrospective. Every year I summarize what happened in the past 12 months and share my future plans with you. It's time for the 2023 Edition!
Blog in 2023
This year I blogged less for the 5th year in a row. I added only 53 blog posts, which is 15 fewer than last year and 40 fewer than 2 years ago:
This decline continues the trend that started in 2020. Back then, I decided to diversify and, instead of only blogging, to put effort into writing ebooks (Data Engineering patterns on the cloud, 2022) and preparing online video courses (Better data engineering, since 2020, currently on hold). Besides, I've been spending more time with my little family, which will get a new member in a few days 👶 Even though I would like to do more things and do them faster, I'm pretty happy with the current writing pace.
Regarding the blog posts themselves, as stream processing is my current focus, I naturally wrote the most on that topic:
Blog posts in 2023:
- Apache Spark Structured Streaming
- General data engineering
- Data engineering on the cloud
- Data engineering on AWS
- Apache Spark SQL
Plans for 2023
Before I share my plans for 2024 with you, let's go back to 2023 and see what I expected from it and what the reality was:
- Apache Spark and cloud data engineering will remain my Pi-shaped profile topics. - Partially true, as I mostly focused on Apache Spark Structured Streaming this year.
- Develop the Table file formats series. Although I wrote 8 blog posts last year, I'm not satisfied because I could only cover some of the most basic features. Hopefully, in 2023 I will do better! - Indeed, I spent some time on that topic, but instead of covering all 3 major formats (Delta Lake, Hudi, Iceberg), I focused on only one of them, Delta Lake, which is closer to my world.
- More pure data engineering topics. For sure, I presented data contracts in 2022, but I have many more items in my backlog. Hopefully, I'll be able to integrate them into either the Apache Spark or the cloud data engineering branch. - I didn't do well on that, but the reason is simple. All the materials I have for them will be part of my secret project that started in September 2023 🤫 I'll share more on that in a few months.
- Rust. The last programming language I learned on purpose was Scala. I even wrote a series to illustrate my journey, called One Scala feature per week. I'm definitely not a Scala master, but I feel the need to learn something new. In the blog post about Python alternatives to PySpark I learned about Polars and was intrigued enough to add it to my topics for 2023. - This one is a total failure. Even though I set up the environment and wrote some dummy code snippets, Rust is still in my backlog for the next sprint.
- Apache Arrow. Polars, PySpark and Ray are 3 libraries that mention Apache Arrow. So far I know it's a columnar memory format for efficient analytic operations but, as with Rust, I'd like to know more! - Here too, I'm not proud. I was particularly curious about Apache Arrow and was hoping to learn more about it by the end of the year. Yet again, other topics, especially the stream processing ones, won the battle.
Become a Better Data Engineer:
- One big update. I'm preparing a V2 of the course with some changes in the content and organization. - The organization changes are in place but the V2 is not finished yet because of my secret project for 2024.
Data engineering patterns on the cloud:
- The first release has 84 patterns but I already added 20 other pattern candidates. There should be one or two updates including them in 2023. - Although I didn't include all the 20 candidates, the number of patterns grew to 91.
- Even though I deliberately left this part aside, I missed it in 2022 a lot. Now, when some other problems of life seem to be solved, it's a great time to start again! - New problems have appeared but I won't let them disturb my speaker engagements this time. I've already submitted 2 CfPs!
- I'm happy to share that I got my first CfP accepted! In March I'm going to speak at Big Data Warsaw about the challenges of delivering streaming data at scale. I plan to submit 2 other CfPs to conferences I've always been dreaming about, so fingers crossed 🤞 - My CfP wasn't accepted but I was happy to speak at Big Data Warsaw! The other 2 remain on my bucket list, though.
Cloud data engineer:
- My data engineering certifications for AWS, Azure and GCP expire soon. It'll be time to renew them and why not share something more helpful to prepare for them than a blog post? - Well, they expired and I didn't find time to prepare for the renewals.
Plans for 2024
What about next year?
- Stream processing still has some exciting areas to discover, like Apache Flink and streaming databases. It'll probably remain my focus next year, with blog posts about the aforementioned topics plus, of course, Apache Spark Structured Streaming.
- Additionally, I would like to explore streaming capabilities on the cloud, but it's a lower priority than learning the internals of Apache Flink.
- I still have a dream of reaching 1000 published blog posts. So far I'm at 921 and I won't be able to reach this magic number next year. But let's dream big and get as close as possible to it!
- I can't say more about it now, but I would like to deliver it as planned, by the end of 2024.
Become a Better Data Engineer:
- If I have some time to spare, I'll work on the remaining courses marked as WIP.
Data engineering patterns on the cloud:
- Here the goal is to add at least 9 patterns to reach another magic number: 100 patterns.
- I don't have a fixed number here, but I would like to speak at one of my 2 dream conferences in 2024.
Cloud data engineer:
- Inevitably, I'll need to renew them in 2024. Probably by the end of the year, when I should have more time after the rush on the secret project.
These are just plans. I know from the past that I might not fully succeed in reaching them. I may discover another exciting thing in 2024, take more time than expected to prepare for a certification, or simply speak not at 1 but at 6 conferences. Anyway, I will share all the defeats and victories with you next December. Thank you for being with me in 2023, and I hope to see you next year too!
Wishing you all a happy, healthy and successful 2024!