After the previous presentations of the new date-time functions in Apache Spark 3.0, it's time to see what's new on the streaming side in the Structured Streaming module, and more precisely, in its Apache Kafka integration.
Even though I've already written a few posts about Apache Kafka as a data source in Apache Spark Structured Streaming, I still had some questions in my head. In this post I will try to answer them and leave the rest of the Kafka integration in Spark topic for later investigation.
I've written a lot about data sources, including Apache Kafka. However, Apache Spark is not only about sources but also about targets, called sinks. In this post I will focus on the Apache Kafka sink integration and try to answer some questions in FAQ mode.
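To make the sink side more concrete, here is a minimal sketch of writing a streaming Dataset to Kafka. The broker address (localhost:9092), the topic name (output_topic) and the use of the built-in rate source as input are assumptions made for illustration only.

```scala
import org.apache.spark.sql.SparkSession

object KafkaSinkSketch extends App {
  val spark = SparkSession.builder()
    .appName("kafka-sink-sketch")
    .master("local[*]")
    .getOrCreate()

  // Hypothetical input: the built-in "rate" source generates test rows (timestamp, value)
  val input = spark.readStream
    .format("rate")
    .option("rowsPerSecond", "10")
    .load()

  // The Kafka sink expects a "value" column (string or binary) and an optional "key" column
  val toKafka = input.selectExpr("CAST(value AS STRING) AS value")

  val query = toKafka.writeStream
    .format("kafka")
    .option("kafka.bootstrap.servers", "localhost:9092") // assumed broker address
    .option("topic", "output_topic")                      // hypothetical topic name
    .option("checkpointLocation", "/tmp/kafka-sink-checkpoint") // required to track progress
    .start()

  query.awaitTermination()
}
```

As with other production sinks, the Kafka sink needs a checkpoint location so that the query can restart from where it left off.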
Some time ago I received 3 interesting questions about the implementation of the Apache Kafka connector in Apache Spark Structured Streaming. I will answer them in this post.
The error quoted in the title of this post is quite common when you try to copy processing logic from Spark DStream/RDD to Spark Structured Streaming. This post gives some insight into it.
Spark 2.2.0 changed the status of Structured Streaming. Between 2.0 and 2.2.0 it was marked as "alpha", but the latest version promoted it to General Availability. It's therefore a good moment to start playing with this feature, even if some basics have already been covered in the post about Structured Streaming. This time we'll go deeper and analyze the integration with Apache Kafka that will be helpful to
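To set the scene for that analysis, here is a minimal sketch of a Structured Streaming query reading from Kafka. The broker address (localhost:9092), the topic name (input_topic) and the console sink are assumptions chosen for illustration.

```scala
import org.apache.spark.sql.SparkSession

object KafkaSourceSketch extends App {
  val spark = SparkSession.builder()
    .appName("kafka-source-sketch")
    .master("local[*]")
    .getOrCreate()

  // Broker address and topic name below are hypothetical
  val kafkaRecords = spark.readStream
    .format("kafka")
    .option("kafka.bootstrap.servers", "localhost:9092")
    .option("subscribe", "input_topic")
    .option("startingOffsets", "earliest")
    .load()

  // Each Kafka row exposes binary key/value plus topic, partition, offset and timestamp columns
  val decoded = kafkaRecords
    .selectExpr("CAST(key AS STRING)", "CAST(value AS STRING)", "partition", "offset")

  // Print the decoded records to the console for quick inspection
  val query = decoded.writeStream
    .format("console")
    .option("truncate", "false")
    .start()

  query.awaitTermination()
}
```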