Articles about Kafka integration on waitingforcode.com

July 18, 2019 • Apache Spark Structured Streaming

Apache Spark Structured Streaming and Apache Kafka offsets management

Some time ago I received three interesting questions about the implementation of the Apache Kafka connector in Apache Spark Structured Streaming. I answer them in this post.

September 17, 2017 • Apache Spark Structured Streaming

org.apache.spark.sql.AnalysisException: Queries with streaming sources must be executed with writeStream.start() explained

The error quoted in the title of this post is quite common when you try to carry over the programming logic from Spark DStream/RDD to Spark Structured Streaming. This post gives some insight into it.

September 10, 2017 • Apache Spark Structured Streaming

Analyzing Structured Streaming Kafka integration - Kafka source

Spark 2.2.0 changed the status of Structured Streaming. Between 2.0 and 2.2.0 it was marked as "alpha", but this release promoted it to General Availability. It is therefore a good moment to start playing with this feature, even if some basics have already been covered in the post about structured streaming. This time we'll go deeper and analyze the integration with Apache Kafka that will be helpful to
