Articles about Apache Spark Structured Streaming on waitingforcode.com - articles for the pleasure of learning and discovery

September 10, 2017 • Apache Spark Structured Streaming

Analyzing Structured Streaming Kafka integration - Kafka source

Spark 2.2.0 brought the change of structured streaming state. Between 2.0 and 2.2.0 it was marked as "alpha". But the last version changed this status to General Availability. It's so a good moment to start to play with this new feature - even if some basics have already been covered in the post about structured streaming. This time we'll go deeper and analyze the integration with Apache Kafka that will be helpful to

Continue Reading →

April 30, 2017 • Apache Spark Structured Streaming

Structured streaming

Project Tungsten, explained in one of previous posts, brought a lot of optimizations - especially in terms of memory use. Until now it was essentially used by Spark SQL and Spark MLib projects. However, since 2.0.0, some work was done to integrate DataFrame/Dataset in streaming processing (Spark Streaming).

Continue Reading →

⟵ Previous
1
2
3
4
5
6

Apache Spark Structured Streaming articles

Analyzing Structured Streaming Kafka integration - Kafka source

Structured streaming