Articles about Spark stateful operations on waitingforcode.com

March 18, 2018 • Apache Spark Structured Streaming

Stateful transformations with mapGroupsWithState

Stateful stream processing in Apache Spark has evolved considerably since the first versions of the framework. In the beginning there was updateStateByKey; later, judged inefficient, it was superseded by mapWithState. With the arrival of Structured Streaming, mapWithState was in turn replaced by mapGroupsWithState.

June 11, 2017 • Apache Spark Streaming

Stateful transformations with mapWithState

The updateStateByKey function, explained in the post about Stateful transformations in Spark Streaming, is not the only solution Spark Streaming provides for dealing with state. Another, much better optimized, is mapWithState.

November 18, 2016 • Apache Spark Streaming

Stateful transformations in Spark Streaming

Spark Streaming can handle state-based operations, i.e. operations that maintain a state which may be modified by subsequent batches of data.

