Articles about distributed data serialization on waitingforcode.com

January 6, 2018 • Apache Beam

Coders in Apache Beam

In distributed computing, data moves either locally (within a single worker) or remotely (between several different workers), so it must be in a format that the machines can understand. Serialization guarantees this format, and it is also present in Apache Beam.

June 5, 2017 • Apache Spark

Serialization issues - part 2

A previous post (Serialization issues - part 1) presented some solutions to serialization problems. This post is its continuation.

June 5, 2017 • Apache Spark

Serialization issues - part 1

Issues with non-serializable objects are perhaps the most painful ones when we start working with Spark. Fortunately, there are several solutions to them.

January 29, 2017 • Apache Spark

Serialization in Spark

Serialization frameworks are an intrinsic part of Big Data systems. Spark is no exception to this rule, and it offers several ways to manage serialization.
