data partitioning articles

on waitingforcode.com
Articles tagged with data partitioning. There are 2 articles corresponding to this tag.


Dynamo paper and consistent hashing

A previous post presented partitioning strategies. Among the techniques described was hash partitioning based on the number of servers. The drawback of this method is its lack of flexibility: when a new server is added, almost all of the data has to be remapped. Fortunately, an alternative to this "primitive" hashing exists, and it's called consistent hashing.
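To make the remapping problem concrete, here is a minimal sketch comparing both approaches when a fifth server joins four existing ones. It is not the implementation from the Dynamo paper; the names `stable_hash`, `modulo_owner`, `HashRing` and `moved_fraction`, the server names, the use of MD5 and the 100 virtual nodes per server are all assumptions chosen for illustration.

```python
import bisect
import hashlib


def stable_hash(value):
    """Deterministic hash (Python's built-in hash() of str is salted per process)."""
    return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)


def modulo_owner(key, servers):
    """'Primitive' hashing: the owner depends directly on the number of servers."""
    return servers[stable_hash(key) % len(servers)]


class HashRing:
    """Illustrative consistent hashing ring with virtual nodes (hypothetical helper)."""

    def __init__(self, servers, vnodes=100):
        # Every server is placed on the ring at `vnodes` pseudo-random positions.
        self._ring = sorted(
            (stable_hash(f"{server}#{i}"), server)
            for server in servers
            for i in range(vnodes)
        )
        self._points = [point for point, _ in self._ring]

    def owner(self, key):
        # The owner is the first ring position clockwise from the key's hash,
        # wrapping around to the start of the ring when needed.
        index = bisect.bisect_right(self._points, stable_hash(key)) % len(self._ring)
        return self._ring[index][1]


def moved_fraction(before, after, keys):
    """Share of keys whose owner differs between two assignment functions."""
    moved = sum(1 for key in keys if before(key) != after(key))
    return moved / len(keys)


keys = [f"key-{i}" for i in range(10_000)]
old_servers = ["server-a", "server-b", "server-c", "server-d"]
new_servers = old_servers + ["server-e"]

modulo_moved = moved_fraction(
    lambda key: modulo_owner(key, old_servers),
    lambda key: modulo_owner(key, new_servers),
    keys,
)

old_ring, new_ring = HashRing(old_servers), HashRing(new_servers)
ring_moved = moved_fraction(old_ring.owner, new_ring.owner, keys)

print(f"modulo hashing moved {modulo_moved:.0%} of the keys")
print(f"consistent hashing moved {ring_moved:.0%} of the keys")
```

With modulo hashing a key keeps its server only when its hash gives the same remainder for 4 and for 5, so roughly 80% of the keys are expected to move. With the ring, only the keys that now fall before one of the new server's positions change owner, which should be about one fifth of them, and all of those move to the new server.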

Data partitioning strategies

Every data processing pipeline can have a source of contention, and one of them is the location of the data. When all entries are read from a single place by dozens or hundreds of workers, the data source can respond more slowly. One solution to this problem is partitioning.
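As a rough illustration of the idea, the sketch below splits records into a fixed number of partitions by hashing their key, so that each worker could read only its own partition instead of the whole dataset. Hash partitioning is just one possible strategy; the function names `partition_for` and `partition_records`, the CRC32 hash, the sample records and the choice of 4 partitions are assumptions made for this example.

```python
import zlib
from collections import defaultdict


def partition_for(key, partitions):
    """Map a key to a partition number in [0, partitions)."""
    # CRC32 is deterministic across processes, unlike the built-in hash() of str.
    return zlib.crc32(key.encode("utf-8")) % partitions


def partition_records(records, partitions):
    """Group (key, value) records by the partition their key hashes to."""
    buckets = defaultdict(list)
    for key, value in records:
        buckets[partition_for(key, partitions)].append((key, value))
    return buckets


# Hypothetical input: 1000 records keyed by user id.
records = [(f"user-{i}", i) for i in range(1000)]
buckets = partition_records(records, partitions=4)

for partition_id in sorted(buckets):
    print(f"partition {partition_id}: {len(buckets[partition_id])} records")
```

Because the partition is derived only from the key, all records sharing a key land in the same partition, and a worker assigned to one partition never has to contend with the others for the same data.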