Big Data algorithms articles

on waitingforcode.com

Conflict-Free Replicated Data Type

Pessimistic replication requires synchronous communication between the main node that writes the data and its replicas. However, in some cases optimistic replication can be more efficient and still guarantee the same final result. Conflict-free replicated data types are one of the solutions in this category. Continue Reading →
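
To make the "same final result" idea more concrete, here is a minimal sketch of one well-known CRDT, the grow-only counter (G-Counter). The class and method names (GCounter, increment, merge, value) are illustrative assumptions and are not taken from the article or from any particular library. Each replica only increments its own slot, and merging takes the per-node maximum, so replicas converge regardless of the order or number of merges.

```python
class GCounter:
    """Grow-only counter: a state-based CRDT sketch (hypothetical API)."""

    def __init__(self, node_id):
        self.node_id = node_id
        # Maps node id -> number of increments observed from that node.
        self.counts = {}

    def increment(self, amount=1):
        # A replica only ever writes to its own slot.
        self.counts[self.node_id] = self.counts.get(self.node_id, 0) + amount

    def merge(self, other):
        # Per-node maximum is commutative, associative and idempotent,
        # so merges can arrive in any order and be repeated safely.
        for node, count in other.counts.items():
            self.counts[node] = max(self.counts.get(node, 0), count)

    def value(self):
        return sum(self.counts.values())


# Two replicas accept writes independently, without coordination.
replica_a = GCounter("a")
replica_b = GCounter("b")
replica_a.increment(2)
replica_b.increment(3)

# Asynchronous state exchange, in either order, possibly repeated.
replica_a.merge(replica_b)
replica_b.merge(replica_a)
replica_b.merge(replica_a)
```

After the exchange both replicas report the same total, even though no write waited for the other node.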

HyperLogLog explained

Counting the number of distinct elements can seem like a simple task in classical web service-based applications. After all, we usually deal with a small subset of data that fits in memory and can be counted directly with data structures such as sets. But the same task is less obvious in Big Data applications, where approximation algorithms can come to the rescue. Continue Reading →
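
As a rough illustration of the trade-off, the sketch below compares an exact set-based count with a simplified HyperLogLog estimator. It is a teaching sketch under stated assumptions, not the article's implementation: the class name, the choice of SHA-1 as the hash, the precision p=12 and the small-range correction are all choices made here for the example. The estimator keeps only 4096 small registers, however many items it sees.

```python
import hashlib
import math


class HyperLogLog:
    """Simplified HyperLogLog sketch (hypothetical API, 64-bit hash)."""

    def __init__(self, p=12):
        self.p = p                      # number of bits used as register index
        self.m = 1 << p                 # number of registers
        self.registers = [0] * self.m
        # Bias-correction constant, valid for m >= 128.
        self.alpha = 0.7213 / (1 + 1.079 / self.m)

    def add(self, item):
        digest = hashlib.sha1(str(item).encode("utf-8")).digest()
        h = int.from_bytes(digest[:8], "big")        # keep 64 bits
        index = h >> (64 - self.p)                   # first p bits
        rest = h & ((1 << (64 - self.p)) - 1)        # remaining bits
        # Position of the leftmost 1-bit in the remaining bits (1-based).
        rank = (64 - self.p) - rest.bit_length() + 1
        self.registers[index] = max(self.registers[index], rank)

    def count(self):
        harmonic = sum(2.0 ** -r for r in self.registers)
        estimate = self.alpha * self.m * self.m / harmonic
        zeros = self.registers.count(0)
        # Small-range correction: fall back to linear counting.
        if estimate <= 2.5 * self.m and zeros:
            estimate = self.m * math.log(self.m / zeros)
        return estimate


exact = set()
hll = HyperLogLog()
for i in range(50000):
    for _ in range(2):                  # every item appears twice
        exact.add(i)
        hll.add(i)

exact_count = len(exact)
approx_count = hll.count()
relative_error = abs(approx_count - exact_count) / exact_count
```

The set gives the exact answer but grows with the data, while the estimator stays at a fixed size and typically lands within a few percent of the true count at this precision.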