Articles about Storage on waitingforcode.com - articles for the pleasure of learning and discovery


July 2, 2016 • Apache Cassandra

Collections in Apache Cassandra

Collections are among the more interesting data types in Apache Cassandra. A data model can freely use maps, sets, or lists.

July 2, 2016 • Apache Cassandra

Tables in Apache Cassandra

Because tables in Apache Cassandra closely resemble the tables of relational databases, this article won't dwell on the basics. Instead, we'll explore more Cassandra-specific subjects, such as configuration or different types.

July 2, 2016 • Apache Cassandra

Compaction in Apache Cassandra

Disk compaction helps save space. Since Cassandra is designed to store a lot of data, it can't do without this useful process.

July 2, 2016 • Apache Cassandra

Partitioners in Apache Cassandra

Since Cassandra is a distributed storage system, it holds data on different nodes. But how does it determine which data should be stored by each node? That's the role of partitioners.

July 2, 2016 • Apache Cassandra

Deletes in Apache Cassandra

Keeping old data forever takes up space and slows down reads. Apache Cassandra is no exception and has a mechanism to remove data.

July 2, 2016 • Apache Cassandra

Example of data consistency in Apache Cassandra

Previously we presented the theory of data consistency in Cassandra. Now is a good moment to show some examples of consistency levels.

July 2, 2016 • Apache Cassandra

Data consistency in Cassandra

Compared with traditional standalone relational databases, distributed data brings a new problem: data consistency. Cassandra handles this problem quite well with its different consistency levels.

July 2, 2016 • Apache Cassandra

Data organization on disk in Apache Cassandra

Until now we've been working with Cassandra without looking at what happens on disk. It's time to be a little more curious.

May 5, 2016 • Apache Cassandra

Data part in Apache Cassandra

The previous article introduced Apache Cassandra with a broad overview of its main concepts. This article looks at data topics in more detail.

May 5, 2016 • Apache Cassandra

Introduction to Apache Cassandra

After some articles about data ingestion and serialization in Big Data applications, it's time to start learning about storage. This part begins with Apache Cassandra.

This article presents the basic concepts of Apache Cassandra. The first part explains the architecture and general concepts of this solution. The second part focuses on developer topics and describes the main points of data organization.

May 5, 2016 • Apache ZooKeeper

Watches in Apache ZooKeeper

Many programming tools implement an event-driven approach. Apache ZooKeeper is no exception to this rule, with its system of watchers.

May 5, 2016 • Apache ZooKeeper

ACL in Apache ZooKeeper

Apache ZooKeeper is very often compared to a distributed file system. Because every file system has a way to handle file permissions, ZooKeeper, as a kind of file system, is no different.

May 5, 2016 • Apache ZooKeeper

Asynchronous operations in Apache ZooKeeper

Sometimes network latency can slow down communication between Apache ZooKeeper and its client. That's one reason to use asynchronous operations when manipulating zNodes.

May 5, 2016 • Apache ZooKeeper

Manipulate zNodes in Apache ZooKeeper

Until now we've seen how to create zNodes. But creation is not the only thing that Apache ZooKeeper does.

May 5, 2016 • Apache ZooKeeper

Session in Apache ZooKeeper

A client connects to a ZooKeeper server and maintains a session. There are several things to know about ZooKeeper sessions, and we'll explore them in this article.

May 5, 2016 • Apache ZooKeeper

zNode in Apache ZooKeeper

As already mentioned, zNodes are a key part of Apache ZooKeeper. They store information shared among different servers directly (as binary data) or indirectly (as parent directories).

May 5, 2016 • Apache ZooKeeper

Introduction to Apache ZooKeeper

Apache ZooKeeper usually works in the shadow of more visible Big Data tools, such as Apache Spark or Apache Kafka. However, it plays a very important role in system architecture.

March 12, 2016 • Elasticsearch

Elasticsearch migration from 1.6 to 2.2

Elasticsearch 2.2.0 was released at the beginning of February 2016. Because my POC project was frozen at 1.6, I decided to upgrade, though not without surprises and some code rework.

March 12, 2016 • Apache Avro

Serialization and deserialization with schemas in Apache Avro

After the theoretical introduction to Apache Avro, we can see how it can be used.

March 12, 2016 • Apache Avro

Introduction to Apache Avro

Previously we learned why serialization frameworks can facilitate work in distributed systems, where data comes from several different sources. Now is a good time to discover some real tools used in the serialization step. As mentioned, the chosen tool is Apache Avro.
