Apache Cassandra blog posts on waitingforcode.com

Bartosz Konieczny

April 8, 2018 • Apache Cassandra

Range query algorithm in Apache Cassandra

When I was learning about the secondary index in Cassandra, I found a mention of a special algorithm that Cassandra uses for range and secondary index queries. After some time spent exploring the secondary index mechanism, it's a good moment to discover the algorithm that makes it work.

Continue Reading →

April 22, 2017 • Apache Cassandra

Dockerize Cassandra troubleshooting

Some time ago I tried to create a Docker image with Cassandra and some other programs. For the others, the operation was quite easy, but Cassandra caused some problems because of several configuration properties.

Continue Reading →

July 2, 2016 • Apache Cassandra

Mapper in Cassandra Java API

Before writing some code for Apache Cassandra, we'll explore a very interesting dependency: cassandra-driver-mapping.

Continue Reading →

July 2, 2016 • Apache Cassandra

Cache in Apache Cassandra

I/O operations are slower than memory lookups. That's why an in-memory cache helps to improve performance, in Cassandra too.

Continue Reading →

July 2, 2016 • Apache Cassandra

Collections in Apache Cassandra

Collections are one of the interesting data types in Apache Cassandra. In our model we can freely use maps, sets or lists.

Continue Reading →

July 2, 2016 • Apache Cassandra

Tables in Apache Cassandra

Because tables in Apache Cassandra are very similar to tables in relational databases, this article won't focus on the basics. Instead, we'll explore more Cassandra-specific subjects, such as configuration or different types.

Continue Reading →

July 2, 2016 • Apache Cassandra

Compaction in Apache Cassandra

Disk compaction helps to save space. Since Cassandra is supposed to store a lot of data, it can't do without this useful process.

Continue Reading →

July 2, 2016 • Apache Cassandra

Partitioners in Apache Cassandra

Since Cassandra is a distributed storage system, it holds data on different nodes. But how does it determine which data should be stored on each node? That's the role of partitioners.

Continue Reading →

July 2, 2016 • Apache Cassandra

Deletes in Apache Cassandra

Keeping old data forever takes up space and makes reads slower. Apache Cassandra is no exception and has a mechanism to remove data.

Continue Reading →

July 2, 2016 • Apache Cassandra

Example of data consistency in Apache Cassandra

Previously we presented the theory of data consistency in Cassandra. Now it's a good moment to show some examples of consistency levels.

Continue Reading →

July 2, 2016 • Apache Cassandra

Data consistency in Cassandra

Distributed data brings a problem that traditional standalone relational databases did not have to face: data consistency. Cassandra deals with this problem quite nicely through its different consistency levels.

Continue Reading →

July 2, 2016 • Apache Cassandra

Data organization on disk in Apache Cassandra

Until now we've been working with Cassandra without looking at what happens underneath. It's time to be a little more curious.

Continue Reading →

May 5, 2016 • Apache Cassandra

Data part in Apache Cassandra

The previous article introduced Apache Cassandra by loosely presenting its main concepts. This article focuses in more detail on data topics.

Continue Reading →

May 5, 2016 • Apache Cassandra

Introduction to Apache Cassandra

After some articles about data ingestion and serialization in Big Data applications, it's time to start learning about storage. This part begins with Apache Cassandra.

This article presents the basic concepts of Apache Cassandra. The first part explains the architecture and general concepts of this solution. The second part focuses more on developer topics and describes some main points about data organization.

Continue Reading →
