Articles about Apache Kafka on waitingforcode.com - articles for the pleasure of learning and discovery

November 21, 2020 • Apache Kafka

Isolation level in Apache Kafka consumers

Whoever says transactions automatically invokes isolation levels, that is, what a consumer can see of uncommitted transactions. Apache Kafka also implements this concept, and I will take a closer look at it in this blog post.

September 27, 2020 • Apache Kafka

Control messages in Apache Kafka

During my last exploration of log compaction, I found a method called isControlBatch. At the time, I only had a rough idea about this category of batches, which is why I decided to learn a bit more about them.

September 20, 2020 • Apache Kafka

Records writing in Apache Kafka

In my journey to understand transaction internals in Apache Kafka, I discovered another intriguing class that, by the way, led me to a few others ;) This class is RecordBatch, but in this blog post you will also meet MemoryRecords and FileRecords.

September 13, 2020 • Apache Kafka

Offset-based lookup in Apache Kafka

In March I published a blog post about timestamp-based lookup in Apache Kafka. But as you know, it's not the only lookup possibility. Another one uses indexes, and it will be the topic of this article.

August 30, 2020 • Apache Kafka

Logs compaction in Apache Kafka - compact cleanup policy

The delete policy, covered in one of my previous posts, is not the only cleanup policy in Apache Kafka. Another one is compaction, which reduces the size of the segments by keeping the last value for every key.

August 16, 2020 • Apache Kafka

Logs compaction in Apache Kafka - delete and cleanup policy

Since my very first experiences with Apache Kafka, I have always been amazed by the features handled by this tool. One of them, which I haven't had a chance to explore yet, is log compaction. I will shed some light on it in this and next week's articles.

May 17, 2020 • Apache Kafka

Files synchronization - zoom at Kafka HDFS Connector

When I was playing with the Kafka HDFS Connector, I saw that the generated files are suffixed with some numbers. It intrigued me, and I decided to explore the topic in this article.

March 29, 2020 • Apache Kafka

Timestamp-based lookup in Apache Kafka

The next thing I wanted to understand while still working on transactions was the lookup. I can imagine how to get the first or the last element of a partition but I had no idea how it can work for more fine-grained access, like the one using timestamp.

January 26, 2020 • Apache Kafka

Apache Kafka idempotent producer

I wrote all the posts published this year about Apache Kafka (NIO, max in-flight requests) to better understand idempotent producers. In this post I'll try to do that before going further and analyzing transaction support.

January 18, 2020 • Apache Kafka

Apache Kafka and max.in.flight.requests.per.connection

I didn't plan to write this post at all. However, when I was analyzing the idempotent producer, it was hard to understand the out-of-sequence policy for multiple in-flight requests without understanding what the in-flight requests parameter really means.

January 12, 2020 • Apache Kafka

NIO Selector in Apache Kafka

It's rare that I need to cover more than 3 other topics in order to write a blog post. But that's what happens with the post about the Apache Kafka idempotent producer that I will publish soon. Before that, I need to understand and explain the NIO Selector, its role in Apache Kafka, and finally the in-flight requests. Since the first topic was already covered, I will move on to the second one.

April 22, 2017 • Apache Kafka

Dockerize Kafka

Dockerizing Apache Kafka is less complicated than dockerizing Cassandra (take a look at the post about Dockerize Cassandra troubleshooting). Even so, there are some things to know, broadly of the same type as in the case of Cassandra.

July 2, 2016 • Apache Kafka

Replication in Apache Kafka

Since Apache Kafka is a distributed messaging system and we haven't described replication yet, it's a good moment to do so.

July 2, 2016 • Apache Kafka

Requests in Apache Kafka

Kafka clients communicate with the broker through a dedicated TCP connection. They send a lot of different requests, mostly to handle possible rebalancing.

July 2, 2016 • Apache Kafka

Controller in Apache Kafka

The controller is a well-known concept for those who have worked with the MVC paradigm. But in Kafka, it shouldn't be thought of in the same terms.

July 2, 2016 • Apache Kafka

Rebalancing in Apache Kafka

If you're working with Kafka, rebalancing is probably one of the words you meet most often. It's also an important one because it helps to ensure correct message consumption.

July 2, 2016 • Apache Kafka

Coordinator in Apache Kafka

Since Kafka is a distributed system, it naturally has to coordinate its members somehow. The synchronization mechanism is ensured by coordinators.

July 2, 2016 • Apache Kafka

Configuring Apache Kafka

Until now we've used almost only default configuration values. In this article we'll look at some of the configuration possibilities in more detail.

July 2, 2016 • Apache Kafka

Messages in Apache Kafka

Messages are an intrinsic part of every messaging system. Having previously learned about producing and consuming messages, it's a good moment to see what these messages really are.

July 2, 2016 • Apache Kafka

Producers in Apache Kafka

After presenting consumers in Kafka, it's a good moment to move on to producers.
