Apache Parquet articles

on waitingforcode.com

Compression in Parquet

Last time we've discovered different encoding methods available in Apache Parquet. But the encoding is not the single technique helping to reduce the size of files. The other one, very similar, is the compression. Continue Reading →

Schema versions in Parquet

When I've started to play with Apache Parquet I was surprised about 2 versions of writers. Before approaching the rest of planed topics, it's a good moment to explain these different versions better. Continue Reading →

Introduction to Apache Parquet

Very often an appropriate storage is as important as the data processing pipeline. And among different possibilities we can still store the data in files. Thanks to different formats, such as column-oriented ones, some of actions in reading path can be optimized. Continue Reading →