Spark SQL and Avro articles

Apache Avro and Apache Spark compatibility

I'm very happy when the readers comment on my posts or tweets. A lot of such discussions are the topics of posts. It's the case of this one where I try to figure out whether Apache Spark SQL Avro source is compatible with other applications using this serialization format.

Continue Reading →

Apache Spark 2.4.0 features - Avro data source

Apache Avro became one of the serialization standards, among others because of its use in Apache Kafka's schema registry. Previously to work with Avro files with Apache Spark we needed Databrick's external package. But it's no longer the case starting from 2.4.0 release where Avro became first-class citizen data source.

Continue Reading →