Apache Spark data sources articles

Apache Spark 2.4.0 features - Avro data source

Apache Avro became one of the serialization standards, among others because of its use in Apache Kafka's schema registry. Previously to work with Avro files with Apache Spark we needed Databrick's external package. But it's no longer the case starting from 2.4.0 release where Avro became first-class citizen data source.

Continue Reading →