Spark on Kubernetes articles

Articles tagged with Spark on Kubernetes. There are 6 articles with this tag. If you don't find what you're looking for, check the related tags: Apache Spark 2.4.0 features, Apache Spark data sources, Apache Spark internals, Apache Spark Structured Streaming joins, AWS certification, AWS EC2, Big Data patterns implemented, Bloom filters, bucketing in Spark SQL, Cerberus + PySpark.

Setting up Apache Spark on Kubernetes with microk8s

When I discovered microk8s, I was delighted! Installation takes only a few steps, and then you can start playing with Kubernetes locally (I tried it on Ubuntu 16). However, running Apache Spark 2.4.4 on top of microk8s is not a piece of cake. In this post I will show you four different problems you may encounter and propose possible solutions.

Apache Spark on Kubernetes - global overview

Recent years have seen Kubernetes grow steadily in popularity. Thanks to its replication and scalability properties, it is used more and more often in distributed architectures. Apache Spark, through a dedicated working group, has been steadily integrating Kubernetes. In the current version (2.3.1), this new way of scheduling jobs is part of the project as an experimental feature.

What Kubernetes can bring to Apache Spark pipelines?

The commercial version of Apache Spark distributed by Databricks offers a serverless, auto-scalable approach for applications written in this framework. Over time, other companies have tried to provide similar alternatives, even going as far as putting Apache Spark pipelines into AWS Lambda functions. But with version 2.3.0, another option for handling scalability and elasticity appeared: Kubernetes.