Apache Spark on Kubernetes - useful commands

on waitingforcode.com

Getting started with a new tool and its CLI is never easy, and a list of useful debugging commands always helps. Spark on Kubernetes is no exception.

This post lists some kubectl commands that may be helpful during the first contact with the Kubernetes CLI. The commands are presented in a single list, and each of them comes with a short explanation and the generated output.

Among the commands that can help during the first contact with Spark on Kubernetes we can distinguish:

  • kubectl get pods --watch - kubectl's get command retrieves information about Kubernetes objects, here about pods. The extra --watch flag keeps listening for changes: every time a change happens on a watched object, it's pushed automatically, a little bit like tail -f.
    In the Spark on Kubernetes context this command is useful to see what happened with the pods, especially after the first unsuccessful tries:
    NAME                                               READY     STATUS    RESTARTS   AGE
    spark-pi-a33b36d1656c31039948a9d74e5f3868-driver   0/1       Error     0          2m
    spark-pi-ed55e575ad783c4d8997b7224f28c09e-driver   0/1       Pending   0         0s
    spark-pi-ed55e575ad783c4d8997b7224f28c09e-driver   0/1       Pending   0         0s
    spark-pi-ed55e575ad783c4d8997b7224f28c09e-driver   0/1       ContainerCreating   0         0s
    spark-pi-ed55e575ad783c4d8997b7224f28c09e-driver   0/1       Error     0         2s
  • kubectl describe pod spark-pi-ee0e0145b94a3dcf94506235bd8c5158-driver - prints detailed information about a specific Kubernetes object, here Spark's driver pod. It's helpful to: investigate what happened with the pod's containers (it prints their state), check whether custom configuration was correctly applied (e.g. custom labels), verify the resource allocation, or simply check the object definition prepared by the spark-submit client. A snippet of the output can look like:
    Name:         spark-pi-ee0e0145b94a3dcf94506235bd8c5158-driver
    Namespace:    default
    Node:         docker-for-desktop/
    Labels:       spark-app-selector=spark-923dc658b26547479570e3834aaae402
    Annotations:  spark-app-name=spark-pi
    Status:       Failed
        Container ID:  docker://f63e19366f6ffae958da175a7cc5925332214318bb86c6dcc5f1b7046d781176
        Image:         spark:my-tag
        Image ID:      docker://sha256:c9b6f825fbec6319a9337bfb8895e9de7e87af55ae828d9de1c0e67ffa7aebad
        State:          Terminated
          Reason:       Error
          Exit Code:    1
        Ready:          False
        Restart Count:  0
          memory:  1408Mi
          cpu:     1
          memory:  1Gi
          SPARK_DRIVER_MEMORY:        1g
          SPARK_DRIVER_CLASS:         org.apache.spark.examples.SparkPi
          SPARK_DRIVER_ARGS:          1000
          // ...
          /var/run/secrets/kubernetes.io/serviceaccount from default-token-jgd7n (ro)
      Type           Status
      Initialized    True
      Ready          False
      PodScheduled   True
        Type:        Secret (a volume populated by a Secret)
        SecretName:  default-token-jgd7n
        Optional:    false
    QoS Class:       Burstable
    Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                     node.kubernetes.io/unreachable:NoExecute for 300s
  • kubectl cluster-info - prints cluster information, such as the addresses of the master and of the services labeled kubernetes.io/cluster-service=true. In the context of Spark on Kubernetes it's useful to get the master address required by the spark-submit command. The output can look like:
    Kubernetes master is running at https://localhost:6445
    KubeDNS is running at https://localhost:6445/api/v1/namespaces/kube-system/services/kube-dns:dns/proxy
  • kubectl logs spark-pi-ee0e0145b94a3dcf94506235bd8c5158-driver -f - retrieves the logs of a given Kubernetes resource. The -f flag (f for follow) enables log streaming. Needless to say, this command should be the starting point of every debugging session:
    $ kubectl logs spark-pi-7f56238dc75d3162af4b7196a242392b-driver -f
    ++ id -u
    + myuid=0
    ++ id -g
    + mygid=0
    ++ getent passwd 0
    + uidentry=root:x:0:0:root:/root:/bin/ash
    + '[' -z root:x:0:0:root:/root:/bin/ash ']'
    + SPARK_K8S_CMD=driver
    + '[' -z driver ']'
    + shift 1
    + SPARK_CLASSPATH=':/opt/spark/jars/*'
    + env
    + grep SPARK_JAVA_OPT_
    + sed 's/[^=]*=\(.*\)/\1/g'
    + readarray -t SPARK_JAVA_OPTS
    + '[' -n '/opt/spark/jars/spark-examples_2.11-2.3.0.jar;/opt/spark/jars/spark-examples_2.11-2.3.0.jar' ']'
    + SPARK_CLASSPATH=':/opt/spark/jars/*:/opt/spark/jars/spark-examples_2.11-2.3.0.jar;/opt/spark/jars/spark-examples_2.11-2.3.0.jar'
    + '[' -n '' ']'
    + case "$SPARK_K8S_CMD" in
    + exec /sbin/tini -s -- /usr/lib/jvm/java-1.8-openjdk/bin/java -Dspark.driver.port=7078 -Dspark.master=k8s://https://localhost:6445 -Dspark.jars=/opt/spark/jars/spark-examples_2.11-2.3.0.jar,/opt/spark/jars/spark-examples_2.11-2.3.0.jar -Dspark.executor.instances=2 -Dspark.kubernetes.executor.podNamePrefix=spark-pi-b9eba2ce4ee33677853cf13f84119b54 -Dspark.driver.host=spark-pi-b9eba2ce4ee33677853cf13f84119b54-driver-svc.default.svc -Dspark.submit.deployMode=cluster -Dspark.app.name=spark-pi -Dspark.app.id=spark-ce9f9b930aa146559b054db1dcaa256c -Dspark.driver.blockManager.port=7079 -Dspark.kubernetes.driver.pod.name=spark-pi-b9eba2ce4ee33677853cf13f84119b54-driver -Dspark.kubernetes.container.image=spark:latest -cp ':/opt/spark/jars/*:/opt/spark/jars/spark-examples_2.11-2.3.0.jar;/opt/spark/jars/spark-examples_2.11-2.3.0.jar' -Xms1g -Xmx1g -Dspark.driver.bindAddress= org.apache.spark.examples.SparkPi
    2018-06-24 11:14:02 INFO  SparkContext:54 - Running Spark version 2.3.0
    2018-06-24 11:14:02 WARN  NativeCodeLoader:62 - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
    2018-06-24 11:14:02 INFO  SparkContext:54 - Submitted application: Spark Pi
    2018-06-24 11:14:02 INFO  SecurityManager:54 - Changing view acls to: root
    2018-06-24 11:14:02 INFO  SecurityManager:54 - Changing modify acls to: root
    2018-06-24 11:14:02 INFO  SecurityManager:54 - Changing view acls groups to:
    2018-06-24 11:14:02 INFO  SecurityManager:54 - Changing modify acls groups to:
    2018-06-24 11:14:02 INFO  SecurityManager:54 - SecurityManager: authentication disabled; ui acls disabled; users  with view permissions: Set(root); groups with view permissions: Set(); users  with modify permissions: Set(root); groups with modify permissions: Set()  
  • kubectl create -f driver_template.yaml --validate - if for some reason (in Apache Spark 2.3, Spark on Kubernetes is still marked as experimental) one of Spark's pods is not deployed correctly, it can be debugged by manipulating its template. To do so we first need to get the YAML template created by the spark-submit client. It can be done with the kubectl get pods spark-pi-ed55e575ad783c4d8997b7224f28c09e-driver -o yaml > driver_template.yaml command.
    Later we can modify the template and try to validate it with the --validate flag of the create command. If the flag is set, kubectl uses a schema to validate the template before sending it.
    For instance, if we remove a mandatory field such as the container's image, we'll end up with the following message:
      $ kubectl create -f /C/tmp/template_test.yaml   --validate
      The Pod "spark-pi-ed55e575ad783c4d8997b7224f28c09e-driver" is invalid: spec.containers[0].image: Required value
  • kubectl delete pod spark-pi-7826b7948b3539b8a74ddd909da31da3-driver - as the name indicates, this command deletes a Kubernetes object (a pod in this case). It can be useful if, due to a misconfiguration, a pod remains stuck for too long. Executing delete gives the following result:
    $ kubectl delete pod spark-pi-c0d471fa3f46318a8e8a754cdb9706d6-driver
    pod "spark-pi-c0d471fa3f46318a8e8a754cdb9706d6-driver" deleted
  • kubectl port-forward spark-pi-8663fb7f8d2531b29975461b62ae1cda-driver 4040:4040 - by default the Apache Spark UI is served locally inside the pod. But we can expose it on our localhost by simply forwarding port 4040 from the pod to the host (much like publishing a port of a Docker container). It can be done with the port-forward command, which should print the following output:
    $ kubectl port-forward spark-pi-8663fb7f8d2531b29975461b62ae1cda-driver 4040:4040
    Forwarding from -> 4040
    Handling connection for 4040
    Handling connection for 4040
    Handling connection for 4040
    Handling connection for 4040
    Handling connection for 4040
    Handling connection for 4040
  • kubectl get secrets - Spark programs can use secrets to handle sensitive configuration such as credentials. They can be defined with the spark.kubernetes.driver.secrets.spark-secret and spark.kubernetes.executor.secrets.spark-secret properties (spark-secret being the name of the secret here). To see which secrets are defined in a given namespace, the get secrets command may be used:
    $ kubectl get secrets
    NAME                  TYPE                                  DATA      AGE
    default-token-jgd7n   kubernetes.io/service-account-token   3         17d

    To go even deeper, each secret can be inspected with the describe command presented earlier, like this: kubectl describe secrets/default-token-jgd7n.
  • kubectl get namespaces - if you intend to test Spark on Kubernetes inside a separate namespace, you can check which ones are already defined with the get namespaces command. Its execution returns:
    $ kubectl get namespaces
    NAME          STATUS    AGE
    default       Active    17d
    docker        Active    17d
    kube-public   Active    17d
    kube-system   Active    17d
    spark-tests   Active    12d

    Since a namespace is also a Kubernetes object, we can view its properties with the describe command as well:
    $ kubectl describe namespace spark-tests
    Name:         spark-tests
    Status:       Active
    No resource quota.
    No resource limits.
  • kubectl describe nodes - yet another describe variant. This time it lets us see what happens on our cluster's nodes. The command shows the pods located on a given node as well as the used and allocatable resources:
        cpu:     3
        memory:  4023128Ki
        pods:    110
        cpu:     3
        memory:  3920728Ki
        pods:    110
      System Info:
        Non-terminated Pods:         (9 in total)
        Namespace                  Name                                          CPU Requests  CPU Limits  Memory Requests  Memory Limits
        ---------                  ----                                          ------------  ----------  ---------------  -------------
        docker                     compose-5d4f4d67b6-xx72m                      0 (0%)        0 (0%)      0 (0%)           0 (0%)
        docker                     compose-api-7bb7b5968f-twrbp                  0 (0%)        0 (0%)      0 (0%)           0 (0%)
        kube-system                etcd-docker-for-desktop                       0 (0%)        0 (0%)      0 (0%)           0 (0%)
        kube-system                kube-apiserver-docker-for-desktop             250m (8%)     0 (0%)      0 (0%)           0 (0%)
        kube-system                kube-controller-manager-docker-for-desktop    200m (6%)     0 (0%)      0 (0%)           0 (0%)
        kube-system                kube-dns-6f4fd4bdf-9f7sn                      260m (8%)     0 (0%)      110Mi (2%)       170Mi (4%)
        kube-system                kube-proxy-r78gr                              0 (0%)        0 (0%)      0 (0%)           0 (0%)
        kube-system                kube-scheduler-docker-for-desktop             100m (3%)     0 (0%)      0 (0%)           0 (0%)
        kube-system                kubernetes-dashboard-5bd6f767c7-cf2jg         0 (0%)        0 (0%)      0 (0%)           0 (0%)
    Allocated resources:
        (Total limits may be over 100 percent, i.e., overcommitted.)
        CPU Requests  CPU Limits  Memory Requests  Memory Limits
        ------------  ----------  ---------------  -------------
        810m (27%)    0 (0%)      110Mi (2%)       170Mi (4%)
    It can be useful to check the impact of our Spark application on the cluster at node level. We can also analyze one specific node by defining its name in the command.
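
As a small illustration of the kubectl cluster-info item above, the master address can be turned into the k8s:// URL that appears as spark.master in the driver logs. This is only a sketch: the spark_master_url function name is hypothetical, and it assumes the first "is running at" line of the output describes the master, as in the sample shown earlier. It also strips the ANSI color codes that kubectl may add to this output.

```shell
#!/bin/sh
# Hypothetical helper (not part of kubectl or Spark): reads the output of
# `kubectl cluster-info` on stdin and prints a k8s:// master URL.
# Assumption: the first "is running at" line describes the master.
spark_master_url() {
  esc=$(printf '\033')
  # Remove ANSI color sequences, then keep the last field of the first match.
  sed "s/${esc}\[[0-9;]*m//g" |
    awk '/ is running at / { print "k8s://" $NF; exit }'
}

# Usage against a live cluster (requires kubectl and cluster access):
#   kubectl cluster-info | spark_master_url
```

The function only parses text, so it can be tried on a saved copy of the cluster-info output before pointing it at a real cluster.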

The post listed some interesting commands that can help us start working with Spark on Kubernetes. Among them we can find a lot of kubectl describe examples, thanks to which we can easily see what is really executed (e.g. the pod specification). We also saw more network-related commands, such as port forwarding, which lets us reach the Spark driver's UI. The last category of commands concerns listing and is executed with kubectl get.
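
The listing commands combine well with standard text tools. The sketch below filters the table printed by kubectl get pods and keeps only the Spark driver pods whose STATUS column is Error, so that their names can be passed to kubectl logs or kubectl describe pod. The failed_drivers function name is hypothetical, and the filter assumes the default column layout (NAME, READY, STATUS, ...) and the -driver suffix visible in the outputs above.

```shell
#!/bin/sh
# Hypothetical helper: reads the tabular output of `kubectl get pods`
# on stdin and prints the names of driver pods in the Error status.
# Assumptions: default columns (NAME READY STATUS ...) and pod names
# ending with "-driver", as generated by spark-submit in this post.
failed_drivers() {
  awk '$1 != "NAME" && $1 ~ /-driver$/ && $3 == "Error" { print $1 }' |
    sort -u  # --watch may repeat a pod, so keep each name once
}

# Usage against a live cluster (requires kubectl and cluster access):
#   kubectl get pods | failed_drivers
```

Parsing the human-readable table is fragile by nature, so this is better suited to interactive debugging than to automation.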

Read also about Apache Spark on Kubernetes - useful commands here: Secrets, Spark on Kubernetes - secret management, Spark on Kubernetes - accessing driver UI.