Helm chart options

This page describes all the configuration options of the Data Mechanics Helm chart.

Global configurations

| Field | Description | Default | Can I change this? |
|---|---|---|---|
| controlPlaneUrl | The URL of the Data Mechanics control plane | https://control.datamechanics.co | |
| clusterKey | A key identifying your cluster | (set by Data Mechanics) | |
| customerKey | Optional. Setting a customer key here deactivates Data Mechanics authentication in your cluster; see here | (empty) | |
| jwt.publicKey | A key for the Data Mechanics internal JWT mechanism | (set by Data Mechanics) | |

imagePullSecret configurations

This section governs the secrets used by the platform to pull images from the Data Mechanics Docker registry.

| Field | Description | Default | Can I change this? |
|---|---|---|---|
| name | The name of the image pull secrets in namespaces data-mechanics and spark-apps | gcr-credentials | |
| create | Whether the Helm chart should create the image pull secrets | true | |
| base64DockerConfigSecret | The content of the image pull secrets. Required if create is true | (set by Data Mechanics) | |
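
As an illustration of how these fields combine, here is a minimal sketch of a values override for the case where you manage the image pull secrets yourself. It assumes the fields are nested under an imagePullSecret key, mirroring the ingress.tls.enabled notation used elsewhere on this page. The secret name is a hypothetical placeholder, and the secret is assumed to already exist in both namespaces.

```yaml
# Sketch only: key nesting is inferred from the section name.
imagePullSecret:
  # Hypothetical name of a secret you created yourself in the
  # data-mechanics and spark-apps namespaces.
  name: my-registry-credentials
  # The chart does not create the secrets, so
  # base64DockerConfigSecret is not required here.
  create: false
```

If you rename the secret, the sparkoperator imagePullSecrets option, which refers to the secret by name, presumably needs to match.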

ingress configurations

Data Mechanics can set up an ingress in the cluster. See the Ingress page for more details.

| Field | Description | Default | Can I change this? |
|---|---|---|---|
| enabled | Whether the Helm chart should create an ingress pointing to the Data Mechanics service | false | |
| tls.enabled | Whether the Helm chart should set up TLS on the ingress | false | |
| tls.certificateSecretName | The TLS secret to be used in the ingress. Required if ingress.tls.enabled is set | (empty) | |
| tls.domain | The domain to be used for TLS. Required if ingress.tls.enabled is set | (empty) | |
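
The two "Required if" rules above are easier to see in a concrete override. The following is a minimal sketch of a values fragment that enables the ingress with TLS. The secret name and the domain are hypothetical placeholders, and the TLS secret is assumed to already exist in the cluster.

```yaml
# Sketch only: my-tls-secret and dm.example.com are placeholders.
ingress:
  enabled: true
  tls:
    enabled: true
    # Both fields below are required once tls.enabled is set.
    certificateSecretName: my-tls-secret
    domain: dm.example.com
```

Leaving tls.enabled at its default of false lets you omit both tls.certificateSecretName and tls.domain.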

submissionService configurations

The submission service is the main Data Mechanics service in your Kubernetes cluster.

| Field | Description | Default | Can I change this? |
|---|---|---|---|
| image | The Docker image for the submission service | gcr.io/dm-docker/submission-service:0.2.1 | |
| imagePullPolicy | The pull policy for the submission service | IfNotPresent | |
| apiPrefix | The path to the Data Mechanics API | /api | |
| dashboardPrefix | The path to the Data Mechanics dashboard | /dashboard | |
| backendSentryUrl | We use Sentry to be alerted upon failure. Please do not remove :) | (set by Data Mechanics) | |
| frontendSentryUrl | We use Sentry to be alerted upon failure. Please do not remove :) | (set by Data Mechanics) | |

notebookService configurations

The notebook service allows you to run Jupyter kernels on your Kubernetes cluster. See Jupyter notebooks for details.

| Field | Description | Default | Can I change this? |
|---|---|---|---|
| enabled | Whether to deploy the notebook service | true | |
| image | The Docker image for the notebook service | gcr.io/dm-docker/notebook-service:0.2.1 | |
| imagePullPolicy | The pull policy for the notebook service | IfNotPresent | |
| prefix | The prefix path for the notebook service | /notebooks | |
| logLevel | The log level of Jupyter enterprise gateway | DEBUG | |
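
For example, a values fragment that keeps the notebook service deployed but makes its logs less verbose might look like the sketch below. It assumes the fields are nested under a notebookService key, and that INFO is an accepted log level. The table above only documents the default, DEBUG.

```yaml
# Sketch only: INFO as a valid level is an assumption.
notebookService:
  enabled: true
  prefix: /notebooks
  logLevel: INFO
```

Setting enabled to false instead skips deploying the notebook service altogether.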

sparkoperator configurations

Data Mechanics relies on the Spark operator to run Spark applications on Kubernetes.

Below are the options that we explicitly set in the Data Mechanics Helm chart. The operator has many more configuration options; see its README for the full list. Change at your own risk!

| Field | Description | Default | Can I change this? |
|---|---|---|---|
| sparkJobNamespace | The namespace where Spark applications are run | spark-apps | |
| operatorImageName | The Docker image for the Spark operator used by Data Mechanics. We use a patched version, so change at your own risk | gcr.io/dm-docker/spark-operator | |
| operatorVersion | The Docker image version for the Spark operator used by Data Mechanics. We use a patched version, so change at your own risk | v1beta2-1.0.1-3.0.0 | |
| imagePullSecrets | The name of the secrets in namespace spark-apps where the credentials for Docker registry gcr.io/dm-docker are stored. See here for more information | - name: gcr-credentials | |
| imagePullPolicy | The pull policy for the Spark operator | IfNotPresent | |

overprovisioning configurations

Data Mechanics maintains excess capacity in your Kubernetes cluster so that there is always room for new Spark applications. This "overprovisioning" can be static (the excess capacity is fixed) or proportional to the total CPU used by the pods on the cluster.

| Field | Description | Default | Can I change this? |
|---|---|---|---|
| pausePodImage | The Docker image used for overprovisioning | k8s.gcr.io/pause | |
| pausePodCPURequest | The number of cores used by a pause pod | 1000m | |
| proportional.enabled | Whether to activate proportional overprovisioning. If false, static overprovisioning is used | false | |
| static.pausePodReplicas | The excess capacity maintained by Data Mechanics in the cluster, expressed as a number of pause pods | 3 | |
| proportional.autoscalerImage | The Docker image of the overprovisioning autoscaler | k8s.gcr.io/cluster-proportional-autoscaler-amd64:1.7.1 | |
| proportional.coresPerReplica | Controls the excess capacity maintained by Data Mechanics in the cluster for proportional autoscaling. See here for a formula | 2.0 | |
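
To make the static mode concrete, the sketch below spells out the defaults from the table as a values fragment, assuming the fields are nested under an overprovisioning key. With these values the reserved capacity is the number of pause pods times the CPU request of each pod.

```yaml
# Sketch only: these are the documented defaults, written out explicitly.
overprovisioning:
  pausePodCPURequest: 1000m
  proportional:
    enabled: false      # false selects static overprovisioning
  static:
    # 3 pause pods x 1000m each = 3 cores of excess capacity
    pausePodReplicas: 3
```

Raising static.pausePodReplicas to 5 with the same CPU request would reserve 5 cores. In proportional mode, proportional.coresPerReplica is the setting that controls the excess capacity instead.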