Configuration templates

This page assumes that you know how to run a Spark application on Data Mechanics.

Data Mechanics provides a way to store configurations and use them when launching Spark applications. This is useful when you need to share a large configuration between applications, or when you simply don't want to store configurations on your side.

Config templates are configuration fragments stored in Data Mechanics. The API routes under http(s)://<your-cluster-url>/api/config-templates/ let you manage them as a REST resource.

To learn more about the API routes and parameters, check out the API reference or navigate to http(s)://<your-cluster-url>/api/ in your browser.

In the previous page about accessing data on GCS, we added the following block to the configuration of our Spark application:

"executor": {
  "secrets": [
    {
      "name": "gcs-svc-account",
      "path": "/mnt/secrets",
      "secretType": "GCPServiceAccount"
    }
  ]
},
"driver": {
  "secrets": [
    {
      "name": "gcs-svc-account",
      "path": "/mnt/secrets",
      "secretType": "GCPServiceAccount"
    }
  ]
}

The following command creates a config template gcs-access containing this block:

curl -X POST \
  http(s)://<your-cluster-url>/api/config-templates/ \
  -H 'Content-Type: application/json' \
  -d '{
    "name": "gcs-access",
    "config": {
      "executor": {
        "secrets": [
          {
            "name": "gcs-svc-account",
            "path": "/mnt/secrets",
            "secretType": "GCPServiceAccount"
          }
        ]
      },
      "driver": {
        "secrets": [
          {
            "name": "gcs-svc-account",
            "path": "/mnt/secrets",
            "secretType": "GCPServiceAccount"
          }
        ]
      }
    }
  }'
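Because the driver and executor blocks are identical, the request body can also be generated from a single secret definition. The following is a minimal Python sketch of that idea, not an official client: the helper names build_gcs_template and make_request are hypothetical, CLUSTER_URL is a placeholder you must replace, and the request is only prepared, not sent.

```python
import json
import urllib.request

# Placeholder taken from this page; replace with your real cluster URL.
CLUSTER_URL = "https://<your-cluster-url>"


def build_gcs_template(name="gcs-access"):
    """Build the config template body, reusing one secret definition."""
    secret = {
        "name": "gcs-svc-account",
        "path": "/mnt/secrets",
        "secretType": "GCPServiceAccount",
    }
    return {
        "name": name,
        "config": {
            # Copy the secret so the two blocks can be edited independently.
            "executor": {"secrets": [dict(secret)]},
            "driver": {"secrets": [dict(secret)]},
        },
    }


def make_request(payload):
    """Prepare (but do not send) the same POST as the curl command above."""
    return urllib.request.Request(
        CLUSTER_URL + "/api/config-templates/",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


payload = build_gcs_template()
request = make_request(payload)
print(json.dumps(payload, indent=2))
# To actually send it: urllib.request.urlopen(request)
```

Printing the payload before sending makes it easy to compare the generated JSON with the curl body above.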

The gcs-access template can now be referenced when submitting a Spark application, using the configTemplateName field:

curl -X POST \
  http(s)://<your-cluster-url>/api/apps/ \
  -H 'Content-Type: application/json' \
  -d '{
    "jobName": "word-count",
    "configTemplateName": "gcs-access",
    "configOverrides": {
      "type": "Scala",
      "sparkVersion": "3.0.0",
      "mainApplicationFile": "gs://<your-bucket>/wordcount.jar",
      "mainClass": "org.<your-org>.wordcount.WordCount",
      "arguments": ["gs://<your-bucket>/input/*", "gs://<your-bucket>/output"]
    }
  }'

Data Mechanics merges the configuration from the gcs-access config template with the configuration in configOverrides. When both define the same field, the value in configOverrides takes precedence over the one in the config template.
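To build intuition for this precedence rule, here is a small Python sketch of one plausible merge strategy. It is an illustration only: this page does not specify whether Data Mechanics merges nested objects recursively or how it treats lists, so the recursive behavior below is an assumption. The function merge_config, the template's sparkVersion value, and the driver "cores" field are hypothetical and exist only to show a conflict being resolved.

```python
def merge_config(template, overrides):
    """Recursively merge two config dicts; values from `overrides` win.

    Assumption: nested objects are merged key by key, while any other
    value (string, number, list) in `overrides` replaces the template's.
    """
    merged = dict(template)  # shallow copy so `template` is not modified
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_config(merged[key], value)
        else:
            merged[key] = value
    return merged


# Template content: the secrets block from this page, plus a hypothetical
# sparkVersion so that the two sides disagree on one field.
template = {
    "sparkVersion": "2.4.5",
    "driver": {
        "secrets": [
            {
                "name": "gcs-svc-account",
                "path": "/mnt/secrets",
                "secretType": "GCPServiceAccount",
            }
        ]
    },
}

# Overrides: "cores" is a hypothetical field used to show a nested merge.
overrides = {
    "type": "Scala",
    "sparkVersion": "3.0.0",
    "driver": {"cores": 2},
}

merged = merge_config(template, overrides)
print(merged["sparkVersion"])    # value taken from the overrides
print(sorted(merged["driver"]))  # keys contributed by both sides
```

Under these assumptions, the merged configuration keeps the driver secrets from the template, gains the fields that appear only in the overrides, and uses the overrides' sparkVersion where the two disagree.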