Connecting to an External Hive Metastore
This page shows how to connect your Data Mechanics Spark applications and jobs to an external Hive metastore database.
The process involves the following steps:
- Make the jar file containing your database's JDBC driver accessible to Data Mechanics
- Set the required Spark configuration properties
JDBC Driver jar file
For PostgreSQL, the jar file can be found here. There are several ways to make this dependency available to your applications; here are two:
Option 1: Download and Copy the JDBC driver jar file to your Data Mechanics image
Download the jar file and create a new Docker image like this:
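A minimal Dockerfile sketch; the base image tag and driver version are illustrative, so use the ones matching your setup:

```dockerfile
# Start from a Data Mechanics Spark image (tag is illustrative)
FROM gcr.io/datamechanics/spark:platform-3.1-latest

# Copy the downloaded PostgreSQL JDBC driver onto Spark's classpath
COPY postgresql-42.2.5.jar /opt/spark/jars/
```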
Option 2: Reference the dependency to the jar file in your Data Mechanics template
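Instead of baking the jar into the image, you can point your template at the jar's URL. A hypothetical snippet, assuming the template exposes a `deps.jars` attribute that accepts remote jar URIs (as in the Spark-on-Kubernetes operator spec):

```yaml
# Hypothetical: deps.jars is assumed to accept remote jar URIs
deps:
  jars:
    - "https://jdbc.postgresql.org/download/postgresql-42.2.5.jar"
```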
For a complete reference on template attributes, see here.
Spark Configuration
There are several ways to configure the connection credentials; here are two:
Option 1: Specify the connection in a Data Mechanics template
The configuration can live either in a Data Mechanics template or in the core-site.xml file of the Hadoop configuration. With this option, the connection settings, including credentials, go directly into the Spark configuration of the template:
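A sketch of the relevant Spark properties, assuming the template exposes a `sparkConf` attribute; replace the placeholders with your metastore's values:

```yaml
sparkConf:
  # JDBC URL of the metastore database (placeholders to fill in)
  spark.hadoop.javax.jdo.option.ConnectionURL: "jdbc:postgresql://<host>:5432/<database>"
  spark.hadoop.javax.jdo.option.ConnectionDriverName: "org.postgresql.Driver"
  spark.hadoop.javax.jdo.option.ConnectionUserName: "<user>"
  spark.hadoop.javax.jdo.option.ConnectionPassword: "<password>"
```

The `spark.hadoop.` prefix tells Spark to forward these properties to the Hadoop/Hive configuration.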
Additionally, if you use an older version of Hive, you can add:
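For example (the version value is illustrative and should match your metastore):

```yaml
sparkConf:
  # Match the version of your existing Hive metastore
  spark.sql.hive.metastore.version: "2.3.7"
  # "maven" makes Spark download the matching Hive jars at runtime
  spark.sql.hive.metastore.jars: "maven"
```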
Option 2: Specify the connection in the core-site.xml file
If you prefer not to store credentials in the template, you can put the confidential values in the core-site.xml file instead; the other parameters remain in the Data Mechanics template.
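A sketch of such a core-site.xml, keeping only the credentials there. Note that inside Hadoop configuration files the `spark.hadoop.` prefix is dropped:

```xml
<configuration>
  <!-- Illustrative values: replace with your metastore credentials -->
  <property>
    <name>javax.jdo.option.ConnectionUserName</name>
    <value>your-user</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionPassword</name>
    <value>your-password</value>
  </property>
</configuration>
```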
The core-site.xml file lives in the directory pointed to by $HADOOP_CONF_DIR. See configuring environment variables.