Configuring TLS CA on Databricks

Time:07-26

I am trying to call an ArcGIS geocoding service from Databricks. The URL requires a certificate, which I have copied to Blob storage and am referring to in the

verify = "Path to mycertificate"

section of the code.

The certificate is stored in the following location:

/dbfs/mnt/tmp/myfolder/mycertificate.pem

When I access the certificate from the above path, I get the following error:

SSLError: HTTPSConnectionPool(host='MyDomain.com', port=443): Max retries exceeded with url: "MyURL" (Caused by SSLError(SSLEOFError(8, 'EOF occurred in violation of protocol (_ssl.c:1131)')))

When I change the above certificate path from

/dbfs/...

to

dbfs:/Path/To/Mycertificate

I get the following error:

OSError: Could not find a suitable TLS CA certificate bundle, invalid path: dbfs:/mnt/tmp/path/to/mycertificate.pem

I have also tried the operation with the .pfx file, but I get the same error.

As I am new to certificates in Databricks, any help on how to fix the errors and get the service working would be really appreciated.

I have also googled and read many documents, to no avail. Nothing is working, and it appears I am missing something basic here.

Thanks

CodePudding user response:

To fix this, the SSL certificate needs to be installed on the cluster nodes. For Python, there are two locations to consider:

  1. The Linux CA certificate chain, which is used for most SSL connections
  2. The CA certificate chain used by the requests package, which is provided by the certifi package
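
As a rough way to see which of these locations a given Python environment points at, the sketch below prints the OpenSSL default paths and, if the package is installed, the certifi bundle path. The helper name `ca_locations` is made up for illustration. `REQUESTS_CA_BUNDLE` is the environment variable that requests consults for an alternative bundle, so it is listed as well.

```python
import os
import ssl


def ca_locations():
    """Collect the CA locations this interpreter would consult (illustrative helper)."""
    paths = ssl.get_default_verify_paths()  # OpenSSL defaults, i.e. the system CA chain
    try:
        import certifi  # optional: provides the bundle that requests uses by default
        certifi_bundle = certifi.where()
    except ImportError:
        certifi_bundle = None
    return {
        "openssl_cafile": paths.cafile or paths.openssl_cafile,
        "openssl_capath": paths.capath or paths.openssl_capath,
        "certifi_bundle": certifi_bundle,
        "REQUESTS_CA_BUNDLE": os.environ.get("REQUESTS_CA_BUNDLE"),
    }


for name, value in ca_locations().items():
    print(f"{name}: {value}")
```

Running this in a notebook cell before and after changing the cluster configuration should show whether the paths themselves differ; it does not inspect the contents of the bundles.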

You can install the certificates with the following cluster init script, which installs your CA certificates in PEM format into all CA certificate chains (Linux, Java, and certifi). You need to put your own /dbfs/... paths into the certs variable:

#!/bin/bash
#
# File: install-ssl-certificates.sh
# Author: Alex Ott
# Created: Tuesday, September 28 2021
#

#set -x

# CA certificates in PEM format to install; replace with your own /dbfs/... paths
declare -a certs=("/dbfs/tmp/myCA.pem" "/dbfs/tmp/myCA2.pem")

mkdir -p /usr/share/ca-certificates/extra
# Path of the certifi bundle file (empty if certifi is not installed)
CERTIFI_HOME="$(python -m certifi 2>/dev/null)"
J_HOME="$(dirname "$(realpath "$(which java)")")/.."

for cert in "${certs[@]}"; do
    BNAME="$(basename "$cert")"
    echo "cert=$cert BNAME=$BNAME"
    # Linux CA certificate chain
    cp "$cert" "/usr/share/ca-certificates/extra/$BNAME"
    echo "extra/$BNAME" >> /etc/ca-certificates.conf
    # certifi bundle used by the requests package
    if [ -n "$CERTIFI_HOME" ]; then
        cat "$cert" >> "$CERTIFI_HOME"
    fi
    # Java trust store
    keytool -importcert -keystore "${J_HOME}/lib/security/cacerts" -file "$cert" -alias "$(basename "$cert" .pem)" -storepass changeit -noprompt
done

update-ca-certificates
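
Once an init script like this has run, one way to confirm from a notebook that a certificate actually landed in a bundle is to compare PEM bodies. This is a minimal sketch using only the standard library. `pem_bodies` and `cert_in_bundle` are hypothetical helper names, and the bundle path is an assumption: it would be whatever `certifi.where()` (or the system bundle location) returns on your cluster.

```python
import re

# Matches each certificate block in a PEM file and captures its base64 body.
PEM_BLOCK = re.compile(
    r"-----BEGIN CERTIFICATE-----(.*?)-----END CERTIFICATE-----", re.DOTALL
)


def pem_bodies(text):
    """Return the set of base64 certificate bodies in PEM text, whitespace removed."""
    return {"".join(body.split()) for body in PEM_BLOCK.findall(text)}


def cert_in_bundle(cert_path, bundle_path):
    """True if cert_path holds at least one certificate and all of them appear in bundle_path."""
    with open(cert_path) as f:
        wanted = pem_bodies(f.read())
    with open(bundle_path) as f:
        present = pem_bodies(f.read())
    return bool(wanted) and wanted <= present


# Example call (paths are assumptions):
#   import certifi
#   cert_in_bundle("/dbfs/tmp/myCA.pem", certifi.where())
```

The comparison is textual, so it only tells you that the same PEM block is present; it does not validate the certificate or the chain.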

CodePudding user response:

I have found that the firewall was blocking the request. The solution was to unblock the target IP address on the firewall and to pass the location of the ".pem" file in the request call, in the format

session.post(url, data=d, verify="path to my certificate .pem file")

That resolved the issue. Thanks
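
For readers hitting the "invalid path: dbfs:/..." error from the question: requests expects `verify` to be a path on the local filesystem, so a `dbfs:/` URI is not accepted, while the `/dbfs/` mount is. The sketch below is one possible way to wrap the call above, not the poster's actual code. `to_local_path` and `post_with_ca` are hypothetical helpers, and the mapping of `dbfs:/` to `/dbfs/` assumes the standard DBFS mount is available on the node.

```python
import os


def to_local_path(path):
    """Map a dbfs:/ URI to the /dbfs/ mount; other paths are returned unchanged."""
    prefix = "dbfs:/"
    if path.startswith(prefix):
        return "/dbfs/" + path[len(prefix):].lstrip("/")
    return path


def post_with_ca(url, data, ca_path):
    """POST with an explicit CA bundle, failing early if the file is missing."""
    local_path = to_local_path(ca_path)
    if not os.path.isfile(local_path):
        raise FileNotFoundError(f"CA bundle not found: {local_path}")

    # Imported here so the path helper can be used without requests installed.
    import requests

    with requests.Session() as session:
        return session.post(url, data=data, verify=local_path)
```

Checking the file first turns a confusing TLS error into a plain "file not found". It does not help with network-level blocks such as the firewall issue described above.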
