I am trying to write a Dockerfile that builds a container that uses Databricks Connect, so I need to install and set up Databricks Connect through Docker RUN commands. I have the following:
FROM python:3.8
COPY requirements.txt /tmp/
RUN apt-get update \
    && apt-get install software-properties-common -y \
    && apt-get update \
    && apt-add-repository "deb http://security.debian.org/debian-security stretch/updates main" \
    && apt-get update \
    && apt-get install openjdk-8-jdk -y
RUN pip install --requirement /tmp/requirements.txt \
    && databricks-connect configure \
    && databricks-connect test
as a simplified example that reproduces my problem. The step databricks-connect configure prompts for license acceptance with a default of N, and because nothing is available on standard input during the build, it throws the following error:
...
#14 1.345 Do you accept the above agreement? [y/N] Traceback (most recent call last):
#14 1.346 File "/usr/local/bin/databricks-connect", line 8, in <module>
#14 1.346 sys.exit(main())
#14 1.346 File "/usr/local/lib/python3.8/site-packages/pyspark/databricks_connect.py", line 281, in main
#14 1.346 configure()
#14 1.346 File "/usr/local/lib/python3.8/site-packages/pyspark/databricks_connect.py", line 119, in configure
#14 1.346 accept = input().strip()
#14 1.346 EOFError: EOF when reading a line
------
executor failed running [/bin/sh -c databricks-connect configure]: exit code: 1
How can I accept this automatically as part of the Docker build?
CodePudding user response:
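You need something like this (taken from this demo), because besides accepting the license terms you also need to provide the other parameters. The $(...) entries are placeholders for your own values; in a plain shell $(...) is command substitution, so replace them with literal values or environment variables: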
echo "y
$(databricks_host)
$(databricks_token)
$(cluster_id)
$(org_id)
15001" | databricks-connect configure
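For a Docker build, the same piped answers can come from environment variables. The following is a minimal sketch, not the demo's own code: the variable names DATABRICKS_HOST, DATABRICKS_TOKEN, DATABRICKS_CLUSTER_ID, DATABRICKS_ORG_ID and DATABRICKS_PORT, and the function name configure_answers, are hypothetical and chosen only for illustration. The order of the answers is assumed to match the snippet above.

```shell
#!/bin/sh
# Print the answers for `databricks-connect configure`, one per line,
# in the same order as the echo snippet above: license, host, token,
# cluster id, org id, port.
# All variable names here are illustrative assumptions.
configure_answers() {
    printf '%s\n' \
        "y" \
        "$DATABRICKS_HOST" \
        "$DATABRICKS_TOKEN" \
        "$DATABRICKS_CLUSTER_ID" \
        "$DATABRICKS_ORG_ID" \
        "${DATABRICKS_PORT:-15001}"
}
```

With the variables set, the function output is piped into the tool in place of the interactive prompts: configure_answers | databricks-connect configure. Using printf with one argument per line avoids the multi-line quoted string, which is awkward inside a RUN instruction. Keep in mind that values passed as Docker build arguments can show up in the image history, so a token may be better supplied another way.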
Or you can just generate the ~/.databricks-connect file, which is plain JSON:
{
  "host": "https://host",
  "cluster_id": "cluster",
  "org_id": "org_id",
  "port": "15001"
}
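Writing that file from a build step could look roughly like the sketch below. It mirrors only the keys shown in the listing above (which has no token field, so none is written here). The function name write_databricks_connect_config and the DATABRICKS_* variable names are hypothetical, and the optional path argument exists only so the function can be tried without touching the real home directory.

```shell
#!/bin/sh
# Write a JSON config with the same keys as the listing above.
# $1 is an optional target path; it defaults to ~/.databricks-connect.
# Variable names are illustrative assumptions. Values are inserted
# verbatim, so they must not contain double quotes or backslashes.
write_databricks_connect_config() {
    target="${1:-$HOME/.databricks-connect}"
    cat > "$target" <<EOF
{
  "host": "${DATABRICKS_HOST}",
  "cluster_id": "${DATABRICKS_CLUSTER_ID}",
  "org_id": "${DATABRICKS_ORG_ID}",
  "port": "${DATABRICKS_PORT:-15001}"
}
EOF
}
```

Calling write_databricks_connect_config with no argument after the pip install step would then take the place of the interactive databricks-connect configure call.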