Databricks cli - dbfs commands to copy files

Time:01-31

I'm working on the deployment of the Purview ADB Lineage Solution Accelerator. In step 3 of the "Install OpenLineage on Your Databricks Cluster" section, the author asks you to run the following in PowerShell to upload the init script and JAR to DBFS using the Databricks CLI:

dbfs mkdirs dbfs:/databricks/openlineage
dbfs cp --overwrite ./openlineage-spark-*.jar               dbfs:/databricks/openlineage/
dbfs cp --overwrite ./open-lineage-init-script.sh           dbfs:/databricks/openlineage/open-lineage-init-script.sh

Question: Do I correctly understand the above code as follows? If that is not the case, before running the code, I would like to know what exactly the code is doing.

  1. The first line creates a folder openlineage in the root directory of DBFS
  2. It's assumed that you are running the PowerShell commands from the directory where the .jar and open-lineage-init-script.sh files are located
  3. The second and third lines copy the .jar and .sh files from your local directory to dbfs:/databricks/openlineage/ in the Databricks DBFS

Answer:

  1. dbfs mkdirs is the equivalent of UNIX mkdir -p, i.e. under the DBFS root it creates a folder named databricks and, inside it, another folder named openlineage. It does not complain if these directories already exist.

  2. and 3. Yes. Files and directories not prefixed with dbfs:/ refer to your local filesystem. Note that you can copy from DBFS to local or vice versa, or between two DBFS locations, but not between two local paths.
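
To make the two points above concrete, here is a small local sketch. It does not call the Databricks CLI at all: path_kind is a hypothetical helper written only for illustration, and plain mkdir -p on a temporary directory stands in for dbfs mkdirs to show the "create parents, ignore existing" behaviour described in the answer.

```shell
#!/bin/sh
# Local illustration only: nothing here contacts a Databricks workspace.

# path_kind is a hypothetical helper (not part of the Databricks CLI).
# It mirrors the rule from the answer: a dbfs:/ prefix means DBFS,
# anything else is treated as a path on the local filesystem.
path_kind() {
    case "$1" in
        dbfs:/*) echo "dbfs" ;;
        *)       echo "local" ;;
    esac
}

path_kind "dbfs:/databricks/openlineage/"     # prints: dbfs
path_kind "./open-lineage-init-script.sh"     # prints: local

# mkdir -p creates missing parent directories and succeeds when they
# already exist, which is the behaviour the answer attributes to dbfs mkdirs.
demo_root=$(mktemp -d)
trap 'rm -rf "$demo_root"' EXIT               # remove the temp dir on exit

mkdir -p "$demo_root/databricks/openlineage"
mkdir -p "$demo_root/databricks/openlineage"  # second call: no error
mkdir_status=$?
echo "second mkdir -p exit status: $mkdir_status"
```

Read this way, both dbfs cp lines in the question have a local source and a DBFS destination, so they are uploads.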
