I'm trying to use a DataFrame that I created in notebook1 in my notebook2 in Databricks Community Edition with PySpark, and I tried this code: dbutils.notebook.run("notebook1", 60, {"dfnumber2"})
but it shows this error.
py4j.Py4JException: Method _run([class java.lang.String, class java.lang.Integer, class java.util.HashSet, null, class java.lang.String]) does not exist
Any help, please?
CodePudding user response:
The actual problem is that you pass the last parameter ({"dfnumber2"}) incorrectly - with this syntax it's a set, not a map type. You need to use the syntax {"table_name": "dfnumber2"} to represent it as a dict/map.
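For reference, the corrected call would then look like this (the parameter name table_name is illustrative - notebook1 would have to read a widget of the same name):

# The third argument is a dict mapping widget names to string values, not a set.
# "table_name" is an assumed widget name that notebook1 reads via dbutils.widgets.get.
dbutils.notebook.run("notebook1", 60, {"table_name": "dfnumber2"})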
But if you look into the documentation of dbutils.notebook.run, you will see the following phrase:
To implement notebook workflows, use the dbutils.notebook.* methods. Unlike %run, the dbutils.notebook.run() method starts a new job to run the notebook.
But jobs aren't supported on the Community Edition, so it won't work anyway.
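On the Community Edition, a practical alternative is %run, which executes the other notebook inline in the same driver context instead of starting a job, so variables defined there become visible to the caller. A minimal sketch, assuming notebook1 sits next to notebook2 and defines dfnumber2 (note that %run must be the only command in its cell):

# Cell 1 of notebook2: run notebook1 inline; it shares the Spark session and
# driver state, so a DataFrame defined there (e.g. dfnumber2) exists afterwards.
%run ./notebook1

# Cell 2 of notebook2: the DataFrame from notebook1 is now directly usable.
dfnumber2.show()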
CodePudding user response:
Create a global temp view and pass the table name as an argument to your next notebook.
dfnumber2.createOrReplaceGlobalTempView("dfnumber2")
dbutils.notebook.run("notebook1", 60, {"table_name": "dfnumber2"})
In your notebook1 you can do:

table_name = dbutils.widgets.get("table_name")
dfnumber2 = spark.sql("select * from global_temp." + table_name)
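Note that dbutils.widgets.get("table_name") only succeeds when the value is passed in (e.g. via dbutils.notebook.run) or the widget already exists, so for interactive testing it helps to define the widget with a default first. A sketch, assuming the same widget name as above:

# Define the widget with a default so the notebook also runs interactively;
# a value passed by dbutils.notebook.run overrides this default.
dbutils.widgets.text("table_name", "dfnumber2")
table_name = dbutils.widgets.get("table_name")
dfnumber2 = spark.sql("select * from global_temp." + table_name)

Also keep in mind that a global temp view is registered in the global_temp database and lives only for the lifetime of the cluster, so both notebooks must be attached to the same cluster.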