How to return data from R notebook task to Python task in databricks

Time: 07-06

I have a simple function in an R notebook (notebook A) that aggregates some data. I want to call notebook A from another notebook (notebook B) and interrogate the aggregated data from notebook A in notebook B.

So far I can run notebook A from notebook B no problem, but cannot see any returned data, variables or functions.

Code in notebook A:

function_to_aggregate_data <- function(x, y) {
  # ...some code...
}
aggregated_data <- function_to_aggregate_data(x, y)

Code in notebook B:

%python
dbutils.notebook.run("path/to/notebook_A", 60)

CodePudding user response:

When you use dbutils.notebook.run, the target notebook is executed as a separate job, so no variables, functions, etc. are shared between the caller and the called notebook. You can return a small result from the called notebook with dbutils.notebook.exit, but the returned string is limited in size (to 1024 bytes, as I remember). For larger data, register a temp view in the called notebook and query that view from the caller - here is an example of doing that (although using Python for both).
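For small results, the dbutils.notebook.exit route looks like this. A minimal sketch: the `summary` dict and its keys are illustrative, and the dbutils calls (shown in comments) only exist inside Databricks, so the snippet below just demonstrates the JSON round-trip the two notebooks would perform.

```python
import json

# Hypothetical aggregate computed in notebook A (names are illustrative).
summary = {"rows": 1000, "mean": 12.5}

# Notebook A would end with:
#   dbutils.notebook.exit(json.dumps(summary))
payload = json.dumps(summary)

# In notebook B, dbutils.notebook.run(...) returns that exit string,
# which the caller parses back into a dict:
result = json.loads(payload)
```

Keep in mind the size limit mentioned above: this only works for small summaries, not full datasets.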

Notebook A (the called notebook) registers a temp view:

def generate_data1(n=1000, name='my_cool_data'):
  df = spark.range(0, n)
  df.createOrReplaceTempView(name)

generate_data1()

Notebook B (the caller) runs notebook A and then reads the view:

dbutils.notebook.run("path/to/notebook_A", 60)
df = spark.sql("select * from my_cool_data")
assert(df.count() == 1000)

P.S. You can't directly share data between R & Python code - you can only exchange it via temp views, tables, files, etc.
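If notebook A must stay in R, the same temp-view approach works from SparkR. A sketch, assuming aggregated_data is a regular R data.frame and the view name my_r_data is illustrative:

```r
# In notebook A (R): publish the aggregate as a Spark temp view
library(SparkR)

sdf <- createDataFrame(aggregated_data)      # convert the R data.frame to a Spark DataFrame
createOrReplaceTempView(sdf, "my_r_data")    # make it visible via SQL
```

Notebook B can then call dbutils.notebook.run on notebook A and read the result with spark.sql("select * from my_r_data"), just as in the Python example above.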
