I have a simple function in an R notebook (notebook A) that aggregates some data. I want to call notebook A from another notebook (notebook B) and interrogate the aggregated data from notebook A in notebook B.
So far I can run notebook A from notebook B without a problem, but I cannot see any returned data, variables, or functions.
Code in notebook A:
function_to_aggregate_data <- function(x, y) {
  # ...some code...
}

aggregated_data <- function_to_aggregate_data(x, y)
Code in notebook B:
%python
dbutils.notebook.run("path/to/notebook_A", 60)
Answer:
When you use dbutils.notebook.run, that notebook is executed as a separate job, so no variables, functions, or dataframes are shared between the caller and the called notebook. You can return some data from the called notebook using dbutils.notebook.exit, but the returned value is limited to 1024 bytes (as I remember).
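For completeness, a minimal sketch of that exit-value route (the path is the one from the question; the JSON payload is just an illustration):

Last cell of the called notebook (notebook A):
%python
import json
# Return a small JSON payload to the caller; it must stay under the size limit
dbutils.notebook.exit(json.dumps({"row_count": 1000}))

Caller (notebook B):
%python
import json
# dbutils.notebook.run returns the string that was passed to dbutils.notebook.exit
result = json.loads(dbutils.notebook.run("path/to/notebook_A", 60))
print(result["row_count"])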
For anything larger, you can return data by registering a temp view in the called notebook and then querying that view from the caller - here is an example of doing that (although using Python for both notebooks).
Called notebook (Code1):
def generate_data1(n=1000, name='my_cool_data'):
    # Build a dataframe and expose it to other notebooks as a temp view
    df = spark.range(0, n)
    df.createOrReplaceTempView(name)

generate_data1()
Caller notebook:
default_timeout = 60  # timeout in seconds (not defined in the original snippet)
dbutils.notebook.run('./Code1', default_timeout)

# The temp view registered inside Code1 is visible to this caller
df = spark.sql("select * from my_cool_data")
assert df.count() == 1000
P.S. You can't directly share data between R and Python code; you can only exchange it via temp views, tables, etc.
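To tie this back to the R notebooks in the question, a minimal sketch using SparkR (preloaded in Databricks R notebooks); aggregated_data is the data.frame from the question, and the view name "aggregated_data" is just an illustrative choice:

Notebook A (R), after computing aggregated_data:
library(SparkR)
# Convert the local data.frame to a Spark DataFrame and publish it as a temp view
sdf <- createDataFrame(aggregated_data)
createOrReplaceTempView(sdf, "aggregated_data")

Notebook B (R), after the %python cell that runs notebook A:
library(SparkR)
# Read the view that notebook A registered, back into a local R data.frame
df <- sql("SELECT * FROM aggregated_data")
local_df <- collect(df)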