Creating a dataframe in Azure ML Notebook with R kernel

Time:09-20

I have written some scripts in R that I need to run in an Azure ML notebook, but I have not found much documentation on how to create a dataset by running code in a notebook with the R kernel. The following Python code works with the Python kernel:

from azureml.core import Dataset, Datastore, Workspace

subscription_id = 'abc'
resource_group = 'pqr'
workspace_name = 'xyz'

workspace = Workspace(subscription_id, resource_group, workspace_name)
datastore = Datastore.get(workspace, 'workspaceblobstore')

# create a tabular dataset from the parquet file on the datastore
tabular_dataset_3 = Dataset.Tabular.from_parquet_files(path=(datastore, '/UI/09-17-2022_125003_UTC/userdata1.parquet'))

# load the dataset into a pandas dataframe
df = tabular_dataset_3.to_pandas_dataframe()

This works fine with the Python kernel, but I want to run the equivalent R code in a notebook with the R kernel.

Can anyone tell me what the equivalent R code is? Any help would be appreciated.

CodePudding user response:

To use the dataset from an R script, first register it in the Azure Machine Learning portal. Once it is registered, get the dataset URL, open the notebook, and select the R kernel.
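The registration step can also be done from code rather than through the portal. Here is a minimal sketch in Python (the language of the question's working code), assuming the `register` method that azureml-core provides on a tabular dataset. The helper name `register_for_r` and the dataset name "userdata1" are hypothetical, chosen only for illustration.

```python
def register_for_r(dataset, workspace, name):
    """Register an azureml-core dataset under `name` and return the result.

    `dataset` is expected to be a TabularDataset such as `tabular_dataset_3`
    from the question; `name` is a label of your own choosing (hypothetical here).
    """
    if not name or not name.strip():
        raise ValueError("dataset name must be a non-empty string")
    # create_new_version=True registers a new version if the name already exists.
    return dataset.register(
        workspace=workspace,
        name=name.strip(),
        create_new_version=True,
    )


# Usage in the Python-kernel notebook from the question (not run here):
# registered = register_for_r(tabular_dataset_3, workspace, "userdata1")
```

Whatever name is used here is the name a later lookup by name would need to pass, rather than a file path.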

Upload the dataset and get the data source URL.

Go to Azure Machine Learning studio and create a new notebook.

Use the R script below to get the dataset and convert it to a data frame.

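azureml_main <- function(dataframe1, dataframe2){
  print("R script run.")
  # get_current_run() and the azureml handle are assumed to be provided by the environment
  run = get_current_run()
  # Take the workspace from the current run
  ws = run$experiment$workspace
  # get_by_name expects the name the dataset was registered under
  dataset = azureml$core$dataset$Dataset$get_by_name(ws, "./path/insurance.csv")
  # Load the dataset as a data frame
  dataframe2 <- dataset$to_pandas_dataframe()
  # Return datasets as a Named List
  return(list(dataset1=dataframe1, dataset2=dataframe2))
}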