Change Column Names using Dictionary (key value pair) in Databricks

Time:12-16

I am new to Databricks and Python. I just want to know the best way to change column names in Databricks. For example, if the column name is 'ID', I want to change it to 'Patient_ID', and 'Name' to 'Patient_Name'. I thought I would use a dictionary, but I don't know how to apply it to the column names. Please help, thanks in advance.

Note: the position of the column names can change, so I thought of using a dictionary:

Dictionary = { <ID>: <Patient_ID>, <Name>: <Patient_Name>, <Age>: <Patient_age>}

Example of what I am trying to achieve (picture attached).

I tried using a JSON file to do this but I ended up no wr

CodePudding user response:

Given the following dataset:

columns = ["ID", "Name", "Age", "Country"]
data = [(1, "John", "42", "Spain"), (2, "Jane", "24", "Norway"), (3, "Nohj", "38", "Iceland"), (4, "Fabrice", "65", "France")]
df = spark.createDataFrame(data, columns)
df.show()

+---+-------+---+-------+
| ID|   Name|Age|Country|
+---+-------+---+-------+
|  1|   John| 42|  Spain|
|  2|   Jane| 24| Norway|
|  3|   Nohj| 38|Iceland|
|  4|Fabrice| 65| France|
+---+-------+---+-------+

You could loop over your dictionary as follows:


dictionary = {"ID": "Patient_ID", "Name": "Patient_Name", "Age": "Patient_Age"}

# Rename by name, not position; keys that are not columns of df are ignored
for old_name, new_name in dictionary.items():
    df = df.withColumnRenamed(old_name, new_name)

df.show()

+----------+------------+-----------+-------+
|Patient_ID|Patient_Name|Patient_Age|Country|
+----------+------------+-----------+-------+
|         1|        John|         42|  Spain|
|         2|        Jane|         24| Norway|
|         3|        Nohj|         38|Iceland|
|         4|     Fabrice|         65| France|
+----------+------------+-----------+-------+
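
If you would rather not reassign df once per column, another option is to build the full list of new names from the dictionary first and apply it in one go. Below is a minimal sketch of that idea in plain Python. The helper name rename_columns is made up for illustration, and the columns list is only a stand-in for df.columns, shuffled on purpose to show that position does not matter.

```python
def rename_columns(columns, mapping):
    """Return the new column names, in the same order as `columns`.

    Names missing from `mapping` are kept unchanged, so the result does not
    depend on where each column sits in the DataFrame.
    """
    return [mapping.get(name, name) for name in columns]


# Same mapping as in the loop-based answer.
dictionary = {"ID": "Patient_ID", "Name": "Patient_Name", "Age": "Patient_Age"}

# Stand-in for df.columns (an assumption for this sketch), in a different order.
columns = ["Country", "Age", "ID", "Name"]

new_columns = rename_columns(columns, dictionary)
print(new_columns)

# With a live Spark DataFrame, the result can be applied in one step:
#   df = df.toDF(*rename_columns(df.columns, dictionary))
```

Columns that are not in the dictionary (Country here) keep their names. If your cluster runs Spark 3.4 or later, df.withColumnsRenamed(dictionary) should give the same result in a single call without any helper.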