Change Spark DataFrame column names

Time:05-15

Hi, I have a Spark DataFrame in the format below:

| id |  Name | Round_1_id |Round_1_name|Round_2_id|Round_2_name|
| ---| ------|------------|------------|----------|------------|
| 12 |  ABC  | 45         |BCD         | 34       | HRF        |

The table shows only two rounds, but there are 10 rounds in total.

I want to rename only the round columns, as shown below:

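| id |  Name | Round_1_identity |Round_1_Fullname|Round_2_identity|Round_2_Fullname|
| ---| ------|------------------|----------------|----------------|----------------|
| 12 |  ABC  | 45               |BCD             | 34             | HRF            |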

Only the column names that contain "Round" should be changed.

I am trying the code below, but it is not working:

rename_col={"id":"identity","name":"Fullname"}
for c in df.columns:
  if 'Round' in c:
    for key,value in rename_col.items():
      df1=df.replace(key,value)

Please help me with this; it would be very helpful.

CodePudding user response:

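df.replace substitutes values inside the DataFrame, not column names, which is why your loop leaves the columns unchanged. Instead, check each column name for "Round", take the part after the last underscore, look up its replacement in the dict, and rename the column with withColumnRenamed.

See the code below: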

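rename_col = {"id": "identity", "name": "Fullname"}

# "c" is used instead of "col" to avoid shadowing pyspark.sql.functions.col
for c in df.columns:
    if c.startswith("Round_"):
        # The text after the last underscore ("id" or "name") selects the replacement
        prefix, key = c.rsplit("_", 1)
        if key in rename_col:  # leave round columns with other suffixes unchanged
            df = df.withColumnRenamed(c, prefix + "_" + rename_col[key])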
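
As an alternative sketch, the new names can be computed as a plain Python list first and then applied in a single call. The helper name rename_round_columns is hypothetical (it is not part of PySpark), and the column list is copied from the question so the snippet runs without a Spark session. It assumes round columns always start with "Round_".

```python
def rename_round_columns(columns, mapping):
    """Return new column names, changing only the suffix of 'Round_*' columns."""
    new_names = []
    for name in columns:
        # Split on the last underscore: "Round_1_id" -> ("Round_1", "_", "id")
        prefix, sep, suffix = name.rpartition("_")
        if name.startswith("Round_") and suffix in mapping:
            new_names.append(prefix + sep + mapping[suffix])
        else:
            # Non-round columns and unknown suffixes are kept unchanged
            new_names.append(name)
    return new_names


rename_col = {"id": "identity", "name": "Fullname"}
columns = ["id", "Name", "Round_1_id", "Round_1_name", "Round_2_id", "Round_2_name"]

new_columns = rename_round_columns(columns, rename_col)
print(new_columns)

# With a real Spark DataFrame `df`, the same list can be applied in one call:
# df = df.toDF(*rename_round_columns(df.columns, rename_col))
```

Building the list first makes it easy to inspect the result before touching the DataFrame, and toDF renames every column at once instead of calling withColumnRenamed once per column.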