Hi, I have a Spark DataFrame in the format below:
| id | Name | Round_1_id | Round_1_name | Round_2_id | Round_2_name |
|----|------|------------|--------------|------------|--------------|
| 12 | ABC  | 45         | BCD          | 34         | HRF          |
There are not just two rounds; there are 10 rounds in total.
I want to rename only the round columns, as shown below:
id | Name | Round_1_identity | Round_1_Fullname | Round_2_identity | Round_2_Fullname |
---|---|---|---|---|---|
12 | ABC | 45 | BCD | 34 | HRF |
Only the columns whose names contain "Round" should be changed.
I tried the code below, but it does not work:
rename_col={"id":"identity","name":"Fullname"}
for c in df.columns:
    if 'Round' in c:
        for key,value in rename_col.items():
            df1=df.replace(key,value)
Please help me with this. Any help would be appreciated.
CodePudding user response:
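`df.replace` substitutes values inside the rows, not column names, so the loop in the question leaves the headers untouched. Instead, pick out the columns whose names contain "Round", take the last part of each name, look up its replacement in the dict to build the new name, and rename the column with `withColumnRenamed`.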
See the code below
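rename_col = {"id": "identity", "name": "Fullname"}
for c in df.columns:
    if "Round" in c:
        # Split off the last "_"-separated token, e.g. "Round_1_id" -> "Round_1", "id"
        prefix, _, key = c.rpartition("_")
        # Rename only when the suffix has a mapping; other columns are left as-is
        if key in rename_col:
            df = df.withColumnRenamed(c, f"{prefix}_{rename_col[key]}")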
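
With 10 rounds that loop calls `withColumnRenamed` 20 times. As an alternative, here is a minimal sketch that computes all the new names first and applies them in one call. `rename_round_columns` is a hypothetical helper written for this example, not part of PySpark, and it assumes the round columns all start with `Round_`. It works on a plain list of names, so it can be checked without a Spark session:

```python
# Hypothetical helper (not a PySpark function): builds the new column names.
def rename_round_columns(columns, rename_col):
    new_names = []
    for name in columns:
        # "Round_1_id" -> prefix "Round_1", suffix "id"
        prefix, _, suffix = name.rpartition("_")
        if name.startswith("Round_") and suffix in rename_col:
            new_names.append(f"{prefix}_{rename_col[suffix]}")
        else:
            # Non-round columns (and unmapped suffixes) keep their names
            new_names.append(name)
    return new_names

rename_col = {"id": "identity", "name": "Fullname"}
columns = ["id", "Name", "Round_1_id", "Round_1_name", "Round_2_id", "Round_2_name"]
new_columns = rename_round_columns(columns, rename_col)
print(new_columns)

# With a real Spark DataFrame `df`, apply all names at once:
# df = df.toDF(*rename_round_columns(df.columns, rename_col))
```

`toDF` expects one name per existing column, in order, which is why the helper returns unchanged names for `id` and `Name` as well.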