How can I rename the columns starting with abcd so that they start with wxyz instead?
List of columns: abcd_name, abcd_id, abcd_loc, empId, empCode
I need to change the names of the columns in a dataframe that start with abcd.
Required column list: wxyz_name, wxyz_id, wxyz_loc, empId, empCode
I tried getting the list of matching columns using the code below, but I'm not sure how to implement the renaming.
val df_cols_abcd = df.columns.filter(_.startsWith("abcd")).map(df(_))
CodePudding user response:
You can do that with foldLeft:
val oldPrefix = "abcd"
val newPrefix = "wxyz"
// Rename each matching column in turn, carrying the updated dataframe along as the accumulator
val newDf = df.columns
  .filter(_.startsWith(oldPrefix))
  .foldLeft(df)((acc, oldName) =>
    acc.withColumnRenamed(oldName, newPrefix + oldName.substring(oldPrefix.length))
  )
Your first idea, filtering the columns with startsWith, is correct. The only thing you are missing is the part where you rename those columns.
I recommend doing some research on foldLeft if you're not familiar with it. The idea is the following:
- I start with an initial dataframe (df, in the first pair of parentheses).
- I apply a function to it once for each column I need to rename (the function is the one in the second pair of parentheses). This function takes two arguments: an accumulator (acc), which is an intermediate dataframe (because the columns are renamed one at a time), and the current element of the list (here the list contains the names of the columns that need to be renamed).
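To see how the accumulator evolves, here is a minimal sketch of the same pattern in plain Scala, with no Spark dependency. It assumes a Vector[String] of column names standing in for the dataframe; renameColumn is a hypothetical helper that plays the role of withColumnRenamed, and the object and method names are made up for illustration.

```scala
// Plain-Scala sketch of the foldLeft pattern. A Vector[String] stands in for
// the dataframe (an assumption for illustration only).
object FoldLeftRenameSketch {
  val oldPrefix = "abcd"
  val newPrefix = "wxyz"

  // Hypothetical stand-in for withColumnRenamed: replace one name, keep the rest.
  def renameColumn(schema: Vector[String], oldName: String, newName: String): Vector[String] =
    schema.map(c => if (c == oldName) newName else c)

  def renamePrefixed(columns: Vector[String]): Vector[String] =
    columns
      .filter(_.startsWith(oldPrefix))            // the names that need renaming
      .foldLeft(columns) { (acc, oldName) =>      // acc starts as the full column list
        val next = renameColumn(acc, oldName, newPrefix + oldName.substring(oldPrefix.length))
        println(next.mkString(", "))              // show the intermediate state after each rename
        next                                      // becomes acc for the next column
      }

  def main(args: Array[String]): Unit = {
    val renamed = renamePrefixed(Vector("abcd_name", "abcd_id", "abcd_loc", "empId", "empCode"))
    assert(renamed == Vector("wxyz_name", "wxyz_id", "wxyz_loc", "empId", "empCode"))
  }
}
```

Each printed line has one more column renamed than the previous one, which is what happens to the intermediate dataframe in the Spark version.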