I have a couple of dataframes and I want all of their column names to be in uppercase. I did this as follows:
for col in df1.columns:
    df1 = df1.withColumnRenamed(col, col.upper())
for col in df2.columns:
    df2 = df2.withColumnRenamed(col, col.upper())
Now I want to do this by iterating over a collection, like this:
list = (df1, df2, df3)
for x in list:
    for col in x.columns:
        x = x.withColumnRenamed(col, col.upper())
But somehow this does not work. No error is displayed, yet the columns stay in lowercase. I also tried adding a "return x" at the end, but that doesn't work either. Can someone help me?
CodePudding user response:
The changes are not reflected in the original variables df1, df2, and df3. Each call to withColumnRenamed returns a new dataframe, and the loop assigns it only to the loop variable x.
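To see why the original names are left untouched, here is a minimal sketch that needs no Spark session. `FakeFrame` is a hypothetical stand-in written for this illustration, not a PySpark class; it only mimics the fact that `withColumnRenamed` returns a new object instead of modifying the existing one.

```python
class FakeFrame:
    """Hypothetical stand-in for a dataframe (illustration only)."""

    def __init__(self, columns):
        self.columns = list(columns)

    def withColumnRenamed(self, old, new):
        # Return a new object and leave this one unchanged.
        return FakeFrame([new if c == old else c for c in self.columns])


df1 = FakeFrame(["id", "name"])
frames = (df1,)

for x in frames:
    for col in x.columns:
        # This rebinds the loop variable x only; the name df1 still
        # points at the original object.
        x = x.withColumnRenamed(col, col.upper())

print(x.columns)    # the renamed copy
print(df1.columns)  # the original, still lowercase
```

The loop does its work correctly, but the result lives only in `x`, so it has to be assigned back to `df1` in some way.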
You could use the globals() function to achieve this. Code below:
a = ['df1', 'df2', 'df3']
for x in a:
    for col in globals()[x].columns:
        globals()[x] = globals()[x].withColumnRenamed(col, col.upper())
You might have to use either globals() or locals() depending on your use case.
globals() and locals() both let you look up a variable by its name as a string, and both return a dictionary that maps variable names to values. You can read more about them in the Python documentation.
EDIT: Also, list is the name of a built-in type in Python. Using it as a variable name shadows the built-in, so you should change the variable name to something else.
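If you would rather not look variables up by name, one possible alternative is to put the renaming in a helper function and assign the results back explicitly. This is only a sketch: `upper_columns` and `StubFrame` are hypothetical names introduced here, and the stub stands in for a real dataframe so the snippet runs without Spark. The helper relies only on `columns` and `withColumnRenamed`, so it is assumed to behave the same way on a real PySpark dataframe.

```python
class StubFrame:
    """Hypothetical stand-in exposing only columns and withColumnRenamed."""

    def __init__(self, columns):
        self.columns = list(columns)

    def withColumnRenamed(self, old, new):
        return StubFrame([new if c == old else c for c in self.columns])


def upper_columns(df):
    """Return a copy of df with every column name uppercased."""
    for col in df.columns:
        df = df.withColumnRenamed(col, col.upper())
    return df


df1 = StubFrame(["id", "name"])
df2 = StubFrame(["city", "zip"])

# Rebind the original names explicitly through tuple unpacking.
df1, df2 = (upper_columns(d) for d in (df1, df2))

print(df1.columns, df2.columns)
```

Keeping the dataframes in a dict keyed by name and replacing each value would work in a similar way.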
CodePudding user response:
Okay, pri's answer worked for me once I added a global statement to it:
global df1, df2, df3
a = ['df1', 'df2', 'df3']
for x in a:
    for col in globals()[x].columns:
        globals()[x] = globals()[x].withColumnRenamed(col, col.upper())