Pandas: Renaming headers with identical names

Time:07-06

I have several columns with the same name (or, to be specific, with no name at all) in a dataframe. I need to rename them individually, but the df.rename method renames them all at once. For example, in the following df:

#     nan    nan    a    nan    nan    b    nan    nan
#     1       2     3     4      5     6     7      8
#     9      10     11    12    13     14    15     16

The following code changes all 'nan' headers to 'word':

df = df.rename(columns={df.columns[1]:'word'})

#   word    word    a    word   word    b   word   word
#     1       2     3     4      5     6     7      8
#     9      10     11    12    13     14    15     16
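The reason every column changes is that df.columns[1] is only used to look up a label, and rename then matches by label rather than by position. A minimal sketch of this, assuming the unnamed headers are the literal string 'nan' (the toy frame and the variable names mapping and renamed are made up for illustration):

```python
import pandas as pd

# Toy frame mirroring the example; the string 'nan' stands in for the unnamed headers.
df = pd.DataFrame(
    [[1, 2, 3, 4, 5, 6, 7, 8], [9, 10, 11, 12, 13, 14, 15, 16]],
    columns=["nan", "nan", "a", "nan", "nan", "b", "nan", "nan"],
)

# df.columns[1] evaluates to the label 'nan', so the mapping is {'nan': 'word'}.
mapping = {df.columns[1]: "word"}

# rename() applies the mapping to every label, so all six 'nan' columns change.
renamed = df.rename(columns=mapping)
print(list(renamed.columns))
```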

How do I make it so that I can change the header names separately? The ultimate goal for this example is to make the header look like the following:

#    nan     nan    a     a      a     b     b      b
#     1       2     3     4      5     6     7      8
#     9      10     11    12    13     14    15     16

Update: manual assignment of header values won't work in this case because this is a simplified version of my problem. I wouldn't know what the header values are and how many of them there are.

The following for loop was supposed to apply the renaming method:

word = 'nan'
for i in range (0, len(list(df))-1):
    if str(list(df)[i]) != 'nan':
        word = str(list(df)[i])
    df.rename(columns={df.columns[i]:word})
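Two things appear to keep this loop from working: the result of df.rename is never assigned back, and rename matches labels rather than positions, so it cannot target a single 'nan' column. One possible repair, sketched under the assumption that the unnamed headers are the string 'nan', is to build a new list of names and assign it in one step (new_names is a hypothetical variable name):

```python
import pandas as pd

# Toy frame mirroring the example; 'nan' stands in for the unnamed headers.
df = pd.DataFrame(
    [[1, 2, 3, 4, 5, 6, 7, 8], [9, 10, 11, 12, 13, 14, 15, 16]],
    columns=["nan", "nan", "a", "nan", "nan", "b", "nan", "nan"],
)

new_names = []
word = "nan"  # name carried forward from the last named column seen so far
for name in df.columns:
    if str(name) != "nan":
        word = str(name)  # a real header: remember it for the columns that follow
    new_names.append(word)

# Assigning the full list replaces the headers by position, not by label.
df.columns = new_names
print(list(df.columns))
```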

CodePudding user response:

You can simply reassign the column names directly:

df.columns = ["nan", "nan", "a", "a", "a", "b", "b", "b"]

CodePudding user response:

# blank out the 'nan' placeholders, then forward-fill each gap with the last real name
df.columns = pd.Series(df.columns).mask(lambda x: x == 'nan').ffill()
df

   NaN  NaN   a   a   a   b   b   b
0    1    2   3   4   5   6   7   8
1    9   10  11  12  13  14  15  16
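Note that the two leading columns end up as real NaN values, because there is nothing before them to fill from. If the original 'nan' placeholder should be kept there, as in the question's target header, a fillna step can be added at the end. A self-contained sketch, assuming the headers are the string 'nan' (names and filled are illustrative variable names):

```python
import pandas as pd

# Toy frame mirroring the example; 'nan' stands in for the unnamed headers.
df = pd.DataFrame(
    [[1, 2, 3, 4, 5, 6, 7, 8], [9, 10, 11, 12, 13, 14, 15, 16]],
    columns=["nan", "nan", "a", "nan", "nan", "b", "nan", "nan"],
)

names = pd.Series(df.columns)

# mask() turns the placeholders into missing values, ffill() copies the last
# real name forward, and fillna() restores 'nan' where nothing preceded.
filled = names.mask(names == "nan").ffill().fillna("nan")

df.columns = filled.to_list()
print(list(df.columns))
```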

CodePudding user response:

Change the n'th column of a pandas dataframe conditionally

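import pandas as pd

df = pd.DataFrame({"A": [1, 2, 3], "nan": [4, 5, 6], "bar": [7, 8, 9]})
print(df)

somecondition = True  # placeholder for whatever test decides the rename

thecolumns = list(df.columns)  # copy the labels into a mutable list
if somecondition:
    thecolumns[1] = "newname"  # change only the column at position 1
df.columns = thecolumns  # assign the whole list back
print(df)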

Which prints:

   A  nan  bar
0  1    4    7
1  2    5    8
2  3    6    9

   A  newname  bar
0  1        4    7
1  2        5    8
2  3        6    9
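The same idea extends to frames with duplicate headers, since list positions stay unique even when labels do not. A small sketch wrapping it in a helper (rename_by_position is a hypothetical name, not a pandas function):

```python
import pandas as pd


def rename_by_position(df, new_names):
    """Return a copy of df with columns renamed by integer position.

    new_names maps a column position (int) to its new label.
    """
    columns = list(df.columns)
    for position, name in new_names.items():
        columns[position] = name  # only this one position changes
    out = df.copy()
    out.columns = columns
    return out


# Two columns share the label 'nan'; only the second one is renamed.
df = pd.DataFrame([[1, 2, 3], [4, 5, 6]], columns=["nan", "nan", "a"])
result = rename_by_position(df, {1: "word"})
print(list(result.columns))
```

Returning a copy leaves the original frame untouched, which makes the helper safe to use in a chain of operations.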