Home > Back-end >  Pandas set column names by position
Pandas set column names by position

Time:07-27

I have the following code:

df1 = pd.read_excel(f, sheet_name=0, header=6)
# Drop Columns by position
df1 = df1.drop([df1.columns[5],df1.columns[8],df1.columns[10],df1.columns[14],df1.columns[15],df1.columns[16],df1.columns[17],df1.columns[18],df1.columns[19],df1.columns[21],df1.columns[22],df1.columns[23],df1.columns[24],df1.columns[25]], axis=1)

# rename cols

This is where I am struggling, as each time I attempt to rename the cols by position it returns "None" which is a <class 'NoneType'> ( when I use print(type(df1)) ). Note that df1 returns the dataframe as expected after dropping the columns

I get this with everything I have tried below:

column_indices = [0,1,2,3,4,5,6,7,8,9,10,11]
new_names = ['AWG Item Code','Description','UPC','PK','Size','Regular Case Cost','Unit Scan','AMAP','Case Bill Back','Monday Start Date','Sunday End Date','Net Unit']
old_names = df1.columns[column_indices]
df1 = df1.rename(columns=dict(zip(old_names, new_names)), inplace=True)

And with:

df1 = df1.rename({df1.columns[0]:"AWG Item Code",df1.columns[1]:"Description",df1.columns[2]:"UPC",df1.columns[3]:"PK",df1.columns[4]:"Size",df1.columns[5]:"Regular Case Cost",df1.columns[6]:"Unit Scan",df1.columns[7]:"AMAP",df1.columns[8]:"Case Bill Back",df1.columns[9]:"Monday Start Date",df1.columns[10]:"Sunday End Date",df1.columns[11]:"Net Unit"}, inplace = True)

When I remove the inplace=True essentially setting it to false, it returns the dataframe but without any of the changes I am wanting.

The tricky part is that in this program my column headers will change each time, but the columns the data is in will not. Otherwise I would just use df = df.rename(columns=["a":"newname"])

Image of my Dataframe in Python

Image of Dataframe in Excel

CodePudding user response:

One simpler version of your code could be :

df1.columns = new_names

It should work as intended, i.e. renaming columns in the index order.

Otherwise, in your own code : if you print df1.columns[column_indices]

You do not get a list but a pandas.core.indexes.base.Index

So to correct your code you just need to change the 2 last lines by :

old_names = df1.columns[column_indices].tolist()
df1.rename(columns=dict(zip(old_names, new_names)), inplace=True)

Have a nice day

CodePudding user response:

I was dumb and missing columns=

df1.rename(columns={df1.columns[0]:"AWG Item Code",df1.columns[1]:"Description",df1.columns[2]:"UPC",df1.columns[3]:"PK",df1.columns[4]:"Size",df1.columns[5]:"Regular Case Cost",df1.columns[6]:"Unit Scan",df1.columns[7]:"AMAP",df1.columns[8]:"Case Bill Back",df1.columns[9]:"Monday Start Date",df1.columns[10]:"Sunday End Date",df1.columns[11]:"Net Unit"}, inplace = True)

works fine

CodePudding user response:

I am not sure whether this answers your question:

There is a simple way to rename the columns:

If I have a data frame: say df1. I can see the columns name using the following code:

df.columns.to_list()

which gives me suppose following columns name:

['A', 'B', 'C','D']

And I want to keep the first three columns and rename them as 'E', 'F' and 'G' respectively. The following code gives me the desired outcome:

df = df[['A','B','C']]
df.columns = ['E','F','G]

new outcome:

df.columns.to_list()
output: ['E','F','G']
  • Related