The values of my 1st column are going into the index but the column name is the 1st column outside t

Time:10-27

The values of my first column are going into the index, but the column names start at the first column outside the index, so I cannot simply use df.reset_index. For instance, my dataframe looks like this:

  CHA_NUMB  CHA_NAME UN_CHA_ID
1    m_3_1     12345      lcha
2    t_1_2     12456      lcha
3     blah     90244      lcha
4     blah     23435      lcha

When it should look like this:

   CHA_NUMB CHA_NAME  UN_CHA_ID
0         1    m_3_1      12345
1         2    t_1_2      12456
2         3     blah      90244
3         4     blah      23435

I tried resetting the index but it didn't work. Resetting the index makes the dataframe look like this:

   index CHA_NUMB  CHA_NAME UN_CHA_ID
0      0    m_3_1     12345      lcha
1      1    t_1_2     12456      lcha
2      2     blah     90244      lcha
3      3     blah     23435      lcha

CodePudding user response:

First use DataFrame.reset_index to move the index into a regular column, then drop the last column by positional indexing with DataFrame.iloc, and finally restore the original column names with DataFrame.set_axis:

# move the index into a column, drop the last column, reuse the original names
df = df.reset_index().iloc[:, :-1].set_axis(df.columns, axis=1)
print(df)
   CHA_NUMB CHA_NAME  UN_CHA_ID
0         1    m_3_1      12345
1         2    t_1_2      12456
2         3     blah      90244
3         4     blah      23435

Alternative:

cols = df.columns
df = df.reset_index().iloc[:, :-1]
df.columns = cols
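
To make the steps above easy to try, here is a minimal self-contained sketch. The frame is built by hand to mimic the shifted layout from the question; how the real frame was created is not shown there, so the constructor call is only an assumption, and the name `fixed` is used purely for illustration.

```python
import pandas as pd

# Hand-built stand-in for the shifted frame in the question (assumption:
# the real frame came from somewhere else, e.g. a file).
df = pd.DataFrame(
    {
        "CHA_NUMB": ["m_3_1", "t_1_2", "blah", "blah"],
        "CHA_NAME": [12345, 12456, 90244, 23435],
        "UN_CHA_ID": ["lcha"] * 4,
    },
    index=[1, 2, 3, 4],
)

# 1. reset_index() turns the index into a leading column called "index".
# 2. iloc[:, :-1] drops the last column (the unwanted "lcha" values).
# 3. set_axis() puts the original three names back, now shifted into place.
fixed = df.reset_index().iloc[:, :-1].set_axis(df.columns, axis=1)
print(fixed)
```

The original names can be reused only because exactly one column is added on the left and one is dropped on the right, so the column count stays the same.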

EDIT: If the first row of column names does not match the data, you can skip it with header=None and skiprows=1 (which would otherwise give default integer column names), select the first and third columns with usecols, and set the column names with the names parameter:

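import pandas as pd

# file is the path to the CSV file
df = pd.read_csv(file,
                 header=None,       # do not take column names from the file
                 skiprows=1,        # skip the mismatched header line
                 usecols=[0, 2],    # keep only the first and third columns
                 names=['CHA_NUMB', 'UN_CHA_ID'])

print(df)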
   CHA_NUMB  UN_CHA_ID
0         1      12345
1         2      12456
2         3      90244
3         4      23435
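
The read_csv variant can be tried without a file on disk. The sketch below assumes the file has a three-name header followed by four-field rows; the question does not show the file, so `csv_text` is a made-up sample and `shifted` and `fixed` are illustrative names.

```python
import io

import pandas as pd

# Made-up sample: three header names, four fields per data row (assumption).
csv_text = """CHA_NUMB,CHA_NAME,UN_CHA_ID
1,m_3_1,12345,lcha
2,t_1_2,12456,lcha
3,blah,90244,lcha
4,blah,23435,lcha
"""

# With default settings pandas treats the extra leading field as the index,
# which gives the shifted layout described in the question.
shifted = pd.read_csv(io.StringIO(csv_text))
print(shifted)

# Skip the header line and name the selected columns explicitly instead.
fixed = pd.read_csv(
    io.StringIO(csv_text),
    header=None,
    skiprows=1,
    usecols=[0, 2],
    names=["CHA_NUMB", "UN_CHA_ID"],
)
print(fixed)
```

Reading the file this way avoids the shifted frame altogether, at the cost of hard-coding the column positions and names.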