Home > Mobile >  How to stop changing pandas from changing the column name into date format?
How to stop changing pandas from changing the column name into date format?

Time:10-16

I am trying to read a excel file with header with month-Year format. It changes it to date-time. How can I stop it from reading it as date time and force it to read as string instead.

test.xlsx

Data looks like this in excel file

When I read it as

df = pd.read_excel('test.xlsx',engine='openpyxl')

This is what it reads as:

This is what I see

I would like it to read as:

enter image description here

Thank you in advance.

CodePudding user response:

Maybe try reading column into string object by specifying the dtype parameter as covered in documentation (linked below).

https://pandas.pydata.org/docs/reference/api/pandas.read_excel.html#:~:text=dtypeType name or dict of column -> type, default None

If you also set the header=None option you can then set the first row to be a header after the fact and change the types according to your needs.

CodePudding user response:

This might be bit of a workaround and specific to your case. But, I tried it and it works. Steps are:

  1. Read the dataframe and remove the UserID column to a temp series
  2. Transpose the datetime only dataframe, which will bring the columns as index
  3. Use strftime() to convert this to string with format you need
  4. Transpose again and add UserID column to the start
uid = df['UserID'] ## Read User ID to a temp variable
df.drop(columns='UserID', axis=1, inplace=True) ##Drop User ID, leaving only datetime colums
df= df.T ## Transpose, will bring datetime as index
df.index = pd.to_datetime(df.index).strftime('%b-%y') ##Conver index to string
df=df.T  ## Transpose back
df.insert(loc=0, column='UserID', value=uid)  ## Add UserID to loc=0
print(df)

 UserID Jan-21  Feb-21  Mar-21  Apr-21
0   1   2   3   4   5
1   2   3   4   5   6
2   3   4   5   6   7
3   4   5   6   7   8
  • Related