I am trying to read a excel file with header with month-Year format. It changes it to date-time. How can I stop it from reading it as date time and force it to read as string instead.
test.xlsx
When I read it as
df = pd.read_excel('test.xlsx',engine='openpyxl')
This is what it reads as:
I would like it to read as:
Thank you in advance.
CodePudding user response:
Maybe try reading column into string object by specifying the dtype
parameter as covered in documentation (linked below).
If you also set the header=None
option you can then set the first row to be a header after the fact and change the types according to your needs.
CodePudding user response:
This might be bit of a workaround and specific to your case. But, I tried it and it works. Steps are:
- Read the dataframe and remove the UserID column to a temp series
- Transpose the datetime only dataframe, which will bring the columns as index
- Use
strftime()
to convert this to string with format you need - Transpose again and add UserID column to the start
uid = df['UserID'] ## Read User ID to a temp variable
df.drop(columns='UserID', axis=1, inplace=True) ##Drop User ID, leaving only datetime colums
df= df.T ## Transpose, will bring datetime as index
df.index = pd.to_datetime(df.index).strftime('%b-%y') ##Conver index to string
df=df.T ## Transpose back
df.insert(loc=0, column='UserID', value=uid) ## Add UserID to loc=0
print(df)
UserID Jan-21 Feb-21 Mar-21 Apr-21
0 1 2 3 4 5
1 2 3 4 5 6
2 3 4 5 6 7
3 4 5 6 7 8