I want to convert several columns to their appropriate types. I understand that I can do this with astype(), but converting a datetime column is less straightforward. I know I can use pd.to_datetime to get the year, month and day parsed correctly. I am just wondering whether there's a way to do it within astype(), or am I pushing astype() too far in this case? I just wanted to do everything in one go.
import pandas as pd
player_list = [
    ['200712', 'NY', 50000], ['200714', 'NY', 51000], ['200716', 'NY', 51500],
    ['200719', 'NY', 53000], ['200721', 'CA', 54000],
    ['200724', 'CA', 55000], ['200729', 'CA', 57000]
]
df = pd.DataFrame(player_list, columns=['Dates', 'Venue', 'Patients'])
df = df.astype({
    "Dates": "datetime64[ns]",
    "Venue": "category",
    "Patients": "object"
})
df['Dates'] = pd.to_datetime(df['Dates'], format='%y%m%d')
Expected output:
Dates Venue Patients
0 2020-07-12 NY 50000
1 2020-07-14 NY 51000
2 2020-07-16 NY 51500
3 2020-07-19 NY 53000
4 2020-07-21 CA 54000
5 2020-07-24 CA 55000
6 2020-07-29 CA 57000
CodePudding user response:
You can't use astype() for this. astype('datetime64[ns]') uses to_datetime under the hood, and you can't specify format='...' there.
Instead, you can chain DataFrame.assign() onto the astype call to keep everything in one statement:
df = df.astype({
    "Venue": "category",
    "Patients": "object"
}).assign(Dates=pd.to_datetime(df['Dates'], format='%y%m%d'))
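For reference, here is a minimal self-contained sketch of the chained approach, using a shortened copy of the sample data from the question. Passing a lambda to assign() is an optional variation, not something the original requires: it makes the conversion read from the frame produced by astype rather than from the outer df variable, which may be convenient in longer method chains.

```python
import pandas as pd

# Shortened sample data from the question (dates are YYMMDD strings).
player_list = [
    ['200712', 'NY', 50000],
    ['200714', 'NY', 51000],
    ['200721', 'CA', 54000],
]
df = pd.DataFrame(player_list, columns=['Dates', 'Venue', 'Patients'])

# Convert the simple columns with astype, then parse the dates with an
# explicit format via assign. The lambda receives the frame returned by
# astype, so no outer variable is referenced inside the chain.
df = (
    df.astype({"Venue": "category", "Patients": "object"})
      .assign(Dates=lambda d: pd.to_datetime(d['Dates'], format='%y%m%d'))
)

print(df.dtypes)
print(df)
```

Because Dates already exists in the frame, assign() overwrites it in place, so the column order stays the same.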