Use `astype()` to fix year, month, day in Python

Time:12-13

I want to convert several columns to their appropriate types. I understand that I can use astype() for this, but converting the datetime column is less straightforward. I know I can use pd.to_datetime to parse the year, month and day correctly. I am just wondering whether there's a way to do it within astype(), or am I pushing astype() too far in this case? I'd like to do all the conversions in one go.

import pandas as pd
player_list = [
            ['200712','NY', 50000],['200714','NY', 51000],['200716', 'NY', 51500],
            ['200719','NY', 53000],['200721','CA', 54000],
            ['200724','CA', 55000],['200729','CA', 57000]
]

df = pd.DataFrame(player_list, columns=['Dates', 'Venue', 'Patients'])

df = df.astype({
    "Dates": "datetime64[ns]",
    "Venue": "category", 
    "Patients": "object"
})

df['Dates'] = pd.to_datetime(df['Dates'], format='%y%m%d')

Expected output:

       Dates Venue  Patients
0 2020-07-12    NY     50000
1 2020-07-14    NY     51000
2 2020-07-16    NY     51500
3 2020-07-19    NY     53000
4 2020-07-21    CA     54000
5 2020-07-24    CA     55000
6 2020-07-29    CA     57000

Answer:

You can't do this with astype() alone. astype('datetime64[ns]') uses to_datetime-style parsing under the hood, and astype() gives you no way to pass format='...', so it cannot be told that the strings are in %y%m%d form.

Instead, you can chain DataFrame.assign() onto the astype() call so that everything stays in a single statement:

df = df.astype({
    "Venue": "category", 
    "Patients": "object"
}).assign(Dates=pd.to_datetime(df['Dates'], format='%y%m%d'))
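
For completeness, here is a self-contained sketch of the same idea that can be run as-is. It reuses the sample data from the question; the name `converted` and the lambda form of assign() are illustrative choices, not part of the original answer. Passing a callable to assign() means the chain works on the frame it is called on rather than referring back to the outer df.

```python
import pandas as pd

# Sample data from the question: dates are strings in YYMMDD form.
player_list = [
    ['200712', 'NY', 50000], ['200714', 'NY', 51000], ['200716', 'NY', 51500],
    ['200719', 'NY', 53000], ['200721', 'CA', 54000],
    ['200724', 'CA', 55000], ['200729', 'CA', 57000],
]
df = pd.DataFrame(player_list, columns=['Dates', 'Venue', 'Patients'])

# astype() handles the columns that need no parsing hints;
# assign() then replaces 'Dates' using an explicit format string.
# The lambda receives the frame produced by astype().
converted = df.astype({'Venue': 'category', 'Patients': 'object'}).assign(
    Dates=lambda d: pd.to_datetime(d['Dates'], format='%y%m%d')
)

print(converted.dtypes)
print(converted)
```

Because 'Dates' already exists, assign() overwrites it in place, so the column order stays Dates, Venue, Patients.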