I'm working with a DataFrame that stores dates as integers, and I want to convert them to datetime so I can keep manipulating them.
The DataFrame has a column like the one below:
DATE
18010101
18010101
18010101
18010101
18010101
... ... ... ... ... ...
20123124
20123124
20123124
20123124
20123124
The digits in this dataset are ordered as (year, month, day, hour).
I've already tried something like this:
df["Year"] = df.DATE[0:2]
df["Month"] = df.DATE[2:4]
but that turns the values into floats. For the first row, for example, the result looks like this:
DATE Year
18010101 18010101.0
whereas it was supposed to be
Year = 2018 or 18
I would really appreciate your help.
CodePudding user response:
Use Series.str.extract with a regex:
import pandas as pd

df = pd.DataFrame([[18010101], [18010101], [18010101], [18010101], [18010101],
                   [20123124], [20123124], [20123124], [20123124], [20123124]],
                  columns=['DATE'])
# Cast to string, then capture four two-digit groups: year, month, day, hour
df[['year', 'month', 'day', 'hour']] = df['DATE'].astype(str).str.extract(r'(\d{2})(\d{2})(\d{2})(\d{2})')
DATE | year | month | day | hour |
---|---|---|---|---|
18010101 | 18 | 01 | 01 | 01 |
18010101 | 18 | 01 | 01 | 01 |
18010101 | 18 | 01 | 01 | 01 |
18010101 | 18 | 01 | 01 | 01 |
18010101 | 18 | 01 | 01 | 01 |
20123124 | 20 | 12 | 31 | 24 |
20123124 | 20 | 12 | 31 | 24 |
20123124 | 20 | 12 | 31 | 24 |
20123124 | 20 | 12 | 31 | 24 |
20123124 | 20 | 12 | 31 | 24 |
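The extracted columns are still strings, and the question asks for a datetime. One way to get there is sketched below. It rests on assumptions that the question does not confirm: the two-digit year belongs to the 2000s, and an hour of 24 means the end of that day (midnight of the next day). The column name `timestamp` is made up for illustration. Because `%H` only accepts 00 to 23, the hour is added as an offset instead of being parsed.

```python
import pandas as pd

df = pd.DataFrame({"DATE": [18010101, 18010101, 20123124, 20123124]})

# Pad to 8 characters in case a leading zero was lost when the value
# was stored as an integer (assumption: the layout is always YYMMDDHH).
s = df["DATE"].astype(str).str.zfill(8)

# Parse only the YYMMDD part; %y maps 18 to 2018 and 20 to 2020.
day = pd.to_datetime(s.str[:6], format="%y%m%d")

# Add the hour as a time offset, so that 24 rolls over to the next day
# (assumption: hour 24 marks the end of the day).
df["timestamp"] = day + pd.to_timedelta(s.str[6:].astype(int), unit="h")
```

With this sample, 18010101 should become 2018-01-01 01:00:00 and 20123124 should become 2021-01-01 00:00:00. If hour 24 means something else in your data, adjust the offset accordingly.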
CodePudding user response:
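Earlier, you tried to slice an int column as if it were a string. df.DATE[0:2] selects the first two rows of the Series, not the first two characters of each value.
Instead, first convert the column to strings with .astype(str) and then slice each value through the .str accessor. Check the lines below.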
date_str = df.DATE.astype(str)
df["Year"] = date_str.str[0:2]
df["Month"] = date_str.str[2:4]
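The slices above produce strings such as "18" and "01". If numeric columns are more convenient, a possible extension is sketched below. It assumes every value encodes exactly YYMMDDHH and that the years belong to the 2000s; the Day and Hour columns are added here for illustration and are not part of the answer above.

```python
import pandas as pd

df = pd.DataFrame({"DATE": [18010101, 18010102, 20123124]})

# zfill(8) restores a leading zero that an integer cannot store
# (assumption: every value encodes exactly YYMMDDHH).
date_str = df["DATE"].astype(str).str.zfill(8)

# Slice each string through the .str accessor, then cast back to int.
# Assumption: two-digit years belong to the 2000s.
df["Year"] = 2000 + date_str.str[0:2].astype(int)
df["Month"] = date_str.str[2:4].astype(int)
df["Day"] = date_str.str[4:6].astype(int)
df["Hour"] = date_str.str[6:8].astype(int)
```

Drop the `2000 +` if you prefer to keep the year as 18 or 20.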