I am new to Python/pandas. I have a data frame that looks like this
x = pd.DataFrame([20210901,20210902, 20210903, 20210904])
[out]:
0
0 20210901
1 20210902
2 20210903
3 20210904
I want to separate each row as follows: For example
year = 2021
month = 9
day = 1
or I have a list for each row like this:
[2021,9,1]
CodePudding user response:
You can use pd.to_datetime
to convert the entire column to datetime
type.
>>> import pandas as pd
>>>
>>> df = pd.DataFrame({'Col': [20210917,20210918, 20210919, 20210920]})
>>>
>>> df.Col = pd.to_datetime(df.Col, format='%Y%m%d')
>>> df
Col
0 2021-09-17
1 2021-09-18
2 2021-09-19
3 2021-09-20
>>> df['Year'] = df.Col.dt.year
>>> df['Month'] = df.Col.dt.month
>>> df['Day'] = df.Col.dt.day
>>>
>>> df
Col Year Month Day
0 2021-09-17 2021 9 17
1 2021-09-18 2021 9 18
2 2021-09-19 2021 9 19
3 2021-09-20 2021 9 20
If you want the result as list, you can use list comprehension along with zip
function.
>>> [(year, month, day) for year, month, day in zip(df.Year, df.Month, df.Day)]
[(2021, 9, 17), (2021, 9, 18), (2021, 9, 19), (2021, 9, 20)]
CodePudding user response:
Separate each row as follows three new columns: year, month, day
import pandas as pd
df = pd.DataFrame({"date":[20210901,20210902, 20210903, 20210904]})
# split date field to three fields: year month day
df["year"] = df["date"].apply(lambda x: str(x)[:4])
df["month"] = df["date"].apply(lambda x: str(x)[4:6])
df["day"] = df["date"].apply(lambda x: str(x)[6:])
print(df)
result is as following
# result is:
date year month day
0 20210901 2021 09 01
1 20210902 2021 09 02
2 20210903 2021 09 03
3 20210904 2021 09 04
CodePudding user response:
I created a simple function which will return a zipped list in the format you wanted
def convert_to_dates(df):
dates = []
for i,v in x.iterrows():
dates.append(v.values[0])
for i in range(len(dates)):
dates[i] = str(dates[i])
years = []
months = []
days = []
for i in range(len(dates)):
years.append(dates[i][0:4])
months.append(dates[i][4:6])
days.append(dates[i][6:8])
return list(zip(years, months, days))
Call it by using convert_to_dates(x)
Output:
In [4]: convert_to_dates(x)
Out[4]:
[('2021', '09', '01'),
('2021', '09', '02'),
('2021', '09', '03'),
('2021', '09', '04')]
CodePudding user response:
x = pd.DataFrame([20210917, 20210918, 20210919, 20210920])
x[0] = x[0].astype(str).astype('datetime64')
x = x.apply(lambda a: [a[0].year, a[0].month, a[0].day],axis=1)
print(x)
Output:
0
0 [2021, 9, 17]
1 [2021, 9, 18]
2 [2021, 9, 19]
3 [2021, 9, 20]