I want to write a function that takes a list of these datetime strings as input and returns only the date in 'yyyy-mm-dd' format.
This is the dataframe
import pandas as pd
twitter_url = 'https://raw.githubusercontent.com/Explore-AI/Public-Data/master/Data/twitter_nov_2019.csv'
twitter_df = pd.read_csv(twitter_url)
twitter_df.head()
This is the date variable
dates = twitter_df['Date'].to_list()
This is the code I have written:
def date_parser(dates):
    """
    This is a function that takes as input a list of the datetime strings
    and returns only the date in 'yyyy-mm-dd' format.
    """
    # for every date string in the list
    for date in dates:
        # the function should return only the dates and neglect the time
        return ([date[:4] + "-" + date[5:7] + "-" + date[8:-9]])

date_parser(dates[:3])
This is the output I get
['2019-11-29']
This is the expected output
['2019-11-29', '2019-11-28', '2019-11-28', '2019-11-28', '2019-11-28']
How do I do this?
CodePudding user response:
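You can try using regex (this needs import re). Create an empty list before the loop:
new_date_list = []
and then inside the loop, append the leading date part of each string:
new_date_list.append(re.match(r"\d{4}[-/]\d{2}[-/]\d{2}", date).group())
Return new_date_list after the loop has finished, not inside it; a return inside the loop exits on the first element.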
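Put together, the regex approach might look like the sketch below. The function name date_parser_regex and the sample strings are made up for illustration, and it assumes every string starts with a yyyy-mm-dd date followed by the time. It uses re.match with a simple leading-date pattern, and it returns only after the loop so that every element is processed.

```python
import re

# Assumption: each string starts with a date, e.g. '2019-11-29 12:50:54'.
DATE_PATTERN = re.compile(r"\d{4}-\d{2}-\d{2}")

def date_parser_regex(dates):
    """Return the leading 'yyyy-mm-dd' part of each datetime string."""
    new_date_list = []
    for date in dates:
        # match() only looks at the start of the string
        match = DATE_PATTERN.match(date)
        if match:
            new_date_list.append(match.group())
    # return after the loop, not inside it
    return new_date_list

# Hypothetical sample values, not taken from the CSV
sample = ['2019-11-29 12:50:54', '2019-11-28 09:10:11', '2019-11-28 23:59:59']
result = date_parser_regex(sample)
print(result)
```

Strings that do not start with a date are skipped silently here; raising an error instead may be preferable depending on the data.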
CodePudding user response:
Convert the Date Series to pandas datetime format and then format the date time:
twitter_df['Date2'] = pd.to_datetime(twitter_df['Date']).dt.strftime('%Y-%m-%d')
Here, we are adding a new column named Date2 to twitter_df.
You can convert this to a list using:
twitter_df['Date2'].to_list()
Output for twitter_df['Date2']:
0      2019-11-29
1      2019-11-29
2      2019-11-29
3      2019-11-29
4      2019-11-29
          ...
195    2019-11-20
196    2019-11-20
197    2019-11-20
198    2019-11-20
199    2019-11-20
Name: Date2, Length: 200, dtype: object
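If the data is already a plain list (like dates in the question), the same parse-then-format idea can be sketched without pandas, using the standard library's datetime module instead. This is only an illustration: the name date_parser_stdlib, the sample strings, and the input format '%Y-%m-%d %H:%M:%S' are assumptions, so adjust the format if the real strings differ.

```python
from datetime import datetime

# Assumed layout of the incoming strings, e.g. '2019-11-29 12:50:54'
INPUT_FORMAT = '%Y-%m-%d %H:%M:%S'

def date_parser_stdlib(dates):
    """Parse each datetime string and format it back as 'yyyy-mm-dd'."""
    # The list comprehension builds the full list before returning,
    # so every element is converted, not just the first one.
    return [
        datetime.strptime(date, INPUT_FORMAT).strftime('%Y-%m-%d')
        for date in dates
    ]

# Hypothetical sample values, not taken from the CSV
sample = ['2019-11-29 12:50:54', '2019-11-28 09:10:11', '2019-11-28 23:59:59']
parsed = date_parser_stdlib(sample)
print(parsed)
```

Unlike plain slicing, strptime raises a ValueError on a string that does not match the format, which makes malformed rows easier to spot.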