Write a function that takes as input a list of these datetime strings and returns only the date in 'yyyy-mm-dd' format

Time:03-17

I want to write a function that takes as input a list of these datetime strings and returns only the date in 'yyyy-mm-dd' format.

This is the dataframe:

import pandas as pd

twitter_url = 'https://raw.githubusercontent.com/Explore-AI/Public-Data/master/Data/twitter_nov_2019.csv'
twitter_df = pd.read_csv(twitter_url)
twitter_df.head()

This is the date variable:

dates = twitter_df['Date'].to_list()

This is the code I have written:


def date_parser(dates):
    """
    This is a function that takes as input a list of the datetime strings
    and returns only the date in 'yyyy-mm-dd' format.
    """

    # for every datetime string in the list
    for date in dates:

        # the function should return only the dates and neglect the time
        return ([date[:4] + "-" + date[5:7] + "-" + date[8:-9]])

date_parser(dates[:3])

This is the output I get:

['2019-11-29']

This is the expected output:

['2019-11-29', '2019-11-28', '2019-11-28', '2019-11-28', '2019-11-28']

How do I do this?

CodePudding user response:

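You can try using a regex. Import re and create an empty list before the loop:

import re
new_date_list = []

and then inside the loop, match the date at the start of each string (re.match rather than re.findall, so you get the matched text instead of a tuple of groups, and no trailing $ so the time part is ignored):

new_date_list.append(re.match(r"\d{4}[-/](0[1-9]|1[0-2])[-/](0[1-9]|[12][0-9]|3[01])", date).group())

Return new_date_list after the loop has finished, not inside it.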
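The function in the question returns during the first pass through the loop, which is why only one date comes back. A minimal sketch of the regex idea that collects every result and returns after the loop might look like the following. It assumes the strings start with a date in 'yyyy-mm-dd' or 'yyyy/mm/dd' form; the sample values and the choice to skip non-matching strings are illustrative assumptions, not taken from the dataset.

```python
import re

# Matches a leading date such as 2019-11-29 or 2019/11/29.
DATE_PATTERN = re.compile(r"(\d{4})[-/](0[1-9]|1[0-2])[-/](0[1-9]|[12][0-9]|3[01])")


def date_parser(dates):
    """Return the 'yyyy-mm-dd' part of each datetime string in dates."""
    new_date_list = []
    for date in dates:
        match = DATE_PATTERN.match(date)
        if match is None:
            # Assumption: strings without a leading date are skipped.
            continue
        year, month, day = match.groups()
        new_date_list.append(f"{year}-{month}-{day}")
    # Return only after the loop has visited every string.
    return new_date_list


# Hypothetical sample values in the assumed 'yyyy-mm-dd hh:mm:ss' layout.
sample = ["2019-11-29 12:50:54", "2019-11-28 09:15:00", "not a date"]
print(date_parser(sample))
```

With the real data, date_parser(dates[:3]) would be called the same way as in the question.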

CodePudding user response:

Convert the Date column to pandas datetime and then format it as a 'yyyy-mm-dd' string:

twitter_df['Date2'] = pd.to_datetime(twitter_df['Date']).dt.strftime('%Y-%m-%d')

This adds a new column, named Date2, to twitter_df.

You can convert this to a list using:

twitter_df['Date2'].to_list()

Output for twitter_df['Date2']:

0      2019-11-29
1      2019-11-29
2      2019-11-29
3      2019-11-29
4      2019-11-29
          ...    
195    2019-11-20
196    2019-11-20
197    2019-11-20
198    2019-11-20
199    2019-11-20
Name: Date2, Length: 200, dtype: object
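If you need a function that takes a plain list rather than a DataFrame column, the same parse-then-format idea can be sketched with the standard library's datetime module. This is only a sketch: the default format string assumes the values look like 'yyyy-mm-dd hh:mm:ss', and the sample values are made up for illustration.

```python
from datetime import datetime


def date_parser(dates, fmt="%Y-%m-%d %H:%M:%S"):
    """Parse each datetime string with fmt and return 'yyyy-mm-dd' strings."""
    # strptime raises ValueError for a string that does not match fmt.
    return [datetime.strptime(date, fmt).strftime("%Y-%m-%d") for date in dates]


# Hypothetical sample values in the assumed layout.
sample = ["2019-11-29 12:50:54", "2019-11-28 09:15:00"]
print(date_parser(sample))
```

Unlike slicing, parsing rejects malformed values with an error instead of silently returning a wrong substring, which may or may not be what you want.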