Extract and format dates from a dataframe column

Time:11-19

I have a dataframe in Python with a Date column where each date is mixed in with other text.

import pandas as pd

df = pd.DataFrame({"Date": ["2020-01-27 welcome ! offer", "Space ! offer 2020-02-27", "new | 2020-03-27"],
                   "A_item": [2, 8, 0],
                   "B_item": [1, 7, 10],
                   "C_item": [9, 2, 9]})

and I need to get this as a result:

Date                        A_item  B_item  C_item  Extracted Date
2020-01-27 welcome ! offer       2       1       9  27-01-2020
Space ! offer 2020-02-27         8       7       2  27-02-2020
new | 2020-03-27                 0      10       9  27-03-2020

Does anybody know how to extract them?

CodePudding user response:

You can try the following code:

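import re

def extract_date(x):
    # Three runs of digits separated by hyphens, e.g. 2020-01-27
    pattern = r"[0-9]+-[0-9]+-[0-9]+"
    match = re.findall(pattern, x)
    # First match in the string; raises IndexError if there is none
    return match[0]

df["new_column"] = df["first_column"].apply(extract_date)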

first_column is the source column.

Then new_column should contain the date found in each row, e.g. 2020-01-27 for the first row of the question's data.
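
One caveat with the approach above: indexing the result of re.findall with [0] fails on any row that has no date in it. A minimal sketch of a variant that returns None instead, assuming the dates are always written as YYYY-MM-DD and that the source column is the question's Date column. The helper name extract_date_or_none and the extra fourth row are made up for illustration:

```python
import re

import pandas as pd

# Assumed format: four-digit year, two-digit month, two-digit day.
DATE_PATTERN = re.compile(r"\d{4}-\d{2}-\d{2}")


def extract_date_or_none(text):
    """Return the first YYYY-MM-DD substring in text, or None if there is none."""
    match = DATE_PATTERN.search(text)
    return match.group(0) if match else None


df = pd.DataFrame({"Date": ["2020-01-27 welcome ! offer",
                            "Space ! offer 2020-02-27",
                            "new | 2020-03-27",
                            "no date here"]})  # hypothetical row without a date

df["new_column"] = df["Date"].apply(extract_date_or_none)
```

The first three rows get the matched text and the last row is left empty rather than raising an exception.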

CodePudding user response:

df['Extracted Date'] = df['Date'].str.extract(r'(\d{4}-\d{2}-\d{2})', expand=False)
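
The question asks for the result in DD-MM-YYYY order, while the dates in the text are written as YYYY-MM-DD. One possible way to cover that last step, sketched under the assumption that every date in the column is a valid YYYY-MM-DD string: extract with str.extract, parse with pd.to_datetime, then format with dt.strftime. The intermediate names extracted and parsed are only for illustration.

```python
import pandas as pd

df = pd.DataFrame({"Date": ["2020-01-27 welcome ! offer",
                            "Space ! offer 2020-02-27",
                            "new | 2020-03-27"],
                   "A_item": [2, 8, 0],
                   "B_item": [1, 7, 10],
                   "C_item": [9, 2, 9]})

# With a single capture group, expand=False returns a Series
# rather than a one-column DataFrame.
extracted = df["Date"].str.extract(r"(\d{4}-\d{2}-\d{2})", expand=False)

# Parse the text as dates; errors="coerce" turns anything unparseable
# (including rows with no match) into NaT instead of raising.
parsed = pd.to_datetime(extracted, format="%Y-%m-%d", errors="coerce")

# Render the dates as day-month-year strings.
df["Extracted Date"] = parsed.dt.strftime("%d-%m-%Y")
```

For the question's data this gives 27-01-2020, 27-02-2020 and 27-03-2020 in the Extracted Date column. Going through pd.to_datetime rather than rearranging the string by hand also rejects impossible dates such as 2020-13-45.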