I have a DataFrame in Python whose Date column contains dates embedded in free text.
import pandas as pd

df = pd.DataFrame({
    "Date": ["2020-01-27 welcome ! offer", "Space ! offer 2020-02-27", "new | 2020-03-27"],
    "A_item": [2, 8, 0],
    "B_item": [1, 7, 10],
    "C_item": [9, 2, 9],
})
and I need to get this as a result:
Date | A_item | B_item | C_item | Extracted Date |
---|---|---|---|---|
2020-01-27 welcome ! offer | 2 | 1 | 9 | 27-01-2020 |
Space ! offer 2020-02-27 | 8 | 7 | 2 | 27-02-2020 |
Space ! offer new 2020-03-27 | 0 | 10 | 9 | 27-03-2020 |
Does anybody know how to extract them?
CodePudding user response:
You can try the following code:
import re

def extract_date(x):
    # Match a date written as YYYY-MM-DD, e.g. 2020-01-27
    pattern = r"[0-9]{4}-[0-9]{2}-[0-9]{2}"
    match = re.findall(pattern, x)
    return match[0]

df["new_column"] = df["Date"].apply(extract_date)
Here Date is the source column.
new_column should then contain the extracted date string for each row.
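One caveat with match[0]: it raises an IndexError for any row that contains no date. Below is a minimal sketch of a more defensive variant, assuming dates are always written as YYYY-MM-DD. The names extract_date_safe, DATE_PATTERN and sample are made up for illustration and are not part of the answer above.

```python
import re

import pandas as pd

# Assumption: dates always appear as YYYY-MM-DD somewhere in the text.
DATE_PATTERN = re.compile(r"\d{4}-\d{2}-\d{2}")

def extract_date_safe(text):
    """Return the first YYYY-MM-DD substring in text, or None if there is none."""
    match = DATE_PATTERN.search(text)
    return match.group(0) if match else None

# Small illustrative frame: one row with a date, one without.
sample = pd.DataFrame({"Date": ["2020-01-27 welcome ! offer", "no date here"]})
sample["new_column"] = sample["Date"].apply(extract_date_safe)
```

Rows without a date end up as missing values instead of stopping the whole apply call.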
CodePudding user response:
# The dates in the data are written as YYYY-MM-DD; expand=False returns a Series
df['Extracted Date'] = df['Date'].str.extract(r'(\d{4}-\d{2}-\d{2})', expand=False)
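The desired output in the question uses DD-MM-YYYY, whereas the strings contain YYYY-MM-DD, so extraction alone does not produce it. Here is a sketch of the full round trip, assuming every row contains a date in YYYY-MM-DD form (iso_dates is just an illustrative intermediate name):

```python
import pandas as pd

df = pd.DataFrame({
    "Date": ["2020-01-27 welcome ! offer", "Space ! offer 2020-02-27", "new | 2020-03-27"],
    "A_item": [2, 8, 0],
    "B_item": [1, 7, 10],
    "C_item": [9, 2, 9],
})

# Pull out the YYYY-MM-DD substring; expand=False returns a Series.
iso_dates = df["Date"].str.extract(r"(\d{4}-\d{2}-\d{2})", expand=False)

# Parse the strings as dates, then render them as DD-MM-YYYY text.
df["Extracted Date"] = pd.to_datetime(iso_dates, format="%Y-%m-%d").dt.strftime("%d-%m-%Y")
```

The result is a column of strings. Dropping the .dt.strftime call keeps real datetime values instead, which is usually more convenient for sorting or filtering by date.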