I have a pandas DataFrame with a State column. Using a regular expression, I need to extract the word ending in one of the units [ft, mi, FT, MI] from each value and store it in a new column.
import pandas as pd
df1 = pd.DataFrame({
'State':['Arizona 4.47ft','Georgia 1023mi','Newyork 2022 NY 74.6 FT','Indiana 747MI(In)','Florida 453mi FL']})
Expected output
   State                     Distance
0  Arizona 4.47ft            4.47ft
1  Georgia 1023mi            1023mi
2  Newyork NY 74.6ft         74.6ft
3  Indiana 747MI(In)         747MI
4  Florida 453mi FL          453mi
Would anyone please help?
CodePudding user response:
Build a regex pattern from the list l, then use str.extract to pull the first occurrence of this pattern out of the State column.
l = ['ft','mi','FT','MI']
# \S+ grabs the non-space characters before the unit; \b ends the match right after the unit
df1['Distance'] = df1['State'].str.extract(r'(\S+(?:%s))\b' % '|'.join(l))
   State                     Distance
0  Arizona 4.47ft            4.47ft
1  Georgia 1023mi            1023mi
2  Newyork 2022 NY 74.6FT    74.6FT
3  Indiana 747MI(In)         747MI
4  Florida 453mi FL          453mi
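If the number and the unit can be separated by a space, as in 'Newyork 2022 NY 74.6 FT' from the question, one option is to match the number explicitly and allow optional whitespace before the unit. The following is only a sketch under a few assumptions of mine: the names units and pattern are made up for illustration, matching is case-insensitive (so the list needs only lowercase units), and any whitespace inside the match is stripped afterwards.

```python
import re
import pandas as pd

df1 = pd.DataFrame({
    'State': ['Arizona 4.47ft', 'Georgia 1023mi', 'Newyork 2022 NY 74.6 FT',
              'Indiana 747MI(In)', 'Florida 453mi FL']
})

# Hypothetical helper names; lowercase is enough because of re.IGNORECASE below.
units = ['ft', 'mi']

# Integer or decimal number, optional whitespace, then one of the units,
# ending on a word boundary so "747MI(In)" stops after "MI".
pattern = r'(\d+(?:\.\d+)?\s*(?:%s))\b' % '|'.join(units)

# expand=False returns a Series because the pattern has a single capture group.
distance = df1['State'].str.extract(pattern, flags=re.IGNORECASE, expand=False)

# Assumption: "74.6 FT" should be stored without the inner space, as "74.6FT".
df1['Distance'] = distance.str.replace(r'\s+', '', regex=True)

print(df1)
```

Rows without a number followed by a unit come back as NaN, so a quick df1['Distance'].isna().sum() shows how many values the pattern missed.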