Extracting Specific Text From column in dataframe in pandas

Time:02-18

I have a pandas DataFrame with a State column. Using a regular expression, I need to extract the value that ends in one of the units [ft, mi, FT, MI] from each row and store it in a new column.

import pandas as pd

df1 = pd.DataFrame({
    'State': ['Arizona 4.47ft', 'Georgia 1023mi', 'Newyork 2022 NY 74.6FT', 'Indiana 747MI(In)', 'Florida 453mi FL']})

Expected output

                    State Distance
0          Arizona 4.47ft   4.47ft
1          Georgia 1023mi   1023mi
2  Newyork 2022 NY 74.6FT   74.6FT
3       Indiana 747MI(In)    747MI
4        Florida 453mi FL    453mi

Would anyone please help?

Answer:

Build a regex pattern from the list l, then use str.extract to pull the first match of that pattern out of each value in the State column:

l = ['ft', 'mi', 'FT', 'MI']
# \S+ grabs the non-space characters before the unit; \b ends the match right after the unit
df1['Distance'] = df1['State'].str.extract(r'(\S+(?:%s))\b' % '|'.join(l))

                    State Distance
0          Arizona 4.47ft   4.47ft
1          Georgia 1023mi   1023mi
2  Newyork 2022 NY 74.6FT   74.6FT
3       Indiana 747MI(In)    747MI
4        Florida 453mi FL    453mi
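
A note on robustness: because \S+ accepts any non-space characters, a plain word that happens to end in one of the units (for example "Miami") would also be captured. The following is a minimal sketch of a stricter variant, not part of the original answer. It assumes every distance starts with a number, allows optional whitespace before the unit, and matches the units case-insensitively. The names units and pattern, and the extra 'Miami Beach' row, are illustrative assumptions.

```python
import re
import pandas as pd

df1 = pd.DataFrame({
    'State': ['Arizona 4.47ft', 'Georgia 1023mi', 'Newyork 2022 NY 74.6FT',
              'Indiana 747MI(In)', 'Florida 453mi FL',
              'Miami Beach']})  # hypothetical extra row with no distance

units = ['ft', 'mi']  # lowercase only; re.IGNORECASE below covers FT and MI

# A number with an optional decimal part, optional whitespace, then a unit.
pattern = r'(\d+(?:\.\d+)?\s*(?:%s))\b' % '|'.join(map(re.escape, units))

# expand=False returns a Series, which assigns cleanly to a single column.
df1['Distance'] = df1['State'].str.extract(pattern, flags=re.IGNORECASE, expand=False)
print(df1)
```

With this pattern the 'Miami Beach' row gets NaN in Distance, whereas the \S+ version would capture "Miami". The trade-off is that values without a leading number are no longer matched.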