Home > OS >  Get the first word after specific word in a column
Get the first word after specific word in a column

Time:11-02

I am trying to get the zip code after the specific word 'zip_code' within a string.

I have a data frame with a column named "location", in this column there is a string, I want to identify the word "zip_code" and get the value after this word for each row.

Input

name   location
 Bar1   LA Jefferson zip_code 202378 Avenue free
 Pizza  Avenue 45 zip_code 45623 wichita st
 Tacos   Las Americas avenue 7 zip_code 67890 nicolas st

Expected output

 name   location                                          
 Bar1    202378 
 Pizza   45623
 Tacos   67890 

So far, following an example I was able to extract the zip code for any string

str = "address1 355 Turnpike Ste 4  address3 zip_code 02021 country    US   "
str.split("zip_code")[1].split()[0]
>> 02021

But I do not know how to do the same for each row of my column location

CodePudding user response:

The best way is to use extract() which accepts regex and allows searching through each row.

import pandas as pd
import numpy as np

df = pd.DataFrame({'name':['Bar1', 'Pizza', 'Tacos'],
       'location':['LA Jefferson zip_code 202378 Avenue free', 'Avenue 45 zip_code 45623 wichita st', 'Las Americas avenue 7 zip_code 67890 nicolas st']})

df['location'] = df['location'].str.extract('zip_code\s(.*?)\s')


>>> df
    name location
0   Bar1   202378
1  Pizza    45623
2  Tacos    67890
  • Related