I have a dataframe like this:
import pandas as pd
df = pd.DataFrame({'caption': ['hello this pack is for you: Jake Peralta. Thanks']})
df
caption
hello this pack is for you: Jake Peralta. Thanks
...
...
...
I'm trying to get the recipient's first and last name here. The format of the caption column is always the same, so I want to drop everything before "for you:" and keep the first 2 words after it (this number may change).
CodePudding user response:
Takes care of leading spaces in name:
>>> df.caption.str.split(".").str[0].str.split(":").str[1].str.strip()
1 Jake Peralta
Name: caption, dtype: object
CodePudding user response:
Here is one way:
df.caption.apply(lambda st: st[st.find(":") + 2:st.find(".")])
Output:
0 Jake Peralta
Name: caption, dtype: object
CodePudding user response:
Maybe you can try it like this:
df['caption'].str.split("for you: ").str[1].str.split('.').str[0]
output:
0 Jake Peralta
1 first last
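The answers above all rely on the period after the name. Since the question says the number of words to keep may change, here is a minimal sketch that counts words instead. The helper name first_words_after and its marker and n_words parameters are made up for illustration, and it assumes the name is separated from the rest of the caption by whitespace:

```python
def first_words_after(text, marker="for you:", n_words=2):
    """Return the first n_words words that follow marker in text.

    Hypothetical helper: the name and parameters are illustrative only.
    """
    # partition() splits on the first occurrence of the marker
    _, found, tail = text.partition(marker)
    if not found:
        # Marker missing: return an empty string rather than raising
        return ""
    # Split on any whitespace and keep only the requested number of words
    words = tail.split()[:n_words]
    # Drop trailing punctuation such as the period after the last name
    return " ".join(words).rstrip(".,;")


caption = "hello this pack is for you: Jake Peralta. Thanks"
print(first_words_after(caption))
```

It could then be applied to the column with df['caption'].apply(first_words_after), or with df['caption'].apply(first_words_after, n_words=3) if more words are needed.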