New column with word at nth position of string from other column (pandas)

import numpy as np
import pandas as pd

d = {'ABSTRACT_ID': [14145090, 1900667, 8157202, 6784974],
     'TEXT': [
         "velvet antlers vas are commonly used in tradit",
         "we have taken a basic biologic RPA to elucidat4",
         "ceftobiprole bpr is an investigational cephalo",
         "lipoperoxidationderived aldehydes for example",],
     'LOCATION': [1, 4, 2, 1]}

df = pd.DataFrame(data=d)
df

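def word_at_pos(x, y):
    pos = x      # 1-based position of the word to extract
    string = y   # text to search

    count = 0    # number of spaces seen so far
    res = ""     # characters of the word currently being read
    for char in string:
        if char == ' ':
            count = count + 1
            if count == pos:
                break
            res = ""  # not the target word yet, start collecting the next one
        else:
            res = res + char
    return res

print(word_at_pos(df.iloc[0, 2], df.iloc[0, 1]))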

For this df I want to create a new column WORD that contains the word from TEXT at the position indicated by LOCATION. For example, the first row would be "velvet".

I can do this for a single row with an isolated function word_at_pos(x,y), but I can't work out how to apply it to the whole column. I have created new columns with lambda functions before, but I can't work out how to fit this function into a lambda.

CodePudding user response:

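Looping over TEXT and LOCATION together is probably the simplest approach. Splitting each string produces lists of different lengths (a jagged array), so NumPy advanced indexing can't be used to pick out the words. LOCATION is 1-based, so 1 is subtracted to get a 0-based list index.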

df["WORDS"] = [txt.split()[loc] for txt, loc in zip(df["TEXT"], df["LOCATION"]-1)]
print(df)

   ABSTRACT_ID  ...                    WORDS
0     14145090  ...                   velvet
1      1900667  ...                        a
2      8157202  ...                      bpr
3      6784974  ...  lipoperoxidationderived

[4 rows x 4 columns]
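
Since the question asks specifically about a lambda, here is a minimal sketch of the same idea using df.apply with axis=1. The helper name nth_word is hypothetical (it is not from the question), and returning None for an out-of-range LOCATION is an assumption about the desired behaviour; the list comprehension above would raise an IndexError in that case.

```python
import pandas as pd

df = pd.DataFrame({
    "TEXT": [
        "velvet antlers vas are commonly used in tradit",
        "we have taken a basic biologic RPA to elucidat4",
        "ceftobiprole bpr is an investigational cephalo",
        "lipoperoxidationderived aldehydes for example",
    ],
    "LOCATION": [1, 4, 2, 1],
})

def nth_word(text, pos):
    """Hypothetical helper: word at 1-based position pos, or None if out of range."""
    words = text.split()
    if 1 <= pos <= len(words):
        return words[pos - 1]
    return None  # assumption: missing positions become None instead of raising

# axis=1 passes each row to the lambda as a Series, so both columns are available
df["WORD"] = df.apply(lambda row: nth_word(row["TEXT"], row["LOCATION"]), axis=1)
print(df["WORD"].tolist())
# ['velvet', 'a', 'bpr', 'lipoperoxidationderived']
```

For large frames the zip-based list comprehension is likely to be faster, because df.apply with axis=1 builds a Series for every row; the apply form is mainly useful when a lambda fits better with the rest of the code.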