I have the following data frame df1:
   string             lists
0  i have a dog       ['fox', 'dog', 'cat']
1  there is a cat     ['dog', 'house', 'car']
2  hello everyone     ['hi', 'hello', 'everyone']
3  hi my name is Joe  ['name', 'was', 'Joe']
I'm trying to return a data frame df2 that looks like this:
   string             lists                        new_string
0  i have a dog       ['fox', 'dog', 'cat']        i have a
1  there is a cat     ['dog', 'house', 'car']      there is a cat
2  hello everyone     ['hi', 'hello', 'everyone']
3  hi my name is Joe  ['name', 'was', 'Joe']       hi my is
I've referenced other questions such as https://stackoverflow.com/a/40493603/5879909, but I'm having trouble searching through a list in a column as opposed to a preset list.
CodePudding user response:
Assuming the dataframe is df and the goal is to create a new column named new_string, whose values are the strings from the string column with every word that appears in that row's lists column removed, the following will do the work:
df['new_string'] = df['string'].apply(lambda x: ' '.join([word for word in x.split() if word not in df['lists'][df['string'] == x].values[0]]))
[Out]:
   string             lists                  new_string
0  i have a dog       [fox, dog, cat]        i have a
1  there is a cat     [dog, house, car]      there is a cat
2  hello everyone     [hi, hello, everyone]
3  hi my name is Joe  [name, was, Joe]       hi my is
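
One caveat with the one-liner above: the lookup df['lists'][df['string'] == x].values[0] rescans the whole string column for every row and takes the first match, so two rows with the same string would both be filtered against the first row's list. A minimal sketch of a row-wise alternative, which pairs each string with the list from its own row, might look like the following. The sample frame is rebuilt from the question, and the helper name drop_listed_words is hypothetical, not part of pandas.

```python
import pandas as pd

# Sample data rebuilt from the question.
df = pd.DataFrame({
    "string": ["i have a dog", "there is a cat",
               "hello everyone", "hi my name is Joe"],
    "lists": [["fox", "dog", "cat"], ["dog", "house", "car"],
              ["hi", "hello", "everyone"], ["name", "was", "Joe"]],
})


def drop_listed_words(text, banned):
    """Hypothetical helper: keep the words of `text` that are not in `banned`."""
    banned = set(banned)  # set membership checks are cheaper than list scans
    return " ".join(word for word in text.split() if word not in banned)


# zip pairs each string with the list from the same row, so duplicate
# strings cannot pick up another row's list.
df["new_string"] = [
    drop_listed_words(text, banned)
    for text, banned in zip(df["string"], df["lists"])
]

print(df)
```

The comparison is an exact, case-sensitive match on whitespace-separated words, the same as in the original one-liner, so "Joe" is removed but "joe" would not be.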