Hi, I have a list and a pandas DataFrame with a column whose elements are also lists. I want to check whether any element of each list in that column is present in the other list, then create one column containing 1 if a match is found and 0 if not, and another column containing the matched elements as a comma-separated string. I found a similar question but couldn't understand how to apply it to this case.
How can I create a new column found, which is 1 if any letter in the list letters is found in letters_list, and another column letters_found, which contains the matched letters as a comma-separated string? It should look like the following.
CodePudding user response:
You need to use a loop here. Make letters a set so that common elements can be tested efficiently with set.intersection, and use a list comprehension to build the letters_found column. Then check whether any letter was found by converting letters_found to boolean (an empty string becomes False, anything else True) and then to int to get 0/1.
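To see why this works, the individual steps can be tried on a single row in plain Python. This is only a small sketch: the variable names row, common, joined, flag and empty_flag are made up for illustration, and the two sample rows are taken from the data in the question.

```python
letters = {'a', 'b', 'c', 'f', 'j'}

# set.intersection accepts any iterable, so a plain list works as the argument
row = ['d', 'e', 'f']
common = letters.intersection(row)

# Sort before joining so the output order is deterministic
joined = ','.join(sorted(common))

# A non-empty string is truthy, an empty string is falsy
flag = int(bool(joined))

# A row with no letters in common yields an empty string, hence 0
empty_flag = int(bool(','.join(sorted(letters.intersection(['g', 'h', 'i'])))))

print(repr(joined), flag, empty_flag)
```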
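letters = {'a', 'b', 'c', 'f', 'j'}  # a set gives fast membership tests
# join the sorted letters that each row's list shares with `letters`
df_temp['letters_found'] = [','.join(sorted(letters.intersection(l)))
                            for l in df_temp['letters_list']]
# empty string -> False -> 0, non-empty string -> True -> 1
df_temp['found'] = df_temp['letters_found'].astype(bool).astype(int)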
output:
  letters_list letters_found  found
0    [a, b, c]         a,b,c      1
1    [d, e, f]             f      1
2    [g, h, i]                    0
3    [j, h, i]             j      1
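For anyone who wants to reproduce this end to end, here is a self-contained sketch. The construction of df_temp is an assumption, rebuilt from the rows shown in the output. As a small variation on the answer, the 0/1 flag is derived by comparing against the empty string rather than casting to bool; both approaches are meant to give the same result.

```python
import pandas as pd

# Assumed input, reconstructed from the rows shown in the output
df_temp = pd.DataFrame({
    'letters_list': [['a', 'b', 'c'], ['d', 'e', 'f'],
                     ['g', 'h', 'i'], ['j', 'h', 'i']]
})

letters = {'a', 'b', 'c', 'f', 'j'}

# One comma-separated string of matched letters per row
df_temp['letters_found'] = [','.join(sorted(letters.intersection(l)))
                            for l in df_temp['letters_list']]

# 1 where at least one letter matched, 0 otherwise
df_temp['found'] = df_temp['letters_found'].ne('').astype(int)

print(df_temp)
```

Comparing with ne('') states the intent explicitly and does not rely on how a string column is cast to bool.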