Finding the common elements in 2 columns which is present in a single dataframe-CodePudding

[image of a data frame]

I want to find the common elements which is present in top30 and played games column

i have used the below code , but it does not give me the right output

output : {f, 1, , ,, a, s, t, r, g, 2, ', y, o, e, [

The output which i am looking is

Prediction1['precision_at_30'] = [
    set(a).intersection(b) for a, b in zip(Prediction1['played_games'], Prediction1['top 30'])]

CodePudding user response：

You can use apply to check the set intersection:

df['result'] = df.apply(lambda r: set(r['games']).intersection(r['played_games']), axis=1)

Example:

             games played_games      result
0  [abc, def, ghi]   [def, abc]  {abc, def}

CodePudding user response：

Your columns seems to be the string representation of lists, so you can use pd.eval to convert your string in real python list before using set predicate:

df = pd.DataFrame({'games': ["['abc', 'bef']", "['b', 'c', 'e', 'f']"], 
                   'played_games': ["['abc', 'bef', 'e']", "['b', 'f']"]})

df['result'] = df[['games', 'played_games']].apply(
    lambda x: set(pd.eval(x['games'])).intersection(pd.eval(x['played_games'])),
    axis=1)

Output:

>>> df
                  games         played_games      result
0        ['abc', 'bef']  ['abc', 'bef', 'e']  {abc, bef}
1  ['b', 'c', 'e', 'f']           ['b', 'f']      {b, f}