Home > database >  how do I find count of unique combination in 2 columns of dataframe in pandas
how do I find count of unique combination in 2 columns of dataframe in pandas

Time:04-19

df = pd.DataFrame({'col1': [1,2,4,3], 'col2': [2,1,3,4]})
   col1 col2
0   1     2
1   2     1
2   4     3
3   3     4

Desired outcome

  col1 col2 count
0   1     2     2
1   4     3     2

I tried

(df.groupby(['team1','team2']).size()
   .sort_values(ascending=False)
   .reset_index(name='Count')
)

but this is not giving me unique combination

CodePudding user response:

You do something like this also,

df.apply(set, axis=1).value_counts()

Output:

{1, 2}    2
{3, 4}    2
dtype: int64

CodePudding user response:

IIUC, you can first compute a frozenset from your two columns, then use named aggregation:

# compute unordered grouper
group = df[['col1', 'col2']].agg(frozenset, axis=1)

# craft a dictionary of expected output
# first rows for the existing columns   new column for count
d = {c: (c, 'first') for c in df}
d.update({'count': ('col1', 'count')})
# {'col1': ('col1', 'first'),
#  'col2': ('col2', 'first'),
#  'count': ('col1', 'count')}

# perform the aggregation
df.groupby(group, as_index=False).agg(**d)

output:

   col1  col2  count
0     1     2      2
1     4     3      2

CodePudding user response:

Let us check

df[:] = np.sort(df.to_numpy(),axis=1)
df.value_counts()
Out[132]: 
col1  col2
1     2       2
3     4       2
dtype: int64
  • Related