I have the following dataframe. The objective is to count the number of appearances for all elements in both "from" and "to" by date.
id date from to
A 01/01/2020 alice bob
A 01/01/2020 alice carl
A 01/02/2020 jim peter
A 01/02/2020 jim jen
B 01/02/2020 mary john
B 01/03/2020 john max
B 01/03/2020 mike jane
B 01/03/2020 jane jim
C 01/02/2020 xavier jim
C 01/02/2020 jim alice
C 01/03/2020 alice bob
C 01/03/2020 manny lily
C 01/04/2020 rosa bob
.. .. .. ..
Desired dataframe
name date count
alice 01/01/2020 2
alice 01/02/2020 1
alice 01/03/2020 1
bob 01/01/2020 1
bob 01/03/2020 1
bob 01/04/2020 1
carl 01/01/2020 1
peter 01/02/2020 1
jane 01/03/2020 2
jen 01/02/2020 1
jim 01/02/2020 4
jim 01/03/2020 1
john 01/02/2020 1
john 01/03/2020 1
lily 01/03/2020 1
mary 01/02/2020 1
manny 01/03/2020 1
max 01/03/2020 1
mike 01/03/2020 1
rosa 01/04/2020 1
xavier01/02/2020 1
Any help will be appreciated!
CodePudding user response:
You can set_index
stack
and use groupby
value_counts
. The rest is just cosmetic changes to get the output in the desired form:
out = (df.set_index('date')[['from','to']].stack()
.groupby(level='date').value_counts()
.reset_index(level=0, name='count').sort_index()
.rename_axis(index=['name']).reset_index())
Output:
name date count
0 alice 01/01/2020 2
1 alice 01/02/2020 1
2 alice 01/03/2020 1
3 bob 01/01/2020 1
4 bob 01/03/2020 1
5 bob 01/04/2020 1
6 carl 01/01/2020 1
7 jane 01/03/2020 2
8 jen 01/02/2020 1
9 jim 01/02/2020 4
10 jim 01/03/2020 1
11 john 01/02/2020 1
12 john 01/03/2020 1
13 lily 01/03/2020 1
14 manny 01/03/2020 1
15 mary 01/02/2020 1
16 max 01/03/2020 1
17 mike 01/03/2020 1
18 peter 01/02/2020 1
19 rosa 01/04/2020 1
20 xavier 01/02/2020 1
CodePudding user response:
Use melt
as starting point:
out = (df.melt(['id', 'date'], value_name='name')
.value_counts(['name', 'date'], sort=False)
.rename('count').reset_index())
print(out)
# Output
name date count
0 alice 01/01/2020 2
1 alice 01/02/2020 1
2 alice 01/03/2020 1
3 bob 01/01/2020 1
4 bob 01/03/2020 1
5 bob 01/04/2020 1
6 carl 01/01/2020 1
7 jane 01/03/2020 2
8 jen 01/02/2020 1
9 jim 01/02/2020 4
10 jim 01/03/2020 1
11 john 01/02/2020 1
12 john 01/03/2020 1
13 lily 01/03/2020 1
14 manny 01/03/2020 1
15 mary 01/02/2020 1
16 max 01/03/2020 1
17 mike 01/03/2020 1
18 peter 01/02/2020 1
19 rosa 01/04/2020 1
20 xavier 01/02/2020 1