I want to create a heatmap array using plt.imshow(). But I have some doubts about how to transform my dataframe and fill the 7x24 array. My goal is to count all the visits that have the same day and hour:
visits = [
(1,13),
(2,1),
(1,13),
(1,13),
(2,18),
(5,3),
(6,1),
(5,3),
(1,2),
(2,16),
(2,16),
(2,18),
(4,7)
]
expected_df = [
(1,13,3),
(2,1,1),
(2,18,2),
(5,3,2),
(1,2,1),
(2,16,2),
(4,7,1),
(6,1,1)
]
visits = pd.DataFrame(visits , columns = ['day','hour'])
expected_df = pd.DataFrame(expected_df , columns = ['day','hour', 'count'])
I'm new to pandas so i don't really know how to transform the visits dataframe to expected_df dataframe and create an array 7x24 with the expected_df dataframe, in order to create a heatmap with plt.imshow(). I hope my question is clear enough.
CodePudding user response:
For the first part, you want to group by multiple columns (and lets reset the index and give it a name):
expected_df = visits.groupby(['day', 'hour']).size().reset_index(name='count')
For the second part, plotting, i assume from context, that no data means no visits, so 0 on the heatmap
heat = np.zeros((7,24))
heat[expected_df['day'], expected_df['hour']] = expected_df['count']
plt.imshow(heat)
plt.show()
CodePudding user response:
Method:1
First put the result into a map for the count operation on the list of tuple
counts = list({x: visits.count(x) for x in visits})
# {(1, 13): 3, (2, 1): 1, (2, 18): 2, (5, 3): 2, (6, 1): 1, (1, 2): 1, (2, 16): 2, (4, 7): 1}
and then convert that into a list
expected_df = list((x, y, counts[(x, y)]) for x, y in counts)
# [(1, 13, 3), (2, 1, 1), (2, 18, 2), (5, 3, 2), (6, 1, 1), (1, 2, 1), (2, 16, 2), (4, 7, 1)]
Method: 2
an easier solution will be just
list(set((x, y, visits.count((x, y))) for x, y in visits))