Home > Back-end >  Create heatmap by grouping two columns
Create heatmap by grouping two columns

Time:10-09

I want to create a heatmap array using plt.imshow(). But I have some doubts about how to transform my dataframe and fill the 7x24 array. My goal is to count all the visits that have the same day and hour:

visits = [
   (1,13),
   (2,1),
   (1,13),
   (1,13),
   (2,18),
   (5,3),
   (6,1),
   (5,3),
   (1,2),
   (2,16),
   (2,16),
   (2,18),
   (4,7)
]

expected_df = [
    (1,13,3),
    (2,1,1),
    (2,18,2),
    (5,3,2),
    (1,2,1),
    (2,16,2),
    (4,7,1),
    (6,1,1)
]

visits = pd.DataFrame(visits , columns = ['day','hour'])
expected_df = pd.DataFrame(expected_df , columns = ['day','hour', 'count'])

I'm new to pandas so i don't really know how to transform the visits dataframe to expected_df dataframe and create an array 7x24 with the expected_df dataframe, in order to create a heatmap with plt.imshow(). I hope my question is clear enough.

CodePudding user response:

For the first part, you want to group by multiple columns (and lets reset the index and give it a name):

expected_df = visits.groupby(['day', 'hour']).size().reset_index(name='count')

For the second part, plotting, i assume from context, that no data means no visits, so 0 on the heatmap

heat = np.zeros((7,24))
heat[expected_df['day'], expected_df['hour']] = expected_df['count']
plt.imshow(heat)
plt.show()

CodePudding user response:

Method:1

First put the result into a map for the count operation on the list of tuple

counts = list({x: visits.count(x) for x in visits})
# {(1, 13): 3, (2, 1): 1, (2, 18): 2, (5, 3): 2, (6, 1): 1, (1, 2): 1, (2, 16): 2, (4, 7): 1}

and then convert that into a list

expected_df = list((x, y, counts[(x, y)]) for x, y in counts)
# [(1, 13, 3), (2, 1, 1), (2, 18, 2), (5, 3, 2), (6, 1, 1), (1, 2, 1), (2, 16, 2), (4, 7, 1)]

Method: 2

an easier solution will be just

list(set((x, y, visits.count((x, y))) for x, y in visits))
  • Related