Get total number of x column value occurrences based on corresponding frequency y,z column value pai

Time:12-30

I have a large data set of over 1,000,000 rows. I want to count how often each ('xCordAdjusted', 'yCordAdjusted') pairing occurs for each 'event' type: 'SHOT', 'MISS', and 'GOAL'.

'xCordAdjusted' has a minimum value of 0 and maximum of 100, 'yCordAdjusted' has a minimum value of -44 and a maximum of 44.

dff.head()

season  event  xCordAdjusted  yCordAdjusted
2020    SHOT   74             -29
2020    SHOT   49             -25
2020    SHOT   52              31
2020    SHOT   43              39
2020    MISS   46             -33

I want to see how frequently each coordinate results in each of the three possible 'event' values: 'SHOT', 'MISS', 'GOAL'. It doesn't have to be exact; I just want to be able to perform further analysis on the totals for each 'event' given the frequency of its x, y coordinates.

Desired output:

xCordAdjusted   yCordAdjusted   event   total
 100               -44          SHOT    500,xxx
                                MISS    500,xxx
                                GOAL    500,xxx
 99                -44          SHOT    500,xxx
                                MISS    500,xxx
                                GOAL    500,xxx                 
                        

CodePudding user response:

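Since you are looking to count the number of each type of event by the x and y coordinates, you can use groupby and size. size counts the rows in each group, whereas sum would add up the values of the remaining 'season' column:

dff.groupby(['xCordAdjusted','yCordAdjusted','event']).size()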
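
As a minimal sketch of counting rows per coordinate and event, the snippet below builds a small made-up frame with the same column names as the question (the values are illustrative only, not taken from the real data set) and counts each (x, y, event) combination with size(). The names totals and wide are hypothetical, chosen just for this example.

```python
import pandas as pd

# Toy data with the question's column names; the values are invented for illustration.
dff = pd.DataFrame({
    "season": [2020] * 6,
    "event": ["SHOT", "SHOT", "MISS", "GOAL", "SHOT", "MISS"],
    "xCordAdjusted": [74, 74, 74, 49, 49, 49],
    "yCordAdjusted": [-29, -29, -29, -25, -25, -25],
})

keys = ["xCordAdjusted", "yCordAdjusted", "event"]

# size() counts the rows in each group; reset_index(name=...) turns the
# resulting Series into a flat DataFrame with a 'total' column.
totals = dff.groupby(keys).size().reset_index(name="total")

# Optional wide layout: one row per coordinate, one column per event type,
# with 0 where an event never occurred at that coordinate.
wide = dff.groupby(keys).size().unstack("event", fill_value=0)

print(totals)
print(wide)
```

Since the counts do not have to be exact per coordinate, one option is to coarsen the coordinates before grouping, for example by grouping on dff["xCordAdjusted"] // 5 and dff["yCordAdjusted"] // 5 instead of the raw columns; the bin width of 5 is an arbitrary assumption.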