I have a dataset in CSV format, imported as a 2D array:
game_id, player_score, rounds_played
0, 56, 30
0, 44, 30
1, 77, 26
1, 34, 26
2, 36, 23
2, 31, 23
In the next step, I need to create a new array in which player_score is scaled relative to rounds_played in each game, so that the score alone gives an estimate of how good the performance was.
In a for loop, I've done something similar: I calculated the "weight" of each score by multiplying it by the rounds played and dividing by the maximum rounds_played value in the set, which is 30 here:
weight = (player_score * rounds_played) / 30
new_score = weight
Then I would append new_score to a new array.
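A minimal sketch of the loop described above, with the sample rows hard-coded. The names rows, max_rounds, and new_scores are assumptions for illustration, and the maximum is computed from the data instead of being written as the literal 30:

```python
# Rows of (game_id, player_score, rounds_played), taken from the sample data.
rows = [
    (0, 56, 30),
    (0, 44, 30),
    (1, 77, 26),
    (1, 34, 26),
    (2, 36, 23),
    (2, 31, 23),
]

# Largest rounds_played value in the set (30 for this sample).
max_rounds = max(row[2] for row in rows)

new_scores = []
for game_id, player_score, rounds_played in rows:
    # Scale the score by rounds played relative to the maximum.
    weight = (player_score * rounds_played) / max_rounds
    new_score = weight
    new_scores.append(new_score)

print(new_scores)
```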
This can get quite slow for a larger 2D array. Is there a shorter, more direct way (maybe in NumPy) to create a new array with corrected scores?
CodePudding user response:
You can do this easily and quickly in pandas, thanks to its vectorized column operations:
df['new_score'] = (df.player_score * df.rounds_played) / 30
Result df:
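To reproduce this on the sample data from the question, here is a minimal self-contained sketch. It assumes the column names match the CSV header, builds the DataFrame inline instead of reading a file, and takes the maximum from the data instead of hard-coding 30. The last part shows the same calculation on a plain NumPy 2D array, since the question mentions NumPy; the names data and new_score_np are illustrative.

```python
import numpy as np
import pandas as pd

# Sample data from the question, built inline (assumption: in practice
# this would come from reading the CSV file).
df = pd.DataFrame({
    "game_id": [0, 0, 1, 1, 2, 2],
    "player_score": [56, 44, 77, 34, 36, 31],
    "rounds_played": [30, 30, 26, 26, 23, 23],
})

# Maximum rounds_played in the set (30 for this sample).
max_rounds = df["rounds_played"].max()

# Vectorized: operates on whole columns at once, no Python-level loop.
df["new_score"] = (df["player_score"] * df["rounds_played"]) / max_rounds
print(df)

# Same calculation on a plain NumPy 2D array.
# Column 1 is player_score, column 2 is rounds_played.
data = df[["game_id", "player_score", "rounds_played"]].to_numpy()
new_score_np = data[:, 1] * data[:, 2] / data[:, 2].max()
print(new_score_np)
```

Both versions avoid the explicit loop, so the per-row work happens inside NumPy's compiled routines.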