Home > Blockchain >  calculate arithematic mean of rows above
calculate arithematic mean of rows above

Time:02-11

I have a dataset that looks like this:

Value         Type       mean
-1.975767     Weather   
-0.540979     Fruits
-2.359127     Fruits
-2.815604     Corona
-0.929755     Weather

I want to iterate through each row and calculate a mean value for each row above (only if the Type matches). I want to put this value in the mean column. Mean is calculated by:

sum of all values / number of observations

Here, number of observations will be the number of times a Type has occurred so far.

For example, in the first row, there's no "weather" row above so for weather n = 1. So the mean would be -1.975767 / 1 = -1.975767.

In the second row, there's no FRUITS row above it, so the mean will just be -0.540979/1 = -0.540979.

However, in the third row, when we scan all previous rows, we see that FRUITS has already occurred before this and hence, n = 2 for Fruits. So we should get the last's FRUIT's value and calculate a new mean. So here, the mean will be -0.540979 (-2.359127) divided by 2.

Value         Type       mean
-1.975767     Weather   -1.975767
-0.540979     Fruits    -0.540979
-2.359127     Fruits    (-0.540979 -2.359127)  / 2
-2.815604     Corona    -2.815604
-0.929755     Weather   (-1.975767 -0.929755) / 2

What would be an efficient way to do this?

I see two possible sols:

  1. We somehow store the value of N for each type and then just use the last calculated mean for a particular type, update the N by incrementing it with 1 and then calculate the new mean for a particular row.
  2. If storing the N value is not convinient, then every time we scan for the occurrence of a type, we check its corresponding value and calculate the mean of all values again.

CodePudding user response:

Try using groupby expanding mean:

df['mean'] = df.groupby('Type')['Value'].expanding().mean().droplevel(0)

Output:

>>> df
      Value     Type      mean
0 -1.975767  Weather -1.975767
1 -0.540979   Fruits -0.540979
2 -2.359127   Fruits -1.450053
3 -2.815604   Corona -2.815604
4 -0.929755  Weather -1.452761
  • Related