Cumulative average in Python

I'm working with CSV files.

I'd like to create a continuously updated (running) average of a sequence. For example:

For each position in a list, I'd like to output the average of all values up to and including that position.

List: [a, b, c, d, e, f]
Formula:

(a)/1 = ?

(a + b)/2 = ?

(a + b + c)/3 = ?

(a + b + c + d)/4 = ?

(a + b + c + d + e)/5 = ?

(a + b + c + d + e + f)/6 = ?

To demonstrate:

If I have the list [1, 4, 7, 4, 19],

my output should be [1, 2.5, 4, 4, 7].

Explained:

(1)/1 = 1

(1 + 4)/2 = 2.5

(1 + 4 + 7)/3 = 4

(1 + 4 + 7 + 4)/4 = 4

(1 + 4 + 7 + 4 + 19)/5 = 7

As for my Python file, the code is simple:

import matplotlib.pyplot as plt
import pandas as pd

df = pd.read_csv('somecsvfile.csv')

# x has to be a list from 1 to however many rows are in the "numbers" column, i.e. simply [1, 2, 3, 4, 5], etc.
# x will be used to divide the running sums of y to give us z
x = []
y = df['numbers']
z = # new data derived from the running average of y

plt.plot(x, z)
plt.show()

If numpy is needed that is no problem.

CodePudding user response：

You can use np.cumsum to get the cumulative sum and then divide by the number of elements summed so far to get the running average.

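import numpy as np

x = np.array([1, 4, 7, 4, 19])
# divide each cumulative sum by the count of elements it covers: 1, 2, ..., len(x)
z = np.cumsum(x) / np.arange(1, len(x) + 1)
print(z)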

output:

[1.  2.5 4.  4.  7. ]
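For comparison, the same cumulative-sum-then-divide idea can be sketched with only the standard library. This is a minimal illustration, not part of the answer above; the helper name running_mean is hypothetical.

```python
from itertools import accumulate


def running_mean(values):
    """Return the cumulative average of `values` as a list of floats.

    `running_mean` is a hypothetical helper name used only for this sketch.
    """
    # accumulate() yields the running totals; enumerate(start=1) supplies
    # how many elements each total covers.
    return [total / count
            for count, total in enumerate(accumulate(values), start=1)]


z = running_mean([1, 4, 7, 4, 19])
print(z)  # [1.0, 2.5, 4.0, 4.0, 7.0]
```

This avoids the NumPy dependency, though for large columns the vectorized np.cumsum version is likely to be faster.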

CodePudding user response：

pandas.DataFrame.expanding is what you need.

https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.expanding.html

With it, you can just call df.expanding().mean() to get the result you want.

mean = df.expanding().mean()

print(mean)

Out[10]: 
0   1.0
1   2.5
2   4.0
3   4.0
4   7.0

If you want to do this for just one column, call it on that column instead of on df, e.g. df['column_name'].expanding().mean()
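Putting this together with the variables from the question, a rough sketch might look like the following. It assumes the column is named "numbers" and builds a small DataFrame inline as a stand-in for the CSV file, so the sample data is illustrative only.

```python
import pandas as pd

# Stand-in for pd.read_csv('somecsvfile.csv'); the column name "numbers"
# and the sample values are assumptions taken from the question's example.
df = pd.DataFrame({"numbers": [1, 4, 7, 4, 19]})

y = df["numbers"]
z = y.expanding().mean()         # running average of y, one value per row
x = list(range(1, len(y) + 1))   # 1, 2, 3, ... one entry per row

print(z.tolist())  # [1.0, 2.5, 4.0, 4.0, 7.0]
```

From here, plt.plot(x, z) followed by plt.show() should draw the running average as in the question's script. Note that x is only needed for the plot's horizontal axis; expanding() already handles the division by the row count.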