I have a long row that contains all numbers. The data are all associated with a week so Week 1 has a data point, Week 2 has a point and so on. I need to take the average of the data every 3 weeks, skip 2 weeks and then continue taking the average. This is a short sample of my original list:
import pandas as pd
import numpy as np
data = {'Name': ['Week 1', 'Week 2', 'Week 3', 'Week 4', 'Week 5',
'Week 6', 'Week 7', 'Week 8', 'Week 9', 'Week 10', 'Week 11', 'Week 12'],
'Value': [1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1,2]
}
df = pd.DataFrame(data)
avg_goodweeks = np.mean(np.concatenate((df.values[1:3], df.values[4:6], df.values[7:9],
df.values[10:12])))
avg_badweeks = np.mean(np.concatenate((df.values[3], df.values[6], df.values[9],
df.values[12])))
So the average of this would ideally be the values of week 1, 2, 3, 6, 7, 8, 11, 12, but I would like to be able to set up either a range or block of 2 weeks or something of that order to read through the row and repeat the 3 weeks on 2 weeks off pattern. I will take the average of all the skipped weeks as well, but I figured that would be the same code. I realize the .values[1:3] is not correct, that is old code from when I hard coded data from an excel sheet. The point of this code is to remove hard coded data from excel, have all data on python and automatically sort through using a user defined pattern/range to average values while skipping others. Any help or ideas will be very useful.
CodePudding user response:
IIUC, you can use iloc here and get the mean of Value column:
bad = df.iloc[3::3]['Value']
print(bad.mean())
1.6666666666666667
good = df.drop(bad.index)['Value']
print(good.mean())
1.4444444444444444
CodePudding user response:
avg_goodweeks = np.array([df.values[::3, 1],
df.values[1::3, 1]]).mean(axis=0)
avg_badweeks = df.values[2::3, 1].mean()