I am trying to calculate the mean of consecutive data points (the ith and (i+1)th entries) in a column of data by looping over the column indices with a for loop. The only way I could get the loop to run without an error was to stop one short of the last index, i.e. range(len(df)-1). Is there a Pandas or NumPy way to calculate these means over the whole column without the for loop (it is slow on my full dataset) and without having to subtract one from the length?
Here's my attempt so far, using an extract from my larger dataset:
import pandas as pd
df = pd.read_csv('class_data.dat',sep='\t')
df
Age BMI Gender
0 23.0 17.2 Male
1 25.6 16.3 Female
2 26.4 22.5 Female
3 43.2 33.0 Male
4 22.5 21.8 Male
5 19.4 29.6 Male
6 20.5 34.6 Female
7 22.7 27.2 Female
8 17.5 15.5 Male
BMI_means = {}
for i in range(len(df)-1):
    BMI_means[i] = (df.BMI[i] + df.BMI[i+1])/2
BMI_means
output:
{0: 16.75,
1: 19.4,
2: 27.75,
3: 27.4,
4: 25.700000000000003,
5: 32.1,
6: 30.9,
7: 21.35}
CodePudding user response:
Here is one way:
df['AVG'] = (df['BMI'] + df['BMI'].shift(-1)).div(2)
output:
Age BMI Gender AVG
0 23.0 17.2 Male 16.75
1 25.6 16.3 Female 19.40
2 26.4 22.5 Female 27.75
3 43.2 33.0 Male 27.40
4 22.5 21.8 Male 25.70
5 19.4 29.6 Male 32.10
6 20.5 34.6 Female 30.90
7 22.7 27.2 Female 21.35
8 17.5 15.5 Male NaN
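The NaN in the last row appears because shift(-1) has no following value to pair with the final entry. A minimal sketch of how one might drop that row and check the result against the loop from the question; the BMI values are rebuilt inline from the extract above (an assumption, standing in for reading class_data.dat), and the names pair_means and loop_means are only illustrative:

```python
import pandas as pd

# BMI values copied from the extract in the question
# (built inline here instead of reading class_data.dat)
df = pd.DataFrame({"BMI": [17.2, 16.3, 22.5, 33.0, 21.8, 29.6, 34.6, 27.2, 15.5]})

# shift(-1) moves every value up one row, so row i is paired with row i+1
pair_means = (df["BMI"] + df["BMI"].shift(-1)).div(2)

# The last row has no successor, so its mean is NaN; dropna() removes it
pair_means_clean = pair_means.dropna()

# Reference: the explicit loop from the question
loop_means = [(df["BMI"][i] + df["BMI"][i + 1]) / 2 for i in range(len(df) - 1)]

print(pair_means_clean)
```

After dropna() the Series has one entry fewer than the column, which matches the dictionary produced by the loop.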
CodePudding user response:
Here is another way:
df["BMI"].rolling(2).mean().shift(-1)
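Since the question also asks about NumPy, a possible sketch with plain array slicing is shown below. It assumes the BMI column has been pulled out as a NumPy array (for example with df["BMI"].to_numpy()); the values are typed inline here from the extract in the question, and the variable names are illustrative only:

```python
import numpy as np

# BMI values from the extract in the question
bmi = np.array([17.2, 16.3, 22.5, 33.0, 21.8, 29.6, 34.6, 27.2, 15.5])

# bmi[:-1] is every value except the last, bmi[1:] is every value except the first,
# so adding them pairs element i with element i+1
pair_means = (bmi[:-1] + bmi[1:]) / 2

print(pair_means)
```

The result has len(bmi) - 1 entries, so unlike the shift and rolling versions there is no trailing NaN to deal with, but it also cannot be assigned directly as a new column of the same length.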