I have a dataframe which looks like this
df=
Time x y
0 2018-09-13 01:17:00 5.0 0.0
1 2018-09-13 02:17:00 9.0 0.0
2 2018-09-13 03:17:00 2.0 1.0
3 2018-09-13 04:17:00 1.0 0.0
.......
I want to iterate through the whole dataframe and calculate a new variable z.
The value of z would be z = z[prev] + x - y, where z[prev] is the value of z from the previous row (0 for the first row).
For example, the final output would be
Time z
0 2018-09-13 01:17:00 5 #[0 + 5 - 0]
1 2018-09-13 02:17:00 14 #[5 + 9 - 0]
2 2018-09-13 03:17:00 15 #[14 + 2 - 1]
3 2018-09-13 04:17:00 16 #[15 + 1 - 0]
.....
I am finding it difficult to iterate over the time series data.
I have tried the following, but it does not work:
for i, row in df.iterrows():
    z = 0
    row['z'] = row['z'] + row['x'] - row['y']
    print(z)
CodePudding user response:
In your case, use cumsum: z is just the running total of x - y, so no explicit loop is needed.
df['new'] = df.x.sub(df.y).cumsum()
Out[410]:
0 2018-09-13 5.0
1 2018-09-13 14.0
2 2018-09-13 15.0
3 2018-09-13 16.0
dtype: float64
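To check this against the numbers in the question, here is a minimal self-contained sketch. The frame is rebuilt by hand from the four sample rows shown in the question, and the result is stored in a column named z (rather than new) to match the question's wording; both choices are assumptions made for illustration.

```python
import pandas as pd

# Sample data copied from the four rows shown in the question.
df = pd.DataFrame({
    "Time": pd.to_datetime([
        "2018-09-13 01:17:00",
        "2018-09-13 02:17:00",
        "2018-09-13 03:17:00",
        "2018-09-13 04:17:00",
    ]),
    "x": [5.0, 9.0, 2.0, 1.0],
    "y": [0.0, 0.0, 1.0, 0.0],
})

# z[i] = z[i-1] + x[i] - y[i], starting from 0, is the cumulative sum of (x - y).
df["z"] = (df["x"] - df["y"]).cumsum()

print(df[["Time", "z"]])
```

The z column comes out as 5.0, 14.0, 15.0, 16.0, which matches the expected output in the question.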
CodePudding user response:
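You can loop over the row indices:
z = []
for i in range(len(df)):
    if i == 0:
        # first row: there is no previous z, so start from x - y
        z.append(df.loc[i, 'x'] - df.loc[i, 'y'])
    else:
        # later rows: previous z plus this row's x - y
        z.append(z[i-1] + df.loc[i, 'x'] - df.loc[i, 'y'])
df['z'] = z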
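Note that df.loc[i] looks rows up by label, so the loop above relies on the frame having the default 0, 1, 2, ... index. If Time were set as the index instead (an assumption here, the question does not say so), a variant that iterates over the column values directly might look like the sketch below. The variable prev and the starting value of 0 are illustrative choices taken from the question's example.

```python
import pandas as pd

# Same sample values as in the question, but indexed by Time (assumed for illustration).
df = pd.DataFrame(
    {"x": [5.0, 9.0, 2.0, 1.0], "y": [0.0, 0.0, 1.0, 0.0]},
    index=pd.to_datetime([
        "2018-09-13 01:17:00",
        "2018-09-13 02:17:00",
        "2018-09-13 03:17:00",
        "2018-09-13 04:17:00",
    ]),
)
df.index.name = "Time"

z = []
prev = 0.0  # z before the first row, as in the question's example
for x, y in zip(df["x"], df["y"]):
    prev = prev + x - y  # z = z[prev] + x - y
    z.append(prev)

df["z"] = z
print(df["z"])
```

This keeps the explicit loop but does not depend on what the index labels are.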