I have a Csv file that looks something like this
Time | Count | Operation |
---|---|---|
10:01:00 | 2 | Up |
10:01:00 | 5 | Down |
10:01:00 | 1 | Down |
10:01:00 | 2 | Up |
10:01:00 | 1 | Up |
10:02:00 | 3 | Down |
10:02:00 | 2 | Up |
10:02:00 | 5 | Down |
I want to add up the values of the count column for each minute by the id of the operations column and then for the same minute subtract the up and down sums from each other which should give me something like this
Sum()
:
Time | Count | Operation |
---|---|---|
10:01:00 | 5 | Up |
10:01:00 | 6 | Down |
10:02:00 | 2 | Up |
10:02:00 | 8 | Down |
Diff()
:
Time | Delta |
---|---|
10:01:00 | 1 |
10:02:00 | 6 |
To do this, I try something like
def Delta_Volume():
df = pd.read_csv(Ex_Csv, usecols=['Time','Count','Operation'], parse_dates=[0])
df['Time'] = df['Time'].dt.floor("T", 0).dt.time
df1 = df.groupby('Operation').sum('Count')
df2 = df.groupby('Operation').diff('Count')
#df['Delt_of_row'] = df.loc[1 : 3,['Count' , 'Operation']].sum(axis = 1)
#df['Delt_of_row'] = df.loc[1 : 3,['Count' , 'Operation']].diff(axis = 1)
print(df1)
But it doesn't work the way I need unfortunately
CodePudding user response:
You can start with a classical GroupBy.sum
, then use the MultiIndex to compute the difference:
df2 = df.groupby(['Operation', 'Time']).sum()
print(df2)
out = df2.loc['Down']-df2.loc['Up']
print(out)
output:
# groupby.sum
Count
Operation Time
Down 10:01:00 6
10:02:00 8
Up 10:01:00 5
10:02:00 2
# difference
Count
Time
10:01:00 1
10:02:00 6