How to subtract the last value from the first value for the same time

Time:07-15

I have a CSV file that looks like this:

Time OpenIA
2022-07-15 10:00:23 1
2022-07-15 10:01:11 3
2022-07-15 10:01:11 2
2022-07-15 10:01:11 1
2022-07-15 10:01:11 3
2022-07-15 10:01:11 1
2022-07-15 10:01:33 1
2022-07-15 10:01:33 2

I'm trying to subtract the last value from the first value for each identical timestamp, so that the result looks something like this:

Time OpenIA
2022-07-15 10:00:23 0
2022-07-15 10:01:11 2
2022-07-15 10:01:33 -1

To do this, I use the following code:

df = pd.read_csv(DF, usecols=['Time', 'OpenIA'])
df['Time'] = pd.to_datetime(df['Time'])
df['Time'] = df['Time'].dt.ceil("S", 0)
b = df.drop_duplicates(subset=['Time'], keep='last') - df.drop_duplicates(subset=['Time'], keep='first')

But instead of the expected result I get

Time OpenIA
0 days 0.0
0 days 0.0
0 days 0.0

CodePudding user response:

You can use groupby.first/last:

g = df.groupby('Time', sort=False)
out = (g.first()-g.last()).reset_index()

output:

                  Time  OpenIA
0  2022-07-15 10:00:23       0
1  2022-07-15 10:01:11       2
2  2022-07-15 10:01:33      -1
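
For reference, here is a minimal self-contained sketch of the approach above. It also shows the likely reason the drop_duplicates attempt fails: DataFrame subtraction aligns rows by index label, and the "first" and "last" rows of a group usually have different labels. The comma-separated sample and the names raw, first and last are assumptions made for illustration; the real file layout may differ.

```python
import io
import pandas as pd

# Sample data from the question, rewritten as comma-separated text
# (an assumption; the original file's delimiter is not shown).
raw = """Time,OpenIA
2022-07-15 10:00:23,1
2022-07-15 10:01:11,3
2022-07-15 10:01:11,2
2022-07-15 10:01:11,1
2022-07-15 10:01:11,3
2022-07-15 10:01:11,1
2022-07-15 10:01:33,1
2022-07-15 10:01:33,2
"""
df = pd.read_csv(io.StringIO(raw), parse_dates=["Time"])

# The two deduplicated frames keep different row labels, so subtracting
# them only lines up where the labels happen to coincide.
first = df.drop_duplicates(subset=["Time"], keep="first")
last = df.drop_duplicates(subset=["Time"], keep="last")
print(list(first.index), list(last.index))

# Grouping by Time sidesteps the alignment problem: both results are
# indexed by Time, so the subtraction matches group to group.
g = df.groupby("Time", sort=False)["OpenIA"]
out = (g.first() - g.last()).reset_index()
print(out)
```

Selecting the OpenIA column before calling first() and last() keeps the subtraction limited to the numeric column.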

CodePudding user response:

Try this, using .iloc for positional access within each group (first value minus last value):

df.groupby('Time').agg(diff=('OpenIA', lambda x: x.iloc[0] - x.iloc[-1]))
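
A small runnable sketch of the same idea, under the assumption that "first minus last" is the wanted direction (as in the expected output of the question). Inside the lambda each group is a Series that keeps its original row labels, so positional access is done with .iloc here; the name s and the inline sample frame are illustrative only.

```python
import pandas as pd

# Sample data from the question, built inline for illustration.
df = pd.DataFrame({
    "Time": pd.to_datetime([
        "2022-07-15 10:00:23",
        "2022-07-15 10:01:11", "2022-07-15 10:01:11", "2022-07-15 10:01:11",
        "2022-07-15 10:01:11", "2022-07-15 10:01:11",
        "2022-07-15 10:01:33", "2022-07-15 10:01:33",
    ]),
    "OpenIA": [1, 3, 2, 1, 3, 1, 1, 2],
})

# Named aggregation: the output column "diff" holds first - last per group.
# .iloc[0] / .iloc[-1] pick by position, regardless of the row labels.
out = (
    df.groupby("Time", sort=False)
      .agg(diff=("OpenIA", lambda s: s.iloc[0] - s.iloc[-1]))
      .reset_index()
)
print(out)
```

Swapping the two .iloc terms gives "last minus first" instead, if that is what is needed.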