Is it possible to apply a calculation to a DF by ID. For example if multiple records share a similar ID make that calculation but do it throughout the DF?
df['Total Volume'] = df.groupby('ID')[df['VOLUME'].shift(-1) df['ADDED'] - df['VOLUME']]
So I want to create the column "Total Volume" using the calculation
df['VOLUME'].shift(-1) df['ADDED'] - df['VOLUME']
I've had some success with this logic:
df['Total Volume'] = df['VOLUME'].shift(-1) df['ADDED'] - df['VOLUME']
But it doesn't take into account for the ID and the first record is always a total from another ID.
Ultimately the column should be blank or empty for the first record with every new ID.
CodePudding user response:
IIUC, you want to restrict the calculations to each "ID", right? Then you could use groupby
shift
on "VOLUME" and do everything else the same.
df['TOTAL VOLUME'] = df.groupby('ID')['VOLUME'].shift(-1) df['ADDED'] - df['VOLUME']
Note that shift(-1)
moves the next record up, so the end result is the last record in each "ID" is empty. Since you say:
Ultimately the column should be blank or empty for the first record with every new ID
I think you want shift(1)
instead (by default it's 1):
df['TOTAL VOLUME'] = df.groupby('ID')['VOLUME'].shift() df['ADDED'] - df['VOLUME']
Then again, the above code can be written even simpler using groupby
diff
:
df['TOTAL VOLUME'] = df['ADDED'] - df.groupby('ID')['VOLUME'].diff()