How to separate out a dataframe by ID?

Time:08-02

I have a df which has object ID as the index, and then x values and y values in the columns, giving coordinates for where the object moved over time. For example:

id    x    y 
1    100  400
1    110  390
1    115  385
2    110  380
2    115  380
3    200  570
3    210  580

I would like to calculate the change in x and the change in y for each object, so I can see its direction (e.g. north-east) and how linear or non-linear each route is. I can then filter out objects moving in a way I am not interested in.

How do I create a loop that iterates over each object (i.e. each ID) separately? For example, a loop over range(len(df)) would go through every row in the dataframe and would not distinguish between IDs.

Thank you

CodePudding user response:

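# If id is the index, turn it back into a regular column first:
df = df.reset_index()

# Group by id and take the row-to-row difference within each group.
# The first row of each group has no previous row, so its change is NaN:
df[['chngX', 'chngY']] = df.groupby('id')[['x', 'y']].diff()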
print(df)

Output:

   id    x    y  chngX  chngY
0   1  100  400    NaN    NaN
1   1  110  390   10.0  -10.0
2   1  115  385    5.0   -5.0
3   2  110  380    NaN    NaN
4   2  115  380    5.0    0.0
5   3  200  570    NaN    NaN
6   3  210  580   10.0   10.0
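
If an explicit loop over each ID is still wanted, for instance to label the direction and straightness of each route, iterating over the groupby object is one way to do it. The following is only a minimal sketch, not part of the original answer. It assumes that x increases eastward and y increases northward, which the question does not state. The helper summarise_route, the COMPASS list, and the straightness measure (net displacement divided by total path length) are illustrative choices.

```python
import numpy as np
import pandas as pd

# Sample data from the question, with id as a regular column.
df = pd.DataFrame({
    "id": [1, 1, 1, 2, 2, 3, 3],
    "x": [100, 110, 115, 110, 115, 200, 210],
    "y": [400, 390, 385, 380, 380, 570, 580],
})

# Assumption: x grows eastward and y grows northward.
# Sectors are listed counter-clockwise starting from east (angle 0).
COMPASS = ["E", "NE", "N", "NW", "W", "SW", "S", "SE"]


def summarise_route(group):
    """Hypothetical helper: net direction and straightness of one route."""
    # Step-by-step changes; the first row has no predecessor, so drop it.
    steps = group[["x", "y"]].diff().dropna()

    # Net displacement from the first to the last recorded position.
    net_dx = group["x"].iloc[-1] - group["x"].iloc[0]
    net_dy = group["y"].iloc[-1] - group["y"].iloc[0]

    # Total distance travelled versus straight-line distance.
    path_length = np.hypot(steps["x"], steps["y"]).sum()
    net_length = np.hypot(net_dx, net_dy)

    # Angle in degrees, counter-clockwise from east, mapped to 8 sectors.
    angle = np.degrees(np.arctan2(net_dy, net_dx)) % 360
    direction = COMPASS[int(round(angle / 45)) % 8]

    # 1.0 means a perfectly straight route; smaller values mean more wandering.
    straightness = net_length / path_length if path_length > 0 else np.nan
    return {"direction": direction, "straightness": straightness}


# groupby yields one (id, sub-dataframe) pair per object.
rows = []
for obj_id, group in df.groupby("id"):
    rows.append({"id": obj_id, **summarise_route(group)})

summary = pd.DataFrame(rows).set_index("id")
print(summary)
```

With the sample data, object 1 comes out as SE, object 2 as E and object 3 as NE, each with a straightness of about 1.0 because every sample route is a straight line. Filtering is then a matter of selecting rows of summary, e.g. summary[summary["direction"] == "NE"]. If y increases downward in the data (as in image coordinates), the north and south labels would need to be swapped.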