Home > Enterprise >  To create a new column to assign a new ID using loop or numpy, where
To create a new column to assign a new ID using loop or numpy, where

Time:08-16

I'm a little new to Python. I am working on movement data. I have ID, location, speed, time, and destination columns. My dataframe looks like this:

shipid destination lon lat speed time
508040000 Brunei 139.356 34.172 13 2016/01/12 16:44:03
508040000 Brunei 139.356 34.173 12.9 2016/01/12 16:44:03
508040000 Brunei 139.781 35.245 12.7 2016/01/12 16:44:03
508040000 Busan 139.358 34.172 15.8 2016/03/01 16:09:36
508040000 Busan 139.359 34.173 15.6 2016/03/01 16:10:06
508040000 Busan 139.362 34.177 15.9 2016/03/01 16:11:07
431000792 Tokyo 139.673 34.842 11 2016/04/01 22:17:51
431000792 Tokyo 139.683 34.914 11 2016/04/01 22:41:11
431000792 Tokyo 139.684 34.926 10.3 2016/04/01 22:45:20
431000792 Aomi 140.760717 40.8771 0.1 2016/05/03 22:03:02
431000792 Aomi 140.759967 40.877033 0 2016/05/03 23:54:19
431000792 Aomi 140.760117 40.876917 0 2016/05/04 0:06:00

I want to loop or iterate through ID and destination to create a new trip ID column.

shipid destination lat lon speed time tripid
508040000 Brunei 139.356 34.172 13 2016/01/12 16:44:03 1
508040000 Brunei 139.356 34.173 12.9 2016/01/12 16:44:03 1
508040000 Brunei 139.781 35.245 12.7 2016/01/12 16:44:03 1
508040000 Busan 139.358 34.172 15.8 2016/03/01 16:09:36 2
508040000 Busan 139.359 34.173 15.6 2016/03/01 16:10:06 2
508040000 Busan 139.362 34.177 15.9 2016/03/01 16:11:07 2
431000792 Tokyo 139.673 34.842 11 2016/04/01 22:17:51 1
431000792 Tokyo 139.683 34.914 11 2016/04/01 22:41:11 1
431000792 Tokyo 139.684 34.926 10.3 2016/04/01 22:45:20 1
431000792 Aomi 140.760717 40.8771 0.1 2016/05/03 22:03:02 2
431000792 Aomi 140.759967 40.877033 0 2016/05/03 23:54:19 2
431000792 Aomi 140.760117 40.876917 0 2016/05/04 0:06:00 2

So I wrote as follows

for shipid in df.shipid.unique():
    tripid=1
    for destination in df.destination.unique():
        df['tripid'] = np.where(df['destination']==df['destination'].shift(-1),tripid, tripid 1)

The code creates a new column tripid where every row is assigned "1", except for the last row of each destination, where "2" is assigned.

I want to assign a unique ID to each ship with the same destination as 1, 2, 3.

I am not sure of what to use, either loop or iteration, etc. I tried to use it for a loop and the error shows the series is ambiguous.

CodePudding user response:

Does this solve your problem?

import pandas as pd

# Example dataframe:
df = pd.DataFrame(
    {'shipid': [1, 1, 1, 1, 1, 2, 2],
    'destination': ['Brunei', 'Brunei', 'Brunei', 'Busan', 'Busan', 'Tokyo', 'Tokyo']}
)

df['tripid'] = df.groupby(['destination']).cumcount()   1
shipid destination tripid
1 Brunei 1
1 Brunei 2
1 Brunei 3
1 Busan 1
1 Busan 2
2 Tokyo 1
2 Tokyo 2

You can also group by both shipid and destination:

df['tripid'] = df.groupby(['shipid', 'destination']).cumcount()   1

Or maybe you need this?

https://stackoverflow.com/a/51329888/14627505

df['tripid'] = df.groupby(['shipid', 'destination']).ngroup()   1
shipid destination tripid
1 Brunei 1
1 Brunei 1
1 Brunei 1
1 Busan 2
1 Busan 2
2 Tokyo 3
2 Tokyo 3
  • Related