Why does the groupby command in Pandas produce non-existent IDs?

Time:08-20

I use the pandas groupby command on my DataFrame like this:

df.groupby('courier_id').type_of_vehicle.size()

but this code produces some 'courier_id' values that are not in my DataFrame:

courier_id
00aecd42-472f-11ec-94e0-77812be296a5    4
011da6a6-eb0b-11ec-97e1-179dc13cdf87    1
0140f63c-02e0-11ed-b314-9b2e7e4f7e5c    1
0188d572-7228-11ec-ab3b-07d470cb404d    7
01cef7ba-e32e-11ec-bb21-67c7079055d4    0
                                       ..
c98fc418-7b51-11ec-a81c-77139d6dd889    0
d98a4b9a-d056-11ec-9e3c-0b80c11ec04b    1
dae54c80-d1f8-11ec-bbb0-b71d7b2c4e1a    1
f7925664-0ac1-11ed-ab40-df16023f78cb    0
f857cb84-371c-11ec-9af6-ffeaeea4b0f1    4
Name: type_of_vehicle, Length: 268, dtype: int64

I checked with '01cef7ba-e32e-11ec-bb21-67c7079055d4' in df.courier_id.values and the result was False.

I used df.groupby('courier_id').get_group('01cef7ba-e32e-11ec-bb21-67c7079055d4') and it raised a KeyError, but when I iterate over the groupby object in a for loop, it returns an empty DataFrame for that ID.

Note: when I slice my DataFrame as new_df = df[['courier_id', 'type_of_vehicle']], the result is correct!

CodePudding user response:

If you provide some reproducible code/data it would be appreciated. That way we can provide you the best possible answer.

However, I think the problem is due to the following:

When you use groupby(), the original courier_id column becomes the index of the result. Try .reset_index() and your problem should be solved.

df.groupby('courier_id').type_of_vehicle.size().reset_index()
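To illustrate what this answer describes, here is a minimal sketch on made-up data (the IDs, vehicle types, and the n_rows column name are hypothetical, not from the question). It only shows how the group key moves from the index back into a column. It does not show whether this changes which IDs appear.

```python
import pandas as pd

# Hypothetical toy data; the values are invented for illustration.
df = pd.DataFrame({
    "courier_id": ["a", "a", "b"],
    "type_of_vehicle": ["bike", "car", "bike"],
})

# After groupby(...).size(), courier_id is the index of a Series.
sizes = df.groupby("courier_id").type_of_vehicle.size()

# reset_index turns that index back into an ordinary column;
# name= labels the count column (n_rows is an arbitrary choice).
flat = sizes.reset_index(name="n_rows")
print(flat)
```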

CodePudding user response:

I found the cause: I had used the sample method to take some instances from my data, and something in the backend of the program caused this problem.
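One possible explanation, which the thread does not confirm, is that courier_id has a categorical dtype. A subset taken with sample keeps every category of the original column, and groupby can then report the unused categories with a size of 0. The sketch below reproduces that behaviour on made-up data. It uses a fixed filter as a stand-in for sample so the outcome is deterministic, and all names and values are hypothetical.

```python
import pandas as pd

# Hypothetical data; ASSUMPTION: courier_id is stored as a categorical column.
full = pd.DataFrame({
    "courier_id": pd.Categorical(["a", "a", "b", "c"]),
    "type_of_vehicle": ["bike", "car", "bike", "car"],
})

# Stand-in for df.sample(...): a subset that happens to contain no rows for "c".
df = full[full["courier_id"] != "c"]

# "c" has no rows left, yet it is still a category of the column.
has_c = bool((df["courier_id"] == "c").any())

# observed=False keeps unused categories, so "c" shows up with size 0.
counts_all = df.groupby("courier_id", observed=False).type_of_vehicle.size()

# observed=True reports only the IDs that actually occur in the data.
counts_seen = df.groupby("courier_id", observed=True).type_of_vehicle.size()

# Alternative: drop the unused categories before grouping.
cleaned = df.assign(courier_id=df["courier_id"].cat.remove_unused_categories())
counts_clean = cleaned.groupby("courier_id", observed=False).type_of_vehicle.size()

print(counts_all)
print(counts_seen)
```

If df.courier_id.dtype is not category, this explanation does not apply and the cause lies elsewhere.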
