How to loop over a pandas dataframe in subsets of data (like group by)

Time:03-30

I have a pandas dataframe that, after sorting, looks like the one below (think of a few people working shifts in a shop):

A   B   C   D 
1   1   1   Anna
2   3   1   Anna
3   1   2   Anna
4   3   2   Tom
5   3   2   Tom
6   3   2   Tom
7   3   2   Tom
8   1   1   Anna
9   3   1   Anna
10   1   2   Tom
...

I want to loop over the dataframe, split it into sub-dataframes, and call another function of mine on each one, e.g.:

first subset df would be

A   B   C   D 
1   1   1   Anna
2   3   1   Anna
3   1   2   Anna

second subset df would be

4   3   2   Tom
5   3   2   Tom
6   3   2   Tom
7   3   2   Tom

third subset df would be

8   1   1   Anna
9   3   1   Anna

Is there a good way to loop over the main dataframe and split it?

for x in some_magic_here:
    sub_df = some_magic_here_too()
    my_fun(sub_df)

Thanks!

CodePudding user response:

You need to loop over a groupby object whose groups are runs of consecutive equal D values. Build the group key by comparing D with its shifted values for inequality, then taking the cumulative sum:

# i is the run number (1, 2, 3, ...); sub_df holds the rows of that run
for i, sub_df in df.groupby(df.D.ne(df.D.shift()).cumsum()):
    print(sub_df)
    my_fun(sub_df)
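
To see why this works, it can help to build the key step by step on the sample rows from the question. The sketch below is only an illustration: the names `starts` and `run_id` and the body of `my_fun` are made up here (the question does not show what `my_fun` does), and the dataframe contains just the ten rows listed above.

```python
import pandas as pd

# The ten sample rows from the question.
df = pd.DataFrame({
    "A": [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
    "B": [1, 3, 1, 3, 3, 3, 3, 1, 3, 1],
    "C": [1, 1, 2, 2, 2, 2, 2, 1, 1, 2],
    "D": ["Anna", "Anna", "Anna", "Tom", "Tom", "Tom", "Tom", "Anna", "Anna", "Tom"],
})

# True on every row where D differs from the previous row.
# The first row is compared against a missing value, so it is also True.
starts = df["D"].ne(df["D"].shift())

# Summing the True values cumulatively gives each run its own number.
run_id = starts.cumsum()
print(run_id.tolist())  # [1, 1, 1, 2, 2, 2, 2, 3, 3, 4]


def my_fun(sub_df):
    # Hypothetical stand-in for the asker's function:
    # report who the run belongs to and how many rows it has.
    return sub_df["D"].iloc[0], len(sub_df)


# Group by the run number instead of by D itself,
# so Anna's two separate runs stay separate.
results = [my_fun(sub_df) for _, sub_df in df.groupby(run_id)]
print(results)  # [('Anna', 3), ('Tom', 4), ('Anna', 2), ('Tom', 1)]
```

A plain `df.groupby("D")` would instead merge all of Anna's rows into one group and all of Tom's into another, which is why the key is the run number rather than the name.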