Pandas: Count Higher Ranks For Current Experiment Participants In Later Experiments

Time:09-23

Learning Experiments

In a series of learning experiments, I would like to count, for each experiment, the number of participants who improved their rank in a subsequent experiment (rank 1 is the highest). I would also like to count, for each experiment, the number of participants who subsequently reached the top rank.

Here is a short, sanitized version of the learning experiment CSV file that I have loaded into a pandas DataFrame (df_learning).

Experiment Subject Rank
A Alpha 1
A Bravo 2
A Charlie 3
A Delta 4
A Echo 5
B Alpha 1
B Charlie 2
B Echo 3
B Foxtrot 4
B Golf 5
B India 6
B Juliet 7
C Juliet 1
C Bravo 2
C Charlie 3

How can I do this in pandas?

Answer:

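You can use groupby.cummax to track each subject's worst (highest-numbered) rank so far, then boolean indexing to keep the rows where the current rank is better than that running maximum. This assumes the rows are in chronological order of experiment, and that df is the DataFrame from the question (df_learning):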

# True where a subject's rank is lower (better) than their worst rank so far
m = df['Rank'].sub(df.groupby('Subject')['Rank'].cummax()).lt(0)

improved_rank = df.loc[m, 'Subject'].unique()

output: ['Charlie', 'Echo', 'Juliet']

# of the improved rows, keep only those where the subject reached rank 1
reached_top_rank = df.loc[m & df['Rank'].eq(1), 'Subject'].unique()

output: ['Juliet']
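
The lists above name the subjects who improved, but the question asks for counts per experiment. One possible way to get those is a self-merge on Subject rather than cummax; the following is a rough sketch, not the only approach. It assumes that experiment labels sort chronologically (A before B before C), that "reached the top rank" means improving to rank 1 in a later experiment, and it rebuilds the sample data inline so it runs on its own. The names pairs, improved, improved_counts and top_counts are made up for illustration.

```python
import pandas as pd

# Sample data from the question, rebuilt inline so the sketch is self-contained.
df = pd.DataFrame({
    'Experiment': list('AAAAA') + list('BBBBBBB') + list('CCC'),
    'Subject': ['Alpha', 'Bravo', 'Charlie', 'Delta', 'Echo',
                'Alpha', 'Charlie', 'Echo', 'Foxtrot', 'Golf', 'India', 'Juliet',
                'Juliet', 'Bravo', 'Charlie'],
    'Rank': [1, 2, 3, 4, 5, 1, 2, 3, 4, 5, 6, 7, 1, 2, 3],
})

# Pair every appearance of a subject with every other appearance of the same subject.
pairs = df.merge(df, on='Subject', suffixes=('', '_later'))

# Keep only pairs where the second appearance is in a later experiment
# (assumption: experiment labels compare in chronological order).
pairs = pairs[pairs['Experiment_later'] > pairs['Experiment']]

# A lower rank number in the later experiment counts as an improvement.
improved = pairs[pairs['Rank_later'] < pairs['Rank']]

experiments = df['Experiment'].unique()

# Number of distinct subjects per experiment who later improved.
improved_counts = (improved.groupby('Experiment')['Subject']
                   .nunique()
                   .reindex(experiments, fill_value=0))

# Number of distinct subjects per experiment who later improved to rank 1.
top_counts = (improved[improved['Rank_later'].eq(1)]
              .groupby('Experiment')['Subject']
              .nunique()
              .reindex(experiments, fill_value=0))

print(improved_counts.to_dict())  # {'A': 2, 'B': 1, 'C': 0}
print(top_counts.to_dict())       # {'A': 0, 'B': 1, 'C': 0}
```

On the sample data, Charlie and Echo improve after experiment A and Juliet improves after experiment B. The self-merge grows with the square of the number of appearances per subject, so for a large file a reversed cummin per subject may be a better fit.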
