pandas groupby concatenation based on a condition

Time:12-03

I have a dataframe like the one below, and I am trying to join the names in each run of rows that ends with a non-empty class:

Name   class score
kumar   ""    ""
ram     10    14
ravi    ""    ""
tej     ""    ""
om      12    15

My desired output is:

Name      class score
kumarram    10    14
ravitejom   12    15

I tried grouping by class and assigning the result as a new series, but the lengths do not match.

My code:
    df['g1'] = df['class'].ne("").cumsum()
    df.loc[df["class"].ne(""), "Name"] = df.groupby("g1").apply(lambda x: " ".join(x["Name"].values))

CodePudding user response:

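You are right to find the blocks with cumsum on the negated condition. With a forward cumsum, though, the counter increases on the row that closes a block, so that row gets grouped with the names after it instead of the names before it. Reverse the series before the cumsum so that the blocks are counted from the bottom up: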

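# A block ends at each row with a non-empty score; empty cells are assumed
# to be empty strings, as in the question's own code.
# Reversing before cumsum numbers the blocks from the bottom up.
blocks = df['score'].ne('')[::-1].cumsum()
df.groupby(blocks).agg({
    'Name': ''.join,     # concatenate the names in each block
    'class': 'last',     # the last row of a block holds class and score
    'score': 'last'
})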

Output:

            Name class score
score                       
1      ravitejom    12    15
2       kumarram    10    14
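
Note that the groups above come out in bottom-up order, because groupby sorts by the block number. If you want the rows in the original order, here is a minimal self-contained sketch of the same idea. It assumes the empty cells are empty strings (not NaN), and the names `is_end`, `blocks` and `result` are only illustrative:

```python
import pandas as pd

# Sample data from the question; empty cells are assumed to be empty strings.
df = pd.DataFrame({
    "Name": ["kumar", "ram", "ravi", "tej", "om"],
    "class": ["", 10, "", "", 12],
    "score": ["", 14, "", "", 15],
})

# True on the rows that close a block (the ones carrying class and score).
is_end = df["class"].ne("")

# Reversed cumsum: every row gets the number of its closing row, counted
# from the bottom. groupby aligns this Series with df by index, so it does
# not need to be reversed back. It is renamed to avoid clashing with the
# existing "class" column.
blocks = is_end[::-1].cumsum().rename("block")

result = (
    df.groupby(blocks, sort=False)   # sort=False keeps first-appearance order
      .agg({"Name": "".join, "class": "last", "score": "last"})
      .reset_index(drop=True)        # drop the helper block numbers
)
print(result)
```

With this data, `result` has two rows: `kumarram` with class 10 and score 14, then `ravitejom` with class 12 and score 15. One caveat: if the frame ended with rows that have an empty class, they would form their own block numbered 0 with an empty class and score, so you may want to filter that block out.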