Home > database >  Transforming in Pandas GroupBy
Transforming in Pandas GroupBy

Time:06-10

Index E_Id P_Id Date
121 701 9002 2021
122 701 9001 2019
123 702 9002 2021
124 702 9002 2019
125 703 9001 2021
126 704 9002 2019
127 704 9003 2019

Now I want to Create another DataFrame groupedby E_Id But I want to Count the number of rows against each P_Id call it 'x', and then sum of 'x' for whoever is linked with each E_Id

So

E_Id TotalOfPIds
701 6

Can anyone help me?

As an intermediary step, I did this:

data['_Pid_Total'] = data.groupby('P_Id')[['P_Id']].transform('count')

And then for single E_Id it works like this:

data.loc[data['E_Id'] == '701', ['E_Id', 'P_Id', '_Pid_Total']].groupby(['E_Id', 'P_Id']).first().sum() returning a single integer. However I want to use this in Transform method or just do it for entire DataFrame.

CodePudding user response:

If I understand you correctly, you want to total sum of P_Ids found in each group:

out = (
    df.groupby("E_Id")
    .apply(lambda x: df["P_Id"].isin(x["P_Id"]).sum())
    .to_frame(name="TotalOfPIds")
    .reset_index()
)
print(out)

Prints:

   E_Id  TotalOfPIds
0   701            6
1   702            4
2   703            2
3   704            5

CodePudding user response:

Thank you for an interesting question. My approach is:

  1. You creates a df1 with unique combination of E_id and P_id
  2. Create df2 to count number of unique occur of P_id
  3. Merge df1 and df2, then use groupby with sum

I modify your sample a bit

# Dataframe:
df = pd.DataFrame({'e':[701, 701,701,701, 701, 701,702,702,703,704, 704, 704],
                   'p':[9002, 9002, 9002, 9002, 9001, 9003, 9002, 9002, 9001, 9002, 9003, 9001]})

# Step 1: Unique combination
df1 = df.groupby(['e', 'p']).count().reset_index()

# Step 2: Count Number of occurs for each unique value P_id
df2 = df.groupby('p').count().reset_index().rename(columns={'e':'val'})

# Step 3: Merge then use groupby and sum
df3 = pd.merge(df1, df2, on=['p'], how='left')
df3.groupby('e')[['val']].sum().reset_index()
  • Related