Home > Enterprise >  how can I un-groupby my dataframe in pandas?
how can I un-groupby my dataframe in pandas?

Time:06-10

I have a dataset like this:

project id phase 1 phase 2 phase 3
112 29 157 213
113 36 121 23

how can I get it transformed to a dataset like this:

project id phase days
112 1 29
112 2 157
112 3 213
112 cumulative 399
113 1 36
113 2 121
113 3 23
113 cumulative 180

thanks!

CodePudding user response:

here is another way to do it

calculate the cumulative by row, then melt it. finally replacing out the text phase

df['cumulative'] = df[['phase1','phase2','phase3']].sum(axis=1)
df=df.melt(id_vars='projectid', var_name='phase', value_name='days').sort_values('projectid')
df['phase'] = df['phase'].replace(r'phase','', regex=True)
df
    projectid   phase   days
0   112             1   29
2   112             2   157
4   112             3   213
6   112     cumulative  399
1   113             1   36
3   113             2   121
5   113             3   23
7   113     cumulative  180

CodePudding user response:

Try:

df = df.set_index("project id")
df = (
    df.assign(cumulative=df.sum(axis=1))
    .stack()
    .to_frame(name="days")
    .reset_index()
    .rename(columns={"level_1": "phase"})
)
df["phase"] = df["phase"].str.replace("phase ", "")
print(df)

Prints:

   project id       phase  days
0         112           1    29
1         112           2   157
2         112           3   213
3         112  cumulative   399
4         113           1    36
5         113           2   121
6         113           3    23
7         113  cumulative   180
  • Related