Home > Software design >  Python: Rows to Column in Pandas Dataframe
Python: Rows to Column in Pandas Dataframe

Time:07-07

I have following dataframe:

data.head()

Out:

     metric_name            metric_date     warehouse   value          week  year   day
0   Crossdock Transfer Out  2022-05-10      WR1        1.370313e 06     19  2022    2
21  New Vendor Freight      2022-04-19      WR1        1.583337e 06     16  2022    2
59  Crossdock Transfer Out  2021-12-26      WR1        3.805000e 03     51  2021    0
80  New Vendor Freight      2021-12-30      WR1        2.832327e 06     52  2021    4
90  Crossdock Transfer In   2022-05-22      WR1        0.000000e 00     20  2022    0
...     ...     ...     ...     ...     ...     ...     ...
127699  LF Forecasted New Crossdock Transfer Out    2021-11-01  WR1     2.595843e 06    44  2021    1

There are different metrics at column "metric_name" with its own value. To have a better dataset for time series forecasting I want to convert my Dataframe. I want every single metric in "metric_name" as new column in the dataset.

See here all different metrics in "metric_name":

data.metric_name.value_counts()

Out:

LF Forecasted End Vendor Freight   End Transfer In Backlog    364
LF Forecasted New Crossdock Transfer Out                      364
LF Forecasted New Vendor Freight                              364
LF Forecasted New Crossdock Transfer In                       364
Forecasted New Crossdock Transfer Out                         359
Forecasted End Vendor Freight   End Transfer In Backlog       359
Forecasted New Crossdock Transfer In                          359
Forecasted New Vendor Freight                                 359
Crossdock Transfer Out                                        345
New Vendor Freight                                            345
Crossdock Transfer In                                         345
End Vendor Freight   End Transfer In Backlog                  345
Name: metric_name, dtype: int64

What can I do to solve this?

I tried this, but it is not giving me the values, just the counts:

bhx4.groupby('metric_date')['metric_name'].value_counts().unstack()

metric_name     Crossdock Transfer In   Crossdock Transfer Out  End Vendor Freight   End Transfer In Backlog    Forecasted End Vendor Freight   End Transfer In Backlog     Forecasted New Crossdock Transfer In    Forecasted New Crossdock Transfer Out   Forecasted New Vendor Freight   LF Forecasted End Vendor Freight   End Transfer In Backlog  LF Forecasted New Crossdock Transfer In     LF Forecasted New Crossdock Transfer Out    LF Forecasted New Vendor Freight    New Vendor Freight
 metric_date                                                
 2021-07-25     1.0     1.0     1.0     1.0     1.0     1.0     1.0         1.0     1.0     1.0     1.0     1.0
 2021-07-26     1.0     1.0     1.0     1.0     1.0     1.0     1.0     1.0     1.0     1.0     1.0     1.0
 2021-07-27     1.0     1.0     1.0     1.0     1.0     1.0     1.0     1.0     1.0     1.0     1.0     1.0
 2021-07-28     1.0     1.0     1.0     1.0     1.0     1.0     1.0     1.0     1.0     1.0     1.0     1.0
 2021-07-29     1.0     1.0     1.0     1.0     1.0     1.0     1.0     1.0     1.0     1.0     1.0     1.0

I want the DF to look something like this: enter image description here Thanks!

CodePudding user response:

If I got you right:

data.groupby(['metric_date', 'metric_name'])['value'].sum().unstack().fillna(0)
  • Related