I have a dataframe like below which contains 4 columns. I want to convert each unique inventory item number (Z15, Z17 and so on) under the "inv" column into new columns with "info" value corresponds to each store and period. Transpose function does not work in this situation. Also, if I use pivot_table or groupby function, I won't be able to get the value for "High", "Medium" and so on.
Be noted, for the "info" column, I have many different combination of categorical values along with numerical values in real dataset. Also in the real dataset, I have over 100 stores, over 400 inventory items and 30 periods. This is a simplified version of the data to demonstrate my idea. Any suggestion or advice are greatly appericated.
import pandas as pd
import numpy as np
inv = ['Z15','Z15','Z15','Z15','Z15','Z15','Z15','Z15','Z15','Z17','Z17','Z17','Z17','Z17','Z17','Z17']
store = ['store1','store1','store1','store2','store2','store2','store2','store2','store2','store3','store4','store5','store6','store7','store1','store2']
period = [2018,2019,2020,2015,2016,2017,2018,2019,2020,2022,2022,2022,2022,2022,2018,2019]
info = ['0.84773','0.8487','0.82254','0.75','0.65','0.432','0.546','0.777','0.1','High','High','Medium','Very Low','Low','High','Low']
df = pd.DataFrame({'inv':inv,
'store':store,
'period':period,
'info':info})
Data looks like this:
The desired output will be like this :
CodePudding user response:
You're looking for pivot
:
df.pivot(index = ['store', 'period'], columns='inv' ,values = 'info').reset_index()
Output:
inv store period Z15 Z17
0 store1 2018 0.84773 High
1 store1 2019 0.8487 NaN
2 store1 2020 0.82254 NaN
3 store2 2015 0.75 NaN
4 store2 2016 0.65 NaN
5 store2 2017 0.432 NaN
6 store2 2018 0.546 NaN
7 store2 2019 0.777 Low
8 store2 2020 0.1 NaN
9 store3 2022 NaN High
10 store4 2022 NaN High
11 store5 2022 NaN Medium
12 store6 2022 NaN Very Low
13 store7 2022 NaN Low