Unpivot pandas DataFrame partly-CodePudding

I have the following table DataFrame

Items	Description	Store 1 Qty	Store 1 Value	Store 2 Qty	Store 2 Value
item 1	Some item name	5	120	7	240
item 2	Some other item	9	1234	12	98

is there any easy way on Python to unpivot stores only? To make it this way:

Items	Description	Store number	Value	Qty
Item 1	Some item name	Store 1	5	120

I was thinking about unpivoting it completely and then pivoting it back keeping value and qty as values. But I assume, that some more efficient solution should be there

CodePudding user response：

If need Qty, Value to separate columns convert first columns to MultiIndex, so possible use Series.str.rsplit by last space to MultiIndex in columns, so last reshape by DataFrame.stack:

df = df.set_index(['Items','Description'])
df.columns = df.columns.str.rsplit(n=1, expand=True)
df = df.rename_axis(('Store number',None), axis=1).stack(0).reset_index()
print (df)
    Items      Description Store number  Qty  Value
0  item 1   Some item name      Store 1    5    120
1  item 1   Some item name      Store 2    7    240
2  item 2  Some other item      Store 1    9   1234
3  item 2  Some other item      Store 2   12     98

CodePudding user response：

You can also use pd.melt() along with pd.pivot as below:

df_ = pd.melt(df, id_vars=['Items', 'Description'])
df_[['Store number', 'Type']] = df_['variable'].str.rsplit(n=1, expand=True)
df_ = df_.pivot(index=['Items', 'Description', 'Store number'], columns='Type', values='value').reset_index()
print(df_)
Type   Items      Description Store number  Qty  Value
0     item 1   Some item name      Store 1    5    120
1     item 1   Some item name      Store 2    7    240
2     item 2  Some other item      Store 1    9   1234
3     item 2  Some other item      Store 2   12     98

CodePudding user response：

One option is with pd.wide_to_long; first the columns have to be reordered, such that Qty and Value appear at the front:

columns = df.columns.str.rsplit(n=1).str[::-1].str.join(' ')
temp = df.set_axis(columns, axis = 'columns')
(pd.wide_to_long(temp, 
                 stubnames = ['Qty', 'Value'], 
                 i = ['Items', 'Description'], 
                 j = 'Store Number', 
                 sep = ' ', 
                 suffix='. ')
   .reset_index()
)
    Items      Description Store Number  Qty  Value
0  item 1   Some item name      Store 1    5    120
1  item 1   Some item name      Store 2    7    240
2  item 2  Some other item      Store 1    9   1234
3  item 2  Some other item      Store 2   12     98

Below is another possible option:

#pip install git https://github.com/pyjanitor-devs/pyjanitor.git
import pandas as pd
import janitor
df.pivot_longer(index = ['Items', 'Description'], 
                names_to = ('Store Number', '.value'), 
                names_pattern = r"(. \s\d)\s(. )")

    Items      Description Store Number  Qty  Value
0  item 1   Some item name      Store 1    5    120
1  item 2  Some other item      Store 1    9   1234
2  item 1   Some item name      Store 2    7    240
3  item 2  Some other item      Store 2   12     98

This uses the pivot_longer function from pyjanitor.

Explanation: The columns you wish to reshape has a pattern (the store number is followed by either Qty or Value); we take advantage of this in the names_pattern with a regular expression of groups (r"(. \s\d)\s(. )") - the first group points to Store 1/Store2, while the other group points to Qty/Value.

The names_to argument specifies how the new dataframe should look - for this specific case, the .value tells the function to keep the parts of the columns associated with it as header - in this case, the .value is the second entry, so it is paired with the second group in names_pattern; store number in names_to is paired with the first entry in names_pattern