Home > Enterprise >  Unpivot pandas DataFrame partly
Unpivot pandas DataFrame partly

Time:11-16

I have the following table DataFrame

Items Description Store 1 Qty Store 1 Value Store 2 Qty Store 2 Value
item 1 Some item name 5 120 7 240
item 2 Some other item 9 1234 12 98

is there any easy way on Python to unpivot stores only? To make it this way:

Items Description Store number Value Qty
Item 1 Some item name Store 1 5 120

I was thinking about unpivoting it completely and then pivoting it back keeping value and qty as values. But I assume, that some more efficient solution should be there

CodePudding user response:

If need Qty, Value to separate columns convert first columns to MultiIndex, so possible use Series.str.rsplit by last space to MultiIndex in columns, so last reshape by DataFrame.stack:

df = df.set_index(['Items','Description'])
df.columns = df.columns.str.rsplit(n=1, expand=True)
df = df.rename_axis(('Store number',None), axis=1).stack(0).reset_index()
print (df)
    Items      Description Store number  Qty  Value
0  item 1   Some item name      Store 1    5    120
1  item 1   Some item name      Store 2    7    240
2  item 2  Some other item      Store 1    9   1234
3  item 2  Some other item      Store 2   12     98

CodePudding user response:

You can also use pd.melt() along with pd.pivot as below:

df_ = pd.melt(df, id_vars=['Items', 'Description'])
df_[['Store number', 'Type']] = df_['variable'].str.rsplit(n=1, expand=True)
df_ = df_.pivot(index=['Items', 'Description', 'Store number'], columns='Type', values='value').reset_index()
print(df_)
Type   Items      Description Store number  Qty  Value
0     item 1   Some item name      Store 1    5    120
1     item 1   Some item name      Store 2    7    240
2     item 2  Some other item      Store 1    9   1234
3     item 2  Some other item      Store 2   12     98

CodePudding user response:

One option is with pd.wide_to_long; first the columns have to be reordered, such that Qty and Value appear at the front:

columns = df.columns.str.rsplit(n=1).str[::-1].str.join(' ')
temp = df.set_axis(columns, axis = 'columns')
(pd.wide_to_long(temp, 
                 stubnames = ['Qty', 'Value'], 
                 i = ['Items', 'Description'], 
                 j = 'Store Number', 
                 sep = ' ', 
                 suffix='. ')
   .reset_index()
)
    Items      Description Store Number  Qty  Value
0  item 1   Some item name      Store 1    5    120
1  item 1   Some item name      Store 2    7    240
2  item 2  Some other item      Store 1    9   1234
3  item 2  Some other item      Store 2   12     98

Below is another possible option:

#pip install git https://github.com/pyjanitor-devs/pyjanitor.git
import pandas as pd
import janitor
df.pivot_longer(index = ['Items', 'Description'], 
                names_to = ('Store Number', '.value'), 
                names_pattern = r"(. \s\d)\s(. )")

    Items      Description Store Number  Qty  Value
0  item 1   Some item name      Store 1    5    120
1  item 2  Some other item      Store 1    9   1234
2  item 1   Some item name      Store 2    7    240
3  item 2  Some other item      Store 2   12     98

This uses the pivot_longer function from pyjanitor.

Explanation: The columns you wish to reshape has a pattern (the store number is followed by either Qty or Value); we take advantage of this in the names_pattern with a regular expression of groups (r"(. \s\d)\s(. )") - the first group points to Store 1/Store2, while the other group points to Qty/Value.

The names_to argument specifies how the new dataframe should look - for this specific case, the .value tells the function to keep the parts of the columns associated with it as header - in this case, the .value is the second entry, so it is paired with the second group in names_pattern; store number in names_to is paired with the first entry in names_pattern

  • Related