I have the following table DataFrame
Items | Description | Store 1 Qty | Store 1 Value | Store 2 Qty | Store 2 Value |
---|---|---|---|---|---|
item 1 | Some item name | 5 | 120 | 7 | 240 |
item 2 | Some other item | 9 | 1234 | 12 | 98 |
is there any easy way on Python to unpivot stores only? To make it this way:
Items | Description | Store number | Value | Qty |
---|---|---|---|---|
Item 1 | Some item name | Store 1 | 5 | 120 |
I was thinking about unpivoting it completely and then pivoting it back keeping value and qty as values. But I assume, that some more efficient solution should be there
CodePudding user response:
If need Qty, Value
to separate columns convert first columns to MultiIndex
, so possible use Series.str.rsplit
by last space to MultiIndex in columns
, so last reshape by DataFrame.stack
:
df = df.set_index(['Items','Description'])
df.columns = df.columns.str.rsplit(n=1, expand=True)
df = df.rename_axis(('Store number',None), axis=1).stack(0).reset_index()
print (df)
Items Description Store number Qty Value
0 item 1 Some item name Store 1 5 120
1 item 1 Some item name Store 2 7 240
2 item 2 Some other item Store 1 9 1234
3 item 2 Some other item Store 2 12 98
CodePudding user response:
You can also use pd.melt()
along with pd.pivot
as below:
df_ = pd.melt(df, id_vars=['Items', 'Description'])
df_[['Store number', 'Type']] = df_['variable'].str.rsplit(n=1, expand=True)
df_ = df_.pivot(index=['Items', 'Description', 'Store number'], columns='Type', values='value').reset_index()
print(df_)
Type Items Description Store number Qty Value
0 item 1 Some item name Store 1 5 120
1 item 1 Some item name Store 2 7 240
2 item 2 Some other item Store 1 9 1234
3 item 2 Some other item Store 2 12 98
CodePudding user response:
One option is with pd.wide_to_long
; first the columns have to be reordered, such that Qty
and Value
appear at the front:
columns = df.columns.str.rsplit(n=1).str[::-1].str.join(' ')
temp = df.set_axis(columns, axis = 'columns')
(pd.wide_to_long(temp,
stubnames = ['Qty', 'Value'],
i = ['Items', 'Description'],
j = 'Store Number',
sep = ' ',
suffix='. ')
.reset_index()
)
Items Description Store Number Qty Value
0 item 1 Some item name Store 1 5 120
1 item 1 Some item name Store 2 7 240
2 item 2 Some other item Store 1 9 1234
3 item 2 Some other item Store 2 12 98
Below is another possible option:
#pip install git https://github.com/pyjanitor-devs/pyjanitor.git
import pandas as pd
import janitor
df.pivot_longer(index = ['Items', 'Description'],
names_to = ('Store Number', '.value'),
names_pattern = r"(. \s\d)\s(. )")
Items Description Store Number Qty Value
0 item 1 Some item name Store 1 5 120
1 item 2 Some other item Store 1 9 1234
2 item 1 Some item name Store 2 7 240
3 item 2 Some other item Store 2 12 98
This uses the pivot_longer function from pyjanitor.
Explanation: The columns you wish to reshape has a pattern (the store number is followed by either Qty or Value); we take advantage of this in the names_pattern
with a regular expression of groups (r"(. \s\d)\s(. )"
) - the first group points to Store 1/Store2
, while the other group points to Qty/Value
.
The names_to
argument specifies how the new dataframe should look - for this specific case, the .value
tells the function to keep the parts of the columns associated with it as header - in this case, the .value
is the second entry, so it is paired with the second group in names_pattern
; store number
in names_to
is paired with the first entry in names_pattern