Home > Software engineering >  How can I separate a column in new columns
How can I separate a column in new columns

Time:10-20

I don't know how to ask this question, but i'll try do explain my case.

I have a dataset with the data as following:

Product Value Value type year
A 21,5 Price 21
A 5 Volume 21
B 55,3 Price 21
B 10 Volume 21
C 70,0 Price 21
D 37,5 Price 21
D 7,7 Volume 21

And I want to reach something like that:

Product Price Volume Year
A 21,5 5 21
B 55,3 10 21
c 70,0 - 21
D 37,0 7,7 21

I mind that the unstack function can solve the problem, but i don't know how, cause i'm not getting all the columns back.

I found a complex solution but it's not working.

container = []
for label, _df in df.groupby(['Year','Product']):
  _df.set_index('Value type', inplace = True)
  container.append(pd.DataFrame({
                        "Product": [label[1]],
                        "Price":[_df.loc['Price', 'Value']],
                        "Volume": [_df.loc['Volume', 'Value']],
                        "Year":[label[0]]}))

df_new = pd.concat(container)

This solution doesn't work, because the missing line for Volume for product C.

How can I reach the expected dataframe? Is there any fast way to calculate this?

CodePudding user response:

Use pivot:

out = df.pivot(index=['Product', 'year'], columns='Value type', values=['Value']) \
        .droplevel(0, axis=1).reset_index().rename_axis(None, axis=1) \
        [['Product', 'Price', 'Volume', 'year']]
>>> out
  Product  Price  Volume  year
0       A   21.5     5.0    21
1       B   55.3    10.0    21
2       C   70.0     NaN    21
3       D   37.5     7.7    21
  • Related