Python/Pandas/Numpy question: how to stack/merge three columns while keeping the fourth column names

Time:10-21

I want to stack (combine) three columns into a single column without losing their associated fourth column. The result should also have an additional column recording which original column each value came from.

I want to go from

to

CodePudding user response:

You can try this, though there may well be a better solution out there!

I worked with the following data (out of laziness, I did not copy the exact values from your example):

import pandas as pd
df = pd.DataFrame({"Index": [1, 2, 3, 4, 5], "apples": [1, 2, 3, 4, 5], "bananas": [6, 7, 8, 9, 10], "strawberries": [11, 12, 13, 14, 15], "colors": ["blue", "green", "red", "yellow", "purple"]})

df

Index   apples  bananas     strawberries    colors
1       1       6           11              blue
2       2       7           12              green
3       3       8           13              red
4       4       9           14              yellow
5       5       10          15              purple

I did the following :

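fruits = ["apples", "bananas", "strawberries"]
frames = []
for fruit in fruits:
    # copy the slice so that adding a column does not raise SettingWithCopyWarning
    temp_df = df[[fruit, "colors"]].copy()
    temp_df["fruits"] = fruit
    temp_df.columns = ["fruit values", "color", "fruits"]
    frames.append(temp_df)
# DataFrame.append was removed in pandas 2.0, so concatenate the pieces instead
new_df = pd.concat(frames)
# a stable sort keeps the apples/bananas/strawberries order within each color
new_df = new_df.sort_values("color", kind="stable")
new_df = new_df.reset_index(drop=True)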

Resulting in:

new_df

    fruit values    color   fruits
0   1               blue    apples
1   6               blue    bananas
2   11              blue    strawberries
3   2               green   apples
4   7               green   bananas
5   12              green   strawberries
6   5               purple  apples
7   10              purple  bananas
8   15              purple  strawberries
9   3               red     apples
10  8               red     bananas
11  13              red     strawberries
12  4               yellow  apples
13  9               yellow  bananas
14  14              yellow  strawberries

CodePudding user response:

You can try this:

import pandas as pd

# create original dataframe
df = pd.DataFrame()
df['apples']=[4.63,24.3,5.24,5.255,9.4]
df['bananas']=[6.57,7.366,2.3,4.9,7.3]
df['strawberries']=[26.2,5.39,8.5,9.2,3.4]
df['color']=['Blue','Green','Red','Yellow','Purple']

# unpivot dataframe
df2 = pd.melt(df, 
              id_vars='color', 
              value_vars=list(df.columns[:-1]), # list of fruits
              var_name='fruit', 
              value_name='fruit values')
df2

Resulting in:

     color         fruit  fruit values
0     Blue        apples         4.630
1    Green        apples        24.300
2      Red        apples         5.240
3   Yellow        apples         5.255
4   Purple        apples         9.400
5     Blue       bananas         6.570
6    Green       bananas         7.366
7      Red       bananas         2.300
8   Yellow       bananas         4.900
9   Purple       bananas         7.300
10    Blue  strawberries        26.200
11   Green  strawberries         5.390
12     Red  strawberries         8.500
13  Yellow  strawberries         9.200
14  Purple  strawberries         3.400
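
If you would rather have the melted rows grouped by color, as in the first answer's output, one option is to sort after melting. This is only a minimal sketch that reuses the data from this answer; the name tidy is just for illustration. value_vars is left out here, in which case melt unpivots every column not listed in id_vars.

```python
import pandas as pd

# same data as in the answer above
df = pd.DataFrame({
    "apples": [4.63, 24.3, 5.24, 5.255, 9.4],
    "bananas": [6.57, 7.366, 2.3, 4.9, 7.3],
    "strawberries": [26.2, 5.39, 8.5, 9.2, 3.4],
    "color": ["Blue", "Green", "Red", "Yellow", "Purple"],
})

tidy = (
    # unpivot: every column except "color" becomes a (fruit, value) pair
    df.melt(id_vars="color", var_name="fruit", value_name="fruit values")
      # stable sort keeps the fruit order produced by melt within each color
      .sort_values("color", kind="stable")
      .reset_index(drop=True)
)
print(tidy)
```

The rows come out in alphabetical color order (Blue, Green, Purple, Red, Yellow), each with its apples, bananas and strawberries values in that order.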