Home > Back-end >  Melt multiple columns in one
Melt multiple columns in one

Time:07-28

I would like to melt my pandas dataframe from this structure...

Name Color year 2020 Color year 2019 Size year 2020 Size year 2019
A red red big small
B red blue big medium

... to this one.

Name Year Color Size
A 2020 red big
B 2020 red big
A 2019 red small
B 2019 blue medium

I tried to melt with

df_melted = df.melt(
    id_vars=['Name', 'Color year 2020', 'Color year 2019'],
    value_vars=['Size year 2020', 'Size year 2019'],
    var_name='Year',
    value_name='Size')

df_final = df_melted.melt(
    id_vars=['Name', 'Year'],
    value_vars=['Color year 2020', 'Color year 2019'],
    var_name='Year',
    value_name='Color')

But it doesn't seem right... Is it possible to do it in one way?

CodePudding user response:

You can check with wide_to_long

out = pd.wide_to_long(df, ['Color year ','Size year '],
                  i = 'Name', j ='Year',suffix = '\w ')
out.columns = out.columns.str.split().str[0]
out.reset_index(inplace=True)
out
Out[151]: 
  Name  Year Color    Size
0    A  2020   red     big
1    B  2020   red     big
2    A  2019   red   small
3    B  2019  blue  medium

CodePudding user response:

Another version of pd.wide_to_long:

out =  pd.wide_to_long(df.rename(columns=lambda x: x.replace(' year ', '')), 
                       ['Color', 'Size'], 'Name', 'Year').reset_index()
print(out)

# Output
  Name  Year Color    Size
0    A  2020   red     big
1    B  2020   red     big
2    A  2019   red   small
3    B  2019  blue  medium

CodePudding user response:

One option is with pivot_longer from pyjanitor:

# pip install pyjanitor
import pandas as pd
import janitor

df.pivot_longer(index = 'Name', 
                names_to = ('.value', 'year'), 
                names_sep = ' year ')

  Name  year Color    Size
0    A  2020   red     big
1    B  2020   red     big
2    A  2019   red   small
3    B  2019  blue  medium

In the above solution, the .value determines which parts of the column labels remain as headers - the labels are split apart with the names_sep.

  • Related