Pandas transform list values and their column names

Time:11-10

I have a pandas DataFrame with one row. Each column name is a category path whose levels are separated by " > ", and each cell holds a list of items:

car > audi > a4          car > bmw > 3er          moto > bmw > gs
[item1, item2, item3]    [item1, item4, item5]    [item6]

I would like to turn it into a structure like this, with one row per item and one column per category level:

item    category 1    category 2    category 3
item1   car           audi          a4
item1   car           bmw           3er
item2   car           audi          a4
item3   car           audi          a4
item4   car           bmw           3er
item5   car           bmw           3er
item6   moto          bmw           gs

What is the best solution? Thank you

CodePudding user response:

You can use pandas' built-in explode method (available on both DataFrame and Series), which turns each element of a list-like cell into its own row.

Docs: link

Also, please provide a reproducible example.
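To illustrate what explode does on its own, here is a minimal sketch. The toy DataFrame and its column names (group, items) are made up for this example and are not the asker's data.

```python
import pandas as pd

# Hypothetical toy data: one ordinary column and one list-valued column
toy = pd.DataFrame({'group': ['a', 'b'],
                    'items': [['x', 'y'], ['z']]})

# Each list element becomes its own row; values in the other columns are repeated.
# ignore_index=True renumbers the rows 0..n-1 instead of repeating the old index.
exploded = toy.explode('items', ignore_index=True)
print(exploded)
```

The row for group 'a' appears twice afterwards, once per element of its list.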

CodePudding user response:

You can use:

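# Split each column name on '>' into a MultiIndex, take the single row as a
# Series, explode its lists, then move the index levels into columns.
(df.set_axis(df.columns.str.split(r'\s*>\s*', expand=True), axis=1)
   .loc[0].explode()
   .reset_index(name='item')
   .rename(columns=lambda x: x.replace('level_', 'category'))
)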

Output:

  category0 category1 category2   item
0       car      audi        a4  item1
1       car      audi        a4  item2
2       car      audi        a4  item3
3       car       bmw       3er  item1
4       car       bmw       3er  item4
5       car       bmw       3er  item5
6      moto       bmw        gs  item6

Used input:

import pandas as pd

df = pd.DataFrame({'car > audi > a4': [['item1', 'item2', 'item3']],
                   'car > bmw > 3er': [['item1', 'item4', 'item5']],
                   'moto > bmw > gs': [['item6']]})
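
If the layout from the question is wanted exactly (item first, categories numbered from 1, rows ordered by item), one possible variant is sketched below. It is an alternative, not the answer's method: it explodes first and splits the path afterwards. The names path, cats and result are chosen here for illustration only.

```python
import pandas as pd

df = pd.DataFrame({'car > audi > a4': [['item1', 'item2', 'item3']],
                   'car > bmw > 3er': [['item1', 'item4', 'item5']],
                   'moto > bmw > gs': [['item6']]})

# Take the single row as a Series (index = column names), explode the lists,
# and turn the index into a regular 'path' column next to 'item'.
long = df.loc[0].explode().rename_axis('path').reset_index(name='item')

# Split 'car > audi > a4' into one column per level, numbered from 1.
cats = long['path'].str.split(r'\s*>\s*', expand=True)
cats.columns = [f'category {i}' for i in range(1, cats.shape[1] + 1)]

# Put 'item' first and order rows by item; a stable sort keeps the original
# column order among rows that share the same item.
result = (pd.concat([long[['item']], cats], axis=1)
            .sort_values('item', kind='stable')
            .reset_index(drop=True))
print(result)
```

Note that sorting is lexicographic on the item strings, so with ten or more items a name such as item10 would sort before item2.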