Home > Blockchain >  Create a new df from two columns where one of them contains multiple values
Create a new df from two columns where one of them contains multiple values

Time:06-12

i have a df that contains some columns like the following table.

A B C
a 1,3,5 id_1
b 2,5,7 id_2
c 8,13,18 id_3

What I want is to extract the values from Column B into a new df with the Values from column C like this:

id_col val_col
id_1 1
id_1 3
id_1 5
id_2 2
id_2 5
id_2 7
id_3 8
id_3 13
id_3 18

Maybe my search terms are not specific enough, cause its not my native language, but I just don't found a solution yet.

Thanks for help.

CodePudding user response:

I think this work for your problem:

import pandas as pd

data={
    'A': ['a', 'b','c'],
    'B': [[1,3,5], [2,5,7],[8,13,18]],
    'C':['id_1','id_2','id_3']
     }
data_frame=pd.DataFrame(data=data)
df_result=data_frame.apply(lambda r: pd.Series({'id_col':r['C'],
                                      'val_col':r['B'],
                                     }),axis=1).explode('val_col')
print(df_result)

CodePudding user response:

Let's try this:

df.assign(B=df['B'].str.split(',')).explode('B')[['B', 'C']]

Output:

    B     C
0   1  id_1
0   3  id_1
0   5  id_1
1   2  id_2
1   5  id_2
1   7  id_2
2   8  id_3
2  13  id_3
2  18  id_3
  • Related