Adding column to DataFrame based on index in list

Time:02-15

I have the DataFrame and list below:

import pandas as pd

data = [['tom', 10], ['nick', 15], ['juli', 14],
        ['test', 14], ['test1', 12], ['test1', 14]]
df = pd.DataFrame(data, columns=['Name', 'Age'])
>>> df
    Name  Age
0    tom   10
1   nick   15
2   juli   14
3   test   14
4  test1   12
5  test1   14
index_list=[['test1','juli'],['nick'],['tom','test']]
>>> index_list
[['test1', 'juli'], ['nick'], ['tom', 'test']]

I would like to add a cluster_id column to the DataFrame, based on the index of the sublist in index_list that contains each Name. The output should look like this:

>>> df
    Name  Age cluster_id
0    tom   10          2
1   nick   15          1
2   juli   14          0
3   test   14          2
4  test1   12          0
5  test1   14          0

Answer:

You can convert index_list into a dictionary that maps each name to its cluster id (the index of the sublist containing it) with a dict comprehension, then map that dictionary onto the "Name" column:

index_dic = {name: i for i, sublist in enumerate(index_list) for name in sublist}
df['cluster_id'] = df['Name'].map(index_dic)

Output:

    Name  Age  cluster_id
0    tom   10           2
1   nick   15           1
2   juli   14           0
3   test   14           2
4  test1   12           0
5  test1   14           0
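
One thing to watch with this approach: any name that does not appear in index_list maps to NaN, which turns the column into floats. Below is a minimal sketch of one way to handle that. The extra 'alice' row and the -1 placeholder are assumptions added for illustration; they are not part of the original question.

```python
import pandas as pd

data = [['tom', 10], ['nick', 15], ['juli', 14],
        ['test', 14], ['test1', 12], ['test1', 14],
        ['alice', 20]]  # hypothetical extra row: 'alice' is in no sublist
df = pd.DataFrame(data, columns=['Name', 'Age'])
index_list = [['test1', 'juli'], ['nick'], ['tom', 'test']]

# Map each name to the position of the sublist that contains it.
index_dic = {name: i for i, sublist in enumerate(index_list) for name in sublist}

# Names missing from index_dic come back as NaN, so this Series is float.
mapped = df['Name'].map(index_dic)

# Assumption: -1 is an acceptable marker for "no cluster".
df['cluster_id'] = mapped.fillna(-1).astype(int)
print(df)
```

If a name appears in more than one sublist, the dict comprehension keeps the last index it sees, so the later sublist wins.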