Pandas binary table from column based on index

Time:12-21

How can I binarize a dataset according to the index? E.g.

                   A          B          C
idUser                                 
3                  1          1          1
2                  0          1          0
4                  1          0          0

I have tried using pd.get_dummies, but the result is not quite what I need: it produces one row per (idUser, artist) pair instead of one row per idUser.

import pandas as pd

dictio = {'idUser': [3, 3, 3, 2, 4], 'artist': ['A', 'B', 'C', 'B', 'A']}
df = pd.DataFrame(dictio)
df = df.set_index('idUser')
# One indicator column per artist; each original row is encoded separately
df_binary = pd.get_dummies(df, columns=['artist'])
print(df_binary)
        artist_A  artist_B  artist_C
idUser
3              1         0         0
3              0         1         0
3              0         0         1
2              0         1         0
4              1         0         0

Answer:

In [27]: df_binary.groupby(level=0).any().astype(int)
Out[27]:
        artist_A  artist_B  artist_C
idUser
2              0         1         0
3              1         1         1
4              1         0         0
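
For reference, the same groupby idea can be written as a self-contained script. This is only a sketch: it assumes the sample data from the question, and it encodes the artist column as a Series rather than the whole DataFrame, which yields plain A/B/C column names instead of the artist_ prefix. The names dummies and binary are illustrative.

```python
import pandas as pd

# Sample data from the question, indexed by idUser
df = pd.DataFrame({'idUser': [3, 3, 3, 2, 4],
                   'artist': ['A', 'B', 'C', 'B', 'A']}).set_index('idUser')

# Encoding the Series (not the DataFrame) gives unprefixed columns A, B, C
dummies = pd.get_dummies(df['artist'])

# Collapse repeated idUser rows: a cell is 1 if any of that user's rows had it
binary = dummies.groupby(level=0).any().astype(int)
print(binary)
```

Note that groupby sorts the index by default, so the rows come out as 2, 3, 4 rather than in the order shown in the question.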

Alternatively, starting from your df before the .set_index() call:

In [33]: df.pivot_table(index='idUser', columns='artist', aggfunc='size', fill_value=0).rename_axis(columns=None)
Out[33]:
        A  B  C
idUser
2       0  1  0
3       1  1  1
4       1  0  0
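
One caveat with the pivot_table approach above: aggfunc='size' counts rows, so a user listed with the same artist more than once would get a value greater than 1. A possible way to guard against that, sketched here with pd.crosstab and a cap at 1, assumes hypothetical data in which user 3 has artist 'A' twice; the names plays, counts, and binary are illustrative.

```python
import pandas as pd

# Hypothetical data: user 3 is listed with artist 'A' twice
plays = pd.DataFrame({'idUser': [3, 3, 3, 3, 2, 4],
                      'artist': ['A', 'A', 'B', 'C', 'B', 'A']})

# Count occurrences of each (idUser, artist) pair; missing pairs become 0
counts = pd.crosstab(plays['idUser'], plays['artist'])

# Cap counts at 1 to get a binary table, then drop the 'artist' columns label
binary = counts.clip(upper=1).rename_axis(columns=None)
print(binary)
```

Here counts holds 2 for user 3 and artist 'A', while binary holds 1, so the result stays a 0/1 table even with repeated pairs.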