Convert binary columns into columns with string values based on their column headers?

Time:11-07

This is my dataset:

|Dept   |Cell culture|Bioinfo|Immunology|Trigonometry|Algebra|Microbio|Optics|
|Biotech|1           |1      |1         |0           |0      |0       |0     |
|Biotech|1           |0      |1         |0           |0      |0       |0     |
|Math   |0           |0      |0         |1           |1      |0       |0     |
|Biotech|0           |0      |0         |0           |0      |1       |0     |
|Physics|0           |0      |0         |0           |0      |0       |1     |
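
For anyone who wants to reproduce this, the dataset can be rebuilt as a DataFrame roughly like this. The name `df` and the integer dtype of the 0/1 columns are assumptions; the real data may be loaded differently.

```python
import pandas as pd

# Rebuild the sample dataset; one list per column, one entry per row.
# The name `df` and the plain integer 0/1 values are assumptions.
df = pd.DataFrame(
    {
        "Dept": ["Biotech", "Biotech", "Math", "Biotech", "Physics"],
        "Cell culture": [1, 1, 0, 0, 0],
        "Bioinfo": [1, 0, 0, 0, 0],
        "Immunology": [1, 1, 0, 0, 0],
        "Trigonometry": [0, 0, 1, 0, 0],
        "Algebra": [0, 0, 1, 0, 0],
        "Microbio": [0, 0, 0, 1, 0],
        "Optics": [0, 0, 0, 0, 1],
    }
)
print(df)
```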

This is how I want my result:

|Biotech|Cell culture|Bioinfo   |Immunology|
|Biotech|Cell culture|Immunology|          |
|Math   |Trigonometry|Algebra   |          |
|Biotech|Microbio    |          |          |
|Physics|Optics      |          |          |

CodePudding user response:

Set the Dept column as the index, then for each row collect the names of the columns whose value is 1 and build a new DataFrame from those lists:

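import pandas as pd

# Move Dept into the index so only the 0/1 columns remain
df2 = df.set_index('Dept')

# Column names as a NumPy array, so a boolean row mask can select from it
cols = df2.columns.to_numpy()
# One array of matching column names per row; shorter rows are padded with None
df2 = pd.DataFrame([cols[m] for m in df2.eq(1).to_numpy()], index=df2.index)
print(df2)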
                    0           1           2
Dept                                         
Biotech  Cell culture     Bioinfo  Immunology
Biotech  Cell culture  Immunology        None
Math     Trigonometry     Algebra        None
Biotech      Microbio        None        None
Physics        Optics        None        None
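
The key step is `cols[m]`: indexing a NumPy array of column names with a boolean row mask keeps only the names where the mask is True. A minimal sketch on a single row, using a shortened column list and a hand-written mask chosen just for illustration:

```python
import numpy as np

# Shortened, illustrative column list (not the full dataset)
cols = np.array(["Cell culture", "Bioinfo", "Immunology", "Trigonometry"])

# Hand-written mask standing in for one row of df2.eq(1).to_numpy()
mask = np.array([True, False, True, False])

# Boolean indexing keeps the entries where the mask is True, in order
picked = cols[mask]
print(picked)
```

Because each row can match a different number of columns, the per-row arrays have different lengths, which is why the shorter rows show None in the output above.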

If performance is not important, or the data is small, a row-wise apply also works:

df2 = df.set_index('Dept')

cols = df2.columns.to_numpy()
# Row-wise apply: each boolean row selects its matching column names
df2 = pd.DataFrame(df2.eq(1).apply(lambda x: cols[x], axis=1).tolist(), index=df2.index)
print(df2)
                    0           1           2
Dept                                         
Biotech  Cell culture     Bioinfo  Immunology
Biotech  Cell culture  Immunology        None
Math     Trigonometry     Algebra        None
Biotech      Microbio        None        None
Physics        Optics        None        None

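Finally, if Dept should be a regular column and all columns numbered from 0:

# Turn the Dept index back into a column, then renumber the columns
df2 = df2.reset_index()
df2.columns = range(len(df2.columns))
print(df2)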
         0             1           2           3
0  Biotech  Cell culture     Bioinfo  Immunology
1  Biotech  Cell culture  Immunology        None
2     Math  Trigonometry     Algebra        None
3  Biotech      Microbio        None        None
4  Physics        Optics        None        None
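
The desired output in the question shows blank cells rather than None. One way to get that, assuming empty strings are acceptable as the placeholder, is `fillna('')`. The small frame below is hand-built to mirror part of the result above so the snippet runs on its own; the name `blank` is only illustrative.

```python
import pandas as pd

# Hand-built stand-in for part of the final df2 shown above
df2 = pd.DataFrame(
    [
        ["Biotech", "Cell culture", "Bioinfo", "Immunology"],
        ["Biotech", "Cell culture", "Immunology", None],
        ["Physics", "Optics", None, None],
    ]
)

# Replace the missing entries with empty strings (assumed placeholder)
blank = df2.fillna("")
print(blank)
```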