How to turn column names into values in Pandas

Time:10-08

I would like to turn column names into values, in order to create a factor variable whose levels are the column names. I am hoping to get x2 from x1. In R, this would be similar to using the model.matrix() function.

Thank you

import pandas as pd

x1 = pd.DataFrame({'A': [1, 0, 0],
                   'B': [0, 1, 0],
                   'C': [0, 1, 1]})

x2 = pd.DataFrame({'All': ['A', 'BC', 'C']})

CodePudding user response:

Here is one way, though there may be a simpler solution:

x1.astype(bool).apply(lambda row: ''.join(x1.columns[row]), axis=1)

CodePudding user response:

Use the @ operator (matrix multiplication) to multiply the 0/1 matrix by the vector of column names:

import pandas as pd

x1 = pd.DataFrame({'A': [1, 0, 0],
                   'B': [0, 1, 0],
                   'C': [0, 1, 1]})

# create result DataFrame
x2 = pd.DataFrame({"all": x1 @ x1.columns})
print(x2)

Output

  all
0   A
1  BC
2   C
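
A minimal sketch of why this product yields strings, written in plain Python rather than pandas: multiplying a string by 0 gives an empty string and multiplying it by 1 gives the string itself, so combining the products along a row concatenates the names of the columns that hold a 1. The names `cols`, `rows`, and `row_label` are illustrative only, and the sketch assumes every entry is 0 or 1.

```python
# Column names and the rows of x1 from the question, written out by hand.
cols = ['A', 'B', 'C']
rows = [[1, 0, 0],
        [0, 1, 1],
        [0, 0, 1]]

def row_label(row, names):
    """Concatenate the names whose flag is 1 (assumes flags are 0 or 1)."""
    # flag * name is name when flag == 1 and '' when flag == 0
    return ''.join(flag * name for flag, name in zip(row, names))

labels = [row_label(row, cols) for row in rows]
print(labels)  # ['A', 'BC', 'C']
```

A value larger than 1 would repeat the name (2 * 'A' is 'AA'), which is why the sketch assumes 0/1 entries.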

CodePudding user response:

You can also use a list comprehension, as follows:

cols = x1.columns.values

x2 = pd.DataFrame({'All': [''.join(cols[x]) for x in x1.eq(1).values]})

Or simply:

x2 = pd.DataFrame({'All': [''.join(x1.columns[x]) for x in x1.eq(1).values]})

Result:

print(x2)

  All
0   A
1  BC
2   C
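
The list comprehension can also be wrapped in a small helper, sketched here. The function name `columns_to_labels` and its `sep` parameter are hypothetical and not part of pandas; `sep` is included only because a label such as 'BC' becomes ambiguous once column names are longer than one character.

```python
import pandas as pd

def columns_to_labels(df, sep=''):
    """Return one string per row, joining the names of columns equal to 1.

    Hypothetical helper for illustration; any value other than 1 is
    treated as "not set".
    """
    names = df.columns.to_numpy()
    # df.eq(1).to_numpy() yields one boolean mask per row
    return [sep.join(names[mask]) for mask in df.eq(1).to_numpy()]

x1 = pd.DataFrame({'A': [1, 0, 0],
                   'B': [0, 1, 0],
                   'C': [0, 1, 1]})

x2 = pd.DataFrame({'All': columns_to_labels(x1)})
print(x2['All'].tolist())  # ['A', 'BC', 'C']
```

Calling `columns_to_labels(x1, sep='|')` would give 'B|C' for the second row instead of 'BC'.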