Convert a list of features into a binary vector

I have several lists of features:

feat_lists = [
  ['f1','f2','f3'],
  ['f2','f3'],
  ['f2','f4']
]

I'd like to arrange them so that each row represents a list (an observation) and each column represents a feature. The values are 1/0 or True/False, depending on whether the feature is present in that list.

For the example above, I'd like to get the following dataframe (shown as a table):

	f1	f2	f3	f4
1	True	True	True	False
2	False	True	True	False
3	False	True	False	True

I can figure out a way to do it, but I imagine there must be a better, more efficient way to do it in pandas.

thanks

CodePudding user response:

Use MultiLabelBinarizer from scikit-learn, then cast the resulting 0/1 matrix to boolean with DataFrame.astype:

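import pandas as pd
from sklearn.preprocessing import MultiLabelBinarizer

# fit_transform returns a 0/1 indicator matrix; mlb.classes_ holds the sorted feature names
mlb = MultiLabelBinarizer()
df = pd.DataFrame(mlb.fit_transform(feat_lists), columns=mlb.classes_).astype(bool)
print(df)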
      f1    f2     f3     f4
0   True  True   True  False
1  False  True   True  False
2  False  True  False   True
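
Since the question asks about pandas specifically, here is a rough sketch of a pandas-only alternative that avoids the scikit-learn dependency. It is not part of the answer above: the name df_alt is made up for illustration, and the sketch assumes a pandas version that provides Series.explode (0.25 or later).

```python
import pandas as pd

feat_lists = [
    ['f1', 'f2', 'f3'],
    ['f2', 'f3'],
    ['f2', 'f4'],
]

# One row per (observation, feature) pair; the index repeats the observation number.
exploded = pd.Series(feat_lists).explode()

# One-hot encode each pair, then collapse the rows that belong to the same observation.
# max() acts as a logical OR across the indicator rows of one observation.
df_alt = pd.get_dummies(exploded).groupby(level=0).max().astype(bool)

print(df_alt)
```

For a handful of lists either approach should be fine. MultiLabelBinarizer is probably the more convenient choice when the same encoding has to be reapplied to new data later, because the fitted object remembers the feature order.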