I have a dataset in which each row is a transaction and each column is an item.
For example:
Item_1 | Item_2 | Item_3 |
---|---|---|
NaN | 1 | 1 |
1 | 1 | NaN |
The table has 611 columns and 1180 rows, i.e. 611 items and 1180 transactions.
I want to do a basket analysis, so I need every cell that contains 1 to be replaced with the name of its item (the column header).
For example:
Item_1 | Item_2 | Item_3 |
---|---|---|
NaN | Item_2 | Item_3 |
Item_1 | Item_2 | NaN |
Then I want to drop the header row and left-align each transaction, so that NaN only appears as padding at the end of a row,
i.e.
No_header | No_header | No_header |
---|---|---|
Item_2 | Item_3 | NaN |
Item_1 | Item_2 | NaN |
CodePudding user response:
Try this. It maps each 1 to its column name, then collects the non-NaN values of each row into a list:
items = df.apply(lambda col: col.map({1: col.name})).apply(lambda row: row[~row.isna()].tolist(), axis=1)
Output:
>>> items
0 [Item_2, Item_3]
1 [Item_1, Item_2]
dtype: object
>>> type(items)
pandas.core.series.Series
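
If you also want the left-aligned table from the question rather than a Series of lists, one possible follow-up is to rebuild a DataFrame from those lists. This is only a sketch on a toy frame: the third transaction is a made-up row added to show the padding, and `aligned` is just an illustrative name.

```python
import numpy as np
import pandas as pd

# Toy frame mirroring the question (1 = item in the basket, NaN = not in it).
# The third row is a hypothetical extra transaction with a single item.
df = pd.DataFrame(
    {
        "Item_1": [np.nan, 1, np.nan],
        "Item_2": [1, 1, np.nan],
        "Item_3": [1, np.nan, 1],
    }
)

# Replace each 1 with its column name; cells that are not 1 become NaN.
named = df.apply(lambda col: col.map({1: col.name}))

# One list of item names per transaction, with the NaNs dropped.
items = named.apply(lambda row: row.dropna().tolist(), axis=1)

# Building a frame from the lists left-aligns every row; shorter rows are
# padded with missing values on the right, and the columns are plain integers.
aligned = pd.DataFrame(items.tolist(), index=df.index)
print(aligned)
```

The resulting frame is only as wide as the largest basket, not 611 columns. If the goal is a file without a header, `aligned.to_csv(path, header=False, index=False)` should write it that way, where `path` is whatever output location you choose.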