i have such dataframe:
product1 | product2 | product3 | product4 | product5 | product6 |
---|---|---|---|---|---|
straws | orange | melon | chair | bread | milk |
melon | milk | book | coffee | cake | tea |
bread | bananas | juice | chair | book | straws |
I wish to add 6 counting items columns along the index and counts per rows the number of times they appear in the dataframe.
CountProduct1 | CountProduct2 | CountProduct3 | Countproduct4 | Countproduct5 |
---|---|---|---|---|
1 | 1 | 1 | 1 | 1 |
2 | 2 | 1 | 1 | 1 |
2 | 1 | 1 | 2 | 1 |
Thanks for your help.
CodePudding user response:
Given the modified explanation, here is a way to do it.
1. cumulative counts
cnt = df.apply(lambda g: g.value_counts(), axis=1).fillna(0).astype(int).cumsum()
>>> cnt
bananas book bread cake chair coffee juice melon milk orange straws tea
0 0 0 1 0 1 0 0 1 1 1 1 0
1 0 1 1 1 1 1 0 2 2 1 1 1
2 1 2 2 1 2 1 1 2 2 1 2 1
2. replace each item by the corresponding count, row by row
out = pd.DataFrame([
r.map(c) for (_, r), (_, c) in zip(df.T.items(), cnt.T.items())
]).set_axis([f'Count{k}' for k in df.columns], axis=1)
>>> out
Countproduct1 Countproduct2 Countproduct3 Countproduct4 Countproduct5 Countproduct6
0 1 1 1 1 1 1
1 2 2 1 1 1 1
2 2 1 1 2 2 2