For example I have this table
| ID | VALUE |
| -------- | -------------- |
| 1 | row24 |
| 2 | row24 |
| 3 | row1 |
| 4 | row15 |
| 5 | row16 |
| 6 | row17 |
| 8 | row24 |
| 7 | row17 |
| 9 | row19 |
Output should be:
| ID | VALUE |
| -------- | -------------- |
| [1,2,8] | row24 |
| 3 | row1 |
| 4 | row15 |
| 5 | row16 |
| [6,7] | row17 |
| 9 | row19 |
I think maybe groupby in pandas is a solution, i tried some but it didnt work...
CodePudding user response:
If need combination lists and scalars use GroupBy.agg
with lambda function:
df = (df.groupby('VALUE', sort=False)['ID']
.agg(lambda x: list(x) if len(x) > 1 else x)
.reset_index(name='IDS'))
print (df)
VALUE IDS
0 row24 [1, 2, 8]
1 row1 3
2 row15 4
3 row16 5
4 row17 [6, 7]
5 row19 9
Because ig aggregate only list get oalso one element lists:
df = (df.groupby('VALUE', sort=False)['ID']
.agg(list)
.reset_index(name='IDS'))
print (df)
VALUE IDS
0 row24 [1, 2, 8]
1 row1 [3]
2 row15 [4]
3 row16 [5]
4 row17 [6, 7]
5 row19 [9]