I'm still working out how to describe this properly (I'll update the question), but here's a minimal, reproducible have/want example of what I'm trying to do.
import pandas as pd

have = pd.DataFrame({'id': [1, 1, 1, 2, 2], 'grp': ['a', 'b', 'c', 'd', 'e'], 'val': [5, 4, 3, 2, 1]})
>>> have
   id grp  val
0   1   a    5
1   1   b    4
2   1   c    3
3   2   d    2
4   2   e    1
want = pd.DataFrame({'id': [1, 2], 'results': [[('a', 5), ('b', 4), ('c', 3)], [('d', 2), ('e', 1)]]})
>>> want
   id                   results
0   1  [(a, 5), (b, 4), (c, 3)]
1   2          [(d, 2), (e, 1)]
CodePudding user response:
You can group by the id column, then zip the grp and val columns within each group:
out = (have.groupby('id')
           .apply(lambda g: list(zip(g['grp'], g['val'])))
           .rename('result')
           .reset_index())
print(out)
   id                    result
0   1  [(a, 5), (b, 4), (c, 3)]
1   2          [(d, 2), (e, 1)]
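On recent pandas versions (2.2 and later, as far as I know), calling apply on the whole grouped frame emits a DeprecationWarning because the grouping column is also handed to the function. A minimal sketch of the same idea that selects grp and val explicitly after the groupby, which should avoid that warning:

```python
import pandas as pd

have = pd.DataFrame({'id': [1, 1, 1, 2, 2],
                     'grp': ['a', 'b', 'c', 'd', 'e'],
                     'val': [5, 4, 3, 2, 1]})

# Select only the columns to be zipped, so the grouping column 'id'
# is not passed to the lambda.
out = (have.groupby('id')[['grp', 'val']]
           .apply(lambda g: list(zip(g['grp'], g['val'])))
           .rename('result')
           .reset_index())
print(out)
```

The result column is still named result here; rename it to results if it has to match want exactly.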
If you want to zip more than two columns into a list of tuples, you can also use df.itertuples, although df.to_records, referenced in another answer, works just as well.
out = (have.groupby('id')
           .apply(lambda g: list(g[['grp', 'val']].itertuples(index=False)))
           .rename('result')
           .reset_index())
print(out)
   id                    result
0   1  [(a, 5), (b, 4), (c, 3)]
1   2          [(d, 2), (e, 1)]
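One detail worth knowing: with the default settings, itertuples yields namedtuples rather than plain tuples, even though they print the same way inside the frame. Passing name=None gives plain tuples. A small sketch with three columns, where the flag column is a hypothetical extra column added only for illustration:

```python
import pandas as pd

have = pd.DataFrame({'id': [1, 1, 1, 2, 2],
                     'grp': ['a', 'b', 'c', 'd', 'e'],
                     'val': [5, 4, 3, 2, 1]})

# 'flag' is a made-up third column, not part of the original question.
have3 = have.assign(flag=[True, False, True, False, True])
cols = ['grp', 'val', 'flag']

# name=None makes itertuples return regular tuples instead of namedtuples.
out3 = (have3.groupby('id')[cols]
             .apply(lambda g: list(g.itertuples(index=False, name=None)))
             .rename('result')
             .reset_index())
print(out3)
```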
CodePudding user response:
One way to get your data as a list of tuples is to use df.to_records, then groupby.agg.
have.assign(
    res=have[["grp", "val"]].to_records(index=False).tolist()
).groupby("id", as_index=False)["res"].agg(list)
#    id                       res
# 0   1  [(a, 5), (b, 4), (c, 3)]
# 1   2          [(d, 2), (e, 1)]
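To see what the intermediate step produces, here is a minimal sketch that looks only at the to_records part. The variable name records is mine, not from the answer. As far as I understand it, tolist() converts each record into an ordinary tuple of built-in Python values, one per row:

```python
import pandas as pd

have = pd.DataFrame({'id': [1, 1, 1, 2, 2],
                     'grp': ['a', 'b', 'c', 'd', 'e'],
                     'val': [5, 4, 3, 2, 1]})

# One tuple per row, in row order, built from the selected columns only.
records = have[['grp', 'val']].to_records(index=False).tolist()
print(records)
```

Because the list has one entry per row, it can be assigned back as a column and then aggregated per id.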
CodePudding user response:
You can use:
want = (have
        .assign(result=have[['grp', 'val']].agg(tuple, axis=1))
        .groupby('id')['result']
        .agg(list).reset_index()
       )
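A possible variation on this, not from the answer itself: agg(tuple, axis=1) calls tuple once per row, which can get slow on large frames. Building the tuple column with the built-in zip instead is a common alternative. A minimal sketch, assuming the same have frame (the name out_zip is just for illustration):

```python
import pandas as pd

have = pd.DataFrame({'id': [1, 1, 1, 2, 2],
                     'grp': ['a', 'b', 'c', 'd', 'e'],
                     'val': [5, 4, 3, 2, 1]})

# Pair up the two columns row by row, then collect the pairs per id.
out_zip = (have
           .assign(result=list(zip(have['grp'], have['val'])))
           .groupby('id')['result']
           .agg(list)
           .reset_index())
print(out_zip)
```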