I have seen the question "How to group dataframe rows into list in pandas groupby", but that is not what I need.
Here is an example of my data table:
index | column1 | column2 |
---|---|---|
0 | apple | red |
1 | banana | a |
1 | banana | b |
2 | grape | wow |
2 | grape | that's |
2 | grape | great |
2 | grape | fruits! |
3 | melon | oh |
3 | melon | no |
...a lot of data... | ...a lot of data... | ... a lot of data... |
I want to group by ONLY index 2~3 (I need to specify a range because my real data is huge),
so I want this table:
index | column1 | column2 |
---|---|---|
0 | apple | red |
1 | banana | a |
1 | banana | b |
2 | grape | wow that's great fruits! |
3 | melon | oh no |
...a lot of data... | ...a lot of data... | ... a lot of data... |
How can I get this?
CodePudding user response:
Let's filter the dataframe on the index column, apply groupby to the filtered rows only, then concatenate the result with the untouched rows.
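import pandas as pd

# Boolean mask: rows whose 'index' value should be collapsed
m = df['index'].isin([2, 3])
# Join column2 per group on the masked rows only, then recombine with the
# untouched rows; a stable sort keeps the original order within equal index values
out = (pd.concat([df[~m],
                  (df[m].groupby('column1', as_index=False)
                        .agg({'index': 'first', 'column2': ' '.join}))],
                 ignore_index=True)
         .sort_values('index', kind='stable', ignore_index=True))
print(out)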
   index  column1                   column2
0      0    apple                       red
1      1   banana                         a
2      1   banana                         b
3      2    grape  wow that's great fruits!
4      3    melon                     oh no
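
Since the question asks for a range rather than an explicit list of values, here is a minimal sketch of the same idea using `Series.between` to build the mask. The helper name `collapse_index_range` and the sample frame are hypothetical, made up for illustration from the question's table, and the sketch assumes `index` is an ordinary column, as in the code above.

```python
import pandas as pd

def collapse_index_range(df, low, high):
    """Join column2 per column1 for rows whose 'index' lies in [low, high].

    Hypothetical helper; assumes columns named 'index', 'column1', 'column2'.
    """
    # between() is inclusive on both ends by default
    m = df['index'].between(low, high)
    merged = (df[m].groupby('column1', as_index=False, sort=False)
                   .agg({'index': 'first', 'column2': ' '.join}))
    # Recombine with the untouched rows; the stable sort preserves the
    # original order of rows that share the same index value
    return (pd.concat([df[~m], merged], ignore_index=True)
              .sort_values('index', kind='stable', ignore_index=True))

# Sample data rebuilt from the question's table (illustrative only)
df = pd.DataFrame({
    'index':   [0, 1, 1, 2, 2, 2, 2, 3, 3],
    'column1': ['apple', 'banana', 'banana', 'grape', 'grape',
                'grape', 'grape', 'melon', 'melon'],
    'column2': ['red', 'a', 'b', 'wow', "that's", 'great', 'fruits!',
                'oh', 'no'],
})

out = collapse_index_range(df, 2, 3)
print(out)
```

With a large dataframe, this avoids listing every index value by hand: only the two bounds are needed, and rows outside the range are passed through unchanged.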