Imagine the following pandas DataFrame:
| ID | Name | Value |
|----|------|-------|
| 1  | John | 1     |
| 1  | John | 4     |
| 1  | John | 10    |
| 1  | John | 50    |
| 1  | Adam | 6     |
| 1  | Adam | 3     |
| 2  | Jen  | 9     |
| 2  | Jen  | 6     |
I want to apply a groupby and create a new column that stores, for each row,
the Value entries from the current row through the last row of its group, as a list.
Like this:
| ID | Name | Value | NewCol         |
|----|------|-------|----------------|
| 1  | John | 1     | [1, 4, 10, 50] |
| 1  | John | 4     | [4, 10, 50]    |
| 1  | John | 10    | [10, 50]       |
| 1  | John | 50    | [50]           |
| 1  | Adam | 6     | [6, 3]         |
| 1  | Adam | 3     | [3]            |
| 2  | Jen  | 9     | [9, 6]         |
| 2  | Jen  | 6     | [6]            |
Is this possible using the pandas groupby function?
CodePudding user response:
Use GroupBy.transform with a custom lambda function:
# For each position i in a group, collect the values from i to the end of that group.
f = lambda x: [x.iloc[i:].tolist() for i in range(len(x))]
df['new'] = df.groupby(['Name', 'ID'])['Value'].transform(f)
Or:
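# Reverse the rows so each expanding window runs from the end of the group back to
# the current row, then flip every window to restore the original order.
f = lambda x: [y[::-1].tolist() for y in x.expanding()]
df['new'] = df.iloc[::-1].groupby(['Name', 'ID'])['Value'].transform(f)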
print(df)
   ID  Name  Value             new
0   1  John      1  [1, 4, 10, 50]
1   1  John      4     [4, 10, 50]
2   1  John     10        [10, 50]
3   1  John     50            [50]
4   1  Adam      6          [6, 3]
5   1  Adam      3             [3]
6   2   Jen      9          [9, 6]
7   2   Jen      6             [6]
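
If the lambdas are hard to read, the same idea can probably be spelled out with a named helper and an explicit loop over the groups. Treat this as a sketch rather than the one right way: the sample frame is rebuilt from the question, suffix_lists is a made-up helper name, and sort=False plus pd.concat are my own choices, not part of the transform-based code.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "ID": [1, 1, 1, 1, 1, 1, 2, 2],
    "Name": ["John", "John", "John", "John", "Adam", "Adam", "Jen", "Jen"],
    "Value": [1, 4, 10, 50, 6, 3, 9, 6],
})


def suffix_lists(values):
    """Hypothetical helper: for each position, the values from there to the end."""
    items = values.tolist()
    return [items[i:] for i in range(len(items))]


# One Series of lists per group, keeping each group's original row labels.
parts = [
    pd.Series(suffix_lists(group), index=group.index)
    for _, group in df.groupby(["Name", "ID"], sort=False)["Value"]
]

# Assignment aligns on the index, so every list lands on its own row.
df["new"] = pd.concat(parts)
print(df)
```

Because each piece keeps the row labels of its group, the assignment lines up by index and does not depend on the order in which the groups are visited.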