How to merge multiple lists of dicts into one list of dicts? (Pandas)

Here's an example of my dataframe.

import pandas as pd

d = {'ids': [100, 200, 100, 200, 200, 100, 300, 300], 'col': [1, 2, 3, 4, 5, 6, 7, 8], 'col2': [6, 5, 4, 3, 2, 1, 10, 15]}
df = pd.DataFrame(data=d)
df

   ids  col  col2
0  100    1     6
1  200    2     5
2  100    3     4
3  200    4     3
4  200    5     2
5  100    6     1
6  300    7    10
7  300    8    15

I want to calculate some values separately for each value of ids, as in the example below.

groups = {key: df.loc[value] for key, value in df.groupby("ids").groups.items()}
for key, group in groups.items():
    group['previous_col'] = group['col'].shift()
    group['new_col'] = group['col2'] * group['previous_col']
    print(group)

# This prints one DataFrame per group:
   ids  col  col2  previous_col  new_col
0  100    1     6           NaN      NaN
2  100    3     4           1.0      4.0
5  100    6     1           3.0      3.0
   ids  col  col2  previous_col  new_col
1  200    2     5           NaN      NaN
3  200    4     3           2.0      6.0
4  200    5     2           4.0      8.0
   ids  col  col2  previous_col  new_col
6  300    7    10           NaN      NaN
7  300    8    15           7.0    105.0


print(group.to_dict('records'))  # called inside the loop above, in place of print(group)
[{'ids': 100, 'col': 1, 'col2': 6, 'previous_col': nan, 'new_col': nan}, {'ids': 100, 'col': 3, 'col2': 4, 'previous_col': 1.0, 'new_col': 4.0}, {'ids': 100, 'col': 6, 'col2': 1, 'previous_col': 3.0, 'new_col': 3.0}]
[{'ids': 200, 'col': 2, 'col2': 5, 'previous_col': nan, 'new_col': nan}, {'ids': 200, 'col': 4, 'col2': 3, 'previous_col': 2.0, 'new_col': 6.0}, {'ids': 200, 'col': 5, 'col2': 2, 'previous_col': 4.0, 'new_col': 8.0}]
[{'ids': 300, 'col': 7, 'col2': 10, 'previous_col': nan, 'new_col': nan}, {'ids': 300, 'col': 8, 'col2': 15, 'previous_col': 7.0, 'new_col': 105.0}]

As you can see, calling to_dict('records') inside the loop produces a separate list of dicts for each group. What I want instead is a single list containing all of the dicts, like this:

[{'ids': 100, 'col': 1, 'col2': 6, 'previous_col': nan, 'new_col': nan}, {'ids': 100, 'col': 3, 'col2': 4, 'previous_col': 1.0, 'new_col': 4.0}, {'ids': 100, 'col': 6, 'col2': 1, 'previous_col': 3.0, 'new_col': 3.0},
{'ids': 200, 'col': 2, 'col2': 5, 'previous_col': nan, 'new_col': nan}, {'ids': 200, 'col': 4, 'col2': 3, 'previous_col': 2.0, 'new_col': 6.0}, {'ids': 200, 'col': 5, 'col2': 2, 'previous_col': 4.0, 'new_col': 8.0},
{'ids': 300, 'col': 7, 'col2': 10, 'previous_col': nan, 'new_col': nan}, {'ids': 300, 'col': 8, 'col2': 15, 'previous_col': 7.0, 'new_col': 105.0}]

How can I get a result like this?

CodePudding user response:

I would just do something like this:

def func(group):
    prev = group['col'].shift()
    return group['col2'] * prev

df['new_col'] = df.groupby('ids').apply(func).reset_index('ids',drop=True)


output = df.to_dict('records')

That works for any generic function that func stands in for. In this specific case, you can simply use

df['new_col'] = df.groupby('ids')['col'].shift() * df['col2']

Output:

[{'ids': 100, 'col': 1, 'col2': 6, 'new_col': nan},
 {'ids': 200, 'col': 2, 'col2': 5, 'new_col': nan}, 
 {'ids': 100, 'col': 3, 'col2': 4, 'new_col': 4.0}, 
 {'ids': 200, 'col': 4, 'col2': 3, 'new_col': 6.0}, 
 {'ids': 200, 'col': 5, 'col2': 2, 'new_col': 8.0},
 {'ids': 100, 'col': 6, 'col2': 1, 'new_col': 3.0}, 
 {'ids': 300, 'col': 7, 'col2': 10, 'new_col': nan}, 
 {'ids': 300, 'col': 8, 'col2': 15, 'new_col': 105.0}]
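
If you would rather keep the per-group loop from the question, the lists it produces can also be flattened directly. The following is only a minimal sketch, not the approach from the answer above: the names per_group and merged are made up for illustration, and it assumes the same example DataFrame. Note that the merged records come out ordered by ids (100, 200, 300) rather than in the original row order.

```python
from itertools import chain

import pandas as pd

d = {'ids': [100, 200, 100, 200, 200, 100, 300, 300],
     'col': [1, 2, 3, 4, 5, 6, 7, 8],
     'col2': [6, 5, 4, 3, 2, 1, 10, 15]}
df = pd.DataFrame(data=d)

# Collect one list of record dicts per group (hypothetical name: per_group).
per_group = []
for _, group in df.groupby('ids'):
    group = group.copy()  # work on a copy so the original df is left untouched
    group['previous_col'] = group['col'].shift()
    group['new_col'] = group['col2'] * group['previous_col']
    per_group.append(group.to_dict('records'))

# Flatten the list of lists into a single list of dicts.
merged = list(chain.from_iterable(per_group))
print(merged)
```

If the original row order matters, the vectorized groupby(...).shift() version above is likely the simpler choice, since it never splits the DataFrame in the first place.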