I have a dataframe, I want to have the data for each id in one row. Some of the columns are constant for each id and some of them are changing. For example, val1 and val2 is constant in the following example.
So, I want to build a dictionary from this dataframe. Key is the id and the value is a list which is [number of row for each id, its t, val1, val2]. Here is an example, for id=1 we have 2 t, its value is 1, 2, so '1': [2, 1, 2, 1, 5] is the first key and value. For id=2 we have the same thins. Could you please help me with that? Thanks
df = pd.DataFrame()
df['id'] = [1, 1, 2, 2, 2,2]
df['t'] = [1, 2, 3, 4, 5,6]
df['val1'] = [1,1,0,0,0,0 ]
df['val2'] = [5,5, 4, 4,4,4]
out:
dic = {'1': [2, 1, 2, 1, 5], '2': [4, 3, 4, 5,6, 0,4]}
CodePudding user response:
It's just brute force.
import pandas as pd
df = pd.DataFrame()
df['id'] = [1, 1, 2, 2, 2,2]
df['t'] = [1, 2, 3, 4, 5,6]
df['val1'] = [1,1,0,0,0,0 ]
df['val2'] = [5,5, 4, 4,4,4]
print(df)
lastid = None
tval = []
rows = []
for i,row in df.iterrows():
if row['id'] != lastid:
if lastid:
rows.append( [len(tval)] tval save )
lastid = row['id']
tval = []
tval.append(row['t'])
save = [row['val1'],row['val2']]
rows.append( [len(tval)] tval save )
print(rows)
Output:
id t val1 val2
0 1 1 1 5
1 1 2 1 5
2 2 3 0 4
3 2 4 0 4
4 2 5 0 4
5 2 6 0 4
[[2, 1, 2, 1, 5], [4, 3, 4, 5, 6, 0, 4]]
CodePudding user response:
df.groupby('id').agg({'t':list, 'val1':'first', 'val2':'first'}).apply(lambda x: [len(x.t),*x.t,x.val1,x.val2], axis=1).to_dict()