Python pandas: combine values into one column for rows with the same key

Time:04-21

I need some help with pandas. I have this DataFrame:

   model        timestamp
0  Punto  20200124_083155
1  Punto  20200124_163540
2  Doblo  20200124_122052
3  Doblo  20200124_150801
4  Panda  20200124_134350
5   Tipo  20200124_195955

I want it to become:

   model                        timestamp
0  Punto  20200124_083155;20200124_163540
1  Doblo  20200124_122052;20200124_150801
2  Panda                  20200124_134350
3   Tipo                  20200124_195955

Could anyone give me an example? Thank you very much.

CodePudding user response:

You can group the rows by model and use agg to join each group's timestamps with ';'.

new_df = df.groupby('model')['timestamp'].agg(timestamp=lambda x: ';'.join(x))

print(new_df)
                             timestamp
model                                 
Doblo  20200124_122052;20200124_150801
Panda                  20200124_134350
Punto  20200124_083155;20200124_163540
Tipo                   20200124_195955

To turn model back into a regular column, reset the index:

new_df.reset_index(inplace=True)
print(new_df)

   model                        timestamp
0  Doblo  20200124_122052;20200124_150801
1  Panda                  20200124_134350
2  Punto  20200124_083155;20200124_163540
3   Tipo                  20200124_195955
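
By default groupby sorts the group keys, which is why the result above is in alphabetical order rather than the order shown in the question. The following is a minimal, self-contained sketch of the same idea that passes sort=False to keep groups in order of first appearance. The sample data is copied from the question; the variable name ordered_df is only illustrative.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "model": ["Punto", "Punto", "Doblo", "Doblo", "Panda", "Tipo"],
    "timestamp": [
        "20200124_083155", "20200124_163540", "20200124_122052",
        "20200124_150801", "20200124_134350", "20200124_195955",
    ],
})

# sort=False keeps the groups in order of first appearance.
# ";".join is applied to each group's timestamp values.
# reset_index() turns 'model' back into a regular column.
ordered_df = (
    df.groupby("model", sort=False)["timestamp"]
      .agg(";".join)
      .reset_index()
)
print(ordered_df)
```

Note that ";".join only works when every value in the column is a string; if the column could hold other types, converting it first with astype(str) is one option.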

CodePudding user response:

Suppose you have:

import pandas as pd

df_new = pd.DataFrame()
df_new['name'] = ['Punto', 'Doblo', 'Doblo', 'Punto', 'Punto', 'Tipo']
df_new['date'] = ['20200124_083155', '20200124_163540', '20200124_122052',
                  '20200124_150801', '20200124_134350', '20200124_195955']

output:

    name             date
0  Punto  20200124_083155
1  Doblo  20200124_163540
2  Doblo  20200124_122052
3  Punto  20200124_150801
4  Punto  20200124_134350
5   Tipo  20200124_195955

Use shift with groupby to pull the next date of the same name onto the current row:

df_new['date2'] = df_new.groupby('name')['date'].shift(-1)

output:

    name             date            date2
0  Punto  20200124_083155  20200124_150801
1  Doblo  20200124_163540  20200124_122052
2  Doblo  20200124_122052              NaN
3  Punto  20200124_150801  20200124_134350
4  Punto  20200124_134350              NaN
5   Tipo  20200124_195955              NaN

If you want to see both values in the same column:

df_new['date3'] = df_new['date'] + ';' + df_new['date2']

output:

    name             date            date2                            date3
0  Punto  20200124_083155  20200124_150801  20200124_083155;20200124_150801
1  Doblo  20200124_163540  20200124_122052  20200124_163540;20200124_122052
2  Doblo  20200124_122052              NaN                              NaN
3  Punto  20200124_150801  20200124_134350  20200124_150801;20200124_134350
4  Punto  20200124_134350              NaN                              NaN
5   Tipo  20200124_195955              NaN                              NaN
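
Note that shift only pairs each row with the next row of the same name, so the last row of every group ends up as NaN and a name with three rows (Punto here) is split across two partial strings. If the goal is one row per name, as in the question, a grouped join may be closer to what is needed. This is a minimal sketch using the same sample data as this answer; the variable name combined is only illustrative.

```python
import pandas as pd

# Same sample data as in this answer.
df_new = pd.DataFrame({
    "name": ["Punto", "Doblo", "Doblo", "Punto", "Punto", "Tipo"],
    "date": [
        "20200124_083155", "20200124_163540", "20200124_122052",
        "20200124_150801", "20200124_134350", "20200124_195955",
    ],
})

# Join all dates of each name into one ';'-separated string.
# sort=False keeps names in order of first appearance.
combined = (
    df_new.groupby("name", sort=False)["date"]
          .agg(";".join)
          .reset_index()
)
print(combined)
```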