I have tables with many features, and several rows can share the same ID. How can I check each ID and then concatenate the features of identical IDs into one row? For example, take a simple table stored in a dataframe with one feature column and an ID column. The output should gather all features that share an ID and place them in new feature columns; IDs that have no additional features should get a value of zero, as in the result table.
Thanks in advance.
CodePudding user response:
join is what you need. Specify how='outer' if you don't want to lose any rows.
df1.set_index('ID').join(df2.set_index('ID'), how='outer')
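As a minimal sketch of the join above: the question does not show two separate tables, so the frames df1 and df2 and their values below are hypothetical, chosen only to illustrate how an outer join on ID behaves and how the missing entries can be set to zero afterwards.

```python
import pandas as pd

# Hypothetical frames (not from the question): one feature column each, keyed by ID.
df1 = pd.DataFrame({"ID": [1, 2, 3], "dat1": [9, 6, 5]})
df2 = pd.DataFrame({"ID": [1, 2], "dat2": [3, 5]})

# An outer join keeps every ID that appears in either frame.
joined = df1.set_index("ID").join(df2.set_index("ID"), how="outer")

# IDs that are missing from df2 come back as NaN; replace them with 0.
joined = joined.fillna(0)
print(joined)
```

Note that this only applies when the features already live in two dataframes; for a single dataframe with repeated IDs, a reshape is needed instead.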
CodePudding user response:
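IIUC, you can use:
import pandas as pd
# Collect each ID's values into a list, then expand the lists into columns (assumes at most two values per ID)
mask = df.pivot_table(values='dat1', index='ID', aggfunc=list)
dfx = pd.DataFrame(mask['dat1'].tolist(), index=mask.index, columns=['dat1', 'dat2']).fillna(0)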
Output:
ID  dat1  dat2
1   9     3.0
2   6     5.0
3   5     0
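If an ID can repeat more than twice, hard-coding two column names will not work. One possible alternative, sketched here, numbers the rows within each ID with groupby().cumcount() and then pivots on that position. The input frame is an assumption reconstructed from the output shown above (the original row order is not known), and the names pos and wide are made up for illustration.

```python
import pandas as pd

# Assumed input, reconstructed from the output shown above; row order is a guess.
df = pd.DataFrame({"ID": [1, 2, 3, 1, 2], "dat1": [9, 6, 5, 3, 5]})

# Number the repeated rows within each ID: 0 for the first, 1 for the second, ...
pos = df.groupby("ID").cumcount()

# Spread the values into one column per position; IDs with fewer rows get 0.
wide = (
    df.assign(pos=pos)
      .pivot(index="ID", columns="pos", values="dat1")
      .fillna(0)
)

# Rename the position columns 0, 1, ... to dat1, dat2, ...
wide.columns = [f"dat{i + 1}" for i in wide.columns]
print(wide)
```

The number of output columns follows the largest group, so nothing has to be changed when some ID has three or more rows.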