I need to make a long-to-wide transformation (see image below) using
I wrote this code, but unfortunately it does not work.
Code :
import numpy as np
import pandas as pd

df = pd.DataFrame({'Id': ['Id001', 'Id001', 'Id002', 'Id003', 'Id003', 'Id003'],
                   'val1': [np.nan, 'B', 3, 'H', np.nan, 'J'],
                   'val2': ['N', np.nan, 'M', 2, 'K', 'I'],
                   'val3': [5, 'E', 'P', 'L', np.nan, 'R']})

# Join the values of each Id with ' | ' (this is the step that raises the error)
df = (df.groupby('Id')
        .agg(
            val1=('val1', ' | '.join),
            val2=('val2', ' | '.join),
            val3=('val3', ' | '.join))
        .rename_axis(None))
df
Here is the error I'm getting:
TypeError: sequence item 0: expected str instance, float found
Do you have any suggestions or solutions?
CodePudding user response:
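The error is caused by the NaN values. NaN is a floating-point value, and str.join only accepts strings, so joining a group that contains NaN raises the TypeError (the integers in these columns would fail for the same reason). One solution is to explicitly cast every value to a string before joining: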
df.filter(like='val').astype(str).groupby(df['Id']).agg('|'.join)
          val1   val2     val3
Id
Id001    nan|B  N|nan      5|E
Id002        3      M        P
Id003  H|nan|J  2|K|I  L|nan|R
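
Note that casting to str keeps the missing values in the result as the literal text nan. If those placeholders are unwanted, one alternative is to drop the NaN values of each group before joining. The following is a minimal sketch, assuming the same df as in the question; join_non_null is a hypothetical helper name chosen for illustration, not part of pandas:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'Id': ['Id001', 'Id001', 'Id002', 'Id003', 'Id003', 'Id003'],
    'val1': [np.nan, 'B', 3, 'H', np.nan, 'J'],
    'val2': ['N', np.nan, 'M', 2, 'K', 'I'],
    'val3': [5, 'E', 'P', 'L', np.nan, 'R'],
})


def join_non_null(values):
    # Hypothetical helper: drop missing values, cast the rest
    # (including integers) to str, then join them with ' | '.
    return ' | '.join(values.dropna().astype(str))


# Same named aggregation as in the question, with the NaN-safe helper
wide = (df.groupby('Id')
          .agg(
              val1=('val1', join_non_null),
              val2=('val2', join_non_null),
              val3=('val3', join_non_null))
          .rename_axis(None))
print(wide)
```

With this approach a group whose values are all NaN would end up as an empty string, so whether it fits depends on how missing data should appear in the wide table.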