hi I want to concatenate multiple columns together using pipes as connector in pandas python and if the columns is blank values then skip this columns.
I tried the following code, it does not skip the values when its empty, it will still have a '|' to connect with other fields, what I want is the completely pass the empty field ..
for example: currently it gives me 'N|911|WALLACE|AVE||||MT|031|000600'
while I want 'N|911|WALLACE|AVE|MT|031|000600'
df['key'] = df[['fl_predir','fl_prim_range','fl_prim_name','fl_addr_suffix','fl_postdir','fl_unit_desig','fl_sec_range','fl_st','fl_fips_county','blk']].agg('|'.join, axis=1)
can anybody help me on this?
CodePudding user response:
cols = ['fl_predir','fl_prim_range','fl_prim_name','fl_addr_suffix','fl_postdir','fl_unit_desig','fl_sec_range','fl_st','fl_fips_county','blk']
df['key'] = df[cols].apply(lambda row: '|'.join(x for x in row if x), axis=1, raw=True)
CodePudding user response:
You can use melt
to flat your dataframe, drop null values then group by index and finally concatenate values:
cols = ['fl_predir', 'fl_prim_range', 'fl_prim_name', 'fl_addr_suffix' ,
'fl_postdir', 'fl_unit_desig', 'fl_sec_range', 'fl_st',
'fl_fips_county', 'blk']
df['key'] = (df[cols].melt(ignore_index=False)['value'].dropna()
.astype(str).groupby(level=0).agg('|'.join))
Output:
>>> df['key']
0 N|911|WALLACE|AVE|MT|31|600
Name: key, dtype: object
Alternative (Pandas < 1.1.0
)
df['keys'] = (df[cols].unstack().dropna().astype(str)
.groupby(level=1).agg('|'.join))