How to concatenate multiple JSON columns in pandas

Time:09-21

I have a df with the following format:

id json_1 json_2 json_3 
1  {a:b}  {a:c}  {c:d}
2  {a:b}  {b:c}  null
3  {a:c}  {c:d}  {a:g}

I want to create a new column that concatenates (i.e., takes the union of) the json_1, json_2, and json_3 columns.

Desired output:

 id json_1 json_2 json_3 final_json
 1  {a:b}  {a:c}  {c:d}   {{a:b}, {a:c}, {c:d}}
 2  {a:b}  {b:c}  null    {{a:b}, {b:c}}
 3  {a:c}  {c:d}  {a:g}   {{a:c}, {c:d}, {a:g}} 

CodePudding user response:

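If I understand correctly, you can collect the non-missing values of every column whose name contains 'json' into a list, row by row: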

df['final_json'] = df.filter(like='json').apply(lambda x: [y for y in x if pd.notna(y)], axis=1)
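
For anyone who wants to try this end to end, here is a minimal sketch. It assumes the cells are plain strings and that the null in the question is a real missing value (None/NaN), which the question does not state. It also names the three source columns explicitly instead of using filter(like='json'), so running it a second time does not pick up final_json itself.

```python
import pandas as pd

# Sample data modelled on the question. Assumption: cells are strings and
# the "null" in row 2 is a real missing value, not the text "null".
df = pd.DataFrame({
    "id": [1, 2, 3],
    "json_1": ["{a:b}", "{a:b}", "{a:c}"],
    "json_2": ["{a:c}", "{b:c}", "{c:d}"],
    "json_3": ["{c:d}", None, "{a:g}"],
})

# Name the source columns explicitly so final_json is never fed back in.
json_cols = ["json_1", "json_2", "json_3"]

# For each row, keep every value that is not missing, in column order.
df["final_json"] = df[json_cols].apply(
    lambda row: [value for value in row if pd.notna(value)],
    axis=1,
)

print(df)
```

The result is a column of Python lists, one per row, with the missing entry in row 2 dropped.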

CodePudding user response:

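If the cells hold hashable values such as strings, and missing values are stored as the literal string 'null', this should do the job. Note that a set removes duplicates but does not keep the column order: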

df['final_json'] = df[['json_1', 'json_2', 'json_3']].apply(lambda x: set(x) - set(['null']), axis=1)

[Out]:
   id json_1 json_2 json_3             final_json
0   1  {a:b}  {a:c}  {c:d}  {{c:d}, {a:c}, {a:b}}
1   2  {a:b}  {b:c}   null         {{b:c}, {a:b}}
2   3  {a:c}  {c:d}  {a:g}  {{a:g}, {c:d}, {a:c}}
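
If the order of the columns matters, a variant along these lines may help. It is only a sketch: the helper name union_in_order is made up for illustration, and it assumes string cells where a missing value may be either the text 'null' or a real NaN/None. It uses dict.fromkeys, which drops duplicates while keeping first-seen order.

```python
import pandas as pd

# Sample data modelled on the question. Assumption: cells are strings and
# the missing value is stored as the literal text "null".
df = pd.DataFrame({
    "id": [1, 2, 3],
    "json_1": ["{a:b}", "{a:b}", "{a:c}"],
    "json_2": ["{a:c}", "{b:c}", "{c:d}"],
    "json_3": ["{c:d}", "null", "{a:g}"],
})


def union_in_order(row):
    """Hypothetical helper: unique non-missing values, in column order."""
    # Drop real missing values first, then the literal string "null".
    kept = [v for v in row if pd.notna(v) and v != "null"]
    # dict.fromkeys removes duplicates and preserves insertion order.
    return list(dict.fromkeys(kept))


df["final_json"] = df[["json_1", "json_2", "json_3"]].apply(
    union_in_order, axis=1
)

print(df)
```

Unlike the set version, this returns lists, so the values appear in json_1, json_2, json_3 order on every run.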