I have a dataframe with two related columns that need to be merged into a single dictionary
column.
Sample Data:
skuId coreAttributes.price coreAttributes.amount
0 100 price 8.84
1 102 price 12.99
2 103 price 9.99
Expected output:
skuId coreAttributes
100 {'price': 8.84}
102 {'price': 12.99}
103 {'price': 9.99}
What I've tried:
planProducts_T = planProducts.filter(regex = 'coreAttributes').T
planProducts_T.columns = planProducts_T.iloc[0]
planProducts_T.iloc[1:].to_dict(orient = 'records')
I get UserWarning: DataFrame columns are not unique, some columns will be omitted.
and this output:
[{'price': 9.99}]
Could someone please help me with this?
CodePudding user response:
You can use a list comprehension with Python's zip:
df['coreAttributes'] = [{k: v} for k,v in
zip(df['coreAttributes.price'],
df['coreAttributes.amount'])]
Output:
skuId coreAttributes.price coreAttributes.amount coreAttributes
0 100 price 8.84 {'price': 8.84}
1 102 price 12.99 {'price': 12.99}
2 103 price 9.99 {'price': 9.99}
If you need to remove the initial columns, use pop:
df['coreAttributes'] = [{k: v} for k,v in
zip(df.pop('coreAttributes.price'),
df.pop('coreAttributes.amount'))]
Output:
skuId coreAttributes
0 100 {'price': 8.84}
1 102 {'price': 12.99}
2 103 {'price': 9.99}
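For anyone who wants to run this end to end, here is a minimal self-contained sketch. The frame is rebuilt from the sample data in the question, so the construction and the dtypes (strings for the key column, floats for the amount) are assumptions rather than the asker's actual data.

```python
import pandas as pd

# Sample data reconstructed from the question (dtypes are assumed).
df = pd.DataFrame({
    "skuId": [100, 102, 103],
    "coreAttributes.price": ["price", "price", "price"],
    "coreAttributes.amount": [8.84, 12.99, 9.99],
})

# pop() removes each source column from df and returns it as a Series,
# so both columns are consumed while the one-item dictionaries are built.
df["coreAttributes"] = [
    {key: value}
    for key, value in zip(df.pop("coreAttributes.price"),
                          df.pop("coreAttributes.amount"))
]

print(df)
```

Because each row is turned into its own dictionary, repeated keys such as 'price' do not collide, which is what caused the "columns are not unique" warning in the transpose-based attempt.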
CodePudding user response:
You can also use apply with axis=1 and then drop the source columns. Note that drop returns a new DataFrame, so assign the result back:
df["coreAttributes"] = df.apply(lambda row: {row["coreAttributes.price"]: row["coreAttributes.amount"]}, axis=1)
df = df.drop(["coreAttributes.price", "coreAttributes.amount"], axis=1)
Output:
skuId coreAttributes
0 100 {'price': 8.84}
1 102 {'price': 12.99}
2 103 {'price': 9.99}
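A runnable sketch of the apply-based approach on the question's sample data, for comparison. The frame construction and the result name `out` are assumptions made for illustration. The result of drop is assigned explicitly, since drop does not modify the frame in place by default.

```python
import pandas as pd

# Sample data reconstructed from the question (dtypes are assumed).
df = pd.DataFrame({
    "skuId": [100, 102, 103],
    "coreAttributes.price": ["price", "price", "price"],
    "coreAttributes.amount": [8.84, 12.99, 9.99],
})

# Build one dictionary per row; axis=1 passes each row to the lambda.
df["coreAttributes"] = df.apply(
    lambda row: {row["coreAttributes.price"]: row["coreAttributes.amount"]},
    axis=1,
)

# drop() returns a new DataFrame, so keep the result in a variable.
out = df.drop(columns=["coreAttributes.price", "coreAttributes.amount"])

print(out)
```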