I have a list of dicts in which one of the dict values is also a list of dicts. I want to flatten it into a list of dicts.
I have some working code and would like opinions on whether or not there is a more idiomatic way of achieving this.
Here is my code:
from pprint import pprint
transactions = [
    {
        "Customer": "Leia",
        "Store": "Hammersmith",
        "Basket": "basket1",
        "items": [
            {"Product": "Cheddar", "Quantity": 2, "GrossSpend": 2.50},
            {"Product": "Grapes", "Quantity": 1, "GrossSpend": 3.00},
        ],
    },
    {
        "Customer": "Luke",
        "Store": "Ealing",
        "Basket": "basket2",
        "items": [
            {
                "Product": "Custard Creams",
                "Quantity": 1,
                "GrossSpend": 3.00,
            }
        ],
    },
]
flattened_transactions = []
for transaction in transactions:
    flattened_transactions.extend(
        {
            "Customer": transaction["Customer"],
            "Store": transaction["Store"],
            "Basket": transaction["Basket"],
            "Product": item["Product"],
            "Quantity": item["Quantity"],
            "GrossSpend": item["GrossSpend"],
        }
        for item in transaction["items"]
    )
pprint(flattened_transactions)
It outputs:
[{'Basket': 'basket1',
  'Customer': 'Leia',
  'GrossSpend': 2.5,
  'Product': 'Cheddar',
  'Quantity': 2,
  'Store': 'Hammersmith'},
 {'Basket': 'basket1',
  'Customer': 'Leia',
  'GrossSpend': 3.0,
  'Product': 'Grapes',
  'Quantity': 1,
  'Store': 'Hammersmith'},
 {'Basket': 'basket2',
  'Customer': 'Luke',
  'GrossSpend': 3.0,
  'Product': 'Custard Creams',
  'Quantity': 1,
  'Store': 'Ealing'}]
Is there a better way of achieving this?
CodePudding user response:
I would use a list comprehension.
[{'Customer': d['Customer'],
  'Store': d['Store'],
  'Basket': d['Basket'],
  **d2}
 for d in transactions
 for d2 in d['items']]
# [{'Customer': 'Leia', 'Store': 'Hammersmith', 'Basket': 'basket1',
# 'Product': 'Cheddar', 'Quantity': 2, 'GrossSpend': 2.5},
# {'Customer': 'Leia', 'Store': 'Hammersmith', 'Basket': 'basket1',
# 'Product': 'Grapes', 'Quantity': 1, 'GrossSpend': 3.0},
# {'Customer': 'Luke', 'Store': 'Ealing', 'Basket': 'basket2',
# 'Product': 'Custard Creams', 'Quantity': 1, 'GrossSpend': 3.0}]
If the 'items' key might be missing, use dict.get with a default value of [] so that such a transaction simply contributes no rows.
[{'Customer': d['Customer'],
  'Store': d['Store'],
  'Basket': d['Basket'],
  **d2}
 for d in transactions
 for d2 in d.get('items', [])]
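To make the difference concrete, here is a small sketch on made-up data (the `sample` list and the variable names are illustrative, not taken from the question). Plain indexing raises KeyError on the record without the key, while dict.get lets that record contribute nothing.

```python
# Illustrative data: the second record has no "items" key at all.
sample = [
    {"Customer": "Leia", "items": [{"Product": "Cheddar"}]},
    {"Customer": "Han"},
]

# Plain indexing fails as soon as it reaches the record without "items".
try:
    [{"Customer": d["Customer"], **d2} for d in sample for d2 in d["items"]]
    raised = False
except KeyError:
    raised = True

# dict.get falls back to an empty list, so that record yields no rows.
rows = [
    {"Customer": d["Customer"], **d2}
    for d in sample
    for d2 in d.get("items", [])
]

print(raised)  # True
print(rows)    # [{'Customer': 'Leia', 'Product': 'Cheddar'}]
```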
Or, more flexibly still, merge the top-level dictionary with each nested dictionary so that all keys from both are present, then remove the extraneous 'items' key with a dictionary comprehension.
[{k: v
  for k, v in d3.items()
  if k != 'items'}
 for d in transactions
 for d2 in d.get('items', [])
 for d3 in ({**d, **d2},)]
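If the nested comprehension is hard to read, roughly the same idea can be wrapped in a small helper. This is only a sketch: the name `flatten_records` and its `nested_key` parameter are made up for illustration, and the sample data is a shortened variant of the question's, with one record deliberately missing its 'items' key.

```python
def flatten_records(records, nested_key="items"):
    """Return one flat dict per nested item, merged with its parent's fields."""
    flat = []
    for record in records:
        # Parent fields shared by every item, minus the nested list itself.
        parent = {k: v for k, v in record.items() if k != nested_key}
        for item in record.get(nested_key, []):
            # Item keys win if the same name appears at both levels.
            flat.append({**parent, **item})
    return flat


transactions = [
    {"Customer": "Leia", "Store": "Hammersmith", "Basket": "basket1",
     "items": [{"Product": "Cheddar", "Quantity": 2, "GrossSpend": 2.50},
               {"Product": "Grapes", "Quantity": 1, "GrossSpend": 3.00}]},
    {"Customer": "Luke", "Store": "Ealing", "Basket": "basket2"},  # no "items"
]

rows = flatten_records(transactions)
print(len(rows))  # 2
print(rows[0])
```

Because the parent fields are copied rather than listed by name, the helper keeps working if new top-level keys are added later.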
CodePudding user response:
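Your code looks good to me. I have modified it slightly in case you want to avoid listing every key, and handled the case where items is missing from the input (such a transaction is kept as a single row without item fields).
flattened_transactions = []
for transaction in transactions:
    # Copy everything except "items" so the input is not mutated.
    base = {k: v for k, v in transaction.items() if k != "items"}
    items = transaction.get("items", [])
    if not items:
        # No items: keep the transaction as a single row.
        flattened_transactions.append(base)
    for item in items:
        # Build a fresh dict per item; reusing one dict would make every row identical.
        flattened_transactions.append({**base, **item})
pprint(flattened_transactions)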
CodePudding user response:
If you just want to flatten the nested lists into a single list of item dicts (without the Customer, Store and Basket fields), it can be done with this simple code:
output = []
for transaction in transactions:
    output = [*output, *transaction["items"]]
The * operator in Python unpacks the elements of an iterable into the enclosing list, tuple or set literal (or function call). It is comparable to the ... (spread) operator in JavaScript. More details are in the documentation.
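A quick sketch of how * unpacking behaves, using made-up data (`nested` and `record` are illustrative names). Note that unpacking a dict yields its keys, which matters here because each transaction is itself a dict.

```python
from itertools import chain

nested = [[1, 2], [3], [4, 5]]

# Repeated unpacking: each pass builds a new list from the old one plus the next sublist.
output = []
for sub in nested:
    output = [*output, *sub]
print(output)  # [1, 2, 3, 4, 5]

# chain.from_iterable performs the same flattening without rebuilding the list each time.
chained = list(chain.from_iterable(nested))
print(chained)  # [1, 2, 3, 4, 5]

# Unpacking a dict iterates over its keys, not its values.
record = {"Customer": "Leia", "Store": "Hammersmith"}
keys = [*record]
print(keys)  # ['Customer', 'Store']
```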