Best way to flatten a list of dicts that contains a nested list of dicts?

Time:11-29

I have a list of dicts in which one of the dict values is also a list of dicts. I want to flatten it into a list of dicts.

I have some working code and would like opinions on whether or not there is a more idiomatic way of achieving this.

Here is my code:

from pprint import pprint

transactions = [
    {
        "Customer": "Leia",
        "Store": "Hammersmith",
        "Basket": "basket1",
        "items": [
            {"Product": "Cheddar", "Quantity": 2, "GrossSpend": 2.50},
            {"Product": "Grapes", "Quantity": 1, "GrossSpend": 3.00},
        ],
    },
    {
        "Customer": "Luke",
        "Store": "Ealing",
        "Basket": "basket2",
        "items": [
            {
                "Product": "Custard Creams",
                "Quantity": 1,
                "GrossSpend": 3.00,
            }
        ],
    },
]
flattened_transactions = []
for transaction in transactions:
    flattened_transactions.extend(
        {
            "Customer": transaction["Customer"],
            "Store": transaction["Store"],
            "Basket": transaction["Basket"],
            "Product": item["Product"],
            "Quantity": item["Quantity"],
            "GrossSpend": item["GrossSpend"],
        }
        for item in transaction["items"]
    )
pprint(flattened_transactions)

It outputs:

[{'Basket': 'basket1',
  'Customer': 'Leia',
  'GrossSpend': 2.5,
  'Product': 'Cheddar',
  'Quantity': 2,
  'Store': 'Hammersmith'},
 {'Basket': 'basket1',
  'Customer': 'Leia',
  'GrossSpend': 3.0,
  'Product': 'Grapes',
  'Quantity': 1,
  'Store': 'Hammersmith'},
 {'Basket': 'basket2',
  'Customer': 'Luke',
  'GrossSpend': 3.0,
  'Product': 'Custard Creams',
  'Quantity': 1,
  'Store': 'Ealing'}]

Is there a better way of achieving this?

CodePudding user response:

I would use a list comprehension.

[{'Customer': d['Customer'], 
  'Store':    d['Store'], 
  'Basket':   d['Basket'], 
  **d2} 
 for d in transactions 
 for d2 in d['items']]
# [{'Customer': 'Leia', 'Store': 'Hammersmith', 'Basket': 'basket1', 
#   'Product': 'Cheddar', 'Quantity': 2, 'GrossSpend': 2.5}, 
#  {'Customer': 'Leia', 'Store': 'Hammersmith', 'Basket': 'basket1', 
#   'Product': 'Grapes', 'Quantity': 1, 'GrossSpend': 3.0},
#  {'Customer': 'Luke', 'Store': 'Ealing', 'Basket': 'basket2', 
#   'Product': 'Custard Creams', 'Quantity': 1, 'GrossSpend': 3.0}]

If the 'items' key may be missing, use dict.get with a default value of []. (An empty 'items' list already works with plain indexing; it simply produces no rows.)

[{'Customer': d['Customer'], 
  'Store':    d['Store'], 
  'Basket':   d['Basket'], 
  **d2} 
 for d in transactions 
 for d2 in d.get('items', [])]
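To make the effect of dict.get concrete, here is a minimal sketch. The sample data (including the "Han" transaction with no 'items' key) and the function name flatten are made up for illustration and are not from the question.

```python
# Hypothetical sample data: the second transaction has no "items" key.
transactions = [
    {"Customer": "Leia", "Store": "Hammersmith", "Basket": "basket1",
     "items": [{"Product": "Cheddar", "Quantity": 2, "GrossSpend": 2.50}]},
    {"Customer": "Han", "Store": "Acton", "Basket": "basket3"},
]


def flatten(transactions):
    """Return one flat dict per nested item.

    Transactions without an "items" key contribute no rows,
    instead of raising KeyError as d["items"] would.
    """
    return [
        {"Customer": d["Customer"], "Store": d["Store"], "Basket": d["Basket"], **d2}
        for d in transactions
        for d2 in d.get("items", [])
    ]


rows = flatten(transactions)
print(len(rows))  # 1
```

With d['items'] in place of d.get('items', []), the second transaction would raise a KeyError.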

Or more flexibly yet, generate a dictionary containing all of the keys in both the top level dictionary and the nested dictionary, then remove the extraneous 'items' key with a dictionary comprehension.

[{k: v 
  for k, v in d3.items() 
  if k != 'items'} 
 for d in transactions 
 for d2 in d.get('items', []) 
 for d3 in ({**d, **d2},)]
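If the nested comprehension is hard to read, the same idea can be sketched as a small generator function. The name flatten_records and its nested_key parameter are my own, chosen for illustration. One small difference from the comprehension: the sketch drops the nested key from the parent only.

```python
def flatten_records(records, nested_key="items"):
    """Yield one flat dict per nested dict found under nested_key."""
    for record in records:
        # Copy the parent fields, leaving out the nested list itself.
        parent = {k: v for k, v in record.items() if k != nested_key}
        for child in record.get(nested_key, []):
            # On a key collision the child's value wins, as with {**d, **d2}.
            yield {**parent, **child}


transactions = [
    {"Customer": "Leia", "Store": "Hammersmith", "Basket": "basket1",
     "items": [
         {"Product": "Cheddar", "Quantity": 2, "GrossSpend": 2.50},
         {"Product": "Grapes", "Quantity": 1, "GrossSpend": 3.00},
     ]},
    {"Customer": "Luke", "Store": "Ealing", "Basket": "basket2",
     "items": [{"Product": "Custard Creams", "Quantity": 1, "GrossSpend": 3.00}]},
]

flat = list(flatten_records(transactions))
```

Because no field names are hard-coded, extra keys on either level are carried through without changing the function.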

CodePudding user response:

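Your code looks good to me. I've modified it slightly to shorten it, and to handle the case where "items" is missing from the input (such a transaction is kept as a single row).

flattened_transactions = []
for transaction in transactions:
    # Copy everything except the nested list so the input is not mutated.
    base = {k: v for k, v in transaction.items() if k != "items"}
    items = transaction.get("items", [])
    if not items:
        # No items: keep the transaction as a single row.
        flattened_transactions.append(base)
    for item in items:
        # Build a new dict per item so the rows do not share state.
        flattened_transactions.append({**base, **item})
pprint(flattened_transactions)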

CodePudding user response:

If you just want to collect the nested items into a single list (without the Customer, Store, and Basket fields), it can be done with this simple code:

output = []

for transaction in transactions:
    output = [*output, *transaction["items"]]

The * operator in Python unpacks an iterable into the surrounding list, tuple, or set display. It is similar to the ... (spread) operator in JavaScript. More details are in the documentation.
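One detail to keep in mind with unpacking: applied to a dict, * iterates over its keys, not its values, while ** unpacks key/value pairs into a new dict. A minimal sketch with made-up sample data:

```python
# Made-up sample transaction for illustration.
transaction = {"Customer": "Luke", "items": [{"Product": "Custard Creams"}]}

# Unpacking a dict with * iterates over its keys.
keys = [*transaction]

# Unpacking a list with * yields its elements.
items = [*transaction["items"]]

# Unpacking a dict with ** copies its key/value pairs into a new dict.
merged = {**transaction["items"][0], "Customer": transaction["Customer"]}

print(keys)    # ['Customer', 'items']
print(merged)  # {'Product': 'Custard Creams', 'Customer': 'Luke'}
```

So [*output, *transaction] on the dicts from the question would build a list of key names rather than a flattened list of dicts.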
