Converting list to a desired structure

list_sample = [{'name': 'A', 'data': {'date':['2021-01-01', '2021-02-01', '2021-03-01'], 
                        'credit_score':[800, 890, 895],
                        'spend':[1500, 25000, 2400], 
                        'average_spend':5000}},
               {'name': 'B', 'data': {'date':['2022-01-01', '2022-02-01', '2022-03-01'],
                                   'credit_score':[2800, 390, 8900],
                                   'spend':[15000, 5000, 400], 
                                   'average_spend':3000}},
               {'name': 'C', 'data': {'date':['2022-01-01', '2022-02-01', '2022-03-01'],
                                   'credit_score':[2800, 390, 8900],
                                   'spend':[15000, 5000, 400], 
                                   'average_spend':3000}}]

Above is the list I have. I want to arrive at three lists (one for date, one for credit score, and one for spend), as shown below. While doing this, I want to assert or enforce the order: for example, index 2 of date and index 2 of credit score should come from the same source. Is there a clean way to do this?

Expected output:

date = ['2021-01-01', '2021-02-01', '2021-03-01', '2022-01-01', '2022-02-01', '2022-03-01', '2022-01-01', '2022-02-01', '2022-03-01']
credit_score = [800, 890, 895, 2800, 390, 8900, 2800, 390, 8900]
spend = [1500, 25000, 2400, 15000, 5000, 400, 15000, 5000, 400]

CodePudding user response：

You can for example loop over the list elements:

credit_score = []
spend = []
date = []

for d in list_sample:
    date = date + d['data']['date']
    spend = spend + d['data']['spend']
    credit_score = credit_score + d['data']['credit_score']
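
The loop keeps the three lists aligned only as long as every record's lists have the same length. A minimal sketch of how that could be checked while concatenating is given here; the helper name `flatten_aligned` and the trimmed `sample` data are made up for illustration and are not part of the question:

```python
def flatten_aligned(records, keys=("date", "credit_score", "spend")):
    """Concatenate each key's lists across records, checking alignment.

    Hypothetical helper: raises ValueError if, within one record,
    the lists for the given keys differ in length.
    """
    result = {key: [] for key in keys}
    for record in records:
        data = record["data"]
        # All per-record lists must have one common length.
        lengths = {len(data[key]) for key in keys}
        if len(lengths) != 1:
            raise ValueError(
                f"record {record['name']!r} has lists of unequal length"
            )
        for key in keys:
            result[key].extend(data[key])
    return result


# Trimmed sample in the same shape as list_sample (illustrative only).
sample = [
    {"name": "A", "data": {"date": ["2021-01-01", "2021-02-01"],
                           "credit_score": [800, 890],
                           "spend": [1500, 25000],
                           "average_spend": 5000}},
    {"name": "B", "data": {"date": ["2022-01-01"],
                           "credit_score": [2800],
                           "spend": [15000],
                           "average_spend": 3000}},
]

columns = flatten_aligned(sample)
date = columns["date"]
credit_score = columns["credit_score"]
spend = columns["spend"]
```

Because each record is validated before its values are appended, index i of `date`, `credit_score` and `spend` always comes from the same record and the same position within it.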

CodePudding user response：

One possibility:

import itertools
# chain.from_iterable returns a lazy iterator; list() turns it into a list
date = list(itertools.chain.from_iterable(x["data"]["date"] for x in list_sample))

and similarly for the others.
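
Spelled out for all three keys, this could look roughly like the sketch here. The `flatten` helper and the trimmed `sample` data are assumptions made for the example, not something from the question:

```python
from itertools import chain

# Trimmed sample in the same shape as list_sample (illustrative only).
sample = [
    {"name": "A", "data": {"date": ["2021-01-01", "2021-02-01"],
                           "credit_score": [800, 890],
                           "spend": [1500, 25000],
                           "average_spend": 5000}},
    {"name": "B", "data": {"date": ["2022-01-01"],
                           "credit_score": [2800],
                           "spend": [15000],
                           "average_spend": 3000}},
]


def flatten(records, key):
    """Hypothetical helper: join the lists stored under `key` in record order."""
    return list(chain.from_iterable(r["data"][key] for r in records))


date = flatten(sample, "date")
credit_score = flatten(sample, "credit_score")
spend = flatten(sample, "spend")
```

Each call walks the records in the same order, so the lists line up as long as the per-record lists have equal lengths; nothing here checks that.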

CodePudding user response：

My preference here would be to build a dictionary rather than three discrete variables, which gives you easy access to whichever values you need.

For example:

list_sample = [{'name': 'A', 'data': {'date': ['2021-01-01', '2021-02-01', '2021-03-01'],
                                      'credit_score':[800, 890, 895],
                                      'spend':[1500, 25000, 2400],
                                      'average_spend':5000}},
               {'name': 'B', 'data': {'date': ['2022-01-01', '2022-02-01', '2022-03-01'],
                                      'credit_score':[2800, 390, 8900],
                                      'spend':[15000, 5000, 400],
                                      'average_spend':3000}},
               {'name': 'C', 'data': {'date': ['2022-01-01', '2022-02-01', '2022-03-01'],
                                      'credit_score':[2800, 390, 8900],
                                      'spend':[15000, 5000, 400],
                                      'average_spend':3000}}]

result = {}

KEYS = 'date', 'credit_score', 'spend'

for d in list_sample:
    data = d['data']
    for key in KEYS:
        result.setdefault(key, []).extend(data[key])

for key in KEYS:
    print(key, result[key])

Output:

date ['2021-01-01', '2021-02-01', '2021-03-01', '2022-01-01', '2022-02-01', '2022-03-01', '2022-01-01', '2022-02-01', '2022-03-01']
credit_score [800, 890, 895, 2800, 390, 8900, 2800, 390, 8900]
spend [1500, 25000, 2400, 15000, 5000, 400, 15000, 5000, 400]
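
If the point is to guarantee that index i of every list refers to the same observation, another option is to build (date, credit_score, spend) rows first and split them into columns afterwards. This is only a sketch on a trimmed, made-up `sample`; note that `zip` silently truncates to the shortest list, and the final unpacking assumes at least one row exists:

```python
# Trimmed sample in the same shape as list_sample (illustrative only).
sample = [
    {"name": "A", "data": {"date": ["2021-01-01", "2021-02-01"],
                           "credit_score": [800, 890],
                           "spend": [1500, 25000],
                           "average_spend": 5000}},
    {"name": "B", "data": {"date": ["2022-01-01"],
                           "credit_score": [2800],
                           "spend": [15000],
                           "average_spend": 3000}},
]

# One tuple per observation, so the three values can never drift apart.
rows = [
    row
    for record in sample
    for row in zip(record["data"]["date"],
                   record["data"]["credit_score"],
                   record["data"]["spend"])
]

# Transpose the rows back into three parallel lists.
date, credit_score, spend = (list(column) for column in zip(*rows))
```

Keeping `rows` around also makes it easy to look up a full observation, for example `rows[2]` for the third one.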

CodePudding user response：

With jmespath you can simply do:

import jmespath
data = [{"name": "A", "data": {"date":["2021-01-01", "2021-02-01", "2021-03-01"], 
                        "credit_score":[800, 890, 895],
                        "spend":[1500, 25000, 2400], 
                        "average_spend":5000}},
               {"name": "B", "data": {"date":["2022-01-01", "2022-02-01", "2022-03-01"],
                                   "credit_score":[2800, 390, 8900],
                                   "spend":[15000, 5000, 400], 
                                   "average_spend":3000}},
               {"name": "C", "data": {"date":["2022-01-01", "2022-02-01", "2022-03-01"],
                                   "credit_score":[2800, 390, 8900],
                                   "spend":[15000, 5000, 400], 
                                   "average_spend":3000}}]

date = jmespath.search("[*].data.date[]", data) # the same pattern works for credit_score and spend
# ['2021-01-01', '2021-02-01', '2021-03-01', '2022-01-01', '2022-02-01', '2022-03-01', '2022-01-01', '2022-02-01', '2022-03-01']