Hi guys, I have data like this:
[
    {
        'name': 'snow 7',
        'count': 1,
        'rows_processed': None,
        'pipelines': 1
    },
    {
        'name': 'snow 6',
        'count': 1,
        'rows_processed': None,
        'pipelines': 1
    },
    {
        'name': 'snow 6',
        'count': 1,
        'rows_processed': None,
        'pipelines': 1
    },
    {
        'name': 'snow 6',
        'count': 2,
        'rows_processed': None,
        'pipelines': 2
    },
    {
        'name': 'snow 5',
        'count': 2,
        'rows_processed': 4,
        'pipelines': 2
    },
    {
        'name': 'snow 4',
        'count': 2,
        'rows_processed': None,
        'pipelines': 2
    }
]
and I want to sum the values of rows_processed and pipelines, grouped by the name key. For example, for snow 6 the pipelines sum will be 4, and so on. Basically, the final data should look like this:
{
    "Rows Processed": [0, 0, 4, 0],
    "Pipelines Processed": [1, 4, 2, 2]
}
How can I produce data like the above? This is what I have done so far:
rows_processed = {}
pipeline_processed = {}
for batch in batches:
    for label in batch.keys():
        rows_processed[label] = rows_processed.get(batch['rows_processed'], 0) + batch['rows_processed'] if batch['rows_processed'] else 0
for batch in batches:
    for label in batch.keys():
        pipeline_processed[label] = pipeline_processed.get(batch['pipelines'], 0) + batch['pipelines'] if \
            batch['pipelines'] else 0
CodePudding user response:
One way, using a two-level defaultdict and Boolean operations (batch['rows_processed'] or 0 evaluates to 0 when the value is None):
>>> from collections import defaultdict
>>>
>>> d = defaultdict(lambda: defaultdict(int))
>>> for batch in batches:
...     d['Rows Processed'][batch['name']] += batch['rows_processed'] or 0
...     d['Pipelines Processed'][batch['name']] += batch['pipelines'] or 0
...
>>> list(d['Rows Processed'].values())
[0, 0, 4, 0]
>>> list(d['Pipelines Processed'].values())
[1, 4, 2, 2]
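If you want the result directly in the shape from the question, the same idea can be wrapped in a small function. This is only a sketch: the function name summarize is made up for illustration, and it assumes Python 3.7+ so that dicts keep the order in which names are first seen.

```python
from collections import defaultdict

def summarize(batches):
    """Sum rows_processed and pipelines per name (hypothetical helper)."""
    rows = defaultdict(int)
    pipelines = defaultdict(int)
    for batch in batches:
        name = batch["name"]
        # "or 0" turns None into 0 before adding to the running total
        rows[name] += batch["rows_processed"] or 0
        pipelines[name] += batch["pipelines"] or 0
    # Values come out in first-seen name order (dicts preserve insertion order)
    return {
        "Rows Processed": list(rows.values()),
        "Pipelines Processed": list(pipelines.values()),
    }

# Sample data copied from the question
batches = [
    {"name": "snow 7", "count": 1, "rows_processed": None, "pipelines": 1},
    {"name": "snow 6", "count": 1, "rows_processed": None, "pipelines": 1},
    {"name": "snow 6", "count": 1, "rows_processed": None, "pipelines": 1},
    {"name": "snow 6", "count": 2, "rows_processed": None, "pipelines": 2},
    {"name": "snow 5", "count": 2, "rows_processed": 4, "pipelines": 2},
    {"name": "snow 4", "count": 2, "rows_processed": None, "pipelines": 2},
]

result = summarize(batches)
print(result)
# {'Rows Processed': [0, 0, 4, 0], 'Pipelines Processed': [1, 4, 2, 2]}
```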
CodePudding user response:
Hey guys, I resolved the above question with the following code; however, I'm not sure whether this is the right approach. If anyone has a better approach, please let me know.
rows_processed = {}
pipeline_processed = {}
for batch in batches:
    # Treat None as 0 so a missing value never resets the running total
    rows_processed[batch['name']] = rows_processed.get(batch['name'], 0) + (batch['rows_processed'] or 0)
for batch in batches:
    pipeline_processed[batch['name']] = pipeline_processed.get(batch['name'], 0) + (batch['pipelines'] or 0)
print(list(rows_processed.values()))
print(list(pipeline_processed.values()))
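A note on the "total + value if value else 0" form from the question's attempt: a conditional expression binds more loosely than +, so the whole sum sits inside the "if" branch. When the value is None, the result is 0 and any total accumulated so far is lost. The sample data happens not to trigger this, but a None after a non-zero value for the same name would. A tiny illustration, with made-up variable names:

```python
running_total = 5
value = None  # e.g. a batch whose rows_processed is None

# Parsed as: (running_total + value) if value else 0
reset = running_total + value if value else 0

# Converting None to 0 first keeps the running total intact
kept = running_total + (value or 0)

print(reset)  # 0
print(kept)   # 5
```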