I am building something to sort and add values from an API response. I ended up going with an interesting structure, and I just want to make sure there's nothing inherently wrong with it.
from collections import defaultdict

# Helps create a unique nested default dict object
# for code readability
def dict_counter():
    return defaultdict(lambda: 0)

# Creates the nested defaultdict object
ad_data = defaultdict(dict_counter)

# Sorts each instance into its channel, and
# adds the dict values incrementally
for ad in example:
    # Collects channel and metrics
    channel = ad['ad_group']['type_']
    metrics = dict(
        impressions=int(ad['metrics']['impressions']),
        clicks=int(ad['metrics']['clicks']),
        cost=int(ad['metrics']['cost_micros'])
    )
    # Adds the variables (+= accumulates across ads in the same channel)
    ad_data[channel]['impressions'] += metrics['impressions']
    ad_data[channel]['clicks'] += metrics['clicks']
    ad_data[channel]['cost'] += metrics['cost']
The output is as desired. Again, I just want to make sure I'm not reinventing the wheel or doing something really inefficient here.
defaultdict(<function __main__.dict_counter()>,
{'DISPLAY_STANDARD': defaultdict(<function __main__.dict_counter.<locals>.<lambda>()>,
{'impressions': 14, 'clicks': 4, 'cost': 9}),
'SEARCH_STANDARD': defaultdict(<function __main__.dict_counter.<locals>.<lambda>()>,
{'impressions': 6, 'clicks': 2, 'cost': 4})})
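The defaultdict wrappers make that repr fairly noisy. A minimal sketch of converting the result to plain dicts before printing, using a small hand-built stand-in for ad_data (the helper name to_plain is hypothetical, not something from a library):

```python
from collections import defaultdict


def to_plain(nested):
    # Convert a defaultdict of defaultdicts into ordinary dicts for display.
    return {channel: dict(metrics) for channel, metrics in nested.items()}


# Hand-built stand-in with the same shape as ad_data (illustrative values).
ad_data = defaultdict(lambda: defaultdict(int))
ad_data['DISPLAY_STANDARD']['clicks'] += 4
ad_data['SEARCH_STANDARD']['clicks'] += 2

plain = to_plain(ad_data)
print(plain)
```

The conversion only affects how the data is displayed; the aggregation itself is unchanged.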
Here's what my input data would look like:
example = [
    {
        'campaign': {
            'resource_name': 'customers/12345/campaigns/12345',
            'status': 'ENABLED',
            'name': 'test_campaign_2'
        },
        'ad_group': {
            'resource_name': 'customers/12345/adGroups/12345',
            'type_': 'DISPLAY_STANDARD'
        },
        'metrics': {
            'clicks': '1', 'cost_micros': '3', 'impressions': '5'
        },
        'ad_group_ad': {
            'resource_name': 'customers/12345/adGroupAds/12345~12345',
            'ad': {
                'resource_name': 'customers/12345/ads/12345'
            }
        }
    },
    {
        'campaign': {
            'resource_name': 'customers/12345/campaigns/12345',
            'status': 'ENABLED',
            'name': 'test_campaign_2'
        },
        'ad_group': {
            'resource_name': 'customers/12345/adGroups/12345',
            'type_': 'SEARCH_STANDARD'
        },
        'metrics': {
            'clicks': '2', 'cost_micros': '4', 'impressions': '6'
        },
        'ad_group_ad': {
            'resource_name': 'customers/12345/adGroupAds/12345~12345',
            'ad': {
                'resource_name': 'customers/12345/ads/12345'
            }
        }
    },
    {
        'campaign': {
            'resource_name': 'customers/12345/campaigns/12345',
            'status': 'ENABLED',
            'name': 'test_campaign_2'
        },
        'ad_group': {
            'resource_name': 'customers/12345/adGroups/12345',
            'type_': 'DISPLAY_STANDARD'
        },
        'metrics': {
            'clicks': '3', 'cost_micros': '6', 'impressions': '9'
        },
        'ad_group_ad': {
            'resource_name': 'customers/12345/adGroupAds/12345~12345',
            'ad': {
                'resource_name': 'customers/12345/ads/12345'
            }
        }
    }
]
Thanks!
CodePudding user response:
There's nothing wrong with the code you have, but the code for copying the values from one dict to another is a bit repetitive and a little vulnerable to mis-pasting a key name. I'd suggest putting the mapping between the keys in a dict so that there's a single source of truth for what keys you're copying from the input metrics dicts and what keys that data will live under in the output:
fields = {
    # Map input metrics dicts to per-channel metrics dicts.
    'impressions': 'impressions',  # same
    'clicks': 'clicks',            # same
    'cost_micros': 'cost',         # different
}
Since each dict in your output is going to contain the keys from fields.values(), you have the option of creating these as plain dicts with their values initialized to zero rather than as defaultdicts (this doesn't have any major benefits over defaultdict(int), but it does make pretty-printing a bit easier):
# Create defaultdict of per-channel metrics dicts.
ad_data = defaultdict(lambda: dict.fromkeys(fields.values(), 0))
and then you can do a simple nested iteration to populate ad_data:
# Aggregate input metrics into per-channel metrics.
for ad in example:
    channel = ad['ad_group']['type_']
    for k, v in ad['metrics'].items():
        ad_data[channel][fields[k]] += int(v)
which for your example input produces:
{'DISPLAY_STANDARD': {'impressions': 14, 'clicks': 4, 'cost': 9},
'SEARCH_STANDARD': {'impressions': 6, 'clicks': 2, 'cost': 4}}
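Putting the pieces above together, here is a self-contained sketch on a trimmed-down input. It assumes each record only needs its ad_group and metrics keys, and it skips any metrics key that is not listed in fields (a plain fields[k] lookup would raise a KeyError on such a key). The sample records, the extra conversions key, and the skip behavior are illustrative assumptions rather than part of your code:

```python
from collections import defaultdict

# Map input metric names to output metric names.
fields = {
    'impressions': 'impressions',
    'clicks': 'clicks',
    'cost_micros': 'cost',
}

# Trimmed-down sample records (illustrative, same shape as the real input).
sample = [
    {'ad_group': {'type_': 'DISPLAY_STANDARD'},
     'metrics': {'clicks': '1', 'cost_micros': '3', 'impressions': '5'}},
    {'ad_group': {'type_': 'SEARCH_STANDARD'},
     'metrics': {'clicks': '2', 'cost_micros': '4', 'impressions': '6'}},
    {'ad_group': {'type_': 'DISPLAY_STANDARD'},
     'metrics': {'clicks': '3', 'cost_micros': '6', 'impressions': '9',
                 'conversions': '1'}},  # hypothetical extra key, skipped below
]

# Each new channel starts with every output key set to 0.
ad_data = defaultdict(lambda: dict.fromkeys(fields.values(), 0))

for ad in sample:
    channel = ad['ad_group']['type_']
    for k, v in ad['metrics'].items():
        if k in fields:  # assumption: silently ignore unmapped metrics
            ad_data[channel][fields[k]] += int(v)

result = dict(ad_data)
print(result)
```

Whether to skip unknown keys or let the KeyError surface is a design choice: failing loudly is safer if an unexpected metric would indicate a problem with the API response.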
CodePudding user response:
I think you overthought this one a bit. Consider this simple function that sums two dicts:
def add_dicts(a, b):
    # Sum the values of two dicts key by key; missing keys count as 0.
    return {
        k: int(a.get(k, 0)) + int(b.get(k, 0))
        for k in a | b
    }
Using this func, the main loop gets trivial:
stats = {}
for obj in example:
    t = obj['ad_group']['type_']
    stats[t] = add_dicts(stats.get(t, {}), obj['metrics'])
That's it. No defaultdicts needed.
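For completeness, a runnable sketch of this approach on a trimmed-down input (the sample records are illustrative). Two things to be aware of: the a | b dict union needs Python 3.9 or later, and the result keeps the input key names, so the cost ends up under cost_micros rather than cost:

```python
def add_dicts(a, b):
    # Sum two dicts key by key, treating missing keys as 0.
    return {
        k: int(a.get(k, 0)) + int(b.get(k, 0))
        for k in a | b  # dict union, Python 3.9+
    }


# Trimmed-down sample records (illustrative, same shape as the real input).
sample = [
    {'ad_group': {'type_': 'DISPLAY_STANDARD'},
     'metrics': {'clicks': '1', 'cost_micros': '3', 'impressions': '5'}},
    {'ad_group': {'type_': 'SEARCH_STANDARD'},
     'metrics': {'clicks': '2', 'cost_micros': '4', 'impressions': '6'}},
    {'ad_group': {'type_': 'DISPLAY_STANDARD'},
     'metrics': {'clicks': '3', 'cost_micros': '6', 'impressions': '9'}},
]

stats = {}
for obj in sample:
    t = obj['ad_group']['type_']
    stats[t] = add_dicts(stats.get(t, {}), obj['metrics'])

print(stats)
```

On older Python versions, iterating over {**a, **b} in place of a | b should behave the same way.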