Grouping Python dictionaries in hierarchical form with multiple keys?

Time:12-03

Here is my list of dicts:

[{'subtopic': 'IAM',
  'topic': 'AWS',
  'attachments': ['{"workflow.name": "aws_iam_policies_info","workflow.parameters": {"region": "us-east"}}'],
  'text': 'Sure! I can help with AWS IAM policies info'},
 {'subtopic': 'ECS',
  'topic': 'AWS',
  'attachments': ['{"workflow.name": "aws_ecs_restart_service","workflow.parameters": {"region": "us-east"}}'],
  'text': 'Sure! I can help with restarting AWS ECS Service'},
 {'subtopic': 'EC2',
  'topic': 'AWS',
  'attachments': ['{"workflow.name": "aws_ec2_create_instance","workflow.parameters": {"region": "us-east"}}'],
  'text': 'Sure, I can help creating an EC2 machine'},
 {'subtopic': 'EC2',
  'topic': 'AWS',
  'attachments': ['{"workflow.name": "aws_ec2_security_group_info","workflow.parameters": {"region": "us-east"}}'],
  'text': 'Sure, I can help with various information about AWS security groups'},
 {'subtopic': 'S3',
  'topic': 'AWS',
  'attachments': ['{"workflow.name": "aws_s3_file_copy","workflow.parameters": {"region": "us-west"}}'],
  'text': 'Sure, I can help you with the process of copying on S3'},
 {'subtopic': 'GitHub',
  'topic': 'AWS',
  'attachments': ['{"workflow.name": "view_pull_request","workflow.parameters": {"region": "us-west"}}'],
  'text': 'Sure, I can help with GitHub pull requests'},
 {'subtopic': 'Subtopic Title',
  'topic': 'Topic Title',
  'attachments': [],
  'text': 'This is another fact'},
 {'subtopic': 'Subtopic Title',
  'topic': 'Topic Title',
  'attachments': [],
  'text': 'This is a fact'}]

I would like to group by topic and subtopic to get a final result:

{
    "AWS": {
        "GitHub": {
            'attachments': ['{"workflow.name": "view_pull_request","workflow.parameters": {"region": "us-west"}}'],
            'text': ['Sure, I can help with GitHub pull requests']
            },
        "S3": {
            'attachments': ['{"workflow.name": "aws_s3_file_copy","workflow.parameters": {"region": "us-west"}}'],
            'text': ['Sure, I can help you with the process of copying on S3']
            },
        "EC2": {
            'attachments': ['{"workflow.name": "aws_ec2_create_instance","workflow.parameters": {"region": "us-east"}}',
                            '{"workflow.name": "aws_ec2_security_group_info","workflow.parameters": {"region": "us-east"}}'],
            'text': ['Sure, I can help creating an EC2 machine', 
                     'Sure, I can help with various information about AWS security groups']
        },
        "ECS": {
            'attachments': ['{"workflow.name": "aws_ecs_restart_service","workflow.parameters": {"region": "us-east"}}'],
            'text': ['Sure! I can help with restarting AWS ECS Service']
        },
        "IAM": {
            'attachments': ['{"workflow.name": "aws_iam_policies_info","workflow.parameters": {"region": "us-east"}}'],
            'text': ['Sure! I can help with AWS IAM policies info']
        }
        
    },
    "Topic Title": {
        "Subtopic Title": {
            'attachments': [],
            'text': ['This is another fact']
        }
    }
} 

I am using:

groups = ['topic', 'subtopic', "text", "attachments"]
groups.reverse()

def hierachical_data(data, groups):
    g = groups[-1]
    g_list = []
    for key, items in itertools.groupby(data, operator.itemgetter(g)):
        g_list.append({key:list(items)})
    groups = groups[0:-1]
    if(len(groups) != 0):
        for e in g_list:
            for k, v in e.items():
                e[k] = hierachical_data(v, groups)
    return g_list

print(hierachical_data(filtered_top_facts_dicts, groups))

But I am getting an error about lists being unhashable. Please advise how to transform my JSON into the desired format.

CodePudding user response:

To group the items in your list by topic and subtopic, you can use a combination of the groupby() function from the itertools module and a defaultdict() object from the collections module.

Here is an example of how you could do this:

from itertools import groupby
from collections import defaultdict

# List of dicts to group by topic and subtopic
data = [
    {'subtopic': 'IAM', 'topic': 'AWS', ...},
    {'subtopic': 'ECS', 'topic': 'AWS', ...},
    ...
]

# Group the items by topic and subtopic.
# groupby() only merges adjacent items, so sort by the same key first.
def key_func(item):
    return (item['topic'], item['subtopic'])

grouped_data = defaultdict(dict)
for (topic, subtopic), items in groupby(sorted(data, key=key_func), key=key_func):
    items = list(items)
    # Flatten the per-item attachment lists into a single list
    attachments = [a for item in items for a in item['attachments']]
    text = [item['text'] for item in items]
    grouped_data[topic][subtopic] = {'attachments': attachments, 'text': text}

# Print the grouped data
print(dict(grouped_data))

This code groups the items by their topic and subtopic fields and builds a nested dictionary from the groups. Because the data is sorted before grouping, topics and subtopics appear in alphabetical order rather than in the order shown in your question; the nested structure matches the result you described.
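One caveat with groupby() is that it only merges consecutive items that share a key, so unsorted input can yield the same key more than once. The following is a minimal sketch of that behaviour; the rows list and the key_func helper are made up for illustration and are not taken from the question's data.

```python
from itertools import groupby

# Hypothetical sample rows: the two EC2 entries are not adjacent.
rows = [
    {"topic": "AWS", "subtopic": "EC2"},
    {"topic": "AWS", "subtopic": "S3"},
    {"topic": "AWS", "subtopic": "EC2"},
]


def key_func(row):
    """Return the (topic, subtopic) pair used as the grouping key."""
    return (row["topic"], row["subtopic"])


# Without sorting, the EC2 key is emitted twice because its rows are split.
unsorted_keys = [key for key, _ in groupby(rows, key=key_func)]

# Sorting by the same key first makes equal keys adjacent, so each appears once.
sorted_keys = [key for key, _ in groupby(sorted(rows, key=key_func), key=key_func)]

print(unsorted_keys)
print(sorted_keys)
```

If the original order of the groups matters, a plain loop that fills a dictionary may be a better fit than sorting plus groupby().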

I hope this helps!

CodePudding user response:

To group the list of dictionaries by topic and subtopic, you can create an empty dictionary and then loop through the list of dictionaries to add each item to the appropriate nested level in the dictionary.

result = {}

for item in data:
    topic = item['topic']
    subtopic = item['subtopic']

    if topic not in result:
        result[topic] = {}

    if subtopic not in result[topic]:
        result[topic][subtopic] = {}
        result[topic][subtopic]['attachments'] = []
        result[topic][subtopic]['text'] = []

    result[topic][subtopic]['attachments'].extend(item['attachments'])
    result[topic][subtopic]['text'].append(item['text'])

# Reverse the order of the sub-dictionaries within each topic
for topic, subtopics in result.items():
    result[topic] = dict(reversed(list(subtopics.items())))

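After both loops have completed, the result dictionary is in the format you described: topics and subtopics are the nested keys, and each sub-dictionary holds the combined attachments and text lists. The second loop only reverses the subtopic order so that it matches the order in your expected output. Note that both facts under 'Topic Title' are kept in the text list.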

Output:

{'AWS': {'GitHub': {'attachments': ['{"workflow.name": "view_pull_request","workflow.parameters": {"region": "us-west"}}'],
   'text': ['Sure, I can help with GitHub pull requests']},
  'S3': {'attachments': ['{"workflow.name": "aws_s3_file_copy","workflow.parameters": {"region": "us-west"}}'],
   'text': ['Sure, I can help you with the process of copying on S3']},
  'EC2': {'attachments': ['{"workflow.name": "aws_ec2_create_instance","workflow.parameters": {"region": "us-east"}}',
    '{"workflow.name": "aws_ec2_security_group_info","workflow.parameters": {"region": "us-east"}}'],
   'text': ['Sure, I can help creating an EC2 machine',
    'Sure, I can help with various information about AWS security groups']},
  'ECS': {'attachments': ['{"workflow.name": "aws_ecs_restart_service","workflow.parameters": {"region": "us-east"}}'],
   'text': ['Sure! I can help with restarting AWS ECS Service']},
  'IAM': {'attachments': ['{"workflow.name": "aws_iam_policies_info","workflow.parameters": {"region": "us-east"}}'],
   'text': ['Sure! I can help with AWS IAM policies info']}},
 'Topic Title': {'Subtopic Title': {'attachments': [],
   'text': ['This is another fact', 'This is a fact']}}}
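The same loop can be wrapped in a small function that accepts the grouping keys as a parameter, which is closer to the groups list used in the question. This is only a sketch under assumptions: the function name group_by_keys and the sample records are hypothetical, and the leaf fields are assumed to be 'attachments' (a list) and 'text' (a string), as in the question's data.

```python
def group_by_keys(records, keys=("topic", "subtopic")):
    """Nest records by the given keys and collect attachments and text at the leaves."""
    result = {}
    for record in records:
        node = result
        # Walk down (creating as needed) one nesting level per grouping key.
        for key in keys:
            node = node.setdefault(record[key], {})
        # At the leaf, merge the attachment lists and collect the text strings.
        node.setdefault("attachments", []).extend(record["attachments"])
        node.setdefault("text", []).append(record["text"])
    return result


# Hypothetical sample data with the same shape as the question's records.
sample = [
    {"topic": "AWS", "subtopic": "EC2", "attachments": ["a1"], "text": "first"},
    {"topic": "AWS", "subtopic": "S3", "attachments": ["a2"], "text": "second"},
    {"topic": "AWS", "subtopic": "EC2", "attachments": [], "text": "third"},
]

grouped = group_by_keys(sample)
print(grouped)
```

Because dictionaries preserve insertion order, the groups come out in the order they are first seen, and no sorting is required.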