Generating nested dictionary from a text file

Time:05-08

I have a sample.txt file with the following text.

abcd  10
abcd  1.1.1.1
abcd  2.2.2.2 
abcd  3.3.3.3 
wxyz  20 
wxyz  1.1.1.1 
wxyz  2.2.2.2 
wxyz  4.4.4.4

I want to store the values from each line in a dictionary under specific keys.

The desired output dictionary is:

details = { 
    "customer_names" : ["abcd", "wxyz"],
    "site_ids" : [10, 20],
    "neighbors" : [["1.1.1.1", "2.2.2.2", "3.3.3.3"], ["1.1.1.1", "2.2.2.2", "4.4.4.4"]]
}

This way, each customer can be configured independently, with its own site ID and its own neighbors.

I have tried several approaches, but they all ended up loading every neighbor into one single list, which I cannot process correctly. Please help me build the dictionary with a separate nested list per customer under the neighbors key.

CodePudding user response:

You can use a for loop to go through the lines. Whenever a line starts with a name that is not yet in the customer_names list, add the name, record its site ID, and start a new empty neighbors list. Every later line for that name is appended to the most recent neighbors list. This relies on each customer's lines being contiguous, as they are in the sample:

details = { 
    "customer_names" : [],
    "site_ids" : [],
    "neighbors" : []
}

with open("sample.txt") as f:
    for line in f:
        i, j = line.split()
        if i in details["customer_names"]:
            details["neighbors"][-1].append(j)
        else:
            details["customer_names"].append(i)
            details["neighbors"].append([])
            details["site_ids"].append(int(j))

sample.txt contains:

abcd  10
abcd  1.1.1.1
abcd  2.2.2.2 
abcd  3.3.3.3 
wxyz  20 
wxyz  1.1.1.1 
wxyz  2.2.2.2 
wxyz 4.4.4.4

Resulting details contains:

{'customer_names': ['abcd', 'wxyz'],
 'site_ids': [10, 20],
 'neighbors': [['1.1.1.1', '2.2.2.2', '3.3.3.3'], ['1.1.1.1', '2.2.2.2', '4.4.4.4']]}
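
If a customer's lines might not be contiguous in the file, appending to the last neighbors list would put values under the wrong customer. The following is a minimal sketch of one way around that, assuming (as in the sample) that the first line seen for a name carries its site ID. The function name parse_details, the position lookup, and the interleaved sample text are illustrative and not part of the original answer:

```python
def parse_details(lines):
    """Build the details dictionary from an iterable of 'name value' lines."""
    details = {"customer_names": [], "site_ids": [], "neighbors": []}
    position = {}  # customer name -> index into the three parallel lists
    for line in lines:
        parts = line.split()
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        name, value = parts
        if name in position:
            # Known customer: the value is a neighbor address.
            details["neighbors"][position[name]].append(value)
        else:
            # First line for this customer: the value is assumed to be the site ID.
            position[name] = len(details["customer_names"])
            details["customer_names"].append(name)
            details["site_ids"].append(int(value))
            details["neighbors"].append([])
    return details


# Hypothetical input where the two customers' lines are interleaved.
sample = """abcd  10
abcd  1.1.1.1
wxyz  20
abcd  2.2.2.2
wxyz  1.1.1.1
"""
details = parse_details(sample.splitlines())

# Walk the three parallel lists together, one customer at a time.
for name, site_id, neighbors in zip(
    details["customer_names"], details["site_ids"], details["neighbors"]
):
    print(name, site_id, neighbors)
```

To read from the file instead, an open file object can be passed directly, for example parse_details(f) inside the with block.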

CodePudding user response:

I read your file into a pandas DataFrame. First I find the row indexes where the second column holds an integer (10 and 20); these indexes give the list of customer names. The neighbor lists aaa and bbb are then built from the remaining rows for each name. Note that this code is written for exactly two customers.

import pandas as pd

df = pd.read_csv('sample.txt', header=None, sep=r'\s+')  # split columns on any whitespace

# Row positions whose second column is a plain integer (the site IDs).
index = [i for i in range(len(df[1])) if df[1][i].isdigit()]
name = df.iloc[index, 0].to_list()
# Neighbors are the remaining rows that belong to each customer name.
aaa = df.loc[(df.index != index[0]) & (df.index != index[1]) & (df[0] == name[0])][1].to_list()
bbb = df.loc[(df.index != index[0]) & (df.index != index[1]) & (df[0] == name[1])][1].to_list()

details = {
    "customer_names": name,
    "site_ids": [int(df.iloc[index[0], 1]), int(df.iloc[index[1], 1])],
    "neighbors": [aaa, bbb]
}
Output of details:

{'customer_names': ['abcd', 'wxyz'], 'site_ids': [10, 20], 'neighbors': [['1.1.1.1', '2.2.2.2', '3.3.3.3'], ['1.1.1.1', '2.2.2.2', '4.4.4.4']]}  
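
The pandas code above names one list per customer (aaa, bbb), so it only covers two customers. As a rough sketch of how the same idea could extend to any number of customers, the rows can be split into "first row per name" (assumed to hold the site ID, as in the sample) and "all other rows" (neighbors), and the neighbors grouped by name. The column names name and value and the in-memory sample are assumptions made for illustration:

```python
import io

import pandas as pd

# In-memory copy of the sample so the sketch runs without a file on disk.
sample = io.StringIO(
    "abcd  10\n"
    "abcd  1.1.1.1\n"
    "abcd  2.2.2.2\n"
    "abcd  3.3.3.3\n"
    "wxyz  20\n"
    "wxyz  1.1.1.1\n"
    "wxyz  2.2.2.2\n"
    "wxyz  4.4.4.4\n"
)

# Read both columns as strings so IP addresses and site IDs are treated alike.
df = pd.read_csv(sample, header=None, sep=r"\s+", names=["name", "value"], dtype=str)

# The first row seen for each name is assumed to carry the site ID.
is_first = ~df["name"].duplicated()
site_rows = df[is_first]
neighbor_rows = df[~is_first]

names = site_rows["name"].tolist()
# One list of neighbor addresses per customer, keeping file order.
grouped = neighbor_rows.groupby("name", sort=False)["value"].agg(list)

details = {
    "customer_names": names,
    "site_ids": site_rows["value"].astype(int).tolist(),
    # Customers without any neighbor rows get an empty list.
    "neighbors": [grouped.get(n, []) for n in names],
}
print(details)
```

Passing a file path such as 'sample.txt' to pd.read_csv in place of the StringIO object should work the same way.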