Create dictionary from lists matching with conditional

Time:10-27

Given these sample lists

main = ['dayn is the one', 'styn is a main', 'tyrn is the third main']
lst2 = ['dayz', 'stzn', 'tyrm']
lst3 = ['stywerwe', 'tyrmadsf', 'dayttt']

I am trying to create a dictionary in which every element of main is a key, and the value for each key is a list of the elements from lst2 or lst3 whose first three characters match the first three characters of that key.

I tried several versions of the following, to no avail:

matched = {}

for x in main:
    for y in lst2:
        if x[:3] == y[:3]:
            matched[x] = y

This code gets me close, but not quite there. It produces this result:

{'dayn is the one': 'dayz', 'tyrn is the third main': 'tyrm'}

My actual data is four lists of variously named locations for my company. The first list holds the proper names of those locations; the other three come from three separate sources whose authors used shortened or otherwise altered versions of those names. If I can match the first 5 characters between the main list and each of the other three, I can build a mapping dictionary that corrects the unconventional names used in the other three sources. With the sample lists below (which match on the first three characters), the expected output is as follows.

Sample list items:

main = ['dayn is the one', 'styn is a main', 'tyrn is the third main']
lst2 = ['dayz', 'stzn', 'tyrm']
lst3 = ['styzerwe', 'tyrmadsf', 'dayttt']
lst4 = ['dayl', 'styyzt', 'tyrl']

Expected Result:

{'dayn is the one':['dayz','dayttt', 'dayl'],'styn is a main':['styzerwe', 'styyzt'],'tyrn is the third main':['tyrm', 'tyrmadsf', 'tyrl']} 

The goal is to use the above dictionary to then correct any versions of the facility name in any dataframe by using this as a mapping object in pandas. In all of the various naming conventions, the first 5 or so characters are the same and are a way to ensure matching unique names.

I researched updating dictionaries, ordered dictionaries, and Python's defaultdict, but found nothing that solves this.

CodePudding user response:

Try:

main = ["dayn is the one", "styn is a main", "tyrn is the third main"]
lst2 = ["dayz", "stzn", "tyrm"]
lst3 = ["styzerwe", "tyrmadsf", "dayttt"]
lst4 = ["dayl", "styyzt", "tyrl"]


tmp = {}
for l in [lst2, lst3, lst4]:
    for v in l:
        tmp.setdefault(v[:3], []).append(v)

out = {v: tmp.get(v[:3], []) for v in main}
print(out)

Prints:

{
    "dayn is the one": ["dayz", "dayttt", "dayl"],
    "styn is a main": ["styzerwe", "styyzt"],
    "tyrn is the third main": ["tyrm", "tyrmadsf", "tyrl"],
}
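
Since the question mentions matching on the first 5 characters for the real data, the same idea can be wrapped in a function with an adjustable prefix length. This is only a sketch under assumptions: the name build_mapping and the prefix_len parameter are hypothetical and not part of the answer above, and collections.defaultdict stands in for setdefault.

```python
from collections import defaultdict


def build_mapping(main, *sources, prefix_len=3):
    """Map each name in `main` to the variants sharing its first `prefix_len` characters."""
    # Group every variant from every source list by its prefix.
    by_prefix = defaultdict(list)
    for source in sources:
        for name in source:
            by_prefix[name[:prefix_len]].append(name)
    # list(...) copies the group so two keys with the same prefix
    # never end up sharing one list object.
    return {name: list(by_prefix.get(name[:prefix_len], [])) for name in main}


main = ["dayn is the one", "styn is a main", "tyrn is the third main"]
lst2 = ["dayz", "stzn", "tyrm"]
lst3 = ["styzerwe", "tyrmadsf", "dayttt"]
lst4 = ["dayl", "styyzt", "tyrl"]

out = build_mapping(main, lst2, lst3, lst4, prefix_len=3)
print(out)
```

With the sample lists, prefix_len=3 reproduces the dictionary printed above. A longer prefix is stricter, so names that agree only on their first three characters would no longer be grouped together.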

CodePudding user response:

from itertools import chain

main = ['dayn is the one', 'styn is a main', 'tyrn is the third main']
lst2 = ['dayz', 'stzn', 'tyrm']
lst3 = ['styzerwe', 'tyrmadsf', 'dayttt']
lst4 = ['dayl', 'styyzt', 'tyrl']

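def create_dict(main, match=3, *rest):
    # Key each main item by its prefix; the value is a [name, matches] pair.
    result = {item[:match]: [item, []] for item in main}
    # Items whose prefix matches no main item are collected here.
    result['unmatched'] = ['unmatched', []]
    for item in chain(*rest):
        (result.get(item[:match]) or result['unmatched'])[1].append(item)
    # Each [name, matches] pair becomes one key/value entry.
    return dict(result.values())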

result = create_dict(main, 3, lst2, lst3, lst4)
print(result)

output:

{'dayn is the one': ['dayz', 'dayttt', 'dayl'], 
 'styn is a main': ['styzerwe', 'styyzt'], 
 'tyrn is the third main': ['tyrm', 'tyrmadsf', 'tyrl'], 
 'unmatched': ['stzn']}

CodePudding user response:

This can also be done in a few lines:

main = ['dayn is the one', 'styn is a main', 'tyrn is the third main']
lst2 = ['dayz', 'stzn', 'tyrm']
lst3 = ['styzerwe', 'tyrmadsf', 'dayttt']
lst4 = ['dayl', 'styyzt', 'tyrl']

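keys = tuple(main)
# Concatenate the three source lists into one sequence of candidates.
data = tuple(lst2 + lst3 + lst4)
# For each key, collect the candidates that start with its first three characters.
elem = [[e for e in data if e.startswith(key[:3])] for key in keys]
result = dict(zip(keys, elem))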

print(result)

[Output]

{'dayn is the one': ['dayz', 'dayttt', 'dayl'], 'styn is a main': ['styzerwe', 'styyzt'], 'tyrn is the third main': ['tyrm', 'tyrmadsf', 'tyrl']}
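
The question's stated goal is to use this dictionary as a mapping object to correct facility names. For a lookup from a variant spelling to the proper name, the dictionary usually needs to be inverted first. A minimal sketch in plain Python, where variant_to_proper and raw_names are hypothetical names and the raw values are made up for illustration:

```python
matched = {
    "dayn is the one": ["dayz", "dayttt", "dayl"],
    "styn is a main": ["styzerwe", "styyzt"],
    "tyrn is the third main": ["tyrm", "tyrmadsf", "tyrl"],
}

# Invert the mapping: each variant spelling points back to its proper name.
variant_to_proper = {
    variant: proper
    for proper, variants in matched.items()
    for variant in variants
}

# Hypothetical raw values as they might appear in one of the other sources.
raw_names = ["dayz", "styyzt", "tyrl", "unknown site"]

# dict.get with the name itself as default leaves unrecognized names unchanged.
cleaned = [variant_to_proper.get(name, name) for name in raw_names]
print(cleaned)
```

In pandas, a dictionary of this inverted shape can be passed to Series.replace, which leaves values without an entry unchanged, or to Series.map, which yields NaN for values that have no entry.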