How to remove punctuation in the values and change the dictionary into a nested dictionary

I'm trying to manipulate the string value in a dictionary and then transform this dictionary into a nested dictionary.

The original format I have is the following:

mat_dict= {'Shell': 'Polyester 98%, Spandex 2%', 'Pocket lining': 'Polyester 100%'}

The final output I'm looking for is the following:

output = {'Shell': {'Polyester': 0.98, 'Spandex': 0.02},
 'Pocket lining': {'Polyester': 1.0}}

But for now, I can only get a partially correct result using the following method:

for key, val in mat_dict.items():
    split =  val.split()
    mat_dict[key] = {" ".join(split[:-1]): float(split[-1].strip('%'))/100}

# the result I got:
output ={'Shell': {'Polyester 98%, Spandex': 0.02},
 'Pocket lining': {'Polyester': 1.0}}

# as you can see, my method cannot split "Polyester 98%" from the string and distinguish 98% from it

So, can anyone help me figure out how to deal with values with multiple materials so that I can treat them separately in my final output?

CodePudding user response：

Regex can be helpful here:

import re

output = {}

for component, composition in mat_dict.items():
    output[component] = {}
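    # Each match is a (material, number) pair taken from "<material> <number>%"
    for material, percentage in re.findall(r'([A-Za-z\s]+)\s(\d+)%', composition):
        # strip() drops the leading space captured after ", "
        output[component][material.strip()] = int(percentage) / 100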
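
For a quick check against the sample data, here is a minimal self-contained sketch of the regex approach. The helper name `parse_composition` and the pattern name `PAIR` are made up for illustration, and the pattern assumes material names contain only letters and spaces and that percentages are whole numbers:

```python
import re

mat_dict = {'Shell': 'Polyester 98%, Spandex 2%', 'Pocket lining': 'Polyester 100%'}

# Material name (starts with a letter, may contain spaces), whitespace,
# then digits followed by '%'. Starting on a letter avoids capturing the
# space that follows each comma.
PAIR = re.compile(r'([A-Za-z][A-Za-z\s]*?)\s+(\d+)%')

def parse_composition(composition):
    """Return {material: fraction} for a string like 'Polyester 98%, Spandex 2%'."""
    return {name: int(pct) / 100 for name, pct in PAIR.findall(composition)}

output = {component: parse_composition(text) for component, text in mat_dict.items()}
print(output)
```

Because the name group must begin with a letter, no `strip()` call is needed on the captured material.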

CodePudding user response：

Another way:

from collections import defaultdict

output = defaultdict(dict)
for prod, mats in mat_dict.items():
    for mat in mats.split(", "):
        mat, perc = mat.rsplit(maxsplit=1) 
        output[prod][mat] = float(perc.replace("%", "")) / 100
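
If a plain `dict` is preferred over a `defaultdict` (which silently creates an empty entry when a missing key is looked up), the same split-based idea could be sketched as below. The function name `split_materials` is hypothetical, and the sketch assumes every comma-separated part ends in a number followed by `%`:

```python
mat_dict = {'Shell': 'Polyester 98%, Spandex 2%', 'Pocket lining': 'Polyester 100%'}

def split_materials(mats):
    """Turn 'Polyester 98%, Spandex 2%' into {'Polyester': 0.98, 'Spandex': 0.02}."""
    result = {}
    for part in mats.split(","):
        # strip() tolerates inconsistent spacing around the commas;
        # rsplit(maxsplit=1) cuts off only the last word, the percentage.
        name, perc = part.strip().rsplit(maxsplit=1)
        result[name] = float(perc.rstrip("%")) / 100
    return result

output = {prod: split_materials(mats) for prod, mats in mat_dict.items()}
print(output)
```

Using `float` rather than `int` means a fractional value such as `97.5%` would also parse.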