How to change the numbers in a string in a value of a dictionary?

I have the following dictionary and a list of keys. If any of these keys is in the dictionary and its value (a string) contains digits, I want to replace those digits with '#' characters.

dict:

{"name":"Jone","age":"40 years","phone":"88777444"}

keys:

["age","phone"]

output:

{"name":"Jone","age":"## years","phone":"########"}

So far I have been able to pick out those digits, but I don't know how to change them in the dictionary.

My progress:

def convert(input, keys):
    for k in range(len(keys)):
        if keys[k] in input:
            for el in input[keys[k]]:
                if el.isdigit():
                    print(el)

As you can see, I am using Python. If you use a different language, a hint in the right direction would be great.

CodePudding user response：

This is essentially the same as an answer that's already been given but takes an approach that is more easily adapted to having, say, a list of dictionaries:

import re

d = {"name":"Jone","age":"40 years","phone":"88777444"}
keys = ["age","phone"]

for k, v in d.items():
    if k in keys:
        d[k] = re.sub(r'\d', '#', v)
print(d)

Output:

{'name': 'Jone', 'age': '## years', 'phone': '########'}
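
The adaptation to a list of dictionaries that this answer mentions could look roughly like the sketch below. The names records and mask_digits are made up for illustration, and the second record is invented sample data.

```python
import re

def mask_digits(records, keys):
    """Replace every digit with '#' in the values of the given keys, in place."""
    for d in records:
        for k, v in d.items():
            # Only existing keys are reassigned, so the dict size never changes
            if k in keys:
                d[k] = re.sub(r"\d", "#", v)
    return records

# Hypothetical sample data; the first record is the one from the question
records = [
    {"name": "Jone", "age": "40 years", "phone": "88777444"},
    {"name": "Ann", "phone": "555-0100"},
]
print(mask_digits(records, ["age", "phone"]))
```

A dictionary that lacks one of the keys (the second record has no "age") is simply left alone for that key.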

CodePudding user response：

I'd recommend a regular expression.

import re

data = {"name":"Jone","age":"40 years","phone":"88777444"}
keys = ["age","phone"]

for key in keys:
    if key in data:
        # replace the value for this key with a new string in which
        # every digit ('[0-9]') is substituted by a '#'
        data[key] = re.sub('[0-9]', '#', data[key])

# {'name': 'Jone', 'age': '## years', 'phone': '########'}
print(data)

CodePudding user response：

Here is a one-line solution using a dictionary comprehension (with d and keys defined as in the question):

import re

d = {k: re.sub(r"\d", "#", v) if k in keys else v for k, v in d.items()}

CodePudding user response：

Solution without re:

d = {"name": "Jone", "age": "40 years", "phone": "88777444"}
keys = ["age", "phone"]

for k in d.keys() & keys:
    d[k] = "".join("#" if c in "0123456789" else c for c in d[k])

print(d)

Prints:

{'name': 'Jone', 'age': '## years', 'phone': '########'}

CodePudding user response：

I saw two approaches, one based on re and one on if ... else ..., so I'm adding a third based on str.translate.

The re approach can be sped up a little by pre-compiling the regex, but the str.translate approach is still slightly faster (see the timings at the bottom).

import re
from string import digits


digits_re = re.compile(r"\d")

trans_dict = dict.fromkeys(digits, "#")
trans_table = str.maketrans(trans_dict)


def obfuscate_dict_tr(d):
    return {k: v.translate(trans_table) for k, v in d.items()}

def obfuscate_dict_re(d):
    return {k: digits_re.sub("#", v) for k, v in d.items()}

def obfuscate_dict_cond(d):
    return {k: "".join("#" if c in digits else c for c in v) for k, v in d.items()}

To try on a large dictionary I'll use the faker library:

In [ ]: from faker import Faker
   ...:
   ...: fake = Faker()
   ...: num_keys = 10_000
   ...: d = {fake.name(): fake.address() for _ in range(num_keys)}

Let's first check that all methods are equivalent:

In [ ]: d1 = obfuscate_dict_tr(d)
   ...: d2 = obfuscate_dict_re(d)
   ...: d3 = obfuscate_dict_cond(d)
   ...:
   ...: assert d1 == d2
   ...: assert d2 == d3

And then a performance comparison:

In [ ]: %timeit obfuscate_dict_tr(d)
   ...: %timeit obfuscate_dict_re(d)
   ...: %timeit obfuscate_dict_cond(d)
14.1 ms ± 216 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
16 ms ± 255 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
28.7 ms ± 217 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
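
Note that the three benchmark functions above mask the digits in every value, whereas the question only wants the values of selected keys changed. A minimal sketch of the str.translate variant restricted to a list of keys might look like this; the function name obfuscate_keys_tr is an assumption for illustration, not part of the answer.

```python
from string import digits

# Translation table mapping each ASCII digit to '#'
trans_table = str.maketrans(dict.fromkeys(digits, "#"))

def obfuscate_keys_tr(d, keys):
    """Return a new dict with digits masked only in the values of `keys`."""
    wanted = set(keys)  # a set keeps the membership test cheap for long key lists
    return {k: v.translate(trans_table) if k in wanted else v for k, v in d.items()}

d = {"name": "Jone", "age": "40 years", "phone": "88777444"}
print(obfuscate_keys_tr(d, ["age", "phone"]))
# {'name': 'Jone', 'age': '## years', 'phone': '########'}
```

Unlike the in-place loops in the other answers, this returns a new dictionary and leaves the original untouched.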

CodePudding user response：

import re
input_={"name":"Jone",
       "age":"40 years",
       "phone":"88777444"}
key=["age","phone"]

def convert(input_, keys):
    # Mask the digits in the value of every listed key, modifying input_ in place
    for k in keys:
        # Skip keys that are missing from the dictionary
        if k in input_:
            input_[k] = re.sub(r'\d', "#", input_[k])
    return input_

print(convert(input_,key))