How do I change array values with dict keys?

Time:05-18

I have a 3D array with the shape (20001, 128, 128) that looks like this:

array([[[48, 48, 48, ..., 48, 48, 48],
        [48, 48, 48, ..., 48, 48, 48],
        [48, 48, 48, ..., 48, 48, 48],
        ...,
       [[12, 12, 12, ..., 12, 12, 12],
        [12, 12, 12, ..., 12, 12, 12],
        [12, 12, 12, ..., 12, 12, 12],
        ...,
        [19, 19, 19, ..., 12, 12, 12],
        [19, 19, 19, ..., 19, 12, 12],
        [19, 19, 19, ..., 19, 19, 19]],

And I have a dict that looks like this:

{1: [1, 39],
 2: [2, 5, 9, 20, 32, 42, 47, 72, 88, 91, 95],
 3: [3, 49, 55],
 4: [4, 24, 34, 40, 53, 76, 81, 90, 96],
 5: [6, 17, 30, 48, 83],
 6: [7, 13, 15, 16, 27, 44, 51, 54, 56, 75],
 7: [8, 50],
 8: [10, 19, 22, 35, 61, 63, 65],
 9: [11, 12, 21, 46, 52, 69, 78, 84, 89],
 10: [14, 36, 74],
 11: [18],
 12: [23, 38, 66, 97],
 13: [25],
 14: [26, 28, 29, 62, 64, 86, 94],
 15: [31, 59, 85],
 16: [33, 80],
 17: [37, 45, 60],
 18: [41, 92, 93],
 19: [43, 77, 79, 82],
 20: [57, 67],
 21: [58],
 22: [68],
 23: [70],
 24: [71],
 25: [73, 87],
 0: [0]}

What I'm after is this: if an array value equals one of the dict values, change the array value to the corresponding key, like this:

array([[[5, 5, 5, ..., 5, 5, 5],
        [5, 5, 5, ..., 5, 5, 5],
        [5, 5, 5, ..., 5, 5, 5],
        ...,
        [9, 9, 9, ..., 9, 9, 9],
        [9, 9, 9, ..., 9, 9, 9],
        [9, 9, 9, ..., 9, 9, 9]],
        ...,
        [8, 8, 8, ..., 9, 9, 9],
        [8, 8, 8, ..., 8, 9, 9],
        [8, 8, 8, ..., 8, 8, 8]],

Because 48 is listed under key 5, 12 is listed under key 9, etc.

CodePudding user response:

arr = [  # Example list of lists - arbitrary values
    [11, 11, 12, 13],
    [24, 24, 24, 35],
    [16, 27, 27, 8]
]

dictionary = {
    1: [1, 39],
    2: [2, 5, 9, 20, 32, 42, 47, 72, 88, 91, 95],
    3: [3, 49, 55],
    4: [4, 24, 34, 40, 53, 76, 81, 90, 96],
    5: [6, 17, 30, 48, 83],
    6: [7, 13, 15, 16, 27, 44, 51, 54, 56, 75],
    7: [8, 50],
    8: [10, 19, 22, 35, 61, 63, 65],
    9: [11, 12, 21, 46, 52, 69, 78, 84, 89],
    10: [14, 36, 74],
    11: [18],
    12: [23, 38, 66, 97],
    13: [25],
    14: [26, 28, 29, 62, 64, 86, 94],
    15: [31, 59, 85],
    16: [33, 80],
    17: [37, 45, 60],
    18: [41, 92, 93],
    19: [43, 77, 79, 82],
    20: [57, 67],
    21: [58],
    22: [68],
    23: [70],
    24: [71],
    25: [73, 87],
    0: [0]
}

def get_key(search_value):
    # Scan every list until one contains the value, then return its key
    for key, num_list in dictionary.items():
        if search_value in num_list:
            return key

for sub_list in arr:
    for index, value in enumerate(sub_list):
        new_val = get_key(value)  # get the key from 'dict'
        sub_list[index] = new_val  # replace old subarray value

print(arr)  # QED - see new array below
# [
#     [9, 9, 9, 6],
#     [4, 4, 4, 8],
#     [6, 6, 6, 7]
# ]

CodePudding user response:

You should reverse your original dictionary:

lookup_dict = {1: [1, 39],
 2: [2, 5, 9, 20, 32, 42, 47, 72, 88, 91, 95],
 3: [3, 49, 55],
 4: [4, 24, 34, 40, 53, 76, 81, 90, 96],
 5: [6, 17, 30, 48, 83],
 6: [7, 13, 15, 16, 27, 44, 51, 54, 56, 75],
 7: [8, 50],
 8: [10, 19, 22, 35, 61, 63, 65],
 9: [11, 12, 21, 46, 52, 69, 78, 84, 89],
 10: [14, 36, 74],
 11: [18],
 12: [23, 38, 66, 97],
 13: [25],
 14: [26, 28, 29, 62, 64, 86, 94],
 15: [31, 59, 85],
 16: [33, 80],
 17: [37, 45, 60],
 18: [41, 92, 93],
 19: [43, 77, 79, 82],
 20: [57, 67],
 21: [58],
 22: [68],
 23: [70],
 24: [71],
 25: [73, 87],
 0: [0]}

reversed_dict = {val: key for key, lst in lookup_dict.items() for val in lst}

Now you could iterate through the input array and set each item in a new array after looking it up in reversed_dict. That would already be more efficient than JRiggles's answer, because each lookup is a single dict access instead of a scan through all the lists to find the new value.
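
As a minimal sketch of that first option (one dict lookup per element), something like the following should work. The trimmed lookup_dict and the tiny input_array are stand-ins chosen for brevity, not the data from the question, and np.ndindex is just one way to visit every index of an array of any shape.

```python
import numpy as np

# Assumption: a trimmed copy of the mapping, just to keep the sketch short.
lookup_dict = {5: [6, 17, 30, 48, 83], 8: [10, 19, 22, 35], 9: [11, 12, 21, 46]}
reversed_dict = {val: key for key, lst in lookup_dict.items() for val in lst}

# Assumption: a small 2D stand-in for the real (20001, 128, 128) array.
input_array = np.array([[48, 12], [19, 48]])

# One dict lookup per element; np.ndindex yields every index tuple of the shape.
output_array = np.empty_like(input_array)
for idx in np.ndindex(input_array.shape):
    output_array[idx] = reversed_dict[int(input_array[idx])]

print(output_array.tolist())  # [[5, 9], [8, 5]]
```

This still runs a Python-level loop over every element, so it stays slow for large arrays even though each individual lookup is cheap.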

However, if you put the values of reversed_dict into an array, such that each key in the dict is an index into the array, then you can use numpy's integer array indexing to look up every element at once and get a result with the same shape as the input. I prefer this approach because it's much faster:

import numpy as np

max_index = max(reversed_dict.keys())

# lookup_array[value] holds the key for that value (default dtype is float64)
lookup_array = np.zeros((max_index + 1,))
for k, v in reversed_dict.items():
    lookup_array[k] = v

And finally:

input_array = np.array([[[48, 48, 48, 48, 48, 48],
        [48, 48, 48, 48, 48, 48],
        [48, 48, 48, 48, 48, 48]],
        
        [[12, 12, 12,  12, 12, 12],
        [12, 12, 12,  12, 12, 12],
        [12, 12, 12,  12, 12, 12]],
        
        [[19, 19, 19,  12, 12, 12],
        [19, 19, 19,  19, 12, 12],
        [19, 19, 19,  19, 19, 19]]])

output_array = lookup_array[input_array]

Which gives:

array([[[5., 5., 5., 5., 5., 5.],
        [5., 5., 5., 5., 5., 5.],
        [5., 5., 5., 5., 5., 5.]],

       [[9., 9., 9., 9., 9., 9.],
        [9., 9., 9., 9., 9., 9.],
        [9., 9., 9., 9., 9., 9.]],

       [[8., 8., 8., 9., 9., 9.],
        [8., 8., 8., 8., 9., 9.],
        [8., 8., 8., 8., 8., 8.]]])

The advantage of this approach is that it works as-is for an input_array of any shape, and it's super fast!
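
To illustrate the "any shape" point, here is a small sketch under a few assumptions: the mapping is trimmed for brevity, the lookup array uses an integer dtype so the output contains integers rather than floats, and -1 is a hypothetical sentinel for values that have no key (the original lookup array uses 0 there).

```python
import numpy as np

# Assumption: a trimmed copy of the mapping, just to keep the sketch short.
lookup_dict = {5: [6, 17, 30, 48, 83], 8: [10, 19, 22, 35], 9: [11, 12, 21, 46]}
reversed_dict = {val: key for key, lst in lookup_dict.items() for val in lst}

# Integer lookup table; -1 is a hypothetical marker for "no key for this value".
lookup_array = np.full(max(reversed_dict) + 1, -1, dtype=np.int64)
for val, key in reversed_dict.items():
    lookup_array[val] = key

flat = np.array([48, 12, 19])                                   # 1D input
cube = np.array([[[48, 12], [19, 19]], [[12, 12], [48, 48]]])   # 3D input

print(lookup_array[flat].tolist())  # [5, 9, 8]
print(lookup_array[cube].shape)     # (2, 2, 2)
```

The result always has the shape of the index array and the dtype of lookup_array, which is why the same line works for 1D, 2D, or 3D input.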

Timing the three approaches:

  1. JRiggles's answer, func1
  2. Looking up values from the reversed dictionary, func2
  3. Indexing into the new numpy array, func3

import timeit

input_array = np.random.randint(0, max_index, (100, 100, 100))

def get_key(search_value):
    for key, num_list in lookup_dict.items():
        if search_value in num_list:
            return key

def func1(arr):
    arr = np.copy(arr)
    for outer_lst in arr:
        for sub_list in outer_lst:
            for index, value in enumerate(sub_list):
                new_val = get_key(value)  # get the key from 'dict'
                sub_list[index] = new_val  # replace old subarray value
    return arr

def func2(arr):
    arr = np.copy(arr)
    for outer_lst in arr:
        for sub_list in outer_lst:
            for index, value in enumerate(sub_list):
                new_val = reversed_dict[value]
                sub_list[index] = new_val
    return arr

def func3(arr):
    return lookup_array[arr]

t1 = timeit.timeit("func1(input_array)", globals=globals(), number=2)
print("t1 =", t1)
t2 = timeit.timeit("func2(input_array)", globals=globals(), number=2)
print("t2 =", t2)
t3 = timeit.timeit("func3(input_array)", globals=globals(), number=2)
print("t3 =", t3)

On my computer, this gives:

t1 = 25.02508409996517
t2 = 1.2259434000588953
t3 = 0.01203500002156943

In other words,

  • JRiggles's approach is about 20x slower than reversing the dictionary and looking up values in the reversed dictionary
  • JRiggles's approach is about 2000x slower than creating the array and using numpy to index into the array.

And this is with a test array that contains ~300x fewer elements than your input array does. With your array, the time savings will be significantly larger.
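
The size comparison can be checked with a little arithmetic. The sketch below also estimates the memory of the output array, which takes the dtype of the lookup array; using np.uint8 for the lookup array is my own suggestion rather than part of the answer above, and it relies on the keys (0 to 25) fitting in one byte.

```python
import math
import numpy as np

full_shape = (20001, 128, 128)  # shape given in the question
test_shape = (100, 100, 100)    # shape used in the timing test

n_full = math.prod(full_shape)  # 327696384 elements
n_test = math.prod(test_shape)  # 1000000 elements
ratio = n_full / n_test         # roughly 328

# Output size in bytes for two possible lookup-array dtypes.
bytes_float64 = n_full * np.dtype(np.float64).itemsize  # about 2.6 GB
bytes_uint8 = n_full * np.dtype(np.uint8).itemsize      # about 0.33 GB

# Tiny demonstration that the result inherits the lookup array's dtype.
lookup_u8 = np.array([0, 1, 2, 3], dtype=np.uint8)  # hypothetical small table
sample = np.array([[3, 0], [1, 2]])
result = lookup_u8[sample]

print(n_full, round(ratio, 1), bytes_float64 / 1e9, bytes_uint8 / 1e9, result.dtype)
```

If memory is tight for the full array, a compact integer dtype for the lookup array is probably worth considering.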
