How to substitute a value in a NumPy 2D array?

I have a very easy question but somehow I'm having trouble with it...

I'm creating an 81x41 string 2d-array with numpy. I then iterate through all positions of this array and want to put a certain string inside each position.

For some reason, it doesn't assign the variable to the position. It remains empty.

How can I do this simple value assignment? What am I missing?

My code:

def create_discrete_values(self, threshold: list[int]):
    self.map_index_discreet = np.ndarray(shape=(81, 41), dtype=str)

    for i in range(81):
        for j in range(41):
            val = self.map_index[i][j]
            discreet_value = None
            if val <= threshold[0]:
                discreet_value = "Very Low"
            elif val <= threshold[1]:
                discreet_value = "Low"
            elif val <= threshold[2]:
                discreet_value = "Moderate"
            elif val <= threshold[3]:
                discreet_value = "High"
            elif val <= threshold[4]:
                discreet_value = "Very High"
            elif val <= threshold[5]:
                discreet_value = "Extreme"
            else:
                discreet_value = "Very Extreme"

            self.map_index_discreet[i][j] = discreet_value

CodePudding user response：

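You should use dtype=object. An array created with dtype=str has a fixed item length, which NumPy normally takes from the longest string in the data it is built from. Since your array is created empty, there is no data to size it from, so the item length is effectively zero and anything you assign is truncated.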
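A minimal sketch of that fix, assuming a small 2x3 grid in place of the 81x41 array from the question. The `labels` name and the alternating fill pattern are made up for illustration; only the dtype=object part reflects the suggestion above.

```python
import numpy as np

# Hypothetical 2x3 grid standing in for the 81x41 array in the question.
# dtype=object stores references to Python objects, so strings of any
# length fit into a cell without being truncated.
labels = np.empty((2, 3), dtype=object)

for i in range(2):
    for j in range(3):
        # Illustrative rule only: alternate between two of the labels.
        labels[i, j] = "Very Low" if (i + j) % 2 == 0 else "Very Extreme"

print(labels[0, 0])  # Very Low
print(labels[0, 1])  # Very Extreme
```

np.empty is used here instead of calling np.ndarray directly; both leave the cells uninitialized until they are assigned.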

CodePudding user response：

You can use dtype='object', which stores a pointer to a Python object in each cell and lets you put anything into it. If you are concerned about performance, you might want to use dtype='U<length>' instead, which allocates memory for a Unicode string of <length> characters per cell. If you assign a longer string, it is silently cut off (which might not be what you want...).

import numpy as np
map_index_discreet = np.ndarray(shape=(81, 41), dtype='U10')

for i in range(81):
    for j in range(41):
        map_index_discreet[i][j] = str(i) + '_' + str(j)
print(map_index_discreet[3][4])
#>>> '3_4'
map_index_discreet[3][4] = 'longer than 10'
print(map_index_discreet[3][4])
#>>> 'longer tha'

dtype='str' without initializing the array with some data is like defining a dtype='U0', which is quite useless, as you found out. It does make sense, however, if you want to initialize a mixed-type array and force it into a string array, like so:

# forcing the array to dtype='str'
np.array(['abc', None, False, 'abdedf'], dtype='str')
#>>> array(['abc', 'None', 'False', 'abdedf'], dtype='<U6')

As you can see, NumPy looks for the longest string in the data to determine the dtype of the string array.

The performance implications of 'str'/'U<length>' vs. 'object' depend very much on your use case. If you are not dealing with large amounts of data, then you should probably just stick to dtype='object'.
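If performance does matter, one option is to avoid the Python loop altogether. The following is only a sketch of a different technique (np.searchsorted plus fancy indexing), not what the question's code does. It assumes the thresholds are sorted in ascending order; the `discretize` helper, the `LABELS` array and the sample numbers are hypothetical.

```python
import numpy as np

# The seven labels from the question, ordered from lowest to highest band.
LABELS = np.array(
    ["Very Low", "Low", "Moderate", "High", "Very High", "Extreme", "Very Extreme"]
)


def discretize(values, thresholds):
    """Map each value to a label (hypothetical helper).

    Assumes `thresholds` holds six values in ascending order.
    """
    # side="left" yields the first index k with value <= thresholds[k],
    # which matches the chain of `val <= threshold[k]` checks; values above
    # the last threshold get index 6 ("Very Extreme").
    idx = np.searchsorted(thresholds, values, side="left")
    return LABELS[idx]


# Made-up thresholds and input values, for illustration only.
thresholds = [10, 20, 30, 40, 50, 60]
values = np.array([[5, 10, 11], [45, 60, 61]])

result = discretize(values, thresholds)
print(result)
```

Because the result is built from existing strings, NumPy sizes the string dtype from the longest label, so nothing gets truncated.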

See the NumPy documentation on data types for the details.