I have a very easy question but somehow I'm having trouble with it...
I'm creating an 81x41 2D string array with NumPy. I then iterate over all positions of this array and want to put a certain string into each position.
For some reason, it doesn't assign the variable to the position. It remains empty.
How can I do this simple value assignment? What am I missing?
My code:
def create_discrete_values(self, threshold: list[int]):
    self.map_index_discreet = np.ndarray(shape=(81, 41), dtype=str)
    for i in range(81):
        for j in range(41):
            val = self.map_index[i][j]
            discreet_value = None
            if val <= threshold[0]:
                discreet_value = "Very Low"
            elif val <= threshold[1]:
                discreet_value = "Low"
            elif val <= threshold[2]:
                discreet_value = "Moderate"
            elif val <= threshold[3]:
                discreet_value = "High"
            elif val <= threshold[4]:
                discreet_value = "Very High"
            elif val <= threshold[5]:
                discreet_value = "Extreme"
            else:
                discreet_value = "Very Extreme"
            self.map_index_discreet[i][j] = discreet_value
CodePudding user response:
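You should use dtype=object. An array created with dtype=str has a fixed item width, so it can only hold strings that are no longer than its longest initial element. Since your array is created without any data, that width is effectively zero (at most one character, depending on the NumPy version), so the strings you assign do not fit.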
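As a minimal sketch of that suggestion (the 3x2 shape and the variable name `labels` are only illustrative, not taken from the question), an object array accepts strings of any length because each cell just holds a reference to a Python object:

```python
import numpy as np

# Object arrays store references to Python objects,
# so the length of the assigned strings is not limited.
labels = np.empty((3, 2), dtype=object)  # small hypothetical shape

labels[0, 0] = "Very Low"
labels[2, 1] = "Very Extreme"

print(labels[0, 0])
print(labels[2, 1])
```

Cells that were never assigned hold None here, which is worth keeping in mind if later code expects every cell to be a string.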
CodePudding user response:
You can use dtype='object', which stores a pointer in each cell and lets you put anything into it. If you are concerned about performance, you might want to use dtype='U<length>' instead, which allocates memory for a Unicode string of up to <length> characters. If you assign a longer string, it is silently cut off (which might not be what you want...).
import numpy as np

# Fixed-width Unicode array: each cell holds at most 10 characters
map_index_discreet = np.ndarray(shape=(81, 41), dtype='U10')
for i in range(81):
    for j in range(41):
        map_index_discreet[i][j] = str(i) + '_' + str(j)

print(map_index_discreet[3][4])
#>>> 3_4

# Strings longer than 10 characters are truncated on assignment
map_index_discreet[3][4] = 'longer than 10'
print(map_index_discreet[3][4])
#>>> longer tha
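If you go the fixed-width route, one way to avoid accidental truncation is to derive the width from the labels you intend to store. This is only a sketch: the names `labels`, `width`, and `grid` are illustrative, and the label list is copied from the question.

```python
import numpy as np

labels = ["Very Low", "Low", "Moderate", "High",
          "Very High", "Extreme", "Very Extreme"]

# Use the longest label to size the fixed-width Unicode dtype.
width = max(len(label) for label in labels)

# Start with empty strings; every label now fits without truncation.
grid = np.full((81, 41), "", dtype=f"U{width}")
grid[0, 0] = "Very Extreme"

print(grid.dtype, grid[0, 0])
```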
dtype='str' without initializing the array with some data is like defining dtype='U0', which is quite useless, as you found out. It does make sense, however, if you want to initialize a mixed-type array and force it into a string array, like so:
# forcing the array to dtype='str'
np.array(['abc', None, False, 'abdedf'], dtype='str')
#>>> array(['abc', 'None', 'False', 'abdedf'], dtype='<U6')
As you can see, NumPy looks for the longest string in the data to determine the dtype of the string array.
The performance implications of 'str'/'U<length>' vs. 'object' depend very much on your use case. If you are not dealing with large amounts of data, you should probably just stick to dtype='object'.
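If performance does matter, the nested loop itself can probably be avoided. The following is a hedged sketch rather than a drop-in replacement: `discretize`, `values`, and the example thresholds are made up for illustration, and it assumes the thresholds are sorted in ascending order. np.searchsorted with side="left" returns the index of the first threshold that is greater than or equal to each value, which mirrors the `val <= threshold[k]` chain in the question.

```python
import numpy as np

# One label per interval: 6 thresholds produce 7 intervals.
labels = np.array(["Very Low", "Low", "Moderate", "High",
                   "Very High", "Extreme", "Very Extreme"])

def discretize(values, thresholds):
    """Map each value to a label; thresholds must be sorted ascending."""
    idx = np.searchsorted(np.asarray(thresholds), values, side="left")
    return labels[idx]

# Hypothetical data standing in for self.map_index and threshold
values = np.array([[0.5, 1.0], [3.2, 99.0]])
thresholds = [1, 2, 3, 4, 5, 6]

result = discretize(values, thresholds)
print(result)
```

Because `labels` is built from the full list of strings, NumPy picks a width that fits the longest label, so nothing is truncated in the result.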
See the relevant NumPy documentation on data types for more details.