I have a NumPy array of strings of various lengths:
import numpy as np
arr = np.array(["abcd", "abcdef", "ab"])
I'm trying to loop through the strings in order to pad them to a constant length. If I do it one at a time, like so:
new_len = 10
left_pad = divmod(new_len - len(arr[0]),2)[0]
right_pad = left_pad + divmod(new_len - len(arr[0]),2)[1]
abcd = arr[0].join(["_"*left_pad, "_"*right_pad])
I get my desired output of:
'___abcd___'
But if I try doing it in a loop, like so:
for i in range(arr.shape[0]):
    left_pad = divmod(new_len - len(arr[i]),2)[0]
    right_pad = left_pad + divmod(new_len - len(arr[i]),2)[1]
    arr[i] = arr[i].join(["_"*left_pad, "_"*right_pad])
I get this different output:
array(['___abc', '__abcd', '____ab'], dtype='<U6')
I'd like to understand why the behaviour is different in these two cases, and how I can get the desired output with a loop. Thanks in advance for any help or suggestions.
CodePudding user response:
Try defining your array as an array of objects, as in the example below:
arr = np.array(["abcd", "abcdef", "ab"], dtype='object')
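According to the output of your example, NumPy created an array of fixed-width Unicode strings with a maximum length of 6 (dtype='<U6'), so any longer string assigned to an element is silently truncated to 6 characters.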
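As a minimal sketch of the difference, the snippet below pads the same strings in a fixed-width array and in an object array. The helper name pad_center and the variable names fixed and flexible are made up for this illustration and are not part of the question's code.

```python
import numpy as np

new_len = 10

def pad_center(s, width, fill="_"):
    # Hypothetical helper: same arithmetic as the question's divmod approach.
    left, extra = divmod(width - len(s), 2)
    return fill * left + s + fill * (left + extra)

# Fixed-width array: the dtype is inferred from the longest string (6 chars).
fixed = np.array(["abcd", "abcdef", "ab"])
fixed[0] = pad_center(str(fixed[0]), new_len)
print(fixed[0])  # the 10-character result is cut down to 6 characters

# Object array: each element is an ordinary Python str of any length.
flexible = np.array(["abcd", "abcdef", "ab"], dtype=object)
for i in range(flexible.shape[0]):
    flexible[i] = pad_center(flexible[i], new_len)
print(flexible)
```

The single-element case in the question works only because the result is stored in a plain Python variable rather than written back into the array.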
CodePudding user response:
To elaborate on daniboy000's answer: <U6 means that the longest string in the array declared here has length 6, so each element can hold at most 6 characters. For testing, you can also use dtype="<U11", which works fine here, but for the general case dtype="object" is more appropriate.
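If you would rather keep a string dtype, one option is to widen the array before the loop. The sketch below assumes the target width is known up front (new_len from the question). It also shows np.char.center, which may serve as a loop-free alternative; the names wide and centered are illustrative only.

```python
import numpy as np

new_len = 10
arr = np.array(["abcd", "abcdef", "ab"])

# Widen the fixed-width dtype so each slot can hold new_len characters.
wide = arr.astype(f"U{new_len}")
for i in range(wide.shape[0]):
    left, extra = divmod(new_len - len(wide[i]), 2)
    wide[i] = "_" * left + str(wide[i]) + "_" * (left + extra)
print(wide)

# Vectorised alternative: centre every element with a fill character.
centered = np.char.center(arr, new_len, "_")
print(centered)
```

When the padding is odd, the side that receives the extra character is decided by np.char.center itself, so check it against the divmod version if that detail matters to you.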
I would have posted this as a comment, but I do not yet have the required reputation.