Python: String concatenation working differently inside for loop

Time:02-20

I have a NumPy array of strings of various lengths:

import numpy as np
arr = np.array(["abcd", "abcdef", "ab"])

I'm trying to loop through the strings in order to pad them to a constant length. If I do it one at a time, like so:

new_len = 10

left_pad = divmod(new_len - len(arr[0]),2)[0]
right_pad = left_pad + divmod(new_len - len(arr[0]),2)[1]

abcd = arr[0].join(["_"*left_pad, "_"*right_pad])

I get my desired output of:

'___abcd___'

But if I try doing it in a loop, like so:

for i in range(arr.shape[0]):
    left_pad = divmod(new_len - len(arr[i]),2)[0]
    right_pad = left_pad + divmod(new_len - len(arr[i]),2)[1]
    arr[i] = arr[i].join(["_"*left_pad, "_"*right_pad])

I get this different output:

array(['___abc', '__abcd', '____ab'], dtype='<U6')

I'd like to understand why the behaviour is different in these two cases, and how I can get the desired output with a loop. Thanks in advance for any help or suggestions.

CodePudding user response:

Try defining your array as an array of objects, as in the example below:

arr = np.array(["abcd", "abcdef", "ab"], dtype='object')

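As the output of your example shows, you created an array of fixed-width Unicode strings with a maximum length of 6 characters (dtype='<U6'), so any longer string assigned to an element is silently truncated to 6 characters.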
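As a rough illustration of the truncation described above, the following sketch assigns the same 10-character string to an element of a fixed-width array and to an element of an object array. The variable names `fixed` and `obj` are made up for this example and do not appear in the question.

```python
import numpy as np

# Default construction infers a fixed-width Unicode dtype from the
# longest string ("abcdef", 6 characters).
fixed = np.array(["abcd", "abcdef", "ab"])

# Assigning a 10-character string into a 6-character slot keeps only
# the first 6 characters; NumPy does not raise an error here.
fixed[0] = "___abcd___"
print(fixed[0])  # ___abc

# With dtype=object each element is an ordinary Python str,
# so there is no width limit.
obj = np.array(["abcd", "abcdef", "ab"], dtype=object)
obj[0] = "___abcd___"
print(obj[0])  # ___abcd___
```

This also explains why the one-at-a-time version in the question looked correct: there the result was stored in a plain Python variable, not written back into the array.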

CodePudding user response:

To elaborate on what daniboy000 already pointed out: <U6 means that each element holds at most 6 characters, which is the length of the longest string in the array as declared here. For testing, you can also use dtype="<U11", which is wide enough for the padded strings in this example. For the general case, dtype="object" is more appropriate.
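One possible way to apply the wider-dtype idea to the loop from the question is sketched below. It assumes the target width `new_len` is known before the loop runs, and the name `wide` is chosen only for this example.

```python
import numpy as np

new_len = 10
arr = np.array(["abcd", "abcdef", "ab"])

# Widen the fixed-width dtype so that every padded string fits.
wide = arr.astype(f"U{new_len}")

for i in range(wide.shape[0]):
    # Split the missing characters between both sides; any odd
    # leftover character goes to the right, as in the question.
    left_pad, extra = divmod(new_len - len(wide[i]), 2)
    right_pad = left_pad + extra
    wide[i] = "_" * left_pad + wide[i] + "_" * right_pad

print(wide)  # ['___abcd___' '__abcdef__' '____ab____']
```

If the final width is not known in advance, replacing the `astype` call with `arr.astype(object)` should work the same way, at the cost of storing Python objects instead of a compact string buffer.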

I wanted to post this as a comment, but I do not yet have the required reputation.
