How to convert the string between numpy.array and bytes

Time:07-03

I want to convert a string to bytes first, and then convert the bytes to a NumPy array:

utf8 string -> bytes -> numpy.array

And then:

numpy.array -> bytes -> utf8 string

Here is the test:

import numpy as np

string = "any_string_in_utf8: {}".format(123456)

test = np.frombuffer(bytes(string, 'utf-8'))
print(test)

Here is the output:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
/tmp/ipykernel_9055/3077694159.py in <cell line: 5>()
      3 string = "any_string_in_utf8: {}".format(123456)
      4 
----> 5 test = np.frombuffer(bytes(string, 'utf-8'))
      6 print(test)

ValueError: buffer size must be a multiple of element size

How to convert the string between numpy.array and bytes?

CodePudding user response:

Solution:

  • The main problem in your code is that you did not specify a dtype. np.frombuffer() defaults to dtype=float (64-bit, 8 bytes per element), and the encoded string is 26 bytes long, which is not a multiple of 8. That is why it raises ValueError: buffer size must be a multiple of element size.

  • If you pass dtype=np.uint8 instead, it works, because each element is exactly one byte, so a buffer of any length can be interpreted. See the code snippet below:
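The size mismatch described above can be checked directly. This is a minimal sketch that uses the same string as the question and compares the buffer length with the element size of each dtype:

```python
import numpy as np

# Same text as in the question, encoded to UTF-8 bytes
data = "any_string_in_utf8: {}".format(123456).encode("utf-8")

float_size = np.dtype(np.float64).itemsize  # bytes per element for the default dtype
uint8_size = np.dtype(np.uint8).itemsize    # bytes per element for uint8

print(len(data))               # length of the buffer in bytes
print(len(data) % float_size)  # non-zero remainder, so frombuffer raises ValueError
print(len(data) % uint8_size)  # zero remainder, so any buffer length is accepted
```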

# Import all the Important Modules
import numpy as np

# Initialize String
utf_8_string = "any_string_in_utf8: {}".format(123456)

# utf8 string -> bytes -> numpy.array
np_array = np.frombuffer(bytes(utf_8_string, 'utf-8'), dtype=np.uint8)

# Print 'np_array'
print("Numpy Array after Bytes Conversion : -")
print(np_array)

# numpy.array -> bytes -> utf8 string
result_str = np_array.tobytes().decode("utf-8")

# Print Result for the Verification of 'Data Loss'
print("\nOriginal String After Conversion : - \n" + result_str)

To learn more, see the NumPy documentation for np.frombuffer().

# Output of the above code:
Numpy Array after Bytes Conversion : -
[ 97 110 121  95 115 116 114 105 110 103  95 105 110  95 117 116 102  56
  58  32  49  50  51  52  53  54]

Original String After Conversion : - 
any_string_in_utf8: 123456
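
The same round trip can be wrapped in two small helpers. This is a sketch rather than part of the original answer: the names str_to_array and array_to_str are hypothetical, and array_to_str assumes the array holds uint8 values. It also uses a non-ASCII string to show that the array length is the number of UTF-8 bytes, not the number of characters, and it copies the result because np.frombuffer() on a bytes object returns a read-only array.

```python
import numpy as np

def str_to_array(text: str) -> np.ndarray:
    """Hypothetical helper: UTF-8 string -> bytes -> uint8 array."""
    # frombuffer on immutable bytes gives a read-only view; copy() makes it writable
    return np.frombuffer(text.encode("utf-8"), dtype=np.uint8).copy()

def array_to_str(arr: np.ndarray) -> str:
    """Hypothetical helper: uint8 array -> bytes -> UTF-8 string."""
    # Assumes arr has dtype uint8, so each element is one byte of the encoded text
    return arr.tobytes().decode("utf-8")

original = "na\u00efve caf\u00e9: 123456"  # contains two 2-byte characters
arr = str_to_array(original)
restored = array_to_str(arr)

print(len(original), arr.size)  # character count vs. byte count
print(restored == original)     # True if the round trip is lossless
```

If the array is modified before decoding, the bytes may no longer be valid UTF-8, in which case decode() raises UnicodeDecodeError.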