Python convert string to numpy array with dtype set

Time:03-20

I have a script that saves a NumPy array (numpy.ndarray) to a file, which then looks like this:

[-1.18229054e-01  1.29475027e-01  1.23235974e-02 -6.50683045e-02
 -9.55493823e-02  2.64045410e-02 -2.75938213e-03 -6.67368323e-02
  9.00188163e-02 -2.10145377e-02  2.66856700e-01  9.40015819e-03
 -2.89094746e-01 -3.51716615e-02 -1.66291017e-02  9.14614499e-02
 -1.06583416e-01 -1.23973750e-01 -1.78502172e-01 -1.08104065e-01
  7.14434907e-02  4.79072258e-02 -2.07335409e-02  7.78953582e-02
 -1.55227587e-01 -2.85112500e-01 -1.12559319e-01 -1.44451872e-01
  8.40620697e-02 -1.10208422e-01  1.74363628e-02 -2.76290975e-03
 -1.36272341e-01 -6.82922751e-02  4.50296067e-02 -6.39983825e-03
 -9.41441953e-02 -1.25229180e-01  2.32872620e-01 -5.47889844e-02
 -1.60193309e-01  4.34489064e-02  1.60261065e-01  2.53989100e-01
  1.81385174e-01 -2.92025302e-02  9.10784975e-02 -4.31539603e-02
  1.16627172e-01 -2.18724713e-01  1.27890080e-01  1.65818125e-01
  2.23305345e-01  1.85337320e-01  1.37735754e-01 -1.91762269e-01
  1.88680366e-04  1.87593102e-01 -1.55434877e-01  1.52645275e-01
  3.52016874e-02 -5.45317903e-02 -4.63064313e-02 -1.05244309e-01
  8.93522054e-02  6.31373525e-02 -1.44663259e-01 -1.47942394e-01
  1.54137358e-01 -1.68812618e-01 -4.91221882e-02  1.51609406e-01
 -1.54694289e-01 -2.37069070e-01 -2.09335104e-01  1.02899939e-01
  3.80017221e-01  2.50614733e-01 -2.11414948e-01  8.33256636e-03
 -3.30714136e-02 -4.17182557e-02  3.24694254e-02  5.18384315e-02
 -3.81644070e-02 -7.67701417e-02 -6.28693327e-02  4.59151231e-02
  2.07173750e-01 -1.39254145e-03 -3.61226127e-02  2.65808940e-01
  5.46564311e-02 -1.19565101e-02  4.88518104e-02  1.03624240e-01
 -9.25963074e-02 -6.21471591e-02 -6.01862222e-02  6.11118786e-02
 -3.40493880e-02 -1.42552108e-01 -4.63349074e-02 -3.13070416e-03
 -1.02715015e-01  1.79752141e-01  2.60357540e-02  4.92025986e-02
 -9.11326036e-02  2.52531655e-02 -9.72855166e-02  1.72551628e-02
  1.77889302e-01 -2.71036386e-01  2.24644095e-01  1.69801027e-01
  1.89895695e-03  1.36371464e-01  8.23959708e-02  8.03985670e-02
  4.42997254e-02 -8.99535567e-02 -1.45031705e-01 -1.34309188e-01
  6.71791881e-02 -5.01240678e-02  2.34461389e-02  2.16746666e-02]

Now my question: how can I read this file so that I end up with a NumPy array of floats again? The file is read like this:

# Read the whole file; the with-block closes it afterwards
with open("people/encodings", "r") as encodingFile:
    encodingString = encodingFile.read()
encodingString = encodingString.replace("\n", "")

I have tried a couple of things to convert this string to a NumPy array:

codingArr = np.asarray(encodingString)
codingArr = np.asarray(encodingString, dtype='float64')
codingArr = codingArr.astype('float64')  # for this I first created codingArr with np.asarray
codingArr = np.fromstring(encodingString, dtype='float64', sep=' ')  # I also tried this without removing the \n from the string, using \n as sep

It always gives me either

ValueError: could not convert string to float: [...]

or

DeprecationWarning: string or file could not be read to its end due to unmatched data;

and then it fails later.

What am I missing? How can I convert the string?

CodePudding user response:

Okay, I have found the problem. As hpaulj suggested, I had to remove the brackets. I had tried that before with encodingString[1:-1], but it seems there was another character at the end of the string, so I had to use encodingString[1:-2].
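
Since the exact slice depends on whatever trailing characters the file happens to contain, a variant that does not rely on fixed indices may be more robust. This is only a minimal sketch: the helper name parse_printed_array and the shortened sample string are made up for illustration, and it assumes the file holds the printed form of a one-dimensional float array.

```python
import numpy as np

def parse_printed_array(text):
    """Parse the printed form of a 1-D float array, e.g. "[ 1.0e-01 -2.0e+00]"."""
    # strip() removes surrounding whitespace such as a trailing newline;
    # strip("[]") then removes the brackets without counting characters.
    inner = text.strip().strip("[]")
    # split() with no argument splits on any run of whitespace, including "\n",
    # so the newlines do not have to be replaced beforehand.
    return np.array(inner.split(), dtype=np.float64)

# Shortened, hypothetical sample in the same layout as the file above
sample = "[-1.18229054e-01  1.29475027e-01\n  1.23235974e-02 -6.50683045e-02]\n"
arr = parse_printed_array(sample)
print(arr.shape, arr.dtype)
```

With the real file, the argument would be the unmodified result of encodingFile.read().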

CodePudding user response:

Copying your sample:

In [79]: txt = """[-1.18229054e-01  1.29475027e-01  1.23235974e-02 -6.50683045e-02
    ...:  -9.55493823e-02  2.64045410e-02 -2.75938213e-03 -6.67368323e-02
    ...:   9.00188163e-02 -2.10145377e-02  2.66856700e-01  9.40015819e-03
    ...:  -2.89094746e-01 -3.51716615e-02 -1.66291017e-02  9.14614499e-02
    ...:  -1.06583416e-01 -1.23973750e-01 -1.78502172e-01 -1.08104065e-01
    ...:   7.14434907e-02  4.79072258e-02 -2.07335409e-02  7.78953582e-02
    ...:  -1.55227587e-01 -2.85112500e-01 -1.12559319e-01 -1.44451872e-01
    ...:   8.40620697e-02 -1.10208422e-01  1.74363628e-02 -2.76290975e-03
    ...:  -1.36272341e-01 -6.82922751e-02  4.50296067e-02 -6.39983825e-03
    ...:  -9.41441953e-02 -1.25229180e-01  2.32872620e-01 -5.47889844e-02
    ...:  -1.60193309e-01  4.34489064e-02  1.60261065e-01  2.53989100e-01
    ...:   1.81385174e-01 -2.92025302e-02  9.10784975e-02 -4.31539603e-02
    ...:   1.16627172e-01 -2.18724713e-01  1.27890080e-01  1.65818125e-01
    ...:   2.23305345e-01  1.85337320e-01  1.37735754e-01 -1.91762269e-01
    ...:   1.88680366e-04  1.87593102e-01 -1.55434877e-01  1.52645275e-01
    ...:   3.52016874e-02 -5.45317903e-02 -4.63064313e-02 -1.05244309e-01
    ...:   8.93522054e-02  6.31373525e-02 -1.44663259e-01 -1.47942394e-01
    ...:   1.54137358e-01 -1.68812618e-01 -4.91221882e-02  1.51609406e-01
    ...:  -1.54694289e-01 -2.37069070e-01 -2.09335104e-01  1.02899939e-01
    ...:   3.80017221e-01  2.50614733e-01 -2.11414948e-01  8.33256636e-03
    ...:  -3.30714136e-02 -4.17182557e-02  3.24694254e-02  5.18384315e-02
    ...:  -3.81644070e-02 -7.67701417e-02 -6.28693327e-02  4.59151231e-02
    ...:   2.07173750e-01 -1.39254145e-03 -3.61226127e-02  2.65808940e-01
    ...:   5.46564311e-02 -1.19565101e-02  4.88518104e-02  1.03624240e-01
    ...:  -9.25963074e-02 -6.21471591e-02 -6.01862222e-02  6.11118786e-02
    ...:  -3.40493880e-02 -1.42552108e-01 -4.63349074e-02 -3.13070416e-03
    ...:  -1.02715015e-01  1.79752141e-01  2.60357540e-02  4.92025986e-02
    ...:  -9.11326036e-02  2.52531655e-02 -9.72855166e-02  1.72551628e-02
    ...:   1.77889302e-01 -2.71036386e-01  2.24644095e-01  1.69801027e-01
    ...:   1.89895695e-03  1.36371464e-01  8.23959708e-02  8.03985670e-02
    ...:   4.42997254e-02 -8.99535567e-02 -1.45031705e-01 -1.34309188e-01
    ...:   6.71791881e-02 -5.01240678e-02  2.34461389e-02  2.16746666e-02]"""
In [80]: txt1 = txt.replace("\n", " ")[1:-1]

With a clean enough string, fromstring works:

In [82]: arr = np.fromstring(txt1, dtype=float, sep=" ")
In [83]: arr.shape
Out[83]: (128,)

The errors suggest that your string either contains '...' or still has the '[]' on it.
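
To illustrate the '...' case: NumPy abbreviates the printed form of an array once it exceeds the print threshold (1000 elements by default), so text written with print or str can silently lose values. If you control the script that writes the file, one option is to write it with np.savetxt and read it back with np.loadtxt instead of parsing the printed form. The following is a rough sketch under that assumption; the temporary file path and the test arrays are placeholders, not your data.

```python
import os
import tempfile

import numpy as np

# Above the default print threshold, str() abbreviates the array with "...".
big = np.linspace(0.0, 1.0, 2000)
has_ellipsis = "..." in str(big)
print(has_ellipsis)

# Round trip through a text file without brackets or abbreviation.
original = np.linspace(-1.0, 1.0, 128)  # stand-in for the 128 encoding values
path = os.path.join(tempfile.mkdtemp(), "encodings.txt")  # placeholder path
np.savetxt(path, original)
restored = np.loadtxt(path)
print(restored.shape, restored.dtype)
```

np.save and np.load would work in a similar way if the file does not need to be human-readable.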
