converting a string with 'nan' into numpy

Time:04-23

I receive the string below from a file

data = 'data: [nan, nan, nan, nan, nan, nan, -10.34, nan, 4.45533]'

and would like to convert it into a numpy array. Is there a good way to do this in Python?

I already tried this

x_values_list = np.fromstring(data[5:], dtype=float, sep=',')

But it just returns [-1].

CodePudding user response:

As suggested in the comments, you need to slice off both brackets (the leading 'data: [' and the trailing ']') so that only the comma-separated numbers remain:

np.fromstring(data[7:-1], dtype=float, sep=',')

A more generic solution might be to use a regex to extract only the part between the brackets:

import re
import numpy as np

a = np.fromstring(re.search(r'(?<=\[)[^\[\]]+(?=\])', data).group(),
                  dtype=float, sep=',')

If you are not sure that there will be a match:

m = re.search(r'(?<=\[)[^\[\]]+(?=\])', data)
if m:
    a = np.fromstring(m.group(), dtype=float, sep=',')
else:
    a = np.array([])

output:

array([      nan,       nan,       nan,       nan,       nan,       nan,
       -10.34   ,       nan,   4.45533])
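
For comparison, here is a minimal sketch of the same extraction without a regex or np.fromstring. It assumes the numbers always sit between the first '[' and the last ']' and are separated by commas; the helper name parse_bracketed_floats is hypothetical and not part of numpy.

```python
import numpy as np

data = 'data: [nan, nan, nan, nan, nan, nan, -10.34, nan, 4.45533]'

def parse_bracketed_floats(text):
    """Return the comma-separated floats found between the first '[' and the last ']'."""
    start = text.find('[')
    end = text.rfind(']')
    # No usable bracket pair: fall back to an empty array, like the regex version
    if start == -1 or end <= start:
        return np.array([], dtype=float)
    inner = text[start + 1:end]
    # float() accepts 'nan' and ignores surrounding whitespace;
    # empty tokens (e.g. from '[]') are skipped
    return np.array([float(tok) for tok in inner.split(',') if tok.strip()],
                    dtype=float)

values = parse_bracketed_floats(data)
print(values)
```

Because the brackets are located at run time, this keeps working if the 'data: ' prefix changes length, which the hard-coded slice does not.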

CodePudding user response:

The question doesn't specify the desired output or any restrictions on how to achieve it...

Assuming your goal is to get a numpy.ndarray similar to this:

[  0.        0.        0.        0.        0.        0.      -10.34
   0.        4.45533]

then you can create a function like

import numpy as np

def string_to_numpy_array(data):
    data = data.replace('data: ', '')
    data = data.replace('[', '')
    data = data.replace(']', '')
    data = data.replace('nan', '0')
    data = data.split(',')
    data = [float(i) for i in data]
    data = np.array(data)
    print(data)
    print(type(data))
    return data

It basically

  • removes 'data: ', '[' and ']'
  • replaces nan with 0
  • splits on the commas and creates a float out of every item
  • transforms the result into a numpy array
  • prints the numpy array and its type for sanity

It's straightforward, and you can easily take out whatever step you don't want (for example, if you want to keep nan, remove the line of the function that replaces it).
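
Rather than deleting a line, the optional nan replacement could also be exposed as a parameter. The following is only a sketch under that assumption: the name parse_data_string and the nan_fill argument are hypothetical, and the input is assumed to have the same 'data: [...]' layout as in the question.

```python
import numpy as np

data = 'data: [nan, nan, nan, nan, nan, nan, -10.34, nan, 4.45533]'

def parse_data_string(text, nan_fill=None):
    """Parse 'data: [a, b, ...]' into a float array, optionally replacing nan."""
    # Drop the prefix and the surrounding brackets
    inner = text.replace('data: ', '').strip().strip('[]')
    # float() understands 'nan' and tolerates the spaces after each comma
    values = np.array([float(tok) for tok in inner.split(',')], dtype=float)
    if nan_fill is not None:
        # Replace only the nan entries; all other values are left untouched
        values = np.where(np.isnan(values), nan_fill, values)
    return values

kept = parse_data_string(data)                  # nan values preserved
filled = parse_data_string(data, nan_fill=0.0)  # nan values replaced by 0
```

Parsing first and replacing afterwards also avoids rewriting the text 'nan' inside the string before it has been split into items.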

If I test it with the string from the question, it prints the array shown above, followed by <class 'numpy.ndarray'>.
