I was writing code to plot some data in a Jupyter notebook, and when I tried to read data from a CSV file, I got a "could not convert string to float" error.
So here's my code:
phot_g = np.genfromtxt('gaia_hyades_search.csv', dtype='str', delimiter=",", skip_header=1, usecols=(6), unpack=True)
phot_bp = np.genfromtxt('gaia_hyades_search.csv', dtype='str', delimiter=",", skip_header=1, usecols=(7), unpack=True)
phot_rp = np.genfromtxt('gaia_hyades_search.csv', dtype='str', delimiter=",", skip_header=1, usecols=(8), unpack=True)
phot_g = phot_g.astype(np.float64)
phot_bp = phot_bp.astype(np.float64)
phot_rp = phot_rp.astype(np.float64)
And here's my error:
ValueError Traceback (most recent call last)
/tmp/ipykernel_63/3948901710.py in <module>
---> 18 phot_g = phot_g.astype(np.float64)
19 phot_bp = phot_bp.astype(np.float64)
20 phot_rp = phot_rp.astype(np.float64)
ValueError: could not convert string to float: ''
I've tried searching the error up, but a lot of the solutions I've gotten have been for numpy.loadtxt, and moreover, they don't seem to help me at all. Any help would be greatly appreciated.
By the way, the error shows up for all three lines of code (phot_g, phot_bp, and phot_rp)
Answer:
Is that the full error message? I get more information when I try to recreate the error:
works:
In [104]: np.array(['1','2']).astype(float)
Out[104]: array([1., 2.])
doesn't:
In [105]: np.array(['1','2','two']).astype(float)
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
Input In [105], in <cell line: 1>()
----> 1 np.array(['1','2','two']).astype(float)
ValueError: could not convert string to float: 'two'
See the 'two'! That tells me exactly what string is causing the problem.
If one or more lines have two delimiters next to each other, the string array can end up with '', which can't be converted to a float:
In [109]: np.array('1,2,,'.split(',')).astype(float)
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
Input In [109], in <cell line: 1>()
----> 1 np.array('1,2,,'.split(',')).astype(float)
ValueError: could not convert string to float: ''
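One way to check whether empty fields are the cause is to compare the string array against '' before converting. This is only a sketch: the `col` array below is made-up data standing in for a column read with dtype='str', and swapping blanks for 'nan' is one possible workaround, not the only one.

```python
import numpy as np

# Hypothetical column as read with dtype='str'; the empty strings
# stand in for missing fields in the CSV.
col = np.array(['1.5', '', '2.25', ''])

# Boolean mask of blank entries, and the row indices where they occur
empty = (col == '')
bad_rows = np.flatnonzero(empty)

# Replace blanks with the string 'nan' so the float conversion succeeds
values = np.where(empty, 'nan', col).astype(np.float64)
```

Here `bad_rows` tells you which rows of the file to inspect, and `values` holds NaN wherever a field was blank.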
genfromtxt has some ability to fill missing data. The pandas CSV reader is even better for that.
genfromtxt with dtype=float (the default) will put np.nan in the array when it can't make a float of the input.
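A minimal sketch of that default behaviour, using an in-memory string in place of the real file. The three-column layout and the values are invented for illustration; with the actual CSV you would pass the filename and the real column index instead.

```python
import io
import numpy as np

# Hypothetical CSV text: a header line plus two data rows,
# the second of which has an empty middle field.
text = "a,b,c\n1.0,2.0,3.0\n4.0,,6.0\n"

# No dtype argument, so genfromtxt uses its default (float)
# and fills the empty field with nan instead of raising.
col = np.genfromtxt(io.StringIO(text), delimiter=",", skip_header=1, usecols=1)

# Drop the missing entries before plotting or computing statistics
mask = np.isnan(col)
clean = col[~mask]
```

Reading the column this way skips the separate astype(np.float64) step, and the NaN entries can be masked out afterwards as shown.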