I have a data file that can be downloaded from here:
Part of the output of x should contain the values below, as part of an np array:
What am I doing wrong?
Edit: the above question has been answered and resolved. However, I would also like to ask how to ensure that the output is float64.
I have edited the np.genfromtxt lines to pass dtype = np.float64, as shown:
x = np.genfromtxt(filename, usecols = [0,1,2,3,4,5,6,7,8,9,10,11,12], dtype = np.float64)
y = np.genfromtxt(filename, usecols = 13, dtype = np.float64)
I have also tried dataset.astype(float64)
but neither has worked. Would appreciate some help again. Thank you!
CodePudding user response:
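You have already read the file's contents into the data variable, so you can parse that instead of opening the file a second time. Note that genfromtxt treats a plain string as a file path, so wrap the text in io.StringIO first:
import io
import numpy as np

def loadData(filename):
    # Read the file once; the with block closes it afterwards.
    with open(filename, "r") as file:
        data = file.read()
    # genfromtxt treats a bare string as a path, so wrap the text in StringIO.
    x = np.genfromtxt(io.StringIO(data), usecols = [0,1,2,3,4,5,6,7,8,9,10,11,12])
    y = np.genfromtxt(io.StringIO(data), usecols = 13)
    print("x: ", x)
    print("y: ", y)
    # y is one-dimensional, so make it a column before joining along axis 1.
    dataset = np.concatenate((x, y[:, np.newaxis]), axis = 1)
    return dataset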
CodePudding user response:
Your code is almost correct. The problem is that loading x gives a two-dimensional array of shape (506, 13), while loading y gives a one-dimensional array of shape (506,). np.concatenate requires all of its inputs to have the same number of dimensions, so after loading y you have to add a new axis to make it two-dimensional. NumPy offers the np.newaxis index for that. The code that solves your problem is:
import numpy as np

def loadData(filename):
    x = np.genfromtxt(filename, usecols = [0,1,2,3,4,5,6,7,8,9,10,11,12])
    y = np.genfromtxt(filename, usecols = 13)
    y = y[:, np.newaxis]
    dataset = np.concatenate((x,y), axis = 1)
    return dataset

if __name__ == "__main__":
    dataset = loadData("housing.data")
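If you want to see the shape mismatch without downloading the file, here is a minimal sketch that uses a small made-up array in place of housing.data. The 4-row sample and the helper name add_column are assumptions for illustration only, not part of the original code:

```python
import numpy as np

def add_column(x, y):
    """Append the 1-D array y as the last column of the 2-D array x."""
    # y[:, np.newaxis] turns shape (n,) into (n, 1), so both inputs are 2-D.
    return np.concatenate((x, y[:, np.newaxis]), axis=1)

# Made-up stand-ins for the feature columns and the target column.
x = np.arange(12, dtype=float).reshape(4, 3)   # shape (4, 3)
y = np.array([10.0, 20.0, 30.0, 40.0])         # shape (4,)

dataset = add_column(x, y)
print(x.shape, y.shape, dataset.shape)
```

np.column_stack((x, y)) should give the same result, since it treats 1-D inputs as columns.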
Hope it helps!
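On the float64 follow-up: as far as I know, genfromtxt already defaults to dtype=float, which is float64 in NumPy, so the quickest check is to print dataset.dtype. Two things that may explain why the attempts did not seem to work: astype returns a new array instead of changing the existing one, and a bare float64 is undefined unless you import it, so it has to be written np.float64. A small sketch, with made-up inline text standing in for the real file:

```python
import io
import numpy as np

# Made-up two-row sample standing in for the real data file.
text = "1 2 3\n4 5 6\n"

data = np.genfromtxt(io.StringIO(text))
print(data.dtype)

# astype returns a converted copy; assign the result to keep it.
as_f32 = data.astype(np.float32)
back_to_f64 = as_f32.astype(np.float64)
print(as_f32.dtype, back_to_f64.dtype, data.dtype)
```

If the dtype already reports float64 but the printed numbers look rounded, that is probably only NumPy's print formatting, which np.set_printoptions(precision=...) controls.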