I'm using Python to build a simple data-mining application, and I'm coding it in Google Colab. My function uses an if/elif chain. Here is the code:
def data_pred(data):
    # split(data)
    X_train, y_train, X_test, y_test = split(data)
    linreg = LinearRegression()
    linreg.fit(X_train, y_train)
    y_preds = linreg.predict(X_test)
    for x in range(17):
        y_test = np.insert(y_test, len(y_test), y_preds[len(y_preds)-1])
        X_test = np.insert(X_test, len(X_test), y_test[len(X_test)-1])
        X_test = np.array(X_test).reshape(X_test.size, 1)
        y_preds = linreg.predict(X_test)
    plt.scatter(X_test, y_test)
    plt.scatter(X_test, y_preds, color='green')
    plt.plot(X_test, y_preds, color="red")
    plt.xlabel("X axis")
    plt.ylabel("Y axis")
    plt.show()
    print("nilai slope/koef/a:", linreg.coef_)
    print("nilai intercept/b :", linreg.intercept_)
    print('Data hasil prediksi :', y_preds)
    print('Data aktual :', y_test)
    print()
    print('MAPE : ', mape(y_test, y_preds))
    if data["Nama Golongan"][0] == "INDUSTRI":
        golongan = data.loc[0:23, "Nama Golongan"]
    elif data["Nama Golongan"][44] == "INSTANSI PEMERINTAH":
        golongan = data.loc[44:67, "Nama Golongan"]
    elif data["Nama Golongan"][88] == "NIAGA KECIL":
        golongan = data.loc[88:111, "Nama Golongan"]
    elif data["Nama Golongan"][132] == "RUMAH MENENGAH":
        golongan = data.loc[132:155, "Nama Golongan"]
    elif data["Nama Golongan"][176] == "RUMAH MEWAH":
        golongan = data.loc[176:119, "Nama Golongan"]
    elif data["Nama Golongan"][220] == "SOSIAL KHUSUS":
        golongan = data.loc[220:243, "Nama Golongan"]
    elif data["Nama Golongan"][264] == "TOTAL PERBULAN":
        golongan = data.loc[264:287, "Nama Golongan"]
more code...
When I run this:
a = this[this['Nama Golongan'] == 'INDUSTRI']
data_pred(a)
I get the plot and the results without any error. But when I run this code:
b = this[this['Nama Golongan'] == 'INSTANSI PEMERINTAH']
data_pred(b)
I get this:
KeyError Traceback (most recent call last)
/usr/local/lib/python3.7/dist-packages/pandas/core/indexes/base.py in get_loc(self, key, method, tolerance)
2897 try:
-> 2898 return self._engine.get_loc(casted_key)
2899 except KeyError as err:
pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()
pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()
pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.Int64HashTable.get_item()
pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.Int64HashTable.get_item()
I thought it was because of the elif code, but I don't know why. Can anyone tell me why this happens and how to fix it? Please help, thanks.
CodePudding user response:
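OK, I finally see the problem. You are extracting a subset of a dataframe and passing it to this function, and the original index labels get carried through with the subset. So data["Nama Golongan"][44] looks up the row whose index label is 44, not the 45th row of the subset.
That is where the KeyError comes from. With the INSTANSI PEMERINTAH subset, the very first test in your chain, data["Nama Golongan"][0], fails because that subset has no row labelled 0 (judging by your code, its labels start at 44). The elif branches are never reached. The INDUSTRI subset works only because its labels happen to start at 0.
Note that data.loc is label-based as well, so it has the same limitation. The accessor that works strictly by row position is data.iloc, and its first row is always 0. If you only want the first 24 rows of whatever subset is passed in (that is what loc[0:23] selects, because loc slices include the end label), you don't need your if sequence at all. Replace the whole thing with this:
golongan = data["Nama Golongan"].iloc[0:24]
Unlike loc, an iloc slice excludes the end position, hence 0:24 rather than 0:23.
Separately, data.loc[176:119, "Nama Golongan"] in your RUMAH MEWAH branch looks like a typo for 176:199, going by the pattern of the other ranges.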
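To see the label-versus-position difference in isolation, here is a minimal sketch. The small dataframe and its "Nilai" column are made up for illustration (the real data isn't shown in the question), and the groups have 3 rows each instead of 24, but the indexing behaviour should be the same:

```python
import pandas as pd

# Hypothetical stand-in for the asker's table: two groups of three rows each.
this = pd.DataFrame({
    "Nama Golongan": ["INDUSTRI"] * 3 + ["INSTANSI PEMERINTAH"] * 3,
    "Nilai": [10, 11, 12, 20, 21, 22],
})

b = this[this["Nama Golongan"] == "INSTANSI PEMERINTAH"]

# Filtering keeps the original index labels, so this subset starts at label 3.
labels = b.index.tolist()

# Label-based lookup of 0 raises KeyError: no row in b is labelled 0.
try:
    b["Nama Golongan"][0]
    label_lookup_failed = False
except KeyError:
    label_lookup_failed = True

# Position-based lookup works regardless of the labels.
first_by_position = b["Nama Golongan"].iloc[0]
first_two_rows = b["Nama Golongan"].iloc[0:2]   # end position excluded

# loc slices include the end label, iloc slices exclude the end position.
rows_loc = len(this.loc[0:2])
rows_iloc = len(this.iloc[0:2])

# Alternative: renumber the subset from 0 so label-based code works again.
b_reset = b.reset_index(drop=True)
first_after_reset = b_reset.loc[0, "Nama Golongan"]
```

If you would rather keep using loc inside data_pred, calling data = data.reset_index(drop=True) at the top of the function is another option, since the labels then match the row positions.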