I have two tables. The first contains 300 rows; each row describes one case using 3 columns: 2 constant values that characterize the case, plus the case label. The second table is my data table collected from sensors; it contains the same indicators as the first, but no case column. The idea is to detect which case each row of the second table belongs to, knowing that the sensor values are not identical to those in the first table but fall in the same range.
example:
First table:
[[1500, 22, 0], [1100, 40, 1], [2400, 19, 2]]
columns=['analog', 'temperature', 'case']
Second table:
[[1420, 20], [1000, 39], [2300, 29]]
columns=['analog', 'temperature']
I want to detect which case my first row (1420, 20) belongs to.
CodePudding user response:
You can simply use a classifier, for instance k-NN (k-nearest neighbours):
import pandas as pd
from sklearn.neighbors import KNeighborsClassifier

# Reference table: one row per known case
df = pd.DataFrame([[1500, 22, 0], [1100, 40, 1], [2400, 19, 2]], columns=['analog', 'temperature', 'case'])
# Sensor readings to classify (same values as in the question)
df1 = pd.DataFrame([[1420, 20], [1000, 39], [2300, 29]], columns=['analog', 'temperature'])
features = ['analog', 'temperature']
# 1-nearest-neighbour with Euclidean distance (Minkowski with p = 2)
classifier = KNeighborsClassifier(n_neighbors=1, metric='minkowski', p=2)
classifier.fit(df[features], df['case'])
# Predict from the feature columns only, so this line can be re-run safely
df1['case'] = classifier.predict(df1[features])
Output of df1:
   analog  temperature  case
0    1420           20     0
1    1000           39     1
2    2300           29     2
So the first row (1420, 20) in df1 (the second table) belongs to case 0.
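To see what a 1-nearest-neighbour classifier with Euclidean distance is doing here, below is a minimal plain-Python sketch of the same idea. The helper name `nearest_case` and the variable names are made up for illustration; this is not scikit-learn's implementation, just the underlying rule: assign each reading the case of the closest reference row.

```python
import math

# Reference rows from the first table: (analog, temperature, case)
reference = [(1500, 22, 0), (1100, 40, 1), (2400, 19, 2)]
# Sensor readings from the second table: (analog, temperature)
readings = [(1420, 20), (1000, 39), (2300, 29)]

def nearest_case(reading, reference_rows):
    """Hypothetical helper: return the case of the reference row
    closest to `reading` by Euclidean distance."""
    closest = min(reference_rows, key=lambda row: math.dist(reading, row[:2]))
    return closest[2]

cases = [nearest_case(r, reference) for r in readings]
print(cases)  # [0, 1, 2]
```

Note that with raw values the analog column dominates the distance because its scale is much larger than temperature's; whether you should rescale the columns first depends on your data.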
CodePudding user response:
What do you mean by "belong"? Does [1420, 20] belong to [?, ?, ?]?