I have a CSV file with columns like this:
F1 | F2 | F3 | F4 | Label
I used pandas get_dummies to convert the label column to a one-hot encoding. The data contains 3 different labels, so the file now looks like this:
F1 | F2 | F3 | F4 | Label1 | Label2 | Label3
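For reference, the encoding step might look roughly like the sketch below. The label values "a", "b" and "c" and the feature values are made up for illustration. With these settings pandas names the new columns after the label values (Label_a, Label_b, Label_c), so the Label1 | Label2 | Label3 names above are only a schematic.

```python
import pandas as pd

# Hypothetical data: three rows, one per class (values are placeholders).
df = pd.DataFrame({
    "F1": [1, 1, 1],
    "F2": [2, 2, 2],
    "F3": [3, 3, 3],
    "F4": [4, 4, 4],
    "Label": ["a", "b", "c"],
})

# One-hot encode only the Label column; dtype=int gives 0/1 instead of booleans.
encoded = pd.get_dummies(df, columns=["Label"], dtype=int)

print(encoded.columns.tolist())
```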
Let's say I want to use this data to train a machine learning model. I have to choose the feature and label columns. Can I set them to:
Features, x = columns 0 to 3 (F1 to F4)
Labels, y = columns 4 to 6 (Label1 to Label3)
Is that right? My concern is that, done this way, the problem might be interpreted as multi-label, which it is not: originally it was a multi-class classification problem.
Any help would be much appreciated.
CodePudding user response:
You can select the columns by position with iloc, or by name with filter:
x = df.iloc[:, :4]
y = df.iloc[:, 4:]
# or
x = df.filter(like='F')
y = df.filter(like='Label')
print(x)
F1 F2 F3 F4
0 1 2 3 4
1 1 2 3 4
2 1 2 3 4
print(y)
Label1 Label2 Label3
0 x y z
1 x y z
2 x y z
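On the multi-class versus multi-label concern: how a 2-D y is interpreted depends on the library. scikit-learn classifiers, for example, generally treat a 2-D 0/1 indicator target as multi-label, whereas a softmax output trained with categorical cross-entropy expects the one-hot form. If your model needs a single class column, one option is to collapse the one-hot columns back into one label per row. A minimal sketch, assuming the Label1..Label3 column names from the question and made-up values:

```python
import pandas as pd

# Hypothetical one-hot encoded frame; exactly one 1 per row.
df = pd.DataFrame({
    "F1": [1, 1, 1],
    "F2": [2, 2, 2],
    "F3": [3, 3, 3],
    "F4": [4, 4, 4],
    "Label1": [1, 0, 0],
    "Label2": [0, 1, 0],
    "Label3": [0, 0, 1],
})

x = df.filter(like="F")
y_onehot = df.filter(like="Label")

# Sanity check: a multi-class target has exactly one active label per row.
is_single_label = bool((y_onehot.sum(axis=1) == 1).all())

# Collapse to one column: either the column name or an integer class index.
y_class = y_onehot.idxmax(axis=1)
y_index = y_onehot.to_numpy().argmax(axis=1)

print(is_single_label)
print(y_class.tolist())
print(y_index.tolist())
```

Either y_class or y_index is a 1-D target, which is the usual way to pass a multi-class problem to a classifier.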