Logistic regression using more than one predictor

I want to fit a logistic regression model that predicts Y using X1 and X2. What I know is that we use the following method:

x_train, x_test, y_train, y_test = train_test_split(X,Y,test_size)

and then

model = LogisticRegression()
model.fit(x_train,y_train)

That predicts Y from a single X, but I don't know how to train the model using more than one predictor. Any help, please?

CodePudding user response：

If there are two features, X1 and X2, then the training data X will have two columns. For example, if you have 1000 observations of X1 and 1000 of X2, then X should have shape (1000, 2).
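As a quick check of that shape, here is a minimal sketch assuming the two predictors start out as separate 1-D NumPy arrays. The names x1 and x2 and the random data are made up for illustration.

```python
import numpy as np

# Hypothetical data: two predictors with 1000 observations each
rng = np.random.default_rng(0)
x1 = rng.normal(size=1000)
x2 = rng.normal(size=1000)

# Stack the 1-D arrays side by side so that each one becomes a column
X = np.column_stack((x1, x2))
print(X.shape)  # (1000, 2)
```

Each row of X is then one observation and each column one predictor, which is the layout model.fit expects.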

For example, suppose you have a CSV file with three columns: X1, X2, and y.

import pandas as pd
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression

df = pd.read_csv('my_file.csv')
# Select both predictor columns as a 2-D frame; the target is a single column
X = df[['X1', 'X2']]
Y = df['y']

x_train, x_test, y_train, y_test = train_test_split(X, Y, test_size=0.2)
model = LogisticRegression()
model.fit(x_train,y_train)

y_pred = model.predict(x_test)

acc = accuracy_score(y_test, y_pred)
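If you don't have a CSV at hand, the same steps can be tried on made-up data. This is only a sketch: the synthetic DataFrame, the rule used to generate y, and random_state=0 are assumptions for illustration, not part of the original question.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the CSV (assumed columns: X1, X2, y)
rng = np.random.default_rng(0)
df = pd.DataFrame({"X1": rng.normal(size=200), "X2": rng.normal(size=200)})
# The label depends on both predictors, so both should carry weight
df["y"] = (df["X1"] + 2 * df["X2"] > 0).astype(int)

X = df[["X1", "X2"]]  # 2-D: one column per predictor
y = df["y"]           # 1-D target

x_train, x_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)
model = LogisticRegression().fit(x_train, y_train)

acc = accuracy_score(y_test, model.predict(x_test))
print(model.coef_.shape)  # (1, 2): one coefficient per predictor
print(acc)
```

The shape of model.coef_ is an easy way to confirm that the model really was trained on both predictors.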

CodePudding user response：

You can use numpy.concatenate to join X1 and X2 column-wise (axis=1), so that each row keeps all of its features, and then fit LogisticRegression on the combined array:

import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X1 = np.random.rand(100,3) #-> shape=(100,3)
X2 = np.random.rand(100,4) #-> shape=(100,4)
Y = np.random.randint(0,2,100)

X = np.concatenate((X1, X2), axis=1)
print(X.shape)
# (100, 7)

x_train, x_test, y_train, y_test = train_test_split(X,Y,test_size=.33)
clf = LogisticRegression().fit(x_train, y_train)
clf.predict(x_test)