I'm trying to figure out how to use a for loop with statsmodels to print the summary statistics of a logistic regression for each entry in a list of independent variables. I can get it to work fine with the traditional method (one hand-written model per variable), but a for loop would make it much easier to check the significance of each variable.
Here is what I'm trying to do:
    df = pd.read_csv('source/data_cleaning/cleaned_data.csv')

    def opportunites():
        dep = ['LEAVER']
        indep = ['AGE', 'S0287', 'T0080', 'SALARY', 'T0329', 'T0333', 'T0159', 'T0165', 'EXPER', 'T0356']
        for i in indep:
            model = smf.logit(dep, i, data = df ).fit()
            print(model.summary(yname="Status Leaver", xname=['Intercept', i ],
                                title='Single Logistic Regression'))
            print()

    opportunites()
Here is the traditional method that works:
    def regressMulti2():
        model = smf.logit('LEAVER ~ AGE ', data = df).fit()
        print(model.summary(yname="Status Leaver",
                            xname=['Intercept', 'AGE Less than 40 (AGE)'], title='Logistic Regression of Leaver and Age'))
        print()

    regressMulti2()
CodePudding user response:
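smf.logit expects a formula string as its first argument, so build that string inside the loop with an f-string:

    def opportunites():
        indep = ['AGE', 'S0287', 'T0080', 'SALARY', 'T0329', 'T0333', 'T0159', 'T0165', 'EXPER', 'T0356']
        for i in indep:
            # Build the formula 'LEAVER ~ <column>' for the current variable
            model = smf.logit(f'LEAVER ~ {i} ', data = df).fit()
            print(model.summary(
                yname="Status Leaver",
                xname=['Intercept', i],
                title=f'Logistic Regression of Leaver and {i}'
            ))
            print()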
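
Since the goal is to compare significance across variables, it may be handier to collect each model's coefficient and p-value into one table than to read ten separate summaries. Below is a minimal sketch, assuming every predictor is a numeric column (so its coefficient is named after the column). The helper `single_logit_table` and the synthetic `demo` frame are hypothetical stand-ins for illustration, not part of the original code or data:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf


def single_logit_table(data, dep, indep):
    """Fit one single-predictor logit per variable; return coef and p-value."""
    rows = []
    for var in indep:
        # disp=0 silences the optimizer's convergence printout
        result = smf.logit(f"{dep} ~ {var}", data=data).fit(disp=0)
        rows.append({
            "variable": var,
            # Assumes a numeric predictor, whose coefficient is named after it
            "coef": result.params[var],
            "p_value": result.pvalues[var],
        })
    # Smallest p-value first
    return pd.DataFrame(rows).sort_values("p_value").reset_index(drop=True)


# Synthetic stand-in data (hypothetical; not the CSV from the question)
rng = np.random.default_rng(0)
n = 500
demo = pd.DataFrame({"AGE": rng.normal(size=n), "SALARY": rng.normal(size=n)})
prob = 1 / (1 + np.exp(-1.5 * demo["AGE"]))  # only AGE drives the outcome
demo["LEAVER"] = (rng.random(n) < prob).astype(int)

table = single_logit_table(demo, "LEAVER", ["AGE", "SALARY"])
print(table)
```

With the real data, the equivalent call would be `single_logit_table(df, 'LEAVER', indep)`.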