I have a dataframe, 'rankedvariableslist', in which 'Sleepvariables' identifies the sleep variable of interest and the other two columns hold the R-squared value and P-value of the corresponding model and variable.
I am trying to sort the data by 'P-value' in ascending order, then by 'R-squared value', but I keep getting the error "'values' is not ordered, please explicitly specify the categories order by passing in a categories argument" and I am not sure why.
I would be so grateful for a helping hand!
correspondantsleepvariable = []
rsquaredvalues = []
correspondantpvalue = []
newerresults = resultmodeldistancevariation2sleepsummary.tables[0]
newerdata = pd.DataFrame(newerresults)
rsquaredvalue = newerdata.iloc[0,3]
rsquaredvalues.append(rsquaredvalue)
modelpvalues = resultmodeldistancevariation2sleepsummary.tables[1]
newerdatavalues = pd.DataFrame(modelpvalues)
pvalue = newerdatavalues.iloc[12,4]
correspondantpvalue.append(pvalue)
correspondantsleepvariable.append(sleepvariable[i])
rankedvariableslist.sort_values(['P-value','R-squared value'],ascending = [True, False])
print(rankedvariableslist.head(3))
Sleepvariables R-squared value P-value
0 hours_of_sleep 0.026 0.491
1 frequency_of_alarm_usage 0.026 0.681
2 sleepiness_bed 0.026 0.413
As an example of the dataframe 'newerresults':
OLS Regression Results
==============================================================================
Dep. Variable: distance R-squared: 0.028
Model: OLS Adj. R-squared: 0.016
Method: Least Squares F-statistic: 2.338
Date: Fri, 18 Nov 2022 Prob (F-statistic): 0.00773
Time: 12:39:29 Log-Likelihood: -1274.1
No. Observations: 907 AIC: 2572.
Df Residuals: 895 BIC: 2630.
Df Model: 11
Covariance Type: nonrobust
==============================================================================
CodePudding user response:
Your code works when I run it; it returns the result below.
Sleepvariables R-squared value P-value
0 hours_of_sleep 0.026 0.413
1 frequency_of_alarm_usage 0.026 0.491
2 sleepiness_bed 0.026 0.681
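If the sort still fails on your side, it may be worth checking whether the two sort columns actually hold numbers. Below is a minimal sketch of that check. It assumes the values arrived as non-numeric objects; plain strings stand in here for whatever the summary table cells really contain, and the sample frame is rebuilt by hand from the three rows printed in the question.

```python
import pandas as pd
from pandas.api.types import is_numeric_dtype

# Hypothetical stand-in for rankedvariableslist: the values are strings,
# not floats, to mimic cells copied out of a summary table.
rankedvariableslist = pd.DataFrame({
    'Sleepvariables': ['hours_of_sleep', 'frequency_of_alarm_usage', 'sleepiness_bed'],
    'R-squared value': ['0.026', '0.026', '0.026'],
    'P-value': ['0.491', '0.681', '0.413'],
})

# Collect the sort columns whose dtype is not numeric.
sort_columns = ['P-value', 'R-squared value']
non_numeric = [col for col in sort_columns
               if not is_numeric_dtype(rankedvariableslist[col])]

print(rankedvariableslist.dtypes)
print('Non-numeric sort columns:', non_numeric)
```

If either column shows up in that list, converting it to float before calling sort_values is probably the first thing to try.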
CodePudding user response:
The following code worked. Instead of converting the model summary tables directly to a dataframe, I converted each table to HTML and read it back with pd.read_html.
correspondantsleepvariable = []
rsquaredvalues = []
correspondantpvalue = []
results_as_html = resultmodeldistancevariation2sleepsummary.tables[0].as_html()
datehere = pd.read_html(results_as_html, header=None, index_col=None)[0]
rsquaredvalue = datehere.iloc[0,3]
rsquaredvalue = float(rsquaredvalue)
rsquaredvalues.append(rsquaredvalue)
results_as_html = resultmodeldistancevariation2sleepsummary.tables[1].as_html()
datehere = pd.read_html(results_as_html, header=0, index_col=0)[0]
pvalue = datehere.iloc[11,3]
pvalue = float(pvalue)
correspondantpvalue.append(pvalue)
correspondantsleepvariable.append(sleepvariable[i])
rankedvariableslist = pd.DataFrame({'Sleepvariables':correspondantsleepvariable, 'R-squared value':rsquaredvalues,'P-value':correspondantpvalue})
rankedvariableslist.sort_values(by=['P-value','R-squared value'],ascending = [True,False],inplace=True)
print(rankedvariableslist)
Sleepvariables R-squared value P-value
9 time_spent_awake_during_night_mins 0.034 0.005
4 sleep_quality 0.030 0.041
20 sleepiness_resolution_index 0.028 0.129
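For anyone who would rather keep the original DataFrame route, another option might be to coerce the two columns to floats before sorting. This is only a sketch: it assumes the values can be read as strings such as '0.005', and the sample rows are copied by hand from the output above rather than produced by a real model.

```python
import pandas as pd

# Hypothetical sample data: numbers stored as strings, in unsorted order.
rankedvariableslist = pd.DataFrame({
    'Sleepvariables': ['sleep_quality',
                       'sleepiness_resolution_index',
                       'time_spent_awake_during_night_mins'],
    'R-squared value': ['0.030', '0.028', '0.034'],
    'P-value': ['0.041', '0.129', '0.005'],
})

# Convert both sort columns to a numeric dtype.
for col in ['R-squared value', 'P-value']:
    rankedvariableslist[col] = pd.to_numeric(rankedvariableslist[col])

# Smallest P-value first; ties broken by the largest R-squared value.
rankedvariableslist = rankedvariableslist.sort_values(
    by=['P-value', 'R-squared value'], ascending=[True, False])

print(rankedvariableslist)
```

Note that sort_values returns a new DataFrame unless inplace=True is passed, so the result has to be assigned back as done here.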
Thanks so much for all your help - I am so grateful!