How to get/isolate the p-value from an 'AnovaResults' object in Python?

I want to run a one-way repeated measures ANOVA on my dataset to test whether the values of 5 patients differ across the 3 days on which they were measured.

I use AnovaRM from statsmodels.stats.anova, and the result is an 'AnovaResults' object. I can see the p-value when I print() the object, but I don't know how to extract it from the object.

Do you have any idea? Also, is my code correct for what I want to test?

Thanks in advance

import numpy as np
import pandas as pd
from statsmodels.stats.anova import AnovaRM

day1 = [1,2,3,4,5]
day2 = [2,4,6,8,10]
day3 = [1.5,2.5,3.5,4.5,5.5]
days_list = [day1,day2,day3]

# long format: one row per (patient, day) pair
df = pd.DataFrame({'patient': np.repeat(range(1, len(days_list[0]) + 1), len(days_list)),
                   'group': np.tile(range(1, len(days_list) + 1), len(days_list[0])),
                   'score': [x[y] for y in range(len(days_list[0])) for x in days_list]})

print(AnovaRM(data=df, depvar='score', subject='patient', within=['group']).fit())

CodePudding user response：

I'm assuming the p-value you're looking for is the number displayed in the Pr > F column when you run the code in your question. If you instead assign the result of fit() to a variable, the underlying DataFrame can be accessed through the anova_table attribute:

results = AnovaRM(data=df, depvar='score', subject='patient', within=['group']).fit()
print(results.anova_table)

which gives:

       F Value  Num DF  Den DF  Pr > F 
group  15.5     2.0     8.0     0.00177

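Just access the first entry of the Pr > F column, and you're all set (.iloc[0] selects by position; a plain [0] on a label-indexed Series is deprecated in recent pandas versions):

print(results.anova_table["Pr > F"].iloc[0])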

This yields the answer:

0.0017705227840260451
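
If you'd rather not rely on the row's position, a label-based lookup should work as well. This is only a minimal sketch: it assumes the anova_table layout printed above (row label 'group', columns 'F Value' and 'Pr > F'), and the names n_patients, n_days, p_value and f_value are just illustrative.

```python
import numpy as np
import pandas as pd
from statsmodels.stats.anova import AnovaRM

# Same data as in the question: one list of scores per day
days_list = [[1, 2, 3, 4, 5], [2, 4, 6, 8, 10], [1.5, 2.5, 3.5, 4.5, 5.5]]
n_patients, n_days = len(days_list[0]), len(days_list)

# Long format: one row per (patient, day) pair
df = pd.DataFrame({
    "patient": np.repeat(np.arange(1, n_patients + 1), n_days),
    "group": np.tile(np.arange(1, n_days + 1), n_patients),
    "score": [day[p] for p in range(n_patients) for day in days_list],
})

results = AnovaRM(data=df, depvar="score", subject="patient", within=["group"]).fit()

# Look the values up by row label ("group") and column label
p_value = float(results.anova_table.loc["group", "Pr > F"])
f_value = float(results.anova_table.loc["group", "F Value"])
print(p_value, f_value)
```

Converting with float() gives you plain Python numbers, which is handy if you want to compare the p-value against a threshold or store it somewhere.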

CodePudding user response：

I think I found a way!

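from io import StringIO
a = AnovaRM(data=df, depvar='score', subject='patient', within=['group']).fit().summary().as_html()
# wrap the HTML string in StringIO; recent pandas versions deprecate passing literal HTML to read_html
pd.read_html(StringIO(a), header=0, index_col=0)[0]['Pr > F'].iloc[0]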

Hope it will help someone!