Recover SAS output for ANOVA analysis using Python

Time:09-27

I have the following data:

import pandas as pd
m1 = [3129, 3000, 2865, 2890]
m2 = [3200, 3300, 2975, 3150]
m3 = [2800, 2900, 2985, 3050]
m4 = [2600, 2700, 2600, 2765]
t_dict = {'mix1': m1, 'mix2': m2, 'mix3': m3, 'mix4': m4}

df = pd.DataFrame.from_dict(t_dict, orient='index', columns=['sample1', 'sample2', 'sample3', 'sample4'])

When we run this data through SAS, we get the following table:

Is there any way to get an output like what SAS generates using Python?

[Image: SAS ANOVA output table]

CodePudding user response:

"SAS" is a software package, so I'm not sure what "apply SAS" means. The image shows an ANOVA test, so I assume you want to run an ANOVA on your data using Python.

The statsmodels package is a useful library for statistics. If you are looking for an ANOVA test, check out statsmodels' "anova" and "interactions with anova" pages.

Here at StackOverflow we are happy to guide you towards the solution, but we won't write the answer for you. Give statsmodels a shot on your own, and ask a more specific question if you get stuck in the process.

CodePudding user response:

Kind of. ANOVA is a multi-step procedure, not a single "thing", so the table is a composite as well, and to reproduce it in code you'll need to assemble it from small parts:

Try SciPy (pip install scipy, or try it in Jupyter), as it covers most of these models and statistical properties.

F-value and Pr > F (f_oneway()):

import scipy.stats
print(scipy.stats.f_oneway(m1, m2, m3, m4))
# F_onewayResult(statistic=12.728107492613294, pvalue=0.0004887151309914456)
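The remaining columns of a one-way ANOVA table (degrees of freedom, sums of squares, mean squares) can be assembled by hand. The following is a sketch under the assumption that each mix is one group; the variable names are illustrative, and the resulting F-value should agree with the `f_oneway()` result.

```python
import numpy as np
from scipy import stats

# Assumption: each mix is one treatment group.
groups = {
    'mix1': [3129, 3000, 2865, 2890],
    'mix2': [3200, 3300, 2975, 3150],
    'mix3': [2800, 2900, 2985, 3050],
    'mix4': [2600, 2700, 2600, 2765],
}
arrays = [np.asarray(g, dtype=float) for g in groups.values()]
values = np.concatenate(arrays)
grand_mean = values.mean()

# Between-group (model) and within-group (error) sums of squares.
ss_model = sum(a.size * (a.mean() - grand_mean) ** 2 for a in arrays)
ss_error = sum(((a - a.mean()) ** 2).sum() for a in arrays)
ss_total = ss_model + ss_error

# Degrees of freedom.
df_model = len(arrays) - 1
df_error = values.size - len(arrays)
df_total = values.size - 1

# Mean squares, F statistic and its upper-tail p-value.
ms_model = ss_model / df_model
ms_error = ss_error / df_error
f_value = ms_model / ms_error
p_value = stats.f.sf(f_value, df_model, df_error)

print(f"{'Source':<10}{'DF':>4}{'Sum Sq':>14}{'Mean Sq':>14}{'F':>10}{'Pr > F':>10}")
print(f"{'Model':<10}{df_model:>4}{ss_model:>14.2f}{ms_model:>14.2f}{f_value:>10.2f}{p_value:>10.4f}")
print(f"{'Error':<10}{df_error:>4}{ss_error:>14.2f}{ms_error:>14.2f}")
print(f"{'Total':<10}{df_total:>4}{ss_total:>14.2f}")
```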

R-squared requires a model, which means distinguishing the response (Y) from the predictors (Xn). From the question I can't tell which is which.

Perhaps you want to fit the data with scipy.stats.linregress(), from which you can then derive the errors and the R-squared value, and thus the rest of the table.

Also check this answer for calculating R-squared with NumPy, and this article, which walks through the ANOVA procedure in detail; you can follow it for the rest of the table and the relationships between its parts.
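If one assumes that the mix is the predictor and the measured value is the response (the question does not say so), R-squared can be computed with NumPy as 1 - SS_error / SS_total. The sketch below also derives the root MSE and the coefficient of variation, which SAS typically prints next to R-squared; all variable names are illustrative.

```python
import numpy as np

# Rows are mixes (assumed factor levels), columns are samples.
data = np.array([
    [3129, 3000, 2865, 2890],
    [3200, 3300, 2975, 3150],
    [2800, 2900, 2985, 3050],
    [2600, 2700, 2600, 2765],
], dtype=float)

grand_mean = data.mean()

# Total variation around the grand mean.
ss_total = ((data - grand_mean) ** 2).sum()
# Residual variation around each mix's own mean.
ss_error = ((data - data.mean(axis=1, keepdims=True)) ** 2).sum()

r_squared = 1 - ss_error / ss_total

# Error degrees of freedom: observations minus number of groups.
df_error = data.size - data.shape[0]
root_mse = np.sqrt(ss_error / df_error)
coeff_var = 100 * root_mse / grand_mean  # in percent

print(f"R-squared: {r_squared:.4f}")
print(f"Root MSE:  {root_mse:.4f}")
print(f"Coeff Var: {coeff_var:.4f}")
print(f"Mean:      {grand_mean:.4f}")
```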
