Correlation coefficient calculation in Python
How would I calculate the correlation coefficient using Python between the spring training wins column and the regular-season wins column?
Name | Spr.TR | Reg Szn |
---|---|---|
Team B | 0.429 | 0.586 |
Team C | 0.417 | 0.646 |
Team D | 0.569 | 0.6 |
Team E | 0.569 | 0.457 |
Team F | 0.533 | 0.563 |
Team G | 0.724 | 0.617 |
Team H | 0.5 | 0.64 |
Team I | 0.577 | 0.649 |
Team J | 0.692 | 0.466 |
Team K | 0.5 | 0.477 |
Team L | 0.731 | 0.699 |
Team M | 0.643 | 0.588 |
Team N | 0.448 | 0.531 |
CodePudding user response:
You can use pandas' Series.corr (Pearson correlation by default):
df['Spr.TR'].corr(df['Reg Szn'], method='pearson')
output: 0.10811116955657629
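If you don't have df yet, here is a minimal sketch that builds it from the table in the question. The column names are copied from the table header; the variable names df and r are just my choice.

```python
import pandas as pd

# Values copied from the table in the question (Team B through Team N).
df = pd.DataFrame({
    "Name": ["Team B", "Team C", "Team D", "Team E", "Team F", "Team G", "Team H",
             "Team I", "Team J", "Team K", "Team L", "Team M", "Team N"],
    "Spr.TR": [0.429, 0.417, 0.569, 0.569, 0.533, 0.724, 0.5,
               0.577, 0.692, 0.5, 0.731, 0.643, 0.448],
    "Reg Szn": [0.586, 0.646, 0.6, 0.457, 0.563, 0.617, 0.64,
                0.649, 0.466, 0.477, 0.699, 0.588, 0.531],
})

# method="pearson" is the default, so it can be omitted.
r = df["Spr.TR"].corr(df["Reg Szn"])
print(r)
```

A value this close to 0 means spring training results say very little about the regular season for these teams.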
CodePudding user response:
Assuming your data is in a pandas.DataFrame named df, you can also use pearsonr from SciPy. It returns the coefficient and a p-value, so take the first element:
from scipy.stats import pearsonr
correlation = pearsonr(df["Spr.TR"].tolist(), df["Reg Szn"].tolist())[0]
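If you want to see what is being computed, the Pearson coefficient is the covariance of the two columns divided by the product of their standard deviations. A rough sketch in plain Python, with no pandas or SciPy needed (pearson_r is a hypothetical helper name, not a library function; the lists are the two columns from the question):

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation of two equal-length sequences of numbers."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of the same length (at least 2)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sums of products of deviations; the 1/n factors cancel in the ratio.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)

spring = [0.429, 0.417, 0.569, 0.569, 0.533, 0.724, 0.5,
          0.577, 0.692, 0.5, 0.731, 0.643, 0.448]
regular = [0.586, 0.646, 0.6, 0.457, 0.563, 0.617, 0.64,
           0.649, 0.466, 0.477, 0.699, 0.588, 0.531]

correlation = pearson_r(spring, regular)
print(round(correlation, 3))  # about 0.108
```

In practice the library calls are the better choice; this is only meant to show the formula.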