Performing the non-paired t-test column-wise in my data

When we have two 1-D arrays:

import numpy as np
import scipy.stats as stats    
a=np.array([0.36619718309859156,
     0.32558139534883723,
     0.3333333333333333,
     0.3333333333333333,
     0.2549019607843137,
     0.3695652173913043,
     0.3157894736842105,
     0.3625])

and

b=np.array([0.938456,
 0.3239485723,
 0.300,
 0.8658,
 1.254901137,
 2.3695,
 0.75,
 1.3625])

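we can perform the t-test as follows, assuming equal variances (Student's t-test) when the ratio of the larger variance to the smaller one is below 4 and using Welch's t-test otherwise: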

stats.ttest_ind(a=a, b=b, equal_var=np.amax([np.var(a),np.var(b)])/np.amin([np.var(a),np.var(b)])<4)

However, I would like to compare the columns of A and B where A and B are 2-D arrays:

A=np.array([[0, 0.375, 0.5, 0.3917],
 [0, 0.333, 0.4, 0.4285],
 [0, 0.27272727, 0.0, 0.2],
 [0.0, 0.25, 0.36365, 0.272],
 [0, 0.285857, 0.4, 0.25],
 [0, 0.416667, 0.33, 0.375],
 [0, 0.28, 0.083, 0.41667],
 [0, 0.2858, 0.25, 0.41666]])

B=np.array([[0, 0.4, 0.333, 0.142],
 [0, 0.33333, 0.4, 0.1111111],
 [0, 0.25, 0.285, 0.333333],
 [0.0, 0.5, 0.380, 0.333],
 [0.0, 0.5, 0.33, 0.375],
 [0, 0.25, 0.294, 0.5],
 [0.0, 0.5, 0.333, 0.2068965],
 [0, 0.5, 0.3846, 0.2]])

That is, I would like to run the t-test on the first column of A against the first column of B, then on the second column of A against the second column of B, and so on. (I tried specifying the axis, but I am not sure how to incorporate the variance-ratio < 4 rule for equal_var in that case.)

CodePudding user response:

You can transpose both arrays and iterate over their columns in parallel with zip:

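def non_paired_t_test(a, b):
    # Student's t-test if the variance ratio is below 4, Welch's t-test otherwise.
    var_a, var_b = np.var(a), np.var(b)
    equal_var = max(var_a, var_b) / min(var_a, var_b) < 4
    return stats.ttest_ind(a=a, b=b, equal_var=equal_var)

# Transposing makes the iteration run over columns instead of rows.
for a, b in zip(A.transpose(), B.transpose()):
    print(non_paired_t_test(a, b))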
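
One detail worth noting with the data in the question: the first column of both A and B is all zeros, so both variances are 0 and the ratio becomes 0/0. NumPy evaluates that to nan with a RuntimeWarning, and nan < 4 is False, so the call quietly falls back to Welch's test. Below is a minimal sketch of a guarded variant. The helper name variances_similar and its ratio parameter are made up for illustration (they are not part of SciPy), and treating two constant columns as "not similar" is an assumption chosen to match what the original expression ends up doing.

```python
import numpy as np
import scipy.stats as stats

def variances_similar(a, b, ratio=4):
    # Hypothetical helper: same rule as max_var / min_var < ratio,
    # rewritten as a multiplication so a zero variance never divides.
    var_a, var_b = np.var(a), np.var(b)
    return bool(max(var_a, var_b) < ratio * min(var_a, var_b))

def non_paired_t_test(a, b):
    # Student's t-test if the variances are similar, Welch's t-test otherwise.
    return stats.ttest_ind(a=a, b=b, equal_var=variances_similar(a, b))
```

This only removes the division warning from the variance check. The t statistic for a pair of constant columns is still undefined, so that column has to be interpreted (or dropped) separately.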

CodePudding user response:

Alternatively, loop over the column indices directly:

for i in range(A.shape[1]):
    a, b = A[:, i], B[:, i]
    equal_var = np.amax([np.var(a), np.var(b)]) / np.amin([np.var(a), np.var(b)]) < 4
    print(stats.ttest_ind(a=a, b=b, equal_var=equal_var))
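
Since the question mentions trying the axis argument, here is a sketch of a loop-free version. It relies on stats.ttest_ind accepting axis=0 to test every column at once; because equal_var takes a single value per call, the sketch runs both the Student and the Welch variant and then picks the appropriate result per column with np.where. The function name columnwise_t_test and the ratio parameter are illustrative assumptions, not an existing API.

```python
import numpy as np
import scipy.stats as stats

def columnwise_t_test(A, B, ratio=4):
    # Per-column variances of both 2-D arrays.
    var_a = np.var(A, axis=0)
    var_b = np.var(B, axis=0)
    # Same rule as max_var / min_var < ratio, written without a division
    # so that constant (zero-variance) columns do not divide by zero.
    equal = np.maximum(var_a, var_b) < ratio * np.minimum(var_a, var_b)

    # Run both tests on all columns, then select per column.
    student = stats.ttest_ind(A, B, axis=0, equal_var=True)
    welch = stats.ttest_ind(A, B, axis=0, equal_var=False)
    t = np.where(equal, student.statistic, welch.statistic)
    p = np.where(equal, student.pvalue, welch.pvalue)
    return t, p, equal
```

Calling t, p, equal = columnwise_t_test(A, B) returns one t statistic and one p-value per column, plus a boolean array showing which columns used the equal-variance test. Running two tests costs a little extra computation, but it keeps the per-column rule without an explicit Python loop.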