I am working on an ML problem and trying to compute the Fisher score for feature selection. My data looks like this:
A B Y
1 1 1
2 5 1
1 5 1
7 9 0
7 9 0
8 9 0
import pandas as pd
t = pd.read_clipboard()  # run after copying the table above
I am trying to compute the Fisher score for each feature. I am following the tutorials as they are written, here and here.
The code is given below
!pip install skfeature-chappers
from skfeature.function.similarity_based import fisher_score
score = fisher_score.fisher_score(t[['A','B']], t['Y']) # error here
score = fisher_score.fisher_score(t[['A','B']], t['Y'], mode='rank') # tried this but also error
score = pd.Series(fisher_score.fisher_score(t[['A','B']], t['Y'])) # error here
I get
ValueError: Length of values (1) does not match length of index (2)
If I pass only one feature as input, as shown below,
score = pd.Series(fisher_score.fisher_score(t[['A']], t['Y']))
I expect my output to have a list of scores for each feature, but I get another error:
ValueError: Data must be 1-dimensional
How can I fix this?
CodePudding user response:
The fisher_score method expects its inputs to be NumPy arrays, not a pandas DataFrame/Series.
Try this:
score = fisher_score.fisher_score(t[['A','B']].to_numpy(),
t['Y'].to_numpy())
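To sanity-check whatever the library returns, here is a minimal from-scratch sketch of the classical Fisher score (between-class scatter divided by within-class scatter, per feature) using plain NumPy on the same data. The function name fisher_score_manual is hypothetical and not part of skfeature, and I am assuming the textbook definition with population variance; the library's own output may differ in form or numerical detail.

```python
import numpy as np

def fisher_score_manual(X, y):
    """Classical Fisher score per column (hypothetical helper, not skfeature's API)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    overall_mean = X.mean(axis=0)
    between = np.zeros(X.shape[1])
    within = np.zeros(X.shape[1])
    for label in np.unique(y):
        Xc = X[y == label]           # rows belonging to this class
        n_c = Xc.shape[0]
        between += n_c * (Xc.mean(axis=0) - overall_mean) ** 2
        within += n_c * Xc.var(axis=0)  # population variance (ddof=0), an assumption
    # A feature that is constant inside every class has within == 0
    # and would divide by zero here; that case is not handled.
    return between / within

# Same data as in the question: columns A and B, target Y
X = np.array([[1, 1], [2, 5], [1, 5], [7, 9], [7, 9], [8, 9]])
y = np.array([1, 1, 1, 0, 0, 0])

scores = fisher_score_manual(X, y)
print(scores)
```

With this definition, A comes out at about 40.5 and B at about 4.0, so A separates the two classes much better than B. Higher means more discriminative.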