seaborn heatmap displays axis labels, but no values when df.corr is NaN

Time:11-07

I am trying to build a correlation heatmap, and I noticed that some of the values are wrong.

Below is my heatmap. As you can see, the numbers for the action column are not appearing.

[Image: heatmap]

This is my dataframe:

all_gen_cols = steamUniqueTitleGenre[['action', 'adventure','casual', 'indie','massively_multiplayer','rpg','racing','simulation','sports','strategy']]

   action  adventure  casual  indie  massively_multiplayer  rpg  racing  simulation  sports  strategy
0       1          0       0      0                      0    0       0           0       0         0
1       1          1       0      0                      1    0       0           0       0         0
2       1          1       0      0                      0    0       0           0       0         1
3       1          1       0      0                      1    0       0           0       0         0
4       1          0       0      0                      1    1       0           0       0         1

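This is the code that produces the heatmap:

import numpy as np
import seaborn as sb
import matplotlib.pyplot as plt

def plot_correlation_heatmap(df):
    corr = df.corr()

    sb.set(style='white')
    # Mask the upper triangle (including the diagonal) so each pair is drawn once.
    # Use the builtin bool: the np.bool alias was removed in NumPy 1.24.
    mask = np.zeros_like(corr, dtype=bool)
    mask[np.triu_indices_from(mask)] = True

    f, ax = plt.subplots(figsize=(11, 9))
    cmap = sb.diverging_palette(220, 10, as_cmap=True)

    sb.heatmap(corr, mask=mask, cmap=cmap, vmax=0.3, center=0,
               square=True, linewidths=.5, cbar_kws={"shrink": .5}, annot=True)

    plt.yticks(rotation=0)
    plt.show()
    # Restore matplotlib's default rc settings after plotting.
    plt.rcdefaults()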

plot_correlation_heatmap(all_gen_cols)

I am not sure what the error is.

The output of print(all_gen_cols.corr()) is below. I see NaN for action, but I am not sure why it is NaN.

                       action  adventure    casual     indie  massively_multiplayer       rpg    racing  simulation    sports  strategy
action                    NaN        NaN       NaN       NaN                    NaN       NaN       NaN         NaN       NaN       NaN
adventure                 NaN   1.000000  0.007138  0.135392               0.023964  0.239136 -0.039846    0.036345 -0.064489  0.001435
casual                    NaN   0.007138  1.000000  0.235474               0.003487 -0.057726  0.079943    0.161448  0.149549  0.084417
indie                     NaN   0.135392  0.235474  1.000000              -0.082661  0.023372  0.045006    0.064723  0.056297  0.076749
massively_multiplayer     NaN   0.023964  0.003487 -0.082661               1.000000  0.160078  0.036685    0.139929  0.018444  0.074683
rpg                       NaN   0.239136 -0.057726  0.023372               0.160078  1.000000 -0.046970    0.044506 -0.051714  0.097123
racing                    NaN  -0.039846  0.079943  0.045006               0.036685 -0.046970  1.000000    0.127511  0.308864 -0.012170
simulation                NaN   0.036345  0.161448  0.064723               0.139929  0.044506  0.127511    1.000000  0.212622  0.208754
sports                    NaN  -0.064489  0.149549  0.056297               0.018444 -0.051714  0.308864    0.212622  1.000000  0.020048
strategy                  NaN   0.001435  0.084417  0.076749               0.074683  0.097123 -0.012170    0.208754  0.020048  1.000000

Below is the output of print(all_gen_cols.describe()):

        action     adventure        casual         indie  massively_multiplayer           rpg        racing    simulation        sports      strategy
count  14570.0  14570.000000  14570.000000  14570.000000           14570.000000  14570.000000  14570.000000  14570.000000  14570.000000  14570.000000
mean       1.0      0.362663      0.232189      0.657241               0.050927      0.165202      0.040288      0.121826      0.044269      0.127111
std        0.0      0.480785      0.422244      0.474648               0.219855      0.371376      0.196641      0.327096      0.205699      0.333108
min        1.0      0.000000      0.000000      0.000000               0.000000      0.000000      0.000000      0.000000      0.000000      0.000000
25%        1.0      0.000000      0.000000      0.000000               0.000000      0.000000      0.000000      0.000000      0.000000      0.000000
50%        1.0      0.000000      0.000000      1.000000               0.000000      0.000000      0.000000      0.000000      0.000000      0.000000
75%        1.0      1.000000      0.000000      1.000000               0.000000      0.000000      0.000000      0.000000      0.000000      0.000000
max        1.0      1.000000      1.000000      1.000000               1.000000      1.000000      1.000000      1.000000      1.000000      1.000000  


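Every value in the action column is 1, so var(action) = 0 (the describe() output shows std = 0.0 for action). The Pearson correlation rho(action, Y) = cov(action, Y) / (std(action) * std(Y)) therefore has a zero denominator for any other column Y, so rho(action, Y) is undefined and pandas reports it as NaN. seaborn's heatmap automatically masks cells with missing values, which is why the axis label for action appears but no numbers do.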
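To see this in isolation, here is a minimal sketch using a small made-up frame (the values are hypothetical, not the asker's data). The constant column has zero standard deviation, and every correlation involving it comes out as NaN, including its own diagonal entry, while the other columns are unaffected.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: 'action' is constant, like the column in the question.
toy = pd.DataFrame({
    "action":    [1, 1, 1, 1, 1],
    "adventure": [0, 1, 1, 1, 0],
    "rpg":       [0, 0, 0, 0, 1],
})

toy_std = toy.std()    # standard deviation per column
toy_corr = toy.corr()  # Pearson correlation matrix

print(toy_std)
print(toy_corr)

# Columns whose correlations are entirely NaN are the zero-variance ones.
all_nan_cols = toy_corr.columns[toy_corr.isna().all()].tolist()
print(all_nan_cols)
```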

As suggested by other users, you should drop the 'action' column before computing the correlation matrix, since a constant column adds no information about correlation.
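One way to do this without hard-coding the column name is to filter out every column that holds a single distinct value. This is only a sketch: drop_constant_columns is a hypothetical helper name, and the sample frame is made up for illustration.

```python
import pandas as pd

def drop_constant_columns(df):
    """Return df restricted to columns that have more than one distinct value."""
    keep = [col for col in df.columns if df[col].nunique(dropna=True) > 1]
    return df[keep]

# Hypothetical sample with one constant column, mirroring the question's 'action'.
sample = pd.DataFrame({
    "action":    [1, 1, 1, 1, 1],
    "adventure": [0, 1, 1, 1, 0],
    "rpg":       [0, 0, 0, 0, 1],
})

filtered = drop_constant_columns(sample)
filtered_corr = filtered.corr()
print(filtered_corr)
```

With the asker's data, the equivalent call would be plot_correlation_heatmap(drop_constant_columns(all_gen_cols)), or simply all_gen_cols.drop(columns=['action']) if only that one column is known to be constant.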
