I have the following data frame, in which all numeric values are positive:
V1 V2 V3 V4 V5
0 F1H5N4S2 10.751263 0.216574 0.703209 10.674107
1 F2H4N7 12.131079 0.000004 1.883824 0.018118
2 H12N2 11.075072 0.214872 0.000004 10.674107
3 H3N7 1.091061 0.000004 3.503290 0.091797
4 F2H4N5 0.590545 0.000004 1.730215 0.223571
When I try to convert the numeric values to log2 with NumPy (imported as np) using the following code:
log2df = df.apply(lambda x: np.log2(x) if np.issubdtype(x.dtype, np.floating) else x)
I get the following data frame, with NaN in place of log2(0.000004). 0.000004 is the smallest value in the data frame, and it is a value I imputed. Can anyone help me solve the problem? Thanks.
V1 V2 V3 V4 V5
0 F1H5N4S2 3.426434 -2.207070 -0.507974 3.416043
1 F2H4N7 3.600636 NaN 0.913664 -5.786433
2 H12N2 3.469244 -2.218451 NaN 3.416043
3 H3N7 0.125732 NaN 1.808710 -3.445414
4 F2H4N5 -0.759880 NaN 0.790951 -2.161198
CodePudding user response:
This works fine for me, but avoid using apply. Instead, select the numeric columns and apply a vectorized operation:
cols = df.select_dtypes('number').columns
df[cols] = np.log2(df[cols])
Output:
V1 V2 V3 V4 V5
0 F1H5N4S2 3.426434 -2.207068 -0.507975 3.416043
1 F2H4N7 3.600636 -17.931569 0.913664 -5.786432
2 H12N2 3.469244 -2.218451 -17.931569 3.416043
3 H3N7 0.125732 -17.931569 1.808710 -3.445409
4 F2H4N5 -0.759881 -17.931569 0.790951 -2.161195
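For anyone who wants to reproduce this, here is a minimal self-contained sketch. It assumes the data frame holds exactly the values printed in the question. The names bad_counts and log2df are only illustrative. The check on non-positive entries is an extra step, not something the question confirms as the cause: np.log2 returns NaN for negative or NaN input and -inf for zero, so it may be worth ruling those out if the NaNs persist on your real data.

```python
import numpy as np
import pandas as pd

# Data frame rebuilt from the values shown in the question (an assumption:
# the real data may hold more digits than the printed output shows).
df = pd.DataFrame({
    "V1": ["F1H5N4S2", "F2H4N7", "H12N2", "H3N7", "F2H4N5"],
    "V2": [10.751263, 12.131079, 11.075072, 1.091061, 0.590545],
    "V3": [0.216574, 0.000004, 0.214872, 0.000004, 0.000004],
    "V4": [0.703209, 1.883824, 0.000004, 3.503290, 1.730215],
    "V5": [10.674107, 0.018118, 10.674107, 0.091797, 0.223571],
})

# Columns with a numeric dtype (V2 to V5 here); V1 holds strings and is skipped.
cols = df.select_dtypes("number").columns

# Diagnostic: per-column count of entries that are missing or not strictly
# positive, i.e. entries for which log2 would give NaN or -inf.
bad_counts = (~(df[cols] > 0)).sum()

# Vectorized log2 on the numeric columns only, leaving df itself unchanged.
log2df = df.copy()
log2df[cols] = np.log2(df[cols])

print(bad_counts)
print(log2df)
```

If bad_counts is nonzero for a column on your real data, the NaNs most likely come from those entries rather than from 0.000004 itself.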