I have a data frame that looks like:
print(file.head())
miRNAs baseMean log2FoldChange padj
0 hsa-miR-31-5p 221.442806 -7.037259 4.360127e-27
1 hsa-miR-337-5p 14.711123 -5.790422 4.556183e-01
2 hsa-miR-196b-5p 162.278255 -5.652917 1.365264e-3
3 hsa-miR-584-5p 6.430919 -5.554578 4.077578e-04
4 hsa-miR-196a-5p 455.152841 -5.361830 1.019622e-59
What I want to do is color the points according to the file['padj']
column, using the following ranges:
file['padj'] > 0.01 #gray
file['padj'] <= 0.01 #red
file['padj'] <= 0.0001 #orange
I tried to do this using np.where,
but I could only get it to handle one of the conditions:
p = file.plot.scatter(x='baseMean',y='log2FoldChange',c=np.where(np.abs(file['padj'])>0.01, 'gray', 'b'), logx=True)
# specifying horizontal line type
plt.axhline(y = 0, color = 'r', linestyle = '-')
plt.show()
I could not work out how to pass multiple conditions to np.where.
Any help is appreciated.
CodePudding user response:
You could make a color column like so:
file['color'] = 'gray'
file.loc[file['padj']<=0.01, 'color'] = 'red'
file.loc[file['padj']<=0.0001, 'color'] = 'orange'
and plot like this:
plt.scatter(file['baseMean'], file['log2FoldChange'], color=file['color'])
#edit: hacky way to generate legend:
for label, color in zip(['>0.01', '<=0.01', '<=0.0001'], ['gray', 'red', 'orange']):
    plt.scatter([], [], c=color, label=label)
plt.legend()
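Since the question asks about np.where specifically: np.select is its multi-condition counterpart and may be closer to what you had in mind. This is a minimal sketch, assuming the column names from the question. The data frame below only re-creates the five rows shown there so that the snippet runs on its own.

```python
import numpy as np
import pandas as pd

# Re-creation of the five rows printed in the question (for illustration only)
file = pd.DataFrame({
    "miRNAs": ["hsa-miR-31-5p", "hsa-miR-337-5p", "hsa-miR-196b-5p",
               "hsa-miR-584-5p", "hsa-miR-196a-5p"],
    "baseMean": [221.442806, 14.711123, 162.278255, 6.430919, 455.152841],
    "log2FoldChange": [-7.037259, -5.790422, -5.652917, -5.554578, -5.361830],
    "padj": [4.360127e-27, 4.556183e-01, 1.365264e-3, 4.077578e-04, 1.019622e-59],
})

# np.select returns the choice for the FIRST condition that is True,
# so the strictest threshold has to come first.
conditions = [
    file["padj"] <= 0.0001,  # orange
    file["padj"] <= 0.01,    # red
]
choices = ["orange", "red"]

# Rows matching no condition (padj > 0.01) fall back to the default.
file["color"] = np.select(conditions, choices, default="gray")
```

The resulting file['color'] column can be passed to plt.scatter exactly as in the answer above. Note that rows with a missing padj fail both comparisons and therefore end up gray.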