I am trying to understand how a specific case (in this example, "c7") compares with the rest of the population, and I am using a boxplot to visualize this. I have a data frame in R with the following columns:
gene case log2fc      symbol
g1   c1   0.236291026 GG
g2   c2   0.073854478 GG
g3   c6   0.722921499 GG
g4   c7   0           GG
g5   c8   0.925691334 GG
g1   c3   0.412097286 HH
g2   c4   0.98899995  HH
g3   c5   0.494138717 HH
g4   c7   0.996523937 HH
g5   c9   0           HH
I would like to remove the rows where log2fc is 0, except for this specific case (c7), and then draw the boxplot. So far, I have managed to convert the 0s to NAs and remove those rows from the entire data frame, but I am not sure how to remove rows conditionally.
df[df == 0] <- NA
CodePudding user response:
In base R:
df[df$log2fc != 0 | df$case == 'c7',]
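To make the one-liner above easy to try, here is a minimal sketch that rebuilds the example data from the question and applies the same condition. The name `kept` is only illustrative, and the column types (character for `gene`, `case`, `symbol`; numeric for `log2fc`) are an assumption about how the original data frame was read in.

```r
# Rebuild the example data from the question (column types are assumed)
df <- data.frame(
  gene   = c("g1", "g2", "g3", "g4", "g5", "g1", "g2", "g3", "g4", "g5"),
  case   = c("c1", "c2", "c6", "c7", "c8", "c3", "c4", "c5", "c7", "c9"),
  log2fc = c(0.236291026, 0.073854478, 0.722921499, 0, 0.925691334,
             0.412097286, 0.98899995, 0.494138717, 0.996523937, 0),
  symbol = rep(c("GG", "HH"), each = 5)
)

# Keep a row if its value is non-zero OR it belongs to case c7
kept <- df[df$log2fc != 0 | df$case == "c7", ]

# Only the zero row for c9 is dropped; the zero row for c7 stays
nrow(kept)
```

The `|` means a row needs to satisfy only one of the two conditions, so the c7 rows are kept regardless of their value.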
Tidyverse solution:
library(dplyr)
df %>%
  filter(log2fc != 0 | case == 'c7')
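One thing to watch for: both versions above assume the zeros are still in the data. If they have already been replaced with NA, as in the question, then `log2fc != 0` evaluates to NA for those rows, and base R indexing returns an all-NA row instead of dropping it. A rough sketch of the difference follows; the names `naive` and `safe` are only illustrative, and the column-wise NA assignment stands in for the `df[df == 0] <- NA` step from the question.

```r
# Rebuild the example data from the question (column types are assumed)
df <- data.frame(
  gene   = c("g1", "g2", "g3", "g4", "g5", "g1", "g2", "g3", "g4", "g5"),
  case   = c("c1", "c2", "c6", "c7", "c8", "c3", "c4", "c5", "c7", "c9"),
  log2fc = c(0.236291026, 0.073854478, 0.722921499, 0, 0.925691334,
             0.412097286, 0.98899995, 0.494138717, 0.996523937, 0),
  symbol = rep(c("GG", "HH"), each = 5)
)

# Simulate the state after the zeros were converted to NA
df$log2fc[df$log2fc == 0] <- NA

# Naive condition: NA | FALSE is NA, so the c9 row comes back as an all-NA row
naive <- df[df$log2fc != 0 | df$case == "c7", ]

# NA-aware condition: keep rows with a value, plus every c7 row
safe <- df[!is.na(df$log2fc) | df$case == "c7", ]

nrow(naive)
nrow(safe)
```

Note that in this state the c7 row in GG holds NA rather than 0, so a boxplot would ignore it. Filtering the original data, before any conversion to NA, avoids that.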
CodePudding user response:
library(dplyr)
df %>%
  filter(log2fc != 0 | case == "c7")
  gene case     log2fc symbol
1   g1   c1 0.23629103     GG
2   g2   c2 0.07385448     GG
3   g3   c6 0.72292150     GG
4   g4   c7 0.00000000     GG
5   g5   c8 0.92569133     GG
6   g1   c3 0.41209729     HH
7   g2   c4 0.98899995     HH
8   g3   c5 0.49413872     HH
9   g4   c7 0.99652394     HH