I have a data frame with an ID column, a Gene_ID column, and a Value column. Several rows can share the same Gene_ID, and some rows have no Gene_ID value.
I'd like to keep only the rows with a non-empty Gene_ID, and only one row for each Gene_ID. For example, I have the data frame below:
# ID Gene_ID Value
# 6 26470 1.137318
# 7 10878 -1.051181
# 8 "" -1.316229
# 9 26470 -1.015734
And I want the result to be:
# ID Gene_ID Value
# 6 26470 1.137318
# 7 10878 -1.051181
CodePudding user response:
library(tidyverse)

df %>%
  filter(Gene_ID != '') %>%  # drop rows with an empty Gene_ID
  group_by(Gene_ID) %>%
  slice(1) %>%               # keep the first row of each group
  ungroup()
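This will keep the first row per Gene_ID.
Note that the filter condition depends on how missing Gene_ID values are stored: Gene_ID != '' works when they are empty strings, but if they are stored as NA you would use !is.na(Gene_ID) instead.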
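If you would rather not load the tidyverse, the same idea can be sketched in base R. This is only an illustration: it rebuilds the example data from the question, assumes Gene_ID is stored as a character column, and treats both NA and the empty string as missing. The names has_id, kept and result are made up for the example.

```r
# Example data from the question; the empty string marks a missing Gene_ID.
df <- data.frame(
  ID      = c(6, 7, 8, 9),
  Gene_ID = c("26470", "10878", "", "26470"),
  Value   = c(1.137318, -1.051181, -1.316229, -1.015734),
  stringsAsFactors = FALSE
)

# Keep rows whose Gene_ID is neither NA nor an empty string.
has_id <- !is.na(df$Gene_ID) & df$Gene_ID != ""
kept <- df[has_id, ]

# duplicated() flags every repeat after the first occurrence,
# so negating it keeps the first row for each Gene_ID.
result <- kept[!duplicated(kept$Gene_ID), ]
print(result)
```

On the example data this leaves the rows with ID 6 and 7, in their original order.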