I have a df with a column "SOS" that has a numeric value in each row, except that some rows contain "Strength of Schedule" instead of a number. How do I delete those rows using an if statement, so I don't need to manually go in and check row numbers?
Thanks
CodePudding user response:
Here is a tidyverse solution:
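library(dplyr)

# Example data: mixing numbers with text makes the whole column character
df <- data.frame(SOS = c(0, 1, 2, 3, "Strength of Schedule"))

# Keep only the rows whose SOS value is not the stray header text
clean_df <- df %>%
  filter(SOS != "Strength of Schedule")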
CodePudding user response:
Suppose you have your data frame in the variable myDf. You could try myDf <- myDf[myDf$SOS != "Strength of Schedule", ]. That gives you every row of the data frame where the value of the SOS column is not "Strength of Schedule".
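A minimal sketch of that approach in base R, assuming SOS was read in as a character column (which is what usually happens when text is mixed in with the numbers). The example data and the team column are made up for illustration. After the subset, the remaining values are still strings, so you would probably want to convert them:

```r
# Hypothetical example data; the stray header text makes SOS a character column
myDf <- data.frame(
  team = c("A", "B", "C", "D"),
  SOS  = c("0.5", "-1.2", "Strength of Schedule", "3"),
  stringsAsFactors = FALSE
)

# Drop the rows that hold the repeated header text
myDf <- myDf[myDf$SOS != "Strength of Schedule", ]

# The remaining values are still strings, so convert them to numbers
myDf$SOS <- as.numeric(myDf$SOS)

str(myDf)
```

Note that if SOS already contains NA values, the comparison returns NA for those rows and the subset will produce all-NA rows, so you may need to handle them separately.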
CodePudding user response:
You can filter out the literal text, or, for a broader case, use a regex to keep only the values that look like numbers.
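library(dplyr)
library(stringr)

# Drop the rows that contain the header text
df %>%
  filter(!str_detect(SOS, "Strength of Schedule"))

# Keep only positive or negative numbers (integer or decimal)
df %>%
  filter(str_detect(SOS, '(^\\d+$)|(^\\d+\\.\\d+$)|(^-\\d+$)|(^-\\d+\\.\\d+$)'))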
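If the goal is just "keep whatever is a number", a possible alternative to the regex is to let R try the conversion and drop whatever fails. This is only a sketch in base R with made-up example data, and it assumes SOS is a character column. Unlike the regex above, it also accepts forms such as scientific notation:

```r
# Hypothetical example data mixing numbers with the repeated header text
df <- data.frame(
  SOS = c("0", "1.5", "-2", "Strength of Schedule", "1e3"),
  stringsAsFactors = FALSE
)

# as.numeric() returns NA (with a warning) for anything it cannot parse
sos_num <- suppressWarnings(as.numeric(df$SOS))

# Keep only the rows that parsed, then store the numeric values
clean_df <- df[!is.na(sos_num), , drop = FALSE]
clean_df$SOS <- sos_num[!is.na(sos_num)]
```

Keep in mind that this also drops rows where SOS was already NA, since they cannot be told apart from failed conversions.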