I have a list of data frames that share the same column structure. In one of the columns, several rows contain just "[]". My goal is to replace these instances with a blank.
I've attempted to do so with the map function and grepl. The code runs, but the output does not change. Am I going in the right direction here?
Please note that I differentiate between "[]" and "[value]".
I only want to replace the empty brackets with blanks.
My code below:
first_column <- c("1", "2", "3","4")
second_column <- c("value1", "value2","[]","[value]")
first_column_2 <- c("5", "6", "7","8")
second_column_2 <- c("value1", "[]","[]","[value2]")
first_column_3<- c("9", "10", "11","12")
second_column_3 <- c("[]", "[value2]","[]","[]")
df_1 <- data.frame(first_column,second_column)
df_2 <- data.frame(first_column_2,second_column_2)
df_3 <- data.frame(first_column_3,second_column_3)
df_list <- list(df_1,df_2,df_3)
var <- c(2)
df_list <- map(df_list, ~.x[!grepl("[[]",var),])
Thanks!
CodePudding user response:
We can use lapply and gsub to accomplish this. grepl only returns a logical vector telling you which elements match a pattern, whereas gsub allows you to replace matches with something else. Note that instead of specifying an empty string (''), you could just as easily specify NA, but that will depend on your definition of "blank".
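To make the difference concrete, here is a minimal sketch on a made-up vector (x is a hypothetical name used only for illustration, not something from the question):

x <- c("value1", "[]", "[value]")

# grepl() only reports which elements match; it changes nothing
grepl("\\[\\]", x)
#> [1] FALSE  TRUE FALSE

# gsub() returns a copy of x with every match replaced
gsub("\\[\\]", "", x)
#> [1] "value1"  ""        "[value]"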
Here I use base R's lapply, which in this case is equivalent to purrr::map (even the syntax is interchangeable here).
library(dplyr)  # provides %>%, mutate() and across()

data <- lapply(df_list, function(x) {
  x %>%
    mutate(across(where(is.character), ~ gsub('\\[\\]', '', .x)))
})
data
[[1]]
  first_column second_column
1            1        value1
2            2        value2
3            3
4            4       [value]

[[2]]
  first_column_2 second_column_2
1              5          value1
2              6
3              7
4              8        [value2]

[[3]]
  first_column_3 second_column_3
1              9
2             10        [value2]
3             11
4             12
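One caveat, offered as a sketch rather than as part of the answer above: the pattern '\\[\\]' also matches a "[]" embedded in a longer string such as "a[]b". If only cells that are exactly "[]" should be blanked, anchoring the pattern with ^ and $ is one option. The helper name clean_brackets and the example data frame are hypothetical, and this version uses base R only.

# Hypothetical helper: blank out cells that are exactly "[]" in every
# character column of one data frame
clean_brackets <- function(df) {
  is_chr <- vapply(df, is.character, logical(1))
  df[is_chr] <- lapply(df[is_chr], function(col) sub("^\\[\\]$", "", col))
  df
}

example <- data.frame(id  = c("1", "2", "3"),
                      val = c("[]", "a[]b", "[value]"),
                      stringsAsFactors = FALSE)

clean_brackets(example)$val
#> [1] ""        "a[]b"    "[value]"

Applied to the whole list, this would be lapply(df_list, clean_brackets).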
CodePudding user response:
You've got a few issues:
- (a) You say you want to replace "[]" with "", but your code is trying to drop those rows completely, not replace them. Use sub instead of grepl for replacing, or even better, since you are matching a whole string, don't use regex at all.
- (b) You are running grepl on the number 2: you have var <- 2 and your command is grepl("[[]", var), which is grepl("[[]", 2), which is always FALSE because the string "2" doesn't contain a bracket. Negated, that gives TRUE, so every row is kept and nothing changes.
- (c) Your grepl pattern searches for any string that contains a [ in it, so it would also match "[value]".
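Points (b) and (c) can be checked directly in the console. This is a small sketch; the vector vals is a hypothetical stand-in for one of the columns in the question:

var <- 2
vals <- c("value1", "[]", "[value]")

# (b) the pattern is tested against "2", never against the column values
grepl("[[]", var)
#> [1] FALSE

# (c) "[[]" is a character class holding a single "[", so it matches any
# string that contains an opening bracket, not just "[]"
grepl("[[]", vals)
#> [1] FALSE  TRUE  TRUE

# an exact comparison separates "[]" from "[value]" without any regex
vals == "[]"
#> [1] FALSE  TRUE FALSE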
As I said in (a), when you're matching a full string, you don't need regex at all. I'd do it like this:
library(purrr)  # provides map()

df_list <- map(df_list, ~ {
  # overwrite only the cells in column `var` that are exactly "[]"
  .x[[var]][.x[[var]] == "[]"] <- ""
  .x
})
df_list
# [[1]]
#   first_column second_column
# 1            1        value1
# 2            2        value2
# 3            3
# 4            4       [value]
#
# [[2]]
#   first_column_2 second_column_2
# 1              5          value1
# 2              6
# 3              7
# 4              8        [value2]
#
# [[3]]
#   first_column_3 second_column_3
# 1              9
# 2             10        [value2]
# 3             11
# 4             12