I have to change the range 1-4 to 0-3 for ten columns (replace 1 with 0, replace 2 with 1, and so on). I have already replaced those values for one column, shown below, but I still have to do the same for 9 other columns (AM01_02 - AM01_10). How could I do this in a simpler way, without taking up that much space? Thanks a lot.
#change range 1-4 to 0-3
#replace 1 with 0
ba_data$AM01_01[ba_data$AM01_01 == 1] <- 0
#replace 2 with 1
ba_data$AM01_01[ba_data$AM01_01 == 2] <- 1
#replace 3 with 2
ba_data$AM01_01[ba_data$AM01_01 == 3] <- 2
#replace 4 with 3
ba_data$AM01_01[ba_data$AM01_01 == 4] <- 3
CodePudding user response:
Could use:
v <- sprintf("AM01_%02d", 1:10)
ba_data[v] <- ba_data[v] - 1
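As a quick check of this idea, here is a minimal sketch on a made-up data frame. The name toy and all of its values are invented for illustration; only the AM01_ naming pattern comes from the question. It assumes the selected columns are numeric.

```r
# Toy stand-in for ba_data; the values are invented for illustration.
toy <- data.frame(
  id      = 1:4,
  AM01_01 = c(1, 2, 3, 4),
  AM01_02 = c(4, 3, 2, 1),
  AM01_10 = c(2, 2, 4, NA)
)

# "%02d" zero-pads to two digits: "AM01_01", ..., "AM01_10".
v <- sprintf("AM01_%02d", 1:10)

# Keep only the names present in this toy frame (the real data should have all ten).
v <- intersect(v, names(toy))

# Subtraction is vectorized over every selected column; NA stays NA.
toy[v] <- toy[v] - 1
```

The id column is left alone because it is not in v, which is the main reason to select columns by name rather than subtracting from the whole frame.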
CodePudding user response:
As an alternative to Zheyuan Li's simpler method, which works really well for arithmetic, this approach is a little more general and suits operations that are more complex:
# select the columns by position
cols_to_change <- c(2, 4, 5, 8, 10)
# or by name
cols_to_change <- c("AM01_01", "AM01_02", ...)
# the operation to apply to each selected column
myfun <- function(z) z - 1
ba_data[cols_to_change] <- lapply(ba_data[cols_to_change], myfun)
Walk-through:
- lapply(L, F) iterates the function F over each "element" in L (a list). In R, a data.frame is mostly just a list where each element (column) is the same length.
- Because lapply(..) returns a list, and the columns you're working on are likely a subset of the entire frame, we need to assign it back to the respective columns; ergo ba_data[cols_to_change] <-
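Both points of the walk-through can be checked directly. This is a small sketch on an invented frame; toy, its column names, and its values are assumptions for illustration only.

```r
# Invented example frame: two numeric columns and one character column.
toy <- data.frame(a = c(1, 2), b = c(3, 4), c = c("x", "y"),
                  stringsAsFactors = FALSE)

# A data.frame is a list of equal-length columns.
frame_is_list <- is.list(toy)

# lapply() visits each selected column and returns a plain list.
cols <- c("a", "b")
res <- lapply(toy[cols], function(z) z - 1)
res_class <- class(res)

# Assign the list back into the same columns; column c is untouched.
toy[cols] <- res
```

Because res keeps the column names, the assignment puts each result back into the column it came from.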
The reason this is more general and can be useful: if your operation is more of a "lookup" than a "subtract one", you can change myfun to be more specific. For instance, if in all of these columns you need to replace 1 with 21, 2 with 97, and 3 with -1, and leave all other values intact, then you might write the function as:
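myfun <- function(z, lookup) {
  # compare against the original values so that one replacement
  # cannot be picked up again by a later lookup entry
  key <- as.character(z)
  for (nm in names(lookup)) {
    z <- ifelse(key == nm, lookup[[nm]], z)
  }
  z
}
ba_data[cols_to_change] <-
  lapply(ba_data[cols_to_change],
         function(x) myfun(x, c("1"=21, "2"=97, "3"=-1)))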
If you use a lookup like this, note that I named the entries as strings regardless of the class of the original data: names in R are always character strings, and a name that starts with (or consists entirely of) digits is not syntactically valid, so it has to be quoted.
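For longer lookups, the loop can be swapped for a vectorized match(). This is a different technique from the function above, not a restatement of it, and the names recode_with_lookup and toy are hypothetical. The sketch assumes numeric columns and leaves any value without a lookup entry (including NA) unchanged.

```r
# Lookup entries are named with quoted strings, as discussed above.
lookup <- c("1" = 21, "2" = 97, "3" = -1)

# Hypothetical helper: recode z through a named lookup vector.
recode_with_lookup <- function(z, lookup) {
  # Position of each value among the lookup names; NA when there is no entry.
  idx <- match(as.character(z), names(lookup))
  hit <- !is.na(idx)
  # Replace only the values that have an entry; everything else stays as is.
  z[hit] <- unname(lookup[idx[hit]])
  z
}

# Invented data for illustration.
toy <- data.frame(AM01_01 = c(1, 2, 3, 4), AM01_02 = c(3, 3, NA, 1))
toy[] <- lapply(toy, recode_with_lookup, lookup = lookup)
```

Since every value is matched once against its original form, a replacement can never be recoded a second time, even if a lookup value happens to equal another lookup name.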