Divide multiple columns by a value based on each condition

I have a dataset with 3 different conditions. Data in condition 1 need to be divided by 15, and data in conditions 2 and 3 need to be divided by 10. I tried using a for() loop to create a separate dataset for each group and then merge the two groups (group 1 is condition 1; group 2 is conditions 2 and 3). This is what I have so far for condition 1. Is there an easier way to do this that does not require creating subgroups?

Group1 <- NULL

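for (val in unique(PronounData$ParticipantID)) {
  # rows for this participant in condition 1 (may be empty)
  ParticipantID_subset_Group1 <- subset(PronounData, ParticipantID == val & Condition == "1")

  # divide by the number 15, not the string "15"
  I_Words_PPM <- ParticipantID_subset_Group1$I_Words / 15
  YOU_Words_PPM <- ParticipantID_subset_Group1$YOU_Words / 15
  WE_Words_PPM <- ParticipantID_subset_Group1$WE_Words / 15

  # take ID and condition from the subset so every column has the same length
  df <- data.frame(ParticipantID = ParticipantID_subset_Group1$ParticipantID,
                   Condition = ParticipantID_subset_Group1$Condition,
                   I_Words_PPM, YOU_Words_PPM, WE_Words_PPM)
  Group1 <- rbind(Group1, df)
}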

dim(Group1)
colnames(Group1) <- c("ParticipantID", "Condition", "I_Words_PPM", "YOU_Words_PPM", "WE_Words_PPM")
View(Group1)

Answer:

Couldn't fully test this solution without example data, but this should do what you want:

# make some fake data
PronounData <- data.frame(
  ParticipantID = 1:9,
  Condition = rep(1:3, 3),
  I_Words = sample(0:20, 9, replace = TRUE), 
  YOU_Words = sample(0:40, 9, replace = TRUE), 
  WE_Words = sample(0:10, 9, replace = TRUE)
)

# if Condition 1, divide by 15
PronounData[PronounData$Condition == 1, c("I_Words_PPM", "YOU_Words_PPM", "WE_Words_PPM")] <- 
  PronounData[PronounData$Condition == 1, c("I_Words", "YOU_Words", "WE_Words")] / 15

# if Condition 2 or 3, divide by 10
PronounData[PronounData$Condition %in% 2:3, c("I_Words_PPM", "YOU_Words_PPM", "WE_Words_PPM")] <- 
  PronounData[PronounData$Condition %in% 2:3, c("I_Words", "YOU_Words", "WE_Words")] / 10

# result (the fake data are random, so your numbers will differ)
PronounData

#   ParticipantID Condition I_Words YOU_Words WE_Words I_Words_PPM YOU_Words_PPM WE_Words_PPM
# 1             1         1      17        40        6      1.1333        2.6667       0.4000
# 2             2         2      14         1        6      1.4000        0.1000       0.6000
# 3             3         3       2        34        8      0.2000        3.4000       0.8000
# 4             4         1       0        33        1      0.0000        2.2000       0.0667
# 5             5         2       4        15        0      0.4000        1.5000       0.0000
# 6             6         3       1         7        6      0.1000        0.7000       0.6000
# 7             7         1       6        10        1      0.4000        0.6667       0.0667
# 8             8         2       1        33        9      0.1000        3.3000       0.9000
# 9             9         3       9        40        0      0.9000        4.0000       0.0000

Note: R is built around vectorized operations, so looping over rows one at a time is rarely the best solution. Instead, you generally want to modify whole vectors or columns at once, or at least subsets of them. This is usually both faster and simpler.
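
To illustrate the point about whole-column operations, here is a minimal sketch that avoids subsetting by condition altogether. It builds one divisor per row with ifelse() and divides all three columns in a single step. The toy values and the helper names word_cols and divisor are made up for illustration; only the column names come from the question.

```r
# Toy data (hypothetical values); column names follow the question
PronounData <- data.frame(
  ParticipantID = 1:6,
  Condition = rep(1:3, 2),
  I_Words = c(15, 10, 20, 30, 5, 0),
  YOU_Words = c(30, 20, 10, 45, 15, 40),
  WE_Words = c(0, 10, 30, 15, 25, 5)
)

# Columns to rescale (illustrative helper name)
word_cols <- c("I_Words", "YOU_Words", "WE_Words")

# One divisor per row: 15 for condition 1, 10 for conditions 2 and 3
divisor <- ifelse(PronounData$Condition == 1, 15, 10)

# Dividing a data frame by a vector with one element per row
# divides every column element-wise, so each row uses its own divisor
PronounData[paste0(word_cols, "_PPM")] <- PronounData[word_cols] / divisor

PronounData
```

Because the divisor is computed once for every row, adding a fourth condition would only mean changing the ifelse() call (or swapping it for a lookup vector), not adding another block of subset assignments.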