Change value in a column if a criterion is meet and use value from another column-CodePudding

I would like to replace the values in the column called Expenses with the value of the column Average Expenses if the value in column Expenses is zero or negative.

This is my data:

structure(list(Product = c("A", "A", "A", "B", "B", "B", "C", 
"C", "C"), Date = c("09.12.2019", "10.12.2019", "11.12.2019", 
"09.12.2019", "10.12.2019", "11.12.2019", "09.12.2019", "10.12.2019", 
"11.12.2019"), Expenses = c(0.2, 0.2, 0.3, -0.03, 0, 0.3, 0, 
0.1, 0.1), `Average Expenses` = c(0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 
0.2, 0.2, 0.2)), class = "data.frame", row.names = c(NA, -9L))

I have tried the following:

df <- Data %>%
  mutate(Expenses = replace(Expenses,
                            Expenses<=0, `Average Expenses` 
  ))

It works but I get a warning message: "Warning message: In x[list] <- values : number of items to replace is not a multiple of replacement length"

Can I ignore that or why is R warning me? Since I cannot see something that is wrong in the output.

CodePudding user response：

You need to index the return variable within your replace statement to tell it that you want those values returned WHEN the condition is satisfied. If you omit that, you get the warnings that say that it is trying to return more values for each entry (instead of just one that would be the one matching the criteria of Expenses <= 0.

replace(Expenses, Expenses<=0, `Average Expenses`[Expenses <= 0])

CodePudding user response：

Data %>%
  mutate(Expenses = ifelse(Expenses <= 0, `Average Expenses`, Expenses))

CodePudding user response：

case_when option:

library(dplyr)
df %>%
  mutate(Expenses = case_when(Expenses < 0 ~ `Average Expenses`,
                              TRUE ~ Expenses))

Output:

  Product       Date Expenses Average Expenses
1       A 09.12.2019      0.2              0.2
2       A 10.12.2019      0.2              0.2
3       A 11.12.2019      0.3              0.2
4       B 09.12.2019      0.2              0.2
5       B 10.12.2019      0.0              0.2
6       B 11.12.2019      0.3              0.2
7       C 09.12.2019      0.0              0.2
8       C 10.12.2019      0.1              0.2
9       C 11.12.2019      0.1              0.2

In data.table:

library(data.table)
setDT(df)
df[, Expenses := ifelse(Expenses < 0, `Average Expenses`, Expenses)]

Output:

   Product       Date Expenses Average Expenses
1:       A 09.12.2019      0.2              0.2
2:       A 10.12.2019      0.2              0.2
3:       A 11.12.2019      0.3              0.2
4:       B 09.12.2019      0.2              0.2
5:       B 10.12.2019      0.0              0.2
6:       B 11.12.2019      0.3              0.2
7:       C 09.12.2019      0.0              0.2
8:       C 10.12.2019      0.1              0.2
9:       C 11.12.2019      0.1              0.2

CodePudding user response：

df <- structure(list(Product = c("A", "A", "A", "B", "B", "B", "C", 
                           "C", "C"), Date = c("09.12.2019", "10.12.2019", "11.12.2019", 
                                               "09.12.2019", "10.12.2019", "11.12.2019", "09.12.2019", "10.12.2019", 
                                               "11.12.2019"), Expenses = c(0.2, 0.2, 0.3, -0.03, 0, 0.3, 0, 
                                                                           0.1, 0.1), `Average Expenses` = c(0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 
                                                                                                             0.2, 0.2, 0.2)), class = "data.frame", row.names = c(NA, -9L))



library(data.table)
setDT(df)[Expenses <= 0, Expenses := `Average Expenses`][]
#>    Product       Date Expenses Average Expenses
#> 1:       A 09.12.2019      0.2              0.2
#> 2:       A 10.12.2019      0.2              0.2
#> 3:       A 11.12.2019      0.3              0.2
#> 4:       B 09.12.2019      0.2              0.2
#> 5:       B 10.12.2019      0.2              0.2
#> 6:       B 11.12.2019      0.3              0.2
#> 7:       C 09.12.2019      0.2              0.2
#> 8:       C 10.12.2019      0.1              0.2
#> 9:       C 11.12.2019      0.1              0.2

^{Created on 2022-05-27 by the reprex package (v2.0.1)}