I have a table with two columns A and B. I want to create a new table with two new columns added: X and Y.
The X column is to contain the values from the A column, but with the division performed. Values from the first row (from column A) divided by the values from the second row in column A and so for all subsequent rows, e.g. the third row divided by the fourth row etc.
The Y column is to contain the values from the B column, but with the division performed. Values from the first row (from column B) divided by the values from the second row in column B and so for all subsequent rows, e.g. the third row divided by the fourth row etc.
So far I used Excel for this. But now I need it in R if possible in the form of a function so that I can reuse this code easily. I haven't done this in R yet, so I am asking for help.
Example data:
structure(list(A = c(2L, 7L, 5L, 11L, 54L, 12L, 34L, 14L, 10L,
6L), B = c(3L, 5L, 1L, 21L, 67L, 32L, 19L, 24L, 44L, 37L)), class = "data.frame", row.names = c(NA,
-10L))
Sample results:
structure(list(A = c(2L, 7L, 5L, 11L, 54L, 12L, 34L, 14L, 10L,
6L), B = c(3L, 5L, 1L, 21L, 67L, 32L, 19L, 24L, 44L, 37L), X = c("",
"0,285714286", "", "0,454545455", "", "4,5", "", "2,428571429",
"", "1,666666667"), Y = c("", "0,6", "", "0,047619048", "", "2,09375",
"", "0,791666667", "", "1,189189189")), class = "data.frame", row.names = c(NA,
-10L))
CodePudding user response:
You could use dplyr
's across
and lag
(combined with modulo for picking every second row):
library(dplyr)
df |> mutate(across(c(A, B), ~ ifelse(row_number() %% 2 == 0, lag(.) / ., NA), .names = "new_{.col}"))
If you want a character vector change NA
to ""
.
Output:
A B new_A new_B
1 2 3 NA NA
2 7 5 0.2857143 0.60000000
3 5 1 NA NA
4 11 21 0.4545455 0.04761905
5 54 67 NA NA
6 12 32 4.5000000 2.09375000
7 34 19 NA NA
8 14 24 2.4285714 0.79166667
9 10 44 NA NA
10 6 37 1.6666667 1.18918919
Function:
ab_fun <- function(data, vars) {
data |>
mutate(across(c(A, B), ~ ifelse(row_number() %% 2 == 0, lag(.) / ., NA), .names = "new_{.col}"))
}
ab_fun(df, c(A,B))
Updated with new data and correct code. Function