Home > database >  R get new table
R get new table

Time:07-13

I have a table with two columns A and B. I want to create a new table with two new columns added: X and Y.

The X column is to contain the values from the A column, but with the division performed. Values from the first row (from column A) divided by the values from the second row in column A and so for all subsequent rows, e.g. the third row divided by the fourth row etc.

The Y column is to contain the values from the B column, but with the division performed. Values from the first row (from column B) divided by the values from the second row in column B and so for all subsequent rows, e.g. the third row divided by the fourth row etc.

So far I used Excel for this. But now I need it in R if possible in the form of a function so that I can reuse this code easily. I haven't done this in R yet, so I am asking for help.

Example data:

   structure(list(A = c(2L, 7L, 5L, 11L, 54L, 12L, 34L, 14L, 10L, 
6L), B = c(3L, 5L, 1L, 21L, 67L, 32L, 19L, 24L, 44L, 37L)), class = "data.frame", row.names = c(NA, 
-10L))

Sample results:

structure(list(A = c(2L, 7L, 5L, 11L, 54L, 12L, 34L, 14L, 10L, 
6L), B = c(3L, 5L, 1L, 21L, 67L, 32L, 19L, 24L, 44L, 37L), X = c("", 
"0,285714286", "", "0,454545455", "", "4,5", "", "2,428571429", 
"", "1,666666667"), Y = c("", "0,6", "", "0,047619048", "", "2,09375", 
"", "0,791666667", "", "1,189189189")), class = "data.frame", row.names = c(NA, 
-10L))

CodePudding user response:

You could use dplyr's across and lag (combined with modulo for picking every second row):

library(dplyr)

df |> mutate(across(c(A, B), ~ ifelse(row_number() %% 2 == 0, lag(.) / ., NA), .names = "new_{.col}"))

If you want a character vector change NA to "".

Output:

    A  B     new_A      new_B
1   2  3        NA         NA
2   7  5 0.2857143 0.60000000
3   5  1        NA         NA
4  11 21 0.4545455 0.04761905
5  54 67        NA         NA
6  12 32 4.5000000 2.09375000
7  34 19        NA         NA
8  14 24 2.4285714 0.79166667
9  10 44        NA         NA
10  6 37 1.6666667 1.18918919

Function:

ab_fun <- function(data, vars) { 
  
  data |> 
    mutate(across(c(A, B), ~ ifelse(row_number() %% 2 == 0, lag(.) / ., NA), .names = "new_{.col}"))
  
}

ab_fun(df, c(A,B))

Updated with new data and correct code. Function

  • Related