Home > database >  R adding columns and data
R adding columns and data

Time:07-12

I have a table with two columns A and B. I want to create a new table with two new columns added: X and Y. These two new columns are to contain data from column A, but every second row from column A. Correspondingly for column X, starting from the first value in column A and from the second value in column A for column Y.

So far, I have been doing it in Excel. But now I need it in R best function form so that I can easily reuse that code. I haven't done this in R yet, so I am asking for help.

Example data:

structure(list(A = c(2L, 7L, 5L, 11L, 54L, 12L, 34L, 14L, 10L, 
6L), B = c(3L, 5L, 1L, 21L, 67L, 32L, 19L, 24L, 44L, 37L)), class = "data.frame", row.names = c(NA, 
-10L))

Sample result:

structure(list(A = c(2L, 7L, 5L, 11L, 54L, 12L, 34L, 14L, 10L, 
6L), B = c(3L, 5L, 1L, 21L, 67L, 32L, 19L, 24L, 44L, 37L), X = c(2L, 
NA, 5L, NA, 54L, NA, 34L, NA, 10L, NA), Y = c(NA, 7L, NA, 11L, 
NA, 12L, NA, 14L, NA, 6L)), class = "data.frame", row.names = c(NA, 
-10L))

CodePudding user response:

You could also make use of the row numbers and the modulo operator:

A simple ifelse way:

library(dplyr)

df |>
  mutate(X = ifelse(row_number() %% 2 == 1, A, NA),
         Y = ifelse(row_number() %% 2 == 0, A, NA))

Or using pivoting:

library(dplyr)
library(tidyr)

df |>
    mutate(name = ifelse(row_number() %% 2 == 1, "X", "Y"),
           value = A) |>
    pivot_wider()

A function using the first approach could look like:

See comment

xy_fun <- function(data, A = A, X = X, Y = Y) { 
  
  data |> 
    mutate({{X}} := ifelse(row_number() %% 2 == 1, {{A}}, NA),
           {{Y}} := ifelse(row_number() %% 2 == 0, {{A}}, NA))
  
  }

xy_fun(df, # Your data
       A, # The col to take values from
       X, # The column name of the first new column
       Y # The column name of the second new column
)

Output:

    A  B  X  Y
1   2  3  2 NA
2   7  5 NA  7
3   5  1  5 NA
4  11 21 NA 11
5  54 67 54 NA
6  12 32 NA 12
7  34 19 34 NA
8  14 24 NA 14
9  10 44 10 NA
10  6 37 NA  6

Data stored as df:

df <- structure(list(A = c(2L, 7L, 5L, 11L, 54L, 12L, 34L, 14L, 10L, 6L),
                     B = c(3L, 5L, 1L, 21L, 67L, 32L, 19L, 24L, 44L, 37L)
                     ),
                class = "data.frame",
                row.names = c(NA, -10L)
                )

CodePudding user response:

It is not a super elegant solution, but it works:

exampleDF <- structure(list(A = c(2L, 7L, 5L, 11L, 54L, 
                                  12L, 34L, 14L, 10L, 6L), 
                            B = c(3L, 5L, 1L, 21L, 67L, 
                                  32L, 19L, 24L, 44L, 37L)), 
                       class = "data.frame", row.names = c(NA, -10L))

index <- seq(from = 1, to = nrow(exampleDF), by = 2)

exampleDF$X <- NA
exampleDF$X[index] <- exampleDF$A[index]

exampleDF$Y <- exampleDF$A
exampleDF$Y[index] <- NA

CodePudding user response:

I like the @harre approach:

Another approach with base R we could ->

Use R's recycling ability (of a shorter-vector to a longer-vector):

df$X <- df$A
df$Y <- df$B
df$X[c(FALSE, TRUE)] <- NA
df$Y[c(TRUE, FALSE)] <- NA
df

 A  B  X  Y
1   2  3  2 NA
2   7  5 NA  5
3   5  1  5 NA
4  11 21 NA 21
5  54 67 54 NA
6  12 32 NA 32
7  34 19 34 NA
8  14 24 NA 24
9  10 44 10 NA
10  6 37 NA 37
  • Related