R: Create one single column out of two seperate ones-CodePudding

I would like to create a new column out of the columns "Score 1" and "Score 2" (see below my data). The new column should be named "Score" and has 14 rows since I would like that after the values from column "Score 1" in columns 8 to 14 the values from column "Score 2" are added.

My data looks like this:

structure(list(`Score 1` = c(1L, 2L, 2L, 3L, 4L, 4L, 2L), `Score 2` = c(3L, 
4L, 4L, 3L, 3L, 2L, NA)), class = "data.frame", row.names = c(NA, 
-7L))

I tried to do the following: Data_new <- data.frame(Score=unlist(Data)) This does not work since the two columns don't have the same number of rows.

In the end my data should look like this:

structure(list(Score = c(1L, 2L, 2L, 3L, 4L, 4L, 2L, 3L, 4L, 
4L, 3L, 3L, 2L)), class = "data.frame", row.names = c(NA, -13L
))

Could someone help me with the code?

CodePudding user response：

You were close, just omit the NAs using na.omit and use unname to get clean row names.

new_dat <- na.omit(data.frame(Score = unname(unlist(dat))))
new_dat
#    Score
# 1      1
# 2      2
# 3      2
# 4      3
# 5      4
# 6      4
# 7      2
# 8      3
# 9      4
# 10     4
# 11     3
# 12     3
# 13     2

Data

dat <- structure(list(`Score 1` = c(1L, 2L, 2L, 3L, 4L, 4L, 2L), `Score 2` = c(3L, 
4L, 4L, 3L, 3L, 2L, NA)), class = "data.frame", row.names = c(NA, 
-7L))

CodePudding user response：

library(tidyverse)
df %>%
  # cast the two `Score` columns longer:
  pivot_longer(starts_with("Score")) %>%
  # change the column name:
  rename(Score = value) %>%
  # select the new column:
  select(Score) %>%
  # remove `NA`:
  na.omit()

CodePudding user response：

Here's an example using pivot_longer.

edit: remove NA

library(tidyr)
df <-
    structure(list(
        `Score 1` = c(1L, 2L, 2L, 3L, 4L, 4L, 2L),
        `Score 2` = c(3L,
                                    4L, 4L, 3L, 3L, 2L, NA)
    ),
    class = "data.frame",
    row.names = c(NA,-7L))
    
new <- 
    df %>% 
    pivot_longer(names_to = 'Delete', cols=c('Score 1', 'Score 2')) %>%
    select(Score=value) %>% na.omit()
    
new
# A tibble: 14 × 1
# Score
#<int>
# 1     1
# 2     3
# 3     2
# 4     4
# 5     2
# 6     4
# 7     3
# 8     3
# 9     4
# 10     3
# 11     4
# 12     2
# 13     2

CodePudding user response：

Another short base R option is to use stack

na.omit(stack(Data))
#   values     ind
#1       1 Score 1
#2       2 Score 1
#3       2 Score 1
#4       3 Score 1
#5       4 Score 1
#6       4 Score 1
#7       2 Score 1
#8       3 Score 2
#9       4 Score 2
#10      4 Score 2
#11      3 Score 2
#12      3 Score 2
#13      2 Score 2

This also has the advantage of automatically adding a column with entries denoting the source column.