Home > Software engineering >  How to move values into correct column in R?
How to move values into correct column in R?

Time:01-03

Beginner here: I have a dataframe that looks like this:

Name   Col_1     Col_2          Col_3      Col_4
abc    Website   https://LINK1  Twitter    https://LINK4
def    Email     https://LINK2  Facebook   https://LINK5
ghj    Document  https://LINK3  Website    https://LINK6

I want to create columns for the values in Col_1 and Col_3 (aka Website, Email, Document, Twitter and Facebook) move the LINKS in Col_2 and Col_4 into the respective columns. The output should look like this:

Names  Website        Email           Document        Twitter        Facebook
abc    https://LINK1  NA              NA              https://LINK4  NA
def    NA             https://LINK2   NA              NA             https://LINK5
ghj    https://LINK6  NA              https://LINK3   NA             NA

How can I achieve this? Thank you!

CodePudding user response:

this output format is sooo way against all tidy data-conventions, that it took me a lot of wrangling to get it right..

Not the most pretty, but it gets the job done

library(data.table)
mydata <- read.table(text = "Col_1     Col_2          Col_3      Col_4
Website   https://LINK1  Twitter    https://LINK4
Email     https://LINK2  Facebook   https://LINK5
Document  https://LINK3  Website    https://LINK6", header = TRUE)

#split in chunks of 2 columns
L <- split.default(mydata, f = 2:(ncol(mydata)   1) %/% 2)
# bind rows together
DT <- data.table::rbindlist(L, use.names = FALSE)
# split to list by Col_1
L2 <- split(DT, by = "Col_1", keep.by = FALSE)
# make vectors
L3 <- unlist(lapply(L2, c), recursive = FALSE)
# make lengts equal
L4 <- lapply(L3, `length<-`, max(lengths(L3)))
# colbind
ans <- Reduce(cbind, L4)
# set colnames
colnames(ans) <- names(L2)
# make data.frame
as.data.frame(ans)
#         Website         Email      Document       Twitter      Facebook
# 1 https://LINK1 https://LINK2 https://LINK3 https://LINK4 https://LINK5
# 2 https://LINK6          <NA>          <NA>          <NA>          <NA>

CodePudding user response:

Here is an option using two times pivot_wider and coalesce once the duplicated website columns like this:

library(dplyr)
library(tidyr)
df %>%
  pivot_wider(names_from = Col_1, values_from = Col_2) %>%
  pivot_wider(names_from = Col_3, values_from = Col_4, names_repair = "unique") %>%
  mutate(Website = coalesce(`Website...2`, `Website...7`)) %>%
  select(-c(`Website...2`, `Website...7`)) 
#> New names:
#> • `Website` -> `Website...2`
#> • `Website` -> `Website...7`
#> # A tibble: 3 × 6
#>   Name  Email         Document      Twitter       Facebook      Website      
#>   <chr> <chr>         <chr>         <chr>         <chr>         <chr>        
#> 1 abc   <NA>          <NA>          https://LINK4 <NA>          https://LINK1
#> 2 def   https://LINK2 <NA>          <NA>          https://LINK5 <NA>         
#> 3 ghj   <NA>          https://LINK3 <NA>          <NA>          https://LINK6

Created on 2023-01-02 with reprex v2.0.2

CodePudding user response:

Firstly, you need to bind the columns on top each other to make it look like a long format data;

df_new <- data.frame(Names=df[,1],data.table::rbindlist(list(df[,2:3],df[,4:5])))

#  Names    Col_1         Col_2
# 1   abc  Website https://LINK1
# 2   def    Email https://LINK2
# 3   fgh Document https://LINK3
# 4   abc  Twitter https://LINK4
# 5   def Facebook https://LINK5
# 6   fgh  Website https://LINK6

After that, you can use any long to wide format option to end up with your output. In here, I chose dcast from reshape2 library.

reshape2::dcast(df_new, Names ~ Col_1)

gives,

#  Names      Document         Email       Website      Facebook       Twitter
#1   abc          <NA>          <NA> https://LINK1          <NA> https://LINK4
#2   def          <NA> https://LINK2          <NA> https://LINK5          <NA>
#3   fgh https://LINK3          <NA> https://LINK6          <NA>          <NA>

CodePudding user response:

Here is another way with group_split and map_dfr in tidyverse. First, split on Name column, and then create tibbles for each, specifying the sources and links. Then, using pivot_wider put into final format.

library(tidyverse)

df |>
  group_split(Name) |>
  map_dfr(
    ~tibble(
      Name = .x$Name,
      Source = c(.x$Col_1, .x$Col_3),
      Link = c(.x$Col_2, .x$Col_4)
    )
  ) |>
  pivot_wider(id_cols = Name, names_from = Source, values_from = Link)

Output

  Name  Website       Twitter       Email         Facebook      Document     
  <chr> <chr>         <chr>         <chr>         <chr>         <chr>        
1 abc   https://LINK1 https://LINK4 NA            NA            NA           
2 def   NA            NA            https://LINK2 https://LINK5 NA           
3 ghj   https://LINK6 NA            NA            NA            https://LINK3
  • Related