Home > Enterprise >  How to bring column name from wide dataset as row in long dataset if specified value corresponded wi
How to bring column name from wide dataset as row in long dataset if specified value corresponded wi

Time:08-27

the input dataset shows a "wide" dataset that includes unique actors and next to their name are corresponding movies as column name with a 1 assigned if movie corresponds to actors portfolio.

structure(list(Actor = c("Brad Pitt", "Matt Damon", "Leonardo Dicaprio", 
"Kate Winslet", "Jennifer Connoley", "Jude Law", "Gwenyth Paltrow"
), `Once upon a time in america` = c(NA, NA, NA, NA, 1, NA, NA
), `The Departed` = c(NA, 1, 1, NA, NA, NA, NA), `Once Upon a time in Hollywood` = c(1, 
NA, 1, NA, NA, NA, NA), `the holiday` = c(NA, NA, NA, 1, NA, 
1, NA), titanic = c(NA, NA, 1, 1, NA, NA, NA), contagion = c(NA, 
1, NA, 1, NA, 1, 1), `the talented mr ripley` = c(NA, 1, NA, 
NA, NA, 1, 1), `Oceans Eleven` = c(1, 1, NA, NA, NA, NA, NA), 
    `Blood Diamond` = c(NA, NA, 1, NA, 1, NA, NA)), class = c("tbl_df", 
"tbl", "data.frame"), row.names = c(NA, -7L))

What I would like to do is to create a "long" dataset that shows actor and their corresponding movie by title in the following row if there was a 1 previously assigned under the movie title column. Below is how i'd like to see the output.

structure(list(Actor = c("Brad Pitt", "Brad Pitt", "Matt Damon", 
"Matt Damon", "Matt Damon", "Leonardo Dicaprio", "Leonardo Dicaprio", 
"Leonardo Dicaprio", "Leonardo Dicaprio", "Kate Winslet", "Kate Winslet", 
"Kate Winslet", "Jennifer Connoley", "Jennifer Connoley", "Jude Law", 
"Jude Law", "Jude Law", "Gwenyth Paltrow", "Gwenyth Paltrow"), 
    Movie = c("Once Upon a time in Hollywood", "Oceans Eleven", 
    "The Departed", "Contagion", "The Talented MR Ripley", "The Departed", 
    "Once Upon a time in Hollywood", "Titanic", "Blood Diamond", 
    "The Holiday", "Titanic", "Contagion", "Once Upon a time in America", 
    "Blood Diamond", "The Holiday", "Contagion", "The Talented MR Ripley", 
    "Contagion", "The Talented MR Ripley")), class = c("tbl_df", 
"tbl", "data.frame"), row.names = c(NA, -19L))

CodePudding user response:

just use pivot_longer() and filter() from tidyverse

library(tidyverse)

data %>% pivot_longer(!Actor,names_to="Movie",values_to="value") %>% dplyr::filter(!is.na(value))

  • Related