How to mutate columns and transform an 8-column data frame in R to 3 columns

Time:02-02

I was working on a problem and got stuck at this point. The question is: First, select the business_id and the seven columns that contain the duration information. Then, transform it from a wide data frame (with 8 columns) to a long data frame (with 3 columns). The 3 columns should be (1) business_id, (2) wday, specifying the day of the week, and (3) duration, specifying the duration of open time for a business on a specific weekday.

This is what I have so far. I want to reshape the columns so that I have one column showing the weekday and another column showing the opening duration for that day.

CodePudding user response:

Welcome to StackOverflow!

I'll try to answer your questions while also showing you how to write more effective questions that will encourage people to answer. The key is reproducibility, as @Sotos commented. Also, it helps to create a minimal working example that removes a lot of unnecessary data in order to clarify the process.

This code creates a small dataframe similar to what you're starting with.

library(tidyr)

df_wide <- tibble(id = 1:3,
                  mon = rnorm(3),
                  tue = rnorm(3),
                  wed = rnorm(3),
                  thr = rnorm(3))
df_wide
#> # A tibble: 3 × 5
#>      id    mon    tue    wed    thr
#>   <int>  <dbl>  <dbl>  <dbl>  <dbl>
#> 1     1 -0.779 -0.691  1.22  -0.354
#> 2     2  1.16   0.650  0.824 -0.569
#> 3     3  0.128 -1.79  -1.31  -1.33

Created on 2023-02-01 by the reprex package (v2.0.1)

In order to convert your dataframe to long form, you can use the function pivot_longer from the package tidyr. You can find more detailed help on using that function by entering ?pivot_longer. Here it is in action.

library(tidyr)

df_wide <- tibble(id = 1:3,
                  mon = rnorm(3),
                  tue = rnorm(3),
                  wed = rnorm(3),
                  thr = rnorm(3))
df_wide
#> # A tibble: 3 × 5
#>      id    mon    tue    wed    thr
#>   <int>  <dbl>  <dbl>  <dbl>  <dbl>
#> 1     1 -0.864  1.18   0.605 -0.337
#> 2     2 -0.261  0.224 -1.16  -0.359
#> 3     3 -1.28  -0.849  0.333  0.896
df_long <- pivot_longer(data = df_wide,
                        cols = 2:5,
                        names_to = "wday",
                        values_to = "duration")
df_long
#> # A tibble: 12 × 3
#>       id wday  duration
#>    <int> <chr>    <dbl>
#>  1     1 mon     -0.864
#>  2     1 tue      1.18 
#>  3     1 wed      0.605
#>  4     1 thr     -0.337
#>  5     2 mon     -0.261
#>  6     2 tue      0.224
#>  7     2 wed     -1.16 
#>  8     2 thr     -0.359
#>  9     3 mon     -1.28 
#> 10     3 tue     -0.849
#> 11     3 wed      0.333
#> 12     3 thr      0.896

Created on 2023-02-01 by the reprex package (v2.0.1)
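
To tie this back to your actual task (select business_id plus the seven duration columns, then go from 8 columns to 3), here is a rough sketch of how the two steps could be chained. I can't see your data, so the data frame name business, the column names duration_mon through duration_sun, the extra city column, and all the values are made-up placeholders; swap in your real names. select() and starts_with() come from dplyr, and names_prefix is a pivot_longer argument that strips a leading string from the old column names.

```r
library(dplyr)
library(tidyr)

# Hypothetical stand-in for the real data. The question does not show the
# actual column names, so everything except business_id is an assumption.
business <- tibble(
  business_id  = c("a1", "b2"),
  city         = c("Reno", "Tampa"),   # extra column that should be dropped
  duration_mon = c(8, 0),
  duration_tue = c(8, 10),
  duration_wed = c(8, 10),
  duration_thu = c(8, 10),
  duration_fri = c(9, 12),
  duration_sat = c(4, 12),
  duration_sun = c(0, 6)
)

business_long <- business %>%
  # Step 1: keep the id and the seven duration columns (8 columns in total).
  select(business_id, starts_with("duration_")) %>%
  # Step 2: stack the seven duration columns into a wday/duration pair.
  pivot_longer(
    cols         = starts_with("duration_"),
    names_to     = "wday",
    names_prefix = "duration_",   # turns "duration_mon" into "mon"
    values_to    = "duration"
  )

business_long
```

Selecting the columns by name rather than by position (as in cols = 2:5 above) means the code keeps working if the column order in your data changes.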
