Home > Software design >  Transform multiple columns into rows in R
Transform multiple columns into rows in R

Time:07-09

I have the following dataframe :

Place     Age   janv17 fev17 mars17 avril17 mai17 juin17 
France    69      0     0      1       1     1      1         
Germany   69      0     0      1       1     1      1         
Germany   45      0     0      0       0     0      0         
National  35      0     0      0       0     0      0         
France    43      0     0      0       0     0      0         
Germany   69      0     0      0       0     0      0         
France    39      0     0      0       0     0      0         
France    28      0     0      0       0     0      0         

I need to turn it into something like this :

            France     Germany     
janv17        0           0
fev17         0           0
mars17        0           0
avril17       1           1
mai17         1           1
juin17        1           1

And I need this table for each age. Do you have any idea how to do so ? I am new to R.

Thank you!

CodePudding user response:

Here's an option using dplyr and janitor:

library(dplyr)
library(janitor)

test <- data.frame(
  Place = c("France", "France", "Germany", "Germany"),
  Age = c(69, 45, 69, 45),
  janv17 = c(1, 0, 0, 1),
  fev17 = c(1, 1, 1, 0)
)

test %>% 
  split(.$Age) %>% 
  map(function(x) {
    x %>% 
      select(-Age) %>% 
      t() %>% 
      as.data.frame() %>% 
      janitor::row_to_names(1)
  })

CodePudding user response:

Maybe something like this:

library(dplyr)

df %>% 
  group_by(Place, Age) %>% 
  summarise(across(janv17:juin17, ~sum(., na.rm = TRUE)), .groups="drop") %>% 
  t() %>% 
  as.data.frame() %>% 
  `colnames<-`(.[1, ]) %>%
  .[-1, ]
        France France France France Germany Germany National
Age         28     39     43     69      45      69       35
janv17       0      0      0      0       0       0        0
fev17        0      0      0      0       0       0        0
mars17       0      0      0      1       0       1        0
avril17      0      0      0      1       0       1        0
mai17        0      0      0      1       0       1        0
juin17       0      0      0      1       0       1        0

CodePudding user response:

Since you want a table like this for every age, I use split() to create a list where each element is a subset of your dataframe for each age. Then I use lapply() to apply the same function to all of these subsets.

In lapply(), first I remove "Age", which is no longer useful since the dataframe is already split, then I rotate the dataframe so that each period is a row and each country is a column.

library(datawizard)

test <- data.frame(
  Place = c("France", "France", "Germany", "Germany"),
  Age = c(69, 45, 69, 45),
  janv17 = c(1, 0, 0, 1),
  fev17 = c(1, 1, 1, 0)
)

test
#>     Place Age janv17 fev17
#> 1  France  69      1     1
#> 2  France  45      0     1
#> 3 Germany  69      0     1
#> 4 Germany  45      1     0

lapply(split(test, test$Age), function(x) {
  x[["Age"]] <- NULL
  x <- column_as_rownames(x, "Place")
  data_rotate(x)
})
#> $`45`
#>        France Germany
#> janv17      0       1
#> fev17       1       0
#> 
#> $`69`
#>        France Germany
#> janv17      1       0
#> fev17       1       1

Created on 2022-07-08 by the reprex package (v2.0.1)

CodePudding user response:

How about pivoting and then splitting?

library(tidyverse)

test |> # Thanks to Matt for the data
  pivot_longer(-c(Place, Age)) |>
  pivot_wider(names_from = Place) |>
  split(~ Age) |>
  map(~ select(., -Age)) # Remove the age column

Here, if you'd prefer, you could also use the experimental group_split(Age, .keep = FALSE) instead of the last two lines, but it won't name the list for you.

Or the other way around:

test |>
  split(~ Age) |>
  map(~ . |> 
        pivot_longer(-c(Place, Age)) |>
        pivot_wider(names_from = Place) |>
        select(-Age)
      )

Output:

$`45`
# A tibble: 2 × 3
  name   France Germany
  <chr>   <dbl>   <dbl>
1 janv17      0       1
2 fev17       1       0

$`69`
# A tibble: 2 × 3
  name   France Germany
  <chr>   <dbl>   <dbl>
1 janv17      1       0
2 fev17       1       1
  • Related