I have the following dataframe :
Place Age janv17 fev17 mars17 avril17 mai17 juin17
France 69 0 0 1 1 1 1
Germany 69 0 0 1 1 1 1
Germany 45 0 0 0 0 0 0
National 35 0 0 0 0 0 0
France 43 0 0 0 0 0 0
Germany 69 0 0 0 0 0 0
France 39 0 0 0 0 0 0
France 28 0 0 0 0 0 0
I need to turn it into something like this :
France Germany
janv17 0 0
fev17 0 0
mars17 0 0
avril17 1 1
mai17 1 1
juin17 1 1
And I need this table for each age. Do you have any idea how to do so ? I am new to R.
Thank you!
CodePudding user response:
Here's an option using dplyr
and janitor
:
library(dplyr)
library(janitor)
test <- data.frame(
Place = c("France", "France", "Germany", "Germany"),
Age = c(69, 45, 69, 45),
janv17 = c(1, 0, 0, 1),
fev17 = c(1, 1, 1, 0)
)
test %>%
split(.$Age) %>%
map(function(x) {
x %>%
select(-Age) %>%
t() %>%
as.data.frame() %>%
janitor::row_to_names(1)
})
CodePudding user response:
Maybe something like this:
library(dplyr)
df %>%
group_by(Place, Age) %>%
summarise(across(janv17:juin17, ~sum(., na.rm = TRUE)), .groups="drop") %>%
t() %>%
as.data.frame() %>%
`colnames<-`(.[1, ]) %>%
.[-1, ]
France France France France Germany Germany National
Age 28 39 43 69 45 69 35
janv17 0 0 0 0 0 0 0
fev17 0 0 0 0 0 0 0
mars17 0 0 0 1 0 1 0
avril17 0 0 0 1 0 1 0
mai17 0 0 0 1 0 1 0
juin17 0 0 0 1 0 1 0
CodePudding user response:
Since you want a table like this for every age, I use split()
to create a list where each element is a subset of your dataframe for each age. Then I use lapply()
to apply the same function to all of these subsets.
In lapply()
, first I remove "Age", which is no longer useful since the dataframe is already split, then I rotate the dataframe so that each period is a row and each country is a column.
library(datawizard)
test <- data.frame(
Place = c("France", "France", "Germany", "Germany"),
Age = c(69, 45, 69, 45),
janv17 = c(1, 0, 0, 1),
fev17 = c(1, 1, 1, 0)
)
test
#> Place Age janv17 fev17
#> 1 France 69 1 1
#> 2 France 45 0 1
#> 3 Germany 69 0 1
#> 4 Germany 45 1 0
lapply(split(test, test$Age), function(x) {
x[["Age"]] <- NULL
x <- column_as_rownames(x, "Place")
data_rotate(x)
})
#> $`45`
#> France Germany
#> janv17 0 1
#> fev17 1 0
#>
#> $`69`
#> France Germany
#> janv17 1 0
#> fev17 1 1
Created on 2022-07-08 by the reprex package (v2.0.1)
CodePudding user response:
How about pivoting and then splitting?
library(tidyverse)
test |> # Thanks to Matt for the data
pivot_longer(-c(Place, Age)) |>
pivot_wider(names_from = Place) |>
split(~ Age) |>
map(~ select(., -Age)) # Remove the age column
Here, if you'd prefer, you could also use the experimental group_split(Age, .keep = FALSE)
instead of the last two lines, but it won't name the list for you.
Or the other way around:
test |>
split(~ Age) |>
map(~ . |>
pivot_longer(-c(Place, Age)) |>
pivot_wider(names_from = Place) |>
select(-Age)
)
Output:
$`45`
# A tibble: 2 × 3
name France Germany
<chr> <dbl> <dbl>
1 janv17 0 1
2 fev17 1 0
$`69`
# A tibble: 2 × 3
name France Germany
<chr> <dbl> <dbl>
1 janv17 1 0
2 fev17 1 1