How can I create a new filtered dataframe from an old one adding columns?

Time:04-01

I have created a long table with 3 columns: "valeur" (a value from 0 to 5), "n" (the number of observations for that valeur), and "item":

valeur     n item          
    <dbl> <int> <chr>         
 1      0     6 DistrSEm_ACC  
 2      1     4 DistrSEm_ACC  
 3      2    11 DistrSEm_ACC  
 4      3    21 DistrSEm_ACC  
 5      4    20 DistrSEm_ACC  
 6      5     8 DistrSEm_ACC  
 7      0     8 DistrSEm_CLEAR
 8      1     8 DistrSEm_CLEAR
 9      2    11 DistrSEm_CLEAR
10      3    20 DistrSEm_CLEAR
11      4    13 DistrSEm_CLEAR
12      5    10 DistrSEm_CLEAR
13      1    14 DistrSEm_PROF 
14      2    12 DistrSEm_PROF
15      3    17 DistrSEm_PROF 
16      4    16 DistrSEm_PROF 
17      5     6 DistrSEm_PROF
18      0     9 lowALGf_ACC 
19      1    11 lowALGf_ACC   
20      3    10 lowALGf_ACC 

I need to reshape it into another data frame that looks like this:

valeur | nb for DistrSEm_ACC | nb for DistrSEm_CLEAR | nb for DistrSEm_PROF 
0      |                     |                       |
1      |                     |                       |
2      |                     |                       |
3      |                     |                       |
4      |                     |                       |
5      |                     |                       |

Another problem is that some "valeur" values are absent for certain items; for example, there is no "0" for "DistrSEm_PROF".

CodePudding user response:

You can use pivot_wider from tidyr. It creates one column per item and supplies NA wherever an item has no row for a given valeur:

tidyr::pivot_wider(df, names_from = item, values_from = n, names_prefix = "nb_")
#> # A tibble: 6 x 5
#>   valeur nb_DistrSEm_ACC nb_DistrSEm_CLEAR nb_DistrSEm_PROF nb_lowALGf_ACC
#>    <int>           <int>             <int>            <int>          <int>
#> 1      0               6                 8               NA              9
#> 2      1               4                 8               14             11
#> 3      2              11                11               12             NA
#> 4      3              21                20               17             10
#> 5      4              20                13               16             NA
#> 6      5               8                10                6             NA
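
If an absent valeur/item combination should be read as zero observations (an assumption, since the question does not say so), the values_fill argument of pivot_wider can fill the gaps instead of leaving NA. A minimal sketch, with the question's data rebuilt compactly so the snippet runs on its own:

```r
library(tidyr)

# Same 20 rows as in the question, rebuilt compactly
df <- data.frame(
  valeur = c(0:5, 0:5, 1:5, 0L, 1L, 3L),
  n = c(6L, 4L, 11L, 21L, 20L, 8L,
        8L, 8L, 11L, 20L, 13L, 10L,
        14L, 12L, 17L, 16L, 6L,
        9L, 11L, 10L),
  item = rep(c("DistrSEm_ACC", "DistrSEm_CLEAR", "DistrSEm_PROF", "lowALGf_ACC"),
             times = c(6, 6, 5, 3))
)

# Assumption: a missing (valeur, item) pair means zero observations
wide <- pivot_wider(df, names_from = item, values_from = n,
                    names_prefix = "nb_", values_fill = 0L)
```

Whether 0 or NA is the right choice depends on whether "no row" means "nothing was observed" or "not measured".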

Data taken from the question, in reproducible format:

df <- structure(list(valeur = c(0L, 1L, 2L, 3L, 4L, 5L, 0L, 1L, 2L, 
3L, 4L, 5L, 1L, 2L, 3L, 4L, 5L, 0L, 1L, 3L), n = c(6L, 4L, 11L, 
21L, 20L, 8L, 8L, 8L, 11L, 20L, 13L, 10L, 14L, 12L, 17L, 16L, 
6L, 9L, 11L, 10L), item = c("DistrSEm_ACC", "DistrSEm_ACC", "DistrSEm_ACC", 
"DistrSEm_ACC", "DistrSEm_ACC", "DistrSEm_ACC", "DistrSEm_CLEAR", 
"DistrSEm_CLEAR", "DistrSEm_CLEAR", "DistrSEm_CLEAR", "DistrSEm_CLEAR", 
"DistrSEm_CLEAR", "DistrSEm_PROF", "DistrSEm_PROF", "DistrSEm_PROF", 
"DistrSEm_PROF", "DistrSEm_PROF", "lowALGf_ACC", "lowALGf_ACC", 
"lowALGf_ACC")), class = "data.frame", row.names = c("1", "2", 
"3", "4", "5", "6", "7", "8", "9", "10", "11", "12", "13", "14", 
"15", "16", "17", "18", "19", "20"))
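
The desired table in the question lists only the three DistrSEm items. As a possible base R alternative (not part of the original answer), the rows can be filtered first and then cross-tabulated with xtabs, which puts 0 in empty cells. The "DistrSEm_" prefix filter is an assumption about which items are wanted, and the data is rebuilt compactly here so the snippet runs on its own:

```r
# Same 20 rows as in the question, rebuilt compactly
df <- data.frame(
  valeur = c(0:5, 0:5, 1:5, 0L, 1L, 3L),
  n = c(6L, 4L, 11L, 21L, 20L, 8L,
        8L, 8L, 11L, 20L, 13L, 10L,
        14L, 12L, 17L, 16L, 6L,
        9L, 11L, 10L),
  item = rep(c("DistrSEm_ACC", "DistrSEm_CLEAR", "DistrSEm_PROF", "lowALGf_ACC"),
             times = c(6, 6, 5, 3))
)

# Assumed filter rule: keep only items whose name starts with "DistrSEm_"
sub <- df[startsWith(df$item, "DistrSEm_"), ]

# xtabs sums n for each (valeur, item) cell; cells with no rows become 0
tab <- xtabs(n ~ valeur + item, data = sub)

# Turn the contingency table into a plain data frame with valeur as a column
wide_base <- as.data.frame.matrix(tab)
wide_base <- cbind(valeur = as.integer(rownames(wide_base)), wide_base)
rownames(wide_base) <- NULL
```

Unlike pivot_wider, this route cannot distinguish a missing combination from a true count of 0.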