I am using the separate function from tidyverse to split the first column of this tibble:
# A tibble: 6,951 x 9
Row.names Number_of_analysis~ DL_Minimum DL_Mean DL_Maximum Number_of_measur~ Measure_Minimum Measure_Mean Measure_Maximum
<I<chr>> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 2011.FACILITY.PONT-À-CELLES 52 0.6 1.81 16 0 0 0 0
2 2011.FACILITY.PONT-À-CELLES 52 0.07 0.177 1.3 0 0 0 0
3 2011.FACILITY.CHARLEROI 52 0.07 0.212 1.9 0 0 0 0
4 2011.FACILITY.CHARLEROI 52 0.08 0.209 2 0 0 0 0
Merge_splitnames <- Merge %>%
separate(Row.names,sep = "\\.",into = c("Year", "Catchment", "Locality"), extra = "drop")
While the call looks correct, the output is a tibble without the first 2 rows (the ones whose name contains an accented French character):
# A tibble: 6,951 x 9
Year Catchment Locality Number_of_analysis~ DL_Minimum DL_Mean DL_Maximum Number_of_measur~ Measure_Minimum Measure_Mean Measure_Maximum
<I<chr>> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
3 2011 FACILITY CHARLEROI 52 0.07 0.212 1.9 0 0 0 0
4 2011 FACILITY CHARLEROI 52 0.08 0.209 2 0 0 0 0
Any idea how to deal with this issue? I wish to keep the real French name (with the accent). This is quite surprising to me; I have never had any issue with the other tidyverse functions.
NB: this is a simple, reproducible example; my real tibble is about 100 times bigger.
CodePudding user response:
separate is retaining the accent for me:
library(tidyverse)
tribble(
~names,
"2011.FACILITY.PONT-À-CELLES",
"2011.FACILITY.PONT-À-CELLES",
"2011.FACILITY.CHARLEROI",
"2011.FACILITY.CHARLEROI"
) %>%
separate(names, sep = "\\.", into = c("Year", "Catchment", "Locality"))
#> # A tibble: 4 × 3
#> Year Catchment Locality
#> <chr> <chr> <chr>
#> 1 2011 FACILITY PONT-À-CELLES
#> 2 2011 FACILITY PONT-À-CELLES
#> 3 2011 FACILITY CHARLEROI
#> 4 2011 FACILITY CHARLEROI
Created on 2022-05-06 by the reprex package (v2.0.1)
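Since the example above works in a clean session, the difference may lie in how the real column is stored rather than in separate itself. The following is only a diagnostic sketch under stated assumptions: x is a hypothetical stand-in for the question's Row.names column, wrapped in I() because the printed tibble shows the column type <I<chr>>. Whether the AsIs class or the encoding is actually the cause has not been confirmed here.

```r
# Hypothetical stand-in for the Row.names column (AsIs-wrapped, as in <I<chr>>).
# "\u00c0" is the accented capital A, escaped so the script is encoding-safe.
x <- I(c("2011.FACILITY.PONT-\u00c0-CELLES", "2011.FACILITY.CHARLEROI"))

# Locale used for character handling in this session
Sys.getlocale("LC_CTYPE")

# Inspect the class; an AsIs column can be turned into a plain character vector
class(x)
x_chr <- as.character(x)
class(x_chr)

# Declared encoding of each string, and whether each one is valid UTF-8
Encoding(x_chr)
validUTF8(x_chr)

# Assumption: if some strings are not valid UTF-8, the source encoding may be
# latin1; in that case a conversion such as the one below could be tried.
# x_chr <- iconv(x_chr, from = "latin1", to = "UTF-8")
```

If validUTF8() reports FALSE for the accented rows in the real data, converting the column before calling separate would be a reasonable next step to try.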
CodePudding user response:
Assuming DF shown reproducibly in the Note at the end, use extra = "merge" in separate. (It is possible that you may need to change your locale, but I did not need to do that; see Sys.getlocale().)
library(tidyr)
DF %>%
separate(Row.names, c("Year", "Catchment", "Locality"), extra = "merge")
giving:
Year Catchment Locality Number_of_analysis~ DL_Minimum DL_Mean
1 2011 FACILITY PONT-À-CELLES 52 0.60 1.810
2 2011 FACILITY PONT-À-CELLES 52 0.07 0.177
3 2011 FACILITY CHARLEROI 52 0.07 0.212
4 2011 FACILITY CHARLEROI 52 0.08 0.209
DL_Maximum Number_of_measur~ Measure_Minimum Measure_Mean Measure_Maximum
1 16.0 0 0 0 0
2 1.3 0 0 0 0
3 1.9 0 0 0 0
4 2.0 0 0 0 0
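Why extra = "merge" matters here: this call does not pass sep, and the default separator in separate matches any run of non-alphanumeric characters, so the hyphens in the locality name are also split points. A minimal sketch of the difference, using a small hypothetical data frame (small_df) rather than the full DF:

```r
library(tidyr)

# Hypothetical two-row stand-in for DF; "\u00c0" is the accented capital A
small_df <- data.frame(
  Row.names = c("2011.FACILITY.PONT-\u00c0-CELLES", "2011.FACILITY.CHARLEROI"),
  stringsAsFactors = FALSE
)

cols <- c("Year", "Catchment", "Locality")

# extra = "merge": split at most length(cols) times, keep the remainder intact
merged <- separate(small_df, Row.names, cols, extra = "merge")

# extra = "drop": split at every separator, then discard the extra pieces,
# so the hyphenated locality is truncated at the first hyphen
dropped <- separate(small_df, Row.names, cols, extra = "drop")

merged$Locality
dropped$Locality
```

With an explicit sep = "\\." as in the question, the hyphens are not split points, so extra only matters if a name itself contains a dot.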
Note
DF <-
structure(list(Row.names = c("2011.FACILITY.PONT-À-CELLES", "2011.FACILITY.PONT-À-CELLES",
"2011.FACILITY.CHARLEROI", "2011.FACILITY.CHARLEROI"), `Number_of_analysis~` = c(52L,
52L, 52L, 52L), DL_Minimum = c(0.6, 0.07, 0.07, 0.08), DL_Mean = c(1.81,
0.177, 0.212, 0.209), DL_Maximum = c(16, 1.3, 1.9, 2), `Number_of_measur~` = c(0L,
0L, 0L, 0L), Measure_Minimum = c(0L, 0L, 0L, 0L), Measure_Mean = c(0L,
0L, 0L, 0L), Measure_Maximum = c(0L, 0L, 0L, 0L)), class = "data.frame", row.names = c("1",
"2", "3", "4"))