I'm trying to add columns to a data frame with factor levels based on a value in another column. Doing this through piping is easy enough, and I have a script that does largely what I want:
library(tidyverse)  # provides %>%, map_dfr(), read_csv(), mutate(), case_when()

# identifies all .csv files associated with Dai15 full water column Sv measurements and compiles them into one data frame
Dai3D_evening_allBNR <- list.files(
    path = 'Z:/fishproj/Cambodia Dai project/Analytic/Flux/River_Width/Dai3C',
    pattern = "^Dai3D_ABC_10mbin_20211209_fullwatercolumn_evening_BNR*.*csv",
    full.names = TRUE) %>%
  map_dfr(read_csv) %>%
  # label each hard-coded Region_ID; any other ID is kept as its own text
  mutate(BNR = case_when(
    Region_ID == 10 ~ "BNR1",
    Region_ID == 13 ~ "BNR2",
    Region_ID == 15 ~ "BNR3",
    TRUE ~ as.character(Region_ID)))
It produces a dataframe that looks like this:
  Region_ID   Sv_mean  BNR
1        10 -64.01115 BNR1
2        10 -64.96363 BNR1
3        10 -67.98841 BNR1
4        13 -66.88734 BNR2
5        13 -69.79789 BNR2
6        13 -69.94071 BNR2
7        15 -66.04855 BNR3
8        15 -68.31167 BNR3
9        15 -68.67383 BNR3
The 'mutate' function creates a column with those 3 levels (stored as character strings). The issue is that the numbers in the 'Region_ID' column are randomly generated for each file (in this instance they are 10, 13, and 15), so I have to edit the numbers manually with each iteration. The nice part is that there are only 3 different numbers. I want the script to automatically recognize the three different numbers and apply factor levels based on those. Some type of ordering is necessary: for example, the first number is always 'BNR1', the second is always 'BNR2', etc. I tried this using conditional variables but had no luck. Perhaps someone else knows this better than I do.
CodePudding user response:
We could do this automatically in either of two ways. One is to apply match to the unique values of 'Region_ID', which returns the index of each value, and then paste that index onto the 'BNR' substring. The other is to convert 'Region_ID' to a factor with levels specified as unique(Region_ID) and coerce it to integer with as.integer. The match version looks like this:
# identifies all .csv files associated with Dai15 full water column Sv measurements and compiles them into one data frame
list.files(
    path = 'Z:/fishproj/Cambodia Dai project/Analytic/Flux/River_Width/Dai3C',
    pattern = "^Dai3D_ABC_10mbin_20211209_fullwatercolumn_evening_BNR*.*csv",
    full.names = TRUE) %>%
  map_dfr(read_csv) %>%
  mutate(BNR = str_c("BNR", match(Region_ID, unique(Region_ID))))
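
For comparison, here is a minimal base R sketch of both options on a toy data frame built from the values printed in the question. The names df, ids, BNR_factor and BNR_sorted are made up for illustration and are not part of the original script. Note that unique() keeps values in order of first appearance, so "BNR1" goes to whichever ID shows up first; the last line is an assumption about what to do if "first" should instead mean the smallest ID.

```r
# Toy data copied from the question's printed output.
df <- data.frame(
  Region_ID = c(10, 10, 10, 13, 13, 13, 15, 15, 15),
  Sv_mean   = c(-64.01115, -64.96363, -67.98841,
                -66.88734, -69.79789, -69.94071,
                -66.04855, -68.31167, -68.67383)
)

# Unique IDs in order of first appearance.
ids <- unique(df$Region_ID)

# Option 1: match() returns each row's position among the unique IDs.
df$BNR <- paste0("BNR", match(df$Region_ID, ids))

# Option 2: factor with explicit levels, then take the integer codes.
df$BNR_factor <- paste0("BNR", as.integer(factor(df$Region_ID, levels = ids)))

# Both options produce the same labels.
stopifnot(identical(df$BNR, df$BNR_factor))

# Assumption: if "first" should mean the smallest ID rather than the first
# one encountered, sort the unique values before matching.
df$BNR_sorted <- paste0("BNR", match(df$Region_ID, sort(unique(df$Region_ID))))
```

Because the labels are derived from whatever IDs are present, nothing needs editing when a new file comes with different numbers. If several files with different IDs are combined before the mutate step, the index is computed over the combined column, so the labelling may need to be done per file instead.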