I am working on an R function that generates a ranked, sorted index from two user inputs: a list of starting values and the total number of slots to fill. If the list holds fewer values than there are slots, sequential numbers are inserted into the gaps. In all cases the first index slot must be 1.1 if 1.1 is provided in the list, and 1 otherwise.
I used the dplyr::dense_rank function in the Reproducible Code for Example 1 at the bottom of this post to correctly fill in the gaps sequentially when the provided list elements are all < the total number of slots to fill.
Is there a way to use dplyr::dense_rank, or another function, to fill in the gaps when the list elements are all greater than 1 or 1.1, as illustrated in Examples 2 and 3 in the images below, or when there are other gaps between the list elements, as illustrated in Example 4 in the image below? The gaps I'm trying to fill are highlighted in yellow in the images. Note that the Reproducible Code at the bottom also provides the user inputs for Examples 2-4, commented out because I ran Example 1.
Example 1 Reproducible Code output (which is correct, given the Value and totalSlots inputs):
# A tibble: 5 x 2
   Slot Value
  <int> <dbl>
1     1   1.1
2     2   1.2
3     3   2.1
4     4   2.2
5     5   3
Reproducible Code:
library(dplyr)
library(tidyr)  # complete() comes from tidyr
# Example 1:
Value <- c(2.1, 1.2, 1.1, 2.2)
totalSlots <- 5
# Example 2:
# Value <- c(2.1, 2.2)
# totalSlots <- 3
# Example 3:
# Value <- c(4.1, 4.2, 4.3)
# totalSlots <- 6
# Example 4:
# Value <- c(1.1, 1.2, 3.1, 3.2, 3.3, 6.1, 6.2)
# totalSlots <- 10
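tibble(Value) %>%
  mutate(Slot = row_number()) %>%
  complete(Slot = seq_len(totalSlots)) %>%
  mutate(
    # sort the supplied values; empty slots fall back to their slot number
    Value = coalesce(Value[order(Value)], Slot),
    # re-rank the integer parts, then re-attach the fractional parts
    Value = dense_rank(as.integer(Value)) + Value - as.integer(Value)
  )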
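To see why the dense_rank step is not enough for Example 2, it may help to trace the intermediate values by hand. The sketch below is a base R approximation of the pipeline, not the pipeline itself: dense_rank_base is a hypothetical stand-in for dplyr::dense_rank, and padded is an illustrative name.

```r
# Hypothetical base R stand-in for dplyr::dense_rank (assumes x has no NAs)
dense_rank_base <- function(x) match(x, sort(unique(x)))

# Example 2 inputs from the question
Value <- c(2.1, 2.2)
totalSlots <- 3

# Sort the values and pad the empty slots with their slot number,
# mirroring coalesce(Value[order(Value)], Slot)
padded <- sort(Value)
length(padded) <- totalSlots                    # extends the vector with NA
padded[is.na(padded)] <- which(is.na(padded))   # padded is now 2.1, 2.2, 3

# Re-rank the integer parts, then re-attach the fractional parts
result <- dense_rank_base(as.integer(padded)) + padded - as.integer(padded)
print(result)
```

Because the integer parts 2, 2, 3 are re-ranked to 1, 1, 2, the supplied values 2.1 and 2.2 come out renumbered as 1.1 and 1.2 instead of being kept with a filler placed in front of them.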
Here is the Richard Berry solution, generating a 2-column dataframe:
indexDF <- data.frame(Slot = c(1:totalSlots), Value = sort(c(setdiff(1:totalSlots, floor(Value)), Value))[1:totalSlots])
indexDF
CodePudding user response:
You can achieve this with:
sort(c(setdiff(1:totalSlots, floor(Value)), Value))[1:totalSlots]
Breaking it down:
1:totalSlots %>%             # candidates for integers to fill gaps
  setdiff(floor(Value)) %>%  # remove fill integers already covered by Value
  c(Value) %>%               # combine with Value
  sort()                     # get in order
Then take as many elements as you need with [1:totalSlots].
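As a quick check, the one-liner can be wrapped in a function and run against all four sets of inputs from the question. This is a minimal sketch in base R; the name fill_index is hypothetical and not part of the original answer.

```r
# Hypothetical wrapper around the one-liner; fill_index is an illustrative name
fill_index <- function(Value, totalSlots) {
  # whole numbers in 1..totalSlots whose integer part is not already used by Value
  fillers <- setdiff(seq_len(totalSlots), floor(Value))
  # merge, sort, and keep only the requested number of slots
  sort(c(fillers, Value))[seq_len(totalSlots)]
}

ex1 <- fill_index(c(2.1, 1.2, 1.1, 2.2), 5)
ex2 <- fill_index(c(2.1, 2.2), 3)
ex3 <- fill_index(c(4.1, 4.2, 4.3), 6)
ex4 <- fill_index(c(1.1, 1.2, 3.1, 3.2, 3.3, 6.1, 6.2), 10)

print(ex1)  # 1.1 1.2 2.1 2.2 3.0
print(ex2)  # 1.0 2.1 2.2
print(ex3)  # 1.0 2.0 3.0 4.1 4.2 4.3
print(ex4)  # 1.1 1.2 2.0 3.1 3.2 3.3 4.0 5.0 6.1 6.2
```

Unlike the dense_rank approach, this keeps the supplied values unchanged and only inserts whole numbers whose integer part is missing from the list.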
CodePudding user response:
A potential solution: extract the first character of each Value to test whether each slot number is present, then fill in the missing Values to complete the total number of slots:
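tibble(Value) |>
  # leading digit of each value, used as its slot number
  mutate(initial_int = as.numeric(stringr::str_extract(as.character(Value), "^\\d"))) |>
  # add a row for every slot number from 1 to totalSlots
  full_join(tibble(initial_int = 1:totalSlots), by = "initial_int") |>
  mutate(Value = if_else(is.na(Value), initial_int, Value)) |>
  arrange(Value) |>
  # keep only the requested number of slots
  head(totalSlots) |>
  mutate(Slot = row_number()) |>
  select(-initial_int)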
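One caveat with the `^\\d` pattern is that it captures only a single leading digit, so it would presumably misread values of 10 or more. The sketch below compares it with floor() on hypothetical inputs that are not from the question; base R regmatches() and regexpr() are used here as a stand-in for stringr::str_extract.

```r
# Hypothetical inputs with two-digit integer parts (not from the question)
Value <- c(9.1, 12.1, 12.2)

# First character only, as the "^\\d" pattern would capture it
chr <- as.character(Value)
first_digit <- as.numeric(regmatches(chr, regexpr("^[0-9]", chr)))

# Whole integer part, regardless of how many digits it has
whole_part <- floor(Value)

print(first_digit)  # 9 1 1
print(whole_part)   # 9 12 12
```

If indexes can reach 10 or more slots, swapping the extraction for floor(Value) may be the safer choice.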