The code in this question and the datasets used in the code can be found in my
As you can see, they are out of order. It is ordering them based on the individual digit in the last of the 4 descriptors for each dataset.
How can I reorder their names to be arranged correctly as in the attached photo?
It was also suggested to me here before that I use this:
# reformat the names of each of the csv file formatted dataset
DS_names_list <- basename(filepaths_list)
DS_names_list <- tools::file_path_sans_ext(DS_names_list)
> DS_names_list
[1] "0-3-1-1" "0-3-1-10" "0-3-1-11" "0-3-1-12" "0-3-1-13" "0-3-1-14" "0-3-1-15" "0-3-1-16"
[9] "0-3-1-17" "0-3-1-18" "0-3-1-19" "0-3-1-2" "0-3-1-20" "0-3-1-3" "0-3-1-4" "0-3-1-5"
[17] "0-3-1-6" "0-3-1-7" "0-3-1-8" "0-3-1-9"
But reordering or sorting this vector of names does not reorder the actual list of file paths itself.
CodePudding user response:
Okay, I'm going to try to simplify this down so it's clear and concise. Minimal reproducible examples are much quicker and easier to answer than lengthy questions with GitHub links and screenshots.
As far as I can tell, your problem is this: You have data like this:
## nicely copy/pasteable sample data
## demonstrates the problem
## omits unneeded details
sample_data = c(
"C:/path/0-3-1-1.csv",
"C:/path/0-3-1-10.csv",
"C:/path/0-3-1-2.csv"
)
And you want to be able to sort it by the numeric components separated by dashes, treated numerically not alphabetically, so the desired result is
desired_result = c(
"C:/path/0-3-1-1.csv",
"C:/path/0-3-1-2.csv",
"C:/path/0-3-1-10.csv"
)
Here's an approach:
# extract the file names (as you have already done)
filenames = sample_data |> basename() |> tools::file_path_sans_ext()
my_order = filenames |>
# split apart the numbers
strsplit(split = "-", fixed = TRUE) |>
unlist() |>
# convert them to numeric and get them in a data frame
as.numeric() |>
matrix(nrow = length(filenames), byrow = TRUE) |>
as.data.frame() |>
# get the appropriate ordering to sort the data frame
do.call(order, args = _)
my_order
# [1] 1 3 2
sample_data[my_order]
# [1] "C:/path/0-3-1-1.csv" "C:/path/0-3-1-2.csv" "C:/path/0-3-1-10.csv"
The my_order result gives the indices to rearrange the original data to the desired result. You can use it on the sample_data or on just the extracted file names.
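To make the point about reusing the indices concrete, here is a minimal sketch that wraps the same idea in a function and applies one index vector to both the full paths and the bare names. The helper name numeric_name_order is hypothetical, and the sketch assumes every file name has the same number of dash-separated parts and that all parts are numeric.

```r
# Hypothetical helper (not from the original code): returns the ordering
# implied by the dash-separated numeric parts of each file name.
# Assumes all names have the same number of parts and every part is numeric.
numeric_name_order <- function(paths) {
  names_only <- tools::file_path_sans_ext(basename(paths))
  parts <- strsplit(names_only, split = "-", fixed = TRUE)
  # one row per file, one column per numeric component
  parts_matrix <- do.call(rbind, lapply(parts, as.numeric))
  # sort by the first column, breaking ties with the later columns
  do.call(order, as.data.frame(parts_matrix))
}

sample_data <- c(
  "C:/path/0-3-1-1.csv",
  "C:/path/0-3-1-10.csv",
  "C:/path/0-3-1-2.csv"
)

ord <- numeric_name_order(sample_data)

# The same index vector reorders the paths and the names consistently
sorted_paths <- sample_data[ord]
sorted_names <- tools::file_path_sans_ext(basename(sample_data))[ord]
```

Because the ordering is computed once and then used as an index, the path vector and the name vector cannot drift out of sync, which is the issue raised in the question.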
Another solution is to use the gtools::mixedorder() or gtools::mixedsort() functions. Confusingly, when I tried them out on the sample data they gave the reverse order. Then I realized that the gtools functions interpret your - separators as negative signs. So to use that tool, we would need to replace - with a different character:
sample_data |>
gsub(pattern = "-", replacement = "|", fixed = TRUE) |>
gtools::mixedorder()
# [1] 1 3 2
## same ordering result as above
CodePudding user response:
Another approach, essentially the same as @Gregor's logic: split the components out, and then pass all of them as a list of inputs to the order function.
ord <- do.call(order,
strcapture("(\\d+)-(\\d+)-(\\d+)-(\\d+)",
basename(sample_data), proto=list(1L,1L,1L,1L)))
sample_data[ord]
#[1] "C:/path/0-3-1-1.csv" "C:/path/0-3-1-2.csv" "C:/path/0-3-1-10.csv"
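To show why passing all the components to order matters, here is a small sketch in which the third component also varies. The file names and the column names p1 to p4 are made up for illustration. It assumes every name matches the four-number pattern.

```r
# Hypothetical file names in which the third component also varies
file_names <- c("0-3-2-1", "0-3-1-10", "0-3-1-2")

# Capture the four numbers into integer columns; p1..p4 are illustrative names
parts <- strcapture(
  "(\\d+)-(\\d+)-(\\d+)-(\\d+)",
  file_names,
  proto = data.frame(p1 = integer(), p2 = integer(),
                     p3 = integer(), p4 = integer())
)

# Equivalent to order(parts$p1, parts$p2, parts$p3, parts$p4):
# later columns only break ties left by the earlier ones
ord <- do.call(order, parts)
sorted_names <- file_names[ord]
sorted_names
# [1] "0-3-1-2"  "0-3-1-10" "0-3-2-1"
```

The two names with a third component of 1 come first and are ordered among themselves by the fourth component, so 2 sorts before 10.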