How to sort files according to my pattern

Time:07-04

I'm using fs::dir_ls(path, regexp = paste0("LATO-", "[[:alpha:]]*", ".csv$")) to get a list of files in a specific directory.

The files are sorted alphabetically by default. Is there any way in R to sort them according to my own pattern instead?

LATO-sty.csv, LATO-lut.csv, LATO-mar.csv, LATO-kwi.csv, LATO-maj.csv, LATO-cze.csv, LATO-lip.csv, LATO-sie.csv, LATO-wrz.csv, LATO-paz.csv, LATO-lis.csv, LATO-gru.csv

CodePudding user response:

The order you're describing is not a generic rule that can be applied to every possible value in your file listing. However, we can make sure that if these specific values appear in your vector, they are sorted to the front in your order, with all other files following:

Example fs::dir_ls() data:

files <- c('some/dir/LATO-bar.csv', 'some/dir/LATO-baz.csv', 'some/dir/LATO-foo.csv',
           'some/dir/LATO-kwi.csv', 'some/dir/LATO-lut.csv', 'some/dir/LATO-sty.csv',
           'some/dir/ZLATO-bar.csv', 'some/dir/ZLATO-baz.csv')

Code:

order <- c('LATO-sty.csv', 'LATO-lut.csv', 'LATO-mar.csv', 'LATO-kwi.csv',
           'LATO-maj.csv', 'LATO-cze.csv', 'LATO-lip.csv', 'LATO-sie.csv',
           'LATO-wrz.csv', 'LATO-paz.csv', 'LATO-lis.csv', 'LATO-gru.csv')


# get `files` present in `order`
set1 <- files[fs::path_file(files) %in% order] # keep files whose name is in `order`
ids <- match(fs::path_file(set1), order)       # position of each name in `order`
ids_sorted <- sort(ids, index.return=TRUE)     # sort positions; permutation is in $ix
set1_sorted <- set1[ids_sorted$ix]             # apply sort order

# get `files` NOT present in `order`, keep them in the same order
set2 <- files[!fs::path_file(files) %in% order]

# join sets
result <- unname(c(set1_sorted, set2))

Result:

> result
[1] "some/dir/LATO-sty.csv"  "some/dir/LATO-lut.csv"  "some/dir/LATO-kwi.csv"  "some/dir/LATO-bar.csv"  "some/dir/LATO-baz.csv" 
[6] "some/dir/LATO-foo.csv"  "some/dir/ZLATO-bar.csv" "some/dir/ZLATO-baz.csv"
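
The split-and-join steps above can probably be collapsed into a single order() call. The following is only a sketch under a few assumptions: base R's basename() is swapped in for fs::path_file() so the block runs without the fs package, and month_order and pos are hypothetical names (month_order is used instead of order to avoid masking the base function).

```r
files <- c('some/dir/LATO-bar.csv', 'some/dir/LATO-baz.csv', 'some/dir/LATO-foo.csv',
           'some/dir/LATO-kwi.csv', 'some/dir/LATO-lut.csv', 'some/dir/LATO-sty.csv',
           'some/dir/ZLATO-bar.csv', 'some/dir/ZLATO-baz.csv')

# hypothetical name for the desired order (avoids masking base::order)
month_order <- c('LATO-sty.csv', 'LATO-lut.csv', 'LATO-mar.csv', 'LATO-kwi.csv',
                 'LATO-maj.csv', 'LATO-cze.csv', 'LATO-lip.csv', 'LATO-sie.csv',
                 'LATO-wrz.csv', 'LATO-paz.csv', 'LATO-lis.csv', 'LATO-gru.csv')

# position of each file name in `month_order`; NA for names not listed there
pos <- match(basename(files), month_order)

# push unlisted files behind all listed ones
pos[is.na(pos)] <- length(month_order) + 1L

# second key keeps unlisted files in their original relative order
result <- files[order(pos, seq_along(files))]
```

For the example data this should give the same vector as the step-by-step version: the three known months first, then the remaining files in their original order.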

CodePudding user response:

If you have these alphabetically sorted names,

x
# [1] "LATO-cze.csv" "LATO-gru.csv" "LATO-kwi.csv" "LATO-lip.csv" "LATO-lis.csv" "LATO-lut.csv"
# [7] "LATO-maj.csv" "LATO-mar.csv" "LATO-paz.csv" "LATO-sie.csv" "LATO-sty.csv" "LATO-wrz.csv"

you may easily sort them according to a minimal pattern

pattern <- c("sty", "lut", "mar", "kwi", "maj", "cze", "lip", "sie", "wrz", "paz", 
             "lis", "gru")

or if you want to type the whole pattern and think it is safer in your case,

pattern <- c("LATO-sty.csv", "LATO-lut.csv", "LATO-mar.csv", ...)

using grep.

sapply(pattern, grep, x)
# sty lut mar kwi maj cze lip sie wrz paz lis gru 
#  11   6   8   3   7   1   4  10  12   9   5   2 

sapply(pattern, grep, x, value=TRUE)  ## use `value=TRUE` to check if it's right
# sty            lut            mar            kwi            maj            cze 
# "LATO-sty.csv" "LATO-lut.csv" "LATO-mar.csv" "LATO-kwi.csv" "LATO-maj.csv" "LATO-cze.csv" 
# lip            sie            wrz            paz            lis            gru 
# "LATO-lip.csv" "LATO-sie.csv" "LATO-wrz.csv" "LATO-paz.csv" "LATO-lis.csv" "LATO-gru.csv" 

Finally, to sort the list lst, simply subset it with the grep result.

lst[sapply(pattern, grep, x)]
# $`LATO-sty.csv`
# list()
# 
# $`LATO-lut.csv`
# list()
# 
# $`LATO-mar.csv`
# list()
# 
# $`LATO-kwi.csv`
# list()
# 
# $`LATO-maj.csv`
# list()
# 
# $`LATO-cze.csv`
# list()
# 
# $`LATO-lip.csv`
# list()
# 
# $`LATO-sie.csv`
# list()
# 
# $`LATO-wrz.csv`
# list()
# 
# $`LATO-paz.csv`
# list()
# 
# $`LATO-lis.csv`
# list()
# 
# $`LATO-gru.csv`
# list()
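
One caveat with the grep approach: sapply(pattern, grep, x) only returns a plain integer vector when every pattern matches exactly one file. If a month is missing or matches twice, it returns a list, and the subsetting fails. A possible variant, sketched here on the assumption that every file is named LATO-<abbreviation>.csv, extracts the abbreviation and uses match() instead. The names x_part, abbr and sorted are hypothetical, and x_part is made-up example data with some months missing.

```r
# hypothetical example data: only four of the twelve months are present
x_part <- c("LATO-cze.csv", "LATO-gru.csv", "LATO-kwi.csv", "LATO-sty.csv")

pattern <- c("sty", "lut", "mar", "kwi", "maj", "cze", "lip", "sie", "wrz", "paz",
             "lis", "gru")

# pull out the letters between "LATO-" and ".csv"
abbr <- sub("^LATO-([[:alpha:]]+)\\.csv$", "\\1", basename(x_part))

# match() gives each file its position in `pattern`; order() sorts by that position
sorted <- x_part[order(match(abbr, pattern))]
sorted
# [1] "LATO-sty.csv" "LATO-kwi.csv" "LATO-cze.csv" "LATO-gru.csv"
```

Since match() works per file rather than per pattern, missing months are simply skipped. The same index, order(match(abbr, pattern)), could be used to subset a list such as lst.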

Data:

x <- c("LATO-cze.csv", "LATO-gru.csv", "LATO-kwi.csv", "LATO-lip.csv", 
"LATO-lis.csv", "LATO-lut.csv", "LATO-maj.csv", "LATO-mar.csv", 
"LATO-paz.csv", "LATO-sie.csv", "LATO-sty.csv", "LATO-wrz.csv"
)

lst <- setNames(replicate(12, list()), x)