Get names of files in many folders in R

Time:09-29

I want to get the names of the files in many different folders; I have 30,000 folders, each with a different name.

For example, folder G1 has 2 .jpg files, G1 SEED.jpg and G1 POD.jpg.

Another folder, G2, has 1 .jpg file, G2 SEED.jpg.

I tried list.files() but don't know what to put in pattern.

The idea is to make a data frame like this:

FOLDER  |  NAME SEED   |   NAME POD   |
--------|--------------|--------------|
G1      | G1 SEED      | G1 POD       |
G2      | G2 SEED      |              |
G3      | G3 SEED      |              |
G4      | G4 SEED      |              |
  ...         ...            ...

Update

Example of folders and files in each one.

https://drive.google.com/drive/folders/1vMgxDxos67S_FDJ5qC-JbehnJNh3tKb2?usp=sharing

head(list.files("C:/Users/macosta/Downloads/download2",
                pattern = "(png|jpg)$", full.names = TRUE, recursive = TRUE), 20)

[1] "C:/Users/macosta/Downloads/download2/G    1/G    1 pod.jpg"   
 [2] "C:/Users/macosta/Downloads/download2/G    1/G    1 seed.jpg"  
 [3] "C:/Users/macosta/Downloads/download2/G    2/G    2 seed.jpg"  
 [4] "C:/Users/macosta/Downloads/download2/G    3/G    3 seed.jpg"  
 [5] "C:/Users/macosta/Downloads/download2/G    4/G    4 seed.jpg"  
 [6] "C:/Users/macosta/Downloads/download2/G    5/G    5 pod.jpg"   
 [7] "C:/Users/macosta/Downloads/download2/G    5/G    5 seed.jpg"  
 [8] "C:/Users/macosta/Downloads/download2/G    6/G    6 pod.jpg"   
 [9] "C:/Users/macosta/Downloads/download2/G    6/G    6 seed.jpg"  
[10] "C:/Users/macosta/Downloads/download2/G    7/G    7 seed.jpg"  
[11] "C:/Users/macosta/Downloads/download2/G    7A/G    7A seed.jpg"
[12] "C:/Users/macosta/Downloads/download2/G    8/G    8 pod.jpg"   
[13] "C:/Users/macosta/Downloads/download2/G    8/G    8 seed.jpg"  
[14] "C:/Users/macosta/Downloads/download2/G    9/G    9 pod.jpg"   
[15] "C:/Users/macosta/Downloads/download2/G    9/G    9 seed.jpg"  
[16] "C:/Users/macosta/Downloads/download2/G   10/G   10 pod.jpg"   
[17] "C:/Users/macosta/Downloads/download2/G   10/G   10 seed.jpg"  
[18] "C:/Users/macosta/Downloads/download2/G   11/G   11 seed.jpg"  
[19] "C:/Users/macosta/Downloads/download2/G   12/G   12 seed.jpg"  
[20] "C:/Users/macosta/Downloads/download2/G   13/G   13 pod.jpg"

All files are .jpg.

CodePudding user response:

Try this:

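library(dplyr)
library(tidyr)   # pivot_wider
library(stringr) # str_extract
# `files` is the character vector of full paths (see Data at the end of this answer)
dat <- data.frame(folder = dirname(files), file = basename(files))
dat %>%
  # label each file as "pod" or "seed"; anything else becomes "unk"
  mutate(type = coalesce(str_extract(file, "(pod|seed)"), "unk")) %>%
  # one row per folder, one column per type; id_cols is named explicitly
  pivot_wider(id_cols = folder, names_from = "type", values_from = "file")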
# # A tibble: 14 x 3
#    folder                                       pod            seed            
#    <chr>                                        <chr>          <chr>           
#  1 C:/Users/macosta/Downloads/download2/G    1  G    1 pod.jpg G    1 seed.jpg 
#  2 C:/Users/macosta/Downloads/download2/G    2  NA             G    2 seed.jpg 
#  3 C:/Users/macosta/Downloads/download2/G    3  NA             G    3 seed.jpg 
#  4 C:/Users/macosta/Downloads/download2/G    4  NA             G    4 seed.jpg 
#  5 C:/Users/macosta/Downloads/download2/G    5  G    5 pod.jpg G    5 seed.jpg 
#  6 C:/Users/macosta/Downloads/download2/G    6  G    6 pod.jpg G    6 seed.jpg 
#  7 C:/Users/macosta/Downloads/download2/G    7  NA             G    7 seed.jpg 
#  8 C:/Users/macosta/Downloads/download2/G    7A NA             G    7A seed.jpg
#  9 C:/Users/macosta/Downloads/download2/G    8  G    8 pod.jpg G    8 seed.jpg 
# 10 C:/Users/macosta/Downloads/download2/G    9  G    9 pod.jpg G    9 seed.jpg 
# 11 C:/Users/macosta/Downloads/download2/G   10  G   10 pod.jpg G   10 seed.jpg 
# 12 C:/Users/macosta/Downloads/download2/G   11  NA             G   11 seed.jpg 
# 13 C:/Users/macosta/Downloads/download2/G   12  NA             G   12 seed.jpg 
# 14 C:/Users/macosta/Downloads/download2/G   13  G   13 pod.jpg NA              
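
The table above keeps the full folder path and the file extension, while the layout asked for in the question shows only the folder name and the file name without ".jpg". As a rough sketch of one way to get closer to that layout, here is a base R variant (not the code from this answer). The column names FOLDER, NAME_SEED and NAME_POD and the helper pick() are made up for illustration, and the three paths are just a small sample in the same shape as the question's output.

```r
# Small sample of paths in the same shape as the question's output.
files <- c(
  "C:/Users/macosta/Downloads/download2/G    1/G    1 pod.jpg",
  "C:/Users/macosta/Downloads/download2/G    1/G    1 seed.jpg",
  "C:/Users/macosta/Downloads/download2/G    2/G    2 seed.jpg"
)

dat <- data.frame(
  folder = basename(dirname(files)),                    # last path component only
  name   = tools::file_path_sans_ext(basename(files)),  # file name without ".jpg"
  stringsAsFactors = FALSE
)

# Case-insensitive labels, so "SEED"/"POD" would be recognised as well.
dat$type <- ifelse(grepl("seed", dat$name, ignore.case = TRUE), "seed",
            ifelse(grepl("pod",  dat$name, ignore.case = TRUE), "pod", NA))

folders <- unique(dat$folder)

# Hypothetical helper: for each folder, the first file of the given type (NA if none).
pick <- function(type) {
  sub <- dat[dat$type %in% type, ]
  sub$name[match(folders, sub$folder)]
}

wide <- data.frame(
  FOLDER    = folders,
  NAME_SEED = pick("seed"),
  NAME_POD  = pick("pod"),
  stringsAsFactors = FALSE
)
wide
```

If a folder held more than one seed or pod file, this sketch would keep only the first one, so it assumes at most one file of each type per folder.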

Data

files <- c(
  "C:/Users/macosta/Downloads/download2/G    1/G    1 pod.jpg",
  "C:/Users/macosta/Downloads/download2/G    1/G    1 seed.jpg",
  "C:/Users/macosta/Downloads/download2/G    2/G    2 seed.jpg",
  "C:/Users/macosta/Downloads/download2/G    3/G    3 seed.jpg",
  "C:/Users/macosta/Downloads/download2/G    4/G    4 seed.jpg",
  "C:/Users/macosta/Downloads/download2/G    5/G    5 pod.jpg",
  "C:/Users/macosta/Downloads/download2/G    5/G    5 seed.jpg",
  "C:/Users/macosta/Downloads/download2/G    6/G    6 pod.jpg",
  "C:/Users/macosta/Downloads/download2/G    6/G    6 seed.jpg",
  "C:/Users/macosta/Downloads/download2/G    7/G    7 seed.jpg",
  "C:/Users/macosta/Downloads/download2/G    7A/G    7A seed.jpg",
  "C:/Users/macosta/Downloads/download2/G    8/G    8 pod.jpg",
  "C:/Users/macosta/Downloads/download2/G    8/G    8 seed.jpg",
  "C:/Users/macosta/Downloads/download2/G    9/G    9 pod.jpg",
  "C:/Users/macosta/Downloads/download2/G    9/G    9 seed.jpg",
  "C:/Users/macosta/Downloads/download2/G   10/G   10 pod.jpg",
  "C:/Users/macosta/Downloads/download2/G   10/G   10 seed.jpg",
  "C:/Users/macosta/Downloads/download2/G   11/G   11 seed.jpg",
  "C:/Users/macosta/Downloads/download2/G   12/G   12 seed.jpg",
  "C:/Users/macosta/Downloads/download2/G   13/G   13 pod.jpg"
)
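
On a real machine, files would come from list.files() rather than being typed in. Since the question asks what to put in pattern, here is a small sketch that builds a throwaway folder tree under tempdir() and lists only the image files in it. The folder name download2_demo and the file names are invented for the demo, and the pattern shown is just one reasonable choice.

```r
# Build a tiny throwaway tree that mimics the question's layout (names are made up).
root <- file.path(tempdir(), "download2_demo")
dir.create(file.path(root, "G1"), recursive = TRUE, showWarnings = FALSE)
dir.create(file.path(root, "G2"), recursive = TRUE, showWarnings = FALSE)
file.create(file.path(root, "G1", c("G1 seed.jpg", "G1 pod.jpg", "notes.txt")))
file.create(file.path(root, "G2", "G2 SEED.JPG"))

# pattern is a regular expression matched against each file name:
#   \\.          a literal dot
#   (jpe?g|png)  jpg, jpeg or png
#   $            end of the name, so "jpg" in the middle of a name does not match
files <- list.files(root,
                    pattern     = "\\.(jpe?g|png)$",
                    full.names  = TRUE,   # keep the folder in each result
                    recursive   = TRUE,   # descend into every subfolder
                    ignore.case = TRUE)   # also accept .JPG, .PNG

basename(files)            # image files only; notes.txt is skipped
basename(dirname(files))   # the folder each file came from

unlink(root, recursive = TRUE)  # remove the demo tree again
```

With recursive = TRUE a single call covers all subfolders, so there should be no need to loop over the 30,000 folders by hand.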