File names and file sizes do not load into R basic data frame

Time:12-19

I am a student taking a Basic R Programming class, and one of the tasks is to write a custom function named sizeReport() that takes the following arguments:

  • path: path to the directory (string),
  • patt: regular expression (string) used to filter the reported files / directories, default ".*",
  • dironly: logical value that toggles whether all items or only directories are reported, default FALSE,
  • level: recursion depth (integer), default Inf.
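As a rough sketch of the base R building blocks these arguments map onto, the snippet below lists one directory level, filters names with a regular expression, and checks which entries are directories. The scratch directory and the file names in it are made up for illustration only.

```r
# Hypothetical scratch directory, created only for this illustration
root <- file.path(tempdir(), "sizeReport-demo")
dir.create(file.path(root, "sub"), recursive = TRUE, showWarnings = FALSE)
writeLines("a,b", file.path(root, "data.csv"))
writeLines("hello", file.path(root, "sub", "notes.txt"))

# Non-recursive listing: one level only, names relative to root
entries <- list.files(root, recursive = FALSE)

# patt would be applied with grepl(); ".*" matches every name
grepl(".*", entries)
grepl("\\.csv$", entries)

# dironly would keep only entries for which dir.exists() is TRUE
full <- file.path(root, entries)
dir.exists(full)

# file.size() returns the size in bytes of each entry
file.size(full)
```

The level argument has no direct counterpart here; it has to be handled by the recursion itself, for example by passing level - 1 to each nested call.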

A typical basic call is shown below. Only the path argument is supplied, given as a relative path (an absolute path to, e.g., a folder works as well). The other arguments take their default values: level = Inf, so the directory is searched completely (all items in the directory, in all subdirectories, and so on are listed); patt = ".*", so every item matches; and dironly = FALSE, so all items are listed, not just directories.

                                            path     size
1                                            ../ 13849765
2                                   ..//cars.csv 12536233
3                                  ..//projekt-1    33789
6                     ..//projekt-1/project.html    25164
7                      ..//projekt-1/project.org     8625
8                             ..//projekt-1/test        0
4                                  ..//projekt-2  1041209
9  ..//projekt-2/A-List-Of-Epidemics_examples.nb   823182
10                ..//projekt-2/disease_data.csv     9109
11                  ..//projekt-2/event_data.csv      165
12                        ..//projekt-2/fig1.png     9954
13                        ..//projekt-2/fig2.png    27818
14                        ..//projekt-2/fig3.png    21237
15                        ..//projekt-2/fig4.png    16929
16                        ..//projekt-2/fig5.png    24854
17                        ..//projekt-2/fig6.png    24854
18                        ..//projekt-2/fig7.png    19974
19                    ..//projekt-2/project.html    18842
20                   ..//projekt-2/project.html~    18392
21                     ..//projekt-2/project.org    15587
22                    ..//projekt-2/project.org~    10312
5                                  ..//projekt-3   238534
23                      ..//projekt-3/analiza.nb     1780
24                       ..//projekt-3/fig_1.png    32498
25                       ..//projekt-3/fig_2.png    72864
26                       ..//projekt-3/fig_3.png    90707
27                    ..//projekt-3/project.html    14314
28                   ..//projekt-3/project.html~    14313
29                     ..//projekt-3/project.org     6648
30                    ..//projekt-3/project.org~     5410

The next example is the same call as above, but the path is given as an absolute path. The display is wider because the paths are longer; otherwise the results are the same. In further examples, I will use only paths relative to the working directory.

1                                            /Users/michael/Documents/sync/lectures/sgh/2-programowanie-w-R-podstawy/2022-11-09-zadania-zaliczeniowe 13849765
2                                   /Users/michael/Documents/sync/lectures/sgh/2-programowanie-w-R-podstawy/2022-11-09-zadania-zaliczeniowe/cars.csv 12536233
3                                  /Users/michael/Documents/sync/lectures/sgh/2-programowanie-w-R-podstawy/2022-11-09-zadania-zaliczeniowe/projekt-1    33789
6                     /Users/michael/Documents/sync/lectures/sgh/2-programowanie-w-R-podstawy/2022-11-09-zadania-zaliczeniowe/projekt-1/project.html    25164
7                      /Users/michael/Documents/sync/lectures/sgh/2-programowanie-w-R-podstawy/2022-11-09-zadania-zaliczeniowe/projekt-1/project.org     8625
8                             /Users/michael/Documents/sync/lectures/sgh/2-programowanie-w-R-podstawy/2022-11-09-zadania-zaliczeniowe/projekt-1/test        0
4                                  /Users/michael/Documents/sync/lectures/sgh/2-programowanie-w-R-podstawy/2022-11-09-zadania-zaliczeniowe/projekt-2  1041209
9  /Users/michael/Documents/sync/lectures/sgh/2-programowanie-w-R-podstawy/2022-11-09-zadania-zaliczeniowe/projekt-2/A-List-Of-Epidemics_examples.nb   823182
10                /Users/michael/Documents/sync/lectures/sgh/2-programowanie-w-R-podstawy/2022-11-09-zadania-zaliczeniowe/projekt-2/disease_data.csv     9109
11                  /Users/michael/Documents/sync/lectures/sgh/2-programowanie-w-R-podstawy/2022-11-09-zadania-zaliczeniowe/projekt-2/event_data.csv      165
12                        /Users/michael/Documents/sync/lectures/sgh/2-programowanie-w-R-podstawy/2022-11-09-zadania-zaliczeniowe/projekt-2/fig1.png     9954
13                        /Users/michael/Documents/sync/lectures/sgh/2-programowanie-w-R-podstawy/2022-11-09-zadania-zaliczeniowe/projekt-2/fig2.png    27818
14                        /Users/michael/Documents/sync/lectures/sgh/2-programowanie-w-R-podstawy/2022-11-09-zadania-zaliczeniowe/projekt-2/fig3.png    21237
15                        /Users/michael/Documents/sync/lectures/sgh/2-programowanie-w-R-podstawy/2022-11-09-zadania-zaliczeniowe/projekt-2/fig4.png    16929
16                        /Users/michael/Documents/sync/lectures/sgh/2-programowanie-w-R-podstawy/2022-11-09-zadania-zaliczeniowe/projekt-2/fig5.png    24854
17                        /Users/michael/Documents/sync/lectures/sgh/2-programowanie-w-R-podstawy/2022-11-09-zadania-zaliczeniowe/projekt-2/fig6.png    24854
18                        /Users/michael/Documents/sync/lectures/sgh/2-programowanie-w-R-podstawy/2022-11-09-zadania-zaliczeniowe/projekt-2/fig7.png    19974
19                    /Users/michael/Documents/sync/lectures/sgh/2-programowanie-w-R-podstawy/2022-11-09-zadania-zaliczeniowe/projekt-2/project.html    18842
20                   /Users/michael/Documents/sync/lectures/sgh/2-programowanie-w-R-podstawy/2022-11-09-zadania-zaliczeniowe/projekt-2/project.html~    18392
21                     /Users/michael/Documents/sync/lectures/sgh/2-programowanie-w-R-podstawy/2022-11-09-zadania-zaliczeniowe/projekt-2/project.org    15587
22                    /Users/michael/Documents/sync/lectures/sgh/2-programowanie-w-R-podstawy/2022-11-09-zadania-zaliczeniowe/projekt-2/project.org~    10312
5                                  /Users/michael/Documents/sync/lectures/sgh/2-programowanie-w-R-podstawy/2022-11-09-zadania-zaliczeniowe/projekt-3   238534
23                      /Users/michael/Documents/sync/lectures/sgh/2-programowanie-w-R-podstawy/2022-11-09-zadania-zaliczeniowe/projekt-3/analiza.nb     1780
24                       /Users/michael/Documents/sync/lectures/sgh/2-programowanie-w-R-podstawy/2022-11-09-zadania-zaliczeniowe/projekt-3/fig_1.png    32498
25                       /Users/michael/Documents/sync/lectures/sgh/2-programowanie-w-R-podstawy/2022-11-09-zadania-zaliczeniowe/projekt-3/fig_2.png    72864
26                       /Users/michael/Documents/sync/lectures/sgh/2-programowanie-w-R-podstawy/2022-11-09-zadania-zaliczeniowe/projekt-3/fig_3.png    90707
27                    /Users/michael/Documents/sync/lectures/sgh/2-programowanie-w-R-podstawy/2022-11-09-zadania-zaliczeniowe/projekt-3/project.html    14314
28                   /Users/michael/Documents/sync/lectures/sgh/2-programowanie-w-R-podstawy/2022-11-09-zadania-zaliczeniowe/projekt-3/project.html~    14313
29                     /Users/michael/Documents/sync/lectures/sgh/2-programowanie-w-R-podstawy/2022-11-09-zadania-zaliczeniowe/projekt-3/project.org     6648
30                    /Users/michael/Documents/sync/lectures/sgh/2-programowanie-w-R-podstawy/2022-11-09-zadania-zaliczeniowe/projekt-3/project.org~     5410

I tried to write this code:

sizeReport <- function(path, patt = ".*", dironly = FALSE, level = Inf) {
  
  files <- data.frame(name = character(), size = numeric())
  

  walkDir <- function(path, level) {
   
    filesInDir <- list.files(path, recursive = FALSE)

    for (file in filesInDir) {
    
      fullPath <- file.path(path, file)
     
      if (dironly && !dir.exists(fullPath)) {
        next
      }
      
      if (dir.exists(fullPath) && level > 0) {
        walkDir(fullPath, level - 1)
      } else {
        
        if (!dir.exists(fullPath) && grepl(patt, file)) {
          files <- rbind(files, data.frame(name = fullPath, size = file.size(fullPath)))
        }
      }
    }
  }
  
  
  walkDir(path, level)
  
  
  return(files)
}
sizeReport("../")


filesInDir <- list.files("../", recursive = FALSE)
filesInDir

But it does not work. When calling that function with

sizeReport(path = "../")

It shows only this:


[1] name size
<0 rows> (or 'row.names' with 0 length)

When I use the filesInDir bit of the code:

filesInDir <- list.files(path, recursive = FALSE)

it shows filenames in a vector:

[1] "fUNKCJA JAKAS TAM.R"                "Labirynt"                          
[3] "Mapa.r"                             "sizeReport"                        
[5] "Zrzut ekranu 2022-12-13 232400.png"

So there should not be any problem with loading them into a data frame. What am I doing wrong?

User response:

Welcome to Stack Overflow.

You need to use the super-assignment <<- instead of the assignment operator <- in the line

files <<- rbind(files, data.frame(name = fullPath, size = file.size(fullPath)))

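With the regular assignment <-, the statement inside walkDir() creates a new variable files that is local to that call of walkDir(). The files data frame defined in sizeReport() is never modified, so the function returns it empty. The super-assignment <<- instead looks for an existing files in the enclosing environments and modifies that one. You can read more about super-assignment on R's help page ?assignOps.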
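To see the scoping difference in isolation, here is a minimal sketch with two made-up functions, outer_local() and outer_super(), that differ only in the assignment operator used inside the nested helper.

```r
outer_local <- function() {
  counter <- 0
  bump <- function() {
    # <- creates a new `counter` that is local to bump()
    counter <- counter + 1
  }
  bump()
  bump()
  counter
}

outer_super <- function() {
  counter <- 0
  bump <- function() {
    # <<- modifies the `counter` defined in outer_super()
    counter <<- counter + 1
  }
  bump()
  bump()
  counter
}

outer_local()  # 0: the outer counter was never touched
outer_super()  # 2: both calls updated the outer counter
```

Applied to sizeReport(), the same one-character change gives the function that follows.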

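sizeReport <- function(path, patt = ".*", dironly = FALSE, level = Inf) {
  # Accumulator for the report; walkDir() below appends to it
  files <- data.frame(name = character(), size = numeric())

  # Recursively walk the directory tree, up to `level` levels deep
  walkDir <- function(path, level) {
    filesInDir <- list.files(path, recursive = FALSE)

    for (file in filesInDir) {
      fullPath <- file.path(path, file)

      # With dironly = TRUE, skip everything that is not a directory
      if (dironly && !dir.exists(fullPath)) {
        next
      }

      if (dir.exists(fullPath) && level > 0) {
        # Descend into the subdirectory with one level less to go
        walkDir(fullPath, level - 1)
      } else {
        if (!dir.exists(fullPath) && grepl(patt, file)) {
          # <<- updates `files` in sizeReport(), not a local copy
          files <<- rbind(files, data.frame(name = fullPath, size = file.size(fullPath)))
        }
      }
    }
  }

  walkDir(path, level)
  return(files)
}
sizeReport("../")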
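If you would rather avoid <<-, one possible alternative is to let the recursive helper return its rows and combine them with rbind() on the way back up. The sketch below uses hypothetical names (sizeReportReturn, walk_dir) and, like the function above, reports files only; the dironly argument and the directory rows from the expected output are left out to keep it short.

```r
# Hypothetical variant: walk_dir() returns its rows instead of modifying
# a variable in the enclosing function.
sizeReportReturn <- function(path, patt = ".*", level = Inf) {
  walk_dir <- function(path, level) {
    rows <- data.frame(name = character(), size = numeric())
    for (file in list.files(path, recursive = FALSE)) {
      fullPath <- file.path(path, file)
      if (dir.exists(fullPath)) {
        # Descend only while recursion depth remains
        if (level > 0) {
          rows <- rbind(rows, walk_dir(fullPath, level - 1))
        }
      } else if (grepl(patt, file)) {
        rows <- rbind(rows, data.frame(name = fullPath,
                                       size = file.size(fullPath)))
      }
    }
    rows
  }
  walk_dir(path, level)
}

# Usage on a made-up scratch directory
root <- file.path(tempdir(), "sizeReport-return-demo")
dir.create(file.path(root, "sub"), recursive = TRUE, showWarnings = FALSE)
writeLines("x", file.path(root, "top.txt"))
writeLines("y", file.path(root, "sub", "inner.txt"))
report <- sizeReportReturn(root)
report
```

Here rows is local to each call of walk_dir(), which is fine because every call hands its result back to its caller explicitly.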