Stream in CSV content in R

Time:11-22

I understand how to read a CSV file that is stored on disk, but I don't know how to stream in CSV content via CLI using R.

For example, this reads a CSV file from disk using a simple CLI:

library(optparse)

option_list <- list(
    # Absolute filepath to CSV file.
    make_option(c("-c","--csv"),type="character",default=NULL,
                help="CSV filepath",metavar="character")
)
opt_parser <- OptionParser(option_list=option_list)
opt <- parse_args(opt_parser)

csv_filepath <- opt$csv
csv <- read.csv(csv_filepath)

How would I do this if I'm working with a data stream?

CodePudding user response:

R reads input through connections. A connection can be a file, a URL, in-memory text, and so on.

So, to read CSV-formatted data from content that is already in memory, use the text= argument instead of a file name.

Like this:

my_stream <- "name;age\nJulie;25\nJohn;26"
read.csv(text = my_stream, sep = ";", header = TRUE)

The output will be:

   name age
1 Julie  25
2  John  26

You can pass additional arguments to read.csv() as usual.
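
The text= argument also accepts a character vector with one element per line, which is the shape a stream usually arrives in. The following is a minimal sketch of that idea; the vector stream_lines is made up for illustration and stands in for lines that might come from something like readLines(file("stdin")).

```r
# Lines as they might arrive from a stream, one element per line.
# (Illustrative data; in a piped script they could come from
# readLines(file("stdin")) instead.)
stream_lines <- c("name;age", "Julie;25", "John;26")

# read.csv() wraps the character vector in a text connection internally.
people <- read.csv(text = stream_lines, sep = ";", header = TRUE)
print(people)
```

This avoids pasting the lines into a single string first.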

CodePudding user response:

Use an R script together with the optparse package.

First, write an R source file "example.R", such as the following.

#!/usr/bin/env Rscript
#
# R source: example.R
# options:  -c --csv
# 
library(optparse)

option_list <- list(
    # Absolute filepath to CSV file.
    make_option(c("-c","--csv"),type="character",default=NULL,
                help="CSV filepath",metavar="character")
)
opt_parser <- OptionParser(option_list=option_list)
opt <- parse_args(opt_parser)

csv_filepath <- opt$csv
csv <- read.csv(csv_filepath)

message(paste("\nfile read:", csv_filepath, "\n"))
str(csv)

Then make the script executable, so that the shell honors the #! shebang line and runs Rscript, passing it the file.
In this case, I change only the user's permissions, not the group's.

bash$ chmod u+x example.R

The test.

I have tested the above script with this data.frame:

df1 <- data.frame(id=1:5, name=letters[1:5])
write.csv(df1, "test.csv", row.names=FALSE)

On Ubuntu 20.04 LTS, I ran ./example.R, passing the CSV filename in the --csv argument. The command and its output were:

bash$ ./example.R --csv=test.csv

file read: test.csv 

'data.frame':   5 obs. of  2 variables:
 $ id  : int  1 2 3 4 5
 $ name: chr  "a" "b" "c" "d" ...
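
The script above only reads from a file path. To also accept content piped in on standard input, one option is to fall back to file("stdin") when --csv is omitted. This is a minimal sketch, assuming opt$csv is NULL in that case (as with default=NULL above); csv_source is a hypothetical helper name, and a temporary file stands in for test.csv.

```r
# Hypothetical helper: pick the input connection from the --csv option value.
# `path` stands for opt$csv, which is NULL when the option is omitted.
csv_source <- function(path = NULL) {
  if (is.null(path)) {
    file("stdin")  # C-level standard input of the process
  } else {
    file(path)     # ordinary file on disk
  }
}

# Demonstration with a temporary file standing in for test.csv.
tmp <- tempfile(fileext = ".csv")
write.csv(data.frame(id = 1:5, name = letters[1:5]), tmp, row.names = FALSE)

# read.csv() opens the connection and closes it again when it is done.
csv <- read.csv(csv_source(tmp))
str(csv)
```

In example.R, the read.csv(csv_filepath) line would become read.csv(csv_source(opt$csv)), so that a call such as cat test.csv | ./example.R should work as well as ./example.R --csv=test.csv.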