I've been working on this for hours and can't find a solution that works. For simplicity let us say the setup is that we have a DataFrame in R, let's call it df, with a single column of values and let's say it has values 1,2,3,4 and 5.
I want to embed the following query in R:
SELECT DISTINCT column1,
column 2
FROM database
WHERE value IN (1,2,3,4,5)
If I embed the query in R via the following
df5 <- dbGetQuery(db,paste0("
SELECT DISTINCT column1,
column 2
FROM database
WHERE value IN (1,2,3,4,5)"))
then my query works, but I want to reference this DataFrame which I am pulling in from an Excel file. The natural thing to do would be to convert it to a list
val_list=list(df$'values')
and do the following
df5 <- dbGetQuery(db,paste0("
SELECT DISTINCT column1,
column 2
FROM database
WHERE value IN '",vals,"))
This is, however, not working. How can I get it to work as I want?
CodePudding user response:
One should never interpolate data directly into the query, lest accidental sql-injection (or query-poisoning) occurs. It's better to use bound parameters or similar, see https://db.rstudio.com/best-practices/run-queries-safely/.
For this code, assuming you have a set of values you want to check the IN
set membership:
#?# vec <- df$values
qmarks <- paste(rep("?", length(vec)), collapse = ",")
df5 <- dbGetQuery(db, paste("
SELECT DISTINCT column1, column 2
FROM database
WHERE value IN (", qmarks, ")"),
params = as.list(vec))
CodePudding user response:
Suppose we wish to insert the Time column from the built in data frame BOD into an sql query.
sprintf("select * from X where Y in (%s)", toString(BOD$Time))
## [1] "select * from X where Y in (1, 2, 3, 4, 5, 7)"