Home > Software design >  How to load an encrypted sqlite file in RStudio?
How to load an encrypted sqlite file in RStudio?

Time:01-12

DESCRIPTION: I am using RStudio and I have a zip file called databse.zip. The file is encrypted by a password. My goal is to be able to load the file, decrypt it, extract the sqlite database inside (database.db) and be able to make a query to it.

So far I tried the following:

# Load required packages
library(dotenv)
library(Hmisc)
library(RSQLite)

# Get password
password = Sys.getenv("PASSWORD")

# Load .zip file and decrypt it
test = getZip("databse.zip", password=password)

# Connect to the SQLite database
con = dbConnect(RSQLite::SQLite(), test) # ERROR! Error in path.expand(path) : invalid 'path' argument

# Get list of items from table in the database
my_data = dbGetQuery(con, "SELECT column_name FROM table") # I do not even reach this point

PROBLEM: Basically, I am able to load and decrypt the file but then I have no idea how to connect to the sqlite database and make a query to it.

QUESTION: Would you be able to suggest a smart and elegant way to achieve my goal please? Feel free to use different packages if needed.

CodePudding user response:

getZip() is used to stream a content of a single file form zip archive without storing it on a disk first, it returns connection object. This would work with flat files like zipped csv-s while using it with e.g. read.csv(). For DBI / RSQLite you (generally) need to have a file on your disk.

As R can't handle password-protected zip archives natively, we use 7z, though any other suitable command line extractor would work as well (e.g. unzip from rtools in case of Win).

Here we first create password-protected zip arhive to create a reproducible example, then extract it to extracted/ in R session tmp. directory and connect to extracted SQLite database.

library(DBI)
library(glue)

# 7z executable
sevenz <- "C:/Program Files/7-Zip/7z.exe"
archive_pw <- "pw123"

tmp_dir <- tempdir()
# open temp folder in file manager :
# browseURL(tmp_dir)

# prepare reprex ----------------------------------------------------------
# store mtcars in sqlite, create pasword-protected zip archive

sqlite_f <- file.path(tmp_dir, "mtcars.sqlite")
con <- dbConnect(RSQLite::SQLite(), sqlite_f)
data("mtcars")
dbWriteTable(con, "mtcars", mtcars)
dbDisconnect(con)
# double-quotes to handle spaces in paths
system(glue('"{sevenz}" a "{sqlite_f}.zip" "{sqlite_f}" -p"{archive_pw}"' ))
#> [1] 0


# extract and load sqlite DB from password-protected zip ------------------

dest <- file.path(tmp_dir, "extracted")
system(glue('"{sevenz}" e "{sqlite_f}.zip" -o"{dest}" -p"{archive_pw}"' ))
#> [1] 0
con <- dbConnect(RSQLite::SQLite(), file.path(dest, "mtcars.sqlite"))
dbGetQuery(con, "SELECT * FROM mtcars LIMIT 5")
#>    mpg cyl disp  hp drat    wt  qsec vs am gear carb
#> 1 21.0   6  160 110 3.90 2.620 16.46  0  1    4    4
#> 2 21.0   6  160 110 3.90 2.875 17.02  0  1    4    4
#> 3 22.8   4  108  93 3.85 2.320 18.61  1  1    4    1
#> 4 21.4   6  258 110 3.08 3.215 19.44  1  0    3    1
#> 5 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2
dbDisconnect(con)

Created on 2023-01-10 with reprex v2.0.2

  • Related