I am trying to create a query that cleans a data in SQL but I don't know SQL. To do so, I create a sample of the dataset and use dplyr
to clean and create the final data that I want and then I want to use dbplyr
to create the query. But I am having trouble to create the query
data <- read_csv("DataBase/raw/bquxjob_1016f9df_180d80c269e.csv")
#Creating query I will run in SQL
data %>%
filter(codigo_natureza_juridica == 2046
| codigo_natureza_juridica == 2054
| codigo_natureza_juridica == 2062
| codigo_natureza_juridica == 2070
| codigo_natureza_juridica == 2089
| codigo_natureza_juridica == 2097
| codigo_natureza_juridica == 2135
| codigo_natureza_juridica == 2232
| codigo_natureza_juridica == 2240
| codigo_natureza_juridica == 2259
| codigo_natureza_juridica == 2267 &
data_inicio_atividade <= '2011-12-31') %>%
mutate(year_open = format(data_inicio_atividade, format ="%Y"),
month_open = format(data_inicio_atividade, format ="%m"),
date_open = format(data_inicio_atividade, format ="%Y%m"),
year_close = ifelse(situacao_cadastral != 2,
format(data_situacao_cadastral, format ="%Y"),
NA),
month_close = ifelse(situacao_cadastral != 2,
format(data_situacao_cadastral, format ="%m"),
NA),
date_close = ifelse(situacao_cadastral != 2,
format(data_situacao_cadastral, format ="%Y%m"),
NA),
open_2007 = ifelse(year_open <= 2007, 1,0)) %>%
show_query()
This code gives the following error:
Error in UseMethod("show_query") :
no applicable method for 'show_query' applied to an object of class "c('tbl_df', 'tbl', 'data.frame')"
CodePudding user response:
show_query()
will only work on a database, and you are trying to use it on a dataframe. To send your data from the csv to a temporary database object to create the query, you could use tbl_memdb()
and instead do:
data %>%
tbl_memdb() %>%
filter(...) %>%
mutate(...) %>%
show_query()