What I try to do is that bind rows in my PostgreSQL databes in matched columns like rbindlist
's (from data.table
) fill
argument.
In short, the table I'd like to see in my database is like this;
a <- data.frame(no=c(234,235,346),year=2012:2014,col1=c(1,1,1))
b <- data.frame(no=c(333,353,324),year=2014:2016,col2=c(2,2,2))
union_data_frame <- data.table::rbindlist(list(a,b),fill=T)
union_data_frame
no year col1 col2
1 234 2012 1 NA
2 235 2013 1 NA
3 346 2014 1 NA
4 333 2014 NA 2
5 353 2015 NA 2
6 324 2016 NA 2
I tried it in RPostgres
in this way;
library(RPostgres)
a <- data.frame(no=c(234,235,346),year=2012:2014,col1=c(1,1,1))
b <- data.frame(no=c(333,353,324),year=2014:2016,col2=c(2,2,2))
drv <- dbDriver('Postgres')
con <- dbConnect(drv,user='postgres',dbname='dummy_db')
dbWriteTable(con,'dummy_table',a,append = T,row.names = F)
dbWriteTable(con,'dummy_table',b,append = T,row.names = F)
But it doesn't work and fields an error because the second table (b
) doesn't have a column called col2
.
How to append tables by only common columns ?
Thanks in advance.
CodePudding user response:
I think you need to:
- identify which columns are missing,
alter table
those new columns into existence, and then- upload the data, assuming all data in the first that are missing in the second are null-able.
### pg <- dbConnect(...)
dbWriteTable(pg, "some_table", a)
newcolumns <- setdiff(colnames(b), dbListFields(pg, "a"))
newcolumns
# [1] "col2"
addqry <- paste("alter table some_table",
paste("add", newcolumns, dbDataType(pg, b[,newcolumns]),
collapse = ", "))
addqry
# [1] "alter table some_table add col2 DOUBLE PRECISION"
dbExecute(pg, addqry)
dbWriteTable(pg, "some_table", b, append = TRUE)
dbGetQuery(pg, "select * from some_table")
# no year col1 col2
# 1 234 2012 1 NA
# 2 235 2013 1 NA
# 3 346 2014 1 NA
# 4 333 2014 NA 2
# 5 353 2015 NA 2
# 6 324 2016 NA 2