I have created two tables in SAS WORK Library using data queried from Oracle tables using implicit pass through connections. I am now looking to compare the values in colA of table_X with values in colP in table_Y and if they are not equal then insert the colA,colB values of table_X to colP and colQ values in table_Y. The colR value of table_Y needs to be updated by getting its value from a SQL Query. Sample data
Table_X is as follows
colA | colB |
---|---|
DAVE | 01-Apr-2022 |
MARK | 05-Jan-2021 |
ROCK | 10-FEB-2022 |
Table_Y is as follows
colP | colQ | colR |
---|---|---|
DAVE | 01-Apr-2022 | 100 |
What I essentially want is If Value.Table_X.colA != Value.Table_Y.colP then INSERT that value of Table_X.colA and Table_X.colB INTO Table_Y.colP and Table_Y.colQ respectively. Table_Y.colR value would be filled by a SQL Query (which I can connect to SQL DBs using Implicit or Explicit Pass-Through connections)
Final Output for Table_Y should look like this
colP | colQ | colR |
---|---|---|
DAVE | 01-Apr-2022 | 100 |
MARK | 05-Jan-2021 | 200 |
ROCK | 10-FEB-2022 | 500 |
How can I do this in SAS and/or Oracle SQL (I can push Table_X and Table_Y in Oracle using Explicit-Pass Through and use Oracle Syntax).
CodePudding user response:
Assuming that colR
sits in a table other_table
, also containing colP
, you could move the whole of table_x
to the target platform.
Then run this script (I add the data in CTE-s to test the functionality, you don't need the WITH clause) :
INSERT INTO table_y
-- your input, don't use in query ...
WITH
table_x(colA,colB) AS (
SELECT 'DAVE',DATE '01-Apr-2022'
UNION ALL SELECT 'MARK',DATE '05-Jan-2021'
UNION ALL SELECT 'ROCK',DATE '10-FEB-2022'
)
,
other_table(colA,colR) AS (
SELECT 'DAVE', 10
UNION ALL SELECT 'MARK',100
UNION ALL SELECT 'ROCK',500
)
-- end of necessary input, real rest of INSERT statemen starts here
SELECT
table_x.colA
, table_x.colB
, other_table.colR
FROM table_x
JOIN other_table USING(colA)
LEFT JOIN table_y ON table_x.colA = table_y.colP
WHERE table_y.colP IS NULL;
control query:
SELECT * FROM table_y;
-- out colP | colQ | colR
-- out ------ ------------ ------
-- out DAVE | 2022-04-01 | 100
-- out MARK | 2021-01-05 | 100
-- out ROCK | 2022-02-10 | 500
CodePudding user response:
Let's just use NAME as your key variable and X and Y as the dataset names to reduce confusion.
Sounds like you just want to get the list of values of NAME that are in X but not in Y. Easy enough in any flavor of SQL.
proc sql;
create table new as
select distinct name from X
except
select name from Y
;
quit;
Or if X and Y are already sorted by NAME then a simple data step will do.
data new;
merge x(in=inx) y(in=iny);
by name;
if inx and not iny and firt.name then output;
keep name;
run;
Once you have that list you could upload it into your remote database and use it to drive the query to select those new observations to append to Y. Or if the list is small enough you could use the dataset to generate a macro variable that you could use to generate a WHERE clause in your remote query to just select those values.
Once you have downloaded the new data you could use PROC APPEND to add it to your existing Y dataset. So if your new observations were downloaded into a dataset named NEWY then you could run a step like this:
proc append base=Y data=NEWY;
run;