I'm using pypika to build some queries. It used to work great but I have an issue with subqueries on an oracle database. The query reads as follow
fake_query = Query().from_(my_table).where(my_table.ID == "12345").select(my_table.ID)
QN = AliasedQuery("fake_query_with")
query = (
Query()
.with_(fake_query, "fake_query_with")
.from_(QN)
.select(
QN.star. # problematic line
)
)
df_temp = claim_conn.read_dataframe(query.get_sql())
So when I run the script, I got an error:
DatabaseError: ORA-00904: "fake_query_with": invalid identifier
The pypika query is translated as expected as the string
WITH fake_query_with AS
(
SELECT "ID"
FROM "MYTABLE"
WHERE "ID"=12345
)
SELECT "fake_query_with".*
FROM fake_query_with
So this query fails, but if I replace QN.star
by simply *
, it works, but of course I loose the interesting usage of the alias. I know this is just a dummy snippet but the it is just to demonstrate the error.
CodePudding user response:
I don't know Python nor Pypika, but - from Oracle's point of view - this is what's wrong here: usage of (evil!) double quotes.
In Oracle, by default, all "identifiers" (table names, column names, procedures, functions ...) are stored into the data dictionary in UPPERCASE. Then, you can reference them any way you want: UPPER, lower or MIxeD CaSe - no difference, everything will work.
But, if you enclose names into double quotes, then you must use double quotes every time, following the same letter case you used while creating that object - otherwise, it won't work.
For example:
SQL> create table mytable as
2 select 12345 id,
3 'Littlefoot' "NamE"
4 from dual;
Table created.
SQL> select table_name from user_tables where table_name = 'MYTABLE';
TABLE_NAME
------------------------------
MYTABLE --> stored in UPPERCASE
SQL> select column_name from user_tab_columns where table_name = 'MYTABLE';
COLUMN_NAME
------------------------------
ID --> stored in UPPERCASE because there were no double quotes
NamE --> mixed case, as I enclosed the name into double quotes
SQL> select id from mytable; --> reference it any way you want
ID
----------
12345
SQL> select name from mytable; --> reference it exactly as it was created
select name from mytable
*
ERROR at line 1:
ORA-00904: "NAME": invalid identifier
SQL> select "NamE" from mytable;
NamE
----------
Littlefoot
SQL>
In your case:
SQL> WITH fake_query_with AS --> CTE named using lower case, but no double quotes
2 (
3 SELECT "ID"
4 FROM "MYTABLE"
5 WHERE "ID"=12345
6 )
7 SELECT "fake_query_with".* --> double quotes, lower case
8 FROM fake_query_with;
SELECT "fake_query_with".*
*
ERROR at line 7:
ORA-00904: "fake_query_with": invalid identifier
SQL>
If you want to make it work, then either a) get rid of double quotes (if you can; maybe Py* requires them, can't tell) (that's my sincere suggestion) or b) use double quotes everywhere:
a) no double quotes
SQL> WITH
2 fake_query_with
3 AS
4 (SELECT id
5 FROM mytable
6 WHERE id = 12345)
7 SELECT fake_query_with.*
8 FROM fake_query_with;
ID
----------
12345
SQL>
b) double quotes, everywhere
SQL> WITH
2 "fake_query_with"
3 AS
4 (SELECT "ID"
5 FROM "MYTABLE"
6 WHERE "ID" = 12345)
7 SELECT "fake_query_with".*
8 FROM "fake_query_with";
ID
----------
12345
SQL>
See if it helps.
CodePudding user response:
As @astentx said in a comment, this looks like a bug in PyPika.
You could (probably) work around it by making the name uppercase in your code - i.e. "FAKE_QUERY_WITH"
in both places.
QN = AliasedQuery("FAKE_QUERY_WITH")
query = (
Query()
.with_(fake_query, "FAKE_QUERY_WITH")
.from_(QN)
.select(
QN.star. # problematic line
)
)
Then the generated SQL should look like:
WITH FAKE_QUERY_WITH AS
(
SELECT "ID"
FROM "MYTABLE"
WHERE "ID"=12345
)
SELECT "FAKE_QUERY_WITH".*
FROM FAKE_QUERY_WITH
and the distinction between the quoted and unquoted identifiers will be moot.
fiddle showing that modified SQL works - untested in PyPika but hopefully that is actually what it will generate...