Django Raw SQL Query using multiple databases

Time:10-04

I am trying to pull information from 2 different databases in the same query.

Below is an example of the code I'm trying to use with identifying information changed to generic names.

I have my databases set up in settings like this:

DATABASES = {
 'default': {},
 'primary': {
     'ENGINE': 'django.db.backends.oracle',
     'NAME': 'oracle.test.com:5432/erp',
     'USER':'test1',
     'PASSWORD':'',
 },
 'secondary': {
     'ENGINE': 'django.db.backends.oracle',
     'NAME': 'oracle2.test.com:5432/wms',
     'USER':'test2',
     'PASSWORD':'',
 },
}

Then in my views I have this:

def index(request):
    sql_query = ('select \n'
                 '  primary.table1.organization_id "Org", \n'
                 '  primary.table2.LOCATION "Loc", \n'
                 '  primary.table3.inventory "Inv", \n'
                 '  primary.table4.reorder_date "Reorder", \n'
                 '  secondary.table3.COMMENTS "Comments" \n'
                 'from \n'
                 '  primary.table2, primary.table1 \n'
                 '  LEFT OUTER JOIN primary.table3 ON \n'
                 '    (primary.table1.SCHEDULE_NUMBER = primary.table3.SCHEDULE_NUMBER) \n'
                 '  LEFT OUTER JOIN primary.table4 ON \n'
                 '    (primary.table1.SCHEDULE_NUMBER = primary.table4.PARENT_SCHEDULE_NUMBER) \n'
                 '  LEFT OUTER JOIN secondary.table1 ON \n'
                 '    (primary.table1.SCHEDULE_NUMBER = secondary.table1.SCHEDULE_NUMBER) \n'
                 'where \n'
                 '  primary.table1.item_id = primary.table2.inventory_item and \n'
                 '  primary.table1.organization_id = primary.table2.organization_id \n')
    with connections['primary', 'secondary'].cursor() as cursor:
        cursor.execute(sql_query)
        field_names = [tuple(x[0] for x in cursor.description)]
        row = cursor.fetchall()
        result = field_names + row
        df = pd.DataFrame(result)
        df.rename(columns=df.iloc[0], inplace=True)
        df.drop([0], inplace=True)
        table = df.to_html(index=False, classes='mystyle', justify='left')
    return render(request, 'template.html', {'table': table})

I thought this would work because the syntax I found online for SQL says to prepend the database name to each table. Unfortunately, I get a database error saying that primary or secondary is an invalid identifier.

EDIT - I also tried to simplify things to test. When I try this I get

getattr(): attribute name must be string

and it lists this line of code

with connections['primary', 'secondary'].cursor() as cursor:

So it would appear that I cannot use 2 different databases at the same time. Is there a better solution for what I am trying to accomplish?

    test = 'select * from erp.wip_lines where organization_id = 81'

    with connections['primary', 'secondary'].cursor() as cursor:
        cursor.execute(test)
        field_names = [tuple(x[0] for x in cursor.description)]
        row = cursor.fetchall()
        result = field_names + row
        df = pd.DataFrame(result)
        df.rename(columns=df.iloc[0], inplace=True)
        df.drop([0], inplace=True)
        table = df.to_html(index=False, classes='mystyle', justify='left')
    return render(request, 'template.html', {'table': table})

CodePudding user response:

First, connections is keyed by the database alias, so you should iterate over the aliases to get both connections:

dbs_to_connect = ['primary', 'secondary']
for alias in dbs_to_connect:
    with connections[alias].cursor() as cursor:
        # code here...

Second, I'm not sure how your databases and schemas are configured in Oracle, but remember that the query should always be select <columns> from <schema>.<table>. In any case, the query string should also run directly in another database client, so you can test it there to make sure the problem is not the query itself.
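The one-cursor-per-connection pattern above can be demonstrated outside Django with two SQLite databases (a stand-alone sketch; the table names, columns, and data are made up for illustration):

```python
import sqlite3

# Two independent databases stand in for the 'primary' and 'secondary' aliases.
erp = sqlite3.connect(":memory:")
wms = sqlite3.connect(":memory:")

erp.execute("CREATE TABLE table1 (schedule_number INTEGER, organization_id INTEGER)")
erp.execute("INSERT INTO table1 VALUES (100, 81), (200, 81)")

wms.execute("CREATE TABLE table1 (schedule_number INTEGER, comments TEXT)")
wms.execute("INSERT INTO table1 VALUES (100, 'rush order')")

# One cursor per connection -- a single cursor can never span two databases.
rows = {}
for alias, conn in {"primary": erp, "secondary": wms}.items():
    with conn:
        cur = conn.execute("SELECT * FROM table1")
        rows[alias] = cur.fetchall()

print(rows["primary"])    # rows from the first database
print(rows["secondary"])  # rows from the second database
```

The joining of the two result sets then has to happen in Python, not in SQL, which is what the accepted answer below the line does with pandas.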

CodePudding user response:

Based on some advice from other forums, I learned that it is not possible to join tables across databases. So here is my solution: run the two queries against their own databases, store the outputs in separate dataframes, and then use pandas merge to join the two dataframes with a left join.

    filepath = 'example filepath'
    SQL_Query1 = 'example'
    SQL_Query2 = 'example'
    with connections['primary'].cursor() as cursor:
        cursor.execute(SQL_Query1)
        field_names = [tuple(x[0] for x in cursor.description)]
        row = cursor.fetchall()
        result = field_names + row
        df = pd.DataFrame(result)
        df.rename(columns=df.iloc[0], inplace=True)
        df.drop([0], inplace=True)
        df['SCHED_#'] = pd.to_numeric(df['SCHED_#'])

    with connections['secondary'].cursor() as cursor2:
        cursor2.execute(SQL_Query2)
        wms_field_names = [tuple(x[0] for x in cursor2.description)]
        wms_row = cursor2.fetchall()
        wms_result = wms_field_names + wms_row
        wms_df = pd.DataFrame(wms_result)
        wms_df.rename(columns=wms_df.iloc[0], inplace=True)
        wms_df.drop([0], inplace=True)
        wms_df['SCHED_#'] = pd.to_numeric(wms_df['SCHED_#'])

    merged_df = pd.merge(df, wms_df, on='SCHED_#', how='left', indicator=True)
    merged_df.to_csv(filepath, index=False)
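As a toy illustration of what that left merge produces (the frames and values here are made up, standing in for the two query results):

```python
import pandas as pd

# Stand-ins for the ERP and WMS query results.
df = pd.DataFrame({"SCHED_#": [100, 200], "Inv": [5, 7]})
wms_df = pd.DataFrame({"SCHED_#": [100], "Comments": ["rush order"]})

merged = pd.merge(df, wms_df, on="SCHED_#", how="left", indicator=True)
# Every row of df is kept; schedules missing from wms_df get NaN in Comments,
# and the _merge column records whether each row matched ('both' or 'left_only').
print(merged)
```

The how='left' argument is what makes the ERP dataframe authoritative, and indicator=True is handy for spotting schedules that had no match in the WMS data.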

Thanks everyone for your input
