Home > front end >  More efficient way to make dynamic sql queries than iteration
More efficient way to make dynamic sql queries than iteration

Time:12-09

I have to make multiple sql queries of entire tables, and concatenate them into one big data table.

I have a dictionary where the key is a team name, and the value serves as an acronym where the acronym is the prefix to mySQL data tables

    engine = create_engine('mysql mysqlconnector://%s:%s@%s/%s' % (mysql_user, mysql_password, mysql_host, mysql_dbname), echo=False, pool_recycle=1800)
    
    mysql_conn = engine.connect()
    
    team_dfs = []
    
    nba_dict = {'New York Knicks': 'nyk',
        'Boston Celtics': 'bos',
        'Golden State Warriors': 'gsw',
        'New York Knicks': 'nyk'}
    
        for name, abbr in nba_dict.items()
    
            query = f'''
            SELECT *
            from {abbr}_record
            '''
            df = pd.read_sql_query(query, mysql_conn)

            df['team_name'] = name
    
            team_dfs.append(df)
    
    team_dfs = pd.concat(team_dfs)

Is there a better way to refactor this code and make it more efficient?

CodePudding user response:

If your nba_dict is fixed, you can use UNION to manually combine the result table of SQL.

abbr = list(nba_dict.values())

query = f'''
SELECT *
from {abbr[0]}_record UNION
SELECT *
from {abbr[1]}_record UNION
SELECT *
from {abbr[2]}_record UNION
SELECT *
from {abbr[3]}_record UNION
'''
df = pd.read_sql_query(query, mysql_conn)
df['team_name'] = list(nba_dict.keys())

CodePudding user response:

Your database layout, with a separate table for each team, is doomed to inefficiency whenever you need to retrieve data for more than one team at a time. You would be much much better off putting all that data in one table, giving the table a column mentioning the team associated with each row.

Why inefficient? More tables: more work. And, more queries: more work.

I suggest you push back, hard, on the designer of this database table structure. Its design is, bluntly, wrong.

If you must live with this structure, I suggest you create the following view. It will fake the single-table approach and give you your "gold layer". You get away with this because pro sports franchises don't come and go that often. You do this just once in your database.

CREATE OR REPLACE VIEW teams_record AS
SELECT 'nyk' team, * FROM nyk_record
UNION ALL
SELECT 'bos' team, * FROM bos_record
UNION ALL
SELECT 'gsw' team, * FROM gsw_record
UNION ALL .... the other teams

Then you can do SELECT * FROM teams_record ORDER BY team to get all your data.

  • Related