I have a function that returns a table from a SQL database, and I'm wondering what the best way to write it is. This is what I have so far, but some of the code is repeated, so it doesn't follow the DRY principle very well:
from typing import Union

import pandas as pd


class DataExtractor:
    def extract_rds_table(engine, table_name: Union[str, list]):
        """
        Extracts the table from the engine, reads it into a pandas DataFrame and returns it.
        Given multiple table names as input, it returns a dictionary of key/value pairs
        of table_name and DataFrame.

        Parameters
        input: Engine (sqlalchemy.engine)
        input: Table/Tables (str | list)
        Output: pandas DataFrame of the table in question / dictionary of DataFrames
        """
        if(type(table_name) == str):
            pandas_table_extracted = pd.read_sql_table(f'{table_name}', engine)
            return pandas_table_extracted
        elif(type(table_name) == list):
            stored_table_names_and_data = {}
            for table in table_name:
                pandas_table_extracted = pd.read_sql_table(f'{table}', engine)
                stored_table_names_and_data[f'{table}'] = pandas_table_extracted
            return stored_table_names_and_data
CodePudding user response:
Alternately, explicitly remap the input to a known format and always return a dict:
from typing import Union

import pandas as pd


class DataExtractor:
    def extract_rds_table(engine, table_name: Union[str, list]) -> dict:
        """
        Extracts the table(s) from the engine and reads each one into a pandas DataFrame.
        Always returns a dictionary of key/value pairs of table_name and DataFrame,
        even when a single table name is given.

        Parameters
        input: Engine (sqlalchemy.engine)
        input: Table/Tables (str | list)
        Output: Dictionary of DataFrames
        """
        # Normalize the input so the rest of the function only deals with a list.
        if type(table_name) == str:
            table_list = [table_name]
        elif type(table_name) in (list, set, tuple):
            table_list = list(table_name)
        else:
            raise ValueError(f'Unknown table name input type {type(table_name)}.')
        stored_table_names_and_data = {}
        # Iterate over the normalized list, not the raw input: looping over a
        # plain string would yield its characters one at a time.
        for table in table_list:
            pandas_table_extracted = pd.read_sql_table(f'{table}', engine)
            stored_table_names_and_data[f'{table}'] = pandas_table_extracted
        return stored_table_names_and_data
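The "normalize first, then always return a dict" idea can also be pulled out into a small helper so it can be tried without a database. This is only a minimal sketch: `normalize_table_names`, `extract_tables` and `fake_reader` are hypothetical names that do not appear in the question, and the reader callable stands in for `pd.read_sql_table`.

```python
from typing import Callable, Dict, Iterable, List, TypeVar, Union

T = TypeVar("T")


def normalize_table_names(table_name: Union[str, Iterable[str]]) -> List[str]:
    """Return the input as a list of table names (hypothetical helper)."""
    if isinstance(table_name, str):
        # A single name becomes a one-element list.
        return [table_name]
    if isinstance(table_name, (list, tuple, set)):
        return list(table_name)
    raise TypeError(f"Unknown table name input type {type(table_name)}.")


def extract_tables(
    table_name: Union[str, Iterable[str]],
    read_table: Callable[[str], T],
) -> Dict[str, T]:
    """Read every requested table; always return a dict keyed by table name."""
    return {name: read_table(name) for name in normalize_table_names(table_name)}


def fake_reader(name: str) -> str:
    # Stand-in so the sketch runs without a database (assumption for illustration).
    return f"<rows of {name}>"


single = extract_tables("users", fake_reader)
several = extract_tables(["users", "orders"], fake_reader)
```

With pandas, the reader would presumably be something like `lambda name: pd.read_sql_table(name, engine)`. The caller then always indexes the result by table name, whether one table or several were requested.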
CodePudding user response:
Your approach is not that bad, except that you can avoid the extra variables and the iterative dict assignment:
from typing import Union

import pandas as pd


class DataExtractor:
    def extract_rds_table(engine, table_name: Union[str, list]):
        """
        Extracts the table from the engine, reads it into a pandas DataFrame and returns it.
        Given multiple table names as input, it returns a dictionary of key/value pairs
        of table_name and DataFrame.

        Parameters
        input: Engine (sqlalchemy.engine)
        input: Table/Tables (str | list)
        Output: pandas DataFrame of the table in question / dictionary of DataFrames
        """
        if (type(table_name) == str):
            # Single table: return the DataFrame directly.
            return pd.read_sql_table(f'{table_name}', engine)
        elif (type(table_name) == list):
            # Several tables: build the dict in one expression.
            return dict((table, pd.read_sql_table(f'{table}', engine))
                        for table in table_name)
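The `dict(...)` call over a generator of pairs can equally be written as a dict comprehension. The sketch below compares the two spellings. `read_table` is a hypothetical stand-in for `pd.read_sql_table(name, engine)` so that it runs without a database.

```python
def read_table(name: str) -> str:
    # Stand-in for pd.read_sql_table(name, engine); returns a placeholder.
    return f"<rows of {name}>"


table_names = ["users", "orders"]

# dict() over a generator of (key, value) pairs, as in the answer above.
from_pairs = dict((table, read_table(table)) for table in table_names)

# The same mapping written as a dict comprehension.
from_comprehension = {table: read_table(table) for table in table_names}
```

Both forms build the same mapping, so choosing between them is a matter of readability.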