For instance, if I have a Python function that creates a MySQL engine using SQLAlchemy and ingests data into a table, I would just create a PythonOperator and connect it to that callable.
What is the reason I would prefer to use a MySqlOperator over having the process contained in a PythonOperator? What are the pros? The cons?
CodePudding user response:
MySqlOperator is designed so that you will just provide the SQL:
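from airflow.providers.mysql.operators.mysql import MySqlOperator

# Only the SQL is supplied; the operator takes care of the connection.
drop_table_mysql_task = MySqlOperator(
    task_id='drop_table_mysql', sql="""DROP TABLE table_name;""", dag=dag
)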
The operator already handles everything for you: it connects through a connection configured in Airflow and runs the statement. You don't need to create a MySQL engine or even know what SQLAlchemy is.
PythonOperator exists for executing arbitrary code for which it doesn't make sense to create a custom operator.
Yes, you can do everything with PythonOperator if you prefer.
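To make the trade-off concrete, here is a minimal sketch of the kind of callable a PythonOperator would wrap. The function name `ingest_rows`, the `events` table, and its columns are hypothetical, and the standard library's `sqlite3` stands in for the MySQL/SQLAlchemy engine so the sketch runs without a database server. The point is only that the callable itself has to open, commit, and close the connection.

```python
import sqlite3


def ingest_rows(db_path, rows):
    """Hypothetical callable for PythonOperator(python_callable=...).

    sqlite3 is a stand-in for a MySQL engine (assumption for this sketch).
    """
    conn = sqlite3.connect(db_path)  # the callable opens the connection itself
    try:
        conn.execute(
            "CREATE TABLE IF NOT EXISTS events (id INTEGER PRIMARY KEY, name TEXT)"
        )
        conn.executemany("INSERT INTO events (id, name) VALUES (?, ?)", rows)
        conn.commit()  # the callable is also responsible for committing
        count = conn.execute("SELECT COUNT(*) FROM events").fetchone()[0]
    finally:
        conn.close()  # ...and for cleaning up, even on failure
    return count


# In a DAG this would be wired up roughly as follows (not executed here):
#   PythonOperator(task_id="ingest_rows", python_callable=ingest_rows,
#                  op_kwargs={"db_path": "...", "rows": [...]}, dag=dag)
```

With MySqlOperator, the connection handling in this function is what you no longer have to write; with PythonOperator, you keep full control of it, but you also own it.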
From your description, it looks like you prefer to write scripts and just schedule them with Airflow, which turns Airflow into a cron-like system. This is a pity, because it means you are not leveraging the power of the tool.