Connection used in Airflow DAG is not providing _decrypted_ password or extra - DAG authoring issue


I'm extending the Airflow 2.2.0 image and trying to use a Connection in a DAG to make GET requests with a custom Hook. However, regardless of what I try, and after following every suggestion I could find, the DAG does not seem to get a decrypted version of the password and extra fields.

The DAG's log output shows the connection info properly (redacted, as the source code indicates it should be when printed), but using the connection or printing its fields never seems to yield the decrypted values.

I'm redacting all sensitive/personal info as [var_name] below.

I've tried getting the details directly within a PythonOperator of a DAG:

from airflow.hooks.base import BaseHook

conn = BaseHook.get_connection(conn_id=[conn_id])
print(conn.password)
print(conn.extra)

and from within a custom hook that is imported:

# inside the PythonOperator
with JwtHook(conn_id=[conn_id]) as api:
    result = api.get_metadata()
    print(result)


# with JwtHook partially implemented as (again, the majority of the code is redacted):
...
def get_conn(self):
    """
    Returns the connection used by the hook for querying data.
    Should in principle not be used directly.
    """

    # Create the session first if it has not been initiated yet
    if self._session is None:
        # Fetch config for the given connection (host, login, etc).
        config = self.get_connection(self._conn_id)
        print(config.extra)        
        print(config.extra_dejson)

        if not config.host or not self._extras.get("token_host"):
            raise ValueError(
                f"No (token)host specified in connection {self._conn_id}"
            )

        self._login = (config.login, config.password)
        print(self._login)

       ...

...

The task log then shows:

{base.py:79} INFO - Using connection to: id: [conn_id]. Host: [url], Port: None, Schema: None, Login: [username], Password: ***, extra: {'token_host': '***', 'user_id': ''}
{logging_mixin.py:109} INFO - {"token_host": "***","user_id": ""}
{logging_mixin.py:109} INFO - {'token_host': '***', 'user_id': ''}
{logging_mixin.py:109} INFO - ([username], '***')
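
As an aside on the two extra prints above: extra is the raw JSON string stored on the connection, while extra_dejson parses it into a dict, which is why one log line uses double quotes (JSON) and the other single quotes (Python dict repr). A minimal sketch of the difference, assuming a hypothetical connection id my_conn_id:

from airflow.hooks.base import BaseHook

conn = BaseHook.get_connection("my_conn_id")  # hypothetical conn_id
print(type(conn.extra))          # <class 'str'>  - raw JSON string as stored
print(type(conn.extra_dejson))   # <class 'dict'> - parsed copy for key access
token_host = conn.extra_dejson.get("token_host")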

Things I've tried, followed, and implemented after exhausting the internet:

  • set up a shared Fernet key across containers (see the key-generation sketch after this list)
    # set up for all containers in docker-compose
    environment:
      &airflow-common-env
      AIRFLOW__CORE__FERNET_KEY: ${_AIRFLOW__CORE__FERNET_KEY}
    
    That the key is shared is confirmed by running
    docker-compose run airflow-[containername] airflow config get-value core fernet_key
    
    for all containers.
  • I've created the Connection both through the web UI and through the command line. Either way, the web UI shows the Connection with all details, and airflow connections get [conn_id] outputs it identically, with all secrets decrypted:
    id | conn_id   | conn_type | description           | host  | schema | login      | password   | port | is_encrypted | is_extra_encrypted | extra_dejson                           | get_uri
    1  | [conn_id] | HTTP      | Custom API connection | [url] | None   | [username] | [password] | None | True         | True               | {'token_host': [url], 'user_id': [id]} | [uri]
    
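For completeness, a key like the one shared via AIRFLOW__CORE__FERNET_KEY can be generated with the cryptography package, as the Airflow docs describe (a minimal sketch):

from cryptography.fernet import Fernet

# Prints a url-safe base64-encoded 32-byte key, suitable for AIRFLOW__CORE__FERNET_KEY
print(Fernet.generate_key().decode())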

Any help getting this sorted would be great. I just can't pin down the issue, and potentially I am using Airflow incorrectly. Interestingly, I'm not getting any errors. I was wondering whether the DAG executor has the same rights as the creator of the Connection, but I can't find a way to confirm that suspicion.

Any help is much appreciated. Thanks for reading in advance!

[EDIT] So, I hadn't tried test-running the DAG from the CLI - and what do you know - it works! Instead, I was triggering the DAG manually from the UI, but because I only have one (Admin) user, I didn't suspect that it wouldn't work. So it is clear to me this is some kind of authoring issue. Does anyone have pointers on how to set this up properly? Thanks!
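
For reference, the CLI test run that worked was along these lines (placeholders in brackets, matching the redaction above):

# run a single task instance without registering it in the database
airflow tasks test [dag_id] [task_id] [execution_date]

# or run the whole DAG for a given date
airflow dags test [dag_id] [execution_date]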

CodePudding user response:

This is expected. The *** in the logs is just a sign that your connection secret was retrieved correctly. Airflow supports automatic masking of secrets: if you accidentally print such a secret to the log, it will be automatically replaced with "***". The fact that you see *** in the logs means that you did get it decrypted.

See https://airflow.apache.org/docs/apache-airflow/stable/security/secrets/mask-sensitive-values.html
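
A quick way to convince yourself (a minimal sketch, assuming a hypothetical conn_id my_conn_id): print something derived from the secret rather than the secret itself, and note that only the literal value is masked in the log output.

from airflow.hooks.base import BaseHook

conn = BaseHook.get_connection("my_conn_id")  # hypothetical conn_id
print(conn.password)            # shows as *** in the task log (masked on output)
print(len(conn.password))       # prints the real length: the variable holds the decrypted value
print(conn.password == "***")   # False - masking happens in the logging layer, not on the value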
