aws_conn_id in RDS operators (e.g., RdsStartExportTaskOperator, RdsCreateDbSnapshotOperator) is ignored and falls back to aws_default in apache-airflow-providers-amazon>=9.6.0 #50766
Labels
area:providers
kind:bug
needs-triage
Apache Airflow Provider(s)
amazon
Versions of Apache Airflow Providers
apache-airflow-providers-amazon>=9.6.0
Apache Airflow version
2.10.1
Operating System
MWAA
Deployment
Amazon (AWS) MWAA
Deployment details
No response
What happened
In apache-airflow-providers-amazon>=9.6.0, when using RDS operators such as RdsStartExportTaskOperator or RdsCreateDbSnapshotOperator, the aws_conn_id parameter provided during operator instantiation is not being honored. Instead, the operator attempts to use the aws_default connection.
This behavior appears to be a regression from earlier versions of the provider (e.g., 8.x.x series) where the specified aws_conn_id was correctly used.
The issue seems to stem from RdsBaseOperator's __init__ method (from which the other RDS operators inherit). In version 9.6.0 of airflow/providers/amazon/aws/operators/rds.py, aws_conn_id is declared as an explicit keyword parameter in the RdsBaseOperator.__init__ signature, which captures it when it is passed in kwargs. Consequently, when super().__init__(*args, **kwargs) is called to initialize the parent AwsBaseOperator, aws_conn_id is no longer present in kwargs. AwsBaseOperator then falls back to its own default for its aws_conn_id parameter, which is "aws_default", and the hook used by the operator incorrectly uses this default connection.
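A simplified sketch of the pattern described above (illustrative stand-in classes only, not the exact provider source):

```python
class AwsBaseOperator:
    # Simplified stand-in for the real AwsBaseOperator.
    def __init__(self, *, aws_conn_id: str | None = "aws_default", **kwargs):
        # When the subclass has already consumed aws_conn_id from kwargs,
        # this default applies instead of the user-supplied value.
        self.aws_conn_id = aws_conn_id


class RdsBaseOperator(AwsBaseOperator):
    # Simplified stand-in for the real RdsBaseOperator.
    def __init__(self, *args, aws_conn_id: str | None = "aws_default", **kwargs):
        # Declaring aws_conn_id here captures it out of **kwargs, so the
        # parent never receives the value the user passed in.
        super().__init__(*args, **kwargs)


op = RdsBaseOperator(aws_conn_id="my_custom_conn")
print(op.aws_conn_id)  # prints "aws_default", not "my_custom_conn"
```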
What you think should happen instead
The RDS operator should use the specific AWS connection ID provided in its aws_conn_id parameter. If a user specifies aws_conn_id="my_custom_conn", the operator should use the Airflow connection named my_custom_conn, not aws_default.
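Illustratively (the identifiers here are placeholders), the expectation is:

```python
from airflow.providers.amazon.aws.operators.rds import RdsCreateDbSnapshotOperator

op = RdsCreateDbSnapshotOperator(
    task_id="snap",
    db_type="instance",
    db_identifier="my-db",
    db_snapshot_identifier="my-snap",
    aws_conn_id="my_custom_conn",
)
# Expected: the operator keeps the user-supplied connection id.
assert op.aws_conn_id == "my_custom_conn"  # fails on >=9.6.0 per this report
```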
How to reproduce
Prerequisites:
Airflow version 2.10
apache-airflow-providers-amazon==9.6.0 (or any subsequent version where this issue persists).
Airflow Connections:
In the Airflow UI, create an AWS connection named my_custom_rds_conn. The actual credentials can be placeholders for this reproduction; the existence of the named connection is key.
Ensure that there is no Airflow connection named aws_default. (Alternatively, if aws_default must exist for other reasons, ensure its credentials would obviously fail or be different from my_custom_rds_conn for the RDS operation).
Minimal DAG:
Create and run a DAG along the lines of the sketch below:
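A minimal reproduction sketch (the dag_id and snapshot identifier are placeholders; the task_id, db_identifier, and connection name match the steps above and the observations below):

```python
from datetime import datetime

from airflow import DAG
from airflow.providers.amazon.aws.operators.rds import RdsCreateDbSnapshotOperator

with DAG(
    dag_id="rds_conn_id_bug_test",  # placeholder name
    start_date=datetime(2025, 1, 1),
    schedule=None,
    catchup=False,
) as dag:
    create_db_snapshot_bug_test = RdsCreateDbSnapshotOperator(
        task_id="create_db_snapshot_bug_test",
        db_type="instance",
        db_identifier="my-dummy-db-id",
        db_snapshot_identifier="my-dummy-snapshot",  # placeholder
        # This connection should be used, but >=9.6.0 falls back to
        # aws_default instead.
        aws_conn_id="my_custom_rds_conn",
    )
```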
Observe:
Trigger the DAG.
The create_db_snapshot_bug_test task is expected to fail (as the DB identifier is a dummy).
Inspect the logs for the failed task instance.
Expected vs. Actual Outcome:
Expected (if the bug were not present): If my_custom_rds_conn were actually used, the error would relate to the RDS operation itself under the (placeholder) credentials from my_custom_rds_conn, e.g. "DB instance my-dummy-db-id not found".
Actual (due to the bug): The task fails with an error indicating that the connection aws_default could not be found (e.g., airflow.exceptions.AirflowNotFoundException: The conn_id `aws_default` isn't defined), or an equivalent Boto3/AWS SDK error if it tries the default credential chain after failing to find aws_default. This demonstrates that the operator ignored aws_conn_id="my_custom_rds_conn" and attempted to use aws_default.
Anything else
No response
Are you willing to submit PR?
Code of Conduct