Event Driven Dags can miss/delay runs when paused #49857
Comments
I believe this is the current behaviour: the DAG must be active when you use event-driven scheduling, and that is at least how I understand "event driven". What you are requesting is new functionality, something like storing the events and processing them later. cc: @vincbeck please correct me. :)
Yeah, the DAG needs to be active.
Thanks for your responses. If I understand correctly, event-driven DAGs should never be paused, and if I need to prevent one from running (e.g. during maintenance of a database) I need to block whatever causes the trigger to return an event (in the case of example_asset_with_watchers, I need to stop whatever creates the /tmp/test file)? Another question: during my tests, I think that at one point, between the time the DAG was paused and the time the trigger stopped watching for the file, /tmp/test was deleted but no asset event was generated. Since I was not able to find which part of the code handles the trigger's returned event and creates the asset event, I could not investigate further. Could you tell me where I can find this code?
I confirm everything that has been said here.
https://github.com/apache/airflow/blob/main/airflow-core/src/airflow/assets/manager.py#L109
Thanks for your response. I've managed to understand what happens when the file is deleted but no asset event is created while pausing the DAG. It seems that the asset_trigger link is deleted, because the DAG is paused, between the moment the trigger yields the event and the moment register_asset_change is called. It may not be considered a bug, since event-driven DAGs should not be paused, but it limits their usage to cases where you can pause the initial event source, or where events don't carry a significant payload and you can either wait for the next event or create a dummy event from the UI after unpausing.
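The ordering described in this comment can be illustrated with a toy model. This is only a sketch of the reported sequence, not Airflow's implementation: `ToyAssetManager`, `pause_dag`, the `links` mapping, and the trigger id `1` are hypothetical, and `register_asset_change` here merely borrows the name mentioned above.

```python
from dataclasses import dataclass, field


@dataclass
class ToyAssetManager:
    """Hypothetical stand-in for scheduler-side bookkeeping; NOT Airflow code."""

    # asset name -> ids of the triggers currently linked to that asset
    links: dict = field(default_factory=dict)
    # asset events that were actually recorded
    asset_events: list = field(default_factory=list)

    def pause_dag(self, asset: str, trigger_id: int) -> None:
        # Assumption: pausing the only consuming DAG drops the asset/trigger link.
        self.links.get(asset, set()).discard(trigger_id)

    def register_asset_change(self, asset: str, trigger_id: int, payload: dict) -> bool:
        # The event is recorded only if the link still exists at this moment.
        if trigger_id not in self.links.get(asset, set()):
            return False
        self.asset_events.append((asset, payload))
        return True


manager = ToyAssetManager(links={"example_asset": {1}})

# 1. The trigger yields an event (the file was seen and deleted).
yielded_event = {"filepath": "/tmp/test"}
# 2. The DAG is paused before the event is registered, so the link disappears.
manager.pause_dag("example_asset", 1)
# 3. Registration arrives too late and the event is dropped.
recorded = manager.register_asset_change("example_asset", 1, yielded_event)
```

In this model the file is already gone by step 1, so nothing remains to re-detect after unpausing, which matches the observation that only a new file or a manually created event produces a run.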
This issue has been automatically marked as stale because it has been open for 14 days with no response from the author. It will be closed in next 7 days if no further activity occurs from the issue author.
This issue has been closed because it has not received response from the issue author.
Apache Airflow version
3.0.0
If "Other Airflow 2 version" selected, which one?
No response
What happened?
Hello,
I was playing with the "example_asset_with_watchers" sample DAG to see how the new event-driven scheduling works, and I noticed that when you pause the DAG there is a short period during which, if a file is created, it is still processed by the FileDeleteTrigger and added to the event list. Since the DAG is paused, no run is planned (so far this is fine), but when the DAG is unpaused this event does not trigger a DAG run until the next time a file is created or you manually create a "fake" asset event.
What you think should happen instead?
Since I don't think it's possible to prevent the event from being created, a DAG run should be launched when the DAG is unpaused, to process the events that were not handled.
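As a rough illustration of the requested behaviour (not something Airflow provides, per the maintainers' comments), events arriving while paused could be held and turned into a single run on unpause. `PausableConsumer` and all of its methods are hypothetical names used only for this sketch.

```python
from collections import deque


class PausableConsumer:
    """Hypothetical model of the requested behaviour; NOT an Airflow API."""

    def __init__(self):
        self.paused = False
        self.pending = deque()  # events received while paused
        self.runs = []          # each run is the list of events it handles

    def on_event(self, event):
        if self.paused:
            # Keep the event instead of losing it.
            self.pending.append(event)
        else:
            self.runs.append([event])

    def pause(self):
        self.paused = True

    def unpause(self):
        self.paused = False
        if self.pending:
            # One catch-up run covers everything received while paused.
            self.runs.append(list(self.pending))
            self.pending.clear()


consumer = PausableConsumer()
consumer.pause()
consumer.on_event("file /tmp/test deleted")
runs_while_paused = list(consumer.runs)
consumer.unpause()
```

Whether held events should produce one run or one run per event is a design choice this sketch does not try to settle.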
How to reproduce
Use the example_asset_with_watchers sample DAG, pause it, and quickly create the /tmp/test file; an event should be created. Then unpause the DAG. The event will remain unprocessed until a new one is triggered.
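A small helper can make the timing of these steps easier to hit. This is a sketch under assumptions: the `airflow` CLI is on the PATH of the machine running it, the file path is visible to the triggerer, and `pause_then_create_file` with its `run` parameter is a hypothetical helper, not part of Airflow.

```python
import subprocess
from pathlib import Path


def pause_then_create_file(
    dag_id="example_asset_with_watchers",
    path="/tmp/test",
    run=subprocess.run,  # injectable so the helper can be exercised without Airflow
):
    # Pause the DAG through the Airflow CLI.
    run(["airflow", "dags", "pause", dag_id], check=True)
    # Create the watched file right away, hoping to land inside the short
    # window in which the trigger is still watching after the pause.
    Path(path).touch()


# Example call (commented out because it needs a running Airflow deployment):
# pause_then_create_file()
```

The window is short, so several attempts may be needed before the event is picked up while the DAG is paused.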
Operating System
debian 12
Versions of Apache Airflow Providers
No response
Deployment
Official Apache Airflow Helm Chart
Deployment details
No response
Anything else?
No response
Are you willing to submit PR?
Code of Conduct