Account for change in UTC offset when calculating next schedule #35887
Changes from all commits
d85768e
6768ae7
f042915
3835f1c
a06ddd8
6f7366c
122b152
de92301
@@ -17,7 +17,6 @@
 from __future__ import annotations

 import datetime
-from functools import cached_property
 from typing import TYPE_CHECKING, Any

 from cron_descriptor import CasingTypeEnum, ExpressionDescriptor, FormatException, MissingFieldException
@@ -32,21 +31,32 @@
 from pendulum import DateTime


-def _is_schedule_fixed(expression: str) -> bool:
-    """Figures out if the schedule has a fixed time (e.g. 3 AM every day).
+def _covers_every_hour(cron: croniter) -> bool:
+    """Check whether the given cron runs at least once an hour.

-    :return: True if the schedule has a fixed time, False if not.
+    This indicates whether we need to implement a workaround for (what I call)
+    the "fold hour problem". Folding happens when a region switches time
+    backwards, usually as a part of ending a DST period, causing a block of time
+    to occur twice in the wall clock. This is indicated by the ``fold`` flag on
+    datetime.

-    Detection is done by "peeking" the next two cron trigger time; if the
-    two times have the same minute and hour value, the schedule is fixed,
-    and we *don't* need to perform the DST fix.
+    As an example, Switzerland in 2023 ended DST at 3am (wall clock time, UTC+2)
+    by dialing back the clock to 2am (UTC+1). So for (say) ``30 * * * *``, if
+    the last run was 2:30am (UTC+2), the next needs to be 2:30am (UTC+1, folded)
+    instead of 3:30am.

-    This assumes DST happens on whole minute changes (e.g. 12:59 -> 12:00).
+    While this technically happens for all runs (in such a timezone), we only
+    really care about runs that happen at least once an hour, and can
+    provide a somewhat reasonable rationale to skip the fold hour for things
+    such as ``*/2`` (every two hours). So we try to *minimally* peek into croniter
+    internals to work around the issue.
Comment on lines +48 to +52
Suggested change
Mainly, this suggestion (1) tries to clarify "happens for all runs" vs. "happens for all cron schedules": a given run is just a single instant in time, but this is more about the schedule, i.e. computing the next run from this run; and (2) suggests just stating the rationale, rather than saying "we can provide" one without actually providing it.
+
+    The check is simple since croniter internally normalizes things to ``*``.
+    More edge cases can be added later as needed.
+
+    See also: https://github.com/kiorky/croniter/issues/56.
     """
-    cron = croniter(expression)
-    next_a = cron.get_next(datetime.datetime)
-    next_b = cron.get_next(datetime.datetime)
-    return next_b.minute == next_a.minute and next_b.hour == next_a.hour
+    return cron.expanded[1] == ["*"]


 class CronMixin:
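For readers without croniter at hand, the idea behind "runs at least once an hour" can be sketched in plain Python. This is only an illustration, not the PR's implementation: `expand_hour_field` and `covers_every_hour` are hypothetical helpers, the parser handles just `*`, single values, ranges, lists and `/step`, and it compares expanded hour sets instead of relying on croniter normalizing the field to `*`.

```python
from __future__ import annotations


def expand_hour_field(field: str) -> set[int]:
    """Expand a cron hour field (e.g. ``*``, ``*/2``, ``0-23``, ``1,13``) into hours.

    Hypothetical helper for illustration; it does not validate its input.
    """
    hours: set[int] = set()
    for part in field.split(","):
        base, _, step_text = part.partition("/")
        step = int(step_text) if step_text else 1
        if base == "*":
            start, end = 0, 23
        elif "-" in base:
            low, high = base.split("-")
            start, end = int(low), int(high)
        else:
            start = int(base)
            # "5/2" means "from 5 in steps of 2"; a bare "5" is just that hour.
            end = 23 if step_text else start
        hours.update(range(start, end + 1, step))
    return hours


def covers_every_hour(expression: str) -> bool:
    """Return True if a five-field cron expression fires in every hour of the day."""
    fields = expression.split()
    if len(fields) != 5:
        raise ValueError(f"expected five cron fields, got {len(fields)}")
    # Field order is: minute, hour, day of month, month, day of week.
    return expand_hour_field(fields[1]) == set(range(24))
```

With this sketch, `30 * * * *` and `0 0-23 * * *` count as covering every hour, while `0 */2 * * *` and `0 3 * * *` do not, which matches the cases the docstring singles out.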
@@ -91,18 +101,12 @@ def validate(self) -> None:
         except (CroniterBadCronError, CroniterBadDateError) as e:
             raise AirflowTimetableInvalid(str(e))

-    @cached_property
-    def _should_fix_dst(self) -> bool:
-        # This is lazy so instantiating a schedule does not immediately raise
-        # an exception. Validity is checked with validate() during DAG-bagging.
-        return not _is_schedule_fixed(self._expression)
-
     def _get_next(self, current: DateTime) -> DateTime:
         """Get the first schedule after specified time, with DST fixed."""
         naive = make_naive(current, self._timezone)
         cron = croniter(self._expression, start_time=naive)
         scheduled = cron.get_next(datetime.datetime)
-        if not self._should_fix_dst:
+        if not _covers_every_hour(cron):
             return convert_to_utc(make_aware(scheduled, self._timezone))
         delta = scheduled - naive
         return convert_to_utc(current.in_timezone(self._timezone) + delta)
@@ -112,7 +116,7 @@ def _get_prev(self, current: DateTime) -> DateTime:
         naive = make_naive(current, self._timezone)
         cron = croniter(self._expression, start_time=naive)
         scheduled = cron.get_prev(datetime.datetime)
-        if not self._should_fix_dst:
+        if not _covers_every_hour(cron):
             return convert_to_utc(make_aware(scheduled, self._timezone))
         delta = naive - scheduled
         return convert_to_utc(current.in_timezone(self._timezone) - delta)
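To see why `_get_next` adds a delta to the aware time instead of re-localizing the naive result, the Swiss example from the docstring can be worked through with the standard library alone. This is a rough sketch under stated assumptions: fixed-offset zones stand in for the real timezone, the 2023 switch is taken to be 01:00 UTC on 29 October, a plain one-hour step stands in for what croniter would return for `30 * * * *`, and `to_wall_clock` is a hypothetical helper, not part of Airflow or pendulum.

```python
from datetime import datetime, timedelta, timezone

CEST = timezone(timedelta(hours=2), "CEST")  # Swiss summer time, UTC+2
CET = timezone(timedelta(hours=1), "CET")  # Swiss standard time, UTC+1
# Assumption: DST ended at 03:00 wall clock time (UTC+2), i.e. 01:00 UTC.
SWITCH_UTC = datetime(2023, 10, 29, 1, 0, tzinfo=timezone.utc)


def to_wall_clock(instant: datetime) -> datetime:
    """Render an aware instant with the offset in force at that moment."""
    utc = instant.astimezone(timezone.utc)
    return utc.astimezone(CEST if utc < SWITCH_UTC else CET)


# Last run: 02:30 wall clock time, still in summer time (00:30 UTC).
last_run = datetime(2023, 10, 29, 2, 30, tzinfo=CEST)
naive_last = last_run.replace(tzinfo=None)

# Stand-in for croniter on the naive wall clock: the next "30 * * * *" is 03:30.
naive_next = naive_last + timedelta(hours=1)

# Without the fix: 03:30 only exists after the switch, so it is read as UTC+1.
# That is 02:30 UTC, two real hours after the last run.
unfixed = naive_next.replace(tzinfo=CET)

# With the fix: apply the naive delta to the aware time instead.
delta = naive_next - naive_last
fixed = to_wall_clock(last_run + delta)

print("unfixed:", unfixed.isoformat(), "gap:", unfixed - last_run)
print("fixed:  ", fixed.isoformat(), "gap:", fixed - last_run)
```

In this sketch the fixed result lands on the folded 2:30am (UTC+1), one real hour after the last run, whereas re-localizing the naive 3:30am skips the fold hour entirely.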
if it's not used anywhere, and should not be used anywhere, why make a change to it?
ah because you remove _should_fix_dst