Unify DAG schedule args and change default to None #41453

Merged Aug 26, 2024 (3 commits)

@uranusjr uranusjr commented Aug 14, 2024

The arguments 'schedule_interval' and 'timetable' are removed from both the DAG class and the @dag decorator.

The default value of the 'schedule' argument (on both entities) is changed to None (i.e. a DAG will not have a schedule by default).

The 'timetable' attribute still exists on DAG, and is now the only value that reflects the DAG's schedule. The 'schedule_interval' attribute is removed from DAG.
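
As a rough illustration of what the argument change means for existing DAG definitions, here is a minimal sketch that normalizes a dict of DAG keyword arguments to the unified form. The helper name `unify_schedule_kwargs` is hypothetical and not part of Airflow; it assumes that values previously passed as 'schedule_interval' or 'timetable' can be passed unchanged as 'schedule', and that callers who passed nothing want to keep the old implicit daily schedule.

```python
from datetime import timedelta

# Hypothetical helper for illustration only; it is not part of Airflow.
_LEGACY_KEYS = ("schedule_interval", "timetable")
# Assumption: the implicit default before this change was one day.
_OLD_IMPLICIT_DEFAULT = timedelta(days=1)


def unify_schedule_kwargs(kwargs: dict) -> dict:
    """Return a copy of DAG keyword arguments that only uses ``schedule``."""
    result = dict(kwargs)
    given = [key for key in (*_LEGACY_KEYS, "schedule") if key in result]
    if len(given) > 1:
        raise ValueError(f"conflicting schedule arguments: {given}")
    if not given:
        # Nothing was passed: pin the old implicit default explicitly,
        # because the new default is None (no schedule).
        result["schedule"] = _OLD_IMPLICIT_DEFAULT
    elif given[0] != "schedule":
        # Move the legacy argument's value to the unified name.
        result["schedule"] = result.pop(given[0])
    return result
```

An explicit `schedule=None` passes through untouched, since its meaning does not change.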

The 'schedule_interval' field on DagModel, which used to store a string representation of the DAG attribute of the same name, is now replaced by timetable_summary, which should (mostly?) work the same as before. We can fix minor UI differences as we go. Some use cases that relied on that field are also changed to use other fields instead (dataset_expression, for example, can be used to check whether a DAG is dataset-triggered).

The API field 'schedule_interval' has also been removed since that field no longer exists in the database. This has some side effects. The field previously contained some type information for delta values, but the replacement is a simple string. It is unclear if anyone really needs the information, but we can always bring it back if needed.
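
For API clients that need to work against both old and new servers, a best-effort reader could look like the sketch below. The function name `schedule_text` is hypothetical, and the payload shapes are assumptions: the new field is taken to be a plain string under 'timetable_summary', and the old 'schedule_interval' is taken to be either null or an object with a `__type` discriminator (for example `TimeDelta` or `CronExpression`).

```python
from typing import Any, Optional


def schedule_text(dag: dict[str, Any]) -> Optional[str]:
    """Best-effort display string for a DAG payload (hypothetical helper)."""
    # Assumed new shape: a plain string under "timetable_summary".
    summary = dag.get("timetable_summary")
    if summary is not None:
        return str(summary)

    # Assumed old shape: null, or an object tagged with "__type".
    legacy = dag.get("schedule_interval")
    if legacy is None:
        return None
    if isinstance(legacy, dict):
        kind = legacy.get("__type")
        if kind == "CronExpression":
            return str(legacy.get("value"))
        # Delta-like objects: keep only the non-zero components.
        fields = {k: v for k, v in legacy.items() if k != "__type" and v}
        parts = ", ".join(f"{k}={v}" for k, v in fields.items())
        return f"{kind}({parts})"
    return str(legacy)
```

Note that only the old shape lets a client recover the delta components as numbers; with the string summary, that type information is gone, which is the side effect described above.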

Close #24842

@uranusjr uranusjr added the airflow3.0:breaking Candidates for Airflow 3.0 that contain breaking changes label Aug 14, 2024
@boring-cyborg boring-cyborg bot added area:API Airflow's REST/HTTP API area:CLI area:providers area:Scheduler including HA (high availability) scheduler area:serialization area:UI Related to UI/UX. For Frontend Developers. area:webserver Webserver related Issues provider:fab labels Aug 14, 2024
@kaxil kaxil added this to the Airflow 3.0.0 milestone Aug 14, 2024
@uranusjr uranusjr force-pushed the remove-schedule-interval branch 2 times, most recently from 961462c to 53391bd Compare August 15, 2024 02:46
@uranusjr uranusjr changed the title Remove schedule_interval and timetable DAG args Unify DAG schedule args and change default to None Aug 15, 2024
@uranusjr uranusjr force-pushed the remove-schedule-interval branch 9 times, most recently from 2ab573d to f8202f8 Compare August 15, 2024 20:00
@uranusjr uranusjr marked this pull request as ready for review August 15, 2024 20:33
@Lee-W Lee-W self-requested a review August 21, 2024 07:55

@Lee-W Lee-W left a comment

One small nitpick. Everything else looks good to me, but I'm not an expert in the area. It would be great to have a second set of eyes.

newsfragments/24842.significant.rst (review thread resolved)

Lee-W commented Aug 21, 2024

If we're adding migration rules for this PR to #41641, we could probably cover the following two?

  • schedule_interval -> timetable_summary
  • schedule=NOTSET, schedule=None -> schedule=timedelta(days=1),

uranusjr commented Aug 22, 2024

> schedule_interval -> timetable_summary

These two are sort of interchangeable for display purposes, but not if you want to use the value for something else. For example:

with DAG(schedule=timedelta(days=2)) as d1:
    ...

# This cannot be changed to timetable_summary.
# d1.timetable would work here, but would fail in other use cases.
with DAG(schedule=d1.schedule_interval) as d2:
    ...

If we’re going the linter (ruff-like) route, this would be a case for showing a linting error without an automated fix available.

> schedule=NOTSET, schedule=None -> schedule=timedelta(days=1),

schedule=NOTSET (i.e. no schedule argument given) should be changed to schedule=timedelta(days=1), but schedule=None should be kept as-is (its behaviour is not changed in 3.0).

@uranusjr uranusjr force-pushed the remove-schedule-interval branch 3 times, most recently from 95b64ea to 91e5d67 Compare August 22, 2024 03:37
The key in the facets is kept for backward compatibility, but it no
longer produces a value in the resulting JSON.

A new key, timetable_summary, is added on Airflow 3 to replace the old
key, similar to how it's done in the REST API.
Linked issue (may be closed by this pull request): Default start_date and schedule_interval to None