Content-Length: 440097 | pFad | https://github.com/apache/airflow/pull/26188

60 Fix edge detection iteration by jedcunningham · Pull Request #26188 · apache/airflow · GitHub
Skip to content

Fix edge detection iteration #26188

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Sep 7, 2022
Merged

Conversation

jedcunningham
Copy link
Member

@jedcunningham jedcunningham commented Sep 7, 2022

In #26175, edge detection was changed to use iteration instead of recursion, however it wasn't keeping track of the parent task.

@boring-cyborg boring-cyborg bot added the area:webserver Webserver related Issues label Sep 7, 2022
@jedcunningham jedcunningham added the changelog:skip Changes that should be skipped from the changelog (CI, tests, etc..) label Sep 7, 2022
@jedcunningham jedcunningham force-pushed the fix_edge_detection_iteration branch from 7a6bb1e to 486b54f Compare September 7, 2022 01:19
Comment on lines -239 to +240
def downstream_list(self) -> Iterable["DAGNode"]:
def downstream_list(self) -> Iterable["Operator"]:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh I don’t think you can do this… Many had tried this before and failed 🙂

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting, get_task is apparently returning an Operator?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes but… well this thing is a mess.

Copy link
Member

@ashb ashb Sep 7, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@uranusjr Does this need changing or is it fine as it is? (It's passing type checks)

tasks_to_trace_next: List[Operator] = []
for task in tasks_to_trace:
for child in task.downstream_list:
edge = (task.task_id, child.task_id)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use node_id here to avoid task and child needing to be an Operator.

@ashb
Copy link
Member

ashb commented Sep 7, 2022

Curious, I wonder why that PR passed it's tests

In apache#26175, edge detection was changed to use iteration instead of
recursion, however it wasn't keeping track of the parent tasks.
This changes the set we iterate over to be `(task, {children})` pairs instead
of just adding all the children tasks themselves.
@ashb ashb force-pushed the fix_edge_detection_iteration branch from 3a5fb97 to cfb7baa Compare September 7, 2022 08:18
@potiuk
Copy link
Member

potiuk commented Sep 7, 2022

@ashb Because the change was only in "www" and "Core" tests" were skipped with "Selective checks" because of that. Apparently the "core tests" as defined now have some implicit dependency on www code. I will trace it and fix it.

@potiuk
Copy link
Member

potiuk commented Sep 7, 2022

That also help to track such cases where either the tests or the code have some implicit dependencies or testing unrelated code.

@potiuk
Copy link
Member

potiuk commented Sep 7, 2022

Yes. In this case there are two solutions (the first is way worse):

  • `test_dag_renderer" could be moved to "tests/www" becasue it mostly tests "dag_edges" which is airflow/www/views.py

or (probably way better solution)

  • dag_edges (and corresponding methods) should be moved to "airlfow/utils" because they are used by more than just views (they are used by both Views and Dot Renderer

I am happy to add PR to do the latter - in this case, this failure simply uncovered an implicit dependency that we missed that should be removed.

@potiuk
Copy link
Member

potiuk commented Sep 7, 2022

Fix (moving dag_edges and task_groups_to_dict in #26212. It depends on this one, so it should be merged after.

@ashb
Copy link
Member

ashb commented Sep 7, 2022

Ahhhh yeah that makes sense. The price we pay for having quicker tests is we sometimes miss things like this. (Totally worth it!)

@ashb
Copy link
Member

ashb commented Sep 7, 2022

This is passing it's tests, if you want to fix up types @uranusjr we can do that in a follow up PR.

@ashb ashb merged commit b7969d4 into apache:main Sep 7, 2022
@ashb ashb deleted the fix_edge_detection_iteration branch September 7, 2022 13:45
potiuk added a commit to potiuk/airflow that referenced this pull request Sep 7, 2022
The methods were implemented in "view" but they were used in other
places (in dot_renderer) so they conceptually belong to common
util code. Having those in a wrong package (airflow/www) caused
the tests to pass because selective checks did nor realise that
change in "airflow/www" also requires running Core tests.

This PR moves the methods to "airflow/utils". The methods
are imported in the "views" module so even if someone used them
from there, they will stil be available there, so the change is
fully backwards compatible (even if those are not "public"
airflow API methods.

Follow up after apache#26188
potiuk added a commit that referenced this pull request Sep 7, 2022
…26212)

The methods were implemented in "view" but they were used in other
places (in dot_renderer) so they conceptually belong to common
util code. Having those in a wrong package (airflow/www) caused
the tests to pass because selective checks did nor realise that
change in "airflow/www" also requires running Core tests.

This PR moves the methods to "airflow/utils". The methods
are imported in the "views" module so even if someone used them
from there, they will stil be available there, so the change is
fully backwards compatible (even if those are not "public"
airflow API methods.

Follow up after #26188
@ephraimbuddy ephraimbuddy added this to the Airflow 2.4.0 milestone Sep 14, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area:webserver Webserver related Issues changelog:skip Changes that should be skipped from the changelog (CI, tests, etc..)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants








ApplySandwichStrip

pFad - (p)hone/(F)rame/(a)nonymizer/(d)eclutterfier!      Saves Data!


--- a PPN by Garber Painting Akron. With Image Size Reduction included!

Fetched URL: https://github.com/apache/airflow/pull/26188

Alternative Proxies:

Alternative Proxy

pFad Proxy

pFad v3 Proxy

pFad v4 Proxy