STORE_DAG_CODE not displaying correct code for zipped dag files #14412

Closed
prithvisathiya opened this issue Feb 24, 2021 · 6 comments
Labels
area:Scheduler including HA (high availability) scheduler kind:bug This is a clearly a bug pending-response priority:medium Bug that should be fixed before next release but would not block a release

Comments

@prithvisathiya
Apache Airflow version:
1.10.14

Kubernetes version (if you are using kubernetes) (use kubectl version):
1.19.4

Environment:

  • Cloud provider or hardware configuration:
  • OS (e.g. from /etc/os-release):
  • Kernel (e.g. uname -a):
  • Install tools: production docker image
  • Others: python3.7

What happened:
I recently upgraded from 1.10.9 -> 1.10.14 and set both STORE_SERIALIZED_DAGS and STORE_DAG_CODE to True. Everything seems to work as expected, and the scheduler is able to execute the DAG and the tasks within it. However, the "Code View" in the UI displays a very long hex string instead of the actual code.

\x66726f6d206461746574696...
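The string above resembles hex-encoded bytes behind a `\x` prefix, which is the form PostgreSQL uses when it renders binary (`bytea`) data as text. As a rough check, the visible prefix can be decoded back to characters. This is only a sketch: `decode_hex_prefix` is a hypothetical helper name, and it drops the dangling half byte left by the truncation.

```python
def decode_hex_prefix(value: str) -> str:
    """Decode a (possibly truncated) '\\x...'-style hex string to text."""
    # Remove the trailing "..." that marks where the string was cut off.
    hex_digits = value.rstrip(".")
    # Remove the leading "\x" marker if it is present.
    if hex_digits.startswith("\\x"):
        hex_digits = hex_digits[2:]
    # A truncated string may end in half a byte; drop the dangling digit.
    if len(hex_digits) % 2:
        hex_digits = hex_digits[:-1]
    return bytes.fromhex(hex_digits).decode("utf-8", errors="replace")


snippet = r"\x66726f6d206461746574696..."
print(decode_hex_prefix(snippet))  # prints "from dateti"
```

The decoded prefix reads like the start of ordinary Python source, so the stored value appears to be the DAG file's bytes rather than random data.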

I have narrowed the issue down to DAG files stored inside zip files. For example, my_dag.py and my_other_dag.py are displayed incorrectly in the UI with the following layout:

├── my_dag.zip
│   ├── my_dag.py
│   └── utils.py
├── my_other_dag.zip
│   ├── my_other_dag.py

but this is fine:

├── my_dag.py
├── my_other_dag.py
├── utils.py

A quick look at the database shows the same incorrectly stored DAG code in the source_code column of the public.dag_code table. However, the scheduler can execute the DAG, and the data column of the public.serialized_dag table contains the correct code, which suggests the issue is isolated to how the code is stored in the dag_code table.
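The thread does not state the root cause, so the following is only an assumption about one way such a value could arise: reading a member out of a zip archive with Python's `zipfile` module yields `bytes`, not `str`, and bytes written to a text column without decoding may end up in hex form. The sketch uses an in-memory archive and the `import os` line from the expected output as stand-in content.

```python
import io
import zipfile

# Stand-in DAG source; the real file content is not shown in the issue.
source = "import os\n"

# Build a small zip archive in memory; "my_dag.py" mirrors the layout above.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("my_dag.py", source)
buffer.seek(0)

# Reading a zip member returns raw bytes rather than text.
with zipfile.ZipFile(buffer) as archive:
    with archive.open("my_dag.py") as member:
        raw = member.read()

print(type(raw).__name__)   # prints "bytes"
print("\\x" + raw.hex())    # prints "\x696d706f7274206f730a"

# Decoding explicitly recovers the original source text.
text = raw.decode("utf-8")
print(text == source)       # prints "True"
```

If this is what happens, decoding the bytes before storing them would be the natural remedy, but whether the actual fix works this way is not confirmed here.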

What you expected to happen:
I expect the actual Python code to be shown in the UI:

import os
...

How to reproduce it:
Create two similar DAGs, one inside a zip file and one unzipped, and place both in the root dags folder. The UI displays the code for dag2.py incorrectly but displays dag1.py correctly.

├── opt/airflow/dags
│   ├── dag1.py
│   ├── dag2.zip
│   │   └── dag2.py
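The layout above can be generated with a short script. This is a minimal sketch: `build_repro_layout` is a hypothetical helper, the file body is a placeholder that would need to be replaced with a real DAG definition, and a temporary directory stands in for opt/airflow/dags.

```python
import tempfile
import zipfile
from pathlib import Path

# Placeholder file body; substitute a real DAG definition when reproducing.
DAG_SOURCE = "import os\n"


def build_repro_layout(dags_folder: Path) -> None:
    """Create dag1.py (plain file) and dag2.zip (containing dag2.py)."""
    dags_folder.mkdir(parents=True, exist_ok=True)
    # Unzipped DAG file placed directly in the dags folder.
    (dags_folder / "dag1.py").write_text(DAG_SOURCE)
    # Zipped DAG file with the same content at the archive root.
    with zipfile.ZipFile(dags_folder / "dag2.zip", "w") as archive:
        archive.writestr("dag2.py", DAG_SOURCE)


with tempfile.TemporaryDirectory() as tmp:
    dags = Path(tmp) / "dags"
    build_repro_layout(dags)
    print(sorted(p.name for p in dags.iterdir()))
    with zipfile.ZipFile(dags / "dag2.zip") as archive:
        print(archive.namelist())
```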
@prithvisathiya prithvisathiya added the kind:bug This is a clearly a bug label Feb 24, 2021
@boring-cyborg
boring-cyborg bot commented Feb 24, 2021

Thanks for opening your first issue here! Be sure to follow the issue template!

@vikramkoka vikramkoka added area:Scheduler including HA (high availability) scheduler priority:medium Bug that should be fixed before next release but would not block a release labels Mar 2, 2021
@kaxil kaxil added this to the Airflow 2.0.2 milestone Mar 8, 2021
@ashb ashb modified the milestones: Airflow 2.0.2, Airflow 2.0.3 Apr 22, 2021
@ashb
Member

ashb commented May 7, 2021

Could you please test this on Airflow 2.0 and let us know if the problem persists there?

@github-actions
github-actions bot commented Jun 8, 2021

This issue has been automatically marked as stale because it has been open for 30 days with no response from the author. It will be closed in next 7 days if no further activity occurs from the issue author.

@github-actions github-actions bot added the stale Stale PRs per the .github/workflows/stale.yml policy file label Jun 8, 2021
@ashb ashb removed the stale Stale PRs per the .github/workflows/stale.yml policy file label Jun 15, 2021
@kaxil
Member

kaxil commented Jun 22, 2021

Any luck with testing @prithvisathiya with Airflow >= 2?

@kaxil kaxil removed this from the Airflow 2.1.1 milestone Jun 22, 2021
@prithvisathiya
Author

Hey, sorry for the late reply. Recent changes have made me de-prioritize this issue for the last couple of months. If I remember correctly, it was affecting Airflow >= 2 as well, but I will double-check and update here shortly.

@kaxil
Member

kaxil commented Jul 6, 2021

This was fixed by #13984 and released in 2.1.0. Let us know if it is still an issue and we can reopen it.

@kaxil kaxil closed this as completed Jul 6, 2021