Make GoogleBaseHook credentials functions public #25785

feluelle
Member

Currently the get_credentials functions in GoogleBaseHook are private, even though their interfaces must not change because other hooks reference them. To make this clear (and correct) for users of these functions, we should change their access level to public.

- change _get_credentials to be public
- change _get_credentials_and_project_id to be public
@feluelle feluelle requested a review from turbaszek as a code owner August 18, 2022 07:21
@boring-cyborg boring-cyborg bot added provider:cncf-kubernetes Kubernetes (k8s) provider related issues area:providers provider:google Google (including GCP) related issues labels Aug 18, 2022
@feluelle feluelle marked this pull request as draft August 18, 2022 12:55
@feluelle feluelle marked this pull request as ready for review August 18, 2022 13:07
@feluelle feluelle requested a review from potiuk August 19, 2022 05:06
@uranusjr
Member

For RM: This is a breaking change since the old underscore-prefixed functions are no longer available. The next provider release needs to bump the major version.

@uranusjr uranusjr merged commit 6e66dd7 into apache:main Aug 19, 2022
@feluelle feluelle deleted the fix/google-base-hook-credentials-functions-access branch August 19, 2022 10:08
@kaxil
Member

kaxil commented Aug 19, 2022

For RM: This is a breaking change since the old underscore-prefixed functions are no longer available. The next provider release needs to bump the major version.

Actually, we can add backwards-compatible support for it for a couple of releases. Thoughts?

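import warnings

def _get_credentials(self):
    # Deprecated alias: warn, then delegate to the public method.
    warnings.warn("Use get_credentials() instead.", DeprecationWarning, stacklevel=2)
    return self.get_credentials()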

@feluelle
Member Author

@uranusjr @kaxil I don't think this is a breaking change for users. We have fixed all references of this private function. No one should ever use the private function from outside the provider. We should not guarantee private interface compatibility in my opinion. Otherwise we would have to do it for every private function we change.

@potiuk
Member

potiuk commented Aug 19, 2022

@uranusjr @kaxil I don't think this is a breaking change for users. We have fixed all references of this private function. No one should ever use the private function from outside the provider. We should not guarantee private interface compatibility in my opinion. Otherwise we would have to do it for every private function we change.

I think @kaxil is right. Technically speaking, _get_credentials() is a protected method, not a private one. Private methods start with double underscores (__). And GoogleBaseHook is almost by definition a base class for multiple hooks, so if someone created TheirOwnGoogleHook, using _get_credentials() is not only allowed but actually very likely. I do believe we need a back-compat _get_credentials() as @kaxil suggested.
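
To make the subclassing scenario concrete, here is a minimal sketch. BaseHookSketch and its placeholder return value are hypothetical stand-ins, not the real GoogleBaseHook; only the method names and the TheirOwnGoogleHook name come from this thread. It shows a user subclass that still calls the old protected name and keeps working through a deprecated alias.

```python
import warnings


class BaseHookSketch:
    """Hypothetical stand-in for a base hook; not the real GoogleBaseHook."""

    def get_credentials(self):
        # Placeholder value; a real hook would build credentials here.
        return "fake-credentials"

    def _get_credentials(self):
        # Deprecated alias kept so existing subclasses keep working.
        warnings.warn(
            "_get_credentials is deprecated; use get_credentials instead.",
            DeprecationWarning,
            stacklevel=2,
        )
        return self.get_credentials()


class TheirOwnGoogleHook(BaseHookSketch):
    """A user-defined subclass that still calls the old protected name."""

    def fetch(self):
        return self._get_credentials()


def call_old_name():
    """Call the old name and return (result, list of warning categories)."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        result = TheirOwnGoogleHook().fetch()
    return result, [w.category for w in caught]
```

With the alias in place the subclass gets its credentials plus a DeprecationWarning; without it, the same call would fail with an AttributeError after the rename.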

@feluelle
Member Author

TIL. For me, a single underscore always meant private. Okay, then I will definitely change my way of writing code 😅 i.e. having so many "internal" functions doesn't make sense to me.

We can add such a function if you think we should provide backwards-compatibility.

@potiuk
Member

potiuk commented Aug 20, 2022

Yeah, I think we should add it.

Context matters. A lot. I think we cannot say 'renaming protected class is always breaking' or 'renaming protected class is always not breaking'. It very much depends on what we are renaming.

As I explained several times before, my view on deciding whether a change is breaking or not is never 0-1. It should be based on our assessment of how likely it is to seriously break the workflows of a number of our users.

@ashb posted it several times already but this explains it very well:

[Image: workflow_2x.png]

Python is really a dynamic language and we do not have fixed 'APIs' that would let us always know with 100% certainty that we are introducing a breaking change. We have a bit of dualism (as in the case of dependencies): Airflow is both an application (when it comes to Airflow core) and a library intended to be used by our users (in the case of DAG authoring). And breaking/not breaking has a different meaning in those two cases.

There are really no true 'private' methods in Python; the underscores (even the double ones) are really a convention and an expression of intention. In this case, since this is a public base Hook class and the intention is for users to extend the base class, the single underscore signals an intention for it to be used in classes that users create. And my assessment in this case is that the likelihood of breaking users' workflows is high. Hooks are definitely part of our 'public' APIs and in this case protected classes are intended for our users to use.
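
The point that underscores are only a convention can be seen in a short sketch. The Example class and its method names are hypothetical, chosen only for illustration.

```python
class Example:
    """Hypothetical class used only to show Python's underscore conventions."""

    def _protected(self):
        return "single underscore"

    def __private(self):
        return "double underscore"


obj = Example()

# A single underscore is purely a naming convention; outside callers can use it.
single = obj._protected()

# A double underscore triggers name mangling: the attribute is stored as
# _Example__private, which is still reachable from outside the class.
mangled = obj._Example__private()

# The unmangled name does not exist on the instance.
has_plain_name = hasattr(obj, "__private")
```

Name mangling exists mainly to avoid accidental name clashes in subclasses, so neither form enforces access control.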

But in the case of classes that are not intended to be extended and are purely Airflow internal, and not part of the real user APIs, this is a bit of a different story. There, protected methods are for ourselves, the maintainers and developers of Airflow, not for our users. And the intention is that if the class is later extended (internally in Airflow core only), those classes should be able to use it. And we have full control over where the method is used because it should only be done in Airflow and never in users' code.

In such a case, it is less likely that someone bases their workflow on those, and even if they do, they are clearly not doing what was intended. And in this case I'd say renaming a protected class is not a breaking change.

Sorry for such a long comment, but I think it is really important to understand that breaking/not breaking is not as clear a 0-1 decision in the case of Airflow as you might think.

BTW. When we split providers into separate repos, I have a dream (I am thinking hard about the various cases it involves) that we will be able to make a much clearer distinction and introduce much more explicit and clear APIs that will separate Airflow developers from Airflow users and make such breaking/not breaking decisions much easier. This is, IMHO, a bit of a necessity to keep the providers in check when they are separated, and I hope we will be able to pull that one off.

@potiuk
Member

potiuk commented Aug 20, 2022

Wrong image :) but now the right one is there :)

@feluelle
Member Author

I totally agree that breaking changes are not always easy to identify, but in my opinion we are on a much safer side when we define public as the only user-exposed access level and treat internal and private as not exposed.

I personally have never used any private or internal function from an external Python library before. If I did, I would comment it and pin the version. And I truly believe that this is and should be the standard.

Also, I see "internal" as being internal to the library, not potentially external.

What is the problem with using public instead of internal?

I think it makes sense to move the discussion to the mailing list.

@feluelle
Member Author

I will open a discussion shortly.

@potiuk
Member

potiuk commented Aug 20, 2022

Btw, I have nothing against making the method public. That is an even stronger indication of 'use this method as you will', but also of 'use it from outside of the class', which is the intention of making a method public. In this case, it is likely that the call might originate outside the class, for example when you use the hook in a task-flow task.

It's just the breaking effect that worries me, and I think we should add a method to deprecate the old usage.
