feat: Allow windowing in 'partial' ordering mode #861

Merged
merged 7 commits into from
Aug 5, 2024

Conversation

TrevorBergeron
Contributor

Thank you for opening a Pull Request! Before submitting your PR, there are a few things you can do to make sure it goes smoothly:

  • Make sure to open an issue as a bug/issue before writing your code! That way we can discuss the change, evaluate designs, and agree on the general idea
  • Ensure the tests and linter pass
  • Code coverage does not decrease (if any source code was changed)
  • Appropriate docs were updated (if necessary)

Fixes #<issue_number_goes_here> 🦕

@product-auto-label product-auto-label bot added size: s Pull request size is small. api: bigquery Issues related to the googleapis/python-bigquery-dataframes API. labels Jul 25, 2024
@product-auto-label product-auto-label bot added size: l Pull request size is large. and removed size: s Pull request size is small. labels Jul 29, 2024
@TrevorBergeron TrevorBergeron changed the title refactor: Internal flag to permit window ambiguity feat: Allow windowing in 'partial' ordering mode Jul 29, 2024
@TrevorBergeron TrevorBergeron requested a review from tswast July 30, 2024 16:40
@TrevorBergeron TrevorBergeron added the do not merge Indicates a pull request not ready for merge, due to either quality or timing. label Jul 30, 2024
@TrevorBergeron TrevorBergeron marked this pull request as ready for review July 30, 2024 17:22
@TrevorBergeron TrevorBergeron requested review from a team as code owners July 30, 2024 17:22
@@ -377,6 +378,10 @@ def slot_millis_sum(self):
        """The sum of all slot time used by bigquery jobs in this session."""
        return self._slot_millis_sum

    @property
    def allow_ambiguity(self) -> bool:
Collaborator

Does this need to be a public property? I'd prefer to keep it private until we have a request from users to modify it.

Also, I'm not sold on the naming. Should it be specific and mention ambiguous orderings? Or rename to allow all forms of non-determinism?

Contributor Author

Yes, no reason to make this a public property. This is specifically for use in windowing - where ambiguous orderings produce ambiguous row values. Ordering ambiguity is introduced through "partial" ordering mode already.
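To illustrate why ambiguous orderings produce ambiguous row values under windowing, here is a minimal standard-library sketch. It does not use bigframes; the sample rows and the helper names `cumulative_sums` and `valid_orderings` are hypothetical, chosen only for illustration. Two rows tie on the sort key, so more than one row order is consistent with the ordering, and a running sum differs between them.

```python
from itertools import accumulate, permutations

# Hypothetical rows: (sort_key, value). Two rows share sort_key 1, so
# ordering by sort_key alone does not fully determine the row order.
rows = [(1, 10), (1, 20), (2, 30)]


def cumulative_sums(ordered_rows):
    """Running total of the value column in the given row order."""
    return list(accumulate(value for _, value in ordered_rows))


def valid_orderings(rows):
    """Every permutation of rows that is non-decreasing in sort_key."""
    return [
        list(p)
        for p in permutations(rows)
        if all(a[0] <= b[0] for a, b in zip(p, p[1:]))
    ]


# Each valid ordering yields its own windowed result.
results = {tuple(cumulative_sums(o)) for o in valid_orderings(rows)}
print(sorted(results))  # [(10, 30, 60), (20, 30, 60)]
```

Both results are legitimate answers for the same input, which is the kind of non-determinism the flag is meant to permit explicitly.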

raise ValueError("Generating offsets not supported in unordered mode")
if self.node.order_ambiguous and not (self.session._strictly_ordered):
    if not self.session._allows_ambiguity:
        raise ValueError("Generating offsets not supported in unordered mode")
Collaborator

Nit: partial ordered mode

Let's try to be consistent with our naming of the feature.

)
if not self.session._allows_ambiguity:
    raise ValueError(
        "Generating offsets not supported in unordered mode"
Collaborator

Nit: partial ordered mode
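For reference, the guard discussed in these review comments can be sketched in isolation as below. This is a sketch under stated assumptions, not the merged implementation: `FakeSession`, `FakeNode`, and `check_offsets_allowed` are hypothetical stand-ins, and the error message wording is only one way to apply the naming nit. The attribute names `_strictly_ordered`, `_allows_ambiguity`, and `order_ambiguous` are taken from the diff hunks above.

```python
from dataclasses import dataclass


@dataclass
class FakeSession:
    """Hypothetical stand-in for the session; only the two flags matter."""

    _strictly_ordered: bool = True
    _allows_ambiguity: bool = False  # private, as agreed in the review


@dataclass
class FakeNode:
    """Hypothetical stand-in for an expression node."""

    order_ambiguous: bool = False


def check_offsets_allowed(node: FakeNode, session: FakeSession) -> None:
    """Raise unless offsets can be generated for this node and session."""
    if node.order_ambiguous and not session._strictly_ordered:
        if not session._allows_ambiguity:
            # Assumed wording; the final message in the PR is not shown here.
            raise ValueError(
                "Generating offsets not supported in partial ordering mode"
            )
```

With these stand-ins, an ambiguously ordered node passes the check only when the session is strictly ordered or the private flag is set; otherwise the call raises `ValueError`.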

@TrevorBergeron TrevorBergeron removed the do not merge Indicates a pull request not ready for merge, due to either quality or timing. label Aug 2, 2024
@TrevorBergeron TrevorBergeron merged commit ca26fe5 into main Aug 5, 2024
23 checks passed
@TrevorBergeron TrevorBergeron deleted the allow_ambiguous branch August 5, 2024 17:37