-
Notifications
You must be signed in to change notification settings - Fork 128
feat(bigquery): Integrate Otel in client lib #3747
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
google-cloud-bigquery/src/main/java/com/google/cloud/bigquery/OpenTelemetryHelper.java
Outdated
Show resolved
Hide resolved
google-cloud-bigquery/src/test/java/com/google/cloud/bigquery/it/ITBigQueryTest.java
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the case you are not aware. There are 2 additional locations where we perform network calls in TableDataWriteChannel.java and Job.java.
Yes, I wanted to keep this PR scoped to just the BigQueryImpl file (though with the refactor that has changed). I was planning to include the calls in TableDataWriteChannel.java and Job.java in a separate PR with a few other methods, such as runWithRetries to make it easier to review and reason about. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My biggest worry with this is the attribute structure here. This PR appears to define every field of every of every API as it's own attribute, which means we're more likely to introduce conflicts for users trying to filter based on attributes downstream.
This leads me to two questions:
- should we be putting the full contents of the responses into attributes, vs a more constrained approach (critical fields like identifiers, status, etc)?
- if we do want to capture the API response, should it just be stringified as a single attribute?
|
My 2 cents regarding attributes: On a side note, I think we should also clearly state that "Span names and attributes are subject to change without notice" to allow us the flexibility to change the structure as needed.
|
google-cloud-bigquery/src/main/java/com/google/cloud/bigquery/BigQueryOptions.java
Show resolved
Hide resolved
google-cloud-bigquery/src/main/java/com/google/cloud/bigquery/BigQueryImpl.java
Outdated
Show resolved
Hide resolved
google-cloud-bigquery/src/main/java/com/google/cloud/bigquery/BigQueryImpl.java
Outdated
Show resolved
Hide resolved
google-cloud-bigquery/src/main/java/com/google/cloud/bigquery/TableInfo.java
Outdated
Show resolved
Hide resolved
I've cleaned up the attributes that we are collecting by removing some fields that are not very useful in debugging. I also added in attributes for language and "db.name" to make the tracing more consistent with python. I've updated the span naming pattern to be more descriptive. I think it's a good idea to include "Span names and attributes are subject to change without notice", but that seems like it would better to put in the docs rather than the code. What do you guys think? |
google-cloud-bigquery/src/main/java/com/google/cloud/bigquery/BigQueryImpl.java
Show resolved
Hide resolved
google-cloud-bigquery/src/main/java/com/google/cloud/bigquery/BigQueryImpl.java
Show resolved
Hide resolved
google-cloud-bigquery/src/main/java/com/google/cloud/bigquery/BigQueryImpl.java
Outdated
Show resolved
Hide resolved
google-cloud-bigquery/src/main/java/com/google/cloud/bigquery/BigQueryImpl.java
Show resolved
Hide resolved
google-cloud-bigquery/src/main/java/com/google/cloud/bigquery/BigQueryImpl.java
Show resolved
Hide resolved
google-cloud-bigquery/src/main/java/com/google/cloud/bigquery/BigQueryOptions.java
Show resolved
Hide resolved
google-cloud-bigquery/src/main/java/com/google/cloud/bigquery/DatasetId.java
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for all the changes.
@shollyman , can you take a look at the attributes to see if we missed anything?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I continue to be wary of replicating many of the API response fields in the span attributes, favoring the "less is more" approach, but I can be convinced otherwise.
google-cloud-bigquery/src/main/java/com/google/cloud/bigquery/BigQueryImpl.java
Outdated
Show resolved
Hide resolved
google-cloud-bigquery/src/main/java/com/google/cloud/bigquery/JobId.java
Show resolved
Hide resolved
google-cloud-bigquery/src/main/java/com/google/cloud/bigquery/InsertAllRequest.java
Outdated
Show resolved
Hide resolved
google-cloud-bigquery/src/main/java/com/google/cloud/bigquery/JobInfo.java
Outdated
Show resolved
Hide resolved
google-cloud-bigquery/src/main/java/com/google/cloud/bigquery/BigQueryImpl.java
Show resolved
Hide resolved
I've gone through and removed most attributes that we were collecting before, keeping mainly just the OP identifiers and time information such as creationTime, lastModifiedTime, etc. Let me know if you think we should shave off any more |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did some more reading, two other considerations for these attributes (which may require a lot of callsite touches):
-
Should the attributes be namespaced so it's clear they're fields from the BQ API? Right now they're just single term attributes and could conflict with other telemetry.
-
I believe otel attribute conventions are snake case rather than camel, so we may want to revisit the multi word attributes.
.getOpenTelemetryTracer() | ||
.spanBuilder("com.google.cloud.bigquery.BigQuery.createJob") | ||
.setAllAttributes(jobInfo.getJobId().getOtelAttributes()) | ||
.setAttribute("status", getFieldAsString(jobInfo.getStatus())) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
status is a submessage. Should we just report status fields individually? status.state (pending/running/done), and error presence in the job?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good to me, though I've decided to move those attributes to the Job class instead since during the create() function, job.status() is not yet populated. The field would always be set to null and thus be practically useless for debugging.
getOptions() | ||
.getOpenTelemetryTracer() | ||
.spanBuilder("com.google.cloud.bigquery.BigQuery.listDatasets") | ||
.setAttribute("projectId", projectId) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
minor: worth pulling the current page token into an attribute alongside the project for the list RPCs? It'll be visible in the HTTP request below it as part of the URL.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the page token is supplied as a DatasetListOption, it will already be captured in an attribute here. I agree that adding the page token as an attribute is a good idea, though I think it would be better suited to the http layer rather than the interface.
Attributes attributes = Attributes.builder().build(); | ||
for (Field field : this.getFields()) { | ||
attributes = | ||
attributes.toBuilder().put("Field: " + field.getName(), field.toString()).build(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we want BQ schema in the otel attributes at all?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is likely overkill, I went ahead and removed the schema from the attributes.
I renamed all the attributes to be snake case and gave them namespace prefixes to remove any overloading of attributes. I went with the namespace "bq" and created further namespaces depending on the objects, such as "bq.dataset", "bq.table", "bq.query", etc. Let me know if you think a different namespace style would be better. |
Adds OpenTelemetry tracing into the BigQueryImpl class, enabling in the following methods:
This is one PR of several to fully integrate OTel into the client library, broken up into chunks to make review easier.