feat(bigquery): Integrate Otel in client lib #3747

whuffman36 · 2025-04-08T18:57:58Z

Adds OpenTelemetry tracing to the BigQueryImpl class, enabling it in the following methods:

(Dataset, Table, Job, Routine) Create
(Dataset, Table, Job, Routine, Model, IamPolicy) Get
(Dataset, Table, Job, Routine, Model, IamPolicy) Update
(Dataset, Table, Routine, Model) Delete
List (Datasets, Tables, Jobs, Routines, Models)
Job Cancel
insertAll
testIamMethods
queryRpc
getQueryResults

This is one PR of several to fully integrate OTel into the client library, broken up into chunks to make review easier.

google-cloud-bigquery/src/main/java/com/google/cloud/bigquery/OpenTelemetryHelper.java

google-cloud-bigquery/src/test/java/com/google/cloud/bigquery/it/ITBigQueryTest.java

PhongChuong

In case you are not aware, there are two additional locations where we perform network calls: TableDataWriteChannel.java and Job.java.

whuffman36 · 2025-05-05T23:41:24Z

Yes, I wanted to keep this PR scoped to just the BigQueryImpl file (though with the refactor that has changed). I was planning to include the calls in TableDataWriteChannel.java and Job.java in a separate PR with a few other methods, such as runWithRetries, to make it easier to review and reason about.

shollyman

My biggest worry with this is the attribute structure. This PR appears to define every field of every API as its own attribute, which means we're more likely to introduce conflicts for users trying to filter based on attributes downstream.

This leads me to two questions:

Should we be putting the full contents of the responses into attributes, vs. a more constrained approach (critical fields like identifiers, status, etc.)?
If we do want to capture the API response, should it just be stringified as a single attribute?

whuffman36 · 2025-05-08T00:09:08Z

I'm of the opinion that since tracing is opt-in and meant to make debugging easier, the more visibility into the data the better. It's better to have visibility into the full contents and not need it than it is to want that visibility and not have it. What do you think?
I'm not convinced we should be capturing the response, as it is essentially just passed to the user. The user can decide to capture or log the response once received, outside of the library code. This design was meant to capture just the inputs to the API calls. That being said, I think stringifying a response as a single attribute makes the attribute harder to parse and read when debugging, which can defeat the purpose of creating a better debug system.

PhongChuong · 2025-05-08T16:23:21Z

My 2 cents regarding attributes:
Recall that we have two layers of spans: the upper-level interface (BigQueryImpl.java, Job.java, etc.) and the HttpBigQueryRpc level. I believe we should capture the request and response (not in this PR). Specifically, they should be captured as request/response* attributes at the HttpBigQueryRpc level. For the interface level (this PR), we should capture any metadata that would help debug issues that may arise, and we should capture it in a structured manner so it is visible. Specifically, attributes in the options are great, as they are rarely captured elsewhere. Meanwhile, we can be a little more selective with datatype attributes (TableId vs. all the fields in TableInfo). If needed, the request attribute can then be used for a deep dive.

On a side note, I think we should also clearly state that "Span names and attributes are subject to change without notice" to allow us the flexibility to change the structure as needed.

For certain responses such as table row data, we probably won't want to capture those values.
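To make the "selective" idea above concrete, here is a minimal, library-free sketch. It uses a plain `Map` as a stand-in for span attributes, and the class name `SelectiveAttributes` and the allowlisted keys are assumptions chosen for illustration, not names from this PR.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical helper: keeps only identifier-like fields as span attributes. */
final class SelectiveAttributes {
  // Assumed allowlist of "critical" fields; the real set is a design decision.
  private static final Set<String> ALLOWED =
      new LinkedHashSet<>(Arrays.asList("project", "dataset", "table"));

  private SelectiveAttributes() {}

  /** Returns the allowlisted entries of allFields, in allowlist order. */
  static Map<String, String> select(Map<String, String> allFields) {
    Map<String, String> selected = new LinkedHashMap<>();
    for (String key : ALLOWED) {
      String value = allFields.get(key);
      if (value != null) {
        selected.put(key, value); // skip fields that are absent or null
      }
    }
    return selected;
  }
}
```

With this shape, large or sensitive values (descriptions, schemas, row data) never reach the span unless they are added to the allowlist deliberately.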

google-cloud-bigquery/src/main/java/com/google/cloud/bigquery/BigQueryOptions.java

google-cloud-bigquery/src/main/java/com/google/cloud/bigquery/BigQueryImpl.java

google-cloud-bigquery/src/main/java/com/google/cloud/bigquery/TableInfo.java

whuffman36 · 2025-05-08T21:25:31Z

I've cleaned up the attributes that we are collecting by removing some fields that are not very useful in debugging. I also added in attributes for language and "db.name" to make the tracing more consistent with python. I've updated the span naming pattern to be more descriptive.

I think it's a good idea to include "Span names and attributes are subject to change without notice", but that seems like it would be better to put in the docs rather than the code. What do you guys think?

google-cloud-bigquery/src/main/java/com/google/cloud/bigquery/BigQueryImpl.java

google-cloud-bigquery/src/main/java/com/google/cloud/bigquery/BigQueryOptions.java

google-cloud-bigquery/src/main/java/com/google/cloud/bigquery/DatasetId.java

PhongChuong

Thanks for all the changes.
@shollyman, can you take a look at the attributes to see if we missed anything?

shollyman

I continue to be wary of replicating many of the API response fields in the span attributes, favoring the "less is more" approach, but I can be convinced otherwise.

google-cloud-bigquery/src/main/java/com/google/cloud/bigquery/BigQueryImpl.java

google-cloud-bigquery/src/main/java/com/google/cloud/bigquery/JobId.java

google-cloud-bigquery/src/main/java/com/google/cloud/bigquery/InsertAllRequest.java

google-cloud-bigquery/src/main/java/com/google/cloud/bigquery/JobInfo.java

google-cloud-bigquery/src/main/java/com/google/cloud/bigquery/BigQueryImpl.java

whuffman36 · 2025-05-16T00:19:00Z

I've gone through and removed most of the attributes that we were collecting before, keeping mainly just the OP identifiers and time information such as creationTime, lastModifiedTime, etc. Let me know if you think we should shave off any more.

shollyman

Did some more reading. Two other considerations for these attributes (which may require a lot of callsite touches):

Should the attributes be namespaced so it's clear they're fields from the BQ API? Right now they're just single-term attributes and could conflict with other telemetry.
I believe otel attribute conventions are snake case rather than camel case, so we may want to revisit the multi-word attributes.

shollyman · 2025-05-16T19:28:42Z

google-cloud-bigquery/src/main/java/com/google/cloud/bigquery/BigQueryImpl.java

+              .getOpenTelemetryTracer()
+              .spanBuilder("com.google.cloud.bigquery.BigQuery.createJob")
+              .setAllAttributes(jobInfo.getJobId().getOtelAttributes())
+              .setAttribute("status", getFieldAsString(jobInfo.getStatus()))


status is a submessage. Should we just report status fields individually? status.state (pending/running/done), and error presence in the job?

whuffman36 · 2025-05-20

Sounds good to me, though I've decided to move those attributes to the Job class instead, since job.status() is not yet populated during the create() function. The field would always be set to null and thus be practically useless for debugging.
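As a rough illustration of reporting status fields individually, the sketch below flattens a status into separate entries. It is library-free: `Status` is a minimal stand-in rather than the library's job status type, a plain `Map` stands in for span attributes, and the key `status.has_error` is an assumed name (only `status.state` is mentioned in the thread).

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper: flattens a job status into individual attribute entries. */
final class JobStatusAttributes {

  /** Minimal stand-in for a job status; not the client library's class. */
  static final class Status {
    final String state;        // e.g. "PENDING", "RUNNING", "DONE"
    final String errorMessage; // null when the job has no error

    Status(String state, String errorMessage) {
      this.state = state;
      this.errorMessage = errorMessage;
    }
  }

  private JobStatusAttributes() {}

  /** Returns one entry per reported status field, or an empty map if status is null. */
  static Map<String, String> flatten(Status status) {
    Map<String, String> attrs = new LinkedHashMap<>();
    if (status == null) {
      return attrs; // status not populated yet, e.g. right after job creation
    }
    attrs.put("status.state", status.state);
    attrs.put("status.has_error", String.valueOf(status.errorMessage != null));
    return attrs;
  }
}
```

Returning an empty map for a null status avoids emitting an always-null attribute at creation time, which is the concern raised in the reply above.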

shollyman · 2025-05-16T19:30:52Z

google-cloud-bigquery/src/main/java/com/google/cloud/bigquery/BigQueryImpl.java

+          getOptions()
+              .getOpenTelemetryTracer()
+              .spanBuilder("com.google.cloud.bigquery.BigQuery.listDatasets")
+              .setAttribute("projectId", projectId)


minor: worth pulling the current page token into an attribute alongside the project for the list RPCs? It'll be visible in the HTTP request below it as part of the URL.

whuffman36 · 2025-05-20

If the page token is supplied as a DatasetListOption, it will already be captured in an attribute here. I agree that adding the page token as an attribute is a good idea, though I think it would be better suited to the HTTP layer rather than the interface.

shollyman · 2025-05-16T19:38:58Z

google-cloud-bigquery/src/main/java/com/google/cloud/bigquery/Schema.java

+    Attributes attributes = Attributes.builder().build();
+    for (Field field : this.getFields()) {
+      attributes =
+          attributes.toBuilder().put("Field: " + field.getName(), field.toString()).build();


Do we want BQ schema in the otel attributes at all?

whuffman36 · 2025-05-20

It is likely overkill; I went ahead and removed the schema from the attributes.

whuffman36 · 2025-05-20T23:13:46Z

I renamed all the attributes to be snake case and gave them namespace prefixes to remove any overloading of attributes. I went with the namespace "bq" and created further namespaces depending on the objects, such as "bq.dataset", "bq.table", "bq.query", etc.

Let me know if you think a different namespace style would be better.
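For readers following along, a minimal sketch of the naming scheme described above might look like this. The helper class `BqAttributeKeys` and the example field names are hypothetical; only the "bq" root and the per-object namespaces such as "bq.dataset" and "bq.table" come from the comment.

```java
/** Hypothetical helper: builds namespaced, snake_case attribute keys. */
final class BqAttributeKeys {
  private static final String ROOT = "bq"; // root namespace named in the comment above

  private BqAttributeKeys() {}

  /**
   * Converts a camelCase name to snake_case, e.g. "lastModifiedTime" becomes
   * "last_modified_time". Each uppercase letter starts a new word, so runs of
   * capitals (acronyms) are split letter by letter.
   */
  static String toSnakeCase(String camelCase) {
    StringBuilder out = new StringBuilder();
    for (int i = 0; i < camelCase.length(); i++) {
      char c = camelCase.charAt(i);
      if (Character.isUpperCase(c)) {
        if (i > 0) {
          out.append('_');
        }
        out.append(Character.toLowerCase(c));
      } else {
        out.append(c);
      }
    }
    return out.toString();
  }

  /** Builds a key of the form "bq.<object>.<field_in_snake_case>". */
  static String key(String object, String field) {
    return ROOT + "." + object + "." + toSnakeCase(field);
  }
}
```

Under these assumptions, `BqAttributeKeys.key("dataset", "creationTime")` yields `bq.dataset.creation_time`, which cannot collide with an unrelated `creationTime` attribute from other telemetry.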

product-auto-label bot added the size: m and api: bigquery labels Apr 8, 2025

whuffman36 force-pushed the otel branch from 165bb24 to f49582f Compare April 29, 2025 10:19

product-auto-label bot added the size: xl label and removed the size: m label Apr 29, 2025

whuffman36 marked this pull request as ready for review April 29, 2025 19:24

whuffman36 requested review from a team as code owners April 29, 2025 19:24

whuffman36 requested review from suzmue, shollyman and PhongChuong and removed request for suzmue April 29, 2025 19:24

PhongChuong reviewed Apr 30, 2025

google-cloud-bigquery/src/main/java/com/google/cloud/bigquery/OpenTelemetryHelper.java (outdated)

PhongChuong reviewed Apr 30, 2025

google-cloud-bigquery/src/test/java/com/google/cloud/bigquery/it/ITBigQueryTest.java (outdated)

PhongChuong reviewed Apr 30, 2025

whuffman36 added 9 commits May 5, 2025 13:51

feat(bigquery): Integrate Otel in client lib

008b1f6

Refactor Otel code into separate file

c15b03d

Add copyright comment

3193222

fix style

a009eb0

remove unused dependencies

fcbb2f1

remove unused test dependencies

6618883

fix dependency issue

bb7c1cc

fix it test

cc84936

refactor out OtelHelper class

7443564

whuffman36 force-pushed the otel branch from 602195f to 7443564 Compare May 5, 2025 23:30

change test name to start with lowercase

66d79bf

change datasetId key to dataset

55316f8

whuffman36 mentioned this pull request May 6, 2025

bigquery.it.ITBigQueryTest: testLocationFastSQLQueryWithJobId failed #3776

Closed

shollyman reviewed May 7, 2025

remove OtelHelper class

e4e4047

PhongChuong reviewed May 8, 2025

google-cloud-bigquery/src/main/java/com/google/cloud/bigquery/TableInfo.java (outdated)

Clean up attribute values

c988c2e

PhongChuong reviewed May 9, 2025

Add spans to query and listPartitions

8c52f6f

PhongChuong previously approved these changes May 12, 2025

shollyman reviewed May 15, 2025

remove extraneous attributes

a963641

shollyman reviewed May 16, 2025

add attribute namespaces, change to snake_case

0ebaa92

whuffman36 dismissed PhongChuong’s stale review via 0ebaa92 May 24, 2025 06:02
