Log requests and responses

Vertex AI can log samples of requests and responses for Gemini and supported partner models. The logs are saved to a BigQuery table for viewing and analysis. This page describes how to configure request-response logs for base foundation models and fine-tuned models.

Supported API methods for logging

Request-response logs are supported for all Gemini models that use generateContent or streamGenerateContent.

The following partner models that use rawPredict or streamRawPredict are also supported:

  • Anthropic Claude

Request-response logs for base foundation models

You can configure request-response logs for base foundation models by using the REST API or Python SDK. Logging configurations can take a few minutes to take effect.

Enable request-response logging

Select one of the following tabs for instructions on enabling request-response logs for a base foundation model.

For Anthropic models, only the REST API is supported for configuring logging. To enable it, set the publisher to anthropic and set the model name to one of the supported Claude models.

Python SDK

This method can be used to create or update a PublisherModelConfig.

import vertexai
from vertexai.generative_models import GenerativeModel

# Initialize the SDK before configuring logging.
vertexai.init(project="PROJECT_ID", location="LOCATION")

publisher_model = GenerativeModel('gemini-2.0-pro-001')

# Set logging configuration
publisher_model.set_request_response_logging_config(
    enabled=True,
    sampling_rate=1.0,
    bigquery_destination="bq://PROJECT_ID.DATASET_NAME.TABLE_NAME",
    enable_otel_logging=True,
)

REST API

Create or update a PublisherModelConfig using setPublisherModelConfig:

Before using any of the request data, make the following replacements:

  • PROJECT_ID: Your project ID.
  • LOCATION: The location of the model resource. Request-response logging is supported for all regions supported by the model.
  • PUBLISHER: The publisher name. For example, google.
  • MODEL: The foundation model name. For example, gemini-2.0-flash-001.
  • SAMPLING_RATE: To reduce storage costs, you can set a number between 0 and 1 to define the fraction of requests to log. For example, a value of 1 logs all requests, and a value of 0.1 logs 10% of requests.
  • BQ_URI: The BigQuery table to be used for logging. If you only specify a project name, a new dataset is created with the name logging_ENDPOINT_DISPLAY_NAME_ENDPOINT_ID, where ENDPOINT_DISPLAY_NAME follows the BigQuery naming rules. If you don't specify a table name, a new table is created with the name request_response_logging.
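The BQ_URI fallback naming rules can be illustrated with a short sketch. parse_bq_uri is a hypothetical helper written for illustration only; it is not part of any Google SDK:

```python
def parse_bq_uri(uri, endpoint_display_name="my-endpoint", endpoint_id="1234"):
    """Illustrates the BQ_URI fallbacks: a missing dataset or table
    name is filled in with the documented default."""
    path = uri[len("bq://"):] if uri.startswith("bq://") else uri
    parts = path.split(".")
    project = parts[0]
    # BigQuery dataset names allow only letters, digits, and underscores,
    # so the display name is normalized to follow the naming rules.
    safe_name = endpoint_display_name.replace("-", "_")
    dataset = parts[1] if len(parts) > 1 else f"logging_{safe_name}_{endpoint_id}"
    table = parts[2] if len(parts) > 2 else "request_response_logging"
    return project, dataset, table
```

For example, `parse_bq_uri("bq://my-project")` falls back to the generated dataset name and the default request_response_logging table, while a full `bq://project.dataset.table` URI is used as-is.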

HTTP method and URL:

POST https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/publishers/PUBLISHER/models/MODEL:setPublisherModelConfig

Request JSON body:

{
  "publisherModelConfig": {
    "loggingConfig": {
      "enabled": true,
      "samplingRate": SAMPLING_RATE,
      "bigqueryDestination": {
        "outputUri": "BQ_URI"
      },
      "enableOtelLogging": true
    }
  }
}

To send your request, choose one of these options:

curl

Save the request body in a file named request.json, and execute the following command:

curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json; charset=utf-8" \
-d @request.json \
"https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/publishers/PUBLISHER/models/MODEL:setPublisherModelConfig"

PowerShell

Save the request body in a file named request.json, and execute the following command:

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
-Method POST `
-Headers $headers `
-ContentType: "application/json; charset=utf-8" `
-InFile request.json `
-Uri "https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/publishers/PUBLISHER/models/MODEL:setPublisherModelConfig" | Select-Object -Expand Content

A successful request returns a JSON response with the updated configuration.
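The same call can be made from Python with only the standard library. This is a hedged sketch: it assumes the gcloud CLI is installed to mint the access token, and build_set_config_request and send are illustrative helpers, not SDK functions:

```python
import json
import subprocess
import urllib.request

def build_set_config_request(project_id, location, publisher, model,
                             sampling_rate, bq_uri):
    """Builds the setPublisherModelConfig URL and JSON body."""
    url = (f"https://{location}-aiplatform.googleapis.com/v1beta1/"
           f"projects/{project_id}/locations/{location}/"
           f"publishers/{publisher}/models/{model}:setPublisherModelConfig")
    body = {
        "publisherModelConfig": {
            "loggingConfig": {
                "enabled": True,
                "samplingRate": sampling_rate,
                "bigqueryDestination": {"outputUri": bq_uri},
                "enableOtelLogging": True,
            }
        }
    }
    return url, body

def send(url, body):
    # Fetch an access token the same way the curl example does.
    token = subprocess.check_output(
        ["gcloud", "auth", "print-access-token"], text=True).strip()
    req = urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Authorization": f"Bearer {token}",
                 "Content-Type": "application/json; charset=utf-8"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

The request body built here mirrors the JSON shown above; only the transport differs from the curl example.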

Get logging configuration

Get the request-response logging configuration for the foundation model by using the REST API.

REST API

Get the request-response logging configuration using fetchPublisherModelConfig:

Before using any of the request data, make the following replacements:

  • PROJECT_ID: Your project ID.
  • LOCATION: The location of the model resource.
  • PUBLISHER: The publisher name. For example, google.
  • MODEL: The foundation model name. For example, gemini-2.0-flash-001.

HTTP method and URL:

GET https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/publishers/PUBLISHER/models/MODEL:fetchPublisherModelConfig

To send your request, choose one of these options:

curl

Execute the following command:

curl -X GET \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
"https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/publishers/PUBLISHER/models/MODEL:fetchPublisherModelConfig"

PowerShell

Execute the following command:

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
-Method GET `
-Headers $headers `
-Uri "https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/publishers/PUBLISHER/models/MODEL:fetchPublisherModelConfig" | Select-Object -Expand Content

A successful request returns a JSON response with the current logging configuration.
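A fetched configuration can then be inspected programmatically. The sketch below assumes the response shape mirrors the publisherModelConfig body shown earlier (a top-level loggingConfig object); summarize_logging_config is an illustrative helper, not an SDK function:

```python
def summarize_logging_config(response_json):
    """Pulls the key logging fields out of a fetched configuration,
    assuming the loggingConfig shape shown in the request body above."""
    cfg = response_json.get("loggingConfig", {})
    return {
        "enabled": cfg.get("enabled", False),
        "samplingRate": cfg.get("samplingRate", 0),
        "table": cfg.get("bigqueryDestination", {}).get("outputUri", ""),
    }
```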

Disable logging

Disable request-response logging for the foundation model by using the REST API or Python SDK.

Python SDK

publisher_model.set_request_response_logging_config(
  enabled=False,
  sampling_rate=0,
  bigquery_destination=''
  )

REST API

Use setPublisherModelConfig to disable logging:

Before using any of the request data, make the following replacements:

  • PROJECT_ID: Your project ID.
  • LOCATION: The location of the model resource.
  • PUBLISHER: The publisher name. For example, google.
  • MODEL: The foundation model name. For example, gemini-2.0-flash-001.

HTTP method and URL:

POST https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/publishers/PUBLISHER/models/MODEL:setPublisherModelConfig

Request JSON body:

{
  "publisherModelConfig": {
    "loggingConfig": {
      "enabled": false
    }
  }
}

To send your request, choose one of these options:

curl

Save the request body in a file named request.json, and execute the following command:

curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json; charset=utf-8" \
-d @request.json \
"https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/publishers/PUBLISHER/models/MODEL:setPublisherModelConfig"

PowerShell

Save the request body in a file named request.json, and execute the following command:

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
-Method POST `
-Headers $headers `
-ContentType: "application/json; charset=utf-8" `
-InFile request.json `
-Uri "https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/publishers/PUBLISHER/models/MODEL:setPublisherModelConfig" | Select-Object -Expand Content

A successful request returns a JSON response with the updated configuration.

Request-response logs for fine-tuned models

You can configure request-response logs for fine-tuned models by using the REST API or Python SDK.

Enable request-response logs

Select one of the following tabs for instructions on enabling request-response logs for a fine-tuned model.

Python SDK

This method can be used to update the request-response logging configuration for an endpoint.

tuned_model = GenerativeModel("projects/PROJECT_ID/locations/REGION/endpoints/ENDPOINT_ID")

# Set logging configuration
tuned_model.set_request_response_logging_config(
    enabled=True,
    sampling_rate=1.0,
    bigquery_destination="bq://PROJECT_ID.DATASET_NAME.TABLE_NAME",
    enable_otel_logging=True
    )

REST API

You can only enable request-response logging when you create an endpoint using projects.locations.endpoints.create or patch an existing endpoint using projects.locations.endpoints.patch.

Requests and responses are logged at the endpoint level, so requests sent to any deployed models under the same endpoint are logged.

When you create or patch an endpoint, populate the predictRequestResponseLoggingConfig field of the Endpoint resource with the following entries:

  • enabled: set to True to enable request-response logging.

  • samplingRate: To reduce storage costs, you can set a number between 0 and 1 to define the fraction of requests to log. For example, a value of 1 logs all requests, and a value of 0.1 logs 10% of requests.

  • bigqueryDestination: The BigQuery table to be used for logging. If you only specify a project name, a new dataset is created with the name logging_ENDPOINT_DISPLAY_NAME_ENDPOINT_ID, where ENDPOINT_DISPLAY_NAME follows the BigQuery naming rules. If you don't specify a table name, a new table is created with the name request_response_logging.

  • enableOtelLogging: set to true to enable OpenTelemetry (OTEL) logging in addition to the default request-response logging.

To view the BigQuery table schema, see Logging table schema.

The following is an example configuration:

{
  "predictRequestResponseLoggingConfig": {
    "enabled": true,
    "samplingRate": 0.5,
    "bigqueryDestination": {
      "outputUri": "bq://PROJECT_ID.DATASET_NAME.TABLE_NAME"
    },
    "enableOtelLogging": true
  }
}
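Before sending a create or patch request, it can help to validate the config client-side against the rules described above. validate_logging_config is a hypothetical helper written for illustration; the checks reflect this page's guidance, not server-side validation:

```python
def validate_logging_config(config):
    """Sanity-checks a predictRequestResponseLoggingConfig before sending it."""
    lc = config.get("predictRequestResponseLoggingConfig")
    if lc is None:
        raise ValueError("missing predictRequestResponseLoggingConfig")
    rate = lc.get("samplingRate", 0)
    # The sampling rate is a fraction of requests to log.
    if not 0 <= rate <= 1:
        raise ValueError("samplingRate must be between 0 and 1")
    # An enabled config needs somewhere to write the logs.
    if lc.get("enabled") and "bigqueryDestination" not in lc:
        raise ValueError("enabled logging requires a bigqueryDestination")
    return True
```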

Get logging configuration

Get the request-response logging configuration for the fine-tuned model by using the REST API.

REST API

Before using any of the request data, make the following replacements:

  • PROJECT_ID: Your project ID.
  • LOCATION: The location of the endpoint resource.
  • ENDPOINT_ID: The ID of the endpoint.

HTTP method and URL:

GET https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/endpoints/ENDPOINT_ID

To send your request, choose one of these options:

curl

Execute the following command:

curl -X GET \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
"https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/endpoints/ENDPOINT_ID"

PowerShell

Execute the following command:

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
-Method GET `
-Headers $headers `
-Uri "https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/endpoints/ENDPOINT_ID" | Select-Object -Expand Content

A successful request returns a JSON response containing the endpoint, including its predictRequestResponseLoggingConfig.

Disable logging configuration

Disable the request-response logging configuration for the endpoint.

Python SDK

tuned_model = GenerativeModel("projects/PROJECT_ID/locations/REGION/endpoints/ENDPOINT_ID")

# Set logging configuration
tuned_model.set_request_response_logging_config(
    enabled=False,
    sampling_rate=1.0,
    bigquery_destination="bq://PROJECT_ID.DATASET_NAME.TABLE_NAME",
    enable_otel_logging=False
    )

REST API

Use projects.locations.endpoints.patch to update the endpoint, setting enabled to false in the predictRequestResponseLoggingConfig field:

{
  "predictRequestResponseLoggingConfig": {
    "enabled": false
  }
}

Logging table schema

In BigQuery, the logs are recorded using the following schema:

| Field name | Type | Notes |
| --- | --- | --- |
| endpoint | STRING | Resource name of the endpoint to which the tuned model is deployed. |
| deployed_model_id | STRING | Deployed model ID for a tuned model deployed to an endpoint. |
| logging_time | TIMESTAMP | The time that logging is performed. This is roughly the time that the response is returned. |
| request_id | NUMERIC | The auto-generated integer request ID based on the API request. |
| request_payload | STRING | Included for partner model logging and backward compatibility with the Vertex AI endpoint request-response log. |
| response_payload | STRING | Included for partner model logging and backward compatibility with the Vertex AI endpoint request-response log. |
| model | STRING | Model resource name. |
| model_version | STRING | The model version. This is often "default" for Gemini models. |
| api_method | STRING | The API method used: generateContent, streamGenerateContent, rawPredict, or streamRawPredict. |
| full_request | JSON | The full GenerateContentRequest. |
| full_response | JSON | The full GenerateContentResponse. |
| metadata | JSON | Any metadata of the call; contains the request latency. |
| otel_log | JSON | Logs in OpenTelemetry schema format. Only available if enableOtelLogging is set in the logging configuration. |

Note that request-response pairs larger than the BigQuery Write API row limit of 10 MB are not recorded.
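Because oversized rows are silently dropped, it can be worth estimating payload sizes client-side before relying on the logs. This is a rough illustrative check (the actual row also includes response and metadata fields, so treat the estimate as approximate):

```python
import json

BQ_ROW_LIMIT_BYTES = 10 * 1024 * 1024  # BigQuery Write API row size limit

def likely_logged(request_payload, response_payload):
    """Rough estimate of whether a request-response pair fits in one row."""
    size = (len(json.dumps(request_payload).encode("utf-8"))
            + len(json.dumps(response_payload).encode("utf-8")))
    return size <= BQ_ROW_LIMIT_BYTES
```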

What's next