Source: G-gen Tech Blog

How to Address Zero Search Results in Vertex AI Search Apps with Google Drive as a Data Source


This post explains what to do when a Python search against a Vertex AI Search app that uses Google Drive as a data source returns zero results.


Introduction

This article covers how to perform searches in a Vertex AI Search app with Google Drive as a data source using Python. Vertex AI Search is a search engine service provided by Google Cloud.

You can search the Vertex AI Search app using two methods:

  1. Using the Python Client

  2. Using the Requests library to directly access Google Cloud APIs

However, depending on the implementation, searches may fail and return zero results.

Cases Where Searches Fail

1. Using a Channel Other Than v1alpha

As of December 2024, searches in Vertex AI Search apps that use Google Drive as the data source work as expected only on the v1alpha channel. On the v1 or v1beta channels, the search always returns zero results.

What is v1alpha?
In Google Cloud APIs, v1alpha refers to an API version channel. Google Cloud APIs typically progress through these stages:

  • v1alpha: Experimental stage; features may be changed or removed without notice.

  • v1beta: Pre-production stage; more stable but not final.

  • v1: Fully supported production release.

For more details, refer to Google Cloud’s versioning documentation.

2. Using Service Account Credentials

Google Cloud APIs generally require authentication via either Google accounts or service accounts.

However, as of December 2024, using service account credentials to query a Vertex AI Search app with Google Drive as the data source results in a 500 Internal Server Error.

Solutions

1. Use the v1alpha Channel

Ensure that your Python client or API call explicitly specifies the v1alpha channel in the configuration.

2. Use Google Account Credentials Instead of a Service Account

Replace service account credentials with Google account credentials to avoid the 500 Internal Server Error described above.
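As a rough illustration, the two failure cases and their fixes can be folded into a small diagnostic helper. This is only a sketch: the function name `diagnose_search` and its messages are hypothetical, and it assumes the response body is a dictionary with an optional `results` list.

```python
from typing import Any, Mapping


def diagnose_search(status_code: int, body: Mapping[str, Any], channel: str) -> str:
    """Map a search outcome to the failure cases described in this article."""
    if status_code == 500:
        # Case 2: service account credentials lead to a 500 error.
        return "500 error: check whether service account credentials are in use"
    if status_code == 200 and not body.get("results"):
        if channel != "v1alpha":
            # Case 1: channels other than v1alpha return zero results.
            return f"zero results on {channel}: switch to the v1alpha channel"
        return "zero results on v1alpha: the query may simply have no matches"
    if status_code == 200:
        return "ok"
    return f"unexpected status {status_code}"
```

For example, `diagnose_search(200, {"results": []}, "v1beta")` points at the channel, while a 500 status points at the credentials.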

Python Client

Sample Code

Here is a sample code snippet for using the Python Client. 

import google.auth
from google.cloud.discoveryengine_v1alpha import SearchServiceClient, SearchRequest
from google.protobuf.json_format import MessageToDict

PROJECT_ID = "xxx"  # Google Cloud Project ID
VERTEX_AI_APP_ID = "xxx"  # Vertex AI Search App ID

# Credentials must belong to a Google account, not a service account.
# Assumption: Application Default Credentials were created with
# `gcloud auth application-default login`; any other
# google.oauth2.credentials.Credentials object works as well.
credentials, _ = google.auth.default()

client = SearchServiceClient(credentials=credentials)

serving_config = f"projects/{PROJECT_ID}/locations/global/collections/default_collection/engines/{VERTEX_AI_APP_ID}/servingConfigs/default_serving_config"

content_search_spec = SearchRequest.ContentSearchSpec(
    # Do not output snippets
    snippet_spec=SearchRequest.ContentSearchSpec.SnippetSpec(
        return_snippet=False
    ),
    # Output summary
    summary_spec=SearchRequest.ContentSearchSpec.SummarySpec(
        summary_result_count=3,
        include_citations=False,
        # Specify the summarization model (Gemini 1.5 Flash)
        model_spec=SearchRequest.ContentSearchSpec.SummarySpec.ModelSpec(
            version="gemini-1.5-flash-001/answer_gen/v1"
        )
    )
)

# Send query to Vertex AI Search
response = client.search(
    SearchRequest(
        serving_config=serving_config,
        query="What is G-gen?",
        page_size=3,
        content_search_spec=content_search_spec
    )
)

# Print the summary text
print(response.summary.summary_text)

# Print the search results, converting each protobuf message to a dict
for r in response.results:
    r_dct = MessageToDict(r._pb)
    print(r_dct)

Reference: Class SearchServiceClient.
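The dictionaries printed by the sample can be verbose, so a small helper that picks out a few fields may be handy. The sketch below assumes the camelCase keys that MessageToDict produces by default, and it assumes that `derivedStructData` carries `title` and `link` entries for Drive documents, which may not hold for every data source. The name `summarize_result` is hypothetical.

```python
from typing import Any, Dict, Optional


def summarize_result(r_dct: Dict[str, Any]) -> Dict[str, Optional[str]]:
    """Pick a few fields from one search result converted with MessageToDict."""
    doc = r_dct.get("document", {})
    # Assumption: Drive documents expose "title" and "link" here.
    derived = doc.get("derivedStructData", {})
    return {
        "id": r_dct.get("id"),
        "title": derived.get("title"),
        "link": derived.get("link"),
    }
```

Missing keys come back as None rather than raising, so the helper is safe to call on results with a different shape.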

Key Points

Library Channel Specification

When importing google-cloud-discoveryengine, you must explicitly specify the _v1alpha channel as shown below:

from google.cloud.discoveryengine_v1alpha import SearchServiceClient, SearchRequest

If you do not specify the channel and use an import statement like the ones below, the search will fail and return zero results, as described in the failure cases:

# No channel specified
from google.cloud.discoveryengine import SearchServiceClient, SearchRequest

# v1 specified
from google.cloud.discoveryengine_v1 import SearchServiceClient, SearchRequest

# v1beta specified
from google.cloud.discoveryengine_v1beta import SearchServiceClient, SearchRequest
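One way to confirm which channel an imported client class actually belongs to is to inspect its module path. The following is a minimal sketch with a hypothetical helper name, `channel_of`; it assumes the library keeps its versioned module layout (for example `google.cloud.discoveryengine_v1alpha.services.search_service.client`).

```python
import re
from typing import Optional

# Matches the version suffix in a module path such as
# "google.cloud.discoveryengine_v1alpha.services.search_service.client".
_CHANNEL_RE = re.compile(r"discoveryengine_(v\d+(?:alpha|beta)?)(?:\.|$)")


def channel_of(client_cls: type) -> Optional[str]:
    """Return the API channel encoded in the class's module path, if any."""
    match = _CHANNEL_RE.search(client_cls.__module__)
    return match.group(1) if match else None
```

Calling `channel_of(SearchServiceClient)` before sending a query gives a quick sanity check that the v1alpha client is the one in use.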

Credentials

The credentials argument passed to SearchServiceClient(credentials=credentials) must belong to a Google account, not a service account.

  • If using Google account credentials, the variable type should be google.oauth2.credentials.Credentials.
  • If using service account credentials, the variable type would be google.oauth2.service_account.Credentials.
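When credentials are loaded from a JSON file, the file's `type` field gives a quick hint about which kind it holds before any request is sent. The sketch below relies on the `authorized_user` and `service_account` values used in Google credential files; the helper names are hypothetical.

```python
import json
from typing import Any, Mapping


def credential_kind(info: Mapping[str, Any]) -> str:
    """Classify a parsed credential JSON document by its "type" field."""
    kind = info.get("type")
    if kind == "authorized_user":
        return "google_account"
    if kind == "service_account":
        return "service_account"
    return "other"


def credential_kind_from_file(path: str) -> str:
    """Read a credential JSON file and classify it."""
    with open(path, encoding="utf-8") as fh:
        return credential_kind(json.load(fh))
```

A result of "service_account" suggests the search would hit the 500 error described earlier.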

Direct API Access Using Requests Library

Here’s a sample implementation using the Requests library to directly access Google Cloud APIs:

import requests

# API endpoint for v1alpha channel
url = 'https://discoveryengine.googleapis.com/v1alpha/projects/your_project_id/locations/global/collections/your_collection_id/dataStores/your_data_store_id/servingConfigs/default_config:search'

# Google account access token
headers = {
    'Authorization': 'Bearer your_google_account_access_token',
    'Content-Type': 'application/json',
}

# Search request payload
payload = {
    'query': 'search_query',
}

# Perform the request
response = requests.post(url, headers=headers, json=payload)

# Print the results
if response.status_code == 200:
    print(response.json())
else:
    print(f'Error: {response.status_code} - {response.text}')

Reference: Method: projects.locations.collections.dataStores.servingConfigs.search
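The payload above contains only the query. To request the same summary settings as the Python Client sample, the request body could be built roughly as follows. This is a sketch: the function name `build_search_payload` is hypothetical, and the keys are the camelCase JSON forms of the fields used in the client sample.

```python
from typing import Any, Dict


def build_search_payload(query: str, page_size: int = 3) -> Dict[str, Any]:
    """Build a REST search body mirroring the Python Client sample."""
    return {
        "query": query,
        "pageSize": page_size,
        "contentSearchSpec": {
            # Do not output snippets
            "snippetSpec": {"returnSnippet": False},
            # Output a summary without citations
            "summarySpec": {
                "summaryResultCount": 3,
                "includeCitations": False,
                "modelSpec": {"version": "gemini-1.5-flash-001/answer_gen/v1"},
            },
        },
    }
```

The returned dictionary can be passed as the `json` argument of `requests.post` in place of the minimal payload.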


Key Points for Direct API Access

  1. Channel Specification in the API URL
    Ensure the API endpoint explicitly specifies the v1alpha channel.
  2. Use Google Account Access Token
    Include an access token from a Google account in the Authorization header. Avoid using service account tokens.
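To keep the channel explicit in one place, the endpoint URL can be assembled by a small function that rejects unknown channels. The sketch below is an assumption-laden example: `build_search_url` is a hypothetical name, and it targets the engine-based serving config path used in the Python Client sample rather than the data store path shown in the Requests sample.

```python
from urllib.parse import quote

ALLOWED_CHANNELS = ("v1alpha", "v1beta", "v1")


def build_search_url(project_id: str, engine_id: str, channel: str = "v1alpha") -> str:
    """Build the REST search URL for an app in the global location."""
    if channel not in ALLOWED_CHANNELS:
        raise ValueError(f"unknown channel: {channel!r}")
    # The channel is the first path segment after the host.
    return (
        f"https://discoveryengine.googleapis.com/{channel}"
        f"/projects/{quote(project_id, safe='')}"
        "/locations/global/collections/default_collection"
        f"/engines/{quote(engine_id, safe='')}"
        "/servingConfigs/default_serving_config:search"
    )
```

Defaulting `channel` to v1alpha means callers get the working channel unless they deliberately choose another.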

Conclusion

To avoid zero search results in Vertex AI Search apps with Google Drive as a data source, ensure you use the v1alpha channel and authenticate with Google account credentials. Staying updated with Google Cloud’s documentation can help prevent such issues in future implementations.

About the author

Ryuuki Dohara is part of the Cloud Solutions Department's Data Analytics Team and joined G-gen in April 2023. Recognized as a Google Cloud Partner Top Engineer in both 2023 and 2024 (and awarded Rookie of the Year in 2024), Ryuuki is passionate about solving complex cloud challenges. Outside of work, you’ll often find him gaming or occasionally taking long bike rides on his days off.

This content was originally created by our partner in Japan, G-gen, a trusted Google Cloud Premier Partner. It has been translated and shared on our website with their permission. G-gen retains full ownership of this content.