Keep the following considerations in mind when implementing best practices for, and testing, your conversational commerce agent interface.
Implement best practices
Consider these best practices when implementing your conversational commerce agent interface:
- Visitor ID consistency: Ensure that a unique `visitor_id` is consistently sent with each request for a given end user. This is vital for accurate personalization and model training. This identifier should ideally remain consistent for an end user across sessions and signed-in or signed-out states.
- Branch management: While `default_branch` is common, make sure you are using the correct branch ID if your product catalog is structured across multiple branches.
- Search API interaction: For `SIMPLE_PRODUCT_SEARCH` and any case where `refined_search` is provided, make a separate call to the core Search API (`SearchService.Search`) using the `query` from the `refined_search` field, or the original query, to get the actual product listings (see the sketch after this list). The Conversational API focuses on the conversational experience and user intent understanding rather than directly returning product results.
- User interface design: Design your web interface to clearly present the `conversational_text_response`, `followup_question`, and `refined_search` options in an intuitive manner that guides your user.
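The follow-up call described in the Search API interaction item can look like the following minimal Python sketch. It assumes the `google-cloud-retail` client library; the project ID, the `default_search` serving config name, and the `conversational_response` dict (holding the fields named above) are placeholders for your own values and response handling.

```python
# Minimal sketch (not an official sample): after handling the conversational
# response, fetch the actual product listings through SearchService.Search.
from google.cloud import retail_v2

PROJECT_ID = "your-project-id"  # assumption: replace with your project
CATALOG = f"projects/{PROJECT_ID}/locations/global/catalogs/default_catalog"


def fetch_products(conversational_response: dict, visitor_id: str, original_query: str):
    """Call the core Search API with the refined query, if any, or the original query."""
    # `conversational_response` is a hypothetical dict mirroring the fields
    # described above (refined_search, conversational_text_response, and so on).
    refined = conversational_response.get("refined_search") or []
    query = refined[0]["query"] if refined else original_query

    client = retail_v2.SearchServiceClient()
    request = retail_v2.SearchRequest(
        # Serving config path; "default_search" is an assumed name.
        placement=f"{CATALOG}/servingConfigs/default_search",
        branch=f"{CATALOG}/branches/default_branch",  # use your branch if not default
        query=query,
        visitor_id=visitor_id,  # keep this stable for the end user across requests
        page_size=20,
    )
    return list(client.search(request))
```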
Plan A/B tests
While relevance is an important input metric, Vertex AI Search for commerce also takes other variables into account with the goal of optimizing for business results:
| Metric | Description |
|---|---|
| Revenue per visit (RPV) | Revenue per visit is the most effective metric for search performance because it takes into account conversion rate, AOV, and relevance (see the example after this table). |
| Conversion and average order value (AOV) | Conversion rate and AOV both contribute to RPV. |
| Relevance, buyability, and price | Relevance, among other inputs, is used to produce high-performing search results. |
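As a quick illustration of how these metrics relate (illustrative arithmetic, not a formula taken from the product), revenue per visit decomposes into conversion rate multiplied by AOV:

```python
# RPV = revenue / visits, which equals conversion rate (orders / visits)
# multiplied by AOV (revenue / orders). Example numbers are made up.
visits = 10_000
orders = 250
revenue = 18_750.0

conversion_rate = orders / visits  # 0.025
aov = revenue / orders             # 75.0
rpv = revenue / visits             # 1.875

assert abs(rpv - conversion_rate * aov) < 1e-9
print(f"conversion={conversion_rate:.1%}, AOV={aov:.2f}, RPV={rpv:.3f}")
```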
A/B readiness checklist
Use the following checklist to confirm readiness before and during an A/B experiment:
| Item | Definition | Stage |
|---|---|---|
| Event attribution scheme | Work with Google to properly segment the user events for measurement. | Pre-experiment |
| Monitoring data inputs | Ability to quickly understand when training data contains anomalies that could impact performance. | Pre-experiment |
| Event coverage | Instrument all possible outcomes associated with search or recommendations AI sessions. | Pre-experiment |
| Measurable success criteria | Documented definition of done (in measurable terms). | Pre-experiment |
| Ability to measure UX biases | Ensure consistent UX across experiment arms. | During experiment |
| Coherency between VAIS data and consumption | Verify that attribution tokens, filters, order by, offset, and so on are passed from the API to user events, and that visitor IDs and user IDs match between events and API requests (see the sketch after this table). | During experiment |
| Approval to tune during the experiment | Plan for tuning activities, document changes, and adjust measurements and interpretation accordingly. | During experiment |
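For the coherency item in the checklist, the key is that the attribution token returned by the Search API and the same `visitor_id` used on the request are carried through to the corresponding user event. The following minimal sketch assumes the `google-cloud-retail` client library; the catalog path and the experiment label are placeholders.

```python
# Minimal sketch (not an official sample): write a search user event that carries
# the attribution token from the search response and the matching visitor_id.
from google.cloud import retail_v2

# Assumption: replace with your own catalog path.
PARENT = "projects/your-project-id/locations/global/catalogs/default_catalog"


def record_search_event(visitor_id: str, query: str, attribution_token: str):
    event = retail_v2.UserEvent(
        event_type="search",
        visitor_id=visitor_id,                # must match the visitor_id on the API request
        search_query=query,
        attribution_token=attribution_token,  # copied from SearchResponse.attribution_token
        experiment_ids=["experiment-arm-a"],  # assumption: your own experiment label
    )
    client = retail_v2.UserEventServiceClient()
    request = retail_v2.WriteUserEventRequest(parent=PARENT, user_event=event)
    return client.write_user_event(request=request)
```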
Implement proof of concept or minimum viable product
| Data ingestion | A/B test design | Performance metrics | Governance and process |
|---|---|---|---|
| Up-to-date and complete product catalog ingestion. Adherence to recommended event ingestion methods to ensure data synchronization between Google and you. Pass through necessary attributes such as experiment IDs and visitor IDs, and correctly implement search tokens where applicable. | Incorporate experimentation best practices to ensure reliable results (see the sketch after this table). | All evaluation criteria should be empirical, objectively measured, and driven by metrics. Alignment on the exact definitions of the metrics tracked is critical to measure performance accurately. Standard metrics tracked include RPV, conversion rate, and AOV. | Data integration, testing, feature rollout, and optimization form an iterative process that requires resources. |
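One common way to keep experiment IDs consistent is to assign each visitor to an arm deterministically, so the same `visitor_id` always receives the same experience and the arm label can be attached to user events. This is a generic sketch of that pattern, not a mechanism prescribed by Vertex AI Search for commerce:

```python
# Generic sketch: deterministic A/B bucketing keyed on visitor_id.
# The hash is stable across sessions, so a visitor stays in one arm.
import hashlib


def assign_arm(visitor_id: str, treatment_fraction: float = 0.5) -> str:
    """Return "treatment" or "control" for this visitor, stable across requests."""
    digest = hashlib.sha256(visitor_id.encode("utf-8")).hexdigest()
    bucket = int(digest[:8], 16) / 0xFFFFFFFF  # map to [0, 1]
    return "treatment" if bucket < treatment_fraction else "control"


# The returned label can be passed as an experiment ID on user events so that
# measurement can be segmented per arm.
print(assign_arm("visitor-123"))
```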
Example experiment cadence
A typical cadence moves through the following stages:

1. Satisfy minimum viable product dependencies.
2. Calibrate measurement.
3. Deploy production dark mode (see the sketch after this list).
4. Make a go/no-go decision.
5. Begin ongoing testing: ramp to X% of traffic.
6. Measure, adjust, and repeat.
7. Ramp to X% live traffic.
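If "production dark mode" in your rollout means issuing candidate-configuration queries in the background without showing the results to users (a common dark-launch pattern), the shape of that step might look like the following hypothetical sketch; `search_fn` stands in for any search callable of your own.

```python
# Hypothetical dark-launch sketch: run the candidate search off the request
# path and log its results for offline comparison; users keep seeing the
# existing production results.
import logging
import threading


def shadow_search(search_fn, query: str, visitor_id: str) -> None:
    """Run the candidate search in the background; never surface its results."""

    def _run():
        try:
            results = search_fn(query=query, visitor_id=visitor_id)
            logging.info("dark-mode results for %r: %d items", query, len(results))
        except Exception:
            logging.exception("dark-mode search failed")  # must never impact the user

    threading.Thread(target=_run, daemon=True).start()
```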
Components of a successful experiment
- Calibrate measurements and establish success criteria.
- Maintain experiment fairness.
- Monitor data quality (see the sketch after this list).
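A common way to cover both experiment fairness and data quality in practice is a sample ratio mismatch check: the observed split of traffic between arms should stay close to the configured split. This is a generic technique, not something specific to Vertex AI Search for commerce; the sketch assumes `scipy` is available.

```python
# Generic sample-ratio-mismatch check: chi-square test of observed visit
# counts per arm against the intended split.
from scipy.stats import chisquare


def check_sample_ratio(control_visits: int, treatment_visits: int,
                       expected_treatment_fraction: float = 0.5,
                       alpha: float = 0.001) -> bool:
    """Return True if the observed split is consistent with the intended split."""
    total = control_visits + treatment_visits
    expected = [
        total * (1 - expected_treatment_fraction),
        total * expected_treatment_fraction,
    ]
    _, p_value = chisquare([control_visits, treatment_visits], f_exp=expected)
    return p_value >= alpha  # a tiny p-value signals a likely instrumentation problem


print(check_sample_ratio(50_400, 49_600))  # True: this split looks healthy
```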
Roles and experiment ownership
| Area | Google | You |
|---|---|---|
| Quality evaluation | Commerce search outcomes | UX impact |
| Measurements | Backup/validate | Authoritative |
| Telemetry/data | Platform volumetrics (validating performance); event and index anomalies | Attribution tokens and steps to reproduce (validating issues) |
| Search platform | Product-level items | Query/serving items |
| Go/no-go | Recommend | Approve |
Conduct experiments in the console
Go to the Experiments page in the Search for commerce console.
Use the console for advanced self-service analytics for Vertex AI Search for commerce onboarding and A/B testing by applying Google's attribution methodology:

- Monitor traffic segmentation, business metrics, and search and browse performance.
- Apply per-search, visit-level metrics across both keyword search and browse.
- View experiment performance as a time series with statistical significance metrics (see the sketch after this list).
- Use the embedded Looker platform.
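To sanity-check significance readouts on your own data, a standard two-proportion z-test on conversion rates can be used. This is a generic statistical sketch, not a description of Google's attribution methodology.

```python
# Generic two-proportion z-test for the difference in conversion rate
# between a control arm and a treatment arm.
from math import sqrt
from statistics import NormalDist


def conversion_z_test(control_conv: int, control_visits: int,
                      treatment_conv: int, treatment_visits: int) -> float:
    """Return the two-sided p-value for the difference in conversion rates."""
    p_pool = (control_conv + treatment_conv) / (control_visits + treatment_visits)
    se = sqrt(p_pool * (1 - p_pool) * (1 / control_visits + 1 / treatment_visits))
    z = (treatment_conv / treatment_visits - control_conv / control_visits) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))


# Example: 2.5% versus 2.8% conversion over 50,000 visits per arm.
print(conversion_z_test(1_250, 50_000, 1_400, 50_000))
```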