Filter recommendations

If you have a recommendations app, you can use document fields to filter your recommendation results. This page explains how to use document fields to filter a recommendation to a specific set of documents. Although the examples on this page are for media recommendations, the principles shown here are the same for custom recommendations. For more information about media recommendations, see Introduction to Vertex AI Search for media.

Filter recommendations and data store updates

After any data store update, wait up to 8 hours for the model to retrain before you rely on filtered results. The model needs to learn the current values in the document metadata and which fields are configured as filterable, so both document changes and schema changes must propagate first. For recommendations (unlike for search), filtering is not done in real time.

Filters and diversification settings (Media recommendations only)

In addition to filters, an app's diversification setting also affects the results returned in a media recommendation response. The effects of filters and diversification are combined. The diversification is done first and the filtering is done second.

Combining a high rule-based diversity level with category-based attribute filtering often results in empty output. This is because high diversity limits the app to returning one result for each category.

For example, suppose that you want to recommend movies based on Toy Story and you set the rule-based diversity level to high. Because the diversity level is high, although many movies might be recommended, only one movie in the children's movies category (for example, WALL·E) is returned. When the filter for children's movies is then applied, only WALL·E is returned as a recommendation.
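The interaction can be illustrated with a toy model. The following sketch is not how the service ranks or diversifies results. It assumes that each candidate has a single primary category and that high diversity keeps only the top-ranked candidate per category. The candidate list and the functions `diversify_high` and `filter_by_category` are hypothetical.

```python
from typing import Dict, List

# Hypothetical ranked candidates for a "Toy Story" seed. Reducing each movie
# to one primary category is a simplifying assumption for this sketch.
RANKED_CANDIDATES: List[Dict[str, str]] = [
    {"title": "WALL·E (2008)", "category": "Children"},
    {"title": "Yellow Submarine (1968)", "category": "Musical"},
    {"title": "Hypothetical second children's movie", "category": "Children"},
    {"title": "Harry Potter and the Deathly Hallows: Part 2 (2011)", "category": "Fantasy"},
]


def diversify_high(candidates: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """Keep only the highest-ranked candidate for each category."""
    seen = set()
    kept = []
    for candidate in candidates:
        if candidate["category"] not in seen:
            seen.add(candidate["category"])
            kept.append(candidate)
    return kept


def filter_by_category(candidates: List[Dict[str, str]], category: str) -> List[Dict[str, str]]:
    """Keep only the candidates in the given category."""
    return [c for c in candidates if c["category"] == category]


# Diversification runs first, filtering second.
diversified = diversify_high(RANKED_CANDIDATES)
children_only = filter_by_category(diversified, "Children")
drama_only = filter_by_category(diversified, "Drama")

print([c["title"] for c in children_only])
print([c["title"] for c in drama_only])
```

In this toy model, the children's filter leaves a single title, and a filter on a category that did not survive diversification leaves nothing.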

For general information about diversity, see Diversify media recommendations.

Before you begin

Make sure you have created a recommendations app and data store. For more information, see Create media apps or Create a custom recommendations data store.

Example documents

Review these example media documents. You can refer back to these example documents as you read through this page.

{"id":"1","schemaId":"default_schema","structData":{"title":"Toy Story (1995)","categories":["Adventure","Animation","Children","Comedy","Fantasy"],"uri":"http://mytestdomain.movie/content/1","available_time":"2023-01-01T00:00:00Z","media_type":"movie"}}
{"id":"88125","schemaId":"default_schema","structData":{"title":"Harry Potter and the Deathly Hallows: Part 2 (2011)","categories":["Action","Adventure","Drama","Fantasy","Mystery","IMAX"],"uri":"http://mytestdomain.movie/content/88125","available_time":"2023-01-01T00:00:00Z","media_type":"movie"}}
{"id":"2857","schemaId":"default_schema","structData":{"title":"Yellow Submarine (1968)","categories":["Adventure","Animation","Comedy","Fantasy","Musical"],"uri":"http://mytestdomain.movie/content/2857","available_time":"2023-01-01T00:00:00Z","media_type":"movie"}}
{"id":"60069","schemaId":"default_schema","structData":{"title":"WALL·E (2008)","categories":["Adventure","Animation","Children","Romance","Sci-Fi"],"uri":"http://mytestdomain.movie/content/60069","available_time":"2023-01-01T00:00:00Z","media_type":"movie"}}

Filter expressions

Use filter expressions to define your recommendations filters.

Filter expressions syntax

The following Extended Backus–Naur form summarizes the filter expression syntax that you can use to define your recommendations filters.

  # A single expression or multiple expressions that are joined by "AND" or "OR".
  filter = expression, { ( " AND " | " OR " ), expression };
  # An expression can be prefixed with "-" or "NOT" to express a negation.
  expression = [ "-" | "NOT " ],
    (
      # A parenthesized filter, such as an OR clause.
      "(", filter, ")"
      # A simple expression applying to a textual field.
      # Function "ANY" returns true if the field contains any of the literals.
      | textual_field, ":", "ANY", "(", literal, { ",", literal }, ")"
      # Or a filter by "available".
      | "available", ":", "true"
    );
  # A literal is any double-quoted string. You must escape backslash (\) and
  # quote (") characters.
  literal = double-quoted string;
  textual_field = see the tables below;
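Building expressions by hand makes it easy to forget the escaping rule for literals. The following is a minimal sketch of a string builder for `ANY()` expressions. The helper names `quote_literal` and `any_expr` are hypothetical and are not part of any client library.

```python
from typing import Iterable


def quote_literal(value: str) -> str:
    """Return a double-quoted literal with backslashes and quotes escaped."""
    # Escape backslashes first so the backslashes added for quotes are kept.
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def any_expr(field: str, values: Iterable[str], negate: bool = False) -> str:
    """Build `field: ANY("a", "b")`, optionally prefixed with NOT."""
    literals = [quote_literal(v) for v in values]
    if not literals:
        raise ValueError("ANY() needs at least one literal")
    if negate and len(literals) != 1:
        # Negation is only supported for ANY() with a single argument.
        raise ValueError("a negated ANY() must have exactly one literal")
    expr = f"{field}: ANY({', '.join(literals)})"
    return f"NOT {expr}" if negate else expr


print(any_expr("categories", ["drama", "action"]))
print(any_expr("language_code", ["en"], negate=True))
```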

Filter expressions restrictions

The following restrictions apply to filter expressions for recommendations:

- The depth of embedding AND and OR operators in parentheses is limited. The logical expressions in the filter must be in conjunctive normal form (CNF). The most complex supported logical expression is an AND-connected list of clauses that contain only OR operators, such as: (... OR ... OR ...) AND (... OR ...) AND (... OR ...)
- Expressions can be negated with the NOT keyword or with -. Negation works only with ANY() expressions that have a single argument.
- available restrictions must be at the top level. They cannot be used as part of an OR clause or a negation (NOT). You can only use available: true. If you omit this filter, expired documents and not-yet-available documents might be returned as recommendations.

  The available field maps to the following logic:

  datetime.now >= available_time AND datetime.now <= expire_time

  If the expire_time is not set, datetime.now <= expire_time resolves to true.
- The maximum number of terms in the top-level AND clause is 20.
- An OR clause can have up to 100 arguments that are included in ANY() expressions. If an OR clause has multiple ANY() expressions, their arguments all count toward this limit. For example, categories: ANY("drama", "comedy") OR categories: ANY("adventure") has three arguments.
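The availability logic above can be sketched locally, for example to check why a document is or is not expected to pass an available: true filter. This is an illustration of the stated rule, not the service's implementation. The function names `parse_timestamp` and `is_available` are hypothetical.

```python
from datetime import datetime, timezone
from typing import Optional


def parse_timestamp(value: str) -> datetime:
    """Parse a timestamp such as "2023-01-01T00:00:00Z"."""
    # Older Python versions do not accept a trailing "Z" in fromisoformat().
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def is_available(
    available_time: str,
    expire_time: Optional[str] = None,
    now: Optional[datetime] = None,
) -> bool:
    """Apply: now >= available_time AND now <= expire_time."""
    now = now or datetime.now(timezone.utc)
    if now < parse_timestamp(available_time):
        return False
    # A missing expire_time means the upper bound always holds.
    return expire_time is None or now <= parse_timestamp(expire_time)


# The example documents on this page set available_time but no expire_time.
print(is_available("2023-01-01T00:00:00Z"))
```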

Filter expressions examples

The following table shows valid and invalid filter expression examples. It also gives the reasons why the invalid examples are invalid.

Expression	Valid	Notes
`language_code: ANY("en", "fr")`	Yes
`NOT language_code: ANY("en")`	Yes
`NOT language_code: ANY("en", "fr")`	No	Negates an `ANY()` with more than one argument.
`language_code: ANY("en", "fr") OR categories: ANY("drama")`	Yes
`(language_code: ANY("en") OR language_code: ANY("fr")) AND categories: ANY("drama")`	Yes
`(language_code: ANY("en") AND language_code: ANY("fr")) OR categories: ANY("drama")`	No	Not in conjunctive normal form.
`(language_code: ANY("en")) AND (available: true)`	Yes
`(language_code: ANY("en")) OR (available: true)`	No	Combines `available` in an `OR` expression with other conditions.
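The restrictions and the examples in the table can be checked before a request is sent. The following sketch represents a filter directly in conjunctive normal form, as a list of OR clauses joined by AND, so a non-CNF filter cannot be expressed at all. The classes `AnyTerm` and `Available` and the function `validate` are hypothetical names for illustration. The service performs its own validation and might report errors differently.

```python
from dataclasses import dataclass
from typing import List, Tuple, Union

MAX_AND_TERMS = 20       # terms in the top-level AND clause
MAX_OR_ARGUMENTS = 100   # ANY() arguments summed over one OR clause


@dataclass(frozen=True)
class AnyTerm:
    field: str
    values: Tuple[str, ...]
    negated: bool = False


@dataclass(frozen=True)
class Available:
    """Represents the term `available: true`."""


Term = Union[AnyTerm, Available]
Clause = List[Term]  # the terms of one clause are joined by OR


def validate(clauses: List[Clause]) -> List[str]:
    """Return a list of problems; an empty list means no rule was violated."""
    problems = []
    if len(clauses) > MAX_AND_TERMS:
        problems.append(f"more than {MAX_AND_TERMS} terms in the top-level AND clause")
    for index, clause in enumerate(clauses):
        if len(clause) > 1 and any(isinstance(t, Available) for t in clause):
            problems.append(f"clause {index}: available is combined with OR")
        argument_count = 0
        for term in clause:
            if isinstance(term, AnyTerm):
                argument_count += len(term.values)
                if term.negated and len(term.values) != 1:
                    problems.append(f"clause {index}: negated ANY() needs exactly one argument")
        if argument_count > MAX_OR_ARGUMENTS:
            problems.append(f"clause {index}: more than {MAX_OR_ARGUMENTS} ANY() arguments")
    return problems


# (language_code: ANY("en")) OR (available: true) from the table.
print(validate([[AnyTerm("language_code", ("en",)), Available()]]))
```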

The following filter expression filters for documents that are in the drama or action category, that are not in English, and that are available:

categories: ANY("drama", "action") AND NOT language_code: ANY("en") AND available: true
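A filter expression is passed as a string with the recommendation request. The following sketch only builds a JSON request body and sends nothing. It assumes that the recommend method takes the expression in a `filter` field next to a `userEvent`. The field names, the `view-item` event type, and the user pseudo ID are assumptions to check against the API reference. The function `build_recommend_body` is hypothetical.

```python
import json

FILTER_EXPRESSION = (
    'categories: ANY("drama", "action") '
    'AND NOT language_code: ANY("en") '
    "AND available: true"
)


def build_recommend_body(document_id: str, user_pseudo_id: str, filter_expression: str) -> dict:
    """Build a request body; the field names are assumed, not verified here."""
    return {
        "userEvent": {
            "eventType": "view-item",
            "userPseudoId": user_pseudo_id,
            "documents": [{"id": document_id}],
        },
        "filter": filter_expression,
    }


# Document "1" is Toy Story in the example documents on this page.
body = build_recommend_body("1", "hypothetical-visitor-123", FILTER_EXPRESSION)
print(json.dumps(body, indent=2))
```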

Filtering limits

Each filterable document field consumes some memory in each of your models. The following limits help prevent adverse effects on serving performance:

- Up to 10 custom fields can be set as filterable in your schema. If more than 10 custom fields are found during app training, only 10 are used.
- Up to 100,000,000 filterable field values can be present in your schema.

You can estimate the total number of filterable field values in your schema by multiplying the number of documents in your data store by the number of filterable fields. If you exceed these limits, the following happens:
- You cannot set additional fields as filterable.
- App training fails.
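The estimate described above is a single multiplication, so it is easy to check before you mark another field as filterable. The following sketch uses the limits stated in this section. The function names and the document counts are hypothetical.

```python
MAX_FILTERABLE_CUSTOM_FIELDS = 10
MAX_FILTERABLE_FIELD_VALUES = 100_000_000


def estimate_filterable_values(num_documents: int, num_filterable_fields: int) -> int:
    """Estimate filterable field values as documents times filterable fields."""
    return num_documents * num_filterable_fields


def within_filtering_limits(num_documents: int, num_filterable_fields: int) -> bool:
    """Check the estimate and the field count against the stated limits."""
    if num_filterable_fields > MAX_FILTERABLE_CUSTOM_FIELDS:
        return False
    estimate = estimate_filterable_values(num_documents, num_filterable_fields)
    return estimate <= MAX_FILTERABLE_FIELD_VALUES


# 5,000,000 documents with 10 filterable fields: 50,000,000 values.
print(within_filtering_limits(5_000_000, 10))
# 20,000,000 documents with 6 filterable fields: 120,000,000 values.
print(within_filtering_limits(20_000_000, 6))
```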