Method: projects.locations.obtainCrawlRate

Obtains the time series data of organic or dedicated crawl rate for monitoring. When a dedicated crawl rate is not set, the method returns Vertex AI's organic crawl rate time series. Organic crawl means that Google automatically crawls the internet at its own convenience. When a dedicated crawl rate is set, the method returns Vertex AI's dedicated crawl rate time series.

HTTP request

POST https://discoveryengine.googleapis.com/v1alpha/{location=projects/*/locations/*}:obtainCrawlRate

The URL uses gRPC Transcoding syntax.

Path parameters

Parameters
location

string

Required. The location resource where crawl rate management will be performed. Format: projects/{project}/locations/{location}

Request body

The request body contains data with the following structure:

JSON representation
{
  "crawlRateScope": string
}
Fields
crawlRateScope

string

Required. The scope of the crawl rate that the user wants to monitor. Currently, only domain names and host names are supported. A domain name example: example.com. A host name example: www.example.com. Do not include / in the domain or host name.
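
The request described above could be assembled as in the following minimal Python sketch. The helper name build_obtain_crawl_rate_request and the sample values my-project and global are hypothetical; only the URL template and the crawlRateScope field come from this page. Sending the request with an OAuth token is intentionally left out.

```python
import json

# Base URL taken from the HTTP request template on this page.
BASE_URL = "https://discoveryengine.googleapis.com/v1alpha"


def build_obtain_crawl_rate_request(project, location, crawl_rate_scope):
    """Return (url, json_body) for a POST to obtainCrawlRate.

    Hypothetical helper: it only builds the request, it does not send it.
    """
    # The field description says the scope must not contain "/".
    if not crawl_rate_scope or "/" in crawl_rate_scope:
        raise ValueError("crawlRateScope must be a domain or host name without '/'")
    url = f"{BASE_URL}/projects/{project}/locations/{location}:obtainCrawlRate"
    body = json.dumps({"crawlRateScope": crawl_rate_scope})
    return url, body


# Sample values below are placeholders, not real resources.
url, body = build_obtain_crawl_rate_request("my-project", "global", "www.example.com")
print(url)
print(body)
```

Rejecting a scope that contains / before sending avoids a round trip for an input the field description already rules out.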

Response body

Response message for CrawlRateManagementService.ObtainCrawlRate method. The response contains organic or dedicated crawl rate time series data for monitoring, depending on whether a dedicated crawl rate is set.

If successful, the response body contains data with the following structure:

JSON representation
{
  "state": enum (State),
  "error": {
    object (Status)
  },

  // Union field crawl_rate_time_series can be only one of the following:
  "organicCrawlRateTimeSeries": {
    object (OrganicCrawlRateTimeSeries)
  },
  "dedicatedCrawlRateTimeSeries": {
    object (DedicatedCrawlRateTimeSeries)
  }
  // End of list of possible types for union field crawl_rate_time_series.
}
Fields
state

enum (State)

Output only. The state of the response.

error

object (Status)

Errors from service when handling the request.

Union field crawl_rate_time_series. If the user has set a dedicated crawl rate, the response contains the dedicated crawl rate time series; otherwise, it contains the organic crawl rate time series. crawl_rate_time_series can be only one of the following:
organicCrawlRateTimeSeries

object (OrganicCrawlRateTimeSeries)

The historical organic crawl rate time series data, used for monitoring.

dedicatedCrawlRateTimeSeries

object (DedicatedCrawlRateTimeSeries)

The historical dedicated crawl rate time series data, used for monitoring.
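
Because crawl_rate_time_series is a union field, a client can check which member is present before reading it. The following is a minimal Python sketch that works on the response parsed into a dict; the function name select_crawl_rate_series and the sample response are made up for illustration.

```python
def select_crawl_rate_series(response):
    """Return (kind, series), where kind is "organic", "dedicated", or None.

    Hypothetical helper operating on the JSON response parsed into a dict.
    """
    has_organic = "organicCrawlRateTimeSeries" in response
    has_dedicated = "dedicatedCrawlRateTimeSeries" in response
    # A union field can carry only one of its members.
    if has_organic and has_dedicated:
        raise ValueError("only one member of crawl_rate_time_series may be set")
    if has_dedicated:
        return "dedicated", response["dedicatedCrawlRateTimeSeries"]
    if has_organic:
        return "organic", response["organicCrawlRateTimeSeries"]
    return None, None


# Illustrative response; the empty objects stand in for real time series data.
sample_response = {
    "state": "SUCCEEDED",
    "dedicatedCrawlRateTimeSeries": {
        "userTriggeredCrawlRate": {},
        "autoRefreshCrawlRate": {},
    },
}
kind, series = select_crawl_rate_series(sample_response)
print(kind, sorted(series))
```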

Authorization scopes

Requires the following OAuth scope:

  • https://www.googleapis.com/auth/cloud-platform

For more information, see the Authentication Overview.

OrganicCrawlRateTimeSeries

The historical organic crawl rate time series data, used for monitoring. Organic crawl is determined automatically by Google and is used to crawl the user's website when dedicated crawl is not set. Crawl rate is the QPS of the crawl requests that Google sends to the user's website.

JSON representation
{
  "googleOrganicCrawlRate": {
    object (CrawlRateTimeSeries)
  },
  "vertexAiOrganicCrawlRate": {
    object (CrawlRateTimeSeries)
  }
}
Fields
googleOrganicCrawlRate

object (CrawlRateTimeSeries)

Google's organic crawl rate time series, which is the sum of the crawl rates of all googlebots. Refer to https://developers.google.com/search/docs/crawling-indexing/overview-google-crawlers for more details about googlebots.

vertexAiOrganicCrawlRate

object (CrawlRateTimeSeries)

Vertex AI's organic crawl rate time series, which is the crawl rate of Google-CloudVertexBot when dedicated crawl is not set. Refer to https://developers.google.com/search/docs/crawling-indexing/google-common-crawlers#google-cloudvertexbot for more details about Google-CloudVertexBot.

CrawlRateTimeSeries

The historical crawl rate time series data, used for monitoring.

JSON representation
{
  "qpsTimeSeries": {
    object (TimeSeries)
  }
}
Fields
qpsTimeSeries

object (TimeSeries)

The QPS of the crawl rate.

TimeSeries

A collection of data points that describes the time-varying values of a metric. A time series is identified by a combination of a fully-specified monitored resource and a fully-specified metric. This type is used for both listing and creating time series.

JSON representation
{
  "metric": {
    object (Metric)
  },
  "resource": {
    object (MonitoredResource)
  },
  "metadata": {
    object (MonitoredResourceMetadata)
  },
  "metricKind": enum (MetricKind),
  "valueType": enum (ValueType),
  "points": [
    {
      object (Point)
    }
  ],
  "unit": string,
  "description": string
}
Fields
metric

object (Metric)

The associated metric. A fully-specified metric used to identify the time series.

resource

object (MonitoredResource)

The associated monitored resource. Custom metrics can use only certain monitored resource types in their time series data. For more information, see Monitored resources for custom metrics.

metadata

object (MonitoredResourceMetadata)

Output only. The associated monitored resource metadata. When reading a time series, this field will include metadata labels that are explicitly named in the reduction. When creating a time series, this field is ignored.

metricKind

enum (MetricKind)

The metric kind of the time series. When listing time series, this metric kind might be different from the metric kind of the associated metric if this time series is an alignment or reduction of other time series.

When creating a time series, this field is optional. If present, it must be the same as the metric kind of the associated metric. If the associated metric's descriptor must be auto-created, then this field specifies the metric kind of the new descriptor and must be either GAUGE (the default) or CUMULATIVE.

valueType

enum (ValueType)

The value type of the time series. When listing time series, this value type might be different from the value type of the associated metric if this time series is an alignment or reduction of other time series.

When creating a time series, this field is optional. If present, it must be the same as the type of the data in the points field.

points[]

object (Point)

The data points of this time series. When listing time series, points are returned in reverse time order.

When creating a time series, this field must contain exactly one point and the point's type must be the same as the value type of the associated metric. If the associated metric's descriptor must be auto-created, then the value type of the descriptor is determined by the point's type, which must be BOOL, INT64, DOUBLE, or DISTRIBUTION.

unit

string

The units in which the metric value is reported. It is only applicable if the valueType is INT64, DOUBLE, or DISTRIBUTION. The unit defines the representation of the stored metric values. This field can only be changed through CreateTimeSeries when it is empty.

description

string

Input only. A detailed description of the time series that will be associated with the google.api.MetricDescriptor for the metric. Once set, this field cannot be changed through CreateTimeSeries.
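
Since points may arrive newest first, a client that plots QPS over time might reverse them. The sketch below is a hedged Python illustration: the function name chronological_values is hypothetical, and the doubleValue and int64Value keys are assumed from the TypedValue message, which is not described on this page.

```python
def chronological_values(time_series):
    """Return [(end_time, value), ...] oldest first.

    Assumes the input points are in reverse time order, as stated for
    listed time series. Points without a numeric value are skipped.
    """
    rows = []
    for point in reversed(time_series.get("points", [])):
        value = point.get("value", {})
        if "doubleValue" in value:  # assumed TypedValue key
            number = float(value["doubleValue"])
        elif "int64Value" in value:  # assumed TypedValue key; int64 is a JSON string
            number = int(value["int64Value"])
        else:
            continue
        rows.append((point["interval"]["endTime"], number))
    return rows


# Illustrative data only.
sample_series = {
    "points": [
        {"interval": {"endTime": "2014-10-02T15:02:23Z"}, "value": {"doubleValue": 2.5}},
        {"interval": {"endTime": "2014-10-02T15:01:23Z"}, "value": {"int64Value": "1"}},
    ]
}
rows = chronological_values(sample_series)
print(rows)
```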

Metric

A specific metric, identified by specifying values for all of the labels of a MetricDescriptor.

JSON representation
{
  "type": string,
  "labels": {
    string: string,
    ...
  }
}
Fields
type

string

An existing metric type, see google.api.MetricDescriptor. For example, custom.googleapis.com/invoice/paid/amount.

labels

map (key: string, value: string)

The set of label values that uniquely identify this metric. All labels listed in the MetricDescriptor must be assigned values.

MonitoredResourceMetadata

Auxiliary metadata for a MonitoredResource object. MonitoredResource objects contain the minimum set of information to uniquely identify a monitored resource instance. There is some other useful auxiliary metadata. Monitoring and Logging use an ingestion pipeline to extract metadata for cloud resources of all types, and store the metadata in this message.

JSON representation
{
  "systemLabels": {
    object
  },
  "userLabels": {
    string: string,
    ...
  }
}
Fields
systemLabels

object (Struct format)

Output only. Values for predefined system metadata labels. System labels are a kind of metadata extracted by Google, including "machine_image", "vpc", "subnet_id", "security_group", "name", etc. System label values can be only strings, Boolean values, or a list of strings. For example:

{ "name": "my-test-instance",
  "security_group": ["a", "b", "c"],
  "spot_instance": false }
userLabels

map (key: string, value: string)

Output only. A map of user-defined metadata labels.

MetricKind

The kind of measurement. It describes how the data is reported. For information on setting the start time and end time based on the MetricKind, see TimeInterval.

Enums
METRIC_KIND_UNSPECIFIED Do not use this default value.
GAUGE An instantaneous measurement of a value.
DELTA The change in a value during a time interval.
CUMULATIVE A value accumulated over a time interval. Cumulative measurements in a time series should have the same start time and increasing end times, until an event resets the cumulative value to zero and sets a new start time for the following points.

ValueType

The value type of a metric.

Enums
VALUE_TYPE_UNSPECIFIED Do not use this default value.
BOOL The value is a boolean. This value type can be used only if the metric kind is GAUGE.
INT64 The value is a signed 64-bit integer.
DOUBLE The value is a double precision floating point number.
STRING The value is a text string. This value type can be used only if the metric kind is GAUGE.
DISTRIBUTION The value is a Distribution.
MONEY The value is money.

Point

A single data point in a time series.

JSON representation
{
  "interval": {
    object (TimeInterval)
  },
  "value": {
    object (TypedValue)
  }
}
Fields
interval

object (TimeInterval)

The time interval to which the data point applies. For GAUGE metrics, the start time is optional, but if it is supplied, it must equal the end time. For DELTA metrics, the start and end time should specify a non-zero interval, with subsequent points specifying contiguous and non-overlapping intervals. For CUMULATIVE metrics, the start and end time should specify a non-zero interval, with subsequent points specifying the same start time and increasing end times, until an event resets the cumulative value to zero and sets a new start time for the following points.

value

object (TypedValue)

The value of the data point.
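
The interval rules for each metric kind can be expressed as a small check. This is a minimal Python sketch under stated assumptions: the function name interval_is_valid is hypothetical, timestamps are assumed to be already parsed into datetime objects, and the check covers a single point only (it does not verify that consecutive points are contiguous or share a start time).

```python
from datetime import datetime, timezone
from typing import Optional


def interval_is_valid(metric_kind: str, start: Optional[datetime], end: datetime) -> bool:
    """Check one point's interval against the rule for its metric kind."""
    if metric_kind == "GAUGE":
        # Start time is optional; if supplied, it must equal the end time.
        return start is None or start == end
    if metric_kind in ("DELTA", "CUMULATIVE"):
        # Start and end must describe a non-zero interval.
        return start is not None and start < end
    # METRIC_KIND_UNSPECIFIED or unknown kinds are treated as invalid here.
    return False


t0 = datetime(2014, 10, 2, 15, 0, 0, tzinfo=timezone.utc)
t1 = datetime(2014, 10, 2, 15, 1, 0, tzinfo=timezone.utc)
print(interval_is_valid("GAUGE", None, t1))
print(interval_is_valid("DELTA", t0, t1))
print(interval_is_valid("CUMULATIVE", t1, t1))
```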

TimeInterval

A time interval extending just after a start time through an end time. If the start time is the same as the end time, then the interval represents a single point in time.

JSON representation
{
  "endTime": string,
  "startTime": string
}
Fields
endTime

string (Timestamp format)

Required. The end of the time interval.

A timestamp in RFC3339 UTC "Zulu" format, with nanosecond resolution and up to nine fractional digits. Examples: "2014-10-02T15:01:23Z" and "2014-10-02T15:01:23.045123456Z".

startTime

string (Timestamp format)

Optional. The beginning of the time interval. The default value for the start time is the end time. The start time must not be later than the end time.

A timestamp in RFC3339 UTC "Zulu" format, with nanosecond resolution and up to nine fractional digits. Examples: "2014-10-02T15:01:23Z" and "2014-10-02T15:01:23.045123456Z".
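
Timestamps in this format can carry up to nine fractional digits, which is more than Python's datetime stores. The following minimal sketch, with the hypothetical function name parse_timestamp, truncates the fraction to microseconds; a client that needs full nanosecond precision would have to keep the fraction separately.

```python
from datetime import datetime, timezone


def parse_timestamp(ts: str) -> datetime:
    """Parse an RFC3339 UTC "Zulu" timestamp, truncating to microseconds."""
    if not ts.endswith("Z"):
        raise ValueError("expected a UTC timestamp ending in 'Z'")
    body = ts[:-1]
    if "." in body:
        seconds, fraction = body.split(".", 1)
    else:
        seconds, fraction = body, ""
    if fraction and (len(fraction) > 9 or not fraction.isdigit()):
        raise ValueError("fractional seconds must be 1 to 9 digits")
    # Pad to at least 6 digits, then keep only microsecond precision.
    microseconds = int((fraction + "000000")[:6]) if fraction else 0
    base = datetime.strptime(seconds, "%Y-%m-%dT%H:%M:%S")
    return base.replace(microsecond=microseconds, tzinfo=timezone.utc)


print(parse_timestamp("2014-10-02T15:01:23Z").isoformat())
print(parse_timestamp("2014-10-02T15:01:23.045123456Z").isoformat())
```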

DedicatedCrawlRateTimeSeries

The historical dedicated crawl rate time series data, used for monitoring. Dedicated crawl is used by Vertex AI to crawl the user's website when dedicated crawl is set.

JSON representation
{
  "userTriggeredCrawlRate": {
    object (CrawlRateTimeSeries)
  },
  "autoRefreshCrawlRate": {
    object (CrawlRateTimeSeries)
  }
}
Fields
userTriggeredCrawlRate

object (CrawlRateTimeSeries)

Vertex AI's dedicated crawl rate time series of user-triggered crawl, which is the crawl rate of Google-CloudVertexBot when dedicated crawl is set. User-triggered crawl is for deterministic use cases, such as crawling URLs or sitemaps specified by users.

autoRefreshCrawlRate

object (CrawlRateTimeSeries)

Vertex AI's dedicated crawl rate time series of auto-refresh, which is the crawl rate of Google-CloudVertexBot when dedicated crawl is set. Auto-refresh crawl is for best-effort use cases, such as refreshing URLs periodically.

State

Different states of the response.

Enums
STATE_UNSPECIFIED The state is unspecified.
SUCCEEDED The request succeeded.
FAILED The request failed.
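
A client can check state before reading any time series. The sketch below is a hedged Python illustration: CrawlRateError and ensure_succeeded are hypothetical names, and the code and message keys are assumed from the Status object, which is not described on this page.

```python
class CrawlRateError(Exception):
    """Hypothetical exception for a response whose state is not SUCCEEDED."""


def ensure_succeeded(response: dict) -> dict:
    """Return the response unchanged if state is SUCCEEDED, else raise."""
    state = response.get("state", "STATE_UNSPECIFIED")
    if state == "SUCCEEDED":
        return response
    error = response.get("error", {})
    # "code" and "message" are assumed Status fields.
    raise CrawlRateError(
        f"state={state} code={error.get('code')} message={error.get('message')}"
    )


# Illustrative failed response.
failed = {"state": "FAILED", "error": {"code": 3, "message": "invalid scope"}}
try:
    ensure_succeeded(failed)
except CrawlRateError as exc:
    print(exc)
```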