Storage Client

Client for interacting with the Google Cloud Storage API.

class google.cloud.storage.client.Client(project=

Bases: google.cloud.client.ClientWithProject

Client to bundle configuration needed for API requests.

  • Parameters

    • project (str or None) – The project which the client acts on behalf of. Will be passed when creating a bucket. If not passed, falls back to the default inferred from the environment.

    • credentials (Credentials) – (Optional) The OAuth2 Credentials to use for this client. If not passed (and if no _http object is passed), falls back to the default inferred from the environment.

    • _http (Session) – (Optional) HTTP object to make requests. Can be any object that defines request() with the same interface as requests.Session.request(). If not passed, an _http object is created that is bound to the credentials for the current object. This parameter should be considered private, and could change in the future.

    • client_info (ClientInfo) – The client info used to send a user-agent string along with API requests. If None, then default info will be used. Generally, you only need to set this if you’re developing your own library or partner tool.

    • client_options (ClientOptions or dict) – (Optional) Client options used to set user options on the client. API Endpoint should be set through client_options.

SCOPE = ('https://www.googleapis.com/auth/devstorage.full_control', 'https://www.googleapis.com/auth/devstorage.read_only', 'https://www.googleapis.com/auth/devstorage.read_write')

The scopes required for authenticating as a Cloud Storage consumer.

batch()

Factory constructor for batch object.

NOTE: This will not make an HTTP request; it simply instantiates a batch object owned by this client.

bucket(bucket_name, user_project=None)

Factory constructor for bucket object.

NOTE: This will not make an HTTP request; it simply instantiates a bucket object owned by this client.

  • Parameters

    • bucket_name (str) – The name of the bucket to be instantiated.

    • user_project (str) – (Optional) The project ID to be billed for API requests made via the bucket.

  • Return type

    google.cloud.storage.bucket.Bucket

  • Returns

    The bucket object created.

classmethod create_anonymous_client()

Factory: return client with anonymous credentials.

NOTE: Such a client has only limited access to “public” buckets: listing their contents and downloading their blobs.

  • Return type

    google.cloud.storage.client.Client

  • Returns

    Instance with anonymous credentials and no project.

create_bucket(bucket_or_name, requester_pays=None, project=None, user_project=None, location=None, predefined_acl=None, predefined_default_object_acl=None, timeout=60, retry=<google.api_core.retry.Retry object>)

API call: create a new bucket via a POST request.

See https://cloud.google.com/storage/docs/json_api/v1/buckets/insert

  • Parameters

    • bucket_or_name (Union[Bucket, str]) – The bucket resource to pass or name to create.

    • requester_pays (bool) – DEPRECATED. Use Bucket().requester_pays instead. (Optional) Whether requester pays for API requests for this bucket and its blobs.

    • project (str) – (Optional) The project under which the bucket is to be created. If not passed, uses the project set on the client.

    • user_project (str) – (Optional) The project ID to be billed for API requests made via created bucket.

    • location (str) – (Optional) The location of the bucket. If not passed, the default location, US, will be used. See https://cloud.google.com/storage/docs/bucket-locations

    • predefined_acl (str) – (Optional) Name of predefined ACL to apply to bucket. See: https://cloud.google.com/storage/docs/access-control/lists#predefined-acl

    • predefined_default_object_acl (str) – (Optional) Name of predefined ACL to apply to bucket’s objects. See: https://cloud.google.com/storage/docs/access-control/lists#predefined-acl

    • timeout (Optional[Union[float, Tuple[float, float]]]) – The amount of time, in seconds, to wait for the server response.

      Can also be passed as a tuple (connect_timeout, read_timeout). See requests.Session.request() documentation for details.

    • retry (Optional[Union[google.api_core.retry.Retry, google.cloud.storage.retry.ConditionalRetryPolicy]]) – How to retry the RPC. A None value will disable retries. A google.api_core.retry.Retry value will enable retries, and the object will define retriable response codes and errors and configure backoff and timeout options.

      A google.cloud.storage.retry.ConditionalRetryPolicy value wraps a Retry object and activates it only if certain conditions are met. This class exists to provide safe defaults for RPC calls that are not technically safe to retry normally (due to potential data duplication or other side-effects) but become safe to retry if a condition such as if_metageneration_match is set.

      See the retry.py source code and docstrings in this package (google.cloud.storage.retry) for information on retry types and how to configure them.

  • Returns

    google.cloud.storage.bucket.Bucket

      The newly created bucket.
    
  • Raises

    google.cloud.exceptions.Conflict – If the bucket already exists.
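
The conditional retry idea described for the retry parameter can be modeled in a few lines. This is only a sketch, not the library's implementation: ConditionalPolicySketch, resolve, and is_metageneration_specified are hypothetical names, and a plain string stands in for a real Retry object. It shows a wrapped policy being handed back only when the request carries a precondition such as if_metageneration_match.

```python
class ConditionalPolicySketch:
    """Hypothetical model: wrap a retry policy and release it only when a predicate holds."""

    def __init__(self, retry_policy, predicate, required_keys):
        self.retry_policy = retry_policy
        self.predicate = predicate
        self.required_keys = required_keys

    def resolve(self, query_params):
        """Return the wrapped policy if the predicate accepts the selected params, else None."""
        values = [query_params.get(key) for key in self.required_keys]
        return self.retry_policy if self.predicate(*values) else None


def is_metageneration_specified(if_metageneration_match):
    """Treat the call as safe to retry only when a metageneration precondition is set."""
    return if_metageneration_match is not None


policy = ConditionalPolicySketch(
    retry_policy="retry-with-backoff",  # stand-in for a real Retry object
    predicate=is_metageneration_specified,
    required_keys=["ifMetagenerationMatch"],
)

print(policy.resolve({"ifMetagenerationMatch": 3}))  # wrapped policy is released
print(policy.resolve({}))  # no precondition, so no retry policy
```

Passing retry=None to a client method remains the documented way to switch retries off entirely.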

Examples

Create a bucket using a string.

bucket = client.create_bucket("my-bucket")
assert isinstance(bucket, Bucket)
# <Bucket: my-bucket>

Create a bucket using a resource.

>>> from google.cloud import storage
>>> client = storage.Client()
>>> # Set properties on a plain resource object.
>>> # Bucket takes the client first, then the bucket name.
>>> bucket = storage.Bucket(client, "my-bucket-name")
>>> bucket.location = "europe-west6"
>>> bucket.storage_class = "COLDLINE"
>>> # Pass that resource object to the client.
>>> bucket = client.create_bucket(bucket)  # API request.

create_hmac_key(service_account_email, project_id=None, user_project=None, timeout=60)

Create an HMAC key for a service account.

  • Parameters

    • service_account_email (str) – e-mail address of the service account

    • project_id (str) – (Optional) Explicit project ID for the key. Defaults to the client’s project.

    • user_project (str) – (Optional) This parameter is currently ignored.

    • timeout (float or tuple) – (Optional) The amount of time, in seconds, to wait for the server response.

      Can also be passed as a tuple (connect_timeout, read_timeout). See requests.Session.request() documentation for details.

  • Return type

    Tuple[HMACKeyMetadata, str]

  • Returns

    Metadata for the created key, plus the bytes of the key’s secret, which is a 40-character base64-encoded string.
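
Because create_hmac_key returns a two-element tuple, callers usually unpack it straight away. The helper below is a minimal sketch: create_hmac_credentials is a hypothetical name, client is assumed to be a Client instance, and access_id is assumed to be the attribute on the returned metadata that identifies the key.

```python
def create_hmac_credentials(client, service_account_email):
    """Create an HMAC key and return (access_id, secret).

    Hypothetical helper: `client` is expected to behave like
    google.cloud.storage.client.Client.
    """
    metadata, secret = client.create_hmac_key(service_account_email)
    # The secret is handed back as part of this tuple, so persist it
    # somewhere safe before discarding the return value.
    return metadata.access_id, secret
```

The access ID can later be passed to get_hmac_key_metadata to fetch the key's metadata again.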

property current_batch()

Currently-active batch.

download_blob_to_file(blob_or_uri, file_obj, start=None, end=None, raw_download=False, if_generation_match=None, if_generation_not_match=None, if_metageneration_match=None, if_metageneration_not_match=None, timeout=60, checksum='md5')

Download the contents of a blob object or blob URI into a file-like object.

  • Parameters

    • blob_or_uri (Union[Blob, str]) – The blob resource to pass or URI to download.

    • file_obj (file) – A file handle to which to write the blob’s data.

    • start (int) – (Optional) The first byte in a range to be downloaded.

    • end (int) – (Optional) The last byte in a range to be downloaded.

    • raw_download (bool) – (Optional) If true, download the object without any expansion.

    • if_generation_match (long) – (Optional) Make the operation conditional on whether the blob’s current generation matches the given value. Setting to 0 makes the operation succeed only if there are no live versions of the blob.

    • if_generation_not_match (long) – (Optional) Make the operation conditional on whether the blob’s current generation does not match the given value. If no live blob exists, the precondition fails. Setting to 0 makes the operation succeed only if there is a live version of the blob.

    • if_metageneration_match (long) – (Optional) Make the operation conditional on whether the blob’s current metageneration matches the given value.

    • if_metageneration_not_match (long) – (Optional) Make the operation conditional on whether the blob’s current metageneration does not match the given value.

    • timeout (Union[float, Tuple[float, float]]) – (Optional) The number of seconds the transport should wait for the server response. Depending on the retry strategy, a request may be repeated several times using the same timeout each time. Can also be passed as a tuple (connect_timeout, read_timeout). See requests.Session.request() documentation for details.

    • checksum (str) – (Optional) The type of checksum to compute to verify the integrity of the object. The response headers must contain a checksum of the requested type. If the headers lack an appropriate checksum (for instance in the case of transcoded or ranged downloads where the remote service does not know the correct checksum, including downloads where chunk_size is set) an INFO-level log will be emitted. Supported values are “md5”, “crc32c” and None. The default is “md5”.
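
The generation preconditions listed above can be read as two small predicates. The sketch below is a plain-Python model of the documented rules, not code from the library; the function names are hypothetical, and live_generation is assumed to be None when the blob has no live version.

```python
def generation_match_ok(live_generation, if_generation_match):
    """Model of if_generation_match; live_generation is None when no live version exists."""
    if if_generation_match is None:
        return True  # no precondition requested
    if if_generation_match == 0:
        return live_generation is None  # 0 means "only if there is no live version"
    return live_generation == if_generation_match


def generation_not_match_ok(live_generation, if_generation_not_match):
    """Model of if_generation_not_match under the same convention."""
    if if_generation_not_match is None:
        return True  # no precondition requested
    if live_generation is None:
        return False  # documented: fails when no live blob exists
    return live_generation != if_generation_not_match


print(generation_match_ok(None, 0))      # no live version, 0 requested
print(generation_not_match_ok(5, 0))     # live version present, 0 requested
```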

Examples

Download a blob using a blob resource.

>>> from google.cloud import storage
>>> client = storage.Client()
>>> bucket = client.get_bucket('my-bucket-name')
>>> blob = storage.Blob('path/to/blob', bucket)
>>> # Open in binary write mode so the downloaded bytes can be written.
>>> with open('file-to-download-to', 'wb') as file_obj:
...     client.download_blob_to_file(blob, file_obj)  # API request.

Download a blob using a URI.

>>> from google.cloud import storage
>>> client = storage.Client()
>>> # Open in binary write mode so the downloaded bytes can be written.
>>> with open('file-to-download-to', 'wb') as file_obj:
...     client.download_blob_to_file(
...         'gs://bucket_name/path/to/blob', file_obj)
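
Since file_obj only needs to be a writable file-like object, an in-memory buffer works as well as a file on disk. The helper below is a small sketch under that assumption; download_to_bytes is a hypothetical name, and client is expected to be a Client instance.

```python
import io


def download_to_bytes(client, blob_or_uri, start=None, end=None):
    """Download a blob, or a byte range of it, into memory and return the bytes.

    Hypothetical helper: `client` is expected to behave like
    google.cloud.storage.client.Client.
    """
    buffer = io.BytesIO()
    # start and end are the first and last byte of the requested range.
    client.download_blob_to_file(blob_or_uri, buffer, start=start, end=end)
    return buffer.getvalue()
```

This suits small objects; for large downloads, writing to a file opened in 'wb' mode avoids holding the whole payload in memory.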

generate_signed_post_policy_v4(bucket_name, blob_name, expiration, conditions=None, fields=None, credentials=None, virtual_hosted_style=False, bucket_bound_hostname=None, scheme='http', service_account_email=None, access_token=None)

Generate a V4 signed policy object.

NOTE: Assumes credentials implements the google.auth.credentials.Signing interface. Also assumes credentials has a service_account_email property which identifies the credentials.

Generated policy object allows user to upload objects with a POST request.

  • Parameters

  • Return type

    dict

  • Returns

    Signed POST policy.

Example

Generate signed POST policy and upload a file.

>>> import datetime
>>> import pytz
>>> import requests
>>> from google.cloud import storage
>>> client = storage.Client()
>>> tz = pytz.timezone('America/New_York')
>>> policy = client.generate_signed_post_policy_v4(
...     "bucket-name",
...     "blob-name",
...     # localize() attaches the zone's correct UTC offset for that date.
...     expiration=tz.localize(datetime.datetime(2020, 3, 17)),
...     conditions=[
...         ["content-length-range", 0, 255]
...     ],
...     fields={
...         "x-goog-meta-hello": "world"
...     },
... )
>>> with open("bucket-name", "rb") as f:
...     files = {"file": ("bucket-name", f)}
...     requests.post(policy["url"], data=policy["fields"], files=files)

get_bucket(bucket_or_name, timeout=60, if_metageneration_match=None, if_metageneration_not_match=None, retry=<google.api_core.retry.Retry object>)

API call: retrieve a bucket via a GET request.

See https://cloud.google.com/storage/docs/json_api/v1/buckets/get

  • Parameters

    • bucket_or_name (Union[Bucket, str]) – The bucket resource to pass or name to retrieve.

    • timeout (Optional[Union[float, Tuple[float, float]]]) – The amount of time, in seconds, to wait for the server response.

      Can also be passed as a tuple (connect_timeout, read_timeout). See requests.Session.request() documentation for details.

    • if_metageneration_match (Optional[long]) – Make the operation conditional on whether the bucket’s current metageneration matches the given value.

    • if_metageneration_not_match (Optional[long]) – Make the operation conditional on whether the bucket’s current metageneration does not match the given value.

    • retry (Optional[Union[google.api_core.retry.Retry, google.cloud.storage.retry.ConditionalRetryPolicy]]) – How to retry the RPC. A None value will disable retries. A google.api_core.retry.Retry value will enable retries, and the object will define retriable response codes and errors and configure backoff and timeout options.

      A google.cloud.storage.retry.ConditionalRetryPolicy value wraps a Retry object and activates it only if certain conditions are met. This class exists to provide safe defaults for RPC calls that are not technically safe to retry normally (due to potential data duplication or other side-effects) but become safe to retry if a condition such as if_metageneration_match is set.

      See the retry.py source code and docstrings in this package (google.cloud.storage.retry) for information on retry types and how to configure them.

  • Returns

    google.cloud.storage.bucket.Bucket

      The bucket matching the name provided.
    
  • Raises

    google.cloud.exceptions.NotFound – If the bucket is not found.

Examples

Retrieve a bucket using a string.

try:
    bucket = client.get_bucket("my-bucket")
except google.cloud.exceptions.NotFound:
    print("Sorry, that bucket does not exist!")

Get a bucket using a resource.

>>> from google.cloud import storage
>>> client = storage.Client()
>>> # Retrieve the bucket by name.
>>> bucket = client.get_bucket("my-bucket-name")
>>> # Time passes. Another program may have modified the bucket
... # in the meantime, so you want to get the latest state.
>>> bucket = client.get_bucket(bucket)  # API request.

get_hmac_key_metadata(access_id, project_id=None, user_project=None, timeout=60)

Return a metadata instance for the given HMAC key.

  • Parameters

    • access_id (str) – Unique ID of an existing key.

    • project_id (str) – (Optional) Project ID of an existing key. Defaults to client’s project.

    • timeout (float or tuple) – (Optional) The amount of time, in seconds, to wait for the server response.

      Can also be passed as a tuple (connect_timeout, read_timeout). See requests.Session.request() documentation for details.

    • user_project (str) – (Optional) This parameter is currently ignored.

get_service_account_email(project=None, timeout=60, retry=<google.api_core.retry.Retry object>)

Get the email address of the project’s GCS service account.

  • Parameters

    • project (str) – (Optional) Project ID to use for retrieving the GCS service account email address. Defaults to the client’s project.

    • timeout (float or tuple) – (Optional) The amount of time, in seconds, to wait for the server response.

      Can also be passed as a tuple (connect_timeout, read_timeout). See requests.Session.request() documentation for details.

    • retry (google.api_core.retry.Retry or google.cloud.storage.retry.ConditionalRetryPolicy) – (Optional) How to retry the RPC. A None value will disable retries. A google.api_core.retry.Retry value will enable retries, and the object will define retriable response codes and errors and configure backoff and timeout options.

      A google.cloud.storage.retry.ConditionalRetryPolicy value wraps a Retry object and activates it only if certain conditions are met. This class exists to provide safe defaults for RPC calls that are not technically safe to retry normally (due to potential data duplication or other side-effects) but become safe to retry if a condition such as if_metageneration_match is set.

      See the retry.py source code and docstrings in this package (google.cloud.storage.retry) for information on retry types and how to configure them.

  • Return type

    str

  • Returns

    service account email address
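
The timeout parameter accepted throughout this client takes either a single number or a (connect_timeout, read_timeout) tuple. The sketch below models how the two forms relate, assuming the requests convention that a single value applies to both phases; split_timeout is a hypothetical helper, not part of the library.

```python
def split_timeout(timeout):
    """Return (connect_timeout, read_timeout) for a number or a 2-tuple.

    Assumes the requests convention: one value covers both phases.
    """
    if isinstance(timeout, tuple):
        connect_timeout, read_timeout = timeout
        return float(connect_timeout), float(read_timeout)
    return float(timeout), float(timeout)


print(split_timeout(60))          # the default value used by these methods
print(split_timeout((3.05, 27)))  # separate connect and read limits
```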

list_blobs(bucket_or_name, max_results=None, page_token=None, prefix=None, delimiter=None, start_offset=None, end_offset=None, include_trailing_delimiter=None, versions=None, projection='noAcl', fields=None, timeout=60, retry=<google.api_core.retry.Retry object>)

Return an iterator used to find blobs in the bucket.

If user_project is set, bills the API request to that project.

  • Parameters

    • bucket_or_name (Union[Bucket, str]) – The bucket resource to pass or the name of the bucket whose blobs are listed.

    • max_results (int) – (Optional) The maximum number of blobs to return.

    • page_token (str) – (Optional) If present, return the next batch of blobs, using the value, which must correspond to the nextPageToken value returned in the previous response. Deprecated: use the pages property of the returned iterator instead of manually passing the token.

    • prefix (str) – (Optional) Prefix used to filter blobs.

    • delimiter (str) – (Optional) Delimiter, used with prefix to emulate hierarchy.

    • start_offset (str) – (Optional) Filter results to objects whose names are lexicographically equal to or after startOffset. If endOffset is also set, the objects listed will have names between startOffset (inclusive) and endOffset (exclusive).

    • end_offset (str) – (Optional) Filter results to objects whose names are lexicographically before endOffset. If startOffset is also set, the objects listed will have names between startOffset (inclusive) and endOffset (exclusive).

    • include_trailing_delimiter (boolean) – (Optional) If true, objects that end in exactly one instance of delimiter will have their metadata included in items in addition to prefixes.

    • versions (bool) – (Optional) Whether object versions should be returned as separate blobs.

    • projection (str) – (Optional) If used, must be ‘full’ or ‘noAcl’. Defaults to 'noAcl'. Specifies the set of properties to return.

    • fields (str) – (Optional) Selector specifying which fields to include in a partial response. Must be a list of fields. For example to get a partial response with just the next page token and the name and language of each blob returned: 'items(name,contentLanguage),nextPageToken'. See: https://cloud.google.com/storage/docs/json_api/v1/parameters#fields

    • timeout (Optional[Union[float, Tuple[float, float]]]) – The amount of time, in seconds, to wait for the server response.

      Can also be passed as a tuple (connect_timeout, read_timeout). See requests.Session.request() documentation for details.

    • retry (Optional[Union[google.api_core.retry.Retry, google.cloud.storage.retry.ConditionalRetryPolicy]]) – How to retry the RPC. A None value will disable retries. A google.api_core.retry.Retry value will enable retries, and the object will define retriable response codes and errors and configure backoff and timeout options.

      A google.cloud.storage.retry.ConditionalRetryPolicy value wraps a Retry object and activates it only if certain conditions are met. This class exists to provide safe defaults for RPC calls that are not technically safe to retry normally (due to potential data duplication or other side-effects) but become safe to retry if a condition such as if_metageneration_match is set.

      See the retry.py source code and docstrings in this package (google.cloud.storage.retry) for information on retry types and how to configure them.

  • Returns

    Iterator of all Blob in this bucket matching the arguments.
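
The interaction of prefix, delimiter, start_offset and end_offset is easier to see on a plain list of names. The function below is a simplified local model of those filters, written for illustration only; emulate_list_blobs and the sample names are hypothetical, and the real service handles further options such as include_trailing_delimiter that this sketch ignores.

```python
def emulate_list_blobs(names, prefix="", delimiter=None,
                       start_offset=None, end_offset=None):
    """Pure-Python model of the listing filters; returns (blob_names, prefixes)."""
    items, prefixes = [], set()
    for name in sorted(names):
        if not name.startswith(prefix):
            continue
        if start_offset is not None and name < start_offset:
            continue  # start_offset is inclusive
        if end_offset is not None and name >= end_offset:
            continue  # end_offset is exclusive
        rest = name[len(prefix):]
        if delimiter and delimiter in rest:
            # Collapse everything past the first delimiter into one "directory".
            prefixes.add(prefix + rest.split(delimiter, 1)[0] + delimiter)
        else:
            items.append(name)
    return items, sorted(prefixes)


names = [
    "logs/2020/a.txt",
    "logs/2020/b.txt",
    "logs/2021/c.txt",
    "logs/readme.txt",
    "data.csv",
]
print(emulate_list_blobs(names, prefix="logs/", delimiter="/"))
print(emulate_list_blobs(names, start_offset="logs/2020/b.txt", end_offset="logs/2021/"))
```

With prefix "logs/" and delimiter "/", only logs/readme.txt is returned as a blob, while the deeper names are grouped under the prefixes logs/2020/ and logs/2021/.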

Example

List blobs in the bucket with user_project.

>>> from google.cloud import storage
>>> client = storage.Client()
>>> bucket = storage.Bucket(client, "my-bucket-name", user_project="my-project")
>>> all_blobs = list(client.list_blobs(bucket))

list_buckets(max_results=None, page_token=None, prefix=None, projection='noAcl', fields=None, project=None, timeout=60, retry=<google.api_core.retry.Retry object>)

Get all buckets in the project associated to the client.

This will not populate the list of blobs available in each bucket.

for bucket in client.list_buckets():
    print(bucket)

This implements “storage.buckets.list”.

  • Parameters

    • max_results (int) – (Optional) The maximum number of buckets to return.

    • page_token (str) – (Optional) If present, return the next batch of buckets, using the value, which must correspond to the nextPageToken value returned in the previous response. Deprecated: use the pages property of the returned iterator instead of manually passing the token.

    • prefix (str) – (Optional) Filter results to buckets whose names begin with this prefix.

    • projection (str) – (Optional) Specifies the set of properties to return. If used, must be ‘full’ or ‘noAcl’. Defaults to ‘noAcl’.

    • fields (str) – (Optional) Selector specifying which fields to include in a partial response. Must be a list of fields. For example, to get a partial response with just the next page token and the ID of each bucket returned: 'items/id,nextPageToken'

    • project (str) – (Optional) The project whose buckets are to be listed. If not passed, uses the project set on the client.

    • timeout (float or tuple) – (Optional) The amount of time, in seconds, to wait for the server response.

      Can also be passed as a tuple (connect_timeout, read_timeout). See requests.Session.request() documentation for details.

    • retry (google.api_core.retry.Retry or google.cloud.storage.retry.ConditionalRetryPolicy) – (Optional) How to retry the RPC. A None value will disable retries. A google.api_core.retry.Retry value will enable retries, and the object will define retriable response codes and errors and configure backoff and timeout options.

      A google.cloud.storage.retry.ConditionalRetryPolicy value wraps a Retry object and activates it only if certain conditions are met. This class exists to provide safe defaults for RPC calls that are not technically safe to retry normally (due to potential data duplication or other side-effects) but become safe to retry if a condition such as if_metageneration_match is set.

      See the retry.py source code and docstrings in this package (google.cloud.storage.retry) for information on retry types and how to configure them.

  • Return type

    Iterator

  • Raises

    ValueError – if both project is None and the client’s project is also None.

  • Returns

    Iterator of all Bucket belonging to this project.
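
Since passing page_token by hand is deprecated, paging is normally done through the pages property of the returned iterator. The generator below is a minimal sketch of that pattern; bucket_names_by_page is a hypothetical name, and client is assumed to be a Client instance.

```python
def bucket_names_by_page(client, max_results=None):
    """Yield one list of bucket names per page of results.

    Hypothetical helper: `client` is expected to behave like
    google.cloud.storage.client.Client.
    """
    iterator = client.list_buckets(max_results=max_results)
    # Each element of `pages` is itself iterable and yields Bucket objects.
    for page in iterator.pages:
        yield [bucket.name for bucket in page]
```

Iterating the returned iterator directly, as in the for loop shown earlier, remains the simplest option when page boundaries do not matter.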

list_hmac_keys(max_results=None, service_account_email=None, show_deleted_keys=None, project_id=None, user_project=None, timeout=60, retry=<google.api_core.retry.Retry object>)

List HMAC keys for a project.

  • Parameters

    • max_results (int) – (Optional) Max number of keys to return in a given page.

    • service_account_email (str) – (Optional) Limit keys to those created by the given service account.

    • show_deleted_keys (bool) – (Optional) Include deleted keys in the list. Default is to exclude them.

    • project_id (str) – (Optional) Explicit project ID for the key. Defaults to the client’s project.

    • user_project (str) – (Optional) This parameter is currently ignored.

    • timeout (float or tuple) – (Optional) The amount of time, in seconds, to wait for the server response.

      Can also be passed as a tuple (connect_timeout, read_timeout). See requests.Session.request() documentation for details.

    • retry (google.api_core.retry.Retry or google.cloud.storage.retry.ConditionalRetryPolicy) – (Optional) How to retry the RPC. A None value will disable retries. A google.api_core.retry.Retry value will enable retries, and the object will define retriable response codes and errors and configure backoff and timeout options.

      A google.cloud.storage.retry.ConditionalRetryPolicy value wraps a Retry object and activates it only if certain conditions are met. This class exists to provide safe defaults for RPC calls that are not technically safe to retry normally (due to potential data duplication or other side-effects) but become safe to retry if a condition such as if_metageneration_match is set.

      See the retry.py source code and docstrings in this package (google.cloud.storage.retry) for information on retry types and how to configure them.

  • Return type

    Iterator

  • Returns

    Iterator of HMACKeyMetadata for the keys matching the arguments.

lookup_bucket(bucket_name, timeout=60, if_metageneration_match=None, if_metageneration_not_match=None, retry=<google.api_core.retry.Retry object>)

Get a bucket by name, returning None if not found.

You can use this if you would rather check for a None value than catching an exception:

bucket = client.lookup_bucket("doesnt-exist")
assert not bucket
# None
bucket = client.lookup_bucket("my-bucket")
assert isinstance(bucket, Bucket)
# <Bucket: my-bucket>

  • Parameters

    • bucket_name (str) – The name of the bucket to get.

    • timeout (float or tuple) – (Optional) The amount of time, in seconds, to wait for the server response.

      Can also be passed as a tuple (connect_timeout, read_timeout). See requests.Session.request() documentation for details.

    • if_metageneration_match (long) – (Optional) Make the operation conditional on whether the bucket’s current metageneration matches the given value.

    • if_metageneration_not_match (long) – (Optional) Make the operation conditional on whether the bucket’s current metageneration does not match the given value.

    • retry (google.api_core.retry.Retry or google.cloud.storage.retry.ConditionalRetryPolicy) – (Optional) How to retry the RPC. A None value will disable retries. A google.api_core.retry.Retry value will enable retries, and the object will define retriable response codes and errors and configure backoff and timeout options.

      A google.cloud.storage.retry.ConditionalRetryPolicy value wraps a Retry object and activates it only if certain conditions are met. This class exists to provide safe defaults for RPC calls that are not technically safe to retry normally (due to potential data duplication or other side-effects) but become safe to retry if a condition such as if_metageneration_match is set.

      See the retry.py source code and docstrings in this package (google.cloud.storage.retry) for information on retry types and how to configure them.

  • Return type

    google.cloud.storage.bucket.Bucket

  • Returns

    The bucket matching the name provided or None if not found.
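
Because lookup_bucket returns None instead of raising, it combines naturally with create_bucket in a get-or-create helper. The function below is a sketch of that pattern; get_or_create_bucket is a hypothetical name, and client is assumed to be a Client instance.

```python
def get_or_create_bucket(client, bucket_name, location=None):
    """Return the named bucket, creating it first if the lookup finds nothing.

    Hypothetical helper: `client` is expected to behave like
    google.cloud.storage.client.Client.
    """
    bucket = client.lookup_bucket(bucket_name)
    if bucket is None:
        # Not found: fall back to creating it in the requested location.
        bucket = client.create_bucket(bucket_name, location=location)
    return bucket
```

The check and the creation are two separate requests, so another process could create the bucket in between; in that case create_bucket raises google.cloud.exceptions.Conflict, which callers may want to handle.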