Storage Client

Client for interacting with the Google Cloud Storage API.

class google.cloud.storage.client.Client(project=

Bases: google.cloud.client.ClientWithProject

Client to bundle configuration needed for API requests.

  • Parameters

    • project (str or None) – The project on whose behalf the client acts. Will be passed when creating a bucket. If not passed, falls back to the default inferred from the environment.

    • credentials (Credentials) – (Optional) The OAuth2 Credentials to use for this client. If not passed (and if no _http object is passed), falls back to the default inferred from the environment.

    • _http (Session) – (Optional) HTTP object to make requests. Can be any object that defines request() with the same interface as requests.Session.request(). If not passed, an _http object is created that is bound to the credentials for the current object. This parameter should be considered private, and could change in the future.

    • client_info (ClientInfo) – The client info used to send a user-agent string along with API requests. If None, then default info will be used. Generally, you only need to set this if you’re developing your own library or partner tool.

    • client_options (ClientOptions or dict) – (Optional) Client options used to set user options on the client. API Endpoint should be set through client_options.
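
As a minimal sketch of how these constructor arguments fit together, the helper below assembles keyword arguments for Client and routes a custom endpoint through client_options in its dict form. The helper name build_client_kwargs is hypothetical and not part of the library; the "api_endpoint" key is assumed to mirror the ClientOptions.api_endpoint attribute.

```python
def build_client_kwargs(project=None, api_endpoint=None):
    """Assemble keyword arguments for storage.Client (hypothetical helper)."""
    kwargs = {}
    if project is not None:
        # Omitting project lets the client infer it from the environment.
        kwargs["project"] = project
    if api_endpoint is not None:
        # The API endpoint is set through client_options, not a dedicated argument.
        kwargs["client_options"] = {"api_endpoint": api_endpoint}
    return kwargs


# Usage, assuming google-cloud-storage is installed and credentials are available:
#     from google.cloud import storage
#     client = storage.Client(**build_client_kwargs(project="my-project"))
```

Keeping the arguments in a plain dict makes it easy to share one configuration between clients without touching the private _http parameter.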

SCOPE = ('https://www.googleapis.com/auth/devstorage.full_control', 'https://www.googleapis.com/auth/devstorage.read_only', 'https://www.googleapis.com/auth/devstorage.read_write')

The scopes required for authenticating as a Cloud Storage consumer.

batch()

Factory constructor for batch object.

NOTE: This will not make an HTTP request; it simply instantiates a batch object owned by this client.

bucket(bucket_name, user_project=None)

Factory constructor for bucket object.

NOTE: This will not make an HTTP request; it simply instantiates a bucket object owned by this client.

  • Parameters

    • bucket_name (str) – The name of the bucket to be instantiated.

    • user_project (str) – (Optional) The project ID to be billed for API requests made via the bucket.

  • Return type

    google.cloud.storage.bucket.Bucket

  • Returns

    The bucket object created.

classmethod create_anonymous_client()

Factory: return client with anonymous credentials.

NOTE: Such a client has only limited access to “public” buckets: listing their contents and downloading their blobs.

  • Return type

    google.cloud.storage.client.Client

  • Returns

    Instance with anonymous credentials and no project.
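
A short sketch of what an anonymous client is good for: listing the contents of a publicly readable bucket. The function name public_blob_names and the bucket name in the usage comment are illustrative assumptions; only list_blobs and create_anonymous_client come from this reference.

```python
def public_blob_names(client, bucket_name, limit=10):
    """Return up to `limit` blob names from a publicly readable bucket."""
    # list_blobs accepts a bucket name as well as a Bucket object.
    blobs = client.list_blobs(bucket_name, max_results=limit)
    return [blob.name for blob in blobs]


# Usage, assuming google-cloud-storage is installed:
#     from google.cloud import storage
#     client = storage.Client.create_anonymous_client()
#     print(public_blob_names(client, "some-public-bucket"))
```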

create_bucket(bucket_or_name, requester_pays=None, project=None, user_project=None, location=None, predefined_acl=None, predefined_default_object_acl=None, timeout=60)

API call: create a new bucket via a POST request.

See https://cloud.google.com/storage/docs/json_api/v1/buckets/insert

Examples

Create a bucket using a string.

bucket = client.create_bucket("my-bucket")
assert isinstance(bucket, Bucket)
# <Bucket: my-bucket>

Create a bucket using a resource.

>>> from google.cloud import storage
>>> client = storage.Client()
>>> # Set properties on a plain resource object.
>>> bucket = storage.Bucket("my-bucket-name")
>>> bucket.location = "europe-west6"
>>> bucket.storage_class = "COLDLINE"
>>> # Pass that resource object to the client.
>>> bucket = client.create_bucket(bucket)  # API request.

create_hmac_key(service_account_email, project_id=None, user_project=None, timeout=60)

Create an HMAC key for a service account.

  • Parameters

    • service_account_email (str) – e-mail address of the service account

    • project_id (str) – (Optional) Explicit project ID for the key. Defaults to the client’s project.

    • user_project (str) – (Optional) This parameter is currently ignored.

    • timeout (float or tuple) – (Optional) The amount of time, in seconds, to wait for the server response.

      Can also be passed as a tuple (connect_timeout, read_timeout). See requests.Session.request() documentation for details.

  • Return type

    Tuple[HMACKeyMetadata, str]

  • Returns

    Metadata for the created key, plus the bytes of the key’s secret, which is a 40-character base64-encoded string.
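
The returned tuple can be unpacked directly. The sketch below is a hypothetical wrapper (issue_hmac_key is not a library function) and assumes the metadata object exposes an access_id attribute.

```python
def issue_hmac_key(client, service_account_email):
    """Create an HMAC key and return its (access_id, secret) pair."""
    metadata, secret = client.create_hmac_key(service_account_email)
    # Keep the secret somewhere safe: this sketch assumes it is not
    # retrievable again after creation.
    return metadata.access_id, secret


# Usage, assuming `client` is a storage.Client with permission to manage keys:
#     access_id, secret = issue_hmac_key(
#         client, "my-service-account@my-project.iam.gserviceaccount.com")
```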

property current_batch()

Currently-active batch.

download_blob_to_file(blob_or_uri, file_obj, start=None, end=None)

Download the contents of a blob object or blob URI into a file-like object.

  • Parameters

    • blob_or_uri (Union[Blob, str]) – The blob resource to pass or URI to download.

    • file_obj (file) – A file handle to which to write the blob’s data.

    • start (int) – (Optional) The first byte in a range to be downloaded.

    • end (int) – (Optional) The last byte in a range to be downloaded.

Examples

Download a blob using a blob resource.

>>> from google.cloud import storage
>>> client = storage.Client()
>>> bucket = client.get_bucket('my-bucket-name')
>>> blob = storage.Blob('path/to/blob', bucket)
>>> with open('file-to-download-to', 'wb') as file_obj:
...     client.download_blob_to_file(blob, file_obj)  # API request.

Download a blob using a URI.

>>> from google.cloud import storage
>>> client = storage.Client()
>>> with open('file-to-download-to', 'wb') as file_obj:
...     client.download_blob_to_file(
...         'gs://bucket_name/path/to/blob', file_obj)  # API request.
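
The start and end parameters allow partial downloads. As a hedged sketch, the hypothetical helper below reads only the first bytes of a blob into memory; it relies on end naming the last byte of the range, so the range is inclusive.

```python
import io


def read_first_bytes(client, blob_or_uri, num_bytes):
    """Download only the first `num_bytes` bytes of a blob into memory."""
    if num_bytes < 1:
        raise ValueError("num_bytes must be at least 1")
    buffer = io.BytesIO()
    # `end` is the last byte to download, hence num_bytes - 1.
    client.download_blob_to_file(blob_or_uri, buffer, start=0, end=num_bytes - 1)
    return buffer.getvalue()


# Usage, assuming `client` is a storage.Client:
#     header = read_first_bytes(client, "gs://bucket_name/path/to/blob", 16)
```

Any writable binary file-like object works as the target, so io.BytesIO avoids touching the filesystem for small reads.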

generate_signed_post_policy_v4(bucket_name, blob_name, expiration, conditions=None, fields=None, credentials=None, virtual_hosted_style=False, bucket_bound_hostname=None, scheme='http', service_account_email=None, access_token=None)

Generate a V4 signed policy object.

NOTE: Assumes credentials implements the google.auth.credentials.Signing interface. Also assumes credentials has a service_account_email property which identifies the credentials.

Generated policy object allows user to upload objects with a POST request.

  • Parameters

  • Return type

    dict

  • Returns

    Signed POST policy.

Example

Generate signed POST policy and upload a file.

>>> import datetime
>>> import pytz
>>> import requests
>>> from google.cloud import storage
>>> client = storage.Client()
>>> tz = pytz.timezone('America/New_York')
>>> # localize() attaches the zone correctly; pytz zones misbehave when passed as tzinfo=.
>>> expiration = tz.localize(datetime.datetime(2020, 3, 17))
>>> policy = client.generate_signed_post_policy_v4(
...     "bucket-name",
...     "blob-name",
...     expiration=expiration,
...     conditions=[
...         ["content-length-range", 0, 255]
...     ],
...     fields={
...         "x-goog-meta-hello": "world"
...     },
... )
>>> with open("bucket-name", "rb") as f:
...     files = {"file": ("bucket-name", f)}
...     requests.post(policy["url"], data=policy["fields"], files=files)
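
Because the policy is meant for POST uploads, it can also back a browser form. The sketch below is one possible way to render such a form; render_upload_form is a hypothetical helper, and it assumes the policy dict has the "url" and "fields" keys used in the example above. The file input is placed last on the assumption that the service expects it to follow the other fields.

```python
import html


def render_upload_form(policy):
    """Render a signed POST policy as an HTML upload form (illustrative)."""
    action = html.escape(policy["url"], quote=True)
    lines = [
        '<form action="{}" method="POST" enctype="multipart/form-data">'.format(action)
    ]
    for name, value in policy["fields"].items():
        # Every signed field is sent back unchanged as a hidden input.
        lines.append(
            '  <input type="hidden" name="{}" value="{}">'.format(
                html.escape(str(name), quote=True),
                html.escape(str(value), quote=True),
            )
        )
    # The file input comes after all policy fields.
    lines.append('  <input type="file" name="file">')
    lines.append('  <input type="submit" value="Upload">')
    lines.append("</form>")
    return "\n".join(lines)
```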

get_bucket(bucket_or_name, timeout=60, if_metageneration_match=None, if_metageneration_not_match=None)

API call: retrieve a bucket via a GET request.

See https://cloud.google.com/storage/docs/json_api/v1/buckets/get

  • Parameters

    • bucket_or_name (Union[Bucket, str]) – The bucket resource to pass or the name of the bucket to retrieve.

    • timeout (Optional[Union[float, Tuple[float, float]]]) – The amount of time, in seconds, to wait for the server response.

      Can also be passed as a tuple (connect_timeout, read_timeout). See requests.Session.request() documentation for details.

    • if_metageneration_match (Optional[long]) – Make the operation conditional on whether the bucket’s current metageneration matches the given value.

    • if_metageneration_not_match (Optional[long]) – Make the operation conditional on whether the bucket’s current metageneration does not match the given value.

  • Return type

    google.cloud.storage.bucket.Bucket

  • Returns

    The bucket matching the name provided.

  • Raises

    google.cloud.exceptions.NotFound – If the bucket is not found.

Examples

Retrieve a bucket using a string.

try:
    bucket = client.get_bucket("my-bucket")
except google.cloud.exceptions.NotFound:
    print("Sorry, that bucket does not exist!")

Get a bucket using a resource.

>>> from google.cloud import storage
>>> client = storage.Client()
>>> # Fetch the bucket by name first.
>>> bucket = client.get_bucket("my-bucket-name")
>>> # Time passes. Another program may have modified the bucket
... # in the meantime, so you want to get the latest state.
>>> bucket = client.get_bucket(bucket)  # API request.

get_hmac_key_metadata(access_id, project_id=None, user_project=None, timeout=60)

Return a metadata instance for the given HMAC key.

  • Parameters

    • access_id (str) – Unique ID of an existing key.

    • project_id (str) – (Optional) Project ID of an existing key. Defaults to client’s project.

    • timeout (float or tuple) – (Optional) The amount of time, in seconds, to wait for the server response.

      Can also be passed as a tuple (connect_timeout, read_timeout). See requests.Session.request() documentation for details.

    • user_project (str) – (Optional) This parameter is currently ignored.

get_service_account_email(project=None, timeout=60)

Get the email address of the project’s GCS service account.

  • Parameters

    • project (str) – (Optional) Project ID to use for retrieving the GCS service account email address. Defaults to the client’s project.

    • timeout (float or tuple) – (Optional) The amount of time, in seconds, to wait for the server response.

      Can also be passed as a tuple (connect_timeout, read_timeout). See requests.Session.request() documentation for details.

  • Return type

    str

  • Returns

    service account email address

list_blobs(bucket_or_name, max_results=None, page_token=None, prefix=None, delimiter=None, start_offset=None, end_offset=None, include_trailing_delimiter=None, versions=None, projection='noAcl', fields=None, timeout=60)

Return an iterator used to find blobs in the bucket.

If user_project is set on the bucket, bills the API request to that project.

  • Parameters

    • bucket_or_name (Union[Bucket, str]) – The bucket resource to pass or the name of the bucket whose blobs are listed.

    • max_results (int) – (Optional) The maximum number of blobs to return.

    • page_token (str) – (Optional) If present, return the next batch of blobs, using the value, which must correspond to the nextPageToken value returned in the previous response. Deprecated: use the pages property of the returned iterator instead of manually passing the token.

    • prefix (str) – (Optional) Prefix used to filter blobs.

    • delimiter (str) – (Optional) Delimiter, used with prefix to emulate hierarchy.

    • start_offset (str) – (Optional) Filter results to objects whose names are lexicographically equal to or after startOffset. If endOffset is also set, the objects listed will have names between startOffset (inclusive) and endOffset (exclusive).

    • end_offset (str) – (Optional) Filter results to objects whose names are lexicographically before endOffset. If startOffset is also set, the objects listed will have names between startOffset (inclusive) and endOffset (exclusive).

    • include_trailing_delimiter (bool) – (Optional) If true, objects that end in exactly one instance of delimiter will have their metadata included in items in addition to prefixes.

    • versions (bool) – (Optional) Whether object versions should be returned as separate blobs.

    • projection (str) – (Optional) If used, must be 'full' or 'noAcl'. Defaults to 'noAcl'. Specifies the set of properties to return.

    • fields (str) – (Optional) Selector specifying which fields to include in a partial response. Must be a list of fields. For example, to get a partial response with just the next page token and the name and language of each blob returned: 'items(name,contentLanguage),nextPageToken'. See: https://cloud.google.com/storage/docs/json_api/v1/parameters#fields

    • timeout (Optional[Union[float, Tuple[float, float]]]) – The amount of time, in seconds, to wait for the server response.

      Can also be passed as a tuple (connect_timeout, read_timeout). See requests.Session.request() documentation for details.

  • Returns

    Iterator of all Blob objects in this bucket matching the arguments.

Example

List blobs in the bucket with user_project.

>>> from google.cloud import storage
>>> client = storage.Client()
>>> bucket = storage.Bucket("my-bucket-name", user_project='my-project')
>>> all_blobs = list(client.list_blobs(bucket))
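
The prefix and delimiter parameters together emulate a directory listing. The following is a minimal sketch, assuming the returned iterator exposes a prefixes attribute (a set of "subdirectory" prefixes) that is filled in as pages are consumed; list_directory is a hypothetical helper name.

```python
def list_directory(client, bucket_name, directory):
    """Return (blob names, sub-prefixes) directly under `directory`."""
    # Normalize to a trailing slash; an empty directory means the bucket root.
    prefix = directory.rstrip("/") + "/" if directory else ""
    iterator = client.list_blobs(bucket_name, prefix=prefix, delimiter="/")
    # Consume the iterator first: prefixes are collected page by page.
    files = [blob.name for blob in iterator]
    subdirectories = sorted(iterator.prefixes)
    return files, subdirectories


# Usage, assuming `client` is a storage.Client:
#     files, subdirs = list_directory(client, "my-bucket-name", "photos")
```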

list_buckets(max_results=None, page_token=None, prefix=None, projection='noAcl', fields=None, project=None, timeout=60)

Get all buckets in the project associated with the client.

This will not populate the list of blobs available in each bucket.

for bucket in client.list_buckets():
    print(bucket)

This implements “storage.buckets.list”.

  • Parameters

    • max_results (int) – (Optional) The maximum number of buckets to return.

    • page_token (str) – (Optional) If present, return the next batch of buckets, using the value, which must correspond to the nextPageToken value returned in the previous response. Deprecated: use the pages property of the returned iterator instead of manually passing the token.

    • prefix (str) – (Optional) Filter results to buckets whose names begin with this prefix.

    • projection (str) – (Optional) Specifies the set of properties to return. If used, must be ‘full’ or ‘noAcl’. Defaults to ‘noAcl’.

    • fields (str) – (Optional) Selector specifying which fields to include in a partial response. Must be a list of fields. For example, to get a partial response with just the next page token and the ID of each bucket returned: ‘items/id,nextPageToken’

    • project (str) – (Optional) The project whose buckets are to be listed. If not passed, uses the project set on the client.

    • timeout (float or tuple) – (Optional) The amount of time, in seconds, to wait for the server response.

      Can also be passed as a tuple (connect_timeout, read_timeout). See requests.Session.request() documentation for details.

  • Return type

    Iterator

  • Raises

    ValueError – If both project and the client’s project are None.

  • Returns

    Iterator of all Bucket objects belonging to this project.
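
Since passing page_token by hand is deprecated, paging is done through the iterator's pages property. A small sketch, with bucket_names_by_page as a hypothetical helper name:

```python
def bucket_names_by_page(client, prefix=None):
    """Return bucket names grouped by the page in which they were returned."""
    iterator = client.list_buckets(prefix=prefix)
    # Each page is itself iterable and yields Bucket objects.
    return [[bucket.name for bucket in page] for page in iterator.pages]


# Usage, assuming `client` is a storage.Client with a project set:
#     for page in bucket_names_by_page(client, prefix="logs-"):
#         print(page)
```

Iterating the result of list_buckets directly remains the simpler choice when page boundaries do not matter.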

list_hmac_keys(max_results=None, service_account_email=None, show_deleted_keys=None, project_id=None, user_project=None, timeout=60)

List HMAC keys for a project.

  • Parameters

    • max_results (int) – (Optional) Max number of keys to return in a given page.

    • service_account_email (str) – (Optional) Limit keys to those created by the given service account.

    • show_deleted_keys (bool) – (Optional) Include deleted keys in the list. Default is to exclude them.

    • project_id (str) – (Optional) Explicit project ID for the key. Defaults to the client’s project.

    • user_project (str) – (Optional) This parameter is currently ignored.

    • timeout (float or tuple) – (Optional) The amount of time, in seconds, to wait for the server response.

      Can also be passed as a tuple (connect_timeout, read_timeout). See requests.Session.request() documentation for details.

  • Return type

    Iterator

  • Returns

    Iterator of HMACKeyMetadata objects for the keys matching the arguments.
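
As a hedged illustration of filtering the listing, the sketch below collects the access IDs of keys that are still active. It assumes that iterating the result yields HMACKeyMetadata objects with access_id and state attributes and that active keys report the state "ACTIVE"; active_access_ids is a hypothetical helper.

```python
def active_access_ids(client, service_account_email):
    """Return access IDs of active HMAC keys for one service account."""
    keys = client.list_hmac_keys(service_account_email=service_account_email)
    # Assumption: state is one of "ACTIVE", "INACTIVE", or "DELETED".
    return [key.access_id for key in keys if key.state == "ACTIVE"]


# Usage, assuming `client` is a storage.Client:
#     ids = active_access_ids(
#         client, "my-service-account@my-project.iam.gserviceaccount.com")
```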

lookup_bucket(bucket_name, timeout=60, if_metageneration_match=None, if_metageneration_not_match=None)

Get a bucket by name, returning None if not found.

You can use this if you would rather check for a None value than catch an exception:

bucket = client.lookup_bucket("doesnt-exist")
assert not bucket
# None
bucket = client.lookup_bucket("my-bucket")
assert isinstance(bucket, Bucket)
# <Bucket: my-bucket>

  • Parameters

    • bucket_name (str) – The name of the bucket to get.

    • timeout (float or tuple) – (Optional) The amount of time, in seconds, to wait for the server response.

      Can also be passed as a tuple (connect_timeout, read_timeout). See requests.Session.request() documentation for details.

    • if_metageneration_match (long) – (Optional) Make the operation conditional on whether the bucket’s current metageneration matches the given value.

    • if_metageneration_not_match (long) – (Optional) Make the operation conditional on whether the bucket’s current metageneration does not match the given value.

  • Return type

    google.cloud.storage.bucket.Bucket

  • Returns

    The bucket matching the name provided or None if not found.
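
Returning None makes a get-or-create pattern straightforward. The sketch below combines lookup_bucket with create_bucket; ensure_bucket is a hypothetical helper, and the two calls are not atomic, so a concurrent creator could still cause create_bucket to fail.

```python
def ensure_bucket(client, bucket_name, location=None):
    """Return the named bucket, creating it if it does not exist yet."""
    bucket = client.lookup_bucket(bucket_name)
    if bucket is None:
        # Not atomic: another process may create the bucket between the
        # lookup and this call, in which case create_bucket raises.
        bucket = client.create_bucket(bucket_name, location=location)
    return bucket


# Usage, assuming `client` is a storage.Client:
#     bucket = ensure_bucket(client, "my-bucket", location="europe-west6")
```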