bobby_dreamer

Google Cloud Storage

notes, GCP, GCS

Last updated : 10/April/2020

I am a Google fan. I have been using Gmail for nearly 15 years, and Google has been providing 15 GB of free space in Google Drive for more than 8 years. With freebies like this, I am going to store my personal data in Google storage as well, and that's how everything started.

All the information below is copied from Google's official documentation and reproduced here for my quick reference. I haven't copied everything, only the commands and text that I will probably use.

Google Cloud Storage official links

Storage class
The first thing to understand is that everything stored in Cloud Storage is an object. The second is the storage class. Classes are categorized by how often you access the objects. By default, everything is stored in the STANDARD class.

| Class | Minimum storage duration | Description |
| --- | --- | --- |
| STANDARD | None | Default. Best for data that is frequently accessed. |
| NEARLINE | 30 days | Nearline Storage is ideal for data you plan to read or modify on average once per month or less. |
| COLDLINE | 90 days | Coldline Storage is ideal for data you plan to read or modify at most once a quarter. |
| ARCHIVE | 365 days | Archive Storage is the best choice for data that you plan to access less than once a year. |
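
As a rough rule of thumb, the table above can be turned into a small lookup. This is only a sketch: the function name and the exact day thresholds are my own assumptions (they reuse the minimum storage durations), not something Google prescribes.

```python
def suggest_storage_class(days_between_accesses: float) -> str:
    """Map an expected access interval (in days) to a storage class name.

    Thresholds are assumptions taken from the minimum storage durations
    in the table: 30, 90 and 365 days.
    """
    if days_between_accesses >= 365:
        return "ARCHIVE"
    if days_between_accesses >= 90:
        return "COLDLINE"
    if days_between_accesses >= 30:
        return "NEARLINE"
    return "STANDARD"


for days in (1, 45, 120, 400):
    print(days, suggest_storage_class(days))
```

Retrieval and early deletion charges also matter when choosing a class, so treat the result as a starting point only.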

Additional classes

  • Multi-Regional Storage: Equivalent to Standard Storage, except Multi-Regional Storage can only be used for objects stored in multi-regions or dual-regions.

  • Regional Storage: Equivalent to Standard Storage, except Regional Storage can only be used for objects stored in regions.

Overview of access control

  • Uniform (recommended): Uses Cloud IAM and applies permissions to all the objects contained inside the bucket or groups of objects with common name prefixes. Cloud IAM also allows you to use features that are not available when working with ACLs, such as Cloud IAM Conditions and Cloud Audit Logs.

  • Fine-grained: The fine-grained option enables you to use Cloud IAM and Access Control Lists (ACLs) together to manage permissions. ACLs are a legacy access control system for Cloud Storage designed for interoperability with Amazon S3. You can specify access and apply permissions at both the bucket level and per individual object.

Caution: If you use Cloud IAM and ACLs on the same resource, Cloud Storage grants the broader permission set on the resource. For example, if your Cloud IAM permissions only allow a few users to access my-object, but your ACLs make my-object public, then my-object is exposed to the public. In general, Cloud IAM cannot detect permissions granted by ACLs, and ACLs cannot detect permissions granted by Cloud IAM.
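
The caution above boils down to a set union: whoever is allowed by either system gets access. Here is a tiny model of that idea. It is a simplification for illustration only; the function names are hypothetical and this is not how Cloud Storage evaluates permissions internally.

```python
def effective_readers(iam_readers: set, acl_readers: set) -> set:
    """Anyone granted read access by IAM or by an ACL can read the object."""
    return iam_readers | acl_readers


def is_public(readers: set) -> bool:
    """An object is public once the special member allUsers can read it."""
    return "allUsers" in readers


# IAM is restrictive, but the ACL makes the object public.
iam = {"user:jane@gmail.com"}
acl = {"allUsers"}
print(is_public(effective_readers(iam, acl)))
```

Locking down IAM alone is not enough on a fine-grained bucket; the ACL side has to be checked as well.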

Uniform bucket-level access
Allows you to uniformly control access to your Cloud Storage resources. When enabled on a bucket, only bucket-level Cloud Identity and Access Management (Cloud IAM) permissions grant access to that bucket and the objects it contains; Access Control Lists (ACLs) are disabled and access granted by ACLs is revoked.

After you enable uniform bucket-level access, you have 90 days to reverse the decision; after that, it cannot be disabled.

Certain Google Cloud services that export to Cloud Storage cannot export to buckets that have uniform bucket-level access enabled. These services include: Cloud Logging, Cloud Audit Logs, and Datastore.

Object ACL permission & Cloud IAM role

  • READER - Storage Legacy Object Reader (roles/storage.legacyObjectReader)
  • OWNER - Storage Legacy Object Owner (roles/storage.legacyObjectOwner)

Creating Storage Buckets

Listing Buckets & Objects

Bucket size

Displays the amount of space (in bytes) used by the objects. For a large bucket, this can take a long time.

Changing object storage classes

Object metadata

Copying objects

Parallel Downloads
Suppose there are folders like this in the cloud:

You can run commands like these to download in parallel (they can run on multiple machines, and dir can be a shared directory):
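
One way to picture the split is to deal the folders out to machines and build one gsutil command per machine. This is a sketch under assumptions: the bucket and folder names are hypothetical, and the helper functions are mine. The flags are standard gsutil ones: -m runs transfers in parallel, -n skips files that already exist at the destination, and -r recurses into folders.

```python
def split_prefixes(prefixes: list, machines: int) -> list:
    """Deal the prefixes round-robin into one group per machine."""
    return [prefixes[i::machines] for i in range(machines)]


def download_command(prefixes: list, dest_dir: str) -> list:
    """Build the argument list for one machine's share of the download."""
    return ["gsutil", "-m", "cp", "-n", "-r", *prefixes, dest_dir]


# Hypothetical folder names, for illustration only.
folders = [f"gs://my-bucket/data/result_set_0{i}" for i in range(4)]
for group in split_prefixes(folders, machines=2):
    print(" ".join(download_command(group, "dir")))
```

Because of -n, rerunning a command after a failure only fetches what is still missing.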

Delete objects

Moving & Renaming object
The gsutil mv command moves objects and can be used for renaming them as well. It does not perform a single atomic operation; rather, it performs a copy from source to destination followed by removing the source for each object.
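
To see why the lack of atomicity matters, here is a toy model of a move as two separate steps, using a plain dict as a stand-in for a bucket. It is purely illustrative and does not call Cloud Storage; the function name is hypothetical.

```python
def move_object(bucket: dict, src: str, dst: str) -> None:
    """Mimic a move as copy-then-delete on an in-memory 'bucket'."""
    if src == dst:
        return  # nothing to do; deleting here would lose the object
    bucket[dst] = bucket[src]  # step 1: copy source to destination
    # If the process died at this point, both src and dst would exist.
    del bucket[src]            # step 2: remove the source


bucket = {"old-name.txt": b"hello"}
move_object(bucket, "old-name.txt", "new-name.txt")
print(sorted(bucket))
```

An interrupted move can therefore leave a duplicate behind, so it is worth listing the bucket after large moves.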

Synchronize local changes with the bucket

Changing the default storage class of a bucket
When you upload an object to the bucket, if you don't specify a storage class for the object, the object is assigned the bucket's default storage class.
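
The fallback rule is simple enough to express in a couple of lines. A minimal sketch, with a hypothetical helper name:

```python
from typing import Optional


def resolve_storage_class(requested: Optional[str], bucket_default: str) -> str:
    """Use the class given at upload time, else the bucket's default."""
    return requested if requested is not None else bucket_default


print(resolve_storage_class(None, "NEARLINE"))
print(resolve_storage_class("COLDLINE", "NEARLINE"))
```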

Making individual objects publicly readable

Note: If uniform bucket-level access is enabled on your bucket, you cannot use the command below, because it relies on ACLs.

Making all objects in a bucket publicly readable

Remove public access

Accessing public data
In public datasets, you can usually list files and copy specific files to your local machine (e.g. the Google public bucket gcp-public-data-landsat).

Using Cloud IAM permissions

Where:

  • [MEMBER_TYPE] is the type of member to which you are granting bucket access. For example, user. Member types :
    • Google accounts and Google groups represent two general types, while allAuthenticatedUsers and allUsers are two specialized types.
    • Cloud IAM supports the following member types, which can be applied specifically to your Cloud Storage bucket Cloud IAM policies:
      • projectOwner:[PROJECT_ID]
      • projectEditor:[PROJECT_ID]
      • projectViewer:[PROJECT_ID]
  • [MEMBER_NAME] is the name of the member to which you are granting bucket access. For example, jane@gmail.com.
  • [IAM_ROLE] is the IAM role you are granting to the member. For example, roles/storage.objectCreator.
  • [BUCKET_NAME] is the name of the bucket you are granting the member access to. For example, my-bucket.
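
The placeholders above can be assembled like this. It is a sketch that assumes the `gsutil iam ch` binding format of `[MEMBER_TYPE]:[MEMBER_NAME]:[IAM_ROLE]` followed by the bucket URL, as I remember it from the gsutil documentation; the helper name is hypothetical and the command is only built here, not executed.

```python
def iam_grant_command(member_type: str, member_name: str,
                      iam_role: str, bucket_name: str) -> list:
    """Build the argument list for granting a role on a bucket."""
    binding = f"{member_type}:{member_name}:{iam_role}"
    return ["gsutil", "iam", "ch", binding, f"gs://{bucket_name}"]


cmd = iam_grant_command("user", "jane@gmail.com",
                        "roles/storage.objectCreator", "my-bucket")
print(" ".join(cmd))
```

Passing the list to `subprocess.run` would execute it, provided gsutil is installed and authenticated.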

Important: It typically takes about a minute for revoking access to take effect. In some cases it may take longer. If you remove a user's access, this change is immediately reflected in the metadata; however, the user may still have access to the object for a short period of time.

Storage Pricing

Storage location wise cost

Cloud Storage Free limits

| Resource | Monthly Free Usage Limits |
| --- | --- |
| Standard Storage | 5 GB-months |
| Class A Operations | 5,000 |
| Class B Operations | 50,000 |
| Network Egress | 1 GB from North America to each GCP egress destination (Australia and China excluded) |

Cloud Storage Always Free quotas apply to usage in US-WEST1, US-CENTRAL1, and US-EAST1 regions.
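
To check whether a month stays within the free limits, subtract them from the usage. A minimal sketch: the dictionary keys and function name are my own, the usage is assumed to be in one of the eligible regions, and egress is left out because its limit applies per destination.

```python
# Monthly free limits from the table above.
FREE_LIMITS = {
    "standard_storage_gb_months": 5,
    "class_a_ops": 5000,
    "class_b_ops": 50000,
}


def billable_usage(usage: dict) -> dict:
    """Return the part of each usage figure that exceeds the free limit."""
    return {key: max(0, usage.get(key, 0) - limit)
            for key, limit in FREE_LIMITS.items()}


print(billable_usage({"standard_storage_gb_months": 12,
                      "class_a_ops": 3000,
                      "class_b_ops": 80000}))
```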

General network usage

  • Ingress : Free
  • Egress
| Monthly Usage | $ Per GB | Egress to destinations |
| --- | --- | --- |
| 1-10TB | $0.11 | Egress to Worldwide Destinations (excluding Asia & Australia), Egress to Asia Destinations (excluding China, but including Hong Kong) |
| 1-10TB | $0.18 | Egress to Australia Destinations |
| 1-10TB | $0.22 | Egress to China Destinations (excluding Hong Kong) |
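
A quick way to estimate egress cost from the rates above. This sketch only covers the portion of traffic that falls in the 1-10TB tier, since that is the only tier listed here; the destination keys and function name are my own labels.

```python
from decimal import Decimal

# Per-GB rates for the 1-10TB tier, copied from the table above.
EGRESS_RATE_PER_GB = {
    "worldwide": Decimal("0.11"),
    "australia": Decimal("0.18"),
    "china": Decimal("0.22"),
}


def egress_cost(destination: str, gigabytes_in_tier: int) -> Decimal:
    """Cost in USD for traffic billed at the 1-10TB tier rate."""
    return EGRESS_RATE_PER_GB[destination] * Decimal(gigabytes_in_tier)


print(egress_cost("australia", 2000))
```

Decimal is used instead of float so that the cents do not pick up rounding noise.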

Apart from the above, you are charged for the operations you perform. An operation is an action that makes changes to or retrieves information about buckets and objects in Cloud Storage. Operations are divided into three categories: Class A, Class B, and free. Billing rates are per 10,000 operations.

| Operation class | Standard | Nearline | Coldline | Archive |
| --- | --- | --- | --- | --- |
| Class A | $0.05 | $0.10 | $0.10 | $0.50 |
| Class B | $0.004 | $0.01 | $0.05 | $0.50 |
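
Since rates are quoted per 10,000 operations, the cost of a batch is a simple proportion. A minimal sketch, with a hypothetical function name; the rate is passed in so any cell of the table can be used.

```python
from decimal import Decimal


def operations_cost(operation_count: int, rate_per_10k: Decimal) -> Decimal:
    """Cost in USD for a number of operations at a per-10,000 rate."""
    return Decimal(operation_count) / Decimal(10000) * rate_per_10k


# 25,000 Class A operations on Standard storage at $0.05 per 10,000.
print(operations_cost(25000, Decimal("0.05")))
```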

Operations that fall into each class

  • Class A

  • Class B

  • Free

Retrieval and early deletion
There are charges for retrieving data or metadata from Nearline Storage, Coldline Storage, and Archive Storage as they are intended for storing infrequently accessed data.
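
For early deletion, my understanding is that an object removed before its class's minimum storage duration is billed as if it had been stored for the full duration. The sketch below computes the remaining days under that assumption, using the minimum durations of 30, 90 and 365 days; the function name is hypothetical.

```python
# Minimum storage durations in days per storage class.
MIN_STORAGE_DAYS = {"STANDARD": 0, "NEARLINE": 30, "COLDLINE": 90, "ARCHIVE": 365}


def early_deletion_days(storage_class: str, days_stored: int) -> int:
    """Days still billed if the object is deleted after days_stored days."""
    return max(0, MIN_STORAGE_DAYS[storage_class] - days_stored)


print(early_deletion_days("COLDLINE", 60))
```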

# My take

  1. Keep it simple
  2. Don't give too much public access
  3. Keep a watch on the pricing
  4. Before starting, plan some standards for buckets: naming conventions, file types, content types, etc.
  5. You don't need to dump all the data into a single project; if needed, you can separate it across projects.
