Set Up Logging for Google Cloud Storage Buckets
In most cases, Cloud Audit Logs is the recommended method for generating logs that track API operations performed in Cloud Storage (see the sketch after the docs link below). However, Cloud Audit Logs does not track access to public objects, so to track access to a publicly accessible bucket we need to enable usage logging on the bucket.
Docs - https://cloud.google.com/storage/docs/access-logs
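If audit logs fit your case instead, a minimal sketch of enabling Data Access audit logs for Cloud Storage (PROJECT_ID and policy.yaml are placeholders; the audit config lives in the project IAM policy) -
gcloud projects get-iam-policy PROJECT_ID --format=yaml > policy.yaml
# Edit policy.yaml and add before re-applying:
#   auditConfigs:
#   - service: storage.googleapis.com
#     auditLogConfigs:
#     - logType: DATA_READ
#     - logType: DATA_WRITE
gcloud projects set-iam-policy PROJECT_ID policy.yaml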
Create logging bucket (should be in the same org/project as the buckets being tracked) -
gcloud storage buckets create gs://LOGS_BUCKET_NAME --location=LOGS_BUCKET_LOCATION
Give the Cloud Storage analytics group permission to write to the bucket
gcloud storage buckets add-iam-policy-binding gs://LOGS_BUCKET_NAME --member=group:[email protected] --role=roles/storage.objectCreator
Enable logging
gcloud storage buckets update gs://TARGET_BUCKET_NAME --log-bucket=gs://LOGS_BUCKET_NAME --log-object-prefix=PREFIX
The prefix defaults to the name of the bucket being logged.
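Worked example with hypothetical names (example-bucket is the public bucket being tracked, example-logs-bucket is the log sink) -
gcloud storage buckets create gs://example-logs-bucket --location=us-east1
gcloud storage buckets add-iam-policy-binding gs://example-logs-bucket --member=group:[email protected] --role=roles/storage.objectCreator
gcloud storage buckets update gs://example-bucket --log-bucket=gs://example-logs-bucket --log-object-prefix=example-bucket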
Check logging status (describe the bucket being logged, not the log bucket)
gcloud storage buckets describe gs://TARGET_BUCKET_NAME --format="default(logging_config)"
It will take some time (> 1 hr) for the logs to start appearing in the bucket.
Log format - CSV
File names - OBJECT_PREFIX_usage_TIMESTAMP_ID_v0 (usage logs), OBJECT_PREFIX_storage_TIMESTAMP_ID_v0 (storage logs)
"time_micros","c_ip","c_ip_type","c_ip_region","cs_method","cs_uri","sc_status","cs_bytes","sc_bytes","time_taken_micros","cs_host","cs_referer","cs_user_agent","s_request_id","cs_operation","cs_bucket","cs_object"
Download logs
gcloud storage rsync gs://LOGS_BUCKET_NAME LOCAL_DIR
# rsync works on whole buckets/prefix "folders"; to grab just one month of usage logs, use cp with a wildcard instead, e.g. gcloud storage cp 'gs://logs-bucket/example-bucket_usage_2022_06*' logs
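A quick sketch to eyeball traffic from the downloaded logs (assumes the local directory logs from above; c_ip is the 2nd CSV field, and FNR > 1 skips each file's header) -
awk -F'","' 'FNR > 1 {print $2}' logs/*_usage_* | sort | uniq -c | sort -rn | head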
TODO - Explore logs in the Console/BigQuery - https://cloud.google.com/storage/docs/access-logs#BigQuery
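A sketch for the TODO, assuming a BigQuery dataset named storage_logs and the usage log schema file from the docs page saved locally as cloud_storage_usage_schema_v0.json -
bq mk storage_logs
bq load --skip_leading_rows=1 storage_logs.usage 'gs://LOGS_BUCKET_NAME/PREFIX_usage_*' ./cloud_storage_usage_schema_v0.json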
Setting cache control for all files
Set Cache-Control to reduce the number of requests to the bucket. Change "immutable" to something else (or lower max-age) if the files might change.
gcloud storage objects update --recursive --cache-control="public, max-age=31536000, immutable" gs://BUCKET_NAME
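To spot-check the metadata afterwards on a single object (index.html is a hypothetical object name, and cache_control is assumed to follow the same snake_case format keys as logging_config above) -
gcloud storage objects describe gs://BUCKET_NAME/index.html --format="default(cache_control)"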