Getting list of file names in bucket in Google Cloud Storage using Python

Last updated: Jul 1, 2022
Tags: Cloud Computing

Prerequisites

To follow along with this guide, please make sure to have:

  • created a GCP (Google Cloud Platform) project

  • created a service account and downloaded the private key (JSON file) for authentication

  • installed the Python client library for Google Cloud Storage (GCS):

    pip install --upgrade google-cloud-storage

If you haven't, then please check out my detailed guide first!
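Before making any API calls, it can help to sanity-check the downloaded key file. Below is a minimal sketch using only the standard library; the helper function name is my own, and the fields checked are those present in every service-account JSON key:

```python
import json

# Fields that a service-account key JSON always contains (a subset).
REQUIRED_FIELDS = {'type', 'project_id', 'private_key', 'client_email'}

def looks_like_service_account_key(json_text):
    """Return True if the JSON has the fields a service-account key needs."""
    try:
        data = json.loads(json_text)
    except ValueError:
        return False
    return data.get('type') == 'service_account' and REQUIRED_FIELDS <= data.keys()

# Example with a dummy (non-secret) payload:
dummy = json.dumps({'type': 'service_account', 'project_id': 'p',
                    'private_key': 'k', 'client_email': 'e'})
print(looks_like_service_account_key(dummy))   # True
```

In practice you would read the file contents with `open(path_to_private_key).read()` and pass that in; a `False` result usually means the wrong file was downloaded.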

Getting the list of file names in a Google Cloud Storage bucket

Suppose we have the following two files on Google Cloud Storage (GCS):

├─ cat.png
├─ uploaded_sample.txt

To get the list of file names in a certain bucket, use the list_blobs(~) method, which returns an iterator of blobs (files):

from google.cloud import storage

# Authenticate ourselves using the service account private key
path_to_private_key = './gcs-project-354207-099ef6796af6.json'
client = storage.Client.from_service_account_json(json_credentials_path=path_to_private_key)

blobs = client.list_blobs('example-bucket-skytowner')
for blob in blobs:
    print(blob.name)

Output:

uploaded_sample.txt
cat.png

Here, note the following:

  • our private key (JSON file) resides in the same directory as this Python script.

  • the name of our bucket is example-bucket-skytowner.

  • each blob has a name property that holds the file name.
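One thing to keep in mind is that list_blobs(~) returns a lazy iterator rather than a list, so its results can be consumed only once. When the names are needed more than once, they can be materialized into a plain list first. A minimal sketch, using a lightweight stand-in object (FakeBlob, my own illustration) instead of a live GCS call:

```python
# Stand-in for a GCS blob: only the name attribute matters here.
class FakeBlob:
    def __init__(self, name):
        self.name = name

def blob_names(blobs):
    """Materialize the blob names from an iterator into a reusable list."""
    return [blob.name for blob in blobs]

names = blob_names([FakeBlob('uploaded_sample.txt'), FakeBlob('cat.png')])
print(names)   # ['uploaded_sample.txt', 'cat.png']
```

With a real client you would pass `client.list_blobs('example-bucket-skytowner')` to the helper and then iterate over the resulting list as many times as needed.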

Getting the list of file names under a specific folder

Suppose we have a folder called my_folder that contains the following two files in GCS:

📁 my_folder
├─ cat.png
├─ uploaded_sample.txt

To fetch the list of file names under my_folder, use the list_blobs(~) method with the prefix argument:

path_to_private_key = './gcs-project-354207-099ef6796af6.json'
client = storage.Client.from_service_account_json(json_credentials_path=path_to_private_key)
bucket = storage.Bucket(client, 'example-bucket-skytowner')

str_folder_name_on_gcs = 'my_folder/'
blobs = bucket.list_blobs(prefix=str_folder_name_on_gcs)
for blob in blobs:
    print(blob.name)

Output:

my_folder/
my_folder/cat.png
my_folder/uploaded_sample.txt

Notice how the first blob represents the folder itself, while the latter two are the files inside it. By setting the prefix argument to 'my_folder/', we fetch all blobs whose names begin with 'my_folder/', which includes the directory blob. Since directory blobs are characterized by a trailing '/', we can filter them out as we iterate like so:

# Fetch the blobs again - the iterator from above has already been consumed
blobs = bucket.list_blobs(prefix=str_folder_name_on_gcs)
for blob in blobs:
    if not blob.name.endswith('/'):
        print(blob.name)

Output:

my_folder/cat.png
my_folder/uploaded_sample.txt
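The folder prefix itself can also be stripped off to recover the bare file names. A small helper (pure Python, no GCS call; the function name is my own) that drops both the directory placeholder and the prefix in one pass:

```python
def bare_file_names(blob_names, prefix):
    """Return file names under prefix, without the prefix or folder blobs."""
    return [name[len(prefix):] for name in blob_names
            if name.startswith(prefix) and not name.endswith('/')]

# The blob names as returned by list_blobs(prefix='my_folder/') above:
names = ['my_folder/', 'my_folder/cat.png', 'my_folder/uploaded_sample.txt']
print(bare_file_names(names, 'my_folder/'))   # ['cat.png', 'uploaded_sample.txt']
```

This is handy when the downstream code only cares about the file names, not their full paths within the bucket.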
Published by Isshin Inada