Getting list of file names in bucket in Google Cloud Storage using Python

Last updated: Aug 12, 2023

Tags: Cloud Computing

Prerequisites

To follow along with this guide, please make sure to have:

  • created a GCP (Google Cloud Platform) project

  • created a service account and downloaded the private key (JSON file) for authentication

  • installed the Python client library for Google Cloud Storage (GCS):

    pip install --upgrade google-cloud-storage

If you haven't, then please check out my detailed guide first!

Getting list of file names in Google Cloud Storage bucket

Suppose we have the following two files on Google Cloud Storage (GCS):

├─ cat.png
├─ uploaded_sample.txt

To get the list of file names in a certain bucket, use the list_blobs(~) method, which returns an iterator over the blobs (files):

from google.cloud import storage

# Authenticate ourselves using the service account private key
path_to_private_key = './gcs-project-354207-099ef6796af6.json'
client = storage.Client.from_service_account_json(json_credentials_path=path_to_private_key)

blobs = client.list_blobs('example-bucket-skytowner')
for blob in blobs:
    print(blob.name)

This prints:

uploaded_sample.txt
cat.png

Here, note the following:

  • our private key (JSON file) resides in the same directory as this Python script.

  • the name of our bucket is example-bucket-skytowner.

  • each blob has a name property that holds the file name.
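Since list_blobs(~) yields blob objects rather than plain strings, it is often convenient to collect the names into a Python list. Here is a minimal sketch; the FakeBlob class is a hypothetical stand-in that mimics only the name property of a real blob, so that the logic runs without a GCS connection:

```python
class FakeBlob:
    # Stand-in for a google.cloud.storage blob; only the name
    # property is simulated, which is all this helper relies on.
    def __init__(self, name):
        self.name = name

def get_file_names(blobs):
    # Works on any iterable of blobs, e.g. client.list_blobs(...)
    return [blob.name for blob in blobs]

blobs = [FakeBlob('uploaded_sample.txt'), FakeBlob('cat.png')]
print(get_file_names(blobs))  # → ['uploaded_sample.txt', 'cat.png']
```

With a real client, you would pass client.list_blobs('example-bucket-skytowner') to the same helper.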

Getting list of file names under a specific folder

Suppose we have a folder called my_folder that contains the following two files in GCS:

📁 my_folder
├─ cat.png
├─ uploaded_sample.txt

To fetch the list of file names under my_folder, use the list_blobs(~) method with the prefix argument:

from google.cloud import storage

# Authenticate using the service account private key
path_to_private_key = './gcs-project-354207-099ef6796af6.json'
client = storage.Client.from_service_account_json(json_credentials_path=path_to_private_key)
bucket = storage.Bucket(client, 'example-bucket-skytowner')

str_folder_name_on_gcs = 'my_folder/'
blobs = bucket.list_blobs(prefix=str_folder_name_on_gcs)
for blob in blobs:
    print(blob.name)

This prints:

my_folder/
my_folder/cat.png
my_folder/uploaded_sample.txt

Notice how the first blob represents a folder, while the latter two are the files. By setting the prefix argument to be 'my_folder/', we are fetching all the blobs that begin with 'my_folder/', which includes the directory blob. Since directories are characterized by an ending '/', we can filter them out as we iterate like so:

# Fetch the blobs again, since the previous iterator is now exhausted
blobs = bucket.list_blobs(prefix=str_folder_name_on_gcs)
for blob in blobs:
    if not blob.name.endswith('/'):
        print(blob.name)

This prints:

my_folder/cat.png
my_folder/uploaded_sample.txt
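The trailing-'/' check can also be factored into a small helper and verified without touching GCS. A sketch, using a plain list of strings that mirrors the blob names returned above:

```python
def filter_out_folders(names):
    # Directory placeholder blobs end with '/', so drop them
    return [name for name in names if not name.endswith('/')]

# Names as returned by list_blobs(prefix='my_folder/') above
names = ['my_folder/', 'my_folder/cat.png', 'my_folder/uploaded_sample.txt']
print(filter_out_folders(names))
# → ['my_folder/cat.png', 'my_folder/uploaded_sample.txt']
```

With a real bucket, you would call filter_out_folders([blob.name for blob in blobs]) instead.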
Published by Isshin Inada