created a GCP (Google Cloud Platform) project
created a service account and downloaded the private key (JSON file) for authentication
installed the Python client library for Google Cloud Storage (GCS):
pip install --upgrade google-cloud-storage

If you haven't, then please check out my detailed guide first!

Downloading a single file from Google Cloud Storage using Python

Suppose we have a text file called uploaded_sample.txt that lives in the bucket example-bucket-skytowner on Google Cloud Storage (GCS).

To download this file from GCS, use the download_to_filename(~) method:

from google.cloud import storage

path_to_private_key = './gcs-project-354207-099ef6796af6.json'
client = storage.Client.from_service_account_json(json_credentials_path=path_to_private_key)
bucket = storage.Bucket(client, 'example-bucket-skytowner')
blob = bucket.blob('uploaded_sample.txt')
blob.download_to_filename('./downloaded_file')

Note the following:

the credential JSON file for the service account resides in the same directory as this Python script
example-bucket-skytowner is the name of the bucket in which the file resides
uploaded_sample.txt is the name of the file on GCS that you wish to download
the download_to_filename(~) method takes as its argument the local path where the file should be saved.

After running this code, we should see a file called downloaded_file in the same directory as this Python script.

Referencing blob and bucket name

We can reference the names of our file and bucket using the name property:

bucket = storage.Bucket(client, 'example-bucket-skytowner')
blob = bucket.blob('uploaded_sample.txt')
print(f'Bucket name: {bucket.name}')
print(f'Blob name: {blob.name}')
blob.download_to_filename(f'{bucket.name}_{blob.name}')

Bucket name: example-bucket-skytowner
Blob name: uploaded_sample.txt

The name property is oftentimes quite handy when organizing where the files should be locally downloaded to. We will see examples of this later in this guide.

Downloading to a directory using relative path

The download_to_filename(~) method raises an error if the directory part of the local path does not exist. For instance, suppose we want to download a file into a local downloads directory that does not exist yet:

blob.download_to_filename(f'./downloads/{blob.name}')

FileNotFoundError: [Errno 2] No such file or directory: './downloads/uploaded_sample.txt'

To get around this, create the folders with the mkdir(~) method of Path from the built-in pathlib module before calling the download_to_filename(~) method:

from pathlib import Path
path_folder = f'./downloads/{bucket.name}'
# Create this folder locally if it does not exist
# parents=True will create intermediate directories if they do not exist
Path(path_folder).mkdir(parents=True, exist_ok=True)
blob = bucket.blob('uploaded_sample.txt')
blob.download_to_filename(f'{path_folder}/{blob.name}')

When we run this code, the directory downloads/example-bucket-skytowner is created if it does not exist yet, and the file is downloaded into this directory. The final local path of the downloaded file is therefore:

./downloads/example-bucket-skytowner/uploaded_sample.txt
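The same idea can be wrapped in a small helper so that any destination path is safe to pass to download_to_filename(~). This is only a minimal sketch: ensure_parent_dir is a hypothetical name of our own and is not part of the GCS client library.

```python
from pathlib import Path


def ensure_parent_dir(destination: str) -> Path:
    """Create the parent directory of `destination` if it is missing.

    Returns the destination as a Path. The file itself is not created.
    """
    path = Path(destination)
    # parents=True creates intermediate directories,
    # exist_ok=True makes the call a no-op if the directory is already there
    path.parent.mkdir(parents=True, exist_ok=True)
    return path


# Hypothetical usage with the blob and bucket from the examples above:
# destination = ensure_parent_dir(f'./downloads/{bucket.name}/{blob.name}')
# blob.download_to_filename(str(destination))
```

Because only the parent directory is created, the helper can be called right before every download without side effects on files that already exist.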

Handling error in case of file not found

Trying to download a file that does not exist on GCS raises a 404 NotFound error:

blob = bucket.blob('.some_non_existing_file')
blob.download_to_filename('./downloaded_file')

NotFound: 404 GET https://storage.googleapis.com/download/storage/v1/b/example-bucket-skytowner/o/.some_non_existing_file?alt=media:
No such object: example-bucket-skytowner/.some_non_existing_file:
('Request failed with status code', 404, 'Expected one of', <HTTPStatus.OK: 200>, <HTTPStatus.PARTIAL_CONTENT: 206>)

To account for this case, we can wrap our methods in a try-except clause:

from google.cloud.exceptions import NotFound

try:
    blob = bucket.blob('.some_non_existing_file')
    blob.download_to_filename('./downloaded_file')
except NotFound:
    print(f'🚨 {blob.name} does not exist - do something')
    # Handle this case

🚨 .some_non_existing_file does not exist - do something

Note the following:

we had to import the NotFound error from google.cloud.exceptions.
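An alternative to catching the exception is to check whether the object exists before downloading it. The sketch below relies on the exists() method of a blob; download_if_exists is a hypothetical helper name of our own, and the blob argument is assumed to behave like a google.cloud.storage.Blob.

```python
def download_if_exists(blob, destination: str) -> bool:
    """Download `blob` to `destination` only if the object exists on GCS.

    `blob` is assumed to provide exists() and download_to_filename(),
    as google.cloud.storage.Blob does. Returns True if a download happened.
    """
    if not blob.exists():
        # Nothing to download - let the caller decide what to do
        return False
    blob.download_to_filename(destination)
    return True


# Hypothetical usage with the bucket from the examples above:
# blob = bucket.blob('.some_non_existing_file')
# if not download_if_exists(blob, './downloaded_file'):
#     print(f'{blob.name} does not exist - do something')
```

Keep in mind that this costs one extra request per file, and the object could still be deleted between the check and the download, so the try-except approach remains the more robust of the two.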

Downloading multiple files from Google Cloud Storage

The download_to_filename(~) method downloads a single file per call. Therefore, to download multiple files from GCS, we must call it once for each file.

The following code block extends the case of downloading a single file:

from google.cloud import storage

path_to_private_key = './gcs-project-354207-099ef6796af6.json'
client = storage.Client.from_service_account_json(json_credentials_path=path_to_private_key)
bucket = storage.Bucket(client, 'example-bucket-skytowner')

list_files_to_download = ['uploaded_sample.txt', 'cat.png']
for file_to_download in list_files_to_download:
    blob = bucket.blob(file_to_download)
    blob.download_to_filename(f'./{blob.name}')

After running this code, we should see the files uploaded_sample.txt and cat.png downloaded in the same directory as this Python script.

Downloading a folder from Google Cloud Storage

Suppose we have the following two files under a folder called my_folder on GCS:

📁 my_folder
 ├─ cat.png
 ├─ uploaded_sample.txt

To download all files inside the folder my_folder:

from pathlib import Path
from google.cloud import storage

path_to_private_key = './gcs-project-354207-099ef6796af6.json'
client = storage.Client.from_service_account_json(json_credentials_path=path_to_private_key)
bucket = storage.Bucket(client, 'example-bucket-skytowner')

str_folder_name_on_gcs = 'my_folder/'

# Create the directory locally
Path(str_folder_name_on_gcs).mkdir(parents=True, exist_ok=True)

blobs = bucket.list_blobs(prefix=str_folder_name_on_gcs)
for blob in blobs:
    if not blob.name.endswith('/'):
        # This blob is not a directory!
        print(f'Downloading file [{blob.name}]')
        blob.download_to_filename(f'./{blob.name}')

Downloading file [my_folder/cat.png]
Downloading file [my_folder/uploaded_sample.txt]

After running this code, we should see a new my_folder folder containing the two files in our current directory:

├─ script.py
📁 my_folder
  ├─ cat.png
  ├─ uploaded_sample.txt

Now, let's explain how our code works:

the list_blobs(~) method takes an argument called prefix, which allows us to fetch all blobs whose names start with the given prefix.
in our case, we are fetching blobs whose names begin with 'my_folder/'. Unfortunately, my_folder/ itself, which represents a directory in GCS, may also be fetched as a blob. Since we do not want to download directory blobs, we filter them out by ignoring those whose names end with the '/' character.
even though the blob name is my_folder/cat.png, the download_to_filename(~) method will place cat.png inside the local folder my_folder. We must make sure that this folder exists by using Path from the built-in pathlib module - otherwise a FileNotFoundError will occur.
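The code above only creates the top-level folder, so a blob such as my_folder/sub/a.txt (a made-up name for illustration) would still fail to download because the sub folder does not exist locally. One way to handle nested folders is to first map every blob name to a local path, as in the sketch below. The function plan_local_paths is a hypothetical helper of our own, not part of the GCS client library.

```python
from pathlib import Path, PurePosixPath


def plan_local_paths(blob_names, root="."):
    """Map GCS blob names to local paths, skipping directory blobs.

    Returns a list of (blob_name, local_path) tuples.
    """
    plan = []
    for name in blob_names:
        if name.endswith("/"):
            # Directory placeholder such as 'my_folder/' - nothing to download
            continue
        # Blob names always use '/' as the separator, so split them as POSIX paths
        local_path = Path(root).joinpath(*PurePosixPath(name).parts)
        plan.append((name, local_path))
    return plan


# Hypothetical usage with the bucket from the examples above:
# names = [blob.name for blob in bucket.list_blobs(prefix='my_folder/')]
# for name, local_path in plan_local_paths(names):
#     local_path.parent.mkdir(parents=True, exist_ok=True)
#     bucket.blob(name).download_to_filename(str(local_path))
```

Creating local_path.parent for each file means that every level of nesting is created on demand, no matter how deep the folder structure on GCS is.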

Downloading the content of files in memory

Instead of downloading a file to a local path, suppose we want to store the content of the file in a variable. For instance, let's read the content of a text file on GCS called uploaded_sample.txt into memory using the download_as_string(~) method:

from google.cloud import storage

path_to_private_key = './gcs-project-354207-099ef6796af6.json'
client = storage.Client.from_service_account_json(json_credentials_path=path_to_private_key)
# The name of our bucket
bucket = storage.Bucket(client, 'example-bucket-skytowner')
# The name of the file on GCS
blob = bucket.blob('uploaded_sample.txt')

byte_str_file_content = blob.download_as_string()
str_file_content = byte_str_file_content.decode('utf-8')
print(str_file_content)

This is some sample text.
Hello World.
Note the following:

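the download_as_string(~) method returns a byte string. In recent versions of the library, this method is a deprecated alias of download_as_bytes(~), which behaves the same way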
we use the decode('utf-8') method to convert the byte string into a standard string
the content of our text file ('uploaded_sample.txt') is printed in the output
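The download and decode steps can be combined into a single helper. The sketch below assumes a version of google-cloud-storage that provides download_as_bytes(~); read_blob_text is a hypothetical name of our own, and the blob argument is assumed to behave like a google.cloud.storage.Blob.

```python
def read_blob_text(blob, encoding: str = "utf-8") -> str:
    """Return the content of `blob` as a standard string.

    `blob` is assumed to provide download_as_bytes(),
    as google.cloud.storage.Blob does in recent versions.
    """
    # Download the raw bytes into memory, then decode them
    byte_content = blob.download_as_bytes()
    return byte_content.decode(encoding)


# Hypothetical usage with the bucket from the examples above:
# blob = bucket.blob('uploaded_sample.txt')
# print(read_blob_text(blob))
```

Since the whole file is held in memory, this approach is best suited to small files such as configuration or text files.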

Published by Isshin Inada
