
How to upload files to Cloud Storage using a Python container

How to set up a Python Docker container on your local machine and upload files to Cloud Storage using the Python client libraries.

Resources

https://cloud.google.com/apis/docs/cloud-client-libraries


Pre-requisites

  1. Visual Studio Code https://code.visualstudio.com/download
  2. Docker https://www.docker.com/products/docker-desktop

Steps

In Visual Studio Code, create a directory containing service_account.json, requirements.txt, main.py, hello_gcs.txt, and Dockerfile.


service_account.json: follow the guide at https://cloud.google.com/iam/docs/creating-managing-service-accounts to set up a service account. The key file looks like this, with placeholder values:

{
  "type": "service_account",
  "project_id": "abc",
  "private_key_id": "abc",
  "private_key":"abc",
  "client_email": "abc",
  "client_id": "abc",
  "auth_uri": "https://accounts.google.com/o/oauth2/auth",
  "token_uri": "https://oauth2.googleapis.com/token",
  "auth_provider_x509_cert_url": "https://www.googleapis.com/oauth2/v1/certs",
  "client_x509_cert_url": "abc"
}
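A malformed key file tends to surface only later as an authentication error, so it can help to check it before building the image. The sketch below is one possible check, not part of the client library: the function names and the choice of which fields to require are assumptions made for illustration.

```python
import json

# Assumption: these are the fields treated as mandatory for this check.
# The real key file contains more fields, as shown above.
REQUIRED_FIELDS = ("type", "project_id", "private_key", "client_email", "token_uri")


def problems_in_key(key):
    """Return a list of human-readable problems found in a parsed key dict."""
    problems = ["missing field: " + name for name in REQUIRED_FIELDS if not key.get(name)]
    # A present but unexpected type usually means the wrong kind of credential file.
    if key.get("type") and key["type"] != "service_account":
        problems.append("type is not 'service_account'")
    return problems


def check_key_file(path):
    """Load the JSON key file at `path` and return its list of problems."""
    with open(path, encoding="utf-8") as handle:
        return problems_in_key(json.load(handle))
```

Calling check_key_file("service_account.json") from the project directory should return an empty list when all of the checked fields are filled in.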

requirements.txt, listing the Python packages your script needs

# Use this version as of Sep 2021
google-cloud-storage==1.42.2
# Or leave the version unpinned to install the latest release:
# google-cloud-storage

main.py; see https://cloud.google.com/storage/docs/uploading-objects#storage-upload-object-python for more details

"""Uploads a file to the bucket."""
from google.cloud import storage

# The ID of your GCS bucket
bucket_name = "chenmingyong-udemy"

# The path to your file to upload
source_file_name = "hello_gcs.txt"

# The ID of your GCS object
destination_blob_name = "hello_gcs.txt"

storage_client = storage.Client()
bucket = storage_client.bucket(bucket_name)
blob = bucket.blob(destination_blob_name)

blob.upload_from_filename(source_file_name)

print(
    "File {} uploaded to {}.".format(
        source_file_name, destination_blob_name
    )
)
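If you want to upload more than one file, the script above can be wrapped in a function. The following is a minimal sketch under a few assumptions: the name upload_file and the optional client parameter are hypothetical and not part of the original script, and the gs:// return value is only a convenience. It uses the same client calls as main.py (storage.Client, bucket, blob, upload_from_filename).

```python
def upload_file(bucket_name, source_file_name, destination_blob_name, client=None):
    """Upload a local file to a bucket and return the gs:// URI of the new object.

    `client` is a hypothetical parameter: pass an existing storage.Client to
    reuse it, or leave it as None to create one from the default credentials.
    """
    if client is None:
        # Imported here so the function can also be exercised with a stand-in client.
        from google.cloud import storage

        client = storage.Client()

    blob = client.bucket(bucket_name).blob(destination_blob_name)
    blob.upload_from_filename(source_file_name)
    return "gs://{}/{}".format(bucket_name, destination_blob_name)
```

Inside the container, upload_file("chenmingyong-udemy", "hello_gcs.txt", "hello_gcs.txt") would perform the same upload as main.py.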

hello_gcs.txt, containing any text you like

hello_gcs_from_chenming-yong_udemy

Dockerfile, based on the guide here https://hub.docker.com/_/python

FROM python:3.7

# Install the nano editor in case we need to edit a file inside the container
RUN apt-get update && apt-get -y install nano

# Install the Python dependencies
WORKDIR /usr/src/app
COPY requirements.txt ./
RUN pip install --no-cache-dir -r requirements.txt

# Copy all the files in the current folder into /usr/src/app
COPY . .

# Point the client library at the service account key so it has the necessary permissions
ENV GOOGLE_APPLICATION_CREDENTIALS="/usr/src/app/service_account.json"
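The ENV line only helps if the path it names actually exists inside the container. As a quick sanity check before running the upload, something like the sketch below could be run in the container's Python shell. The function name and the returned messages are assumptions chosen for illustration; only the GOOGLE_APPLICATION_CREDENTIALS variable name comes from the Dockerfile.

```python
import os


def credentials_path_status(environ=None):
    """Describe whether GOOGLE_APPLICATION_CREDENTIALS points at an existing file."""
    # Default to the real environment; a plain dict can be passed for experiments.
    environ = os.environ if environ is None else environ
    path = environ.get("GOOGLE_APPLICATION_CREDENTIALS")
    if not path:
        return "GOOGLE_APPLICATION_CREDENTIALS is not set"
    if not os.path.isfile(path):
        return "no file at {}".format(path)
    return "ok"
```

With the Dockerfile above, credentials_path_status() should report "ok" as long as service_account.json was copied into /usr/src/app.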

This assumes you have already installed Docker and Visual Studio Code.

Open Visual Studio Code → Click on Terminal → New Terminal


Build the Docker image and run the Python container

docker build -t my-python-app .

# Optional: check that the image is built
# docker image ls

docker run -it --rm --name my-running-app my-python-app /bin/bash

Run the Python script inside the container

python main.py

Log in to Google Cloud Platform → Cloud Storage → open your bucket. You should see the new file (hello_gcs.txt) there!
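The same confirmation can be done from Python instead of the web console. This is a rough sketch, assuming the service account is also allowed to list objects in the bucket; the function names and the optional client parameter are hypothetical, while list_blobs is the client library's method for listing objects.

```python
def object_names(bucket_name, prefix=None, client=None):
    """Return the sorted names of the objects in a bucket, optionally filtered by prefix."""
    if client is None:
        # Imported here so the function can also be exercised with a stand-in client.
        from google.cloud import storage

        client = storage.Client()
    return sorted(blob.name for blob in client.list_blobs(bucket_name, prefix=prefix))


def object_exists(bucket_name, blob_name, client=None):
    """Return True if an object with exactly this name is present in the bucket."""
    return blob_name in object_names(bucket_name, prefix=blob_name, client=client)
```

After running main.py, object_exists("chenmingyong-udemy", "hello_gcs.txt") should return True.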


This is part of my online course on how to kickstart data engineering workflows in Google Cloud Platform (GCP) for beginners. Sign up here to watch a detailed video explanation! 🤗