How to run gcloud, gsutil and bq using Google Cloud SDK docker image

How to set up and run the Google Cloud SDK Docker image, and execute gcloud, gsutil and bq commands on your local machine.
Extract data from local machine into Google Cloud Platform

Pre-requisites

  1. (Optional) Visual Studio Code https://code.visualstudio.com/download - You can use PowerShell or Command Prompt (Windows) or Bash (macOS) instead, but I prefer Visual Studio Code because it works well on both Windows and macOS.
  2. Docker https://www.docker.com/products/docker-desktop

Steps

Source

https://cloud.google.com/sdk/docs/downloads-docker


Assuming you have installed Docker and Visual Studio Code, open Visual Studio Code → click Terminal → New Terminal.

Open New Terminal in Visual Studio Code

Pull the Docker image. Version 357.0.0 is used for consistency throughout the course; pull the latest if you prefer. The download will take a while because the image is quite large (~1GB).

# Pull the docker image

docker pull gcr.io/google.com/cloudsdktool/cloud-sdk:357.0.0

# Check that the Docker image exists

docker image ls
Downloading Cloud SDK docker image
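
Since the same image tag is repeated in every command in this guide, it can help to keep it in one variable. The following is a minimal sketch for a Bash-like shell, not part of the official instructions; `CLOUD_SDK_VERSION`, `CLOUD_SDK_IMAGE` and `pull_cloud_sdk` are hypothetical names chosen for illustration.

```shell
# Keep the pinned version in one place (357.0.0 is the version used in this guide).
CLOUD_SDK_VERSION="357.0.0"
CLOUD_SDK_IMAGE="gcr.io/google.com/cloudsdktool/cloud-sdk:${CLOUD_SDK_VERSION}"

# Pull the image only if it is not already present locally.
# "docker image inspect" exits with a non-zero status when the image is missing.
pull_cloud_sdk() {
  if docker image inspect "$CLOUD_SDK_IMAGE" >/dev/null 2>&1; then
    echo "Image already present: $CLOUD_SDK_IMAGE"
  else
    docker pull "$CLOUD_SDK_IMAGE"
  fi
}
```

After defining it, run `pull_cloud_sdk` once; later runs should skip the ~1GB download if the image is still on your machine.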

Verify the installation

# Check the gcloud version; --rm removes the container and its file system when the command exits
# Refer here for more info: https://docs.docker.com/engine/reference/run/#clean-up---rm

docker run --rm gcr.io/google.com/cloudsdktool/cloud-sdk:357.0.0 gcloud version

Authenticate

# Run the image and create a container called "gcloud-config" to store your credentials

docker run -ti --name gcloud-config gcr.io/google.com/cloudsdktool/cloud-sdk:357.0.0 gcloud auth login 

# Check that the container was created

docker ps -a
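
Docker refuses to create a second container with a name that is already in use, so re-running the authentication step fails while "gcloud-config" still exists. Below is a small sketch, assuming a Bash-like shell, that checks for the container first; `has_gcloud_config` is a hypothetical helper name.

```shell
# Succeeds (exit status 0) if a container named exactly "gcloud-config" exists,
# whether it is running or stopped.
has_gcloud_config() {
  # --filter name= matches substrings, so grep -x is used to compare whole names.
  docker ps -a --filter "name=gcloud-config" --format '{{.Names}}' \
    | grep -qxF "gcloud-config"
}

# Example use:
#   if has_gcloud_config; then
#     echo "Credentials container found"
#   else
#     echo "Run the 'gcloud auth login' step first"
#   fi
```

If you want to authenticate again with a different account, remove the old container with `docker rm gcloud-config` before repeating the login step.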

Run some basic gcloud commands. Refer here for more tips: https://cloud.google.com/sdk/gcloud/reference

# Run a new container that mounts the volumes from the container created above, so it can reuse your credentials
# List projects 

docker run --rm --volumes-from gcloud-config gcr.io/google.com/cloudsdktool/cloud-sdk:357.0.0 gcloud projects list

Run some basic gsutil commands. Refer here for more tips: https://cloud.google.com/storage/docs/gsutil/commands/help

# Run a new container that mounts the volumes from the container created above, so it can reuse your credentials
# List files in a bucket 

docker run --rm --volumes-from gcloud-config gcr.io/google.com/cloudsdktool/cloud-sdk:357.0.0 gsutil ls gs://chenmingyong_udemy
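
Because gsutil runs inside the container, it cannot see files on your local machine unless a local directory is mounted. Here is a hedged sketch, assuming a Bash-like shell, that mounts the current directory and copies one file to a bucket; `upload_to_bucket`, the `/data` mount point and the file name `sample.csv` are hypothetical.

```shell
CLOUD_SDK_IMAGE="gcr.io/google.com/cloudsdktool/cloud-sdk:357.0.0"

# Usage: upload_to_bucket LOCAL_FILE GS_URL
# LOCAL_FILE is a path relative to the current directory.
upload_to_bucket() {
  # -v mounts the current directory read-only at /data inside the container,
  # so gsutil can read the local file from there.
  docker run --rm \
    --volumes-from gcloud-config \
    -v "$(pwd)":/data:ro \
    "$CLOUD_SDK_IMAGE" \
    gsutil cp "/data/$1" "$2"
}
```

For example, `upload_to_bucket sample.csv gs://chenmingyong_udemy/` would copy `sample.csv` from the current directory into the bucket used above. On Windows, Docker Desktop may ask you to allow file sharing for the directory first.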

Run some basic bq commands. Refer here for more tips: https://cloud.google.com/bigquery/docs/bq-command-line-tool

# List the projects 

docker run --rm --volumes-from gcloud-config gcr.io/google.com/cloudsdktool/cloud-sdk:357.0.0 gcloud projects list

# Set your project based on the project id

docker run --rm --volumes-from gcloud-config gcr.io/google.com/cloudsdktool/cloud-sdk:357.0.0 gcloud config set project udemy-325708

# Run a sample bq command

docker run --rm --volumes-from gcloud-config gcr.io/google.com/cloudsdktool/cloud-sdk:357.0.0 bq query --nouse_legacy_sql 'SELECT COUNT(*) FROM `bigquery-public-data`.samples.shakespeare'
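
Typing the full `docker run` prefix for every command gets repetitive. One option, sketched here for a Bash-like shell, is to wrap it in shell functions so the tools can be called by name; `cloud_sdk` is a hypothetical helper name, and the functions only last for the current terminal session unless you add them to your shell profile.

```shell
CLOUD_SDK_IMAGE="gcr.io/google.com/cloudsdktool/cloud-sdk:357.0.0"

# Run any Cloud SDK tool in a throwaway container that reuses the
# credentials stored in the "gcloud-config" container.
cloud_sdk() {
  docker run --rm --volumes-from gcloud-config "$CLOUD_SDK_IMAGE" "$@"
}

# Thin wrappers so the three tools can be typed as if installed locally.
gcloud() { cloud_sdk gcloud "$@"; }
gsutil() { cloud_sdk gsutil "$@"; }
bq()     { cloud_sdk bq "$@"; }
```

With these defined, `gcloud projects list` or `gsutil ls gs://chenmingyong_udemy` runs the same containers as the long commands above. Interactive commands such as `gcloud auth login` still need the `-ti` flags, so keep using the full command for those.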

This is part of my online course on how to kickstart data engineering workflows in Google Cloud Platform (GCP) for beginners. Sign up here to watch the detailed video explanation! 🤗