Python getting output from running shell command - gcloud create dataproc cluster

Time:12-15

I'm trying to get the expire-dataproc-tag (the custom image expiration token) from the output of gcloud dataproc clusters create, using Python.

I tried subprocess.Popen. I think the issue is that the command either fails with an ERROR or takes a long time to return its result, so I end up with an empty string.

I tried command and command_1, and both worked fine; the issue appears when running command_2.

import subprocess

command = "echo hello world"
command_1 = "gcloud compute images list --project {project-id} --no-standard-images"
command_2 = 'gcloud beta dataproc clusters create cluster-name --bucket {bucket} --region europe-west1 --zone europe-west1-b --subnet {subnet} --tags {tag} --project {project-id} --service-account {service-account} --master-machine-type n1-standard-16 --master-boot-disk-size 100 --worker-machine-type n1-standard-1 --worker-boot-disk-size 100 --image {image} --max-idle 2h --metadata enable-oslogin=true --properties {properties} --optional-components=ANACONDA,JUPYTER,ZEPPELIN --enable-component-gateway --single-node --no-address'.split(' ')

process = subprocess.Popen(command_2, stdout=subprocess.PIPE, shell=True)
# process.wait()
try:
    print('inside-try')
    result, err = process.communicate()
    result = result.decode('utf-8')
except Exception as e:
    print('The Error', e)

print('the result: ', result)
print("the-error: ", err)

The output is:

inside-try  
ERROR: (gcloud.beta.dataproc.clusters.create) INVALID_ARGUMENT: Dataproc custom image '{image-name}' has expired. Please rebuild this custom image. To extend the custom image expiration date to '2022-02-11T08:29:58.322549Z', please use this cluster property during cluster creation: 'dataproc:dataproc.custom.image.expiration.token=1.{image-name-properties......}'               
the result: 
the-error: None

I'm trying to get the ERROR: ... output into the result variable (so that it is printed after "the result:").

Answer:

You're not capturing stderr from the process.

Try:

process = subprocess.Popen(
    command,
    stdout=subprocess.PIPE,
    stderr=subprocess.PIPE,
    shell=True
)

Because stderr was not piped in your original call, result, err = process.communicate() left err set to None.

With the above change, err will contain the error message that you're receiving.
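As a rough sketch of the same idea with subprocess.run, the snippet below captures stdout and stderr separately and then pulls the expiration token out of the error text. The helper names run_and_capture and extract_expiration_token are made up for illustration, and the regular expression assumes the error message always has the shape shown in the question's output, which may not hold for other gcloud versions.

```python
import re
import subprocess
import sys


def run_and_capture(args):
    """Run a command (list of arguments) and return (returncode, stdout, stderr)."""
    completed = subprocess.run(
        args,                 # argument list, executed without a shell
        capture_output=True,  # pipe both stdout and stderr
        text=True,            # decode bytes to str
    )
    return completed.returncode, completed.stdout, completed.stderr


# Assumption: the token appears as
# 'dataproc:dataproc.custom.image.expiration.token=<value>' in the message,
# as in the output quoted in the question.
TOKEN_PATTERN = re.compile(
    r"dataproc:dataproc\.custom\.image\.expiration\.token=([^'\s]+)"
)


def extract_expiration_token(message):
    """Return the token value found in message, or None if there is none."""
    match = TOKEN_PATTERN.search(message)
    return match.group(1) if match else None


if __name__ == "__main__":
    # Stand-in for the gcloud call: a child process that only writes to stderr.
    demo = [
        sys.executable,
        "-c",
        "import sys; sys.stderr.write('ERROR: demo failure'); sys.exit(1)",
    ]
    code, out, err = run_and_capture(demo)
    print("return code:", code)
    print("the result:", out)
    print("the error:", err)
    print("token:", extract_expiration_token(err))
```

Note that this passes the argument list without shell=True. On POSIX systems, combining a list with shell=True treats only the first element as the command, so either pass a single string with shell=True or a list without it.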

I strongly encourage you to consider using Google's SDKs to interact with its services. Not only are these easier to use but, instead of shipping strings in/out of sub-processes, you can ship Python objects.

Here's the documentation for Creating a Dataproc cluster in Python.
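For orientation, here is a minimal sketch of what the SDK route might look like, assuming the google-cloud-dataproc package is installed and credentials are already configured. It is not a translation of the full gcloud command from the question: build_cluster is a hypothetical helper, and only the master machine type and cluster properties are carried over.

```python
def build_cluster(project_id, cluster_name, properties=None):
    """Build a cluster description as a plain dict (hypothetical helper)."""
    return {
        "project_id": project_id,
        "cluster_name": cluster_name,
        "config": {
            "master_config": {"machine_type_uri": "n1-standard-16"},
            # Property keys use the 'prefix:property' form, for example
            # 'dataproc:dataproc.custom.image.expiration.token'.
            "software_config": {"properties": dict(properties or {})},
        },
    }


def create_cluster(project_id, region, cluster_name, properties=None):
    """Create the cluster and wait for the operation to finish."""
    # Imported here so the helper above works without the SDK installed.
    from google.cloud import dataproc_v1

    client = dataproc_v1.ClusterControllerClient(
        client_options={"api_endpoint": f"{region}-dataproc.googleapis.com:443"}
    )
    operation = client.create_cluster(
        request={
            "project_id": project_id,
            "region": region,
            "cluster": build_cluster(project_id, cluster_name, properties),
        }
    )
    # Blocks until the cluster is created; API errors surface as exceptions.
    return operation.result()
```

With this approach a failure such as the expired image should arrive as a Python exception (from google.api_core.exceptions) rather than as text on stderr, so there is no output to parse.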
