I am specifying --extra-files as a job parameter on Glue, yet the front end is not happy with it, as this picture shows:
I am nevertheless able to save my job and run it successfully:
def read_yaml(config_file_name: str) -> dict:
    with open(config_file_name, 'r') as stream:
        try:
            return yaml.safe_load(stream)
        except yaml.YAMLError as exc:
            logger.info(exc)

config = read_yaml(config_file_name=CONFIG_FILE_NAME)
logger.info(config)
What is then very weird is that if I check the job parameters again, --extra-files has disappeared, and if I run the job again it still reads my config file stored in S3.
Does anyone have an explanation for (1) why --extra-files generates a front-end error, and (2) why the job runs smoothly although --extra-files is not set?
Thanks!
CodePudding user response:
It is just one of the quirks of AWS Glue. '--extra-files' is the equivalent of 'Referenced Files Path' on the UI console.
Typically, when creating a new Glue job using the CLI or CloudFormation, we use '--extra-files' to set this value. On the console, however, this parameter is set in the section 'Security Configuration, script libraries, and job parameters' ---> 'Referenced Files Path'.
Once it is successfully set up, you will no longer see it as a separate key in the 'Job Parameters' section. Instead, it will be shown in the job details tab under (believe it or not) a totally new name: 'Other lib path'. That is why the job keeps reading your config file even though the key seems to have disappeared: the value is still set, just displayed elsewhere.
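For illustration, here is a minimal Python sketch of setting the value programmatically with boto3 instead of the console. The job name, role, and S3 paths are hypothetical placeholders, and the helper names are mine; only `create_job` and its `DefaultArguments` parameter come from boto3's Glue client.

```python
def build_job_args(name, role, script_location, extra_files):
    """Build keyword arguments for boto3's glue.create_job (hypothetical helper)."""
    return {
        "Name": name,
        "Role": role,
        "Command": {"Name": "glueetl", "ScriptLocation": script_location},
        # Multiple S3 paths are passed as one comma-separated string.
        "DefaultArguments": {"--extra-files": ",".join(extra_files)},
    }


def create_job(job_args):
    """Create the job; needs boto3 and AWS credentials, so it is not called here."""
    import boto3  # imported lazily so the sketch runs without boto3 installed

    return boto3.client("glue").create_job(**job_args)


# All values below are placeholders, not taken from the question.
job_args = build_job_args(
    name="my-job",
    role="MyGlueRole",
    script_location="s3://my-bucket/scripts/job.py",
    extra_files=["s3://my-bucket/config/config.yaml"],
)
print(job_args["DefaultArguments"])
```

Calling `create_job(job_args)` would then submit the definition; Glue copies the listed files into the script's working directory before the run, which is why the script can open the config file by name.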
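If you want to confirm that the value is still attached to the job, one option is to inspect the job definition through the API. The sketch below assumes the setting is returned under `DefaultArguments` in the `get_job` response; the helper name and the sample response are hypothetical.

```python
def extra_files_from_job(get_job_response):
    """Return the list of S3 paths stored under --extra-files, if any."""
    default_args = get_job_response.get("Job", {}).get("DefaultArguments", {})
    value = default_args.get("--extra-files", "")
    # The value is a comma-separated string; drop empty pieces.
    return [path for path in value.split(",") if path]


# Hand-written stand-in for a get_job response (placeholder values).
sample_response = {
    "Job": {
        "Name": "my-job",
        "DefaultArguments": {"--extra-files": "s3://my-bucket/config/config.yaml"},
    }
}
print(extra_files_from_job(sample_response))
```

With real credentials you would pass in `boto3.client("glue").get_job(JobName="my-job")` instead of the sample dictionary.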