How can I parse a YAML file using a shell script?

Time:11-06

I have a YAML file which also has lists.

YAML File -

configuration:
  account: account1
  warehouse: warehouse1
  database: database1
  object_type:
    schema: schema1
    functions: funtion1
    tables:
      - table: table1
        sql_file_loc: some_path/some_file.sql
      - table: table2
        sql_file_loc: some_path/some_file.sql

I want to store the key-value pairs in shell variables and loop over them. For example, the values for account, warehouse, and database should go into variables that I can use later. The values for the tables (table1 and table2) and sql_file_loc should also go into shell variables that I can loop over, like this -

for i in $table; do
    echo "$i"
done

I have tried this code below -

function parse_yaml {
   local prefix=$2
   local s='[[:space:]]*' w='[a-zA-Z0-9_]*' fs=$(echo @|tr @ '\034')
   sed -ne "s|^\($s\):|\1|" \
        -e "s|^\($s\)\($w\)$s:$s[\"']\(.*\)[\"']$s\$|\1$fs\2$fs\3|p" \
        -e "s|^\($s\)\($w\)$s:$s\(.*\)$s\$|\1$fs\2$fs\3|p"  $1 |
   awk -F$fs '{
      indent = length($1)/2;
      vname[indent] = $2;
      for (i in vname) {if (i > indent) {delete vname[i]}}
      if (length($3) > 0) {
         vn=""; for (i=0; i<indent; i++) {vn=(vn)(vname[i])("_")}
         printf("%s%s%s=\"%s\"\n", "'$prefix'",vn, $2, $3);
      }
   }'
}

And this is the output I get -

configuration_account="account_name"
configuration_warehouse="warehouse_name"
configuration_database="database_name"
configuration_object_type_schema="schema1"
configuration_object_type_functions="funtion1"
configuration_object_type_tables__sql_file_loc="some_path/some_file.sql"
configuration_object_type_tables__sql_file_loc="some_path/some_file.sql"

It doesn't print configuration_object_type_tables__table="table1" or configuration_object_type_tables__table="table2".

Also, for a list it prints two underscores (__), unlike for the other objects. I want to loop over the values stored in configuration_object_type_tables__table and configuration_object_type_tables__sql_file_loc.

Any help would be appreciated!

CodePudding user response:

Consider using a YAML processor such as mikefarah/yq. It's a one-liner:

yq e '.. | select(type == "!!str") | (path | join("_")) + "=\"" + . + "\""' "$INPUT"

Output

configuration_account="account1"
configuration_warehouse="warehouse1"
configuration_database="database1"
configuration_object_type_schema="schema1"
configuration_object_type_functions="funtion1"
configuration_object_type_tables_0_table="table1"
configuration_object_type_tables_0_sql_file_loc="some_path/some_file.sql"
configuration_object_type_tables_1_table="table2"
configuration_object_type_tables_1_sql_file_loc="some_path/some_file.sql"
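
If the goal is to loop over the table entries, the flat lines above can be regrouped into one row per table without running them through eval. The following is only a minimal sketch: table_rows is a hypothetical helper name, and it assumes each entry's _table line appears before its _sql_file_loc line, as in the output above, and that values contain no double quotes or tabs.

```shell
# table_rows: read NAME="value" lines on stdin and print one
# "table<TAB>sql_file_loc" row per list entry.
# Assumption: within each entry the _table line precedes _sql_file_loc.
table_rows() {
  awk -F'"' '
    /_tables_[0-9]+_table=/        { table = $2 }             # remember the table name
    /_tables_[0-9]+_sql_file_loc=/ { print table "\t" $2 }    # emit the pair
  '
}

# Example, assuming the lines above were saved to vars.txt (a hypothetical file name):
#   table_rows < vars.txt | while IFS="$(printf '\t')" read -r table sql_file; do
#     echo "$table -> $sql_file"
#   done
```

Parsing the lines as text, rather than evaluating them, avoids executing anything that might be embedded in the YAML values.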

Also take a look at this handy built-in feature of yq:

yq e -o props "$INPUT"

Output

configuration.account = account1
configuration.warehouse = warehouse1
configuration.database = database1
configuration.object_type.schema = schema1
configuration.object_type.functions = funtion1
configuration.object_type.tables.0.table = table1
configuration.object_type.tables.0.sql_file_loc = some_path/some_file.sql
configuration.object_type.tables.1.table = table2
configuration.object_type.tables.1.sql_file_loc = some_path/some_file.sql

CodePudding user response:

I suggest you try the yq YAML processor, as jpseng mentioned.

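As for the code you have here, the regex does not match the "- table" lines because of the "- " prefix: the sed patterns expect only whitespace before the key. That is also where the double underscore comes from. sql_file_loc is indented one level deeper than the list item, but no name was ever recorded for the list-item level, so an empty segment gets joined into the variable name.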
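
If yq is not an option, one way around the "- " prefix is to treat each list item as its own numbered level. The following is a rough sketch rather than a general YAML parser, and it is a rewrite in awk, not a patch of the original sed patterns. It assumes space indentation, plain "key: value" scalars (optionally double-quoted), and "- key: value" list items indented deeper than their parent key. The zero-based index in the names is a choice made here to mirror yq's output.

```shell
# parse_yaml FILE [PREFIX]: print flat NAME="value" lines for a simple YAML file.
# Assumptions: space indentation, "key: value" scalars, and "- key: value"
# list items indented deeper than their parent key.
parse_yaml() {
  local file="$1" prefix="${2:-}"
  awk -v prefix="$prefix" '
    function trim(s) { gsub(/^[ \t]+|[ \t]+$/, "", s); return s }
    # Join the names on the stack into "a_b_c_" (trailing underscore included).
    function path(   i, p) {
      p = ""
      for (i = 1; i <= depth; i++) p = p name[i] "_"
      return p
    }
    /^[ \t]*(#.*)?$/ { next }                 # skip blank lines and comments
    {
      indent = 0
      while (substr($0, indent + 1, 1) == " ") indent++
      rest = substr($0, indent + 1)

      if (rest ~ /^- /) {
        # List item: drop deeper levels, then push the item index as a level.
        while (depth > 0 && ind[depth] >= indent) depth--
        p = path()
        depth++; ind[depth] = indent; name[depth] = count[p]++
        rest = substr(rest, 3); indent += 2   # the key starts after "- "
      }

      while (depth > 0 && ind[depth] >= indent) depth--
      colon = index(rest, ":")
      if (colon == 0) next
      key = trim(substr(rest, 1, colon - 1))
      val = trim(substr(rest, colon + 1))
      if (val ~ /^".*"$/) val = substr(val, 2, length(val) - 2)

      if (val == "") {                        # nested mapping or list follows
        depth++; ind[depth] = indent; name[depth] = key
      } else {
        printf "%s%s%s=\"%s\"\n", prefix, path(), key, val
      }
    }
  ' "$file"
}
```

Called as parse_yaml config.yml (file name assumed), this is intended to produce names such as configuration_object_type_tables_0_table and configuration_object_type_tables_1_table for the file in the question. The output can be loaded with eval "$(parse_yaml config.yml)", but only for files you trust, since values are not escaped.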
