How to load a multiline setting from a file using bash?

Time:11-16

I have a long config file which looks like:

<some stuff before our example>
    'Realtime' => [
        'foo' => 'bar',
        'enabled' => true,
        'lorem' => 'ipsum'
    ],
<some stuff after our example>

The above is part of a large PHP config file, and I was asked to mine the enabled value of 'Realtime' with bash. I could do it with PHP, but I was specifically asked to do it with bash.

I tried the following:

echo $(tr '\n' ' ' < myconfig.php | sed '$s/ $/\n/') | grep -o -P '(?<=Realtime).*(?=\])'

and this mines the text from the file between Realtime and the last ]. But I would like to mine the content between Realtime and the first ]. For the time being I have implemented a simplistic bash command and accompanied it with a PHP parser, as follows:

    public function getConfig($name)
    {
        $path = Paths::CONFIG_FILE;
        if (!$this->config) {
            $this->config = Command_ShellFactory::makeForServer('zl', "cat {$path}")->execute(true, true);
        }
        $splitName = explode('.', $name);
        $lastPosition = 0;
        $tempConfig = $this->config;
        foreach ($splitName as $currentName) {
            if (($position = strpos($tempConfig, $currentName)) === false) {
                throw new RuntimeException('Setting was not found');
            }

            $tempConfig = substr($tempConfig, $position);
        }

        return trim(explode("=>", explode("\n", $tempConfig)[0])[1], ", \n\r\t\v\x00");
    }

and this works, but I'm not satisfied with it, because it loads the whole file into memory via the shell command and then searches for the nested key (Realtime.enabled is passed to it). Is it possible to improve this code in such a way that all the logic would happen via bash, rather than helping it with PHP?

EDIT

The possible settings to mine could be of any depth. Examples:

[
    /*...*/
    'a' => 'b', //Depth of 1
    'c' => [
        'a' => 'd' //Depth of 2
    ],
    'e' => [
        'f' => [
            'g' =>'h' //Depth of 3
        ]
    ]
    /*...*/
]

Theoretically any depth is possible; the example shows settings at a depth of 1, a depth of 2 and a depth of 3.

EDIT

I have created foo.sh (some fantasy name of no importance):

[
    'Realtime' => [
        'enabled' => [
            'd' => [
                'e' => 'f'
            ]
        ],
        'a' => [
            'b' => 'c'
        ]
    ]
    'g' => [
        'h' => 'i'
    ]
    'Unrealtime' => 'abc'
]

Working one-dimensional command:

sed -Ez ":a;s/.*Unrealtime' => ([^,]*).*/\1\n/" foo.sh | head -1

The result is

'abc'

Working two-dimensional command:

sed -Ez ":a;s/.*g[^]]*h' => ([^,]*).*/\1\n/" foo.sh | head -1

The result is

'i'

Three-dimensional command:

sed -Ez ":a;s/.*Realtime*[^]]*a[^]]*b' => ([^,]*).*/\1\n/" foo.sh | head -1

It is working if and only if the

    'a' => [
        'b' => 'c'
    ]

is the first child of Realtime. So, something is missing, as I need to avoid assuming that the element I search for is the first child.

Working four-dimensional command:

sed -Ez ":a;s/.*Realtime[^]]*enabled[^]]*d[^]]*e' => ([^,]*).*/\1\n/" foo.sh | head -1

Again, it only works if enabled is the first child of Realtime. I was modifying my test case above, changing the order of the children of Realtime. So, it seems that the only thing missing from this expression is something that would specify that we are not necessarily looking for the first child.

CodePudding user response:

One awk idea:

awk -F"'" '                                             # define input field delimiter as single quote
$2 == "Realtime"           { inblock=1 }                # if 2nd field == "Realtime" then set flag
$2 == "enabled" && inblock { if (NF==3) {               # value is not wrapped in single quotes
                                pos=index($0,"=>")      # find location of "=>" string
                                value=substr($0,pos+2)  # grab everything after "=>"
                                gsub(/[ ,]/,"",value)   # remove all spaces and commas
                             }
                             else                       # value is wrapped in single quotes
                                value=$4                # grab 4th field
                             print value
                             exit                       # no need to process rest of file so exit script
                           }
' myconfig.php

This generates:

true

NOTE: this solution is hardcoded based on provided sample
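
The same awk idea can be parameterized so the block name and key are not hardcoded. This is only a sketch under the same assumptions as above (one entry per line, single-quoted keys). The function name get_block_setting and its argument order are made up for illustration. Like the original, it never resets the flag, so it also matches the key if it first appears after the block ends.

```shell
#!/usr/bin/env bash
# get_block_setting FILE BLOCK KEY  (hypothetical helper name)
# Prints the value of the first KEY found at or after the line that names BLOCK.
get_block_setting() {
    awk -F"'" -v block="$2" -v key="$3" '
        $2 == block          { inblock = 1 }        # reached the requested block
        inblock && $2 == key {
            if (NF == 3) {                           # bare value such as true
                value = substr($0, index($0, "=>") + 2)
                gsub(/[ \t,]/, "", value)            # drop spaces, tabs and commas
            } else {
                value = $4                           # value wrapped in single quotes
            }
            print value
            found = 1
            exit                                     # jumps to END
        }
        END { exit !found }                          # non-zero status if nothing matched
    ' "$1"
}
```

For the fragment shown in the question, get_block_setting myconfig.php Realtime enabled should print true, and the exit status is non-zero when the key is not found.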

CodePudding user response:

Assuming the string Realtime occurs only once in the file, you can try this sed

$ sed -Ez "s/.*Realtime[^]]*enabled' => ([^,]*).*/\1\n/" myconfig.php
true
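
The "occurs only once" assumption can be checked before running the substitution. A minimal sketch, assuming GNU sed (for -z) and that each 'Realtime' key sits on its own line; the function name realtime_enabled is hypothetical:

```shell
#!/usr/bin/env bash
# realtime_enabled FILE  (hypothetical helper name)
# Refuses to run the sed command unless exactly one line mentions 'Realtime'.
realtime_enabled() {
    local file=$1 count
    count=$(grep -c "'Realtime'" "$file")    # number of lines containing the quoted key
    if [ "$count" -ne 1 ]; then
        echo "expected exactly one 'Realtime' line, found $count" >&2
        return 1
    fi
    # -z makes sed treat the whole file as one record, so [^]]* can span lines.
    sed -Ez "s/.*Realtime[^]]*enabled' => ([^,]*).*/\1\n/" "$file"
}
```

The guard only counts lines. It does not detect Realtime appearing inside a comment or a string value, and the sed pattern still requires that no ] appears between Realtime and enabled.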

CodePudding user response:

Based on your test with tr and grep:

#! /usr/bin/env bash

tr -d "\n" < "myconfig.php" \
| grep -o "'Realtime' => \[[^]]*\]" \
| grep -oE "'enabled' => (true|false)" \
| head -1 \
| cut -d " " -f 3

Notes:

  • If you have only one Realtime block in your file, | head -1 is not necessary.
  • If you are not sure about the number of spaces, remove all spaces and change the following filters like this:
#! /usr/bin/env bash

tr -d "\n" < "myconfig.php" \
| tr -d " " \
| grep -o "'Realtime'=>\[[^]]*\]" \
| grep -oE "'enabled'=>(true|false)" \
| head -1 \
| cut -d ">" -f 2

UPDATE

More generic solution:

#! /usr/bin/env bash

INPUT_FILE="$1"
LEVEL_1_OBJ="Settings"
LEVEL_N_OBJ="Realtime"
FIELD="enabled"

tr -d "\n" < "${INPUT_FILE}" \
| tr -d " " \
| grep -o "'${LEVEL_1_OBJ}'=>\[.*" \
| grep -o "'${LEVEL_N_OBJ}'=>\[[^]]*\]" \
| grep -oE "'${FIELD}'=>[^,]*" \
| head -1 \
| cut -d ">" -f 2

You could add level filters between 1 and N (the last) and/or any other filter between the grep commands.
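
For arbitrary depth, an alternative to chaining one grep per level is to read the file line by line and track which arrays are currently open. This is a different technique from the pipeline above (a small bracket-tracking awk program), and only a sketch. It assumes one entry per line, single-quoted keys, the short [ ] array syntax with the opening bracket on the same line as its key, and closing brackets on their own lines. The function name get_setting and the dotted-path argument are made up for illustration.

```shell
#!/usr/bin/env bash
# get_setting FILE DOTTED.PATH  (hypothetical helper name)
# Prints the scalar value stored under a path such as Realtime.a.b.
get_setting() {
    awk -v want="$2" -v q="'" '
        function trim(s) { gsub(/^[ \t\r]+|[ \t\r]+$/, "", s); return s }
        {
            line = trim($0)
            if (line ~ /^\]/) { if (depth > 0) depth--; next }   # "]" or "]," closes the current array
            if (line ~ /^\[/) { stack[++depth] = ""; next }      # bare "[" opens an unnamed array
            if (!match(line, "^" q "[^" q "]*" q "[ \t]*=>[ \t]*")) next  # not a key line
            rest = substr(line, RLENGTH + 1)                     # text after "=>"
            key = substr(line, 2); sub(q ".*$", "", key)         # text inside the first pair of quotes
            if (rest ~ /^\[/) { stack[++depth] = key; next }     # named nested array: descend
            path = ""
            for (i = 1; i <= depth; i++)                         # join the names of the open arrays
                if (stack[i] != "") path = path stack[i] "."
            if ((path key) != want) next
            if (substr(rest, 1, 1) == q) {                       # quoted value
                rest = substr(rest, 2); sub(q ".*$", "", rest)
            } else {                                             # bare value such as true or 42
                sub(/[ \t,].*$/, "", rest)
            }
            print rest
            found = 1
            exit
        }
        END { exit !found }                                      # non-zero status if the path is absent
    ' "$1"
}
```

With the foo.sh sample from the question, get_setting foo.sh Realtime.a.b prints c whatever the order of Realtime's children, and get_setting foo.sh Unrealtime prints abc. Constructs such as array ( ... ), variables, or brackets inside comments and strings are not handled.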

CodePudding user response:

There's no way to do that task reliably.

IMHO, the least bad solution would be to include your config file from a PHP command and output the requested setting.

Here's an example of config file that shows how horrible it can be to mine a setting from it without interpreting the PHP:

<?php
$val = 'h';
$config = [
    'a' => 'b',
    'c' => array ( 'a' => 'd', 'b' => " /* "
    ), 'e' => [
        'f' => [ 'g' => $val ]
/*
        'f' => [
            'g' =>'h' */
]];

Now, if you include it from a shell script that runs PHP:

#!/usr/bin/env php
<?php
    require("/path/to/configfile.php");

    $path = explode(".",$argv[1]);
    $result = $config;

    while ($path) {
        if (!is_array($result))
            exit(1);
        $p = array_shift($path);
        if (!array_key_exists($p,$result))
            exit(1);
        $result = $result[$p];
    }

    var_dump($result);

Then you can accurately get a setting value, as long as the config file doesn't rely on environment settings (which would be different in the web server compared to the command-line):

#!/bin/sh
./script.php a
./script.php c.a
./script.php e.f.g

Output:

string(1) "b"
string(1) "d"
string(1) "h"