How to stop SED from un-escaping the output?


There's a million sed-related questions, but I couldn't find this specific case. I will happily stand corrected if it turns out I'm a bad googler.

I have a file with special characters and newlines in it. Let's call it query.kql:

AzureMetrics
| where $__timeFilter(TimeGenerated)
| where ResourceProvider == "MICROSOFT.NETWORK"
| order by TimeGenerated asc

I also have a JSON file called data.json:

"analytics": {
            "query": "{{query.kql}}",
            "resource": "$GlobalDataSource",
            "resultFormat": "time_series"

What I want to do is insert the contents of query.kql into the {{query.kql}} placeholder in data.json, in escaped form (newline -> \n, " -> \", etc.).

This gives me the contents of query.kql in the desired format (works):

q=$(sed -e "N;s/\n/\\\n/" -e 's|["]|\\"|g' query.kql)
#q: AzureMetrics\n| where $__timeFilter(TimeGenerated) | where ResourceProvider == \"MICROSOFT.NETWORK\"\n| order by TimeGenerated asc

What I've tried:

# This does not work, because sed chokes on the result of the shell substitution:
sed -e "s/{{query.kql}}/$q/g" data.json
# Output: sed: -e expression #1, char 79: unterminated `s' command
# This works, but the output is wrong:
sed -e "s/{{query.kql}}/`echo $q`/g" data.json

# Output is unescaped and makes the json structure invalid:
"analytics": {
            "query": "AzureMetrics
| where $__timeFilter(TimeGenerated) | where ResourceProvider == "MICROSOFT.NETWORK"
| order by TimeGenerated asc",
            "resource": "$GlobalDataSource",
            "resultFormat": "time_series"

What I would like to have as output is the exact contents of q inserted:

"analytics": {
            "query": "AzureMetrics\n| where $__timeFilter(TimeGenerated) | where ResourceProvider == \"MICROSOFT.NETWORK\"\n| order by TimeGenerated asc",
            "resource": "$GlobalDataSource",
            "resultFormat": "time_series"

How can I get sed to preserve the original contents of $q in the output? I'm also open to suggestions using awk, perl or anything else commonly available from a bash script.

CodePudding user response:

Using sed

$ q=$(sed '2s/|/\\\\n&/;s/"/\\\\&/g;4s/|/\\\\n&/' query.kql)
$ sed "s/{{query.kql}}/`echo $q`/" data.json
"analytics": {
            "query": "AzureMetrics \n| where (TimeGenerated) | where ResourceProvider == \"MICROSOFT.NETWORK\" \n| order by TimeGenerated asc",
            "resource": "",
            "resultFormat": "time_series"

CodePudding user response:

It looks like you are almost there. sed processes backslashes in the replacement text once more, so if you double-escape the string you should get what you want. Try the following:

q=$(sed -e ':a;N;$!ba;s/\n/\\\\n/g' -e 's#["]#\\\\"#g' query.kql)
sed -e "s/{{query.kql}}/$q/g" data.json

Here is my output:

"analytics": {
            "query": "Metrics\n| where $__timeFilter(TimeGenerated)\n| where ResourceProvider == \"MICROSOFT.NETWORK\"\n| order by TimeGenerated asc",
            "resource": "$GlobalDataSource",
            "resultFormat": "time_series"

Edit: By the way, you should also escape the backslashes ('\') before escaping anything else. Otherwise, backslashes that were in the original file may end up being interpreted as escape characters in the final result. Putting sed -e 's/\\/\\\\/g' right before all other substitutions should do the trick.
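
Putting that ordering together, the escaping can be split into two passes: one that produces the JSON-escaped text, and one that protects whatever is special to sed's replacement side. This is only a sketch under a few assumptions: GNU sed and bash, a throwaway temp directory, and a query.kql whose first line is AzureMetrics as in the question's output. The variable names `json`, `repl` and `result` are made up for the example.

```shell
#!/usr/bin/env bash
# Sketch only: assumes GNU sed and bash; file names mirror the question.
set -eu
workdir=$(mktemp -d)
cd "$workdir"

printf '%s\n' 'AzureMetrics' \
  '| where $__timeFilter(TimeGenerated)' \
  '| where ResourceProvider == "MICROSOFT.NETWORK"' \
  '| order by TimeGenerated asc' > query.kql
printf '%s\n' '"analytics": {' \
  '            "query": "{{query.kql}}",' \
  '            "resource": "$GlobalDataSource",' \
  '            "resultFormat": "time_series"' > data.json

# Pass 1: slurp the whole file, then JSON-escape it.
# Order matters: backslashes first, then quotes, then newlines.
json=$(sed -e ':a' -e '$!{' -e 'N' -e 'ba' -e '}' \
           -e 's/\\/\\\\/g' -e 's/"/\\"/g' -e 's/\n/\\n/g' query.kql)

# Pass 2: escape what is special in a sed replacement: \ & and the / delimiter.
repl=$(printf '%s' "$json" | sed -e 's|[\\&/]|\\&|g')

# Now the plain double-quoted substitution is safe, with no backticks or echo.
result=$(sed -e "s/{{query.kql}}/$repl/" data.json)
printf '%s\n' "$result"
```

Keeping the two passes separate means `$json` holds exactly the text that should appear in the file, and `$repl` is only ever used inside the sed command.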

CodePudding user response:

For dealing with JSON in the shell you should use jq:

jq --arg kql "$(< query.kql)" '.analytics.query = $kql' data.json
  "analytics": {
    "query": "Metrics\n| where $__timeFilter(TimeGenerated)\n| where ResourceProvider == \"MICROSOFT.NETWORK\"\n| order by TimeGenerated asc",
    "resource": "$GlobalDataSource",
    "resultFormat": "time_series"

or with a real replacement (instead of just setting a value to a key):

jq --arg kql "$(< query.kql)" '
    if .analytics.query == "{{query.kql}}"
    then .analytics.query = $kql
    else . end
' data.json
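
If the placeholder's exact location is not known in advance, a variation is to replace it wherever it appears as a whole string value. The following is a sketch, not a drop-in: it assumes jq 1.6 or newer (for `--rawfile` and the built-in `walk`), and a data.json that is a complete, valid JSON document rather than the fragment shown in the question. The shortened query and the `out` variable are made up for the example.

```shell
#!/usr/bin/env bash
# Sketch only: assumes jq >= 1.6 and that data.json is complete, valid JSON.
set -eu
if ! command -v jq >/dev/null 2>&1; then
  echo "jq is not installed; skipping the example" >&2
  exit 0
fi
workdir=$(mktemp -d)
cd "$workdir"

printf '%s\n' 'AzureMetrics' \
  '| where ResourceProvider == "MICROSOFT.NETWORK"' > query.kql
cat > data.json <<'EOF'
{
  "analytics": {
    "query": "{{query.kql}}",
    "resource": "$GlobalDataSource",
    "resultFormat": "time_series"
  }
}
EOF

# --rawfile reads the file as one string, including its trailing newline,
# so rtrimstr drops that newline before the value is inserted.
out=$(jq --rawfile kql query.kql '
  ($kql | rtrimstr("\n")) as $q
  | walk(if . == "{{query.kql}}" then $q else . end)
' data.json)
printf '%s\n' "$out"
```

jq does all the JSON escaping itself, so no sed-level escaping is needed at all.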