Convert String to Valid JSON with SED

Time:11-30

I have some pseudo-JSON (really an arbitrary string) returned from an endpoint. It is missing double quotes, has some commas with nothing between them (e.g. ", ,"), and has some fields with no values. Example:

{issuingColo=1, csUserId=0, expirationTimestamp=2022-11-28 15:53:51.754, sessionId=0, isImpersonator=false, loginSession=1737438, identities=urn:thing:123 , urn:thing:456(urn:thing:account:123,234) , urn:li:thing:123 , , keyVersion=6, tokenVersion=9, permissions=, midToken=123, loginTimestamp=2022-11-28 14:53:49.705, isUser=false, memberId=5555}%

and I'm trying to change it to valid JSON so that I can pass it to jq. I tried something like:

sed 's/\b\([\w:.-]*\)\b/"\1"/g'

but that didn't seem to do anything. Any help would be appreciated on what I'm missing!

CodePudding user response:

I don't have any experience with JSON, but as an exercise in reformatting the string to fit JSON, I have put together the following logic. It does NOT handle nested braces (I'll look at that in the future). For the single line you provided, which is very much malformed, I have come up with the following "transformer" script, on the assumption that this is the format generated by some canned utility that you are using.

The comma after the last member should not be present, since nothing follows it in the object. Apart from the trailing "%", that is the only item that falls short of compliance with the JSON standard.

I also believe that your trailing "%" is extraneous, but I have left it in with this code. That is easily reworked.
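
One way to do that rework is a small filter applied to the script's output. The following is a minimal sketch, not part of the transformer itself: fix_tail is a hypothetical name, and it assumes that the closing brace and the "%" each sit alone on their own line, as they do in this answer's output.

```shell
# Hypothetical post-processing filter: reads the transformer output on stdin.
# Assumption: "}" and "%" each appear alone on a line.
fix_tail() {
    awk '
        $0 == "%" { next }                  # drop the stray "%" line
        have {
            # The line before a closing brace must not end with a comma.
            if ($0 == "}") sub(/,[ \t]*$/, "", prev)
            print prev
        }
        { prev = $0; have = 1 }             # hold each line back by one
        END { if (have) print prev }
    '
}
```

The filter would go at the end of the pipeline, for example "... | fix_tail | jq .".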

So, here is the code for what it is worth, as a learning experience:

#!/bin/bash

#QUESTION:  https://stackoverflow.com/questions/74604942/sed-convert-bad-json-to-valid

echo '{issuingColo=1,  csUserId=0,   expirationTimestamp=2022-11-28 15:53:51.754, sessionId=0, isImpersonator=false, loginSession=1737438, identities=urn:thing:123 , urn:thing:456(urn:thing:account:123,234) , urn:li:thing:123 , , keyVersion=6, tokenVersion=9, permissions=, midToken=123, loginTimestamp=2022-11-28 14:53:49.705, isUser=false, memberId=5555}%' |
    sed 's { {\n g' |
    sed 's } \n}\n g' |
    sed 's \ , , g' |
    sed 's ,, , g' |
    sed 's ^[ ]*  g' |
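awk 'BEGIN{
    defer=0 ;    # set to 1 while a urn: list is open and still needs "]"
}{
    if( index( $0, "=" ) == 0 ){
        # Lines without key=value pairs ("{", "}", "%") pass through unchanged.
        print $0 ;
    }else{
        n=split( $0, seg, ", " ) ;
        for( i=1 ; i<=n ; i++ ){
            gsub( /^ *| *$/, "", seg[i] ) ;    # trim surrounding spaces
            if( index( seg[i], "=" ) == 0 ){
                # No "=": this segment continues the previous value.
                if( index( seg[i], "urn:" ) == 1 ){
                    printf(",\"%s\"", seg[i] ) ;
                    defer=1 ;
                }else{
                    printf("\"%s\",\n", seg[i] ) ;
                } ;
            }else{
                if( defer == 1 ){
                    # A new key starts, so close the open list first.
                    print "]," ;
                    defer=0 ;
                } ;
                p=index( seg[i], "=" ) ;
                beg=substr( seg[i], 1, p-1 ) ;    # key
                rem=substr( seg[i], p+1 ) ;       # value (may be empty)
                if( index( rem, "urn:" ) == 1 ){
                    # A value starting with urn: opens a JSON array.
                    printf("\t\"%s\": [\"%s\"", beg, rem ) ;
                    defer=1 ;
                }else{
                    printf("\t\"%s\": \"%s\",\n", beg, rem ) ;
                } ;
            } ;
        } ;
    } ;
}'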
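
The sed stages in the script use a space as the delimiter of the s command, which makes them easy to misread. As I read them, they are equivalent to the sketch below, written with the usual "/" delimiter. The name normalize is hypothetical, and the "\n" in the replacement text assumes GNU sed, as the original stages appear to.

```shell
# Hypothetical helper: reads the raw string on stdin and puts "{", the field
# list, and "}" on separate lines. Assumes GNU sed, which expands \n in the
# replacement part of s. The leading-space trim runs in a second sed process
# so that ^ is tested against each newly split line.
normalize() {
    sed -e 's/{/{\n/g' \
        -e 's/}/\n}\n/g' \
        -e 's/ ,/,/g' \
        -e 's/,,/,/g' |
    sed 's/^ *//'
}
```

The awk program would then read from "normalize" instead of from the five separate sed commands.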

The output is as follows:

{
    "issuingColo": "1",
    "csUserId": "0",
    "expirationTimestamp": "2022-11-28 15:53:51.754",
    "sessionId": "0",
    "isImpersonator": "false",
    "loginSession": "1737438",
    "identities": ["urn:thing:123","urn:thing:456(urn:thing:account:123,234)","urn:li:thing:123"],
    "keyVersion": "6",
    "tokenVersion": "9",
    "permissions": "",
    "midToken": "123",
    "loginTimestamp": "2022-11-28 14:53:49.705",
    "isUser": "false",
    "memberId": "5555",
}
%
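
A note on the sed attempt in the question: inside a bracket expression, sed does not treat \w as a word-character class. The backslash and the w are taken literally, so [\w:.-] only matches those five characters, and \b and \w are GNU extensions in any case. Below is a minimal sketch that uses the POSIX class [:alnum:] instead; quote_tokens is a hypothetical name. It only wraps each token in double quotes. It does not turn "=" into ":", and a value containing a space (such as the timestamps) is split into two tokens, which is why the script above takes a different route.

```shell
# Hypothetical helper: wraps every run of letters, digits, "_", ":", "."
# and "-" in double quotes. Assumes a sed that accepts -E (extended regex).
quote_tokens() {
    sed -E 's/[[:alnum:]_:.-]+/"&"/g'
}
```

For example, feeding it the line {a=1, b=x:y} prints {"a"="1", "b"="x:y"}.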