I have a legacy CLI tool that outputs a structured list with sub-items indented with a tab
(Stack Overflow won't let me put tabs here, so I replaced them with 4 spaces in this example).
Heading One:
    Sub One: 'Value 1'
    Sub Two: 'Value 2'
Heading Two:
    Sub Three: 'Value 3'
    Sub Four: 'Value 4'
Key One: 'This key has no heading'
I am trying to produce JSON output like this:
{
  "Heading One": {
    "Sub One": "Value 1",
    "Sub Two": "Value 2"
  },
  "Heading Two": {
    "Sub Three": "Value 3",
    "Sub Four": "Value 4"
  },
  "Key One": "This key has no heading"
}
Is this possible with jq,
or do I need to write a more complex Python script?
CodePudding user response:
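This approach also handles deeply nested input. It splits the input into top-level items using a regex that matches a newline not followed by a tab (a negative look-ahead). For each item it separates the key before the first colon from the rest, then "unindents" the rest by removing one tab after each newline. If the rest still contains a newline, it is passed to a recursive call; otherwise the text between the first and last single quote is taken as the value.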
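jq -Rs '
  def comp:
    # split on newlines that are not followed by a tab; skip empty pieces
    reduce (splits("\n(?!\\t)") | select(length > 0)) as $item ({};
      # the key is everything before the first colon
      ($item | index(":")) as $hpos | .[$item[:$hpos]] = (
        # drop one level of indentation from the remainder
        $item[$hpos + 1:] | gsub("\n\t"; "\n")
        # recurse into nested items, else take the text between the quotes
        | if test("\n") then comp else .[index("'\''") + 1: rindex("'\''")] end
      )
    );
  comp
'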
{
  "Heading One": {
    "Sub One": "Value 1",
    "Sub Two": "Value 2"
  },
  "Heading Two": {
    "Sub Three": "Value 3",
    "Sub Four": "Value 4"
  },
  "Key One": "This key has no heading"
}
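Since the question also mentions Python, here is a minimal sketch of the same recursive idea in Python, for comparison only. The function name `parse_block` and the `sample` string are made up for this illustration. It assumes the real tool indents with tab characters and that every leaf value is wrapped in single quotes, as in the example.

```python
import json
import re


def parse_block(text):
    """Recursively turn tab-indented "Key: 'value'" text into nested dicts."""
    result = {}
    # Split on newlines that are NOT followed by a tab, so each piece is one
    # top-level item together with its indented children.
    for item in re.split(r"\n(?!\t)", text):
        if not item.strip():
            continue  # skip empty pieces, e.g. after a trailing newline
        # The key is everything before the first colon.
        key, _, rest = item.partition(":")
        # Remove one level of indentation from the children.
        rest = rest.replace("\n\t", "\n")
        if "\n" in rest:
            # Nested items remain: parse them the same way.
            result[key] = parse_block(rest)
        else:
            # Leaf: keep the text between the first and last single quote
            # (assumes the value is quoted, as in the example input).
            result[key] = rest[rest.index("'") + 1 : rest.rindex("'")]
    return result


# Hypothetical sample mirroring the question, with real tab characters.
sample = (
    "Heading One:\n"
    "\tSub One: 'Value 1'\n"
    "\tSub Two: 'Value 2'\n"
    "Heading Two:\n"
    "\tSub Three: 'Value 3'\n"
    "\tSub Four: 'Value 4'\n"
    "Key One: 'This key has no heading'\n"
)

print(json.dumps(parse_block(sample), indent=2))
```

Like the jq filter, this sketch does no error handling: a leaf without single quotes raises a `ValueError`, and a heading with no sub-items is treated as a leaf.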